This commit introduces several improvements to the AMP (Advanced Model Proxy) module:
- **Model Mapping Logic:** The `FallbackHandler` now uses a more robust approach for model mapping. It includes the extraction and preservation of dynamic "thinking suffixes" (e.g., `(xhigh)`) during mapping, ensuring that these configurations are correctly applied to the mapped model. A new `resolveMappedModel` function centralizes this logic for cleaner code.
- **ModelMapper Verification:** The `ModelMapper` in `model_mapping.go` now verifies that the target model of a mapping has available providers *after* normalizing it. This prevents mappings to non-existent or unresolvable models.
- **Gemini Thinking Configuration Cleanup:** In `gemini_thinking.go`, unnecessary `generationConfig.thinkingConfig.include_thoughts` and `generationConfig.thinkingConfig.thinkingBudget` fields are now deleted from the request body when applying Gemini thinking levels. This prevents potential conflicts or redundant configurations.
- **Testing:** A new test case `TestModelMapper_MapModel_TargetWithThinkingSuffix` has been added to `model_mapping_test.go` to specifically cover the preservation of thinking suffixes during model mapping.
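The suffix handling described above can be sketched roughly as follows. This is a minimal illustration, not the actual `resolveMappedModel` code: the helper names `splitThinkingSuffix` and `resolveMapped`, the plain map of mappings, and the rule that a suffix already present on the target wins are all assumptions.

```go
package main

import "strings"

// splitThinkingSuffix separates a trailing parenthesized suffix such as
// "(xhigh)" from a model name. Hypothetical helper for illustration.
func splitThinkingSuffix(model string) (base, suffix string) {
	if !strings.HasSuffix(model, ")") {
		return model, ""
	}
	open := strings.LastIndex(model, "(")
	if open <= 0 {
		// No opening parenthesis, or the name would be empty.
		return model, ""
	}
	return model[:open], model[open:]
}

// resolveMapped looks up the base model name in mappings and re-attaches
// the requested suffix to the target. Assumption: if the target already
// carries its own suffix, that suffix is kept instead.
func resolveMapped(mappings map[string]string, requested string) (string, bool) {
	base, suffix := splitThinkingSuffix(requested)
	target, ok := mappings[base]
	if !ok {
		return "", false
	}
	if _, targetSuffix := splitThinkingSuffix(target); targetSuffix != "" {
		return target, true
	}
	return target + suffix, true
}
```

With a mapping from `model-a` to `model-b`, a request for `model-a(xhigh)` would resolve to `model-b(xhigh)` under these assumptions.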
Ensures compatibility with the Amp client by suppressing
"thinking" blocks when "tool_use" blocks are also present in
the response.
The Amp client cannot render both block types in the same
response, so "thinking" blocks are filtered out in that case.
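A minimal sketch of this filtering rule, assuming the response content has already been decoded into a slice of typed blocks. The `contentBlock` type and `suppressThinking` function are hypothetical names; the real code may operate on raw JSON instead.

```go
package main

// contentBlock is a simplified stand-in for one entry of a response's
// content array. Hypothetical type for illustration.
type contentBlock struct {
	Type string
	Text string
}

// suppressThinking drops "thinking" blocks only when at least one
// "tool_use" block is present; otherwise the input is returned unchanged.
func suppressThinking(blocks []contentBlock) []contentBlock {
	hasToolUse := false
	for _, b := range blocks {
		if b.Type == "tool_use" {
			hasToolUse = true
			break
		}
	}
	if !hasToolUse {
		return blocks
	}
	kept := make([]contentBlock, 0, len(blocks))
	for _, b := range blocks {
		if b.Type != "thinking" {
			kept = append(kept, b)
		}
	}
	return kept
}
```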
Normalize action handling by accommodating wildcard patterns in route definitions for Gemini endpoints. Adjust `request.Action` parsing logic to correctly process routes with prefixed actions.
All Amp management endpoints (e.g., /api/user, /threads) are now protected by the standard API key authentication middleware. This ensures that all management operations require a valid API key, significantly improving security.
As a result of this change:
- The `restrict-management-to-localhost` setting now defaults to `false`. API key authentication is a stronger and more flexible security control than IP-based restrictions, and the new default improves usability in containerized environments.
- The reverse proxy logic now strips the client's `Authorization` header after authenticating the initial request. It then injects the configured `upstream-api-key` for the request to the upstream Amp service.
BREAKING CHANGE: Amp management endpoints now require a valid API key for authentication. Requests without a valid API key in the `Authorization` header will be rejected with a 401 Unauthorized error.
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
Add a configuration option to control whether model mappings take
precedence over local API keys for Amp CLI requests.
- Add PrioritizeModelMappings field to AmpCode config struct
- When false (default): Local API keys take precedence (original behavior)
- When true: Model mappings take precedence over local API keys
- Add management API endpoints GET/PUT /prioritize-model-mappings
This allows users who want mapping priority to enable it explicitly
while preserving backward compatibility.
Config example:
  ampcode:
    model-mappings:
      - from: claude-opus-4-5-20251101
        to: gemini-claude-opus-4-5-thinking
    prioritize-model-mappings: true
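The precedence rule can be sketched as a small decision function. This is an illustration under assumptions: the `route` type, its constants, and `chooseRoute` are hypothetical, and "has mapping" here means a mapping exists whose target has an available provider.

```go
package main

// route identifies where a request ends up. Hypothetical type.
type route int

const (
	routeLocal    route = iota // served with a local API key / provider
	routeMapped                // served via a model mapping
	routeUpstream              // proxied to the upstream Amp service
)

// chooseRoute applies the precedence described above: with the flag off,
// local providers win over mappings; with it on, mappings win.
func chooseRoute(prioritizeMappings, hasLocal, hasMapping bool) route {
	if prioritizeMappings && hasMapping {
		return routeMapped
	}
	if hasLocal {
		return routeLocal
	}
	if hasMapping {
		return routeMapped
	}
	return routeUpstream
}
```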
Simplifies routing logic by delegating all provider/mapping/proxy
decisions to FallbackHandler. Previously, the route checked for
provider/mapping availability before calling the handler, then
FallbackHandler performed the same checks again.
Changes:
- Remove model extraction and provider checking from route (lines 182-201)
- Route now only checks if request is POST with /models/ path
- FallbackHandler handles provider -> mapping -> proxy fallback
- Remove unused internal/util import
Benefits:
- Eliminates duplicate checks (addresses PR review feedback #2)
- Centralizes all provider/mapping logic in FallbackHandler
- Reduces routing code by ~20 lines
- Aligns with how other /api/provider routes work
Performance: No impact (checks still happen once in FallbackHandler)
- Change createGeminiBridgeHandler to accept gin.HandlerFunc instead of *gemini.GeminiAPIHandler
  This allows tests to inject mock handlers instead of duplicating bridge logic
- Replace magic number 8 with len(modelsPrefix) for better maintainability
- Remove a redundant test case that does not exercise a production edge case
- Update routes.go to pass geminiHandlers.GeminiHandler directly
Addresses PR review feedback on test architecture and code clarity.
Amp-Thread-ID: https://ampcode.com/threads/T-1ae2c691-e434-4b99-a49a-10cabd3544db
Gemini handler extracts model from URL path, not JSON body, so
rewriting the request body alone wasn't sufficient for model mapping.
- Add MappedModelContextKey constant for context passing
- Update routes.go to use NewFallbackHandlerWithMapper
- Add check for valid mapping before routing to local handler
- Add tests for gemini bridge model mapping
- Store ampModule in Server struct to access it during config updates
- Call ampModule.OnConfigUpdated() in UpdateClients() for hot reload
- Watch config directory instead of file to handle atomic saves (vim, VSCode, etc.)
- Improve config file event detection with basename matching
- Add diagnostic logging for config reload tracing
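Editors that save atomically write a temporary file and rename it over the target, so a watch on the file itself can be lost; watching the parent directory and filtering events by basename avoids that. A minimal sketch of the filter, with `isConfigEvent` as a hypothetical helper name (the watcher itself is omitted):

```go
package main

import "path/filepath"

// isConfigEvent reports whether a file-system event seen in the watched
// directory refers to the config file, by comparing basenames. This
// ignores editor swap and temporary files in the same directory.
func isConfigEvent(eventPath, configPath string) bool {
	return filepath.Base(filepath.Clean(eventPath)) ==
		filepath.Base(filepath.Clean(configPath))
}
```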
AMP CLI requests /threads.rss at the root level, but the AMP module
only registered routes under /api/*. This caused a 404 error during
AMP CLI startup.
Add the missing root-level route with the same security middleware
(noCORS, optional localhost restriction) as other management routes.
- Add AmpModelMapping config to route models like 'claude-opus-4.5' to 'claude-sonnet-4'
- Add ModelMapper interface and DefaultModelMapper implementation with hot-reload support
- Enhance FallbackHandler to apply model mappings before falling back to ampcode.com
- Add structured logging for routing decisions (local provider, mapping, amp credits)
- Update config.example.yaml with amp-model-mappings documentation
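A hot-reloadable mapper can be sketched as a map guarded by a read-write lock. The method names and signatures below are assumptions for illustration and need not match the real ModelMapper interface or DefaultModelMapper.

```go
package main

import "sync"

// modelMapper is a hypothetical, simplified version of the interface.
type modelMapper interface {
	MapModel(requested string) string
}

// reloadableMapper swaps its whole mapping table on config reload so
// concurrent readers never observe a partially updated map.
type reloadableMapper struct {
	mu       sync.RWMutex
	mappings map[string]string
}

// Compile-time check that reloadableMapper satisfies the interface.
var _ modelMapper = (*reloadableMapper)(nil)

// MapModel returns the mapped target, or "" when no mapping exists.
func (m *reloadableMapper) MapModel(requested string) string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.mappings[requested]
}

// UpdateMappings replaces the table; intended to be called on hot reload.
func (m *reloadableMapper) UpdateMappings(pairs map[string]string) {
	next := make(map[string]string, len(pairs))
	for from, to := range pairs {
		next[from] = to
	}
	m.mu.Lock()
	m.mappings = next
	m.mu.Unlock()
}
```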
Fix a critical security vulnerability in the
amp-restrict-management-to-localhost feature, where attackers could
bypass the localhost restriction by spoofing X-Forwarded-For headers.
Changes:
- Use RemoteAddr (actual TCP connection) instead of ClientIP() in
  localhostOnlyMiddleware to prevent header spoofing attacks
- Add comprehensive test coverage for spoofing prevention (6 test cases)
- Update documentation with reverse proxy deployment guidance and
  limitations of the RemoteAddr approach
The fix prevents attacks like:
    curl -H "X-Forwarded-For: 127.0.0.1" https://server/api/user
Trade-off: Users behind reverse proxies will need to disable the feature
and use alternative security measures (firewall rules, proxy ACLs).
Addresses security review feedback from PR #287.
AMP CLI sends Gemini requests to non-standard paths that were being
directly proxied to ampcode.com without checking for local OAuth.
This fix adds:
- GeminiBridge handler to transform AMP CLI paths to standard format
- Enhanced model extraction from AMP's /publishers/google/models/* paths
- FallbackHandler wrapper to check for local OAuth before proxying
Flow:
- If user has local Google OAuth → use it (free tier)
- If no local OAuth → fallback to ampcode.com (charges credits)
Fixes an issue where gemini-3-pro-preview requests were always charged
AMP credits even when the user had valid Google Cloud OAuth configured.
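Model extraction from these paths can be sketched as follows. The assumptions are that the model name follows a `/models/` segment and that an optional action follows a colon; the example path in the note below is illustrative, and `extractModelAndAction` is a hypothetical name.

```go
package main

import "strings"

const modelsPrefix = "/models/"

// extractModelAndAction pulls "<model>" and "<action>" out of a path
// containing ".../models/<model>:<action>". The action is empty when
// the path has no colon after the model name.
func extractModelAndAction(path string) (model, action string, ok bool) {
	idx := strings.Index(path, modelsPrefix)
	if idx < 0 {
		return "", "", false
	}
	rest := path[idx+len(modelsPrefix):]
	model, action, _ = strings.Cut(rest, ":")
	if model == "" {
		return "", "", false
	}
	return model, action, true
}
```

For a path such as `/publishers/google/models/gemini-3-pro-preview:generateContent`, this yields the model `gemini-3-pro-preview` and the action `generateContent`, which a bridge handler could then rewrite into the standard route format.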
Amp CLI sends 'context-1m-2025-08-07' in the Anthropic-Beta header, a
beta feature that requires a special 1M context window subscription.
After the upstream rebase to v6.3.7 (commit 38cfbac), CLIProxyAPI
respects client-provided Anthropic-Beta headers instead of always using
defaults.
When users configure local OAuth providers (Claude, etc), requests bypass
the ampcode.com proxy and use their own API subscriptions. These personal
subscriptions typically don't include the 1M context beta feature, causing
'long context beta not available' errors.
Changes:
- Add filterBetaFeatures() helper to strip specific beta features
- Filter context-1m-2025-08-07 in fallback handler when using local providers
- Preserve full headers when proxying to ampcode.com (paid users get all features)
- Add 7 test cases covering all edge cases
This fix is isolated to the Amp module and only affects the local provider
path. Users proxying through ampcode.com are unaffected and receive full
1M context support as part of their paid service.
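A header filter of this kind might look like the sketch below. The function name comes from the change list above, but the signature and the comma-separated header handling are assumptions.

```go
package main

import "strings"

// filterBetaFeatures removes one feature from a comma-separated
// Anthropic-Beta header value and returns the remaining features,
// trimmed and re-joined. An empty string means nothing is left.
func filterBetaFeatures(header, remove string) string {
	var kept []string
	for _, feature := range strings.Split(header, ",") {
		feature = strings.TrimSpace(feature)
		if feature == "" || feature == remove {
			continue
		}
		kept = append(kept, feature)
	}
	return strings.Join(kept, ",")
}
```

When the result is empty, the caller would presumably drop the header entirely on the local provider path rather than send an empty value.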
- add fallback handler that forwards Amp provider requests to ampcode.com when the provider isn’t configured locally
- wrap AMP provider routes with the fallback so requests always have a handler
- share Gemini thinking model normalization helper between core handlers and AMP fallback
Add full Amp CLI support to enable routing AI model requests through the proxy
while maintaining Amp-specific features like thread management, user info, and
telemetry. Includes complete documentation and pull bot configuration.
Features:
- Modular architecture with RouteModule interface for clean integration
- Reverse proxy for Amp management routes (thread/user/meta/ads/telemetry)
- Provider-specific route aliases (/api/provider/{provider}/*)
- Secret management with precedence: config > env > file
- 5-minute secret caching to reduce file I/O
- Automatic gzip decompression for responses
- Proper connection cleanup to prevent leaks
- Localhost-only restriction for management routes (configurable)
- CORS protection for management endpoints
Documentation:
- Complete setup guide (USING_WITH_FACTORY_AND_AMP.md)
- OAuth setup for OpenAI (ChatGPT Plus/Pro) and Anthropic (Claude Pro/Max)
- Factory CLI config examples with all model variants
- Amp CLI/IDE configuration examples
- tmux setup for remote server deployment
- Screenshots and diagrams
Configuration:
- Pull bot disabled for this repo (manual rebase workflow)
- Config fields: AmpUpstreamURL, AmpUpstreamAPIKey, AmpRestrictManagementToLocalhost
- Compatible with upstream DisableCooling and other features
Technical details:
- internal/api/modules/amp/: Complete Amp routing module
- sdk/api/httpx/: HTTP utilities for gzip/transport
- 94.6% test coverage with 34 comprehensive test cases
- Clean integration minimizes merge conflict risk
Security:
- Management routes restricted to localhost by default
- Configurable via amp-restrict-management-to-localhost
- Prevents drive-by browser attacks on user data
This provides a production-ready foundation for Amp CLI integration while
maintaining clean separation from upstream code for easy rebasing.
Amp-Thread-ID: https://ampcode.com/threads/T-9e2befc5-f969-41c6-890c-5b779d58cf18