Migrates the AMP module to a new unified routing system, replacing the fallback handler with a router-based approach.
This change introduces a `ModelRoutingWrapper` that handles model extraction, routing decisions, and proxying based on provider availability and model mappings.
It provides a more flexible and maintainable routing mechanism by centralizing routing logic.
The changes include:
- Introducing new `routing` package with core routing logic.
- Creating characterization tests to capture existing behavior.
- Implementing model extraction and rewriting.
- Updating AMP module routes to utilize the new routing wrapper.
- Deprecating `FallbackHandler` in favor of the new `ModelRoutingWrapper`.
Improves AMP request handling by consolidating model mapping logic into a helper function for better readability and maintainability.
Enhances error handling for premature client connection closures during reverse proxy operations by explicitly acknowledging and swallowing the ErrAbortHandler panic, preventing noisy stack traces.
Removes unused method `findProviderViaOAuthAlias` from the `DefaultModelMapper`.
Uses centralized context keys for accessing mapped and fallback models.
This change deprecates the string-based context keys used in the AMP fallback handlers in favor of the `ctxkeys` package, promoting consistency and reducing the risk of typos.
The authentication conductor now retrieves fallback models using the shared `ctxkeys` constants.
Introduce `Filter` rules in the payload configuration to remove specified JSON paths from the payload. Update related helper functions and add examples to `config.example.yaml`.
- Add ErrorLogsMaxFiles config field with default value 10
- Support hot-reload via config file changes
- Add Management API: GET/PUT/PATCH /v0/management/error-logs-max-files
- Maintain SDK backward compatibility with NewFileRequestLogger (3 params)
- Add NewFileRequestLoggerWithOptions for custom error log retention
When request logging is disabled, forced error logs are retained up to
the configured limit. Set to 0 to disable cleanup.
- Added a new routing package to manage provider registration and model resolution.
- Introduced Router, Executor, and Provider interfaces to handle different provider types.
- Implemented OAuthProvider and APIKeyProvider to support OAuth and API key authentication.
- Enhanced DefaultModelMapper to include OAuth model alias handling and fallback mechanisms.
- Updated context management in API handlers to preserve fallback models.
- Added tests for routing logic and provider selection.
- Enhanced Claude request conversion to handle reasoning content based on thinking mode.
Enhances the AMP model mapping functionality to support fallback mechanisms using .
This change allows the system to attempt alternative models (aliases) if the primary mapped model fails due to issues like quota exhaustion. It updates the model mapper to load and utilize the configuration, enabling provider lookup via aliases. It also introduces context keys to pass fallback model names between handlers.
Additionally, this change introduces a fix to prevent ReverseProxy from panicking by swallowing ErrAbortHandler panics.
Amp-Thread-ID: https://ampcode.com/threads/T-019c0cd1-9e59-722b-83f0-e0582aba6914
Co-authored-by: Amp <amp@ampcode.com>
Introduce a custom HTTP client utilizing utls with Firefox TLS fingerprinting to bypass Cloudflare fingerprinting on Anthropic domains. Includes support for proxy configuration and enhanced connection management for HTTP/2.
Add support for Gemini's code_execution and url_context tools in the
request translators, enabling:
- Agentic Vision: Image analysis with Python code execution for
bounding boxes, annotations, and visual reasoning
- URL Context: Live web page content fetching and analysis
Tools are passed through using the same pattern as google_search:
- code_execution: {} -> codeExecution: {}
- url_context: {} -> urlContext: {}
Tested with Gemini 3 Flash Preview agentic vision successfully.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Removes x-* extension fields from JSON schemas to ensure compatibility with the Gemini API.
These fields, while valid in OpenAPI/JSON Schema, are not recognized by the Gemini API and can cause issues.
The change recursively walks the schema, identifies these extension fields, and removes them, except when they define properties.
Amp-Thread-ID: https://ampcode.com/threads/T-019c0cd1-9e59-722b-83f0-e0582aba6914
Co-authored-by: Amp <amp@ampcode.com>
- Add ensureCacheControl() to auto-inject cache breakpoints
- Cache tools (last tool), system (last element), and messages (2nd-to-last user turn)
- Add prompt-caching-2024-07-31 beta header
- Return original payload on sjson error to prevent corruption
- Include verification test for caching logic
Enables up to 90% cost reduction on cached tokens.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add firstChunkTimestamp field to ResponseWriterWrapper for sync capture
- Capture TTFB in Write() and WriteString() before async channel send
- Add SetFirstChunkTimestamp() to StreamingLogWriter interface
- Make requestTimestamp/apiResponseTimestamp required in LogRequest()
- Remove timestamp capture from WriteAPIResponse() (now via setter)
- Fix Gemini handler to set API_RESPONSE_TIMESTAMP before writing response
This ensures accurate TTFB measurement for all streaming API formats
(OpenAI, Gemini, Claude) by capturing timestamp synchronously when
the first response chunk arrives, not when the stream finalizes.
Previously:
- REQUEST INFO timestamp was captured at log write time (not request arrival)
- API RESPONSE had NO timestamp at all
This fix:
- Captures REQUEST INFO timestamp when request first arrives
- Adds API RESPONSE timestamp when upstream response arrives
Changes:
- Add Timestamp field to RequestInfo, set at middleware initialization
- Set API_RESPONSE_TIMESTAMP in appendAPIResponse() and gemini handler
- Pass timestamps through logging chain to writeNonStreamingLog()
- Add timestamp output to API RESPONSE section
This enables accurate measurement of backend response latency in error logs.
When using Gemini API format with Antigravity backend, the executor
renames usageMetadata to cpaUsageMetadata in non-terminal chunks.
The Gemini translator was returning this internal field name directly
to clients instead of the standard usageMetadata field.
Add restoreUsageMetadata() to rename cpaUsageMetadata back to
usageMetadata before returning responses to clients.
When Claude API sends an assistant message with empty text content like:
{"role":"assistant","content":[{"type":"text","text":""}]}
The translator was creating a part object {} with no data field,
causing Gemini API to return error:
"required oneof field 'data' must have one initialized field"
This fix:
1. Skips empty text parts (text="") during translation
2. Skips entire messages when their parts array becomes empty
This ensures compatibility when clients send empty assistant messages
in their conversation history.
Optimize channel operations by introducing reusable context-aware send functions (`send` and `sendErr`) across `wsrelay`, `handlers`, and `cliproxy`. Ensure graceful handling of canceled contexts during stream operations.
Implement `request_retry` and `disable_cooling` metadata overrides for authentication management. Update retry and cooling logic accordingly across `Manager`, Antigravity executor, and file synthesizer. Add tests to validate new behaviors.