- Add model mapping feature to README.md Amp CLI section
- Add detailed Model Mapping Configuration section to amp-cli-integration.md
- Update architecture diagram to show model mapping flow
- Update Model Fallback Behavior to include mapping step
- Add Table of Contents entry for model mapping
- Add AmpModelMapping config to route models like 'claude-opus-4.5' to 'claude-sonnet-4'
- Add ModelMapper interface and DefaultModelMapper implementation with hot-reload support
- Enhance FallbackHandler to apply model mappings before falling back to ampcode.com
- Add structured logging for routing decisions (local provider, mapping, amp credits)
- Update config.example.yaml with amp-model-mappings documentation
- Updated `antigravity_openai_request.go` to process non-JSON outputs gracefully by verifying and distinguishing between JSON and plain string formats.
- Ensured proper assignment of parsed or raw response to `functionResponse`.
- Added `ContextLength` field with a value of 200,000 to all applicable Claude model definitions.
- Standardized `MaxCompletionTokens` values across models for consistency and alignment.
**fix(translator): add support for "xhigh" reasoning effort in OpenAI responses**
- Updated handling in `openai_openai-responses_request.go` to include the new "xhigh" reasoning effort level.
Fixes an issue where Claude thinking models would return 400 errors when
the thinking.budget_tokens was greater than or equal to max_tokens.
Changes:
- Add MaxCompletionTokens: 128000 to all Claude thinking model definitions
- Add ensureMaxTokensForThinking() function in claude_executor.go that:
- Checks if thinking is enabled with a budget_tokens value
- Looks up the model's MaxCompletionTokens from the registry
- Ensures max_tokens is set to at least the model's MaxCompletionTokens
- Falls back to budget_tokens + 4000 buffer if registry lookup fails
This ensures Anthropic API constraint (max_tokens > thinking.budget_tokens)
is always satisfied when using extended thinking features.
Fixes: #339🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Implemented logic to pair consecutive function calls and their outputs, ensuring proper sequencing for processing.
- Adjusted `gemini_openai-responses_request.go` to normalize message structures and maintain expected flow.
- Updated handling of `output` in `gemini_openai-responses_request.go` to use `.Str` instead of `.Raw` when parsing non-JSON string outputs.
- Added checks to distinguish between JSON and non-JSON `output` types for accurate `functionResponse` construction.
- Updated handling of `output` in `gemini_openai-responses_request.go` to use `.Raw` instead of `.String` for preserving original JSON encoding.
- Ensured proper setting of raw JSON output when constructing `functionResponse`.
- Updated handling of `thoughtSignature` across all translator modules to retain other content payloads if present.
- Adjusted logic for `thought_signature` and `inline_data` keys for consistent processing.
- Added new `Gemini 3 Pro Image Preview` model with detailed metadata and configuration.
- Removed outdated `Claude Sonnet 4.5 Thinking` model definition for cleanup and relevance.
**feat(handlers, executor): add Gemini 3 Pro Preview support and refine Claude system instructions**
- Added support for the new "Gemini 3 Pro Preview" action in Gemini handlers, including detailed metadata and configuration.
- Removed redundant `cache_control` field from Claude system instructions for cleaner payload structure.
**fix(executor): replace redundant commented code with `checkSystemInstructions` helper**
- Replaced commented-out `sjson.SetRawBytes` lines with the new `checkSystemInstructions` function.
- Centralized system instruction handling for better code clarity and reuse.
- Ensured consistent logic for managing `system` field across Claude executor flows.
- Commented out multiple instances of `sjson.SetRawBytes` for setting `system` key to Claude instructions as they are redundant.
- Code cleanup to improve clarity and maintainability without affecting functionality.
- Add Claude thinking model definitions (sonnet-4-5-thinking, opus-4-5-thinking variants)
- Add Thinking support for antigravity models with -thinking suffix
- Add injectThinkingConfig() for automatic thinking budget based on model suffix
- Add resolveUpstreamModel() mappings for thinking variants to actual Claude models
- Add extractAndRemoveBetas() to convert betas array to anthropic-beta header
- Update applyClaudeHeaders() to merge custom betas from request body
Closes#324
- Introduced `appendAPIResponse` helper to preserve and append data to existing API responses.
- Ensured newline inclusion when appending, if necessary.
- Improved `nil` and data type checks for response handling.
- Updated middleware to skip request logging for `GET` requests.
- Added additional metadata fields (`Name`, `Description`, `DisplayName`, `Version`) to `ModelInfo` struct initialization for better model representation.
- Removed unnecessary whitespace in the code.
- Ensured `functionDeclarations` key renaming only occurs if the key exists in Gemini tools processing.
- Prevented unnecessary JSON reassignment when the target key is absent.
- Fixed parameter key renaming to correctly handle `functionDeclarations` and `parametersJsonSchema` in Gemini tools.
- Resolved potential overwriting issue by reassigning JSON strings after each key rename.
- Introduced `TLSConfig` to support HTTPS configurations, including enabling TLS, specifying certificate and key files.
- Updated HTTP server logic to handle HTTPS mode when TLS is enabled.
- Enhanced `config.example.yaml` with TLS settings example.
- Adjusted internal URL generation to respect protocol based on TLS state.
- Fixed improper handling of `indexAssigned` and `Index` during auth reassignment.
- Ensured `EnsureIndex` is invoked after validating existing auth entries.
**feat(retry): add configurable retry logic with cooldown support**
- Introduced `max-retry-interval` configuration for cooldown durations between retries.
- Added `SetRetryConfig` in `Manager` to handle retry attempts and cooldown intervals.
- Enhanced provider execution logic to include retry attempts, cooldown management, and dynamic wait periods.
- Updated API endpoints and YAML configuration to support `max-retry-interval`.
- Introduced `logOnErrorOnly` mode to enable logging only for error responses when request logging is disabled.
- Added endpoints to list and download error logs (`/request-error-logs`).
- Implemented error log file cleanup to retain only the newest 10 logs.
- Refactored `ResponseWriterWrapper` to support forced logging for error responses.
- Enhanced middleware to capture data for upstream error persistence.
- Improved log file naming and error log filename generation.
- Restored `thoughtSignature` validator bypass for model-specific parts in Gemini content processing.
- Removed redundant logic from the `executor` for cleaner handling.
Remove generationConfig.maxOutputTokens, generationConfig.responseMimeType and generationConfig.responseJsonSchema from the Gemini payload in translateRequest so we no longer send unsupported or conflicting response configuration fields. This lets the backend or caller control response formatting and output limits and helps prevent potential API errors caused by these keys.
- Updated `defaultAntigravityAgent` to version `1.11.5`.
- Adjusted const value formatting for improved readability.
**feat(executor): introduce fallback mechanism for Antigravity base URLs**
- Added retry logic with fallback order for Antigravity base URLs to handle request errors and rate limits.
- Refactored base URL handling with `antigravityBaseURLFallbackOrder` and related utilities.
- Enhanced error handling in non-streaming and streaming requests with retry support and improved metadata reporting.
- Updated `buildRequest` to support dynamic base URL assignment.
- Improved Antigravity executor to handle `thinkingConfig` adjustments and default `thinkingBudget` when `thinkingLevel` is removed.
- Updated translator response handling to set default values for output token counts when specific token data is missing.
- Introduced `modelName2Alias` and `alias2ModelName` functions for mapping between model names and aliases.
- Improved Antigravity payload transformation to include alias-to-model name conversion.
- Enhanced processing for Claude Sonnet models to adjust template parameters based on schema presence.
- Added `authFileUnchanged` to skip reloads for unchanged files based on SHA256 hash comparisons.
- Introduced `isKnownAuthFile` to verify known files before handling removal events.
- Improved event processing in `handleEvent` to reduce unnecessary reloads and enhance performance.
- Refactored response translation logic to support mixed content types (`input_text`, `output_text`, `input_image`) with better role assignments and part handling.
- Added image processing logic for embedding inline data with MIME type and base64 encoded content.
- Updated Antigravity request conversion to replace Gemini CLI references for consistency.