- Introduced `gpt-5.1-codex-max` variants to model definitions (`low`, `medium`, `high`, `xhigh`).
- Updated executor logic to map effort levels for Codex Max models.
- Added `lastCodexMaxPrompt` processing for `gpt-5.1-codex-max` prompts.
- Defined instructions for `gpt-5.1-codex-max` in a new file: `codex_instructions/gpt-5.1-codex-max_prompt.md`.
Extract reasoning effort mapping into a reusable function `setReasoningEffortByAlias` to reduce redundancy and improve maintainability. Introduce support for the "gpt-5.1-none" variant in the registry and runtime executor.
Stop advertising and mapping the unsupported gpt-5.1-minimal variant in the model registry and Codex executor, and align bare gpt-5.1 requests to use medium reasoning effort like Codex CLI while preserving minimal for gpt-5.
Expand executor logic to handle GPT-5.1 Codex family and its variants, including reasoning effort configurations for minimal, low, medium, and high levels. Ensure proper mapping of models to payload parameters.
Introduce `PayloadConfig` in the configuration to define default and override rules for modifying payload parameters. Implement `applyPayloadConfig` and `applyPayloadConfigWithRoot` to apply these rules across various executors, ensuring consistent parameter handling for different models and protocols. Update all relevant executors to utilize this functionality.
feat(runtime): add support for GPT-5.1 models and variants
Introduce GPT-5.1 model family, including minimal, low, medium, high, Codex, and Codex Mini variants. Update tokenization and reasoning effort handling to accommodate new models in executor and registry.
Adds three new Codex Mini model variants (mini, mini-medium, mini-high)
that map to codex-mini-latest. Codex Mini supports medium and high
reasoning effort levels only (no low/minimal). Base model defaults to
medium reasoning effort.
- Standardized User-Agent strings for Codex and Claude executors to improve request tracing and compatibility.
- Updated header insertion logic in both executors for consistency.
feat(translator): add token counting functionality for Gemini, Claude, and CLI
- Introduced `TokenCount` handling across various Codex translators (Gemini, Claude, CLI) with respective implementations.
- Added utility methods for token counting and formatting responses.
- Integrated `tiktoken-go/tokenizer` library for tokenization.
- Updated CodexExecutor with token counting logic to support multiple models including GPT-5 variants.
- Refined go.mod and go.sum to include new dependencies.
feat(runtime): add token counting functionality across executors
- Implemented token counting in OpenAICompatExecutor, QwenExecutor, and IFlowExecutor.
- Added utilities for token counting and response formatting using `tiktoken-go/tokenizer`.
- Integrated token counting into translators for Gemini, Claude, and Gemini CLI.
- Enhanced multiple model support, including GPT-5 variants, for token counting.
docs: update environment variable instructions for multi-model support
- Added details for setting `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_SONNET_MODEL`, and `ANTHROPIC_DEFAULT_HAIKU_MODEL` for version 2.x.x.
- Clarified usage of `ANTHROPIC_MODEL` and `ANTHROPIC_SMALL_FAST_MODEL` for version 1.x.x.
- Expanded examples for setting environment variables across different models including Gemini, GPT-5, Claude, and Qwen3.
- Added comprehensive instructions for Codex CLI harness, sandboxing, approvals, and editing constraints to `internal/misc/codex_instructions/`.
- Clarified `approval_policy` configurations and scenarios requiring escalated permissions.
- Provided detailed style and structure guidelines for presenting results in the Codex CLI.
- Updated the Execute methods in various executors (GeminiCLIExecutor, GeminiExecutor, IFlowExecutor, OpenAICompatExecutor, QwenExecutor) to return a response and error as named return values for improved clarity.
- Enhanced error handling by deferring failure tracking in usage reporters, ensuring that failures are reported correctly.
- Improved response body handling by ensuring proper closure and error logging for HTTP responses across all executors.
- Added failure tracking and reporting in the usage reporter to capture unsuccessful requests.
- Updated the usage logging structure to include a 'Failed' field for better tracking of request outcomes.
- Adjusted the logic in the RequestStatistics and Record methods to accommodate the new failure tracking mechanism.
- Added detailed logging of upstream request metadata including URL, method, headers, and body for Codex, Gemini, IFlow, OpenAI Compat, and Qwen executors.
- Implemented error logging for API response failures to capture errors during HTTP requests.
- Introduced structured logging for authentication details (AuthID, AuthLabel, AuthType, AuthValue) to improve traceability.
- Updated response logging to include status codes and headers for better debugging.
- Ensured that all executors consistently log API interactions to facilitate monitoring and troubleshooting.
- Added new "Gemini 2.5 Flash Image Preview" model definition, with enhanced image generation capabilities.
- Increased scanner buffer size to 20,971,520 bytes across executors and translators to handle larger payloads.
- Implemented logic to delete `previous_response_id` property from the request body in Codex executor.
- Applied changes consistently across relevant Codex executor paths.
- Added `newProxyAwareHTTPClient` to centralize proxy configuration with priority on `auth.ProxyURL` and `cfg.ProxyURL`.
- Integrated enhanced proxy support across executors for HTTP, HTTPS, and SOCKS5 protocols.
- Refactored redundant HTTP client initialization to use `newProxyAwareHTTPClient` for consistent behavior.
- Deleted `ClientAdapter` implementation and associated fallback methods.
- Removed legacy executor logic from `codex`, `claude`, `gemini`, and `qwen` executors.
- Simplified `handlers` by eliminating `UnwrapError` handling and related dependencies.
- Cleaned up `model_registry` by removing logic associated with suspended clients.
- Updated `.gitignore` to ignore `.serena/` directory.
- Implemented `TokenCount` transform method across translators to calculate token usage.
- Integrated token counting logic into executor pipelines for Claude, Gemini, and CLI translators.
- Added corresponding API endpoints and handlers (`/messages/count_tokens`) for token usage retrieval.
- Enhanced translation registry to support `TokenCount` functionality alongside existing response types.
- Added `LoggerPlugin` to log usage metrics for observability.
- Introduced a new `Manager` to handle usage record queuing and plugin registration.
- Integrated new usage reporter and detailed metrics parsing into executors, covering providers like OpenAI, Codex, Claude, and Gemini.
- Improved token usage breakdown across streaming and non-streaming responses.
- Introduced `CountTokens` method to Codex, Claude, Gemini, Qwen, OpenAI-compatible, and other executors.
- Implemented `ExecuteCount` in `AuthManager` for token counting via provider round-robin.
- Updated handlers to leverage `ExecuteCountWithAuthManager` for streamlined token counting.
- Added fallback and error handling logic for token counting requests.
- Introduced `EnsureHeader` in `internal/misc/header_utils.go` to streamline header setting across executors.
- Updated Codex, Claude, and Gemini executors to utilize `EnsureHeader` for consistent header application.
- Incorporated Gin context headers (if available) into request header manipulation for better integration.
- Added detailed debug logs in all executors (Codex, Claude, Gemini, Qwen, OpenAI-compatible) to capture HTTP status and response body for failed API requests.
- Added `last_refresh` timestamp to metadata for Codex, Claude, Qwen, and Gemini executors.
- Implemented `extractLastRefreshTimestamp` utility for parsing diverse timestamp formats in management handlers.
- Ensured consistent update and preservation of `last_refresh` in file-based auth handling.
- Introduced `logrus` for structured debugging across all executors.
- Added debug log messages in `Refresh` methods for better traceability.
- Updated `Manager` to log additional details during refresh checks.
- Added `Refresh` method implementations for Codex, Claude, Gemini, and Qwen executors.
- Introduced OAuth-based token handling for Gemini and Qwen with support for refresh tokens.
- Updated Codex and Claude to use new internal auth services.
- Enhanced metadata structure and consistency for token storage across all executors.
- Renamed constants from uppercase to CamelCase for consistency.
- Replaced redundant file-based auth handling logic with the new `util.CountAuthFiles` helper.
- Fixed various error-handling inconsistencies and enhanced robustness in file operations.
- Streamlined auth client reload logic in server and watcher components.
- Applied minor code readability improvements across multiple packages.