76 Commits

  • feat(runtime): add User-Agent headers to codex and claude executors
    - Standardized User-Agent strings for Codex and Claude executors to improve request tracing and compatibility.
    - Updated header insertion logic in both executors for consistency.
  • Fixed: #140 #133 #80
    feat(translator): add token counting functionality for Gemini, Claude, and CLI
    
    - Introduced `TokenCount` handling across various Codex translators (Gemini, Claude, CLI) with respective implementations.
    - Added utility methods for token counting and formatting responses.
    - Integrated `tiktoken-go/tokenizer` library for tokenization.
    - Updated CodexExecutor with token counting logic to support multiple models including GPT-5 variants.
    - Refined go.mod and go.sum to include new dependencies.
    
    feat(runtime): add token counting functionality across executors
    
    - Implemented token counting in OpenAICompatExecutor, QwenExecutor, and IFlowExecutor.
    - Added utilities for token counting and response formatting using `tiktoken-go/tokenizer`.
    - Integrated token counting into translators for Gemini, Claude, and Gemini CLI.
    - Extended token counting to multiple models, including GPT-5 variants.
    
    docs: update environment variable instructions for multi-model support
    
    - Added details for setting `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_SONNET_MODEL`, and `ANTHROPIC_DEFAULT_HAIKU_MODEL` for version 2.x.x.
    - Clarified usage of `ANTHROPIC_MODEL` and `ANTHROPIC_SMALL_FAST_MODEL` for version 1.x.x.
    - Expanded examples for setting environment variables across different models including Gemini, GPT-5, Claude, and Qwen3.
  • docs: add GPT-5 Codex guidelines for internal usage
    - Added comprehensive instructions for Codex CLI harness, sandboxing, approvals, and editing constraints to `internal/misc/codex_instructions/`.
    - Clarified `approval_policy` configurations and scenarios requiring escalated permissions.
    - Provided detailed style and structure guidelines for presenting results in the Codex CLI.
  • Refactor executor error handling and usage reporting
    - Updated the Execute methods in various executors (GeminiCLIExecutor, GeminiExecutor, IFlowExecutor, OpenAICompatExecutor, QwenExecutor) to return a response and error as named return values for improved clarity.
    - Enhanced error handling by deferring failure tracking in usage reporters, ensuring that failures are reported correctly.
    - Improved response body handling by ensuring proper closure and error logging for HTTP responses across all executors.
    - Added failure tracking and reporting in the usage reporter to capture unsuccessful requests.
    - Updated the usage logging structure to include a 'Failed' field for better tracking of request outcomes.
    - Adjusted the logic in the RequestStatistics and Record methods to accommodate the new failure tracking mechanism.
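
The named-return and deferred failure tracking pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: `usageReporter`, `trackFailure`, and `execute` are hypothetical names, and the real reporter presumably records far more than a single flag.

```go
package main

import "errors"

// usageReporter is a hypothetical stand-in for the project's usage reporter.
type usageReporter struct {
	failed bool
}

// trackFailure marks the request as failed if the pointed-to error is non-nil
// at the time the deferred call actually runs.
func (r *usageReporter) trackFailure(errPtr *error) {
	if errPtr != nil && *errPtr != nil {
		r.failed = true
	}
}

// execute uses named return values so the deferred call can observe the
// final value of err, whichever return statement produced it.
func execute(r *usageReporter, fail bool) (resp string, err error) {
	defer r.trackFailure(&err)
	if fail {
		return "", errors.New("upstream request failed")
	}
	return "ok", nil
}
```

Passing `&err` rather than `err` matters here: arguments to a deferred call are evaluated when `defer` executes, so only a pointer to the named return value sees the error assigned later.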
  • Fixed: #148
    feat(executor): add initial cache_helpers.go file
  • Enhance logging for API requests and responses across executors
    - Added detailed logging of upstream request metadata including URL, method, headers, and body for Codex, Gemini, IFlow, OpenAI Compat, and Qwen executors.
    - Implemented error logging for API response failures to capture errors during HTTP requests.
    - Introduced structured logging for authentication details (AuthID, AuthLabel, AuthType, AuthValue) to improve traceability.
    - Updated response logging to include status codes and headers for better debugging.
    - Ensured that all executors consistently log API interactions to facilitate monitoring and troubleshooting.
  • feat(registry/runtime): add Gemini 2.5 model and increase buffer sizes
    - Added new "Gemini 2.5 Flash Image Preview" model definition, with enhanced image generation capabilities.
    - Increased scanner buffer size to 20,971,520 bytes (20 MiB) across executors and translators to handle larger payloads.
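
As a rough illustration of the buffer change, the sketch below raises the limit of a `bufio.Scanner` to 20 MiB. The function name `splitPayloadLines` and the 64 KiB initial buffer are assumptions made for the example; the commit does not say how the executors wire the scanner up.

```go
package main

import (
	"bufio"
	"strings"
)

// maxScanBuffer is 20 MiB (20 * 1024 * 1024 = 20,971,520 bytes).
const maxScanBuffer = 20 * 1024 * 1024

// splitPayloadLines scans payload line by line, accepting lines far larger
// than bufio.Scanner's default 64 KiB token limit.
func splitPayloadLines(payload string) ([]string, error) {
	scanner := bufio.NewScanner(strings.NewReader(payload))
	// Start with a small buffer; the scanner grows it on demand up to the max.
	scanner.Buffer(make([]byte, 0, 64*1024), maxScanBuffer)
	var lines []string
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}
```

Without the `Buffer` call, a single line longer than the default limit makes `Scan` stop and `Err` return `bufio.ErrTooLong`, which is the failure mode a larger buffer avoids for big streamed payloads.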
  • feat(runtime): remove previous_response_id from Codex executor request body
    - Implemented logic to delete `previous_response_id` property from the request body in Codex executor.
    - Applied changes consistently across relevant Codex executor paths.
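
One way to drop the property, sketched with the standard library only. The helper name `stripPreviousResponseID` is hypothetical, and the executor may well use a different JSON library; this version also re-encodes the object, so keys come out in sorted order.

```go
package main

import "encoding/json"

// stripPreviousResponseID removes the top-level "previous_response_id" key
// from a JSON object and returns the re-encoded body. Other values are kept
// as raw JSON, so they are not re-interpreted.
func stripPreviousResponseID(body []byte) ([]byte, error) {
	var payload map[string]json.RawMessage
	if err := json.Unmarshal(body, &payload); err != nil {
		return nil, err
	}
	// delete is a no-op when the key is absent.
	delete(payload, "previous_response_id")
	return json.Marshal(payload)
}
```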
  • feat(runtime): introduce newProxyAwareHTTPClient for enhanced proxy handling
    - Added `newProxyAwareHTTPClient` to centralize proxy configuration, with `auth.ProxyURL` taking priority over `cfg.ProxyURL`.
    - Integrated enhanced proxy support across executors for HTTP, HTTPS, and SOCKS5 protocols.
    - Refactored redundant HTTP client initialization to use `newProxyAwareHTTPClient` for consistent behavior.
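
A minimal sketch of the proxy selection idea, assuming the per-credential proxy wins over the global one and that an empty value means "no proxy". `pickProxyURL` and `newProxyClient` are hypothetical names, not the signature of `newProxyAwareHTTPClient`.

```go
package main

import (
	"net/http"
	"net/url"
)

// pickProxyURL returns the per-credential proxy if set, else the global one.
func pickProxyURL(authProxy, cfgProxy string) string {
	if authProxy != "" {
		return authProxy
	}
	return cfgProxy
}

// newProxyClient builds an http.Client that routes through the chosen proxy.
// http.Transport picks the proxy type from the URL scheme, so http, https,
// and socks5 URLs all go through the same code path.
func newProxyClient(authProxy, cfgProxy string) (*http.Client, error) {
	raw := pickProxyURL(authProxy, cfgProxy)
	if raw == "" {
		// No proxy configured: fall back to the default transport.
		return &http.Client{}, nil
	}
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	return &http.Client{Transport: &http.Transport{Proxy: http.ProxyURL(u)}}, nil
}
```

For example, `newProxyClient("socks5://127.0.0.1:1080", "http://proxy.internal:8080")` would use the SOCKS5 proxy under these assumptions; both addresses are placeholders.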
  • refactor(executor): remove ClientAdapter and legacy fallback logic
    - Deleted `ClientAdapter` implementation and associated fallback methods.
    - Removed legacy executor logic from `codex`, `claude`, `gemini`, and `qwen` executors.
    - Simplified `handlers` by eliminating `UnwrapError` handling and related dependencies.
    - Cleaned up `model_registry` by removing logic associated with suspended clients.
    - Updated `.gitignore` to ignore `.serena/` directory.
  • feat(translators): add token counting support for Claude and Gemini responses
    - Implemented `TokenCount` transform method across translators to calculate token usage.
    - Integrated token counting logic into executor pipelines for Claude, Gemini, and CLI translators.
    - Added corresponding API endpoints and handlers (`/messages/count_tokens`) for token usage retrieval.
    - Enhanced translation registry to support `TokenCount` functionality alongside existing response types.
  • feat(usage): implement usage tracking infrastructure across executors
    - Added `LoggerPlugin` to log usage metrics for observability.
    - Introduced a new `Manager` to handle usage record queuing and plugin registration.
    - Integrated new usage reporter and detailed metrics parsing into executors, covering providers like OpenAI, Codex, Claude, and Gemini.
    - Improved token usage breakdown across streaming and non-streaming responses.
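
The queue-plus-plugins arrangement described above might look something like the sketch below. Every type and method name here (`Record`, `Plugin`, `Manager`, `Publish`, `tokenTotals`) is an assumption chosen for illustration; the commit only states that a `Manager` queues usage records and that plugins such as `LoggerPlugin` can be registered.

```go
package main

import "sync"

// Record is a hypothetical usage record.
type Record struct {
	Provider     string
	InputTokens  int
	OutputTokens int
}

// Plugin receives every usage record published to the manager.
type Plugin interface {
	HandleUsage(rec Record)
}

// Manager queues records and fans them out to registered plugins.
type Manager struct {
	mu      sync.Mutex
	plugins []Plugin
	queue   chan Record
	done    chan struct{}
}

// NewManager starts a background worker that drains the queue.
func NewManager(buffer int) *Manager {
	m := &Manager{queue: make(chan Record, buffer), done: make(chan struct{})}
	go m.run()
	return m
}

// Register adds a plugin; it receives records published afterwards.
func (m *Manager) Register(p Plugin) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.plugins = append(m.plugins, p)
}

// Publish enqueues a record without running plugins on the caller's goroutine.
func (m *Manager) Publish(rec Record) { m.queue <- rec }

// Close stops accepting records and waits until the queue is drained.
func (m *Manager) Close() {
	close(m.queue)
	<-m.done
}

func (m *Manager) run() {
	defer close(m.done)
	for rec := range m.queue {
		// Copy the plugin list so plugins run without holding the lock.
		m.mu.Lock()
		plugins := append([]Plugin(nil), m.plugins...)
		m.mu.Unlock()
		for _, p := range plugins {
			p.HandleUsage(rec)
		}
	}
}

// tokenTotals is a tiny example plugin that sums tokens across records.
type tokenTotals struct{ total int }

func (t *tokenTotals) HandleUsage(rec Record) {
	t.total += rec.InputTokens + rec.OutputTokens
}
```

Queuing keeps metric handling off the request path: an executor only pays for a channel send, and slow plugins delay the worker rather than the response.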
  • feat(executor): add CountTokens support across all executors
    - Introduced `CountTokens` method to Codex, Claude, Gemini, Qwen, OpenAI-compatible, and other executors.
    - Implemented `ExecuteCount` in `AuthManager` for token counting via provider round-robin.
    - Updated handlers to leverage `ExecuteCountWithAuthManager` for streamlined token counting.
    - Added fallback and error handling logic for token counting requests.
  • refactor(headers): centralize header logic using EnsureHeader utility
    - Introduced `EnsureHeader` in `internal/misc/header_utils.go` to streamline header setting across executors.
    - Updated Codex, Claude, and Gemini executors to utilize `EnsureHeader` for consistent header application.
    - Used headers from the incoming Gin context, when available, as a source when building upstream request headers.
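
A header helper in the spirit of `EnsureHeader` could be sketched as below. The signature and precedence rules are assumptions: here a value from the incoming request wins, and otherwise a default is applied only if the target has nothing set. The actual utility in `internal/misc/header_utils.go` may behave differently.

```go
package main

import (
	"net/http"
	"strings"
)

// ensureHeader is a hypothetical helper. It copies key from source into
// target when source carries a non-empty value; otherwise it falls back to
// defaultValue, but only if target does not already have the header.
func ensureHeader(target, source http.Header, key, defaultValue string) {
	if target == nil {
		return
	}
	if source != nil {
		if v := strings.TrimSpace(source.Get(key)); v != "" {
			target.Set(key, v)
			return
		}
	}
	if target.Get(key) == "" && defaultValue != "" {
		target.Set(key, defaultValue)
	}
}
```

In a Gin handler the `source` argument would typically be `c.Request.Header`, and `nil` when no Gin context is available.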
  • refactor(executor): centralize header application logic for executors
    - Replaced repetitive header setting logic with helper methods (`applyCodexHeaders`, `applyClaudeHeaders`, `applyQwenHeaders`) in Codex, Claude, and Qwen executors.
    - Ensured consistent headers in HTTP requests across all executors.
    - Introduced UUID and additional structured headers for better traceability (e.g., session IDs, metadata).
  • chore(executor): add debug logging for API request errors
    - Added detailed debug logs in all executors (Codex, Claude, Gemini, Qwen, OpenAI-compatible) to capture HTTP status and response body for failed API requests.
  • feat(auth): standardize last_refresh metadata handling across executors
    - Added `last_refresh` timestamp to metadata for Codex, Claude, Qwen, and Gemini executors.
    - Implemented `extractLastRefreshTimestamp` utility for parsing diverse timestamp formats in management handlers.
    - Ensured consistent update and preservation of `last_refresh` in file-based auth handling.
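
Parsing a `last_refresh` value that may arrive in several shapes can be sketched as follows. The accepted formats (RFC 3339 strings, Unix seconds as a string, and numeric JSON values) are assumptions, and `parseLastRefresh` is a hypothetical name rather than the body of `extractLastRefreshTimestamp`.

```go
package main

import (
	"strconv"
	"strings"
	"time"
)

// parseLastRefresh converts a metadata value into a UTC time.
// The boolean result reports whether the value could be interpreted.
func parseLastRefresh(v any) (time.Time, bool) {
	switch val := v.(type) {
	case string:
		s := strings.TrimSpace(val)
		if s == "" {
			return time.Time{}, false
		}
		// Try RFC 3339 first, then fall back to Unix seconds.
		if t, err := time.Parse(time.RFC3339, s); err == nil {
			return t.UTC(), true
		}
		if secs, err := strconv.ParseInt(s, 10, 64); err == nil {
			return time.Unix(secs, 0).UTC(), true
		}
	case float64:
		// encoding/json decodes JSON numbers into float64 by default.
		return time.Unix(int64(val), 0).UTC(), true
	case int64:
		return time.Unix(val, 0).UTC(), true
	}
	return time.Time{}, false
}
```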
  • chore(logging): add debug logs for executor Refresh methods
    - Introduced `logrus` for structured debugging across all executors.
    - Added debug log messages in `Refresh` methods for better traceability.
    - Updated `Manager` to log additional details during refresh checks.
  • feat: implement token refresh support for executors
    - Added `Refresh` method implementations for Codex, Claude, Gemini, and Qwen executors.
    - Introduced OAuth-based token handling for Gemini and Qwen with support for refresh tokens.
    - Updated Codex and Claude to use new internal auth services.
    - Enhanced metadata structure and consistency for token storage across all executors.
  • refactor: standardize constant naming and improve file-based auth handling
    - Renamed constants from uppercase to CamelCase for consistency.
    - Replaced redundant file-based auth handling logic with the new `util.CountAuthFiles` helper.
    - Fixed various error-handling inconsistencies and enhanced robustness in file operations.
    - Streamlined auth client reload logic in server and watcher components.
    - Applied minor code readability improvements across multiple packages.