The previous commit added thinkingLevel support but did not apply it
when the reasoning effort came from a model name suffix (e.g., model(minimal)).
This happened because ResolveThinkingConfigFromMetadata returns nil for
level-based models, so the metadata application was bypassed.
Changes:
- Add ApplyGemini3ThinkingLevelFromMetadata for standard Gemini API
- Add ApplyGemini3ThinkingLevelFromMetadataCLI for CLI API format
- Update gemini_cli_executor to apply Gemini 3 thinkingLevel from metadata
- Update antigravity_executor to apply Gemini 3 thinkingLevel from metadata
- Update aistudio_executor to apply Gemini 3 thinkingLevel from metadata
- Add comprehensive test coverage for Gemini 3 thinkingLevel functions
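A minimal sketch of the standard-API variant, assuming the payload is a decoded JSON map and the thinking level arrives as a string in the model metadata; the actual field names and signature in the repository may differ:

```go
// Illustrative only: push a thinkingLevel from model metadata into a
// Gemini generationConfig. Field names are assumptions.
func ApplyGemini3ThinkingLevelFromMetadata(payload, metadata map[string]any) {
	level, _ := metadata["thinkingLevel"].(string)
	if level == "" {
		return
	}
	genCfg, _ := payload["generationConfig"].(map[string]any)
	if genCfg == nil {
		genCfg = map[string]any{}
		payload["generationConfig"] = genCfg
	}
	thinking, _ := genCfg["thinkingConfig"].(map[string]any)
	if thinking == nil {
		thinking = map[string]any{}
		genCfg["thinkingConfig"] = thinking
	}
	thinking["thinkingLevel"] = level
}
```

The CLI-format variant would presumably apply the same block under the CLI request envelope rather than the top-level payload.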
Add package and constructor documentation for AI Studio, Antigravity,
Gemini CLI, Gemini API, and Vertex executors to describe their roles and
inputs.
Introduce a shared stream scanner buffer constant in the Gemini API
executor and reuse it in Gemini CLI and Vertex streaming code so stream
handling uses a consistent configuration.
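As a rough illustration (constant and helper names are assumptions), the shared sizing might look like this, matching the 20 MiB buffer used elsewhere in the streaming code:

```go
import (
	"bufio"
	"io"
)

const streamScannerBufferSize = 20 * 1024 * 1024 // 20,971,520 bytes

// scanStream is illustrative: read a streaming body line by line with the
// shared buffer size so large SSE chunks are not truncated.
func scanStream(body io.Reader, handle func(line []byte)) error {
	scanner := bufio.NewScanner(body)
	scanner.Buffer(make([]byte, 0, streamScannerBufferSize), streamScannerBufferSize)
	for scanner.Scan() {
		handle(scanner.Bytes())
	}
	return scanner.Err()
}
```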
Update the Refresh implementations for the AI Studio, Gemini CLI, Gemini API
(API key), and Vertex executors to short‑circuit and simply return the
incoming auth object, leaving the Antigravity executor as the only one that
still performs OAuth token renewal.
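The short-circuit amounts to returning the input unchanged; this is a sketch with assumed types, not the repository's exact signature:

```go
// Illustrative only: the Auth type here is a placeholder.
func (e *GeminiExecutor) Refresh(ctx context.Context, auth *Auth) (*Auth, error) {
	// API-key credentials have nothing to renew; hand the auth back unchanged.
	return auth, nil
}
```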
Remove OAuth2-based token refresh logic and related dependencies from
the Gemini API executor, since it now operates strictly with API key
credentials.
Add fallback parsing for quota reset delay when RetryInfo is not present:
- Try ErrorInfo.metadata.quotaResetDelay (e.g., "373.801628ms")
- Parse from error.message "Your quota will reset after Xs."
This ensures proper cooldown timing for rate-limited requests.
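A hedged sketch of that fallback chain (field paths, names, and the message regex are assumptions based on the shapes quoted above):

```go
import (
	"regexp"
	"strconv"
	"time"
)

var quotaResetMsgRe = regexp.MustCompile(`quota will reset after ([0-9.]+)s`)

// quotaResetDelay tries ErrorInfo metadata first, then the human-readable
// error message, returning false when neither yields a usable duration.
func quotaResetDelay(errInfoMetadata map[string]string, message string) (time.Duration, bool) {
	if raw := errInfoMetadata["quotaResetDelay"]; raw != "" { // e.g. "373.801628ms"
		if d, err := time.ParseDuration(raw); err == nil {
			return d, true
		}
	}
	if m := quotaResetMsgRe.FindStringSubmatch(message); len(m) == 2 {
		if secs, err := strconv.ParseFloat(m[1], 64); err == nil {
			return time.Duration(secs * float64(time.Second)), true
		}
	}
	return 0, false
}
```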
Extract applyThinkingMetadata and applyThinkingMetadataCLI helpers to
payload_helpers.go and use them across all four Gemini-based executors:
- gemini_executor.go (Execute, ExecuteStream, CountTokens)
- gemini_cli_executor.go (Execute, ExecuteStream, CountTokens)
- aistudio_executor.go (translateRequest)
- antigravity_executor.go (Execute, ExecuteStream)
This eliminates code duplication introduced in the -reasoning suffix PR
and centralizes the thinking config application logic.
Net reduction: 28 lines of code.
Fix critical bug where ExecuteStream would create a streaming channel
from a failed (non-2xx) response after exhausting all retries with no
fallback models available.
When retries were exhausted on the last model, the code would break from
the inner loop but fall through to streaming channel creation (line 401),
immediately returning at line 461. This made the error handling code at
lines 464-471 unreachable, causing clients to receive an empty/closed
stream instead of a proper error response.
Solution: Check whether httpResp is non-2xx before creating the streaming
channel. If it failed, continue the outer loop so the error-handling path
is reached.
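In outline (variable names are illustrative, not the file's actual ones), the guard sits just before the channel is created inside the outer model loop:

```go
// Inside the outer model loop, after the inner retry loop has finished:
if httpResp == nil || httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
	// Never build a streaming channel from a failed response; move on so
	// the error-handling path at the end of the loop is reached.
	continue
}
// ...only 2xx responses reach the streaming channel creation below.
```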
Identified by: codex-bot review
Ref: https://github.com/router-for-me/CLIProxyAPI/pull/280#pullrequestreview-3484560423
Fix critical bug where ExecuteStream would create a streaming channel
using a 429 error response instead of continuing to the next fallback
model after exhausting retries.
When 429 retries were exhausted and a fallback model was available,
the inner retry loop would break but immediately fall through to the
streaming channel creation, attempting to stream from the failed 429
response instead of trying the next model.
Solution: Add shouldContinueToNextModel flag to explicitly skip the
streaming logic and continue the outer model loop when appropriate.
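A condensed sketch of the intended control flow, with a hypothetical doRequest helper and simplified retry handling:

```go
import (
	"context"
	"net/http"
)

func executeStreamOutline(ctx context.Context, models []string) (*http.Response, error) {
	const maxAttempts = 3
	var lastErr error
	for _, model := range models {
		shouldContinueToNextModel := false
		var httpResp *http.Response
		for attempt := 0; attempt < maxAttempts; attempt++ {
			resp, err := doRequest(ctx, model) // hypothetical upstream call
			if err != nil {
				lastErr = err
				continue
			}
			httpResp = resp
			if resp.StatusCode != http.StatusTooManyRequests {
				break
			}
			if attempt == maxAttempts-1 {
				// 429 retries exhausted: skip streaming and try the next model.
				shouldContinueToNextModel = true
			}
		}
		if shouldContinueToNextModel || httpResp == nil {
			continue
		}
		return httpResp, nil // the streaming channel is created from httpResp here
	}
	return nil, lastErr
}
```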
Identified by: codex-bot review
Ref: https://github.com/router-for-me/CLIProxyAPI/pull/280#pullrequestreview-3484479106
Implement support for Google's RetryInfo.retryDelay when handling 429 rate
limit errors. The same model is retried up to 3 times using the exact
delays returned by Google's API before fallback models are tried.
- Add parseRetryDelay() to extract Google's retry guidance
- Implement inner retry loop in Execute() and ExecuteStream()
- Context-aware waiting with cancellation support
- Cap delays at 60s maximum for safety
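A sketch of the parsing side, assuming the 429 body follows the google.rpc error model with a RetryInfo detail (JSON paths may differ in practice):

```go
import (
	"encoding/json"
	"strings"
	"time"
)

const maxRetryDelay = 60 * time.Second

// parseRetryDelay pulls retryDelay (e.g. "13s") out of a RetryInfo detail
// and caps it at 60 seconds.
func parseRetryDelay(body []byte) (time.Duration, bool) {
	var payload struct {
		Error struct {
			Details []struct {
				Type       string `json:"@type"`
				RetryDelay string `json:"retryDelay"`
			} `json:"details"`
		} `json:"error"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return 0, false
	}
	for _, d := range payload.Error.Details {
		if !strings.HasSuffix(d.Type, "google.rpc.RetryInfo") || d.RetryDelay == "" {
			continue
		}
		if delay, err := time.ParseDuration(d.RetryDelay); err == nil {
			if delay > maxRetryDelay {
				delay = maxRetryDelay
			}
			return delay, true
		}
	}
	return 0, false
}
```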
Introduce `PayloadConfig` in the configuration to define default and override rules for modifying payload parameters. Implement `applyPayloadConfig` and `applyPayloadConfigWithRoot` to apply these rules across all relevant executors, ensuring consistent parameter handling for different models and protocols.
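One possible shape for that configuration (field names and YAML layout are assumptions, not the repository's actual schema):

```go
// PayloadRule describes one default/override entry; matching and path
// semantics are illustrative.
type PayloadRule struct {
	Models   []string       `yaml:"models"`   // model name patterns the rule applies to
	Protocol string         `yaml:"protocol"` // e.g. "gemini", "openai"
	Params   map[string]any `yaml:"params"`   // parameter path -> value
}

// PayloadConfig holds defaults (applied only when a parameter is absent)
// and overrides (applied unconditionally).
type PayloadConfig struct {
	Default  []PayloadRule `yaml:"default"`
	Override []PayloadRule `yaml:"override"`
}
```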
Introduce support for multi-project Gemini CLI logins, including shared and virtual credential management. Enhance runtime, metadata handling, and token updates for better project granularity and consistency across virtual and shared credentials. Extend onboarding to allow activating all available projects.
- Updated the Execute methods in various executors (GeminiCLIExecutor, GeminiExecutor, IFlowExecutor, OpenAICompatExecutor, QwenExecutor) to return a response and error as named return values for improved clarity.
- Enhanced error handling by deferring failure tracking in usage reporters, ensuring that failures are reported correctly.
- Improved response body handling by ensuring proper closure and error logging for HTTP responses across all executors.
- Added failure tracking and reporting in the usage reporter to capture unsuccessful requests.
- Updated the usage logging structure to include a 'Failed' field for better tracking of request outcomes.
- Adjusted the logic in the RequestStatistics and Record methods to accommodate the new failure tracking mechanism.
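The pattern the named returns enable looks roughly like this; the Request, Response, and reporter types shown are placeholders, not the project's exact interfaces:

```go
func (e *GeminiExecutor) Execute(ctx context.Context, req Request) (resp Response, err error) {
	defer func() {
		if err != nil {
			// Because resp and err are named, the deferred hook sees the
			// final values and can record the request as failed.
			reporter.RecordFailure(ctx, req.Model, err) // placeholder reporter API
		}
	}()
	// ...perform the upstream call, assigning resp and err...
	return resp, err
}
```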
- Added detailed logging of upstream request metadata including URL, method, headers, and body for Codex, Gemini, IFlow, OpenAI Compat, and Qwen executors.
- Implemented error logging for API response failures to capture errors during HTTP requests.
- Introduced structured logging for authentication details (AuthID, AuthLabel, AuthType, AuthValue) to improve traceability.
- Updated response logging to include status codes and headers for better debugging.
- Ensured that all executors consistently log API interactions to facilitate monitoring and troubleshooting.
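An illustrative helper in the spirit of that logging (the field set is an assumption):

```go
import (
	"net/http"

	log "github.com/sirupsen/logrus"
)

// logUpstreamRequest records the upstream call's metadata at debug level.
func logUpstreamRequest(req *http.Request, authID, authLabel, authType string) {
	log.WithFields(log.Fields{
		"url":        req.URL.String(),
		"method":     req.Method,
		"headers":    req.Header,
		"auth_id":    authID,
		"auth_label": authLabel,
		"auth_type":  authType,
	}).Debug("upstream request")
}
```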
feat(gemini): add Gemini thinking configuration support and metadata normalization
- Introduced logic to parse and apply `thinkingBudget` and `include_thoughts` configurations from metadata.
- Enhanced request handling to include normalized Gemini model metadata, preserving the original model identifier.
- Updated Gemini and Gemini-CLI executors to apply thinking configuration based on metadata overrides.
- Refactored handlers to support metadata extraction and cloning during request preparation.
- Introduced `gemini-2.5-flash-image-preview` model to the registry with updated definitions.
- Enhanced Gemini CLI and API executors to handle image aspect ratio adjustments for the new model.
- Added utility function to create base64 white image placeholders based on aspect ratio configurations.
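A minimal version of such a placeholder helper (name and dimension handling are illustrative):

```go
import (
	"bytes"
	"encoding/base64"
	"image"
	"image/color"
	"image/draw"
	"image/png"
)

// whiteImageBase64 renders a solid-white PNG at the given size and returns
// it base64-encoded, suitable as an aspect-ratio placeholder.
func whiteImageBase64(width, height int) (string, error) {
	img := image.NewRGBA(image.Rect(0, 0, width, height))
	draw.Draw(img, img.Bounds(), image.NewUniform(color.White), image.Point{}, draw.Src)
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}
```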
- Added `glm-4.6` model to registry and documentation.
- Updated Gemini CLI executor to pass configuration to `prepareGeminiCLITokenSource` for improved token management.
- Introduced `gemini-2.5-flash-image` model with updated definitions in registry.
- Enhanced model marker detection in Gemini CLI executor to include support for the new model.
- Added new "Gemini 2.5 Flash Image Preview" model definition, with enhanced image generation capabilities.
- Increased scanner buffer size to 20,971,520 bytes across executors and translators to handle larger payloads.
- Added `newProxyAwareHTTPClient` to centralize proxy configuration, with `auth.ProxyURL` taking priority over `cfg.ProxyURL`.
- Integrated enhanced proxy support across executors for HTTP, HTTPS, and SOCKS5 protocols.
- Refactored redundant HTTP client initialization to use `newProxyAwareHTTPClient` for consistent behavior.
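A sketch of the centralized construction; the precedence follows the description above, but the exact signature is an assumption. Go's http.Transport handles http, https, and socks5 proxy URLs through the same Proxy hook.

```go
import (
	"net/http"
	"net/url"
)

// newProxyAwareHTTPClient builds a client whose proxy comes from the auth
// entry first and the global config second.
func newProxyAwareHTTPClient(authProxyURL, cfgProxyURL string) (*http.Client, error) {
	raw := authProxyURL
	if raw == "" {
		raw = cfgProxyURL
	}
	transport := &http.Transport{}
	if raw != "" {
		proxyURL, err := url.Parse(raw)
		if err != nil {
			return nil, err
		}
		transport.Proxy = http.ProxyURL(proxyURL)
	}
	return &http.Client{Transport: transport}, nil
}
```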
- Implemented `TokenCount` transform method across translators to calculate token usage.
- Integrated token counting logic into executor pipelines for Claude, Gemini, and CLI translators.
- Added corresponding API endpoints and handlers (`/messages/count_tokens`) for token usage retrieval.
- Enhanced translation registry to support `TokenCount` functionality alongside existing response types.
- Added `LoggerPlugin` to log usage metrics for observability.
- Introduced a new `Manager` to handle usage record queuing and plugin registration.
- Integrated new usage reporter and detailed metrics parsing into executors, covering providers like OpenAI, Codex, Claude, and Gemini.
- Improved token usage breakdown across streaming and non-streaming responses.
- Introduced `CountTokens` method to Codex, Claude, Gemini, Qwen, OpenAI-compatible, and other executors.
- Implemented `ExecuteCount` in `AuthManager` for token counting via provider round-robin.
- Updated handlers to leverage `ExecuteCountWithAuthManager` for streamlined token counting.
- Added fallback and error handling logic for token counting requests.
- Introduced `EnsureHeader` in `internal/misc/header_utils.go` to streamline header setting across executors.
- Updated Codex, Claude, and Gemini executors to utilize `EnsureHeader` for consistent header application.
- Incorporated Gin context headers (if available) into request header manipulation for better integration.
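A minimal sketch of what EnsureHeader could look like; the actual signature in internal/misc/header_utils.go may differ:

```go
import (
	"net/http"

	"github.com/gin-gonic/gin"
)

// EnsureHeader sets key on h only when it is not already present, preferring
// a value carried by the incoming Gin request over the supplied fallback.
func EnsureHeader(h http.Header, c *gin.Context, key, fallback string) {
	if h.Get(key) != "" {
		return
	}
	if c != nil {
		if v := c.GetHeader(key); v != "" {
			h.Set(key, v)
			return
		}
	}
	h.Set(key, fallback)
}
```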
- Added detailed debug logs in all executors (Codex, Claude, Gemini, Qwen, OpenAI-compatible) to capture HTTP status and response body for failed API requests.
- Deleted `Refresh` implementations in Codex, Claude, Gemini, Qwen, and Gemini-web authenticators.
- Updated the `Authenticator` interface to exclude `Refresh` for cleaner design.
- Revised `Manager` and related components to handle refresh logic improvements.
- Simplified token refresh behavior and eliminated redundant code paths.
- Introduced `logrus` for structured debugging across all executors.
- Added debug log messages in `Refresh` methods for better traceability.
- Updated `Manager` to log additional details during refresh checks.
- Renamed constants from uppercase to CamelCase for consistency.
- Replaced redundant file-based auth handling logic with the new `util.CountAuthFiles` helper.
- Fixed various error-handling inconsistencies and enhanced robustness in file operations.
- Streamlined auth client reload logic in server and watcher components.
- Applied minor code readability improvements across multiple packages.