CLIProxyAPI

fix(gemini-cli): enhance 429 retry delay parsing

Add fallback parsing for quota reset delay when RetryInfo is not present:
- Try ErrorInfo.metadata.quotaResetDelay (e.g., "373.801628ms")
- Parse from error.message "Your quota will reset after Xs."

This ensures proper cooldown timing for rate-limited requests.
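
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `parseRetryDelay`, the `errorEnvelope` struct, and the exact JSON layout (a `details` array with `@type`, `retryDelay`, and `metadata` fields) are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"regexp"
	"strings"
	"time"
)

// resetAfterRe matches the human-readable hint in error.message,
// e.g. "Your quota will reset after 39s."
var resetAfterRe = regexp.MustCompile(`quota will reset after (\d+(?:\.\d+)?)s`)

// errorEnvelope holds only the fields this sketch reads (assumed layout).
type errorEnvelope struct {
	Error struct {
		Message string `json:"message"`
		Details []struct {
			Type       string            `json:"@type"`
			RetryDelay string            `json:"retryDelay"`
			Metadata   map[string]string `json:"metadata"`
		} `json:"details"`
	} `json:"error"`
}

// parseRetryDelay (hypothetical name) tries RetryInfo first, then
// ErrorInfo.metadata.quotaResetDelay, then the message text.
func parseRetryDelay(body []byte) (time.Duration, bool) {
	var env errorEnvelope
	if err := json.Unmarshal(body, &env); err != nil {
		return 0, false
	}
	// 1. Preferred source: RetryInfo.retryDelay, when present.
	for _, d := range env.Error.Details {
		if strings.HasSuffix(d.Type, "RetryInfo") && d.RetryDelay != "" {
			if delay, err := time.ParseDuration(d.RetryDelay); err == nil {
				return delay, true
			}
		}
	}
	// 2. Fallback: ErrorInfo.metadata.quotaResetDelay, e.g. "373.801628ms".
	for _, d := range env.Error.Details {
		if !strings.HasSuffix(d.Type, "ErrorInfo") {
			continue
		}
		if raw := d.Metadata["quotaResetDelay"]; raw != "" {
			if delay, err := time.ParseDuration(raw); err == nil {
				return delay, true
			}
		}
	}
	// 3. Last resort: scrape the number of seconds from the message.
	if m := resetAfterRe.FindStringSubmatch(env.Error.Message); m != nil {
		if delay, err := time.ParseDuration(m[1] + "s"); err == nil {
			return delay, true
		}
	}
	return 0, false
}
```

`time.ParseDuration` accepts fractional values with unit suffixes, so both `"373.801628ms"` and a seconds value taken from the message can go through the same call.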

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

sususu · 2025-12-11 09:34:39 +08:00

07d21463ca

feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic

- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.

Luis Pater · 2025-12-11 03:10:50 +08:00

423ce97665

fix(logging): update response aggregation logic to include all attempts

Luis Pater · 2025-12-10 16:53:48 +08:00

94d61c7b2b

fix(antigravity): remove references to autopush endpoint and update fallback logic

Luis Pater · 2025-12-10 00:13:20 +08:00

f25f419e5a

fix(executor): centralize default thinking config

hkfires · 2025-12-09 21:05:06 +08:00

9b202b6c1c

feat(executor): enforce minimum thinking budget for antigravity models

hkfires · 2025-12-09 21:05:06 +08:00

6a66b6801a

feat(executor): normalize thinking budget across all Gemini executors

hkfires · 2025-12-09 21:05:06 +08:00

5ec9b5e5a9

Merge pull request #470 from router-for-me/agry

fix(gemini): normalize model listing output

Luis Pater · 2025-12-09 21:00:29 +08:00

5db3b58717

Fixed: #463

fix(antigravity): remove `$ref` and `$defs` from JSON during key deletion

Luis Pater · 2025-12-09 17:32:17 +08:00

39b6b3b289

feat(antigravity): support canonical names for antigravity models

hkfires · 2025-12-09 16:54:13 +08:00

e5312fb5a2

feat(aistudio): normalize thinking budget in request translation

hkfires · 2025-12-09 08:27:44 +08:00

96b55acff8

fix(antigravity): remove exclusiveMaximum from JSON during key deletion

Luis Pater · 2025-12-08 23:28:01 +08:00

af00304b0c

refactor(logging): remove unused variable in ensureAttempt and redundant function call

Luis Pater · 2025-12-08 22:25:58 +08:00

6ad188921c

feat(antigravity): enforce thinking budget limits for Claude models

hkfires · 2025-12-08 20:36:17 +08:00

a283545b6b

feat(registry): add explicit thinking support config for antigravity models

hkfires · 2025-12-07 19:12:55 +08:00

9c09128e00

Fixed: #424

**feat(antigravity): add support for maxOutputTokens and refine Claude model handling**

Luis Pater · 2025-12-07 01:55:57 +08:00

fd29ab418a

fix(antigravity): restore production base URL in the executor

Luis Pater · 2025-12-06 01:11:37 +08:00

d7564173dd

Fixed: #421

feat(antigravity): implement project ID retrieval and integration in payload processing

Luis Pater · 2025-12-06 00:40:55 +08:00

c44c46dd80

**refactor(antigravity): handle anyOf property, remove exclusiveMinimum, and comment unused prod URL**

Luis Pater · 2025-12-05 21:24:12 +08:00

d4d529833d

**feat(util): add DeleteKey function and update antigravity executor for Claude model compatibility**

Luis Pater · 2025-12-05 01:55:45 +08:00

d6352dd4d4

Fixed: #414

**refactor(gemini): comment out unused CLI preview entry**

Luis Pater · 2025-12-04 17:55:13 +08:00

bceecfb2e3

**feat(antigravity): add support for Claude-Opus-4-5-Thinking model**

Luis Pater · 2025-12-04 16:13:13 +08:00

6a2906e3e5

refactor(antigravity): uncomment prod environment URL in fallback chain

Luis Pater · 2025-12-02 22:47:18 +08:00

e93f87294a

**refactor(cliproxy, config): remove vertex-compat flow, streamline Vertex API key handling**

- Removed `vertex-compat` executor and related configuration.
- Consolidated Vertex compatibility checks into `vertex` handling with `apikey`-based model resolution.
- Streamlined model generation logic for Vertex API key entries.

Luis Pater · 2025-12-02 09:18:24 +08:00

0fd2abbc3b

feat: Add support for VertexAI compatible service (#375)

feat: consolidate Vertex AI compatibility with API key support in Gemini

Aero · 2025-12-02 08:14:22 +08:00

0ebb654019

refactor(executor): dedupe thinking metadata helpers across Gemini executors

Extract applyThinkingMetadata and applyThinkingMetadataCLI helpers to
payload_helpers.go and use them across all four Gemini-based executors:
- gemini_executor.go (Execute, ExecuteStream, CountTokens)
- gemini_cli_executor.go (Execute, ExecuteStream, CountTokens)
- aistudio_executor.go (translateRequest)
- antigravity_executor.go (Execute, ExecuteStream)

This eliminates code duplication introduced in the -reasoning suffix PR
and centralizes the thinking config application logic.

Net reduction: 28 lines of code.

auroraflux · 2025-11-30 15:20:15 -08:00

1c6f4be8ae

Merge pull request #376 from auroraflux/feat/reasoning-suffix-support

feat(util): add -reasoning suffix support for Gemini models

Luis Pater · 2025-11-30 20:55:38 +08:00

73208c4e55

**feat(util): add -reasoning suffix support for Gemini models**

Adds support for the `-reasoning` model name suffix which enables
thinking/reasoning mode with dynamic budget. This allows clients to
request reasoning-enabled inference using model names like
`gemini-2.5-flash-reasoning` without explicit configuration.

The suffix is normalized to the base model (e.g., gemini-2.5-flash)
with thinkingBudget=-1 (dynamic) and include_thoughts=true.

Follows the existing pattern established by -nothinking and
-thinking-N suffixes.
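
As a rough illustration of the normalization described above, the following sketch strips the `-reasoning` suffix and returns the settings it implies. The names `normalizeReasoningSuffix` and `thinkingOverride` are hypothetical, and the sketch leaves out the `-nothinking` and `-thinking-N` suffixes, whose exact semantics are not spelled out in this message.

```go
package main

import "strings"

// thinkingOverride is a hypothetical container for the settings a suffix implies.
type thinkingOverride struct {
	Budget          int  // -1 requests a dynamic thinking budget
	IncludeThoughts bool // maps to include_thoughts in the payload
}

// normalizeReasoningSuffix (hypothetical name) returns the base model and,
// when the -reasoning suffix was present, the override to apply.
func normalizeReasoningSuffix(model string) (string, *thinkingOverride) {
	const suffix = "-reasoning"
	if !strings.HasSuffix(model, suffix) {
		// No suffix: leave the model name and thinking config untouched.
		return model, nil
	}
	base := strings.TrimSuffix(model, suffix)
	return base, &thinkingOverride{Budget: -1, IncludeThoughts: true}
}
```

Under these assumptions, `gemini-2.5-flash-reasoning` resolves to the base model `gemini-2.5-flash` with a budget of -1 and thoughts included, while a name without the suffix passes through unchanged.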

auroraflux · 2025-11-30 01:18:57 -08:00

32d3809f8c

**fix(executor, auth): ensure index assignment consistency for auth objects**

- Updated `usage_helpers.go` to call `EnsureIndex()` for proper index assignment in reporter initialization.
- Adjusted `auth/manager.go` to assign auth indices inside a locked section when they are unassigned, ensuring thread safety and consistency.

Luis Pater · 2025-11-30 16:56:29 +08:00

a748e93fd9

fix(claude): ensure max_tokens exceeds thinking budget for thinking models

Fixes an issue where Claude thinking models would return 400 errors when
the thinking.budget_tokens was greater than or equal to max_tokens.

Changes:
- Add MaxCompletionTokens: 128000 to all Claude thinking model definitions
- Add ensureMaxTokensForThinking() function in claude_executor.go that:
  - Checks if thinking is enabled with a budget_tokens value
  - Looks up the model's MaxCompletionTokens from the registry
  - Ensures max_tokens is set to at least the model's MaxCompletionTokens
  - Falls back to budget_tokens + 4000 buffer if registry lookup fails

This ensures the Anthropic API constraint (max_tokens > thinking.budget_tokens)
is always satisfied when using extended thinking features.
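
The steps listed above can be sketched as a small pure function. This is an illustration under stated assumptions rather than the code in claude_executor.go: the name `ensureMaxTokens`, the `maxCompletionTokens` map standing in for the registry lookup, and the model name used as a key are all hypothetical.

```go
package main

// maxCompletionTokens stands in for the model registry (hypothetical data).
var maxCompletionTokens = map[string]int{
	"claude-sonnet-4-5-thinking": 128000,
}

// thinkingBuffer is the headroom added when the registry lookup fails.
const thinkingBuffer = 4000

// ensureMaxTokens (hypothetical name) returns the max_tokens value to send.
// budgetTokens <= 0 is treated as "thinking not enabled".
func ensureMaxTokens(model string, budgetTokens, maxTokens int) int {
	if budgetTokens <= 0 {
		return maxTokens
	}
	// Prefer the registry value; otherwise fall back to budget + buffer.
	floor, ok := maxCompletionTokens[model]
	if !ok {
		floor = budgetTokens + thinkingBuffer
	}
	// Only ever raise max_tokens, never lower a larger caller-supplied value.
	if maxTokens < floor {
		return floor
	}
	return maxTokens
}
```

For example, a request with a budget of 10000 and max_tokens of 4096 against a model missing from the map would be raised to 14000, which keeps max_tokens above the thinking budget.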

Fixes: #339

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

nestharus · 2025-11-26 22:31:05 -08:00

e73cdf5cff

Fixed: #339

**feat(handlers, executor): add Gemini 3 Pro Preview support and refine Claude system instructions**

- Added support for the new "Gemini 3 Pro Preview" action in Gemini handlers, including detailed metadata and configuration.
- Removed redundant `cache_control` field from Claude system instructions for cleaner payload structure.

Luis Pater · 2025-11-26 11:42:57 +08:00

a4a26d978e

Fixed: #337

**fix(executor): replace redundant commented code with `checkSystemInstructions` helper**

- Replaced commented-out `sjson.SetRawBytes` lines with the new `checkSystemInstructions` function.
- Centralized system instruction handling for better code clarity and reuse.
- Ensured consistent logic for managing `system` field across Claude executor flows.

Luis Pater · 2025-11-26 08:27:48 +08:00

ed9f6e897e

Merge branch 'dev' into feat/claude-thinking-and-beta-headers

Luis Pater · 2025-11-26 02:16:40 +08:00

2e5681ea32

**fix(executor): comment out redundant code for setting Claude system instructions**

- Commented out multiple instances of `sjson.SetRawBytes` for setting `system` key to Claude instructions as they are redundant.
- Code cleanup to improve clarity and maintainability without affecting functionality.

Luis Pater · 2025-11-26 02:06:16 +08:00

52c17f03a5

feat(claude): add thinking model variants and beta headers support

- Add Claude thinking model definitions (sonnet-4-5-thinking, opus-4-5-thinking variants)
- Add Thinking support for antigravity models with -thinking suffix
- Add injectThinkingConfig() for automatic thinking budget based on model suffix
- Add resolveUpstreamModel() mappings for thinking variants to actual Claude models
- Add extractAndRemoveBetas() to convert betas array to anthropic-beta header
- Update applyClaudeHeaders() to merge custom betas from request body
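
The betas-to-header conversion mentioned above might look roughly like this. It is a sketch only: the function name `extractBetas` and its signature are assumptions, and it uses `encoding/json` for self-containment rather than whatever JSON helpers the executor actually uses.

```go
package main

import (
	"encoding/json"
	"strings"
)

// extractBetas (hypothetical name) removes the top-level "betas" array from a
// request body and returns the remaining body plus a comma-separated value
// suitable for the anthropic-beta header.
func extractBetas(body []byte) ([]byte, string, error) {
	var req map[string]json.RawMessage
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, "", err
	}
	raw, ok := req["betas"]
	if !ok {
		// Nothing to extract; return the body unchanged.
		return body, "", nil
	}
	var betas []string
	if err := json.Unmarshal(raw, &betas); err != nil {
		return nil, "", err
	}
	delete(req, "betas")
	out, err := json.Marshal(req)
	if err != nil {
		return nil, "", err
	}
	return out, strings.Join(betas, ","), nil
}
```

A caller would then set the returned string on the outgoing request, merging it with any default beta values, and send the stripped body upstream.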

Closes #324

nestharus · 2025-11-25 03:33:05 -08:00

d0e694d4ed

**fix(executor): update antigravity executor to enhance model metadata handling**

- Added additional metadata fields (`Name`, `Description`, `DisplayName`, `Version`) to `ModelInfo` struct initialization for better model representation.
- Removed unnecessary whitespace in the code.

Luis Pater · 2025-11-25 09:19:01 +08:00

113db3c5bf

**fix(translator): reintroduce thoughtSignature bypass logic for model parts**

- Restored `thoughtSignature` validator bypass for model-specific parts in Gemini content processing.
- Removed redundant logic from the `executor` for cleaner handling.

Luis Pater · 2025-11-23 20:52:23 +08:00

ddb0c0ec1c

fix(aistudio): strip Gemini generation config overrides

Remove generationConfig.maxOutputTokens, generationConfig.responseMimeType and generationConfig.responseJsonSchema from the Gemini payload in translateRequest so we no longer send unsupported or conflicting response configuration fields. This lets the backend or caller control response formatting and output limits and helps prevent potential API errors caused by these keys.

hkfires · 2025-11-23 19:44:03 +08:00

62bfd62871

**chore(executor): update default agent version and simplify const formatting**

- Updated `defaultAntigravityAgent` to version `1.11.5`.
- Adjusted const value formatting for improved readability.

**feat(executor): introduce fallback mechanism for Antigravity base URLs**

- Added retry logic with fallback order for Antigravity base URLs to handle request errors and rate limits.
- Refactored base URL handling with `antigravityBaseURLFallbackOrder` and related utilities.
- Enhanced error handling in non-streaming and streaming requests with retry support and improved metadata reporting.
- Updated `buildRequest` to support dynamic base URL assignment.
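
A minimal sketch of the fallback idea, assuming each attempt reports an HTTP status and an error. The helper `doWithFallback` and its callback signature are hypothetical and are not taken from the executor; the real `antigravityBaseURLFallbackOrder` logic may differ in which statuses trigger a retry.

```go
package main

import (
	"errors"
	"net/http"
)

// errRateLimited marks an attempt that returned HTTP 429.
var errRateLimited = errors.New("rate limited")

// doWithFallback (hypothetical) tries each base URL in order and returns the
// first one whose attempt neither errors nor hits a rate limit.
func doWithFallback(baseURLs []string, attempt func(baseURL string) (int, error)) (string, error) {
	var lastErr error
	for _, u := range baseURLs {
		status, err := attempt(u)
		if err != nil {
			// Request error: remember it and move to the next base URL.
			lastErr = err
			continue
		}
		if status == http.StatusTooManyRequests {
			// Rate limited: try the next base URL in the fallback order.
			lastErr = errRateLimited
			continue
		}
		return u, nil
	}
	if lastErr == nil {
		lastErr = errors.New("no base URLs configured")
	}
	return "", lastErr
}
```

Passing the attempt as a callback keeps the ordering logic separate from request construction, which fits the note that `buildRequest` accepts the base URL dynamically.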

Luis Pater · 2025-11-23 17:53:07 +08:00

257621c5ed

**feat(executor, translator): enhance token handling and payload processing**

- Improved Antigravity executor to handle `thinkingConfig` adjustments and default `thinkingBudget` when `thinkingLevel` is removed.
- Updated translator response handling to set default values for output token counts when specific token data is missing.

Luis Pater · 2025-11-23 11:32:37 +08:00

ac064389ca

**feat(executor): add model alias mapping and improve Antigravity payload handling**

- Introduced `modelName2Alias` and `alias2ModelName` functions for mapping between model names and aliases.
- Improved Antigravity payload transformation to include alias-to-model name conversion.
- Enhanced processing for Claude Sonnet models to adjust template parameters based on schema presence.

Luis Pater · 2025-11-23 03:16:14 +08:00

8d23ffc873

fix(gemini): parse stream usage from JSON, skip thoughtSignature

hkfires · 2025-11-22 16:07:12 +08:00

166fa9e2e6

fix(gemini): filter SSE usage metadata in streams

hkfires · 2025-11-22 15:53:36 +08:00

88e566281e

fix(runtime): treat non-empty finishReason as terminal

hkfires · 2025-11-22 15:39:46 +08:00

d32bb9db6b

fix(executor): expire stop chunks without usage metadata

hkfires · 2025-11-22 15:27:47 +08:00

8356b35320

feat(runtime): track antigravity usage and token counts

hkfires · 2025-11-22 14:04:28 +08:00

19a048879c

fix: handle empty and non-JSON SSE chunks safely

hkfires · 2025-11-22 13:49:23 +08:00

1061354b2f

fix: preserve SSE usage metadata-only trailing chunks

hkfires · 2025-11-22 13:25:25 +08:00

46b4110ff3

fix(sse): preserve usage metadata for stop chunks

hkfires · 2025-11-22 12:50:23 +08:00

8ce22b8403

Fixed: #302

**feat(executor): enhance WebSocket error handling and metadata logging**

- Added handling for stream closure before start with appropriate error recording.
- Improved metadata logging for non-OK HTTP status codes in WebSocket responses.
- Consolidated event processing logic with `processEvent` for better error handling and payload management.
- Refactored stream initialization to include the first event handling for smoother execution flow.

Luis Pater · 2025-11-22 11:18:13 +08:00

d291eb9489

166 Commits