Compare commits

..

181 Commits

Author SHA1 Message Date
Luis Pater
f887f9985d Merge pull request #1248 from shekohex/feat/responses-compact
feat(openai): add responses/compact support
2026-01-31 03:12:55 +08:00
Luis Pater
550da0cee8 fix(translator): include token usage in message_delta for Claude responses 2026-01-31 02:55:27 +08:00
Luis Pater
7ff3936efe fix(caching): ensure prompt-caching beta is always appended and add multi-turn cache control tests 2026-01-31 01:42:58 +08:00
Luis Pater
f36a5f5654 Merge pull request #1294 from Darley-Wey/fix/claude2gemini
fix: skip empty text parts and messages to avoid Gemini API error
2026-01-31 01:05:41 +08:00
Luis Pater
c1facdff67 Merge pull request #1295 from SchneeMart/feature/claude-caching
feat(caching): implement Claude prompt caching with multi-turn support
2026-01-31 01:04:19 +08:00
Luis Pater
4ee46bc9f2 Merge pull request #1311 from router-for-me/fix/gemini-schema
fix(gemini): Removes unsupported extension fields
2026-01-30 23:55:56 +08:00
Luis Pater
c3e94a8277 Merge pull request #1317 from yinkev/feat/gemini-tools-passthrough
feat(translator): add code_execution and url_context tool passthrough
2026-01-30 23:46:44 +08:00
Luis Pater
6b6d030ed3 feat(auth): add custom HTTP client with utls for Claude API authentication
Introduce a custom HTTP client utilizing utls with Firefox TLS fingerprinting to bypass Cloudflare fingerprinting on Anthropic domains. Includes support for proxy configuration and enhanced connection management for HTTP/2.
2026-01-30 21:29:41 +08:00
kyinhub
538039f583 feat(translator): add code_execution and url_context tool passthrough
Add support for Gemini's code_execution and url_context tools in the
request translators, enabling:

- Agentic Vision: Image analysis with Python code execution for
  bounding boxes, annotations, and visual reasoning
- URL Context: Live web page content fetching and analysis

Tools are passed through using the same pattern as google_search:
- code_execution: {} -> codeExecution: {}
- url_context: {} -> urlContext: {}

Tested with Gemini 3 Flash Preview agentic vision successfully.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 21:14:52 -08:00
이대희
ca796510e9 refactor(gemini): optimize removeExtensionFields with post-order traversal and DeleteBytes
Amp-Thread-ID: https://ampcode.com/threads/T-019c0d09-330d-7399-b794-652b94847df1
Co-authored-by: Amp <amp@ampcode.com>
2026-01-30 13:02:58 +09:00
이대희
d0d66cdcb7 fix(gemini): Removes unsupported extension fields
Removes x-* extension fields from JSON schemas to ensure compatibility with the Gemini API.

These fields, while valid in OpenAPI/JSON Schema, are not recognized by the Gemini API and can cause issues.
The change recursively walks the schema, identifies these extension fields, and removes them, except when they define properties.

Amp-Thread-ID: https://ampcode.com/threads/T-019c0cd1-9e59-722b-83f0-e0582aba6914
Co-authored-by: Amp <amp@ampcode.com>
2026-01-30 12:31:26 +09:00
Luis Pater
d7d54fa2cc feat(ci): add cleanup step for temporary Docker tags in workflow 2026-01-30 09:15:00 +08:00
Luis Pater
31649325f0 feat(ci): add multi-arch Docker builds and manifest creation to workflow 2026-01-30 07:26:36 +08:00
Martin Schneeweiss
3a43ecb19b feat(caching): implement Claude prompt caching with multi-turn support
- Add ensureCacheControl() to auto-inject cache breakpoints
- Cache tools (last tool), system (last element), and messages (2nd-to-last user turn)
- Add prompt-caching-2024-07-31 beta header
- Return original payload on sjson error to prevent corruption
- Include verification test for caching logic

Enables up to 90% cost reduction on cached tokens.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 22:59:33 +01:00
Luis Pater
a709e5a12d fix(config): ensure empty mapping persists for oauth-model-alias deletions #1305 2026-01-30 04:17:56 +08:00
Luis Pater
f0ac77197b Merge pull request #1300 from sususu98/feat/log-api-response-timestamp
fix(logging): add API response timestamp and fix request timestamp timing
2026-01-30 03:27:17 +08:00
Luis Pater
da0bbf2a3f Merge pull request #1298 from sususu98/fix/restore-usageMetadata-in-gemini-translator
fix(translator): restore usageMetadata in Gemini responses from Antigravity
2026-01-30 02:59:41 +08:00
sususu98
295f34d7f0 fix(logging): capture streaming TTFB on first chunk and make timestamps required
- Add firstChunkTimestamp field to ResponseWriterWrapper for sync capture
- Capture TTFB in Write() and WriteString() before async channel send
- Add SetFirstChunkTimestamp() to StreamingLogWriter interface
- Make requestTimestamp/apiResponseTimestamp required in LogRequest()
- Remove timestamp capture from WriteAPIResponse() (now via setter)
- Fix Gemini handler to set API_RESPONSE_TIMESTAMP before writing response

This ensures accurate TTFB measurement for all streaming API formats
(OpenAI, Gemini, Claude) by capturing timestamp synchronously when
the first response chunk arrives, not when the stream finalizes.
2026-01-29 22:32:24 +08:00
sususu98
c41ce77eea fix(logging): add API response timestamp and fix request timestamp timing
Previously:
- REQUEST INFO timestamp was captured at log write time (not request arrival)
- API RESPONSE had NO timestamp at all

This fix:
- Captures REQUEST INFO timestamp when request first arrives
- Adds API RESPONSE timestamp when upstream response arrives

Changes:
- Add Timestamp field to RequestInfo, set at middleware initialization
- Set API_RESPONSE_TIMESTAMP in appendAPIResponse() and gemini handler
- Pass timestamps through logging chain to writeNonStreamingLog()
- Add timestamp output to API RESPONSE section

This enables accurate measurement of backend response latency in error logs.
2026-01-29 22:22:18 +08:00
Luis Pater
4eb1e6093f feat(handlers): add test to verify no retries after partial stream response
Introduce `TestExecuteStreamWithAuthManager_DoesNotRetryAfterFirstByte` to validate that stream executions do not retry after receiving partial responses. Implement `payloadThenErrorStreamExecutor` for test coverage of this behavior.
2026-01-29 17:30:48 +08:00
Luis Pater
189a066807 Merge pull request #1296 from router-for-me/log
fix(api): update amp module only on config changes
2026-01-29 17:27:52 +08:00
hkfires
d0bada7a43 fix(config): prune oauth-model-alias when preserving config 2026-01-29 14:06:52 +08:00
sususu98
9dc0e6d08b fix(translator): restore usageMetadata in Gemini responses from Antigravity
When using Gemini API format with Antigravity backend, the executor
renames usageMetadata to cpaUsageMetadata in non-terminal chunks.
The Gemini translator was returning this internal field name directly
to clients instead of the standard usageMetadata field.

Add restoreUsageMetadata() to rename cpaUsageMetadata back to
usageMetadata before returning responses to clients.
2026-01-29 11:16:00 +08:00
hkfires
8510fc313e fix(api): update amp module only on config changes 2026-01-29 09:28:49 +08:00
Darley
2666708c30 fix: skip empty text parts and messages to avoid Gemini API error
When Claude API sends an assistant message with empty text content like:
{"role":"assistant","content":[{"type":"text","text":""}]}
The translator was creating a part object {} with no data field,
causing Gemini API to return error:
"required oneof field 'data' must have one initialized field"
This fix:
1. Skips empty text parts (text="") during translation
2. Skips entire messages when their parts array becomes empty
This ensures compatibility when clients send empty assistant messages
in their conversation history.
2026-01-29 04:13:07 +08:00
Luis Pater
9e5b1d24e8 Merge pull request #1276 from router-for-me/thinking
feat(thinking): enable thinking toggle for qwen3 and deepseek models
2026-01-28 11:16:54 +08:00
Luis Pater
a7dae6ad52 Merge remote-tracking branch 'origin/dev' into dev 2026-01-28 10:59:00 +08:00
Luis Pater
e93e05ae25 refactor: consolidate channel send logic with context-safe handlers
Optimize channel operations by introducing reusable context-aware send functions (`send` and `sendErr`) across `wsrelay`, `handlers`, and `cliproxy`. Ensure graceful handling of canceled contexts during stream operations.
2026-01-28 10:58:35 +08:00
hkfires
c8c27325dc feat(thinking): enable thinking toggle for qwen3 and deepseek models
Fix #1245
2026-01-28 09:54:05 +08:00
hkfires
c3b6f3918c chore(git): stop ignoring .idea and data directories 2026-01-28 09:52:44 +08:00
Luis Pater
bbb55a8ab4 Merge pull request #1170 from BianBianY/main
feat: optimization enable/disable auth files
2026-01-28 09:34:35 +08:00
Shady Khalifa
04b2290927 fix(codex): avoid empty prompt_cache_key 2026-01-27 19:06:42 +02:00
Shady Khalifa
53920b0399 fix(openai): drop stream for responses/compact 2026-01-27 18:27:34 +02:00
Luis Pater
7583193c2a Merge pull request #1257 from router-for-me/model
feat(api): add management model definitions endpoint
2026-01-27 20:32:04 +08:00
hkfires
7cc3bd4ba0 chore(deps): mark golang.org/x/text as indirect 2026-01-27 19:19:52 +08:00
hkfires
88a0f095e8 chore(registry): disable gemini 2.5 flash image preview model 2026-01-27 18:33:13 +08:00
hkfires
c65f64dce0 chore(registry): comment out rev19-uic3-1p model config 2026-01-27 18:33:13 +08:00
hkfires
d18cd217e1 feat(api): add management model definitions endpoint 2026-01-27 18:33:12 +08:00
Luis Pater
ba4a1ab433 Merge pull request #1261 from Darley-Wey/fix/gemini_scheme
fix(gemini): force type to string for enum fields to fix Antigravity Gemini API error
2026-01-27 17:02:25 +08:00
Darley
decddb521e fix(gemini): force type to string for enum fields to fix Antigravity Gemini API error (Relates to #1260) 2026-01-27 11:14:08 +03:30
Shady Khalifa
95096bc3fc feat(openai): add responses/compact support 2026-01-26 16:36:01 +02:00
Luis Pater
70897247b2 feat(auth): add support for request_retry and disable_cooling overrides
Implement `request_retry` and `disable_cooling` metadata overrides for authentication management. Update retry and cooling logic accordingly across `Manager`, Antigravity executor, and file synthesizer. Add tests to validate new behaviors.
2026-01-26 21:59:08 +08:00
Luis Pater
9c341f5aa5 feat(auth): add skip persistence context key for file watcher events
Introduce `WithSkipPersist` to disable persistence during Manager Update/Register calls, preventing write-back loops caused by redundant file writes. Add corresponding tests and integrate with existing file store and conductor logic.
2026-01-26 18:20:19 +08:00
Luis Pater
2af4a8dc12 refactor(runtime): implement retry logic for Antigravity executor with improved error handling and capacity management 2026-01-26 06:22:46 +08:00
Luis Pater
0f53b952b2 Merge pull request #1225 from router-for-me/log
Add request_id to error logs and extract error messages
2026-01-25 22:08:46 +08:00
hkfires
f30ffd5f5e feat(executor): add request_id to error logs
Extract error.message from JSON error responses when summarizing error bodies for debug logs
2026-01-25 21:31:46 +08:00
Luis Pater
bc9a24d705 docs(readme): reposition CPA-XXX Panel section for improved visibility 2026-01-25 18:58:32 +08:00
Luis Pater
2c879f13ef Merge pull request #1216 from ferretgeek/add-cpa-xxx-panel
docs: 新增 CPA-XXX 社区面板项目
2026-01-25 18:57:32 +08:00
Gemini
07b4a08979 docs: translate CPA-XXX description to English 2026-01-25 18:00:28 +08:00
Gemini
7f612bb069 docs: add CPA-XXX panel to community list
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2026-01-25 10:45:51 +08:00
hkfires
5743b78694 test(claude): update expectations for system message handling 2026-01-25 08:31:29 +08:00
Luis Pater
2e6a2b655c Merge pull request #1132 from XYenon/fix/gemini-models-displayname-override
fix(gemini): preserve displayName and description in models list
2026-01-25 03:40:04 +08:00
Luis Pater
cb47ac21bf Merge pull request #1179 from mallendeo/main
fix(claude): skip built-in tools in OAuth tool prefix
2026-01-25 03:31:58 +08:00
Luis Pater
a1394b4596 Merge pull request #1183 from Darley-Wey/fix/api-align
fix(api): enhance ClaudeModels response to align with api.anthropic.com
2026-01-25 03:30:14 +08:00
Luis Pater
9e97948f03 Merge pull request #1185 from router-for-me/auth
Refactor authentication handling for Antigravity, Claude, Codex, and Gemini
2026-01-25 03:28:53 +08:00
Yang Bian
f7bfa8a05c Merge branch 'upstream-main' 2026-01-24 16:28:08 +08:00
Darley
46c6fb1e7a fix(api): enhance ClaudeModels response to align with api.anthropic.com 2026-01-24 04:41:08 +03:30
hkfires
9f9fec5d4c fix(auth): improve antigravity token exchange errors 2026-01-24 09:04:15 +08:00
hkfires
e95be10485 fix(auth): validate antigravity token userinfo email 2026-01-24 08:33:52 +08:00
hkfires
f3d58fa0ce fix(auth): correct antigravity oauth redirect and expiry 2026-01-24 08:33:52 +08:00
hkfires
8c0eaa1f71 refactor(auth): export Gemini constants and use in handler 2026-01-24 08:33:52 +08:00
hkfires
405df58f72 refactor(auth): export Codex constants and slim down handler 2026-01-24 08:33:52 +08:00
hkfires
e7f13aa008 refactor(api): slim down RequestAnthropicToken to use internal/auth 2026-01-24 08:33:51 +08:00
hkfires
7cb6a9b89a refactor(auth): export Claude OAuth constants for reuse 2026-01-24 08:33:51 +08:00
hkfires
9aa5344c29 refactor(api): slim down RequestAntigravityToken to use internal/auth 2026-01-24 08:33:51 +08:00
hkfires
8ba0ebbd2a refactor(sdk): slim down Antigravity authenticator to use internal/auth 2026-01-24 08:33:51 +08:00
hkfires
c65407ab9f refactor(auth): extract Antigravity OAuth constants to internal/auth 2026-01-24 08:33:51 +08:00
hkfires
9e59685212 refactor(auth): implement Antigravity AuthService in internal/auth 2026-01-24 08:33:51 +08:00
hkfires
4a4dfaa910 refactor(auth): replace sanitizeAntigravityFileName with antigravity.CredentialFileName 2026-01-24 08:33:51 +08:00
Luis Pater
0d6ecb0191 Fixed: #1077
refactor(translator): improve tools handling by separating functionDeclarations and googleSearch nodes
2026-01-24 05:51:11 +08:00
Mauricio Allende
f16461bfe7 fix(claude): skip built-in tools in OAuth tool prefix 2026-01-23 21:29:39 +00:00
Luis Pater
c32e2a8196 fix(auth): handle context cancellation in executor methods 2026-01-24 04:56:55 +08:00
Luis Pater
873d41582f Merge pull request #1125 from NightHammer1000/dev
Filter out Top_P when Temp is set on Claude
2026-01-24 02:03:33 +08:00
Luis Pater
6fb7d85558 Merge pull request #1137 from augustVino/fix/remove_empty_systemmsg
fix(translator): ensure system message is only added if it contains c…
2026-01-24 02:02:18 +08:00
hkfires
d5e3e32d58 fix(auth): normalize plan type filenames to lowercase 2026-01-23 20:13:09 +08:00
Chén Mù
f353a54555 Merge pull request #1171 from router-for-me/auth
refactor(auth): remove unused provider execution helpers
2026-01-23 19:43:42 +08:00
Chén Mù
1d6e2e751d Merge pull request #1140 from sxjeru/main
fix(auth): handle quota cooldown in retry logic for transient errors
2026-01-23 19:43:17 +08:00
hkfires
cc50b63422 refactor(auth): remove unused provider execution helpers 2026-01-23 19:12:55 +08:00
Luis Pater
15ae83a15b Merge pull request #1169 from router-for-me/payload
feat(executor): apply payload rules using requested model
2026-01-23 18:41:31 +08:00
hkfires
81b369aed9 fix(auth): include requested model in executor metadata 2026-01-23 18:30:08 +08:00
Yang Bian
c8620d1633 feat: optimization enable/disable auth files 2026-01-23 18:03:09 +08:00
hkfires
ecc850bfb7 feat(executor): apply payload rules using requested model 2026-01-23 16:38:41 +08:00
Chén Mù
19b4ef33e0 Merge pull request #1102 from aldinokemal/main
feat(management): add PATCH endpoint to enable/disable auth files
2026-01-23 09:05:24 +08:00
hkfires
7ca045d8b9 fix(executor): adjust model-specific request payload 2026-01-22 20:28:08 +08:00
hkfires
abfca6aab2 refactor(util): reorder gemini schema cleaner helpers 2026-01-22 18:38:48 +08:00
Chén Mù
3c71c075db Merge pull request #1131 from sowar1987/fix/gemini-malformed-function-call
Fix Gemini tool calling for Antigravity (malformed_function_call)
2026-01-22 18:07:03 +08:00
sowar1987
9c2992bfb2 test: align signature cache tests with cache behavior
Co-Authored-By: Warp <agent@warp.dev>
2026-01-22 17:12:47 +08:00
sowar1987
269a1c5452 refactor: reuse placeholder reason description
Co-Authored-By: Warp <agent@warp.dev>
2026-01-22 17:12:47 +08:00
sowar1987
22ce65ac72 test: update signature cache tests
Revert gemini translator changes for scheme A

Co-Authored-By: Warp <agent@warp.dev>
2026-01-22 17:12:47 +08:00
sowar1987
a2f8f59192 Fix Gemini function-calling INVALID_ARGUMENT by relaxing Gemini tool validation and cleaning schema 2026-01-22 17:11:07 +08:00
XYenon
8c7c446f33 fix(gemini): preserve displayName and description in models list
Previously GeminiModels handler unconditionally overwrote displayName
and description with the model name, losing the original values defined
in model definitions (e.g., 'Gemini 3 Pro Preview').

Now only set these fields as fallback when they are missing or empty.
2026-01-22 15:19:27 +08:00
sxjeru
30a59168d7 fix(auth): handle quota cooldown in retry logic for transient errors 2026-01-21 21:48:23 +08:00
hkfires
c8884f5e25 refactor(translator): enhance signature handling in Claude and Gemini requests, streamline cache usage and remove unnecessary tests 2026-01-21 20:21:49 +08:00
Luis Pater
d9c6317c84 refactor(cache, translator): refine signature caching logic and tests, replace session-based logic with model group handling 2026-01-21 18:30:05 +08:00
Vino
d29ec95526 fix(translator): ensure system message is only added if it contains content 2026-01-21 16:45:50 +08:00
Luis Pater
ef4508dbc8 refactor(cache, translator): remove session ID from signature caching and clean up logic 2026-01-21 13:37:10 +08:00
Luis Pater
f775e46fe2 refactor(translator): remove session ID logic from signature caching and associated tests 2026-01-21 12:45:07 +08:00
Luis Pater
65ad5c0c9d refactor(cache): simplify signature caching by removing sessionID parameter 2026-01-21 12:38:05 +08:00
Luis Pater
88bf4e77ec fix(translator): update HasValidSignature to require modelName parameter for improved validation 2026-01-21 11:31:37 +08:00
Luis Pater
a4f8015caa test(logging): add unit tests for GinLogrusRecovery middleware panic handling 2026-01-21 10:57:27 +08:00
Luis Pater
ffd129909e Merge pull request #1130 from router-for-me/agty
fix(executor): only strip maxOutputTokens for non-claude models
2026-01-21 10:50:39 +08:00
hkfires
9332316383 fix(translator): preserve thinking blocks by skipping signature 2026-01-21 10:49:20 +08:00
hkfires
6dcbbf64c3 fix(executor): only strip maxOutputTokens for non-claude models 2026-01-21 10:49:20 +08:00
Luis Pater
2ce3553612 feat(cache): handle gemini family in signature cache with fallback validator logic 2026-01-21 10:11:21 +08:00
Luis Pater
2e14f787d4 feat(translator): enhance ConvertGeminiRequestToAntigravity with model name and refine reasoning block handling 2026-01-21 08:31:23 +08:00
Luis Pater
523b41ccd2 test(responses): add comprehensive tests for SSE event ordering and response transformations 2026-01-21 07:08:59 +08:00
N1GHT
09970dc7af Accept Geminis Review Suggestion 2026-01-20 17:51:36 +01:00
N1GHT
d81abd401c Returned the Code Comment I trashed 2026-01-20 17:36:27 +01:00
N1GHT
a6cba25bc1 Small fix to filter out Top_P when Temperature is set on Claude to make requests go through 2026-01-20 17:34:26 +01:00
Luis Pater
c6fa1d0e67 Merge pull request #1117 from router-for-me/cache
fix(translator): enhance signature cache clearing logic and update test cases with model name
2026-01-20 23:18:48 +08:00
Luis Pater
ac56e1e88b Merge pull request #1116 from bexcodex/fix/antigravity
Fix antigravity malformed_function_call
2026-01-20 22:40:00 +08:00
hkfires
9b72ea9efa fix(translator): enhance signature cache clearing logic and update test cases with model name 2026-01-20 20:02:29 +08:00
bexcodex
9f364441e8 Fix antigravity malformed_function_call 2026-01-20 19:54:54 +08:00
Luis Pater
e49a1c07bf chore(translator): update cache functions to include model name parameter in tests 2026-01-20 18:36:51 +08:00
Luis Pater
8d9f4edf9b feat(translator): unify model group references by introducing GetModelGroup helper function 2026-01-20 13:45:25 +08:00
Luis Pater
020e61d0da feat(translator): improve signature handling by associating with model name in cache functions 2026-01-20 13:31:36 +08:00
Luis Pater
6184c43319 Fixed: #1109
feat(translator): enhance session ID derivation with user_id parsing in Claude
2026-01-20 12:35:40 +08:00
Luis Pater
2cbe4a790c chore(translator): remove unnecessary whitespace in gemini_openai_response code 2026-01-20 11:47:33 +08:00
Luis Pater
68b3565d7b Merge branch 'main' into dev (PR #961) 2026-01-20 11:42:22 +08:00
Luis Pater
3f385a8572 feat(auth): add "antigravity" provider to ignored access_token fields in filestore 2026-01-20 11:38:31 +08:00
Luis Pater
9823dc35e1 feat(auth): hash account ID for improved uniqueness in credential filenames 2026-01-20 11:37:52 +08:00
Luis Pater
059bfee91b feat(auth): add hashed account ID to credential filenames for team plans 2026-01-20 11:36:29 +08:00
Luis Pater
7beaf0eaa2 Merge pull request #869 2026-01-20 11:16:53 +08:00
Luis Pater
1fef90ff58 Merge pull request #877 from zhiqing0205/main
feat(codex): include plan type in auth filename
2026-01-20 11:11:25 +08:00
Luis Pater
8447fd27a0 fix(login): remove emojis from interactive prompt messages 2026-01-20 11:09:56 +08:00
Luis Pater
7831cba9f6 refactor(claude): remove redundant system instructions check in Claude executor 2026-01-20 11:02:52 +08:00
Luis Pater
e02b2d58d5 Merge pull request #868 2026-01-20 10:57:24 +08:00
Luis Pater
28726632a9 Merge pull request #861 from umairimtiaz9/fix/gemini-cli-backend-project-id
fix(auth): use backend project ID for free tier Gemini CLI OAuth users
2026-01-20 10:32:17 +08:00
Luis Pater
3b26129c82 Merge pull request #1108 from router-for-me/modelinfo
feat(registry): support provider-specific model info lookup
2026-01-20 10:18:42 +08:00
Luis Pater
d4bb4e6624 refactor(antigravity): remove unused client signature handling in thinking objects 2026-01-20 10:17:55 +08:00
Luis Pater
0766c49f93 Merge pull request #994 from adrenjc/fix/cross-model-thinking-signature
fix(antigravity): prevent corrupted thought signature when switching models
2026-01-20 10:14:05 +08:00
Luis Pater
a7ffc77e3d Merge branch 'dev' into fix/cross-model-thinking-signature 2026-01-20 10:10:43 +08:00
hkfires
e641fde25c feat(registry): support provider-specific model info lookup 2026-01-20 10:01:17 +08:00
Luis Pater
5717c7f2f4 Merge pull request #1103 from dinhkarate/feat/imagen
feat(vertex): add Imagen image generation model support
2026-01-20 07:11:18 +08:00
dinhkarate
8734d4cb90 feat(vertex): add Imagen image generation model support
Add support for Imagen 3.0 and 4.0 image generation models in Vertex AI:

- Add 5 Imagen model definitions (4.0, 4.0-ultra, 4.0-fast, 3.0, 3.0-fast)
- Implement :predict action routing for Imagen models
- Convert Imagen request/response format to match Gemini structure like gemini-3-pro-image
- Transform prompts to Imagen's instances/parameters format
- Convert base64 image responses to Gemini-compatible inline data
2026-01-20 01:26:37 +07:00
Aldino Kemal
2f6004d74a perf(management): optimize auth lookup in PatchAuthFileStatus
Use GetByID() for O(1) map lookup first, falling back to iteration
only for FileName matching. Consistent with pattern in disableAuth().
2026-01-19 20:05:37 +07:00
Luis Pater
5baa753539 Merge pull request #1099 from router-for-me/claude
refactor(claude): move max_tokens constraint enforcement to Apply method
2026-01-19 20:55:59 +08:00
Luis Pater
ead98e4bca Merge pull request #1101 from router-for-me/argy
fix(executor): stop rewriting thinkingLevel for gemini
2026-01-19 20:55:22 +08:00
Aldino Kemal
a1634909e8 feat(management): add PATCH endpoint to enable/disable auth files
Add new PATCH /v0/management/auth-files/status endpoint that allows
toggling the disabled state of auth files without deleting them.
This enables users to temporarily disable credentials from the
management UI.
2026-01-19 19:50:36 +07:00
hkfires
1d2fe55310 fix(executor): stop rewriting thinkingLevel for gemini 2026-01-19 19:49:39 +08:00
hkfires
c175821cc4 feat(registry): expand antigravity model config
Remove static Name mapping and add entries for claude-sonnet-4-5,
tab_flash_lite_preview, and gpt-oss-120b-medium configs
2026-01-19 19:32:00 +08:00
hkfires
239a28793c feat(claude): clamp thinking budget to max_tokens constraints 2026-01-19 16:32:20 +08:00
hkfires
c421d653e7 refactor(claude): move max_tokens constraint enforcement to Apply method 2026-01-19 15:50:35 +08:00
Luis Pater
2542c2920d Merge pull request #1096 from router-for-me/usage
feat(translator): report cached token usage in Claude output
2026-01-19 11:52:18 +08:00
hkfires
52e46ced1b fix(translator): avoid forcing RFC 8259 system prompt 2026-01-19 11:33:27 +08:00
hkfires
cf9daf470c feat(translator): report cached token usage in Claude output 2026-01-19 11:23:44 +08:00
Luis Pater
140d6211cc feat(translator): add reasoning state tracking and improve reasoning summary handling
- Introduced `oaiToResponsesStateReasoning` to track reasoning data.
- Enhanced logic for emitting reasoning summary events and managing state transitions.
- Updated output generation to handle multiple reasoning entries consistently.
2026-01-19 03:58:28 +08:00
Luis Pater
60f9a1442c Merge pull request #1088 from router-for-me/thinking
Thinking
2026-01-18 17:01:59 +08:00
hkfires
cb6caf3f87 fix(thinking): update ValidateConfig to include fromSuffix parameter and adjust budget validation logic 2026-01-18 16:37:14 +08:00
Luis Pater
99c7abbbf1 Merge pull request #1067 from router-for-me/auth-files
refactor(auth): simplify filename prefixes for qwen and iflow tokens
2026-01-18 13:41:59 +08:00
Luis Pater
8f511ac33c Merge pull request #1076 from sususu98/fix/antigravity-enum-string
fix(antigravity): convert non-string enum values to strings for Gemini API
2026-01-18 13:40:53 +08:00
Luis Pater
1046152119 Merge pull request #1068 from 0xtbug/dev
docs(readme): add ZeroLimit to projects based on CLIProxyAPI
2026-01-18 13:37:50 +08:00
Luis Pater
f88228f1c5 Merge pull request #1081 from router-for-me/thinking
Refine thinking validation and cross‑provider payload conversion
2026-01-18 13:34:28 +08:00
Luis Pater
62e2b672d9 refactor(logging): centralize log directory resolution logic
- Introduced `ResolveLogDirectory` function in `logging` package to standardize log directory determination across components.
- Replaced redundant logic in `server`, `global_logger`, and `handlers` with the new utility function.
2026-01-18 12:40:57 +08:00
hkfires
03005b5d29 refactor(thinking): add Gemini family provider grouping for strict validation 2026-01-18 11:30:53 +08:00
hkfires
c7e8830a56 refactor(thinking): pass source and target formats to ApplyThinking for cross-format validation
Update ApplyThinking signature to accept fromFormat and toFormat parameters
instead of a single provider string. This enables:

- Proper level-to-budget conversion when source is level-based (openai/codex)
  and target is budget-based (gemini/claude)
- Strict budget range validation when source and target formats match
- Level clamping to nearest supported level for cross-format requests
- Format alias resolution in SDK translator registry for codex/openai-response

Also adds ErrBudgetOutOfRange error code and improves iflow config extraction
to fall back to openai format when iflow-specific config is not present.
2026-01-18 10:30:15 +08:00
hkfires
d5ef4a6d15 refactor(translator): remove registry model lookups from thinking config conversions 2026-01-18 10:30:14 +08:00
hkfires
97b67e0e49 test(thinking): split E2E coverage into suffix and body parameter test functions
Refactor thinking configuration tests by separating model name suffix-based
scenarios from request body parameter-based scenarios into distinct test
functions with independent case numbering.

Architectural improvements:
- Extract thinkingTestCase struct to package level for shared usage
- Add getTestModels() helper returning complete model fixture set
- Introduce runThinkingTests() runner with protocol-specific field detection
- Register level-subset-model fixture with constrained low/high level support
- Extend iflow protocol handling for glm-test and minimax-test models
- Add same-protocol strict boundary validation cases (80-89)
- Replace error responses with clamped values for boundary-exceeding budgets
2026-01-18 10:30:14 +08:00
sususu98
dd6d78cb31 fix(antigravity): convert non-string enum values to strings for Gemini API
Gemini API requires all enum values in function declarations to be
strings. Some MCP tools (e.g., roxybrowser) define schemas with numeric
enums like `"enum": [0, 1, 2]`, causing INVALID_ARGUMENT errors.

Add convertEnumValuesToStrings() to automatically convert numeric and
boolean enum values to their string representations during schema
transformation.
2026-01-18 02:00:02 +00:00
Luis Pater
46433a25f8 fix(translator): add check for empty text to prevent invalid serialization in gemini and antigravity 2026-01-18 00:50:10 +08:00
Tubagus
c8843edb81 Update README_CN.md
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-17 11:33:29 +07:00
Tubagus
f89feb881c Update README.md
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-17 11:33:18 +07:00
Tubagus
dbba71028e docs(readme): add ZeroLimit to projects based on CLIProxyAPI 2026-01-17 11:30:15 +07:00
Tubagus
8549a92e9a docs(readme): add ZeroLimit to projects based on CLIProxyAPI
Added ZeroLimit app to the list of projects in README.
2026-01-17 11:29:22 +07:00
hkfires
109cffc010 refactor(auth): simplify filename prefixes for qwen and iflow tokens 2026-01-17 12:20:58 +08:00
Luis Pater
f8f3ad84fc Fixed: #1064
feat(translator): improve system message handling and content indexing across translators

- Updated logic for processing system messages in `claude`, `gemini`, `gemini-cli`, and `antigravity` translators.
- Introduced indexing for `systemInstruction.parts` to ensure proper ordering and handling of multi-part content.
- Added safeguards for accurate content transformation and serialization.
2026-01-17 05:40:56 +08:00
Luis Pater
bc7167e9fe feat(runtime): add model alias support and enhance payload rule matching
- Introduced `payloadModelAliases` and `payloadModelCandidates` functions to support model aliases for improved flexibility.
- Updated rule matching logic to handle multiple model candidates.
- Refactored variable naming in executor to improve code clarity and consistency.
2026-01-17 05:05:24 +08:00
Luis Pater
384578a88c feat(cliproxy, gemini): improve ID matching logic and enrich normalized model output
- Enhanced ID matching in `cliproxy` by adding additional conditions to better handle ID equality cases.
- Updated `gemini` handlers to include `displayName` and `description` in normalized models for enriched metadata.
2026-01-17 04:44:09 +08:00
Luis Pater
65b4e1ec6c feat(codex): enable instruction toggling and update role terminology
- Added conditional logic for Codex instruction injection based on configuration.
- Updated role terminology from "user" to "developer" for better alignment with context.
2026-01-17 04:12:29 +08:00
adrenjc
5977af96a0 fix(antigravity): prevent corrupted thought signature when switching models
When switching from Claude models (e.g., Opus 4.5) to Gemini models
(e.g., Flash) mid-conversation via Antigravity OAuth, the client-provided
thinking signatures from Claude would cause "Corrupted thought signature"
errors since they are incompatible with Gemini API.

Changes:
- Remove fallback to client-provided signatures in thinking block handling
- Only use cached signatures (from same-session Gemini responses)
- Skip thinking blocks without valid cached signatures
- tool_use blocks continue to use skip_thought_signature_validator when
  no valid signature is available

This ensures cross-model switching works correctly while preserving
signature validation for same-model conversations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 18:24:05 +08:00
extremk
5bb9c2a2bd Add candidate count parameter to OpenAI request 2026-01-10 18:50:13 +08:00
extremk
0b5bbe9234 Add candidate count handling in OpenAI request 2026-01-10 18:49:29 +08:00
extremk
14c74e5e84 Handle 'n' parameter for candidate count in requests
Added handling for the 'n' parameter to set candidate count in generationConfig.
2026-01-10 18:48:33 +08:00
extremk
6448d0ee7c Add candidate count handling in OpenAI request 2026-01-10 18:47:41 +08:00
extremk
b0c17af2cf Enhance Gemini to OpenAI response conversion
Refactor response handling to support multiple candidates and improve parameter management.
2026-01-10 18:46:25 +08:00
zhiqing0205
aa8526edc0 fix(codex): use unicode title casing for plan 2026-01-06 10:24:02 +08:00
zhiqing0205
ac3ca0ad8e feat(codex): include plan type in auth filename 2026-01-06 02:25:56 +08:00
FakerL
08d21b76e2 Update sdk/auth/filestore.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-05 21:38:26 +08:00
Zhi Yang
33aa665555 fix(auth): persist access_token on refresh for providers that need it
Previously, metadataEqualIgnoringTimestamps() ignored access_token for all
providers, which prevented refreshed tokens from being persisted to disk/database.
This caused tokens to be lost on server restart for providers like iFlow.

This change makes the behavior provider-specific:
- Providers like gemini/gemini-cli that issue new tokens on every refresh and
  can re-fetch when needed will continue to ignore access_token (optimization)
- Other providers like iFlow will now persist access_token changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-05 13:25:46 +00:00
maoring24
00280b6fe8 feat(claude): add native request cloaking for non-claude-code clients
integrate claude-cloak functionality to disguise api requests:
- add CloakConfig with mode (auto/always/never) and strict-mode options
- generate fake user_id in claude code format (user_[hex]_account__session_[uuid])
- inject claude code system prompt (configurable strict mode)
- obfuscate sensitive words with zero-width characters
- auto-detect claude code clients via user-agent

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 20:32:51 +08:00
CodeIgnitor
52760a4eaa fix(auth): use backend project ID for free tier Gemini CLI OAuth users
Fixes issue where free tier users cannot access Gemini 3 preview models
due to frontend/backend project ID mapping.

## Problem
Google's Gemini API uses a frontend/backend project mapping system for
free tier users:
- Frontend projects (e.g., gen-lang-client-*) are user-visible
- Backend projects (e.g., mystical-victor-*) host actual API access
- Only backend projects have access to preview models (gemini-3-*)

Previously, CLIProxyAPI ignored the backend project ID returned by
Google's onboarding API and kept using the frontend ID, preventing
access to preview models.

## Solution
### CLI (internal/cmd/login.go)
- Detect free tier users (gen-lang-client-* projects or FREE/LEGACY tier)
- Show interactive prompt allowing users to choose frontend or backend
- Default to backend (recommended for preview model access)
- Pro users: maintain original behavior (keep frontend ID)

### Web UI (internal/api/handlers/management/auth_files.go)
- Detect free tier users using same logic
- Automatically use backend project ID (recommended choice)
- Pro users: maintain original behavior (keep frontend ID)

### Deduplication (internal/cmd/login.go)
- Add deduplication when user selects ALL projects
- Prevents redundant API calls when multiple frontend projects map to
  same backend
- Skips duplicate project IDs in activation loop

## Impact
- Free tier users: Can now access gemini-3-pro-preview and
  gemini-3-flash-preview models
- Pro users: No change in behavior (backward compatible)
- Only affects Gemini CLI OAuth (not antigravity or API key auth)

## Testing
- Tested with free tier account selecting single project
- Tested with free tier account selecting ALL projects
- Verified deduplication prevents redundant onboarding calls
- Confirmed pro user behavior unchanged
2026-01-05 02:41:24 +05:00
109 changed files with 10307 additions and 4223 deletions

View File

@@ -10,13 +10,11 @@ env:
DOCKERHUB_REPO: eceasy/cli-proxy-api
jobs:
docker:
docker_amd64:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to DockerHub
@@ -29,18 +27,113 @@ jobs:
echo VERSION=`git describe --tags --always --dirty` >> $GITHUB_ENV
echo COMMIT=`git rev-parse --short HEAD` >> $GITHUB_ENV
echo BUILD_DATE=`date -u +%Y-%m-%dT%H:%M:%SZ` >> $GITHUB_ENV
- name: Build and push
- name: Build and push (amd64)
uses: docker/build-push-action@v6
with:
context: .
platforms: |
linux/amd64
linux/arm64
platforms: linux/amd64
push: true
build-args: |
VERSION=${{ env.VERSION }}
COMMIT=${{ env.COMMIT }}
BUILD_DATE=${{ env.BUILD_DATE }}
tags: |
${{ env.DOCKERHUB_REPO }}:latest
${{ env.DOCKERHUB_REPO }}:${{ env.VERSION }}
${{ env.DOCKERHUB_REPO }}:latest-amd64
${{ env.DOCKERHUB_REPO }}:${{ env.VERSION }}-amd64
docker_arm64:
runs-on: ubuntu-24.04-arm
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Generate Build Metadata
run: |
echo VERSION=`git describe --tags --always --dirty` >> $GITHUB_ENV
echo COMMIT=`git rev-parse --short HEAD` >> $GITHUB_ENV
echo BUILD_DATE=`date -u +%Y-%m-%dT%H:%M:%SZ` >> $GITHUB_ENV
- name: Build and push (arm64)
uses: docker/build-push-action@v6
with:
context: .
platforms: linux/arm64
push: true
build-args: |
VERSION=${{ env.VERSION }}
COMMIT=${{ env.COMMIT }}
BUILD_DATE=${{ env.BUILD_DATE }}
tags: |
${{ env.DOCKERHUB_REPO }}:latest-arm64
${{ env.DOCKERHUB_REPO }}:${{ env.VERSION }}-arm64
docker_manifest:
runs-on: ubuntu-latest
needs:
- docker_amd64
- docker_arm64
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to DockerHub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Generate Build Metadata
run: |
echo VERSION=`git describe --tags --always --dirty` >> $GITHUB_ENV
echo COMMIT=`git rev-parse --short HEAD` >> $GITHUB_ENV
echo BUILD_DATE=`date -u +%Y-%m-%dT%H:%M:%SZ` >> $GITHUB_ENV
- name: Create and push multi-arch manifests
run: |
docker buildx imagetools create \
--tag "${DOCKERHUB_REPO}:latest" \
"${DOCKERHUB_REPO}:latest-amd64" \
"${DOCKERHUB_REPO}:latest-arm64"
docker buildx imagetools create \
--tag "${DOCKERHUB_REPO}:${VERSION}" \
"${DOCKERHUB_REPO}:${VERSION}-amd64" \
"${DOCKERHUB_REPO}:${VERSION}-arm64"
- name: Cleanup temporary tags
continue-on-error: true
env:
DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
run: |
set -euo pipefail
namespace="${DOCKERHUB_REPO%%/*}"
repo_name="${DOCKERHUB_REPO#*/}"
token="$(
curl -fsSL \
-H 'Content-Type: application/json' \
-d "{\"username\":\"${DOCKERHUB_USERNAME}\",\"password\":\"${DOCKERHUB_TOKEN}\"}" \
'https://hub.docker.com/v2/users/login/' \
| python3 -c 'import json,sys; print(json.load(sys.stdin)["token"])'
)"
delete_tag() {
local tag="$1"
local url="https://hub.docker.com/v2/repositories/${namespace}/${repo_name}/tags/${tag}/"
local http_code
http_code="$(curl -sS -o /dev/null -w "%{http_code}" -X DELETE -H "Authorization: JWT ${token}" "${url}" || true)"
if [ "${http_code}" = "204" ] || [ "${http_code}" = "404" ]; then
echo "Docker Hub tag removed (or missing): ${DOCKERHUB_REPO}:${tag} (HTTP ${http_code})"
return 0
fi
echo "Docker Hub tag delete failed: ${DOCKERHUB_REPO}:${tag} (HTTP ${http_code})"
return 0
}
delete_tag "latest-amd64"
delete_tag "latest-arm64"
delete_tag "${VERSION}-amd64"
delete_tag "${VERSION}-arm64"

View File

@@ -130,6 +130,14 @@ Windows-native CLIProxyAPI fork with TUI, system tray, and multi-provider OAuth
VSCode extension for quick switching between Claude Code models, featuring integrated CLIProxyAPI as its backend with automatic background lifecycle management.
### [ZeroLimit](https://github.com/0xtbug/zero-limit)
Windows desktop app built with Tauri + React for monitoring AI coding assistant quotas via CLIProxyAPI. Track usage across Gemini, Claude, OpenAI Codex, and Antigravity accounts with real-time dashboard, system tray integration, and one-click proxy control - no API keys needed.
### [CPA-XXX Panel](https://github.com/ferretgeek/CPA-X)
A lightweight web admin panel for CLIProxyAPI with health checks, resource monitoring, real-time logs, auto-update, request statistics and pricing display. Supports one-click installation and systemd service.
> [!NOTE]
> If you developed a project based on CLIProxyAPI, please open a PR to add it to this list.

View File

@@ -129,6 +129,14 @@ CLI 封装器,用于通过 CLIProxyAPI OAuth 即时切换多个 Claude 账户
一款 VSCode 扩展,提供了在 VSCode 中快速切换 Claude Code 模型的功能,内置 CLIProxyAPI 作为其后端,支持后台自动启动和关闭。
### [ZeroLimit](https://github.com/0xtbug/zero-limit)
Windows 桌面应用,基于 Tauri + React 构建,用于通过 CLIProxyAPI 监控 AI 编程助手配额。支持跨 Gemini、Claude、OpenAI Codex 和 Antigravity 账户的使用量追踪,提供实时仪表盘、系统托盘集成和一键代理控制,无需 API 密钥。
### [CPA-XXX Panel](https://github.com/ferretgeek/CPA-X)
面向 CLIProxyAPI 的 Web 管理面板,提供健康检查、资源监控、日志查看、自动更新、请求统计与定价展示,支持一键安装与 systemd 服务。
> [!NOTE]
> 如果你开发了基于 CLIProxyAPI 的项目,请提交一个 PR拉取请求将其添加到此列表中。

View File

@@ -141,6 +141,15 @@ codex-instructions-enabled: false
# - "claude-3-*" # wildcard matching prefix (e.g. claude-3-7-sonnet-20250219)
# - "*-thinking" # wildcard matching suffix (e.g. claude-opus-4-5-thinking)
# - "*haiku*" # wildcard matching substring (e.g. claude-3-5-haiku-20241022)
# cloak: # optional: request cloaking for non-Claude-Code clients
# mode: "auto" # "auto" (default): cloak only when client is not Claude Code
# # "always": always apply cloaking
# # "never": never apply cloaking
# strict-mode: false # false (default): prepend Claude Code prompt to user system messages
# # true: strip all user system messages, keep only Claude Code prompt
# sensitive-words: # optional: words to obfuscate with zero-width characters
# - "API"
# - "proxy"
# OpenAI compatibility providers
# openai-compatibility:

1
go.mod
View File

@@ -13,6 +13,7 @@ require (
github.com/joho/godotenv v1.5.1
github.com/klauspost/compress v1.17.4
github.com/minio/minio-go/v7 v7.0.66
github.com/refraction-networking/utls v1.8.2
github.com/sirupsen/logrus v1.9.3
github.com/skratchdot/open-golang v0.0.0-20200116055534-eef842397966
github.com/tidwall/gjson v1.18.0

2
go.sum
View File

@@ -118,6 +118,8 @@ github.com/pjbgf/sha1cd v0.5.0 h1:a+UkboSi1znleCDUNT3M5YxjOnN1fz2FhN48FlwCxs0=
github.com/pjbgf/sha1cd v0.5.0/go.mod h1:lhpGlyHLpQZoxMv8HcgXvZEhcGs0PG/vsZnEJ7H0iCM=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/refraction-networking/utls v1.8.2 h1:j4Q1gJj0xngdeH+Ox/qND11aEfhpgoEvV+S9iJ2IdQo=
github.com/refraction-networking/utls v1.8.2/go.mod h1:jkSOEkLqn+S/jtpEHPOsVv/4V4EVnelwbMQl4vCWXAM=
github.com/rogpeppe/go-internal v1.14.1 h1:UQB4HGPB6osV0SQTLymcB4TgvyWu6ZyliaW0tI/otEQ=
github.com/rogpeppe/go-internal v1.14.1/go.mod h1:MaRKkUm5W0goXpeCfT7UZI6fk/L7L7so1lCWt35ZSgc=
github.com/rs/xid v1.5.0 h1:mKX4bl4iPYJtEIxp6CYiUuLQ/8DYMoz0PUdtGgMFRVc=

View File

@@ -3,13 +3,14 @@ package management
import (
"bytes"
"context"
"crypto/sha256"
"encoding/hex"
"encoding/json"
"errors"
"fmt"
"io"
"net"
"net/http"
"net/url"
"os"
"path/filepath"
"sort"
@@ -19,6 +20,7 @@ import (
"time"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/antigravity"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/claude"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/codex"
geminiAuth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/gemini"
@@ -230,14 +232,6 @@ func stopForwarderInstance(port int, forwarder *callbackForwarder) {
log.Infof("callback forwarder on port %d stopped", port)
}
func sanitizeAntigravityFileName(email string) string {
if strings.TrimSpace(email) == "" {
return "antigravity.json"
}
replacer := strings.NewReplacer("@", "_", ".", "_")
return fmt.Sprintf("antigravity-%s.json", replacer.Replace(email))
}
func (h *Handler) managementCallbackURL(path string) (string, error) {
if h == nil || h.cfg == nil || h.cfg.Port <= 0 {
return "", fmt.Errorf("server port is not configured")
@@ -747,6 +741,72 @@ func (h *Handler) registerAuthFromFile(ctx context.Context, path string, data []
return err
}
// PatchAuthFileStatus toggles the disabled state of an auth file
func (h *Handler) PatchAuthFileStatus(c *gin.Context) {
if h.authManager == nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": "core auth manager unavailable"})
return
}
var req struct {
Name string `json:"name"`
Disabled *bool `json:"disabled"`
}
if err := c.ShouldBindJSON(&req); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request body"})
return
}
name := strings.TrimSpace(req.Name)
if name == "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "name is required"})
return
}
if req.Disabled == nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "disabled is required"})
return
}
ctx := c.Request.Context()
// Find auth by name or ID
var targetAuth *coreauth.Auth
if auth, ok := h.authManager.GetByID(name); ok {
targetAuth = auth
} else {
auths := h.authManager.List()
for _, auth := range auths {
if auth.FileName == name {
targetAuth = auth
break
}
}
}
if targetAuth == nil {
c.JSON(http.StatusNotFound, gin.H{"error": "auth file not found"})
return
}
// Update disabled state
targetAuth.Disabled = *req.Disabled
if *req.Disabled {
targetAuth.Status = coreauth.StatusDisabled
targetAuth.StatusMessage = "disabled via management API"
} else {
targetAuth.Status = coreauth.StatusActive
targetAuth.StatusMessage = ""
}
targetAuth.UpdatedAt = time.Now()
if _, err := h.authManager.Update(ctx, targetAuth); err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("failed to update auth: %v", err)})
return
}
c.JSON(http.StatusOK, gin.H{"status": "ok", "disabled": *req.Disabled})
}
func (h *Handler) disableAuth(ctx context.Context, id string) {
if h == nil || h.authManager == nil {
return
@@ -913,67 +973,14 @@ func (h *Handler) RequestAnthropicToken(c *gin.Context) {
rawCode := resultMap["code"]
code := strings.Split(rawCode, "#")[0]
// Exchange code for tokens (replicate logic using updated redirect_uri)
// Extract client_id from the modified auth URL
clientID := ""
if u2, errP := url.Parse(authURL); errP == nil {
clientID = u2.Query().Get("client_id")
}
// Build request
bodyMap := map[string]any{
"code": code,
"state": state,
"grant_type": "authorization_code",
"client_id": clientID,
"redirect_uri": "http://localhost:54545/callback",
"code_verifier": pkceCodes.CodeVerifier,
}
bodyJSON, _ := json.Marshal(bodyMap)
httpClient := util.SetProxy(&h.cfg.SDKConfig, &http.Client{})
req, _ := http.NewRequestWithContext(ctx, "POST", "https://console.anthropic.com/v1/oauth/token", strings.NewReader(string(bodyJSON)))
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Accept", "application/json")
resp, errDo := httpClient.Do(req)
if errDo != nil {
authErr := claude.NewAuthenticationError(claude.ErrCodeExchangeFailed, errDo)
// Exchange code for tokens using internal auth service
bundle, errExchange := anthropicAuth.ExchangeCodeForTokens(ctx, code, state, pkceCodes)
if errExchange != nil {
authErr := claude.NewAuthenticationError(claude.ErrCodeExchangeFailed, errExchange)
log.Errorf("Failed to exchange authorization code for tokens: %v", authErr)
SetOAuthSessionError(state, "Failed to exchange authorization code for tokens")
return
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("failed to close response body: %v", errClose)
}
}()
respBody, _ := io.ReadAll(resp.Body)
if resp.StatusCode != http.StatusOK {
log.Errorf("token exchange failed with status %d: %s", resp.StatusCode, string(respBody))
SetOAuthSessionError(state, fmt.Sprintf("token exchange failed with status %d", resp.StatusCode))
return
}
var tResp struct {
AccessToken string `json:"access_token"`
RefreshToken string `json:"refresh_token"`
ExpiresIn int `json:"expires_in"`
Account struct {
EmailAddress string `json:"email_address"`
} `json:"account"`
}
if errU := json.Unmarshal(respBody, &tResp); errU != nil {
log.Errorf("failed to parse token response: %v", errU)
SetOAuthSessionError(state, "Failed to parse token response")
return
}
bundle := &claude.ClaudeAuthBundle{
TokenData: claude.ClaudeTokenData{
AccessToken: tResp.AccessToken,
RefreshToken: tResp.RefreshToken,
Email: tResp.Account.EmailAddress,
Expire: time.Now().Add(time.Duration(tResp.ExpiresIn) * time.Second).Format(time.RFC3339),
},
LastRefresh: time.Now().Format(time.RFC3339),
}
// Create token storage
tokenStorage := anthropicAuth.CreateTokenStorage(bundle)
@@ -1013,17 +1020,13 @@ func (h *Handler) RequestGeminiCLIToken(c *gin.Context) {
fmt.Println("Initializing Google authentication...")
// OAuth2 configuration (mirrors internal/auth/gemini)
// OAuth2 configuration using exported constants from internal/auth/gemini
conf := &oauth2.Config{
ClientID: "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com",
ClientSecret: "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",
RedirectURL: "http://localhost:8085/oauth2callback",
Scopes: []string{
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
},
Endpoint: google.Endpoint,
ClientID: geminiAuth.ClientID,
ClientSecret: geminiAuth.ClientSecret,
RedirectURL: fmt.Sprintf("http://localhost:%d/oauth2callback", geminiAuth.DefaultCallbackPort),
Scopes: geminiAuth.Scopes,
Endpoint: google.Endpoint,
}
// Build authorization URL and return it immediately
@@ -1145,13 +1148,9 @@ func (h *Handler) RequestGeminiCLIToken(c *gin.Context) {
}
ifToken["token_uri"] = "https://oauth2.googleapis.com/token"
ifToken["client_id"] = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
ifToken["client_secret"] = "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"
ifToken["scopes"] = []string{
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
}
ifToken["client_id"] = geminiAuth.ClientID
ifToken["client_secret"] = geminiAuth.ClientSecret
ifToken["scopes"] = geminiAuth.Scopes
ifToken["universe_domain"] = "googleapis.com"
ts := geminiAuth.GeminiTokenStorage{
@@ -1338,74 +1337,34 @@ func (h *Handler) RequestCodexToken(c *gin.Context) {
}
log.Debug("Authorization code received, exchanging for tokens...")
// Extract client_id from authURL
clientID := ""
if u2, errP := url.Parse(authURL); errP == nil {
clientID = u2.Query().Get("client_id")
}
// Exchange code for tokens with redirect equal to mgmtRedirect
form := url.Values{
"grant_type": {"authorization_code"},
"client_id": {clientID},
"code": {code},
"redirect_uri": {"http://localhost:1455/auth/callback"},
"code_verifier": {pkceCodes.CodeVerifier},
}
httpClient := util.SetProxy(&h.cfg.SDKConfig, &http.Client{})
req, _ := http.NewRequestWithContext(ctx, "POST", "https://auth.openai.com/oauth/token", strings.NewReader(form.Encode()))
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
req.Header.Set("Accept", "application/json")
resp, errDo := httpClient.Do(req)
if errDo != nil {
authErr := codex.NewAuthenticationError(codex.ErrCodeExchangeFailed, errDo)
// Exchange code for tokens using internal auth service
bundle, errExchange := openaiAuth.ExchangeCodeForTokens(ctx, code, pkceCodes)
if errExchange != nil {
authErr := codex.NewAuthenticationError(codex.ErrCodeExchangeFailed, errExchange)
SetOAuthSessionError(state, "Failed to exchange authorization code for tokens")
log.Errorf("Failed to exchange authorization code for tokens: %v", authErr)
return
}
defer func() { _ = resp.Body.Close() }()
respBody, _ := io.ReadAll(resp.Body)
if resp.StatusCode != http.StatusOK {
SetOAuthSessionError(state, fmt.Sprintf("Token exchange failed with status %d", resp.StatusCode))
log.Errorf("token exchange failed with status %d: %s", resp.StatusCode, string(respBody))
return
}
var tokenResp struct {
AccessToken string `json:"access_token"`
RefreshToken string `json:"refresh_token"`
IDToken string `json:"id_token"`
ExpiresIn int `json:"expires_in"`
}
if errU := json.Unmarshal(respBody, &tokenResp); errU != nil {
SetOAuthSessionError(state, "Failed to parse token response")
log.Errorf("failed to parse token response: %v", errU)
return
}
claims, _ := codex.ParseJWTToken(tokenResp.IDToken)
email := ""
accountID := ""
// Extract additional info for filename generation
claims, _ := codex.ParseJWTToken(bundle.TokenData.IDToken)
planType := ""
hashAccountID := ""
if claims != nil {
email = claims.GetUserEmail()
accountID = claims.GetAccountID()
}
// Build bundle compatible with existing storage
bundle := &codex.CodexAuthBundle{
TokenData: codex.CodexTokenData{
IDToken: tokenResp.IDToken,
AccessToken: tokenResp.AccessToken,
RefreshToken: tokenResp.RefreshToken,
AccountID: accountID,
Email: email,
Expire: time.Now().Add(time.Duration(tokenResp.ExpiresIn) * time.Second).Format(time.RFC3339),
},
LastRefresh: time.Now().Format(time.RFC3339),
planType = strings.TrimSpace(claims.CodexAuthInfo.ChatgptPlanType)
if accountID := claims.GetAccountID(); accountID != "" {
digest := sha256.Sum256([]byte(accountID))
hashAccountID = hex.EncodeToString(digest[:])[:8]
}
}
// Create token storage and persist
tokenStorage := openaiAuth.CreateTokenStorage(bundle)
fileName := codex.CredentialFileName(tokenStorage.Email, planType, hashAccountID, true)
record := &coreauth.Auth{
ID: fmt.Sprintf("codex-%s.json", tokenStorage.Email),
ID: fileName,
Provider: "codex",
FileName: fmt.Sprintf("codex-%s.json", tokenStorage.Email),
FileName: fileName,
Storage: tokenStorage,
Metadata: map[string]any{
"email": tokenStorage.Email,
@@ -1431,23 +1390,12 @@ func (h *Handler) RequestCodexToken(c *gin.Context) {
}
func (h *Handler) RequestAntigravityToken(c *gin.Context) {
const (
antigravityCallbackPort = 51121
antigravityClientID = "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com"
antigravityClientSecret = "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"
)
var antigravityScopes = []string{
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
"https://www.googleapis.com/auth/cclog",
"https://www.googleapis.com/auth/experimentsandconfigs",
}
ctx := context.Background()
fmt.Println("Initializing Antigravity authentication...")
authSvc := antigravity.NewAntigravityAuth(h.cfg, nil)
state, errState := misc.GenerateRandomState()
if errState != nil {
log.Errorf("Failed to generate state parameter: %v", errState)
@@ -1455,17 +1403,8 @@ func (h *Handler) RequestAntigravityToken(c *gin.Context) {
return
}
redirectURI := fmt.Sprintf("http://localhost:%d/oauth-callback", antigravityCallbackPort)
params := url.Values{}
params.Set("access_type", "offline")
params.Set("client_id", antigravityClientID)
params.Set("prompt", "consent")
params.Set("redirect_uri", redirectURI)
params.Set("response_type", "code")
params.Set("scope", strings.Join(antigravityScopes, " "))
params.Set("state", state)
authURL := "https://accounts.google.com/o/oauth2/v2/auth?" + params.Encode()
redirectURI := fmt.Sprintf("http://localhost:%d/oauth-callback", antigravity.CallbackPort)
authURL := authSvc.BuildAuthURL(state, redirectURI)
RegisterOAuthSession(state, "antigravity")
@@ -1479,7 +1418,7 @@ func (h *Handler) RequestAntigravityToken(c *gin.Context) {
return
}
var errStart error
if forwarder, errStart = startCallbackForwarder(antigravityCallbackPort, "antigravity", targetURL); errStart != nil {
if forwarder, errStart = startCallbackForwarder(antigravity.CallbackPort, "antigravity", targetURL); errStart != nil {
log.WithError(errStart).Error("failed to start antigravity callback forwarder")
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to start callback server"})
return
@@ -1488,7 +1427,7 @@ func (h *Handler) RequestAntigravityToken(c *gin.Context) {
go func() {
if isWebUI {
defer stopCallbackForwarderInstance(antigravityCallbackPort, forwarder)
defer stopCallbackForwarderInstance(antigravity.CallbackPort, forwarder)
}
waitFile := filepath.Join(h.cfg.AuthDir, fmt.Sprintf(".oauth-antigravity-%s.oauth", state))
@@ -1528,93 +1467,36 @@ func (h *Handler) RequestAntigravityToken(c *gin.Context) {
time.Sleep(500 * time.Millisecond)
}
httpClient := util.SetProxy(&h.cfg.SDKConfig, &http.Client{})
form := url.Values{}
form.Set("code", authCode)
form.Set("client_id", antigravityClientID)
form.Set("client_secret", antigravityClientSecret)
form.Set("redirect_uri", redirectURI)
form.Set("grant_type", "authorization_code")
req, errNewRequest := http.NewRequestWithContext(ctx, http.MethodPost, "https://oauth2.googleapis.com/token", strings.NewReader(form.Encode()))
if errNewRequest != nil {
log.Errorf("Failed to build token request: %v", errNewRequest)
SetOAuthSessionError(state, "Failed to build token request")
return
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
resp, errDo := httpClient.Do(req)
if errDo != nil {
log.Errorf("Failed to execute token request: %v", errDo)
tokenResp, errToken := authSvc.ExchangeCodeForTokens(ctx, authCode, redirectURI)
if errToken != nil {
log.Errorf("Failed to exchange token: %v", errToken)
SetOAuthSessionError(state, "Failed to exchange token")
return
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("antigravity token exchange close error: %v", errClose)
}
}()
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
bodyBytes, _ := io.ReadAll(resp.Body)
log.Errorf("Antigravity token exchange failed with status %d: %s", resp.StatusCode, string(bodyBytes))
SetOAuthSessionError(state, fmt.Sprintf("Token exchange failed: %d", resp.StatusCode))
accessToken := strings.TrimSpace(tokenResp.AccessToken)
if accessToken == "" {
log.Error("antigravity: token exchange returned empty access token")
SetOAuthSessionError(state, "Failed to exchange token")
return
}
var tokenResp struct {
AccessToken string `json:"access_token"`
RefreshToken string `json:"refresh_token"`
ExpiresIn int64 `json:"expires_in"`
TokenType string `json:"token_type"`
}
if errDecode := json.NewDecoder(resp.Body).Decode(&tokenResp); errDecode != nil {
log.Errorf("Failed to parse token response: %v", errDecode)
SetOAuthSessionError(state, "Failed to parse token response")
email, errInfo := authSvc.FetchUserInfo(ctx, accessToken)
if errInfo != nil {
log.Errorf("Failed to fetch user info: %v", errInfo)
SetOAuthSessionError(state, "Failed to fetch user info")
return
}
email := ""
if strings.TrimSpace(tokenResp.AccessToken) != "" {
infoReq, errInfoReq := http.NewRequestWithContext(ctx, http.MethodGet, "https://www.googleapis.com/oauth2/v1/userinfo?alt=json", nil)
if errInfoReq != nil {
log.Errorf("Failed to build user info request: %v", errInfoReq)
SetOAuthSessionError(state, "Failed to build user info request")
return
}
infoReq.Header.Set("Authorization", "Bearer "+tokenResp.AccessToken)
infoResp, errInfo := httpClient.Do(infoReq)
if errInfo != nil {
log.Errorf("Failed to execute user info request: %v", errInfo)
SetOAuthSessionError(state, "Failed to execute user info request")
return
}
defer func() {
if errClose := infoResp.Body.Close(); errClose != nil {
log.Errorf("antigravity user info close error: %v", errClose)
}
}()
if infoResp.StatusCode >= http.StatusOK && infoResp.StatusCode < http.StatusMultipleChoices {
var infoPayload struct {
Email string `json:"email"`
}
if errDecodeInfo := json.NewDecoder(infoResp.Body).Decode(&infoPayload); errDecodeInfo == nil {
email = strings.TrimSpace(infoPayload.Email)
}
} else {
bodyBytes, _ := io.ReadAll(infoResp.Body)
log.Errorf("User info request failed with status %d: %s", infoResp.StatusCode, string(bodyBytes))
SetOAuthSessionError(state, fmt.Sprintf("User info request failed: %d", infoResp.StatusCode))
return
}
email = strings.TrimSpace(email)
if email == "" {
log.Error("antigravity: user info returned empty email")
SetOAuthSessionError(state, "Failed to fetch user info")
return
}
projectID := ""
if strings.TrimSpace(tokenResp.AccessToken) != "" {
fetchedProjectID, errProject := sdkAuth.FetchAntigravityProjectID(ctx, tokenResp.AccessToken, httpClient)
if accessToken != "" {
fetchedProjectID, errProject := authSvc.FetchProjectID(ctx, accessToken)
if errProject != nil {
log.Warnf("antigravity: failed to fetch project ID: %v", errProject)
} else {
@@ -1639,7 +1521,7 @@ func (h *Handler) RequestAntigravityToken(c *gin.Context) {
metadata["project_id"] = projectID
}
fileName := sanitizeAntigravityFileName(email)
fileName := antigravity.CredentialFileName(email)
label := strings.TrimSpace(email)
if label == "" {
label = "antigravity"
@@ -1703,7 +1585,7 @@ func (h *Handler) RequestQwenToken(c *gin.Context) {
// Create token storage
tokenStorage := qwenAuth.CreateTokenStorage(tokenData)
tokenStorage.Email = fmt.Sprintf("qwen-%d", time.Now().UnixMilli())
tokenStorage.Email = fmt.Sprintf("%d", time.Now().UnixMilli())
record := &coreauth.Auth{
ID: fmt.Sprintf("qwen-%s.json", tokenStorage.Email),
Provider: "qwen",
@@ -1808,7 +1690,7 @@ func (h *Handler) RequestIFlowToken(c *gin.Context) {
tokenStorage := authSvc.CreateTokenStorage(tokenData)
identifier := strings.TrimSpace(tokenStorage.Email)
if identifier == "" {
identifier = fmt.Sprintf("iflow-%d", time.Now().UnixMilli())
identifier = fmt.Sprintf("%d", time.Now().UnixMilli())
tokenStorage.Email = identifier
}
record := &coreauth.Auth{
@@ -1893,15 +1775,17 @@ func (h *Handler) RequestIFlowCookieToken(c *gin.Context) {
fileName := iflowauth.SanitizeIFlowFileName(email)
if fileName == "" {
fileName = fmt.Sprintf("iflow-%d", time.Now().UnixMilli())
} else {
fileName = fmt.Sprintf("iflow-%s", fileName)
}
tokenStorage.Email = email
timestamp := time.Now().Unix()
record := &coreauth.Auth{
ID: fmt.Sprintf("iflow-%s-%d.json", fileName, timestamp),
ID: fmt.Sprintf("%s-%d.json", fileName, timestamp),
Provider: "iflow",
FileName: fmt.Sprintf("iflow-%s-%d.json", fileName, timestamp),
FileName: fmt.Sprintf("%s-%d.json", fileName, timestamp),
Storage: tokenStorage,
Metadata: map[string]any{
"email": email,
@@ -2108,7 +1992,20 @@ func performGeminiCLISetup(ctx context.Context, httpClient *http.Client, storage
finalProjectID := projectID
if responseProjectID != "" {
if explicitProject && !strings.EqualFold(responseProjectID, projectID) {
log.Warnf("Gemini onboarding returned project %s instead of requested %s; keeping requested project ID.", responseProjectID, projectID)
// Check if this is a free user (gen-lang-client projects or free/legacy tier)
isFreeUser := strings.HasPrefix(projectID, "gen-lang-client-") ||
strings.EqualFold(tierID, "FREE") ||
strings.EqualFold(tierID, "LEGACY")
if isFreeUser {
// For free users, use backend project ID for preview model access
log.Infof("Gemini onboarding: frontend project %s maps to backend project %s", projectID, responseProjectID)
log.Infof("Using backend project ID: %s (recommended for preview model access)", responseProjectID)
finalProjectID = responseProjectID
} else {
// Pro users: keep requested project ID (original behavior)
log.Warnf("Gemini onboarding returned project %s instead of requested %s; keeping requested project ID.", responseProjectID, projectID)
}
} else {
finalProjectID = responseProjectID
}

View File

@@ -13,7 +13,7 @@ import (
"time"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
)
const (
@@ -360,16 +360,7 @@ func (h *Handler) logDirectory() string {
if h.logDir != "" {
return h.logDir
}
if base := util.WritablePath(); base != "" {
return filepath.Join(base, "logs")
}
if h.configFilePath != "" {
dir := filepath.Dir(h.configFilePath)
if dir != "" && dir != "." {
return filepath.Join(dir, "logs")
}
}
return "logs"
return logging.ResolveLogDirectory(h.cfg)
}
func (h *Handler) collectLogFiles(dir string) ([]string, error) {

View File

@@ -0,0 +1,33 @@
package management
import (
"net/http"
"strings"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
)
// GetStaticModelDefinitions returns static model metadata for a given channel.
// Channel is provided via path param (:channel) or query param (?channel=...).
func (h *Handler) GetStaticModelDefinitions(c *gin.Context) {
channel := strings.TrimSpace(c.Param("channel"))
if channel == "" {
channel = strings.TrimSpace(c.Query("channel"))
}
if channel == "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "channel is required"})
return
}
models := registry.GetStaticModelDefinitionsByChannel(channel)
if models == nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "unknown channel", "channel": channel})
return
}
c.JSON(http.StatusOK, gin.H{
"channel": strings.ToLower(strings.TrimSpace(channel)),
"models": models,
})
}

View File

@@ -8,6 +8,7 @@ import (
"io"
"net/http"
"strings"
"time"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
@@ -103,6 +104,7 @@ func captureRequestInfo(c *gin.Context) (*RequestInfo, error) {
Headers: headers,
Body: body,
RequestID: logging.GetGinRequestID(c),
Timestamp: time.Now(),
}, nil
}

View File

@@ -7,6 +7,7 @@ import (
"bytes"
"net/http"
"strings"
"time"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
@@ -20,22 +21,24 @@ type RequestInfo struct {
Headers map[string][]string // Headers contains the request headers.
Body []byte // Body is the raw request body.
RequestID string // RequestID is the unique identifier for the request.
Timestamp time.Time // Timestamp is when the request was received.
}
// ResponseWriterWrapper wraps the standard gin.ResponseWriter to intercept and log response data.
// It is designed to handle both standard and streaming responses, ensuring that logging operations do not block the client response.
type ResponseWriterWrapper struct {
gin.ResponseWriter
body *bytes.Buffer // body is a buffer to store the response body for non-streaming responses.
isStreaming bool // isStreaming indicates whether the response is a streaming type (e.g., text/event-stream).
streamWriter logging.StreamingLogWriter // streamWriter is a writer for handling streaming log entries.
chunkChannel chan []byte // chunkChannel is a channel for asynchronously passing response chunks to the logger.
streamDone chan struct{} // streamDone signals when the streaming goroutine completes.
logger logging.RequestLogger // logger is the instance of the request logger service.
requestInfo *RequestInfo // requestInfo holds the details of the original request.
statusCode int // statusCode stores the HTTP status code of the response.
headers map[string][]string // headers stores the response headers.
logOnErrorOnly bool // logOnErrorOnly enables logging only when an error response is detected.
body *bytes.Buffer // body is a buffer to store the response body for non-streaming responses.
isStreaming bool // isStreaming indicates whether the response is a streaming type (e.g., text/event-stream).
streamWriter logging.StreamingLogWriter // streamWriter is a writer for handling streaming log entries.
chunkChannel chan []byte // chunkChannel is a channel for asynchronously passing response chunks to the logger.
streamDone chan struct{} // streamDone signals when the streaming goroutine completes.
logger logging.RequestLogger // logger is the instance of the request logger service.
requestInfo *RequestInfo // requestInfo holds the details of the original request.
statusCode int // statusCode stores the HTTP status code of the response.
headers map[string][]string // headers stores the response headers.
logOnErrorOnly bool // logOnErrorOnly enables logging only when an error response is detected.
firstChunkTimestamp time.Time // firstChunkTimestamp captures TTFB for streaming responses.
}
// NewResponseWriterWrapper creates and initializes a new ResponseWriterWrapper.
@@ -73,6 +76,10 @@ func (w *ResponseWriterWrapper) Write(data []byte) (int, error) {
// THEN: Handle logging based on response type
if w.isStreaming && w.chunkChannel != nil {
// Capture TTFB on first chunk (synchronous, before async channel send)
if w.firstChunkTimestamp.IsZero() {
w.firstChunkTimestamp = time.Now()
}
// For streaming responses: Send to async logging channel (non-blocking)
select {
case w.chunkChannel <- append([]byte(nil), data...): // Non-blocking send with copy
@@ -117,6 +124,10 @@ func (w *ResponseWriterWrapper) WriteString(data string) (int, error) {
// THEN: Capture for logging
if w.isStreaming && w.chunkChannel != nil {
// Capture TTFB on first chunk (synchronous, before async channel send)
if w.firstChunkTimestamp.IsZero() {
w.firstChunkTimestamp = time.Now()
}
select {
case w.chunkChannel <- []byte(data):
default:
@@ -280,6 +291,8 @@ func (w *ResponseWriterWrapper) Finalize(c *gin.Context) error {
w.streamDone = nil
}
w.streamWriter.SetFirstChunkTimestamp(w.firstChunkTimestamp)
// Write API Request and Response to the streaming log before closing
apiRequest := w.extractAPIRequest(c)
if len(apiRequest) > 0 {
@@ -297,7 +310,7 @@ func (w *ResponseWriterWrapper) Finalize(c *gin.Context) error {
return nil
}
return w.logRequest(finalStatusCode, w.cloneHeaders(), w.body.Bytes(), w.extractAPIRequest(c), w.extractAPIResponse(c), slicesAPIResponseError, forceLog)
return w.logRequest(finalStatusCode, w.cloneHeaders(), w.body.Bytes(), w.extractAPIRequest(c), w.extractAPIResponse(c), w.extractAPIResponseTimestamp(c), slicesAPIResponseError, forceLog)
}
func (w *ResponseWriterWrapper) cloneHeaders() map[string][]string {
@@ -337,7 +350,18 @@ func (w *ResponseWriterWrapper) extractAPIResponse(c *gin.Context) []byte {
return data
}
func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]string, body []byte, apiRequestBody, apiResponseBody []byte, apiResponseErrors []*interfaces.ErrorMessage, forceLog bool) error {
func (w *ResponseWriterWrapper) extractAPIResponseTimestamp(c *gin.Context) time.Time {
ts, isExist := c.Get("API_RESPONSE_TIMESTAMP")
if !isExist {
return time.Time{}
}
if t, ok := ts.(time.Time); ok {
return t
}
return time.Time{}
}
func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]string, body []byte, apiRequestBody, apiResponseBody []byte, apiResponseTimestamp time.Time, apiResponseErrors []*interfaces.ErrorMessage, forceLog bool) error {
if w.requestInfo == nil {
return nil
}
@@ -348,7 +372,7 @@ func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]
}
if loggerWithOptions, ok := w.logger.(interface {
LogRequestWithOptions(string, string, map[string][]string, []byte, int, map[string][]string, []byte, []byte, []byte, []*interfaces.ErrorMessage, bool, string) error
LogRequestWithOptions(string, string, map[string][]string, []byte, int, map[string][]string, []byte, []byte, []byte, []*interfaces.ErrorMessage, bool, string, time.Time, time.Time) error
}); ok {
return loggerWithOptions.LogRequestWithOptions(
w.requestInfo.URL,
@@ -363,6 +387,8 @@ func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]
apiResponseErrors,
forceLog,
w.requestInfo.RequestID,
w.requestInfo.Timestamp,
apiResponseTimestamp,
)
}
@@ -378,5 +404,7 @@ func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]
apiResponseBody,
apiResponseErrors,
w.requestInfo.RequestID,
w.requestInfo.Timestamp,
apiResponseTimestamp,
)
}

View File

@@ -12,6 +12,7 @@ import (
"net/http"
"os"
"path/filepath"
"reflect"
"strings"
"sync"
"sync/atomic"
@@ -261,10 +262,7 @@ func NewServer(cfg *config.Config, authManager *auth.Manager, accessManager *sdk
if optionState.localPassword != "" {
s.mgmt.SetLocalPassword(optionState.localPassword)
}
logDir := filepath.Join(s.currentPath, "logs")
if base := util.WritablePath(); base != "" {
logDir = filepath.Join(base, "logs")
}
logDir := logging.ResolveLogDirectory(cfg)
s.mgmt.SetLogDirectory(logDir)
s.localPassword = optionState.localPassword
@@ -328,6 +326,7 @@ func (s *Server) setupRoutes() {
v1.POST("/messages", claudeCodeHandlers.ClaudeMessages)
v1.POST("/messages/count_tokens", claudeCodeHandlers.ClaudeCountTokens)
v1.POST("/responses", openaiResponsesHandlers.Responses)
v1.POST("/responses/compact", openaiResponsesHandlers.Compact)
}
// Gemini compatible API routes
@@ -610,9 +609,11 @@ func (s *Server) registerManagementRoutes() {
mgmt.GET("/auth-files", s.mgmt.ListAuthFiles)
mgmt.GET("/auth-files/models", s.mgmt.GetAuthFileModels)
mgmt.GET("/model-definitions/:channel", s.mgmt.GetStaticModelDefinitions)
mgmt.GET("/auth-files/download", s.mgmt.DownloadAuthFile)
mgmt.POST("/auth-files", s.mgmt.UploadAuthFile)
mgmt.DELETE("/auth-files", s.mgmt.DeleteAuthFile)
mgmt.PATCH("/auth-files/status", s.mgmt.PatchAuthFileStatus)
mgmt.POST("/vertex/import", s.mgmt.ImportVertexCredential)
mgmt.GET("/anthropic-auth-url", s.mgmt.RequestAnthropicToken)
@@ -991,14 +992,17 @@ func (s *Server) UpdateClients(cfg *config.Config) {
s.mgmt.SetAuthManager(s.handlers.AuthManager)
}
// Notify Amp module of config changes (for model mapping hot-reload)
if s.ampModule != nil {
log.Debugf("triggering amp module config update")
if err := s.ampModule.OnConfigUpdated(cfg); err != nil {
log.Errorf("failed to update Amp module config: %v", err)
// Notify Amp module only when Amp config has changed.
ampConfigChanged := oldCfg == nil || !reflect.DeepEqual(oldCfg.AmpCode, cfg.AmpCode)
if ampConfigChanged {
if s.ampModule != nil {
log.Debugf("triggering amp module config update")
if err := s.ampModule.OnConfigUpdated(cfg); err != nil {
log.Errorf("failed to update Amp module config: %v", err)
}
} else {
log.Warnf("amp module is nil, skipping config update")
}
} else {
log.Warnf("amp module is nil, skipping config update")
}
// Count client sources from configuration and auth store.

View File

@@ -0,0 +1,344 @@
// Package antigravity provides OAuth2 authentication functionality for the Antigravity provider.
package antigravity
import (
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"net/url"
"strings"
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
log "github.com/sirupsen/logrus"
)
// TokenResponse represents OAuth token response from Google
type TokenResponse struct {
AccessToken string `json:"access_token"`
RefreshToken string `json:"refresh_token"`
ExpiresIn int64 `json:"expires_in"`
TokenType string `json:"token_type"`
}
// userInfo represents Google user profile
type userInfo struct {
Email string `json:"email"`
}
// AntigravityAuth handles Antigravity OAuth authentication
type AntigravityAuth struct {
httpClient *http.Client
}
// NewAntigravityAuth creates a new Antigravity auth service.
func NewAntigravityAuth(cfg *config.Config, httpClient *http.Client) *AntigravityAuth {
if httpClient != nil {
return &AntigravityAuth{httpClient: httpClient}
}
if cfg == nil {
cfg = &config.Config{}
}
return &AntigravityAuth{
httpClient: util.SetProxy(&cfg.SDKConfig, &http.Client{}),
}
}
// BuildAuthURL generates the OAuth authorization URL.
func (o *AntigravityAuth) BuildAuthURL(state, redirectURI string) string {
if strings.TrimSpace(redirectURI) == "" {
redirectURI = fmt.Sprintf("http://localhost:%d/oauth-callback", CallbackPort)
}
params := url.Values{}
params.Set("access_type", "offline")
params.Set("client_id", ClientID)
params.Set("prompt", "consent")
params.Set("redirect_uri", redirectURI)
params.Set("response_type", "code")
params.Set("scope", strings.Join(Scopes, " "))
params.Set("state", state)
return AuthEndpoint + "?" + params.Encode()
}
// ExchangeCodeForTokens exchanges authorization code for access and refresh tokens
func (o *AntigravityAuth) ExchangeCodeForTokens(ctx context.Context, code, redirectURI string) (*TokenResponse, error) {
data := url.Values{}
data.Set("code", code)
data.Set("client_id", ClientID)
data.Set("client_secret", ClientSecret)
data.Set("redirect_uri", redirectURI)
data.Set("grant_type", "authorization_code")
req, err := http.NewRequestWithContext(ctx, http.MethodPost, TokenEndpoint, strings.NewReader(data.Encode()))
if err != nil {
return nil, fmt.Errorf("antigravity token exchange: create request: %w", err)
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
resp, errDo := o.httpClient.Do(req)
if errDo != nil {
return nil, fmt.Errorf("antigravity token exchange: execute request: %w", errDo)
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("antigravity token exchange: close body error: %v", errClose)
}
}()
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
bodyBytes, errRead := io.ReadAll(io.LimitReader(resp.Body, 8<<10))
if errRead != nil {
return nil, fmt.Errorf("antigravity token exchange: read response: %w", errRead)
}
body := strings.TrimSpace(string(bodyBytes))
if body == "" {
return nil, fmt.Errorf("antigravity token exchange: request failed: status %d", resp.StatusCode)
}
return nil, fmt.Errorf("antigravity token exchange: request failed: status %d: %s", resp.StatusCode, body)
}
var token TokenResponse
if errDecode := json.NewDecoder(resp.Body).Decode(&token); errDecode != nil {
return nil, fmt.Errorf("antigravity token exchange: decode response: %w", errDecode)
}
return &token, nil
}
// FetchUserInfo retrieves user email from Google
func (o *AntigravityAuth) FetchUserInfo(ctx context.Context, accessToken string) (string, error) {
accessToken = strings.TrimSpace(accessToken)
if accessToken == "" {
return "", fmt.Errorf("antigravity userinfo: missing access token")
}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, UserInfoEndpoint, nil)
if err != nil {
return "", fmt.Errorf("antigravity userinfo: create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+accessToken)
resp, errDo := o.httpClient.Do(req)
if errDo != nil {
return "", fmt.Errorf("antigravity userinfo: execute request: %w", errDo)
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("antigravity userinfo: close body error: %v", errClose)
}
}()
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
bodyBytes, errRead := io.ReadAll(io.LimitReader(resp.Body, 8<<10))
if errRead != nil {
return "", fmt.Errorf("antigravity userinfo: read response: %w", errRead)
}
body := strings.TrimSpace(string(bodyBytes))
if body == "" {
return "", fmt.Errorf("antigravity userinfo: request failed: status %d", resp.StatusCode)
}
return "", fmt.Errorf("antigravity userinfo: request failed: status %d: %s", resp.StatusCode, body)
}
var info userInfo
if errDecode := json.NewDecoder(resp.Body).Decode(&info); errDecode != nil {
return "", fmt.Errorf("antigravity userinfo: decode response: %w", errDecode)
}
email := strings.TrimSpace(info.Email)
if email == "" {
return "", fmt.Errorf("antigravity userinfo: response missing email")
}
return email, nil
}
// FetchProjectID retrieves the project ID for the authenticated user via loadCodeAssist
func (o *AntigravityAuth) FetchProjectID(ctx context.Context, accessToken string) (string, error) {
loadReqBody := map[string]any{
"metadata": map[string]string{
"ideType": "ANTIGRAVITY",
"platform": "PLATFORM_UNSPECIFIED",
"pluginType": "GEMINI",
},
}
rawBody, errMarshal := json.Marshal(loadReqBody)
if errMarshal != nil {
return "", fmt.Errorf("marshal request body: %w", errMarshal)
}
endpointURL := fmt.Sprintf("%s/%s:loadCodeAssist", APIEndpoint, APIVersion)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpointURL, strings.NewReader(string(rawBody)))
if err != nil {
return "", fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+accessToken)
req.Header.Set("Content-Type", "application/json")
req.Header.Set("User-Agent", APIUserAgent)
req.Header.Set("X-Goog-Api-Client", APIClient)
req.Header.Set("Client-Metadata", ClientMetadata)
resp, errDo := o.httpClient.Do(req)
if errDo != nil {
return "", fmt.Errorf("execute request: %w", errDo)
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("antigravity loadCodeAssist: close body error: %v", errClose)
}
}()
bodyBytes, errRead := io.ReadAll(resp.Body)
if errRead != nil {
return "", fmt.Errorf("read response: %w", errRead)
}
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
return "", fmt.Errorf("request failed with status %d: %s", resp.StatusCode, strings.TrimSpace(string(bodyBytes)))
}
var loadResp map[string]any
if errDecode := json.Unmarshal(bodyBytes, &loadResp); errDecode != nil {
return "", fmt.Errorf("decode response: %w", errDecode)
}
// Extract projectID from response
projectID := ""
if id, ok := loadResp["cloudaicompanionProject"].(string); ok {
projectID = strings.TrimSpace(id)
}
if projectID == "" {
if projectMap, ok := loadResp["cloudaicompanionProject"].(map[string]any); ok {
if id, okID := projectMap["id"].(string); okID {
projectID = strings.TrimSpace(id)
}
}
}
if projectID == "" {
tierID := "legacy-tier"
if tiers, okTiers := loadResp["allowedTiers"].([]any); okTiers {
for _, rawTier := range tiers {
tier, okTier := rawTier.(map[string]any)
if !okTier {
continue
}
if isDefault, okDefault := tier["isDefault"].(bool); okDefault && isDefault {
if id, okID := tier["id"].(string); okID && strings.TrimSpace(id) != "" {
tierID = strings.TrimSpace(id)
break
}
}
}
}
projectID, err = o.OnboardUser(ctx, accessToken, tierID)
if err != nil {
return "", err
}
return projectID, nil
}
return projectID, nil
}
// OnboardUser attempts to fetch the project ID via onboardUser by polling for completion
func (o *AntigravityAuth) OnboardUser(ctx context.Context, accessToken, tierID string) (string, error) {
log.Infof("Antigravity: onboarding user with tier: %s", tierID)
requestBody := map[string]any{
"tierId": tierID,
"metadata": map[string]string{
"ideType": "ANTIGRAVITY",
"platform": "PLATFORM_UNSPECIFIED",
"pluginType": "GEMINI",
},
}
rawBody, errMarshal := json.Marshal(requestBody)
if errMarshal != nil {
return "", fmt.Errorf("marshal request body: %w", errMarshal)
}
maxAttempts := 5
for attempt := 1; attempt <= maxAttempts; attempt++ {
log.Debugf("Polling attempt %d/%d", attempt, maxAttempts)
reqCtx := ctx
var cancel context.CancelFunc
if reqCtx == nil {
reqCtx = context.Background()
}
reqCtx, cancel = context.WithTimeout(reqCtx, 30*time.Second)
endpointURL := fmt.Sprintf("%s/%s:onboardUser", APIEndpoint, APIVersion)
req, errRequest := http.NewRequestWithContext(reqCtx, http.MethodPost, endpointURL, strings.NewReader(string(rawBody)))
if errRequest != nil {
cancel()
return "", fmt.Errorf("create request: %w", errRequest)
}
req.Header.Set("Authorization", "Bearer "+accessToken)
req.Header.Set("Content-Type", "application/json")
req.Header.Set("User-Agent", APIUserAgent)
req.Header.Set("X-Goog-Api-Client", APIClient)
req.Header.Set("Client-Metadata", ClientMetadata)
resp, errDo := o.httpClient.Do(req)
if errDo != nil {
cancel()
return "", fmt.Errorf("execute request: %w", errDo)
}
bodyBytes, errRead := io.ReadAll(resp.Body)
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("close body error: %v", errClose)
}
cancel()
if errRead != nil {
return "", fmt.Errorf("read response: %w", errRead)
}
if resp.StatusCode == http.StatusOK {
var data map[string]any
if errDecode := json.Unmarshal(bodyBytes, &data); errDecode != nil {
return "", fmt.Errorf("decode response: %w", errDecode)
}
if done, okDone := data["done"].(bool); okDone && done {
projectID := ""
if responseData, okResp := data["response"].(map[string]any); okResp {
switch projectValue := responseData["cloudaicompanionProject"].(type) {
case map[string]any:
if id, okID := projectValue["id"].(string); okID {
projectID = strings.TrimSpace(id)
}
case string:
projectID = strings.TrimSpace(projectValue)
}
}
if projectID != "" {
log.Infof("Successfully fetched project_id: %s", projectID)
return projectID, nil
}
return "", fmt.Errorf("no project_id in response")
}
time.Sleep(2 * time.Second)
continue
}
responsePreview := strings.TrimSpace(string(bodyBytes))
if len(responsePreview) > 500 {
responsePreview = responsePreview[:500]
}
responseErr := responsePreview
if len(responseErr) > 200 {
responseErr = responseErr[:200]
}
return "", fmt.Errorf("http %d: %s", resp.StatusCode, responseErr)
}
return "", nil
}

View File

@@ -0,0 +1,34 @@
// Package antigravity provides OAuth2 authentication functionality for the Antigravity provider.
package antigravity
// OAuth client credentials and configuration
const (
ClientID = "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com"
ClientSecret = "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"
CallbackPort = 51121
)
// Scopes defines the OAuth scopes required for Antigravity authentication
var Scopes = []string{
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
"https://www.googleapis.com/auth/cclog",
"https://www.googleapis.com/auth/experimentsandconfigs",
}
// OAuth2 endpoints for Google authentication
const (
TokenEndpoint = "https://oauth2.googleapis.com/token"
AuthEndpoint = "https://accounts.google.com/o/oauth2/v2/auth"
UserInfoEndpoint = "https://www.googleapis.com/oauth2/v1/userinfo?alt=json"
)
// Antigravity API configuration
const (
APIEndpoint = "https://cloudcode-pa.googleapis.com"
APIVersion = "v1internal"
APIUserAgent = "google-api-nodejs-client/9.15.1"
APIClient = "google-cloud-sdk vscode_cloudshelleditor/0.1"
ClientMetadata = `{"ideType":"IDE_UNSPECIFIED","platform":"PLATFORM_UNSPECIFIED","pluginType":"GEMINI"}`
)

View File

@@ -0,0 +1,16 @@
package antigravity
import (
"fmt"
"strings"
)
// CredentialFileName returns the filename used to persist Antigravity credentials.
// It uses the email as a suffix to disambiguate accounts.
func CredentialFileName(email string) string {
email = strings.TrimSpace(email)
if email == "" {
return "antigravity.json"
}
return fmt.Sprintf("antigravity-%s.json", email)
}

View File

@@ -14,15 +14,15 @@ import (
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
log "github.com/sirupsen/logrus"
)
// OAuth configuration constants for Claude/Anthropic
const (
anthropicAuthURL = "https://claude.ai/oauth/authorize"
anthropicTokenURL = "https://console.anthropic.com/v1/oauth/token"
anthropicClientID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
redirectURI = "http://localhost:54545/callback"
AuthURL = "https://claude.ai/oauth/authorize"
TokenURL = "https://console.anthropic.com/v1/oauth/token"
ClientID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
RedirectURI = "http://localhost:54545/callback"
)
// tokenResponse represents the response structure from Anthropic's OAuth token endpoint.
@@ -50,7 +50,8 @@ type ClaudeAuth struct {
}
// NewClaudeAuth creates a new Anthropic authentication service.
// It initializes the HTTP client with proxy settings from the configuration.
// It initializes the HTTP client with a custom TLS transport that uses Firefox
// fingerprint to bypass Cloudflare's TLS fingerprinting on Anthropic domains.
//
// Parameters:
// - cfg: The application configuration containing proxy settings
@@ -58,8 +59,10 @@ type ClaudeAuth struct {
// Returns:
// - *ClaudeAuth: A new Claude authentication service instance
func NewClaudeAuth(cfg *config.Config) *ClaudeAuth {
// Use custom HTTP client with Firefox TLS fingerprint to bypass
// Cloudflare's bot detection on Anthropic domains
return &ClaudeAuth{
httpClient: util.SetProxy(&cfg.SDKConfig, &http.Client{}),
httpClient: NewAnthropicHttpClient(&cfg.SDKConfig),
}
}
@@ -82,16 +85,16 @@ func (o *ClaudeAuth) GenerateAuthURL(state string, pkceCodes *PKCECodes) (string
params := url.Values{
"code": {"true"},
"client_id": {anthropicClientID},
"client_id": {ClientID},
"response_type": {"code"},
"redirect_uri": {redirectURI},
"redirect_uri": {RedirectURI},
"scope": {"org:create_api_key user:profile user:inference"},
"code_challenge": {pkceCodes.CodeChallenge},
"code_challenge_method": {"S256"},
"state": {state},
}
authURL := fmt.Sprintf("%s?%s", anthropicAuthURL, params.Encode())
authURL := fmt.Sprintf("%s?%s", AuthURL, params.Encode())
return authURL, state, nil
}
@@ -137,8 +140,8 @@ func (o *ClaudeAuth) ExchangeCodeForTokens(ctx context.Context, code, state stri
"code": newCode,
"state": state,
"grant_type": "authorization_code",
"client_id": anthropicClientID,
"redirect_uri": redirectURI,
"client_id": ClientID,
"redirect_uri": RedirectURI,
"code_verifier": pkceCodes.CodeVerifier,
}
@@ -154,7 +157,7 @@ func (o *ClaudeAuth) ExchangeCodeForTokens(ctx context.Context, code, state stri
// log.Debugf("Token exchange request: %s", string(jsonBody))
req, err := http.NewRequestWithContext(ctx, "POST", anthropicTokenURL, strings.NewReader(string(jsonBody)))
req, err := http.NewRequestWithContext(ctx, "POST", TokenURL, strings.NewReader(string(jsonBody)))
if err != nil {
return nil, fmt.Errorf("failed to create token request: %w", err)
}
@@ -221,7 +224,7 @@ func (o *ClaudeAuth) RefreshTokens(ctx context.Context, refreshToken string) (*C
}
reqBody := map[string]interface{}{
"client_id": anthropicClientID,
"client_id": ClientID,
"grant_type": "refresh_token",
"refresh_token": refreshToken,
}
@@ -231,7 +234,7 @@ func (o *ClaudeAuth) RefreshTokens(ctx context.Context, refreshToken string) (*C
return nil, fmt.Errorf("failed to marshal request body: %w", err)
}
req, err := http.NewRequestWithContext(ctx, "POST", anthropicTokenURL, strings.NewReader(string(jsonBody)))
req, err := http.NewRequestWithContext(ctx, "POST", TokenURL, strings.NewReader(string(jsonBody)))
if err != nil {
return nil, fmt.Errorf("failed to create refresh request: %w", err)
}

View File

@@ -0,0 +1,165 @@
// Package claude provides authentication functionality for Anthropic's Claude API.
// This file implements a custom HTTP transport using utls to bypass TLS fingerprinting.
package claude
import (
"net/http"
"net/url"
"strings"
"sync"
tls "github.com/refraction-networking/utls"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/config"
log "github.com/sirupsen/logrus"
"golang.org/x/net/http2"
"golang.org/x/net/proxy"
)
// utlsRoundTripper implements http.RoundTripper using utls with Firefox fingerprint
// to bypass Cloudflare's TLS fingerprinting on Anthropic domains.
type utlsRoundTripper struct {
// mu protects the connections map and pending map
mu sync.Mutex
// connections caches HTTP/2 client connections per host
connections map[string]*http2.ClientConn
// pending tracks hosts that are currently being connected to (prevents race condition)
pending map[string]*sync.Cond
// dialer is used to create network connections, supporting proxies
dialer proxy.Dialer
}
// newUtlsRoundTripper creates a new utls-based round tripper with optional proxy support
func newUtlsRoundTripper(cfg *config.SDKConfig) *utlsRoundTripper {
var dialer proxy.Dialer = proxy.Direct
if cfg != nil && cfg.ProxyURL != "" {
proxyURL, err := url.Parse(cfg.ProxyURL)
if err != nil {
log.Errorf("failed to parse proxy URL %q: %v", cfg.ProxyURL, err)
} else {
pDialer, err := proxy.FromURL(proxyURL, proxy.Direct)
if err != nil {
log.Errorf("failed to create proxy dialer for %q: %v", cfg.ProxyURL, err)
} else {
dialer = pDialer
}
}
}
return &utlsRoundTripper{
connections: make(map[string]*http2.ClientConn),
pending: make(map[string]*sync.Cond),
dialer: dialer,
}
}
// getOrCreateConnection gets an existing connection or creates a new one.
// It uses a per-host locking mechanism to prevent multiple goroutines from
// creating connections to the same host simultaneously.
func (t *utlsRoundTripper) getOrCreateConnection(host, addr string) (*http2.ClientConn, error) {
t.mu.Lock()
// Check if connection exists and is usable
if h2Conn, ok := t.connections[host]; ok && h2Conn.CanTakeNewRequest() {
t.mu.Unlock()
return h2Conn, nil
}
// Check if another goroutine is already creating a connection
if cond, ok := t.pending[host]; ok {
// Wait for the other goroutine to finish
cond.Wait()
// Check if connection is now available
if h2Conn, ok := t.connections[host]; ok && h2Conn.CanTakeNewRequest() {
t.mu.Unlock()
return h2Conn, nil
}
// Connection still not available, we'll create one
}
// Mark this host as pending
cond := sync.NewCond(&t.mu)
t.pending[host] = cond
t.mu.Unlock()
// Create connection outside the lock
h2Conn, err := t.createConnection(host, addr)
t.mu.Lock()
defer t.mu.Unlock()
// Remove pending marker and wake up waiting goroutines
delete(t.pending, host)
cond.Broadcast()
if err != nil {
return nil, err
}
// Store the new connection
t.connections[host] = h2Conn
return h2Conn, nil
}
// createConnection creates a new HTTP/2 connection with Firefox TLS fingerprint
func (t *utlsRoundTripper) createConnection(host, addr string) (*http2.ClientConn, error) {
conn, err := t.dialer.Dial("tcp", addr)
if err != nil {
return nil, err
}
tlsConfig := &tls.Config{ServerName: host}
tlsConn := tls.UClient(conn, tlsConfig, tls.HelloFirefox_Auto)
if err := tlsConn.Handshake(); err != nil {
conn.Close()
return nil, err
}
tr := &http2.Transport{}
h2Conn, err := tr.NewClientConn(tlsConn)
if err != nil {
tlsConn.Close()
return nil, err
}
return h2Conn, nil
}
// RoundTrip implements http.RoundTripper
func (t *utlsRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
host := req.URL.Host
addr := host
if !strings.Contains(addr, ":") {
addr += ":443"
}
// Get hostname without port for TLS ServerName
hostname := req.URL.Hostname()
h2Conn, err := t.getOrCreateConnection(hostname, addr)
if err != nil {
return nil, err
}
resp, err := h2Conn.RoundTrip(req)
if err != nil {
// Connection failed, remove it from cache
t.mu.Lock()
if cached, ok := t.connections[hostname]; ok && cached == h2Conn {
delete(t.connections, hostname)
}
t.mu.Unlock()
return nil, err
}
return resp, nil
}
// NewAnthropicHttpClient creates an HTTP client that bypasses TLS fingerprinting
// for Anthropic domains by using utls with Firefox fingerprint.
// It accepts optional SDK configuration for proxy settings.
func NewAnthropicHttpClient(cfg *config.SDKConfig) *http.Client {
return &http.Client{
Transport: newUtlsRoundTripper(cfg),
}
}

View File

@@ -0,0 +1,46 @@
package codex
import (
"fmt"
"strings"
"unicode"
)
// CredentialFileName returns the filename used to persist Codex OAuth credentials.
// When planType is available (e.g. "plus", "team"), it is appended after the email
// as a suffix to disambiguate subscriptions.
func CredentialFileName(email, planType, hashAccountID string, includeProviderPrefix bool) string {
email = strings.TrimSpace(email)
plan := normalizePlanTypeForFilename(planType)
prefix := ""
if includeProviderPrefix {
prefix = "codex"
}
if plan == "" {
return fmt.Sprintf("%s-%s.json", prefix, email)
} else if plan == "team" {
return fmt.Sprintf("%s-%s-%s-%s.json", prefix, hashAccountID, email, plan)
}
return fmt.Sprintf("%s-%s-%s.json", prefix, email, plan)
}
func normalizePlanTypeForFilename(planType string) string {
planType = strings.TrimSpace(planType)
if planType == "" {
return ""
}
parts := strings.FieldsFunc(planType, func(r rune) bool {
return !unicode.IsLetter(r) && !unicode.IsDigit(r)
})
if len(parts) == 0 {
return ""
}
for i, part := range parts {
parts[i] = strings.ToLower(strings.TrimSpace(part))
}
return strings.Join(parts, "-")
}

View File

@@ -19,11 +19,12 @@ import (
log "github.com/sirupsen/logrus"
)
// OAuth configuration constants for OpenAI Codex
const (
openaiAuthURL = "https://auth.openai.com/oauth/authorize"
openaiTokenURL = "https://auth.openai.com/oauth/token"
openaiClientID = "app_EMoamEEZ73f0CkXaXp7hrann"
redirectURI = "http://localhost:1455/auth/callback"
AuthURL = "https://auth.openai.com/oauth/authorize"
TokenURL = "https://auth.openai.com/oauth/token"
ClientID = "app_EMoamEEZ73f0CkXaXp7hrann"
RedirectURI = "http://localhost:1455/auth/callback"
)
// CodexAuth handles the OpenAI OAuth2 authentication flow.
@@ -50,9 +51,9 @@ func (o *CodexAuth) GenerateAuthURL(state string, pkceCodes *PKCECodes) (string,
}
params := url.Values{
"client_id": {openaiClientID},
"client_id": {ClientID},
"response_type": {"code"},
"redirect_uri": {redirectURI},
"redirect_uri": {RedirectURI},
"scope": {"openid email profile offline_access"},
"state": {state},
"code_challenge": {pkceCodes.CodeChallenge},
@@ -62,7 +63,7 @@ func (o *CodexAuth) GenerateAuthURL(state string, pkceCodes *PKCECodes) (string,
"codex_cli_simplified_flow": {"true"},
}
authURL := fmt.Sprintf("%s?%s", openaiAuthURL, params.Encode())
authURL := fmt.Sprintf("%s?%s", AuthURL, params.Encode())
return authURL, nil
}
@@ -77,13 +78,13 @@ func (o *CodexAuth) ExchangeCodeForTokens(ctx context.Context, code string, pkce
// Prepare token exchange request
data := url.Values{
"grant_type": {"authorization_code"},
"client_id": {openaiClientID},
"client_id": {ClientID},
"code": {code},
"redirect_uri": {redirectURI},
"redirect_uri": {RedirectURI},
"code_verifier": {pkceCodes.CodeVerifier},
}
req, err := http.NewRequestWithContext(ctx, "POST", openaiTokenURL, strings.NewReader(data.Encode()))
req, err := http.NewRequestWithContext(ctx, "POST", TokenURL, strings.NewReader(data.Encode()))
if err != nil {
return nil, fmt.Errorf("failed to create token request: %w", err)
}
@@ -163,13 +164,13 @@ func (o *CodexAuth) RefreshTokens(ctx context.Context, refreshToken string) (*Co
}
data := url.Values{
"client_id": {openaiClientID},
"client_id": {ClientID},
"grant_type": {"refresh_token"},
"refresh_token": {refreshToken},
"scope": {"openid profile email"},
}
req, err := http.NewRequestWithContext(ctx, "POST", openaiTokenURL, strings.NewReader(data.Encode()))
req, err := http.NewRequestWithContext(ctx, "POST", TokenURL, strings.NewReader(data.Encode()))
if err != nil {
return nil, fmt.Errorf("failed to create refresh request: %w", err)
}

View File

@@ -28,19 +28,19 @@ import (
"golang.org/x/oauth2/google"
)
// OAuth configuration constants for Gemini
const (
geminiOauthClientID = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
geminiOauthClientSecret = "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"
geminiDefaultCallbackPort = 8085
ClientID = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
ClientSecret = "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"
DefaultCallbackPort = 8085
)
var (
geminiOauthScopes = []string{
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
}
)
// OAuth scopes for Gemini authentication
var Scopes = []string{
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
}
// GeminiAuth provides methods for handling the Gemini OAuth2 authentication flow.
// It encapsulates the logic for obtaining, storing, and refreshing authentication tokens
@@ -74,7 +74,7 @@ func NewGeminiAuth() *GeminiAuth {
// - *http.Client: An HTTP client configured with authentication
// - error: An error if the client configuration fails, nil otherwise
func (g *GeminiAuth) GetAuthenticatedClient(ctx context.Context, ts *GeminiTokenStorage, cfg *config.Config, opts *WebLoginOptions) (*http.Client, error) {
callbackPort := geminiDefaultCallbackPort
callbackPort := DefaultCallbackPort
if opts != nil && opts.CallbackPort > 0 {
callbackPort = opts.CallbackPort
}
@@ -112,10 +112,10 @@ func (g *GeminiAuth) GetAuthenticatedClient(ctx context.Context, ts *GeminiToken
// Configure the OAuth2 client.
conf := &oauth2.Config{
ClientID: geminiOauthClientID,
ClientSecret: geminiOauthClientSecret,
ClientID: ClientID,
ClientSecret: ClientSecret,
RedirectURL: callbackURL, // This will be used by the local server.
Scopes: geminiOauthScopes,
Scopes: Scopes,
Endpoint: google.Endpoint,
}
@@ -198,9 +198,9 @@ func (g *GeminiAuth) createTokenStorage(ctx context.Context, config *oauth2.Conf
}
ifToken["token_uri"] = "https://oauth2.googleapis.com/token"
ifToken["client_id"] = geminiOauthClientID
ifToken["client_secret"] = geminiOauthClientSecret
ifToken["scopes"] = geminiOauthScopes
ifToken["client_id"] = ClientID
ifToken["client_secret"] = ClientSecret
ifToken["scopes"] = Scopes
ifToken["universe_domain"] = "googleapis.com"
ts := GeminiTokenStorage{
@@ -226,7 +226,7 @@ func (g *GeminiAuth) createTokenStorage(ctx context.Context, config *oauth2.Conf
// - *oauth2.Token: The OAuth2 token obtained from the authorization flow
// - error: An error if the token acquisition fails, nil otherwise
func (g *GeminiAuth) getTokenFromWeb(ctx context.Context, config *oauth2.Config, opts *WebLoginOptions) (*oauth2.Token, error) {
callbackPort := geminiDefaultCallbackPort
callbackPort := DefaultCallbackPort
if opts != nil && opts.CallbackPort > 0 {
callbackPort = opts.CallbackPort
}

View File

@@ -3,6 +3,7 @@ package cache
import (
"crypto/sha256"
"encoding/hex"
"strings"
"sync"
"time"
)
@@ -23,18 +24,18 @@ const (
// MinValidSignatureLen is the minimum length for a signature to be considered valid
MinValidSignatureLen = 50
// SessionCleanupInterval controls how often stale sessions are purged
SessionCleanupInterval = 10 * time.Minute
// CacheCleanupInterval controls how often stale entries are purged
CacheCleanupInterval = 10 * time.Minute
)
// signatureCache stores signatures by sessionId -> textHash -> SignatureEntry
// signatureCache stores signatures by model group -> textHash -> SignatureEntry
var signatureCache sync.Map
// sessionCleanupOnce ensures the background cleanup goroutine starts only once
var sessionCleanupOnce sync.Once
// cacheCleanupOnce ensures the background cleanup goroutine starts only once
var cacheCleanupOnce sync.Once
// sessionCache is the inner map type
type sessionCache struct {
// groupCache is the inner map type
type groupCache struct {
mu sync.RWMutex
entries map[string]SignatureEntry
}
@@ -45,36 +46,36 @@ func hashText(text string) string {
return hex.EncodeToString(h[:])[:SignatureTextHashLen]
}
// getOrCreateSession gets or creates a session cache
func getOrCreateSession(sessionID string) *sessionCache {
// getOrCreateGroupCache gets or creates a cache bucket for a model group
func getOrCreateGroupCache(groupKey string) *groupCache {
// Start background cleanup on first access
sessionCleanupOnce.Do(startSessionCleanup)
cacheCleanupOnce.Do(startCacheCleanup)
if val, ok := signatureCache.Load(sessionID); ok {
return val.(*sessionCache)
if val, ok := signatureCache.Load(groupKey); ok {
return val.(*groupCache)
}
sc := &sessionCache{entries: make(map[string]SignatureEntry)}
actual, _ := signatureCache.LoadOrStore(sessionID, sc)
return actual.(*sessionCache)
sc := &groupCache{entries: make(map[string]SignatureEntry)}
actual, _ := signatureCache.LoadOrStore(groupKey, sc)
return actual.(*groupCache)
}
// startSessionCleanup launches a background goroutine that periodically
// removes sessions where all entries have expired.
func startSessionCleanup() {
// startCacheCleanup launches a background goroutine that periodically
// removes caches where all entries have expired.
func startCacheCleanup() {
go func() {
ticker := time.NewTicker(SessionCleanupInterval)
ticker := time.NewTicker(CacheCleanupInterval)
defer ticker.Stop()
for range ticker.C {
purgeExpiredSessions()
purgeExpiredCaches()
}
}()
}
// purgeExpiredSessions removes sessions with no valid (non-expired) entries.
func purgeExpiredSessions() {
// purgeExpiredCaches removes caches with no valid (non-expired) entries.
func purgeExpiredCaches() {
now := time.Now()
signatureCache.Range(func(key, value any) bool {
sc := value.(*sessionCache)
sc := value.(*groupCache)
sc.mu.Lock()
// Remove expired entries
for k, entry := range sc.entries {
@@ -84,7 +85,7 @@ func purgeExpiredSessions() {
}
isEmpty := len(sc.entries) == 0
sc.mu.Unlock()
// Remove session if empty
// Remove cache bucket if empty
if isEmpty {
signatureCache.Delete(key)
}
@@ -92,19 +93,19 @@ func purgeExpiredSessions() {
})
}
// CacheSignature stores a thinking signature for a given session and text.
// CacheSignature stores a thinking signature for a given model group and text.
// Used for Claude models that require signed thinking blocks in multi-turn conversations.
func CacheSignature(sessionID, text, signature string) {
if sessionID == "" || text == "" || signature == "" {
func CacheSignature(modelName, text, signature string) {
if text == "" || signature == "" {
return
}
if len(signature) < MinValidSignatureLen {
return
}
sc := getOrCreateSession(sessionID)
groupKey := GetModelGroup(modelName)
textHash := hashText(text)
sc := getOrCreateGroupCache(groupKey)
sc.mu.Lock()
defer sc.mu.Unlock()
@@ -114,18 +115,25 @@ func CacheSignature(sessionID, text, signature string) {
}
}
// GetCachedSignature retrieves a cached signature for a given session and text.
// GetCachedSignature retrieves a cached signature for a given model group and text.
// Returns empty string if not found or expired.
func GetCachedSignature(sessionID, text string) string {
if sessionID == "" || text == "" {
return ""
}
func GetCachedSignature(modelName, text string) string {
groupKey := GetModelGroup(modelName)
val, ok := signatureCache.Load(sessionID)
if !ok {
if text == "" {
if groupKey == "gemini" {
return "skip_thought_signature_validator"
}
return ""
}
sc := val.(*sessionCache)
val, ok := signatureCache.Load(groupKey)
if !ok {
if groupKey == "gemini" {
return "skip_thought_signature_validator"
}
return ""
}
sc := val.(*groupCache)
textHash := hashText(text)
@@ -135,11 +143,17 @@ func GetCachedSignature(sessionID, text string) string {
entry, exists := sc.entries[textHash]
if !exists {
sc.mu.Unlock()
if groupKey == "gemini" {
return "skip_thought_signature_validator"
}
return ""
}
if now.Sub(entry.Timestamp) > SignatureCacheTTL {
delete(sc.entries, textHash)
sc.mu.Unlock()
if groupKey == "gemini" {
return "skip_thought_signature_validator"
}
return ""
}
@@ -151,19 +165,31 @@ func GetCachedSignature(sessionID, text string) string {
return entry.Signature
}
// ClearSignatureCache clears signature cache for a specific session or all sessions.
func ClearSignatureCache(sessionID string) {
if sessionID != "" {
signatureCache.Delete(sessionID)
} else {
// ClearSignatureCache clears signature cache for a specific model group or all groups.
func ClearSignatureCache(modelName string) {
if modelName == "" {
signatureCache.Range(func(key, _ any) bool {
signatureCache.Delete(key)
return true
})
return
}
groupKey := GetModelGroup(modelName)
signatureCache.Delete(groupKey)
}
// HasValidSignature checks if a signature is valid (non-empty and long enough)
func HasValidSignature(signature string) bool {
return signature != "" && len(signature) >= MinValidSignatureLen
func HasValidSignature(modelName, signature string) bool {
return (signature != "" && len(signature) >= MinValidSignatureLen) || (signature == "skip_thought_signature_validator" && GetModelGroup(modelName) == "gemini")
}
func GetModelGroup(modelName string) string {
if strings.Contains(modelName, "gpt") {
return "gpt"
} else if strings.Contains(modelName, "claude") {
return "claude"
} else if strings.Contains(modelName, "gemini") {
return "gemini"
}
return modelName
}

View File

@@ -5,38 +5,40 @@ import (
"time"
)
const testModelName = "claude-sonnet-4-5"
func TestCacheSignature_BasicStorageAndRetrieval(t *testing.T) {
ClearSignatureCache("")
sessionID := "test-session-1"
text := "This is some thinking text content"
signature := "abc123validSignature1234567890123456789012345678901234567890"
// Store signature
CacheSignature(sessionID, text, signature)
CacheSignature(testModelName, text, signature)
// Retrieve signature
retrieved := GetCachedSignature(sessionID, text)
retrieved := GetCachedSignature(testModelName, text)
if retrieved != signature {
t.Errorf("Expected signature '%s', got '%s'", signature, retrieved)
}
}
func TestCacheSignature_DifferentSessions(t *testing.T) {
func TestCacheSignature_DifferentModelGroups(t *testing.T) {
ClearSignatureCache("")
text := "Same text in different sessions"
text := "Same text across models"
sig1 := "signature1_1234567890123456789012345678901234567890123456"
sig2 := "signature2_1234567890123456789012345678901234567890123456"
CacheSignature("session-a", text, sig1)
CacheSignature("session-b", text, sig2)
geminiModel := "gemini-3-pro-preview"
CacheSignature(testModelName, text, sig1)
CacheSignature(geminiModel, text, sig2)
if GetCachedSignature("session-a", text) != sig1 {
t.Error("Session-a signature mismatch")
if GetCachedSignature(testModelName, text) != sig1 {
t.Error("Claude signature mismatch")
}
if GetCachedSignature("session-b", text) != sig2 {
t.Error("Session-b signature mismatch")
if GetCachedSignature(geminiModel, text) != sig2 {
t.Error("Gemini signature mismatch")
}
}
@@ -44,13 +46,13 @@ func TestCacheSignature_NotFound(t *testing.T) {
ClearSignatureCache("")
// Non-existent session
if got := GetCachedSignature("nonexistent", "some text"); got != "" {
if got := GetCachedSignature(testModelName, "some text"); got != "" {
t.Errorf("Expected empty string for nonexistent session, got '%s'", got)
}
// Existing session but different text
CacheSignature("session-x", "text-a", "sigA12345678901234567890123456789012345678901234567890")
if got := GetCachedSignature("session-x", "text-b"); got != "" {
CacheSignature(testModelName, "text-a", "sigA12345678901234567890123456789012345678901234567890")
if got := GetCachedSignature(testModelName, "text-b"); got != "" {
t.Errorf("Expected empty string for different text, got '%s'", got)
}
}
@@ -59,12 +61,11 @@ func TestCacheSignature_EmptyInputs(t *testing.T) {
ClearSignatureCache("")
// All empty/invalid inputs should be no-ops
CacheSignature("", "text", "sig12345678901234567890123456789012345678901234567890")
CacheSignature("session", "", "sig12345678901234567890123456789012345678901234567890")
CacheSignature("session", "text", "")
CacheSignature("session", "text", "short") // Too short
CacheSignature(testModelName, "", "sig12345678901234567890123456789012345678901234567890")
CacheSignature(testModelName, "text", "")
CacheSignature(testModelName, "text", "short") // Too short
if got := GetCachedSignature("session", "text"); got != "" {
if got := GetCachedSignature(testModelName, "text"); got != "" {
t.Errorf("Expected empty after invalid cache attempts, got '%s'", got)
}
}
@@ -72,31 +73,27 @@ func TestCacheSignature_EmptyInputs(t *testing.T) {
func TestCacheSignature_ShortSignatureRejected(t *testing.T) {
ClearSignatureCache("")
sessionID := "test-short-sig"
text := "Some text"
shortSig := "abc123" // Less than 50 chars
CacheSignature(sessionID, text, shortSig)
CacheSignature(testModelName, text, shortSig)
if got := GetCachedSignature(sessionID, text); got != "" {
if got := GetCachedSignature(testModelName, text); got != "" {
t.Errorf("Short signature should be rejected, got '%s'", got)
}
}
func TestClearSignatureCache_SpecificSession(t *testing.T) {
func TestClearSignatureCache_ModelGroup(t *testing.T) {
ClearSignatureCache("")
sig := "validSig1234567890123456789012345678901234567890123456"
CacheSignature("session-1", "text", sig)
CacheSignature("session-2", "text", sig)
CacheSignature(testModelName, "text", sig)
CacheSignature(testModelName, "text-2", sig)
ClearSignatureCache("session-1")
if got := GetCachedSignature("session-1", "text"); got != "" {
t.Error("session-1 should be cleared")
}
if got := GetCachedSignature("session-2", "text"); got != sig {
t.Error("session-2 should still exist")
if got := GetCachedSignature(testModelName, "text"); got != sig {
t.Error("signature should remain when clearing unknown session")
}
}
@@ -104,35 +101,37 @@ func TestClearSignatureCache_AllSessions(t *testing.T) {
ClearSignatureCache("")
sig := "validSig1234567890123456789012345678901234567890123456"
CacheSignature("session-1", "text", sig)
CacheSignature("session-2", "text", sig)
CacheSignature(testModelName, "text", sig)
CacheSignature(testModelName, "text-2", sig)
ClearSignatureCache("")
if got := GetCachedSignature("session-1", "text"); got != "" {
t.Error("session-1 should be cleared")
if got := GetCachedSignature(testModelName, "text"); got != "" {
t.Error("text should be cleared")
}
if got := GetCachedSignature("session-2", "text"); got != "" {
t.Error("session-2 should be cleared")
if got := GetCachedSignature(testModelName, "text-2"); got != "" {
t.Error("text-2 should be cleared")
}
}
func TestHasValidSignature(t *testing.T) {
tests := []struct {
name string
modelName string
signature string
expected bool
}{
{"valid long signature", "abc123validSignature1234567890123456789012345678901234567890", true},
{"exactly 50 chars", "12345678901234567890123456789012345678901234567890", true},
{"49 chars - invalid", "1234567890123456789012345678901234567890123456789", false},
{"empty string", "", false},
{"short signature", "abc", false},
{"valid long signature", testModelName, "abc123validSignature1234567890123456789012345678901234567890", true},
{"exactly 50 chars", testModelName, "12345678901234567890123456789012345678901234567890", true},
{"49 chars - invalid", testModelName, "1234567890123456789012345678901234567890123456789", false},
{"empty string", testModelName, "", false},
{"short signature", testModelName, "abc", false},
{"gemini sentinel", "gemini-3-pro-preview", "skip_thought_signature_validator", true},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := HasValidSignature(tt.signature)
result := HasValidSignature(tt.modelName, tt.signature)
if result != tt.expected {
t.Errorf("HasValidSignature(%q) = %v, expected %v", tt.signature, result, tt.expected)
}
@@ -143,21 +142,19 @@ func TestHasValidSignature(t *testing.T) {
func TestCacheSignature_TextHashCollisionResistance(t *testing.T) {
ClearSignatureCache("")
sessionID := "hash-test-session"
// Different texts should produce different hashes
text1 := "First thinking text"
text2 := "Second thinking text"
sig1 := "signature1_1234567890123456789012345678901234567890123456"
sig2 := "signature2_1234567890123456789012345678901234567890123456"
CacheSignature(sessionID, text1, sig1)
CacheSignature(sessionID, text2, sig2)
CacheSignature(testModelName, text1, sig1)
CacheSignature(testModelName, text2, sig2)
if GetCachedSignature(sessionID, text1) != sig1 {
if GetCachedSignature(testModelName, text1) != sig1 {
t.Error("text1 signature mismatch")
}
if GetCachedSignature(sessionID, text2) != sig2 {
if GetCachedSignature(testModelName, text2) != sig2 {
t.Error("text2 signature mismatch")
}
}
@@ -165,13 +162,12 @@ func TestCacheSignature_TextHashCollisionResistance(t *testing.T) {
func TestCacheSignature_UnicodeText(t *testing.T) {
ClearSignatureCache("")
sessionID := "unicode-session"
text := "한글 텍스트와 이모지 🎉 그리고 特殊文字"
sig := "unicodeSig123456789012345678901234567890123456789012345"
CacheSignature(sessionID, text, sig)
CacheSignature(testModelName, text, sig)
if got := GetCachedSignature(sessionID, text); got != sig {
if got := GetCachedSignature(testModelName, text); got != sig {
t.Errorf("Unicode text signature retrieval failed, got '%s'", got)
}
}
@@ -179,15 +175,14 @@ func TestCacheSignature_UnicodeText(t *testing.T) {
func TestCacheSignature_Overwrite(t *testing.T) {
ClearSignatureCache("")
sessionID := "overwrite-session"
text := "Same text"
sig1 := "firstSignature12345678901234567890123456789012345678901"
sig2 := "secondSignature1234567890123456789012345678901234567890"
CacheSignature(sessionID, text, sig1)
CacheSignature(sessionID, text, sig2) // Overwrite
CacheSignature(testModelName, text, sig1)
CacheSignature(testModelName, text, sig2) // Overwrite
if got := GetCachedSignature(sessionID, text); got != sig2 {
if got := GetCachedSignature(testModelName, text); got != sig2 {
t.Errorf("Expected overwritten signature '%s', got '%s'", sig2, got)
}
}
@@ -199,14 +194,13 @@ func TestCacheSignature_ExpirationLogic(t *testing.T) {
// This test verifies the expiration check exists
// In a real scenario, we'd mock time.Now()
sessionID := "expiration-test"
text := "text"
sig := "validSig1234567890123456789012345678901234567890123456"
CacheSignature(sessionID, text, sig)
CacheSignature(testModelName, text, sig)
// Fresh entry should be retrievable
if got := GetCachedSignature(sessionID, text); got != sig {
if got := GetCachedSignature(testModelName, text); got != sig {
t.Errorf("Fresh entry should be retrievable, got '%s'", got)
}

View File

@@ -118,6 +118,7 @@ func DoLogin(cfg *config.Config, projectID string, options *LoginOptions) {
}
activatedProjects := make([]string, 0, len(projectSelections))
seenProjects := make(map[string]bool)
for _, candidateID := range projectSelections {
log.Infof("Activating project %s", candidateID)
if errSetup := performGeminiCLISetup(ctx, httpClient, storage, candidateID); errSetup != nil {
@@ -134,6 +135,13 @@ func DoLogin(cfg *config.Config, projectID string, options *LoginOptions) {
if finalID == "" {
finalID = candidateID
}
// Skip duplicates
if seenProjects[finalID] {
log.Infof("Project %s already activated, skipping", finalID)
continue
}
seenProjects[finalID] = true
activatedProjects = append(activatedProjects, finalID)
}
@@ -261,7 +269,39 @@ func performGeminiCLISetup(ctx context.Context, httpClient *http.Client, storage
finalProjectID := projectID
if responseProjectID != "" {
if explicitProject && !strings.EqualFold(responseProjectID, projectID) {
log.Warnf("Gemini onboarding returned project %s instead of requested %s; keeping requested project ID.", responseProjectID, projectID)
// Check if this is a free user (gen-lang-client projects or free/legacy tier)
isFreeUser := strings.HasPrefix(projectID, "gen-lang-client-") ||
strings.EqualFold(tierID, "FREE") ||
strings.EqualFold(tierID, "LEGACY")
if isFreeUser {
// Interactive prompt for free users
fmt.Printf("\nGoogle returned a different project ID:\n")
fmt.Printf(" Requested (frontend): %s\n", projectID)
fmt.Printf(" Returned (backend): %s\n\n", responseProjectID)
fmt.Printf(" Backend project IDs have access to preview models (gemini-3-*).\n")
fmt.Printf(" This is normal for free tier users.\n\n")
fmt.Printf("Which project ID would you like to use?\n")
fmt.Printf(" [1] Backend (recommended): %s\n", responseProjectID)
fmt.Printf(" [2] Frontend: %s\n\n", projectID)
fmt.Printf("Enter choice [1]: ")
reader := bufio.NewReader(os.Stdin)
choice, _ := reader.ReadString('\n')
choice = strings.TrimSpace(choice)
if choice == "2" {
log.Infof("Using frontend project ID: %s", projectID)
fmt.Println(". Warning: Frontend project IDs may not have access to preview models.")
finalProjectID = projectID
} else {
log.Infof("Using backend project ID: %s (recommended)", responseProjectID)
finalProjectID = responseProjectID
}
} else {
// Pro users: keep requested project ID (original behavior)
log.Warnf("Gemini onboarding returned project %s instead of requested %s; keeping requested project ID.", responseProjectID, projectID)
}
} else {
finalProjectID = responseProjectID
}

View File

@@ -248,6 +248,25 @@ type PayloadModelRule struct {
Protocol string `yaml:"protocol" json:"protocol"`
}
// CloakConfig configures request cloaking for non-Claude-Code clients.
// Cloaking disguises API requests to appear as originating from the official Claude Code CLI.
type CloakConfig struct {
// Mode controls cloaking behavior: "auto" (default), "always", or "never".
// - "auto": cloak only when client is not Claude Code (based on User-Agent)
// - "always": always apply cloaking regardless of client
// - "never": never apply cloaking
Mode string `yaml:"mode,omitempty" json:"mode,omitempty"`
// StrictMode controls how system prompts are handled when cloaking.
// - false (default): prepend Claude Code prompt to user system messages
// - true: strip all user system messages, keep only Claude Code prompt
StrictMode bool `yaml:"strict-mode,omitempty" json:"strict-mode,omitempty"`
// SensitiveWords is a list of words to obfuscate with zero-width characters.
// This can help bypass certain content filters.
SensitiveWords []string `yaml:"sensitive-words,omitempty" json:"sensitive-words,omitempty"`
}
// ClaudeKey represents the configuration for a Claude API key,
// including the API key itself and an optional base URL for the API endpoint.
type ClaudeKey struct {
@@ -276,6 +295,9 @@ type ClaudeKey struct {
// ExcludedModels lists model IDs that should be excluded for this provider.
ExcludedModels []string `yaml:"excluded-models,omitempty" json:"excluded-models,omitempty"`
// Cloak configures request cloaking for non-Claude-Code clients.
Cloak *CloakConfig `yaml:"cloak,omitempty" json:"cloak,omitempty"`
}
func (k ClaudeKey) GetAPIKey() string { return k.APIKey }
@@ -901,6 +923,7 @@ func SaveConfigPreserveComments(configFile string, cfg *Config) error {
removeLegacyGenerativeLanguageKeys(original.Content[0])
pruneMappingToGeneratedKeys(original.Content[0], generated.Content[0], "oauth-excluded-models")
pruneMappingToGeneratedKeys(original.Content[0], generated.Content[0], "oauth-model-alias")
// Merge generated into original in-place, preserving comments/order of existing nodes.
mergeMappingPreserve(original.Content[0], generated.Content[0])
@@ -1391,6 +1414,16 @@ func pruneMappingToGeneratedKeys(dstRoot, srcRoot *yaml.Node, key string) {
}
srcIdx := findMapKeyIndex(srcRoot, key)
if srcIdx < 0 {
// Keep an explicit empty mapping for oauth-model-alias when it was previously present.
//
// Rationale: LoadConfig runs MigrateOAuthModelAlias before unmarshalling. If the
// oauth-model-alias key is missing, migration will add the default antigravity aliases.
// When users delete the last channel from oauth-model-alias via the management API,
// we want that deletion to persist across hot reloads and restarts.
if key == "oauth-model-alias" {
dstRoot.Content[dstIdx+1] = &yaml.Node{Kind: yaml.MappingNode, Tag: "!!map"}
return
}
removeMapKey(dstRoot, key)
return
}

View File

@@ -4,6 +4,7 @@
package logging
import (
"errors"
"fmt"
"net/http"
"runtime/debug"
@@ -112,6 +113,11 @@ func isAIAPIPath(path string) bool {
// - gin.HandlerFunc: A middleware handler for panic recovery
func GinLogrusRecovery() gin.HandlerFunc {
return gin.CustomRecovery(func(c *gin.Context, recovered interface{}) {
if err, ok := recovered.(error); ok && errors.Is(err, http.ErrAbortHandler) {
// Let net/http handle ErrAbortHandler so the connection is aborted without noisy stack logs.
panic(http.ErrAbortHandler)
}
log.WithFields(log.Fields{
"panic": recovered,
"stack": string(debug.Stack()),

View File

@@ -0,0 +1,60 @@
package logging
import (
"errors"
"net/http"
"net/http/httptest"
"testing"
"github.com/gin-gonic/gin"
)
func TestGinLogrusRecoveryRepanicsErrAbortHandler(t *testing.T) {
gin.SetMode(gin.TestMode)
engine := gin.New()
engine.Use(GinLogrusRecovery())
engine.GET("/abort", func(c *gin.Context) {
panic(http.ErrAbortHandler)
})
req := httptest.NewRequest(http.MethodGet, "/abort", nil)
recorder := httptest.NewRecorder()
defer func() {
recovered := recover()
if recovered == nil {
t.Fatalf("expected panic, got nil")
}
err, ok := recovered.(error)
if !ok {
t.Fatalf("expected error panic, got %T", recovered)
}
if !errors.Is(err, http.ErrAbortHandler) {
t.Fatalf("expected ErrAbortHandler, got %v", err)
}
if err != http.ErrAbortHandler {
t.Fatalf("expected exact ErrAbortHandler sentinel, got %v", err)
}
}()
engine.ServeHTTP(recorder, req)
}
func TestGinLogrusRecoveryHandlesRegularPanic(t *testing.T) {
gin.SetMode(gin.TestMode)
engine := gin.New()
engine.Use(GinLogrusRecovery())
engine.GET("/panic", func(c *gin.Context) {
panic("boom")
})
req := httptest.NewRequest(http.MethodGet, "/panic", nil)
recorder := httptest.NewRecorder()
engine.ServeHTTP(recorder, req)
if recorder.Code != http.StatusInternalServerError {
t.Fatalf("expected 500, got %d", recorder.Code)
}
}

View File

@@ -30,7 +30,7 @@ var (
type LogFormatter struct{}
// logFieldOrder defines the display order for common log fields.
var logFieldOrder = []string{"provider", "model", "mode", "budget", "level", "original_value", "min", "max", "clamped_to", "error"}
var logFieldOrder = []string{"provider", "model", "mode", "budget", "level", "original_mode", "original_value", "min", "max", "clamped_to", "error"}
// Format renders a single log entry with custom formatting.
func (m *LogFormatter) Format(entry *log.Entry) ([]byte, error) {
@@ -121,6 +121,24 @@ func isDirWritable(dir string) bool {
return true
}
// ResolveLogDirectory determines the directory used for application logs.
func ResolveLogDirectory(cfg *config.Config) string {
logDir := "logs"
if base := util.WritablePath(); base != "" {
return filepath.Join(base, "logs")
}
if cfg == nil {
return logDir
}
if !isDirWritable(logDir) {
authDir := strings.TrimSpace(cfg.AuthDir)
if authDir != "" {
logDir = filepath.Join(authDir, "logs")
}
}
return logDir
}
// ConfigureLogOutput switches the global log destination between rotating files and stdout.
// When logsMaxTotalSizeMB > 0, a background cleaner removes the oldest log files in the logs directory
// until the total size is within the limit.
@@ -130,12 +148,7 @@ func ConfigureLogOutput(cfg *config.Config) error {
writerMu.Lock()
defer writerMu.Unlock()
logDir := "logs"
if base := util.WritablePath(); base != "" {
logDir = filepath.Join(base, "logs")
} else if !isDirWritable(logDir) {
logDir = filepath.Join(cfg.AuthDir, "logs")
}
logDir := ResolveLogDirectory(cfg)
protectedPath := ""
if cfg.LoggingToFile {

View File

@@ -44,10 +44,12 @@ type RequestLogger interface {
// - apiRequest: The API request data
// - apiResponse: The API response data
// - requestID: Optional request ID for log file naming
// - requestTimestamp: When the request was received
// - apiResponseTimestamp: When the API response was received
//
// Returns:
// - error: An error if logging fails, nil otherwise
LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string) error
LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error
// LogStreamingRequest initiates logging for a streaming request and returns a writer for chunks.
//
@@ -109,6 +111,12 @@ type StreamingLogWriter interface {
// - error: An error if writing fails, nil otherwise
WriteAPIResponse(apiResponse []byte) error
// SetFirstChunkTimestamp sets the TTFB timestamp captured when first chunk was received.
//
// Parameters:
// - timestamp: The time when first response chunk was received
SetFirstChunkTimestamp(timestamp time.Time)
// Close finalizes the log file and cleans up resources.
//
// Returns:
@@ -180,20 +188,22 @@ func (l *FileRequestLogger) SetEnabled(enabled bool) {
// - apiRequest: The API request data
// - apiResponse: The API response data
// - requestID: Optional request ID for log file naming
// - requestTimestamp: When the request was received
// - apiResponseTimestamp: When the API response was received
//
// Returns:
// - error: An error if logging fails, nil otherwise
func (l *FileRequestLogger) LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, false, requestID)
func (l *FileRequestLogger) LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, false, requestID, requestTimestamp, apiResponseTimestamp)
}
// LogRequestWithOptions logs a request with optional forced logging behavior.
// The force flag allows writing error logs even when regular request logging is disabled.
func (l *FileRequestLogger) LogRequestWithOptions(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, force, requestID)
func (l *FileRequestLogger) LogRequestWithOptions(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, force, requestID, requestTimestamp, apiResponseTimestamp)
}
func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string) error {
func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
if !l.enabled && !force {
return nil
}
@@ -247,6 +257,8 @@ func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[st
responseHeaders,
responseToWrite,
decompressErr,
requestTimestamp,
apiResponseTimestamp,
)
if errClose := logFile.Close(); errClose != nil {
log.WithError(errClose).Warn("failed to close request log file")
@@ -499,17 +511,22 @@ func (l *FileRequestLogger) writeNonStreamingLog(
responseHeaders map[string][]string,
response []byte,
decompressErr error,
requestTimestamp time.Time,
apiResponseTimestamp time.Time,
) error {
if errWrite := writeRequestInfoWithBody(w, url, method, requestHeaders, requestBody, requestBodyPath, time.Now()); errWrite != nil {
if requestTimestamp.IsZero() {
requestTimestamp = time.Now()
}
if errWrite := writeRequestInfoWithBody(w, url, method, requestHeaders, requestBody, requestBodyPath, requestTimestamp); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(w, "=== API REQUEST ===\n", "=== API REQUEST", apiRequest); errWrite != nil {
if errWrite := writeAPISection(w, "=== API REQUEST ===\n", "=== API REQUEST", apiRequest, time.Time{}); errWrite != nil {
return errWrite
}
if errWrite := writeAPIErrorResponses(w, apiResponseErrors); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(w, "=== API RESPONSE ===\n", "=== API RESPONSE", apiResponse); errWrite != nil {
if errWrite := writeAPISection(w, "=== API RESPONSE ===\n", "=== API RESPONSE", apiResponse, apiResponseTimestamp); errWrite != nil {
return errWrite
}
return writeResponseSection(w, statusCode, true, responseHeaders, bytes.NewReader(response), decompressErr, true)
@@ -583,7 +600,7 @@ func writeRequestInfoWithBody(
return nil
}
func writeAPISection(w io.Writer, sectionHeader string, sectionPrefix string, payload []byte) error {
func writeAPISection(w io.Writer, sectionHeader string, sectionPrefix string, payload []byte, timestamp time.Time) error {
if len(payload) == 0 {
return nil
}
@@ -601,6 +618,11 @@ func writeAPISection(w io.Writer, sectionHeader string, sectionPrefix string, pa
if _, errWrite := io.WriteString(w, sectionHeader); errWrite != nil {
return errWrite
}
if !timestamp.IsZero() {
if _, errWrite := io.WriteString(w, fmt.Sprintf("Timestamp: %s\n", timestamp.Format(time.RFC3339Nano))); errWrite != nil {
return errWrite
}
}
if _, errWrite := w.Write(payload); errWrite != nil {
return errWrite
}
@@ -974,6 +996,9 @@ type FileStreamingLogWriter struct {
// apiResponse stores the upstream API response data.
apiResponse []byte
// apiResponseTimestamp captures when the API response was received.
apiResponseTimestamp time.Time
}
// WriteChunkAsync writes a response chunk asynchronously (non-blocking).
@@ -1053,6 +1078,12 @@ func (w *FileStreamingLogWriter) WriteAPIResponse(apiResponse []byte) error {
return nil
}
func (w *FileStreamingLogWriter) SetFirstChunkTimestamp(timestamp time.Time) {
if !timestamp.IsZero() {
w.apiResponseTimestamp = timestamp
}
}
// Close finalizes the log file and cleans up resources.
// It writes all buffered data to the file in the correct order:
// API REQUEST -> API RESPONSE -> RESPONSE (status, headers, body chunks)
@@ -1140,10 +1171,10 @@ func (w *FileStreamingLogWriter) writeFinalLog(logFile *os.File) error {
if errWrite := writeRequestInfoWithBody(logFile, w.url, w.method, w.requestHeaders, nil, w.requestBodyPath, w.timestamp); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(logFile, "=== API REQUEST ===\n", "=== API REQUEST", w.apiRequest); errWrite != nil {
if errWrite := writeAPISection(logFile, "=== API REQUEST ===\n", "=== API REQUEST", w.apiRequest, time.Time{}); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(logFile, "=== API RESPONSE ===\n", "=== API RESPONSE", w.apiResponse); errWrite != nil {
if errWrite := writeAPISection(logFile, "=== API RESPONSE ===\n", "=== API RESPONSE", w.apiResponse, w.apiResponseTimestamp); errWrite != nil {
return errWrite
}
@@ -1220,6 +1251,8 @@ func (w *NoOpStreamingLogWriter) WriteAPIResponse(_ []byte) error {
return nil
}
func (w *NoOpStreamingLogWriter) SetFirstChunkTimestamp(_ time.Time) {}
// Close is a no-op implementation that does nothing and always returns nil.
//
// Returns:

View File

@@ -1,785 +1,69 @@
// Package registry provides model definitions for various AI service providers.
// This file contains static model definitions that can be used by clients
// when registering their supported models.
// Package registry provides model definitions and lookup helpers for various AI providers.
// Static model metadata is stored in model_definitions_static_data.go.
package registry
// GetClaudeModels returns the standard Claude model definitions
func GetClaudeModels() []*ModelInfo {
return []*ModelInfo{
import (
"sort"
"strings"
)
{
ID: "claude-haiku-4-5-20251001",
Object: "model",
Created: 1759276800, // 2025-10-01
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.5 Haiku",
ContextLength: 200000,
MaxCompletionTokens: 64000,
// Thinking: not supported for Haiku models
},
{
ID: "claude-sonnet-4-5-20250929",
Object: "model",
Created: 1759104000, // 2025-09-29
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.5 Sonnet",
ContextLength: 200000,
MaxCompletionTokens: 64000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: false},
},
{
ID: "claude-opus-4-5-20251101",
Object: "model",
Created: 1761955200, // 2025-11-01
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.5 Opus",
Description: "Premium model combining maximum intelligence with practical performance",
ContextLength: 200000,
MaxCompletionTokens: 64000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: false},
},
{
ID: "claude-opus-4-1-20250805",
Object: "model",
Created: 1722945600, // 2025-08-05
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.1 Opus",
ContextLength: 200000,
MaxCompletionTokens: 32000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-opus-4-20250514",
Object: "model",
Created: 1715644800, // 2025-05-14
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4 Opus",
ContextLength: 200000,
MaxCompletionTokens: 32000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-sonnet-4-20250514",
Object: "model",
Created: 1715644800, // 2025-05-14
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4 Sonnet",
ContextLength: 200000,
MaxCompletionTokens: 64000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-3-7-sonnet-20250219",
Object: "model",
Created: 1708300800, // 2025-02-19
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 3.7 Sonnet",
ContextLength: 128000,
MaxCompletionTokens: 8192,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-3-5-haiku-20241022",
Object: "model",
Created: 1729555200, // 2024-10-22
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 3.5 Haiku",
ContextLength: 128000,
MaxCompletionTokens: 8192,
// Thinking: not supported for Haiku models
},
}
}
// GetGeminiModels returns the standard Gemini model definitions
func GetGeminiModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Gemini 3 Pro Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Gemini 3 Flash Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}},
},
{
ID: "gemini-3-pro-image-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-image-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Image Preview",
Description: "Gemini 3 Pro Image Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
}
}
func GetGeminiVertexModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Gemini 3 Pro Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}},
},
{
ID: "gemini-3-pro-image-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-image-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Image Preview",
Description: "Gemini 3 Pro Image Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
}
}
// GetGeminiCLIModels returns the standard Gemini model definitions
func GetGeminiCLIModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Our most intelligent model with SOTA reasoning and multimodal understanding, and powerful agentic and vibe coding capabilities",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}},
},
}
}
// GetAIStudioModels returns the Gemini model definitions for AI Studio integrations
func GetAIStudioModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Gemini 3 Pro Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-pro-latest",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-pro-latest",
Version: "2.5",
DisplayName: "Gemini Pro Latest",
Description: "Latest release of Gemini Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-flash-latest",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-flash-latest",
Version: "2.5",
DisplayName: "Gemini Flash Latest",
Description: "Latest release of Gemini Flash",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-flash-lite-latest",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-flash-lite-latest",
Version: "2.5",
DisplayName: "Gemini Flash-Lite Latest",
Description: "Latest release of Gemini Flash-Lite",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 512, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-image-preview",
Object: "model",
Created: 1756166400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-image-preview",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Image Preview",
Description: "State-of-the-art image generation and editing model.",
InputTokenLimit: 1048576,
OutputTokenLimit: 8192,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
// image models don't support thinkingConfig; leave Thinking nil
},
{
ID: "gemini-2.5-flash-image",
Object: "model",
Created: 1759363200,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-image",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Image",
Description: "State-of-the-art image generation and editing model.",
InputTokenLimit: 1048576,
OutputTokenLimit: 8192,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
// image models don't support thinkingConfig; leave Thinking nil
},
}
}
// GetOpenAIModels returns the standard OpenAI model definitions
func GetOpenAIModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gpt-5",
Object: "model",
Created: 1754524800,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-08-07",
DisplayName: "GPT 5",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"minimal", "low", "medium", "high"}},
},
{
ID: "gpt-5-codex",
Object: "model",
Created: 1757894400,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-09-15",
DisplayName: "GPT 5 Codex",
Description: "Stable version of GPT 5 Codex, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5-codex-mini",
Object: "model",
Created: 1762473600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-11-07",
DisplayName: "GPT 5 Codex Mini",
Description: "Stable version of GPT 5 Codex Mini: cheaper, faster, but less capable version of GPT 5 Codex.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5.1",
Object: "model",
Created: 1762905600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-2025-11-12",
DisplayName: "GPT 5",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"none", "low", "medium", "high"}},
},
{
ID: "gpt-5.1-codex",
Object: "model",
Created: 1762905600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-2025-11-12",
DisplayName: "GPT 5.1 Codex",
Description: "Stable version of GPT 5.1 Codex, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5.1-codex-mini",
Object: "model",
Created: 1762905600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-2025-11-12",
DisplayName: "GPT 5.1 Codex Mini",
Description: "Stable version of GPT 5.1 Codex Mini: cheaper, faster, but less capable version of GPT 5.1 Codex.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5.1-codex-max",
Object: "model",
Created: 1763424000,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-max",
DisplayName: "GPT 5.1 Codex Max",
Description: "Stable version of GPT 5.1 Codex Max",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high", "xhigh"}},
},
{
ID: "gpt-5.2",
Object: "model",
Created: 1765440000,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.2",
DisplayName: "GPT 5.2",
Description: "Stable version of GPT 5.2",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"none", "low", "medium", "high", "xhigh"}},
},
{
ID: "gpt-5.2-codex",
Object: "model",
Created: 1765440000,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.2",
DisplayName: "GPT 5.2 Codex",
Description: "Stable version of GPT 5.2 Codex, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high", "xhigh"}},
},
}
}
// GetQwenModels returns the standard Qwen model definitions
func GetQwenModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "qwen3-coder-plus",
Object: "model",
Created: 1753228800,
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Coder Plus",
Description: "Advanced code generation and understanding model",
ContextLength: 32768,
MaxCompletionTokens: 8192,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
{
ID: "qwen3-coder-flash",
Object: "model",
Created: 1753228800,
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Coder Flash",
Description: "Fast code generation model",
ContextLength: 8192,
MaxCompletionTokens: 2048,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
{
ID: "vision-model",
Object: "model",
Created: 1758672000,
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Vision Model",
Description: "Vision model model",
ContextLength: 32768,
MaxCompletionTokens: 2048,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
}
}
// iFlowThinkingSupport is a shared ThinkingSupport configuration for iFlow models
// that support thinking mode via chat_template_kwargs.enable_thinking (boolean toggle).
// Uses level-based configuration so standard normalization flows apply before conversion.
var iFlowThinkingSupport = &ThinkingSupport{
Levels: []string{"none", "auto", "minimal", "low", "medium", "high", "xhigh"},
}
// GetIFlowModels returns supported models for iFlow OAuth accounts.
func GetIFlowModels() []*ModelInfo {
entries := []struct {
ID string
DisplayName string
Description string
Created int64
Thinking *ThinkingSupport
}{
{ID: "tstars2.0", DisplayName: "TStars-2.0", Description: "iFlow TStars-2.0 multimodal assistant", Created: 1746489600},
{ID: "qwen3-coder-plus", DisplayName: "Qwen3-Coder-Plus", Description: "Qwen3 Coder Plus code generation", Created: 1753228800},
{ID: "qwen3-max", DisplayName: "Qwen3-Max", Description: "Qwen3 flagship model", Created: 1758672000},
{ID: "qwen3-vl-plus", DisplayName: "Qwen3-VL-Plus", Description: "Qwen3 multimodal vision-language", Created: 1758672000},
{ID: "qwen3-max-preview", DisplayName: "Qwen3-Max-Preview", Description: "Qwen3 Max preview build", Created: 1757030400},
{ID: "kimi-k2-0905", DisplayName: "Kimi-K2-Instruct-0905", Description: "Moonshot Kimi K2 instruct 0905", Created: 1757030400},
{ID: "glm-4.6", DisplayName: "GLM-4.6", Description: "Zhipu GLM 4.6 general model", Created: 1759190400, Thinking: iFlowThinkingSupport},
{ID: "glm-4.7", DisplayName: "GLM-4.7", Description: "Zhipu GLM 4.7 general model", Created: 1766448000, Thinking: iFlowThinkingSupport},
{ID: "kimi-k2", DisplayName: "Kimi-K2", Description: "Moonshot Kimi K2 general model", Created: 1752192000},
{ID: "kimi-k2-thinking", DisplayName: "Kimi-K2-Thinking", Description: "Moonshot Kimi K2 thinking model", Created: 1762387200},
{ID: "deepseek-v3.2-chat", DisplayName: "DeepSeek-V3.2", Description: "DeepSeek V3.2 Chat", Created: 1764576000},
{ID: "deepseek-v3.2-reasoner", DisplayName: "DeepSeek-V3.2", Description: "DeepSeek V3.2 Reasoner", Created: 1764576000},
{ID: "deepseek-v3.2", DisplayName: "DeepSeek-V3.2-Exp", Description: "DeepSeek V3.2 experimental", Created: 1759104000},
{ID: "deepseek-v3.1", DisplayName: "DeepSeek-V3.1-Terminus", Description: "DeepSeek V3.1 Terminus", Created: 1756339200},
{ID: "deepseek-r1", DisplayName: "DeepSeek-R1", Description: "DeepSeek reasoning model R1", Created: 1737331200},
{ID: "deepseek-v3", DisplayName: "DeepSeek-V3-671B", Description: "DeepSeek V3 671B", Created: 1734307200},
{ID: "qwen3-32b", DisplayName: "Qwen3-32B", Description: "Qwen3 32B", Created: 1747094400},
{ID: "qwen3-235b-a22b-thinking-2507", DisplayName: "Qwen3-235B-A22B-Thinking", Description: "Qwen3 235B A22B Thinking (2507)", Created: 1753401600},
{ID: "qwen3-235b-a22b-instruct", DisplayName: "Qwen3-235B-A22B-Instruct", Description: "Qwen3 235B A22B Instruct", Created: 1753401600},
{ID: "qwen3-235b", DisplayName: "Qwen3-235B-A22B", Description: "Qwen3 235B A22B", Created: 1753401600},
{ID: "minimax-m2", DisplayName: "MiniMax-M2", Description: "MiniMax M2", Created: 1758672000, Thinking: iFlowThinkingSupport},
{ID: "minimax-m2.1", DisplayName: "MiniMax-M2.1", Description: "MiniMax M2.1", Created: 1766448000, Thinking: iFlowThinkingSupport},
{ID: "iflow-rome-30ba3b", DisplayName: "iFlow-ROME", Description: "iFlow Rome 30BA3B model", Created: 1736899200},
}
models := make([]*ModelInfo, 0, len(entries))
for _, entry := range entries {
models = append(models, &ModelInfo{
ID: entry.ID,
Object: "model",
Created: entry.Created,
OwnedBy: "iflow",
Type: "iflow",
DisplayName: entry.DisplayName,
Description: entry.Description,
Thinking: entry.Thinking,
// GetStaticModelDefinitionsByChannel returns static model definitions for a given channel/provider.
// It returns nil when the channel is unknown.
//
// Supported channels:
// - claude
// - gemini
// - vertex
// - gemini-cli
// - aistudio
// - codex
// - qwen
// - iflow
// - antigravity (returns static overrides only)
func GetStaticModelDefinitionsByChannel(channel string) []*ModelInfo {
key := strings.ToLower(strings.TrimSpace(channel))
switch key {
case "claude":
return GetClaudeModels()
case "gemini":
return GetGeminiModels()
case "vertex":
return GetGeminiVertexModels()
case "gemini-cli":
return GetGeminiCLIModels()
case "aistudio":
return GetAIStudioModels()
case "codex":
return GetOpenAIModels()
case "qwen":
return GetQwenModels()
case "iflow":
return GetIFlowModels()
case "antigravity":
cfg := GetAntigravityModelConfig()
if len(cfg) == 0 {
return nil
}
models := make([]*ModelInfo, 0, len(cfg))
for modelID, entry := range cfg {
if modelID == "" || entry == nil {
continue
}
models = append(models, &ModelInfo{
ID: modelID,
Object: "model",
OwnedBy: "antigravity",
Type: "antigravity",
Thinking: entry.Thinking,
MaxCompletionTokens: entry.MaxCompletionTokens,
})
}
sort.Slice(models, func(i, j int) bool {
return strings.ToLower(models[i].ID) < strings.ToLower(models[j].ID)
})
}
return models
}
// AntigravityModelConfig captures static antigravity model overrides, including
// Thinking budget limits and provider max completion tokens.
type AntigravityModelConfig struct {
Thinking *ThinkingSupport
MaxCompletionTokens int
Name string
}
// GetAntigravityModelConfig returns static configuration for antigravity models.
// Keys use upstream model names returned by the Antigravity models endpoint.
func GetAntigravityModelConfig() map[string]*AntigravityModelConfig {
return map[string]*AntigravityModelConfig{
"gemini-2.5-flash": {Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true}, Name: "models/gemini-2.5-flash"},
"gemini-2.5-flash-lite": {Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true}, Name: "models/gemini-2.5-flash-lite"},
"rev19-uic3-1p": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true}, Name: "models/rev19-uic3-1p"},
"gemini-3-pro-high": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}}, Name: "models/gemini-3-pro-high"},
"gemini-3-pro-image": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}}, Name: "models/gemini-3-pro-image"},
"gemini-3-flash": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}}, Name: "models/gemini-3-flash"},
"claude-sonnet-4-5-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-opus-4-5-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
return models
default:
return nil
}
}
@@ -809,10 +93,9 @@ func LookupStaticModelInfo(modelID string) *ModelInfo {
}
// Check Antigravity static config
if cfg := GetAntigravityModelConfig()[modelID]; cfg != nil && cfg.Thinking != nil {
if cfg := GetAntigravityModelConfig()[modelID]; cfg != nil {
return &ModelInfo{
ID: modelID,
Name: cfg.Name,
Thinking: cfg.Thinking,
MaxCompletionTokens: cfg.MaxCompletionTokens,
}

View File

@@ -0,0 +1,846 @@
// Package registry provides model definitions for various AI service providers.
// This file stores the static model metadata catalog.
package registry
// GetClaudeModels returns the standard Claude model definitions
func GetClaudeModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "claude-haiku-4-5-20251001",
Object: "model",
Created: 1759276800, // 2025-10-01
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.5 Haiku",
ContextLength: 200000,
MaxCompletionTokens: 64000,
// Thinking: not supported for Haiku models
},
{
ID: "claude-sonnet-4-5-20250929",
Object: "model",
Created: 1759104000, // 2025-09-29
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.5 Sonnet",
ContextLength: 200000,
MaxCompletionTokens: 64000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: false},
},
{
ID: "claude-opus-4-5-20251101",
Object: "model",
Created: 1761955200, // 2025-11-01
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.5 Opus",
Description: "Premium model combining maximum intelligence with practical performance",
ContextLength: 200000,
MaxCompletionTokens: 64000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: false},
},
{
ID: "claude-opus-4-1-20250805",
Object: "model",
Created: 1722945600, // 2025-08-05
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4.1 Opus",
ContextLength: 200000,
MaxCompletionTokens: 32000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-opus-4-20250514",
Object: "model",
Created: 1715644800, // 2025-05-14
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4 Opus",
ContextLength: 200000,
MaxCompletionTokens: 32000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-sonnet-4-20250514",
Object: "model",
Created: 1715644800, // 2025-05-14
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 4 Sonnet",
ContextLength: 200000,
MaxCompletionTokens: 64000,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-3-7-sonnet-20250219",
Object: "model",
Created: 1708300800, // 2025-02-19
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 3.7 Sonnet",
ContextLength: 128000,
MaxCompletionTokens: 8192,
Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: false, DynamicAllowed: false},
},
{
ID: "claude-3-5-haiku-20241022",
Object: "model",
Created: 1729555200, // 2024-10-22
OwnedBy: "anthropic",
Type: "claude",
DisplayName: "Claude 3.5 Haiku",
ContextLength: 128000,
MaxCompletionTokens: 8192,
// Thinking: not supported for Haiku models
},
}
}
// GetGeminiModels returns the standard Gemini model definitions
func GetGeminiModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Gemini 3 Pro Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Gemini 3 Flash Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}},
},
{
ID: "gemini-3-pro-image-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-image-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Image Preview",
Description: "Gemini 3 Pro Image Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
}
}
func GetGeminiVertexModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Gemini 3 Pro Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}},
},
{
ID: "gemini-3-pro-image-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-image-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Image Preview",
Description: "Gemini 3 Pro Image Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
// Imagen image generation models - use :predict action
{
ID: "imagen-4.0-generate-001",
Object: "model",
Created: 1750000000,
OwnedBy: "google",
Type: "gemini",
Name: "models/imagen-4.0-generate-001",
Version: "4.0",
DisplayName: "Imagen 4.0 Generate",
Description: "Imagen 4.0 image generation model",
SupportedGenerationMethods: []string{"predict"},
},
{
ID: "imagen-4.0-ultra-generate-001",
Object: "model",
Created: 1750000000,
OwnedBy: "google",
Type: "gemini",
Name: "models/imagen-4.0-ultra-generate-001",
Version: "4.0",
DisplayName: "Imagen 4.0 Ultra Generate",
Description: "Imagen 4.0 Ultra high-quality image generation model",
SupportedGenerationMethods: []string{"predict"},
},
{
ID: "imagen-3.0-generate-002",
Object: "model",
Created: 1740000000,
OwnedBy: "google",
Type: "gemini",
Name: "models/imagen-3.0-generate-002",
Version: "3.0",
DisplayName: "Imagen 3.0 Generate",
Description: "Imagen 3.0 image generation model",
SupportedGenerationMethods: []string{"predict"},
},
{
ID: "imagen-3.0-fast-generate-001",
Object: "model",
Created: 1740000000,
OwnedBy: "google",
Type: "gemini",
Name: "models/imagen-3.0-fast-generate-001",
Version: "3.0",
DisplayName: "Imagen 3.0 Fast Generate",
Description: "Imagen 3.0 fast image generation model",
SupportedGenerationMethods: []string{"predict"},
},
{
ID: "imagen-4.0-fast-generate-001",
Object: "model",
Created: 1750000000,
OwnedBy: "google",
Type: "gemini",
Name: "models/imagen-4.0-fast-generate-001",
Version: "4.0",
DisplayName: "Imagen 4.0 Fast Generate",
Description: "Imagen 4.0 fast image generation model",
SupportedGenerationMethods: []string{"predict"},
},
}
}
// GetGeminiCLIModels returns the standard Gemini model definitions
func GetGeminiCLIModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Our most intelligent model with SOTA reasoning and multimodal understanding, and powerful agentic and vibe coding capabilities",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}},
},
}
}
// GetAIStudioModels returns the Gemini model definitions for AI Studio integrations
func GetAIStudioModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gemini-2.5-pro",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-pro",
Version: "2.5",
DisplayName: "Gemini 2.5 Pro",
Description: "Stable release (June 17th, 2025) of Gemini 2.5 Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash",
Version: "001",
DisplayName: "Gemini 2.5 Flash",
Description: "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-2.5-flash-lite",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-lite",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Lite",
Description: "Our smallest and most cost effective model, built for at scale usage.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-3-pro-preview",
Object: "model",
Created: 1737158400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-pro-preview",
Version: "3.0",
DisplayName: "Gemini 3 Pro Preview",
Description: "Gemini 3 Pro Preview",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-3-flash-preview",
Object: "model",
Created: 1765929600,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-3-flash-preview",
Version: "3.0",
DisplayName: "Gemini 3 Flash Preview",
Description: "Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-pro-latest",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-pro-latest",
Version: "2.5",
DisplayName: "Gemini Pro Latest",
Description: "Latest release of Gemini Pro",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true},
},
{
ID: "gemini-flash-latest",
Object: "model",
Created: 1750118400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-flash-latest",
Version: "2.5",
DisplayName: "Gemini Flash Latest",
Description: "Latest release of Gemini Flash",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
{
ID: "gemini-flash-lite-latest",
Object: "model",
Created: 1753142400,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-flash-lite-latest",
Version: "2.5",
DisplayName: "Gemini Flash-Lite Latest",
Description: "Latest release of Gemini Flash-Lite",
InputTokenLimit: 1048576,
OutputTokenLimit: 65536,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
Thinking: &ThinkingSupport{Min: 512, Max: 24576, ZeroAllowed: true, DynamicAllowed: true},
},
// {
// ID: "gemini-2.5-flash-image-preview",
// Object: "model",
// Created: 1756166400,
// OwnedBy: "google",
// Type: "gemini",
// Name: "models/gemini-2.5-flash-image-preview",
// Version: "2.5",
// DisplayName: "Gemini 2.5 Flash Image Preview",
// Description: "State-of-the-art image generation and editing model.",
// InputTokenLimit: 1048576,
// OutputTokenLimit: 8192,
// SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
// // image models don't support thinkingConfig; leave Thinking nil
// },
{
ID: "gemini-2.5-flash-image",
Object: "model",
Created: 1759363200,
OwnedBy: "google",
Type: "gemini",
Name: "models/gemini-2.5-flash-image",
Version: "2.5",
DisplayName: "Gemini 2.5 Flash Image",
Description: "State-of-the-art image generation and editing model.",
InputTokenLimit: 1048576,
OutputTokenLimit: 8192,
SupportedGenerationMethods: []string{"generateContent", "countTokens", "createCachedContent", "batchGenerateContent"},
// image models don't support thinkingConfig; leave Thinking nil
},
}
}
// GetOpenAIModels returns the standard OpenAI model definitions
func GetOpenAIModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "gpt-5",
Object: "model",
Created: 1754524800,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-08-07",
DisplayName: "GPT 5",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"minimal", "low", "medium", "high"}},
},
{
ID: "gpt-5-codex",
Object: "model",
Created: 1757894400,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-09-15",
DisplayName: "GPT 5 Codex",
Description: "Stable version of GPT 5 Codex, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5-codex-mini",
Object: "model",
Created: 1762473600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5-2025-11-07",
DisplayName: "GPT 5 Codex Mini",
Description: "Stable version of GPT 5 Codex Mini: cheaper, faster, but less capable version of GPT 5 Codex.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5.1",
Object: "model",
Created: 1762905600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-2025-11-12",
DisplayName: "GPT 5",
Description: "Stable version of GPT 5, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"none", "low", "medium", "high"}},
},
{
ID: "gpt-5.1-codex",
Object: "model",
Created: 1762905600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-2025-11-12",
DisplayName: "GPT 5.1 Codex",
Description: "Stable version of GPT 5.1 Codex, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5.1-codex-mini",
Object: "model",
Created: 1762905600,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-2025-11-12",
DisplayName: "GPT 5.1 Codex Mini",
Description: "Stable version of GPT 5.1 Codex Mini: cheaper, faster, but less capable version of GPT 5.1 Codex.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gpt-5.1-codex-max",
Object: "model",
Created: 1763424000,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.1-max",
DisplayName: "GPT 5.1 Codex Max",
Description: "Stable version of GPT 5.1 Codex Max",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high", "xhigh"}},
},
{
ID: "gpt-5.2",
Object: "model",
Created: 1765440000,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.2",
DisplayName: "GPT 5.2",
Description: "Stable version of GPT 5.2",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"none", "low", "medium", "high", "xhigh"}},
},
{
ID: "gpt-5.2-codex",
Object: "model",
Created: 1765440000,
OwnedBy: "openai",
Type: "openai",
Version: "gpt-5.2",
DisplayName: "GPT 5.2 Codex",
Description: "Stable version of GPT 5.2 Codex, The best model for coding and agentic tasks across domains.",
ContextLength: 400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high", "xhigh"}},
},
}
}
// GetQwenModels returns the standard Qwen model definitions
func GetQwenModels() []*ModelInfo {
return []*ModelInfo{
{
ID: "qwen3-coder-plus",
Object: "model",
Created: 1753228800,
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Coder Plus",
Description: "Advanced code generation and understanding model",
ContextLength: 32768,
MaxCompletionTokens: 8192,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
{
ID: "qwen3-coder-flash",
Object: "model",
Created: 1753228800,
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Coder Flash",
Description: "Fast code generation model",
ContextLength: 8192,
MaxCompletionTokens: 2048,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
{
ID: "vision-model",
Object: "model",
Created: 1758672000,
OwnedBy: "qwen",
Type: "qwen",
Version: "3.0",
DisplayName: "Qwen3 Vision Model",
Description: "Vision model model",
ContextLength: 32768,
MaxCompletionTokens: 2048,
SupportedParameters: []string{"temperature", "top_p", "max_tokens", "stream", "stop"},
},
}
}
// iFlowThinkingSupport is a shared ThinkingSupport configuration for iFlow models
// that support thinking mode via chat_template_kwargs.enable_thinking (boolean toggle).
// Uses level-based configuration so standard normalization flows apply before conversion.
var iFlowThinkingSupport = &ThinkingSupport{
Levels: []string{"none", "auto", "minimal", "low", "medium", "high", "xhigh"},
}
// GetIFlowModels returns supported models for iFlow OAuth accounts.
func GetIFlowModels() []*ModelInfo {
entries := []struct {
ID string
DisplayName string
Description string
Created int64
Thinking *ThinkingSupport
}{
{ID: "tstars2.0", DisplayName: "TStars-2.0", Description: "iFlow TStars-2.0 multimodal assistant", Created: 1746489600},
{ID: "qwen3-coder-plus", DisplayName: "Qwen3-Coder-Plus", Description: "Qwen3 Coder Plus code generation", Created: 1753228800},
{ID: "qwen3-max", DisplayName: "Qwen3-Max", Description: "Qwen3 flagship model", Created: 1758672000},
{ID: "qwen3-vl-plus", DisplayName: "Qwen3-VL-Plus", Description: "Qwen3 multimodal vision-language", Created: 1758672000},
{ID: "qwen3-max-preview", DisplayName: "Qwen3-Max-Preview", Description: "Qwen3 Max preview build", Created: 1757030400, Thinking: iFlowThinkingSupport},
{ID: "kimi-k2-0905", DisplayName: "Kimi-K2-Instruct-0905", Description: "Moonshot Kimi K2 instruct 0905", Created: 1757030400},
{ID: "glm-4.6", DisplayName: "GLM-4.6", Description: "Zhipu GLM 4.6 general model", Created: 1759190400, Thinking: iFlowThinkingSupport},
{ID: "glm-4.7", DisplayName: "GLM-4.7", Description: "Zhipu GLM 4.7 general model", Created: 1766448000, Thinking: iFlowThinkingSupport},
{ID: "kimi-k2", DisplayName: "Kimi-K2", Description: "Moonshot Kimi K2 general model", Created: 1752192000},
{ID: "kimi-k2-thinking", DisplayName: "Kimi-K2-Thinking", Description: "Moonshot Kimi K2 thinking model", Created: 1762387200},
{ID: "deepseek-v3.2-chat", DisplayName: "DeepSeek-V3.2", Description: "DeepSeek V3.2 Chat", Created: 1764576000},
{ID: "deepseek-v3.2-reasoner", DisplayName: "DeepSeek-V3.2", Description: "DeepSeek V3.2 Reasoner", Created: 1764576000},
{ID: "deepseek-v3.2", DisplayName: "DeepSeek-V3.2-Exp", Description: "DeepSeek V3.2 experimental", Created: 1759104000, Thinking: iFlowThinkingSupport},
{ID: "deepseek-v3.1", DisplayName: "DeepSeek-V3.1-Terminus", Description: "DeepSeek V3.1 Terminus", Created: 1756339200, Thinking: iFlowThinkingSupport},
{ID: "deepseek-r1", DisplayName: "DeepSeek-R1", Description: "DeepSeek reasoning model R1", Created: 1737331200},
{ID: "deepseek-v3", DisplayName: "DeepSeek-V3-671B", Description: "DeepSeek V3 671B", Created: 1734307200},
{ID: "qwen3-32b", DisplayName: "Qwen3-32B", Description: "Qwen3 32B", Created: 1747094400},
{ID: "qwen3-235b-a22b-thinking-2507", DisplayName: "Qwen3-235B-A22B-Thinking", Description: "Qwen3 235B A22B Thinking (2507)", Created: 1753401600},
{ID: "qwen3-235b-a22b-instruct", DisplayName: "Qwen3-235B-A22B-Instruct", Description: "Qwen3 235B A22B Instruct", Created: 1753401600},
{ID: "qwen3-235b", DisplayName: "Qwen3-235B-A22B", Description: "Qwen3 235B A22B", Created: 1753401600},
{ID: "minimax-m2", DisplayName: "MiniMax-M2", Description: "MiniMax M2", Created: 1758672000, Thinking: iFlowThinkingSupport},
{ID: "minimax-m2.1", DisplayName: "MiniMax-M2.1", Description: "MiniMax M2.1", Created: 1766448000, Thinking: iFlowThinkingSupport},
{ID: "iflow-rome-30ba3b", DisplayName: "iFlow-ROME", Description: "iFlow Rome 30BA3B model", Created: 1736899200},
}
models := make([]*ModelInfo, 0, len(entries))
for _, entry := range entries {
models = append(models, &ModelInfo{
ID: entry.ID,
Object: "model",
Created: entry.Created,
OwnedBy: "iflow",
Type: "iflow",
DisplayName: entry.DisplayName,
Description: entry.Description,
Thinking: entry.Thinking,
})
}
return models
}
// AntigravityModelConfig captures static antigravity model overrides, including
// Thinking budget limits and provider max completion tokens.
type AntigravityModelConfig struct {
Thinking *ThinkingSupport
MaxCompletionTokens int
}
// GetAntigravityModelConfig returns static configuration for antigravity models.
// Keys use upstream model names returned by the Antigravity models endpoint.
func GetAntigravityModelConfig() map[string]*AntigravityModelConfig {
return map[string]*AntigravityModelConfig{
// "rev19-uic3-1p": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true}},
"gemini-2.5-flash": {Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true}},
"gemini-2.5-flash-lite": {Thinking: &ThinkingSupport{Min: 0, Max: 24576, ZeroAllowed: true, DynamicAllowed: true}},
"gemini-3-pro-high": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}}},
"gemini-3-pro-image": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"low", "high"}}},
"gemini-3-flash": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}}},
"claude-sonnet-4-5-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-opus-4-5-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-sonnet-4-5": {MaxCompletionTokens: 64000},
"gpt-oss-120b-medium": {},
"tab_flash_lite_preview": {},
}
}

View File

@@ -78,6 +78,8 @@ type ThinkingSupport struct {
type ModelRegistration struct {
// Info contains the model metadata
Info *ModelInfo
// InfoByProvider maps provider identifiers to specific ModelInfo to support differing capabilities.
InfoByProvider map[string]*ModelInfo
// Count is the number of active clients that can provide this model
Count int
// LastUpdated tracks when this registration was last modified
@@ -132,16 +134,19 @@ func GetGlobalRegistry() *ModelRegistry {
return globalRegistry
}
// LookupModelInfo searches the dynamic registry first, then falls back to static model definitions.
//
// This helper exists because some code paths only have a model ID and still need Thinking and
// max completion token metadata even when the dynamic registry hasn't been populated.
func LookupModelInfo(modelID string) *ModelInfo {
// LookupModelInfo searches dynamic registry (provider-specific > global) then static definitions.
func LookupModelInfo(modelID string, provider ...string) *ModelInfo {
modelID = strings.TrimSpace(modelID)
if modelID == "" {
return nil
}
if info := GetGlobalRegistry().GetModelInfo(modelID); info != nil {
p := ""
if len(provider) > 0 {
p = strings.ToLower(strings.TrimSpace(provider[0]))
}
if info := GetGlobalRegistry().GetModelInfo(modelID, p); info != nil {
return info
}
return LookupStaticModelInfo(modelID)
@@ -297,6 +302,9 @@ func (r *ModelRegistry) RegisterClient(clientID, clientProvider string, models [
if count, okProv := reg.Providers[oldProvider]; okProv {
if count <= toRemove {
delete(reg.Providers, oldProvider)
if reg.InfoByProvider != nil {
delete(reg.InfoByProvider, oldProvider)
}
} else {
reg.Providers[oldProvider] = count - toRemove
}
@@ -346,6 +354,12 @@ func (r *ModelRegistry) RegisterClient(clientID, clientProvider string, models [
model := newModels[id]
if reg, ok := r.models[id]; ok {
reg.Info = cloneModelInfo(model)
if provider != "" {
if reg.InfoByProvider == nil {
reg.InfoByProvider = make(map[string]*ModelInfo)
}
reg.InfoByProvider[provider] = cloneModelInfo(model)
}
reg.LastUpdated = now
if reg.QuotaExceededClients != nil {
delete(reg.QuotaExceededClients, clientID)
@@ -409,11 +423,15 @@ func (r *ModelRegistry) addModelRegistration(modelID, provider string, model *Mo
if existing.SuspendedClients == nil {
existing.SuspendedClients = make(map[string]string)
}
if existing.InfoByProvider == nil {
existing.InfoByProvider = make(map[string]*ModelInfo)
}
if provider != "" {
if existing.Providers == nil {
existing.Providers = make(map[string]int)
}
existing.Providers[provider]++
existing.InfoByProvider[provider] = cloneModelInfo(model)
}
log.Debugf("Incremented count for model %s, now %d clients", modelID, existing.Count)
return
@@ -421,6 +439,7 @@ func (r *ModelRegistry) addModelRegistration(modelID, provider string, model *Mo
registration := &ModelRegistration{
Info: cloneModelInfo(model),
InfoByProvider: make(map[string]*ModelInfo),
Count: 1,
LastUpdated: now,
QuotaExceededClients: make(map[string]*time.Time),
@@ -428,6 +447,7 @@ func (r *ModelRegistry) addModelRegistration(modelID, provider string, model *Mo
}
if provider != "" {
registration.Providers = map[string]int{provider: 1}
registration.InfoByProvider[provider] = cloneModelInfo(model)
}
r.models[modelID] = registration
log.Debugf("Registered new model %s from provider %s", modelID, provider)
@@ -453,6 +473,9 @@ func (r *ModelRegistry) removeModelRegistration(clientID, modelID, provider stri
if count, ok := registration.Providers[provider]; ok {
if count <= 1 {
delete(registration.Providers, provider)
if registration.InfoByProvider != nil {
delete(registration.InfoByProvider, provider)
}
} else {
registration.Providers[provider] = count - 1
}
@@ -534,6 +557,9 @@ func (r *ModelRegistry) unregisterClientInternal(clientID string) {
if count, ok := registration.Providers[provider]; ok {
if count <= 1 {
delete(registration.Providers, provider)
if registration.InfoByProvider != nil {
delete(registration.InfoByProvider, provider)
}
} else {
registration.Providers[provider] = count - 1
}
@@ -940,12 +966,22 @@ func (r *ModelRegistry) GetModelProviders(modelID string) []string {
return result
}
// GetModelInfo returns the registered ModelInfo for the given model ID, if present.
// Returns nil if the model is unknown to the registry.
func (r *ModelRegistry) GetModelInfo(modelID string) *ModelInfo {
// GetModelInfo returns ModelInfo, prioritizing provider-specific definition if available.
func (r *ModelRegistry) GetModelInfo(modelID, provider string) *ModelInfo {
r.mutex.RLock()
defer r.mutex.RUnlock()
if reg, ok := r.models[modelID]; ok && reg != nil {
// Try provider specific definition first
if provider != "" && reg.InfoByProvider != nil {
if reg.Providers != nil {
if count, ok := reg.Providers[provider]; ok && count > 0 {
if info, ok := reg.InfoByProvider[provider]; ok && info != nil {
return info
}
}
}
}
// Fallback to global info (last registered)
return reg.Info
}
return nil
@@ -997,10 +1033,10 @@ func (r *ModelRegistry) convertModelToMap(model *ModelInfo, handlerType string)
"owned_by": model.OwnedBy,
}
if model.Created > 0 {
result["created"] = model.Created
result["created_at"] = model.Created
}
if model.Type != "" {
result["type"] = model.Type
result["type"] = "model"
}
if model.DisplayName != "" {
result["display_name"] = model.DisplayName

View File

@@ -111,6 +111,9 @@ func (e *AIStudioExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.A
// Execute performs a non-streaming request to the AI Studio API.
func (e *AIStudioExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
@@ -167,6 +170,9 @@ func (e *AIStudioExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth,
// ExecuteStream performs a streaming request to the AI Studio API.
func (e *AIStudioExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
@@ -393,12 +399,13 @@ func (e *AIStudioExecutor) translateRequest(req cliproxyexecutor.Request, opts c
}
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, stream)
payload := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), stream)
payload, err := thinking.ApplyThinking(payload, req.Model, "gemini")
payload, err := thinking.ApplyThinking(payload, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, translatedPayload{}, err
}
payload = fixGeminiImageAspectRatio(baseModel, payload)
payload = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", payload, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
payload = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", payload, originalTranslated, requestedModel)
payload, _ = sjson.DeleteBytes(payload, "generationConfig.maxOutputTokens")
payload, _ = sjson.DeleteBytes(payload, "generationConfig.responseMimeType")
payload, _ = sjson.DeleteBytes(payload, "generationConfig.responseJsonSchema")

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,258 @@
package executor
import (
"fmt"
"testing"
"github.com/tidwall/gjson"
)
func TestEnsureCacheControl(t *testing.T) {
// Test case 1: System prompt as string
t.Run("String System Prompt", func(t *testing.T) {
input := []byte(`{"model": "claude-3-5-sonnet", "system": "This is a long system prompt", "messages": []}`)
output := ensureCacheControl(input)
res := gjson.GetBytes(output, "system.0.cache_control.type")
if res.String() != "ephemeral" {
t.Errorf("cache_control not found in system string. Output: %s", string(output))
}
})
// Test case 2: System prompt as array
t.Run("Array System Prompt", func(t *testing.T) {
input := []byte(`{"model": "claude-3-5-sonnet", "system": [{"type": "text", "text": "Part 1"}, {"type": "text", "text": "Part 2"}], "messages": []}`)
output := ensureCacheControl(input)
// cache_control should only be on the LAST element
res0 := gjson.GetBytes(output, "system.0.cache_control")
res1 := gjson.GetBytes(output, "system.1.cache_control.type")
if res0.Exists() {
t.Errorf("cache_control should NOT be on the first element")
}
if res1.String() != "ephemeral" {
t.Errorf("cache_control not found on last system element. Output: %s", string(output))
}
})
// Test case 3: Tools are cached
t.Run("Tools Caching", func(t *testing.T) {
input := []byte(`{
"model": "claude-3-5-sonnet",
"tools": [
{"name": "tool1", "description": "First tool", "input_schema": {"type": "object"}},
{"name": "tool2", "description": "Second tool", "input_schema": {"type": "object"}}
],
"system": "System prompt",
"messages": []
}`)
output := ensureCacheControl(input)
// cache_control should only be on the LAST tool
tool0Cache := gjson.GetBytes(output, "tools.0.cache_control")
tool1Cache := gjson.GetBytes(output, "tools.1.cache_control.type")
if tool0Cache.Exists() {
t.Errorf("cache_control should NOT be on the first tool")
}
if tool1Cache.String() != "ephemeral" {
t.Errorf("cache_control not found on last tool. Output: %s", string(output))
}
// System should also have cache_control
systemCache := gjson.GetBytes(output, "system.0.cache_control.type")
if systemCache.String() != "ephemeral" {
t.Errorf("cache_control not found in system. Output: %s", string(output))
}
})
// Test case 4: Tools and system are INDEPENDENT breakpoints
// Per Anthropic docs: Up to 4 breakpoints allowed, tools and system are cached separately
t.Run("Independent Cache Breakpoints", func(t *testing.T) {
input := []byte(`{
"model": "claude-3-5-sonnet",
"tools": [
{"name": "tool1", "description": "First tool", "input_schema": {"type": "object"}, "cache_control": {"type": "ephemeral"}}
],
"system": [{"type": "text", "text": "System"}],
"messages": []
}`)
output := ensureCacheControl(input)
// Tool already has cache_control - should not be changed
tool0Cache := gjson.GetBytes(output, "tools.0.cache_control.type")
if tool0Cache.String() != "ephemeral" {
t.Errorf("existing cache_control was incorrectly removed")
}
// System SHOULD get cache_control because it is an INDEPENDENT breakpoint
// Tools and system are separate cache levels in the hierarchy
systemCache := gjson.GetBytes(output, "system.0.cache_control.type")
if systemCache.String() != "ephemeral" {
t.Errorf("system should have its own cache_control breakpoint (independent of tools)")
}
})
// Test case 5: Only tools, no system
t.Run("Only Tools No System", func(t *testing.T) {
input := []byte(`{
"model": "claude-3-5-sonnet",
"tools": [
{"name": "tool1", "description": "Tool", "input_schema": {"type": "object"}}
],
"messages": [{"role": "user", "content": "Hi"}]
}`)
output := ensureCacheControl(input)
toolCache := gjson.GetBytes(output, "tools.0.cache_control.type")
if toolCache.String() != "ephemeral" {
t.Errorf("cache_control not found on tool. Output: %s", string(output))
}
})
// Test case 6: Many tools (Claude Code scenario)
t.Run("Many Tools (Claude Code Scenario)", func(t *testing.T) {
// Simulate Claude Code with many tools
toolsJSON := `[`
for i := 0; i < 50; i++ {
if i > 0 {
toolsJSON += ","
}
toolsJSON += fmt.Sprintf(`{"name": "tool%d", "description": "Tool %d", "input_schema": {"type": "object"}}`, i, i)
}
toolsJSON += `]`
input := []byte(fmt.Sprintf(`{
"model": "claude-3-5-sonnet",
"tools": %s,
"system": [{"type": "text", "text": "You are Claude Code"}],
"messages": [{"role": "user", "content": "Hello"}]
}`, toolsJSON))
output := ensureCacheControl(input)
// Only the last tool (index 49) should have cache_control
for i := 0; i < 49; i++ {
path := fmt.Sprintf("tools.%d.cache_control", i)
if gjson.GetBytes(output, path).Exists() {
t.Errorf("tool %d should NOT have cache_control", i)
}
}
lastToolCache := gjson.GetBytes(output, "tools.49.cache_control.type")
if lastToolCache.String() != "ephemeral" {
t.Errorf("last tool (49) should have cache_control")
}
// System should also have cache_control
systemCache := gjson.GetBytes(output, "system.0.cache_control.type")
if systemCache.String() != "ephemeral" {
t.Errorf("system should have cache_control")
}
t.Log("test passed: 50 tools - cache_control only on last tool")
})
// Test case 7: Empty tools array
t.Run("Empty Tools Array", func(t *testing.T) {
input := []byte(`{"model": "claude-3-5-sonnet", "tools": [], "system": "Test", "messages": []}`)
output := ensureCacheControl(input)
// System should still get cache_control
systemCache := gjson.GetBytes(output, "system.0.cache_control.type")
if systemCache.String() != "ephemeral" {
t.Errorf("system should have cache_control even with empty tools array")
}
})
// Test case 8: Messages caching for multi-turn (second-to-last user)
t.Run("Messages Caching Second-To-Last User", func(t *testing.T) {
input := []byte(`{
"model": "claude-3-5-sonnet",
"messages": [
{"role": "user", "content": "First user"},
{"role": "assistant", "content": "Assistant reply"},
{"role": "user", "content": "Second user"},
{"role": "assistant", "content": "Assistant reply 2"},
{"role": "user", "content": "Third user"}
]
}`)
output := ensureCacheControl(input)
cacheType := gjson.GetBytes(output, "messages.2.content.0.cache_control.type")
if cacheType.String() != "ephemeral" {
t.Errorf("cache_control not found on second-to-last user turn. Output: %s", string(output))
}
lastUserCache := gjson.GetBytes(output, "messages.4.content.0.cache_control")
if lastUserCache.Exists() {
t.Errorf("last user turn should NOT have cache_control")
}
})
// Test case 9: Existing message cache_control should skip injection
t.Run("Messages Skip When Cache Control Exists", func(t *testing.T) {
input := []byte(`{
"model": "claude-3-5-sonnet",
"messages": [
{"role": "user", "content": [{"type": "text", "text": "First user"}]},
{"role": "assistant", "content": [{"type": "text", "text": "Assistant reply", "cache_control": {"type": "ephemeral"}}]},
{"role": "user", "content": [{"type": "text", "text": "Second user"}]}
]
}`)
output := ensureCacheControl(input)
userCache := gjson.GetBytes(output, "messages.0.content.0.cache_control")
if userCache.Exists() {
t.Errorf("cache_control should NOT be injected when a message already has cache_control")
}
existingCache := gjson.GetBytes(output, "messages.1.content.0.cache_control.type")
if existingCache.String() != "ephemeral" {
t.Errorf("existing cache_control should be preserved. Output: %s", string(output))
}
})
}
// TestCacheControlOrder verifies the correct order: tools -> system -> messages
func TestCacheControlOrder(t *testing.T) {
input := []byte(`{
"model": "claude-sonnet-4",
"tools": [
{"name": "Read", "description": "Read file", "input_schema": {"type": "object", "properties": {"path": {"type": "string"}}}},
{"name": "Write", "description": "Write file", "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}}}
],
"system": [
{"type": "text", "text": "You are Claude Code, Anthropic's official CLI for Claude."},
{"type": "text", "text": "Additional instructions here..."}
],
"messages": [
{"role": "user", "content": "Hello"}
]
}`)
output := ensureCacheControl(input)
// 1. Last tool has cache_control
if gjson.GetBytes(output, "tools.1.cache_control.type").String() != "ephemeral" {
t.Error("last tool should have cache_control")
}
// 2. First tool has NO cache_control
if gjson.GetBytes(output, "tools.0.cache_control").Exists() {
t.Error("first tool should NOT have cache_control")
}
// 3. Last system element has cache_control
if gjson.GetBytes(output, "system.1.cache_control.type").String() != "ephemeral" {
t.Error("last system element should have cache_control")
}
// 4. First system element has NO cache_control
if gjson.GetBytes(output, "system.0.cache_control").Exists() {
t.Error("first system element should NOT have cache_control")
}
t.Log("cache order correct: tools -> system")
}

View File

@@ -17,7 +17,6 @@ import (
claudeauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/claude"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
@@ -85,6 +84,9 @@ func (e *ClaudeExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.Aut
}
func (e *ClaudeExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, baseURL := claudeCreds(auth)
@@ -106,21 +108,23 @@ func (e *ClaudeExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), stream)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, err = thinking.ApplyThinking(body, req.Model, "claude")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
if !strings.HasPrefix(baseModel, "claude-3-5-haiku") {
body = checkSystemInstructions(body)
}
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
// Apply cloaking (system prompt injection, fake user ID, sensitive word obfuscation)
// based on client type and configuration.
body = applyCloaking(ctx, e.cfg, auth, body, baseModel)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
// Disable thinking if tool_choice forces tool use (Anthropic API constraint)
body = disableThinkingIfToolChoiceForced(body)
// Ensure max_tokens > thinking.budget_tokens when thinking is enabled
body = ensureMaxTokensForThinking(baseModel, body)
// Auto-inject cache_control if missing (optimization for ClawdBot/clients without caching support)
body = ensureCacheControl(body)
// Extract betas from body and convert to header
var extraBetas []string
@@ -165,7 +169,7 @@ func (e *ClaudeExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("response body close error: %v", errClose)
@@ -220,6 +224,9 @@ func (e *ClaudeExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
}
func (e *ClaudeExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, baseURL := claudeCreds(auth)
@@ -239,19 +246,23 @@ func (e *ClaudeExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, err = thinking.ApplyThinking(body, req.Model, "claude")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
body = checkSystemInstructions(body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
// Apply cloaking (system prompt injection, fake user ID, sensitive word obfuscation)
// based on client type and configuration.
body = applyCloaking(ctx, e.cfg, auth, body, baseModel)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
// Disable thinking if tool_choice forces tool use (Anthropic API constraint)
body = disableThinkingIfToolChoiceForced(body)
// Ensure max_tokens > thinking.budget_tokens when thinking is enabled
body = ensureMaxTokensForThinking(baseModel, body)
// Auto-inject cache_control if missing (optimization for ClawdBot/clients without caching support)
body = ensureCacheControl(body)
// Extract betas from body and convert to header
var extraBetas []string
@@ -296,7 +307,7 @@ func (e *ClaudeExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("response body close error: %v", errClose)
}
@@ -541,81 +552,6 @@ func disableThinkingIfToolChoiceForced(body []byte) []byte {
return body
}
// ensureMaxTokensForThinking ensures max_tokens > thinking.budget_tokens when thinking is enabled.
// Anthropic API requires this constraint; violating it returns a 400 error.
// This function should be called after all thinking configuration is finalized.
// It looks up the model's MaxCompletionTokens from the registry to use as the cap.
func ensureMaxTokensForThinking(modelName string, body []byte) []byte {
thinkingType := gjson.GetBytes(body, "thinking.type").String()
if thinkingType != "enabled" {
return body
}
budgetTokens := gjson.GetBytes(body, "thinking.budget_tokens").Int()
if budgetTokens <= 0 {
return body
}
maxTokens := gjson.GetBytes(body, "max_tokens").Int()
// Look up the model's max completion tokens from the registry
maxCompletionTokens := 0
if modelInfo := registry.LookupModelInfo(modelName); modelInfo != nil {
maxCompletionTokens = modelInfo.MaxCompletionTokens
}
// Fall back to budget + buffer if registry lookup fails or returns 0
const fallbackBuffer = 4000
requiredMaxTokens := budgetTokens + fallbackBuffer
if maxCompletionTokens > 0 {
requiredMaxTokens = int64(maxCompletionTokens)
}
if maxTokens < requiredMaxTokens {
body, _ = sjson.SetBytes(body, "max_tokens", requiredMaxTokens)
}
return body
}
func (e *ClaudeExecutor) resolveClaudeConfig(auth *cliproxyauth.Auth) *config.ClaudeKey {
if auth == nil || e.cfg == nil {
return nil
}
var attrKey, attrBase string
if auth.Attributes != nil {
attrKey = strings.TrimSpace(auth.Attributes["api_key"])
attrBase = strings.TrimSpace(auth.Attributes["base_url"])
}
for i := range e.cfg.ClaudeKey {
entry := &e.cfg.ClaudeKey[i]
cfgKey := strings.TrimSpace(entry.APIKey)
cfgBase := strings.TrimSpace(entry.BaseURL)
if attrKey != "" && attrBase != "" {
if strings.EqualFold(cfgKey, attrKey) && strings.EqualFold(cfgBase, attrBase) {
return entry
}
continue
}
if attrKey != "" && strings.EqualFold(cfgKey, attrKey) {
if cfgBase == "" || strings.EqualFold(cfgBase, attrBase) {
return entry
}
}
if attrKey == "" && attrBase != "" && strings.EqualFold(cfgBase, attrBase) {
return entry
}
}
if attrKey != "" {
for i := range e.cfg.ClaudeKey {
entry := &e.cfg.ClaudeKey[i]
if strings.EqualFold(strings.TrimSpace(entry.APIKey), attrKey) {
return entry
}
}
}
return nil
}
type compositeReadCloser struct {
io.Reader
closers []func() error
@@ -712,13 +648,17 @@ func applyClaudeHeaders(r *http.Request, auth *cliproxyauth.Auth, apiKey string,
ginHeaders = ginCtx.Request.Header
}
baseBetas := "claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14"
promptCachingBeta := "prompt-caching-2024-07-31"
baseBetas := "claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14," + promptCachingBeta
if val := strings.TrimSpace(ginHeaders.Get("Anthropic-Beta")); val != "" {
baseBetas = val
if !strings.Contains(val, "oauth") {
baseBetas += ",oauth-2025-04-20"
}
}
if !strings.Contains(baseBetas, promptCachingBeta) {
baseBetas += "," + promptCachingBeta
}
// Merge extra betas from request body
if len(extraBetas) > 0 {
@@ -809,6 +749,11 @@ func applyClaudeToolPrefix(body []byte, prefix string) []byte {
if tools := gjson.GetBytes(body, "tools"); tools.Exists() && tools.IsArray() {
tools.ForEach(func(index, tool gjson.Result) bool {
// Skip built-in tools (web_search, code_execution, etc.) which have
// a "type" field and require their name to remain unchanged.
if tool.Get("type").Exists() && tool.Get("type").String() != "" {
return true
}
name := tool.Get("name").String()
if name == "" || strings.HasPrefix(name, prefix) {
return true
@@ -901,3 +846,374 @@ func stripClaudeToolPrefixFromStreamLine(line []byte, prefix string) []byte {
}
return updated
}
// getClientUserAgent extracts the client User-Agent from the gin context.
func getClientUserAgent(ctx context.Context) string {
if ginCtx, ok := ctx.Value("gin").(*gin.Context); ok && ginCtx != nil && ginCtx.Request != nil {
return ginCtx.GetHeader("User-Agent")
}
return ""
}
// getCloakConfigFromAuth extracts cloak configuration from auth attributes.
// Returns (cloakMode, strictMode, sensitiveWords).
func getCloakConfigFromAuth(auth *cliproxyauth.Auth) (string, bool, []string) {
if auth == nil || auth.Attributes == nil {
return "auto", false, nil
}
cloakMode := auth.Attributes["cloak_mode"]
if cloakMode == "" {
cloakMode = "auto"
}
strictMode := strings.ToLower(auth.Attributes["cloak_strict_mode"]) == "true"
var sensitiveWords []string
if wordsStr := auth.Attributes["cloak_sensitive_words"]; wordsStr != "" {
sensitiveWords = strings.Split(wordsStr, ",")
for i := range sensitiveWords {
sensitiveWords[i] = strings.TrimSpace(sensitiveWords[i])
}
}
return cloakMode, strictMode, sensitiveWords
}
// resolveClaudeKeyCloakConfig finds the matching ClaudeKey config and returns its CloakConfig.
func resolveClaudeKeyCloakConfig(cfg *config.Config, auth *cliproxyauth.Auth) *config.CloakConfig {
if cfg == nil || auth == nil {
return nil
}
apiKey, baseURL := claudeCreds(auth)
if apiKey == "" {
return nil
}
for i := range cfg.ClaudeKey {
entry := &cfg.ClaudeKey[i]
cfgKey := strings.TrimSpace(entry.APIKey)
cfgBase := strings.TrimSpace(entry.BaseURL)
// Match by API key
if strings.EqualFold(cfgKey, apiKey) {
// If baseURL is specified, also check it
if baseURL != "" && cfgBase != "" && !strings.EqualFold(cfgBase, baseURL) {
continue
}
return entry.Cloak
}
}
return nil
}
// injectFakeUserID generates and injects a fake user ID into the request metadata.
func injectFakeUserID(payload []byte) []byte {
metadata := gjson.GetBytes(payload, "metadata")
if !metadata.Exists() {
payload, _ = sjson.SetBytes(payload, "metadata.user_id", generateFakeUserID())
return payload
}
existingUserID := gjson.GetBytes(payload, "metadata.user_id").String()
if existingUserID == "" || !isValidUserID(existingUserID) {
payload, _ = sjson.SetBytes(payload, "metadata.user_id", generateFakeUserID())
}
return payload
}
// checkSystemInstructionsWithMode injects Claude Code system prompt.
// In strict mode, it replaces all user system messages.
// In non-strict mode (default), it prepends to existing system messages.
func checkSystemInstructionsWithMode(payload []byte, strictMode bool) []byte {
system := gjson.GetBytes(payload, "system")
claudeCodeInstructions := `[{"type":"text","text":"You are Claude Code, Anthropic's official CLI for Claude."}]`
if strictMode {
// Strict mode: replace all system messages with Claude Code prompt only
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
return payload
}
// Non-strict mode (default): prepend Claude Code prompt to existing system messages
if system.IsArray() {
if gjson.GetBytes(payload, "system.0.text").String() != "You are Claude Code, Anthropic's official CLI for Claude." {
system.ForEach(func(_, part gjson.Result) bool {
if part.Get("type").String() == "text" {
claudeCodeInstructions, _ = sjson.SetRaw(claudeCodeInstructions, "-1", part.Raw)
}
return true
})
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
}
} else {
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
}
return payload
}
// applyCloaking applies cloaking transformations to the payload based on config and client.
// Cloaking includes: system prompt injection, fake user ID, and sensitive word obfuscation.
func applyCloaking(ctx context.Context, cfg *config.Config, auth *cliproxyauth.Auth, payload []byte, model string) []byte {
clientUserAgent := getClientUserAgent(ctx)
// Get cloak config from ClaudeKey configuration
cloakCfg := resolveClaudeKeyCloakConfig(cfg, auth)
// Determine cloak settings
var cloakMode string
var strictMode bool
var sensitiveWords []string
if cloakCfg != nil {
cloakMode = cloakCfg.Mode
strictMode = cloakCfg.StrictMode
sensitiveWords = cloakCfg.SensitiveWords
}
// Fallback to auth attributes if no config found
if cloakMode == "" {
attrMode, attrStrict, attrWords := getCloakConfigFromAuth(auth)
cloakMode = attrMode
if !strictMode {
strictMode = attrStrict
}
if len(sensitiveWords) == 0 {
sensitiveWords = attrWords
}
}
// Determine if cloaking should be applied
if !shouldCloak(cloakMode, clientUserAgent) {
return payload
}
// Skip system instructions for claude-3-5-haiku models
if !strings.HasPrefix(model, "claude-3-5-haiku") {
payload = checkSystemInstructionsWithMode(payload, strictMode)
}
// Inject fake user ID
payload = injectFakeUserID(payload)
// Apply sensitive word obfuscation
if len(sensitiveWords) > 0 {
matcher := buildSensitiveWordMatcher(sensitiveWords)
payload = obfuscateSensitiveWords(payload, matcher)
}
return payload
}
// ensureCacheControl injects cache_control breakpoints into the payload for optimal prompt caching.
// According to Anthropic's documentation, cache prefixes are created in order: tools -> system -> messages.
// This function adds cache_control to:
// 1. The LAST tool in the tools array (caches all tool definitions)
// 2. The LAST element in the system array (caches system prompt)
// 3. The SECOND-TO-LAST user turn (caches conversation history for multi-turn)
//
// Up to 4 cache breakpoints are allowed per request. Tools, System, and Messages are INDEPENDENT breakpoints.
// This enables up to 90% cost reduction on cached tokens (cache read = 0.1x base price).
// See: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
func ensureCacheControl(payload []byte) []byte {
// 1. Inject cache_control into the LAST tool (caches all tool definitions)
// Tools are cached first in the hierarchy, so this is the most important breakpoint.
payload = injectToolsCacheControl(payload)
// 2. Inject cache_control into the LAST system prompt element
// System is the second level in the cache hierarchy.
payload = injectSystemCacheControl(payload)
// 3. Inject cache_control into messages for multi-turn conversation caching
// This caches the conversation history up to the second-to-last user turn.
payload = injectMessagesCacheControl(payload)
return payload
}
// injectMessagesCacheControl adds cache_control to the second-to-last user turn for multi-turn caching.
// Per Anthropic docs: "Place cache_control on the second-to-last User message to let the model reuse the earlier cache."
// This enables caching of conversation history, which is especially beneficial for long multi-turn conversations.
// Only adds cache_control if:
// - There are at least 2 user turns in the conversation
// - No message content already has cache_control
func injectMessagesCacheControl(payload []byte) []byte {
messages := gjson.GetBytes(payload, "messages")
if !messages.Exists() || !messages.IsArray() {
return payload
}
// Check if ANY message content already has cache_control
hasCacheControlInMessages := false
messages.ForEach(func(_, msg gjson.Result) bool {
content := msg.Get("content")
if content.IsArray() {
content.ForEach(func(_, item gjson.Result) bool {
if item.Get("cache_control").Exists() {
hasCacheControlInMessages = true
return false
}
return true
})
}
return !hasCacheControlInMessages
})
if hasCacheControlInMessages {
return payload
}
// Find all user message indices
var userMsgIndices []int
messages.ForEach(func(index gjson.Result, msg gjson.Result) bool {
if msg.Get("role").String() == "user" {
userMsgIndices = append(userMsgIndices, int(index.Int()))
}
return true
})
// Need at least 2 user turns to cache the second-to-last
if len(userMsgIndices) < 2 {
return payload
}
// Get the second-to-last user message index
secondToLastUserIdx := userMsgIndices[len(userMsgIndices)-2]
// Get the content of this message
contentPath := fmt.Sprintf("messages.%d.content", secondToLastUserIdx)
content := gjson.GetBytes(payload, contentPath)
if content.IsArray() {
// Add cache_control to the last content block of this message
contentCount := int(content.Get("#").Int())
if contentCount > 0 {
cacheControlPath := fmt.Sprintf("messages.%d.content.%d.cache_control", secondToLastUserIdx, contentCount-1)
result, err := sjson.SetBytes(payload, cacheControlPath, map[string]string{"type": "ephemeral"})
if err != nil {
log.Warnf("failed to inject cache_control into messages: %v", err)
return payload
}
payload = result
}
} else if content.Type == gjson.String {
// Convert string content to array with cache_control
text := content.String()
newContent := []map[string]interface{}{
{
"type": "text",
"text": text,
"cache_control": map[string]string{
"type": "ephemeral",
},
},
}
result, err := sjson.SetBytes(payload, contentPath, newContent)
if err != nil {
log.Warnf("failed to inject cache_control into message string content: %v", err)
return payload
}
payload = result
}
return payload
}
// injectToolsCacheControl adds cache_control to the last tool in the tools array.
// Per Anthropic docs: "The cache_control parameter on the last tool definition caches all tool definitions."
// This only adds cache_control if NO tool in the array already has it.
func injectToolsCacheControl(payload []byte) []byte {
tools := gjson.GetBytes(payload, "tools")
if !tools.Exists() || !tools.IsArray() {
return payload
}
toolCount := int(tools.Get("#").Int())
if toolCount == 0 {
return payload
}
// Check if ANY tool already has cache_control - if so, don't modify tools
hasCacheControlInTools := false
tools.ForEach(func(_, tool gjson.Result) bool {
if tool.Get("cache_control").Exists() {
hasCacheControlInTools = true
return false
}
return true
})
if hasCacheControlInTools {
return payload
}
// Add cache_control to the last tool
lastToolPath := fmt.Sprintf("tools.%d.cache_control", toolCount-1)
result, err := sjson.SetBytes(payload, lastToolPath, map[string]string{"type": "ephemeral"})
if err != nil {
log.Warnf("failed to inject cache_control into tools array: %v", err)
return payload
}
return result
}
// injectSystemCacheControl adds cache_control to the last element in the system prompt.
// Converts string system prompts to array format if needed.
// This only adds cache_control if NO system element already has it.
func injectSystemCacheControl(payload []byte) []byte {
system := gjson.GetBytes(payload, "system")
if !system.Exists() {
return payload
}
if system.IsArray() {
count := int(system.Get("#").Int())
if count == 0 {
return payload
}
// Check if ANY system element already has cache_control
hasCacheControlInSystem := false
system.ForEach(func(_, item gjson.Result) bool {
if item.Get("cache_control").Exists() {
hasCacheControlInSystem = true
return false
}
return true
})
if hasCacheControlInSystem {
return payload
}
// Add cache_control to the last system element
lastSystemPath := fmt.Sprintf("system.%d.cache_control", count-1)
result, err := sjson.SetBytes(payload, lastSystemPath, map[string]string{"type": "ephemeral"})
if err != nil {
log.Warnf("failed to inject cache_control into system array: %v", err)
return payload
}
payload = result
} else if system.Type == gjson.String {
// Convert string system prompt to array with cache_control
// "system": "text" -> "system": [{"type": "text", "text": "text", "cache_control": {"type": "ephemeral"}}]
text := system.String()
newSystem := []map[string]interface{}{
{
"type": "text",
"text": text,
"cache_control": map[string]string{
"type": "ephemeral",
},
},
}
result, err := sjson.SetBytes(payload, "system", newSystem)
if err != nil {
log.Warnf("failed to inject cache_control into system string: %v", err)
return payload
}
payload = result
}
return payload
}

View File

@@ -25,6 +25,18 @@ func TestApplyClaudeToolPrefix(t *testing.T) {
}
}
func TestApplyClaudeToolPrefix_SkipsBuiltinTools(t *testing.T) {
input := []byte(`{"tools":[{"type":"web_search_20250305","name":"web_search"},{"name":"my_custom_tool","input_schema":{"type":"object"}}]}`)
out := applyClaudeToolPrefix(input, "proxy_")
if got := gjson.GetBytes(out, "tools.0.name").String(); got != "web_search" {
t.Fatalf("built-in tool name should not be prefixed: tools.0.name = %q, want %q", got, "web_search")
}
if got := gjson.GetBytes(out, "tools.1.name").String(); got != "proxy_my_custom_tool" {
t.Fatalf("custom tool should be prefixed: tools.1.name = %q, want %q", got, "proxy_my_custom_tool")
}
}
func TestStripClaudeToolPrefixFromResponse(t *testing.T) {
input := []byte(`{"content":[{"type":"tool_use","name":"proxy_alpha","id":"t1","input":{}},{"type":"tool_use","name":"bravo","id":"t2","input":{}}]}`)
out := stripClaudeToolPrefixFromResponse(input, "proxy_")

View File

@@ -0,0 +1,176 @@
package executor
import (
"regexp"
"sort"
"strings"
"unicode/utf8"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
// zeroWidthSpace is the Unicode zero-width space character used for obfuscation.
const zeroWidthSpace = "\u200B"
// SensitiveWordMatcher holds the compiled regex for matching sensitive words.
type SensitiveWordMatcher struct {
regex *regexp.Regexp
}
// buildSensitiveWordMatcher compiles a regex from the word list.
// Words are sorted by length (longest first) for proper matching.
func buildSensitiveWordMatcher(words []string) *SensitiveWordMatcher {
if len(words) == 0 {
return nil
}
// Filter and normalize words
var validWords []string
for _, w := range words {
w = strings.TrimSpace(w)
if utf8.RuneCountInString(w) >= 2 && !strings.Contains(w, zeroWidthSpace) {
validWords = append(validWords, w)
}
}
if len(validWords) == 0 {
return nil
}
// Sort by length (longest first) for proper matching
sort.Slice(validWords, func(i, j int) bool {
return len(validWords[i]) > len(validWords[j])
})
// Escape and join
escaped := make([]string, len(validWords))
for i, w := range validWords {
escaped[i] = regexp.QuoteMeta(w)
}
pattern := "(?i)" + strings.Join(escaped, "|")
re, err := regexp.Compile(pattern)
if err != nil {
return nil
}
return &SensitiveWordMatcher{regex: re}
}
// obfuscateWord inserts a zero-width space after the first grapheme.
func obfuscateWord(word string) string {
if strings.Contains(word, zeroWidthSpace) {
return word
}
// Get first rune
r, size := utf8.DecodeRuneInString(word)
if r == utf8.RuneError || size >= len(word) {
return word
}
return string(r) + zeroWidthSpace + word[size:]
}
// obfuscateText replaces all sensitive words in the text.
func (m *SensitiveWordMatcher) obfuscateText(text string) string {
if m == nil || m.regex == nil {
return text
}
return m.regex.ReplaceAllStringFunc(text, obfuscateWord)
}
// obfuscateSensitiveWords processes the payload and obfuscates sensitive words
// in system blocks and message content.
func obfuscateSensitiveWords(payload []byte, matcher *SensitiveWordMatcher) []byte {
if matcher == nil || matcher.regex == nil {
return payload
}
// Obfuscate in system blocks
payload = obfuscateSystemBlocks(payload, matcher)
// Obfuscate in messages
payload = obfuscateMessages(payload, matcher)
return payload
}
// obfuscateSystemBlocks obfuscates sensitive words in system blocks.
func obfuscateSystemBlocks(payload []byte, matcher *SensitiveWordMatcher) []byte {
system := gjson.GetBytes(payload, "system")
if !system.Exists() {
return payload
}
if system.IsArray() {
modified := false
system.ForEach(func(key, value gjson.Result) bool {
if value.Get("type").String() == "text" {
text := value.Get("text").String()
obfuscated := matcher.obfuscateText(text)
if obfuscated != text {
path := "system." + key.String() + ".text"
payload, _ = sjson.SetBytes(payload, path, obfuscated)
modified = true
}
}
return true
})
if modified {
return payload
}
} else if system.Type == gjson.String {
text := system.String()
obfuscated := matcher.obfuscateText(text)
if obfuscated != text {
payload, _ = sjson.SetBytes(payload, "system", obfuscated)
}
}
return payload
}
// obfuscateMessages obfuscates sensitive words in message content.
func obfuscateMessages(payload []byte, matcher *SensitiveWordMatcher) []byte {
messages := gjson.GetBytes(payload, "messages")
if !messages.Exists() || !messages.IsArray() {
return payload
}
messages.ForEach(func(msgKey, msg gjson.Result) bool {
content := msg.Get("content")
if !content.Exists() {
return true
}
msgPath := "messages." + msgKey.String()
if content.Type == gjson.String {
// Simple string content
text := content.String()
obfuscated := matcher.obfuscateText(text)
if obfuscated != text {
payload, _ = sjson.SetBytes(payload, msgPath+".content", obfuscated)
}
} else if content.IsArray() {
// Array of content blocks
content.ForEach(func(blockKey, block gjson.Result) bool {
if block.Get("type").String() == "text" {
text := block.Get("text").String()
obfuscated := matcher.obfuscateText(text)
if obfuscated != text {
path := msgPath + ".content." + blockKey.String() + ".text"
payload, _ = sjson.SetBytes(payload, path, obfuscated)
}
}
return true
})
}
return true
})
return payload
}

View File

@@ -0,0 +1,47 @@
package executor
import (
"crypto/rand"
"encoding/hex"
"regexp"
"strings"
"github.com/google/uuid"
)
// userIDPattern matches Claude Code format: user_[64-hex]_account__session_[uuid-v4]
var userIDPattern = regexp.MustCompile(`^user_[a-fA-F0-9]{64}_account__session_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)
// generateFakeUserID generates a fake user ID in Claude Code format.
// Format: user_[64-hex-chars]_account__session_[UUID-v4]
func generateFakeUserID() string {
hexBytes := make([]byte, 32)
_, _ = rand.Read(hexBytes)
hexPart := hex.EncodeToString(hexBytes)
uuidPart := uuid.New().String()
return "user_" + hexPart + "_account__session_" + uuidPart
}
// isValidUserID checks if a user ID matches Claude Code format.
func isValidUserID(userID string) bool {
return userIDPattern.MatchString(userID)
}
// shouldCloak determines if request should be cloaked based on config and client User-Agent.
// Returns true if cloaking should be applied.
func shouldCloak(cloakMode string, userAgent string) bool {
switch strings.ToLower(cloakMode) {
case "always":
return true
case "never":
return false
default: // "auto" or empty
// If client is Claude Code, don't cloak
return !strings.HasPrefix(userAgent, "claude-cli")
}
}
// isClaudeCodeClient checks if the User-Agent indicates a Claude Code client.
func isClaudeCodeClient(userAgent string) bool {
return strings.HasPrefix(userAgent, "claude-cli")
}

View File

@@ -73,6 +73,9 @@ func (e *CodexExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.Auth
}
func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return e.executeCompact(ctx, auth, req, opts)
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, baseURL := codexCreds(auth)
@@ -96,12 +99,13 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
body = sdktranslator.TranslateRequest(from, to, baseModel, body, false)
body = misc.StripCodexUserAgent(body)
body, err = thinking.ApplyThinking(body, req.Model, "codex")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, _ = sjson.SetBytes(body, "stream", true)
body, _ = sjson.DeleteBytes(body, "previous_response_id")
@@ -116,7 +120,7 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
if err != nil {
return resp, err
}
applyCodexHeaders(httpReq, auth, apiKey)
applyCodexHeaders(httpReq, auth, apiKey, true)
var authID, authLabel, authType, authValue string
if auth != nil {
authID = auth.ID
@@ -149,7 +153,7 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
@@ -184,7 +188,96 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
return resp, err
}
func (e *CodexExecutor) executeCompact(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, baseURL := codexCreds(auth)
if baseURL == "" {
baseURL = "https://chatgpt.com/backend-api/codex"
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("openai-response")
originalPayload := bytes.Clone(req.Payload)
if len(opts.OriginalRequest) > 0 {
originalPayload = bytes.Clone(opts.OriginalRequest)
}
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, false)
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, _ = sjson.DeleteBytes(body, "stream")
url := strings.TrimSuffix(baseURL, "/") + "/responses/compact"
httpReq, err := e.cacheHelper(ctx, from, url, req, body)
if err != nil {
return resp, err
}
applyCodexHeaders(httpReq, auth, apiKey, false)
var authID, authLabel, authType, authValue string
if auth != nil {
authID = auth.ID
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
Body: body,
Provider: e.Identifier(),
AuthID: authID,
AuthLabel: authLabel,
AuthType: authType,
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, err := httpClient.Do(httpReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
defer func() {
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("codex executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
data, err := io.ReadAll(httpResp.Body)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
appendAPIResponseChunk(ctx, e.cfg, data)
reporter.publish(ctx, parseOpenAIUsage(data))
reporter.ensurePublished(ctx)
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, bytes.Clone(originalPayload), body, data, &param)
resp = cliproxyexecutor.Response{Payload: []byte(out)}
return resp, nil
}
func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusBadRequest, msg: "streaming not supported for /responses/compact"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, baseURL := codexCreds(auth)
@@ -208,12 +301,13 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
body = sdktranslator.TranslateRequest(from, to, baseModel, body, true)
body = misc.StripCodexUserAgent(body)
body, err = thinking.ApplyThinking(body, req.Model, "codex")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.DeleteBytes(body, "previous_response_id")
body, _ = sjson.DeleteBytes(body, "prompt_cache_retention")
body, _ = sjson.DeleteBytes(body, "safety_identifier")
@@ -227,7 +321,7 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
if err != nil {
return nil, err
}
applyCodexHeaders(httpReq, auth, apiKey)
applyCodexHeaders(httpReq, auth, apiKey, true)
var authID, authLabel, authType, authValue string
if auth != nil {
authID = auth.ID
@@ -263,7 +357,7 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
return nil, readErr
}
appendAPIResponseChunk(ctx, e.cfg, data)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
err = statusErr{code: httpResp.StatusCode, msg: string(data)}
return nil, err
}
@@ -316,7 +410,7 @@ func (e *CodexExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Auth
body = sdktranslator.TranslateRequest(from, to, baseModel, body, false)
body = misc.StripCodexUserAgent(body)
body, err := thinking.ApplyThinking(body, req.Model, "codex")
body, err := thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return cliproxyexecutor.Response{}, err
}
@@ -528,17 +622,21 @@ func (e *CodexExecutor) cacheHelper(ctx context.Context, from sdktranslator.Form
}
}
rawJSON, _ = sjson.SetBytes(rawJSON, "prompt_cache_key", cache.ID)
if cache.ID != "" {
rawJSON, _ = sjson.SetBytes(rawJSON, "prompt_cache_key", cache.ID)
}
httpReq, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(rawJSON))
if err != nil {
return nil, err
}
httpReq.Header.Set("Conversation_id", cache.ID)
httpReq.Header.Set("Session_id", cache.ID)
if cache.ID != "" {
httpReq.Header.Set("Conversation_id", cache.ID)
httpReq.Header.Set("Session_id", cache.ID)
}
return httpReq, nil
}
func applyCodexHeaders(r *http.Request, auth *cliproxyauth.Auth, token string) {
func applyCodexHeaders(r *http.Request, auth *cliproxyauth.Auth, token string, stream bool) {
r.Header.Set("Content-Type", "application/json")
r.Header.Set("Authorization", "Bearer "+token)
@@ -552,7 +650,11 @@ func applyCodexHeaders(r *http.Request, auth *cliproxyauth.Auth, token string) {
misc.EnsureHeader(r.Header, ginHeaders, "Session_id", uuid.NewString())
misc.EnsureHeader(r.Header, ginHeaders, "User-Agent", "codex_cli_rs/0.50.0 (Mac OS 26.0.1; arm64) Apple_Terminal/464")
r.Header.Set("Accept", "text/event-stream")
if stream {
r.Header.Set("Accept", "text/event-stream")
} else {
r.Header.Set("Accept", "application/json")
}
r.Header.Set("Connection", "Keep-Alive")
isAPIKey := false

View File

@@ -103,6 +103,9 @@ func (e *GeminiCLIExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.
// Execute performs a non-streaming request to the Gemini CLI API.
func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
tokenSource, baseTokenData, err := prepareGeminiCLITokenSource(ctx, e.cfg, auth)
@@ -123,13 +126,14 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, false)
basePayload := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
basePayload, err = thinking.ApplyThinking(basePayload, req.Model, "gemini-cli")
basePayload, err = thinking.ApplyThinking(basePayload, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
basePayload = fixGeminiCLIImageAspectRatio(baseModel, basePayload)
basePayload = applyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
basePayload = applyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated, requestedModel)
action := "generateContent"
if req.Metadata != nil {
@@ -226,7 +230,7 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
lastStatus = httpResp.StatusCode
lastBody = append([]byte(nil), data...)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
if httpResp.StatusCode == 429 {
if idx+1 < len(models) {
log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
@@ -252,6 +256,9 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
// ExecuteStream performs a streaming request to the Gemini CLI API.
func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
tokenSource, baseTokenData, err := prepareGeminiCLITokenSource(ctx, e.cfg, auth)
@@ -272,13 +279,14 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, true)
basePayload := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
basePayload, err = thinking.ApplyThinking(basePayload, req.Model, "gemini-cli")
basePayload, err = thinking.ApplyThinking(basePayload, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
basePayload = fixGeminiCLIImageAspectRatio(baseModel, basePayload)
basePayload = applyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
basePayload = applyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated, requestedModel)
projectID := resolveGeminiProjectID(auth)
@@ -358,7 +366,7 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
appendAPIResponseChunk(ctx, e.cfg, data)
lastStatus = httpResp.StatusCode
lastBody = append([]byte(nil), data...)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
if httpResp.StatusCode == 429 {
if idx+1 < len(models) {
log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
@@ -479,7 +487,7 @@ func (e *GeminiCLIExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.
for range models {
payload := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
payload, err = thinking.ApplyThinking(payload, req.Model, "gemini-cli")
payload, err = thinking.ApplyThinking(payload, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return cliproxyexecutor.Response{}, err
}

View File

@@ -103,6 +103,9 @@ func (e *GeminiExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.Aut
// - cliproxyexecutor.Response: The response from the API
// - error: An error if the request fails
func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, bearer := geminiCreds(auth)
@@ -120,13 +123,14 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, false)
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
body, err = thinking.ApplyThinking(body, req.Model, "gemini")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
body = fixGeminiImageAspectRatio(baseModel, body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := "generateContent"
@@ -187,7 +191,7 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
@@ -206,6 +210,9 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
// ExecuteStream performs a streaming request to the Gemini API.
func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, bearer := geminiCreds(auth)
@@ -222,13 +229,14 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, true)
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
body, err = thinking.ApplyThinking(body, req.Model, "gemini")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
body = fixGeminiImageAspectRatio(baseModel, body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
baseURL := resolveGeminiBaseURL(auth)
@@ -280,7 +288,7 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("gemini executor: close response body error: %v", errClose)
}
@@ -338,7 +346,7 @@ func (e *GeminiExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Aut
to := sdktranslator.FromString("gemini")
translatedReq := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
translatedReq, err := thinking.ApplyThinking(translatedReq, req.Model, "gemini")
translatedReq, err := thinking.ApplyThinking(translatedReq, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return cliproxyexecutor.Response{}, err
}
@@ -400,7 +408,7 @@ func (e *GeminiExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Aut
}
appendAPIResponseChunk(ctx, e.cfg, data)
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
log.Debugf("request error, error status: %d, error body: %s", resp.StatusCode, summarizeErrorBody(resp.Header.Get("Content-Type"), data))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", resp.StatusCode, summarizeErrorBody(resp.Header.Get("Content-Type"), data))
return cliproxyexecutor.Response{}, statusErr{code: resp.StatusCode, msg: string(data)}
}

View File

@@ -12,6 +12,7 @@ import (
"io"
"net/http"
"strings"
"time"
vertexauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/vertex"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
@@ -31,6 +32,143 @@ const (
vertexAPIVersion = "v1"
)
// isImagenModel checks if the model name is an Imagen image generation model.
// Imagen models use the :predict action instead of :generateContent.
func isImagenModel(model string) bool {
lowerModel := strings.ToLower(model)
return strings.Contains(lowerModel, "imagen")
}
// getVertexAction returns the appropriate action for the given model.
// Imagen models use "predict", while Gemini models use "generateContent".
func getVertexAction(model string, isStream bool) string {
if isImagenModel(model) {
return "predict"
}
if isStream {
return "streamGenerateContent"
}
return "generateContent"
}
// convertImagenToGeminiResponse converts Imagen API response to Gemini format
// so it can be processed by the standard translation pipeline.
// This ensures Imagen models return responses in the same format as gemini-3-pro-image-preview.
func convertImagenToGeminiResponse(data []byte, model string) []byte {
predictions := gjson.GetBytes(data, "predictions")
if !predictions.Exists() || !predictions.IsArray() {
return data
}
// Build Gemini-compatible response with inlineData
parts := make([]map[string]any, 0)
for _, pred := range predictions.Array() {
imageData := pred.Get("bytesBase64Encoded").String()
mimeType := pred.Get("mimeType").String()
if mimeType == "" {
mimeType = "image/png"
}
if imageData != "" {
parts = append(parts, map[string]any{
"inlineData": map[string]any{
"mimeType": mimeType,
"data": imageData,
},
})
}
}
// Generate unique response ID using timestamp
responseId := fmt.Sprintf("imagen-%d", time.Now().UnixNano())
response := map[string]any{
"candidates": []map[string]any{{
"content": map[string]any{
"parts": parts,
"role": "model",
},
"finishReason": "STOP",
}},
"responseId": responseId,
"modelVersion": model,
// Imagen API doesn't return token counts, set to 0 for tracking purposes
"usageMetadata": map[string]any{
"promptTokenCount": 0,
"candidatesTokenCount": 0,
"totalTokenCount": 0,
},
}
result, err := json.Marshal(response)
if err != nil {
return data
}
return result
}
// convertToImagenRequest converts a Gemini-style request to Imagen API format.
// Imagen API uses a different structure: instances[].prompt instead of contents[].
func convertToImagenRequest(payload []byte) ([]byte, error) {
// Extract prompt from Gemini-style contents
prompt := ""
// Try to get prompt from contents[0].parts[0].text
contentsText := gjson.GetBytes(payload, "contents.0.parts.0.text")
if contentsText.Exists() {
prompt = contentsText.String()
}
// If no contents, try messages format (OpenAI-compatible)
if prompt == "" {
messagesText := gjson.GetBytes(payload, "messages.#.content")
if messagesText.Exists() && messagesText.IsArray() {
for _, msg := range messagesText.Array() {
if msg.String() != "" {
prompt = msg.String()
break
}
}
}
}
// If still no prompt, try direct prompt field
if prompt == "" {
directPrompt := gjson.GetBytes(payload, "prompt")
if directPrompt.Exists() {
prompt = directPrompt.String()
}
}
if prompt == "" {
return nil, fmt.Errorf("imagen: no prompt found in request")
}
// Build Imagen API request
imagenReq := map[string]any{
"instances": []map[string]any{
{
"prompt": prompt,
},
},
"parameters": map[string]any{
"sampleCount": 1,
},
}
// Extract optional parameters
if aspectRatio := gjson.GetBytes(payload, "aspectRatio"); aspectRatio.Exists() {
imagenReq["parameters"].(map[string]any)["aspectRatio"] = aspectRatio.String()
}
if sampleCount := gjson.GetBytes(payload, "sampleCount"); sampleCount.Exists() {
imagenReq["parameters"].(map[string]any)["sampleCount"] = int(sampleCount.Int())
}
if negativePrompt := gjson.GetBytes(payload, "negativePrompt"); negativePrompt.Exists() {
imagenReq["instances"].([]map[string]any)[0]["negativePrompt"] = negativePrompt.String()
}
return json.Marshal(imagenReq)
}
// GeminiVertexExecutor sends requests to Vertex AI Gemini endpoints using service account credentials.
type GeminiVertexExecutor struct {
cfg *config.Config
@@ -95,6 +233,9 @@ func (e *GeminiVertexExecutor) HttpRequest(ctx context.Context, auth *cliproxyau
// Execute performs a non-streaming request to the Vertex AI API.
func (e *GeminiVertexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
// Try API key authentication first
apiKey, baseURL := vertexAPICreds(auth)
@@ -113,6 +254,9 @@ func (e *GeminiVertexExecutor) Execute(ctx context.Context, auth *cliproxyauth.A
// ExecuteStream performs a streaming request to the Vertex AI API.
func (e *GeminiVertexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
// Try API key authentication first
apiKey, baseURL := vertexAPICreds(auth)
@@ -160,26 +304,39 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini")
var body []byte
originalPayload := bytes.Clone(req.Payload)
if len(opts.OriginalRequest) > 0 {
originalPayload = bytes.Clone(opts.OriginalRequest)
}
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, false)
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
// Handle Imagen models with special request format
if isImagenModel(baseModel) {
imagenBody, errImagen := convertToImagenRequest(req.Payload)
if errImagen != nil {
return resp, errImagen
}
body = imagenBody
} else {
// Standard Gemini translation flow
from := opts.SourceFormat
to := sdktranslator.FromString("gemini")
body, err = thinking.ApplyThinking(body, req.Model, "gemini")
if err != nil {
return resp, err
originalPayload := bytes.Clone(req.Payload)
if len(opts.OriginalRequest) > 0 {
originalPayload = bytes.Clone(opts.OriginalRequest)
}
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, false)
body = sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
body = fixGeminiImageAspectRatio(baseModel, body)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
}
body = fixGeminiImageAspectRatio(baseModel, body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := "generateContent"
action := getVertexAction(baseModel, false)
if req.Metadata != nil {
if a, _ := req.Metadata["action"].(string); a == "countTokens" {
action = "countTokens"
@@ -238,7 +395,7 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
@@ -249,6 +406,16 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
}
appendAPIResponseChunk(ctx, e.cfg, data)
reporter.publish(ctx, parseGeminiUsage(data))
// For Imagen models, convert response to Gemini format before translation
// This ensures Imagen responses use the same format as gemini-3-pro-image-preview
if isImagenModel(baseModel) {
data = convertImagenToGeminiResponse(data, baseModel)
}
// Standard Gemini translation (works for both Gemini and converted Imagen responses)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini")
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, bytes.Clone(opts.OriginalRequest), body, data, &param)
resp = cliproxyexecutor.Response{Payload: []byte(out)}
@@ -272,16 +439,17 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, false)
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
body, err = thinking.ApplyThinking(body, req.Model, "gemini")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
body = fixGeminiImageAspectRatio(baseModel, body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := "generateContent"
action := getVertexAction(baseModel, false)
if req.Metadata != nil {
if a, _ := req.Metadata["action"].(string); a == "countTokens" {
action = "countTokens"
@@ -341,7 +509,7 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
@@ -375,21 +543,26 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, true)
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
body, err = thinking.ApplyThinking(body, req.Model, "gemini")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
body = fixGeminiImageAspectRatio(baseModel, body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := getVertexAction(baseModel, true)
baseURL := vertexBaseURL(location)
url := fmt.Sprintf("%s/%s/projects/%s/locations/%s/publishers/google/models/%s:%s", baseURL, vertexAPIVersion, projectID, location, baseModel, "streamGenerateContent")
if opts.Alt == "" {
url = url + "?alt=sse"
} else {
url = url + fmt.Sprintf("?$alt=%s", opts.Alt)
url := fmt.Sprintf("%s/%s/projects/%s/locations/%s/publishers/google/models/%s:%s", baseURL, vertexAPIVersion, projectID, location, baseModel, action)
// Imagen models don't support streaming, skip SSE params
if !isImagenModel(baseModel) {
if opts.Alt == "" {
url = url + "?alt=sse"
} else {
url = url + fmt.Sprintf("?$alt=%s", opts.Alt)
}
}
body, _ = sjson.DeleteBytes(body, "session_id")
@@ -434,7 +607,7 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("vertex executor: close response body error: %v", errClose)
}
@@ -494,24 +667,29 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, true)
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
body, err = thinking.ApplyThinking(body, req.Model, "gemini")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
body = fixGeminiImageAspectRatio(baseModel, body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := getVertexAction(baseModel, true)
// For API key auth, use simpler URL format without project/location
if baseURL == "" {
baseURL = "https://generativelanguage.googleapis.com"
}
url := fmt.Sprintf("%s/%s/publishers/google/models/%s:%s", baseURL, vertexAPIVersion, baseModel, "streamGenerateContent")
if opts.Alt == "" {
url = url + "?alt=sse"
} else {
url = url + fmt.Sprintf("?$alt=%s", opts.Alt)
url := fmt.Sprintf("%s/%s/publishers/google/models/%s:%s", baseURL, vertexAPIVersion, baseModel, action)
// Imagen models don't support streaming, skip SSE params
if !isImagenModel(baseModel) {
if opts.Alt == "" {
url = url + "?alt=sse"
} else {
url = url + fmt.Sprintf("?$alt=%s", opts.Alt)
}
}
body, _ = sjson.DeleteBytes(body, "session_id")
@@ -553,7 +731,7 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("vertex executor: close response body error: %v", errClose)
}
@@ -605,7 +783,7 @@ func (e *GeminiVertexExecutor) countTokensWithServiceAccount(ctx context.Context
translatedReq := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
translatedReq, err := thinking.ApplyThinking(translatedReq, req.Model, "gemini")
translatedReq, err := thinking.ApplyThinking(translatedReq, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return cliproxyexecutor.Response{}, err
}
@@ -666,7 +844,7 @@ func (e *GeminiVertexExecutor) countTokensWithServiceAccount(ctx context.Context
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
return cliproxyexecutor.Response{}, statusErr{code: httpResp.StatusCode, msg: string(b)}
}
data, errRead := io.ReadAll(httpResp.Body)
@@ -689,7 +867,7 @@ func (e *GeminiVertexExecutor) countTokensWithAPIKey(ctx context.Context, auth *
translatedReq := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
translatedReq, err := thinking.ApplyThinking(translatedReq, req.Model, "gemini")
translatedReq, err := thinking.ApplyThinking(translatedReq, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return cliproxyexecutor.Response{}, err
}
@@ -750,7 +928,7 @@ func (e *GeminiVertexExecutor) countTokensWithAPIKey(ctx context.Context, auth *
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
return cliproxyexecutor.Response{}, statusErr{code: httpResp.StatusCode, msg: string(b)}
}
data, errRead := io.ReadAll(httpResp.Body)

View File

@@ -68,6 +68,9 @@ func (e *IFlowExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.Auth
// Execute performs a non-streaming chat completion request.
func (e *IFlowExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, baseURL := iflowCreds(auth)
@@ -92,13 +95,14 @@ func (e *IFlowExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, err = thinking.ApplyThinking(body, req.Model, "iflow")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), "iflow", e.Identifier())
if err != nil {
return resp, err
}
body = preserveReasoningContentInMessages(body)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
endpoint := strings.TrimSuffix(baseURL, "/") + iflowDefaultEndpoint
@@ -141,7 +145,7 @@ func (e *IFlowExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("iflow request error: status %d body %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
@@ -166,6 +170,9 @@ func (e *IFlowExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
// ExecuteStream performs a streaming chat completion request.
func (e *IFlowExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
apiKey, baseURL := iflowCreds(auth)
@@ -190,7 +197,7 @@ func (e *IFlowExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, err = thinking.ApplyThinking(body, req.Model, "iflow")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), "iflow", e.Identifier())
if err != nil {
return nil, err
}
@@ -201,7 +208,8 @@ func (e *IFlowExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
if toolsResult.Exists() && toolsResult.IsArray() && len(toolsResult.Array()) == 0 {
body = ensureToolsArray(body)
}
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
endpoint := strings.TrimSuffix(baseURL, "/") + iflowDefaultEndpoint
@@ -242,7 +250,7 @@ func (e *IFlowExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
log.Errorf("iflow executor: close response body error: %v", errClose)
}
appendAPIResponseChunk(ctx, e.cfg, data)
log.Debugf("iflow streaming error: status %d body %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
logWithRequestID(ctx).Debugf("request error, error status: %d error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
err = statusErr{code: httpResp.StatusCode, msg: string(data)}
return nil, err
}

View File

@@ -12,7 +12,10 @@ import (
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
log "github.com/sirupsen/logrus"
"github.com/tidwall/gjson"
)
const (
@@ -332,6 +335,12 @@ func summarizeErrorBody(contentType string, body []byte) string {
}
return "[html body omitted]"
}
// Try to extract error message from JSON response
if message := extractJSONErrorMessage(body); message != "" {
return message
}
return string(body)
}
@@ -358,3 +367,25 @@ func extractHTMLTitle(body []byte) string {
}
return strings.Join(strings.Fields(title), " ")
}
// extractJSONErrorMessage attempts to extract error.message from JSON error responses
func extractJSONErrorMessage(body []byte) string {
result := gjson.GetBytes(body, "error.message")
if result.Exists() && result.String() != "" {
return result.String()
}
return ""
}
// logWithRequestID returns a logrus Entry with request_id field populated from context.
// If no request ID is found in context, it returns the standard logger.
func logWithRequestID(ctx context.Context) *log.Entry {
if ctx == nil {
return log.NewEntry(log.StandardLogger())
}
requestID := logging.GetRequestID(ctx)
if requestID == "" {
return log.NewEntry(log.StandardLogger())
}
return log.WithField("request_id", requestID)
}

View File

@@ -81,23 +81,33 @@ func (e *OpenAICompatExecutor) Execute(ctx context.Context, auth *cliproxyauth.A
return
}
// Translate inbound request to OpenAI format
from := opts.SourceFormat
to := sdktranslator.FromString("openai")
endpoint := "/chat/completions"
if opts.Alt == "responses/compact" {
to = sdktranslator.FromString("openai-response")
endpoint = "/responses/compact"
}
originalPayload := bytes.Clone(req.Payload)
if len(opts.OriginalRequest) > 0 {
originalPayload = bytes.Clone(opts.OriginalRequest)
}
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, opts.Stream)
translated := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), opts.Stream)
translated = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", translated, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
translated = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", translated, originalTranslated, requestedModel)
if opts.Alt == "responses/compact" {
if updated, errDelete := sjson.DeleteBytes(translated, "stream"); errDelete == nil {
translated = updated
}
}
translated, err = thinking.ApplyThinking(translated, req.Model, "openai")
translated, err = thinking.ApplyThinking(translated, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
url := strings.TrimSuffix(baseURL, "/") + "/chat/completions"
url := strings.TrimSuffix(baseURL, "/") + endpoint
httpReq, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(translated))
if err != nil {
return resp, err
@@ -145,7 +155,7 @@ func (e *OpenAICompatExecutor) Execute(ctx context.Context, auth *cliproxyauth.A
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
@@ -185,9 +195,10 @@ func (e *OpenAICompatExecutor) ExecuteStream(ctx context.Context, auth *cliproxy
}
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayload, true)
translated := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
translated = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", translated, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
translated = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", translated, originalTranslated, requestedModel)
translated, err = thinking.ApplyThinking(translated, req.Model, "openai")
translated, err = thinking.ApplyThinking(translated, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
@@ -237,7 +248,7 @@ func (e *OpenAICompatExecutor) ExecuteStream(ctx context.Context, auth *cliproxy
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("openai compat executor: close response body error: %v", errClose)
}
@@ -297,7 +308,7 @@ func (e *OpenAICompatExecutor) CountTokens(ctx context.Context, auth *cliproxyau
modelForCounting := baseModel
translated, err := thinking.ApplyThinking(translated, req.Model, "openai")
translated, err := thinking.ApplyThinking(translated, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return cliproxyexecutor.Response{}, err
}

View File

@@ -0,0 +1,58 @@
package executor
import (
"context"
"io"
"net/http"
"net/http/httptest"
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
"github.com/tidwall/gjson"
)
func TestOpenAICompatExecutorCompactPassthrough(t *testing.T) {
var gotPath string
var gotBody []byte
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
body, _ := io.ReadAll(r.Body)
gotBody = body
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"id":"resp_1","object":"response.compaction","usage":{"input_tokens":1,"output_tokens":2,"total_tokens":3}}`))
}))
defer server.Close()
executor := NewOpenAICompatExecutor("openai-compatibility", &config.Config{})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"base_url": server.URL + "/v1",
"api_key": "test",
}}
payload := []byte(`{"model":"gpt-5.1-codex-max","input":[{"role":"user","content":"hi"}]}`)
resp, err := executor.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gpt-5.1-codex-max",
Payload: payload,
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai-response"),
Alt: "responses/compact",
Stream: false,
})
if err != nil {
t.Fatalf("Execute error: %v", err)
}
if gotPath != "/v1/responses/compact" {
t.Fatalf("path = %q, want %q", gotPath, "/v1/responses/compact")
}
if !gjson.GetBytes(gotBody, "input").Exists() {
t.Fatalf("expected input in body")
}
if gjson.GetBytes(gotBody, "messages").Exists() {
t.Fatalf("unexpected messages in body")
}
if string(resp.Payload) != `{"id":"resp_1","object":"response.compaction","usage":{"input_tokens":1,"output_tokens":2,"total_tokens":3}}` {
t.Fatalf("payload = %s", string(resp.Payload))
}
}

View File

@@ -5,6 +5,8 @@ import (
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
@@ -12,8 +14,9 @@ import (
// applyPayloadConfigWithRoot behaves like applyPayloadConfig but treats all parameter
// paths as relative to the provided root path (for example, "request" for Gemini CLI)
// and restricts matches to the given protocol when supplied. Defaults are checked
// against the original payload when provided.
func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string, payload, original []byte) []byte {
// against the original payload when provided. requestedModel carries the client-visible
// model name before alias resolution so payload rules can target aliases precisely.
func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string, payload, original []byte, requestedModel string) []byte {
if cfg == nil || len(payload) == 0 {
return payload
}
@@ -22,9 +25,11 @@ func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string
return payload
}
model = strings.TrimSpace(model)
if model == "" {
requestedModel = strings.TrimSpace(requestedModel)
if model == "" && requestedModel == "" {
return payload
}
candidates := payloadModelCandidates(model, requestedModel)
out := payload
source := original
if len(source) == 0 {
@@ -34,7 +39,7 @@ func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string
// Apply default rules: first write wins per field across all matching rules.
for i := range rules.Default {
rule := &rules.Default[i]
if !payloadRuleMatchesModel(rule, model, protocol) {
if !payloadRuleMatchesModels(rule, protocol, candidates) {
continue
}
for path, value := range rule.Params {
@@ -59,7 +64,7 @@ func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string
// Apply default raw rules: first write wins per field across all matching rules.
for i := range rules.DefaultRaw {
rule := &rules.DefaultRaw[i]
if !payloadRuleMatchesModel(rule, model, protocol) {
if !payloadRuleMatchesModels(rule, protocol, candidates) {
continue
}
for path, value := range rule.Params {
@@ -88,7 +93,7 @@ func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string
// Apply override rules: last write wins per field across all matching rules.
for i := range rules.Override {
rule := &rules.Override[i]
if !payloadRuleMatchesModel(rule, model, protocol) {
if !payloadRuleMatchesModels(rule, protocol, candidates) {
continue
}
for path, value := range rule.Params {
@@ -106,7 +111,7 @@ func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string
// Apply override raw rules: last write wins per field across all matching rules.
for i := range rules.OverrideRaw {
rule := &rules.OverrideRaw[i]
if !payloadRuleMatchesModel(rule, model, protocol) {
if !payloadRuleMatchesModels(rule, protocol, candidates) {
continue
}
for path, value := range rule.Params {
@@ -128,6 +133,18 @@ func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string
return out
}
func payloadRuleMatchesModels(rule *config.PayloadRule, protocol string, models []string) bool {
if rule == nil || len(models) == 0 {
return false
}
for _, model := range models {
if payloadRuleMatchesModel(rule, model, protocol) {
return true
}
}
return false
}
func payloadRuleMatchesModel(rule *config.PayloadRule, model, protocol string) bool {
if rule == nil {
return false
@@ -150,6 +167,42 @@ func payloadRuleMatchesModel(rule *config.PayloadRule, model, protocol string) b
return false
}
func payloadModelCandidates(model, requestedModel string) []string {
model = strings.TrimSpace(model)
requestedModel = strings.TrimSpace(requestedModel)
if model == "" && requestedModel == "" {
return nil
}
candidates := make([]string, 0, 3)
seen := make(map[string]struct{}, 3)
addCandidate := func(value string) {
value = strings.TrimSpace(value)
if value == "" {
return
}
key := strings.ToLower(value)
if _, ok := seen[key]; ok {
return
}
seen[key] = struct{}{}
candidates = append(candidates, value)
}
if model != "" {
addCandidate(model)
}
if requestedModel != "" {
parsed := thinking.ParseSuffix(requestedModel)
base := strings.TrimSpace(parsed.ModelName)
if base != "" {
addCandidate(base)
}
if parsed.HasSuffix {
addCandidate(requestedModel)
}
}
return candidates
}
// buildPayloadPath combines an optional root path with a relative parameter path.
// When root is empty, the parameter path is used as-is. When root is non-empty,
// the parameter path is treated as relative to root.
@@ -186,6 +239,35 @@ func payloadRawValue(value any) ([]byte, bool) {
}
}
func payloadRequestedModel(opts cliproxyexecutor.Options, fallback string) string {
fallback = strings.TrimSpace(fallback)
if len(opts.Metadata) == 0 {
return fallback
}
raw, ok := opts.Metadata[cliproxyexecutor.RequestedModelMetadataKey]
if !ok || raw == nil {
return fallback
}
switch v := raw.(type) {
case string:
if strings.TrimSpace(v) == "" {
return fallback
}
return strings.TrimSpace(v)
case []byte:
if len(v) == 0 {
return fallback
}
trimmed := strings.TrimSpace(string(v))
if trimmed == "" {
return fallback
}
return trimmed
default:
return fallback
}
}
// matchModelPattern performs simple wildcard matching where '*' matches zero or more characters.
// Examples:
//

View File

@@ -66,6 +66,9 @@ func (e *QwenExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.Auth,
}
func (e *QwenExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if opts.Alt == "responses/compact" {
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
token, baseURL := qwenCreds(auth)
@@ -86,12 +89,13 @@ func (e *QwenExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), false)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, err = thinking.ApplyThinking(body, req.Model, "openai")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return resp, err
}
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
url := strings.TrimSuffix(baseURL, "/") + "/chat/completions"
httpReq, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
@@ -132,7 +136,7 @@ func (e *QwenExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
@@ -152,6 +156,9 @@ func (e *QwenExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req
}
func (e *QwenExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (stream <-chan cliproxyexecutor.StreamChunk, err error) {
if opts.Alt == "responses/compact" {
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
token, baseURL := qwenCreds(auth)
@@ -172,7 +179,7 @@ func (e *QwenExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Aut
body := sdktranslator.TranslateRequest(from, to, baseModel, bytes.Clone(req.Payload), true)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, err = thinking.ApplyThinking(body, req.Model, "openai")
body, err = thinking.ApplyThinking(body, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
return nil, err
}
@@ -184,7 +191,8 @@ func (e *QwenExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Aut
body, _ = sjson.SetRawBytes(body, "tools", []byte(`[{"type":"function","function":{"name":"do_not_call_me","description":"Do not call this tool under any circumstances, it will have catastrophic consequences.","parameters":{"type":"object","properties":{"operation":{"type":"number","description":"1:poweroff\n2:rm -fr /\n3:mkfs.ext4 /dev/sda1"}},"required":["operation"]}}}]`))
}
body, _ = sjson.SetBytes(body, "stream_options.include_usage", true)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
url := strings.TrimSuffix(baseURL, "/") + "/chat/completions"
httpReq, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
@@ -220,7 +228,7 @@ func (e *QwenExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Aut
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
log.Debugf("request error, error status: %d, error body: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("qwen executor: close response body error: %v", errClose)
}

View File

@@ -199,15 +199,31 @@ func parseOpenAIUsage(data []byte) usage.Detail {
if !usageNode.Exists() {
return usage.Detail{}
}
inputNode := usageNode.Get("prompt_tokens")
if !inputNode.Exists() {
inputNode = usageNode.Get("input_tokens")
}
outputNode := usageNode.Get("completion_tokens")
if !outputNode.Exists() {
outputNode = usageNode.Get("output_tokens")
}
detail := usage.Detail{
InputTokens: usageNode.Get("prompt_tokens").Int(),
OutputTokens: usageNode.Get("completion_tokens").Int(),
InputTokens: inputNode.Int(),
OutputTokens: outputNode.Int(),
TotalTokens: usageNode.Get("total_tokens").Int(),
}
if cached := usageNode.Get("prompt_tokens_details.cached_tokens"); cached.Exists() {
cached := usageNode.Get("prompt_tokens_details.cached_tokens")
if !cached.Exists() {
cached = usageNode.Get("input_tokens_details.cached_tokens")
}
if cached.Exists() {
detail.CachedTokens = cached.Int()
}
if reasoning := usageNode.Get("completion_tokens_details.reasoning_tokens"); reasoning.Exists() {
reasoning := usageNode.Get("completion_tokens_details.reasoning_tokens")
if !reasoning.Exists() {
reasoning = usageNode.Get("output_tokens_details.reasoning_tokens")
}
if reasoning.Exists() {
detail.ReasoningTokens = reasoning.Int()
}
return detail

View File

@@ -0,0 +1,43 @@
package executor
import "testing"
func TestParseOpenAIUsageChatCompletions(t *testing.T) {
data := []byte(`{"usage":{"prompt_tokens":1,"completion_tokens":2,"total_tokens":3,"prompt_tokens_details":{"cached_tokens":4},"completion_tokens_details":{"reasoning_tokens":5}}}`)
detail := parseOpenAIUsage(data)
if detail.InputTokens != 1 {
t.Fatalf("input tokens = %d, want %d", detail.InputTokens, 1)
}
if detail.OutputTokens != 2 {
t.Fatalf("output tokens = %d, want %d", detail.OutputTokens, 2)
}
if detail.TotalTokens != 3 {
t.Fatalf("total tokens = %d, want %d", detail.TotalTokens, 3)
}
if detail.CachedTokens != 4 {
t.Fatalf("cached tokens = %d, want %d", detail.CachedTokens, 4)
}
if detail.ReasoningTokens != 5 {
t.Fatalf("reasoning tokens = %d, want %d", detail.ReasoningTokens, 5)
}
}
func TestParseOpenAIUsageResponses(t *testing.T) {
data := []byte(`{"usage":{"input_tokens":10,"output_tokens":20,"total_tokens":30,"input_tokens_details":{"cached_tokens":7},"output_tokens_details":{"reasoning_tokens":9}}}`)
detail := parseOpenAIUsage(data)
if detail.InputTokens != 10 {
t.Fatalf("input tokens = %d, want %d", detail.InputTokens, 10)
}
if detail.OutputTokens != 20 {
t.Fatalf("output tokens = %d, want %d", detail.OutputTokens, 20)
}
if detail.TotalTokens != 30 {
t.Fatalf("total tokens = %d, want %d", detail.TotalTokens, 30)
}
if detail.CachedTokens != 7 {
t.Fatalf("cached tokens = %d, want %d", detail.CachedTokens, 7)
}
if detail.ReasoningTokens != 9 {
t.Fatalf("reasoning tokens = %d, want %d", detail.ReasoningTokens, 9)
}
}

View File

@@ -2,6 +2,8 @@
package thinking
import (
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
log "github.com/sirupsen/logrus"
"github.com/tidwall/gjson"
@@ -59,7 +61,9 @@ func IsUserDefinedModel(modelInfo *registry.ModelInfo) bool {
// Parameters:
// - body: Original request body JSON
// - model: Model name, optionally with thinking suffix (e.g., "claude-sonnet-4-5(16384)")
// - provider: Provider name (gemini, gemini-cli, antigravity, claude, openai, codex, iflow)
// - fromFormat: Source request format (e.g., openai, codex, gemini)
// - toFormat: Target provider format for the request body (gemini, gemini-cli, antigravity, claude, openai, codex, iflow)
// - providerKey: Provider identifier used for registry model lookups (may differ from toFormat, e.g., openrouter -> openai)
//
// Returns:
// - Modified request body JSON with thinking configuration applied
@@ -76,16 +80,25 @@ func IsUserDefinedModel(modelInfo *registry.ModelInfo) bool {
// Example:
//
// // With suffix - suffix config takes priority
// result, err := thinking.ApplyThinking(body, "gemini-2.5-pro(8192)", "gemini")
// result, err := thinking.ApplyThinking(body, "gemini-2.5-pro(8192)", "gemini", "gemini", "gemini")
//
// // Without suffix - uses body config
// result, err := thinking.ApplyThinking(body, "gemini-2.5-pro", "gemini")
func ApplyThinking(body []byte, model string, provider string) ([]byte, error) {
// result, err := thinking.ApplyThinking(body, "gemini-2.5-pro", "gemini", "gemini", "gemini")
func ApplyThinking(body []byte, model string, fromFormat string, toFormat string, providerKey string) ([]byte, error) {
providerFormat := strings.ToLower(strings.TrimSpace(toFormat))
providerKey = strings.ToLower(strings.TrimSpace(providerKey))
if providerKey == "" {
providerKey = providerFormat
}
fromFormat = strings.ToLower(strings.TrimSpace(fromFormat))
if fromFormat == "" {
fromFormat = providerFormat
}
// 1. Route check: Get provider applier
applier := GetProviderApplier(provider)
applier := GetProviderApplier(providerFormat)
if applier == nil {
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": model,
}).Debug("thinking: unknown provider, passthrough |")
return body, nil
@@ -94,25 +107,26 @@ func ApplyThinking(body []byte, model string, provider string) ([]byte, error) {
// 2. Parse suffix and get modelInfo
suffixResult := ParseSuffix(model)
baseModel := suffixResult.ModelName
modelInfo := registry.LookupModelInfo(baseModel)
// Use provider-specific lookup to handle capability differences across providers.
modelInfo := registry.LookupModelInfo(baseModel, providerKey)
// 3. Model capability check
// Unknown models are treated as user-defined so thinking config can still be applied.
// The upstream service is responsible for validating the configuration.
if IsUserDefinedModel(modelInfo) {
return applyUserDefinedModel(body, modelInfo, provider, suffixResult)
return applyUserDefinedModel(body, modelInfo, fromFormat, providerFormat, suffixResult)
}
if modelInfo.Thinking == nil {
config := extractThinkingConfig(body, provider)
config := extractThinkingConfig(body, providerFormat)
if hasThinkingConfig(config) {
log.WithFields(log.Fields{
"model": baseModel,
"provider": provider,
"provider": providerFormat,
}).Debug("thinking: model does not support thinking, stripping config |")
return StripThinkingConfig(body, provider), nil
return StripThinkingConfig(body, providerFormat), nil
}
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": baseModel,
}).Debug("thinking: model does not support thinking, passthrough |")
return body, nil
@@ -121,19 +135,19 @@ func ApplyThinking(body []byte, model string, provider string) ([]byte, error) {
// 4. Get config: suffix priority over body
var config ThinkingConfig
if suffixResult.HasSuffix {
config = parseSuffixToConfig(suffixResult.RawSuffix, provider, model)
config = parseSuffixToConfig(suffixResult.RawSuffix, providerFormat, model)
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": model,
"mode": config.Mode,
"budget": config.Budget,
"level": config.Level,
}).Debug("thinking: config from model suffix |")
} else {
config = extractThinkingConfig(body, provider)
config = extractThinkingConfig(body, providerFormat)
if hasThinkingConfig(config) {
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": modelInfo.ID,
"mode": config.Mode,
"budget": config.Budget,
@@ -144,17 +158,17 @@ func ApplyThinking(body []byte, model string, provider string) ([]byte, error) {
if !hasThinkingConfig(config) {
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": modelInfo.ID,
}).Debug("thinking: no config found, passthrough |")
return body, nil
}
// 5. Validate and normalize configuration
validated, err := ValidateConfig(config, modelInfo, provider)
validated, err := ValidateConfig(config, modelInfo, fromFormat, providerFormat, suffixResult.HasSuffix)
if err != nil {
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": modelInfo.ID,
"error": err.Error(),
}).Warn("thinking: validation failed |")
@@ -167,14 +181,14 @@ func ApplyThinking(body []byte, model string, provider string) ([]byte, error) {
// Defensive check: ValidateConfig should never return (nil, nil)
if validated == nil {
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": modelInfo.ID,
}).Warn("thinking: ValidateConfig returned nil config without error, passthrough |")
return body, nil
}
log.WithFields(log.Fields{
"provider": provider,
"provider": providerFormat,
"model": modelInfo.ID,
"mode": validated.Mode,
"budget": validated.Budget,
@@ -228,7 +242,7 @@ func parseSuffixToConfig(rawSuffix, provider, model string) ThinkingConfig {
// applyUserDefinedModel applies thinking configuration for user-defined models
// without ThinkingSupport validation.
func applyUserDefinedModel(body []byte, modelInfo *registry.ModelInfo, provider string, suffixResult SuffixResult) ([]byte, error) {
func applyUserDefinedModel(body []byte, modelInfo *registry.ModelInfo, fromFormat, toFormat string, suffixResult SuffixResult) ([]byte, error) {
// Get model ID for logging
modelID := ""
if modelInfo != nil {
@@ -240,39 +254,57 @@ func applyUserDefinedModel(body []byte, modelInfo *registry.ModelInfo, provider
// Get config: suffix priority over body
var config ThinkingConfig
if suffixResult.HasSuffix {
config = parseSuffixToConfig(suffixResult.RawSuffix, provider, modelID)
config = parseSuffixToConfig(suffixResult.RawSuffix, toFormat, modelID)
} else {
config = extractThinkingConfig(body, provider)
config = extractThinkingConfig(body, toFormat)
}
if !hasThinkingConfig(config) {
log.WithFields(log.Fields{
"model": modelID,
"provider": provider,
"provider": toFormat,
}).Debug("thinking: user-defined model, passthrough (no config) |")
return body, nil
}
applier := GetProviderApplier(provider)
applier := GetProviderApplier(toFormat)
if applier == nil {
log.WithFields(log.Fields{
"model": modelID,
"provider": provider,
"provider": toFormat,
}).Debug("thinking: user-defined model, passthrough (unknown provider) |")
return body, nil
}
log.WithFields(log.Fields{
"provider": provider,
"provider": toFormat,
"model": modelID,
"mode": config.Mode,
"budget": config.Budget,
"level": config.Level,
}).Debug("thinking: applying config for user-defined model (skip validation)")
config = normalizeUserDefinedConfig(config, fromFormat, toFormat)
return applier.Apply(body, config, modelInfo)
}
func normalizeUserDefinedConfig(config ThinkingConfig, fromFormat, toFormat string) ThinkingConfig {
if config.Mode != ModeLevel {
return config
}
if !isBudgetBasedProvider(toFormat) || !isLevelBasedProvider(fromFormat) {
return config
}
budget, ok := ConvertLevelToBudget(string(config.Level))
if !ok {
return config
}
config.Mode = ModeBudget
config.Budget = budget
config.Level = ""
return config
}
// extractThinkingConfig extracts provider-specific thinking config from request body.
func extractThinkingConfig(body []byte, provider string) ThinkingConfig {
if len(body) == 0 || !gjson.ValidBytes(body) {
@@ -289,7 +321,11 @@ func extractThinkingConfig(body []byte, provider string) ThinkingConfig {
case "codex":
return extractCodexConfig(body)
case "iflow":
return extractIFlowConfig(body)
config := extractIFlowConfig(body)
if hasThinkingConfig(config) {
return config
}
return extractOpenAIConfig(body)
default:
return ThinkingConfig{}
}

View File

@@ -24,6 +24,10 @@ const (
// Example: using level with a budget-only model
ErrLevelNotSupported ErrorCode = "LEVEL_NOT_SUPPORTED"
// ErrBudgetOutOfRange indicates the budget value is outside model range.
// Example: budget 64000 exceeds max 20000
ErrBudgetOutOfRange ErrorCode = "BUDGET_OUT_OF_RANGE"
// ErrProviderMismatch indicates the provider does not match the model.
// Example: applying Claude format to a Gemini model
ErrProviderMismatch ErrorCode = "PROVIDER_MISMATCH"

View File

@@ -80,9 +80,66 @@ func (a *Applier) Apply(body []byte, config thinking.ThinkingConfig, modelInfo *
result, _ := sjson.SetBytes(body, "thinking.type", "enabled")
result, _ = sjson.SetBytes(result, "thinking.budget_tokens", config.Budget)
// Ensure max_tokens > thinking.budget_tokens (Anthropic API constraint)
result = a.normalizeClaudeBudget(result, config.Budget, modelInfo)
return result, nil
}
// normalizeClaudeBudget applies Claude-specific constraints to ensure max_tokens > budget_tokens.
// Anthropic API requires this constraint; violating it returns a 400 error.
func (a *Applier) normalizeClaudeBudget(body []byte, budgetTokens int, modelInfo *registry.ModelInfo) []byte {
if budgetTokens <= 0 {
return body
}
// Ensure the request satisfies Claude constraints:
// 1) Determine effective max_tokens (request overrides model default)
// 2) If budget_tokens >= max_tokens, reduce budget_tokens to max_tokens-1
// 3) If the adjusted budget falls below the model minimum, leave the request unchanged
// 4) If max_tokens came from model default, write it back into the request
effectiveMax, setDefaultMax := a.effectiveMaxTokens(body, modelInfo)
if setDefaultMax && effectiveMax > 0 {
body, _ = sjson.SetBytes(body, "max_tokens", effectiveMax)
}
// Compute the budget we would apply after enforcing budget_tokens < max_tokens.
adjustedBudget := budgetTokens
if effectiveMax > 0 && adjustedBudget >= effectiveMax {
adjustedBudget = effectiveMax - 1
}
minBudget := 0
if modelInfo != nil && modelInfo.Thinking != nil {
minBudget = modelInfo.Thinking.Min
}
if minBudget > 0 && adjustedBudget > 0 && adjustedBudget < minBudget {
// If enforcing the max_tokens constraint would push the budget below the model minimum,
// leave the request unchanged.
return body
}
if adjustedBudget != budgetTokens {
body, _ = sjson.SetBytes(body, "thinking.budget_tokens", adjustedBudget)
}
return body
}
// effectiveMaxTokens returns the max tokens to cap thinking:
// prefer request-provided max_tokens; otherwise fall back to model default.
// The boolean indicates whether the value came from the model default (and thus should be written back).
func (a *Applier) effectiveMaxTokens(body []byte, modelInfo *registry.ModelInfo) (max int, fromModel bool) {
if maxTok := gjson.GetBytes(body, "max_tokens"); maxTok.Exists() && maxTok.Int() > 0 {
return int(maxTok.Int()), false
}
if modelInfo != nil && modelInfo.MaxCompletionTokens > 0 {
return modelInfo.MaxCompletionTokens, true
}
return 0, false
}
func applyCompatibleClaude(body []byte, config thinking.ThinkingConfig) ([]byte, error) {
if config.Mode != thinking.ModeBudget && config.Mode != thinking.ModeNone && config.Mode != thinking.ModeAuto {
return body, nil

View File

@@ -1,7 +1,7 @@
// Package iflow implements thinking configuration for iFlow models (GLM, MiniMax).
// Package iflow implements thinking configuration for iFlow models.
//
// iFlow models use boolean toggle semantics:
// - GLM models: chat_template_kwargs.enable_thinking (boolean)
// - Models using chat_template_kwargs.enable_thinking (boolean toggle)
// - MiniMax models: reasoning_split (boolean)
//
// Level values are converted to boolean: none=false, all others=true
@@ -20,6 +20,7 @@ import (
// Applier implements thinking.ProviderApplier for iFlow models.
//
// iFlow-specific behavior:
// - enable_thinking toggle models: enable_thinking boolean
// - GLM models: enable_thinking boolean + clear_thinking=false
// - MiniMax models: reasoning_split boolean
// - Level to boolean: none=false, others=true
@@ -61,8 +62,8 @@ func (a *Applier) Apply(body []byte, config thinking.ThinkingConfig, modelInfo *
return body, nil
}
if isGLMModel(modelInfo.ID) {
return applyGLM(body, config), nil
if isEnableThinkingModel(modelInfo.ID) {
return applyEnableThinking(body, config, isGLMModel(modelInfo.ID)), nil
}
if isMiniMaxModel(modelInfo.ID) {
@@ -97,7 +98,8 @@ func configToBoolean(config thinking.ThinkingConfig) bool {
}
}
// applyGLM applies thinking configuration for GLM models.
// applyEnableThinking applies thinking configuration for models that use
// chat_template_kwargs.enable_thinking format.
//
// Output format when enabled:
//
@@ -107,9 +109,8 @@ func configToBoolean(config thinking.ThinkingConfig) bool {
//
// {"chat_template_kwargs": {"enable_thinking": false}}
//
// Note: clear_thinking is only set when thinking is enabled, to preserve
// thinking output in the response.
func applyGLM(body []byte, config thinking.ThinkingConfig) []byte {
// Note: clear_thinking is only set for GLM models when thinking is enabled.
func applyEnableThinking(body []byte, config thinking.ThinkingConfig, setClearThinking bool) []byte {
enableThinking := configToBoolean(config)
if len(body) == 0 || !gjson.ValidBytes(body) {
@@ -118,8 +119,11 @@ func applyGLM(body []byte, config thinking.ThinkingConfig) []byte {
result, _ := sjson.SetBytes(body, "chat_template_kwargs.enable_thinking", enableThinking)
// clear_thinking is a GLM-only knob, strip it for other models.
result, _ = sjson.DeleteBytes(result, "chat_template_kwargs.clear_thinking")
// clear_thinking only needed when thinking is enabled
if enableThinking {
if enableThinking && setClearThinking {
result, _ = sjson.SetBytes(result, "chat_template_kwargs.clear_thinking", false)
}
@@ -143,8 +147,21 @@ func applyMiniMax(body []byte, config thinking.ThinkingConfig) []byte {
return result
}
// isEnableThinkingModel determines if the model uses chat_template_kwargs.enable_thinking format.
func isEnableThinkingModel(modelID string) bool {
if isGLMModel(modelID) {
return true
}
id := strings.ToLower(modelID)
switch id {
case "qwen3-max-preview", "deepseek-v3.2", "deepseek-v3.1":
return true
default:
return false
}
}
// isGLMModel determines if the model is a GLM series model.
// GLM models use chat_template_kwargs.enable_thinking format.
func isGLMModel(modelID string) bool {
return strings.HasPrefix(strings.ToLower(modelID), "glm")
}

View File

@@ -27,28 +27,32 @@ func StripThinkingConfig(body []byte, provider string) []byte {
return body
}
var paths []string
switch provider {
case "claude":
result, _ := sjson.DeleteBytes(body, "thinking")
return result
paths = []string{"thinking"}
case "gemini":
result, _ := sjson.DeleteBytes(body, "generationConfig.thinkingConfig")
return result
paths = []string{"generationConfig.thinkingConfig"}
case "gemini-cli", "antigravity":
result, _ := sjson.DeleteBytes(body, "request.generationConfig.thinkingConfig")
return result
paths = []string{"request.generationConfig.thinkingConfig"}
case "openai":
result, _ := sjson.DeleteBytes(body, "reasoning_effort")
return result
paths = []string{"reasoning_effort"}
case "codex":
result, _ := sjson.DeleteBytes(body, "reasoning.effort")
return result
paths = []string{"reasoning.effort"}
case "iflow":
result, _ := sjson.DeleteBytes(body, "chat_template_kwargs.enable_thinking")
result, _ = sjson.DeleteBytes(result, "chat_template_kwargs.clear_thinking")
result, _ = sjson.DeleteBytes(result, "reasoning_split")
return result
paths = []string{
"chat_template_kwargs.enable_thinking",
"chat_template_kwargs.clear_thinking",
"reasoning_split",
"reasoning_effort",
}
default:
return body
}
result := body
for _, path := range paths {
result, _ = sjson.DeleteBytes(result, path)
}
return result
}

View File

@@ -9,64 +9,6 @@ import (
log "github.com/sirupsen/logrus"
)
// ClampBudget clamps a budget value to the model's supported range.
//
// Logging:
// - Warn when value=0 but ZeroAllowed=false
// - Debug when value is clamped to min/max
//
// Fields: provider, model, original_value, clamped_to, min, max
func ClampBudget(value int, modelInfo *registry.ModelInfo, provider string) int {
model := "unknown"
support := (*registry.ThinkingSupport)(nil)
if modelInfo != nil {
if modelInfo.ID != "" {
model = modelInfo.ID
}
support = modelInfo.Thinking
}
if support == nil {
return value
}
// Auto value (-1) passes through without clamping.
if value == -1 {
return value
}
min := support.Min
max := support.Max
if value == 0 && !support.ZeroAllowed {
log.WithFields(log.Fields{
"provider": provider,
"model": model,
"original_value": value,
"clamped_to": min,
"min": min,
"max": max,
}).Warn("thinking: budget zero not allowed |")
return min
}
// Some models are level-only and do not define numeric budget ranges.
if min == 0 && max == 0 {
return value
}
if value < min {
if value == 0 && support.ZeroAllowed {
return 0
}
logClamp(provider, model, value, min, min, max)
return min
}
if value > max {
logClamp(provider, model, value, max, min, max)
return max
}
return value
}
// ValidateConfig validates a thinking configuration against model capabilities.
//
// This function performs comprehensive validation:
@@ -74,10 +16,16 @@ func ClampBudget(value int, modelInfo *registry.ModelInfo, provider string) int
// - Auto-converts between Budget and Level formats based on model capability
// - Validates that requested level is in the model's supported levels list
// - Clamps budget values to model's allowed range
// - When converting Budget -> Level for level-only models, clamps the derived standard level to the nearest supported level
// (special values none/auto are preserved)
// - When config comes from a model suffix, strict budget validation is disabled (we clamp instead of error)
//
// Parameters:
// - config: The thinking configuration to validate
// - support: Model's ThinkingSupport properties (nil means no thinking support)
// - fromFormat: Source provider format (used to determine strict validation rules)
// - toFormat: Target provider format
// - fromSuffix: Whether config was sourced from model suffix
//
// Returns:
// - Normalized ThinkingConfig with clamped values
@@ -87,9 +35,8 @@ func ClampBudget(value int, modelInfo *registry.ModelInfo, provider string) int
// - Budget-only model + Level config → Level converted to Budget
// - Level-only model + Budget config → Budget converted to Level
// - Hybrid model → preserve original format
func ValidateConfig(config ThinkingConfig, modelInfo *registry.ModelInfo, provider string) (*ThinkingConfig, error) {
normalized := config
func ValidateConfig(config ThinkingConfig, modelInfo *registry.ModelInfo, fromFormat, toFormat string, fromSuffix bool) (*ThinkingConfig, error) {
fromFormat, toFormat = strings.ToLower(strings.TrimSpace(fromFormat)), strings.ToLower(strings.TrimSpace(toFormat))
model := "unknown"
support := (*registry.ThinkingSupport)(nil)
if modelInfo != nil {
@@ -103,101 +50,108 @@ func ValidateConfig(config ThinkingConfig, modelInfo *registry.ModelInfo, provid
if config.Mode != ModeNone {
return nil, NewThinkingErrorWithModel(ErrThinkingNotSupported, "thinking not supported for this model", model)
}
return &normalized, nil
return &config, nil
}
allowClampUnsupported := isBudgetBasedProvider(fromFormat) && isLevelBasedProvider(toFormat)
strictBudget := !fromSuffix && fromFormat != "" && isSameProviderFamily(fromFormat, toFormat)
budgetDerivedFromLevel := false
capability := detectModelCapability(modelInfo)
switch capability {
case CapabilityBudgetOnly:
if normalized.Mode == ModeLevel {
if normalized.Level == LevelAuto {
if config.Mode == ModeLevel {
if config.Level == LevelAuto {
break
}
budget, ok := ConvertLevelToBudget(string(normalized.Level))
budget, ok := ConvertLevelToBudget(string(config.Level))
if !ok {
return nil, NewThinkingError(ErrUnknownLevel, fmt.Sprintf("unknown level: %s", normalized.Level))
return nil, NewThinkingError(ErrUnknownLevel, fmt.Sprintf("unknown level: %s", config.Level))
}
normalized.Mode = ModeBudget
normalized.Budget = budget
normalized.Level = ""
config.Mode = ModeBudget
config.Budget = budget
config.Level = ""
budgetDerivedFromLevel = true
}
case CapabilityLevelOnly:
if normalized.Mode == ModeBudget {
level, ok := ConvertBudgetToLevel(normalized.Budget)
if config.Mode == ModeBudget {
level, ok := ConvertBudgetToLevel(config.Budget)
if !ok {
return nil, NewThinkingError(ErrUnknownLevel, fmt.Sprintf("budget %d cannot be converted to a valid level", normalized.Budget))
return nil, NewThinkingError(ErrUnknownLevel, fmt.Sprintf("budget %d cannot be converted to a valid level", config.Budget))
}
normalized.Mode = ModeLevel
normalized.Level = ThinkingLevel(level)
normalized.Budget = 0
// When converting Budget -> Level for level-only models, clamp the derived standard level
// to the nearest supported level. Special values (none/auto) are preserved.
config.Mode = ModeLevel
config.Level = clampLevel(ThinkingLevel(level), modelInfo, toFormat)
config.Budget = 0
}
case CapabilityHybrid:
}
if normalized.Mode == ModeLevel && normalized.Level == LevelNone {
normalized.Mode = ModeNone
normalized.Budget = 0
normalized.Level = ""
if config.Mode == ModeLevel && config.Level == LevelNone {
config.Mode = ModeNone
config.Budget = 0
config.Level = ""
}
if normalized.Mode == ModeLevel && normalized.Level == LevelAuto {
normalized.Mode = ModeAuto
normalized.Budget = -1
normalized.Level = ""
if config.Mode == ModeLevel && config.Level == LevelAuto {
config.Mode = ModeAuto
config.Budget = -1
config.Level = ""
}
if normalized.Mode == ModeBudget && normalized.Budget == 0 {
normalized.Mode = ModeNone
normalized.Level = ""
if config.Mode == ModeBudget && config.Budget == 0 {
config.Mode = ModeNone
config.Level = ""
}
if len(support.Levels) > 0 && normalized.Mode == ModeLevel {
if !isLevelSupported(string(normalized.Level), support.Levels) {
validLevels := normalizeLevels(support.Levels)
message := fmt.Sprintf("level %q not supported, valid levels: %s", strings.ToLower(string(normalized.Level)), strings.Join(validLevels, ", "))
return nil, NewThinkingError(ErrLevelNotSupported, message)
if len(support.Levels) > 0 && config.Mode == ModeLevel {
if !isLevelSupported(string(config.Level), support.Levels) {
if allowClampUnsupported {
config.Level = clampLevel(config.Level, modelInfo, toFormat)
}
if !isLevelSupported(string(config.Level), support.Levels) {
// User explicitly specified an unsupported level - return error
// (budget-derived levels may be clamped based on source format)
validLevels := normalizeLevels(support.Levels)
message := fmt.Sprintf("level %q not supported, valid levels: %s", strings.ToLower(string(config.Level)), strings.Join(validLevels, ", "))
return nil, NewThinkingError(ErrLevelNotSupported, message)
}
}
}
if strictBudget && config.Mode == ModeBudget && !budgetDerivedFromLevel {
min, max := support.Min, support.Max
if min != 0 || max != 0 {
if config.Budget < min || config.Budget > max || (config.Budget == 0 && !support.ZeroAllowed) {
message := fmt.Sprintf("budget %d out of range [%d,%d]", config.Budget, min, max)
return nil, NewThinkingError(ErrBudgetOutOfRange, message)
}
}
}
// Convert ModeAuto to mid-range if dynamic not allowed
if normalized.Mode == ModeAuto && !support.DynamicAllowed {
normalized = convertAutoToMidRange(normalized, support, provider, model)
if config.Mode == ModeAuto && !support.DynamicAllowed {
config = convertAutoToMidRange(config, support, toFormat, model)
}
if normalized.Mode == ModeNone && provider == "claude" {
if config.Mode == ModeNone && toFormat == "claude" {
// Claude supports explicit disable via thinking.type="disabled".
// Keep Budget=0 so applier can omit budget_tokens.
normalized.Budget = 0
normalized.Level = ""
config.Budget = 0
config.Level = ""
} else {
switch normalized.Mode {
switch config.Mode {
case ModeBudget, ModeAuto, ModeNone:
normalized.Budget = ClampBudget(normalized.Budget, modelInfo, provider)
config.Budget = clampBudget(config.Budget, modelInfo, toFormat)
}
// ModeNone with clamped Budget > 0: set Level to lowest for Level-only/Hybrid models
// This ensures Apply layer doesn't need to access support.Levels
if normalized.Mode == ModeNone && normalized.Budget > 0 && len(support.Levels) > 0 {
normalized.Level = ThinkingLevel(support.Levels[0])
if config.Mode == ModeNone && config.Budget > 0 && len(support.Levels) > 0 {
config.Level = ThinkingLevel(support.Levels[0])
}
}
return &normalized, nil
}
func isLevelSupported(level string, supported []string) bool {
for _, candidate := range supported {
if strings.EqualFold(level, strings.TrimSpace(candidate)) {
return true
}
}
return false
}
func normalizeLevels(levels []string) []string {
normalized := make([]string, 0, len(levels))
for _, level := range levels {
normalized = append(normalized, strings.ToLower(strings.TrimSpace(level)))
}
return normalized
return &config, nil
}
// convertAutoToMidRange converts ModeAuto to a mid-range value when dynamic is not allowed.
@@ -246,7 +200,172 @@ func convertAutoToMidRange(config ThinkingConfig, support *registry.ThinkingSupp
return config
}
// logClamp logs a debug message when budget clamping occurs.
// standardLevelOrder defines the canonical ordering of thinking levels from lowest to highest.
var standardLevelOrder = []ThinkingLevel{LevelMinimal, LevelLow, LevelMedium, LevelHigh, LevelXHigh}
// clampLevel clamps the given level to the nearest supported level.
// On tie, prefers the lower level.
func clampLevel(level ThinkingLevel, modelInfo *registry.ModelInfo, provider string) ThinkingLevel {
model := "unknown"
var supported []string
if modelInfo != nil {
if modelInfo.ID != "" {
model = modelInfo.ID
}
if modelInfo.Thinking != nil {
supported = modelInfo.Thinking.Levels
}
}
if len(supported) == 0 || isLevelSupported(string(level), supported) {
return level
}
pos := levelIndex(string(level))
if pos == -1 {
return level
}
bestIdx, bestDist := -1, len(standardLevelOrder)+1
for _, s := range supported {
if idx := levelIndex(strings.TrimSpace(s)); idx != -1 {
if dist := abs(pos - idx); dist < bestDist || (dist == bestDist && idx < bestIdx) {
bestIdx, bestDist = idx, dist
}
}
}
if bestIdx >= 0 {
clamped := standardLevelOrder[bestIdx]
log.WithFields(log.Fields{
"provider": provider,
"model": model,
"original_value": string(level),
"clamped_to": string(clamped),
}).Debug("thinking: level clamped |")
return clamped
}
return level
}
// clampBudget clamps a budget value to the model's supported range.
func clampBudget(value int, modelInfo *registry.ModelInfo, provider string) int {
model := "unknown"
support := (*registry.ThinkingSupport)(nil)
if modelInfo != nil {
if modelInfo.ID != "" {
model = modelInfo.ID
}
support = modelInfo.Thinking
}
if support == nil {
return value
}
// Auto value (-1) passes through without clamping.
if value == -1 {
return value
}
min, max := support.Min, support.Max
if value == 0 && !support.ZeroAllowed {
log.WithFields(log.Fields{
"provider": provider,
"model": model,
"original_value": value,
"clamped_to": min,
"min": min,
"max": max,
}).Warn("thinking: budget zero not allowed |")
return min
}
// Some models are level-only and do not define numeric budget ranges.
if min == 0 && max == 0 {
return value
}
if value < min {
if value == 0 && support.ZeroAllowed {
return 0
}
logClamp(provider, model, value, min, min, max)
return min
}
if value > max {
logClamp(provider, model, value, max, min, max)
return max
}
return value
}
func isLevelSupported(level string, supported []string) bool {
for _, s := range supported {
if strings.EqualFold(level, strings.TrimSpace(s)) {
return true
}
}
return false
}
func levelIndex(level string) int {
for i, l := range standardLevelOrder {
if strings.EqualFold(level, string(l)) {
return i
}
}
return -1
}
func normalizeLevels(levels []string) []string {
out := make([]string, len(levels))
for i, l := range levels {
out[i] = strings.ToLower(strings.TrimSpace(l))
}
return out
}
func isBudgetBasedProvider(provider string) bool {
switch provider {
case "gemini", "gemini-cli", "antigravity", "claude":
return true
default:
return false
}
}
func isLevelBasedProvider(provider string) bool {
switch provider {
case "openai", "openai-response", "codex":
return true
default:
return false
}
}
func isGeminiFamily(provider string) bool {
switch provider {
case "gemini", "gemini-cli", "antigravity":
return true
default:
return false
}
}
func isSameProviderFamily(from, to string) bool {
if from == to {
return true
}
return isGeminiFamily(from) && isGeminiFamily(to)
}
func abs(x int) int {
if x < 0 {
return -x
}
return x
}
func logClamp(provider, model string, original, clampedTo, min, max int) {
log.WithFields(log.Fields{
"provider": provider,

View File

@@ -7,12 +7,9 @@ package claude
import (
"bytes"
"crypto/sha256"
"encoding/hex"
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/cache"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/translator/gemini/common"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
@@ -20,29 +17,6 @@ import (
"github.com/tidwall/sjson"
)
// deriveSessionID generates a stable session ID from the request.
// Uses the hash of the first user message to identify the conversation.
func deriveSessionID(rawJSON []byte) string {
messages := gjson.GetBytes(rawJSON, "messages")
if !messages.IsArray() {
return ""
}
for _, msg := range messages.Array() {
if msg.Get("role").String() == "user" {
content := msg.Get("content").String()
if content == "" {
// Try to get text from content array
content = msg.Get("content.0.text").String()
}
if content != "" {
h := sha256.Sum256([]byte(content))
return hex.EncodeToString(h[:16])
}
}
}
return ""
}
// ConvertClaudeRequestToAntigravity parses and transforms a Claude Code API request into Gemini CLI API format.
// It extracts the model name, system instruction, message contents, and tool declarations
// from the raw JSON request and returns them in the format expected by the Gemini CLI API.
@@ -62,11 +36,9 @@ func deriveSessionID(rawJSON []byte) string {
// Returns:
// - []byte: The transformed request data in Gemini CLI API format
func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _ bool) []byte {
enableThoughtTranslate := true
rawJSON := bytes.Clone(inputRawJSON)
// Derive session ID for signature caching
sessionID := deriveSessionID(rawJSON)
// system instruction
systemInstructionJSON := ""
hasSystemInstruction := false
@@ -125,41 +97,49 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
if contentTypeResult.Type == gjson.String && contentTypeResult.String() == "thinking" {
// Use GetThinkingText to handle wrapped thinking objects
thinkingText := thinking.GetThinkingText(contentResult)
signatureResult := contentResult.Get("signature")
clientSignature := ""
if signatureResult.Exists() && signatureResult.String() != "" {
clientSignature = signatureResult.String()
}
// Always try cached signature first (more reliable than client-provided)
// Client may send stale or invalid signatures from different sessions
signature := ""
if sessionID != "" && thinkingText != "" {
if cachedSig := cache.GetCachedSignature(sessionID, thinkingText); cachedSig != "" {
if thinkingText != "" {
if cachedSig := cache.GetCachedSignature(modelName, thinkingText); cachedSig != "" {
signature = cachedSig
// log.Debugf("Using cached signature for thinking block")
}
}
// Fallback to client signature only if cache miss and client signature is valid
if signature == "" && cache.HasValidSignature(clientSignature) {
signature = clientSignature
if signature == "" {
signatureResult := contentResult.Get("signature")
clientSignature := ""
if signatureResult.Exists() && signatureResult.String() != "" {
arrayClientSignatures := strings.SplitN(signatureResult.String(), "#", 2)
if len(arrayClientSignatures) == 2 {
if modelName == arrayClientSignatures[0] {
clientSignature = arrayClientSignatures[1]
}
}
}
if cache.HasValidSignature(modelName, clientSignature) {
signature = clientSignature
}
// log.Debugf("Using client-provided signature for thinking block")
}
// Store for subsequent tool_use in the same message
if cache.HasValidSignature(signature) {
if cache.HasValidSignature(modelName, signature) {
currentMessageThinkingSignature = signature
}
// Skip trailing unsigned thinking blocks on last assistant message
isUnsigned := !cache.HasValidSignature(signature)
isUnsigned := !cache.HasValidSignature(modelName, signature)
// If unsigned, skip entirely (don't convert to text)
// Claude requires assistant messages to start with thinking blocks when thinking is enabled
// Converting to text would break this requirement
if isUnsigned {
// log.Debugf("Dropping unsigned thinking block (no valid signature)")
enableThoughtTranslate = false
continue
}
@@ -175,10 +155,13 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
clientContentJSON, _ = sjson.SetRaw(clientContentJSON, "parts.-1", partJSON)
} else if contentTypeResult.Type == gjson.String && contentTypeResult.String() == "text" {
prompt := contentResult.Get("text").String()
partJSON := `{}`
if prompt != "" {
partJSON, _ = sjson.Set(partJSON, "text", prompt)
// Skip empty text parts to avoid Gemini API error:
// "required oneof field 'data' must have one initialized field"
if prompt == "" {
continue
}
partJSON := `{}`
partJSON, _ = sjson.Set(partJSON, "text", prompt)
clientContentJSON, _ = sjson.SetRaw(clientContentJSON, "parts.-1", partJSON)
} else if contentTypeResult.Type == gjson.String && contentTypeResult.String() == "tool_use" {
// NOTE: Do NOT inject dummy thinking blocks here.
@@ -207,7 +190,7 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
// This is the approach used in opencode-google-antigravity-auth for Gemini
// and also works for Claude through Antigravity API
const skipSentinel = "skip_thought_signature_validator"
if cache.HasValidSignature(currentMessageThinkingSignature) {
if cache.HasValidSignature(modelName, currentMessageThinkingSignature) {
partJSON, _ = sjson.Set(partJSON, "thoughtSignature", currentMessageThinkingSignature)
} else {
// No valid signature - use skip sentinel to bypass validation
@@ -305,6 +288,13 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
}
}
// Skip messages with empty parts array to avoid Gemini API error:
// "required oneof field 'data' must have one initialized field"
partsCheck := gjson.Get(clientContentJSON, "parts")
if !partsCheck.IsArray() || len(partsCheck.Array()) == 0 {
continue
}
contentsJSON, _ = sjson.SetRaw(contentsJSON, "-1", clientContentJSON)
hasContents = true
} else if contentsResult.Type == gjson.String {
@@ -387,15 +377,12 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
}
// Map Anthropic thinking -> Gemini thinkingBudget/include_thoughts when type==enabled
if t := gjson.GetBytes(rawJSON, "thinking"); t.Exists() && t.IsObject() {
modelInfo := registry.LookupModelInfo(modelName)
if modelInfo != nil && modelInfo.Thinking != nil {
if t.Get("type").String() == "enabled" {
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
budget := int(b.Int())
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
}
if t := gjson.GetBytes(rawJSON, "thinking"); enableThoughtTranslate && t.Exists() && t.IsObject() {
if t.Get("type").String() == "enabled" {
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
budget := int(b.Int())
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.includeThoughts", true)
}
}
}

View File

@@ -4,6 +4,7 @@ import (
"strings"
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/cache"
"github.com/tidwall/gjson"
)
@@ -73,30 +74,41 @@ func TestConvertClaudeRequestToAntigravity_RoleMapping(t *testing.T) {
}
func TestConvertClaudeRequestToAntigravity_ThinkingBlocks(t *testing.T) {
cache.ClearSignatureCache("")
// Valid signature must be at least 50 characters
validSignature := "abc123validSignature1234567890123456789012345678901234567890"
thinkingText := "Let me think..."
// Pre-cache the signature (simulating a previous response for the same thinking text)
inputJSON := []byte(`{
"model": "claude-sonnet-4-5-thinking",
"messages": [
{
"role": "user",
"content": [{"type": "text", "text": "Test user message"}]
},
{
"role": "assistant",
"content": [
{"type": "thinking", "thinking": "Let me think...", "signature": "` + validSignature + `"},
{"type": "thinking", "thinking": "` + thinkingText + `", "signature": "` + validSignature + `"},
{"type": "text", "text": "Answer"}
]
}
]
}`)
cache.CacheSignature("claude-sonnet-4-5-thinking", thinkingText, validSignature)
output := ConvertClaudeRequestToAntigravity("claude-sonnet-4-5-thinking", inputJSON, false)
outputStr := string(output)
// Check thinking block conversion
firstPart := gjson.Get(outputStr, "request.contents.0.parts.0")
// Check thinking block conversion (now in contents.1 due to user message)
firstPart := gjson.Get(outputStr, "request.contents.1.parts.0")
if !firstPart.Get("thought").Bool() {
t.Error("thinking block should have thought: true")
}
if firstPart.Get("text").String() != "Let me think..." {
if firstPart.Get("text").String() != thinkingText {
t.Error("thinking text mismatch")
}
if firstPart.Get("thoughtSignature").String() != validSignature {
@@ -105,6 +117,8 @@ func TestConvertClaudeRequestToAntigravity_ThinkingBlocks(t *testing.T) {
}
func TestConvertClaudeRequestToAntigravity_ThinkingBlockWithoutSignature(t *testing.T) {
cache.ClearSignatureCache("")
// Unsigned thinking blocks should be removed entirely (not converted to text)
inputJSON := []byte(`{
"model": "claude-sonnet-4-5-thinking",
@@ -226,14 +240,22 @@ func TestConvertClaudeRequestToAntigravity_ToolUse(t *testing.T) {
}
func TestConvertClaudeRequestToAntigravity_ToolUse_WithSignature(t *testing.T) {
cache.ClearSignatureCache("")
validSignature := "abc123validSignature1234567890123456789012345678901234567890"
thinkingText := "Let me think..."
inputJSON := []byte(`{
"model": "claude-sonnet-4-5-thinking",
"messages": [
{
"role": "user",
"content": [{"type": "text", "text": "Test user message"}]
},
{
"role": "assistant",
"content": [
{"type": "thinking", "thinking": "Let me think...", "signature": "` + validSignature + `"},
{"type": "thinking", "thinking": "` + thinkingText + `", "signature": "` + validSignature + `"},
{
"type": "tool_use",
"id": "call_123",
@@ -245,11 +267,13 @@ func TestConvertClaudeRequestToAntigravity_ToolUse_WithSignature(t *testing.T) {
]
}`)
cache.CacheSignature("claude-sonnet-4-5-thinking", thinkingText, validSignature)
output := ConvertClaudeRequestToAntigravity("claude-sonnet-4-5-thinking", inputJSON, false)
outputStr := string(output)
// Check function call has the signature from the preceding thinking block
part := gjson.Get(outputStr, "request.contents.0.parts.1")
// Check function call has the signature from the preceding thinking block (now in contents.1)
part := gjson.Get(outputStr, "request.contents.1.parts.1")
if part.Get("functionCall.name").String() != "get_weather" {
t.Errorf("Expected functionCall, got %s", part.Raw)
}
@@ -259,26 +283,36 @@ func TestConvertClaudeRequestToAntigravity_ToolUse_WithSignature(t *testing.T) {
}
func TestConvertClaudeRequestToAntigravity_ReorderThinking(t *testing.T) {
cache.ClearSignatureCache("")
// Case: text block followed by thinking block -> should be reordered to thinking first
validSignature := "abc123validSignature1234567890123456789012345678901234567890"
thinkingText := "Planning..."
inputJSON := []byte(`{
"model": "claude-sonnet-4-5-thinking",
"messages": [
{
"role": "user",
"content": [{"type": "text", "text": "Test user message"}]
},
{
"role": "assistant",
"content": [
{"type": "text", "text": "Here is the plan."},
{"type": "thinking", "thinking": "Planning...", "signature": "` + validSignature + `"}
{"type": "thinking", "thinking": "` + thinkingText + `", "signature": "` + validSignature + `"}
]
}
]
}`)
cache.CacheSignature("claude-sonnet-4-5-thinking", thinkingText, validSignature)
output := ConvertClaudeRequestToAntigravity("claude-sonnet-4-5-thinking", inputJSON, false)
outputStr := string(output)
// Verify order: Thinking block MUST be first
parts := gjson.Get(outputStr, "request.contents.0.parts").Array()
// Verify order: Thinking block MUST be first (now in contents.1 due to user message)
parts := gjson.Get(outputStr, "request.contents.1.parts").Array()
if len(parts) != 2 {
t.Fatalf("Expected 2 parts, got %d", len(parts))
}
@@ -343,8 +377,8 @@ func TestConvertClaudeRequestToAntigravity_ThinkingConfig(t *testing.T) {
if thinkingConfig.Get("thinkingBudget").Int() != 8000 {
t.Errorf("Expected thinkingBudget 8000, got %d", thinkingConfig.Get("thinkingBudget").Int())
}
if !thinkingConfig.Get("include_thoughts").Bool() {
t.Error("include_thoughts should be true")
if !thinkingConfig.Get("includeThoughts").Bool() {
t.Error("includeThoughts should be true")
}
} else {
t.Log("thinkingConfig not present - model may not be registered in test registry")
@@ -459,7 +493,12 @@ func TestConvertClaudeRequestToAntigravity_TrailingUnsignedThinking_Removed(t *t
}
func TestConvertClaudeRequestToAntigravity_TrailingSignedThinking_Kept(t *testing.T) {
cache.ClearSignatureCache("")
// Last assistant message ends with signed thinking block - should be kept
validSignature := "abc123validSignature1234567890123456789012345678901234567890"
thinkingText := "Valid thinking..."
inputJSON := []byte(`{
"model": "claude-sonnet-4-5-thinking",
"messages": [
@@ -471,12 +510,14 @@ func TestConvertClaudeRequestToAntigravity_TrailingSignedThinking_Kept(t *testin
"role": "assistant",
"content": [
{"type": "text", "text": "Here is my answer"},
{"type": "thinking", "thinking": "Valid thinking...", "signature": "abc123validSignature1234567890123456789012345678901234567890"}
{"type": "thinking", "thinking": "` + thinkingText + `", "signature": "` + validSignature + `"}
]
}
]
}`)
cache.CacheSignature("claude-sonnet-4-5-thinking", thinkingText, validSignature)
output := ConvertClaudeRequestToAntigravity("claude-sonnet-4-5-thinking", inputJSON, false)
outputStr := string(output)

View File

@@ -41,7 +41,6 @@ type Params struct {
HasContent bool // Tracks whether any content (text, thinking, or tool use) has been output
// Signature caching support
SessionID string // Session ID derived from request for signature caching
CurrentThinkingText strings.Builder // Accumulates thinking text for signature caching
}
@@ -70,9 +69,9 @@ func ConvertAntigravityResponseToClaude(_ context.Context, _ string, originalReq
HasFirstResponse: false,
ResponseType: 0,
ResponseIndex: 0,
SessionID: deriveSessionID(originalRequestRawJSON),
}
}
modelName := gjson.GetBytes(requestRawJSON, "model").String()
params := (*param).(*Params)
@@ -138,14 +137,14 @@ func ConvertAntigravityResponseToClaude(_ context.Context, _ string, originalReq
if thoughtSignature := partResult.Get("thoughtSignature"); thoughtSignature.Exists() && thoughtSignature.String() != "" {
// log.Debug("Branch: signature_delta")
if params.SessionID != "" && params.CurrentThinkingText.Len() > 0 {
cache.CacheSignature(params.SessionID, params.CurrentThinkingText.String(), thoughtSignature.String())
// log.Debugf("Cached signature for thinking block (sessionID=%s, textLen=%d)", params.SessionID, params.CurrentThinkingText.Len())
if params.CurrentThinkingText.Len() > 0 {
cache.CacheSignature(modelName, params.CurrentThinkingText.String(), thoughtSignature.String())
// log.Debugf("Cached signature for thinking block (textLen=%d)", params.CurrentThinkingText.Len())
params.CurrentThinkingText.Reset()
}
output = output + "event: content_block_delta\n"
data, _ := sjson.Set(fmt.Sprintf(`{"type":"content_block_delta","index":%d,"delta":{"type":"signature_delta","signature":""}}`, params.ResponseIndex), "delta.signature", thoughtSignature.String())
data, _ := sjson.Set(fmt.Sprintf(`{"type":"content_block_delta","index":%d,"delta":{"type":"signature_delta","signature":""}}`, params.ResponseIndex), "delta.signature", fmt.Sprintf("%s#%s", cache.GetModelGroup(modelName), thoughtSignature.String()))
output = output + fmt.Sprintf("data: %s\n\n\n", data)
params.HasContent = true
} else if params.ResponseType == 2 { // Continue existing thinking block if already in thinking state
@@ -372,7 +371,7 @@ func resolveStopReason(params *Params) string {
// - string: A Claude-compatible JSON response.
func ConvertAntigravityResponseToClaudeNonStream(_ context.Context, _ string, originalRequestRawJSON, requestRawJSON, rawJSON []byte, _ *any) string {
_ = originalRequestRawJSON
_ = requestRawJSON
modelName := gjson.GetBytes(requestRawJSON, "model").String()
root := gjson.ParseBytes(rawJSON)
promptTokens := root.Get("response.usageMetadata.promptTokenCount").Int()
@@ -437,7 +436,7 @@ func ConvertAntigravityResponseToClaudeNonStream(_ context.Context, _ string, or
block := `{"type":"thinking","thinking":""}`
block, _ = sjson.Set(block, "thinking", thinkingBuilder.String())
if thinkingSignature != "" {
block, _ = sjson.Set(block, "signature", thinkingSignature)
block, _ = sjson.Set(block, "signature", fmt.Sprintf("%s#%s", cache.GetModelGroup(modelName), thinkingSignature))
}
responseJSON, _ = sjson.SetRaw(responseJSON, "content.-1", block)
thinkingBuilder.Reset()

View File

@@ -12,10 +12,10 @@ import (
// Signature Caching Tests
// ============================================================================
func TestConvertAntigravityResponseToClaude_SessionIDDerived(t *testing.T) {
func TestConvertAntigravityResponseToClaude_ParamsInitialized(t *testing.T) {
cache.ClearSignatureCache("")
// Request with user message - should derive session ID
// Request with user message - should initialize params
requestJSON := []byte(`{
"messages": [
{"role": "user", "content": [{"type": "text", "text": "Hello world"}]}
@@ -37,10 +37,12 @@ func TestConvertAntigravityResponseToClaude_SessionIDDerived(t *testing.T) {
ctx := context.Background()
ConvertAntigravityResponseToClaude(ctx, "claude-sonnet-4-5-thinking", requestJSON, requestJSON, responseJSON, &param)
// Verify session ID was set
params := param.(*Params)
if params.SessionID == "" {
t.Error("SessionID should be derived from request")
if !params.HasFirstResponse {
t.Error("HasFirstResponse should be set after first chunk")
}
if params.CurrentThinkingText.Len() == 0 {
t.Error("Thinking text should be accumulated")
}
}
@@ -97,6 +99,7 @@ func TestConvertAntigravityResponseToClaude_SignatureCached(t *testing.T) {
cache.ClearSignatureCache("")
requestJSON := []byte(`{
"model": "claude-sonnet-4-5-thinking",
"messages": [{"role": "user", "content": [{"type": "text", "text": "Cache test"}]}]
}`)
@@ -129,12 +132,8 @@ func TestConvertAntigravityResponseToClaude_SignatureCached(t *testing.T) {
// Process thinking chunk
ConvertAntigravityResponseToClaude(ctx, "claude-sonnet-4-5-thinking", requestJSON, requestJSON, thinkingChunk, &param)
params := param.(*Params)
sessionID := params.SessionID
thinkingText := params.CurrentThinkingText.String()
if sessionID == "" {
t.Fatal("SessionID should be set")
}
if thinkingText == "" {
t.Fatal("Thinking text should be accumulated")
}
@@ -143,7 +142,7 @@ func TestConvertAntigravityResponseToClaude_SignatureCached(t *testing.T) {
ConvertAntigravityResponseToClaude(ctx, "claude-sonnet-4-5-thinking", requestJSON, requestJSON, signatureChunk, &param)
// Verify signature was cached
cachedSig := cache.GetCachedSignature(sessionID, thinkingText)
cachedSig := cache.GetCachedSignature("claude-sonnet-4-5-thinking", thinkingText)
if cachedSig != validSignature {
t.Errorf("Expected cached signature '%s', got '%s'", validSignature, cachedSig)
}
@@ -158,6 +157,7 @@ func TestConvertAntigravityResponseToClaude_MultipleThinkingBlocks(t *testing.T)
cache.ClearSignatureCache("")
requestJSON := []byte(`{
"model": "claude-sonnet-4-5-thinking",
"messages": [{"role": "user", "content": [{"type": "text", "text": "Multi block test"}]}]
}`)
@@ -221,13 +221,12 @@ func TestConvertAntigravityResponseToClaude_MultipleThinkingBlocks(t *testing.T)
// Process first thinking block
ConvertAntigravityResponseToClaude(ctx, "claude-sonnet-4-5-thinking", requestJSON, requestJSON, block1Thinking, &param)
params := param.(*Params)
sessionID := params.SessionID
firstThinkingText := params.CurrentThinkingText.String()
ConvertAntigravityResponseToClaude(ctx, "claude-sonnet-4-5-thinking", requestJSON, requestJSON, block1Sig, &param)
// Verify first signature cached
if cache.GetCachedSignature(sessionID, firstThinkingText) != validSig1 {
if cache.GetCachedSignature("claude-sonnet-4-5-thinking", firstThinkingText) != validSig1 {
t.Error("First thinking block signature should be cached")
}
@@ -241,76 +240,7 @@ func TestConvertAntigravityResponseToClaude_MultipleThinkingBlocks(t *testing.T)
ConvertAntigravityResponseToClaude(ctx, "claude-sonnet-4-5-thinking", requestJSON, requestJSON, block2Sig, &param)
// Verify second signature cached
if cache.GetCachedSignature(sessionID, secondThinkingText) != validSig2 {
if cache.GetCachedSignature("claude-sonnet-4-5-thinking", secondThinkingText) != validSig2 {
t.Error("Second thinking block signature should be cached")
}
}
func TestDeriveSessionIDFromRequest(t *testing.T) {
tests := []struct {
name string
input []byte
wantEmpty bool
}{
{
name: "valid user message",
input: []byte(`{"messages": [{"role": "user", "content": "Hello"}]}`),
wantEmpty: false,
},
{
name: "user message with content array",
input: []byte(`{"messages": [{"role": "user", "content": [{"type": "text", "text": "Hello"}]}]}`),
wantEmpty: false,
},
{
name: "no user message",
input: []byte(`{"messages": [{"role": "assistant", "content": "Hi"}]}`),
wantEmpty: true,
},
{
name: "empty messages",
input: []byte(`{"messages": []}`),
wantEmpty: true,
},
{
name: "no messages field",
input: []byte(`{}`),
wantEmpty: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := deriveSessionID(tt.input)
if tt.wantEmpty && result != "" {
t.Errorf("Expected empty session ID, got '%s'", result)
}
if !tt.wantEmpty && result == "" {
t.Error("Expected non-empty session ID")
}
})
}
}
func TestDeriveSessionIDFromRequest_Deterministic(t *testing.T) {
input := []byte(`{"messages": [{"role": "user", "content": "Same message"}]}`)
id1 := deriveSessionID(input)
id2 := deriveSessionID(input)
if id1 != id2 {
t.Errorf("Session ID should be deterministic: '%s' != '%s'", id1, id2)
}
}
func TestDeriveSessionIDFromRequest_DifferentMessages(t *testing.T) {
input1 := []byte(`{"messages": [{"role": "user", "content": "Message A"}]}`)
input2 := []byte(`{"messages": [{"role": "user", "content": "Message B"}]}`)
id1 := deriveSessionID(input1)
id2 := deriveSessionID(input2)
if id1 == id2 {
t.Error("Different messages should produce different session IDs")
}
}

View File

@@ -8,6 +8,7 @@ package gemini
import (
"bytes"
"fmt"
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/translator/gemini/common"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
@@ -32,12 +33,12 @@ import (
//
// Returns:
// - []byte: The transformed request data in Gemini API format
func ConvertGeminiRequestToAntigravity(_ string, inputRawJSON []byte, _ bool) []byte {
func ConvertGeminiRequestToAntigravity(modelName string, inputRawJSON []byte, _ bool) []byte {
rawJSON := bytes.Clone(inputRawJSON)
template := ""
template = `{"project":"","request":{},"model":""}`
template, _ = sjson.SetRaw(template, "request", string(rawJSON))
template, _ = sjson.Set(template, "model", gjson.Get(template, "request.model").String())
template, _ = sjson.Set(template, "model", modelName)
template, _ = sjson.Delete(template, "request.model")
template, errFixCLIToolResponse := fixCLIToolResponse(template)
@@ -97,37 +98,40 @@ func ConvertGeminiRequestToAntigravity(_ string, inputRawJSON []byte, _ bool) []
}
}
// Gemini-specific handling: add skip_thought_signature_validator to functionCall parts
// and remove thinking blocks entirely (Gemini doesn't need to preserve them)
const skipSentinel = "skip_thought_signature_validator"
// Gemini-specific handling for non-Claude models:
// - Add skip_thought_signature_validator to functionCall parts so upstream can bypass signature validation.
// - Also mark thinking parts with the same sentinel when present (we keep the parts; we only annotate them).
if !strings.Contains(modelName, "claude") {
const skipSentinel = "skip_thought_signature_validator"
gjson.GetBytes(rawJSON, "request.contents").ForEach(func(contentIdx, content gjson.Result) bool {
if content.Get("role").String() == "model" {
// First pass: collect indices of thinking parts to remove
var thinkingIndicesToRemove []int64
content.Get("parts").ForEach(func(partIdx, part gjson.Result) bool {
// Mark thinking blocks for removal
if part.Get("thought").Bool() {
thinkingIndicesToRemove = append(thinkingIndicesToRemove, partIdx.Int())
}
// Add skip sentinel to functionCall parts
if part.Get("functionCall").Exists() {
existingSig := part.Get("thoughtSignature").String()
if existingSig == "" || len(existingSig) < 50 {
rawJSON, _ = sjson.SetBytes(rawJSON, fmt.Sprintf("request.contents.%d.parts.%d.thoughtSignature", contentIdx.Int(), partIdx.Int()), skipSentinel)
gjson.GetBytes(rawJSON, "request.contents").ForEach(func(contentIdx, content gjson.Result) bool {
if content.Get("role").String() == "model" {
// First pass: collect indices of thinking parts to mark with skip sentinel
var thinkingIndicesToSkipSignature []int64
content.Get("parts").ForEach(func(partIdx, part gjson.Result) bool {
// Collect indices of thinking blocks to mark with skip sentinel
if part.Get("thought").Bool() {
thinkingIndicesToSkipSignature = append(thinkingIndicesToSkipSignature, partIdx.Int())
}
}
return true
})
// Add skip sentinel to functionCall parts
if part.Get("functionCall").Exists() {
existingSig := part.Get("thoughtSignature").String()
if existingSig == "" || len(existingSig) < 50 {
rawJSON, _ = sjson.SetBytes(rawJSON, fmt.Sprintf("request.contents.%d.parts.%d.thoughtSignature", contentIdx.Int(), partIdx.Int()), skipSentinel)
}
}
return true
})
// Remove thinking blocks in reverse order to preserve indices
for i := len(thinkingIndicesToRemove) - 1; i >= 0; i-- {
idx := thinkingIndicesToRemove[i]
rawJSON, _ = sjson.DeleteBytes(rawJSON, fmt.Sprintf("request.contents.%d.parts.%d", contentIdx.Int(), idx))
// Add skip_thought_signature_validator sentinel to thinking blocks in reverse order to preserve indices
for i := len(thinkingIndicesToSkipSignature) - 1; i >= 0; i-- {
idx := thinkingIndicesToSkipSignature[i]
rawJSON, _ = sjson.SetBytes(rawJSON, fmt.Sprintf("request.contents.%d.parts.%d.thoughtSignature", contentIdx.Int(), idx), skipSentinel)
}
}
}
return true
})
return true
})
}
return common.AttachDefaultSafetySettings(rawJSON, "request.safetySettings")
}

View File

@@ -62,40 +62,6 @@ func TestConvertGeminiRequestToAntigravity_AddSkipSentinelToFunctionCall(t *test
}
}
func TestConvertGeminiRequestToAntigravity_RemoveThinkingBlocks(t *testing.T) {
// Thinking blocks should be removed entirely for Gemini
validSignature := "abc123validSignature1234567890123456789012345678901234567890"
inputJSON := []byte(fmt.Sprintf(`{
"model": "gemini-3-pro-preview",
"contents": [
{
"role": "model",
"parts": [
{"thought": true, "text": "Thinking...", "thoughtSignature": "%s"},
{"text": "Here is my response"}
]
}
]
}`, validSignature))
output := ConvertGeminiRequestToAntigravity("gemini-3-pro-preview", inputJSON, false)
outputStr := string(output)
// Check that thinking block is removed
parts := gjson.Get(outputStr, "request.contents.0.parts").Array()
if len(parts) != 1 {
t.Fatalf("Expected 1 part (thinking removed), got %d", len(parts))
}
// Only text part should remain
if parts[0].Get("thought").Bool() {
t.Error("Thinking block should be removed for Gemini")
}
if parts[0].Get("text").String() != "Here is my response" {
t.Errorf("Expected text 'Here is my response', got '%s'", parts[0].Get("text").String())
}
}
func TestConvertGeminiRequestToAntigravity_ParallelFunctionCalls(t *testing.T) {
// Multiple functionCalls should all get skip_thought_signature_validator
inputJSON := []byte(`{

View File

@@ -41,6 +41,7 @@ func ConvertAntigravityResponseToGemini(ctx context.Context, _ string, originalR
responseResult := gjson.GetBytes(rawJSON, "response")
if responseResult.Exists() {
chunk = []byte(responseResult.Raw)
chunk = restoreUsageMetadata(chunk)
}
} else {
chunkTemplate := "[]"
@@ -76,7 +77,8 @@ func ConvertAntigravityResponseToGemini(ctx context.Context, _ string, originalR
func ConvertAntigravityResponseToGeminiNonStream(_ context.Context, _ string, originalRequestRawJSON, requestRawJSON, rawJSON []byte, _ *any) string {
responseResult := gjson.GetBytes(rawJSON, "response")
if responseResult.Exists() {
return responseResult.Raw
chunk := restoreUsageMetadata([]byte(responseResult.Raw))
return string(chunk)
}
return string(rawJSON)
}
@@ -84,3 +86,15 @@ func ConvertAntigravityResponseToGeminiNonStream(_ context.Context, _ string, or
func GeminiTokenCount(ctx context.Context, count int64) string {
return fmt.Sprintf(`{"totalTokens":%d,"promptTokensDetails":[{"modality":"TEXT","tokenCount":%d}]}`, count, count)
}
// restoreUsageMetadata renames cpaUsageMetadata back to usageMetadata.
// The executor renames usageMetadata to cpaUsageMetadata in non-terminal chunks
// to preserve usage data while hiding it from clients that don't expect it.
// When returning standard Gemini API format, we must restore the original name.
func restoreUsageMetadata(chunk []byte) []byte {
if cpaUsage := gjson.GetBytes(chunk, "cpaUsageMetadata"); cpaUsage.Exists() {
chunk, _ = sjson.SetRawBytes(chunk, "usageMetadata", []byte(cpaUsage.Raw))
chunk, _ = sjson.DeleteBytes(chunk, "cpaUsageMetadata")
}
return chunk
}

View File

@@ -0,0 +1,95 @@
package gemini
import (
"context"
"testing"
)
func TestRestoreUsageMetadata(t *testing.T) {
tests := []struct {
name string
input []byte
expected string
}{
{
name: "cpaUsageMetadata renamed to usageMetadata",
input: []byte(`{"modelVersion":"gemini-3-pro","cpaUsageMetadata":{"promptTokenCount":100,"candidatesTokenCount":200}}`),
expected: `{"modelVersion":"gemini-3-pro","usageMetadata":{"promptTokenCount":100,"candidatesTokenCount":200}}`,
},
{
name: "no cpaUsageMetadata unchanged",
input: []byte(`{"modelVersion":"gemini-3-pro","usageMetadata":{"promptTokenCount":100}}`),
expected: `{"modelVersion":"gemini-3-pro","usageMetadata":{"promptTokenCount":100}}`,
},
{
name: "empty input",
input: []byte(`{}`),
expected: `{}`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := restoreUsageMetadata(tt.input)
if string(result) != tt.expected {
t.Errorf("restoreUsageMetadata() = %s, want %s", string(result), tt.expected)
}
})
}
}
func TestConvertAntigravityResponseToGeminiNonStream(t *testing.T) {
tests := []struct {
name string
input []byte
expected string
}{
{
name: "cpaUsageMetadata restored in response",
input: []byte(`{"response":{"modelVersion":"gemini-3-pro","cpaUsageMetadata":{"promptTokenCount":100}}}`),
expected: `{"modelVersion":"gemini-3-pro","usageMetadata":{"promptTokenCount":100}}`,
},
{
name: "usageMetadata preserved",
input: []byte(`{"response":{"modelVersion":"gemini-3-pro","usageMetadata":{"promptTokenCount":100}}}`),
expected: `{"modelVersion":"gemini-3-pro","usageMetadata":{"promptTokenCount":100}}`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := ConvertAntigravityResponseToGeminiNonStream(context.Background(), "", nil, nil, tt.input, nil)
if result != tt.expected {
t.Errorf("ConvertAntigravityResponseToGeminiNonStream() = %s, want %s", result, tt.expected)
}
})
}
}
func TestConvertAntigravityResponseToGeminiStream(t *testing.T) {
ctx := context.WithValue(context.Background(), "alt", "")
tests := []struct {
name string
input []byte
expected string
}{
{
name: "cpaUsageMetadata restored in streaming response",
input: []byte(`data: {"response":{"modelVersion":"gemini-3-pro","cpaUsageMetadata":{"promptTokenCount":100}}}`),
expected: `{"modelVersion":"gemini-3-pro","usageMetadata":{"promptTokenCount":100}}`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
results := ConvertAntigravityResponseToGemini(ctx, "", nil, nil, tt.input, nil)
if len(results) != 1 {
t.Fatalf("expected 1 result, got %d", len(results))
}
if results[0] != tt.expected {
t.Errorf("ConvertAntigravityResponseToGemini() = %s, want %s", results[0], tt.expected)
}
})
}
}

View File

@@ -66,6 +66,13 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
out, _ = sjson.SetBytes(out, "request.generationConfig.maxOutputTokens", maxTok.Num)
}
// Candidate count (OpenAI 'n' parameter)
if n := gjson.GetBytes(rawJSON, "n"); n.Exists() && n.Type == gjson.Number {
if val := n.Int(); val > 1 {
out, _ = sjson.SetBytes(out, "request.generationConfig.candidateCount", val)
}
}
// Map OpenAI modalities -> Gemini CLI request.generationConfig.responseModalities
// e.g. "modalities": ["image", "text"] -> ["IMAGE", "TEXT"]
if mods := gjson.GetBytes(rawJSON, "modalities"); mods.Exists() && mods.IsArray() {
@@ -132,6 +139,7 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
}
}
systemPartIndex := 0
for i := 0; i < len(arr); i++ {
m := arr[i]
role := m.Get("role").String()
@@ -141,16 +149,19 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
// system -> request.systemInstruction as a user message style
if content.Type == gjson.String {
out, _ = sjson.SetBytes(out, "request.systemInstruction.role", "user")
out, _ = sjson.SetBytes(out, "request.systemInstruction.parts.0.text", content.String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", systemPartIndex), content.String())
systemPartIndex++
} else if content.IsObject() && content.Get("type").String() == "text" {
out, _ = sjson.SetBytes(out, "request.systemInstruction.role", "user")
out, _ = sjson.SetBytes(out, "request.systemInstruction.parts.0.text", content.Get("text").String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", systemPartIndex), content.Get("text").String())
systemPartIndex++
} else if content.IsArray() {
contents := content.Array()
if len(contents) > 0 {
out, _ = sjson.SetBytes(out, "request.systemInstruction.role", "user")
for j := 0; j < len(contents); j++ {
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", j), contents[j].Get("text").String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", systemPartIndex), contents[j].Get("text").String())
systemPartIndex++
}
}
}
@@ -165,7 +176,10 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
for _, item := range items {
switch item.Get("type").String() {
case "text":
node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", item.Get("text").String())
text := item.Get("text").String()
if text != "" {
node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", text)
}
p++
case "image_url":
imageURL := item.Get("image_url.url").String()
@@ -209,7 +223,10 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
for _, item := range content.Array() {
switch item.Get("type").String() {
case "text":
node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", item.Get("text").String())
text := item.Get("text").String()
if text != "" {
node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", text)
}
p++
case "image_url":
// If the assistant returned an inline data URL, preserve it for history fidelity.
@@ -288,12 +305,14 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
}
}
// tools -> request.tools[0].functionDeclarations + request.tools[0].googleSearch passthrough
// tools -> request.tools[].functionDeclarations + request.tools[].googleSearch/codeExecution/urlContext passthrough
tools := gjson.GetBytes(rawJSON, "tools")
if tools.IsArray() && len(tools.Array()) > 0 {
toolNode := []byte(`{}`)
hasTool := false
functionToolNode := []byte(`{}`)
hasFunction := false
googleSearchNodes := make([][]byte, 0)
codeExecutionNodes := make([][]byte, 0)
urlContextNodes := make([][]byte, 0)
for _, t := range tools.Array() {
if t.Get("type").String() == "function" {
fn := t.Get("function")
@@ -332,31 +351,63 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
}
fnRaw, _ = sjson.Delete(fnRaw, "strict")
if !hasFunction {
toolNode, _ = sjson.SetRawBytes(toolNode, "functionDeclarations", []byte("[]"))
functionToolNode, _ = sjson.SetRawBytes(functionToolNode, "functionDeclarations", []byte("[]"))
}
tmp, errSet := sjson.SetRawBytes(toolNode, "functionDeclarations.-1", []byte(fnRaw))
tmp, errSet := sjson.SetRawBytes(functionToolNode, "functionDeclarations.-1", []byte(fnRaw))
if errSet != nil {
log.Warnf("Failed to append tool declaration for '%s': %v", fn.Get("name").String(), errSet)
continue
}
toolNode = tmp
functionToolNode = tmp
hasFunction = true
hasTool = true
}
}
if gs := t.Get("google_search"); gs.Exists() {
googleToolNode := []byte(`{}`)
var errSet error
toolNode, errSet = sjson.SetRawBytes(toolNode, "googleSearch", []byte(gs.Raw))
googleToolNode, errSet = sjson.SetRawBytes(googleToolNode, "googleSearch", []byte(gs.Raw))
if errSet != nil {
log.Warnf("Failed to set googleSearch tool: %v", errSet)
continue
}
hasTool = true
googleSearchNodes = append(googleSearchNodes, googleToolNode)
}
if ce := t.Get("code_execution"); ce.Exists() {
codeToolNode := []byte(`{}`)
var errSet error
codeToolNode, errSet = sjson.SetRawBytes(codeToolNode, "codeExecution", []byte(ce.Raw))
if errSet != nil {
log.Warnf("Failed to set codeExecution tool: %v", errSet)
continue
}
codeExecutionNodes = append(codeExecutionNodes, codeToolNode)
}
if uc := t.Get("url_context"); uc.Exists() {
urlToolNode := []byte(`{}`)
var errSet error
urlToolNode, errSet = sjson.SetRawBytes(urlToolNode, "urlContext", []byte(uc.Raw))
if errSet != nil {
log.Warnf("Failed to set urlContext tool: %v", errSet)
continue
}
urlContextNodes = append(urlContextNodes, urlToolNode)
}
}
if hasTool {
out, _ = sjson.SetRawBytes(out, "request.tools", []byte("[]"))
out, _ = sjson.SetRawBytes(out, "request.tools.0", toolNode)
if hasFunction || len(googleSearchNodes) > 0 || len(codeExecutionNodes) > 0 || len(urlContextNodes) > 0 {
toolsNode := []byte("[]")
if hasFunction {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", functionToolNode)
}
for _, googleNode := range googleSearchNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", googleNode)
}
for _, codeNode := range codeExecutionNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", codeNode)
}
for _, urlNode := range urlContextNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", urlNode)
}
out, _ = sjson.SetRawBytes(out, "request.tools", toolsNode)
}
}

View File

@@ -15,7 +15,7 @@ import (
"strings"
"github.com/google/uuid"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
@@ -98,9 +98,8 @@ func ConvertGeminiRequestToClaude(modelName string, inputRawJSON []byte, stream
// Temperature setting for controlling response randomness
if temp := genConfig.Get("temperature"); temp.Exists() {
out, _ = sjson.Set(out, "temperature", temp.Float())
}
// Top P setting for nucleus sampling
if topP := genConfig.Get("topP"); topP.Exists() {
} else if topP := genConfig.Get("topP"); topP.Exists() {
// Top P setting for nucleus sampling (filtered out if temperature is set)
out, _ = sjson.Set(out, "top_p", topP.Float())
}
// Stop sequences configuration for custom termination conditions
@@ -115,18 +114,41 @@ func ConvertGeminiRequestToClaude(modelName string, inputRawJSON []byte, stream
}
}
// Include thoughts configuration for reasoning process visibility
// Only apply for models that support thinking and use numeric budgets, not discrete levels.
// Translator only does format conversion, ApplyThinking handles model capability validation.
if thinkingConfig := genConfig.Get("thinkingConfig"); thinkingConfig.Exists() && thinkingConfig.IsObject() {
modelInfo := registry.LookupModelInfo(modelName)
if modelInfo != nil && modelInfo.Thinking != nil && len(modelInfo.Thinking.Levels) == 0 {
// Check for thinkingBudget first - if present, enable thinking with budget
if thinkingBudget := thinkingConfig.Get("thinkingBudget"); thinkingBudget.Exists() && thinkingBudget.Int() > 0 {
out, _ = sjson.Set(out, "thinking.type", "enabled")
out, _ = sjson.Set(out, "thinking.budget_tokens", thinkingBudget.Int())
} else if includeThoughts := thinkingConfig.Get("include_thoughts"); includeThoughts.Exists() && includeThoughts.Type == gjson.True {
// Fallback to include_thoughts if no budget specified
if thinkingLevel := thinkingConfig.Get("thinkingLevel"); thinkingLevel.Exists() {
level := strings.ToLower(strings.TrimSpace(thinkingLevel.String()))
switch level {
case "":
case "none":
out, _ = sjson.Set(out, "thinking.type", "disabled")
out, _ = sjson.Delete(out, "thinking.budget_tokens")
case "auto":
out, _ = sjson.Set(out, "thinking.type", "enabled")
out, _ = sjson.Delete(out, "thinking.budget_tokens")
default:
if budget, ok := thinking.ConvertLevelToBudget(level); ok {
out, _ = sjson.Set(out, "thinking.type", "enabled")
out, _ = sjson.Set(out, "thinking.budget_tokens", budget)
}
}
} else if thinkingBudget := thinkingConfig.Get("thinkingBudget"); thinkingBudget.Exists() {
budget := int(thinkingBudget.Int())
switch budget {
case 0:
out, _ = sjson.Set(out, "thinking.type", "disabled")
out, _ = sjson.Delete(out, "thinking.budget_tokens")
case -1:
out, _ = sjson.Set(out, "thinking.type", "enabled")
out, _ = sjson.Delete(out, "thinking.budget_tokens")
default:
out, _ = sjson.Set(out, "thinking.type", "enabled")
out, _ = sjson.Set(out, "thinking.budget_tokens", budget)
}
} else if includeThoughts := thinkingConfig.Get("includeThoughts"); includeThoughts.Exists() && includeThoughts.Type == gjson.True {
out, _ = sjson.Set(out, "thinking.type", "enabled")
} else if includeThoughts := thinkingConfig.Get("include_thoughts"); includeThoughts.Exists() && includeThoughts.Type == gjson.True {
out, _ = sjson.Set(out, "thinking.type", "enabled")
}
}
}

View File

@@ -15,7 +15,6 @@ import (
"strings"
"github.com/google/uuid"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
@@ -66,23 +65,21 @@ func ConvertOpenAIRequestToClaude(modelName string, inputRawJSON []byte, stream
root := gjson.ParseBytes(rawJSON)
// Convert OpenAI reasoning_effort to Claude thinking config.
if v := root.Get("reasoning_effort"); v.Exists() {
modelInfo := registry.LookupModelInfo(modelName)
if modelInfo != nil && modelInfo.Thinking != nil && len(modelInfo.Thinking.Levels) == 0 {
effort := strings.ToLower(strings.TrimSpace(v.String()))
if effort != "" {
budget, ok := thinking.ConvertLevelToBudget(effort)
if ok {
switch budget {
case 0:
out, _ = sjson.Set(out, "thinking.type", "disabled")
case -1:
effort := strings.ToLower(strings.TrimSpace(v.String()))
if effort != "" {
budget, ok := thinking.ConvertLevelToBudget(effort)
if ok {
switch budget {
case 0:
out, _ = sjson.Set(out, "thinking.type", "disabled")
case -1:
out, _ = sjson.Set(out, "thinking.type", "enabled")
default:
if budget > 0 {
out, _ = sjson.Set(out, "thinking.type", "enabled")
default:
if budget > 0 {
out, _ = sjson.Set(out, "thinking.type", "enabled")
out, _ = sjson.Set(out, "thinking.budget_tokens", budget)
}
out, _ = sjson.Set(out, "thinking.budget_tokens", budget)
}
}
}
@@ -113,10 +110,8 @@ func ConvertOpenAIRequestToClaude(modelName string, inputRawJSON []byte, stream
// Temperature setting for controlling response randomness
if temp := root.Get("temperature"); temp.Exists() {
out, _ = sjson.Set(out, "temperature", temp.Float())
}
// Top P setting for nucleus sampling
if topP := root.Get("top_p"); topP.Exists() {
} else if topP := root.Get("top_p"); topP.Exists() {
// Top P setting for nucleus sampling (filtered out if temperature is set)
out, _ = sjson.Set(out, "top_p", topP.Float())
}
@@ -141,17 +136,35 @@ func ConvertOpenAIRequestToClaude(modelName string, inputRawJSON []byte, stream
// Process messages and transform them to Claude Code format
if messages := root.Get("messages"); messages.Exists() && messages.IsArray() {
messageIndex := 0
systemMessageIndex := -1
messages.ForEach(func(_, message gjson.Result) bool {
role := message.Get("role").String()
contentResult := message.Get("content")
switch role {
case "system", "user", "assistant":
// Create Claude Code message with appropriate role mapping
if role == "system" {
role = "user"
case "system":
if systemMessageIndex == -1 {
systemMsg := `{"role":"user","content":[]}`
out, _ = sjson.SetRaw(out, "messages.-1", systemMsg)
systemMessageIndex = messageIndex
messageIndex++
}
if contentResult.Exists() && contentResult.Type == gjson.String && contentResult.String() != "" {
textPart := `{"type":"text","text":""}`
textPart, _ = sjson.Set(textPart, "text", contentResult.String())
out, _ = sjson.SetRaw(out, fmt.Sprintf("messages.%d.content.-1", systemMessageIndex), textPart)
} else if contentResult.Exists() && contentResult.IsArray() {
contentResult.ForEach(func(_, part gjson.Result) bool {
if part.Get("type").String() == "text" {
textPart := `{"type":"text","text":""}`
textPart, _ = sjson.Set(textPart, "text", part.Get("text").String())
out, _ = sjson.SetRaw(out, fmt.Sprintf("messages.%d.content.-1", systemMessageIndex), textPart)
}
return true
})
}
case "user", "assistant":
msg := `{"role":"","content":[]}`
msg, _ = sjson.Set(msg, "role", role)
@@ -230,6 +243,7 @@ func ConvertOpenAIRequestToClaude(modelName string, inputRawJSON []byte, stream
}
out, _ = sjson.SetRaw(out, "messages.-1", msg)
messageIndex++
case "tool":
// Handle tool result messages conversion
@@ -240,6 +254,7 @@ func ConvertOpenAIRequestToClaude(modelName string, inputRawJSON []byte, stream
msg, _ = sjson.Set(msg, "content.0.tool_use_id", toolCallID)
msg, _ = sjson.Set(msg, "content.0.content", content)
out, _ = sjson.SetRaw(out, "messages.-1", msg)
messageIndex++
}
return true
})

View File

@@ -10,7 +10,6 @@ import (
"strings"
"github.com/google/uuid"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
@@ -54,23 +53,21 @@ func ConvertOpenAIResponsesRequestToClaude(modelName string, inputRawJSON []byte
root := gjson.ParseBytes(rawJSON)
// Convert OpenAI Responses reasoning.effort to Claude thinking config.
if v := root.Get("reasoning.effort"); v.Exists() {
modelInfo := registry.LookupModelInfo(modelName)
if modelInfo != nil && modelInfo.Thinking != nil && len(modelInfo.Thinking.Levels) == 0 {
effort := strings.ToLower(strings.TrimSpace(v.String()))
if effort != "" {
budget, ok := thinking.ConvertLevelToBudget(effort)
if ok {
switch budget {
case 0:
out, _ = sjson.Set(out, "thinking.type", "disabled")
case -1:
effort := strings.ToLower(strings.TrimSpace(v.String()))
if effort != "" {
budget, ok := thinking.ConvertLevelToBudget(effort)
if ok {
switch budget {
case 0:
out, _ = sjson.Set(out, "thinking.type", "disabled")
case -1:
out, _ = sjson.Set(out, "thinking.type", "enabled")
default:
if budget > 0 {
out, _ = sjson.Set(out, "thinking.type", "enabled")
default:
if budget > 0 {
out, _ = sjson.Set(out, "thinking.type", "enabled")
out, _ = sjson.Set(out, "thinking.budget_tokens", budget)
}
out, _ = sjson.Set(out, "thinking.budget_tokens", budget)
}
}
}

View File

@@ -12,7 +12,6 @@ import (
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
@@ -52,7 +51,7 @@ func ConvertClaudeRequestToCodex(modelName string, inputRawJSON []byte, _ bool)
systemsResult := rootResult.Get("system")
if systemsResult.IsArray() {
systemResults := systemsResult.Array()
message := `{"type":"message","role":"user","content":[]}`
message := `{"type":"message","role":"developer","content":[]}`
for i := 0; i < len(systemResults); i++ {
systemResult := systemResults[i]
systemTypeResult := systemResult.Get("type")
@@ -218,18 +217,15 @@ func ConvertClaudeRequestToCodex(modelName string, inputRawJSON []byte, _ bool)
// Add additional configuration parameters for the Codex API.
template, _ = sjson.Set(template, "parallel_tool_calls", true)
// Convert thinking.budget_tokens to reasoning.effort for level-based models
reasoningEffort := "medium" // default
// Convert thinking.budget_tokens to reasoning.effort.
reasoningEffort := "medium"
if thinkingConfig := rootResult.Get("thinking"); thinkingConfig.Exists() && thinkingConfig.IsObject() {
modelInfo := registry.LookupModelInfo(modelName)
switch thinkingConfig.Get("type").String() {
case "enabled":
if modelInfo != nil && modelInfo.Thinking != nil && len(modelInfo.Thinking.Levels) > 0 {
if budgetTokens := thinkingConfig.Get("budget_tokens"); budgetTokens.Exists() {
budget := int(budgetTokens.Int())
if effort, ok := thinking.ConvertBudgetToLevel(budget); ok && effort != "" {
reasoningEffort = effort
}
if budgetTokens := thinkingConfig.Get("budget_tokens"); budgetTokens.Exists() {
budget := int(budgetTokens.Int())
if effort, ok := thinking.ConvertBudgetToLevel(budget); ok && effort != "" {
reasoningEffort = effort
}
}
case "disabled":
@@ -245,21 +241,23 @@ func ConvertClaudeRequestToCodex(modelName string, inputRawJSON []byte, _ bool)
template, _ = sjson.Set(template, "include", []string{"reasoning.encrypted_content"})
// Add a first message to ignore system instructions and ensure proper execution.
inputResult := gjson.Get(template, "input")
if inputResult.Exists() && inputResult.IsArray() {
inputResults := inputResult.Array()
newInput := "[]"
for i := 0; i < len(inputResults); i++ {
if i == 0 {
firstText := inputResults[i].Get("content.0.text")
firstInstructions := "EXECUTE ACCORDING TO THE FOLLOWING INSTRUCTIONS!!!"
if firstText.Exists() && firstText.String() != firstInstructions {
newInput, _ = sjson.SetRaw(newInput, "-1", `{"type":"message","role":"user","content":[{"type":"input_text","text":"EXECUTE ACCORDING TO THE FOLLOWING INSTRUCTIONS!!!"}]}`)
if misc.GetCodexInstructionsEnabled() {
inputResult := gjson.Get(template, "input")
if inputResult.Exists() && inputResult.IsArray() {
inputResults := inputResult.Array()
newInput := "[]"
for i := 0; i < len(inputResults); i++ {
if i == 0 {
firstText := inputResults[i].Get("content.0.text")
firstInstructions := "EXECUTE ACCORDING TO THE FOLLOWING INSTRUCTIONS!!!"
if firstText.Exists() && firstText.String() != firstInstructions {
newInput, _ = sjson.SetRaw(newInput, "-1", `{"type":"message","role":"user","content":[{"type":"input_text","text":"EXECUTE ACCORDING TO THE FOLLOWING INSTRUCTIONS!!!"}]}`)
}
}
newInput, _ = sjson.SetRaw(newInput, "-1", inputResults[i].Raw)
}
newInput, _ = sjson.SetRaw(newInput, "-1", inputResults[i].Raw)
template, _ = sjson.SetRaw(template, "input", newInput)
}
template, _ = sjson.SetRaw(template, "input", newInput)
}
return []byte(template)

View File

@@ -117,8 +117,12 @@ func ConvertCodexResponseToClaude(_ context.Context, _ string, originalRequestRa
} else {
template, _ = sjson.Set(template, "delta.stop_reason", "end_turn")
}
template, _ = sjson.Set(template, "usage.input_tokens", rootResult.Get("response.usage.input_tokens").Int())
template, _ = sjson.Set(template, "usage.output_tokens", rootResult.Get("response.usage.output_tokens").Int())
inputTokens, outputTokens, cachedTokens := extractResponsesUsage(rootResult.Get("response.usage"))
template, _ = sjson.Set(template, "usage.input_tokens", inputTokens)
template, _ = sjson.Set(template, "usage.output_tokens", outputTokens)
if cachedTokens > 0 {
template, _ = sjson.Set(template, "usage.cache_read_input_tokens", cachedTokens)
}
output = "event: message_delta\n"
output += fmt.Sprintf("data: %s\n\n", template)
@@ -204,8 +208,12 @@ func ConvertCodexResponseToClaudeNonStream(_ context.Context, _ string, original
out := `{"id":"","type":"message","role":"assistant","model":"","content":[],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":0,"output_tokens":0}}`
out, _ = sjson.Set(out, "id", responseData.Get("id").String())
out, _ = sjson.Set(out, "model", responseData.Get("model").String())
out, _ = sjson.Set(out, "usage.input_tokens", responseData.Get("usage.input_tokens").Int())
out, _ = sjson.Set(out, "usage.output_tokens", responseData.Get("usage.output_tokens").Int())
inputTokens, outputTokens, cachedTokens := extractResponsesUsage(responseData.Get("usage"))
out, _ = sjson.Set(out, "usage.input_tokens", inputTokens)
out, _ = sjson.Set(out, "usage.output_tokens", outputTokens)
if cachedTokens > 0 {
out, _ = sjson.Set(out, "usage.cache_read_input_tokens", cachedTokens)
}
hasToolCall := false
@@ -308,12 +316,27 @@ func ConvertCodexResponseToClaudeNonStream(_ context.Context, _ string, original
out, _ = sjson.SetRaw(out, "stop_sequence", stopSequence.Raw)
}
if responseData.Get("usage.input_tokens").Exists() || responseData.Get("usage.output_tokens").Exists() {
out, _ = sjson.Set(out, "usage.input_tokens", responseData.Get("usage.input_tokens").Int())
out, _ = sjson.Set(out, "usage.output_tokens", responseData.Get("usage.output_tokens").Int())
return out
}
func extractResponsesUsage(usage gjson.Result) (int64, int64, int64) {
if !usage.Exists() || usage.Type == gjson.Null {
return 0, 0, 0
}
return out
inputTokens := usage.Get("input_tokens").Int()
outputTokens := usage.Get("output_tokens").Int()
cachedTokens := usage.Get("input_tokens_details.cached_tokens").Int()
if cachedTokens > 0 {
if inputTokens >= cachedTokens {
inputTokens -= cachedTokens
} else {
inputTokens = 0
}
}
return inputTokens, outputTokens, cachedTokens
}
// buildReverseMapFromClaudeOriginalShortToOriginal builds a map[short]original from original Claude request tools.

View File

@@ -14,7 +14,6 @@ import (
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
"github.com/tidwall/gjson"
@@ -95,7 +94,7 @@ func ConvertGeminiRequestToCodex(modelName string, inputRawJSON []byte, _ bool)
// System instruction -> as a user message with input_text parts
sysParts := root.Get("system_instruction.parts")
if sysParts.IsArray() {
msg := `{"type":"message","role":"user","content":[]}`
msg := `{"type":"message","role":"developer","content":[]}`
arr := sysParts.Array()
for i := 0; i < len(arr); i++ {
p := arr[i]
@@ -249,22 +248,28 @@ func ConvertGeminiRequestToCodex(modelName string, inputRawJSON []byte, _ bool)
// Fixed flags aligning with Codex expectations
out, _ = sjson.Set(out, "parallel_tool_calls", true)
// Convert thinkingBudget to reasoning.effort for level-based models
reasoningEffort := "medium" // default
// Convert Gemini thinkingConfig to Codex reasoning.effort.
effortSet := false
if genConfig := root.Get("generationConfig"); genConfig.Exists() {
if thinkingConfig := genConfig.Get("thinkingConfig"); thinkingConfig.Exists() && thinkingConfig.IsObject() {
modelInfo := registry.LookupModelInfo(modelName)
if modelInfo != nil && modelInfo.Thinking != nil && len(modelInfo.Thinking.Levels) > 0 {
if thinkingBudget := thinkingConfig.Get("thinkingBudget"); thinkingBudget.Exists() {
budget := int(thinkingBudget.Int())
if effort, ok := thinking.ConvertBudgetToLevel(budget); ok && effort != "" {
reasoningEffort = effort
}
if thinkingLevel := thinkingConfig.Get("thinkingLevel"); thinkingLevel.Exists() {
effort := strings.ToLower(strings.TrimSpace(thinkingLevel.String()))
if effort != "" {
out, _ = sjson.Set(out, "reasoning.effort", effort)
effortSet = true
}
} else if thinkingBudget := thinkingConfig.Get("thinkingBudget"); thinkingBudget.Exists() {
if effort, ok := thinking.ConvertBudgetToLevel(int(thinkingBudget.Int())); ok {
out, _ = sjson.Set(out, "reasoning.effort", effort)
effortSet = true
}
}
}
}
out, _ = sjson.Set(out, "reasoning.effort", reasoningEffort)
if !effortSet {
// No thinking config, set default effort
out, _ = sjson.Set(out, "reasoning.effort", "medium")
}
out, _ = sjson.Set(out, "reasoning.summary", "auto")
out, _ = sjson.Set(out, "stream", true)
out, _ = sjson.Set(out, "store", false)

View File

@@ -33,7 +33,7 @@ func ConvertOpenAIRequestToCodex(modelName string, inputRawJSON []byte, stream b
rawJSON := bytes.Clone(inputRawJSON)
userAgent := misc.ExtractCodexUserAgent(rawJSON)
// Start with empty JSON object
out := `{}`
out := `{"instructions":""}`
// Stream must be set to true
out, _ = sjson.Set(out, "stream", stream)
@@ -98,7 +98,9 @@ func ConvertOpenAIRequestToCodex(modelName string, inputRawJSON []byte, stream b
// Extract system instructions from first system message (string or text object)
messages := gjson.GetBytes(rawJSON, "messages")
_, instructions := misc.CodexInstructionsForModel(modelName, "", userAgent)
out, _ = sjson.Set(out, "instructions", instructions)
if misc.GetCodexInstructionsEnabled() {
out, _ = sjson.Set(out, "instructions", instructions)
}
// if messages.IsArray() {
// arr := messages.Array()
// for i := 0; i < len(arr); i++ {
@@ -141,7 +143,7 @@ func ConvertOpenAIRequestToCodex(modelName string, inputRawJSON []byte, stream b
msg := `{}`
msg, _ = sjson.Set(msg, "type", "message")
if role == "system" {
msg, _ = sjson.Set(msg, "role", "user")
msg, _ = sjson.Set(msg, "role", "developer")
} else {
msg, _ = sjson.Set(msg, "role", role)
}

View File

@@ -9,7 +9,6 @@ import (
"bytes"
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/translator/gemini/common"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
@@ -161,14 +160,11 @@ func ConvertClaudeRequestToCLI(modelName string, inputRawJSON []byte, _ bool) []
// Map Anthropic thinking -> Gemini thinkingBudget/include_thoughts when type==enabled
if t := gjson.GetBytes(rawJSON, "thinking"); t.Exists() && t.IsObject() {
modelInfo := registry.LookupModelInfo(modelName)
if modelInfo != nil && modelInfo.Thinking != nil {
if t.Get("type").String() == "enabled" {
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
budget := int(b.Int())
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
}
if t.Get("type").String() == "enabled" {
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
budget := int(b.Int())
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.includeThoughts", true)
}
}
}

View File

@@ -63,6 +63,13 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
out, _ = sjson.SetBytes(out, "request.generationConfig.topK", tkr.Num)
}
// Candidate count (OpenAI 'n' parameter)
if n := gjson.GetBytes(rawJSON, "n"); n.Exists() && n.Type == gjson.Number {
if val := n.Int(); val > 1 {
out, _ = sjson.SetBytes(out, "request.generationConfig.candidateCount", val)
}
}
// Map OpenAI modalities -> Gemini CLI request.generationConfig.responseModalities
// e.g. "modalities": ["image", "text"] -> ["IMAGE", "TEXT"]
if mods := gjson.GetBytes(rawJSON, "modalities"); mods.Exists() && mods.IsArray() {
@@ -129,6 +136,7 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
}
}
systemPartIndex := 0
for i := 0; i < len(arr); i++ {
m := arr[i]
role := m.Get("role").String()
@@ -138,16 +146,19 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
// system -> request.systemInstruction as a user message style
if content.Type == gjson.String {
out, _ = sjson.SetBytes(out, "request.systemInstruction.role", "user")
out, _ = sjson.SetBytes(out, "request.systemInstruction.parts.0.text", content.String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", systemPartIndex), content.String())
systemPartIndex++
} else if content.IsObject() && content.Get("type").String() == "text" {
out, _ = sjson.SetBytes(out, "request.systemInstruction.role", "user")
out, _ = sjson.SetBytes(out, "request.systemInstruction.parts.0.text", content.Get("text").String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", systemPartIndex), content.Get("text").String())
systemPartIndex++
} else if content.IsArray() {
contents := content.Array()
if len(contents) > 0 {
out, _ = sjson.SetBytes(out, "request.systemInstruction.role", "user")
for j := 0; j < len(contents); j++ {
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", j), contents[j].Get("text").String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", systemPartIndex), contents[j].Get("text").String())
systemPartIndex++
}
}
}
@@ -272,12 +283,14 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
}
}
// tools -> request.tools[0].functionDeclarations + request.tools[0].googleSearch passthrough
// tools -> request.tools[].functionDeclarations + request.tools[].googleSearch/codeExecution/urlContext passthrough
tools := gjson.GetBytes(rawJSON, "tools")
if tools.IsArray() && len(tools.Array()) > 0 {
toolNode := []byte(`{}`)
hasTool := false
functionToolNode := []byte(`{}`)
hasFunction := false
googleSearchNodes := make([][]byte, 0)
codeExecutionNodes := make([][]byte, 0)
urlContextNodes := make([][]byte, 0)
for _, t := range tools.Array() {
if t.Get("type").String() == "function" {
fn := t.Get("function")
@@ -316,31 +329,63 @@ func ConvertOpenAIRequestToGeminiCLI(modelName string, inputRawJSON []byte, _ bo
}
fnRaw, _ = sjson.Delete(fnRaw, "strict")
if !hasFunction {
toolNode, _ = sjson.SetRawBytes(toolNode, "functionDeclarations", []byte("[]"))
functionToolNode, _ = sjson.SetRawBytes(functionToolNode, "functionDeclarations", []byte("[]"))
}
tmp, errSet := sjson.SetRawBytes(toolNode, "functionDeclarations.-1", []byte(fnRaw))
tmp, errSet := sjson.SetRawBytes(functionToolNode, "functionDeclarations.-1", []byte(fnRaw))
if errSet != nil {
log.Warnf("Failed to append tool declaration for '%s': %v", fn.Get("name").String(), errSet)
continue
}
toolNode = tmp
functionToolNode = tmp
hasFunction = true
hasTool = true
}
}
if gs := t.Get("google_search"); gs.Exists() {
googleToolNode := []byte(`{}`)
var errSet error
toolNode, errSet = sjson.SetRawBytes(toolNode, "googleSearch", []byte(gs.Raw))
googleToolNode, errSet = sjson.SetRawBytes(googleToolNode, "googleSearch", []byte(gs.Raw))
if errSet != nil {
log.Warnf("Failed to set googleSearch tool: %v", errSet)
continue
}
hasTool = true
googleSearchNodes = append(googleSearchNodes, googleToolNode)
}
if ce := t.Get("code_execution"); ce.Exists() {
codeToolNode := []byte(`{}`)
var errSet error
codeToolNode, errSet = sjson.SetRawBytes(codeToolNode, "codeExecution", []byte(ce.Raw))
if errSet != nil {
log.Warnf("Failed to set codeExecution tool: %v", errSet)
continue
}
codeExecutionNodes = append(codeExecutionNodes, codeToolNode)
}
if uc := t.Get("url_context"); uc.Exists() {
urlToolNode := []byte(`{}`)
var errSet error
urlToolNode, errSet = sjson.SetRawBytes(urlToolNode, "urlContext", []byte(uc.Raw))
if errSet != nil {
log.Warnf("Failed to set urlContext tool: %v", errSet)
continue
}
urlContextNodes = append(urlContextNodes, urlToolNode)
}
}
if hasTool {
out, _ = sjson.SetRawBytes(out, "request.tools", []byte("[]"))
out, _ = sjson.SetRawBytes(out, "request.tools.0", toolNode)
if hasFunction || len(googleSearchNodes) > 0 || len(codeExecutionNodes) > 0 || len(urlContextNodes) > 0 {
toolsNode := []byte("[]")
if hasFunction {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", functionToolNode)
}
for _, googleNode := range googleSearchNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", googleNode)
}
for _, codeNode := range codeExecutionNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", codeNode)
}
for _, urlNode := range urlContextNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", urlNode)
}
out, _ = sjson.SetRawBytes(out, "request.tools", toolsNode)
}
}

View File

@@ -9,7 +9,6 @@ import (
"bytes"
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/translator/gemini/common"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
@@ -153,16 +152,13 @@ func ConvertClaudeRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
}
// Map Anthropic thinking -> Gemini thinkingBudget/include_thoughts when enabled
// Only apply for models that use numeric budgets, not discrete levels.
// Translator only does format conversion, ApplyThinking handles model capability validation.
if t := gjson.GetBytes(rawJSON, "thinking"); t.Exists() && t.IsObject() {
modelInfo := registry.LookupModelInfo(modelName)
if modelInfo != nil && modelInfo.Thinking != nil && len(modelInfo.Thinking.Levels) == 0 {
if t.Get("type").String() == "enabled" {
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
budget := int(b.Int())
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", budget)
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.include_thoughts", true)
}
if t.Get("type").String() == "enabled" {
if b := t.Get("budget_tokens"); b.Exists() && b.Type == gjson.Number {
budget := int(b.Int())
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.thinkingBudget", budget)
out, _ = sjson.Set(out, "generationConfig.thinkingConfig.includeThoughts", true)
}
}
}

View File

@@ -63,6 +63,13 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
out, _ = sjson.SetBytes(out, "generationConfig.topK", tkr.Num)
}
// Candidate count (OpenAI 'n' parameter)
if n := gjson.GetBytes(rawJSON, "n"); n.Exists() && n.Type == gjson.Number {
if val := n.Int(); val > 1 {
out, _ = sjson.SetBytes(out, "generationConfig.candidateCount", val)
}
}
// Map OpenAI modalities -> Gemini generationConfig.responseModalities
// e.g. "modalities": ["image", "text"] -> ["IMAGE", "TEXT"]
if mods := gjson.GetBytes(rawJSON, "modalities"); mods.Exists() && mods.IsArray() {
@@ -129,6 +136,7 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
}
}
systemPartIndex := 0
for i := 0; i < len(arr); i++ {
m := arr[i]
role := m.Get("role").String()
@@ -138,16 +146,19 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
// system -> system_instruction as a user message style
if content.Type == gjson.String {
out, _ = sjson.SetBytes(out, "system_instruction.role", "user")
out, _ = sjson.SetBytes(out, "system_instruction.parts.0.text", content.String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("system_instruction.parts.%d.text", systemPartIndex), content.String())
systemPartIndex++
} else if content.IsObject() && content.Get("type").String() == "text" {
out, _ = sjson.SetBytes(out, "system_instruction.role", "user")
out, _ = sjson.SetBytes(out, "system_instruction.parts.0.text", content.Get("text").String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("system_instruction.parts.%d.text", systemPartIndex), content.Get("text").String())
systemPartIndex++
} else if content.IsArray() {
contents := content.Array()
if len(contents) > 0 {
out, _ = sjson.SetBytes(out, "request.systemInstruction.role", "user")
out, _ = sjson.SetBytes(out, "system_instruction.role", "user")
for j := 0; j < len(contents); j++ {
out, _ = sjson.SetBytes(out, fmt.Sprintf("request.systemInstruction.parts.%d.text", j), contents[j].Get("text").String())
out, _ = sjson.SetBytes(out, fmt.Sprintf("system_instruction.parts.%d.text", systemPartIndex), contents[j].Get("text").String())
systemPartIndex++
}
}
}
@@ -162,7 +173,10 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
for _, item := range items {
switch item.Get("type").String() {
case "text":
node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", item.Get("text").String())
text := item.Get("text").String()
if text != "" {
node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", text)
}
p++
case "image_url":
imageURL := item.Get("image_url.url").String()
@@ -207,6 +221,10 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
for _, item := range content.Array() {
switch item.Get("type").String() {
case "text":
text := item.Get("text").String()
if text != "" {
node, _ = sjson.SetBytes(node, "parts."+itoa(p)+".text", text)
}
p++
case "image_url":
// If the assistant returned an inline data URL, preserve it for history fidelity.
@@ -271,12 +289,14 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
}
}
// tools -> tools[0].functionDeclarations + tools[0].googleSearch passthrough
// tools -> tools[].functionDeclarations + tools[].googleSearch/codeExecution/urlContext passthrough
tools := gjson.GetBytes(rawJSON, "tools")
if tools.IsArray() && len(tools.Array()) > 0 {
toolNode := []byte(`{}`)
hasTool := false
functionToolNode := []byte(`{}`)
hasFunction := false
googleSearchNodes := make([][]byte, 0)
codeExecutionNodes := make([][]byte, 0)
urlContextNodes := make([][]byte, 0)
for _, t := range tools.Array() {
if t.Get("type").String() == "function" {
fn := t.Get("function")
@@ -315,31 +335,63 @@ func ConvertOpenAIRequestToGemini(modelName string, inputRawJSON []byte, _ bool)
}
fnRaw, _ = sjson.Delete(fnRaw, "strict")
if !hasFunction {
toolNode, _ = sjson.SetRawBytes(toolNode, "functionDeclarations", []byte("[]"))
functionToolNode, _ = sjson.SetRawBytes(functionToolNode, "functionDeclarations", []byte("[]"))
}
tmp, errSet := sjson.SetRawBytes(toolNode, "functionDeclarations.-1", []byte(fnRaw))
tmp, errSet := sjson.SetRawBytes(functionToolNode, "functionDeclarations.-1", []byte(fnRaw))
if errSet != nil {
log.Warnf("Failed to append tool declaration for '%s': %v", fn.Get("name").String(), errSet)
continue
}
toolNode = tmp
functionToolNode = tmp
hasFunction = true
hasTool = true
}
}
if gs := t.Get("google_search"); gs.Exists() {
googleToolNode := []byte(`{}`)
var errSet error
toolNode, errSet = sjson.SetRawBytes(toolNode, "googleSearch", []byte(gs.Raw))
googleToolNode, errSet = sjson.SetRawBytes(googleToolNode, "googleSearch", []byte(gs.Raw))
if errSet != nil {
log.Warnf("Failed to set googleSearch tool: %v", errSet)
continue
}
hasTool = true
googleSearchNodes = append(googleSearchNodes, googleToolNode)
}
if ce := t.Get("code_execution"); ce.Exists() {
codeToolNode := []byte(`{}`)
var errSet error
codeToolNode, errSet = sjson.SetRawBytes(codeToolNode, "codeExecution", []byte(ce.Raw))
if errSet != nil {
log.Warnf("Failed to set codeExecution tool: %v", errSet)
continue
}
codeExecutionNodes = append(codeExecutionNodes, codeToolNode)
}
if uc := t.Get("url_context"); uc.Exists() {
urlToolNode := []byte(`{}`)
var errSet error
urlToolNode, errSet = sjson.SetRawBytes(urlToolNode, "urlContext", []byte(uc.Raw))
if errSet != nil {
log.Warnf("Failed to set urlContext tool: %v", errSet)
continue
}
urlContextNodes = append(urlContextNodes, urlToolNode)
}
}
if hasTool {
out, _ = sjson.SetRawBytes(out, "tools", []byte("[]"))
out, _ = sjson.SetRawBytes(out, "tools.0", toolNode)
if hasFunction || len(googleSearchNodes) > 0 || len(codeExecutionNodes) > 0 || len(urlContextNodes) > 0 {
toolsNode := []byte("[]")
if hasFunction {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", functionToolNode)
}
for _, googleNode := range googleSearchNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", googleNode)
}
for _, codeNode := range codeExecutionNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", codeNode)
}
for _, urlNode := range urlContextNodes {
toolsNode, _ = sjson.SetRawBytes(toolsNode, "-1", urlNode)
}
out, _ = sjson.SetRawBytes(out, "tools", toolsNode)
}
}

View File

@@ -21,7 +21,8 @@ import (
// convertGeminiResponseToOpenAIChatParams holds parameters for response conversion.
type convertGeminiResponseToOpenAIChatParams struct {
UnixTimestamp int64
FunctionIndex int
// FunctionIndex tracks tool call indices per candidate index to support multiple candidates.
FunctionIndex map[int]int
}
// functionCallIDCounter provides a process-wide unique counter for function call identifiers.
@@ -42,13 +43,20 @@ var functionCallIDCounter uint64
// Returns:
// - []string: A slice of strings, each containing an OpenAI-compatible JSON response
func ConvertGeminiResponseToOpenAI(_ context.Context, _ string, originalRequestRawJSON, requestRawJSON, rawJSON []byte, param *any) []string {
// Initialize parameters if nil.
if *param == nil {
*param = &convertGeminiResponseToOpenAIChatParams{
UnixTimestamp: 0,
FunctionIndex: 0,
FunctionIndex: make(map[int]int),
}
}
// Ensure the Map is initialized (handling cases where param might be reused from older context).
p := (*param).(*convertGeminiResponseToOpenAIChatParams)
if p.FunctionIndex == nil {
p.FunctionIndex = make(map[int]int)
}
if bytes.HasPrefix(rawJSON, []byte("data:")) {
rawJSON = bytes.TrimSpace(rawJSON[5:])
}
@@ -57,151 +65,179 @@ func ConvertGeminiResponseToOpenAI(_ context.Context, _ string, originalRequestR
return []string{}
}
// Initialize the OpenAI SSE template.
template := `{"id":"","object":"chat.completion.chunk","created":12345,"model":"model","choices":[{"index":0,"delta":{"role":null,"content":null,"reasoning_content":null,"tool_calls":null},"finish_reason":null,"native_finish_reason":null}]}`
// Initialize the OpenAI SSE base template.
// We use a base template and clone it for each candidate to support multiple candidates.
baseTemplate := `{"id":"","object":"chat.completion.chunk","created":12345,"model":"model","choices":[{"index":0,"delta":{"role":null,"content":null,"reasoning_content":null,"tool_calls":null},"finish_reason":null,"native_finish_reason":null}]}`
// Extract and set the model version.
if modelVersionResult := gjson.GetBytes(rawJSON, "modelVersion"); modelVersionResult.Exists() {
template, _ = sjson.Set(template, "model", modelVersionResult.String())
baseTemplate, _ = sjson.Set(baseTemplate, "model", modelVersionResult.String())
}
// Extract and set the creation timestamp.
if createTimeResult := gjson.GetBytes(rawJSON, "createTime"); createTimeResult.Exists() {
t, err := time.Parse(time.RFC3339Nano, createTimeResult.String())
if err == nil {
(*param).(*convertGeminiResponseToOpenAIChatParams).UnixTimestamp = t.Unix()
p.UnixTimestamp = t.Unix()
}
template, _ = sjson.Set(template, "created", (*param).(*convertGeminiResponseToOpenAIChatParams).UnixTimestamp)
baseTemplate, _ = sjson.Set(baseTemplate, "created", p.UnixTimestamp)
} else {
template, _ = sjson.Set(template, "created", (*param).(*convertGeminiResponseToOpenAIChatParams).UnixTimestamp)
baseTemplate, _ = sjson.Set(baseTemplate, "created", p.UnixTimestamp)
}
// Extract and set the response ID.
if responseIDResult := gjson.GetBytes(rawJSON, "responseId"); responseIDResult.Exists() {
template, _ = sjson.Set(template, "id", responseIDResult.String())
}
// Extract and set the finish reason.
if finishReasonResult := gjson.GetBytes(rawJSON, "candidates.0.finishReason"); finishReasonResult.Exists() {
template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
baseTemplate, _ = sjson.Set(baseTemplate, "id", responseIDResult.String())
}
// Extract and set usage metadata (token counts).
// Usage is applied to the base template so it appears in the chunks.
if usageResult := gjson.GetBytes(rawJSON, "usageMetadata"); usageResult.Exists() {
cachedTokenCount := usageResult.Get("cachedContentTokenCount").Int()
if candidatesTokenCountResult := usageResult.Get("candidatesTokenCount"); candidatesTokenCountResult.Exists() {
template, _ = sjson.Set(template, "usage.completion_tokens", candidatesTokenCountResult.Int())
baseTemplate, _ = sjson.Set(baseTemplate, "usage.completion_tokens", candidatesTokenCountResult.Int())
}
if totalTokenCountResult := usageResult.Get("totalTokenCount"); totalTokenCountResult.Exists() {
template, _ = sjson.Set(template, "usage.total_tokens", totalTokenCountResult.Int())
baseTemplate, _ = sjson.Set(baseTemplate, "usage.total_tokens", totalTokenCountResult.Int())
}
promptTokenCount := usageResult.Get("promptTokenCount").Int() - cachedTokenCount
thoughtsTokenCount := usageResult.Get("thoughtsTokenCount").Int()
template, _ = sjson.Set(template, "usage.prompt_tokens", promptTokenCount+thoughtsTokenCount)
baseTemplate, _ = sjson.Set(baseTemplate, "usage.prompt_tokens", promptTokenCount+thoughtsTokenCount)
if thoughtsTokenCount > 0 {
template, _ = sjson.Set(template, "usage.completion_tokens_details.reasoning_tokens", thoughtsTokenCount)
baseTemplate, _ = sjson.Set(baseTemplate, "usage.completion_tokens_details.reasoning_tokens", thoughtsTokenCount)
}
// Include cached token count if present (indicates prompt caching is working)
if cachedTokenCount > 0 {
var err error
template, err = sjson.Set(template, "usage.prompt_tokens_details.cached_tokens", cachedTokenCount)
baseTemplate, err = sjson.Set(baseTemplate, "usage.prompt_tokens_details.cached_tokens", cachedTokenCount)
if err != nil {
log.Warnf("gemini openai response: failed to set cached_tokens in streaming: %v", err)
}
}
}
// Process the main content part of the response.
partsResult := gjson.GetBytes(rawJSON, "candidates.0.content.parts")
hasFunctionCall := false
if partsResult.IsArray() {
partResults := partsResult.Array()
for i := 0; i < len(partResults); i++ {
partResult := partResults[i]
partTextResult := partResult.Get("text")
functionCallResult := partResult.Get("functionCall")
inlineDataResult := partResult.Get("inlineData")
if !inlineDataResult.Exists() {
inlineDataResult = partResult.Get("inline_data")
}
thoughtSignatureResult := partResult.Get("thoughtSignature")
if !thoughtSignatureResult.Exists() {
thoughtSignatureResult = partResult.Get("thought_signature")
var responseStrings []string
candidates := gjson.GetBytes(rawJSON, "candidates")
// Iterate over all candidates to support candidate_count > 1.
if candidates.IsArray() {
candidates.ForEach(func(_, candidate gjson.Result) bool {
// Clone the template for the current candidate.
template := baseTemplate
// Set the specific index for this candidate.
candidateIndex := int(candidate.Get("index").Int())
template, _ = sjson.Set(template, "choices.0.index", candidateIndex)
// Extract and set the finish reason.
if finishReasonResult := candidate.Get("finishReason"); finishReasonResult.Exists() {
template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
}
hasThoughtSignature := thoughtSignatureResult.Exists() && thoughtSignatureResult.String() != ""
hasContentPayload := partTextResult.Exists() || functionCallResult.Exists() || inlineDataResult.Exists()
partsResult := candidate.Get("content.parts")
hasFunctionCall := false
// Skip pure thoughtSignature parts but keep any actual payload in the same part.
if hasThoughtSignature && !hasContentPayload {
continue
if partsResult.IsArray() {
partResults := partsResult.Array()
for i := 0; i < len(partResults); i++ {
partResult := partResults[i]
partTextResult := partResult.Get("text")
functionCallResult := partResult.Get("functionCall")
inlineDataResult := partResult.Get("inlineData")
if !inlineDataResult.Exists() {
inlineDataResult = partResult.Get("inline_data")
}
thoughtSignatureResult := partResult.Get("thoughtSignature")
if !thoughtSignatureResult.Exists() {
thoughtSignatureResult = partResult.Get("thought_signature")
}
hasThoughtSignature := thoughtSignatureResult.Exists() && thoughtSignatureResult.String() != ""
hasContentPayload := partTextResult.Exists() || functionCallResult.Exists() || inlineDataResult.Exists()
// Skip pure thoughtSignature parts but keep any actual payload in the same part.
if hasThoughtSignature && !hasContentPayload {
continue
}
if partTextResult.Exists() {
text := partTextResult.String()
// Handle text content, distinguishing between regular content and reasoning/thoughts.
if partResult.Get("thought").Bool() {
template, _ = sjson.Set(template, "choices.0.delta.reasoning_content", text)
} else {
template, _ = sjson.Set(template, "choices.0.delta.content", text)
}
template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
} else if functionCallResult.Exists() {
// Handle function call content.
hasFunctionCall = true
toolCallsResult := gjson.Get(template, "choices.0.delta.tool_calls")
// Retrieve the function index for this specific candidate.
functionCallIndex := p.FunctionIndex[candidateIndex]
p.FunctionIndex[candidateIndex]++
if toolCallsResult.Exists() && toolCallsResult.IsArray() {
functionCallIndex = len(toolCallsResult.Array())
} else {
template, _ = sjson.SetRaw(template, "choices.0.delta.tool_calls", `[]`)
}
functionCallTemplate := `{"id": "","index": 0,"type": "function","function": {"name": "","arguments": ""}}`
fcName := functionCallResult.Get("name").String()
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "id", fmt.Sprintf("%s-%d-%d", fcName, time.Now().UnixNano(), atomic.AddUint64(&functionCallIDCounter, 1)))
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "index", functionCallIndex)
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "function.name", fcName)
if fcArgsResult := functionCallResult.Get("args"); fcArgsResult.Exists() {
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "function.arguments", fcArgsResult.Raw)
}
template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
template, _ = sjson.SetRaw(template, "choices.0.delta.tool_calls.-1", functionCallTemplate)
} else if inlineDataResult.Exists() {
data := inlineDataResult.Get("data").String()
if data == "" {
continue
}
mimeType := inlineDataResult.Get("mimeType").String()
if mimeType == "" {
mimeType = inlineDataResult.Get("mime_type").String()
}
if mimeType == "" {
mimeType = "image/png"
}
imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
imagesResult := gjson.Get(template, "choices.0.delta.images")
if !imagesResult.Exists() || !imagesResult.IsArray() {
template, _ = sjson.SetRaw(template, "choices.0.delta.images", `[]`)
}
imageIndex := len(gjson.Get(template, "choices.0.delta.images").Array())
imagePayload := `{"type":"image_url","image_url":{"url":""}}`
imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
template, _ = sjson.SetRaw(template, "choices.0.delta.images.-1", imagePayload)
}
}
}
if partTextResult.Exists() {
text := partTextResult.String()
// Handle text content, distinguishing between regular content and reasoning/thoughts.
if partResult.Get("thought").Bool() {
template, _ = sjson.Set(template, "choices.0.delta.reasoning_content", text)
} else {
template, _ = sjson.Set(template, "choices.0.delta.content", text)
}
template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
} else if functionCallResult.Exists() {
// Handle function call content.
hasFunctionCall = true
toolCallsResult := gjson.Get(template, "choices.0.delta.tool_calls")
functionCallIndex := (*param).(*convertGeminiResponseToOpenAIChatParams).FunctionIndex
(*param).(*convertGeminiResponseToOpenAIChatParams).FunctionIndex++
if toolCallsResult.Exists() && toolCallsResult.IsArray() {
functionCallIndex = len(toolCallsResult.Array())
} else {
template, _ = sjson.SetRaw(template, "choices.0.delta.tool_calls", `[]`)
}
functionCallTemplate := `{"id": "","index": 0,"type": "function","function": {"name": "","arguments": ""}}`
fcName := functionCallResult.Get("name").String()
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "id", fmt.Sprintf("%s-%d-%d", fcName, time.Now().UnixNano(), atomic.AddUint64(&functionCallIDCounter, 1)))
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "index", functionCallIndex)
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "function.name", fcName)
if fcArgsResult := functionCallResult.Get("args"); fcArgsResult.Exists() {
functionCallTemplate, _ = sjson.Set(functionCallTemplate, "function.arguments", fcArgsResult.Raw)
}
template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
template, _ = sjson.SetRaw(template, "choices.0.delta.tool_calls.-1", functionCallTemplate)
} else if inlineDataResult.Exists() {
data := inlineDataResult.Get("data").String()
if data == "" {
continue
}
mimeType := inlineDataResult.Get("mimeType").String()
if mimeType == "" {
mimeType = inlineDataResult.Get("mime_type").String()
}
if mimeType == "" {
mimeType = "image/png"
}
imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
imagesResult := gjson.Get(template, "choices.0.delta.images")
if !imagesResult.Exists() || !imagesResult.IsArray() {
template, _ = sjson.SetRaw(template, "choices.0.delta.images", `[]`)
}
imageIndex := len(gjson.Get(template, "choices.0.delta.images").Array())
imagePayload := `{"type":"image_url","image_url":{"url":""}}`
imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
template, _ = sjson.Set(template, "choices.0.delta.role", "assistant")
template, _ = sjson.SetRaw(template, "choices.0.delta.images.-1", imagePayload)
if hasFunctionCall {
template, _ = sjson.Set(template, "choices.0.finish_reason", "tool_calls")
template, _ = sjson.Set(template, "choices.0.native_finish_reason", "tool_calls")
}
responseStrings = append(responseStrings, template)
return true // continue loop
})
} else {
// If there are no candidates (e.g., a pure usageMetadata chunk), return the usage chunk if present.
if gjson.GetBytes(rawJSON, "usageMetadata").Exists() && len(responseStrings) == 0 {
responseStrings = append(responseStrings, baseTemplate)
}
}
if hasFunctionCall {
template, _ = sjson.Set(template, "choices.0.finish_reason", "tool_calls")
template, _ = sjson.Set(template, "choices.0.native_finish_reason", "tool_calls")
}
return []string{template}
return responseStrings
}
// ConvertGeminiResponseToOpenAINonStream converts a non-streaming Gemini response to a non-streaming OpenAI response.
@@ -219,7 +255,9 @@ func ConvertGeminiResponseToOpenAI(_ context.Context, _ string, originalRequestR
// - string: An OpenAI-compatible JSON response containing all message content and metadata
func ConvertGeminiResponseToOpenAINonStream(_ context.Context, _ string, originalRequestRawJSON, requestRawJSON, rawJSON []byte, _ *any) string {
var unixTimestamp int64
template := `{"id":"","object":"chat.completion","created":123456,"model":"model","choices":[{"index":0,"message":{"role":"assistant","content":null,"reasoning_content":null,"tool_calls":null},"finish_reason":null,"native_finish_reason":null}]}`
// Initialize template with an empty choices array to support multiple candidates.
template := `{"id":"","object":"chat.completion","created":123456,"model":"model","choices":[]}`
if modelVersionResult := gjson.GetBytes(rawJSON, "modelVersion"); modelVersionResult.Exists() {
template, _ = sjson.Set(template, "model", modelVersionResult.String())
}
@@ -238,11 +276,6 @@ func ConvertGeminiResponseToOpenAINonStream(_ context.Context, _ string, origina
template, _ = sjson.Set(template, "id", responseIDResult.String())
}
if finishReasonResult := gjson.GetBytes(rawJSON, "candidates.0.finishReason"); finishReasonResult.Exists() {
template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
}
if usageResult := gjson.GetBytes(rawJSON, "usageMetadata"); usageResult.Exists() {
if candidatesTokenCountResult := usageResult.Get("candidatesTokenCount"); candidatesTokenCountResult.Exists() {
template, _ = sjson.Set(template, "usage.completion_tokens", candidatesTokenCountResult.Int())
@@ -267,74 +300,96 @@ func ConvertGeminiResponseToOpenAINonStream(_ context.Context, _ string, origina
}
}
// Process the main content part of the response.
partsResult := gjson.GetBytes(rawJSON, "candidates.0.content.parts")
hasFunctionCall := false
if partsResult.IsArray() {
partsResults := partsResult.Array()
for i := 0; i < len(partsResults); i++ {
partResult := partsResults[i]
partTextResult := partResult.Get("text")
functionCallResult := partResult.Get("functionCall")
inlineDataResult := partResult.Get("inlineData")
if !inlineDataResult.Exists() {
inlineDataResult = partResult.Get("inline_data")
// Process the main content part of the response for all candidates.
candidates := gjson.GetBytes(rawJSON, "candidates")
if candidates.IsArray() {
candidates.ForEach(func(_, candidate gjson.Result) bool {
// Construct a single Choice object.
choiceTemplate := `{"index":0,"message":{"role":"assistant","content":null,"reasoning_content":null,"tool_calls":null},"finish_reason":null,"native_finish_reason":null}`
// Set the index for this choice.
choiceTemplate, _ = sjson.Set(choiceTemplate, "index", candidate.Get("index").Int())
// Set finish reason.
if finishReasonResult := candidate.Get("finishReason"); finishReasonResult.Exists() {
choiceTemplate, _ = sjson.Set(choiceTemplate, "finish_reason", strings.ToLower(finishReasonResult.String()))
choiceTemplate, _ = sjson.Set(choiceTemplate, "native_finish_reason", strings.ToLower(finishReasonResult.String()))
}
if partTextResult.Exists() {
// Append text content, distinguishing between regular content and reasoning.
if partResult.Get("thought").Bool() {
template, _ = sjson.Set(template, "choices.0.message.reasoning_content", partTextResult.String())
} else {
template, _ = sjson.Set(template, "choices.0.message.content", partTextResult.String())
}
template, _ = sjson.Set(template, "choices.0.message.role", "assistant")
} else if functionCallResult.Exists() {
// Append function call content to the tool_calls array.
hasFunctionCall = true
toolCallsResult := gjson.Get(template, "choices.0.message.tool_calls")
if !toolCallsResult.Exists() || !toolCallsResult.IsArray() {
template, _ = sjson.SetRaw(template, "choices.0.message.tool_calls", `[]`)
}
functionCallItemTemplate := `{"id": "","type": "function","function": {"name": "","arguments": ""}}`
fcName := functionCallResult.Get("name").String()
functionCallItemTemplate, _ = sjson.Set(functionCallItemTemplate, "id", fmt.Sprintf("%s-%d-%d", fcName, time.Now().UnixNano(), atomic.AddUint64(&functionCallIDCounter, 1)))
functionCallItemTemplate, _ = sjson.Set(functionCallItemTemplate, "function.name", fcName)
if fcArgsResult := functionCallResult.Get("args"); fcArgsResult.Exists() {
functionCallItemTemplate, _ = sjson.Set(functionCallItemTemplate, "function.arguments", fcArgsResult.Raw)
}
template, _ = sjson.Set(template, "choices.0.message.role", "assistant")
template, _ = sjson.SetRaw(template, "choices.0.message.tool_calls.-1", functionCallItemTemplate)
} else if inlineDataResult.Exists() {
data := inlineDataResult.Get("data").String()
if data == "" {
continue
}
mimeType := inlineDataResult.Get("mimeType").String()
if mimeType == "" {
mimeType = inlineDataResult.Get("mime_type").String()
}
if mimeType == "" {
mimeType = "image/png"
}
imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
imagesResult := gjson.Get(template, "choices.0.message.images")
if !imagesResult.Exists() || !imagesResult.IsArray() {
template, _ = sjson.SetRaw(template, "choices.0.message.images", `[]`)
}
imageIndex := len(gjson.Get(template, "choices.0.message.images").Array())
imagePayload := `{"type":"image_url","image_url":{"url":""}}`
imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
template, _ = sjson.Set(template, "choices.0.message.role", "assistant")
template, _ = sjson.SetRaw(template, "choices.0.message.images.-1", imagePayload)
}
}
}
partsResult := candidate.Get("content.parts")
hasFunctionCall := false
if partsResult.IsArray() {
partsResults := partsResult.Array()
for i := 0; i < len(partsResults); i++ {
partResult := partsResults[i]
partTextResult := partResult.Get("text")
functionCallResult := partResult.Get("functionCall")
inlineDataResult := partResult.Get("inlineData")
if !inlineDataResult.Exists() {
inlineDataResult = partResult.Get("inline_data")
}
if hasFunctionCall {
template, _ = sjson.Set(template, "choices.0.finish_reason", "tool_calls")
template, _ = sjson.Set(template, "choices.0.native_finish_reason", "tool_calls")
if partTextResult.Exists() {
// Append text content, distinguishing between regular content and reasoning.
if partResult.Get("thought").Bool() {
oldVal := gjson.Get(choiceTemplate, "message.reasoning_content").String()
choiceTemplate, _ = sjson.Set(choiceTemplate, "message.reasoning_content", oldVal+partTextResult.String())
} else {
oldVal := gjson.Get(choiceTemplate, "message.content").String()
choiceTemplate, _ = sjson.Set(choiceTemplate, "message.content", oldVal+partTextResult.String())
}
choiceTemplate, _ = sjson.Set(choiceTemplate, "message.role", "assistant")
} else if functionCallResult.Exists() {
// Append function call content to the tool_calls array.
hasFunctionCall = true
toolCallsResult := gjson.Get(choiceTemplate, "message.tool_calls")
if !toolCallsResult.Exists() || !toolCallsResult.IsArray() {
choiceTemplate, _ = sjson.SetRaw(choiceTemplate, "message.tool_calls", `[]`)
}
functionCallItemTemplate := `{"id": "","type": "function","function": {"name": "","arguments": ""}}`
fcName := functionCallResult.Get("name").String()
functionCallItemTemplate, _ = sjson.Set(functionCallItemTemplate, "id", fmt.Sprintf("%s-%d-%d", fcName, time.Now().UnixNano(), atomic.AddUint64(&functionCallIDCounter, 1)))
functionCallItemTemplate, _ = sjson.Set(functionCallItemTemplate, "function.name", fcName)
if fcArgsResult := functionCallResult.Get("args"); fcArgsResult.Exists() {
functionCallItemTemplate, _ = sjson.Set(functionCallItemTemplate, "function.arguments", fcArgsResult.Raw)
}
choiceTemplate, _ = sjson.Set(choiceTemplate, "message.role", "assistant")
choiceTemplate, _ = sjson.SetRaw(choiceTemplate, "message.tool_calls.-1", functionCallItemTemplate)
} else if inlineDataResult.Exists() {
data := inlineDataResult.Get("data").String()
if data != "" {
mimeType := inlineDataResult.Get("mimeType").String()
if mimeType == "" {
mimeType = inlineDataResult.Get("mime_type").String()
}
if mimeType == "" {
mimeType = "image/png"
}
imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, data)
imagesResult := gjson.Get(choiceTemplate, "message.images")
if !imagesResult.Exists() || !imagesResult.IsArray() {
choiceTemplate, _ = sjson.SetRaw(choiceTemplate, "message.images", `[]`)
}
imageIndex := len(gjson.Get(choiceTemplate, "message.images").Array())
imagePayload := `{"type":"image_url","image_url":{"url":""}}`
imagePayload, _ = sjson.Set(imagePayload, "index", imageIndex)
imagePayload, _ = sjson.Set(imagePayload, "image_url.url", imageURL)
choiceTemplate, _ = sjson.Set(choiceTemplate, "message.role", "assistant")
choiceTemplate, _ = sjson.SetRaw(choiceTemplate, "message.images.-1", imagePayload)
}
}
}
}
if hasFunctionCall {
choiceTemplate, _ = sjson.Set(choiceTemplate, "finish_reason", "tool_calls")
choiceTemplate, _ = sjson.Set(choiceTemplate, "native_finish_reason", "tool_calls")
}
// Append the constructed choice to the main choices array.
template, _ = sjson.SetRaw(template, "choices.-1", choiceTemplate)
return true
})
}
return template

View File

@@ -298,6 +298,15 @@ func ConvertOpenAIResponsesRequestToGemini(modelName string, inputRawJSON []byte
}
functionContent, _ = sjson.SetRaw(functionContent, "parts.-1", functionResponse)
out, _ = sjson.SetRaw(out, "contents.-1", functionContent)
case "reasoning":
thoughtContent := `{"role":"model","parts":[]}`
thought := `{"text":"","thoughtSignature":"","thought":true}`
thought, _ = sjson.Set(thought, "text", item.Get("summary.0.text").String())
thought, _ = sjson.Set(thought, "thoughtSignature", item.Get("encrypted_content").String())
thoughtContent, _ = sjson.SetRaw(thoughtContent, "parts.-1", thought)
out, _ = sjson.SetRaw(out, "contents.-1", thoughtContent)
}
}
} else if input.Exists() && input.Type == gjson.String {

View File

@@ -20,6 +20,7 @@ type geminiToResponsesState struct {
// message aggregation
MsgOpened bool
MsgClosed bool
MsgIndex int
CurrentMsgID string
TextBuf strings.Builder
@@ -29,6 +30,7 @@ type geminiToResponsesState struct {
ReasoningOpened bool
ReasoningIndex int
ReasoningItemID string
ReasoningEnc string
ReasoningBuf strings.Builder
ReasoningClosed bool
@@ -37,6 +39,7 @@ type geminiToResponsesState struct {
FuncArgsBuf map[int]*strings.Builder
FuncNames map[int]string
FuncCallIDs map[int]string
FuncDone map[int]bool
}
// responseIDCounter provides a process-wide unique counter for synthesized response identifiers.
@@ -45,6 +48,39 @@ var responseIDCounter uint64
// funcCallIDCounter provides a process-wide unique counter for function call identifiers.
var funcCallIDCounter uint64
func pickRequestJSON(originalRequestRawJSON, requestRawJSON []byte) []byte {
if len(originalRequestRawJSON) > 0 && gjson.ValidBytes(originalRequestRawJSON) {
return originalRequestRawJSON
}
if len(requestRawJSON) > 0 && gjson.ValidBytes(requestRawJSON) {
return requestRawJSON
}
return nil
}
func unwrapRequestRoot(root gjson.Result) gjson.Result {
req := root.Get("request")
if !req.Exists() {
return root
}
if req.Get("model").Exists() || req.Get("input").Exists() || req.Get("instructions").Exists() {
return req
}
return root
}
func unwrapGeminiResponseRoot(root gjson.Result) gjson.Result {
resp := root.Get("response")
if !resp.Exists() {
return root
}
// Vertex-style Gemini responses wrap the actual payload in a "response" object.
if resp.Get("candidates").Exists() || resp.Get("responseId").Exists() || resp.Get("usageMetadata").Exists() {
return resp
}
return root
}
func emitEvent(event string, payload string) string {
return fmt.Sprintf("event: %s\ndata: %s", event, payload)
}
@@ -56,18 +92,37 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
FuncArgsBuf: make(map[int]*strings.Builder),
FuncNames: make(map[int]string),
FuncCallIDs: make(map[int]string),
FuncDone: make(map[int]bool),
}
}
st := (*param).(*geminiToResponsesState)
if st.FuncArgsBuf == nil {
st.FuncArgsBuf = make(map[int]*strings.Builder)
}
if st.FuncNames == nil {
st.FuncNames = make(map[int]string)
}
if st.FuncCallIDs == nil {
st.FuncCallIDs = make(map[int]string)
}
if st.FuncDone == nil {
st.FuncDone = make(map[int]bool)
}
if bytes.HasPrefix(rawJSON, []byte("data:")) {
rawJSON = bytes.TrimSpace(rawJSON[5:])
}
rawJSON = bytes.TrimSpace(rawJSON)
if len(rawJSON) == 0 || bytes.Equal(rawJSON, []byte("[DONE]")) {
return []string{}
}
root := gjson.ParseBytes(rawJSON)
if !root.Exists() {
return []string{}
}
root = unwrapGeminiResponseRoot(root)
var out []string
nextSeq := func() int { st.Seq++; return st.Seq }
@@ -98,19 +153,54 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
itemDone, _ = sjson.Set(itemDone, "sequence_number", nextSeq())
itemDone, _ = sjson.Set(itemDone, "item.id", st.ReasoningItemID)
itemDone, _ = sjson.Set(itemDone, "output_index", st.ReasoningIndex)
itemDone, _ = sjson.Set(itemDone, "item.encrypted_content", st.ReasoningEnc)
itemDone, _ = sjson.Set(itemDone, "item.summary.0.text", full)
out = append(out, emitEvent("response.output_item.done", itemDone))
st.ReasoningClosed = true
}
// Helper to finalize the assistant message in correct order.
// It emits response.output_text.done, response.content_part.done,
// and response.output_item.done exactly once.
finalizeMessage := func() {
if !st.MsgOpened || st.MsgClosed {
return
}
fullText := st.ItemTextBuf.String()
done := `{"type":"response.output_text.done","sequence_number":0,"item_id":"","output_index":0,"content_index":0,"text":"","logprobs":[]}`
done, _ = sjson.Set(done, "sequence_number", nextSeq())
done, _ = sjson.Set(done, "item_id", st.CurrentMsgID)
done, _ = sjson.Set(done, "output_index", st.MsgIndex)
done, _ = sjson.Set(done, "text", fullText)
out = append(out, emitEvent("response.output_text.done", done))
partDone := `{"type":"response.content_part.done","sequence_number":0,"item_id":"","output_index":0,"content_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":""}}`
partDone, _ = sjson.Set(partDone, "sequence_number", nextSeq())
partDone, _ = sjson.Set(partDone, "item_id", st.CurrentMsgID)
partDone, _ = sjson.Set(partDone, "output_index", st.MsgIndex)
partDone, _ = sjson.Set(partDone, "part.text", fullText)
out = append(out, emitEvent("response.content_part.done", partDone))
final := `{"type":"response.output_item.done","sequence_number":0,"output_index":0,"item":{"id":"","type":"message","status":"completed","content":[{"type":"output_text","text":""}],"role":"assistant"}}`
final, _ = sjson.Set(final, "sequence_number", nextSeq())
final, _ = sjson.Set(final, "output_index", st.MsgIndex)
final, _ = sjson.Set(final, "item.id", st.CurrentMsgID)
final, _ = sjson.Set(final, "item.content.0.text", fullText)
out = append(out, emitEvent("response.output_item.done", final))
st.MsgClosed = true
}
// Initialize per-response fields and emit created/in_progress once
if !st.Started {
if v := root.Get("responseId"); v.Exists() {
st.ResponseID = v.String()
st.ResponseID = root.Get("responseId").String()
if st.ResponseID == "" {
st.ResponseID = fmt.Sprintf("resp_%x_%d", time.Now().UnixNano(), atomic.AddUint64(&responseIDCounter, 1))
}
if !strings.HasPrefix(st.ResponseID, "resp_") {
st.ResponseID = fmt.Sprintf("resp_%s", st.ResponseID)
}
if v := root.Get("createTime"); v.Exists() {
if t, err := time.Parse(time.RFC3339Nano, v.String()); err == nil {
if t, errParseCreateTime := time.Parse(time.RFC3339Nano, v.String()); errParseCreateTime == nil {
st.CreatedAt = t.Unix()
}
}
@@ -143,15 +233,21 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
// Ignore any late thought chunks after reasoning is finalized.
return true
}
if sig := part.Get("thoughtSignature"); sig.Exists() && sig.String() != "" && sig.String() != geminiResponsesThoughtSignature {
st.ReasoningEnc = sig.String()
} else if sig = part.Get("thought_signature"); sig.Exists() && sig.String() != "" && sig.String() != geminiResponsesThoughtSignature {
st.ReasoningEnc = sig.String()
}
if !st.ReasoningOpened {
st.ReasoningOpened = true
st.ReasoningIndex = st.NextIndex
st.NextIndex++
st.ReasoningItemID = fmt.Sprintf("rs_%s_%d", st.ResponseID, st.ReasoningIndex)
item := `{"type":"response.output_item.added","sequence_number":0,"output_index":0,"item":{"id":"","type":"reasoning","status":"in_progress","summary":[]}}`
item := `{"type":"response.output_item.added","sequence_number":0,"output_index":0,"item":{"id":"","type":"reasoning","status":"in_progress","encrypted_content":"","summary":[]}}`
item, _ = sjson.Set(item, "sequence_number", nextSeq())
item, _ = sjson.Set(item, "output_index", st.ReasoningIndex)
item, _ = sjson.Set(item, "item.id", st.ReasoningItemID)
item, _ = sjson.Set(item, "item.encrypted_content", st.ReasoningEnc)
out = append(out, emitEvent("response.output_item.added", item))
partAdded := `{"type":"response.reasoning_summary_part.added","sequence_number":0,"item_id":"","output_index":0,"summary_index":0,"part":{"type":"summary_text","text":""}}`
partAdded, _ = sjson.Set(partAdded, "sequence_number", nextSeq())
@@ -191,9 +287,9 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
partAdded, _ = sjson.Set(partAdded, "output_index", st.MsgIndex)
out = append(out, emitEvent("response.content_part.added", partAdded))
st.ItemTextBuf.Reset()
st.ItemTextBuf.WriteString(t.String())
}
st.TextBuf.WriteString(t.String())
st.ItemTextBuf.WriteString(t.String())
msg := `{"type":"response.output_text.delta","sequence_number":0,"item_id":"","output_index":0,"content_index":0,"delta":"","logprobs":[]}`
msg, _ = sjson.Set(msg, "sequence_number", nextSeq())
msg, _ = sjson.Set(msg, "item_id", st.CurrentMsgID)
@@ -205,8 +301,10 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
// Function call
if fc := part.Get("functionCall"); fc.Exists() {
// Before emitting function-call outputs, finalize reasoning if open.
// Before emitting function-call outputs, finalize reasoning and the message (if open).
// Responses streaming requires message done events before the next output_item.added.
finalizeReasoning()
finalizeMessage()
name := fc.Get("name").String()
idx := st.NextIndex
st.NextIndex++
@@ -219,6 +317,14 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
}
st.FuncNames[idx] = name
argsJSON := "{}"
if args := fc.Get("args"); args.Exists() {
argsJSON = args.Raw
}
if st.FuncArgsBuf[idx].Len() == 0 && argsJSON != "" {
st.FuncArgsBuf[idx].WriteString(argsJSON)
}
// Emit item.added for function call
item := `{"type":"response.output_item.added","sequence_number":0,"output_index":0,"item":{"id":"","type":"function_call","status":"in_progress","arguments":"","call_id":"","name":""}}`
item, _ = sjson.Set(item, "sequence_number", nextSeq())
@@ -228,10 +334,9 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
item, _ = sjson.Set(item, "item.name", name)
out = append(out, emitEvent("response.output_item.added", item))
// Emit arguments delta (full args in one chunk)
if args := fc.Get("args"); args.Exists() {
argsJSON := args.Raw
st.FuncArgsBuf[idx].WriteString(argsJSON)
// Emit arguments delta (full args in one chunk).
// When Gemini omits args, emit "{}" to keep Responses streaming event order consistent.
if argsJSON != "" {
ad := `{"type":"response.function_call_arguments.delta","sequence_number":0,"item_id":"","output_index":0,"delta":""}`
ad, _ = sjson.Set(ad, "sequence_number", nextSeq())
ad, _ = sjson.Set(ad, "item_id", fmt.Sprintf("fc_%s", st.FuncCallIDs[idx]))
@@ -240,6 +345,27 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
out = append(out, emitEvent("response.function_call_arguments.delta", ad))
}
// Gemini emits the full function call payload at once, so we can finalize it immediately.
if !st.FuncDone[idx] {
fcDone := `{"type":"response.function_call_arguments.done","sequence_number":0,"item_id":"","output_index":0,"arguments":""}`
fcDone, _ = sjson.Set(fcDone, "sequence_number", nextSeq())
fcDone, _ = sjson.Set(fcDone, "item_id", fmt.Sprintf("fc_%s", st.FuncCallIDs[idx]))
fcDone, _ = sjson.Set(fcDone, "output_index", idx)
fcDone, _ = sjson.Set(fcDone, "arguments", argsJSON)
out = append(out, emitEvent("response.function_call_arguments.done", fcDone))
itemDone := `{"type":"response.output_item.done","sequence_number":0,"output_index":0,"item":{"id":"","type":"function_call","status":"completed","arguments":"","call_id":"","name":""}}`
itemDone, _ = sjson.Set(itemDone, "sequence_number", nextSeq())
itemDone, _ = sjson.Set(itemDone, "output_index", idx)
itemDone, _ = sjson.Set(itemDone, "item.id", fmt.Sprintf("fc_%s", st.FuncCallIDs[idx]))
itemDone, _ = sjson.Set(itemDone, "item.arguments", argsJSON)
itemDone, _ = sjson.Set(itemDone, "item.call_id", st.FuncCallIDs[idx])
itemDone, _ = sjson.Set(itemDone, "item.name", st.FuncNames[idx])
out = append(out, emitEvent("response.output_item.done", itemDone))
st.FuncDone[idx] = true
}
return true
}
@@ -251,28 +377,7 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
if fr := root.Get("candidates.0.finishReason"); fr.Exists() && fr.String() != "" {
// Finalize reasoning first to keep ordering tight with last delta
finalizeReasoning()
// Close message output if opened
if st.MsgOpened {
fullText := st.ItemTextBuf.String()
done := `{"type":"response.output_text.done","sequence_number":0,"item_id":"","output_index":0,"content_index":0,"text":"","logprobs":[]}`
done, _ = sjson.Set(done, "sequence_number", nextSeq())
done, _ = sjson.Set(done, "item_id", st.CurrentMsgID)
done, _ = sjson.Set(done, "output_index", st.MsgIndex)
done, _ = sjson.Set(done, "text", fullText)
out = append(out, emitEvent("response.output_text.done", done))
partDone := `{"type":"response.content_part.done","sequence_number":0,"item_id":"","output_index":0,"content_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":""}}`
partDone, _ = sjson.Set(partDone, "sequence_number", nextSeq())
partDone, _ = sjson.Set(partDone, "item_id", st.CurrentMsgID)
partDone, _ = sjson.Set(partDone, "output_index", st.MsgIndex)
partDone, _ = sjson.Set(partDone, "part.text", fullText)
out = append(out, emitEvent("response.content_part.done", partDone))
final := `{"type":"response.output_item.done","sequence_number":0,"output_index":0,"item":{"id":"","type":"message","status":"completed","content":[{"type":"output_text","text":""}],"role":"assistant"}}`
final, _ = sjson.Set(final, "sequence_number", nextSeq())
final, _ = sjson.Set(final, "output_index", st.MsgIndex)
final, _ = sjson.Set(final, "item.id", st.CurrentMsgID)
final, _ = sjson.Set(final, "item.content.0.text", fullText)
out = append(out, emitEvent("response.output_item.done", final))
}
finalizeMessage()
// Close function calls
if len(st.FuncArgsBuf) > 0 {
@@ -289,6 +394,9 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
}
}
for _, idx := range idxs {
if st.FuncDone[idx] {
continue
}
args := "{}"
if b := st.FuncArgsBuf[idx]; b != nil && b.Len() > 0 {
args = b.String()
@@ -308,6 +416,8 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
itemDone, _ = sjson.Set(itemDone, "item.call_id", st.FuncCallIDs[idx])
itemDone, _ = sjson.Set(itemDone, "item.name", st.FuncNames[idx])
out = append(out, emitEvent("response.output_item.done", itemDone))
st.FuncDone[idx] = true
}
}
@@ -319,8 +429,8 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
completed, _ = sjson.Set(completed, "response.id", st.ResponseID)
completed, _ = sjson.Set(completed, "response.created_at", st.CreatedAt)
if requestRawJSON != nil {
req := gjson.ParseBytes(requestRawJSON)
if reqJSON := pickRequestJSON(originalRequestRawJSON, requestRawJSON); len(reqJSON) > 0 {
req := unwrapRequestRoot(gjson.ParseBytes(reqJSON))
if v := req.Get("instructions"); v.Exists() {
completed, _ = sjson.Set(completed, "response.instructions", v.String())
}
@@ -383,41 +493,34 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
}
}
// Compose outputs in encountered order: reasoning, message, function_calls
// Compose outputs in output_index order.
outputsWrapper := `{"arr":[]}`
if st.ReasoningOpened {
item := `{"id":"","type":"reasoning","summary":[{"type":"summary_text","text":""}]}`
item, _ = sjson.Set(item, "id", st.ReasoningItemID)
item, _ = sjson.Set(item, "summary.0.text", st.ReasoningBuf.String())
outputsWrapper, _ = sjson.SetRaw(outputsWrapper, "arr.-1", item)
}
if st.MsgOpened {
item := `{"id":"","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":""}],"role":"assistant"}`
item, _ = sjson.Set(item, "id", st.CurrentMsgID)
item, _ = sjson.Set(item, "content.0.text", st.TextBuf.String())
outputsWrapper, _ = sjson.SetRaw(outputsWrapper, "arr.-1", item)
}
if len(st.FuncArgsBuf) > 0 {
idxs := make([]int, 0, len(st.FuncArgsBuf))
for idx := range st.FuncArgsBuf {
idxs = append(idxs, idx)
for idx := 0; idx < st.NextIndex; idx++ {
if st.ReasoningOpened && idx == st.ReasoningIndex {
item := `{"id":"","type":"reasoning","encrypted_content":"","summary":[{"type":"summary_text","text":""}]}`
item, _ = sjson.Set(item, "id", st.ReasoningItemID)
item, _ = sjson.Set(item, "encrypted_content", st.ReasoningEnc)
item, _ = sjson.Set(item, "summary.0.text", st.ReasoningBuf.String())
outputsWrapper, _ = sjson.SetRaw(outputsWrapper, "arr.-1", item)
continue
}
for i := 0; i < len(idxs); i++ {
for j := i + 1; j < len(idxs); j++ {
if idxs[j] < idxs[i] {
idxs[i], idxs[j] = idxs[j], idxs[i]
}
}
if st.MsgOpened && idx == st.MsgIndex {
item := `{"id":"","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":""}],"role":"assistant"}`
item, _ = sjson.Set(item, "id", st.CurrentMsgID)
item, _ = sjson.Set(item, "content.0.text", st.TextBuf.String())
outputsWrapper, _ = sjson.SetRaw(outputsWrapper, "arr.-1", item)
continue
}
for _, idx := range idxs {
args := ""
if b := st.FuncArgsBuf[idx]; b != nil {
if callID, ok := st.FuncCallIDs[idx]; ok && callID != "" {
args := "{}"
if b := st.FuncArgsBuf[idx]; b != nil && b.Len() > 0 {
args = b.String()
}
item := `{"id":"","type":"function_call","status":"completed","arguments":"","call_id":"","name":""}`
item, _ = sjson.Set(item, "id", fmt.Sprintf("fc_%s", st.FuncCallIDs[idx]))
item, _ = sjson.Set(item, "id", fmt.Sprintf("fc_%s", callID))
item, _ = sjson.Set(item, "arguments", args)
item, _ = sjson.Set(item, "call_id", st.FuncCallIDs[idx])
item, _ = sjson.Set(item, "call_id", callID)
item, _ = sjson.Set(item, "name", st.FuncNames[idx])
outputsWrapper, _ = sjson.SetRaw(outputsWrapper, "arr.-1", item)
}
@@ -431,8 +534,8 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
// input tokens = prompt + thoughts
input := um.Get("promptTokenCount").Int() + um.Get("thoughtsTokenCount").Int()
completed, _ = sjson.Set(completed, "response.usage.input_tokens", input)
// cached_tokens not provided by Gemini; default to 0 for structure compatibility
completed, _ = sjson.Set(completed, "response.usage.input_tokens_details.cached_tokens", 0)
// cached token details: align with OpenAI "cached_tokens" semantics.
completed, _ = sjson.Set(completed, "response.usage.input_tokens_details.cached_tokens", um.Get("cachedContentTokenCount").Int())
// output tokens
if v := um.Get("candidatesTokenCount"); v.Exists() {
completed, _ = sjson.Set(completed, "response.usage.output_tokens", v.Int())
@@ -460,6 +563,7 @@ func ConvertGeminiResponseToOpenAIResponses(_ context.Context, modelName string,
// ConvertGeminiResponseToOpenAIResponsesNonStream aggregates Gemini response JSON into a single OpenAI Responses JSON object.
func ConvertGeminiResponseToOpenAIResponsesNonStream(_ context.Context, _ string, originalRequestRawJSON, requestRawJSON, rawJSON []byte, _ *any) string {
root := gjson.ParseBytes(rawJSON)
root = unwrapGeminiResponseRoot(root)
// Base response scaffold
resp := `{"id":"","object":"response","created_at":0,"status":"completed","background":false,"error":null,"incomplete_details":null}`
@@ -478,15 +582,15 @@ func ConvertGeminiResponseToOpenAIResponsesNonStream(_ context.Context, _ string
// created_at: map from createTime if available
createdAt := time.Now().Unix()
if v := root.Get("createTime"); v.Exists() {
if t, err := time.Parse(time.RFC3339Nano, v.String()); err == nil {
if t, errParseCreateTime := time.Parse(time.RFC3339Nano, v.String()); errParseCreateTime == nil {
createdAt = t.Unix()
}
}
resp, _ = sjson.Set(resp, "created_at", createdAt)
// Echo request fields when present; fallback model from response modelVersion
if len(requestRawJSON) > 0 {
req := gjson.ParseBytes(requestRawJSON)
if reqJSON := pickRequestJSON(originalRequestRawJSON, requestRawJSON); len(reqJSON) > 0 {
req := unwrapRequestRoot(gjson.ParseBytes(reqJSON))
if v := req.Get("instructions"); v.Exists() {
resp, _ = sjson.Set(resp, "instructions", v.String())
}
@@ -636,8 +740,8 @@ func ConvertGeminiResponseToOpenAIResponsesNonStream(_ context.Context, _ string
// input tokens = prompt + thoughts
input := um.Get("promptTokenCount").Int() + um.Get("thoughtsTokenCount").Int()
resp, _ = sjson.Set(resp, "usage.input_tokens", input)
// cached_tokens not provided by Gemini; default to 0 for structure compatibility
resp, _ = sjson.Set(resp, "usage.input_tokens_details.cached_tokens", 0)
// cached token details: align with OpenAI "cached_tokens" semantics.
resp, _ = sjson.Set(resp, "usage.input_tokens_details.cached_tokens", um.Get("cachedContentTokenCount").Int())
// output tokens
if v := um.Get("candidatesTokenCount"); v.Exists() {
resp, _ = sjson.Set(resp, "usage.output_tokens", v.Int())

View File

@@ -0,0 +1,353 @@
package responses
import (
"context"
"strings"
"testing"
"github.com/tidwall/gjson"
)
func parseSSEEvent(t *testing.T, chunk string) (string, gjson.Result) {
t.Helper()
lines := strings.Split(chunk, "\n")
if len(lines) < 2 {
t.Fatalf("unexpected SSE chunk: %q", chunk)
}
event := strings.TrimSpace(strings.TrimPrefix(lines[0], "event:"))
dataLine := strings.TrimSpace(strings.TrimPrefix(lines[1], "data:"))
if !gjson.Valid(dataLine) {
t.Fatalf("invalid SSE data JSON: %q", dataLine)
}
return event, gjson.Parse(dataLine)
}
func TestConvertGeminiResponseToOpenAIResponses_UnwrapAndAggregateText(t *testing.T) {
// Vertex-style Gemini stream wraps the actual response payload under "response".
// This test ensures we unwrap and that output_text.done contains the full text.
in := []string{
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":""}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2,"cachedContentTokenCount":0},"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"让"}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2,"cachedContentTokenCount":0},"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"我先"}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2,"cachedContentTokenCount":0},"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"了解"}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2,"cachedContentTokenCount":0},"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"functionCall":{"name":"mcp__serena__list_dir","args":{"recursive":false,"relative_path":"internal"},"id":"toolu_1"}}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2,"cachedContentTokenCount":0},"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":""}]},"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":5,"totalTokenCount":15,"cachedContentTokenCount":2},"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
}
originalReq := []byte(`{"instructions":"test instructions","model":"gpt-5","max_output_tokens":123}`)
var param any
var out []string
for _, line := range in {
out = append(out, ConvertGeminiResponseToOpenAIResponses(context.Background(), "test-model", originalReq, nil, []byte(line), &param)...)
}
var (
gotTextDone bool
gotMessageDone bool
gotResponseDone bool
gotFuncDone bool
textDone string
messageText string
responseID string
instructions string
cachedTokens int64
funcName string
funcArgs string
posTextDone = -1
posPartDone = -1
posMessageDone = -1
posFuncAdded = -1
)
for i, chunk := range out {
ev, data := parseSSEEvent(t, chunk)
switch ev {
case "response.output_text.done":
gotTextDone = true
if posTextDone == -1 {
posTextDone = i
}
textDone = data.Get("text").String()
case "response.content_part.done":
if posPartDone == -1 {
posPartDone = i
}
case "response.output_item.done":
switch data.Get("item.type").String() {
case "message":
gotMessageDone = true
if posMessageDone == -1 {
posMessageDone = i
}
messageText = data.Get("item.content.0.text").String()
case "function_call":
gotFuncDone = true
funcName = data.Get("item.name").String()
funcArgs = data.Get("item.arguments").String()
}
case "response.output_item.added":
if data.Get("item.type").String() == "function_call" && posFuncAdded == -1 {
posFuncAdded = i
}
case "response.completed":
gotResponseDone = true
responseID = data.Get("response.id").String()
instructions = data.Get("response.instructions").String()
cachedTokens = data.Get("response.usage.input_tokens_details.cached_tokens").Int()
}
}
if !gotTextDone {
t.Fatalf("missing response.output_text.done event")
}
if posTextDone == -1 || posPartDone == -1 || posMessageDone == -1 || posFuncAdded == -1 {
t.Fatalf("missing ordering events: textDone=%d partDone=%d messageDone=%d funcAdded=%d", posTextDone, posPartDone, posMessageDone, posFuncAdded)
}
if !(posTextDone < posPartDone && posPartDone < posMessageDone && posMessageDone < posFuncAdded) {
t.Fatalf("unexpected message/function ordering: textDone=%d partDone=%d messageDone=%d funcAdded=%d", posTextDone, posPartDone, posMessageDone, posFuncAdded)
}
if !gotMessageDone {
t.Fatalf("missing message response.output_item.done event")
}
if !gotFuncDone {
t.Fatalf("missing function_call response.output_item.done event")
}
if !gotResponseDone {
t.Fatalf("missing response.completed event")
}
if textDone != "让我先了解" {
t.Fatalf("unexpected output_text.done text: got %q", textDone)
}
if messageText != "让我先了解" {
t.Fatalf("unexpected message done text: got %q", messageText)
}
if responseID != "resp_req_vrtx_1" {
t.Fatalf("unexpected response id: got %q", responseID)
}
if instructions != "test instructions" {
t.Fatalf("unexpected instructions echo: got %q", instructions)
}
if cachedTokens != 2 {
t.Fatalf("unexpected cached token count: got %d", cachedTokens)
}
if funcName != "mcp__serena__list_dir" {
t.Fatalf("unexpected function name: got %q", funcName)
}
if !gjson.Valid(funcArgs) {
t.Fatalf("invalid function arguments JSON: %q", funcArgs)
}
if gjson.Get(funcArgs, "recursive").Bool() != false {
t.Fatalf("unexpected recursive arg: %v", gjson.Get(funcArgs, "recursive").Value())
}
if gjson.Get(funcArgs, "relative_path").String() != "internal" {
t.Fatalf("unexpected relative_path arg: %q", gjson.Get(funcArgs, "relative_path").String())
}
}
func TestConvertGeminiResponseToOpenAIResponses_ReasoningEncryptedContent(t *testing.T) {
sig := "RXE0RENrZ0lDeEFDR0FJcVFOZDdjUzlleGFuRktRdFcvSzNyZ2MvWDNCcDQ4RmxSbGxOWUlOVU5kR1l1UHMrMGdkMVp0Vkg3ekdKU0g4YVljc2JjN3lNK0FrdGpTNUdqamI4T3Z0VVNETzdQd3pmcFhUOGl3U3hXUEJvTVFRQ09mWTFyMEtTWGZxUUlJakFqdmFGWk83RW1XRlBKckJVOVpkYzdDKw=="
in := []string{
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"thought":true,"thoughtSignature":"` + sig + `","text":""}]}}],"modelVersion":"test-model","responseId":"req_vrtx_sig"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"thought":true,"text":"a"}]}}],"modelVersion":"test-model","responseId":"req_vrtx_sig"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"hello"}]}}],"modelVersion":"test-model","responseId":"req_vrtx_sig"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":""}]},"finishReason":"STOP"}],"modelVersion":"test-model","responseId":"req_vrtx_sig"},"traceId":"t1"}`,
}
var param any
var out []string
for _, line := range in {
out = append(out, ConvertGeminiResponseToOpenAIResponses(context.Background(), "test-model", nil, nil, []byte(line), &param)...)
}
var (
addedEnc string
doneEnc string
)
for _, chunk := range out {
ev, data := parseSSEEvent(t, chunk)
switch ev {
case "response.output_item.added":
if data.Get("item.type").String() == "reasoning" {
addedEnc = data.Get("item.encrypted_content").String()
}
case "response.output_item.done":
if data.Get("item.type").String() == "reasoning" {
doneEnc = data.Get("item.encrypted_content").String()
}
}
}
if addedEnc != sig {
t.Fatalf("unexpected encrypted_content in response.output_item.added: got %q", addedEnc)
}
if doneEnc != sig {
t.Fatalf("unexpected encrypted_content in response.output_item.done: got %q", doneEnc)
}
}
func TestConvertGeminiResponseToOpenAIResponses_FunctionCallEventOrder(t *testing.T) {
in := []string{
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"functionCall":{"name":"tool0"}}]}}],"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"functionCall":{"name":"tool1"}}]}}],"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"functionCall":{"name":"tool2","args":{"a":1}}}]}}],"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":""}]},"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":5,"totalTokenCount":15,"cachedContentTokenCount":0},"modelVersion":"test-model","responseId":"req_vrtx_1"},"traceId":"t1"}`,
}
var param any
var out []string
for _, line := range in {
out = append(out, ConvertGeminiResponseToOpenAIResponses(context.Background(), "test-model", nil, nil, []byte(line), &param)...)
}
posAdded := []int{-1, -1, -1}
posArgsDelta := []int{-1, -1, -1}
posArgsDone := []int{-1, -1, -1}
posItemDone := []int{-1, -1, -1}
posCompleted := -1
deltaByIndex := map[int]string{}
for i, chunk := range out {
ev, data := parseSSEEvent(t, chunk)
switch ev {
case "response.output_item.added":
if data.Get("item.type").String() != "function_call" {
continue
}
idx := int(data.Get("output_index").Int())
if idx >= 0 && idx < len(posAdded) {
posAdded[idx] = i
}
case "response.function_call_arguments.delta":
idx := int(data.Get("output_index").Int())
if idx >= 0 && idx < len(posArgsDelta) {
posArgsDelta[idx] = i
deltaByIndex[idx] = data.Get("delta").String()
}
case "response.function_call_arguments.done":
idx := int(data.Get("output_index").Int())
if idx >= 0 && idx < len(posArgsDone) {
posArgsDone[idx] = i
}
case "response.output_item.done":
if data.Get("item.type").String() != "function_call" {
continue
}
idx := int(data.Get("output_index").Int())
if idx >= 0 && idx < len(posItemDone) {
posItemDone[idx] = i
}
case "response.completed":
posCompleted = i
output := data.Get("response.output")
if !output.Exists() || !output.IsArray() {
t.Fatalf("missing response.output in response.completed")
}
if len(output.Array()) != 3 {
t.Fatalf("unexpected response.output length: got %d", len(output.Array()))
}
if data.Get("response.output.0.name").String() != "tool0" || data.Get("response.output.0.arguments").String() != "{}" {
t.Fatalf("unexpected output[0]: %s", data.Get("response.output.0").Raw)
}
if data.Get("response.output.1.name").String() != "tool1" || data.Get("response.output.1.arguments").String() != "{}" {
t.Fatalf("unexpected output[1]: %s", data.Get("response.output.1").Raw)
}
if data.Get("response.output.2.name").String() != "tool2" {
t.Fatalf("unexpected output[2] name: %s", data.Get("response.output.2").Raw)
}
if !gjson.Valid(data.Get("response.output.2.arguments").String()) {
t.Fatalf("unexpected output[2] arguments: %q", data.Get("response.output.2.arguments").String())
}
}
}
if posCompleted == -1 {
t.Fatalf("missing response.completed event")
}
for idx := 0; idx < 3; idx++ {
if posAdded[idx] == -1 || posArgsDelta[idx] == -1 || posArgsDone[idx] == -1 || posItemDone[idx] == -1 {
t.Fatalf("missing function call events for output_index %d: added=%d argsDelta=%d argsDone=%d itemDone=%d", idx, posAdded[idx], posArgsDelta[idx], posArgsDone[idx], posItemDone[idx])
}
if !(posAdded[idx] < posArgsDelta[idx] && posArgsDelta[idx] < posArgsDone[idx] && posArgsDone[idx] < posItemDone[idx]) {
t.Fatalf("unexpected ordering for output_index %d: added=%d argsDelta=%d argsDone=%d itemDone=%d", idx, posAdded[idx], posArgsDelta[idx], posArgsDone[idx], posItemDone[idx])
}
if idx > 0 && !(posItemDone[idx-1] < posAdded[idx]) {
t.Fatalf("function call events overlap between %d and %d: prevDone=%d nextAdded=%d", idx-1, idx, posItemDone[idx-1], posAdded[idx])
}
}
if deltaByIndex[0] != "{}" {
t.Fatalf("unexpected delta for output_index 0: got %q", deltaByIndex[0])
}
if deltaByIndex[1] != "{}" {
t.Fatalf("unexpected delta for output_index 1: got %q", deltaByIndex[1])
}
if deltaByIndex[2] == "" || !gjson.Valid(deltaByIndex[2]) || gjson.Get(deltaByIndex[2], "a").Int() != 1 {
t.Fatalf("unexpected delta for output_index 2: got %q", deltaByIndex[2])
}
if !(posItemDone[2] < posCompleted) {
t.Fatalf("response.completed should be after last output_item.done: last=%d completed=%d", posItemDone[2], posCompleted)
}
}
func TestConvertGeminiResponseToOpenAIResponses_ResponseOutputOrdering(t *testing.T) {
in := []string{
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"functionCall":{"name":"tool0","args":{"x":"y"}}}]}}],"modelVersion":"test-model","responseId":"req_vrtx_2"},"traceId":"t2"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"hi"}]}}],"modelVersion":"test-model","responseId":"req_vrtx_2"},"traceId":"t2"}`,
`data: {"response":{"candidates":[{"content":{"role":"model","parts":[{"text":""}]},"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2,"cachedContentTokenCount":0},"modelVersion":"test-model","responseId":"req_vrtx_2"},"traceId":"t2"}`,
}
var param any
var out []string
for _, line := range in {
out = append(out, ConvertGeminiResponseToOpenAIResponses(context.Background(), "test-model", nil, nil, []byte(line), &param)...)
}
posFuncDone := -1
posMsgAdded := -1
posCompleted := -1
for i, chunk := range out {
ev, data := parseSSEEvent(t, chunk)
switch ev {
case "response.output_item.done":
if data.Get("item.type").String() == "function_call" && data.Get("output_index").Int() == 0 {
posFuncDone = i
}
case "response.output_item.added":
if data.Get("item.type").String() == "message" && data.Get("output_index").Int() == 1 {
posMsgAdded = i
}
case "response.completed":
posCompleted = i
if data.Get("response.output.0.type").String() != "function_call" {
t.Fatalf("expected response.output[0] to be function_call: %s", data.Get("response.output.0").Raw)
}
if data.Get("response.output.1.type").String() != "message" {
t.Fatalf("expected response.output[1] to be message: %s", data.Get("response.output.1").Raw)
}
if data.Get("response.output.1.content.0.text").String() != "hi" {
t.Fatalf("unexpected message text in response.output[1]: %s", data.Get("response.output.1").Raw)
}
}
}
if posFuncDone == -1 || posMsgAdded == -1 || posCompleted == -1 {
t.Fatalf("missing required events: funcDone=%d msgAdded=%d completed=%d", posFuncDone, posMsgAdded, posCompleted)
}
if !(posFuncDone < posMsgAdded) {
t.Fatalf("expected function_call to complete before message is added: funcDone=%d msgAdded=%d", posFuncDone, posMsgAdded)
}
if !(posMsgAdded < posCompleted) {
t.Fatalf("expected response.completed after message added: msgAdded=%d completed=%d", posMsgAdded, posCompleted)
}
}

View File

@@ -88,13 +88,15 @@ func ConvertClaudeRequestToOpenAI(modelName string, inputRawJSON []byte, stream
var messagesJSON = "[]"
// Handle system message first
systemMsgJSON := `{"role":"system","content":[{"type":"text","text":"Use ANY tool, the parameters MUST accord with RFC 8259 (The JavaScript Object Notation (JSON) Data Interchange Format), the keys and value MUST be enclosed in double quotes."}]}`
systemMsgJSON := `{"role":"system","content":[]}`
hasSystemContent := false
if system := root.Get("system"); system.Exists() {
if system.Type == gjson.String {
if system.String() != "" {
oldSystem := `{"type":"text","text":""}`
oldSystem, _ = sjson.Set(oldSystem, "text", system.String())
systemMsgJSON, _ = sjson.SetRaw(systemMsgJSON, "content.-1", oldSystem)
hasSystemContent = true
}
} else if system.Type == gjson.JSON {
if system.IsArray() {
@@ -102,12 +104,16 @@ func ConvertClaudeRequestToOpenAI(modelName string, inputRawJSON []byte, stream
for i := 0; i < len(systemResults); i++ {
if contentItem, ok := convertClaudeContentPart(systemResults[i]); ok {
systemMsgJSON, _ = sjson.SetRaw(systemMsgJSON, "content.-1", contentItem)
hasSystemContent = true
}
}
}
}
}
messagesJSON, _ = sjson.SetRaw(messagesJSON, "-1", systemMsgJSON)
// Only add system message if it has content
if hasSystemContent {
messagesJSON, _ = sjson.SetRaw(messagesJSON, "-1", systemMsgJSON)
}
// Process Anthropic messages
if messages := root.Get("messages"); messages.Exists() && messages.IsArray() {

View File

@@ -181,11 +181,11 @@ func TestConvertClaudeRequestToOpenAI_ThinkingToReasoningContent(t *testing.T) {
result := ConvertClaudeRequestToOpenAI("test-model", []byte(tt.inputJSON), false)
resultJSON := gjson.ParseBytes(result)
// Find the relevant message (skip system message at index 0)
// Find the relevant message
messages := resultJSON.Get("messages").Array()
if len(messages) < 2 {
if len(messages) < 1 {
if tt.wantHasReasoningContent || tt.wantHasContent {
t.Fatalf("Expected at least 2 messages (system + user/assistant), got %d", len(messages))
t.Fatalf("Expected at least 1 message, got %d", len(messages))
}
return
}
@@ -272,15 +272,15 @@ func TestConvertClaudeRequestToOpenAI_ThinkingOnlyMessagePreserved(t *testing.T)
messages := resultJSON.Get("messages").Array()
// Should have: system (auto-added) + user + assistant (thinking-only) + user = 4 messages
if len(messages) != 4 {
t.Fatalf("Expected 4 messages, got %d. Messages: %v", len(messages), resultJSON.Get("messages").Raw)
// Should have: user + assistant (thinking-only) + user = 3 messages
if len(messages) != 3 {
t.Fatalf("Expected 3 messages, got %d. Messages: %v", len(messages), resultJSON.Get("messages").Raw)
}
// Check the assistant message (index 2) has reasoning_content
assistantMsg := messages[2]
// Check the assistant message (index 1) has reasoning_content
assistantMsg := messages[1]
if assistantMsg.Get("role").String() != "assistant" {
t.Errorf("Expected message[2] to be assistant, got %s", assistantMsg.Get("role").String())
t.Errorf("Expected message[1] to be assistant, got %s", assistantMsg.Get("role").String())
}
if !assistantMsg.Get("reasoning_content").Exists() {
@@ -292,6 +292,104 @@ func TestConvertClaudeRequestToOpenAI_ThinkingOnlyMessagePreserved(t *testing.T)
}
}
func TestConvertClaudeRequestToOpenAI_SystemMessageScenarios(t *testing.T) {
tests := []struct {
name string
inputJSON string
wantHasSys bool
wantSysText string
}{
{
name: "No system field",
inputJSON: `{
"model": "claude-3-opus",
"messages": [{"role": "user", "content": "hello"}]
}`,
wantHasSys: false,
},
{
name: "Empty string system field",
inputJSON: `{
"model": "claude-3-opus",
"system": "",
"messages": [{"role": "user", "content": "hello"}]
}`,
wantHasSys: false,
},
{
name: "String system field",
inputJSON: `{
"model": "claude-3-opus",
"system": "Be helpful",
"messages": [{"role": "user", "content": "hello"}]
}`,
wantHasSys: true,
wantSysText: "Be helpful",
},
{
name: "Array system field with text",
inputJSON: `{
"model": "claude-3-opus",
"system": [{"type": "text", "text": "Array system"}],
"messages": [{"role": "user", "content": "hello"}]
}`,
wantHasSys: true,
wantSysText: "Array system",
},
{
name: "Array system field with multiple text blocks",
inputJSON: `{
"model": "claude-3-opus",
"system": [
{"type": "text", "text": "Block 1"},
{"type": "text", "text": "Block 2"}
],
"messages": [{"role": "user", "content": "hello"}]
}`,
wantHasSys: true,
wantSysText: "Block 2", // We will update the test logic to check all blocks or specifically the second one
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := ConvertClaudeRequestToOpenAI("test-model", []byte(tt.inputJSON), false)
resultJSON := gjson.ParseBytes(result)
messages := resultJSON.Get("messages").Array()
hasSys := false
var sysMsg gjson.Result
if len(messages) > 0 && messages[0].Get("role").String() == "system" {
hasSys = true
sysMsg = messages[0]
}
if hasSys != tt.wantHasSys {
t.Errorf("got hasSystem = %v, want %v", hasSys, tt.wantHasSys)
}
if tt.wantHasSys {
// Check content - it could be string or array in OpenAI
content := sysMsg.Get("content")
var gotText string
if content.IsArray() {
arr := content.Array()
if len(arr) > 0 {
// Get the last element's text for validation
gotText = arr[len(arr)-1].Get("text").String()
}
} else {
gotText = content.String()
}
if tt.wantSysText != "" && gotText != tt.wantSysText {
t.Errorf("got system text = %q, want %q", gotText, tt.wantSysText)
}
}
})
}
}
func TestConvertClaudeRequestToOpenAI_ToolResultOrderAndContent(t *testing.T) {
inputJSON := `{
"model": "claude-3-opus",
@@ -318,39 +416,35 @@ func TestConvertClaudeRequestToOpenAI_ToolResultOrderAndContent(t *testing.T) {
messages := resultJSON.Get("messages").Array()
// OpenAI requires: tool messages MUST immediately follow assistant(tool_calls).
// Correct order: system + assistant(tool_calls) + tool(result) + user(before+after)
if len(messages) != 4 {
t.Fatalf("Expected 4 messages, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
// Correct order: assistant(tool_calls) + tool(result) + user(before+after)
if len(messages) != 3 {
t.Fatalf("Expected 3 messages, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
}
if messages[0].Get("role").String() != "system" {
t.Fatalf("Expected messages[0] to be system, got %s", messages[0].Get("role").String())
}
if messages[1].Get("role").String() != "assistant" || !messages[1].Get("tool_calls").Exists() {
t.Fatalf("Expected messages[1] to be assistant tool_calls, got %s: %s", messages[1].Get("role").String(), messages[1].Raw)
if messages[0].Get("role").String() != "assistant" || !messages[0].Get("tool_calls").Exists() {
t.Fatalf("Expected messages[0] to be assistant tool_calls, got %s: %s", messages[0].Get("role").String(), messages[0].Raw)
}
// tool message MUST immediately follow assistant(tool_calls) per OpenAI spec
if messages[2].Get("role").String() != "tool" {
t.Fatalf("Expected messages[2] to be tool (must follow tool_calls), got %s", messages[2].Get("role").String())
if messages[1].Get("role").String() != "tool" {
t.Fatalf("Expected messages[1] to be tool (must follow tool_calls), got %s", messages[1].Get("role").String())
}
if got := messages[2].Get("tool_call_id").String(); got != "call_1" {
if got := messages[1].Get("tool_call_id").String(); got != "call_1" {
t.Fatalf("Expected tool_call_id %q, got %q", "call_1", got)
}
if got := messages[2].Get("content").String(); got != "tool ok" {
if got := messages[1].Get("content").String(); got != "tool ok" {
t.Fatalf("Expected tool content %q, got %q", "tool ok", got)
}
// User message comes after tool message
if messages[3].Get("role").String() != "user" {
t.Fatalf("Expected messages[3] to be user, got %s", messages[3].Get("role").String())
if messages[2].Get("role").String() != "user" {
t.Fatalf("Expected messages[2] to be user, got %s", messages[2].Get("role").String())
}
// User message should contain both "before" and "after" text
if got := messages[3].Get("content.0.text").String(); got != "before" {
if got := messages[2].Get("content.0.text").String(); got != "before" {
t.Fatalf("Expected user text[0] %q, got %q", "before", got)
}
if got := messages[3].Get("content.1.text").String(); got != "after" {
if got := messages[2].Get("content.1.text").String(); got != "after" {
t.Fatalf("Expected user text[1] %q, got %q", "after", got)
}
}
@@ -378,16 +472,16 @@ func TestConvertClaudeRequestToOpenAI_ToolResultObjectContent(t *testing.T) {
resultJSON := gjson.ParseBytes(result)
messages := resultJSON.Get("messages").Array()
// system + assistant(tool_calls) + tool(result)
if len(messages) != 3 {
t.Fatalf("Expected 3 messages, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
// assistant(tool_calls) + tool(result)
if len(messages) != 2 {
t.Fatalf("Expected 2 messages, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
}
if messages[2].Get("role").String() != "tool" {
t.Fatalf("Expected messages[2] to be tool, got %s", messages[2].Get("role").String())
if messages[1].Get("role").String() != "tool" {
t.Fatalf("Expected messages[1] to be tool, got %s", messages[1].Get("role").String())
}
toolContent := messages[2].Get("content").String()
toolContent := messages[1].Get("content").String()
parsed := gjson.Parse(toolContent)
if parsed.Get("foo").String() != "bar" {
t.Fatalf("Expected tool content JSON foo=bar, got %q", toolContent)
@@ -414,18 +508,14 @@ func TestConvertClaudeRequestToOpenAI_AssistantTextToolUseTextOrder(t *testing.T
messages := resultJSON.Get("messages").Array()
// New behavior: content + tool_calls unified in single assistant message
// Expect: system + assistant(content[pre,post] + tool_calls)
if len(messages) != 2 {
t.Fatalf("Expected 2 messages, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
// Expect: assistant(content[pre,post] + tool_calls)
if len(messages) != 1 {
t.Fatalf("Expected 1 message, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
}
if messages[0].Get("role").String() != "system" {
t.Fatalf("Expected messages[0] to be system, got %s", messages[0].Get("role").String())
}
assistantMsg := messages[1]
assistantMsg := messages[0]
if assistantMsg.Get("role").String() != "assistant" {
t.Fatalf("Expected messages[1] to be assistant, got %s", assistantMsg.Get("role").String())
t.Fatalf("Expected messages[0] to be assistant, got %s", assistantMsg.Get("role").String())
}
// Should have both content and tool_calls in same message
@@ -470,14 +560,14 @@ func TestConvertClaudeRequestToOpenAI_AssistantThinkingToolUseThinkingSplit(t *t
messages := resultJSON.Get("messages").Array()
// New behavior: all content, thinking, and tool_calls unified in single assistant message
// Expect: system + assistant(content[pre,post] + tool_calls + reasoning_content[t1+t2])
if len(messages) != 2 {
t.Fatalf("Expected 2 messages, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
// Expect: assistant(content[pre,post] + tool_calls + reasoning_content[t1+t2])
if len(messages) != 1 {
t.Fatalf("Expected 1 message, got %d. Messages: %s", len(messages), resultJSON.Get("messages").Raw)
}
assistantMsg := messages[1]
assistantMsg := messages[0]
if assistantMsg.Get("role").String() != "assistant" {
t.Fatalf("Expected messages[1] to be assistant, got %s", assistantMsg.Get("role").String())
t.Fatalf("Expected messages[0] to be assistant, got %s", assistantMsg.Get("role").String())
}
// Should have content with both pre and post

View File

@@ -289,21 +289,17 @@ func convertOpenAIStreamingChunkToAnthropic(rawJSON []byte, param *ConvertOpenAI
// Only process if usage has actual values (not null)
if param.FinishReason != "" {
usage := root.Get("usage")
var inputTokens, outputTokens int64
var inputTokens, outputTokens, cachedTokens int64
if usage.Exists() && usage.Type != gjson.Null {
// Check if usage has actual token counts
promptTokens := usage.Get("prompt_tokens")
completionTokens := usage.Get("completion_tokens")
if promptTokens.Exists() && completionTokens.Exists() {
inputTokens = promptTokens.Int()
outputTokens = completionTokens.Int()
}
inputTokens, outputTokens, cachedTokens = extractOpenAIUsage(usage)
// Send message_delta with usage
messageDeltaJSON := `{"type":"message_delta","delta":{"stop_reason":"","stop_sequence":null},"usage":{"input_tokens":0,"output_tokens":0}}`
messageDeltaJSON, _ = sjson.Set(messageDeltaJSON, "delta.stop_reason", mapOpenAIFinishReasonToAnthropic(param.FinishReason))
messageDeltaJSON, _ = sjson.Set(messageDeltaJSON, "usage.input_tokens", inputTokens)
messageDeltaJSON, _ = sjson.Set(messageDeltaJSON, "usage.output_tokens", outputTokens)
if cachedTokens > 0 {
messageDeltaJSON, _ = sjson.Set(messageDeltaJSON, "usage.cache_read_input_tokens", cachedTokens)
}
results = append(results, "event: message_delta\ndata: "+messageDeltaJSON+"\n\n")
param.MessageDeltaSent = true
@@ -351,7 +347,7 @@ func convertOpenAIDoneToAnthropic(param *ConvertOpenAIResponseToAnthropicParams)
// If we haven't sent message_delta yet (no usage info was received), send it now
if param.FinishReason != "" && !param.MessageDeltaSent {
messageDeltaJSON := `{"type":"message_delta","delta":{"stop_reason":"","stop_sequence":null}}`
messageDeltaJSON := `{"type":"message_delta","delta":{"stop_reason":"","stop_sequence":null},"usage":{"input_tokens":0,"output_tokens":0}}`
messageDeltaJSON, _ = sjson.Set(messageDeltaJSON, "delta.stop_reason", mapOpenAIFinishReasonToAnthropic(param.FinishReason))
results = append(results, "event: message_delta\ndata: "+messageDeltaJSON+"\n\n")
param.MessageDeltaSent = true
@@ -423,13 +419,12 @@ func convertOpenAINonStreamingToAnthropic(rawJSON []byte) []string {
// Set usage information
if usage := root.Get("usage"); usage.Exists() {
out, _ = sjson.Set(out, "usage.input_tokens", usage.Get("prompt_tokens").Int())
out, _ = sjson.Set(out, "usage.output_tokens", usage.Get("completion_tokens").Int())
reasoningTokens := int64(0)
if v := usage.Get("completion_tokens_details.reasoning_tokens"); v.Exists() {
reasoningTokens = v.Int()
inputTokens, outputTokens, cachedTokens := extractOpenAIUsage(usage)
out, _ = sjson.Set(out, "usage.input_tokens", inputTokens)
out, _ = sjson.Set(out, "usage.output_tokens", outputTokens)
if cachedTokens > 0 {
out, _ = sjson.Set(out, "usage.cache_read_input_tokens", cachedTokens)
}
out, _ = sjson.Set(out, "usage.reasoning_tokens", reasoningTokens)
}
return []string{out}
@@ -674,8 +669,12 @@ func ConvertOpenAIResponseToClaudeNonStream(_ context.Context, _ string, origina
}
if respUsage := root.Get("usage"); respUsage.Exists() {
out, _ = sjson.Set(out, "usage.input_tokens", respUsage.Get("prompt_tokens").Int())
out, _ = sjson.Set(out, "usage.output_tokens", respUsage.Get("completion_tokens").Int())
inputTokens, outputTokens, cachedTokens := extractOpenAIUsage(respUsage)
out, _ = sjson.Set(out, "usage.input_tokens", inputTokens)
out, _ = sjson.Set(out, "usage.output_tokens", outputTokens)
if cachedTokens > 0 {
out, _ = sjson.Set(out, "usage.cache_read_input_tokens", cachedTokens)
}
}
if !stopReasonSet {
@@ -692,3 +691,23 @@ func ConvertOpenAIResponseToClaudeNonStream(_ context.Context, _ string, origina
func ClaudeTokenCount(ctx context.Context, count int64) string {
return fmt.Sprintf(`{"input_tokens":%d}`, count)
}
func extractOpenAIUsage(usage gjson.Result) (int64, int64, int64) {
if !usage.Exists() || usage.Type == gjson.Null {
return 0, 0, 0
}
inputTokens := usage.Get("prompt_tokens").Int()
outputTokens := usage.Get("completion_tokens").Int()
cachedTokens := usage.Get("prompt_tokens_details.cached_tokens").Int()
if cachedTokens > 0 {
if inputTokens >= cachedTokens {
inputTokens -= cachedTokens
} else {
inputTokens = 0
}
}
return inputTokens, outputTokens, cachedTokens
}

View File

@@ -77,12 +77,21 @@ func ConvertGeminiRequestToOpenAI(modelName string, inputRawJSON []byte, stream
}
}
// Convert thinkingBudget to reasoning_effort
// Candidate count (OpenAI 'n' parameter)
if candidateCount := genConfig.Get("candidateCount"); candidateCount.Exists() {
out, _ = sjson.Set(out, "n", candidateCount.Int())
}
// Map Gemini thinkingConfig to OpenAI reasoning_effort.
// Always perform conversion to support allowCompat models that may not be in registry
if thinkingConfig := genConfig.Get("thinkingConfig"); thinkingConfig.Exists() && thinkingConfig.IsObject() {
if thinkingBudget := thinkingConfig.Get("thinkingBudget"); thinkingBudget.Exists() {
budget := int(thinkingBudget.Int())
if effort, ok := thinking.ConvertBudgetToLevel(budget); ok && effort != "" {
if thinkingLevel := thinkingConfig.Get("thinkingLevel"); thinkingLevel.Exists() {
effort := strings.ToLower(strings.TrimSpace(thinkingLevel.String()))
if effort != "" {
out, _ = sjson.Set(out, "reasoning_effort", effort)
}
} else if thinkingBudget := thinkingConfig.Get("thinkingBudget"); thinkingBudget.Exists() {
if effort, ok := thinking.ConvertBudgetToLevel(int(thinkingBudget.Int())); ok {
out, _ = sjson.Set(out, "reasoning_effort", effort)
}
}

View File

@@ -12,6 +12,10 @@ import (
"github.com/tidwall/sjson"
)
type oaiToResponsesStateReasoning struct {
ReasoningID string
ReasoningData string
}
type oaiToResponsesState struct {
Seq int
ResponseID string
@@ -23,6 +27,7 @@ type oaiToResponsesState struct {
// Per-output message text buffers by index
MsgTextBuf map[int]*strings.Builder
ReasoningBuf strings.Builder
Reasonings []oaiToResponsesStateReasoning
FuncArgsBuf map[int]*strings.Builder // index -> args
FuncNames map[int]string // index -> name
FuncCallIDs map[int]string // index -> call_id
@@ -63,6 +68,7 @@ func ConvertOpenAIChatCompletionsResponseToOpenAIResponses(ctx context.Context,
MsgItemDone: make(map[int]bool),
FuncArgsDone: make(map[int]bool),
FuncItemDone: make(map[int]bool),
Reasonings: make([]oaiToResponsesStateReasoning, 0),
}
}
st := (*param).(*oaiToResponsesState)
@@ -157,6 +163,31 @@ func ConvertOpenAIChatCompletionsResponseToOpenAIResponses(ctx context.Context,
st.Started = true
}
stopReasoning := func(text string) {
// Emit reasoning done events
textDone := `{"type":"response.reasoning_summary_text.done","sequence_number":0,"item_id":"","output_index":0,"summary_index":0,"text":""}`
textDone, _ = sjson.Set(textDone, "sequence_number", nextSeq())
textDone, _ = sjson.Set(textDone, "item_id", st.ReasoningID)
textDone, _ = sjson.Set(textDone, "output_index", st.ReasoningIndex)
textDone, _ = sjson.Set(textDone, "text", text)
out = append(out, emitRespEvent("response.reasoning_summary_text.done", textDone))
partDone := `{"type":"response.reasoning_summary_part.done","sequence_number":0,"item_id":"","output_index":0,"summary_index":0,"part":{"type":"summary_text","text":""}}`
partDone, _ = sjson.Set(partDone, "sequence_number", nextSeq())
partDone, _ = sjson.Set(partDone, "item_id", st.ReasoningID)
partDone, _ = sjson.Set(partDone, "output_index", st.ReasoningIndex)
partDone, _ = sjson.Set(partDone, "part.text", text)
out = append(out, emitRespEvent("response.reasoning_summary_part.done", partDone))
outputItemDone := `{"type":"response.output_item.done","item":{"id":"","type":"reasoning","encrypted_content":"","summary":[{"type":"summary_text","text":""}]},"output_index":0,"sequence_number":0}`
outputItemDone, _ = sjson.Set(outputItemDone, "sequence_number", nextSeq())
outputItemDone, _ = sjson.Set(outputItemDone, "item.id", st.ReasoningID)
outputItemDone, _ = sjson.Set(outputItemDone, "output_index", st.ReasoningIndex)
outputItemDone, _ = sjson.Set(outputItemDone, "item.summary.text", text)
out = append(out, emitRespEvent("response.output_item.done", outputItemDone))
st.Reasonings = append(st.Reasonings, oaiToResponsesStateReasoning{ReasoningID: st.ReasoningID, ReasoningData: text})
st.ReasoningID = ""
}
// choices[].delta content / tool_calls / reasoning_content
if choices := root.Get("choices"); choices.Exists() && choices.IsArray() {
choices.ForEach(func(_, choice gjson.Result) bool {
@@ -165,6 +196,10 @@ func ConvertOpenAIChatCompletionsResponseToOpenAIResponses(ctx context.Context,
if delta.Exists() {
if c := delta.Get("content"); c.Exists() && c.String() != "" {
// Ensure the message item and its first content part are announced before any text deltas
if st.ReasoningID != "" {
stopReasoning(st.ReasoningBuf.String())
st.ReasoningBuf.Reset()
}
if !st.MsgItemAdded[idx] {
item := `{"type":"response.output_item.added","sequence_number":0,"output_index":0,"item":{"id":"","type":"message","status":"in_progress","content":[],"role":"assistant"}}`
item, _ = sjson.Set(item, "sequence_number", nextSeq())
@@ -226,6 +261,10 @@ func ConvertOpenAIChatCompletionsResponseToOpenAIResponses(ctx context.Context,
// tool calls
if tcs := delta.Get("tool_calls"); tcs.Exists() && tcs.IsArray() {
if st.ReasoningID != "" {
stopReasoning(st.ReasoningBuf.String())
st.ReasoningBuf.Reset()
}
// Before emitting any function events, if a message is open for this index,
// close its text/content to match Codex expected ordering.
if st.MsgItemAdded[idx] && !st.MsgItemDone[idx] {
@@ -361,17 +400,8 @@ func ConvertOpenAIChatCompletionsResponseToOpenAIResponses(ctx context.Context,
}
if st.ReasoningID != "" {
// Emit reasoning done events
textDone := `{"type":"response.reasoning_summary_text.done","sequence_number":0,"item_id":"","output_index":0,"summary_index":0,"text":""}`
textDone, _ = sjson.Set(textDone, "sequence_number", nextSeq())
textDone, _ = sjson.Set(textDone, "item_id", st.ReasoningID)
textDone, _ = sjson.Set(textDone, "output_index", st.ReasoningIndex)
out = append(out, emitRespEvent("response.reasoning_summary_text.done", textDone))
partDone := `{"type":"response.reasoning_summary_part.done","sequence_number":0,"item_id":"","output_index":0,"summary_index":0,"part":{"type":"summary_text","text":""}}`
partDone, _ = sjson.Set(partDone, "sequence_number", nextSeq())
partDone, _ = sjson.Set(partDone, "item_id", st.ReasoningID)
partDone, _ = sjson.Set(partDone, "output_index", st.ReasoningIndex)
out = append(out, emitRespEvent("response.reasoning_summary_part.done", partDone))
stopReasoning(st.ReasoningBuf.String())
st.ReasoningBuf.Reset()
}
// Emit function call done events for any active function calls
@@ -485,11 +515,13 @@ func ConvertOpenAIChatCompletionsResponseToOpenAIResponses(ctx context.Context,
}
// Build response.output using aggregated buffers
outputsWrapper := `{"arr":[]}`
if st.ReasoningBuf.Len() > 0 {
item := `{"id":"","type":"reasoning","summary":[{"type":"summary_text","text":""}]}`
item, _ = sjson.Set(item, "id", st.ReasoningID)
item, _ = sjson.Set(item, "summary.0.text", st.ReasoningBuf.String())
outputsWrapper, _ = sjson.SetRaw(outputsWrapper, "arr.-1", item)
if len(st.Reasonings) > 0 {
for _, r := range st.Reasonings {
item := `{"id":"","type":"reasoning","summary":[{"type":"summary_text","text":""}]}`
item, _ = sjson.Set(item, "id", r.ReasoningID)
item, _ = sjson.Set(item, "summary.0.text", r.ReasoningData)
outputsWrapper, _ = sjson.SetRaw(outputsWrapper, "arr.-1", item)
}
}
// Append message items in ascending index order
if len(st.MsgItemAdded) > 0 {

View File

@@ -4,6 +4,7 @@ package util
import (
"fmt"
"sort"
"strconv"
"strings"
"github.com/tidwall/gjson"
@@ -12,13 +13,27 @@ import (
var gjsonPathKeyReplacer = strings.NewReplacer(".", "\\.", "*", "\\*", "?", "\\?")
const placeholderReasonDescription = "Brief explanation of why you are calling this tool"
// CleanJSONSchemaForAntigravity transforms a JSON schema to be compatible with Antigravity API.
// It handles unsupported keywords, type flattening, and schema simplification while preserving
// semantic information as description hints.
func CleanJSONSchemaForAntigravity(jsonStr string) string {
return cleanJSONSchema(jsonStr, true)
}
// CleanJSONSchemaForGemini transforms a JSON schema to be compatible with Gemini tool calling.
// It removes unsupported keywords and simplifies schemas, without adding empty-schema placeholders.
func CleanJSONSchemaForGemini(jsonStr string) string {
return cleanJSONSchema(jsonStr, false)
}
// cleanJSONSchema performs the core cleaning operations on the JSON schema.
func cleanJSONSchema(jsonStr string, addPlaceholder bool) string {
// Phase 1: Convert and add hints
jsonStr = convertRefsToHints(jsonStr)
jsonStr = convertConstToEnum(jsonStr)
jsonStr = convertEnumValuesToStrings(jsonStr)
jsonStr = addEnumHints(jsonStr)
jsonStr = addAdditionalPropertiesHints(jsonStr)
jsonStr = moveConstraintsToDescription(jsonStr)
@@ -30,10 +45,94 @@ func CleanJSONSchemaForAntigravity(jsonStr string) string {
// Phase 3: Cleanup
jsonStr = removeUnsupportedKeywords(jsonStr)
if !addPlaceholder {
// Gemini schema cleanup: remove nullable/title and placeholder-only fields.
jsonStr = removeKeywords(jsonStr, []string{"nullable", "title"})
jsonStr = removePlaceholderFields(jsonStr)
}
jsonStr = cleanupRequiredFields(jsonStr)
// Phase 4: Add placeholder for empty object schemas (Claude VALIDATED mode requirement)
jsonStr = addEmptySchemaPlaceholder(jsonStr)
if addPlaceholder {
jsonStr = addEmptySchemaPlaceholder(jsonStr)
}
return jsonStr
}
// removeKeywords removes all occurrences of specified keywords from the JSON schema.
func removeKeywords(jsonStr string, keywords []string) string {
for _, key := range keywords {
for _, p := range findPaths(jsonStr, key) {
if isPropertyDefinition(trimSuffix(p, "."+key)) {
continue
}
jsonStr, _ = sjson.Delete(jsonStr, p)
}
}
return jsonStr
}
// removePlaceholderFields removes placeholder-only properties ("_" and "reason") and their required entries.
func removePlaceholderFields(jsonStr string) string {
// Remove "_" placeholder properties.
paths := findPaths(jsonStr, "_")
sortByDepth(paths)
for _, p := range paths {
if !strings.HasSuffix(p, ".properties._") {
continue
}
jsonStr, _ = sjson.Delete(jsonStr, p)
parentPath := trimSuffix(p, ".properties._")
reqPath := joinPath(parentPath, "required")
req := gjson.Get(jsonStr, reqPath)
if req.IsArray() {
var filtered []string
for _, r := range req.Array() {
if r.String() != "_" {
filtered = append(filtered, r.String())
}
}
if len(filtered) == 0 {
jsonStr, _ = sjson.Delete(jsonStr, reqPath)
} else {
jsonStr, _ = sjson.Set(jsonStr, reqPath, filtered)
}
}
}
// Remove placeholder-only "reason" objects.
reasonPaths := findPaths(jsonStr, "reason")
sortByDepth(reasonPaths)
for _, p := range reasonPaths {
if !strings.HasSuffix(p, ".properties.reason") {
continue
}
parentPath := trimSuffix(p, ".properties.reason")
props := gjson.Get(jsonStr, joinPath(parentPath, "properties"))
if !props.IsObject() || len(props.Map()) != 1 {
continue
}
desc := gjson.Get(jsonStr, p+".description").String()
if desc != placeholderReasonDescription {
continue
}
jsonStr, _ = sjson.Delete(jsonStr, p)
reqPath := joinPath(parentPath, "required")
req := gjson.Get(jsonStr, reqPath)
if req.IsArray() {
var filtered []string
for _, r := range req.Array() {
if r.String() != "reason" {
filtered = append(filtered, r.String())
}
}
if len(filtered) == 0 {
jsonStr, _ = sjson.Delete(jsonStr, reqPath)
} else {
jsonStr, _ = sjson.Set(jsonStr, reqPath, filtered)
}
}
}
return jsonStr
}
@@ -77,6 +176,29 @@ func convertConstToEnum(jsonStr string) string {
return jsonStr
}
// convertEnumValuesToStrings ensures all enum values are strings and the schema type is set to string.
// Gemini API requires enum values to be of type string, not numbers or booleans.
func convertEnumValuesToStrings(jsonStr string) string {
for _, p := range findPaths(jsonStr, "enum") {
arr := gjson.Get(jsonStr, p)
if !arr.IsArray() {
continue
}
var stringVals []string
for _, item := range arr.Array() {
stringVals = append(stringVals, item.String())
}
// Always update enum values to strings and set type to "string"
// This ensures compatibility with Antigravity Gemini which only allows enum for STRING type
jsonStr, _ = sjson.Set(jsonStr, p, stringVals)
parentPath := trimSuffix(p, ".enum")
jsonStr, _ = sjson.Set(jsonStr, joinPath(parentPath, "type"), "string")
}
return jsonStr
}
func addEnumHints(jsonStr string) string {
for _, p := range findPaths(jsonStr, "enum") {
arr := gjson.Get(jsonStr, p)
@@ -310,9 +432,54 @@ func removeUnsupportedKeywords(jsonStr string) string {
jsonStr, _ = sjson.Delete(jsonStr, p)
}
}
// Remove x-* extension fields (e.g., x-google-enum-descriptions) that are not supported by Gemini API
jsonStr = removeExtensionFields(jsonStr)
return jsonStr
}
// removeExtensionFields removes all x-* extension fields from the JSON schema.
// These are OpenAPI/JSON Schema extension fields that Google APIs don't recognize.
func removeExtensionFields(jsonStr string) string {
var paths []string
walkForExtensions(gjson.Parse(jsonStr), "", &paths)
// walkForExtensions returns paths in a way that deeper paths are added before their ancestors
// when they are not deleted wholesale, but since we skip children of deleted x-* nodes,
// any collected path is safe to delete. We still use DeleteBytes for efficiency.
b := []byte(jsonStr)
for _, p := range paths {
b, _ = sjson.DeleteBytes(b, p)
}
return string(b)
}
func walkForExtensions(value gjson.Result, path string, paths *[]string) {
if value.IsArray() {
arr := value.Array()
for i := len(arr) - 1; i >= 0; i-- {
walkForExtensions(arr[i], joinPath(path, strconv.Itoa(i)), paths)
}
return
}
if value.IsObject() {
value.ForEach(func(key, val gjson.Result) bool {
keyStr := key.String()
safeKey := escapeGJSONPathKey(keyStr)
childPath := joinPath(path, safeKey)
// If it's an extension field, we delete it and don't need to look at its children.
if strings.HasPrefix(keyStr, "x-") && !isPropertyDefinition(path) {
*paths = append(*paths, childPath)
return true
}
walkForExtensions(val, childPath, paths)
return true
})
}
}
func cleanupRequiredFields(jsonStr string) string {
for _, p := range findPaths(jsonStr, "required") {
parentPath := trimSuffix(p, ".required")
@@ -381,7 +548,7 @@ func addEmptySchemaPlaceholder(jsonStr string) string {
// Add placeholder "reason" property
reasonPath := joinPath(propsPath, "reason")
jsonStr, _ = sjson.Set(jsonStr, reasonPath+".type", "string")
jsonStr, _ = sjson.Set(jsonStr, reasonPath+".description", "Brief explanation of why you are calling this tool")
jsonStr, _ = sjson.Set(jsonStr, reasonPath+".description", placeholderReasonDescription)
// Add to required array
jsonStr, _ = sjson.Set(jsonStr, reqPath, []string{"reason"})

View File

@@ -818,3 +818,180 @@ func TestCleanJSONSchemaForAntigravity_MultipleFormats(t *testing.T) {
t.Errorf("date-time format hint should be added, got: %s", result)
}
}
func TestCleanJSONSchemaForAntigravity_NumericEnumToString(t *testing.T) {
// Gemini API requires enum values to be strings, not numbers
input := `{
"type": "object",
"properties": {
"priority": {"type": "integer", "enum": [0, 1, 2]},
"level": {"type": "number", "enum": [1.5, 2.5, 3.5]},
"status": {"type": "string", "enum": ["active", "inactive"]}
}
}`
result := CleanJSONSchemaForAntigravity(input)
// Numeric enum values should be converted to strings
if strings.Contains(result, `"enum":[0,1,2]`) {
t.Errorf("Integer enum values should be converted to strings, got: %s", result)
}
if strings.Contains(result, `"enum":[1.5,2.5,3.5]`) {
t.Errorf("Float enum values should be converted to strings, got: %s", result)
}
// Should contain string versions
if !strings.Contains(result, `"0"`) || !strings.Contains(result, `"1"`) || !strings.Contains(result, `"2"`) {
t.Errorf("Integer enum values should be converted to string format, got: %s", result)
}
// String enum values should remain unchanged
if !strings.Contains(result, `"active"`) || !strings.Contains(result, `"inactive"`) {
t.Errorf("String enum values should remain unchanged, got: %s", result)
}
}
func TestCleanJSONSchemaForAntigravity_BooleanEnumToString(t *testing.T) {
// Boolean enum values should also be converted to strings
input := `{
"type": "object",
"properties": {
"enabled": {"type": "boolean", "enum": [true, false]}
}
}`
result := CleanJSONSchemaForAntigravity(input)
// Boolean enum values should be converted to strings
if strings.Contains(result, `"enum":[true,false]`) {
t.Errorf("Boolean enum values should be converted to strings, got: %s", result)
}
// Should contain string versions "true" and "false"
if !strings.Contains(result, `"true"`) || !strings.Contains(result, `"false"`) {
t.Errorf("Boolean enum values should be converted to string format, got: %s", result)
}
}
func TestRemoveExtensionFields(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{
name: "removes x- fields at root",
input: `{
"type": "object",
"x-custom-meta": "value",
"properties": {
"foo": { "type": "string" }
}
}`,
expected: `{
"type": "object",
"properties": {
"foo": { "type": "string" }
}
}`,
},
{
name: "removes x- fields in nested properties",
input: `{
"type": "object",
"properties": {
"foo": {
"type": "string",
"x-internal-id": 123
}
}
}`,
expected: `{
"type": "object",
"properties": {
"foo": {
"type": "string"
}
}
}`,
},
{
name: "does NOT remove properties named x-",
input: `{
"type": "object",
"properties": {
"x-data": { "type": "string" },
"normal": { "type": "number", "x-meta": "remove" }
},
"required": ["x-data"]
}`,
expected: `{
"type": "object",
"properties": {
"x-data": { "type": "string" },
"normal": { "type": "number" }
},
"required": ["x-data"]
}`,
},
{
name: "does NOT remove $schema and other meta fields (as requested)",
input: `{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "test",
"type": "object",
"properties": {
"foo": { "type": "string" }
}
}`,
expected: `{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "test",
"type": "object",
"properties": {
"foo": { "type": "string" }
}
}`,
},
{
name: "handles properties named $schema",
input: `{
"type": "object",
"properties": {
"$schema": { "type": "string" }
}
}`,
expected: `{
"type": "object",
"properties": {
"$schema": { "type": "string" }
}
}`,
},
{
name: "handles escaping in paths",
input: `{
"type": "object",
"properties": {
"foo.bar": {
"type": "string",
"x-meta": "remove"
}
},
"x-root.meta": "remove"
}`,
expected: `{
"type": "object",
"properties": {
"foo.bar": {
"type": "string"
}
}
}`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
actual := removeExtensionFields(tt.input)
compareJSON(t, tt.expected, actual)
})
}
}

View File

@@ -86,12 +86,19 @@ func (s *FileSynthesizer) Synthesize(ctx *SynthesisContext) ([]*coreauth.Auth, e
}
}
disabled, _ := metadata["disabled"].(bool)
status := coreauth.StatusActive
if disabled {
status = coreauth.StatusDisabled
}
a := &coreauth.Auth{
ID: id,
Provider: provider,
Label: label,
Prefix: prefix,
Status: coreauth.StatusActive,
Status: status,
Disabled: disabled,
Attributes: map[string]string{
"source": full,
"path": full,
@@ -167,6 +174,16 @@ func SynthesizeGeminiVirtualAuths(primary *coreauth.Auth, metadata map[string]an
"virtual_parent_id": primary.ID,
"type": metadata["type"],
}
if v, ok := metadata["disable_cooling"]; ok {
metadataCopy["disable_cooling"] = v
} else if v, ok := metadata["disable-cooling"]; ok {
metadataCopy["disable_cooling"] = v
}
if v, ok := metadata["request_retry"]; ok {
metadataCopy["request_retry"] = v
} else if v, ok := metadata["request-retry"]; ok {
metadataCopy["request_retry"] = v
}
proxy := strings.TrimSpace(primary.ProxyURL)
if proxy != "" {
metadataCopy["proxy_url"] = proxy

View File

@@ -69,10 +69,12 @@ func TestFileSynthesizer_Synthesize_ValidAuthFile(t *testing.T) {
// Create a valid auth file
authData := map[string]any{
"type": "claude",
"email": "test@example.com",
"proxy_url": "http://proxy.local",
"prefix": "test-prefix",
"type": "claude",
"email": "test@example.com",
"proxy_url": "http://proxy.local",
"prefix": "test-prefix",
"disable_cooling": true,
"request_retry": 2,
}
data, _ := json.Marshal(authData)
err := os.WriteFile(filepath.Join(tempDir, "claude-auth.json"), data, 0644)
@@ -108,6 +110,12 @@ func TestFileSynthesizer_Synthesize_ValidAuthFile(t *testing.T) {
if auths[0].ProxyURL != "http://proxy.local" {
t.Errorf("expected proxy_url http://proxy.local, got %s", auths[0].ProxyURL)
}
if v, ok := auths[0].Metadata["disable_cooling"].(bool); !ok || !v {
t.Errorf("expected disable_cooling true, got %v", auths[0].Metadata["disable_cooling"])
}
if v, ok := auths[0].Metadata["request_retry"].(float64); !ok || int(v) != 2 {
t.Errorf("expected request_retry 2, got %v", auths[0].Metadata["request_retry"])
}
if auths[0].Status != coreauth.StatusActive {
t.Errorf("expected status active, got %s", auths[0].Status)
}
@@ -336,9 +344,11 @@ func TestSynthesizeGeminiVirtualAuths_MultiProject(t *testing.T) {
},
}
metadata := map[string]any{
"project_id": "project-a, project-b, project-c",
"email": "test@example.com",
"type": "gemini",
"project_id": "project-a, project-b, project-c",
"email": "test@example.com",
"type": "gemini",
"request_retry": 2,
"disable_cooling": true,
}
virtuals := SynthesizeGeminiVirtualAuths(primary, metadata, now)
@@ -376,6 +386,12 @@ func TestSynthesizeGeminiVirtualAuths_MultiProject(t *testing.T) {
if v.ProxyURL != "http://proxy.local" {
t.Errorf("expected proxy_url http://proxy.local, got %s", v.ProxyURL)
}
if vv, ok := v.Metadata["disable_cooling"].(bool); !ok || !vv {
t.Errorf("expected disable_cooling true, got %v", v.Metadata["disable_cooling"])
}
if vv, ok := v.Metadata["request_retry"].(int); !ok || vv != 2 {
t.Errorf("expected request_retry 2, got %v", v.Metadata["request_retry"])
}
if v.Attributes["runtime_only"] != "true" {
t.Error("expected runtime_only=true")
}

View File

@@ -124,32 +124,47 @@ func (m *Manager) Stream(ctx context.Context, provider string, req *HTTPRequest)
out := make(chan StreamEvent)
go func() {
defer close(out)
send := func(ev StreamEvent) bool {
if ctx == nil {
out <- ev
return true
}
select {
case <-ctx.Done():
return false
case out <- ev:
return true
}
}
for {
select {
case <-ctx.Done():
out <- StreamEvent{Err: ctx.Err()}
return
case msg, ok := <-respCh:
if !ok {
out <- StreamEvent{Err: errors.New("wsrelay: stream closed")}
_ = send(StreamEvent{Err: errors.New("wsrelay: stream closed")})
return
}
switch msg.Type {
case MessageTypeStreamStart:
resp := decodeResponse(msg.Payload)
out <- StreamEvent{Type: MessageTypeStreamStart, Status: resp.Status, Headers: resp.Headers}
if okSend := send(StreamEvent{Type: MessageTypeStreamStart, Status: resp.Status, Headers: resp.Headers}); !okSend {
return
}
case MessageTypeStreamChunk:
chunk := decodeChunk(msg.Payload)
out <- StreamEvent{Type: MessageTypeStreamChunk, Payload: chunk}
if okSend := send(StreamEvent{Type: MessageTypeStreamChunk, Payload: chunk}); !okSend {
return
}
case MessageTypeStreamEnd:
out <- StreamEvent{Type: MessageTypeStreamEnd}
_ = send(StreamEvent{Type: MessageTypeStreamEnd})
return
case MessageTypeError:
out <- StreamEvent{Type: MessageTypeError, Err: decodeError(msg.Payload)}
_ = send(StreamEvent{Type: MessageTypeError, Err: decodeError(msg.Payload)})
return
case MessageTypeHTTPResp:
resp := decodeResponse(msg.Payload)
out <- StreamEvent{Type: MessageTypeHTTPResp, Status: resp.Status, Headers: resp.Headers, Payload: resp.Body}
_ = send(StreamEvent{Type: MessageTypeHTTPResp, Status: resp.Status, Headers: resp.Headers, Payload: resp.Body})
return
default:
}

View File

@@ -128,8 +128,23 @@ func (h *ClaudeCodeAPIHandler) ClaudeCountTokens(c *gin.Context) {
// Parameters:
// - c: The Gin context for the request.
func (h *ClaudeCodeAPIHandler) ClaudeModels(c *gin.Context) {
models := h.Models()
firstID := ""
lastID := ""
if len(models) > 0 {
if id, ok := models[0]["id"].(string); ok {
firstID = id
}
if id, ok := models[len(models)-1]["id"].(string); ok {
lastID = id
}
}
c.JSON(http.StatusOK, gin.H{
"data": h.Models(),
"data": models,
"has_more": false,
"first_id": firstID,
"last_id": lastID,
})
}

View File

@@ -124,6 +124,7 @@ func (h *GeminiCLIAPIHandler) CLIHandler(c *gin.Context) {
log.Errorf("Failed to read response body: %v", err)
return
}
c.Set("API_RESPONSE_TIMESTAMP", time.Now())
_, _ = c.Writer.Write(output)
c.Set("API_RESPONSE", output)
}

View File

@@ -56,8 +56,16 @@ func (h *GeminiAPIHandler) GeminiModels(c *gin.Context) {
for k, v := range model {
normalizedModel[k] = v
}
if name, ok := normalizedModel["name"].(string); ok && name != "" && !strings.HasPrefix(name, "models/") {
normalizedModel["name"] = "models/" + name
if name, ok := normalizedModel["name"].(string); ok && name != "" {
if !strings.HasPrefix(name, "models/") {
normalizedModel["name"] = "models/" + name
}
if displayName, _ := normalizedModel["displayName"].(string); displayName == "" {
normalizedModel["displayName"] = name
}
if description, _ := normalizedModel["description"].(string); description == "" {
normalizedModel["description"] = name
}
}
if _, ok := normalizedModel["supportedGenerationMethods"]; !ok {
normalizedModel["supportedGenerationMethods"] = defaultMethods

View File

@@ -361,6 +361,11 @@ func appendAPIResponse(c *gin.Context, data []byte) {
return
}
// Capture timestamp on first API response
if _, exists := c.Get("API_RESPONSE_TIMESTAMP"); !exists {
c.Set("API_RESPONSE_TIMESTAMP", time.Now())
}
if existing, exists := c.Get("API_RESPONSE"); exists {
if existingBytes, ok := existing.([]byte); ok && len(existingBytes) > 0 {
combined := make([]byte, 0, len(existingBytes)+len(data)+1)
@@ -385,6 +390,7 @@ func (h *BaseAPIHandler) ExecuteWithAuthManager(ctx context.Context, handlerType
return nil, errMsg
}
reqMeta := requestExecutionMetadata(ctx)
reqMeta[coreexecutor.RequestedModelMetadataKey] = normalizedModel
req := coreexecutor.Request{
Model: normalizedModel,
Payload: cloneBytes(rawJSON),
@@ -423,6 +429,7 @@ func (h *BaseAPIHandler) ExecuteCountWithAuthManager(ctx context.Context, handle
return nil, errMsg
}
reqMeta := requestExecutionMetadata(ctx)
reqMeta[coreexecutor.RequestedModelMetadataKey] = normalizedModel
req := coreexecutor.Request{
Model: normalizedModel,
Payload: cloneBytes(rawJSON),
@@ -464,6 +471,7 @@ func (h *BaseAPIHandler) ExecuteStreamWithAuthManager(ctx context.Context, handl
return nil, errChan
}
reqMeta := requestExecutionMetadata(ctx)
reqMeta[coreexecutor.RequestedModelMetadataKey] = normalizedModel
req := coreexecutor.Request{
Model: normalizedModel,
Payload: cloneBytes(rawJSON),
@@ -503,6 +511,32 @@ func (h *BaseAPIHandler) ExecuteStreamWithAuthManager(ctx context.Context, handl
bootstrapRetries := 0
maxBootstrapRetries := StreamingBootstrapRetries(h.Cfg)
sendErr := func(msg *interfaces.ErrorMessage) bool {
if ctx == nil {
errChan <- msg
return true
}
select {
case <-ctx.Done():
return false
case errChan <- msg:
return true
}
}
sendData := func(chunk []byte) bool {
if ctx == nil {
dataChan <- chunk
return true
}
select {
case <-ctx.Done():
return false
case dataChan <- chunk:
return true
}
}
bootstrapEligible := func(err error) bool {
status := statusFromError(err)
if status == 0 {
@@ -562,12 +596,14 @@ func (h *BaseAPIHandler) ExecuteStreamWithAuthManager(ctx context.Context, handl
addon = hdr.Clone()
}
}
errChan <- &interfaces.ErrorMessage{StatusCode: status, Error: streamErr, Addon: addon}
_ = sendErr(&interfaces.ErrorMessage{StatusCode: status, Error: streamErr, Addon: addon})
return
}
if len(chunk.Payload) > 0 {
sentPayload = true
dataChan <- cloneBytes(chunk.Payload)
if okSendData := sendData(cloneBytes(chunk.Payload)); !okSendData {
return
}
}
}
}

View File

@@ -70,6 +70,58 @@ func (e *failOnceStreamExecutor) Calls() int {
return e.calls
}
type payloadThenErrorStreamExecutor struct {
mu sync.Mutex
calls int
}
func (e *payloadThenErrorStreamExecutor) Identifier() string { return "codex" }
func (e *payloadThenErrorStreamExecutor) Execute(context.Context, *coreauth.Auth, coreexecutor.Request, coreexecutor.Options) (coreexecutor.Response, error) {
return coreexecutor.Response{}, &coreauth.Error{Code: "not_implemented", Message: "Execute not implemented"}
}
func (e *payloadThenErrorStreamExecutor) ExecuteStream(context.Context, *coreauth.Auth, coreexecutor.Request, coreexecutor.Options) (<-chan coreexecutor.StreamChunk, error) {
e.mu.Lock()
e.calls++
e.mu.Unlock()
ch := make(chan coreexecutor.StreamChunk, 2)
ch <- coreexecutor.StreamChunk{Payload: []byte("partial")}
ch <- coreexecutor.StreamChunk{
Err: &coreauth.Error{
Code: "upstream_closed",
Message: "upstream closed",
Retryable: false,
HTTPStatus: http.StatusBadGateway,
},
}
close(ch)
return ch, nil
}
func (e *payloadThenErrorStreamExecutor) Refresh(ctx context.Context, auth *coreauth.Auth) (*coreauth.Auth, error) {
return auth, nil
}
func (e *payloadThenErrorStreamExecutor) CountTokens(context.Context, *coreauth.Auth, coreexecutor.Request, coreexecutor.Options) (coreexecutor.Response, error) {
return coreexecutor.Response{}, &coreauth.Error{Code: "not_implemented", Message: "CountTokens not implemented"}
}
func (e *payloadThenErrorStreamExecutor) HttpRequest(ctx context.Context, auth *coreauth.Auth, req *http.Request) (*http.Response, error) {
return nil, &coreauth.Error{
Code: "not_implemented",
Message: "HttpRequest not implemented",
HTTPStatus: http.StatusNotImplemented,
}
}
func (e *payloadThenErrorStreamExecutor) Calls() int {
e.mu.Lock()
defer e.mu.Unlock()
return e.calls
}
func TestExecuteStreamWithAuthManager_RetriesBeforeFirstByte(t *testing.T) {
executor := &failOnceStreamExecutor{}
manager := coreauth.NewManager(nil, nil, nil)
@@ -130,3 +182,73 @@ func TestExecuteStreamWithAuthManager_RetriesBeforeFirstByte(t *testing.T) {
t.Fatalf("expected 2 stream attempts, got %d", executor.Calls())
}
}
func TestExecuteStreamWithAuthManager_DoesNotRetryAfterFirstByte(t *testing.T) {
executor := &payloadThenErrorStreamExecutor{}
manager := coreauth.NewManager(nil, nil, nil)
manager.RegisterExecutor(executor)
auth1 := &coreauth.Auth{
ID: "auth1",
Provider: "codex",
Status: coreauth.StatusActive,
Metadata: map[string]any{"email": "test1@example.com"},
}
if _, err := manager.Register(context.Background(), auth1); err != nil {
t.Fatalf("manager.Register(auth1): %v", err)
}
auth2 := &coreauth.Auth{
ID: "auth2",
Provider: "codex",
Status: coreauth.StatusActive,
Metadata: map[string]any{"email": "test2@example.com"},
}
if _, err := manager.Register(context.Background(), auth2); err != nil {
t.Fatalf("manager.Register(auth2): %v", err)
}
registry.GetGlobalRegistry().RegisterClient(auth1.ID, auth1.Provider, []*registry.ModelInfo{{ID: "test-model"}})
registry.GetGlobalRegistry().RegisterClient(auth2.ID, auth2.Provider, []*registry.ModelInfo{{ID: "test-model"}})
t.Cleanup(func() {
registry.GetGlobalRegistry().UnregisterClient(auth1.ID)
registry.GetGlobalRegistry().UnregisterClient(auth2.ID)
})
handler := NewBaseAPIHandlers(&sdkconfig.SDKConfig{
Streaming: sdkconfig.StreamingConfig{
BootstrapRetries: 1,
},
}, manager)
dataChan, errChan := handler.ExecuteStreamWithAuthManager(context.Background(), "openai", "test-model", []byte(`{"model":"test-model"}`), "")
if dataChan == nil || errChan == nil {
t.Fatalf("expected non-nil channels")
}
var got []byte
for chunk := range dataChan {
got = append(got, chunk...)
}
var gotErr error
var gotStatus int
for msg := range errChan {
if msg != nil && msg.Error != nil {
gotErr = msg.Error
gotStatus = msg.StatusCode
}
}
if string(got) != "partial" {
t.Fatalf("expected payload partial, got %q", string(got))
}
if gotErr == nil {
t.Fatalf("expected terminal error, got nil")
}
if gotStatus != http.StatusBadGateway {
t.Fatalf("expected status %d, got %d", http.StatusBadGateway, gotStatus)
}
if executor.Calls() != 1 {
t.Fatalf("expected 1 stream attempt, got %d", executor.Calls())
}
}

View File

@@ -0,0 +1,120 @@
package openai
import (
"context"
"errors"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/api/handlers"
coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
coreexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdkconfig "github.com/router-for-me/CLIProxyAPI/v6/sdk/config"
)
type compactCaptureExecutor struct {
alt string
sourceFormat string
calls int
}
func (e *compactCaptureExecutor) Identifier() string { return "test-provider" }
func (e *compactCaptureExecutor) Execute(ctx context.Context, auth *coreauth.Auth, req coreexecutor.Request, opts coreexecutor.Options) (coreexecutor.Response, error) {
e.calls++
e.alt = opts.Alt
e.sourceFormat = opts.SourceFormat.String()
return coreexecutor.Response{Payload: []byte(`{"ok":true}`)}, nil
}
func (e *compactCaptureExecutor) ExecuteStream(context.Context, *coreauth.Auth, coreexecutor.Request, coreexecutor.Options) (<-chan coreexecutor.StreamChunk, error) {
return nil, errors.New("not implemented")
}
func (e *compactCaptureExecutor) Refresh(ctx context.Context, auth *coreauth.Auth) (*coreauth.Auth, error) {
return auth, nil
}
func (e *compactCaptureExecutor) CountTokens(context.Context, *coreauth.Auth, coreexecutor.Request, coreexecutor.Options) (coreexecutor.Response, error) {
return coreexecutor.Response{}, errors.New("not implemented")
}
func (e *compactCaptureExecutor) HttpRequest(context.Context, *coreauth.Auth, *http.Request) (*http.Response, error) {
return nil, errors.New("not implemented")
}
func TestOpenAIResponsesCompactRejectsStream(t *testing.T) {
gin.SetMode(gin.TestMode)
executor := &compactCaptureExecutor{}
manager := coreauth.NewManager(nil, nil, nil)
manager.RegisterExecutor(executor)
auth := &coreauth.Auth{ID: "auth1", Provider: executor.Identifier(), Status: coreauth.StatusActive}
if _, err := manager.Register(context.Background(), auth); err != nil {
t.Fatalf("Register auth: %v", err)
}
registry.GetGlobalRegistry().RegisterClient(auth.ID, auth.Provider, []*registry.ModelInfo{{ID: "test-model"}})
t.Cleanup(func() {
registry.GetGlobalRegistry().UnregisterClient(auth.ID)
})
base := handlers.NewBaseAPIHandlers(&sdkconfig.SDKConfig{}, manager)
h := NewOpenAIResponsesAPIHandler(base)
router := gin.New()
router.POST("/v1/responses/compact", h.Compact)
req := httptest.NewRequest(http.MethodPost, "/v1/responses/compact", strings.NewReader(`{"model":"test-model","stream":true}`))
req.Header.Set("Content-Type", "application/json")
resp := httptest.NewRecorder()
router.ServeHTTP(resp, req)
if resp.Code != http.StatusBadRequest {
t.Fatalf("status = %d, want %d", resp.Code, http.StatusBadRequest)
}
if executor.calls != 0 {
t.Fatalf("executor calls = %d, want 0", executor.calls)
}
}
func TestOpenAIResponsesCompactExecute(t *testing.T) {
gin.SetMode(gin.TestMode)
executor := &compactCaptureExecutor{}
manager := coreauth.NewManager(nil, nil, nil)
manager.RegisterExecutor(executor)
auth := &coreauth.Auth{ID: "auth2", Provider: executor.Identifier(), Status: coreauth.StatusActive}
if _, err := manager.Register(context.Background(), auth); err != nil {
t.Fatalf("Register auth: %v", err)
}
registry.GetGlobalRegistry().RegisterClient(auth.ID, auth.Provider, []*registry.ModelInfo{{ID: "test-model"}})
t.Cleanup(func() {
registry.GetGlobalRegistry().UnregisterClient(auth.ID)
})
base := handlers.NewBaseAPIHandlers(&sdkconfig.SDKConfig{}, manager)
h := NewOpenAIResponsesAPIHandler(base)
router := gin.New()
router.POST("/v1/responses/compact", h.Compact)
req := httptest.NewRequest(http.MethodPost, "/v1/responses/compact", strings.NewReader(`{"model":"test-model","input":"hello"}`))
req.Header.Set("Content-Type", "application/json")
resp := httptest.NewRecorder()
router.ServeHTTP(resp, req)
if resp.Code != http.StatusOK {
t.Fatalf("status = %d, want %d", resp.Code, http.StatusOK)
}
if executor.alt != "responses/compact" {
t.Fatalf("alt = %q, want %q", executor.alt, "responses/compact")
}
if executor.sourceFormat != "openai-response" {
t.Fatalf("source format = %q, want %q", executor.sourceFormat, "openai-response")
}
if strings.TrimSpace(resp.Body.String()) != `{"ok":true}` {
t.Fatalf("body = %s", resp.Body.String())
}
}

View File

@@ -18,6 +18,7 @@ import (
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/api/handlers"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
// OpenAIResponsesAPIHandler contains the handlers for OpenAIResponses API endpoints.
@@ -91,6 +92,49 @@ func (h *OpenAIResponsesAPIHandler) Responses(c *gin.Context) {
}
func (h *OpenAIResponsesAPIHandler) Compact(c *gin.Context) {
rawJSON, err := c.GetRawData()
if err != nil {
c.JSON(http.StatusBadRequest, handlers.ErrorResponse{
Error: handlers.ErrorDetail{
Message: fmt.Sprintf("Invalid request: %v", err),
Type: "invalid_request_error",
},
})
return
}
streamResult := gjson.GetBytes(rawJSON, "stream")
if streamResult.Type == gjson.True {
c.JSON(http.StatusBadRequest, handlers.ErrorResponse{
Error: handlers.ErrorDetail{
Message: "Streaming not supported for compact responses",
Type: "invalid_request_error",
},
})
return
}
if streamResult.Exists() {
if updated, err := sjson.DeleteBytes(rawJSON, "stream"); err == nil {
rawJSON = updated
}
}
c.Header("Content-Type", "application/json")
modelName := gjson.GetBytes(rawJSON, "model").String()
cliCtx, cliCancel := h.GetContextWithCancel(h, c, context.Background())
stopKeepAlive := h.StartNonStreamingKeepAlive(c, cliCtx)
resp, errMsg := h.ExecuteWithAuthManager(cliCtx, h.HandlerType(), modelName, rawJSON, "responses/compact")
stopKeepAlive()
if errMsg != nil {
h.WriteErrorResponse(c, errMsg)
cliCancel(errMsg.Error)
return
}
_, _ = c.Writer.Write(resp)
cliCancel()
}
// handleNonStreamingResponse handles non-streaming chat completion responses
// for Gemini models. It selects a client from the pool, sends the request, and
// aggregates the response before sending it back to the client in OpenAIResponses format.

View File

@@ -2,15 +2,13 @@ package auth
import (
"context"
"encoding/json"
"fmt"
"io"
"net"
"net/http"
"net/url"
"strings"
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/antigravity"
"github.com/router-for-me/CLIProxyAPI/v6/internal/browser"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
@@ -19,20 +17,6 @@ import (
log "github.com/sirupsen/logrus"
)
const (
antigravityClientID = "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com"
antigravityClientSecret = "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"
antigravityCallbackPort = 51121
)
var antigravityScopes = []string{
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
"https://www.googleapis.com/auth/cclog",
"https://www.googleapis.com/auth/experimentsandconfigs",
}
// AntigravityAuthenticator implements OAuth login for the antigravity provider.
type AntigravityAuthenticator struct{}
@@ -60,12 +44,12 @@ func (AntigravityAuthenticator) Login(ctx context.Context, cfg *config.Config, o
opts = &LoginOptions{}
}
callbackPort := antigravityCallbackPort
callbackPort := antigravity.CallbackPort
if opts.CallbackPort > 0 {
callbackPort = opts.CallbackPort
}
httpClient := util.SetProxy(&cfg.SDKConfig, &http.Client{})
authSvc := antigravity.NewAntigravityAuth(cfg, nil)
state, err := misc.GenerateRandomState()
if err != nil {
@@ -83,7 +67,7 @@ func (AntigravityAuthenticator) Login(ctx context.Context, cfg *config.Config, o
}()
redirectURI := fmt.Sprintf("http://localhost:%d/oauth-callback", port)
authURL := buildAntigravityAuthURL(redirectURI, state)
authURL := authSvc.BuildAuthURL(state, redirectURI)
if !opts.NoBrowser {
fmt.Println("Opening browser for antigravity authentication")
@@ -164,22 +148,29 @@ waitForCallback:
return nil, fmt.Errorf("antigravity: missing authorization code")
}
tokenResp, errToken := exchangeAntigravityCode(ctx, cbRes.Code, redirectURI, httpClient)
tokenResp, errToken := authSvc.ExchangeCodeForTokens(ctx, cbRes.Code, redirectURI)
if errToken != nil {
return nil, fmt.Errorf("antigravity: token exchange failed: %w", errToken)
}
email := ""
if tokenResp.AccessToken != "" {
if info, errInfo := fetchAntigravityUserInfo(ctx, tokenResp.AccessToken, httpClient); errInfo == nil && strings.TrimSpace(info.Email) != "" {
email = strings.TrimSpace(info.Email)
}
accessToken := strings.TrimSpace(tokenResp.AccessToken)
if accessToken == "" {
return nil, fmt.Errorf("antigravity: token exchange returned empty access token")
}
email, errInfo := authSvc.FetchUserInfo(ctx, accessToken)
if errInfo != nil {
return nil, fmt.Errorf("antigravity: fetch user info failed: %w", errInfo)
}
email = strings.TrimSpace(email)
if email == "" {
return nil, fmt.Errorf("antigravity: empty email returned from user info")
}
// Fetch project ID via loadCodeAssist (same approach as Gemini CLI)
projectID := ""
if tokenResp.AccessToken != "" {
fetchedProjectID, errProject := fetchAntigravityProjectID(ctx, tokenResp.AccessToken, httpClient)
if accessToken != "" {
fetchedProjectID, errProject := authSvc.FetchProjectID(ctx, accessToken)
if errProject != nil {
log.Warnf("antigravity: failed to fetch project ID: %v", errProject)
} else {
@@ -204,7 +195,7 @@ waitForCallback:
metadata["project_id"] = projectID
}
fileName := sanitizeAntigravityFileName(email)
fileName := antigravity.CredentialFileName(email)
label := email
if label == "" {
label = "antigravity"
@@ -231,7 +222,7 @@ type callbackResult struct {
func startAntigravityCallbackServer(port int) (*http.Server, int, <-chan callbackResult, error) {
if port <= 0 {
port = antigravityCallbackPort
port = antigravity.CallbackPort
}
addr := fmt.Sprintf(":%d", port)
listener, err := net.Listen("tcp", addr)
@@ -267,309 +258,9 @@ func startAntigravityCallbackServer(port int) (*http.Server, int, <-chan callbac
return srv, port, resultCh, nil
}
type antigravityTokenResponse struct {
AccessToken string `json:"access_token"`
RefreshToken string `json:"refresh_token"`
ExpiresIn int64 `json:"expires_in"`
TokenType string `json:"token_type"`
}
func exchangeAntigravityCode(ctx context.Context, code, redirectURI string, httpClient *http.Client) (*antigravityTokenResponse, error) {
data := url.Values{}
data.Set("code", code)
data.Set("client_id", antigravityClientID)
data.Set("client_secret", antigravityClientSecret)
data.Set("redirect_uri", redirectURI)
data.Set("grant_type", "authorization_code")
req, err := http.NewRequestWithContext(ctx, http.MethodPost, "https://oauth2.googleapis.com/token", strings.NewReader(data.Encode()))
if err != nil {
return nil, err
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
resp, errDo := httpClient.Do(req)
if errDo != nil {
return nil, errDo
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("antigravity token exchange: close body error: %v", errClose)
}
}()
var token antigravityTokenResponse
if errDecode := json.NewDecoder(resp.Body).Decode(&token); errDecode != nil {
return nil, errDecode
}
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
return nil, fmt.Errorf("oauth token exchange failed: status %d", resp.StatusCode)
}
return &token, nil
}
type antigravityUserInfo struct {
Email string `json:"email"`
}
func fetchAntigravityUserInfo(ctx context.Context, accessToken string, httpClient *http.Client) (*antigravityUserInfo, error) {
if strings.TrimSpace(accessToken) == "" {
return &antigravityUserInfo{}, nil
}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://www.googleapis.com/oauth2/v1/userinfo?alt=json", nil)
if err != nil {
return nil, err
}
req.Header.Set("Authorization", "Bearer "+accessToken)
resp, errDo := httpClient.Do(req)
if errDo != nil {
return nil, errDo
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("antigravity userinfo: close body error: %v", errClose)
}
}()
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
return &antigravityUserInfo{}, nil
}
var info antigravityUserInfo
if errDecode := json.NewDecoder(resp.Body).Decode(&info); errDecode != nil {
return nil, errDecode
}
return &info, nil
}
func buildAntigravityAuthURL(redirectURI, state string) string {
params := url.Values{}
params.Set("access_type", "offline")
params.Set("client_id", antigravityClientID)
params.Set("prompt", "consent")
params.Set("redirect_uri", redirectURI)
params.Set("response_type", "code")
params.Set("scope", strings.Join(antigravityScopes, " "))
params.Set("state", state)
return "https://accounts.google.com/o/oauth2/v2/auth?" + params.Encode()
}
func sanitizeAntigravityFileName(email string) string {
if strings.TrimSpace(email) == "" {
return "antigravity.json"
}
replacer := strings.NewReplacer("@", "_", ".", "_")
return fmt.Sprintf("antigravity-%s.json", replacer.Replace(email))
}
// Antigravity API constants for project discovery
const (
antigravityAPIEndpoint = "https://cloudcode-pa.googleapis.com"
antigravityAPIVersion = "v1internal"
antigravityAPIUserAgent = "google-api-nodejs-client/9.15.1"
antigravityAPIClient = "google-cloud-sdk vscode_cloudshelleditor/0.1"
antigravityClientMetadata = `{"ideType":"IDE_UNSPECIFIED","platform":"PLATFORM_UNSPECIFIED","pluginType":"GEMINI"}`
)
// FetchAntigravityProjectID exposes project discovery for external callers.
func FetchAntigravityProjectID(ctx context.Context, accessToken string, httpClient *http.Client) (string, error) {
return fetchAntigravityProjectID(ctx, accessToken, httpClient)
}
// fetchAntigravityProjectID retrieves the project ID for the authenticated user via loadCodeAssist.
// This uses the same approach as Gemini CLI to get the cloudaicompanionProject.
func fetchAntigravityProjectID(ctx context.Context, accessToken string, httpClient *http.Client) (string, error) {
// Call loadCodeAssist to get the project
loadReqBody := map[string]any{
"metadata": map[string]string{
"ideType": "ANTIGRAVITY",
"platform": "PLATFORM_UNSPECIFIED",
"pluginType": "GEMINI",
},
}
rawBody, errMarshal := json.Marshal(loadReqBody)
if errMarshal != nil {
return "", fmt.Errorf("marshal request body: %w", errMarshal)
}
endpointURL := fmt.Sprintf("%s/%s:loadCodeAssist", antigravityAPIEndpoint, antigravityAPIVersion)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpointURL, strings.NewReader(string(rawBody)))
if err != nil {
return "", fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+accessToken)
req.Header.Set("Content-Type", "application/json")
req.Header.Set("User-Agent", antigravityAPIUserAgent)
req.Header.Set("X-Goog-Api-Client", antigravityAPIClient)
req.Header.Set("Client-Metadata", antigravityClientMetadata)
resp, errDo := httpClient.Do(req)
if errDo != nil {
return "", fmt.Errorf("execute request: %w", errDo)
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("antigravity loadCodeAssist: close body error: %v", errClose)
}
}()
bodyBytes, errRead := io.ReadAll(resp.Body)
if errRead != nil {
return "", fmt.Errorf("read response: %w", errRead)
}
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
return "", fmt.Errorf("request failed with status %d: %s", resp.StatusCode, strings.TrimSpace(string(bodyBytes)))
}
var loadResp map[string]any
if errDecode := json.Unmarshal(bodyBytes, &loadResp); errDecode != nil {
return "", fmt.Errorf("decode response: %w", errDecode)
}
// Extract projectID from response
projectID := ""
if id, ok := loadResp["cloudaicompanionProject"].(string); ok {
projectID = strings.TrimSpace(id)
}
if projectID == "" {
if projectMap, ok := loadResp["cloudaicompanionProject"].(map[string]any); ok {
if id, okID := projectMap["id"].(string); okID {
projectID = strings.TrimSpace(id)
}
}
}
if projectID == "" {
tierID := "legacy-tier"
if tiers, okTiers := loadResp["allowedTiers"].([]any); okTiers {
for _, rawTier := range tiers {
tier, okTier := rawTier.(map[string]any)
if !okTier {
continue
}
if isDefault, okDefault := tier["isDefault"].(bool); okDefault && isDefault {
if id, okID := tier["id"].(string); okID && strings.TrimSpace(id) != "" {
tierID = strings.TrimSpace(id)
break
}
}
}
}
projectID, err = antigravityOnboardUser(ctx, accessToken, tierID, httpClient)
if err != nil {
return "", err
}
return projectID, nil
}
return projectID, nil
}
// antigravityOnboardUser attempts to fetch the project ID via onboardUser by polling for completion.
// It returns an empty string when the operation times out or completes without a project ID.
func antigravityOnboardUser(ctx context.Context, accessToken, tierID string, httpClient *http.Client) (string, error) {
if httpClient == nil {
httpClient = http.DefaultClient
}
fmt.Println("Antigravity: onboarding user...", tierID)
requestBody := map[string]any{
"tierId": tierID,
"metadata": map[string]string{
"ideType": "ANTIGRAVITY",
"platform": "PLATFORM_UNSPECIFIED",
"pluginType": "GEMINI",
},
}
rawBody, errMarshal := json.Marshal(requestBody)
if errMarshal != nil {
return "", fmt.Errorf("marshal request body: %w", errMarshal)
}
maxAttempts := 5
for attempt := 1; attempt <= maxAttempts; attempt++ {
log.Debugf("Polling attempt %d/%d", attempt, maxAttempts)
reqCtx := ctx
var cancel context.CancelFunc
if reqCtx == nil {
reqCtx = context.Background()
}
reqCtx, cancel = context.WithTimeout(reqCtx, 30*time.Second)
endpointURL := fmt.Sprintf("%s/%s:onboardUser", antigravityAPIEndpoint, antigravityAPIVersion)
req, errRequest := http.NewRequestWithContext(reqCtx, http.MethodPost, endpointURL, strings.NewReader(string(rawBody)))
if errRequest != nil {
cancel()
return "", fmt.Errorf("create request: %w", errRequest)
}
req.Header.Set("Authorization", "Bearer "+accessToken)
req.Header.Set("Content-Type", "application/json")
req.Header.Set("User-Agent", antigravityAPIUserAgent)
req.Header.Set("X-Goog-Api-Client", antigravityAPIClient)
req.Header.Set("Client-Metadata", antigravityClientMetadata)
resp, errDo := httpClient.Do(req)
if errDo != nil {
cancel()
return "", fmt.Errorf("execute request: %w", errDo)
}
bodyBytes, errRead := io.ReadAll(resp.Body)
if errClose := resp.Body.Close(); errClose != nil {
log.Errorf("close body error: %v", errClose)
}
cancel()
if errRead != nil {
return "", fmt.Errorf("read response: %w", errRead)
}
if resp.StatusCode == http.StatusOK {
var data map[string]any
if errDecode := json.Unmarshal(bodyBytes, &data); errDecode != nil {
return "", fmt.Errorf("decode response: %w", errDecode)
}
if done, okDone := data["done"].(bool); okDone && done {
projectID := ""
if responseData, okResp := data["response"].(map[string]any); okResp {
switch projectValue := responseData["cloudaicompanionProject"].(type) {
case map[string]any:
if id, okID := projectValue["id"].(string); okID {
projectID = strings.TrimSpace(id)
}
case string:
projectID = strings.TrimSpace(projectValue)
}
}
if projectID != "" {
log.Infof("Successfully fetched project_id: %s", projectID)
return projectID, nil
}
return "", fmt.Errorf("no project_id in response")
}
time.Sleep(2 * time.Second)
continue
}
responsePreview := strings.TrimSpace(string(bodyBytes))
if len(responsePreview) > 500 {
responsePreview = responsePreview[:500]
}
responseErr := responsePreview
if len(responseErr) > 200 {
responseErr = responseErr[:200]
}
return "", fmt.Errorf("http %d: %s", resp.StatusCode, responseErr)
}
return "", nil
cfg := &config.Config{}
authSvc := antigravity.NewAntigravityAuth(cfg, httpClient)
return authSvc.FetchProjectID(ctx, accessToken)
}

View File

@@ -176,13 +176,16 @@ waitForCallback:
}
if result.State != state {
log.Errorf("State mismatch: expected %s, got %s", state, result.State)
return nil, claude.NewAuthenticationError(claude.ErrInvalidState, fmt.Errorf("state mismatch"))
}
log.Debug("Claude authorization code received; exchanging for tokens")
log.Debugf("Code: %s, State: %s", result.Code[:min(20, len(result.Code))], state)
authBundle, err := authSvc.ExchangeCodeForTokens(ctx, result.Code, state, pkceCodes)
if err != nil {
log.Errorf("Token exchange failed: %v", err)
return nil, claude.NewAuthenticationError(claude.ErrCodeExchangeFailed, err)
}

View File

@@ -2,6 +2,8 @@ package auth
import (
"context"
"crypto/sha256"
"encoding/hex"
"fmt"
"net/http"
"strings"
@@ -191,7 +193,19 @@ waitForCallback:
return nil, fmt.Errorf("codex token storage missing account information")
}
fileName := fmt.Sprintf("codex-%s.json", tokenStorage.Email)
planType := ""
hashAccountID := ""
if tokenStorage.IDToken != "" {
if claims, errParse := codex.ParseJWTToken(tokenStorage.IDToken); errParse == nil && claims != nil {
planType = strings.TrimSpace(claims.CodexAuthInfo.ChatgptPlanType)
accountID := strings.TrimSpace(claims.CodexAuthInfo.ChatgptAccountID)
if accountID != "" {
digest := sha256.Sum256([]byte(accountID))
hashAccountID = hex.EncodeToString(digest[:])[:8]
}
}
}
fileName := codex.CredentialFileName(tokenStorage.Email, planType, hashAccountID, true)
metadata := map[string]any{
"email": tokenStorage.Email,
}

Some files were not shown because too many files have changed in this diff Show More