Commit Graph

192 Commits

Author SHA1 Message Date
hkfires
28a428ae2f fix(thinking): align budget effort mapping across translators
Unify thinking budget-to-effort conversion in a shared helper, handle disabled/default thinking cases in translators, adjust zero-budget mapping, and drop the old OpenAI-specific helper with updated tests.
2025-12-16 18:34:43 +08:00
hkfires
b326ec3641 feat(iflow): add thinking support for iFlow models 2025-12-16 18:34:43 +08:00
hkfires
367a05bdf6 refactor(thinking): export thinking helpers
Expose thinking/effort normalization helpers from the executor package
so conversion tests use production code and stay aligned with runtime
validation behavior.
2025-12-15 09:16:15 +08:00
hkfires
712ce9f781 fix(thinking): drop unsupported none effort
When budget 0 maps to "none" for models that use thinking levels
but don't support that effort level, strip thinking fields instead
of setting an invalid reasoning_effort value.
Tests now expect removal for this edge case.
2025-12-15 09:16:14 +08:00
hkfires
27a5ad8ec2 Fixed: #534
fix(aistudio): correct JSON string boundary detection for backslash sequences
2025-12-15 09:00:14 +08:00
Luis Pater
14ce6aebd1 Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing
fix(gemini-cli): enhance 429 retry delay parsing
2025-12-14 14:04:14 +08:00
Luis Pater
660aabc437 fix(executor): add allowCompat support for reasoning effort normalization
Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.
2025-12-13 04:06:02 +08:00
Luis Pater
f3f0f1717d Merge branch 'dev' into think 2025-12-12 22:16:44 +08:00
Luis Pater
7621ec609e Merge pull request #501 from huynguyen03dev/fix/openai-compat-model-alias-resolution
fix(openai-compat): prevent model alias from being overwritten
2025-12-12 21:58:15 +08:00
Luis Pater
9f511f0024 fix(executor): improve model compatibility handling for OpenAI-compatibility
Enhances payload handling by introducing OpenAI-compatibility checks and refining how reasoning metadata is resolved, ensuring broader model support.
2025-12-12 21:57:25 +08:00
hkfires
374faa2640 fix(thinking): map budgets to effort levels
Ensure thinking settings translate correctly across providers:
- Only apply reasoning_effort to level-based models and derive it from numeric
  budget suffixes when present
- Strip effort string fields for budget-based models and skip Claude/Gemini
  budget resolution for level-based or unsupported models
- Default Gemini include_thoughts when a nonzero budget override is set
- Add cross-protocol conversion and budget range tests
2025-12-12 21:33:20 +08:00
huynguyen03.dev
15c3cc3a50 fix(openai-compat): prevent model alias from being overwritten by ResolveOriginalModel
When using OpenAI-compatible providers with model aliases (e.g., glm-4.6-zai -> glm-4.6),
the alias resolution was correctly applied but then immediately overwritten by
ResolveOriginalModel, causing 'Unknown Model' errors from upstream APIs.

This fix skips the ResolveOriginalModel override when a model alias has already
been resolved, ensuring the correct model name is sent to the upstream provider.

Co-authored-by: Amp <amp@ampcode.com>
2025-12-12 17:20:24 +07:00
hkfires
d131435e25 fix(codex): raise default reasoning effort to medium 2025-12-12 18:18:48 +08:00
hkfires
3c315551b0 refactor(executor): relocate gemini token counters 2025-12-11 21:56:44 +08:00
hkfires
27c9c5c4da refactor(executor): clarify executor comments and oauth names 2025-12-11 21:56:44 +08:00
hkfires
fc9f6c974a refactor(executor): clarify providers and streams
Add package and constructor documentation for AI Studio, Antigravity,
Gemini CLI, Gemini API, and Vertex executors to describe their roles and
inputs.

Introduce a shared stream scanner buffer constant in the Gemini API
executor and reuse it in Gemini CLI and Vertex streaming code so stream
handling uses a consistent configuration.

Update Refresh implementations for AI Studio, Gemini CLI, Gemini API
(API key), and Vertex executors to short‑circuit and simply return the
incoming auth object, while keeping Antigravity token renewal as the
only executor that performs OAuth refresh.

Remove OAuth2-based token refresh logic and related dependencies from
the Gemini API executor, since it now operates strictly with API key
credentials.
2025-12-11 21:56:43 +08:00
Luis Pater
a74ee3f319 Merge pull request #481 from sususu98/fix/increase-buffer-size
fix: increase buffer size for stream scanners to 50MB across multiple executors
2025-12-11 21:20:54 +08:00
hkfires
e79f65fd8e refactor(thinking): use parentheses for metadata suffix 2025-12-11 18:39:07 +08:00
hkfires
facfe7c518 refactor(thinking): use bracket tags for thinking meta
Align thinking suffix handling on a single bracket-style marker.

NormalizeThinkingModel strips a terminal `[value]` segment from
model identifiers and turns it into either a thinking budget (for
numeric values) or a reasoning effort hint (for strings). Emission
of `ThinkingIncludeThoughtsMetadataKey` is removed.

Executor helpers and the example config are updated so their
comments reference the new `[value]` suffix format instead of the
legacy dash variants.

BREAKING CHANGE: dash-based thinking suffixes (`-thinking`,
`-thinking-N`, `-reasoning`, `-nothinking`) are no longer parsed
for thinking metadata; only `[value]` annotations are recognized.
2025-12-11 18:17:28 +08:00
hkfires
6285459c08 fix(runtime): unify claude thinking config resolution 2025-12-11 17:20:44 +08:00
hkfires
21bbceca0c docs(runtime): document reasoning effort precedence 2025-12-11 16:35:36 +08:00
hkfires
f6300c72b7 fix(runtime): validate thinking config in iflow and qwen 2025-12-11 16:21:50 +08:00
hkfires
3a81ab22fd fix(runtime): unify reasoning effort metadata overrides 2025-12-11 14:35:05 +08:00
hkfires
519da2e042 fix(runtime): validate reasoning effort levels 2025-12-11 12:36:54 +08:00
hkfires
3ffd120ae9 feat(runtime): add thinking config normalization 2025-12-11 11:51:33 +08:00
sususu
07d21463ca fix(gemini-cli): enhance 429 retry delay parsing
Add fallback parsing for quota reset delay when RetryInfo is not present:
- Try ErrorInfo.metadata.quotaResetDelay (e.g., "373.801628ms")
- Parse from error.message "Your quota will reset after Xs."

This ensures proper cooldown timing for rate-limited requests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:34:39 +08:00
Luis Pater
423ce97665 feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
2025-12-11 03:10:50 +08:00
sususu
76c563d161 fix(executor): increase buffer size for stream scanners to 50MB across multiple executors 2025-12-10 23:20:04 +08:00
Luis Pater
94d61c7b2b fix(logging): update response aggregation logic to include all attempts 2025-12-10 16:53:48 +08:00
Luis Pater
f25f419e5a fix(antigravity): remove references to autopush endpoint and update fallback logic 2025-12-10 00:13:20 +08:00
hkfires
9b202b6c1c fix(executor): centralize default thinking config 2025-12-09 21:05:06 +08:00
hkfires
6a66b6801a feat(executor): enforce minimum thinking budget for antigravity models 2025-12-09 21:05:06 +08:00
hkfires
5ec9b5e5a9 feat(executor): normalize thinking budget across all Gemini executors 2025-12-09 21:05:06 +08:00
Luis Pater
5db3b58717 Merge pull request #470 from router-for-me/agry
fix(gemini): normalize model listing output
2025-12-09 21:00:29 +08:00
Luis Pater
39b6b3b289 Fixed: #463
fix(antigravity): remove `$ref` and `$defs` from JSON during key deletion
2025-12-09 17:32:17 +08:00
hkfires
e5312fb5a2 feat(antigravity): support canonical names for antigravity models 2025-12-09 16:54:13 +08:00
hkfires
96b55acff8 feat(aistudio): normalize thinking budget in request translation 2025-12-09 08:27:44 +08:00
Luis Pater
af00304b0c fix(antigravity): remove exclusiveMaximum from JSON during key deletion 2025-12-08 23:28:01 +08:00
Luis Pater
6ad188921c refactor(logging): remove unused variable in ensureAttempt and redundant function call 2025-12-08 22:25:58 +08:00
hkfires
a283545b6b feat(antigravity): enforce thinking budget limits for Claude models 2025-12-08 20:36:17 +08:00
hkfires
9c09128e00 feat(registry): add explicit thinking support config for antigravity models 2025-12-07 19:12:55 +08:00
Luis Pater
fd29ab418a Fixed: #424
**feat(antigravity): add support for maxOutputTokens and refine Claude model handling**
2025-12-07 01:55:57 +08:00
Luis Pater
d7564173dd fix(antigravity): restore production base URL in the executor 2025-12-06 01:11:37 +08:00
Luis Pater
c44c46dd80 Fixed: #421
feat(antigravity): implement project ID retrieval and integration in payload processing
2025-12-06 00:40:55 +08:00
Luis Pater
d4d529833d **refactor(antigravity): handle anyOf property, remove exclusiveMinimum, and comment unused prod URL** 2025-12-05 21:24:12 +08:00
Luis Pater
d6352dd4d4 **feat(util): add DeleteKey function and update antigravity executor for Claude model compatibility** 2025-12-05 01:55:45 +08:00
Luis Pater
bceecfb2e3 Fixed: #414
**refactor(gemini): comment out unused CLI preview entry**
2025-12-04 17:55:13 +08:00
Luis Pater
6a2906e3e5 **feat(antigravity): add support for Claude-Opus-4-5-Thinking model** 2025-12-04 16:13:13 +08:00
Luis Pater
e93f87294a refactor(antigravity): uncomment prod environment URL in fallback chain 2025-12-02 22:47:18 +08:00
Luis Pater
0fd2abbc3b **refactor(cliproxy, config): remove vertex-compat flow, streamline Vertex API key handling**
- Removed `vertex-compat` executor and related configuration.
- Consolidated Vertex compatibility checks into `vertex` handling with `apikey`-based model resolution.
- Streamlined model generation logic for Vertex API key entries.
2025-12-02 09:18:24 +08:00