CLIProxyAPI

mirror of https://github.com/router-for-me/CLIProxyAPI.git synced 2026-02-03 04:50:52 +08:00

Author	SHA1	Message	Date
huynguyen03.dev	15c3cc3a50	fix(openai-compat): prevent model alias from being overwritten by ResolveOriginalModel When using OpenAI-compatible providers with model aliases (e.g., glm-4.6-zai -> glm-4.6), the alias resolution was correctly applied but then immediately overwritten by ResolveOriginalModel, causing 'Unknown Model' errors from upstream APIs. This fix skips the ResolveOriginalModel override when a model alias has already been resolved, ensuring the correct model name is sent to the upstream provider. Co-authored-by: Amp <amp@ampcode.com>	2025-12-12 17:20:24 +07:00
hkfires	3c315551b0	refactor(executor): relocate gemini token counters	2025-12-11 21:56:44 +08:00
hkfires	27c9c5c4da	refactor(executor): clarify executor comments and oauth names	2025-12-11 21:56:44 +08:00
hkfires	fc9f6c974a	refactor(executor): clarify providers and streams Add package and constructor documentation for AI Studio, Antigravity, Gemini CLI, Gemini API, and Vertex executors to describe their roles and inputs. Introduce a shared stream scanner buffer constant in the Gemini API executor and reuse it in Gemini CLI and Vertex streaming code so stream handling uses a consistent configuration. Update Refresh implementations for AI Studio, Gemini CLI, Gemini API (API key), and Vertex executors to short‑circuit and simply return the incoming auth object, while keeping Antigravity token renewal as the only executor that performs OAuth refresh. Remove OAuth2-based token refresh logic and related dependencies from the Gemini API executor, since it now operates strictly with API key credentials.	2025-12-11 21:56:43 +08:00
Luis Pater	a74ee3f319	Merge pull request #481 from sususu98/fix/increase-buffer-size fix: increase buffer size for stream scanners to 50MB across multiple executors	2025-12-11 21:20:54 +08:00
hkfires	e79f65fd8e	refactor(thinking): use parentheses for metadata suffix	2025-12-11 18:39:07 +08:00
hkfires	facfe7c518	refactor(thinking): use bracket tags for thinking meta Align thinking suffix handling on a single bracket-style marker. NormalizeThinkingModel strips a terminal `[value]` segment from model identifiers and turns it into either a thinking budget (for numeric values) or a reasoning effort hint (for strings). Emission of `ThinkingIncludeThoughtsMetadataKey` is removed. Executor helpers and the example config are updated so their comments reference the new `[value]` suffix format instead of the legacy dash variants. BREAKING CHANGE: dash-based thinking suffixes (`-thinking`, `-thinking-N`, `-reasoning`, `-nothinking`) are no longer parsed for thinking metadata; only `[value]` annotations are recognized.	2025-12-11 18:17:28 +08:00
hkfires	6285459c08	fix(runtime): unify claude thinking config resolution	2025-12-11 17:20:44 +08:00
hkfires	21bbceca0c	docs(runtime): document reasoning effort precedence	2025-12-11 16:35:36 +08:00
hkfires	f6300c72b7	fix(runtime): validate thinking config in iflow and qwen	2025-12-11 16:21:50 +08:00
hkfires	3a81ab22fd	fix(runtime): unify reasoning effort metadata overrides	2025-12-11 14:35:05 +08:00
hkfires	519da2e042	fix(runtime): validate reasoning effort levels	2025-12-11 12:36:54 +08:00
hkfires	3ffd120ae9	feat(runtime): add thinking config normalization	2025-12-11 11:51:33 +08:00
Luis Pater	423ce97665	feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic - Added support for parsing and normalizing dynamic thinking model suffixes. - Centralized budget resolution across executors and payload helpers. - Retired legacy Gemini-specific thinking handlers in favor of unified logic. - Updated executors to use metadata-based thinking configuration. - Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata. - Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs. - Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly. - Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models. - Improved handling of thinking configurations and model overrides in executors. - Removed hardcoded thinking model entries and migrated logic to metadata-based resolution. - Updated payload mutations to always include the resolved model.	2025-12-11 03:10:50 +08:00
sususu	76c563d161	fix(executor): increase buffer size for stream scanners to 50MB across multiple executors	2025-12-10 23:20:04 +08:00
Luis Pater	94d61c7b2b	fix(logging): update response aggregation logic to include all attempts	2025-12-10 16:53:48 +08:00
Luis Pater	f25f419e5a	fix(antigravity): remove references to `autopush` endpoint and update fallback logic	2025-12-10 00:13:20 +08:00
hkfires	9b202b6c1c	fix(executor): centralize default thinking config	2025-12-09 21:05:06 +08:00
hkfires	6a66b6801a	feat(executor): enforce minimum thinking budget for antigravity models	2025-12-09 21:05:06 +08:00
hkfires	5ec9b5e5a9	feat(executor): normalize thinking budget across all Gemini executors	2025-12-09 21:05:06 +08:00
Luis Pater	5db3b58717	Merge pull request #470 from router-for-me/agry fix(gemini): normalize model listing output	2025-12-09 21:00:29 +08:00
Luis Pater	39b6b3b289	Fixed: #463 fix(antigravity): remove `$ref` and `$defs` from JSON during key deletion	2025-12-09 17:32:17 +08:00
hkfires	e5312fb5a2	feat(antigravity): support canonical names for antigravity models	2025-12-09 16:54:13 +08:00
hkfires	96b55acff8	feat(aistudio): normalize thinking budget in request translation	2025-12-09 08:27:44 +08:00
Luis Pater	af00304b0c	fix(antigravity): remove `exclusiveMaximum` from JSON during key deletion	2025-12-08 23:28:01 +08:00
Luis Pater	6ad188921c	refactor(logging): remove unused variable in `ensureAttempt` and redundant function call	2025-12-08 22:25:58 +08:00
hkfires	a283545b6b	feat(antigravity): enforce thinking budget limits for Claude models	2025-12-08 20:36:17 +08:00
hkfires	9c09128e00	feat(registry): add explicit thinking support config for antigravity models	2025-12-07 19:12:55 +08:00
Luis Pater	fd29ab418a	Fixed: #424 feat(antigravity): add support for maxOutputTokens and refine Claude model handling	2025-12-07 01:55:57 +08:00
Luis Pater	d7564173dd	fix(antigravity): restore production base URL in the executor	2025-12-06 01:11:37 +08:00
Luis Pater	c44c46dd80	Fixed: #421 feat(antigravity): implement project ID retrieval and integration in payload processing	2025-12-06 00:40:55 +08:00
Luis Pater	d4d529833d	refactor(antigravity): handle `anyOf` property, remove `exclusiveMinimum`, and comment unused prod URL	2025-12-05 21:24:12 +08:00
Luis Pater	d6352dd4d4	feat(util): add DeleteKey function and update antigravity executor for Claude model compatibility	2025-12-05 01:55:45 +08:00
Luis Pater	bceecfb2e3	Fixed: #414 refactor(gemini): comment out unused CLI preview entry	2025-12-04 17:55:13 +08:00
Luis Pater	6a2906e3e5	feat(antigravity): add support for Claude-Opus-4-5-Thinking model	2025-12-04 16:13:13 +08:00
Luis Pater	e93f87294a	refactor(antigravity): uncomment prod environment URL in fallback chain	2025-12-02 22:47:18 +08:00
Luis Pater	0fd2abbc3b	refactor(cliproxy, config): remove vertex-compat flow, streamline Vertex API key handling - Removed `vertex-compat` executor and related configuration. - Consolidated Vertex compatibility checks into `vertex` handling with `apikey`-based model resolution. - Streamlined model generation logic for Vertex API key entries.	2025-12-02 09:18:24 +08:00
Aero	0ebb654019	feat: Add support for VertexAI compatible service (#375) feat: consolidate Vertex AI compatibility with API key support in Gemini	2025-12-02 08:14:22 +08:00
auroraflux	1c6f4be8ae	refactor(executor): dedupe thinking metadata helpers across Gemini executors Extract applyThinkingMetadata and applyThinkingMetadataCLI helpers to payload_helpers.go and use them across all four Gemini-based executors: - gemini_executor.go (Execute, ExecuteStream, CountTokens) - gemini_cli_executor.go (Execute, ExecuteStream, CountTokens) - aistudio_executor.go (translateRequest) - antigravity_executor.go (Execute, ExecuteStream) This eliminates code duplication introduced in the -reasoning suffix PR and centralizes the thinking config application logic. Net reduction: 28 lines of code.	2025-11-30 15:20:15 -08:00
Luis Pater	73208c4e55	Merge pull request #376 from auroraflux/feat/reasoning-suffix-support feat(util): add -reasoning suffix support for Gemini models	2025-11-30 20:55:38 +08:00
auroraflux	32d3809f8c	feat(util): add -reasoning suffix support for Gemini models Adds support for the `-reasoning` model name suffix which enables thinking/reasoning mode with dynamic budget. This allows clients to request reasoning-enabled inference using model names like `gemini-2.5-flash-reasoning` without explicit configuration. The suffix is normalized to the base model (e.g., gemini-2.5-flash) with thinkingBudget=-1 (dynamic) and include_thoughts=true. Follows the existing pattern established by -nothinking and -thinking-N suffixes.	2025-11-30 01:18:57 -08:00
Luis Pater	a748e93fd9	fix(executor, auth): ensure index assignment consistency for auth objects - Updated `usage_helpers.go` to call `EnsureIndex()` for proper index assignment in reporter initialization. - Adjusted `auth/manager.go` to assign auth indices inside a locked section when they are unassigned, ensuring thread safety and consistency.	2025-11-30 16:56:29 +08:00
nestharus	e73cdf5cff	fix(claude): ensure max_tokens exceeds thinking budget for thinking models Fixes an issue where Claude thinking models would return 400 errors when the thinking.budget_tokens was greater than or equal to max_tokens. Changes: - Add MaxCompletionTokens: 128000 to all Claude thinking model definitions - Add ensureMaxTokensForThinking() function in claude_executor.go that: - Checks if thinking is enabled with a budget_tokens value - Looks up the model's MaxCompletionTokens from the registry - Ensures max_tokens is set to at least the model's MaxCompletionTokens - Falls back to budget_tokens + 4000 buffer if registry lookup fails This ensures Anthropic API constraint (max_tokens > thinking.budget_tokens) is always satisfied when using extended thinking features. Fixes: #339 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-26 22:31:05 -08:00
Luis Pater	a4a26d978e	Fixed: #339 feat(handlers, executor): add Gemini 3 Pro Preview support and refine Claude system instructions - Added support for the new "Gemini 3 Pro Preview" action in Gemini handlers, including detailed metadata and configuration. - Removed redundant `cache_control` field from Claude system instructions for cleaner payload structure.	2025-11-26 11:42:57 +08:00
Luis Pater	ed9f6e897e	Fixed: #337 fix(executor): replace redundant commented code with `checkSystemInstructions` helper - Replaced commented-out `sjson.SetRawBytes` lines with the new `checkSystemInstructions` function. - Centralized system instruction handling for better code clarity and reuse. - Ensured consistent logic for managing `system` field across Claude executor flows.	2025-11-26 08:27:48 +08:00
Luis Pater	2e5681ea32	Merge branch 'dev' into feat/claude-thinking-and-beta-headers	2025-11-26 02:16:40 +08:00
Luis Pater	52c17f03a5	fix(executor): comment out redundant code for setting Claude system instructions - Commented out multiple instances of `sjson.SetRawBytes` for setting `system` key to Claude instructions as they are redundant. - Code cleanup to improve clarity and maintainability without affecting functionality.	2025-11-26 02:06:16 +08:00
nestharus	d0e694d4ed	feat(claude): add thinking model variants and beta headers support - Add Claude thinking model definitions (sonnet-4-5-thinking, opus-4-5-thinking variants) - Add Thinking support for antigravity models with -thinking suffix - Add injectThinkingConfig() for automatic thinking budget based on model suffix - Add resolveUpstreamModel() mappings for thinking variants to actual Claude models - Add extractAndRemoveBetas() to convert betas array to anthropic-beta header - Update applyClaudeHeaders() to merge custom betas from request body Closes #324	2025-11-25 03:33:05 -08:00
Luis Pater	113db3c5bf	fix(executor): update antigravity executor to enhance model metadata handling - Added additional metadata fields (`Name`, `Description`, `DisplayName`, `Version`) to `ModelInfo` struct initialization for better model representation. - Removed unnecessary whitespace in the code.	2025-11-25 09:19:01 +08:00
Luis Pater	ddb0c0ec1c	fix(translator): reintroduce `thoughtSignature` bypass logic for model parts - Restored `thoughtSignature` validator bypass for model-specific parts in Gemini content processing. - Removed redundant logic from the `executor` for cleaner handling.	2025-11-23 20:52:23 +08:00
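
Commit e73cdf5cff above describes how `max_tokens` is kept above `thinking.budget_tokens` for Claude thinking models. The following is a minimal sketch of that rule, not the repository's `ensureMaxTokensForThinking()` itself. The function name `ensureMaxTokens`, the map standing in for the model registry, and the model key are hypothetical; only the 128000 limit and the 4000-token buffer come from the commit message. Leaving `max_tokens` untouched when it already exceeds the budget and no registry entry exists is also an assumption of this sketch.

```go
package main

import "fmt"

// thinkingBuffer is the fallback headroom named in commit e73cdf5cff.
const thinkingBuffer = 4000

// maxCompletionTokens is a hypothetical stand-in for the model registry.
// The key is illustrative; 128000 is the value given in the commit message.
var maxCompletionTokens = map[string]int{
	"claude-opus-4-5-thinking": 128000,
}

// ensureMaxTokens returns a max_tokens value strictly greater than the
// thinking budget. A budget <= 0 is treated as "thinking not enabled".
func ensureMaxTokens(model string, maxTokens, budget int) int {
	if budget <= 0 {
		return maxTokens // nothing to enforce
	}
	// Registry hit: raise max_tokens to at least the model's limit.
	if limit, ok := maxCompletionTokens[model]; ok && maxTokens < limit {
		maxTokens = limit
	}
	// Fallback: if the constraint still does not hold, add the buffer.
	if maxTokens <= budget {
		maxTokens = budget + thinkingBuffer
	}
	return maxTokens
}

func main() {
	fmt.Println(ensureMaxTokens("claude-opus-4-5-thinking", 8192, 16000)) // 128000
	fmt.Println(ensureMaxTokens("unknown-model", 8192, 16000))            // 20000
	fmt.Println(ensureMaxTokens("unknown-model", 8192, 0))                // 8192
}
```

In every branch where thinking is enabled, the returned value is greater than the budget, which is the Anthropic API constraint the commit cites.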


179 Commits
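
Commit facfe7c518 in the log describes stripping a terminal `[value]` segment from a model identifier and turning it into either a thinking budget (numeric value) or a reasoning effort hint (string value). A rough sketch of that parsing step follows. It is not the project's `NormalizeThinkingModel`; the names `splitThinkingSuffix` and `thinkingMeta` are hypothetical, and lowercasing the effort string is an assumption. Note that the later commit e79f65fd8e switched the suffix to parentheses, so the delimiters here reflect only the bracket form.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// thinkingMeta holds what the suffix encodes. Field names are illustrative.
type thinkingMeta struct {
	Budget    int    // meaningful only when HasBudget is true
	HasBudget bool   // true when the suffix value was numeric
	Effort    string // set when the suffix value was a non-numeric string
}

// splitThinkingSuffix strips a terminal "[value]" from model and reports
// what the value means. Models without such a suffix are returned unchanged.
func splitThinkingSuffix(model string) (string, thinkingMeta) {
	var meta thinkingMeta
	if !strings.HasSuffix(model, "]") {
		return model, meta
	}
	open := strings.LastIndex(model, "[")
	if open <= 0 { // no "[" found, or nothing before it
		return model, meta
	}
	value := strings.TrimSpace(model[open+1 : len(model)-1])
	if value == "" {
		return model, meta
	}
	if n, err := strconv.Atoi(value); err == nil {
		meta.Budget, meta.HasBudget = n, true // numeric: thinking budget
	} else {
		meta.Effort = strings.ToLower(value) // string: reasoning effort hint
	}
	return model[:open], meta
}

func main() {
	for _, m := range []string{"gemini-2.5-flash[8192]", "gemini-2.5-flash[high]", "gemini-2.5-flash"} {
		base, meta := splitThinkingSuffix(m)
		fmt.Printf("%s -> %s %+v\n", m, base, meta)
	}
}
```

Returning the metadata alongside the base name, rather than mutating a request, keeps the sketch independent of any executor or payload type.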