CLIProxyAPI

mirror of https://github.com/router-for-me/CLIProxyAPI.git synced 2026-02-03 04:50:52 +08:00

Author	SHA1	Message	Date
Luis Pater	8d15723195	feat(registry): add `GetAvailableModelsByProvider` method for retrieving models by provider	2025-12-31 23:37:46 +08:00
hkfires	e332419081	feat(registry): add thinking support for gemini-2.5-computer-use-preview model	2025-12-31 17:09:22 +08:00
hkfires	ce7474d953	feat(cliproxy): propagate thinking support metadata to aliased models	2025-12-30 15:16:54 +08:00
leaph	6403ff4ec4	feat(iflow): add model-specific thinking configs for GLM-4.7 and MiniMax-M2.1 - GLM-4.7: Uses extra_body={"thinking": {"type": "enabled"}, "clear_thinking": false} - MiniMax-M2.1: Uses reasoning_split=true for OpenAI-style reasoning separation - Added preserveReasoningContentInMessages() to support re-injection of reasoning content in assistant message history for multi-turn conversations - Added ThinkingSupport to MiniMax-M2.1 model definition	2025-12-27 18:39:15 +01:00
Luis Pater	9d975e0375	feat(models): add support for GLM-4.7 and MiniMax-M2.1	2025-12-24 19:30:57 +08:00
sheauhuu	df777650ac	feat: add gemini-3-flash-preview model definition in GetGeminiModels	2025-12-20 20:05:20 +08:00
hkfires	fa70b220e9	feat(registry): add gpt 5.2 codex model definition	2025-12-19 09:53:03 +08:00
Ben Vargas	a33f5d31fc	feat: use thinkingLevel for Gemini 3 models per Google documentation Per Google's official documentation, Gemini 3 models should use thinkingLevel (string) instead of thinkingBudget (number) for optimal performance. From Google's Gemini Thinking docs: > Use the thinkingLevel parameter with Gemini 3 models. While > thinkingBudget is accepted for backwards compatibility, using > it with Gemini 3 Pro may result in suboptimal performance. Changes: - Add model family detection functions (IsGemini3Model, IsGemini25Model, IsGemini3ProModel, IsGemini3FlashModel) - Add ApplyGeminiThinkingLevel and ApplyGeminiCLIThinkingLevel functions for applying thinkingLevel config - Add ValidateGemini3ThinkingLevel for model-specific level validation - Add ThinkingBudgetToGemini3Level for backward compatibility conversion - Update NormalizeGeminiThinkingBudget to convert budget to level for Gemini 3 models - Update ApplyDefaultThinkingIfNeeded to not set a default level for Gemini 3 (lets API use its dynamic default "high") - Update ConvertThinkingLevelToBudget to preserve thinkingLevel for Gemini 3 models - Add Levels field to all Gemini 3 model definitions: - Gemini 3 Pro: ["low", "high"] - Gemini 3 Flash: ["minimal", "low", "medium", "high"] Backward compatibility: - Gemini 2.5 models continue to use thinkingBudget as before - If thinkingBudget is provided for Gemini 3, it's converted to the appropriate thinkingLevel - Existing configurations continue to work	2025-12-17 15:28:20 -07:00
Luis Pater	f27672f6cf	feat(antigravity): add Gemini 3 Flash Preview model definition with enhanced capabilities	2025-12-18 01:02:19 +08:00
hkfires	b326ec3641	feat(iflow): add thinking support for iFlow models	2025-12-16 18:34:43 +08:00
Luis Pater	5a75ef8ffd	Merge pull request #536 from AoaoMH/feature/auth-model-check feat: using Client Model Infos;	2025-12-15 00:29:33 +08:00
Test	07279f8746	feat: using Client Model Infos;	2025-12-15 00:13:05 +08:00
Luis Pater	71f788b13a	fix(registry): remove unused `ThinkingSupport` from DeepSeek-R1 model	2025-12-14 21:30:17 +08:00
Luis Pater	59c62dc580	fix(registry): correct DeepSeek-V3.2 experimental model ID	2025-12-14 21:27:43 +08:00
Luis Pater	d5310a3300	Merge pull request #531 from AoaoMH/feature/auth-model-check feat: add API endpoint to query models for auth credentials	2025-12-14 16:46:43 +08:00
Luis Pater	f0a3eb574e	fix(registry): update DeepSeek model definitions with new IDs and descriptions	2025-12-14 16:17:11 +08:00
Test	bb15855443	feat: add API endpoint to query models for auth credentials	2025-12-14 15:16:26 +08:00
Ben Vargas	b09e2115d1	fix(models): add "none" reasoning effort level to gpt-5.2 Per OpenAI API documentation, gpt-5.2 supports reasoning_effort values of "none", "low", "medium", "high", and "xhigh". The "none" level was missing from the model definition. Reference: https://platform.openai.com/docs/api-reference/chat/create#chat_create-reasoning_effort	2025-12-11 15:26:23 -07:00
Luis Pater	cd2da152d4	feat(models): add GPT 5.2 model definition and prompts	2025-12-12 03:02:27 +08:00
hkfires	007572b58e	fix(util): do not strip thinking suffix on registered models NormalizeThinkingModel now checks ModelSupportsThinking before removing "-thinking" or "-thinking-<ver>", avoiding accidental parsing of model names where the suffix is part of the official id (e.g., kimi-k2-thinking, qwen3-235b-a22b-thinking-2507). The registry adds ThinkingSupport metadata for several models and propagates it via ModelInfo (e.g., kimi-k2-thinking, deepseek-r1, qwen3-235b-a22b-thinking-2507, minimax-m2), enabling accurate detection of thinking-capable models and correcting base model inference.	2025-12-11 15:52:14 +08:00
hkfires	a03d514095	feat(registry): add thinking metadata for models	2025-12-11 11:28:44 +08:00
Luis Pater	423ce97665	feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic - Added support for parsing and normalizing dynamic thinking model suffixes. - Centralized budget resolution across executors and payload helpers. - Retired legacy Gemini-specific thinking handlers in favor of unified logic. - Updated executors to use metadata-based thinking configuration. - Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata. - Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs. - Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly. - Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models. - Improved handling of thinking configurations and model overrides in executors. - Removed hardcoded thinking model entries and migrated logic to metadata-based resolution. - Updated payload mutations to always include the resolved model.	2025-12-11 03:10:50 +08:00
hkfires	3cfe7008a2	fix(registry): update gpt 5.1 model names	2025-12-09 17:55:21 +08:00
hkfires	e5312fb5a2	feat(antigravity): support canonical names for antigravity models	2025-12-09 16:54:13 +08:00
hkfires	a283545b6b	feat(antigravity): enforce thinking budget limits for Claude models	2025-12-08 20:36:17 +08:00
hkfires	9c09128e00	feat(registry): add explicit thinking support config for antigravity models	2025-12-07 19:12:55 +08:00
Luis Pater	897c40bed8	feat(registry): add DeepSeek-V3.2-Chat model definition Add new DeepSeek-V3.2-Chat model to the registry with standard chat configuration, positioned before the experimental variant for better organization.	2025-12-03 21:34:50 +08:00
Luis Pater	1434bc38e5	refactor(registry): remove Qwen3-Coder from model definitions	2025-12-02 11:34:38 +08:00
hkfires	75e278c7a5	feat(registry): add thinking support to gemini models	2025-11-30 20:56:29 +08:00
Luis Pater	d2e4639b2a	feat(registry): add context length and update max tokens for Claude model configurations - Added `ContextLength` field with a value of 200,000 to all applicable Claude model definitions. - Standardized `MaxCompletionTokens` values across models for consistency and alignment.	2025-11-27 16:13:25 +08:00
nestharus	e73cdf5cff	fix(claude): ensure max_tokens exceeds thinking budget for thinking models Fixes an issue where Claude thinking models would return 400 errors when the thinking.budget_tokens was greater than or equal to max_tokens. Changes: - Add MaxCompletionTokens: 128000 to all Claude thinking model definitions - Add ensureMaxTokensForThinking() function in claude_executor.go that: - Checks if thinking is enabled with a budget_tokens value - Looks up the model's MaxCompletionTokens from the registry - Ensures max_tokens is set to at least the model's MaxCompletionTokens - Falls back to budget_tokens + 4000 buffer if registry lookup fails This ensures Anthropic API constraint (max_tokens > thinking.budget_tokens) is always satisfied when using extended thinking features. Fixes: #339 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-26 22:31:05 -08:00
Luis Pater	36755421fe	Merge pull request #343 from router-for-me/misc style(amp): tidy whitespace in proxy module and tests	2025-11-26 19:03:07 +08:00
hkfires	6c17dbc4da	style(amp): tidy whitespace in proxy module and tests	2025-11-26 18:57:26 +08:00
Luis Pater	ee6429cc75	feat(registry): add Gemini 3 Pro Image Preview model and remove Claude Sonnet 4.5 Thinking - Added new `Gemini 3 Pro Image Preview` model with detailed metadata and configuration. - Removed outdated `Claude Sonnet 4.5 Thinking` model definition for cleanup and relevance.	2025-11-26 18:22:40 +08:00
nestharus	d0e694d4ed	feat(claude): add thinking model variants and beta headers support - Add Claude thinking model definitions (sonnet-4-5-thinking, opus-4-5-thinking variants) - Add Thinking support for antigravity models with -thinking suffix - Add injectThinkingConfig() for automatic thinking budget based on model suffix - Add resolveUpstreamModel() mappings for thinking variants to actual Claude models - Add extractAndRemoveBetas() to convert betas array to anthropic-beta header - Update applyClaudeHeaders() to merge custom betas from request body Closes #324	2025-11-25 03:33:05 -08:00
Ben Vargas	0895533400	fix(registry): correct Claude Opus 4.5 created timestamp Update epoch from 1730419200 (2024-11-01) to 1761955200 (2025-11-01).	2025-11-24 12:27:23 -07:00
Ben Vargas	43f007c234	feat(registry): add Claude Opus 4.5 model definition Add support for claude-opus-4-5-20251101 with 200K context window and 64K max output tokens.	2025-11-24 12:26:39 -07:00
Luis Pater	db81331ae8	refactor(middleware): extract request logging logic and optimize condition checks - Added `shouldLogRequest` helper to simplify path-based request logging logic. - Updated middleware to skip management endpoints for improved security. - Introduced an explicit `nil` logger check for minimal overhead. - Updated dependencies in `go.mod`. feat(auth): add handling for 404 response with retry logic - Introduced support for 404 `not_found` status with a 12-hour backoff period. - Updated `manager.go` to align state and status messages for 404 scenarios. refactor(translator): comment out debug logging in Gemini responses request	2025-11-20 23:20:40 +08:00
Luis Pater	371324c090	feat(registry): expand Gemini model definitions and support Vertex AI	2025-11-20 18:16:26 +08:00
Luis Pater	0586da9c2b	refactor(registry): move Gemini 3 Pro Preview model definition to base set	2025-11-20 10:51:16 +08:00
Ben Vargas	782bba0bc4	feat(registry): enable gemini-3-pro-preview for gemini-cli provider Add gemini-3-pro-preview model to GetGeminiCLIModels() to make it available for OAuth-based Gemini CLI users, matching the model already available in AI Studio provider. Model spec: - ID: gemini-3-pro-preview - Version: 3.0 - Input: 1M tokens - Output: 64K tokens - Thinking: 128-32K tokens (dynamic)	2025-11-19 12:47:39 -07:00
Luis Pater	bf116b68f8	feat(registry): add GPT-5.1 Codex Max model definitions and support - Introduced `gpt-5.1-codex-max` variants to model definitions (`low`, `medium`, `high`, `xhigh`). - Updated executor logic to map effort levels for Codex Max models. - Added `lastCodexMaxPrompt` processing for `gpt-5.1-codex-max` prompts. - Defined instructions for `gpt-5.1-codex-max` in a new file: `codex_instructions/gpt-5.1-codex-max_prompt.md`.	2025-11-20 03:12:22 +08:00
Luis Pater	17016ae6a5	feat(registry): add Gemini 3 Pro Preview model definition	2025-11-18 23:48:21 +08:00
Luis Pater	01b7b60901	feat(registry): add Gemini 3 Pro Preview model definition	2025-11-18 23:46:58 +08:00
Luis Pater	23a7633e6d	fix(registry): update Thinking parameters and replace Gemini-3 Preview with Gemini-2.5 Flash Lite	2025-11-18 11:51:52 +08:00
Luis Pater	772fa69515	Fixed: #254 feat(registry): add Kimi-K2-Thinking model to model definitions	2025-11-14 21:20:54 +08:00
Luis Pater	1ccb01631d	refactor(runtime): centralize reasoning effort logic for GPT models Extract reasoning effort mapping into a reusable function `setReasoningEffortByAlias` to reduce redundancy and improve maintainability. Introduce support for the "gpt-5.1-none" variant in the registry and runtime executor.	2025-11-14 17:24:40 +08:00
Ben Vargas	cfbaed0e90	fix(runtime): remove gpt-5.1 minimal effort variant Stop advertising and mapping the unsupported gpt-5.1-minimal variant in the model registry and Codex executor, and align bare gpt-5.1 requests to use medium reasoning effort like Codex CLI while preserving minimal for gpt-5.	2025-11-13 19:43:52 -07:00
Luis Pater	75b57bc112	Fixed: #246 feat(runtime): add support for GPT-5.1 models and variants Introduce GPT-5.1 model family, including minimal, low, medium, high, Codex, and Codex Mini variants. Update tokenization and reasoning effort handling to accommodate new models in executor and registry.	2025-11-13 17:42:19 +08:00
TUGOhost	92f4278039	feat: add auto model resolution and model creation timestamp tracking - Add 'created' field to model registry for tracking model creation time - Implement GetFirstAvailableModel() to find the first available model by newest creation timestamp - Add ResolveAutoModel() utility function to resolve "auto" model name to actual available model - Update request handler to resolve "auto" model before processing requests - Ensures automatic model selection when "auto" is specified as model name This enables dynamic model selection based on availability and creation time, improving the user experience when no specific model is requested.	2025-11-11 20:30:09 +08:00
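
Note on commit 92f4278039: the "auto" resolution it describes (pick the available model with the newest creation timestamp) can be sketched roughly as below. This is a minimal illustration only, not the repository's code. The `ModelInfo` struct, the lowercase function names, and the choice to return the name unchanged when nothing is available are all assumptions made for the example; the commit message names `GetFirstAvailableModel()` and `ResolveAutoModel()` but does not show their signatures.

```go
package main

import "fmt"

// ModelInfo is a hypothetical, simplified stand-in for a registry entry.
// The real registry type is not shown in the commit log.
type ModelInfo struct {
	ID        string
	Created   int64 // Unix epoch seconds, as in the 'created' field
	Available bool
}

// firstAvailableModel returns the ID of the available model with the
// newest Created timestamp, or false if no model is available.
func firstAvailableModel(models []ModelInfo) (string, bool) {
	best := -1
	for i, m := range models {
		if !m.Available {
			continue
		}
		if best == -1 || m.Created > models[best].Created {
			best = i
		}
	}
	if best == -1 {
		return "", false
	}
	return models[best].ID, true
}

// resolveAutoModel maps the literal name "auto" to a concrete model ID.
// Any other name passes through untouched. Returning the input unchanged
// when nothing is available is an assumption of this sketch.
func resolveAutoModel(name string, models []ModelInfo) string {
	if name != "auto" {
		return name
	}
	if id, ok := firstAvailableModel(models); ok {
		return id
	}
	return name
}

func main() {
	models := []ModelInfo{
		{ID: "older-model", Created: 1730419200, Available: true},
		{ID: "claude-opus-4-5-20251101", Created: 1761955200, Available: true},
		{ID: "newest-but-unavailable", Created: 1762000000, Available: false},
	}
	fmt.Println(resolveAutoModel("auto", models))        // claude-opus-4-5-20251101
	fmt.Println(resolveAutoModel("older-model", models)) // older-model
}
```

The unavailable entry is skipped even though its timestamp is the newest, so "auto" resolves to the newest model that can actually serve the request.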

87 Commits