CLIProxyAPI

mirror of https://github.com/router-for-me/CLIProxyAPI.git synced 2026-02-18 04:10:51 +08:00

Author	SHA1	Message	Date
hkfires	ac802a4646	refactor(codex): remove codex instructions injection support	2026-02-01 14:33:31 +08:00
Shady Khalifa	04b2290927	fix(codex): avoid empty prompt_cache_key	2026-01-27 19:06:42 +02:00
Shady Khalifa	53920b0399	fix(openai): drop stream for responses/compact	2026-01-27 18:27:34 +02:00
Shady Khalifa	95096bc3fc	feat(openai): add responses/compact support	2026-01-26 16:36:01 +02:00
hkfires	f30ffd5f5e	feat(executor): add request_id to error logs Extract error.message from JSON error responses when summarizing error bodies for debug logs	2026-01-25 21:31:46 +08:00
hkfires	ecc850bfb7	feat(executor): apply payload rules using requested model	2026-01-23 16:38:41 +08:00
hkfires	e641fde25c	feat(registry): support provider-specific model info lookup	2026-01-20 10:01:17 +08:00
hkfires	c7e8830a56	refactor(thinking): pass source and target formats to ApplyThinking for cross-format validation Update ApplyThinking signature to accept fromFormat and toFormat parameters instead of a single provider string. This enables: - Proper level-to-budget conversion when source is level-based (openai/codex) and target is budget-based (gemini/claude) - Strict budget range validation when source and target formats match - Level clamping to nearest supported level for cross-format requests - Format alias resolution in SDK translator registry for codex/openai-response Also adds ErrBudgetOutOfRange error code and improves iflow config extraction to fall back to openai format when iflow-specific config is not present.	2026-01-18 10:30:15 +08:00
Luis Pater	6600d58ba2	feat(codex): enhance input transformation and remove unused `safety_identifier` field - Added logic to transform `inputResults` into structured JSON for improved processing. - Removed redundant `safety_identifier` field in executor payload to streamline requests.	2026-01-16 19:59:01 +08:00
hkfires	902bea24b4	fix(codex): ensure instructions field exists	2026-01-16 15:38:10 +08:00
hkfires	72f2125668	fix(executor): properly handle thinking application errors	2026-01-15 13:06:39 +08:00
hkfires	0b06d637e7	refactor: improve thinking logic	2026-01-15 13:06:39 +08:00
Luis Pater	94e979865e	Fixed: #897 refactor(executor): remove `prompt_cache_retention` from request payloads	2026-01-12 10:46:47 +08:00
hkfires	70a82d80ac	fix(codex): only override instructions in responses for OpenCode UA	2026-01-11 15:19:37 +08:00
hkfires	ac626111ac	feat(codex): add OpenCode instructions based on user agent	2026-01-11 13:36:35 +08:00
Luis Pater	e8e3bc8616	feat(executor): add HttpRequest support across executors for better http request handling	2026-01-10 16:25:25 +08:00
hemanta212	47dacce6ea	fix(server): resolve memory leaks causing OOM in k8s deployment - usage/logger_plugin: cap modelStats.Details at 1000 entries per model - cache/signature_cache: add background cleanup for expired sessions (10 min) - management/handler: add background cleanup for stale IP rate-limit entries (1 hr) - executor/cache_helpers: add mutex protection and TTL cleanup for codexCacheMap (15 min) - executor/codex_executor: use thread-safe cache accessors Add reproduction tests demonstrating leak behavior before/after fixes. Amp-Thread-ID: https://ampcode.com/threads/T-019ba0fc-1d7b-7338-8e1d-ca0520412777 Co-authored-by: Amp <amp@ampcode.com>	2026-01-09 13:33:46 +05:45
Luis Pater	2a663d5cba	feat(executor): enhance payload translation with original request context Refactored `applyPayloadConfig` to `applyPayloadConfigWithRoot`, adding support for default rule validation against the original payload when available. Updated all executors to use `applyPayloadConfigWithRoot` and incorporate an optional original request payload for translations.	2026-01-02 00:03:26 +08:00
hkfires	96340bf136	refactor(executor): resolve upstream model at conductor level before execution	2025-12-30 19:31:54 +08:00
hkfires	b055e00c1a	fix(executor): use upstream model for thinking config and payload translation	2025-12-30 17:49:44 +08:00
Luis Pater	3a436e116a	feat(cliproxy): implement model aliasing and hashing for Codex configurations, enhance request routing logic, and normalize Codex model entries	2025-12-28 03:06:51 +08:00
hkfires	367a05bdf6	refactor(thinking): export thinking helpers Expose thinking/effort normalization helpers from the executor package so conversion tests use production code and stay aligned with runtime validation behavior.	2025-12-15 09:16:15 +08:00
Luis Pater	660aabc437	fix(executor): add `allowCompat` support for reasoning effort normalization Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.	2025-12-13 04:06:02 +08:00
Luis Pater	a74ee3f319	Merge pull request #481 from sususu98/fix/increase-buffer-size fix: increase buffer size for stream scanners to 50MB across multiple executors	2025-12-11 21:20:54 +08:00
hkfires	3a81ab22fd	fix(runtime): unify reasoning effort metadata overrides	2025-12-11 14:35:05 +08:00
hkfires	519da2e042	fix(runtime): validate reasoning effort levels	2025-12-11 12:36:54 +08:00
hkfires	3ffd120ae9	feat(runtime): add thinking config normalization	2025-12-11 11:51:33 +08:00
Luis Pater	423ce97665	feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic - Added support for parsing and normalizing dynamic thinking model suffixes. - Centralized budget resolution across executors and payload helpers. - Retired legacy Gemini-specific thinking handlers in favor of unified logic. - Updated executors to use metadata-based thinking configuration. - Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata. - Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs. - Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly. - Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models. - Improved handling of thinking configurations and model overrides in executors. - Removed hardcoded thinking model entries and migrated logic to metadata-based resolution. - Updated payload mutations to always include the resolved model.	2025-12-11 03:10:50 +08:00
sususu	76c563d161	fix(executor): increase buffer size for stream scanners to 50MB across multiple executors	2025-12-10 23:20:04 +08:00
Luis Pater	bf116b68f8	feat(registry): add GPT-5.1 Codex Max model definitions and support - Introduced `gpt-5.1-codex-max` variants to model definitions (`low`, `medium`, `high`, `xhigh`). - Updated executor logic to map effort levels for Codex Max models. - Added `lastCodexMaxPrompt` processing for `gpt-5.1-codex-max` prompts. - Defined instructions for `gpt-5.1-codex-max` in a new file: `codex_instructions/gpt-5.1-codex-max_prompt.md`.	2025-11-20 03:12:22 +08:00
Luis Pater	db2d22c978	fix(runtime): simplify scanner buffer allocation in executor implementations	2025-11-18 10:59:49 +08:00
Luis Pater	1ccb01631d	refactor(runtime): centralize reasoning effort logic for GPT models Extract reasoning effort mapping into a reusable function `setReasoningEffortByAlias` to reduce redundancy and improve maintainability. Introduce support for the "gpt-5.1-none" variant in the registry and runtime executor.	2025-11-14 17:24:40 +08:00
Ben Vargas	cfbaed0e90	fix(runtime): remove gpt-5.1 minimal effort variant Stop advertising and mapping the unsupported gpt-5.1-minimal variant in the model registry and Codex executor, and align bare gpt-5.1 requests to use medium reasoning effort like Codex CLI while preserving minimal for gpt-5.	2025-11-13 19:43:52 -07:00
Luis Pater	cf9b9be7ea	feat(runtime): extend executor support for GPT-5.1 Codex and variants Expand executor logic to handle GPT-5.1 Codex family and its variants, including reasoning effort configurations for minimal, low, medium, and high levels. Ensure proper mapping of models to payload parameters.	2025-11-14 08:08:25 +08:00
Luis Pater	fcd98f4f9b	feat(runtime): add payload configuration support for executors Introduce `PayloadConfig` in the configuration to define default and override rules for modifying payload parameters. Implement `applyPayloadConfig` and `applyPayloadConfigWithRoot` to apply these rules across various executors, ensuring consistent parameter handling for different models and protocols. Update all relevant executors to utilize this functionality.	2025-11-13 23:27:40 +08:00
Luis Pater	75b57bc112	Fixed: #246 feat(runtime): add support for GPT-5.1 models and variants Introduce GPT-5.1 model family, including minimal, low, medium, high, Codex, and Codex Mini variants. Update tokenization and reasoning effort handling to accommodate new models in executor and registry.	2025-11-13 17:42:19 +08:00
hkfires	cfb9cb8951	feat(config): support HTTP headers across providers	2025-11-08 20:52:05 +08:00
Luis Pater	67ad26c35a	fix(executor): remove default reasoning effort for `gpt-5-codex-mini`	2025-11-08 11:56:32 +08:00
Luis Pater	30d448e73c	fix(executor): update model name from `codex-mini-latest` to `gpt-5-codex-mini`	2025-11-08 11:17:40 +08:00
jeffnash	ec354f7a1a	add default medium reasoning case for gpt-5-codex-mini Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-07 17:12:10 -08:00
jeffnash	240e782606	add default medium reasoning case for gpt-5-codex-mini Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-07 17:11:40 -08:00
Jeff Nash	fcb0293c0d	feat(registry): add GPT-5 Codex Mini model variants Adds three new Codex Mini model variants (mini, mini-medium, mini-high) that map to codex-mini-latest. Codex Mini supports medium and high reasoning effort levels only (no low/minimal). Base model defaults to medium reasoning effort.	2025-11-07 17:07:39 -08:00
Luis Pater	c8f20a66a8	fix(executor): add logging and prompt cache key handling for OpenAI responses	2025-11-07 22:40:45 +08:00
hkfires	a517290726	refactor(executor): summarize API error bodies of html in debug logs	2025-10-31 06:58:38 +08:00
Luis Pater	9d42e4b239	feat(runtime): add User-Agent headers to codex and claude executors - Standardized User-Agent strings for Codex and Claude executors to improve request tracing and compatibility. - Updated header insertion logic in both executors for consistency.	2025-10-29 12:57:37 +08:00
Luis Pater	a552a45b81	Fixed: #140 #133 #80 feat(translator): add token counting functionality for Gemini, Claude, and CLI - Introduced `TokenCount` handling across various Codex translators (Gemini, Claude, CLI) with respective implementations. - Added utility methods for token counting and formatting responses. - Integrated `tiktoken-go/tokenizer` library for tokenization. - Updated CodexExecutor with token counting logic to support multiple models including GPT-5 variants. - Refined go.mod and go.sum to include new dependencies. feat(runtime): add token counting functionality across executors - Implemented token counting in OpenAICompatExecutor, QwenExecutor, and IFlowExecutor. - Added utilities for token counting and response formatting using `tiktoken-go/tokenizer`. - Integrated token counting into translators for Gemini, Claude, and Gemini CLI. - Enhanced multiple model support, including GPT-5 variants, for token counting. docs: update environment variable instructions for multi-model support - Added details for setting `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_SONNET_MODEL`, and `ANTHROPIC_DEFAULT_HAIKU_MODEL` for version 2.x.x. - Clarified usage of `ANTHROPIC_MODEL` and `ANTHROPIC_SMALL_FAST_MODEL` for version 1.x.x. - Expanded examples for setting environment variables across different models including Gemini, GPT-5, Claude, and Qwen3.	2025-10-26 05:39:15 +08:00
Luis Pater	e6d7677373	docs: add GPT-5 Codex guidelines for internal usage - Added comprehensive instructions for Codex CLI harness, sandboxing, approvals, and editing constraints to `internal/misc/codex_instructions/`. - Clarified `approval_policy` configurations and scenarios requiring escalated permissions. - Provided detailed style and structure guidelines for presenting results in the Codex CLI.	2025-10-23 09:14:56 +08:00
Luis Pater	20985d1a10	Refactor executor error handling and usage reporting - Updated the Execute methods in various executors (GeminiCLIExecutor, GeminiExecutor, IFlowExecutor, OpenAICompatExecutor, QwenExecutor) to return a response and error as named return values for improved clarity. - Enhanced error handling by deferring failure tracking in usage reporters, ensuring that failures are reported correctly. - Improved response body handling by ensuring proper closure and error logging for HTTP responses across all executors. - Added failure tracking and reporting in the usage reporter to capture unsuccessful requests. - Updated the usage logging structure to include a 'Failed' field for better tracking of request outcomes. - Adjusted the logic in the RequestStatistics and Record methods to accommodate the new failure tracking mechanism.	2025-10-21 11:22:24 +08:00
Luis Pater	eadccb229f	Fixed: #148 feat(executor): add initial cache_helpers.go file	2025-10-20 10:17:29 +08:00
Luis Pater	3dd0844b98	Enhance logging for API requests and responses across executors - Added detailed logging of upstream request metadata including URL, method, headers, and body for Codex, Gemini, IFlow, OpenAI Compat, and Qwen executors. - Implemented error logging for API response failures to capture errors during HTTP requests. - Introduced structured logging for authentication details (AuthID, AuthLabel, AuthType, AuthValue) to improve traceability. - Updated response logging to include status codes and headers for better debugging. - Ensured that all executors consistently log API interactions to facilitate monitoring and troubleshooting.	2025-10-17 04:12:38 +08:00

69 Commits