CLIProxyAPI

Merge pull request #1033 from router-for-me/reasoning

Refactor thinking

Luis Pater · 2026-01-15 20:33:13 +08:00

67f8732683

refactor(thinking): extract antigravity logic into a dedicated provider

hkfires · 2026-01-15 19:08:22 +08:00

4ad6189487

refactor(auth): simplify file handling logic and remove redundant comparison functions

feat(auth): fetch and update Antigravity project ID from metadata during filestore operations

- Added support to retrieve and update `project_id` using the access token if missing in metadata.
- Integrated HTTP client to fetch project ID dynamically.
- Enhanced metadata persistence logic.

Luis Pater · 2026-01-15 13:29:14 +08:00

086eb3df7a

refactor(thinking): improve budget clamping and logging with provider/model context

hkfires · 2026-01-15 13:06:41 +08:00

5a77b7728e

fix(thinking): use LookupModelInfo for model data

hkfires · 2026-01-15 13:06:41 +08:00

ed8b0f25ee

fix(thinking): map reasoning_effort to thinkingConfig

hkfires · 2026-01-15 13:06:40 +08:00

6e4a602c60

fix(thinking): improve model lookup and validation

hkfires · 2026-01-15 13:06:40 +08:00

7f1b2b3f6e

refactor(antigravity): remove hardcoded model aliases

hkfires · 2026-01-15 13:06:39 +08:00

a75fb6af90

fix(executor): properly handle thinking application errors

hkfires · 2026-01-15 13:06:39 +08:00

72f2125668

refactor: improve thinking logic

hkfires · 2026-01-15 13:06:39 +08:00

0b06d637e7

Fixed: #897

refactor(executor): remove `prompt_cache_retention` from request payloads

Luis Pater · 2026-01-12 10:46:47 +08:00

94e979865e

fix(codex): only override instructions in responses for OpenCode UA

hkfires · 2026-01-11 15:19:37 +08:00

70a82d80ac

feat(codex): add OpenCode instructions based on user agent

hkfires · 2026-01-11 13:36:35 +08:00

ac626111ac

feat(executor): add HttpRequest support across executors for better HTTP request handling

Luis Pater · 2026-01-10 16:25:25 +08:00

e8e3bc8616

Merge pull request #947 from pykancha/fix-memory-leak

Resolve memory leaks causing OOM in k8s deployment

Luis Pater · 2026-01-10 00:40:47 +08:00

95f87d5669

Merge pull request #938 from router-for-me/log

refactor(logging): clean up oauth logs and debugs

Luis Pater · 2026-01-10 00:02:45 +08:00

c83365a349

Merge pull request #943 from ben-vargas/fix-tool-mappings

Fix Claude OAuth tool name mapping (proxy_)

Luis Pater · 2026-01-09 23:52:29 +08:00

6b3604cf2b

Fixed: #942

fix(executor): ignore non-SSE lines in OpenAI-compatible streams

Luis Pater · 2026-01-09 23:41:50 +08:00

af6bdca14f

Use unprefixed Claude request for translation

Keep the upstream payload prefixed for OAuth while passing the unprefixed request body into response translators. This avoids proxy_ leaking into OpenAI Responses echoed tool metadata while preserving the Claude OAuth workaround.

Ben Vargas · 2026-01-09 00:54:35 -07:00

e785bfcd12

fix(server): resolve memory leaks causing OOM in k8s deployment

- usage/logger_plugin: cap modelStats.Details at 1000 entries per model
- cache/signature_cache: add background cleanup for expired sessions (10 min)
- management/handler: add background cleanup for stale IP rate-limit entries (1 hr)
- executor/cache_helpers: add mutex protection and TTL cleanup for codexCacheMap (15 min)
- executor/codex_executor: use thread-safe cache accessors

Add reproduction tests demonstrating leak behavior before/after fixes.

Amp-Thread-ID: https://ampcode.com/threads/T-019ba0fc-1d7b-7338-8e1d-ca0520412777
Co-authored-by: Amp <amp@ampcode.com>

hemanta212 · 2026-01-09 13:33:46 +05:45

47dacce6ea

Fix Claude OAuth tool name mapping

Prefix tool names with proxy_ for Claude OAuth requests and strip the prefix from streaming and non-streaming responses to restore client-facing names.

Updates the Claude executor to:
- add prefixing for tools, tool_choice, and tool_use messages when using OAuth tokens
- strip the prefix from tool_use events in SSE and non-streaming payloads
- add focused unit tests for prefix/strip helpers
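
A minimal sketch of the prefix/strip round trip described above. The helper names `addToolPrefix` and `stripToolPrefix` are hypothetical and may not match the executor's actual helpers; only the `proxy_` prefix comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// toolPrefix is the prefix named in the commit message.
const toolPrefix = "proxy_"

// addToolPrefix (hypothetical name) prefixes a client-facing tool name
// before it is sent upstream. Names that already carry the prefix are left
// alone so the operation stays idempotent.
func addToolPrefix(name string) string {
	if name == "" || strings.HasPrefix(name, toolPrefix) {
		return name
	}
	return toolPrefix + name
}

// stripToolPrefix (hypothetical name) restores the client-facing name on
// tool_use data coming back from the upstream.
func stripToolPrefix(name string) string {
	return strings.TrimPrefix(name, toolPrefix)
}

func main() {
	for _, name := range []string{"read_file", "bash", "proxy_search"} {
		upstream := addToolPrefix(name)
		fmt.Printf("%s -> %s -> %s\n", name, upstream, stripToolPrefix(upstream))
	}
}
```

One limitation of this simple version: a client tool whose real name already starts with `proxy_` is not prefixed again, so its name comes back with the prefix removed.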

Ben Vargas · 2026-01-09 00:10:38 -07:00

dcac3407ab

refactor(logging): clean up oauth logs and debugs

hkfires · 2026-01-09 11:20:55 +08:00

ee62ef4745

fix(executor): handle context cancellation and deadline errors explicitly

Luis Pater · 2026-01-09 10:48:29 +08:00

ef6bafbf7e

Merge pull request #787 from sususu98/fix/antigravity-429-retry-delay-parsing

fix(antigravity): parse retry-after delay from 429 response body

Luis Pater · 2026-01-09 04:45:25 +08:00

49b9709ce5

feat(executor): centralize systemInstruction handling for Claude and Gemini-3-Pro models

Luis Pater · 2026-01-08 21:05:33 +08:00

59a448b645

fix(executor): update gemini model identifier to gemini-3-pro-preview

Update the model name check in `buildRequest` to target "gemini-3-pro-preview" instead of "gemini-3-pro" when applying specific system instruction handling.

hkfires · 2026-01-08 19:14:52 +08:00

b6a0f7a07f

feat(executor): update system instruction handling for Claude and Gemini-3-Pro models

Luis Pater · 2026-01-08 12:42:26 +08:00

1b2f907671

feat(executor): add model-specific support for "gemini-3-pro" in execution and payload handling

Luis Pater · 2026-01-08 12:27:03 +08:00

bda04eed8a

feat(executor): enhance Antigravity payload with user role and dynamic system instructions

Luis Pater · 2026-01-08 10:55:25 +08:00

67985d8226

fix(executor): remove unused tokenRefreshTimeout constant and pass zero timeout to HTTP client

Luis Pater · 2026-01-07 18:16:49 +08:00

f4ba1ab910

feat(executor): add token refresh timeout and improve context handling during refresh

Introduced `tokenRefreshTimeout` constant for token refresh operations and enhanced context propagation for `refreshToken` by embedding roundtrip information if available. Adjusted `refreshAuth` to ensure default context initialization and handle cancellation errors appropriately.

Luis Pater · 2026-01-04 00:26:08 +08:00

7a77b23f2d

feat(executor): enhance payload translation with original request context

Refactored `applyPayloadConfig` to `applyPayloadConfigWithRoot`, adding support for default rule validation against the original payload when available. Updated all executors to use `applyPayloadConfigWithRoot` and incorporate an optional original request payload for translations.

Luis Pater · 2026-01-02 00:03:26 +08:00

2a663d5cba

fix(iflow): remove thinking field from request body in thinking config handler

hkfires · 2026-01-01 19:40:28 +08:00

3902fd7501

refactor(iflow): simplify thinking config handling for GLM and MiniMax models

hkfires · 2026-01-01 19:31:08 +08:00

4fc3d5e935

fix(thinking): fallback to upstream model for thinking support when alias not in registry

hkfires · 2025-12-31 18:07:13 +08:00

8bf3305b2b

fix(thinking): use model alias for thinking config resolution in mapped models

hkfires · 2025-12-31 17:09:22 +08:00

89db4e9481

refactor(executor): remove redundant upstream model parameter from translateRequest

hkfires · 2025-12-30 20:20:42 +08:00

26efbed05c

refactor(executor): resolve upstream model at conductor level before execution

hkfires · 2025-12-30 19:31:54 +08:00

96340bf136

fix(executor): use upstream model for thinking config and payload translation

hkfires · 2025-12-30 17:49:44 +08:00

b055e00c1a

fix(antigravity): parse retry-after delay from 429 response body

When receiving HTTP 429 (Too Many Requests) responses, parse the retry
delay from the response body using parseRetryDelay and populate the
statusErr.retryAfter field. This allows upstream callers to respect
the server's requested retry timing.

Applied to all error paths in Execute, executeClaudeNonStream,
ExecuteStream, CountTokens, and refreshToken functions.

sususu · 2025-12-30 16:07:32 +08:00

414db44c00

feat(gemini): add per-key model alias support for Gemini provider

hkfires · 2025-12-30 13:27:57 +08:00

08ab6a7d77

feat(cliproxy): introduce global model name mappings for improved aliasing and routing

Luis Pater · 2025-12-30 08:13:06 +08:00

50e6d845f4

Merge pull request #757 from ben-vargas/fix-thinking-toolchoice-conflict

Fix: disable thinking when tool_choice forces tool use

Luis Pater · 2025-12-28 14:04:30 +08:00

457924828a

Fix: disable thinking when tool_choice forces tool use

The Anthropic API does not allow extended thinking when tool_choice is set
to "any" or a specific tool. This caused 400 errors with features such as
Amp's /handoff command, which forces tool_choice.

Added disableThinkingIfToolChoiceForced(), which removes the thinking config
when an incompatible tool_choice is detected. It is applied to both the
streaming and non-streaming paths.

Fixes router-for-me/CLIProxyAPI#630

Ben Vargas · 2025-12-27 16:31:37 -07:00

aca2ef6359

feat(cliproxy): implement model aliasing and hashing for Codex configurations, enhance request routing logic, and normalize Codex model entries

Luis Pater · 2025-12-28 03:06:51 +08:00

3a436e116a

feat(iflow): add model-specific thinking configs for GLM-4.7 and MiniMax-M2.1

- GLM-4.7: Uses extra_body={"thinking": {"type": "enabled"}, "clear_thinking": false}
- MiniMax-M2.1: Uses reasoning_split=true for OpenAI-style reasoning separation
- Added preserveReasoningContentInMessages() to support re-injection of reasoning
  content in assistant message history for multi-turn conversations
- Added ThinkingSupport to MiniMax-M2.1 model definition

leaph · 2025-12-27 18:39:15 +01:00

6403ff4ec4

Fixed: #747

fix(translators): rename and integrate `usageMetadata` as `cpaUsageMetadata` in Claude processing logic

Luis Pater · 2025-12-27 22:02:11 +08:00

c281f4cbaf

feat(usage): add import/export functionality for usage statistics and enhance deduplication logic

Luis Pater · 2025-12-26 11:49:51 +08:00

3ce0d76aa4

refactor: extract parseGeminiFamilyUsageDetail helper to reduce duplication

NguyenSiTrung · 2025-12-24 10:22:31 +07:00

969c1a5b72

feat: add cached token parsing for Gemini API responses

NguyenSiTrung · 2025-12-24 10:20:11 +07:00

872339bceb

262 Commits