Commit Graph

326 Commits

Author SHA1 Message Date
Luis Pater
fff866424e Merge pull request #1628 from thebtf/fix/masquerading-headers
fix: update Claude masquerading headers and configurable defaults
2026-02-19 04:19:59 +08:00
Luis Pater
252f7e0751 Merge pull request #1625 from thebtf/feat/tool-prefix-config
feat: add per-auth tool_prefix_disabled option
2026-02-19 04:07:22 +08:00
Luis Pater
b2b17528cb Merge branch 'pr-1624' into dev
# Conflicts:
#	internal/runtime/executor/claude_executor.go
#	internal/runtime/executor/claude_executor_test.go
2026-02-19 04:05:04 +08:00
Luis Pater
76294f0c59 Merge pull request #1608 from thebtf/fix/tool-reference-proxy-prefix-mainline
fix: add proxy_ prefix handling for tool_reference content blocks
2026-02-19 03:50:34 +08:00
Luis Pater
e5b5dc870f chore(executor): remove unused Openai-Beta header from Codex executor 2026-02-19 02:19:48 +08:00
Luis Pater
bb86a0c0c4 feat(logging, executor): add request logging tests and WebSocket-based Codex executor
- Introduced unit tests for request logging middleware to enhance coverage.
- Added WebSocket-based Codex executor to support Responses API upgrade.
- Updated middleware logic to selectively capture request bodies for memory efficiency.
- Enhanced Codex configuration handling with new WebSocket attributes.
2026-02-19 01:57:02 +08:00
Kirill Turanskiy
73dc0b10b8 fix: update Claude masquerading headers and make them configurable
Update hardcoded X-Stainless-* and User-Agent defaults to match
Claude Code 2.1.44 / @anthropic-ai/sdk 0.74.0 (verified via
diagnostic proxy capture 2026-02-17).

Changes:
- X-Stainless-Os/Arch: dynamic via runtime.GOOS/GOARCH
- X-Stainless-Package-Version: 0.55.1 → 0.74.0
- X-Stainless-Timeout: 60 → 600
- User-Agent: claude-cli/1.0.83 (external, cli) → claude-cli/2.1.44 (external, sdk-cli)

Add claude-header-defaults config section so values can be updated
without recompilation when Claude Code releases new versions.
2026-02-18 03:38:51 +03:00
Kirill Turanskiy
9261b0c20b feat: add per-auth tool_prefix_disabled option
Allow disabling the proxy_ tool name prefix on a per-account basis.
Users who route their own Anthropic account through CPA can set
"tool_prefix_disabled": true in their OAuth auth JSON to send tool
names unchanged to Anthropic.

Default behavior is fully preserved — prefix is applied unless
explicitly disabled.

Changes:
- Add ToolPrefixDisabled() accessor to Auth (reads metadata key
  "tool_prefix_disabled" or "tool-prefix-disabled")
- Gate all 6 prefix apply/strip points with the new flag
- Add unit tests for the accessor
2026-02-17 21:48:19 +03:00
Kirill Turanskiy
7cc725496e fix: skip proxy_ prefix for built-in tools in message history
The proxy_ prefix logic correctly skips built-in tools (those with a
non-empty "type" field) in tools[] definitions but does not skip them
in messages[].content[] tool_use blocks or tool_choice. This causes
web_search in conversation history to become proxy_web_search, which
Anthropic does not recognize.

Fix: collect built-in tool names from tools[] into a set and also
maintain a hardcoded fallback set (web_search, code_execution,
text_editor, computer) for cases where the built-in tool appears in
history but not in the current request's tools[] array. Skip prefixing
in messages and tool_choice when name matches a built-in.
2026-02-17 21:42:32 +03:00
Kirill Turanskiy
24c18614f0 fix: skip built-in tools in tool_reference prefix + refactor to switch
- Collect built-in tool names (those with a "type" field like
  web_search, code_execution) and skip prefixing tool_reference
  blocks that reference them, preventing name mismatch.
- Refactor if-else if chains to switch statements in all three
  prefix functions for idiomatic Go style.
2026-02-16 19:37:11 +03:00
Kirill Turanskiy
603f06a762 fix: handle tool_reference nested inside tool_result.content[]
tool_reference blocks can appear nested inside tool_result.content[]
arrays, not just at the top level of messages[].content[]. The prefix
logic now iterates into tool_result blocks with array content to find
and prefix/strip nested tool_reference.tool_name fields.
2026-02-16 19:06:24 +03:00
Kirill Turanskiy
98f0a3e3bd fix: add proxy_ prefix handling for tool_reference content blocks (#1)
applyClaudeToolPrefix, stripClaudeToolPrefixFromResponse, and
stripClaudeToolPrefixFromStreamLine now handle "tool_reference" blocks
(field "tool_name") in addition to "tool_use" blocks (field "name").

Without this fix, tool_reference blocks in conversation history retain
their original unprefixed names while tool definitions carry the proxy_
prefix, causing Anthropic API 400 errors: "Tool reference 'X' not found
in available tools."

Co-authored-by: Kirill Turanskiy <kt@novamedia.ru>
2026-02-16 19:06:24 +03:00
Luis Pater
453aaf8774 chore(runtime): update Qwen executor user agent and headers for compatibility with new runtime standards 2026-02-16 23:29:47 +08:00
Luis Pater
46a6782065 refactor(all): replace manual pointer assignments with new to enhance code readability and maintainability 2026-02-15 14:10:10 +08:00
Luis Pater
ae1e8a5191 chore(runtime, registry): update Codex client version and GPT-5.3 model creation date 2026-02-13 12:47:48 +08:00
Nathan
166d2d24d9 fix(schema): remove Gemini-incompatible tool metadata fields
Sanitize tool schemas by stripping prefill, enumTitles, $id, and patternProperties to prevent Gemini INVALID_ARGUMENT 400 errors, and add unit and executor-level tests to lock in the behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 18:29:17 +11:00
hkfires
49c1740b47 feat(executor): add session ID and HMAC-SHA256 signature generation for iFlow API requests 2026-02-09 19:29:42 +08:00
Luis Pater
7e9d0db6aa Merge pull request #1467 from dusty-du/fix/kimi-toolcall-reasoning-content
Fix Kimi tool-call payload normalization for reasoning_content
2026-02-07 09:35:04 +08:00
Luis Pater
78ef04fcf1 fix(kimi): reduce redundant payload cloning and simplify translation calls 2026-02-07 08:51:48 +08:00
Luis Pater
f7d0019df7 fix(kimi): update base URL and integrate ClaudeExecutor fallback
- Updated `KimiAPIBaseURL` to remove versioning from the root path.
- Integrated `ClaudeExecutor` fallback in `KimiExecutor` methods for compatibility with Claude requests.
- Simplified token counting by delegating to `ClaudeExecutor`.
2026-02-07 06:42:08 +08:00
test
52364af5bf Fix Kimi tool-call reasoning_content normalization 2026-02-06 14:46:16 -05:00
Luis Pater
68cb81a258 feat: add Kimi authentication support and streamline device ID handling
- Introduced `RequestKimiToken` API for Kimi authentication flow.
- Integrated device ID management throughout Kimi-related components.
- Enhanced header management for Kimi API requests with device ID context.
2026-02-06 20:43:30 +08:00
test
f5f26f0cbe Add Kimi (Moonshot AI) provider support
- OAuth2 device authorization grant flow (RFC 8628) for authentication
- Streaming and non-streaming chat completions via OpenAI-compatible API
- Models: kimi-k2, kimi-k2-thinking, kimi-k2.5
- CLI `--kimi-login` command for device flow auth
- Token management with automatic refresh
- Thinking/reasoning effort support for thinking-enabled models

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 19:24:46 -05:00
Luis Pater
b4e034be1c refactor(executor): centralize Codex client version and user agent constants
- Introduced `codexClientVersion` and `codexUserAgent` constants for better maintainability.
- Updated `EnsureHeader` calls to use the new constants.
2026-02-06 05:30:28 +08:00
Luis Pater
a5a25dec57 refactor(translator, executor): remove redundant bytes.Clone calls for improved performance
- Replaced all instances of `bytes.Clone` with direct references to enhance efficiency.
- Simplified payload handling across executors and translators by eliminating unnecessary data duplication.
2026-02-06 03:26:29 +08:00
Luis Pater
09ecfbcaed refactor(executor): optimize payload cloning and streamline SDK translator usage
- Replaced unnecessary `bytes.Clone` calls for `opts.OriginalRequest` throughout executors.
- Introduced intermediate variable `originalPayloadSource` to simplify payload processing.
- Ensured better clarity and structure in request translation logic.
2026-02-06 01:44:20 +08:00
Luis Pater
25c6b479c7 refactor(util, executor): optimize payload handling and schema processing
- Replaced repetitive string operations with a centralized `escapeGJSONPathKey` function.
- Streamlined handling of JSON schema cleaning for Gemini and Antigravity requests.
- Improved payload management by transitioning from byte slices to strings for processing.
- Removed unnecessary cloning of byte slices in several places.
2026-02-05 19:00:30 +08:00
Luis Pater
250f212fa3 fix(executor): handle "global" location in AI platform URL generation 2026-02-03 01:39:57 +08:00
hkfires
ac802a4646 refactor(codex): remove codex instructions injection support 2026-02-01 14:33:31 +08:00
Luis Pater
6d8609e457 feat(config): add payload filter rules to remove JSON paths
Introduce `Filter` rules in the payload configuration to remove specified JSON paths from the payload. Update related helper functions and add examples to `config.example.yaml`.
2026-02-01 05:29:41 +08:00
Luis Pater
d216adeffc Fixed: #1372 #1366
fix(caching): ensure unique cache_control injection using count validation
2026-01-31 23:48:50 +08:00
Luis Pater
f887f9985d Merge pull request #1248 from shekohex/feat/responses-compact
feat(openai): add responses/compact support
2026-01-31 03:12:55 +08:00
Luis Pater
7ff3936efe fix(caching): ensure prompt-caching beta is always appended and add multi-turn cache control tests 2026-01-31 01:42:58 +08:00
Martin Schneeweiss
3a43ecb19b feat(caching): implement Claude prompt caching with multi-turn support
- Add ensureCacheControl() to auto-inject cache breakpoints
- Cache tools (last tool), system (last element), and messages (2nd-to-last user turn)
- Add prompt-caching-2024-07-31 beta header
- Return original payload on sjson error to prevent corruption
- Include verification test for caching logic

Enables up to 90% cost reduction on cached tokens.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 22:59:33 +01:00
Shady Khalifa
04b2290927 fix(codex): avoid empty prompt_cache_key 2026-01-27 19:06:42 +02:00
Shady Khalifa
53920b0399 fix(openai): drop stream for responses/compact 2026-01-27 18:27:34 +02:00
Shady Khalifa
95096bc3fc feat(openai): add responses/compact support 2026-01-26 16:36:01 +02:00
Luis Pater
70897247b2 feat(auth): add support for request_retry and disable_cooling overrides
Implement `request_retry` and `disable_cooling` metadata overrides for authentication management. Update retry and cooling logic accordingly across `Manager`, Antigravity executor, and file synthesizer. Add tests to validate new behaviors.
2026-01-26 21:59:08 +08:00
Luis Pater
2af4a8dc12 refactor(runtime): implement retry logic for Antigravity executor with improved error handling and capacity management 2026-01-26 06:22:46 +08:00
hkfires
f30ffd5f5e feat(executor): add request_id to error logs
Extract error.message from JSON error responses when summarizing error bodies for debug logs
2026-01-25 21:31:46 +08:00
Luis Pater
2e6a2b655c Merge pull request #1132 from XYenon/fix/gemini-models-displayname-override
fix(gemini): preserve displayName and description in models list
2026-01-25 03:40:04 +08:00
Mauricio Allende
f16461bfe7 fix(claude): skip built-in tools in OAuth tool prefix 2026-01-23 21:29:39 +00:00
hkfires
ecc850bfb7 feat(executor): apply payload rules using requested model 2026-01-23 16:38:41 +08:00
hkfires
7ca045d8b9 fix(executor): adjust model-specific request payload 2026-01-22 20:28:08 +08:00
hkfires
abfca6aab2 refactor(util): reorder gemini schema cleaner helpers 2026-01-22 18:38:48 +08:00
sowar1987
a2f8f59192 Fix Gemini function-calling INVALID_ARGUMENT by relaxing Gemini tool validation and cleaning schema 2026-01-22 17:11:07 +08:00
XYenon
8c7c446f33 fix(gemini): preserve displayName and description in models list
Previously GeminiModels handler unconditionally overwrote displayName
and description with the model name, losing the original values defined
in model definitions (e.g., 'Gemini 3 Pro Preview').

Now only set these fields as fallback when they are missing or empty.
2026-01-22 15:19:27 +08:00
hkfires
6dcbbf64c3 fix(executor): only strip maxOutputTokens for non-claude models 2026-01-21 10:49:20 +08:00
bexcodex
9f364441e8 Fix antigravity malformed_function_call 2026-01-20 19:54:54 +08:00
Luis Pater
7831cba9f6 refactor(claude): remove redundant system instructions check in Claude executor 2026-01-20 11:02:52 +08:00