Non-spark codex models (gpt-5.3-codex, gpt-5.2-codex) stream function call
arguments via multiple delta events followed by a done event. The done handler
unconditionally emitted the full arguments, duplicating what deltas already
streamed. This produced invalid double JSON that Claude Code couldn't parse,
causing tool calls to fail with missing parameters and infinite retry loops.
Add HasReceivedArgumentsDelta flag to track whether delta events were received.
The done handler now only emits arguments when no deltas preceded it (spark
models), while delta-based streaming continues to work for non-spark models.
Some Codex models (e.g. gpt-5.3-codex-spark) send function call arguments
in a single "done" event without preceding "delta" events. The streaming
translator only handled "delta" events, causing tool call arguments to be
lost — resulting in empty tool inputs and infinite retry loops in clients
like Claude Code.
Emit the full arguments from the "done" event as a single input_json_delta
so downstream clients receive the complete tool input.
- Standardized the handling of `stop_reason` and `finish_reason` across Codex and Gemini responses.
- Restricted pass-through of specific reasons (`max_tokens`, `stop`) for consistency.
- Enhanced fallback logic for undefined reasons.
Optimized the handling of JSON serialization and deserialization by replacing redundant `json.Marshal` and `json.Unmarshal` calls with `sjson` and `gjson`. Introduced a `marshalJSONValue` utility for compact JSON encoding, improving performance and code simplicity. Removed unused `encoding/json` imports.
feat(translator): add token counting functionality for Gemini, Claude, and CLI
- Introduced `TokenCount` handling across various Codex translators (Gemini, Claude, CLI) with respective implementations.
- Added utility methods for token counting and formatting responses.
- Integrated `tiktoken-go/tokenizer` library for tokenization.
- Updated CodexExecutor with token counting logic to support multiple models including GPT-5 variants.
- Refined go.mod and go.sum to include new dependencies.
feat(runtime): add token counting functionality across executors
- Implemented token counting in OpenAICompatExecutor, QwenExecutor, and IFlowExecutor.
- Added utilities for token counting and response formatting using `tiktoken-go/tokenizer`.
- Integrated token counting into translators for Gemini, Claude, and Gemini CLI.
- Enhanced multiple model support, including GPT-5 variants, for token counting.
docs: update environment variable instructions for multi-model support
- Added details for setting `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_SONNET_MODEL`, and `ANTHROPIC_DEFAULT_HAIKU_MODEL` for version 2.x.x.
- Clarified usage of `ANTHROPIC_MODEL` and `ANTHROPIC_SMALL_FAST_MODEL` for version 1.x.x.
- Expanded examples for setting environment variables across different models including Gemini, GPT-5, Claude, and Qwen3.
- Added new "Gemini 2.5 Flash Image Preview" model definition, with enhanced image generation capabilities.
- Increased scanner buffer size to 20,971,520 bytes across executors and translators to handle larger payloads.
- Added `ConvertCodexResponseToClaudeNonStream`, `ConvertGeminiCLIResponseToClaudeNonStream`, `ConvertGeminiResponseToClaudeNonStream`, and `ConvertOpenAIResponseToClaudeNonStream` methods for handling non-streaming JSON response conversion.
- Introduced logic for parsing and structuring content, handling reasoning, text, and tool usage blocks.
- Enhanced support for stop reasons and refined token usage data aggregation.
- Added `examples/custom-provider/main.go` showcasing custom executor and translator integration using the SDK.
- Removed redundant debug logs from translator modules to enhance code cleanliness.
- Updated SDK documentation with new usage and advanced examples.
- Expanded the management API with new endpoints, including request logging and GPT-5 Codex features.
- Introduced reverse mapping logic for tool names in translators to restore original names when shortened.
- Enhanced error handling by logging API response errors consistently across handlers.
- Refactored request and response loggers to include API error details, improving debugging capabilities.
- Integrated robust tool name shortening and uniqueness mechanisms for OpenAI, Gemini, and Claude requests.
- Improved handler retry logic to properly capture and respond to errors.