mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
495da45643
## Why

Responses HTTP requests were converted from `ResponsesApiRequest` into a full `serde_json::Value`. `EndpointSession` then deep-cloned that value for each retry, and the transport serialized and compressed it again before every send.

Large histories make those copies expensive. Retry attempts should reuse the same immutable request bytes.

## What

- Serialize standard Responses requests directly into a ref-counted `EncodedJsonBody`.
- Preserve the Azure path that attaches item IDs before encoding.
- Prepare JSON, compression, and derived content headers once before the retry loop.
- Clone the prepared request per attempt so body clones only bump the `Bytes` reference count.
- Keep auth inside the retry loop. Signing auth sees the exact final headers and body bytes that the transport sends.
- Preserve request-body TRACE output. With TRACE plus compression, retain the original JSON bytes for logging; normal requests keep only the final wire bytes.
- Leave non-Responses endpoint bodies on the existing `Value` path.

## Performance

A temporary release-mode measurement used a 10 MiB JSON body and 10 retry preparations:

- old `Value` clone + serialize path: 30 ms total
- prepared shared-byte path: less than 1 ms total

That is about 3 ms avoided per retry for this payload on the test machine. Each retry also stops allocating another request-sized JSON tree and serialized buffer. Without TRACE, compressed requests retain only the final compressed wire bytes.

## Validation

- `just test -p codex-client`: 28 passed
- `just test -p codex-api`: 125 passed
- `just fix -p codex-client`
- `just fix -p codex-api`
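The "prepare once, clone per attempt" idea from the What list can be sketched as follows. This is a minimal illustration, not the code from this change: `PreparedRequest`, `prepare`, and `send_with_retries` are hypothetical names, `Arc<[u8]>` stands in for `Bytes` so the sketch needs only the standard library, and the bearer token is a placeholder for whatever auth the real transport applies.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a prepared request: headers plus immutable,
/// ref-counted body bytes. `Arc<[u8]>` plays the role of `Bytes` here.
#[derive(Clone, Debug)]
struct PreparedRequest {
    headers: Vec<(String, String)>,
    body: Arc<[u8]>,
}

/// Encode once, before the retry loop. The JSON arrives already serialized
/// so the sketch stays free of external crates.
fn prepare(json: &str) -> PreparedRequest {
    let body: Arc<[u8]> = Arc::from(json.as_bytes());
    let headers = vec![
        ("content-type".to_string(), "application/json".to_string()),
        ("content-length".to_string(), body.len().to_string()),
    ];
    PreparedRequest { headers, body }
}

/// Call `send` up to `max_attempts` times and return the attempt number that
/// succeeded. Each attempt clones the prepared request: the small header list
/// is copied, while the body clone only bumps the reference count.
fn send_with_retries<F>(
    prepared: &PreparedRequest,
    max_attempts: usize,
    mut send: F,
) -> Result<usize, String>
where
    F: FnMut(&PreparedRequest) -> Result<(), String>,
{
    let mut last_err = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        let mut request = prepared.clone();
        // Auth stays inside the loop so it sees the final headers and body.
        // The token value is a placeholder for illustration only.
        request.headers.push((
            "authorization".to_string(),
            format!("Bearer placeholder-{}", attempt),
        ));
        match send(&request) {
            Ok(()) => return Ok(attempt),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}
```

Because every attempt points at the same allocation, the per-retry cost no longer grows with the size of the request body; only the header list is copied.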
2026-06-15 19:11:26 +02:00
# codex-api

Typed clients for Codex/OpenAI APIs built on top of the generic transport in `codex-client`.

- Hosts the request/response models and request builders for Responses and Compact APIs.
- Owns provider configuration (base URLs, headers, query params), auth header injection, retry tuning, and stream idle settings.
- Parses SSE streams into `ResponseEvent`/`ResponseStream`, including rate-limit snapshots and API-specific error mapping.
- Serves as the wire-level layer consumed by `codex-core`; higher layers handle auth refresh and business logic.
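To make the SSE parsing bullet concrete, here is a minimal sketch of server-sent event framing using only the standard library. It is not the crate's parser: `SseEvent` and `parse_sse` are hypothetical names, the input is assumed to be a complete text payload rather than an incremental byte stream, and only the `event` and `data` fields are handled.

```rust
/// One parsed server-sent event: an optional `event:` name plus the joined
/// `data:` lines. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
struct SseEvent {
    event: Option<String>,
    data: String,
}

/// Split a complete SSE payload into events.
/// - A blank line ends the current event.
/// - Lines starting with ':' are comments (often used as keep-alives).
/// - Multiple `data:` lines in one event are joined with '\n'.
/// - A trailing event with no terminating blank line is dropped.
fn parse_sse(input: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event_name: Option<String> = None;
    let mut data_lines: Vec<String> = Vec::new();

    for line in input.lines() {
        if line.is_empty() {
            // Dispatch only if the event carried at least one data line.
            if !data_lines.is_empty() {
                events.push(SseEvent {
                    event: event_name.take(),
                    data: data_lines.join("\n"),
                });
            }
            event_name = None;
            data_lines.clear();
            continue;
        }
        if line.starts_with(':') {
            continue;
        }
        // Split "field: value"; a line without ':' is a field with no value.
        let (field, value) = match line.find(':') {
            Some(idx) => (&line[..idx], &line[idx + 1..]),
            None => (line, ""),
        };
        // A single leading space after the colon is not part of the value.
        let value = value.strip_prefix(' ').unwrap_or(value);
        match field {
            "event" => event_name = Some(value.to_string()),
            "data" => data_lines.push(value.to_string()),
            _ => {} // `id`, `retry`, and unknown fields are ignored here
        }
    }
    events
}
```

A typed layer would then decode each event's `data` as JSON and map it to a variant of an event enum; that step is omitted because it depends on the crate's own models.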
## Core interface
The public interface of this crate is intentionally small and uniform:
- Responses endpoint
  - Input:
    - `ResponsesApiRequest` for the request body (`model`, `instructions`, `input`, `tools`, `parallel_tool_calls`, reasoning/text controls).
    - `ResponsesOptions` for transport/header concerns (`conversation_id`, `session_source`, `extra_headers`, `compression`, `turn_state`).
  - Output: a `ResponseStream` of `ResponseEvent` (both re-exported from `common`).
- Compaction endpoint
  - Input: `CompactionInput<'a>` (re-exported as `codex_api::CompactionInput`):
    - `model: &str`.
    - `input: &[ResponseItem]` – history to compact.
    - `instructions: &str` – fully-resolved compaction instructions.
  - Output: `Vec<ResponseItem>`.
  - `CompactClient::compact_input(&CompactionInput, extra_headers)` wraps the JSON encoding and retry/telemetry wiring.
- Memory summarize endpoint
  - Input: `MemorySummarizeInput` (re-exported as `codex_api::MemorySummarizeInput`):
    - `model: String`.
    - `raw_memories: Vec<RawMemory>` (serialized as `traces` for wire compatibility).
      - `RawMemory` includes `id`, `metadata.source_path`, and normalized `items`.
    - `reasoning: Option<Reasoning>`.
  - Output: `Vec<MemorySummarizeOutput>`.
  - `MemoriesClient::summarize_input(&MemorySummarizeInput, extra_headers)` wraps JSON encoding and retry/telemetry wiring.
All HTTP details (URLs, headers, retry/backoff policies, SSE framing) are encapsulated in `codex-api` and `codex-client`. Callers construct prompts/inputs using protocol types and work with typed streams of `ResponseEvent` or compacted `ResponseItem` values.
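The caller-side shape described above (borrowed, typed input in; owned, typed output out) can be illustrated with stand-in types. This is a sketch under stated assumptions: the `ResponseItem` enum below is a simplified stand-in for the crate's type, `CompactionInput` only mirrors the three fields listed in this README, and `compact_input_stub` is a hypothetical local function that fakes the server so no HTTP is involved.

```rust
/// Simplified stand-in for the crate's `ResponseItem`.
#[derive(Clone, Debug, PartialEq)]
enum ResponseItem {
    Message { role: String, text: String },
}

/// Mirrors the borrowed fields listed for `CompactionInput<'a>`: the caller
/// keeps ownership of the history and lends it for the duration of the call.
struct CompactionInput<'a> {
    model: &'a str,
    input: &'a [ResponseItem],
    instructions: &'a str,
}

/// Hypothetical stub in place of a real client call. A real client would
/// encode the input as JSON, send it with retries, and decode the reply.
/// This stub just returns one summary item describing what it received.
fn compact_input_stub(request: &CompactionInput) -> Vec<ResponseItem> {
    let summary = format!(
        "[{}] {} ({} items compacted)",
        request.model,
        request.instructions,
        request.input.len()
    );
    vec![ResponseItem::Message {
        role: "assistant".to_string(),
        text: summary,
    }]
}
```

Because the input borrows the history slice, building a request does not clone the conversation; the caller still owns it after the call returns and can decide whether to replace it with the compacted items.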