codex/codex-rs/codex-api
jif 495da45643 reuse encoded Responses request bodies (#28327)
## Why

Responses HTTP requests were converted from `ResponsesApiRequest` into a
full `serde_json::Value`. `EndpointSession` then deep-cloned that value
for each retry, and the transport serialized and compressed it again
before every send.

Large histories make those copies expensive. Retry attempts should reuse
the same immutable request bytes.

## What

- Serialize standard Responses requests directly into a ref-counted
`EncodedJsonBody`.
- Preserve the Azure path that attaches item IDs before encoding.
- Prepare JSON, compression, and derived content headers once before the
retry loop.
- Clone the prepared request per attempt so body clones only bump the
`Bytes` reference count.
- Keep auth inside the retry loop. Signing auth sees the exact final
headers and body bytes that the transport sends.
- Preserve request-body TRACE output. With TRACE plus compression,
retain the original JSON bytes for logging; normal requests keep only
the final wire bytes.
- Leave non-Responses endpoint bodies on the existing `Value` path.
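The flow in the list above (encode once, clone per attempt, apply auth inside the loop) can be sketched roughly as follows. This is a minimal illustration and not the crate's code: `PreparedRequest`, `prepare`, and `send_with_retries` are hypothetical names, `Arc<[u8]>` stands in for the ref-counted `Bytes` body, and compression and backoff are left out.

```rust
use std::sync::Arc;

/// Hypothetical prepared request: headers plus immutable, reference-counted
/// body bytes. `Arc<[u8]>` is a std-only stand-in for `Bytes`.
#[derive(Clone)]
pub struct PreparedRequest {
    pub headers: Vec<(String, String)>,
    pub body: Arc<[u8]>,
}

/// Runs once, before the retry loop. The JSON arrives already serialized;
/// a fuller version would serialize and optionally compress here.
pub fn prepare(json: &str) -> PreparedRequest {
    let body: Arc<[u8]> = Arc::from(json.as_bytes());
    let headers = vec![
        ("content-type".to_string(), "application/json".to_string()),
        ("content-length".to_string(), body.len().to_string()),
    ];
    PreparedRequest { headers, body }
}

/// Each attempt clones the prepared request and applies auth to the clone,
/// so auth sees the same headers and body bytes that `send` receives.
/// Returns the 1-based attempt number that succeeded.
pub fn send_with_retries<A, S>(
    prepared: &PreparedRequest,
    max_attempts: usize,
    mut auth: A,
    mut send: S,
) -> Result<usize, String>
where
    A: FnMut(&mut PreparedRequest),
    S: FnMut(&PreparedRequest) -> Result<(), String>,
{
    let mut last_err = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        // Cloning copies the small header list; the body clone only bumps
        // the reference count and shares the same allocation.
        let mut req = prepared.clone();
        auth(&mut req);
        match send(&req) {
            Ok(()) => return Ok(attempt),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

Because auth mutates only the per-attempt clone, the prepared request stays unchanged between attempts, and every attempt's body points at the same bytes.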

## Performance

A temporary release-mode measurement used a 10 MiB JSON body and 10
retry preparations:

- old `Value` clone + serialize path: 30 ms total
- prepared shared-byte path: less than 1 ms total

That is about 3 ms avoided per retry for this payload on the test
machine. Each retry also stops allocating another request-sized JSON
tree and serialized buffer. Without TRACE, compressed requests retain
only the final compressed wire bytes.

## Validation

- `just test -p codex-client`: 28 passed
- `just test -p codex-api`: 125 passed
- `just fix -p codex-client`
- `just fix -p codex-api`
495da45643 · 2026-06-15 19:11:26 +02:00

codex-api

Typed clients for Codex/OpenAI APIs built on top of the generic transport in codex-client.

  • Hosts the request/response models and request builders for Responses and Compact APIs.
  • Owns provider configuration (base URLs, headers, query params), auth header injection, retry tuning, and stream idle settings.
  • Parses SSE streams into ResponseEvent/ResponseStream, including rate-limit snapshots and API-specific error mapping.
  • Serves as the wire-level layer consumed by codex-core; higher layers handle auth refresh and business logic.
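As a rough illustration of the SSE framing step mentioned above, the sketch below splits a text buffer into events using the standard `event:` and `data:` fields. It is an assumption-laden example, not the crate's parser: `SseEvent` and `parse_sse` are hypothetical names, the input is assumed to be fully buffered rather than streamed, and mapping the payload into `ResponseEvent` is not shown.

```rust
/// One decoded server-sent event (hypothetical type, not `ResponseEvent`).
#[derive(Debug, PartialEq, Eq)]
pub struct SseEvent {
    pub event: Option<String>,
    pub data: String,
}

/// Parses a complete SSE text buffer into events.
/// A blank line dispatches the buffered event; lines starting with ':' are
/// comments; multiple `data:` lines are joined with '\n'. An event that is
/// not terminated by a blank line is dropped.
pub fn parse_sse(input: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event_name: Option<String> = None;
    let mut data_lines: Vec<&str> = Vec::new();

    for line in input.lines() {
        if line.is_empty() {
            // End of one event: emit it only if at least one data line was seen.
            if !data_lines.is_empty() {
                events.push(SseEvent {
                    event: event_name.take(),
                    data: data_lines.join("\n"),
                });
                data_lines.clear();
            }
            event_name = None;
            continue;
        }
        if line.starts_with(':') {
            continue; // comment / keep-alive line
        }
        // Split "field: value"; a single leading space in the value is stripped.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event_name = Some(value.to_string()),
            "data" => data_lines.push(value),
            _ => {} // other fields (id, retry) are ignored in this sketch
        }
    }
    events
}
```

A caller would typically deserialize each event's `data` as JSON and map it to a typed event; error mapping and rate-limit snapshots are separate concerns that this sketch does not attempt.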

Core interface

The public interface of this crate is intentionally small and uniform:

  • Responses endpoint

    • Input:
      • ResponsesApiRequest for the request body (model, instructions, input, tools, parallel_tool_calls, reasoning/text controls).
      • ResponsesOptions for transport/header concerns (conversation_id, session_source, extra_headers, compression, turn_state).
    • Output: a ResponseStream of ResponseEvent (both re-exported from common).
  • Compaction endpoint

    • Input: CompactionInput<'a> (re-exported as codex_api::CompactionInput):
      • model: &str, the model name.
      • input: &[ResponseItem], the history to compact.
      • instructions: &str, the fully resolved compaction instructions.
    • Output: Vec<ResponseItem>.
    • CompactClient::compact_input(&CompactionInput, extra_headers) wraps the JSON encoding and retry/telemetry wiring.
  • Memory summarize endpoint

    • Input: MemorySummarizeInput (re-exported as codex_api::MemorySummarizeInput):
      • model: String.
      • raw_memories: Vec<RawMemory> (serialized as traces for wire compatibility).
        • RawMemory includes id, metadata.source_path, and normalized items.
      • reasoning: Option<Reasoning>.
    • Output: Vec<MemorySummarizeOutput>.
    • MemoriesClient::summarize_input(&MemorySummarizeInput, extra_headers) wraps JSON encoding and retry/telemetry wiring.

All HTTP details (URLs, headers, retry/backoff policies, SSE framing) are encapsulated in codex-api and codex-client. Callers construct prompts/inputs using protocol types and work with typed streams of ResponseEvent or compacted ResponseItem values.
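To make the borrowed shape of the compaction input concrete, here is a minimal sketch that mirrors the three fields listed for CompactionInput<'a>. The `ResponseItem` below is a placeholder with a single hypothetical `text` field (the real protocol type is richer), and `build_compaction_input` is an illustrative helper, not part of the crate.

```rust
/// Placeholder for the protocol's history item type (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct ResponseItem {
    pub text: String,
}

/// Borrowed view with the three fields described for the compaction endpoint.
/// Nothing is owned, so building it does not copy the history.
#[derive(Debug, Clone, Copy)]
pub struct CompactionInput<'a> {
    pub model: &'a str,
    pub input: &'a [ResponseItem],
    pub instructions: &'a str,
}

/// Illustrative helper: assembles the input from data the caller already owns.
pub fn build_compaction_input<'a>(
    model: &'a str,
    history: &'a [ResponseItem],
    instructions: &'a str,
) -> CompactionInput<'a> {
    CompactionInput {
        model,
        input: history,
        instructions,
    }
}
```

The lifetime parameter ties the input to the caller's history, so a long conversation can be handed to the compaction client without cloning it first.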