codex/codex-rs/codex-api
guinness-oai 22dd6ebc7d Forward standalone assistant output to realtime (#27319)
## Why

When a realtime session is open without an active frontend-model
handoff, completed Codex assistant messages are currently dropped. That
prevents the frontend model from hearing orchestrator preambles and
final responses produced by typed turns or other non-handoff work, which
makes the two models present as disconnected personas.

Active handoffs already forward each completed assistant message,
including preambles. This change leaves those V1 and V2 paths intact and
fills only the no-active-handoff gap.

## What changed

- Send standalone V1 assistant messages through
`conversation.handoff.append` with a stable synthetic handoff ID
- Send standalone V2 assistant messages as normal `[BACKEND]`
`conversation.item.create` message items, then enqueue `response.create`
so the frontend model responds
- Preserve the existing active V1 and V2 transport and completion
behavior
- Continue excluding user messages from realtime mirroring
- Skip empty output and cap each complete context injection, including
its V2 prefix, at 1,000 tokens
- Add end-to-end coverage for both wire formats, V2 response creation,
preambles, final responses, and truncation
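
The skip-and-cap rule in the list above can be sketched as follows. This is an illustration only, not the code from this change: `prepare_injection` is a hypothetical helper, and whitespace-separated words stand in for tokens, which a real tokenizer would count differently. The sketch also normalizes whitespace, which the actual implementation may not do.

```rust
/// Hypothetical helper (not part of this change). Returns `None` for empty
/// output; otherwise prepends `prefix` and trims the body so that prefix
/// plus body stay within `max_tokens`. Words are a stand-in for tokens.
fn prepare_injection(prefix: &str, text: &str, max_tokens: usize) -> Option<String> {
    let body: Vec<&str> = text.split_whitespace().collect();
    if body.is_empty() {
        return None; // skip empty or whitespace-only output
    }
    let mut parts: Vec<&str> = prefix.split_whitespace().collect();
    // The prefix counts against the cap, so the body only gets what is left.
    let budget = max_tokens.saturating_sub(parts.len());
    let kept = &body[..body.len().min(budget)];
    parts.extend_from_slice(kept);
    Some(parts.join(" "))
}

fn main() {
    // A V2-style message with the `[BACKEND]` prefix and a tiny cap.
    println!("{:?}", prepare_injection("[BACKEND]", "one two three", 3));
    // Empty output is skipped entirely.
    println!("{:?}", prepare_injection("[BACKEND]", "   ", 1000));
}
```

Counting the prefix against the same budget as the body is what keeps the complete injection, rather than just the assistant text, under the cap.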

## Test plan

- CI
22dd6ebc7d · 2026-06-10 21:32:29 +00:00

codex-api

Typed clients for Codex/OpenAI APIs built on top of the generic transport in codex-client.

  • Hosts the request/response models and request builders for Responses and Compact APIs.
  • Owns provider configuration (base URLs, headers, query params), auth header injection, retry tuning, and stream idle settings.
  • Parses SSE streams into ResponseEvent/ResponseStream, including rate-limit snapshots and API-specific error mapping.
  • Serves as the wire-level layer consumed by codex-core; higher layers handle auth refresh and business logic.
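
As a rough illustration of the SSE parsing mentioned above, the framing step could look like the sketch below. `SseFrame` and `parse_sse` are hypothetical names and not the crate's parser; they only show how `event:` and `data:` lines group into frames, which the real code would then map to ResponseEvent values. The event name in `main` is just sample input.

```rust
/// Hypothetical stand-in for one decoded SSE frame; not a codex-api type.
#[derive(Debug, PartialEq)]
struct SseFrame {
    event: Option<String>,
    data: String,
}

/// Splits a complete SSE body into frames. A blank line ends a frame,
/// multiple `data:` lines are joined with '\n', and a frame without any
/// data is dropped, following the usual SSE dispatch rules.
fn parse_sse(body: &str) -> Vec<SseFrame> {
    let mut frames = Vec::new();
    let mut event: Option<String> = None;
    let mut data: Vec<String> = Vec::new();

    for line in body.lines() {
        if line.is_empty() {
            // Blank line: dispatch the buffered frame if it carries data.
            if !data.is_empty() {
                frames.push(SseFrame { event: event.take(), data: data.join("\n") });
                data.clear();
            }
            event = None;
            continue;
        }
        if line.starts_with(':') {
            continue; // comment line, often used as a keep-alive
        }
        // "field: value"; a single leading space in the value is stripped.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event = Some(value.to_string()),
            "data" => data.push(value.to_string()),
            _ => {} // `id`, `retry`, and unknown fields are ignored here
        }
    }
    frames
}

fn main() {
    let body = "event: response.created\ndata: {\"id\":\"r1\"}\n\n: keep-alive\n\ndata: a\ndata: b\n\n";
    for frame in parse_sse(body) {
        println!("{:?}", frame);
    }
}
```

The sketch works on a fully buffered body for brevity; a streaming parser would apply the same rules incrementally and also enforce the stream idle settings described above.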

Core interface

The public interface of this crate is intentionally small and uniform:

  • Responses endpoint

    • Input:
      • ResponsesApiRequest for the request body (model, instructions, input, tools, parallel_tool_calls, reasoning/text controls).
      • ResponsesOptions for transport/header concerns (conversation_id, session_source, extra_headers, compression, turn_state).
    • Output: a ResponseStream of ResponseEvent (both re-exported from common).
  • Compaction endpoint

    • Input: CompactionInput<'a> (re-exported as codex_api::CompactionInput):
      • model: &str.
      • input: &[ResponseItem], the history to compact.
      • instructions: &str, the fully resolved compaction instructions.
    • Output: Vec<ResponseItem>.
    • CompactClient::compact_input(&CompactionInput, extra_headers) wraps the JSON encoding and retry/telemetry wiring.
  • Memory summarize endpoint

    • Input: MemorySummarizeInput (re-exported as codex_api::MemorySummarizeInput):
      • model: String.
      • raw_memories: Vec<RawMemory> (serialized as traces for wire compatibility).
        • RawMemory includes id, metadata.source_path, and normalized items.
      • reasoning: Option<Reasoning>.
    • Output: Vec<MemorySummarizeOutput>.
    • MemoriesClient::summarize_input(&MemorySummarizeInput, extra_headers) wraps JSON encoding and retry/telemetry wiring.

All HTTP details (URLs, headers, retry/backoff policies, SSE framing) are encapsulated in codex-api and codex-client. Callers construct prompts/inputs using protocol types and work with typed streams of ResponseEvent or compacted ResponseItem values.
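
The borrowed-input, owned-output shape of the compaction endpoint can be sketched with stand-in types. Everything below is hypothetical: `ResponseItem` is reduced to a single invented field, `compact_locally` is a toy replacement for the real HTTP-backed call, and the model name is a placeholder. Only the three `CompactionInput` field names and their types follow the description above.

```rust
// Stand-in types that mirror the field shapes listed above. They are NOT the
// real codex-api definitions and carry only what this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct ResponseItem {
    text: String, // hypothetical field; the real type is richer
}

struct CompactionInput<'a> {
    model: &'a str,
    input: &'a [ResponseItem],
    instructions: &'a str,
}

/// Hypothetical stand-in for a compaction call: borrowed input in, owned
/// items out. A real client would send a request to the API; this toy
/// version just keeps the last `keep` items of the history.
fn compact_locally(req: &CompactionInput<'_>, keep: usize) -> Result<Vec<ResponseItem>, String> {
    if req.model.is_empty() || req.instructions.is_empty() {
        return Err("model and instructions must be non-empty".to_string());
    }
    let start = req.input.len().saturating_sub(keep);
    Ok(req.input[start..].to_vec())
}

fn main() {
    let history: Vec<ResponseItem> = ["first", "second", "third"]
        .iter()
        .map(|t| ResponseItem { text: t.to_string() })
        .collect();
    let req = CompactionInput {
        model: "example-model", // placeholder model name
        input: &history,
        instructions: "Summarize the conversation so far.",
    };
    match compact_locally(&req, 2) {
        Ok(items) => println!("kept {} of {} items", items.len(), history.len()),
        Err(e) => eprintln!("compaction failed: {e}"),
    }
}
```

Because the input only borrows the history, a caller can build a request from existing conversation state without cloning it, and receives a fresh `Vec<ResponseItem>` to swap in afterwards.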