14 Commits

  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
  • realtime: rename provider session ids (#20361)
    ## Summary
    
    Codex is repurposing `session` to mean a thread group, so the realtime
    provider session id should no longer use `session_id` / `sessionId` in
    Codex-facing protocol payloads. This PR renames that provider-specific
    field to `realtime_session_id` / `realtimeSessionId` and intentionally
    breaks clients that still send the old field names.
    
    ## What Changed
    
    - Renamed realtime provider session fields in `ConversationStartParams`,
    `RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`.
    - Renamed app-server v2 realtime request and notification fields to
    `realtimeSessionId`.
    - Removed legacy serde aliases for `session_id` / `sessionId`; clients
    must send the new names.
    - Propagated the rename through core realtime startup, app-server
    adapters, codex-api websocket handling, and TUI realtime state.
    - Regenerated app-server protocol schema/TypeScript outputs and updated
    app-server README examples.
    - Kept upstream Realtime API concepts unchanged: provider `session.id`
    parsing and `x-session-id` headers still use the upstream wire names.
    
    ## Testing
    
    - CI is running on the latest pushed commit.
    - Earlier local verification on this PR:
      - `cargo test -p codex-protocol`
    - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core
    realtime_conversation`
      - `cargo test -p codex-app-server-protocol`
    - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server
    realtime_conversation`
    - attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local
    linker bus error while linking the test binary)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Update realtime handoff transcript handling (#18597)
    ## Summary
    
    This PR aims to improve integration between the realtime model and the
    codex agent by sharing more context with each other. In particular, we
    now share full realtime conversation transcript deltas in addition to
    the delegation message.
    
    realtime_conversation.rs now turns a handoff into:
    ```
    <realtime_delegation>
      <input>...</input>
      <transcript_delta>...</transcript_delta>
    </realtime_delegation>
    ```
    
    ## Implementation notes
    
    The transcript is accumulated in the realtime websocket layer as parsed
    realtime events arrive. When a background-agent handoff is requested,
    the current transcript snapshot is copied onto the handoff event and
    then serialized by `realtime_conversation.rs` into the hidden realtime
    delegation envelope that Codex receives as user-turn context.
    
    For Realtime V2, the session now explicitly enables input audio
    transcription, and the parser handles the relevant input/output
    transcript completion events so the snapshot includes both user speech
    and realtime model responses. The delegation `<input>` remains the
    actual handoff request, while `<transcript_delta>` carries the
    surrounding conversation history for context.
    
    Reviewers should note that the transcript payload is intended for Codex
    context sharing, not UI rendering. The realtime delegation envelope
    should stay hidden from the user-facing transcript surface, while still
    being included in the background-agent turn so Codex can answer with the
    same conversational context the realtime model had.
  • Add realtime output modality and transcript events (#17701)
    - Add outputModality to thread/realtime/start and wire text/audio output
    selection through app-server, core, API, and TUI.\n- Rename the realtime
    transcript delta notification and add a separate transcript done
    notification that forwards final text from item done without correlating
    it with deltas.
  • Rename Realtime V2 tool to background_agent (#17278)
    Rename the Realtime V2 delegation tool and parser constant to
    background_agent, and update the tool description and fixtures to match.
    
    Validation: just fmt; cargo check -p codex-api; git diff --check
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add realtime voice selection (#17176)
    - Add realtime voice selection for realtime/start.
    - Expose the supported v1/v2 voice lists and cover explicit, configured,
    default, and invalid voice paths.
  • Attach WebRTC realtime starts to sideband websocket (#17057)
    Summary:
    - parse the realtime call Location header and join that call over the
    direct realtime WebSocket
    - keep WebRTC starts alive on the existing realtime conversation path
    
    Validation:
    - just fmt
    - git diff --check
    - cargo check -p codex-api
    - cargo check -p codex-core --tests
    - local cargo tests not run; relying on PR CI
  • [codex] reduce module visibility (#16978)
    ## Summary
    - reduce public module visibility across Rust crates, preferring private
    or crate-private modules with explicit crate-root public exports
    - update external call sites and tests to use the intended public crate
    APIs instead of reaching through module trees
    - add the module visibility guideline to AGENTS.md
    
    ## Validation
    - `cargo check --workspace --all-targets --message-format=short` passed
    before the final fix/format pass
    - `just fix` completed successfully
    - `just fmt` completed successfully
    - `git diff --check` passed
  • [stack 2/4] Align main realtime v2 wire and runtime flow (#14830)
    ## Stack Position
    2/4. Built on top of #14828.
    
    ## Base
    - #14828
    
    ## Unblocks
    - #14829
    - #14827
    
    ## Scope
    - Port the realtime v2 wire parsing, session, app-server, and
    conversation runtime behavior onto the split websocket-method base.
    - Branch runtime behavior directly on the current realtime session kind
    instead of parser-derived flow flags.
    - Keep regression coverage in the existing e2e suites.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add realtime transcription mode for websocket sessions (#14556)
    - add experimental_realtime_ws_mode (conversational/transcription) and
    plumb it into realtime conversation session config
    - switch realtime websocket intent and session.update payload shape
    based on mode
    - update config schema and realtime/config tests
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add realtime v2 event parser behind feature flag (#14537)
    - Add a feature-flagged realtime v2 parser on the existing
    websocket/session pipeline.
    - Wire parser selection from core feature flags and map the codex
    handoff tool-call path into existing handoff events.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Update realtime websocket API (#13265)
    - migrate the realtime websocket transport to the new session and
    handoff flow
    - make the realtime model configurable in config.toml and use API-key
    auth for the websocket
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Wire realtime api to core (#12268)
    - Introduce `RealtimeConversationManager` for realtime API management 
    - Add `op::conversation` to start conversation, insert audio, insert
    text, and close conversation.
    - emit conversation lifecycle and realtime events.
    - Move shared realtime payload types into codex-protocol and add core
    e2e websocket tests for start/replace/transport-close paths.
    
    Things to consider:
    - Should we use the same `op::` and `Events` channel to carry audio? I
    think we should try this simple approach and later we can create
    separate one if the channels got congested.
    - Sending text updates to the client: we can start simple and later
    restrict that.
    - Provider auth isn't wired for now intentionally
  • codex-api: realtime websocket session.create + typed inbound events (#12036)
    ## Summary
    - add realtime websocket client transport in codex-api
    - send session.create on connect with backend prompt and optional
    conversation_id
    - keep session.update for prompt changes after connect
    - switch inbound event parsing to a tagged enum (typed variants instead
    of optional field bag)
    - add a websocket e2e integration test in
    codex-rs/codex-api/tests/realtime_websocket_e2e.rs
    
    ## Why
    This moves the realtime transport to an explicit session-create
    handshake and improves protocol safety with typed inbound events.
    
    ## Testing
    - Added e2e integration test coverage for session create + event flow in
    the API crate.