Commit Graph

189 Commits

  • [codex] add roles to realtime append text (#27936)
    ## Summary
    
    Add an explicit `user` or `developer` role to
    `thread/realtime/appendText` and propagate it through the realtime input
    queue into `conversation.item.create`. Older JSON clients that omit the
    field continue to default to `user`.
    
    This lets app-provided context such as memory retain developer authority
    without bypassing app-server through a renderer-owned data channel. The
    app-server schemas, API documentation, and focused protocol and
    websocket coverage are updated with the new contract.
    
    The Codex Apps consumer is tracked in
    [openai/openai#1025261](https://github.com/openai/openai/pull/1025261).
  • realtime: add AVAS architecture override (#27720)
    ## Summary
    
    Adds a `RealtimeConversationArchitecture` option for realtime
    conversation startup, with `realtimeapi` as the default and `avas` as an
    opt-in architecture.
    
    The AVAS path is limited to realtime v1 conversational WebRTC starts,
    and WebRTC call creation appends `intent=quicksilver&architecture=avas`
    to `/v1/realtime/calls`. The existing sideband websocket still joins by
    `call_id`.
    
    This also exposes the per-session architecture override through
    app-server v2 `thread/realtime/start` params and updates the config
    schema for `[realtime].architecture`.
    
    ## Validation
    
    - `just fmt`
    - `just write-config-schema`
    - `just test -p codex-api sends_avas_session_call_query_params`
    - `just test -p codex-core -E
    'test(~conversation_webrtc_start_uses_avas_architecture_query)'`
    - `just test -p codex-core -E 'test(realtime_loads_from_config_toml)'`
    - `just test -p codex-app-server-protocol -E
    'test(~serialize_thread_realtime_start) |
    test(generated_ts_optional_nullable_fields_only_in_params)'`
    - `just test -p codex-app-server -E
    'test(realtime_webrtc_start_emits_sdp_notification)'`
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • [codex] Add comp_hash to model metadata (#27532)
    ## Summary
    - add optional `comp_hash` metadata to `ModelInfo`
    - update `ModelInfo` fixtures for the shared schema change
    - keep older model responses compatible by defaulting the field to
    `None`
    
    ## Why
    The models endpoint needs an opaque identifier for compaction-compatible
    model configurations. This PR only exposes that value in model metadata;
    it does not add it to turn context or change runtime behavior.
    
    Follow-up #27520 carries the value through turn context and rollouts,
    then uses it to trigger compaction.
    
    ## Stack
    - based directly on `main`
    - replaces #27519, which was accidentally merged into the wrong base
    branch
    - functionality follow-up: #27520
    
    ## Testing
    - `just test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `just fix -p codex-core -p codex-protocol -p codex-analytics -p
    codex-models-manager`
  • Forward standalone assistant output to realtime (#27319)
    ## Why
    
    When a realtime session is open without an active frontend-model
    handoff, completed Codex assistant messages are currently dropped. That
    prevents the frontend model from hearing orchestrator preambles and
    final responses produced by typed turns or other non-handoff work, which
    makes the two models present as disconnected personas.
    
    Active handoffs already forward each completed assistant message,
    including preambles. This change leaves those V1 and V2 paths intact and
    fills only the no-active-handoff gap.
    
    ## What changed
    
    - Send standalone V1 assistant messages through
    `conversation.handoff.append` with a stable synthetic handoff ID
    - Send standalone V2 assistant messages as normal `[BACKEND]`
    `conversation.item.create` message items, then enqueue `response.create`
    so the frontend model responds
    - Preserve the existing active V1 and V2 transport and completion
    behavior
    - Continue excluding user messages from realtime mirroring
    - Skip empty output and cap each complete context injection, including
    its V2 prefix, at 1,000 tokens
    - Add end-to-end coverage for both wire formats, V2 response creation,
    preambles, final responses, and truncation
    
    ## Test plan
    
    - CI
  • [codex] Enable standalone web search in code mode (#26719)
    ## What
    
    - Consume plaintext `output` from standalone search while retaining
    optional `encrypted_output` parsing.
    - Expose `web.run` to code mode and return search output to nested
    JavaScript calls.
    - Cover direct and code-mode standalone search paths with integration
    tests.
    
    ## Why
    
    `/v1/alpha/search` now returns plaintext output, which code mode needs
    to consume standalone search results.
    
    ## Test plan
    
    - `just test -p codex-api`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-core code_mode_can_call_standalone_web_search`
    - `just test -p codex-app-server
    standalone_web_search_round_trips_output`
  • [codex] Forward turn moderation metadata through app-server (#25710)
    ## Why
    First-party backends can supply turn-scoped moderation metadata that
    app-server clients need for client-side presentation. Exposing this as
    an experimental typed notification lets opted-in clients consume it
    without interpreting raw Responses API events.
    
    ## What changed
    - forward `response.metadata.openai_chatgpt_moderation_metadata` from
    Responses API SSE and WebSocket streams as turn-scoped moderation
    metadata
    - emit the experimental app-server v2 `turn/moderationMetadata`
    notification with `{ threadId, turnId, metadata }`
    - add app-server integration coverage for the typed moderation metadata
    notification
    
    ## Testing
    - `just test -p codex-core
    build_ws_client_metadata_includes_window_lineage_and_turn_metadata`
    - `just test -p codex-core` (fails locally: 46 failures and 1 timeout,
    primarily missing `test_stdio_server` and shell snapshot timeouts)
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    turn_moderation_metadata_emits_typed_notification_v2`
    - `just test -p codex-app-server` (fails locally: 792 passed, 10 failed,
    and 5 timed out; failures are in existing environment-sensitive tests,
    primarily because nested macOS `sandbox-exec` is not permitted)
    - `just write-app-server-schema --experimental --schema-root
    /tmp/codex-app-server-schema-experimental`
  • [codex] Add use_responses_lite 'override' logic (#26487)
    ## Summary
    
    - add a defaulted `ModelInfo.use_responses_lite` catalog field
    - support serializing `reasoning.context` while preserving the existing
    effort and summary path
    - has not been turned on for any models yet
    
    I've added an override to parallel tools if responses_lite is on. I've
    also forced persistent reasoning when using responses_lite. It would be
    ideal if we could centralize all the responses_lite plumbing, but I
    think this is best for now to keep the plumbing & diffs small.
    
    ## Testing
    
    - `cargo test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-core
    responses_lite_sets_all_turns_context_and_disables_parallel_tool_calls`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-core
    configured_reasoning_summary_is_sent`
    - `cargo check -p codex-core --tests`
    - `RUST_MIN_STACK=8388608 cargo clippy -p codex-core --tests` (passes
    with pre-existing warnings in `codex-code-mode` and
    `codex-core-plugins`)
  • Remove response.processed websocket request (#26447)
    ## Why
    
    The Responses websocket client no longer needs to send a follow-up
    `response.processed` request after a turn response has already been
    recorded. Keeping that extra acknowledgement path adds feature-gated
    control flow and a second websocket request shape that no longer carries
    useful behavior.
    
    ## What Changed
    
    - Removed the `response.processed` websocket request type and sender.
    - Removed the `responses_websocket_response_processed` feature flag and
    schema entry.
    - Removed turn and remote-compaction plumbing that only tracked response
    IDs to send the acknowledgement.
    - Removed tests that existed solely to cover the deleted feature path.
    
    ## Validation
    
    - `just fix -p codex-core -p codex-api -p codex-features`
  • Add multi-agent runtime metadata types (#25720)
    Stack split from #25708. Original PR intentionally left open. This first
    PR adds the multi-agent runtime metadata types and catalog plumbing used
    by the rest of the stack.
  • feat: show enterprise monthly credit limits in status (#24812)
    ## Summary
    
    Enterprise users can have an effective monthly credit limit, but Codex
    `/status` currently drops that metadata from the account-usage response.
    
    This change adds the optional `spend_control.individual_limit`
    projection to the existing rate-limit snapshot flow. The backend client
    reads the monthly limit, app-server exposes it as `individualLimit`, and
    the TUI renders a `Monthly credit limit` row through the existing
    progress-bar renderer.
    
    When the backend does not return an effective monthly limit, existing
    rate-limit behavior is unchanged.
    
    ## Existing backend state
    
    The account-usage backend already returns the effective monthly limit
    and current usage together:
    
    ```json
    {
      "spend_control": {
        "reached": false,
        "individual_limit": {
          "limit": "25000",
          "used": "8000",
          "remaining": "17000",
          "used_percent": 32,
          "remaining_percent": 68,
          "reset_after_seconds": 86400,
          "reset_at": 1778137680
        }
      }
    }
    ```
    
    Before this change, Codex projected rolling `primary` and `secondary`
    windows plus `credits`. It ignored `spend_control.individual_limit`, so
    app-server clients and `/status` could not render the monthly cap.
    
    The updated flow is:
    
    ```text
    account usage backend
      -> backend-client reads spend_control.individual_limit
      -> existing rate-limit snapshot carries optional individual_limit
      -> app-server exposes optional individualLimit
      -> TUI renders Monthly credit limit
    ```
    
    ## App-server contract
    
    `account/rateLimits/read` and sparse `account/rateLimits/updated`
    notifications now include an additive nullable
    `rateLimits.individualLimit` field:
    
    ```json
    {
      "individualLimit": {
        "limit": "25000",
        "used": "8000",
        "remainingPercent": 68,
        "resetsAt": 1778137680
      }
    }
    ```
    
    In an `account/rateLimits/read` response, `null` means no monthly limit
    is available. `account/rateLimits/updated` remains a sparse rolling
    notification: clients merge available values into their most recent
    `account/rateLimits/read` snapshot or refetch. Nullable account metadata
    in a rolling notification does not clear a previously observed value.
    
    ## Design decisions
    
    - Extend the existing rate-limit snapshot instead of introducing a
    separate request or wire-level update protocol.
    - Keep the Codex projection narrow: `/status` needs the effective limit,
    current usage, remaining percentage, and reset timestamp.
    - Render the monthly row through the existing progress-bar renderer,
    with one optional detail line for `8,000 of 25,000 credits used`.
    - Keep the backend response optional so existing accounts and older
    usage states preserve their current behavior.
    - Preserve cached monthly metadata when sparse rolling notifications
    omit it. Live account-usage reads remain authoritative and can clear a
    removed limit.
    
    ## Visual evidence
    
    ```text
     Monthly credit limit:   [██████████████░░░░░░] 68% left (resets 07:08 on 7 May)
                             8,000 of 25,000 credits used
    ```
    
    Snapshot:
    `codex-rs/tui/src/status/snapshots/codex_tui__status__tests__status_snapshot_includes_enterprise_monthly_credit_limit.snap`
    
    ## Testing
    
    Tests: generated app-server schema verification, protocol tests,
    backend-client tests, app-server integration coverage, TUI snapshot
    coverage, formatting, and workspace lint cleanup.
  • [codex-rs] auto-review model override (#23767)
    ## Why
    
    Guardian auto-review normally uses the provider-preferred review model
    when one is available. Some parent models need model-catalog metadata to
    select a different review model while keeping older `/models` payloads
    compatible when that metadata is absent.
    
    ## What changed
    
    - Added optional `ModelInfo::auto_review_model_override` metadata to the
    public model payload as a review-model slug.
    - Updated Guardian review model selection to prefer the catalog override
    when present, while preserving the existing provider preferred-model
    path and parent-model fallback when it is omitted.
    - Added focused Guardian coverage for override and no-override model
    selection.
    - Added an `auto_review` core integration suite test that loads override
    metadata from a remote model catalog path and asserts the strict
    auto-review `/responses` request uses the catalog-selected review model.
    - Updated existing `ModelInfo` fixtures and local catalog constructors
    for the new optional field.
    
    ## Validation
    
    - `cargo test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `cargo test -p codex-core guardian_review_uses_`
    - `cargo test -p codex-core
    remote_model_override_uses_catalog_model_for_strict_auto_review --test
    all`
    - `just fix -p codex-protocol`
    - `just fix -p codex-core`
    - `just fmt`
    - `git diff --check`
  • [codex] Require model for standalone web search (#25131)
    ## Why
    
    The standalone `/v1/alpha/search` request now requires a `model`, but
    the `web.run` extension currently omits it.
    
    Adds `model` to extension `ToolCall` invocation.
    
    Follow-up to #23823.
    
    ## What changed
    
    - Make `SearchRequest.model` required.
    - Expose the effective per-turn model on extension tool calls and pass
    it in standalone web-search requests.
    - Assert the model is forwarded in the app-server round-trip test.
    
    ## Testing
    
    - `just test -p codex-api -p codex-tools -p codex-web-search-extension
    -p codex-memories-extension -p codex-goal-extension`
    - `just test -p codex-core -E
    'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
    - `just test -p codex-app-server -E
    'test(standalone_web_search_round_trips_encrypted_output)'`
  • [codex] Add model tool mode selector (#25031)
    ## Why
    Some models need to select their code-execution behavior through model
    catalog metadata. Models without that metadata must continue to follow
    the existing `CodeMode` and `CodeModeOnly` feature flags, including when
    a newer server sends an enum value this client does not recognize.
    
    ## What changed
    - add optional `ModelInfo.tool_mode` metadata with `direct`,
    `code_mode`, and `code_mode_only`
    - treat omitted and unknown wire values as `None`
    - resolve `None` from the existing feature flags
    - carry the resolved `ToolMode` directly on `TurnContext`, outside
    `Config`
    - use the resolved value for turn creation, model switches, review
    turns, tool planning, and code execution
    
    ## Coverage
    - add protocol coverage for omitted, known, and unknown enum values
    - add focused coverage for flag fallback and explicit metadata
    overriding feature flags
    - add core integration coverage that fetches remote model metadata
    through `/v1/models` and verifies the outbound `/responses` tools for
    explicit `direct` and `code_mode_only` selectors
    
    ## Stack
    - followed by #25032
  • standalone websearch extension (#23823)
    ## Summary
    
    Add the extension-backed standalone `web.run` tool so Codex can call the
    standalone search endpoint through the `codex-api` search client and
    return its encrypted output to Responses.
    
    - gate the new tool behind `standalone_web_search`
    - install the extension in the app-server thread registry and hide
    hosted `web_search` when standalone search is enabled for OpenAI
    providers so the two paths stay mutually exclusive
    - build search context from persisted history using a small tail
    heuristic: previous user message, assistant text between the last two
    user turns capped at about 1k tokens, and current user message
    
    ## Test Plan
    
    - `cargo test -p codex-web-search-extension`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
  • Display workspace usage limit error copy from response header (#24114)
    ## Why
    
    `openai/openai#947613` adds `X-Codex-Rate-Limit-Reached-Type` for Codex
    workspace credit-depletion and spend-cap responses. The CLI currently
    reads the adjacent promo header but otherwise renders generic
    usage-limit copy, so those responses do not explain the
    workspace-specific action the user needs to take.
    
    Backend dependency: https://github.com/openai/openai/pull/947613
    
    ## What Changed
    
    - Parse `X-Codex-Rate-Limit-Reached-Type` in the usage-limit error
    handling path alongside `x-codex-promo-message`.
    - Keep the header value parsing with the shared `RateLimitReachedType`
    enum.
    - Carry the parsed type on `UsageLimitReachedError` and render
    client-owned copy for the four workspace owner/member credit and
    spend-cap values.
    - Preserve existing promo and plan-based text for absent, generic, or
    unknown header values.
    - Keep the existing TUI workspace-owner nudge state path unchanged; the
    response header only selects the displayed error string.
    - Add focused display coverage for all specific type values and the
    generic fallback case.
    
    ## Test Plan
    
    - Added `usage_limit_reached_error_formats_rate_limit_reached_types`
    coverage.
    - Not run manually, per request; CI runs validation on the pushed
    commit.
  • Add typed Images client to codex-api (#23989)
    ## Why
    
    Standalone image generation needs a typed `codex-api` client surface for
    the Codex image proxy routes before the harness and model-facing tool
    layers are wired in.
    
    ## What changed
    
    - Added `ImagesClient` support for JSON `images/generations` and
    `images/edits` requests.
    - Added typed request and response shapes for generation, JSON edit
    image URLs, image metadata, and base64 image outputs.
    - Kept generation model slugs open-ended while requiring the generation
    model field that the downstream endpoint expects.
    - Exported the new client and image types from `codex-api`.
    - Added coverage for generation and edit wire shapes, extra response
    metadata that the client ignores, and malformed image responses missing
    `data`.
    
    ## Validation
    
    - `cargo test -p codex-api`
    - `just fix -p codex-api`
    - `just fmt`
    - `git diff --check main`
  • [codex] Fix realtime v1 websocket compatibility (#23771)
    ## Why
    
    Realtime v1 websocket sessions now expect a slightly different boundary
    shape for text input, completed input transcripts, and connection
    headers. Codex was still using the older shape, so some v1 text appends
    could be rejected before the existing conversation flow could handle
    them.
    
    ## What changed
    
    - Send v1 user text items with `input_text` content
    - Accept v1 turn-marked input transcript events as completed transcripts
    - Add the v1 alpha header only for v1 realtime sessions
    - Cover the outbound text shape, transcript parsing, and versioned
    headers
    
    ## Test plan
    
    - `cargo test -p codex-api endpoint::realtime_websocket::methods::tests`
    - `cargo test -p codex-core quicksilver_alpha_header`
  • Honor client-resolved service tier defaults (#23537)
    ## Why
    
    Model catalog responses can now advertise a nullable
    `default_service_tier` for each model. Codex needs to preserve three
    distinct states all the way from config/app-server inputs to inference:
    
    - no explicit service tier, so the client may apply the current model
    catalog default when FastMode is enabled
    - explicit `default`, meaning the user intentionally wants standard
    routing
    - explicit catalog tier ids such as `priority`, `flex`, or future tiers
    
    Keeping those states distinct prevents the UI from showing one tier
    while core sends another, especially after model switches or app-server
    `thread/start` / `turn/start` updates.
    
    ## What Changed
    
    - Plumbed `default_service_tier` through model catalog protocol types,
    app-server model responses, generated schemas, model cache fixtures, and
    provider/model-manager conversions.
    - Added the request-only `default` service tier sentinel and normalized
    legacy config spelling so `fast` in `config.toml` still materializes as
    the runtime/request id `priority`.
    - Moved catalog default resolution to the TUI/client side, including
    recomputing the effective service tier when model/FastMode-dependent
    surfaces change.
    - Updated app-server thread lifecycle config construction so
    `serviceTier: null` preserves explicit standard-routing intent by
    mapping to `default` instead of internal `None`.
    - Kept core responsible for validating explicit tiers against the
    current model and stripping `default` before `/v1/responses`, without
    applying catalog defaults itself.
    
    ## Validation
    
    - `CARGO_INCREMENTAL=0 cargo build -p codex-cli`
    - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list`
    - `cargo test -p codex-tui service_tier`
    - `cargo test -p codex-protocol service_tier_for_request`
    - `cargo test -p codex-core get_service_tier`
    - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core
    service_tier`
  • add standalone websearch api client (#23655)
    add standalone web search request types and a `codex-api` client ahead
    of the extension-contributed search tool.
    
    this adds typed commands/settings and opaque encrypted output handling
    for the new standalone search flow. the endpoint types are close to
    finalized but may still shift slightly as that API settles.
  • Add timeout for remote compaction requests (#23451)
    ## Why
    
    Remote compaction currently sends a unary `POST /responses/compact` and
    waits for the full response before replacing history or emitting the
    completed `ContextCompaction` item. Unlike normal `/responses` streaming
    requests, this unary compact request had no timeout boundary. If the
    backend accepts the request and then stalls before returning a body, the
    existing request retry policy never sees a transport error, so the
    compact turn can remain stuck after the started item with no completion
    or actionable error.
    
    That matches the reported hang shape in issues such as #18363, where
    logs show `responses/compact` was posted but no corresponding compact
    completion followed. A bounded request timeout gives the existing retry
    policy a concrete timeout error to retry instead of letting the user sit
    indefinitely on automatic context compaction.
    
    ## What
    
    - Add a request timeout to legacy `/responses/compact` calls.
    - Size that timeout from the provider stream idle timeout with a
    conservative multiplier, so the default compact attempt gets 20 minutes
    rather than the 5 minute stream idle window.
    - Map API transport timeouts to a request timeout error instead of the
    child-process timeout message.
    
    ## Testing
    
    - Not run (per request; CI will cover).
  • feat(cli): add codex doctor diagnostics (#22336)
    ## Why
    
    Users and support need a single command that captures the local Codex
    runtime, configuration, auth, terminal, network, and state shape without
    asking the user to know which diagnostic depth to choose first. `codex
    doctor` now runs the useful checks by default and makes the detailed
    human output the default because the command is usually run when someone
    already needs context.
    
    The command also targets concrete support failure modes we have seen
    while iterating on the design:
    
    - update-target mismatches like #21956, where the installed package
    manager target can differ from the running executable
    - terminal and multiplexer issues that depend on `TERM`, tmux/zellij
    state, color handling, and TTY metadata
    - provider-specific HTTP/WebSocket connectivity, including ChatGPT
    WebSocket handshakes and API-key/provider endpoint reachability
    - local state/log SQLite integrity problems and large rollout
    directories
    - feedback reports that need an attached, redacted diagnostic snapshot
    without asking the user to run a second command
    
    ## What Changed
    
    - Adds `codex doctor` as a grouped CLI diagnostic report with default
    detailed output and `--summary` for the compact view.
    - Adds stable report sections for Environment, Configuration, Updates,
    Connectivity, and Background Server, plus a top Notes block that
    promotes anomalies such as available updates, large rollout directories,
    optional MCP issues, and mixed auth signals.
    - Adds runtime provenance, install consistency, bundled/system search
    readiness, terminal/multiplexer metadata, `config.toml` parse status,
    auth mode details, sandbox details, feature flag summaries, update
    cache/latest-version state, app-server daemon state, SQLite integrity
    checks, rollout statistics, and provider-aware network diagnostics.
    - Adds ChatGPT WebSocket diagnostics that report the negotiated HTTP
    upgrade as `HTTP 101 Switching Protocols` and include timeout, DNS,
    auth, and provider context in detailed output.
    - Makes reachability provider-aware: API-key OpenAI setups check the API
    endpoint, ChatGPT auth checks the ChatGPT path, and custom/AWS/local
    providers check configured HTTP endpoints when available.
    - Adds structured, redacted JSON output where `checks` is keyed by check
    id and `details` is a key/value object for support tooling.
    - Integrates doctor with feedback uploads by attaching a best-effort
    `codex-doctor-report.json` report and adding derived Sentry tags for
    overall status and failing/warning checks.
    - Updates the TUI feedback consent copy so users can see that the doctor
    report is included when logs/diagnostics are uploaded.
    - Updates the CLI bug issue template to ask reporters for `codex doctor
    --json` and render pasted reports as JSON.
    
    ## Example Output
    
    The examples below are sanitized from local smoke runs with `--no-color`
    so the structure is reviewable in plain text.
    
    ### `codex doctor`
    
    ```text
    Codex Doctor v0.0.0 · macos-aarch64
    
    Notes
       ↑ updates      0.130.0 available (current 0.0.0, dismissed 0.128.0)
       ⚠ rollouts     1,526 active files · 2.53 GB on disk
       ⚠ mcp          MCP configuration has optional issues
       ⚠ auth         mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode
    ─────────────────────────────────────────────────────────────
    
    Environment
      ✓ runtime      local debug build
          version                  0.0.0
          install method           other
          commit                   unknown
          executable               ~/code/codex.fcoury-doct…x-rs/target/debug/codex
      ✓ install      consistent
          context                  other
          managed by               npm: no · bun: no · package root —
          PATH entries (2)         ~/.local/share/mise/installs/node/24/bin/codex
                                   ~/.local/share/mise/shims/codex
      ✓ search       ripgrep 15.1.0 (system, `rg`)
      ✓ terminal     Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color
          terminal                 Ghostty
          TERM_PROGRAM             ghostty
          terminal version         1.3.2-main-+b0f827665
          TERM                     xterm-256color
          multiplexer              tmux 3.6a
          tmux extended-keys       on
          tmux allow-passthrough   on
          tmux set-clipboard       on
      ✓ state        databases healthy
          CODEX_HOME               ~/.codex (dir)
          state DB                 ~/.codex/state_5.sqlite (file) · integrity ok
          log DB                   ~/.codex/logs_2.sqlite (file) · integrity ok
          active rollouts          1,526 files · 2.53 GB (avg 1.70 MB)
          archived rollouts        8 files · 3.84 MB (avg 491.11 KB)
    
    Configuration
      ✓ config       loaded
          model                    gpt-5.5 · openai
          cwd                      ~/code/codex.fcoury-doctor/codex-rs
          config.toml              ~/.codex/config.toml
          config.toml parse        ok
          MCP servers              1
          feature flags            36 enabled · 7 overridden (full list with --all)
          overrides                code_mode, code_mode_only, memories, chronicle, goals, remote_control, prevent_idle_sleep
      ✓ auth         auth is configured
          auth storage mode        File
          auth file                ~/.codex/auth.json
          auth env vars present    OPENAI_API_KEY
          stored auth mode         chatgpt
          stored API key           false
          stored ChatGPT tokens    true
          stored agent identity    false
      ⚠ mcp          MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server.
          configured servers       1
          disabled servers         0
          streamable_http servers  1
          optional reachability    openaiDeveloperDocs: https://developers.openai.com/mcp (HEAD connect failed; GET connect failed)
      ✓ sandbox      restricted fs + restricted network · approval OnRequest
          approval policy          OnRequest
          filesystem sandbox       restricted
          network sandbox          restricted
    
    Connectivity
      ✓ network      network-related environment looks readable
      ✓ websocket    connected (HTTP 101 Switching Protocols) · 15s timeout
          model provider           openai
          provider name            OpenAI
          wire API                 responses
          supports websockets      true
          connect timeout          15000 ms
          auth mode                chatgpt
          endpoint                 wss://chatgpt.com/backend-api/<redacted>
          DNS                      2 IPv4, 2 IPv6, first IPv6
          handshake result         HTTP 101 Switching Protocols
      ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration.
          reachability mode        API key auth
          openai API               https://api.openai.com/v1 connect failed (required)
    
    Background Server
      ○ app-server   not running (ephemeral mode)
    
    ─────────────────────────────────────────────────────────────
    11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed
    
    --summary compact output           --all expand truncated lists
    --json redacted report
    ```
    
    ### `codex doctor --summary`
    
    ```text
    Codex Doctor v0.0.0 · macos-aarch64
    
    Notes
       ↑ updates      0.130.0 available (current 0.0.0, dismissed 0.128.0)
       ⚠ rollouts     1,526 active files · 2.53 GB on disk
       ⚠ mcp          MCP configuration has optional issues
       ⚠ auth         mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode
    ─────────────────────────────────────────────────────────────
    
    Environment
      ✓ runtime      local debug build
      ✓ install      consistent
      ✓ search       ripgrep 15.1.0 (system, `rg`)
      ✓ terminal     Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color
      ✓ state        databases healthy
    
    Configuration
      ✓ config       loaded
      ✓ auth         auth is configured
      ⚠ mcp          MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server.
      ✓ sandbox      restricted fs + restricted network · approval OnRequest
    
    Updates
      ✓ updates      update configuration is locally consistent
    
    Connectivity
      ✓ network      network-related environment looks readable
      ✓ websocket    connected (HTTP 101 Switching Protocols) · 15s timeout
      ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration.
    
    Background Server
      ○ app-server   not running (ephemeral mode)
    
    ─────────────────────────────────────────────────────────────
    11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed
    
    Run codex doctor without --summary for detailed diagnostics.
    --all expand truncated lists       --json redacted report
    ```
    
    ### `codex doctor --json` shape
    
    ```json
    {
      "schema_version": 1,
      "overall_status": "fail",
      "checks": {
        "runtime.provenance": {
          "id": "runtime.provenance",
          "category": "Environment",
          "status": "ok",
          "summary": "local debug build",
          "details": {
            "version": "0.0.0",
            "install method": "other",
            "commit": "unknown"
          }
        },
        "sandbox.helpers": {
          "id": "sandbox.helpers",
          "category": "Configuration",
          "status": "ok",
          "summary": "restricted fs + restricted network · approval OnRequest",
          "details": {
            "approval policy": "OnRequest",
            "filesystem sandbox": "restricted",
            "network sandbox": "restricted"
          }
        }
      }
    }
    ```
    
    ### `/feedback` new sentry attachment
    
    <img width="938" height="798" alt="CleanShot 2026-05-13 at 15 36 14"
    src="https://github.com/user-attachments/assets/715e62e0-d7b4-4fea-a35a-fd5d5d33c4c0"
    />
    
    ### New section in CLI issue template
    
    <img width="1164" height="435" alt="CleanShot 2026-05-13 at 15 47 24"
    src="https://github.com/user-attachments/assets/9081dc25-a28c-4afa-8ba1-e299c2b4031d"
    />
    
    ## How to Test
    
    1. Run `cargo run --bin codex -- doctor --no-color`.
    2. Confirm the detailed report is the default and includes promoted
    Notes, grouped sections, terminal details, state DB integrity, rollout
    stats, provider reachability, WebSocket diagnostics, and app-server
    status.
    3. Run `cargo run --bin codex -- doctor --summary --no-color`.
    4. Confirm the compact view keeps the same sections and summary counts
    but omits detailed key/value rows.
    5. Run `cargo run --bin codex -- doctor --json`.
    6. Confirm the output is redacted JSON, `checks` is an object keyed by
    check id, and each check's `details` is a key/value object.
    7. Preview the CLI bug issue template and confirm the `Codex doctor
    report` field appears after the terminal field, asks for `codex doctor
    --json`, and renders pasted output as JSON.
    8. Start a feedback flow that includes logs.
    9. Confirm the upload consent copy lists `codex-doctor-report.json`
    alongside the log attachments.
    
    Targeted tests:
    
    - `cargo test -p codex-cli doctor`
    - `cargo test -p codex-app-server
    doctor_report_tags_summarize_status_counts`
    - `cargo test -p codex-feedback`
    - `cargo test -p codex-tui feedback_view`
    - `just argument-comment-lint`
    - `git diff --check`
  • fix: drop underscored id headers (#22193)
    ## Why
    Stop sending duplicate `session_id`/`thread_id` headers. We only want
    the hyphenated forms as `_` is rejected by some proxies
    
    Related discussion here:
    https://openai.slack.com/archives/C095U48JNL9/p1778508316923179
    
    ## What
    - Keep `session-id` and `thread-id`
    - Remove the underscore aliases
  • Remove CODEX_RS_SSE_FIXTURE test hook (#22413)
    ## Why
    
    `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
    bypass the normal Responses transport by reading SSE from local files.
    That kept test-only behavior wired through production client code. The
    affected tests can stay hermetic by using the existing
    `core_test_support::responses` mock server and passing `openai_base_url`
    instead.
    
    ## What Changed
    
    - Removed the `CODEX_RS_SSE_FIXTURE` flag,
    `codex_api::stream_from_fixture`, the `env-flags` dependency, and the
    checked-in SSE fixture files.
    - Repointed the affected core, exec, and TUI tests at `MockServer` with
    the existing SSE event constructors.
    - Removed the Bazel test data plumbing for the deleted fixtures and
    refreshed cargo/Bazel lock state.
    
    ## Verification
    
    - `cargo build -p codex-cli`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core --test all responses_api_stream_cli`
    - `cargo test -p codex-core --test all
    integration_creates_and_checks_session_file`
    - `cargo test -p codex-exec --test all ephemeral`
    - `cargo test -p codex-exec --test all resume`
    - `cargo test -p codex-tui --test all
    resume_startup_does_not_consume_model_availability_nux_count`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
    - `git diff --check`
  • api: send hyphenated session and thread headers (#21757)
    ## Why
    Some consumers expect conventional hyphenated HTTP headers. Codex
    already sends the session and thread IDs on outbound Responses requests,
    but it only uses the underscore spellings today, which makes those IDs
    harder to consume in systems that normalize or reject underscore header
    names.
    
    Full context here:
    https://openai.slack.com/archives/C08KCGLSPSQ/p1778248578422369
    
    ## What changed
    - `build_session_headers` now emits both `session_id` and `session-id`
    when a session ID is present.
    - It does the same for `thread_id` and `thread-id`.
    - Added regression coverage in `codex-api/tests/clients.rs` and
    `core/tests/suite/client.rs` so both the lower-level client tests and
    the end-to-end request tests assert the two header spellings are
    present.
    
    ## Test plan
    - Added header assertions in `codex-api/tests/clients.rs`.
    - Added request-header assertions in `core/tests/suite/client.rs` for
    both the `/v1/responses` and `/api/codex/responses` request paths.
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` has entails both running standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
    
    For the collection of crates below, this speeds up test execution by
    >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Add response.processed websocket request (#21284)
    ## Summary
    
    - Add a `response.processed` websocket request payload and sender for
    Responses API websockets.
    - Send `response.processed` from `try_run_sampling_request` after a
    response completes, local turn processing succeeds, and the
    session-owned feature flag is enabled.
    - Add websocket coverage for both enabled and disabled feature-flag
    behavior.
    
    ## Validation
    
    - `just fmt`
    - `cargo test -p codex-core response_processed`
    - `cargo test -p codex-api responses_websocket`
    - `cargo test -p codex-features
    responses_websocket_response_processed_is_under_development`
    - `git diff --check`
    - `just fix -p codex-api -p codex-core -p codex-features`
    - `git diff --check origin/main...HEAD`
  • Propagate cache key and service tiers in compact (#21249)
    ## Why
    
    `/responses/compact` should preserve the request-affinity fields that
    apply to the active auth mode. ChatGPT-auth compact requests need the
    effective `service_tier`, and compact requests for every auth mode need
    the stable `prompt_cache_key`, so compaction does not quietly lose
    routing or cache behavior that normal sampling already has.
    
    This follows the request-parity direction from #20719, but keeps the net
    change focused on the compact payload fields needed here.
    
    ## What changed
    
    - Add `service_tier` and `prompt_cache_key` to the compact endpoint
    input payload.
    - Build the remote compact payload from the existing responses request
    builder output so `Fast` still maps to `priority` when compact sends a
    service tier.
    - Pass the turn service tier into remote compaction, but only include it
    in compact payloads for ChatGPT-backed auth.
    - Keep `prompt_cache_key` on compact payloads for all auth modes.
    - Add request-body diff snapshot coverage in
    `core/tests/suite/compact_remote.rs` for:
    - API-key auth reusing `prompt_cache_key` while omitting `service_tier`
    even when `Fast` is configured.
      - ChatGPT auth reusing both `service_tier` and `prompt_cache_key`.
    - Drive the snapshot coverage through five varied turns: plain text,
    multi-part text, tool-call continuation, image+text input, local-shell
    continuation, and final-turn reasoning output.
    
    ## Verification
    
    - Added insta snapshots for compact request-body parity against the last
    normal `/responses` request after five varied turns.
    - Not run locally per repo guidance; relying on GitHub CI for test
    execution.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add session_id (#20437)
    ## Summary
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
    TLDR:
    We update the meaning of session ids and thread ids:
    * thread_id stays as now
    * session_id become a shared id between every thread under a /root
    thread (i.e. every sub-agent share the same session id)
    
    This PR introduces an explicit `SessionId` and threads it through the
    protocol/client boundary so `session_id` and `thread_id` can diverge
    when they need to, while preserving compatibility for older serialized
    `session_configured` events.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • 1- Add model service tiers metadata (#20969)
    ## Why
    
    The model list needs to carry display-ready service tier metadata so
    clients can render tier choices with stable IDs, names, and
    descriptions. A raw speed-tier string list is not enough for richer UI
    copy or future tier labels.
    
    ## What changed
    
    - Added `ModelServiceTier` to shared model metadata with string `id`,
    `name`, and `description` fields.
    - Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving
    empty defaults for older cached model payloads.
    - Exposed `serviceTiers` on app-server v2 `Model` responses and threaded
    it through TUI app-server model conversion.
    - Marked legacy `additional_speed_tiers` / `additionalSpeedTiers`
    metadata as deprecated in source and generated schema output.
    - Regenerated app-server protocol JSON schema and TypeScript fixtures,
    including `ModelServiceTier.ts`.
    
    ## Verification
    
    - Ran `just write-app-server-schema`.
    - Did not run local tests per repo instruction; relying on PR CI.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Bound websocket request sends with idle timeout (#20751)
    ## Why
    
    We saw Responses websocket sessions recover only after a long quiet
    period when the server had already logged the websocket as disconnected.
    The normal connect path is already bounded by
    `websocket_connect_timeout_ms`, but the first request send on an
    established websocket reused only the receive-side idle timeout after
    the write completed. If the socket write/pump stalls, the client can sit
    in `ws_stream.send(...)` without reaching the existing receive timeout.
  • realtime: rename provider session ids (#20361)
    ## Summary
    
    Codex is repurposing `session` to mean a thread group, so the realtime
    provider session id should no longer use `session_id` / `sessionId` in
    Codex-facing protocol payloads. This PR renames that provider-specific
    field to `realtime_session_id` / `realtimeSessionId` and intentionally
    breaks clients that still send the old field names.
    
    ## What Changed
    
    - Renamed realtime provider session fields in `ConversationStartParams`,
    `RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`.
    - Renamed app-server v2 realtime request and notification fields to
    `realtimeSessionId`.
    - Removed legacy serde aliases for `session_id` / `sessionId`; clients
    must send the new names.
    - Propagated the rename through core realtime startup, app-server
    adapters, codex-api websocket handling, and TUI realtime state.
    - Regenerated app-server protocol schema/TypeScript outputs and updated
    app-server README examples.
    - Kept upstream Realtime API concepts unchanged: provider `session.id`
    parsing and `x-session-id` headers still use the upstream wire names.
    
    ## Testing
    
    - CI is running on the latest pushed commit.
    - Earlier local verification on this PR:
      - `cargo test -p codex-protocol`
    - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core
    realtime_conversation`
      - `cargo test -p codex-app-server-protocol`
    - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server
    realtime_conversation`
    - attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local
    linker bus error while linking the test binary)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [rollout-trace] Include x-request-id in rollout trace. (#20066)
    ## Why
    
    Rollout traces need an identifier that can be used to correlate a Codex
    inference with upstream Responses API, proxy, and engine logs. The
    reduced trace model already exposed `upstream_request_id`, but it was
    being populated from the Responses API `response.id`. That value is
    useful for `previous_response_id` chaining, but it is not the transport
    request id that upstream systems key on.
    
    This PR separates those concepts so trace consumers can reliably answer
    both questions:
    
    - which Responses API response did this inference produce?
    - which upstream request handled it?
    
    ## Structure
    
    The change keeps the upstream request id at the same lifecycle level as
    the provider stream:
    
    - `codex-api` captures the `x-request-id` HTTP response header when the
    SSE stream is created and exposes it on `ResponseStream`. Fixture and
    websocket streams set the field to `None` because they do not have that
    HTTP response header.
    - `codex-core` carries that stream-level id into `InferenceTraceAttempt`
    when recording terminal stream outcomes. Completed, failed, cancelled,
    dropped-stream, and pre-response error paths all record the id when it
    is available.
    - `rollout-trace` now records both identifiers in raw terminal inference
    events and response payloads: `response_id` for the Responses API
    `response.id`, and `upstream_request_id` for `x-request-id`.
    - The reducer stores both fields on `InferenceCall`. It also uses
    `response_id` for `previous_response_id` conversation linking, which
    removes the old accidental dependency on the misnamed
    `upstream_request_id` field.
    - Terminal inference reduction now consumes the full terminal payload
    (`InferenceCompleted`, `InferenceFailed`, or `InferenceCancelled`) in
    one place. That keeps status, partial payloads, response ids, and
    upstream request ids consistent across success, failure, cancellation,
    and late stream-mapper events.
    
    ## Why This Shape
    
    `x-request-id` is a property of the HTTP/provider response envelope, not
    an SSE event. Capturing it once in `codex-api` and plumbing it through
    terminal trace recording avoids trying to infer the value from stream
    contents, and it preserves the id even when the stream fails or is
    cancelled after only partial output.
    
    Keeping `response_id` separate from `upstream_request_id` also makes the
    reduced trace model less surprising: `response_id` remains the
    conversation-continuation id, while `upstream_request_id` is the
    operational correlation id for upstream debugging.
    
    ## Validation
    
    The PR updates trace and reducer coverage for:
    
    - reading `x-request-id` from SSE response headers;
    - storing the true upstream request id on completed inference calls;
    - preserving upstream request ids for cancelled and late-cancelled
    inference streams;
    - keeping `previous_response_id` reconstruction tied to `response_id`
    rather than transport request ids.
  • Support end_turn in response.completed (#19610)
    Some providers of Responses API forward a model-defined `end_turn`
    boolean indicating explicitly the model's indication of whether it would
    like to end the turn or to be inferenced again. In this PR, we update
    the sampling loop to use this field correctly if it's set. If the field
    is not set by the provider, we fall back to the existing sampling logic.
  • refactor: route Codex auth through AuthProvider (#18811)
    ## Summary
    
    This PR moves Codex backend request authentication from direct
    bearer-token handling to `AuthProvider`.
    
    The new `codex-auth-provider` crate defines the shared request-auth
    trait. `CodexAuth::provider()` returns a provider that can apply all
    headers needed for the selected auth mode.
    
    This lets ChatGPT token auth and AgentIdentity auth share the same
    callsite path:
    - ChatGPT token auth applies bearer auth plus account/FedRAMP headers
    where needed.
    - AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
    where needed.
    
    Reference old stack: https://github.com/openai/codex/pull/17387/changes
    
    ## Callsite Migration
    
    | Area | Change |
    | --- | --- |
    | backend-client | accepts an `AuthProvider` instead of a raw
    token/header |
    | chatgpt client/connectors | applies auth through
    `CodexAuth::provider()` |
    | cloud tasks | keeps Codex-backend gating, applies auth through
    provider |
    | cloud requirements | uses Codex-backend auth checks and provider
    headers |
    | app-server remote control | applies provider headers for backend calls
    |
    | MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches
    from generic account getters |
    | model refresh | treats AgentIdentity as Codex-backend auth |
    | OpenAI file upload path | rejects non-Codex-backend auth before
    applying headers |
    | core client setup | keeps model-provider auth flow and allows
    AgentIdentity through provider-backed OpenAI auth |
    
    ## Stack
    
    1. https://github.com/openai/codex/pull/18757: full revert
    2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
    crate
    3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
    auth mode and startup task allocation
    4. This PR: migrate Codex backend auth callsites through AuthProvider
    5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
    and load `CODEX_AGENT_IDENTITY`
    
    ## Testing
    
    Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
  • Add safety check notification and error handling (#19055)
    Adds a new app-server notification that fires when a user account has
    been flagged for potential safety reasons.
  • feat: add AWS SigV4 auth for OpenAI-compatible model providers (#17820)
    ## Summary
    
    Add first-class Amazon Bedrock Mantle provider support so Codex can keep
    using its existing Responses API transport with OpenAI-compatible
    AWS-hosted endpoints such as AOA/Mantle.
    
    This is needed for the AWS launch path, where provider traffic should
    authenticate with AWS credentials instead of OpenAI bearer credentials.
    Requests are authenticated immediately before transport send, so SigV4
    signs the final method, URL, headers, and body bytes that `reqwest` will
    send.
    
    ## What Changed
    
    - Added a new `codex-aws-auth` crate for loading AWS SDK config,
    resolving credentials, and signing finalized HTTP requests with AWS
    SigV4.
    - Added a built-in `amazon-bedrock` provider that targets Bedrock Mantle
    Responses endpoints, defaults to `us-east-1`, supports region/profile
    overrides, disables WebSockets, and does not require OpenAI auth.
    - Added Amazon Bedrock auth resolution in `codex-model-provider`: prefer
    `AWS_BEARER_TOKEN_BEDROCK` when set, otherwise use AWS SDK credentials
    and SigV4 signing.
    - Added `AuthProvider::apply_auth` and `Request::prepare_body_for_send`
    so request-signing providers can sign the exact outbound request after
    JSON serialization/compression.
    - Determine the region by taking the `aws.region` config first (required
    for bearer token codepath), and fallback to SDK default region.
    
    ## Testing
    Amazon Bedrock Mantle Responses paths:
    
    - Built the local Codex binary with `cargo build`.
    - Verified the custom proxy-backed `aws` provider using `env_key =
    "AWS_BEARER_TOKEN_BEDROCK"` streamed raw `responses` output with
    `response.output_text.delta`, `response.completed`, and `mantle-env-ok`.
    - Verified a full `codex exec --profile aws` turn returned
    `mantle-env-ok`.
    - Confirmed the custom provider used the bearer env var, not AWS profile
    auth: bogus `AWS_PROFILE` still passed, empty env var failed locally,
    and malformed env var reached Mantle and failed with `401
    invalid_api_key`.
    - Verified built-in `amazon-bedrock` with `AWS_BEARER_TOKEN_BEDROCK` set
    passed despite bogus AWS profiles, returning `amazon-bedrock-env-ok`.
    - Verified built-in `amazon-bedrock` SDK/SigV4 auth passed with
    `AWS_BEARER_TOKEN_BEDROCK` unset and temporary AWS session env
    credentials, returning `amazon-bedrock-sdk-env-ok`.
  • Allow guardian bare allow output (#18797)
    ## Summary
    
    Allow guardian to skip other fields and output only
    `{"outcome":"allow"}` when the command is low risk.
    This change lets guardian reviews use a non-strict text format while
    keeping the JSON schema itself as plain user-visible schema data, so
    transport strictness is carried out-of-band instead of through a schema
    marker key.
    
    ## What changed
    
    - Add an explicit `output_schema_strict` flag to model prompts and pass
    it into `codex-api` text formatting.
    - Set guardian reviewer prompts to non-strict schema validation while
    preserving strict-by-default behavior for normal callers.
    - Update the guardian output contract so definitely-low-risk decisions
    may return only `{"outcome":"allow"}`.
    - Treat bare allow responses as low-risk approvals in the guardian
    parser.
    - Add tests and snapshots covering the non-strict guardian request and
    optional guardian output fields.
    
    ## Verification
    
    - `cargo test -p codex-core guardian::tests::guardian`
    - `cargo test -p codex-core guardian::tests::`
    - `cargo test -p codex-core client_common::tests::`
    - `cargo test -p codex-protocol
    user_input_serialization_includes_final_output_json_schema`
    - `cargo test -p codex-api`
    - `git diff --check`
    
    Note: `cargo test -p codex-core` was also attempted, but this desktop
    environment injects ambient config/proxy state that causes unrelated
    config/session tests expecting pristine defaults to fail.
    
    ---------
    
    Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
    Co-authored-by: Codex <noreply@openai.com>
  • fix: fully revert agent identity runtime wiring (#18757)
    ## Summary
    
    This PR fully reverts the previously merged Agent Identity runtime
    integration from the old stack:
    https://github.com/openai/codex/pull/17387/changes
    
    It removes the Codex-side task lifecycle wiring, rollout/session
    persistence, feature flag plumbing, lazy `auth.json` mutation,
    background task auth paths, and request callsite changes introduced by
    that stack.
    
    This leaves the repo in a clean pre-AgentIdentity integration state so
    the follow-up PRs can reintroduce the pieces in smaller reviewable
    layers.
    
    ## Stack
    
    1. This PR: full revert
    2. https://github.com/openai/codex/pull/18871: move Agent Identity
    business logic into a crate
    3. https://github.com/openai/codex/pull/18785: add explicit
    AgentIdentity auth mode and startup task allocation
    4. https://github.com/openai/codex/pull/18811: migrate auth callsites
    through AuthProvider
    
    ## Testing
    
    Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
  • chore: document intentional await-holding cases (#18423)
    ## Why
    
    This PR prepares the stack to enable Clippy await-holding lints that
    were left disabled in #18178. The mechanical lock-scope cleanup is
    handled separately; this PR is the documentation/configuration layer for
    the remaining await-across-guard sites.
    
    Without explicit annotations, reviewers and future maintainers cannot
    tell whether an await-holding warning is a real concurrency smell or an
    intentional serialization boundary.
    
    ## What changed
    
    - Configures `clippy.toml` so `await_holding_invalid_type` also covers
    `tokio::sync::{MutexGuard,RwLockReadGuard,RwLockWriteGuard}`.
    - Adds targeted `#[expect(clippy::await_holding_invalid_type, reason =
    ...)]` annotations for intentional async guard lifetimes.
    - Documents the main categories of intentional cases: active-turn state
    transitions that must remain atomic, session-owned MCP manager accesses,
    remote-control websocket serialization, JS REPL kernel/process
    serialization, OAuth persistence, external bearer token refresh
    serialization, and tests that intentionally serialize shared global or
    session-owned state.
    - For external bearer token refresh, documents the existing
    serialization boundary: holding `cached_token` across the provider
    command prevents concurrent cache misses from starting duplicate refresh
    commands, and the current behavior is small enough that an explicit
    expectation is easier to maintain than adding another synchronization
    primitive.
    
    ## Verification
    
    - `cargo clippy -p codex-login --all-targets`
    - `cargo clippy -p codex-connectors --all-targets`
    - `cargo clippy -p codex-core --all-targets`
    - The follow-up PR #18698 enables `await_holding_invalid_type` and
    `await_holding_lock` as workspace `deny` lints, so any undocumented
    remaining offender will fail Clippy.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18423).
    * #18698
    * __->__ #18423
  • [codex] Send realtime transcript deltas on handoff (#18761)
    ## Summary
    - Track how many realtime transcript entries have already been attached
    to a background-agent handoff.
    - Attach only entries added since the previous handoff as
    `<transcript_delta>` instead of resending the accumulated transcript
    snapshot.
    - Update the realtime integration test so the second delegation carries
    only the second transcript delta.
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core
    inbound_handoff_request_sends_transcript_delta_after_each_handoff`
    - `cargo build -p codex-cli -p codex-app-server`
    
    ## Manual testing
    Built local debug binaries at:
    - `codex-rs/target/debug/codex`
    - `codex-rs/target/debug/codex-app-server`
  • Add realtime silence tool (#18635)
    ## Summary
    
    Adds a second realtime v2 function tool, `remain_silent`, so the
    realtime model has an explicit non-speaking action when the
    collaboration mode or latest context says it should not answer aloud.
    This is stacked on #18597.
    
    ## Design
    
    - Advertise `remain_silent` alongside `background_agent` in realtime v2
    conversational sessions.
    - Parse `remain_silent` function calls into a typed
    `RealtimeEvent::NoopRequested` event.
    - Have core answer that function call with an empty
    `function_call_output` and deliberately avoid `response.create`, so no
    follow-up realtime response is requested.
    - Keep the event hidden from app-server/TUI surfaces; it is operational
    plumbing, not user-visible conversation content.
  • Update realtime handoff transcript handling (#18597)
    ## Summary
    
    This PR aims to improve integration between the realtime model and the
    codex agent by sharing more context with each other. In particular, we
    now share full realtime conversation transcript deltas in addition to
    the delegation message.
    
    realtime_conversation.rs now turns a handoff into:
    ```
    <realtime_delegation>
      <input>...</input>
      <transcript_delta>...</transcript_delta>
    </realtime_delegation>
    ```
    
    ## Implementation notes
    
    The transcript is accumulated in the realtime websocket layer as parsed
    realtime events arrive. When a background-agent handoff is requested,
    the current transcript snapshot is copied onto the handoff event and
    then serialized by `realtime_conversation.rs` into the hidden realtime
    delegation envelope that Codex receives as user-turn context.
    
    For Realtime V2, the session now explicitly enables input audio
    transcription, and the parser handles the relevant input/output
    transcript completion events so the snapshot includes both user speech
    and realtime model responses. The delegation `<input>` remains the
    actual handoff request, while `<transcript_delta>` carries the
    surrounding conversation history for context.
    
    Reviewers should note that the transcript payload is intended for Codex
    context sharing, not UI rendering. The realtime delegation envelope
    should stay hidden from the user-facing transcript surface, while still
    being included in the background-agent turn so Codex can answer with the
    same conversational context the realtime model had.
  • [codex] Use AgentAssertion downstream behind use_agent_identity (#17980)
    ## Summary
    
    This is the AgentAssertion downstream slice for feature-gated agent
    identity support, replacing the oversized AgentAssertion slice from PR
    #17807.
    
    It isolates task-scoped downstream AgentAssertion wiring on top of the
    merged PR3.1 work without re-carrying the earlier agent registration,
    task registration, or task-state history.
    
    This PR includes the task-scoped bug-fix call sites from the review:
    generic file upload auth, MCP OpenAI file upload auth, and ARC monitor
    auth. Broader user/control-plane calls move to PR4.1 and PR4.2.
    
    ## Stack
    
    - PR1: https://github.com/openai/codex/pull/17385 - add
    `features.use_agent_identity`
    - PR2: https://github.com/openai/codex/pull/17386 - register agent
    identities when enabled
    - PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
    when enabled
    - PR3.1: https://github.com/openai/codex/pull/17978 - persist and
    prewarm registered tasks per thread
    - PR4: this PR - use task-scoped `AgentAssertion` downstream when
    enabled
    - PR4.1: https://github.com/openai/codex/pull/18094 - introduce
    AuthManager-owned background/control-plane `AgentAssertion` auth
    - PR4.2: https://github.com/openai/codex/pull/18260 - use background
    task auth for additional backend/control-plane calls
    
    ## What Changed
    
    - add AgentAssertion envelope generation in `codex-core`
    - route downstream HTTP and websocket auth through AgentAssertion when
    an agent task is present
    - extend the model-provider auth provider so non-bearer authorization
    schemes can be passed through cleanly
    - make generic file uploads attach the full authorization header value
    - make MCP OpenAI file uploads use the cached thread agent task
    assertion when present
    - make ARC monitor calls use the cached thread agent task assertion when
    present
    
    ## Why
    
    The original PR had drifted ancestry and showed a much larger diff than
    the semantic change actually required. Restacking it onto PR3.1 keeps
    the reviewable surface down to the downstream assertion slice.
    
    ## Validation
    
    - `just fmt`
    - `cargo check -p codex-core -p codex-login -p codex-analytics -p
    codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
    codex-models-manager -p codex-chatgpt -p codex-model-provider -p
    codex-mcp -p codex-core-skills`
    - `cargo test -p codex-model-provider bearer_auth_provider`
    - `cargo test -p codex-core agent_assertion`
    - `cargo test -p codex-app-server remote_control`
    - `cargo test -p codex-cloud-requirements fetch_cloud_requirements`
    - `cargo test -p codex-models-manager manager::tests`
    - `cargo test -p codex-chatgpt`
    - `cargo test -p codex-cloud-tasks`
    - `cargo test -p codex-login agent_identity`
    - `just fix -p codex-core -p codex-login -p codex-analytics -p
    codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
    codex-models-manager -p codex-chatgpt -p codex-model-provider -p
    codex-mcp -p codex-core-skills`
    - `just fix -p codex-app-server`
    - `git diff --check`
  • Add max context window model metadata (#18382)
    Adds max_context_window to model metadata and routes core context-window
    reads through resolved model info. Config model_context_window overrides
    are clamped to max_context_window when present; without an override, the
    model context_window is used.
  • refactor: use cloneable async channels for shared receivers (#18398)
    This is the first mechanical cleanup in a stack whose higher-level goal
    is to enable Clippy coverage for async guards held across `.await`
    points.
    
    The follow-up commits enable Clippy's
    [`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock)
    lint and the configurable
    [`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type)
    lint for Tokio guard types. This PR handles the cases where the
    underlying issue is not protected shared mutable state, but a
    `tokio::sync::mpsc::UnboundedReceiver` wrapped in `Arc<Mutex<_>>` so
    cloned owners can call `recv().await`.
    
    Using a mutex for that shape forces the receiver lock guard to live
    across `.await`. Switching these paths to `async-channel` gives us
    cloneable `Receiver`s, so each owner can hold a receiver handle directly
    and await messages without an async mutex guard.
    
    ## What changed
    
    - In `codex-rs/code-mode`, replace the turn-message
    `mpsc::UnboundedSender`/`UnboundedReceiver` plus `Arc<Mutex<Receiver>>`
    with `async_channel::Sender`/`Receiver`.
    - In `codex-rs/codex-api`, replace the realtime websocket event receiver
    with an `async_channel::Receiver`, allowing `RealtimeWebsocketEvents`
    clones to receive without locking.
    - Add `async-channel` as a dependency for `codex-code-mode` and
    `codex-api`, and update `Cargo.lock`.
    
    ## Verification
    
    - The split stack was verified at the final lint-enabling head with
    `just clippy`.
  • [codex] Propagate rate limit reached type (#18227)
    ## Summary
    
    First PR in the split from #17956.
    
    - adds the core/app-server `RateLimitReachedType` shape
    - maps backend `rate_limit_reached_type` into Codex rate-limit snapshots
    - carries the field through app-server notifications/responses and
    generated schemas
    - updates existing constructors/tests for the new optional field
    
    ## Validation
    
    - `cargo test -p codex-backend-client`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server rate_limits`
    - `cargo test -p codex-tui workspace_`
    - `cargo test -p codex-tui status_`
    - `just fmt`
    - `just fix -p codex-backend-client`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fix -p codex-tui`
  • feat: add opt-in provider runtime abstraction (#17713)
    ## Summary
    
    - Add `codex-model-provider` as the runtime home for model-provider
    behavior that does not belong in `codex-core`, `codex-login`, or
    `codex-api`.
    - The new crate wraps configured `ModelProviderInfo` in a
    `ModelProvider` trait object that can resolve the API provider config,
    provider-scoped auth manager, and request auth provider for each call.
    - This centralizes provider auth behavior in one place today, and gives
    us an extension point for future provider-specific auth, model listing,
    request setup, and related runtime behavior.
    
    ## Tests
    Ran tests manually to make sure that provider auth under different
    configs still work as expected.
    
    ---------
    
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • Stream apply_patch changes (#17862)
    Adds new events for streaming apply_patch changes from responses api.
    This is to enable clients to show progress during file writes.
    
    Caveat: This does not work with apply_patch in function call mode, since
    that required adding streaming json parsing.