Commit Graph

184 Commits

  • feat: use run agent task auth for inference (#19051)
    ## Stack
    
    This is PR 3 of the simplified HAI single-run-task stack:
    
    - [#19047](https://github.com/openai/codex/pull/19047) Agent Identity
    assertion and task-registration primitives, including the shared
    run-task helper used by existing Agent Identity JWT auth.
    - [#19049](https://github.com/openai/codex/pull/19049)
    Disabled-by-default ChatGPT auth opt-in that provisions/reuses persisted
    Agent Identity runtime auth and its single run task.
    - [#19051](https://github.com/openai/codex/pull/19051) Run-scoped
    provider auth that uses one backend-owned task id for first-party
    inference and compaction requests.
    
    [#19054](https://github.com/openai/codex/pull/19054) collapsed out of
    the active stack because the simplified design no longer needs a
    separate background/control-plane task helper.
    
    ## Summary
    
    This PR moves Agent Identity usage into provider auth resolution. That
    keeps `AgentAssertion` auth tied to first-party OpenAI provider requests
    instead of applying a late session-wide override that could affect
    local, custom, Bedrock, API-key, or external-bearer providers.
    
    What changed:
    
    - adds a small `ProviderAuthScope` struct carrying the run auth policy
    and session source needed by provider-scoped auth resolution
    - lets `Session` opt the existing `ModelClient` into `ChatGptAuth`
    policy when `use_agent_identity` is enabled, without adding a second
    model-client constructor
    - resolves Agent Identity only for first-party OpenAI provider auth
    paths
    - uses the persisted run task id from the `AgentIdentityAuth` record to
    build `AgentAssertion` auth for Responses requests
    - routes shared request setup through scoped provider auth so unary
    compact requests use the same run-task assertion path as inference turns
    - keeps local/custom/Bedrock/env-key/external-bearer provider auth
    unchanged
    - lets missing run-task state surface through the existing model-request
    error path instead of silently falling back to bearer auth
    
    This PR intentionally does not create thread-scoped, target-scoped, or
    background-scoped task identities. The run task is the only task Codex
    registers in this POC shape.
    
    ## Testing
    
    - `just test -p codex-model-provider`
    - `just test -p codex-core client::tests::provider_auth_scope_uses`
    - `just test -p codex-core remote_compact_uses_agent_identity_assertion`
  • [codex] Use input items for Responses Lite tools (#27946)
    When using Responses Lite, we should all use `additional_tools` and a
    developer item instead of the top-level tools array and instructions
    field. This keeps things 1-to-1.
    
    Forced namespacing for _all_ tools will land in a following PR after
    some coordination & fixes in Responses API (around collisions & return
    items).
    
    The goal is to eventually expand the scope of this to _all_ requests
    from codex, but that will require larger coordination across providers &
    slower rollout.
  • Propagate safety buffering treatment metadata (#29473)
    ## Summary
    
    - read the request-scoped safety-buffering treatment from HTTP response
    headers and per-turn WebSocket metadata through one shared header parser
    - combine that treatment with Responses API safety-buffering signals
    - propagate `showBufferingUi` and nullable `fasterModel` through the
    existing `model/safetyBuffering/updated` app-server notification
    - update the app-server documentation and generated JSON and TypeScript
    schemas
    
    The public implementation contains no model mapping or real model
    identifier. Tests and protocol examples use generic `current-model` and
    `faster-model` placeholders only.
    
    ## Dependencies
    
    - server-side treatment evaluation:
    https://github.com/openai/openai/pull/1060247
    - initial Responses API safety-buffering propagation:
    https://github.com/openai/codex/pull/29371
    - Codex App UI: https://github.com/openai/openai/pull/1057789
    
    ## Validation
    
    - Codex API tests: 129 passed
    - focused Codex core safety-buffering integration test passed
    - app-server protocol tests passed after regenerating schema fixtures
    - Clippy fix and repository formatting completed successfully
    
    The broader app-server run compiled all changed crates and completed
    with 1,269 passing tests. Its remaining failures were unrelated
    environment limitations: macOS sandbox application was denied, one
    expected test binary was unavailable, and several existing subprocess
    tests timed out as a result.
  • chore: improve expired Bedrock credential errors (#28992)
    ## Why
    
    Amazon Bedrock returns a `401 Unauthorized` response containing
    `Signature expired:` when an AWS credential, including a short-lived
    `AWS_BEARER_TOKEN_BEDROCK`, has expired. Codex currently surfaces that
    response as a generic `unexpected status` error, which does not explain
    how to recover.
    
    Environment-provided bearer tokens cannot be refreshed automatically, so
    the error should direct users to refresh their AWS credentials or
    replace or remove the environment token and restart Codex. This
    classification belongs to the Amazon Bedrock provider so similar
    responses from other providers retain their existing behavior.
    
    ## What changed
    
    - Add a synchronous `ModelProvider::map_api_error` hook that defaults to
    the existing provider-neutral API error mapping, and route model
    request, stream, WebSocket, and terminal unauthorized errors through the
    active provider.
    - Override the hook for Amazon Bedrock. After preserving the structured
    status, body, URL, and request metadata, recognize `401` responses
    containing `Signature expired:` and attach actionable credential
    guidance.
    - Keep `codex-protocol` provider-neutral by representing the guidance as
    an optional `user_message`. Error rendering prefers this message while
    continuing to append the URL, request ID, Cloudflare ray, and
    authorization diagnostics.
    - Add model-provider coverage for expired signatures and negative cases,
    core coverage for provider dispatch after unauthorized recovery, and a
    TUI snapshot for the rendered error.
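    
    A minimal sketch of the hook shape described above, assuming a
    simplified `ApiError` struct. Only the `map_api_error` name, the `401` /
    `Signature expired:` check, and the optional `user_message` come from
    this PR; the field layout, provider types, and guidance wording are
    illustrative assumptions, not the actual implementation.
    
    ```rust
    /// Simplified stand-in for the structured API error (hypothetical shape).
    #[derive(Debug, Clone, PartialEq)]
    pub struct ApiError {
        pub status: u16,
        pub body: String,
        /// Optional provider-supplied guidance that error rendering prefers.
        pub user_message: Option<String>,
    }
    
    pub trait ModelProvider {
        /// Default: the provider-neutral mapping leaves the error unchanged.
        fn map_api_error(&self, error: ApiError) -> ApiError {
            error
        }
    }
    
    /// Hypothetical provider that keeps the default mapping.
    pub struct GenericProvider;
    impl ModelProvider for GenericProvider {}
    
    /// Hypothetical Bedrock provider that overrides the hook.
    pub struct BedrockProvider;
    impl ModelProvider for BedrockProvider {
        fn map_api_error(&self, mut error: ApiError) -> ApiError {
            // Status and body are preserved; only the guidance is attached.
            if error.status == 401 && error.body.contains("Signature expired:") {
                error.user_message = Some(
                    "AWS credentials have expired. Refresh them, or replace or \
                     remove the environment token, then restart Codex."
                        .to_string(),
                );
            }
            error
        }
    }
    ```
    
    With this shape, the same `401` body maps to guidance only when the
    Bedrock provider is active, so other providers keep their existing
    behavior.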
    
    ## Testing
    Tested with a real request using an expired Bedrock key:
    <img width="962" height="126" alt="Screenshot 2026-06-22 at 3 56 51 PM"
    src="https://github.com/user-attachments/assets/7e21cc7c-798e-4662-8467-7f304a2f2b59"
    />
  • core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
    ## Description
    This PR cuts Codex over from generic `ResponseItem.metadata` (introduced
    here: https://github.com/openai/codex/pull/28355) to
    `ResponseItem.internal_chat_message_metadata_passthrough`, which is the
    blessed path and has strongly-typed keys.
    
    For now we have to drop this MAv2 usage of `metadata`:
    https://github.com/openai/codex/pull/28561 until we figure out where
    that should live.
  • Stop logging every Responses WebSocket event (#29432)
    ## Why
    
    Every successful Responses WebSocket event currently produces three
    local log records: the full payload at TRACE, an OpenTelemetry log
    event, and an OpenTelemetry trace event.
    
    On busy threads these records fill the 1,000-row log partition in
    seconds and cause continuous SQLite insert-and-prune churn.
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1782128972644209
    
    ## What changed
    
    - Stop logging each successful Responses WebSocket payload at TRACE.
    - Stop emitting `codex.websocket_event` as OpenTelemetry log and trace
    events.
    - Keep WebSocket event counters, duration metrics, response timing
    metrics, parsing, and error handling.
  • Propagate safety buffering events to app-server clients (#29371)
    Responses API safety buffering metadata currently stops at the transport
    boundary, so app-server clients cannot render the in-progress safety
    review state.
    
    This change:
    - decodes and deduplicates `safety_buffering` metadata from Responses
    API SSE and WebSocket events without suppressing the original response
    event
    - emits a typed core event containing the requested model plus backend
    use cases and reasons
    - forwards that event as `turn/safetyBuffering/updated` through
    app-server v2 and updates generated protocol schemas
    - keeps the side-channel event out of persisted rollouts and turn timing
    
    This supports the Codex Apps buffering UX and depends on the Responses
    API backend work in https://github.com/openai/openai/pull/1044569 and
    https://github.com/openai/openai/pull/1044571.
    
    Validation:
    - focused `codex-core` safety-buffering integration test passes
    - `cargo check -p codex-core -p codex-app-server -p
    codex-app-server-protocol`
    - `just fix -p codex-api -p codex-protocol -p codex-core -p
    codex-app-server-protocol -p codex-app-server -p codex-rollout -p
    codex-rollout-trace -p codex-otel`
    - `just fmt`
    - broad package test run: 4,430/4,492 passed; 62 unrelated
    local-environment/concurrency failures involved unavailable test
    binaries, MCP subprocess setup, and app-server timeouts
  • Use cached and live web access terminology (#29095)
    ## Summary
    
    - Rename the string-valued external web access enum variants from
    `Offline` / `Online` to `Cached` / `Live`.
    - Align the transport names with the existing `web_search = "cached"` /
    `"live"` configuration vocabulary.
    
    Existing behavior is unchanged: `WebSearchMode::Cached` and
    `WebSearchMode::Live` continue to send the backward-compatible boolean
    values `false` and `true`; `Indexed` remains the only mode currently
    sent as a string.
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-api` (127 passed)
  • Add indexed web search mode (#28489)
    ## Summary
    
    - Add `web_search = "indexed"` alongside `disabled`, `cached`, and
    `live`.
    - Use that same resolved mode for both hosted and standalone web search.
    - For hosted search, send `index_gated_web_access: true` with external
    web access enabled only when `indexed` is selected.
    - For standalone search, preserve the existing boolean wire values for
    existing modes (`cached` maps to `false` and `live` to `true`) and send
    `"indexed"` only for `indexed`; `disabled` keeps the tool unavailable.
    - Carry the mode through managed configuration requirements and
    generated schemas.
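    
    The standalone wire mapping above can be sketched roughly as follows.
    The `ExternalWebAccess` type and the `standalone_wire_value` function
    are hypothetical names for illustration; only the mode names and the
    `false` / `true` / `"indexed"` values come from this summary.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum WebSearchMode {
        Disabled,
        Cached,
        Live,
        Indexed,
    }
    
    /// Hypothetical wire representation: a legacy boolean or a named mode.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ExternalWebAccess {
        Bool(bool),
        Named(&'static str),
    }
    
    /// Returns `None` when the tool should stay unavailable.
    pub fn standalone_wire_value(mode: WebSearchMode) -> Option<ExternalWebAccess> {
        match mode {
            WebSearchMode::Disabled => None,
            // Existing modes keep their backward-compatible boolean values.
            WebSearchMode::Cached => Some(ExternalWebAccess::Bool(false)),
            WebSearchMode::Live => Some(ExternalWebAccess::Bool(true)),
            // Only the new mode is sent as a string.
            WebSearchMode::Indexed => Some(ExternalWebAccess::Named("indexed")),
        }
    }
    ```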
    
    ## Why
    
    Indexed search provides a middle ground between cached-only search and
    unrestricted live page fetching. Search queries can remain live while
    direct page fetches are limited to URLs admitted by the server.
    
    The existing `web_search` setting remains the single source of truth, so
    hosted and standalone executors cannot drift into different access
    modes. Without an explicit `indexed` selection, the existing
    model-visible tool and request shapes are unchanged.
    
    ```toml
    web_search = "indexed"
    
    [features]
    standalone_web_search = true
    ```
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-api` (`126 passed`)
    - `just test -p codex-web-search-extension` (`7 passed`)
    - `just test -p codex-core
    code_mode_can_call_indexed_standalone_web_search` (`1 passed`)
    - Focused configuration, hosted request, standalone request, and
    managed-requirement coverage is included in the PR; remaining suites run
    in CI.
    
    The full workspace test suite was not run locally.
  • [codex] Assign response item IDs when recording history (#28814)
    ## Why
    
    Client-created response items enter history without IDs, so their
    identity is lost across rollout persistence and resume. IDs should be
    assigned once at the history-recording boundary, while IDs returned by
    the server must remain unchanged.
    
    The Responses API validates item IDs using type-specific prefixes.
    Locally generated IDs therefore use the matching prefix plus a
    hyphenated UUIDv7, keeping them valid while distinguishable from
    server-generated IDs. Because this changes persisted history and
    provider request shapes, the behavior is opt-in behind the
    under-development `item_ids` feature. Compaction triggers remain request
    controls whose API shape does not accept an ID.
    
    ## What changed
    
    - Register the disabled-by-default `item_ids` feature and expose it in
    `config.schema.json`.
    - Make supported optional `ResponseItem` IDs serializable and expose
    them in the generated app-server schemas.
    - When `item_ids` is enabled, assign an ID during conversation-history
    preparation if an item has no ID.
    - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API
    item conventions.
    - Preserve existing server IDs without rewriting them.
    - Persist assigned IDs in rollouts and include them in subsequent
    Responses requests.
    - Remove the unsupported ID field from `CompactionTrigger` and document
    why it has no ID.
    - Add integration coverage for enabled ID persistence, preservation of
    server IDs, and omission of generated IDs while the feature is disabled.
    
    `prepare_conversation_items_for_history` is the single response-item ID
    allocation boundary.
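    
    A rough sketch of the assign-if-missing rule, under stated assumptions:
    the `Item` struct, the prefix table, the `_` separator, and the injected
    `new_uuid` generator are all hypothetical. The real code generates a
    hyphenated UUIDv7 and uses the Responses API prefix conventions, which
    are not reproduced here.
    
    ```rust
    /// Simplified stand-in for a response item (hypothetical shape).
    #[derive(Debug, Clone, PartialEq)]
    pub struct Item {
        pub kind: &'static str,
        pub id: Option<String>,
    }
    
    /// Illustrative prefix table; the actual prefixes are an assumption here.
    fn prefix_for(kind: &str) -> &'static str {
        match kind {
            "message" => "msg",
            "reasoning" => "rs",
            _ => "item",
        }
    }
    
    /// Assigns IDs only when the feature is enabled and an item has none.
    /// `new_uuid` stands in for a UUIDv7 generator.
    pub fn assign_missing_ids(
        items: &mut [Item],
        item_ids_enabled: bool,
        mut new_uuid: impl FnMut() -> String,
    ) {
        if !item_ids_enabled {
            return;
        }
        for item in items.iter_mut() {
            // Server-provided IDs are preserved without rewriting.
            if item.id.is_none() {
                item.id = Some(format!("{}_{}", prefix_for(item.kind), new_uuid()));
            }
        }
    }
    ```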
    
    ## Test plan
    
    - `just test -p codex-features`
    - `just test -p codex-core
    response_item_ids_persist_across_resume_and_preserve_server_ids`
    - `just test -p codex-core
    non_openai_responses_requests_omit_item_turn_metadata`
    - `just test -p codex-core
    resize_all_images_prepares_failures_before_history_insertion`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
  • Always use AVAS for realtime WebRTC calls (#28856)
    ## Summary
    
    - Remove the realtime `architecture` selector from core protocol,
    app-server protocol, config parsing, generated schemas, and callers.
    - Always create WebRTC realtime calls with the AVAS query params:
    `intent=quicksilver&architecture=avas`.
    - Keep direct websocket realtime behavior on the existing config/default
    path, while WebRTC starts without an explicit version now default to
    realtime v1 because AVAS requires v1.
    
    ## Notes
    
    - WebRTC realtime now means AVAS. If a caller explicitly asks to start
    WebRTC with realtime v2, Codex rejects that request because the AVAS
    WebRTC path only supports realtime v1. Websocket realtime is separate
    and can still use realtime v2.
    - The old `[realtime] architecture = "realtimeapi" | "avas"` config knob
    is removed. Local configs that still set it will need to delete that
    line.
    - Some app-server tests that were only trying to exercise realtime v2
    protocol behavior now use websocket transport, because WebRTC is
    intentionally locked to AVAS/v1. Separate WebRTC tests cover the AVAS
    query params, v1 startup, SDP flow, and sideband join.
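    
    The version rules in these notes can be sketched as a small resolver.
    The enum and function names below are hypothetical and the error type
    is simplified to a `String`; only the rules themselves (WebRTC is
    AVAS/v1 only, websocket keeps the config/default path) and the query
    string come from this PR.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum RealtimeTransport {
        WebRtc,
        WebSocket,
    }
    
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum RealtimeVersion {
        V1,
        V2,
    }
    
    /// Query params always sent when creating a WebRTC realtime call.
    pub const AVAS_CALL_QUERY: &str = "intent=quicksilver&architecture=avas";
    
    pub fn resolve_version(
        transport: RealtimeTransport,
        requested: Option<RealtimeVersion>,
        configured_default: RealtimeVersion,
    ) -> Result<RealtimeVersion, String> {
        match (transport, requested) {
            // An explicit v2 request over WebRTC is rejected.
            (RealtimeTransport::WebRtc, Some(RealtimeVersion::V2)) => {
                Err("WebRTC realtime uses AVAS, which only supports v1".to_string())
            }
            // WebRTC without an explicit version defaults to v1.
            (RealtimeTransport::WebRtc, _) => Ok(RealtimeVersion::V1),
            // Websocket honors the request, else the configured default.
            (RealtimeTransport::WebSocket, Some(version)) => Ok(version),
            (RealtimeTransport::WebSocket, None) => Ok(configured_default),
        }
    }
    ```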
    
    ## Validation
    
    - Merged fresh `origin/main` at `83e6a786a2`.
    - `just fmt`
    - `just write-config-schema`
    - `just write-app-server-schema`
    - `git diff --check`
    - `just test -p codex-api -p codex-core -p codex-app-server-protocol -p
    codex-app-server realtime` (176 passed)
    - `just test -p codex-protocol -p codex-config` (413 passed)
  • [codex] Support assistant realtime append text (#28836)
    ## Why
    
    Frontend realtime voice continuity needs to replay a tiny
    previous-session overlap as actual conversation items, including
    assistant text. The app-server `thread/realtime/appendText` API already
    carries a role through to the Rust realtime websocket layer, but the
    shared role enum only accepted `user` and `developer`.
    
    ## What Changed
    
    - Added `assistant` to `ConversationTextRole` and regenerated the
    app-server schema/type fixtures.
    - Added `output_text` as a realtime conversation content type.
    - Updated realtime websocket item creation so assistant appendText emits
    `content: [{ type: "output_text", text }]`, while user and developer
    continue to emit `input_text`.
    - Updated app-server docs and tests to cover assistant appendText
    alongside the existing developer role behavior.
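    
    As an illustration of the role-to-content-type rule above, a minimal
    sketch follows. `ConversationTextRole` is named in this PR, but the
    `content_type_for` helper is a hypothetical name and the real item
    construction is more involved.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ConversationTextRole {
        User,
        Developer,
        Assistant,
    }
    
    /// Picks the realtime content type emitted for an appendText role.
    pub fn content_type_for(role: ConversationTextRole) -> &'static str {
        match role {
            // Assistant text is replayed as model output.
            ConversationTextRole::Assistant => "output_text",
            // User and developer text continue to be sent as input.
            ConversationTextRole::User | ConversationTextRole::Developer => "input_text",
        }
    }
    ```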
    
    ## Validation
    
    - `just write-app-server-schema`
    - `just fmt` (first sandboxed attempt failed because `uv` could not
    access `~/.cache/uv`; reran with filesystem access and passed)
    - `just test -p codex-api` passed: 126/126
    - `just test -p codex-app-server-protocol` passed: 239/239, including
    generated JSON/TypeScript fixture checks
    - `just test -p codex-app-server` was started locally but stopped per
    request after unrelated local sandbox/Seatbelt failures (`sandbox-exec:
    sandbox_apply: Operation not permitted`) and one missing local `codex`
    binary failure; CI should be faster and more authoritative for the full
    suite.
  • [codex] Add optional IDs to response items (#28812)
    ## Why
    
    `ResponseItem` variants do not have a consistent internal ID shape: some
    variants carry required IDs, some carry optional IDs, and some cannot
    represent an ID at all. The existing fields also use inconsistent serde,
    TypeScript, and JSON-schema annotations. A single enum-level access path
    is needed before history recording can assign and retain IDs.
    
    This PR establishes that internal model only. It intentionally does not
    generate or serialize IDs; allocation and wire persistence are isolated
    in the stacked follow-up.
    
    ## What changed
    
    - Give every concrete `ResponseItem` variant an `Option<String>` ID
    field.
    - Apply the same internal-only annotations to every ID field:
    `#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and
    `#[schemars(skip)]`.
    - Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared
    accessors.
    - Preserve IDs when history items are rewritten for truncation.
    - Adapt consumers that previously assumed reasoning and image-generation
    IDs were required.
    - Regenerate app-server schemas so the hidden fields are represented
    consistently.
    
    The serde catch-all `ResponseItem::Other` remains ID-less because it
    must remain a unit variant.
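    
    A simplified sketch of the shared accessors, assuming a two-variant
    enum for brevity. The variant fields and the exact accessor signatures
    are assumptions; only `id()`, `set_id()`, the `Option<String>` ID
    field, and the ID-less `Other` unit variant come from this PR.
    
    ```rust
    #[derive(Debug, Clone, PartialEq)]
    pub enum ResponseItem {
        Message { id: Option<String>, text: String },
        Reasoning { id: Option<String>, summary: String },
        /// Catch-all unit variant; it cannot carry an ID.
        Other,
    }
    
    impl ResponseItem {
        /// Single enum-level read path for the optional ID.
        pub fn id(&self) -> Option<&str> {
            match self {
                ResponseItem::Message { id, .. } | ResponseItem::Reasoning { id, .. } => {
                    id.as_deref()
                }
                ResponseItem::Other => None,
            }
        }
    
        /// Single enum-level write path; a no-op for `Other`.
        pub fn set_id(&mut self, new_id: Option<String>) {
            match self {
                ResponseItem::Message { id, .. } | ResponseItem::Reasoning { id, .. } => {
                    *id = new_id;
                }
                ResponseItem::Other => {}
            }
        }
    }
    ```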
    
    ## Test plan
    
    - `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace
    -p codex-image-generation-extension`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api -p codex-rollout-trace -p
    codex-image-generation-extension`
    - `just test -p codex-core event_mapping`
  • [codex] Route MCP file uploads through environment filesystem (#27923)
    ## Why
    
    Codex Apps tools can mark arguments with `openai/fileParams`, but the
    execution path resolved and opened those files directly on the host.
    That bypassed the selected turn environment and prevented annotated file
    arguments from working with remote environments.
    
    ## What changed
    
    - resolve annotated file arguments against the primary turn environment
    - read file metadata and contents through that environment's sandboxed
    `ExecutorFileSystem`
    - reject files over the 512 MiB limit from metadata before reading or
    transferring them
    - retain the buffered upload-size check as defense in depth
    - make the OpenAI upload API accept a filename and buffered contents
    instead of owning local filesystem access
    - describe the model-visible argument as a path in the primary
    environment
    
    This builds on #27927, which added `size` to internal filesystem
    metadata.
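    
    The two size checks above might look roughly like this. The
    `EnvFileSystem` trait is a hypothetical stand-in for the sandboxed
    `ExecutorFileSystem`, whose real interface is not shown in this PR
    description, and the limit is passed as a parameter only to keep the
    sketch easy to exercise.
    
    ```rust
    use std::cell::Cell;
    
    pub const MAX_UPLOAD_BYTES: u64 = 512 * 1024 * 1024;
    
    /// Hypothetical stand-in for the environment's filesystem interface.
    pub trait EnvFileSystem {
        fn size(&self, path: &str) -> Result<u64, String>;
        fn read(&self, path: &str) -> Result<Vec<u8>, String>;
    }
    
    pub fn read_for_upload(
        fs: &dyn EnvFileSystem,
        path: &str,
        limit: u64,
    ) -> Result<Vec<u8>, String> {
        // Reject from metadata before reading or transferring anything.
        let size = fs.size(path)?;
        if size > limit {
            return Err(format!("{path} is {size} bytes, over the {limit}-byte limit"));
        }
        let contents = fs.read(path)?;
        // Defense in depth: re-check the buffered size after reading.
        if contents.len() as u64 > limit {
            return Err(format!("{path} exceeded the {limit}-byte limit after read"));
        }
        Ok(contents)
    }
    
    /// In-memory filesystem used only to exercise the sketch.
    pub struct InMemoryFs {
        pub reported_size: u64,
        pub contents: Vec<u8>,
        pub reads: Cell<usize>,
    }
    
    impl EnvFileSystem for InMemoryFs {
        fn size(&self, _path: &str) -> Result<u64, String> {
            Ok(self.reported_size)
        }
        fn read(&self, _path: &str) -> Result<Vec<u8>, String> {
            self.reads.set(self.reads.get() + 1);
            Ok(self.contents.clone())
        }
    }
    ```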
    
    ## Testing
    
    - `just test -p codex-api upload_openai_file_returns_canonical_uri`
    - `just test -p codex-mcp
    tool_with_model_visible_input_schema_masks_file_params`
    - `just test -p codex-core mcp_openai_file`
    - `just test -p codex-core
    codex_apps_file_params_upload_environment_files_before_mcp_tool_call`
  • feat(core): add metadata field to ResponseItem (#28355)
    ## Description
    
    This PR adds an optional `metadata` field to `ResponseItem` for
    Responses API calls. This is mechanical plumbing only; no values are
    populated or sent yet. Even adding a single new field to `ResponseItem`
    turns out to have a large blast radius.
    
    This change is backwards compatible because `metadata` is optional and
    omitted when absent, so existing response items and rollout history
    without it still deserialize and requests that do not set it keep the
    same wire shape. For provider compatibility, we strip out `metadata`
    before non-OpenAI Responses requests so Azure and AWS Bedrock never see
    this field.
    
    My follow-up PR will use this field to start storing and passing along
    `turn_id`: https://github.com/openai/codex/pull/28360
    
    ## What changed
    
    - Added `ResponseItemMetadata` with optional `turn_id`, plus optional
    `metadata` on Responses API item variants and inter-agent communication.
    - Preserved item metadata through response-item rewrites such as
    truncation, missing tool-output synthesis, compaction history
    rebuilding, visible-history conversion, rollout/resume, and generated
    app-server schemas/types.
    - Strip item metadata from non-OpenAI Responses requests while
    preserving it for OpenAI-shaped requests.
    - Updated the mechanical fixture/test construction churn required by the
    new optional field.
  • reuse encoded Responses request bodies (#28327)
    ## Why
    
    Responses HTTP requests were converted from `ResponsesApiRequest` into a
    full `serde_json::Value`. `EndpointSession` then deep-cloned that value
    for each retry, and the transport serialized and compressed it again
    before every send.
    
    Large histories make those copies expensive. Retry attempts should reuse
    the same immutable request bytes.
    
    ## What
    
    - Serialize standard Responses requests directly into a ref-counted
    `EncodedJsonBody`.
    - Preserve the Azure path that attaches item IDs before encoding.
    - Prepare JSON, compression, and derived content headers once before the
    retry loop.
    - Clone the prepared request per attempt so body clones only bump the
    `Bytes` reference count.
    - Keep auth inside the retry loop. Signing auth sees the exact final
    headers and body bytes that the transport sends.
    - Preserve request-body TRACE output. With TRACE plus compression,
    retain the original JSON bytes for logging; normal requests keep only
    the final wire bytes.
    - Leave non-Responses endpoint bodies on the existing `Value` path.
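    
    A sketch of the prepare-once, clone-per-attempt idea, using
    `Arc<[u8]>` from the standard library as a stand-in for the
    ref-counted `Bytes` body. `PreparedRequest`, `prepare`, and
    `send_with_retries` are hypothetical names, and the per-attempt
    "signing" is a placeholder header rather than real auth.
    
    ```rust
    use std::sync::Arc;
    
    /// Hypothetical prepared request: headers plus shared immutable body.
    #[derive(Clone)]
    pub struct PreparedRequest {
        pub headers: Vec<(String, String)>,
        pub body: Arc<[u8]>,
    }
    
    /// Encodes the body and derived content headers once, before any retry.
    pub fn prepare(json: &str) -> PreparedRequest {
        let body: Arc<[u8]> = Arc::from(json.as_bytes());
        PreparedRequest {
            headers: vec![("content-length".to_string(), body.len().to_string())],
            body,
        }
    }
    
    /// Returns the number of attempts used. `send` reports success.
    pub fn send_with_retries(
        prepared: &PreparedRequest,
        max_attempts: usize,
        mut send: impl FnMut(PreparedRequest) -> bool,
    ) -> usize {
        for attempt in 1..=max_attempts {
            // Cloning only bumps the reference count of the body.
            let mut request = prepared.clone();
            // Auth stays inside the loop so it sees the final headers and body.
            request
                .headers
                .push(("authorization".to_string(), format!("signed-attempt-{attempt}")));
            if send(request) {
                return attempt;
            }
        }
        max_attempts
    }
    ```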
    
    ## Performance
    
    A temporary release-mode measurement used a 10 MiB JSON body and 10
    retry preparations:
    
    - old `Value` clone + serialize path: 30 ms total
    - prepared shared-byte path: less than 1 ms total
    
    That is about 3 ms avoided per retry for this payload on the test
    machine. Each retry also stops allocating another request-sized JSON
    tree and serialized buffer. Without TRACE, compressed requests retain
    only the final compressed wire bytes.
    
    ## Validation
    
    - `just test -p codex-client` — 28 passed
    - `just test -p codex-api` — 125 passed
    - `just fix -p codex-client`
    - `just fix -p codex-api`
  • serialize websocket requests directly (#28323)
    ## Why
    
    Responses WebSocket requests were encoded in two steps: first into a
    full `serde_json::Value`, then again into the JSON string sent over the
    socket.
    
    That walks the full request twice and keeps an extra JSON tree alive.
    These requests can contain the complete conversation history and tool
    schemas, so the extra work grows with the request size.
    
    ## What changed
    
    - serialize `ResponsesWsRequest` directly to the wire string
    - pass that string through the existing WebSocket stream and send path
    - keep the existing error mapping, tracing, send timeout, and telemetry
    behavior
    - compare the new wire JSON with the previous `to_value` payload in a
    focused test
    
    ## Performance
    
    I measured both paths in an optimized temporary test using a
    6,324,180-byte request: 4 MiB of history plus 256 tools with 8 KiB
    descriptions. Each path ran 100 times.
    
    - previous `to_value` + `to_string`: 209 ms total, 2.09 ms per request
    - direct `to_string`: 174 ms total, 1.74 ms per request
    - difference: about 17% faster, or 0.35 ms per request
    
    The direct path also removes one full temporary `serde_json::Value`
    tree. For this mostly string-backed payload, that avoids roughly one
    payload-sized copy plus the JSON node overhead. The exact memory saving
    depends on the request shape.
    
    The temporary benchmark was removed before committing.
    
    ## Validation
    
    - `just test -p codex-api` — 125 passed
    - `just fix -p codex-api`
  • [codex] Send turn state through compact requests (#28002)
    ## Context
    
    Inline compaction is part of the active logical turn. Compact requests
    and the sampling requests around them should use the same turn state,
    including when compaction is the first request to establish it.
    
    ## Change
    
    Pass the turn-scoped `OnceLock` directly to inline v1 compaction so
    `/responses/compact` includes an established value in the existing HTTP
    header. Capture `x-codex-turn-state` from the compact response into that
    same lock, allowing pre-turn compact to establish the value that
    subsequent sampling reuses.
    
    V2 compact already uses the normal Responses HTTP/WebSocket path and
    continues to share the same `OnceLock` without separate plumbing. The
    first returned value wins for the logical turn.
    
    ## Test plan
    
    Integration coverage verifies that:
    
    - pre-turn v1 compact can establish state for the first sampling request
    - inline v1 compact receives established state over HTTP
    - inline v2 compact reuses established state over HTTP
    - inline v2 compact reuses established state over WebSocket
    
    CI validates the full change.
  • [codex] Send request-scoped turn state over WebSocket (#27996)
    ## Context
    
    Turn state is scoped to one logical turn, but the WebSocket path
    currently exchanges it through upgrade headers, which are scoped to the
    physical connection. A connection may be reused across turns, so its
    handshake cannot represent the turn lifecycle reliably.
    
    ## Change
    
    Exchange turn state on each WebSocket response request instead:
    
    - send an established value in `response.create.client_metadata`
    - read the returned value from the existing `response.metadata` event
    - retain the first value in the turn-scoped `ModelClientSession`
    `OnceLock`
    - start the next logical turn without state, even when it reuses the
    same WebSocket connection
    
    This gives WebSocket requests the same first-value-wins contract as the
    existing HTTP path.
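    
    The first-value-wins contract can be illustrated with a small
    `OnceLock` wrapper. `TurnState`, `outgoing`, and `record` are
    hypothetical names for this sketch; the actual state lives in the
    turn-scoped `ModelClientSession`.
    
    ```rust
    use std::sync::OnceLock;
    
    /// Hypothetical per-turn holder; a new one is created for each logical turn.
    #[derive(Default)]
    pub struct TurnState {
        value: OnceLock<String>,
    }
    
    impl TurnState {
        /// Value to send in request metadata, if one is already established.
        pub fn outgoing(&self) -> Option<&str> {
            self.value.get().map(String::as_str)
        }
    
        /// Records a value returned by the server; later values are ignored.
        pub fn record(&self, returned: &str) {
            let _ = self.value.set(returned.to_string());
        }
    }
    ```
    
    Starting the next logical turn with a fresh `TurnState` resets the
    value without touching the underlying connection.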
    
    ## Test plan
    
    Integration coverage verifies that:
    
    - WebSocket replays returned state on same-turn follow-ups
    - later response metadata does not replace the first value
    - state resets at the logical turn boundary without requiring a
    reconnect
    
    CI validates the full change.
    
    ## Stack
    
    This is 1/2. #28002 builds on this request-scoped transport to carry
    established state through compact requests.
  • [codex] add roles to realtime append text (#27936)
    ## Summary
    
    Add an explicit `user` or `developer` role to
    `thread/realtime/appendText` and propagate it through the realtime input
    queue into `conversation.item.create`. Older JSON clients that omit the
    field continue to default to `user`.
    
    This lets app-provided context such as memory retain developer authority
    without bypassing app-server through a renderer-owned data channel. The
    app-server schemas, API documentation, and focused protocol and
    websocket coverage are updated with the new contract.
    
    The Codex Apps consumer is tracked in
    [openai/openai#1025261](https://github.com/openai/openai/pull/1025261).
  • realtime: add AVAS architecture override (#27720)
    ## Summary
    
    Adds a `RealtimeConversationArchitecture` option for realtime
    conversation startup, with `realtimeapi` as the default and `avas` as an
    opt-in architecture.
    
    The AVAS path is limited to realtime v1 conversational WebRTC starts,
    and WebRTC call creation appends `intent=quicksilver&architecture=avas`
    to `/v1/realtime/calls`. The existing sideband websocket still joins by
    `call_id`.
    
    This also exposes the per-session architecture override through
    app-server v2 `thread/realtime/start` params and updates the config
    schema for `[realtime].architecture`.
    
    ## Validation
    
    - `just fmt`
    - `just write-config-schema`
    - `just test -p codex-api sends_avas_session_call_query_params`
    - `just test -p codex-core -E
    'test(~conversation_webrtc_start_uses_avas_architecture_query)'`
    - `just test -p codex-core -E 'test(realtime_loads_from_config_toml)'`
    - `just test -p codex-app-server-protocol -E
    'test(~serialize_thread_realtime_start) |
    test(generated_ts_optional_nullable_fields_only_in_params)'`
    - `just test -p codex-app-server -E
    'test(realtime_webrtc_start_emits_sdp_notification)'`
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
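    
    The two replacement patterns can be sketched as below. The trait and
    type names are hypothetical, and the tiny `block_on` helper exists only
    so the sketch can be exercised without an async runtime; it is not part
    of this change.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Statically dispatched: return-position `impl Future + Send`.
    pub trait Fetch {
        fn fetch(&self, key: &str) -> impl Future<Output = String> + Send;
    }
    
    /// Object-safe variant: an explicit boxed `Send` future.
    pub trait DynFetch: Send + Sync {
        fn fetch_boxed<'a>(
            &'a self,
            key: &'a str,
        ) -> Pin<Box<dyn Future<Output = String> + Send + 'a>>;
    }
    
    pub struct Echo;
    
    impl Fetch for Echo {
        fn fetch(&self, key: &str) -> impl Future<Output = String> + Send {
            let owned = key.to_string();
            async move { format!("echo:{owned}") }
        }
    }
    
    impl DynFetch for Echo {
        fn fetch_boxed<'a>(
            &'a self,
            key: &'a str,
        ) -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
            Box::pin(async move { format!("echo:{key}") })
        }
    }
    
    struct NoopWake;
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Minimal polling loop, sufficient for futures that never wait on I/O.
    pub fn block_on<F: Future>(future: F) -> F::Output {
        let mut future = std::pin::pin!(future);
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        loop {
            if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
                return output;
            }
            std::thread::yield_now();
        }
    }
    ```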
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • Forward standalone assistant output to realtime (#27319)
    ## Why
    
    When a realtime session is open without an active frontend-model
    handoff, completed Codex assistant messages are currently dropped. That
    prevents the frontend model from hearing orchestrator preambles and
    final responses produced by typed turns or other non-handoff work, which
    makes the two models present as disconnected personas.
    
    Active handoffs already forward each completed assistant message,
    including preambles. This change leaves those V1 and V2 paths intact and
    fills only the no-active-handoff gap.
    
    ## What changed
    
    - Send standalone V1 assistant messages through
    `conversation.handoff.append` with a stable synthetic handoff ID
    - Send standalone V2 assistant messages as normal `[BACKEND]`
    `conversation.item.create` message items, then enqueue `response.create`
    so the frontend model responds
    - Preserve the existing active V1 and V2 transport and completion
    behavior
    - Continue excluding user messages from realtime mirroring
    - Skip empty output and cap each complete context injection, including
    its V2 prefix, at 1,000 tokens
    - Add end-to-end coverage for both wire formats, V2 response creation,
    preambles, final responses, and truncation
    
    ## Test plan
    
    - CI
  • [codex] Enable standalone web search in code mode (#26719)
    ## What
    
    - Consume plaintext `output` from standalone search while retaining
    optional `encrypted_output` parsing.
    - Expose `web.run` to code mode and return search output to nested
    JavaScript calls.
    - Cover direct and code-mode standalone search paths with integration
    tests.
    
    ## Why
    
    `/v1/alpha/search` now returns plaintext output, which code mode needs
    to consume standalone search results.
    
    ## Test plan
    
    - `just test -p codex-api`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-core code_mode_can_call_standalone_web_search`
    - `just test -p codex-app-server
    standalone_web_search_round_trips_output`
  • [codex] Forward turn moderation metadata through app-server (#25710)
    ## Why
    First-party backends can supply turn-scoped moderation metadata that
    app-server clients need for client-side presentation. Exposing this as
    an experimental typed notification lets opted-in clients consume it
    without interpreting raw Responses API events.
    
    ## What changed
    - forward `response.metadata.openai_chatgpt_moderation_metadata` from
    Responses API SSE and WebSocket streams as turn-scoped moderation
    metadata
    - emit the experimental app-server v2 `turn/moderationMetadata`
    notification with `{ threadId, turnId, metadata }`
    - add app-server integration coverage for the typed moderation metadata
    notification
    
    ## Testing
    - `just test -p codex-core
    build_ws_client_metadata_includes_window_lineage_and_turn_metadata`
    - `just test -p codex-core` (fails locally: 46 failures and 1 timeout,
    primarily missing `test_stdio_server` and shell snapshot timeouts)
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    turn_moderation_metadata_emits_typed_notification_v2`
    - `just test -p codex-app-server` (fails locally: 792 passed, 10 failed,
    and 5 timed out; failures are in existing environment-sensitive tests,
    primarily because nested macOS `sandbox-exec` is not permitted)
    - `just write-app-server-schema --experimental --schema-root
    /tmp/codex-app-server-schema-experimental`
  • [codex] Add use_responses_lite 'override' logic (#26487)
    ## Summary
    
    - add a defaulted `ModelInfo.use_responses_lite` catalog field
    - support serializing `reasoning.context` while preserving the existing
    effort and summary path
    - not yet turned on for any model
    
    I've added an override that disables parallel tool calls when
    `use_responses_lite` is on. I've also forced persistent reasoning when
    using `use_responses_lite`. It would be ideal if we could centralize all
    the `responses_lite` plumbing, but I think this is best for now to keep
    the plumbing and diffs small.
    
    ## Testing
    
    - `cargo test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-core
    responses_lite_sets_all_turns_context_and_disables_parallel_tool_calls`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-core
    configured_reasoning_summary_is_sent`
    - `cargo check -p codex-core --tests`
    - `RUST_MIN_STACK=8388608 cargo clippy -p codex-core --tests` (passes
    with pre-existing warnings in `codex-code-mode` and
    `codex-core-plugins`)
  • Remove response.processed websocket request (#26447)
    ## Why
    
    The Responses websocket client no longer needs to send a follow-up
    `response.processed` request after a turn response has already been
    recorded. Keeping that extra acknowledgement path adds feature-gated
    control flow and a second websocket request shape that no longer carries
    useful behavior.
    
    ## What Changed
    
    - Removed the `response.processed` websocket request type and sender.
    - Removed the `responses_websocket_response_processed` feature flag and
    schema entry.
    - Removed turn and remote-compaction plumbing that only tracked response
    IDs to send the acknowledgement.
    - Removed tests that existed solely to cover the deleted feature path.
    
    ## Validation
    
    - `just fix -p codex-core -p codex-api -p codex-features`
  • feat: show enterprise monthly credit limits in status (#24812)
    ## Summary
    
    Enterprise users can have an effective monthly credit limit, but Codex
    `/status` currently drops that metadata from the account-usage response.
    
    This change adds the optional `spend_control.individual_limit`
    projection to the existing rate-limit snapshot flow. The backend client
    reads the monthly limit, app-server exposes it as `individualLimit`, and
    the TUI renders a `Monthly credit limit` row through the existing
    progress-bar renderer.
    
    When the backend does not return an effective monthly limit, existing
    rate-limit behavior is unchanged.
    
    ## Existing backend state
    
    The account-usage backend already returns the effective monthly limit
    and current usage together:
    
    ```json
    {
      "spend_control": {
        "reached": false,
        "individual_limit": {
          "limit": "25000",
          "used": "8000",
          "remaining": "17000",
          "used_percent": 32,
          "remaining_percent": 68,
          "reset_after_seconds": 86400,
          "reset_at": 1778137680
        }
      }
    }
    ```
    
    Before this change, Codex projected rolling `primary` and `secondary`
    windows plus `credits`. It ignored `spend_control.individual_limit`, so
    app-server clients and `/status` could not render the monthly cap.
    
    The updated flow is:
    
    ```text
    account usage backend
      -> backend-client reads spend_control.individual_limit
      -> existing rate-limit snapshot carries optional individual_limit
      -> app-server exposes optional individualLimit
      -> TUI renders Monthly credit limit
    ```
    
    ## App-server contract
    
    `account/rateLimits/read` and sparse `account/rateLimits/updated`
    notifications now include an additive nullable
    `rateLimits.individualLimit` field:
    
    ```json
    {
      "individualLimit": {
        "limit": "25000",
        "used": "8000",
        "remainingPercent": 68,
        "resetsAt": 1778137680
      }
    }
    ```
    
    In an `account/rateLimits/read` response, `null` means no monthly limit
    is available. `account/rateLimits/updated` remains a sparse rolling
    notification: clients merge available values into their most recent
    `account/rateLimits/read` snapshot or refetch. Nullable account metadata
    in a rolling notification does not clear a previously observed value.
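    
    The client-side merge rule above can be sketched as follows. This is an
    illustration under assumptions, not the app-server client code: the
    struct and function names are hypothetical, and the rolling-window
    fields are left out.
    
    ```rust
    #[derive(Clone, Debug, PartialEq)]
    struct IndividualLimit {
        limit: String,
        used: String,
        remaining_percent: u8,
        resets_at: i64,
    }
    
    #[derive(Clone, Debug, Default, PartialEq)]
    struct RateLimits {
        // Rolling `primary`/`secondary` windows and `credits` are elided here.
        individual_limit: Option<IndividualLimit>,
    }
    
    /// A full `account/rateLimits/read` response is authoritative:
    /// `None` means no monthly limit, so it replaces the cached value.
    fn apply_read(cache: &mut RateLimits, read: RateLimits) {
        *cache = read;
    }
    
    /// A sparse `account/rateLimits/updated` notification only overwrites
    /// values it actually carries; a missing limit leaves the cache alone.
    fn apply_update(cache: &mut RateLimits, update: RateLimits) {
        if let Some(limit) = update.individual_limit {
            cache.individual_limit = Some(limit);
        }
    }
    
    /// Sample values copied from the JSON example above.
    fn sample_limit() -> IndividualLimit {
        IndividualLimit {
            limit: "25000".to_string(),
            used: "8000".to_string(),
            remaining_percent: 68,
            resets_at: 1778137680,
        }
    }
    ```
    
    Applying a read that carries the limit, then an update without it,
    leaves the cached limit in place; only a later read with `null` clears
    it.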
    
    ## Design decisions
    
    - Extend the existing rate-limit snapshot instead of introducing a
    separate request or wire-level update protocol.
    - Keep the Codex projection narrow: `/status` needs the effective limit,
    current usage, remaining percentage, and reset timestamp.
    - Render the monthly row through the existing progress-bar renderer,
    with one optional detail line for `8,000 of 25,000 credits used`.
    - Keep the backend response optional so existing accounts and older
    usage states preserve their current behavior.
    - Preserve cached monthly metadata when sparse rolling notifications
    omit it. Live account-usage reads remain authoritative and can clear a
    removed limit.
    
    ## Visual evidence
    
    ```text
     Monthly credit limit:   [██████████████░░░░░░] 68% left (resets 07:08 on 7 May)
                             8,000 of 25,000 credits used
    ```
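    
    A rough sketch of how the two rows above could be produced from the
    projected fields. The helper names and the 20-cell bar width are
    assumptions read off the snapshot, the filled cells are assumed to
    track the remaining percentage, and the reset-time suffix is omitted;
    the actual TUI renderer may differ.
    
    ```rust
    /// Inserts thousands separators, e.g. 25000 -> "25,000".
    fn group_thousands(n: u64) -> String {
        let digits = n.to_string();
        let mut out = String::new();
        for (i, ch) in digits.chars().enumerate() {
            if i > 0 && (digits.len() - i) % 3 == 0 {
                out.push(',');
            }
            out.push(ch);
        }
        out
    }
    
    /// Renders a fixed-width bar; filled cells follow the remaining percentage,
    /// rounded to the nearest cell.
    fn render_bar(remaining_percent: u8, width: usize) -> String {
        let pct = remaining_percent.min(100) as usize;
        let filled = (pct * width + 50) / 100;
        format!("[{}{}]", "█".repeat(filled), "░".repeat(width - filled))
    }
    
    /// Builds the progress row and the detail line. `limit` and `used` arrive
    /// as decimal strings, as in the JSON payloads; unparsable input yields `None`.
    fn monthly_limit_rows(limit: &str, used: &str, remaining_percent: u8) -> Option<[String; 2]> {
        let limit: u64 = limit.parse().ok()?;
        let used: u64 = used.parse().ok()?;
        Some([
            format!(
                "Monthly credit limit: {} {}% left",
                render_bar(remaining_percent, 20),
                remaining_percent
            ),
            format!(
                "{} of {} credits used",
                group_thousands(used),
                group_thousands(limit)
            ),
        ])
    }
    ```
    
    With the sample values (`"25000"`, `"8000"`, 68), the bar fills 14 of
    20 cells and the detail line reads `8,000 of 25,000 credits used`.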
    
    Snapshot:
    `codex-rs/tui/src/status/snapshots/codex_tui__status__tests__status_snapshot_includes_enterprise_monthly_credit_limit.snap`
    
    ## Testing
    
    Tests: generated app-server schema verification, protocol tests,
    backend-client tests, app-server integration coverage, TUI snapshot
    coverage, formatting, and workspace lint cleanup.
  • [codex] Require model for standalone web search (#25131)
    ## Why
    
    The standalone `/v1/alpha/search` request now requires a `model`, but
    the `web.run` extension currently omits it.
    
    This change adds `model` to the extension `ToolCall` invocation.
    
    Follow-up to #23823.
    
    ## What changed
    
    - Make `SearchRequest.model` required.
    - Expose the effective per-turn model on extension tool calls and pass
    it in standalone web-search requests.
    - Assert the model is forwarded in the app-server round-trip test.
    
    ## Testing
    
    - `just test -p codex-api -p codex-tools -p codex-web-search-extension
    -p codex-memories-extension -p codex-goal-extension`
    - `just test -p codex-core -E
    'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
    - `just test -p codex-app-server -E
    'test(standalone_web_search_round_trips_encrypted_output)'`
  • standalone websearch extension (#23823)
    ## Summary
    
    Add the extension-backed standalone `web.run` tool so Codex can call the
    standalone search endpoint through the `codex-api` search client and
    return its encrypted output to Responses.
    
    - gate the new tool behind `standalone_web_search`
    - install the extension in the app-server thread registry and hide
    hosted `web_search` when standalone search is enabled for OpenAI
    providers so the two paths stay mutually exclusive
    - build search context from persisted history using a small tail
    heuristic: previous user message, assistant text between the last two
    user turns capped at about 1k tokens, and current user message
    
    ## Test Plan
    
    - `cargo test -p codex-web-search-extension`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
  • Display workspace usage limit error copy from response header (#24114)
    ## Why
    
    `openai/openai#947613` adds `X-Codex-Rate-Limit-Reached-Type` for Codex
    workspace credit-depletion and spend-cap responses. The CLI currently
    reads the adjacent promo header but otherwise renders generic
    usage-limit copy, so those responses do not explain the
    workspace-specific action the user needs to take.
    
    Backend dependency: https://github.com/openai/openai/pull/947613
    
    ## What Changed
    
    - Parse `X-Codex-Rate-Limit-Reached-Type` in the usage-limit error
    handling path alongside `x-codex-promo-message`.
    - Keep the header value parsing with the shared `RateLimitReachedType`
    enum.
    - Carry the parsed type on `UsageLimitReachedError` and render
    client-owned copy for the four workspace owner/member credit and
    spend-cap values.
    - Preserve existing promo and plan-based text for absent, generic, or
    unknown header values.
    - Keep the existing TUI workspace-owner nudge state path unchanged; the
    response header only selects the displayed error string.
    - Add focused display coverage for all specific type values and the
    generic fallback case.
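    
    A sketch of the header-to-enum step, to make the fallback behavior
    concrete. Only the header name and the `RateLimitReachedType` enum name
    come from this description; the variant names, the wire strings, and
    `parse_reached_type` are hypothetical placeholders, since the four
    values are not spelled out here.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq)]
    enum RateLimitReachedType {
        // Hypothetical variants standing in for the four workspace values.
        WorkspaceOwnerCreditsDepleted,
        WorkspaceMemberCreditsDepleted,
        WorkspaceOwnerSpendCapReached,
        WorkspaceMemberSpendCapReached,
    }
    
    const HEADER: &str = "x-codex-rate-limit-reached-type";
    
    /// Returns `None` for absent, generic, or unknown values so the caller
    /// can keep the existing promo and plan-based text.
    fn parse_reached_type(headers: &[(&str, &str)]) -> Option<RateLimitReachedType> {
        // HTTP header names are case-insensitive.
        let value = headers
            .iter()
            .find(|&&(name, _)| name.eq_ignore_ascii_case(HEADER))
            .map(|&(_, value)| value.trim())?;
        match value {
            // Hypothetical wire strings.
            "workspace_owner_credits_depleted" => {
                Some(RateLimitReachedType::WorkspaceOwnerCreditsDepleted)
            }
            "workspace_member_credits_depleted" => {
                Some(RateLimitReachedType::WorkspaceMemberCreditsDepleted)
            }
            "workspace_owner_spend_cap_reached" => {
                Some(RateLimitReachedType::WorkspaceOwnerSpendCapReached)
            }
            "workspace_member_spend_cap_reached" => {
                Some(RateLimitReachedType::WorkspaceMemberSpendCapReached)
            }
            _ => None,
        }
    }
    ```
    
    Treating unknown values as `None` keeps older clients on the generic
    copy if the backend later adds new types.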
    
    ## Test Plan
    
    - Added `usage_limit_reached_error_formats_rate_limit_reached_types`
    coverage.
    - Not run manually, per request; CI runs validation on the pushed
    commit.
  • Add typed Images client to codex-api (#23989)
    ## Why
    
    Standalone image generation needs a typed `codex-api` client surface for
    the Codex image proxy routes before the harness and model-facing tool
    layers are wired in.
    
    ## What changed
    
    - Added `ImagesClient` support for JSON `images/generations` and
    `images/edits` requests.
    - Added typed request and response shapes for generation, JSON edit
    image URLs, image metadata, and base64 image outputs.
    - Kept generation model slugs open-ended while requiring the generation
    model field that the downstream endpoint expects.
    - Exported the new client and image types from `codex-api`.
    - Added coverage for generation and edit wire shapes, extra response
    metadata that the client ignores, and malformed image responses missing
    `data`.
    
    ## Validation
    
    - `cargo test -p codex-api`
    - `just fix -p codex-api`
    - `just fmt`
    - `git diff --check main`
  • [codex] Fix realtime v1 websocket compatibility (#23771)
    ## Why
    
    Realtime v1 websocket sessions now expect a slightly different boundary
    shape for text input, completed input transcripts, and connection
    headers. Codex was still using the older shape, so some v1 text appends
    could be rejected before the existing conversation flow could handle
    them.
    
    ## What changed
    
    - Send v1 user text items with `input_text` content
    - Accept v1 turn-marked input transcript events as completed transcripts
    - Add the v1 alpha header only for v1 realtime sessions
    - Cover the outbound text shape, transcript parsing, and versioned
    headers
    
    ## Test plan
    
    - `cargo test -p codex-api endpoint::realtime_websocket::methods::tests`
    - `cargo test -p codex-core quicksilver_alpha_header`
  • add standalone websearch api client (#23655)
    add standalone web search request types and a `codex-api` client ahead
    of the extension-contributed search tool.
    
    this adds typed commands/settings and opaque encrypted output handling
    for the new standalone search flow. the endpoint types are close to
    finalized but may still shift slightly as that API settles.
  • Add timeout for remote compaction requests (#23451)
    ## Why
    
    Remote compaction currently sends a unary `POST /responses/compact` and
    waits for the full response before replacing history or emitting the
    completed `ContextCompaction` item. Unlike normal `/responses` streaming
    requests, this unary compact request had no timeout boundary. If the
    backend accepts the request and then stalls before returning a body, the
    existing request retry policy never sees a transport error, so the
    compact turn can remain stuck after the started item with no completion
    or actionable error.
    
    That matches the reported hang shape in issues such as #18363, where
    logs show `responses/compact` was posted but no corresponding compact
    completion followed. A bounded request timeout gives the existing retry
    policy a concrete timeout error to retry instead of letting the user sit
    indefinitely on automatic context compaction.
    
    ## What
    
    - Add a request timeout to legacy `/responses/compact` calls.
    - Size that timeout from the provider stream idle timeout with a
    conservative multiplier, so the default compact attempt gets 20 minutes
    rather than the 5 minute stream idle window.
    - Map API transport timeouts to a request timeout error instead of the
    child-process timeout message.
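    
    The sizing and error mapping can be sketched as below. The multiplier
    of 4 is inferred from the 5-minute and 20-minute figures above rather
    than stated, and every name here (`COMPACT_TIMEOUT_MULTIPLIER`,
    `compact_request_timeout`, `TransportError`, `ApiError`,
    `map_transport_error`) is hypothetical.
    
    ```rust
    use std::time::Duration;
    
    /// Assumed multiplier: 5 minutes of stream idle timeout -> 20 minutes.
    const COMPACT_TIMEOUT_MULTIPLIER: u32 = 4;
    
    /// Sizes the unary compact timeout from the provider's stream idle timeout.
    fn compact_request_timeout(stream_idle_timeout: Duration) -> Duration {
        stream_idle_timeout.saturating_mul(COMPACT_TIMEOUT_MULTIPLIER)
    }
    
    #[derive(Debug, PartialEq)]
    enum TransportError {
        Timeout,
        Other(String),
    }
    
    #[derive(Debug, PartialEq)]
    enum ApiError {
        /// Distinct from a child-process timeout, so retry logic and the
        /// user-facing message can describe the request itself.
        RequestTimeout,
        Transport(String),
    }
    
    fn map_transport_error(err: TransportError) -> ApiError {
        match err {
            TransportError::Timeout => ApiError::RequestTimeout,
            TransportError::Other(msg) => ApiError::Transport(msg),
        }
    }
    ```
    
    Deriving the value from the idle timeout means a provider with a
    longer configured stream window gets a proportionally longer compact
    window.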
    
    ## Testing
    
    - Not run (per request; CI will cover).
  • feat(cli): add codex doctor diagnostics (#22336)
    ## Why
    
    Users and support need a single command that captures the local Codex
    runtime, configuration, auth, terminal, network, and state shape without
    asking the user to know which diagnostic depth to choose first. `codex
    doctor` now runs the useful checks by default and makes the detailed
    human output the default because the command is usually run when someone
    already needs context.
    
    The command also targets concrete support failure modes we have seen
    while iterating on the design:
    
    - update-target mismatches like #21956, where the installed package
    manager target can differ from the running executable
    - terminal and multiplexer issues that depend on `TERM`, tmux/zellij
    state, color handling, and TTY metadata
    - provider-specific HTTP/WebSocket connectivity, including ChatGPT
    WebSocket handshakes and API-key/provider endpoint reachability
    - local state/log SQLite integrity problems and large rollout
    directories
    - feedback reports that need an attached, redacted diagnostic snapshot
    without asking the user to run a second command
    
    ## What Changed
    
    - Adds `codex doctor` as a grouped CLI diagnostic report with default
    detailed output and `--summary` for the compact view.
    - Adds stable report sections for Environment, Configuration, Updates,
    Connectivity, and Background Server, plus a top Notes block that
    promotes anomalies such as available updates, large rollout directories,
    optional MCP issues, and mixed auth signals.
    - Adds runtime provenance, install consistency, bundled/system search
    readiness, terminal/multiplexer metadata, `config.toml` parse status,
    auth mode details, sandbox details, feature flag summaries, update
    cache/latest-version state, app-server daemon state, SQLite integrity
    checks, rollout statistics, and provider-aware network diagnostics.
    - Adds ChatGPT WebSocket diagnostics that report the negotiated HTTP
    upgrade as `HTTP 101 Switching Protocols` and include timeout, DNS,
    auth, and provider context in detailed output.
    - Makes reachability provider-aware: API-key OpenAI setups check the API
    endpoint, ChatGPT auth checks the ChatGPT path, and custom/AWS/local
    providers check configured HTTP endpoints when available.
    - Adds structured, redacted JSON output where `checks` is keyed by check
    id and `details` is a key/value object for support tooling.
    - Integrates doctor with feedback uploads by attaching a best-effort
    `codex-doctor-report.json` report and adding derived Sentry tags for
    overall status and failing/warning checks.
    - Updates the TUI feedback consent copy so users can see that the doctor
    report is included when logs/diagnostics are uploaded.
    - Updates the CLI bug issue template to ask reporters for `codex doctor
    --json` and render pasted reports as JSON.
    
    ## Example Output
    
    The examples below are sanitized from local smoke runs with `--no-color`
    so the structure is reviewable in plain text.
    
    ### `codex doctor`
    
    ```text
    Codex Doctor v0.0.0 · macos-aarch64
    
    Notes
       ↑ updates      0.130.0 available (current 0.0.0, dismissed 0.128.0)
       ⚠ rollouts     1,526 active files · 2.53 GB on disk
       ⚠ mcp          MCP configuration has optional issues
       ⚠ auth         mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode
    ─────────────────────────────────────────────────────────────
    
    Environment
      ✓ runtime      local debug build
          version                  0.0.0
          install method           other
          commit                   unknown
          executable               ~/code/codex.fcoury-doct…x-rs/target/debug/codex
      ✓ install      consistent
          context                  other
          managed by               npm: no · bun: no · package root —
          PATH entries (2)         ~/.local/share/mise/installs/node/24/bin/codex
                                   ~/.local/share/mise/shims/codex
      ✓ search       ripgrep 15.1.0 (system, `rg`)
      ✓ terminal     Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color
          terminal                 Ghostty
          TERM_PROGRAM             ghostty
          terminal version         1.3.2-main-+b0f827665
          TERM                     xterm-256color
          multiplexer              tmux 3.6a
          tmux extended-keys       on
          tmux allow-passthrough   on
          tmux set-clipboard       on
      ✓ state        databases healthy
          CODEX_HOME               ~/.codex (dir)
          state DB                 ~/.codex/state_5.sqlite (file) · integrity ok
          log DB                   ~/.codex/logs_2.sqlite (file) · integrity ok
          active rollouts          1,526 files · 2.53 GB (avg 1.70 MB)
          archived rollouts        8 files · 3.84 MB (avg 491.11 KB)
    
    Configuration
      ✓ config       loaded
          model                    gpt-5.5 · openai
          cwd                      ~/code/codex.fcoury-doctor/codex-rs
          config.toml              ~/.codex/config.toml
          config.toml parse        ok
          MCP servers              1
          feature flags            36 enabled · 7 overridden (full list with --all)
          overrides                code_mode, code_mode_only, memories, chronicle, goals, remote_control, prevent_idle_sleep
      ✓ auth         auth is configured
          auth storage mode        File
          auth file                ~/.codex/auth.json
          auth env vars present    OPENAI_API_KEY
          stored auth mode         chatgpt
          stored API key           false
          stored ChatGPT tokens    true
          stored agent identity    false
      ⚠ mcp          MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server.
          configured servers       1
          disabled servers         0
          streamable_http servers  1
          optional reachability    openaiDeveloperDocs: https://developers.openai.com/mcp (HEAD connect failed; GET connect failed)
      ✓ sandbox      restricted fs + restricted network · approval OnRequest
          approval policy          OnRequest
          filesystem sandbox       restricted
          network sandbox          restricted
    
    Connectivity
      ✓ network      network-related environment looks readable
      ✓ websocket    connected (HTTP 101 Switching Protocols) · 15s timeout
          model provider           openai
          provider name            OpenAI
          wire API                 responses
          supports websockets      true
          connect timeout          15000 ms
          auth mode                chatgpt
          endpoint                 wss://chatgpt.com/backend-api/<redacted>
          DNS                      2 IPv4, 2 IPv6, first IPv6
          handshake result         HTTP 101 Switching Protocols
      ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration.
          reachability mode        API key auth
          openai API               https://api.openai.com/v1 connect failed (required)
    
    Background Server
      ○ app-server   not running (ephemeral mode)
    
    ─────────────────────────────────────────────────────────────
    11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed
    
    --summary compact output           --all expand truncated lists
    --json redacted report
    ```
    
    ### `codex doctor --summary`
    
    ```text
    Codex Doctor v0.0.0 · macos-aarch64
    
    Notes
       ↑ updates      0.130.0 available (current 0.0.0, dismissed 0.128.0)
       ⚠ rollouts     1,526 active files · 2.53 GB on disk
       ⚠ mcp          MCP configuration has optional issues
       ⚠ auth         mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode
    ─────────────────────────────────────────────────────────────
    
    Environment
      ✓ runtime      local debug build
      ✓ install      consistent
      ✓ search       ripgrep 15.1.0 (system, `rg`)
      ✓ terminal     Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color
      ✓ state        databases healthy
    
    Configuration
      ✓ config       loaded
      ✓ auth         auth is configured
      ⚠ mcp          MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server.
      ✓ sandbox      restricted fs + restricted network · approval OnRequest
    
    Updates
      ✓ updates      update configuration is locally consistent
    
    Connectivity
      ✓ network      network-related environment looks readable
      ✓ websocket    connected (HTTP 101 Switching Protocols) · 15s timeout
      ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration.
    
    Background Server
      ○ app-server   not running (ephemeral mode)
    
    ─────────────────────────────────────────────────────────────
    11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed
    
    Run codex doctor without --summary for detailed diagnostics.
    --all expand truncated lists       --json redacted report
    ```
    
    ### `codex doctor --json` shape
    
    ```json
    {
      "schema_version": 1,
      "overall_status": "fail",
      "checks": {
        "runtime.provenance": {
          "id": "runtime.provenance",
          "category": "Environment",
          "status": "ok",
          "summary": "local debug build",
          "details": {
            "version": "0.0.0",
            "install method": "other",
            "commit": "unknown"
          }
        },
        "sandbox.helpers": {
          "id": "sandbox.helpers",
          "category": "Configuration",
          "status": "ok",
          "summary": "restricted fs + restricted network · approval OnRequest",
          "details": {
            "approval policy": "OnRequest",
            "filesystem sandbox": "restricted",
            "network sandbox": "restricted"
          }
        }
      }
    }
    ```
    
    ### `/feedback` new Sentry attachment
    
    <img width="938" height="798" alt="CleanShot 2026-05-13 at 15 36 14"
    src="https://github.com/user-attachments/assets/715e62e0-d7b4-4fea-a35a-fd5d5d33c4c0"
    />
    
    ### New section in CLI issue template
    
    <img width="1164" height="435" alt="CleanShot 2026-05-13 at 15 47 24"
    src="https://github.com/user-attachments/assets/9081dc25-a28c-4afa-8ba1-e299c2b4031d"
    />
    
    ## How to Test
    
    1. Run `cargo run --bin codex -- doctor --no-color`.
    2. Confirm the detailed report is the default and includes promoted
    Notes, grouped sections, terminal details, state DB integrity, rollout
    stats, provider reachability, WebSocket diagnostics, and app-server
    status.
    3. Run `cargo run --bin codex -- doctor --summary --no-color`.
    4. Confirm the compact view keeps the same sections and summary counts
    but omits detailed key/value rows.
    5. Run `cargo run --bin codex -- doctor --json`.
    6. Confirm the output is redacted JSON, `checks` is an object keyed by
    check id, and each check's `details` is a key/value object.
    7. Preview the CLI bug issue template and confirm the `Codex doctor
    report` field appears after the terminal field, asks for `codex doctor
    --json`, and renders pasted output as JSON.
    8. Start a feedback flow that includes logs.
    9. Confirm the upload consent copy lists `codex-doctor-report.json`
    alongside the log attachments.
    
    Targeted tests:
    
    - `cargo test -p codex-cli doctor`
    - `cargo test -p codex-app-server
    doctor_report_tags_summarize_status_counts`
    - `cargo test -p codex-feedback`
    - `cargo test -p codex-tui feedback_view`
    - `just argument-comment-lint`
    - `git diff --check`
  • fix: drop underscored id headers (#22193)
    ## Why
    Stop sending duplicate `session_id`/`thread_id` headers. We only want
    the hyphenated forms, as `_` is rejected by some proxies.
    
    Related discussion here:
    https://openai.slack.com/archives/C095U48JNL9/p1778508316923179
    
    ## What
    - Keep `session-id` and `thread-id`
    - Remove the underscore aliases
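    
    A minimal sketch of the resulting header set. `build_session_headers`
    is the helper named in #21757, but the signature and return type shown
    here are assumptions made for illustration:
    
    ```rust
    /// Emits only the hyphenated spellings; IDs that are absent add no header.
    fn build_session_headers(
        session_id: Option<&str>,
        thread_id: Option<&str>,
    ) -> Vec<(&'static str, String)> {
        let mut headers = Vec::new();
        if let Some(id) = session_id {
            headers.push(("session-id", id.to_string()));
        }
        if let Some(id) = thread_id {
            headers.push(("thread-id", id.to_string()));
        }
        headers
    }
    ```
    
    Calling it with both IDs yields exactly two headers, neither of which
    contains an underscore in its name.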
  • Remove CODEX_RS_SSE_FIXTURE test hook (#22413)
    ## Why
    
    `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
    bypass the normal Responses transport by reading SSE from local files.
    That kept test-only behavior wired through production client code. The
    affected tests can stay hermetic by using the existing
    `core_test_support::responses` mock server and passing `openai_base_url`
    instead.
    
    ## What Changed
    
    - Removed the `CODEX_RS_SSE_FIXTURE` flag,
    `codex_api::stream_from_fixture`, the `env-flags` dependency, and the
    checked-in SSE fixture files.
    - Repointed the affected core, exec, and TUI tests at `MockServer` with
    the existing SSE event constructors.
    - Removed the Bazel test data plumbing for the deleted fixtures and
    refreshed cargo/Bazel lock state.
    
    ## Verification
    
    - `cargo build -p codex-cli`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core --test all responses_api_stream_cli`
    - `cargo test -p codex-core --test all
    integration_creates_and_checks_session_file`
    - `cargo test -p codex-exec --test all ephemeral`
    - `cargo test -p codex-exec --test all resume`
    - `cargo test -p codex-tui --test all
    resume_startup_does_not_consume_model_availability_nux_count`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
    - `git diff --check`
  • api: send hyphenated session and thread headers (#21757)
    ## Why
    Some consumers expect conventional hyphenated HTTP headers. Codex
    already sends the session and thread IDs on outbound Responses requests,
    but it only uses the underscore spellings today, which makes those IDs
    harder to consume in systems that normalize or reject underscore header
    names.
    
    Full context here:
    https://openai.slack.com/archives/C08KCGLSPSQ/p1778248578422369
    
    ## What changed
    - `build_session_headers` now emits both `session_id` and `session-id`
    when a session ID is present.
    - It does the same for `thread_id` and `thread-id`.
    - Added regression coverage in `codex-api/tests/clients.rs` and
    `core/tests/suite/client.rs` so both the lower-level client tests and
    the end-to-end request tests assert the two header spellings are
    present.
    
    ## Test plan
    - Added header assertions in `codex-api/tests/clients.rs`.
    - Added request-header assertions in `core/tests/suite/client.rs` for
    both the `/v1/responses` and `/api/codex/responses` request paths.
  • [codex] Add response.processed websocket request (#21284)
    ## Summary
    
    - Add a `response.processed` websocket request payload and sender for
    Responses API websockets.
    - Send `response.processed` from `try_run_sampling_request` after a
    response completes, local turn processing succeeds, and the
    session-owned feature flag is enabled.
    - Add websocket coverage for both enabled and disabled feature-flag
    behavior.
    
    ## Validation
    
    - `just fmt`
    - `cargo test -p codex-core response_processed`
    - `cargo test -p codex-api responses_websocket`
    - `cargo test -p codex-features
    responses_websocket_response_processed_is_under_development`
    - `git diff --check`
    - `just fix -p codex-api -p codex-core -p codex-features`
    - `git diff --check origin/main...HEAD`
  • Propagate cache key and service tiers in compact (#21249)
    ## Why
    
    `/responses/compact` should preserve the request-affinity fields that
    apply to the active auth mode. ChatGPT-auth compact requests need the
    effective `service_tier`, and compact requests for every auth mode need
    the stable `prompt_cache_key`, so compaction does not quietly lose
    routing or cache behavior that normal sampling already has.
    
    This follows the request-parity direction from #20719, but keeps the net
    change focused on the compact payload fields needed here.
    
    ## What changed
    
    - Add `service_tier` and `prompt_cache_key` to the compact endpoint
    input payload.
    - Build the remote compact payload from the existing responses request
    builder output so `Fast` still maps to `priority` when compact sends a
    service tier.
    - Pass the turn service tier into remote compaction, but only include it
    in compact payloads for ChatGPT-backed auth.
    - Keep `prompt_cache_key` on compact payloads for all auth modes.
    - Add request-body diff snapshot coverage in
    `core/tests/suite/compact_remote.rs` for:
      - API-key auth reusing `prompt_cache_key` while omitting `service_tier`
      even when `Fast` is configured.
      - ChatGPT auth reusing both `service_tier` and `prompt_cache_key`.
    - Drive the snapshot coverage through five varied turns: plain text,
    multi-part text, tool-call continuation, image+text input, local-shell
    continuation, and final-turn reasoning output.
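
    The field selection above can be sketched roughly as follows. All type
    and function names here are hypothetical, the `Flex` variant and its
    wire value are assumptions added only to make the mapping non-trivial,
    and the real payload is built from the responses request builder output
    rather than by hand:

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq)]
    enum AuthMode {
        ApiKey,
        ChatGpt,
    }

    #[allow(dead_code)]
    #[derive(Clone, Copy, Debug)]
    enum ServiceTier {
        Fast,
        Flex, // assumption: not named in this PR description
    }

    #[derive(Debug, PartialEq)]
    struct CompactAffinityFields {
        service_tier: Option<&'static str>,
        prompt_cache_key: String,
    }

    /// `Fast` maps to the `priority` wire value, as stated in the PR.
    fn wire_tier(tier: ServiceTier) -> &'static str {
        match tier {
            ServiceTier::Fast => "priority",
            ServiceTier::Flex => "flex",
        }
    }

    /// Keep `prompt_cache_key` for every auth mode; include `service_tier`
    /// only for ChatGPT-backed auth.
    fn compact_affinity_fields(
        auth: AuthMode,
        tier: Option<ServiceTier>,
        prompt_cache_key: &str,
    ) -> CompactAffinityFields {
        let service_tier = match auth {
            AuthMode::ChatGpt => tier.map(wire_tier),
            AuthMode::ApiKey => None,
        };
        CompactAffinityFields {
            service_tier,
            prompt_cache_key: prompt_cache_key.to_string(),
        }
    }
    ```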
    
    ## Verification
    
    - Added insta snapshots for compact request-body parity against the last
    normal `/responses` request after five varied turns.
    - Not run locally per repo guidance; relying on GitHub CI for test
    execution.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add session_id (#20437)
    ## Summary
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
    TLDR:
    We update the meaning of session ids and thread ids:
    * thread_id stays as now
    * session_id becomes a shared id between every thread under a /root
    thread (i.e. every sub-agent shares the same session id)
    
    This PR introduces an explicit `SessionId` and threads it through the
    protocol/client boundary so `session_id` and `thread_id` can diverge
    when they need to, while preserving compatibility for older serialized
    `session_configured` events.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Bound websocket request sends with idle timeout (#20751)
    ## Why
    
    We saw Responses websocket sessions recover only after a long quiet
    period when the server had already logged the websocket as disconnected.
    The normal connect path is already bounded by
    `websocket_connect_timeout_ms`, but the first request send on an
    established websocket reused only the receive-side idle timeout after
    the write completed. If the socket write/pump stalls, the client can sit
    in `ws_stream.send(...)` without reaching the existing receive timeout.
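
    The general idea of bounding a send that may never return can be
    sketched with the standard library alone. This is a synchronous
    analogue under stated assumptions, not the async websocket code from
    this PR: `send_with_timeout` and `SendOutcome` are hypothetical names,
    and the stalled write is simulated by the caller's closure.

    ```rust
    use std::sync::mpsc::{self, RecvTimeoutError};
    use std::thread;
    use std::time::Duration;

    #[derive(Debug, PartialEq)]
    enum SendOutcome {
        Sent,
        Failed(String),
        TimedOut,
    }

    /// Run `send` on a worker thread and wait at most `timeout` for it.
    fn send_with_timeout<F>(send: F, timeout: Duration) -> SendOutcome
    where
        F: FnOnce() -> Result<(), String> + Send + 'static,
    {
        let (done_tx, done_rx) = mpsc::channel();
        thread::spawn(move || {
            // The receiver may be gone if the caller already timed out.
            let _ = done_tx.send(send());
        });
        match done_rx.recv_timeout(timeout) {
            Ok(Ok(())) => SendOutcome::Sent,
            Ok(Err(err)) => SendOutcome::Failed(err),
            Err(RecvTimeoutError::Timeout) => SendOutcome::TimedOut,
            Err(RecvTimeoutError::Disconnected) => {
                SendOutcome::Failed("send worker exited without a result".to_string())
            }
        }
    }
    ```

    The point of the sketch is that the caller regains control after
    `timeout` even if the write itself never completes, instead of relying
    on a receive-side timeout that only starts once the write has finished.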
  • realtime: rename provider session ids (#20361)
    ## Summary
    
    Codex is repurposing `session` to mean a thread group, so the realtime
    provider session id should no longer use `session_id` / `sessionId` in
    Codex-facing protocol payloads. This PR renames that provider-specific
    field to `realtime_session_id` / `realtimeSessionId` and intentionally
    breaks clients that still send the old field names.
    
    ## What Changed
    
    - Renamed realtime provider session fields in `ConversationStartParams`,
    `RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`.
    - Renamed app-server v2 realtime request and notification fields to
    `realtimeSessionId`.
    - Removed legacy serde aliases for `session_id` / `sessionId`; clients
    must send the new names.
    - Propagated the rename through core realtime startup, app-server
    adapters, codex-api websocket handling, and TUI realtime state.
    - Regenerated app-server protocol schema/TypeScript outputs and updated
    app-server README examples.
    - Kept upstream Realtime API concepts unchanged: provider `session.id`
    parsing and `x-session-id` headers still use the upstream wire names.
    
    ## Testing
    
    - CI is running on the latest pushed commit.
    - Earlier local verification on this PR:
      - `cargo test -p codex-protocol`
      - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core
      realtime_conversation`
      - `cargo test -p codex-app-server-protocol`
      - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server
      realtime_conversation`
      - attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local
      linker bus error while linking the test binary)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [rollout-trace] Include x-request-id in rollout trace. (#20066)
    ## Why
    
    Rollout traces need an identifier that can be used to correlate a Codex
    inference with upstream Responses API, proxy, and engine logs. The
    reduced trace model already exposed `upstream_request_id`, but it was
    being populated from the Responses API `response.id`. That value is
    useful for `previous_response_id` chaining, but it is not the transport
    request id that upstream systems key on.
    
    This PR separates those concepts so trace consumers can reliably answer
    both questions:
    
    - which Responses API response did this inference produce?
    - which upstream request handled it?
    
    ## Structure
    
    The change keeps the upstream request id at the same lifecycle level as
    the provider stream:
    
    - `codex-api` captures the `x-request-id` HTTP response header when the
    SSE stream is created and exposes it on `ResponseStream`. Fixture and
    websocket streams set the field to `None` because they do not have that
    HTTP response header.
    - `codex-core` carries that stream-level id into `InferenceTraceAttempt`
    when recording terminal stream outcomes. Completed, failed, cancelled,
    dropped-stream, and pre-response error paths all record the id when it
    is available.
    - `rollout-trace` now records both identifiers in raw terminal inference
    events and response payloads: `response_id` for the Responses API
    `response.id`, and `upstream_request_id` for `x-request-id`.
    - The reducer stores both fields on `InferenceCall`. It also uses
    `response_id` for `previous_response_id` conversation linking, which
    removes the old accidental dependency on the misnamed
    `upstream_request_id` field.
    - Terminal inference reduction now consumes the full terminal payload
    (`InferenceCompleted`, `InferenceFailed`, or `InferenceCancelled`) in
    one place. That keeps status, partial payloads, response ids, and
    upstream request ids consistent across success, failure, cancellation,
    and late stream-mapper events.
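
    A small sketch of the separation described above. The struct mirrors
    the two field names from this PR, but its shape and the helper
    functions `upstream_request_id_from` and `parent_of` are hypothetical
    simplifications, not the reducer's actual code:

    ```rust
    #[derive(Debug, Clone, Default)]
    struct InferenceCall {
        /// Responses API `response.id`, used for conversation chaining.
        response_id: Option<String>,
        /// Value of the `x-request-id` response header, used for debugging.
        upstream_request_id: Option<String>,
        previous_response_id: Option<String>,
    }

    /// Read `x-request-id` once from the response headers
    /// (header names are matched case-insensitively).
    fn upstream_request_id_from(headers: &[(&str, &str)]) -> Option<String> {
        headers
            .iter()
            .find(|(name, _)| name.eq_ignore_ascii_case("x-request-id"))
            .map(|(_, value)| value.to_string())
    }

    /// Link a call to its predecessor via `response_id` only; the
    /// transport request id plays no part in conversation linking.
    fn parent_of(calls: &[InferenceCall], index: usize) -> Option<usize> {
        let previous = calls.get(index)?.previous_response_id.as_deref()?;
        calls
            .iter()
            .position(|call| call.response_id.as_deref() == Some(previous))
    }
    ```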
    
    ## Why This Shape
    
    `x-request-id` is a property of the HTTP/provider response envelope, not
    an SSE event. Capturing it once in `codex-api` and plumbing it through
    terminal trace recording avoids trying to infer the value from stream
    contents, and it preserves the id even when the stream fails or is
    cancelled after only partial output.
    
    Keeping `response_id` separate from `upstream_request_id` also makes the
    reduced trace model less surprising: `response_id` remains the
    conversation-continuation id, while `upstream_request_id` is the
    operational correlation id for upstream debugging.
    
    ## Validation
    
    The PR updates trace and reducer coverage for:
    
    - reading `x-request-id` from SSE response headers;
    - storing the true upstream request id on completed inference calls;
    - preserving upstream request ids for cancelled and late-cancelled
    inference streams;
    - keeping `previous_response_id` reconstruction tied to `response_id`
    rather than transport request ids.
  • Support end_turn in response.completed (#19610)
    Some Responses API providers forward a model-defined `end_turn` boolean
    that explicitly indicates whether the model wants to end the turn or be
    sampled again. This PR updates the sampling loop to honor that field
    when it is set. If the provider does not set it, we fall back to the
    existing sampling logic.
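
    A minimal sketch of that decision, assuming the existing sampling
    logic has already produced its own answer. `needs_follow_up` and its
    parameters are hypothetical names, not the real sampling-loop API:

    ```rust
    /// Decide whether the model should be sampled again.
    /// `end_turn` is the optional provider field; `fallback_needs_follow_up`
    /// is whatever the pre-existing logic would have decided.
    fn needs_follow_up(end_turn: Option<bool>, fallback_needs_follow_up: bool) -> bool {
        match end_turn {
            // An explicit signal from the model wins.
            Some(end) => !end,
            // Field absent: keep the existing behavior.
            None => fallback_needs_follow_up,
        }
    }
    ```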
  • refactor: route Codex auth through AuthProvider (#18811)
    ## Summary
    
    This PR moves Codex backend request authentication from direct
    bearer-token handling to `AuthProvider`.
    
    The new `codex-auth-provider` crate defines the shared request-auth
    trait. `CodexAuth::provider()` returns a provider that can apply all
    headers needed for the selected auth mode.
    
    This lets ChatGPT token auth and AgentIdentity auth share the same
    callsite path:
    - ChatGPT token auth applies bearer auth plus account/FedRAMP headers
    where needed.
    - AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
    where needed.
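
    The shared callsite path can be pictured with a sketch like the one
    below. Everything in it is an assumption made for illustration: the
    trait method `apply_headers`, the struct fields, the
    `chatgpt-account-id` header name, and the `AgentAssertion`
    authorization scheme string are not taken from the
    `codex-auth-provider` crate, and FedRAMP headers are left out.

    ```rust
    use std::collections::BTreeMap;

    type Headers = BTreeMap<String, String>;

    /// Hypothetical request-auth trait: each auth mode applies its own headers.
    trait AuthProvider {
        fn apply_headers(&self, headers: &mut Headers);
    }

    struct ChatGptTokenAuth {
        access_token: String,
        account_id: Option<String>,
    }

    struct AgentIdentityAuth {
        assertion: String,
        account_id: Option<String>,
    }

    /// Account header shared by both modes (header name is an assumption).
    fn apply_account(headers: &mut Headers, account_id: &Option<String>) {
        if let Some(id) = account_id {
            headers.insert("chatgpt-account-id".to_string(), id.clone());
        }
    }

    impl AuthProvider for ChatGptTokenAuth {
        fn apply_headers(&self, headers: &mut Headers) {
            let value = format!("Bearer {}", self.access_token);
            headers.insert("authorization".to_string(), value);
            apply_account(headers, &self.account_id);
        }
    }

    impl AuthProvider for AgentIdentityAuth {
        fn apply_headers(&self, headers: &mut Headers) {
            let value = format!("AgentAssertion {}", self.assertion);
            headers.insert("authorization".to_string(), value);
            apply_account(headers, &self.account_id);
        }
    }

    /// A callsite only needs the trait object, not the concrete auth mode.
    fn build_headers(auth: &dyn AuthProvider) -> Headers {
        let mut headers = Headers::new();
        auth.apply_headers(&mut headers);
        headers
    }
    ```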
    
    Reference old stack: https://github.com/openai/codex/pull/17387/changes
    
    ## Callsite Migration
    
    | Area | Change |
    | --- | --- |
    | backend-client | accepts an `AuthProvider` instead of a raw
    token/header |
    | chatgpt client/connectors | applies auth through
    `CodexAuth::provider()` |
    | cloud tasks | keeps Codex-backend gating, applies auth through
    provider |
    | cloud requirements | uses Codex-backend auth checks and provider
    headers |
    | app-server remote control | applies provider headers for backend calls
    |
    | MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches
    from generic account getters |
    | model refresh | treats AgentIdentity as Codex-backend auth |
    | OpenAI file upload path | rejects non-Codex-backend auth before
    applying headers |
    | core client setup | keeps model-provider auth flow and allows
    AgentIdentity through provider-backed OpenAI auth |
    
    ## Stack
    
    1. https://github.com/openai/codex/pull/18757: full revert
    2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
    crate
    3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
    auth mode and startup task allocation
    4. This PR: migrate Codex backend auth callsites through AuthProvider
    5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
    and load `CODEX_AGENT_IDENTITY`
    
    ## Testing
    
    Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
  • Add safety check notification and error handling (#19055)
    Adds a new app-server notification that fires when a user account has
    been flagged for potential safety reasons.
  • feat: add AWS SigV4 auth for OpenAI-compatible model providers (#17820)
    ## Summary
    
    Add first-class Amazon Bedrock Mantle provider support so Codex can keep
    using its existing Responses API transport with OpenAI-compatible
    AWS-hosted endpoints such as AOA/Mantle.
    
    This is needed for the AWS launch path, where provider traffic should
    authenticate with AWS credentials instead of OpenAI bearer credentials.
    Requests are authenticated immediately before transport send, so SigV4
    signs the final method, URL, headers, and body bytes that `reqwest` will
    send.
    
    ## What Changed
    
    - Added a new `codex-aws-auth` crate for loading AWS SDK config,
    resolving credentials, and signing finalized HTTP requests with AWS
    SigV4.
    - Added a built-in `amazon-bedrock` provider that targets Bedrock Mantle
    Responses endpoints, defaults to `us-east-1`, supports region/profile
    overrides, disables WebSockets, and does not require OpenAI auth.
    - Added Amazon Bedrock auth resolution in `codex-model-provider`: prefer
    `AWS_BEARER_TOKEN_BEDROCK` when set, otherwise use AWS SDK credentials
    and SigV4 signing.
    - Added `AuthProvider::apply_auth` and `Request::prepare_body_for_send`
    so request-signing providers can sign the exact outbound request after
    JSON serialization/compression.
    - Determine the region from the `aws.region` config first (required for
    the bearer-token code path), falling back to the SDK default region.
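
    One possible reading of the auth and region resolution order above,
    written as a sketch. `BedrockAuth`, `resolve_auth`, and
    `resolve_region` are hypothetical names, the environment and SDK
    lookups are passed in as plain values, and treating a missing
    `aws.region` as an error on the bearer path is an interpretation of
    "required", not confirmed behavior:

    ```rust
    #[derive(Debug, PartialEq)]
    enum BedrockAuth {
        BearerToken(String),
        SigV4,
    }

    /// Prefer the bearer token when `AWS_BEARER_TOKEN_BEDROCK` is set;
    /// otherwise use AWS SDK credentials with SigV4 signing.
    fn resolve_auth(bearer_token_env: Option<&str>) -> BedrockAuth {
        match bearer_token_env {
            Some(token) => BedrockAuth::BearerToken(token.to_string()),
            None => BedrockAuth::SigV4,
        }
    }

    /// Config region first; the SDK default is only a fallback for SigV4.
    fn resolve_region(
        auth: &BedrockAuth,
        config_region: Option<&str>,
        sdk_default_region: Option<&str>,
    ) -> Result<String, String> {
        if let Some(region) = config_region {
            return Ok(region.to_string());
        }
        match auth {
            BedrockAuth::BearerToken(_) => {
                Err("bearer-token auth requires `aws.region` in config".to_string())
            }
            BedrockAuth::SigV4 => sdk_default_region
                .map(str::to_string)
                .ok_or_else(|| "no AWS region available".to_string()),
        }
    }
    ```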
    
    ## Testing
    Amazon Bedrock Mantle Responses paths:
    
    - Built the local Codex binary with `cargo build`.
    - Verified the custom proxy-backed `aws` provider using `env_key =
    "AWS_BEARER_TOKEN_BEDROCK"` streamed raw `responses` output with
    `response.output_text.delta`, `response.completed`, and `mantle-env-ok`.
    - Verified a full `codex exec --profile aws` turn returned
    `mantle-env-ok`.
    - Confirmed the custom provider used the bearer env var, not AWS profile
    auth: bogus `AWS_PROFILE` still passed, empty env var failed locally,
    and malformed env var reached Mantle and failed with `401
    invalid_api_key`.
    - Verified built-in `amazon-bedrock` with `AWS_BEARER_TOKEN_BEDROCK` set
    passed despite bogus AWS profiles, returning `amazon-bedrock-env-ok`.
    - Verified built-in `amazon-bedrock` SDK/SigV4 auth passed with
    `AWS_BEARER_TOKEN_BEDROCK` unset and temporary AWS session env
    credentials, returning `amazon-bedrock-sdk-env-ok`.
  • Allow guardian bare allow output (#18797)
    ## Summary
    
    Allow guardian to skip other fields and output only
    `{"outcome":"allow"}` when the command is low risk.
    This change lets guardian reviews use a non-strict text format while
    keeping the JSON schema itself as plain user-visible schema data, so
    transport strictness is carried out-of-band instead of through a schema
    marker key.
    
    ## What changed
    
    - Add an explicit `output_schema_strict` flag to model prompts and pass
    it into `codex-api` text formatting.
    - Set guardian reviewer prompts to non-strict schema validation while
    preserving strict-by-default behavior for normal callers.
    - Update the guardian output contract so definitely-low-risk decisions
    may return only `{"outcome":"allow"}`.
    - Treat bare allow responses as low-risk approvals in the guardian
    parser.
    - Add tests and snapshots covering the non-strict guardian request and
    optional guardian output fields.
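
    The "bare allow" rule can be sketched as below, starting from an
    already-decoded output. The field names `risk_level` and `rationale`,
    the `RiskLevel` variants, and the rejection of a deny without a risk
    level are assumptions for illustration; only the `outcome` key and the
    low-risk treatment of a bare allow come from this PR description:

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum Outcome {
        Allow,
        Deny,
    }

    #[allow(dead_code)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum RiskLevel {
        Low,
        Medium,
        High,
    }

    /// Decoded guardian output; everything except `outcome` may be absent.
    struct GuardianOutput {
        outcome: Outcome,
        risk_level: Option<RiskLevel>,
        rationale: Option<String>,
    }

    #[derive(Debug, PartialEq)]
    struct Review {
        outcome: Outcome,
        risk_level: RiskLevel,
        rationale: Option<String>,
    }

    fn interpret(output: GuardianOutput) -> Result<Review, String> {
        let rationale = output.rationale;
        match (output.outcome, output.risk_level) {
            // Fully specified output is used as-is.
            (outcome, Some(risk_level)) => Ok(Review { outcome, risk_level, rationale }),
            // Bare `{"outcome":"allow"}` is treated as a low-risk approval.
            (Outcome::Allow, None) => Ok(Review {
                outcome: Outcome::Allow,
                risk_level: RiskLevel::Low,
                rationale,
            }),
            // Assumption: a deny still has to explain its risk level.
            (Outcome::Deny, None) => Err("deny without a risk level".to_string()),
        }
    }
    ```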
    
    ## Verification
    
    - `cargo test -p codex-core guardian::tests::guardian`
    - `cargo test -p codex-core guardian::tests::`
    - `cargo test -p codex-core client_common::tests::`
    - `cargo test -p codex-protocol
    user_input_serialization_includes_final_output_json_schema`
    - `cargo test -p codex-api`
    - `git diff --check`
    
    Note: `cargo test -p codex-core` was also attempted, but this desktop
    environment injects ambient config/proxy state that causes unrelated
    config/session tests expecting pristine defaults to fail.
    
    ---------
    
    Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
    Co-authored-by: Codex <noreply@openai.com>