71 Commits

  • feat: use run agent task auth for inference (#19051)
    ## Stack
    
    This is PR 3 of the simplified HAI single-run-task stack:
    
    - [#19047](https://github.com/openai/codex/pull/19047) Agent Identity
    assertion and task-registration primitives, including the shared
    run-task helper used by existing Agent Identity JWT auth.
    - [#19049](https://github.com/openai/codex/pull/19049)
    Disabled-by-default ChatGPT auth opt-in that provisions/reuses persisted
    Agent Identity runtime auth and its single run task.
    - [#19051](https://github.com/openai/codex/pull/19051) Run-scoped
    provider auth that uses one backend-owned task id for first-party
    inference and compaction requests.
    
    [#19054](https://github.com/openai/codex/pull/19054) collapsed out of
    the active stack because the simplified design no longer needs a
    separate background/control-plane task helper.
    
    ## Summary
    
    This PR moves Agent Identity usage into provider auth resolution. That
    keeps `AgentAssertion` auth tied to first-party OpenAI provider requests
    instead of applying a late session-wide override that could affect
    local, custom, Bedrock, API-key, or external-bearer providers.
    
    What changed:
    
    - adds a small `ProviderAuthScope` struct carrying the run auth policy
    and session source needed by provider-scoped auth resolution
    - lets `Session` opt the existing `ModelClient` into `ChatGptAuth`
    policy when `use_agent_identity` is enabled, without adding a second
    model-client constructor
    - resolves Agent Identity only for first-party OpenAI provider auth
    paths
    - uses the persisted run task id from the `AgentIdentityAuth` record to
    build `AgentAssertion` auth for Responses requests
    - routes shared request setup through scoped provider auth so unary
    compact requests use the same run-task assertion path as inference turns
    - keeps local/custom/Bedrock/env-key/external-bearer provider auth
    unchanged
    - lets missing run-task state surface through the existing model-request
    error path instead of silently falling back to bearer auth
    
    This PR intentionally does not create thread-scoped, target-scoped, or
    background-scoped task identities. The run task is the only task Codex
    registers in this POC shape.
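
    A minimal sketch of the provider-scoped resolution described above, not the actual implementation. `ProviderKind`, `ResolvedAuth`, `AuthError`, and `resolve_provider_auth` are hypothetical names; only the behavior (first-party-only assertion auth, no silent bearer fallback) comes from this description.

    ```rust
    /// Hypothetical provider categories; the real provider model is richer.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ProviderKind {
        FirstPartyOpenAi,
        /// Local, custom, Bedrock, env-key, or external-bearer providers.
        Other,
    }

    #[derive(Debug, PartialEq, Eq)]
    enum ResolvedAuth {
        /// Assertion auth bound to the persisted run task id.
        AgentAssertion { run_task_id: String },
        /// Whatever auth the provider already used; left unchanged.
        ProviderDefault,
    }

    #[derive(Debug, PartialEq, Eq)]
    enum AuthError {
        /// Surfaced to the caller instead of falling back to bearer auth.
        MissingRunTask,
    }

    fn resolve_provider_auth(
        provider: ProviderKind,
        use_agent_identity: bool,
        persisted_run_task_id: Option<&str>,
    ) -> Result<ResolvedAuth, AuthError> {
        // Agent Identity applies only to first-party OpenAI provider requests.
        if provider != ProviderKind::FirstPartyOpenAi || !use_agent_identity {
            return Ok(ResolvedAuth::ProviderDefault);
        }
        match persisted_run_task_id {
            Some(id) => Ok(ResolvedAuth::AgentAssertion {
                run_task_id: id.to_string(),
            }),
            None => Err(AuthError::MissingRunTask),
        }
    }
    ```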
    
    ## Testing
    
    - `just test -p codex-model-provider`
    - `just test -p codex-core client::tests::provider_auth_scope_uses`
    - `just test -p codex-core remote_compact_uses_agent_identity_assertion`
  • Support thread-level originator overrides (#29477)
    ## Why
    
    Work (TPP) threads can be launched from the Desktop app, but if they all
    keep the Desktop app's default originator then downstream attribution
    cannot distinguish local Work launches from cloud-backed Work launches.
    `thread/start.serviceName` already carries that launch signal, while
    `SessionMeta.originator` is the durable thread-level value that survives
    resume and fork.
    
    This change converts the Desktop Work service names into an effective
    originator at thread creation time, persists that originator with the
    thread, and keeps using it for later model requests and memory writes.
    
    ## What changed
    
    - Map `CODEX_WORK_LOCAL` and `CODEX_WORK_CLOUD` service names to
    per-thread originators, while preserving
    `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` as the highest-precedence override.
    - Persist the effective originator in `SessionMeta.originator`, read it
    back on resume/fork, and inherit the parent originator for subagent
    spawns when there is no persisted session metadata.
    - Handle truncated `SpawnAgentForkMode::LastNTurns` forks by falling
    back to the live parent originator when the forked history no longer
    includes `SessionMeta`.
    - Thread the per-thread originator through Responses headers,
    websocket/compaction request paths, thread-store creation, rollout
    metadata, and memory stage-one telemetry.
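
    The precedence rules above can be sketched roughly as follows. The function name and the two originator strings are placeholders (the real mapped values are not stated here); only the ordering of override, service-name remapping, and default is taken from this description.

    ```rust
    // Placeholder originator values; the real strings are not given in this PR text.
    const WORK_LOCAL_ORIGINATOR: &str = "work_local_placeholder";
    const WORK_CLOUD_ORIGINATOR: &str = "work_cloud_placeholder";

    /// Picks the originator to persist at thread creation time.
    fn effective_originator(
        internal_override: Option<&str>, // value of CODEX_INTERNAL_ORIGINATOR_OVERRIDE, if set
        service_name: Option<&str>,      // thread/start.serviceName, if provided
        default_originator: &str,        // e.g. the Desktop app's default
    ) -> String {
        // The internal override always wins.
        if let Some(originator) = internal_override {
            return originator.to_string();
        }
        // Otherwise remap the known Work service names.
        match service_name {
            Some("CODEX_WORK_LOCAL") => WORK_LOCAL_ORIGINATOR.to_string(),
            Some("CODEX_WORK_CLOUD") => WORK_CLOUD_ORIGINATOR.to_string(),
            _ => default_originator.to_string(),
        }
    }
    ```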
    
    ## Verification
    
    - `just test -p codex-core
    agent::control::tests::spawn_thread_subagent_inherits_parent_originator_without_fork
    agent::control::tests::spawn_thread_subagent_fork_last_n_turns_inherits_parent_originator_without_session_meta
    thread_manager::tests::originator_override_precedes_service_name_remapping`
    - `just test -p codex-core
    agent::control::tests::resume_thread_subagent_restores_stored_metadata_and_effective_multi_agent_mode`
    - `just test -p codex-memories-write`
    - `just fix -p codex-core -p codex-memories-write`
    - `git diff --check`
  • core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
    ## Description
    This PR cuts Codex over from generic `ResponseItem.metadata` (introduced
    here: https://github.com/openai/codex/pull/28355) to
    `ResponseItem.internal_chat_message_metadata_passthrough`, which is the
    blessed path and has strongly-typed keys.
    
    For now we have to drop this MAv2 usage of `metadata`:
    https://github.com/openai/codex/pull/28561 until we figure out where
    that should live.
  • [codex] Assign response item IDs when recording history (#28814)
    ## Why
    
    Client-created response items enter history without IDs, so their
    identity is lost across rollout persistence and resume. IDs should be
    assigned once at the history-recording boundary, while IDs returned by
    the server must remain unchanged.
    
    The Responses API validates item IDs using type-specific prefixes.
    Locally generated IDs therefore use the matching prefix plus a
    hyphenated UUIDv7, keeping them valid while distinguishable from
    server-generated IDs. Because this changes persisted history and
    provider request shapes, the behavior is opt-in behind the
    under-development `item_ids` feature. Compaction triggers remain request
    controls whose API shape does not accept an ID.
    
    ## What changed
    
    - Register the disabled-by-default `item_ids` feature and expose it in
    `config.schema.json`.
    - Make supported optional `ResponseItem` IDs serializable and expose
    them in the generated app-server schemas.
    - When `item_ids` is enabled, assign an ID during conversation-history
    preparation if an item has no ID.
    - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API
    item conventions.
    - Preserve existing server IDs without rewriting them.
    - Persist assigned IDs in rollouts and include them in subsequent
    Responses requests.
    - Remove the unsupported ID field from `CompactionTrigger` and document
    why it has no ID.
    - Add integration coverage for enabled ID persistence, preservation of
    server IDs, and omission of generated IDs while the feature is disabled.
    
    `prepare_conversation_items_for_history` is the single response-item ID
    allocation boundary.
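
    As an illustration of the assignment rule, here is a simplified sketch. The `Item` type, the `_` separator between prefix and UUID, and the injected `new_uuid` generator are assumptions made to keep the example dependency-free; the real code generates a UUIDv7 itself.

    ```rust
    /// Simplified stand-in for a response item (hypothetical shape).
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct Item {
        id: Option<String>,
        /// Type-specific prefix expected by the Responses API, e.g. "msg".
        type_prefix: &'static str,
    }

    /// Assigns a local ID only when the feature is on and the item has none.
    /// `new_uuid` stands in for a hyphenated UUIDv7 generator.
    fn assign_id_if_missing(
        item: &mut Item,
        item_ids_enabled: bool,
        new_uuid: impl FnOnce() -> String,
    ) {
        // Server-provided IDs are never rewritten.
        if !item_ids_enabled || item.id.is_some() {
            return;
        }
        // Assumed format: "<prefix>_<hyphenated uuid>".
        item.id = Some(format!("{}_{}", item.type_prefix, new_uuid()));
    }
    ```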
    
    ## Test plan
    
    - `just test -p codex-features`
    - `just test -p codex-core
    response_item_ids_persist_across_resume_and_preserve_server_ids`
    - `just test -p codex-core
    non_openai_responses_requests_omit_item_turn_metadata`
    - `just test -p codex-core
    resize_all_images_prepares_failures_before_history_insertion`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
  • feat(core): add metadata field to ResponseItem (#28355)
    ## Description
    
    This PR adds an optional `metadata` field to `ResponseItem` for
    Responses API calls. Only mechanical plumbing, no actual values
    populated and sent yet. Turns out just adding a new field to
    `ResponseItem` has quite a large blast radius already.
    
    This change is backwards compatible because `metadata` is optional and
    omitted when absent, so existing response items and rollout history
    without it still deserialize and requests that do not set it keep the
    same wire shape. For provider compatibility, we strip out `metadata`
    before non-OpenAI Responses requests so Azure and AWS Bedrock never see
    this field.
    
    My followup PR here will actually make use of it to start storing and
    passing along `turn_id`: https://github.com/openai/codex/pull/28360
    
    ## What changed
    
    - Added `ResponseItemMetadata` with optional `turn_id`, plus optional
    `metadata` on Responses API item variants and inter-agent communication.
    - Preserved item metadata through response-item rewrites such as
    truncation, missing tool-output synthesis, compaction history
    rebuilding, visible-history conversion, rollout/resume, and generated
    app-server schemas/types.
    - Strip item metadata from non-OpenAI Responses requests while
    preserving it for OpenAI-shaped requests.
    - Updated the mechanical fixture/test construction churn required by the
    new optional field.
  • core: Consolidate Responses API Codex metadata (#27122)
    ## What
    Introduce a `CodexResponsesMetadata` struct that defines all the core
    metadata we send to the Responses API. Example fields are `thread_id`,
    `turn_id`, `window_id`, etc.
    
    Going forward, `client_metadata["x-codex-turn-metadata"]` will be the
    canonical way Codex sends metadata to the Responses API across both HTTP
    and websocket transports.
    
    For now, we continue to emit the existing top-level HTTP headers and
    top-level `client_metadata` fields from the same
    `CodexResponsesMetadata` struct for compatibility reasons.
    
    Also, app-server clients who specify additional
    `responsesapi_client_metadata` via `turn/start` and `turn/steer` will
    have those fields merged into
    `client_metadata["x-codex-turn-metadata"]`, but cannot override the
    reserved fields that core uses (i.e. the fields in
    `CodexResponsesMetadata`).
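
    A rough sketch of the merge rule, assuming a flat string map for brevity. `RESERVED_KEYS` lists only the example fields named above, and `merge_turn_metadata` is a hypothetical helper, not the real API.

    ```rust
    use std::collections::BTreeMap;

    /// Example reserved fields; the real set is every field in `CodexResponsesMetadata`.
    const RESERVED_KEYS: &[&str] = &["thread_id", "turn_id", "window_id"];

    /// Merges client-supplied extras into the core metadata without letting
    /// them override (or introduce) core-owned fields.
    fn merge_turn_metadata(
        core: &BTreeMap<String, String>,
        extras: &BTreeMap<String, String>,
    ) -> BTreeMap<String, String> {
        let mut merged = core.clone();
        for (key, value) in extras {
            if RESERVED_KEYS.contains(&key.as_str()) {
                continue; // reserved for core, even if core left it unset
            }
            // Keep any value core already wrote under this key.
            merged.entry(key.clone()).or_insert_with(|| value.clone());
        }
        merged
    }
    ```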
    
    ## Why
    
    Responses API request instrumentation is the source of truth for
    downstream Codex analytics that join requests by Codex IDs such as
    session, thread, turn, and context window. Before this change, those
    values were assembled through several request-specific paths: HTTP
    request bodies, websocket handshake headers, websocket `response.create`
    payloads, compaction requests, and the rich `x-codex-turn-metadata`
    envelope all had their own wiring.
    
    That made metadata propagation easy to drift across API-key/direct
    Responses API requests, ChatGPT-auth/proxied requests, websocket
    requests, and compaction requests. It also made additions like
    `window_id` error-prone because a field could be added to one transport
    projection but missed in another.
    
    ## What changed
    
    - Added `CodexResponsesMetadata` as the core-owned snapshot for Codex
    metadata sent to the Responses API.
    - Render `client_metadata["x-codex-turn-metadata"]`, flat
    `client_metadata` projections, and direct compatibility headers from
    that same snapshot.
    - Include the known Codex-owned fields in the turn metadata blob,
    including installation/session/thread/turn/window IDs, request kind,
    lineage, sandbox/workspace metadata, timing, and compaction details.
    - Treat app-server `responsesapi_client_metadata` as enrichment for the
    Codex turn metadata blob while preventing those extras from overriding
    Codex-owned fields.
    - Use the same metadata path for normal turns, websocket prewarm, local
    compaction, remote v1 compaction, and remote v2 compaction.
    - Keep websocket connection-only preconnect metadata separate so
    handshakes carry compatibility identity headers without inventing a fake
    turn metadata blob.
    
    ## Verification
    
    - `cargo check -p codex-core`
    - `just fix -p codex-core`
  • [codex] Store compact window id in rollout (#27264)
    ## Why
    
    Compaction window identity is part of session history, not model-client
    transport state. Persisting it with the compacted rollout item lets
    resumed threads continue from the reconstructed window without keeping
    mutable window state on `ModelClient`.
    
    ## What changed
    
    - Added `window_id` to `CompactedItem` and stamp it when
    `replace_compacted_history` installs compacted history.
    - Moved auto-compact window id ownership into `AutoCompactWindow` /
    `SessionState`; `ModelClient` now receives the request window id from
    callers instead of storing it.
    - Returned `window_id` from rollout reconstruction for resume.
    Reconstruction uses the newest surviving compacted item's stored
    `window_id` when present, and falls back to the legacy compacted-item
    count when it is absent.
    - Kept fork startup at the fresh default window id and updated direct
    model-client tests to pass explicit test window ids.
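
    The resume fallback can be sketched as below. The `u64` window id type, the trimmed-down `CompactedItem`, and the exact mapping from legacy count to window id are assumptions for illustration.

    ```rust
    /// Trimmed-down stand-in for the persisted rollout item.
    struct CompactedItem {
        /// Absent on rollouts written before this change.
        window_id: Option<u64>,
    }

    /// `surviving` is ordered oldest to newest.
    fn reconstruct_window_id(surviving: &[CompactedItem]) -> u64 {
        match surviving.last().and_then(|item| item.window_id) {
            // Prefer the id stamped on the newest surviving compacted item.
            Some(window_id) => window_id,
            // Legacy fallback: derive the id from the compacted-item count.
            None => surviving.len() as u64,
        }
    }
    ```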
    
    ## Validation
    
    - `cargo check -p codex-core --tests`
  • Add HTTP window ID to Responses client metadata (#26923)
    ## Summary
    
    - Keep the existing `x-codex-window-id` HTTP header unchanged.
    - Also send the same window ID in Responses `client_metadata`, allowing
    supported backend paths to surface it as
    `x-client-meta-x-codex-window-id`.
    - Cover normal HTTP Responses and remote compaction v2 requests without
    changing window generation or compaction behavior.
    
    ## Why
    
    In the `2026-06-06T23` production hour, all 28,729 HTTP compaction
    requests had `window_id` in `x-codex-turn-metadata`, but only 73
    retained the direct `x-codex-window-id` header. The request-body
    `client_metadata` path is already used for installation ID and is
    preserved through supported Responses API paths.
    
    This is additive metadata only. It does not change the direct header,
    request count, model input, compaction routing, window generation, or
    user response behavior.
    
    Legacy `/v1/responses/compact` is intentionally unchanged. Its current
    server-side `CompressBody` schema does not accept `client_metadata` and
    rejects unknown fields, so supporting that path requires a backend
    schema change before the Codex client can safely send this field.
    
    ## Validation
    
    - Current head: `219baef3c`, rebased onto `origin/main` at `26d932983`.
    - The post-rebase diff remains limited to the original five files (`22`
    insertions, `6` deletions); the legacy experiment remains fully
    reverted.
    - `just test -p codex-core
    responses_stream_includes_subagent_header_on_review`: passed; validates
    normal HTTP Responses metadata.
    - `just test -p codex-core
    remote_compact_v2_reuses_compaction_trigger_for_followups`: passed;
    validates remote compaction v2.
    - `just test -p codex-core
    remote_manual_compact_chatgpt_auth_reuses_service_tier_and_prompt_cache_key`:
    passed; validates that legacy compact keeps its accepted payload shape.
    - `just test -p codex-core
    remote_manual_compact_api_auth_omits_service_tier_and_reuses_prompt_cache_key`:
    passed; validates the legacy API-key payload as well.
    - `just fmt`: passed; an unrelated root `justfile` rewrite produced by
    the formatter was discarded.
    - `git diff --check origin/main...HEAD`: passed.
    
    The focused server pytest could not start in the local monorepo
    environment because test setup is missing the `dotenv` module. Server
    source and tests explicitly show that `CompressBody` omits
    `client_metadata` and `/v1/responses/compact` returns HTTP 400 for
    unknown body fields.
  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
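
    To illustrate the root cause, here is a reduced sketch of a string-backed effort type. The built-in variant names and wire strings are assumptions; the point is that `Custom(String)` rules out `Copy`, so callers holding a shared reference must clone.

    ```rust
    /// Reduced sketch; cannot derive `Copy` because of the `String` payload.
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum ReasoningEffort {
        Low,
        Medium,
        High,
        /// Any other non-empty, model-advertised value.
        Custom(String),
    }

    impl ReasoningEffort {
        /// Accepts built-in names and any other non-empty string.
        fn parse(raw: &str) -> Option<Self> {
            match raw {
                "" => None,
                "low" => Some(Self::Low),
                "medium" => Some(Self::Medium),
                "high" => Some(Self::High),
                other => Some(Self::Custom(other.to_string())),
            }
        }

        /// String wire encoding, identical for built-in and custom values.
        fn as_str(&self) -> &str {
            match self {
                Self::Low => "low",
                Self::Medium => "medium",
                Self::High => "high",
                Self::Custom(value) => value,
            }
        }
    }

    /// Call sites that used to copy out of a shared reference now clone.
    fn effort_for_request(shared: &ReasoningEffort) -> ReasoningEffort {
        shared.clone()
    }
    ```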
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • store and expose parent_thread_id on Threads (#25113)
    ## Why
    
    The discussion at
    https://github.com/openai/codex/pull/24161#discussion_r3325692763
    revealed a subagent data-modeling issue: we overloaded
    `forked_from_id` to also mean `parent_thread_id`. That is incorrect
    because guardian and review subagents can run as subagents WITHOUT
    forking the main thread's history.
    
    The solution here is to explicitly store a new `parent_thread_id` on
    `SessionMeta`, alongside `forked_from_id` which already exists. While
    we're at it, also expose it in the app-server protocol on the `Thread`
    object.
    
    A thread->subagent relationship and a fork of thread history are
    orthogonal concepts.
    
    ## What Changed
    
    - Added top-level `parent_thread_id` persistence on `SessionMeta` and
    runtime/session plumbing through `SessionConfiguredEvent`,
    `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`,
    `TurnContext`, and `ModelClient`.
    - Made turn metadata, request headers, analytics, and subagent-start
    events read the separate runtime/top-level parent field instead of
    deriving general parent lineage from `SessionSource` or
    `forked_from_thread_id`.
    - Passed parent lineage separately at delegated subagent, review,
    guardian, agent-job, and multi-agent spawn construction sites;
    copied-history fork lineage remains derived only from `InitialHistory`.
    - Persisted and exposed parent lineage through rollout/thread-store
    projections and app-server v2 `Thread.parentThreadId`.
    - Updated app-server README text and regenerated app-server schema
    fixtures for the additive `parentThreadId` response field.
  • [codex] request desktop attestation from app (#20619)
    ## Summary
    
    TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
    attestation token and attach it as `x-oai-attestation` on the scoped
    ChatGPT Codex request paths.
    
    ![DeviceCheck attestation
    interface](https://raw.githubusercontent.com/openai/codex/dev/jm/devicecheck-diagram-assets/pr-assets/devicecheck-attestation-interface.png)
    
    ## Details
    
    This PR teaches the Codex app-server runtime how to request and attach
    an attestation token. It does not generate DeviceCheck tokens directly;
    instead, it relies on the connected desktop app to advertise that it can
    generate attestation and then asks that app for a fresh header value
    when needed.
    
    The flow is:
    
    1. The Codex desktop app connects to app-server.
    2. During `initialize`, the app can advertise that it supports
    `requestAttestation`.
    3. Before app-server calls selected ChatGPT Codex endpoints, it sends
    the internal server request `attestation/generate` to the app.
    4. app-server receives a pre-encoded header value back.
    5. app-server forwards that value as `x-oai-attestation` on the scoped
    outbound requests.
    
    The code in this repo is mostly protocol and runtime plumbing: it adds
    the app-server request/response shape, introduces an attestation
    provider in core, wires that provider into Responses / compaction /
    realtime setup paths, and covers the intended scoping with tests. The
    signed macOS DeviceCheck generation remains owned by the desktop app PR.
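
    A synchronous sketch of the provider boundary, for orientation only. The trait, `FixedProvider`, and `attach_attestation` are hypothetical, the real flow is an async app-server request, and omitting the header when no value is available is an assumption.

    ```rust
    /// Hypothetical provider boundary: returns a pre-encoded header value,
    /// or `None` if the connected app cannot supply one.
    trait AttestationProvider {
        fn generate(&self) -> Option<String>;
    }

    /// Illustrative stand-in for the desktop app answering `attestation/generate`.
    struct FixedProvider(Option<String>);

    impl AttestationProvider for FixedProvider {
        fn generate(&self) -> Option<String> {
            self.0.clone()
        }
    }

    /// Adds `x-oai-attestation` only on scoped request paths.
    fn attach_attestation(
        headers: &mut Vec<(String, String)>,
        request_in_scope: bool,
        provider: Option<&dyn AttestationProvider>,
    ) {
        if !request_in_scope {
            return;
        }
        // Ask for a fresh value each time; skip the header if none comes back.
        if let Some(value) = provider.and_then(|p| p.generate()) {
            headers.push(("x-oai-attestation".to_string(), value));
        }
    }
    ```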
    
    ## Related PR
    
    - Codex desktop app implementation:
    https://github.com/openai/openai/pull/878649
    
    ## Validation
    
    <details>
    <summary>Tests run</summary>
    
    ```sh
    cargo test -p codex-app-server-protocol
    cargo test -p codex-core attestation --lib
    cargo test -p codex-app-server --lib attestation
    ```
    
    Also ran:
    
    ```sh
    just fix -p codex-core
    just fix -p codex-app-server
    just fix -p codex-app-server-protocol
    just fmt
    just write-app-server-schema
    ```
    
    </details>
    
    <details>
    <summary>E2E DeviceCheck validation</summary>
    
    First validated the signed desktop app boundary directly: launched a
    packaged signed `Codex.app`, sent `attestation/generate`, decoded the
    returned `v1.` attestation header, and validated the extracted
    DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
    bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
    `is_ok: true`.
    
    Then ran the fuller app + app-server flow. The packaged `Codex.app`
    launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
    MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
    requested `attestation/generate` from the real Electron app process, and
    the intercepted `/backend-api/codex/responses` traffic included
    `x-oai-attestation` on both routes:
    
    ```text
    GET  /backend-api/codex/responses  Upgrade: websocket  x-oai-attestation: present
    POST /backend-api/codex/responses  Upgrade: none       x-oai-attestation: present
    ```
    
    The captured header decoded to a DeviceCheck token that also validated
    with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
    team `2DC432GLL2`).
    
    </details>
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add session_id (#20437)
    ## Summary
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
    TLDR:
    We update the meaning of session IDs and thread IDs:
    * thread_id stays as it is now
    * session_id becomes an ID shared by every thread under a /root
    thread (i.e. every sub-agent shares the same session ID)
    
    This PR introduces an explicit `SessionId` and threads it through the
    protocol/client boundary so `session_id` and `thread_id` can diverge
    when they need to, while preserving compatibility for older serialized
    `session_configured` events.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-analytics] rework thread_source for thread analytics (#20949)
    ## Summary
    - make `thread_source` an explicit optional thread-level field on
    `thread/start`, `thread/fork`, and returned thread payloads
    - persist `thread_source` in rollout/session metadata so resumed live
    threads retain the original value
    - replace the old best-effort `session_source` -> `thread_source`
    mapping with an explicit caller-supplied analytics classification
    
    ## Why
    Before this change, analytics `thread_source` was populated by a
    best-effort mapping from `session_source`. `session_source` describes
    the runtime/client surface, not the actual thread-level origin, so that
    projection was not accurate enough to distinguish cases such as `user`,
    `subagent`, `memory_consolidation`, and future thread origins reliably.
    
    Making `thread_source` explicit keeps one thread-level analytics field
    while letting callers provide the real classification directly instead
    of recovering it indirectly from `session_source`.
    
    ## Impact
    For new analytics events, `thread_source` now reflects the explicit
    thread-level classification supplied by the caller rather than an
    inferred value derived from `session_source`. Existing protocol fields
    remain optional; callers that omit `threadSource` now produce `null`
    instead of a best-effort inferred value.
    
    ## Validation
    - `just write-app-server-schema`
    - `cargo test -p codex-analytics -p codex-core -p
    codex-app-server-protocol --no-run`
    - `cargo test -p codex-app-server-protocol
    generated_ts_optional_nullable_fields_only_in_params`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `cargo test -p codex-core
    resume_stopped_thread_from_rollout_preserves_thread_source`
  • Add turn start timestamp to turn metadata (#19473)
    ## Why
    - Without change: MCP tool calls receive
    `_meta["x-codex-turn-metadata"]` with `session_id` and `turn_id`.
    - Issue: MCP servers may want the turn start timestamp to measure
    internal latency relative to turn start.
    
    ## What Changed
    - With change: turn metadata now includes `turn_started_at_unix_ms`,
    which is propagated to MCP tool calls in
    `_meta["x-codex-turn-metadata"]`.
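
    A small sketch of how the timestamp could be produced and consumed, assuming Unix-epoch milliseconds as the field name suggests. Both helper names are hypothetical, and an MCP server need not be written in Rust.

    ```rust
    use std::time::{SystemTime, UNIX_EPOCH};

    /// Converts a wall-clock instant to Unix milliseconds (0 if before the epoch).
    fn unix_ms(instant: SystemTime) -> u64 {
        instant
            .duration_since(UNIX_EPOCH)
            .map(|elapsed| elapsed.as_millis() as u64)
            .unwrap_or(0)
    }

    /// What an MCP server might compute from `turn_started_at_unix_ms`.
    fn ms_since_turn_start(turn_started_at_unix_ms: u64, now_unix_ms: u64) -> u64 {
        // Saturate so clock skew never yields a negative latency.
        now_unix_ms.saturating_sub(turn_started_at_unix_ms)
    }
    ```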
    
    ## Verification
    - `codex-rs/core/src/mcp_tool_call_tests.rs`
    - `codex-rs/core/src/turn_metadata_tests.rs`
    - `codex-rs/core/src/turn_timing_tests.rs`
    - `codex-rs/core/tests/responses_headers.rs`
    - `codex-rs/core/tests/suite/search_tool.rs`
  • [rollout_trace] Record core session rollout traces (#18877)
    ## Summary
    
    Wires rollout trace recording into `codex-core` session and turn
    execution. This records the core model request/response, compaction, and
    session lifecycle boundaries needed for replay without yet tracing every
    nested runtime/tool boundary.
    
    ## Stack
    
    This is PR 2/5 in the rollout trace stack.
    
    - [#18876](https://github.com/openai/codex/pull/18876): Add rollout
    trace crate
    - [#18877](https://github.com/openai/codex/pull/18877): Record core
    session rollout traces
    - [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
    code-mode boundaries
    - [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
    and multi-agent edges
    - [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
    reduction command
    
    ## Review Notes
    
    This layer is the first live integration point. The important review
    question is whether trace recording is isolated from normal session
    behavior: trace failures should not become user-visible execution
    failures, and recording should preserve the existing turn/session
    lifecycle semantics.
    
    The PR depends on the reducer/data model from the first stack entry and
    only introduces the core recorder surface that later PRs use for richer
    runtime and relationship events.
  • feat: add a built-in Amazon Bedrock model provider (#18744)
    ## Why
    
    Codex needs a first-class `amazon-bedrock` model provider so users can
    select Bedrock without copying a full provider definition into
    `config.toml`. The provider has Codex-owned defaults for the pieces that
    should stay consistent across users: the display `name`, Bedrock
    `base_url`, and `wire_api`.
    
    At the same time, users still need a way to choose the AWS credential
    profile used by their local environment. This change makes
    `amazon-bedrock` a partially modifiable built-in provider: code owns the
    provider identity and endpoint defaults, while user config can set
    `model_providers.amazon-bedrock.aws.profile`.
    
    For example:
    
    ```toml
    model_provider = "amazon-bedrock"
    
    [model_providers.amazon-bedrock.aws]
    profile = "codex-bedrock"
    ```
    
    ## What Changed
    
    - Added `amazon-bedrock` to the built-in model provider map with:
      - `name = "Amazon Bedrock"`
      - `base_url = "https://bedrock-mantle.us-east-1.api.aws/v1"`
      - `wire_api = "responses"`
    - Added AWS provider auth config with a profile-only shape:
    `model_providers.<id>.aws.profile`.
    - Kept AWS auth config restricted to `amazon-bedrock`; custom providers
    that set `aws` are rejected.
    - Allowed `model_providers.amazon-bedrock` through reserved-provider
    validation so it can act as a partial override.
    - During config loading, only `aws.profile` is copied from the
    user-provided `amazon-bedrock` entry onto the built-in provider. Other
    Bedrock provider fields remain hard-coded by the built-in definition.
    - Updated the generated config schema for the new provider AWS profile
    config.
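
    The partial-override rule might look roughly like this. The struct shapes and function names are simplified assumptions; the built-in values are the ones listed above.

    ```rust
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct AwsAuth {
        profile: Option<String>,
    }

    /// Simplified provider definition (hypothetical shape).
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct Provider {
        name: String,
        base_url: String,
        wire_api: String,
        aws: Option<AwsAuth>,
    }

    fn built_in_bedrock() -> Provider {
        Provider {
            name: "Amazon Bedrock".to_string(),
            base_url: "https://bedrock-mantle.us-east-1.api.aws/v1".to_string(),
            wire_api: "responses".to_string(),
            aws: None,
        }
    }

    /// Copies only `aws.profile` from the user's `amazon-bedrock` entry;
    /// every other field keeps the built-in value.
    fn apply_user_override(mut built_in: Provider, user_entry: &Provider) -> Provider {
        if let Some(profile) = user_entry.aws.as_ref().and_then(|aws| aws.profile.clone()) {
            built_in.aws = Some(AwsAuth { profile: Some(profile) });
        }
        built_in
    }
    ```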
  • [codex-analytics] add session source to client metadata (#17374)
    ## Summary
    
    Adds a `thread_source` field to the existing Codex turn metadata sent to
    the Responses API:
    - Sends `thread_source: "user"` for user-initiated sessions: CLI, VS
    Code, and Exec
    - Sends `thread_source: "subagent"` for subagent sessions
    - Omits `thread_source` for MCP, custom, and unknown session sources
    - Uses the existing turn metadata transport:
      - HTTP requests send through the `x-codex-turn-metadata` header
      - WebSocket `response.create` requests send through
      `client_metadata["x-codex-turn-metadata"]`
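
    The classification above amounts to a small mapping. In this sketch the enum variants and the function name are simplified assumptions; only the three outcomes come from the summary.

    ```rust
    /// Simplified session sources (hypothetical variant names).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum SessionSource {
        Cli,
        VsCode,
        Exec,
        SubAgent,
        Mcp,
        Custom,
        Unknown,
    }

    /// Returns the `thread_source` value to send, or `None` to omit the field.
    fn thread_source_name(source: SessionSource) -> Option<&'static str> {
        match source {
            SessionSource::Cli | SessionSource::VsCode | SessionSource::Exec => Some("user"),
            SessionSource::SubAgent => Some("subagent"),
            SessionSource::Mcp | SessionSource::Custom | SessionSource::Unknown => None,
        }
    }
    ```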
    
    ## Testing
    - `cargo test -p codex-protocol
    session_source_thread_source_name_classifies_user_and_subagent_sources`
    - `cargo test -p codex-core turn_metadata_state`
    - `cargo test -p codex-core --test responses_headers
    responses_stream_includes_turn_metadata_header_for_git_workspace_e2e --
    --nocapture`
  • feat(analytics): generate an installation_id and pass it in responsesapi client_metadata (#16912)
    ## Summary
    
    This adds a stable Codex installation ID and sends it on Responses API
    requests as `x-codex-installation-id` inside the `client_metadata`
    field, for analytics and debugging.
    
    The main pieces are:
    - persist a UUID in `$CODEX_HOME/installation_id`
    - thread the installation ID into `ModelClient`
    - send it in `client_metadata` on Responses requests so it works
    consistently across HTTP and WebSocket transports
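
    A sketch of the persist-once behavior, assuming the file holds just the ID as text. `load_or_create_installation_id` is a hypothetical name and `new_uuid` is an injected stand-in for real UUID generation, which keeps the example dependency-free.

    ```rust
    use std::fs;
    use std::io;
    use std::path::Path;

    /// Returns the stored installation ID, creating it on first use.
    fn load_or_create_installation_id(
        codex_home: &Path,
        new_uuid: impl FnOnce() -> String,
    ) -> io::Result<String> {
        let path = codex_home.join("installation_id");
        let existing = match fs::read_to_string(&path) {
            Ok(contents) => contents.trim().to_string(),
            Err(err) if err.kind() == io::ErrorKind::NotFound => String::new(),
            Err(err) => return Err(err),
        };
        if !existing.is_empty() {
            return Ok(existing); // stable across runs
        }
        // First run (or empty file): generate and persist a new ID.
        let id = new_uuid();
        fs::create_dir_all(codex_home)?;
        fs::write(&path, &id)?;
        Ok(id)
    }
    ```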
  • Fix flaky test relating to metadata remote URL (#16823)
    This test was flaking on Windows.
    
    Problem: The Windows CI test for turn metadata compared git remote URLs
    byte-for-byte even though equivalent remotes can be formatted
    differently across Git code paths.
    
    Solution: Normalize the expected and actual origin URLs in the test by
    trimming whitespace, removing a trailing slash, and stripping a trailing
    .git suffix before comparing.
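
    The normalization can be sketched as a small helper applied to both sides of the comparison. The function name is hypothetical; the three steps and their order follow the description above.

    ```rust
    /// Normalizes a git remote URL for comparison in tests.
    fn normalize_remote_url(url: &str) -> String {
        // 1. Trim surrounding whitespace (including trailing newlines).
        let trimmed = url.trim();
        // 2. Remove a single trailing slash, if present.
        let without_slash = trimmed.strip_suffix('/').unwrap_or(trimmed);
        // 3. Strip a trailing ".git" suffix, if present.
        let without_git = without_slash.strip_suffix(".git").unwrap_or(without_slash);
        without_git.to_string()
    }
    ```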
  • [codex] add context-window lineage headers (#16758)
    This change adds client-owned context-window and parent-thread ID
    headers to all requests to the Responses API.
  • remove temporary ownership re-exports (#16626)
    Stacked on #16508.
    
    This removes the temporary `codex-core` / `codex-login` re-export shims
    from the ownership split and rewrites callsites to import directly from
    `codex-model-provider-info`, `codex-models-manager`, `codex-api`,
    `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.
    
    No behavior change intended; this is the mechanical import cleanup layer
    split out from the ownership move.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • core: remove cross-crate re-exports from lib.rs (#16512)
    ## Why
    
    `codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
    which made downstream crates depend on `codex-core` as a proxy module
    instead of the actual owner crate.
    
    Removing those forwards makes crate boundaries explicit and lets leaf
    crates drop unnecessary `codex-core` dependencies. In this PR, the
    following files now depend on `codex-login` instead of `codex-core`:
    
    ```
    codex-rs/backend-client/Cargo.toml
    codex-rs/mcp-server/tests/common/Cargo.toml
    ```
    
    ## What
    
    - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
    `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
    `codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
    `codex-tools`, and `codex-utils-path`.
    - Delete the `default_client` forwarding shim in `codex-rs/core`.
    - Update in-crate and downstream callsites to import directly from the
    owning `codex-*` crate.
    - Add direct Cargo dependencies where callsites now target the owner
    crate, and remove `codex-core` from `codex-rs/backend-client`.
  • core: support dynamic auth tokens for model providers (#16288)
    ## Summary
    
    Fixes #15189.
    
    Custom model providers that set `requires_openai_auth = false` could
    only use static credentials via `env_key` or
    `experimental_bearer_token`. That is not enough for providers that mint
    short-lived bearer tokens, because Codex had no way to run a command to
    obtain a bearer token, cache it briefly in memory, and retry with a
    refreshed token after a `401`.
    
    This PR adds that provider config and wires it through the existing auth
    design: request paths still go through `AuthManager.auth()` and
    `UnauthorizedRecovery`, with `core` only choosing when to use a
    provider-backed bearer-only `AuthManager`.
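    
    To make the "cache briefly, refresh after a `401`" idea concrete, here
    is a rough sketch of an in-memory token cache. It is an illustration
    under assumptions, not the `AuthManager` implementation: the
    `CachedToken` type and its methods are hypothetical, and the fetch
    closure stands in for running the configured command.
    
    ```rust
    use std::time::{Duration, Instant};
    
    /// Hypothetical in-memory cache for a command-minted bearer token.
    struct CachedToken {
        token: Option<(String, Instant)>,
        refresh_interval: Duration,
    }
    
    impl CachedToken {
        fn new(refresh_interval: Duration) -> Self {
            Self { token: None, refresh_interval }
        }
    
        /// Return the cached token while it is younger than the refresh
        /// interval; otherwise call `fetch` and cache the result.
        fn get(&mut self, now: Instant, fetch: &mut dyn FnMut() -> String) -> String {
            if let Some((token, fetched_at)) = &self.token {
                if now.duration_since(*fetched_at) < self.refresh_interval {
                    return token.clone();
                }
            }
            let fresh = fetch();
            self.token = Some((fresh.clone(), now));
            fresh
        }
    
        /// Drop the cached token, for example after the server answers 401,
        /// so the next `get` mints a new one.
        fn invalidate(&mut self) {
            self.token = None;
        }
    }
    ```
    
    A caller would invalidate the cache on a `401` and then retry the
    request with the token returned by the next `get`.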
    
    ## Scope
    
    To keep this PR reviewable, `/models` only uses provider auth for the
    initial request in this change. It does **not** add a dedicated `401`
    retry path for `/models`; that can be follow-up work if we still need it
    after landing the main provider-token support.
    
    ## Example Usage
    
    ```toml
    model_provider = "corp-openai"
    
    [model_providers.corp-openai]
    name = "Corp OpenAI"
    base_url = "https://gateway.example.com/openai"
    requires_openai_auth = false
    
    [model_providers.corp-openai.auth]
    command = "gcloud"
    args = ["auth", "print-access-token"]
    timeout_ms = 5000
    refresh_interval_ms = 300000
    ```
    
    The command contract is intentionally small:
    
    - write the bearer token to `stdout`
    - exit `0`
    - any leading or trailing whitespace is trimmed before the token is used
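    
    The contract above could be applied roughly like this. The sketch uses
    only `std::process::Command`; the function names are hypothetical,
    rejecting an empty token is an added assumption, and `timeout_ms`
    handling is left out for brevity.
    
    ```rust
    use std::process::Command;
    
    /// Interpret a token command's result under the contract above:
    /// a zero exit status, the token on stdout, surrounding whitespace trimmed.
    /// Rejecting an empty token is an assumption of this sketch.
    fn token_from_output(success: bool, stdout: &[u8]) -> Result<String, String> {
        if !success {
            return Err("token command exited with a non-zero status".to_string());
        }
        let text = String::from_utf8(stdout.to_vec())
            .map_err(|e| format!("token output is not valid UTF-8: {}", e))?;
        let token = text.trim();
        if token.is_empty() {
            return Err("token command printed an empty token".to_string());
        }
        Ok(token.to_string())
    }
    
    /// Run the command and apply the contract to its captured output.
    fn run_token_command(program: &str, args: &[&str]) -> Result<String, String> {
        let output = Command::new(program)
            .args(args)
            .output()
            .map_err(|e| format!("failed to run {}: {}", program, e))?;
        token_from_output(output.status.success(), &output.stdout)
    }
    ```
    
    For the example config, this would correspond to calling
    `run_token_command("gcloud", &["auth", "print-access-token"])`.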
    
    ## What Changed
    
    - add `model_providers.<id>.auth` to the config model and generated
    schema
    - validate that command-backed provider auth is mutually exclusive with
    `env_key`, `experimental_bearer_token`, and `requires_openai_auth`
    - build a bearer-only `AuthManager` for `ModelClient` and
    `ModelsManager` when a provider configures `auth`
    - let normal Responses requests and realtime websocket connects use the
    provider-backed bearer source through the same `AuthManager.auth()` path
    - allow `/models` online refresh for command-auth providers and attach
    the provider token to the initial `/models` request
    - keep `auth.cwd` available as an advanced escape hatch and include it
    in the generated config schema
    
    ## Testing
    
    - `cargo test -p codex-core provider_auth_command`
    - `cargo test -p codex-core refresh_available_models_uses_provider_auth_token`
    - `cargo test -p codex-core test_deserialize_provider_auth_config_defaults`
    
    ## Docs
    
    - `developers.openai.com/codex` should document the new
    `[model_providers.<id>.auth]` block and the token-command contract
  • chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
    ## Why
    
    `argument-comment-lint` was green in CI even though the repo still had
    many uncommented literal arguments. The main gap was target coverage:
    the repo wrapper did not force Cargo to inspect test-only call sites, so
    examples like the `latest_session_lookup_params(true, ...)` tests in
    `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.
    
    This change cleans up the existing backlog, makes the default repo lint
    path cover all Cargo targets, and starts rolling that stricter CI
    enforcement out on the platform where it is currently validated.
    
    ## What changed
    
    - mechanically fixed existing `argument-comment-lint` violations across
    the `codex-rs` workspace, including tests, examples, and benches
    - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
    `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
    `--all-targets` unless the caller explicitly narrows the target set
    - fixed both wrappers so forwarded cargo arguments after `--` are
    preserved with a single separator
    - documented the new default behavior in
    `tools/argument-comment-lint/README.md`
    - updated `rust-ci` so the macOS lint lane keeps the plain wrapper
    invocation and therefore enforces `--all-targets`, while Linux and
    Windows temporarily pass `-- --lib --bins`
    
    That temporary CI split keeps the stricter all-targets check where it is
    already cleaned up, while leaving room to finish the remaining Linux-
    and Windows-specific target-gated cleanup before enabling
    `--all-targets` on those runners. The Linux and Windows failures on the
    intermediate revision were caused by the wrapper forwarding bug, not by
    additional lint findings in those lanes.
    
    ## Validation
    
    - `bash -n tools/argument-comment-lint/run.sh`
    - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
    - shell-level wrapper forwarding check for `-- --lib --bins`
    - shell-level wrapper forwarding check for `-- --tests`
    - `just argument-comment-lint`
    - `cargo test` in `tools/argument-comment-lint`
    - `cargo test -p codex-terminal-detection`
    
    ## Follow-up
    
    - Clean up remaining Linux-only target-gated callsites, then switch the
    Linux lint lane back to the plain wrapper invocation.
    - Clean up remaining Windows-only target-gated callsites, then switch
    the Windows lint lane back to the plain wrapper invocation.
  • Prefer websockets when providers support them (#13592)
    Remove all flags and model settings.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • fix(core): prevent hanging turn/start due to websocket warming issues (#14838)
    ## Description
    
    This PR fixes a bad first-turn failure mode in app-server when the
    startup websocket prewarm hangs. Before this change, `initialize ->
    thread/start -> turn/start` could sit behind the prewarm for up to five
    minutes, so the client would not see `turn/started`, and even
    `turn/interrupt` would block because the turn had not actually started
    yet.
    
    Now:
    - websocket startup has a configurable timeout of 15s, exposed as
    `websocket_connect_timeout_ms` in config.toml
    - `turn/started` is sent immediately on `turn/start`, even if the
    websocket is still connecting
    - `turn/interrupt` can cancel a turn that is still waiting on the
    websocket warmup
    - the turn task waits for the full 15s websocket warming timeout
    before falling back
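    
    The bounded wait in the last bullet can be modeled with a channel and
    `recv_timeout`. This is a simplified, thread-based sketch rather than
    the app-server's async code; `Transport`, `await_warmup`, and
    `start_turn` are hypothetical names, and what the real code falls back
    to is not specified here.
    
    ```rust
    use std::sync::mpsc;
    use std::thread;
    use std::time::Duration;
    
    #[derive(Debug, PartialEq)]
    enum Transport {
        WebSocket,
        Fallback,
    }
    
    /// Wait up to `timeout` for the warmup to report readiness, then fall back.
    fn await_warmup(ready: mpsc::Receiver<()>, timeout: Duration) -> Transport {
        match ready.recv_timeout(timeout) {
            Ok(()) => Transport::WebSocket,
            // Timed out, or the warmup thread went away: do not block the turn.
            Err(_) => Transport::Fallback,
        }
    }
    
    /// Simulate a warmup that takes `warmup` to finish, bounded by `timeout`.
    fn start_turn(warmup: Duration, timeout: Duration) -> Transport {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            thread::sleep(warmup);
            // The receiver may already be gone if the wait timed out.
            let _ = tx.send(());
        });
        await_warmup(rx, timeout)
    }
    ```
    
    The key property is that a hung warmup delays the turn by at most the
    configured timeout instead of several minutes.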
    
    ## Why
    
    The old behavior made app-server feel stuck at exactly the moment the
    client expects turn lifecycle events to start flowing. That was
    especially painful for external clients, because from their point of
    view the server had accepted the request but then went silent for
    minutes.
    
    ## Configuring the websocket startup timeout
    You can set it in config.toml like this:
    ```toml
    [model_providers.openai]
    supports_websockets = true
    websocket_connect_timeout_ms = 15000
    ```
  • chore(otel): rename OtelManager to SessionTelemetry (#13808)
    ## Summary
    This is a purely mechanical refactor of `OtelManager` ->
    `SessionTelemetry` to better convey what the struct is doing. No
    behavior change.
    
    ## Why
    
    `OtelManager` ended up sounding much broader than what this type
    actually does. It doesn't manage OTEL globally; it's the session-scoped
    telemetry surface for emitting log/trace events and recording metrics
    with consistent session metadata (`app_version`, `model`, `slug`,
    `originator`, etc.).
    
    `SessionTelemetry` is a more accurate name, and updating the call sites
    makes that boundary a lot easier to follow.
    
    ## Validation
    
    - `just fmt`
    - `cargo test -p codex-otel`
    - `cargo test -p codex-core`
  • add fast mode toggle (#13212)
    - add a local Fast mode setting in codex-core (similar to how model id
    is currently stored on disk locally)
    - send `service_tier=priority` on requests when Fast is enabled
    - add `/fast` in the TUI and persist it locally
    - feature flag
  • Use model catalog default for reasoning summary fallback (#12873)
    ## Summary
    - make `Config.model_reasoning_summary` optional so unset means use
    model default
    - resolve the optional config value to a concrete summary when building
    `TurnContext`
    - add protocol support for `default_reasoning_summary` in model metadata
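    
    The "unset means use model default" resolution can be pictured as a
    plain `Option` fallback. This is a sketch only: the `Summary` enum and
    `resolve_summary` function are simplified stand-ins, not the actual
    `Config` or `TurnContext` types.
    
    ```rust
    /// Simplified stand-in for the reasoning summary setting.
    #[derive(Clone, Copy, Debug, PartialEq)]
    enum Summary {
        Auto,
        Concise,
        Detailed,
    }
    
    /// Prefer the explicitly configured value; otherwise use the default
    /// advertised in the model's metadata.
    fn resolve_summary(configured: Option<Summary>, model_default: Summary) -> Summary {
        configured.unwrap_or(model_default)
    }
    ```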
    
    ## Validation
    - `cargo test -p codex-core --lib client::tests -- --nocapture`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Prefer v2 websockets if available (#12428)
    Also clean up the settings flow to avoid reading many separate flags.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: remove codex-core public protocol/shell re-exports (#12432)
    ## Why
    
    `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
    from `codex-protocol` and `codex-shell-command`. That made it easy for
    workspace crates to import those APIs through `codex-core`, which in
    turn hides dependency edges and makes it harder to reduce compile-time
    coupling over time.
    
    This change removes those public re-exports so call sites must import
    from the source crates directly. Even when a crate still depends on
    `codex-core` today, this makes dependency boundaries explicit and
    unblocks future work to drop `codex-core` dependencies where possible.
    
    ## What Changed
    
    - Removed public re-exports from `codex-rs/core/src/lib.rs` for:
      - `codex_protocol::protocol` and related protocol/model types (including
        `InitialHistory`)
      - `codex_protocol::config_types` (`protocol_config_types`)
      - `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
        parse_command, powershell}`
    - Migrated workspace Rust call sites to import directly from:
      - `codex_protocol::protocol`
      - `codex_protocol::config_types`
      - `codex_protocol::models`
      - `codex_shell_command`
    - Added explicit `Cargo.toml` dependencies (`codex-protocol` /
    `codex-shell-command`) in crates that now import those crates directly.
    - Kept `codex-core` internal modules compiling by using `pub(crate)`
    aliases in `core/src/lib.rs` (internal-only, not part of the public
    API).
    - Updated the two utility crates that can already drop a `codex-core`
    dependency edge entirely:
      - `codex-utils-approval-presets`
      - `codex-utils-cli`
    
    ## Verification
    
    - `cargo test -p codex-utils-approval-presets`
    - `cargo test -p codex-utils-cli`
    - `cargo check --workspace --all-targets`
    - `just clippy`
  • Remove test-support feature from codex-core and replace it with explicit test toggles (#11405)
    ## Why
    
    `codex-core` was being built in multiple feature-resolved permutations
    because test-only behavior was modeled as crate features. For a large
    crate, those permutations increase compile cost and reduce cache reuse.
    
    ## Net Change
    
    - Removed the `test-support` crate feature and related feature wiring so
    `codex-core` no longer needs separate feature shapes for test consumers.
    - Standardized cross-crate test-only access behind
    `codex_core::test_support`.
    - External test code now imports helpers from
    `codex_core::test_support`.
    - Underlying implementation hooks are kept internal (`pub(crate)`)
    instead of broadly public.
    
    ## Outcome
    
    - Fewer `codex-core` build permutations.
    - Better incremental cache reuse across test targets.
    - No intended production behavior change.
  • include sandbox (seatbelt, elevated, etc.) in turn metadata header (#10946)
    This will help us understand retention/usage for folks who use the
    Windows (or any other) sandboxes
  • Support alternative websocket API (#10861)
    **Test plan**
    
    ```
    cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \
      ./target/debug/codex \
        --enable responses_websockets_v2 \
        --profile byok \
        --full-auto
    ```
  • chore: rm web-search-eligible header (#10660)
    default-enablement of web_search is now client-side, no need to send
    eligibility headers to backend.
    
    Tested locally, headers no longer sent.
    
    will wait for corresponding backend change to deploy before merging
  • fix(auth): isolate chatgptAuthTokens concept to auth manager and app-server (#10423)
    So that the rest of the codebase (like the TUI) doesn't need to be
    concerned with whether ChatGPT auth was handled by Codex itself or
    passed in via app-server's external auth mode.
  • Session-level model client (#10664)
    Make ModelClient a session-scoped object.
    Move state that is session level onto the client, and make state that is
    per-turn explicit on corresponding methods.
    Stop taking a huge Config object, instead only pass in values that are
    actually needed.
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • Move metadata calculation out of client (#10589)
    Model client shouldn't be responsible for this.
  • chore: add phase to message responseitem (#10455)
    ### What
    
    add wiring for `phase` field on `ResponseItem::Message` to lay
    groundwork for differentiating model preambles and final messages.
    currently optional.
    
    follows pattern in #9698.
    
    updated schemas with `just write-app-server-schema` so we can see type
    changes.
    
    ### Tests
    Updated existing tests for SSE parsing and hydrating from history
  • make codex better at git (#10145)
    adds basic git context to the session prefix so the model can anchor git
    actions and be a bit more version-aware. structured it in a
    multiroot-friendly shape even though we only have one root today
  • chore: rename ChatGpt -> Chatgpt in type names (#10244)
    When using ChatGPT in names of types, we should be consistent, so this
    renames some types with `ChatGpt` in the name to `Chatgpt`. From
    https://rust-lang.github.io/api-guidelines/naming.html:
    
    > In `UpperCamelCase`, acronyms and contractions of compound words count
    as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize`
    or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and
    contractions are lower-cased: `is_xid_start`.
    
    This PR updates existing uses of `ChatGpt` and changes them to
    `Chatgpt`. In every case where this could affect the wire format, I
    visually confirmed that nothing changes there. That said, this
    _will_ change the codegen because it will affect the spelling of type
    names.
    
    For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in
    `app-server-protocol`, but the wire format is still `"chatgpt"`.
    
    This PR also updates a number of types in `codex-rs/core/src/auth.rs`.
  • Remove WebSocket wire format (#10179)
    I'd like WireApi to go away (when chat is removed), and WebSockets is
    still the Responses API, just over a different transport.
  • Fall back to http when websockets fail (#10139)
    I expect not all proxies work with websockets, so fall back to HTTP if
    websockets fail.
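    
    In outline, the fallback looks like "try the websocket path, and on
    any error retry over HTTP". The sketch below is a generic illustration
    with hypothetical names (`Transport`, `send_with_fallback`); it does
    not reflect how the client decides which errors trigger the fallback.
    
    ```rust
    #[derive(Debug, PartialEq)]
    enum Transport {
        WebSocket,
        Http,
    }
    
    /// Try the websocket path first; if it fails, retry the request over HTTP.
    /// Returns which transport produced the value.
    fn send_with_fallback<T, E>(
        via_websocket: impl FnOnce() -> Result<T, E>,
        via_http: impl FnOnce() -> Result<T, E>,
    ) -> Result<(Transport, T), E> {
        match via_websocket() {
            Ok(value) => Ok((Transport::WebSocket, value)),
            // The websocket error is discarded here; a real client would log it.
            Err(_) => via_http().map(|value| (Transport::Http, value)),
        }
    }
    ```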
  • fix: enable per-turn updates to web search mode (#10040)
    web_search can now be updated per-turn, for things like changes to
    sandbox policy.
    
    `SandboxPolicy::DangerFullAccess` now sets web_search to `live`, and the
    default is still `cached`.
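    
    The policy-to-mode mapping described above can be sketched as a single
    `match`. The enums here are simplified stand-ins (the real
    `SandboxPolicy` variants may carry fields), and `web_search_mode_for`
    is a hypothetical name.
    
    ```rust
    /// Simplified stand-in for the sandbox policy; only `DangerFullAccess`
    /// is named in the description above.
    #[derive(Clone, Copy, Debug, PartialEq)]
    enum SandboxPolicy {
        ReadOnly,
        WorkspaceWrite,
        DangerFullAccess,
    }
    
    #[derive(Clone, Copy, Debug, PartialEq)]
    enum WebSearchMode {
        Cached,
        Live,
    }
    
    /// Recomputed on each turn so a sandbox policy change takes effect
    /// on the next request.
    fn web_search_mode_for(policy: SandboxPolicy) -> WebSearchMode {
        match policy {
            SandboxPolicy::DangerFullAccess => WebSearchMode::Live,
            _ => WebSearchMode::Cached,
        }
    }
    ```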
    
    Added integration tests.
  • make cached web_search client-side default (#9974)
    [Experiment](https://console.statsig.com/50aWbk2p4R76rNX9lN5VUw/experiments/codex_web_search_rollout/summary)
    for default cached `web_search` completed; cached chosen as default.
    
    Update client to reflect that.