Commit Graph

314 Commits

  • feat(cli): add sandbox profile config controls (#20118)
    ## Why
    
    The explicit profile path from #20117 is meant for standalone testing,
    but it still inherited the shell cwd and all managed requirements
    implicitly. The pre-existing launcher path even called out that it did
    not support a separate cwd yet in
    [`debug_sandbox.rs`](https://github.com/openai/codex/blob/509453f688a30929432be866402d1ea46aa12169/codex-rs/cli/src/debug_sandbox.rs#L174-L179).
    
    For a standalone command, the useful default is to let the caller choose
    the project directory being
    tested and to avoid administrator-provided constraints unless the caller
    explicitly wants to test
    those too.
    
    ## What changed
    
    - Add explicit-profile-only `-C/--cd DIR`, and use that cwd for both
    profile resolution and command
      execution.
    - Add explicit-profile-only `--include-managed-config`.
    - Make explicit profile mode skip managed requirement sources by
    default, including cloud
    requirements, MDM requirements, `/etc/codex/requirements.toml`, and the
    legacy managed-config
      requirements projection.
    - Preserve all existing invocations outside the explicit-profile path.
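    
    The source-skipping rule in the list above can be sketched as a small
    pure function. This is only an illustration under stated assumptions:
    `RequirementSource`, `MANAGED_SOURCES`, and `managed_sources_to_load` are
    hypothetical names, not the loader's actual API.
    
    ```rust
    /// Hypothetical labels for the managed requirement sources listed above.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum RequirementSource {
        Cloud,
        Mdm,
        /// Stands for `/etc/codex/requirements.toml`.
        SystemFile,
        LegacyManagedConfig,
    }
    
    /// Every managed source, in an arbitrary illustrative order.
    pub const MANAGED_SOURCES: [RequirementSource; 4] = [
        RequirementSource::Cloud,
        RequirementSource::Mdm,
        RequirementSource::SystemFile,
        RequirementSource::LegacyManagedConfig,
    ];
    
    /// Explicit profile mode skips managed sources unless the caller opts
    /// back in; every other invocation keeps loading them as before.
    pub fn managed_sources_to_load(
        explicit_profile: bool,
        include_managed_config: bool,
    ) -> Vec<RequirementSource> {
        if explicit_profile && !include_managed_config {
            Vec::new()
        } else {
            MANAGED_SOURCES.to_vec()
        }
    }
    ```
    
    In this sketch, `managed_sources_to_load(true, false)` is empty, while
    passing `--include-managed-config` (the second argument) restores all
    four sources.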
    
    ## Stack
    
    1. #20117 `sandbox-ui-profile`
    2. #20118 `sandbox-ui-config` --> this PR
    
    Both PRs are additive. Replay JSON is intentionally deferred to a
    follow-up design pass.
    
    ## Tests ran
    
    - `cargo test -p codex-cli debug_sandbox`
    - `cargo test -p codex-cli sandbox_macos_`
    - `cargo test -p codex-core
    load_config_layers_can_ignore_managed_requirements`
    - `cargo test -p codex-core
    load_config_layers_includes_cloud_requirements`
    - macOS branch-binary smoke on the rebased top of stack: `-C` changed
    execution cwd, explicit
    profile mode omitted managed proxy env under `env -i`, and
    `--include-managed-config` restored it.
    - Linux devbox branch-binary smoke on the rebased top of stack: `-C`
    changed execution cwd for
      built-in and user-defined explicit profiles.
  • feat(cli): add explicit sandbox permission profiles (#20117)
    ## Why
    
    `codex sandbox` is useful for exercising sandbox behavior directly, but
    before this stack the CLI
    only picked up permission profiles indirectly from the active config.
    The existing debug-sandbox path
    already compiled `[permissions]` profiles through normal config loading,
    as covered by the existing
    profile tests in
    [`debug_sandbox.rs`](https://github.com/openai/codex/blob/de2ccf94735a3d8a2a7077e6a5292026413867cf/codex-rs/cli/src/debug_sandbox.rs#L715-L760).
    
    This adds the smallest stable entry point first: an explicit profile
    selector that reuses the same
    config machinery as normal Codex config, so standalone testing becomes
    possible without changing
    current no-selector behavior.
    
    ## What changed
    
    - Add additive `--permissions-profile NAME` support to `codex sandbox
    macos|linux|windows`.
    - Resolve built-in and user-defined profile names by feeding
    `default_permissions` through the
    existing config compilation path instead of inventing a sandbox-only
    parser.
    - Make an explicit selector win over an ambient active profile's legacy
    `sandbox_mode`.
    - Keep the existing no-selector behavior unchanged.
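    
    The precedence described above (explicit selector first, ambient legacy
    `sandbox_mode` second, unchanged default otherwise) could look roughly
    like this. `SandboxSelection` and `resolve_selection` are hypothetical
    names used only for illustration; the real change feeds
    `default_permissions` through the existing config compilation path.
    
    ```rust
    /// Hypothetical result of deciding which permissions to run with.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum SandboxSelection {
        /// Name passed via `--permissions-profile NAME`.
        Profile(String),
        /// Legacy `sandbox_mode` taken from the ambient active profile.
        LegacySandboxMode(String),
        /// No selector and no ambient mode: existing default behavior.
        Default,
    }
    
    /// An explicit selector always wins over the ambient legacy mode.
    pub fn resolve_selection(
        explicit_profile: Option<&str>,
        ambient_sandbox_mode: Option<&str>,
    ) -> SandboxSelection {
        match (explicit_profile, ambient_sandbox_mode) {
            (Some(name), _) => SandboxSelection::Profile(name.to_string()),
            (None, Some(mode)) => SandboxSelection::LegacySandboxMode(mode.to_string()),
            (None, None) => SandboxSelection::Default,
        }
    }
    ```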
    
    ## Stack
    
    1. #20117 `sandbox-ui-profile` --> this PR
    2. #20118 `sandbox-ui-config`
    
    Both PRs are additive. Replay JSON is intentionally deferred to a
    follow-up design pass.
    
    ## Tests ran
    
    - `cargo test -p codex-cli debug_sandbox`
    - `cargo test -p codex-cli sandbox_macos_parses_permissions_profile`
    - `cargo test -p codex-core
    cli_override_takes_precedence_over_profile_sandbox_mode`
    - macOS branch-binary smoke on the rebased top of stack: built-in
    `:workspace` and user-defined
      profiles both executed successfully through `--permissions-profile`.
    - Linux devbox branch-binary smoke on the rebased top of stack: built-in
    `:workspace` and
    user-defined profiles both executed successfully through
    `--permissions-profile`.
  • chore(cli) deprecate --full-auto (#20133)
    ## Summary
    Starts the process of getting rid of `--full-auto`, with some
    concessions:
    1. Fully removes the flag from the TUI, since it just resolves to the
    default permissions there, and encourages users to use the one-time
    trust flow if they're not in a trusted repo.
    2. Marks the flag as deprecated in `codex exec`, in case users are
    actively relying on it. We'll remove it in an upcoming n+X release.
    3. Cleans up some of the `codex sandbox` CLI logic, to keep supporting
    legacy sandbox policies for now.
    
    This isn't the cleanest setup, but I think it is worthwhile to warn
    users for one release before hard-removing it.
    
    ## Testing 
    - [x] Updated unit tests
  • linux-sandbox: switch helper plumbing to PermissionProfile (#20106)
    ## Why
    
    `PermissionProfile` is the canonical runtime permission model in the
    Rust workspace, but the Linux sandbox helper still accepted a legacy
    `SandboxPolicy` plus separate filesystem and network policy flags. That
    translation layer made the helper interface harder to reason about and
    left `linux-sandbox`-specific callers and tests coupled to the legacy
    policy representation.
    
    This change moves the helper onto `PermissionProfile` directly so the
    Linux sandbox plumbing matches the rest of the permission stack.
    
    ## What changed
    
    - changed `codex-linux-sandbox` to accept `--permission-profile` and
    derive the runtime filesystem and network policies internally
    - updated the in-process seccomp and legacy Landlock path in
    `codex-rs/linux-sandbox` to operate on `PermissionProfile`
    - updated Linux sandbox argv construction in `codex-rs/sandboxing`,
    `codex-rs/core`, and the CLI debug sandbox path to pass the canonical
    profile instead of serializing compatibility policy projections
    - simplified the Linux sandbox tests to build the exact permission
    profile under test, including the managed-proxy path and
    direct-runtime-enforcement carveout coverage
    - removed helper-local `SandboxPolicy` usage from `bwrap` tests where
    `FileSystemSandboxPolicy` is already the value being exercised
    
    ## Testing
    
    - `cargo test -p codex-sandboxing`
    - `cargo test -p codex-linux-sandbox` (on this macOS host, the crate
    compiled cleanly and its Linux-only tests were cfg-gated)
    - `cargo test -p codex-core --no-run`
    - `cargo test -p codex-cli --no-run`
  • feat: split memories part 2 (#19860)
    Keep extracting memories out of core and move the write trigger into
    the app-server.
    This is temporary; the trigger should move to the client level in a
    follow-up.
    This makes core fully independent of `codex-memories-write`.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add codex update command (#19933)
    ## Why
    
    Addresses #9274
    
    Running `codex update` currently starts an interactive Codex session
    with `update` as the prompt. That is a rough edge for users who expect a
    direct self-update command after seeing the existing update notice, and
    it forces them to copy the suggested package-manager command manually.
    
    ## What changed
    
    - Added a top-level `codex update` subcommand.
    - Reused the existing install-channel detection and update command
    runner that the TUI already uses for update prompts.
    - Exposed the update-action lookup from `codex-tui` so the CLI can
    invoke the same behavior.
    - Added CLI coverage to ensure `codex update` is parsed as a subcommand
    instead of becoming an interactive prompt.
    
    ## Verification
    
    - `cargo test -p codex-cli`
    - `cargo test -p codex-tui update_action::tests`
  • refactor: make auth loading async (#19762)
    ## Summary
    
    Auth loading used to expose synchronous construction helpers in several
    places even though some auth sources now need async work. This PR makes
    the auth-loading surface async and updates the callers to await it.
    
    This is intentionally only plumbing. It does not change how
    AgentIdentity tokens are decoded, how task runtime ids are allocated, or
    how JWT signatures are verified.
    
    ## Stack
    
    1. **This PR:** [refactor: make auth loading
    async](https://github.com/openai/codex/pull/19762)
    2. [refactor: load AgentIdentity runtime
    eagerly](https://github.com/openai/codex/pull/19763)
    3. [feat: verify AgentIdentity JWTs with
    JWKS](https://github.com/openai/codex/pull/19764)
    
    ## Important call sites
    
    | Area | Change |
    | --- | --- |
    | `codex-login` auth loading | `CodexAuth` and `AuthManager`
    construction paths now await auth loading. |
    | app-server startup | Auth manager construction is awaited during
    initialization. |
    | CLI/TUI/exec/MCP/chatgpt callers | Existing auth-loading calls now
    await the same behavior. |
    | cloud requirements storage loader | The loader becomes async so it can
    share the same auth construction path. |
    | auth tests | Tests that load auth now run in async contexts. |
    
    ## Testing
    
    Tests: targeted Rust auth test compilation, formatter, scoped Clippy
    fix, and Bazel lock check.
  • permissions: centralize legacy sandbox projection (#19734)
    ## Why
    
    The remaining migration work still needs `SandboxPolicy` at a few
    compatibility boundaries, but those projections should come from one
    canonical path. Keeping ad hoc legacy projections scattered through
    app-server, CLI, and config code makes it easy for behavior to drift as
    `PermissionProfile` gains fidelity that the legacy enum cannot
    represent.
    
    ## What Changed
    
    - Adds `Permissions::legacy_sandbox_policy(cwd)` and
    `Config::legacy_sandbox_policy()` as the compatibility projection from
    the canonical `PermissionProfile`.
    - Adds `Permissions::can_set_legacy_sandbox_policy()` so legacy inputs
    are checked after they are converted into profile semantics.
    - Updates app-server command handling, Windows sandbox setup, session
    configuration, and sandbox summaries to use the centralized projection
    helper.
    - Leaves `SandboxPolicy` in place only for boundary inputs/outputs that
    still speak the legacy abstraction.
    
    ## Verification
    
    - `cargo check -p codex-config -p codex-core -p codex-sandboxing -p
    codex-app-server -p codex-cli -p codex-tui`
    - `cargo test -p codex-tui
    permissions_selection_history_snapshot_full_access_to_default --
    --nocapture`
    - `cargo test -p codex-tui
    permissions_selection_sends_approvals_reviewer_in_override_turn_context
    -- --nocapture`
    - `bazel test //codex-rs/tui:tui-unit-tests-bin
    --test_arg=permissions_selection_history_snapshot_full_access_to_default
    --test_output=errors`
    - `bazel test //codex-rs/tui:tui-unit-tests-bin
    --test_arg=permissions_selection_sends_approvals_reviewer_in_override_turn_context
    --test_output=errors`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19734).
    * #19737
    * #19736
    * #19735
    * __->__ #19734
  • permissions: migrate approval and sandbox consumers to profiles (#19393)
    ## Why
    
    Runtime decisions should not infer permissions from the lossy legacy
    sandbox projection once `PermissionProfile` is available. In particular,
    `Disabled` and `External` need to remain distinct, and managed profiles
    with split filesystem or deny-read rules should not be collapsed before
    approval, network, safety, or analytics code makes decisions.
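    
    A toy model of why the projection is lossy, assuming for illustration
    that a legacy view maps both `Disabled` and `External` to the same
    value. `ProfileKind`, `LegacyView`, and the helper functions are
    hypothetical and much simpler than the real `PermissionProfile`.
    
    ```rust
    /// Hypothetical three-way classification of a permission profile.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ProfileKind {
        Disabled,
        External,
        Managed,
    }
    
    /// Hypothetical legacy view that cannot tell the first two apart.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum LegacyView {
        Unrestricted,
        Restricted,
    }
    
    /// Lossy projection: two distinct profile kinds collapse to one value.
    pub fn legacy_view(kind: ProfileKind) -> LegacyView {
        match kind {
            ProfileKind::Disabled | ProfileKind::External => LegacyView::Unrestricted,
            ProfileKind::Managed => LegacyView::Restricted,
        }
    }
    
    /// Decision taken from the profile itself, before any projection.
    pub fn managed_sandbox_active(kind: ProfileKind) -> bool {
        matches!(kind, ProfileKind::Managed)
    }
    
    /// Only answerable from the profile: the legacy view has lost this.
    pub fn enforced_externally(kind: ProfileKind) -> bool {
        kind == ProfileKind::External
    }
    ```
    
    Code that consults `legacy_view` sees `Disabled` and `External` as
    identical, while code that consults the profile directly can still
    distinguish them.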
    
    ## What Changed
    
    - Changes managed network proxy setup and network approval logic to use
    `PermissionProfile` when deciding whether a managed sandbox is active.
    - Migrates patch safety, Guardian/user-shell approval paths, Landlock
    helper setup, analytics sandbox classification, and selected
    turn/session code to profile-backed permissions.
    - Validates command-level profile overrides against the constrained
    `PermissionProfile` rather than a strict `SandboxPolicy` round trip.
    - Preserves configured deny-read restrictions when command profiles are
    narrowed.
    - Adds coverage for profile-backed trust, network proxy/approval
    behavior, patch safety, analytics classification, and command-profile
    narrowing.
    
    ## Verification
    
    - `cargo test -p codex-core direct_write_roots`
    - `cargo test -p codex-core runtime_roots_to_legacy_projection`
    - `cargo test -p codex-app-server
    requested_permissions_trust_project_uses_permission_profile_intent`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19393).
    * #19395
    * #19394
    * __->__ #19393
  • [codex] Move config loading into codex-config (#19487)
    ## Why
    
    Config loading had become split across crates: `codex-config` owned the
    config types and merge logic, while `codex-core` still owned the loader
    that assembled the layer stack. This change consolidates that
    responsibility in `codex-config`, so the crate that defines config
    behavior also owns how configs are discovered and loaded.
    
    To make that move possible without reintroducing the old dependency
    cycle, the shell-environment policy types and helpers that
    `codex-exec-server` needs now live in `codex-protocol` instead of
    flowing through `codex-config`.
    
    This also makes the migrated loader tests more deterministic on machines
    that already have managed or system Codex config installed by letting
    tests override the system config and requirements paths instead of
    reading the host's `/etc/codex`.
    
    ## What Changed
    
    - moved the config loader implementation from `codex-core` into
    `codex-config::loader` and deleted the old `core::config_loader` module
    instead of leaving a compatibility shim
    - moved shell-environment policy types and helpers into
    `codex-protocol`, then updated `codex-exec-server` and other downstream
    crates to import them from their new home
    - updated downstream callers to use loader/config APIs from
    `codex-config`
    - added test-only loader overrides for system config and requirements
    paths so loader-focused tests do not depend on host-managed config state
    - cleaned up now-unused dependency entries and platform-specific cfgs
    that were surfaced by post-push CI
    
    ## Testing
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-core config_loader_tests::`
    - `cargo test -p codex-protocol -p codex-exec-server -p
    codex-cloud-requirements -p codex-rmcp-client --lib`
    - `cargo test --lib -p codex-app-server-client -p codex-exec`
    - `cargo test --no-run --lib -p codex-app-server`
    - `cargo test -p codex-linux-sandbox --lib`
    - `cargo shear`
    - `just bazel-lock-check`
    
    ## Notes
    
    - I did not chase unrelated full-suite failures outside the migrated
    loader surface.
    - `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
    failures on this machine, and Windows CI still shows unrelated
    long-running/timeouting test noise outside the loader migration itself.
  • permissions: derive compatibility policies from profiles (#19392)
    ## Why
    
    After #19391, `PermissionProfile` and the split filesystem/network
    policies could still be stored in parallel. That creates drift risk: a
    profile can preserve deny globs, external enforcement, or split
    filesystem entries while a cached projection silently loses those
    details. This PR makes the profile the runtime source and derives
    compatibility views from it.
    
    ## What Changed
    
    - Removes stored filesystem/network sandbox projections from
    `Permissions` and `SessionConfiguration`; their accessors now derive
    from the canonical `PermissionProfile`.
    - Derives legacy `SandboxPolicy` snapshots from profiles only where an
    older API still needs that field.
    - Updates MCP connection and elicitation state to track
    `PermissionProfile` instead of `SandboxPolicy` for auto-approval
    decisions.
    - Adds semantic filesystem-policy comparison so cwd changes can preserve
    richer profiles while still recognizing equivalent legacy projections
    independent of entry ordering.
    - Updates config/session tests to assert profile-derived projections
    instead of parallel stored fields.
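    
    One way to compare filesystem policies independent of entry ordering is
    to compare them as sets. The sketch below assumes a simplified,
    hypothetical entry shape (`FsEntry`, `Access`); the real policy types
    carry more detail, and this version also ignores duplicate entries.
    
    ```rust
    use std::collections::BTreeSet;
    
    /// Hypothetical access levels for a filesystem entry.
    #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
    pub enum Access {
        Read,
        Write,
        Deny,
    }
    
    /// Hypothetical filesystem policy entry: a path plus its access level.
    #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
    pub struct FsEntry {
        pub path: String,
        pub access: Access,
    }
    
    /// Two policies are treated as equivalent when they contain the same
    /// entries, regardless of the order in which the entries are listed.
    pub fn semantically_equal(a: &[FsEntry], b: &[FsEntry]) -> bool {
        let left: BTreeSet<&FsEntry> = a.iter().collect();
        let right: BTreeSet<&FsEntry> = b.iter().collect();
        left == right
    }
    ```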
    
    ## Verification
    
    - `cargo test -p codex-core direct_write_roots`
    - `cargo test -p codex-core runtime_roots_to_legacy_projection`
    - `cargo test -p codex-app-server
    requested_permissions_trust_project_uses_permission_profile_intent`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19392).
    * #19395
    * #19394
    * #19393
    * __->__ #19392
  • feat: load AgentIdentity from JWT login/env (#18904)
    ## Summary
    
    This PR lets programmatic AgentIdentity users provide one token through
    either stdin login or environment auth.
    
    `codex login --with-agent-identity` reads an Agent Identity JWT from
    stdin, validates that it has the required claims, and stores that token
    as the `agent_identity` value in `auth.json`. The file format is
    token-only; the decoded account and key fields are runtime state, not
    hand-authored auth.json fields.
    
    The Agent Identity JWT claim shape and decoder live in
    `codex-agent-identity`; `codex-login` only owns env/storage precedence
    and conversion into `CodexAuth::AgentIdentity`.
    
    When env auth is enabled, `CODEX_AGENT_IDENTITY` can provide the same
    JWT without writing auth state to disk. `CODEX_API_KEY` still wins if
    both env vars are set.
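    
    The env precedence can be sketched as follows. `EnvAuth` and
    `auth_from_env` are hypothetical names, and the sketch takes the two
    values as arguments instead of reading the process environment; it does
    not decode or validate the JWT.
    
    ```rust
    /// Hypothetical outcome of resolving auth from environment variables.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum EnvAuth {
        /// Value of `CODEX_API_KEY`.
        ApiKey(String),
        /// Raw JWT from `CODEX_AGENT_IDENTITY`.
        AgentIdentity(String),
    }
    
    /// `CODEX_API_KEY` wins when both variables are set.
    pub fn auth_from_env(
        api_key: Option<&str>,
        agent_identity: Option<&str>,
    ) -> Option<EnvAuth> {
        if let Some(key) = api_key {
            return Some(EnvAuth::ApiKey(key.to_string()));
        }
        agent_identity.map(|jwt| EnvAuth::AgentIdentity(jwt.to_string()))
    }
    ```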
    
    Reference old stack: https://github.com/openai/codex/pull/17387/changes
    Reference JWT/env stack: https://github.com/openai/codex/pull/18176
    
    ## Stack
    
    1. https://github.com/openai/codex/pull/18757: full revert
    2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
    crate
    3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
    auth mode and startup task allocation
    4. https://github.com/openai/codex/pull/18811: migrate Codex backend
    auth callsites through AuthProvider
    5. This PR: accept AgentIdentity JWTs through login/env
    
    ## Testing
    
    Tests: targeted login and Agent Identity crate tests, CLI checks, scoped
    formatter/linter cleanup, and CI.
    
    ---------
    
    Co-authored-by: Shijie Rao <shijie.rao@openai.com>
  • [codex] remove responses command (#19640)
    This removes the hidden `codex responses` CLI subcommand after
    confirming no downstream callers rely on it, deleting the raw Responses
    passthrough implementation, unregistering the subcommand, and dropping
    the now-unused CLI dependencies on `codex-api` and
    `codex-model-provider`.
  • Support end_turn in response.completed (#19610)
    Some Responses API providers forward a model-defined `end_turn` boolean
    that states explicitly whether the model wants to end the turn or be
    sampled again. This PR updates the sampling loop to honor this field
    when it is set. If the provider does not set the field, we fall back to
    the existing sampling logic.
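    
    A minimal sketch of that decision, assuming the fallback is "end the
    turn when no tool calls are pending". That heuristic and the name
    `should_end_turn` are placeholders for illustration; the actual
    fallback is whatever the existing sampling logic already does.
    
    ```rust
    /// Decide whether the sampling loop should stop after this response.
    ///
    /// `end_turn` is the optional provider-forwarded flag;
    /// `has_pending_tool_calls` stands in for the existing (hypothetical
    /// here) fallback signal.
    pub fn should_end_turn(end_turn: Option<bool>, has_pending_tool_calls: bool) -> bool {
        match end_turn {
            // The provider forwarded an explicit signal: use it as-is.
            Some(explicit) => explicit,
            // No signal: fall back to the pre-existing heuristic.
            None => !has_pending_tool_calls,
        }
    }
    ```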
  • feat: let model providers own model discovery (#18950)
    ## Why
    
    `codex-models-manager` had grown to own provider-specific concerns:
    constructing OpenAI-compatible `/models` requests, resolving provider
    auth, emitting request telemetry, and deciding how provider catalogs
    should be sourced. That made the manager harder to reuse for providers
    whose model catalog is not fetched from the OpenAI `/models` endpoint,
    such as Amazon Bedrock.
    
    This change moves provider-specific model discovery behind
    provider-owned implementations, so the models manager can focus on
    refresh policy, cache behavior, picker ordering, and model metadata
    merging.
    
    ## What Changed
    
    - Introduced a `ModelsManager` trait with separate `OpenAiModelsManager`
    and `StaticModelsManager` implementations.
    - Added `ModelsEndpointClient` so OpenAI-compatible HTTP fetching lives
    outside `codex-models-manager`.
    - Moved `/models` request construction, provider auth resolution,
    timeout handling, and request telemetry into `codex-model-provider` via
    `OpenAiModelsEndpoint`.
    - Added provider-owned `models_manager(...)` construction so configured
    OpenAI-compatible providers use `OpenAiModelsManager`, while
    static/catalog-backed providers can return `StaticModelsManager`.
    - Added an Amazon Bedrock static model catalog for the GPT OSS Bedrock
    model IDs.
    - Updated core/session/thread manager code and tests to depend on
    `Arc<dyn ModelsManager>`.
    - Moved offline model test helpers into
    `codex_models_manager::test_support`.
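    
    The trait-object shape described above might look like the following
    simplified sketch. The method `list_models`, the constructor, and
    `default_model` are assumptions made for illustration, and the sketch is
    synchronous; the real `ModelsManager` trait likely differs.
    
    ```rust
    use std::sync::Arc;
    
    /// Simplified stand-in for the trait callers depend on.
    pub trait ModelsManager: Send + Sync {
        /// Model IDs in picker order (hypothetical method).
        fn list_models(&self) -> Vec<String>;
    }
    
    /// Catalog-backed manager that never performs a network fetch.
    pub struct StaticModelsManager {
        models: Vec<String>,
    }
    
    impl StaticModelsManager {
        pub fn new(models: &[&str]) -> Self {
            Self {
                models: models.iter().map(|m| m.to_string()).collect(),
            }
        }
    }
    
    impl ModelsManager for StaticModelsManager {
        fn list_models(&self) -> Vec<String> {
            self.models.clone()
        }
    }
    
    /// Callers hold `Arc<dyn ModelsManager>` and stay provider-agnostic.
    pub fn default_model(manager: &Arc<dyn ModelsManager>) -> Option<String> {
        manager.list_models().into_iter().next()
    }
    ```
    
    A static provider would build its manager from a fixed list of model
    IDs, and callers would only ever see the trait object.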
    
    ## Metadata References
    
    The Bedrock catalog metadata is based on the official Amazon Bedrock
    OpenAI model documentation:
    
    - [Amazon Bedrock OpenAI
    models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-openai.html)
    lists the Bedrock model IDs, text input/output modalities, and `128,000`
    token context window for `gpt-oss-20b` and `gpt-oss-120b`.
    - [Amazon Bedrock `gpt-oss-120b` model
    card](https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-openai-gpt-oss-120b.html)
    lists the `bedrock-runtime` model ID `openai.gpt-oss-120b-1:0`, the
    `bedrock-mantle` model ID `openai.gpt-oss-120b`, text-only modalities,
    and `128K` context window.
    - [OpenAI `gpt-oss-120b` model
    docs](https://developers.openai.com/api/docs/models/gpt-oss-120b)
    document configurable reasoning effort with `low`, `medium`, and `high`,
    plus text input/output modality.
    
    The display names, default reasoning effort, and priority ordering are
    Codex-local catalog choices.
    
    ## Test Plan
    - Manually verified app-server model listing with an AWS profile:
    
    ```shell
    CODEX_HOME="$(mktemp -d)" cargo run -p codex-app-server-test-client -- \
      --codex-bin ./target/debug/codex \
      -c 'model_provider="amazon-bedrock"' \
      -c 'model_providers.amazon-bedrock.aws.profile="codex-bedrock"' \
      -c 'model_providers.amazon-bedrock.aws.region="us-west-2"' \
      model-list
    ```
    
    The response returned the Bedrock catalog with `openai.gpt-oss-120b-1:0`
    as the default model and `openai.gpt-oss-20b-1:0` as the second listed
    model, both text-only and supporting low/medium/high reasoning effort.
  • [rollout_trace] Add debug trace reduction command (#18880)
    ## Summary
    
    Adds the debug CLI entry point for reducing recorded rollout traces.
    This gives developers a direct way to inspect whether the emitted trace
    stream reduces into the expected conversation/runtime model.
    
    ## Stack
    
    This is PR 5/5 in the rollout trace stack.
    
    - [#18876](https://github.com/openai/codex/pull/18876): Add rollout
    trace crate
    - [#18877](https://github.com/openai/codex/pull/18877): Record core
    session rollout traces
    - [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
    code-mode boundaries
    - [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
    and multi-agent edges
    - [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
    reduction command
    
    ## Review Notes
    
    This PR is intentionally last: it depends on the trace crate, core
    recorder, runtime/tool events, and session/agent edge data all existing.
    The command should remain a debug/developer tool and avoid adding new
    runtime behavior.
    
    The useful review question is whether the CLI exposes the reducer in the
    smallest practical way for local inspection without turning the debug
    command into a supported user-facing workflow.
  • refactor: route Codex auth through AuthProvider (#18811)
    ## Summary
    
    This PR moves Codex backend request authentication from direct
    bearer-token handling to `AuthProvider`.
    
    The new `codex-auth-provider` crate defines the shared request-auth
    trait. `CodexAuth::provider()` returns a provider that can apply all
    headers needed for the selected auth mode.
    
    This lets ChatGPT token auth and AgentIdentity auth share the same
    callsite path:
    - ChatGPT token auth applies bearer auth plus account/FedRAMP headers
    where needed.
    - AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
    where needed.
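    
    The shared callsite path can be illustrated with a small trait sketch.
    Everything here is an assumption for illustration: the `apply`
    signature, the `AgentAssertion` authorization scheme string, and the
    `x-example-account-id` header name are placeholders, not the actual
    `codex-auth-provider` API or wire format.
    
    ```rust
    use std::collections::BTreeMap;
    
    pub type Headers = BTreeMap<String, String>;
    
    /// Hypothetical request-auth trait: each auth mode applies its headers.
    pub trait AuthProvider {
        fn apply(&self, headers: &mut Headers);
    }
    
    pub struct BearerAuth {
        pub token: String,
        pub account_id: Option<String>,
    }
    
    pub struct AgentAssertionAuth {
        pub assertion: String,
        pub account_id: Option<String>,
    }
    
    /// Adds the placeholder account header only when an account is known.
    fn apply_account(headers: &mut Headers, account_id: Option<&str>) {
        if let Some(id) = account_id {
            headers.insert("x-example-account-id".to_string(), id.to_string());
        }
    }
    
    impl AuthProvider for BearerAuth {
        fn apply(&self, headers: &mut Headers) {
            headers.insert("authorization".to_string(), format!("Bearer {}", self.token));
            apply_account(headers, self.account_id.as_deref());
        }
    }
    
    impl AuthProvider for AgentAssertionAuth {
        fn apply(&self, headers: &mut Headers) {
            headers.insert(
                "authorization".to_string(),
                format!("AgentAssertion {}", self.assertion),
            );
            apply_account(headers, self.account_id.as_deref());
        }
    }
    
    /// A callsite only needs the trait, not the concrete auth mode.
    pub fn build_headers(provider: &dyn AuthProvider) -> Headers {
        let mut headers = Headers::new();
        provider.apply(&mut headers);
        headers
    }
    ```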
    
    Reference old stack: https://github.com/openai/codex/pull/17387/changes
    
    ## Callsite Migration
    
    | Area | Change |
    | --- | --- |
    | backend-client | accepts an `AuthProvider` instead of a raw
    token/header |
    | chatgpt client/connectors | applies auth through
    `CodexAuth::provider()` |
    | cloud tasks | keeps Codex-backend gating, applies auth through
    provider |
    | cloud requirements | uses Codex-backend auth checks and provider
    headers |
    | app-server remote control | applies provider headers for backend calls
    |
    | MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches
    from generic account getters |
    | model refresh | treats AgentIdentity as Codex-backend auth |
    | OpenAI file upload path | rejects non-Codex-backend auth before
    applying headers |
    | core client setup | keeps model-provider auth flow and allows
    AgentIdentity through provider-backed OpenAI auth |
    
    ## Stack
    
    1. https://github.com/openai/codex/pull/18757: full revert
    2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
    crate
    3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
    auth mode and startup task allocation
    4. This PR: migrate Codex backend auth callsites through AuthProvider
    5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
    and load `CODEX_AGENT_IDENTITY`
    
    ## Testing
    
    Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
  • Move marketplace add/remove and startup sync out of core. (#19099)
    Move more things to core-plugins.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • app-server: add Unix socket transport (#18255)
    ## Summary
    - add unix:// app-server transport backed by the shared codex-uds crate
    - reuse the websocket connection loop for axum and tungstenite-backed
    streams
    - add codex app-server proxy to bridge stdio clients to the control
    socket
    - tolerate Windows UDS backends that report a missing rendezvous path as
    connection refused before binding
    
    ## Tests
    - cargo test -p codex-app-server
    control_socket_acceptor_forwards_websocket_text_messages_and_pings
    - cargo test -p codex-app-server
    - just fmt
    - just fix -p codex-app-server
    - git -c core.fsmonitor=false diff --check
  • [codex] Fix plugin marketplace help usage (#18710)
    ## Summary
    - Updates generated CLI help for plugin marketplace commands to show the
    full `codex plugin marketplace ...` namespace.
    - Adds a regression test covering the marketplace command and its `add`,
    `upgrade`, and `remove` help pages.
    
    ## Root Cause
    The marketplace parser already lived under `codex plugin marketplace`,
    but Clap generated usage text from the child parser's standalone command
    name. That made help output show stale `codex marketplace ...`
    instructions even though the top-level `codex marketplace` command no
    longer parses.
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-cli`
    - `./target/debug/codex plugin marketplace --help`
  • Add safety check notification and error handling (#19055)
    Adds a new app-server notification that fires when a user account has
    been flagged for potential safety reasons.
  • use long-lived sessions for codex sandbox windows (#18953)
    `codex sandbox windows` previously did a one-shot spawn for all
    commands.
    This change uses the `unified_exec` session to spawn long-lived
    processes instead, and implements a simple bridge to forward stdin to
    the spawned session and stdout/stderr from the spawned session back to
    the caller.
    
    It also fixes a bug with the new shared spawn context code where the
    "no-network env" was being applied to both elevated and unelevated
    sandbox spawns. It should only be applied for the unelevated sandbox
    because the elevated one uses firewall rules instead of an env-based
    network suppression strategy.
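    
    The corrected behavior can be sketched as a pure function over the spawn
    environment. `SandboxLevel`, `spawn_env`, and the variable name
    `EXAMPLE_NO_NETWORK` are hypothetical; the real "no-network env" may set
    different or additional variables.
    
    ```rust
    /// Hypothetical distinction between the two Windows sandbox flavors.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum SandboxLevel {
        Elevated,
        Unelevated,
    }
    
    /// Placeholder name for an env-based network suppression marker.
    pub const NO_NETWORK_VAR: &str = "EXAMPLE_NO_NETWORK";
    
    /// Only the unelevated sandbox receives the no-network env; the
    /// elevated sandbox relies on firewall rules instead.
    pub fn spawn_env(level: SandboxLevel, base: &[(String, String)]) -> Vec<(String, String)> {
        let mut env = base.to_vec();
        if level == SandboxLevel::Unelevated {
            env.push((NO_NETWORK_VAR.to_string(), "1".to_string()));
        }
        env
    }
    ```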
  • feat: add explicit AgentIdentity auth mode (#18785)
    ## Summary
    
    This PR adds `CodexAuth::AgentIdentity` as an explicit auth mode.
    
    An AgentIdentity auth record is a standalone `auth.json` mode. When
    `AuthManager::auth().await` loads that mode, it registers one
    process-scoped task and stores it in runtime-only state on the auth
    value. Header creation stays synchronous after that because the task is
    initialized before callers receive the auth object.
    
    This PR also removes the old feature flag path. AgentIdentity is
    selected by explicit auth mode, not by a hidden flag or lazy mutation of
    ChatGPT auth records.
    
    Reference old stack: https://github.com/openai/codex/pull/17387/changes
    
    ## Design Decisions
    
    - AgentIdentity is a real auth enum variant because it can be the only
    credential in `auth.json`.
    - The process task is ephemeral runtime state. It is not serialized and
    is not stored in rollout/session data.
    - Account/user metadata needed by existing Codex backend checks lives on
    the AgentIdentity record for now.
    - `is_chatgpt_auth()` remains token-specific.
    - `uses_codex_backend()` is the broader predicate for ChatGPT-token auth
    and AgentIdentity auth.
    
    ## Stack
    
    1. https://github.com/openai/codex/pull/18757: full revert
    2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
    crate
    3. This PR: explicit AgentIdentity auth mode and startup task allocation
    4. https://github.com/openai/codex/pull/18811: migrate Codex backend
    auth callsites through AuthProvider
    5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
    and load `CODEX_AGENT_IDENTITY`
    
    ## Testing
    
    Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
  • Stabilize debug clear memories integration test (#18858)
    ## Why
    
    `debug_clear_memories_resets_state_and_removes_memory_dir` can be flaky
    because the test drops its `sqlx::SqlitePool` immediately before
    invoking `codex debug clear-memories`. Dropping the pool does not wait
    for all SQLite connections to close, so the CLI can race with still-open
    test connections.
    
    ## What changed
    
    - Await `pool.close()` before spawning `codex debug clear-memories`.
    - Close the reopened verification pool before the temp `CODEX_HOME` is
    torn down.
    
    ## Verification
    
    - `cargo test -p codex-cli --test debug_clear_memories
    debug_clear_memories_resets_state_and_removes_memory_dir`
  • Fix exec inheritance of root shared flags (#18630)
    Addresses #18113
    
    Problem: Shared flags provided before the exec subcommand were parsed by
    the root CLI but not inherited by the exec CLI, so exec sessions could
    run with stale or default sandbox and model configuration.
    
    Solution: Move shared TUI and exec flags into a common option block and
    merge root selections into exec before dispatch, while preserving exec's
    global subcommand flag behavior.
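
    The merge step can be sketched roughly as below. The struct and its two fields are hypothetical, and the precedence rule (a value given after `exec` wins over one given before it) is an assumption, not something this PR states.

    ```rust
    /// Hypothetical subset of the shared flags; the real option block is larger.
    #[derive(Debug, Clone, Default, PartialEq, Eq)]
    pub struct SharedOptions {
        pub model: Option<String>,
        pub sandbox: Option<String>,
    }

    impl SharedOptions {
        /// Fill every flag the exec subcommand left unset from the root selection.
        /// Assumption: exec-level values take precedence when both are set.
        pub fn inherit_from(&mut self, root: &SharedOptions) {
            if self.model.is_none() {
                self.model = root.model.clone();
            }
            if self.sandbox.is_none() {
                self.sandbox = root.sandbox.clone();
            }
        }
    }
    ```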
  • uds: add async Unix socket crate (#18254)
    ## Summary
    - add a codex-uds crate with async UnixListener and UnixStream wrappers
    - expose helpers for private socket directory setup and stale socket
    path checks
    - migrate codex-stdio-to-uds onto codex-uds and Tokio-based stdio/socket
    relaying
    - update the CLI stdio-to-uds command path for the async runner
    
    ## Tests
    - cargo test -p codex-uds -p codex-stdio-to-uds
    - cargo test -p codex-cli
    - just fmt
    - just fix -p codex-uds
    - just fix -p codex-stdio-to-uds
    - just fix -p codex-cli
    - just bazel-lock-check
    - git diff --check
  • Use thread IDs in TUI resume hints (#18440)
    ## Summary
    
    Fixes #18313.
    
    Recent TUI resume breadcrumbs could print a thread title instead of the
    stable thread UUID. For sessions whose title was auto-derived from the
    first prompt, that made the suggested codex resume command look like it
    should resume a long prompt rather than the session ID.
    
    This updates the TUI and CLI post-exit resume hints, plus the in-session
    summary shown when switching/forking threads, to always use the stable
    thread ID for these recovery breadcrumbs. Explicit name-based resume
    support remains available elsewhere.
  • Support codex app on macOS (Intel) and Windows (#18500)
    ## Summary
    
    `codex app` should be a platform-aware entry point for opening Codex
    Desktop or helping users install it. Before this change, the command
    only existed on macOS and its default installer URL always pointed at
    the Apple Silicon DMG, which sent Intel Mac users to the wrong build.
    
    This updates the macOS path to choose the Apple Silicon or Intel DMG
    based on the detected processor, while keeping `--download-url` as an
    advanced override. It also enables `codex app` on Windows, where the CLI
    opens an installed Codex Desktop app when available and otherwise opens
    the Windows installer URL.
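
    A rough sketch of the macOS selection logic, assuming the processor is identified by an architecture string such as `std::env::consts::ARCH`. The function names and the `example.invalid` URLs are placeholders; the real detection method and installer locations are not given here.

    ```rust
    /// Which macOS installer image to offer.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum MacInstaller {
        AppleSilicon,
        Intel,
    }

    /// Map an architecture string to an installer flavor.
    /// Returns `None` for architectures this sketch does not know about.
    pub fn mac_installer_for(arch: &str) -> Option<MacInstaller> {
        match arch {
            "aarch64" => Some(MacInstaller::AppleSilicon),
            "x86_64" => Some(MacInstaller::Intel),
            _ => None,
        }
    }

    /// An explicit `--download-url` value always wins over the detected default.
    pub fn resolve_download_url(override_url: Option<&str>, arch: &str) -> Option<String> {
        if let Some(url) = override_url {
            return Some(url.to_string());
        }
        // Placeholder URLs, not the real installer locations.
        mac_installer_for(arch).map(|installer| match installer {
            MacInstaller::AppleSilicon => "https://example.invalid/Codex-arm64.dmg".to_string(),
            MacInstaller::Intel => "https://example.invalid/Codex-x64.dmg".to_string(),
        })
    }
    ```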
    
    ---------
    
    Co-authored-by: Felipe Coury <felipe.coury@openai.com>
  • [5/6] Wire executor-backed MCP stdio (#18212)
    ## Summary
    - Add the executor-backed RMCP stdio transport.
    - Wire MCP stdio placement through the executor environment config.
    - Cover local and executor-backed stdio paths with the existing MCP test
    helpers.
    
    ## Stack
    ```text
    o  #18027 [6/6] Fail exec client operations after disconnect
    │
    @  #18212 [5/6] Wire executor-backed MCP stdio
    │
    o  #18087 [4/6] Abstract MCP stdio server launching
    │
    o  #18020 [3/6] Add pushed exec process events
    │
    o  #18086 [2/6] Support piped stdin in exec process API
    │
    o  #18085 [1/6] Add MCP server environment config
    │
    o  main
    ```
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Add marketplace remove command and shared logic (#17752)
    ## Summary
    
    Move the marketplace remove implementation into shared core logic so
    both the CLI command and follow-up app-server RPC can reuse the same
    behavior.
    
    This change:
    - adds a shared `codex_core::plugins::remove_marketplace(...)` flow
    - moves validation, config removal, and installed-root deletion out of
    the CLI
    - keeps the CLI as a thin wrapper over the shared implementation
    - adds focused core coverage for the shared remove path
    
    ## Validation
    
    - `just fmt`
    - focused local coverage for the shared remove path
    - heavier follow-up validation deferred to stacked PR CI
  • [codex] Revoke ChatGPT tokens on logout (#17825)
    ## Summary
    
    This changes Codex logout so managed ChatGPT auth is revoked against
    AuthAPI before local auth state is removed. CLI logout, TUI `/logout`,
    and the app-server account logout path now use the token-revoking logout
    flow instead of only deleting `auth.json` / credential store state.
    
    ## Root Cause
    
    Logout previously cleared only local auth storage. That removed Codex's
    local credentials but did not ask the backend to invalidate the
    refresh/access token state associated with a managed ChatGPT login.
    
    ## Behavior
    
    For managed ChatGPT auth, logout sends the stored refresh token to
    `https://auth.openai.com/oauth/revoke` with `token_type_hint:
    refresh_token` and the Codex OAuth client id, then deletes all local
    auth stores after revocation succeeds. If only an access token is
    available, it falls back to revoking that access token. API key auth and
    externally supplied `chatgptAuthTokens` are still only cleared locally
    because Codex does not own a refresh token for those modes.
    
    Revocation failures are fail-closed: if Codex cannot load stored auth or
    the backend revoke call fails, logout returns an error and leaves local
    auth in place so the user can retry instead of silently clearing local
    state while backend tokens remain valid.
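
    The fail-closed ordering can be sketched as follows. `StoredAuth`, the in-memory `Option` standing in for the local auth stores, and the `revoke` callback are all hypothetical; the real flow makes an HTTP call and handles more auth modes. The `access_token` hint string is the standard OAuth revocation value and is an assumption about the fallback path.

    ```rust
    /// Hypothetical view of the locally stored managed ChatGPT credentials.
    pub struct StoredAuth {
        pub refresh_token: Option<String>,
        pub access_token: Option<String>,
    }

    /// Revoke first, delete second. `revoke` receives `(token, token_type_hint)`.
    /// If revocation fails, the error is returned and `store` is left untouched
    /// so the user can retry.
    pub fn logout<F>(store: &mut Option<StoredAuth>, revoke: F) -> Result<(), String>
    where
        F: FnOnce(&str, &str) -> Result<(), String>,
    {
        let Some(auth) = store.as_ref() else {
            return Ok(()); // nothing stored, nothing to revoke
        };
        // Prefer the refresh token; fall back to the access token.
        let target = match (&auth.refresh_token, &auth.access_token) {
            (Some(token), _) => Some((token.as_str(), "refresh_token")),
            (None, Some(token)) => Some((token.as_str(), "access_token")),
            (None, None) => None,
        };
        if let Some((token, hint)) = target {
            revoke(token, hint)?; // early return keeps local auth in place
        }
        *store = None; // only reached after revocation succeeded
        Ok(())
    }
    ```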
    
    ## Validation
    Ran a local build of `codex-cli` with staging overrides and an auth
    harness, then ran `codex login` followed by `codex logout`.
    
    `auth.json` was cleared and the backend revocation endpoints were called:
    
    ```
    POST /oauth/revoke
    status: 200
    
    revoking access token
    should clear auth session
    clearing auth session due to token revocation
    successfully revoked session and access token
    CANONICAL-API-LINE Response: status='200' method='POST' path='/oauth/revoke
    ```
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add opt-in provider runtime abstraction (#17713)
    ## Summary
    
    - Add `codex-model-provider` as the runtime home for model-provider
    behavior that does not belong in `codex-core`, `codex-login`, or
    `codex-api`.
    - The new crate wraps configured `ModelProviderInfo` in a
    `ModelProvider` trait object that can resolve the API provider config,
    provider-scoped auth manager, and request auth provider for each call.
    - This centralizes provider auth behavior in one place today, and gives
    us an extension point for future provider-specific auth, model listing,
    request setup, and related runtime behavior.
    
    ## Tests
    Ran tests manually to confirm that provider auth under different
    configs still works as expected.
    
    ---------
    
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • Stream apply_patch changes (#17862)
    Adds new events for streaming `apply_patch` changes from the Responses
    API, so that clients can show progress during file writes.
    
    Caveat: this does not work with `apply_patch` in function call mode,
    since that would require streaming JSON parsing.
  • Move marketplace add under plugin command (#18116)
    ## Summary
    - move the marketplace add CLI from `codex marketplace add` to `codex
    plugin marketplace add`
    - keep marketplace config overrides working through the nested plugin
    command
    - reject `--sparse` for local marketplace directory sources before the
    local-source install path bypasses git-source validation
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - `cargo test -p codex-cli`
    - `cargo test -p codex-core marketplace_add -- --nocapture`
    - `cargo test -p codex-core
    install_plugin_updates_config_with_relative_path_and_plugin_key --
    --nocapture`
    - `xli-test-marketplace-cli` local isolated matrix: `T1`, `L1`-`L10`
  • Add server-level approval defaults for custom MCP servers (#17843)
    ## Summary
    - Add `default_tools_approval_mode` support for custom MCP server
    configs, matching the existing `codex_apps` behavior
    - Apply approval precedence as per-tool override, then server default,
    then `auto`
    - Update config serialization, CLI display, schema generation, docs, and
    tests
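
    The precedence rule above can be written as a small fallback chain. This is a sketch: only the `auto` mode is named in this PR, so the other enum variants and the function name are illustrative.

    ```rust
    /// Illustrative approval modes; only `auto` is named in the PR text.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ApprovalMode {
        Auto,
        Prompt,
        Approve,
    }

    /// Per-tool override first, then the server-level default, then `auto`.
    pub fn effective_approval_mode(
        tool_override: Option<ApprovalMode>,
        server_default: Option<ApprovalMode>,
    ) -> ApprovalMode {
        tool_override
            .or(server_default)
            .unwrap_or(ApprovalMode::Auto)
    }
    ```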
    
    ## Testing
    - `cargo check -p codex-config`
    - `cargo check -p codex-core`
    - `just write-config-schema`
    - `just fmt`
    - `cargo test -p codex-config`
    - Targeted `codex-core` tests for config parsing, config writes, and MCP
    approval precedence
    - `just fix -p codex-config -p codex-core`
  • Auto-upgrade configured marketplaces (#17425)
    ## Summary
    - Add best-effort auto-upgrade for user-configured Git marketplaces
    recorded in `config.toml`.
    - Track the last activated Git revision with `last_revision` so
    unchanged marketplace sources skip clone work.
    - Trigger the upgrade from plugin startup and `plugin/list`, while
    preserving existing fail-open plugin behavior with warning logs rather
    than new user-visible errors.
    
    ## Details
    - Remote configured marketplaces use `git ls-remote` to compare the
    source/ref against the recorded revision.
    - Upgrades clone into a staging directory, validate that
    `.agents/plugins/marketplace.json` exists and that the manifest name
    matches the configured marketplace key, then atomically activate the new
    root.
    - Local `.agents/plugins/marketplace.json` marketplaces remain live
    filesystem state and are not auto-pulled.
    - Existing non-curated plugin cache refresh is kicked after successful
    marketplace root upgrades.
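
    A sketch of the revision check that lets unchanged sources skip the clone. It assumes the usual `git ls-remote` line format of a hash, a tab, then the ref name. Both function names are hypothetical, and treating an unknown remote revision as "skip" is an assumption in keeping with the best-effort behavior described above.

    ```rust
    /// Extract the commit hash advertised for `wanted_ref` from `git ls-remote`
    /// output, where each line is `<hash>\t<ref name>`.
    pub fn remote_revision<'a>(ls_remote_output: &'a str, wanted_ref: &str) -> Option<&'a str> {
        ls_remote_output.lines().find_map(|line| {
            let (hash, name) = line.split_once('\t')?;
            (name == wanted_ref).then_some(hash)
        })
    }

    /// Decide whether clone work is needed.
    pub fn needs_upgrade(last_revision: Option<&str>, remote: Option<&str>) -> bool {
        match (last_revision, remote) {
            // Remote state unknown: skip rather than fail (assumption).
            (_, None) => false,
            // Never activated before: upgrade.
            (None, Some(_)) => true,
            // Upgrade only when the recorded revision differs.
            (Some(last), Some(current)) => last != current,
        }
    }
    ```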
    
    ## Validation
    - `just write-config-schema`
    - `cargo test -p codex-core marketplace_upgrade`
    - `cargo check -p codex-cli -p codex-app-server`
    - `just fix -p codex-core`
    
    Did not run the complete `cargo test` suite because the repo
    instructions require asking before a full core workspace run.
  • [1/8] Add MCP server environment config (#18085)
    ## Summary
    - Add an MCP server environment setting with local as the default.
    - Thread the default through config serialization, schema generation,
    and existing config fixtures.
    
    ## Stack
    ```text
    o  #18027 [8/8] Fail exec client operations after disconnect
    │
    o  #18025 [7/8] Cover MCP stdio tests with executor placement
    │
    o  #18089 [6/8] Wire remote MCP stdio through executor
    │
    o  #18088 [5/8] Add executor process transport for MCP stdio
    │
    o  #18087 [4/8] Abstract MCP stdio server launching
    │
    o  #18020 [3/8] Add pushed exec process events
    │
    o  #18086 [2/8] Support piped stdin in exec process API
    │
    @  #18085 [1/8] Add MCP server environment config
    │
    o  main
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: unify memory drop endpoints (#18134)
    Unify all memory drop endpoints behind a single implementation that
    drops both the main memories and the extensions.
  • Significantly improve standalone installer (#17022)
    ## Summary
    
    This PR significantly improves the standalone installer experience.
    
    The main changes are:
    
    1. We now install the codex binary and other dependencies in a
    subdirectory under CODEX_HOME.
    (`CODEX_HOME/packages/standalone/releases/...`)
    
    2. We replace the `codex.js` launcher that npm/bun rely on with logic in
    the Rust binary that automatically resolves its dependencies (like
    ripgrep)
    
    ## Motivation
    
    A few design constraints pushed this work.
    
    1. Currently, the entrypoint to codex is `codex.js`, which forces a
    Node dependency just to launch our Rust app. We want to move away from
    this so that the entrypoint to codex does not rely on Node or external
    package managers.
    2. Right now, the native script adds codex and its dependencies directly
    to the user's PATH. Given that codex is likely to add more binary
    dependencies than ripgrep, we want a solution that does not add
    arbitrary binaries to the user's PATH: the only one we want to add is
    the `codex` command itself.
    3. We want upgrades to be atomic. Interrupting an upgrade command should
    never leave codex in an undefined state (for example, a new codex binary
    alongside an old ripgrep binary). This could happen with the old script.
    4. Currently, the Rust binary uses heuristics to determine which
    installer created it. These heuristics are flaky and are tied to the
    `codex.js` launcher. We need a more stable/deterministic way to
    determine how the binary was installed for standalone.
    5. We do not want conflicting codex installations on PATH. For example,
    the user installing via npm, then installing via brew, then installing
    via standalone would make it unclear which version of codex is being
    launched and make it tough for us to determine the right upgrade
    command.
    
    ## Design
    
    ### Standalone package layout
    
    Standalone installs now live under `CODEX_HOME/packages/standalone`:
    
    ```text
    $CODEX_HOME/
      packages/
        standalone/
          current -> releases/0.111.0-x86_64-unknown-linux-musl
          releases/
            0.111.0-x86_64-unknown-linux-musl/
              codex
              codex-resources/
                rg
    ```
    
    where `standalone/current` is a symlink to a release directory.
    
    On Windows, the release directory has the same shape, with `.exe` names
    and Windows helpers in `codex-resources`:
    
    ```text
    %CODEX_HOME%\
      packages\
        standalone\
          current -> releases\0.111.0-x86_64-pc-windows-msvc
          releases\
            0.111.0-x86_64-pc-windows-msvc\
              codex.exe
              codex-resources\
                rg.exe
                codex-command-runner.exe
                codex-windows-sandbox-setup.exe
    ```
    
    This gives us:
    - atomic upgrades because we can fully stage a release before switching
    `standalone/current`
    - a stable way for the binary to recognize a standalone install from its
    canonical `current_exe()` path under CODEX_HOME
    - a clean place for binary dependencies like `rg`, Windows sandbox
    helpers, and, in the future, our custom `zsh` etc
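
    One common way to get the atomic switch on Unix is to create a temporary symlink and rename it over `current`. The sketch below shows that technique in Rust for illustration only; the installer itself is a shell script, and the function name and temporary link name are assumptions.

    ```rust
    use std::fs;
    use std::io;
    use std::path::Path;

    /// Point `<standalone_root>/current` at `releases/<release_name>`.
    /// The release must already be fully staged. The final `rename` replaces
    /// the old link in one step, so readers see either the old or the new target.
    #[cfg(unix)]
    pub fn activate_release(standalone_root: &Path, release_name: &str) -> io::Result<()> {
        let target = Path::new("releases").join(release_name);
        if !standalone_root.join(&target).is_dir() {
            return Err(io::Error::new(io::ErrorKind::NotFound, "release is not staged"));
        }
        // Hypothetical temporary name; clear leftovers from an interrupted run.
        let tmp = standalone_root.join(".current.tmp");
        let _ = fs::remove_file(&tmp);
        // Relative target, matching the `current -> releases/...` layout.
        std::os::unix::fs::symlink(&target, &tmp)?;
        fs::rename(&tmp, standalone_root.join("current"))
    }
    ```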
    
    ### Command location
    
    On Unix, we add a symlink at `~/.local/bin/codex` which points directly
    to the `$CODEX_HOME/packages/standalone/current/codex` binary. This
    becomes the main entrypoint for the CLI.
    
    On Windows, we store the link at
    `%LOCALAPPDATA%\Programs\OpenAI\Codex\bin`.
    
    ### PATH persistence
    
    This is a tricky part of the PR, as there is no fully reliable way to
    ensure that we end up on PATH without significant tradeoffs.
    
    Most Unix variants already have `~/.local/bin` on PATH, so in most cases
    simply registering the command there *should* be enough. When it is not
    on PATH, we edit the profile file for the current OS and shell directly:
    
    - macOS zsh: `~/.zprofile`
    - macOS bash: `~/.bash_profile`
    - Linux zsh: `~/.zshrc`
    - Linux bash: `~/.bashrc`
    - fallback: `~/.profile`
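
    The table above amounts to a lookup on OS and shell. It is shown here as a Rust sketch for illustration; the installer is a shell script, and the function name and the `"macos"`/`"linux"` and `"zsh"`/`"bash"` keys are assumptions.

    ```rust
    /// Return the profile file to edit for a given OS and shell name.
    pub fn profile_file(os: &str, shell: &str) -> &'static str {
        match (os, shell) {
            ("macos", "zsh") => "~/.zprofile",
            ("macos", "bash") => "~/.bash_profile",
            ("linux", "zsh") => "~/.zshrc",
            ("linux", "bash") => "~/.bashrc",
            // Any other OS or shell combination.
            _ => "~/.profile",
        }
    }
    ```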
    
    On Windows, we update the User `Path` environment variable directly and
    we don't need to worry about shell profiles.
    
    ### Standalone runtime detection
    
    This PR adds a new shared crate, `codex-install-context`, which computes
    install ownership once per process and caches it in a `OnceLock`.
    
    That context includes:
    - install manager (`Standalone`, `Npm`, `Bun`, `Brew`, `Other`)
    - the managed standalone release directory, when applicable
    - the managed standalone `codex-resources` directory, when present
    - the resolved `rg_command`
    
    The standalone path is detected by canonicalizing `current_exe()`,
    canonicalizing CODEX_HOME via `find_codex_home()`, and checking whether
    the binary is running from under
    `$CODEX_HOME/packages/standalone/releases`.
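
    A sketch of the path check, assuming both inputs were already canonicalized by the caller. The function name is hypothetical, and the real crate also resolves `current_exe()` and CODEX_HOME itself.

    ```rust
    use std::path::{Path, PathBuf};

    /// Return the managed release directory if `exe` lives under
    /// `<codex_home>/packages/standalone/releases/<release>/`, else `None`.
    pub fn standalone_release_dir(exe: &Path, codex_home: &Path) -> Option<PathBuf> {
        let releases = codex_home
            .join("packages")
            .join("standalone")
            .join("releases");
        // `strip_prefix` compares whole path components, not raw string prefixes.
        let relative = exe.strip_prefix(&releases).ok()?;
        let release_name = relative.components().next()?;
        // Require at least `<release>/<binary>`, not the release directory alone.
        (relative.components().count() > 1).then(|| releases.join(release_name))
    }
    ```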
    
    We intentionally do not use a release metadata file. The binary path is
    the source of truth.
    
    ### Dependency resolution
    
    For standalone installs, `grep_files` now resolves bundled `rg` from
    `codex-resources` next to the Codex binary.
    
    For npm/bun/brew/other installs, `grep_files` falls back to resolving
    `rg` from PATH.
    
    For Windows standalone installs, Windows sandbox helpers are still found
    as direct siblings when present. If they are not direct siblings, the
    lookup also checks the sibling `codex-resources` directory.
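
    The `rg` resolution rule can be sketched as below. The function name is hypothetical; `resources_dir` stands for the managed `codex-resources` directory, which is only present for standalone installs.

    ```rust
    use std::path::{Path, PathBuf};

    /// Resolve the `rg` command: the bundled binary for standalone installs,
    /// otherwise the bare name so the OS looks it up on PATH at spawn time.
    pub fn resolve_rg(resources_dir: Option<&Path>) -> PathBuf {
        let bundled_name = if cfg!(windows) { "rg.exe" } else { "rg" };
        match resources_dir {
            Some(dir) => dir.join(bundled_name),
            None => PathBuf::from("rg"),
        }
    }
    ```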
    
    ### TUI update path
    
    The TUI now has `UpdateAction::StandaloneUnix` and
    `UpdateAction::StandaloneWindows`, which rerun the standalone install
    commands.
    
    Unix update command:
    
    ```sh
    sh -c "curl -fsSL https://chatgpt.com/codex/install.sh | sh"
    ```
    
    Windows update command:
    
    ```powershell
    powershell -c "irm https://chatgpt.com/codex/install.ps1|iex"
    ```
    
    The Windows updater runs PowerShell directly. We do this because `cmd
    /C` would parse the `|iex` as a cmd pipeline instead of passing it to
    PowerShell.
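
    A minimal sketch of building that command with `std::process::Command`, which passes each argument to the child process without going through `cmd`. The function and constant names are hypothetical; the sketch only constructs the command and does not run it.

    ```rust
    use std::process::Command;

    /// The installer pipeline, kept as one string so PowerShell parses the `|`.
    pub const WINDOWS_INSTALL_SCRIPT: &str = "irm https://chatgpt.com/codex/install.ps1|iex";

    /// Build (but do not run) the standalone Windows update command.
    pub fn windows_update_command() -> Command {
        let mut cmd = Command::new("powershell");
        // The whole pipeline is a single argument to `-c`.
        cmd.arg("-c").arg(WINDOWS_INSTALL_SCRIPT);
        cmd
    }
    ```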
    
    ## Additional installer behavior
    
    - standalone installs now warn about conflicting npm/bun/brew-managed
    `codex` installs and offer to uninstall them
    - same-version reruns do not redownload the release if it is already
    staged locally
    
    ## Testing
    
    Installer smoke tests run:
    - macOS: fresh install into isolated `HOME` and `CODEX_HOME` with
    `scripts/install/install.sh --release latest`
    - macOS: reran the installer against the same isolated install to verify
    the same-version/update path and PATH block idempotence
    - macOS: verified the installed `codex --version` and bundled
    `codex-resources/rg --version`
    - Windows: parsed `scripts/install/install.ps1` with PowerShell via
    `[scriptblock]::Create(...)`
    - Windows: verified the standalone update action builds a direct
    PowerShell command and does not route the `irm ...|iex` command through
    `cmd /C`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Support Unix socket allowlists in macOS sandbox (#17654)
    ## Changes
    
    Allows sandboxes to restrict overall network access while granting
    access to specific Unix sockets on macOS.
    
    ## Details
    
    - `codex sandbox macos`: adds a repeatable `--allow-unix-socket` option.
    - `codex-sandboxing`: threads explicit Unix socket roots into the macOS
    Seatbelt profile generation.
    - Preserves restricted network behavior when only Unix socket IPC is
    requested, and preserves full network behavior when full network is
    already enabled.
    
    ## Verification
    
    - `cargo test -p codex-cli -p codex-sandboxing`
    - `cargo build -p codex-cli --bin codex`
    - verified that `codex sandbox macos --allow-unix-socket /tmp/test.sock
    -- test-client` grants access as expected
  • [codex] Support local marketplace sources (#17756)
    ## Summary
    
    - Port marketplace source support into the shared core marketplace-add
    flow
    - Support local marketplace directory sources
    - Support direct `marketplace.json` URL sources
    - Persist the new source types in config/schema and cover them in CLI
    and app-server tests
    
    ## Validation
    
    - `cargo test -p codex-core marketplace_add`
    - `cargo test -p codex-cli marketplace_add`
    - `cargo test -p codex-app-server marketplace_add`
    - `just write-config-schema`
    - `just fmt`
    - `just fix -p codex-core`
    - `just fix -p codex-cli`
    
    ## Context
    
    Current `main` moved marketplace-add behavior into shared core code and
    still assumed only git-backed sources. This change keeps that structure
    but restores support for local directories and direct manifest URLs in
    the shared path.
  • feat: codex sampler (#17784)
    Add a pure sampler that uses the Codex auth and model config. It is
    intended for use by other binaries, such as the tape recorder.
  • Refactor plugin loading to async (#17747)
    Simplifies skills migration.
  • [codex] Refactor marketplace add into shared core flow (#17717)
    ## Summary
    
    Move `codex marketplace add` onto a shared core implementation so the
    CLI and app-server path can use one source of truth.
    
    This change:
    - adds shared marketplace-add orchestration in `codex-core`
    - switches the CLI command to call that shared implementation
    - removes duplicated CLI-only marketplace add helpers
    - preserves focused parser and add-path coverage while moving the shared
    behavior into core tests
    
    ## Why
    
    The new `marketplace/add` RPC should reuse the same underlying
    marketplace-add flow as the CLI. This refactor lands that consolidation
    first so the follow-up app-server PR can be mostly protocol and handler
    wiring.
    
    ## Validation
    
    - `cargo test -p codex-core marketplace_add`
    - `cargo test -p codex-cli marketplace_cmd`
    - `just fix -p codex-core`
    - `just fix -p codex-cli`
    - `just fmt`
  • Add supports_parallel_tool_calls flag to included mcps (#17667)
    ## Why
    
    For more advanced MCP usage, we want the model to be able to emit
    parallel MCP tool calls and have Codex execute eligible ones
    concurrently, instead of forcing all MCP calls through the serial block.
    
    The main design choice was where to thread the config. I made this
    server-level because parallel safety depends on the MCP server
    implementation. Codex reads the flag from `mcp_servers`, threads the
    opted-in server names into `ToolRouter`, and checks the parsed
    `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
    on model-visible tool names, which can be incomplete in
    deferred/search-tool paths or ambiguous for similarly named
    servers/tools.
    
    ## What was added
    
    Added `supports_parallel_tool_calls` for MCP servers.
    
    Before:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    ```
    
    After:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    supports_parallel_tool_calls = true
    ```
    
    MCP calls remain serial by default. Only tools from opted-in servers are
    eligible to run in parallel. Docs also now warn to enable this only when
    the server’s tools are safe to run concurrently, especially around
    shared state or read/write races.
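
    The execution-time check can be sketched like this. `ToolPayload::Mcp { server, .. }` is named above; the other variant, the `tool` field, and the function name are placeholders, and the set of opted-in server names stands in for whatever `ToolRouter` stores.

    ```rust
    use std::collections::HashSet;

    /// Simplified stand-in for the parsed tool payload.
    pub enum ToolPayload {
        Mcp { server: String, tool: String },
        /// Hypothetical non-MCP variant; its parallelism rules are out of scope here.
        Function { name: String },
    }

    /// An MCP call is eligible for parallel execution only if its exact server
    /// name opted in via `supports_parallel_tool_calls`.
    pub fn mcp_call_supports_parallel(
        payload: &ToolPayload,
        parallel_servers: &HashSet<String>,
    ) -> bool {
        match payload {
            ToolPayload::Mcp { server, .. } => parallel_servers.contains(server),
            _ => false,
        }
    }
    ```

    Matching on the parsed server name rather than the model-visible tool name means a server called `docs2` does not inherit the flag from `docs`.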
    
    ## Testing
    
    Tested with a local stdio MCP server exposing real delay tools. The
    model/Responses side was mocked only to deterministically emit two MCP
    calls in the same turn.
    
    Each test called `query_with_delay` and `query_with_delay_2` with `{
    "seconds": 25 }`.
    
    | Build/config | Observed | Wall time |
    | --- | --- | --- |
    | main with flag enabled | serial | `58.79s` |
    | PR with flag enabled | parallel | `31.73s` |
    | PR without flag | serial | `56.70s` |
    
    PR with flag enabled showed both tools start before either completed;
    main and PR-without-flag completed the first delay before starting the
    second.
    
    Also added an integration test.
    
    Additional checks:
    
    - `cargo test -p codex-tools` passed
    - `cargo test -p codex-core
    mcp_parallel_support_uses_exact_payload_server` passed
    - `git diff --check` passed