112 Commits

  • [codex] Classify nested MCP authentication startup errors (#30257)
    ## Summary
    
    - classify authentication-required RMCP startup failures, including
    errors nested inside `ClientInitializeError::TransportError`
    - let `codex-mcp` consume that classification so the existing
    `reauthenticationRequired` startup failure reason is emitted
    - add a regression test that performs real startup with an expired
    persisted OAuth token and no refresh token
    
    ## Why
    
    Follow-up to #29877.
    
    RMCP stores streamable HTTP initialization failures inside a dynamic
    transport error whose payload is not exposed through the standard Rust
    error source chain. The original `anyhow::Error::chain()` check
    therefore missed the nested `AuthError::AuthorizationRequired` seen
    during real MCP startup and emitted `failureReason: null`.
    
    The transport-specific inspection now lives in `codex-rmcp-client`,
    while `codex-mcp` consumes only the domain-level authentication-required
    result. This classifier does not distinguish first-time login from
    reauthentication; the existing auth-state logic remains responsible for
    that distinction.
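    
    A minimal sketch of why walking the standard source chain alone misses
    the nested error. The types and function names below
    (`AuthorizationRequired`, `DynamicTransportError`, `is_auth_required`)
    are hypothetical stand-ins, not the RMCP or Codex API; the sketch only
    assumes a transport error that keeps its payload outside `source()`.
    
    ```rust
    use std::any::Any;
    use std::error::Error;
    use std::fmt;
    
    // Hypothetical stand-in for the authentication error.
    #[derive(Debug)]
    pub struct AuthorizationRequired;
    
    impl fmt::Display for AuthorizationRequired {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "authorization required")
        }
    }
    impl Error for AuthorizationRequired {}
    
    // Hypothetical transport error whose payload is NOT reachable via `source()`.
    #[derive(Debug)]
    pub struct DynamicTransportError {
        pub payload: Box<dyn Any + Send + Sync>,
    }
    
    impl fmt::Display for DynamicTransportError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "transport error")
        }
    }
    impl Error for DynamicTransportError {} // default `source()` returns None
    
    // Chain-only check: looks at each error reachable through `source()`.
    pub fn chain_has_auth_required(err: &(dyn Error + 'static)) -> bool {
        let mut current = Some(err);
        while let Some(e) = current {
            if e.is::<AuthorizationRequired>() {
                return true;
            }
            current = e.source();
        }
        false
    }
    
    // Transport-aware check: also inspects the dynamic payload at each link.
    pub fn is_auth_required(err: &(dyn Error + 'static)) -> bool {
        let mut current = Some(err);
        while let Some(e) = current {
            if e.is::<AuthorizationRequired>() {
                return true;
            }
            if let Some(transport) = e.downcast_ref::<DynamicTransportError>() {
                if transport.payload.is::<AuthorizationRequired>() {
                    return true;
                }
            }
            current = e.source();
        }
        false
    }
    ```
    
    In this sketch the transport-specific downcast stays in one function, so
    callers only see a boolean domain-level answer.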
    
    ## User impact
    
    When stored MCP OAuth credentials are expired and cannot be refreshed,
    app clients now receive `failureReason: "reauthenticationRequired"` on
    the failed startup update and can show the reconnect action. First-time
    login and unrelated startup failures remain unchanged.
    
    ## Validation
    
    - `just test -p codex-rmcp-client --test streamable_http_oauth_startup
    identifies_expired_unrefreshable_token_startup_error`
    - `just test -p codex-mcp
    startup_outcome_error_identifies_authentication_required`
    - `just test -p codex-mcp
    mcp_startup_failure_reason_requires_existing_oauth_and_auth_failure`
    - `cargo build -p codex-cli --bin codex`
    - local app-server probe emitted `failureReason:
    "reauthenticationRequired"`
    - manual end-to-end reconnect flow confirmed
    - `just fmt`
  • [codex] Surface MCP reauthentication-required startup failures (#29877)
    ## Summary
    
    - distinguish expired, non-refreshable stored MCP OAuth credentials from
    first-time missing credentials
    - carry a typed `failureReason: "reauthenticationRequired"` on the
    existing `mcpServer/startupStatus/updated` notification only when user
    action is required
    - keep the public MCP auth-status API unchanged and regenerate the
    app-server protocol schemas and documentation
    
    ## Why
    
    An MCP server with an expired access token and no usable refresh token
    currently fails startup without giving clients a reliable, typed
    recovery signal.
    
    The existing startup-status notification is the natural place to carry
    this state. Its nullable `failureReason` keeps the recovery reason
    attached to the failed startup transition without adding a one-off
    notification. Internally, Codex distinguishes first-time login from
    reauthentication and emits the reason only when the startup error itself
    requires authentication.
    
    ## User impact
    
    App clients can prompt an existing user to reconnect an MCP server when
    automatic recovery is impossible by handling a failed
    `mcpServer/startupStatus/updated` notification whose `failureReason` is
    `reauthenticationRequired`. Starting, ready, cancelled, unrelated
    failures, and first-time setup carry no reauthentication reason.
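    
    The rule above can be sketched as a small decision function. The enum
    and function names are hypothetical; only the wire value
    `reauthenticationRequired` comes from this PR.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum StartupFailureReason {
        ReauthenticationRequired,
    }
    
    impl StartupFailureReason {
        // Wire value carried in `failureReason`.
        pub fn as_wire_str(self) -> &'static str {
            match self {
                StartupFailureReason::ReauthenticationRequired => "reauthenticationRequired",
            }
        }
    }
    
    // Emit a reason only when stored OAuth credentials existed AND the startup
    // error itself required authentication. Every other case stays `None`
    // (serialized as `null`).
    pub fn startup_failure_reason(
        had_stored_oauth: bool,
        error_requires_auth: bool,
    ) -> Option<StartupFailureReason> {
        if had_stored_oauth && error_requires_auth {
            Some(StartupFailureReason::ReauthenticationRequired)
        } else {
            None
        }
    }
    ```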
    
    ## Companion app PR
    
    - openai/openai#1069582
    
    ## Validation
    
    - `just test -p codex-app-server-protocol` — 248 passed; schema fixture
    tests passed
    - `cargo check -p codex-app-server -p codex-tui`
    - `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped
    - `just test -p codex-protocol -p codex-app-server-protocol -p
    codex-mcp` — 579 passed
    - `just write-app-server-schema`
    - `just fmt`
  • Support OAuth for HTTP MCP servers from selected executor plugins (#28529)
    ## Why
    
    #28522 routes selected-plugin HTTP MCP traffic through the owning
    executor, but OAuth bootstrap and refresh still used host-local clients.
    Executor-only servers therefore could not complete discovery or login
    through the same network boundary as the MCP connection.
    
    ## What changed
    
    - adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient`
    contract
    - let RMCP own discovery, dynamic registration, PKCE, token exchange,
    and refresh
    - route auth status, persisted-token startup, and app-server login
    through the server runtime while preserving the existing local discovery
    path
    - add optional `threadId` to `mcpServer/oauth/login` and echo it in the
    completion notification
    - implement RMCP's redirect policy and 1 MiB OAuth response limit over
    executor HTTP
    - cover selected-thread OAuth discovery and login through an
    executor-only route
    
    Depends on #28522.
  • Allow ChatGPT-hosted MCP servers to use session auth (#29733)
    ## Why
    
    ChatGPT session authentication was inferred from the reserved Codex Apps
    server name. That couples credential routing to Codex Apps-specific
    behavior and prevents other MCP endpoints hosted by ChatGPT from
    explicitly using the current session.
    
    The opt-in also needs a clear security boundary: an arbitrary MCP
    configuration must not be able to redirect ChatGPT credentials to
    another origin.
    
    ## What changed
    
    - Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
    `false`.
    - Honor the setting only when the parsed server URL has the same HTTP(S)
    origin as the configured `chatgpt_base_url`; otherwise remove the
    capability before startup.
    - Resolve bearer tokens and static or environment-backed authorization
    headers before selecting authentication, with configured authorization
    taking precedence over ChatGPT session auth.
    - Enable the setting for the built-in Codex Apps and hosted plugin
    runtime endpoints while keeping Codex Apps caching and tool
    normalization scoped to the reserved server.
    - Persist the setting through MCP config rewrite paths and expose it in
    the generated config schema.
    - Load the current login state for `codex mcp list` so reported auth
    status matches runtime behavior.
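    
    A rough sketch of the same-origin gate, assuming a deliberately small
    hand-written origin parser (`parse_origin` and
    `effective_use_chatgpt_auth` are hypothetical names; real code would
    use a full URL parser). It fails closed: anything it cannot parse
    disables the capability.
    
    ```rust
    #[derive(Debug, PartialEq, Eq)]
    pub struct Origin {
        scheme: String,
        host: String,
        port: u16,
    }
    
    // Parses scheme, host, and port from an HTTP(S) URL. Userinfo and IPv6
    // literals are rejected to keep the sketch small.
    pub fn parse_origin(url: &str) -> Option<Origin> {
        let (scheme, rest) = url.split_once("://")?;
        let scheme = scheme.to_ascii_lowercase();
        let default_port = match scheme.as_str() {
            "http" => 80,
            "https" => 443,
            _ => return None,
        };
        // The authority ends at the first path, query, or fragment delimiter.
        let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
        if authority.is_empty() || authority.contains('@') || authority.contains('[') {
            return None;
        }
        let (host, port) = match authority.rsplit_once(':') {
            Some((host, port)) => (host, port.parse::<u16>().ok()?),
            None => (authority, default_port),
        };
        if host.is_empty() {
            return None;
        }
        Some(Origin { scheme, host: host.to_ascii_lowercase(), port })
    }
    
    // The opt-in survives only when it was requested and both origins match.
    pub fn effective_use_chatgpt_auth(
        requested: bool,
        server_url: &str,
        chatgpt_base_url: &str,
    ) -> bool {
        if !requested {
            return false;
        }
        match (parse_origin(server_url), parse_origin(chatgpt_base_url)) {
            (Some(server), Some(base)) => server == base,
            _ => false,
        }
    }
    ```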
    
    ## Verification
    
    Core integration coverage exercises the complete streamable HTTP MCP
    startup path and verifies that:
    
    - a same-origin opted-in server receives the current ChatGPT access
    token;
    - an explicitly configured authorization header takes precedence;
    - a different-origin server completes MCP initialization and tool
    listing without receiving any ChatGPT authorization header.
  • [codex] trace MCP startup latency (#28630)
    ## Summary
    
    - add trace-level instrumentation around per-server MCP setup, client
    construction, initialization, and initial tool listing
    - trace Codex Apps tool and server-info cache loads
    - attach `server_name` to server-scoped spans so slow startup work can
    be attributed to a specific MCP server
    
    ## Why
    
    `session_init.mcp_manager_init` can occasionally be slow, but its
    existing coarse span does not identify whether time is spent loading the
    Codex Apps cache, constructing a client, initializing a transport, or
    listing tools. These definition-level spans provide that breakdown
    without changing startup behavior.
    
    ## Validation
    
    - `just test -p codex-mcp` (87 passed)
    - `just test -p codex-rmcp-client` (86 passed, 2 skipped)
  • Prepare managed network sandbox context (#29456)
    ## Why
    
    Managed network configures commands to use local HTTP and SOCKS proxies.
    For commands delegated to the exec server, the proxy environment and the
    sandbox policy were prepared separately. On macOS, that meant a command
    could receive `HTTPS_PROXY=http://127.0.0.1:43123` while Seatbelt still
    denied access to port `43123`.
    
    ## What changed
    
    `NetworkProxy` now prepares the command environment and sandbox context
    together from the same runtime snapshot:
    
    ```text
    Prepared managed network
    ├── command environment: HTTPS_PROXY=http://127.0.0.1:43123
    └── sandbox context: allow outbound to 127.0.0.1:43123
    ```
    
    That context travels with remote exec requests. The exec server
    preserves the managed proxy and CA environment, and macOS Seatbelt
    allows only the prepared loopback proxy ports without enabling broad
    network access or local binding.
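    
    As an illustration only, deriving both halves from one snapshot might
    look like the sketch below. `ProxySnapshot`, `PreparedManagedNetwork`,
    and `prepare_managed_network` are hypothetical names, and the
    `ALL_PROXY` variable with a `socks5h` scheme is an assumed convention
    rather than something this PR specifies.
    
    ```rust
    #[derive(Debug, Clone, Copy)]
    pub struct ProxySnapshot {
        pub http_port: u16,
        pub socks_port: u16,
    }
    
    #[derive(Debug)]
    pub struct PreparedManagedNetwork {
        // Environment handed to the command.
        pub env: Vec<(String, String)>,
        // Loopback ports the sandbox must allow outbound connections to.
        pub allowed_loopback_ports: Vec<u16>,
    }
    
    // Both outputs are computed from the same snapshot, so the proxy port in
    // the environment can never disagree with the sandbox allow-list.
    pub fn prepare_managed_network(snapshot: ProxySnapshot) -> PreparedManagedNetwork {
        PreparedManagedNetwork {
            env: vec![
                (
                    "HTTPS_PROXY".to_string(),
                    format!("http://127.0.0.1:{}", snapshot.http_port),
                ),
                (
                    "ALL_PROXY".to_string(),
                    format!("socks5h://127.0.0.1:{}", snapshot.socks_port),
                ),
            ],
            allowed_loopback_ports: vec![snapshot.http_port, snapshot.socks_port],
        }
    }
    ```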
    
    The protocol field is optional and the existing enforcement flag remains
    in place, preserving compatibility with callers that do not send the new
    context.
  • Update rmcp to 1.8.0 (#29634)
    ## Summary
    
    - Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0.
    - Adapt to the new shared `peer_info` return type.
    - Box OAuth status discovery at the MCP boundary to keep the expanded
    future type from overflowing Rust's trait recursion limit.
    
    This brings in custom OAuth HTTP client support from
    [modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).
  • mcp: accept foreign absolute cwd for remote stdio (#29493)
    ## Why
    
    Remote stdio MCP servers can run in an environment whose path convention
    differs from the Codex host. A Windows cwd such as
    `C:\Users\openai\share` is absolute for the executor but was rejected by
    a POSIX orchestrator.
    
    Built on #29501, now merged, which only clarifies the host-native
    `PathUri` constructor name.
    
    ## What changed
    
    - Deserialize MCP cwd values as `LegacyAppPathString` so config does not
    apply host path rules.
    - Interpret that spelling as host-native for local launches and convert
    it to `PathUri` at executor launch.
    - Skip host filesystem and command resolution checks for remote stdio in
    `codex doctor`.
    - Add host-independent config and executor-boundary coverage using the
    foreign path convention for each test platform.
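    
    To illustrate the underlying problem, the sketch below checks
    absoluteness under either path convention instead of the host's. It is
    not how `LegacyAppPathString` or `PathUri` are implemented;
    `PathConvention` and `absolute_convention` are hypothetical names.
    
    ```rust
    #[derive(Debug, PartialEq, Eq)]
    pub enum PathConvention {
        Posix,
        Windows,
    }
    
    // Returns the convention under which `path` is absolute, if any, without
    // consulting the host operating system's rules.
    pub fn absolute_convention(path: &str) -> Option<PathConvention> {
        let bytes = path.as_bytes();
        if bytes.first() == Some(&b'/') {
            return Some(PathConvention::Posix);
        }
        // Drive-letter form such as `C:\` or `C:/`.
        if bytes.len() >= 3
            && bytes[0].is_ascii_alphabetic()
            && bytes[1] == b':'
            && (bytes[2] == b'\\' || bytes[2] == b'/')
        {
            return Some(PathConvention::Windows);
        }
        // UNC form such as `\\server\share`.
        if path.starts_with(r"\\") {
            return Some(PathConvention::Windows);
        }
        None
    }
    ```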
    
    ## Validation
    
    - `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p
    codex-rmcp-client` (408 passed)
    - `just test -p codex-cli -p codex-rmcp-client` (372 passed)
    - `cargo check --workspace --tests`
    - `just test` (11,311 passed; 43 unrelated environment/timing failures)
    - `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p
    codex-mcp-extension -p codex-rmcp-client -p codex-tui`
  • path-uri: clarify host-native path conversion (#29501)
    ## Why
    
    Downstream refactors are producing confusing code because this
    conversion has a very generic name. Encoding the specific conversion
    approach in the method name makes call sites clearer.
    
    ## What
    
    Rename `PathUri::from_path` to `PathUri::from_host_native_path` and
    update its Rust call sites.
  • Carry sandbox intent to remote exec servers (#29108)
    ## What changed
    
    PR #29099 stopped sending the orchestrator's concrete sandbox wrapper to
    a remote exec-server. Remote commands now arrive as plain native argv.
    
    This PR adds the next piece: Codex also sends portable sandbox intent
    next to that plain argv.
    
    For a remote unified-exec command, the request can now include:
    
    - the canonical permission profile before local workspace-root
    materialization
    - the sandbox cwd and workspace roots as `PathUri` values
    - Windows sandbox settings
    - the legacy Landlock setting
    - whether managed networking must be enforced
    
    The important part is that symbolic entries such as `:workspace_roots`
    stay symbolic while crossing the boundary. The executor can then bind
    them to its own workspace-root paths instead of receiving
    orchestrator-local absolute paths.
    
    The data travels through `ExecRequest` into `ExecParams`. Older
    exec-servers can still deserialize requests because the new fields have
    defaults.
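    
    A small sketch of what "stay symbolic" means in practice, using
    hypothetical types (`PermissionEntry`, `bind_on_executor`) rather than
    the real protocol structs: the symbolic entry is expanded only on the
    executor, against the executor's own roots.
    
    ```rust
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum PermissionEntry {
        // Symbolic placeholder, e.g. `:workspace_roots`.
        WorkspaceRoots,
        // A concrete path that is already meaningful to the executor.
        Absolute(String),
    }
    
    // Runs on the executor: symbolic entries are bound to executor-native roots.
    pub fn bind_on_executor(
        entries: &[PermissionEntry],
        executor_roots: &[String],
    ) -> Vec<String> {
        let mut bound = Vec::new();
        for entry in entries {
            match entry {
                PermissionEntry::WorkspaceRoots => {
                    bound.extend(executor_roots.iter().cloned());
                }
                PermissionEntry::Absolute(path) => bound.push(path.clone()),
            }
        }
        bound
    }
    ```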
    
    ## Why
    
    The orchestrator should not decide how another machine implements
    sandboxing.
    
    For example:
    
    - a local macOS Codex would normally build a Seatbelt command
    - a remote Linux executor needs a Linux sandbox command instead
    
    The orchestrator now sends the plain command plus the policy it intended
    to enforce. A later PR can let the exec-server choose and build the
    correct sandbox for its own operating system.
    
    ## Important detail
    
    This keeps the portable intent separate from the local `SandboxType`.
    
    `SandboxType::None` is ambiguous:
    
    - it can mean the command was explicitly approved to run without a
    sandbox
    - it can also mean the orchestrator host has no concrete sandbox
    implementation available
    
    Those cases are different for remote execution. This PR adds
    `sandbox_requested` so an executor can still receive sandbox intent when
    the orchestrator cannot build a local wrapper. Explicit unsandboxed
    retries still send no sandbox context.
    
    ## Behavior today
    
    This PR only transports the intent. The exec-server accepts the new
    fields but does not apply them yet.
    
    Remote commands therefore remain unsandboxed after this PR, just as they
    are after PR #29099.
    
    ## Follow-up
    
    The next PR will make exec-server read this portable intent, bind
    symbolic workspace permissions to executor-native roots, choose the
    sandbox for its own operating system, build the wrapper locally, and
    then spawn the command.
  • [codex] prototype mcp_history thread hint injection (#29259)
    ## Why
    
    Prototype whether the harness can invoke the `mcp_history` MCP while
    constructing full initial context and expose its thread hint to the
    model without requiring a model-issued tool call.
    
    The prototype builds on the context-window lineage added by #29256 and
    is now based directly on `main`.
    
    ## What changed
    
    - Call `mcp_history/thread_hint` with no arguments while building the
    full `<token_budget>` context.
    - Pass the current `threadId` through MCP request metadata, matching the
    normal MCP tool-call path.
    - Serialize only the unstructured `content` result and append it inside
    `<token_budget>` when the call succeeds.
    - Omit the additional context when the MCP call or content serialization
    fails.
    
    ## Prototype limitations
    
    - The direct call bypasses the normal model-initiated MCP approval,
    lifecycle-event, telemetry, and result-sanitization path.
    - The call has no prototype-specific timeout, result-size cap, or
    per-window cache.
    - MCP latency is added to full-context construction, including
    applicable compaction paths.
    
    ## Validation
    
    - `just test -p codex-core token_budget`
  • [codex] Support protected resource OAuth discovery (#29022)
    ## Why
    
    Plugin-install preflight and the actual OAuth login flow used different
    discovery implementations. Preflight had a Codex-specific implementation
    that only queried authorization-server metadata on the MCP host, while
    login already used the upstream `rmcp` Rust MCP SDK. As a result,
    servers that advertise a separate authorization server through RFC 9728
    Protected Resource Metadata were classified as OAuth-unsupported during
    plugin installation, so login was skipped.
    
    ## What changed
    
    - delegate plugin-install OAuth discovery to
    `rmcp::transport::AuthorizationManager`, the same implementation used by
    the login flow
    - let `rmcp` follow Protected Resource Metadata first and perform direct
    RFC 8414 authorization-server discovery when protected-resource
    discovery does not yield usable metadata
    - retain Codex's existing HTTP headers, timeout, `no_proxy` behavior,
    and scope normalization around that discovery
    - add unit coverage and a pure-MCP plugin-install integration test that
    proves the protected-resource path reaches OAuth client registration
    
    This only changes shared MCP OAuth discovery. App declarations and
    `appsNeedingAuth` behavior are unchanged.
    
    ## Verification
    
    - `just test -p codex-rmcp-client auth_status`
    - `just test -p codex-app-server plugin_install_starts_mcp_oauth`
    - real plugin-install smoke test with an isolated `CODEX_HOME`: both
    DigitalOcean MCP servers started OAuth callback listeners, while Linear
    continued to start its existing direct-discovery OAuth flow
  • Support openai/form extended form elicitations (#27500)
    ## Summary
    
    Allow App Server clients to opt into `openai/form` MCP elicitations.
  • [codex] Carry exec-server cwd as PathUri (#28032)
    ## Why
    
    This is the second-to-last place in the exec-server protocol that needs
    to migrate to URIs to support cross-OS operation.
    
    ## What
    
    - Change `ExecParams.cwd` to `PathUri`.
    - Keep the cwd URI-shaped through core and rmcp producers, converting it
    to `AbsolutePathBuf` only in `LocalProcess::start_process`.
    - Reject non-native cwd URIs before launch and update the affected
    protocol documentation and call sites.
  • feat: use encrypted local secrets for MCP OAuth (#27541)
    ## Summary
    
    - store MCP OAuth credentials in the configured auth credential backend
    - support encrypted-local OAuth storage, including legacy keyring
    migration
    - propagate the credential backend through MCP refresh, session, CLI,
    and app-server paths
    
    ## Stack
    
    1. #27504 — config and feature flag
    2. #27535 — auth-specific secret namespaces
    3. #27539 — encrypted CLI auth storage
    4. this PR — encrypted MCP OAuth storage
    
    This is a parallel review stack; the original #17931 remains unchanged.
    
    ## Tests
    
    - `just test -p codex-rmcp-client` (the transport round-trip test passed
    after building the required `codex` binary and retrying)
    - `just test -p codex-mcp`
    - `just test -p codex-app-server
    refresh_config_uses_latest_auth_keyring_backend`
    - `just test -p codex-core
    refresh_mcp_servers_is_deferred_until_next_turn`
    - `just test -p codex-cli mcp`
    - `just fix -p codex-rmcp-client -p codex-mcp -p codex-core -p codex-cli
    -p codex-app-server -p codex-protocol`
    - `just bazel-lock-check`
  • [codex] Retry streamable HTTP initialize failures (#25147)
    ## Summary
    - Retry transient streamable HTTP failures during RMCP startup when the
    failure happens while sending the initialize request.
    - Retry transient streamable HTTP failures for tools/list, which is
    read-only and safe to replay.
    - Cover both retryable HTTP statuses and request-layer failures where no
    HTTP status is returned.
    - Surface retryable HTTP statuses from the streamable HTTP adapter as
    typed client errors.
    - Add integration coverage for initialize retry, tools/list retry,
    no-status request failure retry, and non-retryable initialize status.
    
    ## Root cause
    The observed codex_apps failures can happen before normal tool
    execution: RMCP startup fails while sending initialize, or the first
    read-only tools/list fails after startup. Retrying hosted_apps_bridge
    tools/call would not cover initialize and would risk replaying
    side-effecting tool calls. This change retries the streamable HTTP
    handshake itself, recreates the transport between initialize attempts,
    and retries only tools/list among post-initialize service operations.
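    
    The initialize retry can be sketched roughly as follows. The error enum,
    the set of retryable statuses (429, 502, 503, 504), and the function
    names are assumptions for illustration; this PR does not list the exact
    statuses, and the sketch omits any backoff delay.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum InitError {
        // The server answered with an HTTP status.
        Status(u16),
        // The request layer failed before any status was returned.
        NoStatus,
    }
    
    // Assumed classification of transient failures.
    pub fn is_retryable(err: &InitError) -> bool {
        match err {
            InitError::Status(code) => matches!(*code, 429 | 502 | 503 | 504),
            InitError::NoStatus => true,
        }
    }
    
    // Each attempt calls the closure, which is expected to build a fresh
    // transport and send `initialize`, so no state leaks between attempts.
    pub fn initialize_with_retry<T, F>(
        max_attempts: u32,
        mut connect_and_initialize: F,
    ) -> Result<T, InitError>
    where
        F: FnMut() -> Result<T, InitError>,
    {
        let mut attempt = 1;
        loop {
            match connect_and_initialize() {
                Ok(value) => return Ok(value),
                Err(err) if is_retryable(&err) && attempt < max_attempts => attempt += 1,
                Err(err) => return Err(err),
            }
        }
    }
    ```
    
    The same wrapper shape would fit `tools/list`, because replaying a
    read-only request has no side effects; it should not wrap `tools/call`.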
    
    ## Validation
    - cargo fmt --package codex-rmcp-client
    - cargo test -p codex-rmcp-client --test streamable_http_recovery
  • [codex] Report unusable MCP OAuth credentials as logged out (#26713)
    ## Why
    
    Persisted MCP OAuth credentials were reported as authenticated whenever
    a credential record existed. An expired token without a usable refresh
    token could therefore appear as `OAuth` even though startup could not
    authenticate with it, leaving users with a misleading status instead of
    a login prompt.
    
    ## What changed
    
    - Classify stored OAuth credentials as missing, usable, or requiring
    authorization.
    - Reuse the existing refresh window so near-expiry credentials without a
    refresh path are also treated as logged out.
    - Validate required credential fields before reporting OAuth
    authentication.
    - Add unit coverage for credential usability and integration coverage
    for expired, unexpired, and refreshable persisted credentials.
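    
    A sketch of the three-way classification under stated assumptions: the
    struct fields, the function name, and the choice to treat a near-expiry
    credential with a refresh token as usable are illustrative, not the
    actual Codex types.
    
    ```rust
    #[derive(Debug, PartialEq, Eq)]
    pub enum CredentialState {
        Missing,
        Usable,
        RequiresAuthorization,
    }
    
    pub struct StoredCredential {
        pub access_token: String,
        pub refresh_token: Option<String>,
        // Absolute expiry in seconds since the Unix epoch, if known.
        pub expires_at: Option<u64>,
    }
    
    pub fn classify(
        credential: Option<&StoredCredential>,
        now: u64,
        refresh_window_secs: u64,
    ) -> CredentialState {
        let credential = match credential {
            Some(credential) => credential,
            None => return CredentialState::Missing,
        };
        // Required-field validation: a record without a token is not usable.
        if credential.access_token.is_empty() {
            return CredentialState::RequiresAuthorization;
        }
        // Expired or inside the refresh window counts as needing a refresh.
        let needs_refresh = match credential.expires_at {
            Some(at) => at <= now.saturating_add(refresh_window_secs),
            None => false,
        };
        let can_refresh = credential
            .refresh_token
            .as_deref()
            .map_or(false, |token| !token.is_empty());
        if needs_refresh && !can_refresh {
            CredentialState::RequiresAuthorization
        } else {
            CredentialState::Usable
        }
    }
    ```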
    
    ## Validation
    
    - `just test -p codex-rmcp-client`
  • fix(rmcp): refresh expired OAuth tokens before startup (#26482)
    ## Why
    
    Codex persists OAuth expiry as an absolute `expires_at`, then
    reconstructs RMCP’s relative `expires_in` when credentials are loaded.
    For an already-expired token, Codex reconstructed `expires_in` as
    missing.
    
    [RMCP 0.15 treated a missing `expires_in` as zero when a refresh token
    was
    present](https://github.com/modelcontextprotocol/rust-sdk/blob/9cfc905a9ef17c8bba6748dc0a9bdd2452681733/crates/rmcp/src/transport/auth.rs#L704-L723),
    so this still triggered a refresh. [RMCP 1.7 treats missing expiry
    information as unknown and uses the access token
    as-is](https://github.com/modelcontextprotocol/rust-sdk/blob/3529c3675ff64db805bd947ca6ece6090809e43d/crates/rmcp/src/transport/auth.rs#L1233-L1265),
    causing the stale token to be sent during `initialize`.
    
    ## What changed
    
    - Represent a known-expired persisted token as `expires_in = 0`,
    preserving `None` for genuinely unknown expiry.
    - Add Streamable HTTP coverage requiring the token to refresh before the
    startup handshake.
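    
    The conversion can be sketched in one function (the name and the use of
    whole seconds are assumptions): a known expiry in the past saturates to
    zero instead of collapsing to `None`.
    
    ```rust
    // `expires_at` and `now` are seconds since the Unix epoch.
    // `None` means the expiry is genuinely unknown and is preserved as-is.
    pub fn expires_in_secs(expires_at: Option<u64>, now: u64) -> Option<u64> {
        expires_at.map(|at| at.saturating_sub(now))
    }
    ```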
    
    ## Validation
    
    - The new regression test fails on RMCP 1.7 before the fix and passes
    afterward.
    - The same scenario passes on the commit immediately before the RMCP 1.7
    update, using RMCP 0.15.
    - `just test -p codex-rmcp-client` (63 passed).
  • Optimize unbounded byte scans with memchr (#26265)
    ## Summary
    
    This PR adds `memchr` for some low-hanging performance improvements
    (namely, in MCP stdio, Ollama streaming, and full message-history
    newline counts).
    
    Codex produced the following release benchmarks:
    
    | Operation | Before | After | Speedup |
    | --- | ---: | ---: | ---: |
    | MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x |
    | Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x |
    | Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x |
    
    With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python
    MCP server, completed `initialize`, requested `tools/list`, and
    deserialized a 1 MiB tool description over newline-delimited stdio),
    it's about 16x faster end-to-end:
    
    | Branch | 50 calls | Per call |
    | --- | ---: | ---: |
    | `main` | 862.53 ms | 17.25 ms |
    | this branch | 53.89 ms | 1.08 ms |
    
    `memchr` is already in our dependency tree and extremely widely used for
    this kind of optimized scanning.
  • feat: Add focused diagnostics for MCP HTTP send failures (#25013)
    Adds failure-only logging for MCP streamable HTTP post_message calls and
    the underlying reqwest send path, capturing the MCP method/request id,
    endpoint shape, auth-header presence, timeout/connect classification,
    and sanitized error source chain without logging headers, bodies,
    tokens, or full URLs.
  • Update rmcp to 1.7.0 (#24763)
    Will make it easier to upgrade when the new draft spec is supported.
    
    Also updates reqwest where needed for compatibility but doesn't update
    it everywhere since this is already a large diff.
    
    The new version of rmcp handles certain kinds of authentication failures
    differently, so this patch includes support for identifying the failing
    scope in a `WWW-Authenticate` header.
  • Allow parallel MCP tool calls when annotated readOnly (#23750)
    ## Summary
    - Treat MCP tools with `readOnlyHint: true` as parallel-safe even when
    `supports_parallel_tool_calls` is unset or `false`.
    - Keep server-level `supports_parallel_tool_calls` as an additive
    override for non-read-only tools.
    - Add focused unit coverage for the MCP handler eligibility decision.
    - Update RMCP integration coverage to keep the serial baseline on a
    mutable tool, verify read-only concurrency without server opt-in, and
    preserve the server opt-in concurrency path separately.
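    
    The eligibility decision reduces to a small predicate. This is a sketch
    with a hypothetical function name, not the handler code itself:
    
    ```rust
    // A tool call may run in parallel when the tool is annotated read-only,
    // or when the server opted in for all of its tools.
    pub fn is_parallel_safe(
        read_only_hint: Option<bool>,
        supports_parallel_tool_calls: Option<bool>,
    ) -> bool {
        read_only_hint == Some(true) || supports_parallel_tool_calls == Some(true)
    }
    ```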
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-core --lib tools::handlers::mcp::tests::`
    - `cargo test -p codex-core --test all
    stdio_mcp_read_only_tool_calls_run_concurrently_without_server_opt_in`
    - `cargo test -p codex-core --test all
    stdio_mcp_parallel_tool_calls_opt_in_runs_concurrently`
    - `cargo test -p codex-rmcp-client`
  • Route MCP servers through explicit environments (#23583)
    ## Summary
    - route each configured MCP server through an explicit per-server
    `environment_id` instead of a manager-wide remote toggle
    - default omitted `environment_id` to `local`, resolve named ids through
    `EnvironmentManager`, and fail only the affected MCP server when an
    explicit id is unknown
    - keep local stdio on the existing local launcher path for now, while
    named-environment stdio uses the selected environment backend and
    requires an absolute `cwd`
    - allow local HTTP MCP servers to keep using the ambient HTTP client
    when no local `Environment` is configured; named-environment HTTP MCPs
    use that environment's HTTP client
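    
    A sketch of the resolution rule, assuming a plain set of known ids in
    place of `EnvironmentManager`. `ResolvedEnvironment` and
    `resolve_environment` are hypothetical names, and treating an explicit
    `local` the same as an omitted id is an assumption.
    
    ```rust
    use std::collections::HashSet;
    
    #[derive(Debug, PartialEq, Eq)]
    pub enum ResolvedEnvironment {
        Local,
        Named(String),
    }
    
    // An error here is meant to fail only the MCP server that named the
    // unknown id, not the whole manager.
    pub fn resolve_environment(
        environment_id: Option<&str>,
        known_ids: &HashSet<String>,
    ) -> Result<ResolvedEnvironment, String> {
        match environment_id {
            None | Some("local") => Ok(ResolvedEnvironment::Local),
            Some(id) if known_ids.contains(id) => Ok(ResolvedEnvironment::Named(id.to_string())),
            Some(id) => Err(format!("unknown environment_id: {}", id)),
        }
    }
    ```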
    
    ## Validation
    - devbox Bazel build: `bazel build --bes_backend= --bes_results_url=
    //codex-rs/cli:codex //codex-rs/rmcp-client:test_stdio_server
    //codex-rs/rmcp-client:test_streamable_http_server`
    - devbox app-server config matrix with real `config.toml` /
    `environments.toml` files covering omitted local, explicit local,
    omitted local under remote default, explicit remote stdio, local HTTP
    without local env, explicit remote HTTP, local stdio without local env,
    unknown explicit env, and remote stdio without `cwd`
  • Support explicit MCP OAuth client IDs (#22575)
    ## Why
    Some MCP OAuth providers require a pre-registered public client ID and
    cannot rely on dynamic client registration. Codex already supports MCP
    OAuth, but it had no way to supply that client ID from config into the
    PKCE flow.
    
    ## What changed
    - add `oauth.client_id` under `[mcp_servers.<server>]` config, including
    config editing and schema generation
    - thread the configured client ID through CLI, app-server, plugin login,
    and MCP skill dependency OAuth entrypoints
    - configure RMCP authorization with the explicit client when present,
    while preserving the existing dynamic-registration path when it is
    absent
    - add focused coverage for config parsing/serialization and OAuth URL
    generation
    
    ## Verification
    - `cargo test -p codex-config -p codex-rmcp-client -p codex-mcp -p
    codex-core-plugins`
    - `cargo test -p codex-core blocking_replace_mcp_servers_round_trips
    --lib`
    - `cargo test -p codex-core
    replace_mcp_servers_streamable_http_serializes_oauth_resource --lib`
    - `cargo test -p codex-core config_schema_matches_fixture --lib`
    
    ## Notes
    Broader local package runs still hit unrelated pre-existing stack
    overflows in:
    - `codex-app-server::in_process_start_clamps_zero_channel_capacity`
    - `codex-core::resume_agent_from_rollout_uses_edge_data_when_descendant_metadata_source_is_stale`
  • Add callback ids to local MCP OAuth redirects (#20237)
    ## Summary
    
    - Add a deterministic callback-id path segment to local MCP OAuth
    redirect URIs before starting authorization.
    - Derive the callback id from the normalized MCP server URL and encode
    it as a 12-character URL-safe hash.
    - Reuse the existing exact callback-path validation so OAuth completion
    only succeeds on the callback path that was sent in the redirect URI.
    
    ## Context
    
    Slack thread:
    https://openai.slack.com/archives/C087WB3AGCR/p1777480566571699
    
    That thread calls out the OAuth mix-up class of issue for MCP servers.
    The connector/App Connect flow already has a callback_id concept that
    binds the OAuth callback URL to the MCP app/server identity. Codex
    desktop's local MCP OAuth flow was still using a generic local callback
    path like `/callback`, so this PR adds the same shape to the shared
    local MCP OAuth helper.
    
    ## Behavior
    
    Before this change, local MCP OAuth used:
    
    - default local callback URL: `http://127.0.0.1:<port>/callback`
    - configured callback URL: `<configured callback URL>` unchanged
    
    After this change, Codex appends a deterministic callback-id segment:
    
    - default local callback URL:
    `http://127.0.0.1:<port>/callback/<callback_id>`
    - configured callback URL: `<configured callback path>/<callback_id>`
    
    The local callback server already compares the incoming request path
    against the path from the redirect URI. By appending the callback id
    before both authorization and callback validation, callbacks that arrive
    on the old generic path or a mismatched callback-id path are rejected.
    
    The callback id is bound to the MCP endpoint URL, including path and
    query, so path-based multi-tenant MCP deployments on the same origin do
    not share a callback path. URL fragments are ignored because they are
    not sent to the server.
    
    The change lives in `codex-rmcp-client`, so it covers both the normal
    desktop MCP OAuth login path and silent/plugin-triggered MCP OAuth login
    paths that use the same `perform_oauth_login_*` helpers.
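    
    A rough sketch of the callback binding. The hash function (FNV-1a
    truncated to 12 hex characters), the fragment-only normalization, and
    all function names are assumptions chosen to keep the example
    dependency-free; the PR does not state which hash or encoding the real
    helper uses.
    
    ```rust
    // 64-bit FNV-1a, used here only as a stand-in deterministic hash.
    fn fnv1a_64(bytes: &[u8]) -> u64 {
        let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
        for byte in bytes {
            hash ^= u64::from(*byte);
            hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
        }
        hash
    }
    
    // Derives a 12-character, URL-safe id from the server URL. The fragment
    // is dropped because it is never sent to the server; path and query stay.
    pub fn callback_id(server_url: &str) -> String {
        let without_fragment = server_url.split('#').next().unwrap_or(server_url);
        let hex = format!("{:016x}", fnv1a_64(without_fragment.as_bytes()));
        hex[..12].to_string()
    }
    
    // Appends the callback id to the base callback path, e.g. `/callback`.
    pub fn redirect_path(base_path: &str, server_url: &str) -> String {
        format!("{}/{}", base_path.trim_end_matches('/'), callback_id(server_url))
    }
    
    // Exact-match validation of the incoming callback path.
    pub fn accept_callback(expected_path: &str, request_path: &str) -> bool {
        expected_path == request_path
    }
    ```
    
    With exact matching, a callback arriving on the old generic `/callback`
    path, or on a path derived from a different MCP endpoint, is rejected.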
    
    ## Scope and non-goals
    
    - This does not change the app-server protocol or desktop webview
    request shape.
    - This does not implement RFC 9207 `iss` validation; issuer validation
    is still useful when providers return `iss`.
    - This does not make arbitrary untrusted MCP servers safe to use. It
    specifically adds callback URL binding for the local MCP OAuth flow.
    
    ## Validation
    
    - `cargo fmt --all`
    - `cargo test -p codex-rmcp-client perform_oauth_login`
  • feat(sandbox): add Windows deny-read parity (#18202)
    ## Why
    
    The split filesystem policy stack already supports exact and glob
    `access = none` read restrictions on macOS and Linux. Windows still
    needed subprocess handling for those deny-read policies without claiming
    enforcement from a backend that cannot provide it.
    
    ## Key finding
    
    The unelevated restricted-token backend cannot safely enforce deny-read
    overlays. Its `WRITE_RESTRICTED` token model is authoritative for write
    checks, not read denials, so this PR intentionally fails that backend
    closed when deny-read overrides are present instead of claiming
    unsupported enforcement.
    
    ## What changed
    
    This PR adds the Windows deny-read enforcement layer and makes the
    backend split explicit:
    
    - Resolves Windows deny-read filesystem policy entries into concrete ACL
    targets.
    - Preserves exact missing paths so they can be materialized and denied
    before an enforceable sandboxed process starts.
    - Snapshot-expands existing glob matches into ACL targets for Windows
    subprocess enforcement.
    - Honors `glob_scan_max_depth` when expanding Windows deny-read globs.
    - Plans both the configured lexical path and the canonical target for
    existing paths so reparse-point aliases are covered.
    - Threads deny-read overrides through the elevated/logon-user Windows
    sandbox backend and unified exec.
    - Applies elevated deny-read ACLs synchronously before command launch
    rather than delegating them to the background read-grant helper.
    - Reconciles persistent deny-read ACEs per sandbox principal so policy
    changes do not leave stale deny-read ACLs behind.
    - Fails closed on the unelevated restricted-token backend when deny-read
    overrides are present, because its `WRITE_RESTRICTED` token model is not
    authoritative for read denials.
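    
    A minimal sketch of the fail-closed backend split, assuming a
    hypothetical `WindowsBackend` enum and `check_deny_read_support` helper
    (neither name is taken from the PR). It only illustrates the rule that
    the restricted-token backend rejects deny-read overrides instead of
    running with them unenforced:
    
    ```rust
    use std::path::PathBuf;

    /// Hypothetical stand-in for the two Windows sandbox backends.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum WindowsBackend {
        /// Elevated/logon-user backend: able to apply deny-read ACLs.
        Elevated,
        /// Unelevated backend: `WRITE_RESTRICTED` only governs write checks.
        RestrictedToken,
    }

    /// Refuse to proceed rather than run with an unenforceable read policy.
    fn check_deny_read_support(
        backend: WindowsBackend,
        deny_read: &[PathBuf],
    ) -> Result<(), String> {
        match backend {
            WindowsBackend::RestrictedToken if !deny_read.is_empty() => Err(format!(
                "restricted-token backend cannot enforce {} deny-read override(s)",
                deny_read.len()
            )),
            _ => Ok(()),
        }
    }
    ```
    
    In this sketch the restricted-token backend still accepts an empty
    deny-read list, so sessions without such overrides are unaffected.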
    
    ## Landed prerequisites
    
    These prerequisite PRs are already on `main`:
    
    1. #15979 `feat(permissions): add glob deny-read policy support`
    2. #18096 `feat(sandbox): add glob deny-read platform enforcement`
    3. #17740 `feat(config): support managed deny-read requirements`
    
    This PR targets `main` directly and contains only the Windows deny-read
    enforcement layer.
    
    ## Implementation notes
    
    - Exact deny-read paths remain enforceable on the elevated path even
    when they do not exist yet: Windows materializes the missing path before
    applying the deny ACE, so the sandboxed command cannot create and read
    it during the same run.
    - Existing exact deny paths are preserved lexically until the ACL
    planner, which then adds the canonical target as a second ACL target
    when needed. That keeps both the configured alias and the resolved
    object covered.
    - Windows ACLs do not consume Codex glob syntax directly, so glob
    deny-read entries are expanded to the concrete matches that exist before
    process launch.
    - Glob traversal deduplicates directory visits within each pattern walk
    to avoid cycles, without collapsing distinct lexical roots that happen
    to resolve to the same target.
    - Persistent deny-read ACL state is keyed by sandbox principal SID, so
    cleanup only removes ACEs owned by the same backend principal.
    - Deny-read ACEs are fail-closed on the elevated path: setup aborts if
    mandatory deny-read ACL application fails.
    - Unelevated restricted-token sessions reject deny-read overrides early
    instead of running with a silently unenforceable read policy.
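    
    The lexical-plus-canonical planning described above could look roughly
    like this. `plan_acl_targets` is a hypothetical name, and the caller is
    assumed to have resolved the canonical path already (for example with
    `std::fs::canonicalize`), passing `None` when the path does not exist
    yet:
    
    ```rust
    use std::path::{Path, PathBuf};

    /// Plan ACL targets for one exact deny-read entry.
    /// The configured (lexical) path is always kept; the canonical target is
    /// added as a second entry only when it differs from the lexical path.
    fn plan_acl_targets(lexical: &Path, canonical: Option<&Path>) -> Vec<PathBuf> {
        let mut targets = vec![lexical.to_path_buf()];
        if let Some(resolved) = canonical {
            if resolved != lexical {
                targets.push(resolved.to_path_buf());
            }
        }
        targets
    }
    ```
    
    A missing path yields only the lexical target, which matches the note
    that missing exact paths are preserved so they can be materialized and
    denied.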
    
    ## Verification
    
    - `cargo test -p codex-core
    windows_restricted_token_rejects_unreadable_split_carveouts`
    - `just fmt`
    - `just fix -p codex-core`
    - `just fix -p codex-windows-sandbox`
    - GitHub Actions rerun is in progress on the pushed head.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • fix(tui): suppress taskkill output for MCP teardown on Windows (#21759)
    ## Why
    
    On native Windows, running `/mcp` can leak `taskkill`'s normal
    `SUCCESS:` messages into the Codex TUI while the temporary MCP inventory
    process tree is being torn down. That corrupts the screen even though
    MCP itself is working correctly.
    
    Fixes #20845.
    
    ## What Changed
    
    - Redirect the Windows-only MCP teardown `taskkill` subprocess to null
    stdio so its console output cannot reach the TUI.
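    
    As a rough illustration of the null-stdio redirect, assuming a
    hypothetical `quiet_taskkill` helper and illustrative `/PID`, `/T`, `/F`
    arguments (the exact arguments used by the real teardown code are not
    shown in this PR description):
    
    ```rust
    use std::process::{Command, Stdio};

    /// Build a `taskkill` invocation whose stdin, stdout, and stderr are all
    /// discarded, so nothing it prints can reach the parent's terminal.
    /// The argument list here is an assumption for illustration.
    fn quiet_taskkill(pid: u32) -> Command {
        let pid = pid.to_string();
        let mut cmd = Command::new("taskkill");
        cmd.args(["/PID", pid.as_str(), "/T", "/F"])
            .stdin(Stdio::null())
            .stdout(Stdio::null())
            .stderr(Stdio::null());
        cmd
    }
    ```
    
    The helper only builds the command; the caller decides when to spawn it.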
    
    ## How to Test
    
    1. On native Windows, configure a stdio MCP server, for example:
       ```powershell
       codex mcp add sequential-thinking -- npx -y @modelcontextprotocol/server-sequential-thinking
       ```
    2. With the latest released Codex CLI, start Codex and run `/mcp`.
    3. Confirm the current behavior: `taskkill` `SUCCESS:` lines appear in
    the TUI during the MCP refresh.
    4. Switch to this branch's build, start Codex again, and run `/mcp`.
    5. Confirm the MCP inventory still renders normally and the `taskkill`
    lines no longer appear.
    6. Repeat `/mcp` once more on this branch to verify the regression does
    not recur on repeated inventory requests.
    
    Targeted tests:
    - `cargo test -p codex-rmcp-client`
    - `cargo test -p codex-rmcp-client --test process_group_cleanup --quiet`
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` entails running both standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
    
    For the collection of crates below, this speeds up test execution by
    >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: make built-in MCPs first-class runtime servers (#21356)
    ## DISCLAIMER
    This is experimental; no production service should rely on it.
    
    ## Why
    
    Built-in MCPs are product-owned runtime capabilities, but they were
    previously flattened into the same config-backed stdio path as
    user-configured servers. That made them depend on a hidden `codex
    builtin-mcp` re-exec path, exposed them through config-oriented CLI
    flows, and erased distinctions the runtime needs to preserve—most
    notably whether an MCP call should count as external context for
    memory-mode pollution.
    
    ## What changed
    
    - Model product-owned built-ins separately from config-backed MCP
    servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
    - Launch built-ins in process through a reusable async transport instead
    of the hidden `builtin-mcp` stdio subcommand.
    - Keep config-oriented CLI operations such as `codex mcp
    list/get/login/logout` scoped to configured servers, while merging
    built-ins only into the effective runtime server set.
    - Retain server metadata after launch so parallel-tool support and
    context classification come from the live server set; built-in
    `memories` is now classified as local Codex state rather than external
    context.
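    
    A simplified sketch of the split between built-in and config-backed
    servers. The type names `BuiltinMcpServer` and `EffectiveMcpServer` come
    from this PR, but the variants, fields, and helper functions below are
    assumptions made for illustration:
    
    ```rust
    /// Product-owned built-ins (only `memories` is named in this PR).
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum BuiltinMcpServer {
        Memories,
    }

    /// The runtime server set: built-ins merged with configured servers.
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum EffectiveMcpServer {
        /// Launched in process, never through config.
        Builtin(BuiltinMcpServer),
        /// User-configured server (hypothetical fields).
        Configured { name: String, command: String },
    }

    impl EffectiveMcpServer {
        /// Whether a call to this server counts as external context.
        fn is_external_context(&self) -> bool {
            match self {
                EffectiveMcpServer::Builtin(BuiltinMcpServer::Memories) => false,
                EffectiveMcpServer::Configured { .. } => true,
            }
        }
    }

    /// Config-oriented flows see only the configured servers.
    fn configured_names(servers: &[EffectiveMcpServer]) -> Vec<&str> {
        servers
            .iter()
            .filter_map(|server| match server {
                EffectiveMcpServer::Configured { name, .. } => Some(name.as_str()),
                EffectiveMcpServer::Builtin(_) => None,
            })
            .collect()
    }
    ```
    
    Keeping the distinction in the type lets classification come from the
    live server set rather than from a flattened config entry.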
    
    ## Test plan
    
    - `cargo test -p codex-mcp`
    - `cargo test -p codex-core --test suite
    builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Use MCP server instructions in deferred namespace descriptions (#21053)
    ## Why
    
    MCP servers can provide `instructions` that explain what their tools are
    for. Directly exposed MCP namespaces already use those instructions when
    a connector description is not available, but deferred `tool_search`
    results did not preserve that fallback. The direct path falls back from
    connector metadata to server instructions, while the deferred path only
    carried `connector_description` and otherwise fell back to generic
    namespace text.
    
    That meant a plain MCP server could provide useful model-facing guidance
    and still appear as `Tools in the X namespace.` whenever it was
    discovered lazily through `tool_search`.
    
    ## What changed
    
    - Store one model-facing `namespace_description` on `ToolInfo`, using
    connector descriptions for connector-backed tools and server
    instructions for plain MCP servers.
    - Thread that namespace description through the `tool_search` source
    list, search indexing, and returned namespace metadata.
    - Add an end-to-end regression test for deferred non-app MCP search
    results exposing server instructions as the namespace description.
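    
    The fallback order can be sketched as a single function. The name
    `namespace_description` mirrors the new `ToolInfo` field, but this
    free-standing helper and its signature are assumptions, not the actual
    implementation:
    
    ```rust
    /// Pick the model-facing namespace description:
    /// connector description first, then server instructions,
    /// then the generic namespace text.
    fn namespace_description(
        namespace: &str,
        connector_description: Option<&str>,
        server_instructions: Option<&str>,
    ) -> String {
        connector_description
            .or(server_instructions)
            .map(str::to_owned)
            .unwrap_or_else(|| format!("Tools in the {namespace} namespace."))
    }
    ```
    
    Computing the value once and storing it means the direct and deferred
    paths cannot drift apart again.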
    
    ## Verification
    
    - `cargo test -p codex-tools
    search_tool_description_lists_each_mcp_source_once --lib`
    - `cargo test -p codex-core --test all
    tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`
  • expand the set of core shell env vars for Windows. (#20089)
    https://github.com/openai/codex/issues/13917 and
    https://github.com/openai/codex/issues/18248 correctly identify that
    
    ```
    [shell_environment_policy]
    inherit = "core"
    ```
    does not work on Windows because it passes through too few env vars.
    This PR expands the set to match the more complete one used by the MCP
    client.
  • Terminate stdio MCP servers on shutdown to avoid process leaks (#19753)
    ## Why
    
    Several bug reports describe thread shutdown (including subagent
    threads) leaving stdio MCP server processes behind. These reports all
    point at the same lifecycle gap: Codex launches stdio MCP servers, but
    the session-level shutdown path does not explicitly close MCP clients or
    terminate the server process tree.
    
    Fixes #12491
    Fixes #12976
    Fixes #18881
    Fixes #19469
    
    ## History
    
    This is best understood as a regression/coverage gap in MCP session
    lifecycle management, not as stdio MCP cleanup being absent all along.
    #10710 added process-group cleanup for stdio MCP servers, but that
    cleanup only runs when the `RmcpClient`/transport is dropped. The older
    reports (#12491 and #12976) came after that cleanup existed, which
    suggests the remaining problem was that some higher-level shutdown paths
    kept the MCP manager alive or replaced it without explicitly draining
    clients. The newer reports (#18881 and #19469) exposed the same family
    around manager replacement and shutdown.
    
    ## What changed
    
    - Added an explicit stdio MCP process handle in `codex-rmcp-client` so
    local MCP servers terminate their process group and executor-backed MCP
    servers call the executor process terminator.
    - Added `RmcpClient::shutdown()` and manager-level MCP shutdown draining
    so session shutdown, channel-close fallback, MCP refresh, and connector
    probing stop owned MCP clients.
    - Added regression coverage that starts a stdio MCP server, begins an
    in-flight blocking tool call, shuts down the client, and asserts the
    server process exits.
    
    ## Verification
    
    - `cargo test -p codex-rmcp-client`
    - `cargo test -p codex-mcp`
    - `just fix -p codex-rmcp-client`
    - `just fix -p codex-mcp`
    - `just fix -p codex-core`
    
    - Manual before/after validation with a temporary repro script:
      - Pre-fix binary from `HEAD^` (`fed0a8f4fa`): reproduced the leak with
        surviving MCP server and child PIDs, `survivors=[77583, 77592]`,
        `leaked=true`.
      - Post-fix binary from this branch (`67e318148b`): verified both MCP
        processes were gone after interrupting `codex exec`, `survivors=[]`,
        `leaked=false`.
  • [codex] Move config loading into codex-config (#19487)
    ## Why
    
    Config loading had become split across crates: `codex-config` owned the
    config types and merge logic, while `codex-core` still owned the loader
    that assembled the layer stack. This change consolidates that
    responsibility in `codex-config`, so the crate that defines config
    behavior also owns how configs are discovered and loaded.
    
    To make that move possible without reintroducing the old dependency
    cycle, the shell-environment policy types and helpers that
    `codex-exec-server` needs now live in `codex-protocol` instead of
    flowing through `codex-config`.
    
    This also makes the migrated loader tests more deterministic on machines
    that already have managed or system Codex config installed by letting
    tests override the system config and requirements paths instead of
    reading the host's `/etc/codex`.
    
    ## What Changed
    
    - moved the config loader implementation from `codex-core` into
    `codex-config::loader` and deleted the old `core::config_loader` module
    instead of leaving a compatibility shim
    - moved shell-environment policy types and helpers into
    `codex-protocol`, then updated `codex-exec-server` and other downstream
    crates to import them from their new home
    - updated downstream callers to use loader/config APIs from
    `codex-config`
    - added test-only loader overrides for system config and requirements
    paths so loader-focused tests do not depend on host-managed config state
    - cleaned up now-unused dependency entries and platform-specific cfgs
    that were surfaced by post-push CI
    
    ## Testing
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-core config_loader_tests::`
    - `cargo test -p codex-protocol -p codex-exec-server -p
    codex-cloud-requirements -p codex-rmcp-client --lib`
    - `cargo test --lib -p codex-app-server-client -p codex-exec`
    - `cargo test --no-run --lib -p codex-app-server`
    - `cargo test -p codex-linux-sandbox --lib`
    - `cargo shear`
    - `just bazel-lock-check`
    
    ## Notes
    
    - I did not chase unrelated full-suite failures outside the migrated
    loader surface.
    - `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
    failures on this machine, and Windows CI still shows unrelated
    long-running/timeouting test noise outside the loader migration itself.
  • test: harden app-server integration tests (#19683)
    ## Why
    
    Windows Bazel runs in the permissions stack exposed that app-server
    integration tests were launching normal plugin startup warmups in every
    subprocess. Those warmups can call
    `https://chatgpt.com/backend-api/plugins/featured` when a test is not
    specifically exercising plugin startup, which adds slow background work,
    noisy stderr, and dependence on external network state. The relevant
    startup/featured-plugin behavior was introduced across #15042 and
    #15264.
    
    A few app-server tests also had long optional waits or unbounded cleanup
    paths, making failures expensive to diagnose and contributing to slow
    Windows shards. One external-agent config test from #18246 used a
    GitHub-style marketplace source, which was enough to exercise the
    pending remote-import path but also meant the background completion task
    could attempt a real clone.
    
    ## What Changed
    
    - Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks`
    plumbing and a hidden debug-only
    `--disable-plugin-startup-tasks-for-tests` app-server flag, so
    integration tests can suppress startup plugin warmups without adding a
    production env-var gate.
    - Has the app-server test harness pass that hidden flag by default,
    while opting plugin-startup coverage back in for tests that
    intentionally exercise startup sync and featured-plugin warmup behavior.
    - Lowers normal app-server subprocess logging from `info`/`debug` to
    `warn` to avoid multi-megabyte stderr output in Bazel logs.
    - Prevents the external-agent config test from attempting a real
    marketplace clone by using an invalid non-local source while still
    exercising the pending-import completion path.
    - Bounds optional filesystem/realtime waits and fake WebSocket
    test-server shutdown so failures produce targeted timeouts instead of
    hanging a shard.
    - Fixes the Unix script-resolution test in `rmcp-client` to exercise
    PATH resolution directly and include the actual spawn error in failures.
    
    ## Verification
    
    - `cargo check -p codex-app-server`
    - `cargo clippy -p codex-app-server --tests -- -D warnings`
    - `cargo test -p codex-rmcp-client
    program_resolver::tests::test_unix_executes_script_without_extension`
    - `cargo test -p codex-app-server --test all
    external_agent_config_import_sends_completion_notification_after_pending_plugins_finish
    -- --nocapture`
    - `cargo test -p codex-app-server --test all
    plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request --
    --nocapture`
    - Windows Local Bazel passed with this test-hardening bundle before it
    was extracted from #19606.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683).
    * #19395
    * #19394
    * #19393
    * #19392
    * #19606
    * __->__ #19683
  • refactor: route Codex auth through AuthProvider (#18811)
    ## Summary
    
    This PR moves Codex backend request authentication from direct
    bearer-token handling to `AuthProvider`.
    
    The new `codex-auth-provider` crate defines the shared request-auth
    trait. `CodexAuth::provider()` returns a provider that can apply all
    headers needed for the selected auth mode.
    
    This lets ChatGPT token auth and AgentIdentity auth share the same
    callsite path:
    - ChatGPT token auth applies bearer auth plus account/FedRAMP headers
    where needed.
    - AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
    where needed.
    
    Reference old stack: https://github.com/openai/codex/pull/17387/changes
    
    ## Callsite Migration
    
    | Area | Change |
    | --- | --- |
    | backend-client | accepts an `AuthProvider` instead of a raw token/header |
    | chatgpt client/connectors | applies auth through `CodexAuth::provider()` |
    | cloud tasks | keeps Codex-backend gating, applies auth through provider |
    | cloud requirements | uses Codex-backend auth checks and provider headers |
    | app-server remote control | applies provider headers for backend calls |
    | MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches from generic account getters |
    | model refresh | treats AgentIdentity as Codex-backend auth |
    | OpenAI file upload path | rejects non-Codex-backend auth before applying headers |
    | core client setup | keeps model-provider auth flow and allows AgentIdentity through provider-backed OpenAI auth |
    
    ## Stack
    
    1. https://github.com/openai/codex/pull/18757: full revert
    2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
    crate
    3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
    auth mode and startup task allocation
    4. This PR: migrate Codex backend auth callsites through AuthProvider
    5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
    and load `CODEX_AGENT_IDENTITY`
    
    ## Testing
    
    Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
  • [3/4] Add executor-backed RMCP HTTP client (#18583)
    ### Why
    The RMCP layer needs a Streamable HTTP client that can talk either
    directly over `reqwest` or through the executor HTTP runner without
    duplicating MCP session logic higher in the stack. This PR adds that
    client-side transport boundary so remote Streamable HTTP MCP can reuse
    the same RMCP flow as the local path.
    
    ### What
    - Add a shared `rmcp-client/src/streamable_http/` module with:
      - `transport_client.rs` for the local-or-remote transport enum
      - `local_client.rs` for the direct `reqwest` implementation
      - `remote_client.rs` for the executor-backed implementation
      - `common.rs` for the small shared Streamable HTTP helpers
    - Teach `RmcpClient` to build Streamable HTTP transports in either local
    or remote mode while keeping the existing OAuth ownership in RMCP.
    - Translate remote POST, GET, and DELETE session operations into
    executor `http/request` calls.
    - Preserve RMCP session expiry handling and reconnect behavior for the
    remote transport.
    - Add remote transport coverage in
    `rmcp-client/tests/streamable_http_remote.rs` and keep the shared test
    support in `rmcp-client/tests/streamable_http_test_support.rs`.
    
    ### Verification
    - `cargo check -p codex-rmcp-client`
    - online CI
    
    ### Stack
    1. #18581 protocol
    2. #18582 runner
    3. #18583 RMCP client
    4. #18584 manager wiring and local/remote coverage
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add turn-scoped environment selections (#18416)
    ## Summary
    - add experimental turn/start.environments params for per-turn
    environment id + cwd selections
    - pass selections through core protocol ops and resolve them with
    EnvironmentManager before TurnContext creation
    - treat omitted selections as default behavior, empty selections as no
    environment, and non-empty selections as using the first environment/cwd
    as the turn primary
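    
    A small sketch of the three cases, assuming hypothetical
    `EnvironmentSelection` and `TurnEnvironment` types (the real params and
    `EnvironmentManager` resolution are not shown here):
    
    ```rust
    /// Hypothetical per-turn selection: an environment id plus a cwd.
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct EnvironmentSelection {
        environment_id: String,
        cwd: String,
    }

    /// Hypothetical outcome of resolving the selections for one turn.
    #[derive(Debug, PartialEq, Eq)]
    enum TurnEnvironment {
        /// Selections were omitted: keep the default behavior.
        DefaultBehavior,
        /// Selections were present but empty: run with no environment.
        NoEnvironment,
        /// The first selection becomes the turn primary.
        Primary(EnvironmentSelection),
    }

    fn resolve_turn_environment(
        selections: Option<Vec<EnvironmentSelection>>,
    ) -> TurnEnvironment {
        match selections {
            None => TurnEnvironment::DefaultBehavior,
            Some(list) => match list.into_iter().next() {
                None => TurnEnvironment::NoEnvironment,
                Some(first) => TurnEnvironment::Primary(first),
            },
        }
    }
    ```
    
    Using `Option<Vec<_>>` keeps "omitted" and "empty" distinguishable,
    which is the distinction this change depends on.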
    
    ## Testing
    - ran `just fmt`
    - ran `just write-app-server-schema`
    - not run: unit tests for this stacked PR
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: document intentional await-holding cases (#18423)
    ## Why
    
    This PR prepares the stack to enable Clippy await-holding lints that
    were left disabled in #18178. The mechanical lock-scope cleanup is
    handled separately; this PR is the documentation/configuration layer for
    the remaining await-across-guard sites.
    
    Without explicit annotations, reviewers and future maintainers cannot
    tell whether an await-holding warning is a real concurrency smell or an
    intentional serialization boundary.
    
    ## What changed
    
    - Configures `clippy.toml` so `await_holding_invalid_type` also covers
    `tokio::sync::{MutexGuard,RwLockReadGuard,RwLockWriteGuard}`.
    - Adds targeted `#[expect(clippy::await_holding_invalid_type, reason =
    ...)]` annotations for intentional async guard lifetimes.
    - Documents the main categories of intentional cases: active-turn state
    transitions that must remain atomic, session-owned MCP manager accesses,
    remote-control websocket serialization, JS REPL kernel/process
    serialization, OAuth persistence, external bearer token refresh
    serialization, and tests that intentionally serialize shared global or
    session-owned state.
    - For external bearer token refresh, documents the existing
    serialization boundary: holding `cached_token` across the provider
    command prevents concurrent cache misses from starting duplicate refresh
    commands, and the current behavior is small enough that an explicit
    expectation is easier to maintain than adding another synchronization
    primitive.
    
    ## Verification
    
    - `cargo clippy -p codex-login --all-targets`
    - `cargo clippy -p codex-connectors --all-targets`
    - `cargo clippy -p codex-core --all-targets`
    - The follow-up PR #18698 enables `await_holding_invalid_type` and
    `await_holding_lock` as workspace `deny` lints, so any undocumented
    remaining offender will fail Clippy.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18423).
    * #18698
    * __->__ #18423
  • refactor: use semaphores for async serialization gates (#18403)
    This is the second cleanup in the await-holding lint stack. The
    higher-level goal, following https://github.com/openai/codex/pull/18178
    and https://github.com/openai/codex/pull/18398, is to enable Clippy
    coverage for guards held across `.await` points without carrying broad
    suppressions.
    
    The stack is working toward enabling Clippy's
    [`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock)
    lint and the configurable
    [`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type)
    lint for Tokio guard types.
    
    Several existing fields used `tokio::sync::Mutex<()>` only as
    one-at-a-time async gates. Those guards intentionally lived across
    `.await` while an operation was serialized. A mutex over `()` suggests
    protected data and trips the await-holding lint shape; a single-permit
    `tokio::sync::Semaphore` expresses the intended serialization directly.
    
    ## What changed
    
    - Replace `Mutex<()>` serialization gates with `Semaphore::new(1)` for
    agent identity ensure, exec policy updates, guardian review session
    reuse, plugin remote sync, managed network proxy refresh, auth token
    refresh, and RMCP session recovery.
    - Update call sites from `lock().await` / `try_lock()` to
    `acquire().await` / `try_acquire()`.
    - Map closed-semaphore errors into the existing local error types, even
    though these semaphores are owned for the lifetime of their managers.
    - Update session test builders for the new
    `managed_network_proxy_refresh_lock` type.
    
    ## Verification
    
    - The split stack was verified at the final lint-enabling head with
    `just clippy`.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18403).
    * #18698
    * #18423
    * #18418
    * __->__ #18403
  • [5/6] Wire executor-backed MCP stdio (#18212)
    ## Summary
    - Add the executor-backed RMCP stdio transport.
    - Wire MCP stdio placement through the executor environment config.
    - Cover local and executor-backed stdio paths with the existing MCP test
    helpers.
    
    ## Stack
    ```text
    o  #18027 [6/6] Fail exec client operations after disconnect
    │
    @  #18212 [5/6] Wire executor-backed MCP stdio
    │
    o  #18087 [4/6] Abstract MCP stdio server launching
    │
    o  #18020 [3/6] Add pushed exec process events
    │
    o  #18086 [2/6] Support piped stdin in exec process API
    │
    o  #18085 [1/6] Add MCP server environment config
    │
    o  main
    ```
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • refactor: narrow async lock guard lifetimes (#18211)
    Follow-up to https://github.com/openai/codex/pull/18178, where we called
    out enabling the await-holding lint as a follow-up.
    
    The long-term goal is to enable Clippy coverage for async guards held
    across awaits. This PR is intentionally only the first, low-risk cleanup
    pass: it narrows obvious lock guard lifetimes and leaves
    `codex-rs/Cargo.toml` unchanged so the lint is not enabled until the
    remaining cases are fixed or explicitly justified. It intentionally
    leaves the active-turn/turn-state locking pattern alone because those
    checks and mutations need to stay atomic.
    
    ## Common fixes used here
    
    These are the main patterns reviewers should expect in this PR, and they
    are also the patterns to reach for when fixing future `await_holding_*`
    findings:
    
    - **Scope the guard to the synchronous work.** If the code only needs
    data from a locked value, move the lock into a small block, clone or
    compute the needed values, and do the later `.await` after the block.
    - **Use direct one-line mutations when there is no later await.** Cases
    like `map.lock().await.remove(&id)` are acceptable when the guard is
    only needed for that single mutation and the statement ends before any
    async work.
    - **Drain or clone work out of the lock before notifying or awaiting.**
    For example, the JS REPL drains pending exec senders into a local vector
    and the websocket writer clones buffered envelopes before it serializes
    or sends them.
    - **Use a `Semaphore` only when serialization is intentional across
    async work.** The test serialization guards intentionally span awaited
    setup or execution, so using a semaphore communicates "one at a time"
    without holding a mutex guard.
    - **Remove the mutex when there is only one owner.** The PTY stdin
    writer task owns `stdin` directly; the old `Arc<Mutex<_>>` did not
    protect shared access because nothing else had access to the writer.
    - **Do not split locks that protect an atomic invariant.** This PR
    deliberately leaves active-turn/turn-state paths alone because those
    checks and mutations need to stay atomic. Those cases should be fixed
    separately with a design change or documented with `#[expect]`.
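    
    The first pattern (scope the guard, then await) can be sketched with the
    standard library alone. This uses `std::sync::Mutex` as a stand-in for
    the async mutexes touched in this PR, and `notify`, `remove_and_notify`,
    and the tiny `block_on` executor are all hypothetical helpers added so
    the sketch runs without a runtime:
    
    ```rust
    use std::collections::HashMap;
    use std::future::Future;
    use std::pin::pin;
    use std::sync::{Arc, Mutex};
    use std::task::{Context, Poll, Wake, Waker};

    /// Stand-in for real async work such as sending a notification.
    async fn notify(name: String) -> String {
        format!("notified {name}")
    }

    /// Take what is needed under the lock, drop the guard, then await.
    async fn remove_and_notify(map: &Mutex<HashMap<u32, String>>, id: u32) -> Option<String> {
        let removed = {
            let mut guard = map.lock().unwrap();
            guard.remove(&id)
        }; // guard is dropped here, before any `.await`
        match removed {
            Some(name) => Some(notify(name).await),
            None => None,
        }
    }

    /// Waker that does nothing; enough for futures that never park.
    struct NoopWake;
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }

    /// Minimal polling executor for this sketch only.
    fn block_on<F: Future>(fut: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut fut = pin!(fut);
        loop {
            if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
                return value;
            }
            std::thread::yield_now();
        }
    }
    ```
    
    The same shape applies to an async mutex: the guard's scope ends at the
    closing brace of the inner block, so no guard is alive at the `.await`.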
    
    ## What changed
    
    - Narrow scoped async mutex guards in app-server, JS REPL, network
    approval, remote-control websocket, and the RMCP test server.
    - Replace test-only async mutex serialization guards with semaphores
    where the guard intentionally lives across async work.
    - Let the PTY pipe writer task own stdin directly instead of wrapping it
    in an async mutex.
    
    ## Verification
    
    - `just fix -p codex-core -p codex-app-server -p codex-rmcp-client -p
    codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness`
    - `just clippy -p codex-core`
    - `cargo test -p codex-core -p codex-app-server -p codex-rmcp-client -p
    codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` was
    run; the app-server suite passed, and `codex-core` failed in the local
    sandbox on six otel approval tests plus
    `suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`,
    which appear to depend on local command approval/default rules and
    `CODEX_SANDBOX_NETWORK_DISABLED=1` in this environment.
  • [4/6] Abstract MCP stdio server launching (#18087)
    ## Summary
    - Move local MCP stdio process startup behind a launcher trait.
    - Preserve existing local stdio behavior while making transport creation
    explicit.
    
    ## Stack
    ```text
    o  #18027 [6/6] Fail exec client operations after disconnect
    │
    o  #18212 [5/6] Wire executor-backed MCP stdio
    │
    @  #18087 [4/6] Abstract MCP stdio server launching
    │
    o  #18020 [3/6] Add pushed exec process events
    │
    o  #18086 [2/6] Support piped stdin in exec process API
    │
    o  #18085 [1/6] Add MCP server environment config
    │
    o  main
    ```
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Support original-detail metadata on MCP image outputs (#17714)
    ## Summary
    - honor `_meta["codex/imageDetail"] == "original"` on MCP image content
    and map it to `detail: "original"` where supported
    - strip that detail back out when the active model does not support
    original-detail image inputs
    - update code-mode `image(...)` to accept individual MCP image blocks
    - teach `js_repl` / `codex.emitImage(...)` to preserve the same hint
    from raw MCP image outputs
    - document the new `_meta` contract and add generic RMCP-backed coverage
    across protocol, core, code-mode, and js_repl paths
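    
    A rough sketch of the honor-then-strip rule. The `_meta` key
    `codex/imageDetail` and the value `original` come from this PR; the
    `image_detail` helper and the use of a plain string map in place of JSON
    `_meta` are simplifying assumptions:
    
    ```rust
    use std::collections::HashMap;

    /// `_meta` key named in this PR.
    const IMAGE_DETAIL_META_KEY: &str = "codex/imageDetail";

    /// Return `Some("original")` only when the MCP image block asks for it
    /// and the active model supports original-detail image inputs.
    /// Otherwise the hint is dropped.
    fn image_detail(
        meta: &HashMap<String, String>,
        model_supports_original: bool,
    ) -> Option<&'static str> {
        let requested =
            meta.get(IMAGE_DETAIL_META_KEY).map(String::as_str) == Some("original");
        if requested && model_supports_original {
            Some("original")
        } else {
            None
        }
    }
    ```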
  • Send sandbox state through MCP tool metadata (#17763)
    ## Changes
    
    Allows MCP servers to opt in to receiving sandbox configuration through
    `_meta` on model-initiated tool calls, so a server can honor the
    thread's sandbox if it chooses to.
    
    ## Details
    
    - Adds the `codex/sandbox-state-meta` experimental MCP capability.
    - Tracks whether each MCP server advertises that capability.
    - When a server opts in, `codex-core` injects the current `SandboxState`
    into model-initiated MCP tool-call request `_meta`.
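    
    A rough sketch of the opt-in check, under simplifying assumptions: the
    capability name is the one listed above, but the `_meta` key
    (`example/sandbox-state`), the one-field `SandboxState`, and the
    `build_tool_call_meta` helper are hypothetical, and `_meta` is modeled as
    a string map rather than JSON.
    
    ```rust
    use std::collections::{BTreeMap, BTreeSet};

    /// Capability name taken from the PR description.
    const SANDBOX_STATE_META_CAPABILITY: &str = "codex/sandbox-state-meta";
    /// Hypothetical `_meta` key; the description does not name the real one.
    const SANDBOX_STATE_META_KEY: &str = "example/sandbox-state";

    /// Simplified stand-in for the real `SandboxState` type.
    #[derive(Clone, Debug, PartialEq)]
    pub struct SandboxState {
        pub policy: String,
    }

    /// Builds `_meta` for a model-initiated tool call, adding the sandbox
    /// state only when the server advertised the capability.
    pub fn build_tool_call_meta(
        advertised: &BTreeSet<String>,
        sandbox: &SandboxState,
        mut meta: BTreeMap<String, String>,
    ) -> BTreeMap<String, String> {
        if advertised.contains(SANDBOX_STATE_META_CAPABILITY) {
            meta.insert(SANDBOX_STATE_META_KEY.to_string(), sandbox.policy.clone());
        }
        meta
    }
    ```
    
    Servers that do not advertise the capability see their `_meta` unchanged
    in this sketch, which mirrors the opt-in behavior described above.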
    
    ## Verification
    
    - Added an integration test for the capability.
  • Add supports_parallel_tool_calls flag to included mcps (#17667)
    ## Why
    
    For more advanced MCP usage, we want the model to be able to emit
    parallel MCP tool calls and have Codex execute eligible ones
    concurrently, instead of forcing all MCP calls through the serial block.
    
    The main design choice was where to thread the config. I made this
    server-level because parallel safety depends on the MCP server
    implementation. Codex reads the flag from `mcp_servers`, threads the
    opted-in server names into `ToolRouter`, and checks the parsed
    `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
    on model-visible tool names, which can be incomplete in
    deferred/search-tool paths or ambiguous for similarly named
    servers/tools.
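    
    The execution-time check can be sketched roughly as below. The
    `ToolPayload` and `ToolRouter` types here are simplified stand-ins, and
    the field and method names (`parallel_mcp_servers`,
    `mcp_call_supports_parallel`) are assumptions made for illustration; only
    the `ToolPayload::Mcp { server, .. }` shape comes from the description.
    
    ```rust
    use std::collections::HashSet;

    /// Simplified stand-in for the parsed tool payload.
    pub enum ToolPayload {
        Mcp { server: String, tool: String },
        Function { name: String },
    }

    /// Hypothetical router holding the names of servers that set
    /// `supports_parallel_tool_calls = true` in `mcp_servers`.
    pub struct ToolRouter {
        parallel_mcp_servers: HashSet<String>,
    }

    impl ToolRouter {
        pub fn new<I: IntoIterator<Item = String>>(servers: I) -> Self {
            Self {
                parallel_mcp_servers: servers.into_iter().collect(),
            }
        }

        /// An MCP call is eligible for parallel execution only when its
        /// exact server name opted in. Other payload kinds return false
        /// here purely to keep the sketch small.
        pub fn mcp_call_supports_parallel(&self, payload: &ToolPayload) -> bool {
            match payload {
                ToolPayload::Mcp { server, .. } => {
                    self.parallel_mcp_servers.contains(server)
                }
                _ => false,
            }
        }
    }
    ```
    
    Matching on the exact server name from the payload means a similarly
    named server such as `docs2` does not inherit the opt-in from `docs`.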
    
    ## What was added
    
    Added `supports_parallel_tool_calls` for MCP servers.
    
    Before:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    ```
    
    After:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    supports_parallel_tool_calls = true
    ```
    
    MCP calls remain serial by default. Only tools from opted-in servers are
    eligible to run in parallel. Docs also now warn to enable this only when
    the server’s tools are safe to run concurrently, especially around
    shared state or read/write races.
    
    ## Testing
    
    Tested with a local stdio MCP server exposing real delay tools. The
    model/Responses side was mocked only to deterministically emit two MCP
    calls in the same turn.
    
    Each test called `query_with_delay` and `query_with_delay_2` with `{
    "seconds": 25 }`.
    
    | Build/config | Observed | Wall time |
    | --- | --- | --- |
    | main with flag enabled | serial | `58.79s` |
    | PR with flag enabled | parallel | `31.73s` |
    | PR without flag | serial | `56.70s` |
    
    With the flag enabled, the PR build started both tools before either
    completed; main and the PR build without the flag completed the first
    delay before starting the second.
    
    Also added an integration test.
    
    Additional checks:
    
    - `cargo test -p codex-tools` passed
    - `cargo test -p codex-core
    mcp_parallel_support_uses_exact_payload_server` passed
    - `git diff --check` passed
  • fix(mcp) pause timer for elicitations (#17566)
    ## Summary
    Stop counting elicitation time towards MCP tool call time. There are
    some tradeoffs here, but in general I don't think time spent waiting for
    elicitations should count towards tool call time, or at least not
    directly towards timeouts.
    
    Elicitations are not exactly like exec_command escalation requests, but
    I would argue they are roughly equivalent.
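    
    One way to picture the behavior is a timer that only accumulates time
    while it is not paused. This is a sketch, not the implementation in this
    PR: `PausableTimer` and its methods are hypothetical, and timestamps are
    passed in explicitly so the logic is easy to follow.
    
    ```rust
    use std::time::{Duration, Instant};

    /// Hypothetical timer that counts only un-paused time toward a
    /// tool-call timeout.
    pub struct PausableTimer {
        accumulated: Duration,
        running_since: Option<Instant>,
    }

    impl PausableTimer {
        pub fn start(now: Instant) -> Self {
            Self {
                accumulated: Duration::ZERO,
                running_since: Some(now),
            }
        }

        /// Call when an elicitation begins: bank the time run so far.
        pub fn pause(&mut self, now: Instant) {
            if let Some(since) = self.running_since.take() {
                self.accumulated += now.saturating_duration_since(since);
            }
        }

        /// Call when the elicitation is answered: count again from `now`.
        pub fn resume(&mut self, now: Instant) {
            if self.running_since.is_none() {
                self.running_since = Some(now);
            }
        }

        /// Time counted toward the timeout, excluding paused intervals.
        pub fn elapsed(&self, now: Instant) -> Duration {
            match self.running_since {
                Some(since) => self.accumulated + now.saturating_duration_since(since),
                None => self.accumulated,
            }
        }

        pub fn timed_out(&self, now: Instant, timeout: Duration) -> bool {
            self.elapsed(now) >= timeout
        }
    }
    ```
    
    In this sketch, a tool call that runs for 5 seconds, waits 60 seconds on
    an elicitation, and then runs 5 more seconds has 10 seconds counted
    against its timeout.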
    
    ## Testing
    - [x] Added unit tests
    - [x] Tested locally
  • Add output_schema to code mode render (#17210)
    This updates code-mode tool rendering so MCP tools can surface
    structured output types from their `outputSchema`.
    
    What changed:
    - Detect MCP tool-call result wrappers from the output schema shape
    instead of relying on tool-name parsing or provenance flags.
    - Render shared TypeScript aliases once for MCP tool results
    (`CallToolResult`, `ContentBlock`, etc.) so multiple MCP tool
    declarations stay compact.
    - Type `structuredContent` from the tool definition's `outputSchema`
    instead of rendering it as `unknown`.
    - Update the shared MCP aliases to match the MCP draft `CallToolResult`
    schema more closely.
    
    Example:
    - Before: `declare const tools: { mcp__rmcp__echo(args: { env_var?:
    string; message: string; }): Promise<{ _meta?: unknown; content:
    Array<unknown>; isError?: boolean; structuredContent?: unknown; }>; };`
    - After: `declare const tools: { mcp__rmcp__echo(args: { env_var?:
    string; message: string; }): Promise<CallToolResult<{ echo: string; env:
    string | null; }>>; };`