107 Commits

  • [codex] Classify nested MCP authentication startup errors (#30257)
    ## Summary
    
    - classify authentication-required RMCP startup failures, including
    errors nested inside `ClientInitializeError::TransportError`
    - let `codex-mcp` consume that classification so the existing
    `reauthenticationRequired` startup failure reason is emitted
    - add a regression test that performs real startup with an expired
    persisted OAuth token and no refresh token
    
    ## Why
    
    Follow-up to #29877.
    
    RMCP stores streamable HTTP initialization failures inside a dynamic
    transport error whose payload is not exposed through the standard Rust
    error source chain. The original `anyhow::Error::chain()` check
    therefore missed the nested `AuthError::AuthorizationRequired` seen
    during real MCP startup and emitted `failureReason: null`.
    
    The transport-specific inspection now lives in `codex-rmcp-client`,
    while `codex-mcp` consumes only the domain-level authentication-required
    result. This classifier does not distinguish first-time login from
    reauthentication; the existing auth-state logic remains responsible for
    that distinction.
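
    The gap described above can be illustrated with a minimal sketch. All
    types here are hypothetical stand-ins, not the real RMCP definitions: a
    transport error keeps its payload type-erased and does not report it via
    `Error::source`, so a chain walk misses it and the classifier has to
    inspect the payload explicitly.

    ```rust
    use std::any::Any;
    use std::error::Error;
    use std::fmt;

    // Hypothetical stand-in for the auth error named above.
    #[derive(Debug)]
    enum AuthError {
        AuthorizationRequired,
    }

    impl fmt::Display for AuthError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "authorization required")
        }
    }

    impl Error for AuthError {}

    // Hypothetical transport error: the payload is type-erased and is not
    // returned from `source()`, so it is invisible to a chain walk.
    #[derive(Debug)]
    struct DynamicTransportError {
        payload: Box<dyn Any + Send + Sync>,
    }

    impl fmt::Display for DynamicTransportError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "transport error")
        }
    }

    impl Error for DynamicTransportError {}

    #[derive(Debug)]
    enum ClientInitializeError {
        TransportError(DynamicTransportError),
        Other(String),
    }

    impl fmt::Display for ClientInitializeError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Self::TransportError(inner) => write!(f, "initialize failed: {inner}"),
                Self::Other(message) => write!(f, "initialize failed: {message}"),
            }
        }
    }

    impl Error for ClientInitializeError {
        fn source(&self) -> Option<&(dyn Error + 'static)> {
            match self {
                Self::TransportError(inner) => Some(inner),
                Self::Other(_) => None,
            }
        }
    }

    /// Walks only the standard source chain.
    fn chain_has_auth_required(err: &(dyn Error + 'static)) -> bool {
        let mut current = Some(err);
        while let Some(e) = current {
            if e.downcast_ref::<AuthError>().is_some() {
                return true;
            }
            current = e.source();
        }
        false
    }

    /// Walks the chain, then also looks inside the transport payload.
    fn is_authentication_required(err: &ClientInitializeError) -> bool {
        if chain_has_auth_required(err) {
            return true;
        }
        match err {
            ClientInitializeError::TransportError(transport) => matches!(
                transport.payload.downcast_ref::<AuthError>(),
                Some(AuthError::AuthorizationRequired)
            ),
            ClientInitializeError::Other(_) => false,
        }
    }
    ```

    In this sketch the chain-only check returns `false` for a nested
    `AuthorizationRequired`, while the payload-aware check returns `true`.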
    
    ## User impact
    
    When stored MCP OAuth credentials are expired and cannot be refreshed,
    app clients now receive `failureReason: "reauthenticationRequired"` on
    the failed startup update and can show the reconnect action. First-time
    login and unrelated startup failures remain unchanged.
    
    ## Validation
    
    - `just test -p codex-rmcp-client --test streamable_http_oauth_startup
    identifies_expired_unrefreshable_token_startup_error`
    - `just test -p codex-mcp
    startup_outcome_error_identifies_authentication_required`
    - `just test -p codex-mcp
    mcp_startup_failure_reason_requires_existing_oauth_and_auth_failure`
    - `cargo build -p codex-cli --bin codex`
    - local app-server probe emitted `failureReason:
    "reauthenticationRequired"`
    - manual end-to-end reconnect flow confirmed
    - `just fmt`
  • Reuse MCP runtimes when selected availability changes nothing (#30148)
    ## Why
    
    MCP runtime reuse was keyed by every ready selected-capability
    environment, even when an environment contributed no MCP servers or
    connectors.
    
    For example:
    
    1. a global stdio MCP is running;
    2. a selected remote environment contains only a skill;
    3. that environment becomes ready;
    4. the MCP and connector projection stays exactly the same;
    5. Codex nevertheless rebuilds the MCP manager and restarts the global
    stdio process.
    
    That restart can interrupt active calls and discard process-local state
    even though nothing about MCP changed.
    
    ## What changes
    
    When selected-environment availability changes, Codex now resolves the
    candidate MCP and connector projection before deciding whether to
    replace the runtime:
    
    - if the winning MCP servers or their ownership change, rebuild as
    before;
    - if the selected connector snapshot changes, rebuild as before;
    - if an enabled MCP is explicitly bound to an environment whose
    availability changed, rebuild as before;
    - otherwise, keep the exact live manager and processes, and update only
    the availability input remembered by the snapshot.
    
    ```text
    ready selected environments:  [] -> [skills-env]
    resolved MCP servers:          {global_probe} -> {global_probe}
    resolved connectors:           {} -> {}
    result:                         reuse manager; keep the same process
    ```
    
    The comparison uses the resolved winning servers and their sources, so
    plugin/config ownership remains part of the runtime identity.
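
    The rebuild-or-reuse rules above could be sketched roughly as follows.
    The type and function names (`Projection`, `ResolvedServer`, `decide`)
    are illustrative assumptions, not the names used in the PR.

    ```rust
    use std::collections::{BTreeMap, BTreeSet};

    /// Hypothetical owner of a winning server definition.
    #[derive(Clone, Debug, PartialEq, Eq)]
    enum Source {
        Config,
        Plugin(String),
    }

    #[derive(Clone, Debug, PartialEq, Eq)]
    struct ResolvedServer {
        source: Source,
        /// Environment the server is explicitly bound to, if any.
        bound_environment: Option<String>,
        enabled: bool,
    }

    /// Hypothetical resolved MCP and connector projection.
    #[derive(Clone, Debug, Default, PartialEq, Eq)]
    struct Projection {
        servers: BTreeMap<String, ResolvedServer>,
        connectors: BTreeSet<String>,
    }

    #[derive(Debug, PartialEq, Eq)]
    enum Decision {
        Rebuild,
        Reuse,
    }

    fn decide(
        current: &Projection,
        candidate: &Projection,
        changed_environments: &BTreeSet<String>,
    ) -> Decision {
        // Winning servers (including their sources) or connectors changed.
        if current.servers != candidate.servers || current.connectors != candidate.connectors {
            return Decision::Rebuild;
        }
        // An enabled server is bound to an environment whose availability changed.
        let bound_to_changed = candidate.servers.values().any(|server| {
            server.enabled
                && server
                    .bound_environment
                    .as_ref()
                    .is_some_and(|env| changed_environments.contains(env))
        });
        if bound_to_changed {
            Decision::Rebuild
        } else {
            Decision::Reuse
        }
    }
    ```

    Because `Source` is part of `ResolvedServer`, an ownership change alone
    makes the two projections unequal and forces a rebuild in this sketch.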
    
    ## Existing stack coverage
    
    The integration PR directly below this one already covers both rebuild
    boundaries: a selected MCP becomes callable and a selected connector
    tool becomes model-visible when their environment becomes available. It
    also verifies that an unchanged selected MCP runtime keeps its process.
    
    This PR does not add another remote-attachment integration scenario for
    the no-change optimization. `environment/add` returns before readiness,
    and app-server does not currently expose a deterministic readiness
    signal for an environment that contributes only skills. Keeping a
    fixed-delay test would add flake risk; adding a new readiness API would
    be outside this fix.
    
    ## Scope and assumptions
    
    - This does not change skill discovery, World State rendering, or plugin
    metadata caching.
    - This does not add file watching or hot reload behavior.
    - This does not change disconnect/reconnect handling.
    - Selected environment IDs and their capability contents retain the
    stack's existing stability assumption.
    - Delayed `required = true` executor MCP behavior remains out of scope.
  • Retry failed Codex Apps MCP startup (#29920)
    ## Problem
    
    The built-in Codex Apps MCP client shares a future for the full startup
    operation: connect, complete `initialize`, fetch the initial tools, and
    return a usable client. Sharing deduplicates startup work, but it also
    memoizes terminal errors.
    
    After a transient connection, handshake, or initial `tools/list`
    failure, later tool builds observe the same failed future. The thread
    cannot reconnect after the backend recovers, so it keeps serving its
    startup-time cached tool snapshot, which may be empty or stale.
    
    ## Fix
    
    When Apps MCP startup ends in an error, Codex starts bounded recovery
    without putting startup latency on tool-router construction:
    
    1. The current tool build immediately continues with the cached startup
    snapshot.
    2. After the initial failure is reported, Codex starts one fresh full
    startup attempt in the background.
    3. Concurrent tool builds share that in-flight attempt and also continue
    with cached tools.
    4. On success, the recovered client becomes active, refreshes the Apps
    tools cache, emits a `Ready` startup status, and is reused by later
    operations.
    5. On failure, the cache remains unchanged and later tool builds may
    start another background attempt after exponential cooldown: 1s, 2s, 4s,
    8s, 16s, then 30s maximum.
    
    Each recreated startup performs a fresh MCP `initialize` and uncached
    `tools/list`. The MCP client retains its existing bounded retries for
    retryable `initialize` and `tools/list` failures.
    
    This avoids adding the Apps startup timeout to every request during a
    sustained outage.
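
    The cooldown schedule listed in the fix can be written as a capped
    doubling. This is a sketch only; `cooldown_after` and the choice to
    count from the first failed attempt are assumptions.

    ```rust
    use std::time::Duration;

    /// Upper bound on the cooldown, taken from the schedule above.
    const MAX_COOLDOWN: Duration = Duration::from_secs(30);

    /// Cooldown before the next background attempt, given how many
    /// consecutive attempts have failed (hypothetical helper).
    fn cooldown_after(consecutive_failures: u32) -> Duration {
        // 1 failure -> 1s, 2 -> 2s, 3 -> 4s, ... capped at 30s.
        let exponent = consecutive_failures.saturating_sub(1).min(31);
        Duration::from_secs(1u64 << exponent).min(MAX_COOLDOWN)
    }
    ```

    Under that assumption, failures 1 through 5 yield 1s, 2s, 4s, 8s, and
    16s, and every later failure yields the 30s cap.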
    
    ## Scope
    
    This is limited to the built-in Codex Apps MCP client:
    
    - no reconnects for user-configured MCP servers;
    - no cache deletion; and
    - no proactive refresh for a healthy client with stale tools.
    
    ## Tests
    
    Coverage verifies:
    
    - tool builds return cached tools without waiting for a blocked
    reconnect;
    - concurrent tool builds start only one background reconnect;
    - failed reconnects preserve cached tools and respect exponential
    cooldown;
    - a recovered client is retained and reused; and
    - a long-lived thread exposes recovered app tools on a later follow-up.
    
    Validation:
    
    - `just test -p codex-mcp` — 95 passed
    - `just test -p codex-core
    later_follow_up_uses_background_recovered_apps_after_mid_thread_startup_failures
    --no-capture` — passed
    - `just fix -p codex-mcp`
    - `just fmt`
  • Keep MCP elicitation routable across runtime refreshes (#30127)
    ## Why
    
    An MCP tool call can still be waiting for an elicitation response when
    an environment update replaces the thread's MCP runtime.
    
    Before this change:
    
    ```text
    runtime A starts a tool call and asks the user
    environment becomes ready, so runtime B is published
    client answers the prompt through runtime B
    runtime B cannot find runtime A's pending responder
    ```
    
    The response is lost and the original tool call stays blocked.
    
    ## What changed
    
    All MCP runtimes for one thread now share a small elicitation router:
    
    ```text
    runtime A ---\
                   shared router: response token -> exact pending responder
    runtime B ---/
    ```
    
    When Codex surfaces an MCP elicitation, it assigns a unique opaque
    response token. The router records which pending request owns that
    token. A replacement runtime reuses the same router, so the latest
    runtime can deliver a response to a request started by the previous
    runtime.
    
    The Codex-owned token also prevents two runtime connections that reuse
    the same MCP server request ID from receiving each other's responses.
    
    This does not retain or search old MCP managers. Only the pending
    responder map is shared.
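
    A minimal sketch of the shared-router idea, using only the standard
    library. `ElicitationRouter`, `Runtime`, and the counter-based token are
    hypothetical; the PR's real token is described only as unique and
    opaque.

    ```rust
    use std::collections::HashMap;
    use std::sync::mpsc::{channel, Receiver, Sender};
    use std::sync::{Arc, Mutex};

    /// Codex-owned token, independent of any MCP server request ID.
    #[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
    struct ResponseToken(u64);

    #[derive(Default)]
    struct ElicitationRouter {
        next_token: u64,
        pending: HashMap<ResponseToken, Sender<String>>,
    }

    type SharedRouter = Arc<Mutex<ElicitationRouter>>;

    /// Every runtime of one thread holds the same router.
    struct Runtime {
        router: SharedRouter,
    }

    impl Runtime {
        /// Registers a pending request and returns its token plus the
        /// receiver the waiting tool call would block on.
        fn begin_elicitation(&self) -> (ResponseToken, Receiver<String>) {
            let (tx, rx) = channel();
            let mut router = self.router.lock().unwrap();
            let token = ResponseToken(router.next_token);
            router.next_token += 1;
            router.pending.insert(token, tx);
            (token, rx)
        }

        /// Delivers an answer to the exact request that owns `token`.
        /// Returns false if no such request is pending.
        fn respond(&self, token: ResponseToken, answer: &str) -> bool {
            let responder = self.router.lock().unwrap().pending.remove(&token);
            match responder {
                Some(tx) => tx.send(answer.to_string()).is_ok(),
                None => false,
            }
        }
    }
    ```

    In this sketch a replacement runtime built with a clone of the same
    `SharedRouter` can answer a prompt that the previous runtime started.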
    
    ## Covered scenario
    
    The integration test exercises the complete failure mode:
    
    1. A thread starts while its selected environment is still unavailable.
    2. A configured MCP server starts a tool call and asks the client for
    input.
    3. The environment becomes ready, causing Codex to publish a replacement
    MCP runtime.
    4. The client answers the original prompt after the replacement.
    5. The original tool call receives that answer and completes.
    
    A focused routing test also creates two runtimes with the same server
    request ID and verifies that each response reaches the exact request
    that emitted its token.
    
    ## Scope
    
    This PR changes only elicitation response routing across MCP runtime
    replacement. It does not change when runtimes are rebuilt, which
    environments contribute MCP configuration, or how environment
    availability is detected.
  • Pin MCP runtimes to model steps (#30101)
    ## Why
    
    An MCP refresh can replace the session's current manager while a model
    step is still running. The step must execute calls through the same
    manager whose tools it advertised.
    
    ## Boundary
    
    ```text
    current session MCP runtime
              |
              | capture once for this model step
              v
    StepContext.mcp
      - exact MCP config
      - exact connection manager
      - exact runtime environment context
    ```
    
    ```rust
    pub struct McpRuntimeSnapshot {
        config: Arc<McpConfig>,
        manager: Arc<McpConnectionManager>,
        runtime_context: McpRuntimeContext,
    }
    ```
    
    ## Example
    
    ```text
    step A captures runtime A and advertises A's tools
    refresh publishes runtime B
    step A tool call -> runtime A
    next step        -> runtime B
    ```
    
    Capturing the snapshot is only an `Arc` clone. It does not restart MCPs
    or make an RPC.
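
    The capture-once behaviour can be sketched as below. `McpRuntime` and
    `Session` are simplified stand-ins, and a `Mutex` is used here in place
    of whatever atomic publication mechanism the real code uses.

    ```rust
    use std::sync::{Arc, Mutex};

    /// Simplified stand-in for a published MCP runtime.
    #[derive(Debug)]
    struct McpRuntime {
        name: &'static str,
    }

    /// Hypothetical session holding the currently published runtime.
    struct Session {
        current: Mutex<Arc<McpRuntime>>,
    }

    impl Session {
        /// Capturing is only an `Arc` clone of the current runtime.
        fn capture(&self) -> Arc<McpRuntime> {
            let guard = self.current.lock().unwrap();
            Arc::clone(&*guard)
        }

        /// Publishes a replacement; existing captures are unaffected.
        fn publish(&self, next: McpRuntime) {
            *self.current.lock().unwrap() = Arc::new(next);
        }
    }

    /// A model step keeps the runtime it captured at its start.
    struct StepContext {
        mcp: Arc<McpRuntime>,
    }
    ```

    A step that captured runtime A keeps using A after a refresh publishes
    B, and once that step drops its `Arc`, nothing else in this sketch keeps
    A alive.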
    
    ## What changes
    
    - Captures one MCP runtime in `StepContext`.
    - Uses it for tool planning, tool calls, resources, approvals, connector
    attribution, and elicitation.
    - Publishes replacement runtimes atomically.
    - Lets an old runtime live only while an in-flight step or request still
    holds its `Arc`.
    
    Most of this diff is mechanical routing from the session-global manager
    to `step_context.mcp`; it does not introduce selected-plugin discovery
    yet.
    
    ## What does not change
    
    - No plugin or extension migration.
    - No new MCP cache policy.
    - No environment file watching.
    - No client sharing between separate managers.
    
    ## Stack
    
    1. Extension-owned World State sections.
    2. Project executor skills through World State.
    3. **This PR:** pin one MCP runtime to each model step.
    4. Project selected MCP/app/connector metadata by environment
    availability.
    5. One end-to-end integration scenario.
  • [codex] Surface MCP reauthentication-required startup failures (#29877)
    ## Summary
    
    - distinguish expired, non-refreshable stored MCP OAuth credentials from
    first-time missing credentials
    - carry a typed `failureReason: "reauthenticationRequired"` on the
    existing `mcpServer/startupStatus/updated` notification only when user
    action is required
    - keep the public MCP auth-status API unchanged and regenerate the
    app-server protocol schemas and documentation
    
    ## Why
    
    An MCP server with an expired access token and no usable refresh token
    currently fails startup without giving clients a reliable, typed
    recovery signal.
    
    The existing startup-status notification is the natural place to carry
    this state. Its nullable `failureReason` keeps the recovery reason
    attached to the failed startup transition without adding a one-off
    notification. Internally, Codex distinguishes first-time login from
    reauthentication and emits the reason only when the startup error itself
    requires authentication.
    
    ## User impact
    
    App clients can prompt an existing user to reconnect an MCP server when
    automatic recovery is impossible by handling a failed
    `mcpServer/startupStatus/updated` notification whose `failureReason` is
    `reauthenticationRequired`. Starting, ready, cancelled, unrelated
    failures, and first-time setup carry no reauthentication reason.
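
    The rule for when the reason is emitted could look like the sketch
    below. The enum, the helper, and its two boolean inputs are
    illustrative assumptions; only the wire string
    `reauthenticationRequired` comes from the description above.

    ```rust
    /// Hypothetical typed reason carried on a failed startup update.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum McpStartupFailureReason {
        ReauthenticationRequired,
    }

    impl McpStartupFailureReason {
        /// Value of the nullable `failureReason` field.
        fn as_wire_str(self) -> &'static str {
            match self {
                Self::ReauthenticationRequired => "reauthenticationRequired",
            }
        }
    }

    /// Emits a reason only when credentials were stored before (so this is
    /// not first-time login) and the startup error itself requires
    /// authentication.
    fn startup_failure_reason(
        has_stored_oauth_credentials: bool,
        startup_error_requires_auth: bool,
    ) -> Option<McpStartupFailureReason> {
        if has_stored_oauth_credentials && startup_error_requires_auth {
            Some(McpStartupFailureReason::ReauthenticationRequired)
        } else {
            None
        }
    }
    ```

    First-time setup and unrelated failures both map to `None`, which
    corresponds to `failureReason: null`.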
    
    ## Companion app PR
    
    - openai/openai#1069582
    
    ## Validation
    
    - `just test -p codex-app-server-protocol` — 248 passed; schema fixture
    tests passed
    - `cargo check -p codex-app-server -p codex-tui`
    - `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped
    - `just test -p codex-protocol -p codex-app-server-protocol -p
    codex-mcp` — 579 passed
    - `just write-app-server-schema`
    - `just fmt`
  • feat(core, mcp): cache codex_apps tools in memory (#29003)
    ## Description
    
    This makes Codex Apps tool reads use a shared in-memory snapshot instead
    of rereading the disk cache every time `list_all_tools()` runs. Disk
    still seeds the cache on startup and gets updated after successful
    fetches, but it is no longer the live read path.
    
    The core change is that `McpManager` now owns a process-scoped
    `CodexAppsToolsCache`. Codex threads in the same app-server process now
    share this Codex Apps in-memory tools snapshot. The snapshot is keyed by
    the Codex home plus the Codex Apps identity: the active Codex auth
    user/workspace and the effective Codex Apps MCP source config.
    
    Code to hard-refresh the cache already exists; the in-memory snapshot
    respects it.
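
    A process-scoped snapshot keyed as described might be sketched like
    this. The field names, the `String` tool representation, and the
    `ToolsCache` type are assumptions for illustration, not the actual
    `CodexAppsToolsCache` API.

    ```rust
    use std::collections::HashMap;
    use std::path::PathBuf;
    use std::sync::{Arc, RwLock};

    /// Hypothetical cache key: Codex home plus the Codex Apps identity.
    #[derive(Clone, Debug, PartialEq, Eq, Hash)]
    struct CacheKey {
        codex_home: PathBuf,
        user: String,
        workspace: String,
        source_config: String,
    }

    /// Shared by all threads in one process; reads never touch disk.
    #[derive(Default)]
    struct ToolsCache {
        snapshots: RwLock<HashMap<CacheKey, Arc<Vec<String>>>>,
    }

    impl ToolsCache {
        /// Returns the shared snapshot for `key`, if one has been stored.
        fn current_tools(&self, key: &CacheKey) -> Option<Arc<Vec<String>>> {
            self.snapshots.read().unwrap().get(key).cloned()
        }

        /// Replaces the snapshot, e.g. after a successful fetch or when
        /// seeding from disk at startup.
        fn store(&self, key: CacheKey, tools: Vec<String>) {
            self.snapshots.write().unwrap().insert(key, Arc::new(tools));
        }
    }
    ```

    Repeated reads hand out clones of the same `Arc`, so the per-read cost
    is a map lookup rather than a file read and JSON parse.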
    
    ## Local benchmark
    
    I ran a local steady-state microbenchmark of the exact repeated Codex
    Apps cached-tools read this PR removes, using the same real local cache
    payload in both trees: `3,678,138` bytes and `381` tools. The cache file
    was already warm in the OS page cache, so this measures same-process
    reread/deserialization work rather than cold-disk latency or full turn
    latency. Each run is 25 iterations (mimicking a turn that makes 25
    inference calls).
    
    | Version | Run 1 | Run 2 | Avg |
    |---|---:|---:|---:|
    | `origin/main` disk read + JSON deserialize + `filter_tools` | `50.755 ms` | `52.894 ms` | `51.825 ms` |
    | This branch in-memory `current_tools` + `filter_tools` | `0.740 ms` | `0.778 ms` | `0.759 ms` |
    
    That removes about `51 ms` from each repeated Codex Apps cached-tools
    read on this machine, roughly `68x` faster for that subpath. It is
    useful evidence for the hot path this PR changes, but not a claim that
    every production turn gets `51 ms` faster; end-to-end impact also
    depends on the rest of `list_all_tools()` and tool-payload construction.
    
    These numbers are from an M2 Max MacBook; on a slower disk the
    `origin/main` path would be much worse, and we did see it blow up turn
    runtime on a slow disk.
  • Support OAuth for HTTP MCP servers from selected executor plugins (#28529)
    ## Why
    
    #28522 routes selected-plugin HTTP MCP traffic through the owning
    executor, but OAuth bootstrap and refresh still used host-local clients.
    Executor-only servers therefore cannot complete discovery or login
    through the same network boundary as the MCP connection.
    
    ## What changed
    
    - adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient`
    contract
    - let RMCP own discovery, dynamic registration, PKCE, token exchange,
    and refresh
    - route auth status, persisted-token startup, and app-server login
    through the server runtime while preserving the existing local discovery
    path
    - add optional `threadId` to `mcpServer/oauth/login` and echo it in the
    completion notification
    - implement RMCP's redirect policy and 1 MiB OAuth response limit over
    executor HTTP
    - cover selected-thread OAuth discovery and login through an
    executor-only route
    
    Depends on #28522.
  • Support HTTP MCP servers from selected executor plugins (#28522)
    ## Why
    
    Selected executor plugins can declare both stdio and Streamable HTTP MCP
    servers, but only stdio registrations were retained. That silently drops
    part of the plugin's tool surface and prevents HTTP traffic from using
    the owning executor's network.
    
    ## What changed
    
    - retain selected-plugin Streamable HTTP MCP declarations alongside
    stdio declarations
    - route their HTTP clients through the owning executor environment
    - preserve local auth-header environment references while rejecting them
    for executor-hosted declarations
    - cover thread isolation, refresh, and an executor-only HTTP route end
    to end
  • Represent MCP authentication with an enum (#29924)
    ## Why
    
    MCP authentication has distinct OAuth and ChatGPT-session flows.
    Representing that choice as `use_chatgpt_auth` makes one flow implicit
    and allows the configuration model to express the distinction only
    through a boolean.
    
    ChatGPT credential forwarding also needs a first-party trust boundary. A
    configurable `chatgpt_base_url` controls routing, but must not grant an
    MCP server permission to receive session credentials.
    
    This change builds on #29733, where the boolean was introduced.
    
    ## What changed
    
    - Replace `use_chatgpt_auth` with an `auth` field backed by the
    exhaustive `McpServerAuth` enum.
    - Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining
    the default.
    - Trust only the origin derived from the existing hardcoded
    `CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
    - Keep configured bearer tokens and authorization headers ahead of the
    selected authentication flow.
    - Update config writers, schema output, fixtures, and integration-test
    setup to use the enum.
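
    The shape of the new field can be sketched with a plain enum and
    `FromStr`. Only the enum name and the `"oauth"` / `"chatgpt"` spellings
    come from the list above; the variant names and `parse_auth` are
    assumptions, and the real code presumably goes through its config
    deserializer instead.

    ```rust
    use std::str::FromStr;

    /// Exhaustive authentication choice; OAuth is the default.
    #[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
    enum McpServerAuth {
        #[default]
        OAuth,
        ChatGpt,
    }

    impl FromStr for McpServerAuth {
        type Err = String;

        fn from_str(value: &str) -> Result<Self, Self::Err> {
            match value {
                "oauth" => Ok(Self::OAuth),
                "chatgpt" => Ok(Self::ChatGpt),
                other => Err(format!("unknown MCP auth mode: {other}")),
            }
        }
    }

    /// Hypothetical helper: a missing `auth` key falls back to the default.
    fn parse_auth(raw: Option<&str>) -> Result<McpServerAuth, String> {
        match raw {
            Some(value) => value.parse(),
            None => Ok(McpServerAuth::default()),
        }
    }
    ```

    Unlike a boolean, an unknown spelling is an error here rather than a
    silent fallback to one of the two flows.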
    
    ## Verification
    
    Integration coverage exercises the complete streamable HTTP startup path
    in two independent configurations:
    
    - A directly constructed MCP configuration verifies that matching an
    overridden `chatgpt_base_url` does not grant ChatGPT auth.
    - A persisted `config.toml` containing an attacker-controlled
    `chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary
    through normal config parsing.
    
    Both tests complete MCP initialization and tool listing and assert that
    the full captured request sequence contains no authorization headers.
    Separate integration coverage verifies that configured authorization
    takes precedence over ChatGPT auth.
  • Allow ChatGPT-hosted MCP servers to use session auth (#29733)
    ## Why
    
    ChatGPT session authentication was inferred from the reserved Codex Apps
    server name. That couples credential routing to Codex Apps-specific
    behavior and prevents other MCP endpoints hosted by ChatGPT from
    explicitly using the current session.
    
    The opt-in also needs a clear security boundary: an arbitrary MCP
    configuration must not be able to redirect ChatGPT credentials to
    another origin.
    
    ## What changed
    
    - Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
    `false`.
    - Honor the setting only when the parsed server URL has the same HTTP(S)
    origin as the configured `chatgpt_base_url`; otherwise remove the
    capability before startup.
    - Resolve bearer tokens and static or environment-backed authorization
    headers before selecting authentication, with configured authorization
    taking precedence over ChatGPT session auth.
    - Enable the setting for the built-in Codex Apps and hosted plugin
    runtime endpoints while keeping Codex Apps caching and tool
    normalization scoped to the reserved server.
    - Persist the setting through MCP config rewrite paths and expose it in
    the generated config schema.
    - Load the current login state for `codex mcp list` so reported auth
    status matches runtime behavior.
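
    The precedence and same-origin rules above can be summarized in a small
    sketch. `Origin`, `SelectedAuth`, and `select_auth` are hypothetical
    names, and URL parsing is left out: the origins are assumed to be
    already parsed.

    ```rust
    /// Scheme, host, and port of an HTTP(S) URL, assumed already parsed.
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct Origin {
        scheme: String,
        host: String,
        port: u16,
    }

    #[derive(Debug, PartialEq, Eq)]
    enum SelectedAuth {
        /// A configured bearer token or authorization header wins.
        ConfiguredAuthorization,
        /// Forward the current ChatGPT session credentials.
        ChatGptSession,
        /// No session credentials; the server's regular flow applies.
        Standard,
    }

    fn select_auth(
        has_configured_authorization: bool,
        use_chatgpt_auth: bool,
        server: &Origin,
        chatgpt_base: &Origin,
    ) -> SelectedAuth {
        if has_configured_authorization {
            return SelectedAuth::ConfiguredAuthorization;
        }
        // The opt-in is honored only for the same origin.
        if use_chatgpt_auth && server == chatgpt_base {
            return SelectedAuth::ChatGptSession;
        }
        SelectedAuth::Standard
    }
    ```

    A different-origin server falls through to `Standard` even when it sets
    `use_chatgpt_auth`, which matches the boundary the PR describes.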
    
    ## Verification
    
    Core integration coverage exercises the complete streamable HTTP MCP
    startup path and verifies that:
    
    - a same-origin opted-in server receives the current ChatGPT access
    token;
    - an explicitly configured authorization header takes precedence;
    - a different-origin server completes MCP initialization and tool
    listing without receiving any ChatGPT authorization header.
  • Add a connector declaration snapshot (#29851)
    ## Why
    
    Connector declarations currently enter Codex through broad plugin
    capability summaries, then MCP setup, turn tooling, and `app/list` each
    reconstruct the same information. That makes executor-selected
    connectors difficult to add without coupling connector behavior to the
    host plugin loader.
    
    This PR introduces a small connector-owned value that later stack layers
    can populate before thread startup.
    
    ## What changed
    
    - Move the pure app-declaration parser into `codex-connectors`,
    preserving declaration order and category cleanup while leaving
    host-side validation and deduplication unchanged.
    - Add an immutable `ConnectorSnapshot` with ordered connector IDs and
    plugin display-name provenance.
    - Adapt the existing local-plugin capability summaries into that
    snapshot at current consumer boundaries.
    - Use the snapshot for MCP tool provenance, turn connector inventory,
    and `app/list`.
    - Keep the crate API narrow: no test-only snapshot accessors are
    exposed.
    
    The externally visible behavior is unchanged. Connector tools still come
    from the orchestrator-owned `/ps/mcp` server, and local plugin
    enablement remains owned by the existing plugin loader.
    
    ## Stack scope
    
    This is the foundation only. It does not read selected executor packages
    or change thread startup. #29852 adds the executor-backed declaration
    reader, and #29856 composes selected declarations into a thread
    snapshot.
  • Keep executor plugin MCP paths URI-native (#29628)
    ## Why
    
    Executor-owned plugin roots are `PathUri`, but MCP config normalization
    still converts them into a native `Path` using the app-server host's
    rules. Relative `cwd` values can therefore resolve against the wrong
    filesystem when host and executor path conventions differ.
    
    This PR keeps executor MCP paths URI-native until the selected
    environment launches the server, while retaining the existing host
    parser behavior.
    
    ## What changed
    
    - Keep one shared MCP normalization path with narrow host-`Path` and
    executor-`PathUri` entrypoints.
    - Preserve native host resolution for locally installed plugin MCP
    configs.
    - For executor configs, default `cwd` to the plugin root and resolve
    relative working directories with the root URI's path convention.
    - Accept explicit executor `file:` URIs only when they remain within the
    selected plugin root.
    - Preserve the selected environment id and existing remote
    environment-variable ownership rules.
    - Route the executor plugin provider through the URI-native entrypoint
    without converting the root on the host.
    - Ensure `codex doctor` does not probe executor-owned stdio commands or
    foreign working directories on the host.
    - Cover foreign Windows roots, relative and absolute executor working
    directories, traversal rejection, runtime resolution, and doctor
    behavior.
    
    ```text
    plugin root:    file:///C:/plugins/demo
    configured cwd: scripts
                      |
                      v
    resolved cwd:  file:///C:/plugins/demo/scripts
                      |
                      v
    launch through the selected executor
    ```
    
    No new provider or filesystem abstraction is introduced.
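
    The resolution shown in the diagram can be approximated with purely
    lexical string handling. This is a simplified sketch, not the `PathUri`
    implementation: `resolve_executor_cwd` is a hypothetical helper, it
    only understands `/`-separated relative paths and `file:` URIs, and it
    returns `None` for anything it does not handle.

    ```rust
    /// Resolves `cwd` against a `file:` root URI without touching the host
    /// filesystem. Returns `None` when the result would leave the root.
    fn resolve_executor_cwd(root: &str, cwd: Option<&str>) -> Option<String> {
        let root = root.trim_end_matches('/');

        // Default: the plugin root itself.
        let Some(cwd) = cwd else {
            return Some(root.to_string());
        };

        // Explicit `file:` URI: accept only if it stays within the root.
        if cwd.starts_with("file:") {
            let cwd = cwd.trim_end_matches('/');
            let inside = cwd == root
                || cwd
                    .strip_prefix(root)
                    .is_some_and(|rest| rest.starts_with('/'));
            let has_traversal = cwd.split('/').any(|segment| segment == "..");
            return (inside && !has_traversal).then(|| cwd.to_string());
        }

        // Other absolute spellings are outside this sketch.
        if cwd.starts_with('/') {
            return None;
        }

        // Relative path: normalize segments; popping past the root fails.
        let mut segments: Vec<&str> = Vec::new();
        for segment in cwd.split('/') {
            match segment {
                "" | "." => {}
                ".." => {
                    segments.pop()?;
                }
                other => segments.push(other),
            }
        }

        if segments.is_empty() {
            Some(root.to_string())
        } else {
            Some(format!("{root}/{}", segments.join("/")))
        }
    }
    ```

    With root `file:///C:/plugins/demo`, a configured `scripts` resolves to
    `file:///C:/plugins/demo/scripts`, and `../other` is rejected.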
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. #28918 — keep selected plugin roots and resources URI-native.
    4. #29626 — load executor skills without host path conversion.
    5. **This PR** — resolve executor MCP working directories without host
    path conversion.
  • [codex] trace MCP startup latency (#28630)
    ## Summary
    
    - add trace-level instrumentation around per-server MCP setup, client
    construction, initialization, and initial tool listing
    - trace Codex Apps tool and server-info cache loads
    - attach `server_name` to server-scoped spans so slow startup work can
    be attributed to a specific MCP server
    
    ## Why
    
    `session_init.mcp_manager_init` can occasionally be slow, but its
    existing coarse span does not identify whether time is spent loading the
    Codex Apps cache, constructing a client, initializing a transport, or
    listing tools. These definition-level spans provide that breakdown
    without changing startup behavior.
    
    ## Validation
    
    - `just test -p codex-mcp` (87 passed)
    - `just test -p codex-rmcp-client` (86 passed, 2 skipped)
  • [codex] Fix stale approval policy in MCP test (#29696)
    ## Summary
    
    - replace the stale `AskForApproval::OnFailure` reference in the MCP
    connection manager test with `AskForApproval::OnRequest`
    - restore `codex-mcp` test compilation after `OnFailure` was removed in
    #28418
    
    ## Root cause
    
    The test was added on main after the approval-policy removal branch had
    already updated the other references, so the newly added call site was
    missed when #28418 merged.
    
    ## Validation
    
    - `just test -p codex-mcp` (90 passed)
    - `just fmt`
  • chore(core) rm AskForApproval::OnFailure (#28418)
    ## Summary
    Deletes the `OnFailure` variant of the `AskForApproval` enum. This
    option has been deprecated since #11631.
    
    ## Testing
    - [x] Tests pass
  • Shut down superseded MCP managers on refresh (#29608)
    ## Summary
    
    MCP refresh replaced the published connection manager without shutting
    down the manager it superseded. If another task retained that old
    manager, its stdio MCP processes stayed alive and accumulated across
    refreshes.
    
    Atomically swap in the refreshed manager, then explicitly shut down the
    exact manager returned by the swap. Add a process-level regression test
    that retains the old manager during refresh and verifies its stdio
    process exits while the replacement remains available.
    
    ## Context
    
    Explicit cleanup was lost when manager publication moved to `ArcSwap`.
    Dropping the old manager is not a reliable shutdown boundary because
    active callers can retain its `Arc` and underlying client process
    handles.
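
    The swap-then-shut-down order can be sketched as follows. `Manager` and
    `Published` are hypothetical stand-ins, a `Mutex` replaces the
    `ArcSwap` mentioned above to keep the example dependency-free, and an
    atomic flag stands in for terminating stdio processes.

    ```rust
    use std::sync::atomic::{AtomicBool, Ordering};
    use std::sync::{Arc, Mutex};

    /// Stand-in for a connection manager that owns child processes.
    #[derive(Default)]
    struct Manager {
        shut_down: AtomicBool,
    }

    impl Manager {
        fn shutdown(&self) {
            self.shut_down.store(true, Ordering::SeqCst);
        }

        fn is_shut_down(&self) -> bool {
            self.shut_down.load(Ordering::SeqCst)
        }
    }

    /// Holds the currently published manager.
    struct Published {
        current: Mutex<Arc<Manager>>,
    }

    impl Published {
        fn refresh(&self, next: Arc<Manager>) {
            // Swap first, so new callers only ever see the replacement.
            let previous = {
                let mut slot = self.current.lock().unwrap();
                std::mem::replace(&mut *slot, next)
            };
            // Shut down the exact manager returned by the swap. Relying on
            // drop would not work while another task still holds its `Arc`.
            previous.shutdown();
        }
    }
    ```

    A task that retained the old `Arc` sees it shut down after `refresh`,
    while the newly published manager is untouched.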
  • Update rmcp to 1.8.0 (#29634)
    ## Summary
    
    - Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0.
    - Adapt to the new shared `peer_info` return type.
    - Box OAuth status discovery at the MCP boundary to keep the expanded
    future type from overflowing Rust's trait recursion limit.
    
    This brings in custom OAuth HTTP client support from
    [modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).
  • Fix Codex Apps auth elicitation hang (#29615)
    ## Summary
    - Require the reserved Codex Apps MCP server name to be present in the
    connection manager before treating it as host-owned.
    - Update auth elicitation tests to model an installed host-owned Codex
    Apps server without sending startup events to the test session.
    
    ## Why
    PR #29518 replaced the old host-owned flag with a name-only check. That
    made non-host-owned tests with the reserved `codex_apps` name enter auth
    elicitation and wait forever for a response.
  • Allow codex sandbox to consume MCP sandbox state (#29358)
    ## Summary
    
    - let `codex sandbox` accept the JSON value from
    `codex/sandbox-state-meta`
    - require the payload `permissionProfile` instead of falling back to
    ambient permissions
    - reuse the existing macOS, Linux, and Windows launch paths, treating
    external sandbox state conservatively as read-only
    - let opaque forwarders add runtime read roots and disable direct
    network access without decoding the payload
    
    Builds on #29113, which is now on `main`.
    
    ## Tests
    
    - `just test -p codex-cli debug_sandbox::tests`
    - `cargo build -p codex-rmcp-client --bin test_stdio_server`
    - `just test -p codex-core
    stdio_mcp_tool_call_includes_sandbox_state_meta`
    - `just test -p codex-mcp`
    - `just fmt`
  • Group Codex Apps client setup (#29583)
    ## Why
    
    `McpConnectionManager::new` classified the Codex Apps server twice: once
    to create its tools cache context and again to select its runtime
    authentication provider. Keeping those decisions separate makes it
    harder to see that they belong to the same server-specific setup path.
    
    ## What changed
    
    - Group Codex Apps cache and authentication setup under one explicit
    branch.
    - Keep regular MCP server setup in the corresponding `else` branch.
    - Limit environment bearer-token inspection to the Codex Apps path where
    it affects runtime authentication.
  • Remove redundant Codex Apps cache guard (#29575)
    ## Why
    
    Codex Apps cache writes are already restricted to Codex Apps call paths:
    startup invokes the helper only from the Codex Apps branch, and hard
    refresh operates on the reserved Codex Apps server directly. Rechecking
    the server name inside the cache helper duplicates that classification
    and leaves the helper with an argument that cannot change valid
    behavior.
    
    ## What changed
    
    - Remove the redundant server-name check and parameter from the cache
    writer.
    - Rename the helper to `write_codex_apps_tools_cache` to reflect its
    narrower contract.
    - Update production and test callsites to use the simplified API.
  • Centralize Codex Apps client handling (#29528)
    ## Why
    
    Codex Apps-specific behavior is currently distributed across cache
    helpers, startup, tool conversion, and model-visible annotation. Each
    layer independently checks the reserved server name, which obscures the
    boundary between trusted host-owned connector metadata and regular MCP
    server data.
    
    Classifying the server once when `AsyncManagedClient` is created gives
    the client a single source of truth and makes the two processing paths
    explicit.
    
    ## What changed
    
    - Record whether an `AsyncManagedClient` represents the Codex Apps
    server at construction time.
    - Route startup cache loading, cache persistence, and cache telemetry
    through the Codex Apps branch.
    - Split uncached tool conversion between Codex Apps normalization and
    regular MCP metadata sanitization.
    - Split model-visible schema and plugin provenance handling along the
    same boundary.
    - Remove redundant server-name guards from helpers that are now called
    only from the Codex Apps branch.
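
    The classify-once idea can be sketched as follows. This is a minimal
    illustration, not the actual `AsyncManagedClient` code: `ServerKind`,
    `ManagedClient`, and `tool_conversion_path` are hypothetical names, and
    the `codex_apps` reserved name is taken from other entries in this log.

    ```rust
    /// Hypothetical classification recorded when the client is created.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ServerKind {
        CodexApps,
        Regular,
    }

    /// Assumed reserved server name, used here only for illustration.
    const CODEX_APPS_SERVER_NAME: &str = "codex_apps";

    pub struct ManagedClient {
        server_name: String,
        kind: ServerKind,
    }

    impl ManagedClient {
        /// The reserved-name comparison happens here and nowhere else.
        pub fn new(server_name: &str) -> Self {
            let kind = if server_name == CODEX_APPS_SERVER_NAME {
                ServerKind::CodexApps
            } else {
                ServerKind::Regular
            };
            Self { server_name: server_name.to_string(), kind }
        }

        pub fn server_name(&self) -> &str {
            &self.server_name
        }

        pub fn kind(&self) -> ServerKind {
            self.kind
        }

        /// Later steps branch on the stored kind instead of re-checking
        /// the server name.
        pub fn tool_conversion_path(&self) -> &'static str {
            match self.kind {
                ServerKind::CodexApps => "codex-apps normalization",
                ServerKind::Regular => "regular MCP metadata sanitization",
            }
        }
    }
    ```

    With this shape, helpers reached only through the `CodexApps` arm need
    no server-name guard of their own.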
    
    ## Verification
    
    - Preserve behavioral coverage that verifies Codex Apps connector
    metadata and the complete converted `ToolInfo` shape.
    
    ## Stack
    
    Depends on #29518.
  • Remove redundant Codex Apps manager flag (#29518)
    ## Why
    
    Codex Apps server admission is already decided before
    `McpConnectionManager` is constructed. `effective_mcp_servers` and
    `effective_mcp_servers_from_configured` remove the server when the apps
    feature or required authentication is unavailable, so storing the same
    decision on the manager duplicates state that can drift from the
    effective server map.
    
    ## What changed
    
    - Remove `host_owned_codex_apps_enabled` from `McpConnectionManager` and
    its constructor.
    - Identify the host-owned Codex Apps server by its reserved server name
    once it is present in the effective server map.
    - Remove the now-unused flag calculations and constructor arguments from
    production and test callsites.
  • mcp: accept foreign absolute cwd for remote stdio (#29493)
    ## Why
    
    Remote stdio MCP servers can run in an environment whose path convention
    differs from the Codex host. A Windows cwd such as
    `C:\Users\openai\share` is absolute for the executor but was rejected by
    a POSIX orchestrator.
    
    Built on #29501, now merged, which only clarifies the host-native
    `PathUri` constructor name.
    
    ## What changed
    
    - Deserialize MCP cwd values as `LegacyAppPathString` so config does not
    apply host path rules.
    - Interpret that spelling as host-native for local launches and convert
    it to `PathUri` at executor launch.
    - Skip host filesystem and command resolution checks for remote stdio in
    `codex doctor`.
    - Add host-independent config and executor-boundary coverage using the
    foreign path convention for each test platform.
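
    As a rough illustration of why host path rules cannot be applied to a
    remote cwd, the sketch below checks absoluteness under both conventions
    without consulting the host. The function names are hypothetical and
    this does not reproduce `LegacyAppPathString` or `PathUri`.

    ```rust
    /// POSIX convention: an absolute path starts with `/`.
    pub fn is_posix_absolute(path: &str) -> bool {
        path.starts_with('/')
    }

    /// Windows convention (simplified): a drive letter followed by `:` and
    /// a separator, or a UNC prefix such as `\\server\share`.
    pub fn is_windows_absolute(path: &str) -> bool {
        let bytes = path.as_bytes();
        let drive_rooted = bytes.len() >= 3
            && bytes[0].is_ascii_alphabetic()
            && bytes[1] == b':'
            && (bytes[2] == b'\\' || bytes[2] == b'/');
        let unc = path.starts_with(r"\\");
        drive_rooted || unc
    }

    /// Hypothetical check for a remote stdio cwd: accept a path that is
    /// absolute under either convention, whatever the host platform is.
    pub fn is_absolute_for_remote(path: &str) -> bool {
        is_posix_absolute(path) || is_windows_absolute(path)
    }
    ```

    A POSIX host using only its native rules would treat
    `C:\Users\openai\share` as relative, which is the rejection described
    above.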
    
    ## Validation
    
    - `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p
    codex-rmcp-client` (408 passed)
    - `just test -p codex-cli -p codex-rmcp-client` (372 passed)
    - `cargo check --workspace --tests`
    - `just test` (11,311 passed; 43 unrelated environment/timing failures)
    - `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p
    codex-mcp-extension -p codex-rmcp-client -p codex-tui`
  • Add config toggles for orchestrator skills and MCP (#28942)
    ## Why
    
    Orchestrator-provided skills and Codex Apps MCP tools add model-visible
    instructions, resources, and tools beyond the local workspace. Hosts
    need config-level switches to disable those orchestrator-owned surfaces
    independently, without disabling regular skills or regular MCP servers.
    
    ## What changed
    
    - Adds `[orchestrator.skills].enabled` and `[orchestrator.mcp].enabled`
    config entries, both defaulting to `true`.
    - Includes the new settings in `config.schema.json` and in the config
    lock so resolved thread configuration preserves the same orchestrator
    exposure decisions.
    - Threads `orchestrator.skills.enabled` through the app-server skills
    extension so disabled orchestrator skills do not expose the `skills`
    namespace or inject orchestrator skill context.
    - Gates Codex Apps MCP exposure, app instructions, and app auth
    eligibility on `orchestrator.mcp.enabled` while leaving non-Codex-Apps
    MCP tools available.
    - Updates the thread-manager sample config to disable both
    orchestrator-owned surfaces.
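
    The defaulting and gating behavior can be sketched without any config
    library. `Toggle`, `OrchestratorConfig`, and `visible_servers` are
    hypothetical names; only the default of `true` and the `codex_apps`
    server name come from this log.

    ```rust
    /// One `enabled` switch that defaults to `true`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Toggle {
        pub enabled: bool,
    }

    impl Default for Toggle {
        fn default() -> Self {
            Self { enabled: true }
        }
    }

    /// Hypothetical mirror of `[orchestrator.skills]` and
    /// `[orchestrator.mcp]`.
    #[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
    pub struct OrchestratorConfig {
        pub skills: Toggle,
        pub mcp: Toggle,
    }

    /// Drops the Codex Apps server when `orchestrator.mcp.enabled` is
    /// false and leaves every other MCP server untouched.
    pub fn visible_servers<'a>(
        config: &OrchestratorConfig,
        servers: &[&'a str],
    ) -> Vec<&'a str> {
        servers
            .iter()
            .copied()
            .filter(|name| config.mcp.enabled || *name != "codex_apps")
            .collect()
    }
    ```

    The two switches are independent fields, so disabling one surface does
    not affect the other.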
    
    ## Verification
    
    - Added config parsing, loading, defaulting, and schema coverage for the
    new settings.
    - Added MCP exposure coverage that `orchestrator.mcp.enabled = false`
    removes Codex Apps tools while preserving regular MCP tools.
    - Added app-server coverage that `orchestrator.skills.enabled = false`
    prevents orchestrator skill tools, prompts, and resource reads from
    reaching the model turn.
  • [codex] Remove hardcoded app ID filters (#28947)
    ## Summary
    
    - remove the duplicated originator-specific connector ID denylists
    - stop filtering connector directory/accessibility results and
    live/cached Codex Apps MCP tools by hardcoded connector ID
    - remove the now-unused `codex-login` dependency from
    `codex-utils-plugins`
    - update regression coverage so formerly blocked connector IDs are
    preserved
    
    ## Why
    
    The client-side policy was duplicated across crates, used opaque IDs
    without ownership or expiry information, and could drift between app
    listing and MCP tool behavior. Server-provided visibility,
    authorization, plugin discoverability, accessibility, enabled-state
    handling, and consequential-tool approval templates remain unchanged.
    
    ## Validation
    
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `git diff --check`
    - confirmed the final diff contains no hardcoded denylist symbols
    
    A targeted `codex-mcp` test build spent an unusually long time in local
    compilation/linking. Its first attempt exposed a test-only `PartialEq`
    assertion issue, which was corrected. A follow-up non-linking `cargo
    check -p codex-mcp --tests` was still running when this draft was
    opened; CI should provide the complete Rust validation.
  • Support openai/form extended form elicitations (#27500)
    # Summary
    Allow App Server clients to opt into `openai/form` MCP elicitations.
  • Scope MCP sandbox metadata to server environment (#28914)
    Scope MCP sandbox metadata to the MCP server's owning environment.
    
    Previously, `codex/sandbox-state-meta` always used the turn's primary
    cwd and rebuilt a legacy sandbox policy from that cwd. That can be wrong
    for MCP servers owned by a different execution environment.
    
    This now sends the owning environment cwd as a `file:` URI in
    `sandboxCwd`, keeps `permissionProfile` as the permission source of
    truth, and omits sandbox-state metadata when a non-default server
    environment is not selected for the turn. Local/default MCP servers keep
    the existing fallback cwd behavior.
    
    Tests:
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just test -p codex-mcp`
    - `just test -p codex-core mcp_sandbox_cwd`
    - `cargo build -p codex-rmcp-client --bin test_stdio_server`
    - `just test -p codex-core
    stdio_mcp_tool_call_includes_sandbox_state_meta`
  • [mcp] Increase default tool timeout to 300 seconds (#28234)
    Summary
    - Increase the default MCP tool-call timeout from 120 to 300 seconds.
    
    Validation
    - `just test -p codex-mcp`
    - `just fmt`
  • skills: cache orchestrator resources per thread (#28336)
    ## Why
    
    Hosted orchestrator skills are read through the remote MCP resource
    server. Within one thread, the same catalog or skill resource can be
    requested multiple times by prompt injection and the `skills.list` /
    `skills.read` tools. Re-fetching adds latency and can make those
    surfaces observe different remote contents during the same thread.
    
    This is a follow-up to #28333: orchestrator skills remain limited to
    threads without a local executor, and those threads now get a stable
    per-thread view of the remote skill data they use.
    
    ## What changed
    
    - Reuse the existing per-thread orchestrator catalog snapshot for
    `skills.list` and `skills.read` availability checks.
    - Cache successful orchestrator resource reads by authority, package,
    and resource so prompt injection and tool calls share the same contents.
    - Keep the cache memory-only and bounded to 100 resources and 8 MiB per
    thread.
    - Leave host and executor skill reads unchanged, and do not cache failed
    remote reads.
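
    A minimal sketch of such a cache, assuming that a read which would
    exceed either bound is simply returned without being stored (the PR
    does not say how overflow is handled). All type and function names are
    hypothetical; only the key fields and the two limits come from the
    description above.

    ```rust
    use std::collections::HashMap;

    const MAX_RESOURCES: usize = 100;
    const MAX_BYTES: usize = 8 * 1024 * 1024; // 8 MiB

    #[derive(Debug, Clone, PartialEq, Eq, Hash)]
    pub struct ResourceKey {
        pub authority: String,
        pub package: String,
        pub resource: String,
    }

    /// Memory-only cache owned by a single thread.
    #[derive(Default)]
    pub struct ThreadResourceCache {
        entries: HashMap<ResourceKey, String>,
        total_bytes: usize,
    }

    impl ThreadResourceCache {
        /// Returns cached contents, or runs `fetch` and caches a success.
        pub fn get_or_fetch<F>(
            &mut self,
            key: ResourceKey,
            fetch: F,
        ) -> Result<String, String>
        where
            F: FnOnce() -> Result<String, String>,
        {
            if let Some(hit) = self.entries.get(&key) {
                return Ok(hit.clone());
            }
            // A failed read is returned to the caller and never cached.
            let contents = fetch()?;
            let fits = self.entries.len() < MAX_RESOURCES
                && self.total_bytes + contents.len() <= MAX_BYTES;
            if fits {
                self.total_bytes += contents.len();
                self.entries.insert(key, contents.clone());
            }
            Ok(contents)
        }

        pub fn len(&self) -> usize {
            self.entries.len()
        }
    }
    ```

    Prompt injection and the `skills.read` tool would both go through
    `get_or_fetch`, so the second caller sees the contents the first one
    fetched.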
    
    ## Verification
    
    - Extended the app-server MCP resource integration test to read the same
    hosted skill resource twice and verify that the remote server receives
    one read.
    - The same test verifies that catalog discovery and the selected skill's
    main prompt are each fetched only once per thread.
  • Add selected-plugin precedence and attribution to the MCP catalog (#27884)
    ## Why
    
    **In short:** this PR resolves already-discovered MCP registrations. It
    does not read selected plugins or discover their MCP servers.
    
    The resolved MCP catalog currently builds config and auto-discovered
    plugin registrations before runtime contributors are applied. A
    thread-selected plugin needs a distinct precedence tier in that same
    initial resolution pass: otherwise a disabled lower-precedence winner
    can leave stale name-level state behind, and the winning MCP tools
    cannot be attributed to the selected package reliably.
    
    This PR adds that catalog boundary before executor discovery is
    connected.
    
    ## What changed
    
    - Added an explicit selected-plugin registration tier between
    auto-discovered plugins and explicit config.
    - Collected selected-plugin contributions before the initial catalog
    build, while leaving compatibility and generic extension overlays in
    their existing runtime phase.
    - Retained the winning plugin ID and display name directly on
    plugin-owned catalog registrations.
    - Derived MCP tool provenance from the winning catalog entry instead of
    joining against local-only plugin summaries.
    - Retained the winning selected server's tool approval policy in the
    running connection manager, so a selected registration cannot inherit
    approval behavior from a losing local plugin.
    - Kept remembered approval session-scoped for selected plugins until
    there is an authority-aware persistence contract; Codex will not write
    approval back to an unrelated local plugin.
    - Preserved existing name-level disabled vetoes for discovered plugins
    and config, while keeping a selected package's own disabled registration
    scoped to that registration.
    - Preserved deterministic selection order and existing config,
    compatibility, and extension precedence.
    
    The resulting order is:
    
    ```text
    auto-discovered plugin
      < selected plugin
      < explicit config
      < compatibility registration
      < extension overlay
    ```
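
    One way to encode that order is an enum whose derived `Ord` follows
    declaration order. This is a sketch under assumptions: `Tier`,
    `Registration`, and `winner` are hypothetical names, and disabled
    vetoes and same-tier conflicts are not modelled.

    ```rust
    /// Variants are declared from lowest to highest precedence, so the
    /// derived ordering matches the list above.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    pub enum Tier {
        AutoDiscoveredPlugin,
        SelectedPlugin,
        ExplicitConfig,
        CompatibilityRegistration,
        ExtensionOverlay,
    }

    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Registration {
        pub server_name: String,
        pub tier: Tier,
        /// Retained on the entry so tool provenance can be read from the
        /// winner directly.
        pub plugin_id: Option<String>,
    }

    /// Picks the highest-tier registration among candidates that share a
    /// server name.
    pub fn winner(candidates: &[Registration]) -> Option<&Registration> {
        candidates.iter().max_by_key(|r| r.tier)
    }
    ```

    Because the plugin ID travels with the registration, attribution for
    the winning server needs no join against local plugin summaries.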
    
    ## Behavior and scope
    
    This is a catalog and provenance change only. No production host
    contributes selected-plugin MCP registrations yet, so existing local MCP
    behavior remains unchanged.
    
    The stacked follow-up, #27870, installs the executor plugin provider
    that produces these registrations. App-server activation remains a
    separate final step.
    
    ## Verification
    
    Focused tests cover precedence, deterministic selected-plugin conflicts,
    disabled-veto behavior across catalog phases, managed requirements
    before selected-plugin resolution, winning-server approval policy, and
    attribution when local and selected packages share an ID or server name.
    CI owns execution of the test suite.
  • feat: use encrypted local secrets for MCP OAuth (#27541)
    ## Summary
    
    - store MCP OAuth credentials in the configured auth credential backend
    - support encrypted-local OAuth storage, including legacy keyring
    migration
    - propagate the credential backend through MCP refresh, session, CLI,
    and app-server paths
    
    ## Stack
    
    1. #27504 — config and feature flag
    2. #27535 — auth-specific secret namespaces
    3. #27539 — encrypted CLI auth storage
    4. this PR — encrypted MCP OAuth storage
    
    This is a parallel review stack; the original #17931 remains unchanged.
    
    ## Tests
    
    - `just test -p codex-rmcp-client` (the transport round-trip test passed
    after building the required `codex` binary and retrying)
    - `just test -p codex-mcp`
    - `just test -p codex-app-server
    refresh_config_uses_latest_auth_keyring_backend`
    - `just test -p codex-core
    refresh_mcp_servers_is_deferred_until_next_turn`
    - `just test -p codex-cli mcp`
    - `just fix -p codex-rmcp-client -p codex-mcp -p codex-core -p codex-cli
    -p codex-app-server -p codex-protocol`
    - `just bazel-lock-check`
  • Extract shared plugin MCP config parsing (#27863)
    ## Why
    
    We want a thread-selected plugin to eventually expose stdio MCP servers
    that run on the executor owning that plugin.
    
    The existing plugin MCP parser lived inside `core-plugins` and was
    coupled to the host filesystem loader. Reusing it from an executor
    provider would either duplicate MCP normalization or make the plugin
    package layer own MCP runtime semantics. This PR creates the shared
    MCP-owned boundary first.
    
    In simple terms:
    
    ```text
    plugin .mcp.json
            |
            v
    shared parser in codex-mcp
            |
            +-- Declared placement: preserve current local-plugin behavior
            |
            +-- Environment placement: produce config bound to one executor
    ```
    
    This builds on the authority-bound plugin descriptors from #27692. It
    intentionally does not discover, register, or launch executor MCP
    servers yet.
    
    ## What changed
    
    - Moved plugin MCP file parsing and normalization from `core-plugins`
    into `codex-mcp`.
    - Kept support for both existing file shapes: a top-level server map and
    an object containing `mcpServers`.
    - Kept per-server failure isolation: one invalid server does not discard
    valid siblings, while malformed top-level JSON still fails the whole
    file.
    - Updated the existing local plugin loader to use `Declared` placement,
    preserving its current transport, OAuth, relative `cwd`, and error
    behavior.
    - Added `Environment` placement for the next stacked PR:
      - the selected environment ID overrides anything declared by the plugin;
      - missing stdio `cwd` defaults to the plugin root;
      - relative `cwd` is resolved beneath the plugin root and cannot traverse
      outside it;
      - bare or source-less environment-variable references resolve on a
      non-local executor;
      - explicit orchestrator environment-variable forwarding is rejected for
      executor-owned plugins.
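
    The `cwd` rules for `Environment` placement can be sketched with a
    purely lexical resolver. `resolve_plugin_cwd` is a hypothetical name,
    and the sketch uses host-native `std::path` for brevity, which the real
    parser may not do for executor paths.

    ```rust
    use std::path::{Component, Path, PathBuf};

    /// Resolves a declared stdio `cwd` against the plugin root.
    pub fn resolve_plugin_cwd(
        plugin_root: &Path,
        declared: Option<&Path>,
    ) -> Result<PathBuf, String> {
        let cwd = match declared {
            // A missing cwd defaults to the plugin root.
            None => return Ok(plugin_root.to_path_buf()),
            Some(cwd) => cwd,
        };
        // Explicit absolute directories are kept as declared.
        if cwd.is_absolute() {
            return Ok(cwd.to_path_buf());
        }
        let mut resolved = plugin_root.to_path_buf();
        let mut depth = 0usize; // components pushed below the root
        for component in cwd.components() {
            match component {
                Component::CurDir => {}
                Component::Normal(part) => {
                    resolved.push(part);
                    depth += 1;
                }
                Component::ParentDir => {
                    if depth == 0 {
                        return Err(format!(
                            "cwd {} escapes the plugin root",
                            cwd.display()
                        ));
                    }
                    resolved.pop();
                    depth -= 1;
                }
                Component::RootDir | Component::Prefix(_) => {
                    return Err(format!(
                        "cwd {} is not a plain relative path",
                        cwd.display()
                    ));
                }
            }
        }
        Ok(resolved)
    }
    ```

    Tracking `depth` rather than comparing prefixes afterwards means a
    path such as `a/../../x` is rejected at the step where it would leave
    the root.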
    
    ## User impact
    
    None in this PR. Existing local plugin MCP loading follows the same
    behavior through the shared parser. The executor placement mode is not
    connected to thread startup until the follow-up registration PR.
    
    ## Assumptions
    
    - A selected capability root's environment is authoritative. A plugin
    cannot redirect its stdio process to the orchestrator or another
    executor.
    - Relative working directories belong under the plugin package root.
    Explicit absolute working directories remain valid within the owning
    environment.
    - For a non-local executor, unqualified environment-variable names refer
    to that executor. Reading an orchestrator variable requires an explicit
    contract and is rejected for now.
    - Parsing only produces normalized `McpServerConfig` values. Process
    startup remains owned by the existing MCP runtime and connection
    manager.
    
    ## Follow-ups
    
    1. Add the executor MCP provider and catalog registration: read the
    selected plugin's MCP config through the same executor filesystem,
    support stdio only, freeze the result per active thread, apply managed
    policy, and resolve name collisions as discovered plugin < selected
    plugin < explicit config.
    2. Install that provider in app-server and add an end-to-end test
    proving `thread/start.selectedCapabilityRoots` launches and calls the
    MCP tool on the selected executor, preserves the frozen registration
    across refresh, and does not expose it to an unselected thread.
    3. After the initial executor-stdio vertical, define
    resume/fork/environment-replacement semantics, executor HTTP placement,
    warning delivery, common MCP tool-context bounds, and move remaining MCP
    source composition above core.
    
    ## Verification
    
    - `cargo check -p codex-mcp -p codex-core-plugins --tests`
    - `just bazel-lock-check`
    - Added focused parser coverage for legacy local normalization, executor
    authority, working-directory handling, and environment-variable
    sourcing.
  • Resolve MCP server registrations through a catalog (#27634)
    ## Why
    
    MCP servers currently come from user config, local plugins,
    compatibility Apps synthesis, and host extensions. Those sources were
    composed by mutating a shared map, leaving registration identity,
    precedence, removal, and provenance implicit in assembly order.
    
    Before adding executor-owned MCPs, Codex needs one durable resolution
    boundary above `McpConnectionManager`. This PR introduces that boundary
    while preserving current server configuration, policy, and runtime
    behavior. Executor-scoped registrations and explicit policy layers
    remain follow-ups.
    
    ## What changed
    
    - Add typed `McpServerRegistration` inputs and an immutable
    `ResolvedMcpCatalog` in `codex-mcp`.
    - Retain each registration's complete `McpServerConfig`, including its
    environment binding, while recording its source and provenance.
    - Preserve the existing structural precedence between plugin, config,
    compatibility, and ordered extension sources.
    - Resolve equal-precedence actions by contribution order; provenance IDs
    are used only for diagnostics and cannot affect the winner.
    - Preserve extension removals and the existing name-scoped `enabled =
    false` veto.
    - Report same-tier conflicts with every contender and the final catalog
    outcome, including whether the winning action registers or removes the
    server.
    - Require MCP contributors to provide a stable diagnostic identity.
    - Derive materialized server maps and plugin ownership from the resolved
    catalog.
    
    `McpConnectionManager`, transport startup, tool calls, and resource
    routing continue to consume the same effective `McpServerConfig` values.
    
    ## Scope
    
    This PR does not add new MCP capabilities or change user-visible
    behavior. It does not add executor plugin discovery, thread-scoped
    registrations, dynamic refresh generations, or new user/managed policy
    semantics.
    
    ## Verification
    
    - Added focused catalog coverage for source precedence, complete
    configuration preservation, disabled vetoes, plugin ownership,
    contribution-order tie breaking, removal outcomes, and conflict
    diagnostics.
    - Extended hosted Apps coverage for ordered extension removal and
    Apps-disabled hosts with and without the hosted extension installed.
    - `cargo check -p codex-mcp --tests -p codex-extension-api -p
    codex-core`
  • skills: make backend plugin skills invocable without an executor (#27387)
    ## Why
    
    #27198 made the extension-owned `codex_apps` MCP connection the hosted
    plugin runtime, but its `mcp/skill` resources still bypassed the skills
    extension. App-server could list and read those resources through
    generic MCP APIs, but a thread with no selected environment did not
    expose them in the model's skills catalog or load their `SKILL.md`
    through `$skill`.
    
    Hosted skills should stay remote while using the same typed catalog,
    source authority, deduplication, bounded contextual catalog, and
    selected-skill prompt injection as host and executor skills. They should
    not be downloaded or exposed as ambient filesystem paths.
    
    ## What changed
    
    - Add a session-scoped `McpResourceClient` over the replaceable MCP
    connection manager so resource list/read calls follow startup and
    refresh replacements.
    - Add a `BackendSkillProvider` that pages `codex_apps` resources,
    accepts bounded and validated `mcp/skill` entries, and reads a selected
    skill's `SKILL.md` through the same MCP connection.
    - Register the remote provider in app-server and include it in the
    skills catalog even when a thread has no selected capability roots or
    executor.
    - Contribute hosted skill metadata through the bounded
    `AvailableSkillsInstructions` developer-context path, exclude remote
    entries from per-turn catalog injection, and classify `<skills>`
    messages as contextual developer content so rollback can trim and
    rebuild them correctly.
    
    ## Testing
    
    - Extend the app-server MCP resource integration test with
    `environments: []` to exercise two-page discovery, filter a
    non-`mcp/skill` resource, verify the escaped developer catalog entry and
    user-role `<skill>` fragment containing the fetched `SKILL.md`, and
    preserve generic MCP resource reads.
    - Add core event-mapping coverage that classifies `<skills>` developer
    messages as contextual history.
  • Use latest-wins MCP manager replacement (#27259)
    ## Summary
    
    We originally addressed startup prewarming holding the read side of
    `RwLock<McpConnectionManager>` by snapshotting tool-list state. Review
    feedback identified the broader ownership problem: the outer
    synchronization should only publish or retrieve the current manager,
    while MCP operations rely on the manager's internal synchronization. A
    follow-up preserved operation retirement with a separate gate, but
    further review questioned whether that synchronization was actually
    required and whether we could support latest-wins replacement instead.
    
    This PR now stores the current MCP manager in `ArcSwap`. Each operation
    uses `load_full()` to obtain an owned `Arc<McpConnectionManager>`, then
    performs MCP I/O without retaining the publication mechanism. Refresh
    cancels obsolete startup work, constructs a replacement, and atomically
    publishes it. New operations see the latest manager, while operations
    that already loaded the previous manager retain a valid handle. Refresh
    happens at a turn boundary, so there should be no active user tool calls
    to drain.
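
    The ownership pattern can be illustrated with the standard library
    alone. The sketch below is a stand-in, not the PR's code: it uses
    `Mutex<Arc<_>>` where the PR uses `ArcSwap`, and `Manager`,
    `CurrentManager`, and `publish` are hypothetical names. The point it
    shows is that the lock is held only long enough to clone or replace
    the `Arc`, never across MCP I/O.

    ```rust
    use std::sync::{Arc, Mutex};

    #[derive(Debug)]
    pub struct Manager {
        pub generation: u32,
    }

    /// Holds the currently published manager.
    pub struct CurrentManager {
        slot: Mutex<Arc<Manager>>,
    }

    impl CurrentManager {
        pub fn new(initial: Manager) -> Self {
            Self { slot: Mutex::new(Arc::new(initial)) }
        }

        /// Analogue of `load_full()`: returns an owned handle and
        /// releases the lock before the caller does any work.
        pub fn load_full(&self) -> Arc<Manager> {
            let guard = self.slot.lock().expect("manager slot poisoned");
            Arc::clone(&*guard)
        }

        /// Latest-wins replacement: new loads see `replacement`, while
        /// handles loaded earlier stay valid until they are dropped.
        pub fn publish(&self, replacement: Manager) {
            let mut guard = self.slot.lock().expect("manager slot poisoned");
            *guard = Arc::new(replacement);
        }
    }
    ```

    An operation that called `load_full()` before a refresh keeps using
    the previous manager, which is dropped once the last such handle goes
    away.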
    
    Git history supports dropping the outer `RwLock`. It was introduced in
    `03ffe4d595` on November 17, 2025 for non-blocking MCP startup: the
    session published an empty manager, startup initialized that same object
    while holding the write lock, and readers waited for initialization.
    `7cd2e84026` on February 19, 2026 removed that two-phase initialization
    in favor of constructing a fresh manager and swapping it in, explicitly
    noting that `Option` or `OnceCell` could replace the placeholder design.
    Hot reload later reused the existing lock to publish a replacement, but
    I found no indication that the lock was introduced to guarantee
    in-flight tool calls finish before refresh or shutdown.
    
    Terminal shutdown remains separate from refresh: it aborts startup
    prewarming and active tasks before shutting down the current manager, so
    tool calls may be interrupted and no model WebSocket work continues
    after shutdown. Focused regression coverage exercises pending tool-list
    cancellation, deferred refresh, and startup-prewarm shutdown.
  • Use plugin-service MCP as the hosted plugin runtime (#27198)
    ## Stack
    
    - Base: #27191
    - This PR is the third vertical and should be reviewed against
    `jif/external-plugins-2`, not `main`.
    
    ## Why
    
    #27191 moves the host-owned Apps MCP registration behind an extension
    contributor, but deliberately preserves the existing endpoint-selection
    feature while that contribution contract lands. App-server can therefore
    resolve the server through extensions, yet the hosted plugin endpoint is
    still selected through temporary `apps_mcp_path_override` plumbing.
    
    That is not the long-term plugin model. A plugin can bundle skills,
    connectors, MCP servers, and hooks, and those components do not all need
    the same source or execution environment. In particular, an
    authenticated HTTP MCP server can expose plugin capabilities directly
    from a backend without an executor or an orchestrator filesystem.
    
    This PR completes that hosted vertical. App-server's MCP extension now
    owns the aggregate hosted plugin runtime at `/ps/mcp`. Connector actions
    continue to arrive as MCP tools, while backend-provided skills arrive as
    MCP resources and use Codex's existing resource list/read paths. No
    second backend client, skill filesystem, or generic plugin activation
    framework is introduced.
    
    The backend route remains the hosted implementation. This change
    replaces Codex's temporary endpoint-selection mechanism, not the service
    behind the endpoint.
    
    ## What changed
    
    ### Hosted plugin runtime
    
    The MCP extension now contributes `codex_apps` as the hosted plugin
    runtime rather than as a configurable Apps endpoint:
    
    - `https://chatgpt.com` resolves to
    `https://chatgpt.com/backend-api/ps/mcp`;
    - a bare custom ChatGPT base resolves to `/api/codex/ps/mcp`;
    - the existing product-SKU header and ChatGPT authentication behavior
    are preserved;
    - executor availability is never consulted for this streamable HTTP
    transport.
    
    The same MCP connection carries both component shapes supported by the
    hosted endpoint:
    
    - connector actions are discovered and invoked as MCP tools;
    - hosted skills are enumerated and read as MCP resources through the
    existing `list_mcp_resources` and `read_mcp_resource` paths.
    
    This keeps component access in the subsystem that already owns the
    protocol instead of downloading backend skills into an orchestrator
    filesystem or inventing a parallel hosted-skill client.
    
    ### Explicit runtime ordering
    
    `McpManager` now resolves the reserved `codex_apps` entry in three
    ordered phases:
    
    1. install the legacy Apps fallback for compatibility;
    2. apply ordered extension `Set` or `Remove` overlays;
    3. apply the final ChatGPT-auth gate without synthesizing the server
    again.
    
    This ordering is important:
    
    - an ordinary configured or plugin MCP server cannot claim the
    auth-bearing `codex_apps` name;
    - an extension-contributed hosted runtime wins over the fallback;
    - an extension `Remove` remains authoritative;
    - a host without the MCP extension retains the legacy Apps endpoint and
    current local-only behavior.
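
    The three phases can be sketched as a single function. This is an
    illustration under assumptions: `Overlay` and `resolve_codex_apps` are
    hypothetical names, the fallback is assumed to be installed
    unconditionally, overlays are assumed to apply in order with later
    ones overriding earlier ones, and the final gate models only ChatGPT
    authentication.

    ```rust
    /// An extension's action on the reserved `codex_apps` entry.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum Overlay {
        Set(String),
        Remove,
    }

    /// Returns the effective endpoint for `codex_apps`, if any.
    pub fn resolve_codex_apps(
        legacy_fallback: &str,
        overlays: &[Overlay],
        has_chatgpt_auth: bool,
    ) -> Option<String> {
        // Phase 1: install the legacy fallback for compatibility.
        let mut entry = Some(legacy_fallback.to_string());

        // Phase 2: apply extension overlays in order.
        for overlay in overlays {
            entry = match overlay {
                Overlay::Set(endpoint) => Some(endpoint.clone()),
                Overlay::Remove => None,
            };
        }

        // Phase 3: the auth gate can only drop the entry. It never
        // synthesizes the server again, so a `Remove` stays removed.
        if has_chatgpt_auth {
            entry
        } else {
            None
        }
    }
    ```

    With no overlays the function returns the fallback, which matches the
    behavior described for hosts that do not install the MCP extension.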
    
    The temporary `legacy_apps_mcp_loader_enabled` coordination flag is no
    longer needed.
    
    ### Remove the path override
    
    The `apps_mcp_path_override` feature and its runtime plumbing are
    removed, including:
    
    - the feature registry entry and structured feature config;
    - `Config` and `McpConfig` fields;
    - config schema output;
    - config-lock materialization;
    - URL override handling in `codex-mcp`.
    
    Existing boolean and structured forms still deserialize as ignored
    compatibility input. They are omitted from new serialized config, and
    config-lock comparison normalizes the removed input so older locks
    remain replayable.
    
    ### App-server coverage
    
    App-server MCP fixtures now serve the hosted route at
    `/api/codex/ps/mcp`. Existing resource-read and tool/elicitation flows
    therefore exercise the extension-owned endpoint rather than succeeding
    through the legacy fallback.
    
    The stack also adds the missing `codex_chatgpt::connectors` re-export
    for the manager-backed connector helper introduced in #27191.
    
    ## Compatibility
    
    - App-server installs the extension and uses `/ps/mcp` for the hosted
    runtime.
    - CLI and other hosts that do not install the extension retain the
    legacy Apps endpoint.
    - Apps disabled or non-ChatGPT authentication removes `codex_apps` from
    the effective runtime view.
    - Existing local plugins, local skills, executor-selected skills,
    configured MCP servers, and MCP OAuth behavior are otherwise unchanged.
    - Backend plugin enablement remains account/workspace state owned by the
    hosted endpoint; this PR does not add thread-local backend plugin
    selection.
    
    ## Architectural fit
    
    The stack now proves two independent runtime shapes:
    
    1. #27184 resolves filesystem-backed skills through the executor that
    owns a selected root.
    2. #27191 and this PR resolve a backend-hosted HTTP MCP through an
    extension with no executor.
    
    Together they preserve the intended separation:
    
    - selection identifies a plugin/root when explicit selection is needed;
    - each component's owning extension resolves its concrete access
    mechanism;
    - execution stays with the runtime required by that component;
    - existing skills, MCP, connector, and hook subsystems remain the
    downstream consumers.
    
    ## Planned follow-ups
    
    1. **Executor stdio MCP:** selecting an executor plugin registers a
    manifest-declared stdio MCP server and executes it in the environment
    that owns the plugin.
    2. **Optional backend selection:** only if CCA needs thread-local
    selection distinct from backend account/workspace enablement, add a
    concrete backend-owned capability location and surface those selected
    skills through the skills catalog.
    3. **Connector metadata and hooks:** activate those plugin components
    through their existing owning subsystems, with executor hooks remaining
    environment-bound.
    4. **Propagation and persistence:** define explicit resume, fork,
    subagent, refresh, and environment-removal semantics once selected roots
    have multiple real consumers.
    5. **Local convergence:** migrate legacy local skill, MCP, connector,
    and hook paths behind their owning extensions one vertical at a time,
    then remove duplicate core managers and compatibility plumbing after
    parity.
    
    ## Verification
    
    Coverage in this change exercises:
    
    - extension-owned `/backend-api/ps/mcp` registration without an
    executor;
    - preservation of the legacy endpoint in hosts without the extension;
    - extension `Set` and `Remove` precedence over the legacy fallback;
    - ChatGPT-auth gating for the reserved server;
    - hosted MCP resource reads with and without an active thread;
    - connector tool invocation and MCP elicitation through the hosted
    route;
    - ignored boolean and structured forms of the removed path override;
    - config-lock replay compatibility for the removed feature.
    
    `cargo check -p codex-features -p codex-mcp-extension -p
    codex-app-server` passes. Tests and Clippy were not run locally under
    the current development instruction; CI provides the full validation
    pass.
  • [codex] Make MCP connection startup fallible (#27261)
    ## Why
    
    Required MCP server startup was enforced in `Session::new` after
    `McpConnectionManager` had already created the clients. That split let
    other manager construction paths bypass the same requirement and exposed
    manager internals solely so the session could validate them. Keeping
    required-server readiness in the constructor gives every caller one
    consistent startup contract.
    
    ## What changed
    
    - make `McpConnectionManager::new` return `anyhow::Result<Self>` and
    fail when an enabled, required server cannot initialize
    - pass the startup cancellation token into the constructor so
    required-server waits remain cancellable
    - propagate constructor failures through resource reads, connector
    discovery, and MCP status collection
    - preserve the active manager and cancellation token when a refreshed
    replacement fails
    - keep required-startup failure collection private and cover the
    constructor error contract directly
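    
    The constructor contract described above can be sketched as follows. This is a minimal illustration, not the actual `McpConnectionManager` code: `ServerSpec`, `Manager`, and the `init_result` field are hypothetical stand-ins, `Result<Self, String>` replaces `anyhow::Result<Self>` to keep the sketch dependency-free, and the cancellation token is not modeled.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Hypothetical per-server startup settings (names are illustrative).
    #[derive(Clone, Debug)]
    pub struct ServerSpec {
        pub enabled: bool,
        pub required: bool,
        /// Stand-in for a real handshake: `Err` carries the failure message.
        pub init_result: Result<(), String>,
    }
    
    /// Hypothetical manager that only records which servers became ready.
    #[derive(Debug)]
    pub struct Manager {
        pub ready: Vec<String>,
    }
    
    impl Manager {
        /// Fails when any enabled, required server cannot initialize.
        /// Optional servers that fail are skipped instead of aborting startup.
        pub fn new(servers: &BTreeMap<String, ServerSpec>) -> Result<Self, String> {
            let mut ready = Vec::new();
            let mut required_failures = Vec::new();
            for (name, spec) in servers {
                if !spec.enabled {
                    continue;
                }
                match &spec.init_result {
                    Ok(()) => ready.push(name.clone()),
                    Err(err) if spec.required => {
                        required_failures.push(format!("{}: {}", name, err));
                    }
                    Err(_) => {}
                }
            }
            if required_failures.is_empty() {
                Ok(Self { ready })
            } else {
                Err(format!(
                    "required MCP servers failed to initialize: {}",
                    required_failures.join("; ")
                ))
            }
        }
    }
    ```
    
    Because the check lives in the constructor, every caller that builds a manager sees the same error, and the failure list never has to leave the type.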
    
    ## Validation
    
    - updated the focused connection-manager test to assert the complete
    required-server startup error
    - local tests not run; relying on CI
  • [codex] Tighten MCP connection manager API visibility and order (#27257)
    ## Summary
    
    - order `McpConnectionManager` methods by visibility, with the primary
    constructor and public API first
    - restrict `list_available_server_infos` to `codex-mcp`
    - make `new_uninitialized` a private test-only helper
    
    ## Why
    
    The manager exposed methods that are only used inside `codex-mcp` or its
    unit tests. Tightening those methods keeps the exported API intentional,
    while the new ordering makes the supported surface easier to scan.
    
    ## Validation
    
    - `just fmt`
    - `git diff --check`
    - local tests not run; relying on CI
  • Route hosted Apps MCP through extensions (#27191)
    ## Stack
    
    - Base: #27184
    - This PR is the second vertical and should be reviewed against
    `jif/external-plugins-1`, not `main`.
    
    ## Why
    
    CCA is moving toward a split runtime where the orchestrator may have no
    filesystem or executor, but it still needs to activate remotely hosted
    plugin components. HTTP MCP servers are the simplest complete example:
    they need configuration and host authentication, but they do not need an
    executor process.
    
    The Apps MCP endpoint is currently synthesized by a special-purpose
    loader inside the MCP runtime. That works locally, but it leaves hosted
    MCP activation outside the extension model being established in #27184.
    It also makes the Apps path a poor foundation for plugins whose skills,
    MCP servers, connectors, and hooks may come from different sources or
    execute in different places.
    
    This PR moves that one behavior behind an extension-owned contribution
    while preserving the existing local fallback. It deliberately does not
    introduce a generic plugin activation framework.
    
    ## What changed
    
    ### MCP extension contribution
    
    `codex-extension-api` gains an ordered `McpServerContributor` contract.
    A contributor returns typed `Set` or `Remove` overlays for MCP server
    configuration; later contributors win for the names they own.
    
    The contract stays at the existing MCP configuration boundary.
    Extensions do not create a second connection manager or transport
    abstraction.
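    
    A minimal sketch of the overlay ordering, assuming contributors are applied in registration order. `ServerConfig`, `Overlay`, and `apply_overlays` are hypothetical names for illustration and do not reflect the real `McpServerContributor` signatures.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Hypothetical stand-in for an MCP server configuration.
    #[derive(Clone, Debug, PartialEq)]
    pub struct ServerConfig {
        pub url: String,
    }
    
    /// A typed overlay for one server name.
    #[derive(Clone, Debug)]
    pub enum Overlay {
        Set(String, ServerConfig),
        Remove(String),
    }
    
    /// Applies contributor overlays in order, so a later contributor
    /// overrides an earlier one for the same server name.
    pub fn apply_overlays(
        configured: &BTreeMap<String, ServerConfig>,
        contributions: &[Vec<Overlay>],
    ) -> BTreeMap<String, ServerConfig> {
        let mut resolved = configured.clone();
        for overlay in contributions.iter().flatten() {
            match overlay {
                Overlay::Set(name, config) => {
                    resolved.insert(name.clone(), config.clone());
                }
                Overlay::Remove(name) => {
                    resolved.remove(name);
                }
            }
        }
        resolved
    }
    ```
    
    The overlays operate on the existing configuration map, which matches the intent of staying at the MCP configuration boundary rather than introducing another manager.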
    
    ### Hosted Apps MCP extension
    
    A new `codex-mcp-extension` contributes the reserved `codex_apps` server
    from the existing Apps feature, ChatGPT base URL, path override, and
    product SKU configuration.
    
    When `apps_mcp_path_override` is enabled for `https://chatgpt.com`, the
    resulting streamable HTTP endpoint is
    `https://chatgpt.com/backend-api/ps/mcp`. The existing ChatGPT-auth gate
    remains authoritative, so this server can run in an orchestrator-only
    process without being exposed for API-key sessions.
    
    ### One resolved runtime view
    
    `McpManager` now distinguishes three views:
    
    - **configured:** config- and plugin-backed servers before extension
    overlays;
    - **runtime:** configured servers plus host-installed extension
    contributions;
    - **effective:** runtime servers after auth gating and compatibility
    built-ins.
    
    App-server installs the hosted MCP extension and uses the runtime view
    for thread startup, refresh, status, threadless resource reads,
    connector discovery, and MCP OAuth lookup. This keeps
    `mcpServer/oauth/login` consistent with the servers exposed by the other
    MCP APIs. The hosted Apps server itself continues to use existing
    ChatGPT host authentication rather than MCP OAuth.
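    
    The relationship between the three views can be sketched over server names alone. This is an assumption-laden illustration: `Views` and its fields are hypothetical, extension contributions are modeled as additions only, and compatibility built-ins are left out.
    
    ```rust
    use std::collections::BTreeSet;
    
    /// Reserved name of the hosted Apps server, as used in this PR.
    const CODEX_APPS: &str = "codex_apps";
    
    /// Hypothetical inputs for resolving the three views of server names.
    pub struct Views {
        /// Config- and plugin-backed servers before extension overlays.
        pub configured: BTreeSet<String>,
        /// Servers added by host-installed extensions.
        pub extension_contributed: BTreeSet<String>,
        pub has_chatgpt_auth: bool,
    }
    
    impl Views {
        /// Configured servers plus host-installed extension contributions.
        pub fn runtime(&self) -> BTreeSet<String> {
            self.configured
                .union(&self.extension_contributed)
                .cloned()
                .collect()
        }
    
        /// Runtime servers after auth gating: the reserved Apps server is
        /// dropped unless ChatGPT auth is present.
        pub fn effective(&self) -> BTreeSet<String> {
            self.runtime()
                .into_iter()
                .filter(|name| name.as_str() != CODEX_APPS || self.has_chatgpt_auth)
                .collect()
        }
    }
    ```
    
    In this sketch an API-key session still sees `codex_apps` in the runtime view but not in the effective view.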
    
    ## Compatibility
    
    Hosts that do not install the MCP extension retain the existing Apps MCP
    synthesis path. This preserves current local-only, CLI, and
    standalone-host behavior while app-server exercises the extension path.
    
    Disabling Apps removes the reserved `codex_apps` entry, and losing
    ChatGPT auth removes it from the effective runtime view. Executor
    availability is not consulted for this HTTP transport.
    
    ## Follow-ups
    
    The next vertical will resolve a manifest-declared stdio MCP server from
    an executor-selected plugin root and execute it in the environment that
    owns that root. Later verticals can add backend-owned skills, connector
    metadata, hooks, durable selection semantics, and incremental local
    convergence without changing the component-specific runtime boundaries
    introduced here.
    
    ## Verification
    
    Focused coverage was added for:
    
    - contributing the hosted Apps MCP at `/backend-api/ps/mcp` without an
    executor;
    - requiring ChatGPT auth in the effective runtime view;
    - removing a reserved configured Apps server when the Apps feature is
    disabled.
    
    `cargo check -p codex-app-server -p codex-mcp-extension -p
    codex-extension-api -p codex-mcp` passed. Tests and Clippy were not run
    locally under the current development instruction; CI provides the full
    validation pass.
  • [app-server][core] Add connector-level Guardian reviewer overrides (#25167)
    Context: https://openai.slack.com/archives/C0B4JAF0Q2C/p1779912328647229
    
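    ```toml
    # Global reviewer for approval requests.
    approvals_reviewer = "auto_review"
    
    # Per-connector override: this app routes approvals to the user.
    [apps.connector_5f3c8c41a1e54ad7a76272c89e2554fa]
    enabled = true
    approvals_reviewer = "user"
    default_tools_approval_mode = "prompt"
    ```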
    
    <img width="230" height="84" alt="Screenshot 2026-05-31 at 11 56 34 AM"
    src="https://github.com/user-attachments/assets/e319f8f7-0983-42a7-98cd-3302732fa406"
    />
    
    <img width="841" height="233" alt="Screenshot 2026-05-31 at 11 52 42 AM"
    src="https://github.com/user-attachments/assets/7ac76645-4e90-4d00-8242-f031146a22a5"
    />
    
    -------
    
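    ```toml
    # Global reviewer for approval requests.
    approvals_reviewer = "user"
    
    # Per-connector override: this app routes approvals to auto review.
    [apps.connector_5f3c8c41a1e54ad7a76272c89e2554fa]
    enabled = true
    approvals_reviewer = "auto_review"
    default_tools_approval_mode = "prompt"
    ```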
    <img width="195" height="83" alt="Screenshot 2026-05-31 at 12 02 27 PM"
    src="https://github.com/user-attachments/assets/3d374dc8-8aa2-466f-a13f-e4ed8567aa2e"
    />
    <img width="771" height="207" alt="Screenshot 2026-05-31 at 12 05 42 PM"
    src="https://github.com/user-attachments/assets/105c2575-68d6-4ca6-8e69-dc8c82da36a2"
    />
    
    
    
    ## Summary
    - add `apps.<connector_id>.approvals_reviewer` to override Guardian or
    user review routing per connected app
    - apply overrides across direct app MCP calls, delegated MCP prompts,
    and app-server MCP elicitation review while preserving global behavior
    for non-app MCP servers
    - expose and document the config through app-server v2 and generated
    schemas, while honoring global managed reviewer requirements
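    
    The override lookup can be sketched as a per-connector value that falls back to the global setting. `Reviewer`, `ApprovalConfig`, and `reviewer_for` are hypothetical names, and managed reviewer requirements are deliberately not modeled here.
    
    ```rust
    use std::collections::HashMap;
    
    /// Mirrors the two `approvals_reviewer` values shown in the config examples.
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum Reviewer {
        User,
        AutoReview,
    }
    
    /// Hypothetical in-memory view of the relevant config keys.
    pub struct ApprovalConfig {
        /// Top-level `approvals_reviewer`.
        pub global: Reviewer,
        /// `apps.<connector_id>.approvals_reviewer`, keyed by connector id.
        pub per_connector: HashMap<String, Reviewer>,
    }
    
    impl ApprovalConfig {
        /// `connector_id` is `None` for non-app MCP servers, which keep the
        /// global reviewer. Apps without an override also use the global value.
        pub fn reviewer_for(&self, connector_id: Option<&str>) -> Reviewer {
            connector_id
                .and_then(|id| self.per_connector.get(id).copied())
                .unwrap_or(self.global)
        }
    }
    ```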
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • Expose MCP server info as part of server status (#24698)
    # Summary
    
    Expose MCP server info via App Server (when available) so apps can
    render a richer MCP experience
  • Update rmcp to 1.7.0 (#24763)
    This will make it easier to uprev when the new draft spec is supported.
    
    Also updates reqwest where needed for compatibility but doesn't update
    it everywhere since this is already a large diff.
    
    The new version of rmcp handles certain kinds of authentication failures
    differently, so this patch adds support for identifying the failing scope
    in a WWW-Authenticate header.
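    
    As a rough illustration of what identifying the scope involves, the sketch below pulls a quoted `scope` parameter out of a header value. It is a simplistic, hypothetical helper (`failing_scope`), not the parsing used by rmcp or this patch, and it does not handle unquoted values or escaped quotes.
    
    ```rust
    /// Returns the quoted `scope` parameter from a `WWW-Authenticate`
    /// header value, if one is present.
    pub fn failing_scope(header: &str) -> Option<&str> {
        let marker = "scope=\"";
        // Position just past the opening quote of the scope value.
        let start = header.find(marker)? + marker.len();
        // Position of the closing quote, relative to the whole header.
        let end = header[start..].find('"')? + start;
        Some(&header[start..end])
    }
    ```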
  • fix(core): instrument stalled tool-listing handoff (#24667)
    ## Why
    
    When a turn needs a follow-up request after tool output is recorded,
    Codex can still appear stuck in `Thinking` before the next `/responses`
    request is opened. The existing local trace showed the last completed
    response and the absence of a new backend request, but it did not show
    whether the stall was in tool-router preparation or later request setup.
    
    Issue: N/A (internal incident investigation)
    
    ## What Changed
    
    Added trace spans around the pre-stream tool-router handoff in
    `core/src/session/turn.rs`, including the `built_tools` phase and the
    MCP manager read lock.
    
    Added per-server MCP tool-listing spans and trace breadcrumbs in
    `codex-mcp/src/connection_manager.rs` with startup snapshot /
    startup-complete state so a pending MCP client is visible in feedback
    logs instead of looking like a silent hang.
    
    ## Verification
    
    - `just fmt`
    - `just test -p codex-mcp`
    - `just test -p codex-core` (a prior full rerun failed in this workspace on
    unrelated integration tests: code-mode output length expectations, one
    shell timeout formatting assertion, and shell snapshot timeouts; the
    latest review-fix rerun compiled and passed 1160 tests before I stopped
    the abnormally slow, unrelated suite)
  • Remove reserved namespaces dedup (#24609)
    Avoid suffixing reserved namespaces.
  • Move MCP tool naming mode into manager (#21576)
    ## Why
    
    The `non_prefixed_mcp_tool_names` feature should be applied where MCP
    tools become model-visible, not by remapping names later in core.
    Keeping the decision in `McpConnectionManager` construction makes
    `ToolInfo` the single shaped view that spec building, deferred tool
    search, routing, and unavailable-tool placeholders can consume directly.
    
    This also preserves the existing external behavior while the feature is
    off, and keeps the feature-on behavior for code mode and hooks explicit
    at the manager boundary.
    
    ## What Changed
    
    - Add `McpToolNameMode` to `codex-mcp` and flow it through `McpConfig`
    into `McpConnectionManager::new`.
    - Normalize MCP `ToolInfo` names in the manager using either
    legacy-prefixed namespaces or non-prefixed namespaces; the legacy path
    adds `mcp__` without restoring the old trailing namespace suffix.
    - Remove the core-side MCP name remapping path so specs, tool search,
    session resolution, and unavailable-tool placeholder construction use
    the manager-provided `ToolName` values directly.
    - Keep code mode flattening on the `__` namespace separator.
    - Preserve hook compatibility by giving non-prefixed MCP hook names
    legacy `mcp__...` matcher aliases.
    - Add/adjust integration and unit coverage for non-prefixed code-mode
    behavior, hook matching with the feature on and off, and manager-level
    legacy prefixing.
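    
    The naming rules above can be sketched as follows. `ToolNameMode`, its variants, and the helper functions are hypothetical, and the exact string shapes are assumptions based on the description: an `mcp__` prefix in legacy mode, a `__` separator when flattening, and a legacy alias for hook matching.
    
    ```rust
    /// Hypothetical stand-in for the naming mode chosen at manager construction.
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum ToolNameMode {
        LegacyPrefixed,
        NonPrefixed,
    }
    
    /// Namespace for a server's tools under the given mode.
    pub fn namespace(mode: ToolNameMode, server: &str) -> String {
        match mode {
            ToolNameMode::LegacyPrefixed => format!("mcp__{}", server),
            ToolNameMode::NonPrefixed => server.to_string(),
        }
    }
    
    /// Flattened name that joins namespace and tool with the `__` separator.
    pub fn flattened_name(mode: ToolNameMode, server: &str, tool: &str) -> String {
        format!("{}__{}", namespace(mode, server), tool)
    }
    
    /// Names a hook matcher may use: the primary name plus, in non-prefixed
    /// mode, the legacy `mcp__...` alias.
    pub fn hook_matcher_names(mode: ToolNameMode, server: &str, tool: &str) -> Vec<String> {
        let primary = flattened_name(mode, server, tool);
        let legacy = flattened_name(ToolNameMode::LegacyPrefixed, server, tool);
        if primary == legacy {
            vec![primary]
        } else {
            vec![primary, legacy]
        }
    }
    ```
    
    Deciding the shape once, at the point where names are produced, means downstream consumers never need a second remapping pass.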
    
    ## Testing
    
    - `cargo test -p codex-mcp --lib`
    - `cargo test -p codex-core --lib tools::spec::tests -- --nocapture`
    - `cargo test -p codex-core --lib mcp_tools -- --nocapture`
    - `cargo test -p codex-core --lib mcp_tool_exposure -- --nocapture`
    - `cargo test -p codex-core --test all mcp_tool -- --nocapture`
    - `cargo test -p codex-core --test all search_tool -- --nocapture`
    - `cargo test -p codex-core --test all hooks_mcp -- --nocapture`
    - `cargo test -p codex-core --test all
    code_mode_uses_non_prefixed_mcp_tool_names_when_feature_enabled --
    --nocapture`
    - `cargo test -p codex-tools`
    - `cargo test -p codex-features`
  • Route MCP servers through explicit environments (#23583)
    ## Summary
    - route each configured MCP server through an explicit per-server
    `environment_id` instead of a manager-wide remote toggle
    - default omitted `environment_id` to `local`, resolve named ids through
    `EnvironmentManager`, and fail only the affected MCP server when an
    explicit id is unknown
    - keep local stdio on the existing local launcher path for now, while
    named-environment stdio uses the selected environment backend and
    requires an absolute `cwd`
    - allow local HTTP MCP servers to keep using the ambient HTTP client
    when no local `Environment` is configured; named-environment HTTP MCPs
    use that environment's HTTP client
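    
    A minimal sketch of the per-server resolution rules listed above. `Transport` and `resolve_environment` are hypothetical, the set of named environments stands in for `EnvironmentManager`, and the case where no local `Environment` is configured is not modeled.
    
    ```rust
    use std::collections::BTreeSet;
    use std::path::Path;
    
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum Transport {
        Stdio,
        Http,
    }
    
    /// Resolves the environment for one MCP server. An `Err` is meant to
    /// fail only that server, not the whole manager.
    pub fn resolve_environment(
        environment_id: Option<&str>,
        named_environments: &BTreeSet<String>,
        transport: Transport,
        cwd: Option<&Path>,
    ) -> Result<String, String> {
        // An omitted id defaults to `local`.
        let id = environment_id.unwrap_or("local");
        if id == "local" {
            return Ok(id.to_string());
        }
        if !named_environments.contains(id) {
            return Err(format!("unknown environment id `{}`", id));
        }
        // Named-environment stdio servers need an absolute working directory.
        let has_absolute_cwd = cwd.map_or(false, |path| path.is_absolute());
        if transport == Transport::Stdio && !has_absolute_cwd {
            return Err(format!(
                "stdio server in environment `{}` requires an absolute cwd",
                id
            ));
        }
        Ok(id.to_string())
    }
    ```
    
    Returning a `Result` per server lets the caller keep the remaining servers running when one explicit id is unknown.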
    
    ## Validation
    - devbox Bazel build: `bazel build --bes_backend= --bes_results_url=
    //codex-rs/cli:codex //codex-rs/rmcp-client:test_stdio_server
    //codex-rs/rmcp-client:test_streamable_http_server`
    - devbox app-server config matrix with real `config.toml` /
    `environments.toml` files covering omitted local, explicit local,
    omitted local under remote default, explicit remote stdio, local HTTP
    without local env, explicit remote HTTP, local stdio without local env,
    unknown explicit env, and remote stdio without `cwd`
  • Make local environment optional in EnvironmentManager (#23369)
    ## Summary
    - make `EnvironmentManager` local environment/runtime paths optional
    - simplify constructor surface around snapshot materialization
    - rename local env accessors to `require_local_environment` /
    `try_local_environment`
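    
    The accessor pair can be sketched like this, assuming the `try_` form returns an `Option` and the `require_` form returns an error when no local environment exists. `LocalEnvironment`, `EnvManager`, and the return types are hypothetical, not the real `EnvironmentManager` signatures.
    
    ```rust
    /// Hypothetical stand-in for the local environment handle.
    #[derive(Debug, PartialEq)]
    pub struct LocalEnvironment {
        pub name: String,
    }
    
    /// Hypothetical manager whose local environment may be absent.
    pub struct EnvManager {
        local: Option<LocalEnvironment>,
    }
    
    impl EnvManager {
        pub fn new(local: Option<LocalEnvironment>) -> Self {
            Self { local }
        }
    
        /// For callers that can proceed without a local environment.
        pub fn try_local_environment(&self) -> Option<&LocalEnvironment> {
            self.local.as_ref()
        }
    
        /// For callers that cannot proceed without a local environment.
        pub fn require_local_environment(&self) -> Result<&LocalEnvironment, String> {
            self.local
                .as_ref()
                .ok_or_else(|| "no local environment is configured".to_string())
        }
    }
    ```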
    
    ## Validation
    - devbox Bazel build for touched crate surfaces
    - `//codex-rs/exec-server:exec-server-unit-tests`
    - `//codex-rs/app-server-client:app-server-client-unit-tests`
    - filtered touched `//codex-rs/core:core-unit-tests` cases