codex

[codex] Classify nested MCP authentication startup errors (#30257)

## Summary

- classify authentication-required RMCP startup failures, including
errors nested inside `ClientInitializeError::TransportError`
- let `codex-mcp` consume that classification so the existing
`reauthenticationRequired` startup failure reason is emitted
- add a regression test that performs real startup with an expired
persisted OAuth token and no refresh token

## Why

Follow-up to #29877.

RMCP stores streamable HTTP initialization failures inside a dynamic
transport error whose payload is not exposed through the standard Rust
error source chain. The original `anyhow::Error::chain()` check
therefore missed the nested `AuthError::AuthorizationRequired` seen
during real MCP startup and emitted `failureReason: null`.

The transport-specific inspection now lives in `codex-rmcp-client`,
while `codex-mcp` consumes only the domain-level authentication-required
result. This classifier does not distinguish first-time login from
reauthentication; the existing auth-state logic remains responsible for
that distinction.
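
The gap described above can be sketched with standard-library types only. Everything here is a hypothetical stand-in (`AuthError`, `DynTransportError`, `InitError`, and both function names are illustrative, not the RMCP or Codex API): the transport error keeps its payload type-erased and does not report it through `source()`, so a plain chain walk misses it, while a transport-aware classifier can still find it.

```rust
use std::any::Any;
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for the auth error named above.
#[derive(Debug)]
pub enum AuthError {
    AuthorizationRequired,
}

impl fmt::Display for AuthError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "authorization required")
    }
}

impl Error for AuthError {}

/// Hypothetical transport error. The payload is type-erased and is NOT
/// reported through `source()`, so a source-chain walk cannot reach it.
#[derive(Debug)]
pub struct DynTransportError {
    payload: Box<dyn Any + Send + Sync>,
}

impl DynTransportError {
    pub fn new<T: Any + Send + Sync>(payload: T) -> Self {
        Self { payload: Box::new(payload) }
    }
}

impl fmt::Display for DynTransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "transport error")
    }
}

impl Error for DynTransportError {}

/// Hypothetical initialization error that wraps the transport error.
#[derive(Debug)]
pub enum InitError {
    Transport(DynTransportError),
    ConnectionClosed,
}

impl fmt::Display for InitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "client initialization failed")
    }
}

impl Error for InitError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            InitError::Transport(inner) => Some(inner),
            InitError::ConnectionClosed => None,
        }
    }
}

/// Walks the source chain; optionally also looks inside transport payloads.
fn walk(err: &(dyn Error + 'static), look_inside_transport: bool) -> bool {
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        if e.downcast_ref::<AuthError>().is_some() {
            return true;
        }
        if look_inside_transport {
            if let Some(transport) = e.downcast_ref::<DynTransportError>() {
                if transport.payload.downcast_ref::<AuthError>().is_some() {
                    return true;
                }
            }
        }
        current = e.source();
    }
    false
}

/// Chain-only check, analogous to the original approach.
pub fn chain_only_check(err: &(dyn Error + 'static)) -> bool {
    walk(err, false)
}

/// Transport-aware check, analogous to the new classifier.
pub fn classify_auth_required(err: &(dyn Error + 'static)) -> bool {
    walk(err, true)
}
```

For an `InitError::Transport` whose payload is `AuthError::AuthorizationRequired`, `chain_only_check` returns `false` and `classify_auth_required` returns `true`. Keeping the payload inspection in one function mirrors the split described above, where only the client crate knows about transport internals.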

## User impact

When stored MCP OAuth credentials are expired and cannot be refreshed,
app clients now receive `failureReason: "reauthenticationRequired"` on
the failed startup update and can show the reconnect action. First-time
login and unrelated startup failures remain unchanged.

## Validation

- `just test -p codex-rmcp-client --test streamable_http_oauth_startup
identifies_expired_unrefreshable_token_startup_error`
- `just test -p codex-mcp
startup_outcome_error_identifies_authentication_required`
- `just test -p codex-mcp
mcp_startup_failure_reason_requires_existing_oauth_and_auth_failure`
- `cargo build -p codex-cli --bin codex`
- local app-server probe emitted `failureReason:
"reauthenticationRequired"`
- manual end-to-end reconnect flow confirmed
- `just fmt`

felixxia-oai · 2026-06-26 14:11:13 -07:00

526f495f3a

Reuse MCP runtimes when selected availability changes nothing (#30148)

## Why

MCP runtime reuse was keyed by every ready selected-capability
environment, even when an environment contributed no MCP servers or
connectors.

For example:

1. a global stdio MCP is running;
2. a selected remote environment contains only a skill;
3. that environment becomes ready;
4. the MCP and connector projection stays exactly the same;
5. Codex nevertheless rebuilds the MCP manager and restarts the global
stdio process.

That restart can interrupt active calls and discard process-local state
even though nothing about MCP changed.

## What changes

When selected-environment availability changes, Codex now resolves the
candidate MCP and connector projection before deciding whether to
replace the runtime:

- if the winning MCP servers or their ownership change, rebuild as
before;
- if the selected connector snapshot changes, rebuild as before;
- if an enabled MCP is explicitly bound to an environment whose
availability changed, rebuild as before;
- otherwise, keep the exact live manager and processes, and update only
the availability input remembered by the snapshot.

```text
ready selected environments:  [] -> [skills-env]
resolved MCP servers:          {global_probe} -> {global_probe}
resolved connectors:           {} -> {}
result:                         reuse manager; keep the same process
```

The comparison uses the resolved winning servers and their sources, so
plugin/config ownership remains part of the runtime identity.
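
The decision rules above could be sketched roughly as follows. This is an illustrative model, not the Codex implementation: `ResolvedServer`, `Projection`, `RuntimeDecision`, and `decide` are hypothetical names, and server sources and environment IDs are simplified to strings.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical resolved (winning) MCP server entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ResolvedServer {
    /// Who owns the winning declaration (for example a plugin or config).
    pub source: String,
    /// Environment this server is explicitly bound to, if any.
    pub bound_environment: Option<String>,
}

/// Hypothetical MCP and connector projection for one thread.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct Projection {
    pub servers: BTreeMap<String, ResolvedServer>,
    pub connectors: BTreeSet<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RuntimeDecision {
    Reuse,
    Rebuild,
}

/// Decides whether an availability change requires a new runtime.
pub fn decide(
    current: &Projection,
    candidate: &Projection,
    changed_environments: &BTreeSet<String>,
) -> RuntimeDecision {
    // Winning servers (including ownership) or connectors changed.
    if current.servers != candidate.servers || current.connectors != candidate.connectors {
        return RuntimeDecision::Rebuild;
    }
    // A server is explicitly bound to an environment whose availability changed.
    let bound_to_changed = candidate.servers.values().any(|server| {
        server
            .bound_environment
            .as_ref()
            .map_or(false, |env| changed_environments.contains(env))
    });
    if bound_to_changed {
        RuntimeDecision::Rebuild
    } else {
        RuntimeDecision::Reuse
    }
}
```

With a single unbound `global_probe` server, no connectors, and `skills-env` as the only changed environment, `decide` returns `RuntimeDecision::Reuse`, matching the text example above.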

## Existing stack coverage

The integration PR directly below this one already covers both rebuild
boundaries: a selected MCP becomes callable and a selected connector
tool becomes model-visible when their environment becomes available. It
also verifies that an unchanged selected MCP runtime keeps its process.

This PR does not add another remote-attachment integration scenario for
the no-change optimization. `environment/add` returns before readiness,
and app-server does not currently expose a deterministic readiness
signal for an environment that contributes only skills. Keeping a
fixed-delay test would add flake risk; adding a new readiness API would
be outside this fix.

## Scope and assumptions

- This does not change skill discovery, World State rendering, or plugin
metadata caching.
- This does not add file watching or hot reload behavior.
- This does not change disconnect/reconnect handling.
- Selected environment IDs and their capability contents retain the
stack's existing stability assumption.
- Delayed `required = true` executor MCP behavior remains out of scope.

jif · 2026-06-26 09:27:41 +01:00

6d2168f06a

Retry failed Codex Apps MCP startup (#29920)

## Problem

The built-in Codex Apps MCP client shares a future for the full startup
operation: connect, complete `initialize`, fetch the initial tools, and
return a usable client. Sharing deduplicates startup work, but it also
memoizes terminal errors.

After a transient connection, handshake, or initial `tools/list`
failure, later tool builds observe the same failed future. Even after
the backend recovers, the thread cannot reconnect; it keeps serving its
startup-time cached tool snapshot, which may be empty or stale.

## Fix

When Apps MCP startup ends in an error, Codex starts bounded recovery
without putting startup latency on tool-router construction:

1. The current tool build immediately continues with the cached startup
snapshot.
2. After the initial failure is reported, Codex starts one fresh full
startup attempt in the background.
3. Concurrent tool builds share that in-flight attempt and also continue
with cached tools.
4. On success, the recovered client becomes active, refreshes the Apps
tools cache, emits a `Ready` startup status, and is reused by later
operations.
5. On failure, the cache remains unchanged and later tool builds may
start another background attempt after an exponential cooldown: 1s, 2s,
4s, 8s, 16s, then capped at 30s.

Each recreated startup performs a fresh MCP `initialize` and uncached
`tools/list`. The MCP client retains its existing bounded retries for
retryable `initialize` and `tools/list` failures.

This avoids adding the Apps startup timeout to every request during a
sustained outage.
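
The cooldown schedule listed in step 5 can be written as a small pure function. This is a minimal sketch: the function name, the constant, and the convention that `failed_attempts` counts consecutive failed background attempts are assumptions, not names from the PR.

```rust
use std::time::Duration;

/// Upper bound on the cooldown between background attempts.
const MAX_COOLDOWN: Duration = Duration::from_secs(30);

/// Cooldown before the next background startup attempt.
///
/// `failed_attempts` is assumed to count consecutive failed attempts,
/// starting at 1 after the first failure.
pub fn reconnect_cooldown(failed_attempts: u32) -> Duration {
    // Doubling starts at 1s; the exponent is clamped so the shift cannot overflow.
    let exponent = failed_attempts.saturating_sub(1).min(5);
    Duration::from_secs(1u64 << exponent).min(MAX_COOLDOWN)
}
```

Successive failures yield 1s, 2s, 4s, 8s, 16s, and then 30s for every later attempt.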

## Scope

This is limited to the built-in Codex Apps MCP client:

- no reconnects for user-configured MCP servers;
- no cache deletion; and
- no proactive refresh for a healthy client with stale tools.

## Tests

Coverage verifies:

- tool builds return cached tools without waiting for a blocked
reconnect;
- concurrent tool builds start only one background reconnect;
- failed reconnects preserve cached tools and respect exponential
cooldown;
- a recovered client is retained and reused; and
- a long-lived thread exposes recovered app tools on a later follow-up.

Validation:

- `just test -p codex-mcp` — 95 passed
- `just test -p codex-core
later_follow_up_uses_background_recovered_apps_after_mid_thread_startup_failures
--no-capture` — passed
- `just fix -p codex-mcp`
- `just fmt`

kbazzi · 2026-06-25 21:31:12 -07:00

92d2e1df70

Keep MCP elicitation routable across runtime refreshes (#30127)

## Why

An MCP tool call can still be waiting for an elicitation response when
an environment update replaces the thread's MCP runtime.

Before this change:

```text
runtime A starts a tool call and asks the user
environment becomes ready, so runtime B is published
client answers the prompt through runtime B
runtime B cannot find runtime A's pending responder
```

The response is lost and the original tool call stays blocked.

## What changed

All MCP runtimes for one thread now share a small elicitation router:

```text
runtime A ---\
               shared router: response token -> exact pending responder
runtime B ---/
```

When Codex surfaces an MCP elicitation, it assigns a unique opaque
response token. The router records which pending request owns that
token. A replacement runtime reuses the same router, so the latest
runtime can deliver a response to a request started by the previous
runtime.

The Codex-owned token also prevents two runtime connections that reuse
the same MCP server request ID from receiving each other's responses.

This does not retain or search old MCP managers. Only the pending
responder map is shared.
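
A shared router of this shape might look like the sketch below. It is a simplified, synchronous model using only the standard library: `ElicitationRouter`, `register`, and `resolve` are hypothetical names, responses are plain strings, and the counter-based token stands in for the opaque token described above.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical router shared by every MCP runtime of one thread.
#[derive(Default)]
pub struct ElicitationRouter {
    next_token: AtomicU64,
    pending: Mutex<HashMap<String, Sender<String>>>,
}

impl ElicitationRouter {
    /// Registers a pending request and returns its response token together
    /// with the receiver the waiting tool call blocks on.
    pub fn register(&self) -> (String, Receiver<String>) {
        // Illustrative only: a real token would be opaque, not a counter.
        let id = self.next_token.fetch_add(1, Ordering::Relaxed);
        let token = format!("elicitation-{id}");
        let (sender, receiver) = channel();
        self.pending.lock().unwrap().insert(token.clone(), sender);
        (token, receiver)
    }

    /// Delivers a response to the exact request that owns `token`.
    /// Returns `false` if the token is unknown or was already answered.
    pub fn resolve(&self, token: &str, response: String) -> bool {
        let responder = self.pending.lock().unwrap().remove(token);
        match responder {
            Some(sender) => sender.send(response).is_ok(),
            None => false,
        }
    }
}
```

If runtime A and its replacement B both hold an `Arc<ElicitationRouter>` pointing at the same router, a response that arrives through B can still be delivered to a request registered by A. Because lookups use the router's own token rather than the server's request ID, two runtimes reusing the same request ID cannot receive each other's responses.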

## Covered scenario

The integration test exercises the complete failure mode:

1. A thread starts while its selected environment is still unavailable.
2. A configured MCP server starts a tool call and asks the client for
input.
3. The environment becomes ready, causing Codex to publish a replacement
MCP runtime.
4. The client answers the original prompt after the replacement.
5. The original tool call receives that answer and completes.

A focused routing test also creates two runtimes with the same server
request ID and verifies that each response reaches the exact request
that emitted its token.

## Scope

This PR changes only elicitation response routing across MCP runtime
replacement. It does not change when runtimes are rebuilt, which
environments contribute MCP configuration, or how environment
availability is detected.

jif · 2026-06-26 01:28:14 +00:00

fb8598df3f

Pin MCP runtimes to model steps (#30101)

## Why

An MCP refresh can replace the session's current manager while a model
step is still running. The step must execute calls through the same
manager whose tools it advertised.

## Boundary

```text
current session MCP runtime
          |
          | capture once for this model step
          v
StepContext.mcp
  - exact MCP config
  - exact connection manager
  - exact runtime environment context
```

```rust
pub struct McpRuntimeSnapshot {
    config: Arc<McpConfig>,
    manager: Arc<McpConnectionManager>,
    runtime_context: McpRuntimeContext,
}
```

## Example

```text
step A captures runtime A and advertises A's tools
refresh publishes runtime B
step A tool call -> runtime A
next step        -> runtime B
```

Capturing the snapshot is only an `Arc` clone. It does not restart MCPs
or make an RPC.
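
The capture-then-refresh behavior can be illustrated with a reduced model. The types below are hypothetical simplifications (`Runtime` stands in for the snapshot contents, and `Session` uses a standard-library `RwLock` rather than whatever publication primitive Codex uses); the point is only that capturing is a reference-count increment and that a later publish does not affect a step that already captured.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the snapshot contents.
#[derive(Debug)]
pub struct Runtime {
    pub label: String,
}

/// Hypothetical session holding the currently published runtime.
pub struct Session {
    current: RwLock<Arc<Runtime>>,
}

/// Hypothetical per-step context that pins one runtime.
pub struct StepContext {
    pub mcp: Arc<Runtime>,
}

impl Session {
    pub fn new(runtime: Runtime) -> Self {
        Self { current: RwLock::new(Arc::new(runtime)) }
    }

    /// Captures the current runtime for one model step: only an `Arc` clone.
    pub fn capture(&self) -> Arc<Runtime> {
        self.current.read().unwrap().clone()
    }

    /// Publishes a replacement and returns the superseded runtime.
    pub fn publish(&self, runtime: Runtime) -> Arc<Runtime> {
        let mut slot = self.current.write().unwrap();
        std::mem::replace(&mut *slot, Arc::new(runtime))
    }
}
```

A step that captured runtime A keeps using A after B is published, and the next call to `capture` returns B. Once the superseded handle and the step context are dropped, nothing holds A any longer.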

## What changes

- Captures one MCP runtime in `StepContext`.
- Uses it for tool planning, tool calls, resources, approvals, connector
attribution, and elicitation.
- Publishes replacement runtimes atomically.
- Lets an old runtime live only while an in-flight step or request still
holds its `Arc`.

Most of this diff is mechanical routing from the session-global manager
to `step_context.mcp`; it does not introduce selected-plugin discovery
yet.

## What does not change

- No plugin or extension migration.
- No new MCP cache policy.
- No environment file watching.
- No client sharing between separate managers.

## Stack

1. Extension-owned World State sections.
2. Project executor skills through World State.
3. **This PR:** pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment
availability.
5. One end-to-end integration scenario.

jif · 2026-06-26 00:53:07 +01:00

ee9e0f6387

[codex] Surface MCP reauthentication-required startup failures (#29877)

## Summary

- distinguish expired, non-refreshable stored MCP OAuth credentials from
first-time missing credentials
- carry a typed `failureReason: "reauthenticationRequired"` on the
existing `mcpServer/startupStatus/updated` notification only when user
action is required
- keep the public MCP auth-status API unchanged and regenerate the
app-server protocol schemas and documentation

## Why

An MCP server with an expired access token and no usable refresh token
currently fails startup without giving clients a reliable, typed
recovery signal.

The existing startup-status notification is the natural place to carry
this state. Its nullable `failureReason` keeps the recovery reason
attached to the failed startup transition without adding a one-off
notification. Internally, Codex distinguishes first-time login from
reauthentication and emits the reason only when the startup error itself
requires authentication.
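
Read as a predicate, the rule above needs both conditions to hold. A minimal sketch, assuming the two inputs are already available as booleans (the enum, function name, and parameter names are illustrative, not the app-server protocol types):

```rust
/// Hypothetical mirror of the typed failure reason.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureReason {
    ReauthenticationRequired,
}

/// Returns a reason only when stored OAuth credentials already exist AND
/// the startup error itself requires authentication.
pub fn startup_failure_reason(
    has_stored_oauth_credentials: bool,
    error_requires_authentication: bool,
) -> Option<FailureReason> {
    if has_stored_oauth_credentials && error_requires_authentication {
        Some(FailureReason::ReauthenticationRequired)
    } else {
        None
    }
}
```

First-time login (no stored credentials) and unrelated failures both map to `None`, which corresponds to a null `failureReason`.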

## User impact

App clients can prompt an existing user to reconnect an MCP server when
automatic recovery is impossible by handling a failed
`mcpServer/startupStatus/updated` notification whose `failureReason` is
`reauthenticationRequired`. Starting, ready, cancelled, unrelated
failures, and first-time setup carry no reauthentication reason.

## Companion app PR

- openai/openai#1069582

## Validation

- `just test -p codex-app-server-protocol` — 248 passed; schema fixture
tests passed
- `cargo check -p codex-app-server -p codex-tui`
- `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped
- `just test -p codex-protocol -p codex-app-server-protocol -p
codex-mcp` — 579 passed
- `just write-app-server-schema`
- `just fmt`

felixxia-oai · 2026-06-25 21:50:36 +00:00

a6d20ed297

feat(core, mcp): cache codex_apps tools in memory (#29003)

## Description

This makes Codex Apps tool reads use a shared in-memory snapshot instead
of rereading the disk cache every time `list_all_tools()` runs. Disk
still seeds the cache on startup and gets updated after successful
fetches, but it is no longer the live read path.

The core change is that `McpManager` now owns a process-scoped
`CodexAppsToolsCache`. Codex threads in the same app-server process now
share this Codex Apps in-memory tools snapshot. The snapshot is keyed by
the Codex home plus the Codex Apps identity: the active Codex auth
user/workspace and the effective Codex Apps MCP source config.

The existing hard-refresh path for the cache is preserved in this PR.
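
A keyed, process-wide snapshot of this kind could be sketched as below. The shape is an assumption for illustration: `CacheKey`, `ToolsCache`, and `store` are hypothetical, tools are reduced to names, and only `current_tools` echoes a name that appears in the benchmark table.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical cache key: Codex home plus the Codex Apps identity.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CacheKey {
    pub codex_home: String,
    pub identity: String,
    pub source_config: String,
}

/// Hypothetical in-memory tools cache shared by all threads in a process.
#[derive(Default)]
pub struct ToolsCache {
    snapshots: Mutex<HashMap<CacheKey, Arc<Vec<String>>>>,
}

impl ToolsCache {
    /// Hot read path: clones an `Arc`, with no disk read or JSON parsing.
    pub fn current_tools(&self, key: &CacheKey) -> Option<Arc<Vec<String>>> {
        self.snapshots.lock().unwrap().get(key).cloned()
    }

    /// Called after startup seeding or a successful fetch.
    pub fn store(&self, key: CacheKey, tools: Vec<String>) {
        self.snapshots.lock().unwrap().insert(key, Arc::new(tools));
    }
}
```

Repeated reads for the same key return the same shared allocation, while a different identity gets no snapshot until one is stored for it.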

## Local benchmark

I ran a local steady-state microbenchmark of the exact repeated Codex
Apps cached-tools read this PR removes, using the same real local cache
payload in both trees: `3,678,138` bytes and `381` tools. The cache file
was already warm in the OS page cache, so this measures same-process
reread/deserialization work rather than cold-disk latency or full turn
latency. Each run is 25 iterations (mimicking a turn that makes 25
inference calls).

| Version | Run 1 | Run 2 | Avg |
|---|---:|---:|---:|
| `origin/main` disk read + JSON deserialize + `filter_tools` | `50.755 ms` | `52.894 ms` | `51.825 ms` |
| This branch in-memory `current_tools` + `filter_tools` | `0.740 ms` | `0.778 ms` | `0.759 ms` |

That removes about `51 ms` from each repeated Codex Apps cached-tools
read on this machine, roughly `68x` faster for that subpath. It is
useful evidence for the hot path this PR changes, but not a claim that
every production turn gets `51 ms` faster; end-to-end impact also
depends on the rest of `list_all_tools()` and tool-payload construction.

This was measured on my M2 Max MacBook; with a slower disk the disk-read
path would be much worse, and we did see it substantially inflate turn
runtime on a slow disk.

Owen Lin · 2026-06-25 20:54:48 +00:00

703793c22e

Support OAuth for HTTP MCP servers from selected executor plugins (#28529)

## Why

#28522 routes selected-plugin HTTP MCP traffic through the owning
executor, but OAuth bootstrap and refresh still used host-local clients.
Executor-only servers therefore cannot complete discovery or login
through the same network boundary as the MCP connection.

## What changed

- adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient`
contract
- let RMCP own discovery, dynamic registration, PKCE, token exchange,
and refresh
- route auth status, persisted-token startup, and app-server login
through the server runtime while preserving the existing local discovery
path
- add optional `threadId` to `mcpServer/oauth/login` and echo it in the
completion notification
- implement RMCP's redirect policy and 1 MiB OAuth response limit over
executor HTTP
- cover selected-thread OAuth discovery and login through an
executor-only route

Depends on #28522.

jif · 2026-06-25 10:31:17 +01:00

b215961a56

Support HTTP MCP servers from selected executor plugins (#28522)

## Why

Selected executor plugins can declare both stdio and Streamable HTTP MCP
servers, but only stdio registrations were retained. That silently drops
part of the plugin's tool surface and prevents HTTP traffic from using
the owning executor's network.

## What changed

- retain selected-plugin Streamable HTTP MCP declarations alongside
stdio declarations
- route their HTTP clients through the owning executor environment
- preserve local auth-header environment references while rejecting them
for executor-hosted declarations
- cover thread isolation, refresh, and an executor-only HTTP route end
to end

jif · 2026-06-25 10:10:36 +01:00

6368937939

Represent MCP authentication with an enum (#29924)

## Why

MCP authentication has distinct OAuth and ChatGPT-session flows.
Representing that choice as `use_chatgpt_auth` makes one flow implicit
and allows the configuration model to express the distinction only
through a boolean.

ChatGPT credential forwarding also needs a first-party trust boundary. A
configurable `chatgpt_base_url` controls routing, but must not grant an
MCP server permission to receive session credentials.

This change builds on #29733, where the boolean was introduced.

## What changed

- Replace `use_chatgpt_auth` with an `auth` field backed by the
exhaustive `McpServerAuth` enum.
- Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining
the default.
- Trust only the origin derived from the existing hardcoded
`CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
- Keep configured bearer tokens and authorization headers ahead of the
selected authentication flow.
- Update config writers, schema output, fixtures, and integration-test
setup to use the enum.
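
The precedence described in this list can be sketched as a small selection function. Only the `McpServerAuth` name and the `"oauth"` / `"chatgpt"` spellings come from the description; the variant names, `EffectiveAuth`, `select_auth`, and the string-based origin comparison are assumptions made for illustration. What Codex does after withholding session credentials is not specified here, so the sketch only reports that outcome.

```rust
use std::str::FromStr;

/// Configured authentication flow; OAuth is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum McpServerAuth {
    #[default]
    OAuth,
    ChatGpt,
}

impl FromStr for McpServerAuth {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "oauth" => Ok(McpServerAuth::OAuth),
            "chatgpt" => Ok(McpServerAuth::ChatGpt),
            other => Err(format!("unknown auth value: {other}")),
        }
    }
}

/// Hypothetical outcome of auth selection for one server.
#[derive(Debug, PartialEq, Eq)]
pub enum EffectiveAuth {
    /// A configured bearer token or authorization header wins.
    ConfiguredAuthorization,
    OAuth,
    ChatGptSession,
    /// `auth = "chatgpt"` was requested for an untrusted origin.
    SessionWithheld,
}

/// Origins are compared as plain strings here; real code would parse URLs.
pub fn select_auth(
    has_configured_authorization: bool,
    auth: McpServerAuth,
    server_origin: &str,
    trusted_origin: &str,
) -> EffectiveAuth {
    if has_configured_authorization {
        return EffectiveAuth::ConfiguredAuthorization;
    }
    match auth {
        McpServerAuth::OAuth => EffectiveAuth::OAuth,
        McpServerAuth::ChatGpt if server_origin == trusted_origin => EffectiveAuth::ChatGptSession,
        McpServerAuth::ChatGpt => EffectiveAuth::SessionWithheld,
    }
}
```

The `trusted_origin` argument is meant to be derived from a hardcoded constant rather than from user configuration, which is why an overridden `chatgpt_base_url` cannot grant session credentials in this model.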

## Verification

Integration coverage exercises the complete streamable HTTP startup path
in two independent configurations:

- A directly constructed MCP configuration verifies that matching an
overridden `chatgpt_base_url` does not grant ChatGPT auth.
- A persisted `config.toml` containing an attacker-controlled
`chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary
through normal config parsing.

Both tests complete MCP initialization and tool listing and assert that
the full captured request sequence contains no authorization headers.
Separate integration coverage verifies that configured authorization
takes precedence over ChatGPT auth.

Ahmed Ibrahim · 2026-06-24 19:51:51 -07:00

f8937b7d86

Allow ChatGPT-hosted MCP servers to use session auth (#29733)

## Why

ChatGPT session authentication was inferred from the reserved Codex Apps
server name. That couples credential routing to Codex Apps-specific
behavior and prevents other MCP endpoints hosted by ChatGPT from
explicitly using the current session.

The opt-in also needs a clear security boundary: an arbitrary MCP
configuration must not be able to redirect ChatGPT credentials to
another origin.

## What changed

- Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
`false`.
- Honor the setting only when the parsed server URL has the same HTTP(S)
origin as the configured `chatgpt_base_url`; otherwise remove the
capability before startup.
- Resolve bearer tokens and static or environment-backed authorization
headers before selecting authentication, with configured authorization
taking precedence over ChatGPT session auth.
- Enable the setting for the built-in Codex Apps and hosted plugin
runtime endpoints while keeping Codex Apps caching and tool
normalization scoped to the reserved server.
- Persist the setting through MCP config rewrite paths and expose it in
the generated config schema.
- Load the current login state for `codex mcp list` so reported auth
status matches runtime behavior.

## Verification

Core integration coverage exercises the complete streamable HTTP MCP
startup path and verifies that:

- a same-origin opted-in server receives the current ChatGPT access
token;
- an explicitly configured authorization header takes precedence;
- a different-origin server completes MCP initialization and tool
listing without receiving any ChatGPT authorization header.

Ahmed Ibrahim · 2026-06-24 19:21:28 -07:00

4c0706e24a

Add a connector declaration snapshot (#29851)

## Why

Connector declarations currently enter Codex through broad plugin
capability summaries, then MCP setup, turn tooling, and `app/list` each
reconstruct the same information. That makes executor-selected
connectors difficult to add without coupling connector behavior to the
host plugin loader.

This PR introduces a small connector-owned value that later stack layers
can populate before thread startup.

## What changed

- Move the pure app-declaration parser into `codex-connectors`,
preserving declaration order and category cleanup while leaving
host-side validation and deduplication unchanged.
- Add an immutable `ConnectorSnapshot` with ordered connector IDs and
plugin display-name provenance.
- Adapt the existing local-plugin capability summaries into that
snapshot at current consumer boundaries.
- Use the snapshot for MCP tool provenance, turn connector inventory,
and `app/list`.
- Keep the crate API narrow: no test-only snapshot accessors are
exposed.

The externally visible behavior is unchanged. Connector tools still come
from the orchestrator-owned `/ps/mcp` server, and local plugin
enablement remains owned by the existing plugin loader.

## Stack scope

This is the foundation only. It does not read selected executor packages
or change thread startup. #29852 adds the executor-backed declaration
reader, and #29856 composes selected declarations into a thread
snapshot.

jif · 2026-06-24 23:24:01 +01:00

4e0f863df3

Keep executor plugin MCP paths URI-native (#29628)

## Why

Executor-owned plugin roots are `PathUri`, but MCP config normalization
still converts them into a native `Path` using the app-server host's
rules. Relative `cwd` values can therefore resolve against the wrong
filesystem when host and executor path conventions differ.

This PR keeps executor MCP paths URI-native until the selected
environment launches the server, while retaining the existing host
parser behavior.

## What changed

- Keep one shared MCP normalization path with narrow host-`Path` and
executor-`PathUri` entrypoints.
- Preserve native host resolution for locally installed plugin MCP
configs.
- For executor configs, default `cwd` to the plugin root and resolve
relative working directories with the root URI's path convention.
- Accept explicit executor `file:` URIs only when they remain within the
selected plugin root.
- Preserve the selected environment id and existing remote
environment-variable ownership rules.
- Route the executor plugin provider through the URI-native entrypoint
without converting the root on the host.
- Ensure `codex doctor` does not probe executor-owned stdio commands or
foreign working directories on the host.
- Cover foreign Windows roots, relative and absolute executor working
directories, traversal rejection, runtime resolution, and doctor
behavior.

```text
plugin root:    file:///C:/plugins/demo
configured cwd: scripts
                  |
                  v
resolved cwd:  file:///C:/plugins/demo/scripts
                  |
                  v
launch through the selected executor
```

No new provider or filesystem abstraction is introduced.
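
The lexical resolution shown in the diagram can be approximated on plain strings. This sketch is not the `PathUri` API: `resolve_executor_cwd` is a hypothetical helper that handles only the default and relative cases with `/` separators, and it leaves explicit `file:` URIs and convention-specific separators out of scope.

```rust
/// Resolves a configured `cwd` against the plugin root URI without touching
/// the host filesystem.
///
/// Assumptions: `plugin_root` is a `file:` URI string, and `configured_cwd`
/// is either absent or a relative path using `/` separators.
pub fn resolve_executor_cwd(
    plugin_root: &str,
    configured_cwd: Option<&str>,
) -> Result<String, String> {
    let root = plugin_root.trim_end_matches('/');
    let relative = match configured_cwd {
        // Default: run in the plugin root itself.
        None => return Ok(root.to_string()),
        Some(cwd) => cwd,
    };
    // Absolute paths and explicit URIs are out of scope for this sketch.
    if relative.starts_with('/') || relative.contains(':') {
        return Err(format!("not a relative path: {relative}"));
    }
    let mut segments: Vec<&str> = Vec::new();
    for segment in relative.split('/') {
        match segment {
            "" | "." => {}
            ".." => {
                // Popping past the root would escape the plugin directory.
                if segments.pop().is_none() {
                    return Err(format!("cwd escapes the plugin root: {relative}"));
                }
            }
            other => segments.push(other),
        }
    }
    if segments.is_empty() {
        Ok(root.to_string())
    } else {
        Ok(format!("{root}/{}", segments.join("/")))
    }
}
```

With root `file:///C:/plugins/demo` and cwd `scripts`, the result is `file:///C:/plugins/demo/scripts`, and `../other` is rejected because it would leave the plugin root. Nothing in the function depends on the host's path rules, which is the property the PR description emphasizes.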

## Stack

1. #29614 — add lexical `PathUri` containment.
2. #29620 — share URI-native manifest path resolution.
3. #28918 — keep selected plugin roots and resources URI-native.
4. #29626 — load executor skills without host path conversion.
5. **This PR** — resolve executor MCP working directories without host
path conversion.

jif · 2026-06-24 09:46:07 +01:00

3e39e92f03

[codex] trace MCP startup latency (#28630)

## Summary

- add trace-level instrumentation around per-server MCP setup, client
construction, initialization, and initial tool listing
- trace Codex Apps tool and server-info cache loads
- attach `server_name` to server-scoped spans so slow startup work can
be attributed to a specific MCP server

## Why

`session_init.mcp_manager_init` can occasionally be slow, but its
existing coarse span does not identify whether time is spent loading the
Codex Apps cache, constructing a client, initializing a transport, or
listing tools. These definition-level spans provide that breakdown
without changing startup behavior.

## Validation

- `just test -p codex-mcp` (87 passed)
- `just test -p codex-rmcp-client` (86 passed, 2 skipped)

rphilizaire-openai · 2026-06-23 17:46:54 -07:00

322e33512b

[codex] Fix stale approval policy in MCP test (#29696)

## Summary

- replace the stale `AskForApproval::OnFailure` reference in the MCP
connection manager test with `AskForApproval::OnRequest`
- restore `codex-mcp` test compilation after `OnFailure` was removed in
#28418

## Root cause

The test was added on main after the approval-policy removal branch had
already updated the other references, so the newly added call site was
missed when #28418 merged.

## Validation

- `just test -p codex-mcp` (90 passed)
- `just fmt`

sayan-oai · 2026-06-23 19:56:20 +00:00

00dc5ea397

chore(core) rm AskForApproval::OnFailure (#28418)

## Summary
Deletes the `OnFailure` variant of the `AskForApproval` enum. This option
has been deprecated since #11631.

## Testing
- [x] Tests pass

Dylan Hurd · 2026-06-23 12:13:54 -07:00

2cf2a6a844

Shut down superseded MCP managers on refresh (#29608)

## Summary

MCP refresh replaced the published connection manager without shutting
down the manager it superseded. If another task retained that old
manager, its stdio MCP processes stayed alive and accumulated across
refreshes.

Atomically swap in the refreshed manager, then explicitly shut down the
exact manager returned by the swap. Add a process-level regression test
that retains the old manager during refresh and verifies its stdio
process exits while the replacement remains available.

## Context

Explicit cleanup was lost when manager publication moved to `ArcSwap`.
Dropping the old manager is not a reliable shutdown boundary because
active callers can retain its `Arc` and underlying client process
handles.
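
The swap-then-shutdown sequence can be modeled with standard-library types. This is a sketch under stated assumptions: `Manager` and `Published` are hypothetical, a `Mutex` stands in for `ArcSwap`, and an atomic flag stands in for terminating real stdio processes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical manager; the flag stands in for stopping child processes.
#[derive(Default)]
pub struct Manager {
    shut_down: AtomicBool,
}

impl Manager {
    pub fn shutdown(&self) {
        self.shut_down.store(true, Ordering::SeqCst);
    }

    pub fn is_shut_down(&self) -> bool {
        self.shut_down.load(Ordering::SeqCst)
    }
}

/// Hypothetical publication slot for the current manager.
pub struct Published {
    current: Mutex<Arc<Manager>>,
}

impl Published {
    pub fn new(manager: Arc<Manager>) -> Self {
        Self { current: Mutex::new(manager) }
    }

    pub fn current(&self) -> Arc<Manager> {
        self.current.lock().unwrap().clone()
    }

    /// Swaps in the replacement, then shuts down exactly the manager that
    /// the swap returned, even if other tasks still hold an `Arc` to it.
    pub fn refresh(&self, replacement: Arc<Manager>) {
        let superseded = {
            let mut slot = self.current.lock().unwrap();
            std::mem::replace(&mut *slot, replacement)
        };
        superseded.shutdown();
    }
}
```

A caller that retained the old manager before `refresh` observes it as shut down afterwards, while the newly published manager stays live. Relying on `Drop` alone would not give this guarantee, because the retained `Arc` keeps the old manager alive.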

jif · 2026-06-23 18:29:27 +01:00

8751fd3fcb

Update rmcp to 1.8.0 (#29634)

## Summary

- Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0.
- Adapt to the new shared `peer_info` return type.
- Box OAuth status discovery at the MCP boundary to keep the expanded
future type from overflowing Rust's trait recursion limit.

This brings in custom OAuth HTTP client support from
[modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).

jif · 2026-06-23 15:25:28 +01:00

bbe1006890

Fix Codex Apps auth elicitation hang (#29615)

## Summary
- Require the reserved Codex Apps MCP server name to be present in the
connection manager before treating it as host-owned.
- Update auth elicitation tests to model an installed host-owned Codex
Apps server without sending startup events to the test session.

## Why
PR #29518 replaced the old host-owned flag with a name-only check. That
made non-host-owned tests with the reserved `codex_apps` name enter auth
elicitation and wait forever for a response.

jif · 2026-06-23 13:45:42 +02:00

55bc38a22b

Allow codex sandbox to consume MCP sandbox state (#29358)

## Summary

- let `codex sandbox` accept the JSON value from
`codex/sandbox-state-meta`
- require the payload `permissionProfile` instead of falling back to
ambient permissions
- reuse the existing macOS, Linux, and Windows launch paths, treating
external sandbox state conservatively as read-only
- let opaque forwarders add runtime read roots and disable direct
network access without decoding the payload

Builds on #29113, which is now on `main`.

## Tests

- `just test -p codex-cli debug_sandbox::tests`
- `cargo build -p codex-rmcp-client --bin test_stdio_server`
- `just test -p codex-core
stdio_mcp_tool_call_includes_sandbox_state_meta`
- `just test -p codex-mcp`
- `just fmt`

jif · 2026-06-23 10:17:52 +02:00

d2484697b1

Group Codex Apps client setup (#29583)

## Why

`McpConnectionManager::new` classified the Codex Apps server twice: once
to create its tools cache context and again to select its runtime
authentication provider. Keeping those decisions separate makes it
harder to see that they belong to the same server-specific setup path.

## What changed

- Group Codex Apps cache and authentication setup under one explicit
branch.
- Keep regular MCP server setup in the corresponding `else` branch.
- Limit environment bearer-token inspection to the Codex Apps path where
it affects runtime authentication.

Ahmed Ibrahim · 2026-06-23 00:53:17 -07:00

266dcbfe5b

Remove redundant Codex Apps cache guard (#29575)

## Why

Codex Apps cache writes are already restricted to Codex Apps call paths:
startup invokes the helper only from the Codex Apps branch, and hard
refresh operates on the reserved Codex Apps server directly. Rechecking
the server name inside the cache helper duplicates that classification
and leaves the helper with an argument that cannot change valid
behavior.

## What changed

- Remove the redundant server-name check and parameter from the cache
writer.
- Rename the helper to `write_codex_apps_tools_cache` to reflect its
narrower contract.
- Update production and test callsites to use the simplified API.

Ahmed Ibrahim · 2026-06-23 00:31:56 -07:00

83c4934e45

Centralize Codex Apps client handling (#29528)

## Why

Codex Apps-specific behavior is currently distributed across cache
helpers, startup, tool conversion, and model-visible annotation. Each
layer independently checks the reserved server name, which obscures the
boundary between trusted host-owned connector metadata and regular MCP
server data.

Classifying the server once when `AsyncManagedClient` is created gives
the client a single source of truth and makes the two processing paths
explicit.

## What changed

- Record whether an `AsyncManagedClient` represents the Codex Apps
server at construction time.
- Route startup cache loading, cache persistence, and cache telemetry
through the Codex Apps branch.
- Split uncached tool conversion between Codex Apps normalization and
regular MCP metadata sanitization.
- Split model-visible schema and plugin provenance handling along the
same boundary.
- Remove redundant server-name guards from helpers that are now called
only from the Codex Apps branch.

## Verification

- Preserve behavioral coverage that verifies Codex Apps connector
metadata and the complete converted `ToolInfo` shape.

## Stack

Depends on #29518.

Ahmed Ibrahim · 2026-06-23 00:00:25 -07:00

f0ad028a74

Remove redundant Codex Apps manager flag (#29518)

## Why

Codex Apps server admission is already decided before
`McpConnectionManager` is constructed. `effective_mcp_servers` and
`effective_mcp_servers_from_configured` remove the server when the apps
feature or required authentication is unavailable, so storing the same
decision on the manager duplicates state that can drift from the
effective server map.

## What changed

- Remove `host_owned_codex_apps_enabled` from `McpConnectionManager` and
its constructor.
- Identify the host-owned Codex Apps server by its reserved server name
once it is present in the effective server map.
- Remove the now-unused flag calculations and constructor arguments from
production and test callsites.

Ahmed Ibrahim · 2026-06-22 23:19:42 -07:00

a22e3d0b82

mcp: accept foreign absolute cwd for remote stdio (#29493)

## Why

Remote stdio MCP servers can run in an environment whose path convention
differs from the Codex host. A Windows cwd such as
`C:\Users\openai\share` is absolute for the executor but was rejected by
a POSIX orchestrator.

Built on #29501, now merged, which only clarifies the host-native
`PathUri` constructor name.

## What changed

- Deserialize MCP cwd values as `LegacyAppPathString` so config does not
apply host path rules.
- Interpret that spelling as host-native for local launches and convert
it to `PathUri` at executor launch.
- Skip host filesystem and command resolution checks for remote stdio in
`codex doctor`.
- Add host-independent config and executor-boundary coverage using the
foreign path convention for each test platform.
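
As a rough illustration of why a host-native check rejects `C:\Users\openai\share` on a POSIX orchestrator, the sketch below tests absoluteness under both conventions. All function names are hypothetical; the PR itself relies on `LegacyAppPathString` and `PathUri`, whose APIs are not reproduced here.

```rust
/// Returns true if `path` is absolute under POSIX rules (leading `/`).
fn is_posix_absolute(path: &str) -> bool {
    path.starts_with('/')
}

/// Returns true if `path` is absolute under Windows rules: a drive letter
/// followed by `:` and a separator (`C:\...`), or a UNC prefix (`\\server\...`).
fn is_windows_absolute(path: &str) -> bool {
    let bytes = path.as_bytes();
    let has_drive_root = bytes.len() >= 3
        && bytes[0].is_ascii_alphabetic()
        && bytes[1] == b':'
        && (bytes[2] == b'\\' || bytes[2] == b'/');
    has_drive_root || path.starts_with("\\\\")
}

/// Hypothetical helper: accept a cwd that is absolute in either convention,
/// independent of the host the orchestrator runs on.
fn is_absolute_for_some_executor(path: &str) -> bool {
    is_posix_absolute(path) || is_windows_absolute(path)
}
```

A check like this only decides whether the spelling is plausible for some executor; whether the directory exists still has to be determined in the environment that launches the server.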

## Validation

- `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p
codex-rmcp-client` (408 passed)
- `just test -p codex-cli -p codex-rmcp-client` (372 passed)
- `cargo check --workspace --tests`
- `just test` (11,311 passed; 43 unrelated environment/timing failures)
- `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p
codex-mcp-extension -p codex-rmcp-client -p codex-tui`

Adam Perry @ OpenAI · 2026-06-23 01:33:51 +00:00

67009bc53f

Add config toggles for orchestrator skills and MCP (#28942 )

## Why

Orchestrator-provided skills and Codex Apps MCP tools add model-visible
instructions, resources, and tools beyond the local workspace. Hosts
need config-level switches to disable those orchestrator-owned surfaces
independently, without disabling regular skills or regular MCP servers.

## What changed

- Adds `[orchestrator.skills].enabled` and `[orchestrator.mcp].enabled`
config entries, both defaulting to `true`.
- Includes the new settings in `config.schema.json` and in the config
lock so resolved thread configuration preserves the same orchestrator
exposure decisions.
- Threads `orchestrator.skills.enabled` through the app-server skills
extension so disabled orchestrator skills do not expose the `skills`
namespace or inject orchestrator skill context.
- Gates Codex Apps MCP exposure, app instructions, and app auth
eligibility on `orchestrator.mcp.enabled` while leaving non-Codex-Apps
MCP tools available.
- Updates the thread-manager sample config to disable both
orchestrator-owned surfaces.
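
The gating described above can be sketched as follows. This is a minimal model, not the real config types: `OrchestratorToggles`, `Tool`, and `visible_tools` are hypothetical names, and only the `true` defaults and the "Codex Apps tools are removed, regular MCP tools stay" behavior come from the description.

```rust
/// Hypothetical mirror of `[orchestrator.skills].enabled` and
/// `[orchestrator.mcp].enabled`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct OrchestratorToggles {
    skills_enabled: bool,
    mcp_enabled: bool,
}

impl Default for OrchestratorToggles {
    // Both entries default to `true`.
    fn default() -> Self {
        Self { skills_enabled: true, mcp_enabled: true }
    }
}

/// Hypothetical tool record; `from_codex_apps` marks orchestrator-owned tools.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Tool {
    name: String,
    from_codex_apps: bool,
}

/// Drops Codex Apps tools when `mcp_enabled` is false; regular MCP tools
/// are always kept.
fn visible_tools(toggles: OrchestratorToggles, tools: Vec<Tool>) -> Vec<Tool> {
    tools
        .into_iter()
        .filter(|tool| toggles.mcp_enabled || !tool.from_codex_apps)
        .collect()
}
```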

## Verification

- Added config parsing, loading, defaulting, and schema coverage for the
new settings.
- Added MCP exposure coverage that `orchestrator.mcp.enabled = false`
removes Codex Apps tools while preserving regular MCP tools.
- Added app-server coverage that `orchestrator.skills.enabled = false`
prevents orchestrator skill tools, prompts, and resource reads from
reaching the model turn.

jif · 2026-06-19 14:42:26 +02:00

81b000421d

[codex] Remove hardcoded app ID filters (#28947 )

## Summary

- remove the duplicated originator-specific connector ID denylists
- stop filtering connector directory/accessibility results and
live/cached Codex Apps MCP tools by hardcoded connector ID
- remove the now-unused `codex-login` dependency from
`codex-utils-plugins`
- update regression coverage so formerly blocked connector IDs are
preserved

## Why

The client-side policy was duplicated across crates, used opaque IDs
without ownership or expiry information, and could drift between app
listing and MCP tool behavior. Server-provided visibility,
authorization, plugin discoverability, accessibility, enabled-state
handling, and consequential-tool approval templates remain unchanged.

## Validation

- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `git diff --check`
- confirmed the final diff contains no hardcoded denylist symbols

A targeted `codex-mcp` test build spent an unusually long time in local
compilation/linking. Its first attempt exposed a test-only `PartialEq`
assertion issue, which was corrected. A follow-up non-linking `cargo
check -p codex-mcp --tests` was still running when this draft was
opened; CI should provide the complete Rust validation.

Eric Ning · 2026-06-18 20:29:01 +00:00

29eb434bc5

Support openai/form extended form elicitations (#27500 )

# Summary
Allow App Server clients to opt into `openai/form` MCP elicitations.

Gabriel Peal · 2026-06-18 11:54:49 -07:00

21a599fa56

Scope MCP sandbox metadata to server environment (#28914 )

Scope MCP sandbox metadata to the MCP server's owning environment.

Previously, `codex/sandbox-state-meta` always used the turn's primary
cwd and rebuilt a legacy sandbox policy from that cwd. That can be wrong
for MCP servers owned by a different execution environment.

This now sends the owning environment cwd as a `file:` URI in
`sandboxCwd`, keeps `permissionProfile` as the permission source of
truth, and omits sandbox-state metadata when a non-default server
environment is not selected for the turn. Local/default MCP servers keep
the existing fallback cwd behavior.

Tests:
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just test -p codex-mcp`
- `just test -p codex-core mcp_sandbox_cwd`
- `cargo build -p codex-rmcp-client --bin test_stdio_server`
- `just test -p codex-core
stdio_mcp_tool_call_includes_sandbox_state_meta`

jif · 2026-06-18 19:31:07 +02:00

790213ded0

[mcp] Increase default tool timeout to 300 seconds (#28234 )

Summary
- Increase the default MCP tool-call timeout from 120 to 300 seconds.

Validation
- `just test -p codex-mcp`
- `just fmt`

Alex Daley · 2026-06-15 16:07:01 -04:00

41db093aa0

skills: cache orchestrator resources per thread (#28336 )

## Why

Hosted orchestrator skills are read through the remote MCP resource
server. Within one thread, the same catalog or skill resource can be
requested multiple times by prompt injection and the `skills.list` /
`skills.read` tools. Re-fetching adds latency and can make those
surfaces observe different remote contents during the same thread.

This is a follow-up to #28333: orchestrator skills remain limited to
threads without a local executor, and those threads now get a stable
per-thread view of the remote skill data they use.

## What changed

- Reuse the existing per-thread orchestrator catalog snapshot for
`skills.list` and `skills.read` availability checks.
- Cache successful orchestrator resource reads by authority, package,
and resource so prompt injection and tool calls share the same contents.
- Keep the cache memory-only and bounded to 100 resources and 8 MiB per
thread.
- Leave host and executor skill reads unchanged, and do not cache failed
remote reads.
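
A minimal sketch of such a cache, under stated assumptions: the type and method names are hypothetical, and the behavior when a bound would be exceeded (skip caching rather than evict) is an assumption, since the description only gives the limits.

```rust
use std::collections::HashMap;

const MAX_RESOURCES: usize = 100;
const MAX_BYTES: usize = 8 * 1024 * 1024; // 8 MiB

/// Cache key: authority, package, and resource, as listed above.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ResourceKey {
    authority: String,
    package: String,
    resource: String,
}

/// Hypothetical memory-only, per-thread cache.
#[derive(Default)]
struct ThreadResourceCache {
    entries: HashMap<ResourceKey, String>,
    total_bytes: usize,
}

impl ThreadResourceCache {
    /// Returns cached contents, or calls `fetch` and caches a successful result.
    fn read_through<F>(&mut self, key: ResourceKey, fetch: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        if let Some(hit) = self.entries.get(&key) {
            return Ok(hit.clone());
        }
        // Failed reads are returned to the caller but never cached.
        let contents = fetch()?;
        // Assumption: a read that would exceed either bound is not cached.
        let fits = self.entries.len() < MAX_RESOURCES
            && self.total_bytes + contents.len() <= MAX_BYTES;
        if fits {
            self.total_bytes += contents.len();
            self.entries.insert(key, contents.clone());
        }
        Ok(contents)
    }
}
```

With this shape, prompt injection and the `skills.read` tool would both go through `read_through` and observe the same contents for a given key within one thread.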

## Verification

- Extended the app-server MCP resource integration test to read the same
hosted skill resource twice and verify that the remote server receives
one read.
- The same test verifies that catalog discovery and the selected skill's
main prompt are each fetched only once per thread.

jif · 2026-06-15 20:20:19 +02:00

0afe559318

Add selected-plugin precedence and attribution to the MCP catalog (#27884 )

## Why

**In short:** this PR resolves already-discovered MCP registrations. It
does not read selected plugins or discover their MCP servers.

The resolved MCP catalog currently builds config and auto-discovered
plugin registrations before runtime contributors are applied. A
thread-selected plugin needs a distinct precedence tier in that same
initial resolution pass: otherwise a disabled lower-precedence winner
can leave stale name-level state behind, and the winning MCP tools
cannot be attributed to the selected package reliably.

This PR adds that catalog boundary before executor discovery is
connected.

## What changed

- Added an explicit selected-plugin registration tier between
auto-discovered plugins and explicit config.
- Collected selected-plugin contributions before the initial catalog
build, while leaving compatibility and generic extension overlays in
their existing runtime phase.
- Retained the winning plugin ID and display name directly on
plugin-owned catalog registrations.
- Derived MCP tool provenance from the winning catalog entry instead of
joining against local-only plugin summaries.
- Retained the winning selected server's tool approval policy in the
running connection manager, so a selected registration cannot inherit
approval behavior from a losing local plugin.
- Kept remembered approval session-scoped for selected plugins until
there is an authority-aware persistence contract; Codex will not write
approval back to an unrelated local plugin.
- Preserved existing name-level disabled vetoes for discovered plugins
and config, while keeping a selected package's own disabled registration
scoped to that registration.
- Preserved deterministic selection order and existing config,
compatibility, and extension precedence.

The resulting order is:

```text
auto-discovered plugin
  < selected plugin
  < explicit config
  < compatibility registration
  < extension overlay
```
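
One way to picture this ordering is an enum whose derived `Ord` follows declaration order. The sketch below is illustrative only: `Tier`, `Registration`, and `resolve` are hypothetical names, and letting the later contribution win within a tier is an assumption, not something the description specifies.

```rust
use std::collections::HashMap;

/// Tiers in ascending precedence; `derive(Ord)` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    AutoDiscoveredPlugin,
    SelectedPlugin,
    ExplicitConfig,
    CompatibilityRegistration,
    ExtensionOverlay,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Registration {
    server_name: String,
    tier: Tier,
    /// Owning plugin, if any; retained on the winner for attribution.
    plugin_id: Option<String>,
}

/// Keeps, per server name, the registration from the highest tier.
/// Assumption: within a tier, the later contribution replaces the earlier one.
fn resolve(registrations: Vec<Registration>) -> HashMap<String, Registration> {
    let mut winners: HashMap<String, Registration> = HashMap::new();
    for reg in registrations {
        let keep_current = winners
            .get(&reg.server_name)
            .map_or(false, |current| current.tier > reg.tier);
        if !keep_current {
            winners.insert(reg.server_name.clone(), reg);
        }
    }
    winners
}
```

Because the winning `Registration` carries its own `plugin_id`, tool provenance can be read from the catalog entry directly rather than joined against a separate plugin summary.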

## Behavior and scope

This is a catalog and provenance change only. No production host
contributes selected-plugin MCP registrations yet, so existing local MCP
behavior remains unchanged.

The stacked follow-up, #27870, installs the executor plugin provider
that produces these registrations. App-server activation remains a
separate final step.

## Verification

Focused tests cover precedence, deterministic selected-plugin conflicts,
disabled-veto behavior across catalog phases, managed requirements
before selected-plugin resolution, winning-server approval policy, and
attribution when local and selected packages share an ID or server name.
CI owns execution of the test suite.

jif · 2026-06-15 11:10:51 +02:00

c3a479620f

feat: use encrypted local secrets for MCP OAuth (#27541 )

## Summary

- store MCP OAuth credentials in the configured auth credential backend
- support encrypted-local OAuth storage, including legacy keyring
migration
- propagate the credential backend through MCP refresh, session, CLI,
and app-server paths

## Stack

1. #27504 — config and feature flag
2. #27535 — auth-specific secret namespaces
3. #27539 — encrypted CLI auth storage
4. this PR — encrypted MCP OAuth storage

This is a parallel review stack; the original #17931 remains unchanged.

## Tests

- `just test -p codex-rmcp-client` (the transport round-trip test passed
after building the required `codex` binary and retrying)
- `just test -p codex-mcp`
- `just test -p codex-app-server
refresh_config_uses_latest_auth_keyring_backend`
- `just test -p codex-core
refresh_mcp_servers_is_deferred_until_next_turn`
- `just test -p codex-cli mcp`
- `just fix -p codex-rmcp-client -p codex-mcp -p codex-core -p codex-cli
-p codex-app-server -p codex-protocol`
- `just bazel-lock-check`

Celia Chen · 2026-06-12 22:03:51 +00:00

9915d34684

Extract shared plugin MCP config parsing (#27863 )

## Why

We want a thread-selected plugin to eventually expose stdio MCP servers
that run on the executor owning that plugin.

The existing plugin MCP parser lived inside `core-plugins` and was
coupled to the host filesystem loader. Reusing it from an executor
provider would either duplicate MCP normalization or make the plugin
package layer own MCP runtime semantics. This PR creates the shared
MCP-owned boundary first.

In simple terms:

```text
plugin .mcp.json
        |
        v
shared parser in codex-mcp
        |
        +-- Declared placement: preserve current local-plugin behavior
        |
        +-- Environment placement: produce config bound to one executor
```

This builds on the authority-bound plugin descriptors from #27692. It
intentionally does not discover, register, or launch executor MCP
servers yet.

## What changed

- Moved plugin MCP file parsing and normalization from `core-plugins`
into `codex-mcp`.
- Kept support for both existing file shapes: a top-level server map and
an object containing `mcpServers`.
- Kept per-server failure isolation: one invalid server does not discard
valid siblings, while malformed top-level JSON still fails the whole
file.
- Updated the existing local plugin loader to use `Declared` placement,
preserving its current transport, OAuth, relative `cwd`, and error
behavior.
- Added `Environment` placement for the next stacked PR:
  - the selected environment ID overrides anything declared by the plugin;
  - missing stdio `cwd` defaults to the plugin root;
  - relative `cwd` is resolved beneath the plugin root and cannot traverse
    outside it;
  - bare or source-less environment-variable references resolve on a
    non-local executor;
  - explicit orchestrator environment-variable forwarding is rejected for
    executor-owned plugins.
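
The `cwd` rules for `Environment` placement can be sketched as below. This is a lexical approximation using host `std::path` rules; `resolve_plugin_cwd` is a hypothetical name, and the real parser may differ, for example in how it treats foreign path conventions or symlinks.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper for a plugin-declared stdio `cwd`:
/// - `None` defaults to the plugin root;
/// - a relative path is joined beneath the root and may not climb above it;
/// - an absolute path is returned unchanged.
fn resolve_plugin_cwd(plugin_root: &Path, cwd: Option<&str>) -> Result<PathBuf, String> {
    let Some(cwd) = cwd else {
        return Ok(plugin_root.to_path_buf());
    };
    let cwd = Path::new(cwd);
    if cwd.is_absolute() {
        return Ok(cwd.to_path_buf());
    }
    let mut resolved = plugin_root.to_path_buf();
    let mut depth = 0usize; // components pushed beneath the root so far
    for component in cwd.components() {
        match component {
            Component::CurDir => {}
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::ParentDir => {
                if depth == 0 {
                    return Err(format!("cwd `{}` escapes the plugin root", cwd.display()));
                }
                resolved.pop();
                depth -= 1;
            }
            // Rooted-but-not-absolute spellings are rejected in this sketch.
            Component::RootDir | Component::Prefix(_) => {
                return Err(format!("cwd `{}` is not a plain relative path", cwd.display()));
            }
        }
    }
    Ok(resolved)
}
```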

## User impact

None in this PR. Existing local plugin MCP loading follows the same
behavior through the shared parser. The executor placement mode is not
connected to thread startup until the follow-up registration PR.

## Assumptions

- A selected capability root's environment is authoritative. A plugin
cannot redirect its stdio process to the orchestrator or another
executor.
- Relative working directories belong under the plugin package root.
Explicit absolute working directories remain valid within the owning
environment.
- For a non-local executor, unqualified environment-variable names refer
to that executor. Reading an orchestrator variable requires an explicit
contract and is rejected for now.
- Parsing only produces normalized `McpServerConfig` values. Process
startup remains owned by the existing MCP runtime and connection
manager.

## Follow-ups

1. Add the executor MCP provider and catalog registration: read the
selected plugin's MCP config through the same executor filesystem,
support stdio only, freeze the result per active thread, apply managed
policy, and resolve name collisions as discovered plugin < selected
plugin < explicit config.
2. Install that provider in app-server and add an end-to-end test
proving `thread/start.selectedCapabilityRoots` launches and calls the
MCP tool on the selected executor, preserves the frozen registration
across refresh, and does not expose it to an unselected thread.
3. After the initial executor-stdio vertical, define
resume/fork/environment-replacement semantics, executor HTTP placement,
warning delivery, common MCP tool-context bounds, and move remaining MCP
source composition above core.

## Verification

- `cargo check -p codex-mcp -p codex-core-plugins --tests`
- `just bazel-lock-check`
- Added focused parser coverage for legacy local normalization, executor
authority, working-directory handling, and environment-variable
sourcing.

jif · 2026-06-12 15:10:05 +02:00

17b9f4843e

Resolve MCP server registrations through a catalog (#27634 )

## Why

MCP servers currently come from user config, local plugins,
compatibility Apps synthesis, and host extensions. Those sources were
composed by mutating a shared map, leaving registration identity,
precedence, removal, and provenance implicit in assembly order.

Before adding executor-owned MCPs, Codex needs one durable resolution
boundary above `McpConnectionManager`. This PR introduces that boundary
while preserving current server configuration, policy, and runtime
behavior. Executor-scoped registrations and explicit policy layers
remain follow-ups.

## What changed

- Add typed `McpServerRegistration` inputs and an immutable
`ResolvedMcpCatalog` in `codex-mcp`.
- Retain each registration's complete `McpServerConfig`, including its
environment binding, while recording its source and provenance.
- Preserve the existing structural precedence between plugin, config,
compatibility, and ordered extension sources.
- Resolve equal-precedence actions by contribution order; provenance IDs
are used only for diagnostics and cannot affect the winner.
- Preserve extension removals and the existing name-scoped `enabled =
false` veto.
- Report same-tier conflicts with every contender and the final catalog
outcome, including whether the winning action registers or removes the
server.
- Require MCP contributors to provide a stable diagnostic identity.
- Derive materialized server maps and plugin ownership from the resolved
catalog.

`McpConnectionManager`, transport startup, tool calls, and resource
routing continue to consume the same effective `McpServerConfig` values.

## Scope

This PR does not add new MCP capabilities or change user-visible
behavior. It does not add executor plugin discovery, thread-scoped
registrations, dynamic refresh generations, or new user/managed policy
semantics.

## Verification

- Added focused catalog coverage for source precedence, complete
configuration preservation, disabled vetoes, plugin ownership,
contribution-order tie breaking, removal outcomes, and conflict
diagnostics.
- Extended hosted Apps coverage for ordered extension removal and
Apps-disabled hosts with and without the hosted extension installed.
- `cargo check -p codex-mcp --tests -p codex-extension-api -p
codex-core`

jif · 2026-06-11 21:54:52 +02:00

4a5a676499

skills: make backend plugin skills invocable without an executor (#27387 )

## Why

#27198 made the extension-owned `codex_apps` MCP connection the hosted
plugin runtime, but its `mcp/skill` resources still bypassed the skills
extension. App-server could list and read those resources through
generic MCP APIs, but a thread with no selected environment did not
expose them in the model's skills catalog or load their `SKILL.md`
through `$skill`.

Hosted skills should stay remote while using the same typed catalog,
source authority, deduplication, bounded contextual catalog, and
selected-skill prompt injection as host and executor skills. They should
not be downloaded or exposed as ambient filesystem paths.

## What changed

- Add a session-scoped `McpResourceClient` over the replaceable MCP
connection manager so resource list/read calls follow startup and
refresh replacements.
- Add a `BackendSkillProvider` that pages `codex_apps` resources,
accepts bounded and validated `mcp/skill` entries, and reads a selected
skill's `SKILL.md` through the same MCP connection.
- Register the remote provider in app-server and include it in the
skills catalog even when a thread has no selected capability roots or
executor.
- Contribute hosted skill metadata through the bounded
`AvailableSkillsInstructions` developer-context path, exclude remote
entries from per-turn catalog injection, and classify `<skills>`
messages as contextual developer content so rollback can trim and
rebuild them correctly.

## Testing

- Extend the app-server MCP resource integration test with
`environments: []` to exercise two-page discovery, filter a
non-`mcp/skill` resource, verify the escaped developer catalog entry and
user-role `<skill>` fragment containing the fetched `SKILL.md`, and
preserve generic MCP resource reads.
- Add core event-mapping coverage that classifies `<skills>` developer
messages as contextual history.

jif · 2026-06-11 11:28:16 +02:00

a287c5dffd

Use latest-wins MCP manager replacement (#27259 )

## Summary

We originally addressed startup prewarming holding the read side of
`RwLock<McpConnectionManager>` by snapshotting tool-list state. Review
feedback identified the broader ownership problem: the outer
synchronization should only publish or retrieve the current manager,
while MCP operations rely on the manager's internal synchronization. A
follow-up preserved operation retirement with a separate gate, but
further review questioned whether that synchronization was actually
required and whether we could support latest-wins replacement instead.

This PR now stores the current MCP manager in `ArcSwap`. Each operation
uses `load_full()` to obtain an owned `Arc<McpConnectionManager>`, then
performs MCP I/O without retaining the publication mechanism. Refresh
cancels obsolete startup work, constructs a replacement, and atomically
publishes it. New operations see the latest manager, while operations
that already loaded the previous manager retain a valid handle. Refresh
happens at a turn boundary, so there should be no active user tool calls
to drain.
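
The latest-wins behavior can be approximated with the standard library alone. The sketch below is not the PR's implementation: it substitutes `Mutex<Arc<_>>` for `ArcSwap` so that it needs no external crate, holds the lock only long enough to clone or swap the `Arc`, and uses a stand-in `Manager` type in place of `McpConnectionManager`.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for `McpConnectionManager`; carries only a generation number.
struct Manager {
    generation: u32,
}

/// Hypothetical publication slot (the PR uses `ArcSwap` instead).
struct CurrentManager {
    slot: Mutex<Arc<Manager>>,
}

impl CurrentManager {
    fn new(initial: Manager) -> Self {
        Self { slot: Mutex::new(Arc::new(initial)) }
    }

    /// Analogous to `load_full()`: returns an owned handle, so callers do
    /// their I/O without holding the publication mechanism.
    fn load_full(&self) -> Arc<Manager> {
        Arc::clone(&self.slot.lock().unwrap())
    }

    /// Publishes a replacement. The previous manager stays alive until the
    /// last handle that loaded it is dropped.
    fn publish(&self, next: Manager) {
        *self.slot.lock().unwrap() = Arc::new(next);
    }
}
```

An operation that called `load_full()` before `publish` keeps working against the old manager, while any later `load_full()` observes the replacement.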

Git history supports dropping the outer `RwLock`. It was introduced in
`03ffe4d595` on November 17, 2025 for non-blocking MCP startup: the
session published an empty manager, startup initialized that same object
while holding the write lock, and readers waited for initialization.
`7cd2e84026` on February 19, 2026 removed that two-phase initialization
in favor of constructing a fresh manager and swapping it in, explicitly
noting that `Option` or `OnceCell` could replace the placeholder design.
Hot reload later reused the existing lock to publish a replacement, but
I found no indication that the lock was introduced to guarantee
in-flight tool calls finish before refresh or shutdown.

Terminal shutdown remains separate from refresh: it aborts startup
prewarming and active tasks before shutting down the current manager, so
tool calls may be interrupted and no model WebSocket work continues
after shutdown. Focused regression coverage exercises pending tool-list
cancellation, deferred refresh, and startup-prewarm shutdown.

Charlie Marsh · 2026-06-10 08:33:21 -07:00

41b4fabbb4

Use plugin-service MCP as the hosted plugin runtime (#27198 )

## Stack

- Base: #27191
- This PR is the third vertical and should be reviewed against
`jif/external-plugins-2`, not `main`.

## Why

#27191 moves the host-owned Apps MCP registration behind an extension
contributor, but deliberately preserves the existing endpoint-selection
feature while that contribution contract lands. App-server can therefore
resolve the server through extensions, yet the hosted plugin endpoint is
still selected through temporary `apps_mcp_path_override` plumbing.

That is not the long-term plugin model. A plugin can bundle skills,
connectors, MCP servers, and hooks, and those components do not all need
the same source or execution environment. In particular, an
authenticated HTTP MCP server can expose plugin capabilities directly
from a backend without an executor or an orchestrator filesystem.

This PR completes that hosted vertical. App-server's MCP extension now
owns the aggregate hosted plugin runtime at `/ps/mcp`. Connector actions
continue to arrive as MCP tools, while backend-provided skills arrive as
MCP resources and use Codex's existing resource list/read paths. No
second backend client, skill filesystem, or generic plugin activation
framework is introduced.

The backend route remains the hosted implementation. This change
replaces Codex's temporary endpoint-selection mechanism, not the service
behind the endpoint.

## What changed

### Hosted plugin runtime

The MCP extension now contributes `codex_apps` as the hosted plugin
runtime rather than as a configurable Apps endpoint:

- `https://chatgpt.com` resolves to
`https://chatgpt.com/backend-api/ps/mcp`;
- a bare custom ChatGPT base resolves to `/api/codex/ps/mcp`;
- the existing product-SKU header and ChatGPT authentication behavior
are preserved;
- executor availability is never consulted for this streamable HTTP
transport.

The same MCP connection carries both component shapes supported by the
hosted endpoint:

- connector actions are discovered and invoked as MCP tools;
- hosted skills are enumerated and read as MCP resources through the
existing `list_mcp_resources` and `read_mcp_resource` paths.

This keeps component access in the subsystem that already owns the
protocol instead of downloading backend skills into an orchestrator
filesystem or inventing a parallel hosted-skill client.

### Explicit runtime ordering

`McpManager` now resolves the reserved `codex_apps` entry in three
ordered phases:

1. install the legacy Apps fallback for compatibility;
2. apply ordered extension `Set` or `Remove` overlays;
3. apply the final ChatGPT-auth gate without synthesizing the server
again.

This ordering is important:

- an ordinary configured or plugin MCP server cannot claim the
auth-bearing `codex_apps` name;
- an extension-contributed hosted runtime wins over the fallback;
- an extension `Remove` remains authoritative;
- a host without the MCP extension retains the legacy Apps endpoint and
current local-only behavior.

The temporary `legacy_apps_mcp_loader_enabled` coordination flag is no
longer needed.
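
A simplified model of the three phases, assuming a flat name-to-config map: `ServerConfig`, `Overlay`, and `resolve_runtime` are hypothetical names, and the real `McpManager` carries far more state than a URL.

```rust
use std::collections::BTreeMap;

const CODEX_APPS: &str = "codex_apps";

#[derive(Debug, Clone, PartialEq, Eq)]
struct ServerConfig {
    url: String,
}

/// Ordered extension overlays, applied in contribution order.
enum Overlay {
    Set(String, ServerConfig),
    Remove(String),
}

fn resolve_runtime(
    configured: BTreeMap<String, ServerConfig>,
    legacy_fallback: Option<ServerConfig>,
    overlays: Vec<Overlay>,
    has_chatgpt_auth: bool,
) -> BTreeMap<String, ServerConfig> {
    let mut servers = configured;
    // Ordinary configured or plugin servers cannot claim the reserved name.
    servers.remove(CODEX_APPS);
    // Phase 1: install the legacy Apps fallback for compatibility.
    if let Some(fallback) = legacy_fallback {
        servers.insert(CODEX_APPS.to_string(), fallback);
    }
    // Phase 2: apply ordered extension overlays; later overlays win.
    for overlay in overlays {
        match overlay {
            Overlay::Set(name, config) => {
                servers.insert(name, config);
            }
            Overlay::Remove(name) => {
                servers.remove(&name);
            }
        }
    }
    // Phase 3: the auth gate only removes; it never re-synthesizes the server.
    if !has_chatgpt_auth {
        servers.remove(CODEX_APPS);
    }
    servers
}
```

Running the gate last, and making it removal-only, is what keeps an extension `Remove` authoritative: nothing after phase 2 can put the entry back.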

### Remove the path override

The `apps_mcp_path_override` feature and its runtime plumbing are
removed, including:

- the feature registry entry and structured feature config;
- `Config` and `McpConfig` fields;
- config schema output;
- config-lock materialization;
- URL override handling in `codex-mcp`.

Existing boolean and structured forms still deserialize as ignored
compatibility input. They are omitted from new serialized config, and
config-lock comparison normalizes the removed input so older locks
remain replayable.

### App-server coverage

App-server MCP fixtures now serve the hosted route at
`/api/codex/ps/mcp`. Existing resource-read and tool/elicitation flows
therefore exercise the extension-owned endpoint rather than succeeding
through the legacy fallback.

The stack also adds the missing `codex_chatgpt::connectors` re-export
for the manager-backed connector helper introduced in #27191.

## Compatibility

- App-server installs the extension and uses `/ps/mcp` for the hosted
runtime.
- CLI and other hosts that do not install the extension retain the
legacy Apps endpoint.
- Apps disabled or non-ChatGPT authentication removes `codex_apps` from
the effective runtime view.
- Existing local plugins, local skills, executor-selected skills,
configured MCP servers, and MCP OAuth behavior are otherwise unchanged.
- Backend plugin enablement remains account/workspace state owned by the
hosted endpoint; this PR does not add thread-local backend plugin
selection.

## Architectural fit

The stack now proves two independent runtime shapes:

1. #27184 resolves filesystem-backed skills through the executor that
owns a selected root.
2. #27191 and this PR resolve a backend-hosted HTTP MCP through an
extension with no executor.

Together they preserve the intended separation:

- selection identifies a plugin/root when explicit selection is needed;
- each component's owning extension resolves its concrete access
mechanism;
- execution stays with the runtime required by that component;
- existing skills, MCP, connector, and hook subsystems remain the
downstream consumers.

## Planned follow-ups

1. **Executor stdio MCP:** selecting an executor plugin registers a
manifest-declared stdio MCP server and executes it in the environment
that owns the plugin.
2. **Optional backend selection:** only if CCA needs thread-local
selection distinct from backend account/workspace enablement, add a
concrete backend-owned capability location and surface those selected
skills through the skills catalog.
3. **Connector metadata and hooks:** activate those plugin components
through their existing owning subsystems, with executor hooks remaining
environment-bound.
4. **Propagation and persistence:** define explicit resume, fork,
subagent, refresh, and environment-removal semantics once selected roots
have multiple real consumers.
5. **Local convergence:** migrate legacy local skill, MCP, connector,
and hook paths behind their owning extensions one vertical at a time,
then remove duplicate core managers and compatibility plumbing after
parity.

## Verification

Coverage in this change exercises:

- extension-owned `/backend-api/ps/mcp` registration without an
executor;
- preservation of the legacy endpoint in hosts without the extension;
- extension `Set` and `Remove` precedence over the legacy fallback;
- ChatGPT-auth gating for the reserved server;
- hosted MCP resource reads with and without an active thread;
- connector tool invocation and MCP elicitation through the hosted
route;
- ignored boolean and structured forms of the removed path override;
- config-lock replay compatibility for the removed feature.

`cargo check -p codex-features -p codex-mcp-extension -p
codex-app-server` passes. Tests and Clippy were not run locally under
the current development instruction; CI provides the full validation
pass.

jif · 2026-06-10 12:54:21 +02:00

9cd11e9e62

[codex] Make MCP connection startup fallible (#27261 )

## Why

Required MCP server startup was enforced in `Session::new` after
`McpConnectionManager` had already created the clients. That split let
other manager construction paths bypass the same requirement and exposed
manager internals solely so the session could validate them. Keeping
required-server readiness in the constructor gives every caller one
consistent startup contract.

## What changed

- make `McpConnectionManager::new` return `anyhow::Result<Self>` and
fail when an enabled, required server cannot initialize
- pass the startup cancellation token into the constructor so
required-server waits remain cancellable
- propagate constructor failures through resource reads, connector
discovery, and MCP status collection
- preserve the active manager and cancellation token when a refreshed
replacement fails
- keep required-startup failure collection private and cover the
constructor error contract directly
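
The constructor contract can be illustrated with a synchronous toy model. Everything here is hypothetical: `ServerSpec` and its `starts_ok` flag stand in for real startup, `String` stands in for `anyhow::Error`, and cancellation is omitted.

```rust
/// Hypothetical server description; `starts_ok` simulates the startup result.
#[derive(Debug, Clone)]
struct ServerSpec {
    name: String,
    enabled: bool,
    required: bool,
    starts_ok: bool,
}

#[derive(Debug, PartialEq, Eq)]
struct Manager {
    running: Vec<String>,
}

impl Manager {
    /// Fails when an enabled, required server cannot initialize. Optional
    /// servers that fail are skipped, and disabled servers are never started.
    fn new(specs: &[ServerSpec]) -> Result<Self, String> {
        let mut running = Vec::new();
        let mut failed_required = Vec::new();
        for spec in specs.iter().filter(|spec| spec.enabled) {
            if spec.starts_ok {
                running.push(spec.name.clone());
            } else if spec.required {
                failed_required.push(spec.name.clone());
            }
        }
        if failed_required.is_empty() {
            Ok(Self { running })
        } else {
            Err(format!(
                "required MCP servers failed to initialize: {}",
                failed_required.join(", ")
            ))
        }
    }
}

/// Keeps the active manager when constructing a replacement fails.
fn refresh(active: Manager, specs: &[ServerSpec]) -> Manager {
    Manager::new(specs).unwrap_or(active)
}
```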

## Validation

- updated the focused connection-manager test to assert the complete
required-server startup error
- local tests not run; relying on CI

Ahmed Ibrahim · 2026-06-10 00:17:58 -07:00

a7b6baecc6

[codex] Tighten MCP connection manager API visibility and order (#27257 )

## Summary

- order `McpConnectionManager` methods by visibility, with the primary
constructor and public API first
- restrict `list_available_server_infos` to `codex-mcp`
- make `new_uninitialized` a private test-only helper

## Why

The manager exposed methods that are only used inside `codex-mcp` or its
unit tests. Tightening those methods keeps the exported API intentional,
while the new ordering makes the supported surface easier to scan.

## Validation

- `just fmt`
- `git diff --check`
- local tests not run; relying on CI

Ahmed Ibrahim · 2026-06-09 16:07:34 -07:00

51b3cd51f6

Route hosted Apps MCP through extensions (#27191 )

## Stack

- Base: #27184
- This PR is the second vertical and should be reviewed against
`jif/external-plugins-1`, not `main`.

## Why

CCA is moving toward a split runtime where the orchestrator may have no
filesystem or executor, but it still needs to activate remotely hosted
plugin components. HTTP MCP servers are the simplest complete example:
they need configuration and host authentication, but they do not need an
executor process.

The Apps MCP endpoint is currently synthesized by a special-purpose
loader inside the MCP runtime. That works locally, but it leaves hosted
MCP activation outside the extension model being established in #27184.
It also makes the Apps path a poor foundation for plugins whose skills,
MCP servers, connectors, and hooks may come from different sources or
execute in different places.

This PR moves that one behavior behind an extension-owned contribution
while preserving the existing local fallback. It deliberately does not
introduce a generic plugin activation framework.

## What changed

### MCP extension contribution

`codex-extension-api` gains an ordered `McpServerContributor` contract.
A contributor returns typed `Set` or `Remove` overlays for MCP server
configuration; later contributors win for the names they own.

The contract stays at the existing MCP configuration boundary.
Extensions do not create a second connection manager or transport
abstraction.

### Hosted Apps MCP extension

A new `codex-mcp-extension` contributes the reserved `codex_apps` server
from the existing Apps feature, ChatGPT base URL, path override, and
product SKU configuration.

When `apps_mcp_path_override` is enabled for `https://chatgpt.com`, the
resulting streamable HTTP endpoint is
`https://chatgpt.com/backend-api/ps/mcp`. The existing ChatGPT-auth gate
remains authoritative, so this server can run in an orchestrator-only
process without being exposed for API-key sessions.

### One resolved runtime view

`McpManager` now distinguishes three views:

- **configured:** config- and plugin-backed servers before extension
overlays;
- **runtime:** configured servers plus host-installed extension
contributions;
- **effective:** runtime servers after auth gating and compatibility
built-ins.

App-server installs the hosted MCP extension and uses the runtime view
for thread startup, refresh, status, threadless resource reads,
connector discovery, and MCP OAuth lookup. This keeps
`mcpServer/oauth/login` consistent with the servers exposed by the other
MCP APIs. The hosted Apps server itself continues to use existing
ChatGPT host authentication rather than MCP OAuth.

## Compatibility

Hosts that do not install the MCP extension retain the existing Apps MCP
synthesis path. This preserves current local-only, CLI, and
standalone-host behavior while app-server exercises the extension path.

Disabling Apps removes the reserved `codex_apps` entry, and losing
ChatGPT auth removes it from the effective runtime view. Executor
availability is not consulted for this HTTP transport.

## Follow-ups

The next vertical will resolve a manifest-declared stdio MCP server from
an executor-selected plugin root and execute it in the environment that
owns that root. Later verticals can add backend-owned skills, connector
metadata, hooks, durable selection semantics, and incremental local
convergence without changing the component-specific runtime boundaries
introduced here.

## Verification

Focused coverage was added for:

- contributing the hosted Apps MCP at `/backend-api/ps/mcp` without an
executor;
- requiring ChatGPT auth in the effective runtime view;
- removing a reserved configured Apps server when the Apps feature is
disabled.

`cargo check -p codex-app-server -p codex-mcp-extension -p
codex-extension-api -p codex-mcp` passed. Tests and Clippy were not run
locally under the current development instruction; CI provides the full
validation pass.

jif · 2026-06-09 22:44:16 +02:00

4ec3b8eeea

[app-server][core] Add connector-level Guardian reviewer overrides (#25167 )

Context: https://openai.slack.com/archives/C0B4JAF0Q2C/p1779912328647229

```
approvals_reviewer = "auto_review"

[apps.connector_5f3c8c41a1e54ad7a76272c89e2554fa]
enabled = true
approvals_reviewer = "user"
default_tools_approval_mode = "prompt"
```

<img width="230" height="84" alt="Screenshot 2026-05-31 at 11 56 34 AM"
src="https://github.com/user-attachments/assets/e319f8f7-0983-42a7-98cd-3302732fa406"
/>

<img width="841" height="233" alt="Screenshot 2026-05-31 at 11 52 42 AM"
src="https://github.com/user-attachments/assets/7ac76645-4e90-4d00-8242-f031146a22a5"
/>

-------

```
approvals_reviewer = "user"

[apps.connector_5f3c8c41a1e54ad7a76272c89e2554fa]
enabled = true
approvals_reviewer = "auto_review"
default_tools_approval_mode = "prompt"
```
<img width="195" height="83" alt="Screenshot 2026-05-31 at 12 02 27 PM"
src="https://github.com/user-attachments/assets/3d374dc8-8aa2-466f-a13f-e4ed8567aa2e"
/>
<img width="771" height="207" alt="Screenshot 2026-05-31 at 12 05 42 PM"
src="https://github.com/user-attachments/assets/105c2575-68d6-4ca6-8e69-dc8c82da36a2"
/>



## Summary
- add `apps.<connector_id>.approvals_reviewer` to override Guardian or
user review routing per connected app
- apply overrides across direct app MCP calls, delegated MCP prompts,
and app-server MCP elicitation review while preserving global behavior
for non-app MCP servers
- expose and document the config through app-server v2 and generated
schemas, while honoring global managed reviewer requirements
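
The precedence implied by the bullets above can be sketched like this. The enum and function are hypothetical, and the sketch assumes that a managed requirement, when present, pins the reviewer ahead of any per-connector override; the actual enforcement may be more nuanced.

```rust
/// The two reviewer values shown in the config examples
/// (`"user"` and `"auto_review"`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalsReviewer {
    User,
    AutoReview,
}

/// Hypothetical resolution order (an assumption, not the confirmed logic):
/// 1. a global managed requirement, if any;
/// 2. the `apps.<connector_id>.approvals_reviewer` override, if any;
/// 3. the global `approvals_reviewer` setting.
pub fn effective_reviewer(
    managed_requirement: Option<ApprovalsReviewer>,
    connector_override: Option<ApprovalsReviewer>,
    global: ApprovalsReviewer,
) -> ApprovalsReviewer {
    managed_requirement.or(connector_override).unwrap_or(global)
}
```

For a non-app MCP server there is no connector override, so the call reduces to the global setting (or the managed requirement).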

---------

Co-authored-by: jif-oai <jif@openai.com>

Alex Zamoshchin · 2026-06-02 17:04:11 +02:00

4d80d808b4

[codex] Support ui visibility meta for tools (#24700 )

## Summary

Adds support on tools for the same `ui.visibility` metadata that resources already use.

[spec](https://github.com/modelcontextprotocol/ext-apps/blob/main/specification/draft/apps.mdx#resource-discovery)

Gabriel Peal · 2026-05-28 10:24:03 -07:00

577ec03bf8

Expose MCP server info as part of server status (#24698 )

# Summary

Expose MCP server info via App Server (when available) so apps can
render a richer MCP experience.

Gabriel Peal · 2026-05-28 09:38:34 -07:00

8a827d6426

Update rmcp to 1.7.0 (#24763 )

Will make it easier to uprev when the new draft spec is supported.

Also updates reqwest where needed for compatibility but doesn't update
it everywhere since this is already a large diff.

The new version of rmcp handles certain kinds of authentication failures
differently; this patch includes support for identifying the failing scope
in a `WWW-Authenticate` header.
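
For illustration only, a scope can be pulled out of a Bearer challenge roughly as follows. `scope_from_www_authenticate` is a hypothetical helper, not the rmcp or Codex implementation, and it is not a full HTTP challenge parser (for example, it does not skip a `scope=` that appears inside another quoted parameter).

```rust
/// Extracts the `scope` parameter from a `WWW-Authenticate` challenge value,
/// e.g. `Bearer error="insufficient_scope", scope="files:read files:write"`.
/// Handles a quoted or bare value and returns the space-separated scopes.
pub fn scope_from_www_authenticate(header: &str) -> Option<Vec<String>> {
    // ASCII lowercasing keeps byte offsets aligned with the original header.
    let lower = header.to_ascii_lowercase();
    let mut search_from = 0;
    while let Some(offset) = lower[search_from..].find("scope=") {
        let start = search_from + offset;
        let value_start = start + "scope=".len();
        // Require a parameter boundary so that e.g. `xscope=` does not match.
        let at_boundary =
            start == 0 || matches!(lower.as_bytes()[start - 1], b' ' | b',' | b'\t');
        if at_boundary {
            let rest = &header[value_start..];
            let value = match rest.strip_prefix('"') {
                Some(quoted) => quoted.split('"').next().unwrap_or(""),
                None => rest
                    .split(|c: char| c == ',' || c.is_whitespace())
                    .next()
                    .unwrap_or(""),
            };
            return Some(value.split_whitespace().map(str::to_string).collect());
        }
        search_from = value_start;
    }
    None
}
```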

Adam Perry @ OpenAI · 2026-05-27 14:52:06 -07:00

910578792f

fix(core): instrument stalled tool-listing handoff (#24667 )

## Why

When a turn needs a follow-up request after tool output is recorded,
Codex can still appear stuck in `Thinking` before the next `/responses`
request is opened. The existing local trace showed the last completed
response and the absence of a new backend request, but it did not show
whether the stall was in tool-router preparation or later request setup.

Issue: N/A (internal incident investigation)

## What Changed

Added trace spans around the pre-stream tool-router handoff in
`core/src/session/turn.rs`, including the `built_tools` phase and the
MCP manager read lock.

Added per-server MCP tool-listing spans and trace breadcrumbs in
`codex-mcp/src/connection_manager.rs` with startup snapshot /
startup-complete state so a pending MCP client is visible in feedback
logs instead of looking like a silent hang.

## Verification

- `just fmt`
- `just test -p codex-mcp`
- `just test -p codex-core` (prior full rerun fails in this workspace on
unrelated integration tests: code-mode output length expectations, one
shell timeout formatting assertion, and shell snapshot timeouts; latest
review-fix rerun compiled and passed 1160 tests before I stopped the
abnormally slow unrelated suite)

Anton Panasenko · 2026-05-27 02:00:40 +00:00

64e340ad28

Remove reserved namespaces dedup (#24609 )

Avoid suffixing reserved namespaces.

pakrym-oai · 2026-05-26 09:57:05 -07:00

6937e8354a

Move MCP tool naming mode into manager (#21576 )

## Why

The `non_prefixed_mcp_tool_names` feature should be applied where MCP
tools become model-visible, not by remapping names later in core.
Keeping the decision in `McpConnectionManager` construction makes
`ToolInfo` the single shaped view that spec building, deferred tool
search, routing, and unavailable-tool placeholders can consume directly.

This also preserves the existing external behavior while the feature is
off, and keeps the feature-on behavior for code mode and hooks explicit
at the manager boundary.

## What Changed

- Add `McpToolNameMode` to `codex-mcp` and flow it through `McpConfig`
into `McpConnectionManager::new`.
- Normalize MCP `ToolInfo` names in the manager using either
legacy-prefixed namespaces or non-prefixed namespaces; the legacy path
adds `mcp__` without restoring the old trailing namespace suffix.
- Remove the core-side MCP name remapping path so specs, tool search,
session resolution, and unavailable-tool placeholder construction use
the manager-provided `ToolName` values directly.
- Keep code mode flattening on the `__` namespace separator.
- Preserve hook compatibility by giving non-prefixed MCP hook names
legacy `mcp__...` matcher aliases.
- Add/adjust integration and unit coverage for non-prefixed code-mode
behavior, hook matching with the feature on and off, and manager-level
legacy prefixing.
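
The naming rules above can be sketched as below. Only `McpToolNameMode`, the `mcp__` prefix, and the `__` separator come from this description; the variant names and helper functions are assumptions made for the example.

```rust
/// Variant names are hypothetical; the description only names the type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum McpToolNameMode {
    LegacyPrefixed,
    NonPrefixed,
}

const LEGACY_PREFIX: &str = "mcp__";
const NAMESPACE_SEPARATOR: &str = "__";

/// Namespace for a server's tools. The legacy path adds `mcp__` and no
/// trailing suffix.
pub fn tool_namespace(server: &str, mode: McpToolNameMode) -> String {
    match mode {
        McpToolNameMode::LegacyPrefixed => format!("{}{}", LEGACY_PREFIX, server),
        McpToolNameMode::NonPrefixed => server.to_string(),
    }
}

/// Flattened name as code mode might see it: namespace, `__`, tool name.
pub fn flattened_name(server: &str, tool: &str, mode: McpToolNameMode) -> String {
    format!("{}{}{}", tool_namespace(server, mode), NAMESPACE_SEPARATOR, tool)
}

/// Names a hook matcher could accept: the primary name plus, for the
/// non-prefixed mode, a legacy `mcp__...` alias.
pub fn hook_matcher_names(server: &str, tool: &str, mode: McpToolNameMode) -> Vec<String> {
    let primary = flattened_name(server, tool, mode);
    let legacy = flattened_name(server, tool, McpToolNameMode::LegacyPrefixed);
    if primary == legacy {
        vec![primary]
    } else {
        vec![primary, legacy]
    }
}
```

Computing the name once at the manager boundary is what lets specs, search, and routing share it without a later remapping step.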

## Testing

- `cargo test -p codex-mcp --lib`
- `cargo test -p codex-core --lib tools::spec::tests -- --nocapture`
- `cargo test -p codex-core --lib mcp_tools -- --nocapture`
- `cargo test -p codex-core --lib mcp_tool_exposure -- --nocapture`
- `cargo test -p codex-core --test all mcp_tool -- --nocapture`
- `cargo test -p codex-core --test all search_tool -- --nocapture`
- `cargo test -p codex-core --test all hooks_mcp -- --nocapture`
- `cargo test -p codex-core --test all
code_mode_uses_non_prefixed_mcp_tool_names_when_feature_enabled --
--nocapture`
- `cargo test -p codex-tools`
- `cargo test -p codex-features`

pakrym-oai · 2026-05-26 08:21:15 -07:00

ff7513cd83

Route MCP servers through explicit environments (#23583 )

## Summary
- route each configured MCP server through an explicit per-server
`environment_id` instead of a manager-wide remote toggle
- default omitted `environment_id` to `local`, resolve named ids through
`EnvironmentManager`, and fail only the affected MCP server when an
explicit id is unknown
- keep local stdio on the existing local launcher path for now, while
named-environment stdio uses the selected environment backend and
requires an absolute `cwd`
- allow local HTTP MCP servers to keep using the ambient HTTP client
when no local `Environment` is configured; named-environment HTTP MCPs
use that environment's HTTP client
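
A minimal sketch of the per-server routing decision for stdio servers, assuming a set of known environment ids stands in for `EnvironmentManager`. `Route`, `RouteError`, and `route_stdio_server` are hypothetical names for this example.

```rust
use std::collections::BTreeSet;
use std::path::Path;

const LOCAL_ENVIRONMENT_ID: &str = "local";

#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    /// Existing local launcher path.
    Local,
    /// Backend of the named environment.
    Named(String),
}

#[derive(Debug, PartialEq, Eq)]
pub enum RouteError {
    UnknownEnvironment(String),
    RelativeOrMissingCwd,
}

/// Resolves one server's route. Errors are per-server, so a caller can
/// report the failure for that server and keep starting the others.
pub fn route_stdio_server(
    environment_id: Option<&str>,
    cwd: Option<&Path>,
    known_environments: &BTreeSet<String>,
) -> Result<Route, RouteError> {
    // An omitted `environment_id` defaults to `local`.
    let id = environment_id.unwrap_or(LOCAL_ENVIRONMENT_ID);
    if id == LOCAL_ENVIRONMENT_ID {
        return Ok(Route::Local);
    }
    if !known_environments.contains(id) {
        return Err(RouteError::UnknownEnvironment(id.to_string()));
    }
    // Named-environment stdio requires an absolute `cwd`.
    match cwd {
        Some(path) if path.is_absolute() => Ok(Route::Named(id.to_string())),
        _ => Err(RouteError::RelativeOrMissingCwd),
    }
}
```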

## Validation
- devbox Bazel build: `bazel build --bes_backend= --bes_results_url=
//codex-rs/cli:codex //codex-rs/rmcp-client:test_stdio_server
//codex-rs/rmcp-client:test_streamable_http_server`
- devbox app-server config matrix with real `config.toml` /
`environments.toml` files covering omitted local, explicit local,
omitted local under remote default, explicit remote stdio, local HTTP
without local env, explicit remote HTTP, local stdio without local env,
unknown explicit env, and remote stdio without `cwd`

starr-openai · 2026-05-21 17:19:54 +02:00

298e5cfce1

Make local environment optional in EnvironmentManager (#23369 )

## Summary
- make `EnvironmentManager` local environment/runtime paths optional
- simplify constructor surface around snapshot materialization
- rename local env accessors to `require_local_environment` /
`try_local_environment`
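
The accessor pair might look roughly like the sketch below. Only the two method names come from this change; the `Environment` and `MissingLocalEnvironment` types, the constructor, and the return types are assumptions.

```rust
/// Hypothetical stand-in for the real environment type.
#[derive(Debug, PartialEq, Eq)]
pub struct Environment {
    pub id: String,
}

/// Hypothetical error for callers that cannot proceed without a local
/// environment.
#[derive(Debug, PartialEq, Eq)]
pub struct MissingLocalEnvironment;

pub struct EnvironmentManager {
    local: Option<Environment>,
}

impl EnvironmentManager {
    pub fn new(local: Option<Environment>) -> Self {
        Self { local }
    }

    /// For callers that can fall back when no local environment exists.
    pub fn try_local_environment(&self) -> Option<&Environment> {
        self.local.as_ref()
    }

    /// For callers that need a local environment and should surface an error
    /// otherwise.
    pub fn require_local_environment(&self) -> Result<&Environment, MissingLocalEnvironment> {
        self.local.as_ref().ok_or(MissingLocalEnvironment)
    }
}
```

Splitting the accessor this way makes each call site state whether a missing local environment is an error or an expected configuration.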

## Validation
- devbox Bazel build for touched crate surfaces
- `//codex-rs/exec-server:exec-server-unit-tests`
- `//codex-rs/app-server-client:app-server-client-unit-tests`
- filtered touched `//codex-rs/core:core-unit-tests` cases

starr-openai · 2026-05-19 12:55:34 -07:00

5c43a64e2b

107 Commits