codex

[codex] Classify nested MCP authentication startup errors (#30257 )

## Summary

- classify authentication-required RMCP startup failures, including
errors nested inside `ClientInitializeError::TransportError`
- let `codex-mcp` consume that classification so the existing
`reauthenticationRequired` startup failure reason is emitted
- add a regression test that performs real startup with an expired
persisted OAuth token and no refresh token

## Why

Follow-up to #29877.

RMCP stores streamable HTTP initialization failures inside a dynamic
transport error whose payload is not exposed through the standard Rust
error source chain. The original `anyhow::Error::chain()` check
therefore missed the nested `AuthError::AuthorizationRequired` seen
during real MCP startup and emitted `failureReason: null`.

The transport-specific inspection now lives in `codex-rmcp-client`,
while `codex-mcp` consumes only the domain-level authentication-required
result. This classifier does not distinguish first-time login from
reauthentication; the existing auth-state logic remains responsible for
that distinction.

## User impact

When stored MCP OAuth credentials are expired and cannot be refreshed,
app clients now receive `failureReason: "reauthenticationRequired"` on
the failed startup update and can show the reconnect action. First-time
login and unrelated startup failures remain unchanged.

## Validation

- `just test -p codex-rmcp-client --test streamable_http_oauth_startup
identifies_expired_unrefreshable_token_startup_error`
- `just test -p codex-mcp
startup_outcome_error_identifies_authentication_required`
- `just test -p codex-mcp
mcp_startup_failure_reason_requires_existing_oauth_and_auth_failure`
- `cargo build -p codex-cli --bin codex`
- local app-server probe emitted `failureReason:
"reauthenticationRequired"`
- manual end-to-end reconnect flow confirmed
- `just fmt`

felixxia-oai · 2026-06-26 14:11:13 -07:00

526f495f3a

[codex] Surface MCP reauthentication-required startup failures (#29877 )

## Summary

- distinguish expired, non-refreshable stored MCP OAuth credentials from
first-time missing credentials
- carry a typed `failureReason: "reauthenticationRequired"` on the
existing `mcpServer/startupStatus/updated` notification only when user
action is required
- keep the public MCP auth-status API unchanged and regenerate the
app-server protocol schemas and documentation

## Why

An MCP server with an expired access token and no usable refresh token
currently fails startup without giving clients a reliable, typed
recovery signal.

The existing startup-status notification is the natural place to carry
this state. Its nullable `failureReason` keeps the recovery reason
attached to the failed startup transition without adding a one-off
notification. Internally, Codex distinguishes first-time login from
reauthentication and emits the reason only when the startup error itself
requires authentication.

## User impact

App clients can prompt an existing user to reconnect an MCP server when
automatic recovery is impossible by handling a failed
`mcpServer/startupStatus/updated` notification whose `failureReason` is
`reauthenticationRequired`. Starting, ready, cancelled, unrelated
failures, and first-time setup carry no reauthentication reason.

## Companion app PR

- openai/openai#1069582

## Validation

- `just test -p codex-app-server-protocol` — 248 passed; schema fixture
tests passed
- `cargo check -p codex-app-server -p codex-tui`
- `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped
- `just test -p codex-protocol -p codex-app-server-protocol -p
codex-mcp` — 579 passed
- `just write-app-server-schema`
- `just fmt`

felixxia-oai · 2026-06-25 21:50:36 +00:00

a6d20ed297

Support OAuth for HTTP MCP servers from selected executor plugins (#28529 )

## Why

#28522 routes selected-plugin HTTP MCP traffic through the owning
executor, but OAuth bootstrap and refresh still used host-local clients.
Executor-only servers therefore cannot complete discovery or login
through the same network boundary as the MCP connection.

## What changed

- adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient`
contract
- let RMCP own discovery, dynamic registration, PKCE, token exchange,
and refresh
- route auth status, persisted-token startup, and app-server login
through the server runtime while preserving the existing local discovery
path
- add optional `threadId` to `mcpServer/oauth/login` and echo it in the
completion notification
- implement RMCP's redirect policy and 1 MiB OAuth response limit over
executor HTTP
- cover selected-thread OAuth discovery and login through an
executor-only route

Depends on #28522.

jif · 2026-06-25 10:31:17 +01:00

b215961a56

Allow ChatGPT-hosted MCP servers to use session auth (#29733 )

## Why

ChatGPT session authentication was inferred from the reserved Codex Apps
server name. That couples credential routing to Codex Apps-specific
behavior and prevents other MCP endpoints hosted by ChatGPT from
explicitly using the current session.

The opt-in also needs a clear security boundary: an arbitrary MCP
configuration must not be able to redirect ChatGPT credentials to
another origin.

## What changed

- Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
`false`.
- Honor the setting only when the parsed server URL has the same HTTP(S)
origin as the configured `chatgpt_base_url`; otherwise remove the
capability before startup.
- Resolve bearer tokens and static or environment-backed authorization
headers before selecting authentication, with configured authorization
taking precedence over ChatGPT session auth.
- Enable the setting for the built-in Codex Apps and hosted plugin
runtime endpoints while keeping Codex Apps caching and tool
normalization scoped to the reserved server.
- Persist the setting through MCP config rewrite paths and expose it in
the generated config schema.
- Load the current login state for `codex mcp list` so reported auth
status matches runtime behavior.

## Verification

Core integration coverage exercises the complete streamable HTTP MCP
startup path and verifies that:

- a same-origin opted-in server receives the current ChatGPT access
token;
- an explicitly configured authorization header takes precedence;
- a different-origin server completes MCP initialization and tool
listing without receiving any ChatGPT authorization header.

Ahmed Ibrahim · 2026-06-24 19:21:28 -07:00

4c0706e24a

[codex] trace MCP startup latency (#28630 )

## Summary

- add trace-level instrumentation around per-server MCP setup, client
construction, initialization, and initial tool listing
- trace Codex Apps tool and server-info cache loads
- attach `server_name` to server-scoped spans so slow startup work can
be attributed to a specific MCP server

## Why

`session_init.mcp_manager_init` can occasionally be slow, but its
existing coarse span does not identify whether time is spent loading the
Codex Apps cache, constructing a client, initializing a transport, or
listing tools. These definition-level spans provide that breakdown
without changing startup behavior.

## Validation

- `just test -p codex-mcp` (87 passed)
- `just test -p codex-rmcp-client` (86 passed, 2 skipped)

rphilizaire-openai · 2026-06-23 17:46:54 -07:00

322e33512b

Prepare managed network sandbox context (#29456 )

## Why

Managed network configures commands to use local HTTP and SOCKS proxies.
For commands delegated to the exec server, the proxy environment and the
sandbox policy were prepared separately. On macOS, that meant a command
could receive `HTTPS_PROXY=http://127.0.0.1:43123` while Seatbelt still
denied access to port `43123`.

## What changed

`NetworkProxy` now prepares the command environment and sandbox context
together from the same runtime snapshot:

```text
Prepared managed network
├── command environment: HTTPS_PROXY=http://127.0.0.1:43123
└── sandbox context: allow outbound to 127.0.0.1:43123
```

That context travels with remote exec requests. The exec server
preserves the managed proxy and CA environment, and macOS Seatbelt
allows only the prepared loopback proxy ports without enabling broad
network access or local binding.

The protocol field is optional and the existing enforcement flag remains
in place, preserving compatibility with callers that do not send the new
context.

jif · 2026-06-23 20:07:09 +01:00

e476fc16ce

Update rmcp to 1.8.0 (#29634 )

## Summary

- Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0.
- Adapt to the new shared `peer_info` return type.
- Box OAuth status discovery at the MCP boundary to keep the expanded
future type from overflowing Rust's trait recursion limit.

This brings in custom OAuth HTTP client support from
[modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).

jif · 2026-06-23 15:25:28 +01:00

bbe1006890

mcp: accept foreign absolute cwd for remote stdio (#29493 )

## Why

Remote stdio MCP servers can run in an environment whose path convention
differs from the Codex host. A Windows cwd such as
`C:\Users\openai\share` is absolute for the executor but was rejected by
a POSIX orchestrator.

Built on #29501, now merged, which only clarifies the host-native
`PathUri` constructor name.

## What changed

- Deserialize MCP cwd values as `LegacyAppPathString` so config does not
apply host path rules.
- Interpret that spelling as host-native for local launches and convert
it to `PathUri` at executor launch.
- Skip host filesystem and command resolution checks for remote stdio in
`codex doctor`.
- Add host-independent config and executor-boundary coverage using the
foreign path convention for each test platform.

## Validation

- `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p
codex-rmcp-client` (408 passed)
- `just test -p codex-cli -p codex-rmcp-client` (372 passed)
- `cargo check --workspace --tests`
- `just test` (11,311 passed; 43 unrelated environment/timing failures)
- `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p
codex-mcp-extension -p codex-rmcp-client -p codex-tui`

Adam Perry @ OpenAI · 2026-06-23 01:33:51 +00:00

67009bc53f

path-uri: clarify host-native path conversion (#29501 )

## Why

Downstream refactors are producing confusing code with this
functionality having a very generic name. Encoding the specific
conversion approach in the method name makes it clearer.

## What

Rename `PathUri::from_path` to `PathUri::from_host_native_path` and
update its Rust call sites.

Adam Perry @ OpenAI · 2026-06-23 00:02:33 +00:00

11fab432be

Carry sandbox intent to remote exec servers (#29108 )

## What changed

PR #29099 stopped sending the orchestrator's concrete sandbox wrapper to
a remote exec-server. Remote commands now arrive as plain native argv.

This PR adds the next piece: Codex also sends portable sandbox intent
next to that plain argv.

For a remote unified-exec command, the request can now include:

- the canonical permission profile before local workspace-root
materialization
- the sandbox cwd and workspace roots as `PathUri` values
- Windows sandbox settings
- the legacy Landlock setting
- whether managed networking must be enforced

The important part is that symbolic entries such as `:workspace_roots`
stay symbolic while crossing the boundary. The executor can then bind
them to its own workspace-root paths instead of receiving
orchestrator-local absolute paths.

The data travels through `ExecRequest` into `ExecParams`. Older
exec-servers can still deserialize requests because the new fields have
defaults.

## Why

The orchestrator should not decide how another machine implements
sandboxing.

For example:

- a local macOS Codex would normally build a Seatbelt command
- a remote Linux executor needs a Linux sandbox command instead

The orchestrator now sends the plain command plus the policy it intended
to enforce. A later PR can let the exec-server choose and build the
correct sandbox for its own operating system.

## Important detail

This keeps the portable intent separate from the local `SandboxType`.

`SandboxType::None` is ambiguous:

- it can mean the command was explicitly approved to run without a
sandbox
- it can also mean the orchestrator host has no concrete sandbox
implementation available

Those cases are different for remote execution. This PR adds
`sandbox_requested` so an executor can still receive sandbox intent when
the orchestrator cannot build a local wrapper. Explicit unsandboxed
retries still send no sandbox context.

## Behavior today

This PR only transports the intent. The exec-server accepts the new
fields but does not apply them yet.

Remote commands therefore remain unsandboxed after this PR, just as they
are after PR #29099.

## Follow-up

The next PR will make exec-server read this portable intent, bind
symbolic workspace permissions to executor-native roots, choose the
sandbox for its own operating system, build the wrapper locally, and
then spawn the command.

jif · 2026-06-21 12:33:21 +02:00

bd2968a4db

[codex] prototype mcp_history thread hint injection (#29259 )

## Why

Prototype whether the harness can invoke the `mcp_history` MCP while
constructing full initial context and expose its thread hint to the
model without requiring a model-issued tool call.

The prototype builds on the context-window lineage added by #29256 and
is now based directly on `main`.

## What changed

- Call `mcp_history/thread_hint` with no arguments while building the
full `<token_budget>` context.
- Pass the current `threadId` through MCP request metadata, matching the
normal MCP tool-call path.
- Serialize only the unstructured `content` result and append it inside
`<token_budget>` when the call succeeds.
- Omit the additional context when the MCP call or content serialization
fails.

## Prototype limitations

- The direct call bypasses the normal model-initiated MCP approval,
lifecycle-event, telemetry, and result-sanitization path.
- The call has no prototype-specific timeout, result-size cap, or
per-window cache.
- MCP latency is added to full-context construction, including
applicable compaction paths.

## Validation

- `just test -p codex-core token_budget`

pakrym-oai · 2026-06-20 18:02:02 -07:00

b6d6be2a84

[codex] Support protected resource OAuth discovery (#29022 )

## Why

Plugin-install preflight and the actual OAuth login flow used different
discovery implementations. Preflight had a Codex-specific implementation
that only queried authorization-server metadata on the MCP host, while
login already used the upstream `rmcp` Rust MCP SDK. As a result,
servers that advertise a separate authorization server through RFC 9728
Protected Resource Metadata were classified as OAuth-unsupported during
plugin installation, so login was skipped.

## What changed

- delegate plugin-install OAuth discovery to
`rmcp::transport::AuthorizationManager`, the same implementation used by
the login flow
- let `rmcp` follow Protected Resource Metadata first and perform direct
RFC 8414 authorization-server discovery when protected-resource
discovery does not yield usable metadata
- retain Codex's existing HTTP headers, timeout, `no_proxy` behavior,
and scope normalization around that discovery
- add unit coverage and a pure-MCP plugin-install integration test that
proves the protected-resource path reaches OAuth client registration

This only changes shared MCP OAuth discovery. App declarations and
`appsNeedingAuth` behavior are unchanged.

## Verification

- `just test -p codex-rmcp-client auth_status`
- `just test -p codex-app-server plugin_install_starts_mcp_oauth`
- real plugin-install smoke test with an isolated `CODEX_HOME`: both
DigitalOcean MCP servers started OAuth callback listeners, while Linear
continued to start its existing direct-discovery OAuth flow

xl-openai · 2026-06-18 21:17:20 -07:00

4e6bc42266

Support openai/form extended form elicitations (#27500 )

# Summary
Allow App Server clients to opt into `openai/form` MCP elicitations.

Gabriel Peal · 2026-06-18 11:54:49 -07:00

21a599fa56

[codex] Carry exec-server cwd as PathUri (#28032 )

## Why

This is the second-to-last place in the exec-server protocol that needs
to migrate to URIs to support cross-OS operation.

## What

- Change `ExecParams.cwd` to `PathUri`.
- Keep the cwd URI-shaped through core and rmcp producers, converting it
to `AbsolutePathBuf` only in `LocalProcess::start_process`.
- Reject non-native cwd URIs before launch and update the affected
protocol documentation and call sites.

Adam Perry @ OpenAI · 2026-06-13 20:56:42 +00:00

0fed4497f5

feat: use encrypted local secrets for MCP OAuth (#27541 )

## Summary

- store MCP OAuth credentials in the configured auth credential backend
- support encrypted-local OAuth storage, including legacy keyring
migration
- propagate the credential backend through MCP refresh, session, CLI,
and app-server paths

## Stack

1. #27504 — config and feature flag
2. #27535 — auth-specific secret namespaces
3. #27539 — encrypted CLI auth storage
4. this PR — encrypted MCP OAuth storage

This is a parallel review stack; the original #17931 remains unchanged.

## Tests

- `just test -p codex-rmcp-client` (the transport round-trip test passed
after building the required `codex` binary and retrying)
- `just test -p codex-mcp`
- `just test -p codex-app-server
refresh_config_uses_latest_auth_keyring_backend`
- `just test -p codex-core
refresh_mcp_servers_is_deferred_until_next_turn`
- `just test -p codex-cli mcp`
- `just fix -p codex-rmcp-client -p codex-mcp -p codex-core -p codex-cli
-p codex-app-server -p codex-protocol`
- `just bazel-lock-check`

Celia Chen · 2026-06-12 22:03:51 +00:00

9915d34684

[codex] Retry streamable HTTP initialize failures (#25147 )

## Summary
- Retry transient streamable HTTP failures during RMCP startup when the
failure happens while sending the initialize request.
- Retry transient streamable HTTP failures for tools/list, which is
read-only and safe to replay.
- Cover both retryable HTTP statuses and request-layer failures where no
HTTP status is returned.
- Surface retryable HTTP statuses from the streamable HTTP adapter as
typed client errors.
- Add integration coverage for initialize retry, tools/list retry,
no-status request failure retry, and non-retryable initialize status.

## Root cause
The observed codex_apps failures can happen before normal tool
execution: RMCP startup fails while sending initialize, or the first
read-only tools/list fails after startup. Retrying hosted_apps_bridge
tools/call would not cover initialize and would risk replaying
side-effecting tool calls. This change retries the streamable HTTP
handshake itself, recreates the transport between initialize attempts,
and retries only tools/list among post-initialize service operations.

## Validation
- cargo fmt --package codex-rmcp-client
- cargo test -p codex-rmcp-client --test streamable_http_recovery

ssetty-oai · 2026-06-09 15:49:48 -07:00

490340ffcf

[codex] Report unusable MCP OAuth credentials as logged out (#26713 )

## Why

Persisted MCP OAuth credentials were reported as authenticated whenever
a credential record existed. An expired token without a usable refresh
token could therefore appear as `OAuth` even though startup could not
authenticate with it, leaving users with a misleading status instead of
a login prompt.

## What changed

- Classify stored OAuth credentials as missing, usable, or requiring
authorization.
- Reuse the existing refresh window so near-expiry credentials without a
refresh path are also treated as logged out.
- Validate required credential fields before reporting OAuth
authentication.
- Add unit coverage for credential usability and integration coverage
for expired, unexpired, and refreshable persisted credentials.

## Validation

- `just test -p codex-rmcp-client`

Adam Perry @ OpenAI · 2026-06-09 14:18:24 -07:00

f574946960

fix(rmcp): refresh expired OAuth tokens before startup (#26482 )

## Why

Codex persists OAuth expiry as an absolute `expires_at`, then
reconstructs RMCP’s relative `expires_in` when credentials are loaded.
For an already-expired token, Codex reconstructed `expires_in` as
missing.

[RMCP 0.15 treated a missing `expires_in` as zero when a refresh token
was
present](https://github.com/modelcontextprotocol/rust-sdk/blob/9cfc905a9ef17c8bba6748dc0a9bdd2452681733/crates/rmcp/src/transport/auth.rs#L704-L723),
so this still triggered a refresh. [RMCP 1.7 treats missing expiry
information as unknown and uses the access token
as-is](https://github.com/modelcontextprotocol/rust-sdk/blob/3529c3675ff64db805bd947ca6ece6090809e43d/crates/rmcp/src/transport/auth.rs#L1233-L1265),
causing the stale token to be sent during `initialize`.

## What changed

- Represent a known-expired persisted token as `expires_in = 0`,
preserving `None` for genuinely unknown expiry.
- Add Streamable HTTP coverage requiring the token to refresh before the
startup handshake.

## Validation

- The new regression test fails on RMCP 1.7 before the fix and passes
afterward.
- The same scenario passes on the commit immediately before the RMCP 1.7
update, using RMCP 0.15.
- `just test -p codex-rmcp-client` (63 passed).

Adam Perry @ OpenAI · 2026-06-05 02:31:06 +00:00

4de7a2b9d8

Optimize unbounded byte scans with memchr (#26265 )

## Summary

This PR adds `memchr` for some low-hanging performance improvements
(namely, in MCP stdio, Ollama streaming, and full message-history
newline counts).

Codex produced the following release benchmarks:

| Operation | Before | After | Speedup |
| --- | ---: | ---: | ---: |
| MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x |
| Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x |
| Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x |

With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python
MCP server, completed `initialize`, requested `tools/list`, and
deserialized a 1 MiB tool description over newline-delimited stdio),
it's about 16x faster end-to-end:

| Branch | 50 calls | Per call |
| --- | ---: | ---: |
| `main` | 862.53 ms | 17.25 ms |
| this branch | 53.89 ms | 1.08 ms |

`memchr` is already in our dependency tree and extremely widely used for
this kind of optimized scanning.

Charlie Marsh · 2026-06-04 09:53:08 -04:00

7da4af622f

feat: Add focused diagnostics for MCP HTTP send failures (#25013 )

Adds failure-only logging for MCP streamable HTTP post_message calls and
the underlying reqwest send path, capturing the MCP method/request id,
endpoint shape, auth-header presence, timeout/connect classification,
and sanitized error source chain without logging headers, bodies,
tokens, or full URLs.

xl-openai · 2026-05-29 10:09:33 -07:00

e29bbb5368

Update rmcp to 1.7.0 (#24763 )

WIll make it easier to uprev when the new draft spec is supported.

Also updates reqwest where needed for compatibility but doesn't update
it everywhere since this is already a large diff.

The new version of rmcp handles certain kinds of authentication failures
differently, this patch includes support for identifying the failing scope
in a WWW-Authenticate header.

Adam Perry @ OpenAI · 2026-05-27 14:52:06 -07:00

910578792f

Allow parallel MCP tool calls when annotated readOnly (#23750 )

## Summary
- Treat MCP tools with `readOnlyHint: true` as parallel-safe even when
`supports_parallel_tool_calls` is unset or `false`.
- Keep server-level `supports_parallel_tool_calls` as an additive
override for non-read-only tools.
- Add focused unit coverage for the MCP handler eligibility decision.
- Update RMCP integration coverage to keep the serial baseline on a
mutable tool, verify read-only concurrency without server opt-in, and
preserve the server opt-in concurrency path separately.

## Testing
- `just fmt`
- `cargo test -p codex-core --lib tools::handlers::mcp::tests::`
- `cargo test -p codex-core --test all
stdio_mcp_read_only_tool_calls_run_concurrently_without_server_opt_in`
- `cargo test -p codex-core --test all
stdio_mcp_parallel_tool_calls_opt_in_runs_concurrently`
- `cargo test -p codex-rmcp-client`

anp-oai · 2026-05-21 20:40:34 -07:00

c83ba22359

Route MCP servers through explicit environments (#23583 )

## Summary
- route each configured MCP server through an explicit per-server
`environment_id` instead of a manager-wide remote toggle
- default omitted `environment_id` to `local`, resolve named ids through
`EnvironmentManager`, and fail only the affected MCP server when an
explicit id is unknown
- keep local stdio on the existing local launcher path for now, while
named-environment stdio uses the selected environment backend and
requires an absolute `cwd`
- allow local HTTP MCP servers to keep using the ambient HTTP client
when no local `Environment` is configured; named-environment HTTP MCPs
use that environment's HTTP client

## Validation
- devbox Bazel build: `bazel build --bes_backend= --bes_results_url=
//codex-rs/cli:codex //codex-rs/rmcp-client:test_stdio_server
//codex-rs/rmcp-client:test_streamable_http_server`
- devbox app-server config matrix with real `config.toml` /
`environments.toml` files covering omitted local, explicit local,
omitted local under remote default, explicit remote stdio, local HTTP
without local env, explicit remote HTTP, local stdio without local env,
unknown explicit env, and remote stdio without `cwd`

starr-openai · 2026-05-21 17:19:54 +02:00

298e5cfce1

Support explicit MCP OAuth client IDs (#22575 )

## Why
Some MCP OAuth providers require a pre-registered public client ID and
cannot rely on dynamic client registration. Codex already supports MCP
OAuth, but it had no way to supply that client ID from config into the
PKCE flow.

## What changed
- add `oauth.client_id` under `[mcp_servers.<server>]` config, including
config editing and schema generation
- thread the configured client ID through CLI, app-server, plugin login,
and MCP skill dependency OAuth entrypoints
- configure RMCP authorization with the explicit client when present,
while preserving the existing dynamic-registration path when it is
absent
- add focused coverage for config parsing/serialization and OAuth URL
generation

## Verification
- `cargo test -p codex-config -p codex-rmcp-client -p codex-mcp -p
codex-core-plugins`
- `cargo test -p codex-core blocking_replace_mcp_servers_round_trips
--lib`
- `cargo test -p codex-core
replace_mcp_servers_streamable_http_serializes_oauth_resource --lib`
- `cargo test -p codex-core config_schema_matches_fixture --lib`

## Notes
Broader local package runs still hit unrelated pre-existing stack
overflows in:
- `codex-app-server::in_process_start_clamps_zero_channel_capacity`
-
`codex-core::resume_agent_from_rollout_uses_edge_data_when_descendant_metadata_source_is_stale`

Matthew Zeng · 2026-05-14 11:52:43 -07:00

d8ddeb6869

Add callback ids to local MCP OAuth redirects (#20237 )

## Summary

- Add a deterministic callback-id path segment to local MCP OAuth
redirect URIs before starting authorization.
- Derive the callback id from the normalized MCP server URL and encode
it as a 12-character URL-safe hash.
- Reuse the existing exact callback-path validation so OAuth completion
only succeeds on the callback path that was sent in the redirect URI.

## Context

Slack thread:
https://openai.slack.com/archives/C087WB3AGCR/p1777480566571699

That thread calls out the OAuth mix-up class of issue for MCP servers.
The connector/App Connect flow already has a callback_id concept that
binds the OAuth callback URL to the MCP app/server identity. Codex
desktop's local MCP OAuth flow was still using a generic local callback
path like `/callback`, so this PR adds the same shape to the shared
local MCP OAuth helper.

## Behavior

Before this change, local MCP OAuth used:

- default local callback URL: `http://127.0.0.1:<port>/callback`
- configured callback URL: `<configured callback URL>` unchanged

After this change, Codex appends a deterministic callback-id segment:

- default local callback URL:
`http://127.0.0.1:<port>/callback/<callback_id>`
- configured callback URL: `<configured callback path>/<callback_id>`

The local callback server already compares the incoming request path
against the path from the redirect URI. By appending the callback id
before both authorization and callback validation, callbacks that arrive
on the old generic path or a mismatched callback-id path are rejected.

The callback id is bound to the MCP endpoint URL, including path and
query, so path-based multi-tenant MCP deployments on the same origin do
not share a callback path. URL fragments are ignored because they are
not sent to the server.

The change lives in `codex-rmcp-client`, so it covers both the normal
desktop MCP OAuth login path and silent/plugin-triggered MCP OAuth login
paths that use the same `perform_oauth_login_*` helpers.

## Scope and non-goals

- This does not change the app-server protocol or desktop webview
request shape.
- This does not implement RFC 9207 `iss` validation; issuer validation
is still useful when providers return `iss`.
- This does not make arbitrary untrusted MCP servers safe to use. It
specifically adds callback URL binding for the local MCP OAuth flow.

## Validation

- `cargo fmt --all`
- `cargo test -p codex-rmcp-client perform_oauth_login`

stevenlee-oai · 2026-05-13 13:26:04 -07:00

0a2d751fc2

feat(sandbox): add Windows deny-read parity (#18202 )

## Why

The split filesystem policy stack already supports exact and glob
`access = none` read restrictions on macOS and Linux. Windows still
needed subprocess handling for those deny-read policies without claiming
enforcement from a backend that cannot provide it.

## Key finding

The unelevated restricted-token backend cannot safely enforce deny-read
overlays. Its `WRITE_RESTRICTED` token model is authoritative for write
checks, not read denials, so this PR intentionally fails that backend
closed when deny-read overrides are present instead of claiming
unsupported enforcement.

## What changed

This PR adds the Windows deny-read enforcement layer and makes the
backend split explicit:

- Resolves Windows deny-read filesystem policy entries into concrete ACL
targets.
- Preserves exact missing paths so they can be materialized and denied
before an enforceable sandboxed process starts.
- Snapshot-expands existing glob matches into ACL targets for Windows
subprocess enforcement.
- Honors `glob_scan_max_depth` when expanding Windows deny-read globs.
- Plans both the configured lexical path and the canonical target for
existing paths so reparse-point aliases are covered.
- Threads deny-read overrides through the elevated/logon-user Windows
sandbox backend and unified exec.
- Applies elevated deny-read ACLs synchronously before command launch
rather than delegating them to the background read-grant helper.
- Reconciles persistent deny-read ACEs per sandbox principal so policy
changes do not leave stale deny-read ACLs behind.
- Fails closed on the unelevated restricted-token backend when deny-read
overrides are present, because its `WRITE_RESTRICTED` token model is not
authoritative for read denials.

## Landed prerequisites

These prerequisite PRs are already on `main`:

1. #15979 `feat(permissions): add glob deny-read policy support`
2. #18096 `feat(sandbox): add glob deny-read platform enforcement`
3. #17740 `feat(config): support managed deny-read requirements`

This PR targets `main` directly and contains only the Windows deny-read
enforcement layer.

## Implementation notes

- Exact deny-read paths remain enforceable on the elevated path even
when they do not exist yet: Windows materializes the missing path before
applying the deny ACE, so the sandboxed command cannot create and read
it during the same run.
- Existing exact deny paths are preserved lexically until the ACL
planner, which then adds the canonical target as a second ACL target
when needed. That keeps both the configured alias and the resolved
object covered.
- Windows ACLs do not consume Codex glob syntax directly, so glob
deny-read entries are expanded to the concrete matches that exist before
process launch.
- Glob traversal deduplicates directory visits within each pattern walk
to avoid cycles, without collapsing distinct lexical roots that happen
to resolve to the same target.
- Persistent deny-read ACL state is keyed by sandbox principal SID, so
cleanup only removes ACEs owned by the same backend principal.
- Deny-read ACEs are fail-closed on the elevated path: setup aborts if
mandatory deny-read ACL application fails.
- Unelevated restricted-token sessions reject deny-read overrides early
instead of running with a silently unenforceable read policy.

## Verification

- `cargo test -p codex-core
windows_restricted_token_rejects_unreadable_split_carveouts`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-windows-sandbox`
- GitHub Actions rerun is in progress on the pushed head.

---------

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-05-11 23:04:28 -07:00

46f30d0282

fix(tui): suppress taskkill output for MCP teardown on Windows (#21759 )

## Why

On native Windows, running `/mcp` can leak `taskkill`'s normal
`SUCCESS:` messages into the Codex TUI while the temporary MCP inventory
process tree is being torn down. That corrupts the screen even though
MCP itself is working correctly.

Fixes #20845.

## What Changed

- Redirect the Windows-only MCP teardown `taskkill` subprocess to null
stdio so its console output cannot reach the TUI.

## How to Test

1. On native Windows, configure a stdio MCP server, for example:
   ```powershell
codex mcp add sequential-thinking -- npx -y
@modelcontextprotocol/server-sequential-thinking
   ```
2. With the latest released Codex CLI, start Codex and run `/mcp`.
3. Confirm the current behavior: `taskkill` `SUCCESS:` lines appear in
the TUI during the MCP refresh.
4. Switch to this branch's build, start Codex again, and run `/mcp`.
5. Confirm the MCP inventory still renders normally and the `taskkill`
lines no longer appear.
6. Repeat `/mcp` once more on this branch to verify the regression does
not recur on repeated inventory requests.

Targeted tests:
- `cargo test -p codex-rmcp-client`
- `cargo test -p codex-rmcp-client --test process_group_cleanup --quiet`

Felipe Coury · 2026-05-10 15:51:26 +00:00

e5d022297d

Disable empty Cargo test targets (#21584 )

## Summary

`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>

Charlie Marsh · 2026-05-07 15:44:17 -07:00

54ef99a365

feat: make built-in MCPs first-class runtime servers (#21356 )

## DISCLAIMER
This is experimental and no production service must rely on this

## Why

Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.

## What changed

- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.

## Test plan

- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`

---------

Co-authored-by: Codex <noreply@openai.com>

jif-oai · 2026-05-07 10:36:32 +02:00

b2268999fe

Use MCP server instructions in deferred namespace descriptions (#21053 )

## Why

MCP servers can provide `instructions` that explain what their tools are
for. Directly exposed MCP namespaces already use those instructions when
a connector description is not available, but deferred `tool_search`
results did not preserve that fallback. The direct path falls back from
connector metadata to server instructions, while the deferred path only
carried `connector_description` and otherwise fell back to generic
namespace text.

That meant a plain MCP server could provide useful model-facing guidance
and still appear as `Tools in the X namespace.` whenever it was
discovered lazily through `tool_search`.

## What changed

- Store one model-facing `namespace_description` on `ToolInfo`, using
connector descriptions for connector-backed tools and server
instructions for plain MCP servers.
- Thread that namespace description through the `tool_search` source
list, search indexing, and returned namespace metadata.
- Add an end-to-end regression test for deferred non-app MCP search
results exposing server instructions as the namespace description.

## Verification

- `cargo test -p codex-tools
search_tool_description_lists_each_mcp_source_once --lib`
- `cargo test -p codex-core --test all
tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`

sayan-oai · 2026-05-04 19:36:07 +00:00

b9e8df47da

expand the set of core shell env vars for Windows. (#20089 )

https://github.com/openai/codex/issues/13917 and
https://github.com/openai/codex/issues/18248 correctly identify that

```
[shell_environment_policy]
inherit = "core"
```
is not functional on Windows because it carries an insufficient set of
env vars.
This PR expands that to match the more functional set from the MCP
client

iceweasel-oai · 2026-04-29 19:23:46 +00:00

9d1e5df4b2

Terminate stdio MCP servers on shutdown to avoid process leaks (#19753 )

## Why

Several bug reports describe thread shutdown (including subagent
threads) leaving stdio MCP server processes behind. These reports all
point at the same lifecycle gap: Codex launches stdio MCP servers, but
the session-level shutdown path does not explicitly close MCP clients or
terminate the server process tree.

Fixes #12491
Fixes #12976
Fixes #18881
Fixes #19469

## History

This is best understood as a regression/coverage gap in MCP session
lifecycle management, not as stdio MCP cleanup being absent all along.
#10710 added process-group cleanup for stdio MCP servers, but that
cleanup only runs when the `RmcpClient`/transport is dropped. The older
reports (#12491 and #12976) came after that cleanup existed, which
suggests the remaining problem was that some higher-level shutdown paths
kept the MCP manager alive or replaced it without explicitly draining
clients. The newer reports (#18881 and #19469) exposed the same family
around manager replacement and shutdown.

## What changed

- Added an explicit stdio MCP process handle in `codex-rmcp-client` so
local MCP servers terminate their process group and executor-backed MCP
servers call the executor process terminator.
- Added `RmcpClient::shutdown()` and manager-level MCP shutdown draining
so session shutdown, channel-close fallback, MCP refresh, and connector
probing stop owned MCP clients.
- Added regression coverage that starts a stdio MCP server, begins an
in-flight blocking tool call, shuts down the client, and asserts the
server process exits.

## Verification

- `cargo test -p codex-rmcp-client`
- `cargo test -p codex-mcp`
- `just fix -p codex-rmcp-client`
- `just fix -p codex-mcp`
- `just fix -p codex-core`

- Manual before/after validation with a temporary repro script:
- Pre-fix binary from `HEAD^` (`fed0a8f4fa`): reproduced the leak with
surviving MCP server and child PIDs, `survivors=[77583, 77592]`,
`leaked=true`.
- Post-fix binary from this branch (`67e318148b`): verified both MCP
processes were gone after interrupting `codex exec`, `survivors=[]`,
`leaked=false`.

Eric Traut · 2026-04-28 09:29:57 -07:00

4e0cf945b7

[codex] Move config loading into codex-config (#19487 )

## Why

Config loading had become split across crates: `codex-config` owned the
config types and merge logic, while `codex-core` still owned the loader
that assembled the layer stack. This change consolidates that
responsibility in `codex-config`, so the crate that defines config
behavior also owns how configs are discovered and loaded.

To make that move possible without reintroducing the old dependency
cycle, the shell-environment policy types and helpers that
`codex-exec-server` needs now live in `codex-protocol` instead of
flowing through `codex-config`.

This also makes the migrated loader tests more deterministic on machines
that already have managed or system Codex config installed by letting
tests override the system config and requirements paths instead of
reading the host's `/etc/codex`.

## What Changed

- moved the config loader implementation from `codex-core` into
`codex-config::loader` and deleted the old `core::config_loader` module
instead of leaving a compatibility shim
- moved shell-environment policy types and helpers into
`codex-protocol`, then updated `codex-exec-server` and other downstream
crates to import them from their new home
- updated downstream callers to use loader/config APIs from
`codex-config`
- added test-only loader overrides for system config and requirements
paths so loader-focused tests do not depend on host-managed config state
- cleaned up now-unused dependency entries and platform-specific cfgs
that were surfaced by post-push CI

## Testing

- `cargo test -p codex-config`
- `cargo test -p codex-core config_loader_tests::`
- `cargo test -p codex-protocol -p codex-exec-server -p
codex-cloud-requirements -p codex-rmcp-client --lib`
- `cargo test --lib -p codex-app-server-client -p codex-exec`
- `cargo test --no-run --lib -p codex-app-server`
- `cargo test -p codex-linux-sandbox --lib`
- `cargo shear`
- `just bazel-lock-check`

## Notes

- I did not chase unrelated full-suite failures outside the migrated
loader surface.
- `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
failures on this machine, and Windows CI still shows unrelated
long-running/timeouting test noise outside the loader migration itself.

pakrym-oai · 2026-04-26 15:10:53 -07:00

9c3abcd46c

test: harden app-server integration tests (#19683 )

## Why

Windows Bazel runs in the permissions stack exposed that app-server
integration tests were launching normal plugin startup warmups in every
subprocess. Those warmups can call
`https://chatgpt.com/backend-api/plugins/featured` when a test is not
specifically exercising plugin startup, which adds slow background work,
noisy stderr, and dependence on external network state. The relevant
startup/featured-plugin behavior was introduced across #15042 and
#15264.

A few app-server tests also had long optional waits or unbounded cleanup
paths, making failures expensive to diagnose and contributing to slow
Windows shards. One external-agent config test from #18246 used a
GitHub-style marketplace source, which was enough to exercise the
pending remote-import path but also meant the background completion task
could attempt a real clone.

## What Changed

- Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks`
plumbing and a hidden debug-only
`--disable-plugin-startup-tasks-for-tests` app-server flag, so
integration tests can suppress startup plugin warmups without adding a
production env-var gate.
- Has the app-server test harness pass that hidden flag by default,
while opting plugin-startup coverage back in for tests that
intentionally exercise startup sync and featured-plugin warmup behavior.
- Lowers normal app-server subprocess logging from `info`/`debug` to
`warn` to avoid multi-megabyte stderr output in Bazel logs.
- Prevents the external-agent config test from attempting a real
marketplace clone by using an invalid non-local source while still
exercising the pending-import completion path.
- Bounds optional filesystem/realtime waits and fake WebSocket
test-server shutdown so failures produce targeted timeouts instead of
hanging a shard.
- Fixes the Unix script-resolution test in `rmcp-client` to exercise
PATH resolution directly and include the actual spawn error in failures.

## Verification

- `cargo check -p codex-app-server`
- `cargo clippy -p codex-app-server --tests -- -D warnings`
- `cargo test -p codex-rmcp-client
program_resolver::tests::test_unix_executes_script_without_extension`
- `cargo test -p codex-app-server --test all
external_agent_config_import_sends_completion_notification_after_pending_plugins_finish
-- --nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request --
--nocapture`
- Windows Local Bazel passed with this test-hardening bundle before it
was extracted from #19606.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19683

Michael Bolin · 2026-04-26 12:43:16 -07:00

ac2bffa443

[4/4] Honor Streamable HTTP MCP placement (#18584 )

Ahmed Ibrahim · 2026-04-24 15:03:55 -07:00

6de6eaa0c1

refactor: route Codex auth through AuthProvider (#18811 )

## Summary

This PR moves Codex backend request authentication from direct
bearer-token handling to `AuthProvider`.

The new `codex-auth-provider` crate defines the shared request-auth
trait. `CodexAuth::provider()` returns a provider that can apply all
headers needed for the selected auth mode.

This lets ChatGPT token auth and AgentIdentity auth share the same
callsite path:
- ChatGPT token auth applies bearer auth plus account/FedRAMP headers
where needed.
- AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
where needed.

Reference old stack: https://github.com/openai/codex/pull/17387/changes

## Callsite Migration

| Area | Change |
| --- | --- |
| backend-client | accepts an `AuthProvider` instead of a raw
token/header |
| chatgpt client/connectors | applies auth through
`CodexAuth::provider()` |
| cloud tasks | keeps Codex-backend gating, applies auth through
provider |
| cloud requirements | uses Codex-backend auth checks and provider
headers |
| app-server remote control | applies provider headers for backend calls
|
| MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches
from generic account getters |
| model refresh | treats AgentIdentity as Codex-backend auth |
| OpenAI file upload path | rejects non-Codex-backend auth before
applying headers |
| core client setup | keeps model-provider auth flow and allows
AgentIdentity through provider-backed OpenAI auth |

## Stack

1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
auth mode and startup task allocation
4. This PR: migrate Codex backend auth callsites through AuthProvider
5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
and load `CODEX_AGENT_IDENTITY`

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.

efrazer-oai · 2026-04-23 17:14:02 -07:00

5882f3f95e

Fix relative stdio MCP cwd fallback (#19031 )

Matthew Zeng · 2026-04-22 17:52:17 -07:00

8f0a92c1e5

[3/4] Add executor-backed RMCP HTTP client (#18583 )

### Why
The RMCP layer needs a Streamable HTTP client that can talk either
directly over `reqwest` or through the executor HTTP runner without
duplicating MCP session logic higher in the stack. This PR adds that
client-side transport boundary so remote Streamable HTTP MCP can reuse
the same RMCP flow as the local path.

### What
- Add a shared `rmcp-client/src/streamable_http/` module with:
  - `transport_client.rs` for the local-or-remote transport enum
  - `local_client.rs` for the direct `reqwest` implementation
  - `remote_client.rs` for the executor-backed implementation
  - `common.rs` for the small shared Streamable HTTP helpers
- Teach `RmcpClient` to build Streamable HTTP transports in either local
or remote mode while keeping the existing OAuth ownership in RMCP.
- Translate remote POST, GET, and DELETE session operations into
executor `http/request` calls.
- Preserve RMCP session expiry handling and reconnect behavior for the
remote transport.
- Add remote transport coverage in
`rmcp-client/tests/streamable_http_remote.rs` and keep the shared test
support in `rmcp-client/tests/streamable_http_test_support.rs`.

### Verification
- `cargo check -p codex-rmcp-client`
- online CI

### Stack
1. #18581 protocol
2. #18582 runner
3. #18583 RMCP client
4. #18584 manager wiring and local/remote coverage

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-22 17:38:04 -07:00

0e78ce80ee

Add turn-scoped environment selections (#18416 )

## Summary
- add experimental turn/start.environments params for per-turn
environment id + cwd selections
- pass selections through core protocol ops and resolve them with
EnvironmentManager before TurnContext creation
- treat omitted selections as default behavior, empty selections as no
environment, and non-empty selections as first environment/cwd as the
turn primary

## Testing
- ran `just fmt`
- ran `just write-app-server-schema`
- not run: unit tests for this stacked PR

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-21 17:48:33 -07:00

1d4cc494c9

chore: document intentional await-holding cases (#18423 )

## Why

This PR prepares the stack to enable Clippy await-holding lints that
were left disabled in #18178. The mechanical lock-scope cleanup is
handled separately; this PR is the documentation/configuration layer for
the remaining await-across-guard sites.

Without explicit annotations, reviewers and future maintainers cannot
tell whether an await-holding warning is a real concurrency smell or an
intentional serialization boundary.

## What changed

- Configures `clippy.toml` so `await_holding_invalid_type` also covers
`tokio::sync::{MutexGuard,RwLockReadGuard,RwLockWriteGuard}`.
- Adds targeted `#[expect(clippy::await_holding_invalid_type, reason =
...)]` annotations for intentional async guard lifetimes.
- Documents the main categories of intentional cases: active-turn state
transitions that must remain atomic, session-owned MCP manager accesses,
remote-control websocket serialization, JS REPL kernel/process
serialization, OAuth persistence, external bearer token refresh
serialization, and tests that intentionally serialize shared global or
session-owned state.
- For external bearer token refresh, documents the existing
serialization boundary: holding `cached_token` across the provider
command prevents concurrent cache misses from starting duplicate refresh
commands, and the current behavior is small enough that an explicit
expectation is easier to maintain than adding another synchronization
primitive.

## Verification

- `cargo clippy -p codex-login --all-targets`
- `cargo clippy -p codex-connectors --all-targets`
- `cargo clippy -p codex-core --all-targets`
- The follow-up PR #18698 enables `await_holding_invalid_type` and
`await_holding_lock` as workspace `deny` lints, so any undocumented
remaining offender will fail Clippy.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18423).
* #18698
* __->__ #18423

Michael Bolin · 2026-04-20 22:41:54 -07:00

d62421d322

refactor: use semaphores for async serialization gates (#18403 )

This is the second cleanup in the await-holding lint stack. The
higher-level goal, following https://github.com/openai/codex/pull/18178
and https://github.com/openai/codex/pull/18398, is to enable Clippy
coverage for guards held across `.await` points without carrying broad
suppressions.

The stack is working toward enabling Clippy's
[`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock)
lint and the configurable
[`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type)
lint for Tokio guard types.

Several existing fields used `tokio::sync::Mutex<()>` only as
one-at-a-time async gates. Those guards intentionally lived across
`.await` while an operation was serialized. A mutex over `()` suggests
protected data and trips the await-holding lint shape; a single-permit
`tokio::sync::Semaphore` expresses the intended serialization directly.

## What changed

- Replace `Mutex<()>` serialization gates with `Semaphore::new(1)` for
agent identity ensure, exec policy updates, guardian review session
reuse, plugin remote sync, managed network proxy refresh, auth token
refresh, and RMCP session recovery.
- Update call sites from `lock().await` / `try_lock()` to
`acquire().await` / `try_acquire()`.
- Map closed-semaphore errors into the existing local error types, even
though these semaphores are owned for the lifetime of their managers.
- Update session test builders for the new
`managed_network_proxy_refresh_lock` type.

## Verification

- The split stack was verified at the final lint-enabling head with
`just clippy`.





---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18403).
* #18698
* #18423
* #18418
* __->__ #18403

Michael Bolin · 2026-04-20 17:21:29 +00:00

5d5d610740

[5/6] Wire executor-backed MCP stdio (#18212 )

## Summary
- Add the executor-backed RMCP stdio transport.
- Wire MCP stdio placement through the executor environment config.
- Cover local and executor-backed stdio paths with the existing MCP test
helpers.

## Stack
```text
o  #18027 [6/6] Fail exec client operations after disconnect
│
@  #18212 [5/6] Wire executor-backed MCP stdio
│
o  #18087 [4/6] Abstract MCP stdio server launching
│
o  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-18 21:47:43 -07:00

996aa23e4c

refactor: narrow async lock guard lifetimes (#18211 )

Follow-up to https://github.com/openai/codex/pull/18178, where we called
out enabling the await-holding lint as a follow-up.

The long-term goal is to enable Clippy coverage for async guards held
across awaits. This PR is intentionally only the first, low-risk cleanup
pass: it narrows obvious lock guard lifetimes and leaves
`codex-rs/Cargo.toml` unchanged so the lint is not enabled until the
remaining cases are fixed or explicitly justified. It intentionally
leaves the active-turn/turn-state locking pattern alone because those
checks and mutations need to stay atomic.

## Common fixes used here

These are the main patterns reviewers should expect in this PR, and they
are also the patterns to reach for when fixing future `await_holding_*`
findings:

- **Scope the guard to the synchronous work.** If the code only needs
data from a locked value, move the lock into a small block, clone or
compute the needed values, and do the later `.await` after the block.
- **Use direct one-line mutations when there is no later await.** Cases
like `map.lock().await.remove(&id)` are acceptable when the guard is
only needed for that single mutation and the statement ends before any
async work.
- **Drain or clone work out of the lock before notifying or awaiting.**
For example, the JS REPL drains pending exec senders into a local vector
and the websocket writer clones buffered envelopes before it serializes
or sends them.
- **Use a `Semaphore` only when serialization is intentional across
async work.** The test serialization guards intentionally span awaited
setup or execution, so using a semaphore communicates "one at a time"
without holding a mutex guard.
- **Remove the mutex when there is only one owner.** The PTY stdin
writer task owns `stdin` directly; the old `Arc<Mutex<_>>` did not
protect shared access because nothing else had access to the writer.
- **Do not split locks that protect an atomic invariant.** This PR
deliberately leaves active-turn/turn-state paths alone because those
checks and mutations need to stay atomic. Those cases should be fixed
separately with a design change or documented with `#[expect]`.

## What changed

- Narrow scoped async mutex guards in app-server, JS REPL, network
approval, remote-control websocket, and the RMCP test server.
- Replace test-only async mutex serialization guards with semaphores
where the guard intentionally lives across async work.
- Let the PTY pipe writer task own stdin directly instead of wrapping it
in an async mutex.

## Verification

- `just fix -p codex-core -p codex-app-server -p codex-rmcp-client -p
codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness`
- `just clippy -p codex-core`
- `cargo test -p codex-core -p codex-app-server -p codex-rmcp-client -p
codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` was
run; the app-server suite passed, and `codex-core` failed in the local
sandbox on six otel approval tests plus
`suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`,
which appear to depend on local command approval/default rules and
`CODEX_SANDBOX_NETWORK_DISABLED=1` in this environment.

Michael Bolin · 2026-04-17 14:06:50 -07:00

1265df0ec2

[4/6] Abstract MCP stdio server launching (#18087 )

## Summary
- Move local MCP stdio process startup behind a launcher trait.
- Preserve existing local stdio behavior while making transport creation
explicit.

## Stack
```text
o  #18027 [6/6] Fail exec client operations after disconnect
│
o  #18212 [5/6] Wire executor-backed MCP stdio
│
@  #18087 [4/6] Abstract MCP stdio server launching
│
o  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-17 12:34:48 -07:00

92cf90277d

Support original-detail metadata on MCP image outputs (#17714 )

## Summary
- honor `_meta["codex/imageDetail"] == "original"` on MCP image content
and map it to `detail: "original"` where supported
- strip that detail back out when the active model does not support
original-detail image inputs
- update code-mode `image(...)` to accept individual MCP image blocks
- teach `js_repl` / `codex.emitImage(...)` to preserve the same hint
from raw MCP image outputs
- document the new `_meta` contract and add generic RMCP-backed coverage
across protocol, core, code-mode, and js_repl paths

Curtis 'Fjord' Hawthorne · 2026-04-15 14:43:33 -07:00

9e2fc31854

Send sandbox state through MCP tool metadata (#17763 )

## Changes

Allows MCPs to opt in to receiving sandbox config info through `_meta`
on model-initiated tool calls. This lets MCPs adhere to the thread's
sandbox if they choose to.

## Details

- Adds the `codex/sandbox-state-meta` experimental MCP capability.
- Tracks whether each MCP server advertises that capability.
- When a server opts in, `codex-core` injects the current `SandboxState`
into model-initiated MCP tool-call request `_meta`.

## Verification

- added an integration test for the capability

aaronl-openai · 2026-04-15 00:49:15 -07:00

42528a905d

Add supports_parallel_tool_calls flag to included mcps (#17667 )

## Why

For more advanced MCP usage, we want the model to be able to emit
parallel MCP tool calls and have Codex execute eligible ones
concurrently, instead of forcing all MCP calls through the serial block.

The main design choice was where to thread the config. I made this
server-level because parallel safety depends on the MCP server
implementation. Codex reads the flag from `mcp_servers`, threads the
opted-in server names into `ToolRouter`, and checks the parsed
`ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
on model-visible tool names, which can be incomplete in
deferred/search-tool paths or ambiguous for similarly named
servers/tools.

## What was added

Added `supports_parallel_tool_calls` for MCP servers.

Before:

```toml
[mcp_servers.docs]
command = "docs-server"
```

After:

```toml
[mcp_servers.docs]
command = "docs-server"
supports_parallel_tool_calls = true
```

MCP calls remain serial by default. Only tools from opted-in servers are
eligible to run in parallel. Docs also now warn to enable this only when
the server’s tools are safe to run concurrently, especially around
shared state or read/write races.

## Testing

Tested with a local stdio MCP server exposing real delay tools. The
model/Responses side was mocked only to deterministically emit two MCP
calls in the same turn.

Each test called `query_with_delay` and `query_with_delay_2` with `{
"seconds": 25 }`.

| Build/config | Observed | Wall time |
| --- | --- | --- |
| main with flag enabled | serial | `58.79s` |
| PR with flag enabled | parallel | `31.73s` |
| PR without flag | serial | `56.70s` |

PR with flag enabled showed both tools start before either completed;
main and PR-without-flag completed the first delay before starting the
second.

Also added an integration test.

Additional checks:

- `cargo test -p codex-tools` passed
- `cargo test -p codex-core
mcp_parallel_support_uses_exact_payload_server` passed
- `git diff --check` passed

josiah-openai · 2026-04-13 15:16:34 -07:00

937dd3812d

Use AbsolutePathBuf in skill loading and codex_home (#17407 )

Helps with FS migration later

pakrym-oai · 2026-04-13 10:26:51 -07:00

ac82443d07

fix(mcp) pause timer for elicitations (#17566 )

## Summary
Stop counting elicitation time towards mcp tool call time. There are
some tradeoffs here, but in general I don't think time spent waiting for
elicitations should count towards tool call time, or at least not
directly towards timeouts.

Elicitations are not exactly like exec_command escalation requests, but
I would argue it's ~roughly equivalent.

## Testing
- [x] Added unit tests
- [x] Tested locally

Dylan Hurd · 2026-04-12 16:06:17 -07:00

68a1d82a41

Add output_schema to code mode render (#17210 )

This updates code-mode tool rendering so MCP tools can surface
structured output types from their `outputSchema`.

What changed:
- Detect MCP tool-call result wrappers from the output schema shape
instead of relying on tool-name parsing or provenance flags.
- Render shared TypeScript aliases once for MCP tool results
(`CallToolResult`, `ContentBlock`, etc.) so multiple MCP tool
declarations stay compact.
- Type `structuredContent` from the tool definition's `outputSchema`
instead of rendering it as `unknown`.
- Update the shared MCP aliases to match the MCP draft `CallToolResult`
schema more closely.

Example:
- Before: `declare const tools: { mcp__rmcp__echo(args: { env_var?:
string; message: string; }): Promise<{ _meta?: unknown; content:
Array<unknown>; isError?: boolean; structuredContent?: unknown; }>; };`
- After: `declare const tools: { mcp__rmcp__echo(args: { env_var?:
string; message: string; }): Promise<CallToolResult<{ echo: string; env:
string | null; }>>; };`

Vivian Fang · 2026-04-10 11:41:44 +00:00

7bbe3b6011

112 Commits