codex

fix(remote-control): avoid server token refresh retry storms (#30201)

## Why

Remote-control websocket reconnects and pairing requests proactively
refresh their server token. When `/server/refresh` returned a transient
error such as `502`, the still-valid token was no longer treated as
usable for connecting, which caused reconnect failures and repeated
refresh attempts that could amplify an upstream incident.

## What Changed

- Start proactive refresh five minutes before token expiry and
distinguish it from a required refresh for missing or expired tokens.
- Continue websocket and pairing operations with the existing valid
token after `429`, `5xx`, or timeout failures.
- Share an in-memory `next_refresh_at` throttle across websocket and
pairing callers, honoring both `Retry-After` formats and otherwise using
a jittered 24–36 second delay.
- Keep required refreshes strict, preserve `404` enrollment replacement,
and clear token/throttle state for `401` and `403` auth recovery.
- Preserve refresh response metadata internally and add focused
wire-level and integration coverage.
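
A minimal sketch of the failure handling and throttle described above. The names `AfterFailure`, `after_refresh_failure`, and `next_refresh_delay` are hypothetical, the `jitter` argument stands in for a random source, and treating every other status as a hard failure is an assumption of this sketch rather than something the change states.

```rust
use std::time::Duration;

/// What a caller does after a refresh attempt fails (hypothetical names).
#[derive(Debug, PartialEq)]
enum AfterFailure {
    /// Keep using the still-valid token; refresh again later.
    ContinueWithExistingToken,
    /// `404`: the enrollment is gone and must be replaced.
    ReplaceEnrollment,
    /// `401`/`403`: clear token and throttle state, then re-authenticate.
    ClearAndReauthenticate,
    /// No usable token remains, so the operation fails.
    Fail,
}

/// `status` is the HTTP status, or `None` for a request or body timeout.
fn after_refresh_failure(status: Option<u16>, token_still_valid: bool) -> AfterFailure {
    match status {
        Some(404) => AfterFailure::ReplaceEnrollment,
        Some(401) | Some(403) => AfterFailure::ClearAndReauthenticate,
        // Transient failures only help when the old token is still valid.
        None | Some(429) | Some(500..=599) if token_still_valid => {
            AfterFailure::ContinueWithExistingToken
        }
        // Assumption: anything else, or a required refresh, is a failure.
        _ => AfterFailure::Fail,
    }
}

/// Honor `Retry-After` when present; otherwise wait 24 to 36 seconds.
/// `jitter` is a fraction in [0, 1] supplied by the caller.
fn next_refresh_delay(retry_after: Option<Duration>, jitter: f64) -> Duration {
    retry_after
        .unwrap_or_else(|| Duration::from_secs_f64(24.0 + 12.0 * jitter.clamp(0.0, 1.0)))
}
```

Sharing one computed deadline between websocket and pairing callers is what keeps a single `Retry-After` from being ignored by the other path.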

## Verification

Added behavioral coverage proving that:

- a valid near-expiry token still completes websocket and pairing
requests after transient refresh failures;
- `Retry-After` suppresses a subsequent refresh across websocket and
pairing callers;
- request and response-body timeouts are classified as transient;
- an expired token, including one that expires during refresh, cannot
proceed to websocket connection;
- auth failures clear the attempted token without overwriting a
concurrently rotated token.

Anton Panasenko · 2026-06-26 17:34:52 -07:00

d047c33a1b

Project selected plugin runtime by environment availability (#30093)

## Why

Selected plugin metadata is stable, but MCP processes are live runtime
state. They need different lifetimes:

- the MCP extension caches manifest, MCP, and connector declarations for
each stable selected root;
- each model step projects that cached metadata through the roots that
resolved as ready for that exact step;
- the MCP manager is rebuilt only when that availability projection
changes.

This matches executor skills: both features consume the same resolved
step roots instead of inferring readiness from the turn's selected
environments.

## Behavior

```text
E1 not ready for this step
  -> no E1 MCP servers or connectors
  -> cached plugin metadata stays in ext/mcp

E1 becomes ready
  -> reuse cached metadata
  -> publish one MCP runtime containing E1 capabilities

same ready roots on the next step
  -> reuse the exact runtime; no rediscovery and no MCP restart

resume
  -> create new extension thread state and a new MCP runtime
```
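
The reuse rule can be sketched roughly as follows. `StepProjector`, `Runtime`, and the string root IDs are hypothetical stand-ins, not the types used in `ext/mcp`; the sketch only shows that cached metadata survives unavailability and that the runtime is rebuilt only when the ready-root set changes.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::sync::Arc;

/// Stand-in for a live MCP runtime; `servers` lists what it exposes.
#[derive(Debug)]
struct Runtime {
    servers: Vec<String>,
}

#[derive(Default)]
struct StepProjector {
    /// Cached declarations per selected root; availability never evicts them.
    metadata: BTreeMap<String, Vec<String>>,
    /// Last published runtime and the ready-root set it was built from.
    current: Option<(BTreeSet<String>, Arc<Runtime>)>,
}

impl StepProjector {
    fn runtime_for_step(&mut self, ready: &BTreeSet<String>) -> Arc<Runtime> {
        if let Some((roots, runtime)) = &self.current {
            if roots == ready {
                // Unchanged projection: reuse the exact runtime.
                return Arc::clone(runtime);
            }
        }
        // Projection changed: build a runtime from cached metadata only.
        let servers = ready
            .iter()
            .filter_map(|root| self.metadata.get(root))
            .flatten()
            .cloned()
            .collect();
        let runtime = Arc::new(Runtime { servers });
        self.current = Some((ready.clone(), Arc::clone(&runtime)));
        runtime
    }
}
```

Comparing the whole ready set, rather than tracking individual transitions, keeps the decision a single equality check per step.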

All model-facing consumers use the same step snapshot:

```text
resolved selected roots
        |
        v
extension MCP/connector projection
        |
        v
{ MCP config, connector snapshot, MCP manager }
        |
        +-> advertise model tools
        +-> build app/connector tools
        +-> execute MCP calls
```

## Cache contract

The existing MCP extension owns a cache keyed by the full
`SelectedCapabilityRoot`:

```rust
let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
```

The cache lives with extension thread state. Environment availability
filters projection but does not invalidate metadata. Resume creates new
thread state. There is no file watcher or executor generation because
contents behind a stable environment/root are assumed stable.

## What changes

- Keeps executor plugin discovery and cached metadata in `ext/mcp`.
- Caches MCP and connector declarations together per selected root.
- Uses the step's already-resolved capability roots, including lazy
environments that are not turn environments.
- Reuses the current MCP runtime when the ready-root projection is
unchanged.
- Uses the same step MCP manager and connector snapshot for
model-visible tools and execution.
- Resolves direct thread-scoped MCP requests from the current
selected-root projection.

## Deliberately out of scope

- `app/list` remains based on the latest global host-plugin state; this
PR does not make its response or notifications thread-specific.
- `required = true` startup semantics do not apply to delayed executor
MCP activation.
- No filesystem/content invalidation.
- No transport-disconnect watcher.
- No executor generations or environment replacement semantics.
- No client sharing across complete manager replacements.

## Stack

1. Extension-owned World State sections.
2. Project executor skills through World State.
3. Pin one MCP runtime to each model step.
4. **This PR:** project selected MCP and connector state from
extension-owned metadata.
5. Integration coverage for selected capability availability and resume.

## Verification

- `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
- The stacked integration PR covers unavailable-to-ready activation,
unchanged-runtime reuse, skills, MCP tools, connector attribution, and
cold resume.

jif · 2026-06-26 01:36:44 +01:00

3095ea9c3d

Read connector declarations from executor plugins (#29852)

## Why

Selected capability roots can live on a different executor and operating
system from app-server. Their connector declarations must therefore be
read through the executor that owns the package, without converting
executor URIs into host paths.

This PR adds that authority-bound reader without activating connectors
or changing thread startup.

## What changed

- Add a small `codex-connectors-extension` crate for executor-owned
connector I/O.
- Read only the app configuration explicitly declared by the resolved
plugin manifest.
- Read through the `ExecutorFileSystem` retained by
`ResolvedExecutorPlugin`; there is no host-filesystem fallback or
default-file probe.
- Keep `PathUri` values intact so Windows, Unix, and remote executor
paths work from any orchestrator OS.
- Return full `AppDeclaration` values so the caller retains declaration
names and categories for routing.
- Preserve the selected plugin ID and exact executor URI in read and
parse errors.

The contract is intentionally narrow: selected packages are trusted and
valid, and packages that provide connectors explicitly declare their app
configuration.
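
As a rough illustration of that contract, the sketch below reads only a declared configuration through an executor-owned filesystem and keeps the plugin ID and exact URI in errors. `ExecutorFs`, `ResolvedPlugin`, `ReadError`, and the `exec://` URI form are hypothetical simplifications, not the real `ExecutorFileSystem`, `ResolvedExecutorPlugin`, or `PathUri` types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the executor-owned filesystem.
trait ExecutorFs {
    fn read_to_string(&self, uri: &str) -> Result<String, String>;
}

/// Hypothetical resolved plugin: the manifest names the app config explicitly.
struct ResolvedPlugin<'a> {
    plugin_id: String,
    /// Executor URI, kept verbatim and never converted to a host path.
    declared_app_config: Option<String>,
    fs: &'a dyn ExecutorFs,
}

#[derive(Debug, PartialEq)]
struct ReadError {
    plugin_id: String,
    uri: String,
    message: String,
}

fn read_declared_app_config(plugin: &ResolvedPlugin<'_>) -> Result<Option<String>, ReadError> {
    // No declaration means no connectors; no default file is probed.
    let Some(uri) = &plugin.declared_app_config else {
        return Ok(None);
    };
    // The only read path is the executor filesystem; there is no host fallback.
    plugin.fs.read_to_string(uri).map(Some).map_err(|message| ReadError {
        plugin_id: plugin.plugin_id.clone(),
        uri: uri.clone(),
        message,
    })
}

/// In-memory filesystem used only to exercise the sketch.
struct InMemoryFs(HashMap<String, String>);

impl ExecutorFs for InMemoryFs {
    fn read_to_string(&self, uri: &str) -> Result<String, String> {
        self.0.get(uri).cloned().ok_or_else(|| "not found".to_string())
    }
}
```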

## Stack scope

This PR is stacked on #29851. It only provides the executor-backed
reader. #29856 resolves selected roots at thread start, freezes their
connector snapshot, and contains the remote-capable end-to-end authority
test for the complete path.

jif · 2026-06-24 23:56:50 +01:00

9ff8068880

[codex] Inject agent graph store into ThreadManager (#29736)

Pick up the AgentGraphStore migration.

- Inject an explicit optional agent graph store into `ThreadManager`
- Move all calls to spawn, close, recursive resume, and
subtree/archive/delete/feedback traversal through it
- Keep using `LocalAgentGraphStore` when SQLite is available

This required some changes to the interface to deal with futures:

- The interface now matches `ThreadStore`'s object-safe pattern by
returning a boxed `AgentGraphStoreFuture` directly, allowing
`ThreadManager` to hold `Arc<dyn AgentGraphStore>`

*Slight behavior change!* Unfiltered subtree enumeration now performs a
single all-status breadth-first traversal, so a closed grandchild
beneath an open edge is included; the previous Open-then-Closed
traversals could not cross mixed-status paths and silently omitted it.
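
The behavior change can be reproduced with a small sketch. The `Graph` alias, `EdgeStatus`, and `descendants` are hypothetical and only model the traversal rule; they are not the `AgentGraphStore` interface.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

#[derive(Clone, Copy, Debug, PartialEq)]
enum EdgeStatus {
    Open,
    Closed,
}

/// parent -> [(child, status of the parent-to-child edge)]
type Graph = HashMap<&'static str, Vec<(&'static str, EdgeStatus)>>;

/// Breadth-first traversal; a `filter` of `None` follows edges of every status.
fn descendants(
    graph: &Graph,
    root: &'static str,
    filter: Option<EdgeStatus>,
) -> Vec<&'static str> {
    let mut seen = HashSet::from([root]);
    let mut queue = VecDeque::from([root]);
    let mut out = Vec::new();
    while let Some(node) = queue.pop_front() {
        for &(child, status) in graph.get(node).into_iter().flatten() {
            // Skip edges the filter rejects and nodes already visited.
            if filter.map_or(true, |wanted| wanted == status) && seen.insert(child) {
                out.push(child);
                queue.push_back(child);
            }
        }
    }
    out
}
```

With `root -Open-> child -Closed-> grandchild`, an Open-only pass stops at `child` and a Closed-only pass finds nothing from `root`, so their union misses `grandchild`; the unfiltered pass reaches it.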

Tom · 2026-06-24 13:24:10 -07:00

ece1dfece0

protocol: separate app and exec RPC ownership (#29714)

## Why

The app-server and exec-server expose separate JSON-RPC APIs, but
exec-server currently sources its serialized protocol and envelope types
through app-server-oriented code. Giving each API an explicit owner
makes the crate boundary legible without introducing shared generic
envelopes.

## What changed

- Added `codex-exec-server-protocol` to own exec DTOs, process IDs, and
JSON-RPC envelopes.
- Updated exec-server clients, transports, handlers, and tests to use
the new crate.
- Exposed app-server's existing JSON-RPC types through a public `rpc`
module while retaining root re-exports.
- Preserved existing wire shapes, including exec `PathUri` behavior.

## Stack

This is PR 1 of 6. Next: [PR
#29721](https://github.com/openai/codex/pull/29721), which moves auth
mode below the app wire boundary.

## Validation

- Exec-server protocol and server coverage passed in the focused
protocol test runs.
- App-server protocol schema fixtures passed.

Adam Perry @ OpenAI · 2026-06-23 22:37:31 +00:00

829f5b6b59

Update rmcp to 1.8.0 (#29634)

## Summary

- Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0.
- Adapt to the new shared `peer_info` return type.
- Box OAuth status discovery at the MCP boundary to keep the expanded
future type from overflowing Rust's trait recursion limit.

This brings in custom OAuth HTTP client support from
[modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).

jif · 2026-06-23 15:25:28 +01:00

bbe1006890

Share resumed rollout history (#28426)

## Summary

Resuming a persisted thread currently deep-clones its complete rollout
history several times. `InitialHistory` is retained for the app-server
response, copied into thread persistence, and copied again by read-only
accessors. These copies scale with the complete rollout rather than the
bounded model context and add measurable latency for large sessions.

This change stores resumed rollout history in `Arc<Vec<RolloutItem>>`.
Rollout loading wraps the parsed vector once, while app-server response
construction, session initialization, and thread persistence share it
through inexpensive `Arc` clones. Read-only history access now returns a
borrowed slice, and fork paths use `Arc::unwrap_or_clone` where they
genuinely need mutable ownership. Rollout reconstruction also consumes
its temporary context instead of cloning the reconstructed model
history.

The serialized representation remains unchanged. In an artificial 123 MB
rollout benchmark, sharing resumed history reduced cold resume latency
by roughly 9–10%. The affected crates compile with their test targets,
all 80 thread-store tests pass, and the Bazel dependency lock remains
valid.
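
A minimal sketch of the sharing pattern, assuming a simplified `RolloutItem` and a hypothetical `ResumedHistory` wrapper; the real `InitialHistory` type has more structure than this.

```rust
use std::sync::Arc;

/// Stand-in for the real rollout item type.
#[derive(Clone, Debug, PartialEq)]
struct RolloutItem(String);

#[derive(Clone)]
struct ResumedHistory {
    items: Arc<Vec<RolloutItem>>,
}

impl ResumedHistory {
    /// Wrap the parsed vector once at load time.
    fn from_parsed(items: Vec<RolloutItem>) -> Self {
        Self { items: Arc::new(items) }
    }

    /// Read-only access borrows a slice instead of cloning the vector.
    fn items(&self) -> &[RolloutItem] {
        &self.items
    }

    /// Fork paths take ownership; this clones only if the `Arc` is shared.
    fn into_owned(self) -> Vec<RolloutItem> {
        Arc::unwrap_or_clone(self.items)
    }
}
```

Cloning `ResumedHistory` only bumps a reference count, so the response, session, and persistence paths can each hold the history without copying the items.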

Charlie Marsh · 2026-06-23 10:23:25 -04:00

330ae6a516

PAC 4 - Add macOS system proxy resolver (#26709)

## Summary

Stacked on #26708.

Adds the macOS implementation of the shared system-proxy contract. This
allows Codex-owned auth clients to use the route macOS selects for each
auth URL through SystemConfiguration and CFNetwork, including PAC and
WPAD results.

The `respect_system_proxy` feature is disabled by default, so existing
client behavior remains unchanged unless explicitly enabled.

## Implementation

- Adds the macOS-only `system-configuration` dependency to
`codex-client`.
- Dispatches system-proxy resolution to `outbound_proxy/macos.rs` on
macOS.
- Reads system proxy settings from `SCDynamicStore` and resolves the
target URL with `CFNetworkCopyProxiesForURL`.
- Executes PAC URLs and inline PAC JavaScript through a bounded run loop
with a five-second timeout.
- Handles `DIRECT`, HTTP proxies, and CFNetwork HTTPS entries using HTTP
CONNECT; unsupported SOCKS entries map to `UnsupportedProxyScheme`.
- Builds concrete proxy URLs from host and port entries, including IPv6
host bracketing.
- Maps results into the shared `SystemProxyDecision::{Direct, Proxy,
Unavailable}` contract.
- Hashes URL-specific cache keys so PAC decisions remain distinct
without retaining raw request URLs or query strings.
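
Two of the smaller steps above, IPv6 host bracketing and hashed cache keys, might look like the sketch below. The function names are hypothetical, the `http://` scheme for the proxy URL is an assumption, and `DefaultHasher` is used only for illustration; the description does not say which hash the implementation uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::Ipv6Addr;

/// Build a concrete proxy URL from a host and port entry.
/// IPv6 literals are bracketed so the port separator stays unambiguous.
fn proxy_url(host: &str, port: u16) -> String {
    if host.parse::<Ipv6Addr>().is_ok() {
        format!("http://[{host}]:{port}")
    } else {
        format!("http://{host}:{port}")
    }
}

/// Derive a cache key from the request URL without storing the URL itself.
fn cache_key(request_url: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    request_url.hash(&mut hasher);
    hasher.finish()
}
```

Keying on a hash keeps per-URL PAC decisions distinct while the cache holds no raw URLs or query strings.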

## End-user behavior

- Disabled/default: existing client behavior is unchanged.
- Enabled with `[features.respect_system_proxy]`:
  - macOS auth clients honor system proxy configuration, PAC, and WPAD;
  - valid OS/PAC `DIRECT` decisions use a direct connection;
  - unavailable system resolution falls back to explicit environment proxy
    variables, then `DIRECT`, through the shared contract from #26707.
- Unsupported proxy schemes are not silently translated into another
route.
- Custom CA handling remains separate from proxy selection.
- Known limitation: only the first supported system/PAC candidate is
used. Subsequent proxy or `DIRECT` candidates are not attempted after a
connection failure. This matches the current Windows behavior and leaves
room for future ordered-fallback support.

## Tests

- `just test -p codex-client` — 34 tests passed.
- `just clippy -p codex-client`
- `just fmt`
- `just bazel-lock-check`

canvrno-oai · 2026-06-22 17:56:04 -07:00

b16d2858f5

chore: advance tungstenite fork pins (#29480)

## Why

`openai-oss-forks/tokio-tungstenite` now includes the updated
`tungstenite` fork revision from
[openai-oss-forks/tokio-tungstenite#3](https://github.com/openai-oss-forks/tokio-tungstenite/pull/3).
Codex should consume the merged fork commit and resolve its direct and
transitive `tungstenite` dependencies to the same revision instead of
retaining the older pins.

## What Changed

- Advanced the `tokio-tungstenite` git pin to
`0e5b2d73aa18dd9f0a50ee9ff199d5aef7594186`.
- Advanced the `tungstenite` fork pin to
`4fffad30fe373adbdcffab9545e9e9bf4f2fc19f` and adjusted the patch source
so the transitive dependency resolves to that revision.
- Updated `Cargo.lock` and `MODULE.bazel.lock` to match the dependency
graph.

Anton Panasenko · 2026-06-22 14:24:02 -07:00

0a9b7d2c36

chore(deps): advance tokio-tungstenite (#29132)

## Why

Responses websocket connections use `tokio-tungstenite`. When DNS
returns an unusable native IPv6 address before a working IPv4 address,
sequential dialing can consume Codex's outer websocket timeout before
reaching IPv4. The merged fork change adds Happy Eyeballs-style
alternate-family racing so websocket dialing matches the recovery
behavior already present in the HTTP path.

## What Changed

Advance the workspace `tokio-tungstenite` patch from `132f5b39` to
merged commit `e5e64b86`, and update the matching lockfile source. The
new revision comes from
[openai-oss-forks/tokio-tungstenite#1](https://github.com/openai-oss-forks/tokio-tungstenite/pull/1).

Anton Panasenko · 2026-06-19 12:02:12 -07:00

abd901770e

exec-server: add Noise relay transport (#26242)

## Why

Rendezvous forwards traffic between the orchestrator and exec-server.
The endpoints need to authenticate each other and encrypt that traffic
without trusting Rendezvous with plaintext or endpoint keys.

## Changes

- Adds a hybrid Noise IK channel through Clatter using X25519,
ML-KEM-768, AES-256-GCM, and SHA-256.
- Binds each handshake to `environment_id`, `executor_registration_id`,
and `stream_id`.
- Pins the registry-provided executor key and carries the harness
authorization inside the encrypted handshake.
- Orders relay frames before consuming Noise nonces and fragments large
JSON-RPC messages into bounded records.
- Bounds handshake payloads, frames, streams, and message reassembly.
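
The fragmentation and bounded reassembly can be pictured with the sketch below. The `Record` layout, the `last` flag, and both size limits are assumptions made for illustration; the actual wire format and bounds are not given in this description, and encryption is left out entirely.

```rust
/// Hypothetical bounds, kept tiny so the example is easy to follow.
const MAX_RECORD: usize = 4;
const MAX_MESSAGE: usize = 16;

/// One bounded record: a fragment plus a flag marking the final fragment.
#[derive(Debug, PartialEq)]
struct Record {
    last: bool,
    payload: Vec<u8>,
}

/// Split a message into records of at most `MAX_RECORD` bytes.
fn fragment(message: &[u8]) -> Result<Vec<Record>, &'static str> {
    if message.is_empty() || message.len() > MAX_MESSAGE {
        return Err("message size out of bounds");
    }
    let count = (message.len() + MAX_RECORD - 1) / MAX_RECORD;
    Ok(message
        .chunks(MAX_RECORD)
        .enumerate()
        .map(|(i, chunk)| Record { last: i + 1 == count, payload: chunk.to_vec() })
        .collect())
}

/// Rebuild a message, rejecting oversized or incorrectly terminated input.
fn reassemble(records: &[Record]) -> Result<Vec<u8>, &'static str> {
    if records.is_empty() {
        return Err("no records");
    }
    let mut out = Vec::new();
    for (i, record) in records.iter().enumerate() {
        if record.payload.len() > MAX_RECORD {
            return Err("record too large");
        }
        if out.len() + record.payload.len() > MAX_MESSAGE {
            return Err("message too large");
        }
        // Only the final record may carry the `last` flag.
        if record.last != (i + 1 == records.len()) {
            return Err("misplaced final record");
        }
        out.extend_from_slice(&record.payload);
    }
    Ok(out)
}
```

Checking the running total before appending is what keeps a peer from growing the reassembly buffer past the bound.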

Runtime activation is in
[openai/codex#26245](https://github.com/openai/codex/pull/26245).

## Stack

1. **[openai/codex#26242](https://github.com/openai/codex/pull/26242)**:
Noise channel and relay transport
2. [openai/codex#26245](https://github.com/openai/codex/pull/26245):
remote registration and runtime activation

## Verification

- `just test -p codex-exec-server`
- Oversized initiator payload regression coverage
- `just fix -p codex-exec-server`
- `just bazel-lock-check`
- `cargo shear`

---------

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-06-15 16:39:41 -07:00

428cd44154

Use aws-lc-rs for rustls crypto provider (#27706)

## Why

Some enterprise TLS proxies issue certificate chains signed with
`ecdsa_secp521r1_sha512` / `ECDSA_NISTP521_SHA512`. Custom CA
configuration such as `SSL_CERT_FILE` can add the right trust root, but
it cannot make `rustls`'s `ring` verifier support a certificate
signature algorithm it does not advertise.

That can still break TLS after the CA bundle is configured, including on
Rust websocket paths that call the shared
`ensure_rustls_crypto_provider()` helper, such as the Responses
websocket connector and remote app-server client:

- [`codex-api/src/endpoint/responses_websocket.rs`](https://github.com/openai/codex/blob/eddc5c75ed527a8348bfcaa85692e53189600833/codex-rs/codex-api/src/endpoint/responses_websocket.rs#L441)
- [`app-server-client/src/remote.rs`](https://github.com/openai/codex/blob/eddc5c75ed527a8348bfcaa85692e53189600833/codex-rs/app-server-client/src/remote.rs#L718)

The `aws-lc-rs` `rustls` provider supports this P-521/SHA-512
certificate signature scheme, so use it as Codex's process-wide `rustls`
provider.

## What Changed

- Switch the workspace `rustls` feature from `ring` to `aws_lc_rs`.
- Update `codex-utils-rustls-provider` to install
`rustls::crypto::aws_lc_rs::default_provider()`.
- Add an assertion and integration test that the installed provider
supports `ECDSA_NISTP521_SHA512`.

## Verification

```shell
just fmt
just test -p codex-utils-rustls-provider
just bazel-lock-update
just bazel-lock-check
```

malsamiri-oai · 2026-06-15 11:32:13 -07:00

d5a8117e08

[codex] Pin bundled SQLite to fixed WAL-reset version (#27992)

## Summary

Prevent dependency refreshes from silently downgrading Codex's bundled
SQLite to a release affected by the WAL-reset corruption bug.

SQLx 0.9 accepts a broad `libsqlite3-sys` range. An unrelated lock
refresh therefore moved Codex from `libsqlite3-sys 0.37.0` back to
`0.35.0`, changing the bundled SQLite runtime from 3.51.3 to 3.50.2.
SQLite documents the affected versions and fix in [The WAL Reset
Bug](https://www.sqlite.org/wal.html#the_wal_reset_bug) and the [SQLite
3.51.3 changelog](https://www.sqlite.org/changes.html#version_3_51_3).

Gabriel Peal · 2026-06-13 21:28:31 -07:00

73c58011b3

Remove TUI realtime voice support (#27801)

## Why

Remove realtime audio support from the TUI.

## What Changed

- Removed the TUI `/realtime` and realtime `/settings` command paths.
- Deleted TUI voice capture/playback, WebRTC session handling,
audio-device selection UI, and recording-meter code.
- Removed TUI realtime tests and snapshots that covered the deleted
surfaces.
- Dropped the TUI-only `cpal` and `codex-realtime-webrtc` dependencies
and refreshed the Rust/Bazel locks.

Eric Traut · 2026-06-12 14:20:55 -07:00

576f603440

code-mode standalone: extract protocol and add host crate (#27724)

This is phase 1 of a 4-phase stack:
1. **Add protocol and host crates for new IPC code mode implementation**
2. Create the new standalone binary
3. Create a new IPC `CodeModeSessionProvider` to use new binary
4. Remove V8 from core and use only the IPC provider


## Add protocol and host crates for new IPC code mode implementation

Establish a clean process boundary without changing the existing
in-process behavior.

- Add the `codex-code-mode-protocol` crate for shared session, runtime,
response, and tool-definition types.
- Move protocol-facing code out of the V8-backed implementation.
- Add a buildable `codex-code-mode-host` crate as the foundation for the
standalone process.
- Keep the existing in-process runtime as the active implementation.

Channing Conger · 2026-06-11 22:37:26 -07:00

aa46f2debf

[codex] parallelize release code generation (#27702)

The release profile still uses one codegen unit, which serializes LLVM
code generation within each crate. That setting was selected alongside
fat LTO for optimization quality and binary size, but releases now use
ThinLTO and code generation dominates the critical-path build.

Use four codegen units. On an Apple M4 Max with 16 cores and 128 GiB
RAM, using rustc 1.96.0, four and eight units took 507.486 and 505.325
seconds respectively. Four therefore keeps the build-time gain while
limiting the stripped `codex` increase to 14.7%, compared with 21.5% at
eight units. The gzip-compressed binary grows 7.8% at four units.

The one-unit build from an empty target directory took 981.150 seconds.
That comparison also populated dependency and native build caches, so it
is directional rather than controlled. It agrees with the earlier clean
matrix where eight units reduced 671 seconds to 303 seconds:
https://gist.github.com/anp/4b88393a0acd35783d9f42156f3243d5

At the local 48% reduction, the current release's 55m22s critical-path
macOS Cargo step would save about 26 minutes from the 71m28s workflow:
https://github.com/openai/codex/actions/runs/27367405663

The prompt-image medians ranged from 3.9% faster to 0.9% slower. CLI
startup shifted by 1-2 ms while user and system CPU time were unchanged.

This is a draft because the release-latency improvement may not justify
the binary-size increase.

Tamir Duberstein · 2026-06-11 19:44:36 -07:00

e23d4df4ff

[codex] Remove async_trait from first-party code (#27475)

## Why

First-party async traits should expose their `Send` contracts explicitly
without requiring `async_trait`. This completes the migration pattern
established in #27303 and #27304.

## What changed

- Replaced the remaining first-party `async_trait` traits with native
return-position `impl Future + Send` where statically dispatched and
explicit boxed `Send` futures where object safety is required.
- Kept implementations behavior-preserving, outlining existing async
bodies into inherent methods where that keeps the diff reviewable.
- Removed all direct first-party `async-trait` dependencies and the
workspace dependency declaration.
- Added a cargo-deny policy that permits `async-trait` only through the
remaining transitive wrapper crates.
- Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
keep the full cargo-deny check passing.
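
The two replacement shapes can be sketched as follows. `Fetch`, `DynFetch`, `Echo`, and the tiny `block_on` helper are hypothetical and exist only to make the example self-contained; `std::future::ready` stands in for a real async body.

```rust
use std::future::{ready, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Statically dispatched: the `Send` contract is visible in the signature.
trait Fetch {
    fn fetch(&self, key: &str) -> impl Future<Output = String> + Send;
}

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Object-safe variant: returns an explicitly boxed `Send` future.
trait DynFetch: Send + Sync {
    fn fetch_dyn<'a>(&'a self, key: &'a str) -> BoxFuture<'a, String>;
}

struct Echo;

impl Fetch for Echo {
    fn fetch(&self, key: &str) -> impl Future<Output = String> + Send {
        ready(format!("value:{key}"))
    }
}

impl DynFetch for Echo {
    fn fetch_dyn<'a>(&'a self, key: &'a str) -> BoxFuture<'a, String> {
        Box::pin(ready(format!("value:{key}")))
    }
}

/// Waker that does nothing; enough for futures that are ready immediately.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for the example only; it spins until the future is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Only the boxed form can sit behind `Arc<dyn DynFetch>`; the `impl Future` form avoids the allocation where the concrete type is known at compile time.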

## Validation

- `just test -p codex-exec-server`: 216 passed, 2 skipped.
- `just test -p codex-model-provider`: 39 passed.
- `just test -p codex-core` and `just test`: changed tests passed;
remaining failures are environment-sensitive suites unrelated to this
migration.
- `cargo deny check`
- `just fix`
- `just fmt`
- `cargo shear`
- `just bazel-lock-check`

Adam Perry @ OpenAI · 2026-06-11 18:16:39 -07:00

5a56caf18c

[codex] Load user instructions through an injected provider (#27101)

## Why

We want to remove implicit use of `$CODEX_HOME` from `codex-core` and
make embedders responsible for supplying user-level instructions. This
also ensures user instructions load when no primary environment is
selected.

## What changed

Stacked on #27415, which makes `codex exec` surface thread-scoped
runtime warnings.

- Added `UserInstructionsProvider` to `codex-extension-api`, with
absolute source attribution and recoverable loading warnings.
- Added `codex-home` with the filesystem-backed provider for
`AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback,
trimming, lossy UTF-8 handling, and the existing uncapped global
instruction size.
- Removed global instruction loading from `Config` and require
`ThreadManager` callers to inject a provider.
- Load provider instructions once for each fresh root runtime, including
runtimes without a primary environment. Running sessions retain their
snapshot, while child agents inherit the parent snapshot without
invoking the provider.
- Keep provider instructions separate while loading project `AGENTS.md`,
then assemble the model-visible instructions with the existing ordering,
source attribution, warning, and turn-context behavior.
- Wired the Codex home provider through the CLI, app server, MCP server,
core facade, and thread-manager sample.
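
The precedence and normalization rules for the filesystem-backed provider could be sketched like this. `ReadFile`, `UserInstructions`, and `load_user_instructions` are hypothetical names, and treating an empty override file as a reason to fall back to `AGENTS.md` is an assumption about what "fallback" means here.

```rust
/// Hypothetical reader: raw bytes for a file name, or `None` if it is absent.
type ReadFile<'a> = &'a dyn Fn(&str) -> Option<Vec<u8>>;

#[derive(Debug, PartialEq)]
struct UserInstructions {
    /// Absolute path of the file the text came from.
    source: String,
    text: String,
}

fn load_user_instructions(codex_home: &str, read: ReadFile<'_>) -> Option<UserInstructions> {
    // The override file takes precedence over the regular one.
    for name in ["AGENTS.override.md", "AGENTS.md"] {
        let Some(bytes) = read(name) else { continue };
        // Lossy decoding keeps invalid UTF-8 from failing the whole load.
        let text = String::from_utf8_lossy(&bytes).trim().to_string();
        if !text.is_empty() {
            return Some(UserInstructions {
                source: format!("{codex_home}/{name}"),
                text,
            });
        }
    }
    None
}
```

Passing the reader in, rather than consulting `$CODEX_HOME` inside the function, mirrors the goal of making the embedder responsible for where instructions come from.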

## Validation

- `just test -p codex-home -p codex-extension-api`
- `just test -p codex-core agents_md`
- `just test -p codex-core guardian`
- `just test -p codex-app-server
thread_start_without_selected_environment_includes_only_global_instruction_source`
- `just test -p codex-exec warning`
- `just bazel-lock-check`

Adam Perry @ OpenAI · 2026-06-11 19:28:47 +00:00

236b50125d

[codex] migrate ExecutorFileSystem paths to PathUri (#27424)

## Why

We're moving exec-server to use `PathUri` for its internal path
representations.

## What

Move `ExecutorFileSystem` APIs to use `PathUri` instead of
`AbsolutePathBuf`. Future changes will convert higher-level parts of
exec-server.

Adam Perry @ OpenAI · 2026-06-11 18:44:18 +00:00

b2a4e3be27

Route hosted Apps MCP through extensions (#27191)

## Stack

- Base: #27184
- This PR is the second vertical and should be reviewed against
`jif/external-plugins-1`, not `main`.

## Why

CCA is moving toward a split runtime where the orchestrator may have no
filesystem or executor, but it still needs to activate remotely hosted
plugin components. HTTP MCP servers are the simplest complete example:
they need configuration and host authentication, but they do not need an
executor process.

The Apps MCP endpoint is currently synthesized by a special-purpose
loader inside the MCP runtime. That works locally, but it leaves hosted
MCP activation outside the extension model being established in #27184.
It also makes the Apps path a poor foundation for plugins whose skills,
MCP servers, connectors, and hooks may come from different sources or
execute in different places.

This PR moves that one behavior behind an extension-owned contribution
while preserving the existing local fallback. It deliberately does not
introduce a generic plugin activation framework.

## What changed

### MCP extension contribution

`codex-extension-api` gains an ordered `McpServerContributor` contract.
A contributor returns typed `Set` or `Remove` overlays for MCP server
configuration; later contributors win for the names they own.

The contract stays at the existing MCP configuration boundary.
Extensions do not create a second connection manager or transport
abstraction.
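
A minimal sketch of ordered overlays, assuming a simplified `ServerConfig` and a hypothetical `apply_overlays` helper; the real `McpServerContributor` contract is not reproduced here, only the `Set`/`Remove` ordering rule.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an MCP server configuration.
#[derive(Clone, Debug, PartialEq)]
struct ServerConfig {
    url: String,
}

/// Typed overlay returned by a contributor, keyed by server name.
enum Overlay {
    Set(String, ServerConfig),
    Remove(String),
}

/// Apply contributors in order; later overlays win for the names they touch.
fn apply_overlays(
    configured: &BTreeMap<String, ServerConfig>,
    contributions: Vec<Vec<Overlay>>,
) -> BTreeMap<String, ServerConfig> {
    let mut runtime = configured.clone();
    for overlay in contributions.into_iter().flatten() {
        match overlay {
            Overlay::Set(name, config) => {
                runtime.insert(name, config);
            }
            Overlay::Remove(name) => {
                runtime.remove(&name);
            }
        }
    }
    runtime
}
```

Because the result is still a plain name-to-config map, the existing connection manager can consume it without knowing which contributor produced an entry.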

### Hosted Apps MCP extension

A new `codex-mcp-extension` contributes the reserved `codex_apps` server
from the existing Apps feature, ChatGPT base URL, path override, and
product SKU configuration.

When `apps_mcp_path_override` is enabled for `https://chatgpt.com`, the
resulting streamable HTTP endpoint is
`https://chatgpt.com/backend-api/ps/mcp`. The existing ChatGPT-auth gate
remains authoritative, so this server can run in an orchestrator-only
process without being exposed for API-key sessions.

### One resolved runtime view

`McpManager` now distinguishes three views:

- **configured:** config- and plugin-backed servers before extension
overlays;
- **runtime:** configured servers plus host-installed extension
contributions;
- **effective:** runtime servers after auth gating and compatibility
built-ins.

App-server installs the hosted MCP extension and uses the runtime view
for thread startup, refresh, status, threadless resource reads,
connector discovery, and MCP OAuth lookup. This keeps
`mcpServer/oauth/login` consistent with the servers exposed by the other
MCP APIs. The hosted Apps server itself continues to use existing
ChatGPT host authentication rather than MCP OAuth.

## Compatibility

Hosts that do not install the MCP extension retain the existing Apps MCP
synthesis path. This preserves current local-only, CLI, and
standalone-host behavior while app-server exercises the extension path.

Disabling Apps removes the reserved `codex_apps` entry, and losing
ChatGPT auth removes it from the effective runtime view. Executor
availability is not consulted for this HTTP transport.

## Follow-ups

The next vertical will resolve a manifest-declared stdio MCP server from
an executor-selected plugin root and execute it in the environment that
owns that root. Later verticals can add backend-owned skills, connector
metadata, hooks, durable selection semantics, and incremental local
convergence without changing the component-specific runtime boundaries
introduced here.

## Verification

Focused coverage was added for:

- contributing the hosted Apps MCP at `/backend-api/ps/mcp` without an
executor;
- requiring ChatGPT auth in the effective runtime view;
- removing a reserved configured Apps server when the Apps feature is
disabled.

`cargo check -p codex-app-server -p codex-mcp-extension -p
codex-extension-api -p codex-mcp` passed. Tests and Clippy were not run
locally under the current development instruction; CI provides the full
validation pass.

jif · 2026-06-09 22:44:16 +02:00

4ec3b8eeea

Load selected executor skills through extensions (#27184)

## Why

CCA is moving toward a split runtime where the orchestrator may not have
a filesystem, while executors can expose preinstalled plugins and
skills. A thread therefore needs to select capabilities without asking
app-server or core to interpret executor-owned paths through the
orchestrator's filesystem.

The longer-term model is broader than executor skills:

- A plugin is a bundle of skills, MCP servers, connectors/apps, and
hooks.
- A plugin root can be local, executor-owned, or hosted by a backend.
- Components inside one plugin can use different access and execution
mechanisms. A skill may be read from a filesystem or through backend
tools; an HTTP MCP server can run without an executor; a stdio MCP
server or hook needs an execution environment.
- Core should carry generic extension initialization data. The extension
that owns a component should discover it, expose it to the model, and
invoke it through the appropriate runtime.

This PR establishes that architecture through one complete vertical:
selecting a root on an executor, discovering the skills beneath it,
exposing those skills to the model, and reading an explicitly invoked
`SKILL.md` through the same executor.

## Contract

`thread/start` gains an experimental `selectedCapabilityRoots` field:

```json
{
  "selectedCapabilityRoots": [
    {
      "id": "deploy-plugin@1",
      "location": {
        "type": "environment",
        "environmentId": "workspace",
        "path": "/opt/codex/plugins/deploy"
      }
    }
  ]
}
```

The root is intentionally not classified as a "plugin" or "skill" in the
API. It can point at a standalone skill, a directory containing several
skills, or a plugin containing skills and other components. This PR only
teaches the skills extension how to consume it; later extensions can
resolve MCP, connector, and hook components from the same selection.

The platform-supplied `id` is stable selection identity. The location
says which runtime owns the root and gives that runtime an opaque path.
App-server does not inspect or canonicalize the path.

## What changed

### Generic thread extension initialization

App-server converts selected roots into `ExtensionDataInit`. Core
carries that generic initialization value until the final thread ID is
known, then creates thread-scoped `ExtensionData` before lifecycle
contributors run.

This keeps `Session` and core independent of the capability-selection
contract. The initialization value is consumed during construction; it
is not retained as another long-lived `Session` field.

### Executor-backed skills

The skills extension now owns an `ExecutorSkillProvider` that:

- resolves the selected environment through `EnvironmentManager`
- discovers, canonicalizes, and reads skills through that environment's
`ExecutorFileSystem`
- contributes the bounded selected-skill catalog as stable developer
context
- reads an explicitly invoked skill body through the authority that
listed it
- warns when an environment or root is unavailable
- never falls back to the orchestrator filesystem for an executor-owned
root

Skill catalog and instruction fragments have hard byte bounds, which
also bound them below the 10K-token per-item context limit. If a
selected executor skill has the same name as a legacy local skill, the
executor selection owns that invocation and the local body is not
injected a second time.

Existing local and bundled skill loading remains in place. Omitting
`selectedCapabilityRoots` therefore preserves current local-only
behavior.
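
The name-collision rule can be illustrated with a short sketch. `Skill`, `SkillSource`, and `resolve_invocation` are hypothetical; the real provider also handles discovery, bounds, and warnings that are omitted here.

```rust
#[derive(Debug, PartialEq)]
enum SkillSource {
    /// Listed by the executor that owns the selected root.
    Executor { environment_id: String },
    /// Legacy local or bundled skill.
    Local,
}

struct Skill {
    name: String,
    source: SkillSource,
}

/// Resolve an explicit invocation by name. Executor-selected skills are
/// searched first, so a colliding local skill is never chosen as well.
fn resolve_invocation<'a>(
    name: &str,
    executor: &'a [Skill],
    local: &'a [Skill],
) -> Option<&'a Skill> {
    executor
        .iter()
        .chain(local.iter())
        .find(|skill| skill.name == name)
}
```

Returning a single skill per name is what prevents the local body from being injected a second time; the chosen skill's source then determines which authority reads `SKILL.md`.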

## Current semantics

- Only environment-owned locations are represented in this first
contract.
- Roots are resolved by the destination extension, not by app-server or
core.
- An unavailable executor or invalid root produces a warning and no
capabilities from that root; it does not trigger a local-filesystem
fallback.
- Selection applies to a newly started active thread.
- MCP servers, connectors, and hooks beneath a selected plugin root are
not activated yet.
- Selection is not yet persisted or inherited across resume, fork, or
subagent creation. Existing local capabilities continue to behave as
they do today in those flows.
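
The warning-without-fallback rule above can be illustrated with a small sketch. `RootStatus`, `Resolution`, and `resolve_roots` are hypothetical names chosen for this example; the real extension resolves roots through the executor filesystem rather than a precomputed status.

```rust
/// Hypothetical outcome of asking the owning environment about one selected root.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RootStatus {
    /// The root resolved and exposed these skill names.
    Available(Vec<String>),
    EnvironmentUnavailable,
    InvalidRoot,
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct Resolution {
    pub skills: Vec<String>,
    pub warnings: Vec<String>,
}

/// Collects skills from available roots and a warning for every other root.
/// There is deliberately no branch that retries against a local filesystem.
pub fn resolve_roots(roots: &[(&str, RootStatus)]) -> Resolution {
    let mut out = Resolution::default();
    for (id, status) in roots {
        match status {
            RootStatus::Available(skills) => out.skills.extend(skills.iter().cloned()),
            RootStatus::EnvironmentUnavailable => out
                .warnings
                .push(format!("environment unavailable for root {id}")),
            RootStatus::InvalidRoot => out.warnings.push(format!("invalid root {id}")),
        }
    }
    out
}
```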

## Planned vertical follow-ups

1. **Hosted HTTP MCP:** add an extension-backed HTTP MCP source that
works without an executor, then replace the special-purpose MCP plugins
loader with that implementation.
2. **Executor MCP:** register and execute stdio MCP servers through the
environment that owns the selected plugin root.
3. **Backend skills:** add a hosted skill source whose catalog and
bodies are accessed through extension tools rather than a filesystem.
4. **Connectors and hooks:** activate those components through their
owning extensions, using the same selected-root boundary and
component-specific runtime.
5. **Durable selection:** define the desired-selection lifecycle,
persist it, and make resume, fork, and subagent inheritance explicit
rather than accidental.
6. **Local convergence:** incrementally route existing local plugin,
skill, and MCP loading through the same extension model while preserving
current local behavior.

Each follow-up remains reviewable as an end-to-end capability. The
platform selects roots, generic thread extension data carries the
selection, and the owning extension resolves and operates its component.

## Verification

Coverage added for:

- app-server end-to-end discovery and explicit invocation of a skill
inside an executor-selected plugin root
- exclusive invocation when a selected executor skill collides with a
local skill name
- executor filesystem authority for discovery, canonicalization, and
reads
- thread extension initialization before lifecycle contributors run
- stable executor catalog context, explicit invocation, context
rebuilding, hidden skills, and preserved host/remote catalog behavior

Targeted protocol, core-skills, skills-extension, core lifecycle, and
app-server executor-skill tests were run during development.

jif · 2026-06-09 19:51:54 +02:00

89ac3ec27c

Add typed file URIs (#26840 )

## Why

Codex needs stable `file:` URI identifiers that can cross process and
operating-system boundaries without eagerly interpreting them as native
paths. Existing fields also need to keep accepting absolute path strings
during migration.

## What changed

- Add `codex-utils-path-uri` with a validated, immutable `PathUri`
wrapper that currently accepts only `file:` URLs.
- Expose URI-level `basename`, `parent`, and `join` operations that
preserve authorities and percent encoding without guessing the source
operating system.
- Keep native conversion explicit through `AbsolutePathBuf` and the
current host rules.
- Serialize as canonical URI text while accepting both URI text and
legacy absolute native paths during deserialization.
- Add adversarial coverage for Windows-looking and POSIX paths, UNC
authorities, encoded metadata characters, non-UTF-8 POSIX paths, URI
hierarchy operations, and legacy serde round trips.
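
To make the URI-level operations concrete, here is a simplified std-only sketch. It is not the `PathUri` API: `FileUri` and its methods are hypothetical, validation is reduced to a `file://` prefix check, and query strings and fragments are ignored. The point it illustrates is that `basename`, `parent`, and `join` can work on the URI text alone, leaving the authority and percent encoding untouched.

```rust
/// Hypothetical, simplified stand-in for a validated `file:` URI wrapper.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileUri(String);

impl FileUri {
    /// Accepts only text that starts with `file://`; everything else is rejected.
    pub fn parse(text: &str) -> Result<Self, String> {
        if text.starts_with("file://") {
            Ok(Self(text.to_string()))
        } else {
            Err(format!("not a file: URI: {text}"))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Byte offset where the path begins, after the (possibly empty) authority.
    fn path_start(&self) -> usize {
        let after_scheme = "file://".len();
        match self.0[after_scheme..].find('/') {
            Some(offset) => after_scheme + offset,
            None => self.0.len(),
        }
    }

    /// Last path segment, returned exactly as encoded in the URI.
    pub fn basename(&self) -> Option<&str> {
        let path = self.0[self.path_start()..].trim_end_matches('/');
        path.rsplit('/').next().filter(|segment| !segment.is_empty())
    }

    /// Parent URI with the same authority, or `None` at the root.
    pub fn parent(&self) -> Option<FileUri> {
        let start = self.path_start();
        let path = self.0[start..].trim_end_matches('/');
        let cut = path.rfind('/')?;
        // Keep the root slash when the parent is the root directory.
        let parent_path = if cut == 0 { "/" } else { &path[..cut] };
        Some(FileUri(format!("{}{}", &self.0[..start], parent_path)))
    }

    /// Appends one already percent-encoded segment without decoding anything.
    pub fn join(&self, encoded_segment: &str) -> FileUri {
        let start = self.path_start();
        let path = self.0[start..].trim_end_matches('/');
        FileUri(format!("{}{}/{}", &self.0[..start], path, encoded_segment))
    }
}
```

Because nothing is decoded, a segment such as `my%20file.txt` round-trips byte for byte, and a UNC-style authority such as `server` stays attached to the result.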

Adam Perry @ OpenAI · 2026-06-08 16:33:41 -07:00

ffec7c0933

[codex] Restore release symbol artifacts with line tables (#26202 )

## Summary

- Restore separate release symbol archives for macOS, Linux, and Windows
binaries.
- Build release binaries with `line-tables-only` debuginfo instead of
full debuginfo.
- Strip Unix distribution binaries after extracting symbols, preserve
Windows PDBs, and keep symbol archives available to the release job.
- Strip the packaged Linux `bwrap` binary before hashing it so the
embedded digest matches the distributed bytes.

## Root cause

The first symbol-artifact implementation enabled
`CARGO_PROFILE_RELEASE_DEBUG=full`. In the June 2 release runs, macOS
ARM primary builds reached the 90-minute timeout while still inside
`Cargo build`. After the symbol changes were reverted, the same primary
build completed in about 22 minutes. The archive step itself completed
in tens of seconds when reached.

Rust's `line-tables-only` debuginfo level preserves function names and
source locations for symbolication without emitting the heavier variable
and type information from full debuginfo.

## Validation

- Ran `just fmt` from `codex-rs`.
- Ran `just test-github-scripts` from the repository root: 23 tests
passed.
- Ran `bash -n` and `shellcheck` on
`.github/scripts/archive-release-symbols-and-strip-binaries.sh`.
- Parsed both modified workflows as YAML and ran `git diff --check`.
- Built a macOS release smoke binary with `line-tables-only`, archived
its dSYM through the restored script, stripped the production binary,
and verified that `atos` resolves `symbol_smoke_function` to
`main.rs:2`.
- Ran Linux archive-script control-flow coverage with stubbed `objcopy`
and `strip` commands.
- Ran Windows PDB archive staging coverage and verified
underscore-emitted Rust PDB names are staged under shipped hyphenated
binary names.

## Follow-up

The release workflow only runs for tags or manual dispatches, so CI
cannot dry-run the full release matrix on this PR. The next release run
will verify runner time and memory behavior under `line-tables-only`.

Jeremy Rose · 2026-06-08 17:16:36 +00:00

6d0e313e23

deps: update starlark to 0.14.2 (#24820 )

Michael Bolin · 2026-06-07 17:35:33 -07:00

e648ec771f

build(v8): update rusty_v8 to 149.2.0 (#26464 )

Channing Conger · 2026-06-06 14:27:23 -07:00

b89ce9a2bc

[2 of 2] Finish moving goal runtime to extension (#26548 )

## Stack

1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align
goal extension with core behavior
2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move
goal runtime to extension

## Why

This PR completes the switch of the goal behavior to the
extension-backed runtime and removes the old core goal implementation.

## What Changed

- Installs the goal extension for app-server `ThreadManager` sessions.
- Routes app-server thread goal `get`, `set`, and `clear` through
`GoalService`.
- Uses thread-idle lifecycle emission after goal resume and snapshot
ordering so the extension can decide whether to continue the goal.
- Forwards extension goal updates through a FIFO async app-server
notification path so backpressure does not drop them or reorder updates.
- Keeps review turns from enabling goal runtime behavior.
- Plans extension tools before dynamic tools so built-in goal tool names
keep their old precedence when goals are enabled.
- Removes the old core goal runtime, core goal tool handlers, and core
goal tool specs.
- Updates tests that were coupled to the core-owned goal runtime while
leaving the legacy `<goal_context>` compatibility path in core for old
threads.
- Removes the stale cargo-shear ignore now that `codex-goal-extension`
is used by the workspace.
- Keeps realtime event matching exhaustive after removing the old
goal-specific realtime text path.
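
The FIFO forwarding property mentioned above (backpressure neither drops nor reorders updates) can be illustrated with a bounded channel. This is a blocking, std-only sketch and not the async app-server path itself; `forward_in_order` and the `u32` payload are assumptions made for the example.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Sends `updates` through a bounded FIFO channel and returns them in arrival order.
pub fn forward_in_order(updates: Vec<u32>, capacity: usize) -> Vec<u32> {
    let (tx, rx) = sync_channel::<u32>(capacity);
    let producer = thread::spawn(move || {
        for update in updates {
            // `send` blocks while the buffer is full, so a slow consumer applies
            // backpressure instead of causing an update to be dropped.
            tx.send(update).expect("receiver is alive");
        }
        // `tx` is dropped here, which ends the receiver loop below.
    });
    // A single channel delivers messages in the order they were sent.
    let received: Vec<u32> = rx.iter().collect();
    producer.join().expect("producer thread panicked");
    received
}
```

Even with a very small `capacity`, the receiver sees every update exactly once and in send order.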


## Validation

- Ran `/goal` manually in the TUI. Validated that time accounting matched
wall-clock time, and validated goal lifecycle state transitions.

Eric Traut · 2026-06-05 14:17:30 -07:00

479a14cf59

build: use ThinLTO for release binaries (#23710 )

## Why

Fat LTO makes release builds substantially slower without providing
enough measured runtime benefit to justify the release CI long pole. The
build-profile investigation found that keeping Cargo's default release
`opt-level=3` and switching from fat LTO to ThinLTO (`3/thin/1`) reduced
a clean `codex-cli` release build from 2073.893 seconds to 1243.172
seconds, a 40.06% improvement.

The resulting binary increased from 196.7 MiB to 211.8 MiB (+7.63%).
Measured runtime changes were small: the worst image workload median was
+0.86% and app-server startup was +0.31% relative to fat LTO. ThinLTO
retains cross-crate optimization while avoiding most of the fat-LTO
build cost.

This deliberately avoids global size optimization: final-executable
testing showed a substantial regression on the image request path, which
is expected to become more important as image usage grows.

## What changed

- Set the workspace release profile to `lto = "thin"`, retaining Cargo's
default release `opt-level=3`.
- Remove release and CI workflow-specific LTO overrides so
release-profile builds consistently use the workspace setting.
- Remove the now-unused Windows release workflow input and related
diagnostic output.

## Validation

- Confirmed the release profile parses with `cargo metadata --no-deps
--format-version 1`.
- CI validates release builds across the supported target matrix.

Adam Perry @ OpenAI · 2026-06-04 20:07:53 +00:00

f97d5c3275

Optimize unbounded byte scans with memchr (#26265 )

## Summary

This PR adds `memchr` for some low-hanging performance improvements
(namely, in MCP stdio, Ollama streaming, and full message-history
newline counts).

Codex produced the following release benchmarks:

| Operation | Before | After | Speedup |
| --- | ---: | ---: | ---: |
| MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x |
| Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x |
| Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x |

With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python
MCP server, completed `initialize`, requested `tools/list`, and
deserialized a 1 MiB tool description over newline-delimited stdio),
it's about 16x faster end-to-end:

| Branch | 50 calls | Per call |
| --- | ---: | ---: |
| `main` | 862.53 ms | 17.25 ms |
| this branch | 53.89 ms | 1.08 ms |

`memchr` is already in our dependency tree and extremely widely used for
this kind of optimized scanning.
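
For illustration only, here is a std-only sketch of the kind of scan involved: a line buffer that searches just the newly arrived chunk for `\n`, plus a newline counter. It is not the code from this PR. `LineBuffer` and `count_newlines` are hypothetical names, and the plain iterator scans stand in for the crate's `memchr::memchr(b'\n', chunk)` and `memchr::memchr_iter(b'\n', bytes).count()`.

```rust
/// Accumulates bytes until a newline arrives (hypothetical helper).
#[derive(Default)]
pub struct LineBuffer {
    pending: Vec<u8>,
}

impl LineBuffer {
    /// Appends `chunk` and returns every line it completes, without the trailing `\n`.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        let mut lines = Vec::new();
        let mut rest = chunk;
        // Only the new chunk is searched: bytes already in `pending` are known to
        // contain no newline, so they are never scanned again.
        // With the memchr crate this search would be `memchr::memchr(b'\n', rest)`.
        while let Some(pos) = rest.iter().position(|&byte| byte == b'\n') {
            let mut line = std::mem::take(&mut self.pending);
            line.extend_from_slice(&rest[..pos]);
            lines.push(line);
            rest = &rest[pos + 1..];
        }
        self.pending.extend_from_slice(rest);
        lines
    }
}

/// Counts newline bytes; `memchr::memchr_iter(b'\n', bytes).count()` is the crate form.
pub fn count_newlines(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&byte| byte == b'\n').count()
}
```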

Charlie Marsh · 2026-06-04 09:53:08 -04:00

7da4af622f

chore: extract context fragments into dedicated crate (#26122 )

## Why

`codex-core` currently owns the generic contextual-fragment trait and
several reusable fragment implementations. That makes it harder for
other crates to share the same host-owned model-input abstraction
without depending on all of `codex-core`.

This change extracts the reusable fragment machinery into a small
`codex-context-fragments` crate so future extension and skills work can
depend on the fragment abstraction directly.

## What Changed

- Added the `codex-context-fragments` crate with:
  - `ContextualUserFragment`
  - `FragmentRegistration` / `FragmentRegistrationProxy`
  - additional-context fragment types
- Moved `SkillInstructions` into `codex-core-skills`, since
skill-specific rendering belongs with skills rather than generic core
context machinery.
- Kept `codex-core` re-exporting the fragment types it still uses
internally, so existing call sites keep the same shape.
- Updated Cargo and Bazel workspace metadata for the new crate.

## Verification

- `cargo metadata --locked --format-version 1 --no-deps`
- `just bazel-lock-update`
- `just bazel-lock-check`

jif · 2026-06-03 12:25:21 +02:00

ac67905fc4

feat: add skills extension scaffold (#25953 )

## Disclaimer
This is only here for iteration purposes. Do not make any code rely on
it.

## Why

Skills still live behind `codex-core` discovery and injection paths, but
the extension system needs an authority-aware home before that logic can
move. This adds that boundary without changing current skills behavior,
and keeps host, executor, and remote skills distinct so future
list/read/search flows do not collapse back to ambient local paths.

## What changed

- Add the `codex-skills-extension` workspace/Bazel crate under
`ext/skills`.
- Define the initial catalog, authority, provider, and turn-state types
for authority-bound skill packages and resources.
- Register placeholder thread/config/prompt/turn lifecycle contributors
plus host, executor, and remote provider aggregation points.
- Capture the remaining extraction work as TODOs, including the missing
extension API hooks needed for per-turn catalog construction and typed
skill injection.
- Keep plugins outside the runtime skills model: plugin-installed skills
are treated as materialized host-owned skill sources once available.

## Verification

- Not run locally.

jif · 2026-06-03 01:10:26 +02:00

2d385e166c

Move cloud requirements crate to cloud config (#24621 )

## Summary

- Moves the existing `codex-cloud-requirements` crate to
`codex-cloud-config`.
- Updates workspace dependencies and imports to the new crate name.
- Intentionally keeps runtime behavior unchanged: this still fetches the
legacy cloud requirements endpoint.

## Details

This PR exists to make the lineage obvious before the bundle migration.
GitHub should show the old `codex-rs/cloud-requirements/src/lib.rs`
implementation as moved to `codex-rs/cloud-config/src/lib.rs`, rather
than as unrelated new code.

The follow-up PR adapts this moved crate to the new config bundle API
and switches runtime consumers over.

joeflorencio-openai · 2026-06-01 16:43:52 -07:00

0b3a6f7185

[codex] Consolidate shared prompts in codex-prompts (#25151 )

## Why

`codex_core` is consistently a bottleneck for incremental builds during
iteration. The simplest fix is to make the crate smaller.

## Summary

`codex-core` owns several reusable prompt renderers and static prompt
assets, which makes the crate harder to split apart.

Rename `codex-review-prompts` to `codex-prompts` and move shared review,
goal, permissions, compaction, realtime, hierarchical AGENTS.md, and
`apply_patch` prompts into it. Move prompt-only tests and update
consumers and `CODEOWNERS`.

## Validation

- `just test -p codex-prompts -p codex-apply-patch`
- `just test -p codex-core prompt_caching`
- Bazel builds for the affected crates

Adam Perry @ OpenAI · 2026-06-01 18:45:07 +00:00

ba2b67f9cd

fix: main (#25075 )

jif-oai · 2026-05-29 12:53:31 +02:00

3deda3116c

Add feature-gated standalone image generation extension (#24723 )

## Why

Add a standalone image generation path that can be exercised
independently of hosted Responses image generation, while retaining the
hosted tool as fallback unless the extension is actually available to
the model.

## What changed

- Added the `codex-image-generation-extension` crate with standalone
generate/edit execution, prior-image selection for edits, model-visible
image output, and local generated-image persistence.
- Installed the extension in app-server behind the disabled-by-default
`imagegenext` feature and backend eligibility checks.
- Updated core tool planning so eligible `image_gen.imagegen` exposure
replaces hosted `image_generation`, while unavailable configurations
retain hosted fallback.
- Added coverage for extension behavior, edit history reuse, feature
gating, auth eligibility, and hosted-tool replacement.
- The extension is installed through app-server only in this PR; other
execution paths retain hosted image generation because hosted
replacement occurs only when the standalone executor is actually
registered and model-visible.
- The initial extension contract intentionally fixes the image model to
`gpt-image-2` and uses automatic image parameters.
- Native generated-image history/card parity and rollout persistence
cleanup are intentionally deferred follow-up work.

## Validation

- `just test -p codex-image-generation-extension`
- `just test -p codex-features`
- `just test -p codex-core
hosted_tools_follow_provider_auth_model_and_config_gates`
- `just test -p codex-app-server`
- `just fix -p codex-image-generation-extension -p codex-features -p
codex-core -p codex-app-server`
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`

---------

Co-authored-by: jif-oai <jif@openai.com>

Won Park · 2026-05-28 11:44:55 -07:00

ecb41fcb64

Revert "Add app-server startup benchmark crate" (#24937 )

Reverts openai/codex#24651, which broke the musl job:
https://github.com/openai/codex/actions/runs/26585495205/job/78330166927

Adam Perry @ OpenAI · 2026-05-28 17:49:41 +00:00

c2508db60d

Add app-server startup benchmark crate (#24651 )

## Summary
- Add a new `app-server-start-bench` crate to measure app-server startup
performance
- Wire the benchmark into the workspace and Bazel build so it can be run
consistently
- Update lockfiles and repo automation to account for the new package

Adam Perry @ OpenAI · 2026-05-28 08:46:30 -07:00

bd2a732923

Update rmcp to 1.7.0 (#24763 )

Will make it easier to uprev when the new draft spec is supported.

Also updates reqwest where needed for compatibility but doesn't update
it everywhere since this is already a large diff.

The new version of rmcp handles certain kinds of authentication failures
differently, so this patch includes support for identifying the failing
scope in a `WWW-Authenticate` header.

Adam Perry @ OpenAI · 2026-05-27 14:52:06 -07:00

910578792f

Bump SQLx to pick up newer bundled SQLite (#24728 )

## Why

Codex stores thread, log, goal, and memory state in bundled SQLite
databases through SQLx. We have a suspected SQLite WAL-reset corruption
issue under heavy concurrent writer load, especially when multiple
subagents are active. The existing `sqlx 0.8.6` dependency kept us on an
older `libsqlite3-sys` / bundled SQLite, so this PR moves the SQLx stack
far enough forward to pick up the newer bundled SQLite library.

## What changed

- Bump the workspace `sqlx` dependency to `0.9.0`.
- Use the SQLx 0.9 feature names explicitly: `runtime-tokio`,
`tls-rustls`, and `sqlite-bundled`.
- Update `Cargo.lock` so `sqlx-sqlite` resolves through `libsqlite3-sys
0.37.0`.
- Refresh `MODULE.bazel.lock` for the dependency changes.
- Adapt `codex-state` to SQLx 0.9:
  - build dynamic state queries with `QueryBuilder<Sqlite>` instead of
    passing dynamic `String`s to `sqlx::query`;
  - remove the old `QueryBuilder` lifetime parameter from helper
    signatures;
  - preserve SQLx's new `Migrator` fields when constructing runtime
    migrators.

## Verification

- `just test -p codex-state`
- `just bazel-lock-check`
- `cargo check -p codex-state --tests`

jif-oai · 2026-05-27 18:44:07 +02:00

379511dcea

standalone websearch extension (#23823 )

## Summary

Add the extension-backed standalone `web.run` tool so Codex can call the
standalone search endpoint through the `codex-api` search client and
return its encrypted output to Responses.

- gate the new tool behind `standalone_web_search`
- install the extension in the app-server thread registry and hide
hosted `web_search` when standalone search is enabled for OpenAI
providers so the two paths stay mutually exclusive
- build search context from persisted history using a small tail
heuristic: previous user message, assistant text between the last two
user turns capped at about 1k tokens, and current user message
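
The tail heuristic in the last bullet can be sketched as follows. This is an approximation under stated assumptions, not the extension's implementation: the `Role`, `Message`, and `SearchContext` types are hypothetical, the cap is expressed in characters rather than tokens (about 4 characters per token is a common rough estimate), and keeping the most recent characters when the cap is exceeded is a guess.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub role: Role,
    pub text: String,
}

impl Message {
    pub fn new(role: Role, text: &str) -> Self {
        Self { role, text: text.to_string() }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct SearchContext {
    pub previous_user: Option<String>,
    pub assistant_between: String,
    pub current_user: String,
}

/// Builds context from the tail of `history`: the previous user message, the
/// assistant text between the last two user turns (capped), and the current
/// user message. Returns `None` when there is no user message at all.
pub fn build_search_context(history: &[Message], max_assistant_chars: usize) -> Option<SearchContext> {
    // Positions of user messages, oldest first.
    let user_positions: Vec<usize> = history
        .iter()
        .enumerate()
        .filter(|(_, message)| message.role == Role::User)
        .map(|(index, _)| index)
        .collect();
    let current = *user_positions.last()?;
    let previous = user_positions.len().checked_sub(2).map(|i| user_positions[i]);
    let assistant_text = match previous {
        Some(p) => history[p + 1..current]
            .iter()
            .filter(|message| message.role == Role::Assistant)
            .map(|message| message.text.as_str())
            .collect::<Vec<_>>()
            .join("\n"),
        None => String::new(),
    };
    // Assumption: when over budget, keep the most recent characters.
    let chars: Vec<char> = assistant_text.chars().collect();
    let skip = chars.len().saturating_sub(max_assistant_chars);
    Some(SearchContext {
        previous_user: previous.map(|p| history[p].text.clone()),
        assistant_between: chars[skip..].iter().collect(),
        current_user: history[current].text.clone(),
    })
}
```

A call such as `build_search_context(&history, 4_000)` would approximate the 1k-token cap under the 4-characters-per-token assumption.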

## Test Plan

- `cargo test -p codex-web-search-extension`
- `cargo test -p codex-api`
- `cargo test -p codex-core
hosted_tools_follow_provider_auth_model_and_config_gates`

sayan-oai · 2026-05-26 11:12:24 -07:00

a22706dfae

chore: drop orphaned codex memories MCP crate (#24555 )

## Why

The memory read-tool surface had two implementations: the app-server
extension path under `ext/memories`, and an unused `codex-memories-mcp`
workspace crate under `memories/mcp`. The MCP crate no longer has
reverse dependents, so keeping it around preserves duplicate backend,
schema, and tool code that is not part of the live app-server memory
path.

Dropping the orphaned crate makes the remaining memory crate split
clearer: `memories/read` owns read-path prompt/citation helpers,
`memories/write` owns the write pipeline, and `ext/memories` owns the
app-server extension integration.

## What changed

- Removed the `memories/mcp` crate and its Bazel/Cargo metadata.
- Removed `memories/mcp` from the Rust workspace and lockfile.
- Updated `memories/README.md` so it only lists the remaining reusable
memory crates.

## Verification

- `cargo metadata --format-version 1 --no-deps` succeeds.

jif-oai · 2026-05-26 11:29:37 +02:00

d579dafb70

[codex] Add image re-encoding benchmarks (#23935 )

## Summary
- add Divan benchmarks for prompt image re-encoding paths
- wire the image benchmark smoke test into Rust CI workflows

## Why
Image prompt handling includes re-encoding work that benefits from
repeatable benchmark coverage so changes can be measured in CI and
locally.

This already helped identify a potential regression from changing compiler flags.

## Impact
Developers can run and compare the new image re-encoding benchmarks, and
CI exercises the benchmark target via the Rust benchmark smoke test.

Adam Perry @ OpenAI · 2026-05-22 22:38:40 +00:00

7924743c38

feat: support local refs and defs in tool input schemas (#23357 )

# Why

Some connector tool input schemas use local JSON Schema references and
definition tables to avoid duplicating large nested shapes. Codex
previously lowered these schemas into the supported subset in a way that
could discard `$ref`-only schema objects and lose the corresponding
definitions, which made non-strict tool registration less faithful than
the original connector schema.

This keeps the existing minimal-lowering policy: Codex still does not
raw-pass through arbitrary JSON Schema, but it now preserves local
reference structure that fits the Responses-compatible subset and prunes
definition entries that cannot be reached by following `$ref`s from the
root schema after sanitization, including refs found transitively inside
other reachable definitions. The pruning matters because Responses
parses definition tables even when entries are unused, so keeping dead
definitions wastes prompt tokens.

# What changed

- Added `$ref`, `$defs`, and legacy `definitions` fields to the tool
`JsonSchema` representation.
- Updated `parse_tool_input_schema` lowering so `$ref`-only schema
objects survive sanitization instead of becoming `{}`.
- Sanitized definition tables recursively and dropped malformed
definition tables so non-strict registration degrades gracefully.
- Added reachability pruning for root definition tables by starting from
refs outside definition tables, then following refs inside reachable
definitions.
- Added JSON Pointer decoding for local definition refs such as
`#/$defs/Foo~1Bar`.
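
The last two bullets can be sketched with std types only. This is a simplified model rather than the code in `parse_tool_input_schema`: schemas are reduced to a map from definition name to the local refs found inside that definition, and `decode_pointer_token`, `local_def_name`, and `reachable_defs` are hypothetical names. The decoding order follows RFC 6901: `~1` becomes `/` first, then `~0` becomes `~`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Decodes one JSON Pointer reference token (RFC 6901).
pub fn decode_pointer_token(token: &str) -> String {
    // Order matters: decoding `~0` first would turn `~01` into `/` instead of `~1`.
    token.replace("~1", "/").replace("~0", "~")
}

/// Extracts the definition name from a local ref such as `#/$defs/Foo~1Bar`.
/// Refs that point deeper than one level below the table are not handled here.
pub fn local_def_name(reference: &str) -> Option<String> {
    let token = reference
        .strip_prefix("#/$defs/")
        .or_else(|| reference.strip_prefix("#/definitions/"))?;
    if token.contains('/') {
        return None;
    }
    Some(decode_pointer_token(token))
}

/// Returns the definitions reachable from refs outside the definition table,
/// following refs found inside definitions that are themselves reachable.
pub fn reachable_defs(
    root_refs: &[&str],
    defs: &BTreeMap<String, Vec<String>>,
) -> BTreeSet<String> {
    let mut reachable = BTreeSet::new();
    let mut stack: Vec<String> = root_refs.iter().filter_map(|r| local_def_name(r)).collect();
    while let Some(name) = stack.pop() {
        // Skip dangling refs and definitions that were already visited.
        if !defs.contains_key(&name) || !reachable.insert(name.clone()) {
            continue;
        }
        for reference in &defs[&name] {
            if let Some(next) = local_def_name(reference) {
                stack.push(next);
            }
        }
    }
    reachable
}
```

Any definition whose name is absent from the returned set is a candidate for pruning.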

# Verification
Ran local golden-schema probes against representative connector schemas
to validate behavior on real generated schemas:

| Golden schema | Before bytes | After bytes | `$defs` before -> after | `$ref` before -> after | Result |
|---|---:|---:|---:|---:|---|
| `google_calendar/create_space` | 7111 | 4526 | 7 -> 7 | 7 -> 7 | all definitions preserved because all are reachable |
| `figma/apply_file_variable_changes` | 4609 | 999 | 8 -> 5 | 8 -> 5 | unused defs pruned after unsupported `oneOf` shapes lower away |
| `snowflake/list_catalog_integrations` | 1380 | 404 | 3 -> 0 | 0 -> 0 | all defs pruned because none are referenced |
| `dropbox/create_shared_link` | 8894 | 1836 | 14 -> 4 | 9 -> 4 | only defs reachable from the root schema after sanitization are retained, including transitively through other retained defs |

Token increase across the golden schemas due to this change:
<img width="817" height="366" alt="Screenshot 2026-05-19 at 1 47 04 PM"
src="https://github.com/user-attachments/assets/d5c80fe9-da85-41e6-8ac7-a01d1e0b0f71"
/>

Celia Chen · 2026-05-22 00:32:14 +00:00

0cec508148

CI: Customize v8 building (#22086 )

## Summary

Move rusty_v8 artifact production into a hermetic Bazel path and bump
the `v8` crate to `147.4.0`.

The new flow builds V8 release artifacts from source for Darwin and
Linux targets, publishes both the current release-compatible artifacts
and sandbox-enabled variants, and keeps Cargo consumers on prebuilt
binaries by continuing to feed the `v8` crate the archive and generated
binding files it already expects.

## Why

We need control over V8 build-time features without giving up prebuilt
artifacts for downstream Cargo builds.

Upstream `rusty_v8` already supports source-only features such as
`v8_enable_sandbox`, but its normal prebuilt release assets do not cover
every feature combination we need. Building the artifacts ourselves lets
us enable settings such as the V8 sandbox and pointer compression at
artifact build time, then publish those outputs so ordinary Cargo builds
can still consume prebuilts instead of compiling V8 locally.

This keeps the fast consumer experience of prebuilt `rusty_v8` archives
while giving us a reproducible path to ship featureful variants that
upstream does not currently publish for us.

## Implementation Notes

The Bazel graph in this PR is not copied wholesale from `rusty_v8`;
`rusty_v8`'s normal source build is still GN/Ninja-based.

Instead, this change starts from upstream V8's Bazel rules and adapts
them to Codex's hermetic toolchains and dependency layout. Where we
intentionally follow `rusty_v8`, we mirror its existing artifact
contract:

- the same `v8` crate version and generated binding expectations
- the same sandbox feature relationship, where sandboxing requires
pointer compression
- the same custom libc++ model expected by Cargo's default
`use_custom_libcxx` feature
- the same release-style archive plus `src_binding` outputs consumed by
the `v8` crate

To preserve that contract, the Bazel release path pins the libc++,
libc++abi, and llvm-libc revisions used by `rusty_v8 v147.4.0`, builds
release artifacts with `--config=rusty-v8-upstream-libcxx`, and folds
the matching runtime objects into the final static archive.

## Windows

Windows is annoyingly handled differently.

Codex's current hermetic Bazel Windows C++ platform is `windows-gnullvm`
/ `x86_64-w64-windows-gnu`, while upstream `rusty_v8` publishes Windows
prebuilts for `*-pc-windows-msvc`. Those are different ABIs, so the
Bazel graph cannot truthfully reproduce the upstream MSVC artifacts
until we add a real MSVC-targeting C++ toolchain.

For now:

- Windows MSVC consumers continue to use upstream `rusty_v8` release
archives.
- Windows GNU targets are built in-tree so they link against a matching
GNU ABI.
- The canary workflow separately exercises upstream `rusty_v8` source
builds for MSVC sandbox artifacts, but MSVC is not yet part of the
Bazel-produced release matrix.

## Validation
This PR is technically self-validating through CI. I have already
published it as a release tag, so the artifacts from this branch are
published to
https://github.com/openai/codex/releases/tag/rusty-v8-v147.4.0. CI for
this PR should therefore consume our own release targets. I have also
tested locally on Linux and Darwin.

---------

Co-authored-by: Codex <noreply@openai.com>

Channing Conger · 2026-05-18 21:33:05 -07:00

7cdeab33d1

chore: goal ext skeleton (#23288 )

Skeleton of `/goal` in an extension.
Lots of follow-ups coming.

jif-oai · 2026-05-18 13:32:21 +02:00

a80f07ec4a

Move memory prompt injection to app-server extension (#22841 )

## Why

Memory prompt injection should be owned by the extension path that
app-server composes at runtime, not by an inlined special case inside
`codex-core`. This keeps `codex-core` focused on session orchestration
while allowing the memories extension to own its app-server prompt
behavior.

## What Changed

- Registers `codex-memories-extension` in the app-server extension
registry.
- Moves the memory developer-instruction injection out of
`core/src/session/mod.rs` and into the memories extension prompt
contributor.
- Adds config-change handling so the extension keeps its per-thread
memory settings in sync after startup.
- Leaves memories read/retrieval tools unregistered for now so this PR
only changes prompt injection.
- Removes the stale `cargo-shear` ignore now that app-server depends on
the extension crate.

## Validation

Not run locally; validation is left to CI.

jif-oai · 2026-05-15 16:19:34 +02:00

cccde930ce

chore(config) rm Feature::CodexGitCommit (#22412 )

## Summary
Removes the unused `Feature::CodexGitCommit`.

## Testing
- [x] tests pass

Dylan Hurd · 2026-05-13 12:33:36 -07:00

d18a7c982e

config: add strict config parsing (#20559 )

## Why

Codex intentionally ignores unknown `config.toml` fields by default so
older and newer config files keep working across versions. That leniency
also makes typo detection hard because misspelled or misplaced keys
disappear silently.

This change adds an opt-in strict config mode so users and tooling can
fail fast on unrecognized config fields without changing the default
permissive behavior.

This feature is possible because `serde_ignored` exposes the exact
signal Codex needs: it lets Codex run ordinary Serde deserialization
while recording fields Serde would otherwise ignore. That avoids
requiring `#[serde(deny_unknown_fields)]` across every config type and
keeps strict validation opt-in around the existing config model.

## What Changed

### Added strict config validation

- Added `serde_ignored`-based validation for `ConfigToml` in
`codex-rs/config/src/strict_config.rs`.
- Combined `serde_ignored` with `serde_path_to_error` so strict mode
preserves typed config error paths while also collecting fields Serde
would otherwise ignore.
- Added strict-mode validation for unknown `[features]` keys, including
keys that would otherwise be accepted by `FeaturesToml`'s flattened
boolean map.
- Kept typed config errors ahead of ignored-field reporting, so
malformed known fields are reported before unknown-field diagnostics.
- Added source-range diagnostics for top-level and nested unknown config
fields, including non-file managed preference source names.
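
The precedence rule above (typed errors are reported before unknown-field diagnostics) can be shown with a small std-only model. It does not use `serde_ignored` or `serde_path_to_error`. The flat string map, the keys `name` and `retries`, and the names `StrictConfigError` and `validate_strict` are all hypothetical and exist only to demonstrate the ordering.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq, Eq)]
pub enum StrictConfigError {
    /// A known field held a value of the wrong type.
    Typed { key: String, message: String },
    /// Every field the (hypothetical) schema does not recognize.
    UnknownFields(Vec<String>),
}

/// Validates a flat key/value config in strict mode.
/// Unknown keys are only collected during the pass; a malformed known field
/// returns immediately, so typed errors always take precedence.
pub fn validate_strict(entries: &BTreeMap<String, String>) -> Result<(), StrictConfigError> {
    let mut unknown = Vec::new();
    for (key, value) in entries {
        match key.as_str() {
            // Hypothetical string field: any value is accepted.
            "name" => {}
            // Hypothetical integer field: must parse as an unsigned integer.
            "retries" => {
                if value.parse::<u32>().is_err() {
                    return Err(StrictConfigError::Typed {
                        key: key.clone(),
                        message: format!("expected an unsigned integer, got `{value}`"),
                    });
                }
            }
            _ => unknown.push(key.clone()),
        }
    }
    if unknown.is_empty() {
        Ok(())
    } else {
        Err(StrictConfigError::UnknownFields(unknown))
    }
}
```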

### Kept parsing single-pass per source

- Reworked file and managed-config loading so strict validation reuses
the already parsed `TomlValue` for that source.
- For actual config files and managed config strings, the loader now
reads once, parses once, and validates that same parsed value instead of
deserializing multiple times.
- Validated `-c` / `--config` override layers with the same
base-directory context used for normal relative-path resolution, so
unknown override keys are still reported when another override contains
a relative path.

### Scoped `--strict-config` to config-heavy entry points

- Added support for `--strict-config` on the main config-loading entry
points where it is most useful:
  - `codex`
  - `codex resume`
  - `codex fork`
  - `codex exec`
  - `codex review`
  - `codex mcp-server`
  - `codex app-server` when running the server itself
  - the standalone `codex-app-server` binary
  - the standalone `codex-exec` binary
- Commands outside that set now reject `--strict-config` early with
targeted errors instead of accepting it everywhere through shared CLI
plumbing.
- `codex app-server` subcommands such as `proxy`, `daemon`, and
`generate-*` are intentionally excluded from the first rollout.
- When app-server strict mode sees invalid config, app-server exits with
the config error instead of logging a warning and continuing with
defaults.
- Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli`
instead of extending shared `ReviewArgs`, so `--strict-config` stays on
the outer config-loading command surface and does not become part of the
reusable review payload used by `codex exec review`.

### Coverage

- Added tests for top-level and nested unknown config fields, unknown
`[features]` keys, typed-error precedence, source-location reporting,
and non-file managed preference source names.
- Added CLI coverage showing invalid `--enable`, invalid `--disable`,
and unknown `-c` overrides still error when `--strict-config` is
present, including compound-looking feature names such as
`multi_agent_v2.subagent_usage_hint_text`.
- Added integration coverage showing both `codex app-server
--strict-config` and standalone `codex-app-server --strict-config` exit
with an error for unknown config fields instead of starting with
fallback defaults.
- Added coverage showing unsupported command surfaces reject
`--strict-config` with explicit errors.

## Example Usage

Run Codex with strict config validation enabled:

```shell
codex --strict-config
```

Strict config mode is also available on the supported config-heavy
subcommands:

```shell
codex --strict-config exec "explain this repository"
codex review --strict-config --uncommitted
codex mcp-server --strict-config
codex app-server --strict-config --listen off
codex-app-server --strict-config --listen off
```

For example, if `~/.codex/config.toml` contains a typo in a key name:

```toml
model = "gpt-5"
approval_polic = "on-request"
```

then `codex --strict-config` reports the misspelled key instead of
silently ignoring it. The path is shortened to `~` here for readability:

```text
$ codex --strict-config
Error loading config.toml:
~/.codex/config.toml:2:1: unknown configuration field `approval_polic`
  |
2 | approval_polic = "on-request"
  | ^^^^^^^^^^^^^^
```

Without `--strict-config`, Codex keeps the existing permissive behavior
and ignores the unknown key.
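The difference between the two modes can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not the Codex config loader: `KNOWN_FIELDS`, `unknown_fields`, and `check_config` are hypothetical names, the allow-list is truncated to two keys, and source-location reporting is omitted.

```rust
use std::collections::BTreeSet;

// Hypothetical allow-list; the real set of config keys is much larger.
const KNOWN_FIELDS: &[&str] = &["model", "approval_policy"];

/// Returns the keys in `present` that are not in `KNOWN_FIELDS`, sorted.
fn unknown_fields(present: &[&str]) -> Vec<String> {
    let known: BTreeSet<&str> = KNOWN_FIELDS.iter().copied().collect();
    let mut unknown: Vec<String> = present
        .iter()
        .filter(|key| !known.contains(*key))
        .map(|key| key.to_string())
        .collect();
    unknown.sort();
    unknown
}

/// Strict mode turns the first unknown key into an error;
/// permissive mode ignores unknown keys entirely.
fn check_config(present: &[&str], strict: bool) -> Result<(), String> {
    let unknown = unknown_fields(present);
    match unknown.first() {
        Some(field) if strict => {
            Err(format!("unknown configuration field `{field}`"))
        }
        _ => Ok(()),
    }
}

fn main() {
    let keys = ["model", "approval_polic"];
    println!("strict:     {:?}", check_config(&keys, true));
    println!("permissive: {:?}", check_config(&keys, false));
}
```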

Strict config mode also validates ad-hoc `-c` / `--config` overrides:

```text
$ codex --strict-config -c foo=bar
Error: unknown configuration field `foo` in -c/--config override

$ codex --strict-config -c features.foo=true
Error: unknown configuration field `features.foo` in -c/--config override
```

Invalid feature toggles are rejected too, including values that look
like nested config paths:

```text
$ codex --strict-config --enable does_not_exist
Error: Unknown feature flag: does_not_exist

$ codex --strict-config --disable does_not_exist
Error: Unknown feature flag: does_not_exist

$ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text
Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text
```

Unsupported commands reject the flag explicitly:

```text
$ codex --strict-config cloud list
Error: `--strict-config` is not supported for `codex cloud`
```

## Verification

The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid
`--disable`, the compound `multi_agent_v2.subagent_usage_hint_text`
case, unknown `-c` overrides, app-server strict startup failure through
`codex app-server`, and rejection for unsupported commands such as
`codex cloud`, `codex mcp`, `codex remote-control`, and `codex
app-server proxy`.

The config and config-loader tests cover unknown top-level fields,
unknown nested fields, unknown `[features]` keys, source-location
reporting, non-file managed config sources, and `-c` validation for keys
such as `features.foo`.

The app-server test suite covers standalone `codex-app-server
--strict-config` startup failure for an unknown config field.

## Documentation

The Codex CLI docs on developers.openai.com/codex should mention
`--strict-config` as an opt-in validation mode for supported
config-heavy entry points once this ships.

Michael Bolin · 2026-05-13 16:08:05 +00:00

889ee018e7

feat: memories ext (#22498)

First implementation of the memories extension, based on the
memories-mcp tools.

jif-oai · 2026-05-13 17:14:31 +02:00

8ba6749932

Refactor extension tools onto shared ToolExecutor (#22369)

## Why

Extension tools were split across two public runtime contracts:
`codex-tool-api` exposed `ToolBundle` plus its own call/spec/error
types, while core native tools used `codex_tools::ToolExecutor`. That
made contributed tool specs and execution behavior easy to drift apart
and added another crate boundary for what should be one executable-tool
seam.

This PR makes `ToolExecutor` the single runtime contract and keeps
extension-specific pinning in `codex-extension-api`.

## Remaining todo

https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5
Either make `Invocation` generic, or sub-extract the `ToolCall` and
clean up `ToolInvocation`.

## What changed

- Removed the `codex-tool-api` workspace crate and its dependencies from
core and `codex-extension-api`.
- Made `codex_tools::ToolExecutor` object-safe with `async_trait` so
extension contributors can return a dyn executor.
- Added the extension-facing aliases under
`ext/extension-api/src/contributors/tools.rs`, including
`ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output =
ExtensionToolOutput>`.
- Changed `ToolContributor::tools` to return extension executors
directly instead of `ToolBundle`s.
- Updated core’s extension tool handler/registry/router path to adapt
those extension executors into the existing native `ToolInvocation`
runtime path.
- Added focused coverage for extension tools being registered,
model-visible, dispatchable, and not replacing built-in tools.
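The shape of the extension-facing alias can be illustrated with a simplified, std-only sketch. Everything here is a stand-in: the real `ToolExecutor` is async (via `async_trait`) and lives in `codex_tools`, and the method names, the `EchoTool` type, and the `contributed_tools` function are hypothetical. The point is only that pinning the call type and the `Output` associated type yields a trait object that contributors can return directly.

```rust
// Hypothetical stand-ins; the real types live in codex_tools and
// codex-extension-api, and the real trait is async via `async_trait`.
struct ToolCall {
    name: String,
    arguments: String,
}

#[derive(Debug, PartialEq)]
struct ExtensionToolOutput(String);

/// Simplified, synchronous executor contract generic over the call type.
trait ToolExecutor<Call> {
    type Output;
    fn tool_name(&self) -> &str;
    fn handle(&self, call: Call) -> Result<Self::Output, String>;
}

/// Pinning `Call` and `Output` makes the trait usable as a trait object.
type ExtensionToolExecutor =
    dyn ToolExecutor<ToolCall, Output = ExtensionToolOutput>;

struct EchoTool;

impl ToolExecutor<ToolCall> for EchoTool {
    type Output = ExtensionToolOutput;

    fn tool_name(&self) -> &str {
        "echo"
    }

    fn handle(&self, call: ToolCall) -> Result<Self::Output, String> {
        Ok(ExtensionToolOutput(format!("{}: {}", call.name, call.arguments)))
    }
}

/// A contributor hands back boxed executors rather than a bundle type.
fn contributed_tools() -> Vec<Box<ExtensionToolExecutor>> {
    vec![Box::new(EchoTool) as Box<ExtensionToolExecutor>]
}

fn main() {
    for tool in contributed_tools() {
        let call = ToolCall {
            name: tool.tool_name().to_string(),
            arguments: "hi".to_string(),
        };
        println!("{:?}", tool.handle(call));
    }
}
```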

## Verification

- `cargo test -p codex-tools`
- `cargo test -p codex-extension-api`

jif-oai · 2026-05-13 12:12:06 +02:00

9c5dfa7b1a

Remove CODEX_RS_SSE_FIXTURE test hook (#22413)

## Why

`CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
bypass the normal Responses transport by reading SSE from local files.
That kept test-only behavior wired through production client code. The
affected tests can stay hermetic by using the existing
`core_test_support::responses` mock server and passing `openai_base_url`
instead.

## What Changed

- Removed the `CODEX_RS_SSE_FIXTURE` flag,
`codex_api::stream_from_fixture`, the `env-flags` dependency, and the
checked-in SSE fixture files.
- Repointed the affected core, exec, and TUI tests at `MockServer` with
the existing SSE event constructors.
- Removed the Bazel test data plumbing for the deleted fixtures and
refreshed cargo/Bazel lock state.

## Verification

- `cargo build -p codex-cli`
- `cargo test -p codex-api`
- `cargo test -p codex-core --test all responses_api_stream_cli`
- `cargo test -p codex-core --test all
integration_creates_and_checks_session_file`
- `cargo test -p codex-exec --test all ephemeral`
- `cargo test -p codex-exec --test all resume`
- `cargo test -p codex-tui --test all
resume_startup_does_not_consume_model_availability_nux_count`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
- `git diff --check`

pakrym-oai · 2026-05-13 03:08:01 +00:00

96833c5b15
