codex

Add support for UDS in codex --remote (#22414 )

## Why

Added support for UDS connections in `codex --remote`.

TUI also now connects to local app-server using UDS by default if it is
running and set to listen to UDS connection.

## What Changed

- Introduced `RemoteAppServerEndpoint` with `WebSocket` and `UnixSocket`
variants.
- Reused the existing JSON-RPC-over-WebSocket protocol over either a TCP
WebSocket stream or a UDS stream.
- Updated `codex --remote` to accept `ws://host:port`,
`wss://host:port`, `unix://`, and `unix://PATH`.
- Kept `--remote-auth-token-env` restricted to `wss://` and loopback
`ws://` remotes.
- Added a fast TUI startup probe for the default daemon socket, falling
back to the embedded app server when the daemon is absent or
unresponsive.

## Verification

- Manually verified that the updated remote flow works.
- Added coverage for UDS remote round trips, WebSocket auth headers,
auth-token transport policy, remote address parsing, and missing-daemon
fallback.
- Ran focused remote test coverage locally.

Eric Traut · 2026-05-12 21:17:20 -07:00

ad572709ab

Restore app-server websocket listener with auth guard (#22404 )

## Why
PR #21843 removed the TCP websocket app-server listener, but that also
removed functionality that still needs to exist. Restoring it as-is
would reopen the old remote exposure problem, so this keeps the restored
listener while making remote and non-loopback usage require explicit
auth.

## What Changed
- Mostly reverts #21843 and reapplies the small merge-conflict
resolutions needed on top of current main.
- Restores ws://IP:PORT parsing, the app-server TCP websocket acceptor,
websocket auth CLI flags, and the associated tests.
- The only intentional behavior change from the restored code is that
non-loopback websocket listeners now fail startup unless --ws-auth
capability-token or --ws-auth signed-bearer-token is configured.
Loopback listeners remain available for local and SSH-forwarding
workflows.

## Reviewer Focus
Please focus review on the small auth-enforcement delta layered on top
of the revert:

- codex-rs/app-server-transport/src/transport/websocket.rs:
start_websocket_acceptor now rejects unauthenticated non-loopback
websocket binds before accepting connections.
- codex-rs/app-server-transport/src/transport/auth.rs: helper logic
classifies unauthenticated non-loopback listeners.
- codex-rs/app-server/tests/suite/v2/connection_handling_websocket.rs:
tests cover unauthenticated ws://0.0.0.0 startup rejection and
authenticated non-loopback capability-token startup.

Everything else is intended to be revert/merge-conflict restoration
rather than new product behavior.

## Verification

- Manually verified that TUI remoting is restored and that auth is
enforced for non-localhost urls.

Eric Traut · 2026-05-12 18:40:53 -07:00

51bfb5f3b1

mark Feature::RemoteControl as removed (#22386 )

## Why

`remote_control` can appear in `config.toml`, CLI feature overrides, and
the app-server config APIs. Before this PR, app-server startup treated
`config.features.enabled(Feature::RemoteControl)` as the signal to start
remote control ([base
code](https://github.com/openai/codex/blob/5e3ee5eddfa5333f2e0b011880abf0cbf92bd295/codex-rs/app-server/src/lib.rs#L678-L680)).
That meant a user with:

```toml
[features]
remote_control = true
```

would accidentally opt every app-server process into remote control.
Remote-control startup should instead be a per-process launch decision
made by CLI flags.

## What Changed

- Marks `Feature::RemoteControl` as `Stage::Removed`, keeping
`remote_control` as a known compatibility key while making it
config-inert.
- Adds a hidden `--remote-control` process flag to `codex app-server`
and standalone `codex-app-server`.
- Plumbs that flag through
`AppServerRuntimeOptions.remote_control_enabled` and makes app-server
startup use only that runtime option to decide whether to start remote
control.
- Removes the app-server config mutation hook that reloaded config and
toggled remote control at runtime.
- Updates managed daemon spawning to use `codex app-server
--remote-control --listen unix://` instead of `--enable remote_control`.

Config APIs can still list, read, write, and set `remote_control`; those
operations just no longer affect remote-control process enrollment.

Owen Lin · 2026-05-13 00:52:45 +00:00

2237a13cf1

feat(sandbox): add Windows deny-read parity (#18202 )

## Why

The split filesystem policy stack already supports exact and glob
`access = none` read restrictions on macOS and Linux. Windows still
needed subprocess handling for those deny-read policies without claiming
enforcement from a backend that cannot provide it.

## Key finding

The unelevated restricted-token backend cannot safely enforce deny-read
overlays. Its `WRITE_RESTRICTED` token model is authoritative for write
checks, not read denials, so this PR intentionally fails that backend
closed when deny-read overrides are present instead of claiming
unsupported enforcement.

## What changed

This PR adds the Windows deny-read enforcement layer and makes the
backend split explicit:

- Resolves Windows deny-read filesystem policy entries into concrete ACL
targets.
- Preserves exact missing paths so they can be materialized and denied
before an enforceable sandboxed process starts.
- Snapshot-expands existing glob matches into ACL targets for Windows
subprocess enforcement.
- Honors `glob_scan_max_depth` when expanding Windows deny-read globs.
- Plans both the configured lexical path and the canonical target for
existing paths so reparse-point aliases are covered.
- Threads deny-read overrides through the elevated/logon-user Windows
sandbox backend and unified exec.
- Applies elevated deny-read ACLs synchronously before command launch
rather than delegating them to the background read-grant helper.
- Reconciles persistent deny-read ACEs per sandbox principal so policy
changes do not leave stale deny-read ACLs behind.
- Fails closed on the unelevated restricted-token backend when deny-read
overrides are present, because its `WRITE_RESTRICTED` token model is not
authoritative for read denials.

## Landed prerequisites

These prerequisite PRs are already on `main`:

1. #15979 `feat(permissions): add glob deny-read policy support`
2. #18096 `feat(sandbox): add glob deny-read platform enforcement`
3. #17740 `feat(config): support managed deny-read requirements`

This PR targets `main` directly and contains only the Windows deny-read
enforcement layer.

## Implementation notes

- Exact deny-read paths remain enforceable on the elevated path even
when they do not exist yet: Windows materializes the missing path before
applying the deny ACE, so the sandboxed command cannot create and read
it during the same run.
- Existing exact deny paths are preserved lexically until the ACL
planner, which then adds the canonical target as a second ACL target
when needed. That keeps both the configured alias and the resolved
object covered.
- Windows ACLs do not consume Codex glob syntax directly, so glob
deny-read entries are expanded to the concrete matches that exist before
process launch.
- Glob traversal deduplicates directory visits within each pattern walk
to avoid cycles, without collapsing distinct lexical roots that happen
to resolve to the same target.
- Persistent deny-read ACL state is keyed by sandbox principal SID, so
cleanup only removes ACEs owned by the same backend principal.
- Deny-read ACEs are fail-closed on the elevated path: setup aborts if
mandatory deny-read ACL application fails.
- Unelevated restricted-token sessions reject deny-read overrides early
instead of running with a silently unenforceable read policy.

## Verification

- `cargo test -p codex-core
windows_restricted_token_rejects_unreadable_split_carveouts`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-windows-sandbox`
- GitHub Actions rerun is in progress on the pushed head.

---------

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-05-11 23:04:28 -07:00

46f30d0282

Update codex remote-control to start the daemon (#22218 )

## Why
Update `codex remote-control` to use the new app server daemon commands
instead.
- if the updater loop is not running, bootstrap the daemon with remote
control enabled (`codex app-server daemon bootstrap --remote-control`)
- otherwise, enable the persisted remote-control setting and start the
daemon normally

Owen Lin · 2026-05-11 15:38:30 -07:00

4859d80ffe

app-server: remove TCP websocket listener (#21843 )

## Why

The app-server no longer needs to expose a TCP websocket listener.
Keeping that transport also kept around a separate listener/auth surface
that is unnecessary now that local clients can use stdio or the
Unix-domain control socket, while remote connectivity is handled by
`remote_control`.

## What Changed

- Removed `ws://IP:PORT` parsing and the `AppServerTransport::WebSocket`
startup path.
- Deleted the app-server websocket listener auth module and removed
related CLI flags/dependencies.
- Kept websocket framing only where it is still needed: over the
Unix-domain control socket and in the outbound `remote_control`
connection.
- Updated app-server CLI/help text and `app-server/README.md` to
document only `stdio://`, `unix://`, `unix://PATH`, and `off` for local
transports.
- Converted affected app-server integration coverage from TCP websocket
listeners to UDS-backed websocket connections, and added a parse test
that rejects `ws://` listen URLs.
- Removed the now-unused workspace `constant_time_eq` dependency and
refreshed `Cargo.lock` after `cargo shear` caught the drift.
- Moved test app-server UDS socket paths to short Unix temp paths so
macOS Bazel test sandboxes do not exceed Unix socket path limits.

## Verification

- Added/updated tests around UDS websocket transport behavior and
`ws://` listen URL rejection.
- `cargo shear`
- `cargo metadata --no-deps --format-version 1`
- `cargo test -p codex-app-server unix_socket_transport`
- `cargo test -p codex-app-server unix_socket_disconnect`
- `just fix -p codex-app-server`
- `git diff --check`

Local full Rust test execution was blocked before compilation by an
external fetch failure for the pinned `nornagon/crossterm` git
dependency. `just bazel-lock-update` and `just bazel-lock-check` were
retried after the manifest cleanup but remain blocked by external
BuildBuddy/V8 fetch timeouts.

Ruslan Nigmatullin · 2026-05-11 10:17:26 -07:00

a124ddb854

[daemon] Add app-server daemon lifecycle management (#20718 )

## Why

Desktop and mobile Codex clients need a machine-readable way to
bootstrap and manage `codex app-server` on remote machines reached over
SSH. The same flow is also useful for bringing up app-server with
`remote_control` enabled on a fresh developer machine and keeping that
managed install current without requiring a human session.

## What changed

- add the new experimental `codex-app-server-daemon` crate and wire it
into `codex app-server daemon` lifecycle commands: `start`, `restart`,
`stop`, `version`, and `bootstrap`
- add explicit `enable-remote-control` and `disable-remote-control`
commands that persist the launch setting and restart a running managed
daemon so the change takes effect immediately
- emit JSON success responses for daemon commands so remote callers can
consume them directly
- support a Unix-only pidfile-backed detached backend for lifecycle
management
- assume the standalone `install.sh` layout for daemon-managed binaries
and always launch `CODEX_HOME/packages/standalone/current/codex`
- add bootstrap support for the standalone managed install plus a
detached hourly updater loop
- harden lifecycle management around concurrent operations, pidfile
ownership, stale state cleanup, updater ownership, managed-binary
preflight, Unix-only rejection, forced shutdown after the graceful
window, and updater process-group tracking/cleanup
- document the experimental Unix-only support boundary plus the
standalone bootstrap/update flow in
`codex-rs/app-server-daemon/README.md`

## Verification

- `cargo test -p codex-app-server-daemon -p codex-cli`
- live pid validation on `cb4`: `bootstrap --remote-control`, `restart`,
`version`, `stop`

## Follow-up

- Add updater self-refresh so the long-lived `pid-update-loop` can
replace its own executable image after installing a newer managed Codex
binary.

Ruslan Nigmatullin · 2026-05-08 16:51:16 -07:00

0c8d42525e

Disable empty Cargo test targets (#21584 )

## Summary

`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>

Charlie Marsh · 2026-05-07 15:44:17 -07:00

54ef99a365

add top-level remote-control command (#21424 )

## Summary

`codex --enable remote_control app-server --listen off` is the current
way to start a headless, remote-controllable app-server, but it is hard
to remember and exposes implementation details.

This adds `codex remote-control` as a friendly top-level wrapper for
that flow. The command starts a foreground app-server with local
transports disabled and enables `remote_control` only for that
invocation.

## Changes

- Add a visible `codex remote-control` CLI subcommand.
- Launch app-server with `AppServerTransport::Off`.
- Append `features.remote_control=true` after root feature toggles so
the explicit command wins over `--disable remote_control`.
- Reject root `--remote` / `--remote-auth-token-env`, matching other
non-TUI subcommands.
- Add tests for parsing, launch defaults, override ordering, and remote
flag rejection.

## Verification

- `cargo test -p codex-cli`
- `just fix -p codex-cli`

Owen Lin · 2026-05-07 10:17:07 -07:00

129401df43

feat: make built-in MCPs first-class runtime servers (#21356 )

## DISCLAIMER
This is experimental and no production service must rely on this

## Why

Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.

## What changed

- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.

## Test plan

- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`

---------

Co-authored-by: Codex <noreply@openai.com>

jif-oai · 2026-05-07 10:36:32 +02:00

b2268999fe

Revert state DB injection and agent graph store (#21481 )

## Why

Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.

## What changed

- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.

## Validation

- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.

pakrym-oai · 2026-05-06 22:48:29 -07:00

a8488fec5e

chore: spawn MCP for memories (#21214 )

Co-authored-by: Codex <noreply@openai.com>

jif-oai · 2026-05-06 15:05:54 +02:00

ca257b6ce5

Add cloud executor registration to exec-server (#19575 )

## Summary
This PR adds the first `codex-rs` milestone for remote-exec e2e: a local
`codex exec-server` can now register itself with
`codex-cloud-environments` and attach to the returned rendezvous
websocket.

At a high level, `codex exec-server --cloud ...` now:
- loads ChatGPT auth from normal Codex config
- registers an executor with `codex-cloud-environments`
- receives a signed rendezvous websocket URL
- serves the existing exec-server JSON-RPC protocol over that websocket

## What Changed
- Added `--cloud`, `--cloud-base-url`, `--cloud-environment-id`, and
`--cloud-name` to `codex exec-server`
- Added a new `exec-server/src/cloud.rs` module that handles:
  - registration requests
  - auth/header setup
  - bounded auth retry on `401/403`
  - reconnect/backoff after websocket disconnects
- Reused the existing `ConnectionProcessor` / `ExecServerHandler` path
so cloud mode serves the same exec/filesystem RPC surface as local
websocket mode
- Added cloud-specific error variants and minimal docs for the new mode

## Testing
Manual e2e test that fully goes through exec server flow with our codex
cloud agent as orchestrator

Michael Zeng · 2026-05-05 22:01:48 +00:00

d0f9d5eba2

Inject state DB, agent graph store (#20689 )

## Why

We want the agent graph store to be passed down the stack as a real
dependency, the same way we already treat the thread store.

This will let us inject the agent graph store as a real dependency and
support implementations other than the local SQLite-backed one. Right
now most code instantiates a state DB and an agent graph store
just-in-time. Ideally, we would not depend on the state DB directly but
only read through the higher-level interfaces.

This change makes the dependency boundaries explicit and moves state DB
initialization to process bootstrap instead of hiding it inside local
store implementations.

## What changed

- `ThreadManager` now requires a `StateDbHandle` and an
`AgentGraphStore` at construction time instead of treating them as
optional internals.
- The local store constructors no longer lazily initialize SQLite.
Callers now initialize the state DB once per process and use that shared
handle to build:
  - `LocalThreadStore`
  - `LocalAgentGraphStore`
- App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
thread-manager sample) now initialize the state DB up front and inject
the resulting handle down the stack.
- `app-server` now consistently uses its process-scoped state DB handle
instead of reopening SQLite or trying to recover it from loaded threads.
- Device-key storage now reuses the shared state DB handle instead of
maintaining its own lazy opener.
- The thread archive / descendant traversal paths now use the injected
`AgentGraphStore` instead of reaching through local
thread-store-specific state.

## Verification

- `cargo check -p codex-core -p codex-thread-store -p codex-app-server
-p codex-mcp-server -p codex-thread-manager-sample --tests`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-core
thread_manager_accepts_separate_agent_graph_store_and_thread_store --
--nocapture`
- `cargo test -p codex-app-server
thread_archive_archives_spawned_descendants -- --nocapture`

Rasmus Rygaard · 2026-05-05 21:45:29 +00:00

7e310bc7f3

Rename agent identity login surface to access token (#21059 )

## Why
The external startup/login surface for this auth path should talk about
an access token instead of exposing the internal Agent Identity
terminology. Users should pass `CODEX_ACCESS_TOKEN` or pipe a token into
`codex login --with-access-token`; the old external env/flag spellings
are removed so there is only one supported user-facing path.

## What Changed
- Added `CODEX_ACCESS_TOKEN` as the supported environment variable for
this auth path.
- Added `codex login --with-access-token` as the supported stdin-based
login command.
- Removed the legacy `CODEX_AGENT_IDENTITY` env-var fallback and hidden
`--with-agent-identity` CLI alias.
- Updated CLI error, status, and stdin prompts to use access-token
language.
- Added coverage for access-token env loading, CLI login failure
behavior, and renamed login status text.

## Validation
- `cargo test -p codex-login`
- `cargo test -p codex-cli`
- `just fix -p codex-login`
- `just fix -p codex-cli`

Shijie Rao · 2026-05-04 19:43:48 -07:00

0d418f478d

state: pass state db handles through consumers (#20561 )

## Why

SQLite state was still being opened from consumer paths, including lazy
`OnceCell`-backed thread-store call sites. That let one process
construct multiple state DB connections for the same Codex home, which
makes SQLite lock contention and `database is locked` failures much
easier to hit.

State DB lifetime should be chosen by main-like entrypoints and tests,
then passed through explicitly. Consumers should use the supplied
`Option<StateDbHandle>` or `StateDbHandle` and keep their existing
filesystem fallback or error behavior when no handle is available.

The startup path also needs to keep the rollout crate in charge of
SQLite state initialization. Opening `codex_state::StateRuntime`
directly bypasses rollout metadata backfill, so entrypoints should
initialize through `codex_rollout::state_db` and receive a handle only
after required rollout backfills have completed.

## What Changed

- Initialize the state DB in main-like entrypoints for CLI, TUI,
app-server, exec, MCP server, and the thread-manager sample.
- Pass `Option<StateDbHandle>` through `ThreadManager`,
`LocalThreadStore`, app-server processors, TUI app wiring, rollout
listing/recording, personality migration, shell snapshot cleanup,
session-name lookup, and memory/device-key consumers.
- Remove the lazy local state DB wrapper from the thread store so
non-test consumers use only the supplied handle or their existing
fallback path.
- Make `codex_rollout::state_db::init` the local state startup path: it
opens/migrates SQLite, runs rollout metadata backfill when needed, waits
for concurrent backfill workers up to a bounded timeout, verifies
completion, and then returns the initialized handle.
- Keep optional/non-owning SQLite helpers, such as remote TUI local
reads, as open-only paths that do not run startup backfill.
- Switch app-server startup from direct
`codex_state::StateRuntime::init` to the rollout state initializer so
app-server cannot skip rollout backfill.
- Collapse split rollout lookup/list APIs so callers use the normal
methods with an optional state handle instead of `_with_state_db`
variants.
- Restore `getConversationSummary(ThreadId)` to delegate through
`ThreadStore::read_thread` instead of a LocalThreadStore-specific
rollout path special case.
- Keep DB-backed rollout path lookup keyed on the DB row and file
existence, without imposing the filesystem filename convention on
existing DB rows.
- Verify readable DB-backed rollout paths against `session_meta.id`
before returning them, so a stale SQLite row that points at another
thread's JSONL falls back to filesystem search and read-repairs the DB
row.
- Keep `debug prompt-input` filesystem-only so a one-off debug command
does not initialize or backfill SQLite state just to print prompt input.
- Keep goal-session test Codex homes alive only in the goal-specific
helper, rather than leaking tempdirs from the shared session test
helper.
- Update tests and call sites to pass explicit state handles where DB
behavior is expected and explicit `None` where filesystem-only behavior
is intended.

## Validation

- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
codex-tui -p codex-exec -p codex-cli --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout state_db_`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout try_init_ -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
codex-rollout --lib -- -D warnings`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store
read_thread_falls_back_when_sqlite_path_points_to_another_thread --
--nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
shell_snapshot`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all personality_migration`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find::find_prefers_sqlite_path_by_id --
--nocapture`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
interrupt_accounts_active_goal_before_pausing`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server get_auth_status -- --test-threads=1`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server --lib`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
-p codex-app-server --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
-p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
codex-exec -p codex-cli`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
codex-app-server`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
codex-rollout`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
- `just argument-comment-lint -p codex-core`
- `just argument-comment-lint -p codex-rollout`

Focused coverage added in `codex-rollout`:

- `recorder::tests::state_db_init_backfills_before_returning` verifies
the rollout metadata row exists before startup init returns.
- `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
verifies startup waits for another worker to finish backfill instead of
disabling the handle for the process.
-
`state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
verifies startup does not hang indefinitely on a stuck backfill lease.
-
`tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
verifies DB-backed lookup accepts valid existing rollout paths even when
the filename does not include the thread UUID.
-
`tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
verifies DB-backed lookup ignores a stale row whose existing path
belongs to another thread and read-repairs the row after filesystem
fallback.

Focused coverage updated in `codex-core`:

- `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
DB-preferred rollout file with matching `session_meta.id`, so it still
verifies that valid SQLite paths win without depending on stale/empty
rollout contents.

`cargo test -p codex-app-server thread_list_respects_search_term_filter
-- --test-threads=1 --nocapture` was attempted locally but timed out
waiting for the app-server test harness `initialize` response before
reaching the changed thread-list code path.

`bazel test //codex-rs/thread-store:thread-store-unit-tests
--test_output=errors` was attempted locally after the thread-store fix,
but this container failed before target analysis while fetching `v8+`
through BuildBuddy/direct GitHub. The equivalent local crate coverage,
including `cargo test -p codex-thread-store`, passes.

A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
also requires system `libcap.pc` for `codex-linux-sandbox`; the
follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
this container.

Ruslan Nigmatullin · 2026-05-04 11:46:03 -07:00

4d201e340e

Add stdio exec-server listener (#20663 )

## Why

This stack adds configured exec-server environments, including
environments reached over stdio. Before client-side stdio transports or
config can use that path, the exec-server binary itself needs a
first-class stdio listen mode so it can speak the same JSON-RPC protocol
over stdin/stdout that it already speaks over websockets.

**Stack position:** this is PR 1 of 5. It is the server-side transport
foundation for the stack.

## What Changed

- Accept `stdio` and `stdio://` for `codex exec-server --listen`.
- Promote the existing stdio `JsonRpcConnection` helper from test-only
code into normal exec-server transport code.
- Add parse coverage for stdio listen URLs while preserving the existing
websocket default.

## Stack

- **1. This PR:** https://github.com/openai/codex/pull/20663 - Add stdio
exec-server listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

Not run locally; this was split out of the original draft stack.

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-05-04 11:40:03 -07:00

0035d7bd18

Move plugin out of core. (#20348 )

xl-openai · 2026-04-30 14:26:14 -07:00

7b3de63041

Remove core protocol dependency [2/2] (#20325 )

## Why

With the local model layer and app-server routing in place from PR1,
this PR moves the active TUI runtime onto app-server notifications. The
affected pieces share the same event flow, so the command surface,
session state, bottom-pane prompts, chat rendering, history/status
views, and tests move together to keep the stacked branch buildable.

This PR also removes the obsolete compatibility surface that is no
longer used after the migration. The proposed protocol-boundary verifier
layer was dropped from the stack; enforcing that final boundary will be
simpler once `codex-tui` no longer needs any `codex_protocol`
references.

This PR is part 2 of a 2-PR stack:

1. Add TUI-owned replacement models and extract app-server event
routing.
2. Move the active TUI flow to app-server notifications and delete
obsolete adapter code.

## What changed

- Rewired app command and session handling to use app-server request and
notification shapes.
- Moved approval overlays, request-user-input flows, MCP elicitation,
realtime events, and review commands onto the app-server-facing model
surface.
- Updated chat rendering, history cells, status views, multi-agent UI,
replay state, and TUI tests to use app-server notifications plus the
local models introduced in PR1.
- Deleted `codex-rs/tui/src/app/app_server_adapter.rs` and the
superseded `chatwidget/tests/background_events.rs` fixture path.

## Verification

- `cargo check -p codex-tui --tests`
- Top of stack: `cargo test -p codex-tui`

Eric Traut · 2026-04-30 11:34:34 -07:00

f2bc2f26a9

Reduce the surface of collaboration modes (#20149 )

Collaboration modes were slightly invasive both into ThreadManager
construction and ModelProvider

pakrym-oai · 2026-04-29 17:22:41 -07:00

fedcefe9da

feat(cli): add sandbox profile config controls (#20118 )

## Why

The explicit profile path from #20117 is meant for standalone testing,
but it still inherited the
shell cwd and all managed requirements implicitly. The pre-existing
launcher path even called out
that it did not support a separate cwd yet in

[`debug_sandbox.rs`](https://github.com/openai/codex/blob/509453f688a30929432be866402d1ea46aa12169/codex-rs/cli/src/debug_sandbox.rs#L174-L179).

For a standalone command, the useful default is to let the caller choose
the project directory being
tested and to avoid administrator-provided constraints unless the caller
explicitly wants to test
those too.

## What changed

- Add explicit-profile-only `-C/--cd DIR`, and use that cwd for both
profile resolution and command
execution.
- Add explicit-profile-only `--include-managed-config`.
- Make explicit profile mode skip managed requirement sources by
default, including cloud
requirements, MDM requirements, `/etc/codex/requirements.toml`, and the
legacy managed-config
requirements projection.
- Preserve all existing invocations outside the explicit-profile path.

## Stack

1. #20117 `sandbox-ui-profile`
2. #20118 `sandbox-ui-config` --> this PR

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_`
- `cargo test -p codex-core
load_config_layers_can_ignore_managed_requirements`
- `cargo test -p codex-core
load_config_layers_includes_cloud_requirements`
- macOS branch-binary smoke on the rebased top of stack: `-C` changed
execution cwd, explicit
profile mode omitted managed proxy env under `env -i`, and
`--include-managed-config` restored it.
- Linux devbox branch-binary smoke on the rebased top of stack: `-C`
changed execution cwd for
built-in and user-defined explicit profiles.

viyatb-oai · 2026-04-29 06:55:51 +00:00

5597925155

feat(cli): add explicit sandbox permission profiles (#20117 )

## Why

`codex sandbox` is useful for exercising sandbox behavior directly, but
before this stack the CLI
only picked up permission profiles indirectly from the active config.
The existing debug-sandbox path
already compiled `[permissions]` profiles through normal config loading,
as covered by the existing
profile tests in
[`debug_sandbox.rs`](https://github.com/openai/codex/blob/de2ccf94735a3d8a2a7077e6a5292026413867cf/codex-rs/cli/src/debug_sandbox.rs#L715-L760).

This adds the smallest stable entry point first: an explicit profile
selector that reuses the same
config machinery as normal Codex config, so standalone testing becomes
possible without changing
current no-selector behavior.

## What changed

- Add additive `--permissions-profile NAME` support to `codex sandbox
macos|linux|windows`.
- Resolve built-in and user-defined profile names by feeding
`default_permissions` through the
existing config compilation path instead of inventing a sandbox-only
parser.
- Make an explicit selector win over an ambient active profile's legacy
`sandbox_mode`.
- Keep the existing no-selector behavior unchanged.

## Stack

1. #20117 `sandbox-ui-profile` --> this PR
2. #20118 `sandbox-ui-config`

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_parses_permissions_profile`
- `cargo test -p codex-core
cli_override_takes_precedence_over_profile_sandbox_mode`
- macOS branch-binary smoke on the rebased top of stack: built-in
`:workspace` and user-defined
profiles both executed successfully through `--permissions-profile`.
- Linux devbox branch-binary smoke on the rebased top of stack: built-in
`:workspace` and
user-defined profiles both executed successfully through
`--permissions-profile`.

viyatb-oai · 2026-04-29 06:18:16 +00:00

6ed0440611

chore(cli) deprecate --full-auto (#20133 )

## Summary
Starts the process of getting rid of `--full-auto`, with some
concessions:
1. Fully removes the command from the tui, since it just resolves to the
default permissions there, and encourages users to use the one-time
trust flow if they're not in a trusted repo.
2. Marks the command as deprecated in `codex exec`, in case users are
actively relying on this. We'll remove in an upcoming n+X release.
3. Cleans up some of the `codex sandbox` cli logic, to keep supporting
legacy sandbox policies for now.

This isn't the cleanest setup, but I think it is worthwhile to warn
users for one release before hard-removing it.

## Testing 
- [x] Updated unit tests

Dylan Hurd · 2026-04-29 04:41:30 +00:00

3d10ba9f36

linux-sandbox: switch helper plumbing to PermissionProfile (#20106 )

## Why

`PermissionProfile` is the canonical runtime permission model in the
Rust workspace, but the Linux sandbox helper still accepted a legacy
`SandboxPolicy` plus separate filesystem and network policy flags. That
translation layer made the helper interface harder to reason about and
left `linux-sandbox`-specific callers and tests coupled to the legacy
policy representation.

This change moves the helper onto `PermissionProfile` directly so the
Linux sandbox plumbing matches the rest of the permission stack.

## What changed

- changed `codex-linux-sandbox` to accept `--permission-profile` and
derive the runtime filesystem and network policies internally
- updated the in-process seccomp and legacy Landlock path in
`codex-rs/linux-sandbox` to operate on `PermissionProfile`
- updated Linux sandbox argv construction in `codex-rs/sandboxing`,
`codex-rs/core`, and the CLI debug sandbox path to pass the canonical
profile instead of serializing compatibility policy projections
- simplified the Linux sandbox tests to build the exact permission
profile under test, including the managed-proxy path and
direct-runtime-enforcement carveout coverage
- removed helper-local `SandboxPolicy` usage from `bwrap` tests where
`FileSystemSandboxPolicy` is already the value being exercised

## Testing

- `cargo test -p codex-sandboxing`
- `cargo test -p codex-linux-sandbox` (on this macOS host, the crate
compiled cleanly and its Linux-only tests were cfg-gated)
- `cargo test -p codex-core --no-run`
- `cargo test -p codex-cli --no-run`

Michael Bolin · 2026-04-28 19:43:44 -07:00

e6db1a9442

feat: verify agent identity JWTs with JWKS (#19764 )

efrazer-oai · 2026-04-28 09:56:20 -07:00

f6797c3ac6

feat: split memories part 2 (#19860 )

Keep extracting memories out of core and moving the write trigger in the
app-server
This is temporary and it should move at the client level as a follow-up
This makes core fully independant from `codex-memories-write`

---------

Co-authored-by: Codex <noreply@openai.com>

jif-oai · 2026-04-28 13:03:28 +02:00

431ebeaef7

Add codex update command (#19933 )

## Why

Addresses #9274

Running `codex update` currently starts an interactive Codex session
with `update` as the prompt. That is a rough edge for users who expect a
direct self-update command after seeing the existing update notice, and
it forces them to copy the suggested package-manager command manually.

## What changed

- Added a top-level `codex update` subcommand.
- Reused the existing install-channel detection and update command
runner that the TUI already uses for update prompts.
- Exposed the update-action lookup from `codex-tui` so the CLI can
invoke the same behavior.
- Added CLI coverage to ensure `codex update` is parsed as a subcommand
instead of becoming an interactive prompt.

## Verification

- `cargo test -p codex-cli`
- `cargo test -p codex-tui update_action::tests`

Eric Traut · 2026-04-27 23:33:59 -07:00

b985768dc1

refactor: make auth loading async (#19762 )

## Summary

Auth loading used to expose synchronous construction helpers in several
places even though some auth sources now need async work. This PR makes
the auth-loading surface async and updates the callers to await it.

This is intentionally only plumbing. It does not change how
AgentIdentity tokens are decoded, how task runtime ids are allocated, or
how JWT signatures are verified.

## Stack

1. **This PR:** [refactor: make auth loading
async](https://github.com/openai/codex/pull/19762)
2. [refactor: load AgentIdentity runtime
eagerly](https://github.com/openai/codex/pull/19763)
3. [feat: verify AgentIdentity JWTs with
JWKS](https://github.com/openai/codex/pull/19764)

## Important call sites

| Area | Change |
| --- | --- |
| `codex-login` auth loading | `CodexAuth` and `AuthManager`
construction paths now await auth loading. |
| app-server startup | Auth manager construction is awaited during
initialization. |
| CLI/TUI/exec/MCP/chatgpt callers | Existing auth-loading calls now
await the same behavior. |
| cloud requirements storage loader | The loader becomes async so it can
share the same auth construction path. |
| auth tests | Tests that load auth now run in async contexts. |

## Testing

Tests: targeted Rust auth test compilation, formatter, scoped Clippy
fix, and Bazel lock check.

efrazer-oai · 2026-04-27 11:00:27 -07:00

2009f6e894

permissions: centralize legacy sandbox projection (#19734 )

## Why

The remaining migration work still needs `SandboxPolicy` at a few
compatibility boundaries, but those projections should come from one
canonical path. Keeping ad hoc legacy projections scattered through
app-server, CLI, and config code makes it easy for behavior to drift as
`PermissionProfile` gains fidelity that the legacy enum cannot
represent.

## What Changed

- Adds `Permissions::legacy_sandbox_policy(cwd)` and
`Config::legacy_sandbox_policy()` as the compatibility projection from
the canonical `PermissionProfile`.
- Adds `Permissions::can_set_legacy_sandbox_policy()` so legacy inputs
are checked after they are converted into profile semantics.
- Updates app-server command handling, Windows sandbox setup, session
configuration, and sandbox summaries to use the centralized projection
helper.
- Leaves `SandboxPolicy` in place only for boundary inputs/outputs that
still speak the legacy abstraction.

## Verification

- `cargo check -p codex-config -p codex-core -p codex-sandboxing -p
codex-app-server -p codex-cli -p codex-tui`
- `cargo test -p codex-tui
permissions_selection_history_snapshot_full_access_to_default --
--nocapture`
- `cargo test -p codex-tui
permissions_selection_sends_approvals_reviewer_in_override_turn_context
-- --nocapture`
- `bazel test //codex-rs/tui:tui-unit-tests-bin
--test_arg=permissions_selection_history_snapshot_full_access_to_default
--test_output=errors`
- `bazel test //codex-rs/tui:tui-unit-tests-bin
--test_arg=permissions_selection_sends_approvals_reviewer_in_override_turn_context
--test_output=errors`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19734).
* #19737
* #19736
* #19735
* __->__ #19734

Michael Bolin · 2026-04-26 20:31:23 -07:00

0d8cdc0510

permissions: migrate approval and sandbox consumers to profiles (#19393 )

## Why

Runtime decisions should not infer permissions from the lossy legacy
sandbox projection once `PermissionProfile` is available. In particular,
`Disabled` and `External` need to remain distinct, and managed profiles
with split filesystem or deny-read rules should not be collapsed before
approval, network, safety, or analytics code makes decisions.

## What Changed

- Changes managed network proxy setup and network approval logic to use
`PermissionProfile` when deciding whether a managed sandbox is active.
- Migrates patch safety, Guardian/user-shell approval paths, Landlock
helper setup, analytics sandbox classification, and selected
turn/session code to profile-backed permissions.
- Validates command-level profile overrides against the constrained
`PermissionProfile` rather than a strict `SandboxPolicy` round trip.
- Preserves configured deny-read restrictions when command profiles are
narrowed.
- Adds coverage for profile-backed trust, network proxy/approval
behavior, patch safety, analytics classification, and command-profile
narrowing.

## Verification

- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`




































































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19393).
* #19395
* #19394
* __->__ #19393

Michael Bolin · 2026-04-26 15:30:40 -07:00

dda8199b73

[codex] Move config loading into codex-config (#19487 )

## Why

Config loading had become split across crates: `codex-config` owned the
config types and merge logic, while `codex-core` still owned the loader
that assembled the layer stack. This change consolidates that
responsibility in `codex-config`, so the crate that defines config
behavior also owns how configs are discovered and loaded.

To make that move possible without reintroducing the old dependency
cycle, the shell-environment policy types and helpers that
`codex-exec-server` needs now live in `codex-protocol` instead of
flowing through `codex-config`.

This also makes the migrated loader tests more deterministic on machines
that already have managed or system Codex config installed by letting
tests override the system config and requirements paths instead of
reading the host's `/etc/codex`.

## What Changed

- moved the config loader implementation from `codex-core` into
`codex-config::loader` and deleted the old `core::config_loader` module
instead of leaving a compatibility shim
- moved shell-environment policy types and helpers into
`codex-protocol`, then updated `codex-exec-server` and other downstream
crates to import them from their new home
- updated downstream callers to use loader/config APIs from
`codex-config`
- added test-only loader overrides for system config and requirements
paths so loader-focused tests do not depend on host-managed config state
- cleaned up now-unused dependency entries and platform-specific cfgs
that were surfaced by post-push CI

## Testing

- `cargo test -p codex-config`
- `cargo test -p codex-core config_loader_tests::`
- `cargo test -p codex-protocol -p codex-exec-server -p
codex-cloud-requirements -p codex-rmcp-client --lib`
- `cargo test --lib -p codex-app-server-client -p codex-exec`
- `cargo test --no-run --lib -p codex-app-server`
- `cargo test -p codex-linux-sandbox --lib`
- `cargo shear`
- `just bazel-lock-check`

## Notes

- I did not chase unrelated full-suite failures outside the migrated
loader surface.
- `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
failures on this machine, and Windows CI still shows unrelated
long-running/timeouting test noise outside the loader migration itself.

pakrym-oai · 2026-04-26 15:10:53 -07:00

9c3abcd46c

permissions: derive compatibility policies from profiles (#19392 )

## Why

After #19391, `PermissionProfile` and the split filesystem/network
policies could still be stored in parallel. That creates drift risk: a
profile can preserve deny globs, external enforcement, or split
filesystem entries while a cached projection silently loses those
details. This PR makes the profile the runtime source and derives
compatibility views from it.

## What Changed

- Removes stored filesystem/network sandbox projections from
`Permissions` and `SessionConfiguration`; their accessors now derive
from the canonical `PermissionProfile`.
- Derives legacy `SandboxPolicy` snapshots from profiles only where an
older API still needs that field.
- Updates MCP connection and elicitation state to track
`PermissionProfile` instead of `SandboxPolicy` for auto-approval
decisions.
- Adds semantic filesystem-policy comparison so cwd changes can preserve
richer profiles while still recognizing equivalent legacy projections
independent of entry ordering.
- Updates config/session tests to assert profile-derived projections
instead of parallel stored fields.

## Verification

- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`



































































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19392).
* #19395
* #19394
* #19393
* __->__ #19392

Michael Bolin · 2026-04-26 15:06:42 -07:00

deaa307fb2

feat: load AgentIdentity from JWT login/env (#18904 )

## Summary

This PR lets programmatic AgentIdentity users provide one token through
either stdin login or environment auth.

`codex login --with-agent-identity` reads an Agent Identity JWT from
stdin, validates that it has the required claims, and stores that token
as the `agent_identity` value in `auth.json`. The file format is
token-only; the decoded account and key fields are runtime state, not
hand-authored auth.json fields.

The Agent Identity JWT claim shape and decoder live in
`codex-agent-identity`; `codex-login` only owns env/storage precedence
and conversion into `CodexAuth::AgentIdentity`.

When env auth is enabled, `CODEX_AGENT_IDENTITY` can provide the same
JWT without writing auth state to disk. `CODEX_API_KEY` still wins if
both env vars are set.

Reference old stack: https://github.com/openai/codex/pull/17387/changes
Reference JWT/env stack: https://github.com/openai/codex/pull/18176

## Stack

1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate Codex backend
auth callsites through AuthProvider
5. This PR: accept AgentIdentity JWTs through login/env

## Testing

Tests: targeted login and Agent Identity crate tests, CLI checks, scoped
formatter/linter cleanup, and CI.

---------

Co-authored-by: Shijie Rao <shijie.rao@openai.com>

efrazer-oai · 2026-04-26 19:49:54 +00:00

fed0a8f4fa

[codex] remove responses command (#19640 )

This removes the hidden `codex responses` CLI subcommand after
confirming no downstream callers rely on it, deleting the raw Responses
passthrough implementation, unregistering the subcommand, and dropping
the now-unused CLI dependencies on `codex-api` and
`codex-model-provider`.

Thibault Sottiaux · 2026-04-25 23:10:38 -07:00

87bc72408c

Support end_turn in response.completed (#19610 )

Some providers of Responses API forward a model-defined `end_turn`
boolean indicating explicitly the model's indication of whether it would
like to end the turn or to be inferenced again. In this PR, we update
the sampling loop to use this field correctly if it's set. If the field
is not set by the provider, we fall back to the existing sampling logic.

Andrey Mishchenko · 2026-04-25 21:57:42 -07:00

355c40ad7e

feat: let model providers own model discovery (#18950 )

## Why

`codex-models-manager` had grown to own provider-specific concerns:
constructing OpenAI-compatible `/models` requests, resolving provider
auth, emitting request telemetry, and deciding how provider catalogs
should be sourced. That made the manager harder to reuse for providers
whose model catalog is not fetched from the OpenAI `/models` endpoint,
such as Amazon Bedrock.

This change moves provider-specific model discovery behind
provider-owned implementations, so the models manager can focus on
refresh policy, cache behavior, picker ordering, and model metadata
merging.

## What Changed

- Introduced a `ModelsManager` trait with separate `OpenAiModelsManager`
and `StaticModelsManager` implementations.
- Added `ModelsEndpointClient` so OpenAI-compatible HTTP fetching lives
outside `codex-models-manager`.
- Moved `/models` request construction, provider auth resolution,
timeout handling, and request telemetry into `codex-model-provider` via
`OpenAiModelsEndpoint`.
- Added provider-owned `models_manager(...)` construction so configured
OpenAI-compatible providers use `OpenAiModelsManager`, while
static/catalog-backed providers can return `StaticModelsManager`.
- Added an Amazon Bedrock static model catalog for the GPT OSS Bedrock
model IDs.
- Updated core/session/thread manager code and tests to depend on
`Arc<dyn ModelsManager>`.
- Moved offline model test helpers into
`codex_models_manager::test_support`.
## Metadata References

The Bedrock catalog metadata is based on the official Amazon Bedrock
OpenAI model documentation:

- [Amazon Bedrock OpenAI
models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-openai.html)
lists the Bedrock model IDs, text input/output modalities, and `128,000`
token context window for `gpt-oss-20b` and `gpt-oss-120b`.
- [Amazon Bedrock `gpt-oss-120b` model
card](https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-openai-gpt-oss-120b.html)
lists the `bedrock-runtime` model ID `openai.gpt-oss-120b-1:0`, the
`bedrock-mantle` model ID `openai.gpt-oss-120b`, text-only modalities,
and `128K` context window.
- [OpenAI `gpt-oss-120b` model
docs](https://developers.openai.com/api/docs/models/gpt-oss-120b)
document configurable reasoning effort with `low`, `medium`, and `high`,
plus text input/output modality.

The display names, default reasoning effort, and priority ordering are
Codex-local catalog choices.

## Test Plan
- Manually verified app-server model listing with an AWS profile:

```shell
CODEX_HOME="$(mktemp -d)" cargo run -p codex-app-server-test-client -- \
  --codex-bin ./target/debug/codex \
  -c 'model_provider="amazon-bedrock"' \
  -c 'model_providers.amazon-bedrock.aws.profile="codex-bedrock"' \
  -c 'model_providers.amazon-bedrock.aws.region="us-west-2"' \
  model-list
```

The response returned the Bedrock catalog with `openai.gpt-oss-120b-1:0`
as the default model and `openai.gpt-oss-20b-1:0` as the second listed
model, both text-only and supporting low/medium/high reasoning effort.

Celia Chen · 2026-04-24 04:28:25 +00:00

e8d8080818

[rollout_trace] Add debug trace reduction command (#18880 )

## Summary

Adds the debug CLI entry point for reducing recorded rollout traces.
This gives developers a direct way to inspect whether the emitted trace
stream reduces into the expected conversation/runtime model.

## Stack

This is PR 5/5 in the rollout trace stack.

- [#18876](https://github.com/openai/codex/pull/18876): Add rollout
trace crate
- [#18877](https://github.com/openai/codex/pull/18877): Record core
session rollout traces
- [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
code-mode boundaries
- [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
and multi-agent edges
- [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
reduction command

## Review Notes

This PR is intentionally last: it depends on the trace crate, core
recorder, runtime/tool events, and session/agent edge data all existing.
The command should remain a debug/developer tool and avoid adding new
runtime behavior.

The useful review question is whether the CLI exposes the reducer in the
smallest practical way for local inspection without turning the debug
command into a supported user-facing workflow.

cassirer-openai · 2026-04-24 01:56:48 +00:00

e3c8720a99

refactor: route Codex auth through AuthProvider (#18811 )

## Summary

This PR moves Codex backend request authentication from direct
bearer-token handling to `AuthProvider`.

The new `codex-auth-provider` crate defines the shared request-auth
trait. `CodexAuth::provider()` returns a provider that can apply all
headers needed for the selected auth mode.

This lets ChatGPT token auth and AgentIdentity auth share the same
callsite path:
- ChatGPT token auth applies bearer auth plus account/FedRAMP headers
where needed.
- AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
where needed.

Reference old stack: https://github.com/openai/codex/pull/17387/changes

## Callsite Migration

| Area | Change |
| --- | --- |
| backend-client | accepts an `AuthProvider` instead of a raw
token/header |
| chatgpt client/connectors | applies auth through
`CodexAuth::provider()` |
| cloud tasks | keeps Codex-backend gating, applies auth through
provider |
| cloud requirements | uses Codex-backend auth checks and provider
headers |
| app-server remote control | applies provider headers for backend calls
|
| MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches
from generic account getters |
| model refresh | treats AgentIdentity as Codex-backend auth |
| OpenAI file upload path | rejects non-Codex-backend auth before
applying headers |
| core client setup | keeps model-provider auth flow and allows
AgentIdentity through provider-backed OpenAI auth |

## Stack

1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
auth mode and startup task allocation
4. This PR: migrate Codex backend auth callsites through AuthProvider
5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
and load `CODEX_AGENT_IDENTITY`

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.

efrazer-oai · 2026-04-23 17:14:02 -07:00

5882f3f95e

Move marketplace add/remove and startup sync out of core. (#19099 )

Move more things to core-plugins.

---------

Co-authored-by: Codex <noreply@openai.com>

xl-openai · 2026-04-23 11:27:17 -07:00

198eddd25d

app-server: add Unix socket transport (#18255 )

## Summary
- add unix:// app-server transport backed by the shared codex-uds crate
- reuse the websocket connection loop for axum and tungstenite-backed
streams
- add codex app-server proxy to bridge stdio clients to the control
socket
- tolerate Windows UDS backends that report a missing rendezvous path as
connection refused before binding

## Tests
- cargo test -p codex-app-server
control_socket_acceptor_forwards_websocket_text_messages_and_pings
- cargo test -p codex-app-server
- just fmt
- just fix -p codex-app-server
- git -c core.fsmonitor=false diff --check

Ruslan Nigmatullin · 2026-04-23 11:09:25 -07:00

8a0ab3fc13

[codex] Fix plugin marketplace help usage (#18710 )

## Summary
- Updates generated CLI help for plugin marketplace commands to show the
full `codex plugin marketplace ...` namespace.
- Adds a regression test covering the marketplace command and its `add`,
`upgrade`, and `remove` help pages.

## Root Cause
The marketplace parser already lived under `codex plugin marketplace`,
but Clap generated usage text from the child parser's standalone command
name. That made help output show stale `codex marketplace ...`
instructions even though the top-level `codex marketplace` command no
longer parses.

## Validation
- `just fmt`
- `cargo test -p codex-cli`
- `./target/debug/codex plugin marketplace --help`

xli-oai · 2026-04-23 09:48:37 -07:00

e18bfeec91

Add safety check notification and error handling (#19055 )

Adds a new app-server notification that fires when a user account has
been flagged for potential safety reasons.

Eric Traut · 2026-04-22 22:24:12 -07:00

bbff4ee61a

use long-lived sessions for codex sandbox windows (#18953 )

`codex sandbox windows` previously did a one-shot spawn for all
commands.
This change uses the `unified_exec` session to spawn long-lived
processes instead, and implements a simple bridge to forward stdin to
the spawned session and stdout/stderr from the spawned session back to
the caller.

It also fixes a bug with the new shared spawn context code where the
"no-network env" was being applied to both elevated and unelevated
sandbox spawns. It should only be applied for the unelevated sandbox
because the elevated one uses firewall rules instead of an env-based
network suppression strategy.

iceweasel-oai · 2026-04-22 06:39:29 +00:00

3a451b6321

feat: add explicit AgentIdentity auth mode (#18785 )

## Summary

This PR adds `CodexAuth::AgentIdentity` as an explicit auth mode.

An AgentIdentity auth record is a standalone `auth.json` mode. When
`AuthManager::auth().await` loads that mode, it registers one
process-scoped task and stores it in runtime-only state on the auth
value. Header creation stays synchronous after that because the task is
initialized before callers receive the auth object.

This PR also removes the old feature flag path. AgentIdentity is
selected by explicit auth mode, not by a hidden flag or lazy mutation of
ChatGPT auth records.

Reference old stack: https://github.com/openai/codex/pull/17387/changes

## Design Decisions

- AgentIdentity is a real auth enum variant because it can be the only
credential in `auth.json`.
- The process task is ephemeral runtime state. It is not serialized and
is not stored in rollout/session data.
- Account/user metadata needed by existing Codex backend checks lives on
the AgentIdentity record for now.
- `is_chatgpt_auth()` remains token-specific.
- `uses_codex_backend()` is the broader predicate for ChatGPT-token auth
and AgentIdentity auth.

## Stack

1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. This PR: explicit AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate Codex backend
auth callsites through AuthProvider
5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
and load `CODEX_AGENT_IDENTITY`

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.

efrazer-oai · 2026-04-21 22:33:24 -07:00

69c8913e24

Stabilize debug clear memories integration test (#18858 )

## Why

`debug_clear_memories_resets_state_and_removes_memory_dir` can be flaky
because the test drops its `sqlx::SqlitePool` immediately before
invoking `codex debug clear-memories`. Dropping the pool does not wait
for all SQLite connections to close, so the CLI can race with still-open
test connections.

## What changed

- Await `pool.close()` before spawning `codex debug clear-memories`.
- Close the reopened verification pool before the temp `CODEX_HOME` is
torn down.

## Verification

- `cargo test -p codex-cli --test debug_clear_memories
debug_clear_memories_resets_state_and_removes_memory_dir`

jif-oai · 2026-04-21 18:15:37 +01:00

10e1659d4f

Fix exec inheritance of root shared flags (#18630 )

Addresses #18113

Problem: Shared flags provided before the exec subcommand were parsed by
the root CLI but not inherited by the exec CLI, so exec sessions could
run with stale or default sandbox and model configuration.

Solution: Move shared TUI and exec flags into a common option block and
merge root selections into exec before dispatch, while preserving exec's
global subcommand flag behavior.

Eric Traut · 2026-04-20 16:12:17 -07:00

0f1c9b8963

uds: add async Unix socket crate (#18254 )

## Summary
- add a codex-uds crate with async UnixListener and UnixStream wrappers
- expose helpers for private socket directory setup and stale socket
path checks
- migrate codex-stdio-to-uds onto codex-uds and Tokio-based stdio/socket
relaying
- update the CLI stdio-to-uds command path for the async runner

## Tests
- cargo test -p codex-uds -p codex-stdio-to-uds
- cargo test -p codex-cli
- just fmt
- just fix -p codex-uds
- just fix -p codex-stdio-to-uds
- just fix -p codex-cli
- just bazel-lock-check
- git diff --check

Ruslan Nigmatullin · 2026-04-20 15:59:05 -07:00

97d4b42583

Add codex debug models to show model catalog (#18625 )

Andrey Mishchenko · 2026-04-20 05:42:22 +00:00

ab65fbbdd6

Use thread IDs in TUI resume hints (#18440 )

## Summary

Fixes #18313.

Recent TUI resume breadcrumbs could print a thread title instead of the
stable thread UUID. For sessions whose title was auto-derived from the
first prompt, that made the suggested codex resume command look like it
should resume a long prompt rather than the session ID.

This updates the TUI and CLI post-exit resume hints, plus the in-session
summary shown when switching/forking threads, to always use the stable
thread ID for these recovery breadcrumbs. Explicit name-based resume
support remains available elsewhere.

Eric Traut · 2026-04-19 22:38:48 -07:00

fa8943fe7e

Support codex app on macOS (Intel) and Windows (#18500 )

## Summary

`codex app` should be a platform-aware entry point for opening Codex
Desktop or helping users install it. Before this change, the command
only existed on macOS and its default installer URL always pointed at
the Apple Silicon DMG, which sent Intel Mac users to the wrong build.

This updates the macOS path to choose the Apple Silicon or Intel DMG
based on the detected processor, while keeping `--download-url` as an
advanced override. It also enables `codex app` on Windows, where the CLI
opens an installed Codex Desktop app when available and otherwise opens
the Windows installer URL.

---------

Co-authored-by: Felipe Coury <felipe.coury@openai.com>

Eric Traut · 2026-04-19 10:30:13 -07:00

116317021d

434 Commits