codex

Support openai/form extended form elicitations (#27500)

# Summary
Allow App Server clients to opt into `openai/form` MCP elicitations.

Gabriel Peal · 2026-06-18 11:54:49 -07:00

21a599fa56

[codex] Add external agent import result accounting (#28008)

## Why

External-agent imports can complete synchronously or continue in the
background for plugins/sessions. Clients need a stable import id to
correlate the immediate response with the eventual completion
notification, and the completion payload needs enough accounting to show
which artifact types succeeded or failed without hiding partial
failures.

## What Changed

- `externalAgentConfig/import` now returns an `importId`;
`externalAgentConfig/import/completed` includes the same `importId` plus
type-level `itemResults`.
- Completed `itemResults` report `successCount`, `errorCount`,
`successes`, and `rawErrors` for each migrated item type.
- Added protocol/schema/TypeScript types for import successes, raw
errors, and type-level results. No progress notification is included in
the final PR.
- `ExternalAgentConfigService::import` now returns an outcome object
with synchronous item results and pending plugin imports.
- Plugin import outcomes track succeeded/failed marketplaces, plugin
ids, and raw errors. Plugin failures can be reported in completed
accounting while later migration items continue.
- Non-plugin synchronous import failures still fail the request, so
invalid config/skills-style failures are not reported as a successful
import response.
- Session imports now return item results. Successful imports include
the source session path and imported thread id; prepare, persist,
ledger, and source-validation failures become raw errors in completion
accounting where the import can continue.
- The request processor generates the `importId`, aggregates synchronous
results with background plugin/session results, and sends a single
completed notification when all selected work is done.
- App-server docs and generated schema fixtures were updated for the new
response/completed payload shapes.
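
The accounting described above can be sketched roughly as follows. This is a minimal illustration, not the actual Rust implementation: the field names `importId`, `itemResults`, `successCount`, `errorCount`, `successes`, and `rawErrors` come from the PR text, while the `ItemResult` class, the helper functions, the map-shaped `itemResults`, and the sample item types are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ItemResult:
    """Type-level accounting for one migrated item type (hypothetical shape)."""
    successes: list = field(default_factory=list)
    raw_errors: list = field(default_factory=list)


def merge_item_results(*batches):
    """Combine synchronous and background results into one map keyed by item type."""
    merged = {}
    for batch in batches:
        for item_type, result in batch.items():
            target = merged.setdefault(item_type, ItemResult())
            target.successes.extend(result.successes)
            target.raw_errors.extend(result.raw_errors)
    return merged


def completed_payload(import_id, *batches):
    """Build an illustrative completed-notification payload for one import id."""
    merged = merge_item_results(*batches)
    return {
        "importId": import_id,
        "itemResults": {
            item_type: {
                "successCount": len(r.successes),
                "errorCount": len(r.raw_errors),
                "successes": r.successes,
                "rawErrors": r.raw_errors,
            }
            for item_type, r in merged.items()
        },
    }


# Sample data is made up: one synchronous result and one background plugin result.
sync_results = {"config": ItemResult(successes=["config.toml"])}
plugin_results = {
    "plugins": ItemResult(successes=["plugin-a"], raw_errors=["plugin-b: fetch failed"])
}
payload = completed_payload("import-123", sync_results, plugin_results)
```

A client would match `payload["importId"]` against the id returned by the original request, then read the per-type counts to surface partial failures.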

## Validation

- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server-client event_requires_delivery`
- `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-review-sync-error
just test -p codex-app-server
external_agent_config_import_returns_error_for_failed_sync_import`
- `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-review-external-agent
just test -p codex-app-server external_agent_config`

Note: local sandbox validation used `CODEX_SQLITE_HOME` because the
default sqlite state path is read-only in this environment.

charlesgong-openai · 2026-06-15 13:25:42 -07:00

fc1fb682a7

Add request_user_input auto-resolution window contract (#27256)

## Why

`request_user_input` is moving beyond its original plan-mode-only
workflow, and future default/goal-mode usage needs a way for the model
to ask helpful but non-blocking questions without forcing the turn to
wait forever. This PR adds an explicit `autoResolutionMs` contract so a
later client/runtime change can auto-resolve unanswered prompts after a
bounded window while leaving truly blocking questions unchanged.

This is contract plumbing only; it does not implement the client-side
timer or auto-selection behavior, and the model-facing description
treats the field as reserved unless the current runtime explicitly
supports auto-resolution.

## What Changed

- Added optional `autoResolutionMs` to the model-facing
`request_user_input` args and core `RequestUserInputEvent`.
- Added model-facing schema text for `autoResolutionMs` while marking it
reserved for runtimes that explicitly support auto-resolution.
- Bounded `autoResolutionMs` to `60_000..=240_000` ms during argument
normalization by clamping out-of-range model-provided values.
- Propagated the field through app-server v2
`ToolRequestUserInputParams`, app-server request forwarding, generated
TypeScript, and JSON schema fixtures.
- Updated app-server, core, protocol, and TUI call sites/tests so
omitted values preserve existing `None`/`null` behavior and coverage
verifies a `Some(60_000)` round trip.
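
As a rough illustration of the clamping rule, the sketch below mirrors the stated `60_000..=240_000` ms bounds. The function and constant names are hypothetical; the real normalization lives in the Rust argument handling.

```python
MIN_AUTO_RESOLUTION_MS = 60_000   # lower bound stated in the PR
MAX_AUTO_RESOLUTION_MS = 240_000  # upper bound stated in the PR


def normalize_auto_resolution_ms(value):
    """Clamp a model-provided value into the allowed window; keep None as None."""
    if value is None:
        return None  # an omitted field stays omitted
    return max(MIN_AUTO_RESOLUTION_MS, min(MAX_AUTO_RESOLUTION_MS, value))
```

Out-of-range values are pulled to the nearest bound instead of being rejected, which matches the "clamping" wording above.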

## Verification

- `just test -p codex-app-server-protocol`
- `just test -p codex-core request_user_input`
- `just test -p codex-app-server request_user_input_round_trip`
- `just test -p codex-tui request_user_input`
- `just test -p codex-protocol`

Shijie Rao · 2026-06-11 22:30:41 -07:00

216ce03031

[1 of 3] Support long raw TUI goal objectives (#27508)

## Stack

1. **[1 of 3] Support long raw TUI goal objectives** - this PR
2. [2 of 3] Support long pasted text in TUI goals - #27509
3. [3 of 3] Support images in TUI goals - #27510

## Why

`thread/goal/set` limits persisted objective text to 4000 characters.
The TUI used to reject raw `/goal` objectives above that limit, even
though the client can make them usable by writing the long text to a
file and storing a short objective that points at that file.

This also needs to work for remote app-server sessions: filesystem API
calls must create files on the app-server host, and the stored path must
be meaningful to the agent on that host.

## What Changed

- Adds an app-server-host path helper so TUI code can build paths that
are resolved on the app-server host rather than the TUI host.
- Adds TUI app-server session helpers for `fs/createDirectory`,
`fs/writeFile`, `fs/readFile`, and `fs/remove` that work for embedded
and remote app-server sessions without changing the app-server protocol.
- Materializes oversized raw `/goal` objectives into
`$CODEX_HOME/attachments/<uuid>/goal-objective.md` through the
app-server filesystem APIs, then stores a short, readable objective that
directs the agent to that file.
- Reads managed objective files back for `/goal edit`. Other goal UI
renders the readable stored objective normally, without
managed-file-specific presentation logic.
- Recognizes managed references only when they name the expected
generated file under the app server's reported `$CODEX_HOME`, and cleans
up newly materialized files when goal replacement or setting does not
complete.
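
The threshold and path rules above can be sketched as follows. This is an assumption-laden illustration: the 4000-character limit and the `attachments/<uuid>/goal-objective.md` layout come from the PR text, while the function names, the wording of the short stored objective, and the use of POSIX paths are invented for the example. No file is written here; the real flow goes through the app-server filesystem APIs.

```python
import uuid
from pathlib import PurePosixPath

MAX_OBJECTIVE_CHARS = 4000  # limit stated for thread/goal/set


def plan_goal_objective(text, codex_home, attachment_id=None):
    """Return (stored_objective, managed_path_or_None) without touching the disk."""
    if len(text) <= MAX_OBJECTIVE_CHARS:
        return text, None  # short enough to store inline
    attachment_id = attachment_id or str(uuid.uuid4())
    path = PurePosixPath(codex_home) / "attachments" / attachment_id / "goal-objective.md"
    # Hypothetical wording for the short objective that points at the file.
    return f"Read the full objective from {path}", path


def is_managed_reference(path, codex_home):
    """Accept only <codex_home>/attachments/<uuid>/goal-objective.md."""
    try:
        rel = PurePosixPath(path).relative_to(PurePosixPath(codex_home))
    except ValueError:
        return False  # not under the reported CODEX_HOME
    parts = rel.parts
    if len(parts) != 3 or parts[0] != "attachments" or parts[2] != "goal-objective.md":
        return False
    try:
        uuid.UUID(parts[1])
    except ValueError:
        return False
    return True
```

Checking the path shape before reading a file back keeps `/goal edit` from treating an arbitrary path in an objective as a managed attachment.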

## Verification

- Added/updated TUI tests for raw oversized `/goal` submission, large
inline-paste expansion, queued oversized goals, app-facing
materialization before `thread/goal/set`, managed-path validation,
editing, and cleanup.
- Added/updated app-server-client remote coverage for initialized remote
Codex home handling.

## Manual Testing

- Ran the real TUI against a Unix-socket app server with different local
and server `$CODEX_HOME` directories. Oversized goals wrote only under
the server home, and persisted references used the server-canonical path
rather than the TUI path.
- Exercised 3,999-, 4,000-, and 4,001-character raw objectives. The
first two stayed inline without new files; the 4,001-character objective
became a managed objective file.
- Submitted a larger 8,275-character objective, verified its full
contents on the app-server host, and observed the goal continuation open
the referenced server-side file.
- Opened `/goal edit` for a managed objective and verified the full text
was restored through remote `fs/readFile`.
- Submitted an oversized replacement while a goal was active, verified
no file was written before confirmation, then canceled and confirmed
that the existing goal and attachment count were unchanged.

Eric Traut · 2026-06-11 22:26:31 -07:00

78bab04116

Remove TUI legacy Windows sandbox dependency (#27490)

## Why

This is part of an ongoing attempt to eliminate the TUI's direct
dependency on core features. When we moved the TUI to the app server, we
left a `legacy_core` shim that re-exported some remaining core symbols
for the TUI. The intent was to eventually remove all of these.

In this PR, we remove the symbols related to the Windows sandbox.

The change should be behavior-neutral and low risk because it's just
refactoring and removal of code that is now effectively dead.

When working on this PR, I noticed a big existing problem that affects
mixed-platform remoting. For example, if you run the TUI on a Linux box
and remote into a Windows box, the TUI logic doesn't handle Windows
sandbox setup properly. Fixing this is beyond the scope of this PR, but
I've left a TODO comment in place so we don't forget.

## What changed

- Move the remaining TUI-specific sandbox level, setup, telemetry, and
read-root helpers into `codex-tui`, calling `codex-windows-sandbox`
directly.
- Remove the Windows sandbox namespace and read-root grant re-exports
from the client-side `legacy_core` facade.
- Remove the dormant pre-elevation prompt fallback guarded by the
permanently enabled `ELEVATED_SANDBOX_NUX_ENABLED` switch. The reachable
elevated and non-elevated setup flows remain unchanged.

Eric Traut · 2026-06-11 09:23:08 -07:00

1e5b87b4d7

Trim TUI legacy telemetry and migration dependencies (#27487)

## Why

The TUI still reached through `codex-app-server-client::legacy_core` for
process telemetry setup and personality migration, exposing core-only
details after the TUI moved onto the app-server layer.

This is part of our ongoing efforts to whittle away at the `legacy_core`
shim that was left over after migrating the TUI to the app server.

This change is just a refactor/rename and should be behavior-neutral and
low risk.

## What changed

- expose OTEL provider construction through the app-server client and
keep the small process/SQLite telemetry adapters local to the TUI
- collapse personality migration results to the config-reload decision
the TUI needs
- remove the `legacy_core::otel_init` and
`legacy_core::personality_migration` subnamespaces

Eric Traut · 2026-06-10 19:50:57 -07:00

ab4ce40042

Remove TUI legacy core test_support dependencies (#27484)

## Why

The TUI now sits on the app-server layer, but
`app-server-client::legacy_core` still exposed core test helpers solely
for TUI tests. We've been whittling away the remaining dependencies.
This is the next step on that journey.

There is no functional change — just a refactor, and this affects only
test code, so it should be low risk.

## What changed

- remove the `legacy_core::test_support` re-export and call
model-manager test helpers directly
- keep the bundled model-preset cache local to TUI test support
- import constraint types directly from `codex-config`

Eric Traut · 2026-06-10 17:55:49 -07:00

36fc79c6f4

[codex] add /import for external agents (#27071)

## Why

External-agent import should be discoverable and deliberate without
blocking startup or claiming the public `codex [PROMPT]` CLI namespace.
The slash command keeps the flow local to the interactive TUI and reuses
the existing app-server import API.

## What changed

- add the user-facing `/import` slash command
- detect external-agent importable items only when the command is
invoked
- run imports through the embedded local app-server
- show start and completion messages, refresh configuration, and block
duplicate imports while one is pending
- reject the flow for unsupported remote and local-daemon sessions

## Validation

- `just test -p codex-tui external_agent_config_migration` (10 passed)
- manually exercised an isolated TUI fixture with existing
external-agent setup and session data using a fresh `CODEX_HOME`
- verified picker customization, plugin and session detection, import
completion, repeated invocation, and imported-session resume context
- the broader `just test -p codex-tui` run passed 2,805 tests, with 2
unrelated guardian feature-flag failures and 4 skipped tests

## Draft follow-ups

- review whether completion messaging should remain attached to the
initiating chat if the user switches chats during an import
- review shutdown semantics for an in-progress background import

## Stack

1. [#27064](https://github.com/openai/codex/pull/27064): remove the
startup migration flow
2. [#27065](https://github.com/openai/codex/pull/27065): extract the
picker renderer
3. [#27070](https://github.com/openai/codex/pull/27070): add the
external-agent import picker UX
4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow
through `/import`

**This PR is stack item 4.** Draft while the lower stack dependencies
are reviewed.

stefanstokic-oai · 2026-06-10 15:53:15 -04:00

b4445f2758

Reduce TUI legacy core dependencies (#26711)

## Why

The TUI still reached through `app-server-client::legacy_core` for
thread-name normalization and project-instruction filename details. In
particular, checking the TUI's local filesystem for `/init` is incorrect
for remote app-server sessions, where the server owns the working
directory and instruction discovery.

## What changed

- use the instruction source paths supplied by the app server to decide
whether `/init` should avoid overwriting project instructions
- keep the small thread-name normalization helper local to the TUI
- remove the now-unused instruction filename constants, utility module,
and other unused `legacy_core` re-exports
- make status helper tests independent of concrete instruction filenames

## Verification

- `just test -p codex-app-server-client`
- `just test -p codex-tui
slash_init_skips_when_project_instructions_are_loaded`
- `just test -p codex-tui` ran 2,799 tests; 2,797 passed and two
unrelated guardian feature-flag tests failed reproducibly in untouched
code

### Manual test

Started an app server over WebSocket with a remote workspace containing
`AGENTS.md`, then connected the TUI using `--remote`. After confirming
`thread/start` returned the file in `instructionSources`, deleted
`AGENTS.md` and ran `/init` in the existing session.

The TUI still reported that project instructions already existed and
skipped `/init`. The trace contained no `turn/start` request, confirming
the decision came from app-server session state rather than a new
client-local filesystem check.

Eric Traut · 2026-06-09 13:26:00 -07:00

8e69d29521

Switch runtime to cloud config bundle (#24622)

## Summary

- Adapts the moved `codex-cloud-config` crate from the legacy cloud
requirements endpoint to the new config bundle endpoint.
- Switches runtime consumers from `CloudRequirementsLoader` to
`CloudConfigBundleLoader` so one shared bundle supplies cloud-delivered
config and requirements.
- Removes the legacy cloud requirements domain loader path.

## Details

This intentionally keeps `codex-cloud-config` monolithic for review
lineage: the previous PR establishes the crate move, and this PR shows
the behavior change against that moved implementation. A follow-up PR
splits the module back into focused files.

The new bundle path preserves the important cloud requirements loader
semantics where intended: account-scoped signed cache, 30 minute TTL, 5
minute refresh cadence, retry/backoff, auth recovery, and fail-closed
startup loading. The cached payload changes from a single requirements
TOML string to the backend-delivered bundle, and validation rejects
malformed config or requirements fragments before cache write/use.
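
To make the two timing figures concrete, here is a small sketch of how a 30 minute TTL and a 5 minute refresh cadence could interact. Only the two durations come from the text above; the state names, the boundary handling, and the idea of deriving both from a single fetch timestamp are assumptions, and the actual loader may schedule refreshes differently.

```python
from datetime import datetime, timedelta

CACHE_TTL = timedelta(minutes=30)        # stated TTL
REFRESH_INTERVAL = timedelta(minutes=5)  # stated refresh cadence


def cache_state(fetched_at, now):
    """Classify a cached bundle by age (illustrative state names)."""
    age = now - fetched_at
    if age >= CACHE_TTL:
        return "expired"      # too old to use without a successful fetch
    if age >= REFRESH_INTERVAL:
        return "refresh_due"  # still usable, but a background refresh is due
    return "fresh"


fetched = datetime(2026, 6, 2, 12, 0, 0)
states = [
    cache_state(fetched, fetched + timedelta(minutes=m)) for m in (1, 10, 45)
]
```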

joeflorencio-openai · 2026-06-02 13:18:59 -07:00

d45cd26248

Show remote connection details in /status (#24420)

## Summary

Fixes #24411.

`/status` currently has no way to show when the TUI is talking to Codex
through a remote transport. That makes embedded local sessions, local
daemon sessions, and true remote sessions look the same, and it hides
the remote server version when debugging connection-specific behavior.

This PR adds a single `Remote` row for non-embedded connections only.
The row shows the sanitized connection address and a dimmed version
parenthetical, preserving the existing status output for embedded local
sessions.

<img width="791" height="144" alt="image"
src="https://github.com/user-attachments/assets/529d7940-1c45-4586-8b06-f20a1f04b771"
/>


## Verification

- Manually validated when connecting remotely (either implicitly to
local daemon or explicitly)

Eric Traut · 2026-05-25 09:42:42 -07:00

913270a689

Add thread/settings/update app-server API (#23502)

## Why

App-server clients need a way to update a thread's next-turn settings
without starting a turn, adding transcript content, or waiting for turn
lifecycle events. This gives settings UI a direct path for durable
thread settings while clients observe the eventual effective state
through a notification.

This is a simplified rework of PR
https://github.com/openai/codex/pull/22509. In particular, it changes
the `thread/settings/update` API to return immediately rather than
waiting and returning the effective (updated) thread settings. This
makes the new API consistent with `turn/start` and greatly reduces the
complexity of the implementation relative to the earlier attempt.

## What Changed

- Adds experimental `thread/settings/update` with partial-update request
fields and an empty acknowledgment response.
- Adds experimental `thread/settings/updated`, carrying full effective
`ThreadSettings` and scoped by `threadId` to subscribed clients for the
affected thread.
- Shares durable settings validation with `turn/start`, including
`sandboxPolicy` plus `permissions` rejection and `serviceTier: null`
clearing.
- Emits the same settings notification when `turn/start` overrides
change the stored effective thread settings.
- Regenerates app-server protocol schema fixtures and updates
`app-server/README.md`.
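
The partial-update semantics can be sketched as below. This is an interpretation, not the server code: it assumes that an omitted key means "leave unchanged", that `serviceTier: null` removes the stored tier, and that a request carrying both `sandboxPolicy` and `permissions` is rejected. The function name and the sample setting values are hypothetical.

```python
def apply_settings_update(current, update):
    """Return new effective settings after applying a partial update."""
    if "sandboxPolicy" in update and "permissions" in update:
        # Assumed reading of "sandboxPolicy plus permissions rejection".
        raise ValueError("sandboxPolicy and permissions cannot be combined")
    effective = dict(current)
    for key, value in update.items():
        if key == "serviceTier" and value is None:
            effective.pop("serviceTier", None)  # explicit null clears the tier
        else:
            effective[key] = value
    return effective


# Sample values are placeholders, not real tier or model names.
current = {"model": "example-model", "serviceTier": "example-tier"}
unchanged = apply_settings_update(current, {})
cleared = apply_settings_update(current, {"serviceTier": None})
```

Distinguishing "key absent" from "key present with null" is what lets a client clear a setting without resending every other field.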

Eric Traut · 2026-05-20 11:03:20 -07:00

771a4e74ac

Make local environment optional in EnvironmentManager (#23369)

## Summary
- make `EnvironmentManager` local environment/runtime paths optional
- simplify constructor surface around snapshot materialization
- rename local env accessors to `require_local_environment` /
`try_local_environment`

## Validation
- devbox Bazel build for touched crate surfaces
- `//codex-rs/exec-server:exec-server-unit-tests`
- `//codex-rs/app-server-client:app-server-client-unit-tests`
- filtered touched `//codex-rs/core:core-unit-tests` cases

starr-openai · 2026-05-19 12:55:34 -07:00

5c43a64e2b

config: add strict config parsing (#20559)

## Why

Codex intentionally ignores unknown `config.toml` fields by default so
older and newer config files keep working across versions. That leniency
also makes typo detection hard because misspelled or misplaced keys
disappear silently.

This change adds an opt-in strict config mode so users and tooling can
fail fast on unrecognized config fields without changing the default
permissive behavior.

This feature is possible because `serde_ignored` exposes the exact
signal Codex needs: it lets Codex run ordinary Serde deserialization
while recording fields Serde would otherwise ignore. That avoids
requiring `#[serde(deny_unknown_fields)]` across every config type and
keeps strict validation opt-in around the existing config model.

## What Changed

### Added strict config validation

- Added `serde_ignored`-based validation for `ConfigToml` in
`codex-rs/config/src/strict_config.rs`.
- Combined `serde_ignored` with `serde_path_to_error` so strict mode
preserves typed config error paths while also collecting fields Serde
would otherwise ignore.
- Added strict-mode validation for unknown `[features]` keys, including
keys that would otherwise be accepted by `FeaturesToml`'s flattened
boolean map.
- Kept typed config errors ahead of ignored-field reporting, so
malformed known fields are reported before unknown-field diagnostics.
- Added source-range diagnostics for top-level and nested unknown config
fields, including non-file managed preference source names.

### Kept parsing single-pass per source

- Reworked file and managed-config loading so strict validation reuses
the already parsed `TomlValue` for that source.
- For actual config files and managed config strings, the loader now
reads once, parses once, and validates that same parsed value instead of
deserializing multiple times.
- Validated `-c` / `--config` override layers with the same
base-directory context used for normal relative-path resolution, so
unknown override keys are still reported when another override contains
a relative path.

### Scoped `--strict-config` to config-heavy entry points

- Added support for `--strict-config` on the main config-loading entry
points where it is most useful:
  - `codex`
  - `codex resume`
  - `codex fork`
  - `codex exec`
  - `codex review`
  - `codex mcp-server`
  - `codex app-server` when running the server itself
  - the standalone `codex-app-server` binary
  - the standalone `codex-exec` binary
- Commands outside that set now reject `--strict-config` early with
targeted errors instead of accepting it everywhere through shared CLI
plumbing.
- `codex app-server` subcommands such as `proxy`, `daemon`, and
`generate-*` are intentionally excluded from the first rollout.
- When app-server strict mode sees invalid config, app-server exits with
the config error instead of logging a warning and continuing with
defaults.
- Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli`
instead of extending shared `ReviewArgs`, so `--strict-config` stays on
the outer config-loading command surface and does not become part of the
reusable review payload used by `codex exec review`.

### Coverage

- Added tests for top-level and nested unknown config fields, unknown
`[features]` keys, typed-error precedence, source-location reporting,
and non-file managed preference source names.
- Added CLI coverage showing invalid `--enable`, invalid `--disable`,
and unknown `-c` overrides still error when `--strict-config` is
present, including compound-looking feature names such as
`multi_agent_v2.subagent_usage_hint_text`.
- Added integration coverage showing both `codex app-server
--strict-config` and standalone `codex-app-server --strict-config` exit
with an error for unknown config fields instead of starting with
fallback defaults.
- Added coverage showing unsupported command surfaces reject
`--strict-config` with explicit errors.

## Example Usage

Run Codex with strict config validation enabled:

```shell
codex --strict-config
```

Strict config mode is also available on the supported config-heavy
subcommands:

```shell
codex --strict-config exec "explain this repository"
codex review --strict-config --uncommitted
codex mcp-server --strict-config
codex app-server --strict-config --listen off
codex-app-server --strict-config --listen off
```

For example, if `~/.codex/config.toml` contains a typo in a key name:

```toml
model = "gpt-5"
approval_polic = "on-request"
```

then `codex --strict-config` reports the misspelled key instead of
silently ignoring it. The path is shortened to `~` here for readability:

```text
$ codex --strict-config
Error loading config.toml:
~/.codex/config.toml:2:1: unknown configuration field `approval_polic`
  |
2 | approval_polic = "on-request"
  | ^^^^^^^^^^^^^^
```

Without `--strict-config`, Codex keeps the existing permissive behavior
and ignores the unknown key.

Strict config mode also validates ad-hoc `-c` / `--config` overrides:

```text
$ codex --strict-config -c foo=bar
Error: unknown configuration field `foo` in -c/--config override

$ codex --strict-config -c features.foo=true
Error: unknown configuration field `features.foo` in -c/--config override
```

Invalid feature toggles are rejected too, including values that look
like nested config paths:

```text
$ codex --strict-config --enable does_not_exist
Error: Unknown feature flag: does_not_exist

$ codex --strict-config --disable does_not_exist
Error: Unknown feature flag: does_not_exist

$ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text
Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text
```

Unsupported commands reject the flag explicitly:

```text
$ codex --strict-config cloud list
Error: `--strict-config` is not supported for `codex cloud`
```

## Verification

The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid
`--disable`, the compound `multi_agent_v2.subagent_usage_hint_text`
case, unknown `-c` overrides, app-server strict startup failure through
`codex app-server`, and rejection for unsupported commands such as
`codex cloud`, `codex mcp`, `codex remote-control`, and `codex
app-server proxy`.

The config and config-loader tests cover unknown top-level fields,
unknown nested fields, unknown `[features]` keys, source-location
reporting, non-file managed config sources, and `-c` validation for keys
such as `features.foo`.

The app-server test suite covers standalone `codex-app-server
--strict-config` startup failure for an unknown config field.

## Documentation

The Codex CLI docs on developers.openai.com/codex should mention
`--strict-config` as an opt-in validation mode for supported
config-heavy entry points once this ships.

Michael Bolin · 2026-05-13 16:08:05 +00:00

889ee018e7

Add support for UDS in codex --remote (#22414)

## Why

Added support for Unix domain socket (UDS) connections in
`codex --remote`.

The TUI also now connects to a local app-server over UDS by default if
the server is running and configured to listen on a UDS.

## What Changed

- Introduced `RemoteAppServerEndpoint` with `WebSocket` and `UnixSocket`
variants.
- Reused the existing JSON-RPC-over-WebSocket protocol over either a TCP
WebSocket stream or a UDS stream.
- Updated `codex --remote` to accept `ws://host:port`,
`wss://host:port`, `unix://`, and `unix://PATH`.
- Kept `--remote-auth-token-env` restricted to `wss://` and loopback
`ws://` remotes.
- Added a fast TUI startup probe for the default daemon socket, falling
back to the embedded app server when the daemon is absent or
unresponsive.
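
A rough sketch of the address forms and the auth-token policy listed above follows. The accepted forms (`ws://host:port`, `wss://host:port`, `unix://`, `unix://PATH`) and the `wss://` or loopback `ws://` restriction come from the PR text. The function names, the tuple return shape, and treating a bare `unix://` as "no explicit path" are assumptions for the example.

```python
import ipaddress
from urllib.parse import urlsplit


def parse_remote(addr):
    """Return ("websocket", url) or ("unix", path_or_None)."""
    if addr.startswith("unix://"):
        path = addr[len("unix://"):]
        return ("unix", path or None)  # None stands for "no explicit path given"
    parts = urlsplit(addr)
    if parts.scheme in ("ws", "wss") and parts.hostname and parts.port is not None:
        return ("websocket", addr)
    raise ValueError(f"unsupported remote address: {addr}")


def is_loopback(host):
    """True for localhost or a loopback IP literal."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # not an IP literal


def auth_token_allowed(addr):
    """Allow auth tokens only on wss:// or loopback ws:// remotes."""
    parts = urlsplit(addr)
    if parts.scheme == "wss":
        return True
    return parts.scheme == "ws" and is_loopback(parts.hostname or "")
```

Keeping the token policy separate from parsing makes it straightforward to refuse a token for `unix://` and non-loopback `ws://` remotes without rejecting the address itself.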

## Verification

- Manually verified that the updated remote flow works.
- Added coverage for UDS remote round trips, WebSocket auth headers,
auth-token transport policy, remote address parsing, and missing-daemon
fallback.
- Ran focused remote test coverage locally.

Eric Traut · 2026-05-12 21:17:20 -07:00

ad572709ab

[codex] request desktop attestation from app (#20619)

## Summary

TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
attestation token and attach it as `x-oai-attestation` on the scoped
ChatGPT Codex request paths.

![DeviceCheck attestation
interface](https://raw.githubusercontent.com/openai/codex/dev/jm/devicecheck-diagram-assets/pr-assets/devicecheck-attestation-interface.png)

## Details

This PR teaches the Codex app-server runtime how to request and attach
an attestation token. It does not generate DeviceCheck tokens directly;
instead, it relies on the connected desktop app to advertise that it can
generate attestation and then asks that app for a fresh header value
when needed.

The flow is:

1. The Codex desktop app connects to app-server.
2. During `initialize`, the app can advertise that it supports
`requestAttestation`.
3. Before app-server calls selected ChatGPT Codex endpoints, it sends
the internal server request `attestation/generate` to the app.
4. app-server receives a pre-encoded header value back.
5. app-server forwards that value as `x-oai-attestation` on the scoped
outbound requests.
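
The five steps above can be condensed into a small sketch. It is a simplification under assumptions: the capability is modeled as a boolean `requestAttestation` entry in a dict, the `attestation/generate` round trip is modeled as a plain callable, and the function name is made up. Only the capability name, the request name, and the `x-oai-attestation` header come from the PR text.

```python
def build_outbound_headers(client_capabilities, request_attestation, base_headers=None):
    """Attach x-oai-attestation only when the connected app advertised support."""
    headers = dict(base_headers or {})
    if client_capabilities.get("requestAttestation"):
        # Stand-in for sending `attestation/generate` to the app and
        # receiving a pre-encoded header value back.
        headers["x-oai-attestation"] = request_attestation()
    return headers


with_app = build_outbound_headers(
    {"requestAttestation": True},
    lambda: "v1.example-token",  # placeholder value, not a real token
    {"accept": "application/json"},
)
without_app = build_outbound_headers({}, lambda: "unused")
```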

The code in this repo is mostly protocol and runtime plumbing: it adds
the app-server request/response shape, introduces an attestation
provider in core, wires that provider into Responses / compaction /
realtime setup paths, and covers the intended scoping with tests. The
signed macOS DeviceCheck generation remains owned by the desktop app PR.

## Related PR

- Codex desktop app implementation:
https://github.com/openai/openai/pull/878649

## Validation

<details>
<summary>Tests run</summary>

```sh
cargo test -p codex-app-server-protocol
cargo test -p codex-core attestation --lib
cargo test -p codex-app-server --lib attestation
```

Also ran:

```sh
just fix -p codex-core
just fix -p codex-app-server
just fix -p codex-app-server-protocol
just fmt
just write-app-server-schema
```

</details>

<details>
<summary>E2E DeviceCheck validation</summary>

First validated the signed desktop app boundary directly: launched a
packaged signed `Codex.app`, sent `attestation/generate`, decoded the
returned `v1.` attestation header, and validated the extracted
DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
`is_ok: true`.

Then ran the fuller app + app-server flow. The packaged `Codex.app`
launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
requested `attestation/generate` from the real Electron app process, and
the intercepted `/backend-api/codex/responses` traffic included
`x-oai-attestation` on both routes:

```text
GET /backend-api/codex/responses Upgrade: websocket x-oai-attestation: present
POST /backend-api/codex/responses Upgrade: none x-oai-attestation: present
```

The captured header decoded to a DeviceCheck token that also validated
with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
team `2DC432GLL2`).

</details>

---------

Co-authored-by: Codex <noreply@openai.com>

Jiaming Zhang · 2026-05-08 12:36:02 -07:00

5f4d0ec343

Load configured environments from CODEX_HOME (#20667)

## Why

The earlier PRs add stdio transport support and the config-backed
environment provider, but the feature remains inert until normal Codex
entrypoints construct `EnvironmentManager` with enough context to
discover `CODEX_HOME/environments.toml`. This final stack PR activates
the provider while preserving the legacy `CODEX_EXEC_SERVER_URL`
fallback when no environments file exists.

**Stack position:** this is PR 5 of 5. It is the product wiring PR that
activates the configured environment provider added in PR 4.

## What Changed

- Thread `codex_home` into `EnvironmentManagerArgs`.
- Change `EnvironmentManager::new(...)` to load the provider from
`CODEX_HOME`.
- Preserve legacy behavior by falling back to
`DefaultEnvironmentProvider::from_env()` when `environments.toml` is
absent.
- Make `environments.toml`-backed managers start new threads with all
configured environments, default first, while keeping the legacy env-var
path single-default.
- Update the app-server, TUI, exec, MCP server, connector, prompt-debug,
and thread-manager-sample callsites to pass `codex_home` and handle
provider-loading errors.
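
The selection rule described in this list can be sketched as follows. The file name `environments.toml`, the `CODEX_EXEC_SERVER_URL` fallback, and the "all configured environments versus single default" distinction come from the PR text. The function name and the returned dictionary shape are hypothetical, and the sketch does not parse the TOML file.

```python
import os
import tempfile
from pathlib import Path


def select_environment_provider(codex_home, env=None):
    """Pick the config-backed provider when environments.toml exists, else the legacy one."""
    env = os.environ if env is None else env
    if (Path(codex_home) / "environments.toml").is_file():
        return {"provider": "environments_toml", "start_with_all_environments": True}
    return {
        "provider": "legacy_env",
        "exec_server_url": env.get("CODEX_EXEC_SERVER_URL"),
        "start_with_all_environments": False,
    }


with tempfile.TemporaryDirectory() as home:
    # No environments.toml yet: the legacy env-var path applies.
    legacy = select_environment_provider(
        home, env={"CODEX_EXEC_SERVER_URL": "ws://127.0.0.1:9000"}
    )
    # File contents are omitted; only the file's presence matters in this sketch.
    (Path(home) / "environments.toml").write_text("# contents omitted\n")
    configured = select_environment_provider(home, env={})
```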

## Self-Review Notes

- The multi-environment startup path is intentionally tied to the
`environments.toml` provider. Using `>1` configured environment as the
only signal would also expand the legacy `CODEX_EXEC_SERVER_URL`
provider because it keeps `local` addressable alongside `remote`.
- The startup environment list is still derived inside
`EnvironmentManager`; the provider only says whether its snapshot should
start new threads with all configured environments.
- The thread-manager sample was updated to pass the current
`ThreadManager::new(...)` installation id argument so the stack compiles
under Bazel.

## Stack

- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- **5. This PR:** https://github.com/openai/codex/pull/20667 - Load
configured environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

- `just fmt`
- `git diff --check`
- `bazel build --config=remote --strategy=remote
--remote_download_toplevel
//codex-rs/thread-manager-sample:codex-thread-manager-sample`
- `bazel test --config=remote --strategy=remote
--remote_download_toplevel
//codex-rs/exec-server:exec-server-unit-tests`
- `bazel test --config=remote --strategy=remote
--remote_download_toplevel --test_sharding_strategy=disabled
--test_arg=default_thread_environment_selections_use_manager_default_id
//codex-rs/core:core-unit-tests`
- `bazel test --config=remote --strategy=remote
--remote_download_toplevel --test_sharding_strategy=disabled
--test_arg=start_thread_uses_all_default_environments_from_codex_home
//codex-rs/core:core-unit-tests`

## Documentation

This activates `CODEX_HOME/environments.toml`; user-facing documentation
should be added before this stack is treated as a documented public
workflow.

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-05-08 11:17:56 -07:00

5f2543b74e

Disable empty Cargo test targets (#21584 )

## Summary

`cargo test` entails running both standard Rust tests and doctests.
It turns out that doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, the speedup is >2x. Before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```
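
The quoted speedups follow from the mean times in the benchmark output above. A minimal sketch of the arithmetic (the `speedup` helper is only for illustration):

```rust
/// Ratio of the mean wall-clock time before the change to the mean after it.
fn speedup(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}
```

`speedup(1849.0, 428.6)` is about 4.3 for the ten-crate run, and `speedup(491.1, 213.9)` is about 2.3 for `codex-utils-string`. Note that the ten-crate "before" run has a standard deviation (4.455 s) larger than its mean, so the >4x figure is sensitive to the slowest run.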

Co-authored-by: Codex <noreply@openai.com>

Charlie Marsh · 2026-05-07 15:44:17 -07:00

54ef99a365

Revert state DB injection and agent graph store (#21481 )

## Why

Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.

## What changed

- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.

## Validation

- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.

pakrym-oai · 2026-05-06 22:48:29 -07:00

a8488fec5e

Move message history out of core (#21278 )

## Why

Message history was implemented inside `codex-core` and surfaced through
core protocol ops and `SessionConfiguredEvent` fields even though the
current consumer is TUI-local prompt recall. That made core own UI
history persistence and exposed `history_log_id` / `history_entry_count`
through surfaces that app-server and other clients do not need.

This change moves message history persistence out of core and keeps the
recall plumbing local to the TUI.

## What changed

- Added a new `codex-message-history` crate for appending, looking up,
trimming, and reading metadata from `history.jsonl`.
- Removed core protocol history ops/events: `AddToHistory`,
`GetHistoryEntryRequest`, and `GetHistoryEntryResponse`.
- Removed `history_log_id` and `history_entry_count` from
`SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly.
- Updated the TUI to dispatch local app events for message-history
append/lookup and keep its persistent-history metadata in TUI session
state.
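
To make the append/lookup/trim responsibilities concrete, here is a minimal in-memory sketch. It is not the `codex-message-history` API: the `MessageHistory` type, the `{"text": ...}` entry shape, and the entry cap are all assumptions for illustration, and a real implementation would write to `history.jsonl` with a JSON library.

```rust
/// Hypothetical in-memory stand-in for a `history.jsonl` log: one entry per line.
struct MessageHistory {
    lines: Vec<String>,
    max_entries: usize,
}

impl MessageHistory {
    fn new(max_entries: usize) -> Self {
        Self { lines: Vec::new(), max_entries }
    }

    /// Appends one entry, then trims the oldest entries beyond the cap.
    fn append(&mut self, text: &str) {
        // Simplified escaping; a real JSONL writer would use a JSON library.
        let escaped = text
            .replace('\\', "\\\\")
            .replace('"', "\\\"")
            .replace('\n', "\\n");
        self.lines.push(format!("{{\"text\":\"{}\"}}", escaped));
        if self.lines.len() > self.max_entries {
            let excess = self.lines.len() - self.max_entries;
            self.lines.drain(..excess);
        }
    }

    /// Looks up an entry by offset from the oldest retained line.
    fn lookup(&self, offset: usize) -> Option<&str> {
        self.lines.get(offset).map(String::as_str)
    }

    /// Metadata a prompt-recall UI might keep: the number of retained entries.
    fn entry_count(&self) -> usize {
        self.lines.len()
    }
}
```

Keeping this state behind a small type is what lets the TUI own it locally instead of routing it through core protocol events.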

## Validation

- `cargo test -p codex-message-history -p codex-protocol`
- `cargo test -p codex-exec event_processor_with_json_output`
- `cargo test -p codex-mcp-server outgoing_message`
- `cargo test -p codex-tui`
- `just fix -p codex-message-history -p codex-protocol -p codex-core -p
codex-tui -p codex-exec -p codex-mcp-server`

pakrym-oai · 2026-05-06 08:35:42 -07:00

2004173cd7

test: isolate app-server-client in-process test state (#21328 )

## Why

The in-process `app-server-client` tests were still building their
configs from the ambient `codex_home` and letting the embedded app
server create its own state DB when `state_db` was absent. That matters
because in-process startup falls back to
`init_state_db_from_config(...)` in that case, so tests can otherwise
share persisted state instead of getting isolated fixtures:
[`app-server/src/in_process.rs`](https://github.com/openai/codex/blob/a98623511ba433154ec811fc63091617f5945438/codex-rs/app-server/src/in_process.rs#L368-L373).

## What changed

- Give each in-process test client its own temporary `codex_home`.
- Initialize the matching state DB from that per-client config and pass
it into the client explicitly.
- Keep the temp directory alive for the lifetime of the test client
through a small `TestClient` wrapper.
- Add `tempfile` as a dev dependency for the new harness.

The updated setup lives in
[`app-server-client/src/lib.rs`](https://github.com/openai/codex/blob/35c1133d45d10931914dbb88a1246a195d025ff6/codex-rs/app-server-client/src/lib.rs#L982-L1055).
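
The wrapper pattern can be sketched with the standard library alone. The PR uses the `tempfile` crate; the `TempHome` type below is a simplified stand-in for it, and the `TestClient` fields are hypothetical.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Owns a per-test directory and removes it on drop (stand-in for `tempfile`).
struct TempHome {
    path: PathBuf,
}

impl TempHome {
    fn new() -> std::io::Result<Self> {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let name = format!("codex-home-sketch-{}-{}", std::process::id(), id);
        let path = std::env::temp_dir().join(name);
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }
}

impl Drop for TempHome {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored in this sketch.
        let _ = fs::remove_dir_all(&self.path);
    }
}

/// Hypothetical wrapper: the directory lives exactly as long as the client.
struct TestClient {
    client_name: String,
    // Declared last so it is dropped after the other fields.
    home: TempHome,
}

impl TestClient {
    fn start(name: &str) -> std::io::Result<Self> {
        let home = TempHome::new()?;
        Ok(Self { client_name: name.to_string(), home })
    }

    fn codex_home(&self) -> &Path {
        &self.home.path
    }
}
```

Because each client owns its directory, two clients created in the same test never share a `codex_home`, and the directory disappears when the client is dropped.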

## Testing

- Existing `codex-app-server-client` tests continue to exercise the
updated in-process client path through the isolated helper.

jif-oai · 2026-05-06 09:21:22 +00:00

b5e965e1d7

Inject state DB, agent graph store (#20689 )

## Why

We want the agent graph store to be passed down the stack as a real
dependency, the same way we already treat the thread store.

Injecting it lets us support implementations other than the local
SQLite-backed one. Right now most code instantiates a state DB and an
agent graph store just-in-time. Ideally, we would not depend on the
state DB directly and would only read through the higher-level
interfaces.

This change makes the dependency boundaries explicit and moves state DB
initialization to process bootstrap instead of hiding it inside local
store implementations.

## What changed

- `ThreadManager` now requires a `StateDbHandle` and an
`AgentGraphStore` at construction time instead of treating them as
optional internals.
- The local store constructors no longer lazily initialize SQLite.
Callers now initialize the state DB once per process and use that shared
handle to build:
  - `LocalThreadStore`
  - `LocalAgentGraphStore`
- App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
thread-manager sample) now initialize the state DB up front and inject
the resulting handle down the stack.
- `app-server` now consistently uses its process-scoped state DB handle
instead of reopening SQLite or trying to recover it from loaded threads.
- Device-key storage now reuses the shared state DB handle instead of
maintaining its own lazy opener.
- The thread archive / descendant traversal paths now use the injected
`AgentGraphStore` instead of reaching through local
thread-store-specific state.
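
A minimal sketch of the injection shape described above, assuming a trait-object store. The trait method, the in-memory store, and `threads_to_archive` are hypothetical; only the names `ThreadManager`, `StateDbHandle`, and `AgentGraphStore` come from this PR (which #21481 later reverted).

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the process-scoped state DB handle.
#[derive(Clone)]
struct StateDbHandle(Arc<String>);

/// Hypothetical interface; the local implementation would be SQLite-backed.
trait AgentGraphStore {
    fn descendants(&self, thread_id: &str) -> Vec<String>;
}

/// In-memory implementation, showing that a non-SQLite store can be injected.
struct InMemoryAgentGraphStore {
    children: HashMap<String, Vec<String>>,
}

impl AgentGraphStore for InMemoryAgentGraphStore {
    /// Walks the spawn tree; assumes the graph has no cycles.
    fn descendants(&self, thread_id: &str) -> Vec<String> {
        let mut out = Vec::new();
        let mut stack = vec![thread_id.to_string()];
        while let Some(id) = stack.pop() {
            for child in self.children.get(&id).into_iter().flatten() {
                out.push(child.clone());
                stack.push(child.clone());
            }
        }
        out
    }
}

struct ThreadManager {
    state_db: StateDbHandle,
    agent_graph_store: Arc<dyn AgentGraphStore>,
}

impl ThreadManager {
    /// Both dependencies are required at construction time.
    fn new(state_db: StateDbHandle, agent_graph_store: Arc<dyn AgentGraphStore>) -> Self {
        Self { state_db, agent_graph_store }
    }

    /// Archive traversal goes through the injected store.
    fn threads_to_archive(&self, root: &str) -> Vec<String> {
        let mut ids = vec![root.to_string()];
        ids.extend(self.agent_graph_store.descendants(root));
        ids
    }
}
```

The bootstrap code decides which store to build; `ThreadManager` only sees the trait.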

## Verification

- `cargo check -p codex-core -p codex-thread-store -p codex-app-server
-p codex-mcp-server -p codex-thread-manager-sample --tests`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-core
thread_manager_accepts_separate_agent_graph_store_and_thread_store --
--nocapture`
- `cargo test -p codex-app-server
thread_archive_archives_spawned_descendants -- --nocapture`

Rasmus Rygaard · 2026-05-05 21:45:29 +00:00

7e310bc7f3

add turn items view to app-server turns (#21063 )

## Why

`Turn.items` currently overloads an empty array to mean either that no
items exist or that the server intentionally did not load them for this
response. That ambiguity blocks future lazy-loading work where clients
need to distinguish unloaded, summary, and fully hydrated turn payloads.

## What changed

- add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full`
variants
- add required `itemsView` metadata to app-server `Turn` payloads
- mark reconstructed persisted history as `full` and live shell-style
turn payloads as `notLoaded`
- keep current `thread/turns/list` behavior unchanged and document that
it still returns `full` turns today
- regenerate the JSON and TypeScript protocol fixtures
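
A sketch of how a client might use the new metadata to resolve the empty-array ambiguity. The variant spellings come from this PR; the trimmed-down `Turn` struct and `is_known_empty` helper are assumptions for illustration.

```rust
/// How much of `Turn.items` the server populated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TurnItemsView {
    NotLoaded,
    Summary,
    Full,
}

impl TurnItemsView {
    /// Wire spelling as listed in the PR description.
    fn as_wire_str(self) -> &'static str {
        match self {
            TurnItemsView::NotLoaded => "notLoaded",
            TurnItemsView::Summary => "summary",
            TurnItemsView::Full => "full",
        }
    }
}

/// Hypothetical, trimmed-down turn payload.
struct Turn {
    items: Vec<String>,
    items_view: TurnItemsView,
}

/// An empty `items` array means "no items exist" only when the view is `Full`.
fn is_known_empty(turn: &Turn) -> bool {
    turn.items.is_empty() && turn.items_view == TurnItemsView::Full
}
```

A `notLoaded` turn with an empty array tells the client to fetch items rather than render an empty turn.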

## Verification

- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server thread_read_can_include_turns`
- `cargo test -p codex-app-server
thread_turns_list_can_page_backward_and_forward`
- `cargo test -p codex-app-server
thread_resume_rejects_history_when_thread_is_running`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fmt`

rhan-oai · 2026-05-05 19:17:16 +00:00

9e0c191c13

[codex-analytics] add item lifecycle timing (#20514 )

## Why

Tool families already disagree on what their existing `duration` fields
mean, so lifecycle latency should live on the shared item envelope
instead of being inferred from per-tool execution fields. Carrying that
envelope through app-server notifications gives downstream consumers one
reusable timing signal without pretending every tool has the same
execution semantics.

## What changed

- Adds `started_at_ms` to core `ItemStartedEvent` values and
`completed_at_ms` to core `ItemCompletedEvent` values.
- Populates those timestamps in the shared session lifecycle emitters,
so protocol-native items get timing without each producer tracking its
own clock state.
- Exposes `startedAtMs` on app-server `item/started` notifications and
`completedAtMs` on `item/completed` notifications.
- Maps the lifecycle timestamps through the app-server boundary while
leaving legacy-converted notifications nullable when no lifecycle
timestamp exists.
- Regenerates the app-server JSON schema and TypeScript fixtures for the
notification-envelope change and updates downstream fixtures that
construct those notifications directly.
- Extends the existing web-search and image-generation integration flows
to assert the new lifecycle timestamps on the native item events.
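
A downstream consumer could derive lifecycle latency from the envelope roughly as follows. The struct shapes and the `i64` millisecond type are assumptions; only the field names come from this PR.

```rust
/// Hypothetical, trimmed-down envelopes. Timestamps are optional because
/// legacy-converted notifications may not carry them.
struct ItemStarted {
    item_id: String,
    started_at_ms: Option<i64>,
}

struct ItemCompleted {
    item_id: String,
    completed_at_ms: Option<i64>,
}

/// Lifecycle latency for one item, or `None` when the events do not match
/// or either timestamp is missing.
fn lifecycle_latency_ms(started: &ItemStarted, completed: &ItemCompleted) -> Option<i64> {
    if started.item_id != completed.item_id {
        return None;
    }
    let start = started.started_at_ms?;
    let end = completed.completed_at_ms?;
    Some(end - start)
}
```

This is independent of any per-tool `duration` field, which is the point of placing the timing on the shared envelope.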

## Verification

- `cargo check -p codex-protocol -p codex-core -p
codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec
-p codex-app-server-client`
- `cargo test -p codex-core --test all web_search_item_is_emitted`
- `cargo test -p codex-core --test all
image_generation_call_event_is_emitted`
- `cargo test -p codex-app-server-protocol`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514).
* #18748
* #18747
* #17090
* #17089
* __->__ #20514

rhan-oai · 2026-05-04 22:33:20 +00:00

aee1fe2659

state: pass state db handles through consumers (#20561 )

## Why

SQLite state was still being opened from consumer paths, including lazy
`OnceCell`-backed thread-store call sites. That let one process
construct multiple state DB connections for the same Codex home, which
makes SQLite lock contention and `database is locked` failures much
easier to hit.

State DB lifetime should be chosen by main-like entrypoints and tests,
then passed through explicitly. Consumers should use the supplied
`Option<StateDbHandle>` or `StateDbHandle` and keep their existing
filesystem fallback or error behavior when no handle is available.

The startup path also needs to keep the rollout crate in charge of
SQLite state initialization. Opening `codex_state::StateRuntime`
directly bypasses rollout metadata backfill, so entrypoints should
initialize through `codex_rollout::state_db` and receive a handle only
after required rollout backfills have completed.

## What Changed

- Initialize the state DB in main-like entrypoints for CLI, TUI,
app-server, exec, MCP server, and the thread-manager sample.
- Pass `Option<StateDbHandle>` through `ThreadManager`,
`LocalThreadStore`, app-server processors, TUI app wiring, rollout
listing/recording, personality migration, shell snapshot cleanup,
session-name lookup, and memory/device-key consumers.
- Remove the lazy local state DB wrapper from the thread store so
non-test consumers use only the supplied handle or their existing
fallback path.
- Make `codex_rollout::state_db::init` the local state startup path: it
opens/migrates SQLite, runs rollout metadata backfill when needed, waits
for concurrent backfill workers up to a bounded timeout, verifies
completion, and then returns the initialized handle.
- Keep optional/non-owning SQLite helpers, such as remote TUI local
reads, as open-only paths that do not run startup backfill.
- Switch app-server startup from direct
`codex_state::StateRuntime::init` to the rollout state initializer so
app-server cannot skip rollout backfill.
- Collapse split rollout lookup/list APIs so callers use the normal
methods with an optional state handle instead of `_with_state_db`
variants.
- Restore `getConversationSummary(ThreadId)` to delegate through
`ThreadStore::read_thread` instead of a LocalThreadStore-specific
rollout path special case.
- Keep DB-backed rollout path lookup keyed on the DB row and file
existence, without imposing the filesystem filename convention on
existing DB rows.
- Verify readable DB-backed rollout paths against `session_meta.id`
before returning them, so a stale SQLite row that points at another
thread's JSONL falls back to filesystem search and read-repairs the DB
row.
- Keep `debug prompt-input` filesystem-only so a one-off debug command
does not initialize or backfill SQLite state just to print prompt input.
- Keep goal-session test Codex homes alive only in the goal-specific
helper, rather than leaking tempdirs from the shared session test
helper.
- Update tests and call sites to pass explicit state handles where DB
behavior is expected and explicit `None` where filesystem-only behavior
is intended.
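
The lookup rules in this list (use the optional handle, verify the DB path against `session_meta.id`, fall back to filesystem search, then read-repair) can be sketched with in-memory maps. Everything below is a simplified model with hypothetical types, not the `codex-rollout` implementation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the state DB: thread id -> rollout path.
struct StateDb {
    rollout_paths: HashMap<String, String>,
}

/// Hypothetical stand-in for the filesystem: rollout path -> `session_meta.id`.
struct Rollouts {
    session_ids_by_path: HashMap<String, String>,
}

impl Rollouts {
    /// Filesystem search fallback: scan for a rollout whose id matches.
    fn search(&self, thread_id: &str) -> Option<String> {
        self.session_ids_by_path
            .iter()
            .find(|(_, id)| id.as_str() == thread_id)
            .map(|(path, _)| path.clone())
    }
}

fn find_thread_path(
    db: Option<&mut StateDb>,
    rollouts: &Rollouts,
    thread_id: &str,
) -> Option<String> {
    // No handle supplied: keep the filesystem-only behavior.
    let db = match db {
        Some(db) => db,
        None => return rollouts.search(thread_id),
    };
    // Trust the DB row only if the file it points at belongs to this thread.
    if let Some(path) = db.rollout_paths.get(thread_id) {
        if rollouts.session_ids_by_path.get(path).map(String::as_str) == Some(thread_id) {
            return Some(path.clone());
        }
    }
    // Stale or missing row: fall back to search and read-repair the DB.
    let found = rollouts.search(thread_id)?;
    db.rollout_paths.insert(thread_id.to_string(), found.clone());
    Some(found)
}
```

Passing `None` exercises the filesystem-only path explicitly, which mirrors how the updated tests distinguish DB-backed from filesystem-only behavior.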

## Validation

- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
codex-tui -p codex-exec -p codex-cli --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout state_db_`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout try_init_ -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
codex-rollout --lib -- -D warnings`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store
read_thread_falls_back_when_sqlite_path_points_to_another_thread --
--nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
shell_snapshot`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all personality_migration`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find::find_prefers_sqlite_path_by_id --
--nocapture`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
interrupt_accounts_active_goal_before_pausing`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server get_auth_status -- --test-threads=1`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server --lib`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
-p codex-app-server --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
-p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
codex-exec -p codex-cli`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
codex-app-server`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
codex-rollout`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
- `just argument-comment-lint -p codex-core`
- `just argument-comment-lint -p codex-rollout`

Focused coverage added in `codex-rollout`:

- `recorder::tests::state_db_init_backfills_before_returning` verifies
the rollout metadata row exists before startup init returns.
- `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
verifies startup waits for another worker to finish backfill instead of
disabling the handle for the process.
-
`state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
verifies startup does not hang indefinitely on a stuck backfill lease.
-
`tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
verifies DB-backed lookup accepts valid existing rollout paths even when
the filename does not include the thread UUID.
-
`tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
verifies DB-backed lookup ignores a stale row whose existing path
belongs to another thread and read-repairs the row after filesystem
fallback.

Focused coverage updated in `codex-core`:

- `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
DB-preferred rollout file with matching `session_meta.id`, so it still
verifies that valid SQLite paths win without depending on stale/empty
rollout contents.

`cargo test -p codex-app-server thread_list_respects_search_term_filter
-- --test-threads=1 --nocapture` was attempted locally but timed out
waiting for the app-server test harness `initialize` response before
reaching the changed thread-list code path.

`bazel test //codex-rs/thread-store:thread-store-unit-tests
--test_output=errors` was attempted locally after the thread-store fix,
but this container failed before target analysis while fetching `v8+`
through BuildBuddy/direct GitHub. The equivalent local crate coverage,
including `cargo test -p codex-thread-store`, passes.

A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
also requires system `libcap.pc` for `codex-linux-sandbox`; the
follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
this container.

Ruslan Nigmatullin · 2026-05-04 11:46:03 -07:00

4d201e340e

feat: export and replay effective config locks (#20405 )

## Why

For reproducibility. A hand-written `config.toml` is not enough to
recreate what a Codex session actually ran with because layered config,
CLI overrides, defaults, feature aliases, resolved feature config,
prompt setup, and model-catalog/session values can all affect the final
runtime behavior.

This PR adds an effective config lockfile path: one run can export the
resolved session config, and a later run can replay that lockfile and
fail early if the regenerated effective config drifts.

## What Changed

- Add a dedicated `ConfigLockfileToml` wrapper with top-level lockfile
metadata plus the replayable config:

```toml
version = 1
codex_version = "..."

[config]
# effective ConfigToml fields
```

- Keep lockfile metadata out of regular `ConfigToml`; replay loads
`ConfigLockfileToml` and then uses its nested `config` as the
authoritative config layer.
- Add `debug.config_lockfile.export_dir` to write
`<thread_id>.config.lock.toml` when a root session starts.
- Add `debug.config_lockfile.load_path` to replay a saved lockfile and
validate the regenerated session lockfile against it.
- Add `debug.config_lockfile.allow_codex_version_mismatch` to optionally
tolerate Codex binary version drift while still comparing the rest of
the lockfile.
- Add `debug.config_lockfile.save_fields_resolved_from_model_catalog` so
lock creation can either save model-catalog/session-resolved fields or
intentionally leave those fields dynamic.
- Build lockfiles from the effective config plus resolved runtime values
such as model selection, reasoning settings, prompts, service tier, web
search mode, feature states/config, memories config, skill instructions,
and agent limits.
- Materialize feature aliases and custom feature config into the
lockfile so replay compares canonical resolved behavior instead of
user-authored alias shape.
- Strip profile/debug/file-include/environment-specific inputs from
generated lockfiles so they contain replayable values rather than the
inputs that produced those values.
- Surface JSON-RPC server error code/data in app-server client and TUI
bootstrap errors so config-lock replay failures include the actual TOML
diff.
- Regenerate the config schema for the new debug config keys.

## Review Notes

The main flow is split across these files:

- `config/src/config_toml.rs`: lockfile/debug TOML shapes.
- `core/src/config/mod.rs`: loading `debug.config_lockfile.*`, replaying
a lockfile as a config layer, and preserving the expected lockfile for
validation.
- `core/src/session/config_lock.rs`: exporting the current session
lockfile and materializing resolved session/config values.
- `core/src/config_lock.rs`: lockfile parsing, metadata/version checks,
replay comparison, and diff formatting.

## Usage

Export a lockfile from a normal session:

```sh
codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"'
```

Export a lockfile without saving model-catalog/session-resolved fields:

```sh
codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"' \
-c 'debug.config_lockfile.save_fields_resolved_from_model_catalog=false'
```

Replay a saved lockfile in a later session:

```sh
codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"'
```

If replay resolves to a different effective config, startup fails with a
TOML diff.

To tolerate Codex binary version drift during replay:

```sh
codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"' \
-c 'debug.config_lockfile.allow_codex_version_mismatch=true'
```
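
Conceptually, replay validation compares the saved lockfile with the one regenerated for the current session. The sketch below models that with flat string maps; the `Lockfile` struct, `validate_replay`, and the diff line format are assumptions for illustration and do not reflect the real TOML diff output.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical flattened lockfile: top-level metadata plus config entries.
struct Lockfile {
    version: u32,
    codex_version: String,
    config: BTreeMap<String, String>,
}

/// Compares the saved lockfile with the regenerated one and returns diff
/// lines on drift. The diff format is illustrative only.
fn validate_replay(
    expected: &Lockfile,
    actual: &Lockfile,
    allow_codex_version_mismatch: bool,
) -> Result<(), Vec<String>> {
    let mut diff = Vec::new();
    if expected.version != actual.version {
        diff.push(format!("-version = {}", expected.version));
        diff.push(format!("+version = {}", actual.version));
    }
    if !allow_codex_version_mismatch && expected.codex_version != actual.codex_version {
        diff.push(format!("-codex_version = {}", expected.codex_version));
        diff.push(format!("+codex_version = {}", actual.codex_version));
    }
    // Visit every key present on either side, in sorted order.
    let keys: BTreeSet<&String> = expected.config.keys().chain(actual.config.keys()).collect();
    for key in keys {
        let (old, new) = (expected.config.get(key), actual.config.get(key));
        if old != new {
            if let Some(value) = old {
                diff.push(format!("-{} = {}", key, value));
            }
            if let Some(value) = new {
                diff.push(format!("+{} = {}", key, value));
            }
        }
    }
    if diff.is_empty() { Ok(()) } else { Err(diff) }
}
```

With `allow_codex_version_mismatch` set, only the binary version is exempt; any other difference still fails the comparison.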

## Limitations

This does not support custom rules/network policies.

## Verification

- `cargo test -p codex-core config_lock`
- `cargo test -p codex-config`
- `cargo test -p codex-thread-manager-sample`

jif-oai · 2026-05-01 17:46:02 +02:00

0b04d1b3cc

Move plugin out of core. (#20348 )

xl-openai · 2026-04-30 14:26:14 -07:00

7b3de63041

Add environment provider snapshot (#20058 )

## Summary
- Change `EnvironmentProvider` to return concrete `Environment`
instances instead of `EnvironmentConfigurations`.
- Make `DefaultEnvironmentProvider` provide the provider-visible `local`
environment plus optional `remote` environment from
`CODEX_EXEC_SERVER_URL`.
- Keep `EnvironmentManager` as the concrete cache while exposing its own
explicit local environment for `local_environment()` fallback paths.

## Validation
- `just fmt`
- `git diff --check`

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-28 20:05:18 -07:00

e1ec9e63a0

Allow large remote app-server resume responses (#19920 )

## Why

Remote TUI resume uses the app-server websocket client. That client
inherited tungstenite's default `16 MiB` frame limit, so a large saved
session could make `thread/resume` return a single JSON-RPC response
frame that the client rejected before the TUI could deserialize or
render it.

Fixes #19837

## What Changed

- Configure the remote app-server websocket client with a bounded `128
MiB` max frame/message size.
- Preserve the concrete remote worker exit reason when completing
pending requests after a transport/read failure instead of replacing it
with a generic channel-closed error.
- Add a regression test that sends a single `>16 MiB` JSON-RPC response
frame and verifies the typed request succeeds.

Note: This isn't a perfect fix. It really just moves the limit to a much
larger value. I looked at a bunch of other potential fixes (both
server-side and client-side), and they all involved significant
complexity, had backward-compatibility impact, or impacted performance
of common use cases. This simple fix should address the vast majority of
remote use cases.
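
The effect of raising the limit can be illustrated with a plain length check. This is a standalone sketch, not the websocket library's API: the constants mirror the values quoted above, and `check_frame_len` is a hypothetical helper.

```rust
const MIB: usize = 1024 * 1024;

/// The default frame limit cited in the PR description.
const DEFAULT_MAX_FRAME_BYTES: usize = 16 * MIB;
/// The bounded limit the PR configures for the remote client.
const REMOTE_MAX_FRAME_BYTES: usize = 128 * MIB;

/// Rejects a frame whose payload length exceeds the configured limit.
fn check_frame_len(len: usize, max_frame_bytes: usize) -> Result<(), String> {
    if len > max_frame_bytes {
        Err(format!(
            "frame of {} bytes exceeds limit of {} bytes",
            len, max_frame_bytes
        ))
    } else {
        Ok(())
    }
}
```

A 20 MiB `thread/resume` response fails the first limit and passes the second; a response above 128 MiB would still be rejected, which is the caveat in the note above.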

## Verification

I reproduced the problem locally using a long rollout and verified that
the fix addresses the connection drop.

Eric Traut · 2026-04-27 22:44:10 -07:00

92fb848065

[codex] Move config loading into codex-config (#19487 )

## Why

Config loading had become split across crates: `codex-config` owned the
config types and merge logic, while `codex-core` still owned the loader
that assembled the layer stack. This change consolidates that
responsibility in `codex-config`, so the crate that defines config
behavior also owns how configs are discovered and loaded.

To make that move possible without reintroducing the old dependency
cycle, the shell-environment policy types and helpers that
`codex-exec-server` needs now live in `codex-protocol` instead of
flowing through `codex-config`.

This also makes the migrated loader tests more deterministic on machines
that already have managed or system Codex config installed by letting
tests override the system config and requirements paths instead of
reading the host's `/etc/codex`.

## What Changed

- moved the config loader implementation from `codex-core` into
`codex-config::loader` and deleted the old `core::config_loader` module
instead of leaving a compatibility shim
- moved shell-environment policy types and helpers into
`codex-protocol`, then updated `codex-exec-server` and other downstream
crates to import them from their new home
- updated downstream callers to use loader/config APIs from
`codex-config`
- added test-only loader overrides for system config and requirements
paths so loader-focused tests do not depend on host-managed config state
- cleaned up now-unused dependency entries and platform-specific cfgs
that were surfaced by post-push CI

## Testing

- `cargo test -p codex-config`
- `cargo test -p codex-core config_loader_tests::`
- `cargo test -p codex-protocol -p codex-exec-server -p
codex-cloud-requirements -p codex-rmcp-client --lib`
- `cargo test --lib -p codex-app-server-client -p codex-exec`
- `cargo test --no-run --lib -p codex-app-server`
- `cargo test -p codex-linux-sandbox --lib`
- `cargo shear`
- `just bazel-lock-check`

## Notes

- I did not chase unrelated full-suite failures outside the migrated
loader surface.
- `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
failures on this machine, and Windows CI still shows unrelated
long-running or timing-out tests outside the loader migration itself.

pakrym-oai · 2026-04-26 15:10:53 -07:00

9c3abcd46c

Add remote thread config endpoint (#18908 )

## Why

App-server needs a way to fetch thread-scoped config from the remote
thread config service when the user config opts into that behavior. This
mirrors the existing experimental remote thread store endpoint while
keeping local/noop behavior as the default.

Startup paths also need to avoid silently dropping the remote config
endpoint after the first config load. The stdio app-server path
discovers the endpoint from the initial config and installs the real
thread config loader for later config builds, while in-process clients
used by TUI/exec now select the same remote loader directly from their
provided config.

## What changed

- Added `experimental_thread_config_endpoint` to `ConfigToml`, `Config`,
and `core/config.schema.json`.
- Added config parsing coverage for the new setting.
- Updated app-server startup to select `RemoteThreadConfigLoader` from
the initially loaded config, falling back to `NoopThreadConfigLoader`
when unset.
- Let `ConfigManager` replace its thread config loader after startup
discovery so later config loads use the selected loader.
- Updated in-process app-server client startup to pass
`RemoteThreadConfigLoader` when its config has
`experimental_thread_config_endpoint` set.
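
The selection rule can be sketched as a single match on the config value. The enum and the `ConfigManager` shape below are hypothetical simplifications; the loader names and the setting name come from this PR.

```rust
/// Hypothetical model of the two loaders named in the PR.
#[derive(Debug, PartialEq)]
enum ThreadConfigLoaderKind {
    Noop,
    Remote { endpoint: String },
}

/// Hypothetical, trimmed-down config.
struct Config {
    experimental_thread_config_endpoint: Option<String>,
}

/// Remote loader when the endpoint is set, noop loader otherwise.
fn select_thread_config_loader(config: &Config) -> ThreadConfigLoaderKind {
    match &config.experimental_thread_config_endpoint {
        Some(endpoint) => ThreadConfigLoaderKind::Remote { endpoint: endpoint.clone() },
        None => ThreadConfigLoaderKind::Noop,
    }
}

/// Hypothetical manager whose loader can be replaced after startup discovery.
struct ConfigManager {
    loader: ThreadConfigLoaderKind,
}

impl ConfigManager {
    fn new() -> Self {
        Self { loader: ThreadConfigLoaderKind::Noop }
    }

    fn replace_thread_config_loader(&mut self, loader: ThreadConfigLoaderKind) {
        self.loader = loader;
    }
}
```

Replacing the loader after the first config load is what keeps later config builds from silently dropping the remote endpoint.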

## Verification

- Added `experimental_thread_config_endpoint_loads_from_config_toml`.
- Added
`runtime_start_args_use_remote_thread_config_loader_when_configured`.
- Ran `cargo check -p codex-app-server --lib`.
- Ran `cargo test -p codex-app-server-client`.

Rasmus Rygaard · 2026-04-23 11:46:06 -07:00

f11583b8f6

Move marketplace add/remove and startup sync out of core. (#19099 )

Move more things to core-plugins.

---------

Co-authored-by: Codex <noreply@openai.com>

xl-openai · 2026-04-23 11:27:17 -07:00

198eddd25d

TUI: Keep remote app-server events draining (#18932 )

Addresses #18860

Problem: Remote app-server clients could stop draining websocket events
when their bounded local event channel filled, leaving clients stuck on
stale in-progress turns after a disconnect.

Solution: Use an unbounded local event channel for the remote client so
the websocket reader can keep forwarding disconnect and progress events
instead of blocking or dropping them.

Why this is reasonable: This does not make the remote websocket itself
unbounded. The changed queue lives inside the remote client, between the
task that reads the remote websocket and the API consumer in the same
client process. Once an event has been received from the remote server,
preserving it is preferable to blocking websocket reads or dropping
disconnect/lifecycle events; network-level backpressure still happens at
the websocket boundary if the remote side outpaces the client.
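
The difference between the two queue types can be shown with the standard library's channels. This is an analogy only: the client's actual channel types are not stated here, and both helper functions are hypothetical.

```rust
use std::sync::mpsc::{channel, sync_channel, TrySendError};

/// Forwards events into a bounded queue that nobody drains and returns how
/// many were accepted before the queue filled.
fn forward_bounded(events: &[&str], capacity: usize) -> usize {
    let (tx, _rx) = sync_channel::<String>(capacity);
    let mut accepted = 0;
    for event in events {
        match tx.try_send(event.to_string()) {
            Ok(()) => accepted += 1,
            // A full queue forces the reader to block or drop the event.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

/// Forwards events into an unbounded queue; every event is preserved.
fn forward_unbounded(events: &[&str]) -> Vec<String> {
    let (tx, rx) = channel::<String>();
    for event in events {
        tx.send(event.to_string()).expect("receiver is alive");
    }
    drop(tx);
    rx.iter().collect()
}
```

With a capacity of two and a slow consumer, a trailing disconnect event never reaches the bounded queue, while the unbounded queue delivers it.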

Eric Traut · 2026-04-22 09:29:34 -07:00

79ea577156

Fix remote app-server shutdown race (#18936 )

## Why

A Mac Bazel CI run saw `remote_notifications_arrive_over_websocket` fail
during shutdown with `remote app-server shutdown channel is closed`
(https://app.buildbuddy.io/invocation/9dac05d6-ae20-40f9-b627-fca6e91cf127).
The remote websocket worker can legitimately finish while `shutdown()`
is waiting for the shutdown acknowledgement: after the test server sends
a notification and exits, the worker may deliver the required disconnect
event, observe that the caller has dropped the event receiver, and exit
before it sends the shutdown one-shot.

That state is already terminal cleanup, not a failed shutdown, so
callers should not see a `BrokenPipe` from the acknowledgement channel.

## What Changed

- Treat a closed remote shutdown acknowledgement as an already-exited
worker while still propagating websocket close errors when the worker
returns them.
- Added a deterministic regression test for the interleaving where the
shutdown command is received and the worker exits before replying.
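
A minimal sketch of the first change, using a standard-library channel in place of the client's one-shot acknowledgement. The `CloseResult` alias and function name are hypothetical.

```rust
use std::sync::mpsc::Receiver;

/// Result the worker reports when it does reply to a shutdown command.
type CloseResult = Result<(), String>;

/// Waits for the shutdown acknowledgement. A closed channel means the worker
/// already exited, which is treated as a completed shutdown.
fn wait_for_shutdown_ack(ack: Receiver<CloseResult>) -> CloseResult {
    match ack.recv() {
        // The worker replied: propagate its websocket close result.
        Ok(close_result) => close_result,
        // The worker dropped its sender without replying: already exited.
        Err(_) => Ok(()),
    }
}
```

Close errors the worker actually returns are still surfaced; only the missing reply is reinterpreted.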

## Verification

- `cargo test -p codex-app-server-client`
- New test:
`remote::tests::shutdown_tolerates_worker_exit_after_command_is_queued`

Michael Bolin · 2026-04-22 02:41:19 +00:00

8fea372c77

Support multiple managed environments (#18401 )

## Summary
- refactor EnvironmentManager to own keyed environments with
default/local lookup helpers
- keep remote exec-server client creation lazy until exec/fs use
- preserve disabled agent environment access separately from internal
local environment access
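
A rough sketch of the keyed-environment shape with lazy client creation, under stated assumptions: only the `EnvironmentManager` name comes from this PR. `Environment`, `ExecClient`, `LOCAL_KEY`, and every method below are hypothetical, and the disabled-agent-access distinction is not modeled.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Stand-in for a remote exec-server client (hypothetical type).
#[derive(Debug)]
struct ExecClient {
    url: String,
}

/// One managed environment; its remote client is built on first use.
struct Environment {
    remote_url: Option<String>,
    client: OnceLock<ExecClient>,
}

impl Environment {
    fn new(remote_url: Option<&str>) -> Self {
        Self {
            remote_url: remote_url.map(str::to_owned),
            client: OnceLock::new(),
        }
    }

    /// Returns the remote client, creating it lazily. Local environments
    /// have no remote URL and therefore no client.
    fn client(&self) -> Option<&ExecClient> {
        let url = self.remote_url.as_ref()?;
        Some(self.client.get_or_init(|| ExecClient { url: url.clone() }))
    }

    /// Reports whether the lazy client has been created yet.
    fn client_created(&self) -> bool {
        self.client.get().is_some()
    }
}

/// Assumed key for the always-present local environment.
const LOCAL_KEY: &str = "local";

/// Owns environments by key and remembers which key is the default.
struct EnvironmentManager {
    environments: HashMap<String, Environment>,
    default_key: String,
}

impl EnvironmentManager {
    fn new(default_key: &str) -> Self {
        let mut environments = HashMap::new();
        environments.insert(LOCAL_KEY.to_owned(), Environment::new(None));
        Self {
            environments,
            default_key: default_key.to_owned(),
        }
    }

    fn insert(&mut self, key: &str, environment: Environment) {
        self.environments.insert(key.to_owned(), environment);
    }

    fn get(&self, key: &str) -> Option<&Environment> {
        self.environments.get(key)
    }

    /// Lookup helper for the default environment.
    fn default_environment(&self) -> Option<&Environment> {
        self.get(&self.default_key)
    }

    /// Lookup helper for the local environment.
    fn local_environment(&self) -> Option<&Environment> {
        self.get(LOCAL_KEY)
    }
}
```

Registering a remote environment does not construct a client; the first call to `client()` does, which mirrors the intent of keeping client creation lazy until exec/fs use.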

## Validation
- not run (per Codex worktree instruction to avoid tests/builds unless
requested)

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-21 15:29:35 -07:00

ddbe2536be

Add session config loader interface (#18208 )

## Why

Cloud-hosted sessions need a way for the service that starts or manages
a thread to provide session-owned config without treating all config as
if it came from the same user/project/workspace TOML stack.

The important boundary is ownership: some values should be controlled by
the session/orchestrator, some by the authenticated user, and later some
may come from the executor. The earlier broad config-store shape made
that boundary too fuzzy and overlapped heavily with the existing
filesystem-backed config loader. This PR starts with the smaller piece
we need now: a typed session config loader that can feed the existing
config layer stack while preserving the normal precedence and merge
behavior.

## What Changed

- Added `ThreadConfigLoader` and related typed payloads in
`codex-config`.
  - `SessionThreadConfig` currently supports `model_provider`,
`model_providers`, and feature flags.
  - `UserThreadConfig` is present as an ownership boundary, but does not
yet add TOML-backed fields.
  - `NoopThreadConfigLoader` preserves existing behavior when no external
loader is configured.
  - `StaticThreadConfigLoader` supports tests and simple callers.

- Taught thread config sources to produce ordinary `ConfigLayerEntry`
values so the existing `ConfigLayerStack` remains the place where
precedence and merging happen.

- Wired the loader through `ConfigBuilder`, the config loader, and
app-server startup paths so app-server can provide session-owned config
before deriving a thread config.

- Added coverage for:
  - translating typed thread config into config layers,
  - inserting thread config layers into the stack at the right precedence,
  - applying session-provided model provider and feature settings when
app-server derives config from thread params.
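
The loader-to-layer flow can be sketched roughly as follows. The type names `ThreadConfigLoader`, `NoopThreadConfigLoader`, and `StaticThreadConfigLoader` come from this PR, but every signature and field here is an assumption: the real loader may be async, `Layer` is a flat stand-in for `ConfigLayerEntry`, and placing the session layer directly above request-provided config is inferred only from the coverage described in this PR.

```rust
use std::collections::BTreeMap;

/// Who owns a piece of thread config (simplified, assumed shape).
#[derive(Debug, Clone, PartialEq)]
enum ThreadConfigSource {
    Session {
        model_provider: Option<String>,
        features: BTreeMap<String, bool>,
    },
    User,
}

/// A flat key/value layer standing in for `ConfigLayerEntry`.
#[derive(Debug, Clone, PartialEq)]
struct Layer {
    name: &'static str,
    values: BTreeMap<String, String>,
}

/// Hypothetical synchronous form of the loader interface.
trait ThreadConfigLoader {
    fn load(&self) -> Vec<ThreadConfigSource>;
}

/// Supplies nothing, so existing behavior is unchanged.
struct NoopThreadConfigLoader;

impl ThreadConfigLoader for NoopThreadConfigLoader {
    fn load(&self) -> Vec<ThreadConfigSource> {
        Vec::new()
    }
}

/// Returns a fixed set of sources, for tests and simple callers.
struct StaticThreadConfigLoader(Vec<ThreadConfigSource>);

impl ThreadConfigLoader for StaticThreadConfigLoader {
    fn load(&self) -> Vec<ThreadConfigSource> {
        self.0.clone()
    }
}

/// Translates a typed source into an ordinary layer.
fn to_layer(source: &ThreadConfigSource) -> Layer {
    match source {
        ThreadConfigSource::Session { model_provider, features } => {
            let mut values = BTreeMap::new();
            if let Some(provider) = model_provider {
                values.insert("model_provider".to_owned(), provider.clone());
            }
            for (name, enabled) in features {
                values.insert(format!("features.{}", name), enabled.to_string());
            }
            Layer { name: "session", values }
        }
        // No TOML-backed user fields yet, so the layer is empty.
        ThreadConfigSource::User => Layer { name: "user", values: BTreeMap::new() },
    }
}

/// Merges layers from lowest to highest precedence; later layers win.
fn merge(layers: &[Layer]) -> BTreeMap<String, String> {
    let mut merged = BTreeMap::new();
    for layer in layers {
        for (key, value) in &layer.values {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

/// Stacks loader-provided layers above the request-provided layer.
fn effective_config(request: Layer, loader: &dyn ThreadConfigLoader) -> BTreeMap<String, String> {
    let mut layers = vec![request];
    layers.extend(loader.load().iter().map(to_layer));
    merge(&layers)
}
```

Keeping the merge in one place is the point of the design: typed sources only produce layers, and precedence stays with the stack.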

## Follow-Ups

This intentionally stops short of adding the remote/service transport.
The next pieces are expected to be:

1. Define the proto/API shape for this interface.
2. Add a client implementation that can source session config from the
service side.

## Verification

- Added unit coverage in `codex-config` for the loader and layer
conversion.
- Added `codex-core` config loader coverage for thread config layer
precedence.
- Added app-server coverage that verifies session thread config wins
over request-provided config for model provider and feature settings.

Rasmus Rygaard · 2026-04-20 23:05:49 +00:00

7b994100b3

Remove simple TUI legacy_core reexports (#18631 )

## Problem
The TUI still imported path utilities and config-loader symbols through
app-server-client's legacy_core facade even though those APIs already
exist in utility/config crates. This is part of our ongoing effort to
whittle away at these old dependencies.

## Solution
Rewire imports to avoid the TUI directly importing from the core crate
and instead import from common lower-level crates. This PR doesn't
include any functional changes; it's just a simple rewiring.

Eric Traut · 2026-04-20 10:48:27 -07:00

164b6a0c78

TUI: remove simple legacy_core re-exports (#18605 )

## Summary

The TUI still imported several symbols through the transitional
app-server-client `legacy_core` facade even though those symbols are
already owned by smaller crates. This PR narrows that facade by rewiring
those imports directly to their owner crates.

## Changes

No functional changes, just import rewiring. This is part of our ongoing
effort to whittle away at the `legacy_core` namespace, which represents
all of the remaining symbols that the TUI imports from the core.

Eric Traut · 2026-04-19 22:39:53 -07:00

87fc21ff60

Refactor AGENTS.md discovery into AgentsMdManager (#18035 )

Encapsulate AGENTS.md processing a bit and drop `user_instructions_path`
from config.

pakrym-oai · 2026-04-16 10:51:33 -07:00

ab97c9aaad

Async config loading (#18022 )

Parts of the config will come from the executor. Prepare for that by
making config loading methods async.

pakrym-oai · 2026-04-15 19:18:38 -07:00

bd61737e8a

fix: propagate log db (#17953 )

Restores the TRACE logs in the DB and `/feedback`.
Fixes https://github.com/openai/codex/pull/16184

Result:
https://openai.sentry.io/issues/6972946529/?project=4510195390611458&query=019d91e9-f931-7451-8852-c5240514a419&referrer=issue-stream

jif-oai · 2026-04-15 20:25:53 +01:00

7e7b35b4d2

Run exec-server fs operations through sandbox helper (#17294 )

## Summary
- run exec-server filesystem RPCs requiring sandboxing through a
`codex-fs` arg0 helper over stdin/stdout
- keep direct local filesystem execution for `DangerFullAccess` and
external sandbox policies
- remove the standalone exec-server binary path in favor of top-level
arg0 dispatch/runtime paths
- add sandbox escape regression coverage for local and remote filesystem
paths
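
A minimal sketch of the two decisions described above: arg0 dispatch and the choice between direct and helper-mediated filesystem execution. The `codex-fs` helper name and `DangerFullAccess` come from this PR; the enum shapes, the `Restricted` variant, and both function names are hypothetical.

```rust
use std::path::Path;

/// Simplified policy set; `Restricted` stands in for any policy that
/// requires sandboxing (assumed variant).
#[derive(Debug, Clone, Copy, PartialEq)]
enum SandboxPolicy {
    DangerFullAccess,
    ExternalSandbox,
    Restricted,
}

/// What the top-level binary should run as, based on argv[0].
#[derive(Debug, PartialEq)]
enum Arg0Mode {
    FsHelper,
    Main,
}

/// Dispatches on the executable name the process was invoked under.
fn mode_for_arg0(arg0: &str) -> Arg0Mode {
    let name = Path::new(arg0)
        .file_stem()
        .and_then(|stem| stem.to_str())
        .unwrap_or("");
    if name == "codex-fs" {
        Arg0Mode::FsHelper
    } else {
        Arg0Mode::Main
    }
}

/// How a filesystem RPC is executed.
#[derive(Debug, PartialEq)]
enum FsExecution {
    Direct,
    ViaHelper,
}

/// Full-access and externally sandboxed policies run directly; anything
/// that needs sandboxing goes through the helper over stdin/stdout.
fn fs_execution_for(policy: SandboxPolicy) -> FsExecution {
    match policy {
        SandboxPolicy::DangerFullAccess | SandboxPolicy::ExternalSandbox => FsExecution::Direct,
        SandboxPolicy::Restricted => FsExecution::ViaHelper,
    }
}
```

In a real binary, `mode_for_arg0` would be fed the first element of `std::env::args()`.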

## Validation
- `just fmt`
- `git diff --check`
- remote devbox: `cd codex-rs && bazel test --bes_backend=
--bes_results_url= //codex-rs/exec-server:all` (6/6 passed)

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-12 18:36:03 -07:00

d626dc3895

TUI: enforce core boundary (#17399 )

Problem: The TUI still depended on `codex-core` directly in a number of
places, and we had no enforcement to keep this problem from getting
worse.

Solution: Route TUI core access through
`codex-app-server-client::legacy_core`, add CI enforcement for that
boundary, and re-export this legacy bridge inside the TUI as
`crate::legacy_core` so the remaining call sites stay readable. There is
no functional change in this PR — just changes to import targets.

Over time, we can whittle away at the remaining symbols in this legacy
namespace with the eventual goal of removing them all. In the meantime,
this linter rule will prevent us from inadvertently importing new
symbols from core.

Eric Traut · 2026-04-10 20:25:31 -07:00

66e13efd9c

Revert "Option to Notify Workspace Owner When Usage Limit is Reached" (#17391 )

Reverts openai/codex#16969

#sev3-2026-04-10-accountscheckversion-500s-for-openai-workspace-7300

Shijie Rao · 2026-04-10 23:33:13 +00:00

930e5adb7e

Option to Notify Workspace Owner When Usage Limit is Reached (#16969 )

## Summary
- Replace the manual `/notify-owner` flow with an inline confirmation
prompt when a usage-based workspace member hits a credits-depleted
limit.
- Fetch the current workspace role from the live ChatGPT
`accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches
the desktop and web clients.
- Keep owner, member, and spend-cap messaging distinct so we only offer
the owner nudge when the workspace is actually out of credits.

## What Changed
- `backend-client`
  - Added a typed fetch for the current account role from
`accounts/check`.
  - Mapped backend role values into a Rust workspace-role enum.
- `app-server` and protocol
  - Added `workspaceRole` to `account/read` and `account/updated`.
  - Derived `isWorkspaceOwner` from the live role, with a fallback to the
cached token claim when the role fetch is unavailable.
- `tui`
  - Removed the explicit `/notify-owner` slash command.
  - When a member is blocked because the workspace is out of credits, the
error now prompts:
    - `Your workspace is out of credits. Request more from your workspace
owner? [y/N]`
  - Choosing `y` sends the existing owner-notification request.
  - Choosing `n`, pressing `Esc`, or accepting the default selection
dismisses the prompt without sending anything.
  - Selection popups now honor explicit item shortcuts, which is how the
`y` / `n` interaction is wired.

## Reviewer Notes
- The main behavior change is scoped to usage-based workspace members
whose workspace credits are depleted.
- Spend-cap reached should not show the owner-notification prompt.
- Owners and admins should continue to see `/usage` guidance instead of
the member prompt.
- The live role fetch is best-effort; if it fails, we fall back to the
existing token-derived ownership signal.
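
The role fallback and prompt gating can be sketched as below. This is an illustration only: the enum and function names are hypothetical, mapping only `Owner` to `isWorkspaceOwner` is an assumption, and the usage-based membership check is not modeled.

```rust
/// Workspace role as mapped from the backend (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq)]
enum WorkspaceRole {
    Owner,
    Admin,
    Member,
}

/// Which limit blocked the request.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LimitKind {
    CreditsDepleted,
    SpendCapReached,
}

/// Prefers the live role; falls back to the cached token claim when the
/// best-effort role fetch returned nothing.
fn is_workspace_owner(live_role: Option<WorkspaceRole>, cached_owner_claim: bool) -> bool {
    match live_role {
        Some(role) => role == WorkspaceRole::Owner,
        None => cached_owner_claim,
    }
}

/// Offers the owner nudge only to members whose workspace is out of
/// credits; owners, admins, and spend-cap cases never see it.
fn should_offer_owner_nudge(
    live_role: Option<WorkspaceRole>,
    cached_owner_claim: bool,
    limit: LimitKind,
) -> bool {
    let is_member = match live_role {
        Some(role) => role == WorkspaceRole::Member,
        None => !cached_owner_claim,
    };
    is_member && limit == LimitKind::CreditsDepleted
}

/// Keys the `[y/N]` prompt reacts to (simplified).
enum PromptKey {
    Char(char),
    Esc,
    Enter,
}

/// Only an explicit `y` sends the request; `n`, `Esc`, and the default
/// selection all dismiss the prompt.
fn sends_nudge(key: PromptKey) -> bool {
    matches!(key, PromptKey::Char('y'))
}
```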

## Testing
- Manual verification
  - Workspace owner does not see the member prompt.
  - Workspace member with depleted credits sees the confirmation prompt
and can send the nudge with `y`.
  - Workspace member with spend cap reached does not see the
owner-notification prompt.

### Workspace member out of usage

https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1

### Workspace owner
<img width="1728" height="1086" alt="Screenshot 2026-04-09 at 11 48
22 AM"
src="https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6"
/>

richardopenai · 2026-04-09 21:15:17 -07:00

9f2a585153

Install rustls provider for remote websocket client (#17288 )

Addresses #17283

Problem: `codex --remote wss://...` could panic because
app-server-client did not install rustls' process-level crypto provider
before opening TLS websocket connections.

Solution: Add the existing rustls provider utility dependency and
install it before the remote websocket connect.

Eric Traut · 2026-04-09 20:29:12 -07:00

36712d8546

[codex] Support remote exec cwd in TUI startup (#17142 )

When running with a remote executor, the cwd is a remote path. Today we
check for the existence of a local directory on startup and attempt to
load config from it.

For remote executors, skip that check.

pakrym-oai · 2026-04-08 13:09:28 -07:00

e4d6702b87

[codex-analytics] add protocol-native turn timestamps (#16638 )

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
* #16870
* #16706
* #16659
* #16641
* #16640
* __->__ #16638

rhan-oai · 2026-04-06 16:22:59 -07:00

756c45ec61

chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054 )

## Why

`argument-comment-lint` was green in CI even though the repo still had
many uncommented literal arguments. The main gap was target coverage:
the repo wrapper did not force Cargo to inspect test-only call sites, so
examples like the `latest_session_lookup_params(true, ...)` tests in
`codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.

This change cleans up the existing backlog, makes the default repo lint
path cover all Cargo targets, and starts rolling that stricter CI
enforcement out on the platform where it is currently validated.

## What changed

- mechanically fixed existing `argument-comment-lint` violations across
the `codex-rs` workspace, including tests, examples, and benches
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
`tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
`--all-targets` unless the caller explicitly narrows the target set
- fixed both wrappers so forwarded cargo arguments after `--` are
preserved with a single separator
- documented the new default behavior in
`tools/argument-comment-lint/README.md`
- updated `rust-ci` so the macOS lint lane keeps the plain wrapper
invocation and therefore enforces `--all-targets`, while Linux and
Windows temporarily pass `-- --lib --bins`

That temporary CI split keeps the stricter all-targets check where it is
already cleaned up, while leaving room to finish the remaining Linux-
and Windows-specific target-gated cleanup before enabling
`--all-targets` on those runners. The Linux and Windows failures on the
intermediate revision were caused by the wrapper forwarding bug, not by
additional lint findings in those lanes.
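
The wrappers are shell scripts, but their argument handling can be modeled in a few lines. The sketch below is a Rust approximation under stated assumptions: `build_linter_args` is a hypothetical name, and the list of flags that count as narrowing the target set is a guess, not taken from the scripts.

```rust
/// Cargo target-selection flags treated as "narrowing" (assumed list).
const TARGET_FLAGS: &[&str] = &[
    "--all-targets",
    "--lib",
    "--bins",
    "--tests",
    "--examples",
    "--benches",
];

/// Builds the final argument list from the wrapper's own arguments.
///
/// Everything before the first `--` belongs to the wrapper; everything
/// after it is forwarded to cargo behind exactly one `--` separator.
fn build_linter_args(wrapper_args: &[&str]) -> Vec<String> {
    let split = wrapper_args.iter().position(|arg| *arg == "--");
    let (own, forwarded) = match split {
        Some(index) => (&wrapper_args[..index], &wrapper_args[index + 1..]),
        None => (wrapper_args, &wrapper_args[wrapper_args.len()..]),
    };

    let is_fix = own.iter().any(|arg| *arg == "--fix");
    let narrowed = forwarded
        .iter()
        .any(|arg| TARGET_FLAGS.iter().any(|flag| *flag == *arg));

    // Non-`--fix` runs default to all targets unless the caller narrowed.
    let mut cargo_args: Vec<String> = Vec::new();
    if !is_fix && !narrowed {
        cargo_args.push("--all-targets".to_owned());
    }
    cargo_args.extend(forwarded.iter().map(|arg| (*arg).to_owned()));

    let mut out: Vec<String> = own.iter().map(|arg| (*arg).to_owned()).collect();
    if !cargo_args.is_empty() {
        out.push("--".to_owned());
        out.extend(cargo_args);
    }
    out
}
```

Under this model, the plain invocation used by the macOS lane yields `-- --all-targets`, while the Linux and Windows lanes' `-- --lib --bins` passes through unchanged with a single separator.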

## Validation

- `bash -n tools/argument-comment-lint/run.sh`
- `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
- shell-level wrapper forwarding check for `-- --lib --bins`
- shell-level wrapper forwarding check for `-- --tests`
- `just argument-comment-lint`
- `cargo test` in `tools/argument-comment-lint`
- `cargo test -p codex-terminal-detection`

## Follow-up

- Clean up remaining Linux-only target-gated callsites, then switch the
Linux lint lane back to the plain wrapper invocation.
- Clean up remaining Windows-only target-gated callsites, then switch
the Windows lint lane back to the plain wrapper invocation.

Michael Bolin · 2026-03-27 19:00:44 -07:00

61dfe0b86c

Remove the legacy TUI split (#15922 )

This is part 1 of 2 PRs that will delete the `tui` /
`tui_app_server` split. This part simply deletes the existing `tui`
directory and marks the `tui_app_server` feature flag as removed. I left
the `tui_app_server` feature flag in place for now so its presence
doesn't result in an error. It is simply ignored.

Part 2 will rename the `tui_app_server` directory to `tui`. I did this as
two parts to reduce visible code churn.

Eric Traut · 2026-03-27 22:56:44 +00:00

d65deec617

60 Commits