codex

code-mode: move cell state into library actor (#28599 )

A code-mode cell is a single JavaScript execution that can produce
output, call tools, wait for asynchronous work, resume, or be
terminated. This PR extracts the existing per-cell run loop into a
dedicated actor that owns the cell’s lifecycle state. It is primarily an
ownership change rather than a new lifecycle contract: existing behavior
now has one clear implementation boundary.

### Architecture
The session service remains responsible for session-wide concerns:
allocating cell IDs, storing shared values, creating cells, and routing
requests to them.

Once a cell is created, its execution state belongs to its actor.
Callers interact with the actor through a handle. The actor receives two
kinds of input: runtime events and control requests.

A single event loop serializes these inputs and applies the lifecycle
rules. It tracks the current observer—the caller waiting for an
update—along with accumulated output, outstanding callbacks, runtime
state, yield deadlines, and termination progress. Observation,
termination, completion, and cleanup therefore have one consistent
owner.

When the runtime has no immediately runnable work and is waiting only on
timers or tool results, the actor can return accumulated output and
information about outstanding tool calls while keeping the cell
available to resume. On completion or termination, it performs the
appropriate callback cleanup before publishing the final result and
removing the cell from the session.

A small host interface connects the actor to session-owned facilities
such as tool dispatch, notifications, stored values, and final cell
removal, keeping those responsibilities outside the actor itself.
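
The single-writer loop described above can be sketched as follows. This is a minimal illustration using only `std` channels and a thread, not the PR's implementation: `Input`, `Update`, `CellHandle`, and `spawn_cell` are hypothetical names, and the real actor also tracks callbacks, yield deadlines, and termination progress.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Everything the hypothetical actor can receive, delivered through one queue.
enum Input {
    // Runtime events.
    Output(String),
    Completed,
    // Control requests, each carrying its own reply channel.
    Observe(Sender<Update>),
    Terminate(Sender<Update>),
}

#[derive(Debug, PartialEq)]
enum Update {
    Yielded(String),
    Finished(String),
    Terminated(String),
}

/// Callers only hold a handle; the actor thread is the sole writer of cell state.
struct CellHandle {
    tx: Sender<Input>,
}

impl CellHandle {
    fn send(&self, input: Input) {
        // A send error only means the actor has already exited.
        let _ = self.tx.send(input);
    }

    fn observe(&self) -> Option<Update> {
        let (reply_tx, reply_rx) = channel();
        self.send(Input::Observe(reply_tx));
        reply_rx.recv().ok()
    }

    fn terminate(&self) -> Option<Update> {
        let (reply_tx, reply_rx) = channel();
        self.send(Input::Terminate(reply_tx));
        reply_rx.recv().ok()
    }
}

fn spawn_cell() -> CellHandle {
    let (tx, rx) = channel::<Input>();
    thread::spawn(move || {
        let mut output = String::new();
        let mut completed = false;
        // One loop applies every input in arrival order.
        for input in rx {
            match input {
                Input::Output(chunk) => output.push_str(&chunk),
                Input::Completed => completed = true,
                Input::Observe(reply) => {
                    let text = std::mem::take(&mut output);
                    if completed {
                        let _ = reply.send(Update::Finished(text));
                        break;
                    }
                    let _ = reply.send(Update::Yielded(text));
                }
                Input::Terminate(reply) => {
                    let _ = reply.send(Update::Terminated(std::mem::take(&mut output)));
                    break;
                }
            }
        }
    });
    CellHandle { tx }
}
```

Because both kinds of input share one queue, observation and termination can never race on the accumulated output; whichever message arrives first is applied first.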

### Why
Previously, cell lifecycle state and coordination lived alongside
session management. The actor boundary makes each cell a self-contained
state machine with a single writer, while the service becomes a registry
and adapter around it.

This makes lifecycle behavior easier to reason about and test in
isolation. It also establishes a clean boundary for later changing where
cells run or how they communicate without recreating their lifecycle
rules.

Channing Conger · 2026-06-16 19:28:55 -07:00

e2f074e16c

[codex] Support object-valued plugin MCP manifests (#28580 )

## Summary
This fixes plugin manifest parsing for MCP servers declared as an object
directly in `plugin.json`.

Before this change, Codex modeled `mcpServers` as only a string path,
for example:

```json
{
  "name": "counter-sample",
  "version": "1.1.1",
  "mcpServers": "./.mcp.json"
}
```

Some migrated plugins instead provide the server map directly in the
manifest:

```json
{
  "name": "counter-sample",
  "version": "1.1.1",
  "description": "Plugin that declares MCP servers in the manifest",
  "mcpServers": {
    "counter": {
      "type": "http",
      "url": "https://sample.example/counter/mcp"
    }
  }
}
```

That object form previously failed during install/load with an error
like:

```text
failed to parse plugin manifest: invalid type: map, expected a string
```

## What changed
- Add a manifest representation for `mcpServers` as either
`Path(Resource)` or `Object(map)`.
- Parse `plugin.json` `mcpServers` as either a string path or an object.
- Route object-valued MCP server maps through the existing plugin MCP
config parser instead of adding a second parser.
- Apply existing per-plugin MCP server policy to object-valued MCP
servers the same way as file-backed MCP servers.
- Include object-valued MCP server names in plugin telemetry/capability
metadata.
- Support object-valued MCP config for executor plugins without
requiring a `.mcp.json` filesystem read.
- Update the bundled plugin-creator validator and `plugin-json-spec.md`
so generated-plugin validation accepts the same object-valued shape.
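
As a rough sketch of the two-shape representation, the enum below mirrors the `Path`/`Object` split named above. The `ServerConfig` fields, `resolve_servers`, and the `read_file` callback are hypothetical stand-ins; the actual types and the JSON parsing in Codex are not shown in this description.

```rust
use std::collections::BTreeMap;

/// Hypothetical server entry; real entries likely carry more fields.
#[derive(Debug, Clone, PartialEq)]
struct ServerConfig {
    kind: String,
    url: String,
}

/// `mcpServers` as either a path to a companion file or an inline map.
#[derive(Debug, Clone, PartialEq)]
enum McpServers {
    Path(String),
    Object(BTreeMap<String, ServerConfig>),
}

/// Resolve either shape to a server map. `read_file` stands in for whatever
/// loads and parses the companion `.mcp.json`; it is only called for `Path`.
fn resolve_servers<F>(
    decl: &McpServers,
    read_file: F,
) -> Result<BTreeMap<String, ServerConfig>, String>
where
    F: FnOnce(&str) -> Result<BTreeMap<String, ServerConfig>, String>,
{
    match decl {
        McpServers::Path(path) => read_file(path),
        McpServers::Object(map) => Ok(map.clone()),
    }
}
```

The point of the shape is that the object form never touches the filesystem, which is what lets executor plugins skip the `.mcp.json` read.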

## Compatibility
Existing plugin manifests that use `"mcpServers": "./.mcp.json"`
continue to work. Plugins can now also use the object shape shown above.

## Tests
Added coverage for the new manifest attribute shape at the install,
normal load, telemetry, and executor-provider layers:

- `install_accepts_manifest_mcp_server_objects`
- `load_plugins_loads_manifest_mcp_server_objects`
- `plugin_telemetry_metadata_uses_manifest_mcp_server_objects`
- `reads_manifest_object_config_without_executor_file_system_access`

Also smoke-tested the plugin-creator validator against both supported
forms:

- `mcpServers` as a direct object in `plugin.json`
- `mcpServers` as `"./.mcp.json"` with a companion `.mcp.json`

## Validation
- `just test -p codex-plugin`
- `just test -p codex-core-plugins`
- `just test -p codex-mcp-extension`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fmt`
- `git diff --check`
- Focused rename/object-form rerun: `just test -p codex-core-plugins
manager::tests::load_plugins_loads_manifest_mcp_server_objects
manager::tests::plugin_telemetry_metadata_uses_manifest_mcp_server_objects
store::tests::install_accepts_manifest_mcp_server_objects`
- Focused executor rerun: `just test -p codex-mcp-extension
executor_plugin::provider::tests::reads_manifest_object_config_without_executor_file_system_access`
- `python3
codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py
/private/tmp/codex-validator-object`
- `python3
codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py
/private/tmp/codex-validator-path`

charlesgong-openai · 2026-06-16 19:22:57 -07:00

1883dedc0e

thread-store: fix response fixture compilation (#28642 )

## Why

A `codex-thread-store` test fixture still constructs
`ResponseItem::FunctionCallOutput` without its required `metadata`
field, preventing the crate's test targets from compiling on `main`.

## What changed

- Set the fixture's response-item metadata to `None`.

## Testing

- `cargo check -p codex-thread-store --tests`

pakrym-oai · 2026-06-16 19:16:16 -07:00

6f77491e95

[codex] core: restore absolute turn context cwd (#28629 )

## Why

#28152 jumped the gun on moving the rollout format to store URIs, and
would likely break compat with some features that don't go through the
same types as the core logic.

## What

Make `TurnContextItem.cwd` an `AbsolutePathBuf` again and remove the test
added for `PathUri` serialization in rollouts. This also drops several
error paths that are no longer needed.

Adam Perry @ OpenAI · 2026-06-16 19:05:26 -07:00

3d02c443bc

[codex] Gate remote plugin catalog by auth (#28625 )

## Summary

- Treat the remote global plugin catalog as active only when
`remote_plugin` is enabled and the current auth uses the Codex backend.
- Skip the local OpenAI curated marketplace for remote-enabled ChatGPT
users while preserving configured marketplaces.
- Keep the local curated marketplace for API-key users, unauthenticated
fallback, and ChatGPT users with `remote_plugin` disabled.
- Apply the same effective-remote gate to the remote
installed-marketplace cache.
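
The gate above reduces to a small truth table. The sketch below assumes that ChatGPT auth is the auth mode that "uses the Codex backend"; the `Auth` enum and both function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Auth {
    ChatGpt,
    ApiKey,
    Unauthenticated,
}

/// Remote catalog is effective only with the feature on AND backend auth.
/// Assumption: only ChatGPT auth talks to the Codex backend.
fn remote_catalog_active(remote_plugin_enabled: bool, auth: Auth) -> bool {
    remote_plugin_enabled && auth == Auth::ChatGpt
}

/// The local curated marketplace is skipped exactly when remote is effective.
fn include_local_curated(remote_plugin_enabled: bool, auth: Auth) -> bool {
    !remote_catalog_active(remote_plugin_enabled, auth)
}
```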

## Root cause

The tool-suggestion discovery path unconditionally included the local
OpenAI curated marketplace. For remote-enabled ChatGPT users, that made
remote discovery additive: Codex parsed every local curated
`plugin.json` before also loading the remote catalog.

## Validation

- `just fmt`
- `cargo build -p codex-cli --bin codex`
- Targeted auth/feature matrix tests pass, including API-key auth with
`remote_plugin` enabled.
- Manual CLI validation confirmed:
  - ChatGPT + remote off includes local curated.
  - ChatGPT + remote on excludes local curated.
  - API-key auth keeps local curated when remote is enabled.
- `just test -p codex-core-plugins`: 235 passed; one unrelated existing
marketplace test failed because it loaded the developer's home
marketplace configuration.

xl-openai · 2026-06-16 17:24:48 -07:00

69bc0645ac

Revert "Tell codex about PathUri serde compat. (#28595 )" (#28627 )

This reverts commit bd2a786326, which
didn't capture all the nuance we need for this migration.

Adam Perry @ OpenAI · 2026-06-16 17:18:20 -07:00

bfe90188ad

Add thread recencyAt for sidebar ordering (#27910 )

## Summary

Add a server-owned `recencyAt` timestamp and `recency_at` thread-list
sort key for product recency ordering while preserving the existing
meaning of `updatedAt` as the latest persisted thread mutation.

This is the server-side alternative to #27697. Rather than narrowing
`updatedAt`, clients can sort the sidebar by `recency_at` and continue
treating `updatedAt` as mutation time.

Paired Codex Apps PR:
[openai/openai#1024599](https://github.com/openai/openai/pull/1024599)

## Contract

- `recencyAt` initializes when a thread is created.
- A turn start advances `recencyAt` monotonically.
- Commentary, agent output, tool results, token/accounting updates, turn
completion, archive, unarchive, resume, and generic metadata writes do
not advance it.
- `updatedAt` retains its existing behavior and continues to advance for
persisted thread mutations.
- Current servers populate `recencyAt`; the response field is optional
in generated TypeScript so clients connected to older servers can fall
back to `updatedAt`.
- Filesystem-only fallback uses existing updated/mtime ordering when
SQLite is unavailable.
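
A compact way to read this contract is as two timestamps with different update rules. The sketch below is illustrative only: `ThreadTimes` and its methods are hypothetical, and it assumes a turn start also counts as a persisted mutation for `updatedAt`.

```rust
#[derive(Debug, Clone, PartialEq)]
struct ThreadTimes {
    updated_at: i64,
    /// `None` models a response from an older server without the field.
    recency_at: Option<i64>,
}

impl ThreadTimes {
    fn created(now: i64) -> Self {
        Self { updated_at: now, recency_at: Some(now) }
    }

    /// A turn start advances recency, and never moves it backwards.
    fn on_turn_start(&mut self, now: i64) {
        self.updated_at = now; // assumption: turn start is also a mutation
        let current = self.recency_at.unwrap_or(now);
        self.recency_at = Some(current.max(now));
    }

    /// Output, tool results, archive, resume, metadata writes, and so on.
    fn on_other_mutation(&mut self, now: i64) {
        self.updated_at = now;
    }

    /// Client-side sort key with the fallback for older servers.
    fn sidebar_key(&self) -> i64 {
        self.recency_at.unwrap_or(self.updated_at)
    }
}
```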

## Persistence and compatibility

Migration 0038 adds second- and millisecond-precision recency columns,
backfills them from the existing updated timestamp, creates list
indexes, and includes an insert trigger so older binaries writing to a
migrated database seed recency without causing later mutations to
advance it.

Generic metadata upserts preserve existing recency values. Turn-start
updates use a dedicated monotonic touch, and process-local allocation
keeps millisecond cursor values unique. State DB list, search, read,
filtered-list repair, rollout fallback propagation, and app-server
conversions all carry the new field.
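
One plausible shape for the process-local allocation mentioned above is a monotonic counter that never hands out the same millisecond twice. This is a sketch under that assumption; `RecencyAllocator` is a hypothetical name and the PR's actual mechanism is not described here.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hands out strictly increasing millisecond values within one process.
struct RecencyAllocator {
    last_ms: AtomicI64,
}

impl RecencyAllocator {
    fn new() -> Self {
        Self { last_ms: AtomicI64::new(i64::MIN) }
    }

    /// Returns `now_ms`, or the previous value plus one if the clock has
    /// not advanced (or went backwards) since the last allocation.
    fn next(&self, now_ms: i64) -> i64 {
        let pick = |last: i64| now_ms.max(last.saturating_add(1));
        let prev = self
            .last_ms
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |last| Some(pick(last)))
            .expect("closure always returns Some");
        pick(prev)
    }
}
```

Unique values matter because a list cursor keyed on the timestamp cannot distinguish two threads that share the same millisecond.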

## API

`Thread` responses include:

```ts
recencyAt?: number
```

`thread/list` and `thread/search` accept:

```json
{ "sortKey": "recency_at" }
```

Generated TypeScript and JSON schemas are included.

## Validation

- `just test -p codex-state` — 146 passed
- `just test -p codex-rollout` — 69 passed
- `just test -p codex-thread-store` — 81 passed
- `just test -p codex-app-server-protocol` — 231 passed
- Focused app-server list ordering, response mapping, archive/unarchive,
and resume lifecycle tests passed
- Scoped `just fix` for state, rollout, thread-store,
app-server-protocol, and app-server
- `just fmt`
- `git diff --check`
- Independent correctness, simplicity, elegance, security, and
test-quality reviews; actionable ordering, lifecycle, query-projection,
and timestamp-uniqueness findings were addressed

Jeremy Rose · 2026-06-16 17:06:22 -07:00

fac3158c2a

PAC 1 - Add system proxy feature config surface (#26706 )

## Summary

Introduces the default-off `respect_system_proxy` feature flag used to
gate first-class system PAC/proxy support for Codex-owned native
clients.

With the feature disabled or absent, behavior remains unchanged. This PR
establishes the configuration and managed-requirement surface; proxy
discovery and request routing are implemented by follow-up PRs.

## Configuration

User configuration uses the standard boolean feature form:

```toml
[features]
respect_system_proxy = true
```

Managed feature requirements use the corresponding boolean key. The
effective runtime configuration is exposed as a boolean and defaults to
`false`.

## Implementation

- Registers `respect_system_proxy` as an under-development, default-off
feature.
- Resolves user configuration and managed feature requirements into
`Config.respect_system_proxy`.
- Provides bootstrap resolution for startup paths that must evaluate the
feature before full configuration loading completes.
- Uses the standard feature CLI and config-editing behavior.
- Excludes `features.respect_system_proxy` from project-local
configuration.
- Updates the generated configuration schema.

## End-user behavior

- No networking behavior changes when the feature is absent or disabled.
- Enabling the feature makes the boolean available to the native
proxy-routing implementation in follow-up PRs.
- Repository-local configuration cannot enable the feature.

## Test coverage

Covers scalar configuration and CLI override resolution, managed
requirement constraints, bootstrap resolution, and project-local
filtering.

canvrno-oai · 2026-06-16 16:54:37 -07:00

f0cb96bcb1

[codex] [4/4] Simplify recommended plugin install schema (#28403 )

## Summary
- Simplify recommendation-context `request_plugin_install` arguments to
`plugin_id` and `suggest_reason`.
- Derive plugin type and install action from the matched candidate while
preserving Codex-owned elicitation metadata.
- Keep the legacy list-backed schema unchanged and accept resumed calls
that still use `tool_id`.

## Stack
- #28399
- #28400
- #27704
- This PR

## Validation
- `just test -p codex-tools -p codex-core request_plugin_install` (25
passed)
- `just fix -p codex-tools -p codex-core`
- `just fmt`
- `git diff --check`

Alex Daley · 2026-06-16 23:44:42 +00:00

a397b59887

core: render remote environment cwd natively (#28152 )

## Why

Model-visible `<environment_context>` should match the environment of
the executor, not of the app server.

Stacked on #28146.

## What

- Keep selected environment cwd values as `PathUri` while building
environment context.
- Render cwd text using the path convention represented by the URI, with
the canonical URI as a fallback.
- Preserve compatibility with legacy `TurnContextItem.cwd` values when
reconstructing and diffing context.
- Extend the Wine-backed remote Windows test to assert that the model
sees `powershell` and `C:\windows`.
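
To illustrate what "render using the path convention represented by the URI" might look like, here is a deliberately small sketch. `render_cwd` is a hypothetical helper, it handles only plain `file://` URIs with an empty host, and it does no percent-decoding; the real `PathUri` type is not shown in this description.

```rust
/// Render a `file://` URI as native-looking path text, falling back to the
/// URI itself when the form is not recognized.
fn render_cwd(uri: &str) -> String {
    let path = match uri.strip_prefix("file://") {
        Some(rest) => rest,
        None => return uri.to_string(),
    };
    let bytes = path.as_bytes();
    // Windows drive form: "/C:/windows" becomes "C:\windows".
    if bytes.len() >= 3
        && bytes[0] == b'/'
        && bytes[1].is_ascii_alphabetic()
        && bytes[2] == b':'
    {
        return path[1..].replace('/', "\\");
    }
    // POSIX form: "/home/user" is already native.
    if path.starts_with('/') {
        return path.to_string();
    }
    // Anything else (for example a URI with a host) keeps the canonical URI.
    uri.to_string()
}
```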

Adam Perry @ OpenAI · 2026-06-16 16:17:47 -07:00

4c79527e31

[codex] [3/4] Activate endpoint plugin recommendations (#27704 )

Summary
- Await endpoint recommendation selection while constructing each
authenticated turn, removing the first-turn cache race.
- Snapshot and filter endpoint candidates once per turn, then use that
same set for the bounded contextual user fragment, tool exposure, and
exact install validation.
- Keep recommendation selection ephemeral: do not persist recommendation
state in or gate resumed threads on prior context.
- Hide the legacy list tool in endpoint mode and preserve legacy
discovery unchanged when the endpoint is disabled or unavailable.
- Keep remote plugin and connector app identities out of model-visible
context and attach them only to Codex-owned elicitation metadata.

Stack
- 3/4, based on #28400.
- Endpoint client and cache: #28399.
- Generalized suggestion presentation: #28400.
- Install-schema follow-up: #28403.

Validation
- Full : 2,649 passed and 88 environment-dependent tests failed because
this sandbox cannot write , nest Seatbelt, or locate auxiliary test
binaries.

Alex Daley · 2026-06-16 23:04:07 +00:00

a34da3b295

[codex] [2/4] Generalize plugin suggestion presentation (#28400 )

Summary
- Add list-backed and developer-context presentations for plugin
suggestion candidates.
- Let tool planning, install validation, and request-tool copy follow
the selected presentation.
- Keep every production caller on the existing list-backed presentation,
preserving the current list tool, request schema, connector behavior,
and model-visible copy.
- Leave developer-context presentation latent until the final PR in the
stack.

Stack
- 2/4, based on #28399.
- Follow-up: #27704 activates endpoint recommendations.

Validation
- `just test -p codex-core request_plugin_install`
- `just test -p codex-core spec_plan`
- `just fix -p codex-core`
- `just fmt`
- `git diff --check`

Alex Daley · 2026-06-16 22:44:10 +00:00

587487df9e

[codex] [1/4] Add recommended plugin endpoint cache (#28399 )

Summary
- Add authenticated parsing for `/ps/plugins/suggested?scope=GLOBAL`,
including remote plugin and connector app identities.
- Validate, deduplicate, sort, and cap endpoint candidates before
caching them by backend and account identity.
- Deduplicate concurrent cache misses and warm recommendations from the
existing remote-installed-plugin refresh path used at startup and after
account changes.
- Keep endpoint results model-invisible in this PR; failures and
responses without `enabled: true` resolve to legacy mode.
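
The validate, deduplicate, sort, and cap steps can be sketched as one small pipeline. The `Candidate` fields, the "non-empty id" validity rule, and sorting by id are assumptions made for illustration; the PR does not spell out the actual rules here.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    id: String,
    name: String,
}

/// Validate, sort, deduplicate, and cap endpoint candidates before caching.
fn prepare_candidates(mut raw: Vec<Candidate>, cap: usize) -> Vec<Candidate> {
    // Validate: drop entries without a usable id (assumed rule).
    raw.retain(|c| !c.id.trim().is_empty());
    // Sort by id; the sort is stable, so the first duplicate seen wins.
    raw.sort_by(|a, b| a.id.cmp(&b.id));
    // Deduplicate adjacent entries that share an id.
    raw.dedup_by(|a, b| a.id == b.id);
    // Cap the list so the cached set stays bounded.
    raw.truncate(cap);
    raw
}
```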

Stack
- 1/4. Follow-up: #28400 generalizes plugin suggestion presentation
without activating endpoint recommendations.
- Final activation: #27704.

Validation
- `just test -p codex-core-plugins recommended_plugins`
- `just fix -p codex-core-plugins`
- `just fmt`
- `git diff --check`

Alex Daley · 2026-06-16 22:22:21 +00:00

7e735b59ce

Tell codex about PathUri serde compat. (#28595 )

This addresses another wrinkle I keep having to re-prompt codex about
when migrating to cross-OS paths.

Adam Perry @ OpenAI · 2026-06-16 15:01:22 -07:00

bd2a786326

app-server: preserve target-native environment cwd (#28146 )

## Why

app-server may run on a different OS from the selected exec-server
environment. Parsing that environment’s cwd with the Codex host’s path
rules prevents thread startup.

## What

Carry environment cwd values as `LegacyAppPathString` at the app-server
boundary and `PathUri` internally. Existing tool-call schemas and
relative-path behavior stay host-native; remaining local-only consumers
convert explicitly and leave follow-up TODOs.

The Wine integration test verifies app-server can start a thread and
complete an ordinary turn with a Windows environment cwd from Linux.

## Validation

- `bazel test //codex-rs/core/tests/remote_env_windows:smoke-test
--test_output=errors`
- focused app-server environment-selection and protocol schema tests
- scoped Clippy for `codex-core` and `codex-app-server-protocol`

Adam Perry @ OpenAI · 2026-06-16 21:42:28 +00:00

f8850cab1d

Record invariants for path migration. (#28589 )

## Why

Help Codex understand how to execute the migration to support cross-OS
paths.

## What

Expand the path-types skill with our goals and constraints.

Adam Perry @ OpenAI · 2026-06-16 21:05:32 +00:00

33d50234a8

Clarify model-generated and legacy app path types (#28577 )

## Why

`ApiPathString` kind of implies that it can be used anywhere we pull a
path out of JSON, but it's not really appropriate for tool arguments
when the model might generate relative paths.

Prefer `String` for model-generated paths and we can handle the
conversion per feature for now and define a shared abstraction later if
it makes sense.

## What

Rename `ApiPathString` to `AppLegacyPathString` to clarify its role.

Expand the `path-types` skill to tell the model to leave tool args as
bare strings.

Adam Perry @ OpenAI · 2026-06-16 20:47:43 +00:00

322b83de5e

[codex] test exec relative additional permissions (#28587 )

## Why

Review caught some would-be regressions in changes to unified_exec that
weren't surfaced in CI.

## What

Add coverage for requesting permissions through unified exec when there
are additional permissions. Previously this flow was only tested against
shell_command.

Adam Perry @ OpenAI · 2026-06-16 20:45:57 +00:00

a50671e748

code-mode: extend test coverage to lock in cell lifecycle (#28468 )

This PR establishes the intended behavior as an executable contract
before a refactor of the cell runtime begins. It also fixes cases where
a second observer or termination request could replace an existing
response channel and leave the original caller unresolved.

### Behavior codified
- A cell can yield output and subsequently resume to completion.
- A caller can run a cell until it has no immediately runnable work,
receive its accumulated output and outstanding tool-call IDs, and then
resume the same cell when the awaited work is available.
- Each cell admits one active observer:
   - a second observer receives an explicit busy error
   - the existing observer remains registered and is not displaced
- A natural result (conclusion of the JS module) that has already
reached the cell controller wins over a later termination request.
- Otherwise, termination preempts execution and resolves both:
  - the active observer, if present
  - the caller requesting termination
- Repeated termination requests are rejected while termination is
already in progress.
- Terminal responses are sent only after outstanding callback work has
been handled:
  - natural completion drains notifications and cancels outstanding tool calls
  - termination cancels and drains both notification and tool callbacks
- Cell removal and `cell_closed` notification happen after callback
cleanup.
Channing Conger · 2026-06-16 13:34:16 -07:00

e93516e259

[codex] re-enable absolute workdir integration test (#28581 )

## Why

In #28146 I missed the invariant that an absolute `exec_command` workdir
must override the environment cwd. The existing integration test would
have caught that regression, but it was ignored as flaky.

## What

Re-enable `unified_exec_respects_workdir_override`.

## Validation

`just test -p codex-core unified_exec_respects_workdir_override`

Adam Perry @ OpenAI · 2026-06-16 20:19:41 +00:00

4b7351700f

[codex-app-server-test-client] Plugin Install/Uninstall Analytics Smoke Test (#27100 )

## This PR

The original [combined remote plugin analytics PR
#26281](https://github.com/openai/codex/pull/26281) mixed reusable
analytics test infrastructure, two manual smoke workflows, a metadata
refactor, and the final identity behavior. This PR adds the
account-mutating validation workflow separately so its cleanup and
recovery guarantees can be reviewed without the final analytics behavior
change.

- Add a manually invoked remote plugin install/uninstall smoke workflow.
- Require explicit account-mutation confirmation and an initially
uninstalled plugin.
- Validate the current `codex_plugin_installed` contract, where
`plugin_id` is the backend ID.
- Restore and verify the original uninstalled state, with a dedicated
recovery command.

This baseline intentionally does not require `codex_plugin_uninstalled`,
because production does not emit that event yet. The final PR will
update this smoke to require local `plugin_id`, `remote_plugin_id`, and
uninstall emission. Review this PR as the net diff against #27099.

## Testing

- `just test -p codex-app-server-test-client` (3 focused
capture/validation tests passed)
- The live workflow was previously exercised on the green combined
reference branch, and the original uninstalled account state was
restored.
- CI is green across the required platform matrix.

## Split Overview

```text
main
├── #27093  Debug analytics capture
│   └── #27099  Non-mutating plugin smoke
│       └── #27100  Remote install/uninstall smoke  ← you are here
└── #27102  Plugin telemetry metadata refactor

After #27093, #27099, #27100, and #27102 merge:
└── Final PR: add remote_plugin_id to plugin analytics
```

Review order and dependencies:

1. [#27093 Add debug-only analytics event
capture](https://github.com/openai/codex/pull/27093) (based on `main`)
2. [#27099 Add a plugin analytics smoke
workflow](https://github.com/openai/codex/pull/27099) (stacked on
#27093)
3. [#27100 Add a remote plugin analytics mutation smoke
workflow](https://github.com/openai/codex/pull/27100) **(this PR,
stacked on #27099)**
4. [#27102 Centralize plugin telemetry metadata
construction](https://github.com/openai/codex/pull/27102) (independent,
based on `main`)
5. Final remote-ID behavior PR (created after PRs 1-4 merge)

The original [#26281](https://github.com/openai/codex/pull/26281)
remains open as the green aggregate reference until the final PR is
published.

jameswt-oai · 2026-06-16 12:28:45 -07:00

8a40200880

[codex] Route MCP file uploads through environment filesystem (#27923 )

## Why

Codex Apps tools can mark arguments with `openai/fileParams`, but the
execution path resolved and opened those files directly on the host.
That bypassed the selected turn environment and prevented annotated file
arguments from working with remote environments.

## What changed

- resolve annotated file arguments against the primary turn environment
- read file metadata and contents through that environment's sandboxed
`ExecutorFileSystem`
- reject files over the 512 MiB limit from metadata before reading or
transferring them
- retain the buffered upload-size check as defense in depth
- make the OpenAI upload API accept a filename and buffered contents
instead of owning local filesystem access
- describe the model-visible argument as a path in the primary
environment
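
The two size checks can be sketched as below. The closures `stat_size` and `read_all` are hypothetical stand-ins for the environment's `ExecutorFileSystem` calls, whose real signatures are not given in this description.

```rust
const MAX_UPLOAD_BYTES: u64 = 512 * 1024 * 1024;

/// Check metadata first so oversized files are never read or transferred,
/// then re-check the buffered contents as defense in depth.
fn load_for_upload(
    path: &str,
    stat_size: impl Fn(&str) -> Result<u64, String>,
    read_all: impl Fn(&str) -> Result<Vec<u8>, String>,
) -> Result<Vec<u8>, String> {
    let size = stat_size(path)?;
    if size > MAX_UPLOAD_BYTES {
        return Err(format!("{} is {} bytes, over the upload limit", path, size));
    }
    let contents = read_all(path)?;
    if contents.len() as u64 > MAX_UPLOAD_BYTES {
        return Err(format!("{} grew past the upload limit while reading", path));
    }
    Ok(contents)
}
```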

This builds on #27927, which added `size` to internal filesystem
metadata.

## Testing

- `just test -p codex-api upload_openai_file_returns_canonical_uri`
- `just test -p codex-mcp
tool_with_model_visible_input_schema_masks_file_params`
- `just test -p codex-core mcp_openai_file`
- `just test -p codex-core
codex_apps_file_params_upload_environment_files_before_mcp_tool_call`

pakrym-oai · 2026-06-16 11:27:46 -07:00

7baf7e467e

ci: run code-mode unit tests on all bazel targets (#28562 )

## Why

V8 should be stable under Bazel, so the `codex-code-mode` unit tests
should run across the Bazel platform matrix. If these tests prove
unstable, we should fix the tests rather than exclude them from CI.

## What changed

- Remove the explicit `//codex-rs/code-mode:code-mode-unit-tests`
exclusion from the macOS and Linux Bazel test jobs.
- Remove the same exclusion from the native Windows post-merge job.
- Keep the existing Windows gnullvm shard coverage.

## Bazel test coverage

The target contains 26 unit tests. A fresh uncached local Bazel
execution ran all 26 with 0 failures, 0 ignored tests, and 0 filtered
tests.

PR Bazel CI selected the target on every enabled platform and reported a
cached pass:

| Platform | Passing CI job |
| --- | --- |
| macOS aarch64 | [Bazel test passed](https://github.com/openai/codex/actions/runs/27636617545/job/81725447804) |
| macOS x86_64 | [Bazel test passed in 2.2s](https://github.com/openai/codex/actions/runs/27636617545/job/81725448008) |
| Linux GNU | [Bazel test passed in 0.4s](https://github.com/openai/codex/actions/runs/27636617545/job/81725447898) |
| Linux musl | [Bazel test passed in 0.4s](https://github.com/openai/codex/actions/runs/27636617545/job/81725448117) |
| Windows gnullvm | [Bazel test passed in shard 4/4 in 1.6s](https://github.com/openai/codex/actions/runs/27636617545/job/81725448166) |

Channing Conger · 2026-06-16 11:26:33 -07:00

009a2bb93d

feat(tui): add rate-limit reset redemption to /usage (#28154 )

## Why

Codex users can earn personal rate-limit reset credits, but the CLI does
not currently provide a way to view or redeem them. The `/usage` command
restored in #27925 is intended to be the entry point for usage-related
actions, so reset redemption belongs there rather than in a separate
dashed slash command.

Depends on #28143 for the app-server and backend-client reset-credit
APIs.

## What changed

- Turn bare `/usage` into a menu with entries for token activity and
earned rate-limit resets while preserving `/usage daily`, `/usage
weekly`, and `/usage cumulative`.
- Add loading, empty, confirmation, success, retry, and error states
with a caller-generated UUID idempotency key reused across retries of
the same logical reset.
- Show an availability hint only for backend-classified rate-limit
errors with credits available.
- Hide the reset entry for workspace accounts.

## Validation

- `just test -p codex-tui chatwidget::tests::usage` — 19 passed.
- `just fix -p codex-tui` — passed.
- `just fmt` — passed.
- `cargo insta pending-snapshots` from `codex-rs/tui` — no pending
snapshots.

## Examples
<img width="1168" height="304" alt="image"
src="https://github.com/user-attachments/assets/caa4c1e3-e996-494d-ae17-50b521f5dce8"
/>
<img width="908" height="260" alt="image"
src="https://github.com/user-attachments/assets/e38a726b-77cc-4bd0-9ea8-9f3ad21c5768"
/>


### Reset flow
<img width="1509" height="312" alt="image"
src="https://github.com/user-attachments/assets/d987013c-78a5-48a2-ad8d-c61ad267a327"
/>
<img width="585" height="190" alt="image"
src="https://github.com/user-attachments/assets/de32be19-79b9-4a3e-8574-6f1c208c98ae"
/>
<img width="600" height="210" alt="image"
src="https://github.com/user-attachments/assets/88a165cf-796d-4fdc-a7bc-ea89917573da"
/>

<img width="512" height="193" alt="image"
src="https://github.com/user-attachments/assets/d2353998-5aa8-442e-a5f8-3a8a5b832753"
/>

jay · 2026-06-16 17:59:40 +00:00

f8f5a6e78f

Add incremental thread history changes

Add ThreadHistoryBuilder APIs for collecting incremental thread item and turn changes while applying rollout items.

Batch handling coalesces repeated changes so callers can get the latest incremental thread item changes for a set of rollout items without rebuilding full history.
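
Coalescing of this kind can be pictured as "last write wins per item". The sketch below is an assumption about the general idea, not the `ThreadHistoryBuilder` API: `ItemChange` and `coalesce` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
struct ItemChange {
    item_id: String,
    content: String,
}

/// Keep only the latest change per item, in first-seen order.
fn coalesce(changes: Vec<ItemChange>) -> Vec<ItemChange> {
    let mut out: Vec<ItemChange> = Vec::new();
    for change in changes {
        match out.iter_mut().find(|c| c.item_id == change.item_id) {
            // A later change for the same item replaces the earlier one.
            Some(existing) => *existing = change,
            None => out.push(change),
        }
    }
    out
}
```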

Tom · 2026-06-16 10:56:29 -07:00

1e6970542e

[codex] Warn clearly when code mode output is truncated (#28467 )

## Summary

- make `formatted_truncate_text` prepend `Warning: truncated output
(original token count: N)` above the existing `Total output lines`
header
- update direct formatter, unified-exec, user-shell, and code-mode
expectations
- add core unit coverage that runs in Bazel without requiring the
skipped V8-backed code-mode integration suite
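
The resulting layout can be illustrated with a simplified stand-in. This is not the real `formatted_truncate_text`: the sketch truncates by line count rather than by tokens, takes the original token count as a parameter, and assumes the existing header has the form `Total output lines: N`.

```rust
/// Simplified stand-in showing where the warning line sits in the output.
fn formatted_truncate_sketch(text: &str, max_lines: usize, original_tokens: usize) -> String {
    let total_lines = text.lines().count();
    if total_lines <= max_lines {
        // Nothing was cut, so no warning or header is added.
        return text.to_string();
    }
    let kept: Vec<&str> = text.lines().take(max_lines).collect();
    format!(
        "Warning: truncated output (original token count: {})\nTotal output lines: {}\n\n{}",
        original_tokens,
        total_lines,
        kept.join("\n")
    )
}
```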

## Validation

- `cargo test -p codex-utils-output-truncation -- --nocapture` (17
passed)
- `cargo test -p codex-core --lib
truncated_text_output_starts_with_warning -- --nocapture`
- `cargo test -p codex-core --test all
clamps_model_requested_max_output_tokens_to_policy -- --nocapture` (2
passed)
- `cargo test -p codex-core --test all
unified_exec_formats_large_output_summary -- --nocapture`
- `cargo test -p codex-core --test all
user_shell_command_output_is_truncated_in_history -- --nocapture`
- Bazel CI exercises the shared formatter and downstream integration
expectations

Ahmed Ibrahim · 2026-06-16 10:37:06 -07:00

952656356a

fix(tui): highlight C++ module files (#28554 )

## Why

Codex syntax-highlights diffs for conventional C++ extensions such as
`.cpp` and `.cxx`, but C++ module interface files using `.cppm`, `.ixx`,
or `.cxxm` fall back to plain diff coloring. The bundled syntax set
already includes C++, but it does not resolve those module extensions by
itself.

Closes #28223.

## What changed

- map `.cppm`, `.ixx`, and `.cxxm` to the existing `cpp` syntax in
`render/highlight.rs`
- extend alias-resolution coverage for all three module extensions
- verify `.cpp`, `.cppm`, `.ixx`, and `.cxxm` diffs produce
syntax-highlighted RGB spans while unknown extensions retain the plain
fallback
- snapshot the syntax-colored token segmentation for the supported C++
module extensions
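
The alias mapping amounts to a lookup like the one below. `syntax_for_extension` is a hypothetical helper written for illustration; the actual code in `render/highlight.rs` and its syntax-set lookup are not reproduced here.

```rust
/// Map a file extension to the name of the syntax used for highlighting.
/// Returns `None` for extensions that keep the plain diff fallback.
fn syntax_for_extension(ext: &str) -> Option<&'static str> {
    match ext {
        // Conventional C++ sources plus the module interface extensions.
        "cpp" | "cxx" | "cppm" | "ixx" | "cxxm" => Some("cpp"),
        _ => None,
    }
}
```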

## How to Test

1. Ask Codex to create or modify a C++ module interface file using
`.cppm`, `.ixx`, or `.cxxm`.
2. Confirm C++ tokens in the rendered diff receive syntax colors instead
of only the red/green diff treatment.
3. Modify an equivalent `.cpp` file and confirm its existing
highlighting remains unchanged.
4. Modify a file with an unknown extension and confirm it still uses the
plain diff fallback.

Targeted tests:

- `just test -p codex-tui -E
'test(find_syntax_resolves_languages_and_aliases) |
test(cpp_module_extensions_use_cpp_highlighting) |
test(unknown_extension_falls_back_without_syntax_highlighting)'`

Felipe Coury · 2026-06-16 17:33:13 +00:00

3ded846488

[codex-app-server-test-client & codex-app-server] Plugin Usage Analytics Smoke Test (#27099 )

## This PR

The original [combined remote plugin analytics PR
#26281](https://github.com/openai/codex/pull/26281) mixed reusable
analytics test infrastructure, two manual smoke workflows, a metadata
refactor, and the final identity behavior. This PR establishes a
non-mutating end-to-end plugin smoke workflow before any analytics
identity semantics change.

- Add `plugin-analytics-smoke` to the existing app-server test client.
- Exercise plugin disable, enable, and use through production app-server
RPC paths.
- Isolate config writes in a temporary file and use a loopback Responses
API server.
- Capture analytics without sending them to the production analytics
backend.
- Validate the current local `plugin_id`, names, capability metadata,
thread, turn, and model fields.

This is intentionally a baseline smoke workflow. It does not assert
`remote_plugin_id`; the final PR will update it when that field exists.
Review this PR as the net diff against #27093.

## Testing

- The test-client target compiles successfully.
- The combined reference branch exercised the manual smoke against the
live remote plugin service.
- CI is green across the required platform matrix.

## Split Overview

```text
main
├── #27093  Debug analytics capture
│   └── #27099  Non-mutating plugin smoke           ← you are here
│       └── #27100  Remote install/uninstall smoke
└── #27102  Plugin telemetry metadata refactor

After #27093, #27099, #27100, and #27102 merge:
└── Final PR: add remote_plugin_id to plugin analytics
```

Review order and dependencies:

1. [#27093 Add debug-only analytics event
capture](https://github.com/openai/codex/pull/27093) (based on `main`)
2. [#27099 Add a plugin analytics smoke
workflow](https://github.com/openai/codex/pull/27099) **(this PR,
stacked on #27093)**
3. [#27100 Add a remote plugin analytics mutation smoke
workflow](https://github.com/openai/codex/pull/27100) (stacked on this
PR)
4. [#27102 Centralize plugin telemetry metadata
construction](https://github.com/openai/codex/pull/27102) (independent,
based on `main`)
5. Final remote-ID behavior PR (created after PRs 1-4 merge)

The original [#26281](https://github.com/openai/codex/pull/26281)
remains open as the green aggregate reference until the final PR is
published.

jameswt-oai · 2026-06-16 10:11:41 -07:00

a376781a3c

chore: side prompt (#28553 )

Fix side bug with prompt

jif · 2026-06-16 19:05:03 +02:00

a544f5a612

[codex] exec-server: stream files in chunks (#28354 )

## Why

`fs/readFile` buffers the entire file in one response, which makes large
remote reads expensive and prevents callers from applying backpressure.
We need an opt-in streaming path with bounded block sizes while
preserving the existing single-call API for small and sandboxed reads.

## What changed

- Add `ExecServerClient::stream`, returning a named `FileReadStream`
that implements `futures::Stream` and yields immutable 1 MiB byte
blocks.
- Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs.
`fs/readBlock` accepts an explicit offset and length.
- Keep unsandboxed files open between block reads, cap open handles per
connection, and clean them up on EOF, error, stream drop, explicit
close, or connection shutdown.
- Reject platform-sandboxed streaming opens instead of turning the
one-shot sandbox helper into a persistent server. Existing `fs/readFile`
behavior is unchanged.
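
The offset-and-length block read can be sketched with the standard library alone. This is not the `fs/readBlock` implementation; `read_block` and `block_sizes` are hypothetical helpers, and an in-memory `Cursor` can stand in for an open file handle.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Block size mentioned in the PR description (1 MiB).
const BLOCK_SIZE: usize = 1024 * 1024;

/// Reads up to `len` bytes starting at `offset`. An empty result means EOF.
fn read_block<R: Read + Seek>(source: &mut R, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    source.seek(SeekFrom::Start(offset))?;
    let mut block = Vec::new();
    // `take` bounds the read so a single call never returns more than `len` bytes.
    source.by_ref().take(len as u64).read_to_end(&mut block)?;
    Ok(block)
}

/// Drains a source block by block and returns the size of each block read.
fn block_sizes<R: Read + Seek>(source: &mut R, block_len: usize) -> io::Result<Vec<usize>> {
    let mut sizes = Vec::new();
    let mut offset = 0u64;
    loop {
        let block = read_block(source, offset, block_len)?;
        if block.is_empty() {
            // EOF, including the case where the file ends exactly on a block boundary.
            return Ok(sizes);
        }
        offset += block.len() as u64;
        sizes.push(block.len());
    }
}
```

A caller would pass `BLOCK_SIZE` as `block_len`. Because each read carries its own offset, the reader can stop between blocks, which is what lets a consumer apply backpressure.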

## Testing

- `just test -p codex-exec-server`
- Integration coverage for 1 MiB chunking, exact block-boundary EOF,
sandbox rejection, and continued reads from the opened file after path
replacement.
- Handle-manager coverage for non-sequential offsets, variable block
lengths, the 128-handle limit, and capacity release after close.

pakrym-oai · 2026-06-16 09:50:55 -07:00

a4711b88dd

fix(tui): restore TUI after suspend (#28342 )

## Why

On Linux, suspending Codex with `Ctrl+Z` and returning with `fg` can
leave the composer misaligned or inject terminal response bytes such as
focus reports into the prompt. Shell job-control output moves the cursor
while Codex is suspended, and terminal input polling can race with the
responses used to restore the inline viewport.

Fixes #26564.

## What changed

- preserve and restore keyboard reporting without disturbing the parent
terminal stack
- pause terminal event polling while Codex is suspended and flush
buffered input before resuming it
- force crossterm's cached raw-mode state back in sync after the shell
completes its `fg` handoff
- probe the actual post-`fg` cursor position with the tolerant
terminal-response parser, then realign the inline viewport before
redrawing
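
For context on the cursor probe: terminals answer a cursor position request with `ESC [ row ; col R`, using one-based coordinates. The sketch below converts such a response to zero-based values. It is a strict, simplified stand-in, not the tolerant parser the PR refers to, and `parse_cursor_position` is a hypothetical name.

```rust
/// Parses a cursor position report (`ESC [ row ; col R`) into zero-based (row, column).
fn parse_cursor_position(response: &str) -> Option<(u16, u16)> {
    let body = response.strip_prefix("\x1b[")?.strip_suffix('R')?;
    let (row, col) = body.split_once(';')?;
    // Reports are one-based; subtract one and reject a malformed zero coordinate.
    let row = row.parse::<u16>().ok()?.checked_sub(1)?;
    let col = col.parse::<u16>().ok()?.checked_sub(1)?;
    Some((row, col))
}
```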

## How to Test

1. On Linux, start the development TUI with `just c`.
2. Type text into the composer without submitting it.
3. Press `Ctrl+Z`, run any harmless shell command, then run `fg`.
4. Confirm the composer redraws below the shell output, the draft text
is preserved, and no raw escape sequences appear.
5. Repeat the suspend/resume cycle and confirm normal typing still
works.

Targeted tests:

- `cargo test -p codex-tui --lib parses_cursor_position_as_zero_based -j
1`
- `cargo test -p codex-tui --lib tui::event_stream::tests -j 1`

Felipe Coury · 2026-06-16 09:09:24 -07:00

76135cbe7e

path-uri: clarify invalid host path errors (#28473 )

## Why

Ensure a consistent string format when exposing path conversion errors
to the model.

## What

- Render `PathUriParseError::InvalidFileUriPath` as `'$PATH' is invalid
on '$OS'`.
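
A minimal sketch of that rendering, assuming a stand-in struct with hypothetical `path` and `os` fields (the real `PathUriParseError` variant may carry different data):

```rust
use std::fmt;

/// Hypothetical stand-in for the error variant; field names are assumptions.
struct InvalidFileUriPath {
    path: String,
    os: String,
}

impl fmt::Display for InvalidFileUriPath {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Matches the `'$PATH' is invalid on '$OS'` format from the description.
        write!(f, "'{}' is invalid on '{}'", self.path, self.os)
    }
}
```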

Adam Perry @ OpenAI · 2026-06-16 09:03:44 -07:00

7162030b37

perf(config): defer remote sandbox hostname lookup (#28542 )

## Why

[#18763](https://github.com/openai/codex/pull/18763) added canonical
hostname resolution for `remote_sandbox_config`. Requirements
composition currently performs that synchronous DNS lookup on every
fresh process, even when none of the loaded requirements layers contains
`[[remote_sandbox_config]]`. On hosts with slow local DNS resolution,
this can add several seconds to Codex startup.

## What

- defer hostname resolution until a parsed requirements layer actually
contains `remote_sandbox_config`
- cache the resolver result once per requirements composition,
preserving the existing single-lookup behavior across multiple layers
- keep the existing FQDN resolution and per-layer requirements
precedence unchanged
- cover both the ordinary no-lookup path and the multi-layer
single-lookup path
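
The deferred, cached lookup can be illustrated with `std::cell::OnceCell`. This is only a sketch: `LazyHostname`, `Layer`, and `compose` are hypothetical names, and the real requirements composition does more than count matching layers.

```rust
use std::cell::OnceCell;

/// Resolves a hostname at most once, and only when first asked.
struct LazyHostname<F: Fn() -> String> {
    resolve: F,
    cached: OnceCell<String>,
}

impl<F: Fn() -> String> LazyHostname<F> {
    fn new(resolve: F) -> Self {
        Self { resolve, cached: OnceCell::new() }
    }

    fn get(&self) -> &str {
        self.cached.get_or_init(|| (self.resolve)())
    }
}

/// Hypothetical layer: only some layers carry remote sandbox entries.
struct Layer {
    has_remote_sandbox_config: bool,
}

/// Touches the resolver only for layers that actually need the hostname.
fn compose<F: Fn() -> String>(layers: &[Layer], hostname: &LazyHostname<F>) -> usize {
    let mut matched = 0;
    for layer in layers {
        if layer.has_remote_sandbox_config {
            // The first call resolves; later calls reuse the cached value.
            let _hostname = hostname.get();
            matched += 1;
        }
    }
    matched
}
```

With no matching layer the resolver closure is never invoked, which corresponds to the no-lookup startup path.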

## How to Test

On a host where local canonical-name resolution is slow:

1. Start Codex without `[[remote_sandbox_config]]` in any managed
requirements layer and confirm startup no longer waits for hostname
resolution.
2. Add a matching `[[remote_sandbox_config]]` entry and confirm its
`allowed_sandbox_modes` still overrides the layer's top-level value.
3. Add remote sandbox entries to multiple requirements layers and
confirm precedence remains unchanged while the hostname is resolved only
once.

Targeted tests:

- `just test -p codex-config hostname_resolver`
- `just test -p codex-config` (181 passed)

Felipe Coury · 2026-06-16 11:17:41 -04:00

40e7dda4d2

core: surface terminal subagent errors to parent agents (#28375 )

## Why

When a subagent exhausts its retries, it emits an `Error`, but the
generic task lifecycle then emits `TurnComplete(None)`. That completion
used to overwrite the subagent's `Errored` status with
`Completed(None)`, so the parent received an empty completion
notification.

This made a failed child look indistinguishable from a child that
completed without an answer. In unattended or long-running multi-agent
work, the root could silently continue without knowing that delegated
work failed or how to restart it.

## Behavior

Before, a terminal stream failure was reduced to an empty completion:

```text
<subagent_notification>
{"agent_path":"/root/worker","status":{"completed":null}}
</subagent_notification>
```

Now the parent receives the actual terminal error, bounded to 1,000
tokens, together with an actionable recovery hint:

```text
<subagent_notification>
{
"agent_path": "/root/worker",
"status": {
"errored": "stream disconnected before completion: stream closed before response.completed"
},
"next_action": "This agent's turn failed. If you still need this agent, use `followup_task` to give it another task."
}
</subagent_notification>
```

The notification remains queue-only: it does not wake the root or replay
the failed request. The root sees it at the next sampling boundary and
can use `followup_task` to start a new turn for that agent.

## What changed

- Added terminal-error precedence to the [agent status
reducer](https://github.com/openai/codex/blob/e95fcfe2bb6a02f1a75650afa20048859f556511/codex-rs/core/src/agent/status.rs#L23-L34),
so a closing `TurnComplete` cannot erase an immediately preceding
`Errored` status.
- Made MultiAgentV2 completion forwarding use the retained session
status instead of re-deriving `Completed(None)` from the final event.
- Extended the [subagent notification
fragment](https://github.com/openai/codex/blob/e95fcfe2bb6a02f1a75650afa20048859f556511/codex-rs/core/src/context/subagent_notification.rs#L6-L60)
with a `next_action` for terminal errors and a hard cap on model-visible
error text.
- Kept successful completions and interrupted turns unchanged.
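
The terminal-error precedence rule can be sketched as a small reducer. The types below are simplified, hypothetical stand-ins for the real status and event enums, and the sketch ignores how a later turn resets the status.

```rust
#[derive(Debug, Clone, PartialEq)]
enum AgentStatus {
    Running,
    Completed(Option<String>),
    Errored(String),
}

enum Event {
    Error(String),
    TurnComplete(Option<String>),
}

fn reduce(current: AgentStatus, event: Event) -> AgentStatus {
    match (current, event) {
        // Terminal-error precedence: a closing TurnComplete keeps the error.
        (AgentStatus::Errored(message), Event::TurnComplete(_)) => AgentStatus::Errored(message),
        (_, Event::Error(message)) => AgentStatus::Errored(message),
        (_, Event::TurnComplete(answer)) => AgentStatus::Completed(answer),
    }
}
```

Ordering the `Errored` arm first is what prevents the generic completion from overwriting the failure.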

## Verification

- Added a status-reducer test proving that `Errored` survives the
trailing `TurnComplete`.
- Added an integration test that exhausts a subagent's stream retries
and verifies the exact `agent_message` delivered to the parent,
including the error and `followup_task` guidance.
- Re-ran the existing successful-completion and interrupted-turn
notification tests.

jif · 2026-06-16 14:34:54 +02:00

1b24ba912a

[codex] Clarify plugin load and runtime capability stages (#28472 )

## Summary

Plugin loading and auth projection both previously produced
`PluginLoadOutcome`. That made an unfiltered load result look like
runtime-ready capabilities and generated capability summaries before
auth routing had run.

This change keeps loaded plugin records in the cache, applies the
current auth policy in `PluginsManager`, and only then builds
`PluginLoadOutcome` and its summaries. Auth changes still reuse the
cached disk load and re-resolve apps and MCP servers without reloading
plugins.

The updated tests cover cached auth changes and verify that capability
summaries match the effective app/MCP surface.

## Testing

- `just test -p codex-core-plugins`
- `just test -p codex-plugin`
- `just fix -p codex-core-plugins`

xl-openai · 2026-06-16 12:57:21 +01:00

de1f77bfdd

[tests] Keep Apps out of generic core test harness (#28508 )

## Summary

- disable the stable Apps feature in the generic `test_codex()`
integration-test harness
- keep Apps-specific tests explicit: their builders re-enable Apps and
point it at a local mock server

## Why

Generic tests that use dummy ChatGPT auth were also enabling the
host-owned `codex_apps` MCP server. That made unrelated tests contact
`chatgpt.com` and wait for MCP startup, causing the Bazel timeouts
observed on #28368.

The generic harness should be hermetic and should not start an external
service that the test did not request. This is test-only; production
Apps behavior is unchanged. The broader optional-MCP startup behavior is
being handled separately in #28407.

## Testing

- `just test -p codex-core -E
'test(pre_sampling_compact_runs_when_comp_hash_changes) |
test(model_switch_to_smaller_model_updates_token_context_window) |
test(codex_apps_file_params_upload_local_paths_before_mcp_tool_call)'`
- `just fix -p codex-core`
- `just fmt`

jif · 2026-06-16 13:07:43 +02:00

ef8eb8bdd9

feat: render typed envelopes for multi-agent v2 messages (#28368 )

## Why

Multi-agent v2 messages need a consistent, model-visible envelope that
identifies what kind of interaction occurred, who sent it, and which
agent it targets. Previously, encrypted deliveries exposed only
`encrypted_content`, while child completion used the legacy
`<subagent_notification>` shape. That meant the client could not
consistently present `NEW_TASK`, `MESSAGE`, and `FINAL_ANSWER` using the
same format.

This change adds the routing envelope as plaintext while keeping task
and message payloads encrypted. No new Responses API field is required:
an encrypted delivery is represented as an `input_text` header
immediately followed by its existing `encrypted_content` item.

Every envelope now follows this shape:

```text
Message Type: <NEW_TASK | MESSAGE | FINAL_ANSWER>
Task name: <recipient agent path>
Sender: <author agent path>
Payload:
<message payload>
```
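
A minimal sketch of rendering this shape, assuming hypothetical names (`MessageType`, `render_envelope`) rather than the actual `InterAgentCommunication` code. Passing an empty payload yields just the plaintext header, which is the part that would precede an `encrypted_content` item.

```rust
#[derive(Clone, Copy)]
enum MessageType {
    NewTask,
    Message,
    FinalAnswer,
}

impl MessageType {
    fn label(self) -> &'static str {
        match self {
            MessageType::NewTask => "NEW_TASK",
            MessageType::Message => "MESSAGE",
            MessageType::FinalAnswer => "FINAL_ANSWER",
        }
    }
}

/// Renders the envelope; `payload` is empty when the payload travels encrypted.
fn render_envelope(kind: MessageType, recipient: &str, sender: &str, payload: &str) -> String {
    format!(
        "Message Type: {}\nTask name: {}\nSender: {}\nPayload:\n{}",
        kind.label(),
        recipient,
        sender,
        payload
    )
}
```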

## Message types

### `NEW_TASK`

`NEW_TASK` is used when the recipient should begin a new turn, including
an initial `spawn_agent` task and a later `followup_task`.

For a root agent spawning `/root/worker`, the request contains a
plaintext envelope followed by the encrypted task:

```json
{
  "type": "agent_message",
  "author": "/root",
  "recipient": "/root/worker",
  "content": [
    {
      "type": "input_text",
      "text": "Message Type: NEW_TASK\nTask name: /root/worker\nSender: /root\nPayload:\n"
    },
    {
      "type": "encrypted_content",
      "encrypted_content": "<encrypted task payload>"
    }
  ]
}
```

Conceptually, the model receives:

```text
Message Type: NEW_TASK
Task name: /root/worker
Sender: /root
Payload:
Review the authentication changes and report any regressions.
```

### `MESSAGE`

`MESSAGE` is used for a queued `send_message` delivery. It communicates
with an existing agent without starting a new turn.

For `/root/worker` reporting progress to the root agent, the request
contains:

```json
{
  "type": "agent_message",
  "author": "/root/worker",
  "recipient": "/root",
  "content": [
    {
      "type": "input_text",
      "text": "Message Type: MESSAGE\nTask name: /root\nSender: /root/worker\nPayload:\n"
    },
    {
      "type": "encrypted_content",
      "encrypted_content": "<encrypted message payload>"
    }
  ]
}
```

Conceptually, the model receives:

```text
Message Type: MESSAGE
Task name: /root
Sender: /root/worker
Payload:
The protocol tests pass; I am checking the resume path now.
```

### `FINAL_ANSWER`

`FINAL_ANSWER` is emitted when a child agent reaches a terminal state
and reports its result to its parent. Completion payloads are already
available locally, so the complete envelope is represented as plaintext
rather than as a plaintext header plus encrypted content.

For `/root/worker` completing work for the root agent, the request
contains:

```json
{
  "type": "agent_message",
  "author": "/root/worker",
  "recipient": "/root",
  "content": [
    {
      "type": "input_text",
      "text": "Message Type: FINAL_ANSWER\nTask name: /root\nSender: /root/worker\nPayload:\nNo regressions found."
    }
  ]
}
```

The model-visible form is:

```text
Message Type: FINAL_ANSWER
Task name: /root
Sender: /root/worker
Payload:
No regressions found.
```

Errored, shut down, and missing agents also use `FINAL_ANSWER`, with a
terminal-status description in the payload.

## What changed

- Render `NEW_TASK` or `MESSAGE` in
`InterAgentCommunication::to_model_input_item`, based on whether the
encrypted delivery starts a turn.
- Replace the multi-agent v2 `<subagent_notification>` completion
payload with a model-visible `FINAL_ANSWER` envelope.
- Document `Task name`, `Sender`, and `Payload` consistently in the
multi-agent developer instructions.
- Prevent local-only history projections from treating an encrypted
message's plaintext header as the complete assistant message.
- Preserve rollout-trace interaction edges when an agent message
contains both plaintext and encrypted content.

Legacy multi-agent behavior remains unchanged.

## Verification

- `just test -p codex-protocol`
- `just test -p codex-rollout-trace`
- `just test -p codex-web-search-extension`
- `just test -p codex-core
encrypted_multi_agent_v2_spawn_sends_agent_message_to_child`
- `just test -p codex-core
plaintext_multi_agent_v2_completion_sends_agent_message`
- `just test -p codex-core
multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn`
- `just test -p codex-core
multi_agent_v2_completion_queues_message_for_direct_parent`

jif · 2026-06-16 11:46:59 +02:00

5b22a8e5b1

[codex] Compress cold active rollouts (#28338 )

## Why

The local rollout compression worker currently scans only
`archived_sessions`, so cold unarchived thread history remains expanded
indefinitely.

## What changed

- Scan `sessions` after `archived_sessions` within the existing worker
runtime budget.
- Update rollout compression coverage to require both cold active and
archived rollouts to be compressed while fresh active rollouts remain
plain.

The worker remains behind the disabled-by-default
`local_thread_store_compression` feature, and the existing seven-day
cold-file threshold is unchanged.
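
To make the cold-file rule concrete, here is a small sketch. It assumes the age is measured from a file's modification time and that the boundary is inclusive; neither detail is stated in this description, and `is_cold` is a hypothetical helper.

```rust
use std::time::{Duration, SystemTime};

/// Seven-day threshold from the description.
const COLD_THRESHOLD: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Returns true when `modified` is at least seven days before `now`.
fn is_cold(modified: SystemTime, now: SystemTime) -> bool {
    // A modification time in the future yields Err; treat that as not cold.
    now.duration_since(modified)
        .map(|age| age >= COLD_THRESHOLD)
        .unwrap_or(false)
}
```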

## Validation

- `just test -p codex-rollout` (69 passed)
- `just fmt`
- `git diff --check`

jif · 2026-06-16 10:52:21 +02:00

352d2fed1f

[codex] expose Bedrock credential source in account/read (#27751 )

## Why

`account/read` currently reports only `type: "amazonBedrock"`, so
clients cannot distinguish a Codex-managed Bedrock API key from
credentials supplied by AWS. The app UI needs that distinction to render
the appropriate account state without duplicating provider-auth logic.

Credential-source selection belongs to the Bedrock model provider
because it already owns the precedence between managed Bedrock auth and
the external AWS credential path. This builds on #27443 and #27689.

## What changed

- Added `AmazonBedrockCredentialSource` with `codexManaged` and
`awsManaged` values.
- Included the selected credential source in
`ProviderAccount::AmazonBedrock` and the app-server `Account` response.
- Made `AmazonBedrockModelProvider::account_state()` classify the source
from its managed-auth state.
- Regenerated the app-server JSON and TypeScript schemas.
- Updated app-server account documentation and downstream TUI matches.

`codexManaged` means the provider found a managed Bedrock API key.
`awsManaged` identifies the provider's external AWS credential path; it
does not assert that the AWS credential chain has been validated.
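
The classification described above can be sketched as follows. The enum mirrors the two wire values named in this PR, while `wire_value` and `classify` are hypothetical helpers (the real code presumably derives serialization instead of hand-writing it).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AmazonBedrockCredentialSource {
    CodexManaged,
    AwsManaged,
}

impl AmazonBedrockCredentialSource {
    /// Wire values as listed in the description.
    fn wire_value(self) -> &'static str {
        match self {
            AmazonBedrockCredentialSource::CodexManaged => "codexManaged",
            AmazonBedrockCredentialSource::AwsManaged => "awsManaged",
        }
    }
}

/// A managed key takes precedence; otherwise fall back to the external AWS path.
fn classify(managed_api_key: Option<&str>) -> AmazonBedrockCredentialSource {
    match managed_api_key {
        Some(_) => AmazonBedrockCredentialSource::CodexManaged,
        // No validation of the AWS credential chain happens here.
        None => AmazonBedrockCredentialSource::AwsManaged,
    }
}
```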

## Testing

- Added model-provider coverage for Codex-managed precedence and
AWS-managed fallback.
- Added app-server protocol serialization coverage for both wire values.
- Added app-server integration coverage for both `account/read`
responses.
- `just test -p codex-protocol -p codex-model-provider -p
codex-app-server-protocol` (497 tests passed).

After rebasing onto #27711, the `codex-app-server` test target compiled
past the image-generation `PathUri` migration. Local linking was then
interrupted by disk exhaustion (`No space left on device`).

Celia Chen · 2026-06-16 07:14:53 +00:00

12aaeb7bf8

[codex] Record external agent import results (#28396 )

## Summary
- restore `externalAgentConfig/import/progress` notifications while
keeping `externalAgentConfig/import/completed` as the must-deliver event
- persist completed external-agent config imports in state DB by
`importId`, including concrete success/failure details for config,
AGENTS.md, skills, plugins, MCP servers, subagents, hooks, commands, and
sessions
- add `externalAgentConfig/import/readHistories` so clients can recover
persisted import results after missing the live completion notification
- include `errorType` on import failures in protocol
responses/notifications and persisted DB JSON so future code can
classify failures without another wire/storage shape change

## Validation
- `git diff --check`
- `just test -p codex-state external_agent_config_imports`
- `just test -p codex-app-server-protocol`
- `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-sqlite-read-details
just test -p codex-app-server
external_agent_config_import_sends_completion_notification_for_sync_only_import`

Also ran earlier broader checks before publishing:
- `just test -p codex-state`
-
`CODEX_SQLITE_HOME=/private/tmp/codex-app-server-external-agent-test-sqlite
just test -p codex-app-server external_agent_config`
- `just test -p codex-external-agent-migration`

charlesgong-openai · 2026-06-15 23:17:24 -07:00

314fa3d25b

[codex] Use local environment for user shell commands (#28163 )

## Why

User shell commands still read the legacy turn cwd and session shell
even though execution context is now owned by selected turn
environments. App-server also defines `thread/shellCommand` as a
local-host escape hatch, so it must use an available local environment
even when a remote environment is primary.

## What changed

- Add `ResolvedTurnEnvironments::local()` to find the selected local
environment.
- Resolve the user shell command cwd and shell from that local
`TurnEnvironment`.
- Emit the standard `shell is unavailable in this session` error when no
selected local environment or resolved local shell is available.
- Add an integration test covering `/shell` without a local environment.
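
A rough sketch of the lookup and the error path, under simplified assumptions: the struct fields and `user_shell_context` are hypothetical, and only the `local()` name and the error text come from the description.

```rust
const SHELL_UNAVAILABLE: &str = "shell is unavailable in this session";

/// Simplified stand-in; the real `TurnEnvironment` carries more state.
struct TurnEnvironment {
    is_local: bool,
    cwd: String,
    shell: Option<String>,
}

struct ResolvedTurnEnvironments {
    selected: Vec<TurnEnvironment>,
}

impl ResolvedTurnEnvironments {
    /// Finds the selected local environment, even when a remote one is primary.
    fn local(&self) -> Option<&TurnEnvironment> {
        self.selected.iter().find(|env| env.is_local)
    }
}

/// Returns (cwd, shell) for a user shell command, or the standard error text.
fn user_shell_context(envs: &ResolvedTurnEnvironments) -> Result<(&str, &str), &'static str> {
    let env = envs.local().ok_or(SHELL_UNAVAILABLE)?;
    let shell = env.shell.as_deref().ok_or(SHELL_UNAVAILABLE)?;
    Ok((env.cwd.as_str(), shell))
}
```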

## Test plan

- `just test -p codex-core
user_shell_command_without_local_environment_emits_error`

pakrym-oai · 2026-06-16 04:55:20 +00:00

1e015884c5

[codex] Use expect in integration tests (#28441 )

The workspace denies `clippy::expect_used` in production. Although
`clippy.toml` allows `expect` in tests, Bazel Clippy compiles
integration-test helper code in a way that does not receive that
exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
and equivalent `match`/`let else` forms.

This allows `clippy::expect_used` once at each integration-test crate
root (including aggregated suites and test-support libraries), then
replaces manual panic-based Result and Option unwraps with
`expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
crate roots. Intentional assertion and unexpected-variant panics remain
unchanged, and the production `expect_used = "deny"` lint remains in
place.
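
For illustration, the kind of rewrite involved looks roughly like this. The helper names are invented for the example and do not come from the repository.

```rust
// At an integration-test crate root, the allowance would appear once as:
// #![allow(clippy::expect_used)]

/// Before: manual panic-based unwrap.
fn parse_port_verbose(raw: &str) -> u16 {
    raw.parse::<u16>()
        .unwrap_or_else(|err| panic!("port should parse: {err}"))
}

/// After: the same intent with `expect`.
fn parse_port_concise(raw: &str) -> u16 {
    raw.parse::<u16>().expect("port should parse")
}

/// `expect_err` covers the case where a failure is the expected outcome.
fn parse_error_concise(raw: &str) -> std::num::ParseIntError {
    raw.parse::<u16>().expect_err("port should be rejected")
}
```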

The cleanup is mechanical and net-negative in line count.

pakrym-oai · 2026-06-15 21:53:47 -07:00

e752f7b4ae

[codex] Add interruptible sleep tool (#28429 )

## Why

Models sometimes need to pause briefly while waiting for external work,
but using a shell command for that delay ties the wait to a process and
does not naturally resume when new turn input arrives.

## What changed

- add a built-in `sleep` tool behind the under-development `sleep_tool`
feature
- accept a bounded `duration_ms` argument, matching the millisecond
convention used by unified exec
- end the sleep early when either steered user input or mailbox input
arrives
- include elapsed wall-clock time in completed and interrupted outputs
- emit a dedicated core `SleepItem` through `item/started` and
`item/completed`
- expose the sleep item as app-server v2 `ThreadItem::Sleep` and retain
it in reconstructed thread history
- regenerate the configuration schema for the new feature flag
- regenerate app-server JSON and TypeScript schema fixtures
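
The interruptible wait can be approximated synchronously with a standard-library channel. This is a sketch of the idea only: the actual tool is presumably asynchronous, and `interruptible_sleep`, `SleepOutcome`, and the use of a `Receiver<()>` as the wake signal are assumptions.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum SleepOutcome {
    Completed,
    Interrupted,
}

/// Waits up to `duration_ms`, returning early if a wake signal arrives.
/// The elapsed wall-clock time is reported in both cases.
fn interruptible_sleep(duration_ms: u64, wake: &Receiver<()>) -> (SleepOutcome, Duration) {
    let started = Instant::now();
    let requested = Duration::from_millis(duration_ms);
    let outcome = match wake.recv_timeout(requested) {
        Ok(()) => SleepOutcome::Interrupted,
        Err(RecvTimeoutError::Timeout) => SleepOutcome::Completed,
        Err(RecvTimeoutError::Disconnected) => {
            // No sender can wake us any more; sleep out the remainder.
            std::thread::sleep(requested.saturating_sub(started.elapsed()));
            SleepOutcome::Completed
        }
    };
    (outcome, started.elapsed())
}
```

In this sketch a single wake channel stands in for both interruption sources; steered user input and mailbox input would each send on it.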

## Test plan

- `just test -p codex-core sleep_tool_follows_feature_gate`
- `just test -p codex-core any_new_input_interrupts_sleep`
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server
sleep_emits_started_and_completed_items`

pakrym-oai · 2026-06-15 21:39:21 -07:00

08901fc8e1

[codex] Bind shell snapshots to retained thread environments (#28421 )

## Why

Shell snapshots are currently session-scoped even though shell and cwd
are properties of a selected turn environment. That makes snapshot
refresh depend on separate session-cwd plumbing, prevents retained
environments from retaining their snapshot work, and can make snapshot
construction use a different shell than command execution.

This follows #27955 by making the retained thread-environment service
own environment snapshot lifecycles. Session configuration remains the
requested selection state, while `ThreadEnvironments` remains the source
of successfully resolved environments.

## What changed

- Configure the shell-snapshot builder before initial environment
resolution.
- Start each local environment snapshot task when its `TurnEnvironment`
is built and retain that shared task while environment ID and cwd still
match.
- Inherit retained environment snapshots into spawned child threads.
- Carry the selected `TurnEnvironment` through shell runtimes so
snapshot construction and command execution use the same
environment-specific shell and cwd.
- Load project instructions and warm plugins/skills after initial
environment resolution.
- Continue decoding invalid UTF-8 instruction files lossily without
emitting a startup warning.
- Keep requested selections in `SessionConfiguration`; failed or
duplicate resolutions only affect the resolved environment snapshot.

## Validation

- `cargo check -p codex-core --tests`
- `just test -p codex-home instructions` (6 passed)
- Focused environment, instruction, shell-snapshot, and user-shell tests
(84 passed)
- Focused shell-snapshot, user-shell, and unified-exec tests (126
passed; two event-timing tests passed on retry)

pakrym-oai · 2026-06-15 20:10:53 -07:00

022f1221e8

Use ApiPathString in app-server filesystem permission paths (#28367 )

## Why

Clients running an app-server on one OS and an exec-server on another OS
need to be able to pass sandbox config to app-server that refers to
resources on the executor's foreign OS.

## What

`AbsolutePathBuf` can't represent these paths and we don't want users to
be exposed to `PathUri` yet, so this moves the public app-server API to
be expressed in terms of `ApiPathString`.

Stacked on #28165.

- change app-server v2 filesystem permission paths, including legacy
read/write roots, to `ApiPathString`
- localize API paths through `PathUri` when converting into the current
native core permission types
- make path-bearing permission conversions fallible and surface
localization failures instead of silently treating malformed grants as
ordinary denials
- propagate conversion failures through app-server and TUI approval
handling
- regenerate the app-server JSON and TypeScript schemas
- leave migration TODOs on native-path conversions so they can be
removed once core permission paths use `PathUri`

Adam Perry @ OpenAI · 2026-06-15 19:25:54 -07:00

ecfe174d5f

[codex] Make plugin details capability aware (#27958 )

## Summary

Makes plugin details/read flows capability-aware so auth-filtered plugin
surfaces report the same usable app/MCP/skill shape as the marketplace
and install flows.

## Validation

Not run; this change was rebased onto the current plugin auth stack and
pushed as a draft PR.

**Manual test**
1. Set up a local marketplace with a plugin that has both app and MCP
declarations.

```
// .app.json
{
  "apps": {
    "linear": {
      "id": "some_id"
    }
  }
}

```

```
// .mcp.json
{
  "mcpServers": {
    "linear": {
      "type": "http",
      "url": "https://mcp.linear.app/mcp",
      "oauth_resource": "https://mcp.linear.app/mcp"
    },
    "linear2": {
      "type": "http",
      "url": "https://mcp.linear2.app/mcp",
      "oauth_resource": "https://mcp.linear2.app/mcp"
    }
  }
}
```

2a. **Log in with an API key** and observe that the plugin details page
shows no apps. (Note: we don't show "app not available due to API key
login", because there is no way to differentiate between "no apps" and
"app without a substitute MCP" without significantly more code changes.
I've separated this into a follow-up if we want that behaviour.)
<img width="1170" height="279" alt="Screenshot 2026-06-15 at 23 45 40"
src="https://github.com/user-attachments/assets/d36cb160-fbec-461e-9643-9c761dbae7bb"
/>
<img width="975" height="640" alt="Screenshot 2026-06-15 at 18 40 30"
src="https://github.com/user-attachments/assets/90ec0bc8-7506-4b90-bbd3-070720de799e"
/>


2b. **Log in with chat** and observe the intended conflict resolution logic.
<img width="1165" height="224" alt="Screenshot 2026-06-15 at 17 17 30"
src="https://github.com/user-attachments/assets/80adfbf2-7dac-4f08-8b76-8eeeab6c95e7"
/>
<img width="968" height="567" alt="Screenshot 2026-06-15 at 18 38 59"
src="https://github.com/user-attachments/assets/9ea92c5e-535b-4aa4-8ad0-ee513b57bc3c"
/>

felixxia-oai · 2026-06-16 01:25:22 +00:00

d959664420

[codex] Load API curated marketplace by auth (#28383 )

## Summary
- choose the local OpenAI curated marketplace manifest based on auth:
Codex backend auth gets the existing marketplace, direct provider auth
gets `api_marketplace.json`
- include Bedrock API key auth in the direct-provider API marketplace
path
- safely skip the API marketplace when `api_marketplace.json` is absent
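
The selection rule can be sketched like this. `AuthKind`, `curated_manifest`, and the injected `exists` check are hypothetical; only the two manifest file names are taken from this PR description.

```rust
use std::path::{Path, PathBuf};

/// Simplified auth categories; Bedrock API key auth would count as `DirectProvider`.
enum AuthKind {
    CodexBackend,
    DirectProvider,
}

/// Picks the curated marketplace manifest for the given auth, if any.
fn curated_manifest(
    root: &Path,
    auth: AuthKind,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    match auth {
        AuthKind::CodexBackend => Some(root.join("marketplace.json")),
        AuthKind::DirectProvider => {
            let path = root.join("api_marketplace.json");
            // Skip the API marketplace entirely when its manifest is absent.
            exists(&path).then_some(path)
        }
    }
}
```

Injecting `exists` keeps the sketch free of filesystem access; a real caller could pass `|p| p.exists()`.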

## Validation
- `just fmt`
- `git diff --check origin/main...HEAD`
- CI should run the full validation

## Manual Testing

### - New API marketplace not available for API key sign in
1. Safely displays nothing from the API marketplace
<img width="1161" height="289" alt="Screenshot 2026-06-15 at 21 37 43"
src="https://github.com/user-attachments/assets/a5f16642-8a20-4ac1-a0de-1274a4c7b5b2"
/>

### - New api marketplace for API key sign in
1. Set up `api_marketplace.json`
```
{
  "name": "openai-curated",
  "interface": {
    "displayName": "Codex official"
  },
  "plugins": [
    {
      "name": "linear",
      "source": {
        "source": "local",
        "path": "./plugins/linear"
      },
      "policy": {
        "installation": "AVAILABLE",
        "authentication": "ON_INSTALL"
      },
      "category": "Productivity"
    }
  ]
}
```

2. Log in with API key, observe that only the defined plugin from
api_marketplace.json is available from "Codex Official" (outside of
local testing marketplaces)
<img width="1167" height="446" alt="Screenshot 2026-06-15 at 21 16 53"
src="https://github.com/user-attachments/assets/7cf61477-d826-4ef6-bc05-0a23ac1c0259"
/>

also checked functionality on codex app

### - SiWC users
Still use the default `marketplace.json` and see all plugins rendered
<img width="1171" height="502" alt="Screenshot 2026-06-15 at 21 40 25"
src="https://github.com/user-attachments/assets/d212ea9b-0aa5-470b-8ea4-450efe65bb2b"
/>

also checked functionality on codex app


## Notes
- `just test -p codex-core-plugins` was started locally before splitting
branches, but I stopped relying on local tests per follow-up and left
final validation to PR CI.

felixxia-oai · 2026-06-16 01:16:11 +00:00

02dce8eb8d

exec-server: default remote transport to Noise (#26245 )

## Why

The transport in
[openai/codex#26242](https://github.com/openai/codex/pull/26242) needs
to be used by every remote orchestrator-to-executor connection before
JSON-RPC traffic starts.

## Changes

- Generates one executor Noise identity when remote exec-server starts
and registers its public key.
- Creates a harness identity for each physical remote environment
connection.
- Fetches a fresh registry bundle before connecting and validates the
authenticated harness key before completing the executor handshake.
- Multiplexes encrypted logical streams over the existing executor
WebSocket.
- Adds bounded stream, handshake-failure, and reassembly state.
- Adds safe lifecycle diagnostics without logging keys, authorizations,
plaintext, or ciphertext.
- Covers reconnects, replay rejection, validation failure, framing
limits, and encrypted JSON-RPC tool traffic.

## Stack

1. [openai/codex#26242](https://github.com/openai/codex/pull/26242):
Noise channel and relay transport
2. **[openai/codex#26245](https://github.com/openai/codex/pull/26245)**:
remote registration and runtime activation

## Verification

- `just test -p codex-exec-server`
- `just fix -p codex-exec-server`
- `just bazel-lock-check`
- `cargo shear`

---------

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-06-15 17:39:00 -07:00

6e50b22e55

Run core integration tests against a Wine-backed Windows executor (#28401 )

## Why

We want to exercise a Linux app-server against a Windows exec-server
without having to repeat every test case. This approach has some
precedent in the remote Docker test setup.

## What

Run the shared `codex-core` integration suite against Windows
exec-server behavior from Linux. This makes cross-OS path and shell
regressions visible while keeping unsupported cases owned by individual
tests.

- Add `local`, `docker`, and `wine-exec` test environment selection with
legacy Docker compatibility.
- Extend `codex_rust_crate` to generate a sharded Wine-exec variant
using a cross-built Windows server and pinned Bazel Wine/PowerShell
runtimes.
- Teach remote-aware helpers about Windows paths and track temporary
incompatibilities with source-local `skip_if_wine_exec!` calls and
follow-up reasons.

Adam Perry @ OpenAI · 2026-06-16 00:38:41 +00:00

1fe89de576

Preserve hook trust bypass in codex exec threads (#26434 )

Addresses #26383 and #26452

## Summary

`codex exec --dangerously-bypass-hook-trust` printed the bypass warning,
but valid untrusted hooks still did not run.

Exec applied the flag to its initial config, then lost it when
app-server reloaded config for the new or resumed thread.

## Fix

Forward `bypass_hook_trust: true` through the existing thread request
config override for both start and resume.

The override is omitted when the flag is not enabled, preserving normal
trust behavior.
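
A minimal sketch of the conditional override, assuming the thread request carries overrides as a simple key/value map (the real override type is not shown in this description, and `thread_config_overrides` is a hypothetical helper):

```rust
use std::collections::BTreeMap;

/// Builds the config overrides sent with a thread start or resume request.
fn thread_config_overrides(bypass_hook_trust: bool) -> BTreeMap<String, bool> {
    let mut overrides = BTreeMap::new();
    if bypass_hook_trust {
        // Only forwarded when the flag is enabled, so normal trust behavior is untouched.
        overrides.insert("bypass_hook_trust".to_string(), true);
    }
    overrides
}
```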

## Testing

Added:

- A test confirming start and resume preserve the override.
- An end-to-end exec test confirming a `SessionStart` hook runs and
creates a marker file.

Abhinav · 2026-06-15 17:36:21 -07:00

d007b0852a
