codex

[codex] Remove child AGENTS.md prompt experiment (#28993 )

## Why

`child_agents_md` is a disabled, under-development experiment that adds
a second model-visible explanation of hierarchical `AGENTS.md` behavior.
Keeping it leaves unused prompt, configuration, documentation, and test
surface.

## What changed

- remove the `ChildAgentsMd` feature and `child_agents_md` config schema
entry
- remove the hierarchical prompt asset, export, and instruction
injection
- remove feature-specific tests and documentation
- keep the generic unstable-feature warning coverage using
`apply_patch_streaming_events`

Normal project `AGENTS.md` discovery and composition are unchanged.

## Testing

- `just test -p codex-features`
- `just test -p codex-prompts`
- `just test -p codex-core agents_md`
- `just test -p codex-core unstable_features_warning`

pakrym-oai · 2026-06-18 16:13:07 -07:00

bb72e151e5

[codex] Support marketplace plugin manifest fallback (#28789 )

## Summary

Support marketplace plugins whose source directory does not include a
discoverable plugin manifest. Metadata-rich `marketplace.json` entries
now act as fallback plugin manifests for listing, local detail reads,
install, and non-curated cache refresh.

The fallback preserves marketplace-entry plugin fields wholesale, then
adds the small Codex-facing compatibility bridge for presentation
metadata. A real source `plugin.json` always wins when present.

## Details

- Capture flattened marketplace-entry fields into
`MarketplacePluginManifestFallback`, preserving fields such as
`version`, `description`, `skills`, `mcpServers`, `apps`, `hooks`,
`agents`, `commands`, `strict`, `author`, and future manifest fields
without a per-field translation list.
- Bridge Claude-style top-level `displayName`, `author.name`,
`homepage`, and marketplace `category` into Codex's nested `interface`
fields only when the nested values are absent.
- Treat fallback metadata as installable only when the marketplace entry
contributes metadata beyond bare `name` and `source`; existing
missing-manifest behavior remains for metadata-free entries.
- Read local plugin details from the already parsed fallback manifest,
including fallback-declared app and MCP paths, instead of rereading only
an on-disk manifest.
- Pass fallback contents into `PluginStore`, which validates them and
injects `.codex-plugin/plugin.json` into Store's existing atomic copy.
Local marketplace source directories are never mutated, and the fallback
path no longer needs an additional staging directory.
- Keep Git source materialization unchanged; Git clones still use the
existing marketplace source staging area before Store installation.
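
As a rough illustration of the "only when the nested values are absent" bridging rule, here is a minimal sketch. The struct and field names (`Interface`, `MarketplaceEntry`, `developer_name`, `website_url`) are hypothetical stand-ins, not the actual Codex types; only the precedence rule comes from the description above.

```rust
/// Hypothetical, simplified stand-in for Codex's nested `interface` block.
#[derive(Debug, Default, Clone, PartialEq)]
struct Interface {
    display_name: Option<String>,
    developer_name: Option<String>,
    website_url: Option<String>,
    category: Option<String>,
}

/// Hypothetical marketplace entry with Claude-style top-level fields.
#[derive(Debug, Default, Clone)]
struct MarketplaceEntry {
    display_name: Option<String>,
    author_name: Option<String>,
    homepage: Option<String>,
    category: Option<String>,
    interface: Interface,
}

/// Fill each nested field from its top-level counterpart only when the
/// nested value is absent; an existing nested value always wins.
fn bridge_interface(entry: &MarketplaceEntry) -> Interface {
    let nested = entry.interface.clone();
    Interface {
        display_name: nested.display_name.or_else(|| entry.display_name.clone()),
        developer_name: nested.developer_name.or_else(|| entry.author_name.clone()),
        website_url: nested.website_url.or_else(|| entry.homepage.clone()),
        category: nested.category.or_else(|| entry.category.clone()),
    }
}
```

Using `Option::or_else` keeps the precedence explicit per field while the rest of the entry can still be carried over wholesale.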

charlesgong-openai · 2026-06-18 15:49:27 -07:00

772c5c5195

core: load AGENTS.md from foreign environments (#28958 )

## Why

Make it possible to load `AGENTS.md` from remote exec-servers whose OS
differs from the app-server's.

## What

- keep `AGENTS.md` discovery and provenance as `PathUri`, with
root-aware parent and ancestor traversal
- expose lifecycle instruction sources as legacy app-server path strings
in events while retaining `PathUri` internally
- preserve and test mixed POSIX and Windows paths in model context and
TUI status output
- cover remote Windows loading end to end by seeding the Wine prefix
through host filesystem APIs
- fix a bug in `PathUri`'s `parent()` implementation that erased
Windows drive letters
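
To make the drive-letter issue concrete, here is a small sketch of root-aware parent traversal for Windows-style paths. `windows_parent` is a hypothetical helper operating on plain strings; it is not the `PathUri` API, and it assumes an absolute path with a drive prefix.

```rust
/// Hypothetical helper: parent of a Windows-style absolute path such as
/// `C:\Users\me`. The drive root is preserved and has no parent.
fn windows_parent(path: &str) -> Option<String> {
    // Split the drive letter from the rest; paths without a drive prefix
    // are out of scope for this sketch.
    let (drive, rest) = path.split_once(':')?;
    let segments: Vec<&str> = rest
        .split(|c: char| c == '\\' || c == '/')
        .filter(|s| !s.is_empty())
        .collect();
    if segments.is_empty() {
        // Already at the drive root (e.g. `C:\`).
        return None;
    }
    // Drop the last segment but always re-attach the drive root.
    let parent = &segments[..segments.len() - 1];
    Some(format!("{}:\\{}", drive, parent.join("\\")))
}
```

The point is that the drive prefix is treated as part of the root rather than as an ordinary segment, so walking up from `C:\Users` yields `C:\` and stops there.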

Adam Perry @ OpenAI · 2026-06-18 15:06:23 -07:00

dce673905a

[codex] Preserve remote plugin download status errors (#28863 )

## Summary

- preserve the original HTTP status when a remote plugin bundle download
returns a non-success response
- retain at most 8 KiB of the error response body and annotate
truncation or body-read failures
- add regression coverage for an oversized error response

## Root cause

The non-success response path reused the normal size-limited body
reader. When an error response exceeded 8 KiB, that reader returned
`DownloadTooLarge` before the code constructed `DownloadStatus`, masking
the upstream HTTP status and response context.
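
A minimal sketch of the fixed behavior, assuming the error body is already buffered as a byte slice (the real code reads from a response stream). `DownloadStatusError` and `status_error` are illustrative names, not the actual types:

```rust
/// Upper bound on retained error-body bytes (8 KiB, as described above).
const MAX_ERROR_BODY_BYTES: usize = 8 * 1024;

/// Hypothetical error value that keeps the upstream status alongside a
/// bounded copy of the body.
#[derive(Debug, PartialEq)]
struct DownloadStatusError {
    status: u16,
    body: String,
    truncated: bool,
}

/// Build the error from the status first, then bound the body, so an
/// oversized body can never mask the HTTP status.
fn status_error(status: u16, body: &[u8]) -> DownloadStatusError {
    let truncated = body.len() > MAX_ERROR_BODY_BYTES;
    let kept = &body[..body.len().min(MAX_ERROR_BODY_BYTES)];
    DownloadStatusError {
        status,
        body: String::from_utf8_lossy(kept).into_owned(),
        truncated,
    }
}
```

Truncation is recorded as a flag rather than surfaced as a separate error, which is what lets callers still see the original status.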

## Impact

Remote plugin installation failures now retain the actionable upstream
HTTP status without allowing unbounded error bodies into logs.

## Validation

- `just test -p codex-app-server
plugin_install_preserves_status_when_remote_bundle_error_body_is_too_large`
- `just fmt`
- `git diff --check`

xl-openai · 2026-06-18 14:29:01 -07:00

406062c3af

[connectors] Ignore synthetic links for app accessibility (#28770 )

Summary
- Stop treating Codex Apps MCP tools with
`_meta._codex_apps.synthetic_link: true` as evidence that a connector is
accessible in `app/list`.
- Preserve synthetic tools in the agent-facing MCP connector set so they
remain available for install/auth flows.
- Keep the app-list accessibility cache limited to connectors backed by
at least one non-synthetic tool.
- Add focused regression coverage for both sides of the boundary.
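
The two sides of that boundary can be sketched as follows. `Tool`, `agent_connectors`, and `accessible_connectors` are hypothetical names for illustration; only the `synthetic_link` flag comes from the description above.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified view of a Codex Apps MCP tool.
struct Tool {
    connector_id: String,
    /// Mirrors `_meta._codex_apps.synthetic_link`.
    synthetic_link: bool,
}

/// Agent-facing set: every connector with any tool, synthetic or not.
fn agent_connectors(tools: &[Tool]) -> BTreeSet<&str> {
    tools.iter().map(|t| t.connector_id.as_str()).collect()
}

/// `app/list` accessibility: only connectors backed by at least one
/// non-synthetic tool.
fn accessible_connectors(tools: &[Tool]) -> BTreeSet<&str> {
    tools
        .iter()
        .filter(|t| !t.synthetic_link)
        .map(|t| t.connector_id.as_str())
        .collect()
}
```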

Validation
- `just fmt`
- `just test -p codex-core
synthetic_links_are_exposed_to_the_agent_but_not_accessible_in_app_list`
- `git diff --check`
- A crate-wide `just test -p codex-core` run completed with 2,699
passing and 51 unrelated local sandbox/state failures, primarily state
DB migration races (`UNIQUE constraint failed:
_sqlx_migrations.version`).

Alex Daley · 2026-06-18 17:19:24 -04:00

4af7762f01

feat: opt ChatGPT auth into agent identity (#19049 )

## Stack

This is PR 2 of the simplified HAI single-run-task stack:

- [#19047](https://github.com/openai/codex/pull/19047) Agent Identity
assertion and task-registration primitives, including the shared
run-task helper used by existing Agent Identity JWT auth.
- [#19049](https://github.com/openai/codex/pull/19049)
Disabled-by-default ChatGPT auth opt-in that provisions/reuses persisted
Agent Identity runtime auth and its single run task.
- [#19051](https://github.com/openai/codex/pull/19051) Run-scoped
provider auth that uses one backend-owned task id for first-party
inference and compaction requests.

[#19054](https://github.com/openai/codex/pull/19054) collapsed out of
the active stack because the simplified design no longer needs a
separate background/control-plane task helper.

## Summary

This PR adds the disabled-by-default path for normal ChatGPT-login Codex
sessions to obtain Agent Identity runtime auth through the Codex
backend. Existing Agent Identity JWT startup mode remains a separate
path and does not require the feature flag.

What changed:

- adds the experimental `use_agent_identity` feature flag and config
schema entry
- adds an explicit `AgentIdentityAuthPolicy` so call sites choose
`JwtOnly` or `ChatGptAuth` instead of passing a bare boolean
- stores standalone Agent Identity JWT credentials separately from
backend-registered Agent Identity records
- persists the registered Agent Identity record, private key, and single
run task id in `auth.json` so process restarts reuse the same identity
- derives the agent/task registration base URL from ChatGPT/Codex auth
config while keeping JWT JWKS lookup separate
- provisions and caches ChatGPT-derived Agent Identity runtime auth when
`use_agent_identity` is enabled
- reuses the shared run-task registration helper from PR1 rather than
adding a second task-registration path

This PR intentionally does not switch model inference over to
`AgentAssertion` auth. The provider-auth integration lands in the next
PR.

## Testing

- `just test -p codex-login`

Adrian · 2026-06-18 14:05:27 -07:00

ec848dde0e

Emit Trusted MCP App Identity on Tool-Call Items (#27132 )

## Summary

- Add optional `appContext` to app-server MCP tool-call items with
trusted `connectorId`, `linkId`, and `mcpAppResourceUri` metadata.
- Preserve that context across tool-call events, persisted history,
reconnects, and thread resume.
- Keep the deprecated top-level `mcpAppResourceUri` temporarily for
client migration.

The consumer contract is `{ appContext: { connectorId, linkId,
mcpAppResourceUri }, tool }`.

## Validation

- Full GitHub Actions suite passes, including CLA, Bazel tests, clippy,
release builds, and argument-comment lint.

---------

Co-authored-by: martinauyeung-oai <280153141+martinauyeung-oai@users.noreply.github.com>

martinauyeung-oai · 2026-06-18 14:02:54 -07:00

765309d5a6

TUI: improve unified mention selection visibility (#28959 )

## Summary

[@milanglacier reported in
#28653](https://github.com/openai/codex/issues/28653) that the active
mention candidate is hard to distinguish. I suspect [@binbjz’s #28500
report](https://github.com/openai/codex/issues/28500) _(where arrow-key
navigation appeared not to work)_ may describe the same presentation
problem: the selection may have been changing, but the UI was not
showing the active row clearly in their terminal. This PR makes the
following changes to the selection indication behavior:

- Reserve a two-character gutter and mark the active candidate with `> `
for color-agnostic indicator coverage.
- Apply the shared theme-aware accent to the entire selected row for
extra emphasis.
- Update the existing popup snapshot.

Reverse-video styling was considered but avoided because it is
overly dependent on the user’s terminal palette.

<img width="2046" height="482" alt="image"
src="https://github.com/user-attachments/assets/b5eb62c3-fd24-4c09-906e-7bd66913b5c6"
/>

## Testing

- `just test -p codex-tui default_unified_mention_popup_snapshot`
- `just clippy -p codex-tui`
- `just fmt`
- Compiled `codex-cli` and tested the unified mentions picker in the
terminal.

canvrno-oai · 2026-06-18 13:37:04 -07:00

9bcc09f9f7

[codex] Remove hardcoded app ID filters (#28947 )

## Summary

- remove the duplicated originator-specific connector ID denylists
- stop filtering connector directory/accessibility results and
live/cached Codex Apps MCP tools by hardcoded connector ID
- remove the now-unused `codex-login` dependency from
`codex-utils-plugins`
- update regression coverage so formerly blocked connector IDs are
preserved

## Why

The client-side policy was duplicated across crates, used opaque IDs
without ownership or expiry information, and could drift between app
listing and MCP tool behavior. Server-provided visibility,
authorization, plugin discoverability, accessibility, enabled-state
handling, and consequential-tool approval templates remain unchanged.

## Validation

- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `git diff --check`
- confirmed the final diff contains no hardcoded denylist symbols

A targeted `codex-mcp` test build spent an unusually long time in local
compilation/linking. Its first attempt exposed a test-only `PartialEq`
assertion issue, which was corrected. A follow-up non-linking `cargo
check -p codex-mcp --tests` was still running when this draft was
opened; CI should provide the complete Rust validation.

Eric Ning · 2026-06-18 20:29:01 +00:00

29eb434bc5

Make auto-review on-request prompt more proactive (#26496 )

## Why

`on-request` approval policy text is currently tuned for user-reviewed
approvals. For auto-reviewed productivity runs, likely sandbox blocks
should be escalated earlier so commands that need remote services,
authentication, or other out-of-sandbox access do not first fail or hang
inside the sandbox.

## What changed

- Adds a separate `on_request_auto_review.md` permissions prompt
selected for `AskForApproval::OnRequest` with
`ApprovalsReviewer::AutoReview`.
- Keeps the normal user-reviewed `on-request` wording unchanged.
- Makes the `When to request escalation` bullets more explicit about
likely sandbox blocks, network access, remote
auth/cluster/cloud/database access, out-of-sandbox environment access,
git operations that may write lock files, and short-timeout reruns after
likely sandbox-blocked attempts.
- Omits approved command prefix and `prefix_rule` guidance for the
auto-review on-request prompt.
- Adds prompt tests covering the auto-review path, normal on-request
wording, and inline permission request behavior.

maja-openai · 2026-06-18 13:16:14 -07:00

d9dace8a59

Add app-server current-time impl (varlatency 3/n) (#28835 )

## What

Server should request:

```
{
  "id": 42,
  "method": "currentTime/read",
  "params": {
    "threadId": "11111111-1111-1111-1111-aaaaafdc2c11"
  }
}
```

Client should respond with something like:

```json
{
  "id": 42,
  "result": {
    "currentTimeAt": 1781717655
  }
}
```

## Why

Sessions configured with `clock_source = "external"` need a
thread-specific external time source before inference. The system clock
remains the default production provider.

## Validation

- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server --test all
current_time_read_round_trip_adds_reminder_to_model_input`
- `cargo test -p codex-app-server
first_attestation_capable_connection_for_thread_only_uses_thread_subscribers`
- `cargo test -p codex-analytics`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`

Stacked on #28824.

rka-oai · 2026-06-18 13:12:11 -07:00

f4602b7516

apply-patch: carry paths as PathUri (#28854 )

## Why

Allows the model to edit files that are hosted on a different OS than
where app-server is running.

## What

* Use `PathUri` for `apply_patch`-internal data structures
* Limit `PathUri` -> `AbsolutePathBuf` conversion to cases where the
inferred path convention matches the host OS, which allows requiring
valid paths before they are passed to the permissions check
* Adds `PathConvention::path_segments()` for iterating over path
segments regardless of OS
* Handle cross-platform relative paths in path filename parsing for
sniffing a shell
* Ensure we can apply patches in the wine e2e test

Adam Perry @ OpenAI · 2026-06-18 19:31:19 +00:00

0f89dd768c

[codex] Cache plugin metadata for tool suggestions (#27812 )

## Why

`built_tools` runs for every sampling request, and local plugin
discovery was repeatedly rereading plugin manifests, skills, MCP
configuration, and app declarations to build the same tool-suggest
metadata.

That source-derived metadata is stable until the existing plugin manager
reloads its cache. Runtime eligibility still needs to reflect the
current install, disable, policy, app-overlap, and authentication state.

## What changed

- Add a bounded, in-memory tool-suggest metadata cache owned by
`PluginsManager`.
- Key cached metadata by plugin identity and source, while applying
authentication routing each time the metadata is projected.
- Invalidate the metadata alongside the existing loaded-plugin cache,
including its normal configuration, marketplace refresh, and
remote-installed-plugin invalidation paths.
- Guard against an in-flight load repopulating stale metadata after
invalidation.
- Keep marketplace membership and all runtime eligibility filtering live
rather than introducing a separate catalog or revision model.
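
One common way to guard against an in-flight load repopulating stale data is a generation counter. The sketch below illustrates that technique only; it is not necessarily how `PluginsManager` implements it, it omits the size bound, and `MetadataCache` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Hypothetical cache with a generation counter bumped on invalidation.
struct MetadataCache {
    generation: u64,
    entries: HashMap<String, String>,
}

impl MetadataCache {
    fn new() -> Self {
        Self { generation: 0, entries: HashMap::new() }
    }

    /// Record which generation a load started under.
    fn begin_load(&self) -> u64 {
        self.generation
    }

    /// Drop everything and make all in-flight loads stale.
    fn invalidate(&mut self) {
        self.generation += 1;
        self.entries.clear();
    }

    /// Store the result only if no invalidation happened since the load
    /// began; returns whether the value was stored.
    fn finish_load(&mut self, started_at: u64, key: String, value: String) -> bool {
        if started_at != self.generation {
            return false;
        }
        self.entries.insert(key, value);
        true
    }

    fn get(&self, key: &str) -> Option<&String> {
        self.entries.get(key)
    }
}
```

A load that began before `invalidate()` finishes with a stale token and is discarded, so the next request reloads from source.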

## Impact

Repeated sampling requests reuse already-loaded plugin capability
metadata while retaining the existing plugin-manager lifecycle as the
single freshness boundary.

## Validation

- `just test -p codex-core-plugins` — 252 passed
- Added focused coverage for cache invalidation and authentication
reprojection.

Matthew Zeng · 2026-06-18 12:25:07 -07:00

a52a3b5197

current time reminders impl for system clock (varlatency 2/n) (#28824 )

Stacked on #28822.

## Summary

- add a host-injectable current-time provider with a built-in system
implementation
- record UTC developer reminders in history immediately before due model
requests
- keep cadence state per session and force a refresh after compaction

This does NOT include the app server client <-> server clock logic. This
PR is only for the reminder message & system clock that will be used in
prod.

## Testing

- `just test -p codex-core varlatency_`
- `just clippy -p codex-core -p codex-app-server -p codex-mcp-server -p
codex-thread-manager-sample`
- `just fmt`

rka-oai · 2026-06-18 19:18:42 +00:00

752ed90d78

[codex] Make thread store turn filter optional (#28949 )

Make `ListItemsParams::turn_id` optional so callers can list persisted
items across an entire thread or narrow the result to one turn. This
aligns the thread-store API and documentation with thread-wide item
listing while preserving the optional turn-filter behavior for
implementations.
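
A minimal sketch of what an optional turn filter means for an implementation, assuming an in-memory item list. The `Item` type and `list_items` function are hypothetical, and the params struct here models only the `turn_id` field mentioned above.

```rust
/// Hypothetical persisted item.
struct Item {
    turn_id: String,
    text: String,
}

/// Simplified stand-in for the listing params; only `turn_id` is modeled.
struct ListItemsParams {
    turn_id: Option<String>,
}

/// `None` lists items across the whole thread; `Some(id)` narrows the
/// result to one turn.
fn list_items<'a>(items: &'a [Item], params: &ListItemsParams) -> Vec<&'a Item> {
    items
        .iter()
        .filter(|item| {
            params
                .turn_id
                .as_deref()
                .map_or(true, |turn_id| item.turn_id == turn_id)
        })
        .collect()
}
```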

Tom · 2026-06-18 12:13:31 -07:00

01a2df2947

Support openai/form extended form elicitations (#27500 )

# Summary
Allow App Server clients to opt into `openai/form` MCP elicitations.

Gabriel Peal · 2026-06-18 11:54:49 -07:00

21a599fa56

[codex] rollout budget implementation (varlength 2/N) (#28494 )

## Stack

Depends on #28746. This PR implements shared rollout-budget accounting
and model-visible reminders using the configuration defined in #28746.

# Description / Main changes to Core:

`AgentControl` will now be the area where "rollout level" features &
accounting will have to live. It is incorrectly named for this
responsibility, but I think it can hold all the necessary shared state &
features (rollout token budget, multiple-thread interruption
responsibility, etc.).

In this PR, we have one "token ledger" that each thread will subtract
from when sampling. The "charge" will occur when response.completed() is
done and the calculation will be done on the responses api usage
carrier. The calculation will weigh sampling and pre-fill tokens as
specified.

Every time the budget crosses the configured reminder threshold, a
developer message is appended before the thread's next request.

This remaining budget will _always_ be restated/reminded after a
compaction event.

Expiration and fan-out interruption will be in the stacked follow-up
(and also live in Agent Control).

## Reminders

"You have weighted {session_tokens_left} tokens left in the shared
session token budget."

The first request in each thread context receives the current remainder.
Later reminders are emitted after aggregate weighted usage crosses a
configured interval. If several intervals are crossed before a thread
sends another request, Core inserts one reminder with the latest
remainder.

Compaction response usage is charged before the next context starts. The
next reminder is appended after the compaction summary, leaving the
initial context content stable.
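
The accounting described above can be sketched roughly as follows. This is an illustrative model, not the `AgentControl` implementation: `TokenLedger` and its fields are hypothetical, and it covers only weighted charging and interval-crossing reminders (not the per-thread initial reminder or compaction handling).

```rust
/// Hypothetical shared ledger charged by every thread in a rollout.
struct TokenLedger {
    limit: f64,
    used: f64,
    sampling_weight: f64,
    prefill_weight: f64,
    reminder_interval: f64,
    next_reminder_at: f64,
}

impl TokenLedger {
    fn new(limit: f64, reminder_interval: f64, sampling_weight: f64, prefill_weight: f64) -> Self {
        // A non-positive interval would never advance the threshold.
        assert!(reminder_interval > 0.0);
        Self {
            limit,
            used: 0.0,
            sampling_weight,
            prefill_weight,
            reminder_interval,
            next_reminder_at: reminder_interval,
        }
    }

    /// Charge one completed response. Returns the weighted remainder when
    /// at least one reminder interval was crossed, otherwise `None`.
    fn charge(&mut self, sampled_tokens: u64, prefill_tokens: u64) -> Option<f64> {
        self.used += sampled_tokens as f64 * self.sampling_weight
            + prefill_tokens as f64 * self.prefill_weight;
        if self.used < self.next_reminder_at {
            return None;
        }
        // Skip past every crossed interval so only one reminder is emitted
        // with the latest remainder.
        while self.next_reminder_at <= self.used {
            self.next_reminder_at += self.reminder_interval;
        }
        Some((self.limit - self.used).max(0.0))
    }
}
```

With a limit of 1000, an interval of 100, and weights of 1.0 (sampling) and 0.5 (prefill), charging 40 sampled plus 100 prefill tokens uses 90 and triggers nothing; a further 20 sampled tokens crosses the first interval and reports 890 remaining.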

## Tests

Integration coverage verifies:

- weighted output and non-cached input accounting
- initial and periodic reminders
- shared accounting between a root and sub-agent
- post-compaction remainder and message placement

Local checks:

- `just fmt`
- `just test -p codex-core rollout_budget`
- `git diff --check`

The full workspace test suite was not run locally.

rka-oai · 2026-06-18 18:52:19 +00:00

32a696dbac

Add Config for Time Reminders (varlatency 1/n) (#28822 )

## Summary

Example:

> [features.current_time_reminder]
enabled = true
reminder_interval_model_requests = 1
clock_source = "system"

## Testing

- `just test -p codex-core varlatency`
- `just test -p codex-core
lock_contains_prompts_and_materializes_features`
- `just fix -p codex-core -p codex-config -p codex-features`

rka-oai · 2026-06-18 11:39:02 -07:00

df5f122854

Synchronize realtime notification test requests (#28946 )

## What

Deliver the scripted realtime notification batch after the assistant
text append request instead of after the preceding developer text append
request.

## Why

The batch ends with an upstream error that closes the realtime
conversation. When it is emitted after the developer append, it races
the subsequent assistant append: the app-server RPC can acknowledge the
append before its downstream WebSocket send completes, and the test
intermittently observes three requests instead of four.

Making the fake server wait for the assistant append before emitting the
terminal batch establishes the ordering the test asserts without sleeps
or production-code changes.

## Validation

- `git diff --check`
- CI (the failure is timing-dependent and most reproducible in the
Windows Bazel shard)

rka-oai · 2026-06-18 18:05:53 +00:00

636a2594c6

[codex] Fix Windows sandbox runtime ACL refresh (#28943 )

## Why

Codex Desktop repairs sandbox-user read/execute access for binaries
copied to `%LOCALAPPDATA%\OpenAI\Codex\bin`, but Computer Use launches
its bundled Node runtime from `%LOCALAPPDATA%\OpenAI\Codex\runtimes`.

On fresh Windows installations, `CodexSandboxUsers` may therefore be
unable to execute the bundled Node binary. The command runner starts,
but `CreateProcessAsUserW` fails with error 5 (`ACCESS_DENIED`), causing
the Node REPL to exit before Computer Use can discover applications.

This is a follow-up to #21564, which added the original runtime `bin`
ACL repair.

## What changed

- Expand the Codex Desktop runtime ACL roots from only `bin` to both
`bin` and `runtimes`.
- Apply the existing inherited read/execute ACL repair to each runtime
directory when it exists.
- Rename the setup helper to reflect that it now handles multiple
runtime paths.

## Validation

- `cargo fmt -- --check`
- `just test -p codex-windows-sandbox` was run: 113 tests passed and
five environment-dependent legacy execution tests failed because
`CreateRestrictedToken` returned error 87.

iceweasel-oai · 2026-06-18 11:04:30 -07:00

afbb69a2fb

[codex] Initialize exec-server OpenTelemetry at startup (#25019 )

## Summary

- Initialize stderr tracing and the configured OpenTelemetry provider
for local and remote `codex exec-server` startup.
- Instrument the local and remote server entrypoints with a root runtime
span.
- Keep raw Noise environment, registration, and stream identifiers out
of exported spans while preserving them in local debug events.
- Keep telemetry setup in a focused CLI module instead of growing the
top-level command entrypoint.

## Stack

- Previous: none (`#27058` has merged)
- Next: #27466

## Validation

- `just test -p codex-exec-server --lib` (139 passed)
- `just test -p codex-cli --test exec_server` (3 passed)
- `just bazel-lock-check`
- `just fix -p codex-exec-server -p codex-cli`
- `just fmt`

---------

Co-authored-by: Richard Lee <richardlee@openai.com>

starr-openai · 2026-06-18 11:03:42 -07:00

4c7228e423

Fix goal-first live threads missing from thread/list (#28808 )

Fixes #28263.

## Why

When a thread starts with `/goal`, the goal extension can update SQLite
goal state before the thread has any user-turn rollout items.
`thread/list` and `thread/search` rely on persisted listing metadata, so
a goal-first live thread could be absent from app-server listings after
restart even though the goal itself existed.

This regressed when goal handling moved out of core: the core path wrote
the goal update through the live thread rollout path, while the
extension-backed app-server path only updated goal state and emitted the
live notification.

## What

- Add `GoalSetOutcome::thread_goal_updated_item()` so the goal extension
owns the canonical `ThreadGoalUpdated` rollout item shape.
- Expose a narrow `CodexThread::append_rollout_items()` helper that
appends through the live thread and keeps derived SQLite metadata in
sync.
- When app-server sets a goal on an active live thread, persist the goal
update through that live-thread path.
- Add an app-server regression test that starts a live thread with
`thread/goal/set` and verifies it appears in state-DB-only
`thread/list`.

## Verification

- `env -u CODEX_SQLITE_HOME just test -p codex-app-server
goal_first_live_thread_appears_in_state_db_thread_list`

Eric Traut · 2026-06-18 10:50:15 -07:00

e8dd1b45cb

Add turn-scoped context contributions (#28911 )

## Summary
- keep context injection on a single ContextContributor trait
- split context injection into thread-scoped and turn-scoped
contribution methods
- wire turn-scoped fragments into initial context assembly so extensions
can contribute context from turn-local state

jif · 2026-06-18 19:40:28 +02:00

9684ec25be

Scope MCP sandbox metadata to server environment (#28914 )

Scope MCP sandbox metadata to the MCP server's owning environment.

Previously, `codex/sandbox-state-meta` always used the turn's primary
cwd and rebuilt a legacy sandbox policy from that cwd. That can be wrong
for MCP servers owned by a different execution environment.

This now sends the owning environment cwd as a `file:` URI in
`sandboxCwd`, keeps `permissionProfile` as the permission source of
truth, and omits sandbox-state metadata when a non-default server
environment is not selected for the turn. Local/default MCP servers keep
the existing fallback cwd behavior.

Tests:
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just test -p codex-mcp`
- `just test -p codex-core mcp_sandbox_cwd`
- `cargo build -p codex-rmcp-client --bin test_stdio_server`
- `just test -p codex-core
stdio_mcp_tool_call_includes_sandbox_state_meta`

jif · 2026-06-18 19:31:07 +02:00

790213ded0

Pin Windows argument lint to Windows 2022 (#28940 )

## What

Run the Windows argument-comment-lint job on the `windows-2022` hosted
runner instead of the custom Windows runner pool.

## Why

The custom pool recently moved from the Visual Studio 2022 Windows image
to `windows-2025-vs2026`. Since that migration, the job fails while
Bazel materializes LLVM external repository sources, before the argument
lint itself runs. The same failure appears across unrelated PRs.

This narrow change tests GitHub’s recommended mitigation for workloads
that still require the Visual Studio 2022 image:
https://github.com/actions/runner-images/issues/14017

## How

Use the standard `windows-2022` runner for only the Windows
argument-comment-lint matrix entry. No product code or lint behavior
changes.

rka-oai · 2026-06-18 17:16:51 +00:00

47ab51470b

Recover exec process stdin writes (#28895 )

## Summary

Remote stdio MCP servers send tool calls by writing JSON-RPC bytes
through `process/write`.

When the exec-server websocket drops at the wrong time, the remote
process can survive session recovery, but the stdin write can still fail
back to RMCP as a transport send error. RMCP then closes the stdio MCP
transport, so tools like `node_repl` are lost even though the
process/session recovery path is working.

This changes `process/write` to be safe to retry across exec-server
recovery:

- adds a required `writeId` to `process/write`
- retries remote `Session::write` with the same `writeId` after
reconnect
- remembers accepted write ids per process so duplicate retries return
`Accepted` without writing the same bytes to child stdin again
- covers both the client retry path and server-side write id dedupe with
tests

In simple terms:

```text
before:
write to MCP stdin -> websocket closes -> write errors -> RMCP closes node_repl

after:
write to MCP stdin -> websocket closes -> reconnect -> retry same writeId
server either writes once or recognizes it already did
```
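
The server-side dedupe can be sketched as below. `ProcessStdin` and `WriteStatus` are hypothetical names, and the child's stdin is modeled as a byte buffer; the only behavior taken from the description is that a repeated `writeId` returns `Accepted` without writing the bytes again.

```rust
use std::collections::HashSet;

/// Hypothetical response status for `process/write`.
#[derive(Debug, PartialEq)]
enum WriteStatus {
    Accepted,
}

/// Hypothetical per-process state: accepted write ids plus a buffer that
/// stands in for the child's stdin.
struct ProcessStdin {
    accepted: HashSet<String>,
    written: Vec<u8>,
}

impl ProcessStdin {
    fn new() -> Self {
        Self { accepted: HashSet::new(), written: Vec::new() }
    }

    /// Write the bytes only the first time a given id is seen; a retry
    /// with the same id is acknowledged without writing again.
    fn write(&mut self, write_id: &str, bytes: &[u8]) -> WriteStatus {
        if self.accepted.insert(write_id.to_string()) {
            self.written.extend_from_slice(bytes);
        }
        WriteStatus::Accepted
    }
}
```

Because the retry is idempotent on the server, the client can resend after reconnect without knowing whether the first attempt landed.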

jif · 2026-06-18 19:04:26 +02:00

83e6a786a2

Pause active goals before TUI interrupts (#28813 )

Fixes #28104.

## Summary
Active `/goal` turns should leave the persisted goal paused whenever the
TUI interrupts the running turn. The bug in #28104 showed this most
visibly through `Esc`: some interrupt paths aborted the turn without
updating the goal status, so the goal could remain active and continue
automatically.

This change makes `ChatWidget` pause an active goal before the TUI sends
an interrupt from the status-row path, the pending-steer path, `Ctrl+C`,
or a request-user-input overlay. The modal overlay now reports whether a
key will interrupt the turn, which keeps modal `Esc` and `Ctrl+C`
behavior aligned with the normal interrupt paths.

## Manual Testing
Built the local CLI with `just codex --help`, then launched the local
TUI with goals enabled. Started an active `/goal` turn and interrupted
it with `Esc`, then resumed and repeated with `Ctrl+C`; both paths
showed `Goal paused`, the interrupted-conversation message, and the
`Goal paused (/goal resume)` footer. I also stopped the background
terminal and exited the TUI cleanly after the run.

I did not find a reliable standalone manual path to force the
request-user-input overlay case, so that path is covered by the focused
automated test.

Eric Traut · 2026-06-18 08:48:03 -07:00

07298a948c

Avoid sandbox helper in apply_patch approval tests (#28915 )

## Summary
This keeps the apply_patch approval tests focused on approval behavior
instead of macOS sandboxed filesystem helper startup.

The changed cases still force patch approval with `UnlessTrusted`, but
use `DangerFullAccess` after approval so the patch write is direct and
cheap. Workspace-write and sandbox-helper behavior remain covered by the
filesystem and apply_patch sandbox tests.

jif · 2026-06-18 15:13:55 +02:00

5670360009

Add network environment ID plumbing (#28766 )

## Why

Prepare network approval scoping to distinguish execution environments
without changing behavior yet.

## What changed

- Add optional environment IDs to network policy requests.
- Add optional network environment IDs to exec and sandbox request
structs.
- Thread default None values through existing construction points.
- Fix stale constructor call sites that caused the CI compile failures.

## Not included

- Per-environment proxy listeners.
- Network approval cache or prompt behavior changes.
- Ambiguous request attribution handling.

Those behavior changes moved to stacked follow-up #28899.

## Validation

- just fmt
- CI will run tests and clippy

jif · 2026-06-18 14:09:38 +02:00

0369b24d54

[codex] add rollout token budget configuration (varlength 1/N) (#28746 )

## What

This PR defines the structured configuration contract for shared rollout
token budgets (across ALL agent threads under 1 rollout).

```toml
[features.rollout_budget]
enabled = true
limit_tokens = 100000
reminder_interval_tokens = 10000
sampling_token_weight = 1.0
prefill_token_weight = 0.1
```

The reminder interval defaults to 10% of the rollout limit. Sampling and
prefill weights default to `1.0`.
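
A small sketch of how those defaults might resolve, assuming the optional fields are parsed as `Option` values. `RawRolloutBudget`, `ResolvedRolloutBudget`, and `resolve` are hypothetical names rather than the actual `RolloutBudgetConfig` code, and the integer rounding of the 10% default is an assumption.

```rust
/// Hypothetical raw config as parsed from `[features.rollout_budget]`.
struct RawRolloutBudget {
    limit_tokens: u64,
    reminder_interval_tokens: Option<u64>,
    sampling_token_weight: Option<f64>,
    prefill_token_weight: Option<f64>,
}

/// Hypothetical resolved config with every default applied.
#[derive(Debug, PartialEq)]
struct ResolvedRolloutBudget {
    limit_tokens: u64,
    reminder_interval_tokens: u64,
    sampling_token_weight: f64,
    prefill_token_weight: f64,
}

fn resolve(raw: &RawRolloutBudget) -> ResolvedRolloutBudget {
    ResolvedRolloutBudget {
        limit_tokens: raw.limit_tokens,
        // Default: 10% of the rollout limit (integer division is an
        // assumption of this sketch).
        reminder_interval_tokens: raw
            .reminder_interval_tokens
            .unwrap_or(raw.limit_tokens / 10),
        // Both weights default to 1.0.
        sampling_token_weight: raw.sampling_token_weight.unwrap_or(1.0),
        prefill_token_weight: raw.prefill_token_weight.unwrap_or(1.0),
    }
}
```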

## Scope

This PR only defines and validates configuration. It does not track
usage, inject reminders, or stop a rollout. Accounting and reminders are
implemented in the stacked follow-up #28494.

The existing `token_budget` feature remains unchanged. `rollout_budget`
has its own feature key and configuration type.

## Tests

The config test verifies that the structured fields resolve into
`RolloutBudgetConfig` and do not enable the existing `token_budget`
feature.

Local checks:

- `just write-config-schema`
- `just test -p codex-core load_config_resolves_rollout_budget`
- `cargo check -p codex-thread-manager-sample`
- `git diff --check`

The full workspace test suite was not run locally.

rka-oai · 2026-06-18 04:29:47 -07:00

ecc4c30e28

[codex] Pass plugin namespace into skill loading (#28608 )

## What changed

- retain the parsed plugin manifest namespace on loaded plugins
- carry that namespace through `PluginSkillRoot` and `SkillRoot`
- use the provided namespace when qualifying plugin skill names
- include the namespace in the skills cache key

## Why

Plugin loading has already parsed `plugin.json`, but skill parsing
currently walks every `SKILL.md` ancestor and probes/reads the manifest
again to reconstruct the same namespace. Passing the parsed namespace
removes those repeated filesystem calls, which are particularly costly
on remote filesystems.

Context:
https://openai.slack.com/archives/C0ARA9GF5D4/p1781639496496439?thread_ts=1781202444.891669&cid=C0ARA9GF5D4

## Impact

Plugin skill names remain unchanged. A regression test uses a
deliberately different on-disk manifest name to verify that plugin roots
use the provided parsed namespace.

## Validation

- `just test -p codex-core-skills -p codex-core-plugins -p codex-plugin
-p codex-utils-plugins` (352 passed)
- `just fix -p codex-core-skills -p codex-core-plugins -p codex-plugin
-p codex-utils-plugins`
- `just fmt`

Matthew Zeng · 2026-06-18 00:16:46 -07:00

c73296a0f0

[codex] Split plugin and skill warmup tracing (#28605 )

## What changed

- promote plugin config loading to an info-level `plugins_for_config`
span
- promote skill config loading to an info-level `skills_for_config` span
- attach stable OpenTelemetry names to both spans

## Why

`session_init.plugin_skill_warmup` currently combines plugin loading and
skill loading, which makes cold-start traces unable to identify which
phase dominates. These child spans preserve the existing aggregate while
making the two costs independently visible.

Context:
https://openai.slack.com/archives/C0ARA9GF5D4/p1781639496496439?thread_ts=1781202444.891669&cid=C0ARA9GF5D4

## Impact

This is observability-only. It does not change plugin or skill loading
behavior.

## Validation

- `just test -p codex-core-skills -p codex-core-plugins` (347 passed)
- `just fmt`

Matthew Zeng · 2026-06-17 22:45:10 -07:00

2c7802e7cf

unified-exec: retain PathUri in command events (#28780 )

## Why

App-server must report command events containing foreign-platform paths
without changing existing client or rollout path-string formats.

## What changed

- retain `PathUri` through exec command begin/end events
- convert cwd values to `LegacyAppPathString` at the app-server
compatibility boundary
- drop command actions with foreign paths and log them
- serialize rollout-trace cwd values using their inferred native path
representation
- restore Wine coverage for retained Windows cwd values and successful
completion

Adam Perry @ OpenAI · 2026-06-18 05:00:04 +00:00

3931bc2bde

Record more path migration guidance for codex. (#28851 )

Some common themes pulled out of both human and automated reviews from
the last couple of days' migrations to `PathUri` and
`LegacyAppPathString`.

Adam Perry @ OpenAI · 2026-06-18 04:59:42 +00:00

285eff6c3e

[codex] Support plugin manifest path lists (#28790 )

## Summary

Allow plugin manifests to declare `skills` as either a single path
string or an array of path strings in the core plugin loader.

## Why

Some plugin packages need to expose skills from more than one directory.
Before this change, `plugin.json` only accepted a single string for
`skills`, so manifests like this were ignored as an invalid `skills`
shape:

```json
{
  "skills": ["./skills/abc", "./skills/edk"]
}
```

This keeps the existing single-string form working while adding support
for the list form. The final scope is intentionally limited to the core
plugin manifest/load path for `skills`; `apps`, file-backed
`mcpServers`, and the bundled plugin-creator assets are unchanged in
this PR.

## What changed

- Parse `skills` as either a string or an array of strings in
`plugin.json`.
- Store resolved skill paths as a list in `PluginManifestPaths`.
- Load manifest-declared skill roots in addition to the default
`./skills` root.
- Deduplicate exact duplicate skill roots before loading.
- Rely on existing skill-loader dedupe by canonical `SKILL.md` path for
overlapping roots such as `./skills` plus `./skills/abc`.
- Update plugin manifest tests to cover:
  - single string `skills`
  - list of string `skills`
  - duplicate skill roots
  - `./skills` as a manifest path
  - explicit child roots like `./skills/abc` and `./skills/edk`
  - overlapping-root dedupe
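
As a rough illustration of the root collection described above, the sketch below uses only the standard library. `SkillsField` and `skill_roots` are hypothetical names, not the loader's real types, and placing the default `./skills` root first is an assumption. Only exact duplicates are dropped here; overlapping roots are left to the skill loader's canonical-path dedupe.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the parsed manifest `skills` value:
/// either one path string or a list of path strings.
#[derive(Debug, Clone, PartialEq)]
pub enum SkillsField {
    One(String),
    Many(Vec<String>),
}

/// Collect the skill roots to load: the default `./skills` root first
/// (ordering is an assumption), then manifest-declared roots, dropping
/// exact duplicates while preserving first-seen order.
pub fn skill_roots(field: Option<&SkillsField>) -> Vec<String> {
    let declared: Vec<String> = match field {
        None => Vec::new(),
        Some(SkillsField::One(path)) => vec![path.clone()],
        Some(SkillsField::Many(paths)) => paths.clone(),
    };

    let mut seen = HashSet::new();
    let mut roots = Vec::new();
    for root in std::iter::once("./skills".to_string()).chain(declared) {
        // `insert` returns false when the value was already present.
        if seen.insert(root.clone()) {
            roots.push(root);
        }
    }
    roots
}
```

Note that `./skills` plus `./skills/abc` both survive this step, since they are not exact duplicates.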

## Validation

- `just test -p codex-plugin`
- `just test -p codex-core-plugins`
- `just test -p codex-mcp-extension`
- `git diff --check`

charlesgong-openai · 2026-06-17 21:33:53 -07:00

e12dd73b7d

Expose selected namespaces as direct model tools (#28825 )

## Why

Some tools, such as history and notes, must remain top-level when MCP
deferral is enabled while staying unavailable through code-mode `exec`.

## What changed

- Added `features.code_mode.direct_only_tool_namespaces`.
- Classified matching MCP tools as `DirectModelOnly`.
- Kept those tools top-level in `code_mode_only`.
- Excluded them from `tool_search` deferral and the nested `exec`
surface.
- Updated the generated config schema.

## Validation

- `code_mode_only_exposes_direct_model_only_mcp_namespaces`
- `load_config_resolves_code_mode_config`

Won Park · 2026-06-18 04:07:54 +00:00

78a9e169bb

Refresh signed exec-server URLs on reconnect (#28374 )

## Summary

- add a provider API that supplies a fresh signed WebSocket URL for each
remote exec-server connection
- refresh the signed URL after disconnects and retry once when a
handshake returns `401 Unauthorized`
- allow `EnvironmentManager` consumers to register remote environments
backed by the URL provider
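
A minimal sketch of the refresh-and-retry-once flow, assuming a synchronous handshake for brevity. `SignedUrlProvider`, `CountingProvider`, `HandshakeError`, and `connect_with_refresh` are hypothetical names and do not reflect the actual provider API; the URL is a placeholder.

```rust
/// Hypothetical handshake outcome; only `Unauthorized` triggers a retry.
#[derive(Debug, PartialEq)]
pub enum HandshakeError {
    Unauthorized,
    Other(String),
}

/// Hypothetical provider: every call returns a freshly signed URL.
pub trait SignedUrlProvider {
    fn fresh_url(&mut self) -> String;
}

/// Toy provider that numbers the URLs it issues (placeholder host).
pub struct CountingProvider {
    pub issued: u32,
}

impl SignedUrlProvider for CountingProvider {
    fn fresh_url(&mut self) -> String {
        self.issued += 1;
        format!("wss://example.invalid/exec?sig={}", self.issued)
    }
}

/// Fetch a fresh URL for the connection attempt; if the handshake is
/// rejected as unauthorized, fetch another URL and retry exactly once.
pub fn connect_with_refresh<P, F>(
    provider: &mut P,
    mut handshake: F,
) -> Result<String, HandshakeError>
where
    P: SignedUrlProvider,
    F: FnMut(&str) -> Result<(), HandshakeError>,
{
    let url = provider.fresh_url();
    match handshake(&url) {
        Ok(()) => Ok(url),
        Err(HandshakeError::Unauthorized) => {
            let retry_url = provider.fresh_url();
            handshake(&retry_url).map(|()| retry_url)
        }
        Err(other) => Err(other),
    }
}
```

Calling `connect_with_refresh` again after a disconnect would obtain a new URL in the same way, since nothing is cached between calls.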

## Tests

- `just test -p codex-exec-server -E
'test(remote_websocket_client_refreshes_url_after_unauthorized_handshake)
| test(remote_websocket_client_refreshes_url_after_disconnect)'` — 2
passed
- `cargo check -p codex-core-api` — passed
- `just fix -p codex-exec-server` — passed
- `just fix -p codex-core-api` — no test targets; no-op
- `just fmt` — passed
- `just test -p codex-exec-server` — 187 passed; 32 unrelated macOS
sandbox tests could not invoke nested `sandbox-exec` (`Operation not
permitted`)

Anton Panasenko · 2026-06-17 20:58:48 -07:00

ac3fe64100

[codex] Support assistant realtime append text (#28836 )

## Why

Frontend realtime voice continuity needs to replay a tiny
previous-session overlap as actual conversation items, including
assistant text. The app-server `thread/realtime/appendText` API already
carries a role through to the Rust realtime websocket layer, but the
shared role enum only accepted `user` and `developer`.

## What Changed

- Added `assistant` to `ConversationTextRole` and regenerated the
app-server schema/type fixtures.
- Added `output_text` as a realtime conversation content type.
- Updated realtime websocket item creation so assistant appendText emits
`content: [{ type: "output_text", text }]`, while user and developer
continue to emit `input_text`.
- Updated app-server docs and tests to cover assistant appendText
alongside the existing developer role behavior.
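
The role-to-content-type mapping can be sketched as follows. The enum mirrors the three roles named above, but `content_type_for` is a hypothetical helper rather than the function used in the realtime websocket layer.

```rust
/// Simplified copy of the role enum described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConversationTextRole {
    User,
    Developer,
    Assistant,
}

/// Hypothetical helper: choose the realtime content type for a role.
/// Assistant text is replayed as output; user and developer text
/// continue to be sent as input.
pub fn content_type_for(role: ConversationTextRole) -> &'static str {
    match role {
        ConversationTextRole::Assistant => "output_text",
        ConversationTextRole::User | ConversationTextRole::Developer => "input_text",
    }
}
```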

## Validation

- `just write-app-server-schema`
- `just fmt` (first sandboxed attempt failed because `uv` could not
access `~/.cache/uv`; reran with filesystem access and passed)
- `just test -p codex-api` passed: 126/126
- `just test -p codex-app-server-protocol` passed: 239/239, including
generated JSON/TypeScript fixture checks
- `just test -p codex-app-server` was started locally but stopped per
request after unrelated local sandbox/Seatbelt failures (`sandbox-exec:
sandbox_apply: Operation not permitted`) and one missing local `codex`
binary failure; CI should be faster and more authoritative for the full
suite.

guinness-oai · 2026-06-17 20:57:13 -07:00

e922f46a0f

[codex] control automatic realtime handoff delivery (#27986 )

## What

Built on the realtime speech-control plumbing merged in #27917.

- Add optional `codexResponseHandoffPrefix` to `thread/realtime/start`.
- Apply that prefix only to automatic V1 commentary sent through
`conversation.handoff.append`; final answers remain unprefixed.
- Add opt-in `clientManagedHandoffs`. When true, core suppresses
automatic response handoffs and completion output so delivery is
controlled by explicit client append APIs.
- Preserve existing automatic behavior by default.
`codexResponsesAsItems: true` continues to select item routing when
client-managed mode is disabled.
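
One way to read the delivery rules above is the sketch below. `ResponsePhase` and `automatic_handoff` are hypothetical names, and joining the prefix by plain concatenation is an assumption; item routing via `codexResponsesAsItems` is not modeled.

```rust
/// Hypothetical phase marker for an automatic response.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResponsePhase {
    Commentary,
    FinalAnswer,
}

/// Decide what, if anything, is appended automatically.
/// Returns `None` when the client manages handoffs itself.
pub fn automatic_handoff(
    client_managed: bool,
    prefix: Option<&str>,
    phase: ResponsePhase,
    text: &str,
) -> Option<String> {
    if client_managed {
        // Delivery is left to explicit client append APIs.
        return None;
    }
    match (phase, prefix) {
        // Only commentary receives the prefix (assumed: concatenation).
        (ResponsePhase::Commentary, Some(prefix)) => Some(format!("{}{}", prefix, text)),
        // Final answers, and commentary without a prefix, pass through.
        _ => Some(text.to_string()),
    }
}
```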

## Why

Voice clients need two delivery policies: automatic background context
with silent commentary instructions and fully client-owned handoffs.
Phase-aware prefixing keeps routine commentary silent without
suppressing the final answer, while client-managed mode lets an app
decide exactly which updates to append.

## Validation

- `just fmt`
- `cargo test -p codex-app-server-protocol
serialize_thread_realtime_start`
- `RUST_MIN_STACK=16777216 cargo test -p codex-core --test all
conversation_handoff_persists_across_item_done_until_turn_complete`
- `RUST_MIN_STACK=16777216 cargo test -p codex-app-server --test all
webrtc_v1_client_managed_handoffs_disable_automatic_output`
- `RUST_MIN_STACK=16777216 cargo test -p codex-app-server --test all
webrtc_v1_final_automatic_handoff_omits_silent_prefix`
- `cargo build -p codex-cli --bin codex`
- Local Codex Apps compatibility check: 43 focused webview tests passed,
and a live voice session routed through the source-built app-server.

The explicit `RUST_MIN_STACK` avoids a macOS Tokio test-worker stack
overflow seen with the default test environment.

jiayuhuang-openai · 2026-06-18 02:22:29 +00:00

683bd170dc

[codex] Use unique IDs for realtime-routed turns (#28826 )

## Why

A durable realtime voice orchestrator can reconnect and resume through
multiple fresh `Session` instances. Realtime handoffs were using the
Session-local `auto-compact-N` counter as their turn identity, but that
counter restarts at zero for every resumed Session. The durable thread
could therefore accumulate duplicate turn IDs, violating the uniqueness
assumptions made by app-server and web clients. In Codex Apps, a new
delegated response stream could be attached to an older turn with the
same ID, placing live output higher in history and putting turn-scoped
actions at risk.

Persisted rollout and reconstructed model-context order were already
correct because raw response items remain append-only and chronological.
This change restores unique identity for reconstructed and live turn
surfaces.

## What changed

- Generate a UUIDv7 specifically for each realtime-routed delegation.
- Leave the existing `auto-compact-N` identity path unchanged for actual
internal auto-compaction turns.
- Extend the inbound realtime handoff integration test to require a UUID
turn ID from `turn/started`.

## Verification

- `just test -p codex-core inbound_handoff_request_starts_turn`
- `just fix -p codex-core`
- `just fmt`

guinness-oai · 2026-06-17 19:13:38 -07:00

a306ac4ee3

fix(install): support older awk checksum parsing (#28784 )

## Why

The standalone installer validates package checksums with an awk
interval expression. Older mawk releases do not support that expression,
so they reject valid 64-character digests and report that the release
manifest is missing an entry. This affects both x64 and ARM64 systems on
common Debian-derived environments.

Fixes #24219.

## What Changed

Replace the awk interval expression with an explicit length check plus
rejection of non-hexadecimal characters. This preserves the existing
SHA-256 validation and lowercase normalization while working with older
awk implementations.
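
The predicate logic, expressed in Rust purely for illustration (the installer itself uses awk, and `normalize_sha256` is a hypothetical name): check the length explicitly, reject any non-hexadecimal character, then lowercase.

```rust
/// Validate a SHA-256 hex digest without a regex interval such as `{64}`.
/// Returns the lowercase digest, or `None` if the input is not exactly
/// 64 hexadecimal characters.
pub fn normalize_sha256(digest: &str) -> Option<String> {
    // Explicit length check replaces the interval expression.
    if digest.len() != 64 {
        return None;
    }
    // Reject anything outside 0-9, a-f, A-F.
    if !digest.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(digest.to_ascii_lowercase())
}
```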

## How to Test

1. Build and run the checksum predicate with mawk 1.3.4 20121129.
2. Confirm the old interval predicate rejects a valid 64-character
digest.
3. Confirm the updated predicate accepts that digest.
4. Put the old mawk binary first on PATH as awk and run
scripts/install/install.sh with an isolated HOME, CODEX_HOME, and
CODEX_INSTALL_DIR.
5. Confirm Codex installs successfully and the installed binary reports
version 0.140.0.
6. Verify the predicate rejects wrong-length digests, non-hexadecimal
digests, and entries for another asset while accepting uppercase
hexadecimal digests.

Felipe Coury · 2026-06-17 22:12:02 -04:00

f22d15b679

[codex] Add optional IDs to response items (#28812 )

## Why

`ResponseItem` variants do not have a consistent internal ID shape: some
variants carry required IDs, some carry optional IDs, and some cannot
represent an ID at all. The existing fields also use inconsistent serde,
TypeScript, and JSON-schema annotations. A single enum-level access path
is needed before history recording can assign and retain IDs.

This PR establishes that internal model only. It intentionally does not
generate or serialize IDs; allocation and wire persistence are isolated
in the stacked follow-up.

## What changed

- Give every concrete `ResponseItem` variant an `Option<String>` ID
field.
- Apply the same internal-only annotations to every ID field:
`#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and
`#[schemars(skip)]`.
- Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared
accessors.
- Preserve IDs when history items are rewritten for truncation.
- Adapt consumers that previously assumed reasoning and image-generation
IDs were required.
- Regenerate app-server schemas so the hidden fields are represented
consistently.

The serde catch-all `ResponseItem::Other` remains ID-less because it
must remain a unit variant.
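
A simplified sketch of the enum-level access path. The variants and payload fields here are illustrative stand-ins, the serde/ts/schemars annotations are omitted, and the exact signatures of `id()` and `set_id()` are assumptions.

```rust
/// Hypothetical, reduced version of the enum: two variants that carry an
/// optional ID, plus the ID-less catch-all.
#[derive(Debug, Clone, PartialEq)]
pub enum ResponseItem {
    Message { id: Option<String>, text: String },
    Reasoning { id: Option<String>, summary: String },
    /// Unit variant, so it cannot carry an ID.
    Other,
}

impl ResponseItem {
    /// Shared read accessor across all variants.
    pub fn id(&self) -> Option<&str> {
        match self {
            ResponseItem::Message { id, .. } | ResponseItem::Reasoning { id, .. } => {
                id.as_deref()
            }
            ResponseItem::Other => None,
        }
    }

    /// Shared write accessor; a no-op for the ID-less variant.
    pub fn set_id(&mut self, new_id: Option<String>) {
        match self {
            ResponseItem::Message { id, .. } | ResponseItem::Reasoning { id, .. } => {
                *id = new_id;
            }
            ResponseItem::Other => {}
        }
    }
}
```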

## Test plan

- `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace
-p codex-image-generation-extension`
- `just test -p codex-protocol`
- `just test -p codex-app-server-protocol`
- `just test -p codex-api -p codex-rollout-trace -p
codex-image-generation-extension`
- `just test -p codex-core event_mapping`

pakrym-oai · 2026-06-17 18:27:43 -07:00

dbd2857f4b

feat(exec-server): add Noise rendezvous environment (#28774 )

## Why

Codex can run a remote exec server through the Noise relay, but the
normal environment-manager path could not establish an
environment-registry-backed harness connection. Signed rendezvous URLs
and harness authorizations are short-lived, so reconnects must fetch a
fresh bundle instead of retaining stale connection credentials. A
stalled registry request must also fail within the regular remote
connection deadline, without exposing these credentials in debug logs.

Issue: N/A (internal environment-service integration).

## What Changed

- Add environment-manager configuration for a registry-backed Noise
rendezvous environment.
- Request a fresh bundle from
`/cloud/environment/{environment_id}/connect` for every physical harness
connection, using the existing 10-second remote connection timeout.
- Share the Environment Registry register, connect, and validate wire
payloads through `codex-exec-server` and `codex-core-api`.
- Redact the signed rendezvous URL and harness authorization from the
public connect response's `Debug` output.
- Add focused coverage for registry bundle retrieval, stalled requests,
and credential redaction.
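
Credential redaction of this kind is typically done with a hand-written `Debug` implementation. The sketch below shows the pattern; `ConnectResponse` and its field names are hypothetical and not the actual wire model.

```rust
use std::fmt;

/// Hypothetical connect-response shape; field names are illustrative.
pub struct ConnectResponse {
    pub environment_id: String,
    pub rendezvous_url: String,
    pub harness_authorization: String,
}

impl fmt::Debug for ConnectResponse {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print non-sensitive fields as usual and replace the
        // short-lived credentials with a fixed marker.
        f.debug_struct("ConnectResponse")
            .field("environment_id", &self.environment_id)
            .field("rendezvous_url", &"<redacted>")
            .field("harness_authorization", &"<redacted>")
            .finish()
    }
}
```

Because `Debug` is implemented manually rather than derived, adding a new sensitive field later does not expose it by default.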

Anton Panasenko · 2026-06-17 17:20:53 -07:00

c274a83f8b

path-uri: decouple native path parsing (#28778 )

## Why

`PathUri::join` should not depend on the app-server compatibility
wrapper `LegacyAppPathString` to parse native paths. Native path parsing
belongs to the URI abstraction that it constructs.

## What

Move platform-independent native path parsing into the root `PathUri`
module. `PathUri::join` and `LegacyAppPathString` now share the
crate-private `PathUri::from_absolute_native_path` constructor.

Adam Perry @ OpenAI · 2026-06-17 22:17:07 +00:00

e7b6e0d859

[codex] trace tools build latency (#28782 )

Add more tracing spans around tool building.

Owen Lin · 2026-06-17 14:53:54 -07:00

38211a3ff8

bazel: refresh expired macOS SDK pin (#28791 )

## Why

macOS Bazel jobs fail before target analysis because the pinned Apple
CDN object now returns HTTP 403.

## What

Update the pin to Apple's currently live macOS 26.5 Command Line Tools
package, including its checksum and SDK extraction path.

## Validation

- Built `@macos_sdk//sysroot` from a fresh Bazel output root.
- Regenerated and checked `MODULE.bazel.lock`; it remains unchanged.

Adam Perry @ OpenAI · 2026-06-17 21:48:17 +00:00

243243ab8f

fix(plugins): support root local marketplace plugins (#28771 )

## Summary
- allow local marketplace `source.path: "."` and `source.path: "./"` to
resolve to the marketplace root
- keep `""` invalid and preserve rejection of non-root paths without
`./` plus non-normal/traversal paths
- add focused regression coverage for repo-root plugin layouts and
rejected local paths
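
The path rules above could look roughly like this. `resolve_local_source` is a hypothetical function written against `std::path` only, and it is not the plugin loader's actual code.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve a local marketplace `source.path` against the marketplace root.
/// Returns `None` for paths this sketch treats as invalid.
pub fn resolve_local_source(marketplace_root: &Path, source_path: &str) -> Option<PathBuf> {
    // The empty string stays invalid.
    if source_path.is_empty() {
        return None;
    }
    // "." and "./" both mean the marketplace root itself.
    if source_path == "." || source_path == "./" {
        return Some(marketplace_root.to_path_buf());
    }
    // Any other path must start with "./".
    let relative = source_path.strip_prefix("./")?;

    let mut resolved = marketplace_root.to_path_buf();
    for component in Path::new(relative).components() {
        match component {
            Component::Normal(part) => resolved.push(part),
            // Reject `..`, absolute roots, prefixes, and other
            // non-normal components.
            _ => return None,
        }
    }
    Some(resolved)
}
```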

## Tests
- `RUSTUP_TOOLCHAIN=stable just fmt`
- `RUSTUP_TOOLCHAIN=stable just test -p codex-core-plugins`
- `RUSTUP_TOOLCHAIN=stable just fix -p codex-core-plugins`

Note: plain pinned-toolchain `just fmt` was blocked locally by a rustup
`clippy` component conflict, so validation used the working stable 1.95
toolchain fallback.

Casey Chow · 2026-06-17 14:06:42 -07:00

a760b63f83

exec-server: expose environment registry payloads (#28651 )

## Why

Services that proxy the exec-server environment registry endpoints need
to deserialize and forward the same Noise registration and harness-key
validation payloads. Those wire models currently live as private,
serialize-only structs in `exec-server`, which forces consumers to
duplicate the contract.

## What changed

- Add owned serde models for registration and harness-key validation
requests and responses.
- Use those models in the existing exec-server registry client.
- Re-export the models from `codex-exec-server` and `codex-core-api`.
- Keep the harness authorization request free of a derived `Debug`
implementation so it is not accidentally logged.

## Testing

- Focused exec-server registration and harness-key validation tests: 2
passed.
- `cargo check -p codex-core-api`

The full `codex-exec-server` suite compiled and ran 254 tests: 222
passed, while 32 existing filesystem sandbox tests could not run under
the nested macOS sandbox (`sandbox_apply: Operation not permitted`).

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-06-17 13:27:25 -07:00

a0586ad12d

[codex] Track plugin install and import telemetry failures (#28731 )

## Summary
- Track plugin install failures through the unified
`codex_plugin_install_failed` event for local installs, remote install
preflight failures, bundle failures, and remote catalog/backend
failures.
- Send classified `error_type` values in plugin install failure
analytics instead of raw error strings.
- Stop sending raw external-agent import errors in analytics while
preserving raw failure details in app-facing import
notifications/history.
- Keep raw plugin/migration diagnostics in `tracing::warn!` logs.
- Keep remote failure plugin names as the existing local placeholder
(`unknown`) and remove the extra telemetry plugin-name override.
- Change `ExternalAgentConfigImportParams.source` from a generated enum
to `string | null`, with legacy `claudeCode` / `claudeCowork` inputs
normalized to existing analytics values.

## Testing

charlesgong-openai · 2026-06-17 13:16:34 -07:00

3959ab0ffc

unified-exec: preserve PathUri through exec-server (#28681 )

## Why

App-server should be able to handle "foreign" OS paths in unified_exec
working directories so that, for example, a Linux app-server can run
processes on a Windows exec-server.

## What

Convert the core unified_exec cwd values to use `PathUri`.

Adds fallible path conversion in several places to limit the scope of
this change. Errors from converting `PathUri` to an `AbsolutePathBuf`
are suppressed only when the turn is configured with no sandboxing at
all, which lets testing proceed in that mode.

Future changes to apply_patch and sandboxing will clean up these error
paths.

A tool's cwd is resolved from joining a model-provided workdir to the
environment's cwd. When using `AbsolutePathBuf::join()`, an
absolute-path workdir would overwrite the environment's cwd and we would
resolve permissions/sandboxing against the model-provided path. This
change extends `PathUri::join()` to also treat an absolute rhs as an
override of the base/lhs.
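
The join rule can be illustrated with a deliberately reduced, POSIX-only sketch on plain strings. `join_workdir` is hypothetical; the real `PathUri::join` also handles foreign-platform paths, which this does not attempt.

```rust
/// Join a model-provided workdir onto the environment's cwd.
/// Assumes POSIX-style paths only.
pub fn join_workdir(base: &str, workdir: &str) -> String {
    if workdir.starts_with('/') {
        // An absolute right-hand side overrides the base entirely.
        return workdir.to_string();
    }
    if workdir.is_empty() {
        return base.to_string();
    }
    // Relative workdirs are appended to the base.
    format!("{}/{}", base.trim_end_matches('/'), workdir)
}
```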

This also removes some coverage from the remove_env_windows tests until
a follow-up converts foreign paths in command exec events correctly.

## Breaking Changes

When using `AbsolutePathBuf::join()` for workdir resolution, we ended up
resolving tilde-prefixed paths against the app-server's `$HOME`, e.g.
`~/foo/bar` becomes `/home/anp/foo/bar`. It's difficult to do this with
`PathUri` joining, so after offline discussion this PR no longer
implements it.

A quick check of some power users' rollouts suggests that models don't
actually generate home-prefixed absolute working directories for their
spawns, so this shouldn't have any real blast radius.

Adam Perry @ OpenAI · 2026-06-17 19:36:16 +00:00

5867b529ae

7620 Commits