codex

protocol: remove submission-side serde from Op (#26674)

## Why

Submission-side `Op` payloads are now an internal handoff inside the
Rust codebase, so keeping a stable serde contract there adds complexity
without a real wire consumer.

## What changed

- remove serde/schema annotations from `Submission`, `Op`, and
submission-only payload types like thread settings overrides, additional
context, realtime conversation params, `TurnEnvironmentSelection`, and
`RequestUserInputResponse`
- delete the `Op` serialization tests and the now-unused double-option
prompt serde helper
- keep event/API-facing serialization where it is still required, and
serialize the `request_user_input` tool output from its wire payload
instead of the core response struct
- update `protocol_v1.md` to call out that events remain the serialized
transport surface while submission payloads are implementation details

## Testing

- `just test -p codex-protocol`
- `cargo check -p codex-core -p codex-app-server -p codex-thread-store`
- `just test -p codex-core request_user_input`

pakrym-oai · 2026-06-05 15:41:13 -07:00

470c20bf98

[2 of 2] Finish moving goal runtime to extension (#26548)

## Stack

1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align
goal extension with core behavior
2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move
goal runtime to extension

## Why

This PR completes the switch of the goal behavior to the
extension-backed runtime and removes the old core goal implementation.

## What Changed

- Installs the goal extension for app-server `ThreadManager` sessions.
- Routes app-server thread goal `get`, `set`, and `clear` through
`GoalService`.
- Uses thread-idle lifecycle emission after goal resume and snapshot
ordering so the extension can decide whether to continue the goal.
- Forwards extension goal updates through a FIFO async app-server
notification path so backpressure neither drops nor reorders them.
- Keeps review turns from enabling goal runtime behavior.
- Plans extension tools before dynamic tools so built-in goal tool names
keep their old precedence when goals are enabled.
- Removes the old core goal runtime, core goal tool handlers, and core
goal tool specs.
- Updates tests that were coupled to the core-owned goal runtime while
leaving the legacy `<goal_context>` compatibility path in core for old
threads.
- Removes the stale cargo-shear ignore now that `codex-goal-extension`
is used by the workspace.
- Keeps realtime event matching exhaustive after removing the old
goal-specific realtime text path.


## Validation

- Ran manual `/goal` runs in TUI. Validated time accounting matched
wall-clock time and goal lifecycle state transitions.

Eric Traut · 2026-06-05 14:17:30 -07:00

479a14cf59

[codex] Bound WSL local curated discovery (#26669)

## Context
The installed-app suggestion expansion added in #24996 reads plugin
details for trusted file-backed marketplace candidates because the list
response does not include app ids. On Windows-backed WSL mounts, the
local `openai-curated` checkout lives under `$CODEX_HOME/.tmp/plugins`,
and those per-plugin detail reads can be very slow.

Remote curated already has cached app ids, so it does not need the same
local filesystem traversal.

## Summary
- Keep only the WSL Windows-backed local `openai-curated` checkout on
the legacy fallback/configured discovery path.
- Preserve installed-app expansion for non-WSL file-backed marketplaces
and remote curated.
- Add focused tests for the WSL local curated path predicate.

## Test
- `just test -p codex-core-plugins discoverable`
- `just test -p codex-core plugins::discoverable::tests`

xl-openai · 2026-06-05 14:09:40 -07:00

679cc08445

Add JSON output for plugin subcommands (#26631)

## Summary
- Follow-up to #25330 and #26417
- Add `--json` output for `codex plugin add` and `codex plugin remove`
- Add `--json` output for `codex plugin marketplace
add/list/upgrade/remove`
- Keep existing human-readable output unchanged
- Keep existing error handling/stderr behavior unchanged; `--json`
changes successful stdout output only
- Align marketplace add/remove JSON field names with the existing
app-server protocol shape
- Add CLI coverage for plugin and marketplace JSON outputs

## Validation
- `just fmt`
- `just fix -p codex-cli`
- `just test -p codex-cli`

mpc-oai · 2026-06-05 14:40:31 -05:00

bb7d19bc24

Speed up TUI startup by reusing plugin discovery (#26469)

## Summary

TUI startup loads related plugin data from `hooks/list`, session MCP
initialization, and plugin skill warmup. These paths repeated filesystem
discovery and emitted the same plugin warnings, while `hooks/list` and
account/model bootstrap ran serially.

This change:

- Reuses one immutable plugin load outcome across startup consumers.
- Keys the cache only on plugin-relevant configuration.
- Single-flights concurrent plugin loads and prevents invalidated loads
from repopulating the cache.
- Runs hook discovery and account/model bootstrap concurrently.
- Preserves configuration-migration ordering, hook review behavior, and
accurate startup telemetry.
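
The rule that an invalidated load must not repopulate the cache can be sketched with a generation counter. This is a minimal illustration, not the code from this change: `PluginCache`, `PluginKey`, `PluginLoadOutcome`, and the ticket scheme are hypothetical names, and single-flighting of concurrent loads is left out.

```rust
use std::sync::Arc;

/// Hypothetical cache key: only configuration that affects plugin loading.
#[derive(Clone, Debug, PartialEq, Eq)]
struct PluginKey {
    plugin_dirs: Vec<String>,
}

/// Hypothetical immutable load result shared by all startup consumers.
#[derive(Debug)]
struct PluginLoadOutcome {
    plugin_names: Vec<String>,
}

#[derive(Default)]
struct PluginCache {
    /// Bumped on every invalidation.
    generation: u64,
    entry: Option<(PluginKey, Arc<PluginLoadOutcome>)>,
}

impl PluginCache {
    /// Returns the shared outcome when the cached key matches.
    fn get(&self, key: &PluginKey) -> Option<Arc<PluginLoadOutcome>> {
        match &self.entry {
            Some((cached_key, outcome)) if cached_key == key => Some(Arc::clone(outcome)),
            _ => None,
        }
    }

    /// Records the generation observed when a load starts.
    fn begin_load(&self) -> u64 {
        self.generation
    }

    /// Drops the cached entry and makes all in-flight tickets stale.
    fn invalidate(&mut self) {
        self.generation += 1;
        self.entry = None;
    }

    /// Stores the outcome only if no invalidation happened since `ticket`.
    /// Returns whether the outcome was stored.
    fn finish_load(&mut self, ticket: u64, key: PluginKey, outcome: PluginLoadOutcome) -> bool {
        if ticket != self.generation {
            return false;
        }
        self.entry = Some((key, Arc::new(outcome)));
        true
    }
}
```

A load that started before `invalidate()` still completes, but its result is discarded instead of being served to later consumers.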

In 10 alternating release-build launches in the Ruff repository with the
existing `~/.codex` configuration, median time to the first editable
composer decreased from 833ms to 504ms. The branch was faster in 9 of 10
pairs, with a paired median improvement of 312ms.

Charlie Marsh · 2026-06-05 15:32:43 -04:00

055c7a7c53

Use state DB first for resume --last (#26462)

## Summary

`codex resume --last` currently lists sessions by updated time using
scan-and-repair. Updated-time filesystem listing must stat every rollout
before applying the cwd, provider, and source filters, so startup scales
with the entire local session history.

This change queries the state DB first for the latest matching session.
For local workspaces, we only accept the indexed result when its rollout
path still exists; otherwise we retry with scan-and-repair. The same
lookup path is shared by `fork --last`.
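
The lookup order described above can be sketched as a small function. This is an illustration under stated assumptions: `find_last_rollout` and its closure parameters are hypothetical, and falling back to the scan when the state DB returns no row at all is an assumption that the description does not spell out.

```rust
use std::path::{Path, PathBuf};

/// Accepts the indexed rollout only if its file still exists;
/// otherwise retries with the slower scan-and-repair listing.
fn find_last_rollout(
    indexed: Option<PathBuf>,
    rollout_exists: impl Fn(&Path) -> bool,
    scan_and_repair: impl FnOnce() -> Option<PathBuf>,
) -> Option<PathBuf> {
    match indexed {
        // Fast path: the state DB row points at a rollout that is still on disk.
        Some(path) if rollout_exists(&path) => Some(path),
        // Slow path: stale row, or (assumed) no row at all.
        _ => scan_and_repair(),
    }
}
```

Passing the existence check and the scan as closures keeps the sketch testable without touching the filesystem.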

I benchmarked the same `thread/list` request used by `resume --last` in
my local `ruff` checkout against a Codex home with 2,599 active rollouts
totaling 3.7 GiB, including 90 Ruff threads.

Across five fresh release app-server processes with warm filesystem
caches, the state-DB-only lookup had median latency of 0.37-0.44 ms,
while scan-and-repair had median latency of 139-162 ms. First-request
latency was 0.7-1.7 ms versus 142-185 ms.

So this **removes roughly 140-160 ms from the `resume --last` lookup**
on this machine, and makes that lookup over 300x faster.

The tradeoff is that this does leave two correctness gaps:

- If a newer matching rollout is missing from SQLite but an older
matching row exists, the fast path resumes the older thread and never
falls back to the filesystem scan.
- If an existing row has stale filter or ordering metadata, the fast
path can select a different thread from scan-and-repair. The rollout
tests already demonstrate this for stale cwd metadata: state-DB-only
returns the stale match, while scan-and-repair removes and repairs it.

So you could end up seeing the "wrong" result in cases like:

1. A crash or SQLite error occurs between Codex writing the conversation
file and updating SQLite, leaving the newer file unindexed.

2. An older Codex version, restore, or manual copy adds a conversation
file after SQLite’s one-time backfill completed.

These seem pretty rare though (and sessions can always be recovered via
other mechanisms -- `--last` is just a convenience feature), and I think
the tradeoffs are good here?

Charlie Marsh · 2026-06-05 14:58:09 -04:00

345cf6e8d0

Make runtime workspace roots absolute in app-server API (#26552)

Stacked on #26532.

## Why

#26532 moves cwd normalization to the app-server/core boundary.
`runtimeWorkspaceRoots` still accepted raw paths in v2 requests and in
`ConfigOverrides`, which left core responsible for interpreting those
roots later. This makes runtime workspace roots follow the same
absolute-path boundary as cwd.

## What

- Change v2 `runtimeWorkspaceRoots` request fields for `thread/start`,
`thread/resume`, `thread/fork`, and `turn/start` to `AbsolutePathBuf`.
- Deduplicate already-absolute runtime roots in app-server handlers and
pass them through `ConfigOverrides.workspace_roots` as
`AbsolutePathBuf`.
- Update TUI and exec client request builders to pass absolute runtime
roots directly.
- Update app-server docs, schema fixtures, and focused tests for
absolute runtime roots.
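
As a rough sketch of the deduplication step, the function below keeps the first occurrence of each root in order. It uses plain `std::path::PathBuf` with a runtime check as a stand-in for `AbsolutePathBuf`, which enforces absoluteness at the type level; `dedupe_absolute_roots` is a hypothetical name.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Removes duplicate roots while preserving first-seen order.
/// Returns the offending path if any root is not absolute.
fn dedupe_absolute_roots(roots: Vec<PathBuf>) -> Result<Vec<PathBuf>, PathBuf> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for root in roots {
        if !root.is_absolute() {
            return Err(root);
        }
        // `insert` returns false when the root was already seen.
        if seen.insert(root.clone()) {
            unique.push(root);
        }
    }
    Ok(unique)
}
```

With Unix-style paths, `["/repo", "/data", "/repo"]` reduces to `["/repo", "/data"]`, and a relative entry is rejected rather than resolved.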

## Testing

- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server runtime_workspace_roots`
- `just test -p codex-core
session_permission_profile_rebinds_runtime_workspace_roots`
- `just test -p codex-tui app_server_session`
- `just test -p codex-exec`

pakrym-oai · 2026-06-05 11:36:53 -07:00

76c0a5379c

[codex] Add turn profiling analytics (#26484)

## Summary

Add flat profiling fields to `codex_turn_event` so analytics can explain
where turn wall-clock time is spent without changing tool execution
behavior.

The profile reports:
- time before the first sampling request
- sampling time across all attempts and follow-ups
- overhead between sampling requests
- time blocked in the post-sampling tool drain
- time after the final sampling request
- sampling request and retry counts

## Implementation

- Extend the existing turn timing state with constant-memory phase
accounting and one RAII phase guard.
- Observe sampling and the existing post-sampling drain only at turn
orchestration boundaries.
- Keep tool runtime, tool futures, response item handling, and turn
lifecycle values unchanged.
- Add the profiling fields directly to the existing analytics turn event
without changing app-server protocol or rollout persistence.
- Use the existing turn `status` to distinguish completed, failed, and
interrupted profiles.

Exact sampling/tool overlap is intentionally omitted because measuring
tool completion accurately would require hooks in the tool execution
path.
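
The constant-memory phase accounting with an RAII guard might look roughly like this. The names `TurnProfile`, `PhaseGuard`, and `run_sampling_request` are hypothetical, and the fields are a simplified subset of the profile listed above.

```rust
use std::time::{Duration, Instant};

/// Fixed set of accumulators: memory use does not grow with the number of requests.
#[derive(Debug, Default)]
struct TurnProfile {
    sampling: Duration,
    tool_drain: Duration,
    sampling_requests: u32,
}

/// Adds the elapsed time to one accumulator when dropped.
struct PhaseGuard<'a> {
    total: &'a mut Duration,
    started: Instant,
}

impl<'a> PhaseGuard<'a> {
    fn start(total: &'a mut Duration) -> Self {
        Self { total, started: Instant::now() }
    }
}

impl Drop for PhaseGuard<'_> {
    fn drop(&mut self) {
        // Runs on normal exit, early return, and unwinding alike.
        *self.total += self.started.elapsed();
    }
}

/// Times one sampling request at the orchestration boundary.
fn run_sampling_request(profile: &mut TurnProfile, work: impl FnOnce()) {
    profile.sampling_requests += 1;
    let _guard = PhaseGuard::start(&mut profile.sampling);
    work();
}
```

Because the guard only wraps the call site, the work being measured does not need to know it is being profiled.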

## Validation

- Add app-server end-to-end coverage for a single-sampling turn with no
blocking tool work.
- Add app-server end-to-end coverage for `request_user_input` blocking
followed by a second sampling request.
- CI is running on the PR; tests were not executed locally per
repository guidance.

Ahmed Ibrahim · 2026-06-05 11:27:10 -07:00

8d72fb6de9

[codex] Respect Windows sandbox backend in exec policy (#26307)

## Why

Windows managed filesystem permissions can now be backed by a real
Windows sandbox. `exec-policy` was still treating the managed read-only
policy shape as if there were never a sandbox backend, so benign
unmatched commands such as PowerShell directory listings could be
rejected with `blocked by policy` even when `windows.sandbox` was
enabled.

The inverse case still needs to stay conservative: when the Windows
sandbox backend is disabled, managed filesystem restrictions are only
configuration intent, not an enforced filesystem boundary. That applies
to writable-root restricted profiles too, not just read-only profiles.

## What Changed

- Thread the effective `WindowsSandboxLevel` into exec-policy approval
decisions for shell, unified exec, and intercepted shell exec paths.
- Treat managed restricted filesystem profiles as lacking sandbox
protection only on Windows when `WindowsSandboxLevel::Disabled`.
- Exclude full-disk-write profiles from that no-backend path because
they do not rely on filesystem sandbox enforcement.
- Remove the cwd-sensitive read-only heuristic and the now-stale cwd
plumbing from exec-policy approval contexts.
- Add Windows coverage for both enabled-sandbox and disabled-backend
behavior, including a writable-root managed profile.
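
The decision rule in the bullets above can be summarized in a few lines. This is a simplified sketch: only `WindowsSandboxLevel::Disabled` is named in the description, so the `Enabled` variant, the `FilesystemProfile` enum, and `lacks_sandbox_protection` are hypothetical stand-ins.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WindowsSandboxLevel {
    Disabled,
    /// Stand-in for any enabled backend level (assumed, not from the source).
    Enabled,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FilesystemProfile {
    ReadOnly,
    WritableRoots,
    FullDiskWrite,
}

/// True when managed filesystem restrictions are only configuration intent,
/// not an enforced boundary.
fn lacks_sandbox_protection(
    on_windows: bool,
    level: WindowsSandboxLevel,
    profile: FilesystemProfile,
) -> bool {
    // Full-disk-write profiles do not rely on filesystem sandbox enforcement.
    let restricted = !matches!(profile, FilesystemProfile::FullDiskWrite);
    on_windows && level == WindowsSandboxLevel::Disabled && restricted
}
```

Both read-only and writable-root profiles take the conservative path when the backend is disabled, which matches the new writable-root coverage.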

## Validation

- Added/updated `exec_policy` coverage for managed filesystem
restrictions, full-disk-write exclusion, enabled Windows sandbox
behavior, and disabled-backend read-only/writable-root behavior.
- `just test -p codex-core exec_policy` — 100 passed, 10 leaky
- Empirical local `codex exec` probe with `--sandbox read-only -c
'windows.sandbox="unelevated"'`: PowerShell directory listing completed
successfully.
- Disabled-backend control with Windows sandbox cleared: the same
command was rejected with `blocked by policy`.

iceweasel-oai · 2026-06-05 11:20:52 -07:00

82b15b65e2

fix(tui): restore cancelled prompt cursor at end (#26457)

## Why

Pressing `Esc` on a turn that produced no visible output restores the
submitted prompt so the user can keep editing it. That restore path
preserved the prompt content, images, and mention bindings, but left the
composer cursor at the start of the restored text. The next edit
therefore inserted at the beginning instead of continuing from the end
of the prompt.

## What Changed

- Move the cursor to the end after
`BottomPane::set_composer_text_with_mention_bindings` rehydrates a
restored draft.
- Add test-only cursor accessors so restore tests can assert the
composer state directly.
- Extend the queued restore regression to assert the restored composer
cursor is positioned at `text.len()`.

## How to Test

Manual reviewer flow:

1. Start Codex in the TUI.
2. Submit a prompt that will take long enough to interrupt.
3. Press `Esc` before any visible assistant output appears.
4. Confirm the prompt is restored into the composer and the cursor is at
the end, so typing appends to the prompt.
5. Repeat with a prompt that includes an attached image or resolved
mention and confirm the restored content remains intact.

Targeted tests:

- `just test -p codex-tui
chatwidget::tests::composer_submission::queued_restore_with_remote_images_keeps_local_placeholder_mapping`

Lint note:

- `just argument-comment-lint` is blocked locally by the existing Bazel
`compiler-rt` empty glob failure before analyzing touched code. The
touched Rust diff was manually inspected and adds no new opaque
positional literal callsites.

Felipe Coury · 2026-06-05 15:10:13 -03:00

679a944dbc

fix(tui): Windows composer background (#26181)

## Why

On Windows, the TUI could not shade the composer against the terminal
background because `terminal_palette::default_colors()` always fell back
to `None`. That preserved safety, but it also meant terminals that do
support OSC 10/11 default color replies had no path to report their real
background color.

This keeps the existing fallback behavior for unsupported terminals
while allowing capable Windows terminals to report their default
foreground/background colors during startup.

| Before | After |
|---|---|
| <img width="1235" height="658" alt="win-before" src="https://github.com/user-attachments/assets/ff756589-fcb3-43de-8f2a-ebc0369b30dd" /> | <img width="1235" height="658" alt="win-after" src="https://github.com/user-attachments/assets/9563ff20-4be5-4608-9414-a2afb647e745" /> |

## What Changed

- Moved the OSC 10/11 default color parser in
`tui/src/terminal_probe.rs` out of the Unix-only implementation so it
can be reused by Windows.
- Added a Windows-only bounded OSC 10/11 probe using raw console handles
and the existing `windows-sys` dependency.
- Added Windows palette caching in `tui/src/terminal_palette.rs` so
startup probe results, including `None`, are reused instead of probing
again later.
- Wired the Windows color probe into TUI startup after the existing
non-Unix crossterm cursor and keyboard checks.
- Added parser coverage for malformed, partial, and noisy OSC color
replies.

If the probe fails, times out, receives only one color, or receives
malformed data, the cache stores `None` and the composer keeps the
current behavior.
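
For readers unfamiliar with OSC 10/11 replies, here is a minimal parser sketch. It is not the parser in `tui/src/terminal_probe.rs`; the function names are hypothetical, and it assumes the common xterm reply shape `ESC ] <code> ; rgb:<r>/<g>/<b>` terminated by BEL or by ST (`ESC \`).

```rust
/// Parses one OSC color reply, e.g. "\x1b]11;rgb:1e1e/1e1e/1e1e\x07".
/// `code` is 10 for the default foreground and 11 for the default background.
fn parse_osc_color(reply: &str, code: u8) -> Option<(u8, u8, u8)> {
    let prefix = format!("\x1b]{code};rgb:");
    let start = reply.find(&prefix)? + prefix.len();
    let rest = &reply[start..];
    // A reply without a terminator is treated as partial and rejected.
    let end = rest.find(|c: char| c == '\x07' || c == '\x1b')?;
    let mut channels = rest[..end].split('/').map(scale_hex_channel);
    let rgb = (channels.next()??, channels.next()??, channels.next()??);
    if channels.next().is_some() {
        return None; // more than three channels is malformed
    }
    Some(rgb)
}

/// Scales a 1 to 4 digit hex channel to 8 bits.
fn scale_hex_channel(hex: &str) -> Option<u8> {
    if hex.is_empty() || hex.len() > 4 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let value = u32::from_str_radix(hex, 16).ok()?;
    let max = (1u32 << (4 * hex.len() as u32)) - 1;
    Some((value * 255 / max) as u8)
}

/// Returns (foreground, background) only when both replies parse.
fn parse_default_colors(replies: &str) -> Option<((u8, u8, u8), (u8, u8, u8))> {
    Some((parse_osc_color(replies, 10)?, parse_osc_color(replies, 11)?))
}
```

Returning `None` for any partial, malformed, or single-color input mirrors the fallback described above, where the cache stores `None` and the composer keeps its current behavior.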

## How to Test

1. On Windows, start Codex in a terminal that supports OSC 10/11 default
color replies.
2. Open the TUI composer.
3. Confirm the composer/status area is painted using the terminal's
reported default background, instead of leaving the background unshaded.
4. Start Codex in a terminal that does not answer OSC 10/11, or
otherwise blocks terminal color replies.
5. Confirm startup still succeeds and the composer uses the existing
fallback behavior.

Targeted tests:

- `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target
just test -p codex-tui terminal_probe`

Additional local verification:

- `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target
just test -p codex-tui` was run; 2774 tests passed, and two unrelated
Guardian feature-flag tests failed reproducibly when isolated.
- `just argument-comment-lint` was attempted but blocked by the local
Bazel/LLVM `include/sanitizer/*.h` empty glob issue. Touched Rust
literal callsites were inspected manually.
- `cargo check -p codex-tui --target x86_64-pc-windows-msvc` was
attempted after installing the target, but local macOS cross-checking is
blocked by missing Windows C SDK headers in native dependencies
(`ring`/`aws-lc-sys`).

---------

Co-authored-by: Kevin Bond <kbond@openai.com>

Felipe Coury · 2026-06-05 11:05:46 -07:00

713192381b

[1 of 2] Align goal extension with core behavior (#26547)

## Stack

1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align
goal extension with core behavior
2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move
goal runtime to extension

## Why

The goal runtime is moving out of `codex-core` and into
`codex-goal-extension`. This first PR brings the extension back in line
with the current core behavior before the follow-up PR switches
app-server sessions over to the extension, so that review can focus on
ownership and wiring rather than hidden behavior drift.

## What Changed

- Updates the extension `create_goal` and `update_goal` tool
schemas/descriptions to match the current core wording for explicit
token budgets, blocked-goal audits, resumed blocked goals, and
system-owned budget/usage-limit transitions.
- Marks `codex-goal-extension` as the live `/goal` extension crate
rather than an unwired sketch.
- Looks up the live thread before reading goal state for idle
continuation, so continuation setup exits early when no live thread can
accept the automatic turn.

Eric Traut · 2026-06-05 10:37:38 -07:00

a8c9530911

Clean up Rust release workflow (#26335)

## Why
PR #26252 moved macOS release signing into the tag-triggered
`rust-release` workflow through the protected `codesigning` environment
and Azure Key Vault. That leaves the old manual unsigned-build /
signed-promotion handoff as dead compatibility scaffolding: it makes the
release DAG harder to reason about and keeps paths around that the
current release process no longer intends to operate.

## What changed
- Remove the manual `workflow_dispatch` inputs and validation for
`build_unsigned`, `promote_signed`, and the deprecated `sign_macos`
flag.
- Drop the `stage-signed-macos` job and the promotion-specific artifact
download, re-upload, pruning, and cleanup logic.
- Make tag-pushed releases always follow the signed release path: build,
sign, package, finalize, publish, and then run downstream release jobs
from `release` success.
- Remove stale `SIGN_MACOS` / `sign_macos` conditions and outputs,
including downstream gates for npm, DotSlash, WinGet, dev website
deploy, and `latest-alpha-cli` branch updates.

## Verification
- `ruby -e 'require "yaml"; YAML.load_file(ARGV.fetch(0)); puts "yaml
ok"' .github/workflows/rust-release.yml`
- `git diff --check`
- `rg -n
"workflow_dispatch|inputs\\.|release_mode|build_unsigned|SIGN_MACOS|outputs\\.sign_macos|sign_macos\\b"
.github/workflows/rust-release.yml` returned no matches

Shijie Rao · 2026-06-05 10:36:14 -07:00

78eba34b41

feat(app-server): add remote control pairing status RPC (#26450)

## What

Exposes the pairing status transport as experimental app-server v2 RPC
`remoteControl/pairing/status`.

- Adds request/response protocol types for exactly one lookup key:
`pairingCode` or `manualPairingCode`, returning `{ claimed }`.
- Registers the RPC with `global_shared_read("remote-control-pairing")`.
- Wires the method through `MessageProcessor` and
`RemoteControlRequestProcessor`.
- Validates missing/conflicting pairing-code params as invalid requests.
- Documents the RPC in `app-server/README.md`.
- Adds processor, protocol export, and JSON-RPC integration coverage for
both code paths.
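
The "exactly one lookup key" rule can be modeled by converting the two optional params into an enum. This is a sketch only: `PairingLookup`, `InvalidRequest`, and `pairing_lookup` are hypothetical names, not the protocol types added by this PR.

```rust
#[derive(Debug, PartialEq, Eq)]
enum PairingLookup {
    PairingCode(String),
    ManualPairingCode(String),
}

#[derive(Debug, PartialEq, Eq)]
enum InvalidRequest {
    MissingCode,
    ConflictingCodes,
}

/// Accepts a request only when exactly one of the two codes is present.
fn pairing_lookup(
    pairing_code: Option<String>,
    manual_pairing_code: Option<String>,
) -> Result<PairingLookup, InvalidRequest> {
    match (pairing_code, manual_pairing_code) {
        (Some(code), None) => Ok(PairingLookup::PairingCode(code)),
        (None, Some(code)) => Ok(PairingLookup::ManualPairingCode(code)),
        (None, None) => Err(InvalidRequest::MissingCode),
        (Some(_), Some(_)) => Err(InvalidRequest::ConflictingCodes),
    }
}
```

Matching on the pair makes all four combinations explicit, so a missing or conflicting request cannot slip through as a valid lookup.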

## Why

This is the app-server surface the desktop app can poll while the
QR/manual pairing modal is active.

Depends on https://github.com/openai/codex/pull/26449
Related backend change: https://github.com/openai/openai/pull/990244

## Verification

- `cargo test --manifest-path app-server-protocol/Cargo.toml
remote_control`
- `cargo test --manifest-path app-server/Cargo.toml remote_control`
- `cargo fmt --all --check`
- `git diff --check`

hefuc-oai · 2026-06-05 10:33:56 -07:00

0177231ca0

fix(tui): avoid doubled blank rows while streaming (#26636)

## Summary

During assistant-message streaming, blank markdown lines in the
transient active tail were prefixed with two spaces. Ratatui measured
those whitespace-only lines as two viewport rows, so list- and
table-heavy answers showed doubled vertical gaps while streaming and
then visibly compacted when finalized into scrollback.

- keep whitespace-only `StreamingAgentTailCell` lines structurally empty
while preserving nonblank message prefixes
- clear impossible hyperlink metadata when normalizing a blank tail line
- add an inline snapshot and height regression proving one blank
markdown line occupies one viewport row

Related to #26618, but fixes a separate live-tail row-height issue
rather than stale committed markdown content.
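
The core of the fix can be illustrated on plain strings. This sketch assumes the message prefix is two spaces, as the description implies; `prefix_tail_line` is a hypothetical helper, and the real `StreamingAgentTailCell` works on styled Ratatui lines rather than `String`.

```rust
/// Prefixes nonblank lines and keeps whitespace-only lines structurally empty,
/// so a blank markdown line never carries prefix-only content.
fn prefix_tail_line(line: &str) -> String {
    if line.trim().is_empty() {
        String::new()
    } else {
        format!("  {line}")
    }
}
```

An empty string gives the renderer nothing to wrap or measure beyond the single blank row.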

## How to Test

Recommended before/after reproduction:

1. Start the latest Codex build without this change.
2. Submit this exact prompt:

> Send 20 different lists: bullets vs numbered, simple vs complex with
paragraphs in between items, etc. Intertwine them with some tables and
some paragraphs.

3. While the answer streams, observe duplicated vertical gaps around
list items and paragraphs. When the answer finishes, observe the spacing
compact.
4. Start this branch with `just c` and submit the same prompt.
5. Confirm each intended blank markdown line occupies one terminal row
throughout streaming and that the spacing does not compact or jump when
the answer finishes.
6. As a focused regression, verify the sections after the first table,
especially loose lists with paragraphs between items; those blank rows
should remain stable throughout streaming.

Targeted tests:

- `just test -p codex-tui
streaming_agent_tail_blank_line_uses_one_viewport_row`
- `just test -p codex-tui history_cell::tests`

## Test Notes

- Verified the exact prompt above in a real tmux TUI using latest Codex
and this branch as the before/after comparison.
- The full `just test -p codex-tui` run completed 2,782 of 2,784 tests
successfully. Two unrelated guardian feature-flag tests fail
reproducibly in isolation because the expected `OverrideTurnContext`
message is absent.
- `just argument-comment-lint` is blocked locally by the existing Bazel
`compiler-rt` missing-header glob error; the touched Rust diff was
inspected manually for opaque positional literals.

Felipe Coury · 2026-06-05 14:33:31 -03:00

841f057f2d

Make turn diff tracker multi-env aware (#26433)

## Why

Turn diffs were tracked as one flat set of absolute paths. In
multi-environment turns, local and remote environments can report the
same path while representing different filesystems, so a single path key
can collapse distinct changes or attribute them to the wrong
environment.

The environment name is **NOT** included in the generated unified diff.
This can come later.
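
To see why a flat path key is not enough, consider a tracker keyed by environment and path together. This is a hypothetical sketch: the struct layout and method names are illustrative and are not taken from the actual tracker.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Tracks one diff per (environment, path) pair instead of per path.
#[derive(Default)]
struct TurnDiffTracker {
    changes: BTreeMap<(String, PathBuf), String>,
}

impl TurnDiffTracker {
    fn record(&mut self, environment: &str, path: &str, diff: &str) {
        self.changes
            .insert((environment.to_string(), PathBuf::from(path)), diff.to_string());
    }

    /// Returns the changes that belong to one environment.
    fn changes_for(&self, environment: &str) -> Vec<(&PathBuf, &String)> {
        self.changes
            .iter()
            .filter(|((env, _), _)| env.as_str() == environment)
            .map(|((_, path), diff)| (path, diff))
            .collect()
    }
}
```

Recording `/repo/a.rs` for both a local and a remote environment yields two entries here, where a path-only key would keep just the last one written.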

pakrym-oai · 2026-06-05 17:31:22 +00:00

86a1ddd028

feat(remote-control): add pairing status transport (#26449)

## What

Adds transport support for checking remote-control pairing status
against the backend.

- Adds the normalized `server/pair/status` backend URL.
- Adds backend request/response structs for exactly one lookup key:
`pairing_code` or `manual_pairing_code`, returning `{ claimed }`.
- Adds `RemoteControlEnrollment::pairing_status` and
`RemoteControlHandle::pairing_status`.
- Preserves auth refresh/retry behavior and backend error mapping.
- Adds transport coverage for pending, claimed, manual-code payloads,
token refresh, mapped backend errors, malformed responses, and URL
normalization.

## Why

Desktop needs a host-authenticated way to poll whether a QR or manual
pairing code has been claimed.

Related backend change: https://github.com/openai/openai/pull/990244

## Verification

- `cargo test --manifest-path app-server-transport/Cargo.toml
remote_control::tests::pairing_tests`
- `cargo fmt --all --check`
- `git diff --check`

hefuc-oai · 2026-06-05 10:07:25 -07:00

da490ba9de

[codex] Add /usr/bin/bash shell fallback (#26538)

## Why

Some Linux environments expose `bash` at `/usr/bin/bash` instead of
`/bin/bash`. The shell detection fallback list should cover both
standard locations once PATH/user-shell probing fails.

Stacked on #26480.

## What changed

- Add `/usr/bin/bash` to the bash fallback path list in
`codex-shell-command`.
- Extend shell type detection coverage for `/usr/bin/bash`.
- Add AGENTS.md testing guidance to avoid tests for statically defined
values and negative tests for removed logic.

## Verification

- `just test -p codex-shell-command`

pakrym-oai · 2026-06-05 09:38:26 -07:00

9ddb1de633

[codex] Allow socketpair in proxy-routed Linux sandbox (#26625)

## Summary

- allow `socketpair(AF_UNIX, ...)` in the proxy-routed Linux seccomp
mode
- continue denying `socket(AF_UNIX, ...)` so user commands cannot create
pathname or abstract Unix sockets
- extend the managed-proxy integration test to verify both behaviors

## Root cause

`NetworkSeccompMode::ProxyRouted` treated anonymous Unix socket pairs
like externally addressable Unix sockets and returned `EPERM`. This
breaks tools that use socket pairs for local child-process IPC even
though a socket pair cannot connect outside the sandbox or bypass the
routed proxy.

`dangerously_allow_all_unix_sockets` controls Unix-socket requests
forwarded by the managed network proxy; it does not currently configure
the Linux seccomp filter. Socket pairs should not require that dangerous
setting because they are unnamed, process-local IPC.

Related but independent: #26553 fixes host proxy bridge socket path
length handling.

---------

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-06-05 09:34:36 -07:00

d40454522e

Require absolute cwd in thread settings (#26532)

## Why

Thread settings cwd overrides are expected to be resolved before they
enter core. Keeping this boundary as a plain `PathBuf` made it easy for
core/session code to keep fallback normalization and relative-path
resolution logic in places that should only receive an already-resolved
cwd.

This is intentionally the absolute-cwd-only slice: it does not change
environment selection stickiness or cwd-to-default-environment fallback
behavior.

## What changed

- Changes `ThreadSettingsOverrides.cwd`,
`CodexThreadSettingsOverrides.cwd`, and `SessionSettingsUpdate.cwd` to
use `AbsolutePathBuf`.
- Removes core-side cwd normalization/resolution from session settings
updates.
- Updates affected core/app-server test helpers and callsites to pass
existing absolute cwd values or use `abs()` helpers.

## Validation

Opening as draft so CI can start while local validation continues.

pakrym-oai · 2026-06-05 09:29:15 -07:00

40c8f1a007

feat: reload v2 agents on delivery (#26623)

## Summary

This is the first small step toward making multi-agent v2 agents durable
logical agents whose `ThreadManager` residency is only an implementation
detail.

This PR adds a narrow v2 reload-on-delivery hook:

- If a known v2 agent target is already loaded, delivery is unchanged.
- If the target is still registered but missing from `ThreadManager`,
delivery reloads that exact v2 thread from durable rollout history
before submitting the message.
- If the target is unknown, closed, missing from storage, or not a v2
thread, delivery still fails as not found.

The reload is wired only into existing-agent delivery paths: v2
`send_message` / `followup_task`, and legacy `send_input` when its
target is a known v2 agent.
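
The three delivery cases above reduce to a small decision function. The sketch below is illustrative only: `Target`, `Delivery`, and `plan_delivery` are hypothetical names, and the boolean fields stand in for whatever registry and storage lookups the real code performs.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Delivery {
    /// Target is resident in `ThreadManager`; deliver as before.
    DeliverToLoaded,
    /// Target is registered but not resident; reload from rollout history first.
    ReloadThenDeliver,
    /// Unknown, closed, missing from storage, or not a v2 thread.
    NotFound,
}

/// Hypothetical snapshot of what is known about a delivery target.
struct Target {
    registered: bool,
    loaded: bool,
    closed: bool,
    in_storage: bool,
    is_v2: bool,
}

fn plan_delivery(target: &Target) -> Delivery {
    if !target.registered || target.closed || !target.is_v2 {
        return Delivery::NotFound;
    }
    if target.loaded {
        Delivery::DeliverToLoaded
    } else if target.in_storage {
        Delivery::ReloadThenDeliver
    } else {
        Delivery::NotFound
    }
}
```

Only the middle case is new behavior; the loaded and not-found outcomes are unchanged.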

## Stack

1. **Reload on delivery**: load known unloaded v2 agents before
`followup_task`, `send_message`, or `send_input` delivery. This PR.
2. **Residency LRU**: unload idle resident v2 agents from
`ThreadManager` without making them closed or unreachable.
3. **Execution concurrency**: count active non-root turns, not logical
agents or resident idle threads.
4. **Close semantics**: make v2 close interrupt-only and leave durable
agent identity intact.
5. **Resume cleanup**: remove user-facing v2 resume semantics;
addressing an unloaded durable agent reloads it implicitly.

## Validation

- Ran `just fmt`.
- Left broader tests and clippy to CI.

jif · 2026-06-05 18:18:29 +02:00

d5e4f01af4

Render code comment directives in TUI replay (#26554)

## Summary

Resumed Codex App or VS Code review sessions can contain
`::code-comment` directives that the TUI previously displayed verbatim
because only rich clients interpret them.

This change rewrites valid line-start directives into readable Markdown
during assistant-message parsing, using the session working directory
for relative file paths. The fallback is applied consistently to live
messages, replayed transcripts, and resume previews while preserving
malformed directives and existing `::git-*` parsing.

## Before

The TUI exposed the raw client directive:

```text
::code-comment{title="Fix body= parsing" body="Keep role=\"tab\", ::git-stage{cwd=/tmp}, file=, and \n literal." file="/repo/src/app.ts" start=10 end=12 priority="P2"}
```

## After

The same directive is rendered as readable review feedback:

```text
- [P2] Fix body= parsing — src/app.ts:10-12
  Keep role="tab", ::git-stage{cwd=/tmp}, file=, and \n literal.
```
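
The location part of that output (`src/app.ts:10-12`) can be sketched with `Path::strip_prefix`. This assumes the session working directory in the example is `/repo`; `display_location` is a hypothetical helper and does not cover directive parsing or the rest of the rendering.

```rust
use std::path::Path;

/// Shows `file` relative to `cwd` when it lives under it, otherwise unchanged,
/// followed by the line range.
fn display_location(file: &str, cwd: &str, start: u32, end: u32) -> String {
    let path = Path::new(file);
    let shown = path.strip_prefix(cwd).unwrap_or(path);
    format!("{}:{}-{}", shown.display(), start, end)
}
```

Files outside the working directory keep their absolute path, so the location stays unambiguous.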

Fixes #25658

Eric Traut · 2026-06-05 08:34:34 -07:00

e5af672d73

Fix /goal usage text for control commands (#26551)

## Why

The TUI's `/goal` usage text only advertised the objective form even
though `/goal clear`, `/goal edit`, `/goal pause`, and `/goal resume`
are implemented. This made the lifecycle controls difficult to discover
and allowed the duplicated help text to drift from actual behavior.

Fixes #25530.

## What changed

- Show the complete `/goal [<objective>|clear|edit|pause|resume]` syntax
in usage messages.
- Share one usage string across slash-command dispatch and goal-related
app messages.
- Add inline snapshot coverage for the control-command usage path.

Eric Traut · 2026-06-05 08:32:53 -07:00

fb0993dd3b

Open Windows app workspaces via deep link (#26500)

## Summary

Fixes #26423.

On Windows, `codex app PATH` detected Codex Desktop and launched the app
shell target, then only printed a manual instruction to open the
workspace. The Desktop app already supports
`codex://threads/new?path=...`, so the CLI can open the requested
workspace directly.

This updates the Windows launcher to normalize the workspace path,
encode it into a `codex://threads/new` deep link, and open that URL when
Codex Desktop is installed. The installer fallback still opens the
Windows installer and prints the workspace path for after installation.
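
A minimal sketch of building such a deep link follows. The `codex://threads/new?path=` form comes from the description; the helper names and the exact set of characters left unescaped are assumptions (here, only RFC 3986 unreserved characters).

```rust
/// Percent-encodes every byte except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Builds the deep link for an already-normalized workspace path.
fn workspace_deep_link(path: &str) -> String {
    format!("codex://threads/new?path={}", percent_encode(path))
}
```

Encoding backslashes, colons, and spaces matters on Windows, where a path such as `C:\Users\me\my project` would otherwise produce an ambiguous URL.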

Eric Traut · 2026-06-05 08:32:42 -07:00

e781816ead

Surface TUI config write error causes (#26537)

## Summary

TUI config writes currently wrap app-server failures with local context
like `config/batchWrite failed in TUI`, but several user-visible paths
only render the outer error. That hides the actionable app-server
message, such as validation constraints or read-only `CODEX_HOME`
failures, leaving users with a dead-end diagnostic.

This change adds a small formatter next to the TUI config write helpers
that renders the error source chain, then uses it for model persistence,
feature persistence, project trust, status line writes, hook trust, and
hook enablement.
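
Rendering an error source chain can be done with `std::error::Error::source`. The formatter below is a sketch of the idea, not the helper added in this change; the `": "` separator and the two error types are illustrative assumptions.

```rust
use std::error::Error;
use std::fmt;

/// Joins an error and all of its sources into one line.
fn format_error_chain(error: &dyn Error) -> String {
    let mut message = error.to_string();
    let mut source = error.source();
    while let Some(cause) = source {
        message.push_str(": ");
        message.push_str(&cause.to_string());
        source = cause.source();
    }
    message
}

/// Hypothetical inner error carrying the actionable app-server message.
#[derive(Debug)]
struct AppServerError(String);

impl fmt::Display for AppServerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for AppServerError {}

/// Hypothetical outer error that adds local TUI context.
#[derive(Debug)]
struct TuiWriteError {
    source: AppServerError,
}

impl fmt::Display for TuiWriteError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config/batchWrite failed in TUI")
    }
}

impl Error for TuiWriteError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}
```

Printing only the outer error would show `config/batchWrite failed in TUI`; walking the chain appends the cause that tells the user what to fix.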

Fixes #26077

Eric Traut · 2026-06-05 08:32:07 -07:00

3acd71fedb

[codex] Fix long proxy socket paths (#26553 )

## Summary

- avoid generating host proxy bridge Unix socket paths that exceed
Linux's `sockaddr_un.sun_path` limit
- fall back from a long `$CODEX_HOME/tmp` path to the system temp
directory, then `/tmp`
- add focused unit coverage for short and overlong parent paths

## Root cause

With a sufficiently long `CODEX_HOME`, the generated
`proxy-route-*.sock` path exceeds Linux's 107-byte pathname limit. The
host bridge child exits before writing its readiness byte, so the parent
reports the indirect error `failed to prepare host proxy routing bridge:
failed to fill whole buffer`.
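
The length check and fallback order can be sketched as below. This is an illustration under assumptions: `socket_path_fits` and `pick_socket_parent` are hypothetical names, and the caller is assumed to pass the candidate parents in preference order (`$CODEX_HOME/tmp`, the system temp directory, then `/tmp`).

```rust
use std::path::{Path, PathBuf};

/// Linux's `sockaddr_un.sun_path` holds 108 bytes including the trailing NUL,
/// which leaves 107 bytes for the pathname itself.
const MAX_SOCKET_PATH_BYTES: usize = 107;

/// True when `parent/file_name` is short enough to bind as a Unix socket.
fn socket_path_fits(parent: &Path, file_name: &str) -> bool {
    parent.join(file_name).as_os_str().len() <= MAX_SOCKET_PATH_BYTES
}

/// Return the first candidate parent directory whose socket path fits.
fn pick_socket_parent(candidates: &[PathBuf], file_name: &str) -> Option<PathBuf> {
    candidates
        .iter()
        .find(|parent| socket_path_fits(parent, file_name))
        .cloned()
}
```

Checking the length before binding lets the parent process report a direct error instead of the indirect `failed to fill whole buffer` message.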

## Validation

- reproduced the original error with a long `CODEX_HOME` using
`codex-cli 0.138.0-alpha.4`
- `cargo clippy -p codex-linux-sandbox --all-targets`
- `just fix -p codex-linux-sandbox`
- `just fmt`

The Linux-only unit test could not execute locally: the arm64 Docker
build was repeatedly OOM-killed by `rustc` while compiling an unrelated
`codex-app-server-protocol` dependency, before reaching the test.

---------

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-06-05 08:00:46 -07:00

a14a73b54a

feat(app-server): expose account token usage [1 of 2] (#25344 )

## Why

Token activity is useful account-level context, but terminal clients
need a supported app-server path to fetch it without reaching into
ChatGPT backend details directly. The API should also live under the
broader account usage umbrella so future usage surfaces can be added
without proliferating user-facing concepts.

## What Changed

- Add `codex-backend-client` support for the ChatGPT profile token-usage
payload.
- Add the v2 `account/usage/read` app-server RPC.
- Map lifetime usage, peak daily usage, streak, longest task duration,
and daily buckets into app-server protocol types.
- Gate the request on Codex-backend auth, which supports ChatGPT auth
tokens and AgentIdentity.
- Regenerate the app-server JSON and TypeScript schema fixtures.

## Token Count Source

`account/usage/read` returns the token-usage aggregate supplied by the
ChatGPT profile backend. App-server maps that backend-owned aggregate
into protocol fields; it does not recompute cached-token treatment,
usage multipliers, or raw input/output totals locally.

## Stack

1. feat(app-server): expose account token usage [1 of 2] (this PR)
2. [#25345](https://github.com/openai/codex/pull/25345) feat(tui): add
token activity command [2 of 2]

## How to Test

1. Start an app-server client from this branch while authenticated with
ChatGPT or AgentIdentity.
2. Call `account/usage/read`.
3. Confirm the response includes `summary` and `dailyUsageBuckets`.
4. Also verify a session without Codex-backend auth receives the
existing auth error path.

Targeted tests:
- `just test -p codex-backend-client -p codex-app-server-protocol -p
codex-app-server`
- `just write-app-server-schema`

Felipe Coury · 2026-06-05 14:43:44 +00:00

5e62c735b2

refactor: split agent control modules (#26610 )

## Summary

Mechanically splits `AgentControl` into focused modules so later agent
runtime changes are easier to review. The shared lookup, messaging, and
completion logic remains in `control.rs`, while spawn-specific code and
V1 legacy close/resume behavior move into dedicated files.

## Changes

- Extract spawn-agent code into `agent/control/spawn.rs`.
- Extract V1-only legacy close/resume behavior into
`agent/control/legacy.rs`.
- Keep shared control-plane behavior in `agent/control.rs`.
- Preserve existing behavior; this PR is intended to be mechanical.

## Stack

1. This PR - Mechanical `AgentControl` split: extracts spawn and V1
legacy code without behavior changes.
2. #26614 - Execution slot accounting: separates logical agents from
active execution slots.
3. #26611 - Residency and reload runtime: adds resident-agent LRU,
eviction/reload, durable lookup, and V2 delivery through reload.
4. #26612 - V2 tool semantics: narrows `close_agent` to interrupt-only
and updates V2 tool coverage.

jif · 2026-06-05 16:24:22 +02:00

0b1512c2c8

[codex] Keep v1 spawn metadata visible (#26599 )

## Summary
- keep the legacy v1 `spawn_agent` role and model selectors visible
- add regression coverage for the default v1 tool plan

## Why
`hide_spawn_agent_metadata` is a multi-agent v2 setting, but the v1
planning branch also consumed it. After the default changed to `true`,
v1 stopped advertising `agent_type`, `model`, `reasoning_effort`, and
`service_tier`, preventing configured agents from being selected.

This keeps the hidden-metadata default for v2 while opting v1 out of
that behavior.

Fixes #26363.

## Validation
Not run locally, per request; CI will validate the change.

jif · 2026-06-05 14:52:51 +02:00

66232220e2

[codex] Forward turn moderation metadata through app-server (#25710 )

## Why
First-party backends can supply turn-scoped moderation metadata that
app-server clients need for client-side presentation. Exposing this as
an experimental typed notification lets opted-in clients consume it
without interpreting raw Responses API events.

## What changed
- forward `response.metadata.openai_chatgpt_moderation_metadata` from
Responses API SSE and WebSocket streams as turn-scoped moderation
metadata
- emit the experimental app-server v2 `turn/moderationMetadata`
notification with `{ threadId, turnId, metadata }`
- add app-server integration coverage for the typed moderation metadata
notification

## Testing
- `just test -p codex-core
build_ws_client_metadata_includes_window_lineage_and_turn_metadata`
- `just test -p codex-core` (fails locally: 46 failures and 1 timeout,
primarily missing `test_stdio_server` and shell snapshot timeouts)
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server
turn_moderation_metadata_emits_typed_notification_v2`
- `just test -p codex-app-server` (fails locally: 792 passed, 10 failed,
and 5 timed out; failures are in existing environment-sensitive tests,
primarily because nested macOS `sandbox-exec` is not permitted)
- `just write-app-server-schema --experimental --schema-root
/tmp/codex-app-server-schema-experimental`

carlc-oai · 2026-06-05 02:41:06 -07:00

55aa071b17

nit: doc (#26566 )

Matching CBv9

jif · 2026-06-05 11:10:32 +02:00

6dc28ba6e0

Encrypt multi-agent v2 message payloads (#26210 )

## Why

Multi-agent v2 currently routes agent instructions through normal tool
arguments and inter-agent context. That means the parent model can emit
plaintext task text, Codex can persist it in history/rollouts, and the
recipient can receive it as ordinary assistant-message JSON.

This changes the v2 path so agent instructions stay encrypted between
model calls: Responses encrypts the `message` argument returned by the
model, Codex forwards only that ciphertext, and Responses decrypts it
internally for the recipient model.

## What changed

- Mark the v2 `message` parameter as encrypted for `spawn_agent`,
`send_message`, and `followup_task`.
- Treat multi-agent v2 tool `message` values as ciphertext
unconditionally.
- Store v2 inter-agent task text in
`InterAgentCommunication.encrypted_content` with empty plaintext
`content`.
- Convert encrypted inter-agent communications into the Responses
`agent_message` input item before sending the child request.
- Preserve `agent_message` items across history, rollout, compaction,
telemetry, and app-server schema paths.
- Leave multi-agent v1 unchanged.

## Message shape

The model still calls the v2 tools with a `message` argument, but that
value is now ciphertext:

```json
{
  "name": "spawn_agent",
  "arguments": {
    "task_name": "worker",
    "message": "<ciphertext>"
  }
}
```

Codex stores the task as encrypted inter-agent communication:

```json
{
  "author": "/root",
  "recipient": "/root/worker",
  "content": "",
  "encrypted_content": "<ciphertext>",
  "trigger_turn": true
}
```

When Codex builds the recipient request, it forwards the ciphertext
using the new Responses input item:

```json
{
  "type": "agent_message",
  "author": "/root",
  "recipient": "/root/worker",
  "content": [
    {
      "type": "encrypted_content",
      "encrypted_content": "<ciphertext>"
    }
  ]
}
```

Responses decrypts that item internally for the recipient model.

## Context impact

- Parent context no longer carries plaintext v2 agent task instructions
from these tool arguments.
- Codex rollout/history stores ciphertext for v2 agent instructions.
- Recipient requests receive an `agent_message` item instead of
assistant commentary JSON for encrypted task delivery.
- Plaintext completion/status notifications are still plaintext because
they are Codex-generated status messages, not encrypted model tool
arguments.

## Validation

- `just test -p codex-tools`
- `just test -p codex-protocol`
- `just test -p codex-rollout`
- `just test -p codex-rollout-trace`
- `just test -p codex-otel`
- `just write-app-server-schema`

jif · 2026-06-05 10:25:57 +02:00

5f4d06ef18

[codex] Add environment shell info (#26480 )

## Why

Shell detection needs to be available through the `Environment`
abstraction so callers can ask the selected local or remote environment
for shell metadata without adding a separate HTTP endpoint or parallel
info-source path. This keeps shell metadata shaped like the existing
environment-owned filesystem capability and lets remote environments
answer through exec-server JSON-RPC.

## What changed

- Added `environment/info` to the exec-server protocol/client/server and
exposed `Environment::info()`.
- Added local and remote environment info providers on `Environment`,
following the existing capability-provider pattern used for filesystem
access.
- Moved the shared shell detection logic into `codex-shell-command` and
kept core shell APIs as wrappers around that implementation.
- Returned shell metadata as `EnvironmentInfo { shell: ShellInfo }`
using the existing shell detection path.
- Added a remote environment test that calls `Environment::info()`
through an exec-server-backed environment.

## Validation

- `git diff --check`
- `just test -p codex-shell-command`
- `just test -p codex-core -E 'test(/shell::tests::/)'`
- `just test -p codex-exec-server environment`

pakrym-oai · 2026-06-04 22:36:25 -07:00

6a6a5f925e

feat(remote-control): allow pairing while disabled (#26215 )

## Why

`remoteControl/pairing/start` creates authorization for future
remote-control connections, so it should not require the live websocket
to already be enabled. Requiring enable first made pairing depend on
presence instead of the persisted server enrollment that pairing
actually uses.

Pairing also needs to recover when that persisted server row is stale.
If `/server/pair` returns `404`, making the first pairing attempt fail
forces a manual retry even though the client can clear the stale row and
create a replacement enrollment immediately.

## What Changed

- Allow `remoteControl/pairing/start` to reuse or create the persisted
remote-control server enrollment while remote control is disabled.
- Keep the selected in-memory enrollment across disable and share it
with websocket connect so a later enable uses the same selected server.
- Thread the app-server client name through pairing so stdio persistence
keeps using the websocket-owned enrollment key.
- Recover pairing server-token auth failures through the existing
refresh/auth-recovery path.
- Recover stale pairing enrollment on `/server/pair` `404` by clearing
the stale selected enrollment, re-enrolling once, and retrying pairing
once.
- Add focused disabled-pairing and stale-pairing recovery coverage.

## Verification

- `remote_control_pairing_start_returns_pairing_artifacts_while_disabled`
exercises pairing before enable.
- `remote_control_handle_reenrolls_after_stale_pairing_enrollment`
exercises stale `/server/pair` `404` recovery without a manual retry.

Related: N/A

Anton Panasenko · 2026-06-05 05:12:23 +00:00

64e0829cab

core: derive exec policy filesystem policy from profile (#26499 )

## Why

`PermissionProfile` already owns the runtime filesystem sandbox policy
through `file_system_sandbox_policy()`. Keeping a separate
`FileSystemSandboxPolicy` on exec-policy fallback contexts made it
possible for callers and tests to construct split states that the
production permission model should not rely on.

## What changed

- Removed `file_system_sandbox_policy` from `UnmatchedCommandContext`,
`ExecApprovalRequest`, and the intercepted Unix exec-policy context.
- Derived filesystem sandbox policy inside unmatched-command decision
logic from `PermissionProfile::file_system_sandbox_policy()`.
- Simplified shell/unified-exec callers and tests that were only
plumbing the duplicate policy through.

## Testing

Local tests not run per request; relying on remote CI.

Michael Bolin · 2026-06-04 21:48:45 -07:00

a2f5874b7a

[codex] Keep Bazel startup options stable across commands (#26256 )

## Why

`just bazel-clippy` ran target discovery with
`--noexperimental_remote_repo_contents_cache`, then ran the build with
the workspace default `--experimental_remote_repo_contents_cache`. Bazel
therefore killed and restarted its server on each transition, slowing
repeated commands and discarding the in-memory analysis cache. An audit
found the same class of startup-option variation in several CI command
sequences.

## What changed

- Keep local lint target-discovery queries on the workspace-default
Bazel server, while making CI target discovery explicitly use the CI
startup options.
- Normalize GitHub Actions launches through the BuildBuddy wrapper to
share `BAZEL_OUTPUT_USER_ROOT` and
`--noexperimental_remote_repo_contents_cache`.
- Route the CI lockfile check and Windows test-shard query through the
same startup configuration.
- Document the startup-option invariant and add wrapper regression
coverage.

## Validation

- Confirmed consecutive local clippy target-discovery runs retained the
same Bazel server PID.

Adam Perry @ OpenAI · 2026-06-04 20:23:37 -07:00

1d9c9c9f33

fix(rmcp): refresh expired OAuth tokens before startup (#26482 )

## Why

Codex persists OAuth expiry as an absolute `expires_at`, then
reconstructs RMCP’s relative `expires_in` when credentials are loaded.
For an already-expired token, Codex reconstructed `expires_in` as
missing.

[RMCP 0.15 treated a missing `expires_in` as zero when a refresh token
was
present](https://github.com/modelcontextprotocol/rust-sdk/blob/9cfc905a9ef17c8bba6748dc0a9bdd2452681733/crates/rmcp/src/transport/auth.rs#L704-L723),
so this still triggered a refresh. [RMCP 1.7 treats missing expiry
information as unknown and uses the access token
as-is](https://github.com/modelcontextprotocol/rust-sdk/blob/3529c3675ff64db805bd947ca6ece6090809e43d/crates/rmcp/src/transport/auth.rs#L1233-L1265),
causing the stale token to be sent during `initialize`.

## What changed

- Represent a known-expired persisted token as `expires_in = 0`,
preserving `None` for genuinely unknown expiry.
- Add Streamable HTTP coverage requiring the token to refresh before the
startup handshake.
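
The reconstruction rule can be sketched as follows. The function name and the use of plain second counts are assumptions for illustration; the actual code may use different time types.

```rust
/// Rebuild a relative `expires_in` from a persisted absolute `expires_at`.
/// Both values are seconds since the Unix epoch (an assumption of this sketch).
///
/// - `None` stays `None`: the expiry is genuinely unknown.
/// - An already-expired token yields `Some(0)` rather than `None`, so it is
///   reported as expired instead of as having unknown expiry.
fn expires_in_secs(expires_at: Option<u64>, now: u64) -> Option<u64> {
    expires_at.map(|at| at.saturating_sub(now))
}
```

The key point is that `saturating_sub` clamps a past expiry to zero instead of dropping the value, which keeps "expired" distinguishable from "unknown".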

## Validation

- The new regression test fails on RMCP 1.7 before the fix and passes
afterward.
- The same scenario passes on the commit immediately before the RMCP 1.7
update, using RMCP 0.15.
- `just test -p codex-rmcp-client` (63 passed).

Adam Perry @ OpenAI · 2026-06-05 02:31:06 +00:00

4de7a2b9d8

[codex] Add use_responses_lite 'override' logic (#26487 )

## Summary

- add a defaulted `ModelInfo.use_responses_lite` catalog field
- support serializing `reasoning.context` while preserving the existing
effort and summary path
- the field has not been turned on for any models yet

I've added an override that disables parallel tool calls when
`use_responses_lite` is on, and I've also forced persistent reasoning in
that mode. Ideally all of the `use_responses_lite` plumbing would be
centralized, but for now this keeps the plumbing and diffs small.

## Testing

- `cargo test -p codex-protocol
model_info_defaults_availability_nux_to_none_when_omitted`
- `RUST_MIN_STACK=8388608 cargo test -p codex-core
responses_lite_sets_all_turns_context_and_disables_parallel_tool_calls`
- `RUST_MIN_STACK=8388608 cargo test -p codex-core
configured_reasoning_summary_is_sent`
- `cargo check -p codex-core --tests`
- `RUST_MIN_STACK=8388608 cargo clippy -p codex-core --tests` (passes
with pre-existing warnings in `codex-code-mode` and
`codex-core-plugins`)

rka-oai · 2026-06-04 18:49:51 -07:00

e0096db6dc

[codex] Emit sandbox outcome telemetry event (#25955 )

## Summary

Adds a dedicated `codex.sandbox_outcome` telemetry event so we can query
sandbox edge outcomes without threading sandbox metadata through
tool-result output types.

This is meant to make sandbox failures and approved escalation retries
visible in OTEL while keeping the existing `codex.tool_result` event
shape focused on tool completion data.

## What changed

- Adds `SessionTelemetry::sandbox_outcome(...)`, which emits
`codex.sandbox_outcome` as both a log and trace event.
- Records the tool name, call id, sandbox outcome, initial attempt
duration, and escalated attempt duration when a retry runs.
- Emits `denied` when the sandbox blocks execution and no retry is run.
- Emits `timed_out` and `signal` when those sandbox errors surface from
tool execution.
- Emits `escalated` when the initial sandboxed attempt fails and the
approved unsandboxed retry succeeds.
- Adds OTEL coverage for the new event payload, including timing fields.

## Validation

- `RUST_MIN_STACK=8388608 just test -p codex-core
sandbox_outcome_event_records_outcome
handle_sandbox_error_user_approves_retry_records_tool_decision`
- `just test -p codex-otel
otel_export_routing_policy_routes_tool_result_log_and_trace_events
runtime_metrics_summary_collects_tool_api_and_streaming_metrics`
- `just fix -p codex-core`
- `just fix -p codex-otel`

rreichel3-oai · 2026-06-04 20:58:14 -04:00

ecae412740

ci: test windows cross build (#25000 )

We cross-build when using Bazel for Windows. This caused a couple of
hiccups: V8's `mksnapshot` step expects to snapshot on the host
architecture, which did not match during the cross-build. The mismatch
caused segfault failures when starting code-mode from a cross-built
artifact.

This change cross-builds the library, then runs and links a snapshot on
the host machine/architecture, which is Windows. This gives us a
functional snapshot and library that can start code-mode on Windows.

This fixes the build and also fixes two test regressions we had.

Channing Conger · 2026-06-04 17:51:13 -07:00

4be1a168fc

Pull plugin service less frequently (#26431 )

# Summary
Reduce download traffic to `github.com/openai/plugins` while continuing
to check for updates on every Codex startup.

# Root cause
The startup sync replaced the local repository with a fresh shallow
clone whenever the remote revision changed. At Codex's global scale,
repeatedly downloading the repository created excessive GitHub traffic.

# Changes
- Run `git ls-remote` on each startup to read the remote HEAD SHA.
- Skip all repository downloads when the local and remote SHAs match.
- Update existing checkouts with an exact-SHA shallow `git fetch`,
followed by reset and clean.
- Bootstrap new installations with `git init` plus the same shallow
fetch, rather than cloning.
- Keep the existing file lock so concurrent Codex processes serialize
updates and do not duplicate fetches.
- Preserve the existing GitHub HTTP and export archive fallback
behavior.
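
The decision made after the remote HEAD check can be sketched as a small planning function. `SyncPlan` and `plan_sync` are hypothetical names; the sketch covers only the choice between skipping, fetching, and bootstrapping, not the file lock or the HTTP fallback.

```rust
/// What the startup sync should do after reading the remote HEAD SHA.
#[derive(Debug, PartialEq)]
enum SyncPlan {
    /// Local and remote SHAs match: download nothing.
    UpToDate,
    /// Existing checkout: shallow-fetch the exact SHA, then reset and clean.
    FetchIntoExisting { sha: String },
    /// No checkout yet: `git init`, then the same exact-SHA shallow fetch.
    InitAndFetch { sha: String },
}

/// `local_sha` is `None` when no local checkout exists.
fn plan_sync(local_sha: Option<&str>, remote_sha: &str) -> SyncPlan {
    match local_sha {
        Some(local) if local == remote_sha => SyncPlan::UpToDate,
        Some(_) => SyncPlan::FetchIntoExisting { sha: remote_sha.to_string() },
        None => SyncPlan::InitAndFetch { sha: remote_sha.to_string() },
    }
}
```

Keeping the decision separate from the Git invocations makes the "no download when SHAs match" property easy to test without network access.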

# Impact
Each startup makes one lightweight remote HEAD check. Repository objects
are downloaded only when the revision changes, and existing Git objects
are reused during updates.

# Validation
- `just test -p codex-core-plugins startup_sync` (15 tests passed)
- `just test -p codex-core-plugins` (201 tests passed)
- `just clippy -p codex-core-plugins` (passes with one pre-existing
`large_enum_variant` warning)
- Production app-server smoke test against GitHub:
  - Fresh home: `ls-remote`, `git init`, one exact-SHA shallow fetch
  - Unchanged restart: `ls-remote` and local `rev-parse` only; no fetch or
    clone
- Bench smoke passed

beggers-openai · 2026-06-04 17:47:58 -07:00

72d0bfb6ba

Improve Windows sandbox setup refresh diagnostics (#26471 )

## Why

Users have been seeing opaque Windows sandbox setup refresh failures
such as `windows sandbox: spawn setup refresh`, including reports in
#24391 and #21208. The setup refresh path already runs the Windows
sandbox setup helper, but it was not using the same structured
`setup_error.json` reporting path that elevated setup uses. As a result,
when the helper exited non-zero, Codex only surfaced a generic refresh
status instead of the helper's `SetupFailure` code and message.

## What changed

- Clear stale `setup_error.json` before non-elevated setup refresh
launches the helper.
- When the refresh helper exits non-zero, read the helper-written report
through the existing `report_helper_failure` path.
- Keep a parent-side launch diagnostic for cases where the helper never
starts, including the helper path, cwd, sandbox log path, and spawn
error.
- Clear the setup error report after a successful refresh.
- Add regression coverage for report consumption and stale-report
avoidance.

## Verification

- `cargo test -p codex-windows-sandbox setup::tests::`

iceweasel-oai · 2026-06-04 16:52:10 -07:00

0b2e7b5eb1

[codex] Expose unavailable app templates in plugin detail (#26317 )

## Summary
- Adds `unavailable_app_templates` to the app-server protocol and
generated schemas/types.
- Parses plugin-service `release.unavailable_app_templates` in the
remote plugin client.
- Maps remote unavailable templates into app-server `PluginDetail`.
- Defaults local plugins to an empty unavailable app template list.

## Validation
- `just write-app-server-schema`
- `cargo +1.95.0 fmt --manifest-path codex-rs/Cargo.toml --all --check`
- `cargo +1.95.0 test --manifest-path codex-rs/Cargo.toml -p
codex-app-server-protocol schema_fixtures`
- `cargo +1.95.0 check --manifest-path codex-rs/Cargo.toml -p
codex-app-server-protocol -p codex-core-plugins -p codex-app-server`
- `git diff --check`

Note: default `cargo check` uses rustc 1.89 locally and failed because
dependencies require newer Rust, so validation was rerun with installed
Rust 1.95.

charlesgong-openai · 2026-06-04 23:42:27 +00:00

b9ff450902

Add skill for pushing CI configuration changes (#26473 )

## Why

Codex agents that modify GitHub Actions configuration need clear
guidance when repository push protections require temporary approval.
Without it, an agent may pursue an unavailable exemption or stop before
checking whether the user already has access.

## What

Add a `pushing-ci-changes` skill that explains the restriction, directs
agents to attempt the push first, and tells them how to involve the user
when approval is required.

## Validation

Not run; this change only adds skill documentation.

Adam Perry @ OpenAI · 2026-06-04 15:40:16 -07:00

e695ec8ec6

fix(app-server): expose remote MCP servers in plugin read (#26453 )

## Why

Remote plugin detail responses include MCP server metadata under
`release.mcp_servers`, but Codex did not deserialize or propagate that
field. As a result, `plugin/read` always returned an empty `mcpServers`
list for remote plugins, so the plugin details pane omitted the MCP
Servers section even when the remote plugin declares one.

This affects uninstalled plugins as well: the remote detail API is the
source of truth and returns MCP server keys without requiring a local
plugin bundle.

## What changed

- Deserialize MCP server entries from remote plugin detail responses.
- Normalize their keys into a sorted, deduplicated list on
`RemotePluginDetail`.
- Return those keys from app-server `plugin/read` instead of hardcoding
an empty list.
- Add regression coverage proving an uninstalled remote plugin returns
its MCP server names.

## Test plan

- `just test -p codex-core-plugins`
- `just test -p codex-app-server plugin_read`

Eric Ning · 2026-06-04 22:10:24 +00:00

769c231aa1

[codex] Preserve logical paths during AGENTS.md discovery (#26465 )

## Intent

Follow up on #26205 by avoiding unnecessary filesystem canonicalization
during `AGENTS.md` discovery. The configured working directory is
already absolute, and canonicalization incorrectly switches symlinked
workspaces from their logical parent hierarchy to the target's
hierarchy.

## User-facing behavior

For a symlinked working directory such as:

```text
test-root/
|-- logical-repo/
|   |-- AGENTS.md              ("logical parent doc")
|   `-- workspace ------------> physical-repo/workspace/
`-- physical-repo/
    |-- AGENTS.md              ("physical parent doc")
    `-- workspace/
        `-- AGENTS.md          ("workspace doc")
```

Before this change, Codex canonicalized `logical-repo/workspace` to
`physical-repo/workspace` before discovery. It therefore loaded
`physical-repo/AGENTS.md` and `physical-repo/workspace/AGENTS.md`,
ignoring the instructions from the repository through which the user
entered the workspace.

After this change, ancestor discovery walks the configured logical path,
so Codex loads `logical-repo/AGENTS.md`. Opening
`logical-repo/workspace/AGENTS.md` still follows the symlink through the
host filesystem, so the workspace document is also loaded.
`physical-repo/AGENTS.md` is not loaded.

## Implementation

Use the logical absolute working directory when discovering project
instructions and reporting instruction sources. Filesystem reads still
follow the working-directory symlink, so an `AGENTS.md` in the target
workspace continues to load while ancestor discovery uses the symlink's
parents.
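
The difference between logical and canonical ancestor walks can be sketched with a purely lexical traversal. `candidate_agents_paths` is a hypothetical name, and the sketch walks all the way to the filesystem root; the real discovery presumably stops at a project boundary.

```rust
use std::path::{Path, PathBuf};

/// List candidate `AGENTS.md` paths from the outermost ancestor down to the
/// working directory, using the logical path exactly as configured.
/// No `canonicalize` call is made, so symlinks are never resolved here.
fn candidate_agents_paths(logical_cwd: &Path) -> Vec<PathBuf> {
    let mut candidates: Vec<PathBuf> = logical_cwd
        .ancestors()
        .map(|dir| dir.join("AGENTS.md"))
        .collect();
    // `ancestors` yields innermost first; reverse so parents come first.
    candidates.reverse();
    candidates
}
```

For `/test-root/logical-repo/workspace` the candidates include `/test-root/logical-repo/AGENTS.md` and never mention `physical-repo`; opening the final `workspace/AGENTS.md` candidate still follows the symlink at read time.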

## Validation

Added integration coverage proving that discovery loads the logical
parent's instructions and the target workspace's instructions, but not
the target parent's instructions.

Adam Perry @ OpenAI · 2026-06-04 15:08:52 -07:00

59ca34206b

Use Winget release environment secret (#26466 )

## Why
`WINGET_PUBLISH_PAT` now lives as a GitHub environment secret under
`mainline-release-winget`. The WinGet release job needs to enter that
environment so `secrets.WINGET_PUBLISH_PAT` resolves during
stable/mainline Rust releases.

## What Changed
- Attach the `winget` job in `.github/workflows/rust-release.yml` to the
`mainline-release-winget` environment.
- Set `deployment: false` so the job can read environment secrets
without creating GitHub deployment records.

## Operational Note
The `mainline-release-winget` environment must allow `rust-v*.*.*` tag
refs before this can run on release tags. The live environment currently
has a custom policy named `rust-v*.*.*` with type `branch`; add the
corresponding `tag` policy before relying on this path for a release.

## Validation
- `git diff --check origin/main...HEAD --
.github/workflows/rust-release.yml`
- `ruby -e 'require "yaml"; ARGV.each { |f| YAML.load_file(f); puts
"yaml ok: #{f}" }' .github/workflows/rust-release.yml`

Shijie Rao · 2026-06-04 14:38:11 -07:00

37c8aefa14

[codex] Use model-advertised reasoning effort order (#26446 )

## Summary
- preserve the model catalog order for app-server
`supportedReasoningEfforts` and document that client contract
- render TUI reasoning choices in the advertised order
- step reasoning shortcuts by adjacent list position instead of deriving
order from known effort names
- anchor unsupported configured values to the advertised default, or the
first option when needed
- remove canonical effort ordering helpers and the unused upgrade effort
mapping
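
Stepping by list position can be sketched as below. `step_effort` is a hypothetical name, and two details are assumptions of this sketch rather than confirmed behavior: an unsupported current value is first anchored to the advertised default (or the first option) and then stepped from there, and stepping clamps at the ends of the list.

```rust
/// Move one position forward or backward in the model-advertised effort list.
/// Returns `None` only when the model advertises no efforts at all.
fn step_effort<'a>(
    advertised: &[&'a str],
    default: &str,
    current: &str,
    forward: bool,
) -> Option<&'a str> {
    if advertised.is_empty() {
        return None;
    }
    let position = |name: &str| advertised.iter().position(|effort| *effort == name);
    // Anchor: the current value, else the advertised default, else index 0.
    let anchor = position(current)
        .or_else(|| position(default))
        .unwrap_or(0);
    let next = if forward {
        (anchor + 1).min(advertised.len() - 1)
    } else {
        anchor.saturating_sub(1)
    };
    Some(advertised[next])
}
```

Because the function only compares positions, it works for effort names it has never seen, which is the point of dropping the canonical ordering helpers.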

## Validation
- `just fmt`
- Local tests and compilation were not run per request; relying on CI.

Stacked on #26444.

Ahmed Ibrahim · 2026-06-04 14:01:14 -07:00

f6e529656f

[codex] Support model-defined reasoning efforts (#26444 )

## Summary
- accept non-empty model-defined reasoning effort values while
preserving built-in effort behavior
- propagate the non-Copy effort type through core, app-server, TUI,
telemetry, and persistence call sites
- preserve string wire encoding and expose an open-string schema for
clients
- update model selection and shortcut behavior for model-advertised
effort values

## Root cause
`ReasoningEffort` gained a string-backed custom variant, so it could no
longer implement `Copy` or rely on derived closed-enum serialization.
Existing consumers still moved effort values from shared references and
assumed a fixed built-in value set.
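
The shape of the problem can be sketched with a reduced enum. This is an illustration only: the built-in variants shown are a subset chosen for the example, and `parse`/`as_str` are hypothetical helpers, not the crate's actual API.

```rust
/// A string-backed variant makes the type `Clone` but no longer `Copy`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
    /// Any non-empty, model-defined value that is not a built-in name.
    Custom(String),
}

impl ReasoningEffort {
    /// Accept built-in names and any other non-empty string.
    fn parse(raw: &str) -> Option<Self> {
        match raw {
            "" => None,
            "low" => Some(Self::Low),
            "medium" => Some(Self::Medium),
            "high" => Some(Self::High),
            other => Some(Self::Custom(other.to_string())),
        }
    }

    /// The wire encoding stays a plain string for every variant.
    fn as_str(&self) -> &str {
        match self {
            Self::Low => "low",
            Self::Medium => "medium",
            Self::High => "high",
            Self::Custom(value) => value,
        }
    }
}
```

Call sites that previously copied the value out of a shared reference now need an explicit `.clone()` or must borrow, which is the churn this PR propagates through core, app-server, and the TUI.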

## Validation
- `just fmt`
- Local tests and compilation were not run per request; relying on CI.

Ahmed Ibrahim · 2026-06-04 13:36:24 -07:00

8ac304c299

Cleanup experimentalFeature/enablement/set (#26312 )

## Why

`experimentalFeature/enablement/set` still allowed several keys that no
longer need to be managed through this API. Keeping those keys also
preserved corresponding special-case logic, including refreshing the
apps list when the `apps` key was enabled.

The endpoint also rejected an entire request when any key was invalid or
unsupported. That makes clients brittle when they send a mix of current
and stale keys, even when the valid entries can still be applied safely.

## What changed

- remove the feature keys that no longer need to be supported by
`experimentalFeature/enablement/set`
- remove the corresponding apps-list refresh path and its auth/config
plumbing
- ignore and warn on invalid or unsupported keys while still applying
valid keys from the same request
- update the app-server documentation and integration coverage for the
reduced key set and partial-acceptance behavior
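
The partial-acceptance behavior can be sketched as follows. `EnablementOutcome` and `apply_enablement` are hypothetical names, the keys used in any example are placeholders, and the sketch records ignored keys where the real endpoint would emit a warning.

```rust
use std::collections::BTreeMap;

/// Result of one request: keys that were applied and keys that were skipped.
#[derive(Debug, Default)]
struct EnablementOutcome {
    applied: BTreeMap<String, bool>,
    ignored: Vec<String>,
}

/// Apply every supported key and skip the rest instead of rejecting the
/// whole request.
fn apply_enablement(
    supported: &[&str],
    requested: &BTreeMap<String, bool>,
) -> EnablementOutcome {
    let mut outcome = EnablementOutcome::default();
    for (key, enabled) in requested {
        if supported.contains(&key.as_str()) {
            outcome.applied.insert(key.clone(), *enabled);
        } else {
            // The real endpoint warns here; this sketch only records the key.
            outcome.ignored.push(key.clone());
        }
    }
    outcome
}
```

A client that sends one current key and one stale key therefore still gets the current key applied, rather than having the entire request rejected.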

## Test plan

- `just test -p codex-app-server experimental_feature_enablement_set` (6
passed)
- `just test -p codex-app-server` exercised the changed tests
successfully; unrelated sandbox-dependent and watcher/timing tests
failed locally

Matthew Zeng · 2026-06-04 13:35:31 -07:00

4a70e0ac1b
