codex

Update safety check wording (#19149 )

Updates wording of cyber safety check.

Eric Traut · 2026-04-23 08:53:25 -07:00

1fda843fbc

exec-server: wait for close after observed exit (#19130 )

## Why

Windows CI can flake in
`server::handler::tests::output_and_exit_are_retained_after_notification_receiver_closes`
after a process has exited but before both output streams have closed.
`exec/read` returned immediately whenever `exited` was true, so callers
that had already observed the exit event could spin instead of
long-polling for the later `closed` state.

## What Changed

- Keep returning immediately when a terminal exit event is newly
observable.
- Allow later reads, after the caller has advanced past that event, to
wait for `closed` or new output until `wait_ms` expires.

## Verification

- CI pending.

jif-oai · 2026-04-23 16:50:17 +02:00

45e1742030

Reject agents.max_threads with multi_agent_v2 (#19129 )

## Why

`multi_agent_v2` uses the v2 agent lifecycle, so accepting the legacy
`agents.max_threads` limit alongside it creates conflicting
configuration semantics. Config load should fail early with a clear
error instead of allowing both knobs to be set.

## What Changed

- During config load, detect when the effective `multi_agent_v2` feature
is enabled and `agents.max_threads` is explicitly set.
- Return an `InvalidInput` error: `agents.max_threads cannot be set when
multi_agent_v2 is enabled`.

## Verification

- `cargo test -p codex-core multi_agent_v2_rejects_agents_max_threads`
passed locally with a temporary focused test for this behavior.
- `cargo test -p codex-core` was also run; the new focused path passed,
but the crate suite has unrelated pre-existing failures in managed
config/proxy/request-permissions tests.

jif-oai · 2026-04-23 13:31:54 +02:00

d3b044938d

Fix auto-review config compatibility across protocol and SDK (#19113 )

## Why

This keeps the partial Guardian subagent -> Auto-review rename
forward-compatible across mixed Codex installations. Newer binaries need
to understand the new `auto_review` spelling, but they cannot write it
to shared `~/.codex/config.toml` yet because older CLI/app-server
bundles only know `user` and `guardian_subagent` and can fail during
config load before recovering.

The Python SDK had the opposite compatibility gap: app-server responses
can contain `approvalsReviewer: "auto_review"`, but the checked-in
generated SDK enum did not accept that value.

## What Changed

- Keep `ApprovalsReviewer::AutoReview` readable from both
`guardian_subagent` and `auto_review`, while serializing it as
`guardian_subagent` in both protocol crates.
- Update TUI Auto-review persistence tests so enabling Auto-review
writes `approvals_reviewer = "guardian_subagent"` while UI copy still
says Auto-review.
- Map managed/cloud `feature_requirements.auto_review` to the existing
`Feature::GuardianApproval` gate without adding a broad local
`[features].auto_review` key or changing config writes.
- Add `auto_review` to the Python SDK `ApprovalsReviewer` enum and cover
`ThreadResumeResponse` validation.

## Testing

- `cargo test -p codex-protocol approvals_reviewer`
- `cargo test -p codex-app-server-protocol approvals_reviewer`
- `cargo test -p codex-tui
update_feature_flags_enabling_guardian_selects_auto_review`
- `cargo test -p codex-tui
update_feature_flags_enabling_guardian_in_profile_sets_profile_auto_review_policy`
- `cargo test -p codex-core
feature_requirements_auto_review_disables_guardian_approval`
- `pytest
sdk/python/tests/test_client_rpc_methods.py::test_thread_resume_response_accepts_auto_review_reviewer`
- `git diff --check`

Won Park · 2026-04-23 03:12:56 -07:00

17ae906048

Support MCP tools in hooks (#18385 )

## Summary

Lifecycle hooks currently treat `PreToolUse`, `PostToolUse`, and
`PermissionRequest` as Bash-only flows
- hook schema constrains `tool_name` to `Bash`
- hook input assumes a command-shaped `tool_input`
- core hook dispatch path passes only shell command strings

That means hooks cannot target MCP tools even though MCP tool names are
model-visible and stable

This change generalizes those hook paths so they can match and receive
payloads for MCP tools while preserving the existing Bash behavior.

## Reviewer Notes

I think these are the key files
- `codex-rs/core/src/tools/handlers/mcp.rs`
- `codex-rs/core/src/mcp_tool_call.rs`

Otherwise the changes across apply_patch, shell, and unified_exec are
mainly to rewire everything to be `tool_input` based instead of just
`command` so that it'll make sense for MCP tools.

## Changes

- Allow `PreToolUse`, `PostToolUse`, and `PermissionRequest` hook inputs
to carry arbitrary `tool_name` and `tool_input` values instead of
hard-coding `Bash` and command-only payloads.
- Add MCP hook payload support through `McpHandler`, using the
model-visible tool name from `ToolInvocation` and the raw MCP arguments
as `tool_input`.
- Include MCP tool responses in `PostToolUse` by serializing
`McpToolOutput` into the hook response payload.
- Run `PermissionRequest` hooks for MCP approval requests after
remembered approval checks and before falling back to user-facing MCP
elicitation.
- Preserve exact matching for literal hook matchers like `Bash` and
`mcp__memory__create_entities`, while keeping regex matcher support for
patterns like `mcp__memory__.*` and `mcp__.*__write.*`.

---------

Co-authored-by: Andrei Eternal <eternal@openai.com>
Co-authored-by: Codex <noreply@openai.com>

Abhinav · 2026-04-23 07:33:57 +00:00

305825abd9

app-server: include filesystem entries in permission requests (#19086 )

## Why

`item/permissions/requestApproval` sends a requested permission profile
to app-server clients. The core profile already stores filesystem
permissions as `entries`, but the v2 compatibility conversion used the
legacy `read`/`write` projection whenever possible and left `entries`
unset.

That made the request ambiguous for clients that consume the canonical
v2 shape: `permissions.fileSystem.entries` was missing even though
filesystem access was being requested. A client that rendered or echoed
grants from `entries` could treat the request as having no filesystem
permission entries, then return an empty or incomplete grant. The
app-server intersects responses with the original request, so omitted
filesystem permissions are denied.

## What Changed

- Populate `AdditionalFileSystemPermissions.entries` when converting
legacy read/write roots for request permission payloads, while
preserving `read` and `write` for compatibility.
- Mark `read` and `write` as transitional schema fields in the generated
app-server schema.
- Add regression coverage for the v2 conversion, the app-server
`item/permissions/requestApproval` round trip, and TUI app-server
approval conversion expectations.
- Refresh generated JSON and TypeScript schema fixtures.

## Verification

- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server request_permissions_round_trip`
- `cargo test -p codex-tui
converts_request_permissions_into_granted_permissions`
- `cargo test -p codex-tui
resolves_permissions_and_user_input_through_app_server_request_id`

Michael Bolin · 2026-04-23 00:21:59 -07:00

8bc667b07b

Persist target default reasoning on model upgrade (#19085 )

## Why

When the TUI upgrade flow moves a user to a newer model, the accepted
migration should also persist the target model's default reasoning
effort. That keeps the upgraded model and reasoning setting aligned
instead of carrying forward a stale previously saved effort from the old
model.

## What changed

- The accepted model migration path now updates in-memory config, TUI
state, and persisted model selection with the target preset's
`default_reasoning_effort`.
- The upgrade destructuring keeps `reasoning_effort_mapping` explicitly
unused because mappings are no longer consulted on accepted migrations.
- Added a catalog test that starts with a pre-existing saved reasoning
effort and verifies the accepted upgrade overwrites it with the target
model default and emits the expected persistence events.
- Rebasing onto current `main` also updates a TUI thread-session test
helper for the latest `permission_profile` field and
`ApprovalsReviewer::AutoReview` rename so CI compiles on the new base.

## Verification

- `cargo test -p codex-tui model_catalog`
- `cargo test -p codex-tui
permission_settings_sync_updates_active_snapshot_without_rewriting_side_thread`

Shijie Rao · 2026-04-22 23:36:15 -07:00

993e3f407e

Clarify cloud requirements error messages (#19078 )

## Why
The current cloud-requirements failures say `workspace-managed config`,
which is ambiguous and can read like it refers to local managed config
such as `managed_config.toml`.

This code path only applies to cloud requirements, so the user-facing
message should name that source directly.

## What changed
- Updated the load failure in
[`codex-rs/cloud-requirements/src/lib.rs`](https://github.com/openai/codex/blob/46e704d1f93054daa9a3b5a9100333c540c81d50/codex-rs/cloud-requirements/src/lib.rs)
to say `failed to load cloud requirements (workspace-managed policies)`.
- Updated the parse failure in the same file to use the same `cloud
requirements (workspace-managed policies)` terminology.
- Kept `workspace-managed` hyphenated because it is used as a compound
modifier.
- Updated the matching assertion in
[`codex-rs/app-server/src/codex_message_processor.rs`](https://github.com/openai/codex/blob/46e704d1f93054daa9a3b5a9100333c540c81d50/codex-rs/app-server/src/codex_message_processor.rs).
- Reused `CLOUD_REQUIREMENTS_LOAD_FAILED_MESSAGE` in the
`codex-cloud-requirements` test where the test is asserting that
crate-local contract directly.

## Testing
`cargo test -p codex-cloud-requirements`

Gav Verma · 2026-04-22 23:07:08 -07:00

2ef2d675d6

feat: Warn and continue on unknown feature requirements (#19038 )

Requirements feature flags now fail open like config feature flags, but
with a startup warning.

<img width="443" height="68" alt="image"
src="https://github.com/user-attachments/assets/76767fa7-8ce8-4fc7-8a09-902fcdda6298"
/>

xl-openai · 2026-04-22 22:50:44 -07:00

951be1a8a1

Use remote plugin IDs for detail reads and enlarge list pages (#19079 )

1. For remote plugin use plugin id (plugin name) directly for read
plugin details;
2. Request up to 200 remote plugins per directory list page.

xl-openai · 2026-04-22 22:50:20 -07:00

fb6308cf64

Add computer_use feature requirement key (#19071 )

## Summary
- add the `computer_use` requirements-only feature key
- include it in generated config schema output
- cover the new key in feature metadata tests

## Testing
- `cargo test -p codex-features`
- `just write-config-schema`
- `just fmt`
- `just fix -p codex-features`

cc @xl-openai

---------

Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>

Leo Shimonaka · 2026-04-22 22:49:26 -07:00

7730fb3ab8

TUI: preserve permission state after side conversations (#18924 )

Addresses #18854

## Why

The `/permissions` selector updates the active TUI session state, but
the cached session snapshot used when replaying a thread could still
contain the old approval or sandbox settings. After opening and leaving
`/side`, the main thread replay could restore those stale settings into
the `ChatWidget`, so the UI and the next submitted turn could fall back
to the old permission mode.

## What

- Sync the active thread's cached `ThreadSessionState` whenever approval
policy, sandbox policy, or approval reviewer changes.

## Verification

Confirmed bug prior to fix and correct behavior after fix.

Eric Traut · 2026-04-22 22:40:35 -07:00

08b5e96678

Mark codex_hooks stable (#19012 )

# Why

Hooks are ready to graduate to GA in the next release!

# What

- Moves `Feature::CodexHooks` into the stable feature group.
- Marks the `codex_hooks` feature spec as `Stage::Stable` and
default-enabled.

Abhinav · 2026-04-23 05:34:05 +00:00

23afa173f4

app-server: accept command permission profiles (#18283 )

## Why

`command/exec` is another app-server entry point that can run under
caller-provided permissions. It needs to accept `PermissionProfile`
directly so command execution is not left behind on `SandboxPolicy`
while thread APIs move forward.

Command-level profiles also need to preserve the semantics clients
expect from profile-relative paths. `:cwd` and cwd-relative deny globs
should be anchored to the resolved command cwd for a command-specific
profile, while configured deny-read restrictions such as `**/*.env =
none` still need to be enforced because they can come from config or
requirements rather than the command override itself.

## What Changed

This adds `permissionProfile` to `CommandExecParams`, rejects requests
that combine it with `sandboxPolicy`, and converts accepted profiles
into the runtime filesystem/network permissions used for command
execution.

When a command supplies a profile, the app-server resolves that profile
against the command cwd instead of the thread/server cwd. It also
preserves configured deny-read entries and `globScanMaxDepth` on the
effective filesystem policy so one-off command overrides cannot drop
those read protections. The PR also updates app-server docs/schema
fixtures and adds command-exec coverage for accepted, rejected,
cwd-scoped, and deny-read-preserving profile paths.

## Verification

- `cargo test -p codex-app-server
command_exec_permission_profile_cwd_uses_command_cwd`
- `cargo test -p codex-app-server
command_profile_preserves_configured_deny_read_restrictions`
- `cargo test -p codex-app-server
command_exec_accepts_permission_profile`
- `cargo test -p codex-app-server
command_exec_rejects_sandbox_policy_with_permission_profile`
- `just fix -p codex-app-server`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18283).
* #18288
* #18287
* #18286
* #18285
* #18284
* __->__ #18283

Michael Bolin · 2026-04-22 22:33:16 -07:00

9d824cf4b4

Add safety check notification and error handling (#19055 )

Adds a new app-server notification that fires when a user account has
been flagged for potential safety reasons.

Eric Traut · 2026-04-22 22:24:12 -07:00

bbff4ee61a

Default Fast service tier for eligible ChatGPT plans (#19053 )

## Why

Enterprise and business-like ChatGPT plans should get Codex's Fast
service tier by default when the user or caller has not made an explicit
service-tier choice. At the same time, callers need a durable way to
choose standard routing without adding a new persisted `standard`
service tier value. This keeps existing config compatibility while
letting core own the managed default policy.

## What changed

- Resolve the effective service tier in core at session creation:
explicit `fast` or `flex` wins, explicit null/clear or
`[notice].fast_default_opt_out = true` resolves to standard routing, and
otherwise eligible ChatGPT plans resolve to Fast when FastMode is
enabled.
- Add `[notice].fast_default_opt_out` as the persisted opt-out marker
for managed Fast defaults.
- Treat app-server/TUI `service_tier: null` as an explicit
standard/clear choice by preserving that intent through config loading.
- Update TUI rendering to use core's effective service tier for startup
and status surfaces while still keeping `config.service_tier` as the
explicit configured choice.
- Update `/fast off` to clear `service_tier`, persist the opt-out
marker, and send explicit standard for subsequent turns.

## Verification

- Added unit coverage for config override/notice handling, service-tier
resolution, runtime null clearing, and `/fast off` turn propagation.
- `cargo build -p codex-cli`

Full test suite was not run locally per author request.

Shijie Rao · 2026-04-22 21:54:44 -07:00

02170996e6

protocol: report session permission profiles (#18282 )

## Why

Clients that observe `SessionConfigured` need the same canonical
permission view that app-server thread responses provide. Reporting the
profile in protocol events lets clients keep their local state
synchronized without reinterpreting legacy sandbox fields.

## What changed

This adds `permission_profile` to `SessionConfigured` and propagates it
through core, exec JSON output, MCP server messages, and TUI
history/widget handling.

## Verification

- `cargo test -p codex-tui permissions -- --nocapture`
- `cargo test -p codex-core --test all permissions_messages --
--nocapture`















































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18282).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* __->__ #18282

Michael Bolin · 2026-04-22 21:29:32 -07:00

082fc4f632

codex: support hooks in config.toml and requirements.toml (#18893 )

## Summary

Support the existing hooks schema in inline TOML so hooks can be
configured from both `config.toml` and enterprise-managed
`requirements.toml` without requiring a separate `hooks.json` payload.

This gives enterprise admins a way to ship managed hook policy through
the existing requirements channel while still leaving script delivery to
MDM or other device-management tooling, and it keeps `hooks.json`
working unchanged for existing users.

This also lays the groundwork for follow-on managed filtering work such
as #15937, while continuing to respect project trust gating from #14718.
It does **not** implement `allow_managed_hooks_only` itself.

NOTE: yes, it's a bit unfortunate that the toml isn't formatted as
closely as normal to our default styling. This is because we're trying
to stay compatible with the spec for plugins/hooks that we'll need to
support & the main usecase here is embedding into requirements.toml

## What changed

- moved the shared hook serde model out of `codex-rs/hooks` into
`codex-rs/config` so the same schema can power `hooks.json`, inline
`config.toml` hooks, and managed `requirements.toml` hooks
- added `hooks` support to both `ConfigToml` and
`ConfigRequirementsToml`, including requirements-side `managed_dir` /
`windows_managed_dir`
- treated requirements-managed hooks as one constrained value via
`Constrained`, so managed hook policy is merged atomically and cannot
drift across requirement sources
- updated hook discovery to load requirements-managed hooks first, then
per-layer `hooks.json`, then per-layer inline TOML hooks, with a warning
when a single layer defines both representations
- threaded managed hook metadata through discovered handlers and exposed
requirements hooks in app-server responses, generated schemas, and
`/debug-config`
- added hook/config coverage in `codex-rs/config`, `codex-rs/hooks`,
`codex-rs/core/src/config_loader/tests.rs`, and
`codex-rs/core/tests/suite/hooks.rs`

## Testing

- `cargo test -p codex-config`
- `cargo test -p codex-hooks`
- `cargo test -p codex-app-server config_api`

## Documentation

Companion updates are needed in the developers website repo for:

- the hooks guide
- the config reference, sample, basic, and advanced pages
- the enterprise managed configuration guide

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>

Andrei Eternal · 2026-04-22 21:20:09 -07:00

2b2de3f38b

tui: fix approvals popup disabled shortcut test (#19072 )

## Why

This regressed in #19063, which made `GuardianApproval` stable and
enabled by default. That adds an enabled `Auto-review` row to the
permissions popup, but `approvals_popup_navigation_skips_disabled` still
assumed the disabled `Full Access` row lived behind a hard-coded numeric
shortcut, so the test started selecting a different row and closing the
popup instead of verifying disabled-row behavior.

## What

- disable `GuardianApproval` in
`approvals_popup_navigation_skips_disabled` so the popup layout matches
the scenario the test is exercising
- choose the hidden numeric shortcut for the disabled `Full Access` row
by platform (`2` on non-Windows, `3` on Windows where `Read Only` is
shown) before asserting that selecting the disabled row leaves the popup
open

## Testing

- `cargo test -p codex-tui --lib
chatwidget::tests::permissions::approvals_popup_navigation_skips_disabled
-- --exact --nocapture`
- `cargo test -p codex-tui --lib chatwidget::tests::permissions --
--nocapture`
- `cargo test -p codex-tui`

Michael Bolin · 2026-04-22 21:01:26 -07:00

9955eacd22

test: set Rust test thread stack size (#19067 )

## Summary

Set `RUST_MIN_STACK=8388608` for Rust test entry points so
libtest-spawned test threads get an 8 MiB stack.

The Windows BuildBuddy failure on #18893 showed
`//codex-rs/tui:tui-unit-tests` exiting with a stack overflow in a
`#[tokio::test]` even though later test binaries in the shard printed
successful summaries. Default `#[tokio::test]` uses a current-thread
Tokio runtime, which means the async test body is driven on libtest's
std-spawned test thread. Increasing the test thread stack addresses that
failure mode directly.

To date, we have been fixing these stack-pressure problems with
localized future-size reductions, such as #13429, and by adding
`Box::pin()` in specific async wrapper chains. This gives us a baseline
test-runner stack size instead of continuing to patch individual tests
only after CI finds another large async future.

## What changed

- Added `common --test_env=RUST_MIN_STACK=8388608` in `.bazelrc` so
Bazel test actions receive the env var through Bazel's cache-keyed test
environment path.
- Set the same `RUST_MIN_STACK` value for Cargo/nextest CI entry points
and `just test`.
- Annotated the existing Windows Bazel linker stack reserve as 8 MiB so
it stays aligned with the libtest thread stack size.

## Testing

- `just --list`
- parsed `.github/workflows/rust-ci.yml` and
`.github/workflows/rust-ci-full.yml` with Ruby's YAML loader
- compared `bazel aquery` `TestRunner` action keys before/after explicit
`--test_env=RUST_MIN_STACK=...` and after moving the Bazel env to
`.bazelrc`
- `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors`
- failed locally on the existing sandbox-specific status snapshot
permission mismatch, but loaded the Starlark changes and ran the TUI
test shards

Michael Bolin · 2026-04-22 19:51:49 -07:00

e8ba912fcc

feat(request-permissions) approve with strict review (#19050 )

## Summary
Allow the user to approve a request_permissions_tool request with the
condition that all commands in the rest of the turn are reviewed by
guardian, regardless of sandbox status.

## Testing
- [x] Added unit tests
- [x] Ran locally

Dylan Hurd · 2026-04-23 01:56:32 +00:00

5e71da1424

chore(auto-review) feature => stable (#19063 )

## Summary
Turn on Auto Review

## Testing
- [x] Update unit tests

Dylan Hurd · 2026-04-22 18:51:39 -07:00

c6ab601824

Fix relative stdio MCP cwd fallback (#19031 )

Matthew Zeng · 2026-04-22 17:52:17 -07:00

8f0a92c1e5

core: box multi-agent wrapper futures (#19059 )

## Why

While debugging the Windows stack overflows we saw in
[#13429](https://github.com/openai/codex/pull/13429) and then again in
[#18893](https://github.com/openai/codex/pull/18893), I hit another
overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.

That test drives the legacy multi-agent spawn / close / resume path. The
behavior was fine, but several thin async wrappers were still inlining
much larger `AgentControl` futures into their callers, which was enough
to overflow the default Windows stack.

## What

- Box the thin `AgentControl` wrappers around `spawn_agent_internal`,
`resume_single_agent_from_rollout`, and `shutdown_agent_tree`.
- Box the corresponding legacy `multi_agents` handler calls in `spawn`,
`resume_agent`, and `close_agent`.
- Keep behavior unchanged while reducing future size on this call path
so the Windows test no longer overflows its stack.

## Testing

- `cargo test -p codex-core --lib
tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed
-- --exact --nocapture`
- `cargo test -p codex-core` (this still hit unrelated local
integration-test failures because `codex.exe` / `test_stdio_server.exe`
were not present in this shell; the relevant unit tests passed)

Michael Bolin · 2026-04-22 17:48:13 -07:00

3cc3763e6c

[3/4] Add executor-backed RMCP HTTP client (#18583 )

### Why
The RMCP layer needs a Streamable HTTP client that can talk either
directly over `reqwest` or through the executor HTTP runner without
duplicating MCP session logic higher in the stack. This PR adds that
client-side transport boundary so remote Streamable HTTP MCP can reuse
the same RMCP flow as the local path.

### What
- Add a shared `rmcp-client/src/streamable_http/` module with:
  - `transport_client.rs` for the local-or-remote transport enum
  - `local_client.rs` for the direct `reqwest` implementation
  - `remote_client.rs` for the executor-backed implementation
  - `common.rs` for the small shared Streamable HTTP helpers
- Teach `RmcpClient` to build Streamable HTTP transports in either local
or remote mode while keeping the existing OAuth ownership in RMCP.
- Translate remote POST, GET, and DELETE session operations into
executor `http/request` calls.
- Preserve RMCP session expiry handling and reconnect behavior for the
remote transport.
- Add remote transport coverage in
`rmcp-client/tests/streamable_http_remote.rs` and keep the shared test
support in `rmcp-client/tests/streamable_http_test_support.rs`.

### Verification
- `cargo check -p codex-rmcp-client`
- online CI

### Stack
1. #18581 protocol
2. #18582 runner
3. #18583 RMCP client
4. #18584 manager wiring and local/remote coverage

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-22 17:38:04 -07:00

0e78ce80ee

Rename approvals reviewer variant to auto-review (#19056 )

## Why

`approvals_reviewer` now uses `auto_review` as the canonical config/API
value after #18504, but the Rust enum variant and nearby helper/test
names still used `GuardianSubagent` / guardian approval wording. That
made follow-up code and reviews confusing even though the external value
had already moved to Auto-review.

## What changed

- Renamed `ApprovalsReviewer::GuardianSubagent` to
`ApprovalsReviewer::AutoReview`.
- Updated protocol, app-server, config, core, TUI, exec, and analytics
test callsites.
- Renamed nearby helper/test names from guardian approval wording to
Auto-review wording where they refer to the approvals reviewer mode.
- Preserved wire compatibility:
  - `auto_review` remains the canonical serialized value.
  - `guardian_subagent` remains accepted as a legacy alias.

This intentionally does not rename the `[features].guardian_approval`
key, `Feature::GuardianApproval`, `core/src/guardian`, analytics event
names, or app-server Guardian review event types.

## Verification

- `cargo test -p codex-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
- `cargo test -p codex-app-server-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
- `cargo test -p codex-config approvals_reviewer`
- `cargo test -p codex-tui update_feature_flags`
- `cargo test -p codex-core permissions_instructions`
- `cargo test -p codex-tui permissions_selection`

Won Park · 2026-04-22 17:22:35 -07:00

83ec1eb5d6

hooks: emit Bash PostToolUse when exec_command completes via write_stdin (#18888 )

Fixes #16246.

## Why

`exec_command` already emits `PreToolUse`, but long-running unified exec
commands that finish on a later `write_stdin` poll could miss the
matching `PostToolUse`. That left the Bash hook lifecycle inconsistent,
broke expectations around `tool_use_id` and `tool_input.command`, and
meant `PostToolUse` block/replacement feedback could fail to replace the
final session output before it reached model context.

This keeps the fix scoped to the `exec_command` / `write_stdin`
lifecycle. Broader non-Bash hook expansion is still out of scope here
and remains tracked separately in #16732.

## What changed

- Compute and store `PostToolUsePayload` while handlers still have
access to their concrete output type, and carry `tool_use_id` through
that payload.
- Preserve the original hook-facing `exec_command` string through
unified exec state (`ExecCommandRequest`, `ProcessEntry`,
`PreparedProcessHandles`, and `ExecCommandToolOutput`) via
`hook_command`, and remove the now-unused `session_command` output
metadata.
- Emit exactly one Bash `PostToolUse` for long-running `exec_command`
sessions when a later `write_stdin` poll observes final completion,
using the original `exec_command` call id and hook-facing command.
- Keep one-shot `exec_command` behavior aligned with the same payload
construction, including interactive completions that return a final
result directly.
- Apply `PostToolUse` block/replacement feedback before the final
`write_stdin` completion output is sent back to the model.
- Keep `write_stdin` itself out of `PreToolUse` matching so it continues
to act as transport/polling for the original Bash tool call.
- Restore plain matcher behavior for tool-name matchers such as `Bash`
and `Edit|Write`, while still treating patterns with regex characters
(for example `mcp__.*`) as regexes.
- Add unit coverage for unified exec payload construction and parallel
session separation, plus a core integration regression that verifies a
blocked `PostToolUse` replaces the final `write_stdin` output in model
context.

## Testing

- `cargo test -p codex-hooks`
- `cargo test -p codex-core post_tool_use_payload`
- `cargo test -p codex-core
post_tool_use_blocks_when_exec_session_completes_via_write_stdin`

Andrei Eternal · 2026-04-22 17:14:22 -07:00

eed0e07825

rollout: persist turn permission profiles (#18281 )

## Why

Resume and reconstruction need to preserve the permissions that were
active for each user turn. If rollouts only keep legacy sandbox fields,
replay cannot faithfully represent profile-shaped overrides introduced
earlier in the stack.

## What changed

This records `permission_profile` on user-turn rollout events,
reconstructs it through history/state extraction, and updates rollout
reconstruction and related fixtures to keep the field explicit.

## Verification

- `cargo test -p codex-core --test all permissions_messages --
--nocapture`
- `cargo test -p codex-core --test all request_permissions --
--nocapture`











































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18281).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* __->__ #18281

Michael Bolin · 2026-04-22 17:00:29 -07:00

6ca038bbd1

clients: send permission profiles to app-server (#18280 )

## Why

After app-server can accept `PermissionProfile`, first-party clients
should stop preferring legacy sandbox fields when canonical permission
information is available. This keeps the migration moving without
removing legacy compatibility yet.

The client side still has mixed surfaces during the stack: embedded
thread start/resume/fork and exec initial turns can derive a profile
directly from local config, while TUI remote sessions and some
turn-start paths only have a legacy/server-context-safe sandbox
projection. Those paths keep sending legacy sandbox fields rather than
synthesizing or sending lossy/local-only profiles.

## What changed

- Sends `permissionProfile` from exec and embedded TUI thread
start/resume/fork requests when config has a representable profile.
- Keeps legacy sandbox fallback for external sandbox policies, TUI
remote thread lifecycle requests, and TUI turn-start requests that do
not yet carry the active profile.
- Sends the actual config-derived `permissionProfile` for exec initial
turns instead of rebuilding one from the legacy sandbox projection.
- Stores response `permissionProfile` as optional in TUI session state
so external sandbox responses and compatibility payloads preserve
`null`.
- Updates tests for request construction and response mapping.

## Verification

- `cargo check --tests -p codex-tui -p codex-exec`
- `cargo test -p codex-tui app_server_session -- --nocapture`
- `cargo test -p codex-exec thread_start_params -- --nocapture`
- `cargo test -p codex-tui
app_server_session::tests::thread_lifecycle_params -- --nocapture`
- `just fix -p codex-tui -p codex-exec`
- `just fix -p codex-tui`












---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18280).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* __->__ #18280

Michael Bolin · 2026-04-22 16:34:13 -07:00

bc083e4713

exec-server: require explicit filesystem sandbox cwd (#19046 )

## Why

This is a cleanup PR for the `PermissionProfile` migration stack. #19016
fixed remote exec-server sandbox contexts so Docker-backed filesystem
requests use a request/container `cwd` instead of leaking the local test
runner `cwd`. That exposed the broader API problem:
`FileSystemSandboxContext::new(SandboxPolicy)` could still reconstruct
filesystem permissions by reading the exec-server process cwd with
`AbsolutePathBuf::current_dir()`.

That made `cwd`-dependent legacy entries, such as `:cwd`,
`:project_roots`, and relative deny globs, depend on ambient process
state instead of the request sandbox `cwd`. As later PRs make
`PermissionProfile` the primary permissions abstraction, sandbox
contexts should be explicit about whether they carry a request `cwd` or
are profile-only. Removing the implicit constructor prevents new call
sites from accidentally rebuilding permissions against the wrong `cwd`.

## What changed

- Removed `FileSystemSandboxContext::new(SandboxPolicy)`.
- Kept production callers on explicit constructors:
`from_legacy_sandbox_policy(..., cwd)`, `from_permission_profile(...)`,
and `from_permission_profile_with_cwd(...)`.
- Updated exec-server test helpers to construct `PermissionProfile`
values directly instead of routing through legacy `SandboxPolicy`
projections.
- Updated the environment regression test to use an explicit restricted
profile with no synthetic `cwd`.

## Verification

- `cargo test -p codex-exec-server`
- `just fix -p codex-exec-server`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19046).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* #18280
* __->__ #19046

Michael Bolin · 2026-04-22 23:05:12 +00:00

44dbd9e48a

Rebrand approvals reviewer config to auto-review (#18504 )

### Why

Auto-review is the user-facing name for the approvals reviewer, but the
config/API value still exposed the old `guardian_subagent` name. That
made new configs and generated schemas point users at Guardian
terminology even though the intended product surface is Auto-review.

This PR updates the external `approvals_reviewer` value while preserving
compatibility for existing configs and clients.

### What changed

- Makes `auto_review` the canonical serialized value for
`approvals_reviewer`.
- Keeps `guardian_subagent` accepted as a legacy alias.
- Keeps `user` accepted and serialized as `user`.
- Updates generated config and app-server schemas so
`approvals_reviewer` includes:
  - `user`
  - `auto_review`
  - `guardian_subagent`
- Updates app-server README docs for the reviewer value.
- Updates analytics and config requirements tests for the canonical
auto_review value.


### Compatibility

Existing configs and API payloads using:

```toml
approvals_reviewer = "guardian_subagent"
```

continue to load and map to the Auto-review reviewer behavior. 

New serialization emits: 
```toml
approvals_reviewer = "auto_review" 
```

This PR intentionally does not rename the [features].guardian_approval
key or broad internal Guardian symbols. Those are split out for a
follow-up PR to keep this migration small and avoid touching large
TUI/internal surfaces.

**Verification**
cargo test -p codex-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
cargo test -p codex-app-server-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent

Won Park · 2026-04-22 15:45:35 -07:00

46142c3cb0

Update bundled OpenAI Docs skill freshness check (#19043 )

## Summary

Sync the bundled `openai-docs` system skill with the already-merged
`openai/skills` update from https://github.com/openai/skills/pull/360.

Codex bundles system skills from `codex-rs/skills/src/assets/samples`,
so this PR copies the same GPT-5.4 OpenAI Docs skill update into the
Codex app/CLI bundle path.

## Changes

- Add the latest-model resolver script to the bundled `openai-docs`
skill.
- Route model upgrade and prompt-upgrade requests through remote
latest-model metadata when current guidance is needed.
- Rename bundled fallback references to `upgrade-guide.md` and
`prompting-guide.md`.
- Keep the bundled fallback guidance GPT-5.4-only.

## Validation

- Verified this bundled skill is byte-for-byte identical to
`openai/skills@origin/main` `skills/.system/openai-docs`.
- Ran the resolver locally and confirmed it returns `gpt-5.4` /
`gpt-5p4`.

Konstantine Kahadze · 2026-04-22 22:31:04 +00:00

0e25c5ff42

[Codex] Register browser requirements feature keys (#18956 )

## Summary
- register `in_app_browser` and `browser_use` as stable feature keys
- allow requirements/MDM feature requirements to pin those desktop
browser controls
- add coverage for browser requirements being accepted by config loading

## Testing
- `cargo fmt --all` (`just fmt` unavailable locally; rustfmt warned
about nightly-only `imports_granularity` config)
- `cargo test -p codex-features`
- `cargo test -p codex-core browser_feature_requirements_are_valid`
- Tested manually by setting in `requirements.toml` and seeing after app
restart state to reflect the setting was correct (at the time hiding the
`Browser Use` setting when the enterprise setting was set to false

khoi · 2026-04-22 15:27:15 -07:00

568cdacc7e

Overlay state DB git metadata for filtered thread lists (#19036 )

## Summary
- Factor the state DB `ThreadMetadata` to rollout `ThreadItem` mapping
into a shared helper used by both DB pages and filesystem overlays
- Generalize filtered filesystem list overlays to fill missing thread
list metadata from the state-derived `ThreadItem`, while preserving
filesystem `path` and `thread_id`
- Add coverage for the merge behavior so existing filesystem values are
not overwritten and future `ThreadItem` fields require an explicit
decision

## Testing
- `just fmt` from `codex-rs`
- `git diff --check -- codex-rs/rollout/src/recorder.rs
codex-rs/rollout/src/recorder_tests.rs`
- Attempted `cargo test -p codex-rollout thread_item_metadata` from
`codex-rs`; blocked in dependency fetch/setup after updating crates.io
and git submodules `https://github.com/livekit/protocol` and
`https://chromium.googlesource.com/libyuv/libyuv`, so the focused tests
did not run

joeytrasatti-openai · 2026-04-22 14:59:20 -07:00

ee70b365ab

exec-server: expose arg0 alias root to fs sandbox (#19016 )

## Why

The post-merge `rust-ci-full` run for #18999 still failed the Ubuntu
remote `suite::remote_env` sandboxed filesystem tests. That run checked
out merge commit `ddde50c611e4800cb805f243ed3c50bbafe7d011`, so the arg0
guard lifetime fix was present.

The Docker-backed failure had two remaining pieces:

- The sandboxed filesystem helper needs to execute Codex through the
`codex-linux-sandbox` arg0 alias path. The helper sandbox was only
granting read access to the real Codex executable parent, so the alias
parent also has to be visible inside the helper sandbox.
- The remote-env tests were building sandbox contexts with
`FileSystemSandboxContext::new()`, which captures the local test runner
cwd. In the Docker remote exec-server, that host checkout path does not
exist, so spawning the filesystem helper failed with `No such file or
directory` before the helper could process the request.

## What Changed

- Track all helper runtime read roots instead of a single root.
- Add both the real Codex executable parent and the
`codex-linux-sandbox` alias parent to sandbox readable roots.
- Avoid sending an unused local cwd in remote filesystem sandbox
contexts when the permission profile has no cwd-dependent entries.
- Build the Docker remote-env test sandbox contexts with a cwd path that
exists inside the container.
- Add unit coverage for the alias-parent root and remote sandbox cwd
handling.

## Verification

- `cargo test -p codex-exec-server`
- `cargo test -p codex-core
remote_test_env_sandboxed_read_allows_readable_root`
- `just fix -p codex-exec-server`
- `just fix -p codex-core`

Michael Bolin · 2026-04-22 21:34:22 +00:00

d3dd0d759b

Fix MCP permission policy sync (#19033 )

###### Why/Context/Summary

Repro: start a session outside Full Access, switch permissions to Full
Access, then submit a new turn that triggers MCP/CUA permission
handling.

The turn used the live Full Access `SessionConfiguration`, but the MCP
coordinator was still synced from the stale `original_config_do_not_use`
/ per-turn config copy. That left the coordinator with an old sandbox
policy, so empty MCP permission elicitations could be denied instead of
auto-accepted.

Fix: update/rebuild the MCP connection manager from the live
turn/session approval and sandbox policy fields.

###### Test plan

```sh
just fmt
cargo test -p codex-core --lib
cargo test -p codex-core --lib mcp_tool_call::tests
```

Leo Shimonaka · 2026-04-22 14:30:29 -07:00

16eeeb534a

feat: add guardian network approval trigger context (#18197 )

## Summary

Give guardian network-access reviews the command context that triggered
a managed-network approval. The prompt JSON now includes the originating
tool call id, tool name, command argv, cwd, sandbox permissions,
additional permissions, justification, and tty state when a single
active tool call can be attributed.

The implementation keeps the trigger shape canonical by serializing
`GuardianNetworkAccessTrigger` directly and lets each runtime build that
trigger from its `ToolCtx`. Non-guardian approval prompts avoid cloning
the full trigger payload.

## UX changes

Guardian network-access reviews now include a `trigger` object that
explains what command caused the network approval. Instead of seeing
only the requested host, the guardian reviewer can also see the
originating tool call, argv, working directory, sandbox mode,
justification, and tty state.

Example payload the guardian reviewer can see:

```json
{
  "tool": "network_access",
  "target": "https://api.github.com:443",
  "host": "api.github.com",
  "protocol": "https",
  "port": 443,
  "trigger": {
    "callId": "call_abc123",
    "toolName": "shell",
    "command": ["gh", "api", "/repos/openai/codex/pulls/18197"],
    "cwd": "/workspace/codex",
    "sandboxPermissions": "require_escalated",
    "justification": "Fetch PR metadata from GitHub.",
    "tty": false
  }
}
```

The network review itself remains scoped to the network decision:
`target_item_id` stays `null`. `trigger.callId` is attribution context
only, so clients can still distinguish network reviews from
item-targeted command reviews.

## Verification

- Added coverage for serializing network trigger context in guardian
approval JSON.
- Added regression coverage that network guardian reviews do not reuse
`trigger.callId` as `target_item_id`.

---------

Co-authored-by: Codex <noreply@openai.com>

viyatb-oai · 2026-04-22 14:00:53 -07:00

2d73bac45f

[2/4] Implement executor HTTP request runner (#18582 )

### Why
Remote streamable HTTP MCP needs the executor to perform ordinary HTTP
requests on the executor side. This keeps network placement aligned with
`experimental_environment = "remote"` without adding MCP-specific
executor APIs.

### What
- Add an executor-side `http/request` runner backed by `reqwest`.
- Validate request method and URL scheme, preserving the transport
boundary at plain HTTP.
- Return buffered responses for ordinary calls and emit ordered
`http/request/bodyDelta` notifications for streaming responses.
- Register the request handler in the exec-server router.
- Document the runner entrypoint, conversion helpers, body-stream
bridge, notification sender, timeout behavior, and new integration-test
helpers.
- Add exec-server integration tests with the existing websocket harness
and a local TCP HTTP peer for buffered and streamed responses, with
comments spelling out what each test proves and its
setup/exercise/assert phases.

### Stack
1. #18581 protocol
2. #18582 runner
3. #18583 RMCP client
4. #18584 manager wiring and local/remote coverage

### Verification
- `just fmt`
- `cargo check -p codex-exec-server -p codex-rmcp-client --tests`
- `cargo check -p codex-core --test all` compile-only
- `git diff --check`
- Online full CI is running from the `full-ci` branch, including the
remote Rust test job.

Co-authored-by: Codex <noreply@openai.com>

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-22 20:36:34 +00:00

9360f267f3

app-server: accept permission profile overrides (#18279 )

## Why

`PermissionProfile` is becoming the canonical permissions shape shared
by core and app-server. After app-server responses expose the active
profile, clients need to be able to send that same shape back when
starting, resuming, forking, or overriding a turn instead of translating
through the legacy `sandbox`/`sandboxPolicy` shorthands.

This still needs to preserve the existing requirements/platform
enforcement model. A profile-shaped request can be downgraded or
rejected by constraints, but the server should keep the user's
elevated-access intent for project trust decisions. Turn-level profile
overrides also need to retain existing read protections, including
deny-read entries and bounded glob-scan metadata, so a permission
override cannot accidentally drop configured protections such as
`**/*.env = deny`.

## What changed

- Adds optional `permissionProfile` request fields to `thread/start`,
`thread/resume`, `thread/fork`, and `turn/start`.
- Rejects ambiguous requests that specify both `permissionProfile` and
the legacy `sandbox`/`sandboxPolicy` fields, including running-thread
resume requests.
- Converts profile-shaped overrides into core runtime filesystem/network
permissions while continuing to derive the constrained legacy sandbox
projection used by existing execution paths.
- Preserves project-trust intent for profile overrides that are
equivalent to workspace-write or full-access sandbox requests.
- Preserves existing deny-read entries and `globScanMaxDepth` when
applying turn-level `permissionProfile` overrides.
- Updates app-server docs plus generated JSON/TypeScript schema fixtures
and regression coverage.

## Verification

- `cargo test -p codex-app-server-protocol schema_fixtures`
- `cargo test -p codex-core
session_configuration_apply_permission_profile_preserves_existing_deny_read_entries`







---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18279).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* #18280
* __->__ #18279

Michael Bolin · 2026-04-22 13:34:33 -07:00

18a26d7bbc

feat(auto-review) short-circuit (#18890 )

## Summary
Short circuit the convo if auto-review hits too many denials

## Testing
- [x] Added unit tests

---------

Co-authored-by: Codex <noreply@openai.com>

Dylan Hurd · 2026-04-22 20:34:15 +00:00

ed4def8286

feat: Fairly trim skill descriptions within context budget (#18925 )

Preserve skill name/path entries whenever possible and trim descriptions
first, using round-robin character allocation so short descriptions do
not waste budget.

xl-openai · 2026-04-22 12:33:29 -07:00

b77791c228

arg0: keep dispatch aliases alive during async main (#18999 )

## Why

The Ubuntu GNU remote Cargo run has been regularly failing sandboxed
`suite::remote_env` filesystem tests with `No such file or directory`,
while the same cases pass under Bazel. The Cargo remote-env setup starts
`target/debug/codex exec-server` inside Docker via
`scripts/test-remote-env.sh`. That CLI builds `codex-linux-sandbox` and
other arg0 helper aliases in a temporary directory, then passes those
alias paths into the exec-server runtime.

`arg0_dispatch_or_else` constructed `Arg0DispatchPaths` from that
temporary alias guard, but then awaited the async CLI entry point
without otherwise keeping the guard live. That allowed the guard to be
dropped while the exec-server was still running, removing the helper
alias directory. Later sandboxed filesystem calls tried to spawn the
now-deleted `codex-linux-sandbox` path and surfaced as `ENOENT`.

The relevant distinction I found is that `core/tests/common` stores the
result of `arg0_dispatch()` in a process-lifetime
`OnceLock<Option<Arg0PathEntryGuard>>` for test binaries. The Cargo
remote-env setup exercises a real `codex exec-server` process instead,
so it depends on the normal CLI lifetime behavior fixed here.

## What Changed

- Keep the arg0 tempdir guard alive until `main_fn(paths).await`
completes.
- Keep the helper on the real `arg0_dispatch()` shape, where alias setup
can fail and return `None` in production.
- Add a regression test that uses an explicit guard, yields once, and
verifies the generated helper alias path still exists while the async
entry point is running.

## Verification

- `cargo test -p codex-arg0`
- `just argument-comment-lint -p codex-arg0`
- `just fix -p codex-arg0`

Michael Bolin · 2026-04-22 11:06:34 -07:00

ddde50c611

Add plumbing to approve stored Auto-Review denials (#18955 )

## Summary

This adds the structural plumbing needed for an app-server client to
approve a previously denied Guardian review and carry that approval
context into the next model turn.

This PR does not add the actual `/auto-review-denials` tool 

## What Changed

- Added app-server v2 RPC `thread/approveGuardianDeniedAction`.
- Added generated JSON schema and TypeScript fixtures for
`ThreadApproveGuardianDeniedAction*`.
- Added core `Op::ApproveGuardianDeniedAction`.
- Added a core handler that validates the event is a denied Guardian
assessment and injects a developer message containing the stored denial
event JSON.
- Queues the approval context for the next turn if there is no active
turn yet.
- Added the TUI app-server bridge so `Op::ApproveGuardianDeniedAction {
event }` is routed to the app-server request.

## What This Does Not Do

- Does not add `/auto-review-denials`.
- Does not add chat widget recent-denial state.
- Does not add popup/list UI.
- Does not add a product-facing denial lookup/store.
- Does not change where Guardian denials are originally emitted or
persisted.

## Verification

- `cargo test -p codex-tui thread_approve_guardian_denied_action`

Won Park · 2026-04-22 10:38:19 -07:00

11e5af53c4

feat(auto-review) policy config (#18959 )

## Summary
Allow users to customize their own auto-review policy config.

## Testing
- [x] added config_tests

Dylan Hurd · 2026-04-22 10:33:02 -07:00

78593d72ea

[rollout_trace] Record core session rollout traces (#18877 )

## Summary

Wires rollout trace recording into `codex-core` session and turn
execution. This records the core model request/response, compaction, and
session lifecycle boundaries needed for replay without yet tracing every
nested runtime/tool boundary.

## Stack

This is PR 2/5 in the rollout trace stack.

- [#18876](https://github.com/openai/codex/pull/18876): Add rollout
trace crate
- [#18877](https://github.com/openai/codex/pull/18877): Record core
session rollout traces
- [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
code-mode boundaries
- [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
and multi-agent edges
- [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
reduction command

## Review Notes

This layer is the first live integration point. The important review
question is whether trace recording is isolated from normal session
behavior: trace failures should not become user-visible execution
failures, and recording should preserve the existing turn/session
lifecycle semantics.

The PR depends on the reducer/data model from the first stack entry and
only introduces the core recorder surface that later PRs use for richer
runtime and relationship events.

cassirer-openai · 2026-04-22 17:00:48 +00:00

f67383bcba

TUI: Keep remote app-server events draining (#18932 )

Addresses #18860

Problem: Remote app-server clients could stop draining websocket events
when their bounded local event channel filled, leaving clients stuck on
stale in-progress turns after a disconnect.

Solution: Use an unbounded local event channel for the remote client so
the websocket reader can keep forwarding disconnect and progress events
instead of blocking or dropping them.

Why this is reasonable: This does not make the remote websocket itself
unbounded. The changed queue lives inside the remote client, between the
task that reads the remote websocket and the API consumer in the same
client process. Once an event has been received from the remote server,
preserving it is preferable to blocking websocket reads or dropping
disconnect/lifecycle events; network-level backpressure still happens at
the websocket boundary if the remote side outpaces the client.

Eric Traut · 2026-04-22 09:29:34 -07:00

79ea577156

Stage publishable Python runtime wheels (#18865 )

This is PR 2 of the Python SDK PyPI publishing split. [PR
1](https://github.com/openai/codex/pull/18862) refreshed the generated
SDK bindings; this PR makes the runtime package itself publishable, and
PR 3 will wire the SDK package/version pinning to this runtime package.

## Summary
- Rename the runtime distribution to `openai-codex-cli-bin` while
keeping the import package as `codex_cli_bin`.
- Make the runtime package wheel-only and build `py3-none-<platform>`
wheels instead of interpreter-specific wheels.
- Add `stage-runtime --codex-version` and `--platform-tag` so release
staging can produce the platform wheel matrix from Codex release tags.
- Add focused artifact workflow tests for version normalization,
platform tag injection, and runtime wheel metadata.

## Why Rename
There is already an unofficial PyPI package,
[`codex-bin`](https://pypi.org/project/codex-bin/), distributing OpenAI
Codex binaries. Publishing the official SDK runtime dependency as
`openai-codex-cli-bin` makes the ownership clear, avoids confusing the
SDK-pinned runtime wheel with that unowned wrapper, and keeps the import
package unchanged as `codex_cli_bin`.

## Tests
- `uv run --extra dev pytest
tests/test_artifact_workflow_and_binaries.py` -> 21 passed
- `uv run --extra dev python scripts/update_sdk_artifacts.py
stage-runtime /tmp/codex-python-pr2-rebased/runtime-stage
/tmp/codex-python-pr2-rebased/codex --codex-version
rust-v0.116.0-alpha.1 --platform-tag macosx_11_0_arm64`
- `uv run --with build --extra dev python -m build --wheel
/tmp/codex-python-pr2-rebased/runtime-stage`
- `uv run --with twine --extra dev twine check
/tmp/codex-python-pr2-rebased/runtime-stage/dist/openai_codex_cli_bin-0.116.0a1-py3-none-macosx_11_0_arm64.whl`

## Note
- Full `uv run --extra dev pytest` currently fails because regenerating
from schemas already on `main` adds new DeviceKey Python types. I left
that generated catch-up out of this runtime-only PR.

Steve Coffey · 2026-04-22 08:14:48 -07:00

0127cef5db

[codex] Update imagegen system skill (#18852 )

## Summary

This updates the embedded `imagegen` system skill in `codex-rs/skills`
with the ImageGen 2 skill changes from `openai/skills-internal#87`.

The bundled skill now keeps normal image generation/editing on the
built-in `image_gen` path, updates the CLI fallback defaults to
`gpt-image-2`, and routes explicit transparent-output requests through
`gpt-image-1.5` with clear guidance that `gpt-image-2` does not support
transparent backgrounds.

## Details

- Update `SKILL.md` routing guidance for built-in vs CLI fallback
behavior.
- Update CLI/API references for `gpt-image-2` size constraints, quality
options, near-4K sizes, and unsupported options.
- Update `scripts/image_gen.py` defaults and validation:
  - default model `gpt-image-2`
  - default size `auto`
  - default quality `medium`
  - reject transparent backgrounds on `gpt-image-2`
  - reject `input_fidelity` on `gpt-image-2`
- validate flexible `gpt-image-2` sizes and suggest `3824x2160` /
`2160x3824` for near-4K requests
- Update prompt/reference docs with the new model and routing guidance.

## Validation

- `cargo test -p codex-skills`
- `git diff --check`
- Manual CLI dry-runs for:
  - default `gpt-image-2` payload
  - `3824x2160` near-4K size acceptance
  - `3840x2160` rejection with near-4K guidance
  - transparent background rejection on `gpt-image-2`
  - transparent background acceptance on `gpt-image-1.5`
  - `input_fidelity` rejection on `gpt-image-2`

Bazel target check was not run locally because `bazel` is not installed
in this environment.

Vaibhav Srivastav · 2026-04-22 15:08:10 +00:00

0ebe69a8c3

chore: prep memories for AB (#18973 )

jif-oai · 2026-04-22 11:46:15 +01:00

65420737e8

fix: cargo deny (#18971 )

jif-oai · 2026-04-22 11:46:11 +01:00

ddf65c9647

5722 Commits