codex

[codex] Preserve logical paths during AGENTS.md discovery (#26465)

## Intent

Follow up on #26205 by avoiding unnecessary filesystem canonicalization
during `AGENTS.md` discovery. The configured working directory is
already absolute, and canonicalization incorrectly switches symlinked
workspaces from their logical parent hierarchy to the target's
hierarchy.

## User-facing behavior

For a symlinked working directory such as:

```text
test-root/
|-- logical-repo/
|   |-- AGENTS.md              ("logical parent doc")
|   `-- workspace ------------> physical-repo/workspace/
`-- physical-repo/
    |-- AGENTS.md              ("physical parent doc")
    `-- workspace/
        `-- AGENTS.md          ("workspace doc")
```

Before this change, Codex canonicalized `logical-repo/workspace` to
`physical-repo/workspace` before discovery. It therefore loaded
`physical-repo/AGENTS.md` and `physical-repo/workspace/AGENTS.md`,
ignoring the instructions from the repository through which the user
entered the workspace.

After this change, ancestor discovery walks the configured logical path,
so Codex loads `logical-repo/AGENTS.md`. Opening
`logical-repo/workspace/AGENTS.md` still follows the symlink through the
host filesystem, so the workspace document is also loaded.
`physical-repo/AGENTS.md` is not loaded.

## Implementation

Use the logical absolute working directory when discovering project
instructions and reporting instruction sources. Filesystem reads still
follow the working-directory symlink, so an `AGENTS.md` in the target
workspace continues to load while ancestor discovery uses the symlink's
parents.
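
A minimal sketch of the logical-path walk, assuming a hypothetical helper named `candidate_agents_md_paths` (not the actual function in this change). It walks all the way to the filesystem root for simplicity; the real discovery may stop earlier, for example at a project root.

```rust
use std::path::{Path, PathBuf};

/// Lists candidate `AGENTS.md` paths from the outermost ancestor down to `cwd`.
///
/// `cwd` is assumed to be absolute already. It is deliberately NOT passed
/// through `std::fs::canonicalize`, so a symlinked workspace keeps the
/// parents of the symlink itself rather than the parents of its target.
pub fn candidate_agents_md_paths(cwd: &Path) -> Vec<PathBuf> {
    let mut candidates: Vec<PathBuf> = cwd
        .ancestors() // yields cwd, then its parent, and so on up to the root
        .map(|dir| dir.join("AGENTS.md"))
        .collect();
    candidates.reverse(); // outermost first, workspace last
    candidates
}
```

Opening the last candidate still follows the `workspace` symlink, because the operating system resolves it at read time. None of the candidates is under `physical-repo/`.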

## Validation

Added integration coverage proving that discovery loads the logical
parent's instructions and the target workspace's instructions, but not
the target parent's instructions.

Adam Perry @ OpenAI · 2026-06-04 15:08:52 -07:00

59ca34206b

Use Winget release environment secret (#26466)

## Why
`WINGET_PUBLISH_PAT` now lives as a GitHub environment secret under
`mainline-release-winget`. The WinGet release job needs to enter that
environment so `secrets.WINGET_PUBLISH_PAT` resolves during
stable/mainline Rust releases.

## What Changed
- Attach the `winget` job in `.github/workflows/rust-release.yml` to the
`mainline-release-winget` environment.
- Set `deployment: false` so the job can read environment secrets
without creating GitHub deployment records.

## Operational Note
The `mainline-release-winget` environment must allow `rust-v*.*.*` tag
refs before this can run on release tags. The live environment currently
has a custom policy named `rust-v*.*.*` with type `branch`; add the
corresponding `tag` policy before relying on this path for a release.

## Validation
- `git diff --check origin/main...HEAD --
.github/workflows/rust-release.yml`
- `ruby -e 'require "yaml"; ARGV.each { |f| YAML.load_file(f); puts
"yaml ok: #{f}" }' .github/workflows/rust-release.yml`

Shijie Rao · 2026-06-04 14:38:11 -07:00

37c8aefa14

[codex] Use model-advertised reasoning effort order (#26446)

## Summary
- preserve the model catalog order for app-server
`supportedReasoningEfforts` and document that client contract
- render TUI reasoning choices in the advertised order
- step reasoning shortcuts by adjacent list position instead of deriving
order from known effort names
- anchor unsupported configured values to the advertised default, or the
first option when needed
- remove canonical effort ordering helpers and the unused upgrade effort
mapping
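
The stepping rule above can be sketched as follows. `step_effort` and its signature are hypothetical, and the sketch assumes that stepping from an unsupported value lands on the anchor itself rather than one position past it.

```rust
/// Returns the effort reached by stepping one position through the
/// model-advertised list, or `None` when the list is empty or the step would
/// run past either end.
pub fn step_effort(
    advertised: &[&str],
    default: Option<&str>,
    current: &str,
    forward: bool,
) -> Option<String> {
    let position_of = |effort: &str| advertised.iter().position(|e| *e == effort);
    let Some(index) = position_of(current) else {
        // Unsupported value: anchor to the advertised default, or to the first option.
        let anchor = default.and_then(position_of).unwrap_or(0);
        return advertised.get(anchor).map(|e| e.to_string());
    };
    // Order comes only from list position, never from known effort names.
    let next = if forward {
        index.checked_add(1)?
    } else {
        index.checked_sub(1)?
    };
    advertised.get(next).map(|e| e.to_string())
}
```

Because only list positions are compared, a model that advertises an unfamiliar effort name between two familiar ones is stepped through in the order it was advertised.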

## Validation
- `just fmt`
- Local tests and compilation were not run per request; relying on CI.

Stacked on #26444.

Ahmed Ibrahim · 2026-06-04 14:01:14 -07:00

f6e529656f

[codex] Support model-defined reasoning efforts (#26444)

## Summary
- accept non-empty model-defined reasoning effort values while
preserving built-in effort behavior
- propagate the non-Copy effort type through core, app-server, TUI,
telemetry, and persistence call sites
- preserve string wire encoding and expose an open-string schema for
clients
- update model selection and shortcut behavior for model-advertised
effort values

## Root cause
`ReasoningEffort` gained a string-backed custom variant, so it could no
longer implement `Copy` or rely on derived closed-enum serialization.
Existing consumers still moved effort values from shared references and
assumed a fixed built-in value set.
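
A simplified illustration of why the type stops being `Copy`. The variant set and method names here are assumptions for the sketch, not the real `ReasoningEffort` definition.

```rust
/// Hypothetical stand-in for an effort type with a string-backed custom variant.
/// The `String` payload is what rules out `Copy`; `Clone` is still available.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ReasoningEffort {
    Low,
    Medium,
    High,
    Custom(String),
}

impl ReasoningEffort {
    /// Parses a wire string, mapping built-in names to built-in variants and
    /// rejecting empty values.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "" => None,
            "low" => Some(Self::Low),
            "medium" => Some(Self::Medium),
            "high" => Some(Self::High),
            other => Some(Self::Custom(other.to_string())),
        }
    }

    /// Returns the string wire encoding.
    pub fn as_str(&self) -> &str {
        match self {
            Self::Low => "low",
            Self::Medium => "medium",
            Self::High => "high",
            Self::Custom(value) => value.as_str(),
        }
    }
}

/// A call site that only holds a shared reference must now clone explicitly;
/// with a `Copy` type the value could simply be read out of the reference.
pub fn effort_from_shared(configured: &Option<ReasoningEffort>) -> Option<ReasoningEffort> {
    configured.clone()
}
```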

## Validation
- `just fmt`
- Local tests and compilation were not run per request; relying on CI.

Ahmed Ibrahim · 2026-06-04 13:36:24 -07:00

8ac304c299

Cleanup experimentalFeature/enablement/set (#26312)

## Why

`experimentalFeature/enablement/set` still allowed several keys that no
longer need to be managed through this API. Keeping those keys also
preserved corresponding special-case logic, including refreshing the
apps list when the `apps` key was enabled.

The endpoint also rejected an entire request when any key was invalid or
unsupported. That makes clients brittle when they send a mix of current
and stale keys, even when the valid entries can still be applied safely.

## What changed

- remove the feature keys that no longer need to be supported by
`experimentalFeature/enablement/set`
- remove the corresponding apps-list refresh path and its auth/config
plumbing
- ignore and warn on invalid or unsupported keys while still applying
valid keys from the same request
- update the app-server documentation and integration coverage for the
reduced key set and partial-acceptance behavior
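
A rough sketch of the partial-acceptance behavior. The allow-list contents, `EnablementOutcome`, and `apply_enablement` are placeholders; this log does not list the real supported keys.

```rust
use std::collections::BTreeMap;

/// Hypothetical allow-list; the real supported keys are not listed in this log.
const SUPPORTED_KEYS: &[&str] = &["feature_a", "feature_b"];

/// Result of one request: the keys that were applied plus one warning per
/// ignored key.
pub struct EnablementOutcome {
    pub applied: BTreeMap<String, bool>,
    pub warnings: Vec<String>,
}

/// Applies every supported key and warns about the rest instead of rejecting
/// the whole request.
pub fn apply_enablement(requested: &BTreeMap<String, bool>) -> EnablementOutcome {
    let mut outcome = EnablementOutcome {
        applied: BTreeMap::new(),
        warnings: Vec::new(),
    };
    for (key, enabled) in requested {
        if SUPPORTED_KEYS.contains(&key.as_str()) {
            outcome.applied.insert(key.clone(), *enabled);
        } else {
            outcome
                .warnings
                .push(format!("ignoring unsupported feature key `{key}`"));
        }
    }
    outcome
}
```

A client that sends one current key and one stale key gets the current key applied and a single warning for the stale one.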

## Test plan

- `just test -p codex-app-server experimental_feature_enablement_set` (6
passed)
- `just test -p codex-app-server` exercised the changed tests
successfully; unrelated sandbox-dependent and watcher/timing tests
failed locally

Matthew Zeng · 2026-06-04 13:35:31 -07:00

4a70e0ac1b

Remove response.processed websocket request (#26447)

## Why

The Responses websocket client no longer needs to send a follow-up
`response.processed` request after a turn response has already been
recorded. Keeping that extra acknowledgement path adds feature-gated
control flow and a second websocket request shape that no longer carries
useful behavior.

## What Changed

- Removed the `response.processed` websocket request type and sender.
- Removed the `responses_websocket_response_processed` feature flag and
schema entry.
- Removed turn and remote-compaction plumbing that only tracked response
IDs to send the acknowledgement.
- Removed tests that existed solely to cover the deleted feature path.

## Validation

- `just fix -p codex-core -p codex-api -p codex-features`

pakrym-oai · 2026-06-04 13:15:50 -07:00

d312a53e2a

build: use ThinLTO for release binaries (#23710)

## Why

Fat LTO makes release builds substantially slower without providing
enough measured runtime benefit to justify the release CI long pole. The
build-profile investigation found that keeping Cargo's default release
`opt-level=3` and switching from fat LTO to ThinLTO (`3/thin/1`) reduced
a clean `codex-cli` release build from 2073.893 seconds to 1243.172
seconds, a 40.06% improvement.

The resulting binary increased from 196.7 MiB to 211.8 MiB (+7.63%).
Measured runtime changes were small: the worst image workload median was
+0.86% and app-server startup was +0.31% relative to fat LTO. ThinLTO
retains cross-crate optimization while avoiding most of the fat-LTO
build cost.

This deliberately avoids global size optimization: final-executable
testing showed a substantial regression on the image request path, which
is expected to become more important as image usage grows.

## What changed

- Set the workspace release profile to `lto = "thin"`, retaining Cargo's
default release `opt-level=3`.
- Remove release and CI workflow-specific LTO overrides so
release-profile builds consistently use the workspace setting.
- Remove the now-unused Windows release workflow input and related
diagnostic output.

## Validation

- Confirmed the release profile parses with `cargo metadata --no-deps
--format-version 1`.
- CI validates release builds across the supported target matrix.

Adam Perry @ OpenAI · 2026-06-04 20:07:53 +00:00

f97d5c3275

[codex] Fix Windows sandbox build script lint (#26445)

## Why

The Windows ARM64 Cargo clippy job on `main` is failing because
workspace lints deny `clippy::expect_used`, and the
`codex-windows-sandbox` build script used `expect()` while reading
`CARGO_MANIFEST_DIR`.

## What changed

`codex-rs/windows-sandbox-rs/build.rs` now returns `Result<(), String>`
from `main()` and converts a missing `CARGO_MANIFEST_DIR` into an
explicit build-script error. The non-Windows early return and Windows
linker argument behavior are unchanged.
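
For illustration, the `expect()`-free pattern might look like the sketch below. The helper names and the error wording are assumptions, not the actual contents of `build.rs`.

```rust
use std::env::VarError;
use std::path::PathBuf;

/// Converts an environment lookup into a descriptive error instead of
/// panicking, which keeps `clippy::expect_used` satisfied.
pub fn require_env(name: &str, value: Result<String, VarError>) -> Result<String, String> {
    value.map_err(|err| format!("{name} is not available to the build script: {err}"))
}

/// Reads `CARGO_MANIFEST_DIR`. A `fn main() -> Result<(), String>` can
/// propagate the error with `?`, and Cargo then reports a failed build script.
pub fn manifest_dir() -> Result<PathBuf, String> {
    require_env("CARGO_MANIFEST_DIR", std::env::var("CARGO_MANIFEST_DIR")).map(PathBuf::from)
}
```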

## Verification

- `just clippy -p codex-windows-sandbox -- -D warnings`
- `just test -p codex-windows-sandbox`

pakrym-oai · 2026-06-04 13:03:47 -07:00

555f8caeff

Route AGENTS.md loading through environment filesystems (#26205)

## Why

Workspace-specific `AGENTS.md` loading needs to use the selected
environment filesystem so remote workspaces and child agents read
instructions from their actual environment instead of the host
filesystem. The app-server should report the same instruction sources
the initialized thread actually loaded, rather than independently
rescanning configuration and filesystem state.

## What changed

- Introduce `LoadedAgentsMd` to retain ordered user, project, and
internal instructions with their provenance.
- Load and canonicalize workspace `AGENTS.md` paths through the primary
`EnvironmentManager` environment, then render the loaded instructions
when constructing turn context.
- Expose cached loaded instruction sources from initialized threads and
use them for app-server start, resume, and fork responses.
- Preserve global `CODEX_HOME` loading and separator behavior while
excluding empty project files that did not supply model-visible
instructions.
- Add integration coverage for CLI injection, selected-environment
provenance and rendering, empty environment selection, and cached
sources on loaded-thread resume.

## Validation

- `just test -p codex-core agents_md`
- `just test -p codex-core
selected_environment_sources_match_model_visible_instructions`
- `just test -p codex-exec agents_md`
- `just test -p codex-app-server instruction_sources`
- `just test -p codex-app-server --status-level fail`

Adam Perry @ OpenAI · 2026-06-04 12:43:07 -07:00

e64b469bbc

Use Azure artifact signing environment secrets (#25945)

## Why
Windows release signing should read Azure signing credentials from the
`azure-artifact-signing` environment instead of the old repo-level
`AZURE_TRUSTED_SIGNING_*` names. The smoke runs confirmed the
environment secrets resolve with the new `AZURE_ARTIFACT_SIGNING_*`
names once the Windows signing job is attached to that environment.

## What Changed
- Put the real Windows signing job in the `azure-artifact-signing`
environment.
- Switch the Windows signing action inputs from
`AZURE_TRUSTED_SIGNING_*` to `AZURE_ARTIFACT_SIGNING_*`.
- Drop the obsolete `workflow_call.secrets` declarations for the old
repo-level secret names; the caller continues to use `secrets: inherit`.
- Remove the temporary branch-trigger and Windows-only smoke-test
workflow changes before finalizing this PR.

## Validation
- `git diff --check -- .github/workflows/rust-release.yml
.github/workflows/rust-release-windows.yml`
- `ruby -e 'require "yaml"; ARGV.each { |f| YAML.load_file(f); puts
"yaml ok: #{f}" }' .github/workflows/rust-release.yml
.github/workflows/rust-release-windows.yml`

Shijie Rao · 2026-06-04 12:24:26 -07:00

c3fcb0e745

core: allow excluding tool namespaces from code mode (#26320)

## Why

Research and training setups need to control which tool namespaces
appear inside code mode's nested `tools` surface without disabling those
tools entirely. This makes it possible to train against a deliberately
reduced nested-tool setup while preserving the normal direct and
deferred tool paths.

## What

- Extend `features.code_mode` to accept structured configuration while
preserving the existing boolean syntax.
- Add an exact `excluded_tool_namespaces` list under
`[features.code_mode]`:

  ```toml
  [features.code_mode]
  enabled = true
  excluded_tool_namespaces = ["mcp__codex_apps", "multi_agent_v1"]
  ```

- Filter matching canonical `ToolName` namespaces when constructing code
mode's nested router and code-mode-specific direct tool descriptions.
- Keep excluded tools registered, directly exposed in mixed code mode,
and discoverable through top-level `tool_search` when otherwise
eligible.
- Derive deferred nested-tool guidance after namespace filtering so the
`exec` description does not advertise excluded-only deferred tools.
- Preserve the boolean/table representation when materializing config
locks and update the generated config schema.
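
The exact-match filter can be sketched roughly as follows. This `ToolName` is a simplified stand-in for the real type, and `nested_code_mode_tools` is a hypothetical name.

```rust
/// Hypothetical simplified tool name; `namespace` is `None` for plain tools.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolName {
    pub namespace: Option<String>,
    pub name: String,
}

/// Returns the tools visible in code mode's nested surface: everything whose
/// namespace is not listed exactly in `excluded`. The input slice is left
/// untouched, so excluded tools stay registered for the direct and deferred paths.
pub fn nested_code_mode_tools(all: &[ToolName], excluded: &[String]) -> Vec<ToolName> {
    all.iter()
        .filter(|tool| match &tool.namespace {
            Some(namespace) => !excluded.contains(namespace),
            None => true,
        })
        .cloned()
        .collect()
}
```

Matching is by whole namespace string, so excluding `mcp__codex_apps` does not affect a namespace that merely starts with that text.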

## Testing

- `just test -p codex-features`
- `just test -p codex-config`
- `just test -p codex-core load_config_resolves_code_mode_config`
- `just test -p codex-core
lock_contains_prompts_and_materializes_features`
- `just test -p codex-core
excluded_deferred_namespaces_do_not_enable_nested_tool_guidance`
- `just test -p codex-core
code_mode_excludes_configured_nested_tool_namespaces`
- `cargo check -p codex-thread-manager-sample`

sayan-oai · 2026-06-04 18:40:18 +00:00

8b1238856b

[codex-analytics] emit forked thread id on initialization (#26248)

## Why
- Thread initialization analytics do not identify the source thread for
forked threads.
- The session viewer needs this lineage to construct thread trees.
- Depends on openai/openai#987854. Do not release this change before
that backend schema change is deployed.

## What Changed
- Adds optional `forked_from_thread_id` to `codex_thread_initialized`.
- Populates it from the existing thread fork lineage for app-server and
in-process subagent initialization paths.
- Keeps it null for non-forked threads.

## Verification
- `just fmt`
- `just test -p codex-analytics`
- `just test -p codex-app-server
thread_fork_tracks_thread_initialized_analytics`

kbazzi · 2026-06-04 11:24:12 -07:00

9e41f8ddbe

external-agent-migration: avoid mixed MCP transport configs (#26435)

## Why

MCP migration could recursively merge an imported server into an
existing same-named Codex server. When one definition used stdio and the
other used HTTP, this produced an invalid mixed configuration containing
both `command` and `url`.

## What changed

- Merge MCP configuration at the server level instead of field by field.
- Preserve an existing same-named Codex MCP server unchanged.
- Report only MCP servers that would actually be added during detection.
- Add regression coverage for mixed command/HTTP source configurations.
- Use neutral fixture names and reserved `example.com` URLs.
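
A minimal sketch of server-level merging, with a hypothetical `McpTransport` enum standing in for the real configuration types. Each server is a single value, so a merge can never combine `command` from one definition with `url` from another.

```rust
use std::collections::BTreeMap;

/// Hypothetical transport model: a server is either stdio or HTTP, never both.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum McpTransport {
    Stdio { command: String },
    Http { url: String },
}

/// Adds imported servers whose names are not already taken and returns the
/// names that were actually added. Existing same-named servers are left
/// unchanged.
pub fn merge_mcp_servers(
    existing: &mut BTreeMap<String, McpTransport>,
    imported: BTreeMap<String, McpTransport>,
) -> Vec<String> {
    let mut added = Vec::new();
    for (name, server) in imported {
        if !existing.contains_key(&name) {
            existing.insert(name.clone(), server);
            added.push(name);
        }
    }
    added
}
```

The returned list doubles as the detection report: only servers that would be added appear in it.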

## Test plan

- `just test -p codex-app-server repo_mcp`
  - 5 tests passed.
- `just test -p codex-external-agent-migration
mcp_migration_prefers_command_transport_for_mixed_server_config`
  - 1 test passed.

stefanstokic-oai · 2026-06-04 14:16:03 -04:00

c8fdc74b42

app-server: support -c config overrides (#26436)

## Why

The standalone `codex-app-server` binary already routed a
`CliConfigOverrides` value into app-server startup, but its own clap
args did not expose the shared `-c/--config` option. That meant
`codex-app-server -c key=value` was rejected before the existing config
override path could run, unlike the main `codex` CLI.

## What Changed

- Flatten `CliConfigOverrides` into `AppServerArgs` in
`codex-rs/app-server/src/main.rs`.
- Pass parsed overrides to `run_main_with_transport_options` instead of
always using `CliConfigOverrides::default()`.
- Add a binary parser test covering both `-c` and `--config` for the
standalone app-server.

## Verification

- `just test -p codex-app-server
app_server_accepts_cli_config_overrides`

The broader `just test -p codex-app-server` run was also attempted. It
compiled and ran 812 tests, with 796 passing, but failed in this local
sandbox on unrelated `sandbox-exec: sandbox_apply: Operation not
permitted` command-exec/turn integration paths and a skills watcher
timeout.

Michael Bolin · 2026-06-04 18:05:54 +00:00

881cf191d7

Expose configured marketplace source in plugin list JSON (#26417)

## Summary
- Follow-up to #25330
- Add `marketplaceSource` to `codex plugin list --json` entries for
configured marketplaces
- Keep the existing per-plugin `source` field unchanged; this still
reports the local plugin source path
- Include only the configured marketplace `sourceType` and `source` from
`config.toml`
- Keep human-readable output unchanged
- Add CLI coverage for configured local and git marketplace sources

Example:

```json
{
  "source": {
    "source": "local",
    "path": "/path/to/.codex/.tmp/marketplaces/debug/plugins/sample"
  },
  "marketplaceSource": {
    "sourceType": "git",
    "source": "https://example.com/acme/agent-skills.git"
  }
}
```

## Validation
- `just fmt`
- `just fix -p codex-cli`
- `just test -p codex-cli plugin_list`

mpc-oai · 2026-06-04 12:20:32 -05:00

cdc1d592df

Bound external agent session detection work (#26291)

## Why

External agent migration detection parsed and hashed every JSONL session
file. For users with many large conversations, launching migration could
consume substantial CPU and disk resources.

Detection only needs the most recent sessions for the migration UI, so
full-content work should be bounded.

## What

- Use file modification metadata to select the 50 most recent eligible
sessions before parsing JSONL content.
- Skip unchanged imported sessions using metadata stored in the import
ledger.
- Preserve content hashing when metadata indicates a session may have
changed.
- Stream SHA-256 calculation through a 64 KiB buffer instead of loading
an entire session into memory.
- Continue detecting older sessions in subsequent batches after newer
sessions are imported.
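
The two bounding steps can be sketched with the standard library alone. The function names are hypothetical, and the digest is abstracted into an `update` callback because SHA-256 is not part of `std`; a real implementation would feed each chunk to a SHA-256 hasher.

```rust
use std::io::{self, Read};
use std::path::PathBuf;
use std::time::SystemTime;

pub const MAX_DETECTED_SESSIONS: usize = 50;
pub const HASH_BUFFER_BYTES: usize = 64 * 1024;

/// Returns up to `limit` paths, newest modification time first, using only
/// metadata that was collected without opening the files.
pub fn most_recent(mut sessions: Vec<(PathBuf, SystemTime)>, limit: usize) -> Vec<PathBuf> {
    sessions.sort_by(|a, b| b.1.cmp(&a.1));
    sessions
        .into_iter()
        .take(limit)
        .map(|(path, _)| path)
        .collect()
}

/// Feeds `reader` to `update` in chunks of at most 64 KiB and returns the
/// number of bytes seen, so memory use does not grow with the session size.
pub fn stream_chunks<R: Read>(mut reader: R, mut update: impl FnMut(&[u8])) -> io::Result<u64> {
    let mut buffer = vec![0u8; HASH_BUFFER_BYTES];
    let mut total = 0u64;
    loop {
        let read = match reader.read(&mut buffer) {
            Ok(0) => return Ok(total), // end of input
            Ok(read) => read,
            Err(err) if err.kind() == io::ErrorKind::Interrupted => continue,
            Err(err) => return Err(err),
        };
        update(&buffer[..read]);
        total += read as u64;
    }
}
```

Selecting by metadata first means the chunked read only runs for the sessions that survive the `MAX_DETECTED_SESSIONS` cut.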

## Validation

- `RUST_MIN_STACK=8388608 cargo nextest run --no-fail-fast -p
codex-external-agent-sessions`
  - 20 tests passed.
- Benchmarked release builds against 250 valid JSONL sessions totaling
501 MiB:
  - Median detection time decreased from 1,138.8 ms to 47.0 ms.
  - CPU instructions decreased by 95.8%.
  - Both versions returned the expected 50 sessions.

The benchmark used warm filesystem caches and measured the reduction in
parsing, hashing, and CPU work.

stefanstokic-oai · 2026-06-04 13:13:05 -04:00

cbf62f64cb

Add saved image path hint to standalone image generation (#25947)

## Why

Standalone image generation returns image bytes to the model, but the
model also needs the host artifact path to reference the generated file
in follow-up work.

## What changed

- Append the default saved-image path hint alongside the generated image
tool output.
- Reuse the existing core image-generation hint text.
- Pass the thread ID and Codex home directory needed to compute the
artifact path.
- Add app-server and extension coverage for the model-visible hint.

## Validation

- `just fmt`
- `just bazel-lock-check`
- `just test -p codex-app-server
standalone_image_generation_returns_saved_path_hint_to_model`

Won Park · 2026-06-04 09:39:20 -07:00

12e8764a9c

Simplify Codex CLI README (#26313)

## Summary

The codex-rs README was left over from before we moved the docs into the
developer site. Its contents were very much out of date, and we received
some bug reports about it.

Eric Traut · 2026-06-04 09:16:03 -07:00

68db0bb5ec

Load plugin hooks without other plugin capabilities (#26272)

## Summary

`hooks/list` only consumes plugin hook declarations, but previously
loaded every enabled plugin's skills, MCP configuration, apps, and
capability summary before discarding them.

In a local benchmark, this reduced `hooks/list` latency by over 100 ms
(e.g., from 594 ms to 467 ms on startup, and from 168 ms to 16 ms when
making a `hooks/list` call later in the same TUI session). This is on the
critical path to rendering the TUI, so even tens of milliseconds should
be eyed skeptically (IMO).

This change adds a hook-specific plugin loading path that preserves
plugin enablement, remote/local conflict resolution, deterministic
ordering, manifest resolution, and hook-loading warnings while skipping
unrelated capabilities. (I think there's room for a more general design
here that allows you to project the capabilities you need at load-time,
but that seems unnecessary right now.)

Charlie Marsh · 2026-06-04 11:21:40 -04:00

4ae7930f58

Reduce SQLite contention from OpenTelemetry SDK debug logs (#26396)

## Summary

- skip `opentelemetry_sdk` DEBUG and TRACE events before formatting or
queueing them for the SQLite log sink
- preserve INFO, WARN, and ERROR events from the SDK, along with TRACE
events from application targets
- add a persistence-level regression test for the target and level
policy
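
The target and level policy amounts to a small predicate, sketched below. `Level` and `should_persist` are illustrative names, and the matching rule for SDK submodule targets is an assumption.

```rust
/// Hypothetical log level enum mirroring the usual five tracing levels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

/// Returns `false` only for DEBUG and TRACE events emitted by the
/// OpenTelemetry SDK. Everything else, including TRACE events from
/// application targets, is still persisted.
pub fn should_persist(target: &str, level: Level) -> bool {
    let from_sdk = target == "opentelemetry_sdk" || target.starts_with("opentelemetry_sdk::");
    let low_level = matches!(level, Level::Trace | Level::Debug);
    !(from_sdk && low_level)
}
```

Running this check before formatting or queueing is what saves the work: a rejected event never reaches the SQLite sink.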

## Why

OpenTelemetry's batch log processor emits internal
`BatchLogProcessor.ExportingDueToTimer` meta-events every second per
Codex process. In measured high-fanout `logs_2.sqlite` databases,
low-level `opentelemetry_sdk` events accounted for over 30% of retained
rows (30-60% on the machines of people I asked to check).

Persisting this SDK bookkeeping across many processes adds substantial
write volume and contention without representing application activity.

## Validation

- `just test -p codex-state` (132/132 tests passed, plus bench smoke)
- `just fix -p codex-state`
- `just fmt`

Zanie Blue · 2026-06-04 10:12:08 -05:00

d81fcdf8ef

Optimize unbounded byte scans with memchr (#26265)

## Summary

This PR adds `memchr` for some low-hanging performance improvements
(namely, in MCP stdio, Ollama streaming, and full message-history
newline counts).

Codex produced the following release benchmarks:

| Operation | Before | After | Speedup |
| --- | ---: | ---: | ---: |
| MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x |
| Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x |
| Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x |

With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python
MCP server, completed `initialize`, requested `tools/list`, and
deserialized a 1 MiB tool description over newline-delimited stdio),
it's about 16x faster end-to-end:

| Branch | 50 calls | Per call |
| --- | ---: | ---: |
| `main` | 862.53 ms | 17.25 ms |
| this branch | 53.89 ms | 1.08 ms |

`memchr` is already in our dependency tree and extremely widely used for
this kind of optimized scanning.

Charlie Marsh · 2026-06-04 09:53:08 -04:00

7da4af622f

Bridge host-loaded skills into the skills extension (#26172)

## Why

The skills extension needs to become the path that exposes local host
skills without losing the behavior already owned by core skill loading.
Host skill discovery is not just `$CODEX_HOME/skills`: it also includes
config layers, bundled-skill settings, plugin roots, runtime extra
roots, and the filesystem for the selected primary environment.

Rather than making the extension reload host skills and risk drifting
from that authoritative load, this PR bridges the already-loaded
per-turn skills outcome into the extension. That lets the extension
advertise host skills and inject explicit `$skill` prompts while
preserving the same roots, disabled/hidden state, rendered paths, and
environment-backed file reads that the legacy path uses.

## What Changed

- Adds `HostLoadedSkills` in `core-skills` to wrap the turn's
`SkillLoadOutcome` and read `SKILL.md` through the filesystem that
loaded that skill.
- Stores `HostLoadedSkills` in turn extension data for normal turns and
review turns, so the skills extension can consume the loaded host
catalog without reloading it.
- Adds `HostSkillProvider` under `ext/skills/src/provider/host.rs`,
mapping host-loaded skill metadata into the skills-extension
catalog/read contract.
- Registers the host provider by default from
`codex_skills_extension::install()`.
- Preserves host skill metadata such as dependencies, disabled state,
hidden-from-prompt policy, and slash-normalized display paths.
- Passes host-loaded skills through `SkillListQuery` and
`SkillReadRequest` so explicit skill invocation reads only resources
from the loaded host catalog.
- Adds integration coverage for a real legacy
`$CODEX_HOME/skills/.../SKILL.md` skill being listed and injected
through the installed extension.

## Testing

- Added `installed_extension_loads_host_skills_from_legacy_roots` in
`ext/skills/tests/skills_extension.rs`.
- `just test -p codex-skills-extension`

jif · 2026-06-04 15:28:06 +02:00

d46a98d31a

Gate automatic idle turns in Plan mode (#26147)

## Why

Goal idle continuation is extension-triggered model-visible work, so it
should follow one core-owned rule for when automatic work may start. In
particular, it should not jump ahead of queued user/client work, start
while another task is active, or inject a continuation turn while the
thread is in Plan mode.

Keeping this policy in `try_start_turn_if_idle` avoids passing
`collaboration_mode` or review-specific state through
`ThreadLifecycleContributor::on_thread_idle`. Active `/review` is
covered by the same active-task gate because Review turns are not
steerable.

## What Changed

- Teach `Session::try_start_turn_if_idle` to reject automatic idle turns
in Plan mode, both before reserving an idle turn and after building the
turn context.
- Document `CodexThread::try_start_turn_if_idle` as the extension-facing
gate for automatic idle work, including Plan-mode and active Review-task
behavior.
- Add focused coverage for Plan-mode rejection and active Review-task
rejection without queuing synthetic input.
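
A rough sketch of the gate as a pure predicate. The types, field names, and check order are assumptions; the real `try_start_turn_if_idle` also reserves the turn and builds the turn context.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CollaborationMode {
    Default,
    Plan,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IdleTurnRejection {
    PlanMode,
    ActiveTask,
    QueuedInput,
}

/// Hypothetical snapshot of the session state that the gate inspects.
pub struct IdleState {
    pub mode: CollaborationMode,
    pub has_active_task: bool,
    pub queued_inputs: usize,
}

/// Decides whether automatic idle work may start. Per the description above,
/// the Plan-mode check runs both before reserving an idle turn and again
/// after the turn context is built; this predicate could serve both points.
pub fn check_idle_turn(state: &IdleState) -> Result<(), IdleTurnRejection> {
    if state.mode == CollaborationMode::Plan {
        return Err(IdleTurnRejection::PlanMode);
    }
    if state.has_active_task {
        // An active `/review` task is rejected here as well.
        return Err(IdleTurnRejection::ActiveTask);
    }
    if state.queued_inputs > 0 {
        return Err(IdleTurnRejection::QueuedInput);
    }
    Ok(())
}
```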

## Testing

- `just test -p codex-core try_start_turn_if_idle`

jif · 2026-06-04 14:44:45 +02:00

d297616d3e

chore: calm down (#26367)

Prompt update to address feedback

jif · 2026-06-04 12:46:02 +02:00

16d02ec77c

ci: sign macOS release artifacts with Azure Key Vault (#26252)

## Why

The public Codex release workflow needs to sign and notarize macOS
binaries and DMGs without placing the Developer ID private key in
GitHub. This moves the private-key operation behind the protected
`codesigning` environment and uses GitHub OIDC with Azure Key Vault
PKCS#11, while preserving the existing external `build_unsigned` /
`promote_signed` fallback.

## What changed

- Add a reusable AKV PKCS11 setup action that authenticates to Azure
with OIDC, downloads pinned signing tools, verifies their SHA-256
digests, and loads the public signing certificate from Key Vault.
- Replace the legacy macOS signing action with scripts that support
AKV-backed `rcodesign`, notarize signed binaries and DMGs, and staple
DMG notarization tickets.
- Restructure `rust-release.yml` so macOS builds produce unsigned
artifacts first, protected jobs perform signing and notarization, macOS
runners package and verify the results, and release publishing waits for
verified artifacts.
- Preserve the manual external-signing handoff flow and make manual-mode
conditions explicit.
- Move the Codex entitlements file alongside the signing scripts and
update CODEOWNERS for the new signing surfaces.

## Verification

- [Live protected signing workflow
run](https://github.com/openai/codex/actions/runs/26903610631) completed
successfully for both macOS architectures, including binary
signing/notarization, DMG signing/notarization, and final artifact
verification.
- Downloaded both signed DMGs and independently verified their checksums
and strict signatures.
- Confirmed `xcrun stapler validate` succeeds and Gatekeeper accepts
both DMGs as `Notarized Developer ID`.
- Mounted both DMGs and confirmed the contained `codex` and
`codex-responses-api-proxy` binaries have valid Developer ID signatures
for the expected architectures.

---------

Co-authored-by: shijie-openai <shijie.rao@openai.com>

Eric Burke · 2026-06-03 20:34:51 -07:00

ad2012d645

[codex-analytics] report compaction request token counts (#25946)

## Why

Compaction analytics need token counts that better represent the request
being compacted. The existing session snapshot can diverge from the
actual remote compaction request after output rewriting, and remote v2
can use server-side Responses usage when available.

## What changed

- Add an optional `active_context_tokens_before` override to
`CompactionAnalyticsAttempt::track(...)` for remote compaction when it
has a better before-token value than the begin-time session snapshot.
The local `/compact` path passes no override.
- For remote v1 `responses_compact`, subtract the estimated token delta
from pre-compaction output rewriting from the session snapshot, capped
by locally-added tokens since the last successful API response.
- For remote v2 `responses_compaction_v2`, use the same bounded
output-rewrite fallback as remote v1, then overwrite
`active_context_tokens_before` with server `token_usage.input_tokens`
from the `response.completed` event when present.
- Keep the existing v2 compaction-output validation while carrying the
completed response token usage through `collect_compaction_output`.
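
One reading of the arithmetic above, as a sketch. The function and parameter names are hypothetical, and "capped by locally-added tokens" is interpreted here as bounding the subtracted delta.

```rust
/// Computes the before-compaction token count for remote compaction.
///
/// - `session_snapshot`: begin-time session snapshot of active context tokens.
/// - `rewrite_delta`: estimated tokens removed by pre-compaction output rewriting.
/// - `locally_added`: tokens added locally since the last successful API response.
/// - `server_input_tokens`: `token_usage.input_tokens` from `response.completed`,
///   available only on the remote v2 path.
pub fn active_context_tokens_before(
    session_snapshot: u64,
    rewrite_delta: u64,
    locally_added: u64,
    server_input_tokens: Option<u64>,
) -> u64 {
    // Never subtract more than was added locally since the last response.
    let bounded_delta = rewrite_delta.min(locally_added);
    let fallback = session_snapshot.saturating_sub(bounded_delta);
    // Server-reported usage, when present, overrides the local estimate.
    server_input_tokens.unwrap_or(fallback)
}
```

Remote v1 would always pass `None` for `server_input_tokens`, so it ends up with the bounded fallback.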

## Verification

- `just fmt`
- `just test -p codex-core
collect_compaction_output_accepts_additional_output_items`
- `git diff --check`

rhan-oai · 2026-06-04 03:00:44 +00:00

c143a86de8

cli: add package path from install context (#26189)

## Why

Codex package installs include helper binaries in `codex-path`, such as
the bundled `rg`. Package-layout launches should add that directory
before user commands run, but standalone launches were missing it while
npm launches only worked because `codex.js` had its own legacy `PATH`
rewrite. That made npm and standalone package behavior diverge.

Shell snapshot restoration can also reset `PATH` after runtime setup.
Any package-owned `PATH` prepend has to be recorded as an explicit
runtime override so shells, unified exec, and user-shell commands keep
access to `codex-path` after a snapshot is sourced.

## Repro

Before this change, a curl-installed package could contain `rg` under
`codex-path` but still fail to put it on `PATH`:

```shell
mkdir /tmp/test-codex-curl
curl -fsSL https://chatgpt.com/codex/install.sh \
  | CODEX_HOME=/tmp/test-codex-curl CODEX_NON_INTERACTIVE=1 sh
/tmp/test-codex-curl/packages/standalone/current/bin/codex exec \
  --skip-git-repo-check 'print `which -a rg`'
find /tmp/test-codex-curl -name rg
```

The `which -a rg` output omitted the packaged helper even though `find`
showed it under
`/tmp/test-codex-curl/packages/standalone/releases/.../codex-path/rg`.

The npm install path behaved differently only because
`codex-cli/bin/codex.js` had legacy `PATH` rewriting:

```shell
mkdir /tmp/test-codex-npm
cd /tmp/test-codex-npm
npm install @openai/codex
./node_modules/.bin/codex exec --skip-git-repo-check 'print `which -a rg`'
```

That printed the npm package's `vendor/<target>/codex-path/rg` first.
This PR moves that behavior into Rust-side package launch setup so
curl/standalone and npm/bun launches agree without JS rewriting `PATH`.

## What Changed

- `codex-rs/arg0` now uses
`InstallContext::current().package_layout.path_dir` to prepend the
package helper directory before any threads are created.
- Package helper `PATH` setup is independent from the temporary arg0
alias setup, so `codex-path` is still added even if CODEX_HOME tempdir,
lock, or symlink setup fails.
- `codex-rs/install-context` detects the canonical package layout we
ship: `bin/`, `codex-resources/`, and `codex-path/` next to
`codex-package.json`.
- Shell, local unified exec, and user-shell runtimes now record package
`codex-path` prepends in `explicit_env_overrides`, matching the existing
zsh-fork behavior so shell snapshots cannot restore over the package
helper path.
- Remote unified exec requests do not receive the local app-server
package path overlay.
- `codex-cli/bin/codex.js` no longer computes or overrides `PATH`; it
only locates the native binary in the canonical package layout and
passes npm/bun management metadata.
- Added regression tests for `PATH` ordering, package layout detection,
and shell snapshot preservation of package path prepends.
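
The `PATH` prepend itself can be illustrated with `std::env::split_paths` and `std::env::join_paths`. `prepend_to_path` is a hypothetical helper, and dropping later duplicates of the directory is an extra choice in this sketch, not something the change is stated to do.

```rust
use std::env::{join_paths, split_paths, JoinPathsError};
use std::ffi::{OsStr, OsString};
use std::path::{Path, PathBuf};

/// Builds a `PATH` value with `dir` first, followed by the existing entries.
/// Later duplicates of `dir` are dropped so the helper directory appears once.
pub fn prepend_to_path(dir: &Path, existing: Option<&OsStr>) -> Result<OsString, JoinPathsError> {
    let rest: Vec<PathBuf> = existing
        .map(|value| split_paths(value).collect())
        .unwrap_or_default();
    let entries = std::iter::once(dir.to_path_buf())
        .chain(rest.into_iter().filter(|entry| entry.as_path() != dir));
    join_paths(entries)
}
```

The resulting value is what a runtime would need to record as an explicit override, so that sourcing a shell snapshot cannot restore a `PATH` without the helper directory.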

## Verification

- `node --check codex-cli/bin/codex.js`
- `just test -p codex-install-context -p codex-arg0`
- `just test -p codex-core
user_shell_snapshot_preserves_package_path_prepend`
- `just test -p codex-core tools::runtimes::tests`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fix -p codex-install-context -p codex-arg0 -p codex-core`

Michael Bolin · 2026-06-03 19:08:19 -07:00

6bcccb0ee6

feat(tui): add /app desktop handoff (#25638 )

Felipe Coury · 2026-06-03 20:30:15 -03:00

80b65e9945

fix(tui): add reasoning effort fallback shortcuts (#25623 )

Felipe Coury · 2026-06-03 20:28:31 -03:00

8285cd278b

log plugin MCP server names (#26002 )

## Summary
- emit the plugin capability summary's exact MCP server names in
`codex_plugin_used`

## Test
- `just test -p codex-analytics`
- `just test -p codex-core
explicit_plugin_mentions_track_plugin_used_analytics`
- `just fix -p codex-analytics`

Chris Dong · 2026-06-03 16:06:52 -07:00

4d4837c495

Use Windows setup marker as completion signal (#26074 )

# Why

When an organization requires the elevated Windows sandbox, Codex
launches an elevated helper to provision users, configure firewall and
ACL rules, and lock persistent sandbox directories.

We observed that closing the helper after setup started could leave the
machine partially initialized while the TUI still announced **Sandbox
ready**. Model-only turns continued to work, but the first shell command
retried setup and failed with Windows cancellation error `1223`.

This was not an enforcement bypass; command execution continued to fail
closed. The issue was a false readiness signal: `setup_marker.json` was
written during user provisioning, before the remaining setup stages had
completed.

# What

Treat `setup_marker.json` as the commit record for Windows sandbox
setup:

1. Before full or provisioning setup begins, remove the existing marker
and create the final marker path with a protected ACL.
2. Keep the marker empty and therefore invalid while setup is in
progress. Sandbox users cannot read, modify, or replace it.
3. Run every synchronous setup stage.
4. After setup succeeds, write the valid marker contents without
changing its ACL.
5. After the helper exits successfully, verify the existing readiness
check before enabling the sandbox.

If setup is canceled or fails, the marker remains invalid and Codex
reports setup as incomplete instead of announcing readiness.

Refresh-only and read-ACL-only helper runs continue to leave the marker
untouched. The setup version remains `5` to avoid forcing all existing
Windows users through elevated setup again.
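
The commit-record pattern can be sketched in Python as follows. This is
not the helper's implementation: the ACL handling is omitted, and the
marker's `version` field and the function names are assumptions made for
illustration.

```python
import json
from pathlib import Path
from typing import Callable, Iterable

SETUP_VERSION = 5  # the PR keeps the setup version at 5


def marker_is_valid(marker: Path) -> bool:
    """A missing, empty, or unparsable marker means setup did not complete."""
    try:
        data = json.loads(marker.read_text())
    except (OSError, ValueError):
        return False
    # The "version" key is a hypothetical marker field.
    return isinstance(data, dict) and data.get("version") == SETUP_VERSION


def run_setup(marker: Path, stages: Iterable[Callable[[], None]]) -> None:
    marker.unlink(missing_ok=True)  # 1. remove any stale marker
    marker.write_text("")           # 2. empty placeholder is invalid (ACL step omitted)
    for stage in stages:            # 3. an exception here leaves the marker invalid
        stage()
    # 4. commit: only now does the marker become valid
    marker.write_text(json.dumps({"version": SETUP_VERSION}))
```

Because the valid contents are written last, a canceled or failed run can
never leave a marker that passes the readiness check.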

# Verification

- Added coverage confirming sandbox users cannot read or modify the
setup marker after elevated setup.
- Added coverage confirming a successful helper exit without complete
setup artifacts is rejected.
- Ran `just test -p codex-windows-sandbox`.

Abhinav · 2026-06-03 15:33:34 -07:00

0ed2735d19

codex-pr-body: avoid confidential references (#26260 )

## Why

PR descriptions can be visible outside the context used to generate
them. In #23710, a generated description referenced an internal
document, showing that the skill needs an explicit guardrail against
exposing confidential context.

## What changed

- Updated the `codex-pr-body` guidance to prohibit confidential
references, including codenames and OpenAI-internal URLs.

Adam Perry @ OpenAI · 2026-06-03 15:29:57 -07:00

14272b21e9

Rewrite oversized tool outputs during remote compaction (#26251 )

## Why

When fitting history under the compaction limit, rewrite oversized tool
output items instead of removing them entirely. Removing them breaks
incrementality relative to the previous response.
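
A rough Python sketch of the rewrite-instead-of-remove idea, under
several assumptions: items are plain dicts, size is measured in
characters rather than tokens, and the placeholder text and largest-first
order are invented for illustration.

```python
from typing import Dict, List

PLACEHOLDER = "[output truncated during compaction]"  # hypothetical text


def fit_under_limit(items: List[Dict[str, str]], limit: int) -> List[Dict[str, str]]:
    """Shrink tool outputs, largest first, until the total fits; never drop items."""
    result = [dict(item) for item in items]  # copy so the input is untouched

    def total() -> int:
        return sum(len(item["content"]) for item in result)

    candidates = sorted(
        (i for i in result
         if i["type"] == "tool_output" and len(i["content"]) > len(PLACEHOLDER)),
        key=lambda i: len(i["content"]),
        reverse=True,
    )
    for item in candidates:
        if total() <= limit:
            break
        item["content"] = PLACEHOLDER  # rewrite in place, keep the item
    return result
```

The item count and order are unchanged, so the rewritten history still
lines up item-for-item with the previous response.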

pakrym-oai · 2026-06-03 15:25:50 -07:00

4231472c03

feat: catalog multi-agent v2 config (#26254 )

## Why

Model metadata can now select multi-agent v2 even when a user has not
enabled `features.multi_agent_v2` in their config. Some existing configs
still set the legacy `agents.max_threads` knob for v1 multi-agent
behavior, so treating every v2 runtime as incompatible with
`agents.max_threads` would break users whose only v2 signal came from
the model catalog.

The incompatible configuration is specifically enabling
`features.multi_agent_v2` while also setting `agents.max_threads`.
Catalog-forced v2 should use the v2 concurrency setting and ignore the
legacy v1 cap instead of rejecting the config.

## What changed

- Split config validation from runtime concurrency calculation:
`effective_agent_max_threads` now just returns the effective cap for the
resolved multi-agent runtime.
- Added explicit validation for `features.multi_agent_v2` +
`agents.max_threads` at session startup.
- Preserved catalog-selected v2 behavior when `features.multi_agent_v2`
is disabled, so existing configs with `agents.max_threads` keep
starting.
- Updated model-runtime selector coverage so a catalog v2 model still
exposes v2 tools even when `agents.max_threads` is set and the config
flag is disabled.
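
The split between validation and cap calculation can be sketched in
Python like this. The `Config` shape, the `v2_max_concurrency` field and
its default, and the function names are hypothetical stand-ins for the
Rust code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    multi_agent_v2_feature: bool = False      # features.multi_agent_v2
    agents_max_threads: Optional[int] = None  # legacy v1 knob agents.max_threads
    v2_max_concurrency: int = 4               # hypothetical v2 setting and default


def validate(config: Config) -> None:
    """Reject only the explicit combination of the v2 flag and the v1 cap."""
    if config.multi_agent_v2_feature and config.agents_max_threads is not None:
        raise ValueError(
            "features.multi_agent_v2 cannot be combined with agents.max_threads"
        )


def effective_cap(config: Config, runtime_is_v2: bool) -> Optional[int]:
    """Return the cap for the resolved runtime; performs no validation."""
    if runtime_is_v2:
        return config.v2_max_concurrency  # catalog-forced v2 ignores the v1 cap
    return config.agents_max_threads
```

With this split, a config that sets only `agents.max_threads` still
starts when the model catalog selects v2.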

## Validation

- `cargo check -p codex-core --lib`
- `just test -p codex-core --lib -E
"test(multi_agent_v2_feature_rejects_agents_max_threads) |
test(catalog_v2_allows_agents_max_threads_when_feature_disabled)"`

jif · 2026-06-04 00:24:40 +02:00

11bceb8f8b

[codex] Split Python runtime release workflow (#26226 )

## Why

Python SDK releases pin an exact `openai-codex-cli-bin` version, so all
eight platform runtime wheels must be available on PyPI before the SDK
package is built and published. PyPI does not support reusable workflows
as Trusted Publishers, which means OIDC-backed publishing must run from
each top-level release workflow.

## What changed

- add reusable `python-runtime-build.yml` to prepare and upload all
eight runtime wheels without publishing
- add top-level `python-runtime-release.yml` for manual runtime
publication before updating an SDK pin
- have `python-sdk-release.yml` publish and verify the prepared runtime
wheels from its own top-level trusted job before building the SDK
- verify PyPI exposes exactly the expected eight runtime wheels before
either release workflow continues
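
The "exactly the expected eight" check amounts to a set comparison. A
small Python sketch, with a hypothetical function name and with the
published file list assumed to be fetched elsewhere:

```python
from typing import Iterable, Set


def verify_exact_wheels(expected: Set[str], published: Iterable[str]) -> None:
    """Fail if any expected wheel is missing or any unexpected wheel is present."""
    actual = {name for name in published if name.endswith(".whl")}
    missing = sorted(expected - actual)
    unexpected = sorted(actual - expected)
    if missing or unexpected:
        raise RuntimeError(
            f"wheel mismatch: missing={missing} unexpected={unexpected}"
        )
```

Checking both directions matters: a missing wheel breaks installs on one
platform, while an extra wheel suggests the wrong version was published.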

## PyPI configuration

- keep the trusted publisher for
`.github/workflows/python-sdk-release.yml` with environment `pypi`
- add a trusted publisher for
`.github/workflows/python-runtime-release.yml` with environment `pypi`
- no trusted publisher is needed for
`.github/workflows/python-runtime-build.yml`

## Validation

- parsed all three workflow YAML files
- validated all embedded shell blocks with `bash -n`
- no local tests run; relying on online CI

Ahmed Ibrahim · 2026-06-03 14:29:52 -07:00

2ca3810005

Restore Windows coverage for code-mode image generation exposure (#25960 )

## Summary

Restore Windows coverage for standalone image generation in code mode.

The previous test executed a V8-backed code-mode cell on Windows CI,
where that runtime path is intentionally excluded because it is
unreliable. The test was then ignored entirely on Windows, removing
useful coverage.

This splits the test into two checks:

- All platforms verify that `image_gen__imagegen` is exposed to the
model when image generation is configured for code mode only.
- Non-Windows platforms continue to execute the full V8-backed flow and
verify that the nested image-generation call succeeds.

## Verification

- `just fmt`
- `git diff --check`
- `just test -p codex-app-server standalone_image_generation`

Result: 3 tests passed, plus the required bench smoke check.

Won Park · 2026-06-03 14:02:55 -07:00

0eb7e6d79b

Fix forked thread name inheritance (#26075 )

Fixes #25950.

## Why
Forking a renamed thread could fall back to the source thread's
first-prompt title because the fork path did not preserve the source's
explicit name. That meant fork-of-renamed-fork flows could show stale
sidebar labels even though the user had renamed the parent.

## What changed
`thread/fork` now reads the source thread's distinct `name`, normalizes
it, persists it onto materialized forks, and applies it to the returned
API thread. Because the source `name` already excludes first-prompt
pseudo-titles, forks inherit only an explicit user rename instead of
stale generated metadata.

Eric Traut · 2026-06-03 12:56:54 -07:00

d8121f93c8

[profile-switcher][rust] -- [1/2] Add app-server account session protocol (#25469 )

## Summary

Adds the app-server v2 `accountSession/*` protocol used by the Desktop
profile switcher and the backend account metadata client needed to
populate workspace choices.

This is the protocol layer only. The app-server lifecycle and
consolidated saved-session storage are split into a follow-up PR.

## Rust Stack

1. This PR
2. [openai/codex#25383](https://github.com/openai/codex/pull/25383) adds
app-server session lifecycle behavior and consolidated saved-session
storage.

## Validation

- Generated app-server schema fixtures are included from the existing
generation flow in the lifecycle PR where the routes are registered.
- Did not run tests per requested scope.

dhruvgupta-oai · 2026-06-04 01:25:11 +05:30

a2ebe07b39

Expose local image paths to models (#25944 )

## Why

Local image attachments include image bytes, but the adjacent
model-visible label omits the source path. Exposing the path lets
model-selected workflows refer back to the intended local image
explicitly.

## What changed

- Include an escaped `path` attribute in model-visible local image
opening tags.
- Reuse the path-aware marker generator in rollout coverage.
- Update protocol, replay, and rollout coverage for the new request
shape.
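
For illustration, attribute escaping of this kind could look like the
Python sketch below. The tag name `image` is a placeholder and HTML-style
escaping is an assumption; the PR only states that an escaped `path`
attribute is added to the opening tag.

```python
from html import escape


def local_image_open_tag(path: str) -> str:
    """Build an opening tag whose path attribute cannot break out of its quotes."""
    # `image` is a placeholder tag name, not the marker Codex actually emits.
    return f'<image path="{escape(path, quote=True)}">'
```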

## Validation

- `just fmt`
- `just test -p codex-protocol`
- `just test -p codex-core skips_local_image_label_text`
- `just test -p codex-core
copy_paste_local_image_persists_rollout_request_shape`
- `git diff --check`

Won Park · 2026-06-03 19:49:58 +00:00

57ab4c89e0

Preserve remote plugin default prompts (#25887 )

## Summary

- Read `default_prompts` from remote plugin release metadata.
- Prefer the plural prompt list over legacy `default_prompt`.
- Fall back to `default_prompt` as a single-item list for backward
compatibility.
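
The preference order can be sketched in Python as below. Treating an
empty `default_prompts` list the same as a missing one is an assumption,
and the function name is hypothetical.

```python
from typing import Any, Dict, List


def resolve_default_prompts(release: Dict[str, Any]) -> List[str]:
    """Prefer the plural list; fall back to the legacy single prompt."""
    prompts = release.get("default_prompts")
    if prompts:  # assumption: an empty list falls through to the legacy field
        return list(prompts)
    legacy = release.get("default_prompt")
    return [legacy] if legacy else []
```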

## Testing

- `just test -p codex-core-plugins`
- `just test -p codex-app-server`

Eric Ning · 2026-06-03 12:39:13 -07:00

aeac226d16

[codex] Pin Python SDK to runtime 0.137.0a4 (#26216 )

## Summary
- pin the Python SDK runtime to `openai-codex-cli-bin==0.137.0a4`
- refresh generated protocol artifacts from `rust-v0.137.0-alpha.4`
- refresh `sdk/python/uv.lock` with all eight published runtime wheels

## Runtime publication
- published `openai-codex-cli-bin==0.137.0a4` through the
`python-sdk-release` workflow
- includes macOS, manylinux, musllinux, and Windows wheels
- publication run:
https://github.com/openai/codex/actions/runs/26905608531

## Validation
- ran `just fmt`
- generated artifacts from the `rust-v0.137.0-alpha.4` release wheel
- ran `uv lock --check --default-index https://pypi.org/simple`
- did not run tests locally, per request; CI provides the test signal

Ahmed Ibrahim · 2026-06-03 12:14:14 -07:00

10b408080a

[codex] Copy user Bazel settings into Codex worktrees (#25925 )

## Why

Codex-created linked worktrees do not include ignored files from the
main worktree. Bazel users who keep local overrides in `user.bazelrc`
therefore lose those settings in every new worktree.

The setup must also work on Windows and must not overwrite a file that
already exists in the worktree.

## What changed

The checked-in Codex environment now invokes
`.codex/environments/setup.py`. The script resolves the main worktree
and current worktree, then uses
`copy_from_main_worktree_to_worktree(repo_relative_path)` to copy
ignored files into new worktrees without overwriting existing
destinations.

`main()` currently copies `user.bazelrc`. Additional repository-relative
paths can be added as further calls to the same helper.
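
A sketch of what such a helper might look like. Unlike the script's
`copy_from_main_worktree_to_worktree(repo_relative_path)`, this
hypothetical version takes both worktree roots as explicit arguments so
it stays self-contained, and it returns whether a copy happened.

```python
import shutil
from pathlib import Path


def copy_ignored_file(main_root: Path, worktree_root: Path,
                      repo_relative_path: str) -> bool:
    """Copy one file from the main worktree unless it is absent or already present."""
    source = main_root / repo_relative_path
    destination = worktree_root / repo_relative_path
    if not source.is_file() or destination.exists():
        return False  # nothing to copy, or never overwrite an existing file
    destination.parent.mkdir(parents=True, exist_ok=True)  # nested paths
    shutil.copy2(source, destination)
    return True
```

Using `pathlib` and `shutil` keeps the behavior the same on Windows and
POSIX hosts.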

## Validation

- Ran the setup script in a linked worktree and confirmed it handles a
missing main-worktree `user.bazelrc`.
- Verified the helper copies a main-worktree file, preserves an existing
worktree file, and creates parent directories for a nested path.

Adam Perry @ OpenAI · 2026-06-03 18:29:36 +00:00

2d5c264ebc

core: stop threading SandboxPolicy through exec (#25700 )

## Why

#25450 attempts a broad `SandboxPolicy` removal across several unrelated
surfaces, which makes it hard to review and still leaves new helper code
moving legacy policies around. This PR is a narrower alternative:
migrate only the exec-side Windows sandbox plumbing so the review can
focus on one production path and one compatibility boundary.

The goal is to stop threading `SandboxPolicy` through exec code without
expanding the migration into app-server, protocol, telemetry, config, or
session behavior.

## What changed

- Removed `ExecRequest::compatibility_sandbox_policy()`.
- Changed the Windows restricted-token and elevated filesystem override
helpers to accept `PermissionProfile` plus the split filesystem/network
policies instead of a `SandboxPolicy`.
- Kept the remaining legacy projection local to the writable-root
comparison that still needs to compare split policy behavior against the
legacy Windows backend model.
- Rejected restricted split filesystem policies that still grant
full-disk writes before using the Windows restricted-token backend,
preserving the previous clear-failure behavior for profiles that project
to `ExternalSandbox`.
- Updated the Windows sandbox override tests to exercise the new call
shape and cover the full-write split-profile regression.

## Verification

- `just test -p codex-core windows_restricted_token`
- `just test -p codex-core windows_elevated`

Michael Bolin · 2026-06-03 10:41:41 -07:00

52b359b249

Fix multiline paste in /goal edit (#26047 )

Fixes #26025.

## Why
`/goal edit` opens `CustomPromptView`, which did not use the paste-burst
handling that protects the main composer when terminals deliver paste as
rapid key events. On Windows terminals, the first pasted newline could
be treated as Enter-to-submit, truncating the goal edit and leaving the
rest of the paste behind.

## What
This reuses `PasteBurst` in `CustomPromptView` as a lightweight
Enter-suppression detector for paste-like key streams. Characters still
insert directly, explicit paste still goes through the view paste path,
and ordinary text entry still submits on Enter.
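
A timing-based Enter-suppression detector can be sketched in Python as
follows. This is not how `PasteBurst` is implemented; the class name, the
threshold value, and the single-timestamp heuristic are assumptions that
only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional

BURST_GAP_SECONDS = 0.008  # hypothetical threshold between paste-like keys


@dataclass
class EnterSuppressor:
    last_char_at: Optional[float] = None

    def on_char(self, now: float) -> None:
        """Record when the most recent character key event arrived."""
        self.last_char_at = now

    def enter_should_submit(self, now: float) -> bool:
        """Enter right after a character looks like a pasted newline."""
        if self.last_char_at is None:
            return True
        return (now - self.last_char_at) > BURST_GAP_SECONDS
```

When `enter_should_submit` returns `False`, the view would insert a
newline instead of submitting, so the rest of the paste is kept.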

Eric Traut · 2026-06-03 09:36:50 -07:00

a2a9e767f7

feat: guard git enrichment (#26175 )

Skip turn git metadata enrichment when a turn has remote or multiple
executors, so we do not report the orchestrator checkout as executor
workspace metadata.

Test: `just test -p codex-core` (blocked by existing
`Session::conversation_id` compile error in `close_agent.rs`).

jif · 2026-06-03 18:36:10 +02:00

8030c36970

nit: small prompt update for MAv2 (#26179 )

Simple prompt change for MAv2 because of OOD compared to CBv9

jif · 2026-06-03 18:34:32 +02:00

99c9be1d30

[codex] Restore setup helper UAC manifest (#25949 )

## Why

#23764 removed Windows resource stamping from `codex-windows-sandbox`,
but it also removed the setup helper's UAC manifest. That manifest was
doing more than cosmetic version metadata: Microsoft documents
`requestedExecutionLevel level="asInvoker"` as the setting that makes an
executable run at the same permission level as the process that started
it:
https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#trustinfo

In the reported session, `codex-windows-sandbox-setup.exe` was launched
for a non-elevated setup refresh and `CreateProcess` failed with `os
error 740` (`The requested operation requires elevation`). Restoring an
explicit `asInvoker` manifest records the helper's intended default
launch contract: normal launches inherit the caller's token, and
elevation only happens through the code paths that request it
explicitly.

The setup helper has two launch modes:

- setup refresh uses a normal `Command::new(...)` spawn and should never
trigger UAC
- full setup explicitly uses `ShellExecuteExW` with the `runas` verb
when elevation is required

Restoring `asInvoker` keeps refresh non-elevated by default while
preserving the explicit elevated path for full setup.

## What changed

- Restored a minimal `codex-windows-sandbox-setup.manifest` containing
only `requestedExecutionLevel level="asInvoker"`.
- Added a small build script that passes setup-helper-scoped manifest
linker args for MSVC and the Windows GNU/LLVM target used by Bazel.
- Wired the manifest into Bazel build-script data.

This does not restore `winres`, `FileDescription`, `ProductName`, or
package-wide resource stamping, so other Codex binaries that link
`codex-windows-sandbox` do not inherit metadata from this package.

## Verification

- `cargo fmt -p codex-windows-sandbox`
- `cargo build -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
- `cargo build -p codex-windows-sandbox --bin codex-command-runner`
- `cargo build -p codex-windows-sandbox --lib`
- Build-script output simulation for `CARGO_CFG_TARGET_ENV=msvc` emits
`/MANIFEST:EMBED` and `/MANIFESTINPUT:<manifest>`.
- Build-script output simulation for `CARGO_CFG_TARGET_ENV=gnu` +
`CARGO_CFG_TARGET_ABI=llvm` emits `-Wl,-Xlink=/manifest:embed` and
`-Wl,-Xlink=/manifestinput:<manifest>`.
- Inspected the built binaries and confirmed:
  - `codex-windows-sandbox-setup.exe` contains `requestedExecutionLevel` /
    `asInvoker`
  - `codex-command-runner.exe` does not contain those manifest strings
  - Windows `VersionInfo` remains blank for `FileDescription` /
    `ProductName`
- `just test -p codex-windows-sandbox` ran through Nextest, with 114
passing, 2 skipped, and 1 existing Windows sandbox failure:
`unified_exec::tests::legacy_non_tty_cmd_emits_output` fails with
`CreateRestrictedToken failed: 87`.

iceweasel-oai · 2026-06-03 09:21:24 -07:00

b2344d8fbc

fix: main (#26176 )

jif · 2026-06-03 16:47:34 +02:00

4417e4c193

Implement v1 skills extension prompt injection (#26167 )

## Why

The skills extension needs a real turn-time path before host, executor,
or remote skills can be routed through it. The previous code was mostly
a placeholder catalog/provider sketch, so there was no bounded
available-skills fragment, no source-owned `SKILL.md` read, and no place
for warnings or per-turn selection state to live.

This PR makes `ext/skills` the authority-preserving flow for listing
candidate skills and injecting only explicitly selected main prompts,
without adding more of that logic to `codex-core`.

## What changed

- Expands catalog entries with `main_prompt`, display path, short
description, dependency metadata, enabled/prompt visibility flags, and
authority/package-aware read requests.
- Replaces the placeholder `providers/*` modules with
`SkillProviderSource` and `SkillProviders`, routing list/read/search
calls by source kind and surfacing provider failures as warnings.
- Adds bounded available-skills rendering and `SKILL.md` main-prompt
truncation before the fragments enter model context.
- Resolves explicit skill selections from structured `UserInput::Skill`,
skill-file mentions, `skill://...` paths, and plain `$skill` text
mentions, then reads selected prompts through their owning provider.
- Stores mutable per-thread skills config and per-turn
catalog/selection/warning state.
- Adds `install_with_providers` so tests and future host wiring can
supply concrete providers.
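
As one illustration of the selection step, plain `$skill` text mentions
could be resolved against the catalog as in this Python sketch. The
mention grammar (a letter followed by letters, digits, `_`, or `-`) and
the function name are assumptions; the extension itself is Rust and also
handles structured inputs, file mentions, and `skill://...` paths.

```python
import re
from typing import Iterable, List

# Assumed grammar: `$` not preceded by a word character or another `$`.
_MENTION = re.compile(r"(?<![\w$])\$([A-Za-z][A-Za-z0-9_-]*)")


def explicit_skill_mentions(text: str, catalog: Iterable[str]) -> List[str]:
    """Return catalog skills mentioned as $name, in first-mention order."""
    known = set(catalog)
    selected: List[str] = []
    for match in _MENTION.finditer(text):
        name = match.group(1)
        if name in known and name not in selected:
            selected.append(name)
    return selected
```

Only names present in the catalog are selected, so text such as a dollar
amount or an unknown `$name` injects nothing.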

## Testing

- Not run locally.
- Added `codex-rs/ext/skills/tests/skills_extension.rs` coverage for
available-catalog injection, selected prompt injection through the
owning provider, and prompt-hidden skills that remain invokable.

jif · 2026-06-03 16:24:16 +02:00

96d2d2f68c

chore: mechanical rename (#26156 )

Rename `Session::conversation_id` to `Session::thread_id` using
RustRover's automated refactoring.

jif · 2026-06-03 15:38:30 +02:00

c9ae0f48a1
