codex

[codex] route sleep through time providers (#29973 )

## Summary

- add a cancellable sleep operation to `TimeProvider`
- route `clock.sleep` through the configured provider
- extend the supported sleep duration to 12 hours
- complete the sleep turn item before propagating provider failures
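
As a rough illustration of the first three bullets, the following is a minimal, std-only sketch of a provider-routed, cancellable sleep. Only `TimeProvider`, `clock.sleep`, and the 12-hour limit come from the summary; `CancelToken`, `WallClock`, `SleepOutcome`, `SleepError`, and `clock_sleep` are hypothetical names, and the real implementation is presumably async rather than condvar-based.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Upper bound taken from the summary above (12 hours).
const MAX_SLEEP: Duration = Duration::from_secs(12 * 60 * 60);

#[derive(Debug, PartialEq)]
enum SleepOutcome {
    Completed,
    Cancelled,
}

#[derive(Debug, PartialEq)]
enum SleepError {
    DurationTooLong,
}

/// Hypothetical cancellation handle: a flag guarded by a mutex plus a condvar.
#[derive(Default)]
struct CancelToken {
    cancelled: Mutex<bool>,
    changed: Condvar,
}

impl CancelToken {
    fn cancel(&self) {
        *self.cancelled.lock().unwrap() = true;
        self.changed.notify_all();
    }
}

/// Assumed shape of the provider trait; the real signature is not shown in the PR text.
trait TimeProvider {
    fn sleep(&self, duration: Duration, cancel: &CancelToken) -> SleepOutcome;
}

/// Wall-clock provider: waits until the timeout elapses or the token is cancelled.
struct WallClock;

impl TimeProvider for WallClock {
    fn sleep(&self, duration: Duration, cancel: &CancelToken) -> SleepOutcome {
        let guard = cancel.cancelled.lock().unwrap();
        let (guard, _timeout) = cancel
            .changed
            .wait_timeout_while(guard, duration, |cancelled| !*cancelled)
            .unwrap();
        if *guard {
            SleepOutcome::Cancelled
        } else {
            SleepOutcome::Completed
        }
    }
}

/// Tool-level entry point: validates the duration, then defers to the provider.
fn clock_sleep(
    provider: &dyn TimeProvider,
    duration: Duration,
    cancel: &CancelToken,
) -> Result<SleepOutcome, SleepError> {
    if duration > MAX_SLEEP {
        return Err(SleepError::DurationTooLong);
    }
    Ok(provider.sleep(duration, cancel))
}
```

Because the tool only sees the trait, an external clock could be swapped in by supplying another `TimeProvider` implementation.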

## Why

This isolates the core clock abstraction needed by external clock
integrations. Existing system and app-server behavior remains wall-clock
based in this PR; the stacked follow-up supplies app-server sleeps from
an external clock.

rka-oai · 2026-06-24 22:17:43 -07:00

f66d793a2d

core: raise token budget message limits (#29970 )

## Why

Token-budget reminder and guidance messages can require more than 1,000
bytes to provide useful model-facing instructions. At the same time,
these strings are injected into model-visible context, so their size
must remain tightly bounded in response to the P0 context-growth
concern. A 2,000-byte runtime cap provides additional room without
allowing the substantially larger context growth of a 4 KiB limit.

## What changed

- raises the runtime byte limits for token-budget reminder templates and
guidance messages from 1,000 to 2,000
- raises the corresponding JSON Schema `maxLength` values to 2,000
- regenerates `codex-rs/core/config.schema.json`
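
A byte-based runtime cap of this kind can be sketched as follows. The constant and function names are hypothetical; only the 2,000-byte limit comes from the description. Note that `str::len` counts UTF-8 bytes, not characters.

```rust
/// Runtime cap from the description above; the constant name is an assumption.
const MAX_TOKEN_BUDGET_MESSAGE_BYTES: usize = 2_000;

/// Accepts the message only if its UTF-8 encoding fits within the cap.
fn validate_token_budget_message(message: &str) -> Result<&str, String> {
    let bytes = message.len(); // `len` is measured in bytes
    if bytes > MAX_TOKEN_BUDGET_MESSAGE_BYTES {
        Err(format!(
            "message is {bytes} bytes; limit is {MAX_TOKEN_BUDGET_MESSAGE_BYTES}"
        ))
    } else {
        Ok(message)
    }
}
```

A message made of multi-byte characters reaches the cap with fewer than 2,000 characters, which is why the sketch measures bytes rather than `chars().count()`.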

## Testing

- `just test -p codex-features`
- `just test -p codex-core load_config_resolves_token_budget_config
load_config_rejects_invalid_token_budget_reminder_template`

The full `codex-core` test run completed 2,858 tests successfully and
encountered seven unrelated environment-sensitive failures involving
Seatbelt/network environment assertions, MCP capability setup, and abort
timing.

Michael Bolin · 2026-06-25 05:05:32 +00:00

22f12568e1

Report MCP error codes with server attribution (#29969 )

## Why

MCP error-code telemetry special-cased Codex Apps: its reported error
codes were retained, while codes from every other MCP server were
replaced with `unknown`. Error reporting should behave consistently for
every MCP server. The server name already identifies where an error came
from, so telemetry does not need a separate Codex Apps classification.

This follows up on [#28976](https://github.com/openai/codex/pull/28976),
which introduced MCP error-code telemetry.

## What changed

- Add the MCP server name to call, duration, and error metrics.
- Retain bounded, sanitized tool error codes from every MCP server.
- Remove `McpErrorCodeSource` and the Codex Apps ownership lookup from
telemetry collection.
- Use the same metric-tagging path for blocked, rejected, and executed
MCP calls.

## Test plan

- Verify the complete metric tag set includes the sanitized MCP server
name.
- Verify error codes from ordinary MCP servers are retained, bounded,
and sanitized.
- Preserve coverage for request failures, tool-result failures, nested
auth failures, and span attributes.

Ahmed Ibrahim · 2026-06-24 21:08:39 -07:00

cef5444a80

[3/3] core: replay persisted world state (#29837 )

## Why

Persisting `WorldState` snapshots and patches is only useful if resume
and fork restore that exact comparison baseline. Rebuilding it from
`TurnContextItem` loses section state and can either repeat or suppress
model-visible updates.

This is the third PR in the WorldState persistence stack, built on
#29835.

## What

- Replay full WorldState snapshots and RFC 7386 patches through the
existing rollout reconstruction segments.
- Discard state from rolled-back turns and treat compaction as a
baseline reset.
- Hydrate `ContextManager` from the reconstructed snapshot on resume and
fork.
- Remove the synthetic `TurnContextItem` to WorldState conversion path.
- Leave legacy or malformed rollouts without a baseline so the next
update safely emits a full snapshot.
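
The replay rules above can be sketched with a deliberately flattened model, in which a snapshot is a map of section ID to rendered string and a patch maps keys to `Some(value)` (set) or `None` (remove). `RolloutItem`, `Snapshot`, and `replay` are hypothetical names, rollback handling is omitted, and ignoring a patch that has no baseline is an assumption about how malformed rollouts are treated.

```rust
use std::collections::BTreeMap;

/// Simplified snapshot: section ID -> model-visible value.
type Snapshot = BTreeMap<String, String>;

/// Hypothetical subset of the rollout items relevant to replay.
enum RolloutItem {
    Snapshot(Snapshot),
    /// `None` removes a key, mirroring a `null` in a merge patch.
    Patch(BTreeMap<String, Option<String>>),
    /// Compaction resets the baseline until the next full snapshot.
    Compacted,
}

/// Folds the rollout into the final comparison baseline, if any.
fn replay(items: &[RolloutItem]) -> Option<Snapshot> {
    let mut baseline: Option<Snapshot> = None;
    for item in items {
        match item {
            RolloutItem::Snapshot(snapshot) => baseline = Some(snapshot.clone()),
            RolloutItem::Patch(patch) => {
                // A patch without a baseline is ignored, leaving `None` so the
                // next update would emit a full snapshot.
                if let Some(state) = baseline.as_mut() {
                    for (key, value) in patch {
                        match value {
                            Some(value) => {
                                state.insert(key.clone(), value.clone());
                            }
                            None => {
                                state.remove(key);
                            }
                        }
                    }
                }
            }
            RolloutItem::Compacted => baseline = None,
        }
    }
    baseline
}
```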

## Testing

- `just test -p codex-core world_state`
- `just test -p codex-core rollout_reconstruction_tests`
- `just fix -p codex-core`
- `just test -p codex-core` *(the changed tests passed; the full run
also hit unrelated existing/test-environment failures, primarily a
missing `test_stdio_server` binary)*

sayan-oai · 2026-06-25 03:32:08 +00:00

a74771340d

[codex] Add Ultra reasoning effort (#29899 )

## Why

Ultra should be one user-facing reasoning selection for work that
benefits from both maximum reasoning and proactive multi-agent
delegation. Without it, clients must coordinate maximum reasoning with
the experimental `multiAgentMode` setting, even though the inference
backend still expects its existing `max` effort value.

This change makes reasoning effort the source of truth: clients select
`ultra`, core derives proactive multi-agent behavior when the turn is
eligible for multi-agent V2, and inference requests continue to use the
backend-compatible `max` value.

## What changed

- Add `ultra` as a first-class reasoning effort and preserve
model-catalog ordering when exposing it to clients.
- Convert `ultra` to `max` at the inference request boundary, including
Responses HTTP/WebSocket requests, startup prewarm, compaction, and
memory summarization.
- Derive effective multi-agent mode per turn from effective reasoning
effort:
  - eligible multi-agent V2 + `ultra` → `proactive`
  - eligible multi-agent V2 + any other effort → `explicitRequestOnly`
  - V1 or otherwise ineligible sessions → no multi-agent mode instruction
- Keep the derived effective mode in turn context history so successive
turns can emit a developer-message update only when the effective mode
changes.
- Remove selected multi-agent mode from core session configuration, turn
construction, thread settings, resume/fork restoration, and subagent
spawn plumbing. Subagents inherit reasoning effort and derive their own
effective mode.
- Retain the experimental app-server `multiAgentMode` fields for wire
compatibility while marking them deprecated. Request values are accepted
but ignored; compatibility response fields report `explicitRequestOnly`.
- Display Ultra in the TUI using the order supplied by `model/list`.
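
The two derivations described above (the wire value and the effective multi-agent mode) can be sketched as pure functions. The type and function names are hypothetical, and the `Low`/`Medium`/`High` variants are assumed for illustration; only `ultra`, `max`, `proactive`, and `explicitRequestOnly` appear in the description.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
    Max,
    Ultra,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum MultiAgentMode {
    Proactive,
    ExplicitRequestOnly,
}

impl ReasoningEffort {
    /// Value sent at the inference request boundary: `ultra` collapses to `max`.
    fn wire_value(self) -> &'static str {
        match self {
            ReasoningEffort::Low => "low",
            ReasoningEffort::Medium => "medium",
            ReasoningEffort::High => "high",
            ReasoningEffort::Max | ReasoningEffort::Ultra => "max",
        }
    }
}

/// Effective mode for a turn; `None` means no multi-agent instruction is emitted.
fn effective_multi_agent_mode(
    effort: ReasoningEffort,
    eligible_for_multi_agent_v2: bool,
) -> Option<MultiAgentMode> {
    if !eligible_for_multi_agent_v2 {
        return None;
    }
    Some(match effort {
        ReasoningEffort::Ultra => MultiAgentMode::Proactive,
        _ => MultiAgentMode::ExplicitRequestOnly,
    })
}
```

Keeping both derivations as functions of the effort is what lets subagents inherit only the effort and recompute their own mode.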

## Validation

- `just test -p codex-core ultra_reasoning_uses_max_for_requests`
- `just test -p codex-tui model_reasoning_selection_popup`

Shijie Rao · 2026-06-24 20:13:52 -07:00

df1199fddb

[2/3] core: persist world state in rollouts (#29835 )

## Why

`WorldState` currently remembers its model-visible diff baseline only in
memory. That leaves no durable source for restoring the exact baseline
after resume, fork, rollback, or compaction.

This is the second PR in the WorldState persistence stack, built on
#29833 and following #29249. It records durable state transitions; the
next PR will replay them during rollout reconstruction.

## What

- Add a `world_state` rollout item containing either a full snapshot or
an RFC 7386 JSON Merge Patch.
- Persist a full snapshot after initial context and after compaction
establishes a new context window.
- Persist non-empty patches when later sampling steps or turns advance
the WorldState baseline.
- Write model-visible history before its matching WorldState record, so
an interrupted write can only cause a safe repeated update on replay.
- Preserve WorldState records for full-history forks while excluding
them from thread previews, metadata, and app-server history
materialization.
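
For readers unfamiliar with RFC 7386, here is a minimal sketch of the merge-patch application rule over a hand-rolled JSON value type. The `Json` enum and the `object` helper are illustrative stand-ins (the real code presumably uses a JSON library); the algorithm follows the RFC: object patches merge recursively, `null` removes a member, and any non-object patch replaces the target.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Convenience constructor for objects (illustrative helper).
fn object(fields: &[(&str, Json)]) -> Json {
    Json::Object(
        fields
            .iter()
            .map(|(key, value)| (key.to_string(), value.clone()))
            .collect(),
    )
}

/// Applies an RFC 7386 JSON Merge Patch to `target` and returns the result.
fn merge_patch(target: Json, patch: &Json) -> Json {
    match patch {
        Json::Object(patch_fields) => {
            // A non-object target is treated as an empty object.
            let mut fields = match target {
                Json::Object(fields) => fields,
                _ => BTreeMap::new(),
            };
            for (key, value) in patch_fields {
                if *value == Json::Null {
                    fields.remove(key);
                } else {
                    let current = fields.remove(key).unwrap_or(Json::Null);
                    fields.insert(key.clone(), merge_patch(current, value));
                }
            }
            Json::Object(fields)
        }
        // Scalars and arrays replace the target wholesale.
        other => other.clone(),
    }
}
```

Since arrays are replaced rather than merged, a `null` inside an array is ordinary data and not a deletion marker.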

Older binaries read rollout lines independently, so they skip the
unknown `world_state` records while retaining the rest of the thread.

## Testing

- `just test -p codex-core
snapshot_merge_patch_changes_and_removes_nested_values`
- `just test -p codex-core
world_state_baseline_deduplicates_until_history_is_replaced`
- `just test -p codex-core
deferred_executor_compaction_preserves_then_updates_environment_once`
- `just test -p codex-protocol`
- `just test -p codex-rollout`
- `just test -p codex-state`
- `just test -p codex-thread-store`
- `just test -p codex-app-server-protocol`

sayan-oai · 2026-06-24 20:13:49 -07:00

fa036d39aa

[codex] Populate remote plugin local versions (#29956 )

# What

- Carry installed remote release versions through remote plugin
summaries as `localVersion`.
- Keep the app-server mapping a pure adapter by populating that value in
the remote catalog layer.

# Why

Remote plugin summaries always returned `localVersion: null` even after
their versioned bundles had been installed locally. Consumers such as
scheduled-task template discovery use `localVersion` to resolve a
plugin's materialized root, so templates from remote curated plugins
were silently skipped.

Abhinav · 2026-06-25 03:13:03 +00:00

6db937275f

code-mode: define process host wire protocol (#29804 )

## Why

The process-owned code mode implementation needs an explicit, bounded
wire contract before either side depends on it. Keeping framing and
message semantics in `codex-code-mode-protocol` gives the client and
sidecar one shared source of truth and makes compatibility failures
detectable during connection setup.

## What changed

- adds a versioned client/host handshake with required and optional
capabilities
- defines operation requests and responses for session lifecycle and
cell control
- defines reverse delegate request, response, cancellation, and
cell-closure messages
- adds a four-byte little-endian length-prefixed JSON codec with a hard
frame cap
- rejects malformed frames, unknown fields, invalid identifiers, and
unsupported protocol states
- locks the wire representation down with explicit JSON round-trip tests
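
The framing layer can be sketched with the standard library alone. This assumes the payload is already-serialized JSON bytes; `MAX_FRAME_BYTES` is a made-up value (the actual cap is not stated here), and `write_frame`/`read_frame` are hypothetical names.

```rust
use std::io::{self, Read, Write};

/// Hypothetical hard cap; the real limit is not given in the description.
const MAX_FRAME_BYTES: usize = 1 << 20;

/// Writes a four-byte little-endian length prefix followed by the payload.
fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_BYTES {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "frame exceeds cap",
        ));
    }
    let len = payload.len() as u32; // safe: bounded by MAX_FRAME_BYTES
    writer.write_all(&len.to_le_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame, rejecting oversized lengths before allocating.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    reader.read_exact(&mut header)?;
    let len = u32::from_le_bytes(header) as usize;
    if len > MAX_FRAME_BYTES {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "frame exceeds cap",
        ));
    }
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the declared length against the cap before allocating keeps a malformed or hostile peer from forcing a large allocation with a four-byte header.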

## Testing

- `just test -p codex-code-mode-protocol`

## Stack

Part 1 of 6. Followed by
[#29805](https://github.com/openai/codex/pull/29805).

Channing Conger · 2026-06-24 20:03:22 -07:00

b3e1c33776

Represent MCP authentication with an enum (#29924 )

## Why

MCP authentication has distinct OAuth and ChatGPT-session flows.
Representing that choice as `use_chatgpt_auth` makes one flow implicit
and allows the configuration model to express the distinction only
through a boolean.

ChatGPT credential forwarding also needs a first-party trust boundary. A
configurable `chatgpt_base_url` controls routing, but must not grant an
MCP server permission to receive session credentials.

This change builds on #29733, where the boolean was introduced.

## What changed

- Replace `use_chatgpt_auth` with an `auth` field backed by the
exhaustive `McpServerAuth` enum.
- Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining
the default.
- Trust only the origin derived from the existing hardcoded
`CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
- Keep configured bearer tokens and authorization headers ahead of the
selected authentication flow.
- Update config writers, schema output, fixtures, and integration-test
setup to use the enum.
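
A std-only sketch of the enum-backed `auth` field follows. `McpServerAuth` and the `"oauth"`/`"chatgpt"` values come from the description; the `FromStr` parsing, `resolve_auth_field`, and `describe` are illustrative assumptions standing in for whatever config deserialization the project actually uses.

```rust
use std::str::FromStr;

#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
enum McpServerAuth {
    /// OAuth remains the default when `auth` is omitted.
    #[default]
    Oauth,
    Chatgpt,
}

impl FromStr for McpServerAuth {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "oauth" => Ok(Self::Oauth),
            "chatgpt" => Ok(Self::Chatgpt),
            other => Err(format!("unknown MCP auth mode: {other}")),
        }
    }
}

/// Resolves an optional `auth` config value, falling back to the default.
fn resolve_auth_field(field: Option<&str>) -> Result<McpServerAuth, String> {
    match field {
        Some(value) => value.parse(),
        None => Ok(McpServerAuth::default()),
    }
}

/// Exhaustive match: adding a variant forces every such site to be updated.
fn describe(auth: McpServerAuth) -> &'static str {
    match auth {
        McpServerAuth::Oauth => "OAuth flow",
        McpServerAuth::Chatgpt => "ChatGPT session flow",
    }
}
```

Unlike a boolean, the enum names both flows explicitly, and the compiler flags any `match` that misses a newly added variant.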

## Verification

Integration coverage exercises the complete streamable HTTP startup path
in two independent configurations:

- A directly constructed MCP configuration verifies that matching an
overridden `chatgpt_base_url` does not grant ChatGPT auth.
- A persisted `config.toml` containing an attacker-controlled
`chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary
through normal config parsing.

Both tests complete MCP initialization and tool listing and assert that
the full captured request sequence contains no authorization headers.
Separate integration coverage verifies that configured authorization
takes precedence over ChatGPT auth.

Ahmed Ibrahim · 2026-06-24 19:51:51 -07:00

f8937b7d86

TUI support for buffer experience (#29919 )

Eric Traut · 2026-06-24 19:50:50 -07:00

6801941cfe

[1/3] core: make world state snapshots serializable (#29833 )

## Why

`WorldState` currently keeps its diff baseline as live Rust objects
keyed by process-local `TypeId`. That baseline cannot be written to a
rollout or restored after resume, so Codex reconstructs an approximation
from `TurnContextItem`.

This is the first change in the WorldState persistence stack. It gives
every section a stable persisted identity and a compact serializable
comparison snapshot without changing rollout behavior yet.

## What changed

- Require each `WorldStateSection` to define a stable ID and
serializable snapshot type.
- Reject duplicate section IDs when constructing `WorldState`.
- Persist a dedicated environment comparison snapshot using
model-visible strings instead of runtime path types.
- Store only `WorldStateSnapshot` in `ContextManager`, removing the
parallel live-object baseline.
- Render diffs by restoring each section's typed snapshot; invalid
snapshots fall back to a full section render.
- Omit null object fields for future RFC 7386 patches while preserving
null values inside arrays.

Follow-up PRs will record full snapshots and merge patches, then restore
the baseline during resume, fork, and rollback.

## Test plan

- WorldState snapshot tests cover stable IDs, duplicate rejection, null
omission, and array preservation.
- Environment tests cover persistence-safe snapshot values and existing
diff rendering.
- ContextManager baseline deduplication and session context-update
persistence tests.

Related: #29249

sayan-oai · 2026-06-24 19:26:55 -07:00

3e51b46eba

Allow ChatGPT-hosted MCP servers to use session auth (#29733 )

## Why

ChatGPT session authentication was inferred from the reserved Codex Apps
server name. That couples credential routing to Codex Apps-specific
behavior and prevents other MCP endpoints hosted by ChatGPT from
explicitly using the current session.

The opt-in also needs a clear security boundary: an arbitrary MCP
configuration must not be able to redirect ChatGPT credentials to
another origin.

## What changed

- Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
`false`.
- Honor the setting only when the parsed server URL has the same HTTP(S)
origin as the configured `chatgpt_base_url`; otherwise remove the
capability before startup.
- Resolve bearer tokens and static or environment-backed authorization
headers before selecting authentication, with configured authorization
taking precedence over ChatGPT session auth.
- Enable the setting for the built-in Codex Apps and hosted plugin
runtime endpoints while keeping Codex Apps caching and tool
normalization scoped to the reserved server.
- Persist the setting through MCP config rewrite paths and expose it in
the generated config schema.
- Load the current login state for `codex mcp list` so reported auth
status matches runtime behavior.
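
The precedence and origin rules above might be sketched as a single decision function. All names except `use_chatgpt_auth` are hypothetical, and origins are modeled as pre-parsed `(scheme, host, port)` values rather than derived from URLs.

```rust
/// Pre-parsed HTTP(S) origin; URL parsing is out of scope for this sketch.
#[derive(Clone, Debug, PartialEq)]
struct Origin {
    scheme: String,
    host: String,
    port: u16,
}

#[derive(Debug, PartialEq)]
enum SelectedAuth {
    /// A configured bearer token or authorization header always wins.
    Configured(String),
    ChatGptSession,
    /// No ChatGPT session credentials are attached.
    NoSessionAuth,
}

struct HttpMcpServer {
    origin: Origin,
    use_chatgpt_auth: bool,
    configured_authorization: Option<String>,
}

/// Applies the precedence: configured authorization first, then same-origin
/// ChatGPT session auth, otherwise no session credentials.
fn select_auth(server: &HttpMcpServer, chatgpt_origin: &Origin) -> SelectedAuth {
    if let Some(header) = &server.configured_authorization {
        return SelectedAuth::Configured(header.clone());
    }
    if server.use_chatgpt_auth && server.origin == *chatgpt_origin {
        SelectedAuth::ChatGptSession
    } else {
        SelectedAuth::NoSessionAuth
    }
}
```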

## Verification

Core integration coverage exercises the complete streamable HTTP MCP
startup path and verifies that:

- a same-origin opted-in server receives the current ChatGPT access
token;
- an explicitly configured authorization header takes precedence;
- a different-origin server completes MCP initialization and tool
listing without receiving any ChatGPT authorization header.

Ahmed Ibrahim · 2026-06-24 19:21:28 -07:00

4c0706e24a

TUI Plugin Sharing 5 - polish remote plugin catalog rows (#26705 )

This is the final plugin sharing PR in the 5-PR stack. It applies the
remaining TUI polish for remote plugin catalog rows and tabs:
admin-disabled plugins now read as blocked/view-only instead of looking
toggleable, admin-installed/default-installed plugins count and sort
like installed plugins, plugin search matches richer metadata, and an
empty successful `Shared with me` section stays hidden.

- Admin-disabled rows use a blocked marker, show `Disabled`, and keep
Enter-only detail behavior without a toggle hint.
- Admin-installed/default-installed plugins show as installed in counts,
ordering, tabs, and detail copy.
- Plugin search now matches descriptions and keywords in addition to
existing row metadata.
- Successful-empty `Shared with me` tabs are hidden, while loading,
error, workspace-empty, and real shared-plugin states remain visible.
- Updates coverage in
`plugins_popup_snapshot_shows_all_marketplaces_and_sorts_installed_then_name`,
`plugins_popup_admin_disabled_installed_plugin_has_no_toggle_hint`,
`plugins_popup_search_matches_plugin_descriptions`, and
`plugins_popup_remote_section_fallback_states_snapshot`.
- Updates snapshots `plugins_popup_curated_marketplace` and
`plugins_popup_empty_shared_section_hidden`.


<img width="2034" height="106" alt="image"
src="https://github.com/user-attachments/assets/3f9a57e1-edd8-4e6c-b0b0-9f632a3c9529"
/>
<img width="2038" height="380" alt="image"
src="https://github.com/user-attachments/assets/45a47491-3381-4846-a13d-496bc0051d42"
/>

canvrno-oai · 2026-06-24 18:48:11 -07:00

a0d5fd772e

core: add configurable <context_window_guidance> message (#29936 )

## Why

This PR adds a configurable `<context_window_guidance>` developer
section immediately after `<context_window>`. Harness integrations need
this section to give the model deployment-specific instructions for
preparing for context-window transitions.

## What changed

- Add an optional `features.token_budget.guidance_message` config with a
1,000-byte runtime cap and generated schema support.
- Render configured guidance as a developer `ContextualUserFragment`
wrapped in `<context_window_guidance>` immediately after
`<context_window>`.
- Omit the section when guidance is unset, empty, or whitespace-only.
- Preserve the resolved value in config locks and classify persisted
guidance as contextual developer content.
- Add integration coverage for rendered content and ordering.
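
The omit-when-blank rule can be sketched as below. The function name is hypothetical, and the exact layout inside the tag (newlines around the text, no trimming of the configured value) is an assumption.

```rust
/// Renders the guidance section, or `None` when it should be omitted.
fn render_context_window_guidance(guidance: Option<&str>) -> Option<String> {
    let text = guidance?;
    // Unset, empty, and whitespace-only values all omit the section.
    if text.trim().is_empty() {
        return None;
    }
    Some(format!(
        "<context_window_guidance>\n{text}\n</context_window_guidance>"
    ))
}
```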

Michael Bolin · 2026-06-24 18:03:44 -07:00

f15df624a6

feat(remote-control): add daemon pairing command (#29913 )

## Why

Users who run Codex remote control through daemon mode can keep the
daemon running, but they do not have a CLI path to mint the short-lived
manual pairing code needed to connect another device. Without this
command, they need to speak app-server JSON-RPC directly.

Related: #25675

## What Changed

- Added `codex remote-control pair`, which connects to the existing
daemon control socket and calls `remoteControl/pairing/start` with
`manualCode: true`.
- Kept the command non-lifecycle-mutating: it does not start, enable, or
restart the daemon.
- Human output labels the manual code as `Pairing code: ...`; `--json`
preserves the full pairing response.
- Added daemon socket-client, CLI formatting, and parser coverage.

## Verification

- `remote_control_client::tests::start_pairing_requests_manual_code`
verifies the daemon client sends `{ "manualCode": true }` and parses the
complete response.
- `remote_control_cmd::tests::remote_control_pairing_human_output_labels_the_manual_code`
verifies the human-facing output.

Anton Panasenko · 2026-06-24 18:00:06 -07:00

f4e6aa70e5

[codex] nest sleep config under current time reminder (#29910 )

## Summary

- move sleep tool enablement from top-level `[features].sleep_tool` to
`[features.current_time_reminder].sleep_tool`
- remove the standalone `Feature::SleepTool` flag and gate `clock.sleep`
from resolved current-time configuration
- update config schema, config-lock materialization, and existing sleep
coverage

Stacked on #29907.

rka-oai · 2026-06-24 17:49:00 -07:00

35f5d02464

[codex] namespace sleep under clock (#29907 )

## Summary

- expose the interruptible sleep tool as `clock.sleep` instead of
top-level `sleep`
- keep `clock.curr_time` and `clock.sleep` in the same model-visible
namespace when both features are enabled
- update existing core and app-server integration coverage to issue
namespaced sleep calls

## Why

Sleep is a clock operation. Grouping it with `clock.curr_time` gives the
model a more coherent tool surface without changing the sleep feature
gate or runtime behavior.

## Validation

- `just test -p codex-core sleep_tool_follows_feature_gate`
- `just test -p codex-core any_new_input_interrupts_sleep`
- `just test -p codex-app-server
sleep_emits_started_and_completed_items`

rka-oai · 2026-06-24 17:17:28 -07:00

800529218a

Isolate curated plugin sync Git environment (#29785 )

## Why

Several users have reported data loss from this bug, including tracked
files being deleted or replaced and branches appearing to be reset to
the curated plugins repository. This can happen during startup, before
the model chooses to edit anything.

Ambient repository variables such as `GIT_DIR` and `GIT_WORK_TREE` can
override the repository selected by `git -C`, redirecting startup sync's
`git reset --hard` and `git clean -fdx` into the user's active
workspace.

## What

Route every startup-sync Git invocation through a shared command builder
that removes repository-local environment variables before execution.
Add regression coverage to keep those variables isolated.
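
A shared command builder of this kind could look roughly like the sketch below, using `std::process::Command::env_remove`. The function name is hypothetical and the variable list is an illustrative subset, not necessarily the set the PR clears (`git rev-parse --local-env-vars` prints Git's own list).

```rust
use std::path::Path;
use std::process::Command;

/// Illustrative subset of Git's repository-local environment variables.
const REPO_LOCAL_GIT_ENV: &[&str] = &[
    "GIT_DIR",
    "GIT_WORK_TREE",
    "GIT_INDEX_FILE",
    "GIT_OBJECT_DIRECTORY",
    "GIT_COMMON_DIR",
];

/// Builds `git -C <repo>` with repository-local variables removed, so an
/// ambient `GIT_DIR` or `GIT_WORK_TREE` cannot redirect the command.
fn isolated_git_command(repo: &Path) -> Command {
    let mut command = Command::new("git");
    command.arg("-C").arg(repo);
    for name in REPO_LOCAL_GIT_ENV {
        command.env_remove(name);
    }
    command
}
```

Callers would append the subcommand, for example `isolated_git_command(repo).args(["status", "--short"])`, so every invocation inherits the same isolation.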

Fixes #27416

Eric Traut · 2026-06-24 16:04:51 -07:00

134646eff0

Read connector declarations from executor plugins (#29852 )

## Why

Selected capability roots can live on a different executor and operating
system from app-server. Their connector declarations must therefore be
read through the executor that owns the package, without converting
executor URIs into host paths.

This PR adds that authority-bound reader without activating connectors
or changing thread startup.

## What changed

- Add a small `codex-connectors-extension` crate for executor-owned
connector I/O.
- Read only the app configuration explicitly declared by the resolved
plugin manifest.
- Read through the `ExecutorFileSystem` retained by
`ResolvedExecutorPlugin`; there is no host-filesystem fallback or
default-file probe.
- Keep `PathUri` values intact so Windows, Unix, and remote executor
paths work from any orchestrator OS.
- Return full `AppDeclaration` values so the caller retains declaration
names and categories for routing.
- Preserve the selected plugin ID and exact executor URI in read and
parse errors.

The contract is intentionally narrow: selected packages are assumed to
be trusted and valid, and packages that provide connectors explicitly
declare their app configuration.

## Stack scope

This PR is stacked on #29851. It only provides the executor-backed
reader. #29856 resolves selected roots at thread start, freezes their
connector snapshot, and contains the remote-capable end-to-end authority
test for the complete path.

jif · 2026-06-24 23:56:50 +01:00

9ff8068880

path-uri: normalize parent segments in absolute joins (#29903 )

## Why

`PathUri::join` normalized `..` for relative paths, but its
absolute-path branch rebuilt URIs through `url::PathSegmentsMut::push`,
which skips dot segments. `/tmp/a/../b` therefore resolved to `/tmp/a/b`
instead of `/tmp/b`.

## What changed

Normalize absolute native path segments before constructing the file
URI. Parent traversal now clamps at POSIX roots, Windows drive roots,
and UNC share roots, including paths with repeated separators.

Add platform-independent coverage for POSIX, drive, UNC, root-clamping,
and repeated-separator cases.
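
For the POSIX case only, the clamping behavior can be sketched as a small segment stack. This is not the `PathUri` implementation; `normalize_posix_absolute` is a hypothetical helper that assumes an absolute input and ignores Windows drive and UNC roots.

```rust
/// Normalizes `.`/`..` and repeated separators in an absolute POSIX path.
/// Parent traversal at the root is clamped instead of escaping it.
fn normalize_posix_absolute(path: &str) -> String {
    let mut segments: Vec<&str> = Vec::new();
    for segment in path.split('/') {
        match segment {
            // Empty segments come from leading or repeated separators.
            "" | "." => {}
            ".." => {
                // `pop` on an empty stack is a no-op, which clamps at `/`.
                segments.pop();
            }
            name => segments.push(name),
        }
    }
    format!("/{}", segments.join("/"))
}
```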

## Manual validation

- `just test -p codex-utils-path-uri`

Adam Perry @ OpenAI · 2026-06-24 22:33:18 +00:00

81f340436c

Add a connector declaration snapshot (#29851 )

## Why

Connector declarations currently enter Codex through broad plugin
capability summaries, then MCP setup, turn tooling, and `app/list` each
reconstruct the same information. That makes executor-selected
connectors difficult to add without coupling connector behavior to the
host plugin loader.

This PR introduces a small connector-owned value that later stack layers
can populate before thread startup.

## What changed

- Move the pure app-declaration parser into `codex-connectors`,
preserving declaration order and category cleanup while leaving
host-side validation and deduplication unchanged.
- Add an immutable `ConnectorSnapshot` with ordered connector IDs and
plugin display-name provenance.
- Adapt the existing local-plugin capability summaries into that
snapshot at current consumer boundaries.
- Use the snapshot for MCP tool provenance, turn connector inventory,
and `app/list`.
- Keep the crate API narrow: no test-only snapshot accessors are
exposed.

The externally visible behavior is unchanged. Connector tools still come
from the orchestrator-owned `/ps/mcp` server, and local plugin
enablement remains owned by the existing plugin loader.

## Stack scope

This is the foundation only. It does not read selected executor packages
or change thread startup. #29852 adds the executor-backed declaration
reader, and #29856 composes selected declarations into a thread
snapshot.

jif · 2026-06-24 23:24:01 +01:00

4e0f863df3

[codex] dedupe remote control account header (#29893 )

## Why

Remote-control HTTP requests applied the authentication headers and then
appended `ChatGPT-Account-ID` again with
`reqwest::RequestBuilder::header`. Since reqwest appends, the wire
request could contain the same header twice. Intermediaries may coalesce
duplicate values into `uuid,uuid`, which is not a valid account ID.

## What changed

- Build remote-control request authentication headers in one place.
- Apply provider headers first, then use `HeaderMap::insert` for the
explicit account ID. This preserves the current account-ID precedence
and all other authentication headers while ensuring exactly one account
header is sent.
- Preserve duplicate HTTP headers in the test harness and assert exactly
one account header for enroll, refresh, list, and revoke requests.
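
The append-versus-insert distinction can be modeled without any HTTP crate. The `Headers` type below is a toy stand-in for a header map (case-insensitive names, multiple values per name) and `build_request_headers` is a hypothetical name; only the ordering (provider headers first, then a replacing insert of the account ID) follows the description.

```rust
/// Toy multi-map with case-insensitive header names.
#[derive(Default)]
struct Headers(Vec<(String, String)>);

impl Headers {
    /// Adds a value, keeping any existing values for the same name.
    fn append(&mut self, name: &str, value: &str) {
        self.0.push((name.to_ascii_lowercase(), value.to_string()));
    }

    /// Replaces every existing value for the name with a single value.
    fn insert(&mut self, name: &str, value: &str) {
        let name = name.to_ascii_lowercase();
        self.0.retain(|(existing, _)| *existing != name);
        self.0.push((name, value.to_string()));
    }

    fn values(&self, name: &str) -> Vec<&str> {
        let name = name.to_ascii_lowercase();
        self.0
            .iter()
            .filter(|(existing, _)| *existing == name)
            .map(|(_, value)| value.as_str())
            .collect()
    }
}

/// Applies provider headers first, then forces exactly one account header.
fn build_request_headers(provider: &[(&str, &str)], account_id: &str) -> Headers {
    let mut headers = Headers::default();
    for (name, value) in provider {
        headers.append(name, value);
    }
    headers.insert("ChatGPT-Account-ID", account_id);
    headers
}
```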

## Validation

Added focused coverage for:

- Adding the explicit account header when the auth provider omits it.
- Replacing multiple provider-supplied account values, including a
differently cased header name.
- Preserving authorization and routing headers while replacing only the
account header.
- Rejecting invalid account header values before sending a request.
- Emitting exactly one account header for enroll, refresh, list, and
revoke requests.
- Maintaining header uniqueness across unauthorized recovery, retry, and
error-response paths.
- Emitting exactly one installation header for enroll and refresh
requests.

Checks run:

- `just test -p codex-app-server-transport request_headers`: 3 passed
- `just test -p codex-app-server-transport remote_control_http_mode`: 6
passed
- `just test -p codex-app-server-transport clients_tests`: 6 passed
- `just test -p codex-app-server-transport`: 123 passed
- `cargo test -p codex-app-server-transport`: 123 passed
- `just clippy -p codex-app-server-transport`
- `just fmt-check`
- `bazel test
//codex-rs/app-server-transport:app-server-transport-unit-tests`

Shuo · 2026-06-24 15:06:53 -07:00

bb05c1f30f

Pipeline bounded AGENTS.md and Git root probes (#29870 )

## Why

When Codex uses a remote `ExecutorFileSystem`, every `get_metadata` call
is an exec-server round trip. Upward discovery currently pays those
round trips serially in two latency-sensitive places:

- session startup, while locating the configured project root before
loading `AGENTS.md`; and
- Git-root discovery, which runs before per-turn Git diff enrichment.

The goal is to remove the serial ancestor dependency without adding a
new filesystem RPC, JSON-RPC batch method, Git executable dependency, or
cache.

## Example

Assume this layout, with `.git` as the configured project-root marker:

```text
/workspace/repo/.git
/workspace/repo/AGENTS.md
/workspace/repo/crates/core/    <- cwd
```

The marker probes have this required precedence:

```text
1. /workspace/repo/crates/core/.git
2. /workspace/repo/crates/.git
3. /workspace/repo/.git
4. /workspace/.git
5. /.git
```

Previously, probe 2 was not sent until probe 1 returned, and probe 3 was
not sent until probe 2 returned. With this change, the client lazily
keeps up to eight ordinary `fs/getMetadata` requests in flight, but
consumes their results in the order above. Codex must still learn that
probes 1 and 2 are absent before accepting probe 3, so the nearest root
always wins. Once probe 3 succeeds, the client has its answer and stops
awaiting probes 4 and 5. Requests that were already sent may still
finish on the worker.

For the marker phase alone, with a 50 ms client-to-worker round trip and
fast local metadata calls, finding the root at probe 3 changes from
roughly three serialized round trips (150 ms) to one round trip plus
worker processing. The later `AGENTS.md` candidate phase remains
separate and ordered.

Only after `/workspace/repo` is selected does `AGENTS.md` discovery
check instruction candidates, in root-to-cwd order:

```text
/workspace/repo/AGENTS.override.md
/workspace/repo/AGENTS.md
/workspace/repo/crates/AGENTS.override.md
/workspace/repo/crates/AGENTS.md
/workspace/repo/crates/core/AGENTS.override.md
/workspace/repo/crates/core/AGENTS.md
```

The first configured candidate found in each directory wins. These
checks remain ordered and no instruction candidate above
`/workspace/repo` is issued. Git-root discovery uses the same bounded
lookup with only `.git` as the marker.

## What changed

- Added a client-side find-up helper that generates `ancestor x marker`
probes lazily, nearest directory first and configured marker order
within each directory.
- Uses an ordered concurrency window of eight scalar metadata requests.
This bounds executor load while preserving nearest-root and marker
precedence.
- Reuses the helper for both configured project-root discovery and
remote Git-root discovery.
- Keeps Git ancestor and marker construction in `AbsolutePathBuf`,
converting only each complete `.git` probe to `PathUri`. This preserves
native paths that require an opaque URI fallback, such as Windows
namespace paths.
- Preserves existing error behavior: `AGENTS.md` discovery propagates
non-`NotFound` metadata errors, while Git discovery treats a failed
marker probe as absent and continues upward.
- Reads each discovered `AGENTS.md` directly instead of statting it a
second time.

No filesystem trait or exec-server protocol method is added. An empty
`project_root_markers` list performs no ancestor-marker I/O and checks
instruction candidates only in `cwd`. This change also deliberately does
not cache roots across turns.

## Symlinks

Upward traversal remains **lexical**. The helper does not canonicalize
`cwd`; it appends marker names to the supplied path and walks that
path's textual parents. The filesystem performs the actual metadata/read
operation, and the current local and exec-server implementations follow
live symlink targets.

For example:

```text
/tmp/pkg -> /workspace/repo/packages/pkg
cwd = /tmp/pkg/src
actual Git marker = /workspace/repo/.git
```

The lexical probes are `/tmp/pkg/src/.git`, `/tmp/pkg/.git`,
`/tmp/.git`, and `/.git`. They do not jump from `/tmp/pkg` to the
target's parent `/workspace/repo`, so this spelling of `cwd` does not
discover `/workspace/repo/.git`. That is the existing behavior and is
unchanged by this PR.

Conversely, if `/tmp/repo -> /workspace/repo`, then probing
`/tmp/repo/.git` follows the directory symlink and finds
`/workspace/repo/.git`; the reported root remains the lexical path
`/tmp/repo`. A live symlink used directly as `.git`, another configured
marker, or `AGENTS.md` is also followed. A symlinked `AGENTS.md` is
loaded when its target is a regular file, while a broken symlink behaves
as `NotFound`.

jif · 2026-06-24 22:58:34 +01:00

39aab9fc45

[plugins] Track plugin install requests by ID (#29684 )

Summary
- Emit `codex_plugin_install_requested` when a validated plugin install
request is made, before the user accepts or declines the elicitation.
- Record the exact model-visible plugin ID, remote plugin ID, required
connector IDs, stable suggestion ID, and `endpoint_recommendation` vs
`legacy_discovery` source.
- Keep `suggest_reason` out of telemetry and leave connector-only
install requests unchanged.

Rollout
- Backend/schema dependency:
https://github.com/openai/openai/pull/1065270
- Land the backend PR before this producer starts sending the event.

Validation
- `just test -p codex-analytics` (83 passed)
- `just test -p codex-core request_plugin_install` (17 passed)
- `just fix -p codex-analytics`
- `just fix -p codex-core`
- `just fmt`
- `git diff --check`

Alex Daley · 2026-06-24 21:29:11 +00:00

24423f5712

mcp: keep elicitation requests below app wire types (#29724 )

## Why

Core and tools need to request MCP elicitation without constructing
app-server wire payloads. The request should remain a neutral protocol
concept until app-server serializes it for a client.

## What changed

- Switched core and tools to
`codex_protocol::approvals::ElicitationRequest`.
- Derived turn and server context inside core instead of carrying
app-server request types through lower layers.
- Kept the app-server payload unchanged through an explicit boundary
conversion.
- Removed the remaining production app-server-protocol dependency from
tools.

## Stack

This is PR 5 of 6, stacked on [PR
#29723](https://github.com/openai/codex/pull/29723). Review only the
delta from `codex/split-connector-metadata-types`. Next: [PR
#29725](https://github.com/openai/codex/pull/29725).

## Validation

- `codex-core` MCP coverage passed: 87 tests.
- Tools elicitation and app-server round-trip coverage passed.

Adam Perry @ OpenAI · 2026-06-24 20:53:27 +00:00

df1ee09ec5

[apps] Thread structured icon assets through app list (#29889 )

## Summary

- Add `iconAssets` and `iconDarkAssets` to the app-list protocol.
- Preserve structured icons through directory merging and the connector,
app-server, and TUI boundaries.
- Keep legacy logo URLs unchanged as compatibility fallbacks.
- Update generated protocol schemas and TypeScript types.

Drew · 2026-06-24 13:25:44 -07:00

a33ad93996

[codex] Inject agent graph store into ThreadManager (#29736 )

Pick up the AgentGraphStore migration.

- Inject an explicit optional agent graph store into `ThreadManager` 
- Move all calls to spawn, close, recursive resume, and
subtree/archive/delete/feedback traversal through it
- Keep using `LocalAgentGraphStore` when SQLite is available

This required some changes to the interface to deal with futures:

- The interface now matches `ThreadStore`'s object-safe pattern by
returning a boxed `AgentGraphStoreFuture` directly, allowing
`ThreadManager` to hold `Arc<dyn AgentGraphStore>`

*Slight behavior change!* Unfiltered subtree enumeration now performs a
single all-status breadth-first traversal, so a closed grandchild
beneath an open edge is included; the previous Open-then-Closed
traversals could not cross mixed-status paths and silently omitted it.
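
A small sketch of why the traversal order matters, assuming a hypothetical in-memory edge map (`Edges`, `EdgeStatus`, and `subtree` are illustrative names, not the `AgentGraphStore` API):

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum EdgeStatus {
    Open,
    Closed,
}

/// Hypothetical edge map: parent id -> (child id, status of that edge).
type Edges = HashMap<&'static str, Vec<(&'static str, EdgeStatus)>>;

/// Breadth-first subtree enumeration. With `filter = None` every edge is
/// crossed; with `Some(status)` only edges of that status are crossed.
fn subtree(edges: &Edges, root: &str, filter: Option<EdgeStatus>) -> Vec<&'static str> {
    let mut out = Vec::new();
    let mut queue: VecDeque<&str> = VecDeque::from([root]);
    while let Some(node) = queue.pop_front() {
        for &(child, status) in edges.get(node).map(Vec::as_slice).unwrap_or(&[]) {
            if filter.map_or(true, |wanted| wanted == status) {
                out.push(child);
                queue.push_back(child);
            }
        }
    }
    out
}
```

With `root -Open-> child -Closed-> grandchild`, the all-status call returns both descendants. An Open-only pass stops at `child` and a Closed-only pass never leaves `root`, so combining the two passes misses `grandchild`.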

Tom · 2026-06-24 13:24:10 -07:00

ece1dfece0

feat(network-proxy): experimental local credential broker (#28034 )

## Why

Codex child processes can inherit injectable local credentials directly,
which lets commands read and exfiltrate the real values. This
experimental slice keeps supported workflows working while moving those
credentials behind the managed network proxy.

This PR contains only the proxy-owned broker implementation. The Codex
config and runtime integration is stacked separately in #29752.

## What changed

- discover supported credentials during child setup, retain real values
only in the in-memory proxy broker, and replace them with shaped dummy
values
- require a presented dummy to select a stored credential and preserve
unrelated explicit authorization headers
- bind GitHub cloud, GitHub Enterprise, and OpenAI credentials to their
intended hosts
- inject credentials only into TLS traffic by default; plaintext
injection requires the explicit dangerous opt-in
- use TLS ClientHello routing for CONNECT so non-TLS protocols remain
opaque tunnels
- expose a pure API that identifies environment keys still holding
broker-generated dummies without mutating the caller's environment

## Scope

- supported credentials: `GH_TOKEN`, `GITHUB_TOKEN`,
`GH_ENTERPRISE_TOKEN`, `GITHUB_ENTERPRISE_TOKEN`, and `OPENAI_API_KEY`
- GitHub cloud credentials match `github.com`, `api.github.com`, and
`*.ghe.com`
- GitHub Enterprise credentials match only the normalized non-cloud
`GH_HOST`
- OpenAI API keys match only `api.openai.com`
- this does not cover SSH agents, kube client certificates, filesystem
secret discovery, or context-injected secret scrubbing
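
The host-binding rules in this scope list can be sketched as a pure function. Everything here is an assumption for illustration: `CredentialKind`, `host_allowed`, and the normalization (trim, drop a trailing dot, lowercase) are hypothetical and may differ from what the broker does.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CredentialKind {
    GitHubCloud,
    GitHubEnterprise,
    OpenAi,
}

/// Assumed normalization: trim, drop a trailing dot, lowercase.
fn normalize(host: &str) -> String {
    host.trim().trim_end_matches('.').to_ascii_lowercase()
}

/// GitHub cloud hosts: `github.com`, `api.github.com`, and `*.ghe.com`.
fn is_github_cloud(host: &str) -> bool {
    host == "github.com"
        || host == "api.github.com"
        || (host.len() > ".ghe.com".len() && host.ends_with(".ghe.com"))
}

/// Returns true when a credential of `kind` may be injected for `host`.
/// `gh_host` stands in for the value of the `GH_HOST` variable.
fn host_allowed(kind: CredentialKind, host: &str, gh_host: Option<&str>) -> bool {
    let host = normalize(host);
    match kind {
        CredentialKind::GitHubCloud => is_github_cloud(&host),
        // Enterprise credentials match only a configured non-cloud host.
        CredentialKind::GitHubEnterprise => match gh_host.map(normalize) {
            Some(configured) => !is_github_cloud(&configured) && configured == host,
            None => false,
        },
        CredentialKind::OpenAi => host == "api.openai.com",
    }
}
```

Matching on the whole host (or a dot-prefixed suffix) rather than a substring keeps a name such as `github.com.evil.example` from selecting a GitHub credential.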

## Validation

- `just test -p codex-network-proxy` (191 passed)
- focused opaque CONNECT, plaintext opt-in, dummy-selection, and
child-isolation regressions passed
- scoped Clippy check for `codex-network-proxy` passed

---------

Co-authored-by: viyatb-oai <viyatb@openai.com>
Co-authored-by: Codex <noreply@openai.com>

Winston Howes · 2026-06-24 13:21:16 -07:00

989f55defa

feat(app-server): list descendant threads by ancestor (#29591 )

## Why

`thread/list` can filter direct children with `parentThreadId`, but
clients cannot request an entire spawned subtree. Discovering every
descendant requires repeated client-side requests and gives up the
database's existing filtering and pagination path.

## What changed

Experimental clients can use `ancestorThreadId` to return strict
descendants at any depth while `parentThreadId` retains its direct-child
meaning. The filters are mutually exclusive, the ancestor is excluded,
and every result preserves its immediate `parentThreadId` so callers can
reconstruct the tree.

## How it works

- **Explicit relationship:** Internal list parameters distinguish direct
children from transitive descendants without changing the meaning of
`parentThreadId`.
- **Existing graph:** Persisted parent-child spawn edges remain the
source of truth, so descendant lookup needs no schema migration or
ancestry cache.
- **Indexed traversal:** A recursive SQLite query starts from the
parent-edge index, walks each generation, and applies thread filters,
sorting, and cursor pagination in the same database request.
- **Reconstructable results:** The response stays flat and normally
ordered while carrying each descendant's immediate parent.

## Verification

Ran 550 tests across the protocol, state, rollout, and thread-store
crates, then reran the four focused state, store, and app-server
descendant-listing tests after the final diff reduction. Scoped Clippy
and formatting checks passed. Stable and experimental schema generation
was checked; the stable fixtures remain unchanged while the experimental
schema includes the new field.

Brent Traut · 2026-06-24 13:08:14 -07:00

8057603d0c

Skip credential refresh for WindowsApps launch failures (#29637 )

## Summary

- keep the child error 1312 credential retry for normal executables
- return WindowsApps/AppX launch errors directly instead of rotating
sandbox credentials and retrying the same command

## Why

Windows AppX activation can return `ERROR_NO_SUCH_LOGON_SESSION` (1312)
even when the sandbox token is healthy. For executables under
`WindowsApps`, refreshing the sandbox account password cannot fix that
activation failure; it only triggers elevated setup before the same
command fails again.

This is a focused follow-up to #29624.

jif · 2026-06-24 20:59:53 +01:00

3ccef20ef4

Follow directory symlinks in filesystem walks (#29844 )

Stack 3 of 3. Stacked on #29842.

## What changes

Adds an opt-in `followDirectorySymlinks` setting to `fs/walk`.

When enabled, the walk follows directory symlinks but continues to
ignore symlinked files. Canonical directory identities prevent symlink
cycles, while normal paths keep their existing spelling.

Environment skill discovery enables the setting so symlinked skill
directories continue to work with the new single-RPC scan.

jif · 2026-06-24 20:52:36 +01:00

96d8e34712

[codex] Trace exec-server JSON-RPC requests (#27466 )

## Why

Exec-server JSON-RPC calls can cross local and remote transports, but
trace context stopped at the RPC boundary. That made client and server
work difficult to correlate when diagnosing latency or failures.

## What changed

- Propagate the current W3C trace context on outbound JSON-RPC requests.
- Parent inbound request spans from received trace context.
- Record the received JSON-RPC method on server spans and keep each span
open through response enqueue.
- Add only the OTEL dependencies required by the exec-server crate.
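
For readers unfamiliar with W3C trace context, the value that travels with a request is a `traceparent` string of the form `00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>`. The sketch below hand-rolls formatting and parsing for version `00` only; it does not use the OTEL crates, and `TraceContext`, `to_traceparent`, and `parse_traceparent` are hypothetical names. How the exec-server attaches the value to a JSON-RPC request is not shown here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TraceContext {
    trace_id: u128,
    span_id: u64,
    sampled: bool,
}

/// Formats a version-00 `traceparent` value (lowercase, zero-padded hex).
fn to_traceparent(ctx: &TraceContext) -> String {
    format!(
        "00-{:032x}-{:016x}-{:02x}",
        ctx.trace_id,
        ctx.span_id,
        u8::from(ctx.sampled)
    )
}

fn is_hex(field: &str, len: usize) -> bool {
    field.len() == len && field.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Parses a version-00 `traceparent` value; returns `None` when malformed.
fn parse_traceparent(header: &str) -> Option<TraceContext> {
    let parts: Vec<&str> = header.trim().split('-').collect();
    if parts.len() != 4 || parts[0] != "00" {
        return None;
    }
    if !is_hex(parts[1], 32) || !is_hex(parts[2], 16) || !is_hex(parts[3], 2) {
        return None;
    }
    let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
    let span_id = u64::from_str_radix(parts[2], 16).ok()?;
    let flags = u8::from_str_radix(parts[3], 16).ok()?;
    // All-zero trace or parent ids are invalid per the specification.
    if trace_id == 0 || span_id == 0 {
        return None;
    }
    Some(TraceContext { trace_id, span_id, sampled: flags & 0x01 == 0x01 })
}
```

A server that receives a valid value would use `trace_id` and `span_id` as the remote parent of its request span; an invalid or missing value simply starts a new trace.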

## Stack

Review and land this stack in order:

1. #27466 — trace exec-server JSON-RPC requests **(this PR)**
2. #27467 — record bounded connection, request, and process lifecycle
metrics
3. #27470 — observe remote registration and Noise rendezvous lifecycle

## Validation

- `just test -p codex-exec-server --lib` (153 passed)
- `just bazel-lock-check`
- `just fix -p codex-exec-server`

richardopenai · 2026-06-24 12:50:18 -07:00

74dcce594d

Preserve Windows sandbox identity during credential retry (#29624 )

## Summary

- recognize stale Windows sandbox credentials from both runner logon and
child startup failures
- refresh credentials once without changing the original command,
permissions, file rules, desktop mode, or managed-network identity
- add a Windows regression test that forces error 1312 and inspects the
real retry arguments

## Why

Elevated unified exec starts commands in two steps:

```text
Codex -> sandbox command runner -> requested command
```

Either process start can fail when Windows invalidates the sandbox logon
session. The child-side failure was previously returned as text, so the
parent could not reliably recognize Windows error 1312.

The existing retry also refreshed credentials with `proxy_enforced =
false`, even when the original request used managed networking. That
could change the selected Windows sandbox identity from offline to
online during the retry.

## How

- carry the failure stage and numeric Windows error code through the
command-runner IPC protocol
- preserve native `CreateProcessAsUserW` error codes instead of parsing
error messages
- keep every retry-sensitive field in one request and use it for both
attempts
- retry exactly once after refreshing credentials, then return the
second failure
- share the retry rule with the elevated capture path
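
The retry rule above can be sketched as one generic function. The types and names below (`SpawnRequest`, `SpawnError`, `Stage`, `spawn_with_credential_retry`) are hypothetical stand-ins, and the real request carries many more fields than this sketch shows.

```rust
/// Windows error code for ERROR_NO_SUCH_LOGON_SESSION.
const ERROR_NO_SUCH_LOGON_SESSION: u32 = 1312;

#[derive(Debug, Clone, PartialEq, Eq)]
struct SpawnRequest {
    command: Vec<String>,
    proxy_enforced: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    RunnerLogon,
    ChildStartup,
}

#[derive(Debug, PartialEq, Eq)]
struct SpawnError {
    stage: Stage,
    code: u32,
}

/// Spawns once; on error 1312 from either stage, refreshes credentials with
/// the original `proxy_enforced` value and retries exactly once with the
/// same, unmodified request. The second failure is returned as-is.
fn spawn_with_credential_retry<T>(
    request: &SpawnRequest,
    mut spawn: impl FnMut(&SpawnRequest) -> Result<T, SpawnError>,
    mut refresh_credentials: impl FnMut(bool),
) -> Result<T, SpawnError> {
    match spawn(request) {
        Err(err) if err.code == ERROR_NO_SUCH_LOGON_SESSION => {
            refresh_credentials(request.proxy_enforced);
            spawn(request)
        }
        other => other,
    }
}
```

Passing the request by shared reference to both attempts is what makes "same command, same identity" a property of the signature rather than of caller discipline.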

The Windows test injects error 1312 on both attempts and verifies:

- two spawn attempts and one credential refresh
- stale credentials are replaced by refreshed credentials
- both attempts receive the same command, environment, cwd, permissions,
roots, deny paths, TTY settings, and private-desktop mode
- credential refresh receives the original `proxy_enforced` value

## Tests

- `just test -p codex-windows-sandbox`
- the new Windows-only regression test is included in the Windows
nextest CI archive

jif · 2026-06-24 20:20:52 +01:00

4907f0c2c3

[codex] suppress low usage remaining warnings when credits are available (#28593 )

## Why

The TUI computed proactive `Heads up, you have less than ...` warnings
before considering workspace credits. As a result, users could see
included-limit warnings even when they could continue using Codex with
workspace credits.

`has_credits` alone is not sufficient to determine whether finite
credits are usable: a spend-control hard limit can cap the reported
balance to zero while `has_credits` still reflects the workspace's raw
balance. Unlimited credits are the opposite case: they are usable even
though no numeric balance is reported.

## What changed

- suppress proactive TUI rate-limit usage warnings and the lower-cost
model nudge when usable workspace credits are available
- treat credits as usable when `has_credits` is true and either
`unlimited` is true or the parsed balance is positive
- continue showing warnings when the usable balance is zero, including
when a spend-control limit has capped otherwise available workspace
credits
- add regression coverage for zero-balance, positive-balance, and
unlimited workspace-credit snapshots
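
The usability rule reads naturally as a predicate. A minimal sketch, assuming a hypothetical `CreditsSnapshot` whose balance arrives as an optional string (the actual snapshot type may differ):

```rust
/// Hypothetical snapshot shape; field names follow the description above.
struct CreditsSnapshot {
    has_credits: bool,
    unlimited: bool,
    /// Reported balance as text; absent for unlimited credits.
    balance: Option<String>,
}

/// Credits are usable when `has_credits` is true and either `unlimited`
/// is set or the parsed balance is strictly positive.
fn has_usable_credits(snapshot: &CreditsSnapshot) -> bool {
    if !snapshot.has_credits {
        return false;
    }
    if snapshot.unlimited {
        return true;
    }
    snapshot
        .balance
        .as_deref()
        .and_then(|raw| raw.trim().parse::<f64>().ok())
        .map_or(false, |balance| balance > 0.0)
}
```

A snapshot with `has_credits = true` and a balance of `"0"` (for example, capped by a spend-control limit) is not usable, so the warnings still appear.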

## Validation

- `just test -p codex-tui rate_limit_usage_warnings_`

Brooks · 2026-06-24 18:43:17 +00:00

5013d10824

[codex] fix Windows ConPTY input handling (#29734 )

## Why

Windows unified-exec TTY input did not behave like the non-Windows PTY
path. ConPTY sessions could receive the wrong line ending or mishandle
backspace, especially when sending input to a foreground program through
PowerShell or cmd. The local, legacy restricted, and elevated paths also
handled this normalization separately.

## What changed

- share one stateful Windows TTY input normalizer across local, legacy
restricted, and elevated runner paths
- translate LF and split CRLF into one Windows terminal Enter, encode
backspace as DEL, and preserve UTF-8 and control bytes such as Ctrl-C
- add Windows integration coverage for Unicode input, backspace, Enter,
and PowerShell foreground-child Ctrl-C behavior
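
A minimal sketch of a stateful normalizer with this behavior. It assumes that the terminal Enter is a single CR byte and that backspace arrives as `0x08`; `TtyInputNormalizer` is a hypothetical name and this is not the shared implementation.

```rust
/// Remembers whether the previous byte was CR so that a CRLF pair split
/// across two writes still produces a single Enter.
#[derive(Default)]
struct TtyInputNormalizer {
    last_was_cr: bool,
}

impl TtyInputNormalizer {
    fn normalize(&mut self, input: &[u8]) -> Vec<u8> {
        let mut out = Vec::with_capacity(input.len());
        for &byte in input {
            match byte {
                // LF that completes a CRLF: Enter was already emitted for the CR.
                b'\n' if self.last_was_cr => {}
                // Bare LF becomes Enter (assumed to be CR).
                b'\n' => out.push(b'\r'),
                // Backspace (BS) is encoded as DEL.
                0x08 => out.push(0x7f),
                // Everything else, including CR, Ctrl-C, and UTF-8 bytes, passes through.
                other => out.push(other),
            }
            self.last_was_cr = byte == b'\r';
        }
        out
    }
}
```

Keeping the state in the normalizer rather than in each caller is what lets the local, legacy restricted, and elevated paths share one behavior.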

## Validation

- `just test -p codex-utils-pty` (13 tests passed; the Unicode
integration test retried once)
- the Unicode integration test passed five consecutive runs with retries
disabled
- integration coverage sends `cafeé 漢字` through cmd and PowerShell and
verifies that Ctrl-C interrupts a running PowerShell foreground child

iceweasel-oai · 2026-06-24 11:27:44 -07:00

a781761eda

Fix environment skill discovery after merge (#29887 )

## Why

The merge of #29831 with the new `fs/walk` environment discovery path
left three `SkillFileDiscovery` initializers without the new namespace
fields. This makes `codex-core-skills` fail to compile and breaks CI for
every PR based on current `main`.

## What changed

- collect plugin roots from the directory entries already returned by
`fs/walk`
- keep the selected root as the namespace fallback
- initialize empty discovery results with empty namespace sets

This preserves the bounded `fs/walk` implementation while restoring the
namespace caching added by #29831.

jif · 2026-06-24 19:08:39 +01:00

8a6a34be75

ci: fail jobs that dirty the worktree (#29720 )

## Why

CI jobs should not silently leave tracked changes or untracked files in
the repository worktree.

## What

- Add a shared final worktree-cleanliness action to 19 checkout-bearing
PR and main CI jobs.
- Ignore the intentional SDK scratch directory and nested V8 checkout.
- Pin Bazelisk in shared CI setup so `.bazelversion` remains
authoritative, avoiding `MODULE.bazel.lock` deltas on Windows runners.
- Leave `rust-ci-full` and release-only workflows unchanged.
- Update `AGENTS.md` to discourage review bots from asking for
`MODULE.bazel.lock` changes.

Adam Perry @ OpenAI · 2026-06-24 11:06:35 -07:00

93c79046d6

Cache plugin namespace during executor skill discovery (#29831 )

## Why

Executor skill discovery runs before the remote skills catalog is
available. For a remote environment, each `ExecutorFileSystem` operation
becomes an exec-server RPC.

Previously, every discovered `SKILL.md` independently resolved its
plugin namespace by walking its ancestors and probing both supported
manifest locations. In the common `plugin/skills/<skill>/SKILL.md`
layout, that repeats 8 RPCs per skill even though every skill under the
plugin root uses the same namespace. These lookups happen while skills
are parsed, so their cost grows linearly with the skill count and adds
directly to first-turn latency.

A selected capability root can also contain standalone skills, multiple
sibling plugins, nested plugins, or symlinked directories. The
optimization therefore needs to retain the nearest-ancestor namespace
for each skill rather than assuming the selected root represents exactly
one plugin.

## What changed

- record plugin-root candidates from directory entries already returned
during skill discovery
- prune candidates that are not ancestors of any discovered `SKILL.md`
before reading manifests
- resolve each relevant plugin root once, with one fallback lookup per
canonical traversal root for symlinked directories
- select the nearest cached plugin namespace for each discovered skill
- avoid namespace lookup entirely when the root contains no skills

No additional directory traversal is required. Namespace work now scales
with the number of plugin roots that contain discovered skills, rather
than the total number of skills or unrelated sibling plugins. Standalone
and nested-plugin names keep their previous behavior.
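
The nearest-ancestor selection can be sketched as a lookup over the skill's ancestors. This assumes the plugin roots were already resolved into a map from root directory to namespace; `nearest_namespace` and the map shape are hypothetical.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Returns the namespace of the closest cached plugin root that is an
/// ancestor of `skill_md`, or `None` for a standalone skill.
fn nearest_namespace<'a>(
    skill_md: &Path,
    namespaces: &'a HashMap<PathBuf, String>,
) -> Option<&'a str> {
    skill_md
        .ancestors()
        .skip(1) // skip the SKILL.md path itself
        .find_map(|dir| namespaces.get(dir).map(String::as_str))
}
```

Because `ancestors()` walks from the innermost directory outward, a nested plugin wins over the plugin that contains it, and each lookup is in-memory rather than an RPC.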

## Benchmarks

I used a temporary counting `ExecutorFileSystem` around the real local
filesystem. Each filesystem operation was counted as one remote RPC and
given 1 ms of injected latency. Each variant ran three times; times
below are medians.

### One plugin with 100 skills

| Operation | Before | After | Delta |
| --- | ---: | ---: | ---: |
| `get_metadata` | 1,002 | 303 | -699 |
| `read_file` | 200 | 101 | -99 |
| `read_directory` | 102 | 102 | 0 |
| **Total filesystem RPCs** | **1,304** | **506** | **-798 (-61.2%)** |
| **Median load time** | **2.890 s** | **0.997 s** | **2.90× faster** |

The namespace-specific work drops from 800 RPCs to 2 in this layout.

### Multiple plugins under one selected root

These runs compare the correct pre-optimization implementation with the
final nearest-plugin-root cache. The total plugin skill count stays at
100 while the number of plugin roots changes.

| Layout | Before RPCs | After RPCs | Reduction | Before time | After time | Speedup |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| 2 plugins × 50 skills | 1,312 | 530 | 59.6% | 1,819 ms | 711 ms | 2.56× |
| 10 plugins × 10 skills | 1,344 | 578 | 57.0% | 1,850 ms | 778 ms | 2.38× |
| 50 plugins × 2 skills | 1,504 | 818 | 45.6% | 2,094 ms | 1,086 ms | 1.93× |
| 10 plugins × 10 skills + 10 standalone skills | 1,596 | 630 | 60.5% | 2,209 ms | 860 ms | 2.57× |

The remaining cost grows with the number of relevant plugin manifests.
Each relevant manifest is read once instead of once per skill, while
sibling plugins with no discovered skills are not read. Absolute latency
savings depend on the executor's real RPC latency.

## Tests

- `just test -p codex-core-skills` (109 passed across the library and
integration-test binaries)
- one integration test covers standalone, outer-plugin, nested-plugin,
and unused sibling-plugin layouts, and asserts the exact set of
manifests read

jif · 2026-06-24 17:14:34 +01:00

390b73133b

[codex] show external import result counts (#29567 )

## What changed

- Show per-type import counts in the `/import` review UI and started
message.
- Render completion results as a multi-line summary with total
imported/failed counts and one row per import type.
- Add snapshot coverage for the updated review and completion output.

<img width="537" height="322" alt="Screenshot 2026-06-23 at 9 41 20 PM"
src="https://github.com/user-attachments/assets/166542eb-2097-4b2b-8130-8f6fd8c680ce"
/>


## Why

The TUI previously only reported that Claude Code import started or
finished. Users could not see how many items of each type were selected
or how many actually imported versus failed.

charlesgong-openai · 2026-06-24 08:56:57 -07:00

3694b48a82

Use fs/walk for environment skill discovery (#29842 )

Stack 2 of 3. Base: #29841. Follow-up: #29844.

## What changes

Environment skill discovery currently walks remote filesystems through
repeated `readDirectory` and `getMetadata` calls. This switches that
scan to the bounded `fs/walk` operation from the base PR.

```text
Before: readDirectory(root) -> getMetadata(...) -> readDirectory(child) -> ...
After:  fs/walk(root, limits) -> filter the result for SKILL.md
```

This makes environment skill discovery one RPC while preserving
traversal warnings and the existing depth and directory limits. The scan
also has an explicit entry limit. The follow-up restores
directory-symlink traversal.

jif · 2026-06-24 16:32:35 +01:00

69b76e9d07

Add a bounded filesystem walk RPC (#29841 )

Stack 1 of 3. Follow-ups: #29842 and #29844.

## What changes

Adds a general bounded `fs/walk` operation to the exec server.

The operation returns file and directory entries plus recoverable
per-path errors. It skips symlinks, preserves the existing filesystem
sandbox routing, and enforces depth, directory, entry, and response-size
limits.

This PR only defines and wires the filesystem operation. It does not
change any callers yet.
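
To make the limits concrete, here is a minimal sketch of a bounded walk over the local filesystem using only `std::fs`. It is not the exec-server implementation: `WalkLimits`, `WalkResult`, and `bounded_walk` are hypothetical, only depth and entry limits are modeled, and sandbox routing and response-size accounting are left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

struct WalkLimits {
    max_depth: usize,
    max_entries: usize,
}

#[derive(Default)]
struct WalkResult {
    entries: Vec<PathBuf>,
    /// Recoverable errors: directories that could not be read.
    errors: Vec<(PathBuf, io::ErrorKind)>,
    /// Set when the entry limit stopped the walk early.
    truncated: bool,
}

fn bounded_walk(root: &Path, limits: &WalkLimits) -> WalkResult {
    let mut result = WalkResult::default();
    let mut stack = vec![(root.to_path_buf(), 0usize)];
    while let Some((dir, depth)) = stack.pop() {
        let reader = match fs::read_dir(&dir) {
            Ok(reader) => reader,
            Err(err) => {
                result.errors.push((dir, err.kind()));
                continue;
            }
        };
        // Entries that fail to read are skipped in this sketch.
        for entry in reader.flatten() {
            let Ok(file_type) = entry.file_type() else { continue };
            if file_type.is_symlink() {
                continue; // symlinks are skipped, never followed
            }
            if result.entries.len() >= limits.max_entries {
                result.truncated = true;
                return result;
            }
            let path = entry.path();
            if file_type.is_dir() && depth < limits.max_depth {
                stack.push((path.clone(), depth + 1));
            }
            result.entries.push(path);
        }
    }
    result
}
```

Returning errors and a truncation flag alongside the entries lets a caller report partial results instead of failing the whole scan.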

jif · 2026-06-24 16:05:43 +01:00

c14623d04c

Persist agent messages as response items (#29829 )

## Why

Inter-agent messages are recorded in live history as
`ResponseItem::AgentMessage`, but rollouts stored
`InterAgentCommunication` and rebuilt the response item during resume.
This made the rollout differ from the actual Responses history.

## What changed

- store the prepared `agent_message` response item directly
- keep `trigger_turn` in a small local metadata record for fork
truncation
- keep reading older `inter_agent_communication` rollout items

jif · 2026-06-24 15:43:10 +01:00

b4f0f3eff1

[codex] Emit implicit skill usage for support reads (#29731 )

## Summary
- Index all enabled skills for command-based usage detection, regardless
of `allow_implicit_invocation`.
- Preserve `allow_implicit_invocation` for the model-visible implicit
routing list.
- Add regression coverage for a support/preflight skill whose `SKILL.md`
is read and whose script is run while implicit invocation is disabled.

## Root cause
`allow_implicit_invocation` was used for both model routing and
command-based usage-event detection. That meant support skills like
`data-analytics:user-context` could be read or run by other skills, but
those accesses could not emit implicit usage events.

## Validation
- `just fmt`
- `just test -p codex-core-skills
service::tests::skills_for_config_indexes_usage_detection_for_non_implicit_skills`
- `just test -p codex-core-skills`: the new test passes, but 3 unrelated
local tests fail because `/Users/alexsong/.agents/skills/test/SKILL.md`
has invalid or missing YAML frontmatter.

alexsong-oai · 2026-06-24 08:57:34 +00:00

f959e7fc98

Keep executor plugin MCP paths URI-native (#29628 )

## Why

Executor-owned plugin roots are `PathUri`, but MCP config normalization
still converts them into a native `Path` using the app-server host's
rules. Relative `cwd` values can therefore resolve against the wrong
filesystem when host and executor path conventions differ.

This PR keeps executor MCP paths URI-native until the selected
environment launches the server, while retaining the existing host
parser behavior.

## What changed

- Keep one shared MCP normalization path with narrow host-`Path` and
executor-`PathUri` entrypoints.
- Preserve native host resolution for locally installed plugin MCP
configs.
- For executor configs, default `cwd` to the plugin root and resolve
relative working directories with the root URI's path convention.
- Accept explicit executor `file:` URIs only when they remain within the
selected plugin root.
- Preserve the selected environment id and existing remote
environment-variable ownership rules.
- Route the executor plugin provider through the URI-native entrypoint
without converting the root on the host.
- Ensure `codex doctor` does not probe executor-owned stdio commands or
foreign working directories on the host.
- Cover foreign Windows roots, relative and absolute executor working
directories, traversal rejection, runtime resolution, and doctor
behavior.

```text
plugin root:    file:///C:/plugins/demo
configured cwd: scripts
                  |
                  v
resolved cwd:  file:///C:/plugins/demo/scripts
                  |
                  v
launch through the selected executor
```

No new provider or filesystem abstraction is introduced.
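
The resolution in the diagram, plus the containment rule for explicit `file:` URIs, can be sketched with plain string handling. This is an illustration under stated assumptions, not the `PathUri` API: `resolve_executor_cwd` is hypothetical, it splits only on `/`, ignores percent-encoding, and rejects non-URI absolute paths as out of scope.

```rust
/// Resolves an executor `cwd` against the plugin root URI without
/// converting anything to a host `Path`. Returns `None` when the result
/// would leave the plugin root.
fn resolve_executor_cwd(root_uri: &str, cwd: Option<&str>) -> Option<String> {
    let root = root_uri.trim_end_matches('/');
    let relative = match cwd {
        // Default: the plugin root itself.
        None => return Some(root.to_string()),
        // Explicit file: URI: accepted only if it stays under the root.
        Some(cwd) if cwd.starts_with("file:") => {
            let rest = cwd.strip_prefix(root)?;
            if !(rest.is_empty() || rest.starts_with('/')) {
                return None; // e.g. ".../demo-evil" merely shares a prefix
            }
            rest
        }
        // Non-URI absolute paths are not handled by this sketch.
        Some(cwd) if cwd.starts_with('/') => return None,
        Some(cwd) => cwd,
    };
    let mut segments: Vec<&str> = Vec::new();
    for segment in relative.split('/') {
        match segment {
            "" | "." => {}
            // Popping past the root means the path escapes it.
            ".." => {
                segments.pop()?;
            }
            other => segments.push(other),
        }
    }
    let mut resolved = root.to_string();
    for segment in segments {
        resolved.push('/');
        resolved.push_str(segment);
    }
    Some(resolved)
}
```

Because the function never touches the host filesystem, a Windows-style root such as `file:///C:/plugins/demo` resolves the same way on a Unix host.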

## Stack

1. #29614 — add lexical `PathUri` containment.
2. #29620 — share URI-native manifest path resolution.
3. #28918 — keep selected plugin roots and resources URI-native.
4. #29626 — load executor skills without host path conversion.
5. **This PR** — resolve executor MCP working directories without host
path conversion.

jif · 2026-06-24 09:46:07 +01:00

3e39e92f03

[codex] Remove auto-compaction opt-out (#29815 )

## Summary

- remove the default-on `auto_compaction` feature flag and generated
config schema entries
- restore unconditional pre-turn, model-switch/hash, and mid-turn
automatic compaction
- expose `new_context` whenever token-budget tooling is enabled
- remove the disabled-auto-compaction integration coverage introduced by
#28260

## Motivation

Roll back the internal auto-compaction escape hatch added in #28260.
Automatic compaction should no longer be suppressible with `--disable
auto_compaction`; existing manual `/compact` behavior remains unchanged.

## Testing

- `just write-config-schema`
- `just test -p codex-features` — 53 passed
- `just test -p codex-core 'suite::compact::'` — 36 passed
- `just test -p codex-core
suite::token_budget::new_context_tool_starts_new_window_before_follow_up`
— 1 passed
- `just fix -p codex-core -p codex-features`
- `just fmt`
- `just test -p codex-core` — 2,778 passed, 59 failed, 16 skipped;
failures were outside the changed compaction paths and were dominated by
missing first-party test binaries and shell-snapshot timeouts

rhan-oai · 2026-06-24 00:15:04 -07:00

2a320fedb5

docs: document remote executor integration testing (#29790 )

## Why

Agents need a clear default for writing remote-compatible integration
tests and reproducible commands for each supported runner.

## What

Expand the `remote-tests` skill with fixture guidance, skip selection,
and Docker and Wine commands. Add always-visible `AGENTS.md` guidance
that points new core and app-server tests toward automatic environment
fixtures.

Stacked on #29789.

Adam Perry @ OpenAI · 2026-06-24 05:55:36 +00:00

31e428a1ef

test: use automatic environments in app-server integration tests (#29789 )

## Why

Topology-neutral app-server integration tests should exercise automatic
environment selection so the same setup covers local and remote
executors.

## What

Migrate eligible tests to `TestAppServer::new_with_auto_env()` and
`send_thread_start_request_with_auto_env()`. Leave explicit-topology
tests unchanged, and skip the request-permissions case on Windows with a
TODO for cross-platform tool routing.

## Validation

- `just test -p codex-app-server`
- `bazel test //codex-rs/app-server:app-server-all-wine-exec-test
--test_output=errors`

Stacked on #29788.

Adam Perry @ OpenAI · 2026-06-23 22:48:06 -07:00

c2b3e3b4f5

test: run app-server integration tests under Wine (#29788 )

## Why

I made a mistake when carving #29746 out of my local changes, so the
test was missing from the build graph. Oops!

## What

Enable the app-server Wine exec test target. Remove the `manual` tag
from generated Wine-exec test variants so wildcard Bazel test
invocations select them. Refactor the smoke test to ensure it passes
with current Windows support.

Adam Perry @ OpenAI · 2026-06-24 05:23:29 +00:00

b17f30eb2a

connectors: own app metadata types (#29723 )

## Why

Connector metadata is consumed by connector discovery, ChatGPT
integration, core, and TUI code. Treating app-server's wire DTO as the
shared domain model reverses the intended dependency direction.

## What changed

- Added connector-owned app branding, review, screenshot, metadata, and
info types.
- Added explicit conversions in app-server and TUI while preserving
app-server's wire payloads.
- Removed production app-server-protocol dependencies from connectors
and ChatGPT connector code.

## Stack

This is PR 4 of 6, stacked on [PR
#29722](https://github.com/openai/codex/pull/29722). Review only the
delta from `codex/split-config-layer-types`. Next: [PR
#29724](https://github.com/openai/codex/pull/29724).

## Validation

- Connector and tools coverage passed.
- App-server app-list coverage passed: 13 tests.

Adam Perry @ OpenAI · 2026-06-23 22:08:23 -07:00

e639e8c4bd

config: own layer provenance types (#29722 )

## Why

Config layer provenance describes how effective configuration was
assembled, so it belongs with the config loader rather than in
app-server's serialized API types.

## What changed

- Moved `ConfigLayerSource`, `ConfigLayerMetadata`, and `ConfigLayer`
ownership into `codex-config`.
- Kept app-server's wire payloads unchanged and added explicit
conversions at the app boundary.
- Removed lower-level app-server-protocol dependencies from config
consumers.

## Stack

This is PR 3 of 6, stacked on [PR
#29721](https://github.com/openai/codex/pull/29721). Review only the
delta from `codex/split-auth-domain-types`. Next: [PR
#29723](https://github.com/openai/codex/pull/29723).

## Validation

- `codex-config` coverage passed.
- App-server config-manager and config RPC coverage passed.

Adam Perry @ OpenAI · 2026-06-24 04:03:04 +00:00

1d65ccabd5

7833 Commits