codex

[codex] Enable remote plugins by default (#30297)

## Summary

- enable the remote plugin feature by default
- promote the remote plugin feature from under development to stable
- preserve the existing `features.remote_plugin` override for explicitly
disabling it
- keep legacy disabled-path coverage explicit in TUI and app-server
tests

## Impact

Remote plugin functionality is enabled by default for configurations
that do not set the feature flag. The existing Codex backend
authentication gate still applies.

## Validation

- `just fmt`
- `just test -p codex-features`
- `just test -p codex-tui
plugins_popup_remote_section_fallback_states_snapshot`
- targeted `codex-app-server` plugin-list and skills-list tests
- `git diff --check`

The full TUI and app-server suites were also exercised locally. All
remote-plugin-related coverage passed; unrelated local
sandbox/test-binary failures remain outside this change.

xl-openai · 2026-06-28 11:46:25 -07:00

e428a12d22

[codex] wire process-owned code mode host into core (#30142)

## Summary

- add the `code_mode_host` feature flag and select
`ProcessOwnedCodeModeSessionProvider` in `CodeModeService` when enabled
- initialize code-mode sessions lazily so a missing host reports a tool
error without failing thread startup
- resolve `codex-code-mode-host` beside the running Codex binary by
default while preserving `CODEX_CODE_MODE_HOST_PATH` as an override
- add unit and end-to-end coverage for host resolution and graceful
missing-host behavior

## Why

This wires the process-owned session client from #30112 into the core
service behind an opt-in rollout gate. Packaged Codex installations can
place the helper in the same `bin` directory as the main executable
without relying on `PATH`, while development and custom installations
can continue to override the helper path.
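
The lookup order described above can be sketched as follows. This is an illustrative Python sketch, not the Rust implementation: the function name `resolve_code_mode_host` is hypothetical, and only the `CODEX_CODE_MODE_HOST_PATH` variable and the `codex-code-mode-host` binary name come from this PR. Platform-specific suffixes such as `.exe` are ignored here.

```python
from pathlib import Path

HOST_BINARY = "codex-code-mode-host"
OVERRIDE_ENV = "CODEX_CODE_MODE_HOST_PATH"


def resolve_code_mode_host(current_exe, env):
    """Return the helper path: explicit override first, else a sibling of the running binary.

    `current_exe` and `env` are passed in so the sketch stays testable;
    a real caller would supply the running executable and the process environment.
    """
    override = env.get(OVERRIDE_ENV)
    if override:
        return Path(override)
    # Default: look in the same directory as the main executable, not on PATH.
    return Path(current_exe).parent / HOST_BINARY
```

Because resolution is a pure path computation, a missing helper only surfaces when a code-mode session is first started, which matches the lazy initialization described in the summary.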

## Stack

- Depends on #30112
- Base branch: `cconger/process-owned-session-runtime-4-client`

## Validation

Build `codex` and `codex-code-mode-host`, then run:

`CODEX_CODE_MODE_HOST_PATH="$PWD/target/debug/codex-code-mode-host"
./target/debug/codex --enable code_mode_host`

Channing Conger · 2026-06-26 00:23:33 -07:00

7d8906b478

[codex] add current time reminder delivery mode config (#30031)

```toml
delivery_mode = "any_inference" # default
delivery_mode = "after_user_or_tool_output" # new mode
```

## Validation
- just test -p codex-core load_config_resolves_current_time_reminder
- just test -p codex-core
lock_contains_prompts_and_materializes_features

rka-oai · 2026-06-25 19:06:43 +00:00

e8d4a1a411

[codex] current time reminder interval to be set to 0 (#30029)

A zero interval lets callers request a reminder at every
otherwise-eligible inference boundary.

## Validation
- just test -p codex-core load_config_resolves_current_time_reminder

rka-oai · 2026-06-25 18:30:53 +00:00

cc78903379

core: raise token budget message limits (#29970)

## Why

Token-budget reminder and guidance messages can require more than 1,000
bytes to provide useful model-facing instructions. At the same time,
these strings are injected into model-visible context, so their size
must remain tightly bounded in response to the P0 context-growth
concern. A 2,000-byte runtime cap provides additional room without
allowing the substantially larger context growth of a 4 KiB limit.

## What changed

- raises the runtime byte limits for token-budget reminder templates and
guidance messages from 1,000 to 2,000
- raises the corresponding JSON Schema `maxLength` values to 2,000
- regenerates `codex-rs/core/config.schema.json`
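
A minimal sketch of the runtime byte check, in Python for illustration only. The function and constant names are hypothetical; only the 2,000-byte figure comes from this PR. The cap is assumed to apply to the UTF-8 encoding, whereas JSON Schema `maxLength` counts characters, so the two limits can differ for non-ASCII text.

```python
MAX_MESSAGE_BYTES = 2_000  # runtime cap stated in this PR


def check_message_size(text, limit=MAX_MESSAGE_BYTES):
    """Return the UTF-8 size of `text`, or raise if it exceeds `limit` bytes."""
    size = len(text.encode("utf-8"))
    if size > limit:
        raise ValueError(f"message is {size} bytes; limit is {limit}")
    return size
```

For example, 1,001 copies of `é` pass a 2,000-character schema check but encode to 2,002 bytes and would be rejected by this byte-based check.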

## Testing

- `just test -p codex-features`
- `just test -p codex-core load_config_resolves_token_budget_config
load_config_rejects_invalid_token_budget_reminder_template`

The full `codex-core` test run completed 2,858 tests successfully and
encountered seven unrelated environment-sensitive failures involving
Seatbelt/network environment assertions, MCP capability setup, and abort
timing.

Michael Bolin · 2026-06-25 05:05:32 +00:00

22f12568e1

core: add configurable <context_window_guidance> message (#29936)

## Why

This PR adds a configurable `<context_window_guidance>` developer
section immediately after `<context_window>`. Harness integrations need
this section to give the model deployment-specific instructions for
preparing for context-window transitions.

## What changed

- Add an optional `features.token_budget.guidance_message` config with a
1,000-byte runtime cap and generated schema support.
- Render configured guidance as a developer `ContextualUserFragment`
wrapped in `<context_window_guidance>` immediately after
`<context_window>`.
- Omit the section when guidance is unset, empty, or whitespace-only.
- Preserve the resolved value in config locks and classify persisted
guidance as contextual developer content.
- Add integration coverage for rendered content and ordering.
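
The rendering and omission rules above can be sketched like this. It is a Python illustration under assumptions: `render_guidance` is a hypothetical name, and the exact whitespace inside the tag is not specified by this PR.

```python
def render_guidance(message):
    """Wrap configured guidance in its tag, or return None when there is nothing to render."""
    # Unset, empty, and whitespace-only values all omit the section.
    if message is None or not message.strip():
        return None
    # Assumption: the message is emitted verbatim between the tags.
    return f"<context_window_guidance>\n{message}\n</context_window_guidance>"
```

A caller would append the result immediately after the `<context_window>` section and skip it when the function returns `None`.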

Michael Bolin · 2026-06-24 18:03:44 -07:00

f15df624a6

[codex] nest sleep config under current time reminder (#29910)

## Summary

- move sleep tool enablement from top-level `[features].sleep_tool` to
`[features.current_time_reminder].sleep_tool`
- remove the standalone `Feature::SleepTool` flag and gate `clock.sleep`
from resolved current-time configuration
- update config schema, config-lock materialization, and existing sleep
coverage

Stacked on #29907.

rka-oai · 2026-06-24 17:49:00 -07:00

35f5d02464

[codex] Remove auto-compaction opt-out (#29815)

## Summary

- remove the default-on `auto_compaction` feature flag and generated
config schema entries
- restore unconditional pre-turn, model-switch/hash, and mid-turn
automatic compaction
- expose `new_context` whenever token-budget tooling is enabled
- remove the disabled-auto-compaction integration coverage introduced by
#28260

## Motivation

Roll back the internal auto-compaction escape hatch added in #28260.
Automatic compaction should no longer be suppressible with `--disable
auto_compaction`; existing manual `/compact` behavior remains unchanged.

## Testing

- `just write-config-schema`
- `just test -p codex-features` — 53 passed
- `just test -p codex-core 'suite::compact::'` — 36 passed
- `just test -p codex-core
suite::token_budget::new_context_tool_starts_new_window_before_follow_up`
— 1 passed
- `just fix -p codex-core -p codex-features`
- `just fmt`
- `just test -p codex-core` — 2,778 passed, 59 failed, 16 skipped;
failures were outside the changed compaction paths and were dominated by
missing first-party test binaries and shell-snapshot timeouts

rhan-oai · 2026-06-24 00:15:04 -07:00

2a320fedb5

[core] debounce current-time reminders by elapsed time (#29659)

## Summary
- rename `reminder_interval_model_requests` to
`reminder_interval_seconds`
- read the configured time provider before every model request and
inject a reminder only after the configured number of seconds has
elapsed
- preserve immediate first delivery and forced delivery after compaction
changes the context window
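
A rough sketch of elapsed-time debouncing with immediate first delivery and forced delivery. This is illustrative Python, not the Rust code: `ReminderDebouncer` and its methods are hypothetical names, and `now` stands in for a reading from the configured time provider.

```python
class ReminderDebouncer:
    """Decide before each model request whether a current-time reminder is due."""

    def __init__(self, interval_seconds):
        self.interval_seconds = interval_seconds
        self.last_sent = None  # no reminder delivered yet

    def should_send(self, now, force=False):
        # Due on first use, when forced (for example after compaction),
        # or once the configured number of seconds has elapsed.
        due = (
            self.last_sent is None
            or force
            or now - self.last_sent >= self.interval_seconds
        )
        if due:
            self.last_sent = now
        return due
```

With this comparison, an interval of `0` makes every call due, so each eligible request would carry a reminder.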

## Tests
- `just test -p codex-core current_time_reminder`

rka-oai · 2026-06-23 10:13:27 -07:00

9fe689783d

[codex] Use tool search for MCP tools by default (#29486)

## Why

MCP tools were only placed behind `tool_search` when a feature flag was
enabled or when there were at least 100 tools. That made the model's
tool flow depend on both rollout configuration and the number of
installed tools.

The searched-tool flow is now the intended behavior. Making it
unconditional when the model and provider support it gives every
supported setup the same behavior and lets us retire the feature flag
safely.

## What changed

- Defer all effective MCP tools when `tool_search` and namespaced tools
are supported.
- Keep exposing MCP tools directly when search cannot be used, so older
or unsupported model/provider combinations still work.
- Mark `tool_search_always_defer_mcp_tools` as removed and ignore old
configured values.
- Keep plugin filtering, app-only filtering, file handling, and MCP
calls working through the searched-tool flow.
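
The exposure decision above reduces to a single capability check. A minimal Python sketch, assuming a hypothetical `plan_mcp_tools` helper and a simplified result shape that is not part of this PR:

```python
def plan_mcp_tools(mcp_tools, supports_tool_search, supports_namespaced_tools):
    """Split tools into those sent in the first request and those found via search."""
    if supports_tool_search and supports_namespaced_tools:
        # Searched-tool flow: the model sees `tool_search` first and
        # discovers MCP tools through search results.
        return {"direct": ["tool_search"], "deferred": list(mcp_tools)}
    # Fallback for unsupported model/provider combinations: expose tools directly.
    return {"direct": list(mcp_tools), "deferred": []}
```

Note that the tool count no longer appears in the decision, which is the behavioral change this PR describes.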

## Why many tests changed

Many tests used to act as if the model could see MCP tools in its first
request and call them immediately. That is no longer the real flow: the
model first receives `tool_search`, searches for a tool, receives the
matching MCP tool, and then calls it in the next request.

The tests therefore needed an extra search step, and checks for tool
names, descriptions, and input fields had to move from the first request
to the search result. These are not separate product changes; they make
the tests follow what the model will actually see after this change. The
plugin tests still check which tools are allowed and where they came
from, the file tests still check upload fields and behavior, and the MCP
round-trip test still checks a successful call from start to finish.

## Tests

- `just test -p codex-features`
- Focused `codex-core` tests for MCP exposure and tool planning
- `just test -p codex-core explicit_plugin_mentions`
- `just test -p codex-core stdio_server_round_trip`
- Focused `codex-core` tests for tool search, app-only tools, and MCP
file uploads

sayan-oai · 2026-06-22 16:45:23 -07:00

c53b1dae09

Register full CDP requirements feature (#28769)

Register the CDP requirements feature flag.

Samuel Yuan · 2026-06-22 22:08:15 +00:00

ff37f4a6ef

[codex] configure rollout budget reminder thresholds (#29423)

## Summary

Instead of:

    reminder_interval_tokens = 65_536

allow users to configure explicit remaining-token reminder thresholds:

    reminder_at_remaining_tokens = [65_536, 32_768, 16_384, 8_192, 4_096, 2_048, 1_024, 512]
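
One way to read explicit thresholds is that a reminder becomes due for each threshold that the remaining-token count drops past. The Python sketch below illustrates that reading only; `crossed_thresholds` is a hypothetical helper, and whether the real implementation fires once per threshold or coalesces several crossings is not stated in this PR.

```python
def crossed_thresholds(thresholds, remaining_before, remaining_after):
    """Return thresholds passed while remaining tokens fell from `before` to `after`."""
    return [
        t
        for t in sorted(thresholds, reverse=True)
        if remaining_after <= t < remaining_before
    ]
```

For instance, dropping from 70,000 to 30,000 remaining tokens passes the 65,536 and 32,768 thresholds but not 16,384.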

## Validation

- CARGO_INCREMENTAL=0 just test -p codex-core rollout_budget: 9 passed
- just fix -p codex-core
- just fmt

rka-oai · 2026-06-22 13:25:48 -07:00

bd5bd953fb

remove flag for image preparation (#29429)

## What

- make Fjord's centralized response-item image preparation unconditional
for new and resumed history
- have local user images and `view_image` outputs always defer decoding
and resizing to that path
- retain `resize_all_images` as an ignored, removed compatibility key
for released clients
- delete the flag-off producer paths and obsolete policy-specific tests

## Why

Centralized preparation is now the intended image path. Keeping the
runtime feature checks also kept two image-processing implementations
alive and allowed client config to select the legacy behavior.

This is a clean replacement for #28975, rebuilt from the latest `main`.

## How

`prepare_response_items` now runs whenever items enter history and
whenever persisted history is reconstructed. Producers emit deferred
image data, so malformed images become the existing model-visible
placeholder instead of failing the session at the producer.

## Test plan

- `just fmt`
- `just fix -p codex-core -p codex-features`
- `just test -p codex-features` — 52 passed
- focused affected `codex-core` set — 20 passed
- `just test -p codex-core handle_accepts_explicit_high_detail` — 1
passed
- full `just test -p codex-core` attempt — 2,723 passed; 88 unrelated
environment failures from read-only `~/.codex` SQLite state and
unavailable integration helper binaries

rka-oai · 2026-06-22 10:05:11 -07:00

e79d72d75d

Simplify multi-agent mode controls (#29324)

## Why

Multi-agent delegation policy was split across `multiAgentMode`,
`features.multi_agent_mode`, and `usage_hint_enabled`. These controls
could disagree: a requested mode could be downgraded by the feature
flag, and disabling usage hints also disabled mode instructions.

Some clients also need multi-agent tools without adding
delegation-policy text to model context. The previous two-mode API could
not express that directly.

## What changed

`multiAgentMode` is now the only live delegation-policy control:

| Mode | Behavior |
| --- | --- |
| `none` | Keep multi-agent tools available without adding mode
instructions. |
| `explicitRequestOnly` | Only delegate after an explicit user request.
|
| `proactive` | Delegate when parallel work materially improves speed or
quality. |

- new threads default to `explicitRequestOnly`; omitting the mode on
later turns keeps the current value
- thread start, resume, fork, and settings responses always report the
concrete current mode instead of `null`
- mode selection remains sticky across turns and resume
- usage-hint text no longer controls whether mode instructions apply
- `features.multi_agent_mode` and `usage_hint_enabled` remain accepted
as ignored compatibility settings so existing configs continue to load
- app-server documentation and generated schemas describe the three-mode
API
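
The default and sticky-selection rules can be sketched as a small state update. This is illustrative Python with a hypothetical `next_mode` helper; only the three mode names and the `explicitRequestOnly` default come from this PR.

```python
MODES = ("none", "explicitRequestOnly", "proactive")
DEFAULT_MODE = "explicitRequestOnly"


def next_mode(current, requested):
    """Return the thread's mode after a request that may omit `multiAgentMode`."""
    if requested is None:
        # Omitted: new threads get the default, later turns keep the current value.
        return current if current is not None else DEFAULT_MODE
    if requested not in MODES:
        raise ValueError(f"unknown multi-agent mode: {requested}")
    return requested
```

Because the function always returns a concrete mode, responses built from it never need to report `null`.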

## Tests

- `just test -p codex-core multi_agent_mode`
- `just test -p codex-core multi_agent_v2_config_from_feature_table`
- `just test -p codex-core spawn_agent_description`
- `just test -p codex-features`
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server multi_agent_mode`

jif · 2026-06-22 10:05:36 +02:00

c03742ca0a

[codex] Add internal auto-compaction opt-out (#28260)

## Summary

- add a default-on `auto_compaction` feature flag as an internal escape
hatch
- skip pre-turn, model-switch/hash, and mid-turn automatic compaction
when the flag is disabled
- preserve manual `/compact` behavior and surface the existing
context-window error when the provider runs out of room
- add integration coverage for disabled pre-turn and mid-turn compaction

## Motivation

Long-running SPO optimization rollouts need the option to preserve their
full context and fail on context exhaustion instead of entering another
compaction window. This deliberately uses the existing feature-flag
mechanism rather than adding a dedicated public config or app-server
API.

Disable it with:

```sh
codex --disable auto_compaction
```

## Testing

- `just test -p codex-features` — 51 passed
- `just test -p codex-core auto_compaction_feature_disabled` — 2 passed
- `just fix -p codex-core -p codex-features`
- `just write-config-schema`
- `just test -p codex-core` — the new compaction tests passed; the
overall local run had 54 unrelated environment failures, primarily
missing first-party test binaries and shell-snapshot timeouts

rhan-oai · 2026-06-21 20:11:50 -07:00

b21f0e7a98

[codex] add configurable token budget compaction reminder (#29255)

## Why

The token-budget feature reports coarse remaining-context milestones,
but it does not give the model a configurable wrap-up prompt before
automatic compaction. A strict threshold-crossing check can also miss
resumed or reconfigured windows that are already inside the threshold.

## What changed

- Add structured `[features.token_budget]` configuration for an absolute
`reminder_threshold_tokens` and bounded `reminder_message_template`;
`{n_remaining}` is expanded when the reminder is delivered.
- Compute remaining tokens against the next effective auto-compaction
boundary, including scoped `body_after_prefix` accounting and the full
context-window limit.
- Make reminder delivery level-triggered before and after sampling, with
one-shot state owned by `AutoCompactWindow` and re-armed on compaction,
`new_context`, restore, or history replacement.
- Leave the existing initial full-window token-budget context, 25/50/75%
notices, and token-budget tools unchanged.
- Persist the resolved feature configuration in the session config lock
and regenerate the config schema.
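
A minimal sketch of level-triggered, one-shot delivery with template expansion. This is Python for illustration only: `ReminderWindow` is a hypothetical stand-in for the state that this PR places in `AutoCompactWindow`, and treating "at or below the threshold" as inside the threshold is an assumption.

```python
class ReminderWindow:
    """Deliver one reminder per window once remaining tokens are inside the threshold."""

    def __init__(self, threshold_tokens, template):
        self.threshold_tokens = threshold_tokens
        self.template = template
        self.fired = False

    def check(self, n_remaining):
        # Level-triggered: fires whenever the level is inside the threshold,
        # even if no crossing was observed (for example after resume).
        if self.fired or n_remaining > self.threshold_tokens:
            return None
        self.fired = True
        return self.template.replace("{n_remaining}", str(n_remaining))

    def rearm(self):
        # Called on compaction, `new_context`, restore, or history replacement.
        self.fired = False
```

An edge-triggered check would compare the previous and current levels instead, which is how a window that starts already inside the threshold could be missed.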

## Validation

- `just test -p codex-core token_budget`
- `just test -p codex-core
token_budget_reminder_emits_after_crossing_compaction_threshold`
- `just test -p codex-core auto_compact_window`
- `just test -p codex-core
lock_contains_prompts_and_materializes_features`
- `just test -p codex-features`
- `just test -p codex-config`

pakrym-oai · 2026-06-20 19:13:42 -07:00

6df037d47f

Add indexed web search mode (#28489)

## Summary

- Add `web_search = "indexed"` alongside `disabled`, `cached`, and
`live`.
- Use that same resolved mode for both hosted and standalone web search.
- For hosted search, send `index_gated_web_access: true` with external
web access enabled only when `indexed` is selected.
- For standalone search, preserve the existing boolean wire values for
existing modes (`cached` maps to `false` and `live` to `true`) and send
`"indexed"` only for `indexed`; `disabled` keeps the tool unavailable.
- Carry the mode through managed configuration requirements and
generated schemas.
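
The standalone wire mapping described above can be written out directly. This is an illustrative Python sketch; `standalone_wire_value` is a hypothetical name, and returning `None` to mean "tool not exposed" is an assumption of the sketch rather than part of the wire format.

```python
def standalone_wire_value(mode):
    """Map the resolved `web_search` mode to the value sent for standalone search."""
    if mode == "disabled":
        return None  # the tool stays unavailable, so nothing is sent
    mapping = {
        "cached": False,       # existing boolean wire value
        "live": True,          # existing boolean wire value
        "indexed": "indexed",  # new string value, sent only for this mode
    }
    return mapping[mode]
```

Keeping the booleans for `cached` and `live` means requests for existing modes are byte-for-byte unchanged.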

## Why

Indexed search provides a middle ground between cached-only search and
unrestricted live page fetching. Search queries can remain live while
direct page fetches are limited to URLs admitted by the server.

The existing `web_search` setting remains the single source of truth, so
hosted and standalone executors cannot drift into different access
modes. Without an explicit `indexed` selection, the existing
model-visible tool and request shapes are unchanged.

```toml
web_search = "indexed"

[features]
standalone_web_search = true
```

## Validation

- `just fmt`
- `just test -p codex-api` (`126 passed`)
- `just test -p codex-web-search-extension` (`7 passed`)
- `just test -p codex-core
code_mode_can_call_indexed_standalone_web_search` (`1 passed`)
- Focused configuration, hosted request, standalone request, and
managed-requirement coverage is included in the PR; remaining suites run
in CI.

The full workspace test suite was not run locally.

Winston Howes · 2026-06-19 05:35:57 -07:00

3a2712ea14

Add per-turn multi-agent mode (#28685)

## Why

Multi-agent v2 currently carries an explicit-request-only delegation
rule in its static usage hint. That provides a safe default, but it
prevents clients from selecting proactive delegation per turn without
changing static guidance or rewriting prior model context.

This change makes delegation mode a session selection that can be
updated through `turn/start`, while deriving the effective model-visible
mode separately for each turn. Eligible multi-agent v2 turns remain
explicit-request-only unless proactive mode is both selected and
enabled.

## What changed

- Add the experimental `turn/start.multiAgentMode` parameter with
`explicitRequestOnly` and `proactive` values. Omission retains the
loaded session's current optional selection.
- Add the default-off `features.multi_agent_mode` feature gate. Eligible
multi-agent v2 turns use the selected mode when enabled; an unset
selection or disabled gate resolves to `explicitRequestOnly`.
- Treat mode prompting as inapplicable for multi-agent v1 and other
unsupported session configurations, producing no multi-agent mode
developer message rather than rejecting the turn.
- Move the explicit-request-only rule out of the static v2 usage hint
and into a bounded, tagged developer context fragment.
- Emit the effective mode in initial context and only when that
effective mode changes on later turns.
- Persist the effective mode in `TurnContextItem` as the durable
baseline for resume and context-update comparisons.

Historical rollout items are not rewritten. Later mode developer
messages establish the current rule incrementally.
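
The resolution and emit-on-change rules can be sketched as two small functions. This is illustrative Python with hypothetical names; using `None` for "mode prompting is inapplicable" is an assumption of the sketch.

```python
def effective_mode(selected, gate_enabled, eligible_v2):
    """Derive the model-visible mode for one turn."""
    if not eligible_v2:
        return None  # multi-agent v1 or unsupported: no mode message at all
    if gate_enabled and selected is not None:
        return selected
    # Unset selection or disabled gate falls back to the safe default.
    return "explicitRequestOnly"


def mode_message_needed(previous_effective, current_effective):
    """Emit a developer message in initial context and whenever the mode changes."""
    return current_effective is not None and current_effective != previous_effective
```

On resume, `previous_effective` would come from the mode persisted in `TurnContextItem`, so an unchanged mode produces no new message.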

## Not covered

- Initial selection through `thread/start` and selected-mode reporting
from thread lifecycle/settings APIs; those are isolated in the stacked
#28792.
- A TUI control or slash command for selecting the mode.
- Persisting a preferred mode to `config.toml`; selection remains
session/turn scoped.
- Changes to multi-agent concurrency limits, tool availability, or model
catalog capability declarations.
- Rewriting historical rollout prompt items. Cold resume restores the
latest persisted effective mode when available while leaving historical
developer messages intact.

## Verification

- `CARGO_INCREMENTAL=0 just test -p codex-core multi_agent_mode`
- Focused app-server coverage verifies that `turn/start.multiAgentMode`
produces proactive developer instructions for an eligible v2 turn.

## Stack

Followed by #28792, which adds `thread/start` initialization and
lifecycle/settings observability.

Shijie Rao · 2026-06-18 22:47:51 -07:00

fc8c6b7384

[2/3] core: track starting environments in snapshots (#28683)

## Why

Remote environments may still be resolving when Codex creates a session
or turn. Waiting for the existing all-or-nothing environment snapshot
can hold startup until the selected environment is usable.

Behind the default-off `deferred_executor` feature, let callers take a
useful snapshot immediately: completed environments remain available
normally, while unfinished environments are reported without blocking
startup. With the feature disabled, snapshots preserve the existing
blocking behavior.

Depends on #28674.

## What changed

- Store one ordered list of selected environments in
`ThreadEnvironments`. Each selection owns one shared resolution that
produces its complete `TurnEnvironment`.
- Start new resolutions in the background with `remote_handle()`,
allowing snapshots and the future wait tool to share the same result
while cancellation follows the retained handles.
- Make `snapshot()` a read-only operation: nonblocking snapshots collect
completed resolutions and retain handles for unfinished ones, while
blocking snapshots await every resolution.
- Replace completed failed resolutions from the current manager entry
and log when failed environments are omitted.
- Return attached and starting environments as a point-in-time view, and
count starting environments when deciding whether a snapshot is
local-only.
- Keep existing consumers attached-only. `to_selections()` derives from
attached environments, so child threads do not inherit an environment
that is still starting.
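
The blocking and nonblocking snapshot behavior can be approximated with futures. This is a Python sketch under assumptions: `concurrent.futures.Future` stands in for the shared resolutions, `snapshot` here is a hypothetical free function, and failed resolutions are simply omitted without the logging or replacement described above.

```python
from concurrent.futures import Future


def snapshot(resolutions, blocking):
    """Return (attached, starting) for an ordered list of (name, future) pairs."""
    attached, starting = [], []
    for name, future in resolutions:
        if blocking or future.done():
            try:
                # In blocking mode this waits for the resolution to finish.
                attached.append((name, future.result()))
            except Exception:
                continue  # failed environment: left out of the snapshot
        else:
            starting.append(name)  # still resolving; reported without waiting
    return attached, starting
```

The function only reads the futures, so repeated snapshots see the same shared result once a resolution completes.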

## Test plan

- `just test -p codex-core environment_selection`
- `just test -p codex-core
deferred_executor_reaches_model_before_remote_environment_is_ready`

## Landing note

Keep `deferred_executor` disabled for slow-starting executors until
configurable `environment/add` connection timeouts and caller support
land. When enabled, an environment that attaches after session startup
may remain absent from environment-derived model context, tools,
instructions, skills, and related state until follow-up refresh work
lands.

sayan-oai · 2026-06-19 05:06:34 +00:00

45a133bae0

[codex] Assign response item IDs when recording history (#28814)

## Why

Client-created response items enter history without IDs, so their
identity is lost across rollout persistence and resume. IDs should be
assigned once at the history-recording boundary, while IDs returned by
the server must remain unchanged.

The Responses API validates item IDs using type-specific prefixes.
Locally generated IDs therefore use the matching prefix plus a
hyphenated UUIDv7, keeping them valid while distinguishable from
server-generated IDs. Because this changes persisted history and
provider request shapes, the behavior is opt-in behind the
under-development `item_ids` feature. Compaction triggers remain request
controls whose API shape does not accept an ID.

## What changed

- Register the disabled-by-default `item_ids` feature and expose it in
`config.schema.json`.
- Make supported optional `ResponseItem` IDs serializable and expose
them in the generated app-server schemas.
- When `item_ids` is enabled, assign an ID during conversation-history
preparation if an item has no ID.
- Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API
item conventions.
- Preserve existing server IDs without rewriting them.
- Persist assigned IDs in rollouts and include them in subsequent
Responses requests.
- Remove the unsupported ID field from `CompactionTrigger` and document
why it has no ID.
- Add integration coverage for enabled ID persistence, preservation of
server IDs, and omission of generated IDs while the feature is disabled.

`prepare_conversation_items_for_history` is the single response-item ID
allocation boundary.
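
A sketch of assigning a type-prefixed UUIDv7 only when an item has no ID. This is illustrative Python, not the Rust code: the `msg` prefix, the underscore separator, and the helper names are assumptions, and the UUIDv7 is built by hand from the RFC 9562 layout so the sketch does not rely on a standard-library UUIDv7 helper.

```python
import os
import time
import uuid


def uuid7(now_ms=None, random_bytes=None):
    """Build a UUIDv7: 48-bit Unix millisecond timestamp followed by random bits."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    rand = os.urandom(10) if random_bytes is None else random_bytes
    value = (ts & ((1 << 48) - 1)) << 80           # timestamp in the top 48 bits
    value |= int.from_bytes(rand, "big")           # 80 random bits below it
    value = (value & ~(0xF << 76)) | (0x7 << 76)   # version field = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)   # variant field = 0b10
    return uuid.UUID(int=value)


def assign_item_id(item, prefix):
    """Return `item` unchanged if it already has an ID, else a copy with a generated ID."""
    if item.get("id"):
        return item  # server-provided IDs are preserved as-is
    return {**item, "id": f"{prefix}_{uuid7()}"}
```

The hyphens in the UUID are what would make a locally generated ID distinguishable from a server-generated one, per the description above.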

## Test plan

- `just test -p codex-features`
- `just test -p codex-core
response_item_ids_persist_across_resume_and_preserve_server_ids`
- `just test -p codex-core
non_openai_responses_requests_omit_item_turn_metadata`
- `just test -p codex-core
resize_all_images_prepares_failures_before_history_insertion`
- `just test -p codex-protocol`
- `just test -p codex-app-server-protocol`
- `just test -p codex-api azure_default_store_attaches_ids_and_headers`

pakrym-oai · 2026-06-18 17:30:55 -07:00

f00f93d8c0

[codex] Remove child AGENTS.md prompt experiment (#28993)

## Why

`child_agents_md` is a disabled, under-development experiment that adds
a second model-visible explanation of hierarchical `AGENTS.md` behavior.
Keeping it leaves unused prompt, configuration, documentation, and test
surface.

## What changed

- remove the `ChildAgentsMd` feature and `child_agents_md` config schema
entry
- remove the hierarchical prompt asset, export, and instruction
injection
- remove feature-specific tests and documentation
- keep the generic unstable-feature warning coverage using
`apply_patch_streaming_events`

Normal project `AGENTS.md` discovery and composition are unchanged.

## Testing

- `just test -p codex-features`
- `just test -p codex-prompts`
- `just test -p codex-core agents_md`
- `just test -p codex-core unstable_features_warning`

pakrym-oai · 2026-06-18 16:13:07 -07:00

bb72e151e5

feat: opt ChatGPT auth into agent identity (#19049)

## Stack

This is PR 2 of the simplified HAI single-run-task stack:

- [#19047](https://github.com/openai/codex/pull/19047) Agent Identity
assertion and task-registration primitives, including the shared
run-task helper used by existing Agent Identity JWT auth.
- [#19049](https://github.com/openai/codex/pull/19049)
Disabled-by-default ChatGPT auth opt-in that provisions/reuses persisted
Agent Identity runtime auth and its single run task.
- [#19051](https://github.com/openai/codex/pull/19051) Run-scoped
provider auth that uses one backend-owned task id for first-party
inference and compaction requests.

[#19054](https://github.com/openai/codex/pull/19054) collapsed out of
the active stack because the simplified design no longer needs a
separate background/control-plane task helper.

## Summary

This PR adds the disabled-by-default path for normal ChatGPT-login Codex
sessions to obtain Agent Identity runtime auth through the Codex
backend. Existing Agent Identity JWT startup mode remains a separate
path and does not require the feature flag.

What changed:

- adds the experimental `use_agent_identity` feature flag and config
schema entry
- adds an explicit `AgentIdentityAuthPolicy` so call sites choose
`JwtOnly` or `ChatGptAuth` instead of passing a bare boolean
- stores standalone Agent Identity JWT credentials separately from
backend-registered Agent Identity records
- persists the registered Agent Identity record, private key, and single
run task id in `auth.json` so process restarts reuse the same identity
- derives the agent/task registration base URL from ChatGPT/Codex auth
config while keeping JWT JWKS lookup separate
- provisions and caches ChatGPT-derived Agent Identity runtime auth when
`use_agent_identity` is enabled
- reuses the shared run-task registration helper from PR1 rather than
adding a second task-registration path

This PR intentionally does not switch model inference over to
`AgentAssertion` auth. The provider-auth integration lands in the next
PR.

## Testing

- `just test -p codex-login`

Adrian · 2026-06-18 14:05:27 -07:00

ec848dde0e

Add Config for Time Reminders (varlatency 1/n) (#28822)

## Summary

Example:

    [features.current_time_reminder]
    enabled = true
    reminder_interval_model_requests = 1
    clock_source = "system"

## Testing

- `just test -p codex-core varlatency`
- `just test -p codex-core
lock_contains_prompts_and_materializes_features`
- `just fix -p codex-core -p codex-config -p codex-features`

rka-oai · 2026-06-18 11:39:02 -07:00

df5f122854

[codex] add rollout token budget configuration (varlength 1/N) (#28746)

## What

This PR defines the structured configuration contract for rollout token
budgets, which are shared across all agent threads in a single rollout.

```toml
[features.rollout_budget]
enabled = true
limit_tokens = 100000
reminder_interval_tokens = 10000
sampling_token_weight = 1.0
prefill_token_weight = 0.1
```

The reminder interval defaults to 10% of the rollout limit. Sampling and
prefill weights default to `1.0`.
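
The defaulting rules can be sketched as a small resolver. This is illustrative Python: `RolloutBudget` and `resolve_rollout_budget` are hypothetical names (the PR's type is `RolloutBudgetConfig`), and rounding the 10% default down with integer division is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RolloutBudget:
    limit_tokens: int
    reminder_interval_tokens: int
    sampling_token_weight: float = 1.0
    prefill_token_weight: float = 1.0


def resolve_rollout_budget(raw):
    """Fill in defaults for a parsed `[features.rollout_budget]` table."""
    limit = raw["limit_tokens"]
    return RolloutBudget(
        limit_tokens=limit,
        # Default reminder interval: 10% of the rollout limit.
        reminder_interval_tokens=raw.get("reminder_interval_tokens", limit // 10),
        sampling_token_weight=raw.get("sampling_token_weight", 1.0),
        prefill_token_weight=raw.get("prefill_token_weight", 1.0),
    )
```

With only `limit_tokens = 100000` set, this resolves to a 10,000-token reminder interval and both weights at `1.0`.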

## Scope

This PR only defines and validates configuration. It does not track
usage, inject reminders, or stop a rollout. Accounting and reminders are
implemented in the stacked follow-up #28494.

The existing `token_budget` feature remains unchanged. `rollout_budget`
has its own feature key and configuration type.

## Tests

The config test verifies that the structured fields resolve into
`RolloutBudgetConfig` and do not enable the existing `token_budget`
feature.

Local checks:

- `just write-config-schema`
- `just test -p codex-core load_config_resolves_rollout_budget`
- `cargo check -p codex-thread-manager-sample`
- `git diff --check`

The full workspace test suite was not run locally.

rka-oai · 2026-06-18 04:29:47 -07:00

ecc4c30e28

Expose selected namespaces as direct model tools (#28825)

## Why

Some tools, such as history and notes, must remain top-level when MCP
deferral is enabled while staying unavailable through code-mode `exec`.

## What changed

- Added `features.code_mode.direct_only_tool_namespaces`.
- Classified matching MCP tools as `DirectModelOnly`.
- Kept those tools top-level in `code_mode_only`.
- Excluded them from `tool_search` deferral and the nested `exec`
surface.
- Updated the generated config schema.
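
A minimal sketch of the namespace-based split, in Python for illustration. The `partition_tools` helper and the `(namespace, name)` tuple shape are hypothetical; only the idea of matching tools against `direct_only_tool_namespaces` comes from this PR.

```python
def partition_tools(tools, direct_only_namespaces):
    """Split (namespace, name) pairs into direct-model-only tools and all others."""
    direct_only, other = [], []
    for namespace, name in tools:
        if namespace in direct_only_namespaces:
            # Stays top-level; kept out of `tool_search` deferral and nested `exec`.
            direct_only.append((namespace, name))
        else:
            other.append((namespace, name))
    return direct_only, other
```

Tools in the first list correspond to the `DirectModelOnly` classification; the second list follows whatever exposure rules already apply.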

## Validation

- `code_mode_only_exposes_direct_model_only_mcp_namespaces`
- `load_config_resolves_code_mode_config`

Won Park · 2026-06-18 04:07:54 +00:00

78a9e169bb

PAC 1 - Add system proxy feature config surface (#26706)

## Summary

Introduces the default-off `respect_system_proxy` feature flag used to
gate first-class system PAC/proxy support for Codex-owned native
clients.

With the feature disabled or absent, behavior remains unchanged. This PR
establishes the configuration and managed-requirement surface; proxy
discovery and request routing are implemented by follow-up PRs.

## Configuration

User configuration uses the standard boolean feature form:

```toml
[features]
respect_system_proxy = true
```

Managed feature requirements use the corresponding boolean key. The
effective runtime configuration is exposed as a boolean and defaults to
`false`.

## Implementation

- Registers `respect_system_proxy` as an under-development, default-off
feature.
- Resolves user configuration and managed feature requirements into
`Config.respect_system_proxy`.
- Provides bootstrap resolution for startup paths that must evaluate the
feature before full configuration loading completes.
- Uses the standard feature CLI and config-editing behavior.
- Excludes `features.respect_system_proxy` from project-local
configuration.
- Updates the generated configuration schema.

## End-user behavior

- No networking behavior changes when the feature is absent or disabled.
- Enabling the feature makes the boolean available to the native
proxy-routing implementation in follow-up PRs.
- Repository-local configuration cannot enable the feature.

## Test coverage

Covers scalar configuration and CLI override resolution, managed
requirement constraints, bootstrap resolution, and project-local
filtering.

canvrno-oai · 2026-06-16 16:54:37 -07:00

f0cb96bcb1

[codex] Add interruptible sleep tool (#28429)

## Why

Models sometimes need to pause briefly while waiting for external work,
but using a shell command for that delay ties the wait to a process and
does not naturally resume when new turn input arrives.

## What changed

- add a built-in `sleep` tool behind the under-development `sleep_tool`
feature
- accept a bounded `duration_ms` argument, matching the millisecond
convention used by unified exec
- end the sleep early when either steered user input or mailbox input
arrives
- include elapsed wall-clock time in completed and interrupted outputs
- emit a dedicated core `SleepItem` through `item/started` and
`item/completed`
- expose the sleep item as app-server v2 `ThreadItem::Sleep` and retain
it in reconstructed thread history
- regenerate the configuration schema for the new feature flag
- regenerate app-server JSON and TypeScript schema fixtures
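
A rough sketch of the interruptible wait, using only the standard library. It is not the tool's real implementation: the channel stands in for "steered user input or mailbox input", and `MAX_SLEEP_MS`, `SleepOutcome`, and `interruptible_sleep` are hypothetical names chosen for the example.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical upper bound; the real limit is not stated in the PR text.
const MAX_SLEEP_MS: u64 = 60_000;

#[derive(Debug, PartialEq, Eq)]
enum SleepOutcome {
    Completed,
    Interrupted,
}

/// Sleep for up to `duration_ms`, returning early if any input arrives on
/// `input`. The elapsed wall-clock time is returned in both cases.
fn interruptible_sleep(duration_ms: u64, input: &Receiver<()>) -> (SleepOutcome, Duration) {
    let started = Instant::now();
    let budget = Duration::from_millis(duration_ms.min(MAX_SLEEP_MS));
    let outcome = match input.recv_timeout(budget) {
        Ok(()) => SleepOutcome::Interrupted,
        Err(RecvTimeoutError::Timeout) => SleepOutcome::Completed,
        Err(RecvTimeoutError::Disconnected) => {
            // No sender can interrupt us anymore; wait out the remainder.
            std::thread::sleep(budget.saturating_sub(started.elapsed()));
            SleepOutcome::Completed
        }
    };
    (outcome, started.elapsed())
}
```

Because the wait is a channel receive rather than a child process, new input wakes the sleeper immediately instead of waiting for a shell command to exit.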

## Test plan

- `just test -p codex-core sleep_tool_follows_feature_gate`
- `just test -p codex-core any_new_input_interrupts_sleep`
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server
sleep_emits_started_and_completed_items`

pakrym-oai · 2026-06-15 21:39:21 -07:00

08901fc8e1

Remove terminal resize reflow flag gates (#27794 )

## Why

`terminal_resize_reflow` is now stable and should behave as always on.
Keeping the disabled runtime paths around made the feature look
configurable even though the rollout is complete, and old config could
still suggest there was a supported off mode.

## What Changed

- Marked `terminal_resize_reflow` as `Stage::Removed` while keeping it
default-enabled for compatibility.
- Ignored `[features].terminal_resize_reflow` config entries so stale
`false` settings no longer affect the effective feature set.
- Removed TUI branches that depended on the flag being disabled, so
draw, replay buffering, stream finalization, and resize scheduling all
assume resize reflow is active.
- Simplified resize smoke coverage to exercise the always-on behavior
only.
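
The effect of `Stage::Removed` on stale config can be illustrated with a small sketch. Only `Stage::Removed` is named in this PR; the other variants, `FeatureSpec`, and `effective` are simplified stand-ins assumed for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    UnderDevelopment,
    Stable,
    Removed,
}

/// Simplified, hypothetical registry entry for one feature.
struct FeatureSpec {
    key: &'static str,
    stage: Stage,
    default_enabled: bool,
}

/// Effective value of a feature given an optional `[features]` entry.
/// Entries for removed features are ignored, so a stale `false` has no effect.
fn effective(spec: &FeatureSpec, configured: Option<bool>) -> bool {
    match spec.stage {
        Stage::Removed => spec.default_enabled,
        Stage::UnderDevelopment | Stage::Stable => {
            configured.unwrap_or(spec.default_enabled)
        }
    }
}
```

Under this model a leftover `terminal_resize_reflow = false` still resolves to enabled, while features in other stages keep honoring their configured value.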

## Verification

- `just test -p codex-features`
- `just test -p codex-tui resize_reflow`
- `just test -p codex-tui initial_replay_buffer
thread_switch_replay_buffer`

Eric Traut · 2026-06-15 08:23:02 -07:00

8a6a039f26

build: run buildifier from just fmt (#28125 )

## Intent

Keep Bazel and Starlark files consistently formatted without requiring
contributors to install or version buildifier themselves.

## Implementation

- Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
v8.5.1.
- Run buildifier from the shared `just fmt` and `just fmt-check` driver,
with Windows-safe explicit DotSlash invocation.
- Provision DotSlash in formatting CI and contributor devcontainers, and
document the source-build prerequisite.
- Apply the initial mechanical buildifier formatting baseline.

Adam Perry @ OpenAI · 2026-06-13 21:43:39 -07:00

740c4f269d

Promote TUI unified mentions in composer to default mentions feature (#27499 )

## Summary

This PR promotes Mentions 2.0 (unified TUI mention popup) to stable and
enables it by default.

- Keep `mentions_v2` as a temporary rollback path to the legacy split
popups (`--disable mentions_v2`).
- Add feature-default and snapshot coverage for the default experience.

## Prior work

- [#19068 — Unified mentions in
TUI](https://github.com/openai/codex/pull/19068)
- [#22375 — Use plugin/list to get plugins for
mentions](https://github.com/openai/codex/pull/22375)
- [#23363 — Unified mentions tweaks and rendering
polish](https://github.com/openai/codex/pull/23363)

## Test plan

- Launch Codex without any feature overrides.
- Type `@` in the TUI composer.
- Confirm the unified mentions menu opens and displays filesystem,
plugin, and skill results.

canvrno-oai · 2026-06-12 16:29:40 -07:00

1bd6b4c41a

Warn for structured feature toggles (#27076 )

## Summary
Startup warnings for under-development features only recognized bare
boolean toggles like `features.foo = true`. An upcoming feature will use
table-format config, so `features.foo = { enabled = true, ... }` needs
to count as an explicit opt-in too.

This updates the warning predicate to recognize structured tables with
`enabled = true`, while leaving tables without that field unwarned.
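
A minimal sketch of the updated predicate, assuming a simplified `Value` type in place of the real parsed TOML representation (the type and the function name are hypothetical):

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a parsed `features.<name>` TOML value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Str(String),
    Table(BTreeMap<String, Value>),
}

/// True when the config entry is an explicit opt-in that should trigger the
/// under-development warning: either a bare `true` or a table whose
/// `enabled` field is `true`. Tables without that field do not warn.
fn is_explicit_opt_in(value: &Value) -> bool {
    match value {
        Value::Bool(enabled) => *enabled,
        Value::Table(fields) => matches!(fields.get("enabled"), Some(Value::Bool(true))),
        Value::Str(_) => false,
    }
}
```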

## Testing
- `just fmt`
- `just test -p codex-features
unstable_warning_event_mentions_enabled_structured_under_development_feature`

canvrno-oai · 2026-06-12 14:52:07 -07:00

4d1702586a

feat: add secret auth storage configuration (#27504 )

## Why

Windows Credential Manager limits generic credential blobs to 2,560
bytes. The encrypted local secrets backend avoids storing large
serialized auth payloads directly in the OS keyring, but selecting that
backend needs an independently reviewable feature/config layer before
the auth and secrets implementation is wired in.

## What Changed

- Added the stable `secret_auth_storage` feature, enabled by default on
Windows and disabled by default elsewhere.
- Added `AuthKeyringBackendKind` and config resolution for full and
bootstrap config loading.
- Applied managed feature requirements when resolving the bootstrap auth
backend.
- Updated the generated config schema and added focused tests.

This is the base PR for #17931. The auth, secrets, MCP, CLI, TUI, and
app-server implementation remains in that follow-up PR.

## Validation

- `just test -p codex-features`
- `just test -p codex-config`
- `just test -p codex-core
resolve_bootstrap_auth_keyring_backend_kind_uses_secret_auth_storage_feature`
- `just write-config-schema`
- `just fix -p codex-core`

The full `just test -p codex-core` run compiled successfully and ran
2,690 tests; 2,589 passed, one was flaky, and 101 environment-sensitive
tests failed because this shell injects a `pyenv` rehash warning into
command output or because sandboxed subprocesses timed out.

Celia Chen · 2026-06-12 19:15:21 +00:00

b724f5966e

core: enable remote compaction v2 by default (#27573 )

## Why

Remote compaction v2 is ready to become the default for providers that
already support remote compaction. Leaving it behind an
under-development opt-in keeps eligible sessions on the legacy
remote-compaction path.

This does not broaden provider eligibility: OpenAI and Azure move to v2,
while Bedrock and OSS providers retain their existing local-compaction
behavior.

## What changed

- Mark `remote_compaction_v2` stable and enable it by default.
- Make tests that intentionally cover legacy remote compaction
explicitly disable v2.
- Update parity coverage so v2 exercises the production default and only
legacy mode opts out.

## Verification

- `just test -p codex-core
auto_compact_runs_after_resume_when_token_usage_is_over_limit
auto_compact_counts_encrypted_reasoning_before_last_user
auto_compact_runs_when_reasoning_header_clears_between_turns
responses_lite_compact_request_uses_lite_transport_contract`

jif · 2026-06-11 10:07:19 +00:00

273a4aa4f2

[codex] Add token budget context feature (#27438 )

## Why

The model should be able to see bounded context-window budget metadata
when the `token_budget` feature is enabled. The full-window message is
only injected with full context, while normal turns get a smaller
follow-up only when reported usage first crosses a budget threshold.

## What changed

- Added the `TokenBudget` feature flag.
- Added `<token_budget>` developer fragments for full context-window
metadata and current-window remaining tokens.
- Inserted the threshold message during normal turn handling by
comparing token usage before and after sampling, avoiding persistent
threshold bookkeeping.
- Added core integration coverage for full-context-only metadata and
25/50/75 percent threshold messages.
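
The stateless threshold check can be sketched as below. This is an illustration under assumptions: the thresholds are treated as percentages of the context window consumed, the highest newly crossed threshold is reported, and the function name is hypothetical.

```rust
const THRESHOLD_PERCENTS: [u64; 3] = [25, 50, 75];

/// Returns the highest threshold (percent of the window used) that was
/// crossed between the two usage readings, or `None` if none was crossed.
/// Comparing before/after readings avoids storing which thresholds fired.
fn crossed_threshold(used_before: u64, used_after: u64, context_window: u64) -> Option<u64> {
    THRESHOLD_PERCENTS
        .iter()
        .copied()
        .filter(|percent| {
            let limit = context_window * percent / 100;
            used_before < limit && used_after >= limit
        })
        .max()
}
```

A threshold only fires on the turn where usage moves from below the limit to at or above it, so later turns that stay above the same limit produce no repeat message.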

## Verification

- `just test -p codex-core token_budget`
- `git diff --check`

pakrym-oai · 2026-06-10 20:07:06 -07:00

658af936fd

core: resize all history images behind a feature flag (#27247 )

## Summary

Adds complete client-side image preparation behind the default-off
`resize_all_images` feature flag.

When enabled, local image producers defer decoding and resizing. Images
are prepared centrally before insertion into conversation history,
covering user input, `view_image`, and structured tool-output images.

## Behavior

- Processes base64 `data:` images in messages and function/custom tool
outputs.
- Leaves non-data URLs, including HTTP(S) URLs, unchanged.
- Applies image-detail budgets:
  - `high` and omitted: 2048px maximum dimension and 2.5K 32px patches.
  - `original`: 6000px maximum dimension and 10K 32px patches.
  - `auto`: uses the same 2048px / 2.5K-patch budget as `high`.
  - `low`: unsupported and replaced with an actionable placeholder.
- Preserves original image bytes when no resize or format conversion is
needed.
- Enforces the shared 1 GiB encoded and decoded data-URL sanity limits.
- Replaces only an image that fails preparation, preserving sibling
content and tool-output metadata.
- Uses bounded placeholders distinguishing generic processing failures,
oversized images, and unsupported `low` detail.
- Prepares resumed and forked history before installing it as live
history without modifying persisted rollouts.
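
The detail budgets listed above can be expressed as a small lookup plus a fitting step. This is a sketch only: the PR text does not describe the real resize algorithm, so the stepwise shrink loop, the reading of "2.5K" as 2,500, and all function names are assumptions.

```rust
const PATCH_SIZE: u32 = 32;

/// (max dimension in px, max number of 32px patches) per detail level.
/// `None` means the level is unsupported and gets a placeholder instead.
/// "2.5K" and "10K" are read as 2,500 and 10,000 here (assumption).
fn budget_for(detail: Option<&str>) -> Option<(u32, u32)> {
    match detail {
        None | Some("high") | Some("auto") => Some((2048, 2_500)),
        Some("original") => Some((6000, 10_000)),
        Some(_) => None,
    }
}

/// Number of 32px patches needed to cover an image, rounding up per axis.
fn patch_count(width: u32, height: u32) -> u32 {
    let patches_wide = (width + PATCH_SIZE - 1) / PATCH_SIZE;
    let patches_high = (height + PATCH_SIZE - 1) / PATCH_SIZE;
    patches_wide * patches_high
}

/// Shrink (never enlarge) until both limits hold, keeping the aspect ratio.
/// Budgets are assumed to be at least 1.
fn fit_to_budget(width: u32, height: u32, max_dim: u32, max_patches: u32) -> (u32, u32) {
    let longest = width.max(height).max(1);
    let mut scale = (f64::from(max_dim) / f64::from(longest)).min(1.0);
    loop {
        let w = ((f64::from(width) * scale).floor() as u32).max(1);
        let h = ((f64::from(height) * scale).floor() as u32).max(1);
        if w.max(h) <= max_dim && patch_count(w, h) <= max_patches {
            return (w, h);
        }
        scale *= 0.98; // illustrative step size
    }
}
```

An image already inside both limits comes back with its original dimensions, which corresponds to the case where the original bytes can be preserved.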

## Flag-Off Behavior

When `resize_all_images` is disabled:

- Existing local user-input and `view_image` processing remains
unchanged.
- Existing decoding and error behavior remains unchanged.
- Arbitrary tool-output images are not processed.
- HTTP(S) image URLs continue to be forwarded unchanged.


#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/27245
- 👉 `2` https://github.com/openai/codex/pull/27247
- ⏳ `3` https://github.com/openai/codex/pull/27246
- ⏳ `4` https://github.com/openai/codex/pull/27266

Curtis 'Fjord' Hawthorne · 2026-06-10 19:21:24 -07:00

a6f435ea94

[codex] remove blocking external agent migration flow (#27064 )

## Why

External-agent import should be initiated deliberately instead of
interrupting eligible TUI startups. This cleanup removes the blocking
startup flow before the replacement import experience is introduced
later in the stack.

## What changed

- remove the startup-blocking external-agent migration prompt
- remove the now-unused external migration feature gate
- remove the obsolete TUI app-server migration wrappers
- retain the dormant picker behind a module-scoped dead-code allowance
until the next stack item wires it back in
- keep normal TUI startup focused on entering Codex immediately

## Validation

- `bazel build --config=clippy //codex-rs/tui:tui
//codex-rs/tui:tui-unit-tests-bin`
- `just test -p codex-tui external_agent_config_migration` (8 passed)
- `just test -p codex-tui` (2,786 passed, 12 unrelated local
environment-sensitive failures, 4 skipped)
- `just fix -p codex-tui`
- `just fmt`

## Stack

1. [#27064](https://github.com/openai/codex/pull/27064): remove the
startup migration flow
2. [#27065](https://github.com/openai/codex/pull/27065): extract the
picker renderer
3. [#27070](https://github.com/openai/codex/pull/27070): add the
external-agent import picker UX
4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow
through `/import`

**This PR is stack item 1.**

stefanstokic-oai · 2026-06-10 14:25:04 -04:00

636cc11398

Use plugin-service MCP as the hosted plugin runtime (#27198 )

## Stack

- Base: #27191
- This PR is the third vertical and should be reviewed against
`jif/external-plugins-2`, not `main`.

## Why

#27191 moves the host-owned Apps MCP registration behind an extension
contributor, but deliberately preserves the existing endpoint-selection
feature while that contribution contract lands. App-server can therefore
resolve the server through extensions, yet the hosted plugin endpoint is
still selected through temporary `apps_mcp_path_override` plumbing.

That is not the long-term plugin model. A plugin can bundle skills,
connectors, MCP servers, and hooks, and those components do not all need
the same source or execution environment. In particular, an
authenticated HTTP MCP server can expose plugin capabilities directly
from a backend without an executor or an orchestrator filesystem.

This PR completes that hosted vertical. App-server's MCP extension now
owns the aggregate hosted plugin runtime at `/ps/mcp`. Connector actions
continue to arrive as MCP tools, while backend-provided skills arrive as
MCP resources and use Codex's existing resource list/read paths. No
second backend client, skill filesystem, or generic plugin activation
framework is introduced.

The backend route remains the hosted implementation. This change
replaces Codex's temporary endpoint-selection mechanism, not the service
behind the endpoint.

## What changed

### Hosted plugin runtime

The MCP extension now contributes `codex_apps` as the hosted plugin
runtime rather than as a configurable Apps endpoint:

- `https://chatgpt.com` resolves to
`https://chatgpt.com/backend-api/ps/mcp`;
- a bare custom ChatGPT base resolves to `/api/codex/ps/mcp`;
- the existing product-SKU header and ChatGPT authentication behavior
are preserved;
- executor availability is never consulted for this streamable HTTP
transport.

The same MCP connection carries both component shapes supported by the
hosted endpoint:

- connector actions are discovered and invoked as MCP tools;
- hosted skills are enumerated and read as MCP resources through the
existing `list_mcp_resources` and `read_mcp_resource` paths.

This keeps component access in the subsystem that already owns the
protocol instead of downloading backend skills into an orchestrator
filesystem or inventing a parallel hosted-skill client.

### Explicit runtime ordering

`McpManager` now resolves the reserved `codex_apps` entry in three
ordered phases:

1. install the legacy Apps fallback for compatibility;
2. apply ordered extension `Set` or `Remove` overlays;
3. apply the final ChatGPT-auth gate without synthesizing the server
again.

This ordering is important:

- an ordinary configured or plugin MCP server cannot claim the
auth-bearing `codex_apps` name;
- an extension-contributed hosted runtime wins over the fallback;
- an extension `Remove` remains authoritative;
- a host without the MCP extension retains the legacy Apps endpoint and
current local-only behavior.

The temporary `legacy_apps_mcp_loader_enabled` coordination flag is no
longer needed.
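
The three phases can be sketched as a single function. This is a simplified model, not the `McpManager` code: the `Overlay` type, the string URLs, and the rule that a later overlay simply replaces an earlier one are assumptions for illustration.

```rust
/// Hypothetical model of an extension overlay on the reserved entry.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Overlay {
    Set(String),
    Remove,
}

/// Resolve the reserved `codex_apps` entry in three ordered phases.
fn resolve_codex_apps(
    legacy_fallback_url: &str,
    overlays: &[Overlay],
    chatgpt_auth: bool,
) -> Option<String> {
    // Phase 1: install the legacy Apps fallback for compatibility.
    let mut entry = Some(legacy_fallback_url.to_string());
    // Phase 2: apply extension overlays in order.
    for overlay in overlays {
        entry = match overlay {
            Overlay::Set(url) => Some(url.clone()),
            Overlay::Remove => None,
        };
    }
    // Phase 3: the auth gate can only drop the entry, never re-create it.
    if chatgpt_auth {
        entry
    } else {
        None
    }
}
```

With no overlays the legacy fallback survives, which models a host that does not install the MCP extension.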

### Remove the path override

The `apps_mcp_path_override` feature and its runtime plumbing are
removed, including:

- the feature registry entry and structured feature config;
- `Config` and `McpConfig` fields;
- config schema output;
- config-lock materialization;
- URL override handling in `codex-mcp`.

Existing boolean and structured forms still deserialize as ignored
compatibility input. They are omitted from new serialized config, and
config-lock comparison normalizes the removed input so older locks
remain replayable.

### App-server coverage

App-server MCP fixtures now serve the hosted route at
`/api/codex/ps/mcp`. Existing resource-read and tool/elicitation flows
therefore exercise the extension-owned endpoint rather than succeeding
through the legacy fallback.

The stack also adds the missing `codex_chatgpt::connectors` re-export
for the manager-backed connector helper introduced in #27191.

## Compatibility

- App-server installs the extension and uses `/ps/mcp` for the hosted
runtime.
- CLI and other hosts that do not install the extension retain the
legacy Apps endpoint.
- Apps disabled or non-ChatGPT authentication removes `codex_apps` from
the effective runtime view.
- Existing local plugins, local skills, executor-selected skills,
configured MCP servers, and MCP OAuth behavior are otherwise unchanged.
- Backend plugin enablement remains account/workspace state owned by the
hosted endpoint; this PR does not add thread-local backend plugin
selection.

## Architectural fit

The stack now proves two independent runtime shapes:

1. #27184 resolves filesystem-backed skills through the executor that
owns a selected root.
2. #27191 and this PR resolve a backend-hosted HTTP MCP through an
extension with no executor.

Together they preserve the intended separation:

- selection identifies a plugin/root when explicit selection is needed;
- each component's owning extension resolves its concrete access
mechanism;
- execution stays with the runtime required by that component;
- existing skills, MCP, connector, and hook subsystems remain the
downstream consumers.

## Planned follow-ups

1. **Executor stdio MCP:** selecting an executor plugin registers a
manifest-declared stdio MCP server and executes it in the environment
that owns the plugin.
2. **Optional backend selection:** only if CCA needs thread-local
selection distinct from backend account/workspace enablement, add a
concrete backend-owned capability location and surface those selected
skills through the skills catalog.
3. **Connector metadata and hooks:** activate those plugin components
through their existing owning subsystems, with executor hooks remaining
environment-bound.
4. **Propagation and persistence:** define explicit resume, fork,
subagent, refresh, and environment-removal semantics once selected roots
have multiple real consumers.
5. **Local convergence:** migrate legacy local skill, MCP, connector,
and hook paths behind their owning extensions one vertical at a time,
then remove duplicate core managers and compatibility plumbing after
parity.

## Verification

Coverage in this change exercises:

- extension-owned `/backend-api/ps/mcp` registration without an
executor;
- preservation of the legacy endpoint in hosts without the extension;
- extension `Set` and `Remove` precedence over the legacy fallback;
- ChatGPT-auth gating for the reserved server;
- hosted MCP resource reads with and without an active thread;
- connector tool invocation and MCP elicitation through the hosted
route;
- ignored boolean and structured forms of the removed path override;
- config-lock replay compatibility for the removed feature.

`cargo check -p codex-features -p codex-mcp-extension -p
codex-app-server` passes. Tests and Clippy were not run locally under
the current development instruction; CI provides the full validation
pass.

jif · 2026-06-10 12:54:21 +02:00

9cd11e9e62

[codex] Gate terminal visualization instructions in TUI (#26013 )

## Summary
- add `Feature::TerminalVisualizationInstructions` as
`UnderDevelopment`, disabled by default
- keep terminal visualization instructions inside the TUI package
- append them to existing developer instructions for TUI start, resume,
and fork flows only when enabled
- intentionally do not apply them to `codex exec`

## Rollout
Control behavior is unchanged. TUI dogfooders can enable
`terminal_visualization_instructions`; no default user receives the new
terminal-specific instructions.

The shared visualization-selection rule is supplied separately through
the `codex_proxy_model_3` Statsig layer for every target Codex model
slug in the gated cohort. This TUI feature determines how to render an
appropriate visualization on the terminal surface; the model-layer
treatment determines when to use one.

## Validation
- `cargo test -p codex-tui
terminal_visualization_instructions_are_gated_for_all_tui_thread_flows
--lib`
- `cargo test -p codex-features --lib`
- `cargo fmt --all -- --check`
- `git diff --check`
- GPT-5.4 and GPT-5.5 real prompt-pipeline smoke tests: both visualized
the positive mapping case, abstained on the negative route case, and
passed exact prompt-stack verification on CLI and App
- refreshed onto current `main` with a clean merge and reran the focused
validation

The full 53-probe all-model treatment comparison and requested
production coding evals remain rollout gates before broadening beyond
the initial employee cohort.

This PR remains open for normal human review.

vie-oai · 2026-06-05 17:23:45 -07:00

61a913d9c8

Remove response.processed websocket request (#26447 )

## Why

The Responses websocket client no longer needs to send a follow-up
`response.processed` request after a turn response has already been
recorded. Keeping that extra acknowledgement path adds feature-gated
control flow and a second websocket request shape that no longer carries
useful behavior.

## What Changed

- Removed the `response.processed` websocket request type and sender.
- Removed the `responses_websocket_response_processed` feature flag and
schema entry.
- Removed turn and remote-compaction plumbing that only tracked response
IDs to send the acknowledgement.
- Removed tests that existed solely to cover the deleted feature path.

## Validation

- `just fix -p codex-core -p codex-api -p codex-features`

pakrym-oai · 2026-06-04 13:15:50 -07:00

d312a53e2a

core: allow excluding tool namespaces from code mode (#26320 )

## Why

Research and training setups need to control which tool namespaces
appear inside code mode's nested `tools` surface without disabling those
tools entirely. This makes it possible to train against a deliberately
reduced nested-tool setup while preserving the normal direct and
deferred tool paths.

## What

- Extend `features.code_mode` to accept structured configuration while
preserving the existing boolean syntax.
- Add an exact `excluded_tool_namespaces` list under
`[features.code_mode]`:

  ```toml
  [features.code_mode]
  enabled = true
  excluded_tool_namespaces = ["mcp__codex_apps", "multi_agent_v1"]
  ```

- Filter matching canonical `ToolName` namespaces when constructing code
mode's nested router and code-mode-specific direct tool descriptions.
- Keep excluded tools registered, directly exposed in mixed code mode,
and discoverable through top-level `tool_search` when otherwise
eligible.
- Derive deferred nested-tool guidance after namespace filtering so the
`exec` description does not advertise excluded-only deferred tools.
- Preserve the boolean/table representation when materializing config
locks and update the generated config schema.
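
A small sketch of the exact-match namespace filter. The `Tool` struct, the function name, and the sample tool names used in any call are hypothetical; only the idea of filtering the nested surface while leaving the full tool set registered comes from the description above.

```rust
/// Hypothetical, simplified view of a registered tool.
struct Tool {
    namespace: &'static str,
    name: &'static str,
}

/// Tools visible inside code mode's nested `tools` surface.
/// Matching is exact: a prefix of a namespace does not exclude it.
/// The input slice is untouched, so excluded tools stay registered.
fn nested_tools<'a>(tools: &'a [Tool], excluded_namespaces: &[&str]) -> Vec<&'a Tool> {
    tools
        .iter()
        .filter(|tool| !excluded_namespaces.contains(&tool.namespace))
        .collect()
}
```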

## Testing

- `just test -p codex-features`
- `just test -p codex-config`
- `just test -p codex-core load_config_resolves_code_mode_config`
- `just test -p codex-core
lock_contains_prompts_and_materializes_features`
- `just test -p codex-core
excluded_deferred_namespaces_do_not_enable_nested_tool_guidance`
- `just test -p codex-core
code_mode_excludes_configured_nested_tool_namespaces`
- `cargo check -p codex-thread-manager-sample`

sayan-oai · 2026-06-04 18:40:18 +00:00

8b1238856b

feat: gate unified exec zsh fork composition (#24979 )

## Why

`shell_zsh_fork` and unified exec need to remain independently
controllable for enterprise rollouts, but we also need a third mode that
composes them. That composed mode is intended to preserve unified exec
command lifecycle support while letting the zsh fork provide more
accurate `execv(2)` interception.

Enabling `unified_exec_zsh_fork` by itself is intentionally not
sufficient. It is a composition gate, not a dependency-enabling
shortcut:

- `unified_exec` selects the PTY-backed unified exec tool.
- `shell_zsh_fork` opts into the zsh fork backend.
- `unified_exec_zsh_fork` only allows those two already-enabled modes to
be composed so local zsh unified exec commands can launch through the
zsh fork.

This separation is deliberate. Enterprises and staged rollouts must be
able to enable or disable unified exec and zsh-fork independently. If
`unified_exec_zsh_fork` implied either dependency, then enabling one
under-development composition flag would silently activate a shell
backend that the configured feature set left disabled.

This PR introduces only the configuration and planning gate for that
composition. Existing `shell_zsh_fork` behavior continues to use the
standalone shell tool unless the new composition feature is explicitly
enabled alongside both dependencies.

## What Changed

- Added the under-development feature flag `unified_exec_zsh_fork`.
- Added `UnifiedExecFeatureMode` so the three input feature flags
collapse into `Disabled`, `Direct`, or `ZshFork` mode before tool
planning.
- Updated tool selection so zsh-fork composition requires
`unified_exec`, `shell_zsh_fork`, and `unified_exec_zsh_fork`.
- Kept the existing standalone zsh-fork shell tool behavior when only
`shell_zsh_fork` is enabled.
- Updated config schema output for the new feature flag.
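
The collapse of the three flags into one mode can be sketched as a match. The enum name and variants come from this PR; the function name is hypothetical, and mapping "`unified_exec` and `shell_zsh_fork` on, composition gate off" to `Direct` is an assumption, since the description only says the standalone shell tool keeps its existing behavior in that case.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnifiedExecFeatureMode {
    Disabled,
    Direct,
    ZshFork,
}

/// Collapse the three input flags into a single planning mode.
/// The composition gate alone never enables either dependency.
fn unified_exec_mode(
    unified_exec: bool,
    shell_zsh_fork: bool,
    unified_exec_zsh_fork: bool,
) -> UnifiedExecFeatureMode {
    match (unified_exec, shell_zsh_fork, unified_exec_zsh_fork) {
        (false, _, _) => UnifiedExecFeatureMode::Disabled,
        (true, true, true) => UnifiedExecFeatureMode::ZshFork,
        (true, _, _) => UnifiedExecFeatureMode::Direct,
    }
}
```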

## Verification

- Added feature and tool-config coverage for the new gate.
- Added planner coverage proving `shell_zsh_fork` remains standalone
until composition is explicitly enabled.
- Ran focused tests for `codex-features`, `codex-tools`, and the
affected `codex-core` planner case.





---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/24979).
* #24982
* #24981
* #24980
* __->__ #24979

Michael Bolin · 2026-06-01 13:01:36 -07:00

d6748f741a

Compress cold local rollouts (#25089 )

## Rollout compression stack

This stack splits #24941 into reviewable steps for local rollout
compression. The design is intentionally staged:

1. Teach readers, listing, search, and lookup to understand compressed
rollouts.
2. Make append and resume paths materialize compressed rollouts back to
plain JSONL before writing.
3. Add a disabled-by-default worker that can compress cold archived
rollouts behind `local_thread_store_compression`.

The key invariant is that writers append to plain `.jsonl`. A
`.jsonl.zst` file is a cold/read representation; if a write is needed,
the compressed file is materialized back to plain JSONL first. Readers
prefer plain `.jsonl` when both forms exist and can fall back to the
compressed sibling during transitions.
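
The reader-side preference can be sketched with plain path checks. The function names are hypothetical and the sketch ignores decompression entirely; it only shows the "plain first, compressed sibling as fallback" lookup.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// `thread.jsonl` -> `thread.jsonl.zst` (appends to the full file name).
fn compressed_sibling(plain: &Path) -> PathBuf {
    let mut name: OsString = plain.as_os_str().to_owned();
    name.push(".zst");
    PathBuf::from(name)
}

/// Pick the file a reader should open: the plain `.jsonl` when it exists,
/// otherwise the compressed sibling, otherwise nothing.
fn resolve_readable_rollout(plain: &Path) -> Option<PathBuf> {
    if plain.is_file() {
        return Some(plain.to_path_buf());
    }
    let compressed = compressed_sibling(plain);
    compressed.is_file().then_some(compressed)
}
```

Preferring the plain file means a reader never picks up a stale compressed copy while both forms briefly coexist.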

The worker is deliberately the last PR and remains behind an
under-development feature flag. It currently scans only
`archived_sessions`, not active `sessions`, because active sessions have
the highest resume/append race risk. That means this stack does not yet
compress most unarchived local history.

## Known race / follow-up

The remaining unresolved design question is writer/compressor
coordination. Even for archived rollouts, a resume or metadata update
can append while the worker is replacing the plain file with
`.jsonl.zst`; the current double-stat checks narrow but do not fully
eliminate the window where a writer has opened the plain file before
unlink. Do not treat the worker PR as production-ready until we either:

- prevent append/resume paths from racing archived compression, or
- introduce a shared representation/append lock or equivalent
coordination.

The first two PRs are useful independently: they make compressed
rollouts readable and make append paths safely recover back to plain
JSONL. The third PR isolates the worker behavior so that coordination
issue is reviewable separately.

## Validation

Focused local validation for the stack includes:

- `just test -p codex-rollout`
- `just test -p codex-thread-store` where thread-store paths were
touched
- `just test -p codex-features` for the feature flag slice
- `just bazel-lock-check` after dependency graph changes
- scoped `just fix -p ...` passes for changed crates

CI is still the source of truth for the full platform matrix.

## This PR in the stack

This is PR 3/3, based on #25088. It adds the under-development feature
flag and starts the best-effort background worker when enabled. The
worker currently compresses only cold archived rollouts, skips active
sessions, verifies compressed output, preserves mtime and permissions,
keeps a store-level lock heartbeat, and cleans stale temp files.

Stack order:

1. #25087: read compressed local rollouts.
2. #25088: materialize compressed rollouts before append.
3. This PR: add the disabled local compression worker.

jif-oai · 2026-06-01 18:35:58 +02:00

01cb97851b

fix(config): use deny for Unix socket permissions (#24970 )

## Why

Unix socket permissions still accepted and displayed `"none"` while file
permissions use the clearer `"deny"` spelling. This change aligns the
network Unix socket policy vocabulary with the filesystem policy
vocabulary.

## What changed

- Rename the Unix socket permission variant and its serialized spelling
from `none` to `deny` across config, feature configuration, and network
proxy types.
- Update app-server v2 serialization, TUI debug output, focused tests,
and generated schemas to expose `"deny"`.
- Add coverage for denied Unix socket entries in managed requirements
and profile overlay behavior.

## Security

This is a vocabulary change for explicit Unix socket rejection, not a
network access expansion. Denied entries continue to be omitted from the
effective allowlist.
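
A minimal sketch of how denied entries drop out of the effective allowlist. The `Allow` variant, the function name, and the tuple shape are assumptions for illustration; only the `deny` spelling and the omission of denied entries come from the description above.

```rust
/// Simplified permission model; `Allow` is an assumed counterpart to `Deny`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnixSocketPermission {
    Allow,
    Deny,
}

/// Socket paths that remain reachable: denied entries are omitted.
fn effective_allowlist<'a>(entries: &[(&'a str, UnixSocketPermission)]) -> Vec<&'a str> {
    entries
        .iter()
        .filter(|(_, permission)| *permission == UnixSocketPermission::Allow)
        .map(|(path, _)| *path)
        .collect()
}
```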

## Validation

- `just fmt`
- `just write-config-schema`
- `just write-app-server-schema`
- `just test -p codex-config -p codex-core -p codex-app-server-protocol
-p codex-tui -E
'test(network_requirements_are_preserved_as_constraints_with_source) |
test(network_permission_containers_project_allowed_and_denied_entries) |
test(network_toml_overlays_unix_socket_permissions_by_path) |
test(permissions_profiles_resolve_extends_parent_first_with_child_overrides)
| test(network_requirements_serializes_canonical_and_legacy_fields) |
test(debug_config_output_formats_unix_socket_permissions)'`
- Automatic `bench-smoke` follow-up from `just test`
- `cargo clippy -p codex-config -p codex-core -p codex-features -p
codex-network-proxy -p codex-app-server-protocol -p codex-app-server -p
codex-tui --all-targets -- -D warnings`

viyatb-oai · 2026-05-28 23:53:26 +00:00

bf72be5927

Add feature-gated standalone image generation extension (#24723 )

## Why

Add a standalone image generation path that can be exercised
independently of hosted Responses image generation, while retaining the
hosted tool as fallback unless the extension is actually available to
the model.

## What changed

- Added the `codex-image-generation-extension` crate with standalone
generate/edit execution, prior-image selection for edits, model-visible
image output, and local generated-image persistence.
- Installed the extension in app-server behind the disabled-by-default
`imagegenext` feature and backend eligibility checks.
- Updated core tool planning so eligible `image_gen.imagegen` exposure
replaces hosted `image_generation`, while unavailable configurations
retain hosted fallback.
- Added coverage for extension behavior, edit history reuse, feature
gating, auth eligibility, and hosted-tool replacement.
- The extension is installed through app-server only in this PR; other
execution paths retain hosted image generation because hosted
replacement occurs only when the standalone executor is actually
registered and model-visible.
- The initial extension contract intentionally fixes the image model to
`gpt-image-2` and uses automatic image parameters.
- Native generated-image history/card parity and rollout persistence
cleanup are intentionally deferred follow-up work.

## Validation

- `just test -p codex-image-generation-extension`
- `just test -p codex-features`
- `just test -p codex-core
hosted_tools_follow_provider_auth_model_and_config_gates`
- `just test -p codex-app-server`
- `just fix -p codex-image-generation-extension -p codex-features -p
codex-core -p codex-app-server`
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`

---------

Co-authored-by: jif-oai <jif@openai.com>

Won Park · 2026-05-28 11:44:55 -07:00

ecb41fcb64

TUI: Unified mentions tweaks + polish mentions rendering (#23363 )

This change keeps unified @mentions behind the mentions_v2 gate, moves
the flag to under-development, and polishes mention rendering/history
behavior.

It also makes a few small improvements to mention rendering and history
round-tripping for plugin/tool mentions in message-edit scenarios.
Plugin selections now insert `@` mentions with better casing, and saved
history preserves the visible sigil so recalled messages look the same
as what the user typed.

- Preserves `@` sigils when encoding/decoding mention history for
tool/plugin paths.
- Improves plugin mention insertion so display names/casing are
reflected more cleanly in the composer.
- Updates the composer to render user-entered plugin mentions in the same
color as the mentions menu. This also applies to recalled/edited messages.
- Left/right arrows no longer switch unified-mention search modes after
an @mention has already been accepted (e.g., arrowing left through a
composed message that contains @mentions).
- Keeps bound mentions stable around punctuation, so accepted `@`
mentions do not reopen the popup and punctuated `$` mentions still
persist to cross-session history.

**Steps to test**
- Ensure `mentions_v2` is enabled through configuration or `--enable
mentions_v2`.
- Type `@` in the TUI composer and verify filesystem/plugin/skill
results are displayed in the unified mentions menu.
- Select a plugin mention from the `@` popup and confirm the inserted
text is an `@...` mention with the expected casing, then recall/edit the
message and confirm it still renders as `@...`.
- Mention a skill and verify that skills still insert as `$skill`
mentions rather than `@` mentions.
- Verify punctuated mentions such as `@plugin.` and `($skill)` keep
their bound mention behavior across editing and history recall.

canvrno-oai · 2026-05-28 10:30:15 -07:00

6c1215dac6

standalone websearch extension (#23823 )

## Summary

Add the extension-backed standalone `web.run` tool so Codex can call the
standalone search endpoint through the `codex-api` search client and
return its encrypted output to Responses.

- gate the new tool behind `standalone_web_search`
- install the extension in the app-server thread registry and hide
hosted `web_search` when standalone search is enabled for OpenAI
providers so the two paths stay mutually exclusive
- build search context from persisted history using a small tail
heuristic: previous user message, assistant text between the last two
user turns capped at about 1k tokens, and current user message
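
A minimal sketch of the tail heuristic, assuming a simplified history model. The `Item` and `SearchContext` types, the 4-characters-per-token estimate, and the choice to keep the most recent assistant text when capping are all assumptions for illustration; the extension's real types and token accounting are not shown here.

```rust
/// Hypothetical, simplified history item.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    User(String),
    Assistant(String),
}

#[derive(Debug, PartialEq)]
struct SearchContext {
    previous_user: Option<String>,
    assistant_between: String,
    current_user: String,
}

// Assumption: roughly 4 characters per token, so the cap is about 1k tokens.
const APPROX_CHARS_PER_TOKEN: usize = 4;
const ASSISTANT_TOKEN_CAP: usize = 1_000;

fn is_user(item: &Item) -> bool {
    matches!(item, Item::User(_))
}

fn text(item: &Item) -> &str {
    match item {
        Item::User(t) | Item::Assistant(t) => t,
    }
}

/// Keeps at most `max_chars` trailing characters (assumption: the tail wins).
fn cap_tail(text: &str, max_chars: usize) -> String {
    let total = text.chars().count();
    if total <= max_chars {
        return text.to_string();
    }
    text.chars().skip(total - max_chars).collect()
}

/// Returns `None` when the history contains no user message.
fn build_search_context(history: &[Item]) -> Option<SearchContext> {
    let current_idx = history.iter().rposition(is_user)?;
    let previous_idx = history[..current_idx].iter().rposition(is_user);

    // Assistant text only counts when it sits between the last two user turns.
    let assistant_between = match previous_idx {
        Some(prev) => {
            let joined = history[prev + 1..current_idx]
                .iter()
                .filter(|item| !is_user(item))
                .map(text)
                .collect::<Vec<_>>()
                .join("\n");
            cap_tail(&joined, ASSISTANT_TOKEN_CAP * APPROX_CHARS_PER_TOKEN)
        }
        None => String::new(),
    };

    Some(SearchContext {
        previous_user: previous_idx.map(|i| text(&history[i]).to_string()),
        assistant_between,
        current_user: text(&history[current_idx]).to_string(),
    })
}
```

Older turns are never consulted in this sketch, which keeps the context small no matter how long the thread grows.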

## Test Plan

- `cargo test -p codex-web-search-extension`
- `cargo test -p codex-api`
- `cargo test -p codex-core
hosted_tools_follow_provider_auth_model_and_config_gates`

sayan-oai · 2026-05-26 11:12:24 -07:00

a22706dfae

Move MCP tool naming mode into manager (#21576 )

## Why

The `non_prefixed_mcp_tool_names` feature should be applied where MCP
tools become model-visible, not by remapping names later in core.
Keeping the decision in `McpConnectionManager` construction makes
`ToolInfo` the single shaped view that spec building, deferred tool
search, routing, and unavailable-tool placeholders can consume directly.

This also preserves the existing external behavior while the feature is
off, and keeps the feature-on behavior for code mode and hooks explicit
at the manager boundary.

## What Changed

- Add `McpToolNameMode` to `codex-mcp` and flow it through `McpConfig`
into `McpConnectionManager::new`.
- Normalize MCP `ToolInfo` names in the manager using either
legacy-prefixed namespaces or non-prefixed namespaces; the legacy path
adds `mcp__` without restoring the old trailing namespace suffix.
- Remove the core-side MCP name remapping path so specs, tool search,
session resolution, and unavailable-tool placeholder construction use
the manager-provided `ToolName` values directly.
- Keep code mode flattening on the `__` namespace separator.
- Preserve hook compatibility by giving non-prefixed MCP hook names
legacy `mcp__...` matcher aliases.
- Add/adjust integration and unit coverage for non-prefixed code-mode
behavior, hook matching with the feature on and off, and manager-level
legacy prefixing.
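
As a rough illustration of the two naming modes, here is a sketch under stated assumptions. `McpToolNameMode` is the type added in this change, but the variant names, the helper functions `namespace_for` and `flatten_for_code_mode`, and the use of a bare server name as the namespace are hypothetical; the hook alias format is not modeled.

```rust
/// Variant names are assumptions; only the type name comes from this change.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum McpToolNameMode {
    LegacyPrefixed,
    NonPrefixed,
}

/// Hypothetical helper: picks the namespace a server's tools are exposed under.
/// The legacy path adds `mcp__` and no trailing suffix.
fn namespace_for(server: &str, mode: McpToolNameMode) -> String {
    match mode {
        McpToolNameMode::LegacyPrefixed => format!("mcp__{server}"),
        McpToolNameMode::NonPrefixed => server.to_string(),
    }
}

/// Hypothetical helper: code mode flattens namespace and tool with `__`.
fn flatten_for_code_mode(namespace: &str, tool: &str) -> String {
    format!("{namespace}__{tool}")
}
```

Because the mode is applied once when names are shaped, downstream consumers in this sketch never need to know which mode is active; they only see the resulting namespace.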

## Testing

- `cargo test -p codex-mcp --lib`
- `cargo test -p codex-core --lib tools::spec::tests -- --nocapture`
- `cargo test -p codex-core --lib mcp_tools -- --nocapture`
- `cargo test -p codex-core --lib mcp_tool_exposure -- --nocapture`
- `cargo test -p codex-core --test all mcp_tool -- --nocapture`
- `cargo test -p codex-core --test all search_tool -- --nocapture`
- `cargo test -p codex-core --test all hooks_mcp -- --nocapture`
- `cargo test -p codex-core --test all
code_mode_uses_non_prefixed_mcp_tool_names_when_feature_enabled --
--nocapture`
- `cargo test -p codex-tools`
- `cargo test -p codex-features`

pakrym-oai · 2026-05-26 08:21:15 -07:00

ff7513cd83

Remove plugin hooks feature flag (#22552 )

# Why

This is a follow-up stacked on top of the `plugin_hooks` default-on
change. Once we are comfortable making plugin hooks part of the normal
plugin behavior, the separate feature flag stops buying us much and
leaves extra branching/cache state behind.

# What

- remove the `PluginHooks` feature and generated config-schema entries
- make plugin hook loading/listing follow plugin enablement directly
- drop plugin-manager cache/state that only existed to distinguish
hook-flag toggles
- remove tests and fixtures that modeled `plugin_hooks = true/false`

Abhinav · 2026-05-21 19:15:18 +00:00

24faf49b2a

Make goals feature on by default and no longer experimental (#23732 )

## Why

The `goals` feature is ready to be available without requiring users to
opt into experimental features. Keeping it behind the beta flag leaves
persisted thread goals and automatic goal continuation disabled by
default.

This PR also marks the goal-related app server APIs and events as no
longer experimental.

## What changed

- Mark `goals` as `Stage::Stable`.
- Enable `goals` by default in `codex-rs/features/src/lib.rs`.
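
For readers unfamiliar with how a stage and a default interact, a simplified sketch follows. Only `Stage::Stable` and the `goals` key come from this change; the `FeatureSpec` struct, the `UnderDevelopment` variant name, and the `is_enabled` helper are assumptions and may not match `codex-rs/features/src/lib.rs`.

```rust
use std::collections::HashMap;

/// Simplified lifecycle stages (assumption: the real enum has more variants).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    UnderDevelopment,
    Stable,
}

/// Hypothetical registry entry for one feature.
struct FeatureSpec {
    key: &'static str,
    stage: Stage,
    default_enabled: bool,
}

const GOALS: FeatureSpec = FeatureSpec {
    key: "goals",
    stage: Stage::Stable,
    default_enabled: true,
};

/// An explicit user override wins; otherwise the spec's default applies.
fn is_enabled(spec: &FeatureSpec, overrides: &HashMap<String, bool>) -> bool {
    overrides
        .get(spec.key)
        .copied()
        .unwrap_or(spec.default_enabled)
}
```

In this model, users who never set the key pick up the new default, while an explicit `goals = false` override still disables the feature.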

Eric Traut · 2026-05-20 15:07:35 -07:00

0e9d222178

Remove ToolSearch feature toggle (#23389 )

## Summary
- mark `ToolSearch` as removed and ignore stale config writes for its
legacy key
- make search tool exposure depend only on model capability, not a
feature toggle
- remove app-server enablement support and prune now-obsolete test
coverage/setup
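
The intended behavior can be sketched as follows. The key string `"tool_search"`, the `REMOVED_KEYS` list, and both helper functions are hypothetical names used for illustration; the real legacy key and write path are not reproduced here.

```rust
use std::collections::BTreeMap;

/// Hypothetical list of removed feature keys; the real legacy key may differ.
const REMOVED_KEYS: &[&str] = &["tool_search"];

/// Applies a feature write, returning `false` when the key is removed and the
/// write was ignored. Ignoring instead of erroring keeps old configs loadable.
fn apply_feature_write(config: &mut BTreeMap<String, bool>, key: &str, value: bool) -> bool {
    if REMOVED_KEYS.iter().any(|removed| *removed == key) {
        return false;
    }
    config.insert(key.to_string(), value);
    true
}

/// Exposure depends only on model capability; no feature flag is consulted.
fn search_tool_exposed(model_supports_search: bool) -> bool {
    model_supports_search
}
```

A stale write to the removed key leaves the stored config untouched in this sketch, so it cannot influence whether the search tool is exposed.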

## Verification
- `cargo test -p codex-features`
- `cargo test -p codex-tools`
- `cargo test -p codex-core search_tool_requires_model_capability`
- `cargo test -p codex-app-server experimental_feature_enablement_set_`

## Notes
- This keeps the legacy config key as a no-op for compatibility while
removing the ability to toggle the behavior off cleanly.
- No developer-facing docs update outside the touched app-server README
was needed.

sayan-oai · 2026-05-19 01:24:39 +00:00

daa11820b0

133 Commits