330 Commits

  • [codex] wire process-owned code mode host into core (#30142)
    ## Summary
    
    - add the `code_mode_host` feature flag and select
    `ProcessOwnedCodeModeSessionProvider` in `CodeModeService` when enabled
    - initialize code-mode sessions lazily so a missing host reports a tool
    error without failing thread startup
    - resolve `codex-code-mode-host` beside the running Codex binary by
    default while preserving `CODEX_CODE_MODE_HOST_PATH` as an override
    - add unit and end-to-end coverage for host resolution and graceful
    missing-host behavior
    
    ## Why
    
    This wires the process-owned session client from #30112 into the core
    service behind an opt-in rollout gate. Packaged Codex installations can
    place the helper in the same `bin` directory as the main executable
    without relying on `PATH`, while development and custom installations
    can continue to override the helper path.
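
    The resolution order described above can be sketched as follows. This is a minimal illustration, not the actual Rust implementation: `resolve_host_path` is a hypothetical helper, and platform details such as an `.exe` suffix are ignored.

    ```python
    import os
    from pathlib import Path
    from typing import Mapping, Optional

    HOST_BINARY_NAME = "codex-code-mode-host"
    OVERRIDE_ENV_VAR = "CODEX_CODE_MODE_HOST_PATH"


    def resolve_host_path(current_exe: Path,
                          env: Optional[Mapping[str, str]] = None) -> Path:
        """Hypothetical helper: prefer the env override, else the sibling binary."""
        env = os.environ if env is None else env
        override = env.get(OVERRIDE_ENV_VAR)
        if override:
            # Development and custom installations point at their own helper.
            return Path(override)
        # Packaged installations keep the helper next to the main executable.
        return current_exe.parent / HOST_BINARY_NAME
    ```

    Because sessions are initialized lazily, a path that does not exist would surface as a tool error on first use rather than at thread startup.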
    
    ## Stack
    
    - Depends on #30112
    - Base branch: `cconger/process-owned-session-runtime-4-client`
    
    ## Validation
    
    Build `codex` and `codex-code-mode-host`, then run:
    `CODEX_CODE_MODE_HOST_PATH="$PWD/target/debug/codex-code-mode-host"
    ./target/debug/codex --enable code_mode_host`
  • [codex] add current time reminder delivery mode config (#30031)
    ```toml
    delivery_mode = "any_inference" # default
    delivery_mode = "after_user_or_tool_output" # new mode
    ```
    
    ## Validation
    - just test -p codex-core load_config_resolves_current_time_reminder
    - just test -p codex-core
    lock_contains_prompts_and_materializes_features
  • [codex] allow current time reminder interval to be set to 0 (#30029)
    A zero interval lets callers request a reminder at every
    otherwise-eligible inference boundary.
    
    ## Validation
    - just test -p codex-core load_config_resolves_current_time_reminder
  • core: raise token budget message limits (#29970)
    ## Why
    
    Token-budget reminder and guidance messages can require more than 1,000
    bytes to provide useful model-facing instructions. At the same time,
    these strings are injected into model-visible context, so their size
    must remain tightly bounded in response to the P0 context-growth
    concern. A 2,000-byte runtime cap provides additional room without
    allowing the substantially larger context growth of a 4 KiB limit.
    
    ## What changed
    
    - raises the runtime byte limits for token-budget reminder templates and
    guidance messages from 1,000 to 2,000
    - raises the corresponding JSON Schema `maxLength` values to 2,000
    - regenerates `codex-rs/core/config.schema.json`
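
    A byte cap and a schema `maxLength` are not the same measure: the runtime limit counts encoded bytes, while JSON Schema `maxLength` counts characters. The sketch below shows a byte-based check. The function name is hypothetical, and UTF-8 encoding is an assumption about how the cap is measured.

    ```python
    MAX_MESSAGE_BYTES = 2_000  # runtime cap described above


    def within_byte_limit(message: str, limit: int = MAX_MESSAGE_BYTES) -> bool:
        """Hypothetical check: compare the UTF-8 encoded size against the cap."""
        return len(message.encode("utf-8")) <= limit
    ```

    A message of 1,001 two-byte characters would pass a 2,000-character `maxLength` but fail this byte check.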
    
    ## Testing
    
    - `just test -p codex-features`
    - `just test -p codex-core load_config_resolves_token_budget_config
    load_config_rejects_invalid_token_budget_reminder_template`
    
    The full `codex-core` test run completed 2,858 tests successfully and
    encountered seven unrelated environment-sensitive failures involving
    Seatbelt/network environment assertions, MCP capability setup, and abort
    timing.
  • Represent MCP authentication with an enum (#29924)
    ## Why
    
    MCP authentication has distinct OAuth and ChatGPT-session flows.
    Representing that choice as `use_chatgpt_auth` makes one flow implicit
    and allows the configuration model to express the distinction only
    through a boolean.
    
    ChatGPT credential forwarding also needs a first-party trust boundary. A
    configurable `chatgpt_base_url` controls routing, but must not grant an
    MCP server permission to receive session credentials.
    
    This change builds on #29733, where the boolean was introduced.
    
    ## What changed
    
    - Replace `use_chatgpt_auth` with an `auth` field backed by the
    exhaustive `McpServerAuth` enum.
    - Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining
    the default.
    - Trust only the origin derived from the existing hardcoded
    `CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
    - Keep configured bearer tokens and authorization headers ahead of the
    selected authentication flow.
    - Update config writers, schema output, fixtures, and integration-test
    setup to use the enum.
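
    As a rough illustration of the enum-backed field, here is a Python stand-in for the Rust `McpServerAuth` type. Only the two string values and the OAuth default come from this PR; `parse_auth` is a hypothetical helper.

    ```python
    from enum import Enum
    from typing import Optional


    class McpServerAuth(Enum):
        OAUTH = "oauth"
        CHATGPT = "chatgpt"


    def parse_auth(raw: Optional[str]) -> McpServerAuth:
        """Hypothetical parser: a missing value means OAuth; unknown values fail."""
        if raw is None:
            return McpServerAuth.OAUTH
        # Enum lookup by value raises ValueError for anything else.
        return McpServerAuth(raw)
    ```

    Unlike a boolean, the enum names both flows explicitly and rejects unrecognized values instead of silently picking one.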
    
    ## Verification
    
    Integration coverage exercises the complete streamable HTTP startup path
    in two independent configurations:
    
    - A directly constructed MCP configuration verifies that matching an
    overridden `chatgpt_base_url` does not grant ChatGPT auth.
    - A persisted `config.toml` containing an attacker-controlled
    `chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary
    through normal config parsing.
    
    Both tests complete MCP initialization and tool listing and assert that
    the full captured request sequence contains no authorization headers.
    Separate integration coverage verifies that configured authorization
    takes precedence over ChatGPT auth.
  • Allow ChatGPT-hosted MCP servers to use session auth (#29733)
    ## Why
    
    ChatGPT session authentication was inferred from the reserved Codex Apps
    server name. That couples credential routing to Codex Apps-specific
    behavior and prevents other MCP endpoints hosted by ChatGPT from
    explicitly using the current session.
    
    The opt-in also needs a clear security boundary: an arbitrary MCP
    configuration must not be able to redirect ChatGPT credentials to
    another origin.
    
    ## What changed
    
    - Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
    `false`.
    - Honor the setting only when the parsed server URL has the same HTTP(S)
    origin as the configured `chatgpt_base_url`; otherwise remove the
    capability before startup.
    - Resolve bearer tokens and static or environment-backed authorization
    headers before selecting authentication, with configured authorization
    taking precedence over ChatGPT session auth.
    - Enable the setting for the built-in Codex Apps and hosted plugin
    runtime endpoints while keeping Codex Apps caching and tool
    normalization scoped to the reserved server.
    - Persist the setting through MCP config rewrite paths and expose it in
    the generated config schema.
    - Load the current login state for `codex mcp list` so reported auth
    status matches runtime behavior.
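
    The same-origin rule can be sketched as below. This is an assumption-laden illustration rather than the code in this PR: `origin` and `may_use_chatgpt_auth` are hypothetical names, and the example hosts used with them are placeholders.

    ```python
    from urllib.parse import urlsplit

    DEFAULT_PORTS = {"http": 80, "https": 443}


    def origin(url: str):
        """Return (scheme, host, port) for HTTP(S) URLs, else None."""
        parts = urlsplit(url)
        scheme = parts.scheme.lower()
        if scheme not in DEFAULT_PORTS or parts.hostname is None:
            return None
        # urlsplit lowercases the hostname; fill in the default port if absent.
        return (scheme, parts.hostname, parts.port or DEFAULT_PORTS[scheme])


    def may_use_chatgpt_auth(server_url: str, chatgpt_base_url: str,
                             opted_in: bool) -> bool:
        """Hypothetical gate: opt-in plus a matching HTTP(S) origin."""
        if not opted_in:
            return False
        server = origin(server_url)
        return server is not None and server == origin(chatgpt_base_url)
    ```

    Paths are ignored on purpose; only scheme, host, and port decide whether credentials may be forwarded.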
    
    ## Verification
    
    Core integration coverage exercises the complete streamable HTTP MCP
    startup path and verifies that:
    
    - a same-origin opted-in server receives the current ChatGPT access
    token;
    - an explicitly configured authorization header takes precedence;
    - a different-origin server completes MCP initialization and tool
    listing without receiving any ChatGPT authorization header.
  • core: add configurable <context_window_guidance> message (#29936)
    ## Why
    
    This PR adds a configurable `<context_window_guidance>` developer
    section immediately after `<context_window>`. Harness integrations need
    this section to give the model deployment-specific instructions for
    preparing for context-window transitions.
    
    ## What changed
    
    - Add an optional `features.token_budget.guidance_message` config with a
    1,000-byte runtime cap and generated schema support.
    - Render configured guidance as a developer `ContextualUserFragment`
    wrapped in `<context_window_guidance>` immediately after
    `<context_window>`.
    - Omit the section when guidance is unset, empty, or whitespace-only.
    - Preserve the resolved value in config locks and classify persisted
    guidance as contextual developer content.
    - Add integration coverage for rendered content and ordering.
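
    The omission rule can be illustrated with a small sketch. `render_guidance` is a hypothetical name, and the exact whitespace inside the rendered section is an assumption.

    ```python
    from typing import Optional


    def render_guidance(message: Optional[str]) -> Optional[str]:
        """Hypothetical renderer: None means the section is omitted."""
        if message is None or not message.strip():
            # Unset, empty, and whitespace-only guidance produce no section.
            return None
        return f"<context_window_guidance>\n{message}\n</context_window_guidance>"
    ```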
  • [codex] nest sleep config under current time reminder (#29910)
    ## Summary
    
    - move sleep tool enablement from top-level `[features].sleep_tool` to
    `[features.current_time_reminder].sleep_tool`
    - remove the standalone `Feature::SleepTool` flag and gate `clock.sleep`
    from resolved current-time configuration
    - update config schema, config-lock materialization, and existing sleep
    coverage
    
    Stacked on #29907.
  • [codex] Remove auto-compaction opt-out (#29815)
    ## Summary
    
    - remove the default-on `auto_compaction` feature flag and generated
    config schema entries
    - restore unconditional pre-turn, model-switch/hash, and mid-turn
    automatic compaction
    - expose `new_context` whenever token-budget tooling is enabled
    - remove the disabled-auto-compaction integration coverage introduced by
    #28260
    
    ## Motivation
    
    Roll back the internal auto-compaction escape hatch added in #28260.
    Automatic compaction should no longer be suppressible with `--disable
    auto_compaction`; existing manual `/compact` behavior remains unchanged.
    
    ## Testing
    
    - `just write-config-schema`
    - `just test -p codex-features` — 53 passed
    - `just test -p codex-core 'suite::compact::'` — 36 passed
    - `just test -p codex-core
    suite::token_budget::new_context_tool_starts_new_window_before_follow_up`
    — 1 passed
    - `just fix -p codex-core -p codex-features`
    - `just fmt`
    - `just test -p codex-core` — 2,778 passed, 59 failed, 16 skipped;
    failures were outside the changed compaction paths and were dominated by
    missing first-party test binaries and shell-snapshot timeouts
  • chore(core) rm AskForApproval::OnFailure (#28418)
    ## Summary
    Deletes the OnFailure variant of the `AskForApproval` enum. This option
    has been deprecated since #11631.
    
    ## Testing
    - [x] Tests pass
  • [core] debounce current-time reminders by elapsed time (#29659)
    ## Summary
    - rename `reminder_interval_model_requests` to
    `reminder_interval_seconds`
    - read the configured time provider before every model request and
    inject a reminder only after the configured number of seconds has
    elapsed
    - preserve immediate first delivery and forced delivery after compaction
    changes the context window
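
    A minimal sketch of the debounce logic, assuming a monotonic clock reading in seconds. `ReminderDebouncer` and its method names are hypothetical and do not mirror the Rust types.

    ```python
    from typing import Optional


    class ReminderDebouncer:
        """Hypothetical sketch of elapsed-time debouncing for reminders."""

        def __init__(self, interval_seconds: float) -> None:
            self.interval_seconds = interval_seconds
            self.last_delivered_at: Optional[float] = None
            self.force_next = False

        def mark_context_window_changed(self) -> None:
            # Compaction changed the window, so the next request gets a reminder.
            self.force_next = True

        def should_deliver(self, now: float) -> bool:
            due = (
                self.last_delivered_at is None  # immediate first delivery
                or self.force_next
                or now - self.last_delivered_at >= self.interval_seconds
            )
            if due:
                self.last_delivered_at = now
                self.force_next = False
            return due
    ```

    With an interval of zero, the elapsed-time comparison is always true, so every eligible request would receive a reminder.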
    
    ## Tests
    - `just test -p codex-core current_time_reminder`
  • mcp: accept foreign absolute cwd for remote stdio (#29493)
    ## Why
    
    Remote stdio MCP servers can run in an environment whose path convention
    differs from the Codex host. A Windows cwd such as
    `C:\Users\openai\share` is absolute for the executor but was rejected by
    a POSIX orchestrator.
    
    Built on #29501, now merged, which only clarifies the host-native
    `PathUri` constructor name.
    
    ## What changed
    
    - Deserialize MCP cwd values as `LegacyAppPathString` so config does not
    apply host path rules.
    - Interpret that spelling as host-native for local launches and convert
    it to `PathUri` at executor launch.
    - Skip host filesystem and command resolution checks for remote stdio in
    `codex doctor`.
    - Add host-independent config and executor-boundary coverage using the
    foreign path convention for each test platform.
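
    The underlying problem can be reproduced with Python's pure path classes, which apply one platform's rules regardless of the host. This only illustrates why host path rules reject a foreign cwd; it says nothing about how `LegacyAppPathString` or `PathUri` are implemented, and the helper name is hypothetical.

    ```python
    from pathlib import PurePosixPath, PureWindowsPath


    def is_absolute_in_any_convention(cwd: str) -> bool:
        """Hypothetical check: absolute under POSIX or Windows path rules."""
        return (PurePosixPath(cwd).is_absolute()
                or PureWindowsPath(cwd).is_absolute())
    ```

    Under POSIX rules alone, `C:\Users\openai\share` is a relative path with an unusual name, which matches the rejection described in this PR.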
    
    ## Validation
    
    - `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p
    codex-rmcp-client` (408 passed)
    - `just test -p codex-cli -p codex-rmcp-client` (372 passed)
    - `cargo check --workspace --tests`
    - `just test` (11,311 passed; 43 unrelated environment/timing failures)
    - `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p
    codex-mcp-extension -p codex-rmcp-client -p codex-tui`
  • Register full CDP requirements feature (#28769)
    Register the CDP requirements feature flag.
  • [codex] configure rollout budget reminder thresholds (#29423)
    ## Summary
    
    Instead of:
    
        reminder_interval_tokens = 65_536
    
    allow users to configure explicit remaining-token reminder thresholds:
    
        reminder_at_remaining_tokens = [65_536, 32_768, 16_384, 8_192, 4_096, 2_048, 1_024, 512]
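
    This description does not say how crossings are detected. One plausible reading, shown purely as an assumption, is that a reminder fires for each threshold the remaining-token count drops to or below between two observations. `crossed_thresholds` is a hypothetical helper.

    ```python
    from typing import Iterable, List


    def crossed_thresholds(thresholds: Iterable[int],
                           previous_remaining: int,
                           current_remaining: int) -> List[int]:
        """Thresholds t with current_remaining <= t < previous_remaining."""
        return sorted(
            (t for t in thresholds
             if current_remaining <= t < previous_remaining),
            reverse=True,
        )
    ```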
    
    ## Validation
    
    - CARGO_INCREMENTAL=0 just test -p codex-core rollout_budget: 9 passed
    - just fix -p codex-core
    - just fmt
  • Simplify multi-agent mode controls (#29324)
    ## Why
    
    Multi-agent delegation policy was split across `multiAgentMode`,
    `features.multi_agent_mode`, and `usage_hint_enabled`. These controls
    could disagree: a requested mode could be downgraded by the feature
    flag, and disabling usage hints also disabled mode instructions.
    
    Some clients also need multi-agent tools without adding
    delegation-policy text to model context. The previous two-mode API could
    not express that directly.
    
    ## What changed
    
    `multiAgentMode` is now the only live delegation-policy control:
    
    | Mode | Behavior |
    | --- | --- |
    | `none` | Keep multi-agent tools available without adding mode instructions. |
    | `explicitRequestOnly` | Only delegate after an explicit user request. |
    | `proactive` | Delegate when parallel work materially improves speed or quality. |
    
    - new threads default to `explicitRequestOnly`; omitting the mode on
    later turns keeps the current value
    - thread start, resume, fork, and settings responses always report the
    concrete current mode instead of `null`
    - mode selection remains sticky across turns and resume
    - usage-hint text no longer controls whether mode instructions apply
    - `features.multi_agent_mode` and `usage_hint_enabled` remain accepted
    as ignored compatibility settings so existing configs continue to load
    - app-server documentation and generated schemas describe the three-mode
    API
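
    The sticky-selection rule can be sketched as below. The three mode strings come from the table above; `MultiAgentMode` as a Python enum and `next_mode` are hypothetical stand-ins for the real types.

    ```python
    from enum import Enum
    from typing import Optional


    class MultiAgentMode(Enum):
        NONE = "none"
        EXPLICIT_REQUEST_ONLY = "explicitRequestOnly"
        PROACTIVE = "proactive"


    DEFAULT_MODE = MultiAgentMode.EXPLICIT_REQUEST_ONLY


    def next_mode(current: Optional[MultiAgentMode],
                  requested: Optional[MultiAgentMode]) -> MultiAgentMode:
        """Hypothetical resolver: request wins, else keep current, else default."""
        if requested is not None:
            return requested
        if current is not None:
            return current
        return DEFAULT_MODE
    ```

    The result is never `None`, which matches responses always reporting a concrete mode.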
    
    ## Tests
    
    - `just test -p codex-core multi_agent_mode`
    - `just test -p codex-core multi_agent_v2_config_from_feature_table`
    - `just test -p codex-core spawn_agent_description`
    - `just test -p codex-features`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server multi_agent_mode`
  • [codex] Add internal auto-compaction opt-out (#28260)
    ## Summary
    
    - add a default-on `auto_compaction` feature flag as an internal escape
    hatch
    - skip pre-turn, model-switch/hash, and mid-turn automatic compaction
    when the flag is disabled
    - preserve manual `/compact` behavior and surface the existing
    context-window error when the provider runs out of room
    - add integration coverage for disabled pre-turn and mid-turn compaction
    
    ## Motivation
    
    Long-running SPO optimization rollouts need the option to preserve their
    full context and fail on context exhaustion instead of entering another
    compaction window. This deliberately uses the existing feature-flag
    mechanism rather than adding a dedicated public config or app-server
    API.
    
    Disable it with:
    
    ```sh
    codex --disable auto_compaction
    ```
    
    ## Testing
    
    - `just test -p codex-features` — 51 passed
    - `just test -p codex-core auto_compaction_feature_disabled` — 2 passed
    - `just fix -p codex-core -p codex-features`
    - `just write-config-schema`
    - `just test -p codex-core` — the new compaction tests passed; the
    overall local run had 54 unrelated environment failures, primarily
    missing first-party test binaries and shell-snapshot timeouts
  • [codex] add configurable token budget compaction reminder (#29255)
    ## Why
    
    The token-budget feature reports coarse remaining-context milestones,
    but it does not give the model a configurable wrap-up prompt before
    automatic compaction. A strict threshold-crossing check can also miss
    resumed or reconfigured windows that are already inside the threshold.
    
    ## What changed
    
    - Add structured `[features.token_budget]` configuration for an absolute
    `reminder_threshold_tokens` and bounded `reminder_message_template`;
    `{n_remaining}` is expanded when the reminder is delivered.
    - Compute remaining tokens against the next effective auto-compaction
    boundary, including scoped `body_after_prefix` accounting and the full
    context-window limit.
    - Make reminder delivery level-triggered before and after sampling, with
    one-shot state owned by `AutoCompactWindow` and re-armed on compaction,
    `new_context`, restore, or history replacement.
    - Leave the existing initial full-window token-budget context, 25/50/75%
    notices, and token-budget tools unchanged.
    - Persist the resolved feature configuration in the session config lock
    and regenerate the config schema.
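
    The level-triggered, one-shot behavior can be sketched as follows. `CompactionReminder` is a hypothetical class that stands in for the state owned by `AutoCompactWindow`; only the `{n_remaining}` placeholder and the re-arm events come from this PR.

    ```python
    from typing import Optional


    class CompactionReminder:
        """Hypothetical one-shot reminder that checks the level, not the edge."""

        def __init__(self, threshold_tokens: int, template: str) -> None:
            self.threshold_tokens = threshold_tokens
            self.template = template
            self.armed = True

        def rearm(self) -> None:
            # Called on compaction, new_context, restore, or history replacement.
            self.armed = True

        def check(self, n_remaining: int) -> Optional[str]:
            if self.armed and n_remaining <= self.threshold_tokens:
                self.armed = False
                # Plain replacement avoids errors on other braces in the template.
                return self.template.replace("{n_remaining}", str(n_remaining))
            return None
    ```

    Because the check looks at the current level, a resumed window that is already below the threshold still gets its reminder, which a strict crossing check would miss.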
    
    ## Validation
    
    - `just test -p codex-core token_budget`
    - `just test -p codex-core
    token_budget_reminder_emits_after_crossing_compaction_threshold`
    - `just test -p codex-core auto_compact_window`
    - `just test -p codex-core
    lock_contains_prompts_and_materializes_features`
    - `just test -p codex-features`
    - `just test -p codex-config`
  • Add config toggles for orchestrator skills and MCP (#28942)
    ## Why
    
    Orchestrator-provided skills and Codex Apps MCP tools add model-visible
    instructions, resources, and tools beyond the local workspace. Hosts
    need config-level switches to disable those orchestrator-owned surfaces
    independently, without disabling regular skills or regular MCP servers.
    
    ## What changed
    
    - Adds `[orchestrator.skills].enabled` and `[orchestrator.mcp].enabled`
    config entries, both defaulting to `true`.
    - Includes the new settings in `config.schema.json` and in the config
    lock so resolved thread configuration preserves the same orchestrator
    exposure decisions.
    - Threads `orchestrator.skills.enabled` through the app-server skills
    extension so disabled orchestrator skills do not expose the `skills`
    namespace or inject orchestrator skill context.
    - Gates Codex Apps MCP exposure, app instructions, and app auth
    eligibility on `orchestrator.mcp.enabled` while leaving non-Codex-Apps
    MCP tools available.
    - Updates the thread-manager sample config to disable both
    orchestrator-owned surfaces.
    
    ## Verification
    
    - Added config parsing, loading, defaulting, and schema coverage for the
    new settings.
    - Added MCP exposure coverage that `orchestrator.mcp.enabled = false`
    removes Codex Apps tools while preserving regular MCP tools.
    - Added app-server coverage that `orchestrator.skills.enabled = false`
    prevents orchestrator skill tools, prompts, and resource reads from
    reaching the model turn.
  • Add indexed web search mode (#28489)
    ## Summary
    
    - Add `web_search = "indexed"` alongside `disabled`, `cached`, and
    `live`.
    - Use that same resolved mode for both hosted and standalone web search.
    - For hosted search, send `index_gated_web_access: true` with external
    web access enabled only when `indexed` is selected.
    - For standalone search, preserve the existing boolean wire values for
    existing modes (`cached` maps to `false` and `live` to `true`) and send
    `"indexed"` only for `indexed`; `disabled` keeps the tool unavailable.
    - Carry the mode through managed configuration requirements and
    generated schemas.
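
    The standalone wire mapping described above can be summarized in a short sketch. `standalone_wire_value` is a hypothetical name, and returning `None` for `disabled` is only a way to model the tool being unavailable.

    ```python
    from typing import Optional, Union


    def standalone_wire_value(mode: str) -> Optional[Union[bool, str]]:
        """Hypothetical mapping from `web_search` mode to the wire value."""
        if mode == "disabled":
            return None  # the tool is not exposed, so nothing is sent
        mapping = {"cached": False, "live": True, "indexed": "indexed"}
        if mode not in mapping:
            raise ValueError(f"unknown web_search mode: {mode!r}")
        return mapping[mode]
    ```

    Keeping the boolean values for `cached` and `live` means existing requests are unchanged unless `indexed` is selected.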
    
    ## Why
    
    Indexed search provides a middle ground between cached-only search and
    unrestricted live page fetching. Search queries can remain live while
    direct page fetches are limited to URLs admitted by the server.
    
    The existing `web_search` setting remains the single source of truth, so
    hosted and standalone executors cannot drift into different access
    modes. Without an explicit `indexed` selection, the existing
    model-visible tool and request shapes are unchanged.
    
    ```toml
    web_search = "indexed"
    
    [features]
    standalone_web_search = true
    ```
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-api` (`126 passed`)
    - `just test -p codex-web-search-extension` (`7 passed`)
    - `just test -p codex-core
    code_mode_can_call_indexed_standalone_web_search` (`1 passed`)
    - Focused configuration, hosted request, standalone request, and
    managed-requirement coverage is included in the PR; remaining suites run
    in CI.
    
    The full workspace test suite was not run locally.
  • Add per-turn multi-agent mode (#28685)
    ## Why
    
    Multi-agent v2 currently carries an explicit-request-only delegation
    rule in its static usage hint. That provides a safe default, but it
    prevents clients from selecting proactive delegation per turn without
    changing static guidance or rewriting prior model context.
    
    This change makes delegation mode a session selection that can be
    updated through `turn/start`, while deriving the effective model-visible
    mode separately for each turn. Eligible multi-agent v2 turns remain
    explicit-request-only unless proactive mode is both selected and
    enabled.
    
    ## What changed
    
    - Add the experimental `turn/start.multiAgentMode` parameter with
    `explicitRequestOnly` and `proactive` values. Omission retains the
    loaded session's current optional selection.
    - Add the default-off `features.multi_agent_mode` feature gate. Eligible
    multi-agent v2 turns use the selected mode when enabled; an unset
    selection or disabled gate resolves to `explicitRequestOnly`.
    - Treat mode prompting as inapplicable for multi-agent v1 and other
    unsupported session configurations, producing no multi-agent mode
    developer message rather than rejecting the turn.
    - Move the explicit-request-only rule out of the static v2 usage hint
    and into a bounded, tagged developer context fragment.
    - Emit the effective mode in initial context and only when that
    effective mode changes on later turns.
    - Persist the effective mode in `TurnContextItem` as the durable
    baseline for resume and context-update comparisons.
    
    Historical rollout items are not rewritten. Later mode developer
    messages establish the current rule incrementally.
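
    The derivation of the effective mode can be sketched as below. The function and parameter names are hypothetical; `None` models the case where no mode developer message is produced.

    ```python
    from typing import Optional

    EXPLICIT_REQUEST_ONLY = "explicitRequestOnly"


    def effective_mode(selected: Optional[str], gate_enabled: bool,
                       is_multi_agent_v2: bool) -> Optional[str]:
        """Hypothetical resolver for the model-visible mode of one turn."""
        if not is_multi_agent_v2:
            return None  # mode prompting does not apply; the turn is not rejected
        if gate_enabled and selected is not None:
            return selected
        # Unset selection or disabled gate falls back to the safe default.
        return EXPLICIT_REQUEST_ONLY
    ```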
    
    ## Not covered
    
    - Initial selection through `thread/start` and selected-mode reporting
    from thread lifecycle/settings APIs; those are isolated in the stacked
    #28792.
    - A TUI control or slash command for selecting the mode.
    - Persisting a preferred mode to `config.toml`; selection remains
    session/turn scoped.
    - Changes to multi-agent concurrency limits, tool availability, or model
    catalog capability declarations.
    - Rewriting historical rollout prompt items. Cold resume restores the
    latest persisted effective mode when available while leaving historical
    developer messages intact.
    
    ## Verification
    
    - `CARGO_INCREMENTAL=0 just test -p codex-core multi_agent_mode`
    - Focused app-server coverage verifies that `turn/start.multiAgentMode`
    produces proactive developer instructions for an eligible v2 turn.
    
    ## Stack
    
    Followed by #28792, which adds `thread/start` initialization and
    lifecycle/settings observability.
  • [2/3] core: track starting environments in snapshots (#28683)
    ## Why
    
    Remote environments may still be resolving when Codex creates a session
    or turn. Waiting for the existing all-or-nothing environment snapshot
    can hold startup until the selected environment is usable.
    
    Behind the default-off `deferred_executor` feature, let callers take a
    useful snapshot immediately: completed environments remain available
    normally, while unfinished environments are reported without blocking
    startup. With the feature disabled, snapshots preserve the existing
    blocking behavior.
    
    Depends on #28674.
    
    ## What changed
    
    - Store one ordered list of selected environments in
    `ThreadEnvironments`. Each selection owns one shared resolution that
    produces its complete `TurnEnvironment`.
    - Start new resolutions in the background with `remote_handle()`,
    allowing snapshots and the future wait tool to share the same result
    while cancellation follows the retained handles.
    - Make `snapshot()` a read-only operation: nonblocking snapshots collect
    completed resolutions and retain handles for unfinished ones, while
    blocking snapshots await every resolution.
    - Replace completed failed resolutions from the current manager entry
    and log when failed environments are omitted.
    - Return attached and starting environments as a point-in-time view, and
    count starting environments when deciding whether a snapshot is
    local-only.
    - Keep existing consumers attached-only. `to_selections()` derives from
    attached environments, so child threads do not inherit an environment
    that is still starting.
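
    The nonblocking snapshot can be approximated with futures. This is a loose analogy, not the Rust design: `nonblocking_snapshot` is hypothetical, and `concurrent.futures.Future` stands in for the shared resolution handle.

    ```python
    from concurrent.futures import Future
    from typing import Any, List, Tuple


    def nonblocking_snapshot(
        resolutions: List[Tuple[str, Future]],
    ) -> Tuple[List[Tuple[str, Any]], List[str]]:
        """Split selections into attached results and still-starting ids."""
        attached: List[Tuple[str, Any]] = []
        starting: List[str] = []
        for env_id, future in resolutions:
            if not future.done():
                starting.append(env_id)  # reported without blocking
            elif future.cancelled() or future.exception() is not None:
                continue  # failed environments are omitted from the view
            else:
                attached.append((env_id, future.result()))
        return attached, starting
    ```

    The function only reads the futures, so repeated snapshots observe the same shared results as any later waiter.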
    
    ## Test plan
    
    - `just test -p codex-core environment_selection`
    - `just test -p codex-core
    deferred_executor_reaches_model_before_remote_environment_is_ready`
    
    ## Landing note
    
    Keep `deferred_executor` disabled for slow-starting executors until
    configurable `environment/add` connection timeouts and caller support
    land. When enabled, an environment that attaches after session startup
    may remain absent from environment-derived model context, tools,
    instructions, skills, and related state until follow-up refresh work
    lands.
  • [codex] Assign response item IDs when recording history (#28814)
    ## Why
    
    Client-created response items enter history without IDs, so their
    identity is lost across rollout persistence and resume. IDs should be
    assigned once at the history-recording boundary, while IDs returned by
    the server must remain unchanged.
    
    The Responses API validates item IDs using type-specific prefixes.
    Locally generated IDs therefore use the matching prefix plus a
    hyphenated UUIDv7, keeping them valid while distinguishable from
    server-generated IDs. Because this changes persisted history and
    provider request shapes, the behavior is opt-in behind the
    under-development `item_ids` feature. Compaction triggers remain request
    controls whose API shape does not accept an ID.
    
    ## What changed
    
    - Register the disabled-by-default `item_ids` feature and expose it in
    `config.schema.json`.
    - Make supported optional `ResponseItem` IDs serializable and expose
    them in the generated app-server schemas.
    - When `item_ids` is enabled, assign an ID during conversation-history
    preparation if an item has no ID.
    - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API
    item conventions.
    - Preserve existing server IDs without rewriting them.
    - Persist assigned IDs in rollouts and include them in subsequent
    Responses requests.
    - Remove the unsupported ID field from `CompactionTrigger` and document
    why it has no ID.
    - Add integration coverage for enabled ID persistence, preservation of
    server IDs, and omission of generated IDs while the feature is disabled.
    
    `prepare_conversation_items_for_history` is the single response-item ID
    allocation boundary.
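
    A sketch of the assign-if-missing rule with a hand-rolled UUIDv7. The `msg` prefix, the underscore separator, and the dict-shaped item used with it are illustrative assumptions; the actual prefixes follow the Responses API conventions and are not listed in this description.

    ```python
    import os
    import time
    import uuid


    def uuid7() -> uuid.UUID:
        """Build a UUIDv7: 48-bit ms timestamp, version, variant, random bits."""
        unix_ms = time.time_ns() // 1_000_000
        rand = int.from_bytes(os.urandom(10), "big")
        value = (unix_ms & 0xFFFF_FFFF_FFFF) << 80   # timestamp
        value |= 0x7 << 76                           # version 7
        value |= ((rand >> 62) & 0xFFF) << 64        # rand_a (12 bits)
        value |= 0b10 << 62                          # RFC 4122 variant
        value |= rand & ((1 << 62) - 1)              # rand_b (62 bits)
        return uuid.UUID(int=value)


    def assign_id(item: dict, prefix: str) -> dict:
        """Hypothetical helper: keep an existing id, else generate one."""
        if item.get("id"):
            return item  # server-provided ids are never rewritten
        return {**item, "id": f"{prefix}_{uuid7()}"}
    ```

    Running the assignment in one place means an item gets its ID once, and later passes leave it untouched.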
    
    ## Test plan
    
    - `just test -p codex-features`
    - `just test -p codex-core
    response_item_ids_persist_across_resume_and_preserve_server_ids`
    - `just test -p codex-core
    non_openai_responses_requests_omit_item_turn_metadata`
    - `just test -p codex-core
    resize_all_images_prepares_failures_before_history_insertion`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
  • Always use AVAS for realtime WebRTC calls (#28856)
    ## Summary
    
    - Remove the realtime `architecture` selector from core protocol,
    app-server protocol, config parsing, generated schemas, and callers.
    - Always create WebRTC realtime calls with the AVAS query params:
    `intent=quicksilver&architecture=avas`.
    - Keep direct websocket realtime behavior on the existing config/default
    path, while WebRTC starts without an explicit version now default to
    realtime v1 because AVAS requires v1.
    
    ## Notes
    
    - WebRTC realtime now means AVAS. If a caller explicitly asks to start
    WebRTC with realtime v2, Codex rejects that request because the AVAS
    WebRTC path only supports realtime v1. Websocket realtime is separate
    and can still use realtime v2.
    - The old `[realtime] architecture = "realtimeapi" | "avas"` config knob
    is removed. Local configs that still set it will need to delete that
    line.
    - Some app-server tests that were only trying to exercise realtime v2
    protocol behavior now use websocket transport, because WebRTC is
    intentionally locked to AVAS/v1. Separate WebRTC tests cover the AVAS
    query params, v1 startup, SDP flow, and sideband join.
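
    The version rules in these notes can be sketched as below. The transport and version strings and the function name are hypothetical; only the behavior (WebRTC defaults to v1 and rejects v2, websocket is unchanged) comes from this PR.

    ```python
    from typing import Optional


    def resolve_realtime_version(transport: str,
                                 requested: Optional[str],
                                 configured_default: str) -> str:
        """Hypothetical resolver for the realtime protocol version."""
        if transport == "webrtc":
            if requested not in (None, "v1"):
                # The AVAS WebRTC path only supports realtime v1.
                raise ValueError("WebRTC realtime requires v1")
            return "v1"
        # Websocket keeps the existing config/default path.
        return requested or configured_default
    ```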
    
    ## Validation
    
    - Merged fresh `origin/main` at `83e6a786a2`.
    - `just fmt`
    - `just write-config-schema`
    - `just write-app-server-schema`
    - `git diff --check`
    - `just test -p codex-api -p codex-core -p codex-app-server-protocol -p
    codex-app-server realtime` (176 passed)
    - `just test -p codex-protocol -p codex-config` (413 passed)
  • [codex] Remove child AGENTS.md prompt experiment (#28993)
    ## Why
    
    `child_agents_md` is a disabled, under-development experiment that adds
    a second model-visible explanation of hierarchical `AGENTS.md` behavior.
    Keeping it leaves unused prompt, configuration, documentation, and test
    surface.
    
    ## What changed
    
    - remove the `ChildAgentsMd` feature and `child_agents_md` config schema
    entry
    - remove the hierarchical prompt asset, export, and instruction
    injection
    - remove feature-specific tests and documentation
    - keep the generic unstable-feature warning coverage using
    `apply_patch_streaming_events`
    
    Normal project `AGENTS.md` discovery and composition are unchanged.
    
    ## Testing
    
    - `just test -p codex-features`
    - `just test -p codex-prompts`
    - `just test -p codex-core agents_md`
    - `just test -p codex-core unstable_features_warning`
  • feat: opt ChatGPT auth into agent identity (#19049)
    ## Stack
    
    This is PR 2 of the simplified HAI single-run-task stack:
    
    - [#19047](https://github.com/openai/codex/pull/19047) Agent Identity
    assertion and task-registration primitives, including the shared
    run-task helper used by existing Agent Identity JWT auth.
    - [#19049](https://github.com/openai/codex/pull/19049)
    Disabled-by-default ChatGPT auth opt-in that provisions/reuses persisted
    Agent Identity runtime auth and its single run task.
    - [#19051](https://github.com/openai/codex/pull/19051) Run-scoped
    provider auth that uses one backend-owned task id for first-party
    inference and compaction requests.
    
    [#19054](https://github.com/openai/codex/pull/19054) collapsed out of
    the active stack because the simplified design no longer needs a
    separate background/control-plane task helper.
    
    ## Summary
    
    This PR adds the disabled-by-default path for normal ChatGPT-login Codex
    sessions to obtain Agent Identity runtime auth through the Codex
    backend. Existing Agent Identity JWT startup mode remains a separate
    path and does not require the feature flag.
    
    What changed:
    
    - adds the experimental `use_agent_identity` feature flag and config
    schema entry
    - adds an explicit `AgentIdentityAuthPolicy` so call sites choose
    `JwtOnly` or `ChatGptAuth` instead of passing a bare boolean
    - stores standalone Agent Identity JWT credentials separately from
    backend-registered Agent Identity records
    - persists the registered Agent Identity record, private key, and single
    run task id in `auth.json` so process restarts reuse the same identity
    - derives the agent/task registration base URL from ChatGPT/Codex auth
    config while keeping JWT JWKS lookup separate
    - provisions and caches ChatGPT-derived Agent Identity runtime auth when
    `use_agent_identity` is enabled
    - reuses the shared run-task registration helper from PR1 rather than
    adding a second task-registration path
    
    This PR intentionally does not switch model inference over to
    `AgentAssertion` auth. The provider-auth integration lands in the next
    PR.
    
    ## Testing
    
    - `just test -p codex-login`
  • Add Config for Time Reminders (varlatency 1/n) (#28822)
    ## Summary
    
    Example:
    
    > [features.current_time_reminder]
    enabled = true
    reminder_interval_model_requests = 1
    clock_source = "system"
    
    ## Testing
    
    - `just test -p codex-core varlatency`
    - `just test -p codex-core
    lock_contains_prompts_and_materializes_features`
    - `just fix -p codex-core -p codex-config -p codex-features`
  • [codex] add rollout token budget configuration (varlength 1/N) (#28746)
    ## What
    
    This PR defines the structured configuration contract for shared rollout
    token budgets (across ALL agent threads under 1 rollout).
    
    ```toml
    [features.rollout_budget]
    enabled = true
    limit_tokens = 100000
    reminder_interval_tokens = 10000
    sampling_token_weight = 1.0
    prefill_token_weight = 0.1
    ```
    
    The reminder interval defaults to 10% of the rollout limit. Sampling and
    prefill weights default to `1.0`.
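
    The defaulting rule above can be sketched as follows. This is a
    minimal illustration, not the code in this PR: `RawRolloutBudget`,
    `ResolvedRolloutBudget`, and `resolve_rollout_budget` are hypothetical
    names, and rounding the 10% default down with integer division is an
    assumption.

    ```rust
    /// Hypothetical raw form of `[features.rollout_budget]`.
    /// Optional fields are `None` when the key is omitted.
    pub struct RawRolloutBudget {
        pub limit_tokens: u64,
        pub reminder_interval_tokens: Option<u64>,
        pub sampling_token_weight: Option<f64>,
        pub prefill_token_weight: Option<f64>,
    }

    /// Hypothetical resolved form with every default applied.
    #[derive(Debug, PartialEq)]
    pub struct ResolvedRolloutBudget {
        pub limit_tokens: u64,
        pub reminder_interval_tokens: u64,
        pub sampling_token_weight: f64,
        pub prefill_token_weight: f64,
    }

    pub fn resolve_rollout_budget(raw: &RawRolloutBudget) -> ResolvedRolloutBudget {
        ResolvedRolloutBudget {
            limit_tokens: raw.limit_tokens,
            // Default interval: 10% of the rollout limit (rounding down is assumed).
            reminder_interval_tokens: raw
                .reminder_interval_tokens
                .unwrap_or(raw.limit_tokens / 10),
            // Both weights default to 1.0.
            sampling_token_weight: raw.sampling_token_weight.unwrap_or(1.0),
            prefill_token_weight: raw.prefill_token_weight.unwrap_or(1.0),
        }
    }
    ```

    With only `limit_tokens = 100000` set, this sketch resolves the
    reminder interval to 10,000 tokens and both weights to `1.0`.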
    
    ## Scope
    
    This PR only defines and validates configuration. It does not track
    usage, inject reminders, or stop a rollout. Accounting and reminders are
    implemented in the stacked follow-up #28494.
    
    The existing `token_budget` feature remains unchanged. `rollout_budget`
    has its own feature key and configuration type.
    
    ## Tests
    
    The config test verifies that the structured fields resolve into
    `RolloutBudgetConfig` and do not enable the existing `token_budget`
    feature.
    
    Local checks:
    
    - `just write-config-schema`
    - `just test -p codex-core load_config_resolves_rollout_budget`
    - `cargo check -p codex-thread-manager-sample`
    - `git diff --check`
    
    The full workspace test suite was not run locally.
  • Expose selected namespaces as direct model tools (#28825)
    ## Why
    
    Some tools, such as history and notes, must remain top-level when MCP
    deferral is enabled while staying unavailable through code-mode `exec`.
    
    ## What changed
    
    - Added `features.code_mode.direct_only_tool_namespaces`.
    - Classified matching MCP tools as `DirectModelOnly`.
    - Kept those tools top-level in `code_mode_only`.
    - Excluded them from `tool_search` deferral and the nested `exec`
    surface.
    - Updated the generated config schema.
    
    ## Validation
    
    - `code_mode_only_exposes_direct_model_only_mcp_namespaces`
    - `load_config_resolves_code_mode_config`
  • [ez][codex-rs] Support apps._default.default_tools_approval_mode (#27965)
    [from codex]
    
    ## Summary
    
    - add `default_tools_approval_mode` to `[apps._default]` and expose it
    through app-server v2 `config/read`
    - apply it after managed, per-tool, and per-app approval settings,
    before the built-in `auto` fallback
    - document the precedence, regenerate config/app-server schemas, and add
    unit plus end-to-end approval coverage
    
    ## Configuration
    
    ```toml
    [apps._default]
    default_tools_approval_mode = "prompt"
    ```
    
    The effective precedence is managed requirements, tool-specific
    `approval_mode`, app-specific `default_tools_approval_mode`,
    `apps._default.default_tools_approval_mode`, then `auto`.
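
    As an illustration of that precedence, a first-match-wins chain might
    look like the sketch below. The names `ApprovalMode`,
    `ApprovalSources`, and `effective_approval_mode` are hypothetical, the
    enum lists only the two modes named in this description, and treating
    managed requirements as a plain optional value is a simplifying
    assumption.

    ```rust
    /// Only the two modes named in this description; the real enum may have more.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum ApprovalMode {
        Auto,
        Prompt,
    }

    /// Hypothetical bundle of the optional sources, highest priority first.
    #[derive(Default)]
    pub struct ApprovalSources {
        pub managed: Option<ApprovalMode>,
        pub tool: Option<ApprovalMode>,
        pub app: Option<ApprovalMode>,
        pub apps_default: Option<ApprovalMode>,
    }

    /// Returns the first source that is set, falling back to `Auto`.
    pub fn effective_approval_mode(sources: &ApprovalSources) -> ApprovalMode {
        sources
            .managed
            .or(sources.tool)
            .or(sources.app)
            .or(sources.apps_default)
            .unwrap_or(ApprovalMode::Auto)
    }
    ```

    Under these assumptions, `[apps._default]` only takes effect when no
    managed, per-tool, or per-app value is present.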
    
    ## Test plan
    
    - `just write-config-schema`
    - `just write-app-server-schema`
    - `just write-app-server-schema --experimental`
    - `just test -p codex-core app_tool_policy`
    - `just test -p codex-core mcp_turn_metadata`
    - `just test -p codex-config`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server config_read_includes_apps`
    - `just fix -p codex-config -p codex-core -p codex-app-server-protocol
    -p codex-app-server`
    - `just fmt`
  • PAC 1 - Add system proxy feature config surface (#26706)
    ## Summary
    
    Introduces the default-off `respect_system_proxy` feature flag used to
    gate first-class system PAC/proxy support for Codex-owned native
    clients.
    
    With the feature disabled or absent, behavior remains unchanged. This PR
    establishes the configuration and managed-requirement surface; proxy
    discovery and request routing are implemented by follow-up PRs.
    
    ## Configuration
    
    User configuration uses the standard boolean feature form:
    
    ```toml
    [features]
    respect_system_proxy = true
    ```
    
    Managed feature requirements use the corresponding boolean key. The
    effective runtime configuration is exposed as a boolean and defaults to
    `false`.
    
    ## Implementation
    
    - Registers `respect_system_proxy` as an under-development, default-off
    feature.
    - Resolves user configuration and managed feature requirements into
    `Config.respect_system_proxy`.
    - Provides bootstrap resolution for startup paths that must evaluate the
    feature before full configuration loading completes.
    - Uses the standard feature CLI and config-editing behavior.
    - Excludes `features.respect_system_proxy` from project-local
    configuration.
    - Updates the generated configuration schema.
    
    ## End-user behavior
    
    - No networking behavior changes when the feature is absent or disabled.
    - Enabling the feature makes the boolean available to the native
    proxy-routing implementation in follow-up PRs.
    - Repository-local configuration cannot enable the feature.
    
    ## Test coverage
    
    Covers scalar configuration and CLI override resolution, managed
    requirement constraints, bootstrap resolution, and project-local
    filtering.
  • [codex] Add interruptible sleep tool (#28429)
    ## Why
    
    Models sometimes need to pause briefly while waiting for external work,
    but using a shell command for that delay ties the wait to a process and
    does not naturally resume when new turn input arrives.
    
    ## What changed
    
    - add a built-in `sleep` tool behind the under-development `sleep_tool`
    feature
    - accept a bounded `duration_ms` argument, matching the millisecond
    convention used by unified exec
    - end the sleep early when either steered user input or mailbox input
    arrives
    - include elapsed wall-clock time in completed and interrupted outputs
    - emit a dedicated core `SleepItem` through `item/started` and
    `item/completed`
    - expose the sleep item as app-server v2 `ThreadItem::Sleep` and retain
    it in reconstructed thread history
    - regenerate the configuration schema for the new feature flag
    - regenerate app-server JSON and TypeScript schema fixtures
    
    ## Test plan
    
    - `just test -p codex-core sleep_tool_follows_feature_gate`
    - `just test -p codex-core any_new_input_interrupts_sleep`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    sleep_emits_started_and_completed_items`
  • feat: add secret auth storage configuration (#27504)
    ## Why
    
    Windows Credential Manager limits generic credential blobs to 2,560
    bytes. The encrypted local secrets backend avoids storing large
    serialized auth payloads directly in the OS keyring, but selecting that
    backend needs an independently reviewable feature/config layer before
    the auth and secrets implementation is wired in.
    
    ## What Changed
    
    - Added the stable `secret_auth_storage` feature, enabled by default on
    Windows and disabled by default elsewhere.
    - Added `AuthKeyringBackendKind` and config resolution for full and
    bootstrap config loading.
    - Applied managed feature requirements when resolving the bootstrap auth
    backend.
    - Updated the generated config schema and added focused tests.
    
    This is the base PR for #17931. The auth, secrets, MCP, CLI, TUI, and
    app-server implementation remains in that follow-up PR.
    
    ## Validation
    
    - `just test -p codex-features`
    - `just test -p codex-config`
    - `just test -p codex-core
    resolve_bootstrap_auth_keyring_backend_kind_uses_secret_auth_storage_feature`
    - `just write-config-schema`
    - `just fix -p codex-core`
    
    The full `just test -p codex-core` run compiled successfully and ran
    2,690 tests; 2,589 passed, one was flaky, and 101 environment-sensitive
    tests failed because this shell injects a `pyenv` rehash warning into
    command output or because sandboxed subprocesses timed out.
  • realtime: add AVAS architecture override (#27720)
    ## Summary
    
    Adds a `RealtimeConversationArchitecture` option for realtime
    conversation startup, with `realtimeapi` as the default and `avas` as an
    opt-in architecture.
    
    The AVAS path is limited to realtime v1 conversational WebRTC starts,
    and WebRTC call creation appends `intent=quicksilver&architecture=avas`
    to `/v1/realtime/calls`. The existing sideband websocket still joins by
    `call_id`.
    
    This also exposes the per-session architecture override through
    app-server v2 `thread/realtime/start` params and updates the config
    schema for `[realtime].architecture`.
    
    ## Validation
    
    - `just fmt`
    - `just write-config-schema`
    - `just test -p codex-api sends_avas_session_call_query_params`
    - `just test -p codex-core -E
    'test(~conversation_webrtc_start_uses_avas_architecture_query)'`
    - `just test -p codex-core -E 'test(realtime_loads_from_config_toml)'`
    - `just test -p codex-app-server-protocol -E
    'test(~serialize_thread_realtime_start) |
    test(generated_ts_optional_nullable_fields_only_in_params)'`
    - `just test -p codex-app-server -E
    'test(realtime_webrtc_start_emits_sdp_notification)'`
  • [ez][codex-rs] Support approvals reviewer in app defaults (#27075)
    [from codex]
    
    ## Summary
    
    - add `approvals_reviewer` support to `[apps._default]`
    - resolve connected-app reviewers in per-app, app-default, then global
    order
    - expose the setting through the v2 config API and regenerate schema
    fixtures
    
    ## Context
    
    PR #25167 added `apps.<connector_id>.approvals_reviewer`, but the shared
    app defaults table could not specify the reviewer. This extends the same
    behavior to `[apps._default]` while preserving per-app overrides.
    
    Managed `allowed_approvals_reviewers` requirements still constrain both
    default and per-app values. A disallowed app value falls back to the
    global reviewer, and non-app MCP servers continue using the global
    reviewer.
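
    A rough sketch of that resolution order, under stated assumptions:
    `Reviewer` and `effective_reviewer` are hypothetical names, the two
    variants mirror the `"user"` and `"auto_review"` values from #25167,
    and a disallowed candidate is assumed to fall straight back to the
    global reviewer.

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum Reviewer {
        User,
        AutoReview,
    }

    /// Picks the per-app value, then the `[apps._default]` value, then the
    /// global reviewer. `allowed` models a managed allow-list; `None` means
    /// no managed constraint applies.
    pub fn effective_reviewer(
        per_app: Option<Reviewer>,
        app_default: Option<Reviewer>,
        global: Reviewer,
        allowed: Option<&[Reviewer]>,
    ) -> Reviewer {
        match per_app.or(app_default) {
            // Use the app-level candidate only when the allow-list permits it.
            Some(r) if allowed.map_or(true, |list| list.contains(&r)) => r,
            // No candidate, or a disallowed one: use the global reviewer.
            _ => global,
        }
    }
    ```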
    
    ## Testing
    
    - `just write-config-schema`
    - `just write-app-server-schema`
    - `just fmt`
    - `just test -p codex-config`
    - `just test -p codex-core app_approvals_reviewer`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server config_read_includes_apps`
  • [codex] Add token budget context feature (#27438)
    ## Why
    
    The model should be able to see bounded context-window budget metadata
    when the `token_budget` feature is enabled. The full-window message is
    only injected with full context, while normal turns get a smaller
    follow-up only when reported usage first crosses a budget threshold.
    
    ## What changed
    
    - Added the `TokenBudget` feature flag.
    - Added `<token_budget>` developer fragments for full context-window
    metadata and current-window remaining tokens.
    - Inserted the threshold message during normal turn handling by
    comparing token usage before and after sampling, avoiding persistent
    threshold bookkeeping.
    - Added core integration coverage for full-context-only metadata and
    25/50/75 percent threshold messages.
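
    The before/after comparison can be illustrated with a stateless
    helper. This is a sketch only: `crossed_threshold` is a hypothetical
    name, and reporting just the highest threshold crossed in a single
    step is an assumption.

    ```rust
    /// Returns the highest of the 25/50/75 percent marks that usage crossed
    /// between `used_before` and `used_after`, or `None` if no mark was
    /// crossed. No state is kept between calls.
    pub fn crossed_threshold(used_before: u64, used_after: u64, window: u64) -> Option<u8> {
        if window == 0 {
            return None;
        }
        [75u8, 50, 25].iter().copied().find(|&pct| {
            // Token count at which this percentage of the window is reached.
            let mark = (window as u128) * (pct as u128) / 100;
            (used_before as u128) < mark && (used_after as u128) >= mark
        })
    }
    ```

    Because the check depends only on the two usage readings, a threshold
    that was already exceeded before sampling does not fire again.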
    
    ## Verification
    
    - `just test -p codex-core token_budget`
    - `git diff --check`
  • core: resize all history images behind a feature flag (#27247)
    ## Summary
    
    Adds complete client-side image preparation behind the default-off
    `resize_all_images` feature flag.
    
    When enabled, local image producers defer decoding and resizing. Images
    are prepared centrally before insertion into conversation history,
    covering user input, `view_image`, and structured tool-output images.
    
    ## Behavior
    
    - Processes base64 `data:` images in messages and function/custom tool
    outputs.
    - Leaves non-data URLs, including HTTP(S) URLs, unchanged.
    - Applies image-detail budgets:
      - `high` and omitted: 2048px maximum dimension and 2.5K 32px patches.
      - `original`: 6000px maximum dimension and 10K 32px patches.
      - `auto`: uses the same 2048px / 2.5K-patch budget as high.
      - `low`: unsupported and replaced with an actionable placeholder.
    - Preserves original image bytes when no resize or format conversion is
    needed.
    - Enforces the shared 1 GiB encoded and decoded data-URL sanity limits.
    - Replaces only an image that fails preparation, preserving sibling
    content and tool-output metadata.
    - Uses bounded placeholders distinguishing generic processing failures,
    oversized images, and unsupported `low` detail.
    - Prepares resumed and forked history before installing it as live
    history without modifying persisted rollouts.
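
    The image-detail budgets listed above map naturally onto a lookup. The
    sketch below uses hypothetical names (`ImageDetail`, `ImageBudget`,
    `budget_for`) and assumes "2.5K" and "10K" mean 2,500 and 10,000
    patches.

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum ImageDetail {
        High,
        Original,
        Auto,
        Low,
    }

    #[derive(Debug, PartialEq, Eq)]
    pub struct ImageBudget {
        pub max_dimension_px: u32,
        pub max_patches: u32,
    }

    /// Returns the budget for a detail level, or `None` for `low`, which is
    /// unsupported and replaced with a placeholder by the caller.
    pub fn budget_for(detail: Option<ImageDetail>) -> Option<ImageBudget> {
        match detail {
            // Omitted detail and `auto` share the `high` budget.
            None | Some(ImageDetail::High) | Some(ImageDetail::Auto) => Some(ImageBudget {
                max_dimension_px: 2048,
                max_patches: 2500,
            }),
            Some(ImageDetail::Original) => Some(ImageBudget {
                max_dimension_px: 6000,
                max_patches: 10_000,
            }),
            Some(ImageDetail::Low) => None,
        }
    }
    ```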
    
    ## Flag-Off Behavior
    
    When `resize_all_images` is disabled:
    
    - Existing local user-input and `view_image` processing remains
    unchanged.
    - Existing decoding and error behavior remains unchanged.
    - Arbitrary tool-output images are not processed.
    - HTTP(S) image URLs continue to be forwarded unchanged.
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/27245
    - 👉 `2` https://github.com/openai/codex/pull/27247
    -  `3` https://github.com/openai/codex/pull/27246
    -  `4` https://github.com/openai/codex/pull/27266
  • Use plugin-service MCP as the hosted plugin runtime (#27198)
    ## Stack
    
    - Base: #27191
    - This PR is the third vertical and should be reviewed against
    `jif/external-plugins-2`, not `main`.
    
    ## Why
    
    #27191 moves the host-owned Apps MCP registration behind an extension
    contributor, but deliberately preserves the existing endpoint-selection
    feature while that contribution contract lands. App-server can therefore
    resolve the server through extensions, yet the hosted plugin endpoint is
    still selected through temporary `apps_mcp_path_override` plumbing.
    
    That is not the long-term plugin model. A plugin can bundle skills,
    connectors, MCP servers, and hooks, and those components do not all need
    the same source or execution environment. In particular, an
    authenticated HTTP MCP server can expose plugin capabilities directly
    from a backend without an executor or an orchestrator filesystem.
    
    This PR completes that hosted vertical. App-server's MCP extension now
    owns the aggregate hosted plugin runtime at `/ps/mcp`. Connector actions
    continue to arrive as MCP tools, while backend-provided skills arrive as
    MCP resources and use Codex's existing resource list/read paths. No
    second backend client, skill filesystem, or generic plugin activation
    framework is introduced.
    
    The backend route remains the hosted implementation. This change
    replaces Codex's temporary endpoint-selection mechanism, not the service
    behind the endpoint.
    
    ## What changed
    
    ### Hosted plugin runtime
    
    The MCP extension now contributes `codex_apps` as the hosted plugin
    runtime rather than as a configurable Apps endpoint:
    
    - `https://chatgpt.com` resolves to
    `https://chatgpt.com/backend-api/ps/mcp`;
    - a bare custom ChatGPT base resolves to `/api/codex/ps/mcp`;
    - the existing product-SKU header and ChatGPT authentication behavior
    are preserved;
    - executor availability is never consulted for this streamable HTTP
    transport.
    
    The same MCP connection carries both component shapes supported by the
    hosted endpoint:
    
    - connector actions are discovered and invoked as MCP tools;
    - hosted skills are enumerated and read as MCP resources through the
    existing `list_mcp_resources` and `read_mcp_resource` paths.
    
    This keeps component access in the subsystem that already owns the
    protocol instead of downloading backend skills into an orchestrator
    filesystem or inventing a parallel hosted-skill client.
    
    ### Explicit runtime ordering
    
    `McpManager` now resolves the reserved `codex_apps` entry in three
    ordered phases:
    
    1. install the legacy Apps fallback for compatibility;
    2. apply ordered extension `Set` or `Remove` overlays;
    3. apply the final ChatGPT-auth gate without synthesizing the server
    again.
    
    This ordering is important:
    
    - an ordinary configured or plugin MCP server cannot claim the
    auth-bearing `codex_apps` name;
    - an extension-contributed hosted runtime wins over the fallback;
    - an extension `Remove` remains authoritative;
    - a host without the MCP extension retains the legacy Apps endpoint and
    current local-only behavior.
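
    The three phases can be sketched as a small fold. This is an
    illustration rather than the `McpManager` code: `ServerConfig`,
    `Overlay`, and `resolve_codex_apps` are hypothetical names, and letting
    the last overlay win is an assumption about how ordered overlays
    combine.

    ```rust
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct ServerConfig {
        pub url: String,
    }

    /// Hypothetical extension overlay for the reserved `codex_apps` entry.
    pub enum Overlay {
        Set(ServerConfig),
        Remove,
    }

    pub fn resolve_codex_apps(
        legacy_fallback: ServerConfig,
        overlays: Vec<Overlay>,
        chatgpt_auth: bool,
    ) -> Option<ServerConfig> {
        // Phase 1: install the legacy Apps fallback.
        let mut entry = Some(legacy_fallback);
        // Phase 2: apply extension overlays in order.
        for overlay in overlays {
            entry = match overlay {
                Overlay::Set(cfg) => Some(cfg),
                Overlay::Remove => None,
            };
        }
        // Phase 3: gate on ChatGPT auth without re-creating a removed entry.
        if chatgpt_auth {
            entry
        } else {
            None
        }
    }
    ```

    With no overlays the legacy fallback survives, which matches the
    behavior described for hosts that do not install the MCP extension.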
    
    The temporary `legacy_apps_mcp_loader_enabled` coordination flag is no
    longer needed.
    
    ### Remove the path override
    
    The `apps_mcp_path_override` feature and its runtime plumbing are
    removed, including:
    
    - the feature registry entry and structured feature config;
    - `Config` and `McpConfig` fields;
    - config schema output;
    - config-lock materialization;
    - URL override handling in `codex-mcp`.
    
    Existing boolean and structured forms still deserialize as ignored
    compatibility input. They are omitted from new serialized config, and
    config-lock comparison normalizes the removed input so older locks
    remain replayable.
    
    ### App-server coverage
    
    App-server MCP fixtures now serve the hosted route at
    `/api/codex/ps/mcp`. Existing resource-read and tool/elicitation flows
    therefore exercise the extension-owned endpoint rather than succeeding
    through the legacy fallback.
    
    The stack also adds the missing `codex_chatgpt::connectors` re-export
    for the manager-backed connector helper introduced in #27191.
    
    ## Compatibility
    
    - App-server installs the extension and uses `/ps/mcp` for the hosted
    runtime.
    - CLI and other hosts that do not install the extension retain the
    legacy Apps endpoint.
    - Apps disabled or non-ChatGPT authentication removes `codex_apps` from
    the effective runtime view.
    - Existing local plugins, local skills, executor-selected skills,
    configured MCP servers, and MCP OAuth behavior are otherwise unchanged.
    - Backend plugin enablement remains account/workspace state owned by the
    hosted endpoint; this PR does not add thread-local backend plugin
    selection.
    
    ## Architectural fit
    
    The stack now proves two independent runtime shapes:
    
    1. #27184 resolves filesystem-backed skills through the executor that
    owns a selected root.
    2. #27191 and this PR resolve a backend-hosted HTTP MCP through an
    extension with no executor.
    
    Together they preserve the intended separation:
    
    - selection identifies a plugin/root when explicit selection is needed;
    - each component's owning extension resolves its concrete access
    mechanism;
    - execution stays with the runtime required by that component;
    - existing skills, MCP, connector, and hook subsystems remain the
    downstream consumers.
    
    ## Planned follow-ups
    
    1. **Executor stdio MCP:** selecting an executor plugin registers a
    manifest-declared stdio MCP server and executes it in the environment
    that owns the plugin.
    2. **Optional backend selection:** only if CCA needs thread-local
    selection distinct from backend account/workspace enablement, add a
    concrete backend-owned capability location and surface those selected
    skills through the skills catalog.
    3. **Connector metadata and hooks:** activate those plugin components
    through their existing owning subsystems, with executor hooks remaining
    environment-bound.
    4. **Propagation and persistence:** define explicit resume, fork,
    subagent, refresh, and environment-removal semantics once selected roots
    have multiple real consumers.
    5. **Local convergence:** migrate legacy local skill, MCP, connector,
    and hook paths behind their owning extensions one vertical at a time,
    then remove duplicate core managers and compatibility plumbing after
    parity.
    
    ## Verification
    
    Coverage in this change exercises:
    
    - extension-owned `/backend-api/ps/mcp` registration without an
    executor;
    - preservation of the legacy endpoint in hosts without the extension;
    - extension `Set` and `Remove` precedence over the legacy fallback;
    - ChatGPT-auth gating for the reserved server;
    - hosted MCP resource reads with and without an active thread;
    - connector tool invocation and MCP elicitation through the hosted
    route;
    - ignored boolean and structured forms of the removed path override;
    - config-lock replay compatibility for the removed feature.
    
    `cargo check -p codex-features -p codex-mcp-extension -p
    codex-app-server` passes. Tests and Clippy were not run locally under
    the current development instruction; CI provides the full validation
    pass.
  • [codex] Gate terminal visualization instructions in TUI (#26013)
    ## Summary
    - add `Feature::TerminalVisualizationInstructions` as
    `UnderDevelopment`, disabled by default
    - keep terminal visualization instructions inside the TUI package
    - append them to existing developer instructions for TUI start, resume,
    and fork flows only when enabled
    - intentionally do not apply them to `codex exec`
    
    ## Rollout
    Control behavior is unchanged. TUI dogfooders can enable
    `terminal_visualization_instructions`; no default user receives the new
    terminal-specific instructions.
    
    The shared visualization-selection rule is supplied separately through
    the `codex_proxy_model_3` Statsig layer for every target Codex model
    slug in the gated cohort. This TUI feature determines how to render an
    appropriate visualization on the terminal surface; the model-layer
    treatment determines when to use one.
    
    ## Validation
    - `cargo test -p codex-tui
    terminal_visualization_instructions_are_gated_for_all_tui_thread_flows
    --lib`
    - `cargo test -p codex-features --lib`
    - `cargo fmt --all -- --check`
    - `git diff --check`
    - GPT-5.4 and GPT-5.5 real prompt-pipeline smoke tests: both visualized
    the positive mapping case, abstained on the negative route case, and
    passed exact prompt-stack verification on CLI and App
    - refreshed onto current `main` with a clean merge and reran the focused
    validation
    
    The full 53-probe all-model treatment comparison and requested
    production coding evals remain rollout gates before broadening beyond
    the initial employee cohort.
    
    This PR remains open for normal human review.
  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • Remove response.processed websocket request (#26447)
    ## Why
    
    The Responses websocket client no longer needs to send a follow-up
    `response.processed` request after a turn response has already been
    recorded. Keeping that extra acknowledgement path adds feature-gated
    control flow and a second websocket request shape that no longer carries
    useful behavior.
    
    ## What Changed
    
    - Removed the `response.processed` websocket request type and sender.
    - Removed the `responses_websocket_response_processed` feature flag and
    schema entry.
    - Removed turn and remote-compaction plumbing that only tracked response
    IDs to send the acknowledgement.
    - Removed tests that existed solely to cover the deleted feature path.
    
    ## Validation
    
    - `just fix -p codex-core -p codex-api -p codex-features`
  • core: allow excluding tool namespaces from code mode (#26320)
    ## Why
    
    Research and training setups need to control which tool namespaces
    appear inside code mode's nested `tools` surface without disabling those
    tools entirely. This makes it possible to train against a deliberately
    reduced nested-tool setup while preserving the normal direct and
    deferred tool paths.
    
    ## What
    
    - Extend `features.code_mode` to accept structured configuration while
    preserving the existing boolean syntax.
    - Add an exact `excluded_tool_namespaces` list under
    `[features.code_mode]`:
    
      ```toml
      [features.code_mode]
      enabled = true
      excluded_tool_namespaces = ["mcp__codex_apps", "multi_agent_v1"]
      ```
    
    - Filter matching canonical `ToolName` namespaces when constructing code
    mode's nested router and code-mode-specific direct tool descriptions.
    - Keep excluded tools registered, directly exposed in mixed code mode,
    and discoverable through top-level `tool_search` when otherwise
    eligible.
    - Derive deferred nested-tool guidance after namespace filtering so the
    `exec` description does not advertise excluded-only deferred tools.
    - Preserve the boolean/table representation when materializing config
    locks and update the generated config schema.
    
    ## Testing
    
    - `just test -p codex-features`
    - `just test -p codex-config`
    - `just test -p codex-core load_config_resolves_code_mode_config`
    - `just test -p codex-core
    lock_contains_prompts_and_materializes_features`
    - `just test -p codex-core
    excluded_deferred_namespaces_do_not_enable_nested_tool_guidance`
    - `just test -p codex-core
    code_mode_excludes_configured_nested_tool_namespaces`
    - `cargo check -p codex-thread-manager-sample`
  • [app-server][core] Add connector-level Guardian reviewer overrides (#25167)
    Context: https://openai.slack.com/archives/C0B4JAF0Q2C/p1779912328647229
    
    ```
    approvals_reviewer = "auto_review"
    
    [apps.connector_5f3c8c41a1e54ad7a76272c89e2554fa]
    enabled = true
    approvals_reviewer = "user"
    default_tools_approval_mode = "prompt"
    ```
    
    <img width="230" height="84" alt="Screenshot 2026-05-31 at 11 56 34 AM"
    src="https://github.com/user-attachments/assets/e319f8f7-0983-42a7-98cd-3302732fa406"
    />
    
    <img width="841" height="233" alt="Screenshot 2026-05-31 at 11 52 42 AM"
    src="https://github.com/user-attachments/assets/7ac76645-4e90-4d00-8242-f031146a22a5"
    />
    
    -------
    
    ```
    approvals_reviewer = "user"
    
    [apps.connector_5f3c8c41a1e54ad7a76272c89e2554fa]
    enabled = true
    approvals_reviewer = "auto_review"
    default_tools_approval_mode = "prompt"
    ```
    <img width="195" height="83" alt="Screenshot 2026-05-31 at 12 02 27 PM"
    src="https://github.com/user-attachments/assets/3d374dc8-8aa2-466f-a13f-e4ed8567aa2e"
    />
    <img width="771" height="207" alt="Screenshot 2026-05-31 at 12 05 42 PM"
    src="https://github.com/user-attachments/assets/105c2575-68d6-4ca6-8e69-dc8c82da36a2"
    />
    
    ## Summary
    - add `apps.<connector_id>.approvals_reviewer` to override Guardian or
    user review routing per connected app
    - apply overrides across direct app MCP calls, delegated MCP prompts,
    and app-server MCP elicitation review while preserving global behavior
    for non-app MCP servers
    - expose and document the config through app-server v2 and generated
    schemas, while honoring global managed reviewer requirements
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • feat: gate unified exec zsh fork composition (#24979)
    ## Why
    
    `shell_zsh_fork` and unified exec need to remain independently
    controllable for enterprise rollouts, but we also need a third mode that
    composes them. That composed mode is intended to preserve unified exec
    command lifecycle support while letting the zsh fork provide more
    accurate `execv(2)` interception.
    
    Enabling `unified_exec_zsh_fork` by itself is intentionally not
    sufficient. It is a composition gate, not a dependency-enabling
    shortcut:
    
    - `unified_exec` selects the PTY-backed unified exec tool.
    - `shell_zsh_fork` opts into the zsh fork backend.
    - `unified_exec_zsh_fork` only allows those two already-enabled modes to
    be composed so local zsh unified exec commands can launch through the
    zsh fork.
    
    This separation is deliberate. Enterprises and staged rollouts must be
    able to enable or disable unified exec and zsh-fork independently. If
    `unified_exec_zsh_fork` implied either dependency, then enabling one
    under-development composition flag would silently activate a shell
    backend that the configured feature set left disabled.
    
    This PR introduces only the configuration and planning gate for that
    composition. Existing `shell_zsh_fork` behavior continues to use the
    standalone shell tool unless the new composition feature is explicitly
    enabled alongside both dependencies.
    
    ## What Changed
    
    - Added the under-development feature flag `unified_exec_zsh_fork`.
    - Added `UnifiedExecFeatureMode` so the three input feature flags
    collapse into `Disabled`, `Direct`, or `ZshFork` mode before tool
    planning.
    - Updated tool selection so zsh-fork composition requires
    `unified_exec`, `shell_zsh_fork`, and `unified_exec_zsh_fork`.
    - Kept the existing standalone zsh-fork shell tool behavior when only
    `shell_zsh_fork` is enabled.
    - Updated config schema output for the new feature flag.
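
    One way to picture the three-flag collapse is the sketch below. The
    function name `unified_exec_mode` is hypothetical, and mapping
    "`unified_exec` on without full composition" to `Direct` is an
    assumption; the description only states that `ZshFork` requires all
    three flags.

    ```rust
    #[derive(Debug, PartialEq, Eq)]
    pub enum UnifiedExecFeatureMode {
        Disabled,
        Direct,
        ZshFork,
    }

    pub fn unified_exec_mode(
        unified_exec: bool,
        shell_zsh_fork: bool,
        unified_exec_zsh_fork: bool,
    ) -> UnifiedExecFeatureMode {
        match (unified_exec, shell_zsh_fork, unified_exec_zsh_fork) {
            // Without `unified_exec`, the composition flag has no effect.
            (false, _, _) => UnifiedExecFeatureMode::Disabled,
            // Composition requires all three flags.
            (true, true, true) => UnifiedExecFeatureMode::ZshFork,
            // Assumed: any other combination keeps plain unified exec.
            (true, _, _) => UnifiedExecFeatureMode::Direct,
        }
    }
    ```

    Note that `unified_exec_zsh_fork` alone resolves to `Disabled` here,
    reflecting that the flag is a composition gate and does not enable
    either dependency.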
    
    ## Verification
    
    - Added feature and tool-config coverage for the new gate.
    - Added planner coverage proving `shell_zsh_fork` remains standalone
    until composition is explicitly enabled.
    - Ran focused tests for `codex-features`, `codex-tools`, and the
    affected `codex-core` planner case.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/24979).
    * #24982
    * #24981
    * #24980
    * __->__ #24979
  • Compress cold local rollouts (#25089)
    ## Rollout compression stack
    
    This stack splits #24941 into reviewable steps for local rollout
    compression. The design is intentionally staged:
    
    1. Teach readers, listing, search, and lookup to understand compressed
    rollouts.
    2. Make append and resume paths materialize compressed rollouts back to
    plain JSONL before writing.
    3. Add a disabled-by-default worker that can compress cold archived
    rollouts behind `local_thread_store_compression`.
    
    The key invariant is that writers append to plain `.jsonl`. A
    `.jsonl.zst` file is a cold/read representation; if a write is needed,
    the compressed file is materialized back to plain JSONL first. Readers
    prefer plain `.jsonl` when both forms exist and can fall back to the
    compressed sibling during transitions.
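
    The reader-side preference described above could look roughly like this. It is a minimal sketch, not the stack's actual code: `RolloutSource` and `resolve_for_read` are hypothetical names, and the only behavior taken from the description is "prefer plain `.jsonl`, fall back to the `.jsonl.zst` sibling".

    ```rust
    use std::path::{Path, PathBuf};

    /// Hypothetical result type: which on-disk form a reader should open.
    #[derive(Debug, PartialEq, Eq)]
    enum RolloutSource {
        Plain(PathBuf),
        Compressed(PathBuf),
    }

    /// Given the plain `.jsonl` path, choose the file to read.
    /// Plain wins when both forms exist; the compressed sibling is a fallback.
    fn resolve_for_read(plain: &Path) -> Option<RolloutSource> {
        if plain.is_file() {
            return Some(RolloutSource::Plain(plain.to_path_buf()));
        }
        // Build `<name>.jsonl.zst` by appending to the full file name.
        let mut sibling = plain.as_os_str().to_owned();
        sibling.push(".zst");
        let sibling = PathBuf::from(sibling);
        if sibling.is_file() {
            Some(RolloutSource::Compressed(sibling))
        } else {
            None
        }
    }
    ```

    A writer would not use this result directly: per the invariant, a `Compressed` hit has to be materialized back to plain JSONL before any append.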
    
    The worker is deliberately the last PR and remains behind an
    under-development feature flag. It currently scans only
    `archived_sessions`, not active `sessions`, because active sessions have
    the highest resume/append race risk. That means this stack does not yet
    compress most unarchived local history.
    
    ## Known race / follow-up
    
    The remaining unresolved design question is writer/compressor
    coordination. Even for archived rollouts, a resume or metadata update
    can append while the worker is replacing the plain file with
    `.jsonl.zst`; the current double-stat checks narrow but do not fully
    eliminate the window where a writer has opened the plain file before
    unlink. Do not treat the worker PR as production-ready until we either:
    
    - prevent append/resume paths from racing archived compression, or
    - introduce a shared representation/append lock or equivalent
    coordination.
    
    The first two PRs are useful independently: they make compressed
    rollouts readable and make append paths safely recover back to plain
    JSONL. The third PR isolates the worker behavior so that the
    coordination issue can be reviewed separately.
    
    ## Validation
    
    Focused local validation for the stack includes:
    
    - `just test -p codex-rollout`
    - `just test -p codex-thread-store` where thread-store paths were
    touched
    - `just test -p codex-features` for the feature flag slice
    - `just bazel-lock-check` after dependency graph changes
    - scoped `just fix -p ...` passes for changed crates
    
    CI is still the source of truth for the full platform matrix.
    
    ## This PR in the stack
    
    This is PR 3/3, based on #25088. It adds the under-development feature
    flag and starts the best-effort background worker when enabled. The
    worker currently compresses only cold archived rollouts, skips active
    sessions, verifies compressed output, preserves mtime and permissions,
    keeps a store-level lock heartbeat, and cleans stale temp files.
    
    Stack order:
    
    1. #25087: read compressed local rollouts.
    2. #25088: materialize compressed rollouts before append.
    3. This PR: add the disabled local compression worker.
  • feat(config) experimental_request_user_input toggle (#24541)
    ## Summary
    Adds an experimental config flag for toggling `request_user_input`:
    
    ```
    tools.experimental_request_user_input = false
    ```
    
    ## Testing
    - [x] Added unit tests
  • [codex] Fix Vim normal mode editing (#25022)
    ## Summary
    - add Vim normal-mode `s` support to substitute the character under the
    cursor and enter insert mode
    - fix Vim normal-mode `o` so opening below the final line moves the
    cursor onto the new blank line
    - update keymap config/schema and keymap picker snapshots for the new
    action
    
    ## Validation
    - `just fmt`
    - `just write-config-schema`
    - `just test -p codex-config`
    - focused `just test -p codex-tui` coverage for the Vim `s` and `o`
    behavior, keymap conflict handling, and keymap picker snapshots
    - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`
    - `git diff --check`
    
    ## Notes
    A full `just test -p codex-tui` run still has two unrelated Guardian
    feature-flag failures in this checkout:
    - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
  • fix(config): use deny for Unix socket permissions (#24970)
    ## Why
    
    Unix socket permissions still accepted and displayed `"none"` while file
    permissions use the clearer `"deny"` spelling. This keeps network Unix
    socket policy vocabulary consistent with filesystem policy vocabulary.
    
    ## What changed
    
    - Replace the Unix socket permission variant and serialized spelling
    from `none` to `deny` across config, feature configuration, and network
    proxy types.
    - Update app-server v2 serialization, TUI debug output, focused tests,
    and generated schemas to expose `"deny"`.
    - Add coverage for denied Unix socket entries in managed requirements
    and profile overlay behavior.
    
    ## Security
    
    This is a vocabulary change for explicit Unix socket rejection, not a
    network access expansion. Denied entries continue to be omitted from the
    effective allowlist.
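
    As an illustration of the security note, the sketch below shows denied entries being dropped when the effective allowlist is built. The type and function names are hypothetical, and the `"allow"` spelling is an assumption; only `"deny"` is confirmed by this description.

    ```rust
    /// Hypothetical permission type; the real variant set may differ.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum UnixSocketPermission {
        Allow,
        Deny,
    }

    impl UnixSocketPermission {
        /// Serialized spelling. `"deny"` replaces the old `"none"`;
        /// `"allow"` is assumed here for illustration.
        fn as_str(self) -> &'static str {
            match self {
                UnixSocketPermission::Allow => "allow",
                UnixSocketPermission::Deny => "deny",
            }
        }
    }

    /// Keep only the socket paths that are explicitly allowed.
    /// Denied entries are omitted rather than carried as negative rules.
    fn effective_allowlist<'a>(entries: &[(&'a str, UnixSocketPermission)]) -> Vec<&'a str> {
        entries
            .iter()
            .filter(|(_, permission)| *permission == UnixSocketPermission::Allow)
            .map(|(path, _)| *path)
            .collect()
    }
    ```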
    
    ## Validation
    
    - `just fmt`
    - `just write-config-schema`
    - `just write-app-server-schema`
    - `just test -p codex-config -p codex-core -p codex-app-server-protocol
    -p codex-tui -E
    'test(network_requirements_are_preserved_as_constraints_with_source) |
    test(network_permission_containers_project_allowed_and_denied_entries) |
    test(network_toml_overlays_unix_socket_permissions_by_path) |
    test(permissions_profiles_resolve_extends_parent_first_with_child_overrides)
    | test(network_requirements_serializes_canonical_and_legacy_fields) |
    test(debug_config_output_formats_unix_socket_permissions)'`
    - Automatic `bench-smoke` follow-up from `just test`
    - `cargo clippy -p codex-config -p codex-core -p codex-features -p
    codex-network-proxy -p codex-app-server-protocol -p codex-app-server -p
    codex-tui --all-targets -- -D warnings`
  • Add feature-gated standalone image generation extension (#24723)
    ## Why
    
    Add a standalone image generation path that can be exercised
    independently of hosted Responses image generation, while retaining the
    hosted tool as fallback unless the extension is actually available to
    the model.
    
    ## What changed
    
    - Added the `codex-image-generation-extension` crate with standalone
    generate/edit execution, prior-image selection for edits, model-visible
    image output, and local generated-image persistence.
    - Installed the extension in app-server behind the disabled-by-default
    `imagegenext` feature and backend eligibility checks.
    - Updated core tool planning so eligible `image_gen.imagegen` exposure
    replaces hosted `image_generation`, while unavailable configurations
    retain hosted fallback.
    - Added coverage for extension behavior, edit history reuse, feature
    gating, auth eligibility, and hosted-tool replacement.
    - The extension is installed through app-server only in this PR; other
    execution paths retain hosted image generation because hosted
    replacement occurs only when the standalone executor is actually
    registered and model-visible.
    - The initial extension contract intentionally fixes the image model to
    `gpt-image-2` and uses automatic image parameters.
    - Native generated-image history/card parity and rollout persistence
    cleanup are intentionally deferred follow-up work.
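
    The replacement rule can be summarized in a small sketch. All names here are hypothetical, and it assumes the hosted tool is itself permitted for the session; the only behavior taken from this description is that the standalone tool replaces the hosted one only when it is enabled, eligible, and actually registered.

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum ImageTool {
        /// Standalone extension tool (`image_gen.imagegen` in this description).
        Standalone,
        /// Hosted Responses tool (`image_generation` in this description).
        Hosted,
    }

    /// Hypothetical inputs to the planning decision.
    struct ImageGenAvailability {
        feature_enabled: bool,     // the `imagegenext` feature is on
        backend_eligible: bool,    // backend and auth checks passed
        executor_registered: bool, // the extension is installed on this path
    }

    /// Expose the standalone tool only when every condition holds;
    /// any missing condition keeps the hosted fallback.
    fn plan_image_tool(availability: &ImageGenAvailability) -> ImageTool {
        if availability.feature_enabled
            && availability.backend_eligible
            && availability.executor_registered
        {
            ImageTool::Standalone
        } else {
            ImageTool::Hosted
        }
    }
    ```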
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-features`
    - `just test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
    - `just test -p codex-app-server`
    - `just fix -p codex-image-generation-extension -p codex-features -p
    codex-core -p codex-app-server`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • feat(tui): make turn interruption keybind configurable (#24766)
    ## Why
    
    Interrupting an active turn is currently fixed to `Esc`, which is easy
    to hit accidentally and cannot be customized through `/keymap`. This
    gives users a less accidental binding while preserving the existing
    default.
    
    ## What Changed
    
    - Adds `tui.keymap.chat.interrupt_turn` to `/keymap`, defaulting to
    `esc` and supporting remapping or unbinding.
    - Uses the configured interrupt binding for running-turn status, queued
    steer interruption, and `request_user_input`, including the visible
    hints.
    - Preserves local `Esc` behavior for popups, Vim insert mode, and
    `/agent` editing while validating conflicts with fixed/backtrack and
    request-input navigation bindings.
    - Adds behavior and snapshot coverage for remapped interruption paths.
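
    A rough model of the configurable binding, for readers who want the behavior in code form. `ChatKeymap` and its methods are hypothetical, and keys are plain strings here purely for brevity; the sketch also ignores the popup, Vim insert mode, and `/agent` cases where `Esc` keeps its local meaning.

    ```rust
    /// Hypothetical slice of the chat keymap: `None` means the action is unbound.
    struct ChatKeymap {
        interrupt_turn: Option<String>,
    }

    impl Default for ChatKeymap {
        fn default() -> Self {
            // The existing default is preserved.
            Self { interrupt_turn: Some("esc".to_string()) }
        }
    }

    impl ChatKeymap {
        /// Hint text for the running-turn status; omitted when unbound.
        fn interrupt_hint(&self) -> Option<String> {
            self.interrupt_turn
                .as_ref()
                .map(|key| format!("{key} to interrupt"))
        }

        /// `ctrl+c` stays available regardless of the configured binding.
        fn interrupts_turn(&self, key: &str) -> bool {
            key == "ctrl+c" || self.interrupt_turn.as_deref() == Some(key)
        }
    }
    ```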
    
    ## How to Test
    
    1. Run Codex and open `/keymap`, then set **Interrupt Turn** to `f12`.
    2. Start a turn and confirm `Esc` no longer interrupts it while `f12`
    does; the running hint should display `f12 to interrupt`.
    3. Queue a steer while a turn is running and confirm the preview
    displays `f12`; pressing it should interrupt and submit the steer
    immediately.
    4. Trigger a `request_user_input` prompt and confirm its footer uses
    `f12`; with notes open, `Esc` should still clear notes while `f12`
    interrupts the turn.
    5. Clear the Interrupt Turn binding and confirm the key-specific
    interrupt hint is removed while `Ctrl+C` remains available.
    
    Targeted validation:
    
    - `just write-config-schema`
    - `just fix -p codex-config`
    - `just fix -p codex-tui`
    - `just fmt`
    - `just argument-comment-lint-from-source -p codex-config -p codex-tui`
    - `just test -p codex-config`
    - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`
    - `just test -p codex-tui keymap_setup::tests`
    - `just test -p codex-tui` (fails in two pre-existing guardian
    feature-flag tests unrelated to this diff; the intentional picker
    snapshot updates were reviewed and accepted)
  • feat(tui): add vim text object bindings (#24382)
    ## Why
    
    Vim mode currently supports some normal-mode operators and motions, but
    common text-object combinations like `ciw`, `daw`, `di(`, and
    quote/bracket variants are still missing. That makes the composer feel
    incomplete for users who expect operator + text object editing to work
    inside prompts.
    
    Closes #21383.
    
    ## What Changed
    
    - Add Vim pending-state support for operator/text-object sequences.
    - Add `c` as a normal-mode operator for text objects, so combinations
    like `ciw` delete the object and enter insert mode.
    - Support word, WORD, delimiter, and quote text objects:
      - `iw`, `aw`, `iW`, `aW`
      - `i(`, `a(`, `i)`, `a)`, `ib`, `ab`
      - `i[`, `a[`, `i]`, `a]`
      - `i{`, `a{`, `i}`, `a}`, `iB`, `aB`
      - `i"`, `a"`, `i'`, `a'`, `` i` ``, `` a` ``
    - Add configurable keymap entries and keymap picker coverage for the new
    Vim text-object context.
    - Regenerate the config schema and update keymap picker snapshots.
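
    The pending-state idea can be sketched as a small state machine. This is not the TUI's implementation: every type and function name below is hypothetical, and the sketch models only operator + text-object sequences (no motions such as `dw` or `dd`).

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum Operator { Delete, Change }

    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum Scope { Inner, Around }

    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum Key { Char(char), Esc }

    /// What the composer is waiting for in normal mode.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum Pending {
        Idle,
        Operator(Operator),
        TextObject(Operator, Scope),
    }

    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum Outcome {
        Waiting,
        Cancelled,
        Ignored,
        Apply { op: Operator, scope: Scope, object: char },
    }

    /// Object keys listed in this description (word, WORD, delimiters, quotes).
    fn is_text_object(c: char) -> bool {
        matches!(c, 'w' | 'W' | '(' | ')' | 'b' | '[' | ']' | '{' | '}' | 'B' | '"' | '\'' | '`')
    }

    /// Advance the pending state by one key. Only `Apply` edits text.
    fn step(state: Pending, key: Key) -> (Pending, Outcome) {
        match (state, key) {
            (Pending::Idle, Key::Char('d')) => (Pending::Operator(Operator::Delete), Outcome::Waiting),
            (Pending::Idle, Key::Char('c')) => (Pending::Operator(Operator::Change), Outcome::Waiting),
            (Pending::Idle, _) => (Pending::Idle, Outcome::Ignored),
            // Esc cancels any pending sequence without editing.
            (_, Key::Esc) => (Pending::Idle, Outcome::Cancelled),
            (Pending::Operator(op), Key::Char('i')) => (Pending::TextObject(op, Scope::Inner), Outcome::Waiting),
            (Pending::Operator(op), Key::Char('a')) => (Pending::TextObject(op, Scope::Around), Outcome::Waiting),
            (Pending::Operator(_), Key::Char(_)) => (Pending::Idle, Outcome::Cancelled),
            (Pending::TextObject(op, scope), Key::Char(object)) if is_text_object(object) => {
                (Pending::Idle, Outcome::Apply { op, scope, object })
            }
            (Pending::TextObject(..), Key::Char(_)) => (Pending::Idle, Outcome::Cancelled),
        }
    }
    ```

    In this model, `ciw` ends in `Apply { op: Change, scope: Inner, object: 'w' }`, after which the caller would delete the object and enter insert mode; `d` then `Esc`, or `d` then `i` then `Esc`, end in `Cancelled` with no edit.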
    
    ## How to Test
    
    Manual smoke test:
    
    1. Start Codex with Vim composer mode enabled.
    2. Type a draft such as:
       ```text
       alpha beta gamma
       call(foo[bar], {"x": "hello world"})
       say "one \"two\" three" now
       ```
    3. Put the cursor on `beta`, press `ciw`, and confirm `beta` is removed
    and the composer enters insert mode.
    4. Escape back to normal mode, put the cursor on `gamma`, press `daw`,
    and confirm `gamma` plus surrounding whitespace is removed.
    5. Put the cursor inside `foo[bar]`, press `di[`, and confirm only `bar`
    is removed.
    6. Put the cursor inside `call(...)`, press `da(`, and confirm the whole
    parenthesized section is removed.
    7. Put the cursor inside the quoted text, press `ci"`, and confirm the
    quote contents are removed and insert mode starts.
    8. Verify cancellation does not edit text: press `d` then `Esc`, and
    press `d` then `i` then `Esc`.
    
    Targeted tests:
    
    - `cargo test -p codex-tui --lib vim_`
    - `cargo nextest run -p codex-tui keymap_setup::tests`
    
    Additional local checks:
    
    - `just write-config-schema`
    - `just fmt`
    - `just fix -p codex-tui`
    - `git diff --check`
    - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`
    
    Local full-suite note: `just test -p codex-tui` ran to completion. The
    keymap snapshot failures were expected and accepted. Two unrelated
    guardian feature-flag tests still fail locally:
    - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
    
    `just argument-comment-lint` is currently blocked locally by Bazel
    analysis before the lint runs because `compiler-rt` has an empty
    `include/sanitizer/*.h` glob in the local Bazel cache. The touched Rust
    diff was manually inspected for opaque positional literals.