Commit Graph

7054 Commits

  • session: keep startup prewarm aligned with resolved multi-agent runtime (#25841)
    ## Why
    
    Follow-up to #25722. Startup prewarm builds a preview `TurnContext`
    before the first real turn so it can precompute the initial prompt and
    tool surface. After the per-thread runtime work landed, that preview
    path still recomputed multi-agent mode from `model_info` and feature
    defaults instead of reusing the runtime the session had already resolved
    from persisted metadata or inheritance.
    
    That could leave the prewarmed session primed for a different
    multi-agent mode than the first real turn, which is especially risky
    because collaboration tool exposure depends on
    `turn_context.multi_agent_version`.
    
    ## What changed
    
    - In the `TurnMultiAgentRuntime::Preview` path, prefer
    `Session::multi_agent_version()` when it is already known.
    - Only fall back to `model_info.multi_agent_version` and feature
    defaults when the session has not resolved a runtime yet.
    - Keep preview mode read-only: this still avoids storing a runtime
    during startup prewarm.
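    
    The selection order above can be sketched as follows. This is a minimal
    illustration, not the actual implementation: the `MultiAgentVersion`
    enum and the `preview_multi_agent_version` helper are hypothetical
    stand-ins for the session and model-info lookups.
    
    ```rust
    /// Hypothetical stand-in for the version stored on a turn context.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum MultiAgentVersion {
        V1,
        V2,
    }
    
    /// Picks the version a preview turn should use.
    /// It only reads its inputs and stores nothing, mirroring the read-only preview path.
    fn preview_multi_agent_version(
        session_resolved: Option<MultiAgentVersion>,
        model_default: MultiAgentVersion,
    ) -> MultiAgentVersion {
        // Prefer the runtime the session already resolved; fall back only when it is unknown.
        session_resolved.unwrap_or(model_default)
    }
    ```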
    
    ## Testing
    
    - Not run (small runtime-selection follow-up)
  • Resolve per-thread multi-agent runtime (#25722)
    Stack split from #25708. Original PR intentionally left open. This third
    PR resolves the effective per-thread multi-agent runtime from persisted
    metadata, inherited runtime, and current model selection.
  • Persist multi-agent runtime metadata (#25721)
    Stack split from #25708. Original PR intentionally left open. This
    second PR persists multi-agent runtime metadata through thread creation,
    rollout recording, and thread storage.
  • Add multi-agent runtime metadata types (#25720)
    Stack split from #25708. Original PR intentionally left open. This first
    PR adds the multi-agent runtime metadata types and catalog plumbing used
    by the rest of the stack.
  • feat: reuse compressed rollout search snippets (#25814)
    ## Summary
    - teach rollout search to return precomputed snippets for compressed
    rollouts
    - reuse those snippets in local thread search instead of reopening
    matching compressed files
    - keep the no-`rg` fallback single-pass and add regression coverage for
    the compressed path
    
    ## Why
    `thread/search` currently decodes matching compressed rollouts twice:
    once to discover the matching path and again to extract the snippet
    shown in results. That defeats a meaningful part of the compressed-read
    optimization work.
    
    ## Impact
    Compressed rollout hits now pay one decode pass on the search path while
    plain `.jsonl` hits keep the existing ripgrep-driven flow.
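    
    A rough sketch of the single-pass idea, assuming the compressed rollout
    has already been decoded into text. `RolloutSearchHit` and
    `search_decoded_rollout` are hypothetical names, and taking the first
    matching line as the snippet is an assumption made for illustration.
    
    ```rust
    /// Hypothetical search hit: the matching path plus a snippet captured in the same pass.
    #[derive(Debug, PartialEq, Eq)]
    struct RolloutSearchHit {
        path: String,
        snippet: String,
    }
    
    /// Scans already-decoded rollout text once and returns the first matching line
    /// as the snippet, so the caller never needs to reopen the compressed file.
    fn search_decoded_rollout(path: &str, decoded: &str, query: &str) -> Option<RolloutSearchHit> {
        decoded
            .lines()
            .find(|line| line.contains(query))
            .map(|line| RolloutSearchHit {
                path: path.to_string(),
                snippet: line.trim().to_string(),
            })
    }
    ```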
    
    ## Validation
    - `just test -p codex-rollout`
    - `just test -p codex-thread-store`
    - `just fix -p codex-rollout`
    - `just fix -p codex-thread-store`
    - `just fmt`
  • [codex] Validate plugin skill base names (#25782)
    ## Summary
    
    - Validate skill base name length before plugin namespacing.
    - Bound the composed `plugin:skill` qualified name to 128 characters.
    - Keep plugin skill runtime names in the existing `plugin:skill` form.
    - Add regression tests for the max qualified-name boundary and rejection
    path.
    
    ## Root Cause
    
    Plugin skills are represented as `plugin_name:skill_name`, but the
    loader previously applied the 64-character skill name limit after adding
    the plugin namespace. Moving that check to the base name fixes valid
    plugin skills with longer namespaces, and the separate 128-character
    qualified-name limit keeps model-visible skill names bounded.
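    
    The two checks can be sketched like this. The constant and function
    names are hypothetical, and counting length in characters rather than
    bytes is an assumption; only the 64 and 128 limits come from this PR.
    
    ```rust
    const MAX_SKILL_BASE_NAME_LEN: usize = 64;
    const MAX_QUALIFIED_NAME_LEN: usize = 128;
    
    /// Validates the base name first, then bounds the composed `plugin:skill` name.
    fn qualify_plugin_skill_name(plugin: &str, skill: &str) -> Result<String, String> {
        // Check the skill's own name before the plugin namespace is added.
        if skill.chars().count() > MAX_SKILL_BASE_NAME_LEN {
            return Err(format!(
                "skill name exceeds {MAX_SKILL_BASE_NAME_LEN} characters"
            ));
        }
        let qualified = format!("{plugin}:{skill}");
        // Separately bound the model-visible qualified name.
        if qualified.chars().count() > MAX_QUALIFIED_NAME_LEN {
            return Err(format!(
                "qualified name exceeds {MAX_QUALIFIED_NAME_LEN} characters"
            ));
        }
        Ok(qualified)
    }
    ```
    
    Under these assumptions, a 70-character plugin namespace with a short
    skill name is accepted, while the old post-namespacing 64-character
    check would have rejected it.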
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-core-skills plugin_skill_name_length_limit`
    - `git diff --check`
  • [codex] Move plugin discoverable logic into core-plugins (#25783)
    ## Summary
    - Move plugin discoverable recommendation filtering from `codex-core`
    into `codex-core-plugins` behind `ToolSuggestPluginDiscoveryInput`.
    - Keep `codex-core` as a thin adapter from `Config` to the core-plugins
    API and back to `DiscoverablePluginInfo`.
    - Keep the existing discoverable allowlist private to the core-plugins
    implementation.
    
    ## Validation
    - `just fmt`
    - `just test -p codex-core list_tool_suggest_discoverable_plugins`
    - `git diff --check`
    - Read-only subagent review: no findings
  • [codex] Cache remote plugin catalog for suggestions (#25457)
    ## Summary
    - cache the global remote plugin catalog when remote plugin listing runs
    and warm it during startup
    - use the cached remote catalog in plugin install recommendations with
    canonical `plugin@openai-curated-remote` ids
    - reuse the session `PluginsManager` for plugin recommendations so
    remote cache state is visible on the recommend path
    - skip core installed-state verification for remote plugin install
    suggestions while leaving local plugin and connector verification
    unchanged
    
    ## Testing
    - `just fmt`
    - `git diff --check`
    - `cargo test -p codex-core
    list_tool_suggest_discoverable_plugins_includes_cached_remote_global_plugins`
    - `cargo test -p codex-core
    remote_plugin_install_suggestions_skip_core_installed_verification`
    - `cargo test -p codex-app-server
    plugin_list_includes_remote_marketplaces_when_remote_plugin_enabled`
    
    Earlier focused checks on the same branch: codex-tools TUI filter
    test, request_plugin_install tests, and codex-app-server build.
  • [codex] Add plugin list JSON output (#25330)
    ## Summary
    - add `--json` output to `codex plugin list` with `installed` and
    `available` arrays
    - add `--available` for JSON output only; using it without `--json` is
    rejected
    - keep the existing non-JSON table output unchanged
    - add CLI coverage for JSON installed/available output and the
    `--available`/`--json` requirement
    
    ## Validation
    - `just test -p codex-cli plugin_list`
    - `just fix -p codex-cli`
    - `git diff --check`
    
    Note: `just fmt` ran Rust formatting first, then failed in the Python
    ruff step because `openai-codex-cli-bin==0.132.0` has no wheel for this
    Linux platform.
  • feat: show enterprise monthly credit limits in status (#24812)
    ## Summary
    
    Enterprise users can have an effective monthly credit limit, but Codex
    `/status` currently drops that metadata from the account-usage response.
    
    This change adds the optional `spend_control.individual_limit`
    projection to the existing rate-limit snapshot flow. The backend client
    reads the monthly limit, app-server exposes it as `individualLimit`, and
    the TUI renders a `Monthly credit limit` row through the existing
    progress-bar renderer.
    
    When the backend does not return an effective monthly limit, existing
    rate-limit behavior is unchanged.
    
    ## Existing backend state
    
    The account-usage backend already returns the effective monthly limit
    and current usage together:
    
    ```json
    {
      "spend_control": {
        "reached": false,
        "individual_limit": {
          "limit": "25000",
          "used": "8000",
          "remaining": "17000",
          "used_percent": 32,
          "remaining_percent": 68,
          "reset_after_seconds": 86400,
          "reset_at": 1778137680
        }
      }
    }
    ```
    
    Before this change, Codex projected rolling `primary` and `secondary`
    windows plus `credits`. It ignored `spend_control.individual_limit`, so
    app-server clients and `/status` could not render the monthly cap.
    
    The updated flow is:
    
    ```text
    account usage backend
      -> backend-client reads spend_control.individual_limit
      -> existing rate-limit snapshot carries optional individual_limit
      -> app-server exposes optional individualLimit
      -> TUI renders Monthly credit limit
    ```
    
    ## App-server contract
    
    `account/rateLimits/read` and sparse `account/rateLimits/updated`
    notifications now include an additive nullable
    `rateLimits.individualLimit` field:
    
    ```json
    {
      "individualLimit": {
        "limit": "25000",
        "used": "8000",
        "remainingPercent": 68,
        "resetsAt": 1778137680
      }
    }
    ```
    
    In an `account/rateLimits/read` response, `null` means no monthly limit
    is available. `account/rateLimits/updated` remains a sparse rolling
    notification: clients merge available values into their most recent
    `account/rateLimits/read` snapshot or refetch. Nullable account metadata
    in a rolling notification does not clear a previously observed value.
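    
    A client-side cache following these merge rules might look like the
    sketch below. The `RateLimitsCache` type and its methods are
    hypothetical; only the field names mirror the `individualLimit` payload
    above.
    
    ```rust
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct IndividualLimit {
        limit: String,
        used: String,
        remaining_percent: u8,
        resets_at: i64,
    }
    
    /// Hypothetical client cache holding the most recent snapshot.
    #[derive(Clone, Debug, Default, PartialEq, Eq)]
    struct RateLimitsCache {
        individual_limit: Option<IndividualLimit>,
    }
    
    impl RateLimitsCache {
        /// A full `account/rateLimits/read` response is authoritative: `None` clears the limit.
        fn apply_read(&mut self, individual_limit: Option<IndividualLimit>) {
            self.individual_limit = individual_limit;
        }
    
        /// A sparse `account/rateLimits/updated` notification only overwrites when it
        /// carries a value; `None` leaves the previously observed limit in place.
        fn apply_updated(&mut self, individual_limit: Option<IndividualLimit>) {
            if let Some(limit) = individual_limit {
                self.individual_limit = Some(limit);
            }
        }
    }
    ```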
    
    ## Design decisions
    
    - Extend the existing rate-limit snapshot instead of introducing a
    separate request or wire-level update protocol.
    - Keep the Codex projection narrow: `/status` needs the effective limit,
    current usage, remaining percentage, and reset timestamp.
    - Render the monthly row through the existing progress-bar renderer,
    with one optional detail line for `8,000 of 25,000 credits used`.
    - Keep the backend response optional so existing accounts and older
    usage states preserve their current behavior.
    - Preserve cached monthly metadata when sparse rolling notifications
    omit it. Live account-usage reads remain authoritative and can clear a
    removed limit.
    
    ## Visual evidence
    
    ```text
     Monthly credit limit:   [██████████████░░░░░░] 68% left (resets 07:08 on 7 May)
                             8,000 of 25,000 credits used
    ```
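    
    For illustration only, the row above could be produced by helpers like
    these. The function names are hypothetical, and two behaviors are
    assumptions inferred from the sample output rather than from the TUI
    source: the filled cells track the remaining percentage, and the cell
    count is rounded to the nearest cell.
    
    ```rust
    /// Inserts `,` thousands separators into a string of ASCII digits.
    fn group_thousands(digits: &str) -> String {
        let len = digits.len();
        let mut out = String::new();
        for (i, ch) in digits.chars().enumerate() {
            // Add a separator whenever a multiple of three digits remains.
            if i > 0 && (len - i) % 3 == 0 {
                out.push(',');
            }
            out.push(ch);
        }
        out
    }
    
    /// Renders a bar whose filled cells represent the remaining percentage.
    fn render_bar(remaining_percent: u8, width: usize) -> String {
        let pct = remaining_percent.min(100) as usize;
        // Round to the nearest cell (assumed rounding rule).
        let filled = (pct * width + 50) / 100;
        format!("[{}{}]", "█".repeat(filled), "░".repeat(width - filled))
    }
    
    /// Builds the optional detail line from the string-encoded credit amounts.
    fn detail_line(used: &str, limit: &str) -> String {
        format!(
            "{} of {} credits used",
            group_thousands(used),
            group_thousands(limit)
        )
    }
    ```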
    
    Snapshot:
    `codex-rs/tui/src/status/snapshots/codex_tui__status__tests__status_snapshot_includes_enterprise_monthly_credit_limit.snap`
    
    ## Testing
    
    Tests: generated app-server schema verification, protocol tests,
    backend-client tests, app-server integration coverage, TUI snapshot
    coverage, formatting, and workspace lint cleanup.
  • Move code review rules into AGENTS (#25738)
    ## Why
    Codex Review now supports repository-specific review rules in AGENTS.md.
    Adding the review prompts there makes the guidance available as
    repository review rules next to the code it governs while keeping the
    existing local review skills intact.
    
    ## What changed
    - Added a `## Code Review Rules` section to `AGENTS.md` with the
    existing review prompts for model context, breaking changes, test
    authoring, and change size.
    - Preserved the existing `.codex/skills/code-review*` skill files.
    
    ## Verification
    - `git diff --check origin/main...HEAD`
  • [codex] Add comprehensive root formatting check (#25683)
    ## Why
    
    The root formatting entrypoints could drift: `just fmt` did not format
    the Justfile itself, and the CI-facing check recipe only checked Python
    scripts instead of matching everything formatted by `just fmt`.
    
    ## What changed
    
    - Add a shared cross-platform Python formatter driver used by both `just
    fmt` and `just fmt-check`.
    - Run Justfile, Rust, Python SDK, and internal-script formatter groups
    concurrently while buffering each formatter group's output until it
    finishes.
    - Log formatter starts immediately, then print each formatter group's
    labeled output when it completes.
    - Keep the SDK lint-fix and Ruff formatting passes ordered, with source
    comments explaining their distinct roles and the check-mode equivalents.
    - Run Ruff through shared `uv run --no-sync --with ruff` overlays so
    formatting works on clean glibc Linux checkouts without installing the
    platform-specific SDK runtime wheel.
    - Show `fmt-check` help text in `just -l` and simplify CI to call the
    shared driver through `just fmt-check`.
    - Pin the general CI workflow to `just@1.51.0` so its formatter agrees
    with the checked-in Justfile.
    - Add regression coverage for the thin Just recipes and the driver's
    formatter graph.
    
    ## Validation
    
    - `just fmt`
    - `just fmt-check`
    - `python3 -m pytest
    sdk/python/tests/test_artifact_workflow_and_binaries.py -k 'root_fmt or
    root_format' -q`
    - `pnpm run format`
    - `git diff --check`
    - `just -l | rg -n '^    fmt|fmt-check'`
    - `uvx --from uv==0.7.22 uv run --frozen --project sdk/python --no-sync
    --with ruff ruff check --diff sdk/python`
  • feat(remote-control): add pairing start (#25675)
    ## Why
    
    Remote control enrollment authorizes a desktop server, but app-server v2
    did not expose the follow-up pairing operation needed to mint a
    short-lived controller pairing artifact from that enrolled server.
    Clients need a narrow RPC that starts pairing without exposing the
    backend `serverId` or conflating pairing with websocket connection
    state.
    
    Issue: N/A; internal remote-control pairing API change.
    
    ## What Changed
    
    Added experimental app-server v2 `remoteControl/pairing/start` with
    `manualCode` input and `pairingCode`, nullable `manualPairingCode`,
    `environmentId`, and Unix-seconds `expiresAt` output. The method
    serializes under its own `global("remote-control-pairing")` scope and is
    documented in `app-server/README.md`.
    
    Extended the remote-control transport with private `/server/pair`
    request/response types and normalized `pair_url` handling. Pairing uses
    the current enrolled server bearer, refreshes that bearer when needed,
    keeps backend `server_id` private, validates returned `server_id` and
    `environment_id` against the current enrollment, and preserves backend
    status/header/body context for failures and malformed responses.
    
    Wired the request through `RemoteControlRequestProcessor` and
    `MessageProcessor`, mapping unavailable/disabled pairing to
    `invalid_request` and backend failures to internal errors.
    
    ## Verification
    
    - `just test -p codex-app-server-transport`
    - `just test -p codex-app-server
    remote_control_pairing_start_returns_pairing_artifacts`
  • Handle invalid plugin skills manifest field (#25717)
    ## Summary
    - Treat invalid `plugin.json` `skills` shapes as a field-level warning
    instead of rejecting the whole manifest
    - Keep valid string path behavior unchanged and continue falling back to
    the default `skills/` root
    - Add regression coverage for array-shaped `skills`
    
    ## Tests
    - `just fmt`
    - `cargo test -p codex-core-plugins`
  • Move cloud requirements crate to cloud config (#24621)
    ## Summary
    
    - Moves the existing `codex-cloud-requirements` crate to
    `codex-cloud-config`.
    - Updates workspace dependencies and imports to the new crate name.
    - Intentionally keeps runtime behavior unchanged: this still fetches the
    legacy cloud requirements endpoint.
    
    ## Details
    
    This PR exists to make the lineage obvious before the bundle migration.
    GitHub should show the old `codex-rs/cloud-requirements/src/lib.rs`
    implementation as moved to `codex-rs/cloud-config/src/lib.rs`, rather
    than as unrelated new code.
    
    The follow-up PR adapts this moved crate to the new config bundle API
    and switches runtime consumers over.
  • app-server: remove experimental persist_extended_history bool flag (#25712)
    ## Summary
    
    Remove the dead experimental `persistExtendedHistory` app-server flag
    and collapse rollout persistence to the single policy app-server already
    used.
    
    ## What Changed
    
    - Removed `persistExtendedHistory` from v2 thread start/resume/fork
    params and deleted its deprecation notice path.
    - Removed the persistence-mode enums and plumbing through core, rollout,
    and thread-store.
    - Made rollout filtering mode-free, keeping the existing limited
    persisted-history behavior.
    
    ## Test Plan
    
    - `just write-app-server-schema`
    - `cargo nextest run --no-fail-fast -p codex-app-server-protocol
    schema_fixtures`
    - `cargo nextest run --no-fail-fast -p codex-app-server
    thread_shell_command_history_responses_exclude_persisted_command_executions`
    - `cargo nextest run --no-fail-fast -p codex-rollout -p
    codex-thread-store`
    - final `rg` for removed flag/type names
  • Wire managed MITM CA trust into child env (#22668)
    ## Stack
    1. Parent PR: #18240 uses named MITM permissions config.
    2. This PR wires managed MITM CA trust into spawned child processes.
    
    ## Why
    When Codex terminates HTTPS for limited mode or MITM hooks, child HTTPS
    clients need to trust Codex's managed MITM CA. Exporting proxy URLs
    alone is not enough, but blindly replacing user CA settings would be
    wrong: it can break custom enterprise/test roots, leak unreadable CA
    files into generated bundles, or make the child env disagree with its
    sandbox policy.
    
    ## Summary
    1. Build immutable managed CA bundles under `$CODEX_HOME/proxy` that
    include native roots, the managed MITM CA, and only inherited or
    command-scoped CA bundles the child is allowed to read.
    2. Export curated CA env vars alongside managed proxy env vars while
    preserving user CA override semantics, including nested Codex
    `SSL_CERT_FILE` precedence.
    3. Thread generated CA bundle paths into child sandbox readable roots,
    including debug sandbox execution, so the exported env vars work inside
    sandboxed commands.
    4. Remove only Codex-generated MITM CA bundle env when a child
    intentionally drops managed proxying for escalation or no-proxy retry.
    5. Document the managed CA bundle behavior and cover env injection,
    per-child bundle generation, sandbox readable roots, and no-proxy
    cleanup in tests.
    
    ## Validation
    1. Ran `just test -p codex-network-proxy`.
    2. Ran `just test -p codex-protocol`.
    3. Ran `just fix -p codex-network-proxy -p codex-protocol`.
    4. Tried focused `codex-core` validation, but the crate currently fails
    to compile in `core/tests/suite/guardian_review.rs` because an existing
    `Op::UserInput` initializer is missing `additional_context`.
    
    ---------
    
    Co-authored-by: Eva Wong <evawong@openai.com>
  • Reject directory rollout paths for pathless side chats (#25661)
    ## Why
    
    Fixes openai/codex#20944.
    
    Desktop side chats are intentionally ephemeral and pathless. They can
    still accept live turns while loaded, but after a reload there is no
    persisted rollout to resume. In the reported failure mode, Desktop could
    send `$CODEX_HOME` as the resume/fork path for one of these pathless
    side chats.
    
    `thread/resume` and `thread/fork` prefer an explicit `path` over
    `threadId`, and rollout path lookup only checked that a candidate
    existed. That let `$CODEX_HOME` pass as a rollout path, so the later
    rollout reader tried to open a directory and surfaced the low-level `Is
    a directory` error.
    
    ## What Changed
    
    - Reject explicit rollout paths that resolve to a directory or other
    non-file before attempting to read rollout history.
    - Make `codex_rollout::existing_rollout_path` return only plain or
    compressed rollout candidates that are actual files.
    - Add an app-server regression test that creates an ephemeral fork, runs
    a turn while the side thread is loaded, simulates reload, then verifies
    both `thread/resume` and `thread/fork` reject `$CODEX_HOME` with `path
    is a directory` instead of the OS-level directory-read error.
    - Rebase over the `TestAppServer` rename and update the remaining stale
    test harness call sites to use `TestAppServer` with `app_server` local
    variables.
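    
    The file check can be sketched with the standard library alone. This is
    not the code in `read_thread.rs` or `compression.rs`; the
    `validate_rollout_path` helper and the wording of the non-directory
    error are assumptions, while the `path is a directory` message matches
    the one asserted by the regression test.
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::Path;
    
    /// Rejects explicit rollout paths that do not resolve to a regular file.
    fn validate_rollout_path(path: &Path) -> io::Result<()> {
        // `fs::metadata` follows symlinks, so the check applies to the resolved target.
        let metadata = fs::metadata(path)?;
        if metadata.is_dir() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "path is a directory",
            ));
        }
        if !metadata.is_file() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "path is not a regular file",
            ));
        }
        Ok(())
    }
    ```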
    
    Relevant code:
    
    - `thread-store/src/local/read_thread.rs` validates explicit rollout
    paths before rollout reading:
    https://github.com/openai/codex/blob/25b47c8f425d351aaba4baa955a8092064a1707b/codex-rs/thread-store/src/local/read_thread.rs#L146-L165
    - `rollout/src/compression.rs` now requires file metadata for plain and
    compressed rollout candidates:
    https://github.com/openai/codex/blob/25b47c8f425d351aaba4baa955a8092064a1707b/codex-rs/rollout/src/compression.rs#L940-L950
    - The repro test covers the pathless ephemeral side-chat reload case:
    https://github.com/openai/codex/blob/25b47c8f425d351aaba4baa955a8092064a1707b/codex-rs/app-server/tests/suite/v2/thread_fork.rs#L774-L886
    
    ## Verification
    
    - `just test -p codex-app-server
    pathless_ephemeral_thread_rejects_codex_home_path_after_reload`
  • [codex] Publish release symbol artifacts (#25649)
    ## Why
    
    Production Codex binaries are stripped for distribution, which leaves
    crashes and samples from released builds without the symbols needed for
    useful stack traces. Publish symbols as separate release assets so
    production artifacts stay small while released builds can still be
    symbolicated.
    
    ## What changed
    
    - Add `.github/scripts/archive-release-symbols-and-strip-binaries.sh` to
    package platform-native symbols into `codex-symbols-<artifact>.tar.gz`
    assets while stripping the corresponding Unix binaries before signing.
    - Build release binaries with full debug information before producing
    distribution artifacts.
    - Publish macOS `.dSYM` bundles, Linux `.debug` files with
    `.gnu_debuglink`, and Windows `.pdb` files.
    - Strip Linux `bwrap` before computing its packaged-resource digest, but
    intentionally omit `bwrap` from symbol archives.
    - Preserve symbols artifacts in the unsigned macOS promotion flow.
    
    ## Verification
    
    - Ran `shellcheck` and `bash -n` on
    `.github/scripts/archive-release-symbols-and-strip-binaries.sh`.
    - Parsed the modified workflow YAML files and ran `git diff --check`.
    - Built a macOS release smoke binary and verified that the archived
    `.dSYM` contains DWARF application source information and has the same
    UUID as the stripped production binary.
    - Built Linux smoke binaries and verified that the symbol archive
    contains `codex.debug`, excludes `bwrap.debug`, leaves the expected
    `.gnu_debuglink` in `codex`, and does not mutate the separately stripped
    `bwrap` digest.
    - Staged a Windows smoke archive and verified that it contains the
    expected `.pdb` file.
  • fix(tui): clarify footer shortcut overlay hints (#25625)
    ## Why
    
    The TUI shortcut overlay used static labels for `Tab` and `Ctrl+C`, even
    though both keys change behavior while a task is running. That made the
    visible help misleading: idle `Tab` submits rather than queues, and
    active-turn `Ctrl+C` interrupts rather than exits.
    
    Closes #25531.
    Closes #25564.
    
    ## What Changed
    
    - Pass task-running state into the shortcut overlay renderer.
    - Render `Tab` as `submit message` while idle and `queue message` while
    work is running.
    - Render `Ctrl+C` as `exit` while idle and `interrupt` while work is
    running.
    - Add snapshot coverage for the active-work shortcut overlay and update
    idle overlay snapshots.
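    
    The state-dependent labels amount to a small mapping. The sketch below
    uses hypothetical function names and is not the renderer in
    `footer.rs`; the label strings are the ones listed above.
    
    ```rust
    /// Label for `Tab`, depending on whether a task is running.
    fn tab_hint(task_running: bool) -> &'static str {
        if task_running {
            "queue message"
        } else {
            "submit message"
        }
    }
    
    /// Label for `Ctrl+C`, depending on whether a task is running.
    fn ctrl_c_hint(task_running: bool) -> &'static str {
        if task_running {
            "interrupt"
        } else {
            "exit"
        }
    }
    
    /// Builds the two overlay lines affected by this change.
    fn shortcut_overlay_lines(task_running: bool) -> Vec<String> {
        vec![
            format!("tab to {}", tab_hint(task_running)),
            format!("ctrl + c to {}", ctrl_c_hint(task_running)),
        ]
    }
    ```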
    
    ## How to Test
    
    1. Start Codex and open the shortcut overlay with `?` while no task is
    running.
    2. Confirm the overlay shows `tab to submit message` and `ctrl + c to
    exit`.
    3. Start a task, then open or keep the shortcut overlay visible while
    work is running.
    4. Confirm the overlay shows `tab to queue message` and `ctrl + c to
    interrupt`.
    5. Type a follow-up prompt during active work and press `Tab`; confirm
    it queues rather than submitting immediately.
    
    Targeted tests:
    
    - `just test -p codex-tui footer_snapshots`
    - `just test -p codex-tui footer_mode_snapshots`
    
    ## Validation Notes
    
    `just test -p codex-tui` currently has two unrelated guardian
    feature-flag test failures on this base:
    
    -
    `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
    -
    `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    
    `just argument-comment-lint codex-rs/tui/src/bottom_pane/footer.rs`
    could not run locally because the prebuilt wrapper requires `dotslash`;
    the touched Rust diff was manually inspected for opaque positional
    literals.
  • Move tool search metadata onto ToolExecutor (#25684)
    Deferred tools need to be searchable even when they are not implemented
    inside `codex-core`. Extension-provided tools can be registered for
    later discovery, but the search metadata path was still owned by
    core-specific runtime hooks, which meant the shared `ToolExecutor`
    abstraction could not describe how a deferred extension tool should
    appear in `tool_search`.
    
    ## Changes
    
    - Move `ToolSearchEntry` and `ToolSearchInfo` into `codex-tools` and
    re-export them from the shared tools crate.
    - Add a default `ToolExecutor::search_info` implementation that derives
    loadable tool-search metadata from function and namespace specs.
    - Forward search metadata through extension adapters and exposure
    overrides while keeping custom search text/source metadata for dynamic,
    MCP, and multi-agent tools.
    - Remove the old core-local `tool_search_entry` module now that search
    metadata lives with the shared executor APIs.
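    
    As a rough picture of the default-method pattern, consider the sketch
    below. The struct fields, the `spec` method, and the search-text format
    are simplified, hypothetical shapes; the real `ToolSearchInfo` and
    `ToolExecutor` definitions live in `codex-tools` and may differ.
    
    ```rust
    /// Hypothetical, simplified search metadata.
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct ToolSearchInfo {
        search_text: String,
        source: Option<String>,
    }
    
    /// Hypothetical, simplified tool spec.
    struct ToolSpec {
        namespace: Option<String>,
        name: String,
        description: String,
    }
    
    trait ToolExecutor {
        fn spec(&self) -> ToolSpec;
    
        /// Default: derive search metadata from the tool's own spec.
        fn search_info(&self) -> ToolSearchInfo {
            let spec = self.spec();
            let qualified = match spec.namespace {
                Some(ns) => format!("{}.{}", ns, spec.name),
                None => spec.name,
            };
            ToolSearchInfo {
                search_text: format!("{} {}", qualified, spec.description),
                source: None,
            }
        }
    }
    
    /// An extension tool that relies on the default implementation.
    struct ExtensionTool;
    
    impl ToolExecutor for ExtensionTool {
        fn spec(&self) -> ToolSpec {
            ToolSpec {
                namespace: Some("calendar".to_string()),
                name: "create_event".to_string(),
                description: "Create a calendar event".to_string(),
            }
        }
    }
    
    /// A tool that keeps custom search text and source metadata by overriding the default.
    struct McpTool;
    
    impl ToolExecutor for McpTool {
        fn spec(&self) -> ToolSpec {
            ToolSpec {
                namespace: None,
                name: "lookup".to_string(),
                description: "Look up documentation".to_string(),
            }
        }
    
        fn search_info(&self) -> ToolSearchInfo {
            ToolSearchInfo {
                search_text: "docs lookup search".to_string(),
                source: Some("mcp:docs".to_string()),
            }
        }
    }
    ```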
    
    ## Testing
    
    - Added `deferred_extension_tools_are_discoverable_with_tool_search`
    coverage in `core/src/tools/spec_plan_tests.rs`.
  • Fix stale TestAppServer rename in plugin_list test (#25705)
    ## Why
    
    #25701 renamed the app-server test harness to `TestAppServer`, but it
    raced with #25681, which added a new `plugin_list` test call site still
    using the old `McpProcess` name. Once both changes met on `main`,
    app-server test builds failed before running the suite because
    `McpProcess` no longer exists in that scope.
    
    This PR fixes that CI break by updating the remaining stale call site to
    the renamed helper.
    
    ## What Changed
    
    - Replaced the `McpProcess::new(...)` use in
    `codex-rs/app-server/tests/suite/v2/plugin_list.rs` with
    `TestAppServer::new(...)`.
    - Renamed the local variable from `mcp` to `app_server` at the same call
    site to match the helper rename.
    
    Relevant code:
    https://github.com/openai/codex/blob/aadd9c999b4e0789f7afb2b9b8cc43000bb47e86/codex-rs/app-server/tests/suite/v2/plugin_list.rs#L234-L246
    
    ## Verification
    
    Not run locally; this is a compile fix for the app-server test harness
    rename.
  • [codex] enable parallel standalone web search calls (#25702)
    ## Summary
    - opt the extension-backed standalone `web.run` tool into parallel tool
    execution
    - update the existing extension registration test to assert that the
    tool advertises parallel-call support
    
    ## Why
    The standalone web-search API endpoint now supports parallel requests.
    The extension executor still inherited the shared serial default,
    causing multiple `web.run` calls to acquire the exclusive runtime lock.
    
    ## Impact
    Models that emit multiple standalone web-search calls can now execute
    them concurrently when model-level parallel tool calls are enabled.
    
    ## Validation
    - `just fmt`
    - `just test -p codex-web-search-extension`
    - `git diff --check origin/main...HEAD`
  • fix: rename McpProcess to TestAppServer (#25701)
    This PR brought to you via VS Code rather than Codex...
    
    - opened `codex-rs/app-server/tests/common/mcp_process.rs`
    - put the cursor on `McpProcess`
    - hit `F2` and renamed the symbol to `TestAppServer`
    - went to the file tree
    - hit enter and renamed `mcp_process.rs` to `test_app_server.rs`
    - ran **Save All Files** from the Command Palette
    - ran `just fmt`
    
    The End
    
    (Admittedly, most of the local variables for `TestAppServer` are still
    named `mcp`, though.)
  • fix: Deduplicate installed local and remote curated plugins (#25681)
    ## Summary
    - Deduplicate installed `openai-curated` and `openai-curated-remote`
    plugin conflicts by feature flag.
    - Prefer remote when remote plugins are enabled; otherwise prefer local,
    while preserving one-sided installs.
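    
    One way to picture the preference rule, assuming plugins are matched by
    name across the two marketplaces. `InstalledPlugin` and `dedupe_curated`
    are hypothetical names, not the types used in the plugin manager.
    
    ```rust
    use std::collections::HashSet;
    
    const LOCAL: &str = "openai-curated";
    const REMOTE: &str = "openai-curated-remote";
    
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct InstalledPlugin {
        name: String,
        marketplace: String,
    }
    
    /// Drops the non-preferred copy when the same plugin is installed from both
    /// curated marketplaces; one-sided installs pass through unchanged.
    fn dedupe_curated(plugins: Vec<InstalledPlugin>, remote_enabled: bool) -> Vec<InstalledPlugin> {
        let (preferred, other) = if remote_enabled {
            (REMOTE, LOCAL)
        } else {
            (LOCAL, REMOTE)
        };
        // Names that have an install from the preferred marketplace.
        let preferred_names: HashSet<String> = plugins
            .iter()
            .filter(|p| p.marketplace == preferred)
            .map(|p| p.name.clone())
            .collect();
        plugins
            .into_iter()
            .filter(|p| !(p.marketplace == other && preferred_names.contains(&p.name)))
            .collect()
    }
    ```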
    
    ## Testing
    - `just fmt`
    - `git diff --check`
    - Targeted `just test` was blocked locally because `cargo-nextest` is
    not installed.
  • Add Python version compatibility guidance (#25690)
    ## Why
    
    Python contributions in this repository should target the declared
    Python 3 runtime instead of carrying Python 2 compatibility patterns
    forward. When compatibility across Python 3 point releases matters,
    contributors need a consistent source of truth for the minimum supported
    version.
    
    ## What changed
    
    - Added Python development guidance to `AGENTS.md` stating that the
    repository uses Python 3+ and should not use the `__future__` module.
    - Documented that contributors should check the nearest `pyproject.toml`
    `requires-python` field when evaluating Python 3 point-release
    compatibility.
    
    ## Testing
    
    Not run (guidance-only change).
  • [codex] Generalize deferred nested tool guidance (#25689)
    ## Summary
    - describe omitted code-mode tools as deferred nested tools instead of
    MCP/app tools
    - update the prompt-description assertion to match
    
    ## Why
    Deferred dynamic tools are also callable through `tools` and
    discoverable in `ALL_TOOLS`, so the previous MCP/app-specific wording
    was too narrow.
    
    ## Validation
    - `just fmt`
    - `just test -p codex-code-mode`
    - `git diff --check`
  • Add rollout compression histograms (#25680)
    ## Summary
    
    Stacked on #25679. Add histogram telemetry for rollout compression
    runtime, per-file compression time, byte sizes, and compression ratio.
    
    ## Changes
    
    - Emit `codex.rollout_compression.run.duration_ms` tagged by final run
    status.
    - Emit `codex.rollout_compression.file.duration_ms` tagged by file
    outcome.
    - Emit source and compressed byte histograms for compression
    candidates/results.
    - Emit `codex.rollout_compression.file.compression_ratio` for successful
    compressions, recorded as integer basis points.
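    
    For reference, an integer basis-point ratio can be computed as below.
    The helper name is hypothetical, and defining the ratio as compressed
    size over source size is an assumption; the metric could equally be
    defined the other way around.
    
    ```rust
    /// Returns compressed/source as integer basis points (10_000 = same size),
    /// or `None` when the source is empty or the result does not fit in `u64`.
    fn compression_ratio_basis_points(source_bytes: u64, compressed_bytes: u64) -> Option<u64> {
        if source_bytes == 0 {
            return None;
        }
        // Widen to u128 so the multiplication cannot overflow.
        let bp = (compressed_bytes as u128 * 10_000) / source_bytes as u128;
        u64::try_from(bp).ok()
    }
    ```
    
    Under that definition, a 1,000-byte rollout compressed to 250 bytes is
    recorded as 2,500 basis points.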
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-rollout`
    - `just fix -p codex-rollout`
  • [codex] document out-of-line test module convention (#25682)
    ## Why
    
    New unit test modules should follow one consistent layout so
    implementation files stay focused and test suites remain easy to locate,
    without creating cleanup churn in existing inline test modules.
    
    ## What changed
    
    - Added `AGENTS.md` guidance requiring new test modules to use separate
    sibling `*_tests.rs` files with an explicit `#[path = "..._tests.rs"]`
    attribute.
    - Clarified that existing inline `#[cfg(test)] mod tests { ... }`
    modules should not be moved solely to follow the new convention.
    
    ## Validation
    
    - Ran `git diff --check`.
  • Add rollout compression counters (#25679)
    ## Summary
    
    Add counter telemetry for the local rollout compression worker so we can
    see when it runs, why it skips, and how individual file/materialization
    paths resolve.
    
    ## Changes
    
    - Emit `codex.rollout_compression.run` with statuses for start,
    completion, failure, duplicate-run skip, and missing runtime skip.
    - Emit `codex.rollout_compression.file` outcomes for scanned,
    compressed, skipped, and failed compression candidates.
    - Emit `codex.rollout_compression.temp_cleanup` and
    `codex.rollout_compression.materialize` counters for cleanup and
    decompression paths.
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-rollout`
    - `just fix -p codex-rollout`
  • refactor: hide shell override for zsh fork unified exec (#24980)
    ## Why
    
    When unified exec is configured to launch through the zsh fork, local
    commands should not let the model override the shell binary with the
    `shell` parameter. The configured zsh fork is the mechanism that makes
    `execv(2)` interception reliable, so exposing `shell` for local zsh-fork
    execution would create a confusing API surface and undermine the
    composition.
    
    Remote environments are different: zsh-fork interception is local-only,
    so remote unified-exec calls must keep direct unified-exec behavior and
    still expose `shell` when a remote environment can be selected.
    
    ## What Changed
    
    - Taught the `exec_command` schema builder to omit the `shell` parameter
    when requested.
    - Hid `shell` from the unified-exec tool schema only when zsh-fork
    unified exec applies to all selectable environments.
    - Kept `shell` visible when any remote environment can be targeted,
    because those calls run through direct unified exec.
    - Made unified exec choose the effective shell mode per selected
    environment: local environments keep zsh-fork mode, remote environments
    use direct mode.
    - Left direct unified-exec behavior unchanged, including support for
    model-specified shells there.
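
    The visibility rule above can be sketched roughly as follows. All type
    and function names here are hypothetical and only mirror the prose;
    they are not the identifiers used in `codex-core`.

```rust
/// Hypothetical shell modes for unified exec.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShellMode {
    Direct,
    ZshFork,
}

/// Hypothetical classification of a selectable environment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EnvironmentKind {
    Local,
    Remote,
}

/// Shell mode for one selected environment: zsh-fork interception is
/// local-only, so remote environments always use direct mode.
fn effective_shell_mode(configured: ShellMode, env: EnvironmentKind) -> ShellMode {
    match (configured, env) {
        (ShellMode::ZshFork, EnvironmentKind::Local) => ShellMode::ZshFork,
        _ => ShellMode::Direct,
    }
}

/// Expose `shell` in the tool schema unless zsh-fork mode applies to
/// every selectable environment.
fn expose_shell_param(configured: ShellMode, selectable: &[EnvironmentKind]) -> bool {
    configured == ShellMode::Direct
        || selectable.iter().any(|env| *env == EnvironmentKind::Remote)
}
```

    Under these assumptions, a local-only zsh-fork setup hides `shell`,
    while adding a single remote environment makes it visible again.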
    
    ## Verification
    
    - Added schema coverage showing `exec_command` can hide `shell`.
    - Added planner coverage showing zsh-fork unified exec hides `shell` for
    local-only execution while direct unified exec still exposes it.
    - Added planner coverage showing `shell` remains visible when a remote
    environment is available.
    - Added handler coverage showing remote environments use direct
    unified-exec shell mode instead of zsh-fork mode.
    - Ran the focused `codex-core` shell-parameter and zsh-fork tests.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/24980).
    * #24982
    * #24981
    * __->__ #24980
  • feat: gate unified exec zsh fork composition (#24979)
    ## Why
    
    `shell_zsh_fork` and unified exec need to remain independently
    controllable for enterprise rollouts, but we also need a third mode that
    composes them. That composed mode is intended to preserve unified exec
    command lifecycle support while letting the zsh fork provide more
    accurate `execv(2)` interception.
    
    Enabling `unified_exec_zsh_fork` by itself is intentionally not
    sufficient. It is a composition gate, not a dependency-enabling
    shortcut:
    
    - `unified_exec` selects the PTY-backed unified exec tool.
    - `shell_zsh_fork` opts into the zsh fork backend.
    - `unified_exec_zsh_fork` only allows those two already-enabled modes to
    be composed so local zsh unified exec commands can launch through the
    zsh fork.
    
    This separation is deliberate. Enterprises and staged rollouts must be
    able to enable or disable unified exec and zsh-fork independently. If
    `unified_exec_zsh_fork` implied either dependency, then enabling one
    under-development composition flag would silently activate a shell
    backend that the configured feature set left disabled.
    
    This PR introduces only the configuration and planning gate for that
    composition. Existing `shell_zsh_fork` behavior continues to use the
    standalone shell tool unless the new composition feature is explicitly
    enabled alongside both dependencies.
    
    ## What Changed
    
    - Added the under-development feature flag `unified_exec_zsh_fork`.
    - Added `UnifiedExecFeatureMode` so the three input feature flags
    collapse into `Disabled`, `Direct`, or `ZshFork` mode before tool
    planning.
    - Updated tool selection so zsh-fork composition requires
    `unified_exec`, `shell_zsh_fork`, and `unified_exec_zsh_fork`.
    - Kept the existing standalone zsh-fork shell tool behavior when only
    `shell_zsh_fork` is enabled.
    - Updated config schema output for the new feature flag.
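
    A minimal sketch of how the three flags might collapse into one mode.
    `UnifiedExecFeatureMode` and its variants come from the description
    above; the `FeatureFlags` struct and `resolve_mode` function are
    hypothetical. Mapping every non-composed case with `unified_exec`
    enabled to `Direct` is an assumption, since the description does not
    spell out each combination.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnifiedExecFeatureMode {
    Disabled,
    Direct,
    ZshFork,
}

/// Hypothetical container for the three input feature flags.
#[derive(Debug, Clone, Copy, Default)]
struct FeatureFlags {
    unified_exec: bool,
    shell_zsh_fork: bool,
    unified_exec_zsh_fork: bool,
}

fn resolve_mode(flags: FeatureFlags) -> UnifiedExecFeatureMode {
    if !flags.unified_exec {
        // The composition gate never enables unified exec by itself.
        return UnifiedExecFeatureMode::Disabled;
    }
    if flags.shell_zsh_fork && flags.unified_exec_zsh_fork {
        // All three flags are required for the composed mode.
        UnifiedExecFeatureMode::ZshFork
    } else {
        // Assumption: any other combination leaves unified exec direct.
        UnifiedExecFeatureMode::Direct
    }
}
```

    Collapsing the flags once, before tool planning, means later code can
    match on a single enum instead of re-checking three booleans.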
    
    ## Verification
    
    - Added feature and tool-config coverage for the new gate.
    - Added planner coverage proving `shell_zsh_fork` remains standalone
    until composition is explicitly enabled.
    - Ran focused tests for `codex-features`, `codex-tools`, and the
    affected `codex-core` planner case.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/24979).
    * #24982
    * #24981
    * #24980
    * __->__ #24979
  • fix: deflake zsh-fork approval test (#25669)
    Fixes this flake:
    https://github.com/openai/codex/actions/runs/26773809591/job/78919970410?pr=25659
    
    This test is about zsh-fork subcommand approval behavior, not workspace
    sandboxing, so it now runs with `DangerFullAccess` to avoid macOS
    sandbox setup failures before the second subcommand approval.
  • exec-server: canonicalize bound filesystem paths (#25149)
    ## Summary
    - add executor filesystem canonicalization as a bound-path operation
    - route remote canonicalization through the exec-server filesystem RPC
    surface
    - keep path normalization attached to the filesystem that owns the path
    
    ## Stack
    - 2/5 in the skills path authority stack extracted from
    https://github.com/openai/codex/pull/25098
    - follows merged https://github.com/openai/codex/pull/25121
    
    ## Validation
    - `cd
    /Users/starr/code/codex-worktrees/pr-25098-restack-review-pr1b/codex-rs
    && just fmt`
    - Not run: tests/checks (not requested)
    - GitHub CI pending on rewritten head
  • [codex-rs] auto-review model override (#23767)
    ## Why
    
    Guardian auto-review normally uses the provider-preferred review model
    when one is available. Some parent models need model-catalog metadata to
    select a different review model while keeping older `/models` payloads
    compatible when that metadata is absent.
    
    ## What changed
    
    - Added optional `ModelInfo::auto_review_model_override` metadata to the
    public model payload as a review-model slug.
    - Updated Guardian review model selection to prefer the catalog override
    when present, while preserving the existing provider preferred-model
    path and parent-model fallback when it is omitted.
    - Added focused Guardian coverage for override and no-override model
    selection.
    - Added an `auto_review` core integration suite test that loads override
    metadata from a remote model catalog path and asserts the strict
    auto-review `/responses` request uses the catalog-selected review model.
    - Updated existing `ModelInfo` fixtures and local catalog constructors
    for the new optional field.
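
    The selection order described above could look roughly like this. The
    function name and parameters are hypothetical; only the precedence
    (catalog override, then provider-preferred model, then parent model)
    is taken from the description.

```rust
/// Pick the review model slug. Hypothetical helper: prefers the catalog
/// override, then the provider's preferred review model, and finally
/// falls back to the parent model.
fn select_review_model<'a>(
    catalog_override: Option<&'a str>,
    provider_preferred: Option<&'a str>,
    parent_model: &'a str,
) -> &'a str {
    catalog_override
        .or(provider_preferred)
        .unwrap_or(parent_model)
}
```

    Because the override is an `Option`, an older `/models` payload that
    omits the field simply takes the pre-existing path.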
    
    ## Validation
    
    - `cargo test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `cargo test -p codex-core guardian_review_uses_`
    - `cargo test -p codex-core
    remote_model_override_uses_catalog_model_for_strict_auto_review --test
    all`
    - `just fix -p codex-protocol`
    - `just fix -p codex-core`
    - `just fmt`
    - `git diff --check`
  • Check root Python script formatting in CI (#25165)
    ## Why
    
    Python files under `scripts/` were not covered by the repository
    formatting recipe or the CI formatting job, so formatting drift could
    merge unnoticed.
    
    ## What
    
    - Add a dedicated `scripts/pyproject.toml` and `scripts/uv.lock` so
    root-script formatting uses a locked Ruff version.
    - Extend `just fmt` to format root Python scripts and add
    `fmt-scripts-check` for CI.
    - Run `just fmt-scripts-check` from `.github/workflows/ci.yml`,
    installing `uv` through SHA-pinned `astral-sh/setup-uv` while retaining
    the `uv` `0.11.3` pin.
    - Apply Ruff formatting to the root Python scripts, including
    `scripts/just-shell.py`, and extend
    `sdk/python/tests/test_artifact_workflow_and_binaries.py` to cover the
    root formatting recipe.
    - Update `AGENTS.md` so agents run `just fmt` after code changes
    anywhere in the repository.
    
    ## Validation
    
    - Extended the existing Python SDK workflow test to assert that `just
    fmt` includes root Python scripts.
  • Throttle repeated rollout compression runs (#25659)
    ## Why
    
    [#25089](https://github.com/openai/codex/pull/25089) introduced the
    background worker that compresses cold archived rollouts, and
    [#25654](https://github.com/openai/codex/pull/25654) made that pass
    faster once it starts. But the worker still deleted
    `rollout-compression.lock` on successful exit, so the existing six-hour
    staleness window only helped with overlapping or crashed workers. Each
    new local thread-store initialization could immediately rescan archived
    rollouts even if a full pass had just finished.
    
    This change keeps the existing marker around long enough to throttle
    redundant reruns. The worker is still best-effort, but it no longer does
    repeated startup scans when nothing new is eligible for compression.
    
    ## What Changed
    
    - Replace the drop-scoped `CompressionLock` with a
    `CompressionRunMarker` that claims the existing
    `.tmp/rollout-compression.lock` path and leaves it in place after
    success.
    - Reuse the existing six-hour staleness window to block both overlapping
    starts and immediate reruns, while still letting a stale marker be
    reclaimed.
    - Update the worker docs and debug logging to describe the new "already
    running or recently ran" behavior.
    - Extend the rollout compression tests to assert that a successful run
    leaves the marker behind and that a fresh marker suppresses a new run.
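
    A rough sketch of the marker decision, assuming the caller has already
    read the marker's age from the filesystem. The enum, constant, and
    function names are hypothetical, and treating an age exactly equal to
    the window as stale is an assumption.

```rust
use std::time::Duration;

/// Six-hour staleness window, as described above.
const STALENESS_WINDOW: Duration = Duration::from_secs(6 * 60 * 60);

#[derive(Debug, PartialEq, Eq)]
enum MarkerDecision {
    /// No marker exists: claim it and run.
    Claim,
    /// The marker is stale: reclaim it and run.
    Reclaim,
    /// A worker is already running or recently ran: skip this pass.
    Skip,
}

/// `marker_age` is `None` when no marker file exists.
fn decide(marker_age: Option<Duration>) -> MarkerDecision {
    match marker_age {
        None => MarkerDecision::Claim,
        Some(age) if age >= STALENESS_WINDOW => MarkerDecision::Reclaim,
        Some(_) => MarkerDecision::Skip,
    }
}
```

    Leaving the marker in place after success is what turns the same check
    into a throttle: a fresh marker looks identical whether the previous
    worker is still running or has just finished.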
    
    ## Validation
    
    - `just test -p codex-rollout`
  • [codex] Consolidate shared prompts in codex-prompts (#25151)
    ## Why
    
    `codex_core` is consistently a bottleneck for incremental builds during
    iteration. The simplest fix is to make the crate smaller.
    
    ## Summary
    
    `codex-core` owns several reusable prompt renderers and static prompt
    assets, which makes the crate harder to split apart.
    
    Rename `codex-review-prompts` to `codex-prompts` and move shared review,
    goal, permissions, compaction, realtime, hierarchical AGENTS.md, and
    `apply_patch` prompts into it. Move prompt-only tests and update
    consumers and `CODEOWNERS`.
    
    ## Validation
    
    - `just test -p codex-prompts -p codex-apply-patch`
    - `just test -p codex-core prompt_caching`
    - Bazel builds for the affected crates
  • [codex] Make justfile recipes Windows-aware (#24983)
    ## Summary
    
    Make the root `justfile` usable from Windows without maintaining a
    separate Windows copy of most recipes.
    
    The repo recipes previously assumed POSIX shell behavior for things like
    variadic argument forwarding (`"$@"`) and stderr redirection
    (`2>/dev/null`). That made common workflows such as `just fmt`, `just
    test`, and `just log` unreliable from Windows. This PR introduces a
    small cross-platform shell adapter so recipes can stay mostly unified
    while still expanding the few shell-specific constructs correctly on
    macOS/Linux and Windows.
    
    ## What Changed
    
    - Add `scripts/just-shell.py` as the configured `just` shell adapter.
      - On Unix it invokes `sh -cu`.
      - On Windows it invokes `pwsh -CommandWithArgs` so arguments containing
      spaces are preserved.
    - Add portable recipe placeholders:
      - `{args}` expands to `"$@"` on Unix and the equivalent PowerShell
      forwarded-args expression on Windows.
      - `{stderr-null}` expands to the platform-specific stderr suppression
      used by `fmt`.
    - Convert most variadic one-line recipes to the unified `{args}` form,
    including `codex`, `exec`, `file-search`, `app-server-test-client`,
    `fix`, `clippy`, `bench`, `mcp-server-run`, `write-app-server-schema`,
    and `argument-comment-lint-from-source`.
    - Keep genuinely shell-specific recipes split or Unix-only for now,
    including recipes backed by `.sh` scripts or recipes whose bodies are
    more than simple command forwarding.
    - Add a Windows `just install` path that installs PowerShell via
    `winget` when `pwsh` is not available, then runs the same basic Rust
    setup steps.
    - Update the SDK test that validates the root `fmt` recipe so it
    recognizes the new portable stderr placeholder.
    
    ## Validation
    
    - `just --summary`
    - `just --dry-run fmt`
    - `just --dry-run bench-smoke`
    - `just --dry-run codex foo "bar binky" baz`
    - `just --dry-run write-hooks-schema`
    - `just --dry-run bazel-lock-update`
    - `just --dry-run argument-comment-lint-from-source -- "foo bar"`
    - `git diff --check -- justfile scripts/just-shell.py
    sdk/python/tests/test_artifact_workflow_and_binaries.py`
    - Verified Windows argv preservation through `scripts/just-shell.py`
    with arguments containing spaces.
    - `uv run --frozen --project sdk/python --extra dev pytest
    sdk/python/tests/test_artifact_workflow_and_binaries.py::test_root_fmt_recipe_formats_rust_and_python_sdk`
  • Preserve plugin app manifest order (#25491)
    ## Summary
    - Preserve app declaration order when loading plugin .app.json files.
    - Keep plugin connector summaries in plugin app order after connector
    metadata is merged and filtered.
    - Add regression coverage for .app.json order and connector summary
    order.
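
    One way to keep summaries in declaration order is to iterate the
    ordered app list and look each entry up in the merged metadata, rather
    than iterating the map. This is only a sketch: the function name and
    the `(id, display name)` shape are hypothetical.

```rust
use std::collections::HashMap;

/// Return `(app id, display name)` pairs in the order the plugin declared
/// its apps. Apps without metadata are filtered out, and metadata entries
/// that were not requested are never visited. Hypothetical helper.
fn connectors_in_app_order<'a>(
    app_ids: &[&'a str],
    metadata: &HashMap<&'a str, &'a str>,
) -> Vec<(&'a str, &'a str)> {
    app_ids
        .iter()
        .filter_map(|id| metadata.get(id).map(|name| (*id, *name)))
        .collect()
}
```

    Driving the loop from the ordered list makes the output independent
    of the map's iteration order.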
    
    ## Validation
    - `just fmt`
    - `just test -p codex-chatgpt
    connectors_for_plugin_apps_returns_only_requested_plugin_apps`
    - `just test -p codex-core-plugins
    effective_apps_preserves_app_config_order`
    - `just fix -p codex-core-plugins` (passes with existing clippy
    `large_enum_variant` warning in `core-plugins/src/manifest.rs`)
    - `just fix -p codex-chatgpt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
  • [codex] Rename multi-agent v2 assign_task to followup_task (#25636)
    ## Summary
    
    Renames the MultiAgentV2 turn-triggering tool from `assign_task` to
    `followup_task` so the exposed tool name better describes sending an
    additional task to an existing agent.
    
    This updates the tool spec, handler/module names, registry wiring,
    default multi-agent v2 usage hints, and tests. Rollout trace
    classification keeps accepting legacy `assign_task` events so older
    traces still reduce correctly, while docs show the new tool name.
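
    The backward-compatible classification might be sketched like this.
    The two tool names come from the description above; the constants and
    the function are hypothetical and ignore any tool namespacing.

```rust
/// Current tool name.
const FOLLOWUP_TASK: &str = "followup_task";
/// Legacy name that older rollout traces may still contain.
const LEGACY_ASSIGN_TASK: &str = "assign_task";

/// True when a trace event refers to the turn-triggering tool under
/// either its current or its legacy name.
fn is_followup_task_event(tool_name: &str) -> bool {
    tool_name == FOLLOWUP_TASK || tool_name == LEGACY_ASSIGN_TASK
}
```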
    
    ## Test plan
    
    - `just test -p codex-core followup_task`
    - `just test -p codex-core -E
    'test(multi_agent_feature_selects_one_agent_tool_family) |
    test(multi_agent_v2_can_use_configured_tool_namespace) |
    test(code_mode_only_can_expose_namespaced_multi_agent_v2_as_normal_tools)'`
    - `just test -p codex-rollout-trace`
    - `just fix -p codex-core`
    - `just fix -p codex-rollout-trace`
    
    Notes: `just fmt` ran `cargo fmt` but failed in the Python ruff phase
    because the local environment could not resolve `hatchling>=1.27.0` from
    the configured internal registry. A full `just test -p codex-core` also
    hit unrelated environment-sensitive integration failures involving
    missing spawned test binaries/sandbox behavior; the changed multi-agent
    spec/handler tests passed in the filtered runs above.
  • exec-server: add environment path refs (#25121)
    ## Summary
    - add public `codex_exec_server::EnvironmentPathRef`
    - bind an absolute path to its owning executor filesystem
    - keep path operations in the next review slice
    
    ## Stack
    - 1/5 in the skills path authority stack extracted from
    https://github.com/openai/codex/pull/25098
    
    ## Validation
    - `cd /Users/starr/code/codex-worktrees/pr-25098-restack4/codex-rs &&
    just fmt`
    - GitHub CI pending on rewritten head
  • Parallelize cold rollout compression (#25654)
    ## Why
    
    [#25089](https://github.com/openai/codex/pull/25089) added the
    background worker for compressing cold archived rollouts, but the worker
    still processed files effectively one at a time: each compression job
    was sent to `spawn_blocking` and then awaited before the next file
    started. On machines with a backlog of archived rollouts, that makes
    catch-up slower than it needs to be even though the actual compression
    work already runs off the async runtime.
    
    ## What Changed
    
    - Queue rollout compression work in a `JoinSet` while directory
    traversal continues.
    - Cap the worker at two in-flight compression jobs so it can overlap
    compression without turning the background task into unbounded blocking
    work.
    - Drain pending jobs before returning, including the
    `read_dir.next_entry()` error path, so every launched job still
    contributes to the final `compressed`, `skipped`, and `failed` stats.
    - Treat task join failures the same way as compression failures in the
    worker's warning and failure accounting.
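
    The bounded-concurrency pattern can be illustrated with plain
    `std::thread` handles. This is a simplified stand-in, not the worker's
    code: the real change uses a `JoinSet` with `spawn_blocking`, whereas
    this sketch joins the oldest job first instead of whichever finishes
    first. All names are hypothetical.

```rust
use std::collections::VecDeque;
use std::thread::{self, JoinHandle};

#[derive(Debug, Default, PartialEq, Eq)]
struct Stats {
    compressed: usize,
    skipped: usize,
    failed: usize,
}

#[derive(Debug)]
enum Outcome {
    Compressed,
    Skipped,
    Failed,
}

/// Cap on in-flight compression jobs, as described above.
const MAX_IN_FLIGHT: usize = 2;

fn record(stats: &mut Stats, joined: thread::Result<Outcome>) {
    match joined {
        Ok(Outcome::Compressed) => stats.compressed += 1,
        Ok(Outcome::Skipped) => stats.skipped += 1,
        // A panicked job (join failure) counts like a compression failure.
        Ok(Outcome::Failed) | Err(_) => stats.failed += 1,
    }
}

fn run_jobs<I, F>(jobs: I) -> Stats
where
    I: IntoIterator<Item = F>,
    F: FnOnce() -> Outcome + Send + 'static,
{
    let mut stats = Stats::default();
    let mut in_flight: VecDeque<JoinHandle<Outcome>> = VecDeque::new();
    for job in jobs {
        // Wait for the oldest job before exceeding the cap.
        if in_flight.len() == MAX_IN_FLIGHT {
            if let Some(handle) = in_flight.pop_front() {
                record(&mut stats, handle.join());
            }
        }
        in_flight.push_back(thread::spawn(job));
    }
    // Drain everything that was launched so no result is lost.
    for handle in in_flight {
        record(&mut stats, handle.join());
    }
    stats
}
```

    The final drain loop is the important part: every launched job is
    joined, so it always lands in exactly one of the three counters.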
  • [codex] Use git CLI for release Cargo fetches (#25644)
    ## Summary
    - Configure the rust-release build job with
    `CARGO_NET_GIT_FETCH_WITH_CLI=true`
    - Document the macOS SecureTransport/libgit2 failure mode that hit the
    `libwebrtc`/`libyuv` git submodule fetch
    
    ## Root cause
    The release run at
    https://github.com/openai/codex/actions/runs/26717498860/job/78745156683
    repeatedly failed before compilation because Cargo's libgit2 fetch path
    could not clone the nested `yuv-sys/libyuv` submodule from
    `chromium.googlesource.com`, ending with `SecureTransport error:
    connection closed via error`.
    
    ## Validation
    - `git diff --check`
    
    This is a workflow-only change, so I did not run Rust package tests.
  • Disable SQLite intrinsics for Windows x64 releases (#25490)
    ## Why
    
    Codex 0.135.0 started shipping bundled SQLite 3.51.x via SQLx 0.9.0 to
    avoid the older WAL corruption bug fixed by #24728. On Windows x64,
    #25367 reports an immediate `STATUS_ILLEGAL_INSTRUCTION` crash on a
    Haswell CPU when starting normal Codex paths.
    
    Rather than downgrading SQLite, this keeps the newer bundled SQLite
    source and removes SQLite compiler-intrinsic code paths from the Windows
    x64 release build.
    
    ## What changed
    
    For `x86_64-pc-windows-msvc` release builds, export
    `LIBSQLITE3_FLAGS=SQLITE_DISABLE_INTRINSIC` before `cargo build` in:
    
    - `.github/workflows/rust-release.yml`
    - `.github/workflows/rust-release-windows.yml`
    
    Other targets keep their current SQLite build flags.
    
    ## Verification
    
    - `git diff --check`
  • Compress cold local rollouts (#25089)
    ## Rollout compression stack
    
    This stack splits #24941 into reviewable steps for local rollout
    compression. The design is intentionally staged:
    
    1. Teach readers, listing, search, and lookup to understand compressed
    rollouts.
    2. Make append and resume paths materialize compressed rollouts back to
    plain JSONL before writing.
    3. Add a disabled-by-default worker that can compress cold archived
    rollouts behind `local_thread_store_compression`.
    
    The key invariant is that writers append to plain `.jsonl`. A
    `.jsonl.zst` file is a cold/read representation; if a write is needed,
    the compressed file is materialized back to plain JSONL first. Readers
    prefer plain `.jsonl` when both forms exist and can fall back to the
    compressed sibling during transitions.
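
    The reader-side preference can be sketched as follows. Both function
    names are hypothetical, and the existence check is passed in as a
    closure so the sketch does not touch the filesystem.

```rust
use std::path::{Path, PathBuf};

/// `foo.jsonl` -> `foo.jsonl.zst`.
fn compressed_sibling(plain: &Path) -> PathBuf {
    let mut name = plain.as_os_str().to_os_string();
    name.push(".zst");
    PathBuf::from(name)
}

/// Prefer the plain `.jsonl` file when it exists; otherwise fall back to
/// the compressed sibling. Returns `None` when neither form exists.
fn resolve_for_read<F>(plain: &Path, exists: F) -> Option<PathBuf>
where
    F: Fn(&Path) -> bool,
{
    if exists(plain) {
        return Some(plain.to_path_buf());
    }
    let compressed = compressed_sibling(plain);
    if exists(&compressed) {
        Some(compressed)
    } else {
        None
    }
}
```

    Checking the plain file first matters during transitions, when both
    forms can exist and the plain file is the one writers append to.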
    
    The worker is deliberately the last PR and remains behind an
    under-development feature flag. It currently scans only
    `archived_sessions`, not active `sessions`, because active sessions have
    the highest resume/append race risk. That means this stack does not yet
    compress most unarchived local history.
    
    ## Known race / follow-up
    
    The remaining unresolved design question is writer/compressor
    coordination. Even for archived rollouts, a resume or metadata update
    can append while the worker is replacing the plain file with
    `.jsonl.zst`; the current double-stat checks narrow but do not fully
    eliminate the window where a writer has opened the plain file before
    unlink. Do not treat the worker PR as production-ready until we either:
    
    - prevent append/resume paths from racing archived compression, or
    - introduce a shared representation/append lock or equivalent
    coordination.
    
    The first two PRs are useful independently: they make compressed
    rollouts readable and make append paths safely recover back to plain
    JSONL. The third PR isolates the worker behavior so that coordination
    issue is reviewable separately.
    
    ## Validation
    
    Focused local validation for the stack includes:
    
    - `just test -p codex-rollout`
    - `just test -p codex-thread-store` where thread-store paths were
    touched
    - `just test -p codex-features` for the feature flag slice
    - `just bazel-lock-check` after dependency graph changes
    - scoped `just fix -p ...` passes for changed crates
    
    CI is still the source of truth for the full platform matrix.
    
    ## This PR in the stack
    
    This is PR 3/3, based on #25088. It adds the under-development feature
    flag and starts the best-effort background worker when enabled. The
    worker currently compresses only cold archived rollouts, skips active
    sessions, verifies compressed output, preserves mtime and permissions,
    keeps a store-level lock heartbeat, and cleans stale temp files.
    
    Stack order:
    
    1. #25087: read compressed local rollouts.
    2. #25088: materialize compressed rollouts before append.
    3. This PR: add the disabled local compression worker.
  • Preserve renamed thread titles during reconciliation (#25624)
    ## Summary
    - preserve existing explicit SQLite thread titles during rollout
    reconciliation/backfill when the incoming rollout title is only
    first-message-derived
    - keep stale inferred-title repair behavior while avoiding session-index
    scans during startup backfill
    - add a regression test for renamed titles surviving reconcile
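
    A minimal sketch of the precedence rule in the first bullet. The
    `Title` enum and `reconcile_title` function are hypothetical; the real
    reconciliation also handles stale inferred-title repair, which is only
    approximated here by letting the incoming title win in all other
    cases.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Title {
    /// Set explicitly, for example by a rename.
    Explicit(String),
    /// Derived from the first message of the thread.
    FirstMessageDerived(String),
}

/// Keep an existing explicit title when the incoming rollout title is
/// only first-message-derived; otherwise take the incoming title.
fn reconcile_title(existing: Option<Title>, incoming: Title) -> Title {
    match (existing, incoming) {
        (Some(Title::Explicit(kept)), Title::FirstMessageDerived(_)) => Title::Explicit(kept),
        (_, incoming) => incoming,
    }
}
```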
    
    ## Testing
    - `just fmt`
    - `just test -p codex-rollout`
    - `just test -p codex-state`
  • Add reasoning-only status surface item (#25504)
    Closes #24886.
    
    ## Why
    Users can configure the TUI status line and terminal title with
    `model-with-reasoning`, but issue #24886 asks for a compact
    reasoning-only item. That lets a setup show just `default`, `low`,
    `medium`, `high`, or `xhigh` without repeating the model name.
    
    ## What changed
    - Added a `reasoning` item for `/statusline` and `/title` setup flows.
    - Rendered the item from the effective reasoning effort, including
    collaboration-mode overrides.
    - Registered `reasoning` with `codex doctor` so Codex-generated
    terminal-title config is not reported as invalid.
    - Updated TUI setup snapshots so the picker previews include the new
    item.