2094 Commits

  • Remove TUI legacy core test_support dependencies (#27484)
    ## Why
    
    The TUI now sits on the app-server layer, but
    `app-server-client::legacy_core` still exposed core test helpers solely
    for TUI tests. We've been whittling away the remaining dependencies.
    This is the next step on that journey.
    
    There is no functional change — just a refactor, and this affects only
    test code, so it should be low risk.
    
    ## What changed
    
    - remove the `legacy_core::test_support` re-export and call
    model-manager test helpers directly
    - keep the bundled model-preset cache local to TUI test support
    - import constraint types directly from `codex-config`
  • [codex] Remove redundant plugin app auth state (#27465)
    ## Summary
    
    - remove the redundant `needsAuth` field from `AppSummary` and generated
    app-server schemas
    - stop `plugin/read` from querying Apps MCP solely to hydrate unused
    connector auth state
    - preserve `plugin/install.appsNeedingAuth` membership and
    `app/list.isAccessible` as the authentication signals
    
    ## Why
    
    Codex App and TUI do not consume `plugin/read.plugin.apps[].needsAuth`.
    Hydrating it could establish an Apps MCP connection and discover tools
    on a cold `plugin/read` request, adding avoidable latency. The plugin
    APIs are still marked under development, so removing this wire field is
    preferable to retaining a misleading default.
    
    ## Verification
    
    - `just write-app-server-schema`
    - `just fmt`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    plugin_install_uses_remote_apps_needing_auth_response`
    - `just test -p codex-app-server
    plugin_install_returns_apps_needing_auth`
    - `just test -p codex-app-server
    plugin_read_returns_plugin_details_with_bundle_contents`
    - `just test -p codex-tui
    plugin_detail_popup_snapshot_shows_install_actions_and_capability_summaries`
    - `$xin-build` simplify and debug reviews
  • [codex] add /import for external agents (#27071)
    ## Why
    
    External-agent import should be discoverable and deliberate without
    blocking startup or claiming the public `codex [PROMPT]` CLI namespace.
    The slash command keeps the flow local to the interactive TUI and reuses
    the existing app-server import API.
    
    ## What changed
    
    - add the user-facing `/import` slash command
    - detect external-agent importable items only when the command is
    invoked
    - run imports through the embedded local app-server
    - show start and completion messages, refresh configuration, and block
    duplicate imports while one is pending
    - reject the flow for unsupported remote and local-daemon sessions
    
    ## Validation
    
    - `just test -p codex-tui external_agent_config_migration` (10 passed)
    - manually exercised an isolated TUI fixture with existing
    external-agent setup and session data using a fresh `CODEX_HOME`
    - verified picker customization, plugin and session detection, import
    completion, repeated invocation, and imported-session resume context
    - the broader `just test -p codex-tui` run passed 2,805 tests, with 2
    unrelated guardian feature-flag failures and 4 skipped tests
    
    ## Draft follow-ups
    
    - review whether completion messaging should remain attached to the
    initiating chat if the user switches chats during an import
    - review shutdown semantics for an in-progress background import
    
    ## Stack
    
    1. [#27064](https://github.com/openai/codex/pull/27064): remove the
    startup migration flow
    2. [#27065](https://github.com/openai/codex/pull/27065): extract the
    picker renderer
    3. [#27070](https://github.com/openai/codex/pull/27070): add the
    external-agent import picker UX
    4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow
    through `/import`
    
    **This PR is stack item 4.** Draft while the lower stack dependencies
    are reviewed.
  • [codex] add external agent import picker UX (#27070)
    ## Why
    
    Users need to understand what external-agent data Codex detected, what
    is selected, and how to proceed before an import begins. The updated
    picker makes focus, selection state, and the submission path explicit
    while preserving the existing import backend.
    
    ## What changed
    
    - replace the old migration prompt with a two-step external-agent import
    picker
    - add a customize view with explicit item focus, selection state,
    counts, and a review action
    - separate detected import data into a view model
    - add Unix and Windows snapshots for prompt, item-focus, and
    action-focus states
    
    ## Validation
    
    - `just test -p codex-tui external_agent_config_migration` (10 passed)
    - manually exercised an isolated TUI fixture covering customization,
    selection toggles, review, import, repeated invocation, and session
    resume
    - the broader `just test -p codex-tui` run passed 2,805 tests, with 2
    unrelated guardian feature-flag failures and 4 skipped tests
    
    ## Review note
    
    This is the largest layer in the stack because the interaction state,
    rendering changes, and required snapshots move together. It remains a
    draft in case reviewers prefer a further presentation/state split.
    
    ## Stack
    
    1. [#27064](https://github.com/openai/codex/pull/27064): remove the
    startup migration flow
    2. [#27065](https://github.com/openai/codex/pull/27065): extract the
    picker renderer
    3. [#27070](https://github.com/openai/codex/pull/27070): add the
    external-agent import picker UX
    4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow
    through `/import`
    
    **This PR is stack item 3.** Draft while the lower stack dependencies
    are reviewed.
  • [codex] extract external agent import picker renderer (#27065)
    ## Why
    
    The external-agent import picker is easier to review when its rendering
    refactor lands separately from new state and interaction behavior. This
    layer is intended to be behavior-neutral.
    
    ## What changed
    
    - extract external-agent migration rendering into a dedicated `render`
    module
    - preserve existing behavior while separating presentation from
    interaction logic
    - establish a smaller foundation for the import picker UX in the next PR
    
    ## Validation
    
    - `just test -p codex-tui external_agent_config_migration` (10 passed)
    
    ## Stack
    
    1. [#27064](https://github.com/openai/codex/pull/27064): remove the
    startup migration flow
    2. [#27065](https://github.com/openai/codex/pull/27065): extract the
    picker renderer
    3. [#27070](https://github.com/openai/codex/pull/27070): add the
    external-agent import picker UX
    4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow
    through `/import`
    
    **This PR is stack item 2.** Draft while the lower stack dependency is
    reviewed.
  • [codex] remove blocking external agent migration flow (#27064)
    ## Why
    
    External-agent import should be initiated deliberately instead of
    interrupting eligible TUI startups. This cleanup removes the blocking
    startup flow before the replacement import experience is introduced
    later in the stack.
    
    ## What changed
    
    - remove the startup-blocking external-agent migration prompt
    - remove the now-unused external migration feature gate
    - remove the obsolete TUI app-server migration wrappers
    - retain the dormant picker behind a module-scoped dead-code allowance
    until the next stack item wires it back in
    - keep normal TUI startup focused on entering Codex immediately
    
    ## Validation
    
    - `bazel build --config=clippy //codex-rs/tui:tui
    //codex-rs/tui:tui-unit-tests-bin`
    - `just test -p codex-tui external_agent_config_migration` (8 passed)
    - `just test -p codex-tui` (2,786 passed, 12 unrelated local
    environment-sensitive failures, 4 skipped)
    - `just fix -p codex-tui`
    - `just fmt`
    
    ## Stack
    
    1. [#27064](https://github.com/openai/codex/pull/27064): remove the
    startup migration flow
    2. [#27065](https://github.com/openai/codex/pull/27065): extract the
    picker renderer
    3. [#27070](https://github.com/openai/codex/pull/27070): add the
    external-agent import picker UX
    4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow
    through `/import`
    
    **This PR is stack item 1.**
  • fix: Auto-recover from corrupted sqlite databases (#26859)
    Further investigation of the SQLite incidents showed that the problems
    stem from corruption caused by the older SQLite version that we
    recently upgraded away from, and that the data in the root database is
    truly corrupted: full recovery is not possible. Because the data can be
    reconstructed from the rollouts on disk, we should automatically back
    up the database and let Codex rebuild the rollout info from those
    rollouts.
    
    With the new behavior, the app server automatically backs up the
    corrupt database and rebuilds it (with logs reflecting that behavior).
    The CLI now shows a message explaining what happened, along with the
    paths of the backed-up corrupt database and the new database. Context
    is also added so that the desktop app can read the rebuild info and
    inform the user.
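    
    As a rough illustration of the back-up step only, here is a minimal
    sketch that moves a corrupt database file aside so a fresh one can be
    created at the original path. The function name and the
    `.corrupt-<tag>` suffix are hypothetical and not taken from this
    change; the rebuild from rollouts is not shown.
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::{Path, PathBuf};

    /// Renames the database file to a sibling backup path and returns that path.
    /// The `.corrupt-<tag>` naming is an assumption made for this sketch.
    pub fn backup_corrupt_db(db_path: &Path, tag: &str) -> io::Result<PathBuf> {
        let file_name = db_path
            .file_name()
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?;
        let mut backup_name = file_name.to_os_string();
        backup_name.push(format!(".corrupt-{tag}"));
        let backup_path = db_path.with_file_name(backup_name);
        // After the rename, the original path is free for a rebuilt database.
        fs::rename(db_path, &backup_path)?;
        Ok(backup_path)
    }
    ```
    
    Returning the backup path lets the caller report both locations to the
    user, which matches the message described above.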
  • Add app-server thread/delete API (#25018)
    ## Why
    
    Clients can archive and unarchive threads today, but there is no
    app-server API for permanently removing a thread. Deletion also needs to
    cover the full session tree: deleting a main thread should remove
    spawned subagent threads and the related local metadata instead of
    leaving orphaned rollout files, goals, or subagent state behind.
    
    ## What
    
    - Adds the v2 `thread/delete` request and `thread/deleted` notification,
    with the response shape kept consistent with `thread/archive`.
    - Implements local hard delete for active and archived rollout files.
    - Deletes the requested thread's state DB row as the commit point, then
    best-effort cleans associated state including spawned descendants,
    goals, spawn edges, logs, dynamic tools, and agent job assignments.
    - Updates app-server API docs and generated protocol schema/TypeScript
    fixtures.
  • feat: keep child MCP warnings out of parent transcript (#27174)
    ## Why
    
    MCP startup status notifications are thread-owned, but `ChatWidget`
    trusted upstream routing. If routing state delivered a tagged child
    notification to the active parent widget, the child MCP failure could
    still mutate the parent's startup state and transcript. Rejecting it
    only inside the MCP handler was also too late because shared
    notification handling could already restore and consume the parent's
    retry status.
    
    ## What changed
    
    - Validate a tagged MCP status notification against the visible
    `ChatWidget` thread before shared notification handling mutates any
    parent state.
    - Cover child `Starting` and `Failed` notifications delivered to a
    retrying parent widget, asserting that they preserve its visible retry
    error and saved status header while producing no history or MCP status
    mutation.
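    
    The ordering is the important part: the ownership check has to run
    before any shared handling touches parent state. A minimal sketch,
    assuming a heavily simplified widget (the `Widget` struct, its fields,
    and `handle_mcp_status` are hypothetical stand-ins, not the real
    `ChatWidget` API):
    
    ```rust
    /// Simplified stand-in for the visible chat widget.
    pub struct Widget {
        pub thread_id: String,
        pub retry_status: Option<String>,
        pub history: Vec<String>,
    }

    impl Widget {
        pub fn handle_mcp_status(&mut self, tagged_thread: Option<&str>, message: &str) {
            // Reject notifications tagged for another thread first, so nothing
            // below can consume this widget's retry status or add history.
            if matches!(tagged_thread, Some(id) if id != self.thread_id.as_str()) {
                return;
            }
            // Simplified "shared notification handling": it consumes the saved
            // retry status and records the message.
            self.retry_status.take();
            self.history.push(message.to_string());
        }
    }
    ```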
    
    ## User impact
    
    Subagent MCP startup failures remain scoped to the child transcript
    instead of appearing as duplicate warnings in the parent transcript.
    
    ## Testing
    
    - `just test -p codex-tui mcp_startup_ignores_status_for_other_thread`
    - `just test -p codex-tui
    primary_thread_ignores_child_mcp_startup_notifications`
    - `just fmt`
  • Reduce TUI legacy core dependencies (#26711)
    ## Why
    
    The TUI still reached through `app-server-client::legacy_core` for
    thread-name normalization and project-instruction filename details. In
    particular, checking the TUI's local filesystem for `/init` is incorrect
    for remote app-server sessions, where the server owns the working
    directory and instruction discovery.
    
    ## What changed
    
    - use the instruction source paths supplied by the app server to decide
    whether `/init` should avoid overwriting project instructions
    - keep the small thread-name normalization helper local to the TUI
    - remove the now-unused instruction filename constants, utility module,
    and other unused `legacy_core` re-exports
    - make status helper tests independent of concrete instruction filenames
    
    ## Verification
    
    - `just test -p codex-app-server-client`
    - `just test -p codex-tui
    slash_init_skips_when_project_instructions_are_loaded`
    - `just test -p codex-tui` ran 2,799 tests; 2,797 passed and two
    unrelated guardian feature-flag tests failed reproducibly in untouched
    code
    
    ### Manual test
    
    Started an app server over WebSocket with a remote workspace containing
    `AGENTS.md`, then connected the TUI using `--remote`. After confirming
    `thread/start` returned the file in `instructionSources`, deleted
    `AGENTS.md` and ran `/init` in the existing session.
    
    The TUI still reported that project instructions already existed and
    skipped `/init`. The trace contained no `turn/start` request, confirming
    the decision came from app-server session state rather than a new
    client-local filesystem check.
  • fix: Prevent /review crash when entering Esc on steer message (#22879)
    This changes the `/review` escape path so `Esc` no longer behaves like
    the normal queued-follow-up interrupt flow while a review is running.
    Steering is not currently supported in `/review` mode. Without this
    change, users can attempt a steer, but doing so leads to a crash (see
    #22815). If the user has already tried to send additional guidance
    during `/review`, the TUI now keeps the review running and shows a
    warning that steer messages are not supported in that mode, while still
    pointing users to `Ctrl+C` if they actually want to cancel. It also adds
    regression coverage for the review-specific warning behavior. When users
    do cancel with Ctrl+C during /review, the TUI now tolerates the
    active-turn race that can happen during review handoff, and any queued
    steer messages are restored to the composer instead of being discarded.
    
    - Special-case `Esc` during an active `/review` when follow-up steer
    input is pending or has already been deferred.
    - Show a clear warning instead of interrupting the running review.
    - Make the Ctrl+C cancel path during /review resilient to active-turn
    races, while preserving any queued steer text by restoring it to the
    composer.
    - Add review-mode test coverage for the warning path.
    
    ## Testing
    
    1. Start a `/review` with a diff large enough that the review stays
       active for more than a few seconds.
    
    2. While the review is still running, type a follow-up / steer message,
       submit it, and then press `Esc`.  
       Before: `Esc` causes the TUI to close abruptly.  
       After: the review keeps running and the transcript shows a warning
       that steer messages are not supported during `/review`, with guidance
       to use `Ctrl+C` if you want to cancel.
    
    3. Press `Ctrl+C` if you actually want to stop the review.  
       Before: (after restarting the test, since step 2 crashed) this is the
       intentional cancellation path.  
       After: this remains the intentional cancellation path, and any queued
       follow-up steer text is restored to the composer instead of being
       lost.
       
    ## Note:
    `/review` mode explicitly does not support steering at this time (as
    noted in `turn_processer.rs`; if we want to explore that in the future,
    this code will need to be modified). This change keeps unsupported steer
    attempts from crashing the TUI and preserves queued follow-up text if
    the user cancels with `Ctrl+C`.
  • multi-agent: add path-based v2 activity tracking (#27007)
    ## Why
    
    Multi-agent v2 identifies agents by canonical paths, but its tool
    handlers still emitted the larger legacy collaboration begin/end events
    built around nickname and role metadata. App-server, rollout-trace,
    analytics, and TUI consumers therefore lacked one compact path-based
    completion signal that behaved consistently across live events and
    replay.
    
    The TUI also needs a bounded `/agent` status surface for v2 agents. It
    should use recent local activity for previews, refresh liveness without
    loading full histories, and keep the legacy picker available when no
    path-backed v2 agent is known.
    
    ## What changed
    
    - Replace the v2 `spawn_agent`, `send_message`, `followup_task`, and
    `interrupt_agent` legacy lifecycle emissions with a success-only
    `SubAgentActivity` event. The event records the tool call ID, occurrence
    time, affected thread, canonical agent path, and `started`,
    `interacted`, or `interrupted` kind.
    - Expose the activity as a completion-only app-server v2
    `subAgentActivity` thread item in live notifications and reconstructed
    history, regenerate the protocol schemas, and count it in sub-agent tool
    analytics.
    - Track canonical paths from live activity and loaded-thread metadata in
    the TUI, and render the activity in live and replayed transcripts.
    - Make `/agent` list running path-backed agents with summaries from
    bounded local event buffers. Each summary is capped at 240 graphemes,
    the scan is capped at six recent items, only the last three wrapped
    lines are shown, and command output is omitted. Liveness falls back to
    metadata-only `thread/read` when local turn state is unavailable.
    - Persist the activity as a terminal rollout-trace runtime payload and
    reduce it to the corresponding spawn, send, follow-up, or close
    interaction edge. `interrupt_agent` is classified as a close-edge
    operation.
    - Preserve the legacy picker when no path-backed v2 agent is known.
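    
    The `/agent` summary bounds can be sketched roughly as follows. This
    is an illustration under stated assumptions, not the TUI's code: it
    counts `char`s instead of graphemes to stay std-only, wraps by fixed
    width, and picks the most recent non-empty item among the scanned
    ones. The function and constant names are hypothetical.
    
    ```rust
    // The source caps summaries at 240 graphemes; `char`s approximate that here.
    const MAX_SUMMARY_CHARS: usize = 240;
    const MAX_SCANNED_ITEMS: usize = 6;
    const MAX_VISIBLE_LINES: usize = 3;

    /// Builds a bounded preview from recent items (oldest first, newest last).
    pub fn agent_preview(items: &[&str], width: usize) -> Vec<String> {
        // Only the six most recent items are ever inspected.
        let summary = items
            .iter()
            .rev()
            .take(MAX_SCANNED_ITEMS)
            .find(|item| !item.trim().is_empty());
        let Some(summary) = summary else {
            return Vec::new();
        };
        let capped: Vec<char> = summary.chars().take(MAX_SUMMARY_CHARS).collect();
        // Naive fixed-width wrapping; real wrapping would respect word boundaries.
        let lines: Vec<String> = capped
            .chunks(width.max(1))
            .map(|chunk| chunk.iter().collect())
            .collect();
        // Keep only the last three wrapped lines.
        let start = lines.len().saturating_sub(MAX_VISIBLE_LINES);
        lines[start..].to_vec()
    }
    ```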
    
    ## Compatibility
    
    App-server v2 clients that consumed `collabAgentToolCall` begin/end
    pairs for these tools must handle the new completion-only
    `subAgentActivity` item. Legacy v1 collaboration behavior is unchanged.
    
    ## Screenshot
    
    <img width="684" height="288" alt="Screenshot 2026-06-08 at 15 40 47"
    src="https://github.com/user-attachments/assets/194b3cd0-619d-45fb-b587-cf3e2b1b8a1d"
    />
    
    ## Testing
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-rollout-trace`
    - Added focused coverage for activity analytics, terminal trace
    serialization, spawn-edge reduction, `interrupt_agent` classification,
    TUI status rendering without aggregated command output, and clearing
    stale running state after a completed turn.
  • [codex] preserve fsmonitor for worktree Git reads (#26880)
    Codex forces `core.fsmonitor=false` on internal Git commands so a
    repository cannot select an executable fsmonitor helper. This also
    disables Git's built-in daemon for `status`, `diff`, and `ls-files`,
    turning those worktree reads into full scans in large repositories.
    
    Read the raw effective `core.fsmonitor` value and preserve it only when
    Git interprets it as true and advertises built-in daemon support through
    `git version --build-options`. Query uncommon boolean spellings back
    through Git using the exact effective value. Unset, false, helper paths,
    malformed values, probe failures, and unsupported Git builds continue to
    force `core.fsmonitor=false`.
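    
    The decision table above is small enough to sketch as a pure function.
    This is an illustration only: the type and field names are
    hypothetical, and the probe inputs are assumed to have been gathered
    elsewhere by running Git.
    
    ```rust
    #[derive(Debug, PartialEq, Eq)]
    pub enum FsmonitorDecision {
        /// Keep the repository's effective `core.fsmonitor` value.
        Preserve,
        /// Force `core.fsmonitor=false` on the Git command.
        ForceFalse,
    }

    /// Probe results; every field name here is an assumption for the sketch.
    pub struct FsmonitorProbe<'a> {
        /// Raw effective value; `None` when unset or when the probe failed.
        pub raw_value: Option<&'a str>,
        /// Git's own boolean reading; `None` for helper paths or malformed values.
        pub git_bool: Option<bool>,
        /// Whether `git version --build-options` advertises the built-in daemon.
        pub builtin_daemon: bool,
    }

    pub fn decide(probe: &FsmonitorProbe<'_>) -> FsmonitorDecision {
        match (probe.raw_value, probe.git_bool, probe.builtin_daemon) {
            // Only a value Git reads as true, on a build with the daemon, is kept.
            (Some(_), Some(true), true) => FsmonitorDecision::Preserve,
            // Everything else fails closed.
            _ => FsmonitorDecision::ForceFalse,
        }
    }
    ```
    
    Failing closed on every other combination keeps the original security
    property: a repository still cannot select an executable helper.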
    
    Centralize this policy in `git-utils` while keeping process execution in
    the existing local and workspace-command adapters. Probe once per
    worktree workflow and reuse the result for its Git commands, including
    the TUI `/diff` path. Metadata-only commands and repository discovery
    keep fsmonitor disabled without probing. Each probe and requested Git process
    keeps its own existing timeout, and the decision is not cached because
    layered and conditional Git configuration can change while Codex runs.
    
    ---------
    
    Co-authored-by: Chris Bookholt <bookholt@openai.com>
  • Preserve cloud requirements across TUI thread resets (#25177)
    Fixes a TUI regression where thread transitions such as `/new` and
    `/clear` could rebuild config without the cloud requirements loader,
    allowing users to fall back to non-cloud-managed settings. The config
    refresh path now preserves cloud requirements during thread
    reinitialization, and config loading is moved off the deep TUI event
    stack to avoid stack-overflow crashes during those reloads.
    
    - Passes the cloud requirements loader through TUI config rebuild paths.
    - Keeps cloud requirements applied for `/new`, `/clear`, `/fork`, side
    conversations, and session picker transitions.
    - Runs config building on a Tokio task so reloads do not occur on the
    deep TUI caller stack.
    - Adds regression coverage that cloud requirements survive
    thread-transition config refreshes.
    
    ## Test/Repro:
    - Start Codex with a cloud requirement applied.
    - Use `/new` or `/clear`.
    - The refreshed/fresh-session config should still include the cloud
    requirements.
    
    This can be tested with any config item. At the moment, the easiest
    item for OpenAI staff to test is the `mentions_v2` feature, which is
    currently enabled in cloud requirements but not enabled by default. As
    a result, prior to these changes, that feature is disabled after `/new`
    or `/clear`. Testing the same steps with a binary from this branch
    should not drop the feature enablement.
  • Show effective sandbox modes in /debug-config (#27068)
    ## Summary
    - Render `/debug-config`'s `allowed_sandbox_modes` from the finalized
    permission constraints instead of the raw requirements list.
    - Add regression coverage for configured full-access and external
    sandbox modes being omitted when effective permissions reject them.
    
    ## Details
    `allowed_sandbox_modes` comes from managed requirements, but the final
    permissions can be further constrained by derived validation rules. For
    example, `permissions.filesystem.deny_read` requires sandbox
    enforcement, so modes that disable or externalize Codex's sandbox are
    not actually usable even if they were present in the raw requirements
    TOML.
    
    The debug renderer now enumerates the configured sandbox-mode labels and
    keeps only those accepted by `Config.permissions`. That makes
    `/debug-config` reflect the same effective permission-profile constraint
    path used by runtime config validation, while preserving the existing
    source/provenance display.
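    
    In spirit, the renderer change is a filter over the configured labels.
    A minimal sketch, assuming the effective-permission check can be
    passed in as a predicate (the function name and the closure are
    hypothetical; the real check goes through `Config.permissions`):
    
    ```rust
    /// Keeps only the configured sandbox-mode labels that the effective
    /// permissions accept, preserving the configured order.
    pub fn effective_sandbox_modes<'a>(
        configured: &[&'a str],
        accepts: impl Fn(&str) -> bool,
    ) -> Vec<&'a str> {
        configured
            .iter()
            .copied()
            .filter(|mode| accepts(*mode))
            .collect()
    }
    ```
    
    For example, with a predicate that rejects modes which disable or
    externalize the sandbox, those labels drop out of the rendered list
    even though they appear in the raw requirements.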
    
    ## Validation
    - Added a regression test for effective sandbox-mode filtering in
    `/debug-config`.
  • fix(tui): linkify complete bare URLs with tildes (#27088)
    ## Background
    
    Bare URLs containing `~` in their path are currently only clickable up
    to the tilde in the interactive TUI. For example, Codex renders the
    visible text for:
    
    
    `https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf`
    
    but the OSC 8 destination stops at `https://www.cs.tufts.edu/`. This
    makes Cmd-click open the wrong location even though the terminal
    recognizes the complete URL outside Codex.
    
    Fixes #26774.
    
    ## Root Cause
    
    The URL scanner already accepts `~`. The truncation happens earlier:
    with strikethrough parsing enabled, `pulldown-cmark` splits this URL
    into adjacent decoded `Event::Text` values around the tilde. The
    Markdown renderer annotated each text event independently, so only the
    first event still looked like a complete URL with a supported scheme.
    
    The renderer now merges adjacent decoded text events before URL
    annotation. It preserves the combined source range while retaining
    parser-decoded contents, which avoids regressing entities such as
    `&amp;`.
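    
    The merge step can be sketched as follows. This is a simplified
    illustration: it uses a local `Event` enum instead of
    `pulldown-cmark`'s type, collects into a `Vec` rather than
    implementing an iterator adapter, and the exact way the parser splits
    the URL is assumed.
    
    ```rust
    use std::ops::Range;

    /// Local stand-in for the parser's event type (an assumption for this sketch).
    #[derive(Debug, Clone, PartialEq)]
    pub enum Event {
        Text(String),
        Other(&'static str),
    }

    /// Merges runs of adjacent text events, extending the source range to
    /// cover the whole run while keeping the already-decoded contents.
    pub fn merge_adjacent_text(
        events: impl IntoIterator<Item = (Event, Range<usize>)>,
    ) -> Vec<(Event, Range<usize>)> {
        let mut merged: Vec<(Event, Range<usize>)> = Vec::new();
        for (event, range) in events {
            if let (Event::Text(next), Some((Event::Text(prev), prev_range))) =
                (&event, merged.last_mut())
            {
                prev.push_str(next);
                prev_range.end = range.end;
                continue;
            }
            merged.push((event, range));
        }
        merged
    }
    ```
    
    Because the merged text is built from decoded event contents rather
    than re-sliced from the source, entity decoding is left untouched.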
    
    ## Changes
    
    - Add a small iterator that merges adjacent decoded Markdown text events
    and their source ranges.
    - Apply it at the Markdown renderer boundary before hyperlink detection.
    - Add regression coverage for the reported URL in prose, wrapped table
    output, and entity-decoded URLs.
    
    ## How to Test
    
    1. Run Codex with `just c`.
    2. Ask the assistant to output this exact bare URL with no Markdown link
    syntax:
    
    `https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf`
    3. Hold Cmd and hover or click the URL.
    4. Confirm the complete URL, including the suffix after `~`, is one
    destination.
    5. Repeat with the URL inside a Markdown table and confirm wrapped
    portions retain the same complete destination.
    
    Targeted tests:
    
    - `just test -p codex-tui url_with_tilde`
    - `just test -p codex-tui merged_text_events_preserve_entity_decoding`
    
    The full `codex-tui` test run was also executed. Its only failures were
    the two existing Guardian feature-flag tests:
    
    - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
  • fix: preserve auto review across config and delegation (#26230)
    ## Why
    
    Auto Review should remain the effective approval reviewer when settings
    cross runtime boundaries. A config or app-server round trip must not
    change the reviewer identity, and delegated work must not silently fall
    back to user review.
    
    This requires both a stable canonical serialized value and propagation
    of the effective setting. `auto_review` is the canonical value across
    protocol and app-server output, while `guardian_subagent` remains
    accepted as backward-compatible input.
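    
    The "one canonical output, two accepted inputs" contract can be
    sketched with plain `Display` and `FromStr`. This is a std-only
    illustration and does not show how the real types are serialized; the
    `User` variant and its `user` spelling are assumptions.
    
    ```rust
    use std::fmt;
    use std::str::FromStr;

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ApprovalsReviewer {
        User, // assumed variant, for illustration only
        AutoReview,
    }

    impl fmt::Display for ApprovalsReviewer {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // Output always uses the canonical spelling.
            f.write_str(match self {
                ApprovalsReviewer::User => "user",
                ApprovalsReviewer::AutoReview => "auto_review",
            })
        }
    }

    impl FromStr for ApprovalsReviewer {
        type Err = String;

        fn from_str(s: &str) -> Result<Self, Self::Err> {
            match s {
                "user" => Ok(Self::User),
                // The legacy spelling stays accepted on input.
                "auto_review" | "guardian_subagent" => Ok(Self::AutoReview),
                other => Err(format!("unknown approvals reviewer: {other}")),
            }
        }
    }
    ```
    
    A value read as `guardian_subagent` therefore round-trips to
    `auto_review` without changing which reviewer is selected.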
    
    ## What changed
    
    - serialize `ApprovalsReviewer::AutoReview` consistently as
    `auto_review` across core protocol and app-server v2
    - continue accepting `guardian_subagent` when reading existing config or
    client requests
    - carry the active turn's approval reviewer into spawned agents
    - update config/debug expectations and add delegated-task regression
    coverage
    
    ## Scope
    
    This does not change Guardian policy or remove compatibility with
    existing `guardian_subagent` inputs. It preserves the selected reviewer
    across serialization, config reloads, app-server settings, and delegated
    task setup.
    
    Related Guardian changes are split independently:
    
    - #26231 adds denials and soft denials
    - #26334 retries transient reviewer failures
    - #26333 reuses narrowly scoped low-risk approvals
    - #26232 adds TUI denial recovery
    
    ## Validation
    
    - `just test -p codex-app-server-protocol` (224 passed)
    - regression coverage for delegated task reviewer propagation
    - serialization coverage for canonical `auto_review` output and legacy
    `guardian_subagent` input
    
    ---------
    
    Co-authored-by: saud-oai <saud@openai.com>
  • fix(tui): scope MCP startup status by thread (#26639)
    ## Why
    
    MCP startup failures from spawned subagents were rendered as global
    notifications, so a child thread's failure could pollute the visible
    parent transcript. Routing the notification to the child exposed two
    related replay problems: session refresh could discard the buffered
    event, and a newly created child `ChatWidget` did not know the expected
    MCP server set, which could leave its startup spinner running after
    every server had settled.
    
    MCP startup diagnostics should remain visible in the thread that owns
    the startup without affecting other transcripts. The protocol also needs
    to support a future app-scoped MCP lifecycle where startup is not owned
    by any thread.
    
    ## Reported Behavior
    
    The [originating Slack
    report](https://openai.slack.com/archives/C08JZTV654K/p1780604538859939)
    called out that using subagents could turn MCP startup failures into a
    wall of yellow CLI warnings because repeated failures were not
    deduplicated. The intended behavior is for those diagnostics to remain
    visible once in the thread that owns the startup, without polluting the
    parent transcript.
    
    ## What Changed
    
    - add nullable `threadId` ownership to `mcpServer/startupStatus/updated`
    - populate it from the app-server conversation ID for the current
    thread-scoped lifecycle and regenerate the protocol schema and
    TypeScript artifacts
    - treat a missing or null `threadId` as app-scoped without injecting it
    into the active chat transcript
    - route and buffer thread-owned MCP startup notifications by thread in
    the TUI
    - preserve buffered MCP startup events across child session refresh
    - seed expected MCP servers before replaying a thread snapshot so
    startup reaches its terminal state
    - suppress an identical repeated failure warning for the same server
    within one startup round
    
    The owning thread still renders the detailed failure and final `MCP
    startup incomplete (...)` summary.
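    
    A minimal sketch of the routing and suppression rules, assuming a
    simplified router that only tracks failure warnings. The
    `McpStartupRouter` type and its methods are hypothetical and do not
    mirror the TUI's actual structures.
    
    ```rust
    use std::collections::{HashMap, HashSet};

    #[derive(Default)]
    pub struct McpStartupRouter {
        /// Warnings buffered per owning thread.
        buffered: HashMap<String, Vec<String>>,
        /// Notifications with no `threadId`; never injected into a transcript.
        app_scoped: Vec<String>,
        /// (thread, server, error) triples already shown in the current round.
        seen: HashSet<(String, String, String)>,
    }

    impl McpStartupRouter {
        /// Starts a new startup round for a thread, allowing warnings again.
        pub fn start_round(&mut self, thread_id: &str) {
            self.seen.retain(|(thread, _, _)| thread.as_str() != thread_id);
        }

        pub fn on_failure(&mut self, thread_id: Option<&str>, server: &str, error: &str) {
            let Some(id) = thread_id else {
                self.app_scoped.push(format!("{server}: {error}"));
                return;
            };
            let key = (id.to_string(), server.to_string(), error.to_string());
            // `insert` returns false for an identical repeat, which is suppressed.
            if self.seen.insert(key) {
                self.buffered
                    .entry(id.to_string())
                    .or_default()
                    .push(format!("{server}: {error}"));
            }
        }

        pub fn warnings_for(&self, thread_id: &str) -> &[String] {
            self.buffered.get(thread_id).map(Vec::as_slice).unwrap_or(&[])
        }

        pub fn app_scoped(&self) -> &[String] {
            &self.app_scoped
        }
    }
    ```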
    
    ## How to Test
    
    1. Configure an optional MCP server named `smoke` that exits during
    initialization.
    2. Launch the TUI with multi-agent support enabled.
    3. Confirm the main thread's own startup failure renders one detailed
    `smoke` warning and one incomplete-startup summary.
    4. Spawn exactly one subagent.
    5. Confirm the parent transcript does not receive the subagent's MCP
    startup failure.
    6. Switch to the subagent thread and confirm it contains exactly one
    detailed `smoke` failure and one incomplete-startup summary.
    7. Confirm the subagent's MCP startup spinner disappears and the thread
    remains usable.
    8. Switch between the parent and subagent and confirm the warnings
    neither move nor duplicate.
    
    Targeted tests:
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    thread_start_emits_mcp_server_status_updated_notifications`
    - `just test -p codex-tui mcp_startup`
    
    The parent/child behavior and spinner completion were also exercised
    manually in tmux. `just argument-comment-lint` was attempted but blocked
    by an unrelated local Bazel LLVM empty-glob failure; touched Rust
    callsites were inspected manually.
  • [codex] Deduplicate skill load warnings (#26698)
    Skill reloads can get noisy when the watcher keeps triggering
    `skills/list` and the same invalid `SKILL.md` error comes back each
    time.
    
    This keeps the first warning visible, then suppresses repeats while the
    same `(path, message)` is still active. If the error clears and later
    comes back, or if the message changes, it will show again.
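    
    A minimal sketch of that rule, assuming each `skills/list` result is
    reduced to a list of `(path, message)` pairs. The
    `SkillLoadWarningState` type and `update` method are hypothetical
    names, not the TUI's implementation.
    
    ```rust
    use std::collections::HashSet;

    #[derive(Default)]
    pub struct SkillLoadWarningState {
        /// Errors present in the most recent result.
        active: HashSet<(String, String)>,
    }

    impl SkillLoadWarningState {
        /// Takes the errors from the latest result and returns the ones that
        /// should be shown: those not already active from the previous result.
        pub fn update(&mut self, errors: &[(String, String)]) -> Vec<(String, String)> {
            let mut current = HashSet::new();
            let mut to_show = Vec::new();
            for error in errors {
                let first_in_batch = current.insert(error.clone());
                if first_in_batch && !self.active.contains(error) {
                    to_show.push(error.clone());
                }
            }
            // Errors missing from this result are forgotten, so they warn
            // again if they come back later.
            self.active = current;
            to_show
        }
    }
    ```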
    
    Validation:
    - `just fmt`
    - `just test -p codex-tui skill_load_warning_state`
  • permissions: enforce managed permission profile allowlists (#24852)
    ## Why
    
    Permission profile allowlists are an enterprise security boundary, but
    they also need to compose across the managed requirements layers added
    in #24620.
    
    A map representation lets each requirements layer add, allow, or revoke
    individual profiles without replacing an entire array.
    
    ## Managed Contract
    
    Administrators configure the mergeable allow map with
    `allowed_permission_profiles`. A recommended enterprise configuration
    explicitly lists every built-in and custom profile users should be able
    to select:
    
    ```toml
    default_permissions = "review_only"
    
    [allowed_permission_profiles]
    ":read-only" = true
    ":workspace" = true
    review_only = true
    # ":danger-full-access" is intentionally omitted, so it is denied.
    
    [permissions.review_only]
    extends = ":read-only"
    ```
    
    - Profiles whose effective merged value is `true` are allowed.
    - Missing profiles and profiles set to `false` are denied.
    - This is a closed allowlist: built-in profiles and profiles introduced
    in future versions are denied unless explicitly allowed.
    - Explicitly list each built-in profile the enterprise wants to make
    available. Omit built-ins such as `:danger-full-access` when they should
    remain unavailable.
    - Set `default_permissions` explicitly to the allowed profile users
    should receive when they have no local selection.
    - Higher-precedence layers override only the profile keys they define.
    - `false` is only needed when a higher-precedence layer must revoke a
    `true` inherited from a lower layer.
    - Explicit keys must refer to known built-in or managed profiles.
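
    As a rough sketch of the merge rules above, assuming layers are
    supplied from lowest to highest precedence. The function names
    `merge_allowed_profiles` and `is_profile_allowed` are illustrative and
    are not taken from `codex-config`:

    ```rust
    use std::collections::BTreeMap;

    /// Merges layers from lowest to highest precedence. A higher layer
    /// overrides only the profile keys it defines.
    pub fn merge_allowed_profiles(layers: &[BTreeMap<String, bool>]) -> BTreeMap<String, bool> {
        let mut merged = BTreeMap::new();
        for layer in layers {
            for (profile, allowed) in layer {
                merged.insert(profile.clone(), *allowed);
            }
        }
        merged
    }

    /// Closed allowlist: missing profiles and profiles set to `false` are denied.
    pub fn is_profile_allowed(merged: &BTreeMap<String, bool>, profile: &str) -> bool {
        merged.get(profile).copied().unwrap_or(false)
    }
    ```

    With a lower layer allowing `:read-only` and `:workspace`, and a higher
    layer setting `":workspace" = false`, only `:read-only` remains allowed;
    `:danger-full-access` is denied simply because no layer lists it.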
    
    A custom or narrowed allowlist requires an allowed
    `default_permissions`. For compatibility, if both `:workspace` and
    `:read-only` are explicitly allowed, an omitted default resolves to
    `:workspace`; customer configurations should still set the intended
    default explicitly.
    
    When `allowed_permission_profiles` is absent, existing implicit
    permission and legacy `sandbox_mode` behavior is unchanged.
    
    ## What Changed
    
    - Add `allowed_permission_profiles` as a `BTreeMap<String, bool>` that
    merges per profile across requirements layers.
    - Enforce managed defaults, strict denial of omitted profiles, and the
    explicitly allowed standard-pair fallback.
    - Expose `allowedPermissionProfiles` through `configRequirements/read`
    and regenerate its schemas.
    - Add regression coverage for map composition and revocation, managed
    defaults, strict denial of omitted built-ins, and API output.
    
    ## Verification
    
    - Focused `codex-config` coverage for layered map composition and
    revocation
    - Focused `codex-core` coverage for managed defaults, invalid defaults,
    strict denial of omitted built-ins, and the standard built-in pair
    - Focused `codex-app-server` coverage for requirements API output
    - Scoped Clippy for `codex-config`, `codex-core`,
    `codex-app-server-protocol`, and `codex-app-server`
    
    ## Documentation
    
    The managed `requirements.toml` documentation should introduce
    `allowed_permission_profiles` as a closed permission-profile allowlist
    before this setting is published on developers.openai.com.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-rs] support v2 personal access tokens (#25731)
    ## Summary
    
    - add v2 personal access token support for `codex login
    --with-access-token` and `CODEX_ACCESS_TOKEN`
    - classify opaque `at-` tokens separately from legacy Agent Identity
    JWTs
    - hydrate required ChatGPT account metadata through AuthAPI
    `/v1/user-auth-credential/whoami`
    - use PATs directly as bearer tokens while preserving existing ChatGPT
    account surfaces
    - expose PAT-backed auth as the explicit `personalAccessToken`
    app-server auth mode
    
    ## Implementation
    
    PAT auth is intentionally small and stateless. Loading a PAT performs
    one AuthAPI metadata request, stores the hydrated metadata in the
    in-memory auth object, and redacts the secret from debug output. Legacy
    Agent Identity JWT handling remains unchanged. The shared access-token
    classifier lives in a private neutral module because it dispatches
    between both credential types.
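
    A minimal sketch of the two ideas above: prefix-based classification
    and secret redaction in debug output. The type and function names are
    hypothetical, and treating every non-`at-` token as a legacy JWT is a
    simplifying assumption rather than the actual classifier logic.

    ```rust
    use std::fmt;

    #[derive(Debug, PartialEq, Eq)]
    pub enum AccessTokenKind {
        PersonalAccessToken,
        AgentIdentityJwt,
    }

    /// Opaque `at-` tokens are PATs; anything else is assumed to be a legacy JWT.
    pub fn classify_access_token(token: &str) -> AccessTokenKind {
        if token.starts_with("at-") {
            AccessTokenKind::PersonalAccessToken
        } else {
            AccessTokenKind::AgentIdentityJwt
        }
    }

    /// Hypothetical in-memory auth object holding hydrated metadata.
    pub struct PersonalAccessTokenAuth {
        pub secret: String,
        pub email: String,
    }

    impl fmt::Debug for PersonalAccessTokenAuth {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // Never print the secret, even in debug output.
            f.debug_struct("PersonalAccessTokenAuth")
                .field("secret", &"<redacted>")
                .field("email", &self.email)
                .finish()
        }
    }
    ```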
    
    PAT hydration fails closed when AuthAPI omits any required metadata,
    including email. Hydrated metadata is intentionally not persisted:
    startup performs a live `whoami` preflight so revoked tokens or changed
    account metadata are not accepted from a stale cache.
    
    ## Workspace restriction scope
    
    This change intentionally does **not** apply
    `forced_chatgpt_workspace_id` to PAT authentication. The setting is a
    client-side config guardrail, not an authorization boundary, and PAT
    does not currently require workspace-ID parity. The PAT login and
    `CODEX_ACCESS_TOKEN` paths therefore validate through AuthAPI without
    threading workspace-restriction state through access-token loading.
    Existing workspace checks for non-PAT auth remain on their established
    paths.
    
    ## App-server compatibility
    
    The public app-server `AuthMode` is shared across v1 and v2, and
    PAT-backed auth reports `personalAccessToken` through both APIs.
    Following human review, this intentionally removes the temporary v1
    compatibility mapping that reported PATs as `chatgpt`; the deprecated v1
    API is kept in parity with v2 rather than maintaining a separate closed
    enum. Clients with exhaustive auth-mode handling in either API version
    must add the new case and should generally treat it as ChatGPT-backed
    unless they need PAT-specific behavior.
    
    The v1 auth-status response still omits the raw PAT when `includeToken`
    is requested because that response cannot carry the account metadata
    needed to reuse the credential safely. Persisted PAT auth also omits the
    new enum value so older Codex builds can deserialize `auth.json` and
    infer PAT auth from the credential field after a rollback.
    
    ## Validation
    
    Latest review-fix validation:
    
    - `CARGO_INCREMENTAL=0 just test -p codex-login` (126 passed)
    - `CARGO_INCREMENTAL=0 just test -p codex-cli` (263 passed)
    - `CARGO_INCREMENTAL=0 just test -p codex-cli
    stored_auth_validation_handles_personal_access_token`
    - `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol` (226
    passed)
    - `CARGO_INCREMENTAL=0 just test -p codex-models-manager
    refresh_available_models_uses_remote_only_catalog_for_chatgpt_auth`
    - `CARGO_INCREMENTAL=0 just test -p codex-tui
    existing_non_oauth_chatgpt_login_counts_as_signed_in`
    - `CARGO_INCREMENTAL=0 just fix -p codex-login -p
    codex-app-server-protocol -p codex-models-manager -p codex-tui -p
    codex-cli`
    - `just fmt`
    - `git diff --check`
    
    The broader `codex-tui` suite previously compiled and ran 2,834 tests.
    Three unrelated environment-sensitive guardian/IDE-socket tests failed
    after retries; the PAT-relevant TUI coverage passed.
  • [codex] Gate terminal visualization instructions in TUI (#26013)
    ## Summary
    - add `Feature::TerminalVisualizationInstructions` as
    `UnderDevelopment`, disabled by default
    - keep terminal visualization instructions inside the TUI package
    - append them to existing developer instructions for TUI start, resume,
    and fork flows only when enabled
    - intentionally do not apply them to `codex exec`
    
    ## Rollout
    Control behavior is unchanged. TUI dogfooders can enable
    `terminal_visualization_instructions`; no default user receives the new
    terminal-specific instructions.
    
    The shared visualization-selection rule is supplied separately through
    the `codex_proxy_model_3` Statsig layer for every target Codex model
    slug in the gated cohort. This TUI feature determines how to render an
    appropriate visualization on the terminal surface; the model-layer
    treatment determines when to use one.
    
    ## Validation
    - `cargo test -p codex-tui
    terminal_visualization_instructions_are_gated_for_all_tui_thread_flows
    --lib`
    - `cargo test -p codex-features --lib`
    - `cargo fmt --all -- --check`
    - `git diff --check`
    - GPT-5.4 and GPT-5.5 real prompt-pipeline smoke tests: both visualized
    the positive mapping case, abstained on the negative route case, and
    passed exact prompt-stack verification on CLI and App
    - refreshed onto current `main` with a clean merge and reran the focused
    validation
    
    The full 53-probe all-model treatment comparison and requested
    production coding evals remain rollout gates before broadening beyond
    the initial employee cohort.
    
    This PR remains open for normal human review.
  • Speed up TUI startup by reusing plugin discovery (#26469)
    ## Summary
    
    TUI startup loads related plugin data from `hooks/list`, session MCP
    initialization, and plugin skill warmup. These paths repeated filesystem
    discovery and emitted the same plugin warnings, while `hooks/list` and
    account/model bootstrap ran serially.
    
    This change:
    
    - Reuses one immutable plugin load outcome across startup consumers.
    - Keys the cache only on plugin-relevant configuration.
    - Single-flights concurrent plugin loads and prevents invalidated loads
    from repopulating the cache.
    - Runs hook discovery and account/model bootstrap concurrently.
    - Preserves configuration-migration ordering, hook review behavior, and
    accurate startup telemetry.
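
    One way to picture the single-flight cache, using only the standard
    library. This is a sketch under stated assumptions: `PluginCache`,
    `PluginLoadOutcome`, and the string cache key are hypothetical, and the
    actual implementation is async and may differ substantially.

    ```rust
    use std::collections::HashMap;
    use std::sync::{Arc, Mutex, OnceLock};

    /// Hypothetical immutable result of one plugin discovery pass.
    pub struct PluginLoadOutcome {
        pub warnings: Vec<String>,
    }

    type Slot = Arc<OnceLock<Arc<PluginLoadOutcome>>>;

    #[derive(Default)]
    pub struct PluginCache {
        slots: Mutex<HashMap<String, Slot>>,
    }

    impl PluginCache {
        /// Concurrent callers with the same key share one slot, so `load`
        /// runs at most once per slot while the other callers wait.
        pub fn get_or_load(
            &self,
            key: &str,
            load: impl FnOnce() -> PluginLoadOutcome,
        ) -> Arc<PluginLoadOutcome> {
            let slot = {
                let mut slots = self.slots.lock().unwrap();
                Arc::clone(slots.entry(key.to_string()).or_default())
            };
            // The map lock is released before the potentially slow load.
            Arc::clone(slot.get_or_init(|| Arc::new(load())))
        }

        /// Drops every slot. A load still in flight fills its detached
        /// slot, so it cannot repopulate the cache.
        pub fn invalidate(&self) {
            self.slots.lock().unwrap().clear();
        }
    }
    ```

    Keying the map on a string derived only from plugin-relevant
    configuration would keep unrelated config changes from forcing a reload.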
    
    In 10 alternating release-build launches in the Ruff repository with the
    existing `~/.codex` configuration, median time to the first editable
    composer decreased from 833ms to 504ms. The branch was faster in 9 of 10
    pairs, with a paired median improvement of 312ms.
  • Use state DB first for resume --last (#26462)
    ## Summary
    
    `codex resume --last` currently lists sessions by updated time using
    scan-and-repair. Updated-time filesystem listing must stat every rollout
    before applying the cwd, provider, and source filters, so startup scales
    with the entire local session history.
    
    This change queries the state DB first for the latest matching session.
    For local workspaces, we only accept the indexed result when its rollout
    path still exists; otherwise we retry with scan-and-repair. The same
    lookup path is shared by `fork --last`.
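
    The lookup order can be sketched roughly as follows. `latest_rollout`
    and its closure parameters are hypothetical stand-ins for the state DB
    query, the existence check, and the filesystem scan; falling back when
    the DB returns no row at all is an assumption of this sketch.

    ```rust
    use std::path::{Path, PathBuf};

    /// Prefers the indexed result, but only while its rollout file still
    /// exists; otherwise retries with the slower scan-and-repair path.
    pub fn latest_rollout(
        indexed: Option<PathBuf>,
        rollout_exists: impl Fn(&Path) -> bool,
        scan_and_repair: impl FnOnce() -> Option<PathBuf>,
    ) -> Option<PathBuf> {
        match indexed {
            Some(path) if rollout_exists(&path) => Some(path),
            _ => scan_and_repair(),
        }
    }
    ```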
    
    I benchmarked the same `thread/list` request used by `resume --last` in
    my local `ruff` checkout against a Codex home with 2,599 active rollouts
    totaling 3.7 GiB, including 90 Ruff threads.
    
    Across five fresh release app-server processes with warm filesystem
    caches, the state-DB-only lookup had median latency of 0.37-0.44 ms,
    while scan-and-repair had median latency of 139-162 ms. First-request
    latency was 0.7-1.7 ms versus 142-185 ms.
    
    So this **removes roughly 140-160 ms from the `resume --last` lookup**
    on this machine, and makes that lookup over 300x faster.
    
    The tradeoff is that this does leave two correctness gaps:
    
    - If a newer matching rollout is missing from SQLite but an older
    matching row exists, the fast path resumes the older thread and never
    falls back to the filesystem scan.
    - If an existing row has stale filter or ordering metadata, the fast
    path can select a different thread from scan-and-repair. The rollout
    tests already demonstrate this for stale cwd metadata: state-DB-only
    returns the stale match, while scan-and-repair removes and repairs it.
    
    So you could end up seeing the "wrong" result in cases such as:
    
    1. A crash or SQLite error occurs between Codex writing the conversation
    file and updating SQLite, leaving the newer file unindexed.
    
    2. An older Codex version, restore, or manual copy adds a conversation
    file after SQLite’s one-time backfill completed.
    
    These cases seem rare, and sessions can always be recovered via other
    mechanisms (`--last` is just a convenience feature), so I think the
    tradeoffs are good here.
  • Make runtime workspace roots absolute in app-server API (#26552)
    Stacked on #26532.
    
    ## Why
    
    #26532 moves cwd normalization to the app-server/core boundary.
    `runtimeWorkspaceRoots` still accepted raw paths in v2 requests and in
    `ConfigOverrides`, which left core responsible for interpreting those
    roots later. This makes runtime workspace roots follow the same
    absolute-path boundary as cwd.
    
    ## What
    
    - Change v2 `runtimeWorkspaceRoots` request fields for `thread/start`,
    `thread/resume`, `thread/fork`, and `turn/start` to `AbsolutePathBuf`.
    - Deduplicate already-absolute runtime roots in app-server handlers and
    pass them through `ConfigOverrides.workspace_roots` as
    `AbsolutePathBuf`.
    - Update TUI and exec client request builders to pass absolute runtime
    roots directly.
    - Update app-server docs, schema fixtures, and focused tests for
    absolute runtime roots.
    
    ## Testing
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server runtime_workspace_roots`
    - `just test -p codex-core
    session_permission_profile_rebinds_runtime_workspace_roots`
    - `just test -p codex-tui app_server_session`
    - `just test -p codex-exec`
  • fix(tui): restore cancelled prompt cursor at end (#26457)
    ## Why
    
    Pressing `Esc` on a turn that produced no visible output restores the
    submitted prompt so the user can keep editing it. That restore path
    preserved the prompt content, images, and mention bindings, but left the
    composer cursor at the start of the restored text. The next edit
    therefore inserted at the beginning instead of continuing from the end
    of the prompt.
    
    ## What Changed
    
    - Move the cursor to the end after
    `BottomPane::set_composer_text_with_mention_bindings` rehydrates a
    restored draft.
    - Add test-only cursor accessors so restore tests can assert the
    composer state directly.
    - Extend the queued restore regression to assert the restored composer
    cursor is positioned at `text.len()`.
    
    ## How to Test
    
    Manual reviewer flow:
    
    1. Start Codex in the TUI.
    2. Submit a prompt that will take long enough to interrupt.
    3. Press `Esc` before any visible assistant output appears.
    4. Confirm the prompt is restored into the composer and the cursor is at
    the end, so typing appends to the prompt.
    5. Repeat with a prompt that includes an attached image or resolved
    mention and confirm the restored content remains intact.
    
    Targeted tests:
    
    - `just test -p codex-tui
    chatwidget::tests::composer_submission::queued_restore_with_remote_images_keeps_local_placeholder_mapping`
    
    Lint note:
    
    - `just argument-comment-lint` is blocked locally by the existing Bazel
    `compiler-rt` empty glob failure before analyzing touched code. The
    touched Rust diff was manually inspected and adds no new opaque
    positional literal callsites.
  • fix(tui): Windows composer background (#26181)
    ## Why
    
    On Windows, the TUI could not shade the composer against the terminal
    background because `terminal_palette::default_colors()` always fell back
    to `None`. That preserved safety, but it also meant terminals that do
    support OSC 10/11 default color replies had no path to report their real
    background color.
    
    This keeps the existing fallback behavior for unsupported terminals
    while allowing capable Windows terminals to report their default
    foreground/background colors during startup.
    
    | Before | After |
    |---|---|
    | <img width="1235" height="658" alt="win-before"
    src="https://github.com/user-attachments/assets/ff756589-fcb3-43de-8f2a-ebc0369b30dd"
    /> | <img width="1235" height="658" alt="win-after"
    src="https://github.com/user-attachments/assets/9563ff20-4be5-4608-9414-a2afb647e745"
    /> |
    
    ## What Changed
    
    - Moved the OSC 10/11 default color parser in
    `tui/src/terminal_probe.rs` out of the Unix-only implementation so it
    can be reused by Windows.
    - Added a Windows-only bounded OSC 10/11 probe using raw console handles
    and the existing `windows-sys` dependency.
    - Added Windows palette caching in `tui/src/terminal_palette.rs` so
    startup probe results, including `None`, are reused instead of probing
    again later.
    - Wired the Windows color probe into TUI startup after the existing
    non-Unix crossterm cursor and keyboard checks.
    - Added parser coverage for malformed, partial, and noisy OSC color
    replies.
    
    If the probe fails, times out, receives only one color, or receives
    malformed data, the cache stores `None` and the composer keeps the
    current behavior.
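
    For illustration, a small parser for the xterm-style
    `rgb:RRRR/GGGG/BBBB` reply format. This is a sketch, not the parser in
    `tui/src/terminal_probe.rs`; the function names and the 8-bit scaling
    are assumptions.

    ```rust
    /// Parses one OSC reply for `slot` (10 = foreground, 11 = background).
    /// Returns `None` for partial or malformed replies.
    pub fn parse_osc_color(reply: &str, slot: u8) -> Option<(u8, u8, u8)> {
        let prefix = format!("\x1b]{slot};rgb:");
        // Leading noise before the reply is skipped.
        let start = reply.find(&prefix)? + prefix.len();
        let rest = &reply[start..];
        // The payload ends at BEL or at the ESC that begins the ST terminator.
        let end = rest.find(|c: char| c == '\x07' || c == '\x1b')?;
        let mut parts = rest[..end].split('/');
        let r = scale_component(parts.next()?)?;
        let g = scale_component(parts.next()?)?;
        let b = scale_component(parts.next()?)?;
        if parts.next().is_some() {
            return None;
        }
        Some((r, g, b))
    }

    /// Scales a 1-4 digit hex component to the 0-255 range.
    fn scale_component(hex: &str) -> Option<u8> {
        if hex.is_empty() || hex.len() > 4 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let value = u32::from_str_radix(hex, 16).ok()?;
        let max = (1u32 << (4 * hex.len() as u32)) - 1;
        Some((value * 255 / max) as u8)
    }
    ```

    A caller would store `None` in the cache unless both the slot 10 and
    slot 11 replies parse, which matches the fallback described above.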
    
    ## How to Test
    
    1. On Windows, start Codex in a terminal that supports OSC 10/11 default
    color replies.
    2. Open the TUI composer.
    3. Confirm the composer/status area is painted using the terminal's
    reported default background, instead of leaving the background unshaded.
    4. Start Codex in a terminal that does not answer OSC 10/11, or
    otherwise blocks terminal color replies.
    5. Confirm startup still succeeds and the composer uses the existing
    fallback behavior.
    
    Targeted tests:
    
    - `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target
    just test -p codex-tui terminal_probe`
    
    Additional local verification:
    
    - `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target
    just test -p codex-tui` was run; 2774 tests passed, and two unrelated
    Guardian feature-flag tests failed reproducibly when isolated.
    - `just argument-comment-lint` was attempted but blocked by the local
    Bazel/LLVM `include/sanitizer/*.h` empty glob issue. Touched Rust
    literal callsites were inspected manually.
    - `cargo check -p codex-tui --target x86_64-pc-windows-msvc` was
    attempted after installing the target, but local macOS cross-checking is
    blocked by missing Windows C SDK headers in native dependencies
    (`ring`/`aws-lc-sys`).
    
    ---------
    
    Co-authored-by: Kevin Bond <kbond@openai.com>
  • fix(tui): avoid doubled blank rows while streaming (#26636)
    ## Summary
    
    During assistant-message streaming, blank markdown lines in the
    transient active tail were prefixed with two spaces. Ratatui measured
    those whitespace-only lines as two viewport rows, so list- and
    table-heavy answers showed doubled vertical gaps while streaming and
    then visibly compacted when finalized into scrollback.
    
    - keep whitespace-only `StreamingAgentTailCell` lines structurally empty
    while preserving nonblank message prefixes
    - clear impossible hyperlink metadata when normalizing a blank tail line
    - add an inline snapshot and height regression proving one blank
    markdown line occupies one viewport row
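
    The normalization can be pictured with a tiny helper. `prefix_tail_line`
    is a hypothetical name and operates on plain strings, whereas the real
    cell works with styled Ratatui lines:

    ```rust
    /// Prefixes nonblank lines with two spaces but keeps whitespace-only
    /// lines structurally empty, so they measure as a single row.
    pub fn prefix_tail_line(line: &str) -> String {
        if line.trim().is_empty() {
            String::new()
        } else {
            format!("  {line}")
        }
    }
    ```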
    
    Related to #26618, but fixes a separate live-tail row-height issue
    rather than stale committed markdown content.
    
    ## How to Test
    
    Recommended before/after reproduction:
    
    1. Start the latest Codex build without this change.
    2. Submit this exact prompt:
    
    > Send 20 different lists: bullets vs numbered, simple vs complex with
    paragraphs in between items, etc. Intertwine them with some tables and
    some paragraphs.
    
    3. While the answer streams, observe duplicated vertical gaps around
    list items and paragraphs. When the answer finishes, observe the spacing
    compact.
    4. Start this branch with `just c` and submit the same prompt.
    5. Confirm each intended blank markdown line occupies one terminal row
    throughout streaming and that the spacing does not compact or jump when
    the answer finishes.
    6. As a focused regression, verify the sections after the first table,
    especially loose lists with paragraphs between items; those blank rows
    should remain stable throughout streaming.
    
    Targeted tests:
    
    - `just test -p codex-tui
    streaming_agent_tail_blank_line_uses_one_viewport_row`
    - `just test -p codex-tui history_cell::tests`
    
    ## Test Notes
    
    - Verified the exact prompt above in a real tmux TUI using latest Codex
    and this branch as the before/after comparison.
    - The full `just test -p codex-tui` run completed 2,782 of 2,784 tests
    successfully. Two unrelated guardian feature-flag tests fail
    reproducibly in isolation because the expected `OverrideTurnContext`
    message is absent.
    - `just argument-comment-lint` is blocked locally by the existing Bazel
    `compiler-rt` missing-header glob error; the touched Rust diff was
    inspected manually for opaque positional literals.
  • Render code comment directives in TUI replay (#26554)
    ## Summary
    
    Resumed Codex App or VS Code review sessions can contain
    `::code-comment` directives that the TUI previously displayed verbatim
    because only rich clients interpret them.
    
    This change rewrites valid line-start directives into readable Markdown
    during assistant-message parsing, using the session working directory
    for relative file paths. The fallback is applied consistently to live
    messages, replayed transcripts, and resume previews while preserving
    malformed directives and existing `::git-*` parsing.
    
    ## Before
    
    The TUI exposed the raw client directive:
    
    ```text
    ::code-comment{title="Fix body= parsing" body="Keep role=\"tab\", ::git-stage{cwd=/tmp}, file=, and \n literal." file="/repo/src/app.ts" start=10 end=12 priority="P2"}
    ```
    
    ## After
    
    The same directive is rendered as readable review feedback:
    
    ```text
    - [P2] Fix body= parsing — src/app.ts:10-12
      Keep role="tab", ::git-stage{cwd=/tmp}, file=, and \n literal.
    ```
    
    Fixes #25658
  • Fix /goal usage text for control commands (#26551)
    ## Why
    
    The TUI's `/goal` usage text only advertised the objective form even
    though `/goal clear`, `/goal edit`, `/goal pause`, and `/goal resume`
    are implemented. This made the lifecycle controls difficult to discover
    and allowed the duplicated help text to drift from actual behavior.
    
    Fixes #25530.
    
    ## What changed
    
    - Show the complete `/goal [<objective>|clear|edit|pause|resume]` syntax
    in usage messages.
    - Share one usage string across slash-command dispatch and goal-related
    app messages.
    - Add inline snapshot coverage for the control-command usage path.
  • Surface TUI config write error causes (#26537)
    ## Summary
    
    TUI config writes currently wrap app-server failures with local context
    like `config/batchWrite failed in TUI`, but several user-visible paths
    only render the outer error. That hides the actionable app-server
    message, such as validation constraints or read-only `CODEX_HOME`
    failures, leaving users with a dead-end diagnostic.
    
    This change adds a small formatter next to the TUI config write helpers
    that renders the error source chain, then uses it for model persistence,
    feature persistence, project trust, status line writes, hook trust, and
    hook enablement.
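
    A formatter of this kind can be sketched with
    `std::error::Error::source`. The names `format_error_chain` and
    `Context` are illustrative, and the `": "` separator is an assumption
    about the rendered output:

    ```rust
    use std::error::Error;
    use std::fmt;

    /// Joins an error and all of its sources into one line.
    pub fn format_error_chain(err: &(dyn Error + 'static)) -> String {
        let mut out = err.to_string();
        let mut source = err.source();
        while let Some(cause) = source {
            out.push_str(": ");
            out.push_str(&cause.to_string());
            source = cause.source();
        }
        out
    }

    /// Hypothetical wrapper that adds local context to an inner error.
    #[derive(Debug)]
    pub struct Context {
        pub message: String,
        pub source: Box<dyn Error + 'static>,
    }

    impl fmt::Display for Context {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // Only the outer message; the cause is reached via `source()`.
            f.write_str(&self.message)
        }
    }

    impl Error for Context {
        fn source(&self) -> Option<&(dyn Error + 'static)> {
            Some(self.source.as_ref())
        }
    }
    ```

    Rendering only `Display` for the outer error would show just the local
    context, which is the dead-end diagnostic described above.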
    
    Fixes #26077
  • [codex] Forward turn moderation metadata through app-server (#25710)
    ## Why
    First-party backends can supply turn-scoped moderation metadata that
    app-server clients need for client-side presentation. Exposing this as
    an experimental typed notification lets opted-in clients consume it
    without interpreting raw Responses API events.
    
    ## What changed
    - forward `response.metadata.openai_chatgpt_moderation_metadata` from
    Responses API SSE and WebSocket streams as turn-scoped moderation
    metadata
    - emit the experimental app-server v2 `turn/moderationMetadata`
    notification with `{ threadId, turnId, metadata }`
    - add app-server integration coverage for the typed moderation metadata
    notification
    
    ## Testing
    - `just test -p codex-core
    build_ws_client_metadata_includes_window_lineage_and_turn_metadata`
    - `just test -p codex-core` (fails locally: 46 failures and 1 timeout,
    primarily missing `test_stdio_server` and shell snapshot timeouts)
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    turn_moderation_metadata_emits_typed_notification_v2`
    - `just test -p codex-app-server` (fails locally: 792 passed, 10 failed,
    and 5 timed out; failures are in existing environment-sensitive tests,
    primarily because nested macOS `sandbox-exec` is not permitted)
    - `just write-app-server-schema --experimental --schema-root
    /tmp/codex-app-server-schema-experimental`
  • [codex] Expose unavailable app templates in plugin detail (#26317)
    ## Summary
    - Adds `unavailable_app_templates` to the app-server protocol and
    generated schemas/types.
    - Parses plugin-service `release.unavailable_app_templates` in the
    remote plugin client.
    - Maps remote unavailable templates into app-server `PluginDetail`.
    - Defaults local plugins to an empty unavailable app template list.
    
    ## Validation
    - `just write-app-server-schema`
    - `cargo +1.95.0 fmt --manifest-path codex-rs/Cargo.toml --all --check`
    - `cargo +1.95.0 test --manifest-path codex-rs/Cargo.toml -p
    codex-app-server-protocol schema_fixtures`
    - `cargo +1.95.0 check --manifest-path codex-rs/Cargo.toml -p
    codex-app-server-protocol -p codex-core-plugins -p codex-app-server`
    - `git diff --check`
    
    Note: default `cargo check` uses rustc 1.89 locally and failed because
    dependencies require newer Rust, so validation was rerun with installed
    Rust 1.95.
  • [codex] Use model-advertised reasoning effort order (#26446)
    ## Summary
    - preserve the model catalog order for app-server
    `supportedReasoningEfforts` and document that client contract
    - render TUI reasoning choices in the advertised order
    - step reasoning shortcuts by adjacent list position instead of deriving
    order from known effort names
    - anchor unsupported configured values to the advertised default, or the
    first option when needed
    - remove canonical effort ordering helpers and the unused upgrade effort
    mapping
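
    A sketch of position-based stepping over the advertised list. The
    function names are hypothetical, and two behaviors are assumptions:
    stepping clamps at the ends of the list, and an unsupported value steps
    from its anchor position.

    ```rust
    /// Finds the list position to step from: the configured value if it is
    /// advertised, else the advertised default, else the first option.
    pub fn anchor_index(advertised: &[&str], configured: &str, default: &str) -> Option<usize> {
        if advertised.is_empty() {
            return None;
        }
        let find = |name: &str| advertised.iter().position(|effort| *effort == name);
        Some(find(configured).or_else(|| find(default)).unwrap_or(0))
    }

    /// Moves one position forward or backward in the advertised order.
    pub fn step_effort<'a>(
        advertised: &[&'a str],
        configured: &str,
        default: &str,
        forward: bool,
    ) -> Option<&'a str> {
        let index = anchor_index(advertised, configured, default)?;
        let next = if forward {
            (index + 1).min(advertised.len() - 1)
        } else {
            index.saturating_sub(1)
        };
        Some(advertised[next])
    }
    ```

    Nothing here inspects the effort names themselves, so a model can
    advertise any order or any custom value.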
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
    
    Stacked on #26444.
  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
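
    To illustrate why the type stops being `Copy`, here is a reduced
    sketch. The variant set and the `parse`/`as_str` helpers are
    assumptions, not the actual `ReasoningEffort` definition:

    ```rust
    /// A string-backed variant owns heap data, so the enum can be `Clone`
    /// but not `Copy`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum ReasoningEffort {
        Low,
        Medium,
        High,
        Custom(String),
    }

    impl ReasoningEffort {
        /// Accepts built-in names and any other non-empty value.
        pub fn parse(value: &str) -> Option<Self> {
            match value {
                "" => None,
                "low" => Some(Self::Low),
                "medium" => Some(Self::Medium),
                "high" => Some(Self::High),
                other => Some(Self::Custom(other.to_string())),
            }
        }

        /// String wire encoding, identical for built-in and custom values.
        pub fn as_str(&self) -> &str {
            match self {
                Self::Low => "low",
                Self::Medium => "medium",
                Self::High => "high",
                Self::Custom(value) => value.as_str(),
            }
        }
    }
    ```

    Call sites that previously copied the value out of a shared reference
    need an explicit `.clone()` with a type shaped like this.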
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • Fix multiline paste in /goal edit (#26047)
    Fixes #26025.
    
    ## Why
    `/goal edit` opens `CustomPromptView`, which did not use the paste-burst
    handling that protects the main composer when terminals deliver paste as
    rapid key events. On Windows terminals, the first pasted newline could
    be treated as Enter-to-submit, truncating the goal edit and leaving the
    rest of the paste behind.
    
    ## What
    This reuses `PasteBurst` in `CustomPromptView` as a lightweight
    Enter-suppression detector for paste-like key streams. Characters still
    insert directly, explicit paste still goes through the view paste path,
    and ordinary text entry still submits on Enter.
  • Switch runtime to cloud config bundle (#24622)
    ## Summary
    
    - Adapts the moved `codex-cloud-config` crate from the legacy cloud
    requirements endpoint to the new config bundle endpoint.
    - Switches runtime consumers from `CloudRequirementsLoader` to
    `CloudConfigBundleLoader` so one shared bundle supplies cloud-delivered
    config and requirements.
    - Removes the legacy cloud requirements domain loader path.
    
    ## Details
    
    This intentionally keeps `codex-cloud-config` monolithic for review
    lineage: the previous PR establishes the crate move, and this PR shows
    the behavior change against that moved implementation. A follow-up PR
    splits the module back into focused files.
    
    The new bundle path preserves the important cloud requirements loader
    semantics where intended: account-scoped signed cache, 30 minute TTL, 5
    minute refresh cadence, retry/backoff, auth recovery, and fail-closed
    startup loading. The cached payload changes from a single requirements
    TOML string to the backend-delivered bundle, and validation rejects
    malformed config or requirements fragments before cache write/use.
  • Propagate permission approval environment id (#25862)
    ## Stack
    
    1. #25850 - Key request-permission grants by environment: stores and
    applies sticky permission grants per environment id.
    2. #25858 - Add `environmentId` to `request_permissions`: lets the model
    target a selected environment and resolves relative permission paths
    against it.
    3. This PR (#25862) - Propagate permission approval environment id:
    carries the selected environment id through approval events, app-server
    requests, TUI prompts, and delegate forwarding.
    4. #25867 - Add remote request permissions integration coverage:
    verifies the selected remote environment across request, approval, grant
    reuse, and exec.
    
    This PR is stacked on #25858, and #25867 is stacked on this PR.
    
    ## Why
    
    #25858 lets the model bind a `request_permissions` call to a selected
    environment, but the approval event and client-facing request still
    needed to carry that binding. For CCA, the user-facing prompt and
    delegated approval path should know which environment the grant applies
    to instead of relying on cwd alone.
    
    ## What Changed
    
    - Added optional `environmentId` to `RequestPermissionsEvent`.
    - Emit the selected environment id from core permission approval events.
    - Preserve the environment id through delegate forwarding, including
    cwd-based delegated requests.
    - Added `environmentId` to app-server permission approval params,
    generated schema/TypeScript artifacts, and README examples.
    - Preserve and display the environment id in TUI permission approval
    prompts.
    - Updated focused core, app-server protocol, and TUI conversion
    coverage.
    
    ## Testing
    
    Not run locally per instruction. Performed read-only `git diff --check`.
  • Reduce stack pressure in session startup and config rebuilds (#25844)
    ## Why
    
    `/clear` starts a fresh thread with `InitialHistory::Cleared`, which
    re-enters the thread/session startup path. That path now builds large
    async futures through `ThreadManagerState::spawn_thread_with_source`,
    `Codex::spawn`, and `Session::new`. Separately, TUI config rebuilds for
    cwd and permission-profile changes build a similarly heavy
    `ConfigBuilder::build()` future inside the app task. In debug and Bazel
    runs, those call chains can put enough state on the caller stack to
    abort before startup or config refresh completes.
    
    This change keeps the behavior the same while moving the heaviest future
    frames off the caller stack.
    
    ## What changed
    
    - Box `Codex::spawn(...)` in `codex-rs/core/src/thread_manager.rs`
    before awaiting it from `spawn_thread_with_source`.
    - Box `Session::new(...)` in `codex-rs/core/src/session/mod.rs` before
    awaiting it from `Codex::spawn_internal`.
    - Route `ConfigBuilder::build()` through a small `tokio::spawn` helper
    in `codex-rs/tui/src/app/config_persistence.rs` so cwd and
    permission-profile config rebuilds run on a runtime worker stack while
    preserving error context.
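
    The boxing items above can be illustrated with a standalone sketch. The
    names `heavy_startup`, `inline_caller`, and `boxed_caller` are hypothetical
    stand-ins, not the actual `Codex::spawn` or `Session::new` code, and the
    16 KiB buffer is an arbitrary assumption. The sketch only shows that
    awaiting a `Box::pin`ned future keeps the inner state out of the caller's
    own future:

    ```rust
    use std::mem::size_of_val;

    // Trivial await point so that locals must be stored in the future.
    async fn tick() {}

    // Hypothetical stand-in for a heavy startup path: the scratch buffer is
    // live across an await, so it becomes part of this future's state.
    async fn heavy_startup(seed: u8) -> usize {
        let mut scratch = [0u8; 16 * 1024];
        scratch[0] = seed;
        tick().await;
        scratch.iter().map(|b| usize::from(*b)).sum()
    }

    // Awaiting inline embeds the whole inner future in the caller's future.
    async fn inline_caller(seed: u8) -> usize {
        heavy_startup(seed).await
    }

    // Boxing first means the caller's future holds only a heap pointer.
    async fn boxed_caller(seed: u8) -> usize {
        Box::pin(heavy_startup(seed)).await
    }

    // Returns (inline size, boxed size) in bytes without polling anything.
    fn future_sizes() -> (usize, usize) {
        let inline = inline_caller(1);
        let boxed = boxed_caller(1);
        (size_of_val(&inline), size_of_val(&boxed))
    }
    ```

    Under these assumptions, the first size returned by `future_sizes()` is at
    least the buffer size, while the second stays close to pointer size. That
    is the effect the boxing changes rely on.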
    
    ## Verification
    
    CI is running on the PR.
    
    No new targeted tests were added. This is a mechanical stack-pressure
    reduction that keeps the existing behavior and error propagation intact.
  • feat: show enterprise monthly credit limits in status (#24812)
    ## Summary
    
    Enterprise users can have an effective monthly credit limit, but Codex
    `/status` currently drops that metadata from the account-usage response.
    
    This change adds the optional `spend_control.individual_limit`
    projection to the existing rate-limit snapshot flow. The backend client
    reads the monthly limit, app-server exposes it as `individualLimit`, and
    the TUI renders a `Monthly credit limit` row through the existing
    progress-bar renderer.
    
    When the backend does not return an effective monthly limit, existing
    rate-limit behavior is unchanged.
    
    ## Existing backend state
    
    The account-usage backend already returns the effective monthly limit
    and current usage together:
    
    ```json
    {
      "spend_control": {
        "reached": false,
        "individual_limit": {
          "limit": "25000",
          "used": "8000",
          "remaining": "17000",
          "used_percent": 32,
          "remaining_percent": 68,
          "reset_after_seconds": 86400,
          "reset_at": 1778137680
        }
      }
    }
    ```
    
    Before this change, Codex projected rolling `primary` and `secondary`
    windows plus `credits`. It ignored `spend_control.individual_limit`, so
    app-server clients and `/status` could not render the monthly cap.
    
    The updated flow is:
    
    ```text
    account usage backend
      -> backend-client reads spend_control.individual_limit
      -> existing rate-limit snapshot carries optional individual_limit
      -> app-server exposes optional individualLimit
      -> TUI renders Monthly credit limit
    ```
    
    ## App-server contract
    
    `account/rateLimits/read` and sparse `account/rateLimits/updated`
    notifications now include an additive nullable
    `rateLimits.individualLimit` field:
    
    ```json
    {
      "individualLimit": {
        "limit": "25000",
        "used": "8000",
        "remainingPercent": 68,
        "resetsAt": 1778137680
      }
    }
    ```
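
    The narrowing from the backend payload to the app-server field can be
    sketched as a plain struct conversion. The type names
    `BackendIndividualLimit` and `IndividualLimit`, the `project` helper, and
    the integer field types are assumptions for illustration; only the field
    names come from the JSON examples above.

    ```rust
    // Hypothetical mirror of `spend_control.individual_limit` from the backend.
    #[allow(dead_code)]
    #[derive(Debug, Clone, PartialEq)]
    struct BackendIndividualLimit {
        limit: String,
        used: String,
        remaining: String,
        used_percent: u8,
        remaining_percent: u8,
        reset_after_seconds: u64,
        reset_at: i64,
    }

    // Hypothetical mirror of the narrower `individualLimit` app-server field.
    #[derive(Debug, Clone, PartialEq)]
    struct IndividualLimit {
        limit: String,
        used: String,
        remaining_percent: u8,
        resets_at: i64,
    }

    impl From<BackendIndividualLimit> for IndividualLimit {
        fn from(backend: BackendIndividualLimit) -> Self {
            // Keep only what `/status` needs; drop the derived fields.
            IndividualLimit {
                limit: backend.limit,
                used: backend.used,
                remaining_percent: backend.remaining_percent,
                resets_at: backend.reset_at,
            }
        }
    }

    // A missing backend limit stays `None`, which maps to `null` on the wire.
    fn project(backend: Option<BackendIndividualLimit>) -> Option<IndividualLimit> {
        backend.map(IndividualLimit::from)
    }
    ```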
    
    In an `account/rateLimits/read` response, `null` means no monthly limit
    is available. `account/rateLimits/updated` remains a sparse rolling
    notification: clients merge available values into their most recent
    `account/rateLimits/read` snapshot or refetch. Nullable account metadata
    in a rolling notification does not clear a previously observed value.
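
    A client-side cache following these rules might look like the sketch
    below. `CachedRateLimits`, `MonthlyLimit`, `apply_read`, and `apply_update`
    are hypothetical names, and `primary_used_percent` stands in for whatever
    rolling-window data the real snapshot carries.

    ```rust
    #[derive(Debug, Clone, PartialEq)]
    struct MonthlyLimit {
        limit: String,
        used: String,
        remaining_percent: u8,
        resets_at: i64,
    }

    // Hypothetical client-side cache of the most recent rate-limit snapshot.
    #[derive(Debug, Clone, PartialEq, Default)]
    struct CachedRateLimits {
        primary_used_percent: Option<u8>,
        individual_limit: Option<MonthlyLimit>,
    }

    // A full read is authoritative: `None` here clears a removed limit.
    fn apply_read(cache: &mut CachedRateLimits, fresh: CachedRateLimits) {
        *cache = fresh;
    }

    // A sparse update only overwrites the values it actually carries.
    fn apply_update(cache: &mut CachedRateLimits, update: CachedRateLimits) {
        if update.primary_used_percent.is_some() {
            cache.primary_used_percent = update.primary_used_percent;
        }
        if update.individual_limit.is_some() {
            cache.individual_limit = update.individual_limit;
        }
    }
    ```

    The asymmetry is the point: `apply_update` treats a missing value as "not
    mentioned", while `apply_read` treats it as "no longer present".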
    
    ## Design decisions
    
    - Extend the existing rate-limit snapshot instead of introducing a
    separate request or wire-level update protocol.
    - Keep the Codex projection narrow: `/status` needs the effective limit,
    current usage, remaining percentage, and reset timestamp.
    - Render the monthly row through the existing progress-bar renderer,
    with one optional detail line such as `8,000 of 25,000 credits used`.
    - Keep the backend response optional so existing accounts and older
    usage states preserve their current behavior.
    - Preserve cached monthly metadata when sparse rolling notifications
    omit it. Live account-usage reads remain authoritative and can clear a
    removed limit.
    
    ## Visual evidence
    
    ```text
     Monthly credit limit:   [██████████████░░░░░░] 68% left (resets 07:08 on 7 May)
                             8,000 of 25,000 credits used
    ```
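
    As a rough illustration of how such a row could be produced, the sketch
    below builds a 20-cell bar from the remaining percentage and formats the
    detail line with thousands separators. `render_bar`, `format_thousands`,
    and `detail_line` are hypothetical helpers, and rounding to the nearest
    cell is an assumption; the existing progress-bar renderer may differ.

    ```rust
    // Render a fixed-width bar where filled cells represent what is left.
    fn render_bar(remaining_percent: u8, width: usize) -> String {
        let pct = usize::from(remaining_percent.min(100));
        // Round to the nearest whole cell (assumed rounding rule).
        let filled = (pct * width + 50) / 100;
        format!("[{}{}]", "█".repeat(filled), "░".repeat(width - filled))
    }

    // Insert a comma before every group of three trailing digits.
    fn format_thousands(n: u64) -> String {
        let digits = n.to_string();
        let mut out = String::with_capacity(digits.len() + digits.len() / 3);
        for (i, ch) in digits.chars().enumerate() {
            if i > 0 && (digits.len() - i) % 3 == 0 {
                out.push(',');
            }
            out.push(ch);
        }
        out
    }

    // The wire format carries decimal strings; skip the line if parsing fails.
    fn detail_line(used: &str, limit: &str) -> Option<String> {
        let used: u64 = used.parse().ok()?;
        let limit: u64 = limit.parse().ok()?;
        Some(format!(
            "{} of {} credits used",
            format_thousands(used),
            format_thousands(limit)
        ))
    }
    ```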
    
    Snapshot:
    `codex-rs/tui/src/status/snapshots/codex_tui__status__tests__status_snapshot_includes_enterprise_monthly_credit_limit.snap`
    
    ## Testing
    
    Tests: generated app-server schema verification, protocol tests,
    backend-client tests, app-server integration coverage, TUI snapshot
    coverage, formatting, and workspace lint cleanup.
  • Move cloud requirements crate to cloud config (#24621)
    ## Summary
    
    - Moves the existing `codex-cloud-requirements` crate to
    `codex-cloud-config`.
    - Updates workspace dependencies and imports to the new crate name.
    - Intentionally keeps runtime behavior unchanged: this still fetches the
    legacy cloud requirements endpoint.
    
    ## Details
    
    This PR exists to make the lineage obvious before the bundle migration.
    GitHub should show the old `codex-rs/cloud-requirements/src/lib.rs`
    implementation as moved to `codex-rs/cloud-config/src/lib.rs`, rather
    than as unrelated new code.
    
    The follow-up PR adapts this moved crate to the new config bundle API
    and switches runtime consumers over.
  • app-server: remove experimental persist_extended_history bool flag (#25712)
    ## Summary
    
    Remove the dead experimental `persistExtendedHistory` app-server flag
    and collapse rollout persistence to the single policy that app-server
    already used.
    
    ## What Changed
    
    - Removed `persistExtendedHistory` from v2 thread start/resume/fork
    params and deleted its deprecation notice path.
    - Removed the persistence-mode enums and plumbing through core, rollout,
    and thread-store.
    - Made rollout filtering mode-free, keeping the existing limited
    persisted-history behavior.
    
    ## Test Plan
    
    - `just write-app-server-schema`
    - `cargo nextest run --no-fail-fast -p codex-app-server-protocol
    schema_fixtures`
    - `cargo nextest run --no-fail-fast -p codex-app-server
    thread_shell_command_history_responses_exclude_persisted_command_executions`
    - `cargo nextest run --no-fail-fast -p codex-rollout -p
    codex-thread-store`
    - final `rg` for removed flag/type names
  • fix(tui): clarify footer shortcut overlay hints (#25625)
    ## Why
    
    The TUI shortcut overlay used static labels for `Tab` and `Ctrl+C`, even
    though both keys change behavior while a task is running. That made the
    visible help misleading: idle `Tab` submits rather than queues, and
    active-turn `Ctrl+C` interrupts rather than exits.
    
    Closes #25531.
    Closes #25564.
    
    ## What Changed
    
    - Pass task-running state into the shortcut overlay renderer.
    - Render `Tab` as `submit message` while idle and `queue message` while
    work is running.
    - Render `Ctrl+C` as `exit` while idle and `interrupt` while work is
    running.
    - Add snapshot coverage for the active-work shortcut overlay and update
    idle overlay snapshots.
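
    The state-dependent labels amount to a small lookup keyed on whether a
    task is running. In this sketch `ShortcutHints` and `shortcut_hints` are
    hypothetical names; the label strings are the ones listed above.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    struct ShortcutHints {
        tab: &'static str,
        ctrl_c: &'static str,
    }

    // Pick the overlay labels from the task-running state.
    fn shortcut_hints(task_running: bool) -> ShortcutHints {
        if task_running {
            ShortcutHints { tab: "queue message", ctrl_c: "interrupt" }
        } else {
            ShortcutHints { tab: "submit message", ctrl_c: "exit" }
        }
    }
    ```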
    
    ## How to Test
    
    1. Start Codex and open the shortcut overlay with `?` while no task is
    running.
    2. Confirm the overlay shows `tab to submit message` and `ctrl + c to
    exit`.
    3. Start a task, then open or keep the shortcut overlay visible while
    work is running.
    4. Confirm the overlay shows `tab to queue message` and `ctrl + c to
    interrupt`.
    5. Type a follow-up prompt during active work and press `Tab`; confirm
    it queues rather than submitting immediately.
    
    Targeted tests:
    
    - `just test -p codex-tui footer_snapshots`
    - `just test -p codex-tui footer_mode_snapshots`
    
    ## Validation Notes
    
    `just test -p codex-tui` currently has two unrelated guardian
    feature-flag test failures on this base:
    
    -
    `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
    -
    `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    
    `just argument-comment-lint codex-rs/tui/src/bottom_pane/footer.rs`
    could not run locally because the prebuilt wrapper requires `dotslash`;
    the touched Rust diff was manually inspected for opaque positional
    literals.
  • Add reasoning-only status surface item (#25504)
    Closes #24886.
    
    ## Why
    Users can configure the TUI status line and terminal title with
    `model-with-reasoning`, but issue #24886 asks for a compact
    reasoning-only item. That lets a setup show just `default`, `low`,
    `medium`, `high`, or `xhigh` without repeating the model name.
    
    ## What changed
    - Added a `reasoning` item for `/statusline` and `/title` setup flows.
    - Rendered the item from the effective reasoning effort, including
    collaboration-mode overrides.
    - Registered `reasoning` with `codex doctor` so Codex-generated
    terminal-title config is not reported as invalid.
    - Updated TUI setup snapshots so the picker previews include the new
    item.
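
    A minimal sketch of the label lookup, assuming that a collaboration-mode
    override takes precedence over the configured effort and that `default`
    is shown when neither is set. `ReasoningEffort` and `reasoning_label` are
    hypothetical names for illustration.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ReasoningEffort {
        Low,
        Medium,
        High,
        XHigh,
    }

    // Resolve the effective effort, then map it to the compact label.
    fn reasoning_label(
        configured: Option<ReasoningEffort>,
        collaboration_override: Option<ReasoningEffort>,
    ) -> &'static str {
        match collaboration_override.or(configured) {
            None => "default",
            Some(ReasoningEffort::Low) => "low",
            Some(ReasoningEffort::Medium) => "medium",
            Some(ReasoningEffort::High) => "high",
            Some(ReasoningEffort::XHigh) => "xhigh",
        }
    }
    ```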
  • Reset slash popup selection when filter changes (#25492)
    ## Summary
    
    Fixes #25295.
    
    The slash-command popup reused its previous `ScrollState` when the
    composer filter token changed. After scrolling the full `/` command
    list, typing a narrower filter such as `/st` could clamp the stale
    selection into the filtered results and highlight the wrong command.
    
    This resets the popup selection and viewport only when the parsed filter
    token changes, so normal arrow navigation is preserved while new filters
    start at the first match.
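
    The reset rule can be sketched as follows. `SlashPopup`,
    `parse_filter_token`, and the field names are hypothetical and much
    simpler than the real `ScrollState`; the sketch only shows that selection
    and viewport reset when the parsed token changes and are otherwise left
    alone.

    ```rust
    #[derive(Debug, Default)]
    struct SlashPopup {
        filter: String,
        selected: usize,
        scroll_top: usize,
    }

    // Text after the leading '/' up to the first whitespace (assumed parsing).
    fn parse_filter_token(text: &str) -> &str {
        text.strip_prefix('/')
            .unwrap_or("")
            .split_whitespace()
            .next()
            .unwrap_or("")
    }

    impl SlashPopup {
        // Called whenever the composer text changes.
        fn on_composer_text(&mut self, text: &str) {
            let token = parse_filter_token(text);
            if token != self.filter.as_str() {
                // New filter: start at the first match with a fresh viewport.
                self.filter = token.to_string();
                self.selected = 0;
                self.scroll_top = 0;
            }
        }

        // Arrow-down navigation; keeps the selection inside the viewport.
        fn move_down(&mut self, match_count: usize, window_rows: usize) {
            if self.selected + 1 < match_count {
                self.selected += 1;
            }
            if self.selected >= self.scroll_top + window_rows {
                self.scroll_top = self.selected + 1 - window_rows;
            }
        }
    }
    ```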
  • Allow paste in searchable selection menus (#25400)
    ## Summary
    
    I frequently want to paste into the searchable menu. The most common
    use case is specifying an upstream for `/review`, where I copy the
    upstream from an open terminal.
  • feat(tui): restore output-free cancelled prompts (#25316)
    ## TL;DR
    
    When you press Esc or Ctrl+C after sending a prompt but before any
    output has rendered, the TUI restores the submitted message into the
    composer.
    
    ## Summary
    
    Cancelling a prompt immediately after submission should behave like
    returning to edit that prompt, not like discarding the user's draft.
    Today, pressing `Esc` or `Ctrl+C` before Codex responds leaves the
    submitted prompt in the transcript and returns an empty composer,
    forcing the user to recall or retype it.
    
    When an interrupted turn has not produced substantive visible output,
    restore its submitted prompt directly into the composer and roll back
    that latest turn. This also covers the first prompt in a fresh thread,
    before the TUI has retained a local user-history cell. The restored
    draft keeps its text, image attachments, and active collaboration mode
    so it can be edited and resubmitted in place.
    
    Restoration is intentionally suppressed once the turn has produced
    user-visible activity such as assistant output, tool work, hooks, or
    patches. A transient thinking status does not make the prompt
    ineligible. Rollback also rebuilds terminal scrollback from the retained
    transcript cells so repeated cancellations and terminal resizes do not
    duplicate history.
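
    The eligibility rule described above can be sketched as a predicate over
    the activity observed during the interrupted turn. `TurnActivity` and
    `should_restore_prompt` are hypothetical names; the activity kinds are
    the ones named in this summary.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum TurnActivity {
        ThinkingStatus,
        AssistantOutput,
        ToolWork,
        Hook,
        Patch,
    }

    impl TurnActivity {
        // A transient thinking status does not count as visible output.
        fn is_user_visible(self) -> bool {
            !matches!(self, TurnActivity::ThinkingStatus)
        }
    }

    // Restore the prompt only if nothing user-visible happened in the turn.
    fn should_restore_prompt(activity: &[TurnActivity]) -> bool {
        !activity.iter().any(|a| a.is_user_visible())
    }
    ```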
    
    ## How to Test
    
    1. Start the TUI with `cargo run -p codex-cli --bin codex`.
    2. In a fresh thread, submit the first prompt and press `Esc` before
    Codex emits substantive output. Confirm that the prompt returns to the
    composer for editing and its submitted transcript row is removed.
    3. Repeat with `Ctrl+C`, then repeat after at least one completed turn.
    Confirm the same behavior.
    4. Submit a prompt, wait for assistant output or tool activity, then
    cancel. Confirm that the transcript remains intact and the prompt is not
    restored into the composer.
    5. Cancel several output-free prompts and resize the terminal between
    attempts. Confirm that the startup banner, tip, and transcript history
    do not duplicate in scrollback.
    
    Targeted tests:
    - `just test -p codex-tui cancelled_turn_edit_restores_prompt`
    - `just test -p codex-tui
    output_free_interrupted_turn_requests_prompt_restore`
    - `just test -p codex-tui
    visible_output_prevents_cancelled_turn_prompt_restore`
    - `just test -p codex-tui
    thinking_status_keeps_cancelled_turn_prompt_restore_eligible`
    - `just test -p codex-tui
    patch_activity_prevents_cancelled_turn_prompt_restore`
    
    The full `just test -p codex-tui` run completed with `2746` passing
    tests and two unrelated existing guardian feature-flag failures. `just
    argument-comment-lint` remains blocked locally by the existing Bazel
    LLVM `compiler-rt` sanitizer-header glob failure; the touched Rust diff
    was manually audited for positional literal comments.