Commit Graph

5313 Commits

  • Include legacy deny paths in elevated Windows sandbox setup (#17365)
    ## Summary
    
    This updates the Windows elevated sandbox setup/refresh path to include
    the legacy `compute_allow_paths(...).deny` protected children in the
    same deny-write payload pipe added for split filesystem carveouts.
    
    Concretely, elevated setup and elevated refresh now both build
    deny-write payload paths from:
    
    - explicit split-policy deny-write paths, preserving missing paths so
    setup can materialize them before applying ACLs
    - legacy `compute_allow_paths(...).deny`, which includes existing
    `.git`, `.codex`, and `.agents` children under writable roots
    
    This lets the elevated backend protect `.git` consistently with the
    unelevated/restricted-token path, and removes the old hard-coded
    `.codex` / `.agents` elevated setup helpers in favor of the shared
    payload path.
    
    ## Root Cause
    
    The landed split-carveout PR threaded a `deny_write_paths` pipe through
    elevated setup/refresh, but the legacy workspace-write deny set from
    `compute_allow_paths(...).deny` was not included in that payload. As a
    result, elevated workspace-write did not apply the intended deny-write
    ACLs for existing protected children like `<cwd>/.git`.
    
    ## Notes
    
    The legacy protected children still only enter the deny set if they
    already exist, because `compute_allow_paths` filters `.git`, `.codex`,
    and `.agents` with `exists()`. Missing explicit split-policy deny paths
    are preserved separately because setup intentionally materializes those
    before applying ACLs.
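
    The two sources of deny-write paths described above can be sketched as
    follows. This is a minimal illustration, not the actual
    `codex-windows-sandbox` code: `build_deny_write_payload`, its
    parameters, and the injected `exists` predicate are hypothetical names
    chosen for the example.

    ```rust
    use std::collections::BTreeSet;
    use std::path::{Path, PathBuf};

    /// Hypothetical helper (not the real API): merges both deny sources
    /// into one de-duplicated payload.
    fn build_deny_write_payload(
        explicit_deny: &[PathBuf],
        writable_roots: &[PathBuf],
        exists: impl Fn(&Path) -> bool,
    ) -> Vec<PathBuf> {
        let mut payload: BTreeSet<PathBuf> = BTreeSet::new();

        // Explicit split-policy paths are kept even when missing, because
        // setup materializes them before applying ACLs.
        payload.extend(explicit_deny.iter().cloned());

        // Legacy protected children only enter the set if they already exist.
        for root in writable_roots {
            for child in [".git", ".codex", ".agents"] {
                let candidate = root.join(child);
                if exists(&candidate) {
                    payload.insert(candidate);
                }
            }
        }

        payload.into_iter().collect()
    }
    ```

    Passing `exists` as a closure keeps the sketch testable without
    touching the filesystem; a real caller would presumably use
    `Path::exists`.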
    
    ## Validation
    
    - `cargo fmt --check -p codex-windows-sandbox`
    - `cargo test -p codex-windows-sandbox`
    - `cargo build -p codex-cli -p codex-windows-sandbox --bins`
    - Elevated `codex exec` smoke with `windows.sandbox='elevated'`: fresh
    git repo, attempted append to `.git/config`, observed `Access is
    denied`, marker not written, Deny ACE present on `.git`
    - Unelevated `codex exec` smoke with `windows.sandbox='unelevated'`:
    fresh git repo, attempted append to `.git/config`, observed `Access is
    denied`, marker not written, Deny ACE present on `.git`
  • Do not fail thread start when trust persistence fails (#17595)
    Addresses #17593
    
    Problem: A regression introduced in
    https://github.com/openai/codex/pull/16492 made thread/start fail when
    Codex could not persist trusted project state, which crashes startup for
    users with read-only config.toml.
    
    Solution: Treat trusted project persistence as best effort and keep the
    current thread's config trusted in memory when writing config.toml
    fails.
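
    A minimal sketch of the best-effort behavior described in the
    solution, assuming a hypothetical `ThreadConfig` struct and a
    `persist` callback that stands in for the config.toml write (neither
    is the real Codex type or function):

    ```rust
    use std::io;

    /// Hypothetical in-memory view of the current thread's config.
    #[derive(Debug, Default)]
    struct ThreadConfig {
        trusted: bool,
        persist_warning: Option<String>,
    }

    /// Marks the project trusted in memory and treats persistence as
    /// best effort: a failed write is recorded as a warning, not an error.
    fn mark_project_trusted(
        config: &mut ThreadConfig,
        persist: impl FnOnce() -> io::Result<()>,
    ) {
        // The current thread stays trusted whether or not the write succeeds.
        config.trusted = true;
        if let Err(err) = persist() {
            config.persist_warning =
                Some(format!("could not persist trusted project state: {err}"));
        }
    }
    ```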
  • Fix TUI compaction item replay (#17657)
    Problem: PR #17601 updated context-compaction replay to call a new
    ChatWidget handler, but the handler was never implemented, breaking
    codex-tui compilation on main.
    
    Solution: Render context-compaction replay through the existing
    info-message path, preserving the intended `Context compacted` UI marker
    without adding a one-off handler.
  • Suppress duplicate compaction and terminal wait events (#17601)
    Addresses #17514
    
    Problem: PR #16966 made the TUI render the deprecated context-compaction
    notification, while v2 could also receive legacy unified-exec
    interaction items alongside terminal-interaction notifications, causing
    duplicate "Context compacted" and "Waited for background terminal"
    messages.
    
    Solution: Suppress deprecated context-compaction notifications and
    legacy unified-exec interaction command items from the app-server v2
    projection, and render canonical context-compaction items through the
    existing TUI info-event path.
  • Wrap status reset timestamps in narrow layouts (#17481)
    Addresses #17453
    
    Problem: /status rate-limit reset timestamps can be truncated in narrow
    layouts, leaving users with partial times or dates.
    
    Solution: Let narrow rate-limit rows drop the fixed progress bar to
    preserve the percent summary, and wrap reset timestamps onto
    continuation lines instead of truncating them.
  • Emit plan-mode prompt notifications for questionnaires (#17417)
    Addresses #17252
    
    Problem: Plan-mode clarification questionnaires used the generic
    user-input notification type, so configs listening for plan-mode-prompt
    did not fire when request_user_input waited for an answer.
    
    Solution: Map request_user_input prompts to the plan-mode-prompt
    notification and remove the obsolete user-input TUI notification
    variant.
  • Fix custom tool output cleanup on stream failure (#17470)
    Addresses #16255
    
    Problem: Incomplete Responses streams could leave completed custom tool
    outputs out of cleanup and retry prompts, making persisted history
    inconsistent and retries stale.
    
    Solution: Route stream and output-item errors through shared cleanup,
    and rebuild retry prompts from fresh session history after the first
    attempt.
  • Make forked agent spawns keep parent model config (#17247)
    ## Summary
    
    When a `spawn_agent` call does a full-history fork, keep the parent's
    effective agent type and model configuration instead of applying child
    role/model overrides.
    
    This is the minimal config-inheritance slice of #16055. Prompt-cache key
    inheritance and MCP tool-surface stability are split into follow-up PRs.
    
    ## Design
    
    - Reject `agent_type`, `model`, and `reasoning_effort` for v1
    `fork_context` spawns.
    - Reject `agent_type`, `model`, and `reasoning_effort` for v2
    `fork_turns = "all"` spawns.
    - Keep v2 partial-history forks (`fork_turns = "N"`) configurable;
    requested model/reasoning overrides and role config still apply there.
    - Keep non-forked spawn behavior unchanged.
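
    The rules above amount to a small validation step. The sketch below
    is illustrative only: `ForkTurns`, `SpawnOverrides`, and
    `validate_spawn` are hypothetical names, and the error text is a
    placeholder rather than the message Codex emits.

    ```rust
    /// Hypothetical model of how much parent history a spawn forks.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ForkTurns {
        NoFork,
        All,
        Last(usize),
    }

    /// Hypothetical per-spawn overrides requested by the caller.
    #[derive(Debug, Default)]
    struct SpawnOverrides {
        agent_type: Option<String>,
        model: Option<String>,
        reasoning_effort: Option<String>,
    }

    /// Rejects overrides only for full-history forks; partial forks and
    /// non-forked spawns stay configurable.
    fn validate_spawn(fork: ForkTurns, overrides: &SpawnOverrides) -> Result<(), String> {
        let has_override = overrides.agent_type.is_some()
            || overrides.model.is_some()
            || overrides.reasoning_effort.is_some();
        if fork == ForkTurns::All && has_override {
            return Err(
                "full-history forks keep the parent's agent type and model config".to_string(),
            );
        }
        Ok(())
    }
    ```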
    
    ## Tests
    
    - `cargo +1.93.1 test -p codex-core spawn_agent_fork_context --lib`
    - `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns
    --lib`
    - `cargo +1.93.1 test -p codex-core
    multi_agent_v2_spawn_partial_fork_turns_allows_agent_type_override
    --lib`
  • Build remote exec env from exec-server policy (#17216)
    ## Summary
    - add an exec-server `envPolicy` field; when present, the server starts
    from its own process env and applies the shell environment policy there
    - keep `env` as the exact environment for local/embedded starts, but
    make it an overlay for remote unified-exec starts
    - move the shell-environment-policy builder into `codex-config` so Core
    and exec-server share the inherit/filter/set/include behavior
    - overlay only runtime/sandbox/network deltas from Core onto the
    exec-server-derived env
    
    ## Why
    Remote unified exec was materializing the shell env inside Core and
    forwarding the whole map to exec-server, so remote processes could
    inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
    base env on the executor while preserving Core-owned runtime additions
    like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
    sandbox marker env.
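
    As a rough sketch of the layering described here, the executor's own
    process env forms the base, the shell environment policy is applied
    on the executor, and Core's runtime deltas are overlaid last. The
    `EnvPolicy` shape (an allow-list plus explicit sets) and the
    `build_remote_env` function are simplifying assumptions for
    illustration; the real policy in `codex-config` has more knobs.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical, simplified policy: an optional allow-list of inherited
    /// variables plus variables to set explicitly.
    struct EnvPolicy {
        include_only: Vec<String>,
        set: HashMap<String, String>,
    }

    /// Builds the remote process env on the executor side.
    fn build_remote_env(
        executor_env: &HashMap<String, String>,
        policy: &EnvPolicy,
        core_overlay: &HashMap<String, String>,
    ) -> HashMap<String, String> {
        // 1. Start from the executor's env, filtered by the policy
        //    (an empty allow-list means "inherit everything").
        let mut env: HashMap<String, String> = executor_env
            .iter()
            .filter(|(k, _)| policy.include_only.is_empty() || policy.include_only.contains(*k))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect();

        // 2. Apply explicit policy sets.
        env.extend(policy.set.clone());

        // 3. Overlay only the Core-owned runtime deltas (e.g. CODEX_THREAD_ID).
        env.extend(core_overlay.iter().map(|(k, v)| (k.clone(), v.clone())));
        env
    }
    ```

    With this ordering the orchestrator's `HOME` and `PATH` never reach
    the remote process unless Core explicitly places them in the overlay.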
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - `cargo test -p codex-exec-server --lib`
    - `cargo test -p codex-core --lib unified_exec::process_manager::tests`
    - `cargo test -p codex-core --lib exec_env::tests`
    - `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
    matched 0 tests)
    - `cargo test -p codex-config --lib shell_environment` (compile-only;
    filter matched 0 tests)
    - `just bazel-lock-update`
    
    ## Known local validation issue
    - `just bazel-lock-check` is not runnable in this checkout: it invokes
    `./scripts/check-module-bazel-lock.sh`, which is missing.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • feat: ignore keyring on 0.0.0 (#17221)
    To prevent the spammy: 
    <img width="424" height="172" alt="Screenshot 2026-04-09 at 13 36 16"
    src="https://github.com/user-attachments/assets/b5ece9e3-c561-422f-87ec-041e7bd6813d"
    />
  • Stabilize exec-server process tests (#17605)
    Problem: After #17294 switched exec-server tests to launch the top-level
    `codex exec-server` command, parallel remote exec-process cases can
    flake while waiting for the child server's listen URL or transport
    shutdown.
    
    Solution: Serialize remote exec-server-backed process tests and harden
    the harness so spawned servers are killed on drop and shutdown waits for
    the child process to exit.
  • Run exec-server fs operations through sandbox helper (#17294)
    ## Summary
    - run exec-server filesystem RPCs requiring sandboxing through a
    `codex-fs` arg0 helper over stdin/stdout
    - keep direct local filesystem execution for `DangerFullAccess` and
    external sandbox policies
    - remove the standalone exec-server binary path in favor of top-level
    arg0 dispatch/runtime paths
    - add sandbox escape regression coverage for local and remote filesystem
    paths
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - remote devbox: `cd codex-rs && bazel test --bes_backend=
    --bes_results_url= //codex-rs/exec-server:all` (6/6 passed)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add MCP tool wall time to model output (#17406)
    Include MCP wall time in the output so the model is aware of how long
    its calls are taking.
  • fix(mcp) pause timer for elicitations (#17566)
    ## Summary
    Stop counting elicitation time towards mcp tool call time. There are
    some tradeoffs here, but in general I don't think time spent waiting for
    elicitations should count towards tool call time, or at least not
    directly towards timeouts.
    
    Elicitations are not exactly like exec_command escalation requests, but
    I would argue they are roughly equivalent.
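
    One way to picture the change is a pausable stopwatch that stops
    accumulating while an elicitation is pending. This is a sketch under
    that assumption; `ToolCallTimer` is a hypothetical name, and the
    explicit `now` parameters exist only to make the example
    deterministic.

    ```rust
    use std::time::{Duration, Instant};

    /// Hypothetical timer that counts only time spent outside elicitations.
    struct ToolCallTimer {
        active: Duration,
        running_since: Option<Instant>,
    }

    impl ToolCallTimer {
        fn start(now: Instant) -> Self {
            Self { active: Duration::ZERO, running_since: Some(now) }
        }

        /// Called when an elicitation begins: bank the time counted so far.
        fn pause(&mut self, now: Instant) {
            if let Some(since) = self.running_since.take() {
                self.active += now.duration_since(since);
            }
        }

        /// Called when the elicitation resolves: resume counting.
        fn resume(&mut self, now: Instant) {
            if self.running_since.is_none() {
                self.running_since = Some(now);
            }
        }

        /// Time that counts towards the tool call (and its timeout).
        fn elapsed(&self, now: Instant) -> Duration {
            match self.running_since {
                Some(since) => self.active + now.duration_since(since),
                None => self.active,
            }
        }
    }
    ```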
    
    ## Testing
    - [x] Added unit tests
    - [x] Tested locally
  • Expose instruction sources (AGENTS.md) via app server (#17506)
    Addresses #17498
    
    Problem: The TUI derived /status instruction source paths from the local
    client environment, which could show stale <none> output or incorrect
    paths when connected to a remote app server.
    
    Solution: Add an app-server v2 instructionSources snapshot to thread
    start/resume/fork responses, default it to an empty list when older
    servers omit it, and render TUI /status from that server-provided
    session data.
    
    Additional context: The app-server field is intentionally named
    instructionSources rather than AGENTS.md-specific terminology because
    the loaded instruction sources can include global instructions, project
    AGENTS.md files, AGENTS.override.md, user-defined instruction files, and
    future dynamic sources.
  • Remove context status-line meter (#17420)
    Addresses #17313
    
    Problem: The visual context meter in the status line was confusing and
    continued to draw negative feedback, and context reporting should remain
    an explicit opt-in rather than part of the default footer.
    
    Solution: Remove the visual meter, restore opt-in context remaining/used
    percentage items that explicitly say "Context", keep existing
    context-usage configs working as a hidden alias, and update the setup
    text and snapshots.
  • feat(tui): add reverse history search to composer (#17550)
    ## Problem
    
    The TUI had shell-style Up/Down history recall, but `Ctrl+R` did not
    provide the reverse incremental search workflow users expect from
    shells. Users needed a way to search older prompts without immediately
    replacing the current draft, and the interaction needed to handle async
    persistent history, repeated navigation keys, duplicate prompt text,
    footer hints, and preview highlighting without making the main composer
    file even harder to review.
    
    
    https://github.com/user-attachments/assets/5165affd-4c9a-46e9-adbd-89088f5f7b6b
    
    <img width="1227" height="722" alt="image"
    src="https://github.com/user-attachments/assets/8bc83289-eeca-47c7-b0c3-8975101901af"
    />
    
    ## Mental model
    
    `Ctrl+R` opens a temporary search session owned by the composer. The
    footer line becomes the search input, the composer body previews the
    current match only after the query has text, and `Enter` accepts that
    preview as an editable draft while `Esc` restores the draft that existed
    before search started. The history layer provides a combined offset
    space over persistent and local history, but search navigation exposes
    unique prompt text rather than every physical history row.
    
    ## Non-goals
    
    This change does not rewrite stored history, change normal Up/Down
    browsing semantics, add fuzzy matching, or add persistent metadata for
    attachments in cross-session history. Search deduplication is
    deliberately scoped to the active Ctrl+R search session and uses exact
    prompt text, so case, whitespace, punctuation, and attachment-only
    differences are not normalized.
    
    ## Tradeoffs
    
    The implementation keeps search state in the existing composer and
    history state machines instead of adding a new cross-module controller.
    That keeps ownership local and testable, but it means the composer still
    coordinates visible search status, draft restoration, footer rendering,
    cursor placement, and match highlighting while `ChatComposerHistory`
    owns traversal, async fetch continuation, boundary clamping, and
    unique-result caching. Unique-result caching stores cloned
    `HistoryEntry` values so known matches can be revisited without cache
    lookups; this is simple and robust for interactive search sizes, but it
    is not a global history index.
    
    ## Architecture
    
    `ChatComposer` detects `Ctrl+R`, snapshots the current draft, switches
    the footer to `FooterMode::HistorySearch`, and routes search-mode keys
    before normal editing. Query edits call `ChatComposerHistory::search`
    with `restart = true`, which starts from the newest combined-history
    offset. Repeated `Ctrl+R` or Up searches older; Down searches newer
    through already discovered unique matches or continues the scan.
    Persistent history entries still arrive asynchronously through
    `on_entry_response`, where a pending search either accepts the response,
    skips a duplicate, or requests the next offset.
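
    Ignoring the async persistent-history fetches, the traversal reduces
    to a newest-first scan that skips prompts whose exact text has
    already been shown. The sketch below is a synchronous simplification
    over an in-memory slice; `search_unique_newest_first` is a
    hypothetical name, and case-insensitive matching is an assumption
    (the PR only states that highlighting is case-insensitive).

    ```rust
    use std::collections::HashSet;

    /// Returns matching prompts from newest to oldest, exposing each exact
    /// prompt text at most once.
    fn search_unique_newest_first<'a>(history: &'a [String], query: &str) -> Vec<&'a str> {
        // An empty query previews nothing.
        if query.is_empty() {
            return Vec::new();
        }
        let needle = query.to_lowercase();
        let mut seen: HashSet<&str> = HashSet::new();
        let mut results = Vec::new();
        for entry in history.iter().rev() {
            // Deduplication uses exact text, so "Cargo test" and "cargo test"
            // remain distinct results.
            if entry.to_lowercase().contains(&needle) && seen.insert(entry.as_str()) {
                results.push(entry.as_str());
            }
        }
        results
    }
    ```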
    
    The composer-facing pieces now live in
    `codex-rs/tui/src/bottom_pane/chat_composer/history_search.rs`, leaving
    `chat_composer.rs` responsible for routing and rendering integration
    instead of owning every search helper inline.
    `codex-rs/tui/src/bottom_pane/chat_composer_history.rs` remains the
    owner of stored history, combined offsets, async fetch state, boundary
    semantics, and duplicate suppression. Match highlighting is computed
    from the current composer text while search is active and disappears
    when the match is accepted.
    
    ## Observability
    
    There are no new logs or telemetry. The practical debug path is state
    inspection: `ChatComposer.history_search` tells whether the footer query
    is idle, searching, matched, or unmatched; `ChatComposerHistory.search`
    tracks selected raw offsets, pending persistent fetches, exhausted
    directions, and unique match cache state. If a user reports skipped or
    repeated results, first inspect the exact stored prompt text, the
    selected offset, whether an async persistent response is still pending,
    and whether a query edit restarted the search session.
    
    ## Tests
    
    The change is covered by focused `codex-tui` unit tests for opening
    search without previewing the latest entry, accepting and canceling
    search, no-match restoration, boundary clamping, footer hints,
    case-insensitive highlighting, local duplicate skipping, and persistent
    duplicate skipping through async responses. Snapshot coverage captures
    the footer-mode visual changes. Local verification used `just fmt`,
    `cargo test -p codex-tui history_search`, `cargo test -p codex-tui`, and
    `just fix -p codex-tui`.
  • Mirror user text into realtime (#17520)
    - Let typed user messages submit while realtime is active and mirror
    accepted text into the realtime text stream.
    - Add integration coverage and snapshot for outbound realtime text.
  • fix(sandboxing): reject WSL1 bubblewrap sandboxing (#17559)
    ## Summary
    
    - detect WSL1 before Codex probes or invokes the Linux bubblewrap
    sandbox
    - fail early with a clear unsupported-operation message when a command
    would require bubblewrap on WSL1
    - document that WSL2 follows the normal Linux bubblewrap path while WSL1
    is unsupported
    
    ## Why
    
    Codex 0.115.0 made bubblewrap the default Linux sandbox. WSL1 cannot
    create the user namespaces that bubblewrap needs, so shell commands
    currently fail later with a raw bwrap namespace error. This makes the
    unsupported environment explicit and keeps non-bubblewrap paths
    unchanged.
    
    The WSL detection reads /proc/version, lets an explicit WSL<version>
    marker decide WSL1 vs WSL2+, and only treats a bare Microsoft marker as
    WSL1 when no explicit WSL version is present.
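
    The detection rule can be sketched as a pure function over the
    contents of /proc/version. `WslKind` and `classify_proc_version` are
    hypothetical names, and the exact parsing (case-insensitive match,
    digits directly after `WSL`) is an assumption about how the marker is
    read:

    ```rust
    #[derive(Debug, PartialEq, Eq)]
    enum WslKind {
        NotWsl,
        Wsl1,
        Wsl2OrNewer,
    }

    /// Classifies a /proc/version string following the rule described above.
    fn classify_proc_version(proc_version: &str) -> WslKind {
        let lower = proc_version.to_ascii_lowercase();

        // An explicit WSL<version> marker decides first.
        let mut rest = lower.as_str();
        while let Some(idx) = rest.find("wsl") {
            let after = &rest[idx + 3..];
            let digits: String = after.chars().take_while(|c| c.is_ascii_digit()).collect();
            if let Ok(version) = digits.parse::<u32>() {
                return if version == 1 { WslKind::Wsl1 } else { WslKind::Wsl2OrNewer };
            }
            rest = after;
        }

        // A bare Microsoft marker without an explicit version is treated as WSL1.
        if lower.contains("microsoft") {
            WslKind::Wsl1
        } else {
            WslKind::NotWsl
        }
    }
    ```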
    
    addresses https://github.com/openai/codex/issues/16076
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • build(pnpm): require reviewed dependency build scripts (#17558)
    ## Description
    
    Enable pnpm's reviewed build-script gate for this repo.
    
    ## What changed
    
    - added `strictDepBuilds: true` to `pnpm-workspace.yaml`
    
    ## Why
    
    The repo already uses pinned pnpm and frozen installs in CI. This adds
    the remaining guard so dependency build scripts do not run unless they
    are explicitly reviewed.
    
    ## Validation
    
    - ran `pnpm install --frozen-lockfile`
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Support flattened deferred MCP tool calls (#17556)
    ## Summary
    - register flattened handler aliases for deferred MCP tools
    - cover the node_repl-shaped deferred MCP call path in tool registry
    tests
    
    ## Root Cause
    Deferred MCP tools were registered only under their namespaced handler
    key, e.g. `mcp__node_repl__:js`. If the model/bridge emitted the
    flattened qualified name `mcp__node_repl__js`, core parsed it as an MCP
    payload but dispatch looked up the flattened handler key and returned
    `unsupported call` before reaching the MCP handler.
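
    The fix can be pictured as registering each deferred MCP tool under
    both spellings of its name. The registry below is a toy stand-in:
    `ToolRegistry` here is a plain map to a handler label, and
    `register_deferred_mcp` and `dispatch` are hypothetical methods, not
    the real `codex-tools` API.

    ```rust
    use std::collections::HashMap;

    /// Toy registry mapping handler keys to a handler label.
    #[derive(Default)]
    struct ToolRegistry {
        handlers: HashMap<String, &'static str>,
    }

    impl ToolRegistry {
        /// Registers the namespaced key (e.g. `mcp__node_repl__:js`) and the
        /// flattened alias (e.g. `mcp__node_repl__js`) for the same handler.
        fn register_deferred_mcp(&mut self, namespace: &str, tool: &str, handler: &'static str) {
            self.handlers.insert(format!("{namespace}:{tool}"), handler);
            self.handlers.insert(format!("{namespace}{tool}"), handler);
        }

        fn dispatch(&self, name: &str) -> Result<&'static str, String> {
            self.handlers
                .get(name)
                .copied()
                .ok_or_else(|| format!("unsupported call: {name}"))
        }
    }
    ```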
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-tools
    search_tool_registers_deferred_mcp_flattened_handlers`
    - `cargo test -p codex-core
    search_tool_registers_namespaced_mcp_tool_aliases`
    - `git diff --check`
  • Budget realtime current thread context (#17519)
    Select Current Thread startup context by budget from newest turns, cap
    each rendered turn at 300 approximate tokens, and add formatter plus
    integration snapshot coverage.
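
    A minimal sketch of budgeted selection, assuming a crude estimate of
    four characters per token and truncation by character count. Both
    are assumptions made for the example, as are the names
    `approx_tokens` and `select_turns`; only the 300-token per-turn cap
    comes from the description above.

    ```rust
    const PER_TURN_TOKEN_CAP: usize = 300;
    /// Assumed ratio for the approximate token count (not from the PR).
    const CHARS_PER_TOKEN: usize = 4;

    fn approx_tokens(text: &str) -> usize {
        (text.chars().count() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN
    }

    /// Walks turns from newest to oldest, caps each rendered turn, and stops
    /// once the next turn no longer fits. Output is in chronological order.
    fn select_turns(turns: &[String], budget: usize) -> Vec<String> {
        let mut selected = Vec::new();
        let mut remaining = budget;
        for turn in turns.iter().rev() {
            let rendered: String = turn
                .chars()
                .take(PER_TURN_TOKEN_CAP * CHARS_PER_TOKEN)
                .collect();
            let cost = approx_tokens(&rendered);
            if cost > remaining {
                break;
            }
            remaining -= cost;
            selected.push(rendered);
        }
        selected.reverse();
        selected
    }
    ```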
  • [codex] Support bubblewrap in secure Docker devcontainer (#17547)
    ## Summary
    
    - leave the default contributor devcontainer on its lightweight
    platform-only Docker runtime
    - install bubblewrap in setuid mode only in the secure devcontainer
    image for running Codex inside Docker
    - add Docker run args to the secure profile for bubblewrap's required
    capabilities
    - use explicit `seccomp=unconfined` and `apparmor=unconfined` in the
    secure profile instead of shipping a custom seccomp profile
    - document that the relaxed Docker security options are scoped to the
    secure profile
    
    ## Why
    
    Docker's default seccomp profile blocks bubblewrap with `pivot_root:
    Operation not permitted`, even when the container has `CAP_SYS_ADMIN`.
    Docker's default AppArmor profile also blocks bubblewrap with `Failed to
    make / slave: Permission denied`.
    
    A custom seccomp profile works, but it is hard for customers to audit
    and understand. Using Docker's standard `seccomp=unconfined` option is
    clearer: the secure profile intentionally relaxes Docker's outer sandbox
    just enough for Codex to construct its own bubblewrap/seccomp sandbox
    inside the container. The default contributor profile does not get these
    expanded runtime settings.
    
    ## Validation
    
    - `sed '/\\/\\*/,/\\*\\//d' .devcontainer/devcontainer.json | jq empty`
    - `jq empty .devcontainer/devcontainer.secure.json`
    - `git diff --check`
    - `docker build --platform=linux/arm64 -t
    codex-devcontainer-bwrap-test-arm64 ./.devcontainer`
    - `docker build --platform=linux/arm64 -f
    .devcontainer/Dockerfile.secure -t
    codex-devcontainer-secure-bwrap-test-arm64 .`
    - interactive `docker run -it` smoke tests:
      - verified non-root users `ubuntu` and `vscode`
      - verified secure image `/usr/bin/bwrap` is setuid
      - verified user/pid namespace, user/network namespace, and preserved-fd
        `--ro-bind-data` bwrap commands
    - reran secure-image smoke test with simplified `seccomp=unconfined`
    setup:
      - `bwrap-basic-ok`
      - `bwrap-netns-ok`
      - `codex-ok`
    - ran Codex inside the secure image:
      - `codex --version` -> `codex-cli 0.120.0`
      - `codex sandbox linux --full-auto -- /bin/sh -lc '...'` -> exited 0 and
        printed `codex-inner-ok`
    
    Note: direct `bwrap --proc /proc` is still denied by this Docker
    runtime, and Codex's existing proc-mount preflight fallback handles that
    by retrying without `--proc`.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Clarify guardian timeout guidance (#17521)
    ## Summary
    - update the guardian timeout guidance to say permission approval review
    timed out
    - simplify the retry guidance to say retry once or ask the user for
    guidance or explicit approval
    
    ## Testing
    - cargo test -p codex-core
    guardian_timeout_message_distinguishes_timeout_from_policy_denial
    - cargo test -p codex-core
    guardian_review_decision_maps_to_mcp_tool_decision
  • changing decision semantics after guardian timeout (#17486)
    **Summary**
    
    This PR treats Guardian timeouts as distinct from explicit denials in
    the core approval paths.
    Timeouts now return timeout-specific guidance instead of Guardian
    policy-rejection messaging.
    It updates the command, shell, network, and MCP approval flows and adds
    focused test coverage.
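
    The distinction can be sketched as a three-way outcome instead of a
    boolean. `GuardianOutcome` and `rejection_guidance` are hypothetical
    names, and the message strings are placeholders rather than the text
    Codex emits:

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum GuardianOutcome {
        Approved,
        Denied,
        TimedOut,
    }

    /// Returns guidance for non-approved outcomes; timeouts get their own
    /// message instead of reusing the policy-rejection one.
    fn rejection_guidance(outcome: GuardianOutcome) -> Option<&'static str> {
        match outcome {
            GuardianOutcome::Approved => None,
            GuardianOutcome::Denied => Some("Guardian policy rejected this action."),
            GuardianOutcome::TimedOut => Some(
                "Permission approval review timed out; retry once or ask the user \
                 for guidance or explicit approval.",
            ),
        }
    }
    ```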
  • chore: refactor name and namespace to single type (#17402)
    Avoid passing name and namespace around separately; unify them in a
    single type, which now also keys `ToolRegistry`.
    
    Tests pass.
  • Restore codex-tui resume hint on exit (#17415)
    Addresses #17303
    
    Problem: The standalone codex-tui entrypoint only printed token usage on
    exit, so resumable sessions could omit the codex resume footer even when
    thread metadata was available.
    
    Solution: Format codex-tui exit output from AppExitInfo so it includes
    the same resume hint as the main CLI and reports fatal exits
    consistently.
  • Clear /ps after /stop (#17416)
    Addresses #17311
    
    Problem: `/stop` stops background terminals, but `/ps` can still show
    stale entries because the TUI process cache is cleared only after later
    exec end events arrive.
    
    Solution: Clear the TUI's tracked unified exec process list and footer
    immediately when `/stop` submits background terminal cleanup.
  • Support prolite plan type (#17419)
    Addresses #17353
    
    Problem: Codex rate-limit fetching failed when the backend returned the
    new `prolite` subscription plan type.
    
    Solution: Add `prolite` to the backend/account/auth plan mappings, keep
    unknown WHAM plan values decodable, and regenerate app-server plan
    schemas.
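
    Keeping unknown plan values decodable usually means giving the enum
    a catch-all variant. A sketch under that assumption follows;
    `PlanType` here lists only two known variants for brevity, and
    `parse_plan_type` is a hypothetical helper, not the generated schema
    code:

    ```rust
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum PlanType {
        Pro,
        ProLite,
        /// Catch-all so a new backend value does not fail decoding.
        Unknown(String),
    }

    fn parse_plan_type(raw: &str) -> PlanType {
        match raw {
            "pro" => PlanType::Pro,
            "prolite" => PlanType::ProLite,
            other => PlanType::Unknown(other.to_string()),
        }
    }
    ```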
  • fix (#17493)
  • Update issue labeler agent labels (#17483)
    Problem: The automatic issue labeler still treated agent-related issues
    as one broad category, even though more specific agent-area labels now
    exist.
    
    Solution: Update the issue labeler prompt to prefer the new agent-area
    labels and keep "agent" as the fallback for uncategorized core agent
    issues.
  • Handle closed TUI input stream as shutdown (#17430)
    Addresses #17276
    
    Problem: Closing the terminal while the TUI input stream is pending
    could leave the app outside the normal shutdown path, which is risky
    when an approval prompt is active.
    
    Solution: Treat a closed TUI input stream as ShutdownFirst so existing
    thread shutdown behavior cancels pending work and approvals before exit.
  • fix(tui): recall accepted slash commands locally (#17336)
    # TL;DR
    
    - Adds recognized slash commands to the TUI's local in-session recall
    history.
    - This is the MVP of the whole feature. It keeps slash-command recall
    local only: nothing is written to persistent history, app-server
    history, or core history storage.
    - Treats slash commands like submitted text once they parse as a known
    built-in command, regardless of whether command dispatch later succeeds.
    
    # Problem
    
    Slash commands are handled outside the normal message submission path,
    so they could clear the composer without becoming part of the local
    Up-arrow recall list. That made command-heavy workflows awkward: after
    running `/diff`, `/rename Better title`, `/plan investigate this`, or
    even a valid command that reports a usage error, users had to retype the
    command instead of recalling and editing it like a normal prompt.
    
    The goal of this PR is to make slash commands feel like submitted input
    inside the current TUI session while keeping the change deliberately
    local. This is not persistent history yet; it only affects the
    composer's in-memory recall behavior.
    
    # Mental model
    
    The composer owns draft state and local recall. When slash input parses
    as a recognized built-in command, the composer stages the submitted
    command text before returning `InputResult::Command` or
    `InputResult::CommandWithArgs`. `ChatWidget` then dispatches the command
    and records the staged entry once dispatch returns to the input-result
    path.
    
    Command-name recognition is the only validation before local recall. A
    valid slash command is recallable whether it succeeds, fails with a
    usage error, no-ops, is unavailable while a task is running, or is
    skipped by command-specific logic. An unrecognized slash command is
    different: it is restored as a draft, surfaces the existing
    unrecognized-command message, and is not added to recall.
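
    The recognition rule can be sketched independently of dispatch. The
    types below are toy stand-ins: `Composer`, `SlashOutcome`, and
    `submit_slash` are hypothetical names, and the real flow stages a
    `HistoryEntry` and records it after dispatch rather than pushing a
    string directly.

    ```rust
    #[derive(Debug, PartialEq, Eq)]
    enum SlashOutcome {
        Recorded,
        RestoredAsDraft,
    }

    /// Toy composer holding local recall history and the current draft.
    #[derive(Default)]
    struct Composer {
        recall: Vec<String>,
        draft: String,
    }

    /// Records recognized commands in local recall regardless of how
    /// dispatch later turns out; unrecognized input goes back to the draft.
    fn submit_slash(composer: &mut Composer, input: &str, known: &[&str]) -> SlashOutcome {
        let trimmed = input.trim();
        let name = trimmed
            .strip_prefix('/')
            .and_then(|rest| rest.split_whitespace().next())
            .unwrap_or("");
        if known.contains(&name) {
            composer.recall.push(trimmed.to_string());
            composer.draft.clear();
            SlashOutcome::Recorded
        } else {
            composer.draft = input.to_string();
            SlashOutcome::RestoredAsDraft
        }
    }
    ```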
    
    Bare commands recalled from typed text use the trimmed submitted draft.
    Commands selected from the popup record the canonical command text, such
    as `/diff`, rather than the partial filter text the user typed. Inline
    commands with arguments keep the original command invocation available
    locally even when their arguments are later prepared through the normal
    submission pipeline.
    
    # Non-goals
    
    Persisting slash commands across sessions is intentionally out of scope.
    This change does not modify app-server history, core history storage,
    protocol events, or message submission semantics.
    
    This does not change command availability, command side effects, popup
    filtering, command parsing, or the semantics of unsupported commands. It
    only changes whether recognized slash-command invocations are available
    through local Up-arrow recall after the user submits them.
    
    # Tradeoffs
    
    The main tradeoff is that recall is based on command recognition, not
    command outcome. This intentionally favors a simpler user model: if the
    TUI accepted the input as a slash command, the user can recall and edit
    that input just like plain text. That means valid-but-unsuccessful
    invocations such as usage errors are recallable, which is useful when
    the next action is usually to edit and retry.
    
    The previous accept/reject design required command dispatch to report a
    boolean outcome, which made the dispatcher API noisier and forced every
    branch to decide history behavior. This version keeps the dispatch APIs
    as side-effect-only methods and localizes history recording to the
    slash-command input path.
    
    Inline command handling still avoids double-recording by preparing
    inline arguments without using the normal message-submission history
    path. The staged slash-command entry remains the single local recall
    record for the command invocation.
    
    # Architecture
    
    `ChatComposer` stages a pending `HistoryEntry` when recognized
    slash-command input is promoted into an input result. The pending entry
    mirrors the existing local history payload shape so recall can restore
    text elements, local images, remote images, mention bindings, and
    pending paste state when those are present.
    
    `BottomPane` exposes a narrow method for recording that staged command
    entry because it owns the composer. `ChatWidget` records the staged
    entry after dispatching a recognized command from the input-result
    match. Valid commands rejected before they reach `ChatWidget`, such as
    commands unavailable while a task is running, are staged and recorded in
    the composer path that detects the rejection.
    
    Slash-command dispatch itself now lives in
    `chatwidget/slash_dispatch.rs` so the behavior is reviewable without
    adding more weight to `chatwidget.rs`. The extraction is
    behavior-preserving: the dispatch match arms stay intact, while the
    input flow in `chatwidget.rs` remains the single place that connects
    submitted slash-command input to dispatch.
    
    # Observability
    
    There is no new logging because this is a local UI recall behavior and
    the result is directly visible through Up-arrow recall. The practical
    debug path is to trace Enter through
    `ChatComposer::try_dispatch_bare_slash_command`,
    `ChatComposer::try_dispatch_slash_command_with_args`, or popup Enter/Tab
    handling, then confirm the recognized command is staged before dispatch
    and recorded exactly once afterward.
    
    If a valid command unexpectedly does not appear in recall, check whether
    the input path staged slash history before clearing the composer and
    whether it used the `ChatWidget` slash-dispatch wrapper. If an
    unrecognized command unexpectedly appears in recall, check the parser
    branch that should restore the draft instead of staging history.
    
    # Tests
    
    Composer-level tests cover staging and recording for a bare typed slash
    command, a popup-selected command, and an inline command with arguments.
    
    Chat-widget tests cover valid commands being recallable after normal
    dispatch, inline dispatch, usage errors, task-running unavailability,
    no-op stub dispatch, and command-specific skip behavior such as `/init`
    when an instructions file already exists. They also cover the negative
    case: unrecognized slash commands are not added to local recall.
  • Pass turn id with feedback uploads (#17314)
    ## Summary
    - Add an optional `tags` dictionary to feedback upload params.
    - Capture the active app-server turn id in the TUI and submit it as
    `tags.turn_id` with `/feedback` uploads.
    - Merge client-provided feedback tags into Sentry feedback tags while
    preserving reserved system fields like `thread_id`, `classification`,
    `cli_version`, `session_source`, and `reason`.
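    
    As a rough illustration of the merge rule in the last bullet, here is a
    minimal sketch. The function name `merge_feedback_tags`, the
    `BTreeMap<String, String>` tag type, and the "system value wins" policy
    for non-reserved collisions are assumptions made for this example, not
    the actual `codex-feedback` API; only the reserved key names come from
    the summary above.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Reserved keys listed in the summary; client tags must not override them.
    const RESERVED_KEYS: [&str; 5] = [
        "thread_id",
        "classification",
        "cli_version",
        "session_source",
        "reason",
    ];
    
    /// Hypothetical helper: merge optional client tags into the system tags.
    /// Client entries with a reserved key are dropped, and an existing system
    /// value is never overwritten (an assumed policy for this sketch).
    fn merge_feedback_tags(
        system: BTreeMap<String, String>,
        client: Option<BTreeMap<String, String>>,
    ) -> BTreeMap<String, String> {
        let mut merged = system;
        // `tags` is optional, so a missing map behaves like an empty one.
        for (key, value) in client.unwrap_or_default() {
            if RESERVED_KEYS.contains(&key.as_str()) {
                continue;
            }
            merged.entry(key).or_insert(value);
        }
        merged
    }
    ```
    
    With this shape, a client that sends `turn_id` alongside a spoofed
    `thread_id` would end up with its `turn_id` recorded and the system
    `thread_id` untouched.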
    
    ## Behavior / impact
    Existing feedback upload callers remain compatible because `tags` is
    optional and nullable. The wire shape is still a normal JSON object /
    TypeScript dictionary, so adding future feedback metadata will not
    require a new top-level protocol field each time. This change only adds
    feedback metadata for Codex CLI/TUI uploads; it does not affect existing
    pipelines, DAGs, exports, or downstream consumers unless they choose to
    read the new `turn_id` feedback tag.
    
    ## Tests
    - `cargo fmt -- --config imports_granularity=Item` passed; stable
    rustfmt warned that `imports_granularity` is nightly-only.
    - `cargo run -p codex-app-server-protocol --bin write_schema_fixtures`
    - `cargo test -p codex-feedback
    upload_tags_include_client_tags_and_preserve_reserved_fields`
    - `cargo test -p codex-app-server-protocol
    schema_fixtures_match_generated`
    - `cargo test -p codex-tui build_feedback_upload_params`
    - `cargo test -p codex-tui
    live_app_server_turn_started_sets_feedback_turn_id`
    - `cargo check -p codex-app-server --tests`
    - `git diff --check`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat(devcontainer): add separate secure customer profile (#10431)
    ## Description
    
    Keeps the existing Codex contributor devcontainer in place and adds a
    separate secure profile for customer use.
    
    ## What changed
    
    - leaves `.devcontainer/devcontainer.json` and the contributor
    `Dockerfile` aligned with `main`
    - adds `.devcontainer/devcontainer.secure.json` and
    `.devcontainer/Dockerfile.secure`
    - adds secure-profile bootstrap scripts:
      - `post_install.py`
      - `post-start.sh`
      - `init-firewall.sh`
    - updates `.devcontainer/README.md` to explain when to use each path
    
    ## Secure profile behavior
    
    The new secure profile is opt-in and is meant for running Codex in a
    stricter project container:
    
    - preinstalls the Codex CLI plus common build tools
    - uses persistent volumes for Codex state, Cargo, Rustup, and GitHub
    auth
    - applies an allowlist-driven outbound firewall at startup
    - blocks IPv6 by default so the allowlist cannot be bypassed via AAAA
    routes
    - keeps the stricter networking isolated from the default contributor
    workflow
    
    ## Resulting behavior
    
    - `devcontainer.json` remains the low-friction Codex contributor setup
    - `devcontainer.secure.json` is the customer-facing secure option
    - the repo supports both workflows without forcing the secure profile on
    Codex contributors
  • Fix thread/list cwd filtering for Windows verbatim paths (#17414)
    Addresses #17302
    
    Problem: `thread/list` compared cwd filters with raw path equality, so
    `resume --last` could miss Windows sessions when the saved cwd used a
    verbatim path form and the current cwd did not.
    
    Solution: Normalize cwd comparisons through the existing path comparison
    utilities before falling back to direct equality, and add Windows
    regression coverage for verbatim paths. I made this a general utility
    function and replaced all of the duplicated instances of it across the
    codebase.
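    
    To make the failure mode concrete, here is a simplified, string-based
    sketch of the comparison. The names `normalize_cwd` and `cwd_matches`
    are hypothetical, and the normalization rules (strip the `\\?\` prefix,
    unify separators, ignore ASCII case) are assumptions for illustration;
    the actual fix reuses the existing path comparison utilities.
    
    ```rust
    /// Hypothetical normalizer: strip the Windows verbatim prefix and unify
    /// separators and ASCII case so two spellings of one directory compare equal.
    fn normalize_cwd(path: &str) -> String {
        let stripped = if let Some(rest) = path.strip_prefix(r"\\?\UNC\") {
            // Verbatim UNC form: \\?\UNC\server\share -> \\server\share
            format!(r"\\{rest}")
        } else if let Some(rest) = path.strip_prefix(r"\\?\") {
            // Verbatim drive form: \\?\C:\dir -> C:\dir
            rest.to_string()
        } else {
            path.to_string()
        };
        stripped
            .replace('/', r"\")
            .trim_end_matches('\\')
            .to_ascii_lowercase()
    }
    
    /// Compare normalized forms first, then fall back to direct equality.
    fn cwd_matches(filter: &str, saved: &str) -> bool {
        normalize_cwd(filter) == normalize_cwd(saved) || filter == saved
    }
    ```
    
    Under these assumptions, `\\?\C:\Users\me\repo` and `C:\Users\me\repo`
    are treated as the same cwd, which is the case raw equality missed.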
  • Stabilize marketplace add local source test (#17424)
    ## Summary
    - Update the marketplace add local-source integration test to pass an
    explicit relative local path.
    - Keep the change test-only; no CLI source parsing behavior changes.
    
    ## Tests
    - `cargo fmt -p codex-cli`
    - `cargo test -p codex-cli --test marketplace_add`
    
    ## Impact
    - Production behavior is unchanged.
    - No impact to feedback upload logic, DAGs, exports, or downstream
    pipelines.
    
    Co-authored-by: Codex <noreply@openai.com>
  • [mcp] Support MCP Apps part 3 - Add mcp tool call support. (#17364)
    - [x] Add a new app-server method so that MCP Apps can call their own
    MCP server directly.
  • fix: unblock private DNS in macOS sandbox (#17370)
    ## Summary
    - keep hostname targets proxied by default by removing hostname suffixes
    from the managed `NO_PROXY` value while preserving private/link-local
    CIDRs
    - make the macOS `allow_local_binding` sandbox rules match the local
    socket shape used by DNS tools by allowing wildcard local binds
    - allow raw DNS egress to remote port 53 only when `allow_local_binding`
    is enabled, without opening blanket outbound network access
    
    ## Root cause
    Raw DNS tools do not honor `HTTP_PROXY` or `ALL_PROXY`, so the
    proxy-only Seatbelt policy blocked their resolver traffic before it
    could reach host DNS. In the affected managed config,
    `allow_local_binding = true`, but the existing rule only allowed
    `localhost:*` binds; `dig`/BIND can bind sockets in a way that needs
    wildcard local binding. Separately, hostname suffixes in `NO_PROXY`
    could force internal hostnames to resolve locally instead of through the
    proxy path.
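    
    The `NO_PROXY` part of the change can be pictured with the sketch below.
    It is an assumption-laden illustration, not the managed-config code: the
    function names are hypothetical, and it simply keeps entries that parse
    as an IP literal or CIDR and drops everything else (hostnames and
    hostname suffixes).
    
    ```rust
    use std::net::IpAddr;
    
    /// Returns true when the entry is an IP literal or an `ip/prefix` CIDR.
    fn is_ip_or_cidr(entry: &str) -> bool {
        let (addr, prefix) = match entry.split_once('/') {
            Some((addr, prefix)) => (addr, Some(prefix)),
            None => (entry, None),
        };
        let Ok(ip) = addr.parse::<IpAddr>() else {
            return false;
        };
        match prefix {
            None => true,
            Some(p) => {
                // Prefix length must fit the address family.
                let max: u8 = if ip.is_ipv4() { 32 } else { 128 };
                p.parse::<u8>().map(|bits| bits <= max).unwrap_or(false)
            }
        }
    }
    
    /// Hypothetical filter: drop hostname entries from a comma-separated
    /// NO_PROXY value while preserving IP and CIDR entries in order.
    fn strip_hostname_entries(no_proxy: &str) -> String {
        no_proxy
            .split(',')
            .map(str::trim)
            .filter(|entry| is_ip_or_cidr(entry))
            .collect::<Vec<_>>()
            .join(",")
    }
    ```
    
    For example, an input containing `10.0.0.0/8`, `.corp.example.com`, and
    `169.254.0.0/16` would keep only the two CIDRs, so the suffix entry no
    longer forces local resolution.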
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • TUI: enforce core boundary (#17399)
    Problem: The TUI still depended on `codex-core` directly in a number of
    places, and nothing enforced the boundary to keep this problem from
    getting worse.
    
    Solution: Route TUI core access through
    `codex-app-server-client::legacy_core`, add CI enforcement for that
    boundary, and re-export this legacy bridge inside the TUI as
    `crate::legacy_core` so the remaining call sites stay readable. There is
    no functional change in this PR — just changes to import targets.
    
    Over time, we can whittle away at the remaining symbols in this legacy
    namespace with the eventual goal of removing them all. In the meantime,
    this linter rule will prevent us from inadvertently importing new
    symbols from core.
  • representing guardian review timeouts in protocol types (#17381)
    ## Summary
    
    - Add `TimedOut` to Guardian/review carrier types:
      - `ReviewDecision::TimedOut`
      - `GuardianAssessmentStatus::TimedOut`
      - app-server v2 `GuardianApprovalReviewStatus::TimedOut`
    - Regenerate app-server JSON/TypeScript schemas for the new wire shape.
    - Wire the new status through core/app-server/TUI mappings with
    conservative fail-closed handling.
    - Keep `TimedOut` non-user-selectable in the approval UI.
    
    **Does not change runtime behavior yet; emitting `TimedOut` and
    parent-model timeout messaging will come in follow-up PRs.**
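    
    A minimal sketch of what "conservative fail-closed handling" and
    "non-user-selectable" can look like in a mapping. The enum below is a
    simplified stand-in with an assumed variant set, and `may_proceed` and
    `user_selectable_decisions` are hypothetical helpers, not the real
    protocol types or TUI code.
    
    ```rust
    /// Simplified stand-in for the real decision type (assumed variant set).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ReviewDecision {
        Approved,
        Denied,
        TimedOut,
    }
    
    /// Fail closed: only an explicit approval lets the action proceed, so a
    /// timeout is handled the same way as a denial.
    fn may_proceed(decision: ReviewDecision) -> bool {
        match decision {
            ReviewDecision::Approved => true,
            ReviewDecision::Denied | ReviewDecision::TimedOut => false,
        }
    }
    
    /// Options offered in an approval UI; `TimedOut` is deliberately absent
    /// because it is produced by the system, never chosen by the user.
    fn user_selectable_decisions() -> Vec<ReviewDecision> {
        vec![ReviewDecision::Approved, ReviewDecision::Denied]
    }
    ```
    
    Matching exhaustively, without a wildcard arm, means a future variant
    forces every mapping to be revisited at compile time.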
  • Fix Windows exec-server output test flake (#17409)
    Problem: The Windows exec-server test command could let separator
    whitespace become part of `echo` output, making the exact
    retained-output assertion flaky.
    
    Solution: Tighten the Windows `cmd.exe` command by placing command
    separators directly after the echoed tokens so stdout remains
    deterministic while preserving the exact assertion.
  • Add marketplace command (#17087)
    Added a new top-level `codex marketplace add` command for installing
    plugin marketplaces into Codex’s local marketplace cache.
    
    This change adds source parsing for local directories, GitHub shorthand,
    and git URLs, supports optional `--ref` and git-only `--sparse` checkout
    paths, stages the source in a temp directory, validates the marketplace
    manifest, and installs it under
    `$CODEX_HOME/marketplaces/<marketplace-name>`.
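    
    The three source kinds could be distinguished along the lines of the
    sketch below. The `MarketplaceSource` enum, the `parse_source` function,
    and the classification rules (explicit `./`, `../`, or `/` prefix for
    local paths, `://` or `git@` for git URLs, a single `owner/repo` slash
    for GitHub shorthand) are all assumptions for illustration and may not
    match the actual CLI parser.
    
    ```rust
    use std::path::PathBuf;
    
    /// Hypothetical classification of a `codex marketplace add` source argument.
    #[derive(Debug, PartialEq, Eq)]
    enum MarketplaceSource {
        LocalDir(PathBuf),
        GitHub { owner: String, repo: String },
        GitUrl(String),
    }
    
    /// Assumed rules: explicit path prefixes are local directories, URL-like
    /// strings are git URLs, and a bare `owner/repo` is GitHub shorthand.
    fn parse_source(input: &str) -> Option<MarketplaceSource> {
        if input.starts_with("./") || input.starts_with("../") || input.starts_with('/') {
            return Some(MarketplaceSource::LocalDir(PathBuf::from(input)));
        }
        if input.contains("://") || input.starts_with("git@") {
            return Some(MarketplaceSource::GitUrl(input.to_string()));
        }
        match input.split_once('/') {
            Some((owner, repo))
                if !owner.is_empty() && !repo.is_empty() && !repo.contains('/') =>
            {
                Some(MarketplaceSource::GitHub {
                    owner: owner.to_string(),
                    repo: repo.to_string(),
                })
            }
            // Anything else is ambiguous in this sketch and is rejected.
            _ => None,
        }
    }
    ```
    
    Requiring an explicit prefix for local paths is one way to keep a
    relative directory such as `plugins/mine` from being mistaken for
    `owner/repo` shorthand.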
    
    Included tests cover local install behavior in the CLI and marketplace
    discovery from installed roots in core. Scoped formatting and fix passes
    were run, and targeted CLI/core tests passed.