Commit Graph

4222 Commits

  • [codex] Normalize Windows path in MCP startup snapshot test (#16204)
    ## Summary
    A Windows-only snapshot assertion in the app-server MCP startup warning
    test compared the raw rendered path, so CI saw `C:\tmp\project` instead
    of the normalized `/tmp/project` snapshot fixture.
    
    ## Fix
    Route that snapshot assertion through the existing
    `normalize_snapshot_paths(...)` helper so the test remains
    platform-stable.
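
The kind of normalization described above can be sketched as follows. This is not the repository's `normalize_snapshot_paths(...)` helper, whose implementation is not shown here; the name `normalize_for_snapshot` and its two rules (drop a drive prefix, convert backslashes) are assumptions made for illustration.

```rust
/// Hypothetical helper: rewrite a rendered Windows path so it matches a
/// Unix-style snapshot fixture. Assumed rules: strip a leading drive
/// prefix such as `C:` and turn backslashes into forward slashes.
fn normalize_for_snapshot(rendered: &str) -> String {
    let bytes = rendered.as_bytes();
    // A drive prefix is one ASCII letter followed by a colon.
    let has_drive = bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    let without_drive = if has_drive { &rendered[2..] } else { rendered };
    without_drive.replace('\\', "/")
}
```

With these rules, the `C:\tmp\project` path from the summary and the `/tmp/project` fixture normalize to the same string, so one snapshot can serve every platform.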
  • codex-tools: extract utility tool specs (#16154)
    ## Why
    
    The previous `codex-tools` migration steps moved the shared schema
    models, local-host specs, collaboration specs, and related adapters out
    of `codex-core`, but `core/src/tools/spec.rs` still contained a grab bag
    of pure utility tool builders. Those specs do not need session state or
    handler logic; they only describe wire shapes for tools that
    `codex-core` already knows how to execute.
    
    Moving that remaining low-coupling layer into `codex-tools` keeps the
    migration moving in meaningful chunks and trims another large block of
    passive tool-spec construction out of `codex-core` without touching the
    runtime-coupled handlers.
    
    ## What changed
    
    - extended `codex-tools` to own the pure spec builders for:
      - code-mode `exec` / `wait`
      - `js_repl` / `js_repl_reset`
      - MCP resource tools `list_mcp_resources`,
        `list_mcp_resource_templates`, and `read_mcp_resource`
      - utility tools `list_dir` and `test_sync_tool`
    - split those builders across small module files with sibling
    `*_tests.rs` coverage, keeping `src/lib.rs` exports-only
    - rewired `core/src/tools/spec.rs` to call the extracted builders and
    deleted the duplicated core-local implementations
    - moved the direct JS REPL grammar seam test out of
    `core/src/tools/spec_tests.rs` so it now lives with the extracted
    implementation in `codex-tools`
    - updated `codex-rs/tools/README.md` so the documented crate boundary
    matches the new utility-spec surface
    
    ## Test plan
    
    - `CARGO_TARGET_DIR=/tmp/codex-tools-utility-specs cargo test -p
    codex-tools`
    - `CARGO_TARGET_DIR=/tmp/codex-core-utility-specs cargo test -p
    codex-core --lib tools::spec::`
    - `just fix -p codex-tools -p codex-core`
    - `just argument-comment-lint`
    
    ## References
    
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
    - #16047
    - #16129
    - #16132
    - #16138
    - #16141
  • Fix tui_app_server ghost subagent entries in /agent (#16110)
    Fixes #16092
    
    The app-server-backed TUI could accumulate ghost subagent entries in
    `/agent` after resume/backfill flows. Some of those rows were no longer
    live according to the backend, but still appeared selectable in the
    picker and could open as blank threads.
    
    *Cause*
    Unlike the legacy tui behavior, tui_app_server was creating local
    picker/replay state for subagents discovered through metadata refresh
    and loaded-thread backfill, even when no real local session or
    transcript had been attached. That let stale ids survive in the picker
    as if they were replayable threads.
    
    *Fix*
    Stop creating empty local thread channels during subagent metadata
    hydration and loaded-thread backfill.
    When opening /agent, prune metadata-only entries that thread/read
    reports as terminally unavailable.
    When selecting a discovered subagent that is still live but not yet
    locally attached, materialize a real local session on demand from
    thread/read instead of falling back to an empty replay state.
  • Fix app-server TUI MCP startup warnings regression (#16041)
    This addresses #16038
    
    The default `tui_app_server` path stopped surfacing MCP startup failures
    during cold start, even though the legacy TUI still showed warnings like
    `MCP startup incomplete (...)`. The app-server bridge emitted per-server
    startup status notifications, but `tui_app_server` ignored them, so
    failed MCP handshakes could look like a clean startup.
    
    This change teaches `tui_app_server` to consume MCP startup status
    notifications, preserve the immediate per-server failure warning, and
    synthesize the same aggregate startup warning the legacy TUI shows once
    startup settles.
  • codex-tools: extract collaboration tool specs (#16141)
    ## Why
    
    The recent `codex-tools` migration steps have moved shared tool models
    and low-coupling spec helpers out of `codex-core`, but
    `core/src/tools/spec.rs` still owned a large block of pure
    collaboration-tool spec construction. Those builders do not need session
    state or runtime behavior; they only need a small amount of core-owned
    configuration injected at the seam.
    
    Moving that cohesive slice into `codex-tools` makes the crate boundary
    more honest and removes a substantial amount of passive tool-spec logic
    from `codex-core` without trying to move the runtime-coupled multi-agent
    handlers at the same time.
    
    ## What changed
    
    - added `agent_tool.rs`, `request_user_input_tool.rs`, and
    `agent_job_tool.rs` to `codex-tools`, with sibling `*_tests.rs` coverage
    and an exports-only `lib.rs`
    - moved the pure `ToolSpec` builders for:
      - collaboration tools such as `spawn_agent`, `send_input`,
        `send_message`, `assign_task`, `resume_agent`, `wait_agent`,
        `list_agents`, and `close_agent`
      - `request_user_input`
      - agent-job specs `spawn_agents_on_csv` and `report_agent_job_result`
    - rewired `core/src/tools/spec.rs` to call the extracted builders while
    still supplying the core-owned inputs, such as spawn-agent role
    descriptions and wait timeout bounds
    - updated the `core/src/tools/spec.rs` seam tests to build expected
    collaboration specs through `codex-tools`
    - updated `codex-rs/tools/README.md` so the crate documentation reflects
    the broader collaboration-tool boundary
    
    ## Test plan
    
    - `CARGO_TARGET_DIR=/tmp/codex-tools-collab-specs cargo test -p
    codex-tools`
    - `CARGO_TARGET_DIR=/tmp/codex-core-collab-specs cargo test -p
    codex-core --lib tools::spec::`
    - `just fix -p codex-tools -p codex-core`
    - `just argument-comment-lint`
    
    ## References
    
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
    - #16047
    - #16129
    - #16132
    - #16138
  • [mcp] Increase MCP startup timeout. (#16080)
    - [x] Increase MCP startup timeout to 30s, as the current 10s causes a
    lot of local MCPs to time out.
  • Remove TUI voice transcription feature (#16114)
    Removes the partially-completed TUI composer voice transcription flow,
    including its feature flag, app events, and hold-to-talk state machine.
  • codex-tools: extract local host tool specs (#16138)
    ## Why
    
    `core/src/tools/spec.rs` still bundled a set of pure local-host tool
    builders with the orchestration that actually decides when those tools
    are exposed and which handlers back them. That made `codex-core`
    responsible for JSON/tool-shape construction that does not depend on
    session state, and it kept the `codex-tools` migration from taking a
    meaningfully larger bite out of `spec.rs`.
    
    This PR moves that reusable spec-building layer into `codex-tools` while
    leaving feature gating, handler registration, and runtime-coupled
    descriptions in `codex-core`.
    
    ## What changed
    
    - added `codex-rs/tools/src/local_tool.rs` for the pure builders for
    `exec_command`, `write_stdin`, `shell`, `shell_command`, and
    `request_permissions`
    - added `codex-rs/tools/src/view_image.rs` for the `view_image` tool
    spec and output schema so the extracted modules stay right-sized
    - rewired `codex-rs/core/src/tools/spec.rs` to call those extracted
    builders instead of constructing these specs inline
    - kept the `request_permissions` description source in `codex-core`,
    with `codex-tools` taking the description as input so the crate boundary
    does not grow a dependency on handler/runtime code
    - moved the direct constructor coverage for this slice from
    `codex-rs/core/src/tools/spec_tests.rs` into
    `codex-rs/tools/src/local_tool_tests.rs` and
    `codex-rs/tools/src/view_image_tests.rs`
    - updated `codex-rs/tools/README.md` to reflect that `codex-tools` now
    owns this local-host spec layer
    
    ## Test plan
    
    - `CARGO_TARGET_DIR=/tmp/codex-tools-local-host cargo test -p
    codex-tools`
    - `CARGO_TARGET_DIR=/tmp/codex-core-local-tools cargo test -p codex-core
    --lib tools::spec::`
    - `just argument-comment-lint`
    
    ## References
    
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
    - #16047
    - #16129
    - #16132
  • Fix skills picker scrolling in tui app server (#16109)
    Fixes #16091.
    
    The app-server TUI was truncating the filtered mention candidate list to
    `MAX_POPUP_ROWS`, so the `$` skills picker only exposed the first 8
    matches. That made it look like many skills were missing and prevented
    keyboard navigation beyond the first page, even though direct
    `$skill-name` insertion still worked.
    
    Testing: I manually verified the regression and confirmed the fix.
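
One way to keep every match reachable is to scroll a fixed-size window over the full candidate list instead of truncating the list itself. The sketch below is an assumption about the general technique, not the actual fix; `visible_window` is a hypothetical name, and the value 8 for `MAX_POPUP_ROWS` is inferred from the description above.

```rust
/// Maximum rows shown at once; 8 is inferred from the commit text.
const MAX_POPUP_ROWS: usize = 8;

/// Hypothetical helper: return the slice of `candidates` that should be
/// visible so that `selected` stays on screen, rather than cutting the
/// list down to its first `MAX_POPUP_ROWS` entries.
fn visible_window<'a>(candidates: &'a [&'a str], selected: usize) -> &'a [&'a str] {
    if candidates.is_empty() {
        return candidates;
    }
    // Clamp the selection to the last valid index.
    let selected = selected.min(candidates.len() - 1);
    // Scroll just far enough that the selected row is the last visible one.
    let start = selected.saturating_sub(MAX_POPUP_ROWS - 1);
    let end = (start + MAX_POPUP_ROWS).min(candidates.len());
    &candidates[start..end]
}
```

Only the rendering is windowed here; keyboard navigation can still move `selected` across the whole filtered list.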
  • exec: make review-policy tests hermetic (#16137)
    ## Why
    
    `thread_start_params_from_config()` is supposed to forward the effective
    `approvals_reviewer` into the app-server request, but these tests were
    constructing that config through `ConfigBuilder::build()`, which also
    loads ambient system and managed config layers. On machines with an
    admin or host-level reviewer override, the manual-only case could
    inherit `guardian_subagent` and fail even though the exec-side mapping
    was correct.
    
    ## What changed
    
    - Set `approvals_reviewer` explicitly via `harness_overrides` in the two
    `thread_start_params_*review_policy*` tests in
    `codex-rs/exec/src/lib.rs`.
    - Removed the dependence on default config resolution and temp
    `config.toml` writes so the tests exercise only the reviewer-to-request
    mapping in `codex-exec`.
    
    ## Testing
    
    - `cargo test -p codex-exec`
  • codex-tools: extract code mode tool spec adapters (#16132)
    ## Why
    
    The longer-term `codex-tools` migration is to move pure tool-definition
    and tool-spec plumbing out of `codex-core` while leaving session- and
    runtime-coupled orchestration behind.
    
    The remaining code-mode adapter layer in
    `core/src/tools/code_mode_description.rs` was a good next extraction
    seam because it only transformed `ToolSpec` values for code mode and
    already delegated the low-level description rendering to
    `codex-code-mode`.
    
    ## What Changed
    
    - added `codex-rs/tools/src/code_mode.rs` with
    `augment_tool_spec_for_code_mode()` and
    `tool_spec_to_code_mode_tool_definition()`
    - added focused unit coverage in `codex-rs/tools/src/code_mode_tests.rs`
    - rewired `core/src/tools/spec.rs` and `core/src/tools/code_mode/mod.rs`
    to use the extracted adapters from `codex-tools`
    - removed the old `core/src/tools/code_mode_description.rs` shim and its
    test file from `codex-core`
    - added the `codex-code-mode` dependency to `codex-tools`, updated
    `Cargo.lock`, and refreshed the `codex-tools` README to reflect the
    expanded boundary
    
    ## Test Plan
    
    - `cargo test -p codex-tools`
    - `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
    codex-core --lib tools::spec::`
    - `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
    codex-core --lib tools::code_mode::`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just argument-comment-lint`
    
    ## References
    
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
    - #16047
    - #16129
  • core: fix stale curated plugin cache refresh races (#16126)
    ## Why
    
    The `plugin/list` force-sync path can race app-server startup's curated
    plugin cache refresh.
    
    Startup was capturing the configured curated plugin IDs from the initial
    config snapshot. If `plugin/list` with `forceRemoteSync` removed curated
    plugin entries from `config.toml` while that background refresh was
    still in flight, the startup task could recreate cache directories for
    plugins that had just been uninstalled.
    
    That leaves the `plugin/list` response logically correct but the on-disk
    cache stale, which matches the flaky Ubuntu arm failure seen in
    `codex-app-server::all
    suite::v2::plugin_list::plugin_list_force_remote_sync_reconciles_curated_plugin_state`
    while validating [#16047](https://github.com/openai/codex/pull/16047).
    
    ## What
    
    - change `codex-rs/core/src/plugins/manager.rs` so startup curated-repo
    refresh rereads the current user `config.toml` before deciding which
    curated plugin cache entries to refresh
    - factor the configured-plugin parsing so the same logic can be reused
    from either the config layer stack or the persisted user config value
    - add a regression test that verifies curated plugin IDs are read from
    the latest user config state before cache refresh runs
    
    ## Testing
    
    - `cargo test -p codex-core
    configured_curated_plugin_ids_from_codex_home_reads_latest_user_config
    -- --nocapture`
    - `cargo test -p codex-app-server
    suite::v2::plugin_list::plugin_list_force_remote_sync_reconciles_curated_plugin_state
    -- --nocapture`
    - `just argument-comment-lint`
  • codex-tools: extract configured tool specs (#16129)
    ## Why
    
    This continues the `codex-tools` migration by moving another passive
    tool-spec layer out of `codex-core`.
    
    After `ToolSpec` moved into `codex-tools`, `codex-core` still owned
    `ConfiguredToolSpec` and `create_tools_json_for_responses_api()`. Both
    are data-model and serialization helpers rather than runtime
    orchestration, so keeping them in `core/src/tools/registry.rs` and
    `core/src/tools/spec.rs` left passive tool-definition code coupled to
    `codex-core` longer than necessary.
    
    ## What changed
    
    - moved `ConfiguredToolSpec` into `codex-rs/tools/src/tool_spec.rs`
    - moved `create_tools_json_for_responses_api()` into
    `codex-rs/tools/src/tool_spec.rs`
    - re-exported the new surface from `codex-rs/tools/src/lib.rs`, which
    remains exports-only
    - updated `core/src/client.rs`, `core/src/tools/registry.rs`, and
    `core/src/tools/router.rs` to consume the extracted types and serializer
    from `codex-tools`
    - moved the tool-list serialization test into
    `codex-rs/tools/src/tool_spec_tests.rs`
    - added focused unit coverage for `ConfiguredToolSpec::name()`
    - simplified `core/src/tools/spec_tests.rs` to use the extracted
    `ConfiguredToolSpec::name()` directly and removed the now-redundant
    local `tool_name()` helper
    - updated `codex-rs/tools/README.md` so the crate boundary reflects the
    newly extracted tool-spec wrapper and serialization helper
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `CARGO_TARGET_DIR=/tmp/codex-core-configured-spec cargo test -p
    codex-core --lib tools::spec::`
    - `CARGO_TARGET_DIR=/tmp/codex-core-configured-spec cargo test -p
    codex-core --lib client::`
    - `just fix -p codex-tools -p codex-core`
    - `just argument-comment-lint`
    
    ## References
    
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
    - #16047
  • codex-tools: extract tool spec models (#16047)
    ## Why
    
    This continues the `codex-tools` migration by moving another passive
    tool-definition layer out of `codex-core`.
    
    After `ResponsesApiTool` and the lower-level schema adapters moved into
    `codex-tools`, `core/src/client_common.rs` was still owning `ToolSpec`
    and the web-search request wire types even though they are serialized
    data models rather than runtime orchestration. Keeping those types in
    `codex-core` makes the crate boundary look smaller than it really is and
    leaves non-runtime tool-shape code coupled to core.
    
    ## What changed
    
    - moved `ToolSpec`, `ResponsesApiWebSearchFilters`, and
    `ResponsesApiWebSearchUserLocation` into
    `codex-rs/tools/src/tool_spec.rs`
    - added focused unit tests in `codex-rs/tools/src/tool_spec_tests.rs`
    for:
      - `ToolSpec::name()`
      - web-search config conversions
      - `ToolSpec` serialization for `web_search` and `tool_search`
    - kept `codex-rs/tools/src/lib.rs` exports-only by re-exporting the new
    module from `lib.rs`
    - reduced `core/src/client_common.rs` to a compatibility shim that
    re-exports the extracted tool-spec types for current core call sites
    - updated `core/src/tools/spec_tests.rs` to consume the extracted
    web-search types directly from `codex-tools`
    - updated `codex-rs/tools/README.md` so the crate contract reflects that
    `codex-tools` now owns the passive tool-spec request models in addition
    to the lower-level Responses API structs
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core --lib tools::spec::`
    - `cargo test -p codex-core --lib client_common::`
    - `just fix -p codex-tools -p codex-core`
    - `just argument-comment-lint`
    
    ## References
    
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
  • Remove the codex-tui app-server originator workaround (#16116)
    ## Summary
    - remove the temporary `codex-tui` special-case when setting the default
    originator during app-server initialization
  • Remove remaining custom prompt support (#16115)
    ## Summary
    - remove protocol and core support for discovering and listing custom
    prompts
    - simplify the TUI slash-command flow and command popup to built-in
    commands only
    - delete obsolete custom prompt tests, helpers, and docs references
    - clean up downstream event handling for the removed protocol events
  • build: migrate argument-comment-lint to a native Bazel aspect (#16106)
    ## Why
    
    `argument-comment-lint` had become a PR bottleneck because the repo-wide
    lane was still effectively running a `cargo dylint`-style flow across
    the workspace instead of reusing Bazel's Rust dependency graph. That
    kept the lint enforced, but it threw away the main benefit of moving
    this job under Bazel in the first place: metadata reuse and cacheable
    per-target analysis in the same shape as Clippy.
    
    This change moves the repo-wide lint onto a native Bazel Rust aspect so
    Linux and macOS can lint `codex-rs` without rebuilding the world
    crate-by-crate through the wrapper path.
    
    ## What Changed
    
    - add a nightly Rust toolchain with `rustc-dev` for Bazel and a
    dedicated crate-universe repo for `tools/argument-comment-lint`
    - add `tools/argument-comment-lint/driver.rs` and
    `tools/argument-comment-lint/lint_aspect.bzl` so Bazel can run the lint
    as a custom `rustc_driver`
    - switch repo-wide `just argument-comment-lint` and the Linux/macOS
    `rust-ci` lanes to `bazel build --config=argument-comment-lint
    //codex-rs/...`
    - keep the Python/DotSlash wrappers as the package-scoped fallback path
    and as the current Windows CI path
    - gate the Dylint entrypoint behind a `bazel_native` feature so the
    Bazel-native library avoids the `dylint_*` packaging stack
    - update the aspect runtime environment so the driver can locate
    `rustc_driver` correctly under remote execution
    - keep the dedicated `tools/argument-comment-lint` package tests and
    wrapper unit tests in CI so the source and packaged entrypoints remain
    covered
    
    ## Verification
    
    - `python3 -m unittest discover -s tools/argument-comment-lint -p
    'test_*.py'`
    - `cargo test` in `tools/argument-comment-lint`
    - `bazel build
    //tools/argument-comment-lint:argument-comment-lint-driver
    --@rules_rust//rust/toolchain/channel=nightly`
    - `bazel build --config=argument-comment-lint
    //codex-rs/utils/path-utils:all`
    - `bazel build --config=argument-comment-lint
    //codex-rs/rollout:rollout`
    
  • fix: fix comment linter lint violations in Linux-only code (#16118)
    https://github.com/openai/codex/pull/16071 took care of this for
    Windows, so this takes care of things for Linux.
    
    We don't touch the CI jobs in this PR because
    https://github.com/openai/codex/pull/16106 is going to be the real fix
    there (including a major speedup!).
  • Rename tui_app_server to tui (#16104)
    This is a follow-up to https://github.com/openai/codex/pull/15922. That
    previous PR deleted the old `tui` directory and left the new
    `tui_app_server` directory in place. This PR renames `tui_app_server` to
    `tui` and fixes up all references.
  • fix(tui): refresh footer on collaboration mode changes (#16026)
    ## Summary
    - Moves status surface refresh (`refresh_status_surfaces` /
    `refresh_status_line`) from `App` event handlers into `ChatWidget`
    setters via a new `refresh_model_dependent_surfaces()` method
    - Ensures model-dependent UI stays in sync whenever collaboration mode,
    model, or reasoning effort changes, including the footer and terminal
    title in both `tui` and `tui_app_server`
    - Applies the fix to both `tui` and `tui_app_server` widgets
    
    #15961
    
    ## Test plan
    - [x] Added snapshot test
    `status_line_model_with_reasoning_plan_mode_footer` verifying footer
    renders correctly in plan mode
    - [x] Added
    `terminal_title_model_updates_on_model_change_without_manual_refresh` in
    `tui_app_server`
    - [ ] Verify switching collaboration modes updates the footer in real
    TUI
    - [ ] Verify model/reasoning effort changes reflect in the status bar
    and terminal title
    
    ---------
    
    Co-authored-by: Eric Traut <etraut@openai.com>
  • fix: clean up remaining Windows argument-comment-lint violations (#16071)
    ## Why
    
    The initial `argument-comment-lint` rollout left Windows on
    default-target coverage because there were still Windows-only callsites
    failing under `--all-targets`. This follow-up cleans up those remaining
    Windows-specific violations so the Windows CI lane can enforce the same
    stricter coverage, leaving Linux as the remaining platform-specific
    follow-up.
    
    ## What changed
    
    - switched the Windows `rust-ci` argument-comment-lint step back to the
    default wrapper invocation so it runs full-target coverage again
    - added the required `/*param_name*/` annotations at Windows-gated
    literal callsites in:
      - `codex-rs/windows-sandbox-rs/src/lib.rs`
      - `codex-rs/windows-sandbox-rs/src/elevated_impl.rs`
      - `codex-rs/tui_app_server/src/multi_agents.rs`
      - `codex-rs/network-proxy/src/proxy.rs`
    
    ## Validation
    
    - Windows `argument comment lint` CI on this PR
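
For readers unfamiliar with the `/*param_name*/` style, a minimal sketch of an annotated callsite follows. The function `lookup_params` and its parameters are hypothetical, and the exact rules the lint applies (for example, which literals must be annotated) are not shown in this log.

```rust
/// Hypothetical function that takes literal-friendly parameters.
fn lookup_params(include_archived: bool, limit: usize) -> String {
    format!("include_archived={include_archived} limit={limit}")
}

/// Callsite annotated in the `/*param_name*/` style: each literal
/// argument is preceded by a block comment naming the parameter.
fn annotated_call() -> String {
    lookup_params(/*include_archived*/ true, /*limit*/ 20)
}
```

The comments do not change behavior; they only make a bare `true` or `20` self-describing at the callsite.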
  • shell-command: reuse a PowerShell parser process on Windows (#16057)
    ## Why
    
    `//codex-rs/shell-command:shell-command-unit-tests` became a real
    bottleneck in the Windows Bazel lane because repeated calls to
    `is_safe_command_windows()` were starting a fresh PowerShell parser
    process for every `powershell.exe -Command ...` assertion.
    
    PR #16056 was motivated by that same bottleneck, but its test-only
    shortcut was the wrong layer to optimize because it weakened the
    end-to-end guarantee that our runtime path really asks PowerShell to
    parse the command the way we expect.
    
    This PR attacks the actual cost center instead: it keeps the real
    PowerShell parser in the loop, but turns that parser into a long-lived
    helper process so both tests and the runtime safe-command path can reuse
    it across many requests.
    
    ## What Changed
    
    - add `shell-command/src/command_safety/powershell_parser.rs`, which
    keeps one mutex-protected parser process per PowerShell executable path
    and speaks a simple JSON-over-stdio request/response protocol
    - turn `shell-command/src/command_safety/powershell_parser.ps1` into a
    long-running parser server with comments explaining the protocol, the
    AST-shape restrictions, and why unsupported constructs are rejected
    conservatively
    - keep request ids and a one-time respawn path so a dead or
    desynchronized cached child fails closed instead of silently returning
    mixed parser output
    - preserve separate parser processes for `powershell.exe` and
    `pwsh.exe`, since they do not accept the same language surface
    - avoid a direct `PipelineChainAst` type reference in the PowerShell
    script so the parser service still runs under Windows PowerShell 5.1 as
    well as newer `pwsh`
    - make `shell-command/src/command_safety/windows_safe_commands.rs`
    delegate to the new parser utility instead of spawning a fresh
    PowerShell process for every parse
    - add a Windows-only unit test that exercises multiple sequential
    requests against the same parser process
    
    ## Testing
    
    - adds a Windows-only parser-reuse unit test in `powershell_parser.rs`
    - the main end-to-end verification for this change is the Windows CI
    lane, because the new service depends on real `powershell.exe` /
    `pwsh.exe` behavior
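
The reuse idea can be sketched with a fake parser in place of a real child process. Everything here is hypothetical: `FakeParser`, `ParserPool`, and the reply format are illustration only, a single mutex guards the whole map for brevity (the PR describes one mutex-protected process per executable path), and the respawn path is omitted.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Stand-in for a long-lived parser child process.
struct FakeParser {
    next_id: u64,
}

impl FakeParser {
    fn parse(&mut self, script: &str) -> Result<String, String> {
        let request_id = self.next_id;
        self.next_id += 1;
        // A real helper would write a JSON request and read a JSON reply.
        let (reply_id, reply) = (request_id, format!("parsed:{script}"));
        // Fail closed if the reply does not belong to this request.
        if reply_id != request_id {
            return Err("parser desynchronized".to_string());
        }
        Ok(reply)
    }
}

#[derive(Default)]
struct PoolState {
    parsers: HashMap<String, FakeParser>,
    spawned: usize,
}

/// Caches one parser per executable path so repeated parses reuse it.
#[derive(Default)]
struct ParserPool {
    state: Mutex<PoolState>,
}

impl ParserPool {
    fn parse(&self, exe: &str, script: &str) -> Result<String, String> {
        let mut guard = self
            .state
            .lock()
            .map_err(|_| "pool lock poisoned".to_string())?;
        let state = &mut *guard;
        // Spawn only when this executable has no cached parser yet.
        if !state.parsers.contains_key(exe) {
            state.parsers.insert(exe.to_string(), FakeParser { next_id: 0 });
            state.spawned += 1;
        }
        let result = match state.parsers.get_mut(exe) {
            Some(parser) => parser.parse(script),
            None => Err("parser missing".to_string()),
        };
        result
    }

    /// Number of parsers created so far.
    fn spawned(&self) -> usize {
        self.state.lock().map(|state| state.spawned).unwrap_or(0)
    }
}
```

Keying the cache by executable mirrors the point above that `powershell.exe` and `pwsh.exe` need separate parser processes.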
  • Support Codex CLI stdin piping for codex exec (#15917)
    # Summary
    
    Claude Code supports a useful prompt-plus-stdin workflow:
    
    ```bash
    echo "complex input..." | claude -p "summarize concisely"
    ```
    
    Codex previously did not support the equivalent `codex exec` form. While
    `codex exec` could read the prompt from stdin, it could not combine
    piped input with an explicit prompt argument.
    
    This change adds that missing workflow:
    
    ```bash
    echo "complex input..." | codex exec "summarize concisely"
    ```
    
    With this change, when `codex exec` receives both a positional prompt
    and piped stdin, the prompt remains the instruction and stdin is passed
    along as structured `<stdin>...</stdin>` context.
    
    Example:
    
    ```bash
    curl https://jsonplaceholder.typicode.com/comments \
      | ./target/debug/codex exec --skip-git-repo-check "format the top 20 items into a markdown table" \
      > table.md
    ```
    
    This PR also adds regression coverage for:
    - prompt argument + piped stdin
    - legacy stdin-as-prompt behavior
    - `codex exec -` forced-stdin behavior
    - empty-stdin error cases
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
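
The four covered cases can be modeled as a small decision function. This is a sketch under stated assumptions, not the `codex exec` implementation: `resolve_prompt` is a hypothetical name, the error strings are invented, the exact layout around the `<stdin>` block is a guess, and treating empty piped input alongside a prompt as "prompt only" is an assumption.

```rust
/// Hypothetical model of how the prompt might be chosen.
/// `arg` is the positional prompt, `stdin` is piped input (if any).
fn resolve_prompt(arg: Option<&str>, stdin: Option<&str>) -> Result<String, String> {
    match (arg, stdin) {
        // Legacy stdin-as-prompt, and `-` forcing the prompt from stdin.
        (Some("-"), Some(text)) | (None, Some(text)) => {
            if text.trim().is_empty() {
                Err("no prompt provided via stdin".to_string())
            } else {
                Ok(text.to_string())
            }
        }
        (Some("-"), None) | (None, None) => Err("no prompt provided".to_string()),
        // Prompt argument plus piped stdin: the prompt stays the
        // instruction and stdin is attached as a tagged context block.
        (Some(prompt), Some(text)) if !text.is_empty() => {
            Ok(format!("{prompt}\n\n<stdin>\n{text}\n</stdin>"))
        }
        // Prompt argument with no (or empty) piped input.
        (Some(prompt), _) => Ok(prompt.to_string()),
    }
}
```

Keeping the instruction and the piped data in separate, labeled parts is what lets the prompt argument stay authoritative when both are present.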
  • chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
    ## Why
    
    `argument-comment-lint` was green in CI even though the repo still had
    many uncommented literal arguments. The main gap was target coverage:
    the repo wrapper did not force Cargo to inspect test-only call sites, so
    examples like the `latest_session_lookup_params(true, ...)` tests in
    `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.
    
    This change cleans up the existing backlog, makes the default repo lint
    path cover all Cargo targets, and starts rolling that stricter CI
    enforcement out on the platform where it is currently validated.
    
    ## What changed
    
    - mechanically fixed existing `argument-comment-lint` violations across
    the `codex-rs` workspace, including tests, examples, and benches
    - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
    `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
    `--all-targets` unless the caller explicitly narrows the target set
    - fixed both wrappers so forwarded cargo arguments after `--` are
    preserved with a single separator
    - documented the new default behavior in
    `tools/argument-comment-lint/README.md`
    - updated `rust-ci` so the macOS lint lane keeps the plain wrapper
    invocation and therefore enforces `--all-targets`, while Linux and
    Windows temporarily pass `-- --lib --bins`
    
    That temporary CI split keeps the stricter all-targets check where it is
    already cleaned up, while leaving room to finish the remaining Linux-
    and Windows-specific target-gated cleanup before enabling
    `--all-targets` on those runners. The Linux and Windows failures on the
    intermediate revision were caused by the wrapper forwarding bug, not by
    additional lint findings in those lanes.
    
    ## Validation
    
    - `bash -n tools/argument-comment-lint/run.sh`
    - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
    - shell-level wrapper forwarding check for `-- --lib --bins`
    - shell-level wrapper forwarding check for `-- --tests`
    - `just argument-comment-lint`
    - `cargo test` in `tools/argument-comment-lint`
    - `cargo test -p codex-terminal-detection`
    
    ## Follow-up
    
    - Clean up remaining Linux-only target-gated callsites, then switch the
    Linux lint lane back to the plain wrapper invocation.
    - Clean up remaining Windows-only target-gated callsites, then switch
    the Windows lint lane back to the plain wrapper invocation.
  • Fix tui_app_server agent picker closed-state regression (#16014)
    Addresses #15992
    
    The app-server TUI was treating tracked agent threads as closed based on
    listener-task bookkeeping that does not reflect live thread state during
    normal thread switching. That caused the `/agent` picker to gray out
    live agents and could show a false "Agent thread ... is closed" replay
    message after switching branches.
    
    This PR fixes the picker refresh path to query the app server for each
    tracked thread and derive closed vs loaded state from `thread/read`
    status, while preserving cached agent metadata for replay-only threads.
  • Fix tui_app_server resume-by-name lookup regression (#16050)
    Addresses #16049
    
    `codex resume <name>` and `/resume <name>` could fail in the app-server
    TUI path because name lookup pre-filtered `thread/list` with the backend
    `search_term`, but saved thread names are hydrated after listing and are
    not part of that search index. Resolve names by scanning listed threads
    client-side instead, and add a regression test for saved sessions whose
    rollout title does not match the thread name.
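
A client-side scan of this kind might look like the sketch below. `ListedThread` and `find_thread_id_by_name` are hypothetical names, and exact-match comparison is an assumption; the real lookup may normalize or rank names differently.

```rust
/// Minimal stand-in for an entry returned by a thread listing.
struct ListedThread {
    id: String,
    /// Saved name, hydrated after listing; `None` if the thread is unnamed.
    name: Option<String>,
}

/// Hypothetical lookup: scan every listed thread for a name match
/// instead of asking the backend to pre-filter by search term.
fn find_thread_id_by_name<'a>(threads: &'a [ListedThread], wanted: &str) -> Option<&'a str> {
    threads
        .iter()
        .find(|thread| thread.name.as_deref() == Some(wanted))
        .map(|thread| thread.id.as_str())
}
```

Because the comparison runs after names are hydrated, it no longer depends on what the backend search index contains.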
  • Remove the legacy TUI split (#15922)
    This is part 1 of 2 PRs that will delete the `tui` /
    `tui_app_server` split. This part simply deletes the existing `tui`
    directory and marks the `tui_app_server` feature flag as removed. I left
    the `tui_app_server` feature flag in place for now so its presence
    doesn't result in an error. It is simply ignored.
    
    Part 2 will rename the `tui_app_server` directory to `tui`. I did this as
    two parts to reduce visible code churn.
  • don't include redundant write roots in apply_patch (#16030)
    apply_patch sometimes provides an additional parent dir as a writable root
    when it is already writable. This is mostly a no-op on Mac/Linux but
    causes actual ACL churn on Windows that is best avoided. We are also
    seeing some actual failures with these ACLs in the wild, which I haven't
    fully tracked down, but it's safe/best to avoid doing it altogether.
  • [mcp] Bypass read-only tool checks. (#16044)
    - [x] Auto / unspecified approval mode: read-only tools now skip before
    guardian routing.
    - [x] Approve / always-allow mode: read-only tools still skip, now via
    the shared early return.
    - [x] Prompt mode: read-only tools no longer skip; they continue to
    approval.
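    
    The three cases above can be sketched as a single decision function. All names here (`ApprovalMode`, `Gate`, `decide_read_only_gate`) are hypothetical, and the sketch only models the read-only shortcut; whatever routing applies to other tools is treated as out of scope.

```rust
/// Hypothetical approval modes; `None` stands for "unspecified".
#[derive(Debug, Clone, Copy, PartialEq)]
enum ApprovalMode {
    Auto,
    Approve,
    Prompt,
}

/// Outcome of the read-only shortcut only.
#[derive(Debug, PartialEq)]
enum Gate {
    /// The tool call skips approval entirely.
    SkipApproval,
    /// The call continues to the normal approval / guardian routing.
    FallThrough,
}

fn decide_read_only_gate(mode: Option<ApprovalMode>, read_only: bool) -> Gate {
    match (read_only, mode) {
        // Prompt mode: read-only tools no longer skip.
        (true, Some(ApprovalMode::Prompt)) => Gate::FallThrough,
        // Auto, unspecified, and Approve share one early return.
        (true, _) => Gate::SkipApproval,
        // Non-read-only tools are not affected by this shortcut.
        (false, _) => Gate::FallThrough,
    }
}
```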
  • Fix /copy regression in tui_app_server turn completion (#16021)
    Addresses #16019
    
    `tui_app_server` renders completed assistant messages from item
    notifications, but it only updated `/copy` state from `turn/completed`.
    After the app-server migration, turn completion no longer repeats the
    final assistant text, so `/copy` could stay unavailable even after the
    first normal response.
    
    This PR tracks the last completed final-answer agent message during an
    active app-server turn and promotes it into the `/copy` cache when the
    turn completes. This restores the pre-migration behavior without
    changing rollback handling.
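    
    A rough sketch of the track-then-promote idea, assuming a small state holder. `CopyState` and its method names are hypothetical and do not reflect the actual `tui_app_server` types; keeping the previous cache when a turn produces no final answer is also an assumption of this sketch.

```rust
/// Hypothetical state backing `/copy`.
#[derive(Default)]
struct CopyState {
    /// Last completed final-answer message seen during the active turn.
    pending: Option<String>,
    /// Text that `/copy` would return.
    cache: Option<String>,
}

impl CopyState {
    /// Called for each completed final-answer item notification.
    fn on_final_answer_completed(&mut self, text: &str) {
        self.pending = Some(text.to_string());
    }

    /// Called on `turn/completed`, which no longer carries the final text.
    fn on_turn_completed(&mut self) {
        if let Some(text) = self.pending.take() {
            self.cache = Some(text);
        }
    }

    fn copy_text(&self) -> Option<&str> {
        self.cache.as_deref()
    }
}
```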
  • Fix tui_app_server hook notification rendering and replay (#16013)
    Addresses #15984
    
    HookStarted/HookCompleted notifications were being translated through a
    fragile JSON bridge, so hook status/output never reached the renderer.
    Early hook notifications could also be dropped during session refresh
    before replay.
    
    This PR fixes `tui_app_server` by mapping app-server hook notifications
    into TUI hook events explicitly and preserving buffered hook
    notifications across refresh, so cold-start and resumed sessions render
    the same hook UI as the legacy TUI.
  • codex-tools: extract responses API tool models (#16031)
    ## Why
    
    The previous extraction steps moved shared tool-schema parsing into
    `codex-tools`, but `codex-core` still owned the generic Responses API
    tool models and the last adapter layer that turned parsed tool
    definitions into `ResponsesApiTool` values.
    
    That left `core/src/tools/spec.rs` and `core/src/client_common.rs`
    holding a chunk of tool-shaping code that does not need session state,
    runtime plumbing, or any other `codex-core`-specific dependency. As a
    result, `codex-tools` owned the parsed tool definition, but `codex-core`
    still owned the generic wire model that those definitions are converted
    into.
    
    This change moves that boundary one step further. `codex-tools` now owns
    the reusable Responses/tool wire structs and the shared conversion
    helpers for dynamic tools, MCP tools, and deferred MCP aliases.
    `codex-core` continues to own `ToolSpec` orchestration and the remaining
    web-search-specific request shapes.
    
    ## What changed
    
    - added `tools/src/responses_api.rs` to own `ResponsesApiTool`,
    `FreeformTool`, `ToolSearchOutputTool`, namespace output types, and the
    shared `ToolDefinition -> ResponsesApiTool` adapter helpers
    - added `tools/src/responses_api_tests.rs` for deferred-loading
    behavior, adapter coverage, and namespace serialization coverage
    - rewired `core/src/tools/spec.rs` to use the extracted dynamic/MCP
    adapter helpers instead of defining those conversions locally
    - rewired `core/src/tools/handlers/tool_search.rs` to use the extracted
    deferred MCP adapter and namespace output types directly
    - slimmed `core/src/client_common.rs` so it now keeps `ToolSpec` and the
    web-search-specific wire types, while reusing the extracted tool models
    from `codex-tools`
    - moved the extracted seam tests out of `core` and updated
    `codex-rs/tools/README.md` plus `tools/src/lib.rs` to reflect the
    expanded `codex-tools` boundary
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core --lib tools::spec::`
    - `cargo test -p codex-core --lib tools::handlers::tool_search::`
    - `just fix -p codex-tools -p codex-core`
    - `just argument-comment-lint`
    
    ## References
    
    - [#15923](https://github.com/openai/codex/pull/15923) `codex-tools:
    extract shared tool schema parsing`
    - [#15928](https://github.com/openai/codex/pull/15928) `codex-tools:
    extract MCP schema adapters`
    - [#15944](https://github.com/openai/codex/pull/15944) `codex-tools:
    extract dynamic tool adapters`
    - [#15953](https://github.com/openai/codex/pull/15953) `codex-tools:
    introduce named tool definitions`
  • Add usage-based business plan types (#15934)
    ## Summary
    - add `self_serve_business_usage_based` and `enterprise_cbp_usage_based`
    to the public/internal plan enums and regenerate the app-server + Python
    SDK artifacts
    - map both plans through JWT login and backend rate-limit payloads, then
    bucket them with the existing Team/Business entitlement behavior in
    cloud requirements, usage-limit copy, tooltips, and status display
    - keep the earlier display-label remap commit on this branch so the new
    Team-like and Business-like plans render consistently in the UI
    
    ## Testing
    - `just write-app-server-schema`
    - `uv run --project sdk/python python
    sdk/python/scripts/update_sdk_artifacts.py generate-types`
    - `just fix -p codex-protocol -p codex-login -p codex-core -p
    codex-backend-client -p codex-cloud-requirements -p codex-tui -p
    codex-tui-app-server -p codex-backend-openapi-models`
    - `just fmt`
    - `just argument-comment-lint`
    - `cargo test -p codex-protocol
    usage_based_plan_types_use_expected_wire_names`
    - `cargo test -p codex-login usage_based`
    - `cargo test -p codex-backend-client usage_based`
    - `cargo test -p codex-cloud-requirements usage_based`
    - `cargo test -p codex-core usage_limit_reached_error_formats_`
    - `cargo test -p codex-tui plan_type_display_name_remaps_display_labels`
    - `cargo test -p codex-tui remapped`
    - `cargo test -p codex-tui-app-server
    plan_type_display_name_remaps_display_labels`
    - `cargo test -p codex-tui-app-server remapped`
    - `cargo test -p codex-tui-app-server
    preserves_usage_based_plan_type_wire_name`
    
    ## Notes
    - a broader multi-crate `cargo test` run still hits unrelated existing
    guardian-approval config failures in
    `codex-rs/core/src/config/config_tests.rs`
  • plugins: Clean up stale curated plugin sync temp dirs and add sync metrics (#16035)
    1. Keep curated plugin staging directories under TempDir ownership until
    activation succeeds, so failed git/HTTP sync attempts do not leak
    plugins-clone-*.
    2. Best-effort clean up stale plugins-clone-* directories before
    creating a new staged repo, using a conservative age threshold.
    3. Emit OTEL counters for curated plugin startup sync transport attempts
    and final outcome across git and HTTP paths.
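    
    Step 2 can be illustrated with a pure selection function. This is a sketch under assumptions: `StagedDir` and `stale_staging_dirs` are hypothetical names, and the age threshold is left to the caller because the log does not state the actual value.

```rust
use std::time::Duration;

/// Hypothetical summary of a directory found next to the staging area.
struct StagedDir {
    name: String,
    /// Time since the directory was last modified.
    age: Duration,
}

/// Pick the `plugins-clone-*` directories old enough to be treated as
/// leftovers. Recent ones are skipped so an in-flight sync is not touched.
fn stale_staging_dirs(entries: &[StagedDir], min_age: Duration) -> Vec<&str> {
    entries
        .iter()
        .filter(|e| e.name.starts_with("plugins-clone-") && e.age >= min_age)
        .map(|e| e.name.as_str())
        .collect()
}
```

    Removal itself would stay best-effort: a failure to delete one stale directory should not block creating the new staged repo.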
  • Normalize /mcp tool grouping for hyphenated server names (#15946)
    Fix the `/mcp` tool grouping display for servers whose names contain
    special characters such as hyphens.
  • fix: fix Windows CI regression introduced in #15999 (#16027)
    #15999 introduced a Windows-only `\r\n` mismatch in review-exit template
    handling. This PR normalizes those template newlines and separates that
    fix from [#16014](https://github.com/openai/codex/pull/16014) so it can
    be reviewed independently.
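    
    For illustration, newline normalization of this kind can be as small as the following. `normalize_newlines` is a hypothetical helper name, not necessarily what the PR uses.

```rust
/// Convert Windows-style `\r\n` line endings to `\n` so that template
/// comparisons behave the same on every platform.
fn normalize_newlines(text: &str) -> String {
    text.replace("\r\n", "\n")
}
```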
  • codex-tools: introduce named tool definitions (#15953)
    ## Why
    
    This continues the `codex-tools` migration by moving one more piece of
    generic tool-definition bookkeeping out of `codex-core`.
    
    The earlier extraction steps moved shared schema parsing into
    `codex-tools`, but `core/src/tools/spec.rs` still had to supply tool
    names separately and perform ad hoc rewrites for deferred MCP aliases.
    That meant the crate boundary was still awkward: the parsed shape coming
    back from `codex-tools` was missing part of the definition that
    `codex-core` ultimately needs to assemble a `ResponsesApiTool`.
    
    This change introduces a named `ToolDefinition` in `codex-tools` so both
    MCP tools and dynamic tools cross the crate boundary in the same
    reusable model. `codex-core` still owns the final `ResponsesApiTool`
    assembly, but less of the generic tool-definition shaping logic stays
    behind in `core`.
    
    ## What changed
    
    - replaced `ParsedToolDefinition` with a named `ToolDefinition` in
    `codex-rs/tools/src/tool_definition.rs`
    - added `codex-rs/tools/src/tool_definition_tests.rs` for `renamed()`
    and `into_deferred()`
    - updated `parse_dynamic_tool()` and `parse_mcp_tool()` to return
    `ToolDefinition`
    - simplified `codex-rs/core/src/tools/spec.rs` so it adapts
    `ToolDefinition` into `ResponsesApiTool` instead of rewriting names and
    deferred fields inline
    - updated parser tests and `codex-rs/tools/README.md` to reflect the
    named tool-definition model
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core --lib tools::spec::`
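    
    To make the shape of a named definition concrete, here is a heavily simplified sketch. Only the names `ToolDefinition`, `renamed()`, `into_deferred()`, and `defer_loading` come from this log; the fields, signatures, and behavior below are assumptions for illustration, not the crate's actual API.

```rust
/// Simplified stand-in for a named tool definition (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
struct ToolDefinition {
    name: String,
    description: String,
    defer_loading: bool,
}

impl ToolDefinition {
    /// Return the same definition under a different name, for example
    /// when exposing an MCP tool under an alias.
    fn renamed(mut self, name: impl Into<String>) -> Self {
        self.name = name.into();
        self
    }

    /// Mark the definition as deferred (assumed meaning: loaded lazily).
    fn into_deferred(mut self) -> Self {
        self.defer_loading = true;
        self
    }
}
```

    With the name carried inside the definition, a caller can chain `renamed(...)` and `into_deferred()` instead of rewriting name and deferred fields inline.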
  • ci: add Bazel clippy workflow for codex-rs (#15955)
    ## Why
    `bazel.yml` already builds and tests the Bazel graph, but `rust-ci.yml`
    still runs `cargo clippy` separately. This PR starts the transition to a
    Bazel-backed lint lane for `codex-rs` so we can eventually replace the
    duplicate Rust build, test, and lint work with Bazel while explicitly
    keeping the V8 Bazel path out of scope for now.
    
    To make that lane practical, the workflow also needs to look like the
    Bazel job we already trust. That means sharing the common Bazel setup
    and invocation logic instead of hand-copying it, and covering the arm64
    macOS path in addition to Linux.
    
    Landing the workflow green also required fixing the first lint findings
    that Bazel surfaced and adding the matching local entrypoint.
    
    ## What changed
    - add a reusable `build:clippy` config to `.bazelrc` and export
    `codex-rs/clippy.toml` from `codex-rs/BUILD.bazel` so Bazel can run the
    repository's existing Clippy policy
    - add `just bazel-clippy` so the local developer entrypoint matches the
    new CI lane
    - extend `.github/workflows/bazel.yml` with a dedicated Bazel clippy job
    for `codex-rs`, scoped to `//codex-rs/... -//codex-rs/v8-poc:all`
    - run that clippy job on Linux x64 and arm64 macOS
    - factor the shared Bazel workflow setup into
    `.github/actions/setup-bazel-ci/action.yml` and the shared Bazel
    invocation logic into `.github/scripts/run-bazel-ci.sh` so the clippy
    and build/test jobs stay aligned
    - fix the first Bazel-clippy findings needed to keep the lane green,
    including the cross-target `cmsghdr::cmsg_len` normalization in
    `codex-rs/shell-escalation/src/unix/socket.rs` and the no-`voice-input`
    dead-code warnings in `codex-rs/tui` and `codex-rs/tui_app_server`
    
    ## Verification
    - `just bazel-clippy`
    - `RUNNER_OS=macOS ./.github/scripts/run-bazel-ci.sh -- build
    --config=clippy --build_metadata=COMMIT_SHA=local-check
    --build_metadata=TAG_job=clippy -- //codex-rs/...
    -//codex-rs/v8-poc:all`
    - `bazel build --config=clippy
    //codex-rs/shell-escalation:shell-escalation`
    - `CARGO_TARGET_DIR=/tmp/codex4-shell-escalation-test cargo test -p
    codex-shell-escalation`
    - `ruby -e 'require "yaml";
    YAML.load_file(".github/workflows/bazel.yml");
    YAML.load_file(".github/actions/setup-bazel-ci/action.yml")'`
    
    ## Notes
    - `CARGO_TARGET_DIR=/tmp/codex4-tui-app-server-test cargo test -p
    codex-tui-app-server` still hits existing guardian-approvals test and
    snapshot failures unrelated to this PR's Bazel-clippy changes.
    
    Related: #15954
  • codex-tools: extract dynamic tool adapters (#15944)
    ## Why
    
    `codex-tools` already owned the shared JSON schema parser and the MCP
    tool schema adapter, but `core/src/tools/spec.rs` still parsed dynamic
    tools directly.
    
    That left the tool-schema boundary split in two different ways:
    
    - MCP tools flowed through `codex-tools`, while dynamic tools were still
    parsed in `codex-core`
    - the extracted dynamic-tool path initially introduced a
    dynamic-specific parsed shape even though `codex-tools` already had very
    similar MCP adapter output
    
    This change finishes that extraction boundary in one step. `codex-core`
    still owns `ResponsesApiTool` assembly, but both MCP tools and dynamic
    tools now enter that layer through `codex-tools` using the same parsed
    tool-definition shape.
    
    ## What changed
    
    - added `tools/src/dynamic_tool.rs` and sibling
    `tools/src/dynamic_tool_tests.rs`
    - introduced `parse_dynamic_tool()` in `codex-tools` and switched
    `core/src/tools/spec.rs` to use it for dynamic tools
    - added `tools/src/parsed_tool_definition.rs` so both MCP and dynamic
    adapters return the same `ParsedToolDefinition`
    - updated `core/src/tools/spec.rs` to build `ResponsesApiTool` through a
    shared local adapter helper instead of separate MCP and dynamic assembly
    paths
    - expanded `core/src/tools/spec_tests.rs` so the dynamic-tool adapter
    test asserts the full converted `ResponsesApiTool`, including
    `defer_loading`
    - updated `codex-rs/tools/README.md` to reflect the shared parsed
    tool-definition boundary
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core --lib tools::spec::`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15944).
    * #15953
    * __->__ #15944
  • fix(sandbox): fix bwrap lookup for multi-entry PATH (#15973)
    ## Summary
    - split the joined `PATH` before running system `bwrap` lookup
    - keep the existing workspace-local `bwrap` skip behavior intact
    - add regression tests that exercise real multi-entry search paths
    
    ## Why
    The PATH-based lookup added in #15791 still wrapped the raw `PATH`
    environment value as a single `PathBuf` before passing it through
    `join_paths()`. On Unix, a normal multi-entry `PATH` contains `:`, so
    that wrapper path is invalid as one path element and the lookup returns
    `None`.
    
    That made Codex behave as if no system `bwrap` was installed even when
    `bwrap` was available on `PATH`, which is what users in #15340 were
    still hitting on `0.117.0-alpha.25`.
    
    ## Impact
    System `bwrap` discovery now works with normal multi-entry `PATH` values
    instead of silently falling back to the vendored binary.
    
    Fixes #15340.
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-sandboxing`
    - `cargo test -p codex-linux-sandbox`
    - `just fix -p codex-sandboxing`
    - `just argument-comment-lint`
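    
    The fix described above can be sketched with `std::env::split_paths`, which splits a `PATH`-style value into its entries. `find_in_path` and the `is_candidate` callback are hypothetical names; the real lookup also applies the workspace-local skip, which is not modeled here.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Search each entry of a PATH-style value for `program`.
/// `is_candidate` decides whether a joined path counts as a hit
/// (for example, "exists and is executable").
fn find_in_path(
    path_value: &OsStr,
    program: &str,
    is_candidate: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // Split first: treating the whole value as one path element would
    // produce a directory name containing the separator.
    env::split_paths(path_value)
        .map(|dir| dir.join(program))
        .find(|candidate| is_candidate(candidate.as_path()))
}
```

    A caller might pass `|p| p.is_file()` as the predicate. Taking the predicate as a parameter keeps the search logic testable without touching the filesystem.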
  • chore: move pty and windows sandbox to Rust 2024 (#15954)
    ## Why
    
    `codex-utils-pty` and `codex-windows-sandbox` were the remaining crates
    in `codex-rs` that still overrode the workspace's Rust 2024 edition.
    Moving them forward in a separate PR keeps the baseline edition update
    isolated from the follow-on Bazel clippy workflow in #15955, while
    making linting and formatting behavior consistent with the rest of the
    workspace.
    
    This PR also needs Cargo and Bazel to agree on the edition for
    `codex-windows-sandbox`. Without the Bazel-side sync, the experimental
    Bazel app-server builds fail once they compile `windows-sandbox-rs`.
    
    ## What changed
    
    - switch `codex-rs/utils/pty` and `codex-rs/windows-sandbox-rs` to
    `edition = "2024"`
    - update `codex-utils-pty` callsites and tests to use the collapsed `if
    let` form that Clippy expects under the new edition
    - fix the Rust 2024 fallout in `windows-sandbox-rs`, including the
    reserved `gen` identifier, `unsafe extern` requirements, and new Clippy
    findings that surfaced under the edition bump
    - keep the edition bump separate from a larger unsafe cleanup by
    temporarily allowing `unsafe_op_in_unsafe_fn` in the Windows entrypoint
    modules that now report it under Rust 2024
    - update `codex-rs/windows-sandbox-rs/BUILD.bazel` to `crate_edition =
    "2024"` so Bazel compiles the crate with the same edition as Cargo
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15954).
    * #15976
    * #15955
    * __->__ #15954