45 Commits

  • [codex] implement standalone code-mode process host (#30111)
    ## Summary
    
    - implement the standalone `codex-code-mode-host` stdio service
    - route sessions, cells, delegate requests, responses, and cancellation
    through a bounded host peer
    - supervise request, writer, cell-forwarding, actor, and V8 failure
    boundaries
    - bound request/session tombstones and fail-stop the connection on
    invalid protocol state
    - add host-only duplex protocol tests and local Cargo/Bazel run recipes
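
The bounded-tombstone and fail-stop ideas in the summary can be illustrated with a minimal sketch. It is written in Python for brevity and is not the host's implementation. `BoundedTombstones`, `ProtocolError`, and `classify_response` are hypothetical names, and the assumption that tombstones record finished request IDs so late replies can be told apart from invalid ones is an interpretation, not something the PR states.

```python
from collections import OrderedDict


class ProtocolError(Exception):
    """Raised on invalid protocol state; the caller is expected to close the connection."""


class BoundedTombstones:
    """Remember the most recently finished request IDs, up to a fixed capacity."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._ids = OrderedDict()

    def add(self, request_id):
        self._ids[request_id] = None
        self._ids.move_to_end(request_id)
        # Evict the oldest entries so memory use stays bounded.
        while len(self._ids) > self._capacity:
            self._ids.popitem(last=False)

    def __contains__(self, request_id):
        return request_id in self._ids


def classify_response(request_id, live, tombstones):
    """Return "deliver" or "drop"; raise ProtocolError for an unknown ID."""
    if request_id in live:
        return "deliver"
    if request_id in tombstones:
        return "drop"  # late reply for a request that already finished
    raise ProtocolError(f"response for unknown request {request_id}")
```

In this sketch, a reply whose tombstone has already been evicted is treated like any other unknown ID and stops the connection, which is the price of keeping the set bounded.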
    
    ## Why
    
    This stage makes the host process independently runnable and reviewable
    before exposing any remote client in Codex. Transport or runtime failure
    closes the connection and relies on process replacement rather than
    transactional recovery.
    
    ## Stack
    
    This is **3 of 4** in the process-owned code-mode session stack.
    
    - Depends on #30110
    - The final client PR targets this branch
    
    ## Validation
    
    - `just test -p codex-code-mode-host` — 7 host-only tests passed
    - `just fix -p codex-code-mode-host`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just fmt`
  • Make formatter output quiet on success (#29467)
    ## Why
    
    `just fmt` is quite noisy even on successful runs.
    
    ## What
    
    Only print output when a formatter fails.
    
    - Buffer output from each formatter and print only a failed command and
    its diagnostics.
    - Prefix the `justfile` driver invocations with `@` so Just does not
    echo the command itself.
    - Retain rustfmt stderr on failure and cover silent-success and
    failure-reporting behavior.
    
    ## Validation
    
    - Confirmed `just fmt` and `just fmt-check` both exit successfully with
    empty stdout and stderr.
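
A minimal sketch of the "silent on success" behavior described above, assuming each formatter is an ordinary subprocess. `run_quietly` is a hypothetical helper name, not the driver's actual function.

```python
import subprocess
import sys


def run_quietly(label, argv):
    """Run one formatter; print nothing on success, report everything on failure."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # Show the failed command, then the diagnostics it produced.
        print(f"[{label}] failed: {' '.join(argv)}", file=sys.stderr)
        sys.stderr.write(result.stdout)
        sys.stderr.write(result.stderr)
    return result.returncode
```

Capturing both streams means a passing formatter contributes no output at all, while a failing one keeps its stderr for diagnosis.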
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • [codex] Speed up local nextest runs (#26479)
    ## Why
    
    `just test` currently uses the CI-oriented nextest profile, which
    serializes app-server integration tests even on developer machines that
    can run several safely. Bounded local parallelism substantially shortens
    this common iteration loop without changing CI behavior.
    
    Eight-worker experiments were faster, but keeping them reliable required
    relaxing several test deadlines. Four workers for integration tests is a
    solid tradeoff that speeds up local testing without needing to change
    test logic.
    
    ## What changed
    
    - Add a `local` nextest profile that inherits the existing defaults.
    - Allow up to four app-server integration tests to run concurrently
    under that profile.
    - Make `just test` select the local profile on Unix and Windows.
    - Keep the default CI profile serialized and leave all test deadlines
    unchanged.
    
    The tests use separate processes, randomized temporary `CODEX_HOME`
    directories, and ephemeral ports. The remaining shared constraints are
    system resources; each app-server also uses a multi-thread Tokio
    runtime, and fuzzy-search tests can create additional worker threads, so
    the local cap remains intentionally conservative.
    
    ## Performance and validation
    
    All measurements below are warm, execution-only app-server runs with
    nextest retries disabled.
    
    On the current rebased branch, an AMD EPYC 7763 machine with 16 logical
    CPUs and 62 GiB RAM completed three consecutive runs:
    
    | Run | Nextest time | Wall time | Result |
    | --- | ---: | ---: | --- |
    | 1 | 142.941s | 145.17s | 836/836 passed |
    | 2 | 143.402s | 145.59s | 836/836 passed |
    | 3 | 142.870s | 145.08s | 836/836 passed |
    
    The mean wall time was 145.28s. The slow-inventory, approval replay, and
    zsh-fork tests all passed with their original deadlines.
    
    Earlier measurements on the same Linux machine, before the suite grew,
    showed the scaling that motivated the change:
    
    | App-server concurrency | Nextest time | Result |
    | --- | ---: | --- |
    | 1 | 369.5s | 572/572 passed |
    | 2 | 194.5s | 572/572 passed |
    | 4 | 111.0s mean over 3 runs | 3/3 clean |
    
    Four workers reduced that execution time by about 70%, a roughly 3.3x
    speedup over serialization.
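
The summary figures can be re-derived from the two tables. The snippet below only repeats that arithmetic with the numbers quoted above.

```python
from statistics import mean

# Wall times of the three consecutive runs on the rebased branch (seconds).
wall_times = [145.17, 145.59, 145.08]
mean_wall = mean(wall_times)

# Earlier nextest times: serialized vs. four app-server workers (seconds).
serial = 369.5
four_workers = 111.0
speedup = serial / four_workers
reduction = 1 - four_workers / serial

print(f"mean wall time: {mean_wall:.2f}s")                      # mean wall time: 145.28s
print(f"speedup: {speedup:.1f}x, reduction: {reduction:.0%}")   # speedup: 3.3x, reduction: 70%
```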
  • Remove just bench-smoke from just test. (#26716)
    ## Why
    
    `just test` should run the test suite without also compiling and
    executing benchmark smoke tests. Keeping benchmark validation explicit
    avoids adding unrelated work to every project-specific test invocation.
    
    ## What changed
    
    - Remove the `just bench-smoke` step from the Unix and Windows `test`
    recipes.
    - Document `just bench` and `just bench-smoke` as the explicit benchmark
    commands in `AGENTS.md`.
    
    ## Validation
    
    - `just test -p codex-arg0`
    - `just --dry-run test`
    - `just --dry-run bench-smoke`
  • [codex] Fix Windows BuildBuddy Bazel wrapper execution (#25915)
    ## Why
    
    #25156 moved Bazel CI launches into a shared Python wrapper. On Windows,
    launching Bazel with `os.execvp` can split the spaced
    `--test_env=PATH=...` argument and fail to propagate the eventual Bazel
    exit status, allowing jobs to pass without running tests. This reapplies
    the wrapper after #25909 with a Windows-safe launch path.
    
    ## What changed
    
    Use a waited `subprocess.run` launch on Windows while preserving
    `os.execvp` on Unix. Add a process-level regression test for spaced
    arguments and child exit status, and run it on Windows Bazel shard 1.
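
The launch split described above might look roughly like the following sketch. `launch` and its `windows` parameter are hypothetical. The real wrapper's structure is not shown in this PR description.

```python
import os
import subprocess
import sys


def launch(argv, *, windows=(os.name == "nt")):
    """Run argv so that the caller can end with the child's exit status."""
    if windows:
        # Wait for the child and hand back its status. Passing a list keeps
        # every argument intact, including ones that contain spaces.
        return subprocess.run(argv, check=False).returncode
    # On Unix, replace the current process; the exit status is the child's own.
    os.execvp(argv[0], argv)
```

A caller would finish with `sys.exit(launch(argv))`, so on Windows a failing child makes the wrapper fail as well.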
    
    ## Experiment
    
    To confirm Bazel was actually invoking tests, patch `87b61d0be6`
    temporarily added an intentionally failing `codex-core` unit test. Bazel
    failed on that sentinel on all three major platforms:
    
    - [Linux Bazel
    test](https://github.com/openai/codex/actions/runs/26841132773/job/79151062486)
    - [macOS Bazel
    test](https://github.com/openai/codex/actions/runs/26841132773/job/79151062362)
    - [Windows Bazel test shard
    1/4](https://github.com/openai/codex/actions/runs/26841132773/job/79151062155)
    
    The sentinel was removed after collecting this evidence. Windows Bazel
    [clippy](https://github.com/openai/codex/actions/runs/26841132773/job/79151062914)
    and [release
    verification](https://github.com/openai/codex/actions/runs/26841132773/job/79151062739)
    also passed.
    
    ## Validation
    
    After removing the sentinel, `just test -p codex-core` no longer
    reported it. The local run retained two unrelated environment-specific
    failures.
  • [codex] Revert shared BuildBuddy Bazel wrapper (#25909)
    ## Why
    
    PR #25905 intentionally adds a failing `codex-core` unit test, but its
    [Bazel test on Windows
    check](https://github.com/openai/codex/actions/runs/26837526950/job/79135369259)
    passed. That shows the Bazel configuration introduced by #25156 is not
    behaving as expected, so revert it while the configuration can be
    investigated separately.
    
    ## What changed
    
    Revert #25156 in full, restoring the previous Bazel remote
    configuration, CI scripts, workflows, `rusty_v8` handling, and
    documentation. This removes the shared BuildBuddy wrapper and its tests.
    
    ## Validation
    
    Not run locally; this exact revert was prioritized for a fast rollback.
  • Route Bazel CI through shared BuildBuddy remote config wrapper (#25156)
    ## Why
    
    Bazel remote configuration was selected in several CI scripts and
    workflow steps. That made the BuildBuddy tenant policy easy to duplicate
    and harder to audit, especially for fork pull requests that must not use
    the OpenAI tenant.
    
    This builds on
    [sluongng/buildbuddy-ci-host-routing](https://github.com/openai/codex/compare/main...sluongng:codex:sluongng/buildbuddy-ci-host-routing)
    and consolidates the policy in one place.
    
    ## What to do if this breaks you
    
    See `codex-rs/docs/bazel.md` for details. TL;DR:
    
    1. make a BuildBuddy API key and put it in `~/.bazelrc`
    2. if you're an OpenAI employee, add `common
    --config=buildbuddy-openai-rbe` to `user.bazelrc` in the repo root
    
    Run `just bazel-test` to ensure it works.
    
    Note that `just bazel-remote-test` no longer exists; to use RBE, select
    a remote configuration as documented.
    
    ## What changed
    
    - Add `.github/scripts/run_bazel_with_buildbuddy.py` as the shared Bazel
    wrapper and Python library. It selects the OpenAI host only for trusted
    upstream GitHub Actions runs, routes keyed fork runs to the generic
    host, and falls back to local Bazel execution when no key is available.
    - Move endpoint selection into explicit `.bazelrc` configurations and
    update Bazel CI, query helpers, and `rusty_v8` staging to use the shared
    policy. Loading-phase target-discovery queries remain local.
    - Add wrapper and `rusty_v8` unit coverage, plus `just test-scripts` for
    the `.github/scripts` Python tests.
    - Document local Bazel usage, `user.bazelrc` setup, BuildBuddy
    configurations, and CI behavior in `codex-rs/docs/bazel.md`.
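
The routing policy in the first bullet can be summarized as a small decision function. This is a sketch under assumptions: `Remote` and `select_remote` are hypothetical names, and the order of the checks (key first, then trust) is a guess at how the three cases combine.

```python
from enum import Enum


class Remote(Enum):
    OPENAI = "openai"    # OpenAI BuildBuddy host
    GENERIC = "generic"  # generic BuildBuddy host
    LOCAL = "local"      # no remote; plain local Bazel execution


def select_remote(*, has_api_key, in_github_actions, is_trusted_upstream):
    """Pick where Bazel should run, following the policy described above."""
    if not has_api_key:
        return Remote.LOCAL
    if in_github_actions and is_trusted_upstream:
        return Remote.OPENAI
    # Keyed runs that are not trusted upstream CI (for example, forks).
    return Remote.GENERIC
```

Keeping the decision in one pure function is what makes the tenant policy easy to audit and to unit-test.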
    
    ## Validation
    
    - `just test-scripts`
    - `bash -n .github/scripts/run-bazel-ci.sh
    .github/scripts/run-bazel-query-ci.sh
    .github/scripts/run-argument-comment-lint-bazel.sh
    scripts/list-bazel-clippy-targets.sh`
    - `python3 -m py_compile .github/scripts/run_bazel_with_buildbuddy.py
    .github/scripts/test_run_bazel_with_buildbuddy.py
    .github/scripts/test_rusty_v8_bazel.py
    .github/scripts/rusty_v8_bazel.py`
    - `ruff check .github/scripts/run_bazel_with_buildbuddy.py
    .github/scripts/test_run_bazel_with_buildbuddy.py
    .github/scripts/test_rusty_v8_bazel.py
    .github/scripts/rusty_v8_bazel.py`
  • [codex] Add comprehensive root formatting check (#25683)
    ## Why
    
    The root formatting entrypoints could drift: `just fmt` did not format
    the Justfile itself, and the CI-facing check recipe only checked Python
    scripts instead of matching everything formatted by `just fmt`.
    
    ## What changed
    
    - Add a shared cross-platform Python formatter driver used by both `just
    fmt` and `just fmt-check`.
    - Run Justfile, Rust, Python SDK, and internal-script formatter groups
    concurrently while buffering each formatter group's output until it
    finishes.
    - Log formatter starts immediately, then print each formatter group's
    labeled output when it completes.
    - Keep the SDK lint-fix and Ruff formatting passes ordered, with source
    comments explaining their distinct roles and the check-mode equivalents.
    - Run Ruff through shared `uv run --no-sync --with ruff` overlays so
    formatting works on clean glibc Linux checkouts without installing the
    platform-specific SDK runtime wheel.
    - Show `fmt-check` help text in `just -l` and simplify CI to call the
    shared driver through `just fmt-check`.
    - Pin the general CI workflow to `just@1.51.0` so its formatter agrees
    with the checked-in Justfile.
    - Add regression coverage for the thin Just recipes and the driver's
    formatter graph.
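
A rough sketch of the "concurrent groups, buffered output" pattern from the list above. `run_group` and `run_groups` are hypothetical names. The actual driver's formatter graph and labels are not reproduced here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_group(label, commands):
    """Run one group's commands in order; return (label, exit code, buffered output)."""
    chunks = []
    for argv in commands:
        done = subprocess.run(
            argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        chunks.append(done.stdout)
        if done.returncode != 0:
            # Stop this group at its first failing command.
            return label, done.returncode, "".join(chunks)
    return label, 0, "".join(chunks)


def run_groups(groups):
    """Run groups concurrently; print each group's output as one labeled block."""
    worst = 0
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        futures = []
        for label, commands in groups.items():
            print(f"[{label}] started", flush=True)
            futures.append(pool.submit(run_group, label, commands))
        for future in as_completed(futures):
            label, code, output = future.result()
            print(f"[{label}] exit {code}\n{output}", end="", flush=True)
            worst = max(worst, code)
    return worst
```

Commands inside a group stay ordered, which matches the requirement that the lint-fix pass runs before the formatting pass, while separate groups overlap freely.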
    
    ## Validation
    
    - `just fmt`
    - `just fmt-check`
    - `python3 -m pytest
    sdk/python/tests/test_artifact_workflow_and_binaries.py -k 'root_fmt or
    root_format' -q`
    - `pnpm run format`
    - `git diff --check`
    - `just -l | rg -n '^    fmt|fmt-check'`
    - `uvx --from uv==0.7.22 uv run --frozen --project sdk/python --no-sync
    --with ruff ruff check --diff sdk/python`
  • Check root Python script formatting in CI (#25165)
    ## Why
    
    Python files under `scripts/` were not covered by the repository
    formatting recipe or the CI formatting job, so formatting drift could
    merge unnoticed.
    
    ## What
    
    - Add a dedicated `scripts/pyproject.toml` and `scripts/uv.lock` so
    root-script formatting uses a locked Ruff version.
    - Extend `just fmt` to format root Python scripts and add
    `fmt-scripts-check` for CI.
    - Run `just fmt-scripts-check` from `.github/workflows/ci.yml`,
    installing `uv` through SHA-pinned `astral-sh/setup-uv` while retaining
    the `uv` `0.11.3` pin.
    - Apply Ruff formatting to the root Python scripts, including
    `scripts/just-shell.py`, and extend
    `sdk/python/tests/test_artifact_workflow_and_binaries.py` to cover the
    root formatting recipe.
    - Update `AGENTS.md` so agents run `just fmt` after code changes
    anywhere in the repository.
    
    ## Validation
    
    - Extended the existing Python SDK workflow test to assert that `just
    fmt` includes root Python scripts.
  • [codex] Make justfile recipes Windows-aware (#24983)
    ## Summary
    
    Make the root `justfile` usable from Windows without maintaining a
    separate Windows copy of most recipes.
    
    The repo recipes previously assumed POSIX shell behavior for things like
    variadic argument forwarding (`"$@"`) and stderr redirection
    (`2>/dev/null`). That made common workflows such as `just fmt`, `just
    test`, and `just log` unreliable from Windows. This PR introduces a
    small cross-platform shell adapter so recipes can stay mostly unified
    while still expanding the few shell-specific constructs correctly on
    macOS/Linux and Windows.
    
    ## What Changed
    
    - Add `scripts/just-shell.py` as the configured `just` shell adapter.
      - On Unix it invokes `sh -cu`.
      - On Windows it invokes `pwsh -CommandWithArgs` so arguments containing
    spaces are preserved.
    - Add portable recipe placeholders:
      - `{args}` expands to `"$@"` on Unix and the equivalent PowerShell
    forwarded-args expression on Windows.
      - `{stderr-null}` expands to the platform-specific stderr suppression
    used by `fmt`.
    - Convert most variadic one-line recipes to the unified `{args}` form,
    including `codex`, `exec`, `file-search`, `app-server-test-client`,
    `fix`, `clippy`, `bench`, `mcp-server-run`, `write-app-server-schema`,
    and `argument-comment-lint-from-source`.
    - Keep genuinely shell-specific recipes split or Unix-only for now,
    including recipes backed by `.sh` scripts or recipes whose bodies are
    more than simple command forwarding.
    - Add a Windows `just install` path that installs PowerShell via
    `winget` when `pwsh` is not available, then runs the same basic Rust
    setup steps.
    - Update the SDK test that validates the root `fmt` recipe so it
    recognizes the new portable stderr placeholder.
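
The placeholder expansion can be pictured as a per-platform substitution table. This sketch is illustrative only: `expand` is a hypothetical function, the Unix expansions come from the description above, and the PowerShell spellings (`@args`, `2>$null`) are assumptions, since the description only calls them "equivalent".

```python
# Unix expansions as stated in the PR description.
UNIX = {"{args}": '"$@"', "{stderr-null}": "2>/dev/null"}
# PowerShell expansions are assumed for illustration.
WINDOWS = {"{args}": "@args", "{stderr-null}": "2>$null"}


def expand(recipe_line, *, windows):
    """Replace portable placeholders with the platform's shell syntax."""
    table = WINDOWS if windows else UNIX
    for placeholder, replacement in table.items():
        recipe_line = recipe_line.replace(placeholder, replacement)
    return recipe_line
```

For example, `expand("cargo run --bin codex -- {args}", windows=False)` yields `cargo run --bin codex -- "$@"`, so one recipe body serves both shells.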
    
    ## Validation
    
    - `just --summary`
    - `just --dry-run fmt`
    - `just --dry-run bench-smoke`
    - `just --dry-run codex foo "bar binky" baz`
    - `just --dry-run write-hooks-schema`
    - `just --dry-run bazel-lock-update`
    - `just --dry-run argument-comment-lint-from-source -- "foo bar"`
    - `git diff --check -- justfile scripts/just-shell.py
    sdk/python/tests/test_artifact_workflow_and_binaries.py`
    - Verified Windows argv preservation through `scripts/just-shell.py`
    with arguments containing spaces.
    - `uv run --frozen --project sdk/python --extra dev pytest
    sdk/python/tests/test_artifact_workflow_and_binaries.py::test_root_fmt_recipe_formats_rust_and_python_sdk`
  • Add app-server startup benchmark crate (#24651)
    ## Summary
    - Add a new `app-server-start-bench` crate to measure app-server startup
    performance
    - Wire the benchmark into the workspace and Bazel build so it can be run
    consistently
    - Update lockfiles and repo automation to account for the new package
  • [codex] Add image re-encoding benchmarks (#23935)
    ## Summary
    - add Divan benchmarks for prompt image re-encoding paths
    - wire the image benchmark smoke test into Rust CI workflows
    
    ## Why
    Image prompt handling includes re-encoding work that benefits from
    repeatable benchmark coverage so changes can be measured in CI and
    locally.
    
    This already helped identify a potential regression from changing compiler flags.
    
    ## Impact
    Developers can run and compare the new image re-encoding benchmarks, and
    CI exercises the benchmark target via the Rust benchmark smoke test.
  • Prefer just test over cargo test in docs (#23910)
    `cargo test` for the core and other crates fails on a fresh macOS
    checkout without the right stack size variable. This change encourages
    using the `just test` command, which sets up the environment correctly.
    
    As a bonus, this should encourage agents to get more benefit out of
    nextest's parallel execution.
  • fix: prevent fmt from updating Python SDK lockfile (#22505)
    ## Why
    
    `just fmt` should align source formatting without resolving dependencies
    or rewriting lockfiles. The Python SDK formatting steps run through
    `uv`, so differing local `uv` versions could decide the SDK lock was
    stale and mutate `sdk/python/uv.lock` before Ruff ran.
    
    ## What
    
    - Add `--frozen` to both Python SDK `uv run ... ruff` commands in the
    root `fmt` recipe.
    - Update the existing Python SDK artifact workflow guard test so future
    changes keep the formatter recipe non-lock-mutating.
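
A guard of this kind could be as simple as scanning the recipe text. The sketch below is not the repository's test. `unfrozen_uv_ruff_commands` is a hypothetical helper, and the matching rule (whitespace-separated tokens on a single line) is an assumption.

```python
def unfrozen_uv_ruff_commands(recipe_text):
    """Return recipe lines that run Ruff through `uv run` without `--frozen`."""
    offenders = []
    for line in recipe_text.splitlines():
        tokens = line.split()
        runs_ruff_via_uv = "uv" in tokens and "run" in tokens and "ruff" in tokens
        if runs_ruff_via_uv and "--frozen" not in tokens:
            offenders.append(line.strip())
    return offenders
```

A test would read the `fmt` recipe and assert that the returned list is empty, so a future edit that drops `--frozen` fails fast.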
    
    ## Verification
    
    - `uv run --frozen --project ../sdk/python --extra dev pytest
    ../sdk/python/tests/test_artifact_workflow_and_binaries.py -q`
  • [8/8] Add Python SDK Ruff formatting (#22021)
    ## Why
    
    The Python SDK needs the same tight formatter/lint loop as the rest of
    the repo: a safe Ruff autofix pass, Ruff formatting, editor save
    behavior, and CI checks that catch drift. Without that loop, SDK changes
    can land with formatting or import ordering that differs from what
    reviewers and CI expect.
    
    ## What
    
    - Add Ruff configuration to `sdk/python/pyproject.toml`, excluding
    generated protocol code and notebooks from the normal lint/format pass.
    - Update `just fmt` so it still formats Rust and also runs Python SDK
    Ruff autofix and formatting.
    - Add Python SDK CI steps for `ruff check` and `ruff format --check`
    before pytest.
    - Recommend the Ruff VS Code extension and enable Python
    format/fix/organize-on-save so Cmd+S uses the same tooling.
    - Apply the resulting Ruff formatting to SDK Python files, examples, and
    the checked-in generated `v2_all.py` output emitted by the pinned
    generator.
    - Add a guard test for the `just fmt` recipe so it keeps working from
    both Rust and Python SDK working directories.
    
    ## Stack
    
    1. #21891 `[1/8]` Pin Python SDK runtime dependency
    2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
    3. #21895 `[3/8]` Run Python SDK tests in CI
    4. #21896 `[4/8]` Define Python SDK public API surface
    5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
    6. #21910 `[6/8]` Add high-level Python SDK approval mode
    7. #22014 `[7/8]` Add Python SDK app-server integration harness
    8. This PR `[8/8]` Add Python SDK Ruff formatting
    
    ## Verification
    
    - Added `test_root_fmt_recipe_formats_rust_and_python_sdk` for the
    shared format recipe.
    - Ran `just fmt` after the recipe update.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Ensure all mentions of cargo-install are --locked (#21592)
    There's already a preference for this in the codebase, but a few of them
    have drifted away. `--locked` is generally preferred because it reduces
    exposure to supply-chain attacks and improves reproducibility.
    
    In an ideal world these dependencies would maybe even be pinned to
    versions but Cargo is kinda bad at that for devtools. Still better to
    use `--locked` than not.
  • test: set Rust test thread stack size (#19067)
    ## Summary
    
    Set `RUST_MIN_STACK=8388608` for Rust test entry points so
    libtest-spawned test threads get an 8 MiB stack.
    
    The Windows BuildBuddy failure on #18893 showed
    `//codex-rs/tui:tui-unit-tests` exiting with a stack overflow in a
    `#[tokio::test]` even though later test binaries in the shard printed
    successful summaries. Default `#[tokio::test]` uses a current-thread
    Tokio runtime, which means the async test body is driven on libtest's
    std-spawned test thread. Increasing the test thread stack addresses that
    failure mode directly.
    
    To date, we have been fixing these stack-pressure problems with
    localized future-size reductions, such as #13429, and by adding
    `Box::pin()` in specific async wrapper chains. This gives us a baseline
    test-runner stack size instead of continuing to patch individual tests
    only after CI finds another large async future.
    
    ## What changed
    
    - Added `common --test_env=RUST_MIN_STACK=8388608` in `.bazelrc` so
    Bazel test actions receive the env var through Bazel's cache-keyed test
    environment path.
    - Set the same `RUST_MIN_STACK` value for Cargo/nextest CI entry points
    and `just test`.
    - Annotated the existing Windows Bazel linker stack reserve as 8 MiB so
    it stays aligned with the libtest thread stack size.
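
As a quick check on the value, 8388608 is 8 × 1024 × 1024 bytes. The sketch below shows one way a test launcher could pass it to child processes. `env_with_test_stack` and `run_with_test_stack` are hypothetical helpers, not part of the repository.

```python
import os
import subprocess

STACK_BYTES = 8 * 1024 * 1024  # 8 MiB = 8388608 bytes


def env_with_test_stack(base=None):
    """Copy an environment and set RUST_MIN_STACK for spawned test processes."""
    env = dict(os.environ if base is None else base)
    env["RUST_MIN_STACK"] = str(STACK_BYTES)
    return env


def run_with_test_stack(argv):
    """Run a test command with the larger thread stack size in its environment."""
    return subprocess.run(argv, env=env_with_test_stack(), check=False).returncode
```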
    
    ## Testing
    
    - `just --list`
    - parsed `.github/workflows/rust-ci.yml` and
    `.github/workflows/rust-ci-full.yml` with Ruby's YAML loader
    - compared `bazel aquery` `TestRunner` action keys before/after explicit
    `--test_env=RUST_MIN_STACK=...` and after moving the Bazel env to
    `.bazelrc`
    - `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors`
    - failed locally on the existing sandbox-specific status snapshot
    permission mismatch, but loaded the Starlark changes and ran the TUI
    test shards
  • Run exec-server fs operations through sandbox helper (#17294)
    ## Summary
    - run exec-server filesystem RPCs requiring sandboxing through a
    `codex-fs` arg0 helper over stdin/stdout
    - keep direct local filesystem execution for `DangerFullAccess` and
    external sandbox policies
    - remove the standalone exec-server binary path in favor of top-level
    arg0 dispatch/runtime paths
    - add sandbox escape regression coverage for local and remote filesystem
    paths
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - remote devbox: `cd codex-rs && bazel test --bes_backend=
    --bes_results_url= //codex-rs/exec-server:all` (6/6 passed)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • ci: align Bazel repo cache and Windows clippy target handling (#16740)
    ## Why
    
    Bazel CI had two independent Windows issues:
    
    - The workflow saved/restored `~/.cache/bazel-repo-cache`, but
    `.bazelrc` configured `common:ci-windows
    --repository_cache=D:/a/.cache/bazel-repo-cache`, so `actions/cache` and
    Bazel could point at different directories.
    - The Windows `Bazel clippy` job passed the full explicit target list
    from `//codex-rs/...`, but some of those explicit targets are
    intentionally incompatible with `//:local_windows`.
    `run-argument-comment-lint-bazel.sh` already handles that with
    `--skip_incompatible_explicit_targets`; the clippy workflow path did
    not.
    
    I also tried switching the workflow cache path to
    `D:\a\.cache\bazel-repo-cache`, but the Windows clippy job repeatedly
    failed with `Failed to restore: Cache service responded with 400`, so
    the final change standardizes on `$HOME/.cache/bazel-repo-cache` and
    makes cache restore non-fatal.
    
    ## What Changed
    
    - Expose one repository-cache path from
    `.github/actions/setup-bazel-ci/action.yml` and export that path as
    `BAZEL_REPOSITORY_CACHE` so `run-bazel-ci.sh` passes it to Bazel after
    `--config=ci-*`.
    - Move `actions/cache/restore` out of the composite action into
    `.github/workflows/bazel.yml`, and make restore failures non-fatal
    there.
    - Save exactly the exported cache path in `.github/workflows/bazel.yml`.
    - Remove `common:ci-windows
    --repository_cache=D:/a/.cache/bazel-repo-cache` from `.bazelrc` so the
    Windows CI config no longer disagrees with the workflow cache path.
    - Pass `--skip_incompatible_explicit_targets` in the Windows `Bazel
    clippy` job so incompatible explicit targets do not fail analysis while
    the lint aspect still traverses compatible Rust dependencies.
    
    ## Verification
    
    - Parsed `.github/actions/setup-bazel-ci/action.yml` and
    `.github/workflows/bazel.yml` with Ruby's YAML loader.
    - Resubmitted PR `#16740`; CI is rerunning on the amended commit.
  • bazel: lint rust_test targets in clippy workflow (#16450)
    ## Why
    
    `cargo clippy --tests` was catching warnings in inline `#[cfg(test)]`
    code that the Bazel PR Clippy lane missed. The existing Bazel invocation
    linted `//codex-rs/...`, but that did not apply Clippy to the generated
    manual `rust_test` binaries, so warnings in targets such as
    `//codex-rs/state:state-unit-tests-bin` only surfaced as plain compile
    warnings instead of failing the lint job.
    
    ## What Changed
    
    - added `scripts/list-bazel-clippy-targets.sh` to expand the Bazel
    Clippy target set with the generated manual `rust_test` rules while
    still excluding `//codex-rs/v8-poc:all`
    - updated `.github/workflows/bazel.yml` to use that expanded target list
    in the Bazel Clippy PR job
    - updated `just bazel-clippy` to use the same target expansion locally
    - updated `.github/workflows/README.md` to document that the Bazel PR
    lint lane now covers inline `#[cfg(test)]` code
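
The target expansion can be thought of as "wildcard, plus the manual test binaries, minus the exclusions". The sketch below composes such a list. `clippy_targets` is a hypothetical function, and expressing the exclusion as a negative target pattern is an assumption about how the script does it.

```python
def clippy_targets(wildcard, manual_test_targets, excluded):
    """Build a Bazel target-pattern list: wildcard, manual tests, then exclusions."""
    targets = [wildcard]
    # Manual-tagged rust_test targets are skipped by wildcard expansion,
    # so they are listed explicitly.
    targets.extend(sorted(set(manual_test_targets)))
    # A leading "-" marks a negative pattern; pass the list after "--".
    targets.extend(f"-{pattern}" for pattern in excluded)
    return targets
```

With the labels mentioned above, the result is `["//codex-rs/...", "//codex-rs/state:state-unit-tests-bin", "-//codex-rs/v8-poc:all"]`.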
    
    ## Verification
    
    - `./scripts/list-bazel-clippy-targets.sh` includes
    `//codex-rs/state:state-unit-tests-bin`
    - `bazel build --config=clippy -- //codex-rs/state:state-unit-tests-bin`
    now fails with the same unused import in `state/src/runtime/logs.rs`
    that `cargo clippy --tests` reports
  • ci: stop running rust CI with --all-features (#16473)
    ## Why
    
    Now that workspace crate features have been removed and
    `.github/scripts/verify_cargo_workspace_manifests.py` hard-bans new
    ones, Rust CI should stop building and testing with `--all-features`.
    
    Keeping `--all-features` in CI no longer buys us meaningful coverage for
    `codex-rs`, but it still makes the workflow look like we rely on Cargo
    feature permutations that we are explicitly trying to eliminate. It also
    leaves stale examples in the repo that suggest `--all-features` is a
    normal or recommended way to run the workspace.
    
    ## What changed
    
    - removed `--all-features` from the Rust CI `cargo chef cook`, `cargo
    clippy`, and `cargo nextest` invocations in
    `.github/workflows/rust-ci-full.yml`
    - updated the `just test` guidance in `justfile` to reflect that
    workspace crate features are banned and there should be no need to add
    `--all-features`
    - updated the multiline command example and snapshot in
    `codex-rs/tui/src/history_cell.rs` to stop rendering `cargo test
    --all-features --quiet`
    - tightened the verifier docstring in
    `.github/scripts/verify_cargo_workspace_manifests.py` so it no longer
    talks about temporary remaining exceptions
    
    ## How tested
    
    - `python3 .github/scripts/verify_cargo_workspace_manifests.py`
    - `cargo test -p codex-tui`
  • fix: close Bazel argument-comment-lint CI gaps (#16253)
    ## Why
    
    The Bazel-backed `argument-comment-lint` CI path had two gaps:
    
    - Bazel wildcard target expansion skipped inline unit-test crates from
    `src/` modules because the generated `*-unit-tests-bin` `rust_test`
    targets are tagged `manual`.
    - `argument-comment-mismatch` was still only a warning in the Bazel and
    packaged-wrapper entrypoints, so a typoed `/*param_name*/` comment could
    still pass CI even when the lint detected it.
    
    That left CI blind to real linux-sandbox examples, including the missing
    `/*local_port*/` comment in
    `codex-rs/linux-sandbox/src/proxy_routing.rs` and typoed argument
    comments in `codex-rs/linux-sandbox/src/landlock.rs`.
    
    ## What Changed
    
    - Added `tools/argument-comment-lint/list-bazel-targets.sh` so Bazel
    lint runs cover `//codex-rs/...` plus the manual `rust_test`
    `*-unit-tests-bin` targets.
    - Updated `just argument-comment-lint`, `rust-ci.yml`, and
    `rust-ci-full.yml` to use that helper.
    - Promoted both `argument-comment-mismatch` and
    `uncommented-anonymous-literal-argument` to errors in every strict
    entrypoint:
      - `tools/argument-comment-lint/lint_aspect.bzl`
      - `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
      - `tools/argument-comment-lint/wrapper_common.py`
    - Added wrapper/bin coverage for the stricter lint flags and documented
    the behavior in `tools/argument-comment-lint/README.md`.
    - Fixed the now-covered callsites in
    `codex-rs/linux-sandbox/src/proxy_routing.rs`,
    `codex-rs/linux-sandbox/src/landlock.rs`, and
    `codex-rs/core/src/shell_snapshot_tests.rs`.
    
    This keeps the Bazel target expansion narrow while making the Bazel and
    prebuilt-linter paths enforce the same strict lint set.
    
    ## Verification
    
    - `python3 -m unittest discover -s tools/argument-comment-lint -p
    'test_*.py'`
    - `cargo +nightly-2025-09-18 test --manifest-path
    tools/argument-comment-lint/Cargo.toml`
    - `just argument-comment-lint`
  • build: migrate argument-comment-lint to a native Bazel aspect (#16106)
    ## Why
    
    `argument-comment-lint` had become a PR bottleneck because the repo-wide
    lane was still effectively running a `cargo dylint`-style flow across
    the workspace instead of reusing Bazel's Rust dependency graph. That
    kept the lint enforced, but it threw away the main benefit of moving
    this job under Bazel in the first place: metadata reuse and cacheable
    per-target analysis in the same shape as Clippy.
    
    This change moves the repo-wide lint onto a native Bazel Rust aspect so
    Linux and macOS can lint `codex-rs` without rebuilding the world
    crate-by-crate through the wrapper path.
    
    ## What Changed
    
    - add a nightly Rust toolchain with `rustc-dev` for Bazel and a
    dedicated crate-universe repo for `tools/argument-comment-lint`
    - add `tools/argument-comment-lint/driver.rs` and
    `tools/argument-comment-lint/lint_aspect.bzl` so Bazel can run the lint
    as a custom `rustc_driver`
    - switch repo-wide `just argument-comment-lint` and the Linux/macOS
    `rust-ci` lanes to `bazel build --config=argument-comment-lint
    //codex-rs/...`
    - keep the Python/DotSlash wrappers as the package-scoped fallback path
    and as the current Windows CI path
    - gate the Dylint entrypoint behind a `bazel_native` feature so the
    Bazel-native library avoids the `dylint_*` packaging stack
    - update the aspect runtime environment so the driver can locate
    `rustc_driver` correctly under remote execution
    - keep the dedicated `tools/argument-comment-lint` package tests and
    wrapper unit tests in CI so the source and packaged entrypoints remain
    covered
    
    ## Verification
    
    - `python3 -m unittest discover -s tools/argument-comment-lint -p
    'test_*.py'`
    - `cargo test` in `tools/argument-comment-lint`
    - `bazel build
    //tools/argument-comment-lint:argument-comment-lint-driver
    --@rules_rust//rust/toolchain/channel=nightly`
    - `bazel build --config=argument-comment-lint
    //codex-rs/utils/path-utils:all`
    - `bazel build --config=argument-comment-lint
    //codex-rs/rollout:rollout`
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16106).
    * #16120
    * __->__ #16106
  • refactor: rewrite argument-comment lint wrappers in Python (#16063)
    ## Why
    
    The `argument-comment-lint` entrypoints had grown into two shell
    wrappers with duplicated parsing, environment setup, and Cargo
    forwarding logic. The recent `--` separator regression was a good
    example of the problem: the behavior was subtle, easy to break, and hard
    to verify.
    
    This change rewrites those wrappers in Python so the control flow is
    easier to follow, the shared behavior lives in one place, and the tricky
    argument/defaulting paths have direct test coverage.
    
    ## What changed
    
    - replaced `tools/argument-comment-lint/run.sh` and
    `tools/argument-comment-lint/run-prebuilt-linter.sh` with Python
    entrypoints: `run.py` and `run-prebuilt-linter.py`
    - moved shared wrapper behavior into
    `tools/argument-comment-lint/wrapper_common.py`, including:
      - splitting lint args from forwarded Cargo args after `--`
      - defaulting repo runs to `--manifest-path codex-rs/Cargo.toml
        --workspace --no-deps`
      - defaulting non-`--fix` runs to `--all-targets` unless the caller
        explicitly narrows the target set
      - setting repo defaults for `DYLINT_RUSTFLAGS` and `CARGO_INCREMENTAL`
    - kept the prebuilt wrapper thin: it still just resolves the packaged
    DotSlash entrypoint, keeps `rustup` shims first on `PATH`, infers
    `RUSTUP_HOME` when needed, and then launches the packaged `cargo-dylint`
    path
    - updated `justfile`, `rust-ci.yml`, and
    `tools/argument-comment-lint/README.md` to use the Python entrypoints
    - updated `rust-ci` so the package job runs Python syntax checks plus
    the new wrapper unit tests, and the OS-specific lint jobs invoke the
    wrappers through an explicit Python interpreter
    
    This is a follow-up to #16054: it keeps the current lint semantics while
    making the wrapper logic maintainable enough to iterate on safely.
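
    The argument handling listed above can be illustrated with a minimal
    sketch. The function names and the set of "narrowing" flags are
    assumptions made for illustration and are not taken from
    `wrapper_common.py`; only the `--` split and the `--all-targets` default
    come from the description.

```python
from typing import List, Tuple

# Assumption: flags that count as "explicitly narrowing the target set".
# The real wrapper may recognize a different list.
TARGET_SELECTORS = {"--lib", "--bins", "--tests", "--examples", "--benches", "--all-targets"}


def split_wrapper_args(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv at the first `--` into (lint_args, cargo_args)."""
    if "--" in argv:
        index = argv.index("--")
        return argv[:index], argv[index + 1:]
    return list(argv), []


def apply_target_default(lint_args: List[str], cargo_args: List[str]) -> List[str]:
    """Append `--all-targets` for non-`--fix` runs that do not narrow targets."""
    if "--fix" in lint_args:
        return list(cargo_args)
    if any(arg in TARGET_SELECTORS for arg in cargo_args):
        return list(cargo_args)
    return list(cargo_args) + ["--all-targets"]
```

    Under these assumptions, `-p codex-terminal-detection -- --lib` splits
    into the lint args `-p codex-terminal-detection` and the Cargo args
    `--lib`, and no `--all-targets` is appended because `--lib` already
    narrows the target set.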
    
    ## Validation
    
    - `python3 -m py_compile tools/argument-comment-lint/wrapper_common.py
    tools/argument-comment-lint/run.py
    tools/argument-comment-lint/run-prebuilt-linter.py
    tools/argument-comment-lint/test_wrapper_common.py`
    - `python3 -m unittest discover -s tools/argument-comment-lint -p
    'test_*.py'`
    - `python3 ./tools/argument-comment-lint/run-prebuilt-linter.py -p
    codex-terminal-detection -- --lib`
    - `python3 ./tools/argument-comment-lint/run.py -p
    codex-terminal-detection -- --lib`
  • Remove the legacy TUI split (#15922)
    This is part 1 of 2 PRs that will delete the `tui` /
    `tui_app_server` split. This part simply deletes the existing `tui`
    directory and marks the `tui_app_server` feature flag as removed. I left
    the `tui_app_server` feature flag in place for now so its presence
    doesn't result in an error. It is simply ignored.
    
    Part 2 will rename the `tui_app_server` directory to `tui`. I did this in
    two parts to reduce visible code churn.
  • ci: add Bazel clippy workflow for codex-rs (#15955)
    ## Why
    `bazel.yml` already builds and tests the Bazel graph, but `rust-ci.yml`
    still runs `cargo clippy` separately. This PR starts the transition to a
    Bazel-backed lint lane for `codex-rs` so we can eventually replace the
    duplicate Rust build, test, and lint work with Bazel while explicitly
    keeping the V8 Bazel path out of scope for now.
    
    To make that lane practical, the workflow also needs to look like the
    Bazel job we already trust. That means sharing the common Bazel setup
    and invocation logic instead of hand-copying it, and covering the arm64
    macOS path in addition to Linux.
    
    Landing the workflow green also required fixing the first lint findings
    that Bazel surfaced and adding the matching local entrypoint.
    
    ## What changed
    - add a reusable `build:clippy` config to `.bazelrc` and export
    `codex-rs/clippy.toml` from `codex-rs/BUILD.bazel` so Bazel can run the
    repository's existing Clippy policy
    - add `just bazel-clippy` so the local developer entrypoint matches the
    new CI lane
    - extend `.github/workflows/bazel.yml` with a dedicated Bazel clippy job
    for `codex-rs`, scoped to `//codex-rs/... -//codex-rs/v8-poc:all`
    - run that clippy job on Linux x64 and arm64 macOS
    - factor the shared Bazel workflow setup into
    `.github/actions/setup-bazel-ci/action.yml` and the shared Bazel
    invocation logic into `.github/scripts/run-bazel-ci.sh` so the clippy
    and build/test jobs stay aligned
    - fix the first Bazel-clippy findings needed to keep the lane green,
    including the cross-target `cmsghdr::cmsg_len` normalization in
    `codex-rs/shell-escalation/src/unix/socket.rs` and the no-`voice-input`
    dead-code warnings in `codex-rs/tui` and `codex-rs/tui_app_server`
    
    ## Verification
    - `just bazel-clippy`
    - `RUNNER_OS=macOS ./.github/scripts/run-bazel-ci.sh -- build
    --config=clippy --build_metadata=COMMIT_SHA=local-check
    --build_metadata=TAG_job=clippy -- //codex-rs/...
    -//codex-rs/v8-poc:all`
    - `bazel build --config=clippy
    //codex-rs/shell-escalation:shell-escalation`
    - `CARGO_TARGET_DIR=/tmp/codex4-shell-escalation-test cargo test -p
    codex-shell-escalation`
    - `ruby -e 'require "yaml";
    YAML.load_file(".github/workflows/bazel.yml");
    YAML.load_file(".github/actions/setup-bazel-ci/action.yml")'`
    
    ## Notes
    - `CARGO_TARGET_DIR=/tmp/codex4-tui-app-server-test cargo test -p
    codex-tui-app-server` still hits existing guardian-approvals test and
    snapshot failures unrelated to this PR's Bazel-clippy changes.
    
    Related: #15954
  • Add cached environment manager for exec server URL (#15785)
    Add an environment manager that is a singleton and is created early in
    app-server startup (before the skill manager and before config loading).
    
    Use an environment variable to point to a running exec server.
  • Use released DotSlash package for argument-comment lint (#15199)
    ## Why
    The argument-comment lint now has a packaged DotSlash artifact from
    [#15198](https://github.com/openai/codex/pull/15198), so the normal repo
    lint path should use that released payload instead of rebuilding the
    lint from source every time.
    
    That keeps `just clippy` and CI aligned with the shipped artifact while
    preserving a separate source-build path for people actively hacking on
    the lint crate.
    
    The current alpha package also exposed two integration wrinkles that the
    repo-side prebuilt wrapper needs to smooth over:
    - the bundled Dylint library filename includes the host triple, for
    example `@nightly-2025-09-18-aarch64-apple-darwin`, and Dylint derives
    `RUSTUP_TOOLCHAIN` from that filename
    - on Windows, Dylint's driver path also expects `RUSTUP_HOME` to be
    present in the environment
    
    Without those adjustments, the prebuilt CI jobs fail during `cargo
    metadata` or driver setup. This change makes the checked-in prebuilt
    wrapper normalize the packaged library name to the plain
    `nightly-2025-09-18` channel before invoking `cargo-dylint`, and it
    teaches both the wrapper and the packaged runner source to infer
    `RUSTUP_HOME` from `rustup show home` when the environment does not
    already provide it.
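
    As a rough illustration of the two adjustments, the sketch below strips
    the host triple from a library filename and fills in `RUSTUP_HOME` from
    `rustup show home`. The function names, the regular expression, and the
    example library name are assumptions; the checked-in wrapper may do this
    differently.

```python
import re
import subprocess
from typing import Dict

# Assumption: the toolchain part looks like `@nightly-YYYY-MM-DD-<host-triple>`.
_HOST_QUALIFIED = re.compile(r"@(nightly-\d{4}-\d{2}-\d{2})-[A-Za-z0-9_-]+")


def normalize_library_name(filename: str) -> str:
    """Drop the host triple so only the plain nightly channel remains."""
    return _HOST_QUALIFIED.sub(r"@\1", filename)


def ensure_rustup_home(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env with RUSTUP_HOME set, asking rustup if it is absent."""
    updated = dict(env)
    if not updated.get("RUSTUP_HOME"):
        result = subprocess.run(
            ["rustup", "show", "home"],
            capture_output=True,
            text=True,
            check=True,
        )
        updated["RUSTUP_HOME"] = result.stdout.strip()
    return updated
```

    With a hypothetical file named
    `libexample@nightly-2025-09-18-aarch64-apple-darwin.dylib`, the first
    function yields `libexample@nightly-2025-09-18.dylib`; a name that is
    already plain is returned unchanged.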
    
    After the prebuilt Windows lint job started running successfully, it
    also surfaced a handful of existing anonymous literal callsites in
    `windows-sandbox-rs`. This PR now annotates those callsites so the new
    cross-platform lint job is green on the current tree.
    
    ## What Changed
    - checked in the current
    `tools/argument-comment-lint/argument-comment-lint` DotSlash manifest
    - kept `tools/argument-comment-lint/run.sh` as the source-build wrapper
    for lint development
    - added `tools/argument-comment-lint/run-prebuilt-linter.sh` as the
    normal enforcement path, using the checked-in DotSlash package and
    bundled `cargo-dylint`
    - updated `just clippy` and `just argument-comment-lint` to use the
    prebuilt wrapper
    - split `.github/workflows/rust-ci.yml` so source-package checks live in
    a dedicated `argument_comment_lint_package` job, while the released lint
    runs in an `argument_comment_lint_prebuilt` matrix on Linux, macOS, and
    Windows
    - kept the pinned `nightly-2025-09-18` toolchain install in the prebuilt
    CI matrix, since the prebuilt package still relies on rustup-provided
    toolchain components
    - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` to
    normalize host-qualified nightly library filenames, keep the `rustup`
    shim directory ahead of direct toolchain `cargo` binaries, and export
    `RUSTUP_HOME` when needed for Windows Dylint driver setup
    - updated `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
    so future published DotSlash artifacts apply the same nightly-filename
    normalization and `RUSTUP_HOME` inference internally
    - fixed the remaining Windows lint violations in
    `codex-rs/windows-sandbox-rs` by adding the required `/*param*/`
    comments at the reported callsites
    - documented the checked-in DotSlash file, wrapper split, archive
    layout, nightly prerequisite, and Windows `RUSTUP_HOME` requirement in
    `tools/argument-comment-lint/README.md`
  • start of hooks engine (#13276)
    (Experimental)
    
    This PR adds a first MVP for hooks, with SessionStart and Stop.
    
    The core design is:
    
    - hooks live in a dedicated engine under codex-rs/hooks
    - each hook type has its own event-specific file
    - hook execution is synchronous and blocks normal turn progression while
    running
    - matching hooks run in parallel, then their results are aggregated into
    a normalized HookRunSummary
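
    The run-in-parallel-then-aggregate shape can be sketched as follows. This
    is an illustration in Python rather than the Rust engine itself, and the
    field names on `HookResult` and `HookRunSummary` are hypothetical; the
    real types live under codex-rs/hooks.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class HookResult:
    """Hypothetical per-hook outcome."""
    hook: str
    succeeded: bool
    detail: str


@dataclass
class HookRunSummary:
    """Hypothetical aggregate of every matching hook for one event."""
    event: str
    results: List[HookResult]

    @property
    def all_succeeded(self) -> bool:
        return all(result.succeeded for result in self.results)


# (hook name, event it is registered for, callable that runs the hook)
Hook = Tuple[str, str, Callable[[], str]]


def run_hooks(event: str, hooks: List[Hook]) -> HookRunSummary:
    matching = [(name, body) for name, hook_event, body in hooks if hook_event == event]

    def run_one(item: Tuple[str, Callable[[], str]]) -> HookResult:
        name, body = item
        try:
            return HookResult(name, True, body())
        except Exception as error:  # a failing hook is recorded, not raised
            return HookResult(name, False, str(error))

    # Leaving the `with` block waits for every hook, so the caller stays
    # blocked until all matching hooks have finished.
    with ThreadPoolExecutor(max_workers=max(1, len(matching))) as pool:
        results = list(pool.map(run_one, matching))
    return HookRunSummary(event, results)
```

    `pool.map` returns results in registration order even though the hooks
    run concurrently, which keeps the aggregated summary deterministic.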
    
    On the AppServer side, hooks are exposed as operational metadata rather
    than transcript-native items:
    
    - new live notifications: hook/started, hook/completed
    - persisted/replayed hook results live on Turn.hookRuns
    - we intentionally did not add hook-specific ThreadItem variants
    
    Hook messages are not persisted; they remain ephemeral. The context
    changes they add are persisted (they get appended to the user's prompt).
  • feat: discourage the use of the --all-features flag (#12429)
    ## Why
    
    Developers are frequently running low on disk space, and routine use of
    `--all-features` contributes to larger Cargo build caches in `target/`
    by compiling additional feature combinations.
    
    This change updates local workflow guidance to avoid `--all-features` by
    default and reserve it for cases where full feature coverage is
    specifically needed.
    
    ## What Changed
    
    - Updated `AGENTS.md` guidance for `codex-rs` to recommend `cargo test`
    / `just test` for full-suite local runs, and to call out the disk-usage
    cost of routine `--all-features` usage.
    - Updated the root `justfile` so `just fix` and `just clippy` no longer
    pass `--all-features` by default.
    - Updated `docs/install.md` to explicitly describe `cargo test
    --all-features` as an optional heavier-weight run (more build time and
    `target/` disk usage).
    
    ## Verification
    
    - Confirmed the `justfile` parses and the recipes list successfully with
    `just --list`.
  • bazel: enforce MODULE.bazel.lock sync with Cargo.lock (#11790)
    ## Why this change
    
    When Cargo dependencies change, it is easy to end up with an unexpected
    local diff in
    `MODULE.bazel.lock` after running Bazel. That creates noisy working
    copies and pushes lockfile fixes
    later in the cycle. This change addresses that pain point directly.
    
    ## What this change enforces
    
    The expected invariant is: after dependency updates, `MODULE.bazel.lock`
    is already in sync with
    Cargo resolution. In practice, running `bazel mod deps` should not
    mutate the lockfile in a clean
    state. If it does, the dependency update is incomplete.
    
    ## How this is enforced
    
    This change adds a single lockfile check script that snapshots
    `MODULE.bazel.lock`, runs
    `bazel mod deps`, and fails if the file changes. The same check is wired
    into local workflow
    commands (`just bazel-lock-update` and `just bazel-lock-check`) and into
    Bazel CI (Linux x86_64 job)
    so drift is caught early and consistently. The developer documentation
    is updated in
    `codex-rs/docs/bazel.md` and `AGENTS.md` to make the expected flow
    explicit.
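
    The snapshot-run-compare idea behind the check can be sketched in a few
    lines. This is not the script added by the PR; the function names are
    hypothetical and the command is passed in so the sketch can be exercised
    without Bazel installed.

```python
import hashlib
import subprocess
from pathlib import Path
from typing import Sequence


def file_digest(path: Path) -> str:
    """Hash the file contents so before/after states can be compared."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def lockfile_is_in_sync(lockfile: Path, command: Sequence[str]) -> bool:
    """Snapshot the lockfile, run the command, and report whether it changed."""
    before = file_digest(lockfile)
    subprocess.run(list(command), check=True)
    return file_digest(lockfile) == before


# Intended use from the repository root (not executed here):
#   lockfile_is_in_sync(Path("MODULE.bazel.lock"), ["bazel", "mod", "deps"])
```

    A `False` result corresponds to the failure case described above: the
    command mutated the lockfile, so the dependency update is incomplete.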
    
    `MODULE.bazel.lock` is also refreshed in this PR to match the current
    Cargo dependency resolution.
    
    ## Expected developer workflow
    
    After changing `Cargo.toml` or `Cargo.lock`, run `just
    bazel-lock-update`, then run
    `just bazel-lock-check`, and include any resulting `MODULE.bazel.lock`
    update in the same change.
    
    ## Testing
    
    Ran `just bazel-lock-check` locally.
  • feat: add --experimental to generate-ts (#10402)
    Add an `--experimental` flag to the `generate-ts` command in the
    app-server.
    
    It can be invoked with either of these two commands:
    ```
    just write-app-server-schema --experimental
    codex app-server generate-ts --experimental
    ```
  • feat: vendor app-server protocol schema fixtures (#10371)
    Similar to what @sayan-oai did in openai/codex#8956 for
    `config.schema.json`, this PR updates the repo so that it includes the
    output of `codex app-server generate-json-schema` and `codex app-server
    generate-ts` and adds a test to verify it is in sync with the current
    code.
    
    Motivation:
    - This makes any schema changes introduced by a PR transparent during
    code review.
    - In particular, this should help us catch PRs that would introduce a
    non-backwards-compatible change to the app schema (eventually, this
    should also be enforced by tooling).
    - Once https://github.com/openai/codex/pull/10231 is in to formalize the
    notion of "experimental" fields, we can work on ensuring the
    non-experimental bits are backwards-compatible.
    
    `codex-rs/app-server-protocol/tests/schema_fixtures.rs` was added as the
    test and `just write-app-server-schema` can be used to generate the
    vendored schema files.
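
    A drift check of this kind boils down to comparing freshly generated
    output with the vendored files. The sketch below is in Python for
    brevity (the actual test is the Rust file named above); the function name
    and the in-memory `generated` mapping are assumptions.

```python
from pathlib import Path
from typing import Dict, List


def stale_fixtures(generated: Dict[str, str], fixture_dir: Path) -> List[str]:
    """List relative paths that are missing, different, or no longer generated."""
    stale = []
    for relative, expected in generated.items():
        target = fixture_dir / relative
        if not target.is_file() or target.read_text() != expected:
            stale.append(relative)
    on_disk = {
        path.relative_to(fixture_dir).as_posix()
        for path in fixture_dir.rglob("*")
        if path.is_file()
    }
    # Vendored files that the generator no longer emits also count as drift.
    stale.extend(on_disk - set(generated))
    return sorted(stale)
```

    An empty list means the vendored schema matches the generator; anything
    else is the set of files a regeneration would touch.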
    
    Incidentally, when I run:
    
    ```
    rg _ codex-rs/app-server-protocol/schema/typescript/v2
    ```
    
    I see a number of `snake_case` names that should be `camelCase`.
  • feat: log db client (#10087)
    ```
    just log -h
    if [ "${1:-}" = "--" ]; then shift; fi; cargo run -p codex-state --bin logs_client -- "$@"
        Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.21s
         Running `target/debug/logs_client -h`
    Tail Codex logs from state.sqlite with simple filters
    
    Usage: logs_client [OPTIONS]
    
    Options:
          --codex-home <CODEX_HOME>  Path to CODEX_HOME. Defaults to $CODEX_HOME or ~/.codex [env: CODEX_HOME=]
          --db <DB>                  Direct path to the SQLite database. Overrides --codex-home
          --level <LEVEL>            Log level to match exactly (case-insensitive)
          --from <RFC3339|UNIX>      Start timestamp (RFC3339 or unix seconds)
          --to <RFC3339|UNIX>        End timestamp (RFC3339 or unix seconds)
          --module <MODULE>          Substring match on module_path
          --file <FILE>              Substring match on file path
          --backfill <BACKFILL>      Number of matching rows to show before tailing [default: 200]
          --poll-ms <POLL_MS>        Poll interval in milliseconds [default: 500]
      -h, --help                     Print help
    ```
  • feat: add bazel-codex entry to justfile (#9177)
    This is less straightforward than I realized, so I created an entry for
    this in our `justfile`.
    
    Verified that running `just bazel-codex` from anywhere in the repo uses
    the user's `$PWD` as the working directory in which Codex runs.
    
    While here, updated the `MODULE.bazel.lock`, though it looks like I need
    to add a CI job that runs `bazel mod deps --lockfile_mode=error` or
    something.
  • add generated jsonschema for config.toml (#8956)
    ### What
    Add JSON Schema generation for `config.toml`, with checked‑in
    `docs/config.schema.json`. We can move the schema elsewhere if preferred
    (and host it if there's demand).
    
    Add fixture test to prevent drift and `just write-config-schema` to
    regenerate on schema changes.
    
    Generate MCP config schema from `RawMcpServerConfig` instead of
    `McpServerConfig` because that is the runtime type used for
    deserialization.
    
    Populate feature flag values into generated schema so they can be
    autocompleted.
    
    ### Tests
    Added tests + regenerate script to prevent drift. Tested autocompletions
    using generated jsonschema locally with Even Better TOML.
    
    
    
    https://github.com/user-attachments/assets/5aa7cd39-520c-4a63-96fb-63798183d0bc
  • feat: add support for building with Bazel (#8875)
    This PR configures Codex CLI so it can be built with
    [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
    includes configuration so that remote builds can be done using
    [BuildBuddy](https://www.buildbuddy.io).
    
    If you are familiar with Bazel, things should work as you expect, e.g.,
    run `bazel test //... --keep-going` to run all the tests in the repo,
    but we have also added some new aliases in the `justfile` for
    convenience:
    
    - `just bazel-test` to run tests locally
    - `just bazel-remote-test` to run tests remotely (currently, the remote
    build is for x86_64 Linux regardless of your host platform). Note we are
    currently seeing the following test failures in the remote build, so we
    still need to figure out what is happening here:
    
    ```
    failures:
        suite::compact::manual_compact_twice_preserves_latest_user_messages
        suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
        suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
    ```
    
    - `just build-for-release` to build release binaries for all
    platforms/architectures remotely
    
    To set up remote execution:
    - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
    employees should also request org access at
    https://openai.buildbuddy.io/join/ with their `@openai.com` email
    address.)
    - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
    `~/.bazelrc` (add the line `build
    --remote_header=x-buildbuddy-api-key=YOUR_KEY`)
    - Use `--config=remote` in your `bazel` invocations (or add `common
    --config=remote` to your `~/.bazelrc`, or use the `just` commands)
    
    ## CI
    
    In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
    uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
    (we are working on supporting Windows, but that is not ready yet). Note
    that the failures we are seeing in `just bazel-remote-test` do not occur
    on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
    is green right now.
    
    The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
    that macOS CI jobs build _remotely_ on Linux hosts (using the
    `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
    root `BUILD.bazel`) using cross-compilation to build the macOS
    artifacts. Then these artifacts are downloaded locally to GitHub's macOS
    runner so the tests can be executed natively. This is the relevant
    config that enables this:
    
    ```
    common:macos --config=remote
    common:macos --strategy=remote
    common:macos --strategy=TestRunner=darwin-sandbox,local
    ```
    
    Because of the remote caching benefits we get from BuildBuddy, these new
    CI jobs can be extremely fast! For example, consider these two jobs that
    ran all the tests on Linux x86_64:
    
    - Bazel 1m37s
    https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
    - Cargo 9m20s
    https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875
    
    For now, we will continue to run both the Bazel and Cargo jobs for PRs,
    but once we add support for Windows and running Clippy, we should be
    able to cut over to using Bazel exclusively for PRs, which should still
    speed things up considerably. We will probably continue to run the Cargo
    jobs post-merge for commits that land on `main` as a sanity check.
    
    Release builds will also continue to be done by Cargo for now.
    
    Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
    Earlier attempt to add support for Buck2, now abandoned:
    https://github.com/openai/codex/pull/8504
    
    ---------
    
    Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • chore: silent just fmt (#8820)
    Done to keep spammy warnings such as the following out of the model
    context without having to switch to nightly:
    ```
    Warning: can't set `imports_granularity = Item`, unstable features are only available in nightly channel.
    ```
  • Move justfile to repository root (#7652)
    ## Summary
    - move the workspace justfile to the repository root for easier
    discovery
    - set the just working directory to codex-rs so existing recipes still
    run in the Rust workspace
    
    ## Testing
    - not run (not requested)
    
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_69334db473108329b0cc253b7fd8218e)