33 Commits

  • [codex] group blocking and postmerge CI workflows (#30146)
    ## Why
    
    It's hard to change the set of required jobs when they're managed in the
    GitHub UI, and when each workflow is responsible for choosing its own
    scheduling, it's easy to end up with skew between what we enforce on PRs
    vs. on main.
    
    ## What
    
    - add a `blocking-ci` caller workflow, triggered by pull requests and
    pushes to `main`, for Bazel, blob size, cargo-deny, Codespell,
    `repo-checks`, rust CI, and SDK CI
    - add an `always()` terminal job named `CI required` that fails unless
    every called workflow succeeds
    - add a `postmerge-ci` caller workflow for `rust-ci-full` and
    `v8-canary`, with a terminal `Postmerge CI results` job
    - centralize V8 relevance detection in `v8_canary_changes.py`; unrelated
    PR and postmerge runs execute metadata only and skip the expensive build
    matrices
    - leave `v8-canary` outside the blocking gate and leave the external
    `cla` check independent
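
    A minimal sketch of the `CI required` gate logic, written in Python
    rather than workflow YAML. The job ids are hypothetical stand-ins for
    the called workflows, and treating an empty result set as a failure is
    an assumption of this sketch, not something this PR states.

```python
from typing import Mapping


def ci_required_passed(results: Mapping[str, str]) -> bool:
    """Return True only when every called workflow reported "success".

    Failed, cancelled, and skipped results all count as not passing, which
    is why the terminal job has to run under `always()` to see them.
    """
    # Assumption: an empty mapping means the gate was miswired, so fail.
    return bool(results) and all(r == "success" for r in results.values())


# Hypothetical job ids standing in for the called workflows.
all_green = {"bazel": "success", "cargo-deny": "success", "rust-ci": "success"}
one_skipped = {**all_green, "sdk-ci": "skipped"}

print(ci_required_passed(all_green))
print(ci_required_passed(one_skipped))
```

    The point of the single terminal job is that the ruleset only has to
    name one required context, while the set of called workflows can change
    in the repository.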
    
    ## Rollout
    
    A repository admin must replace the existing required GitHub Actions
    contexts with `CI required` in the main-branch ruleset. Retain `cla` as
    a separate required check. Until that change is coordinated, this PR
    cannot satisfy the old standalone check names. In-flight PRs will need
    to be rebased after this lands.
  • ci: template custom runner names by repo (#27024)
    ## Why
    
    These workflows currently hard-code the `codex` runner group and custom
    runner labels. That makes the same workflow definitions less portable
    across repository copies or renamed repos, even though the runner fleet
    follows the repository name scheme. Template the runner identities from
    the repository name so `openai/codex` still resolves to the existing
    `codex-*` runners while other repos can use their own `<repo>-*` runner
    names.
    
    ## What Changed
    
    - Replaced custom runner `group` values such as `codex-runners` with
    `${{ github.event.repository.name }}-runners`.
    - Replaced custom runner labels such as `codex-linux-x64` and
    `codex-windows-arm64` with `${{ github.event.repository.name }}-...`.
    - Covered direct `runs-on` objects, matrix `runs_on` entries, reusable
    workflow runner inputs, and release runner labels.
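
    The naming scheme can be sketched in Python as below. The helper names
    are hypothetical; only the `<repo>-runners` and `<repo>-<platform>`
    shapes come from this change.

```python
def runner_group(repo_name: str) -> str:
    """Mirror `${{ github.event.repository.name }}-runners`."""
    return f"{repo_name}-runners"


def runner_label(repo_name: str, platform: str) -> str:
    """Mirror `${{ github.event.repository.name }}-<platform>`."""
    return f"{repo_name}-{platform}"


# `openai/codex` has the repository name `codex`, so the existing runner
# names are unchanged; "my-fork" is a hypothetical renamed copy.
print(runner_group("codex"))
print(runner_label("codex", "linux-x64"))
print(runner_label("my-fork", "windows-arm64"))
```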
    
    ## Verification
    
    - Parsed all `.github/workflows/*.yml` files as YAML with Ruby.
    - Searched `.github/workflows` to confirm no hardcoded runner-field
    `codex-runners` or `codex-*` labels remain.
  • ci: use bazel environment for BuildBuddy secret (#26895)
    ## Why
    
    `BUILDBUDDY_API_KEY` now lives in the `bazel` GitHub Actions environment
    as an environment secret. Jobs that need BuildBuddy credentials must opt
    into that environment so `${{ secrets.BUILDBUDDY_API_KEY }}` resolves
    from the protected environment secret instead of relying on an unscoped
    repository/organization secret.
    
    This follows the same environment-secret migration pattern as #26466.
    
    ## What Changed
    
    - Attach each workflow job that reads `BUILDBUDDY_API_KEY` to the
    `bazel` environment.
    - Set `deployment: false` on those job-level environment blocks.
    
    `deployment: false` lets the job enter the `bazel` environment to access
    its environment secrets without creating GitHub deployment records for
    these CI jobs. That keeps the environment as a secret/access-control
    boundary without making ordinary Bazel CI runs look like deploys.
    
    ## Validation
    
    - Parsed the modified workflow YAML files with Ruby's YAML parser.
    - Checked the modified workflow files for trailing whitespace.
  • build: use ThinLTO for release binaries (#23710)
    ## Why
    
    Fat LTO makes release builds substantially slower without providing
    enough measured runtime benefit to justify the release CI long pole. The
    build-profile investigation found that keeping Cargo's default release
    `opt-level=3` and switching from fat LTO to ThinLTO (`3/thin/1`) reduced
    a clean `codex-cli` release build from 2073.893 seconds to 1243.172
    seconds, a 40.06% improvement.
    
    The resulting binary increased from 196.7 MiB to 211.8 MiB (+7.63%).
    Measured runtime changes were small: the worst image workload median was
    +0.86% and app-server startup was +0.31% relative to fat LTO. ThinLTO
    retains cross-crate optimization while avoiding most of the fat-LTO
    build cost.
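
    The percentages above can be rechecked from the quoted figures with a
    short Python calculation. The size figure is recomputed from the
    rounded MiB values, so it lands slightly above the reported +7.63%,
    which was presumably derived from unrounded sizes.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Clean `codex-cli` release build time in seconds: fat LTO vs. ThinLTO.
build_time = pct_change(2073.893, 1243.172)

# Binary size in MiB, using the rounded values quoted above.
binary_size = pct_change(196.7, 211.8)

print(f"build time:  {build_time:+.2f}%")
print(f"binary size: {binary_size:+.2f}%")
```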
    
    This deliberately avoids global size optimization: final-executable
    testing showed a substantial regression on the image request path, which
    is expected to become more important as image usage grows.
    
    ## What changed
    
    - Set the workspace release profile to `lto = "thin"`, retaining Cargo's
    default release `opt-level=3`.
    - Remove release and CI workflow-specific LTO overrides so
    release-profile builds consistently use the workspace setting.
    - Remove the now-unused Windows release workflow input and related
    diagnostic output.
    
    ## Validation
    
    - Confirmed the release profile parses with `cargo metadata --no-deps
    --format-version 1`.
    - CI validates release builds across the supported target matrix.
  • Remove libubsan CI workaround (#24782)
    It seems that this was added to allow rustc to load proc macros that had
    been compiled with UBSan enabled, which zig does for debug and
    `ReleaseSafe` builds. When zig drives the link of the final binary it
    knows to include the ubsan runtime, but our zig-built artifacts are
    being linked into a binary whose linking rustc drives. This removes the
    libubsan workaround we have and replaces it with
    `-fno-sanitize=undefined` passed to zig.
    
    The new argument is passed at the end of zig's args, so it should take
    precedence over any earlier arguments from the script's caller.
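
    A toy model of the "last flag wins" reasoning, in Python. It assumes
    the sanitizer flags are resolved left to right, as the paragraph above
    expects; the helper names are hypothetical and this is not the actual
    wrapper script.

```python
from typing import Iterable, List

UBSAN_ON = "-fsanitize=undefined"
UBSAN_OFF = "-fno-sanitize=undefined"


def with_ubsan_disabled(caller_args: Iterable[str]) -> List[str]:
    """Append the opt-out last so it overrides anything the caller passed."""
    return [*caller_args, UBSAN_OFF]


def ubsan_enabled(args: Iterable[str], default: bool = True) -> bool:
    """Resolve the flags left to right; the last matching flag decides."""
    enabled = default
    for arg in args:
        if arg == UBSAN_ON:
            enabled = True
        elif arg == UBSAN_OFF:
            enabled = False
    return enabled


caller_args = ["cc", "-O1", UBSAN_ON]
print(ubsan_enabled(caller_args))
print(ubsan_enabled(with_ubsan_disabled(caller_args)))
```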
  • Add app-server startup benchmark crate (#24651)
    ## Summary
    - Add a new `app-server-start-bench` crate to measure app-server startup
    performance
    - Wire the benchmark into the workspace and Bazel build so it can be run
    consistently
    - Update lockfiles and repo automation to account for the new package
  • Uprev Rust toolchain pins to 1.95.0 (#24684)
    ## Summary
    - Bump the workspace Rust toolchain from `1.93.0` to `1.95.0` across
    Cargo, Bazel, CI, release workflows, devcontainers, and the Codex
    environment config.
    - Refresh `MODULE.bazel.lock` so the Bazel Rust toolchain artifacts
    match the new version.
    - Leave purpose-specific toolchains unchanged, including the
    `argument-comment-lint` nightly and the upstream `rusty_v8` `1.91.0`
    build pin.
    - Includes fixes for new lints from `just fix` and a few codex-authored
    fixes for lints without a suggestion.
  • [codex] Add image re-encoding benchmarks (#23935)
    ## Summary
    - add Divan benchmarks for prompt image re-encoding paths
    - wire the image benchmark smoke test into Rust CI workflows
    
    ## Why
    Image prompt handling includes re-encoding work that benefits from
    repeatable benchmark coverage so changes can be measured in CI and
    locally.
    
    This already helped identify a potential regression from changing compiler flags.
    
    ## Impact
    Developers can run and compare the new image re-encoding benchmarks, and
    CI exercises the benchmark target via the Rust benchmark smoke test.
  • ci: Use codex produced v8 artifacts for release builds (#23934)
    Updates our build script to pull down the prebuilt V8 artifacts, as we
    already do in CI, when building V8 into our targets.
    
    This changes the flow so that the release workflow now pre-installs the
    prebuilt rusty_v8 assets for all of our release targets.
    Secondarily, when the script is run locally and the user hasn't set the
    proper values, the Python script now optionally pulls the assets down
    and provides those values itself.
    
    Sorry for the miss here.
  • Fan out rust-ci-full nextest by platform (#23358)
    ## Why
    
    `rust-ci-full` was paying the full Cargo nextest build-and-run cost once
    per platform, with Windows ARM64 as the long pole. This change moves the
    heavy work into one reusable per-platform flow: build a nextest archive
    once, then replay it across four shards so the platform lane spends less
    time running tests serially. For Windows ARM64, the archive is
    cross-compiled on Windows x64 and replayed on native Windows ARM64
    shards so the slow ARM64 machine is used for execution rather than
    compilation.
    
    ## What changed
    
    - split the `rust-ci-full` nextest matrix into five explicit
    per-platform reusable-workflow calls
    - add `.github/workflows/rust-ci-full-nextest-platform.yml` to build one
    archive, upload timings/helpers, replay four nextest shards, upload
    per-shard JUnit, and roll the shard status back up per platform
    - add Windows CI helpers for Dev Drive setup and MSVC ARM64 linker
    environment export so the Windows ARM64 archive can be produced on
    Windows x64
    - keep the existing Cargo git CLI fetch hardening inside the reusable
    workflow, since caller workflow-level `env` does not flow through
    `workflow_call`
    - document the archive-backed shard shape in
    `.github/workflows/README.md`
    - raise the default nextest slow timeout to 30s so the sharded full-CI
    path does not treat every >15s test as stuck
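
    The build-once, replay-in-shards shape can be illustrated with a small
    Python sketch. It uses a simple round-robin split over sorted test
    names; that is an assumption for illustration, and nextest's own
    partitioning may assign tests differently. The test names are
    hypothetical.

```python
from typing import List


def shard(tests: List[str], index: int, total: int) -> List[str]:
    """Return the tests for 1-based shard `index` out of `total` shards."""
    if not 1 <= index <= total:
        raise ValueError(f"shard index {index} not in 1..{total}")
    ordered = sorted(tests)  # every shard must see the same ordering
    return [t for i, t in enumerate(ordered) if i % total == index - 1]


# Hypothetical test list taken from one prebuilt archive.
tests = [f"suite::case_{n:02d}" for n in range(10)]
shards = [shard(tests, i, 4) for i in range(1, 5)]

for i, names in enumerate(shards, start=1):
    print(f"shard {i}/4: {len(names)} tests")
```

    Because every shard replays the same archive, each test runs exactly
    once across the four shards and no shard pays the compile cost again.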
    
    ## Verification
    
    - validated the archive/shard flow with live GitHub Actions runs on this
    PR branch
    - Windows ARM64 cross-compile latency on completed runs:
      - https://github.com/openai/codex/actions/runs/26118759651: `34m30s`
      lane e2e, `17m16s` archive build, `9m55s` shard phase
      - https://github.com/openai/codex/actions/runs/26120777976: `30m36s`
      lane e2e, `17m21s` archive build, `6m50s` shard phase
    - comparable pre-cross-compile sharded Windows ARM64 runs were `55m01s`,
    `50m21s`, and `46m42s`, so the completed cross-compile runs improved the
    lane by roughly `12m` to `24m` versus the prior range
    - latest corrected cross-compile run:
    https://github.com/openai/codex/actions/runs/26120777976
      - Windows ARM64 archive built successfully on Windows x64
      - native Windows ARM64 shards started immediately after the archive
      upload
      - 3/4 Windows ARM64 shards passed; the failing shard hit the same
      existing `code_mode` test failure seen outside this lane
    - downloaded failed-shard JUnit XML from the validation runs and
    confirmed the remaining red is from known test failures, not
    archive/shard wiring
    - no local Codex tests run per repo guidance
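
    The "roughly `12m` to `24m`" range follows from the lane durations
    listed above; a short Python check of that arithmetic, with
    hypothetical helper names:

```python
import re


def seconds(text: str) -> int:
    """Parse a duration written like `34m30s` into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def fmt(total: int) -> str:
    """Format seconds back into the `MmSSs` style used above."""
    minutes, secs = divmod(total, 60)
    return f"{minutes}m{secs:02d}s"


before = [seconds(t) for t in ("55m01s", "50m21s", "46m42s")]
after = [seconds(t) for t in ("34m30s", "30m36s")]

smallest_gain = min(before) - max(after)  # best old run vs. worst new run
largest_gain = max(before) - min(after)   # worst old run vs. best new run
print(fmt(smallest_gain), fmt(largest_gain))  # 12m12s 24m25s
```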
    
    ## Notes
    
    - this PR does not change developers.openai.com documentation
  • Reduce rust-ci-full Windows nextest timeout flakes (#23253)
    ## Why
    Recent `rust-ci-full` failures were dominated by transient Windows
    timeout clusters in process-heavy tests such as `suite::resume`,
    `suite::cli_stream`, `suite::auth_env`,
    `start_thread_uses_all_default_environments_from_codex_home`, and
    `connect_stdio_command_initializes_json_rpc_client_on_windows`.
    
    The goal here is to make those known flaky paths less likely to fail
    full CI without relaxing the global nextest timeout policy.
    
    ## What changed
    - Enable one global nextest retry with `retries = 1` so a single
    transient failure can recover.
    - Add a `windows_process_heavy` test group with `max-threads = 2` for
    the recurring Windows subprocess/session-heavy timeout families.
    - Add Windows-only slow-timeout overrides for that process-heavy group.
    - Add a narrower Windows-only timeout override for
    `start_thread_uses_all_default_environments_from_codex_home`, which
    still exceeded the broader Windows bucket in both Windows full-CI lanes.
    - Increase the `rust-ci-full` nextest job timeout from `45m` to `60m` so
    Windows ARM64 still has job-level headroom after retries and targeted
    per-test timeout increases.
    - Keep the global `slow-timeout` unchanged at `15s`.
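
    What `retries = 1` buys can be sketched in Python: a test that fails
    once and then passes is reported as flaky rather than failed. This is
    a simplified model of the retry policy, not nextest's implementation.

```python
from typing import Callable


def run_with_retries(test: Callable[[], bool], retries: int = 1) -> str:
    """Run `test` up to `retries + 1` times and classify the outcome."""
    for attempt in range(retries + 1):
        if test():
            # A pass on the first attempt is clean; a later pass is flaky.
            return "pass" if attempt == 0 else "flaky"
    return "fail"


# Simulated transient timeout: first attempt fails, the retry passes.
outcomes = iter([False, True])
status = run_with_retries(lambda: next(outcomes))
print(status)  # flaky
```

    A test that fails on both attempts still fails the run, so one retry
    absorbs a single transient timeout without masking hard failures.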
    
    ## Validation
    Validated through `rust-ci-full` GitHub Actions reruns on this PR.
    
    Observed improvement on the tuned Windows lanes:
    - Windows x64 went from `5 timed out` to `0 timed out`.
    - Windows ARM64 went from `2 timed out` to `0 timed out`.
    - `start_thread_uses_all_default_environments_from_codex_home` recovered
    as a flaky pass on Windows ARM64 instead of timing out.
    
    The remaining failing tests in those runs were unrelated hard failures
    outside this nextest timeout tuning.
  • Upload rust full CI JUnit reports (#23273)
    ## Why
    
    `rust-ci-full` failures currently leave downstream investigation
    reconstructing basic test facts from raw logs. `cargo nextest` can emit
    standard JUnit XML for each lane, which gives us a small structured
    artifact for post-run failure analysis without changing the test
    execution model.
    
    ## What changed
    
    - enable nextest JUnit output in `codex-rs/.config/nextest.toml`
    - upload the lane-scoped JUnit XML artifact from each `rust-ci-full`
    test lane
    
    ## Verification
    
    - `rust-ci-full` run `26018931531` on head
    `52d77c60e79b36859d944ef28a36b014055c5c48` produced JUnit artifacts for
    macOS, Linux x64 remote, Windows x64, and Windows ARM64 test lanes
    - `rust-ci-full` run `26021241006` on the same head produced the missing
    Linux ARM JUnit artifact after the first run lost that runner before
    export
    - downloaded all five lane JUnit artifacts and verified each contains
    non-empty test counters and failure data
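
    As a sketch of the post-run analysis these artifacts enable, the
    Python below pulls test counts and failing test names out of a JUnit
    document using only the standard library. The sample XML is a
    hypothetical, trimmed-down report, not output captured from these
    runs.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical sample in the common JUnit shape.
SAMPLE = """<testsuites name="nextest-run" tests="3" failures="1" errors="0">
  <testsuite name="codex-core" tests="3" failures="1" errors="0">
    <testcase name="suite::resume" classname="codex-core"/>
    <testcase name="suite::cli_stream" classname="codex-core"/>
    <testcase name="suite::auth_env" classname="codex-core">
      <failure message="timed out"/>
    </testcase>
  </testsuite>
</testsuites>"""


def summarize(xml_text: str) -> Tuple[int, List[str]]:
    """Return (number of test cases, names of failed or errored cases)."""
    root = ET.fromstring(xml_text)
    cases = list(root.iter("testcase"))
    failed = [
        case.get("name", "")
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    ]
    return len(cases), failed


total, failed = summarize(SAMPLE)
print(total, failed)
```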
  • Enable --deny-warnings for cargo shear (#21616)
    ## Summary
    
    In https://github.com/openai/codex/pull/21584, we disabled doctests for
    crates that lack any doctests. We can enforce that property via `cargo
    shear --deny-warnings`: crates that lack doctests will be flagged if
    doctests are enabled, and crates with doctests will be flagged if
    doctests are disabled.
    
    A few additional notes:
    
    - By adding `--deny-warnings`, `cargo shear` also flagged a number of
    modules that were not reachable at all. Some of those have been removed.
    - This PR removes a usage of `windows_modules!` (since `cargo shear` and
    `rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os =
    "windows")]` attributes. As a consequence, many of these files exhibit churn
    in this PR, since they weren't being formatted by `rustfmt` at all on
    main.
    - Again, to make the code more analyzable, this PR also removes some
    usages of `#[path = "cwd_junction.rs"]` in favor of a more standard
    module structure. The bin sidecar structure is still retained, but,
    e.g., `windows-sandbox-rs/src/bin/command_runner.rs` was moved to
    `windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Address some more GHA hygiene issues (#21622)
    This does two things:
    
    - We use `persist-credentials: false` everywhere now. This is
    unfortunately not the default in GitHub Actions, but it prevents
    `actions/checkout` from dropping `secrets.GITHUB_TOKEN` onto disk.
    - We interpose (some) template expansions through environment variables.
    I've limited this to contexts that have non-fixed values; contexts that
    are fixed (like `*.result`) are not dangerous to expand directly inline
    (but maybe we should clean those up in the future for consistency
    anyways).
    
    This is a medium-risk change in terms of CI breakage: I did a scan for
    usage of `git push` and other commands that implicitly use the persisted
    credential, but couldn't find any. Even still, some implicit usages of
    the persisted credentials may be lurking. Please ping ww@ if any issues
    arise.
  • Use CARGO_NET_GIT_FETCH_WITH_CLI in rust-ci-full for more reliable git fetches (#21628)
    Cargo uses libgit2 by default. In uv, we gave up on this entirely and
    always call out to the git CLI because it is much more reliable. This is
    a part of my attempt to reduce flakes in `rust-ci-full`.
  • Fix rust-ci-full failures due to missing bwrap (#21604)
    Since https://github.com/openai/codex/pull/21255, `rust-ci-full` has
    been failing due to a missing `bwrap`.
    
    ```
    thread 'main' panicked at linux-sandbox/src/launcher.rs:43:13:
    bubblewrap is unavailable: no system bwrap was found on PATH and no bundled codex-resources/bwrap binary was found next to the Codex executable
    ```
    
    Since the happy path is now to use the system binary, let's ensure
    that's installed.
    
    
    https://github.com/openai/codex/pull/21604/commits/8d5182663158ee2d15965f39eed26ffa339ecb7d
    was necessary for the `bwrap` executable to be discoverable when the
    working directory is `/`.
    
    I ran `rust-ci-full` at
    https://github.com/openai/codex/actions/runs/25528074506
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Use --locked in cargo build and lint invocations (#21602)
    This ensures CI fails if the committed lockfile is outdated.
  • [codex] Fully qualify hash-pins in GitHub Actions (#21436)
    This builds on top of https://github.com/openai/codex/pull/15828 by
    ensuring that hash-pinned actions with version comments are fully
    qualified, rather than referencing floating/mutable comments like "v7".
    This makes actions management tools behave more consistently.
    
    This shouldn't break anything, since it's comment only. But if it does,
    ping ww@ 🙂
  • Upgrade cargo-shear to 1.11.2 (#21547)
    ## Summary
    
    Catches a few additional dependencies (`sha2`, `url`) that should be in
    `dev-dependencies`.
  • test: set Rust test thread stack size (#19067)
    ## Summary
    
    Set `RUST_MIN_STACK=8388608` for Rust test entry points so
    libtest-spawned test threads get an 8 MiB stack.
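
    The value is 8 MiB expressed in bytes. A small Python check, plus a
    sketch of how a wrapper could pass the variable to a child test
    process (the wrapper itself is hypothetical):

```python
import os

RUST_MIN_STACK = 8 * 1024 * 1024  # 8 MiB in bytes
print(RUST_MIN_STACK)  # 8388608

# Environment a hypothetical wrapper would hand to the test command, for
# example via subprocess.run(cmd, env=child_env).
child_env = {**os.environ, "RUST_MIN_STACK": str(RUST_MIN_STACK)}
print(child_env["RUST_MIN_STACK"])
```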
    
    The Windows BuildBuddy failure on #18893 showed
    `//codex-rs/tui:tui-unit-tests` exiting with a stack overflow in a
    `#[tokio::test]` even though later test binaries in the shard printed
    successful summaries. Default `#[tokio::test]` uses a current-thread
    Tokio runtime, which means the async test body is driven on libtest's
    std-spawned test thread. Increasing the test thread stack addresses that
    failure mode directly.
    
    To date, we have been fixing these stack-pressure problems with
    localized future-size reductions, such as #13429, and by adding
    `Box::pin()` in specific async wrapper chains. This gives us a baseline
    test-runner stack size instead of continuing to patch individual tests
    only after CI finds another large async future.
    
    ## What changed
    
    - Added `common --test_env=RUST_MIN_STACK=8388608` in `.bazelrc` so
    Bazel test actions receive the env var through Bazel's cache-keyed test
    environment path.
    - Set the same `RUST_MIN_STACK` value for Cargo/nextest CI entry points
    and `just test`.
    - Annotated the existing Windows Bazel linker stack reserve as 8 MiB so
    it stays aligned with the libtest thread stack size.
    
    ## Testing
    
    - `just --list`
    - parsed `.github/workflows/rust-ci.yml` and
    `.github/workflows/rust-ci-full.yml` with Ruby's YAML loader
    - compared `bazel aquery` `TestRunner` action keys before/after explicit
    `--test_env=RUST_MIN_STACK=...` and after moving the Bazel env to
    `.bazelrc`
    - `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors`
    - failed locally on the existing sandbox-specific status snapshot
    permission mismatch, but loaded the Starlark changes and ran the TUI
    test shards
  • Reuse remote exec-server in core tests (#17837)
    ## Summary
    - reuse a shared remote exec-server for remote-aware codex-core
    integration tests within a test binary process
    - keep per-test remote cwd creation and cleanup so tests retain
    workspace isolation
    - leave codex_self_exe, codex_linux_sandbox_exe, cwd_path(), and
    workspace_path() behavior unchanged
    
    ## Validation
    - rustfmt codex-rs/core/tests/common/test_codex.rs
    - git diff --check
    - CI is running on the updated branch
  • fix: pin inputs (#17471)
    ## Summary
    - Pin Rust git patch dependencies to immutable revisions and make
    cargo-deny reject unknown git and registry sources unless explicitly
    allowlisted.
    - Add checked-in SHA-256 coverage for the current rusty_v8 release
    assets, wire those hashes into Bazel, and verify CI override downloads
    before use.
    - Add rusty_v8 MODULE.bazel update/check tooling plus a Bazel CI guard
    so future V8 bumps cannot drift from the checked-in checksum manifest.
    - Pin release/lint cargo installs and all external GitHub Actions refs
    to immutable inputs.
    
    ## Future V8 bump flow
    Run these after updating the resolved `v8` crate version and checksum
    manifest:
    
    ```bash
    python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
    python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
    ```
    
    The update command rewrites the matching `rusty_v8_<crate_version>`
    `http_file` SHA-256 values in `MODULE.bazel` from
    `third_party/v8/rusty_v8_<crate_version>.sha256`. The check command is
    also wired into Bazel CI to block drift.
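
    The drift check boils down to comparing SHA-256 digests against a
    checked-in manifest. A minimal Python sketch of that comparison
    follows; it assumes a `sha256sum`-style manifest (`<digest>  <name>`
    per line), which may not match the real file format, and it is not
    the code in `rusty_v8_bazel.py`. The asset name is hypothetical.

```python
import hashlib
from typing import Dict


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse `<hex digest>  <file name>` lines into {name: digest}."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        entries[name.lstrip("*")] = digest.lower()
    return entries


def verify(name: str, data: bytes, manifest: Dict[str, str]) -> bool:
    """True only if `name` is listed and its SHA-256 matches the manifest."""
    expected = manifest.get(name)
    return expected is not None and hashlib.sha256(data).hexdigest() == expected


# Hypothetical asset and manifest, built here so the example is runnable.
asset = b"example archive bytes"
manifest = parse_manifest(f"{hashlib.sha256(asset).hexdigest()}  example.a.gz\n")

print(verify("example.a.gz", asset, manifest))
print(verify("example.a.gz", asset + b"tampered", manifest))
```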
    
    ## Notes
    - This intentionally excludes RustSec dependency upgrades and
    bubblewrap-related changes per request.
    - The branch was rebased onto the latest origin/main before opening the
    PR.
    
    ## Validation
    - cargo fetch --locked
    - cargo deny check advisories
    - cargo deny check
    - cargo deny check sources
    - python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
    - python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
    - python3 -m unittest discover -s .github/scripts -p
    'test_rusty_v8_bazel.py'
    - python3 -m py_compile .github/scripts/rusty_v8_bazel.py
    .github/scripts/rusty_v8_module_bazel.py
    .github/scripts/test_rusty_v8_bazel.py
    - repo-wide GitHub Actions `uses:` audit: all external action refs are
    pinned to 40-character SHAs
    - yq eval on touched workflows and local actions
    - git diff --check
    - just bazel-lock-check
    
    ## Hash verification
    - Confirmed `MODULE.bazel` hashes match
    `third_party/v8/rusty_v8_146_4_0.sha256`.
    - Confirmed GitHub release asset digests for denoland/rusty_v8
    `v146.4.0` and openai/codex `rusty-v8-v146.4.0` match the checked-in
    hashes.
    - Streamed and SHA-256 hashed all 10 `MODULE.bazel` rusty_v8 asset URLs
    locally; every downloaded byte stream matched both `MODULE.bazel` and
    the checked-in manifest.
    
    ## Pin verification
    - Confirmed signing-action pins match the peeled commits for their tag
    comments: `sigstore/cosign-installer@v3.7.0`, `azure/login@v2`, and
    `azure/trusted-signing-action@v0`.
    - Pinned the remaining tag-based action refs in Bazel CI/setup:
    `actions/setup-node@v6`, `facebook/install-dotslash@v2`,
    `bazelbuild/setup-bazelisk@v3`, and `actions/cache/restore@v5`.
    - Normalized all `bazelbuild/setup-bazelisk@v3` refs to the peeled
    commit behind the annotated tag.
    - Audited Cargo git dependencies: every manifest git dependency uses
    `rev` only, every `Cargo.lock` git source has `?rev=<sha>#<same-sha>`,
    and `cargo deny check sources` passes with `required-git-spec = "rev"`.
    - Shallow-fetched each distinct git dependency repo at its pinned SHA
    and verified Git reports each object as a commit.
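
    A rough Python sketch of a `uses:` audit like the one described
    above. It is line-based rather than a real YAML parse, it treats
    local `./` actions as exempt, and the workflow text and SHA are
    placeholders; none of this is the tooling actually used for the
    audit.

```python
import re
from typing import List

USES = re.compile(r"^\s*(?:-\s*)?uses:\s*(\S+)")
PINNED = re.compile(r"@[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> List[str]:
    """Return external action refs that are not pinned to a 40-char SHA."""
    bad = []
    for line in workflow_text.splitlines():
        match = USES.match(line)
        if match is None:
            continue
        ref = match.group(1)
        if ref.startswith("./"):
            continue  # local actions live in this repository
        if not PINNED.search(ref):
            bad.append(ref)
    return bad


PLACEHOLDER_SHA = "0123456789" * 4  # 40 hex characters, not a real commit
workflow = f"""
steps:
  - uses: actions/checkout@{PLACEHOLDER_SHA} # v0.0.0 (placeholder)
  - uses: actions/setup-node@v6
  - uses: ./.github/actions/setup-bazel-ci
"""

print(unpinned_actions(workflow))
```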
  • Add full-ci branch trigger (#16980)
    Allow branches to trigger full CI (helpful for running remote tests).
  • ci: stop running rust CI with --all-features (#16473)
    ## Why
    
    Now that workspace crate features have been removed and
    `.github/scripts/verify_cargo_workspace_manifests.py` hard-bans new
    ones, Rust CI should stop building and testing with `--all-features`.
    
    Keeping `--all-features` in CI no longer buys us meaningful coverage for
    `codex-rs`, but it still makes the workflow look like we rely on Cargo
    feature permutations that we are explicitly trying to eliminate. It also
    leaves stale examples in the repo that suggest `--all-features` is a
    normal or recommended way to run the workspace.
    
    ## What changed
    
    - removed `--all-features` from the Rust CI `cargo chef cook`, `cargo
    clippy`, and `cargo nextest` invocations in
    `.github/workflows/rust-ci-full.yml`
    - updated the `just test` guidance in `justfile` to reflect that
    workspace crate features are banned and there should be no need to add
    `--all-features`
    - updated the multiline command example and snapshot in
    `codex-rs/tui/src/history_cell.rs` to stop rendering `cargo test
    --all-features --quiet`
    - tightened the verifier docstring in
    `.github/scripts/verify_cargo_workspace_manifests.py` so it no longer
    talks about temporary remaining exceptions
    
    ## How tested
    
    - `python3 .github/scripts/verify_cargo_workspace_manifests.py`
    - `cargo test -p codex-tui`
  • ci: run Windows argument-comment-lint via native Bazel (#16120)
    ## Why
    
    Follow-up to #16106.
    
    `argument-comment-lint` already runs as a native Bazel aspect on Linux
    and macOS, but Windows is still the long pole in `rust-ci`. To move
    Windows onto the same native Bazel lane, the toolchain split has to let
    exec-side helper binaries build in an MSVC environment while still
    linting repo crates as `windows-gnullvm`.
    
    Pushing the Windows lane onto the native Bazel path exposed a second
    round of Windows-only issues in the mixed exec-toolchain plumbing after
    the initial wrapper/target fixes landed.
    
    ## What Changed
    
    - keep the Windows lint lanes on the native Bazel/aspect path in
    `rust-ci.yml` and `rust-ci-full.yml`
    - add a dedicated `local_windows_msvc` platform for exec-side helper
    binaries while keeping `local_windows` as the `windows-gnullvm` target
    platform
    - patch `rules_rust` so `repository_set(...)` preserves explicit
    exec-platform constraints for the generated toolchains, keep the
    Windows-specific bootstrap/direct-link fixes needed for the nightly lint
    driver, and expose exec-side `rustc-dev` `.rlib`s to the MSVC sysroot
    - register the custom Windows nightly toolchain set with MSVC exec
    constraints while still exposing both `x86_64-pc-windows-msvc` and
    `x86_64-pc-windows-gnullvm` targets
    - enable `dev_components` on the custom Windows nightly repository set
    so the MSVC exec helper toolchain actually downloads the
    compiler-internal crates that `clippy_utils` needs
    - teach `run-argument-comment-lint-bazel.sh` to enumerate concrete
    Windows Rust rules, normalize the resulting labels, and skip explicitly
    requested incompatible targets instead of failing before the lint run
    starts
    - patch `rules_rust` build-script env propagation so exec-side
    `windows-msvc` helper crates drop forwarded MinGW include and linker
    search paths as whole flag/path pairs instead of emitting malformed
    `CFLAGS`, `CXXFLAGS`, and `LDFLAGS`
    - export the Windows VS/MSVC SDK environment in `setup-bazel-ci` and
    pass the relevant variables through `run-bazel-ci.sh` via `--action_env`
    / `--host_action_env` so Bazel build scripts can see the MSVC and UCRT
    headers on native Windows runs
    - add inline comments to the Windows `setup-bazel-ci` MSVC environment
    export step so it is easier to audit how `vswhere`, `VsDevCmd.bat`, and
    the filtered `GITHUB_ENV` export fit together
    - patch `aws-lc-sys` to skip its standalone `memcmp` probe under Bazel
    `windows-msvc` build-script environments, which avoids a Windows-native
    toolchain mismatch that blocked the lint lane before it reached the
    aspect execution
    - patch `aws-lc-sys` to prefer its bundled `prebuilt-nasm` objects for
    Bazel `windows-msvc` build-script runs, which avoids missing
    `generated-src/win-x86_64/*.asm` runfiles in the exec-side helper
    toolchain
    - annotate the Linux test-only callsites in `codex-rs/linux-sandbox` and
    `codex-rs/core` that the wider native lint coverage surfaced
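
    To illustrate why the MinGW search paths have to be dropped as whole
    flag/path pairs, here is a Python sketch. The flag set, the `mingw`
    substring match, and the sample paths are all assumptions for
    illustration; the actual `rules_rust` patch is not reproduced here.

```python
from typing import List

# Flags assumed to take their path as the following argument.
PAIR_FLAGS = {"-I", "-L", "-isystem"}


def drop_mingw_pairs(flags: List[str], marker: str = "mingw") -> List[str]:
    """Remove `<flag> <path>` pairs whose path mentions `marker`."""
    kept: List[str] = []
    i = 0
    while i < len(flags):
        flag = flags[i]
        if flag in PAIR_FLAGS and i + 1 < len(flags):
            path = flags[i + 1]
            if marker not in path.lower():
                kept.extend([flag, path])
            i += 2  # consume the flag and its path together
            continue
        kept.append(flag)
        i += 1
    return kept


def drop_paths_only(flags: List[str], marker: str = "mingw") -> List[str]:
    """Naive filter that strips only the path and leaves a dangling flag."""
    return [f for f in flags if marker not in f.lower()]


cflags = ["-O2", "-isystem", "C:/mingw64/include", "-L", "C:/mingw64/lib", "-DFOO"]
print(drop_mingw_pairs(cflags))
print(drop_paths_only(cflags))
```

    The naive filter leaves `-isystem` and `-L` without arguments, which
    is the kind of malformed `CFLAGS`/`LDFLAGS` value the pairwise
    removal avoids.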
    
    ## Patches
    
    This PR introduces a large patch stack because the Windows Bazel lint
    lane currently depends on behavior that upstream dependencies do not
    provide out of the box in the mixed `windows-gnullvm` target /
    `windows-msvc` exec-toolchain setup.
    
    - Most of the `rules_rust` patches look like upstream candidates rather
    than OpenAI-only policy. Preserving explicit exec-platform constraints,
    forwarding the right MSVC/UCRT environment into exec-side build scripts,
    exposing exec-side `rustc-dev` artifacts, and keeping the Windows
    bootstrap/linker behavior coherent all look like fixes to the Bazel/Rust
    integration layer itself.
    - The two `aws-lc-sys` patches are more tactical. They special-case
    Bazel `windows-msvc` build-script environments to avoid a `memcmp` probe
    mismatch and missing NASM runfiles. Those may be harder to upstream
    as-is because they rely on Bazel-specific detection instead of a general
    Cargo/build-script contract.
    - Short term, carrying these patches in-tree is reasonable because they
    unblock a real CI lane and are still narrow enough to audit. Long term,
    the goal should not be to keep growing a permanent local fork of either
    dependency.
    - My current expectation is that the `rules_rust` patches are less
    controversial and should be broken out into focused upstream proposals,
    while the `aws-lc-sys` patches are more likely to be temporary escape
    hatches unless that crate wants a more general hook for hermetic build
    systems.
    
    Suggested follow-up plan:
    
    1. Split the `rules_rust` deltas into upstream-sized PRs or issues with
    minimized repros.
    2. Revisit the `aws-lc-sys` patches during the next dependency bump and
    see whether they can be replaced by an upstream fix, a crate upgrade, or
    a cleaner opt-in mechanism.
    3. Treat each dependency update as a chance to delete patches one by one
    so the local patch set only contains still-needed deltas.
    
    ## Verification
    
    - `./.github/scripts/run-argument-comment-lint-bazel.sh
    --config=argument-comment-lint --keep_going`
    - `RUNNER_OS=Windows
    ./.github/scripts/run-argument-comment-lint-bazel.sh --nobuild
    --config=argument-comment-lint --platforms=//:local_windows
    --keep_going`
    - `cargo test -p codex-linux-sandbox`
    - `cargo test -p codex-core shell_snapshot_tests`
    - `just argument-comment-lint`
    
    ## References
    
    - #16106
  • fix: close Bazel argument-comment-lint CI gaps (#16253)
    ## Why
    
    The Bazel-backed `argument-comment-lint` CI path had two gaps:
    
    - Bazel wildcard target expansion skipped inline unit-test crates from
    `src/` modules because the generated `*-unit-tests-bin` `rust_test`
    targets are tagged `manual`.
    - `argument-comment-mismatch` was still only a warning in the Bazel and
    packaged-wrapper entrypoints, so a typoed `/*param_name*/` comment could
    still pass CI even when the lint detected it.
    
    That left CI blind to real linux-sandbox examples, including the missing
    `/*local_port*/` comment in
    `codex-rs/linux-sandbox/src/proxy_routing.rs` and typoed argument
    comments in `codex-rs/linux-sandbox/src/landlock.rs`.
    
    ## What Changed
    
    - Added `tools/argument-comment-lint/list-bazel-targets.sh` so Bazel
    lint runs cover `//codex-rs/...` plus the manual `rust_test`
    `*-unit-tests-bin` targets.
    - Updated `just argument-comment-lint`, `rust-ci.yml`, and
    `rust-ci-full.yml` to use that helper.
    - Promoted both `argument-comment-mismatch` and
    `uncommented-anonymous-literal-argument` to errors in every strict
    entrypoint:
      - `tools/argument-comment-lint/lint_aspect.bzl`
      - `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
      - `tools/argument-comment-lint/wrapper_common.py`
    - Added wrapper/bin coverage for the stricter lint flags and documented
    the behavior in `tools/argument-comment-lint/README.md`.
    - Fixed the now-covered callsites in
    `codex-rs/linux-sandbox/src/proxy_routing.rs`,
    `codex-rs/linux-sandbox/src/landlock.rs`, and
    `codex-rs/core/src/shell_snapshot_tests.rs`.
    
    This keeps the Bazel target expansion narrow while making the Bazel and
    prebuilt-linter paths enforce the same strict lint set.
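    
    The target expansion described above can be sketched as follows. This is
    only an illustration, not the contents of `list-bazel-targets.sh`: the
    function names, the query expression, and the example labels in the usage
    note are assumptions. It relies on the fact that `bazel build` wildcards
    skip targets tagged `manual`, while `bazel query` still returns them.

```python
from typing import Iterable, List

# Wildcard pattern named in the PR description.
WILDCARD = "//codex-rs/..."
# Suffix of the generated unit-test targets named in the PR description.
MANUAL_SUFFIX = "-unit-tests-bin"


def manual_unit_test_query(pattern: str = WILDCARD) -> str:
    """Build a `bazel query` expression for manual `rust_test` targets.

    Hypothetical helper: the real script may use a different expression.
    """
    return f'kind("rust_test", attr(tags, "manual", {pattern}))'


def lint_targets(query_output: Iterable[str], pattern: str = WILDCARD) -> List[str]:
    """Combine the wildcard with the manual unit-test targets from query output.

    Only labels ending in `-unit-tests-bin` are kept, so the expansion stays
    narrow instead of pulling in every manual target.
    """
    manual = sorted(
        {
            line.strip()
            for line in query_output
            if line.strip().endswith(MANUAL_SUFFIX)
        }
    )
    return [pattern, *manual]
```

    Feeding the lines printed by `bazel query` into `lint_targets` yields the
    wildcard first, followed by the de-duplicated manual targets, which can
    then be passed to a single `bazel build --config=argument-comment-lint`
    invocation.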
    
    ## Verification
    
    - `python3 -m unittest discover -s tools/argument-comment-lint -p
    'test_*.py'`
    - `cargo +nightly-2025-09-18 test --manifest-path
    tools/argument-comment-lint/Cargo.toml`
    - `just argument-comment-lint`
  • ci: use BuildBuddy for rust-ci-full non-Windows argument-comment-lint (#16136)
    ## Why
    
    PR #16130 fixed the Windows `argument-comment-lint` regression in
    `rust-ci-full`, but the next `main` runs still left the Linux and macOS
    lint legs timing out.
    
    In [run
    23695263729](https://github.com/openai/codex/actions/runs/23695263729),
    both non-Windows `argument-comment-lint` jobs were cancelled almost
    exactly 30 minutes after they started. The remaining workflow difference
    versus `rust-ci.yml` was that `rust-ci-full` did not pass
    `BUILDBUDDY_API_KEY` into the non-Windows Bazel lint step, so
    `run-bazel-ci.sh` fell back to local Bazel configuration instead of
    using the faster remote-backed path available on `main`.
    
    ## What changed
    
    - passed `BUILDBUDDY_API_KEY` to the non-Windows `rust-ci-full`
    `argument-comment-lint` Bazel step
    - left the Windows packaged-wrapper path from #16130 unchanged
    - kept the change scoped to `rust-ci-full.yml`
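    
    The failure mode described above, where a missing secret silently selects
    the slower local path, can be sketched like this. It is not the logic of
    `run-bazel-ci.sh`, which is not shown here: `bazel_remote_args` is a
    hypothetical helper, and passing the key through Bazel's `--remote_header`
    flag with BuildBuddy's `x-buildbuddy-api-key` header is an assumption
    about how the remote path is wired.

```python
from typing import List, Mapping


def bazel_remote_args(env: Mapping[str, str]) -> List[str]:
    """Return extra Bazel flags for the remote-backed path, if a key is set.

    Hypothetical sketch: an unset or empty BUILDBUDDY_API_KEY yields no
    flags, which means the caller falls back to local Bazel configuration
    without any error being raised.
    """
    key = env.get("BUILDBUDDY_API_KEY", "").strip()
    if not key:
        return []
    return [f"--remote_header=x-buildbuddy-api-key={key}"]
```

    Because the fallback is silent, a workflow step that forgets to pass the
    secret still runs, only slower, which matches the 30-minute cancellations
    seen in the run above.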
    
    ## Test plan
    
    - loaded `.github/workflows/rust-ci-full.yml` and
    `.github/workflows/rust-ci.yml` with `python3` + `yaml.safe_load(...)`
    - inspected run `23695263729` and confirmed `Argument comment lint -
    Linux` and `Argument comment lint - macOS` were cancelled about 30
    minutes after start
    - verified the updated `rust-ci-full` step now matches the non-Windows
    secret wiring already present in `rust-ci.yml`
    
    ## References
    
    - #16130
    - #16106
  • ci: keep rust-ci-full Windows argument-comment-lint on packaged wrapper (#16130)
    ## Why
    
    PR #16106 switched `rust-ci-full` over to the native Bazel-backed
    `argument-comment-lint` path on all three platforms.
    
    That works on Linux and macOS, but the Windows leg in `rust-ci-full` now
    fails before linting starts: Bazel dies while building `rules_rust`'s
    `process_wrapper` tool, so `main` reports an `argument-comment-lint`
    failure even though no Rust lint finding was produced.
    
    Until native Windows Bazel linting is repaired, `rust-ci-full` should
    keep the same Windows split that `rust-ci.yml` already uses.
    
    ## What changed
    
    - restored the Windows-only nightly `argument-comment-lint` toolchain
    setup in `rust-ci-full`
    - limited the Bazel-backed lint step in `rust-ci-full` to non-Windows
    runners
    - routed the Windows runner back through
    `tools/argument-comment-lint/run-prebuilt-linter.py`
    - left the Linux and macOS `rust-ci-full` behavior unchanged
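    
    The per-platform split can be summarized with a small dispatcher. This is
    a sketch only: the workflow expresses the split with step conditions
    rather than Python, `lint_command` is a hypothetical name, and any
    arguments the wrapper script takes are omitted because they are not
    stated here. `RUNNER_OS` is the GitHub Actions variable that reports
    `Linux`, `macOS`, or `Windows`.

```python
from typing import List


def lint_command(runner_os: str) -> List[str]:
    """Pick the argument-comment-lint entrypoint for a runner.

    Windows goes through the packaged wrapper; Linux and macOS use the
    Bazel-backed path from #16106.
    """
    if runner_os == "Windows":
        return ["python3", "tools/argument-comment-lint/run-prebuilt-linter.py"]
    return ["bazel", "build", "--config=argument-comment-lint", "//codex-rs/..."]
```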
    
    ## Test plan
    
    - loaded `.github/workflows/rust-ci-full.yml` and
    `.github/workflows/rust-ci.yml` with `python3` + `yaml.safe_load(...)`
    - inspected failing Actions run `23692864849`, especially job
    `69023229311`, to confirm the Windows failure occurs in Bazel
    `process_wrapper` setup before lint output is emitted
    
    ## References
    
    - #16106
  • build: migrate argument-comment-lint to a native Bazel aspect (#16106)
    ## Why
    
    `argument-comment-lint` had become a PR bottleneck because the repo-wide
    lane was still effectively running a `cargo dylint`-style flow across
    the workspace instead of reusing Bazel's Rust dependency graph. That
    kept the lint enforced, but it threw away the main benefit of moving
    this job under Bazel in the first place: metadata reuse and cacheable
    per-target analysis in the same shape as Clippy.
    
    This change moves the repo-wide lint onto a native Bazel Rust aspect so
    Linux and macOS can lint `codex-rs` without rebuilding the world
    crate-by-crate through the wrapper path.
    
    ## What Changed
    
    - add a nightly Rust toolchain with `rustc-dev` for Bazel and a
    dedicated crate-universe repo for `tools/argument-comment-lint`
    - add `tools/argument-comment-lint/driver.rs` and
    `tools/argument-comment-lint/lint_aspect.bzl` so Bazel can run the lint
    as a custom `rustc_driver`
    - switch repo-wide `just argument-comment-lint` and the Linux/macOS
    `rust-ci` lanes to `bazel build --config=argument-comment-lint
    //codex-rs/...`
    - keep the Python/DotSlash wrappers as the package-scoped fallback path
    and as the current Windows CI path
    - gate the Dylint entrypoint behind a `bazel_native` feature so the
    Bazel-native library avoids the `dylint_*` packaging stack
    - update the aspect runtime environment so the driver can locate
    `rustc_driver` correctly under remote execution
    - keep the dedicated `tools/argument-comment-lint` package tests and
    wrapper unit tests in CI so the source and packaged entrypoints remain
    covered
    
    ## Verification
    
    - `python3 -m unittest discover -s tools/argument-comment-lint -p
    'test_*.py'`
    - `cargo test` in `tools/argument-comment-lint`
    - `bazel build
    //tools/argument-comment-lint:argument-comment-lint-driver
    --@rules_rust//rust/toolchain/channel=nightly`
    - `bazel build --config=argument-comment-lint
    //codex-rs/utils/path-utils:all`
    - `bazel build --config=argument-comment-lint
    //codex-rs/rollout:rollout`
    
  • ci: split fast PR Rust CI from full post-merge Cargo CI (#16072)
    ## Summary
    
    Split the old all-in-one `rust-ci.yml` into:
    
    - a PR-time Cargo workflow in `rust-ci.yml`
    - a full post-merge Cargo workflow in `rust-ci-full.yml`
    
    This keeps the PR path focused on fast Cargo-native hygiene plus the
    Bazel `build` / `test` / `clippy` coverage in `bazel.yml`, while moving
    the heavyweight Cargo-native matrix to `main`.
    
    ## Why
    
    `bazel.yml` is now the main Rust verification workflow for pull
    requests. It already covers the Bazel build, test, and clippy signal we
    care about pre-merge, and it also runs on pushes to `main` to re-verify
    the merged tree and help keep the BuildBuddy caches warm.
    
    What was still missing was a clean split for the Cargo-native checks
    that Bazel does not replace yet. The old `rust-ci.yml` mixed together:
    
    - fast hygiene checks such as `cargo fmt --check` and `cargo shear`
    - `argument-comment-lint`
    - the full Cargo clippy / nextest / release-build matrix
    
    That made every PR pay for the full Cargo matrix even though most of
    that coverage is better treated as post-merge verification. The goal of
    this change is to leave PRs with the checks we still want before merge,
    while moving the heavier Cargo-native matrix off the review path.
    
    ## What Changed
    
    - Renamed the old heavyweight workflow to `rust-ci-full.yml` and limited
    it to `push` on `main` plus `workflow_dispatch`.
    - Added a new PR-only `rust-ci.yml` that runs:
      - changed-path detection
      - `cargo fmt --check`
      - `cargo shear`
      - `argument-comment-lint` on Linux, macOS, and Windows
      - `tools/argument-comment-lint` package tests when the lint itself or
      its workflow wiring changes
    - Kept the PR workflow's gatherer as the single required Cargo-native
    status so branch protection can stay simple.
    - Added `.github/workflows/README.md` to document the intended split
    between `bazel.yml`, `rust-ci.yml`, and `rust-ci-full.yml`.
    - Preserved the recent Windows `argument-comment-lint` behavior from
    `e02fd6e1d3` in `rust-ci-full.yml`, and mirrored cross-platform lint
    coverage into the PR workflow.
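    
    The single required "gatherer" status mentioned above can be modeled as
    a reduction over the results of the jobs it depends on. This is a minimal
    sketch, not the workflow's actual step: the function names and job names
    are hypothetical, and treating `skipped` as acceptable is an assumption
    based on changed-path detection being able to skip jobs. The result
    strings are the values GitHub Actions reports in `needs.<job>.result`.

```python
from typing import List, Mapping

# Assumption: a job skipped by changed-path detection should not fail the gate.
ACCEPTED = {"success", "skipped"}


def failing_jobs(results: Mapping[str, str]) -> List[str]:
    """Return the names of jobs whose result should fail the gatherer."""
    return sorted(name for name, result in results.items() if result not in ACCEPTED)


def gather(results: Mapping[str, str]) -> bool:
    """True when every upstream job succeeded or was skipped."""
    return not failing_jobs(results)
```

    Branch protection then only needs to require the one gatherer status,
    and jobs can be added to or removed from the workflow without touching
    the required-checks list.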
    
    A few details are deliberate:
    
    - The PR workflow still keeps the Linux lint lane on the
    default-targets-only invocation for now, while macOS and Windows use the
    broader released-linter path.
    - This PR does not change `bazel.yml`; it changes the Cargo-native
    workflow around the existing Bazel PR path.
    
    ## Testing
    
    - Rebased this change onto `main` after `e02fd6e1d3`
    - `ruby -e 'require "yaml"; %w[.github/workflows/rust-ci.yml
    .github/workflows/rust-ci-full.yml .github/workflows/bazel.yml].each {
    |f| YAML.load_file(f) }'`