4 Commits

  • ci: cross-compile Windows Bazel tests (#20585)
    ## Status
    
    This is the Bazel PR-CI cross-compilation follow-up to #20485. It is
    intentionally split from the Cargo/cargo-xwin release-build PoC so
    #20485 can stay as the historical release-build exploration. The
    unrelated async-utils test cleanup has been moved to #20686, so this PR
    is focused on the Windows Bazel CI path.
    
    The intended tradeoff is now explicit in `.github/workflows/bazel.yml`:
    pull requests get the fast Windows cross-compiled Bazel test leg, while
    post-merge pushes to `main` run both that fast cross leg and a fully
    native Windows Bazel test leg. The native main-only job keeps full
    V8/code-mode coverage and gets a 40-minute timeout because it is less
    latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes.
    
    ## Why
    
    Windows Bazel PR CI currently does the expensive part of the build on
    Windows. A native Windows Bazel test job on `main` completed in about
    28m12s, leaving very little headroom under the 30-minute job timeout and
    making Windows the slowest PR signal.
    
    #20485 showed that Windows cross-compilation can be materially faster
    for Cargo release builds, but PR CI needs Bazel because Bazel owns our
    test sharding, flaky-test retries, and integration-test layout. This PR
    applies the same high-level shape we already use for macOS Bazel CI:
    compile with remote Linux execution, then run platform-specific tests on
    the platform runner.
    
    The compromise is deliberately signal-aware: code-mode/V8 changes are
    rare enough that PR CI can accept losing the direct V8/code-mode
    smoke-test signal temporarily, while `main` still runs the native
    Windows job post-merge to catch that class of regression. A follow-up PR
    should investigate making the cross-built Windows gnullvm V8 archive
    pass the direct V8/code-mode tests so this tradeoff can eventually go
    away.
    
    ## What Changed
    
    - Adds a `ci-windows-cross` Bazel config that targets
    `x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps
    `TestRunner` actions local on the Windows runner.
    - Adds explicit Windows platform definitions for
    `windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain
    that lets gnullvm test targets execute under the Windows MSVC host
    platform.
    - Updates the Windows Bazel PR test leg to opt into the cross-compile
    path via `--windows-cross-compile` and `--remote-download-toplevel`.
    - Adds a `test-windows-native-main` job that runs only for `push` events
    on `refs/heads/main`, uses the native Windows Bazel path, includes
    V8/code-mode smoke tests, and has `timeout-minutes: 40`.
    - Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous
    local Windows MSVC-host fallback, including
    `--host_platform=//:local_windows_msvc` and `--jobs=8`.
    - Preserves the existing integration-test shape on non-gnullvm
    platforms, while generating Windows-cross wrapper targets only for
    `windows_gnullvm`.
    - Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime,
    avoiding hard-coded Cargo paths and duplicate test runfiles.
    - Extends the V8 Bazel patches enough for the
    `x86_64-pc-windows-gnullvm` target and Linux remote execution path.
    - Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT`
    at runtime when Bazel provides it, because cross-compiled binaries may
    contain Linux compile-time paths.
    - Keeps the direct V8/code-mode unit smoke tests out of the Windows
    cross PR path for now while native Windows CI continues to cover them
    post-merge.
    
    ## Command Shape
    
    The fast Windows PR test leg invokes the normal Bazel CI wrapper like
    this:
    
    ```shell
    ./.github/scripts/run-bazel-ci.sh \
      --print-failed-action-summary \
      --print-failed-test-logs \
      --windows-cross-compile \
      --remote-download-toplevel \
      -- \
      test \
      --test_tag_filters=-argument-comment-lint \
      --test_verbose_timeout_warnings \
      --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
      -- \
      //... \
      -//third_party/v8:all \
      -//codex-rs/code-mode:code-mode-unit-tests \
      -//codex-rs/v8-poc:v8-poc-unit-tests
    ```
    
    With the BuildBuddy secret available on Windows, the wrapper selects
    `--config=ci-windows-cross` and appends the important Windows-cross
    overrides after rc expansion:
    
    ```shell
    --host_platform=//:rbe
    --shell_executable=/bin/bash
    --action_env=PATH=/usr/bin:/bin
    --host_action_env=PATH=/usr/bin:/bin
    --test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}
    ```
    
    The native post-merge Windows job intentionally omits
    `--windows-cross-compile` and does not exclude the V8/code-mode unit
    targets:
    
    ```shell
    ./.github/scripts/run-bazel-ci.sh \
      --print-failed-action-summary \
      --print-failed-test-logs \
      -- \
      test \
      --test_tag_filters=-argument-comment-lint \
      --test_verbose_timeout_warnings \
      --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
      --build_metadata=TAG_windows_native_main=true \
      -- \
      //... \
      -//third_party/v8:all
    ```
    
    ## Research Notes
    
    The existing macOS Bazel CI config already uses the model we want here:
    build actions run remotely with `--strategy=remote`, but `TestRunner`
    actions execute on the macOS runner. This PR mirrors that pattern for
    Windows with `--strategy=TestRunner=local`.
    
    The important Bazel detail is that `rules_rs` is already targeting
    `x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes
    where the build actions execute; it does not switch the Bazel PR test
    target to Cargo, `cargo-nextest`, or the MSVC release target.
    
    Cargo release builds differ from this Bazel path for V8: the normal
    Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt
    Windows MSVC `.lib.gz` archives. The Bazel PR path targets
    `windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows
    GNU/gnullvm archive, so this PR builds that archive in-tree. That
    Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode
    smoke tests, which is why the workflow keeps native Windows coverage on
    `main`.
    
    The less obvious Bazel detail is test wrapper selection. Bazel chooses
    the Windows test wrapper (`tw.exe`) from the test action execution
    platform, not merely from the Rust target triple. The outer
    `workspace_root_test` therefore declares the default test toolchain and
    uses the bridge toolchain above so the test action executes on Windows
    while its inner Rust binary is built for gnullvm.
    
    The V8 investigation exposed a Windows-client gotcha: even when an
    action execution platform is Linux RBE, Bazel can still derive the
    genrule shell path from the Windows client. That produced remote
    commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux
    workers. The wrapper now passes `--shell_executable=/bin/bash` with
    `--host_platform=//:rbe` for the Windows cross path.
    
    The same Windows-client/Linux-RBE boundary also affected
    `third_party/v8:binding_cc`: a multiline genrule command can carry CRLF
    line endings into Linux remote bash, which failed as `$'\r'`. That
    genrule now keeps the `sed` command on one physical shell line while
    using an explicit Starlark join so the shell arguments stay readable.
    
    ## Verification
    
    Local checks included:
    
    ```shell
    bash -n .github/scripts/run-bazel-ci.sh
    bash -n workspace_root_test_launcher.sh.tpl
    ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}"
    RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh
    RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh
    RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh
    RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh
    ```
    
    The Linux clippy and argument-comment target lists contain zero
    `*-windows-cross-bin` labels, while the Windows lists still include 47
    Windows-cross internal test binaries.
    
    CI evidence:
    
    - Baseline native Windows Bazel test on `main`: success in about 28m12s,
    https://github.com/openai/codex/actions/runs/25206257208/job/73907325959
    - Green Windows-cross Bazel run on the split PR before adding the
    main-only native leg: Windows test 9m16s, Windows release verify 5m10s,
    Windows clippy 4m43s,
    https://github.com/openai/codex/actions/runs/25231890068
    - The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`;
    CI is rerunning on that focused diff.
    
    ## Follow-Up
    
    A subsequent PR should investigate making a cross-built Windows binary
    work with V8/code-mode enabled. Likely options are either making the
    Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or
    evaluating whether a Bazel MSVC target/toolchain can reuse the same
    prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already
    use.
  • bazel: run wrapped Rust unit test shards (#18913)
    ## Why
    
    The `codex-tui` Cargo test suite was catching stale snapshot
    expectations, but the matching Bazel unit-test target was still green.
    The TUI unit target is wrapped by `workspace_root_test` so tests run
    from the repository root and Insta can resolve Cargo-like snapshot
    paths. After native Bazel sharding was enabled for that wrapped target,
    rules_rust also inserted its own sharding wrapper around the Rust test
    binary.
    
    Those two wrappers did not compose: rules_rust's sharding wrapper
    expects to run from its own runfiles cwd, while `workspace_root_test`
    deliberately changes cwd to the repo root before invoking the test. In
    that configuration, the inner wrapper could fail to enumerate the Rust
    tests and exit successfully with empty shards, so snapshot regressions
    were not being exercised by Bazel.
    
    ## What Changed
    
    - Stop enabling rules_rust's inner `experimental_enable_sharding` for
    unit-test binaries created by `codex_rust_crate`.
    - Keep the configured `shard_count` on the outer `workspace_root_test`
    target.
    - Add libtest sharding directly to `workspace_root_test_launcher.sh.tpl`
    and `workspace_root_test_launcher.bat.tpl` after the launcher has
    resolved the actual test binary and established the intended
    repository-root cwd.
    - Partition tests by a stable FNV-1a hash of each libtest test name,
    matching the stable-shard behavior we wanted without depending on the
    inner rules_rust wrapper.
    - Preserve ad-hoc local test filters by running the resolved test binary
    directly when explicit test args are supplied.
    - On Windows, run selected libtest names from the shard list in bounded
    PowerShell batches instead of concatenating every selected test into one
    `cmd.exe` command line.
    
    This PR is stacked on top of #18912, which contains only the snapshot
    expectation updates exposed once the Bazel target actually runs the TUI
    unit tests. It is also the reason #18916 becomes visible: once this
    wrapper fix makes Bazel execute the affected `codex-core` test, that
    test needs its own executable-path setup fixed.
    
    ## Verification
    
    - `cargo test -p codex-tui`
    - `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors`
    - `bazel test //codex-rs/tui:all --test_output=errors`
    - `bash -n workspace_root_test_launcher.sh.tpl`
    - Exercised the Windows PowerShell batching fragment locally with a fake
    test binary and shard-list file.
  • bazel: enable the full Windows gnullvm CI path (#15952)
    ## Why
    
    This PR is the current, consolidated follow-up to the earlier Windows
    Bazel attempt in #11229. The goal is no longer just to get a tiny
    Windows smoke job limping along: it is to make the ordinary Bazel CI
    path usable on `windows-latest` for `x86_64-pc-windows-gnullvm`, with
    the same broad `//...` test shape that macOS and Linux already use.
    
    The earlier smoke-list version of this work was useful as a foothold,
    but it was not a good long-term landing point. Windows Bazel kept
    surfacing real issues outside that allowlist:
    
    - GitHub's Windows runner exposed runfiles-manifest bugs such as
    `FINDSTR: Cannot open D:MANIFEST`, which broke Bazel test launchers even
    when the manifest file existed.
    - `rules_rs`, `rules_rust`, LLVM extraction, and Abseil still needed
    `windows-gnullvm`-specific fixes for our hermetic toolchain.
    - the V8 path needed more work than just turning the Windows matrix
    entry back on: `rusty_v8` does not ship Windows GNU artifacts in the
    same shape we need, and Bazel's in-tree V8 build needed a set of Windows
    GNU portability fixes.
    
    Windows performance pressure also pushed this toward a full solution
    instead of a permanent smoke suite. During this investigation we hit
    targets such as `//codex-rs/shell-command:shell-command-unit-tests` that
    were much more expensive on Windows because they repeatedly spawn real
    PowerShell parsers (see #16057 for one concrete example of that
    pressure). That made it much more valuable to get the real Windows Bazel
    path working than to keep iterating on a narrowly curated subset.
    
    The net result is that this PR now aims for the same CI contract on
    Windows that we already expect elsewhere: keep standalone
    `//third_party/v8:all` out of the ordinary Bazel lane, but allow V8
    consumers under `//codex-rs/...` to build and test transitively through
    `//...`.
    
    ## What Changed
    
    ### CI and workflow wiring
    
    - re-enable the `windows-latest` / `x86_64-pc-windows-gnullvm` Bazel
    matrix entry in `.github/workflows/bazel.yml`
    - move the Windows Bazel output root to `D:\b` and enable `git config
    --global core.longpaths true` in
    `.github/actions/setup-bazel-ci/action.yml`
    - keep the ordinary Bazel target set on Windows aligned with macOS and
    Linux by running `//...` while excluding only standalone
    `//third_party/v8:all` targets from the normal lane
    
    ### Toolchain and module support for `windows-gnullvm`
    
    - patch `rules_rs` so `windows-gnullvm` is modeled as a distinct Windows
    exec/toolchain platform instead of collapsing into the generic Windows
    shape
    - patch `rules_rust` build-script environment handling so llvm-mingw
    build-script probes do not inherit unsupported `-fstack-protector*`
    flags
    - patch the LLVM module archive so it extracts cleanly on Windows and
    provides the MinGW libraries this toolchain needs
    - patch Abseil so its thread-local identity path matches the hermetic
    `windows-gnullvm` toolchain instead of taking an incompatible MinGW
    pthread path
    - keep both MSVC and GNU Windows targets in the generated Cargo metadata
    because the current V8 release-asset story still uses MSVC-shaped names
    in some places while the Bazel build targets the GNU ABI
    
    ### Windows test-launch and binary-behavior fixes
    
    - update `workspace_root_test_launcher.bat.tpl` to read the runfiles
    manifest directly instead of shelling out to `findstr`, which was the
    source of the `D:MANIFEST` failures on the GitHub Windows runner
    - thread a larger Windows GNU stack reserve through `defs.bzl` so
    Bazel-built binaries that pull in V8 behave correctly both under normal
    builds and under `bazel test`
    - remove the no-longer-needed Windows bootstrap sh-toolchain override
    from `.bazelrc`
    
    ### V8 / `rusty_v8` Windows GNU support
    
    - export and apply the new Windows GNU patch set from
    `patches/BUILD.bazel` / `MODULE.bazel`
    - patch the V8 module/rules/source layers so the in-tree V8 build can
    produce Windows GNU archives under Bazel
    - teach `third_party/v8/BUILD.bazel` to build Windows GNU static
    archives in-tree instead of aliasing them to the MSVC prebuilts
    - reuse the Linux release binding for the experimental Windows GNU path
    where `rusty_v8` does not currently publish a Windows GNU binding
    artifact
    
    ## Testing
    
    - the primary end-to-end validation for this work is the `Bazel`
    workflow plus `v8-canary`, since the hard parts are Windows-specific and
    depend on real GitHub runner behavior
    - before consolidation back onto this PR, the same net change passed the
    full Bazel matrix in [run
    23675590471](https://github.com/openai/codex/actions/runs/23675590471)
    and passed `v8-canary` in [run
    23675590453](https://github.com/openai/codex/actions/runs/23675590453)
    - those successful runs included the `windows-latest` /
    `x86_64-pc-windows-gnullvm` Bazel job with the ordinary `//...` path,
    not the earlier Windows smoke allowlist
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15952).
    * #16067
    * __->__ #15952
  • fix(ci): restore guardian coverage and bazel unit tests (#13912)
    ## Summary
    - restore the guardian review request snapshot test and its tracked
    snapshot after it was dropped from `main`
    - make Bazel Rust unit-test wrappers resolve runfiles correctly on
    manifest-only platforms like macOS and point Insta at the real workspace
    root
    - harden the shell-escalation socket-closure assertion so the musl Bazel
    test no longer depends on fd reuse behavior
    
    ## Verification
    - cargo test -p codex-core
    guardian_review_request_layout_matches_model_visible_request_snapshot
    - cargo test -p codex-shell-escalation
    - bazel test //codex-rs/exec:exec-unit-tests
    //codex-rs/shell-escalation:shell-escalation-unit-tests
    
    Supersedes #13894.
    
    ---------
    
    Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
    Co-authored-by: viyatb-oai <viyatb@openai.com>
    Co-authored-by: Codex <noreply@openai.com>