4 Commits

  • ci: cross-compile Windows Bazel tests (#20585)
    ## Status
    
    This is the Bazel PR-CI cross-compilation follow-up to #20485. It is
    intentionally split from the Cargo/cargo-xwin release-build PoC so
    #20485 can stay as the historical release-build exploration. The
    unrelated async-utils test cleanup has been moved to #20686, so this PR
    is focused on the Windows Bazel CI path.
    
    The intended tradeoff is now explicit in `.github/workflows/bazel.yml`:
    pull requests get the fast Windows cross-compiled Bazel test leg, while
    post-merge pushes to `main` run both that fast cross leg and a fully
    native Windows Bazel test leg. The native main-only job keeps full
    V8/code-mode coverage and gets a 40-minute timeout because it is less
    latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes.
    
    ## Why
    
    Windows Bazel PR CI currently does the expensive part of the build on
    Windows. A native Windows Bazel test job on `main` completed in about
    28m12s, leaving very little headroom under the 30-minute job timeout and
    making Windows the slowest PR signal.
    
    #20485 showed that Windows cross-compilation can be materially faster
    for Cargo release builds, but PR CI needs Bazel because Bazel owns our
    test sharding, flaky-test retries, and integration-test layout. This PR
    applies the same high-level shape we already use for macOS Bazel CI:
    compile with remote Linux execution, then run platform-specific tests on
    the platform runner.
    
    The compromise is deliberately signal-aware: code-mode/V8 changes are
    rare enough that PR CI can accept losing the direct V8/code-mode
    smoke-test signal temporarily, while `main` still runs the native
    Windows job post-merge to catch that class of regression. A follow-up PR
    should investigate making the cross-built Windows gnullvm V8 archive
    pass the direct V8/code-mode tests so this tradeoff can eventually go
    away.
    
    ## What Changed
    
    - Adds a `ci-windows-cross` Bazel config that targets
    `x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps
    `TestRunner` actions local on the Windows runner.
    - Adds explicit Windows platform definitions for
    `windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain
    that lets gnullvm test targets execute under the Windows MSVC host
    platform.
    - Updates the Windows Bazel PR test leg to opt into the cross-compile
    path via `--windows-cross-compile` and `--remote-download-toplevel`.
    - Adds a `test-windows-native-main` job that runs only for `push` events
    on `refs/heads/main`, uses the native Windows Bazel path, includes
    V8/code-mode smoke tests, and has `timeout-minutes: 40`.
    - Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous
    local Windows MSVC-host fallback, including
    `--host_platform=//:local_windows_msvc` and `--jobs=8`.
    - Preserves the existing integration-test shape on non-gnullvm
    platforms, while generating Windows-cross wrapper targets only for
    `windows_gnullvm`.
    - Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime,
    avoiding hard-coded Cargo paths and duplicate test runfiles.
    - Extends the V8 Bazel patches enough for the
    `x86_64-pc-windows-gnullvm` target and Linux remote execution path.
    - Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT`
    at runtime when Bazel provides it, because cross-compiled binaries may
    contain Linux compile-time paths.
    - Keeps the direct V8/code-mode unit smoke tests out of the Windows
    cross PR path for now while native Windows CI continues to cover them
    post-merge.
    
    ## Command Shape
    
    The fast Windows PR test leg invokes the normal Bazel CI wrapper like
    this:
    
    ```shell
    ./.github/scripts/run-bazel-ci.sh \
      --print-failed-action-summary \
      --print-failed-test-logs \
      --windows-cross-compile \
      --remote-download-toplevel \
      -- \
      test \
      --test_tag_filters=-argument-comment-lint \
      --test_verbose_timeout_warnings \
      --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
      -- \
      //... \
      -//third_party/v8:all \
      -//codex-rs/code-mode:code-mode-unit-tests \
      -//codex-rs/v8-poc:v8-poc-unit-tests
    ```
    
    With the BuildBuddy secret available on Windows, the wrapper selects
    `--config=ci-windows-cross` and appends the important Windows-cross
    overrides after rc expansion:
    
    ```shell
    --host_platform=//:rbe
    --shell_executable=/bin/bash
    --action_env=PATH=/usr/bin:/bin
    --host_action_env=PATH=/usr/bin:/bin
    --test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}
    ```
    
    The native post-merge Windows job intentionally omits
    `--windows-cross-compile` and does not exclude the V8/code-mode unit
    targets:
    
    ```shell
    ./.github/scripts/run-bazel-ci.sh \
      --print-failed-action-summary \
      --print-failed-test-logs \
      -- \
      test \
      --test_tag_filters=-argument-comment-lint \
      --test_verbose_timeout_warnings \
      --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
      --build_metadata=TAG_windows_native_main=true \
      -- \
      //... \
      -//third_party/v8:all
    ```
    
    ## Research Notes
    
    The existing macOS Bazel CI config already uses the model we want here:
    build actions run remotely with `--strategy=remote`, but `TestRunner`
    actions execute on the macOS runner. This PR mirrors that pattern for
    Windows with `--strategy=TestRunner=local`.
    
    The important Bazel detail is that `rules_rs` is already targeting
    `x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes
    where the build actions execute; it does not switch the Bazel PR test
    target to Cargo, `cargo-nextest`, or the MSVC release target.
    
    Cargo release builds differ from this Bazel path for V8: the normal
    Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt
    Windows MSVC `.lib.gz` archives. The Bazel PR path targets
    `windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows
    GNU/gnullvm archive, so this PR builds that archive in-tree. That
    Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode
    smoke tests, which is why the workflow keeps native Windows coverage on
    `main`.
    
    The less obvious Bazel detail is test wrapper selection. Bazel chooses
    the Windows test wrapper (`tw.exe`) from the test action execution
    platform, not merely from the Rust target triple. The outer
    `workspace_root_test` therefore declares the default test toolchain and
    uses the bridge toolchain above so the test action executes on Windows
    while its inner Rust binary is built for gnullvm.
    
    The V8 investigation exposed a Windows-client gotcha: even when an
    action execution platform is Linux RBE, Bazel can still derive the
    genrule shell path from the Windows client. That produced remote
    commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux
    workers. The wrapper now passes `--shell_executable=/bin/bash` with
    `--host_platform=//:rbe` for the Windows cross path.
    
    The same Windows-client/Linux-RBE boundary also affected
    `third_party/v8:binding_cc`: a multiline genrule command can carry CRLF
    line endings into Linux remote bash, which failed as `$'\r'`. That
    genrule now keeps the `sed` command on one physical shell line while
    using an explicit Starlark join so the shell arguments stay readable.
    
    ## Verification
    
    Local checks included:
    
    ```shell
    bash -n .github/scripts/run-bazel-ci.sh
    bash -n workspace_root_test_launcher.sh.tpl
    ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}"
    RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh
    RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh
    RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh
    RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh
    ```
    
    The Linux clippy and argument-comment target lists contain zero
    `*-windows-cross-bin` labels, while the Windows lists still include 47
    Windows-cross internal test binaries.
    
    CI evidence:
    
    - Baseline native Windows Bazel test on `main`: success in about 28m12s,
    https://github.com/openai/codex/actions/runs/25206257208/job/73907325959
    - Green Windows-cross Bazel run on the split PR before adding the
    main-only native leg: Windows test 9m16s, Windows release verify 5m10s,
    Windows clippy 4m43s,
    https://github.com/openai/codex/actions/runs/25231890068
    - The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`;
    CI is rerunning on that focused diff.
    
    ## Follow-Up
    
    A subsequent PR should investigate making a cross-built Windows binary
    work with V8/code-mode enabled. Likely options are either making the
    Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or
    evaluating whether a Bazel MSVC target/toolchain can reuse the same
    prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already
    use.
  • bazel: enable the full Windows gnullvm CI path (#15952)
    ## Why
    
    This PR is the current, consolidated follow-up to the earlier Windows
    Bazel attempt in #11229. The goal is no longer just to get a tiny
    Windows smoke job limping along: it is to make the ordinary Bazel CI
    path usable on `windows-latest` for `x86_64-pc-windows-gnullvm`, with
    the same broad `//...` test shape that macOS and Linux already use.
    
    The earlier smoke-list version of this work was useful as a foothold,
    but it was not a good long-term landing point. Windows Bazel kept
    surfacing real issues outside that allowlist:
    
    - GitHub's Windows runner exposed runfiles-manifest bugs such as
    `FINDSTR: Cannot open D:MANIFEST`, which broke Bazel test launchers even
    when the manifest file existed.
    - `rules_rs`, `rules_rust`, LLVM extraction, and Abseil still needed
    `windows-gnullvm`-specific fixes for our hermetic toolchain.
    - the V8 path needed more work than just turning the Windows matrix
    entry back on: `rusty_v8` does not ship Windows GNU artifacts in the
    same shape we need, and Bazel's in-tree V8 build needed a set of Windows
    GNU portability fixes.
    
    Windows performance pressure also pushed this toward a full solution
    instead of a permanent smoke suite. During this investigation we hit
    targets such as `//codex-rs/shell-command:shell-command-unit-tests` that
    were much more expensive on Windows because they repeatedly spawn real
    PowerShell parsers (see #16057 for one concrete example of that
    pressure). That made it much more valuable to get the real Windows Bazel
    path working than to keep iterating on a narrowly curated subset.
    
    The net result is that this PR now aims for the same CI contract on
    Windows that we already expect elsewhere: keep standalone
    `//third_party/v8:all` out of the ordinary Bazel lane, but allow V8
    consumers under `//codex-rs/...` to build and test transitively through
    `//...`.
    
    ## What Changed
    
    ### CI and workflow wiring
    
    - re-enable the `windows-latest` / `x86_64-pc-windows-gnullvm` Bazel
    matrix entry in `.github/workflows/bazel.yml`
    - move the Windows Bazel output root to `D:\b` and enable `git config
    --global core.longpaths true` in
    `.github/actions/setup-bazel-ci/action.yml`
    - keep the ordinary Bazel target set on Windows aligned with macOS and
    Linux by running `//...` while excluding only standalone
    `//third_party/v8:all` targets from the normal lane
    
    ### Toolchain and module support for `windows-gnullvm`
    
    - patch `rules_rs` so `windows-gnullvm` is modeled as a distinct Windows
    exec/toolchain platform instead of collapsing into the generic Windows
    shape
    - patch `rules_rust` build-script environment handling so llvm-mingw
    build-script probes do not inherit unsupported `-fstack-protector*`
    flags
    - patch the LLVM module archive so it extracts cleanly on Windows and
    provides the MinGW libraries this toolchain needs
    - patch Abseil so its thread-local identity path matches the hermetic
    `windows-gnullvm` toolchain instead of taking an incompatible MinGW
    pthread path
    - keep both MSVC and GNU Windows targets in the generated Cargo metadata
    because the current V8 release-asset story still uses MSVC-shaped names
    in some places while the Bazel build targets the GNU ABI
    
    ### Windows test-launch and binary-behavior fixes
    
    - update `workspace_root_test_launcher.bat.tpl` to read the runfiles
    manifest directly instead of shelling out to `findstr`, which was the
    source of the `D:MANIFEST` failures on the GitHub Windows runner
    - thread a larger Windows GNU stack reserve through `defs.bzl` so
    Bazel-built binaries that pull in V8 behave correctly both under normal
    builds and under `bazel test`
    - remove the no-longer-needed Windows bootstrap sh-toolchain override
    from `.bazelrc`
    
    ### V8 / `rusty_v8` Windows GNU support
    
    - export and apply the new Windows GNU patch set from
    `patches/BUILD.bazel` / `MODULE.bazel`
    - patch the V8 module/rules/source layers so the in-tree V8 build can
    produce Windows GNU archives under Bazel
    - teach `third_party/v8/BUILD.bazel` to build Windows GNU static
    archives in-tree instead of aliasing them to the MSVC prebuilts
    - reuse the Linux release binding for the experimental Windows GNU path
    where `rusty_v8` does not currently publish a Windows GNU binding
    artifact
    
    ## Testing
    
    - the primary end-to-end validation for this work is the `Bazel`
    workflow plus `v8-canary`, since the hard parts are Windows-specific and
    depend on real GitHub runner behavior
    - before consolidation back onto this PR, the same net change passed the
    full Bazel matrix in [run
    23675590471](https://github.com/openai/codex/actions/runs/23675590471)
    and passed `v8-canary` in [run
    23675590453](https://github.com/openai/codex/actions/runs/23675590453)
    - those successful runs included the `windows-latest` /
    `x86_64-pc-windows-gnullvm` Bazel job with the ordinary `//...` path,
    not the earlier Windows smoke allowlist
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15952).
    * #16067
    * __->__ #15952
  • V8 Bazel Build (#15021)
    Alternative approach, we use rusty_v8 for all platforms that its
    predefined, but lets build from source a musl v8 version with bazel for
    x86 and aarch64 only. We would need to release this on github and then
    use the release.