codex

[codex] Fix Windows BuildBuddy Bazel wrapper execution (#25915 )

## Why

#25156 moved Bazel CI launches into a shared Python wrapper. On Windows,
launching Bazel with `os.execvp` can split the spaced
`--test_env=PATH=...` argument and fail to propagate the eventual Bazel
exit status, allowing jobs to pass without running tests. This reapplies
the wrapper after #25909 with a Windows-safe launch path.

## What changed

Use a waited `subprocess.run` launch on Windows while preserving
`os.execvp` on Unix. Add a process-level regression test for spaced
arguments and child exit status, and run it on Windows Bazel shard 1.

## Experiment

To confirm Bazel was actually invoking tests, patch `87b61d0be6`
temporarily added an intentionally failing `codex-core` unit test. Bazel
failed on that sentinel on all three major platforms:

- [Linux Bazel
test](https://github.com/openai/codex/actions/runs/26841132773/job/79151062486)
- [macOS Bazel
test](https://github.com/openai/codex/actions/runs/26841132773/job/79151062362)
- [Windows Bazel test shard
1/4](https://github.com/openai/codex/actions/runs/26841132773/job/79151062155)

The sentinel was removed after collecting this evidence. Windows Bazel
[clippy](https://github.com/openai/codex/actions/runs/26841132773/job/79151062914)
and [release
verification](https://github.com/openai/codex/actions/runs/26841132773/job/79151062739)
also passed.

## Validation

After removing the sentinel, `just test -p codex-core` no longer
reported it. The local run retained two unrelated environment-specific
failures.

Adam Perry @ OpenAI · 2026-06-02 16:22:32 -07:00

6471f8b31a

[codex] Revert shared BuildBuddy Bazel wrapper (#25909 )

## Why

PR #25905 intentionally adds a failing `codex-core` unit test, but its
[Bazel test on Windows
check](https://github.com/openai/codex/actions/runs/26837526950/job/79135369259)
passed. That shows the Bazel configuration introduced by #25156 is not
behaving as expected, so revert it while the configuration can be
investigated separately.

## What changed

Revert #25156 in full, restoring the previous Bazel remote
configuration, CI scripts, workflows, `rusty_v8` handling, and
documentation. This removes the shared BuildBuddy wrapper and its tests.

## Validation

Not run locally; this exact revert was prioritized for a fast rollback.

Adam Perry @ OpenAI · 2026-06-02 11:06:01 -07:00

9e3d5f29e2

Route Bazel CI through shared BuildBuddy remote config wrapper (#25156 )

## Why

Bazel remote configuration was selected in several CI scripts and
workflow steps. That made the BuildBuddy tenant policy easy to duplicate
and harder to audit, especially for fork pull requests that must not use
the OpenAI tenant.

This builds on
[sluongng/buildbuddy-ci-host-routing](https://github.com/openai/codex/compare/main...sluongng:codex:sluongng/buildbuddy-ci-host-routing)
and consolidates the policy in one place.

## What to do if this breaks you

See `codex-rs/docs/bazel.md` for details. TLDR:

1. make a BuildBuddy API key and put it in `~/.bazelrc`
2. if you're an OpenAI employee, add `common
--config=buildbuddy-openai-rbe` to `user.bazelrc` in the repo root

Run `just bazel-test` to ensure it works.

Note that `just bazel-remote-test` no longer exists, you need to select
a remote configuration as documented to use RBE.

## What changed

- Add `.github/scripts/run_bazel_with_buildbuddy.py` as the shared Bazel
wrapper and Python library. It selects the OpenAI host only for trusted
upstream GitHub Actions runs, routes keyed fork runs to the generic
host, and falls back to local Bazel execution when no key is available.
- Move endpoint selection into explicit `.bazelrc` configurations and
update Bazel CI, query helpers, and `rusty_v8` staging to use the shared
policy. Loading-phase target-discovery queries remain local.
- Add wrapper and `rusty_v8` unit coverage, plus `just test-scripts` for
the `.github/scripts` Python tests.
- Document local Bazel usage, `user.bazelrc` setup, BuildBuddy
configurations, and CI behavior in `codex-rs/docs/bazel.md`.

## Validation

- `just test-scripts`
- `bash -n .github/scripts/run-bazel-ci.sh
.github/scripts/run-bazel-query-ci.sh
.github/scripts/run-argument-comment-lint-bazel.sh
scripts/list-bazel-clippy-targets.sh`
- `python3 -m py_compile .github/scripts/run_bazel_with_buildbuddy.py
.github/scripts/test_run_bazel_with_buildbuddy.py
.github/scripts/test_rusty_v8_bazel.py
.github/scripts/rusty_v8_bazel.py`
- `ruff check .github/scripts/run_bazel_with_buildbuddy.py
.github/scripts/test_run_bazel_with_buildbuddy.py
.github/scripts/test_rusty_v8_bazel.py
.github/scripts/rusty_v8_bazel.py`

Adam Perry @ OpenAI · 2026-06-02 09:56:20 -07:00

ebb7980369

bazel: run sharded rust integration tests (#21057 )

## Why

Bazel CI was not actually exercising some sharded Rust integration-test
targets on macOS. The `rules_rust` sharding wrapper expects a symlink
runfiles tree, but this repo runs Bazel with `--noenable_runfiles`. In
that configuration the wrapper could fail to find the generated test
binary, produce an empty test list, and exit successfully. That made
targets such as `//codex-rs/core:core-all-test` look green even when
Cargo CI could still catch failures in the same Rust tests.

The coverage gap appears to have been introduced by
[#18082](https://github.com/openai/codex/pull/18082), which enabled
rules_rust native sharding on `//codex-rs/core:core-all-test` and the
other large Rust test labels. The manifest-runfiles setup itself
predates that change in
[#10098](https://github.com/openai/codex/pull/10098), but #18082 is
where the affected integration tests started running through the
incompatible rules_rust sharding wrapper.
[#18913](https://github.com/openai/codex/pull/18913) fixed the same
class of issue for wrapped unit-test shards, but integration-test shards
were still going through the rules_rust wrapper until this PR.

We still do not have the V8/code-mode pieces stable under the Bazel CI
cross-compile setup, so this keeps those tests out of Bazel while
restoring coverage for the rest of the sharded Rust integration suites.
Cargo CI remains responsible for V8/code-mode coverage for now.

This change did uncover a real failing core test on `main`:
`approved_folder_write_request_permissions_unblocks_later_apply_patch`.
That fix is split into
[#21060](https://github.com/openai/codex/pull/21060), which enables the
`apply_patch` tool in the test, teaches the aggregate core test binary
to dispatch the sandboxed filesystem helper, canonicalizes the macOS
temp patch target, and isolates the core test harness from managed
local/enterprise config. Keeping that fix separate lets this PR stay
focused on restoring Bazel coverage while documenting the first failure
it exposed.

## What changed

- Build sharded Rust integration tests as manual `*-bin` binaries and
run them through the existing manifest-aware `workspace_root_test`
launcher.
- Keep Bazel sharding on the launcher target so Rust test cases are
still distributed by stable test-name hashing.
- Configure Bazel CI to skip Rust tests whose names contain
`suite::code_mode::`.
- Exclude the standalone `codex-rs/code-mode` and `codex-rs/v8-poc`
unit-test targets from `bazel.yml`.

## Verification

- `bazel query --output=build //codex-rs/core:core-all-test` now shows
`workspace_root_test` wrapping `//codex-rs/core:core-all-test-bin`.
- `bazel test --test_output=all --nocache_test_results
--test_sharding_strategy=disabled //codex-rs/core:core-all-test
--test_filter=suite::request_permissions_tool::approved_folder_write_request_permissions_unblocks_later_apply_patch`
runs the actual Rust test body and passes.
- `bazel test --test_output=errors --nocache_test_results
--test_env=CODEX_BAZEL_TEST_SKIP_FILTERS=suite::code_mode::
//codex-rs/core:core-all-test` runs the sharded target with code-mode
skipped and passes overall locally, with one flaky attempt retried by
the existing `flaky = True` setting.

Michael Bolin · 2026-05-04 13:33:14 -07:00

30de54da36

ci: cross-compile Windows Bazel clippy (#20701 )

## Why

#20585 moved the Windows Bazel test job to the cross-compile path, but
the Windows Bazel clippy and verify-release-build jobs were still using
the native Windows/MSVC-host fallback. Those two jobs became the slowest
Windows PR legs, even though both are build-only signal and do not need
to execute the resulting binaries.

## What Changed

- Switches the Windows Bazel clippy job from
`--windows-msvc-host-platform` to `--windows-cross-compile`, so clippy
build actions use Linux RBE while still targeting
`x86_64-pc-windows-gnullvm`.
- Switches the Windows Bazel verify-release-build job to
`--windows-cross-compile` as well. This job only compiles
`cfg(not(debug_assertions))` Rust code under `fastbuild`, so it does not
need a native Windows build host.
- Keeps the old `--skip_incompatible_explicit_targets` behavior only for
fork/community PRs without `BUILDBUDDY_API_KEY`, where `run-bazel-ci.sh`
falls back to the local Windows MSVC-host shape.
- Adds `--windows-cross-compile` support to
`.github/scripts/run-bazel-query-ci.sh`, so target-discovery queries
select the same `ci-windows-cross` config as the subsequent build.
- Threads that option through `scripts/list-bazel-clippy-targets.sh` so
the Windows clippy job discovers targets under the same platform shape
as the subsequent clippy build.

## Verification

Local checks:

```shell
bash -n .github/scripts/run-bazel-query-ci.sh
bash -n scripts/list-bazel-clippy-targets.sh
ruby -e 'require "yaml"; YAML.load_file(".github/workflows/bazel.yml"); puts "ok"'
RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh | grep -c -- '-windows-cross-bin$'
RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh --windows-cross-compile | grep -c -- '-windows-cross-bin$'
```

The Linux target-list check reported `0` Windows-cross internal test
binaries, while the Windows cross target-list check reported `47`,
preserving the test-code clippy coverage shape from the existing Windows
job.

Michael Bolin · 2026-05-01 16:40:29 -07:00

cd2760fc08

ci: cross-compile Windows Bazel tests (#20585 )

## Status

This is the Bazel PR-CI cross-compilation follow-up to #20485. It is
intentionally split from the Cargo/cargo-xwin release-build PoC so
#20485 can stay as the historical release-build exploration. The
unrelated async-utils test cleanup has been moved to #20686, so this PR
is focused on the Windows Bazel CI path.

The intended tradeoff is now explicit in `.github/workflows/bazel.yml`:
pull requests get the fast Windows cross-compiled Bazel test leg, while
post-merge pushes to `main` run both that fast cross leg and a fully
native Windows Bazel test leg. The native main-only job keeps full
V8/code-mode coverage and gets a 40-minute timeout because it is less
latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes.

## Why

Windows Bazel PR CI currently does the expensive part of the build on
Windows. A native Windows Bazel test job on `main` completed in about
28m12s, leaving very little headroom under the 30-minute job timeout and
making Windows the slowest PR signal.

#20485 showed that Windows cross-compilation can be materially faster
for Cargo release builds, but PR CI needs Bazel because Bazel owns our
test sharding, flaky-test retries, and integration-test layout. This PR
applies the same high-level shape we already use for macOS Bazel CI:
compile with remote Linux execution, then run platform-specific tests on
the platform runner.

The compromise is deliberately signal-aware: code-mode/V8 changes are
rare enough that PR CI can accept losing the direct V8/code-mode
smoke-test signal temporarily, while `main` still runs the native
Windows job post-merge to catch that class of regression. A follow-up PR
should investigate making the cross-built Windows gnullvm V8 archive
pass the direct V8/code-mode tests so this tradeoff can eventually go
away.

## What Changed

- Adds a `ci-windows-cross` Bazel config that targets
`x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps
`TestRunner` actions local on the Windows runner.
- Adds explicit Windows platform definitions for
`windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain
that lets gnullvm test targets execute under the Windows MSVC host
platform.
- Updates the Windows Bazel PR test leg to opt into the cross-compile
path via `--windows-cross-compile` and `--remote-download-toplevel`.
- Adds a `test-windows-native-main` job that runs only for `push` events
on `refs/heads/main`, uses the native Windows Bazel path, includes
V8/code-mode smoke tests, and has `timeout-minutes: 40`.
- Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous
local Windows MSVC-host fallback, including
`--host_platform=//:local_windows_msvc` and `--jobs=8`.
- Preserves the existing integration-test shape on non-gnullvm
platforms, while generating Windows-cross wrapper targets only for
`windows_gnullvm`.
- Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime,
avoiding hard-coded Cargo paths and duplicate test runfiles.
- Extends the V8 Bazel patches enough for the
`x86_64-pc-windows-gnullvm` target and Linux remote execution path.
- Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT`
at runtime when Bazel provides it, because cross-compiled binaries may
contain Linux compile-time paths.
- Keeps the direct V8/code-mode unit smoke tests out of the Windows
cross PR path for now while native Windows CI continues to cover them
post-merge.

## Command Shape

The fast Windows PR test leg invokes the normal Bazel CI wrapper like
this:

```shell
./.github/scripts/run-bazel-ci.sh \
  --print-failed-action-summary \
  --print-failed-test-logs \
  --windows-cross-compile \
  --remote-download-toplevel \
  -- \
  test \
  --test_tag_filters=-argument-comment-lint \
  --test_verbose_timeout_warnings \
  --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
  -- \
  //... \
  -//third_party/v8:all \
  -//codex-rs/code-mode:code-mode-unit-tests \
  -//codex-rs/v8-poc:v8-poc-unit-tests
```

With the BuildBuddy secret available on Windows, the wrapper selects
`--config=ci-windows-cross` and appends the important Windows-cross
overrides after rc expansion:

```shell
--host_platform=//:rbe
--shell_executable=/bin/bash
--action_env=PATH=/usr/bin:/bin
--host_action_env=PATH=/usr/bin:/bin
--test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}
```

The native post-merge Windows job intentionally omits
`--windows-cross-compile` and does not exclude the V8/code-mode unit
targets:

```shell
./.github/scripts/run-bazel-ci.sh \
  --print-failed-action-summary \
  --print-failed-test-logs \
  -- \
  test \
  --test_tag_filters=-argument-comment-lint \
  --test_verbose_timeout_warnings \
  --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
  --build_metadata=TAG_windows_native_main=true \
  -- \
  //... \
  -//third_party/v8:all
```

## Research Notes

The existing macOS Bazel CI config already uses the model we want here:
build actions run remotely with `--strategy=remote`, but `TestRunner`
actions execute on the macOS runner. This PR mirrors that pattern for
Windows with `--strategy=TestRunner=local`.

The important Bazel detail is that `rules_rs` is already targeting
`x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes
where the build actions execute; it does not switch the Bazel PR test
target to Cargo, `cargo-nextest`, or the MSVC release target.

Cargo release builds differ from this Bazel path for V8: the normal
Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt
Windows MSVC `.lib.gz` archives. The Bazel PR path targets
`windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows
GNU/gnullvm archive, so this PR builds that archive in-tree. That
Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode
smoke tests, which is why the workflow keeps native Windows coverage on
`main`.

The less obvious Bazel detail is test wrapper selection. Bazel chooses
the Windows test wrapper (`tw.exe`) from the test action execution
platform, not merely from the Rust target triple. The outer
`workspace_root_test` therefore declares the default test toolchain and
uses the bridge toolchain above so the test action executes on Windows
while its inner Rust binary is built for gnullvm.

The V8 investigation exposed a Windows-client gotcha: even when an
action execution platform is Linux RBE, Bazel can still derive the
genrule shell path from the Windows client. That produced remote
commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux
workers. The wrapper now passes `--shell_executable=/bin/bash` with
`--host_platform=//:rbe` for the Windows cross path.

The same Windows-client/Linux-RBE boundary also affected
`third_party/v8:binding_cc`: a multiline genrule command can carry CRLF
line endings into Linux remote bash, which failed as `$'\r'`. That
genrule now keeps the `sed` command on one physical shell line while
using an explicit Starlark join so the shell arguments stay readable.

## Verification

Local checks included:

```shell
bash -n .github/scripts/run-bazel-ci.sh
bash -n workspace_root_test_launcher.sh.tpl
ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}"
RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh
RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh
RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh
RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh
```

The Linux clippy and argument-comment target lists contain zero
`*-windows-cross-bin` labels, while the Windows lists still include 47
Windows-cross internal test binaries.

CI evidence:

- Baseline native Windows Bazel test on `main`: success in about 28m12s,
https://github.com/openai/codex/actions/runs/25206257208/job/73907325959
- Green Windows-cross Bazel run on the split PR before adding the
main-only native leg: Windows test 9m16s, Windows release verify 5m10s,
Windows clippy 4m43s,
https://github.com/openai/codex/actions/runs/25231890068
- The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`;
CI is rerunning on that focused diff.

## Follow-Up

A subsequent PR should investigate making a cross-built Windows binary
work with V8/code-mode enabled. Likely options are either making the
Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or
evaluating whether a Bazel MSVC target/toolchain can reuse the same
prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already
use.

Michael Bolin · 2026-05-01 15:55:28 -07:00

466798aa83

ci: reuse Bazel CI startup for target-discovery queries (#19232 )

## Why

A rerun of the Windows Bazel clippy job after
[#19161](https://github.com/openai/codex/pull/19161) had exactly the
cache behavior we wanted in BuildBuddy: zero action-cache misses. Even
so, the GitHub job still took a little over five minutes.

The problem was that the job was paying for two separate Bazel startup
paths:

1. a `bazel query` to discover extra lint targets
2. the real `bazel build --config=clippy ...` invocation

On Windows, that query was bypassing the CI Bazel wrapper, so it did not
reuse the same `--output_user_root`, CI config, or remote-cache setup as
the real build. In practice that meant the rerun could still cold-start
a separate Bazel server before the actual clippy build even began.

## What

- add `.github/scripts/run-bazel-query-ci.sh` to run CI-side Bazel
queries with the same startup and cache-related flags as the main Bazel
command
- switch `scripts/list-bazel-clippy-targets.sh` to use that helper for
manual `rust_test` target discovery
- switch `tools/argument-comment-lint/list-bazel-targets.sh` to use the
same helper
- simplify `.github/scripts/run-argument-comment-lint-bazel.sh` so its
Windows-only query path also goes through the shared helper

This keeps the target-discovery queries aligned with the later
build/test invocation instead of treating them as a separate cold Bazel
session.

## Verification

- `bash -n .github/scripts/run-bazel-query-ci.sh`
- `bash -n scripts/list-bazel-clippy-targets.sh`
- `bash -n tools/argument-comment-lint/list-bazel-targets.sh`
- `bash -n .github/scripts/run-argument-comment-lint-bazel.sh`
- mocked a Windows invocation of `run-bazel-query-ci.sh` and verified it
forwards `--output_user_root`, `--config=ci-windows`, the BuildBuddy
auth header, and the repository cache flags

## Docs

No documentation updates are needed.

Michael Bolin · 2026-04-23 23:26:17 -07:00

b68366718b

ci: align Bazel repo cache and Windows clippy target handling (#16740 )

## Why

Bazel CI had two independent Windows issues:

- The workflow saved/restored `~/.cache/bazel-repo-cache`, but
`.bazelrc` configured `common:ci-windows
--repository_cache=D:/a/.cache/bazel-repo-cache`, so `actions/cache` and
Bazel could point at different directories.
- The Windows `Bazel clippy` job passed the full explicit target list
from `//codex-rs/...`, but some of those explicit targets are
intentionally incompatible with `//:local_windows`.
`run-argument-comment-lint-bazel.sh` already handles that with
`--skip_incompatible_explicit_targets`; the clippy workflow path did
not.

I also tried switching the workflow cache path to
`D:\a\.cache\bazel-repo-cache`, but the Windows clippy job repeatedly
failed with `Failed to restore: Cache service responded with 400`, so
the final change standardizes on `$HOME/.cache/bazel-repo-cache` and
makes cache restore non-fatal.

## What Changed

- Expose one repository-cache path from
`.github/actions/setup-bazel-ci/action.yml` and export that path as
`BAZEL_REPOSITORY_CACHE` so `run-bazel-ci.sh` passes it to Bazel after
`--config=ci-*`.
- Move `actions/cache/restore` out of the composite action into
`.github/workflows/bazel.yml`, and make restore failures non-fatal
there.
- Save exactly the exported cache path in `.github/workflows/bazel.yml`.
- Remove `common:ci-windows
--repository_cache=D:/a/.cache/bazel-repo-cache` from `.bazelrc` so the
Windows CI config no longer disagrees with the workflow cache path.
- Pass `--skip_incompatible_explicit_targets` in the Windows `Bazel
clippy` job so incompatible explicit targets do not fail analysis while
the lint aspect still traverses compatible Rust dependencies.

## Verification

- Parsed `.github/actions/setup-bazel-ci/action.yml` and
`.github/workflows/bazel.yml` with Ruby's YAML loader.
- Resubmitted PR `#16740`; CI is rerunning on the amended commit.

Michael Bolin · 2026-04-03 20:18:33 -07:00

39097ab65d

Back out "bazel: lint rust_test targets in clippy workflow (#16450 )" (#16757 )

This backs out https://github.com/openai/codex/pull/16450 because it was
not good to go yet.

Michael Bolin · 2026-04-03 20:01:26 -07:00

c9e706f8b6

bazel: lint rust_test targets in clippy workflow (#16450 )

## Why

`cargo clippy --tests` was catching warnings in inline `#[cfg(test)]`
code that the Bazel PR Clippy lane missed. The existing Bazel invocation
linted `//codex-rs/...`, but that did not apply Clippy to the generated
manual `rust_test` binaries, so warnings in targets such as
`//codex-rs/state:state-unit-tests-bin` only surfaced as plain compile
warnings instead of failing the lint job.

## What Changed

- added `scripts/list-bazel-clippy-targets.sh` to expand the Bazel
Clippy target set with the generated manual `rust_test` rules while
still excluding `//codex-rs/v8-poc:all`
- updated `.github/workflows/bazel.yml` to use that expanded target list
in the Bazel Clippy PR job
- updated `just bazel-clippy` to use the same target expansion locally
- updated `.github/workflows/README.md` to document that the Bazel PR
lint lane now covers inline `#[cfg(test)]` code

## Verification

- `./scripts/list-bazel-clippy-targets.sh` includes
`//codex-rs/state:state-unit-tests-bin`
- `bazel build --config=clippy -- //codex-rs/state:state-unit-tests-bin`
now fails with the same unused import in `state/src/runtime/logs.rs`
that `cargo clippy --tests` reports

Michael Bolin · 2026-04-03 22:44:53 +00:00

f263607c60

10 Commits