mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
33 Commits
-
[codex] group blocking and postmerge CI workflows (#30146)
## Why

It's hard to change the set of required jobs when they're managed in the GitHub UI, and when each workflow is responsible for choosing its own scheduling it's easy to end up with skew between what we enforce on PRs vs. on main.

## What

- add a `blocking-ci` caller workflow, triggered by pull requests and pushes to `main`, for Bazel, blob size, cargo-deny, Codespell, `repo-checks`, rust CI, and SDK CI
- add an `always()` terminal job named `CI required` that fails unless every called workflow succeeds
- add a `postmerge-ci` caller workflow for `rust-ci-full` and `v8-canary`, with a terminal `Postmerge CI results` job
- centralize V8 relevance detection in `v8_canary_changes.py`; unrelated PR and postmerge runs execute metadata only and skip the expensive build matrices
- leave `v8-canary` outside the blocking gate and leave the external `cla` check independent

## Rollout

A repository admin must replace the existing required GitHub Actions contexts with `CI required` in the main-branch ruleset. Retain `cla` as a separate required check. Until that change is coordinated, this PR cannot satisfy the old standalone check names. In-flight PRs will need to be rebased after this lands.
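The `CI required` gate described in this commit amounts to a small aggregation rule. A minimal Python sketch of that rule, assuming each called workflow reports one of the GitHub Actions result strings (`success`, `failure`, `cancelled`, `skipped`); the function name, the job keys, and the treatment of an empty result set are illustrative assumptions, and the real gate is a workflow job rather than Python:

```python
from typing import Mapping


def ci_required_passes(results: Mapping[str, str]) -> bool:
    """Return True only if every called workflow reported "success".

    `results` maps a (hypothetical) job name to its result string.
    """
    # Assumption: an empty result set counts as a failure, because
    # nothing was actually verified in that case.
    if not results:
        return False
    # Anything other than "success" (failure, cancelled, skipped) fails the gate.
    return all(result == "success" for result in results.values())


if __name__ == "__main__":
    print(ci_required_passes({"bazel": "success", "rust-ci": "success"}))
    print(ci_required_passes({"bazel": "success", "rust-ci": "cancelled"}))
```

Treating `skipped` and `cancelled` as failures follows the commit's wording ("fails unless every called workflow succeeds"); a looser gate would need to list the acceptable results explicitly.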
Adam Perry @ OpenAI ·
2026-06-26 15:07:05 -07:00 -
ci: template custom runner names by repo (#27024)
## Why

These workflows currently hard-code the `codex` runner group and custom runner labels. That makes the same workflow definitions less portable across repository copies or renamed repos, even though the runner fleet follows the repository name scheme.

Template the runner identities from the repository name so `openai/codex` still resolves to the existing `codex-*` runners while other repos can use their own `<repo>-*` runner names.

## What Changed

- Replaced custom runner `group` values such as `codex-runners` with `${{ github.event.repository.name }}-runners`.
- Replaced custom runner labels such as `codex-linux-x64` and `codex-windows-arm64` with `${{ github.event.repository.name }}-...`.
- Covered direct `runs-on` objects, matrix `runs_on` entries, reusable workflow runner inputs, and release runner labels.

## Verification

- Parsed all `.github/workflows/*.yml` files as YAML with Ruby.
- Searched `.github/workflows` to confirm no hardcoded runner-field `codex-runners` or `codex-*` labels remain.

Michael Bolin ·
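The naming scheme in this commit can be illustrated outside of workflow YAML. A minimal Python sketch, assuming the runner group is `<repo>-runners` and each label is `<repo>-<platform>` as the commit describes; the function name and the returned dictionary shape are hypothetical:

```python
from typing import Dict, List, Union


def runner_names(repo: str, platforms: List[str]) -> Dict[str, Union[str, List[str]]]:
    """Derive the runner group and labels from the repository name."""
    return {
        # Mirrors `${{ github.event.repository.name }}-runners`.
        "group": f"{repo}-runners",
        # Mirrors `${{ github.event.repository.name }}-<platform>`.
        "labels": [f"{repo}-{platform}" for platform in platforms],
    }


if __name__ == "__main__":
    # For the original repository name this resolves to the existing runners.
    print(runner_names("codex", ["linux-x64", "windows-arm64"]))
```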
2026-06-08 11:47:23 -07:00 -
ci: use bazel environment for BuildBuddy secret (#26895)
## Why

`BUILDBUDDY_API_KEY` now lives in the `bazel` GitHub Actions environment as an environment secret. Jobs that need BuildBuddy credentials must opt into that environment so `${{ secrets.BUILDBUDDY_API_KEY }}` resolves from the protected environment secret instead of relying on an unscoped repository/organization secret.

This follows the same environment-secret migration pattern as #26466.

## What Changed

- Attach each workflow job that reads `BUILDBUDDY_API_KEY` to the `bazel` environment.
- Set `deployment: false` on those job-level environment blocks.

`deployment: false` lets the job enter the `bazel` environment to access its environment secrets without creating GitHub deployment records for these CI jobs. That keeps the environment as a secret/access-control boundary without making ordinary Bazel CI runs look like deploys.

## Validation

- Parsed the modified workflow YAML files with Ruby's YAML parser.
- Checked the modified workflow files for trailing whitespace.

Michael Bolin ·
2026-06-07 09:24:54 -07:00 -
build: use ThinLTO for release binaries (#23710)
## Why Fat LTO makes release builds substantially slower without providing enough measured runtime benefit to justify the release CI long pole. The build-profile investigation found that keeping Cargo's default release `opt-level=3` and switching from fat LTO to ThinLTO (`3/thin/1`) reduced a clean `codex-cli` release build from 2073.893 seconds to 1243.172 seconds, a 40.06% improvement. The resulting binary increased from 196.7 MiB to 211.8 MiB (+7.63%). Measured runtime changes were small: the worst image workload median was +0.86% and app-server startup was +0.31% relative to fat LTO. ThinLTO retains cross-crate optimization while avoiding most of the fat-LTO build cost. This deliberately avoids global size optimization: final-executable testing showed a substantial regression on the image request path, which is expected to become more important as image usage grows. ## What changed - Set the workspace release profile to `lto = "thin"`, retaining Cargo's default release `opt-level=3`. - Remove release and CI workflow-specific LTO overrides so release-profile builds consistently use the workspace setting. - Remove the now-unused Windows release workflow input and related diagnostic output. ## Validation - Confirmed the release profile parses with `cargo metadata --no-deps --format-version 1`. - CI validates release builds across the supported target matrix.
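The build-time figure quoted in this commit can be re-derived from the two measurements it reports. A short Python check, using only the numbers stated in the commit message (the variable names are illustrative):

```python
# Clean `codex-cli` release build times reported in the commit message.
fat_lto_seconds = 2073.893
thin_lto_seconds = 1243.172

# Absolute and relative build-time savings of ThinLTO over fat LTO.
saved_seconds = fat_lto_seconds - thin_lto_seconds
improvement_pct = 100 * saved_seconds / fat_lto_seconds

print(f"saved {saved_seconds:.3f} s ({improvement_pct:.2f}% faster)")
```

The improvement is measured relative to the fat-LTO baseline, which is why the denominator is the slower build time.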
Adam Perry @ OpenAI ·
2026-06-04 20:07:53 +00:00 -
Revert "Add app-server startup benchmark crate" (#24937)
Reverts openai/codex#24651, broke musl job https://github.com/openai/codex/actions/runs/26585495205/job/78330166927
Adam Perry @ OpenAI ·
2026-05-28 17:49:41 +00:00 -
Remove libubsan CI workaround (#24782)
It seems that this was added to allow rustc to load proc macros that had been compiled with UBSan enabled, which zig does for debug and `ReleaseSafe` builds. When zig drives the link of the final binary it knows to include the ubsan runtime, but our zig-built artifacts are being linked into a binary whose linking rustc drives. This removes the libubsan workaround we have and replaces it with `-fno-sanitize=undefined` passed to zig. The new argument is passed at the end of zig's args so should take precedence over any earlier arguments from the script's caller.
Adam Perry @ OpenAI ·
2026-05-28 15:49:01 +00:00 -
Add app-server startup benchmark crate (#24651)
## Summary - Add a new `app-server-start-bench` crate to measure app-server startup performance - Wire the benchmark into the workspace and Bazel build so it can be run consistently - Update lockfiles and repo automation to account for the new package
Adam Perry @ OpenAI ·
2026-05-28 08:46:30 -07:00 -
Uprev Rust toolchain pins to 1.95.0 (#24684)
## Summary - Bump the workspace Rust toolchain from `1.93.0` to `1.95.0` across Cargo, Bazel, CI, release workflows, devcontainers, and the Codex environment config. - Refresh `MODULE.bazel.lock` so the Bazel Rust toolchain artifacts match the new version. - Leave purpose-specific toolchains unchanged, including the `argument-comment-lint` nightly and the upstream `rusty_v8` `1.91.0` build pin. - Includes fixes for new lints from `just fix` and a few codex-authored fixes for lints without a suggestion.
Adam Perry @ OpenAI ·
2026-05-26 20:59:47 -07:00 -
[codex] Add image re-encoding benchmarks (#23935)
## Summary - add Divan benchmarks for prompt image re-encoding paths - wire the image benchmark smoke test into Rust CI workflows ## Why Image prompt handling includes re-encoding work that benefits from repeatable benchmark coverage so changes can be measured in CI and locally. This already helped identify a potential regression from changing compiler flags. ## Impact Developers can run and compare the new image re-encoding benchmarks, and CI exercises the benchmark target via the Rust benchmark smoke test.
Adam Perry @ OpenAI ·
2026-05-22 22:38:40 +00:00 -
ci: Use codex produced v8 artifacts for release builds (#23934)
Updates our build script to pull down the pre-built v8 artifacts, as we already do in CI, when building v8 into our targets. This changes the flow so that the workflow now pre-installs pre-built rusty v8 assets for all of our release targets. Secondarily, when the Python script is run locally and the user hasn't set the proper values, it now optionally pulls the assets down and provides those values itself. Sorry for the miss here.
Channing Conger ·
2026-05-22 09:42:08 -07:00 -
Fan out rust-ci-full nextest by platform (#23358)
## Why `rust-ci-full` was paying the full Cargo nextest build-and-run cost once per platform, with Windows ARM64 as the long pole. This change moves the heavy work into one reusable per-platform flow: build a nextest archive once, then replay it across four shards so the platform lane spends less time running tests serially. For Windows ARM64, the archive is cross-compiled on Windows x64 and replayed on native Windows ARM64 shards so the slow ARM64 machine is used for execution rather than compilation. ## What changed - split the `rust-ci-full` nextest matrix into five explicit per-platform reusable-workflow calls - add `.github/workflows/rust-ci-full-nextest-platform.yml` to build one archive, upload timings/helpers, replay four nextest shards, upload per-shard JUnit, and roll the shard status back up per platform - add Windows CI helpers for Dev Drive setup and MSVC ARM64 linker environment export so the Windows ARM64 archive can be produced on Windows x64 - keep the existing Cargo git CLI fetch hardening inside the reusable workflow, since caller workflow-level `env` does not flow through `workflow_call` - document the archive-backed shard shape in `.github/workflows/README.md` - raise the default nextest slow timeout to 30s so the sharded full-CI path does not treat every >15s test as stuck ## Verification - validated the archive/shard flow with live GitHub Actions runs on this PR branch - Windows ARM64 cross-compile latency on completed runs: - https://github.com/openai/codex/actions/runs/26118759651: `34m30s` lane e2e, `17m16s` archive build, `9m55s` shard phase - https://github.com/openai/codex/actions/runs/26120777976: `30m36s` lane e2e, `17m21s` archive build, `6m50s` shard phase - comparable pre-cross-compile sharded Windows ARM64 runs were `55m01s`, `50m21s`, and `46m42s`, so the completed cross-compile runs improved the lane by roughly `12m` to `24m` versus the prior range - latest corrected cross-compile run: 
https://github.com/openai/codex/actions/runs/26120777976 - Windows ARM64 archive built successfully on Windows x64 - native Windows ARM64 shards started immediately after the archive upload - 3/4 Windows ARM64 shards passed; the failing shard hit the same existing `code_mode` test failure seen outside this lane - downloaded failed-shard JUnit XML from the validation runs and confirmed the remaining red is from known test failures, not archive/shard wiring - no local Codex tests run per repo guidance ## Notes - this PR does not change developers.openai.com documentation
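The "build one archive, replay it across four shards" shape relies on every shard selecting a disjoint slice of the same test list. A minimal Python sketch of one way to do that split; the round-robin rule and the function name are assumptions for illustration only, since the commit does not say which partitioning scheme the workflow uses, and this is not nextest's own algorithm:

```python
from typing import List, Sequence


def shard_tests(tests: Sequence[str], index: int, total: int) -> List[str]:
    """Return the tests assigned to 1-based shard `index` out of `total`."""
    if total < 1 or not 1 <= index <= total:
        raise ValueError(f"invalid shard {index}/{total}")
    # Sort first so every shard sees the same ordering regardless of
    # how the test list was discovered on that machine.
    ordered = sorted(tests)
    return [name for position, name in enumerate(ordered) if position % total == index - 1]


if __name__ == "__main__":
    # Hypothetical test names; any list of unique strings works.
    tests = ["suite::a", "suite::b", "suite::c", "suite::d", "suite::e"]
    for shard in range(1, 5):
        print(shard, shard_tests(tests, shard, 4))
```

Because each shard only needs the archive and its own index, the slow platform spends its time executing tests rather than compiling them, which is the effect the commit reports for Windows ARM64.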
starr-openai ·
2026-05-19 17:54:41 -07:00 -
Reduce rust-ci-full Windows nextest timeout flakes (#23253)
## Why Recent `rust-ci-full` failures were dominated by transient Windows timeout clusters in process-heavy tests such as `suite::resume`, `suite::cli_stream`, `suite::auth_env`, `start_thread_uses_all_default_environments_from_codex_home`, and `connect_stdio_command_initializes_json_rpc_client_on_windows`. The goal here is to make those known flaky paths less likely to fail full CI without relaxing the global nextest timeout policy. ## What changed - Enable one global nextest retry with `retries = 1` so a single transient failure can recover. - Add a `windows_process_heavy` test group with `max-threads = 2` for the recurring Windows subprocess/session-heavy timeout families. - Add Windows-only slow-timeout overrides for that process-heavy group. - Add a narrower Windows-only timeout override for `start_thread_uses_all_default_environments_from_codex_home`, which still exceeded the broader Windows bucket in both Windows full-CI lanes. - Increase the `rust-ci-full` nextest job timeout from `45m` to `60m` so Windows ARM64 still has job-level headroom after retries and targeted per-test timeout increases. - Keep the global `slow-timeout` unchanged at `15s`. ## Validation Validated through `rust-ci-full` GitHub Actions reruns on this PR. Observed improvement on the tuned Windows lanes: - Windows x64 went from `5 timed out` to `0 timed out`. - Windows ARM64 went from `2 timed out` to `0 timed out`. - `start_thread_uses_all_default_environments_from_codex_home` recovered as a flaky pass on Windows ARM64 instead of timing out. The remaining failing tests in those runs were unrelated hard failures outside this nextest timeout tuning.
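The effect of `retries = 1` can be sketched as a small state machine: a test that fails once and then passes is reported as flaky rather than failed. A minimal Python illustration under that assumption; the function name and the `pass`/`flaky`/`fail` strings are hypothetical and do not reproduce nextest's implementation or output format:

```python
from typing import Callable


def run_with_retries(test: Callable[[], bool], retries: int = 1) -> str:
    """Run `test` up to `retries + 1` times and classify the outcome."""
    for attempt in range(1, retries + 2):
        if test():
            # A pass on the first attempt is clean; a later pass is flaky.
            return "pass" if attempt == 1 else "flaky"
    return "fail"


if __name__ == "__main__":
    outcomes = iter([False, True])  # transient failure, then success
    print(run_with_retries(lambda: next(outcomes)))
```

A single retry recovers one transient failure per test but still surfaces hard failures, which matches the commit's goal of reducing flakes without relaxing the global timeout policy.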
starr-openai ·
2026-05-18 13:06:39 -07:00 -
Upload rust full CI JUnit reports (#23273)
## Why `rust-ci-full` failures currently leave downstream investigation reconstructing basic test facts from raw logs. `cargo nextest` can emit standard JUnit XML for each lane, which gives us a small structured artifact for post-run failure analysis without changing the test execution model. ## What changed - enable nextest JUnit output in `codex-rs/.config/nextest.toml` - upload the lane-scoped JUnit XML artifact from each `rust-ci-full` test lane ## Verification - `rust-ci-full` run `26018931531` on head `52d77c60e79b36859d944ef28a36b014055c5c48` produced JUnit artifacts for macOS, Linux x64 remote, Windows x64, and Windows ARM64 test lanes - `rust-ci-full` run `26021241006` on the same head produced the missing Linux ARM JUnit artifact after the first run lost that runner before export - downloaded all five lane JUnit artifacts and verified each contains non-empty test counters and failure data
starr-openai ·
2026-05-18 11:10:37 -07:00 -
Enable `--deny-warnings` for `cargo shear` (#21616)
## Summary

In https://github.com/openai/codex/pull/21584, we disabled doctests for crates that lack any doctests. We can enforce that property via `cargo shear --deny-warnings`: crates that lack doctests will be flagged if doctests are enabled, and crates with doctests will be flagged if doctests are disabled.

A few additional notes:

- By adding `--deny-warnings`, `cargo shear` also flagged a number of modules that were not reachable at all. Some of those have been removed.
- This PR removes a usage of `windows_modules!` (since `cargo shear` and `rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os = "windows")]` macros. As a consequence, many of these files exhibit churn in this PR, since they weren't being formatted by `rustfmt` at all on main.
- Again, to make the code more analyzable, this PR also removes some usages of `#[path = "cwd_junction.rs"]` in favor of a more standard module structure. The bin sidecar structure is still retained, but, e.g., `windows-sandbox-rs/src/bin/command_runner.rs` was moved to `windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on.

---------

Co-authored-by: Codex <noreply@openai.com>
Charlie Marsh ·
2026-05-08 20:29:00 +00:00 -
[codex] Address some more GHA hygiene issues (#21622)
This does two things: - We use `persist-credentials: false` everywhere now. This is unfortunately not the default in GitHub Actions, but it prevents `actions/checkout` from dropping `secrets.GITHUB_TOKEN` onto disk. - We interpose (some) template expansions through environment variables. I've limited this to contexts that have non-fixed values; contexts that are fixed (like `*.result`) are not dangerous to expand directly inline (but maybe we should clean those up in the future for consistency anyways). This is a medium-risk change in terms of CI breakage: I did a scan for usage of `git push` and other commands that implicitly use the persisted credential, but couldn't find any. Even still, some implicit usages of the persisted credentials may be lurking. Please ping ww@ if any issues arise.
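The second point, routing template expansions through environment variables, guards against untrusted text being parsed as shell syntax. A minimal Python sketch of the same idea, assuming a POSIX `sh` is available; the function name and the `UNTRUSTED_INPUT` variable are hypothetical, and the workflows themselves do this in YAML `env:` blocks rather than in Python:

```python
import os
import subprocess


def echo_via_env(untrusted: str) -> str:
    """Hand untrusted text to a shell script through an environment variable."""
    # The script text is fixed; the untrusted value is never spliced into it.
    script = 'printf "%s" "$UNTRUSTED_INPUT"'
    env = {**os.environ, "UNTRUSTED_INPUT": untrusted}
    result = subprocess.run(
        ["sh", "-c", script],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # If this string were pasted into the script, the shell would run `echo`.
    payload = '"; echo INJECTED; "'
    print(echo_via_env(payload))
```

Because the shell only expands the variable inside double quotes, the payload is treated as data and comes back unchanged instead of being executed.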
William Woodruff ·
2026-05-08 10:19:27 -07:00 -
Use `CARGO_NET_GIT_FETCH_WITH_CLI` in `rust-ci-full` for more reliable git fetches (#21628)
Cargo uses libgit2 by default. In uv, we gave up on this entirely and always call out to the git CLI because it is much more reliable. This is a part of my attempt to reduce flakes in `rust-ci-full`.
Zanie Blue ·
2026-05-08 09:53:21 -07:00 -
Fix `rust-ci-full` failures due to missing `bwrap` (#21604)
Since https://github.com/openai/codex/pull/21255, `rust-ci-full` has been failing due to a missing `bwrap`.

```
thread 'main' panicked at linux-sandbox/src/launcher.rs:43:13:
bubblewrap is unavailable: no system bwrap was found on PATH and no bundled codex-resources/bwrap binary was found next to the Codex executable
```

Since the happy path is now to use the system binary, let's ensure that's installed.

https://github.com/openai/codex/pull/21604/commits/8d5182663158ee2d15965f39eed26ffa339ecb7d was necessary for the `bwrap` executable to be discoverable when the working directory is `/`.

I ran `rust-ci-full` at https://github.com/openai/codex/actions/runs/25528074506

---------

Co-authored-by: Codex <noreply@openai.com>
Zanie Blue ·
2026-05-08 09:52:19 -07:00 -
pakrym-oai ·
2026-05-07 20:05:47 -07:00 -
Use `--locked` in cargo build and lint invocations (#21602)
This ensures CI fails if the committed lockfile is outdated
Zanie Blue ·
2026-05-07 23:14:18 +00:00 -
[codex] Fully qualify hash-pins in GitHub Actions (#21436)
This builds on top of https://github.com/openai/codex/pull/15828 by ensuring that hash-pinned actions with version comments are fully qualified, rather than referencing floating/mutable comments like "v7". This makes actions management tools behave more consistently. This shouldn't break anything, since it's comment only. But if it does, ping ww@ 🙂
William Woodruff ·
2026-05-07 14:31:20 -07:00 -
Upgrade `cargo-shear` to 1.11.2 (#21547)
## Summary

Catches a few additional dependencies (`sha2`, `url`) that should be in `dev-dependencies`.
Charlie Marsh ·
2026-05-07 11:07:18 -07:00 -
Curtis 'Fjord' Hawthorne ·
2026-04-24 17:49:29 -07:00 -
test: set Rust test thread stack size (#19067)
## Summary Set `RUST_MIN_STACK=8388608` for Rust test entry points so libtest-spawned test threads get an 8 MiB stack. The Windows BuildBuddy failure on #18893 showed `//codex-rs/tui:tui-unit-tests` exiting with a stack overflow in a `#[tokio::test]` even though later test binaries in the shard printed successful summaries. Default `#[tokio::test]` uses a current-thread Tokio runtime, which means the async test body is driven on libtest's std-spawned test thread. Increasing the test thread stack addresses that failure mode directly. To date, we have been fixing these stack-pressure problems with localized future-size reductions, such as #13429, and by adding `Box::pin()` in specific async wrapper chains. This gives us a baseline test-runner stack size instead of continuing to patch individual tests only after CI finds another large async future. ## What changed - Added `common --test_env=RUST_MIN_STACK=8388608` in `.bazelrc` so Bazel test actions receive the env var through Bazel's cache-keyed test environment path. - Set the same `RUST_MIN_STACK` value for Cargo/nextest CI entry points and `just test`. - Annotated the existing Windows Bazel linker stack reserve as 8 MiB so it stays aligned with the libtest thread stack size. ## Testing - `just --list` - parsed `.github/workflows/rust-ci.yml` and `.github/workflows/rust-ci-full.yml` with Ruby's YAML loader - compared `bazel aquery` `TestRunner` action keys before/after explicit `--test_env=RUST_MIN_STACK=...` and after moving the Bazel env to `.bazelrc` - `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors` - failed locally on the existing sandbox-specific status snapshot permission mismatch, but loaded the Starlark changes and ran the TUI test shards
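The value `8388608` is simply 8 MiB expressed in bytes, which is the unit `RUST_MIN_STACK` expects. A short Python check of that arithmetic (the constant names are illustrative):

```python
# One mebibyte in bytes.
MIB = 1024 * 1024

# Stack size chosen in the commit: 8 MiB, expressed in bytes.
RUST_MIN_STACK = 8 * MIB

print(f"RUST_MIN_STACK={RUST_MIN_STACK}")
```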
Michael Bolin ·
2026-04-22 19:51:49 -07:00 -
Reuse remote exec-server in core tests (#17837)
## Summary - reuse a shared remote exec-server for remote-aware codex-core integration tests within a test binary process - keep per-test remote cwd creation and cleanup so tests retain workspace isolation - leave codex_self_exe, codex_linux_sandbox_exe, cwd_path(), and workspace_path() behavior unchanged ## Validation - rustfmt codex-rs/core/tests/common/test_codex.rs - git diff --check - CI is running on the updated branch
starr-openai ·
2026-04-14 20:42:03 -07:00 -
fix: pin inputs (#17471)
## Summary

- Pin Rust git patch dependencies to immutable revisions and make cargo-deny reject unknown git and registry sources unless explicitly allowlisted.
- Add checked-in SHA-256 coverage for the current rusty_v8 release assets, wire those hashes into Bazel, and verify CI override downloads before use.
- Add rusty_v8 MODULE.bazel update/check tooling plus a Bazel CI guard so future V8 bumps cannot drift from the checked-in checksum manifest.
- Pin release/lint cargo installs and all external GitHub Actions refs to immutable inputs.

## Future V8 bump flow

Run these after updating the resolved `v8` crate version and checksum manifest:

```bash
python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
```

The update command rewrites the matching `rusty_v8_<crate_version>` `http_file` SHA-256 values in `MODULE.bazel` from `third_party/v8/rusty_v8_<crate_version>.sha256`. The check command is also wired into Bazel CI to block drift.

## Notes

- This intentionally excludes RustSec dependency upgrades and bubblewrap-related changes per request.
- The branch was rebased onto the latest origin/main before opening the PR.

## Validation

- cargo fetch --locked
- cargo deny check advisories
- cargo deny check
- cargo deny check sources
- python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
- python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
- python3 -m unittest discover -s .github/scripts -p 'test_rusty_v8_bazel.py'
- python3 -m py_compile .github/scripts/rusty_v8_bazel.py .github/scripts/rusty_v8_module_bazel.py .github/scripts/test_rusty_v8_bazel.py
- repo-wide GitHub Actions `uses:` audit: all external action refs are pinned to 40-character SHAs
- yq eval on touched workflows and local actions
- git diff --check
- just bazel-lock-check

## Hash verification

- Confirmed `MODULE.bazel` hashes match `third_party/v8/rusty_v8_146_4_0.sha256`.
- Confirmed GitHub release asset digests for denoland/rusty_v8 `v146.4.0` and openai/codex `rusty-v8-v146.4.0` match the checked-in hashes.
- Streamed and SHA-256 hashed all 10 `MODULE.bazel` rusty_v8 asset URLs locally; every downloaded byte stream matched both `MODULE.bazel` and the checked-in manifest.

## Pin verification

- Confirmed signing-action pins match the peeled commits for their tag comments: `sigstore/cosign-installer@v3.7.0`, `azure/login@v2`, and `azure/trusted-signing-action@v0`.
- Pinned the remaining tag-based action refs in Bazel CI/setup: `actions/setup-node@v6`, `facebook/install-dotslash@v2`, `bazelbuild/setup-bazelisk@v3`, and `actions/cache/restore@v5`.
- Normalized all `bazelbuild/setup-bazelisk@v3` refs to the peeled commit behind the annotated tag.
- Audited Cargo git dependencies: every manifest git dependency uses `rev` only, every `Cargo.lock` git source has `?rev=<sha>#<same-sha>`, and `cargo deny check sources` passes with `required-git-spec = "rev"`.
- Shallow-fetched each distinct git dependency repo at its pinned SHA and verified Git reports each object as a commit.
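The hash verification step (streaming each asset and comparing its SHA-256 with the checked-in value) can be sketched with the standard library. A minimal Python illustration, assuming the expected digest is available as a hex string; the function names are hypothetical and this is not the code in `rusty_v8_bazel.py`:

```python
import hashlib
import io
from typing import BinaryIO


def sha256_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a byte stream in fixed-size chunks and return the hex digest."""
    digest = hashlib.sha256()
    # Read incrementally so large release assets never sit fully in memory.
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(stream: BinaryIO, expected_hex: str) -> bool:
    """Compare the streamed digest with the checked-in hex digest."""
    return sha256_stream(stream) == expected_hex.strip().lower()


if __name__ == "__main__":
    # In-memory stand-in for a downloaded asset.
    asset = io.BytesIO(b"abc")
    print(sha256_stream(asset))
```

In practice the stream would be an HTTP response body or an open file, and a mismatch would abort the build before the asset is used.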
viyatb-oai ·
2026-04-14 01:45:41 +00:00 -
Add full-ci branch trigger (#16980)
Allow branches to trigger full ci (helpful to run remote tests)
pakrym-oai ·
2026-04-07 11:33:35 -07:00 -
ci: stop running rust CI with --all-features (#16473)
## Why Now that workspace crate features have been removed and `.github/scripts/verify_cargo_workspace_manifests.py` hard-bans new ones, Rust CI should stop building and testing with `--all-features`. Keeping `--all-features` in CI no longer buys us meaningful coverage for `codex-rs`, but it still makes the workflow look like we rely on Cargo feature permutations that we are explicitly trying to eliminate. It also leaves stale examples in the repo that suggest `--all-features` is a normal or recommended way to run the workspace. ## What changed - removed `--all-features` from the Rust CI `cargo chef cook`, `cargo clippy`, and `cargo nextest` invocations in `.github/workflows/rust-ci-full.yml` - updated the `just test` guidance in `justfile` to reflect that workspace crate features are banned and there should be no need to add `--all-features` - updated the multiline command example and snapshot in `codex-rs/tui/src/history_cell.rs` to stop rendering `cargo test --all-features --quiet` - tightened the verifier docstring in `.github/scripts/verify_cargo_workspace_manifests.py` so it no longer talks about temporary remaining exceptions ## How tested - `python3 .github/scripts/verify_cargo_workspace_manifests.py` - `cargo test -p codex-tui`
Michael Bolin ·
2026-04-01 14:06:20 -07:00 -
ci: run Windows argument-comment-lint via native Bazel (#16120)
## Why Follow-up to #16106. `argument-comment-lint` already runs as a native Bazel aspect on Linux and macOS, but Windows is still the long pole in `rust-ci`. To move Windows onto the same native Bazel lane, the toolchain split has to let exec-side helper binaries build in an MSVC environment while still linting repo crates as `windows-gnullvm`. Pushing the Windows lane onto the native Bazel path exposed a second round of Windows-only issues in the mixed exec-toolchain plumbing after the initial wrapper/target fixes landed. ## What Changed - keep the Windows lint lanes on the native Bazel/aspect path in `rust-ci.yml` and `rust-ci-full.yml` - add a dedicated `local_windows_msvc` platform for exec-side helper binaries while keeping `local_windows` as the `windows-gnullvm` target platform - patch `rules_rust` so `repository_set(...)` preserves explicit exec-platform constraints for the generated toolchains, keep the Windows-specific bootstrap/direct-link fixes needed for the nightly lint driver, and expose exec-side `rustc-dev` `.rlib`s to the MSVC sysroot - register the custom Windows nightly toolchain set with MSVC exec constraints while still exposing both `x86_64-pc-windows-msvc` and `x86_64-pc-windows-gnullvm` targets - enable `dev_components` on the custom Windows nightly repository set so the MSVC exec helper toolchain actually downloads the compiler-internal crates that `clippy_utils` needs - teach `run-argument-comment-lint-bazel.sh` to enumerate concrete Windows Rust rules, normalize the resulting labels, and skip explicitly requested incompatible targets instead of failing before the lint run starts - patch `rules_rust` build-script env propagation so exec-side `windows-msvc` helper crates drop forwarded MinGW include and linker search paths as whole flag/path pairs instead of emitting malformed `CFLAGS`, `CXXFLAGS`, and `LDFLAGS` - export the Windows VS/MSVC SDK environment in `setup-bazel-ci` and pass the relevant variables through `run-bazel-ci.sh` via 
`--action_env` / `--host_action_env` so Bazel build scripts can see the MSVC and UCRT headers on native Windows runs - add inline comments to the Windows `setup-bazel-ci` MSVC environment export step so it is easier to audit how `vswhere`, `VsDevCmd.bat`, and the filtered `GITHUB_ENV` export fit together - patch `aws-lc-sys` to skip its standalone `memcmp` probe under Bazel `windows-msvc` build-script environments, which avoids a Windows-native toolchain mismatch that blocked the lint lane before it reached the aspect execution - patch `aws-lc-sys` to prefer its bundled `prebuilt-nasm` objects for Bazel `windows-msvc` build-script runs, which avoids missing `generated-src/win-x86_64/*.asm` runfiles in the exec-side helper toolchain - annotate the Linux test-only callsites in `codex-rs/linux-sandbox` and `codex-rs/core` that the wider native lint coverage surfaced ## Patches This PR introduces a large patch stack because the Windows Bazel lint lane currently depends on behavior that upstream dependencies do not provide out of the box in the mixed `windows-gnullvm` target / `windows-msvc` exec-toolchain setup. - Most of the `rules_rust` patches look like upstream candidates rather than OpenAI-only policy. Preserving explicit exec-platform constraints, forwarding the right MSVC/UCRT environment into exec-side build scripts, exposing exec-side `rustc-dev` artifacts, and keeping the Windows bootstrap/linker behavior coherent all look like fixes to the Bazel/Rust integration layer itself. - The two `aws-lc-sys` patches are more tactical. They special-case Bazel `windows-msvc` build-script environments to avoid a `memcmp` probe mismatch and missing NASM runfiles. Those may be harder to upstream as-is because they rely on Bazel-specific detection instead of a general Cargo/build-script contract. - Short term, carrying these patches in-tree is reasonable because they unblock a real CI lane and are still narrow enough to audit. 
Long term, the goal should not be to keep growing a permanent local fork of either dependency. - My current expectation is that the `rules_rust` patches are less controversial and should be broken out into focused upstream proposals, while the `aws-lc-sys` patches are more likely to be temporary escape hatches unless that crate wants a more general hook for hermetic build systems. Suggested follow-up plan: 1. Split the `rules_rust` deltas into upstream-sized PRs or issues with minimized repros. 2. Revisit the `aws-lc-sys` patches during the next dependency bump and see whether they can be replaced by an upstream fix, a crate upgrade, or a cleaner opt-in mechanism. 3. Treat each dependency update as a chance to delete patches one by one so the local patch set only contains still-needed deltas. ## Verification - `./.github/scripts/run-argument-comment-lint-bazel.sh --config=argument-comment-lint --keep_going` - `RUNNER_OS=Windows ./.github/scripts/run-argument-comment-lint-bazel.sh --nobuild --config=argument-comment-lint --platforms=//:local_windows --keep_going` - `cargo test -p codex-linux-sandbox` - `cargo test -p codex-core shell_snapshot_tests` - `just argument-comment-lint` ## References - #16106
Michael Bolin ·
2026-03-30 15:32:04 -07:00 -
fix: close Bazel argument-comment-lint CI gaps (#16253)
## Why The Bazel-backed `argument-comment-lint` CI path had two gaps: - Bazel wildcard target expansion skipped inline unit-test crates from `src/` modules because the generated `*-unit-tests-bin` `rust_test` targets are tagged `manual`. - `argument-comment-mismatch` was still only a warning in the Bazel and packaged-wrapper entrypoints, so a typoed `/*param_name*/` comment could still pass CI even when the lint detected it. That left CI blind to real linux-sandbox examples, including the missing `/*local_port*/` comment in `codex-rs/linux-sandbox/src/proxy_routing.rs` and typoed argument comments in `codex-rs/linux-sandbox/src/landlock.rs`. ## What Changed - Added `tools/argument-comment-lint/list-bazel-targets.sh` so Bazel lint runs cover `//codex-rs/...` plus the manual `rust_test` `*-unit-tests-bin` targets. - Updated `just argument-comment-lint`, `rust-ci.yml`, and `rust-ci-full.yml` to use that helper. - Promoted both `argument-comment-mismatch` and `uncommented-anonymous-literal-argument` to errors in every strict entrypoint: - `tools/argument-comment-lint/lint_aspect.bzl` - `tools/argument-comment-lint/src/bin/argument-comment-lint.rs` - `tools/argument-comment-lint/wrapper_common.py` - Added wrapper/bin coverage for the stricter lint flags and documented the behavior in `tools/argument-comment-lint/README.md`. - Fixed the now-covered callsites in `codex-rs/linux-sandbox/src/proxy_routing.rs`, `codex-rs/linux-sandbox/src/landlock.rs`, and `codex-rs/core/src/shell_snapshot_tests.rs`. This keeps the Bazel target expansion narrow while making the Bazel and prebuilt-linter paths enforce the same strict lint set. ## Verification - `python3 -m unittest discover -s tools/argument-comment-lint -p 'test_*.py'` - `cargo +nightly-2025-09-18 test --manifest-path tools/argument-comment-lint/Cargo.toml` - `just argument-comment-lint`
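The first gap comes from how Bazel wildcard patterns treat targets tagged `manual`: they are left out unless named explicitly. A minimal Python sketch of that selection logic and of the fix, assuming targets are given as a label-to-tags mapping; the function names and example labels are hypothetical, and the real helper is the shell script `list-bazel-targets.sh`:

```python
from typing import Dict, List, Set


def wildcard_targets(targets: Dict[str, Set[str]]) -> List[str]:
    """Targets a wildcard pattern selects: everything not tagged `manual`."""
    return sorted(label for label, tags in targets.items() if "manual" not in tags)


def lint_targets(targets: Dict[str, Set[str]]) -> List[str]:
    """Wildcard targets plus the manual `*-unit-tests-bin` targets."""
    manual_unit_tests = [
        label
        for label, tags in targets.items()
        if "manual" in tags and label.endswith("-unit-tests-bin")
    ]
    return sorted(set(wildcard_targets(targets)) | set(manual_unit_tests))


if __name__ == "__main__":
    # Hypothetical labels for illustration.
    example = {
        "//codex-rs/linux-sandbox:linux-sandbox": set(),
        "//codex-rs/linux-sandbox:linux-sandbox-unit-tests-bin": {"manual"},
        "//codex-rs/tools:scratch": {"manual"},
    }
    print(lint_targets(example))
```

Adding back only the `*-unit-tests-bin` targets keeps the expansion narrow, so other `manual` targets stay excluded as before.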
Michael Bolin ·
2026-03-30 11:59:50 -07:00 -
ci: use BuildBuddy for rust-ci-full non-Windows argument-comment-lint (#16136)
## Why PR #16130 fixed the Windows `argument-comment-lint` regression in `rust-ci-full`, but the next `main` runs still left the Linux and macOS lint legs timing out. In [run 23695263729](https://github.com/openai/codex/actions/runs/23695263729), both non-Windows `argument-comment-lint` jobs were cancelled almost exactly 30 minutes after they started. The remaining workflow difference versus `rust-ci.yml` was that `rust-ci-full` did not pass `BUILDBUDDY_API_KEY` into the non-Windows Bazel lint step, so `run-bazel-ci.sh` fell back to local Bazel configuration instead of using the faster remote-backed path available on `main`. ## What changed - passed `BUILDBUDDY_API_KEY` to the non-Windows `rust-ci-full` `argument-comment-lint` Bazel step - left the Windows packaged-wrapper path from #16130 unchanged - kept the change scoped to `rust-ci-full.yml` ## Test plan - loaded `.github/workflows/rust-ci-full.yml` and `.github/workflows/rust-ci.yml` with `python3` + `yaml.safe_load(...)` - inspected run `23695263729` and confirmed `Argument comment lint - Linux` and `Argument comment lint - macOS` were cancelled about 30 minutes after start - verified the updated `rust-ci-full` step now matches the non-Windows secret wiring already present in `rust-ci.yml` ## References - #16130 - #16106
Michael Bolin ·
2026-03-28 15:36:01 -07:00 -
ci: keep rust-ci-full Windows argument-comment-lint on packaged wrapper (#16130)
## Why

PR #16106 switched `rust-ci-full` over to the native Bazel-backed `argument-comment-lint` path on all three platforms. That works on Linux and macOS, but the Windows leg in `rust-ci-full` now fails before linting starts: Bazel dies while building `rules_rust`'s `process_wrapper` tool, so `main` reports an `argument-comment-lint` failure even though no Rust lint finding was produced.

Until native Windows Bazel linting is repaired, `rust-ci-full` should keep the same Windows split that `rust-ci.yml` already uses.

## What changed

- restored the Windows-only nightly `argument-comment-lint` toolchain setup in `rust-ci-full`
- limited the Bazel-backed lint step in `rust-ci-full` to non-Windows runners
- routed the Windows runner back through `tools/argument-comment-lint/run-prebuilt-linter.py`
- left the Linux and macOS `rust-ci-full` behavior unchanged

## Test plan

- loaded `.github/workflows/rust-ci-full.yml` and `.github/workflows/rust-ci.yml` with `python3` + `yaml.safe_load(...)`
- inspected failing Actions run `23692864849`, especially job `69023229311`, to confirm the Windows failure occurs in Bazel `process_wrapper` setup before lint output is emitted

## References

- #16106
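The platform split restored here amounts to a small dispatch rule. The sketch below states that rule in Python purely for illustration; the real split lives in workflow conditions, and the function name and the returned labels `"prebuilt-wrapper"` and `"bazel-aspect"` are hypothetical.

```python
def lint_entrypoint(runner_os: str) -> str:
    """Pick the lint path for a runner OS name such as "Linux", "macOS", or "Windows"."""
    if runner_os.strip().lower() == "windows":
        # Windows stays on the packaged wrapper until native Bazel linting is repaired.
        return "prebuilt-wrapper"
    # Linux and macOS keep the Bazel-backed lint step.
    return "bazel-aspect"


for os_name in ("Linux", "macOS", "Windows"):
    print(os_name, "->", lint_entrypoint(os_name))
```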
Michael Bolin ·
2026-03-28 14:50:19 -07:00 -
build: migrate argument-comment-lint to a native Bazel aspect (#16106)
## Why

`argument-comment-lint` had become a PR bottleneck because the repo-wide lane was still effectively running a `cargo dylint`-style flow across the workspace instead of reusing Bazel's Rust dependency graph. That kept the lint enforced, but it threw away the main benefit of moving this job under Bazel in the first place: metadata reuse and cacheable per-target analysis in the same shape as Clippy.

This change moves the repo-wide lint onto a native Bazel Rust aspect so Linux and macOS can lint `codex-rs` without rebuilding the world crate-by-crate through the wrapper path.

## What Changed

- add a nightly Rust toolchain with `rustc-dev` for Bazel and a dedicated crate-universe repo for `tools/argument-comment-lint`
- add `tools/argument-comment-lint/driver.rs` and `tools/argument-comment-lint/lint_aspect.bzl` so Bazel can run the lint as a custom `rustc_driver`
- switch repo-wide `just argument-comment-lint` and the Linux/macOS `rust-ci` lanes to `bazel build --config=argument-comment-lint //codex-rs/...`
- keep the Python/DotSlash wrappers as the package-scoped fallback path and as the current Windows CI path
- gate the Dylint entrypoint behind a `bazel_native` feature so the Bazel-native library avoids the `dylint_*` packaging stack
- update the aspect runtime environment so the driver can locate `rustc_driver` correctly under remote execution
- keep the dedicated `tools/argument-comment-lint` package tests and wrapper unit tests in CI so the source and packaged entrypoints remain covered

## Verification

- `python3 -m unittest discover -s tools/argument-comment-lint -p 'test_*.py'`
- `cargo test` in `tools/argument-comment-lint`
- `bazel build //tools/argument-comment-lint:argument-comment-lint-driver --@rules_rust//rust/toolchain/channel=nightly`
- `bazel build --config=argument-comment-lint //codex-rs/utils/path-utils:all`
- `bazel build --config=argument-comment-lint //codex-rs/rollout:rollout`

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16106).

* #16120
* __->__ #16106
Michael Bolin ·
2026-03-28 12:41:56 -07:00 -
ci: split fast PR Rust CI from full post-merge Cargo CI (#16072)
## Summary

Split the old all-in-one `rust-ci.yml` into:

- a PR-time Cargo workflow in `rust-ci.yml`
- a full post-merge Cargo workflow in `rust-ci-full.yml`

This keeps the PR path focused on fast Cargo-native hygiene plus the Bazel `build` / `test` / `clippy` coverage in `bazel.yml`, while moving the heavyweight Cargo-native matrix to `main`.

## Why

`bazel.yml` is now the main Rust verification workflow for pull requests. It already covers the Bazel build, test, and clippy signal we care about pre-merge, and it also runs on pushes to `main` to re-verify the merged tree and help keep the BuildBuddy caches warm.

What was still missing was a clean split for the Cargo-native checks that Bazel does not replace yet. The old `rust-ci.yml` mixed together:

- fast hygiene checks such as `cargo fmt --check` and `cargo shear`
- `argument-comment-lint`
- the full Cargo clippy / nextest / release-build matrix

That made every PR pay for the full Cargo matrix even though most of that coverage is better treated as post-merge verification. The goal of this change is to leave PRs with the checks we still want before merge, while moving the heavier Cargo-native matrix off the review path.

## What Changed

- Renamed the old heavyweight workflow to `rust-ci-full.yml` and limited it to `push` on `main` plus `workflow_dispatch`.
- Added a new PR-only `rust-ci.yml` that runs:
  - changed-path detection
  - `cargo fmt --check`
  - `cargo shear`
  - `argument-comment-lint` on Linux, macOS, and Windows
  - `tools/argument-comment-lint` package tests when the lint itself or its workflow wiring changes
- Kept the PR workflow's gatherer as the single required Cargo-native status so branch protection can stay simple.
- Added `.github/workflows/README.md` to document the intended split between `bazel.yml`, `rust-ci.yml`, and `rust-ci-full.yml`.
- Preserved the recent Windows `argument-comment-lint` behavior from `e02fd6e1d3` in `rust-ci-full.yml`, and mirrored cross-platform lint coverage into the PR workflow.

A few details are deliberate:

- The PR workflow still keeps the Linux lint lane on the default-targets-only invocation for now, while macOS and Windows use the broader released-linter path.
- This PR does not change `bazel.yml`; it changes the Cargo-native workflow around the existing Bazel PR path.

## Testing

- Rebasing this change onto `main` after `e02fd6e1d3`
- `ruby -e 'require "yaml"; %w[.github/workflows/rust-ci.yml .github/workflows/rust-ci-full.yml .github/workflows/bazel.yml].each { |f| YAML.load_file(f) }'`

Michael Bolin ·
2026-03-27 21:08:08 -07:00
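The changed-path gate mentioned in the entry above (running the `tools/argument-comment-lint` package tests only when the lint itself or its workflow wiring changes) could look roughly like this. It is a sketch under assumptions: the prefix and workflow lists below are guesses at which paths count as "the lint itself or its workflow wiring", and the function name is hypothetical.

```python
from typing import Iterable

# Assumed paths; the real workflow may watch a different set.
LINT_PATH_PREFIXES = ("tools/argument-comment-lint/",)
LINT_WORKFLOW_FILES = (
    ".github/workflows/rust-ci.yml",
    ".github/workflows/rust-ci-full.yml",
)


def should_run_lint_package_tests(changed_paths: Iterable[str]) -> bool:
    """True when any changed path touches the lint sources or its workflow wiring."""
    return any(
        path.startswith(LINT_PATH_PREFIXES) or path in LINT_WORKFLOW_FILES
        for path in changed_paths
    )


print(should_run_lint_package_tests(["tools/argument-comment-lint/src/lib.rs"]))
print(should_run_lint_package_tests(["codex-rs/core/src/lib.rs"]))
```

Gating on paths this way keeps the lint package tests off unrelated PRs while still covering edits to the lint or to the workflows that invoke it.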