mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
39 Commits
-
[codex] fix Windows ConPTY input handling (#29734)
## Why Windows unified-exec TTY input did not behave like the non-Windows PTY path. ConPTY sessions could receive the wrong line ending or mishandle backspace, especially when sending input to a foreground program through PowerShell or cmd. The local, legacy restricted, and elevated paths also handled this normalization separately. ## What changed - share one stateful Windows TTY input normalizer across local, legacy restricted, and elevated runner paths - translate LF and split CRLF into one Windows terminal Enter, encode backspace as DEL, and preserve UTF-8 and control bytes such as Ctrl-C - add Windows integration coverage for Unicode input, backspace, Enter, and PowerShell foreground-child Ctrl-C behavior ## Validation - `just test -p codex-utils-pty` (13 tests passed; the Unicode integration test retried once) - the Unicode integration test passed five consecutive runs with retries disabled - integration coverage sends `cafeé 漢字` through cmd and PowerShell and verifies that Ctrl-C interrupts a running PowerShell foreground child
iceweasel-oai ·
2026-06-24 11:27:44 -07:00 -
[codex] Handle Ctrl-C for non-TTY unified exec (#26734)
## Why A long-running unified exec process started with `tty: false` could not be interrupted via `write_stdin`: ordinary non-TTY stdin writes are rejected once stdin is closed, but an exact U+0003 payload should still map to a process interrupt. The interrupt should flow through the same process lifecycle path as a real signal so Codex preserves process-reported output and exit metadata instead of fabricating a Ctrl-C exit code or tearing down the session early. ## What Changed - Add `process/signal` to exec-server with `ProcessSignal::Interrupt` and an empty response. - Add a non-consuming `ProcessHandle::signal` path for spawned processes; on Unix it sends SIGINT to the process group and leaves terminate/hard-kill unchanged. - Route non-TTY U+0003 `write_stdin` through `process.signal(...)` instead of `terminate`, then let the normal post-write collection path drain output and observe exit. - Add exec-server coverage where a shell `trap INT` handler prints the signal and exits with its own code. - Add unified exec coverage where a `tty: false` process traps SIGINT, emits output, and exits with its own code. ## Validation - `just test -p codex-exec-server exec_process_signal_interrupts_process` - `just test -p codex-exec-server` - `just test -p codex-core write_stdin_ctrl_c_interrupts_non_tty_session`
pakrym-oai ·
2026-06-09 15:10:17 -07:00 -
fix(linux-sandbox): preserve shell cleanup on interruption (#22729)
## Why Interrupted `shell_command` calls can race with the outer tool-dispatch cancellation path. When that happens, the runtime future may be dropped before the spawned process gets a chance to run `SIGTERM` cleanup. For bwrapd-backed Linux sandbox commands, that can leave synthetic protected-path mount bookkeeping such as `.git/.codex` registrations under `/tmp` behind after a TUI interruption. The relevant cancellation points are the outer dispatch race in [`core/src/tools/parallel.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/tools/parallel.rs#L91-L132) and the process shutdown logic in [`core/src/exec.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/exec.rs#L1367-L1393). ## What changed - Keep `shell_command` dispatch alive long enough for the runtime to finish cancellation cleanup instead of immediately returning the synthetic aborted response. - Fold shell-turn cancellation into the existing `ExecExpiration` path in [`core/src/tools/runtimes/shell.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/tools/runtimes/shell.rs#L267-L274), so cancellation and timeout behavior stay centralized. - On cancellation, send `SIGTERM` first, wait briefly for cleanup to run, then hard-kill any remaining descendants in the original process group. - Treat `ESRCH` as an already-gone process-group cleanup case in `codex-utils-pty`, which keeps best-effort teardown from surfacing a stale-process race as an error. ## Verification - `cargo test -p codex-core cancellation` - Added regression coverage for: - `shell_tool_cancellation_waits_for_runtime_cleanup` - `process_exec_tool_call_cancellation_allows_sigterm_cleanup`
viyatb-oai ·
2026-05-27 12:59:11 -07:00 -
Prefer
just testovercargo testin docs (#23910)`cargo test` for the core and other crates fails on a fresh macOS checkout without the right stack size variable. This change encourages using the just test command that sets the environment up correctly. As a bonus, this should encourage agents to get more benefit out of nextest's parallel execution.
anp-oai ·
2026-05-22 16:58:14 +00:00 -
Disable empty Cargo test targets (#21584)
## Summary `cargo test` has entails both running standard Rust tests and doctests. It turns out that the doctest discovery is fairly slow, and it's a cost you pay even for crates that don't include any doctests. This PR disables doctests with `doctest = false` for crates that lack any doctests. For the collection of crates below, this speeds up test execution by >4x. E.g., before this PR: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 1.849 s ± 4.455 s [User: 0.752 s, System: 1.367 s] Range (min … max): 0.418 s … 14.529 s 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 428.6 ms ± 6.9 ms [User: 187.7 ms, System: 219.7 ms] Range (min … max): 418.0 ms … 436.8 ms 10 runs ``` For a single crate, with >2x speedup, before: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 491.1 ms ± 9.0 ms [User: 229.8 ms, System: 234.9 ms] Range (min … max): 480.9 ms … 512.0 ms 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 213.9 ms ± 4.3 ms [User: 112.8 ms, System: 84.0 ms] Range (min … max): 206.8 ms … 221.0 ms 13 runs ``` Co-authored-by: Codex <noreply@openai.com>
Charlie Marsh ·
2026-05-07 15:44:17 -07:00 -
Fix Windows PTY teardown by preserving ConPTY ownership (#20685)
## Why On Windows, background terminals could stay visible after their shell process had already exited. The elevated runner waits for the PTY output reader to reach EOF before it sends the final exit message, but the ConPTY helper was reducing ownership down to raw handles too early. That left the pseudoconsole's borrowed pipe handles alive past teardown, so EOF never propagated and the session stayed `running`. ## What changed - change `utils/pty/src/win/conpty.rs` to hand off owned ConPTY resources instead of leaking only raw handles - make `windows-sandbox-rs/src/conpty/mod.rs` keep the pseudoconsole owner and the backing pipe handles together until teardown - update the elevated runner and the legacy unified-exec backend to keep that `ConptyInstance` alive, take only the specific pipe handles they need, and drop the owner at teardown instead of trying to close a detached pseudoconsole handle later ## Testing - desktop app in `Auto-review`: 11 x `cmd /c "ping -n 3 google.com"` all exited cleanly and did not accumulate in the UI - desktop app in `Auto-review`: 5 x `cmd /c "ping -n 30 google.com"` appeared in the UI and drained back out on their own
iceweasel-oai ·
2026-05-04 18:40:00 +00:00 -
Improve Windows process management edge cases (#19211)
## Summary Some improvements to Windows process-management issues from https://github.com/openai/codex/pull/15578 - bound the elevated runner pipe-connect handshake instead of waiting forever on blocking pipe connects - terminate the spawned runner if that handshake fails, so timeout/error paths do not leave a stray `codex-command-runner.exe` - loop on partial `WriteFile` results when forwarding stdin in the elevated runner, so input is not silently truncated - fix the concrete HANDLE/SID cleanup paths in the runner setup code - keep draining driver-backed stdout/stderr after exit until the backend closes, instead of dropping the tail after a fixed 200ms grace period - reuse `LocalSid` for SID ownership and add more explanatory comments around the ownership/concurrency-sensitive code paths ## Why The original PR fixed a lot of Windows session plumbing, but there were still a few sharp process-lifecycle edges: - some elevated runner handshakes could block forever - the new timeout path could still orphan the spawned runner process - stdin forwarding still assumed a single `WriteFile` consumed the whole buffer - a few raw HANDLE/SID error paths still leaked - driver-backed output could still lose the last chunk of stdout/stderr on slower backends ## Validation - `cargo fmt -p codex-windows-sandbox -p codex-utils-pty` - `cargo test -p codex-utils-pty` - `cargo test -p codex-windows-sandbox finish_driver_spawn` - `cargo test -p codex-windows-sandbox runner_` Ran a local test matrix of unified-exec and shell_tool tests, all passing
iceweasel-oai ·
2026-04-29 10:00:01 -07:00 -
Add Windows sandbox unified exec runtime support (#15578)
## Summary This is the runtime/foundation half of the Windows sandbox unified-exec work. - add Windows sandbox `unified_exec` session support in `windows-sandbox-rs` for both: - the legacy restricted-token backend - the elevated runner backend - extend the PTY/process runtime so driver-backed sessions can support: - stdin streaming - stdout/stderr separation - exit propagation - PTY resize hooks - add Windows sandbox runtime coverage in `codex-windows-sandbox` / `codex-utils-pty` This PR does **not** enable Windows sandbox `UnifiedExec` for product callers yet because hooking this up to app-server comes in the next PR. Windows sandbox advertising is intentionally kept aligned with `main`, so sandboxed Windows callers still fall back to `ShellCommand`. This PR isolates the runtime/session layer so it can be reviewed independently from product-surface enablement. --------- Co-authored-by: jif-oai <jif@openai.com> Co-authored-by: Codex <noreply@openai.com>
iceweasel-oai ·
2026-04-21 10:44:49 -07:00 -
refactor: narrow async lock guard lifetimes (#18211)
Follow-up to https://github.com/openai/codex/pull/18178, where we called out enabling the await-holding lint as a follow-up. The long-term goal is to enable Clippy coverage for async guards held across awaits. This PR is intentionally only the first, low-risk cleanup pass: it narrows obvious lock guard lifetimes and leaves `codex-rs/Cargo.toml` unchanged so the lint is not enabled until the remaining cases are fixed or explicitly justified. It intentionally leaves the active-turn/turn-state locking pattern alone because those checks and mutations need to stay atomic. ## Common fixes used here These are the main patterns reviewers should expect in this PR, and they are also the patterns to reach for when fixing future `await_holding_*` findings: - **Scope the guard to the synchronous work.** If the code only needs data from a locked value, move the lock into a small block, clone or compute the needed values, and do the later `.await` after the block. - **Use direct one-line mutations when there is no later await.** Cases like `map.lock().await.remove(&id)` are acceptable when the guard is only needed for that single mutation and the statement ends before any async work. - **Drain or clone work out of the lock before notifying or awaiting.** For example, the JS REPL drains pending exec senders into a local vector and the websocket writer clones buffered envelopes before it serializes or sends them. - **Use a `Semaphore` only when serialization is intentional across async work.** The test serialization guards intentionally span awaited setup or execution, so using a semaphore communicates "one at a time" without holding a mutex guard. - **Remove the mutex when there is only one owner.** The PTY stdin writer task owns `stdin` directly; the old `Arc<Mutex<_>>` did not protect shared access because nothing else had access to the writer. - **Do not split locks that protect an atomic invariant.** This PR deliberately leaves active-turn/turn-state paths alone because those checks and mutations need to stay atomic. Those cases should be fixed separately with a design change or documented with `#[expect]`. ## What changed - Narrow scoped async mutex guards in app-server, JS REPL, network approval, remote-control websocket, and the RMCP test server. - Replace test-only async mutex serialization guards with semaphores where the guard intentionally lives across async work. - Let the PTY pipe writer task own stdin directly instead of wrapping it in an async mutex. ## Verification - `just fix -p codex-core -p codex-app-server -p codex-rmcp-client -p codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` - `just clippy -p codex-core` - `cargo test -p codex-core -p codex-app-server -p codex-rmcp-client -p codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` was run; the app-server suite passed, and `codex-core` failed in the local sandbox on six otel approval tests plus `suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`, which appear to depend on local command approval/default rules and `CODEX_SANDBOX_NETWORK_DISABLED=1` in this environment.
Michael Bolin ·
2026-04-17 14:06:50 -07:00 -
[codex] reduce module visibility (#16978)
## Summary - reduce public module visibility across Rust crates, preferring private or crate-private modules with explicit crate-root public exports - update external call sites and tests to use the intended public crate APIs instead of reaching through module trees - add the module visibility guideline to AGENTS.md ## Validation - `cargo check --workspace --all-targets --message-format=short` passed before the final fix/format pass - `just fix` completed successfully - `just fmt` completed successfully - `git diff --check` passed
pakrym-oai ·
2026-04-07 08:03:35 -07:00 -
ci: verify codex-rs Cargo manifests inherit workspace settings (#16353)
## Why Bazel clippy now catches lints that `cargo clippy` can still miss when a crate under `codex-rs` forgets to opt into workspace lints. The concrete example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel flagged a clippy violation in `models_cache.rs`, but Cargo did not because that crate inherited workspace package metadata without declaring `[lints] workspace = true`. We already mirror the workspace clippy deny list into Bazel after [#15955](https://github.com/openai/codex/pull/15955), so we also need a repo-side check that keeps every `codex-rs` manifest opted into the same workspace settings. ## What changed - add `.github/scripts/verify_cargo_workspace_manifests.py`, which parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies: - `version.workspace = true` - `edition.workspace = true` - `license.workspace = true` - `[lints] workspace = true` - top-level crate names follow the `codex-*` / `codex-utils-*` conventions, with explicit exceptions for `windows-sandbox-rs` and `utils/path-utils` - run that script in `.github/workflows/ci.yml` - update the current outlier manifests so the check is enforceable immediately - fix the newly exposed clippy violations in the affected crates (`app-server/tests/common`, `file-search`, `feedback`, `shell-escalation`, and `debug-client`) --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353). * #16351 * __->__ #16353
Michael Bolin ·
2026-03-31 21:59:28 +00:00 -
chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
## Why `argument-comment-lint` was green in CI even though the repo still had many uncommented literal arguments. The main gap was target coverage: the repo wrapper did not force Cargo to inspect test-only call sites, so examples like the `latest_session_lookup_params(true, ...)` tests in `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path. This change cleans up the existing backlog, makes the default repo lint path cover all Cargo targets, and starts rolling that stricter CI enforcement out on the platform where it is currently validated. ## What changed - mechanically fixed existing `argument-comment-lint` violations across the `codex-rs` workspace, including tests, examples, and benches - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to `--all-targets` unless the caller explicitly narrows the target set - fixed both wrappers so forwarded cargo arguments after `--` are preserved with a single separator - documented the new default behavior in `tools/argument-comment-lint/README.md` - updated `rust-ci` so the macOS lint lane keeps the plain wrapper invocation and therefore enforces `--all-targets`, while Linux and Windows temporarily pass `-- --lib --bins` That temporary CI split keeps the stricter all-targets check where it is already cleaned up, while leaving room to finish the remaining Linux- and Windows-specific target-gated cleanup before enabling `--all-targets` on those runners. The Linux and Windows failures on the intermediate revision were caused by the wrapper forwarding bug, not by additional lint findings in those lanes. ## Validation - `bash -n tools/argument-comment-lint/run.sh` - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh` - shell-level wrapper forwarding check for `-- --lib --bins` - shell-level wrapper forwarding check for `-- --tests` - `just argument-comment-lint` - `cargo test` in `tools/argument-comment-lint` - `cargo test -p codex-terminal-detection` ## Follow-up - Clean up remaining Linux-only target-gated callsites, then switch the Linux lint lane back to the plain wrapper invocation. - Clean up remaining Windows-only target-gated callsites, then switch the Windows lint lane back to the plain wrapper invocation.
Michael Bolin ·
2026-03-27 19:00:44 -07:00 -
chore: move pty and windows sandbox to Rust 2024 (#15954)
## Why `codex-utils-pty` and `codex-windows-sandbox` were the remaining crates in `codex-rs` that still overrode the workspace's Rust 2024 edition. Moving them forward in a separate PR keeps the baseline edition update isolated from the follow-on Bazel clippy workflow in #15955, while making linting and formatting behavior consistent with the rest of the workspace. This PR also needs Cargo and Bazel to agree on the edition for `codex-windows-sandbox`. Without the Bazel-side sync, the experimental Bazel app-server builds fail once they compile `windows-sandbox-rs`. ## What changed - switch `codex-rs/utils/pty` and `codex-rs/windows-sandbox-rs` to `edition = "2024"` - update `codex-utils-pty` callsites and tests to use the collapsed `if let` form that Clippy expects under the new edition - fix the Rust 2024 fallout in `windows-sandbox-rs`, including the reserved `gen` identifier, `unsafe extern` requirements, and new Clippy findings that surfaced under the edition bump - keep the edition bump separate from a larger unsafe cleanup by temporarily allowing `unsafe_op_in_unsafe_fn` in the Windows entrypoint modules that now report it under Rust 2024 - update `codex-rs/windows-sandbox-rs/BUILD.bazel` to `crate_edition = "2024"` so Bazel compiles the crate with the same edition as Cargo --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15954). * #15976 * #15955 * __->__ #15954
Michael Bolin ·
2026-03-27 02:31:08 -07:00 -
Use released DotSlash package for argument-comment lint (#15199)
## Why The argument-comment lint now has a packaged DotSlash artifact from [#15198](https://github.com/openai/codex/pull/15198), so the normal repo lint path should use that released payload instead of rebuilding the lint from source every time. That keeps `just clippy` and CI aligned with the shipped artifact while preserving a separate source-build path for people actively hacking on the lint crate. The current alpha package also exposed two integration wrinkles that the repo-side prebuilt wrapper needs to smooth over: - the bundled Dylint library filename includes the host triple, for example `@nightly-2025-09-18-aarch64-apple-darwin`, and Dylint derives `RUSTUP_TOOLCHAIN` from that filename - on Windows, Dylint's driver path also expects `RUSTUP_HOME` to be present in the environment Without those adjustments, the prebuilt CI jobs fail during `cargo metadata` or driver setup. This change makes the checked-in prebuilt wrapper normalize the packaged library name to the plain `nightly-2025-09-18` channel before invoking `cargo-dylint`, and it teaches both the wrapper and the packaged runner source to infer `RUSTUP_HOME` from `rustup show home` when the environment does not already provide it. After the prebuilt Windows lint job started running successfully, it also surfaced a handful of existing anonymous literal callsites in `windows-sandbox-rs`. This PR now annotates those callsites so the new cross-platform lint job is green on the current tree. ## What Changed - checked in the current `tools/argument-comment-lint/argument-comment-lint` DotSlash manifest - kept `tools/argument-comment-lint/run.sh` as the source-build wrapper for lint development - added `tools/argument-comment-lint/run-prebuilt-linter.sh` as the normal enforcement path, using the checked-in DotSlash package and bundled `cargo-dylint` - updated `just clippy` and `just argument-comment-lint` to use the prebuilt wrapper - split `.github/workflows/rust-ci.yml` so source-package checks live in a dedicated `argument_comment_lint_package` job, while the released lint runs in an `argument_comment_lint_prebuilt` matrix on Linux, macOS, and Windows - kept the pinned `nightly-2025-09-18` toolchain install in the prebuilt CI matrix, since the prebuilt package still relies on rustup-provided toolchain components - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` to normalize host-qualified nightly library filenames, keep the `rustup` shim directory ahead of direct toolchain `cargo` binaries, and export `RUSTUP_HOME` when needed for Windows Dylint driver setup - updated `tools/argument-comment-lint/src/bin/argument-comment-lint.rs` so future published DotSlash artifacts apply the same nightly-filename normalization and `RUSTUP_HOME` inference internally - fixed the remaining Windows lint violations in `codex-rs/windows-sandbox-rs` by adding the required `/*param*/` comments at the reported callsites - documented the checked-in DotSlash file, wrapper split, archive layout, nightly prerequisite, and Windows `RUSTUP_HOME` requirement in `tools/argument-comment-lint/README.md`
Michael Bolin ·
2026-03-20 03:19:22 +00:00 -
Apply argument comment lint across codex-rs (#14652)
## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/*param*/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/*param*/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/*param*/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. ## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- * -> #14652 * #14651
Michael Bolin ·
2026-03-16 16:48:15 -07:00 -
windows-sandbox: add runner IPC foundation for future unified_exec (#14139)
# Summary This PR introduces the Windows sandbox runner IPC foundation that later unified_exec work will build on. The key point is that this is intentionally infrastructure-only. The new IPC transport, runner plumbing, and ConPTY helpers are added here, but the active elevated Windows sandbox path still uses the existing request-file bootstrap. In other words, this change prepares the transport and module layout we need for unified_exec without switching production behavior over yet. Part of this PR is also a source-layout cleanup: some Windows sandbox files are moved into more explicit `elevated/`, `conpty/`, and shared locations so it is clearer which code is for the elevated sandbox flow, which code is legacy/direct-spawn behavior, and which helpers are shared between them. That reorganization is intentional in this first PR so later behavioral changes do not also have to carry a large amount of file-move churn. # Why This Is Needed For unified_exec Windows elevated sandboxed unified_exec needs a long-lived, bidirectional control channel between the CLI and a helper process running under the sandbox user. That channel has to support: - starting a process and reporting structured spawn success/failure - streaming stdout/stderr back incrementally - forwarding stdin over time - terminating or polling a long-lived process - supporting both pipe-backed and PTY-backed sessions The existing elevated one-shot path is built around a request-file bootstrap and does not provide those primitives cleanly. Before we can turn on Windows sandbox unified_exec, we need the underlying runner protocol and transport layer that can carry those lifecycle events and streams. # Why Windows Needs More Machinery Than Linux Or macOS Linux and macOS can generally build unified_exec on top of the existing sandbox/process model: the parent can spawn the child directly, retain normal ownership of stdio or PTY handles, and manage the lifetime of the sandboxed process without introducing a second control process. Windows elevated sandboxing is different. To run inside the sandbox boundary, we cross into a different user/security context and then need to manage a long-lived process from outside that boundary. That means we need an explicit helper process plus an IPC transport to carry spawn, stdin, output, and exit events back and forth. The extra code here is mostly that missing Windows sandbox infrastructure, not a conceptual difference in unified_exec itself. # What This PR Adds - the framed IPC message types and transport helpers for parent <-> runner communication - the renamed Windows command runner with both the existing request-file bootstrap and the dormant IPC bootstrap - named-pipe helpers for the elevated runner path - ConPTY helpers and process-thread attribute plumbing needed for PTY-backed sessions - shared sandbox/process helpers that later PRs will reuse when switching live execution paths over - early file/module moves so later PRs can focus on behavior rather than layout churn # What This PR Does Not Yet Do - it does not switch the active elevated one-shot path over to IPC yet - it does not enable Windows sandbox unified_exec yet - it does not remove the existing request-file bootstrap yet So while this code compiles and the new path has basic validation, it is not yet the exercised production path. That is intentional for this first PR: the goal here is to land the transport and runner foundation cleanly before later PRs start routing real command execution through it. # Follow-Ups Planned follow-up PRs will: 1. switch elevated one-shot Windows sandbox execution to the new runner IPC path 2. layer Windows sandbox unified_exec sessions on top of the same transport 3. remove the legacy request-file path once the IPC-based path is live # Validation - `cargo build -p codex-windows-sandbox`
iceweasel-oai ·
2026-03-16 19:45:06 +00:00 -
fix: preserve zsh-fork escalation fds across unified-exec spawn paths (#13644)
## Why `zsh-fork` sessions launched through unified-exec need the escalation socket to survive the wrapper -> server -> child handoff so later intercepted `exec()` calls can still reach the escalation server. The inherited-fd spawn path also needs to avoid closing Rust's internal exec-error pipe, and the shell-escalation handoff needs to tolerate the receive-side case where a transferred fd is installed into the same stdio slot it will be mapped onto. ## What Changed - Added `SpawnLifecycle::inherited_fds()` in `codex-rs/core/src/unified_exec/process.rs` and threaded inherited fds through `codex-rs/core/src/unified_exec/process_manager.rs` so unified-exec can preserve required descriptors across both PTY and no-stdin pipe spawn paths. - Updated `codex-rs/core/src/tools/runtimes/shell/zsh_fork_backend.rs` to expose the escalation socket fd through the spawn lifecycle. - Added inherited-fd-aware spawn helpers in `codex-rs/utils/pty/src/pty.rs` and `codex-rs/utils/pty/src/pipe.rs`, including Unix pre-exec fd pruning that preserves requested inherited fds while leaving `FD_CLOEXEC` descriptors alone. The pruning helper is now named `close_inherited_fds_except()` to better describe that behavior. - Updated `codex-rs/shell-escalation/src/unix/escalate_client.rs` to duplicate local stdio before transfer and send destination stdio numbers in `SuperExecMessage`, so the wrapper keeps using its own `stdin`/`stdout`/`stderr` until the escalated child takes over. - Updated `codex-rs/shell-escalation/src/unix/escalate_server.rs` so the server accepts the overlap case where a received fd reuses the same stdio descriptor number that the child setup will target with `dup2`. - Added comments around the PTY stdio wiring and the overlap regression helper to make the fd handoff and controlling-terminal setup easier to follow. ## Verification - `cargo test -p codex-utils-pty` - covers preserved-fd PTY spawn behavior, PTY resize, Python REPL continuity, exec-failure reporting, and the no-stdin pipe path - `cargo test -p codex-shell-escalation` - covers duplicated-fd transfer on the client side and verifies the overlap case by passing a pipe-backed stdin payload through the server-side `dup2` path --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13644). * #14624 * __->__ #13644
Michael Bolin ·
2026-03-13 20:25:31 +00:00 -
Stabilize pipe process stdin round-trip test (#14013)
## What changed - keep the explicit stdin-close behavior after writing so the child still receives EOF deterministically - on Windows, stop using `python -c` for the round-trip assertion and instead run a native `cmd.exe` pipeline that reads one line from stdin with `set /p` and echoes it back - send ` ` on Windows so the stdin payload matches the platform-native line ending the shell reader expects ## Why this fixes flakiness The failing branch-local flake was not in `spawn_pipe_process` itself. The child exited cleanly, but the Windows ARM runner sometimes produced an empty stdout string when the test used Python as the stdin consumer. That makes the test sensitive to Python startup and stdin-close timing rather than the pipe primitive we actually want to validate. Switching the Windows path to a native `cmd.exe` reader keeps the assertion focused on our pipe behavior: bytes written to stdin should come back on stdout before EOF closes the process. The explicit ` ` write removes line-ending ambiguity on Windows. ## Scope - test-only - no production logic change
Ahmed Ibrahim ·
2026-03-11 12:33:09 -07:00 -
Stabilize split PTY output on Windows (#14003)
## Summary - run the split stdout/stderr PTY test through the normal shell helper on every platform - use a Windows-native command string instead of depending on Python to emit split streams - assert CRLF line endings on Windows explicitly ## Why this fixes the flake The earlier PTY split-output test used a Python one-liner on Windows while the rest of the file exercised shell-command behavior. That made the test depend on runner-local Python availability and masked the real Windows shell output shape. Using a native cmd-compatible command and asserting the actual CRLF output makes the split stdout/stderr coverage deterministic on Windows runners.
Ahmed Ibrahim ·
2026-03-11 12:33:07 -07:00 -
start of hooks engine (#13276)
(Experimental) This PR adds a first MVP for hooks, with SessionStart and Stop The core design is: - hooks live in a dedicated engine under codex-rs/hooks - each hook type has its own event-specific file - hook execution is synchronous and blocks normal turn progression while running - matching hooks run in parallel, then their results are aggregated into a normalized HookRunSummary On the AppServer side, hooks are exposed as operational metadata rather than transcript-native items: - new live notifications: hook/started, hook/completed - persisted/replayed hook results live on Turn.hookRuns - we intentionally did not add hook-specific ThreadItem variants Hooks messages are not persisted, they remain ephemeral. The context changes they add are (they get appended to the user's prompt)
Andrei Eternal ·
2026-03-10 04:11:31 +00:00 -
Stabilize PTY Python REPL test (#13883)
## What changed - The PTY Python REPL test now starts Python with a startup marker already embedded in argv. - The test waits for that marker in PTY output before making assertions. ## Why this fixes the flake - The old version tried to probe the live REPL almost immediately after spawn. - That races PTY initialization, Python startup, and prompt buffering, all of which vary across platforms and CI load. - By having the child process emit a known marker as part of its own startup path, the test gets a deterministic synchronization point that comes from the process under test rather than from guessed timing. ## Scope - Test-only change.
Ahmed Ibrahim ·
2026-03-09 10:08:36 -07:00 -
Fix inverted Windows PTY
TerminateProcesshandling (#13989)Addresses #13945 The vendored WezTerm ConPTY backend in `codex-rs/utils/pty/src/win/mod.rs` treated `TerminateProcess` return values backwards: nonzero success was handled as failure, and `0` failure was handled as success. This is likely causing a number of bugs reported against Codex running on Windows native where processes are not cleaned up.
Eric Traut ·
2026-03-08 11:52:16 -06:00 -
utils/pty: add streaming spawn and terminal sizing primitives (#13695)
Enhance pty utils: * Support closing stdin * Separate stderr and stdout streams to allow consumers differentiate them * Provide compatibility helper to merge both streams back into combined one * Support specifying terminal size for pty, including on-demand resizes while process is already running * Support terminating the process while still consuming its outputs
Ruslan Nigmatullin ·
2026-03-06 15:13:12 -08:00 -
feat: use process group to kill the PTY (#12688)
Use the process group kill logic to kill the PTY
jif-oai ·
2026-02-24 16:55:23 +00:00 -
Fix flaky windows CI test (#10993)
Hardens PTY Python REPL test and make MCP test startup deterministic **Summary** - `utils/pty/src/tests.rs` - Added a REPL readiness handshake (`wait_for_python_repl_ready`) that repeatedly sends a marker and waits for it in PTY output before sending test commands. - Updated `pty_python_repl_emits_output_and_exits` to: - wait for readiness first, - preserve startup output, - append output collected through process exit. - Reduces Windows/ConPTY flakiness from early stdin writes racing REPL startup. - `mcp-server/tests/suite/codex_tool.rs` - Avoid remote model refresh during MCP test startup, reducing timeout-prone nondeterminism.Eric Traut ·
2026-02-07 08:55:42 -08:00 -
Process-group cleanup for stdio MCP servers to prevent orphan process storms (#10710)
This PR changes stdio MCP child processes to run in their own process group * Add guarded teardown in codex-rmcp-client: send SIGTERM to the group first, then SIGKILL after a short grace period. * Add terminate_process_group helper in process_group.rs. * Add Unix regression test in process_group_cleanup.rs to verify wrapper + grandchild are reaped on client drop. Addresses reported MCP process/thread storm: #10581
Eric Traut ·
2026-02-06 21:26:36 -08:00 -
feat: detach non-tty childs (#9477)
Thanks to the investigations made by * @frantic-openai https://github.com/openai/codex/pull/9403 * @kfiramar https://github.com/openai/codex/pull/9388
jif-oai ·
2026-01-19 11:35:34 +00:00 -
jif-oai ·
2026-01-16 17:24:41 +01:00 -
chore: close pipe on non-pty processes (#9369)
Closing the STDIN of piped process when starting them to avoid commands like `rg` to wait for content on STDIN and hangs for ever
jif-oai ·
2026-01-16 15:54:32 +01:00 -
jif-oai ·
2026-01-14 18:44:04 +00:00 -
feat: add support for building with Bazel (#8875)
This PR configures Codex CLI so it can be built with [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc` includes configuration so that remote builds can be done using [BuildBuddy](https://www.buildbuddy.io). If you are familiar with Bazel, things should work as you expect, e.g., run `bazel test //... --keep-going` to run all the tests in the repo, but we have also added some new aliases in the `justfile` for convenience: - `just bazel-test` to run tests locally - `just bazel-remote-test` to run tests remotely (currently, the remote build is for x86_64 Linux regardless of your host platform). Note we are currently seeing the following test failures in the remote build, so we still need to figure out what is happening here: ``` failures: suite::compact::manual_compact_twice_preserves_latest_user_messages suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view ``` - `just build-for-release` to build release binaries for all platforms/architectures remotely To setup remote execution: - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI employees should also request org access at https://openai.buildbuddy.io/join/ with their `@openai.com` email address.) - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to `~/.bazelrc` (add the line `build --remote_header=x-buildbuddy-api-key=YOUR_KEY`) - Use `--config=remote` in your `bazel` invocations (or add `common --config=remote` to your `~/.bazelrc`, or use the `just` commands) ## CI In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners (we are working on supporting Windows, but that is not ready yet). Note that the failures we are seeing in `just bazel-remote-test` do not occur on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml` is green right now. The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so that macOS CI jobs build _remotely_ on Linux hosts (using the `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the root `BUILD.bazel`) using cross-compilation to build the macOS artifacts. Then these artifacts are downloaded locally to GitHub's macOS runner so the tests can be executed natively. This is the relevant config that enables this: ``` common:macos --config=remote common:macos --strategy=remote common:macos --strategy=TestRunner=darwin-sandbox,local ``` Because of the remote caching benefits we get from BuildBuddy, these new CI jobs can be extremely fast! For example, consider these two jobs that ran all the tests on Linux x86_64: - Bazel 1m37s https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875 - Cargo 9m20s https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875 For now, we will continue to run both the Bazel and Cargo jobs for PRs, but once we add support for Windows and running Clippy, we should be able to cutover to using Bazel exclusively for PRs, which should still speed things up considerably. We will probably continue to run the Cargo jobs post-merge for commits that land on `main` as a sanity check. Release builds will also continue to be done by Cargo for now. Earlier attempt at this PR: https://github.com/openai/codex/pull/8832 Earlier attempt to add support for Buck2, now abandoned: https://github.com/openai/codex/pull/8504 --------- Co-authored-by: David Zbarsky <dzbarsky@gmail.com> Co-authored-by: Michael Bolin <mbolin@openai.com>
zbarsky-openai ·
2026-01-09 11:09:43 -08:00 -
jif-oai ·
2025-12-17 10:29:45 +00:00 -
fix: race on rx subscription (#7921)
Fix race where the PTY was sending first chunk before the subscription to the broadcast
jif-oai ·
2025-12-12 12:40:54 +01:00 -
jif-oai ·
2025-12-10 10:30:38 +00:00 -
Vendor ConPtySystem (#7656)
The repo we were depending on is very large and we need very small part of it. --------- Co-authored-by: Pavel <pavel@krymets.com>
pakrym-oai ·
2025-12-09 17:23:51 +00:00 -
Fix unified_exec on windows (#7620)
Fix unified_exec on windows Requires removal of PSUEDOCONSOLE_INHERIT_CURSOR flag so child processed don't attempt to wait for cursor position response (and timeout). https://github.com/wezterm/wezterm/compare/main...pakrym:wezterm:PSUEDOCONSOLE_INHERIT_CURSOR?expand=1 --------- Co-authored-by: pakrym-oai <pakrym@openai.com>
Pavel Krymets ·
2025-12-05 20:09:43 +00:00 -
chore: add cargo-deny configuration (#7119)
- add GitHub workflow running cargo-deny on push/PR - document cargo-deny allowlist with workspace-dep notes and advisory ignores - align workspace crates to inherit version/edition/license for consistent checks
Josh McKinney ·
2025-11-24 12:22:18 -08:00 -
Use codex-linux-sandbox in unified exec (#6480)
Unified exec isn't working on Linux because we don't provide the correct arg0. The library we use for pty management doesn't allow setting arg0 separately from executable. Use the same aliasing strategy we use for `apply_patch` for `codex-linux-sandbox`. Use `#[ctor]` hack to dispatch codex-linux-sandbox calls. Addresses https://github.com/openai/codex/issues/6450
pakrym-oai ·
2025-11-10 17:17:09 -08:00 -
chore: rework tools execution workflow (#5278)
Re-work the tool execution flow. Read `orchestrator.rs` to understand the structure
jif-oai ·
2025-10-20 20:57:37 +01:00