codex

[codex] Add size to internal filesystem metadata (#27927 )

## Why

`ExecutorFileSystem::get_metadata` reports file kind and timestamps but
not size. Internal callers that need to enforce a size limit therefore
have to read the complete file first, which is especially wasteful for
remote filesystems.

This adds the missing internal metadata so consumers can reject
oversized files before transferring or buffering them. The field is
named `size`, matching VS Code's `FileStat.size` filesystem convention.

## What changed

- add `size: u64` to internal `FileMetadata`
- populate it from the underlying filesystem metadata
- carry it through sandbox-helper and remote exec-server responses
- cover files, directories, symlink targets, and sandboxed reads across
local and remote filesystem implementations

The new field is intentionally not exposed through the app-server API.

## Testing

- `just test -p codex-exec-server get_metadata`
- `just test -p codex-exec-server
file_system_sandboxed_metadata_and_read_allow_readable_root`
- `just test -p codex-core-plugins`
- `just test -p codex-skills-extension`

pakrym-oai · 2026-06-12 12:12:08 -07:00

76d8f20241

[codex] migrate ExecutorFileSystem paths to PathUri (#27424 )

## Why

We're moving exec-server to use PathUri for its internal path
representations.

## What

Move `ExecutorFileSystem` APIs to use `PathUri` instead of
`AbsolutePathBuf`. Future changes will convert higher-level parts of
exec-server.

Adam Perry @ OpenAI · 2026-06-11 18:44:18 +00:00

b2a4e3be27

[codex] add cross-platform filesystem adapter coverage (#27454 )

## Why

The exec-server's existing filesystem tests only run on `#[cfg(unix)]`.
We should be running the applicable ones on Windows, and also include
the basic filesystem operations that will be modified by migrating to
`PathUri`.

## What

Split platform-neutral local/remote tests into a shared Unix/Windows
suite while keeping the existing `AbsolutePathBuf` API, and add Windows
junction canonicalization coverage.

Adam Perry @ OpenAI · 2026-06-11 17:53:18 +00:00

cc97839068

[codex] Handle Ctrl-C for non-TTY unified exec (#26734 )

## Why

A long-running unified exec process started with `tty: false` could not
be interrupted via `write_stdin`: ordinary non-TTY stdin writes are
rejected once stdin is closed, but an exact U+0003 payload should still
map to a process interrupt. The interrupt should flow through the same
process lifecycle path as a real signal so Codex preserves
process-reported output and exit metadata instead of fabricating a
Ctrl-C exit code or tearing down the session early.

## What Changed

- Add `process/signal` to exec-server with `ProcessSignal::Interrupt`
and an empty response.
- Add a non-consuming `ProcessHandle::signal` path for spawned
processes; on Unix it sends SIGINT to the process group and leaves
terminate/hard-kill unchanged.
- Route non-TTY U+0003 `write_stdin` through `process.signal(...)`
instead of `terminate`, then let the normal post-write collection path
drain output and observe exit.
- Add exec-server coverage where a shell `trap INT` handler prints the
signal and exits with its own code.
- Add unified exec coverage where a `tty: false` process traps SIGINT,
emits output, and exits with its own code.

## Validation

- `just test -p codex-exec-server
exec_process_signal_interrupts_process`
- `just test -p codex-exec-server`
- `just test -p codex-core
write_stdin_ctrl_c_interrupts_non_tty_session`

pakrym-oai · 2026-06-09 15:10:17 -07:00

f2969f36e8

[codex] Add environment shell info (#26480 )

## Why

Shell detection needs to be available through the `Environment`
abstraction so callers can ask the selected local or remote environment
for shell metadata without adding a separate HTTP endpoint or parallel
info-source path. This keeps shell metadata shaped like the existing
environment-owned filesystem capability and lets remote environments
answer through exec-server JSON-RPC.

## What changed

- Added `environment/info` to the exec-server protocol/client/server and
exposed `Environment::info()`.
- Added local and remote environment info providers on `Environment`,
following the existing capability-provider pattern used for filesystem
access.
- Moved the shared shell detection logic into `codex-shell-command` and
kept core shell APIs as wrappers around that implementation.
- Returned shell metadata as `EnvironmentInfo { shell: ShellInfo }`
using the existing shell detection path.
- Added a remote environment test that calls `Environment::info()`
through an exec-server-backed environment.

## Validation

- `git diff --check`
- `just test -p codex-shell-command`
- `just test -p codex-core -E 'test(/shell::tests::/)'`\n- `just test -p
codex-exec-server environment`

pakrym-oai · 2026-06-04 22:36:25 -07:00

6a6a5f925e

exec-server: canonicalize bound filesystem paths (#25149 )

## Summary
- add executor filesystem canonicalization as a bound-path operation
- route remote canonicalization through the exec-server filesystem RPC
surface
- keep path normalization attached to the filesystem that owns the path

## Stack
- 2/5 in the skills path authority stack extracted from
https://github.com/openai/codex/pull/25098
- follows merged https://github.com/openai/codex/pull/25121

## Validation
- `cd
/Users/starr/code/codex-worktrees/pr-25098-restack-review-pr1b/codex-rs
&& just fmt`
- Not run: tests/checks (not requested)
- GitHub CI pending on rewritten head

starr-openai · 2026-06-01 11:53:31 -07:00

53ac02356e

fix(exec-server): reject websocket requests with Origin headers (#24947 )

## Why

`codex exec-server` has a local WebSocket listener, but it did not apply
the same browser-origin request handling as the `app-server` WebSocket
transport. Requests that carry an `Origin` header should not be upgraded
by this local transport, keeping both local WebSocket servers consistent
and avoiding unexpected browser-initiated connections.

## What changed

- Added an Axum middleware guard in
`codex-rs/exec-server/src/server/transport.rs` that returns `403
Forbidden` for requests carrying an `Origin` header.
- Added an integration test in `codex-rs/exec-server/tests/websocket.rs`
that covers rejection of an `Origin`-bearing WebSocket handshake.
- Kept ordinary WebSocket clients unchanged: existing no-`Origin`
initialization and process behavior remains covered by the crate tests.

## Validation

- `just test -p codex-exec-server` test phase (`186 passed`; run outside
the parent macOS sandbox so nested sandbox tests can execute)
- `just clippy -p codex-exec-server`

viyatb-oai · 2026-05-28 14:44:14 -07:00

a027135bc6

Migrate exec-server remote registration to environments (#23633 )

## Summary
- migrate exec-server remote registration naming from executor to
environment
- align CLI, public Rust exports, registry error messages, and relay
test fixtures with the environment registry contract
- keep the live registration path and response model consistent with
`/cloud/environment/{environment_id}/register`

## Verification
- `cargo test -p codex-exec-server
remote::tests::register_environment_posts_with_auth_provider_headers
--manifest-path /Users/richardlee/code/codex/codex-rs/Cargo.toml`
- `cargo test -p codex-exec-server --test relay
multiplexed_remote_environment_routes_independent_virtual_streams
--manifest-path /Users/richardlee/code/codex/codex-rs/Cargo.toml`
- `cargo check -p codex-cli --manifest-path
/Users/richardlee/code/codex/codex-rs/Cargo.toml` (still running when PR
opened; will update after completion if needed)

richardopenai · 2026-05-20 00:25:04 -07:00

000bf5ce6d

exec-server: support auth-backed remote executor registration (#22769 )

This updates remote `exec-server` registration to use normal Codex auth
instead of a registry-issued credential. The registry request is built
from the existing auth-provider path, which preserves the biscuit-only
registry contract introduced in
[openai/openai#924101](https://github.com/openai/openai/pull/924101)
while removing the old remote registry bearer env var and its direct
transport assumptions.

The default remote flow uses persisted ChatGPT auth from the normal
Codex config/storage path. This PR also includes the containerized Agent
Identity path needed by
[openai/openai#924260](https://github.com/openai/openai/pull/924260):
remote `exec-server` accepts `--allow-agent-identity-auth`, permits
Agent Identity auth loaded from `CODEX_ACCESS_TOKEN` only when that flag
is present, and reuses the existing Agent task registration plus derived
`AgentAssertion` header generation. API-key auth remains unsupported,
and Agent Identity stays opt-in.

Validation performed beyond normal presubmit coverage:
- `cargo fmt --all --check`
- `cargo check -p codex-cli`
- `cargo test -p codex-exec-server`
- `cargo test -p codex-cli exec_server_agent_identity_auth_flag_`
- `cargo test -p codex-cli remote_exec_server_auth_mode_`

I also attempted `cargo test -p codex-cli`. The new CLI tests passed
inside that run, but the suite ended on an unrelated local
marketplace-state failure in
`plugin_list_excludes_unconfigured_repo_local_marketplaces`.

Michael Zeng · 2026-05-16 12:48:28 -07:00

b200dd1b6f

feat(exec-server): use protobuf relay frames (#22343 )

## Why

Remote exec-server now needs one executor websocket to serve multiple
harness JSON-RPC sessions. Rendezvous routes by `stream_id`, and the
exec-server side needs to use the same stable relay frame contract
instead of a hand-rolled JSON shape.

The relay protocol also needs to make ownership boundaries clear:
harness and executor endpoints own sequencing, acks, retries, duplicate
suppression, segmentation, and reassembly; rendezvous only routes
frames.

## What Changed

- Add the checked-in `codex.exec_server.relay.v1.RelayMessageFrame`
proto plus generated prost bindings for `codex-exec-server`.
- Encode remote harness/executor relay traffic as binary protobuf
websocket frames while keeping local websocket JSON-RPC unchanged.
- Demux executor-side relay streams into independent
`ConnectionProcessor` sessions keyed by `stream_id`.
- Add a programmatic `RemoteExecutorConfig::with_bearer_token(...)`
constructor for non-CLI callers and integration tests.
- Add an integration test that starts the remote executor against a fake
registry/rendezvous websocket and verifies two virtual streams share one
executor websocket without cross-talk, including per-stream reset
behavior.
- Document the remote relay envelope, sequence ranges, `ack`/`ack_bits`,
and endpoint responsibilities in `exec-server/README.md`.

## Verification

- `cargo test -p codex-exec-server --test relay
multiplexed_remote_executor_routes_independent_virtual_streams --
--exact`
- `cargo test -p codex-exec-server --test relay`
- `cargo test -p codex-exec-server` passed outside the sandbox. The
sandboxed run hit macOS `sandbox-exec: sandbox_apply: Operation not
permitted` in filesystem sandbox tests.

Anton Panasenko · 2026-05-12 16:50:45 -07:00

ac466c0dbd

[exec-server] serve websocket listener via HTTP upgrade (#21963 )

## Why

`codex exec-server` should keep the existing public `ws://IP:PORT` URL
shape while serving that websocket connection through an HTTP upgrade
path internally. That keeps the client-facing configuration simple and
allows the listener to work through intermediate HTTP-aware
infrastructure.

## What changed

- keep the emitted and configured exec-server URL as `ws://IP:PORT`
- serve that websocket endpoint through Axum HTTP upgrade handling on
`/`
- expose `GET /readyz` from the same listener for readiness checks
- route upgraded Axum websocket streams through the shared JSON-RPC
connection machinery
- initialize the rustls crypto provider before websocket client
connections
- preserve inbound binary websocket JSON-RPC parsing for compatibility
with the prior transport behavior

## Verification

- `cargo test -p codex-exec-server --test health --test process --test
websocket --test initialize --test exec_process`

Ruslan Nigmatullin · 2026-05-11 17:04:21 -07:00

95d8669ab2

tests: cover sandbox link write behavior (#21819 )

## Why

[PR #1705](https://github.com/openai/codex/pull/1705) moved
`apply_patch` execution under the configured sandbox and called out the
need for integration coverage. We already covered textual `../` escapes,
but did not have coverage for link aliases that live inside a writable
workspace while pointing at, or aliasing, files visible outside it.

This PR locks in the current sandbox boundary without changing
production write semantics. Symlink escapes into a read-only outside
root should fail and leave the outside file unchanged. Existing hard
links are characterized separately: if a user-created hard link already
exists inside the writable root, sandboxed writes preserve normal
hard-link semantics rather than replacing the link and silently breaking
that relationship.

## What Changed

- Added
`apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
to verify `apply_patch` cannot update a symlink that targets a file
outside the writable workspace.
- Added `apply_patch_cli_preserves_existing_hard_link_outside_workspace`
to verify `apply_patch` intentionally writes through an existing hard
link and does not unlink or replace it.
- Added `file_system_sandboxed_write_preserves_existing_hard_link` to
verify sandboxed `fs/writeFile` preserves an existing hard link and
writes the shared inode.

## Testing

- `cargo test -p codex-exec-server file_system_sandboxed_write`
- `cargo test -p codex-core
apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
- `cargo test -p codex-core
apply_patch_cli_preserves_existing_hard_link_outside_workspace`
- `just fix -p codex-exec-server -p codex-core`
- `just fix -p codex-core`



---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21819).
* #21845
* __->__ #21819

Michael Bolin · 2026-05-09 08:28:15 -07:00

0c70698e24

fix(linux-sandbox): fall back when system bwrap lacks perms (#20628 )

## Why

Codex `0.128` started using `--perms` in more routine Linux sandbox
construction when protected workspace metadata mounts landed in #19852.
Upstream bubblewrap added `--perms` in `v0.5.0`, so system `bwrap`
versions older than that, including the `v0.4.0` and `v0.4.1` family, do
not support the flag. The launcher still selected those binaries as long
as they existed on `PATH`.

That means affected hosts can fail every sandboxed command up front
with:

```text
bwrap: Unknown option --perms
```

The reports in #20590 and duplicate #20623 match that compatibility gap;
#20623 explicitly shows system bubblewrap `0.4.0`.

## What changed

- Replace the single `--argv0` probe with a small system-bwrap
capability probe in `codex-rs/linux-sandbox/src/launcher.rs`.
- Continue using the old-system `--argv0` compatibility path when
needed, but only select a system `bwrap` if it also advertises
`--perms`.
- Fall back to the vendored `bwrap` when the system binary is too old
for the flags Codex now requires.
- Add regression coverage for the old-system-bwrap case so binaries
without `--perms` stay on the vendored path.

## Verification

- Added `falls_back_to_vendored_when_system_bwrap_lacks_perms` to cover
the reported compatibility gap.
- Ran `cargo test -p codex-linux-sandbox` and `cargo clippy -p
codex-linux-sandbox --tests` locally. On macOS, the crate builds but its
Linux-only tests are cfg-gated out, so the new regression test still
needs Linux CI or a Linux devbox run for real execution coverage.

## Related issues

- Fixes #20590
- Duplicate report: #20623

viyatb-oai · 2026-05-04 10:38:31 -07:00

5b80f87c97

permissions: make profiles represent enforcement (#19231 )

## Why

`PermissionProfile` is becoming the canonical permissions abstraction,
but the old shape only carried optional filesystem and network fields.
It could describe allowed access, but not who is responsible for
enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy
when profiles were exported, cached, or round-tripped through app-server
APIs.

The important model change is that active permissions are now a disjoint
union over the enforcement mode. Conceptually:

```rust
pub enum PermissionProfile {
    Managed {
        file_system: FileSystemSandboxPolicy,
        network: NetworkSandboxPolicy,
    },
    Disabled,
    External {
        network: NetworkSandboxPolicy,
    },
}
```

This distinction matters because `Disabled` means Codex should apply no
outer sandbox at all, while `External` means filesystem isolation is
owned by an outside caller. Those are not equivalent to a broad managed
sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an
inner sandbox may require the outer Codex layer to use no sandbox rather
than a permissive one.

## How Existing Modeling Maps

Legacy `SandboxPolicy` remains a boundary projection, but it now maps
into the higher-fidelity profile model:

- `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed`
with restricted filesystem entries plus the corresponding network
policy.
- `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving
the “no outer sandbox” intent instead of treating it as a lax managed
sandbox.
- `ExternalSandbox { network_access }` maps to
`PermissionProfile::External { network }`, preserving external
filesystem enforcement while still carrying the active network policy.
- Split runtime policies that legacy `SandboxPolicy` cannot faithfully
express, such as managed unrestricted filesystem plus restricted
network, stay `Managed` instead of being collapsed into
`ExternalSandbox`.
- Per-command/session/turn grants remain partial overlays via
`AdditionalPermissionProfile`; full `PermissionProfile` is reserved for
complete active runtime permissions.

## What Changed

- Change active `PermissionProfile` into a tagged union: `managed`,
`disabled`, and `external`.
- Keep partial permission grants separate with
`AdditionalPermissionProfile` for command/session/turn overlays.
- Represent managed filesystem permissions as either `restricted`
entries or `unrestricted`; `glob_scan_max_depth` is non-zero when
present.
- Preserve old rollout compatibility by accepting the pre-tagged `{
network, file_system }` profile shape during deserialization.
- Preserve fidelity for important edge cases: `DangerFullAccess`
round-trips as `disabled`, `ExternalSandbox` round-trips as `external`,
and managed unrestricted filesystem + restricted network stays managed
instead of being mistaken for external enforcement.
- Preserve configured deny-read entries and bounded glob scan depth when
full profiles are projected back into runtime policies, including
unrestricted replacements that now become `:root = write` plus deny
entries.
- Regenerate the experimental app-server v2 JSON/TypeScript schema and
update the `command/exec` README example for the tagged
`permissionProfile` shape.

## Compatibility

Legacy `SandboxPolicy` remains available at config/API boundaries as the
compatibility projection. Existing rollout lines with the old
`PermissionProfile` shape continue to load. The app-server
`permissionProfile` field is experimental, so its v2 wire shape is
intentionally updated to match the higher-fidelity model.

## Verification

- `just write-app-server-schema`
- `cargo check --tests`
- `cargo test -p codex-protocol permission_profile`
- `cargo test -p codex-protocol
preserving_deny_entries_keeps_unrestricted_policy_enforceable`
- `cargo test -p codex-app-server-protocol
permission_profile_file_system_permissions`
- `cargo test -p codex-app-server-protocol serialize_client_response`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox`
- `just fix`
- `just fix -p codex-protocol`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-core`
- `just fix -p codex-app-server`

Michael Bolin · 2026-04-23 23:02:18 -07:00

4816b89204

fix(exec-server): retain output until streams close (#18946 )

## Why

A Mac Bazel run hit a flake in
`server::handler::tests::output_and_exit_are_retained_after_notification_receiver_closes`
where the read path observed process exit but lost the expected buffered
stdout (`first\nsecond\n`). See the [GitHub Actions
job](https://github.com/openai/codex/actions/runs/24758468552/job/72436716505)
and [BuildBuddy
invocation](https://app.buildbuddy.io/invocation/37475a12-4ef2-45fb-ab8a-e49a2aba1d59).

The underlying race is that process exit is not the same thing as
stdout/stderr closure. If a child or grandchild inherits the pipe write
end, or a process duplicates it with `dup2`, the watched process can
exit while the stream is still open and more output can still arrive.
The exec-server was starting exited-process retention cleanup from the
exit event, so the process entry could be removed before the output
streams had actually closed.

While stress-testing the exec-server unit suite,
`server::handler::tests::long_poll_read_fails_after_session_resume`
exposed a separate test race: it started a short-lived command that
could exit and wake the pending long-poll read before the session-resume
assertion observed the resumed-session error. That test is intended to
cover resume eviction, not process-exit delivery, so this change keeps
the process alive and quiet while the second connection resumes the
session.

## What changed

- Keep exec-server process entries retained until stdout/stderr streams
close, then start the post-exit retention timer from the closed event.
- Wake long-poll readers when the closed event is emitted.
- Add focused `local_process` unit coverage that proves late output is
still retained after the short test retention interval has elapsed, and
that closed process entries are eventually evicted.
- Add a local and remote regression test where a parent exits while a
child keeps inherited stdout open. The child waits on an explicit
release file, so the test deterministically observes exit first,
releases the child, then requires a nonzero-wait read from the exit
sequence to receive the late output.
- In `codex-rs/exec-server/src/server/handler/tests.rs`, make
`long_poll_read_fails_after_session_resume` run a long-lived silent
command instead of a short command that prints and exits. This isolates
the test to session-resume behavior and prevents a normal process exit
from satisfying the pending long-poll read first.

## Testing

- `cargo test -p codex-exec-server
exec_process_retains_output_after_exit_until_streams_close`
- `cargo test -p codex-exec-server local_process::tests`
- `cargo test -p codex-exec-server`
- `just fix -p codex-exec-server`
- `bazel test //codex-rs/exec-server:exec-server-unit-tests
//codex-rs/exec-server:exec-server-exec_process-test
//codex-rs/exec-server:exec-server-file_system-test
//codex-rs/exec-server:exec-server-http_client-test
//codex-rs/exec-server:exec-server-initialize-test
//codex-rs/exec-server:exec-server-process-test
//codex-rs/exec-server:exec-server-websocket-test`
- `bazel test --runs_per_test=25
//codex-rs/exec-server:exec-server-unit-tests`

## Documentation

No docs update needed; this is an internal exec-server correctness fix.

Michael Bolin · 2026-04-23 19:49:58 +00:00

491a3058f6

exec-server: require explicit filesystem sandbox cwd (#19046 )

## Why

This is a cleanup PR for the `PermissionProfile` migration stack. #19016
fixed remote exec-server sandbox contexts so Docker-backed filesystem
requests use a request/container `cwd` instead of leaking the local test
runner `cwd`. That exposed the broader API problem:
`FileSystemSandboxContext::new(SandboxPolicy)` could still reconstruct
filesystem permissions by reading the exec-server process cwd with
`AbsolutePathBuf::current_dir()`.

That made `cwd`-dependent legacy entries, such as `:cwd`,
`:project_roots`, and relative deny globs, depend on ambient process
state instead of the request sandbox `cwd`. As later PRs make
`PermissionProfile` the primary permissions abstraction, sandbox
contexts should be explicit about whether they carry a request `cwd` or
are profile-only. Removing the implicit constructor prevents new call
sites from accidentally rebuilding permissions against the wrong `cwd`.

## What changed

- Removed `FileSystemSandboxContext::new(SandboxPolicy)`.
- Kept production callers on explicit constructors:
`from_legacy_sandbox_policy(..., cwd)`, `from_permission_profile(...)`,
and `from_permission_profile_with_cwd(...)`.
- Updated exec-server test helpers to construct `PermissionProfile`
values directly instead of routing through legacy `SandboxPolicy`
projections.
- Updated the environment regression test to use an explicit restricted
profile with no synthetic `cwd`.

## Verification

- `cargo test -p codex-exec-server`
- `just fix -p codex-exec-server`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19046).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* #18280
* __->__ #19046

Michael Bolin · 2026-04-22 23:05:12 +00:00

44dbd9e48a

[2/4] Implement executor HTTP request runner (#18582 )

### Why
Remote streamable HTTP MCP needs the executor to perform ordinary HTTP
requests on the executor side. This keeps network placement aligned with
`experimental_environment = "remote"` without adding MCP-specific
executor APIs.

### What
- Add an executor-side `http/request` runner backed by `reqwest`.
- Validate request method and URL scheme, preserving the transport
boundary at plain HTTP.
- Return buffered responses for ordinary calls and emit ordered
`http/request/bodyDelta` notifications for streaming responses.
- Register the request handler in the exec-server router.
- Document the runner entrypoint, conversion helpers, body-stream
bridge, notification sender, timeout behavior, and new integration-test
helpers.
- Add exec-server integration tests with the existing websocket harness
and a local TCP HTTP peer for buffered and streamed responses, with
comments spelling out what each test proves and its
setup/exercise/assert phases.

### Stack
1. #18581 protocol
2. #18582 runner
3. #18583 RMCP client
4. #18584 manager wiring and local/remote coverage

### Verification
- `just fmt`
- `cargo check -p codex-exec-server -p codex-rmcp-client --tests`
- `cargo check -p codex-core --test all` compile-only
- `git diff --check`
- Online full CI is running from the `full-ci` branch, including the
remote Rust test job.

Co-authored-by: Codex <noreply@openai.com>

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-22 20:36:34 +00:00

9360f267f3

exec-server: carry filesystem sandbox profiles (#18276 )

## Why

The exec-server still needs platform sandbox inputs, but the migration
should preserve the `PermissionProfile` that produced them. Keeping only
the derived legacy sandbox map would keep `SandboxPolicy` as the
effective abstraction and would make full-disk vs. restricted profiles
harder to preserve as the permissions stack starts round-tripping
profiles.

`PermissionProfile` entries can also be cwd-sensitive (`:cwd`,
`:project_roots`, relative globs), so the exec-server must carry the
request sandbox cwd instead of resolving those entries against the
long-lived exec-server process cwd.

## What changed

`FileSystemSandboxContext` now carries `permissions: PermissionProfile`
plus an optional `cwd`:

- removed `sandboxPolicy`, `sandboxPolicyCwd`,
`fileSystemSandboxPolicy`, and `additionalPermissions`
- added `permissions` and `cwd`
- kept the platform knobs `windowsSandboxLevel`,
`windowsSandboxPrivateDesktop`, and `useLegacyLandlock`

Core turn and apply-patch paths populate the context from the active
runtime permissions and request cwd. Exec-server derives platform
`SandboxPolicy`/`FileSystemSandboxPolicy` at the filesystem boundary,
adds helper runtime reads there, and rejects cwd-dependent profiles that
arrive without a cwd.

The legacy `FileSystemSandboxContext::new(SandboxPolicy)` constructor
now preserves the old workspace-write conversion semantics for
compatibility tests/callers.

## Verification

- `cargo test -p codex-exec-server`
- `cargo test -p codex-exec-server sandbox_cwd -- --nocapture`
- `cargo test -p codex-exec-server
sandbox_context_new_preserves_legacy_workspace_write_read_only_subpaths
-- --nocapture`
- `cargo test -p codex-core --lib
file_system_sandbox_context_uses_active_attempt -- --nocapture`

Michael Bolin · 2026-04-21 20:22:28 -07:00

36f8bb4ffa

Support multiple managed environments (#18401 )

## Summary
- refactor EnvironmentManager to own keyed environments with
default/local lookup helpers
- keep remote exec-server client creation lazy until exec/fs use
- preserve disabled agent environment access separately from internal
local environment access

## Validation
- not run (per Codex worktree instruction to avoid tests/builds unless
requested)

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-21 15:29:35 -07:00

ddbe2536be

[1/4] Add executor HTTP request protocol (#18581 )

### Why
Remote streamable HTTP MCP needs a transport-shaped executor primitive
before the MCP client can move network I/O to the executor. This layer
keeps the executor unaware of MCP and gives later PRs an ordered
streaming surface for response bodies.

### What
- Add typed `http/request` and `http/request/bodyDelta` protocol
payloads.
- Add executor client helpers for buffered and streamed HTTP responses.
- Route body-delta notifications to request-scoped streams with sequence
validation and cleanup when a stream finishes or is dropped.
- Document the new protocol constants, transport structs, public client
methods, body-stream lifecycle, and request-scoped routing helpers.
- Add in-memory JSON-RPC client coverage for streamed HTTP response-body
notifications, with comments spelling out what the test proves and each
setup/exercise/assert phase.

### Stack
1. #18581 protocol
2. #18582 runner
3. #18583 RMCP client
4. #18584 manager wiring and local/remote coverage

### Verification
- `just fmt`
- `cargo check -p codex-exec-server -p codex-rmcp-client --tests`
- `cargo check -p codex-core --test all` compile-only
- `git diff --check`
- Online full CI is running from the `full-ci` branch, including the
remote Rust test job.

Co-authored-by: Codex <noreply@openai.com>

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-21 02:21:08 +00:00

d6af7a6c03

[6/6] Fail exec client operations after disconnect (#18027 )

## Summary
- Reject new exec-server client operations once the transport has
disconnected.
- Convert pending RPC calls into closed errors instead of synthetic
server errors.
- Cover pending read and later write behavior after remote executor
disconnect.

## Verification
- `just fmt`
- `cargo check -p codex-exec-server`

## Stack
```text
@  #18027 [6/6] Fail exec client operations after disconnect
│
o  #18212 [5/6] Wire executor-backed MCP stdio
│
o  #18087 [4/6] Abstract MCP stdio server launching
│
o  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-20 23:24:06 +00:00

9ef1cab6f7

protocol: canonicalize file system permissions (#18274 )

## Why

`PermissionProfile` needs stable, canonical file-system semantics before
it can become the primary runtime permissions abstraction. Without a
canonical form, callers have to keep re-deriving legacy sandbox maps and
profile comparisons remain lossy or order-dependent.

## What changed

This adds canonicalization helpers for `FileSystemPermissions` and
`PermissionProfile`, expands special paths into explicit sandbox
entries, and updates permission request/conversion paths to consume
those canonical entries. It also tightens the legacy bridge so root-wide
write profiles with narrower carveouts are not silently projected as
full-disk legacy access.

## Verification

- `cargo test -p codex-protocol
root_write_with_read_only_child_is_not_full_disk_write -- --nocapture`
- `cargo test -p codex-sandboxing permission -- --nocapture`
- `cargo test -p codex-tui permissions -- --nocapture`

Michael Bolin · 2026-04-20 09:57:03 -07:00

dcec516313

exec-server: preserve fs helper runtime env (#18380 )

## Summary
- preserve a small fs-helper runtime env allowlist (`PATH`, temp vars)
instead of launching the sandboxed helper with an empty env
- add unit coverage for the allowlist and transformed sandbox request
env
- add a Linux smoke test that starts the test exec-server with a fake
`bwrap` on `PATH`, runs a sandboxed fs write through the remote fs
helper path, and asserts that bwrap path was exercised

## Validation
- `cd /tmp/codex-worktrees/fs-helper-env-defaults/codex-rs && export
PATH=$HOME/code/openai/project/dotslash-gen/bin:$HOME/.local/bin:$PATH
&& bazel test --bes_backend= --bes_results_url=
//codex-rs/exec-server:exec-server-file_system-test
--test_filter=sandboxed_file_system_helper_finds_bwrap_on_preserved_path`
- `cd /tmp/codex-worktrees/fs-helper-env-defaults/codex-rs && export
PATH=$HOME/code/openai/project/dotslash-gen/bin:$HOME/.local/bin:$PATH
&& bazel test --bes_backend= --bes_results_url=
//codex-rs/exec-server:exec-server-unit-tests
--test_filter="helper_env|sandbox_exec_request_carries_helper_env"`
- earlier on this branch before the smoke-test harness adjustment: `cd
/tmp/codex-worktrees/fs-helper-env-defaults/codex-rs && export
PATH=$HOME/code/openai/project/dotslash-gen/bin:$HOME/.local/bin:$PATH
&& bazel test --bes_backend= --bes_results_url=
//codex-rs/exec-server:all`

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-17 20:44:01 +00:00

63e4a900c9

[3/6] Add pushed exec process events (#18020 )

## Summary
- Add a pushed `ExecProcessEvent` stream alongside retained
`process/read` output.
- Publish local and remote output, exit, close, and failure events.
- Cover the event stream with shared local/remote exec process tests.

## Testing
- `cargo check -p codex-exec-server`
- `cargo check -p codex-rmcp-client`
- Not run: `cargo test` per repo instruction; CI will cover.

## Stack
```text
o  #18027 [6/6] Fail exec client operations after disconnect
│
o  #18212 [5/6] Wire executor-backed MCP stdio
│
o  #18087 [4/6] Abstract MCP stdio server launching
│
@  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-17 19:07:43 +00:00

9d3a5cf05e

[2/8] Support piped stdin in exec process API (#18086 )

## Summary
- Add an explicit stdin mode to process/start.
- Keep normal non-interactive exec stdin closed while allowing
pipe-backed processes.

## Stack
```text
o  #18027 [8/8] Fail exec client operations after disconnect
│
o  #18025 [7/8] Cover MCP stdio tests with executor placement
│
o  #18089 [6/8] Wire remote MCP stdio through executor
│
o  #18088 [5/8] Add executor process transport for MCP stdio
│
o  #18087 [4/8] Abstract MCP stdio server launching
│
o  #18020 [3/8] Add pushed exec process events
│
@  #18086 [2/8] Support piped stdin in exec process API
│
o  #18085 [1/8] Add MCP server environment config
│
o  main
```

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-04-16 10:30:10 -07:00

2ca270d08d

Stabilize Bazel tests (timeout tweaks and flake fixes) (#17791 )

David de Regt · 2026-04-16 07:57:51 -07:00

6adba99f4d

[codex] Restore remote exec-server filesystem tests (#17989 )

## Summary

- Re-enable remote variants for the exec-server filesystem
sandbox/symlink tests that were made local-only in PR #17671.
- Restore `use_remote` parameterization for the readable-root,
normalized symlink escape, symlink removal, and symlink
copy-preservation cases.
- Preserve `mode={use_remote}` context on key async filesystem failures
so CI failures point at the local or remote lane.

## Validation

- `cd codex-rs && just fmt`
- Not run: `bazel test
//codex-rs/exec-server:exec-server-file_system-test` per local Codex
development guidance to avoid test runs unless explicitly requested.

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-15 20:48:13 +00:00

ba36415a30

Fix fs/readDirectory to skip broken symlinks (#17907 )

## Summary
- Skip directory entries whose metadata lookup fails during
`fs/readDirectory`
- Add an exec-server regression test covering a broken symlink beside
valid entries

## Testing
- `just fmt`
- `cargo test -p codex-exec-server` (started, but dependency/network
updates stalled before completion in this environment)

willwang-openai · 2026-04-15 10:50:22 -07:00

a3d475d33f

Remove exec-server fs sandbox request preflight (#17883 )

## Summary
- Remove the exec-server-side manual filesystem request path preflight
before invoking the sandbox helper.
- Keep sandbox helper policy construction and platform sandbox
enforcement as the access boundary.
- Add a portable local+remote regression for writing through an
explicitly configured alias root.
- Remove the metadata symlink-escape assertion that depended on the
deleted manual preflight; no replacement metadata-specific access probe
is added.

## Tests
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-exec-server --test file_system`
- `git diff --check`

pakrym-oai · 2026-04-15 09:28:30 -07:00

1dead46c90

Route apply_patch through the environment filesystem (#17674 )

## Summary
- route apply_patch runtime execution through the selected Environment
filesystem instead of the local self-exec path
- keep the standalone apply_patch command surface intact while restoring
its launcher/test/docs contract
- add focused apply_patch filesystem sandbox regression coverage

## Validation
- remote devbox Bazel run in progress
- passed: //codex-rs/apply-patch:apply-patch-unit-tests
--test_filter=test_read_file_utf8_with_context_reports_invalid_utf8
- in progress / follow-up: focused core and exec Bazel test slices on
dev

## Follow-up under review
- remote pre-verification and approval/retry behavior still need
explicit scrutiny for delete/update flows
- runtime sandbox-denial classification may need a tighter assertion
path than rendered stderr matching

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-14 12:49:02 -07:00

c24124b37d

[codex] Add symlink flag to fs metadata (#17719 )

Add `is_symlink` to FsMetadata struct.

pakrym-oai · 2026-04-13 17:46:56 -07:00

f3cbe3d385

Stabilize exec-server filesystem tests in CI (#17671 )

## Summary\n- add an exec-server package-local test helper binary that
can run exec-server and fs-helper flows\n- route exec-server filesystem
tests through that helper instead of cross-crate codex helper
binaries\n- stop relying on Bazel-only extra binary wiring for these
tests\n\n## Testing\n- not run (per repo guidance for codex changes)

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-13 16:53:42 -07:00

280a4a6d42

fix: stability exec server (#17640 )

jif-oai · 2026-04-13 14:52:12 +01:00

49ca7c9f24

Build remote exec env from exec-server policy (#17216 )

## Summary
- add an exec-server `envPolicy` field; when present, the server starts
from its own process env and applies the shell environment policy there
- keep `env` as the exact environment for local/embedded starts, but
make it an overlay for remote unified-exec starts
- move the shell-environment-policy builder into `codex-config` so Core
and exec-server share the inherit/filter/set/include behavior
- overlay only runtime/sandbox/network deltas from Core onto the
exec-server-derived env

## Why
Remote unified exec was materializing the shell env inside Core and
forwarding the whole map to exec-server, so remote processes could
inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
base env on the executor while preserving Core-owned runtime additions
like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
sandbox marker env.

## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-core --lib unified_exec::process_manager::tests`
- `cargo test -p codex-core --lib exec_env::tests`
- `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
matched 0 tests)
- `cargo test -p codex-config --lib shell_environment` (compile-only;
filter matched 0 tests)
- `just bazel-lock-update`

## Known local validation issue
- `just bazel-lock-check` is not runnable in this checkout: it invokes
`./scripts/check-module-bazel-lock.sh`, which is missing.

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>

jif-oai · 2026-04-13 09:59:08 +01:00

bacb92b1d7

Stabilize exec-server process tests (#17605 )

Problem: After #17294 switched exec-server tests to launch the top-level
`codex exec-server` command, parallel remote exec-process cases can
flake while waiting for the child server's listen URL or transport
shutdown.

Solution: Serialize remote exec-server-backed process tests and harden
the harness so spawned servers are killed on drop and shutdown waits for
the child process to exit.

Eric Traut · 2026-04-13 00:31:13 -07:00

6550007cca

Run exec-server fs operations through sandbox helper (#17294 )

## Summary
- run exec-server filesystem RPCs requiring sandboxing through a
`codex-fs` arg0 helper over stdin/stdout
- keep direct local filesystem execution for `DangerFullAccess` and
external sandbox policies
- remove the standalone exec-server binary path in favor of top-level
arg0 dispatch/runtime paths
- add sandbox escape regression coverage for local and remote filesystem
paths

## Validation
- `just fmt`
- `git diff --check`
- remote devbox: `cd codex-rs && bazel test --bes_backend=
--bes_results_url= //codex-rs/exec-server:all` (6/6 passed)

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-12 18:36:03 -07:00

d626dc3895

feat: move exec-server ownership (#16344 )

This introduces session-scoped ownership for exec-server so ws
disconnects no longer immediately kill running remote exec processes,
and it prepares the protocol for reconnect-based resume.
- add session_id / resume_session_id to the exec-server initialize
handshake
  - move process ownership under a shared session registry
- detach sessions on websocket disconnect and expire them after a TTL
instead of killing processes immediately (we will resume based on this)
- allow a new connection to resume an existing session and take over
notifications/ownership
- I use UUID to make them not predictable as we don't have auth for now
- make detached-session expiry authoritative at resume time so teardown
wins at the TTL boundary
- reject long-poll process/read calls that get resumed out from under an
older attachment

---------

Co-authored-by: Codex <noreply@openai.com>

jif-oai · 2026-04-10 14:11:47 +01:00

085ffb4456

Add sandbox support to filesystem APIs (#16751 )

## Summary
- add optional `sandboxPolicy` support to the app-server filesystem
request surface
- thread sandbox-aware filesystem options through app-server and
exec-server adapters
- enforce sandboxed read/write access in the filesystem abstraction with
focused local and remote coverage

## Validation
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-exec-server file_system`
- `cargo test -p codex-app-server suite::v2::fs`

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-04-08 12:10:48 -07:00

f383cc980d

[codex] Migrate apply_patch to executor filesystem (#17027 )

- Migrate apply-patch verification and application internals to use the
async `ExecutorFileSystem` abstraction from `exec-server`.
- Convert apply-patch `cwd` handling to `AbsolutePathBuf` through the
verifier/parser/handler boundary.

Doesn't change how the tool itself works.

pakrym-oai · 2026-04-07 21:20:22 +00:00

e9702411ab

chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054 )

## Why

`argument-comment-lint` was green in CI even though the repo still had
many uncommented literal arguments. The main gap was target coverage:
the repo wrapper did not force Cargo to inspect test-only call sites, so
examples like the `latest_session_lookup_params(true, ...)` tests in
`codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.

This change cleans up the existing backlog, makes the default repo lint
path cover all Cargo targets, and starts rolling that stricter CI
enforcement out on the platform where it is currently validated.

## What changed

- mechanically fixed existing `argument-comment-lint` violations across
the `codex-rs` workspace, including tests, examples, and benches
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
`tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
`--all-targets` unless the caller explicitly narrows the target set
- fixed both wrappers so forwarded cargo arguments after `--` are
preserved with a single separator
- documented the new default behavior in
`tools/argument-comment-lint/README.md`
- updated `rust-ci` so the macOS lint lane keeps the plain wrapper
invocation and therefore enforces `--all-targets`, while Linux and
Windows temporarily pass `-- --lib --bins`

That temporary CI split keeps the stricter all-targets check where it is
already cleaned up, while leaving room to finish the remaining Linux-
and Windows-specific target-gated cleanup before enabling
`--all-targets` on those runners. The Linux and Windows failures on the
intermediate revision were caused by the wrapper forwarding bug, not by
additional lint findings in those lanes.

## Validation

- `bash -n tools/argument-comment-lint/run.sh`
- `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
- shell-level wrapper forwarding check for `-- --lib --bins`
- shell-level wrapper forwarding check for `-- --tests`
- `just argument-comment-lint`
- `cargo test` in `tools/argument-comment-lint`
- `cargo test -p codex-terminal-detection`

## Follow-up

- Clean up remaining Linux-only target-gated callsites, then switch the
Linux lint lane back to the plain wrapper invocation.
- Clean up remaining Windows-only target-gated callsites, then switch
the Windows lint lane back to the plain wrapper invocation.

Michael Bolin · 2026-03-27 19:00:44 -07:00

61dfe0b86c

feat: use ProcessId in exec-server (#15866 )

Use a full struct for the ProcessId to increase readability and make it
easier in the future to make it evolve if needed

jif-oai · 2026-03-26 16:45:36 +01:00

6d2f4aaafc

feat: exec-server prep for unified exec (#15691 )

This PR partially rebase `unified_exec` on the `exec-server` and adapt
the `exec-server` accordingly.

## What changed in `exec-server`

1. Replaced the old "broadcast-driven; process-global" event model with
process-scoped session events. The goal is to be able to have dedicated
handler for each process.
2. Add to protocol contract to support explicit lifecycle status and
stream ordering:
- `WriteResponse` now returns `WriteStatus` (Accepted, UnknownProcess,
StdinClosed, Starting) instead of a bool.
  - Added seq fields to output/exited notifications.
  - Added terminal process/closed notification.
3. Demultiplexed remote notifications into per-process channels. Same as
for the event sys
4. Local and remote backends now both implement ExecBackend.
5. Local backend wraps internal process ID/operations into per-process
ExecProcess objects.
6. Remote backend registers a session channel before launch and
unregisters on failed launch.

## What changed in `unified_exec`

1. Added unified process-state model and backend-neutral process
wrapper. This will probably disappear in the future, but it makes it
easier to keep the work flowing on both side.
- `UnifiedExecProcess` now handles both local PTY sessions and remote
exec-server processes through a shared `ProcessHandle`.
- Added `ProcessState` to track has_exited, exit_code, and terminal
failure message consistently across backends.
2. Routed write and lifecycle handling through process-level methods.

## Some rationals

1. The change centralizes execution transport in exec-server while
preserving policy and orchestration ownership in core, avoiding
duplicated launch approval logic. This comes from internal discussion.
2. Session-scoped events remove coupling/cross-talk between processes
and make stream ordering and terminal state explicit (seq, closed,
failed).
3. The failure-path surfacing (remote launch failures, write failures,
transport disconnects) makes command tool output and cleanup behavior
deterministic

## Follow-ups:
* Unify the concept of thread ID behind an obfuscated struct
* FD handling
* Full zsh-fork compatibility
* Full network sandboxing compatibility
* Handle ws disconnection

jif-oai · 2026-03-26 15:22:34 +01:00

7dac332c93

Add cached environment manager for exec server URL (#15785 )

Add environment manager that is a singleton and is created early in
app-server (before skill manager, before config loading).

Use an environment variable to point to a running exec server.

pakrym-oai · 2026-03-25 16:14:36 -07:00

8fa88fa8ca

Split exec process into local and remote implementations (#15233 )

## Summary
- match the exec-process structure to filesystem PR #15232
- expose `ExecProcess` on `Environment`
- make `LocalProcess` the real implementation and `RemoteProcess` a thin
network proxy over `ExecServerClient`
- make `ProcessHandler` a thin RPC adapter delegating to `LocalProcess`
- add a shared local/remote process test

## Validation
- `just fmt`
- `CARGO_TARGET_DIR=~/.cache/cargo-target/codex cargo test -p
codex-exec-server`
- `just fix -p codex-exec-server`

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-03-20 03:13:08 +00:00

96a86710c3

Refactor ExecServer filesystem split between local and remote (#15232 )

For each feature we have:
1. Trait exposed on environment
2. **Local Implementation** of the trait
3. Remote implementation that uses the client to proxy via network
4. Handler implementation that handles PRC requests and calls into
**Local Implementation**

pakrym-oai · 2026-03-19 17:08:04 -07:00

403b397e4e

Add exec-server exec RPC implementation (#15090 )

Stacked PR 2/3, based on the stub PR.

Adds the exec RPC implementation and process/event flow in exec-server
only.

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-03-19 19:00:36 +00:00

1d210f639e

Remove stdio transport from exec server (#15119 )

Summary
- delete the deprecated stdio transport plumbing from the exec server
stack
- add a basic `exec_server()` harness plus test utilities to start a
server, send requests, and await events
- refresh exec-server dependencies, configs, and documentation to
reflect the new flow

Testing
- Not run (not requested)

---------

Co-authored-by: starr-openai <starr@openai.com>
Co-authored-by: Codex <noreply@openai.com>

pakrym-oai · 2026-03-19 01:00:35 +00:00

903660edba

Add exec-server stub server and protocol docs (#15089 )

Stacked PR 1/3.

This is the initialize-only exec-server stub slice: binary/client
scaffolding and protocol docs, without exec/filesystem implementation.

---------

Co-authored-by: Codex <noreply@openai.com>

starr-openai · 2026-03-19 00:30:05 +00:00

81996fcde6

refactor: delete exec-server and move execve wrapper into shell-escalation (#12632 )

## Why

We already plan to remove the shell-tool MCP path, and doing that
cleanup first makes the follow-on `shell-escalation` work much simpler.

This change removes the last remaining reason to keep
`codex-rs/exec-server` around by moving the `codex-execve-wrapper`
binary and shared shell test fixtures to the crates/tests that now own
that functionality.

## What Changed

### Delete `codex-rs/exec-server`

- Remove the `exec-server` crate, including the MCP server binary,
MCP-specific modules, and its test support/test suite
- Remove `exec-server` from the `codex-rs` workspace and update
`Cargo.lock`

### Move `codex-execve-wrapper` into `codex-rs/shell-escalation`

- Move the wrapper implementation into `shell-escalation`
(`src/unix/execve_wrapper.rs`)
- Add the `codex-execve-wrapper` binary entrypoint under
`shell-escalation/src/bin/`
- Update `shell-escalation` exports/module layout so the wrapper
entrypoint is hosted there
- Move the wrapper README content from `exec-server` to
`shell-escalation/README.md`

### Move shared shell test fixtures to `app-server`

- Move the DotSlash `bash`/`zsh` test fixtures from
`exec-server/tests/suite/` to `app-server/tests/suite/`
- Update `app-server` zsh-fork tests to reference the new fixture paths

### Keep `shell-tool-mcp` as a shell-assets package

- Update `.github/workflows/shell-tool-mcp.yml` packaging so the npm
artifact contains only patched Bash/Zsh payloads (no Rust binaries)
- Update `shell-tool-mcp/package.json`, `shell-tool-mcp/src/index.ts`,
and docs to reflect the shell-assets-only package shape
- `shell-tool-mcp-ci.yml` does not need changes because it is already
JS-only

## Verification

- `cargo shear`
- `cargo clippy -p codex-shell-escalation --tests`
- `just clippy`

Michael Bolin · 2026-02-23 20:10:22 -08:00

38f84b6b29

test: vendor zsh fork via DotSlash and stabilize zsh-fork tests (#12518 )

## Why

The zsh integration tests were still brittle in two ways:

- they relied on `CODEX_TEST_ZSH_PATH` / environment-specific setup, so
they often did not exercise the patched zsh fork that `shell-tool-mcp`
ships
- once the tests consistently used the vendored zsh fork, they exposed
real Linux-specific zsh-fork issues in CI

In particular, the Linux failures were not just test noise:

- the zsh-fork launch path was dropping `ExecRequest.arg0`, so Linux
`codex-linux-sandbox` arg0 dispatch did not run and zsh wrapper-mode
could receive malformed arguments
- the
`turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
test uses the zsh exec bridge (which talks to the parent over a Unix
socket), but Linux restricted sandbox seccomp denies `connect(2)`,
causing timeouts on `ubuntu-24.04` x86/arm

This PR makes the zsh tests consistently run against the intended
vendored zsh fork and fixes/hardens the zsh-fork path so the Linux CI
signal is meaningful.

## What Changed

- Added a single shared test-only DotSlash file for the patched zsh fork
at `codex-rs/exec-server/tests/suite/zsh` (analogous to the existing
`bash` test resource).
- Updated both app-server and exec-server zsh tests to use that shared
DotSlash zsh (no duplicate zsh DotSlash file, no `CODEX_TEST_ZSH_PATH`
dependency).
- Updated the app-server zsh-fork test helper to resolve the shared
DotSlash zsh and avoid silently falling back to host zsh.
- Kept the app-server zsh-fork tests configured via `config.toml`, using
a test wrapper path where needed to force `zsh -df` (and rewrite `-lc`
to `-c`) for the subcommand-decline test.
- Hardened the app-server subcommand-decline zsh-fork test for CI
variability:
  - tolerate an extra `/responses` POST with a no-op mock response
- tolerate non-target approval ordering while remaining strict on the
two `/usr/bin/true` approvals and decline behavior
- use `DangerFullAccess` on Linux for this one test because it validates
zsh approval flow, not Linux sandbox socket restrictions
- Fixed zsh-fork process launching on Linux by preserving `req.arg0` in
`ZshExecBridge::execute_shell_request(...)` so `codex-linux-sandbox`
arg0 dispatch continues to work.
- Moved `maybe_run_zsh_exec_wrapper_mode()` under
`arg0_dispatch_or_else(...)` in `app-server` and `cli` so wrapper-mode
handling coexists correctly with arg0-dispatched helper modes.
- Consolidated duplicated `dotslash -- fetch` resolution logic into
shared test support (`core/tests/common/lib.rs`).
- Updated `codex-rs/exec-server/tests/suite/accept_elicitation.rs` to
use DotSlash zsh and hardened the zsh elicitation test for Bazel/zsh
differences by:
  - resolving an absolute `git` path
  - running `git init --quiet .`
- asserting success / `.git` creation instead of relying on banner text

## Verification

- `cargo test -p codex-app-server turn_start_zsh_fork -- --nocapture`
- `cargo test -p codex-exec-server accept_elicitation -- --nocapture`
- `bazel test //codex-rs/exec-server:exec-server-all-test
--test_output=streamed --test_arg=--nocapture
--test_arg=accept_elicitation_for_prompt_rule_with_zsh`
- CI (`rust-ci`) on the final cleaned commit: `Tests — ubuntu-24.04 -
x86_64-unknown-linux-gnu` and `Tests — ubuntu-24.04-arm -
aarch64-unknown-linux-gnu` passed in [run
22291424358](https://github.com/openai/codex/actions/runs/22291424358)

Michael Bolin · 2026-02-22 19:39:56 -08:00

e8949f4507

71 Commits