Commit Graph

166 Commits

  • feat(exec-server): add Noise rendezvous environment (#28774)
    ## Why
    
    Codex can run a remote exec server through the Noise relay, but the
    normal
    environment-manager path could not establish an
    environment-registry-backed
    harness connection. Signed rendezvous URLs and harness authorizations
    are
    short-lived, so reconnects must fetch a fresh bundle instead of
    retaining
    stale connection credentials. A stalled registry request must also fail
    within
    the regular remote connection deadline, without exposing these
    credentials in
    debug logs.
    
    Issue: N/A (internal environment-service integration).
    
    ## What Changed
    
    - Add environment-manager configuration for a registry-backed Noise
    rendezvous
      environment.
    - Request a fresh bundle from
    `/cloud/environment/{environment_id}/connect` for every physical harness
      connection, using the existing 10-second remote connection timeout.
    - Share the Environment Registry register, connect, and validate wire
    payloads
      through `codex-exec-server` and `codex-core-api`.
    - Redact the signed rendezvous URL and harness authorization from the
    public
      connect response's `Debug` output.
    - Add focused coverage for registry bundle retrieval, stalled requests,
    and
      credential redaction.
  • exec-server: expose environment registry payloads (#28651)
    ## Why
    
    Services that proxy the exec-server environment registry endpoints need
    to deserialize and forward the same Noise registration and harness-key
    validation payloads. Those wire models currently live as private,
    serialize-only structs in `exec-server`, which forces consumers to
    duplicate the contract.
    
    ## What changed
    
    - Add owned serde models for registration and harness-key validation
    requests and responses.
    - Use those models in the existing exec-server registry client.
    - Re-export the models from `codex-exec-server` and `codex-core-api`.
    - Keep the harness authorization request free of a derived `Debug`
    implementation so it is not accidentally logged.
    
    ## Testing
    
    - Focused exec-server registration and harness-key validation tests: 2
    passed.
    - `cargo check -p codex-core-api`
    
    The full `codex-exec-server` suite compiled and ran 254 tests: 222
    passed, while 32 existing filesystem sandbox tests could not run under
    the nested macOS sandbox (`sandbox_apply: Operation not permitted`).
    
    Co-authored-by: Codex <noreply@openai.com>
  • unified-exec: preserve PathUri through exec-server (#28681)
    ## Why
    
    It should be possible for app-server to handle "foreign" OS paths in
    unified_exec working directories, allowing e.g. a Linux app-server to
    run processes on e.g. a Windows exec-server.
    
    ## What
    
    Convert the core unified_exec cwd values to use `PathUri`.
    
    Adds fallible path conversion in several places to try to minimize the
    scope of this change. The only time this change suppresses errors from
    converting `PathUri` to an `AbsolutePathBuf` is when the turn is
    configured with no sandboxing at all to allow us to make progress
    testing without sandboxing.
    
    Future changes to apply_patch and sandboxing will clean up these error
    paths.
    
    A tool's cwd is resolved from joining a model-provided workdir to the
    environment's cwd. When using `AbsolutePathBuf::join()`, an
    absolute-path workdir would overwrite the environment's cwd and we would
    resolve permissions/sandboxing against the model-provided path. This
    change extends `PathUri::join()` to also treat an absolute rhs as an
    override of the base/lhs.
    
    This also removes some coverage from the remove_env_windows tests until
    a follow-up converts foreign paths in command exec events correctly.
    
    ## Breaking Changes
    
    When using `AbsolutePathBuf::join()` for workdir resolution, we ended up
    resolving tilde-prefixed paths against the app-server's `$HOME`, e.g.
    `~/foo/bar` becomes `/home/anp/foo/bar`. It's difficult to do this with
    `PathUri` joining, so after offline discussion this PR no longer
    implements it.
    
    A quick check of some power users' rollouts suggests that models don't
    actually generate home-prefixed absolute working directories for their
    spawns, so this shouldn't have any real blast radius.
  • Run fs helper through Windows sandbox wrapper (#28359)
    ## Why
    
    This is the final PR in the Windows fs-helper sandbox stack and contains
    the actual bug fix.
    
    The exec-server filesystem helper is a direct-spawn path: it asks
    `SandboxManager` for a `SandboxExecRequest`, then launches the returned
    argv itself. That works on macOS and Linux because the transformed argv
    is already a self-contained sandbox wrapper. On Windows, the transformed
    request carried `WindowsRestrictedToken` metadata, but the direct-spawn
    fs-helper runner still launched the helper argv directly.
    
    That means Windows filesystem built-ins backed by the fs-helper could
    run with the parent Codex process permissions instead of the configured
    Windows sandbox. This PR makes the direct-spawn transform produce a
    self-contained Windows wrapper argv before fs-helper launches it.
    
    ## What Changed
    
    - Added `SandboxManager::transform_for_direct_spawn()` for callers that
    launch the returned argv themselves.
    - Wrapped Windows restricted-token direct-spawn requests with `codex.exe
    --run-as-windows-sandbox` and then marked the outer request as
    unsandboxed, matching the macOS/Linux wrapper argv shape.
    - Updated `exec-server/src/fs_sandbox.rs` to use the direct-spawn
    transform for fs-helper launches.
    - Materialized the inner `codex.exe --codex-run-as-fs-helper` executable
    into `.sandbox-bin` so the sandboxed user can run it.
    - Carried runtime workspace roots through `FileSystemSandboxContext` as
    `PathUri` values so `:workspace_roots` policies resolve correctly
    without sending native client paths over exec-server JSON.
    - Preserved wrapper setup identity environment needed by Windows sandbox
    setup without changing the serialized inner helper environment.
    
    ## Verification
    
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just test -p codex-sandboxing transform_for_direct_spawn_windows`
    - `just test -p codex-exec-server fs_sandbox::tests`
    - `just fix -p codex-windows-sandbox -p codex-sandboxing -p
    codex-exec-server -p codex-core -p codex-file-system`
    
    Local note: `just fmt` completed Rust formatting, but this workstation
    still fails the non-Rust formatter phases because uv cannot open its
    cache and the local buildifier/dotslash path is missing.
  • Back off registry retries during exec recovery (#28546)
    ## Why
    
    PR #28512 retries a failed session recovery every 100 ms. Every Noise
    recovery attempt first asks the environment registry for a fresh
    connection bundle, even when the eventual failure comes from the
    WebSocket or initialize handshake. During an outage, that could make
    each disconnected client call the registry about 250 times during the
    25-second recovery window.
    
    ## What changes
    
    All retryable Noise recovery failures now use a separate backoff
    schedule:
    
    ```text
    base:    500 ms -> 1 s -> 2 s -> 4 s -> 5 s maximum
    actual:  500-750 ms, 1-1.5 s, 2-3 s, 4-6 s, 5-7.5 s
    ```
    
    The extra 0-50% is deterministic per-session jitter so disconnected
    clients do not retry together. Direct WebSocket recovery keeps the
    existing 100 ms retry because it does not re-enter the registry.
  • Resume exec-server sessions after disconnect (#28512)
    Supersedes #28288 (closed).
    
    ## Why
    
    A short WebSocket interruption currently ends every client-side process
    handle, even though exec-server keeps the server session and its
    processes alive for a short time.
    
    This is especially visible for executor-backed stdio MCP servers: a
    temporary connection loss becomes a permanent `Transport closed` error.
    The server already has the information needed to resume the session, but
    the client opens a fresh session instead of using it.
    
    This change reconnects below the process and MCP layers. Existing
    process handles stay valid, missed output is recovered, and the same
    server-side processes continue running.
    
    ## State machine
    
    One logical `ExecServerClient` stays alive while its underlying RPC
    connection changes generations.
    
    ```text
                             transport closes
           +------------------------------------------------+
           |                                                v
    +-------------+                                  +-------------+
    |  Connected  |                                  | Recovering  |
    +-------------+                                  +-------------+
           ^                                                |
           | session resumed, processes caught up           | retryable error
           +------------------------------------------------+ loops until deadline
                                                            |
                                                            | deadline or permanent error
                                                            v
                                                      +-------------+
                                                      |   Failed    |
                                                      +-------------+
    ```
    
    ### `Connected`
    
    - New RPC calls use the current connection.
    - Process notifications are published in sequence order.
    - A disconnect only starts recovery if it came from the current
    connection generation. Late events from older generations cannot replace
    the active connection.
    
    ### `Recovering`
    
    - New calls wait instead of choosing a half-connected RPC client.
    - Existing process handles, wake subscriptions, and event subscriptions
    stay open.
    - Streaming HTTP response bodies fail immediately because their byte
    streams cannot be resumed safely.
    - Recovery first waits for process starts that were already in flight. A
    start whose result became ambiguous is cleaned up after reconnection
    instead of being silently adopted.
    - The client reconnects with the learned `session_id`. The server may
    briefly report that the old connection is still attached, so that error
    is retried until the detach finishes.
    - The notification consumer starts before the resume handshake
    completes. This prevents a busy process from filling the notification
    queue and blocking the initialize response.
    - Before installing the new connection, the client catches up every
    recoverable process with `process/read`.
    
    ### `Failed`
    
    - Recovery stops after 25 seconds or after a permanent error.
    - Waiting calls are released with one stable disconnect error.
    - Existing process sessions receive a terminal failure instead of
    waiting forever.
    
    ## Recovering process events
    
    Output, exit, and close events share one sequence. During normal
    operation, the client buffers early events until every lower sequence
    has been published.
    
    After reconnection, the client reads each process starting after its
    last published sequence:
    
    1. Retained output chunks are inserted by sequence number.
    2. Exit and close state are reconstructed in their sequence positions.
    3. Events already received as live notifications are ignored as
    duplicates.
    4. Newly contiguous events are published in order.
    5. If the server no longer retains enough output to fill a sequence gap,
    only that process is terminated and failed. The recovered connection
    remains usable for other processes.
    
    The server reports its full next event sequence for unbounded reads,
    including exit and close events. Closed processes remain readable for
    the same 30-second window used to retain detached sessions.
    
    ## Other details
    
    - Detached server sessions are retained for 30 seconds, leaving margin
    around the client's 25-second recovery deadline.
    - Session attach and detach update the active notification sender under
    the same attachment lock, so an old connection cannot clear a newly
    attached sender.
    - A dedicated error code distinguishes the temporary "session is still
    attached" race from permanent initialization errors.
    - Process starts are identity-checked on both client and server. Cleanup
    from an older start cannot remove a newer process that reused the same
    ID.
    - Mutating requests that were already in flight when the transport
    closed are not replayed, because the client cannot know whether the
    server applied them. Requests started after recovery is known wait for
    the replacement connection.
    - We assume the server/client version stays in sync (on the before/after
    this PR)
    
    ## User impact
    
    Long-running commands and stdio MCP servers can survive a temporary
    exec-server WebSocket interruption without changing process IDs or
    losing output produced during the outage.
  • [codex] exec-server: stream files in chunks (#28354)
    ## Why
    
    `fs/readFile` buffers the entire file in one response, which makes large
    remote reads expensive and prevents callers from applying backpressure.
    We need an opt-in streaming path with bounded block sizes while
    preserving the existing single-call API for small and sandboxed reads.
    
    ## What changed
    
    - Add `ExecServerClient::stream`, returning a named `FileReadStream`
    that implements `futures::Stream` and yields immutable 1 MiB byte
    blocks.
    - Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs.
    `fs/readBlock` accepts an explicit offset and length.
    - Keep unsandboxed files open between block reads, cap open handles per
    connection, and clean them up on EOF, error, stream drop, explicit
    close, or connection shutdown.
    - Reject platform-sandboxed streaming opens instead of turning the
    one-shot sandbox helper into a persistent server. Existing `fs/readFile`
    behavior is unchanged.
    
    ## Testing
    
    - `just test -p codex-exec-server`
    - Integration coverage for 1 MiB chunking, exact block-boundary EOF,
    sandbox rejection, and continued reads from the opened file after path
    replacement.
    - Handle-manager coverage for non-sequential offsets, variable block
    lengths, the 128-handle limit, and capacity release after close.
  • path-uri: clarify invalid host path errors (#28473)
    ## Why
    
    Ensure a consistent string format when exposing path conversion errors
    to the model.
    
    ## What
    
    - Render `PathUriParseError::InvalidFileUriPath` as `'$PATH' is invalid
    on '$OS'`.
  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
  • exec-server: default remote transport to Noise (#26245)
    ## Why
    
    The transport in
    [openai/codex#26242](https://github.com/openai/codex/pull/26242) needs
    to be used by every remote orchestrator-to-executor connection before
    JSON-RPC traffic starts.
    
    ## Changes
    
    - Generates one executor Noise identity when remote exec-server starts
    and registers its public key.
    - Creates a harness identity for each physical remote environment
    connection.
    - Fetches a fresh registry bundle before connecting and validates the
    authenticated harness key before completing the executor handshake.
    - Multiplexes encrypted logical streams over the existing executor
    WebSocket.
    - Adds bounded stream, handshake-failure, and reassembly state.
    - Adds safe lifecycle diagnostics without logging keys, authorizations,
    plaintext, or ciphertext.
    - Covers reconnects, replay rejection, validation failure, framing
    limits, and encrypted JSON-RPC tool traffic.
    
    ## Stack
    
    1. [openai/codex#26242](https://github.com/openai/codex/pull/26242):
    Noise channel and relay transport
    2. **[openai/codex#26245](https://github.com/openai/codex/pull/26245)**:
    remote registration and runtime activation
    
    ## Verification
    
    - `just test -p codex-exec-server`
    - `just fix -p codex-exec-server`
    - `just bazel-lock-check`
    - `cargo shear`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Run core integration tests against a Wine-backed Windows executor (#28401)
    ## Why
    
    We want to exercise a linux app-server against a windows exec-server
    without having to repeat every test case. This approach has slight
    precedent in the remote docker test setup.
    
    ## What
    
    Run the shared `codex-core` integration suite against Windows
    exec-server behavior from Linux. This makes cross-OS path and shell
    regressions visible while keeping unsupported cases owned by individual
    tests.
    
    - Add `local`, `docker`, and `wine-exec` test environment selection with
    legacy Docker compatibility.
    - Extend `codex_rust_crate` to generate a sharded Wine-exec variant
    using a cross-built Windows server and pinned Bazel Wine/PowerShell
    runtimes.
    - Teach remote-aware helpers about Windows paths and track temporary
    incompatibilities with source-local `skip_if_wine_exec!` calls and
    follow-up reasons.
  • Use PathUri in filesystem permission paths for exec-server (#28165)
    ## Why
    
    Progress towards letting app-server and exec-server run on different
    platforms, specifically for sandbox configuration.
    
    ## What
    
    - Make the filesystem path containment hierarchy generic, defaulting to
    `AbsolutePathBuf` for now.
    - Have clients specify `AbsolutePathBuf` or `PathUri` directly where
    needed.
    - Use `PathUri` throughout exec-server filesystem protocol and trait
    boundaries.
    - Implement `From` for conversion to path URIs and `TryFrom` for
    fallible conversion to absolute paths through the generic type
    hierarchy.
  • exec-server: add Noise relay transport (#26242)
    ## Why
    
    Rendezvous forwards traffic between the orchestrator and exec-server.
    The endpoints need to authenticate each other and encrypt that traffic
    without trusting Rendezvous with plaintext or endpoint keys.
    
    ## Changes
    
    - Adds a hybrid Noise IK channel through Clatter using X25519,
    ML-KEM-768, AES-256-GCM, and SHA-256.
    - Binds each handshake to `environment_id`, `executor_registration_id`,
    and `stream_id`.
    - Pins the registry-provided executor key and carries the harness
    authorization inside the encrypted handshake.
    - Orders relay frames before consuming Noise nonces and fragments large
    JSON-RPC messages into bounded records.
    - Bounds handshake payloads, frames, streams, and message reassembly.
    
    Runtime activation is in
    [openai/codex#26245](https://github.com/openai/codex/pull/26245).
    
    ## Stack
    
    1. **[openai/codex#26242](https://github.com/openai/codex/pull/26242)**:
    Noise channel and relay transport
    2. [openai/codex#26245](https://github.com/openai/codex/pull/26245):
    remote registration and runtime activation
    
    ## Verification
    
    - `just test -p codex-exec-server`
    - Oversized initiator payload regression coverage
    - `just fix -p codex-exec-server`
    - `just bazel-lock-check`
    - `cargo shear`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: restore exec-server relay keepalives (#28286)
    ## Why
    
    The ws pump refactor removed the relay keepalive timers that had been
    added to keep idle rendezvous connections alive. An idle relay could
    therefore be closed by the rendezvous service or a load balancer,
    disconnecting executor-backed MCP processes.
    
    ## What
    
    - restore periodic WebSocket ping frames on both rendezvous relay
    endpoints
    - keep missed-tick behavior bounded with `MissedTickBehavior::Skip`
    - cover the harness and remote-environment pumps with focused
    traffic-after-keepalive tests
  • [codex] exec-server honors remote environment cwd and shell (#28122)
    ## Why
    
    Next slice needed to make progress on the `remote_env_windows` test is
    to support passing a Windows cwd for the remote environment and using
    that environment's native shell. This lets the test run a real Windows
    process instead of only recording an early path or shell mismatch.
    
    ## What
    
    - change `TurnEnvironmentSelection.cwd` from `AbsolutePathBuf` to
    `PathUri`
    - convert local cwd values to URIs when constructing selections
    - preserve a remote primary cwd instead of replacing it with the local
    legacy fallback
    - prefer the selected environment's discovered shell for unified exec,
    falling back to the session shell when unavailable
    - convert back to a host-native absolute path at current native-only
    consumer boundaries
    - reject or deny unsupported foreign cwd values at the existing
    request-permissions boundary, with TODOs for its future migration
    - extend the hermetic Wine test to execute Windows PowerShell in
    `C:\windows` and verify successful process completion
    - record the current app-server rejection against the same Wine-backed
    remote Windows fixture when its cwd is supplied as a native Windows path
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • [codex] Carry exec-server cwd as PathUri (#28032)
    ## Why
    
    This is the second-to-last place in the exec-server protocol that needs
    to migrate to URIs to support cross-OS operation.
    
    ## What
    
    - Change `ExecParams.cwd` to `PathUri`.
    - Keep the cwd URI-shaped through core and rmcp producers, converting it
    to `AbsolutePathBuf` only in `LocalProcess::start_process`.
    - Reject non-native cwd URIs before launch and update the affected
    protocol documentation and call sites.
  • [codex] Add hermetic Wine exec-server test (#27937)
    ## Why
    
    We want to make it possible for an app-server orchestrator on one OS to
    control an exec-server on another host running a different OS. In
    practice this kinda already works if you get lucky and the two hosts
    have the same path format, but we mangle quite a lot of operations if
    either end is Windows.
    
    This test starts exercising that interaction, although right now the
    initial bootstrap fails. Future changes will expand the test's
    assertions to match improved support.
    
    ## What
    
    Stacked on #27964. This adds a small Windows exec-server fixture and a
    Linux protocol smoke test using the reusable Wine harness, covering
    Windows environment discovery, non-TTY `cmd.exe` execution, output, exit
    status, and working directory.
    
    Once we've got the full codex binary cross-building under Bazel we could
    consider moving to the real binary instead of the stripped down
    exec-server-only binary used here.
  • [codex] make PathUri::from_abs_path infallible (#27976)
    ## Why
    
    `PathUri::from_abs_path` can fail for absolute paths that do not have a
    normal `file:` URI representation, forcing filesystem call sites to
    handle a conversion error even though the original path can be preserved
    losslessly.
    
    ## What
    
    Make `from_abs_path` infallible and migrate its callers. Unrepresentable
    paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or
    Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The
    leading encoded null reserves a namespace that cannot collide with a
    real Unix or Windows path, and fallback URIs remain opaque to lexical
    path operations.
    
    ## Validation
    
    Added path-URI coverage for Unix null and non-UTF-8 paths, Windows
    device/verbatim and non-Unicode paths, serialization, malformed
    fallbacks, opaque lexical operations, invalid native payloads, and
    literal `/bad/path` collision resistance.
  • [codex] Add size to internal filesystem metadata (#27927)
    ## Why
    
    `ExecutorFileSystem::get_metadata` reports file kind and timestamps but
    not size. Internal callers that need to enforce a size limit therefore
    have to read the complete file first, which is especially wasteful for
    remote filesystems.
    
    This adds the missing internal metadata so consumers can reject
    oversized files before transferring or buffering them. The field is
    named `size`, matching VS Code's `FileStat.size` filesystem convention.
    
    ## What changed
    
    - add `size: u64` to internal `FileMetadata`
    - populate it from the underlying filesystem metadata
    - carry it through sandbox-helper and remote exec-server responses
    - cover files, directories, symlink targets, and sandboxed reads across
    local and remote filesystem implementations
    
    The new field is intentionally not exposed through the app-server API.
    
    ## Testing
    
    - `just test -p codex-exec-server get_metadata`
    - `just test -p codex-exec-server
    file_system_sandboxed_metadata_and_read_allow_readable_root`
    - `just test -p codex-core-plugins`
    - `just test -p codex-skills-extension`
  • sandboxing: migrate cwd inputs to PathUri (#27816)
    ## Why
    
    Sandbox cwd values can cross app-server and exec-server host boundaries.
    They should retain URI semantics until the receiving host validates them
    instead of being interpreted early as native paths.
    
    ## What
    
    - Carry `PathUri` through filesystem sandbox contexts, sandbox commands,
    and transform inputs.
    - Convert command and policy cwd once in `SandboxManager::transform`,
    then keep launch requests native.
    - Preserve sandbox cwd over remote filesystem transport and reject
    non-native URIs without fallback.
    - Cache paired native/URI turn-environment cwd values during migration,
    with immutable access to keep them synchronized.
    - Extend existing protocol, forwarding, transform, and core runtime
    tests.
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • Remove fs/join and fs/parent from exec-server protocol (#27700)
    ## Summary
    
    Path composition is already handled by `PathUri`, leaving `fs/join` and
    `fs/parent` as redundant exec-server protocol surface. Because
    app-server and exec-server are deployed atomically, these obsolete
    methods can be removed without a compatibility shim.
    
    This removes the protocol constants and payloads, public client APIs,
    server registrations and handlers, and endpoint-only tests. Existing
    in-process `PathUri` join/parent coverage remains.
    
    ## Validation
    
    - `just test -p codex-exec-server` (215 passed, 2 skipped)
  • [codex] migrate exec-server filesystem protocol to PathUri (#27653)
    Exec-server filesystem calls should preserve cross-platform `file:` URIs
    across the remote boundary instead of converting them through paths
    native to the client host.
    
    This changes the exec-server filesystem protocol DTOs to use `PathUri`,
    carries those values directly through remote and sandbox-helper
    transports, and keeps legacy native absolute-path request strings
    readable for compatibility. It also updates protocol documentation and
    coverage for URI serialization and non-native URI forwarding.
  • [codex] migrate ExecutorFileSystem paths to PathUri (#27424)
    ## Why
    
    We're moving exec-server to use PathUri for its internal path
    representations.
    
    ## What
    
    Move `ExecutorFileSystem` APIs to use `PathUri` instead of
    `AbsolutePathBuf`. Future changes will convert higher-level parts of
    exec-server.
  • [codex] remove EnvironmentPathRef (#27433)
    We're switching to using a static encoding of the host path in
    `PathUri`. We may need a type like this again but we can add it when
    it's more compelling.
    
    Stacked on #27454.
  • [codex] add cross-platform filesystem adapter coverage (#27454)
    ## Why
    
    The exec-server's existing filesystem tests only run on `#[cfg(unix)]`.
    We should be running the applicable ones on Windows, and also include
    the basic filesystem operations that will be modified by migrating to
    `PathUri`.
    
    ## What
    
    Split platform-neutral local/remote tests into a shared Unix/Windows
    suite while keeping the existing `AbsolutePathBuf` API, and add Windows
    junction canonicalization coverage.
  • [codex] Handle Ctrl-C for non-TTY unified exec (#26734)
    ## Why
    
    A long-running unified exec process started with `tty: false` could not
    be interrupted via `write_stdin`: ordinary non-TTY stdin writes are
    rejected once stdin is closed, but an exact U+0003 payload should still
    map to a process interrupt. The interrupt should flow through the same
    process lifecycle path as a real signal so Codex preserves
    process-reported output and exit metadata instead of fabricating a
    Ctrl-C exit code or tearing down the session early.
    
    ## What Changed
    
    - Add `process/signal` to exec-server with `ProcessSignal::Interrupt`
    and an empty response.
    - Add a non-consuming `ProcessHandle::signal` path for spawned
    processes; on Unix it sends SIGINT to the process group and leaves
    terminate/hard-kill unchanged.
    - Route non-TTY U+0003 `write_stdin` through `process.signal(...)`
    instead of `terminate`, then let the normal post-write collection path
    drain output and observe exit.
    - Add exec-server coverage where a shell `trap INT` handler prints the
    signal and exits with its own code.
    - Add unified exec coverage where a `tty: false` process traps SIGINT,
    emits output, and exits with its own code.
    
    ## Validation
    
    - `just test -p codex-exec-server
    exec_process_signal_interrupts_process`
    - `just test -p codex-exec-server`
    - `just test -p codex-core
    write_stdin_ctrl_c_interrupts_non_tty_session`
  • [codex] Add environment shell info (#26480)
    ## Why
    
    Shell detection needs to be available through the `Environment`
    abstraction so callers can ask the selected local or remote environment
    for shell metadata without adding a separate HTTP endpoint or parallel
    info-source path. This keeps shell metadata shaped like the existing
    environment-owned filesystem capability and lets remote environments
    answer through exec-server JSON-RPC.
    
    ## What changed
    
    - Added `environment/info` to the exec-server protocol/client/server and
    exposed `Environment::info()`.
    - Added local and remote environment info providers on `Environment`,
    following the existing capability-provider pattern used for filesystem
    access.
    - Moved the shared shell detection logic into `codex-shell-command` and
    kept core shell APIs as wrappers around that implementation.
    - Returned shell metadata as `EnvironmentInfo { shell: ShellInfo }`
    using the existing shell detection path.
    - Added a remote environment test that calls `Environment::info()`
    through an exec-server-backed environment.
    
    ## Validation
    
    - `git diff --check`
    - `just test -p codex-shell-command`
    - `just test -p codex-core -E 'test(/shell::tests::/)'`\n- `just test -p
    codex-exec-server environment`
  • exec-server: canonicalize bound filesystem paths (#25149)
    ## Summary
    - add executor filesystem canonicalization as a bound-path operation
    - route remote canonicalization through the exec-server filesystem RPC
    surface
    - keep path normalization attached to the filesystem that owns the path
    
    ## Stack
    - 2/5 in the skills path authority stack extracted from
    https://github.com/openai/codex/pull/25098
    - follows merged https://github.com/openai/codex/pull/25121
    
    ## Validation
    - `cd
    /Users/starr/code/codex-worktrees/pr-25098-restack-review-pr1b/codex-rs
    && just fmt`
    - Not run: tests/checks (not requested)
    - GitHub CI pending on rewritten head
  • exec-server: add environment path refs (#25121)
    ## Summary
    - add public `codex_exec_server::EnvironmentPathRef`
    - bind an absolute path to its owning executor filesystem
    - keep path operations in the next review slice
    
    ## Stack
    - 1/5 in the skills path authority stack extracted from
    https://github.com/openai/codex/pull/25098
    
    ## Validation
    - `cd /Users/starr/code/codex-worktrees/pr-25098-restack4/codex-rs &&
    just fmt`
    - GitHub CI pending on rewritten head
  • exec-server: preserve fs helper CoreFoundation env (#25118)
    ## Summary
    - preserve macOS `__CF_USER_TEXT_ENCODING` when launching the sandboxed
    fs helper
    - keep the fs-helper env narrow; this adds only the CoreFoundation
    startup var instead of copying the broader MCP stdio baseline
    - add focused coverage that the helper keeps that var without admitting
    `HOME`
    
    ## Diagnosis
    The sandboxed fs helper is not launched like a normal child process.
    Exec-server rebuilds its environment from an allowlist, then calls
    `env_clear()` before re-execing Codex with `--codex-run-as-fs-helper`.
    That helper dispatches before the normal Codex startup path and only
    needs to boot a small Tokio runtime, read one JSON request from stdin,
    perform the direct filesystem operation, and write one JSON response.
    
    The reported macOS hang sampled the helper before Rust main, in
    CoreFoundation initialization while resolving the default text encoding:
    `_CFStringGetUserDefaultEncoding -> getpwuid_r -> notify_register_check
    -> bootstrap_look_up3 -> mach_msg2_trap`. The fs-helper allowlist kept
    `PATH` and temp vars for runtime needs, but it dropped macOS
    `__CF_USER_TEXT_ENCODING`. Other Codex subprocess launchers that
    intentionally build a minimal Unix baseline, such as MCP stdio, already
    preserve that variable.
    
    My read is that stripping `__CF_USER_TEXT_ENCODING` forced this internal
    helper down CoreFoundation's fallback user-lookup path, and that lookup
    intermittently wedged on the affected machine before the helper could
    read stdin or touch the target file. Preserving only this macOS startup
    variable avoids that fallback without broadening the fs-helper
    environment to shell-like vars such as `HOME`, `USER`, locale settings,
    terminal settings, or proxy credentials.
    
    Internal Slack thread omitted from the public PR body.
    
    ## Validation
    - `cd codex-rs && just fmt`
    - `git diff --check`
  • [exec-server] Kill dropped filesystem helpers (#25116)
    ## Summary
    - terminate sandbox filesystem helpers when the Tokio child handle is
    dropped
    
    ## Why
    A sandbox filesystem helper can stall during process startup before
    reading stdin. If the owning async operation is cancelled or torn down,
    the spawned helper should not remain running as an orphaned process.
    
    Setting `kill_on_drop(true)` gives the filesystem helper the cleanup
    behavior that Tokio child processes otherwise do not enable by default.
    
    This intentionally does not add a timeout. It does not detect or recover
    an active hung file edit while the owning future remains alive. A more
    precise startup-health mechanism can be handled separately.
    
    ## Validation
    - `just test -p codex-exec-server` (186 tests passed; benchmark smoke
    passed)
    - `just fmt`
    - `just fix -p codex-exec-server`
    - `git diff --check`
  • feat: Add focused diagnostics for MCP HTTP send failures (#25013)
    Adds failure-only logging for MCP streamable HTTP post_message calls and
    the underlying reqwest send path, capturing the MCP method/request id,
    endpoint shape, auth-header presence, timeout/connect classification,
    and sanitized error source chain without logging headers, bodies,
    tokens, or full URLs.
  • fix(exec-server): reject websocket requests with Origin headers (#24947)
    ## Why
    
    `codex exec-server` has a local WebSocket listener, but it did not apply
    the same browser-origin request handling as the `app-server` WebSocket
    transport. Requests that carry an `Origin` header should not be upgraded
    by this local transport, keeping both local WebSocket servers consistent
    and avoiding unexpected browser-initiated connections.
    
    ## What changed
    
    - Added an Axum middleware guard in
    `codex-rs/exec-server/src/server/transport.rs` that returns `403
    Forbidden` for requests carrying an `Origin` header.
    - Added an integration test in `codex-rs/exec-server/tests/websocket.rs`
    that covers rejection of an `Origin`-bearing WebSocket handshake.
    - Kept ordinary WebSocket clients unchanged: existing no-`Origin`
    initialization and process behavior remains covered by the crate tests.
    
    ## Validation
    
    - `just test -p codex-exec-server` test phase (`186 passed`; run outside
    the parent macOS sandbox so nested sandbox tests can execute)
    - `just clippy -p codex-exec-server`
  • Allow API-key auth for remote exec-server registration (#24666)
    ## Overview
    Allow remote `codex exec-server` registration to use existing API-key
    auth while restricting where those credentials can be sent.
    
    - Accept `CodexAuth::ApiKey` for the normal `--remote` registration
    path.
    - Restrict API-key remote registration to HTTPS `openai.com` and
    `openai.org` hosts and subdomains, with explicit HTTP loopback support
    for local development.
    - Disable registry registration redirects so credentials cannot be
    forwarded to an unvalidated destination.
    - Retain `--use-agent-identity-auth` as the explicit Agent Identity
    path.
    - Document remote registration using `CODEX_API_KEY`.
    
    ## Big picture
    Callers can now provide an API key directly to `exec-server`
    registration without first establishing ChatGPT login state:
    
    ```sh
    CODEX_API_KEY="$OPENAI_API_KEY" \
    codex exec-server \
      --remote "https://<host>.openai.org/api" \
      --environment-id "$ENVIRONMENT_ID"
    ```
    
    ## Validation
    - `cargo fmt --all` (`just fmt` is not installed on this host)
    - `cargo test -p codex-cli -p codex-exec-server`
  • Uprev Rust toolchain pins to 1.95.0 (#24684)
    ## Summary
    - Bump the workspace Rust toolchain from `1.93.0` to `1.95.0` across
    Cargo, Bazel, CI, release workflows, devcontainers, and the Codex
    environment config.
    - Refresh `MODULE.bazel.lock` so the Bazel Rust toolchain artifacts
    match the new version.
    - Leave purpose-specific toolchains unchanged, including the
    `argument-comment-lint` nightly and the upstream `rusty_v8` `1.91.0`
    build pin.
    - Includes fixes for new lints from `just fix` and a few codex-authored
    fixes for lints without a suggestion.
  • Reconnect disconnected exec-server websocket clients with fresh sessions (#23867)
    ## Summary
    - replace the one-shot lazy remote exec-server cache with a
    lock-protected current client
    - when the cached websocket client is already disconnected, create one
    fresh websocket client/session on the next `get()`
    - keep existing disconnect failure behavior for old process sessions and
    HTTP body streams; do not add session resume or request retry
    
    ## Why
    The prior PR direction was trying to grow into session restore: resume
    the old `session_id`, preserve existing process handles, and add
    reconnect retry policy. That is more machinery than we want for this
    slice.
    
    For now, the useful minimum is simpler: later fresh remote operations
    should not be stuck behind a dead cached websocket client, but anything
    already attached to the dead connection should fail loudly through the
    existing disconnect path. The server already has detached-session
    cleanup via its existing TTL, so this PR does not need to add
    client-side session preservation.
    
    ## What Changed
    - `LazyRemoteExecServerClient::get()` now keeps the current concrete
    client in a small mutex-protected cache plus one async connect lock.
    - If that cached client is still connected, `get()` returns it.
    - If that cached websocket client has observed the transport close,
    `get()` creates a brand-new websocket client with a brand-new
    exec-server session and replaces the cache.
    - If that cached client is stdio-backed, behavior stays one-shot: the
    dead client is returned and later work surfaces the existing disconnect
    error.
    - No `resume_session_id`, backoff, request replay, or existing
    `RemoteExecProcess` rebinding is added here.
    - Added focused websocket coverage that proves two concurrent `get()`
    calls after disconnect share one fresh replacement client/session.
  • Migrate exec-server remote registration to environments (#23633)
    ## Summary
    - migrate exec-server remote registration naming from executor to
    environment
    - align CLI, public Rust exports, registry error messages, and relay
    test fixtures with the environment registry contract
    - keep the live registration path and response model consistent with
    `/cloud/environment/{environment_id}/register`
    
    ## Verification
    - `cargo test -p codex-exec-server
    remote::tests::register_environment_posts_with_auth_provider_headers
    --manifest-path /Users/richardlee/code/codex/codex-rs/Cargo.toml`
    - `cargo test -p codex-exec-server --test relay
    multiplexed_remote_environment_routes_independent_virtual_streams
    --manifest-path /Users/richardlee/code/codex/codex-rs/Cargo.toml`
    - `cargo check -p codex-cli --manifest-path
    /Users/richardlee/code/codex/codex-rs/Cargo.toml` (still running when PR
    opened; will update after completion if needed)
  • Refactor exec-server websocket pump (#23327)
    ## Why
    Exec-server websocket handling had separate reader and writer tasks for
    the same socket. That made websocket control-frame handling asymmetric:
    the task reading frames could observe `Ping`, but the task allowed to
    write frames was elsewhere. This PR moves each physical websocket onto
    one always-running pump so the socket owner can handle application
    frames and websocket control frames together.
    
    ## What changed
    - Refactored direct exec-server websocket connections in `connection.rs`
    to use one task that owns the websocket for outbound JSON-RPC, inbound
    JSON-RPC, periodic keepalive pings, and `Ping` -> `Pong` replies.
    - Refactored relay websocket handling in `relay.rs` the same way for
    both the harness-side logical connection and the multiplexed executor
    physical socket.
    - Preserved the existing keepalive ownership policy: outbound direct
    websocket clients still send periodic pings, inbound Axum accepts only
    reply with pongs, and relay physical websocket endpoints keep their
    existing periodic pings.
    - Added focused websocket pump tests for ping/pong, binary JSON-RPC,
    relay data, malformed relay text frames, and close/disconnect behavior.
    - Reconnect behavior is intentionally left for a follow-up.
    
    ## Validation
    - Devbox Bazel focused unit target:
    - `//codex-rs/exec-server:exec-server-unit-tests
    --test_filter='websocket_connection_|harness_connection_|multiplexed_executor_'`
  • Make local environment optional in EnvironmentManager (#23369)
    ## Summary
    - make `EnvironmentManager` local environment/runtime paths optional
    - simplify constructor surface around snapshot materialization
    - rename local env accessors to `require_local_environment` /
    `try_local_environment`
    
    ## Validation
    - devbox Bazel build for touched crate surfaces
    - `//codex-rs/exec-server:exec-server-unit-tests`
    - `//codex-rs/app-server-client:app-server-client-unit-tests`
    - filtered touched `//codex-rs/core:core-unit-tests` cases
  • Add exec-server websocket keepalive (#23226)
    ## Summary
    - send periodic websocket Ping frames from outbound exec-server
    websocket clients
    - cover direct exec-server websocket clients plus rendezvous
    harness/executor websocket connections
    - keep inbound axum-accepted exec-server websocket connections passive
    - add focused keepalive coverage for direct and relay websocket paths
    
    ## Validation
    - /Users/starr/code/openai/project/dotslash-gen/bin/bazel test
    //codex-rs/exec-server:exec-server-unit-tests
    --test_filter='websocket_connection_sends_keepalive_ping|harness_connection_sends_keepalive_ping|multiplexed_executor_sends_keepalive_ping'
    - /Users/starr/code/openai/project/dotslash-gen/bin/bazel test
    //codex-rs/exec-server:exec-server-relay-test
    --test_filter=multiplexed_remote_executor_routes_independent_virtual_streams
  • exec-server: support auth-backed remote executor registration (#22769)
    This updates remote `exec-server` registration to use normal Codex auth
    instead of a registry-issued credential. The registry request is built
    from the existing auth-provider path, which preserves the biscuit-only
    registry contract introduced in
    [openai/openai#924101](https://github.com/openai/openai/pull/924101)
    while removing the old remote registry bearer env var and its direct
    transport assumptions.
    
    The default remote flow uses persisted ChatGPT auth from the normal
    Codex config/storage path. This PR also includes the containerized Agent
    Identity path needed by
    [openai/openai#924260](https://github.com/openai/openai/pull/924260):
    remote `exec-server` accepts `--allow-agent-identity-auth`, permits
    Agent Identity auth loaded from `CODEX_ACCESS_TOKEN` only when that flag
    is present, and reuses the existing Agent task registration plus derived
    `AgentAssertion` header generation. API-key auth remains unsupported,
    and Agent Identity stays opt-in.
    
    Validation performed beyond normal presubmit coverage:
    - `cargo fmt --all --check`
    - `cargo check -p codex-cli`
    - `cargo test -p codex-exec-server`
    - `cargo test -p codex-cli exec_server_agent_identity_auth_flag_`
    - `cargo test -p codex-cli remote_exec_server_auth_mode_`
    
    I also attempted `cargo test -p codex-cli`. The new CLI tests passed
    inside that run, but the suite ended on an unrelated local
    marketplace-state failure in
    `plugin_list_excludes_unconfigured_repo_local_marketplaces`.
  • Fix remote environment test fixtures (#22572)
    ## Why
    The Docker remote-env coverage was failing before it reached the
    behavior those tests are meant to exercise. The remote-aware test
    fixture only registered the remote environment, so tests that
    intentionally select both `local` and `remote` could not start a turn.
    After that was fixed, two tests exposed stale fixtures: the approval
    test was auto-approving under workspace-write, and the remote
    `view_image` test was writing invalid PNG bytes.
    
    ## What Changed
    - Added `EnvironmentManager::create_for_tests_with_local(...)` so tests
    can keep the provider default while also selecting `local` explicitly.
    - Updated `build_remote_aware()` to use that test-only manager when a
    remote exec-server URL is present.
    - Changed the remote apply-patch approval helper to use
    `SandboxPolicy::new_read_only_policy()` so the test actually exercises
    approval caching per environment.
    - Replaced the hardcoded remote `view_image` PNG blob with the existing
    `png_bytes(...)` helper so the test uses a valid image fixture.
    
    ## Validation
    Ran these isolated Docker remote-env tests on the devbox with
    `$remote-tests` setup:
    -
    `suite::remote_env::apply_patch_freeform_routes_to_selected_remote_environment`
    -
    `suite::remote_env::apply_patch_approvals_are_remembered_per_environment`
    -
    `suite::remote_env::apply_patch_intercepted_exec_command_routes_to_selected_remote_environment`
    -
    `suite::remote_env::exec_command_routes_to_selected_remote_environment`
    - `suite::view_image::view_image_routes_to_selected_remote_environment`
    
    All five pass.
  • feat(exec-server): use protobuf relay frames (#22343)
    ## Why
    
    Remote exec-server now needs one executor websocket to serve multiple
    harness JSON-RPC sessions. Rendezvous routes by `stream_id`, and the
    exec-server side needs to use the same stable relay frame contract
    instead of a hand-rolled JSON shape.
    
    The relay protocol also needs to make ownership boundaries clear:
    harness and executor endpoints own sequencing, acks, retries, duplicate
    suppression, segmentation, and reassembly; rendezvous only routes
    frames.
    
    ## What Changed
    
    - Add the checked-in `codex.exec_server.relay.v1.RelayMessageFrame`
    proto plus generated prost bindings for `codex-exec-server`.
    - Encode remote harness/executor relay traffic as binary protobuf
    websocket frames while keeping local websocket JSON-RPC unchanged.
    - Demux executor-side relay streams into independent
    `ConnectionProcessor` sessions keyed by `stream_id`.
    - Add a programmatic `RemoteExecutorConfig::with_bearer_token(...)`
    constructor for non-CLI callers and integration tests.
    - Add an integration test that starts the remote executor against a fake
    registry/rendezvous websocket and verifies two virtual streams share one
    executor websocket without cross-talk, including per-stream reset
    behavior.
    - Document the remote relay envelope, sequence ranges, `ack`/`ack_bits`,
    and endpoint responsibilities in `exec-server/README.md`.
    
    ## Verification
    
    - `cargo test -p codex-exec-server --test relay
    multiplexed_remote_executor_routes_independent_virtual_streams --
    --exact`
    - `cargo test -p codex-exec-server --test relay`
    - `cargo test -p codex-exec-server` passed outside the sandbox. The
    sandboxed run hit macOS `sandbox-exec: sandbox_apply: Operation not
    permitted` in filesystem sandbox tests.
  • [exec-server] serve websocket listener via HTTP upgrade (#21963)
    ## Why
    
    `codex exec-server` should keep the existing public `ws://IP:PORT` URL
    shape while serving that websocket connection through an HTTP upgrade
    path internally. That keeps the client-facing configuration simple and
    allows the listener to work through intermediate HTTP-aware
    infrastructure.
    
    ## What changed
    
    - keep the emitted and configured exec-server URL as `ws://IP:PORT`
    - serve that websocket endpoint through Axum HTTP upgrade handling on
    `/`
    - expose `GET /readyz` from the same listener for readiness checks
    - route upgraded Axum websocket streams through the shared JSON-RPC
    connection machinery
    - initialize the rustls crypto provider before websocket client
    connections
    - preserve inbound binary websocket JSON-RPC parsing for compatibility
    with the prior transport behavior
    
    ## Verification
    
    - `cargo test -p codex-exec-server --test health --test process --test
    websocket --test initialize --test exec_process`
  • fix(exec-server): suppress Windows taskkill output (#22058)
    ## Summary
    
    This is the `exec-server` follow-up to #21759.
    
    #21759 fixed the Windows `taskkill` output leak for the `rmcp-client`
    MCP teardown path, but #22050 showed that `exec-server` still had a
    parallel `taskkill /T /F` cleanup path in
    `exec-server/src/connection.rs`. Because that command inherited the
    parent stdio handles, Windows could still print `SUCCESS:` lines into
    the user's terminal during stdio child cleanup.
    
    This change silences that remaining `exec-server` callsite by
    redirecting `taskkill` stdin, stdout, and stderr to `Stdio::null()`.
    
    ## What Changed
    
    - add a Windows-only `Stdio` import in `exec-server/src/connection.rs`
    - redirect the `taskkill` command in `kill_windows_process_tree` to
    `Stdio::null()` for stdin, stdout, and stderr
    - keep the existing kill semantics unchanged by still checking
    `.status()` and preserving the existing fallback/logging behavior
    
    ## How to Test
    
    Manual validation is Windows-only, so I did not run the UI repro path
    locally here.
    
    1. On Windows, use a Codex build from this branch.
    2. Exercise an `exec-server` stdio flow that spawns a child process tree
    and then triggers transport cleanup.
    3. Confirm the child process tree is still torn down.
    4. Confirm the terminal no longer shows `SUCCESS: The process with PID
    ... has been terminated.` lines during cleanup.
    
    Targeted tests:
    - `cargo test -p codex-exec-server
    client::tests::dropping_stdio_client_terminates_spawned_process --
    --exact`
    - `cargo test -p codex-exec-server
    client::tests::malformed_stdio_message_terminates_spawned_process --
    --exact`
    
    Notes:
    - `cargo test -p codex-exec-server` still hits unrelated local macOS
    `sandbox-exec: sandbox_apply: Operation not permitted` failures in
    `tests/file_system.rs`.
    
    ## References
    
    - Fixes the remaining callsite discussed in #22050
    - Related earlier fix: #21759
  • tests: cover sandbox link write behavior (#21819)
    ## Why
    
    [PR #1705](https://github.com/openai/codex/pull/1705) moved
    `apply_patch` execution under the configured sandbox and called out the
    need for integration coverage. We already covered textual `../` escapes,
    but did not have coverage for link aliases that live inside a writable
    workspace while pointing at, or aliasing, files visible outside it.
    
    This PR locks in the current sandbox boundary without changing
    production write semantics. Symlink escapes into a read-only outside
    root should fail and leave the outside file unchanged. Existing hard
    links are characterized separately: if a user-created hard link already
    exists inside the writable root, sandboxed writes preserve normal
    hard-link semantics rather than replacing the link and silently breaking
    that relationship.
    
    ## What Changed
    
    - Added
    `apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
    to verify `apply_patch` cannot update a symlink that targets a file
    outside the writable workspace.
    - Added `apply_patch_cli_preserves_existing_hard_link_outside_workspace`
    to verify `apply_patch` intentionally writes through an existing hard
    link and does not unlink or replace it.
    - Added `file_system_sandboxed_write_preserves_existing_hard_link` to
    verify sandboxed `fs/writeFile` preserves an existing hard link and
    writes the shared inode.
    
    ## Testing
    
    - `cargo test -p codex-exec-server file_system_sandboxed_write`
    - `cargo test -p codex-core
    apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
    - `cargo test -p codex-core
    apply_patch_cli_preserves_existing_hard_link_outside_workspace`
    - `just fix -p codex-exec-server -p codex-core`
    - `just fix -p codex-core`
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21819).
    * #21845
    * __->__ #21819
  • Increase exec-server environment transport timeouts (#21825)
    ## Why
    
    The environment-backed exec-server transport currently hardcodes 5
    second connect and initialize timeouts in `client_transport.rs`. That is
    short for SSH-backed stdio environments and remote websocket
    environments, and there is currently no way to raise those values from
    `CODEX_HOME/environments.toml`.
    
    This stacked follow-up raises the default environment transport timeouts
    and lets each configured environment override them in
    `environments.toml`.
    
    ## What Changed
    
    - raise the default environment transport connect and initialize
    timeouts from 5s to 10s
    - store concrete timeout values on `ExecServerTransportParams` instead
    of hardcoding them in `connect_for_transport(...)`
    - add `connect_timeout_sec` and `initialize_timeout_sec` to
    `[[environments]]` entries in `environments.toml`
    - apply parse-time defaults so runtime transport code receives fully
    resolved timeout values
    - reject `connect_timeout_sec` on stdio environments because it only
    applies to websocket transports
    - extend parser tests to cover the new fields and defaults
    
    ## Stack
    
    - base: https://github.com/openai/codex/pull/21794
    - this PR: configurable environment transport timeouts
    
    ## Validation
    
    - `cd
    /Users/starr/code/codex-worktrees/exec-env-timeouts-config-20260508/codex-rs
    && just fmt`
    - not run: tests
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] support executor registry remote environments (#21323)
    ## Summary
    
    Support registry-backed remote executors end to end so downstream
    services can resolve an executor id into an exec-server URL and make
    that environment available to Codex without relying on the legacy cloud
    environments flow.
    
    ## What changed
    
    - switch remote executor registration to the executor registry bootstrap
    contract
    - allow named remote environments to be inserted into
    `EnvironmentManager` at runtime
    - add the experimental app-server RPC `environment/add` so initialized
    experimental clients can register those remote environments for later
    `thread/start` and `turn/start` selection
    
    ## Validation
    
    Ran focused validation locally:
    - `cargo test -p codex-exec-server environment_manager_`
    - `cargo test -p codex-exec-server
    register_executor_posts_with_bearer_token_header`
    - `cargo test -p codex-app-server-protocol`