15 Commits

  • Test selected capabilities across availability and resume (#30157)
    ## Why
    
    This stack crosses World State, executor skills, selected plugin
    metadata, MCP processes, connectors, dynamic environments, and resume.
    This PR adds two end-to-end scenarios that validate those pieces
    together.
    
    Both tests enable `deferred_executor`, so they exercise the real
    delayed-environment path.
    
    ## Scenario 1: availability across turns and resume
    
    ```text
    1. Start a thread with one selected plugin root bound to E1.
    2. E1 is unavailable.
       - executor skill is absent
       - selected MCP is absent
       - connector has no selected-plugin attribution
    3. Start E1 and register the same stable environment ID.
    4. Start a new turn.
       - the executor skill appears through World State
       - its body beats a colliding host skill
       - the selected MCP tool is advertised and executes inside E1
       - the connector is attributed to the selected plugin
    5. Start another turn without changing E1.
       - the MCP PID stays the same, proving runtime reuse
    6. Restart app-server and resume the thread.
       - durable selected-root intent is restored
       - skills, MCP, and connector attribution are restored
       - a new MCP PID proves ephemeral process state was rebuilt
    ```
    
    ## Scenario 2: availability changes inside one turn
    
    ```text
    1. Start a turn while E1 is unavailable.
    2. The first model sample sees no executor skill, MCP, or selected connector.
    3. The turn pauses on request_user_input.
    4. Start E1 and register it while that same turn is still active.
    5. Continue the turn.
    6. The very next model sample sees:
       - the executor skill catalog
       - the selected MCP tool
       - selected-plugin connector attribution
    7. The model calls the MCP, and its output proves execution happened inside E1.
    ```
    
    This second scenario specifically protects the aeon-style behavior:
    capability state is captured again for every sampling step, not only at
    the next user turn.
    
    ## Scope
    
    These are integration tests only. They do not add a combinatorial matrix
    for unsupported plugin-file mutation, environment generations, transport
    disconnects, or delayed `required = true` executor MCPs.
  • Pipeline bounded AGENTS.md and Git root probes (#29870)
    ## Why
    
    When Codex uses a remote `ExecutorFileSystem`, every `get_metadata` call
    is an exec-server round trip. Upward discovery currently pays those
    round trips serially in two latency-sensitive places:
    
    - session startup, while locating the configured project root before
    loading `AGENTS.md`; and
    - Git-root discovery, which runs before per-turn Git diff enrichment.
    
    The goal is to remove the serial ancestor dependency without adding a
    new filesystem RPC, JSON-RPC batch method, Git executable dependency, or
    cache.
    
    ## Example
    
    Assume this layout, with `.git` as the configured project-root marker:
    
    ```text
    /workspace/repo/.git
    /workspace/repo/AGENTS.md
    /workspace/repo/crates/core/    <- cwd
    ```
    
    The marker probes have this required precedence:
    
    ```text
    1. /workspace/repo/crates/core/.git
    2. /workspace/repo/crates/.git
    3. /workspace/repo/.git
    4. /workspace/.git
    5. /.git
    ```
    
    Previously, probe 2 was not sent until probe 1 returned, and probe 3 was
    not sent until probe 2 returned. With this change, the client lazily
    keeps up to eight ordinary `fs/getMetadata` requests in flight, but
    consumes their results in the order above. Codex must still learn that
    probes 1 and 2 are absent before accepting probe 3, so the nearest root
    always wins. Once probe 3 succeeds, the client has its answer and stops
    awaiting probes 4 and 5. Requests that were already sent may still
    finish on the worker.
    
    For the marker phase alone, with a 50 ms client-to-worker round trip and
    fast local metadata calls, finding the root at probe 3 changes from
    roughly three serialized round trips (150 ms) to one round trip plus
    worker processing. The later `AGENTS.md` candidate phase remains
    separate and ordered.
    
    Only after `/workspace/repo` is selected does `AGENTS.md` discovery
    check instruction candidates, in root-to-cwd order:
    
    ```text
    /workspace/repo/AGENTS.override.md
    /workspace/repo/AGENTS.md
    /workspace/repo/crates/AGENTS.override.md
    /workspace/repo/crates/AGENTS.md
    /workspace/repo/crates/core/AGENTS.override.md
    /workspace/repo/crates/core/AGENTS.md
    ```
    
    The first configured candidate found in each directory wins. These
    checks remain ordered and no instruction candidate above
    `/workspace/repo` is issued. Git-root discovery uses the same bounded
    lookup with only `.git` as the marker.
    
    ## What changed
    
    - Added a client-side find-up helper that generates `ancestor x marker`
    probes lazily, nearest directory first and configured marker order
    within each directory.
    - Uses an ordered concurrency window of eight scalar metadata requests.
    This bounds executor load while preserving nearest-root and marker
    precedence.
    - Reuses the helper for both configured project-root discovery and
    remote Git-root discovery.
    - Keeps Git ancestor and marker construction in `AbsolutePathBuf`,
    converting only each complete `.git` probe to `PathUri`. This preserves
    native paths that require an opaque URI fallback, such as Windows
    namespace paths.
    - Preserves existing error behavior: `AGENTS.md` discovery propagates
    non-`NotFound` metadata errors, while Git discovery treats a failed
    marker probe as absent and continues upward.
    - Reads each discovered `AGENTS.md` directly instead of statting it a
    second time.
    
    No filesystem trait or exec-server protocol method is added. An empty
    `project_root_markers` list performs no ancestor-marker I/O and checks
    instruction candidates only in `cwd`. This change also deliberately does
    not cache roots across turns.
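    
    The ordered concurrency window can be sketched with plain threads. This is a simplified illustration under stated assumptions: `find_up` and the `probe` callback are hypothetical stand-ins for the real helper and for `fs/getMetadata`, and OS threads replace the async requests the client actually issues.
    
    ```rust
    use std::collections::VecDeque;
    use std::path::{Path, PathBuf};
    use std::sync::Arc;
    use std::thread::{self, JoinHandle};
    
    /// Hypothetical helper: returns the nearest ancestor of `cwd` containing
    /// one of `markers`, keeping at most `window` probes in flight.
    fn find_up<F>(cwd: &Path, markers: &[&str], window: usize, probe: F) -> Option<PathBuf>
    where
        F: Fn(&Path) -> bool + Send + Sync + 'static,
    {
        let probe = Arc::new(probe);
        // Lazy `ancestor x marker` candidates: nearest directory first, then
        // configured marker order within each directory.
        let mut candidates = cwd
            .ancestors()
            .flat_map(|dir| markers.iter().map(move |marker| (dir, dir.join(marker))));
        let mut in_flight: VecDeque<(&Path, JoinHandle<bool>)> = VecDeque::new();
        loop {
            // Top up the window without waiting for earlier probes to finish.
            while in_flight.len() < window {
                let Some((dir, candidate)) = candidates.next() else { break };
                let probe = Arc::clone(&probe);
                let handle = thread::spawn(move || (*probe)(candidate.as_path()));
                in_flight.push_back((dir, handle));
            }
            // Consume strictly in issue order so a nearer root always wins.
            // An empty window means every candidate was absent.
            let (dir, handle) = in_flight.pop_front()?;
            // A panicked probe counts as absent in this sketch.
            if handle.join().unwrap_or(false) {
                // Later probes are no longer awaited; they may still finish.
                return Some(dir.to_path_buf());
            }
        }
    }
    ```
    
    With an empty `markers` slice the candidate iterator is empty, so no probe is ever issued, mirroring the empty `project_root_markers` case.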
    
    ## Symlinks
    
    Upward traversal remains **lexical**. The helper does not canonicalize
    `cwd`; it appends marker names to the supplied path and walks that
    path's textual parents. The filesystem performs the actual metadata/read
    operation, and the current local and exec-server implementations follow
    live symlink targets.
    
    For example:
    
    ```text
    /tmp/pkg -> /workspace/repo/packages/pkg
    cwd = /tmp/pkg/src
    actual Git marker = /workspace/repo/.git
    ```
    
    The lexical probes are `/tmp/pkg/src/.git`, `/tmp/pkg/.git`,
    `/tmp/.git`, and `/.git`. They do not jump from `/tmp/pkg` to the
    target's parent `/workspace/repo`, so this spelling of `cwd` does not
    discover `/workspace/repo/.git`. That is the existing behavior and is
    unchanged by this PR.
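    
    A minimal sketch of the lexical walk, assuming a hypothetical `lexical_probes` function: `Path::ancestors` only strips trailing components from the supplied spelling and never consults the filesystem, which is why the symlink target's parent is never visited.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Hypothetical helper: marker probes for `cwd`, nearest directory first,
    /// derived purely from the textual path.
    fn lexical_probes(cwd: &Path, marker: &str) -> Vec<PathBuf> {
        cwd.ancestors().map(|dir| dir.join(marker)).collect()
    }
    ```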
    
    Conversely, if `/tmp/repo -> /workspace/repo`, then probing
    `/tmp/repo/.git` follows the directory symlink and finds
    `/workspace/repo/.git`; the reported root remains the lexical path
    `/tmp/repo`. A live symlink used directly as `.git`, another configured
    marker, or `AGENTS.md` is also followed. A symlinked `AGENTS.md` is
    loaded when its target is a regular file, while a broken symlink behaves
    as `NotFound`.
  • Follow directory symlinks in filesystem walks (#29844)
    Stack 3 of 3. Stacked on #29842.
    
    ## What changes
    
    Adds an opt-in `followDirectorySymlinks` setting to `fs/walk`.
    
    When enabled, the walk follows directory symlinks but continues to
    ignore symlinked files. Canonical directory identities prevent symlink
    cycles, while normal paths keep their existing spelling.
    
    Environment skill discovery enables the setting so symlinked skill
    directories continue to work with the new single-RPC scan.
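    
    The cycle-prevention idea can be sketched locally with `std::fs`. This is an illustration only: `walk_following_dir_symlinks` is a hypothetical function, it omits the walk limits, and it silently skips unreadable paths where the real operation reports per-path errors.
    
    ```rust
    use std::collections::HashSet;
    use std::fs;
    use std::path::{Path, PathBuf};
    
    /// Hypothetical walk: follows directory symlinks, ignores symlinked files,
    /// and reports paths with the spelling they were reached by.
    fn walk_following_dir_symlinks(root: &Path) -> Vec<PathBuf> {
        let mut files = Vec::new();
        // Canonical identities of directories already entered.
        let mut visited: HashSet<PathBuf> = HashSet::new();
        let mut stack = vec![root.to_path_buf()];
        while let Some(dir) = stack.pop() {
            // Canonicalize only for the identity check, never for reporting.
            let Ok(identity) = fs::canonicalize(&dir) else { continue };
            if !visited.insert(identity) {
                continue; // already walked under another spelling: a cycle or alias
            }
            let Ok(entries) = fs::read_dir(&dir) else { continue };
            for entry in entries.flatten() {
                let path = entry.path();
                // `DirEntry::file_type` does not follow symlinks.
                let Ok(file_type) = entry.file_type() else { continue };
                if file_type.is_symlink() {
                    // `fs::metadata` follows the link; descend only into directories.
                    if fs::metadata(&path).map(|m| m.is_dir()).unwrap_or(false) {
                        stack.push(path);
                    }
                } else if file_type.is_dir() {
                    stack.push(path);
                } else {
                    files.push(path);
                }
            }
        }
        files.sort();
        files
    }
    ```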
  • Add a bounded filesystem walk RPC (#29841)
    Stack 1 of 3. Follow-ups: #29842 and #29844.
    
    ## What changes
    
    Adds a general bounded `fs/walk` operation to the exec server.
    
    The operation returns file and directory entries plus recoverable
    per-path errors. It skips symlinks, preserves the existing filesystem
    sandbox routing, and enforces depth, directory, entry, and response-size
    limits.
    
    This PR only defines and wires the filesystem operation. It does not
    change any callers yet.
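    
    A local sketch of a bounded walk, assuming hypothetical `WalkLimits` and `WalkResult` types. It covers only depth and entry limits; the directory and response-size limits and the sandbox routing described above are left out.
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::{Path, PathBuf};
    
    /// Hypothetical limits; `max_depth` counts directory levels below the root.
    #[derive(Debug, Clone, Copy)]
    struct WalkLimits {
        max_depth: usize,
        max_entries: usize,
    }
    
    #[derive(Debug, Default)]
    struct WalkResult {
        files: Vec<PathBuf>,
        dirs: Vec<PathBuf>,
        /// Recoverable per-path errors; the walk continues past them.
        errors: Vec<(PathBuf, io::ErrorKind)>,
        truncated: bool,
    }
    
    fn bounded_walk(root: &Path, limits: WalkLimits) -> WalkResult {
        let mut out = WalkResult::default();
        let mut stack = vec![(root.to_path_buf(), 0usize)];
        while let Some((dir, depth)) = stack.pop() {
            let entries = match fs::read_dir(&dir) {
                Ok(entries) => entries,
                Err(err) => {
                    out.errors.push((dir, err.kind()));
                    continue;
                }
            };
            for entry in entries {
                let entry = match entry {
                    Ok(entry) => entry,
                    Err(err) => {
                        out.errors.push((dir.clone(), err.kind()));
                        continue;
                    }
                };
                let path = entry.path();
                let file_type = match entry.file_type() {
                    Ok(file_type) => file_type,
                    Err(err) => {
                        out.errors.push((path, err.kind()));
                        continue;
                    }
                };
                if file_type.is_symlink() {
                    continue; // symlinks are skipped entirely
                }
                if out.files.len() + out.dirs.len() >= limits.max_entries {
                    out.truncated = true;
                    return out;
                }
                if file_type.is_dir() {
                    if depth < limits.max_depth {
                        stack.push((path.clone(), depth + 1));
                    }
                    out.dirs.push(path);
                } else {
                    out.files.push(path);
                }
            }
        }
        out
    }
    ```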
  • Run fs helper through Windows sandbox wrapper (#28359)
    ## Why
    
    This is the final PR in the Windows fs-helper sandbox stack and contains
    the actual bug fix.
    
    The exec-server filesystem helper is a direct-spawn path: it asks
    `SandboxManager` for a `SandboxExecRequest`, then launches the returned
    argv itself. That works on macOS and Linux because the transformed argv
    is already a self-contained sandbox wrapper. On Windows, the transformed
    request carried `WindowsRestrictedToken` metadata, but the direct-spawn
    fs-helper runner still launched the helper argv directly.
    
    That means Windows filesystem built-ins backed by the fs-helper could
    run with the parent Codex process permissions instead of the configured
    Windows sandbox. This PR makes the direct-spawn transform produce a
    self-contained Windows wrapper argv before fs-helper launches it.
    
    ## What Changed
    
    - Added `SandboxManager::transform_for_direct_spawn()` for callers that
    launch the returned argv themselves.
    - Wrapped Windows restricted-token direct-spawn requests with `codex.exe
    --run-as-windows-sandbox` and then marked the outer request as
    unsandboxed, matching the macOS/Linux wrapper argv shape.
    - Updated `exec-server/src/fs_sandbox.rs` to use the direct-spawn
    transform for fs-helper launches.
    - Materialized the inner `codex.exe --codex-run-as-fs-helper` executable
    into `.sandbox-bin` so the sandboxed user can run it.
    - Carried runtime workspace roots through `FileSystemSandboxContext` as
    `PathUri` values so `:workspace_roots` policies resolve correctly
    without sending native client paths over exec-server JSON.
    - Preserved wrapper setup identity environment needed by Windows sandbox
    setup without changing the serialized inner helper environment.
    
    ## Verification
    
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just test -p codex-sandboxing transform_for_direct_spawn_windows`
    - `just test -p codex-exec-server fs_sandbox::tests`
    - `just fix -p codex-windows-sandbox -p codex-sandboxing -p
    codex-exec-server -p codex-core -p codex-file-system`
    
    Local note: `just fmt` completed Rust formatting, but this workstation
    still fails the non-Rust formatter phases because uv cannot open its
    cache and the local buildifier/dotslash path is missing.
  • [codex] exec-server: stream files in chunks (#28354)
    ## Why
    
    `fs/readFile` buffers the entire file in one response, which makes large
    remote reads expensive and prevents callers from applying backpressure.
    We need an opt-in streaming path with bounded block sizes while
    preserving the existing single-call API for small and sandboxed reads.
    
    ## What changed
    
    - Add `ExecServerClient::stream`, returning a named `FileReadStream`
    that implements `futures::Stream` and yields immutable 1 MiB byte
    blocks.
    - Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs.
    `fs/readBlock` accepts an explicit offset and length.
    - Keep unsandboxed files open between block reads, cap open handles per
    connection, and clean them up on EOF, error, stream drop, explicit
    close, or connection shutdown.
    - Reject platform-sandboxed streaming opens instead of turning the
    one-shot sandbox helper into a persistent server. Existing `fs/readFile`
    behavior is unchanged.
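    
    The handle lifecycle can be sketched with a small local manager. This is a simplified stand-in, not the exec-server implementation: `HandleManager` and its method names are hypothetical, and only the 128-handle cap is taken from the testing notes below.
    
    ```rust
    use std::collections::HashMap;
    use std::fs::File;
    use std::io::{self, Read, Seek, SeekFrom};
    use std::path::Path;
    
    /// Per-connection cap, matching the 128-handle limit in the test notes.
    const MAX_OPEN_HANDLES: usize = 128;
    
    /// Hypothetical handle table: files stay open between block reads.
    #[derive(Default)]
    struct HandleManager {
        next_id: u64,
        files: HashMap<u64, File>,
    }
    
    impl HandleManager {
        fn open(&mut self, path: &Path) -> io::Result<u64> {
            if self.files.len() >= MAX_OPEN_HANDLES {
                return Err(io::Error::new(io::ErrorKind::Other, "too many open handles"));
            }
            let id = self.next_id;
            self.files.insert(id, File::open(path)?);
            self.next_id += 1;
            Ok(id)
        }
    
        /// Reads up to `len` bytes at `offset`; an empty block signals EOF.
        fn read_block(&mut self, id: u64, offset: u64, len: usize) -> io::Result<Vec<u8>> {
            let file = self
                .files
                .get_mut(&id)
                .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "unknown handle"))?;
            file.seek(SeekFrom::Start(offset))?;
            let mut block = Vec::new();
            file.by_ref().take(len as u64).read_to_end(&mut block)?;
            Ok(block)
        }
    
        /// Dropping the `File` closes it and frees capacity.
        fn close(&mut self, id: u64) -> bool {
            self.files.remove(&id).is_some()
        }
    }
    ```
    
    Because the handle holds the opened file rather than its path, later block reads keep addressing the same file even if the path is replaced.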
    
    ## Testing
    
    - `just test -p codex-exec-server`
    - Integration coverage for 1 MiB chunking, exact block-boundary EOF,
    sandbox rejection, and continued reads from the opened file after path
    replacement.
    - Handle-manager coverage for non-sequential offsets, variable block
    lengths, the 128-handle limit, and capacity release after close.
  • Use PathUri in filesystem permission paths for exec-server (#28165)
    ## Why
    
    Progress towards letting app-server and exec-server run on different
    platforms, specifically for sandbox configuration.
    
    ## What
    
    - Make the filesystem path containment hierarchy generic, defaulting to
    `AbsolutePathBuf` for now.
    - Have clients specify `AbsolutePathBuf` or `PathUri` directly where
    needed.
    - Use `PathUri` throughout exec-server filesystem protocol and trait
    boundaries.
    - Implement `From` for conversion to path URIs and `TryFrom` for
    fallible conversion to absolute paths through the generic type
    hierarchy.
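    
    The conversion asymmetry can be illustrated with stand-in types. This is a sketch under loud assumptions: `SketchAbsPath` and `SketchPathUri` are hypothetical and are not the real `AbsolutePathBuf` or `PathUri`; it assumes Unix-style paths and performs no percent-encoding.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Hypothetical stand-in for a native absolute path.
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct SketchAbsPath(PathBuf);
    
    /// Hypothetical stand-in for a path URI carried across hosts.
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct SketchPathUri(String);
    
    // Native path -> URI always succeeds, so it is a plain `From`.
    impl From<&SketchAbsPath> for SketchPathUri {
        fn from(path: &SketchAbsPath) -> Self {
            SketchPathUri(format!("file://{}", path.0.display()))
        }
    }
    
    // URI -> native path can fail on the receiving host, so it is `TryFrom`.
    impl TryFrom<&SketchPathUri> for SketchAbsPath {
        type Error = String;
    
        fn try_from(uri: &SketchPathUri) -> Result<Self, Self::Error> {
            let rest = uri
                .0
                .strip_prefix("file://")
                .ok_or_else(|| format!("unsupported URI: {}", uri.0))?;
            let path = Path::new(rest);
            if path.is_absolute() {
                Ok(SketchAbsPath(path.to_path_buf()))
            } else {
                Err(format!("not an absolute native path: {rest}"))
            }
        }
    }
    ```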
  • [codex] Add size to internal filesystem metadata (#27927)
    ## Why
    
    `ExecutorFileSystem::get_metadata` reports file kind and timestamps but
    not size. Internal callers that need to enforce a size limit therefore
    have to read the complete file first, which is especially wasteful for
    remote filesystems.
    
    This adds the missing internal metadata so consumers can reject
    oversized files before transferring or buffering them. The field is
    named `size`, matching VS Code's `FileStat.size` filesystem convention.
    
    ## What changed
    
    - add `size: u64` to internal `FileMetadata`
    - populate it from the underlying filesystem metadata
    - carry it through sandbox-helper and remote exec-server responses
    - cover files, directories, symlink targets, and sandboxed reads across
    local and remote filesystem implementations
    
    The new field is intentionally not exposed through the app-server API.
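    
    The intended consumer pattern is to check the size before reading. A minimal local sketch, assuming a hypothetical `read_within_limit` helper and using `std::fs::metadata` as a stand-in for `ExecutorFileSystem::get_metadata`:
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::Path;
    
    /// Hypothetical helper: returns `Ok(None)` for oversized files without
    /// reading them, and the file contents otherwise.
    fn read_within_limit(path: &Path, max_bytes: u64) -> io::Result<Option<Vec<u8>>> {
        // One cheap metadata call decides whether the transfer happens at all.
        let size = fs::metadata(path)?.len();
        if size > max_bytes {
            return Ok(None);
        }
        fs::read(path).map(Some)
    }
    ```
    
    The file can still grow between the metadata call and the read, so a caller that needs a hard bound would also cap the read itself.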
    
    ## Testing
    
    - `just test -p codex-exec-server get_metadata`
    - `just test -p codex-exec-server
    file_system_sandboxed_metadata_and_read_allow_readable_root`
    - `just test -p codex-core-plugins`
    - `just test -p codex-skills-extension`
  • sandboxing: migrate cwd inputs to PathUri (#27816)
    ## Why
    
    Sandbox cwd values can cross app-server and exec-server host boundaries.
    They should retain URI semantics until the receiving host validates them
    instead of being interpreted early as native paths.
    
    ## What
    
    - Carry `PathUri` through filesystem sandbox contexts, sandbox commands,
    and transform inputs.
    - Convert command and policy cwd once in `SandboxManager::transform`,
    then keep launch requests native.
    - Preserve sandbox cwd over remote filesystem transport and reject
    non-native URIs without fallback.
    - Cache paired native/URI turn-environment cwd values during migration,
    with immutable access to keep them synchronized.
    - Extend existing protocol, forwarding, transform, and core runtime
    tests.
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
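    
    The two replacement shapes can be sketched as follows. The trait and type names (`Fetch`, `DynFetch`, `Doubler`) are hypothetical, and `block_on` is a tiny busy-polling driver included only so the sketch runs without an async runtime. Return-position `impl Trait` in traits needs Rust 1.75 or later.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Statically dispatched: the `Send` contract is spelled out in the trait.
    trait Fetch {
        fn fetch(&self, key: u32) -> impl Future<Output = u32> + Send;
    }
    
    /// Object-safe variant: an explicit boxed `Send` future.
    trait DynFetch {
        fn fetch_boxed<'a>(&'a self, key: u32) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>>;
    }
    
    struct Doubler;
    
    impl Doubler {
        // The existing async body, outlined into an inherent method.
        async fn fetch_inner(&self, key: u32) -> u32 {
            key * 2
        }
    }
    
    impl Fetch for Doubler {
        fn fetch(&self, key: u32) -> impl Future<Output = u32> + Send {
            self.fetch_inner(key)
        }
    }
    
    impl DynFetch for Doubler {
        fn fetch_boxed<'a>(&'a self, key: u32) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>> {
            Box::pin(self.fetch_inner(key))
        }
    }
    
    struct NoopWake;
    
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Minimal driver for the sketch: polls until the future is ready.
    fn block_on<F: Future>(future: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut future = std::pin::pin!(future);
        loop {
            if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
                return value;
            }
            std::thread::yield_now();
        }
    }
    ```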
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • [codex] migrate ExecutorFileSystem paths to PathUri (#27424)
    ## Why
    
    We're moving exec-server to use PathUri for its internal path
    representations.
    
    ## What
    
    Move `ExecutorFileSystem` APIs to use `PathUri` instead of
    `AbsolutePathBuf`. Future changes will convert higher-level parts of
    exec-server.
  • exec-server: canonicalize bound filesystem paths (#25149)
    ## Summary
    - add executor filesystem canonicalization as a bound-path operation
    - route remote canonicalization through the exec-server filesystem RPC
    surface
    - keep path normalization attached to the filesystem that owns the path
    
    ## Stack
    - 2/5 in the skills path authority stack extracted from
    https://github.com/openai/codex/pull/25098
    - follows merged https://github.com/openai/codex/pull/25121
    
    ## Validation
    - `cd
    /Users/starr/code/codex-worktrees/pr-25098-restack-review-pr1b/codex-rs
    && just fmt`
    - Not run: tests/checks (not requested)
    - GitHub CI pending on rewritten head
  • Migrate exec-server remote registration to environments (#23633)
    ## Summary
    - migrate exec-server remote registration naming from executor to
    environment
    - align CLI, public Rust exports, registry error messages, and relay
    test fixtures with the environment registry contract
    - keep the live registration path and response model consistent with
    `/cloud/environment/{environment_id}/register`
    
    ## Verification
    - `cargo test -p codex-exec-server
    remote::tests::register_environment_posts_with_auth_provider_headers
    --manifest-path /Users/richardlee/code/codex/codex-rs/Cargo.toml`
    - `cargo test -p codex-exec-server --test relay
    multiplexed_remote_environment_routes_independent_virtual_streams
    --manifest-path /Users/richardlee/code/codex/codex-rs/Cargo.toml`
    - `cargo check -p codex-cli --manifest-path
    /Users/richardlee/code/codex/codex-rs/Cargo.toml` (still running when PR
    opened; will update after completion if needed)
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` entails running both standard Rust tests and doctests. It
    turns out that doctest discovery is fairly slow, and it's a cost you pay
    even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
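    
    For reference, the setting lives in the library target table of each crate's `Cargo.toml`. A minimal fragment, shown in isolation from the rest of the manifest:
    
    ```toml
    [lib]
    # Skip doctest discovery for this crate's library target.
    doctest = false
    ```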
    
    For the collection of crates below, this speeds up test execution by more
    than 4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, the speedup is more than 2x. Before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • Refactor exec-server filesystem API into codex-file-system (#19892)
    ## Summary
    - Extracted the shared filesystem types and `ExecutorFileSystem` trait
    into a new `codex-file-system` crate
    - Switched `codex-config` and `codex-git-utils` to depend on that crate
    instead of `codex-exec-server`
    - Kept `codex-exec-server` re-exporting the same API for existing
    callers
    
    ## Testing
    - Ran `cargo test -p codex-file-system`
    - Ran `cargo test -p codex-git-utils`
    - Ran `cargo test -p codex-config`
    - Ran `cargo test -p codex-exec-server`
    - Ran `just fix -p codex-file-system`, `just fix -p codex-git-utils`,
    `just fix -p codex-config`, `just fix -p codex-exec-server`
    - Ran `just fmt`
    - Updated and verified the Bazel module lockfile