9 Commits

  • protocol: separate app and exec RPC ownership (#29714)
    ## Why
    
    The app-server and exec-server expose separate JSON-RPC APIs, but
    exec-server currently sources its serialized protocol and envelope types
    through app-server-oriented code. Giving each API an explicit owner
    makes the crate boundary legible without introducing shared generic
    envelopes.
    
    ## What changed
    
    - Added `codex-exec-server-protocol` to own exec DTOs, process IDs, and
    JSON-RPC envelopes.
    - Updated exec-server clients, transports, handlers, and tests to use
    the new crate.
    - Exposed app-server's existing JSON-RPC types through a public `rpc`
    module while retaining root re-exports.
    - Preserved existing wire shapes, including exec `PathUri` behavior.
    
    ## Stack
    
    This is PR 1 of 6. Next: [PR
    #29721](https://github.com/openai/codex/pull/29721), which moves auth
    mode below the app wire boundary.
    
    ## Validation
    
    - Exec-server protocol and server coverage passed in the focused
    protocol test runs.
    - App-server protocol schema fixtures passed.
  • path-uri: remove legacy path deserialization (#29158)
    ## Why
    
    I'd originally added `PathUri` legacy path deserialization thinking we'd
    want it for having `PathUri` in public app-server APIs. Since then we've
    added `LegacyAppPathString` to handle the messy conversions that we need
    for backcompat. It's confusing for `PathUri` to support deserializing
    legacy paths when we don't yet want to actually expose app-server
    callers or rollout storage to the new URI format.
    
    Stacked on top of #29472 to avoid breaking compatibility in case those
    types ended up stored somewhere for someone.
    
    ## What changed
    
    - Parse deserialized `PathUri` values exclusively as valid `file:` URIs.
    - Replace legacy acceptance coverage with rejection coverage for
    top-level filesystem paths and sandbox working directories.
    - Serialize CWDs in hand-built exec-server process requests as `PathUri`
    values.
  • Report remote sandbox denials semantically (#29424)
    ## Why
    
    #29113 moved remote sandbox setup and enforcement to the exec server.
    That gives the executor ownership of the platform-specific work: a Linux
    executor chooses and runs a Linux sandbox even when the Codex
    orchestrator is running on macOS or Windows.
    
    It also means the orchestrator no longer knows which concrete sandbox
    the executor selected. When that sandbox blocks a remote command, the
    orchestrator currently sees only a failed process and can treat the
    denial as an ordinary command failure. The existing sandbox approval and
    retry path is then skipped.
    
    This PR lets the executor report one portable fact:
    
    > This command probably failed because the executor sandbox blocked it.
    
    The executor keeps its concrete sandbox type private. The protocol sends
    only the semantic result.
    
    ## Example
    
    Suppose a local macOS Codex session asks a Linux devbox to write outside
    the allowed workspace.
    
    Before this PR:
    
    ```text
    Linux sandbox blocks the write
        -> remote process exits with "Permission denied"
        -> local orchestrator sees an ordinary command failure
        -> the normal sandbox approval and retry path can be skipped
    ```
    
    With this PR:
    
    ```text
    Linux sandbox blocks the write
        -> executor reports sandboxDenied: true
        -> unified exec returns UnifiedExecError::SandboxDenied
        -> the existing approval prompt is shown
        -> an approved retry runs through the existing unsandboxed retry path
    ```
    
    ## What changes
    
    ### The executor remembers its selected sandbox
    
    The prepared remote process now retains the executor-selected
    `SandboxType`. This value never crosses the executor boundary.
    
    Commands started without a sandbox retain `SandboxType::None` and are
    never reported as sandbox denials.
    
    ### The executor uses the existing denial heuristic
    
    The existing local denial heuristic moves from `codex-core` into the
    shared `codex-sandboxing` crate.
    
    When a sandboxed remote process exits, the executor:
    
    1. waits the same short output grace period used by local unified exec;
    2. reads the output currently available in the existing retained output
    buffer;
    3. runs the existing heuristic using the exit code and common denial
    messages;
    4. stores the yes/no result before publishing the process exit.
    
    This deliberately matches the old local unified-exec behavior. It does
    not add a new streaming classifier, another output buffer, or stronger
    output-retention guarantees.
    
    ### The protocol reports a portable boolean
    
    `process/read` gains `sandboxDenied`:
    
    ```json
    {
      "exited": true,
      "exitCode": 1,
      "closed": false,
      "sandboxDenied": true
    }
    ```
    
    The field defaults to `false` when an older executor omits it. The
    response does not expose the executor sandbox implementation or
    executor-native paths.
    
    ### Unified exec uses the existing error path
    
    The exec-server client carries `sandboxDenied` into the unified process
    state. If it is true, unified exec returns the existing `SandboxDenied`
    error instead of trying to classify remote output using an
    orchestrator-side sandbox type.
    
    Remote process exit remains visible as soon as the process exits. This
    PR does not wait for stdout or stderr to close and does not change the
    existing process lifecycle.
    
    ## Scope
    
    This PR is intentionally limited to matching the existing local
    unified-exec behavior for the initial command execution path.
    
    It does not add:
    
    - incremental denial tracking across the full output stream;
    - new denial handling for commands completed later through
    `write_stdin`;
    - new guarantees for preserving the semantic flag during the narrow
    reconnect-recovery race.
    
    Those can be considered separately if the same behavior is added for
    local execution.
    
    ## Test coverage
    
    One remote end-to-end integration test covers the complete intended
    flow:
    
    ```text
    remote read-only sandbox
        -> denied write
        -> executor reports the denial
        -> Codex requests approval
        -> user approves
        -> retry succeeds on the remote executor
    ```
    
    Existing lifecycle coverage continues to verify that remote process exit
    is reported before late output streams close.
  • Recover exec process stdin writes (#28895)
    ## Summary
    
    Remote stdio MCP servers send tool calls by writing JSON-RPC bytes
    through `process/write`.
    
    When the exec-server websocket drops at the wrong time, the remote
    process can survive session recovery, but the stdin write can still fail
    back to RMCP as a transport send error. RMCP then closes the stdio MCP
    transport, so tools like `node_repl` are lost even though the
    process/session recovery path is working.
    
    This changes `process/write` to be safe to retry across exec-server
    recovery:
    
    - adds a required `writeId` to `process/write`
    - retries remote `Session::write` with the same `writeId` after
    reconnect
    - remembers accepted write ids per process so duplicate retries return
    `Accepted` without writing the same bytes to child stdin again
    - covers both the client retry path and server-side write id dedupe with
    tests
    
    In simple terms:
    
    ```text
    before:
    write to MCP stdin -> websocket closes -> write errors -> RMCP closes node_repl
    
    after:
    write to MCP stdin -> websocket closes -> reconnect -> retry same writeId
    server either writes once or recognizes it already did
    ```
  • [2/8] Support piped stdin in exec process API (#18086)
    ## Summary
    - Add an explicit stdin mode to process/start.
    - Keep normal non-interactive exec stdin closed while allowing
    pipe-backed processes.
    
    ## Stack
    ```text
    o  #18027 [8/8] Fail exec client operations after disconnect
    │
    o  #18025 [7/8] Cover MCP stdio tests with executor placement
    │
    o  #18089 [6/8] Wire remote MCP stdio through executor
    │
    o  #18088 [5/8] Add executor process transport for MCP stdio
    │
    o  #18087 [4/8] Abstract MCP stdio server launching
    │
    o  #18020 [3/8] Add pushed exec process events
    │
    @  #18086 [2/8] Support piped stdin in exec process API
    │
    o  #18085 [1/8] Add MCP server environment config
    │
    o  main
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: move exec-server ownership (#16344)
    This introduces session-scoped ownership for exec-server so ws
    disconnects no longer immediately kill running remote exec processes,
    and it prepares the protocol for reconnect-based resume.
    - add session_id / resume_session_id to the exec-server initialize
    handshake
      - move process ownership under a shared session registry
    - detach sessions on websocket disconnect and expire them after a TTL
    instead of killing processes immediately (we will resume based on this)
    - allow a new connection to resume an existing session and take over
    notifications/ownership
    - I use UUID to make them not predictable as we don't have auth for now
    - make detached-session expiry authoritative at resume time so teardown
    wins at the TTL boundary
    - reject long-poll process/read calls that get resumed out from under an
    older attachment
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: use ProcessId in exec-server (#15866)
    Use a full struct for the ProcessId to increase readability and make it
    easier in the future to make it evolve if needed
  • Add exec-server exec RPC implementation (#15090)
    Stacked PR 2/3, based on the stub PR.
    
    Adds the exec RPC implementation and process/event flow in exec-server
    only.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Remove stdio transport from exec server (#15119)
    Summary
    - delete the deprecated stdio transport plumbing from the exec server
    stack
    - add a basic `exec_server()` harness plus test utilities to start a
    server, send requests, and await events
    - refresh exec-server dependencies, configs, and documentation to
    reflect the new flow
    
    Testing
    - Not run (not requested)
    
    ---------
    
    Co-authored-by: starr-openai <starr@openai.com>
    Co-authored-by: Codex <noreply@openai.com>