5 Commits

  • [codex] consume pushed exec-server process events (#30273)
    ## Summary
    
    - complete unified-exec processes from the ordered event stream instead
    of issuing a final zero-wait `process/read`
    - add optional executor sandbox-denial state to `process/exited`
    - keep `process/read` as a retained-output and compatibility fallback
    for receiver lag, sequence gaps, and legacy servers
    - recover sandbox-denial state across transport reconnection
    - cover the real `TestCodex` remote-exec path without adding a public
    test-only event constructor
    
    ## Why
    
    A successful one-shot tool call currently receives its output and
    terminal notifications, then pays another wide-area `process/read` round
    trip before returning. Staging traces showed that remote response wait
    accounted for more than 99.8% of RPC time; local serialization,
    queueing, and deserialization were below 0.6 ms.
    
    ## Measured impact
    
    A direct staging A/B used the same build and route and changed only
    completion mode. Each arm ran three times with 30 one-shot
    `/usr/bin/true` calls per run. The table reports the median of the three
    per-run percentiles.
    
    | Metric | Final `process/read` | Pushed events | Change |
    | --- | ---: | ---: | ---: |
    | End-to-end completion p50 | 159.5 ms | 118.7 ms | -40.8 ms (-25.6%) |
    | End-to-end completion p95 | 182.4 ms | 131.7 ms | -50.6 ms (-27.8%) |
    | Completion-wait p50 | 80.1 ms | 41.5 ms | -38.5 ms (-48.1%) |
    | Final `process/read` RPC p50 | 79.9 ms | eliminated | -79.9 ms |
    
    TCP_NODELAY was enabled in both A/B arms, so its effect cancels out. The
    successful, complete, in-order event path issued zero final
    `process/read` calls.
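    
    The aggregation behind the table (a percentile per run, then the median
    across the three runs) can be sketched as follows. This is a minimal
    illustration, not the measurement harness used for the A/B: the
    nearest-rank percentile method and every function name here are
    assumptions.
    
    ```rust
    /// Nearest-rank percentile of a non-empty sample; `p` is in (0, 100].
    /// The nearest-rank method is an assumption; the source does not say
    /// which percentile definition the measurement used.
    pub fn percentile(samples: &[f64], p: f64) -> f64 {
        let mut sorted = samples.to_vec();
        sorted.sort_by(|a, b| a.total_cmp(b));
        let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
        sorted[rank.clamp(1, sorted.len()) - 1]
    }
    
    /// Computes the percentile within each run, then takes the median of
    /// those per-run values.
    pub fn median_of_run_percentiles(runs: &[Vec<f64>], p: f64) -> f64 {
        let per_run: Vec<f64> = runs.iter().map(|run| percentile(run, p)).collect();
        percentile(&per_run, 50.0)
    }
    
    /// Relative change in percent; negative values mean `after` is lower.
    pub fn percent_change(before: f64, after: f64) -> f64 {
        (after - before) / before * 100.0
    }
    ```
    
    Applied to the reported p50 values, `percent_change(159.5, 118.7)`
    comes out at roughly -25.6, matching the first table row.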
    
    ## Compatibility and recovery
    
    - new servers send `sandboxDenied` on `process/exited`
    - legacy servers omit it, which triggers one compatibility
    `process/read`
    - broadcast lag or a sequence gap triggers a retained-output read
    - recovery remains bounded by the server's existing 1 MiB
    retained-output window
    - complete, in-order event streams issue no completion read
    - sandbox denial is attached to the exit event before consumers can
    observe process completion
    - server-first and client-first rollouts remain wire-compatible;
    server-first realizes the latency win immediately
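    
    The rules above reduce to a small decision at exit time. The sketch
    below is illustrative only: `ExitEvent`, `Completion`, and
    `decide_completion` are hypothetical names, not the types in
    `codex-core`, and the real consumer tracks more state than a single
    sequence number.
    
    ```rust
    /// Hypothetical view of a `process/exited` event. `sandbox_denied` is
    /// `None` when a legacy server omits the field.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct ExitEvent {
        pub seq: u64,
        pub sandbox_denied: Option<bool>,
    }
    
    #[derive(Debug, PartialEq, Eq)]
    pub enum Completion {
        /// Complete, in-order stream: no `process/read` is issued.
        FromEvents { sandbox_denied: bool },
        /// Legacy server: one compatibility `process/read`.
        CompatibilityRead,
        /// Receiver lag or sequence gap: recover from retained output.
        RetainedOutputRead,
    }
    
    /// Picks the completion path for an exit event, given the sequence
    /// number the consumer expected next and whether the receiver lagged.
    pub fn decide_completion(next_expected_seq: u64, lagged: bool, exit: ExitEvent) -> Completion {
        if lagged || exit.seq != next_expected_seq {
            return Completion::RetainedOutputRead;
        }
        match exit.sandbox_denied {
            Some(denied) => Completion::FromEvents { sandbox_denied: denied },
            None => Completion::CompatibilityRead,
        }
    }
    ```
    
    Checking for lag and gaps first means a missing `sandboxDenied` field
    costs a read only when the stream itself is otherwise trustworthy.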
    
    ## Integration coverage
    
    The `TestCodex` suite exercises four distinct remote-exec contracts:
    
    - complete pushed output/exit/close with zero reads
    - direct pushed sandbox denial with zero reads
    - legacy missing denial metadata with exactly one compatibility read
    - count-bounded replay eviction recovered from retained output without
    duplication
    
    ## Validation
    
    - `just test -p codex-core
    exec_command_consumes_pushed_remote_process_events`: 4 passed
    - `just test -p codex-core unified_exec::process_tests::`: 4 passed
    - `just test -p codex-exec-server`: 294 passed, 2 skipped
    - `just test -p codex-exec-server-protocol`: 5 passed
    - `just test -p codex-rmcp-client`: 89 passed, 2 skipped
    - focused Bazel `//codex-rs/core:core-all-test`: passed across 16 shards
    - scoped `just fix` passed for core and exec-server
    - `just fmt` passed
    
    The complete workspace suite was not rerun; focused Cargo and Bazel
    coverage passed for the changed behavior.
  • Support OAuth for HTTP MCP servers from selected executor plugins (#28529)
    ## Why
    
    #28522 routes selected-plugin HTTP MCP traffic through the owning
    executor, but OAuth bootstrap and refresh still use host-local clients.
    Executor-only servers therefore cannot complete discovery or login
    through the same network boundary as the MCP connection.
    
    ## What changed
    
    - adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient`
    contract
    - let RMCP own discovery, dynamic registration, PKCE, token exchange,
    and refresh
    - route auth status, persisted-token startup, and app-server login
    through the server runtime while preserving the existing local discovery
    path
    - add optional `threadId` to `mcpServer/oauth/login` and echo it in the
    completion notification
    - implement RMCP's redirect policy and 1 MiB OAuth response limit over
    executor HTTP
    - cover selected-thread OAuth discovery and login through an
    executor-only route
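    
    One way to enforce a response-size cap like the 1 MiB OAuth limit is
    to read one byte past the limit and reject the body if that byte
    arrives. This is a standalone sketch over `std::io::Read`, not the
    adapter in this PR; `read_bounded` and `MAX_OAUTH_RESPONSE_BYTES` are
    hypothetical names.
    
    ```rust
    use std::io::{self, Read};
    
    /// 1 MiB, matching the limit named in the change list above.
    pub const MAX_OAUTH_RESPONSE_BYTES: u64 = 1024 * 1024;
    
    /// Reads the whole body but fails if it is larger than `limit` bytes.
    pub fn read_bounded<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
        let mut body = Vec::new();
        // Allow one extra byte so "exactly at the limit" and "over the
        // limit" can be told apart without buffering an unbounded body.
        reader
            .take(limit.saturating_add(1))
            .read_to_end(&mut body)?;
        if body.len() as u64 > limit {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "response exceeds size limit",
            ));
        }
        Ok(body)
    }
    ```
    
    Because `Take` stops after `limit + 1` bytes, an oversized response is
    rejected without holding more than that in memory.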
    
    Depends on #28522.
  • [codex] Trace exec-server JSON-RPC requests (#27466)
    ## Why
    
    Exec-server JSON-RPC calls can cross local and remote transports, but
    trace context stopped at the RPC boundary. That made client and server
    work difficult to correlate when diagnosing latency or failures.
    
    ## What changed
    
    - Propagate the current W3C trace context on outbound JSON-RPC requests.
    - Parent inbound request spans from received trace context.
    - Record the received JSON-RPC method on server spans and keep each span
    open through response enqueue.
    - Add only the OTEL dependencies required by the exec-server crate.
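    
    For reference, W3C trace context travels as a `traceparent` value of
    the form `version-traceid-parentid-flags`. The sketch below parses and
    formats version `00` only. It is not the code in this PR: the
    `TraceParent` type and both function names are hypothetical, and the
    source does not say where in the JSON-RPC request the value is
    carried.
    
    ```rust
    /// Parsed version-00 `traceparent` value.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct TraceParent {
        pub trace_id: String,  // 32 lowercase hex characters
        pub parent_id: String, // 16 lowercase hex characters
        pub sampled: bool,     // lowest bit of the trace flags
    }
    
    fn is_lower_hex(s: &str, len: usize) -> bool {
        s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
    }
    
    /// Returns `None` for malformed input, unknown versions, or all-zero IDs.
    pub fn parse_traceparent(header: &str) -> Option<TraceParent> {
        let parts: Vec<&str> = header.trim().split('-').collect();
        if parts.len() != 4 || parts[0] != "00" {
            return None;
        }
        let (trace_id, parent_id, flags) = (parts[1], parts[2], parts[3]);
        if !is_lower_hex(trace_id, 32) || !is_lower_hex(parent_id, 16) || !is_lower_hex(flags, 2) {
            return None;
        }
        // The specification treats all-zero trace and parent IDs as invalid.
        if trace_id.bytes().all(|b| b == b'0') || parent_id.bytes().all(|b| b == b'0') {
            return None;
        }
        let flags = u8::from_str_radix(flags, 16).ok()?;
        Some(TraceParent {
            trace_id: trace_id.to_string(),
            parent_id: parent_id.to_string(),
            sampled: flags & 0x01 == 0x01,
        })
    }
    
    /// Formats the value for an outbound request; only the sampled flag is kept.
    pub fn format_traceparent(tp: &TraceParent) -> String {
        format!("00-{}-{}-{:02x}", tp.trace_id, tp.parent_id, u8::from(tp.sampled))
    }
    ```
    
    A server that fails to parse the value can start a fresh root span
    instead of rejecting the request, which keeps tracing optional.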
    
    ## Stack
    
    Review and land this stack in order:
    
    1. #27466 — trace exec-server JSON-RPC requests **(this PR)**
    2. #27467 — record bounded connection, request, and process lifecycle
    metrics
    3. #27470 — observe remote registration and Noise rendezvous lifecycle
    
    ## Validation
    
    - `just test -p codex-exec-server --lib` (153 passed)
    - `just bazel-lock-check`
    - `just fix -p codex-exec-server`
  • Add a bounded filesystem walk RPC (#29841)
    PR 1 of 3 in this stack. Follow-ups: #29842 and #29844.
    
    ## What changes
    
    Adds a general bounded `fs/walk` operation to the exec server.
    
    The operation returns file and directory entries plus recoverable
    per-path errors. It skips symlinks, preserves the existing filesystem
    sandbox routing, and enforces depth, directory, entry, and response-size
    limits.
    
    This PR only defines and wires the filesystem operation. It does not
    change any callers yet.
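    
    The shape of a bounded walk can be sketched with the standard library
    alone. This is not the `fs/walk` implementation: `WalkLimits`,
    `WalkResult`, and `bounded_walk` are hypothetical names, sandbox
    routing is omitted, and only the depth and entry limits are modeled
    (the directory and response-size limits are left out).
    
    ```rust
    use std::collections::VecDeque;
    use std::fs;
    use std::path::{Path, PathBuf};
    
    pub struct WalkLimits {
        /// Deepest level returned; direct children of the root are depth 1.
        pub max_depth: usize,
        /// Maximum number of entries returned before the walk stops.
        pub max_entries: usize,
    }
    
    #[derive(Debug, Default)]
    pub struct WalkResult {
        pub entries: Vec<(PathBuf, bool)>, // (path, is_dir)
        pub errors: Vec<(PathBuf, String)>, // recoverable per-path errors
        pub truncated: bool,                // true if the entry limit was hit
    }
    
    /// Breadth-first walk that skips symlinks and records errors per path.
    pub fn bounded_walk(root: &Path, limits: &WalkLimits) -> WalkResult {
        let mut result = WalkResult::default();
        let mut queue = VecDeque::from([(root.to_path_buf(), 0usize)]);
        while let Some((dir, depth)) = queue.pop_front() {
            if depth >= limits.max_depth {
                continue; // listing this directory would exceed the depth limit
            }
            let read_dir = match fs::read_dir(&dir) {
                Ok(rd) => rd,
                Err(err) => {
                    result.errors.push((dir, err.to_string()));
                    continue;
                }
            };
            for entry in read_dir {
                let entry = match entry {
                    Ok(e) => e,
                    Err(err) => {
                        result.errors.push((dir.clone(), err.to_string()));
                        continue;
                    }
                };
                let path = entry.path();
                // `DirEntry::file_type` does not follow symlinks.
                let file_type = match entry.file_type() {
                    Ok(t) => t,
                    Err(err) => {
                        result.errors.push((path, err.to_string()));
                        continue;
                    }
                };
                if file_type.is_symlink() {
                    continue;
                }
                if result.entries.len() >= limits.max_entries {
                    result.truncated = true;
                    return result;
                }
                let is_dir = file_type.is_dir();
                result.entries.push((path.clone(), is_dir));
                if is_dir {
                    queue.push_back((path, depth + 1));
                }
            }
        }
        result
    }
    ```
    
    Recording errors per path instead of aborting lets one unreadable
    directory fail without discarding the rest of the result.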
  • protocol: separate app and exec RPC ownership (#29714)
    ## Why
    
    The app-server and exec-server expose separate JSON-RPC APIs, but
    exec-server currently sources its serialized protocol and envelope types
    through app-server-oriented code. Giving each API an explicit owner
    makes the crate boundary legible without introducing shared generic
    envelopes.
    
    ## What changed
    
    - Added `codex-exec-server-protocol` to own exec DTOs, process IDs, and
    JSON-RPC envelopes.
    - Updated exec-server clients, transports, handlers, and tests to use
    the new crate.
    - Exposed app-server's existing JSON-RPC types through a public `rpc`
    module while retaining root re-exports.
    - Preserved existing wire shapes, including exec `PathUri` behavior.
    
    ## Stack
    
    This is PR 1 of 6. Next: [PR
    #29721](https://github.com/openai/codex/pull/29721), which moves auth
    mode below the app wire boundary.
    
    ## Validation
    
    - Exec-server protocol and server coverage passed in the focused
    protocol test runs.
    - App-server protocol schema fixtures passed.