mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
d4ec08b8f0
## Summary - complete unified-exec processes from the ordered event stream instead of issuing a final zero-wait `process/read` - add optional executor sandbox-denial state to `process/exited` - retain `process/read` as a retained-output and compatibility fallback for receiver lag, sequence gaps, and legacy servers - recover sandbox-denial state across transport reconnection - cover the real `TestCodex` remote-exec path without adding a public test-only event constructor ## Why A successful one-shot tool call currently receives its output and terminal notifications, then pays another wide-area `process/read` round trip before returning. Staging traces showed that remote response wait accounted for more than 99.8% of RPC time; local serialization, queueing, and deserialization were below 0.6 ms. ## Measured impact A direct staging A/B used the same build and route and changed only completion mode. Each arm ran three times with 30 one-shot `/usr/bin/true` calls per run. The table reports the median of the three per-run percentiles. | Metric | Final `process/read` | Pushed events | Change | | --- | ---: | ---: | ---: | | End-to-end completion p50 | 159.5 ms | 118.7 ms | -40.8 ms (-25.6%) | | End-to-end completion p95 | 182.4 ms | 131.7 ms | -50.6 ms (-27.8%) | | Completion-wait p50 | 80.1 ms | 41.5 ms | -38.5 ms (-48.1%) | | Final `process/read` RPC p50 | 79.9 ms | eliminated | -79.9 ms | TCP_NODELAY was enabled in both A/B arms, so its effect cancels out. The successful, complete, in-order event path issued zero final `process/read` calls. ## Compatibility and recovery - new servers send `sandboxDenied` on `process/exited` - legacy servers omit it, which triggers one compatibility `process/read` - broadcast lag or a sequence gap triggers a retained-output read - recovery remains bounded by the server's existing 1 MiB retained-output window - complete, in-order event streams issue no completion read - sandbox denial is attached to the exit event before consumers can observe process completion - server-first and client-first rollouts remain wire-compatible; server-first realizes the latency win immediately ## Integration coverage The `TestCodex` suite exercises four distinct remote-exec contracts: - complete pushed output/exit/close with zero reads - direct pushed sandbox denial with zero reads - legacy missing denial metadata with exactly one compatibility read - count-bounded replay eviction recovered from retained output without duplication ## Validation - `just test -p codex-core exec_command_consumes_pushed_remote_process_events`: 4 passed - `just test -p codex-core unified_exec::process_tests::`: 4 passed - `just test -p codex-exec-server`: 294 passed, 2 skipped - `just test -p codex-exec-server-protocol`: 5 passed - `just test -p codex-rmcp-client`: 89 passed, 2 skipped - focused Bazel `//codex-rs/core:core-all-test`: passed across 16 shards - scoped `just fix` passed for core and exec-server - `just fmt` passed The complete workspace suite was not rerun; focused Cargo and Bazel coverage passed for the changed behavior.
443 lines
11 KiB
Markdown
443 lines
11 KiB
Markdown
# codex-exec-server
|
|
|
|
`codex-exec-server` is the library backing `codex exec-server`, a small
|
|
JSON-RPC server for spawning and controlling subprocesses through
|
|
`codex-utils-pty`.
|
|
|
|
It provides:
|
|
|
|
- a CLI entrypoint: `codex exec-server`
|
|
- a Rust client: `ExecServerClient`
|
|
- a small protocol module with shared request/response types
|
|
|
|
This crate owns the transport, protocol, and filesystem/process handlers. The
|
|
top-level `codex` binary owns hidden helper dispatch for sandboxed
|
|
filesystem operations and `codex-linux-sandbox`.
|
|
|
|
## Transport
|
|
|
|
The server speaks the exec-specific `codex-exec-server-protocol` message
|
|
envelope on the wire.
|
|
|
|
The CLI entrypoint supports:
|
|
|
|
- `ws://IP:PORT` (default)
|
|
- `--remote URL --environment-id ID [--name NAME]`
|
|
|
|
Remote mode registers the local exec-server with the environment registry,
|
|
then reconnects to the service-provided rendezvous websocket as the environment.
|
|
Remote communication uses the Noise relay contract; the registry and harness
|
|
must support it.
|
|
It uses the standard Codex ChatGPT sign-in state; run `codex login` first when
|
|
remote registration needs authentication. Containerized callers that receive an
|
|
Agent Identity JWT in `CODEX_ACCESS_TOKEN` can opt into that auth path with
|
|
`--use-agent-identity-auth`; Codex then registers an Agent task and sends the
|
|
derived AgentAssertion headers on the registry request.
|
|
|
|
Alternatively, API users can instead use `CODEX_API_KEY`;
|
|
Codex sends it as a bearer token on the registration request. For example:
|
|
|
|
```sh
|
|
CODEX_API_KEY="$OPENAI_API_KEY" \
|
|
codex exec-server \
|
|
--remote ... \
|
|
--environment-id "$ENVIRONMENT_ID"
|
|
```
|
|
|
|
Wire framing:
|
|
|
|
- local websocket: one JSON-RPC message per websocket frame
|
|
- Noise remote websocket: binary protobuf relay frames carrying encrypted payloads
|
|
|
|
## Remote Relay Message Format
|
|
|
|
In remote mode, the harness and environment communicate through rendezvous using
|
|
`codex.exec_server.relay.v1.RelayMessageFrame`; the checked-in schema is in
|
|
`src/proto/codex.exec_server.relay.v1.proto`. The relay frame carries stream
|
|
identity plus endpoint-owned reliability metadata:
|
|
|
|
```text
|
|
version
|
|
stream_id
|
|
body // handshake | data | ack_frame | resume | reset | heartbeat
|
|
ack // highest contiguous peer segment seq received
|
|
ack_bits // bitset for peer segment seqs after ack
|
|
seq // data only: segment sequence number
|
|
segment_index // data only: 0-based index within message
|
|
segment_count // data only: number of segments in message
|
|
payload // handshake bytes or encrypted data record
|
|
next_seq // resume only: next sender seq
|
|
reason // reset only: reset reason
|
|
```
|
|
|
|
`stream_id` identifies one virtual harness/environment JSON-RPC session on the
|
|
environment websocket. The harness generates a UUIDv4 `stream_id`; the environment
|
|
demuxes frames by `stream_id` and runs an independent `ConnectionProcessor` per
|
|
stream.
|
|
|
|
Use segment-level sequence numbers for reliability:
|
|
|
|
```text
|
|
seq = 0, 1, 2, 3, ...
|
|
```
|
|
|
|
Use contiguous segment sequence ranges to identify and stitch a segmented
|
|
application message:
|
|
|
|
```text
|
|
message_start_seq = seq - segment_index
|
|
segment_index = 0
|
|
segment_count = 1
|
|
```
|
|
|
|
`message_start_seq` is derived by the receiver, not sent on the wire. For
|
|
unsplit messages, `message_start_seq == seq`, `segment_index == 0`, and
|
|
`segment_count == 1`.
|
|
|
|
Use cumulative `ack` plus fixed-size `ack_bits` instead of variable ack ranges:
|
|
|
|
```text
|
|
ack = highest contiguous received segment seq
|
|
bit i in ack_bits acknowledges seq = ack + 1 + i
|
|
```
|
|
|
|
Send `ack` and `ack_bits` redundantly on every outbound frame. Acks are not
|
|
themselves acked. Acks, retries, duplicate suppression, segmentation, and
|
|
reassembly are endpoint responsibilities; rendezvous only routes relay frames
|
|
by `stream_id`.
|
|
|
|
## Lifecycle
|
|
|
|
Each connection follows this sequence:
|
|
|
|
1. Send `initialize`.
|
|
2. Wait for the `initialize` response.
|
|
3. Send `initialized`.
|
|
4. Call process or filesystem RPCs.
|
|
|
|
If the server receives any notification other than `initialized`, it replies
|
|
with an error using request id `-1`.
|
|
|
|
If the websocket connection closes, the server terminates any remaining managed
|
|
processes for that client connection.
|
|
|
|
## API
|
|
|
|
### `initialize`
|
|
|
|
Initial handshake request.
|
|
|
|
Request params:
|
|
|
|
```json
|
|
{
|
|
"clientName": "my-client"
|
|
}
|
|
```
|
|
|
|
Response:
|
|
|
|
```json
|
|
{}
|
|
```
|
|
|
|
### `initialized`
|
|
|
|
Handshake acknowledgement notification sent by the client after a successful
|
|
`initialize` response.
|
|
|
|
Params are currently ignored. Sending any other notification method is treated
|
|
as an invalid request.
|
|
|
|
### `process/start`
|
|
|
|
Starts a new managed process.
|
|
|
|
Request params:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1",
|
|
"argv": ["bash", "-lc", "printf 'hello\\n'"],
|
|
"cwd": "file:///absolute/working/directory",
|
|
"env": {
|
|
"PATH": "/usr/bin:/bin"
|
|
},
|
|
"tty": true,
|
|
"pipeStdin": false,
|
|
"arg0": null
|
|
}
|
|
```
|
|
|
|
Field definitions:
|
|
|
|
- `processId`: caller-chosen stable id for this process within the connection.
|
|
- `argv`: command vector. It must be non-empty.
|
|
- `cwd`: `file:` URI for the child process working directory.
|
|
- `env`: environment variables passed to the child process.
|
|
- `tty`: when `true`, spawn a PTY-backed interactive process.
|
|
- `pipeStdin`: when `true`, keep non-PTY stdin writable via `process/write`.
|
|
- `arg0`: optional argv0 override forwarded to `codex-utils-pty`.
|
|
|
|
Response:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1"
|
|
}
|
|
```
|
|
|
|
Behavior notes:
|
|
|
|
- Reusing an existing `processId` is rejected.
|
|
- PTY-backed processes accept later writes through `process/write`.
|
|
- Non-PTY processes reject writes unless `pipeStdin` is `true`.
|
|
- Output is streamed asynchronously via `process/output`.
|
|
- Exit is reported asynchronously via `process/exited`.
|
|
|
|
### `process/read`
|
|
|
|
Reads buffered output and terminal state for a managed process.
|
|
|
|
Request params:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1",
|
|
"afterSeq": null,
|
|
"maxBytes": 65536,
|
|
"waitMs": 1000
|
|
}
|
|
```
|
|
|
|
Field definitions:
|
|
|
|
- `processId`: managed process id returned by `process/start`.
|
|
- `afterSeq`: optional sequence number cursor; when present, only newer chunks
|
|
are returned.
|
|
- `maxBytes`: optional response byte budget.
|
|
- `waitMs`: optional long-poll timeout in milliseconds.
|
|
|
|
Response:
|
|
|
|
```json
|
|
{
|
|
"chunks": [],
|
|
"nextSeq": 1,
|
|
"exited": false,
|
|
"exitCode": null,
|
|
"closed": false,
|
|
"failure": null
|
|
}
|
|
```
|
|
|
|
### `process/write`
|
|
|
|
Writes raw bytes to a running process stdin.
|
|
|
|
Request params:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1",
|
|
"chunk": "aGVsbG8K"
|
|
}
|
|
```
|
|
|
|
`chunk` is base64-encoded raw bytes. In the example above it is `hello\n`.
|
|
|
|
Response:
|
|
|
|
```json
|
|
{
|
|
"status": "accepted"
|
|
}
|
|
```
|
|
|
|
Behavior notes:
|
|
|
|
- Writes to an unknown `processId` are rejected.
|
|
- Writes to a non-PTY process are rejected unless it started with `pipeStdin`.
|
|
|
|
### `process/terminate`
|
|
|
|
Terminates a running managed process.
|
|
|
|
Request params:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1"
|
|
}
|
|
```
|
|
|
|
Response:
|
|
|
|
```json
|
|
{
|
|
"running": true
|
|
}
|
|
```
|
|
|
|
If the process is already unknown or already removed, the server responds with:
|
|
|
|
```json
|
|
{
|
|
"running": false
|
|
}
|
|
```
|
|
|
|
## Notifications
|
|
|
|
### `process/output`
|
|
|
|
Streaming output chunk from a running process.
|
|
|
|
Params:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1",
|
|
"seq": 1,
|
|
"stream": "stdout",
|
|
"chunk": "aGVsbG8K"
|
|
}
|
|
```
|
|
|
|
Fields:
|
|
|
|
- `processId`: process identifier
|
|
- `seq`: per-process output sequence number
|
|
- `stream`: `"stdout"`, `"stderr"`, or `"pty"`
|
|
- `chunk`: base64-encoded output bytes
|
|
|
|
### `process/exited`
|
|
|
|
Final process exit notification.
|
|
|
|
Params:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1",
|
|
"seq": 2,
|
|
"exitCode": 0,
|
|
"sandboxDenied": false
|
|
}
|
|
```
|
|
|
|
`sandboxDenied` lets streaming clients preserve executor-side sandbox denial
|
|
detection without issuing a final `process/read` request. Clients recover it
|
|
with `process/read` when an older server omits the field.
|
|
|
|
### `process/closed`
|
|
|
|
Notification emitted after process output is closed and the process handle is
|
|
removed.
|
|
|
|
Params:
|
|
|
|
```json
|
|
{
|
|
"processId": "proc-1",
|
|
"seq": 3
|
|
}
|
|
```
|
|
|
|
## Filesystem RPCs
|
|
|
|
Filesystem methods require valid `file:` URI strings and return JSON-RPC errors
|
|
for invalid or unavailable paths. Native absolute path strings are rejected;
|
|
callers must convert them to `file:` URIs before sending requests:
|
|
|
|
- `fs/readFile`
|
|
- `fs/open`, `fs/readBlock`, and `fs/close` (internal transport for
|
|
`ExecutorFileSystem::read_file_stream`)
|
|
- `fs/writeFile`
|
|
- `fs/createDirectory`
|
|
- `fs/getMetadata`
|
|
- `fs/canonicalize`
|
|
- `fs/readDirectory`
|
|
- `fs/remove`
|
|
- `fs/copy`
|
|
|
|
Each filesystem request accepts an optional `sandbox` object. When `sandbox`
|
|
contains a `ReadOnly` or `WorkspaceWrite` policy, the operation runs in a
|
|
hidden helper process launched from the top-level `codex` executable and
|
|
prepared through the shared sandbox transform path. Helper requests and
|
|
responses are passed over stdin/stdout.
|
|
|
|
## Errors
|
|
|
|
The server returns JSON-RPC errors with these codes:
|
|
|
|
- `-32600`: invalid request
|
|
- `-32602`: invalid params
|
|
- `-32603`: internal error
|
|
|
|
Typical error cases:
|
|
|
|
- unknown method
|
|
- malformed params
|
|
- empty `argv`
|
|
- duplicate `processId`
|
|
- writes to unknown processes
|
|
- writes to non-PTY processes
|
|
- sandbox-denied filesystem operations
|
|
|
|
## Rust surface
|
|
|
|
The crate exports:
|
|
|
|
- `ExecServerClient`
|
|
- `ExecServerError`
|
|
- `ExecServerClientConnectOptions`
|
|
- `RemoteExecServerConnectArgs`
|
|
- protocol request/response structs for process and filesystem RPCs
|
|
- `DEFAULT_LISTEN_URL` and `ExecServerListenUrlParseError`
|
|
- `ExecServerRuntimePaths`
|
|
- `run_main()` for embedding the websocket server
|
|
- `RemoteEnvironmentConfig` and `run_remote_environment()` for embedding remote
|
|
registration mode
|
|
|
|
Callers must pass `ExecServerRuntimePaths` to `run_main()`. The top-level
|
|
`codex exec-server` command builds these paths from the `codex` arg0 dispatch
|
|
state. `RemoteEnvironmentConfig::new(...)` also takes the auth provider that
|
|
remote registration should use; the CLI builds that provider from Codex auth
|
|
state before starting remote mode.
|
|
|
|
## Example session
|
|
|
|
Initialize:
|
|
|
|
```json
|
|
{"id":1,"method":"initialize","params":{"clientName":"example-client"}}
|
|
{"id":1,"result":{}}
|
|
{"method":"initialized","params":{}}
|
|
```
|
|
|
|
Start a process:
|
|
|
|
```json
|
|
{"id":2,"method":"process/start","params":{"processId":"proc-1","argv":["bash","-lc","printf 'ready\\n'; while IFS= read -r line; do printf 'echo:%s\\n' \"$line\"; done"],"cwd":"file:///tmp","env":{"PATH":"/usr/bin:/bin"},"tty":true,"pipeStdin":false,"arg0":null}}
|
|
{"id":2,"result":{"processId":"proc-1"}}
|
|
{"method":"process/output","params":{"processId":"proc-1","seq":1,"stream":"stdout","chunk":"cmVhZHkK"}}
|
|
```
|
|
|
|
Write to the process:
|
|
|
|
```json
|
|
{"id":3,"method":"process/write","params":{"processId":"proc-1","chunk":"aGVsbG8K"}}
|
|
{"id":3,"result":{"status":"accepted"}}
|
|
{"method":"process/output","params":{"processId":"proc-1","seq":2,"stream":"stdout","chunk":"ZWNobzpoZWxsbwo="}}
|
|
```
|
|
|
|
Terminate it:
|
|
|
|
```json
|
|
{"id":4,"method":"process/terminate","params":{"processId":"proc-1"}}
|
|
{"id":4,"result":{"running":true}}
|
|
{"method":"process/exited","params":{"processId":"proc-1","seq":3,"exitCode":0,"sandboxDenied":false}}
|
|
{"method":"process/closed","params":{"processId":"proc-1","seq":4}}
|
|
```
|