mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
198 Commits
-
[codex] disable Nagle on Rendezvous WebSockets (#30269)
## Summary

Disable Nagle unconditionally for both exec-server Rendezvous WebSocket connections.

- pass `disable_nagle=true` at the executor and harness connection call sites
- keep the existing signed URL, protocol, and connection flow unchanged
- add no feature flag, rollout schema, path variant, or experiment-specific telemetry

The companion internal PR enables `TCP_NODELAY` on accepted Rendezvous sockets: https://github.com/openai/openai/pull/1082463

## Why

Rendezvous carries small, latency-sensitive relay and JSON-RPC frames. Three staging runs of 30 steady-state `process/read` calls per configuration measured p50 improving from 139.1 ms to 81.5 ms and p95 from 162.0 ms to 95.8 ms with Nagle disabled.

The expected packet overhead is small at the current connection scale. We will use existing latency, error, packet, and CPU monitoring and revert normally if production regresses.

## Rollout and rollback

The client and accepted-socket changes can deploy independently. New connections receive the setting as each side deploys. Rollback is a normal code revert; there is no persisted assignment or gate state to unwind.

## Validation

- `just test -p codex-exec-server --lib`: 164 passed
- `just fix -p codex-exec-server`: passed
- `just fmt`: passed
- independent final review found no actionable issue
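As an illustration of what disabling Nagle means at the socket level, here is a minimal sketch using only the Rust standard library. The helper names `connect_low_latency` and `accept_low_latency` are hypothetical and are not the functions touched by this PR; the real change passes `disable_nagle=true` through the WebSocket connection call sites.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener, TcpStream};

/// Hypothetical helper: connects to `addr` and disables Nagle's algorithm.
fn connect_low_latency(addr: SocketAddr) -> io::Result<TcpStream> {
    let stream = TcpStream::connect(addr)?;
    // TCP_NODELAY: small writes are sent immediately instead of being coalesced.
    stream.set_nodelay(true)?;
    Ok(stream)
}

/// Hypothetical helper: accepts one connection and applies the same setting
/// on the accepted side, mirroring the companion server-side change.
fn accept_low_latency(listener: &TcpListener) -> io::Result<TcpStream> {
    let (stream, _peer) = listener.accept()?;
    stream.set_nodelay(true)?;
    Ok(stream)
}
```

Because each side sets the option on its own socket, the two halves can deploy independently, which matches the rollout description above.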
richardopenai ·
2026-06-29 19:14:47 -05:00 -
[app-server] expose environment info RPC (#30291)
## Why App-server clients that configure named execution environments need to discover an environment's shell and working directory before selecting it for a thread or turn. Because the environment can run on a different operating system than app-server, its working directory is represented as a canonical `file:` URI rather than a host-local path string. The probe also needs a bounded response time: an exec-server that completes initialization but never answers `environment/info` must not hold the environment serialization queue indefinitely. ## What changed - Add an experimental `environment/info` app-server RPC for named environments. - Route the probe through the managed environment connection and return target-native shell metadata plus the default working directory as a `PathUri`. - Return connection and protocol failures as JSON-RPC errors. - Bound the exec-server probe response to 30 seconds and remove timed-out calls from the pending-request table so later environment mutations can proceed. - Cover successful responses, omitted working directories, unknown environments, connection failures, and pending-call cleanup. ## Protocol examples Request: ```json { "id": 42, "method": "environment/info", "params": { "environmentId": "remote-a" } } ``` Successful response: ```json { "id": 42, "result": { "shell": { "name": "zsh", "path": "/bin/zsh" }, "cwd": "file:///workspace" } } ``` If the exec-server initializes but does not answer the probe within 30 seconds: ```json { "id": 42, "error": { "code": -32603, "message": "failed to get info for environment `remote-a`: exec-server protocol error: timed out waiting for exec-server `environment/info` response after 30s" } } ``` ## Testing - App-server integration coverage for successful info (including omitted `cwd`), unknown environments, and connection failures. - Exec-server RPC coverage verifying a timed-out call is removed from the pending-request table. 
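The bounded-probe behavior described above (time out, then remove the call from the pending-request table) can be sketched with standard-library channels. This is an assumption-laden illustration: `PendingRequests`, `call`, and `complete` are hypothetical names, and the real exec-server client is asynchronous rather than thread-based.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, RecvTimeoutError, Sender};
use std::sync::{Arc, Mutex};
use std::time::Duration;

/// Hypothetical pending-request table: request id -> response channel.
#[derive(Clone, Default)]
struct PendingRequests {
    inner: Arc<Mutex<HashMap<u64, Sender<String>>>>,
}

impl PendingRequests {
    /// Registers `id`, waits up to `timeout`, and always removes the entry
    /// before returning so a timed-out call cannot linger in the table.
    fn call(&self, id: u64, timeout: Duration) -> Result<String, String> {
        let (tx, rx) = channel();
        self.inner.lock().unwrap().insert(id, tx);
        let outcome = rx.recv_timeout(timeout);
        self.inner.lock().unwrap().remove(&id);
        match outcome {
            Ok(response) => Ok(response),
            Err(RecvTimeoutError::Timeout) => Err(format!("timed out after {timeout:?}")),
            Err(RecvTimeoutError::Disconnected) => Err("connection closed".to_string()),
        }
    }

    /// Delivers a response to a waiting call; returns false if none is pending.
    fn complete(&self, id: u64, response: String) -> bool {
        match self.inner.lock().unwrap().get(&id) {
            Some(tx) => tx.send(response).is_ok(),
            None => false,
        }
    }

    /// Number of calls still waiting for a response.
    fn len(&self) -> usize {
        self.inner.lock().unwrap().len()
    }
}
```

With a 30-second timeout passed to `call`, a server that never answers produces an error and leaves the table empty, so later requests are not blocked by the stale entry.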
---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
Max Johnson ·
2026-06-27 19:34:10 +00:00 -
[codex] consume pushed exec-server process events (#30273)
## Summary - complete unified-exec processes from the ordered event stream instead of issuing a final zero-wait `process/read` - add optional executor sandbox-denial state to `process/exited` - retain `process/read` as a retained-output and compatibility fallback for receiver lag, sequence gaps, and legacy servers - recover sandbox-denial state across transport reconnection - cover the real `TestCodex` remote-exec path without adding a public test-only event constructor ## Why A successful one-shot tool call currently receives its output and terminal notifications, then pays another wide-area `process/read` round trip before returning. Staging traces showed that remote response wait accounted for more than 99.8% of RPC time; local serialization, queueing, and deserialization were below 0.6 ms. ## Measured impact A direct staging A/B used the same build and route and changed only completion mode. Each arm ran three times with 30 one-shot `/usr/bin/true` calls per run. The table reports the median of the three per-run percentiles. | Metric | Final `process/read` | Pushed events | Change | | --- | ---: | ---: | ---: | | End-to-end completion p50 | 159.5 ms | 118.7 ms | -40.8 ms (-25.6%) | | End-to-end completion p95 | 182.4 ms | 131.7 ms | -50.6 ms (-27.8%) | | Completion-wait p50 | 80.1 ms | 41.5 ms | -38.5 ms (-48.1%) | | Final `process/read` RPC p50 | 79.9 ms | eliminated | -79.9 ms | TCP_NODELAY was enabled in both A/B arms, so its effect cancels out. The successful, complete, in-order event path issued zero final `process/read` calls. 
## Compatibility and recovery - new servers send `sandboxDenied` on `process/exited` - legacy servers omit it, which triggers one compatibility `process/read` - broadcast lag or a sequence gap triggers a retained-output read - recovery remains bounded by the server's existing 1 MiB retained-output window - complete, in-order event streams issue no completion read - sandbox denial is attached to the exit event before consumers can observe process completion - server-first and client-first rollouts remain wire-compatible; server-first realizes the latency win immediately ## Integration coverage The `TestCodex` suite exercises four distinct remote-exec contracts: - complete pushed output/exit/close with zero reads - direct pushed sandbox denial with zero reads - legacy missing denial metadata with exactly one compatibility read - count-bounded replay eviction recovered from retained output without duplication ## Validation - `just test -p codex-core exec_command_consumes_pushed_remote_process_events`: 4 passed - `just test -p codex-core unified_exec::process_tests::`: 4 passed - `just test -p codex-exec-server`: 294 passed, 2 skipped - `just test -p codex-exec-server-protocol`: 5 passed - `just test -p codex-rmcp-client`: 89 passed, 2 skipped - focused Bazel `//codex-rs/core:core-all-test`: passed across 16 shards - scoped `just fix` passed for core and exec-server - `just fmt` passed The complete workspace suite was not rerun; focused Cargo and Bazel coverage passed for the changed behavior.
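The compatibility rules above can be read as a small decision function over the received event stream. The sketch below is hypothetical: the type and function names are invented for illustration, sequence numbers are assumed to start at 1, and the function is assumed to run only after the stream has ended.

```rust
/// Hypothetical pushed events, each carrying a sequence number.
#[derive(Debug, Clone, PartialEq)]
enum ProcessEvent {
    Output { seq: u64, bytes: Vec<u8> },
    /// `sandbox_denied` is `None` when a legacy server omits the field.
    Exited { seq: u64, sandbox_denied: Option<bool> },
    Closed { seq: u64 },
}

#[derive(Debug, PartialEq)]
enum CompletionPlan {
    /// Complete, in-order stream: no `process/read` is issued.
    FromEvents,
    /// Legacy server omitted denial metadata: one compatibility read.
    CompatibilityRead,
    /// Sequence gap or missing terminal events: read retained output.
    RetainedOutputRead,
}

fn seq_of(event: &ProcessEvent) -> u64 {
    match event {
        ProcessEvent::Output { seq, .. }
        | ProcessEvent::Exited { seq, .. }
        | ProcessEvent::Closed { seq } => *seq,
    }
}

/// Decides how to complete a process from the events received so far.
fn plan_completion(events: &[ProcessEvent]) -> CompletionPlan {
    let mut expected = 1; // assumption: the first event has sequence number 1
    let mut exit_denial: Option<Option<bool>> = None;
    let mut closed = false;
    for event in events {
        if seq_of(event) != expected {
            return CompletionPlan::RetainedOutputRead;
        }
        expected += 1;
        match event {
            ProcessEvent::Exited { sandbox_denied, .. } => exit_denial = Some(*sandbox_denied),
            ProcessEvent::Closed { .. } => closed = true,
            ProcessEvent::Output { .. } => {}
        }
    }
    match (exit_denial, closed) {
        (Some(Some(_)), true) => CompletionPlan::FromEvents,
        (Some(None), _) => CompletionPlan::CompatibilityRead,
        _ => CompletionPlan::RetainedOutputRead,
    }
}
```

Only the first branch avoids the extra round trip, which is where the measured latency improvement comes from.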
richardopenai ·
2026-06-26 18:05:52 -07:00 -
Persist Cloudflare affinity cookies for MCP HTTP (#29516)
[Codex Thread 019ef1f9-36e2-7e91-9337-504f097b9dc1](https://codex-thread-link.openai.chatgpt-team.site/thread/019ef1f9-36e2-7e91-9337-504f097b9dc1)

## Why

Hosted plugin-service Streamable HTTP MCP traffic uses `https://chatgpt.com/backend-api/ps/mcp` and depends on Cloudflare's `__cflb` cookie for load-balancer affinity. The local and exec-server `http/request` path built a fresh reqwest client for each request without installing Codex's existing shared ChatGPT Cloudflare cookie store, so affinity could be lost between calls.

This is an affinity-hardening change motivated by an incident investigation. It does not establish the broader connector-cache incident RCA or claim to fix that incident in full.

## What changed

- Install the existing process-local, strictly allowlisted ChatGPT Cloudflare cookie store on the reqwest client used by `ReqwestHttpClient`.
- Fresh clients now share allowed Cloudflare infrastructure cookies within the process that originates the local or exec-server network request.
- Keep the existing HTTPS ChatGPT-host and Cloudflare-cookie-name restrictions. This does not introduce a general cookie jar or send ChatGPT Cloudflare cookies to unrelated hosts.

## Test coverage

- `codex-client` unit coverage verifies that the existing strict store accepts and returns `__cflb` for HTTPS ChatGPT URLs.
- The exec-server HTTPS integration test sends four independent `http/request` calls through a local TLS-intercepting proxy and verifies that:
  - a `__cflb=west` value received via `Set-Cookie` is sent on the next plugin-service request;
  - a later `Set-Cookie: __cflb=central` replaces the stored value;
  - non-Cloudflare session cookies are discarded;
  - no stored ChatGPT Cloudflare cookie is sent to a non-ChatGPT host.
- `just test -p codex-client`: 38 passed.
- `just test -p codex-exec-server --test chatgpt_cloudflare_affinity`: 1 passed.
- `just bazel-lock-check`: passed.

## Non-goals

- No persistence of ChatGPT auth, account, session, residency, or arbitrary cookies.
- No cookie persistence for third-party MCP servers.
- No special composition of caller-provided `Cookie` headers.
- No plugin-service, connector-cache, Habitat/habicache, routing, redirect, or API-contract changes.
- No broader incident RCA conclusions.
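To make the allowlisting rules concrete, here is a minimal sketch of a strict affinity cookie store. It is not the existing Codex store: `AffinityCookieStore` and both allowlists are hypothetical, and URL parsing is replaced by explicit `scheme` and `host` arguments to keep the example dependency-free.

```rust
use std::collections::HashMap;

// Hypothetical allowlists; the real store's lists are not shown in this PR text.
const ALLOWED_HOSTS: &[&str] = &["chatgpt.com"];
const ALLOWED_COOKIE_NAMES: &[&str] = &["__cflb"];

#[derive(Default)]
struct AffinityCookieStore {
    cookies: HashMap<String, String>,
}

impl AffinityCookieStore {
    fn is_allowed_origin(scheme: &str, host: &str) -> bool {
        scheme == "https" && ALLOWED_HOSTS.contains(&host)
    }

    /// Records one `Set-Cookie` name/value pair. Cookies from other origins or
    /// with other names are dropped; a later value replaces an earlier one.
    fn store(&mut self, scheme: &str, host: &str, name: &str, value: &str) {
        if Self::is_allowed_origin(scheme, host) && ALLOWED_COOKIE_NAMES.contains(&name) {
            self.cookies.insert(name.to_string(), value.to_string());
        }
    }

    /// Builds the `Cookie` header value for a request, or `None` when nothing
    /// may be sent to that origin.
    fn cookie_header(&self, scheme: &str, host: &str) -> Option<String> {
        if !Self::is_allowed_origin(scheme, host) || self.cookies.is_empty() {
            return None;
        }
        let mut pairs: Vec<String> = self
            .cookies
            .iter()
            .map(|(name, value)| format!("{name}={value}"))
            .collect();
        pairs.sort();
        Some(pairs.join("; "))
    }
}
```

Sharing one such store across freshly built clients is what lets the second request carry the affinity cookie set by the first.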
stevenlee-oai ·
2026-06-26 02:23:24 -04:00 -
Test selected capabilities across availability and resume (#30157)
## Why This stack crosses World State, executor skills, selected plugin metadata, MCP processes, connectors, dynamic environments, and resume. This PR adds two end-to-end scenarios that validate those pieces together. Both tests enable `deferred_executor`, so they exercise the real delayed-environment path. ## Scenario 1: availability across turns and resume ```text 1. Start a thread with one selected plugin root bound to E1. 2. E1 is unavailable. - executor skill is absent - selected MCP is absent - connector has no selected-plugin attribution 3. Start E1 and register the same stable environment ID. 4. Start a new turn. - the executor skill appears through World State - its body beats a colliding host skill - the selected MCP tool is advertised and executes inside E1 - the connector is attributed to the selected plugin 5. Start another turn without changing E1. - the MCP PID stays the same, proving runtime reuse 6. Restart app-server and resume the thread. - durable selected-root intent is restored - skills, MCP, and connector attribution are restored - a new MCP PID proves ephemeral process state was rebuilt ``` ## Scenario 2: availability changes inside one turn ```text 1. Start a turn while E1 is unavailable. 2. The first model sample sees no executor skill, MCP, or selected connector. 3. The turn pauses on request_user_input. 4. Start E1 and register it while that same turn is still active. 5. Continue the turn. 6. The very next model sample sees: - the executor skill catalog - the selected MCP tool - selected-plugin connector attribution 7. The model calls the MCP, and its output proves execution happened inside E1. ``` This second scenario specifically protects the aeon-style behavior: capability state is captured again for every sampling step, not only at the next user turn. ## Scope These are integration tests only. 
They do not add a combinatorial matrix for unsupported plugin-file mutation, environment generations, transport disconnects, or delayed `required = true` executor MCPs.
jif ·
2026-06-26 03:11:55 +01:00 -
[codex] Propagate traces through exec-server HTTP (#30117)
Fixes distributed trace continuity across exec-server JSON-RPC HTTP egress by adding an executor client span and injecting its W3C context through a reusable `codex-otel` helper. This preserves the caller trace across core/tool → executor → provider/MCP instead of dropping parentage at raw reqwest. This change covers only the basic HTTP path; the WebSocket path is not included and is still needed for complete end-to-end traces.
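For readers unfamiliar with W3C context injection, the sketch below formats a `traceparent` header value and appends it to an outbound header list. It does not reproduce the `codex-otel` helper; `traceparent` and `inject` are hypothetical names, and headers are modeled as plain string pairs.

```rust
/// Formats a W3C Trace Context `traceparent` value:
/// version "00", 32-hex trace id, 16-hex parent span id, 2-hex flags.
fn traceparent(trace_id: u128, span_id: u64, sampled: bool) -> String {
    let flags: u8 = if sampled { 0x01 } else { 0x00 };
    format!("00-{trace_id:032x}-{span_id:016x}-{flags:02x}")
}

/// Adds the header to an outbound request's header list (hypothetical
/// representation of HTTP headers as name/value pairs).
fn inject(headers: &mut Vec<(String, String)>, trace_id: u128, span_id: u64, sampled: bool) {
    headers.push((
        "traceparent".to_string(),
        traceparent(trace_id, span_id, sampled),
    ));
}
```

The span id injected here would be the executor client span, so the downstream provider or MCP server parents its work under it.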
Tom ·
2026-06-25 23:22:22 +00:00 -
[codex] Observe remote exec-server lifecycle (#27470)
## Summary - Record bounded duration and outcome metrics for remote environment registration and Noise rendezvous connection attempts. - Count reconnects by bounded reason: disconnect, connection failure, or rejected registration. - Trace registration at the owning client boundary without exporting raw environment or registration identifiers. - Replace the stale pre-Noise WebSocket observability design with the current remote transport model. ## Stack Review and land this stack in order: 1. #27466 — trace exec-server JSON-RPC requests 2. #27467 — record bounded connection, request, and process lifecycle metrics 3. #27470 — observe remote registration and Noise rendezvous lifecycle **(this PR)** ## Validation - `just test -p codex-exec-server --lib` (149 passed) - `just test -p codex-cli --test exec_server` (4 passed) - `just argument-comment-lint` - `just bazel-lock-check` - `just fix -p codex-exec-server -p codex-cli` - `just fmt`
richardopenai ·
2026-06-25 13:42:40 -07:00 -
[codex] Retry temporarily offline exec-server recovery (#30098)
## Summary - retry ERS `409 environment_offline` responses inside the existing exec-server recovery loop - keep all other registry conflicts terminal - add focused coverage for both cases ## Root cause When an exec server disconnects and reconnects, the client already starts recovery and calls ERS `/connect`. During the transient executor presence gap, ERS can return `409 environment_offline`. The retry classifier treated every 409 as terminal, so the first response aborted the existing 25-second recovery window before the executor came back online. That then caused active processes to be marked lost. This change classifies only the structured `environment_offline` conflict as retryable. Recovery continues with the existing bounded deadline, exponential backoff, and jitter. ## Validation - `just test -p codex-exec-server client::recovery::tests` — 4 passed - `just fix -p codex-exec-server` — passed - `just fmt` — passed - Full `just test -p codex-exec-server` reached unrelated macOS filesystem-sandbox integration failures because nested `/usr/bin/sandbox-exec` is denied in this environment (`sandbox_apply: Operation not permitted`).
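The classification and backoff described above might look roughly like the following. This is a sketch under stated assumptions: `classify_conflict`, `backoff_delay`, and `within_recovery_window` are hypothetical names, the jitter strategy ("equal jitter") is a guess, and the random source is passed in as a number so the example stays dependency-free.

```rust
use std::time::Duration;

/// The recovery window named in the PR description.
const RECOVERY_WINDOW: Duration = Duration::from_secs(25);

#[derive(Debug, PartialEq)]
enum RetryDecision {
    Retry,
    Terminal,
}

/// Classifies an HTTP 409 response by its structured error code.
/// Only `environment_offline` is retryable; every other conflict is terminal.
fn classify_conflict(error_code: Option<&str>) -> RetryDecision {
    match error_code {
        Some("environment_offline") => RetryDecision::Retry,
        _ => RetryDecision::Terminal,
    }
}

/// Exponential backoff capped at `cap`. `jitter` is a caller-supplied random
/// value in [0, 1]; half of the delay is fixed and half is randomized.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration, jitter: f64) -> Duration {
    let exponential = base.saturating_mul(2u32.saturating_pow(attempt)).min(cap);
    exponential.mul_f64(0.5 + 0.5 * jitter.clamp(0.0, 1.0))
}

/// Recovery keeps retrying only while the bounded deadline has not passed.
fn within_recovery_window(elapsed: Duration) -> bool {
    elapsed < RECOVERY_WINDOW
}
```

The key point is that the classifier inspects the structured code rather than the status alone, so a transient presence gap no longer ends recovery on the first response.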
richardopenai ·
2026-06-25 19:25:04 +00:00 -
[codex] Record exec-server lifecycle metrics (#27467)
## Summary - Record bounded connection, request, and process lifecycle metrics. - Report active gauges from callbacks on every collection, including delta exports. - Serialize active-count updates so concurrent starts and finishes cannot publish stale values. - Serialize process exit, explicit termination, and shutdown through the process registry so exactly one completion result wins. - Keep the implementation small with single-owner RAII guards and one real OTLP/HTTP integration test using the existing `wiremock` dependency. ## Root cause Process exit and session shutdown previously used cloned completion state. That avoided duplicate emission, but it duplicated lifecycle ownership and made the ordering harder to reason about. The process registry mutex already defines the lifecycle ordering, so the final implementation stores the metric guard and termination flag directly on the process entry. Whichever path claims the entry first owns the completion result. Production metric export uses delta temporality. Event-only synchronous gauge recordings disappear after the next collection when no count changes, so active counts now use observable callbacks that report current state on every collection. The cleanup also removes the constant `result="accepted"` connection tag, redundant route and response assertions, a custom HTTP collector, and fallback initialization machinery that did not add behavior. ## Stack Review and land this stack in order: 1. #27466 — trace exec-server JSON-RPC requests 2. #27467 — record bounded connection, request, and process lifecycle metrics **(this PR)** 3. #27470 — observe remote registration and Noise rendezvous lifecycle ## Validation - `just test -p codex-exec-server --lib` (158 passed) - `just test -p codex-cli --test exec_server` (3 passed) - `just test -p codex-otel observable_gauge_is_collected_on_every_delta_snapshot` (1 passed) - `CARGO_BUILD_JOBS=1 just fix -p codex-otel -p codex-exec-server` - `just fmt` - `git diff --check`
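The "whichever path claims the entry first owns the completion result" rule, combined with a guard stored on the entry, can be sketched as follows. All names here (`ProcessRegistry`, `ActiveGuard`, `claim`) are hypothetical, and a plain atomic counter stands in for the observable gauge.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// RAII guard: increments the active count on creation, decrements on drop.
struct ActiveGuard(Arc<AtomicUsize>);

impl ActiveGuard {
    fn new(counter: Arc<AtomicUsize>) -> Self {
        counter.fetch_add(1, Ordering::SeqCst);
        ActiveGuard(counter)
    }
}

impl Drop for ActiveGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ProcessOutcome {
    Exited(i32),
    Terminated,
    Shutdown,
}

/// The guard lives on the entry, so removing the entry ends the measurement.
struct ProcessEntry {
    _guard: ActiveGuard,
}

#[derive(Default)]
struct ProcessRegistry {
    active: Arc<AtomicUsize>,
    entries: Mutex<HashMap<u32, ProcessEntry>>,
}

impl ProcessRegistry {
    fn insert(&self, id: u32) {
        let entry = ProcessEntry { _guard: ActiveGuard::new(self.active.clone()) };
        self.entries.lock().unwrap().insert(id, entry);
    }

    /// Removes the entry under the registry lock. Only the first caller gets
    /// `Some`, so only that caller may publish a completion result.
    fn claim(&self, id: u32, outcome: ProcessOutcome) -> Option<ProcessOutcome> {
        let entry = self.entries.lock().unwrap().remove(&id)?;
        drop(entry); // dropping the entry drops its guard and decrements the count
        Some(outcome)
    }

    /// What an observable gauge callback would report on each collection.
    fn active_count(&self) -> usize {
        self.active.load(Ordering::SeqCst)
    }
}
```

Because exit, explicit termination, and shutdown all go through `claim`, no cloned completion state is needed to avoid duplicate emission.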
richardopenai ·
2026-06-25 11:02:11 -07:00 -
Persist selected capability roots and resolve availability per model step (#29856)
## Why `selectedCapabilityRoots` is durable thread intent: “use this capability root from environment `worker`.” The important product assumption is: > One environment ID always names the same logical executor and stable contents. `worker` does not silently change from executor A to an unrelated executor B. The process-local connection handle for `worker` can still be replaced while Codex is running, though, for example when `environment/add` registers a fresh handle for the same logical environment. The thread should persist only the stable selection. Each model step should pair that selection with the exact ready handle captured for that step. ## The boundary ```text persisted thread intent plugin@1 -> environment "worker" | | capture the current step v model-step view unavailable, or plugin@1 + worker's exact captured ready handle ``` The environment ID is the stable identity and cache key. The `Arc<Environment>` is only a process-local handle retained so consumers of one model step use the same captured environment. It is never persisted and it does not imply different environment contents. ## What changes ### Persist the stable selection Selected roots are written into `SessionMeta` and restored with the thread. Forked subagents inherit the same selections, including bounded-history forks. Only stable data is persisted: root ID, environment ID, and root path. ### Capture readiness together with the exact handle The environment snapshot records: ```rust environment_id -> Some(Arc<Environment>) // ready in this step environment_id -> None // still starting in this step ``` This prevents readiness and execution from coming from different registry snapshots. 
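A minimal sketch of the snapshot idea follows, assuming a registry keyed by environment ID. `Registry`, `StepSnapshot`, `capture`, and `resolve` are hypothetical names, and the `Environment` struct is a stand-in with a label only.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for a process-local environment handle.
struct Environment {
    label: &'static str,
}

/// Per-step view: `Some(handle)` means ready, `None` means still starting.
type StepSnapshot = HashMap<String, Option<Arc<Environment>>>;

#[derive(Default)]
struct Registry {
    handles: HashMap<String, Option<Arc<Environment>>>,
}

impl Registry {
    /// Registers or replaces the handle for a stable environment ID.
    fn register(&mut self, environment_id: &str, handle: Option<Arc<Environment>>) {
        self.handles.insert(environment_id.to_string(), handle);
    }

    /// Captures readiness and the exact handle together for one model step.
    fn capture(&self) -> StepSnapshot {
        self.handles.clone()
    }
}

/// Resolves a selected root's environment against the captured step view.
/// Missing and still-starting environments both resolve to `None`.
fn resolve<'a>(snapshot: &'a StepSnapshot, environment_id: &str) -> Option<&'a Arc<Environment>> {
    snapshot.get(environment_id).and_then(|handle| handle.as_ref())
}
```

Since the snapshot clones the `Arc`, a later `register` call for the same ID does not change what an in-flight step resolves.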
For example: ```text step snapshot: worker -> handle A, ready environment/add: worker -> fresh handle B for the same logical environment current step: plugin@1 still uses captured handle A ``` Without carrying handle A in the snapshot, the resolver could combine “A was ready” with handle B and treat B as ready before it had finished starting. This does not change cache invalidation. Stable capability metadata remains identified by environment ID and capability root. Replacing a process-local handle under the same stable environment ID does not invalidate or rediscover that metadata. ### Resolve availability per model step - A ready captured environment produces resolved roots using its captured handle. - A starting, missing, or failed environment is omitted from that step. - A selected lazy environment that is outside the turn's captured environment set is asked to start, and a later step can observe it as ready. - No capability files are scanned here. Transient transport disconnects remain the remote client's reconnect concern. This PR models initial attachment/readiness; it does not add live socket-connectivity state. ## Example ```text thread selection: plugin@1 -> environment "worker" step 1: worker is starting -> plugin@1 unavailable step 2: worker is ready -> plugin@1 resolves through worker's captured handle step 3: fresh local handle -> current step remains pinned; a later step captures its own view ``` Temporary unavailability does not discard the durable selection. Later PRs can retain stable metadata caches while projecting only currently available capabilities into model-visible World State. ## Compatibility The app-server request shape does not change. Older rollouts without `selected_capability_roots` deserialize to an empty list. ## Stack 1. **This PR:** persist stable selected roots and resolve them through an exact model-step handle. 2. #29960: cache stable skill metadata and project available skills into World State. 3. 
#29946: cache stable plugin declarations and manage the separate live MCP runtime.
jif ·
2026-06-25 17:49:43 +00:00 -
Support OAuth for HTTP MCP servers from selected executor plugins (#28529)
## Why #28522 routes selected-plugin HTTP MCP traffic through the owning executor, but OAuth bootstrap and refresh still used host-local clients. Executor-only servers therefore cannot complete discovery or login through the same network boundary as the MCP connection. ## What changed - adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient` contract - let RMCP own discovery, dynamic registration, PKCE, token exchange, and refresh - route auth status, persisted-token startup, and app-server login through the server runtime while preserving the existing local discovery path - add optional `threadId` to `mcpServer/oauth/login` and echo it in the completion notification - implement RMCP's redirect policy and 1 MiB OAuth response limit over executor HTTP - cover selected-thread OAuth discovery and login through an executor-only route Depends on #28522.
jif ·
2026-06-25 10:31:17 +01:00 -
Follow directory symlinks in filesystem walks (#29844)
Stack 3 of 3. Stacked on #29842.

## What changes

Adds an opt-in `followDirectorySymlinks` setting to `fs/walk`. When enabled, the walk follows directory symlinks but continues to ignore symlinked files. Canonical directory identities prevent symlink cycles, while normal paths keep their existing spelling.

Environment skill discovery enables the setting so symlinked skill directories continue to work with the new single-RPC scan.
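One way to picture cycle prevention through canonical directory identities is the sketch below. It is not the `fs/walk` implementation: the function names are hypothetical, limits and sandbox routing are omitted, and the canonical path from `fs::canonicalize` is assumed to be an adequate identity.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Walks `root`, following directory symlinks and ignoring symlinked files.
/// Returned paths keep the spelling under which they were reached.
fn walk_following_dir_symlinks(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut visited = HashSet::new();
    let mut out = Vec::new();
    visit(root, &mut visited, &mut out)?;
    out.sort();
    Ok(out)
}

fn visit(dir: &Path, visited: &mut HashSet<PathBuf>, out: &mut Vec<PathBuf>) -> io::Result<()> {
    // The canonical path is the directory's identity; seeing it again means
    // this directory was already walked, possibly through a symlink cycle.
    if !visited.insert(fs::canonicalize(dir)?) {
        return Ok(());
    }
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_symlink = fs::symlink_metadata(&path)?.file_type().is_symlink();
        // `fs::metadata` follows symlinks, so this reports the target's type.
        let target_is_dir = fs::metadata(&path).map(|m| m.is_dir()).unwrap_or(false);
        if target_is_dir {
            out.push(path.clone());
            visit(&path, visited, out)?;
        } else if !is_symlink {
            out.push(path);
        }
        // Symlinked files and dangling links are skipped.
    }
    Ok(())
}
```

A symlink that points back at an ancestor is still reported as an entry, but the walk does not descend into it a second time.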
jif ·
2026-06-24 20:52:36 +01:00 -
[codex] Trace exec-server JSON-RPC requests (#27466)
## Why Exec-server JSON-RPC calls can cross local and remote transports, but trace context stopped at the RPC boundary. That made client and server work difficult to correlate when diagnosing latency or failures. ## What changed - Propagate the current W3C trace context on outbound JSON-RPC requests. - Parent inbound request spans from received trace context. - Record the received JSON-RPC method on server spans and keep each span open through response enqueue. - Add only the OTEL dependencies required by the exec-server crate. ## Stack Review and land this stack in order: 1. #27466 — trace exec-server JSON-RPC requests **(this PR)** 2. #27467 — record bounded connection, request, and process lifecycle metrics 3. #27470 — observe remote registration and Noise rendezvous lifecycle ## Validation - `just test -p codex-exec-server --lib` (153 passed) - `just bazel-lock-check` - `just fix -p codex-exec-server`
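On the inbound side, parenting a server span from received context starts with parsing the `traceparent` value. The sketch below is a simplified, hypothetical parser: it handles only version `00`, accepts uppercase hex (which the W3C format does not), and says nothing about how the value is carried in the JSON-RPC envelope.

```rust
#[derive(Debug, PartialEq)]
struct RemoteContext {
    trace_id: u128,
    parent_span_id: u64,
    sampled: bool,
}

/// Parses a version-00 `traceparent` value into its parts.
/// Returns `None` for malformed input, in which case a server would
/// typically start a fresh root span instead.
fn parse_traceparent(value: &str) -> Option<RemoteContext> {
    let parts: Vec<&str> = value.split('-').collect();
    if parts.len() != 4 || parts[0] != "00" {
        return None;
    }
    if parts[1].len() != 32 || parts[2].len() != 16 || parts[3].len() != 2 {
        return None;
    }
    if !parts[1..].iter().all(|p| p.bytes().all(|b| b.is_ascii_hexdigit())) {
        return None;
    }
    let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
    let parent_span_id = u64::from_str_radix(parts[2], 16).ok()?;
    let flags = u8::from_str_radix(parts[3], 16).ok()?;
    // All-zero trace ids and span ids are invalid in the W3C format.
    if trace_id == 0 || parent_span_id == 0 {
        return None;
    }
    Some(RemoteContext {
        trace_id,
        parent_span_id,
        sampled: flags & 0x01 == 0x01,
    })
}
```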
richardopenai ·
2026-06-24 12:50:18 -07:00 -
Add a bounded filesystem walk RPC (#29841)
Stack 1 of 3. Follow-ups: #29842 and #29844.

## What changes

Adds a general bounded `fs/walk` operation to the exec server. The operation returns file and directory entries plus recoverable per-path errors. It skips symlinks, preserves the existing filesystem sandbox routing, and enforces depth, directory, entry, and response-size limits.

This PR only defines and wires the filesystem operation. It does not change any callers yet.
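A bounded walk of this kind could be sketched as below. This illustration covers only the depth and entry limits plus per-path error collection; the directory and response-size limits, sandbox routing, and the RPC types are omitted, and every name (`WalkLimits`, `WalkResult`, `bounded_walk`) is hypothetical.

```rust
use std::fs;
use std::path::{Path, PathBuf};

struct WalkLimits {
    /// Children of the root are at depth 1; directories at `max_depth` are
    /// listed but not descended into.
    max_depth: usize,
    max_entries: usize,
}

#[derive(Default)]
struct WalkResult {
    entries: Vec<PathBuf>,
    /// Recoverable per-path errors; the walk continues past them.
    errors: Vec<(PathBuf, String)>,
    truncated: bool,
}

fn bounded_walk(root: &Path, limits: &WalkLimits) -> WalkResult {
    let mut result = WalkResult::default();
    let mut stack = vec![(root.to_path_buf(), 1usize)];
    while let Some((dir, depth)) = stack.pop() {
        let reader = match fs::read_dir(&dir) {
            Ok(reader) => reader,
            Err(err) => {
                result.errors.push((dir, err.to_string()));
                continue;
            }
        };
        for entry in reader {
            let path = match entry {
                Ok(entry) => entry.path(),
                Err(err) => {
                    result.errors.push((dir.clone(), err.to_string()));
                    continue;
                }
            };
            let file_type = match fs::symlink_metadata(&path) {
                Ok(meta) => meta.file_type(),
                Err(err) => {
                    result.errors.push((path, err.to_string()));
                    continue;
                }
            };
            if file_type.is_symlink() {
                continue; // symlinks are skipped in this variant
            }
            if result.entries.len() >= limits.max_entries {
                result.truncated = true;
                return result;
            }
            if file_type.is_dir() && depth < limits.max_depth {
                stack.push((path.clone(), depth + 1));
            }
            result.entries.push(path);
        }
    }
    result
}
```

Returning a `truncated` flag instead of an error is a design choice of this sketch; the PR text does not say how the real operation reports a reached limit.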
jif ·
2026-06-24 16:05:43 +01:00 -
test: add app-server auto environment helper (#29746)
## Why Start moving towards app-server tests defaulting to running against remote & foreign OS executors. To do so we need a point of indirection similar to core integration tests' `build_with_auto_env`, but with the flexibility of letting tests control environment registration if they need to. ## What This adds: - `TestAppServer::new_with_auto_env()` for constructing an app server with a default environment defined by the test runner (e.g. bazel) - `TestAppServer::auto_env_params()` for tests to easily acquire turn env params tailored to the automatic environment - `TestAppServer::send_thread_start_request_with_auto_env()` to make it easy for tests to start a thread using the automatic environment The above methods all fail if the test calling them has set up an environment where the automatic environment configuration conflicts with test-created state. ## Validation Adds a couple of basic smoke tests to the app-server test suite. Follow-ups will migrate more tests to use it.
Adam Perry @ OpenAI ·
2026-06-24 01:06:29 +00:00 -
protocol: separate app and exec RPC ownership (#29714)
## Why The app-server and exec-server expose separate JSON-RPC APIs, but exec-server currently sources its serialized protocol and envelope types through app-server-oriented code. Giving each API an explicit owner makes the crate boundary legible without introducing shared generic envelopes. ## What changed - Added `codex-exec-server-protocol` to own exec DTOs, process IDs, and JSON-RPC envelopes. - Updated exec-server clients, transports, handlers, and tests to use the new crate. - Exposed app-server's existing JSON-RPC types through a public `rpc` module while retaining root re-exports. - Preserved existing wire shapes, including exec `PathUri` behavior. ## Stack This is PR 1 of 6. Next: [PR #29721](https://github.com/openai/codex/pull/29721), which moves auth mode below the app wire boundary. ## Validation - Exec-server protocol and server coverage passed in the focused protocol test runs. - App-server protocol schema fixtures passed.
Adam Perry @ OpenAI ·
2026-06-23 22:37:31 +00:00 -
path-uri: remove legacy path deserialization (#29158)
## Why I'd originally added `PathUri` legacy path deserialization thinking we'd want it for having `PathUri` in public app-server APIs. Since then we've added `LegacyAppPathString` to handle the messy conversions that we need for backcompat. It's confusing for `PathUri` to support deserializing legacy paths when we don't yet want to actually expose app-server callers or rollout storage to the new URI format. Stacked on top of #29472 to avoid breaking compatibility in case those types ended up stored somewhere for someone. ## What changed - Parse deserialized `PathUri` values exclusively as valid `file:` URIs. - Replace legacy acceptance coverage with rejection coverage for top-level filesystem paths and sandbox working directories. - Serialize CWDs in hand-built exec-server process requests as `PathUri` values.
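The stricter parsing rule can be illustrated with a deliberately simplified validator. This is not the `PathUri` implementation: `FileUri`, `ParseError`, and `parse_file_uri` are hypothetical, percent-decoding is skipped, and only the empty-authority form `file:///path` is accepted.

```rust
#[derive(Debug, PartialEq)]
struct FileUri(String);

#[derive(Debug, PartialEq)]
enum ParseError {
    /// Input is a bare filesystem path or uses another scheme.
    NotFileUri,
    /// Input has a `file://` prefix but no absolute path after it.
    MissingAbsolutePath,
}

/// Accepts only `file:` URIs; legacy path strings are rejected rather than
/// silently converted.
fn parse_file_uri(input: &str) -> Result<FileUri, ParseError> {
    let rest = input.strip_prefix("file://").ok_or(ParseError::NotFileUri)?;
    if !rest.starts_with('/') {
        return Err(ParseError::MissingAbsolutePath);
    }
    Ok(FileUri(input.to_string()))
}
```

Rejecting `/workspace` outright, instead of guessing a conversion, is what the rejection coverage mentioned above exercises.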
Adam Perry @ OpenAI ·
2026-06-23 21:47:00 +00:00 -
[codex] Report the exec-server working directory (#29666)
## Summary - add the exec-server working directory to `environment/info` as an optional `PathUri` - populate it from the executor process's current directory - preserve compatibility with older responses that omit `cwd` ## Why Remote clients currently have no executor-native default working directory. This forces callers such as app-server-backend to assume `/workspace`, which fails for laptop environments. Reporting the cwd alongside the detected shell lets clients use the path convention and location of the actual executor. ## Impact This is backward-compatible: the new response field is optional, and clients can continue handling responses from older exec servers. A follow-up app-server-backend change will consume the value for cwd-less `command/exec` requests. ## Validation - `just test -p codex-exec-server` (275 passed, 2 skipped)
Rasmus Rygaard ·
2026-06-23 13:39:13 -07:00 -
[codex] Preserve proxy state for filesystem sandbox helpers (#29671)
## Why Filesystem helpers intentionally run with a minimal environment that excludes proxy variables. After filesystem operations started using the Windows sandbox wrapper, the wrapper derived an empty proxy configuration from that helper environment and compared it with the persistent sandbox setup marker. When the marker contained proxy ports, every filesystem operation appeared to require a firewall update, which could launch elevated setup, show a UAC or loader dialog, and fail operations such as `apply_patch` with error 1223. Filesystem helpers do not use network access, so they should preserve the proxy/firewall state established by normal sandboxed process launches. ## What changed - Add an explicit Windows sandbox proxy-settings mode for reconciling or preserving persistent proxy state. - Use preserve mode for filesystem helpers while normal process launches continue to reconcile proxy settings from their environment. - Carry the selected proxy state consistently through setup validation, elevated setup, and non-elevated ACL refreshes. - Cover wrapper argument propagation and marker-derived proxy preservation. ## Validation - `cargo build -p codex-cli --bin codex` - `just test -p codex-windows-sandbox preserving_proxy_settings_uses_the_existing_marker` - `just test -p codex-windows-sandbox windows_wrapper_args_round_trip` - `just test -p codex-windows-sandbox setup_request_prefers_explicit_proxy_settings` - `just test -p codex-sandboxing transform_for_direct_spawn_windows` - `just test -p codex-exec-server fs_sandbox::tests` - Ran the same sandboxed `fs/writeFile` reproduction against published `0.142.0-alpha.6` and the new CLI. The published CLI launched elevated setup and failed with `ShellExecuteExW ... 1223`; the new CLI completed without elevation. Related to #28359.
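The difference between reconciling and preserving proxy state can be sketched as a small comparison against the persistent marker. The types and functions below are hypothetical and model only the decision, not the Windows firewall or elevation calls.

```rust
#[derive(Debug, Clone, PartialEq, Default)]
struct ProxyPorts(Vec<u16>);

enum ProxySettingsMode {
    /// Derive proxy ports from the launching environment (normal launches).
    Reconcile(ProxyPorts),
    /// Keep whatever the persistent setup marker records (filesystem helpers).
    Preserve,
}

/// The proxy state the sandbox setup should end up with.
fn effective_ports(mode: &ProxySettingsMode, marker: &ProxyPorts) -> ProxyPorts {
    match mode {
        ProxySettingsMode::Reconcile(from_env) => from_env.clone(),
        ProxySettingsMode::Preserve => marker.clone(),
    }
}

/// A firewall update (and possibly elevated setup) is needed only when the
/// effective state differs from the marker.
fn needs_firewall_update(mode: &ProxySettingsMode, marker: &ProxyPorts) -> bool {
    effective_ports(mode, marker) != *marker
}
```

Under this model, a helper that reconciles from an environment without proxy variables always differs from a marker with ports, which is the spurious update the PR removes.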
iceweasel-oai ·
2026-06-23 12:29:46 -07:00 -
Prepare managed network sandbox context (#29456)
## Why Managed network configures commands to use local HTTP and SOCKS proxies. For commands delegated to the exec server, the proxy environment and the sandbox policy were prepared separately. On macOS, that meant a command could receive `HTTPS_PROXY=http://127.0.0.1:43123` while Seatbelt still denied access to port `43123`. ## What changed `NetworkProxy` now prepares the command environment and sandbox context together from the same runtime snapshot: ```text Prepared managed network ├── command environment: HTTPS_PROXY=http://127.0.0.1:43123 └── sandbox context: allow outbound to 127.0.0.1:43123 ``` That context travels with remote exec requests. The exec server preserves the managed proxy and CA environment, and macOS Seatbelt allows only the prepared loopback proxy ports without enabling broad network access or local binding. The protocol field is optional and the existing enforcement flag remains in place, preserving compatibility with callers that do not send the new context.
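Deriving both outputs from one snapshot is the core of the fix, and a minimal sketch of that shape follows. The struct and function names are hypothetical, and the `ALL_PROXY` variable with a `socks5h` scheme is an assumption; the PR text only shows `HTTPS_PROXY`.

```rust
use std::collections::BTreeMap;

/// Hypothetical runtime snapshot of the local proxy listeners.
struct ProxySnapshot {
    http_port: u16,
    socks_port: u16,
}

/// Command environment and sandbox context prepared together.
struct PreparedManagedNetwork {
    env: BTreeMap<String, String>,
    allowed_loopback_ports: Vec<u16>,
}

fn prepare(snapshot: &ProxySnapshot) -> PreparedManagedNetwork {
    let mut env = BTreeMap::new();
    env.insert(
        "HTTPS_PROXY".to_string(),
        format!("http://127.0.0.1:{}", snapshot.http_port),
    );
    // Assumption: the SOCKS proxy is exposed through ALL_PROXY.
    env.insert(
        "ALL_PROXY".to_string(),
        format!("socks5h://127.0.0.1:{}", snapshot.socks_port),
    );
    PreparedManagedNetwork {
        env,
        // The sandbox allows outbound traffic to exactly the ports the
        // environment points at, because both come from the same snapshot.
        allowed_loopback_ports: vec![snapshot.http_port, snapshot.socks_port],
    }
}
```

Because nothing reads the proxy ports a second time, the environment and the sandbox policy cannot disagree about which port is in use.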
jif ·
2026-06-23 20:07:09 +01:00 -
path-uri: clarify host-native path conversion (#29501)
## Why

Downstream refactors are producing confusing code because this functionality has a very generic name. Encoding the specific conversion approach in the method name makes it clearer.

## What

Rename `PathUri::from_path` to `PathUri::from_host_native_path` and update its Rust call sites.
Adam Perry @ OpenAI ·
2026-06-23 00:02:33 +00:00 -
Report remote sandbox denials semantically (#29424)
## Why #29113 moved remote sandbox setup and enforcement to the exec server. That gives the executor ownership of the platform-specific work: a Linux executor chooses and runs a Linux sandbox even when the Codex orchestrator is running on macOS or Windows. It also means the orchestrator no longer knows which concrete sandbox the executor selected. When that sandbox blocks a remote command, the orchestrator currently sees only a failed process and can treat the denial as an ordinary command failure. The existing sandbox approval and retry path is then skipped. This PR lets the executor report one portable fact: > This command probably failed because the executor sandbox blocked it. The executor keeps its concrete sandbox type private. The protocol sends only the semantic result. ## Example Suppose a local macOS Codex session asks a Linux devbox to write outside the allowed workspace. Before this PR: ```text Linux sandbox blocks the write -> remote process exits with "Permission denied" -> local orchestrator sees an ordinary command failure -> the normal sandbox approval and retry path can be skipped ``` With this PR: ```text Linux sandbox blocks the write -> executor reports sandboxDenied: true -> unified exec returns UnifiedExecError::SandboxDenied -> the existing approval prompt is shown -> an approved retry runs through the existing unsandboxed retry path ``` ## What changes ### The executor remembers its selected sandbox The prepared remote process now retains the executor-selected `SandboxType`. This value never crosses the executor boundary. Commands started without a sandbox retain `SandboxType::None` and are never reported as sandbox denials. ### The executor uses the existing denial heuristic The existing local denial heuristic moves from `codex-core` into the shared `codex-sandboxing` crate. When a sandboxed remote process exits, the executor: 1. waits the same short output grace period used by local unified exec; 2. 
reads the output currently available in the existing retained output buffer; 3. runs the existing heuristic using the exit code and common denial messages; 4. stores the yes/no result before publishing the process exit. This deliberately matches the old local unified-exec behavior. It does not add a new streaming classifier, another output buffer, or stronger output-retention guarantees. ### The protocol reports a portable boolean `process/read` gains `sandboxDenied`: ```json { "exited": true, "exitCode": 1, "closed": false, "sandboxDenied": true } ``` The field defaults to `false` when an older executor omits it. The response does not expose the executor sandbox implementation or executor-native paths. ### Unified exec uses the existing error path The exec-server client carries `sandboxDenied` into the unified process state. If it is true, unified exec returns the existing `SandboxDenied` error instead of trying to classify remote output using an orchestrator-side sandbox type. Remote process exit remains visible as soon as the process exits. This PR does not wait for stdout or stderr to close and does not change the existing process lifecycle. ## Scope This PR is intentionally limited to matching the existing local unified-exec behavior for the initial command execution path. It does not add: - incremental denial tracking across the full output stream; - new denial handling for commands completed later through `write_stdin`; - new guarantees for preserving the semantic flag during the narrow reconnect-recovery race. Those can be considered separately if the same behavior is added for local execution. 
## Test coverage One remote end-to-end integration test covers the complete intended flow: ```text remote read-only sandbox -> denied write -> executor reports the denial -> Codex requests approval -> user approves -> retry succeeds on the remote executor ``` Existing lifecycle coverage continues to verify that remote process exit is reported before late output streams close.
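A client-side reading of the new field might look like the sketch below. It is an illustration only, not the codex-rs code: `ReadResponse`, `ExecOutcome`, and `classify` are hypothetical names, and `Option<bool>` models an older executor that omits `sandboxDenied`.

```rust
/// Hypothetical, simplified view of a `process/read` response.
struct ReadResponse {
    exited: bool,
    exit_code: Option<i32>,
    /// `None` models an older executor that omits `sandboxDenied`.
    sandbox_denied: Option<bool>,
}

#[derive(Debug, PartialEq)]
enum ExecOutcome {
    Running,
    Exited { exit_code: i32 },
    SandboxDenied { exit_code: i32 },
}

/// Map the executor's semantic result onto an outcome without inspecting output.
fn classify(resp: &ReadResponse) -> ExecOutcome {
    if !resp.exited {
        return ExecOutcome::Running;
    }
    // Assumption for this sketch: a missing exit code is reported as -1.
    let exit_code = resp.exit_code.unwrap_or(-1);
    // A missing flag defaults to `false`, as described for older executors.
    if resp.sandbox_denied.unwrap_or(false) {
        ExecOutcome::SandboxDenied { exit_code }
    } else {
        ExecOutcome::Exited { exit_code }
    }
}
```

The point of the shape is that the orchestrator never needs the executor's concrete sandbox type: the boolean alone selects the approval and retry path.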
jif ·
2026-06-22 19:33:28 +02:00 -
Apply sandbox intent inside remote exec servers (#29113)
## Why PR #29108 lets the orchestrator send sandbox intent with `process/start` without wrapping the command for its own operating system. This PR completes that boundary by making the executor interpret and enforce the intent using its own filesystem paths and sandbox implementation. For example, a macOS TUI targeting a Linux devbox sends `/bin/bash -lc pwd`. The Linux executor turns that into its own `codex-linux-sandbox ... /bin/bash -lc pwd` launch. ## What changes - Keep `process/start` unchanged when no sandbox intent is present. - Convert sandbox `PathUri` values into native paths on the executor. - Bind symbolic `:workspace_roots` permissions to the executor's native sandbox cwd. - Select the sandbox implementation on the executor and wrap the original command immediately before spawning it. - Reject sandbox-required execution before spawning when the executor cannot enforce the intent. - Pass exec-server runtime paths into process creation so Linux can locate `codex-linux-sandbox`. The boundary is therefore: ```text orchestrator executor original argv + sandbox intent -> select and enforce local sandbox ``` This PR intentionally treats a denied remote command as an ordinary command failure. Draft follow-up #29424 carries a semantic `sandboxDenied` result back to unified exec for the existing approval and retry flow. ## Platform scope Linux and macOS use their existing direct-spawn sandbox transforms. Windows sandboxed remote process launch is intentionally unsupported in this PR. The current Windows direct-spawn wrapper does not correctly preserve arbitrary argv, TTY behavior, or pass the full child environment out of band. The executor rejects the request instead of running it incorrectly or unsandboxed. ## Known follow-ups - The transported permission profile can still contain orchestrator-materialized helper or explicit paths. A `TODO(jif)` marks where the executor boundary should receive pre-host-materialization permission intent. 
- The sandbox wrapper currently replaces a requested custom inner `arg0`. A `TODO(jif)` marks where this must be preserved or rejected explicitly. - Draft PR #29424 contains the deferred sandbox-denial classification and approval/retry behavior. ## Rollout assumption This executor-sandbox stack is unreleased and its client and executor are expected to move together. This PR does not add mixed-version negotiation with older exec servers.
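The executor-side decision described above can be sketched roughly as follows. All names here (`ExecutorOs`, `prepare_launch`, `wrapper_prefix`) are hypothetical, and `wrapper_prefix` stands in for whatever wrapper argv the executor actually builds; the real wrapper arguments are not shown in this description.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ExecutorOs {
    Linux,
    MacOs,
    Windows,
}

/// Hypothetical launch preparation on the executor.
/// Returns the argv to spawn, or an error when the intent cannot be enforced.
fn prepare_launch(
    argv: &[String],
    sandbox_requested: bool,
    os: ExecutorOs,
    wrapper_prefix: &[String],
) -> Result<Vec<String>, String> {
    // No sandbox intent: the request is passed through unchanged.
    if !sandbox_requested {
        return Ok(argv.to_vec());
    }
    match os {
        // Reject instead of running incorrectly or unsandboxed.
        ExecutorOs::Windows => {
            Err("sandboxed remote process launch is unsupported on Windows".to_string())
        }
        ExecutorOs::Linux | ExecutorOs::MacOs => {
            if wrapper_prefix.is_empty() {
                return Err("executor cannot enforce sandbox intent".to_string());
            }
            // Wrap the original command immediately before spawning it.
            let mut wrapped = wrapper_prefix.to_vec();
            wrapped.extend_from_slice(argv);
            Ok(wrapped)
        }
    }
}
```

Failing before the spawn keeps the "sandbox required" guarantee intact: there is no code path in this sketch where a sandbox-requested command runs without a wrapper.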
jif ·
2026-06-22 12:45:37 +02:00 -
Test pipelined scalar exec-server requests (#29325)
## Summary This adds focused coverage for the simpler same-connection scalar request path. The exec-server connection already supports multiple in-flight JSON-RPC scalar requests on one connection. This test locks in that behavior by sending two normal requests before reading either response, without adding a batch frame or any new API surface. ## What changed - Added a processor-level test that initializes an exec-server connection. - The test sends two scalar `environment/info` requests back-to-back on the same connection. - It verifies that both responses come back on the same connection, matched by request id. Checked locally with: - `just test -p codex-exec-server connection_accepts_pipelined_scalar_requests`
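Matching pipelined responses by request id can be pictured with a small pending-request table. This is a minimal sketch under assumed names (`PendingRequests`, `sent`, `received`); it does not reproduce the exec-server processor or its test.

```rust
use std::collections::HashMap;

/// Hypothetical pending-request table keyed by JSON-RPC request id.
#[derive(Default)]
struct PendingRequests {
    methods: HashMap<u64, String>,
}

impl PendingRequests {
    /// Record a request that was written to the connection but not yet answered.
    fn sent(&mut self, id: u64, method: &str) {
        self.methods.insert(id, method.to_string());
    }

    /// Match a response to its request; unknown or already answered ids return `None`.
    fn received(&mut self, id: u64) -> Option<String> {
        self.methods.remove(&id)
    }

    /// Number of requests still waiting for a response.
    fn in_flight(&self) -> usize {
        self.methods.len()
    }
}
```

Because lookup is by id rather than by arrival order, the two responses may come back in either order without confusing the caller.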
jif ·
2026-06-21 13:40:51 +02:00 -
Carry sandbox intent to remote exec servers (#29108)
## What changed PR #29099 stopped sending the orchestrator's concrete sandbox wrapper to a remote exec-server. Remote commands now arrive as plain native argv. This PR adds the next piece: Codex also sends portable sandbox intent next to that plain argv. For a remote unified-exec command, the request can now include: - the canonical permission profile before local workspace-root materialization - the sandbox cwd and workspace roots as `PathUri` values - Windows sandbox settings - the legacy Landlock setting - whether managed networking must be enforced The important part is that symbolic entries such as `:workspace_roots` stay symbolic while crossing the boundary. The executor can then bind them to its own workspace-root paths instead of receiving orchestrator-local absolute paths. The data travels through `ExecRequest` into `ExecParams`. Older exec-servers can still deserialize requests because the new fields have defaults. ## Why The orchestrator should not decide how another machine implements sandboxing. For example: - a local macOS Codex would normally build a Seatbelt command - a remote Linux executor needs a Linux sandbox command instead The orchestrator now sends the plain command plus the policy it intended to enforce. A later PR can let the exec-server choose and build the correct sandbox for its own operating system. ## Important detail This keeps the portable intent separate from the local `SandboxType`. `SandboxType::None` is ambiguous: - it can mean the command was explicitly approved to run without a sandbox - it can also mean the orchestrator host has no concrete sandbox implementation available Those cases are different for remote execution. This PR adds `sandbox_requested` so an executor can still receive sandbox intent when the orchestrator cannot build a local wrapper. Explicit unsandboxed retries still send no sandbox context. ## Behavior today This PR only transports the intent. 
The exec-server accepts the new fields but does not apply them yet. Remote commands therefore remain unsandboxed after this PR, just as they are after PR #29099. ## Follow-up The next PR will make exec-server read this portable intent, bind symbolic workspace permissions to executor-native roots, choose the sandbox for its own operating system, build the wrapper locally, and then spawn the command.
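The distinction between `sandbox_requested` and the local `SandboxType` can be sketched as a small decision function. Everything below is hypothetical (`LocalDecision`, its fields other than `sandbox_requested`, and `should_send_sandbox_intent`); it only illustrates the rule stated above.

```rust
/// Hypothetical summary of what the orchestrator knows before `process/start`.
struct LocalDecision {
    /// The policy asks for sandboxing, regardless of what this host can build.
    sandbox_requested: bool,
    /// The user explicitly approved an unsandboxed retry.
    approved_unsandboxed_retry: bool,
    /// Whether the orchestrator host could build a concrete local wrapper.
    local_wrapper_available: bool,
}

/// Decide whether portable sandbox intent accompanies the plain argv.
fn should_send_sandbox_intent(decision: &LocalDecision) -> bool {
    // Deliberately ignored: a missing local wrapper must not erase the intent,
    // because the executor may still be able to enforce it.
    let _ = decision.local_wrapper_available;
    // Explicit unsandboxed retries send no sandbox context.
    if decision.approved_unsandboxed_retry {
        return false;
    }
    decision.sandbox_requested
}
```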
jif ·
2026-06-21 12:33:21 +02:00 -
[3/3] app-server: configure environment connection timeout (#29025)
## Why Remote environments registered through `environment/add` currently use the fixed 10-second WebSocket connection timeout. Slow-starting executors need a caller-selected connection window, but this should not add retry policy or couple exec-server behavior to Core’s `deferred_executor` feature. Make the timeout an optional part of the existing experimental request. Existing clients continue using the current default, while callers that know an executor may take longer can request a larger window explicitly. Depends on #28683. ## What changed - Add optional `connectTimeoutMs` to `EnvironmentAddParams` and document it in the app-server README. - Pass the optional timeout through `EnvironmentRequestProcessor` into one `EnvironmentManager::upsert_environment()` path; the manager applies the existing default when it is omitted. - Preserve the existing single-attempt lifecycle. The configured value controls WebSocket connection and handshake time for both initial connection and later reconnects; initialization retains its separate timeout. - Add an app-server integration test that sends the real JSON-RPC request and verifies a stalled handshake observes the requested timeout. ## Test plan - `just test -p codex-app-server-protocol` - `just test -p codex-exec-server` - `just test -p codex-app-server environment_add_applies_connect_timeout` ## Rollout This is additive and does not enable `deferred_executor`. Callers should send a non-default timeout only after a compatible app-server is deployed; omitted or `null` values retain the existing 10-second default.
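The defaulting rule for `connectTimeoutMs` is simple enough to sketch directly. The function and constant names are hypothetical; only the 10-second default and the optional millisecond field come from the description above.

```rust
use std::time::Duration;

/// Existing default described above: 10 seconds.
const DEFAULT_CONNECT_TIMEOUT: Duration = Duration::from_secs(10);

/// Hypothetical helper: an omitted or `null` `connectTimeoutMs` keeps the default.
fn effective_connect_timeout(connect_timeout_ms: Option<u64>) -> Duration {
    connect_timeout_ms
        .map(Duration::from_millis)
        .unwrap_or(DEFAULT_CONNECT_TIMEOUT)
}
```

Keeping the default in one place is what lets existing clients, which never send the field, behave exactly as before.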
sayan-oai ·
2026-06-19 05:27:45 +00:00 -
[1/3] core: add remote environment connection lifecycle (#28674)
## Why Remote environments can be registered before their exec-server is first used. Starting the connection at registration time uses that startup window, while sharing one startup result prevents background work and capability calls from opening competing connections. Keep initial startup simple: each environment makes one connection attempt using its configured transport timeout. A failed initial attempt is final for that environment, while an environment that disconnects after connecting can still recover on a later operation. ## What changed - Start URL and Noise environments in the background when they are added to `EnvironmentManager`. Provider snapshots are fully validated before connection work begins. - Share one initial connection attempt and its saved result across metadata, process, filesystem, and HTTP callers. - Keep configured stdio environments lazy until first use so registration does not launch a process. - Tie background startup work to the environment lifetime so replacing or dropping an environment cancels unfinished work. - After an established client disconnects, share one fresh connection attempt across concurrent callers. A failed attempt fails the current operation without permanently preventing a later attempt. - Store the shared lazy client directly on `Environment` and expose small methods for starting, observing, and awaiting startup. ## Test plan - `just test -p codex-exec-server` - `just test -p codex-app-server turn_start_resolves_sticky_thread_local_environment_and_turn_overrides`
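Sharing one initial connection attempt and its saved result can be approximated with `std::sync::OnceLock`. This is a synchronous sketch with hypothetical names (`Environment` here is a stand-in, and the client is just a `String`); the actual implementation is presumably asynchronous and also handles reconnects after an established client disconnects, which this sketch does not model.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for an environment with a shared startup result.
struct Environment {
    startup: OnceLock<Result<String, String>>,
    attempts: AtomicUsize,
}

impl Environment {
    fn new() -> Self {
        Self {
            startup: OnceLock::new(),
            attempts: AtomicUsize::new(0),
        }
    }

    /// Every caller observes the result of the single initial attempt.
    /// A failed initial attempt is final for this environment.
    fn client(&self, connect: impl FnOnce() -> Result<String, String>) -> Result<String, String> {
        self.startup
            .get_or_init(|| {
                self.attempts.fetch_add(1, Ordering::SeqCst);
                connect()
            })
            .clone()
    }
}
```

Metadata, process, filesystem, and HTTP callers would all go through the same `client` call, so none of them can open a competing connection.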
sayan-oai ·
2026-06-18 21:50:15 -07:00 -
core: load AGENTS.md from foreign environments (#28958)
## Why Make it possible to load AGENTS.md from remote exec-servers whose OS differs from the app-server's. ## What - keep `AGENTS.md` discovery and provenance as `PathUri`, with root-aware parent and ancestor traversal - expose lifecycle instruction sources as legacy app-server path strings in events while retaining `PathUri` internally - preserve and test mixed POSIX and Windows paths in model context and TUI status output - cover remote Windows loading end to end by seeding the Wine prefix through host filesystem APIs - fix a bug in `PathUri`'s `parent()` implementation that erased Windows drive letters
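Root-aware parent traversal can be illustrated on the path component of a `file:` URI. The helpers below are hypothetical and operate on plain strings such as `/C:/repo/src` or `/home/me`; they are not the `PathUri` implementation, only a sketch of why the drive root must be treated as part of the root.

```rust
/// Hypothetical: the root of a URI path, "/C:/" for a Windows drive, else "/".
fn root_of(path: &str) -> &str {
    let b = path.as_bytes();
    if b.len() >= 4 && b[0] == b'/' && b[1].is_ascii_alphabetic() && b[2] == b':' && b[3] == b'/' {
        &path[..4]
    } else {
        "/"
    }
}

/// Hypothetical root-aware parent: never strips the drive letter.
fn parent(path: &str) -> Option<String> {
    if !path.starts_with('/') {
        return None;
    }
    let root = root_of(path);
    let rest = path[root.len()..].trim_end_matches('/');
    if rest.is_empty() {
        return None; // already at the root
    }
    match rest.rfind('/') {
        Some(idx) => Some(format!("{}{}", root, &rest[..idx])),
        None => Some(root.to_string()),
    }
}
```

Ancestor traversal for `AGENTS.md` discovery is then repeated application of `parent` until it returns `None`, which happens at `/C:/` on Windows paths and at `/` on POSIX paths.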
Adam Perry @ OpenAI ·
2026-06-18 15:06:23 -07:00 -
[codex] Initialize exec-server OpenTelemetry at startup (#25019)
## Summary - Initialize stderr tracing and the configured OpenTelemetry provider for local and remote `codex exec-server` startup. - Instrument the local and remote server entrypoints with a root runtime span. - Keep raw Noise environment, registration, and stream identifiers out of exported spans while preserving them in local debug events. - Keep telemetry setup in a focused CLI module instead of growing the top-level command entrypoint. ## Stack - Previous: none (`#27058` has merged) - Next: #27466 ## Validation - `just test -p codex-exec-server --lib` (139 passed) - `just test -p codex-cli --test exec_server` (3 passed) - `just bazel-lock-check` - `just fix -p codex-exec-server -p codex-cli` - `just fmt` --------- Co-authored-by: Richard Lee <richardlee@openai.com>
starr-openai ·
2026-06-18 11:03:42 -07:00 -
Recover exec process stdin writes (#28895)
## Summary Remote stdio MCP servers send tool calls by writing JSON-RPC bytes through `process/write`. When the exec-server websocket drops at the wrong time, the remote process can survive session recovery, but the stdin write can still fail back to RMCP as a transport send error. RMCP then closes the stdio MCP transport, so tools like `node_repl` are lost even though the process/session recovery path is working. This changes `process/write` to be safe to retry across exec-server recovery: - adds a required `writeId` to `process/write` - retries remote `Session::write` with the same `writeId` after reconnect - remembers accepted write ids per process so duplicate retries return `Accepted` without writing the same bytes to child stdin again - covers both the client retry path and server-side write id dedupe with tests In simple terms: ```text before: write to MCP stdin -> websocket closes -> write errors -> RMCP closes node_repl after: write to MCP stdin -> websocket closes -> reconnect -> retry same writeId server either writes once or recognizes it already did ```
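Server-side deduplication by `writeId` can be sketched as follows. `ProcessStdin` and `WriteStatus` are hypothetical names, and child stdin is modeled as an in-memory buffer; only the idea of remembering accepted write ids per process comes from the description above.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum WriteStatus {
    Accepted,
}

/// Hypothetical per-process state: accepted write ids plus a stand-in for child stdin.
#[derive(Default)]
struct ProcessStdin {
    accepted_write_ids: HashSet<String>,
    stdin: Vec<u8>,
}

impl ProcessStdin {
    /// A retry with a known `write_id` is acknowledged without writing again.
    fn write(&mut self, write_id: &str, bytes: &[u8]) -> WriteStatus {
        // `insert` returns true only the first time this id is seen.
        if self.accepted_write_ids.insert(write_id.to_string()) {
            self.stdin.extend_from_slice(bytes);
        }
        WriteStatus::Accepted
    }
}
```

The client can therefore resend the same `writeId` after a reconnect without knowing whether the first attempt reached the server.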
jif ·
2026-06-18 19:04:26 +02:00 -
Add network environment ID plumbing (#28766)
## Why Prepare network approval scoping to distinguish execution environments without changing behavior yet. ## What changed - Add optional environment IDs to network policy requests. - Add optional network environment IDs to exec and sandbox request structs. - Thread default None values through existing construction points. - Fix stale constructor call sites that caused the CI compile failures. ## Not included - Per-environment proxy listeners. - Network approval cache or prompt behavior changes. - Ambiguous request attribution handling. Those behavior changes moved to stacked follow-up #28899. ## Validation - just fmt - CI will run tests and clippy
jif ·
2026-06-18 14:09:38 +02:00 -
Refresh signed exec-server URLs on reconnect (#28374)
## Summary - add a provider API that supplies a fresh signed WebSocket URL for each remote exec-server connection - refresh the signed URL after disconnects and retry once when a handshake returns `401 Unauthorized` - allow `EnvironmentManager` consumers to register remote environments backed by the URL provider ## Tests - `just test -p codex-exec-server -E 'test(remote_websocket_client_refreshes_url_after_unauthorized_handshake) | test(remote_websocket_client_refreshes_url_after_disconnect)'` — 2 passed - `cargo check -p codex-core-api` — passed - `just fix -p codex-exec-server` — passed - `just fix -p codex-core-api` — no test targets; no-op - `just fmt` — passed - `just test -p codex-exec-server` — 187 passed; 32 unrelated macOS sandbox tests could not invoke nested `sandbox-exec` (`Operation not permitted`)
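The "fresh URL per connection, retry once on `401 Unauthorized`" rule can be sketched with two closures. The names (`ConnectError`, `connect_with_fresh_url`) and the closure-based provider are assumptions for illustration; the real provider API and WebSocket client are not shown here.

```rust
#[derive(Debug, PartialEq)]
enum ConnectError {
    Unauthorized,
    Other(String),
}

/// Hypothetical connect helper: ask the provider for a signed URL, and if the
/// handshake is rejected as unauthorized, fetch a new URL and retry exactly once.
fn connect_with_fresh_url<P, C>(mut provider: P, mut connect: C) -> Result<String, ConnectError>
where
    P: FnMut() -> String,
    C: FnMut(&str) -> Result<String, ConnectError>,
{
    let url = provider();
    match connect(&url) {
        Err(ConnectError::Unauthorized) => {
            let refreshed = provider();
            connect(&refreshed)
        }
        other => other,
    }
}
```

Bounding the retry to one attempt avoids looping on a provider that keeps issuing rejected URLs.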
Anton Panasenko ·
2026-06-17 20:58:48 -07:00 -
feat(exec-server): add Noise rendezvous environment (#28774)
## Why Codex can run a remote exec server through the Noise relay, but the normal environment-manager path could not establish an environment-registry-backed harness connection. Signed rendezvous URLs and harness authorizations are short-lived, so reconnects must fetch a fresh bundle instead of retaining stale connection credentials. A stalled registry request must also fail within the regular remote connection deadline, without exposing these credentials in debug logs. Issue: N/A (internal environment-service integration). ## What Changed - Add environment-manager configuration for a registry-backed Noise rendezvous environment. - Request a fresh bundle from `/cloud/environment/{environment_id}/connect` for every physical harness connection, using the existing 10-second remote connection timeout. - Share the Environment Registry register, connect, and validate wire payloads through `codex-exec-server` and `codex-core-api`. - Redact the signed rendezvous URL and harness authorization from the public connect response's `Debug` output. - Add focused coverage for registry bundle retrieval, stalled requests, and credential redaction.
Anton Panasenko ·

2026-06-17 17:20:53 -07:00 -
exec-server: expose environment registry payloads (#28651)
## Why Services that proxy the exec-server environment registry endpoints need to deserialize and forward the same Noise registration and harness-key validation payloads. Those wire models currently live as private, serialize-only structs in `exec-server`, which forces consumers to duplicate the contract. ## What changed - Add owned serde models for registration and harness-key validation requests and responses. - Use those models in the existing exec-server registry client. - Re-export the models from `codex-exec-server` and `codex-core-api`. - Keep the harness authorization request free of a derived `Debug` implementation so it is not accidentally logged. ## Testing - Focused exec-server registration and harness-key validation tests: 2 passed. - `cargo check -p codex-core-api` The full `codex-exec-server` suite compiled and ran 254 tests: 222 passed, while 32 existing filesystem sandbox tests could not run under the nested macOS sandbox (`sandbox_apply: Operation not permitted`). Co-authored-by: Codex <noreply@openai.com>
viyatb-oai ·
2026-06-17 13:27:25 -07:00 -
unified-exec: preserve PathUri through exec-server (#28681)
## Why It should be possible for app-server to handle "foreign" OS paths in unified_exec working directories, allowing e.g. a Linux app-server to run processes on e.g. a Windows exec-server. ## What Convert the core unified_exec cwd values to use `PathUri`. Adds fallible path conversion in several places to try to minimize the scope of this change. The only time this change suppresses errors from converting `PathUri` to an `AbsolutePathBuf` is when the turn is configured with no sandboxing at all to allow us to make progress testing without sandboxing. Future changes to apply_patch and sandboxing will clean up these error paths. A tool's cwd is resolved from joining a model-provided workdir to the environment's cwd. When using `AbsolutePathBuf::join()`, an absolute-path workdir would overwrite the environment's cwd and we would resolve permissions/sandboxing against the model-provided path. This change extends `PathUri::join()` to also treat an absolute rhs as an override of the base/lhs. This also removes some coverage from the remove_env_windows tests until a follow-up converts foreign paths in command exec events correctly. ## Breaking Changes When using `AbsolutePathBuf::join()` for workdir resolution, we ended up resolving tilde-prefixed paths against the app-server's `$HOME`, e.g. `~/foo/bar` becomes `/home/anp/foo/bar`. It's difficult to do this with `PathUri` joining, so after offline discussion this PR no longer implements it. A quick check of some power users' rollouts suggests that models don't actually generate home-prefixed absolute working directories for their spawns, so this shouldn't have any real blast radius.
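The join rule described above, where an absolute right-hand side overrides the base, can be sketched on plain strings. These helpers are hypothetical and far simpler than `PathUri::join()`; in particular, treating a tilde-prefixed workdir as an ordinary relative segment is an assumption of this sketch, chosen only because tilde expansion is no longer implemented.

```rust
/// Hypothetical check: POSIX absolute ("/x") or Windows drive-absolute ("C:/x").
fn is_absolute(path: &str) -> bool {
    let b = path.as_bytes();
    path.starts_with('/')
        || (b.len() >= 3 && b[0].is_ascii_alphabetic() && b[1] == b':' && b[2] == b'/')
}

/// Hypothetical join: an absolute `rhs` replaces the base instead of being appended.
fn join(base: &str, rhs: &str) -> String {
    if is_absolute(rhs) {
        return rhs.to_string();
    }
    format!("{}/{}", base.trim_end_matches('/'), rhs)
}
```

With this rule, permissions and sandboxing are resolved against the path the process will actually run in, whether the model supplied a relative or an absolute workdir.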
Adam Perry @ OpenAI ·
2026-06-17 19:36:16 +00:00 -
Run fs helper through Windows sandbox wrapper (#28359)
## Why This is the final PR in the Windows fs-helper sandbox stack and contains the actual bug fix. The exec-server filesystem helper is a direct-spawn path: it asks `SandboxManager` for a `SandboxExecRequest`, then launches the returned argv itself. That works on macOS and Linux because the transformed argv is already a self-contained sandbox wrapper. On Windows, the transformed request carried `WindowsRestrictedToken` metadata, but the direct-spawn fs-helper runner still launched the helper argv directly. That means Windows filesystem built-ins backed by the fs-helper could run with the parent Codex process permissions instead of the configured Windows sandbox. This PR makes the direct-spawn transform produce a self-contained Windows wrapper argv before fs-helper launches it. ## What Changed - Added `SandboxManager::transform_for_direct_spawn()` for callers that launch the returned argv themselves. - Wrapped Windows restricted-token direct-spawn requests with `codex.exe --run-as-windows-sandbox` and then marked the outer request as unsandboxed, matching the macOS/Linux wrapper argv shape. - Updated `exec-server/src/fs_sandbox.rs` to use the direct-spawn transform for fs-helper launches. - Materialized the inner `codex.exe --codex-run-as-fs-helper` executable into `.sandbox-bin` so the sandboxed user can run it. - Carried runtime workspace roots through `FileSystemSandboxContext` as `PathUri` values so `:workspace_roots` policies resolve correctly without sending native client paths over exec-server JSON. - Preserved wrapper setup identity environment needed by Windows sandbox setup without changing the serialized inner helper environment. 
## Verification - `just bazel-lock-update` - `just bazel-lock-check` - `just test -p codex-sandboxing transform_for_direct_spawn_windows` - `just test -p codex-exec-server fs_sandbox::tests` - `just fix -p codex-windows-sandbox -p codex-sandboxing -p codex-exec-server -p codex-core -p codex-file-system` Local note: `just fmt` completed Rust formatting, but this workstation still fails the non-Rust formatter phases because uv cannot open its cache and the local buildifier/dotslash path is missing.
iceweasel-oai ·
2026-06-17 10:00:42 -07:00 -
Back off registry retries during exec recovery (#28546)
## Why PR #28512 retries a failed session recovery every 100 ms. Every Noise recovery attempt first asks the environment registry for a fresh connection bundle, even when the eventual failure comes from the WebSocket or initialize handshake. During an outage, that could make each disconnected client call the registry about 250 times during the 25-second recovery window. ## What changes All retryable Noise recovery failures now use a separate backoff schedule: ```text base: 500 ms -> 1 s -> 2 s -> 4 s -> 5 s maximum actual: 500-750 ms, 1-1.5 s, 2-3 s, 4-6 s, 5-7.5 s ``` The extra 0-50% is deterministic per-session jitter so disconnected clients do not retry together. Direct WebSocket recovery keeps the existing 100 ms retry because it does not re-enter the registry.
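The schedule above can be reproduced with a few lines of arithmetic. This is a sketch: the function names are hypothetical, and deriving the jitter from a hash of the session id is an assumption (the description only says the jitter is deterministic per session and adds 0 to 50%).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::Duration;

const BASE_MS: u64 = 500;
const MAX_BASE_MS: u64 = 5_000;

/// Base delay: 500 ms doubled per attempt, capped at 5 s.
fn base_delay_ms(attempt: u32) -> u64 {
    BASE_MS
        .saturating_mul(1u64 << attempt.min(16))
        .min(MAX_BASE_MS)
}

/// Assumption: jitter in permille (0..=500, i.e. 0-50%) derived from the session id.
fn jitter_permille(session_id: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    session_id.hash(&mut hasher);
    hasher.finish() % 501
}

/// Actual delay: base plus the session's fixed jitter share of the base.
fn retry_delay(session_id: &str, attempt: u32) -> Duration {
    let base = base_delay_ms(attempt);
    let jitter = base * jitter_permille(session_id) / 1000;
    Duration::from_millis(base + jitter)
}
```

For comparison, the 100 ms retry over a 25-second window works out to 25,000 / 100 = 250 registry calls per client, whereas the base schedule reaches its 5-second cap after four retries.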
jif ·
2026-06-17 11:52:23 +02:00 -
Resume exec-server sessions after disconnect (#28512)
Supersedes #28288 (closed). ## Why A short WebSocket interruption currently ends every client-side process handle, even though exec-server keeps the server session and its processes alive for a short time. This is especially visible for executor-backed stdio MCP servers: a temporary connection loss becomes a permanent `Transport closed` error. The server already has the information needed to resume the session, but the client opens a fresh session instead of using it. This change reconnects below the process and MCP layers. Existing process handles stay valid, missed output is recovered, and the same server-side processes continue running. ## State machine One logical `ExecServerClient` stays alive while its underlying RPC connection changes generations. ```text transport closes +------------------------------------------------+ | v +-------------+ +-------------+ | Connected | | Recovering | +-------------+ +-------------+ ^ | | session resumed, processes caught up | retryable error +------------------------------------------------+ loops until deadline | | deadline or permanent error v +-------------+ | Failed | +-------------+ ``` ### `Connected` - New RPC calls use the current connection. - Process notifications are published in sequence order. - A disconnect only starts recovery if it came from the current connection generation. Late events from older generations cannot replace the active connection. ### `Recovering` - New calls wait instead of choosing a half-connected RPC client. - Existing process handles, wake subscriptions, and event subscriptions stay open. - Streaming HTTP response bodies fail immediately because their byte streams cannot be resumed safely. - Recovery first waits for process starts that were already in flight. A start whose result became ambiguous is cleaned up after reconnection instead of being silently adopted. - The client reconnects with the learned `session_id`. 
The server may briefly report that the old connection is still attached, so that error is retried until the detach finishes. - The notification consumer starts before the resume handshake completes. This prevents a busy process from filling the notification queue and blocking the initialize response. - Before installing the new connection, the client catches up every recoverable process with `process/read`. ### `Failed` - Recovery stops after 25 seconds or after a permanent error. - Waiting calls are released with one stable disconnect error. - Existing process sessions receive a terminal failure instead of waiting forever. ## Recovering process events Output, exit, and close events share one sequence. During normal operation, the client buffers early events until every lower sequence has been published. After reconnection, the client reads each process starting after its last published sequence: 1. Retained output chunks are inserted by sequence number. 2. Exit and close state are reconstructed in their sequence positions. 3. Events already received as live notifications are ignored as duplicates. 4. Newly contiguous events are published in order. 5. If the server no longer retains enough output to fill a sequence gap, only that process is terminated and failed. The recovered connection remains usable for other processes. The server reports its full next event sequence for unbounded reads, including exit and close events. Closed processes remain readable for the same 30-second window used to retain detached sessions. ## Other details - Detached server sessions are retained for 30 seconds, leaving margin around the client's 25-second recovery deadline. - Session attach and detach update the active notification sender under the same attachment lock, so an old connection cannot clear a newly attached sender. - A dedicated error code distinguishes the temporary "session is still attached" race from permanent initialization errors. 
- Process starts are identity-checked on both client and server. Cleanup from an older start cannot remove a newer process that reused the same ID. - Mutating requests that were already in flight when the transport closed are not replayed, because the client cannot know whether the server applied them. Requests started after recovery is known wait for the replacement connection. - We assume the client and server versions stay in sync: both run code from either before or after this PR. ## User impact Long-running commands and stdio MCP servers can survive a temporary exec-server WebSocket interruption without changing process IDs or losing output produced during the outage.
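The ordered publication of process events, including duplicate suppression after a catch-up read, can be sketched with a small reorder buffer. `EventSequencer` and its methods are hypothetical names and events are plain strings; the sketch covers only the sequencing rule, not gap detection or process termination.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-process sequencer for output, exit, and close events.
struct EventSequencer {
    next_seq: u64,
    pending: BTreeMap<u64, String>,
}

impl EventSequencer {
    fn new(first_seq: u64) -> Self {
        Self {
            next_seq: first_seq,
            pending: BTreeMap::new(),
        }
    }

    /// Accept an event from a live notification or a catch-up read and
    /// return the events that became contiguous, in sequence order.
    fn accept(&mut self, seq: u64, event: &str) -> Vec<String> {
        if seq < self.next_seq {
            return Vec::new(); // already published: ignore as a duplicate
        }
        // Buffer early events; keep the first copy if the same seq arrives twice.
        self.pending.entry(seq).or_insert_with(|| event.to_string());
        let mut ready = Vec::new();
        while let Some(next) = self.pending.remove(&self.next_seq) {
            ready.push(next);
            self.next_seq += 1;
        }
        ready
    }
}
```

Because live notifications and `process/read` results feed the same `accept` call, it does not matter which source delivers a given sequence number first.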
jif ·
2026-06-17 10:20:39 +02:00 -
[codex] exec-server: stream files in chunks (#28354)
## Why `fs/readFile` buffers the entire file in one response, which makes large remote reads expensive and prevents callers from applying backpressure. We need an opt-in streaming path with bounded block sizes while preserving the existing single-call API for small and sandboxed reads. ## What changed - Add `ExecServerClient::stream`, returning a named `FileReadStream` that implements `futures::Stream` and yields immutable 1 MiB byte blocks. - Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs. `fs/readBlock` accepts an explicit offset and length. - Keep unsandboxed files open between block reads, cap open handles per connection, and clean them up on EOF, error, stream drop, explicit close, or connection shutdown. - Reject platform-sandboxed streaming opens instead of turning the one-shot sandbox helper into a persistent server. Existing `fs/readFile` behavior is unchanged. ## Testing - `just test -p codex-exec-server` - Integration coverage for 1 MiB chunking, exact block-boundary EOF, sandbox rejection, and continued reads from the opened file after path replacement. - Handle-manager coverage for non-sequential offsets, variable block lengths, the 128-handle limit, and capacity release after close.
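Reading by explicit offset and length until EOF can be sketched against an in-memory file. `read_block` and `stream_blocks` are hypothetical stand-ins for the `fs/readBlock` RPC and the stream that drives it; only the 1 MiB block size and the offset/length shape come from the description above.

```rust
/// Block size described above: 1 MiB.
const BLOCK_SIZE: usize = 1024 * 1024;

/// Hypothetical in-memory stand-in for `fs/readBlock`: returns at most
/// `length` bytes starting at `offset`, and an empty slice at or past EOF.
fn read_block(file: &[u8], offset: usize, length: usize) -> &[u8] {
    let start = offset.min(file.len());
    let end = start.saturating_add(length).min(file.len());
    &file[start..end]
}

/// Hypothetical driver: request sequential blocks until an empty block signals EOF.
fn stream_blocks(file: &[u8]) -> Vec<Vec<u8>> {
    let mut blocks = Vec::new();
    let mut offset = 0;
    loop {
        let block = read_block(file, offset, BLOCK_SIZE);
        if block.is_empty() {
            break; // EOF, including the exact block-boundary case
        }
        offset += block.len();
        blocks.push(block.to_vec());
    }
    blocks
}
```

A real stream would yield each block to the caller as it arrives instead of collecting them, which is what allows backpressure.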
pakrym-oai ·
2026-06-16 09:50:55 -07:00 -
path-uri: clarify invalid host path errors (#28473)
## Why Ensure a consistent string format when exposing path conversion errors to the model. ## What - Render `PathUriParseError::InvalidFileUriPath` as `'$PATH' is invalid on '$OS'`.
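The rendered format can be illustrated with a `Display` implementation. The struct below is a hypothetical stand-in for the `PathUriParseError::InvalidFileUriPath` variant, and the field names and the OS label used in the usage note are assumptions.

```rust
use std::fmt;

/// Hypothetical stand-in for the error variant named above.
#[derive(Debug)]
struct InvalidFileUriPath {
    path: String,
    os: String,
}

impl fmt::Display for InvalidFileUriPath {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Renders as: '$PATH' is invalid on '$OS'
        write!(f, "'{}' is invalid on '{}'", self.path, self.os)
    }
}
```

For a path of `/C:/Users/me` and an OS label of `linux`, this sketch produces `'/C:/Users/me' is invalid on 'linux'`.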
Adam Perry @ OpenAI ·
2026-06-16 09:03:44 -07:00 -
[codex] Use expect in integration tests (#28441)
The workspace denies `clippy::expect_used` in production. Although `clippy.toml` allows `expect` in tests, Bazel Clippy compiles integration-test helper code in a way that does not receive that exemption, which encouraged verbose `unwrap_or_else(... panic!(...))` and equivalent `match`/`let else` forms. This allows `clippy::expect_used` once at each integration-test crate root (including aggregated suites and test-support libraries), then replaces manual panic-based Result and Option unwraps with `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own crate roots. Intentional assertion and unexpected-variant panics remain unchanged, and the production `expect_used = "deny"` lint remains in place. The cleanup is mechanical and net-negative in line count.
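The kind of mechanical rewrite described here looks roughly like the following. The functions are hypothetical examples, not code from the repository; in an integration-test crate the `expect` form would rely on `#![allow(clippy::expect_used)]` at the crate root, which is omitted here so the snippet can be pasted anywhere.

```rust
/// Before: manual panic-based unwrap, as encouraged by the denied lint.
fn parse_port_verbose(raw: &str) -> u16 {
    raw.parse::<u16>()
        .unwrap_or_else(|err| panic!("port should parse: {}", err))
}

/// After: the same intent expressed with `expect`.
fn parse_port(raw: &str) -> u16 {
    raw.parse::<u16>().expect("port should parse")
}
```

Both forms panic on invalid input; `expect` also includes the error's `Debug` output in the panic message.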
pakrym-oai ·
2026-06-15 21:53:47 -07:00 -
exec-server: default remote transport to Noise (#26245)
## Why The transport in [openai/codex#26242](https://github.com/openai/codex/pull/26242) needs to be used by every remote orchestrator-to-executor connection before JSON-RPC traffic starts. ## Changes - Generates one executor Noise identity when remote exec-server starts and registers its public key. - Creates a harness identity for each physical remote environment connection. - Fetches a fresh registry bundle before connecting and validates the authenticated harness key before completing the executor handshake. - Multiplexes encrypted logical streams over the existing executor WebSocket. - Adds bounded stream, handshake-failure, and reassembly state. - Adds safe lifecycle diagnostics without logging keys, authorizations, plaintext, or ciphertext. - Covers reconnects, replay rejection, validation failure, framing limits, and encrypted JSON-RPC tool traffic. ## Stack 1. [openai/codex#26242](https://github.com/openai/codex/pull/26242): Noise channel and relay transport 2. **[openai/codex#26245](https://github.com/openai/codex/pull/26245)**: remote registration and runtime activation ## Verification - `just test -p codex-exec-server` - `just fix -p codex-exec-server` - `just bazel-lock-check` - `cargo shear` --------- Co-authored-by: Codex <noreply@openai.com>
viyatb-oai ·
2026-06-15 17:39:00 -07:00 -
Run core integration tests against a Wine-backed Windows executor (#28401)
## Why We want to exercise a Linux app-server against a Windows exec-server without having to repeat every test case. This approach has slight precedent in the remote Docker test setup. ## What Run the shared `codex-core` integration suite against Windows exec-server behavior from Linux. This makes cross-OS path and shell regressions visible while keeping unsupported cases owned by individual tests. - Add `local`, `docker`, and `wine-exec` test environment selection with legacy Docker compatibility. - Extend `codex_rust_crate` to generate a sharded Wine-exec variant using a cross-built Windows server and pinned Bazel Wine/PowerShell runtimes. - Teach remote-aware helpers about Windows paths and track temporary incompatibilities with source-local `skip_if_wine_exec!` calls and follow-up reasons.
Adam Perry @ OpenAI ·
2026-06-16 00:38:41 +00:00 -
Use PathUri in filesystem permission paths for exec-server (#28165)
## Why

Progress towards letting app-server and exec-server run on different platforms, specifically for sandbox configuration.

## What

- Make the filesystem path containment hierarchy generic, defaulting to `AbsolutePathBuf` for now.
- Have clients specify `AbsolutePathBuf` or `PathUri` directly where needed.
- Use `PathUri` throughout exec-server filesystem protocol and trait boundaries.
- Implement `From` for conversion to path URIs and `TryFrom` for fallible conversion to absolute paths through the generic type hierarchy.
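
The `From`/`TryFrom` split in the last bullet can be illustrated with a minimal, std-only sketch. `DemoAbsolutePath`, `DemoPathUri`, and `ConvertError` are hypothetical stand-ins, not the real `AbsolutePathBuf` or `PathUri` types. The sketch also assumes a POSIX host and skips percent-encoding.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for an absolute host path (POSIX host assumed).
#[derive(Debug, Clone, PartialEq)]
struct DemoAbsolutePath(String);

/// Hypothetical stand-in for a `file:` URI.
#[derive(Debug, Clone, PartialEq)]
struct DemoPathUri(String);

#[derive(Debug, PartialEq)]
enum ConvertError {
    NotFileScheme,
    NotNativeAbsolute,
}

// Infallible direction: every absolute path has a `file:` URI form.
impl From<&DemoAbsolutePath> for DemoPathUri {
    fn from(path: &DemoAbsolutePath) -> Self {
        // Simplification: no percent-encoding of special characters.
        DemoPathUri(format!("file://{}", path.0))
    }
}

/// True for URI paths such as `/C:/windows` that name a Windows drive.
fn has_drive_letter(path: &str) -> bool {
    let b = path.as_bytes();
    b.len() >= 3 && b[0] == b'/' && b[1].is_ascii_alphabetic() && b[2] == b':'
}

// Fallible direction: the URI may use another scheme or name a foreign path.
impl TryFrom<&DemoPathUri> for DemoAbsolutePath {
    type Error = ConvertError;

    fn try_from(uri: &DemoPathUri) -> Result<Self, Self::Error> {
        let rest = uri
            .0
            .strip_prefix("file://")
            .ok_or(ConvertError::NotFileScheme)?;
        if !rest.starts_with('/') || has_drive_letter(rest) {
            return Err(ConvertError::NotNativeAbsolute);
        }
        Ok(DemoAbsolutePath(rest.to_string()))
    }
}

fn main() {
    let local = DemoAbsolutePath("/workspace".to_string());
    let uri = DemoPathUri::from(&local);
    println!("{:?}", uri);
    println!("{:?}", DemoAbsolutePath::try_from(&uri));

    let foreign = DemoPathUri("file:///C:/windows".to_string());
    println!("{:?}", DemoAbsolutePath::try_from(&foreign));
}
```

Making only the URI-to-path direction fallible lets protocol code carry URIs freely and forces the error handling to the point where a host-native path is actually needed.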
Adam Perry @ OpenAI ·
2026-06-15 23:55:23 +00:00 -
exec-server: add Noise relay transport (#26242)
## Why

Rendezvous forwards traffic between the orchestrator and exec-server. The endpoints need to authenticate each other and encrypt that traffic without trusting Rendezvous with plaintext or endpoint keys.

## Changes

- Adds a hybrid Noise IK channel through Clatter using X25519, ML-KEM-768, AES-256-GCM, and SHA-256.
- Binds each handshake to `environment_id`, `executor_registration_id`, and `stream_id`.
- Pins the registry-provided executor key and carries the harness authorization inside the encrypted handshake.
- Orders relay frames before consuming Noise nonces and fragments large JSON-RPC messages into bounded records.
- Bounds handshake payloads, frames, streams, and message reassembly.

Runtime activation is in [openai/codex#26245](https://github.com/openai/codex/pull/26245).

## Stack

1. **[openai/codex#26242](https://github.com/openai/codex/pull/26242)**: Noise channel and relay transport
2. [openai/codex#26245](https://github.com/openai/codex/pull/26245): remote registration and runtime activation

## Verification

- `just test -p codex-exec-server`
- Oversized initiator payload regression coverage
- `just fix -p codex-exec-server`
- `just bazel-lock-check`
- `cargo shear`

---------

Co-authored-by: Codex <noreply@openai.com>
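
The fragmentation and bounded reassembly described above can be sketched without any cryptography. Everything here is an assumption made for illustration: the `Record` layout, the `last` flag, and both limits are hypothetical, and the real transport encrypts each record and uses its own framing.

```rust
/// Hypothetical limits; the commit message does not state the real values.
const MAX_RECORD_PAYLOAD: usize = 8;
const MAX_MESSAGE_LEN: usize = 64;

/// One bounded record: a payload chunk plus a flag marking the final chunk.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    last: bool,
    payload: Vec<u8>,
}

/// Splits a message into records of at most `MAX_RECORD_PAYLOAD` bytes.
fn fragment(message: &[u8]) -> Vec<Record> {
    if message.is_empty() {
        return vec![Record { last: true, payload: Vec::new() }];
    }
    let chunks: Vec<&[u8]> = message.chunks(MAX_RECORD_PAYLOAD).collect();
    let count = chunks.len();
    chunks
        .into_iter()
        .enumerate()
        .map(|(i, chunk)| Record { last: i + 1 == count, payload: chunk.to_vec() })
        .collect()
}

#[derive(Debug, PartialEq)]
enum ReassemblyError {
    RecordTooLarge,
    MessageTooLarge,
}

/// Accumulates records and refuses to buffer more than `MAX_MESSAGE_LEN` bytes.
#[derive(Default)]
struct Reassembler {
    buffer: Vec<u8>,
}

impl Reassembler {
    /// Returns `Ok(Some(message))` once the final record arrives.
    fn push(&mut self, record: Record) -> Result<Option<Vec<u8>>, ReassemblyError> {
        if record.payload.len() > MAX_RECORD_PAYLOAD {
            return Err(ReassemblyError::RecordTooLarge);
        }
        if self.buffer.len() + record.payload.len() > MAX_MESSAGE_LEN {
            // Drop partial state so a hostile peer cannot pin memory.
            self.buffer.clear();
            return Err(ReassemblyError::MessageTooLarge);
        }
        self.buffer.extend_from_slice(&record.payload);
        if record.last {
            Ok(Some(std::mem::take(&mut self.buffer)))
        } else {
            Ok(None)
        }
    }
}

fn main() {
    let message = br#"{"id":1,"method":"x"}"#;
    let mut reassembler = Reassembler::default();
    for record in fragment(message) {
        if let Some(done) = reassembler.push(record).expect("within limits") {
            println!("{}", String::from_utf8_lossy(&done));
        }
    }
}
```

Checking the limit before extending the buffer means an oversized message is rejected without first being held in memory.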
viyatb-oai ·
2026-06-15 16:39:41 -07:00 -
chore: restore exec-server relay keepalives (#28286)
## Why

The ws pump refactor removed the relay keepalive timers that had been added to keep idle rendezvous connections alive. An idle relay could therefore be closed by the rendezvous service or a load balancer, disconnecting executor-backed MCP processes.

## What

- restore periodic WebSocket ping frames on both rendezvous relay endpoints
- keep missed-tick behavior bounded with `MissedTickBehavior::Skip`
- cover the harness and remote-environment pumps with focused traffic-after-keepalive tests
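
Tokio documents `MissedTickBehavior::Skip` as skipping missed ticks and resuming on the next multiple of the period from the start. The std-only model below sketches that scheduling arithmetic to show why a stalled pump sends one ping instead of a catch-up burst. `next_tick_skip` is a hypothetical helper, not a Tokio API, and the 30-second period is an assumed value rather than the one used by the relay.

```rust
/// Models the next deadline for a `Skip`-style interval that woke up late.
/// All values are milliseconds since the interval started, and the caller
/// is expected to pass `now_ms >= scheduled_ms`.
fn next_tick_skip(scheduled_ms: u64, now_ms: u64, period_ms: u64) -> u64 {
    assert!(period_ms > 0, "period must be non-zero");
    let late_by = now_ms.saturating_sub(scheduled_ms);
    // Realign to the next multiple of the period instead of replaying
    // every tick that was missed while the task was stalled.
    now_ms + period_ms - (late_by % period_ms)
}

fn main() {
    let period_ms = 30_000; // assumed keepalive period

    // On time: the next ping is exactly one period later.
    println!("{}", next_tick_skip(30_000, 30_000, period_ms));

    // Stalled for 70 s: the ticks at 60 s and 90 s are skipped, and the
    // next ping realigns to the 120 s boundary.
    println!("{}", next_tick_skip(30_000, 100_000, period_ms));
}
```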
jif ·
2026-06-15 17:24:36 +02:00 -
[codex] exec-server honors remote environment cwd and shell (#28122)
## Why

The next slice needed to make progress on the `remote_env_windows` test is support for passing a Windows cwd for the remote environment and using that environment's native shell. This lets the test run a real Windows process instead of only recording an early path or shell mismatch.

## What

- change `TurnEnvironmentSelection.cwd` from `AbsolutePathBuf` to `PathUri`
- convert local cwd values to URIs when constructing selections
- preserve a remote primary cwd instead of replacing it with the local legacy fallback
- prefer the selected environment's discovered shell for unified exec, falling back to the session shell when unavailable
- convert back to a host-native absolute path at current native-only consumer boundaries
- reject or deny unsupported foreign cwd values at the existing request-permissions boundary, with TODOs for its future migration
- extend the hermetic Wine test to execute Windows PowerShell in `C:\windows` and verify successful process completion
- record the current app-server rejection against the same Wine-backed remote Windows fixture when its cwd is supplied as a native Windows path
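
The shell preference in the fourth bullet amounts to an `Option` fallback. A minimal sketch, assuming hypothetical `Shell` and `Environment` types and a hypothetical `select_shell` helper (none of these are the real codex definitions):

```rust
/// Hypothetical shell description.
#[derive(Debug, Clone, PartialEq)]
struct Shell {
    name: String,
    path: String,
}

/// Hypothetical environment whose shell may not have been discovered yet.
struct Environment {
    discovered_shell: Option<Shell>,
}

/// Prefers the environment's own shell and falls back to the session shell.
fn select_shell<'a>(env: &'a Environment, session_shell: &'a Shell) -> &'a Shell {
    env.discovered_shell.as_ref().unwrap_or(session_shell)
}

fn main() {
    let session = Shell { name: "zsh".to_string(), path: "/bin/zsh".to_string() };
    let remote = Environment {
        discovered_shell: Some(Shell {
            name: "powershell".to_string(),
            path: "C:\\windows\\powershell.exe".to_string(), // illustrative path
        }),
    };
    let undiscovered = Environment { discovered_shell: None };

    println!("{}", select_shell(&remote, &session).name);
    println!("{}", select_shell(&undiscovered, &session).name);
}
```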
Adam Perry @ OpenAI ·
2026-06-14 06:07:46 +00:00 -
build: run buildifier from just fmt (#28125)
## Intent

Keep Bazel and Starlark files consistently formatted without requiring contributors to install or version buildifier themselves.

## Implementation

- Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier v8.5.1.
- Run buildifier from the shared `just fmt` and `just fmt-check` driver, with Windows-safe explicit DotSlash invocation.
- Provision DotSlash in formatting CI and contributor devcontainers, and document the source-build prerequisite.
- Apply the initial mechanical buildifier formatting baseline.
Adam Perry @ OpenAI ·
2026-06-13 21:43:39 -07:00 -
[codex] Carry exec-server cwd as PathUri (#28032)
## Why

This is the second-to-last place in the exec-server protocol that needs to migrate to URIs to support cross-OS operation.

## What

- Change `ExecParams.cwd` to `PathUri`.
- Keep the cwd URI-shaped through core and rmcp producers, converting it to `AbsolutePathBuf` only in `LocalProcess::start_process`.
- Reject non-native cwd URIs before launch and update the affected protocol documentation and call sites.
Adam Perry @ OpenAI ·
2026-06-13 20:56:42 +00:00 -
[codex] Add hermetic Wine exec-server test (#27937)
## Why

We want to make it possible for an app-server orchestrator on one OS to control an exec-server on another host running a different OS. In practice this kinda already works if you get lucky and the two hosts have the same path format, but we mangle quite a lot of operations if either end is Windows.

This test starts exercising that interaction, although right now the initial bootstrap fails. Future changes will expand the test's assertions to match improved support.

## What

Stacked on #27964.

This adds a small Windows exec-server fixture and a Linux protocol smoke test using the reusable Wine harness, covering Windows environment discovery, non-TTY `cmd.exe` execution, output, exit status, and working directory.

Once we've got the full codex binary cross-building under Bazel, we could consider moving to the real binary instead of the stripped-down exec-server-only binary used here.
Adam Perry @ OpenAI ·
2026-06-12 20:20:23 -07:00