codex

[codex] add process-owned code-mode session client (#30112)

## Summary

- add `ProcessOwnedCodeModeSessionProvider` and logical session
generation/rebinding state
- add the supervised child-process connection, reader/writer tasks, and
driver state machine
- make dropped execute/wait/open callers cancellation-safe with explicit
ownership handoff and durable cleanup
- validate cell/delegate lifecycle state and reject invalid protocol
transitions
- add end-to-end stdio coverage for delegates, cancellation, frame
limits, child loss, stale generations, replacement, and long-lived
sessions

## Why

This final stage exposes the process-owned client only after the wire
protocol, host-safe runtime, and standalone host are independently in
place. Transport failure is fail-stop: the client closes local state,
cancels callbacks, reaps the child, and lazily rebuilds a fresh host
generation rather than transactionally recovering the old connection.
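
The fail-stop behavior described above can be illustrated with a minimal sketch. It assumes a generation counter that is bumped each time a fresh host is built. The names `LogicalSession`, `HostConnection`, and `SessionError` are hypothetical and are not taken from the PR. The real client also cancels callbacks and reaps the child process, which this sketch omits.

```rust
/// Hypothetical stand-in for a live connection to a host child process.
#[derive(Debug)]
pub struct HostConnection {
    pub generation: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    /// The caller holds a handle bound to a host generation that is gone.
    StaleGeneration { held: u64, current: u64 },
}

/// Hypothetical logical session that outlives individual host processes.
#[derive(Debug, Default)]
pub struct LogicalSession {
    generation: u64,
    connection: Option<HostConnection>,
}

impl LogicalSession {
    /// Lazily build a host for a new generation if none is connected.
    /// (A real client would spawn the child process here.)
    pub fn ensure_connected(&mut self) -> u64 {
        if self.connection.is_none() {
            self.generation += 1;
            self.connection = Some(HostConnection {
                generation: self.generation,
            });
        }
        self.generation
    }

    /// Fail-stop: discard the connection without recovering in-flight state.
    pub fn on_transport_failure(&mut self) {
        self.connection = None;
    }

    /// Reject work from handles bound to a discarded generation.
    pub fn check_generation(&self, held: u64) -> Result<(), SessionError> {
        if self.connection.is_some() && held == self.generation {
            Ok(())
        } else {
            Err(SessionError::StaleGeneration {
                held,
                current: self.generation,
            })
        }
    }
}
```

In this sketch a handle obtained before a failure keeps failing after the rebuild, because the rebuilt host carries a new generation number rather than reusing the old one.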

## Stack

This is **4 of 4** in the process-owned code-mode session stack.

- Depends on #30111
- Full stack: #30108 → #30110 → #30111 → this PR

## Validation

- `just test -p codex-code-mode -p codex-code-mode-host` — 86 passed
- `just fix -p codex-code-mode`
- `just fix -p codex-code-mode-host`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `bazel test //codex-rs/code-mode:code-mode-unit-tests
//codex-rs/code-mode-host:code-mode-host-unit-tests
//codex-rs/code-mode-host:code-mode-host-stdio-test
//codex-rs/code-mode-protocol:code-mode-protocol-unit-tests` — 4/4
passed
- `just fmt`

Channing Conger · 2026-06-25 23:46:17 -07:00

ab16046c88

[codex] add code-mode host failure supervision hooks (#30110)

## Why

A process host should be discarded and rebuilt after critical actor or
V8 failure, while the existing in-process production path must keep its
current cell-error semantics. This change establishes that failure
boundary without adding the host process or remote client.

## What changed

- add optional task-failure supervision to the transport-neutral
code-mode session runtime
- report Tokio cell-actor failures and V8 runtime-thread panics to a
host-provided fail-stop handler
- preserve the existing handler-less in-process behavior
- make host-owned cell ID allocation fail before numeric wraparound
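
As an illustration of the last bullet, one way to make ID allocation fail before wraparound is `u64::checked_add`. This is a sketch only: `CellIdAllocator` and `CellIdsExhausted` are hypothetical names, and the integer width used by the real host is an assumption.

```rust
/// Hypothetical allocator; the real type in the host may differ.
#[derive(Debug, Default)]
pub struct CellIdAllocator {
    next: u64,
}

/// Hypothetical error returned once the ID space is used up.
#[derive(Debug, PartialEq, Eq)]
pub struct CellIdsExhausted;

impl CellIdAllocator {
    pub fn starting_at(next: u64) -> Self {
        Self { next }
    }

    /// Return the next ID, or an error instead of wrapping to reused IDs.
    pub fn allocate(&mut self) -> Result<u64, CellIdsExhausted> {
        let id = self.next;
        // `checked_add` yields `None` on overflow, so allocation fails
        // before `next` could wrap around to zero.
        self.next = id.checked_add(1).ok_or(CellIdsExhausted)?;
        Ok(id)
    }
}
```

Once exhausted, this allocator stays exhausted: `next` is left unchanged on failure, so every later call returns the same error.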

## Follow-up

The V8 panic signal surfaced here should also be consumed by the
`InProcessCodeModeSession` manager in a future change so it can fail the
affected cell. This PR intentionally leaves the handler-less in-process
behavior unchanged while putting the required panic tracking in place.

## Stack

This is **2 of 4** in the process-owned code-mode session stack.

- #30108 is merged into `main`
- The next PR targets this branch

## Validation

- `just test -p codex-code-mode` — 53 passed
- `just argument-comment-lint -p codex-code-mode`
- `just fix -p codex-code-mode`

Channing Conger · 2026-06-25 15:33:58 -07:00

6c21297bba

code-mode standalone: extract protocol and add host crate (#27724)

This is phase 1 of a 4-phase stack:
1. **Add protocol and host crates for new IPC code mode implementation**
2. Create the new standalone binary
3. Create a new IPC `CodeModeSessionProvider` to use new binary
4. Remove v8 from core and only use IPC provider


## Add protocol and host crates for new IPC code mode implementation
Establish a clean process boundary without changing the existing
in-process behavior.

- Add the codex-code-mode-protocol crate for shared session, runtime,
response, and tool-definition types.
- Move protocol-facing code out of the V8-backed implementation.
- Add a buildable codex-code-mode-host crate as the foundation for the
standalone process.
- Keep the existing in-process runtime as the active implementation.

Channing Conger · 2026-06-11 22:37:26 -07:00

aa46f2debf

code-mode: introduce durable session interface (#24180)

## Summary

Introduce a `CodeModeSession` interface for executing and managing
code-mode cells.

This moves cell lifecycle, callback delegation, termination, and
shutdown behind a session abstraction. The existing in-process
implementation remains in use, and an external-process implementation
can later be added behind the same interface.

A Codex session owns one `CodeModeSession`, which in turn owns its
running cells and stored code-mode state. Each cell is represented to
the caller as a `StartedCell`, exposing its cell ID and initial
response.

It also introduces a `CodeModeSessionDelegate` callback interface. A
session uses the delegate to invoke nested host tools and emit
notifications while a cell is running, allowing the runtime to
communicate with its owning Codex session without depending directly on
core turn handling.

<img width="2121" height="1001" alt="image"
src="https://github.com/user-attachments/assets/c349a819-2a59-485c-bda4-2caf68ac4c31"
/>
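
The ownership relationship described above can be sketched roughly as follows. The type names `CodeModeSession`, `CodeModeSessionDelegate`, and `StartedCell` come from the summary, but every method signature here is an assumption made for illustration. The real interface is likely asynchronous and richer. `EchoSession` and `RecordingDelegate` are hypothetical test doubles.

```rust
use std::cell::RefCell;

/// Callback surface a session uses to reach its owning Codex session.
/// Method names and signatures are assumptions for this sketch.
pub trait CodeModeSessionDelegate {
    fn invoke_tool(&self, tool: &str, input: &str) -> Result<String, String>;
    fn notify(&self, message: &str);
}

/// What the caller sees for each cell: its ID and initial response.
#[derive(Debug, PartialEq, Eq)]
pub struct StartedCell {
    pub cell_id: u64,
    pub initial_response: String,
}

/// Session abstraction; synchronous here only to keep the sketch small.
pub trait CodeModeSession {
    fn execute(&mut self, source: &str, delegate: &dyn CodeModeSessionDelegate) -> StartedCell;
}

/// Toy session that "runs" a cell by making one nested tool call.
#[derive(Default)]
pub struct EchoSession {
    next_cell_id: u64,
}

impl CodeModeSession for EchoSession {
    fn execute(&mut self, source: &str, delegate: &dyn CodeModeSessionDelegate) -> StartedCell {
        let cell_id = self.next_cell_id;
        self.next_cell_id += 1;
        delegate.notify("cell started");
        let initial_response = match delegate.invoke_tool("echo", source) {
            Ok(output) => output,
            Err(error) => format!("tool error: {error}"),
        };
        StartedCell { cell_id, initial_response }
    }
}

/// Toy delegate that records notifications and echoes tool input.
#[derive(Default)]
pub struct RecordingDelegate {
    pub notifications: RefCell<Vec<String>>,
}

impl CodeModeSessionDelegate for RecordingDelegate {
    fn invoke_tool(&self, tool: &str, input: &str) -> Result<String, String> {
        Ok(format!("{tool}: {input}"))
    }

    fn notify(&self, message: &str) {
        self.notifications.borrow_mut().push(message.to_string());
    }
}
```

The point of the shape is that the session only sees the delegate trait, so it can call back into its owner without depending on core turn handling.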

Channing Conger · 2026-05-29 11:42:52 -07:00

c9dc0f6338

Enable V8 sandboxing for source-built builds (#21146)

## Summary

This is the first PR in the V8 in-process sandboxing rollout.

It adds the build-system and Rust feature plumbing needed to support
sandboxed V8 builds, then enables sandboxing by default for the
source-built Bazel V8 path that we control directly. It deliberately
keeps the published `rusty_v8` artifact workflows on their current
non-sandboxed contract so this PR can land and ship independently before
we change any released artifacts.

## Rollout plan

- [x] **PR 1: land sandbox plumbing and default source-built Bazel V8 to
sandboxed mode**

- [ ] **PR 2: publish sandbox-enabled release artifacts and add
compatibility validation**
  - Produce sandboxed artifact pairs for every released Cargo target that
    does not already use the source-built Bazel path.
  - Add CI coverage that consumes those sandboxed artifacts and verifies:
    - `codex-v8-poc` reports sandbox enabled
    - `codex-code-mode` builds/tests against the sandboxed path

- [ ] **PR 3: switch release consumers to sandboxed artifacts by
default**
  - Update released artifact selectors/checksums.
  - Enable the Rust `v8_enable_sandbox` feature in the default release
    path.
  - Make the sandboxed artifact family the normal path for published
    builds.

- [ ] **PR 4: remove rollout-only compatibility paths**
  - Remove the temporary non-sandbox release compatibility config once the
    new default has shipped and baked.
  - Keep the invariant tests permanently.

Channing Conger · 2026-05-05 14:36:37 -07:00

36460387ec

refactor: use cloneable async channels for shared receivers (#18398)

This is the first mechanical cleanup in a stack whose higher-level goal
is to enable Clippy coverage for async guards held across `.await`
points.

The follow-up commits enable Clippy's
[`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock)
lint and the configurable
[`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type)
lint for Tokio guard types. This PR handles the cases where the
underlying issue is not protected shared mutable state, but a
`tokio::sync::mpsc::UnboundedReceiver` wrapped in `Arc<Mutex<_>>` so
cloned owners can call `recv().await`.

Using a mutex for that shape forces the receiver lock guard to live
across `.await`. Switching these paths to `async-channel` gives us
cloneable `Receiver`s, so each owner can hold a receiver handle directly
and await messages without an async mutex guard.

## What changed

- In `codex-rs/code-mode`, replace the turn-message
`mpsc::UnboundedSender`/`UnboundedReceiver` plus `Arc<Mutex<Receiver>>`
with `async_channel::Sender`/`Receiver`.
- In `codex-rs/codex-api`, replace the realtime websocket event receiver
with an `async_channel::Receiver`, allowing `RealtimeWebsocketEvents`
clones to receive without locking.
- Add `async-channel` as a dependency for `codex-code-mode` and
`codex-api`, and update `Cargo.lock`.

## Verification

- The split stack was verified at the final lint-enabling head with
`just clippy`.

Michael Bolin · 2026-04-17 15:20:30 -07:00

c9c4caafd8

register all mcp tools with namespace (#17404)

stacked on #17402.

MCP tools returned by `tool_search` (deferred tools) get registered in
our `ToolRegistry` with a different format than directly available
tools. this leads to two different ways of accessing MCP tools from our
tool catalog, only one of which works for each. fix this by registering
all MCP tools with the namespace format, since this info is already
available.

also, direct MCP tools are registered to responsesapi without a
namespace, while deferred MCP tools have a namespace. this means we can
receive MCP `FunctionCall`s in both formats from namespaces. fix this by
always registering MCP tools with namespace, regardless of deferral
status.

make code mode track `ToolName` provenance of tools so it can map the
literal JS function name string to the correct `ToolName` for
invocation, rather than supporting both in core.

this lets us unify to a single canonical `ToolName` representation for
each MCP tool and force everywhere to use that one, without supporting
fallbacks.
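
The provenance tracking described above could look roughly like the sketch below: a map from the literal JS function name to one canonical `ToolName`. The field layout of `ToolName`, the `ToolProvenance` type, and the example JS name format are all assumptions for illustration, not the actual definitions in the codebase.

```rust
use std::collections::HashMap;

/// Assumed shape of a canonical tool name; the real `ToolName` may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolName {
    pub namespace: Option<String>,
    pub name: String,
}

/// Hypothetical table recording which `ToolName` each JS function came from.
#[derive(Debug, Default)]
pub struct ToolProvenance {
    by_js_name: HashMap<String, ToolName>,
}

impl ToolProvenance {
    /// Record the canonical name when the JS function is exposed.
    pub fn register(&mut self, js_name: &str, tool: ToolName) {
        self.by_js_name.insert(js_name.to_string(), tool);
    }

    /// Resolve a JS function name back to its single canonical `ToolName`.
    /// Unknown names yield `None` rather than guessing a fallback format.
    pub fn resolve(&self, js_name: &str) -> Option<&ToolName> {
        self.by_js_name.get(js_name)
    }
}
```

Looking the name up instead of re-parsing the JS string is what avoids supporting two formats at invocation time.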

sayan-oai · 2026-04-15 21:02:59 +08:00

0df7e9a820

[codex] Initialize ICU data for code mode V8 (#17709)

Link ICU data into code mode, otherwise locale-dependent methods cause a
panic and a crash.

pakrym-oai · 2026-04-13 22:01:58 -07:00

ad37389c18

Code mode on v8 (#15276)

Moves Code Mode to a new crate with no dependencies on codex. This
crate encodes the code mode semantics that we want for lifetime,
mounting, and tool calling.

The model-facing surface is mostly unchanged. `exec` still runs raw
JavaScript, `wait` still resumes or terminates a `cell_id`, nested tools
are still available through `tools.*`, and helpers like `text`, `image`,
`store`, `load`, `notify`, `yield_control`, and `exit` still exist.

The major change is underneath that surface:

- Old code mode was an external Node runtime.
- New code mode is an in-process V8 runtime embedded directly in Rust.
- Old code mode managed cells inside a long-lived Node runner process.
- New code mode manages cells in Rust, with one V8 runtime thread per
active `exec`.
- Old code mode used JSON protocol messages over child stdin/stdout plus
Node worker-thread messages.
- New code mode uses Rust channels and direct V8 callbacks/events.

This PR also fixes the two migration regressions that fell out of that
substrate change:

- `wait { terminate: true }` now waits for the V8 runtime to actually
stop before reporting termination.
- synchronous top-level `exit()` now succeeds again instead of surfacing
as a script error.
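
The first fix above, waiting for the runtime to actually stop before reporting termination, can be sketched with a plain OS thread standing in for the V8 runtime thread. This is an assumption-laden illustration: `RunningCell` is a hypothetical type, and the real code terminates a V8 isolate rather than polling a flag.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical handle to one running cell and its dedicated thread.
pub struct RunningCell {
    stop: Arc<AtomicBool>,
    stopped: Arc<AtomicBool>,
    runtime_thread: JoinHandle<()>,
}

impl RunningCell {
    pub fn spawn() -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let stopped = Arc::new(AtomicBool::new(false));
        let (stop_t, stopped_t) = (Arc::clone(&stop), Arc::clone(&stopped));
        let runtime_thread = thread::spawn(move || {
            // Stand-in for script execution: loop until asked to stop.
            while !stop_t.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(1));
            }
            stopped_t.store(true, Ordering::Release);
        });
        Self { stop, stopped, runtime_thread }
    }

    /// Request termination, then report only after the thread has exited.
    /// Returns true if the thread stopped cleanly.
    pub fn terminate(self) -> bool {
        self.stop.store(true, Ordering::Release);
        // Joining blocks until the runtime thread is really gone, so the
        // caller never observes "terminated" while the script still runs.
        let joined = self.runtime_thread.join().is_ok();
        joined && self.stopped.load(Ordering::Acquire)
    }
}
```

The design choice illustrated is ordering: the termination result is produced after the join, not when the stop request is sent.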

---

- `core/src/tools/code_mode/*` is now mostly an adapter layer for the
public `exec` / `wait` tools.
- `code-mode/src/service.rs` owns cell sessions and async control flow
in Rust.
- `code-mode/src/runtime/*.rs` owns the embedded V8 isolate and
JavaScript execution.
- each `exec` spawns a dedicated runtime thread plus a Rust
session-control task.
- helper globals are installed directly into the V8 context instead of
being injected through a source prelude.
- helper modules like `tools.js` and `@openai/code_mode` are synthesized
through V8 module resolution callbacks in Rust.

---

Also added a benchmark for showing the speed of init and use of a code
mode env:
```
$ cargo bench -p codex-code-mode --bench exec_overhead -- --samples 30 --warm-iterations 25 --tool-counts 0,32,128
Finished `bench` profile [optimized] target(s) in 0.18s
Running benches/exec_overhead.rs (target/release/deps/exec_overhead-008c440d800545ae)
exec_overhead: samples=30, warm_iterations=25, tool_counts=[0, 32, 128]
scenario   tools  samples  warmups  iters  mean/exec  p95/exec  rssΔ p50   rssΔ max
cold_exec  0      30       0        1      1.13ms     1.20ms    8.05MiB    8.06MiB
warm_exec  0      30       1        25     473.43us   512.49us  912.00KiB  1.33MiB
cold_exec  32     30       0        1      1.03ms     1.15ms    8.08MiB    8.11MiB
warm_exec  32     30       1        25     509.73us   545.76us  960.00KiB  1.30MiB
cold_exec  128    30       0        1      1.14ms     1.19ms    8.30MiB    8.34MiB
warm_exec  128    30       1        25     575.08us   591.03us  736.00KiB  864.00KiB
memory uses a fresh-process max RSS delta for each scenario
```

---------

Co-authored-by: Codex <noreply@openai.com>

Channing Conger · 2026-03-20 23:36:58 -07:00

e4eedd6170

9 Commits