codex

Code mode on v8 (#15276)

Moves Code Mode to a new crate with no dependencies on codex. This
crate encodes the code mode semantics that we want for lifetime,
mounting, and tool calling.

The model-facing surface is mostly unchanged. `exec` still runs raw
JavaScript, `wait` still resumes or terminates a `cell_id`, nested tools
are still available through `tools.*`, and helpers like `text`, `image`,
`store`, `load`, `notify`, `yield_control`, and `exit` still exist.

The major change is underneath that surface:

- Old code mode was an external Node runtime.
- New code mode is an in-process V8 runtime embedded directly in Rust.
- Old code mode managed cells inside a long-lived Node runner process.
- New code mode manages cells in Rust, with one V8 runtime thread per
active `exec`.
- Old code mode used JSON protocol messages over child stdin/stdout plus
Node worker-thread messages.
- New code mode uses Rust channels and direct V8 callbacks/events.
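
The thread-per-`exec` and channel-based design listed above can be pictured with a minimal sketch using only the standard library. This is not the crate's actual code: `CellEvent`, `spawn_cell`, and `run_to_completion` are hypothetical names, and the spawned thread only echoes its input where a real runtime would own a V8 isolate.

```rust
use std::sync::mpsc;
use std::thread;

/// Events a runtime thread reports back to its session (hypothetical names).
#[derive(Debug, PartialEq)]
enum CellEvent {
    Output(String),
    Finished,
}

/// Stand-in for starting one `exec`: spawns a dedicated thread that would own
/// the V8 isolate, and returns the receiving end of its event channel.
fn spawn_cell(source: String) -> (thread::JoinHandle<()>, mpsc::Receiver<CellEvent>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // A real runtime would evaluate `source` in V8 here; this sketch
        // only echoes it back to show the message flow.
        let _ = tx.send(CellEvent::Output(format!("ran: {}", source)));
        let _ = tx.send(CellEvent::Finished);
        // `tx` is dropped when the closure returns, which closes the channel.
    });
    (handle, rx)
}

/// Drains every event for one cell, then joins its thread.
fn run_to_completion(source: &str) -> Vec<CellEvent> {
    let (handle, rx) = spawn_cell(source.to_string());
    // Iteration ends once the runtime thread drops its sender.
    let events: Vec<CellEvent> = rx.iter().collect();
    handle.join().expect("runtime thread panicked");
    events
}
```

Calling `run_to_completion("1 + 1")` yields one `Output` event followed by `Finished`. No serialization step sits between the two sides, which is the contrast with the old JSON-over-stdio protocol.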

This PR also fixes the two migration regressions that fell out of that
substrate change:

- `wait { terminate: true }` now waits for the V8 runtime to actually
stop before reporting termination.
- synchronous top-level `exit()` now succeeds again instead of surfacing
as a script error.
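
The first fix is about ordering: termination is reported only after the runtime has stopped. A rough sketch of that ordering follows, assuming a cooperative stop flag. `Cell`, `start_cell`, and `terminate` are hypothetical names, and a real V8 embedding would interrupt the isolate rather than poll a flag.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical handle to one running cell.
struct Cell {
    stop: Arc<AtomicBool>,
    stopped: Arc<AtomicBool>,
    handle: thread::JoinHandle<()>,
}

/// Starts a stand-in runtime thread that runs until asked to stop.
fn start_cell() -> Cell {
    let stop = Arc::new(AtomicBool::new(false));
    let stopped = Arc::new(AtomicBool::new(false));
    let (stop_in, stopped_in) = (Arc::clone(&stop), Arc::clone(&stopped));
    let handle = thread::spawn(move || {
        // Stand-in for script execution: poll the stop flag.
        while !stop_in.load(Ordering::SeqCst) {
            thread::sleep(Duration::from_millis(1));
        }
        // Record that the runtime has finished shutting down.
        stopped_in.store(true, Ordering::SeqCst);
    });
    Cell { stop, stopped, handle }
}

/// Requests termination and returns only after the runtime thread has exited.
/// The returned flag confirms the thread ran its shutdown path.
fn terminate(cell: Cell) -> bool {
    cell.stop.store(true, Ordering::SeqCst);
    // Joining before reporting is the point of the fix sketched here.
    cell.handle.join().expect("runtime thread panicked");
    cell.stopped.load(Ordering::SeqCst)
}
```

Because `terminate` joins before it returns, a caller can never see a "terminated" result while the runtime thread is still executing.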

---

- `core/src/tools/code_mode/*` is now mostly an adapter layer for the
public `exec` / `wait` tools.
- `code-mode/src/service.rs` owns cell sessions and async control flow
in Rust.
- `code-mode/src/runtime/*.rs` owns the embedded V8 isolate and
JavaScript execution.
- each `exec` spawns a dedicated runtime thread plus a Rust
session-control task.
- helper globals are installed directly into the V8 context instead of
being injected through a source prelude.
- helper modules like `tools.js` and `@openai/code_mode` are synthesized
through V8 module resolution callbacks in Rust.

---

Also adds a benchmark that measures the initialization and per-use
overhead of a code mode environment:
```
$ cargo bench -p codex-code-mode --bench exec_overhead -- --samples 30 --warm-iterations 25 --tool-counts 0,32,128
    Finished `bench` profile [optimized] target(s) in 0.18s
     Running benches/exec_overhead.rs (target/release/deps/exec_overhead-008c440d800545ae)
exec_overhead: samples=30, warm_iterations=25, tool_counts=[0, 32, 128]
scenario   tools  samples  warmups  iters  mean/exec  p95/exec   rssΔ p50   rssΔ max
cold_exec      0       30        0      1     1.13ms    1.20ms    8.05MiB    8.06MiB
warm_exec      0       30        1     25   473.43us  512.49us  912.00KiB    1.33MiB
cold_exec     32       30        0      1     1.03ms    1.15ms    8.08MiB    8.11MiB
warm_exec     32       30        1     25   509.73us  545.76us  960.00KiB    1.30MiB
cold_exec    128       30        0      1     1.14ms    1.19ms    8.30MiB    8.34MiB
warm_exec    128       30        1     25   575.08us  591.03us  736.00KiB  864.00KiB
memory uses a fresh-process max RSS delta for each scenario
```

---------

Co-authored-by: Codex <noreply@openai.com>

Channing Conger · 2026-03-20 23:36:58 -07:00

e4eedd6170

Add remote env CI matrix and integration test (#14869)

Setting `CODEX_TEST_REMOTE_ENV` makes `test_codex` start the executor
"remotely" (inside a Docker container), turning any integration test
into a remote test.

pakrym-oai · 2026-03-20 08:02:50 -07:00

ba85a58039

Split features into codex-features crate (#15253)

- Split the feature system into a new `codex-features` crate.
- Cut `codex-core` and workspace consumers over to the new config and
warning APIs.

Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-03-19 20:12:07 -07:00

2e22885e79

Add apply_patch code mode result (#15100)

It's empty!

pakrym-oai · 2026-03-18 16:11:10 -07:00

56d0c6bf67

Add update_plan code mode result (#15103)

It's empty!

pakrym-oai · 2026-03-18 16:10:51 -07:00

3590e181fa

Return image URL from view_image tool (#15072)

Clean up image semantics in code mode.

`view_image` now returns `{image_url: string, details?: string}`.

`image()` now accepts either a string parameter or
`{image_url: string, details?: string}`.

pakrym-oai · 2026-03-18 13:58:20 -07:00

5cada46ddf

Propagate tool errors to code mode (#15075)

Clean up the error flow to push the `FunctionCallError` all the way up
to the dispatcher, and allow code mode to surface it as an exception.

pakrym-oai · 2026-03-18 13:57:55 -07:00

88e5382fc4

Add notify to code-mode (#14842)

Allows the model to send an out-of-band notification.

The notification is injected as another tool call output for the same
call_id.

pakrym-oai · 2026-03-18 09:37:13 -07:00

606d85055f

Rename exec_wait tool to wait (#14983)

Summary
- document that code mode only exposes `exec` and the renamed `wait`
tool
- update code mode tool spec and descriptions to match the new tool name
- rename tests and helper references from `exec_wait` to `wait`

Testing
- Not run (not requested)

pakrym-oai · 2026-03-17 14:22:26 -07:00

ee756eb80f

Add exit helper to code mode scripts (#14851)

- **Summary**
  - expose `exit` through the code mode bridge and module so scripts can
    stop mid-flight
  - surface the helper in the description documentation
  - add a regression test ensuring `exit()` terminates execution cleanly
- **Testing**
  - Not run (not requested)

pakrym-oai · 2026-03-16 22:07:58 +00:00

a3ba10b44b

dynamic tool calls: add param exposeToContext to optionally hide tool (#14501)

This extends dynamic_tool_calls to allow us to hide a tool from the
model context but still use it as part of the general tool calling
runtime (for example, from js_repl/code_mode).

Channing Conger · 2026-03-14 01:58:43 -07:00

70eddad6b0

Add code_mode_only feature (#14617)

Summary
- add the code_mode_only feature flag/config schema and wire its
dependency on code_mode
- update code mode tool descriptions to list nested tools with detailed
headers
- restrict available tools for prompt and exec descriptions when
code_mode_only is enabled and test the behavior

Testing
- Not run (not requested)

pakrym-oai · 2026-03-13 13:30:19 -07:00

477a2dd345

code mode: single line tool declarations (#14526)

## Summary
- render code mode tool declarations as single-line TypeScript snippets
- make the JSON schema renderer emit inline object shapes for these
declarations
- update code mode/spec expectations to match the new inline rendering

## Testing
- `just fmt`
- `cargo test -p codex-core render_json_schema_to_typescript`
- `cargo test -p codex-core code_mode_augments_`
- `cargo test -p codex-core --test all exports_all_tools_metadata --
--nocapture`

pakrym-oai · 2026-03-13 10:08:34 -07:00

9c9867c9fa

code_mode: Move exec params from runtime declarations to @pragma (#14511)

This change moves code_mode exec session settings out of the runtime API
and into an optional first-line pragma. Instead of calling runtime
helpers like `set_yield_time()` or
`set_max_output_tokens_per_exec_call()`, the model can write
`// @exec: {"yield_time_ms": ..., "max_output_tokens": ...}` at the top
of the freeform exec source. Rust now parses that pragma before building
the source, validates it, and passes the values directly in the exec
start message to the code-mode broker, which applies them at session
start without any worker-runtime mutation path. The `@openai/code_mode`
module no longer exposes those setter functions, the docs and grammar
were updated to describe the pragma form, and the existing code_mode
tests were converted to use pragma-based configuration instead.
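
As an illustration of the first-line pragma idea, the sketch below separates the pragma payload from the rest of the script. It is an assumption-laden stand-in rather than the parser from this commit: `split_exec_pragma` and `PRAGMA_PREFIX` are hypothetical names, and the payload is returned as raw text where a real implementation would hand it to a JSON parser and validate the fields.

```rust
/// Prefix that marks the optional first-line pragma (taken from the commit text).
const PRAGMA_PREFIX: &str = "// @exec:";

/// Splits `source` into the raw pragma payload (if present on the first line)
/// and the script that should actually be executed.
fn split_exec_pragma(source: &str) -> (Option<&str>, &str) {
    // Separate the first line from everything after it.
    let (first_line, rest) = match source.split_once('\n') {
        Some((first, rest)) => (first, rest),
        None => (source, ""),
    };
    match first_line.trim().strip_prefix(PRAGMA_PREFIX) {
        // Pragma found: return its payload and drop that line from the script.
        Some(payload) => (Some(payload.trim()), rest),
        // No pragma: the whole input is the script.
        None => (None, source),
    }
}
```

A source that starts with `// @exec: {"yield_time_ms": 500}` comes back as that JSON text plus the remaining lines. A source without the pragma is returned unchanged with `None`.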

Channing Conger · 2026-03-13 03:27:42 +00:00

0daffe667a

Expose code-mode tools through globals (#14517)

Summary
- make all code-mode tools accessible as globals so callers only need
`tools.<name>`
- rename text/image helpers and key globals (store, load, ALL_TOOLS,
etc.) to reflect the new shared namespace
- update the JS bridge, runners, descriptions, router, and tests to
follow the new API

Testing
- Not run (not requested)

pakrym-oai · 2026-03-12 15:43:59 -07:00

a2546d5dff

Rename exec session IDs to cell IDs (#14510)

- Update the code-mode executor, wait handler, and protocol plumbing to
use cell IDs instead of session IDs for node communication
- Switch tool metadata, wait description, and suite tests to refer to
cell IDs so user-visible messages match the new terminology

**Testing**
- Not run (not requested)

pakrym-oai · 2026-03-12 14:05:30 -07:00

04e14bdf23

Fix MCP tool calling (#14491)

Properly escape mcp tool names and make tools only available via
imports.

pakrym-oai · 2026-03-12 13:38:52 -07:00

dadffd27d4

Skip nested tool call parallel test on Windows (#14505)

**Summary**
- disable the `code_mode_nested_tool_calls_can_run_in_parallel` test on
Windows where `exec_command` is unavailable

**Testing**
- Not run (not requested)

pakrym-oai · 2026-03-12 13:32:11 -07:00

a5a4899d0c

Add parallel tool call test (#14494)

Summary
- pin tests to `test-gpt-5.1-codex` so code-mode suites exercise that
model explicitly
- add a regression test that ensures nested tool calls can execute in
parallel and assert on timing
- refresh `codex-rs/Cargo.lock` for the updated dependency tree (add
`codex-utils-pty`, drop `codex-otel`)

Testing
- Not run (not requested)

pakrym-oai · 2026-03-12 12:10:14 -07:00

25e301ed98

Add default code-mode yield timeout (#14484)

Summary
- expose the default yield timeout through code mode runtime so the
handler, wait tool, and protocol share the same 10s value that matches
unified exec
- document the timeout change in the tool descriptions and propagate the
value all the way into the runner metadata
- adjust Cargo.lock to keep the dependency tree in sync with the added
code mode tool dependency

Testing
- Not run (not requested)
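
One way to picture a single shared default is a constant that every waiter falls back to. The sketch below assumes a channel-based wait, which is not how the code worked at the time of this commit. `DEFAULT_YIELD_TIME`, `WaitOutcome`, and `wait_for_cell` are hypothetical names, and only the 10s value comes from the summary above.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::time::Duration;

/// Shared default yield timeout (10s, per the commit summary).
const DEFAULT_YIELD_TIME: Duration = Duration::from_secs(10);

/// What a waiter reports after one yield window (hypothetical type).
#[derive(Debug, PartialEq)]
enum WaitOutcome {
    Completed(String),
    StillRunning,
    RuntimeGone,
}

/// Waits for a cell's final output, yielding control back to the caller
/// once the yield window elapses. `None` selects the shared default.
fn wait_for_cell(rx: &mpsc::Receiver<String>, yield_time: Option<Duration>) -> WaitOutcome {
    match rx.recv_timeout(yield_time.unwrap_or(DEFAULT_YIELD_TIME)) {
        Ok(output) => WaitOutcome::Completed(output),
        // The script is still executing; the caller can wait again later.
        Err(RecvTimeoutError::Timeout) => WaitOutcome::StillRunning,
        // The sending side was dropped without producing a result.
        Err(RecvTimeoutError::Disconnected) => WaitOutcome::RuntimeGone,
    }
}
```

Keeping the fallback in one constant means the handler and the wait tool cannot drift apart on the default value.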

pakrym-oai · 2026-03-12 12:06:23 -07:00

d1b03f0d7f

Cleanup code_mode tool descriptions (#14480)

Move to separate files and clarify a bit.

pakrym-oai · 2026-03-12 11:13:35 -07:00

cfe3f6821a

Dispatch tools when code mode is not awaited directly (#14437)

## Summary
- start a code mode worker once per turn and let it pump nested tool
calls through a dedicated queue
- simplify code mode request/response dispatch around request ids and
generic runner-unavailable errors
- clean up the code mode process API and runner protocol plumbing

## Testing
- not run yet

pakrym-oai · 2026-03-12 09:00:20 -07:00

2f03b1a322

Support waiting for code_mode sessions (#14295)

## Summary
- persist the code mode runner process in the session-scoped code mode
store
- switch the runner protocol from `init` to `start` with explicit
session ids
- handle runner-side session processing without the init waiter queue

## Validation
- just fmt
- cargo check -p codex-core
- node --check codex-rs/core/src/tools/code_mode_runner.cjs

pakrym-oai · 2026-03-11 23:13:54 -07:00

f6c6128fc7

Add ALL_TOOLS export to code mode (#14294)

So code mode can search for tools.

pakrym-oai · 2026-03-11 12:33:10 -07:00

65b325159d

Prefix code mode output with success or failure message and include error stack (#14272)

pakrym-oai · 2026-03-11 12:33:09 -07:00

01792a4c61

Rename code mode tool to exec (#14254)

Summary
- update the code-mode handler, runner, instructions, and error text to
refer to the `exec` tool name everywhere that used to say `code_mode`
- ensure generated documentation strings and tool specs describe `exec`
and rely on the shared `PUBLIC_TOOL_NAME`
- refresh the suite tests so they invoke `exec` instead of the old name

Testing
- Not run (not requested)

pakrym-oai · 2026-03-11 12:33:09 -07:00

8a099b3dfb

Add store/load support for code mode (#14259)

adds support for transferring state across code mode invocations.

pakrym-oai · 2026-03-11 12:33:09 -07:00

83b22bb612

Add code_mode output helpers for text and images (#14244)

Summary
- document how code-mode can import `output_text`/`output_image` and
ensure `add_content` stays compatible
- add a synthetic `@openai/code_mode` module that appends content items
and validates inputs
- cover the new behavior with integration tests for structured text and
image outputs

Testing
- Not run (not requested)

pakrym-oai · 2026-03-11 12:33:08 -07:00

07c22d20f6

Add model-controlled truncation for code mode results (#14258)

Summary
- document that `@openai/code_mode` exposes
`set_max_output_tokens_per_exec_call` and that `code_mode` truncates the
final Rust-side output when the budget is exceeded
- enforce the configured budget in the Rust tool runner, reusing
truncation helpers so text-only outputs follow the unified-exec wrapper
and mixed outputs still fit within the limit
- ensure the new behavior is covered by a code-mode integration test and
string spec update

Testing
- Not run (not requested)
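
A minimal sketch of enforcing an output budget on the Rust side is shown below. It is not the truncation helper this commit reuses. It assumes roughly four bytes per token (a crude stand-in for real token counting), handles only plain text, and uses the hypothetical names `truncate_to_token_budget` and `APPROX_BYTES_PER_TOKEN`.

```rust
/// Assumed average token size; a real implementation would count tokens properly.
const APPROX_BYTES_PER_TOKEN: usize = 4;

/// Returns `text` unchanged if it fits the budget, otherwise a prefix that
/// fits, followed by a marker line noting the truncation.
fn truncate_to_token_budget(text: &str, max_output_tokens: usize) -> String {
    let max_bytes = max_output_tokens.saturating_mul(APPROX_BYTES_PER_TOKEN);
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Back up to a UTF-8 character boundary so slicing cannot panic.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    format!("{}\n[output truncated]", &text[..end])
}
```

With a budget of 2 tokens, a ten-byte ASCII string keeps its first eight bytes and gains the marker line. Mixed text and image outputs, which the commit also covers, would need per-item accounting that this sketch omits.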

pakrym-oai · 2026-03-11 12:33:08 -07:00

3d41ff0b77

Add output schema to MCP tools and expose MCP tool results in code mode (#14236)

Summary
- drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers
to keep MCP tooling focused on the final result shape
- wire the new schema definitions through code mode, context, handlers,
and spec modules so MCP tools serialize the exact output shape expected
by the model
- extend code mode tests to cover multiple MCP call scenarios and ensure
the serialized data matches the new schema
- refresh JS runner helpers and protocol models alongside the schema
changes

Testing
- Not run (not requested)

pakrym-oai · 2026-03-11 12:33:08 -07:00

ee8f84153e

Expose strongly-typed result for exec_command (#14183)

Summary
- document output types for the various tool handlers and registry so
the API exposes richer descriptions
- update unified execution helpers and client tests to align with the
new output metadata
- clean up unused helpers across tool dispatch paths

Testing
- Not run (not requested)

pakrym-oai · 2026-03-11 12:33:07 -07:00

00ea8aa7ee

Export tools module into code mode runner (#14167)

**Summary**
- allow `code_mode` to pass enabled tools metadata to the runner and
expose them via `tools.js`
- import tools inside JavaScript rather than relying only on globals or
proxies for nested tool calls
- update specs, docs, and tests to exercise the new bridge and explain
the tooling changes

**Testing**
- Not run (not requested)

pakrym-oai · 2026-03-09 21:59:09 -07:00

710682598d

Add code_mode experimental feature (#13418)

A much narrower and more isolated (no node features) version of js_repl

pakrym-oai · 2026-03-09 20:56:27 -07:00

da616136cc

33 Commits