Commit Graph

14 Commits

  • Persist js_repl codex helpers across cells (#14503)
    ## Summary
    
    This changes `js_repl` so saved references to `codex.tool(...)` and
    `codex.emitImage(...)` keep working across cells.
    
    Previously, those helpers were recreated per exec and captured that
    exec's `message.id`. If a persisted object or saved closure reused an
    old helper in a later cell, the nested tool/image call could fail with
    `js_repl exec context not found`.
    
    This patch:
    - keeps stable `codex.tool` and `codex.emitImage` helper identities in
    the kernel
    - resolves the current exec dynamically at call time using
    `AsyncLocalStorage`
    - adds regression coverage for persisted helper references across cells
    - updates the js_repl docs and project-doc instructions to describe the
    new behavior and its limits
    
    ## Why
    
    We already support persistent top-level bindings across `js_repl` cells,
    so persisted objects should be able to reuse `codex` helpers in later
    active cells. The bug was that helper identity was exec-scoped, not
    kernel-scoped.
    
    Using `AsyncLocalStorage` fixes the cross-cell reuse case without
    falling back to a single global active exec that could accidentally
    attribute stale background callbacks to the wrong cell.
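
    The call-time lookup described above can be sketched in plain Node.js.
    This is a minimal illustration, not the kernel's code: `execStorage`,
    `runCell`, and the returned object shape are hypothetical names. Only
    `AsyncLocalStorage` and the error text come from the description.

    ```javascript
    const { AsyncLocalStorage } = require("node:async_hooks");

    // Hypothetical store holding the exec that is currently running.
    const execStorage = new AsyncLocalStorage();

    // One helper object for the whole kernel lifetime (stable identity).
    const codex = {
      tool(name, args) {
        // Resolve the active exec when the helper is called, not when it is created.
        const exec = execStorage.getStore();
        if (!exec) throw new Error("js_repl exec context not found");
        return { execId: exec.id, name, args };
      },
    };

    // Hypothetical cell runner: everything inside `fn` sees `{ id }` as its exec.
    function runCell(id, fn) {
      return execStorage.run({ id }, fn);
    }

    // Cell 1 saves a reference to the helper; cell 2 reuses it.
    let savedTool;
    runCell("exec-1", () => { savedTool = codex.tool; });
    const laterCall = runCell("exec-2", () => savedTool("shell", { command: "ls" }));
    console.log(laterCall.execId); // exec-2
    ```

    Because the helper reads the store at call time, the saved reference is
    attributed to the cell that invokes it rather than the cell that
    captured it, and a call outside any exec still fails loudly.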
  • Let models opt into original image detail (#14175)
    ## Summary
    
    This PR narrows original image detail handling to a single opt-in
    feature:
    
    - `image_detail_original` lets the model request `detail: "original"` on
    supported models
    - Omitting `detail` preserves the default resized behavior
    
    The model only sees `detail: "original"` guidance when the active model
    supports it:
    
    - JS REPL instructions include the guidance and examples only on
    supported models
    - `view_image` only exposes a `detail` parameter when the feature and
    model can use it
    
    The image detail API is intentionally narrow and consistent across both
    paths:
    
    - `view_image.detail` supports only `"original"`; otherwise omit the
    field
    - `codex.emitImage(..., detail)` supports only `"original"`; otherwise
    omit the field
    - Unsupported explicit values fail clearly at the API boundary instead
    of being silently reinterpreted
    - Unsupported explicit `detail: "original"` requests fall back to normal
    behavior when the feature is disabled or the model does not support
    original detail
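
    The rules above can be read as a small decision function. The sketch
    below is an assumption about how they fit together, not the actual
    implementation: `resolveImageDetail` and its option names are
    hypothetical.

    ```javascript
    // Hypothetical helper: decide which `detail` value, if any, to forward.
    function resolveImageDetail(detail, { featureEnabled, modelSupportsOriginal }) {
      // Omitting `detail` keeps the default resized behavior.
      if (detail === undefined) return undefined;
      // Any explicit value other than "original" is rejected at the boundary.
      if (detail !== "original") {
        throw new TypeError(`unsupported image detail: ${JSON.stringify(detail)}`);
      }
      // "original" falls back to normal behavior when it cannot be honored.
      return featureEnabled && modelSupportsOriginal ? "original" : undefined;
    }

    const supported = { featureEnabled: true, modelSupportsOriginal: true };
    const unsupported = { featureEnabled: true, modelSupportsOriginal: false };
    console.log(resolveImageDetail("original", supported));   // original
    console.log(resolveImageDetail("original", unsupported)); // undefined
    ```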
  • Add js_repl cwd and homeDir helpers (#14385)
    ## Summary
    
    This PR adds two read-only path helpers to `js_repl`:
    
    - `codex.cwd`
    - `codex.homeDir`
    
    They are exposed alongside the existing `codex.tmpDir` helper so the
    REPL can reference basic host path context without reopening direct
    `process` access.
    
    ## Implementation
    
    - expose `codex.cwd` and `codex.homeDir` from the js_repl kernel
    - make `codex.homeDir` come from the kernel process environment
    - pass session dependency env through js_repl kernel startup so
    `codex.homeDir` matches the env a shell-launched process would see
    - keep existing shell `HOME` population behavior unchanged
    - update js_repl prompt/docs and add runtime/integration coverage for
    the new helpers
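
    A rough usage sketch, runnable outside the REPL. `makePathHelpers` is a
    hypothetical stand-in for the kernel wiring, and reading `HOME` from the
    passed-in env is an assumption. Only the `cwd`, `homeDir`, and `tmpDir`
    field names come from this PR.

    ```javascript
    const path = require("node:path");

    // Hypothetical stand-in for the kernel's helper object: plain values on a
    // frozen object, so cell code can read them but not replace them.
    function makePathHelpers({ cwd, env, tmpDir }) {
      return Object.freeze({ cwd, homeDir: env.HOME ?? null, tmpDir });
    }

    const codex = makePathHelpers({
      cwd: "/work/project",
      env: { HOME: "/home/dev" }, // assumed session dependency env
      tmpDir: "/tmp/js_repl",
    });

    // Build paths from the helpers instead of reaching for `process`.
    const configPath = path.posix.join(codex.homeDir, ".config", "tool.json");
    const srcPath = path.posix.join(codex.cwd, "src");
    console.log(configPath); // /home/dev/.config/tool.json
    ```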
  • Clarify js_repl image emission and encoding guidance (#13639)
    ## Summary
    
    This updates the `js_repl` prompt and docs to make the image guidance
    less confusing.
    
    ## What changed
    
    - Clarified that `codex.emitImage(...)` adds one image per call and can
    be called multiple times to emit multiple images.
    - Reworded the image-encoding guidance to be general `js_repl` advice
    instead of `ImageDetailOriginal`-specific behavior.
    - Updated the guidance to recommend JPEG at about quality 85 when lossy
    compression is acceptable, and PNG when transparency or lossless detail
    matters.
    - Mirrored the same wording in the public `js_repl` docs.
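
    The encoding guidance amounts to a two-way choice. A minimal sketch,
    assuming a hypothetical `chooseImageEncoding` helper whose option names
    are made up for illustration:

    ```javascript
    // Hypothetical helper that turns the guidance into a decision.
    function chooseImageEncoding({ needsTransparency = false, needsLossless = false } = {}) {
      // PNG when transparency or lossless detail matters.
      if (needsTransparency || needsLossless) {
        return { type: "png", mimeType: "image/png" };
      }
      // Otherwise JPEG at about quality 85.
      return { type: "jpeg", quality: 85, mimeType: "image/jpeg" };
    }

    const photo = chooseImageEncoding();
    const diagram = chooseImageEncoding({ needsLossless: true });
    console.log(photo.type, diagram.type); // jpeg png
    ```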
  • Harden js_repl emitImage to accept only data: URLs (#13507)
    ### Motivation
    
    - Prevent untrusted `js_repl` code from supplying arbitrary external
    URLs that the host would forward into model input, which could trigger
    external fetches or data exfiltration. This change narrows the
    `emitImage` contract to safe, self-contained data URLs.
    
    ### Description
    
    - Kernel: added `normalizeEmitImageUrl` and enforce that string-valued
    `codex.emitImage(...)` inputs and `input_image`/content-item paths only
    accept non-empty `data:` URLs; byte-based paths still produce data URLs
    as before (`kernel.js`).
    - Host: added `validate_emitted_image_url` and check `EmitImage`
    requests before creating `FunctionCallOutputContentItem::InputImage`,
    returning an error to the kernel if the URL is not a `data:` URL
    (`mod.rs`).
    - Tests/docs: added a runtime test
    `js_repl_emit_image_rejects_non_data_url` to assert rejection of
    non-data URLs and updated user-facing docs/instruction text to state
    `data URL` support instead of generic direct image URLs (`mod.rs`,
    `docs/js_repl.md`, `project_doc.rs`).
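
    The kernel-side check can be pictured roughly as follows. This is a
    sketch under assumptions, not the real `normalizeEmitImageUrl`:
    `assertDataUrl` is a hypothetical name, and the case-insensitive scheme
    match is a guess at reasonable behavior.

    ```javascript
    // Hypothetical check: accept only non-empty strings that use the data: scheme.
    function assertDataUrl(value) {
      if (typeof value !== "string" || value.length === 0) {
        throw new TypeError("image URL must be a non-empty string");
      }
      if (!/^data:/i.test(value)) {
        throw new TypeError("only data: URLs are accepted");
      }
      return value;
    }

    const accepted = assertDataUrl("data:image/png;base64,aGk=");

    let rejectedMessage = null;
    try {
      assertDataUrl("https://example.com/cat.png");
    } catch (error) {
      rejectedMessage = error.message;
    }
    console.log(rejectedMessage); // only data: URLs are accepted
    ```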
    
    ### Testing
    
    - Ran `just fmt` in `codex-rs`; it completed successfully.
    - Added a runtime test (`cargo test -p codex-core
    js_repl_emit_image_rejects_non_data_url`) but executing the test in this
    environment failed due to a missing system dependency required by
    `codex-linux-sandbox` (the vendored `bubblewrap` build requires
    `libcap.pc` via `pkg-config`), so the test could not be run here.
    - Attempted a focused `cargo test` invocation with and without default
    features; both compile/test attempts were blocked by the same missing
    system `libcap` dependency in this environment.
    
    ------
    [Codex Task](https://chatgpt.com/codex/tasks/task_i_69a7837bce98832d91db92d5f76d6cbe)
  • Persist initialized js_repl bindings after failed cells (#13482)
    ## Summary
    
    - Change `js_repl` failed-cell persistence so later cells keep prior
    bindings plus only the current-cell bindings whose initialization
    definitely completed before the throw.
    - Preserve initialized lexical bindings across failed cells via
    module-namespace readability, including top-level destructuring that
    partially succeeds before a later throw.
    - Preserve hoisted `var` and `function` bindings only when execution
    clearly reached their declaration site, and preserve direct top-level
    pre-declaration `var` writes and updates through explicit write-site
    markers.
    - Preserve top-level `for...in` / `for...of` `var` bindings when the
    loop body executes at least once, using a first-iteration guard to avoid
    per-iteration bookkeeping overhead.
    - Keep prior module state intact across link-time failures and
    evaluation failures before the prelude runs, while still allowing failed
    cells that already recreated prior bindings to persist updates to those
    existing bindings.
    - Hide internal commit hooks from user `js_repl` code after the prelude
    aliases them, so snippets cannot spoof committed bindings by calling the
    raw `import.meta` hooks directly.
    - Add focused regression coverage for the supported failed-cell
    behaviors and the intentionally unsupported boundaries.
    - Update `js_repl` docs and generated instructions to describe the new,
    narrower failed-cell persistence model.
    
    ## Motivation
    
    We saw `js_repl` drop bindings that had already been initialized
    successfully when a later statement in the same cell threw, for example:
    
        const { context: liveContext, session } =
          await initializeGoogleSheetsLiveForTab(tab);
        // later statement throws
    
    That was surprising in practice because successful earlier work
    disappeared from the next cell.
    
    This change makes failed-cell persistence more useful without trying to
    model every possible partially executed JavaScript edge case. The
    resulting behavior is narrower and easier to reason about:
    
    - prior bindings are always preserved
    - lexical bindings persist when their initialization completed before
    the throw
    - hoisted `var` / `function` bindings persist only when execution
    clearly reached their declaration or a supported top-level `var` write
    site
    - failed cells that already recreated prior bindings can persist writes
    to those existing bindings even if they introduce no new bindings
    
    The detailed edge-case matrix stays in `docs/js_repl.md`. The
    model-facing `project_doc` guidance is intentionally shorter and focused
    on generation-relevant behavior.
    
    ## Supported Failed-Cell Behavior
    
    - Prior bindings remain available after a failed cell.
    - Initialized lexical bindings remain available after a failed cell.
    - Top-level destructuring like `const { a, b } = ...` preserves names
    whose initialization completed before a later throw.
    - Hoisted `function` bindings persist when execution reached the
    declaration statement before the throw.
    - Direct top-level pre-declaration `var` writes and updates persist, for
    example:
      - `x = 1`
      - `x += 1`
      - `x++`
    - Short-circuiting logical assignments persist only when the write
    branch actually runs.
    - Non-empty top-level `for...in` / `for...of` `var` loops persist their
    loop bindings.
    - Failed cells can persist updates to existing carried bindings after
    the prelude has run, even when the cell commits no new bindings.
    - Link failures and eval failures before the prelude do not poison
    `@prev`.
    
    ## Intentionally Unsupported Failed-Cell Cases
    
    - Hoisted function reads before the declaration, such as `foo(); ...;
    function foo() {}`
    - Aliasing or inference-based recovery from reads before declaration
    - Nested writes inside already-instrumented assignment RHS expressions
    - Destructuring-assignment recovery for hoisted `var`
    - Partial `var` destructuring recovery
    - Pre-declaration `undefined` reads for hoisted `var`
    - Empty top-level `for...in` / `for...of` loop vars
    - Nested or scope-sensitive pre-declaration `var` writes outside direct
    top-level expression statements
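
    A toy model of the core rule (prior bindings plus whatever finished
    initializing before the throw). It is deliberately simplified and does
    not reflect the real instrumentation: `runCell` and `commit` are
    hypothetical, and the explicit `commit(...)` calls stand in for the
    markers the kernel would insert.

    ```javascript
    // Toy model: a cell receives the prior bindings and a `commit` marker that
    // is called right after each initialization completes.
    function runCell(prior, body) {
      const bindings = { ...prior };
      const commit = (name, value) => {
        bindings[name] = value;
        return value;
      };
      try {
        body(commit, bindings);
        return { ok: true, bindings };
      } catch (error) {
        // Keep prior bindings plus whatever was committed before the throw.
        return { ok: false, error: String(error), bindings };
      }
    }

    const failed = runCell({ tab: "Sheet1" }, (commit) => {
      commit("session", { id: 1 }); // initialization completed
      throw new Error("later statement throws");
    });

    // The next cell still sees `tab` and `session`.
    const next = runCell(failed.bindings, (commit, b) => {
      commit("summary", `${b.tab}:${b.session.id}`);
    });
    console.log(next.bindings.summary); // Sheet1:1
    ```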
  • [js_repl] Support local ESM file imports (#13437)
    ## Summary
    - add `js_repl` support for dynamic imports of relative and absolute
    local ESM `.js` / `.mjs` files
    - keep bare package imports on the native Node path and resolved from
    REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then `cwd`),
    even when they originate from imported local files
    - restrict static imports inside imported local files to other local
    relative/absolute `.js` / `.mjs` files, and surface a clear error for
    unsupported top-level static imports in the REPL cell
    - run imported local files inside the REPL VM context so they can access
    `codex.tmpDir`, `codex.tool`, captured `console`, and Node-like
    `import.meta` helpers
    - reload local files between execs so later `await import("./file.js")`
    calls pick up edits and fixed failures, while preserving package/builtin
    caching and persistent top-level REPL bindings
    - make `import.meta.resolve()` self-consistent by allowing the returned
    `file://...` URLs to round-trip through `await import(...)`
    - update both public and injected `js_repl` docs to clarify the narrowed
    contract, including global bare-import resolution behavior for local
    absolute files
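
    The narrowed import contract can be summarized as a classification of
    specifiers. The sketch below is illustrative only: `classifySpecifier`
    and its category labels are hypothetical, and the real resolver
    presumably handles more cases.

    ```javascript
    const { builtinModules } = require("node:module");
    const path = require("node:path");

    // Hypothetical classifier mirroring the rules listed above.
    function classifySpecifier(specifier) {
      const withoutPrefix = specifier.replace(/^node:/, "");
      if (specifier.startsWith("node:") || builtinModules.includes(withoutPrefix)) {
        return "builtin"; // stays on the native Node path
      }
      const isLocal =
        specifier.startsWith("./") ||
        specifier.startsWith("../") ||
        specifier.startsWith("file://") ||
        path.isAbsolute(specifier);
      if (isLocal) {
        // Only local `.js` / `.mjs` files are supported.
        return /\.m?js$/.test(specifier) ? "local-esm" : "unsupported-local";
      }
      return "bare-package"; // resolved from the REPL-global search roots
    }

    console.log(classifySpecifier("./file.js")); // local-esm
    console.log(classifySpecifier("lodash"));    // bare-package
    ```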
    
    ## Testing
    - `cargo test -p codex-core js_repl_`
    - built codex binary and verified behavior
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Make js_repl image output controllable (#13331)
    ## Summary
    
    Instead of always adding inner function call outputs to the model
    context, let js code decide which ones to return.
    
    - Stop auto-hoisting nested tool outputs from `codex.tool(...)` into the
    outer `js_repl` function output.
    - Keep `codex.tool(...)` return values unchanged as structured JS
    objects.
    - Add `codex.emitImage(...)` as the explicit path for attaching an image
    to the outer `js_repl` function output.
    - Support emitting from a direct image URL, a single `input_image` item,
    an explicit `{ bytes, mimeType }` object, or a raw tool response object
    containing exactly one image.
    - Preserve existing `view_image` original-resolution behavior when JS
    emits the raw `view_image` tool result.
    - Suppress the special `ViewImageToolCall` event for `js_repl`-sourced
    `view_image` calls so nested inspection stays side-effect free until JS
    explicitly emits.
    - Update the `js_repl` docs and generated project instructions with both
    recommended patterns:
      - `await codex.emitImage(codex.tool("view_image", { path }))`
    - `await codex.emitImage({ bytes: await page.screenshot({ type: "jpeg",
    quality: 85 }), mimeType: "image/jpeg" })`
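
    For the `{ bytes, mimeType }` form, the image ends up as a
    self-contained data URL. A minimal sketch of that conversion, where
    `toDataUrl` is a hypothetical name and the kernel's own code may differ:

    ```javascript
    // Hypothetical helper: base64-encode raw bytes into a data URL.
    function toDataUrl({ bytes, mimeType }) {
      const base64 = Buffer.from(bytes).toString("base64");
      return `data:${mimeType};base64,${base64}`;
    }

    // Two bytes (ASCII "hi") stand in for real image data.
    const url = toDataUrl({ bytes: new Uint8Array([104, 105]), mimeType: "image/png" });
    console.log(url); // data:image/png;base64,aGk=
    ```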
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/13050
    - 👉 `2` https://github.com/openai/codex/pull/13331
    -  `3` https://github.com/openai/codex/pull/13049
  • Log js_repl nested tool responses in rollout history (#12837)
    ## Summary
    
    - add tracing-based diagnostics for nested `codex.tool(...)` calls made
    from `js_repl`
    - emit a bounded, sanitized summary at `info!`
    - emit the exact raw serialized response object or error string seen by
    JavaScript at `trace!`
    - document how to enable these logs and where to find them, especially
    for `codex app-server`
    
    ## Why
    
    Nested `codex.tool(...)` calls inside `js_repl` are a debugging
    boundary: JavaScript sees the tool result, but that result is otherwise
    hard to inspect from outside the kernel.
    
    This change adds explicit tracing for that path using the repo’s normal
    observability pattern:
    - `info` for compact summaries
    - `trace` for exact raw payloads when deep debugging is needed
    
    ## What changed
    
    - `js_repl` now summarizes nested tool-call results across the response
    shapes it can receive:
      - message content
      - function-call outputs
      - custom tool outputs
      - MCP tool results and MCP error results
      - direct error strings
    - each nested `codex.tool(...)` completion logs:
      - `exec_id`
      - `tool_call_id`
      - `tool_name`
      - `ok`
      - a bounded summary struct describing the payload shape
    - at `trace`, the same path also logs the exact serialized response
    object or error string that JavaScript received
    - docs now include concrete logging examples for `codex app-server`
    - unit coverage was added for multimodal function output summaries and
    error summaries
    
    ## How to use it
    
    ### Summary-only logging
    
    Set:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=info
    ```
    
    For `codex app-server`, tracing output is written to the server process
    `stderr`.
    
    Example:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=info \
    LOG_FORMAT=json \
    codex app-server \
    2> /tmp/codex-app-server.log
    ```
    
    This emits bounded summary lines for nested `codex.tool(...)` calls.
    
    ### Full raw debugging
    
    Set:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=trace
    ```
    
    Example:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=trace \
    LOG_FORMAT=json \
    codex app-server \
    2> /tmp/codex-app-server.log
    ```
    
    At `trace`, you get:
    - the same `info` summary line
    - a `trace` line with the exact serialized response object seen by
    JavaScript
    - or the exact error string if the nested tool call failed
    
    ### Where the logs go
    
    For `codex app-server`, these logs go to process `stderr`, so redirect
    or capture `stderr` to inspect them.
    
    Example:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=trace \
    LOG_FORMAT=json \
    /Users/fjord/code/codex/codex-rs/target/debug/codex app-server \
    2> /tmp/codex-app-server.log
    ```
    
    Then inspect:
    
    ```sh
    rg "js_repl nested tool call" /tmp/codex-app-server.log
    ```
    
    Without an explicit `RUST_LOG` override, these `js_repl` nested
    tool-call logs are typically not visible.
  • js_repl: remove codex.state helper references (#12275)
    ## Summary
    
    This PR removes `codex.state` from the `js_repl` helper surface and
    removes all corresponding documentation/instruction references.
    
    ## Motivation
    
    Top-level bindings in `js_repl` now persist across cells, so the extra
    `codex.state` helper is redundant and adds unnecessary API/docs surface.
    
    ## Changes
    
    - Removed the long-lived `state` object from the Node kernel helper
    wiring.
    - Stopped exposing `codex.state` (and `context.state`) during `js_repl`
    execution.
    - Updated user-facing `js_repl` docs to remove `codex.state`.
    - Updated generated instruction text and related test expectations to
    list only:
      - `codex.tmpDir`
      - `codex.tool(name, args?)`
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/12300
    - 👉 `2` https://github.com/openai/codex/pull/12275
    -  `3` https://github.com/openai/codex/pull/12205
    -  `4` https://github.com/openai/codex/pull/12185
    -  `5` https://github.com/openai/codex/pull/10673
  • [js_repl] paths for node module resolution can be specified for js_repl (#11944)
    In `js_repl` mode, module resolution currently starts from
    `js_repl_kernel.js`, which is written to a per-kernel temp dir. This
    effectively means that bare imports will not resolve.
    
    This PR adds a new config option, `js_repl_node_module_dirs`, which is a
    list of dirs that are used (in order) to resolve a bare import. If none
    of those work, the current working directory of the thread is used.
    
    For example:
    ```toml
    js_repl_node_module_dirs = [
        "/path/to/node_modules/",
        "/other/path/to/node_modules/",
    ]
    ```
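
    The lookup order can be approximated in plain Node.js. This sketch uses
    CommonJS `require.resolve` with its `paths` option as a stand-in for the
    REPL's own resolution, which is an assumption. `resolveBareImport` and
    the throwaway `js-repl-demo-pkg` fixture are hypothetical.

    ```javascript
    const fs = require("node:fs");
    const os = require("node:os");
    const path = require("node:path");

    // Try each configured directory in order, then fall back to the cwd.
    function resolveBareImport(specifier, moduleDirs, cwd) {
      for (const dir of [...moduleDirs, cwd]) {
        try {
          return require.resolve(specifier, { paths: [dir] });
        } catch (error) {
          // Keep searching only when the module was simply not found here.
          if (error.code !== "MODULE_NOT_FOUND") throw error;
        }
      }
      throw new Error(`Cannot resolve bare import "${specifier}"`);
    }

    // Demo fixture: a temporary node_modules directory with one package.
    const root = fs.mkdtempSync(path.join(os.tmpdir(), "js-repl-demo-"));
    const pkgDir = path.join(root, "node_modules", "js-repl-demo-pkg");
    fs.mkdirSync(pkgDir, { recursive: true });
    fs.writeFileSync(path.join(pkgDir, "index.js"), "module.exports = 42;\n");

    const resolved = resolveBareImport(
      "js-repl-demo-pkg",
      ["/nonexistent/node_modules", path.join(root, "node_modules")],
      process.cwd(),
    );
    console.log(path.basename(path.dirname(resolved))); // js-repl-demo-pkg
    ```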
  • Add js_repl_tools_only model and routing restrictions (#10671)
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/10674
    -  `2` https://github.com/openai/codex/pull/10672
    - 👉 `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • Add js_repl host helpers and exec end events (#10672)
    ## Summary
    
    This PR adds host-integrated helper APIs for `js_repl` and updates model
    guidance so the agent can use them reliably.
    
    ### What’s included
    
    - Add `codex.tool(name, args?)` in the JS kernel so `js_repl` can call
    normal Codex tools.
    - Keep persistent JS state and scratch-path helpers available:
      - `codex.state`
      - `codex.tmpDir`
    - Wire `js_repl` tool calls through the standard tool router path.
    - Add/align `js_repl` execution completion/end event behavior with
    existing tool logging patterns.
    - Update dynamic prompt injection (`project_doc`) to document:
      - how to call `codex.tool(...)`
      - raw output behavior
      - image flow via `view_image` (`codex.tmpDir` +
      `codex.tool("view_image", ...)`)
      - stdio safety guidance (`console.log` / `codex.tool`, avoid direct
      `process.std*`)
    
    ## Why
    
    - Standardize JS-side tool usage on `codex.tool(...)`
    - Make `js_repl` behavior more consistent with existing tool execution
    and event/logging patterns.
    - Give the model enough runtime guidance to use `js_repl` safely and
    effectively.
    
    ## Testing
    
    - Added/updated unit and runtime tests for:
      - `codex.tool` calls from `js_repl` (including shell/MCP paths)
      - image handoff flow via `view_image`
      - prompt-injection text for `js_repl` guidance
      - execution/end event behavior and related regression coverage
    
    
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/10674
    - 👉 `2` https://github.com/openai/codex/pull/10672
    -  `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • Add feature-gated freeform js_repl core runtime (#10674)
    ## Summary
    
    This PR adds an **experimental, feature-gated `js_repl` core runtime**
    so models can execute JavaScript in a persistent REPL context across
    tool calls.
    
    The implementation integrates with existing feature gating, tool
    registration, prompt composition, config/schema docs, and tests.
    
    ## What changed
    
    - Added new experimental feature flag: `features.js_repl`.
    - Added freeform `js_repl` tool and companion `js_repl_reset` tool.
    - Gated tool availability behind `Feature::JsRepl`.
    - Added conditional prompt-section injection for JS REPL instructions
    via marker-based prompt processing.
    - Implemented JS REPL handlers, including freeform parsing and pragma
    support (timeout/reset controls).
    - Added runtime resolution order for Node:
      1. `CODEX_JS_REPL_NODE_PATH`
      2. `js_repl_node_path` in config
      3. `PATH`
    - Added JS runtime assets/version files and updated docs/schema.
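
    The Node resolution order above reads as a simple precedence chain. The
    sketch below mirrors it in JavaScript for illustration only: the real
    lookup lives in the host, and `pickNodeBinary` with its return shape is
    hypothetical.

    ```javascript
    // Hypothetical mirror of the documented precedence.
    function pickNodeBinary({ env = {}, config = {} } = {}) {
      // 1. Environment variable override.
      if (env.CODEX_JS_REPL_NODE_PATH) {
        return { path: env.CODEX_JS_REPL_NODE_PATH, source: "env" };
      }
      // 2. Config file setting.
      if (config.js_repl_node_path) {
        return { path: config.js_repl_node_path, source: "config" };
      }
      // 3. Whatever `node` resolves to on PATH.
      return { path: "node", source: "PATH" };
    }

    const picked = pickNodeBinary({
      env: { CODEX_JS_REPL_NODE_PATH: "/opt/node/bin/node" },
      config: { js_repl_node_path: "/usr/local/bin/node" },
    });
    console.log(picked.source); // env
    ```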
    
    ## Why
    
    This enables richer agent workflows that require incremental JavaScript
    execution with preserved state, while keeping rollout safe behind an
    explicit feature flag.
    
    ## Testing
    
    Coverage includes:
    
    - Feature-flag gating behavior for tool exposure.
    - Freeform parser/pragma handling edge cases.
    - Runtime behavior (state persistence across calls and top-level `await`
    support).
    
    ## Usage
    
    ```toml
    [features]
    js_repl = true
    ```
    
    Optional runtime override:
    
    - `CODEX_JS_REPL_NODE_PATH`, or
    - `js_repl_node_path` in config.
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/10674
    -  `2` https://github.com/openai/codex/pull/10672
    -  `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670