24 Commits

  • Preserve namespaces on custom tool calls (#30302)
    ## Summary
    
    - Preserve the optional namespace on custom tool calls during response
    deserialization and app-server replay.
    - Use the namespaced tool identifier for streaming argument handling and
    tool dispatch.
    - Regenerate app-server protocol schemas.
    - Add regression tests covering namespace serialization and routing.
    
    ## Testing
    
    - Ran affected protocol and app-server test suites.
    - Ran the full core test suite; two load-sensitive timing tests passed
    when rerun individually.
    - Ran Clippy and formatting checks.
    - Verified with a local end-to-end app-server replay that the namespace
    is preserved through the complete request/response flow.
  • [codex] allow CCA image generation and web search extensions (#29909)
    ## Summary
    
    - allow the standalone image-generation and web-search extensions for
    the actor-authorized provider shape used by CCA
    - preserve builtin `image_generation` and `web_search` for older models
    and existing flows
    - keep ordinary non-OpenAI providers excluded from both extensions
    - remove only the image extension local managed-AuthManager requirement
    that CCA cannot satisfy
    - share actor-authorization detection through `ModelProviderInfo`
    - keep Core tests focused on routing behavior and cover header-shape
    edge cases in `model-provider-info`
    - add a Responses Lite regression that verifies both
    `image_gen.imagegen` and `web.run`
    
    ## Why
    
    CCA uses a provider named `local` with `requires_openai_auth: false` and
    a non-empty `x-openai-actor-authorization` header. Core accepts that
    provider shape, but both extension provider-name gates rejected it;
    image generation additionally required a Codex-managed login.
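    
    The provider shape described above could be detected roughly as
    follows. This is a minimal sketch, not the code in
    `ModelProviderInfo`: the `ProviderShape` struct, its field names, the
    `is_actor_authorized` function, and the case-insensitive header
    comparison are all assumptions made for illustration. Only the header
    name and the `requires_openai_auth` flag come from the description.
    
    ```rust
    use std::collections::HashMap;
    
    /// Hypothetical, simplified stand-in for the provider configuration.
    struct ProviderShape {
        requires_openai_auth: bool,
        http_headers: HashMap<String, String>,
    }
    
    const ACTOR_AUTHORIZATION_HEADER: &str = "x-openai-actor-authorization";
    
    /// Returns true when the provider does not require OpenAI auth but
    /// carries a non-empty actor-authorization header. Comparing header
    /// names case-insensitively is an assumption of this sketch.
    fn is_actor_authorized(provider: &ProviderShape) -> bool {
        !provider.requires_openai_auth
            && provider.http_headers.iter().any(|(name, value)| {
                name.eq_ignore_ascii_case(ACTOR_AUTHORIZATION_HEADER)
                    && !value.trim().is_empty()
            })
    }
    ```
    
    Under these assumptions, a provider with a blank header value or no
    header at all is treated as an ordinary non-OpenAI provider.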
    
    The standalone paths must coexist with existing builtin tools. New
    Responses Lite models can receive `image_gen.imagegen` and `web.run`,
    while older models continue using builtin tools.
    
    ## Impact
    
    This enables both standalone extensions for CCA once installed
    downstream, without removing or changing builtin-tool compatibility for
    older models.
    
    ## Validation
    
    - `just test -p codex-core
    responses_lite_exposes_standalone_tools_for_actor_authorized_provider`
    - `just test -p codex-core
    responses_lite_uses_standalone_web_search_and_image_generation`
    - `just test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-model-provider-info`
    - `just fmt`
    - `git diff --check`
  • Let image generation extension hosts control output persistence (#29711)
    ## Why
    
    Some extension hosts need generated images returned without writing them
    to the local filesystem or giving the model a local path.
    
    ## What changed
    
    **tl;dr**: the image generation extension now handles all of its own
    operations, including persistence.
    
    - Let hosts provide an optional image save root when installing the
    extension.
    - Save images and return path hints only when a save root is configured.
    - Return image data without saving or adding a path hint when no save
    root is configured.
    - Preserve the extension-provided `saved_path` instead of persisting
    extension images again in core.
    - Leave built-in image generation unchanged.
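    
    The save-root rule in the list above can be sketched as below. The
    `ImageToolOutput` struct, the `finish_image` function, and the
    injected `write_file` callback are hypothetical names chosen for
    illustration; the extension's real types and signatures may differ.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Hypothetical result handed back to the model by the extension.
    #[derive(Debug, PartialEq)]
    struct ImageToolOutput {
        image_base64: String,
        /// Present only when the host configured a save root.
        saved_path: Option<PathBuf>,
    }
    
    /// Saves the image and records a path hint only when `save_root` is
    /// set; otherwise returns the image data alone. `write_file` is
    /// injected so the sketch does not depend on a real filesystem.
    fn finish_image(
        image_base64: String,
        file_name: &str,
        save_root: Option<&Path>,
        write_file: impl Fn(&Path, &str) -> std::io::Result<()>,
    ) -> std::io::Result<ImageToolOutput> {
        let saved_path = match save_root {
            Some(root) => {
                let path = root.join(file_name);
                write_file(&path, &image_base64)?;
                Some(path)
            }
            None => None,
        };
        Ok(ImageToolOutput { image_base64, saved_path })
    }
    ```
    
    With no save root, nothing is written and `saved_path` stays `None`,
    so there is no path hint to show the model.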
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-app-server
    standalone_image_generation_returns_saved_path_hint_to_model`
    - `just test -p codex-core
    extension_tool_uses_granted_turn_permissions_without_local_persistence`
    - `just test -p codex-core tools::handlers::extension_tools::tests`
    - Tested in the Codex CLI with `save_root` set to `CODEX_HOME` and to
    `None`.
    - Tested in the Codex app with both settings as well.
  • [codex] Use input items for Responses Lite tools (#27946)
    When using Responses Lite, we should use `additional_tools` and a
    developer item instead of the top-level tools array and instructions
    field. This keeps things 1-to-1.
    
    Forced namespacing for _all_ tools will land in a following PR after
    some coordination & fixes in Responses API (around collisions & return
    items).
    
    The goal is to eventually expand the scope of this to _all_ requests
    from codex, but that will require larger coordination across providers &
    slower rollout.
  • core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
    ## Description
    This PR cuts Codex over from generic `ResponseItem.metadata` (introduced
    here: https://github.com/openai/codex/pull/28355) to
    `ResponseItem.internal_chat_message_metadata_passthrough`, which is the
    blessed path and has strongly-typed keys.
    
    For now we have to drop this MAv2 usage of `metadata`:
    https://github.com/openai/codex/pull/28561 until we figure out where
    that should live.
  • [codex] Add optional IDs to response items (#28812)
    ## Why
    
    `ResponseItem` variants do not have a consistent internal ID shape: some
    variants carry required IDs, some carry optional IDs, and some cannot
    represent an ID at all. The existing fields also use inconsistent serde,
    TypeScript, and JSON-schema annotations. A single enum-level access path
    is needed before history recording can assign and retain IDs.
    
    This PR establishes that internal model only. It intentionally does not
    generate or serialize IDs; allocation and wire persistence are isolated
    in the stacked follow-up.
    
    ## What changed
    
    - Give every concrete `ResponseItem` variant an `Option<String>` ID
    field.
    - Apply the same internal-only annotations to every ID field:
    `#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and
    `#[schemars(skip)]`.
    - Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared
    accessors.
    - Preserve IDs when history items are rewritten for truncation.
    - Adapt consumers that previously assumed reasoning and image-generation
    IDs were required.
    - Regenerate app-server schemas so the hidden fields are represented
    consistently.
    
    The serde catch-all `ResponseItem::Other` remains ID-less because it
    must remain a unit variant.
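    
    A trimmed-down sketch of the enum-level access path is shown below.
    The variants, their fields, and the exact accessor signatures are
    assumptions; the description only names `ResponseItem::id()` and
    `ResponseItem::set_id()`. The serde, ts, and schemars attributes are
    omitted so the sketch builds with the standard library alone.
    
    ```rust
    /// Hypothetical, reduced version of the enum. In the real type every
    /// ID field also carries the internal-only attributes listed above.
    #[derive(Debug, Clone, PartialEq)]
    enum ResponseItem {
        Message { id: Option<String>, text: String },
        Reasoning { id: Option<String>, summary: String },
        /// Catch-all unit variant: it cannot hold an ID.
        Other,
    }
    
    impl ResponseItem {
        /// Shared read access to the optional ID, regardless of variant.
        fn id(&self) -> Option<&str> {
            match self {
                ResponseItem::Message { id, .. }
                | ResponseItem::Reasoning { id, .. } => id.as_deref(),
                ResponseItem::Other => None,
            }
        }
    
        /// Stores the ID where the variant can hold one; a no-op for
        /// `Other` (an assumption of this sketch).
        fn set_id(&mut self, new_id: Option<String>) {
            match self {
                ResponseItem::Message { id, .. }
                | ResponseItem::Reasoning { id, .. } => *id = new_id,
                ResponseItem::Other => {}
            }
        }
    }
    ```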
    
    ## Test plan
    
    - `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace
    -p codex-image-generation-extension`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api -p codex-rollout-trace -p
    codex-image-generation-extension`
    - `just test -p codex-core event_mapping`
  • Use PathUri in filesystem permission paths for exec-server (#28165)
    ## Why
    
    Progress towards letting app-server and exec-server run on different
    platforms, specifically for sandbox configuration.
    
    ## What
    
    - Make the filesystem path containment hierarchy generic, defaulting to
    `AbsolutePathBuf` for now.
    - Have clients specify `AbsolutePathBuf` or `PathUri` directly where
    needed.
    - Use `PathUri` throughout exec-server filesystem protocol and trait
    boundaries.
    - Implement `From` for conversion to path URIs and `TryFrom` for
    fallible conversion to absolute paths through the generic type
    hierarchy.
  • feat(core): add metadata field to ResponseItem (#28355)
    ## Description
    
    This PR adds an optional `metadata` field to `ResponseItem` for
    Responses API calls. It is mechanical plumbing only; no values are
    populated or sent yet. Even so, adding a new field to `ResponseItem`
    already has quite a large blast radius.
    
    This change is backwards compatible because `metadata` is optional and
    omitted when absent, so existing response items and rollout history
    without it still deserialize and requests that do not set it keep the
    same wire shape. For provider compatibility, we strip out `metadata`
    before non-OpenAI Responses requests so Azure and AWS Bedrock never see
    this field.
    
    My followup PR here will actually make use of it to start storing and
    passing along `turn_id`: https://github.com/openai/codex/pull/28360
    
    ## What changed
    
    - Added `ResponseItemMetadata` with optional `turn_id`, plus optional
    `metadata` on Responses API item variants and inter-agent communication.
    - Preserved item metadata through response-item rewrites such as
    truncation, missing tool-output synthesis, compaction history
    rebuilding, visible-history conversion, rollout/resume, and generated
    app-server schemas/types.
    - Strip item metadata from non-OpenAI Responses requests while
    preserving it for OpenAI-shaped requests.
    - Updated the mechanical fixture/test construction churn required by the
    new optional field.
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • [codex] make PathUri::from_abs_path infallible (#27976)
    ## Why
    
    `PathUri::from_abs_path` can fail for absolute paths that do not have a
    normal `file:` URI representation, forcing filesystem call sites to
    handle a conversion error even though the original path can be preserved
    losslessly.
    
    ## What
    
    Make `from_abs_path` infallible and migrate its callers. Unrepresentable
    paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or
    Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The
    leading encoded null reserves a namespace that cannot collide with a
    real Unix or Windows path, and fallback URIs remain opaque to lexical
    path operations.
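    
    The collision argument can be illustrated with a small sketch. It
    does not reproduce the real `PathUri` code: `normal_file_uri` is a
    hypothetical percent-encoder for Unix paths, and the base64 fallback
    payload is not implemented here. The point is only that a literal
    `%` in a real path is itself escaped, so a real path never encodes
    into the reserved `%00` namespace.
    
    ```rust
    /// Prefix of the fallback form described above.
    const FALLBACK_PREFIX: &str = "file:///%00/bad/path/";
    
    /// Hypothetical helper: percent-encodes an absolute Unix path into a
    /// `file:` URI. Returns `None` for inputs this sketch treats as
    /// unrepresentable (relative paths, NUL bytes, non-UTF-8 bytes).
    fn normal_file_uri(path_bytes: &[u8]) -> Option<String> {
        if !path_bytes.starts_with(b"/")
            || path_bytes.contains(&0)
            || std::str::from_utf8(path_bytes).is_err()
        {
            return None;
        }
        let mut uri = String::from("file://");
        for &byte in path_bytes {
            match byte {
                // Unreserved characters and the path separator stay literal.
                b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9'
                | b'-' | b'.' | b'_' | b'~' | b'/' => uri.push(byte as char),
                // Everything else, including `%`, is percent-encoded.
                _ => uri.push_str(&format!("%{byte:02X}")),
            }
        }
        Some(uri)
    }
    
    /// True when the URI lives in the reserved fallback namespace.
    fn is_fallback_uri(uri: &str) -> bool {
        uri.starts_with(FALLBACK_PREFIX)
    }
    ```
    
    In this sketch a path spelled literally as `/%00/bad/path/x` encodes
    to `file:///%2500/bad/path/x`, which `is_fallback_uri` rejects.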
    
    ## Validation
    
    Added path-URI coverage for Unix null and non-UTF-8 paths, Windows
    device/verbatim and non-Unicode paths, serialization, malformed
    fallbacks, opaque lexical operations, invalid native payloads, and
    literal `/bad/path` collision resistance.
  • Handle standalone image generation failures as terminal items (#27920)
    ## Why
    
    Standalone image generation emitted a started item but no terminal item
    when the backend failed. Clients could leave the operation unresolved or
    render it as successful.
    
    ## What changed
    
    - Emit a terminal image-generation item with `status: "failed"` when
    generation or editing fails.
    - Skip image persistence for failed terminal items.
    - Render failed image generation distinctly in TUI history.
    - Preserve the status when handling live and replayed terminal items.
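    
    A minimal sketch of the lifecycle rule above, using hypothetical
    types (`ImageGenerationItem`, `ImageGenerationStatus`,
    `terminal_item`, and `should_persist` are illustrative names, and the
    `Completed` variant is assumed; only `status: "failed"` comes from
    the description):
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum ImageGenerationStatus {
        Completed,
        Failed,
    }
    
    #[derive(Debug, PartialEq)]
    struct ImageGenerationItem {
        id: String,
        status: ImageGenerationStatus,
        result_base64: Option<String>,
    }
    
    /// Always produces a terminal item, so a started item is never left
    /// unresolved when the backend fails.
    fn terminal_item(id: &str, outcome: Result<String, String>) -> ImageGenerationItem {
        match outcome {
            Ok(image) => ImageGenerationItem {
                id: id.to_string(),
                status: ImageGenerationStatus::Completed,
                result_base64: Some(image),
            },
            Err(_) => ImageGenerationItem {
                id: id.to_string(),
                status: ImageGenerationStatus::Failed,
                result_base64: None,
            },
        }
    }
    
    /// Failed terminal items carry no image, so persistence is skipped.
    fn should_persist(item: &ImageGenerationItem) -> bool {
        item.status == ImageGenerationStatus::Completed && item.result_base64.is_some()
    }
    ```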
    
    ## TUI appearance (app-side change needed)
    
    <img width="867" height="89" alt="image"
    src="https://github.com/user-attachments/assets/9e32342f-a982-411e-8498-456639fc468a"
    />
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - App-server image-generation tests
    - Core stream-event tests
    - TUI image-generation lifecycle and snapshot tests
    - Scoped Clippy and formatting
  • Fix image extension PathUri conversion (#27711)
    ## Why
    
    `main` stopped compiling when #27498 passed an `AbsolutePathBuf` to the
    `ExecutorFileSystem` API migrated to `PathUri` by #27653.
    
    ## What
    
    Convert referenced image paths to `PathUri` before filesystem reads,
    declare the internal path-URI dependency, and refresh `Cargo.lock`.
  • Route image extension reads through turn environments v2 (#27498)
    ## Why
    
    Image generation used `std::fs::read` for referenced image paths, which
    did not support environment-backed filesystems or their sandbox context.
    
    ## What changed
    
    - Expose optional turn environments to extension tool calls.
    - Include each environment’s ID, working directory, filesystem, and
    sandbox context.
    - Read referenced images through the selected environment filesystem.
    - Keep sandbox usage at the extension call site so extensions can choose
    the appropriate access mode.
    - Consolidate image request construction into one async function.
    - Add coverage for successful environment reads and read failures.
    
    ## Validation
    
    - `cargo check -p codex-image-generation-extension --tests`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    `just test -p codex-image-generation-extension` could not complete
    because the build exhausted available disk space.
  • [codex] Remove async_trait from ToolExecutor (#27304)
    ## Why
    
    We're now [discouraging use of
    `async_trait`](https://github.com/openai/codex/pull/20242).
    
    Removing use of `async_trait` from `ToolExecutor` cuts the `codex_core`
    debug test build time by ~78% (from 227.5s to 50.3s) on my machine.
    
    Stacked on #27299, this PR applies the trait change after the handler
    bodies have been outlined.
    
    ## What
    
    Changed `ToolExecutor::handle` to return an explicit boxed
    `ToolExecutorFuture` instead of using `async_trait`.
    
    Updated `ToolExecutor` implementors to return `Box::pin(...)`,
    reexported the future alias through `codex-tools` and
    `codex-extension-api`, and removed the direct `async-trait` dependency
    from `codex-tools`.
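    
    The pattern can be sketched with the standard library alone. The
    alias shape, the `handle` signature, the `EchoTool` type, and the
    `block_on` driver are assumptions made for illustration; the real
    `ToolExecutorFuture` and `ToolExecutor::handle` may take different
    arguments and return a different output type.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Assumed shape of the alias: a boxed, `Send` future.
    type ToolExecutorFuture<'a> =
        Pin<Box<dyn Future<Output = Result<String, String>> + Send + 'a>>;
    
    /// The trait stays usable behind `dyn` because `handle` returns a
    /// concrete boxed type instead of being an `async fn`.
    trait ToolExecutor: Send + Sync {
        fn handle<'a>(&'a self, arguments: &'a str) -> ToolExecutorFuture<'a>;
    }
    
    struct EchoTool;
    
    impl EchoTool {
        /// Outlined handler body, kept as an inherent async method.
        async fn handle_call(&self, arguments: &str) -> Result<String, String> {
            Ok(format!("echo: {arguments}"))
        }
    }
    
    impl ToolExecutor for EchoTool {
        fn handle<'a>(&'a self, arguments: &'a str) -> ToolExecutorFuture<'a> {
            Box::pin(self.handle_call(arguments))
        }
    }
    
    /// Waker that does nothing; enough to drive futures that never park.
    struct NoopWake;
    
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Tiny polling driver so the sketch needs no async runtime.
    fn block_on<T>(mut future: Pin<Box<dyn Future<Output = T> + Send + '_>>) -> T {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        loop {
            if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
                return value;
            }
        }
    }
    ```
    
    Keeping the body in `handle_call` means the trait method is a
    one-line delegation, which is what keeps the diff small when the
    macro is removed.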
  • [codex] Outline ToolExecutor handler bodies (#27299)
    ## Why
    
    We're now [discouraging use of
    `async_trait`](https://github.com/openai/codex/pull/20242).
    
    Removing use of `async_trait` from `ToolExecutor` cuts the `codex_core`
    debug test build time by ~78% (from 227.5s to 50.3s) on my machine.
    
    For ease of reviewing, this is a prefactor to extract trait method
    implementations to inherent methods. This will prevent changing
    indentation from creating a huge diff.
    
    ## What
    
    Outlined existing `ToolExecutor::handle` bodies into inherent async
    `handle_call` methods across core and extension tool handlers.
    
    The trait methods still use `async_trait` and now delegate to
    `self.handle_call(...).await`; handler behavior is unchanged.
  • Remove async-trait from extension contributors (#27383)
    ## Why
    
    Extension contributors are registered behind `dyn Trait` objects, so
    native `async fn`/RPITIT methods would make these traits
    non-object-safe. Spell out the boxed, `Send` future contract directly so
    `extension-api` no longer needs `async-trait` while retaining the
    existing runtime model.
    
    ## What changed
    
    - add a shared `ExtensionFuture` alias and use it for asynchronous
    contributor methods
    - migrate production and test implementations to return `Box::pin(async
    move { ... })`
    - remove `async-trait` dependencies where they are no longer used,
    keeping it dev-only where unrelated test executors still require it
    
    ## Behavior
    
    No behavior change is intended. Contributor futures remain boxed,
    `Send`, dynamically dispatched, and lazily executed; cancellation and
    callback ordering stay unchanged.
    
    ## Testing
    
    - `just test -p codex-extension-api` (11 passed)
    - affected extension crates (64 passed)
    - targeted `codex-core` contributor tests (14 passed)
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    A broad local `codex-core` run compiled successfully but encountered
    unrelated sandbox and missing test-binary fixture failures; CI will run
    the full checks.
  • Route image edits through referenced file paths (#26486)
    ## Why
    
    Image edits should use the exact images selected by the model instead of
    inferring edit inputs from conversation history.
    
    ## What changed
    
    - Replaced the image tool's `action` argument with optional
    `referenced_image_paths`.
    - Treats omitted or empty references as generation and populated
    references as editing.
    - Reads referenced absolute image paths and packages them as image data
    URLs for the edit request.
    - Removed the previous history-selection and image-count heuristics.
    - Updated direct and code-mode tool instructions and calls.
    - Added an app-server integration test covering an attached image routed
    to the image edit endpoint.
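    
    The routing rule above (omitted or empty references mean generation,
    populated references mean editing) can be sketched as follows. The
    `ImageRequest` enum and `classify` function are hypothetical names;
    only `referenced_image_paths` comes from the description.
    
    ```rust
    use std::path::PathBuf;
    
    /// Hypothetical request kinds derived from the tool arguments.
    #[derive(Debug, PartialEq)]
    enum ImageRequest {
        Generate,
        Edit { referenced_image_paths: Vec<PathBuf> },
    }
    
    /// Omitted or empty references select generation; any populated list
    /// selects editing with exactly those images.
    fn classify(referenced_image_paths: Option<Vec<PathBuf>>) -> ImageRequest {
        match referenced_image_paths {
            Some(paths) if !paths.is_empty() => ImageRequest::Edit {
                referenced_image_paths: paths,
            },
            _ => ImageRequest::Generate,
        }
    }
    ```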
    
    ## Validation
    - Tested end-to-end on local `just codex` with copy-pasted images,
    attached images, etc.
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-app-server
    standalone_image_edit_uses_attached_model_visible_image`
    - `just fix -p codex-image-generation-extension`
    - `just bazel-lock-check`
  • [codex] Use standalone tools for Responses Lite (#26490)
    ## Summary
    
    Responses Lite does not execute hosted Responses tools, so models using
    it must route web search and image generation through Codex-owned
    executors and standalone Responses API endpoints.
    
    This PR is stacked on #26487.
    
    ## Validation
    
    - `cargo test -p codex-core responses_lite_ --lib`
    - `cargo test -p codex-core
    standalone_executors_remain_hidden_without_flags_or_responses_lite
    --lib`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates --lib`
    - `cargo test -p codex-web-search-extension -p
    codex-image-generation-extension`
    - `cargo test -p codex-app-server --test all standalone_`
    - `cargo fmt --all -- --check`
  • Encrypt multi-agent v2 message payloads (#26210)
    ## Why
    
    Multi-agent v2 currently routes agent instructions through normal tool
    arguments and inter-agent context. That means the parent model can emit
    plaintext task text, Codex can persist it in history/rollouts, and the
    recipient can receive it as ordinary assistant-message JSON.
    
    This changes the v2 path so agent instructions stay encrypted between
    model calls: Responses encrypts the `message` argument returned by the
    model, Codex forwards only that ciphertext, and Responses decrypts it
    internally for the recipient model.
    
    ## What changed
    
    - Mark the v2 `message` parameter as encrypted for `spawn_agent`,
    `send_message`, and `followup_task`.
    - Treat multi-agent v2 tool `message` values as ciphertext
    unconditionally.
    - Store v2 inter-agent task text in
    `InterAgentCommunication.encrypted_content` with empty plaintext
    `content`.
    - Convert encrypted inter-agent communications into the Responses
    `agent_message` input item before sending the child request.
    - Preserve `agent_message` items across history, rollout, compaction,
    telemetry, and app-server schema paths.
    - Leave multi-agent v1 unchanged.
    
    ## Message shape
    
    The model still calls the v2 tools with a `message` argument, but that
    value is now ciphertext:
    
    ```json
    {
      "name": "spawn_agent",
      "arguments": {
        "task_name": "worker",
        "message": "<ciphertext>"
      }
    }
    ```
    
    Codex stores the task as encrypted inter-agent communication:
    
    ```json
    {
      "author": "/root",
      "recipient": "/root/worker",
      "content": "",
      "encrypted_content": "<ciphertext>",
      "trigger_turn": true
    }
    ```
    
    When Codex builds the recipient request, it forwards the ciphertext
    using the new Responses input item:
    
    ```json
    {
      "type": "agent_message",
      "author": "/root",
      "recipient": "/root/worker",
      "content": [
        {
          "type": "encrypted_content",
          "encrypted_content": "<ciphertext>"
        }
      ]
    }
    ```
    
    Responses decrypts that item internally for the recipient model.
    
    ## Context impact
    
    - Parent context no longer carries plaintext v2 agent task instructions
    from these tool arguments.
    - Codex rollout/history stores ciphertext for v2 agent instructions.
    - Recipient requests receive an `agent_message` item instead of
    assistant commentary JSON for encrypted task delivery.
    - Completion/status notifications remain plaintext because they are
    Codex-generated status messages, not encrypted model tool arguments.
    
    ## Validation
    
    - `just test -p codex-tools`
    - `just test -p codex-protocol`
    - `just test -p codex-rollout`
    - `just test -p codex-rollout-trace`
    - `just test -p codex-otel`
    - `just write-app-server-schema`
  • Add saved image path hint to standalone image generation (#25947)
    ## Why
    
    Standalone image generation returns image bytes to the model, but the
    model also needs the host artifact path to reference the generated file
    in follow-up work.
    
    ## What changed
    
    - Append the default saved-image path hint alongside the generated image
    tool output.
    - Reuse the existing core image-generation hint text.
    - Pass the thread ID and Codex home directory needed to compute the
    artifact path.
    - Add app-server and extension coverage for the model-visible hint.
    
    ## Validation
    
    - `just fmt`
    - `just bazel-lock-check`
    - `just test -p codex-app-server
    standalone_image_generation_returns_saved_path_hint_to_model`
  • Expose standalone image generation in code mode (#25923)
    ## Why
    
    Standalone image generation remained top-level-only in code-mode
    sessions.
    
    ## What changed
    
    - Change imagegen exposure from `DirectModelOnly` to `Direct`.
    - Keep direct-mode access while enabling nested code-mode access.
    - Add a focused regression test for the exposure contract.
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
  • Route standalone image generation through host finalization md (#25176)
    ## Why
    
    Standalone image-generation extensions emitted turn items through the
    low-level event path, bypassing host-owned finalization such as image
    persistence and contributor processing. At the same time, the
    generated-image save-path hint must remain visible to the model through
    the extension tool's `FunctionCallOutput`, rather than the legacy
    built-in developer-message path.
    
    ## What changed
    
    - Extended `ExtensionTurnItem` to support image-generation items while
    keeping the extension-facing emitter API limited to `emit_started` and
    `emit_completed`.
    - Routed extension completion through core `finalize_turn_item`, so
    standalone image-generation items receive host-owned processing and
    persisted `saved_path` values before publication.
    - Kept legacy built-in image generation on its existing
    developer-message hint path, while standalone image generation returns
    its deterministic saved-path hint in `FunctionCallOutput`.
    - Shared the image artifact path and output-hint formatting used by core
    and the image-generation extension.
    - Passed thread identity through extension tool calls so standalone
    image generation can construct the same intended artifact path as core.
    - Added an app-server integration test covering real standalone image
    generation, saved artifact publication, model-visible output hint
    wiring, and absence of the legacy developer-message hint.
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-goal-extension`
    - `just test -p codex-memories-extension`
    - Targeted `codex-core` tests for image save history, extension
    completion finalization, and contributor execution
    - `just test -p codex-app-server
    standalone_image_generation_returns_saved_path_hint_to_model`
    - `just fix -p codex-core`
    - `just fix -p codex-image-generation-extension`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
  • Route extension image generation through the native image completion pipeline (#24972)
    ## Why
    
    The standalone `image_gen.imagegen` extension should behave like native
    image generation for artifact persistence and UI completion, while
    returning its save-location guidance as part of the tool result instead
    of injecting a developer message.
    
    ## What Changed
    
    - Added an image-generation completion hook for extension tools so core
    can persist generated images and emit the existing `ImageGeneration`
    lifecycle events.
    - Reused core image artifact persistence for extension output and
    removed extension-local save-path/file-writing logic.
    - Split shared image persistence from built-in finalization so native
    image generation keeps its existing developer-message instruction
    behavior.
    - Returned the generated image save-location instruction through the
    extension `FunctionCallOutput`, alongside the generated image input for
    model follow-up.
    - Preserved the existing image-generation event shape for current UI and
    replay compatibility.
    - Avoided cloning the full generated-image base64 payload when emitting
    the in-progress image item.
    - Removed dependencies no longer needed after moving persistence out of
    the extension crate.
    
    ## Fast Follow
    - Adjust the existing Extension API and add a general `TurnItem`
    finalization path so the code can be reused.
    
    ## Validation
    
    - Ran `just fmt`.
    - Ran `just bazel-lock-update`.
    - Ran `just bazel-lock-check`.
    - Ran `just test -p codex-tools -p codex-extension-api -p
    codex-image-generation-extension`.
    - Ran `just test -p codex-core
    image_generation_publication_is_finalized_by_core`.
    - Ran `just test -p codex-core
    handle_output_item_done_records_image_save_history_message`.
    - Ran `just fix -p codex-tools -p codex-extension-api -p codex-core -p
    codex-image-generation-extension`.
  • Add feature-gated standalone image generation extension (#24723)
    ## Why
    
    Add a standalone image generation path that can be exercised
    independently of hosted Responses image generation, while retaining the
    hosted tool as fallback unless the extension is actually available to
    the model.
    
    ## What changed
    
    - Added the `codex-image-generation-extension` crate with standalone
    generate/edit execution, prior-image selection for edits, model-visible
    image output, and local generated-image persistence.
    - Installed the extension in app-server behind the disabled-by-default
    `imagegenext` feature and backend eligibility checks.
    - Updated core tool planning so eligible `image_gen.imagegen` exposure
    replaces hosted `image_generation`, while unavailable configurations
    retain hosted fallback.
    - Added coverage for extension behavior, edit history reuse, feature
    gating, auth eligibility, and hosted-tool replacement.
    - The extension is installed through app-server only in this PR; other
    execution paths retain hosted image generation because hosted
    replacement occurs only when the standalone executor is actually
    registered and model-visible.
    - The initial extension contract intentionally fixes the image model to
    `gpt-image-2` and uses automatic image parameters.
    - Native generated-image history/card parity and rollout persistence
    cleanup are intentionally deferred follow-up work.
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-features`
    - `just test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
    - `just test -p codex-app-server`
    - `just fix -p codex-image-generation-extension -p codex-features -p
    codex-core -p codex-app-server`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>