22 Commits

  • [codex] allow CCA image generation and web search extensions (#29909)
    ## Summary
    
    - allow the standalone image-generation and web-search extensions for
    the actor-authorized provider shape used by CCA
    - preserve builtin `image_generation` and `web_search` for older models
    and existing flows
    - keep ordinary non-OpenAI providers excluded from both extensions
    - remove only the image extension's local managed-AuthManager
    requirement, which CCA cannot satisfy
    - share actor-authorization detection through `ModelProviderInfo`
    - keep Core tests focused on routing behavior and cover header-shape
    edge cases in `model-provider-info`
    - add a Responses Lite regression that verifies both
    `image_gen.imagegen` and `web.run`
    
    ## Why
    
    CCA uses a provider named `local` with `requires_openai_auth: false` and
    a non-empty `x-openai-actor-authorization` header. Core accepts that
    provider shape, but both extension provider-name gates rejected it;
    image generation additionally required a Codex-managed login.
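    
    As a rough illustration of the provider shape described above, the
    following sketch checks for it. `ProviderInfo`, its field layout, and
    `is_actor_authorized` are hypothetical stand-ins rather than the real
    `ModelProviderInfo` API, and the case-insensitive header match is an
    assumption.
    
    ```rust
    use std::collections::HashMap;
    
    /// Hypothetical, trimmed-down stand-in for `ModelProviderInfo`.
    pub struct ProviderInfo {
        pub name: String,
        pub requires_openai_auth: bool,
        pub http_headers: HashMap<String, String>,
    }
    
    const ACTOR_AUTHORIZATION_HEADER: &str = "x-openai-actor-authorization";
    
    impl ProviderInfo {
        /// True when the provider does not require OpenAI auth and carries a
        /// non-empty actor-authorization header.
        /// Assumption: header names are compared case-insensitively and a
        /// whitespace-only value counts as empty.
        pub fn is_actor_authorized(&self) -> bool {
            !self.requires_openai_auth
                && self.http_headers.iter().any(|(name, value)| {
                    name.eq_ignore_ascii_case(ACTOR_AUTHORIZATION_HEADER)
                        && !value.trim().is_empty()
                })
        }
    }
    ```
    
    Under these assumptions, a provider named `local` with the header set
    passes the check, while the same provider with a missing or blank
    header does not.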
    
    The standalone paths must coexist with existing builtin tools. New
    Responses Lite models can receive `image_gen.imagegen` and `web.run`,
    while older models continue using builtin tools.
    
    ## Impact
    
    This enables both standalone extensions for CCA once installed
    downstream, without removing or changing builtin-tool compatibility for
    older models.
    
    ## Validation
    
    - `just test -p codex-core
    responses_lite_exposes_standalone_tools_for_actor_authorized_provider`
    - `just test -p codex-core
    responses_lite_uses_standalone_web_search_and_image_generation`
    - `just test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-model-provider-info`
    - `just fmt`
    - `git diff --check`
  • core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
    ## Description
    This PR cuts Codex over from generic `ResponseItem.metadata` (introduced
    here: https://github.com/openai/codex/pull/28355) to
    `ResponseItem.internal_chat_message_metadata_passthrough`, which is the
    blessed path and has strongly-typed keys.
    
    For now we have to drop this MAv2 usage of `metadata`:
    https://github.com/openai/codex/pull/28561 until we figure out where
    that should live.
  • Add indexed web search mode (#28489)
    ## Summary
    
    - Add `web_search = "indexed"` alongside `disabled`, `cached`, and
    `live`.
    - Use that same resolved mode for both hosted and standalone web search.
    - For hosted search, send `index_gated_web_access: true` with external
    web access enabled only when `indexed` is selected.
    - For standalone search, preserve the existing boolean wire values for
    existing modes (`cached` maps to `false` and `live` to `true`) and send
    `"indexed"` only for `indexed`; `disabled` keeps the tool unavailable.
    - Carry the mode through managed configuration requirements and
    generated schemas.
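    
    The mode-to-wire mapping listed above can be sketched as follows. The
    type and method names (`WebSearchMode`, `StandaloneWireValue`,
    `standalone_wire_value`) are illustrative assumptions, not the names
    used in the PR.
    
    ```rust
    /// The four values accepted by the `web_search` setting.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum WebSearchMode {
        Disabled,
        Cached,
        Live,
        Indexed,
    }
    
    /// Hypothetical representation of the standalone wire value: a boolean
    /// for the pre-existing modes, or the string "indexed".
    #[derive(Debug, PartialEq, Eq)]
    pub enum StandaloneWireValue {
        Bool(bool),
        Indexed,
    }
    
    impl WebSearchMode {
        /// Parses the configuration string; unknown values yield `None`.
        pub fn parse(value: &str) -> Option<Self> {
            match value {
                "disabled" => Some(WebSearchMode::Disabled),
                "cached" => Some(WebSearchMode::Cached),
                "live" => Some(WebSearchMode::Live),
                "indexed" => Some(WebSearchMode::Indexed),
                _ => None,
            }
        }
    
        /// `None` means the standalone tool is unavailable.
        pub fn standalone_wire_value(self) -> Option<StandaloneWireValue> {
            match self {
                WebSearchMode::Disabled => None,
                WebSearchMode::Cached => Some(StandaloneWireValue::Bool(false)),
                WebSearchMode::Live => Some(StandaloneWireValue::Bool(true)),
                WebSearchMode::Indexed => Some(StandaloneWireValue::Indexed),
            }
        }
    
        /// Hosted search sets `index_gated_web_access` only for `indexed`.
        pub fn index_gated_web_access(self) -> bool {
            self == WebSearchMode::Indexed
        }
    }
    ```
    
    Deriving both the hosted flag and the standalone value from one enum
    is one way to keep the two executors from drifting apart.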
    
    ## Why
    
    Indexed search provides a middle ground between cached-only search and
    unrestricted live page fetching. Search queries can remain live while
    direct page fetches are limited to URLs admitted by the server.
    
    The existing `web_search` setting remains the single source of truth, so
    hosted and standalone executors cannot drift into different access
    modes. Without an explicit `indexed` selection, the existing
    model-visible tool and request shapes are unchanged.
    
    ```toml
    web_search = "indexed"
    
    [features]
    standalone_web_search = true
    ```
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-api` (`126 passed`)
    - `just test -p codex-web-search-extension` (`7 passed`)
    - `just test -p codex-core
    code_mode_can_call_indexed_standalone_web_search` (`1 passed`)
    - Focused configuration, hosted request, standalone request, and
    managed-requirement coverage is included in the PR; remaining suites run
    in CI.
    
    The full workspace test suite was not run locally.
  • [codex] Assign response item IDs when recording history (#28814)
    ## Why
    
    Client-created response items enter history without IDs, so their
    identity is lost across rollout persistence and resume. IDs should be
    assigned once at the history-recording boundary, while IDs returned by
    the server must remain unchanged.
    
    The Responses API validates item IDs using type-specific prefixes.
    Locally generated IDs therefore use the matching prefix plus a
    hyphenated UUIDv7, keeping them valid while distinguishable from
    server-generated IDs. Because this changes persisted history and
    provider request shapes, the behavior is opt-in behind the
    under-development `item_ids` feature. Compaction triggers remain request
    controls whose API shape does not accept an ID.
    
    ## What changed
    
    - Register the disabled-by-default `item_ids` feature and expose it in
    `config.schema.json`.
    - Make supported optional `ResponseItem` IDs serializable and expose
    them in the generated app-server schemas.
    - When `item_ids` is enabled, assign an ID during conversation-history
    preparation if an item has no ID.
    - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API
    item conventions.
    - Preserve existing server IDs without rewriting them.
    - Persist assigned IDs in rollouts and include them in subsequent
    Responses requests.
    - Remove the unsupported ID field from `CompactionTrigger` and document
    why it has no ID.
    - Add integration coverage for enabled ID persistence, preservation of
    server IDs, and omission of generated IDs while the feature is disabled.
    
    `prepare_conversation_items_for_history` is the single response-item ID
    allocation boundary.
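    
    A minimal sketch of the assign-once rule described above, assuming a
    hypothetical `Item` type and an injected ID generator. The real
    generator produces a type-prefixed, hyphenated UUIDv7; the sketch
    leaves that format to the caller instead of guessing at it.
    
    ```rust
    /// Hypothetical stand-in for a response item with an optional ID.
    #[derive(Debug, Clone, PartialEq)]
    pub struct Item {
        pub id: Option<String>,
        pub text: String,
    }
    
    /// Assigns an ID to every item that lacks one and leaves existing
    /// (for example, server-provided) IDs untouched. Does nothing when the
    /// feature is disabled. `next_id` is an assumption that stands in for
    /// the real ID generator.
    pub fn assign_missing_ids(
        items: &mut [Item],
        item_ids_enabled: bool,
        mut next_id: impl FnMut() -> String,
    ) {
        if !item_ids_enabled {
            return;
        }
        for item in items.iter_mut() {
            if item.id.is_none() {
                item.id = Some(next_id());
            }
        }
    }
    ```
    
    Because items that already carry an ID are skipped, running the
    function again over persisted history does not rewrite anything.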
    
    ## Test plan
    
    - `just test -p codex-features`
    - `just test -p codex-core
    response_item_ids_persist_across_resume_and_preserve_server_ids`
    - `just test -p codex-core
    non_openai_responses_requests_omit_item_turn_metadata`
    - `just test -p codex-core
    resize_all_images_prepares_failures_before_history_insertion`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
  • feat: render typed envelopes for multi-agent v2 messages (#28368)
    ## Why
    
    Multi-agent v2 messages need a consistent, model-visible envelope that
    identifies what kind of interaction occurred, who sent it, and which
    agent it targets. Previously, encrypted deliveries exposed only
    `encrypted_content`, while child completion used the legacy
    `<subagent_notification>` shape. That meant the client could not
    consistently present `NEW_TASK`, `MESSAGE`, and `FINAL_ANSWER` using the
    same format.
    
    This change adds the routing envelope as plaintext while keeping task
    and message payloads encrypted. No new Responses API field is required:
    an encrypted delivery is represented as an `input_text` header
    immediately followed by its existing `encrypted_content` item.
    
    Every envelope now follows this shape:
    
    ```text
    Message Type: <NEW_TASK | MESSAGE | FINAL_ANSWER>
    Task name: <recipient agent path>
    Sender: <author agent path>
    Payload:
    <message payload>
    ```
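    
    The envelope shape above could be rendered along these lines. This is
    only a sketch: `MessageType`, `render_header`, and `render_plaintext`
    are hypothetical names, not the functions changed in this PR.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum MessageType {
        NewTask,
        Message,
        FinalAnswer,
    }
    
    impl MessageType {
        /// The label shown on the `Message Type:` line.
        pub fn label(self) -> &'static str {
            match self {
                MessageType::NewTask => "NEW_TASK",
                MessageType::Message => "MESSAGE",
                MessageType::FinalAnswer => "FINAL_ANSWER",
            }
        }
    }
    
    /// Plaintext header that precedes an encrypted payload item.
    pub fn render_header(kind: MessageType, recipient: &str, sender: &str) -> String {
        format!(
            "Message Type: {}\nTask name: {}\nSender: {}\nPayload:\n",
            kind.label(),
            recipient,
            sender
        )
    }
    
    /// Complete plaintext envelope: the header followed by the payload.
    pub fn render_plaintext(
        kind: MessageType,
        recipient: &str,
        sender: &str,
        payload: &str,
    ) -> String {
        let mut envelope = render_header(kind, recipient, sender);
        envelope.push_str(payload);
        envelope
    }
    ```
    
    Splitting the header from the payload lets the same header text sit
    in front of either an encrypted item or a plaintext payload.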
    
    ## Message types
    
    ### `NEW_TASK`
    
    `NEW_TASK` is used when the recipient should begin a new turn, including
    an initial `spawn_agent` task and a later `followup_task`.
    
    For a root agent spawning `/root/worker`, the request contains a
    plaintext envelope followed by the encrypted task:
    
    ```json
    {
      "type": "agent_message",
      "author": "/root",
      "recipient": "/root/worker",
      "content": [
        {
          "type": "input_text",
          "text": "Message Type: NEW_TASK\nTask name: /root/worker\nSender: /root\nPayload:\n"
        },
        {
          "type": "encrypted_content",
          "encrypted_content": "<encrypted task payload>"
        }
      ]
    }
    ```
    
    Conceptually, the model receives:
    
    ```text
    Message Type: NEW_TASK
    Task name: /root/worker
    Sender: /root
    Payload:
    Review the authentication changes and report any regressions.
    ```
    
    ### `MESSAGE`
    
    `MESSAGE` is used for a queued `send_message` delivery. It communicates
    with an existing agent without starting a new turn.
    
    For `/root/worker` reporting progress to the root agent, the request
    contains:
    
    ```json
    {
      "type": "agent_message",
      "author": "/root/worker",
      "recipient": "/root",
      "content": [
        {
          "type": "input_text",
          "text": "Message Type: MESSAGE\nTask name: /root\nSender: /root/worker\nPayload:\n"
        },
        {
          "type": "encrypted_content",
          "encrypted_content": "<encrypted message payload>"
        }
      ]
    }
    ```
    
    Conceptually, the model receives:
    
    ```text
    Message Type: MESSAGE
    Task name: /root
    Sender: /root/worker
    Payload:
    The protocol tests pass; I am checking the resume path now.
    ```
    
    ### `FINAL_ANSWER`
    
    `FINAL_ANSWER` is emitted when a child agent reaches a terminal state
    and reports its result to its parent. Completion payloads are already
    available locally, so the complete envelope is represented as plaintext
    rather than as a plaintext header plus encrypted content.
    
    For `/root/worker` completing work for the root agent, the request
    contains:
    
    ```json
    {
      "type": "agent_message",
      "author": "/root/worker",
      "recipient": "/root",
      "content": [
        {
          "type": "input_text",
          "text": "Message Type: FINAL_ANSWER\nTask name: /root\nSender: /root/worker\nPayload:\nNo regressions found."
        }
      ]
    }
    ```
    
    The model-visible form is:
    
    ```text
    Message Type: FINAL_ANSWER
    Task name: /root
    Sender: /root/worker
    Payload:
    No regressions found.
    ```
    
    Errored, shut down, and missing agents also use `FINAL_ANSWER`, with a
    terminal-status description in the payload.
    
    ## What changed
    
    - Render `NEW_TASK` or `MESSAGE` in
    `InterAgentCommunication::to_model_input_item`, based on whether the
    encrypted delivery starts a turn.
    - Replace the multi-agent v2 `<subagent_notification>` completion
    payload with a model-visible `FINAL_ANSWER` envelope.
    - Document `Task name`, `Sender`, and `Payload` consistently in the
    multi-agent developer instructions.
    - Prevent local-only history projections from treating an encrypted
    message's plaintext header as the complete assistant message.
    - Preserve rollout-trace interaction edges when an agent message
    contains both plaintext and encrypted content.
    
    Legacy multi-agent behavior remains unchanged.
    
    ## Verification
    
    - `just test -p codex-protocol`
    - `just test -p codex-rollout-trace`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-core
    encrypted_multi_agent_v2_spawn_sends_agent_message_to_child`
    - `just test -p codex-core
    plaintext_multi_agent_v2_completion_sends_agent_message`
    - `just test -p codex-core
    multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn`
    - `just test -p codex-core
    multi_agent_v2_completion_queues_message_for_direct_parent`
  • feat(core): add metadata field to ResponseItem (#28355)
    ## Description
    
    This PR adds an optional `metadata` field to `ResponseItem` for
    Responses API calls. This is mechanical plumbing only; no values are
    populated or sent yet. Even just adding a new field to `ResponseItem`
    turns out to have quite a large blast radius.
    
    This change is backwards compatible because `metadata` is optional and
    omitted when absent, so existing response items and rollout history
    without it still deserialize and requests that do not set it keep the
    same wire shape. For provider compatibility, we strip out `metadata`
    before non-OpenAI Responses requests so Azure and AWS Bedrock never see
    this field.
    
    My followup PR here will actually make use of it to start storing and
    passing along `turn_id`: https://github.com/openai/codex/pull/28360
    
    ## What changed
    
    - Added `ResponseItemMetadata` with optional `turn_id`, plus optional
    `metadata` on Responses API item variants and inter-agent communication.
    - Preserved item metadata through response-item rewrites such as
    truncation, missing tool-output synthesis, compaction history
    rebuilding, visible-history conversion, rollout/resume, and generated
    app-server schemas/types.
    - Strip item metadata from non-OpenAI Responses requests while
    preserving it for OpenAI-shaped requests.
    - Updated the mechanical fixture/test construction churn required by the
    new optional field.
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • Support plaintext agent messages (#27830)
    ## Why
    
    Multi-agent v2 `send_message` deliveries already reach the receiving
    model as typed `agent_message` items with encrypted content.
    Child-completion notifications are generated by Codex itself, so their
    content is plaintext and previously fell back to a serialized JSON
    envelope inside an assistant message.
    
    With plaintext `input_text` supported for `agent_message`, both delivery
    paths can use the same model-visible type while preserving explicit
    author and recipient metadata.
    
    ## What changed
    
    - add plaintext `input_text` support to `AgentMessageInputContent` and
    regenerate the affected app-server schemas
    - preserve `InterAgentCommunication` as structured mailbox input instead
    of converting it to assistant text
    - record delivered communications as typed `agent_message` history items
    - persist a dedicated rollout item so local delivery metadata such as
    `trigger_turn` remains available without leaking into the Responses
    request
    - reconstruct typed agent messages on resume and preserve fork-turn
    truncation behavior
    - remove request-time assistant-content parsing
    - preserve plaintext and encrypted inter-agent deliveries in stage-one
    memory inputs
    - normalize and link plaintext and encrypted agent messages in rollout
    traces without treating inbound messages as child results
    - cover the real MultiAgent V2 child-completion path end to end with
    deterministic mailbox synchronization
    
    ## Verification
    
    - `just test -p codex-core
    plaintext_multi_agent_v2_completion_sends_agent_message`
    - `just test -p codex-core input_queue_drains_mailbox_in_delivery_order
    record_initial_history_reconstructs_typed_inter_agent_message
    fork_turn_positions_use_inter_agent_delivery_metadata`
    - `just test -p codex-memories-write
    serializes_inter_agent_communications_for_memory`
    - `just test -p codex-rollout-trace
    agent_messages_preserve_routing_and_content
    sub_agent_started_activity_creates_spawn_edge`
    - `just test -p codex-rollout-trace
    agent_result_edge_falls_back_to_child_thread_without_result_message`
    - `just test -p codex-protocol -p codex-rollout -p
    codex-app-server-protocol`
  • [codex] Expand hosted web search citation guidance (#27501)
    ## Summary
    
    - Expand the hosted web search prompt with explicit Markdown-link
    citation guidance.
    - Keep internal `turnX` reference IDs out of final responses and place
    citations next to supported claims.
    
    ## Context
    
    
    https://openai.slack.com/archives/C0AU83S0ZQU/p1781133381448499?thread_ts=1780352049.512299&cid=C0AU83S0ZQU
    
    ## Test plan
    
    - Confirmed `codex-rs/ext/web-search/web_run_description.md` exactly
    matches the supplied target prompt.
    - `UV_CACHE_DIR=/tmp/codex-uv-cache
    PATH=/tmp/codex-just/bin:/home/dev-user/.rustup/toolchains/1.95.0-x86_64-unknown-linux-gnu/bin:$PATH
    python3 scripts/format.py --check`
    - `git diff --check`
  • [codex] Remove async_trait from ToolExecutor (#27304)
    ## Why
    
    We're now [discouraging use of
    `async_trait`](https://github.com/openai/codex/pull/20242).
    
    Removing `async_trait` from `ToolExecutor` cuts the `codex_core` debug
    test build time by ~78% (from 227.5s to 50.3s) on my machine.
    
    Stacked on #27299, this PR applies the trait change after the handler
    bodies have been outlined.
    
    ## What
    
    Changed `ToolExecutor::handle` to return an explicit boxed
    `ToolExecutorFuture` instead of using `async_trait`.
    
    Updated ToolExecutor implementors to return `Box::pin(...)`, reexported
    the future alias through `codex-tools` and `codex-extension-api`, and
    removed `codex-tools` direct `async-trait` dependency.
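    
    A minimal sketch of the pattern, assuming a simplified signature: the
    real `ToolExecutor::handle` takes different arguments, and the
    `Result<String, String>` output, `EchoTool`, and the tiny `block_on`
    helper are illustrative assumptions that keep the example free of an
    async runtime.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Assumed shape of the alias: a boxed, `Send` future.
    pub type ToolExecutorFuture<'a> =
        Pin<Box<dyn Future<Output = Result<String, String>> + Send + 'a>>;
    
    /// Usable as `dyn ToolExecutor` because `handle` returns a boxed
    /// future instead of being an `async fn`.
    pub trait ToolExecutor: Send + Sync {
        fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a>;
    }
    
    pub struct EchoTool;
    
    impl EchoTool {
        /// Inherent async method that holds the handler body.
        async fn handle_call(&self, input: &str) -> Result<String, String> {
            Ok(format!("echo: {}", input))
        }
    }
    
    impl ToolExecutor for EchoTool {
        fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a> {
            // The trait method only boxes and pins the inherent future.
            Box::pin(self.handle_call(input))
        }
    }
    
    /// Waker that does nothing; enough for a busy-polling executor.
    struct NoopWake;
    
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Polls a future to completion on the current thread.
    pub fn block_on<F: Future>(fut: F) -> F::Output {
        let mut fut = Box::pin(fut);
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        loop {
            if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
                return value;
            }
            std::thread::yield_now();
        }
    }
    ```
    
    With these definitions, `block_on(tool.handle("ping"))` on a
    `Box<dyn ToolExecutor>` drives the boxed future to completion without
    any macro-generated code.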
  • [codex] Outline ToolExecutor handler bodies (#27299)
    ## Why
    
    We're now [discouraging use of
    `async_trait`](https://github.com/openai/codex/pull/20242).
    
    Removing `async_trait` from `ToolExecutor` cuts the `codex_core` debug
    test build time by ~78% (from 227.5s to 50.3s) on my machine.
    
    For ease of reviewing, this is a prefactor to extract trait method
    implementations to inherent methods. This will prevent changing
    indentation from creating a huge diff.
    
    ## What
    
    Outlined existing `ToolExecutor::handle` bodies into inherent async
    `handle_call` methods across core and extension tool handlers.
    
    The trait methods still use `async_trait` and now delegate to
    `self.handle_call(...).await`; handler behavior is unchanged.
  • Remove async-trait from extension contributors (#27383)
    ## Why
    
    Extension contributors are registered behind `dyn Trait` objects, so
    native `async fn`/RPITIT methods would make these traits
    non-object-safe. Spell out the boxed, `Send` future contract directly so
    `extension-api` no longer needs `async-trait` while retaining the
    existing runtime model.
    
    ## What changed
    
    - add a shared `ExtensionFuture` alias and use it for asynchronous
    contributor methods
    - migrate production and test implementations to return `Box::pin(async
    move { ... })`
    - remove `async-trait` dependencies where they are no longer used,
    keeping it dev-only where unrelated test executors still require it
    
    ## Behavior
    
    No behavior change is intended. Contributor futures remain boxed,
    `Send`, dynamically dispatched, and lazily executed; cancellation and
    callback ordering stay unchanged.
    
    ## Testing
    
    - `just test -p codex-extension-api` (11 passed)
    - affected extension crates (64 passed)
    - targeted `codex-core` contributor tests (14 passed)
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    A broad local `codex-core` run compiled successfully but encountered
    unrelated sandbox and missing test-binary fixture failures; CI will run
    the full checks.
  • Update web search citation prompt (#27096)
    ## Summary
    
    - Update the web search tool prompt to require Markdown links for cited
    sources.
    - Explicitly tell the model not to use `turnX`-style citations in
    responses.
    
    ## Context
    
    
    https://openai.slack.com/archives/C0AU83S0ZQU/p1780964147777649?thread_ts=1780352049.512299&cid=C0AU83S0ZQU
    
    ## Test plan
    
    - `git diff --check`
    - `python3 scripts/format.py --check` (fails only on Rust formatter
    setup: rustup cannot create temp files under `/home/dev-user/.rustup`;
    Just and Python formatter checks pass when using temp cache dirs)
  • [codex] Exclude external tool output from memories (#26821)
    ## Summary
    
    - Add `contains_external_context()` to tool output so that other tools
    can be opted out of influencing memory when
    `disable_on_external_context=true`.
    - Classify standalone web-search output as external context, matching
    the behavior of hosted web search.
    - Verify with an integration test.
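    
    One way to picture the opt-out is a trait method with a conservative
    default. Only the method name `contains_external_context` and the
    `disable_on_external_context` setting come from this PR; the
    `ToolOutput` trait, the two output types, and `eligible_for_memory`
    are hypothetical.
    
    ```rust
    pub trait ToolOutput {
        fn text(&self) -> &str;
    
        /// Defaults to `false`; outputs that carry external content
        /// override it.
        fn contains_external_context(&self) -> bool {
            false
        }
    }
    
    pub struct ShellOutput(pub String);
    pub struct WebSearchOutput(pub String);
    
    impl ToolOutput for ShellOutput {
        fn text(&self) -> &str {
            &self.0
        }
    }
    
    impl ToolOutput for WebSearchOutput {
        fn text(&self) -> &str {
            &self.0
        }
    
        fn contains_external_context(&self) -> bool {
            true
        }
    }
    
    /// Returns whether the given outputs may influence memory.
    pub fn eligible_for_memory(
        outputs: &[&dyn ToolOutput],
        disable_on_external_context: bool,
    ) -> bool {
        let has_external = outputs.iter().any(|o| o.contains_external_context());
        !(disable_on_external_context && has_external)
    }
    ```
    
    In this sketch a single external output is enough to exclude the
    whole set when the setting is on, and nothing is excluded when it is
    off.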
  • [codex] Enable standalone web search in code mode (#26719)
    ## What
    
    - Consume plaintext `output` from standalone search while retaining
    optional `encrypted_output` parsing.
    - Expose `web.run` to code mode and return search output to nested
    JavaScript calls.
    - Cover direct and code-mode standalone search paths with integration
    tests.
    
    ## Why
    
    `/v1/alpha/search` now returns plaintext output, which code mode needs
    to consume standalone search results.
    
    ## Test plan
    
    - `just test -p codex-api`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-core code_mode_can_call_standalone_web_search`
    - `just test -p codex-app-server
    standalone_web_search_round_trips_output`
  • [codex] Use standalone tools for Responses Lite (#26490)
    ## Summary
    
    Responses Lite does not execute hosted Responses tools, so models using
    it must route web search and image generation through Codex-owned
    executors and standalone Responses API endpoints.
    
    This PR is stacked on #26487.
    
    ## Validation
    
    - `cargo test -p codex-core responses_lite_ --lib`
    - `cargo test -p codex-core
    standalone_executors_remain_hidden_without_flags_or_responses_lite
    --lib`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates --lib`
    - `cargo test -p codex-web-search-extension -p
    codex-image-generation-extension`
    - `cargo test -p codex-app-server --test all standalone_`
    - `cargo fmt --all -- --check`
  • [codex] enable parallel standalone web search calls (#25702)
    ## Summary
    - opt the extension-backed standalone `web.run` tool into parallel tool
    execution
    - update the existing extension registration test to assert that the
    tool advertises parallel-call support
    
    ## Why
    The standalone web-search API endpoint now supports parallel requests.
    The extension executor still inherited the shared serial default,
    causing multiple `web.run` calls to acquire the exclusive runtime lock.
    
    ## Impact
    Models that emit multiple standalone web-search calls can now execute
    them concurrently when model-level parallel tool calls are enabled.
    
    ## Validation
    - `just fmt`
    - `just test -p codex-web-search-extension`
    - `git diff --check origin/main...HEAD`
  • [codex] Require model for standalone web search (#25131)
    ## Why
    
    The standalone `/v1/alpha/search` request now requires a `model`, but
    the `web.run` extension currently omits it.
    
    This PR adds `model` to the extension `ToolCall` invocation.
    
    Follow-up to #23823.
    
    ## What changed
    
    - Make `SearchRequest.model` required.
    - Expose the effective per-turn model on extension tool calls and pass
    it in standalone web-search requests.
    - Assert the model is forwarded in the app-server round-trip test.
    
    ## Testing
    
    - `just test -p codex-api -p codex-tools -p codex-web-search-extension
    -p codex-memories-extension -p codex-goal-extension`
    - `just test -p codex-core -E
    'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
    - `just test -p codex-app-server -E
    'test(standalone_web_search_round_trips_encrypted_output)'`
  • Show activity for standalone web search calls (#24693)
    ## Why
    
    Standalone `web.run` calls run in the extension, so they need normal
    web-search progress activity while a request is in flight and durable
    completed activity after a thread is reloaded.
    
    Follow-up to #23823; uses the extension turn-item emission path added in
    #24813.
    
    ## What changed
    
    - Emit standalone `web.run` start/completion items through the host
    turn-item emitter, preserving standard client delivery and rollout
    persistence.
    - Include useful completion detail for queries, image queries, and
    literal-URL `open`/`find` commands.
    - Render completed searches as `Searched the web` or `Searched the web
    for <detail>`, with snapshot coverage for the detail-free case.
    - Extend the app-server round-trip test to verify completed search
    activity is reconstructed by `thread/read` after a fresh-process reload.
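    
    The two completed-activity labels can be sketched as a small
    function. `completed_search_label` is a hypothetical name, and
    treating blank detail as absent is an assumption.
    
    ```rust
    /// Renders the completed-activity label for a standalone search.
    /// Assumption: empty or whitespace-only detail counts as no detail.
    pub fn completed_search_label(detail: Option<&str>) -> String {
        match detail.map(str::trim).filter(|d| !d.is_empty()) {
            Some(d) => format!("Searched the web for {}", d),
            None => "Searched the web".to_string(),
        }
    }
    ```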
    
    ## Testing
    
    - `just test -p codex-web-search-extension`
    - `just test -p codex-app-server -E
    "test(standalone_web_search_round_trips_encrypted_output)"`
  • fix: dont compact standalone websearch schema (#24660)
    Add a new `parse_tool_input_schema_without_compaction` to bypass the
    existing compaction/trimming of client-provided tool schemas that are
    over 4k bytes.
    
    We want this for standalone web search to keep field guidance/metadata
    on certain fields; this keeps us closer to parity with the existing
    hosted tool schema (which didn't go through this 4k-byte filter).
  • make direct only allowed caller for standalone websearch (#24646)
    Only allow `Direct` callers of the standalone web search tool because
    it's not supported in code mode.
  • standalone websearch extension (#23823)
    ## Summary
    
    Add the extension-backed standalone `web.run` tool so Codex can call the
    standalone search endpoint through the `codex-api` search client and
    return its encrypted output to Responses.
    
    - gate the new tool behind `standalone_web_search`
    - install the extension in the app-server thread registry and hide
    hosted `web_search` when standalone search is enabled for OpenAI
    providers so the two paths stay mutually exclusive
    - build search context from persisted history using a small tail
    heuristic: previous user message, assistant text between the last two
    user turns capped at about 1k tokens, and current user message
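    
    The tail heuristic in the last bullet can be sketched as follows. All
    names here are hypothetical, and two details are assumptions: tokens
    are approximated as four characters each, and the cap keeps the end of
    the assistant text rather than the beginning.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum Role {
        User,
        Assistant,
    }
    
    pub struct Turn {
        pub role: Role,
        pub text: String,
    }
    
    #[derive(Debug, PartialEq, Eq)]
    pub struct SearchContext {
        pub previous_user: Option<String>,
        pub assistant_between: String,
        pub current_user: Option<String>,
    }
    
    // Assumption: roughly four characters per token.
    const APPROX_CHARS_PER_TOKEN: usize = 4;
    const ASSISTANT_TOKEN_CAP: usize = 1_000;
    
    /// Keeps at most `max_chars` characters from the end of `text`.
    fn cap_tail(text: &str, max_chars: usize) -> String {
        let count = text.chars().count();
        text.chars().skip(count.saturating_sub(max_chars)).collect()
    }
    
    /// Builds context from the last two user messages and the assistant
    /// text between them.
    pub fn build_search_context(history: &[Turn]) -> SearchContext {
        let user_idx: Vec<usize> = history
            .iter()
            .enumerate()
            .filter(|(_, t)| t.role == Role::User)
            .map(|(i, _)| i)
            .collect();
        let current = user_idx.last().copied();
        let previous = if user_idx.len() >= 2 {
            Some(user_idx[user_idx.len() - 2])
        } else {
            None
        };
    
        let assistant_between = match (previous, current) {
            (Some(p), Some(c)) => {
                let joined = history[p + 1..c]
                    .iter()
                    .filter(|t| t.role == Role::Assistant)
                    .map(|t| t.text.as_str())
                    .collect::<Vec<_>>()
                    .join("\n");
                cap_tail(&joined, ASSISTANT_TOKEN_CAP * APPROX_CHARS_PER_TOKEN)
            }
            _ => String::new(),
        };
    
        SearchContext {
            previous_user: previous.map(|i| history[i].text.clone()),
            assistant_between,
            current_user: current.map(|i| history[i].text.clone()),
        }
    }
    ```
    
    With a single user message in history, the sketch returns only
    `current_user` and leaves the other two fields empty.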
    
    ## Test Plan
    
    - `cargo test -p codex-web-search-extension`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`