16 Commits

  • mcp: keep elicitation requests below app wire types (#29724)
    ## Why
    
    Core and tools need to request MCP elicitation without constructing
    app-server wire payloads. The request should remain a neutral protocol
    concept until app-server serializes it for a client.
    
    ## What changed
    
    - Switched core and tools to
    `codex_protocol::approvals::ElicitationRequest`.
    - Derived turn and server context inside core instead of carrying
    app-server request types through lower layers.
    - Kept the app-server payload unchanged through an explicit boundary
    conversion.
    - Removed the remaining production app-server-protocol dependency from
    tools.
    
    ## Stack
    
    This is PR 5 of 6, stacked on [PR
    #29723](https://github.com/openai/codex/pull/29723). Review only the
    delta from `codex/split-connector-metadata-types`. Next: [PR
    #29725](https://github.com/openai/codex/pull/29725).
    
    ## Validation
    
    - `codex-core` MCP coverage passed: 87 tests.
    - Tools elicitation and app-server round-trip coverage passed.
  • connectors: own app metadata types (#29723)
    ## Why
    
    Connector metadata is consumed by connector discovery, ChatGPT
    integration, core, and TUI code. Treating app-server's wire DTO as the
    shared domain model reverses the intended dependency direction.
    
    ## What changed
    
    - Added connector-owned app branding, review, screenshot, metadata, and
    info types.
    - Added explicit conversions in app-server and TUI while preserving
    app-server's wire payloads.
    - Removed production app-server-protocol dependencies from connectors
    and ChatGPT connector code.
    
    ## Stack
    
    This is PR 4 of 6, stacked on [PR
    #29722](https://github.com/openai/codex/pull/29722). Review only the
    delta from `codex/split-config-layer-types`. Next: [PR
    #29724](https://github.com/openai/codex/pull/29724).
    
    ## Validation
    
    - Connector and tools coverage passed.
    - App-server app-list coverage passed: 13 tests.
  • Route image extension reads through turn environments v2 (#27498)
    ## Why
    
    Image generation used `std::fs::read` for referenced image paths, which
    did not support environment-backed filesystems or their sandbox context.
    
    ## What changed
    
    - Expose optional turn environments to extension tool calls.
    - Include each environment’s ID, working directory, filesystem, and
    sandbox context.
    - Read referenced images through the selected environment filesystem.
    - Keep sandbox usage at the extension call site so extensions can choose
    the appropriate access mode.
    - Consolidate image request construction into one async function.
    - Add coverage for successful environment reads and read failures.
    
    ## Validation
    
    - `cargo check -p codex-image-generation-extension --tests`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    `just test -p codex-image-generation-extension` could not complete
    because the build exhausted available disk space.
  • [codex] Remove async_trait from ToolExecutor (#27304)
    ## Why
    
    We're now [discouraging use of
    `async_trait`](https://github.com/openai/codex/pull/20242).
    
    Removing use of `async_trait` from `ToolExecutor` yields a `codex_core`
    debug test build speedup of ~78% (from 227.5s to 50.3s) on my machine.
    
    Stacked on #27299, this PR applies the trait change after the handler
    bodies have been outlined.
    
    ## What
    
    Changed `ToolExecutor::handle` to return an explicit boxed
    `ToolExecutorFuture` instead of using `async_trait`.
    
    Updated ToolExecutor implementors to return `Box::pin(...)`, reexported
    the future alias through `codex-tools` and `codex-extension-api`, and
    removed `codex-tools` direct `async-trait` dependency.
  • chore: add JSON schema policy fixture coverage (#24152)
    ## Why
    
    Before changing the Codex Bridge JSON schema policy, add integration
    coverage around real connector-like MCP tool schemas. The existing unit
    tests cover individual sanitizer behaviors, but they do not make it easy
    to see whether full fixture schemas keep model-visible guidance, prune
    only unreachable definitions, drop unsupported JSON Schema fields, and
    stay within the Responses API schema budget.
    
    ## What Changed
    
    - Added `tools/tests/json_schema_policy_fixtures.rs`, which converts MCP
    tool fixtures through `mcp_tool_to_responses_api_tool` and validates the
    resulting Responses tool parameters.
    - Added connector-style fixtures for Slack, Google Calendar, Google
    Drive, Notion, and Microsoft Outlook Email under
    `tools/tests/fixtures/json_schema_policy/`.
    - Added fixture assertions for preserved guidance, pruned definitions,
    expected field drops after `JsonSchema` conversion, marker count
    baselines, and dangling local `$ref` prevention.
    - Added a real oversized golden Notion `create_page` input schema
    fixture to exercise the compaction path that strips descriptions, drops
    root `$defs`, rewrites local refs, and fits the compacted schema under
    the budget.
  • feat: support local refs and defs in tool input schemas (#23357)
    # Why
    
    Some connector tool input schemas use local JSON Schema references and
    definition tables to avoid duplicating large nested shapes. Codex
    previously lowered these schemas into the supported subset in a way that
    could discard `$ref`-only schema objects and lose the corresponding
    definitions, which made non-strict tool registration less faithful than
    the original connector schema.
    
    This keeps the existing minimal-lowering policy: Codex still does not
    raw-pass through arbitrary JSON Schema, but it now preserves local
    reference structure that fits the Responses-compatible subset and prunes
    definition entries that cannot be reached by following `$ref`s from the
    root schema after sanitization, including refs found transitively inside
    other reachable definitions. The pruning matters because Responses
    parses definition tables even when entries are unused, so keeping dead
    definitions wastes prompt tokens.
    
    # What changed
    
    - Added `$ref`, `$defs`, and legacy `definitions` fields to the tool
    `JsonSchema` representation.
    - Updated `parse_tool_input_schema` lowering so `$ref`-only schema
    objects survive sanitization instead of becoming `{}`.
    - Sanitized definition tables recursively and dropped malformed
    definition tables so non-strict registration degrades gracefully.
    - Added reachability pruning for root definition tables by starting from
    refs outside definition tables, then following refs inside reachable
    definitions.
    - Added JSON Pointer decoding for local definition refs such as
    `#/$defs/Foo~1Bar`.
    
    # Verification
    ran local golden-schema probes against representative connector schemas
    to validate behavior on real generated schemas:
    
    | Golden schema | Before bytes | After bytes | `$defs` before -> after |
    `$ref` before -> after | Result |
    |---|---:|---:|---:|---:|---|
    | `google_calendar/create_space` | 7111 | 4526 | 7 -> 7 | 7 -> 7 | all
    definitions preserved because all are reachable |
    | `figma/apply_file_variable_changes` | 4609 | 999 | 8 -> 5 | 8 -> 5 |
    unused defs pruned after unsupported `oneOf` shapes lower away |
    | `snowflake/list_catalog_integrations` | 1380 | 404 | 3 -> 0 | 0 -> 0 |
    all defs pruned because none are referenced |
    | `dropbox/create_shared_link` | 8894 | 1836 | 14 -> 4 | 9 -> 4 | only
    defs reachable from the root schema after sanitization are retained,
    including transitively through other retained defs |
    
    Token increase across golden schema due to this change:
    <img width="817" height="366" alt="Screenshot 2026-05-19 at 1 47 04 PM"
    src="https://github.com/user-attachments/assets/d5c80fe9-da85-41e6-8ac7-a01d1e0b0f71"
    />
  • feat: add turn_id and truncation_policy to extension tool calls (#23666)
    ## Why
    
    Extension-owned tools currently receive a stripped `ToolCall` with only
    `call_id`, `tool_name`, and `payload`.
    That makes extension work that needs turn-local execution context
    awkward, especially web-search extension work that needs the active
    `truncation_policy` at tool invocation time.
    
    Reconstructing that value from config or `ExtensionData` would be
    indirect and could drift from the actual turn context, so the cleaner
    fix is to pass the needed turn metadata directly on the extension-facing
    invocation type.
    
    ## What changed
    
    - added `turn_id` and `truncation_policy` to `codex_tools::ToolCall`
    - populated those fields when core adapts `ToolInvocation` into an
    extension tool call
    - added a focused adapter test that verifies extension executors receive
    the forwarded turn metadata
    - updated the memories extension tests to construct the richer
    `ToolCall`
    - added the `codex-utils-output-truncation` dependency to `codex-tools`
    and refreshed lockfiles
    
    ## Testing
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-memories-extension`
    - `cargo test -p codex-core passes_turn_fields_to_extension_call`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
  • feat: make ToolExecutor an async trait (#22560)
    ## Why
    
    `codex_tools::ToolExecutor` keeps a tool spec attached to its runtime
    handler, but extension tools still carried a parallel
    `ExtensionToolFuture` / `ExtensionToolExecutor` shape. That made
    extension-owned tools look different from host tools even though
    routing, registration, and execution need the same abstraction.
    
    This PR makes the shared executor contract directly async and lets
    extension tools implement it too, so host tools and extension tools can
    move through the same registration path.
    
    ## What changed
    
    - Changed `ToolExecutor::handle` to an `async fn` using `async-trait`,
    and updated built-in tool handlers to implement the async trait
    directly.
    - Replaced the bespoke `ExtensionToolFuture` contract with a marker
    `ExtensionToolExecutor` over `ToolExecutor<ToolCall, Output =
    JsonToolOutput>`, re-exporting `ToolExecutor` from
    `codex-extension-api`.
    - Updated the memories extension tools to implement the shared executor
    trait.
    - Split tool-router construction into collected executors plus hosted
    model specs, keeping hosted tools like web search and image generation
    separate from executable handlers.
    - Updated spec/router tests and extension-tool stubs for the new
    executor shape.
    
    ## Verification
    
    - Not run locally.
  • feat: extract shared tool executor interface (#22359)
    ## Why
    
    Codex still models model-visible tools and executable behavior largely
    inside `codex-core`, which makes it harder to evolve the tool system
    toward a single reusable abstraction for built-ins, MCP-backed tools,
    dynamic tools, and later tools injected from outside core.
    
    This PR takes the next incremental step in that direction by moving the
    common execution-facing pieces out of core and separating them from
    core-only orchestration. The intent is to let shared tool abstractions
    improve in one place, while `codex-core` keeps the parts that are still
    inherently host-specific today, such as `ToolInvocation`, dispatch
    wiring, and hook integration.
    
    This PR is mostly moving things around. The only interesting piece is
    this abstraction:
    https://github.com/openai/codex/pull/22359/changes#diff-81af519002548ba51ed102bdaaf77e081d40a1e73a6e5f9b104bbbc96a6f1b3dR13
    
    ## What changed
    
    - Added `codex_tools::ToolExecutor<Invocation>` as the shared execution
    trait for model-visible tools.
    - Moved the reusable execution support types from `codex-core` into
    `codex-tools`:
      - `FunctionCallError`
      - `ToolPayload`
      - `ToolOutput`
    - Refactored core tool implementations so that execution behavior lives
    on `ToolExecutor<ToolInvocation>`, while `ToolHandler` remains the
    core-local extension point for hook payloads, telemetry tags, diff
    consumers, and other orchestration concerns.
    - Kept the registry and dispatch flow behaviorally unchanged while
    making the shared/extracted boundary explicit across built-in, MCP,
    dynamic, extension-backed, shell, and multi-agent tool handlers.
    
    ## Verification
    
    - `cargo test -p codex-tools`
    - `just fix -p codex-tools`
    - `just fix -p codex-core`
    - `cargo test -p codex-core` progressed through the updated tool
    surfaces and then hit the existing unrelated multi-agent stack overflow
    in
    `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` has entails both running standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
    
    For the collection of crates below, this speeds up test execution by
    >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • Extract tool config into codex-tools (#16379)
    ## Why
    
    `codex-core` already owns too much of the tool stack, and `AGENTS.md`
    explicitly pushes us to move shared code out of `codex-core` instead of
    letting it keep growing. This PR takes the next incremental step in
    moving `core/src/tools` toward `codex-rs/tools` by extracting
    low-coupling tool configuration and image-detail gating logic into
    `codex-tools`.
    
    That gives later extraction work a cleaner boundary to build on without
    trying to move the entire tools subtree in one shot.
    
    ## What changed
    
    - moved `ToolsConfig`, `ToolsConfigParams`, shell backend config, and
    unified-exec session selection from `core/src/tools/spec.rs` into
    `codex-tools`
    - moved original image-detail gating and normalization into
    `codex-tools`
    - updated `codex-core` to consume the new `codex-tools` exports and pass
    a rendered agent-type description instead of raw role config
    - kept `codex-rs/tools/src/lib.rs` exports-only, with extracted unit
    tests living in sibling `*_tests.rs` modules
    
    ## Testing
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core --lib tools::spec::`
  • codex-tools: extract discoverable tool models (#16254)
    ## Why
    
    `#16193` moved the pure `tool_search` and `tool_suggest` spec builders
    into `codex-tools`, but `codex-core` still owned the shared
    discoverable-tool model that those builders and the `tool_suggest`
    runtime both depend on. This change continues the migration by moving
    that reusable model boundary out of `codex-core` as well, so the
    discovery/suggestion stack uses one shared set of types and
    `core/src/tools` no longer needs its own `discoverable.rs` module.
    
    ## What changed
    
    - Moved `DiscoverableTool`, `DiscoverablePluginInfo`, and
    `filter_tool_suggest_discoverable_tools_for_client()` into
    `codex-rs/tools/src/tool_discovery.rs` alongside the extracted
    discovery/suggestion spec builders.
    - Added `codex-app-server-protocol` as a `codex-tools` dependency so the
    shared discoverable-tool model can own the connector-side `AppInfo`
    variant directly.
    - Updated `core/src/tools/handlers/tool_suggest.rs`,
    `core/src/tools/spec.rs`, `core/src/tools/router.rs`,
    `core/src/connectors.rs`, and `core/src/codex.rs` to consume the shared
    `codex-tools` model instead of the old core-local declarations.
    - Changed `core/src/plugins/discoverable.rs` to return
    `DiscoverablePluginInfo` directly, moved the pure client-filter coverage
    into `tool_discovery_tests.rs`, and deleted the old
    `core/src/tools/discoverable.rs` module.
    - Updated `codex-rs/tools/README.md` so the crate boundary documents
    that `codex-tools` now owns the discoverable-tool models in addition to
    the discovery/suggestion spec builders.
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
    codex-core --lib tools::handlers::tool_suggest::`
    - `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
    codex-core --lib tools::spec::`
    - `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
    codex-core --lib plugins::discoverable::`
    - `just bazel-lock-check`
    - `just argument-comment-lint`
    
    ## References
    
    - #16193
    - #16154
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
    - #16047
    - #16129
    - #16132
    - #16138
    - #16141
  • codex-tools: extract code mode tool spec adapters (#16132)
    ## Why
    
    The longer-term `codex-tools` migration is to move pure tool-definition
    and tool-spec plumbing out of `codex-core` while leaving session- and
    runtime-coupled orchestration behind.
    
    The remaining code-mode adapter layer in
    `core/src/tools/code_mode_description.rs` was a good next extraction
    seam because it only transformed `ToolSpec` values for code mode and
    already delegated the low-level description rendering to
    `codex-code-mode`.
    
    ## What Changed
    
    - added `codex-rs/tools/src/code_mode.rs` with
    `augment_tool_spec_for_code_mode()` and
    `tool_spec_to_code_mode_tool_definition()`
    - added focused unit coverage in `codex-rs/tools/src/code_mode_tests.rs`
    - rewired `core/src/tools/spec.rs` and `core/src/tools/code_mode/mod.rs`
    to use the extracted adapters from `codex-tools`
    - removed the old `core/src/tools/code_mode_description.rs` shim and its
    test file from `codex-core`
    - added the `codex-code-mode` dependency to `codex-tools`, updated
    `Cargo.lock`, and refreshed the `codex-tools` README to reflect the
    expanded boundary
    
    ## Test Plan
    
    - `cargo test -p codex-tools`
    - `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
    codex-core --lib tools::spec::`
    - `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
    codex-core --lib tools::code_mode::`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just argument-comment-lint`
    
    ## References
    
    - #15923
    - #15928
    - #15944
    - #15953
    - #16031
    - #16047
    - #16129
  • codex-tools: extract dynamic tool adapters (#15944)
    ## Why
    
    `codex-tools` already owned the shared JSON schema parser and the MCP
    tool schema adapter, but `core/src/tools/spec.rs` still parsed dynamic
    tools directly.
    
    That left the tool-schema boundary split in two different ways:
    
    - MCP tools flowed through `codex-tools`, while dynamic tools were still
    parsed in `codex-core`
    - the extracted dynamic-tool path initially introduced a
    dynamic-specific parsed shape even though `codex-tools` already had very
    similar MCP adapter output
    
    This change finishes that extraction boundary in one step. `codex-core`
    still owns `ResponsesApiTool` assembly, but both MCP tools and dynamic
    tools now enter that layer through `codex-tools` using the same parsed
    tool-definition shape.
    
    ## What changed
    
    - added `tools/src/dynamic_tool.rs` and sibling
    `tools/src/dynamic_tool_tests.rs`
    - introduced `parse_dynamic_tool()` in `codex-tools` and switched
    `core/src/tools/spec.rs` to use it for dynamic tools
    - added `tools/src/parsed_tool_definition.rs` so both MCP and dynamic
    adapters return the same `ParsedToolDefinition`
    - updated `core/src/tools/spec.rs` to build `ResponsesApiTool` through a
    shared local adapter helper instead of separate MCP and dynamic assembly
    paths
    - expanded `core/src/tools/spec_tests.rs` so the dynamic-tool adapter
    test asserts the full converted `ResponsesApiTool`, including
    `defer_loading`
    - updated `codex-rs/tools/README.md` to reflect the shared parsed
    tool-definition boundary
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core --lib tools::spec::`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15944).
    * #15953
    * __->__ #15944
  • codex-tools: extract MCP schema adapters (#15928)
    ## Why
    
    `codex-tools` already owns the shared tool input schema model and parser
    from the first extraction step, but `core/src/tools/spec.rs` still owned
    the MCP-specific adapter that normalizes `rmcp::model::Tool` schemas and
    wraps `structuredContent` into the call result output schema.
    
    Keeping that adapter in `codex-core` means the reusable MCP schema path
    is still split across crates, and the unit tests for that logic stay
    anchored in `codex-core` even though the runtime orchestration does not
    need to move yet.
    
    This change takes the next small step by moving the reusable MCP schema
    adapter into `codex-tools` while leaving `ResponsesApiTool` assembly in
    `codex-core`.
    
    ## What changed
    
    - added `tools/src/mcp_tool.rs` and sibling
    `tools/src/mcp_tool_tests.rs`
    - introduced `ParsedMcpTool`, `parse_mcp_tool()`, and
    `mcp_call_tool_result_output_schema()` in `codex-tools`
    - updated `core/src/tools/spec.rs` to consume parsed MCP tool parts from
    `codex-tools`
    - removed the now-redundant MCP schema unit tests from
    `core/src/tools/spec_tests.rs`
    - expanded `codex-rs/tools/README.md` to describe this second migration
    step
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core --lib tools::spec::`
  • codex-tools: extract shared tool schema parsing (#15923)
    ## Why
    
    `parse_tool_input_schema` and the supporting `JsonSchema` model were
    living in `core/src/tools/spec.rs`, but they already serve callers
    outside `codex-core`.
    
    Keeping that shared schema parsing logic inside `codex-core` makes the
    crate boundary harder to reason about and works against the guidance in
    `AGENTS.md` to avoid growing `codex-core` when reusable code can live
    elsewhere.
    
    This change takes the first extraction step by moving the schema parsing
    primitive into its own crate while keeping the rest of the tool-spec
    assembly in `codex-core`.
    
    ## What changed
    
    - added a new `codex-tools` crate under `codex-rs/tools`
    - moved the shared tool input schema model and sanitizer/parser into
    `tools/src/json_schema.rs`
    - kept `tools/src/lib.rs` exports-only, with the module-level unit tests
    split into `json_schema_tests.rs`
    - updated `codex-core` to use `codex-tools::JsonSchema` and re-export
    `parse_tool_input_schema`
    - updated `codex-app-server` dynamic tool validation to depend on
    `codex-tools` directly instead of reaching through `codex-core`
    - wired the new crate into the Cargo workspace and Bazel build graph