## Summary
Some customer MCP tools expose large input schemas that exceed Codex's
compact schema budget even after description stripping. Today, the final
compaction pass collapses complex schemas starting at depth 2, which can
erase important shallow call structure such as small `anyOf` branches,
required fields, and help-mode entry points. In one reported case, this
degraded a tool schema into `query: any | any`, leaving the model
without enough structure to discover the required help call.

This change raises the deep-schema collapse boundary from depth 2 to
depth 3. That preserves one additional layer of the tool contract while
still collapsing deeper expensive subtrees to `{}` when a schema remains
over budget.
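The effect of moving the boundary can be illustrated with a minimal sketch. This is not the Codex implementation: the `Schema` enum, the `collapse` function, the `example_schema` tool shape, and the convention that the root schema sits at depth 0 are all assumptions made for illustration. Under those assumptions, a boundary of 2 reduces a two-branch `query` to `any | any`, while a boundary of 3 keeps each branch's property names and required fields.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a JSON Schema node.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    /// A scalar type such as "string" or "boolean".
    Scalar(&'static str),
    /// An object with named properties and required field names.
    Object {
        properties: BTreeMap<&'static str, Schema>,
        required: Vec<&'static str>,
    },
    /// A union of alternative schemas.
    AnyOf(Vec<Schema>),
    /// The permissive empty schema, written `{}` in JSON.
    Any,
}

/// Replace every complex node at depth `max_depth` or deeper with `Schema::Any`.
/// Assumption: the root schema sits at depth 0.
fn collapse(schema: &Schema, depth: usize, max_depth: usize) -> Schema {
    match schema {
        // Scalars are cheap, so they survive at any depth.
        Schema::Scalar(_) | Schema::Any => schema.clone(),
        // Complex nodes at the boundary or deeper become `{}`.
        _ if depth >= max_depth => Schema::Any,
        Schema::Object { properties, required } => Schema::Object {
            properties: properties
                .iter()
                .map(|(name, child)| (*name, collapse(child, depth + 1, max_depth)))
                .collect(),
            required: required.clone(),
        },
        Schema::AnyOf(branches) => Schema::AnyOf(
            branches
                .iter()
                .map(|branch| collapse(branch, depth + 1, max_depth))
                .collect(),
        ),
    }
}

/// Build a small object schema from `(name, schema)` pairs.
fn object(fields: Vec<(&'static str, Schema)>, required: Vec<&'static str>) -> Schema {
    Schema::Object {
        properties: fields.into_iter().collect(),
        required,
    }
}

/// Hypothetical tool schema: `query` is either a help request or a filter.
fn example_schema() -> Schema {
    let help = object(vec![("help", Schema::Scalar("boolean"))], vec!["help"]);
    let filter = object(
        vec![("filter", object(vec![("field", Schema::Scalar("string"))], vec![]))],
        vec!["filter"],
    );
    object(vec![("query", Schema::AnyOf(vec![help, filter]))], vec!["query"])
}
```

Calling `collapse(&example_schema(), 0, 2)` in this sketch leaves `query` as a union of two `Any` nodes, whereas `collapse(&example_schema(), 0, 3)` keeps the `help` branch intact and collapses only the nested `filter` object.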
## What Changed
- Increased `MAX_COMPACT_TOOL_SCHEMA_DEPTH` from `2` to `3`.
- Updated the schema compaction traversal test to assert the new
  collapse boundary.
- The resulting compacted shape keeps useful shallow structure, for
  example:
  - top-level argument names
  - shallow `anyOf` branches
  - required object fields
  - nested property names one level deeper than before
## Validation
- Ran `just test -p codex-tools`: 81 tests passed.
- Ran a golden schema corpus comparison over 214 discovered tool input
  schemas under `golden_schemas/*/mcp_tools/*/input_schema.json`.
  - Depth 2 and depth 3 had identical percentile token counts across the
    corpus.
  - Both ended with `0 / 214` schemas over 1k tokens.
  - Both ended with `0 / 214` schemas over the 4,000-byte compact JSON
    budget.
  - Only one golden schema changed, increasing from 49 to 56 tokens, so
    this does not appear to introduce a meaningful corpus-wide regression.

Corpus percentile results:

| Percentile | Depth 2 (tokens) | Depth 3 (tokens) |
|---|---:|---:|
| p0 | 9 | 9 |
| p10 | 31 | 31 |
| p25 | 54 | 54 |
| p50 | 81 | 81 |
| p75 | 143 | 143 |
| p90 | 290 | 290 |
| p95 | 431 | 431 |
| p99 | 600 | 600 |
| max | 832 | 832 |
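For readers who want to reproduce a comparison like this, the following is a minimal sketch of how per-schema token counts could be summarized. The PR does not say which percentile method or tokenizer was used, so the nearest-rank definition and the helper names `percentile`, `count_over`, and `summarize` are assumptions made for illustration.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// Returns `None` for an empty slice or a percentile above 100.
fn percentile(sorted: &[usize], p: usize) -> Option<usize> {
    if sorted.is_empty() || p > 100 {
        return None;
    }
    // rank = ceil(p * n / 100), clamped so that p0 maps to the minimum.
    let rank = (p * sorted.len() + 99) / 100;
    Some(sorted[rank.max(1) - 1])
}

/// Count how many values exceed `limit`.
fn count_over(values: &[usize], limit: usize) -> usize {
    values.iter().filter(|&&v| v > limit).count()
}

/// Summarize one run as (p50, p90, max, number of schemas over budget).
fn summarize(token_counts: &[usize], budget: usize) -> Option<(usize, usize, usize, usize)> {
    let mut sorted = token_counts.to_vec();
    sorted.sort_unstable();
    Some((
        percentile(&sorted, 50)?,
        percentile(&sorted, 90)?,
        percentile(&sorted, 100)?,
        count_over(&sorted, budget),
    ))
}
```

Running `summarize` once per collapse depth and comparing the resulting tuples gives the same kind of side-by-side view as the table above.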
# codex-tools

`codex-tools` is the shared support crate for building, adapting, and executing
model-visible tools outside `codex-core`.

Today this crate owns the host-facing tool models and helpers that no longer
need to live in `core/src/tools/spec.rs` or `core/src/client_common.rs`:

- aggregate host models such as `ToolSpec`, `ConfiguredToolSpec`,
  `LoadableToolSpec`, `ResponsesApiNamespace`, and
  `ResponsesApiNamespaceTool`
- host discovery models used while assembling tool sets, including
  discoverable-tool models and request-plugin-install helpers
- host adapters such as schema sanitization, MCP/dynamic conversion,
  code-mode augmentation, and image-detail normalization
- shared executable-tool contracts such as `ToolExecutor`, `ToolCall`, and
  `ToolOutput`

That extraction is the first step in a longer migration. The goal is not to
move all of `core/src/tools` into this crate in one shot. Instead, the plan is
to peel off reusable pieces in reviewable increments while keeping
compatibility-sensitive orchestration in `codex-core` until the surrounding
boundaries are ready.

## Vision

Over time, this crate should hold host-side tool machinery that is shared by
multiple consumers, for example:

- host-visible aggregate tool models
- tool-set planning and discovery helpers
- MCP and dynamic-tool adaptation into Responses API shapes
- code-mode compatibility shims that do not depend on `codex-core`
- other narrowly scoped host utilities that multiple crates need

The corresponding non-goals are just as important:

- do not move `codex-core` orchestration here prematurely
- do not pull `Session` / `TurnContext` / approval flow / runtime execution
  logic into this crate unless those dependencies have first been split into
  stable shared interfaces
- do not turn this crate into a grab-bag for unrelated helper code
## Migration approach

The expected migration shape is:

- Keep extension-owned executable-tool authoring in `codex-extension-api`.
- Move host-side planning/adaptation helpers here when they no longer need to
  stay coupled to `codex-core`.
- Leave compatibility-sensitive adapters in `codex-core` while downstream call
  sites are updated.
- Only extract higher-level host infrastructure after the crate boundaries are
  clear and independently testable.
## Crate conventions

This crate should start with stricter structure than `core/src/tools` so it
stays easy to grow:

- `src/lib.rs` should remain exports-only.
- Business logic should live in named module files such as `foo.rs`.
- Unit tests for `foo.rs` should live in a sibling `foo_tests.rs`.
- The implementation file should wire tests with:

  ```rust
  #[cfg(test)]
  #[path = "foo_tests.rs"]
  mod tests;
  ```

If this crate starts accumulating code that needs runtime state from
`codex-core`, that is a sign to revisit the extraction boundary before adding
more here.