4 Commits

  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
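    
    As a rough illustration of the rewrite described above, the sketch below contrasts the verbose panic-based unwrap with the `expect`/`expect_err` forms. `parse_port` and the `port_*` helpers are hypothetical names invented for this example and do not come from the PR; the item-level `#[allow]` stands in for the crate-root attribute only so the snippet can be pasted anywhere.
    
    ```rust
    use std::num::ParseIntError;
    
    // In an integration-test crate, the allow would appear once at the crate root
    // as `#![allow(clippy::expect_used)]`. Item-level attributes are used here
    // only to keep this sketch position-independent.
    
    /// Hypothetical helper standing in for fallible test setup.
    fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
        raw.trim().parse::<u16>()
    }
    
    /// Before: the verbose form written to avoid `expect`.
    fn port_before(raw: &str) -> u16 {
        parse_port(raw).unwrap_or_else(|err| panic!("port should parse: {:?}", err))
    }
    
    /// After: the same check written with `expect`.
    #[allow(clippy::expect_used)]
    fn port_after(raw: &str) -> u16 {
        parse_port(raw).expect("port should parse")
    }
    
    /// `expect_err` replaces a manual `match` that panics on `Ok`.
    #[allow(clippy::expect_used)]
    fn port_error(raw: &str) -> ParseIntError {
        parse_port(raw).expect_err("port should be rejected")
    }
    ```
    
    Both forms panic on failure and include the error's `Debug` output, so the change only affects how the unwrap is spelled.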
  • feat: support oneOf and allOf in tool input schemas (#24118)
    ## Why
    
    Some connector golden schemas use JSON Schema composition keywords
    beyond `anyOf`, specifically top-level or nested `oneOf` and `allOf`.
    Codex currently needs to preserve those shapes when parsing MCP tool
    input schemas so connector tools do not lose valid schema structure
    during normalization.
    
    To prevent an increased Responses API error rate, this PR will be merged
    after the Responses API supports top-level `oneOf`/`allOf`.
    
    ## What Changed
    
    - Adds `oneOf` and `allOf` support to `JsonSchema`, matching the
    existing `anyOf` handling.
    - Traverses `oneOf` and `allOf` anywhere schema children are visited,
    including sanitization, definition reachability, description stripping,
    and deep schema compaction.
    - Adds a final large-schema compaction pass that prunes schema objects
    containing `anyOf`, `oneOf`, or `allOf` to `{}` if earlier compaction
    passes still leave the schema over budget.
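    
    The final pruning pass in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Json` enum, `obj`, `write_compact`, `prune_composition`, and `final_compaction_pass` are all hypothetical names, the serializer skips string escaping, and the real parser's types and budget check may differ.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Reduced stand-in for a JSON value (assumption: only the shapes needed here).
    #[derive(Debug, Clone, PartialEq)]
    enum Json {
        Str(String),
        Array(Vec<Json>),
        Object(BTreeMap<String, Json>),
    }
    
    const COMPOSITION_KEYS: [&str; 3] = ["anyOf", "oneOf", "allOf"];
    
    /// Convenience constructor for object values.
    fn obj(pairs: Vec<(&str, Json)>) -> Json {
        Json::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
    }
    
    /// Writes compact JSON (no whitespace, no string escaping) to measure its size.
    fn write_compact(value: &Json, out: &mut String) {
        match value {
            Json::Str(s) => {
                out.push('"');
                out.push_str(s);
                out.push('"');
            }
            Json::Array(items) => {
                out.push('[');
                for (i, item) in items.iter().enumerate() {
                    if i > 0 {
                        out.push(',');
                    }
                    write_compact(item, out);
                }
                out.push(']');
            }
            Json::Object(map) => {
                out.push('{');
                for (i, (key, child)) in map.iter().enumerate() {
                    if i > 0 {
                        out.push(',');
                    }
                    out.push('"');
                    out.push_str(key);
                    out.push_str("\":");
                    write_compact(child, out);
                }
                out.push('}');
            }
        }
    }
    
    /// Replaces every schema object that holds a composition keyword with `{}`.
    fn prune_composition(value: &mut Json) {
        match value {
            Json::Object(map) => {
                if COMPOSITION_KEYS.iter().any(|key| map.contains_key(*key)) {
                    map.clear();
                } else {
                    for child in map.values_mut() {
                        prune_composition(child);
                    }
                }
            }
            Json::Array(items) => {
                for item in items.iter_mut() {
                    prune_composition(item);
                }
            }
            Json::Str(_) => {}
        }
    }
    
    /// Prunes only when the compact form is still over budget; returns whether it pruned.
    fn final_compaction_pass(schema: &mut Json, budget_bytes: usize) -> bool {
        let mut compact = String::new();
        write_compact(schema, &mut compact);
        if compact.len() <= budget_bytes {
            return false;
        }
        prune_composition(schema);
        true
    }
    ```
    
    Under this sketch, a schema that already fits the budget is returned unchanged, which is consistent with a pruning pass that has no effect on schemas the earlier passes already brought under budget.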
    
    ## Validation
    Golden schema token validation ran over `2,025` schemas under
    `golden_schemas`; all parsed successfully. Token counts use `o200k_base`
    over the compact JSON produced by `parse_tool_input_schema`.
    
    | Percentile | Before PR | After oneOf/allOf | After pruning |
    |---|---:|---:|---:|
    | p0 | 9 | 9 | 9 |
    | p10 | 63 | 64 | 64 |
    | p25 | 86 | 87 | 87 |
    | p50 | 125 | 128 | 128 |
    | p75 | 203 | 206 | 206 |
    | p90 | 327 | 333 | 333 |
    | p95 | 460 | 473 | 473 |
    | p99 | 763 | 779 | 779 |
    | max | 891 | 955 | 955 |
    
    Totals:
    
    | Parser state | Total tokens |
    |---|---:|
    | Before PR | 345,713 |
    | After oneOf/allOf | 352,686 |
    | After pruning | 352,686 |
    
    The pruning column matches the oneOf/allOf column for this corpus
    because no parsed compact golden schema remains over the `4,000`
    compact-byte budget after the earlier compaction passes.
  • Update rmcp to 1.7.0 (#24763)
    Will make it easier to uprev when the new draft spec is supported.
    
    Also updates reqwest where needed for compatibility but doesn't update
    it everywhere since this is already a large diff.
    
    The new version of rmcp handles certain kinds of authentication failures
    differently; this patch includes support for identifying the failing scope
    in a WWW-Authenticate header.
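    
    For readers unfamiliar with the header, the sketch below shows one way a `scope` parameter could be pulled out of a `WWW-Authenticate` challenge value. It is not the rmcp or Codex implementation: `challenge_scope` is a hypothetical helper, and it naively splits on commas, so it would mishandle quoted values that themselves contain commas.
    
    ```rust
    /// Hypothetical helper: extracts the `scope` parameter from a
    /// `WWW-Authenticate` challenge value such as
    /// `Bearer error="insufficient_scope", scope="files:read"`.
    fn challenge_scope(header_value: &str) -> Option<String> {
        // Drop the auth scheme (for example `Bearer`) and keep the parameter list.
        let (_, params) = header_value.trim().split_once(' ')?;
        for param in params.split(',') {
            // Skip anything that is not a `name=value` pair.
            let (name, value) = match param.split_once('=') {
                Some(pair) => pair,
                None => continue,
            };
            if name.trim().eq_ignore_ascii_case("scope") {
                // Strip surrounding whitespace and the optional quotes.
                return Some(value.trim().trim_matches('"').to_string());
            }
        }
        None
    }
    ```
    
    A challenge without a `scope` parameter yields `None`, leaving the caller to fall back to whatever generic authentication-failure handling it already has.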
  • chore: add JSON schema policy fixture coverage (#24152)
    ## Why
    
    Before changing the Codex Bridge JSON schema policy, add integration
    coverage around real connector-like MCP tool schemas. The existing unit
    tests cover individual sanitizer behaviors, but they do not make it easy
    to see whether full fixture schemas keep model-visible guidance, prune
    only unreachable definitions, drop unsupported JSON Schema fields, and
    stay within the Responses API schema budget.
    
    ## What Changed
    
    - Added `tools/tests/json_schema_policy_fixtures.rs`, which converts MCP
    tool fixtures through `mcp_tool_to_responses_api_tool` and validates the
    resulting Responses tool parameters.
    - Added connector-style fixtures for Slack, Google Calendar, Google
    Drive, Notion, and Microsoft Outlook Email under
    `tools/tests/fixtures/json_schema_policy/`.
    - Added fixture assertions for preserved guidance, pruned definitions,
    expected field drops after `JsonSchema` conversion, marker count
    baselines, and dangling local `$ref` prevention.
    - Added a real oversized golden Notion `create_page` input schema
    fixture to exercise the compaction path that strips descriptions, drops
    root `$defs`, rewrites local refs, and fits the compacted schema under
    the budget.
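    
    The "pruned definitions" and "dangling local `$ref`" assertions above amount to a reachability check over `$defs`. A minimal sketch of that idea follows, assuming each definition has already been reduced to the list of definition names it references. All function names here are hypothetical, and the actual test file works on full schemas rather than this simplified map.
    
    ```rust
    use std::collections::{BTreeMap, BTreeSet};
    
    /// Maps a definition name to the names its body references (simplified model).
    type Defs<'a> = BTreeMap<&'a str, Vec<&'a str>>;
    
    /// Collects every definition reachable from the root schema's local refs.
    fn reachable_defs<'a>(root_refs: &[&'a str], defs: &Defs<'a>) -> BTreeSet<&'a str> {
        let mut seen = BTreeSet::new();
        let mut stack: Vec<&'a str> = root_refs.to_vec();
        while let Some(name) = stack.pop() {
            // Undefined targets are ignored here; `dangling_refs` reports them.
            if !defs.contains_key(name) {
                continue;
            }
            // `insert` returns false for names already visited, which ends cycles.
            if seen.insert(name) {
                stack.extend(defs[name].iter().copied());
            }
        }
        seen
    }
    
    /// Keeps only the reachable definitions.
    fn prune_unreachable<'a>(root_refs: &[&'a str], defs: &Defs<'a>) -> Defs<'a> {
        let keep = reachable_defs(root_refs, defs);
        defs.iter()
            .filter(|(name, _)| keep.contains(*name))
            .map(|(name, refs)| (*name, refs.clone()))
            .collect()
    }
    
    /// Lists refs (from the root or from any definition) that have no target.
    fn dangling_refs<'a>(root_refs: &[&'a str], defs: &Defs<'a>) -> BTreeSet<&'a str> {
        root_refs
            .iter()
            .chain(defs.values().flatten())
            .copied()
            .filter(|name| !defs.contains_key(*name))
            .collect()
    }
    ```
    
    A fixture assertion in this style would prune first and then require `dangling_refs` to be empty on the result, so that dropping a definition never leaves a ref pointing at it.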