38 Commits

  • [codex] Use model metadata for skills usage instructions (#29740)
    ## Summary
    
    - add a false-by-default `include_skills_usage_instructions` model
    metadata field
    - enable the field for the bundled `gpt-5.5` model metadata
    - consume the metadata in both core and extension skill rendering
    - remove hardcoded legacy-model matching and its marker plumbing
  • [codex] Add comp_hash to model metadata (#27532)
    ## Summary
    - add optional `comp_hash` metadata to `ModelInfo`
    - update `ModelInfo` fixtures for the shared schema change
    - keep older model responses compatible by defaulting the field to
    `None`
    
    ## Why
    The models endpoint needs an opaque identifier for compaction-compatible
    model configurations. This PR only exposes that value in model metadata;
    it does not add it to turn context or change runtime behavior.
    
    Follow-up #27520 carries the value through turn context and rollouts,
    then uses it to trigger compaction.
    
    ## Stack
    - based directly on `main`
    - replaces #27519, which was accidentally merged into the wrong base
    branch
    - functionality follow-up: #27520
    
    ## Testing
    - `just test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `just fix -p codex-core -p codex-protocol -p codex-analytics -p
    codex-models-manager`
  • [codex] Add use_responses_lite 'override' logic (#26487)
    ## Summary
    
    - add a defaulted `ModelInfo.use_responses_lite` catalog field
    - support serializing `reasoning.context` while preserving the existing
    effort and summary path
    - `use_responses_lite` has not been turned on for any models yet
    
    I've added an override that disables parallel tool calls when
    `use_responses_lite` is on, and I've also forced persistent reasoning in
    that case. It would be ideal to centralize all the responses_lite
    plumbing, but I think this is best for now to keep the plumbing and
    diffs small.
    
    ## Testing
    
    - `cargo test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-core
    responses_lite_sets_all_turns_context_and_disables_parallel_tool_calls`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-core
    configured_reasoning_summary_is_sent`
    - `cargo check -p codex-core --tests`
    - `RUST_MIN_STACK=8388608 cargo clippy -p codex-core --tests` (passes
    with pre-existing warnings in `codex-code-mode` and
    `codex-core-plugins`)
  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
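    
    A minimal sketch of the shape described above, assuming hypothetical
    variant names and a hand-written parser; the real `ReasoningEffort` type
    and its serialization code may look different:
    
    ```rust
    /// Illustrative only: built-in efforts plus a string-backed custom variant.
    /// Variant names are assumptions, not the actual protocol definition.
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum ReasoningEffort {
        Low,
        Medium,
        High,
        /// Model-defined value; holding a `String` is what rules out `Copy`.
        Custom(String),
    }
    
    impl ReasoningEffort {
        /// Parse a wire string. Empty values are rejected; unknown non-empty
        /// values become `Custom` instead of failing.
        fn parse(wire: &str) -> Option<Self> {
            match wire {
                "" => None,
                "low" => Some(Self::Low),
                "medium" => Some(Self::Medium),
                "high" => Some(Self::High),
                other => Some(Self::Custom(other.to_string())),
            }
        }
    
        /// Encode back to the same plain string form used on the wire.
        fn as_str(&self) -> &str {
            match self {
                Self::Low => "low",
                Self::Medium => "medium",
                Self::High => "high",
                Self::Custom(value) => value.as_str(),
            }
        }
    }
    
    /// Callers holding a shared reference now clone explicitly instead of
    /// relying on an implicit copy.
    fn effort_for_turn(shared: &ReasoningEffort) -> ReasoningEffort {
        shared.clone()
    }
    ```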
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • Add multi-agent runtime metadata types (#25720)
    Stack split from #25708. Original PR intentionally left open. This first
    PR adds the multi-agent runtime metadata types and catalog plumbing used
    by the rest of the stack.
  • [codex-rs] auto-review model override (#23767)
    ## Why
    
    Guardian auto-review normally uses the provider-preferred review model
    when one is available. Some parent models need model-catalog metadata to
    select a different review model while keeping older `/models` payloads
    compatible when that metadata is absent.
    
    ## What changed
    
    - Added optional `ModelInfo::auto_review_model_override` metadata to the
    public model payload as a review-model slug.
    - Updated Guardian review model selection to prefer the catalog override
    when present, while preserving the existing provider preferred-model
    path and parent-model fallback when it is omitted.
    - Added focused Guardian coverage for override and no-override model
    selection.
    - Added an `auto_review` core integration suite test that loads override
    metadata from a remote model catalog path and asserts the strict
    auto-review `/responses` request uses the catalog-selected review model.
    - Updated existing `ModelInfo` fixtures and local catalog constructors
    for the new optional field.
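    
    The selection order described above can be sketched as follows. The
    function name and parameters are hypothetical; only the precedence
    (catalog override, then provider preferred model, then parent model) comes
    from this description:
    
    ```rust
    /// Pick the review model slug: catalog override first, then the provider's
    /// preferred review model, then the parent model itself.
    fn select_review_model<'a>(
        catalog_override: Option<&'a str>,
        provider_preferred: Option<&'a str>,
        parent_model: &'a str,
    ) -> &'a str {
        catalog_override
            .or(provider_preferred)
            .unwrap_or(parent_model)
    }
    ```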
    
    ## Validation
    
    - `cargo test -p codex-protocol
    model_info_defaults_availability_nux_to_none_when_omitted`
    - `cargo test -p codex-core guardian_review_uses_`
    - `cargo test -p codex-core
    remote_model_override_uses_catalog_model_for_strict_auto_review --test
    all`
    - `just fix -p codex-protocol`
    - `just fix -p codex-core`
    - `just fmt`
    - `git diff --check`
  • [codex] Add model tool mode selector (#25031)
    ## Why
    Some models need to select their code-execution behavior through model
    catalog metadata. Models without that metadata must continue to follow
    the existing `CodeMode` and `CodeModeOnly` feature flags, including when
    a newer server sends an enum value this client does not recognize.
    
    ## What changed
    - add optional `ModelInfo.tool_mode` metadata with `direct`,
    `code_mode`, and `code_mode_only`
    - treat omitted and unknown wire values as `None`
    - resolve `None` from the existing feature flags
    - carry the resolved `ToolMode` directly on `TurnContext`, outside
    `Config`
    - use the resolved value for turn creation, model switches, review
    turns, tool planning, and code execution
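    
    A rough sketch of the parsing and fallback rules listed above. The
    function names are hypothetical, and the flag precedence (`CodeModeOnly`
    over `CodeMode`, `Direct` when neither flag is set) is an assumption:
    
    ```rust
    /// Illustrative mirror of the three `tool_mode` wire values.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ToolMode {
        Direct,
        CodeMode,
        CodeModeOnly,
    }
    
    /// Omitted and unrecognized wire values both map to `None`.
    fn parse_tool_mode(wire: Option<&str>) -> Option<ToolMode> {
        match wire? {
            "direct" => Some(ToolMode::Direct),
            "code_mode" => Some(ToolMode::CodeMode),
            "code_mode_only" => Some(ToolMode::CodeModeOnly),
            _ => None,
        }
    }
    
    /// Explicit metadata wins; `None` is resolved from the feature flags.
    /// The precedence between the two flags is an assumption.
    fn resolve_tool_mode(
        metadata: Option<ToolMode>,
        code_mode_flag: bool,
        code_mode_only_flag: bool,
    ) -> ToolMode {
        metadata.unwrap_or(if code_mode_only_flag {
            ToolMode::CodeModeOnly
        } else if code_mode_flag {
            ToolMode::CodeMode
        } else {
            ToolMode::Direct
        })
    }
    ```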
    
    ## Coverage
    - add protocol coverage for omitted, known, and unknown enum values
    - add focused coverage for flag fallback and explicit metadata
    overriding feature flags
    - add core integration coverage that fetches remote model metadata
    through `/v1/models` and verifies the outbound `/responses` tools for
    explicit `direct` and `code_mode_only` selectors
    
    ## Stack
    - followed by #25032
  • Honor client-resolved service tier defaults (#23537)
    ## Why
    
    Model catalog responses can now advertise a nullable
    `default_service_tier` for each model. Codex needs to preserve three
    distinct states all the way from config/app-server inputs to inference:
    
    - no explicit service tier, so the client may apply the current model
    catalog default when FastMode is enabled
    - explicit `default`, meaning the user intentionally wants standard
    routing
    - explicit catalog tier ids such as `priority`, `flex`, or future tiers
    
    Keeping those states distinct prevents the UI from showing one tier
    while core sends another, especially after model switches or app-server
    `thread/start` / `turn/start` updates.
    
    ## What Changed
    
    - Plumbed `default_service_tier` through model catalog protocol types,
    app-server model responses, generated schemas, model cache fixtures, and
    provider/model-manager conversions.
    - Added the request-only `default` service tier sentinel and normalized
    legacy config spelling so `fast` in `config.toml` still materializes as
    the runtime/request id `priority`.
    - Moved catalog default resolution to the TUI/client side, including
    recomputing the effective service tier when model/FastMode-dependent
    surfaces change.
    - Updated app-server thread lifecycle config construction so
    `serviceTier: null` preserves explicit standard-routing intent by
    mapping to `default` instead of internal `None`.
    - Kept core responsible for validating explicit tiers against the
    current model and stripping `default` before `/v1/responses`, without
    applying catalog defaults itself.
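    
    To make the three states concrete, here is a minimal sketch. The enum and
    function names are hypothetical; only the `default` sentinel, the
    `fast` to `priority` normalization, and the client/core split come from
    the description above:
    
    ```rust
    /// The three states kept distinct (names are illustrative).
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum ServiceTierChoice {
        /// Nothing configured: the client may apply the catalog default.
        Unset,
        /// Explicit `default`: the user wants standard routing.
        Standard,
        /// Explicit catalog tier id such as `priority` or `flex`.
        Tier(String),
    }
    
    /// Normalize a configured value; legacy `fast` becomes `priority`.
    fn parse_choice(configured: Option<&str>) -> ServiceTierChoice {
        match configured {
            None => ServiceTierChoice::Unset,
            Some("default") => ServiceTierChoice::Standard,
            Some("fast") => ServiceTierChoice::Tier("priority".to_string()),
            Some(id) => ServiceTierChoice::Tier(id.to_string()),
        }
    }
    
    /// Client-side resolution: only an unset choice picks up the catalog
    /// default, and only when Fast Mode is enabled.
    fn resolve_on_client(
        choice: ServiceTierChoice,
        catalog_default: Option<&str>,
        fast_mode: bool,
    ) -> ServiceTierChoice {
        match (choice, catalog_default) {
            (ServiceTierChoice::Unset, Some(id)) if fast_mode => {
                ServiceTierChoice::Tier(id.to_string())
            }
            (choice, _) => choice,
        }
    }
    
    /// Core-side: strip the `default` sentinel before building the request.
    fn tier_for_request(choice: &ServiceTierChoice) -> Option<&str> {
        match choice {
            ServiceTierChoice::Tier(id) => Some(id.as_str()),
            ServiceTierChoice::Unset | ServiceTierChoice::Standard => None,
        }
    }
    ```
    
    Keeping `Standard` separate from `Unset` is what lets an explicit
    `default` survive a model switch without being replaced by the new
    model's catalog default.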
    
    ## Validation
    
    - `CARGO_INCREMENTAL=0 cargo build -p codex-cli`
    - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list`
    - `cargo test -p codex-tui service_tier`
    - `cargo test -p codex-protocol service_tier_for_request`
    - `cargo test -p codex-core get_service_tier`
    - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core
    service_tier`
  • 1- Add model service tiers metadata (#20969)
    ## Why
    
    The model list needs to carry display-ready service tier metadata so
    clients can render tier choices with stable IDs, names, and
    descriptions. A raw speed-tier string list is not enough for richer UI
    copy or future tier labels.
    
    ## What changed
    
    - Added `ModelServiceTier` to shared model metadata with string `id`,
    `name`, and `description` fields.
    - Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving
    empty defaults for older cached model payloads.
    - Exposed `serviceTiers` on app-server v2 `Model` responses and threaded
    it through TUI app-server model conversion.
    - Marked legacy `additional_speed_tiers` / `additionalSpeedTiers`
    metadata as deprecated in source and generated schema output.
    - Regenerated app-server protocol JSON schema and TypeScript fixtures,
    including `ModelServiceTier.ts`.
    
    ## Verification
    
    - Ran `just write-app-server-schema`.
    - Did not run local tests per repo instruction; relying on PR CI.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Update models.json and related fixtures (#19323)
    Supersedes #18735.
    
    The scheduled rust-release-prepare workflow force-pushed
    `bot/update-models-json` back to the generated models.json-only diff,
    which dropped the test and snapshot updates needed for CI.
    
    This PR keeps the latest generated `models.json` from #18735 and adds
    the corresponding fixture updates:
    - preserve model availability NUX in the app-server model cache fixture
    - update core/TUI expectations for the new `gpt-5.4` `xhigh` default
    reasoning
    - refresh affected TUI chatwidget snapshots for the `gpt-5.5`
    default/model copy changes
    
    Validation run locally while preparing the fix:
    - `just fmt`
    - `cargo test -p codex-app-server model_list`
    - `cargo test -p codex-core includes_no_effort_in_request`
    - `cargo test -p codex-core
    includes_default_reasoning_effort_in_request_when_defined_by_model_info`
    - `cargo test -p codex-tui --lib chatwidget::tests`
    - `cargo insta pending-snapshots`
    
    ---------
    
    Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
  • Add max context window model metadata (#18382)
    Adds max_context_window to model metadata and routes core context-window
    reads through resolved model info. Config model_context_window overrides
    are clamped to max_context_window when present; without an override, the
    model context_window is used.
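    
    The clamping rule can be sketched like this. The function name, the
    `u64` token counts, and the parameter shapes are assumptions made for
    illustration:
    
    ```rust
    /// Resolve the effective context window.
    /// - With a config override, clamp it to `max_context_window` when present.
    /// - Without an override, use the model's own `context_window`.
    fn effective_context_window(
        model_context_window: Option<u64>,
        max_context_window: Option<u64>,
        config_override: Option<u64>,
    ) -> Option<u64> {
        match (config_override, max_context_window) {
            (Some(requested), Some(max)) => Some(requested.min(max)),
            (Some(requested), None) => Some(requested),
            (None, _) => model_context_window,
        }
    }
    ```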
  • Use model metadata for Fast Mode status (#16949)
    Fast Mode status was still tied to one model name in the TUI and
    model-list plumbing. This changes the model metadata shape so a model
    can advertise additional speed tiers, carries that field through the
    app-server model list, and uses it to decide when to show Fast Mode
    status.
    
    For people using Codex, the behavior is intended to stay the same for
    existing models. Fast Mode still requires the existing signed-in /
    feature-gated path; the difference is that the UI can now recognize any
    model the model list marks as Fast-capable, instead of requiring a new
    client-side slug check.
  • remove temporary ownership re-exports (#16626)
    Stacked on #16508.
    
    This removes the temporary `codex-core` / `codex-login` re-export shims
    from the ownership split and rewrites callsites to import directly from
    `codex-model-provider-info`, `codex-models-manager`, `codex-api`,
    `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.
    
    No behavior change intended; this is the mechanical import cleanup layer
    split out from the ownership move.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • ci: verify codex-rs Cargo manifests inherit workspace settings (#16353)
    ## Why
    
    Bazel clippy now catches lints that `cargo clippy` can still miss when a
    crate under `codex-rs` forgets to opt into workspace lints. The concrete
    example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel
    flagged a clippy violation in `models_cache.rs`, but Cargo did not
    because that crate inherited workspace package metadata without
    declaring `[lints] workspace = true`.
    
    We already mirror the workspace clippy deny list into Bazel after
    [#15955](https://github.com/openai/codex/pull/15955), so we also need a
    repo-side check that keeps every `codex-rs` manifest opted into the same
    workspace settings.
    
    ## What changed
    
    - add `.github/scripts/verify_cargo_workspace_manifests.py`, which
    parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies:
      - `version.workspace = true`
      - `edition.workspace = true`
      - `license.workspace = true`
      - `[lints] workspace = true`
    - top-level crate names follow the `codex-*` / `codex-utils-*`
    conventions, with explicit exceptions for `windows-sandbox-rs` and
    `utils/path-utils`
    - run that script in `.github/workflows/ci.yml`
    - update the current outlier manifests so the check is enforceable
    immediately
    - fix the newly exposed clippy violations in the affected crates
    (`app-server/tests/common`, `file-search`, `feedback`,
    `shell-escalation`, and `debug-client`)
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353).
    * #16351
    * __->__ #16353
  • Prefer websockets when providers support them (#13592)
    Remove all flags and model settings.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Apply argument comment lint across codex-rs (#14652)
    ## Why
    
    Once the repo-local lint exists, `codex-rs` needs to follow the
    checked-in convention and CI needs to keep it from drifting. This commit
    applies the fallback `/*param*/` style consistently across existing
    positional literal call sites without changing those APIs.
    
    The longer-term preference is still to avoid APIs that require comments
    by choosing clearer parameter types and call shapes. This PR is
    intentionally the mechanical follow-through for the places where the
    existing signatures stay in place.
    
    After rebasing onto newer `main`, the rollout also had to cover newly
    introduced `tui_app_server` call sites. That made it clear the first cut
    of the CI job was too expensive for the common path: it was spending
    almost as much time installing `cargo-dylint` and re-testing the lint
    crate as a representative test job spends running product tests. The CI
    update keeps the full workspace enforcement but trims that extra
    overhead from ordinary `codex-rs` PRs.
    
    ## What changed
    
    - keep a dedicated `argument_comment_lint` job in `rust-ci`
    - mechanically annotate remaining opaque positional literals across
    `codex-rs` with exact `/*param*/` comments, including the rebased
    `tui_app_server` call sites that now fall under the lint
    - keep the checked-in style aligned with the lint policy by using
    `/*param*/` and leaving string and char literals uncommented
    - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
    registry/git metadata in the lint job
    - split changed-path detection so the lint crate's own `cargo test` step
    runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
    - continue to run the repo wrapper over the `codex-rs` workspace, so
    product-code enforcement is unchanged
    
    Most of the code changes in this commit are intentionally mechanical
    comment rewrites or insertions driven by the lint itself.
    
    ## Verification
    
    - `./tools/argument-comment-lint/run.sh --workspace`
    - `cargo test -p codex-tui-app-server -p codex-tui`
    - parsed `.github/workflows/rust-ci.yml` locally with PyYAML
    
    ---
    
    * -> #14652
    * #14651
  • chore: add web_search_tool_type for image support (#13538)
    Add `web_search_tool_type` to `model_info` so it can be populated from
    the backend. It will be used to filter which models can use `web_search`
    with images and which can't.
    
    Added a small unit test.
  • Add under-development original-resolution view_image support (#13050)
    ## Summary
    
    Add original-resolution support for `view_image` behind the
    under-development `view_image_original_resolution` feature flag.
    
    When the flag is enabled and the target model is `gpt-5.3-codex` or
    newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends
    `detail: "original"` to the Responses API instead of using the legacy
    resize/compress path.
    
    ## What changed
    
    - Added `view_image_original_resolution` as an under-development feature
    flag.
    - Added `ImageDetail` to the protocol models and support for serializing
    `detail: "original"` on tool-returned images.
    - Added `PromptImageMode::Original` to `codex-utils-image`.
      - Preserves original PNG/JPEG/WebP bytes.
      - Keeps legacy behavior for the resize path.
    - Updated `view_image` to:
      - use the shared `local_image_content_items_with_label_number(...)`
      helper in both code paths
      - select original-resolution mode only when:
        - the feature flag is enabled, and
        - the model slug parses as `gpt-5.3-codex` or newer
    - Kept local user image attachments on the existing resize path; this
    change is specific to `view_image`.
    - Updated history/image accounting so only `detail: "original"` images
    use the docs-based GPT-5 image cost calculation; legacy images still use
    the old fixed estimate.
    - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG
    at 85% quality unless lossless is required, while still allowing other
    formats when explicitly requested.
    - Updated tests and helper code that construct
    `FunctionCallOutputContentItem::InputImage` to carry the new `detail`
    field.
    
    ## Behavior
    
    ### Feature off
    - `view_image` keeps the existing resize/re-encode behavior.
    - History estimation keeps the existing fixed-cost heuristic.
    
    ### Feature on + `gpt-5.3-codex+`
    - `view_image` sends original-resolution images with `detail:
    "original"`.
    - PNG/JPEG/WebP source bytes are preserved when possible.
    - History estimation uses the GPT-5 docs-based image-cost calculation
    for those `detail: "original"` images.
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/13050
    -  `2` https://github.com/openai/codex/pull/13331
    -  `3` https://github.com/openai/codex/pull/13049
  • Add model availability NUX metadata (#12972)
    - replace show_nux with structured availability_nux model metadata
    - expose availability NUX data through the app-server model API
    - update shared fixtures and tests for the new field
  • Use model catalog default for reasoning summary fallback (#12873)
    ## Summary
    - make `Config.model_reasoning_summary` optional so unset means use
    model default
    - resolve the optional config value to a concrete summary when building
    `TurnContext`
    - add protocol support for `default_reasoning_summary` in model metadata
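    
    A minimal sketch of the fallback, assuming a hypothetical
    `ReasoningSummary` enum with illustrative variants; the optional config
    value is resolved once against the model's `default_reasoning_summary`:
    
    ```rust
    /// Illustrative summary setting; variant names are assumptions.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ReasoningSummary {
        Auto,
        Concise,
        Detailed,
    }
    
    /// An unset config value means "use the model default".
    fn resolve_reasoning_summary(
        config_value: Option<ReasoningSummary>,
        model_default: ReasoningSummary,
    ) -> ReasoningSummary {
        config_value.unwrap_or(model_default)
    }
    ```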
    
    ## Validation
    - `cargo test -p codex-core --lib client::tests -- --nocapture`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: rm hardcoded PRESETS list (#12650)
    Remove the `PRESETS` list hardcoded in `model_presets`, as we now have a
    bundled `models.json` with equivalent info.
    
    Update logic to rely on bundled models instead; update tests.
  • fix: show user warning when using default fallback metadata (#11690)
    ### What
    It's currently unclear when the harness falls back to the default,
    generic `ModelInfo`. This happens when the `remote_models` feature is
    disabled or the model is truly unknown, and can lead to bad performance
    and issues in the harness.
    
    Add a user-facing warning when this happens so users know their setup is
    broken.
    
    ### Tests
    Added tests, tested locally.
  • Remove test-support feature from codex-core and replace it with explicit test toggles (#11405)
    ## Why
    
    `codex-core` was being built in multiple feature-resolved permutations
    because test-only behavior was modeled as crate features. For a large
    crate, those permutations increase compile cost and reduce cache reuse.
    
    ## Net Change
    
    - Removed the `test-support` crate feature and related feature wiring so
    `codex-core` no longer needs separate feature shapes for test consumers.
    - Standardized cross-crate test-only access behind
    `codex_core::test_support`.
    - External test code now imports helpers from
    `codex_core::test_support`.
    - Underlying implementation hooks are kept internal (`pub(crate)`)
    instead of broadly public.
    
    ## Outcome
    
    - Fewer `codex-core` build permutations.
    - Better incremental cache reuse across test targets.
    - No intended production behavior change.
  • Prefer websocket transport when model opts in (#11386)
    Summary
    - add a `prefer_websockets` field to `ModelInfo`, defaulting to `false`
    in all fixtures and constructors
    - wire the new flag into websocket selection so models that opt in
    always use websocket transport even when the feature gate is off
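    
    As a rough sketch of the selection rule (the function name is
    hypothetical, and treating an enabled feature gate as sufficient on its
    own is an assumption):
    
    ```rust
    /// A model that opts in via `prefer_websockets` gets websocket transport
    /// even when the feature gate is off.
    fn use_websocket_transport(model_prefers_websockets: bool, feature_gate_enabled: bool) -> bool {
        model_prefers_websockets || feature_gate_enabled
    }
    ```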
    
    Testing
    - Not run (not requested)
  • [Codex][CLI] Gate image inputs by model modalities (#10271)
    ###### Summary
    
    - Add input_modalities to model metadata so clients can determine
    supported input types.
    - Gate image paste/attach in TUI when the selected model does not
    support images.
    - Block submits that include images for unsupported models and show a
    clear warning.
    - Propagate modality metadata through app-server protocol/model-list
    responses.
    - Update related tests/fixtures.
    
    ###### Rationale
    
    - Models support different input modalities.
    - Clients need an explicit capability signal to prevent unsupported
    requests.
    - Backward-compatible defaults preserve existing behavior when modality
    metadata is absent.
    
    ###### Scope
    
    - codex-rs/protocol, codex-rs/core, codex-rs/tui
    - codex-rs/app-server-protocol, codex-rs/app-server
    - Generated app-server types / schema fixtures
    
    ###### Trade-offs
    
    - Default behavior assumes text + image when field is absent for
    compatibility.
    - Server-side validation remains the source of truth.
    
    ###### Follow-up
    
    - Non-TUI clients should consume input_modalities to disable unsupported
    attachments.
    - Model catalogs should explicitly set input_modalities for text-only
    models.
    
    ###### Testing
    
    - cargo fmt --all
    - cargo test -p codex-tui
    - env -u GITHUB_APP_KEY cargo test -p codex-core --lib
    - just write-app-server-schema
    - cargo run -p codex-cli --bin codex -- app-server generate-ts --out
    app-server-types
    - test against local backend
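    
    The compatibility default and the submit gate described under Summary and
    Trade-offs could look roughly like this. The type and function names are
    hypothetical, and the warning text is illustrative:
    
    ```rust
    /// Illustrative input modality values.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum InputModality {
        Text,
        Image,
    }
    
    /// Absent metadata falls back to text + image for compatibility.
    fn effective_modalities(advertised: Option<&[InputModality]>) -> Vec<InputModality> {
        match advertised {
            Some(list) => list.to_vec(),
            None => vec![InputModality::Text, InputModality::Image],
        }
    }
    
    /// Client-side gate: block a submit that carries images when the model
    /// does not advertise image input. Server-side validation still applies.
    fn check_submit(advertised: Option<&[InputModality]>, has_images: bool) -> Result<(), String> {
        let modalities = effective_modalities(advertised);
        if has_images && !modalities.contains(&InputModality::Image) {
            return Err("the selected model does not support image inputs".to_string());
        }
        Ok(())
    }
    ```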
      
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • chore(personality) new schema with fallbacks (#10147)
    ## Summary
    Let's dial in this API contract a bit more, with more robust fallback
    behavior when `model_instructions_template` is false.
    
    Switches to a more explicit template/variables structure, with more
    fallbacks.
    
    ## Testing
    - [x] Adding unit tests
    - [x] Tested locally
  • feat(core) ModelInfo.model_instructions_template (#9597)
    ## Summary
    #9555 is the start of a rename, so I'm starting to standardize here.
    Sets up `model_instructions` templating with a strongly-typed object for
    injecting a personality block into the model instructions.
    
    ## Testing
    - [x] Added tests
    - [x] Ran locally
  • Add migration_markdown in model_info (#9219)
    Next step would be to clean Model Upgrade in model presets
    
    ---------
    
    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
  • Fix app-server write_models_cache to treat models with less priority number as higher priority. (#8844)
    Rank models with p0 higher than p1. This shouldn't result in any
    behavioral changes. Just reordering.
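    
    A small sketch of the intended ordering, assuming a hypothetical
    `CachedModel` struct with an integer `priority` field where a lower number
    means higher priority:
    
    ```rust
    /// Illustrative cache entry; field names are assumptions.
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct CachedModel {
        slug: String,
        priority: i32,
    }
    
    /// Sort ascending so priority 0 ranks ahead of priority 1. The sort is
    /// stable, so models with equal priority keep their relative order.
    fn sort_by_priority(models: &mut [CachedModel]) {
        models.sort_by_key(|model| model.priority);
    }
    ```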
  • Merge Modelfamily into modelinfo (#8763)
    - Merge ModelFamily into ModelInfo
    - Remove logic for adding instructions to apply patch
    - Add compaction limit and visible context window to `ModelInfo`
  • Remove reasoning format (#8484)
    This isn't a very useful parameter.
    
    logic:
    ```
    if model puts `**` in their reasoning, trim it and visualize the header.
    if couldn't trim: don't render
    if model doesn't support: don't render
    ```
    
    We can simplify to:
    ```
    if could trim, visualize header.
    if not, don't render
    ```
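    
    The simplified rule could be sketched as below. The function name is
    hypothetical, and treating the first `**...**` pair as the header is an
    assumption about what "trim" means here:
    
    ```rust
    /// Return the text between the first pair of `**` markers, if any.
    /// `None` means the header could not be trimmed, so nothing is rendered.
    fn extract_reasoning_header(reasoning: &str) -> Option<&str> {
        // Skip past the opening marker.
        let start = reasoning.find("**")? + 2;
        // Locate the closing marker relative to `start`.
        let len = reasoning[start..].find("**")?;
        let header = reasoning[start..start + len].trim();
        if header.is_empty() {
            None
        } else {
            Some(header)
        }
    }
    ```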
  • remove minimal client version (#8447)
    The client does not need this value.
  • Rename OpenAI models to models manager (#8346)
  • feat: model picker (#8209)
  • make model optional in config (#7769)
    - Make Config.model optional and centralize default-selection logic in
    ModelsManager, including a default_model helper (with
    codex-auto-balanced when available) so sessions now carry an explicit
    chosen model separate from the base config.
    - Resolve `model` once in `core` and `tui` from config, then store the
    resolved value on other structs.
    - Move model refresh to run before resolving the default model.