13 Commits

  • Optimize unbounded byte scans with memchr (#26265)
    ## Summary
    
    This PR adds `memchr` for some low-hanging performance improvements
    (namely, in MCP stdio, Ollama streaming, and full message-history
    newline counts).
    
    Codex produced the following release benchmarks:
    
    | Operation | Before | After | Speedup |
    | --- | ---: | ---: | ---: |
    | MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x |
    | Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x |
    | Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x |
    
    With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python
    MCP server, completed `initialize`, requested `tools/list`, and
    deserialized a 1 MiB tool description over newline-delimited stdio),
    it's about 16x faster end-to-end:
    
    | Branch | 50 calls | Per call |
    | --- | ---: | ---: |
    | `main` | 862.53 ms | 17.25 ms |
    | this branch | 53.89 ms | 1.08 ms |
    
    `memchr` is already in our dependency tree and extremely widely used for
    this kind of optimized scanning.
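
    For readers unfamiliar with the pattern, here is a minimal sketch of
    chunked newline scanning. It is an illustration, not the code from this
    PR: `LineBuffer`, `find_newline`, and `count_newlines` are hypothetical
    names, and the `scanned` offset is an assumption about how repeated
    rescans of a growing buffer can be avoided. It uses only std so that it
    stands alone; `memchr::memchr(b'\n', haystack)` has the same
    `Option<usize>` result as `find_newline` and could be swapped in.

    ```rust
    /// Hypothetical line buffer for newline-delimited input that arrives in chunks.
    struct LineBuffer {
        buf: Vec<u8>,
        /// Number of leading bytes of `buf` already known to contain no b'\n'.
        scanned: usize,
    }

    impl LineBuffer {
        fn new() -> Self {
            LineBuffer { buf: Vec::new(), scanned: 0 }
        }

        /// Append a chunk and return every complete line, without its trailing '\n'.
        fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
            self.buf.extend_from_slice(chunk);
            let mut lines = Vec::new();
            // Scan only the bytes that have not been examined before.
            while let Some(rel) = find_newline(&self.buf[self.scanned..]) {
                let end = self.scanned + rel;
                let mut line: Vec<u8> = self.buf.drain(..=end).collect();
                line.pop(); // drop the '\n'
                lines.push(line);
                self.scanned = 0; // the remainder came from this chunk and is unscanned
            }
            self.scanned = self.buf.len();
            lines
        }
    }

    /// Std-only stand-in for `memchr::memchr(b'\n', haystack)`.
    fn find_newline(haystack: &[u8]) -> Option<usize> {
        haystack.iter().position(|&b| b == b'\n')
    }

    /// Std-only newline count over a whole buffer.
    fn count_newlines(haystack: &[u8]) -> usize {
        haystack.iter().filter(|&&b| b == b'\n').count()
    }
    ```

    Without the `scanned` offset, every incoming chunk would rescan the whole
    accumulated buffer, which makes a single long line cost quadratic time.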
  • remove temporary ownership re-exports (#16626)
    Stacked on #16508.
    
    This removes the temporary `codex-core` / `codex-login` re-export shims
    from the ownership split and rewrites callsites to import directly from
    `codex-model-provider-info`, `codex-models-manager`, `codex-api`,
    `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.
    
    No behavior change intended; this is the mechanical import cleanup layer
    split out from the ownership move.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • ollama: default to Responses API for built-ins (#8798)
    This is an alternative PR that solves the same problem as
    <https://github.com/openai/codex/pull/8227>.
    
    In this PR, when Ollama is used via `--oss` (or via `model_provider =
    "ollama"`), we default it to use the Responses format. At runtime, we do
    an Ollama version check, and if the version is older than when Responses
    support was added to Ollama, we print out a warning.
    
    Because there's no way of configuring the wire api for a built-in
    provider, we temporarily add a new `oss_provider`/`model_provider`
    called `"ollama-chat"` that will force the chat format.
    
    Once the `"chat"` format is fully removed (see
    <https://github.com/openai/codex/discussions/7782>), `ollama-chat` can
    be removed as well.
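
    A rough sketch of the kind of runtime version check described above.
    `parse_version` and `is_older_than` are hypothetical helpers, and the
    minimum version is passed in as a parameter because the actual cutoff is
    not stated in this description.

    ```rust
    /// Parse "MAJOR.MINOR.PATCH", tolerating a leading 'v' and a
    /// pre-release or build suffix such as "-rc1" or "+build5".
    fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
        let s = s.trim().trim_start_matches('v');
        let core = s.split(|c: char| c == '-' || c == '+').next()?;
        let mut parts = core.split('.');
        let major = parts.next()?.parse().ok()?;
        let minor = parts.next()?.parse().ok()?;
        let patch = parts.next()?.parse().ok()?;
        if parts.next().is_some() {
            return None; // more than three components
        }
        Some((major, minor, patch))
    }

    /// Returns Some(true) when `reported` is older than `min`, and None
    /// when the reported version string cannot be parsed.
    fn is_older_than(reported: &str, min: (u64, u64, u64)) -> Option<bool> {
        // Tuples compare lexicographically: major first, then minor, then patch.
        parse_version(reported).map(|v| v < min)
    }
    ```

    A caller would print the warning only when the result is `Some(true)`;
    how an unparseable version should be treated is left open here.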
    
    ---------
    
    Co-authored-by: Eric Traut <etraut@openai.com>
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • make model optional in config (#7769)
    - Make `Config.model` optional and centralize default-selection logic in
    `ModelsManager`, including a `default_model` helper (which picks
    `codex-auto-balanced` when available), so sessions now carry an explicit
    chosen model separate from the base config.
    - Resolve `model` once from config in `core` and `tui`, then store the
    resolved value on the other structs that need it.
    - Refresh models before resolving the default model.
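
    A simplified sketch of the resolution order described in these bullets.
    The `Config` struct and both functions are stand-ins rather than the real
    types, and falling back to the first available model is an assumption
    made only for illustration.

    ```rust
    /// Stand-in for the real config type: the model is now optional.
    struct Config {
        model: Option<String>,
    }

    const PREFERRED: &str = "codex-auto-balanced";

    /// Pick a default from the refreshed model list.
    /// Assumption: fall back to the first listed model if PREFERRED is absent.
    fn default_model(available: &[&str]) -> Option<String> {
        available
            .iter()
            .find(|m| **m == PREFERRED)
            .or_else(|| available.first())
            .map(|m| m.to_string())
    }

    /// Resolve once: an explicit config value wins, otherwise use the default.
    fn resolve_model(config: &Config, available: &[&str]) -> Option<String> {
        config.model.clone().or_else(|| default_model(available))
    }
    ```

    The resolved value, not `config.model`, is what a session would carry.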
  • LM Studio OSS Support (#2312)
    ## Overview
    
    Adds LM Studio OSS support. Closes #1883
    
    
    ### Changes
    This PR enhances the behavior of the `--oss` flag to support LM Studio as
    a provider. It also introduces a new flag, `--local-provider`, which
    accepts `lmstudio` or `ollama` if the user wants to choose explicitly
    which one to use.
    
    If no provider is specified, `codex --oss` auto-selects the provider
    based on whichever is running.
    
    #### Additional enhancements
    The default can be set with `oss_provider` in config:
    
    ```
    oss_provider = "lmstudio"
    ```
    
    Non-interactive users must either pass the provider as an argument or set
    it in their `config.toml`.
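
    The selection rules above can be sketched roughly as follows. This is
    not the PR's implementation: `LocalProvider`, `parse_provider`, and
    `select_provider` are hypothetical, the flag taking precedence over the
    config value is an assumption, and so is picking the first detected
    provider when more than one is running.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum LocalProvider {
        LmStudio,
        Ollama,
    }

    fn parse_provider(name: &str) -> Option<LocalProvider> {
        match name {
            "lmstudio" => Some(LocalProvider::LmStudio),
            "ollama" => Some(LocalProvider::Ollama),
            _ => None,
        }
    }

    /// `flag` is the `--local-provider` value, `config` is `oss_provider`,
    /// and `running` lists the providers detected as running.
    fn select_provider(
        flag: Option<&str>,
        config: Option<&str>,
        running: &[LocalProvider],
        interactive: bool,
    ) -> Result<LocalProvider, String> {
        // An explicit choice wins (assumed order: flag first, then config).
        if let Some(name) = flag.or(config) {
            return parse_provider(name)
                .ok_or_else(|| format!("unknown provider: {}", name));
        }
        // Non-interactive runs must name the provider explicitly.
        if !interactive {
            return Err("no provider specified for non-interactive run".to_string());
        }
        // Otherwise auto-select whichever provider is running.
        running
            .first()
            .copied()
            .ok_or_else(|| "no local provider is running".to_string())
    }
    ```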
    
    ### Notes
    For best performance, [set the default context
    length](https://lmstudio.ai/docs/app/advanced/per-model) for gpt-oss to
    the maximum your machine can support.
    
    ---------
    
    Co-authored-by: Matt Clayton <matt@lmstudio.ai>
    Co-authored-by: Eric Traut <etraut@openai.com>
  • Use assert_matches (#4756)
    `assert_matches` is soon to be in std but is experimental for now.
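
    For comparison, a small sketch of the stable std spelling of the same
    kind of check, `assert!(matches!(..))`. The `Event` type and `start`
    function are made up for the example; an `assert_matches!`-style macro
    expresses the same pattern test in a single macro call.

    ```rust
    #[derive(Debug)]
    enum Event {
        Started { id: u32 },
        Failed(String),
    }

    /// Made-up function under test.
    fn start(id: u32) -> Result<Event, String> {
        if id == 0 {
            return Ok(Event::Failed("id must be non-zero".to_string()));
        }
        Ok(Event::Started { id })
    }

    fn check() {
        // Pattern plus guard, using only stable std macros.
        assert!(matches!(start(7), Ok(Event::Started { id }) if id == 7));
        assert!(matches!(start(0), Ok(Event::Failed(_))));
    }
    ```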
  • chore: clippy on redundant closure (#4058)
    Add the redundant-closure clippy rules and let Codex fix the resulting
    warnings by minimising fully qualified paths (FQP).
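
    As an illustration of what these lints flag (not code from this change),
    a closure that only forwards its argument can be replaced by the function
    or method path itself. `parse_port` and the wrappers are hypothetical.

    ```rust
    fn parse_port(s: &str) -> Option<u16> {
        s.parse().ok()
    }

    /// Before: the closure adds nothing beyond calling `parse_port`.
    fn with_closure(inputs: &[&str]) -> Vec<Option<u16>> {
        inputs.iter().map(|s| parse_port(s)).collect()
    }

    /// After: pass the function directly.
    fn without_closure(inputs: &[&str]) -> Vec<Option<u16>> {
        inputs.iter().copied().map(parse_port).collect()
    }

    /// Method-call variant: `|w| w.len()` becomes a short type-qualified path.
    fn lengths(words: &[String]) -> Vec<usize> {
        words.iter().map(String::len).collect()
    }
    ```

    Keeping the type in scope is what lets the replacement stay short, for
    example `String::len` rather than `std::string::String::len`.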
  • chore: upgrade to Rust 1.89 (#2465)
    Codex created this PR from the following prompt:
    
    > upgrade this entire repo to Rust 1.89. Note that this requires
    updating codex-rs/rust-toolchain.toml as well as the workflows in
    .github/. Make sure that things are "clippy clean" as this change will
    likely uncover new Clippy errors. `just fmt` and `cargo clippy --tests`
    are sufficient to check for correctness
    
    Note this modifies a lot of lines because it folds nested `if`
    statements using `&&`.
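
    A minimal before/after sketch of that folding, using plain boolean
    conditions. The functions are invented for the example, and the commit
    text does not say which kinds of conditions were actually folded.

    ```rust
    /// Before: nested `if` statements.
    fn nested(enabled: bool, retries: u32) -> &'static str {
        if enabled {
            if retries > 0 {
                return "retry";
            }
        }
        "stop"
    }

    /// After: the same logic folded into one condition with `&&`.
    fn folded(enabled: bool, retries: u32) -> &'static str {
        if enabled && retries > 0 {
            return "retry";
        }
        "stop"
    }
    ```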
    
  • Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
    This PR:
    * Added a `clippy.toml` to configure allowable `expect` / `unwrap` usage
    in tests
    * Removed as many expect/allow lines as possible from tests
    * Moved a bunch of allows to expects where possible
    
    Note: in integration tests, non-`#[test]` helper functions are not
    covered by this, so we had to leave a few lingering `expect(expect_used)`
    checks around.
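
    A sketch of what such a lingering annotation might look like on a helper
    that is not itself a `#[test]` function. `parse_fixture` is a
    hypothetical helper, and the `clippy::expect_used` lint path is an
    assumption based on the lint name mentioned above.

    ```rust
    /// Hypothetical integration-test helper. It is not marked `#[test]`,
    /// so it keeps its own lint annotation.
    #[expect(clippy::expect_used)]
    fn parse_fixture(raw: &str) -> u32 {
        raw.trim().parse().expect("fixture should be a valid u32")
    }
    ```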
  • fix: when using --oss, ensure correct configuration is threaded through correctly (#1859)
    This PR started as an investigation with the goal of eliminating the use
    of `unsafe { std::env::set_var() }` in `ollama/src/client.rs`. Setting
    environment variables in a multithreaded context is indeed unsafe, and
    these tests were observed to be flaky as a result.
    
    As I dug deeper into the issue, though, I discovered that the logic for
    instantiating `OllamaClient` under test scenarios was not quite right.
    In this PR, I aimed to:
    
    - share more code between the two creation codepaths,
    `try_from_oss_provider()` and `try_from_provider_with_base_url()`
    - use the values from `Config` when setting up Ollama, as we have
    various mechanisms for overriding config values, so we should be sure
    that we are always using the ultimate `Config` for things such as the
    `ModelProviderInfo` associated with the `oss` id
    
    Once this was in place,
    `OllamaClient::try_from_provider_with_base_url()` could be used in unit
    tests for `OllamaClient` so it was possible to create a properly
    configured client without having to set environment variables.
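
    The general idea, sketched with a stand-in type rather than the real
    `OllamaClient`: when the base URL is passed in explicitly, a test can
    build a client without touching process-wide environment variables.
    `Client`, `with_base_url`, and `endpoint` are hypothetical names.

    ```rust
    /// Stand-in client that receives its base URL explicitly.
    struct Client {
        base_url: String,
    }

    impl Client {
        /// Build a client from an explicit base URL, for example one that
        /// points at a mock server started by the test.
        fn with_base_url(base_url: &str) -> Self {
            Client {
                base_url: base_url.trim_end_matches('/').to_string(),
            }
        }

        /// Join a relative path onto the configured base URL.
        fn endpoint(&self, path: &str) -> String {
            format!("{}/{}", self.base_url, path.trim_start_matches('/'))
        }
    }
    ```

    Because nothing here reads or writes global state, tests that use it can
    run in parallel without interfering with each other.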
  • fix: correct spelling error that sneaked through (#1855)
    I ended up force-pushing https://github.com/openai/codex/pull/1848
    because CI jobs were not being triggered after updating the PR on
    GitHub, so this spelling error sneaked through.
  • Introduce --oss flag to use gpt-oss models (#1848)
    This adds support for easily running Codex backed by a local Ollama
    instance running our new open source models. See
    https://github.com/openai/gpt-oss for details.
    
    If you pass in `--oss`, you'll be prompted to install/launch Ollama, and
    it will automatically download the 20b model and attempt to use it.
    
    We'll likely want to expand this with some options later to make the
    experience smoother for users who can't run the 20b or want to run the
    120b.
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>