51 Commits

  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
  • Update rmcp to 1.7.0 (#24763)
    WIll make it easier to uprev when the new draft spec is supported.
    
    Also updates reqwest where needed for compatibility but doesn't update
    it everywhere since this is already a large diff.
    
    The new version of rmcp handles certain kinds of authentication failures
    differently, this patch includes support for identifying the failing scope
    in a WWW-Authenticate header.
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` has entails both running standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
    
    For the collection of crates below, this speeds up test execution by
    >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • core: remove cross-crate re-exports from lib.rs (#16512)
    ## Why
    
    `codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
    which made downstream crates depend on `codex-core` as a proxy module
    instead of the actual owner crate.
    
    Removing those forwards makes crate boundaries explicit and lets leaf
    crates drop unnecessary `codex-core` dependencies. In this PR, this
    reduces the dependency on `codex-core` to `codex-login` in the following
    files:
    
    ```
    codex-rs/backend-client/Cargo.toml
    codex-rs/mcp-server/tests/common/Cargo.toml
    ```
    
    ## What
    
    - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
    `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
    `codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
    `codex-tools`, and `codex-utils-path`.
    - Delete the `default_client` forwarding shim in `codex-rs/core`.
    - Update in-crate and downstream callsites to import directly from the
    owning `codex-*` crate.
    - Add direct Cargo dependencies where callsites now target the owner
    crate, and remove `codex-core` from `codex-rs/backend-client`.
  • ci: verify codex-rs Cargo manifests inherit workspace settings (#16353)
    ## Why
    
    Bazel clippy now catches lints that `cargo clippy` can still miss when a
    crate under `codex-rs` forgets to opt into workspace lints. The concrete
    example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel
    flagged a clippy violation in `models_cache.rs`, but Cargo did not
    because that crate inherited workspace package metadata without
    declaring `[lints] workspace = true`.
    
    We already mirror the workspace clippy deny list into Bazel after
    [#15955](https://github.com/openai/codex/pull/15955), so we also need a
    repo-side check that keeps every `codex-rs` manifest opted into the same
    workspace settings.
    
    ## What changed
    
    - add `.github/scripts/verify_cargo_workspace_manifests.py`, which
    parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies:
      - `version.workspace = true`
      - `edition.workspace = true`
      - `license.workspace = true`
      - `[lints] workspace = true`
    - top-level crate names follow the `codex-*` / `codex-utils-*`
    conventions, with explicit exceptions for `windows-sandbox-rs` and
    `utils/path-utils`
    - run that script in `.github/workflows/ci.yml`
    - update the current outlier manifests so the check is enforceable
    immediately
    - fix the newly exposed clippy violations in the affected crates
    (`app-server/tests/common`, `file-search`, `feedback`,
    `shell-escalation`, and `debug-client`)
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353).
    * #16351
    * __->__ #16353
  • Move terminal module to terminal-detection crate (#15216)
    - Move core/src/terminal.rs and its tests into a standalone
    terminal-detection workspace crate.
    - Update direct consumers to depend on codex-terminal-detection and
    import terminal APIs directly.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • test: deflake nextest child-process leak in MCP harnesses (#11263)
    ## Summary
    - add deterministic child-process cleanup to both test `McpProcess`
    helpers
    - keep Tokio `kill_on_drop(true)` but also reap via bounded `try_wait()`
    polling in `Drop`
    - document the failure mode and why this avoids nondeterministic `LEAK`
    flakes
    
    ## Why
    `cargo nextest` leak detection can intermittently report `LEAK` when a
    spawned server outlives test teardown, making CI flaky.
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-app-server`
    - `cargo test -p codex-mcp-server`
    
    
    ## Failing CI Reference
    - Original failing job:
    https://github.com/openai/codex/actions/runs/21845226299/job/63039443593?pr=11245
  • Upgrade rmcp to 0.14 (#10718)
    - [x] Upgrade rmcp to 0.14
  • feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
    We started working with MCP in Codex before
    https://crates.io/crates/rmcp was mature, so we had our own crate for
    MCP types that was generated from the MCP schema:
    
    
    https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md
    
    Now that `rmcp` is more mature, it makes more sense to use their MCP
    types in Rust, as they handle details (like the `_meta` field) that our
    custom version ignored. Though one advantage that our custom types had
    is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
    whereas the types in `rmcp` do not. As such, part of the work of this PR
    is leveraging the adapters between `rmcp` types and the serializable
    types that are API for us (app server and MCP) introduced in #10356.
    
    Note this PR results in a number of changes to
    `codex-rs/app-server-protocol/schema`, which merit special attention
    during review. We must ensure that these changes are still
    backwards-compatible, which is possible because we have:
    
    ```diff
    - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
    + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
    ```
    
    so `ContentBlock` has been replaced with the more general `JsonValue`.
    Note that `ContentBlock` was defined as:
    
    ```typescript
    export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
    ```
    
    so the deletion of those individual variants should not be a cause of
    great concern.
    
    Similarly, we have the following change in
    `codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
    
    ```
    - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
    + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
    ```
    
    so:
    
    - `annotations?: ToolAnnotations` ➡️ `JsonValue`
    - `inputSchema: ToolInputSchema` ➡️ `JsonValue`
    - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
    
    and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
    * #10357
    * __->__ #10349
    * #10356
  • Feat: request user input tool (#9472)
    ### Summary
    * Add `requestUserInput` tool that the model can use for gather
    feedback/asking question mid turn.
    
    
    ### Tool input schema
    ```
    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "requestUserInput input",
      "type": "object",
      "additionalProperties": false,
      "required": ["questions"],
      "properties": {
        "questions": {
          "type": "array",
          "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.",
          "minItems": 1,
          "maxItems": 3,
          "items": {
            "type": "object",
            "additionalProperties": false,
            "required": ["id", "header", "question"],
            "properties": {
              "id": {
                "type": "string",
                "description": "Stable identifier for mapping answers (snake_case)."
              },
              "header": {
                "type": "string",
                "description": "Short header label shown in the UI (12 or fewer chars)."
              },
              "question": {
                "type": "string",
                "description": "Single-sentence prompt shown to the user."
              },
              "options": {
                "type": "array",
                "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.",
                "minItems": 2,
                "maxItems": 3,
                "items": {
                  "type": "object",
                  "additionalProperties": false,
                  "required": ["value", "label", "description"],
                  "properties": {
                    "value": {
                      "type": "string",
                      "description": "Machine-readable value (snake_case)."
                    },
                    "label": {
                      "type": "string",
                      "description": "User-facing label (1-5 words)."
                    },
                    "description": {
                      "type": "string",
                      "description": "One short sentence explaining impact/tradeoff if selected."
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    ```
    
    ### Tool output schema
    ```
    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "requestUserInput output",
      "type": "object",
      "additionalProperties": false,
      "required": ["answers"],
      "properties": {
        "answers": {
          "type": "object",
          "description": "Map of question id to user answer.",
          "additionalProperties": {
            "type": "object",
            "additionalProperties": false,
            "required": ["selected"],
            "properties": {
              "selected": {
                "type": "array",
                "items": { "type": "string" }
              },
              "other": {
                "type": ["string", "null"]
              }
            }
          }
        }
      }
    }
    ```
  • feat: add support for building with Bazel (#8875)
    This PR configures Codex CLI so it can be built with
    [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
    includes configuration so that remote builds can be done using
    [BuildBuddy](https://www.buildbuddy.io).
    
    If you are familiar with Bazel, things should work as you expect, e.g.,
    run `bazel test //... --keep-going` to run all the tests in the repo,
    but we have also added some new aliases in the `justfile` for
    convenience:
    
    - `just bazel-test` to run tests locally
    - `just bazel-remote-test` to run tests remotely (currently, the remote
    build is for x86_64 Linux regardless of your host platform). Note we are
    currently seeing the following test failures in the remote build, so we
    still need to figure out what is happening here:
    
    ```
    failures:
        suite::compact::manual_compact_twice_preserves_latest_user_messages
        suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
        suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
    ```
    
    - `just build-for-release` to build release binaries for all
    platforms/architectures remotely
    
    To setup remote execution:
    - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
    employees should also request org access at
    https://openai.buildbuddy.io/join/ with their `@openai.com` email
    address.)
    - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
    `~/.bazelrc` (add the line `build
    --remote_header=x-buildbuddy-api-key=YOUR_KEY`)
    - Use `--config=remote` in your `bazel` invocations (or add `common
    --config=remote` to your `~/.bazelrc`, or use the `just` commands)
    
    ## CI
    
    In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
    uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
    (we are working on supporting Windows, but that is not ready yet). Note
    that the failures we are seeing in `just bazel-remote-test` do not occur
    on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
    is green right now.
    
    The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
    that macOS CI jobs build _remotely_ on Linux hosts (using the
    `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
    root `BUILD.bazel`) using cross-compilation to build the macOS
    artifacts. Then these artifacts are downloaded locally to GitHub's macOS
    runner so the tests can be executed natively. This is the relevant
    config that enables this:
    
    ```
    common:macos --config=remote
    common:macos --strategy=remote
    common:macos --strategy=TestRunner=darwin-sandbox,local
    ```
    
    Because of the remote caching benefits we get from BuildBuddy, these new
    CI jobs can be extremely fast! For example, consider these two jobs that
    ran all the tests on Linux x86_64:
    
    - Bazel 1m37s
    https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
    - Cargo 9m20s
    https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875
    
    For now, we will continue to run both the Bazel and Cargo jobs for PRs,
    but once we add support for Windows and running Clippy, we should be
    able to cutover to using Bazel exclusively for PRs, which should still
    speed things up considerably. We will probably continue to run the Cargo
    jobs post-merge for commits that land on `main` as a sanity check.
    
    Release builds will also continue to be done by Cargo for now.
    
    Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
    Earlier attempt to add support for Buck2, now abandoned:
    https://github.com/openai/codex/pull/8504
    
    ---------
    
    Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496)
    This PR introduces a `codex-utils-cargo-bin` utility crate that
    wraps/replaces our use of `assert_cmd::Command` and
    `escargot::CargoBuild`.
    
    As you can infer from the introduction of `buck_project_root()` in this
    PR, I am attempting to make it possible to build Codex under
    [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
    achieve faster incremental local builds (largely due to Buck2's
    [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
    build strategy, as well as benefits from its local build daemon) as well
    as faster CI builds if we invest in remote execution and caching.
    
    See
    https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
    for more details about the performance advantages of Buck2.
    
    Buck2 enforces stronger requirements in terms of build and test
    isolation. It discourages assumptions about absolute paths (which is key
    to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
    variables that Cargo provides are absolute paths (which
    `assert_cmd::Command` reads), this is a problem for Buck2, which is why
    we need this `codex-utils-cargo-bin` utility.
    
    My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
    passed to a `rust_test()` build rule as relative paths.
    `codex-utils-cargo-bin` will resolve these values to absolute paths,
    when necessary.
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
    * #8498
    * __->__ #8496
  • chore: add cargo-deny configuration (#7119)
    - add GitHub workflow running cargo-deny on push/PR
    - document cargo-deny allowlist with workspace-dep notes and advisory
    ignores
    - align workspace crates to inherit version/edition/license for
    consistent checks
  • fix: separate codex mcp into codex mcp-server and codex app-server (#4471)
    This is a very large PR with some non-backwards-compatible changes.
    
    Historically, `codex mcp` (or `codex mcp serve`) started a JSON-RPC-ish
    server that had two overlapping responsibilities:
    
    - Running an MCP server, providing some basic tool calls.
    - Running the app server used to power experiences such as the VS Code
    extension.
    
    This PR aims to separate these into distinct concepts:
    
    - `codex mcp-server` for the MCP server
    - `codex app-server` for the "application server"
    
    Note `codex mcp` still exists because it already has its own subcommands
    for MCP management (`list`, `add`, etc.)
    
    The MCP logic continues to live in `codex-rs/mcp-server` whereas the
    refactored app server logic is in the new `codex-rs/app-server` folder.
    Note that most of the existing integration tests in
    `codex-rs/mcp-server/tests/suite` were actually for the app server, so
    all the tests have been moved with the exception of
    `codex-rs/mcp-server/tests/suite/mod.rs`.
    
    Because this is already a large diff, I tried not to change more than I
    had to, so `codex-rs/app-server/tests/common/mcp_process.rs` still uses
    the name `McpProcess` for now, but I will do some mechanical renamings
    to things like `AppServer` in subsequent PRs.
    
    While `mcp-server` and `app-server` share some overlapping functionality
    (like reading streams of JSONL and dispatching based on message types)
    and some differences (completely different message types), I ended up
    doing a bit of copypasta between the two crates, as both have somewhat
    similar `message_processor.rs` and `outgoing_message.rs` files for now,
    though I expect them to diverge more in the near future.
    
    One material change is that of the initialize handshake for `codex
    app-server`, as we no longer use the MCP types for that handshake.
    Instead, we update `codex-rs/protocol/src/mcp_protocol.rs` to add an
    `Initialize` variant to `ClientRequest`, which takes the `ClientInfo`
    object we need to update the `USER_AGENT_SUFFIX` in
    `codex-rs/app-server/src/message_processor.rs`.
    
    One other material change is in
    `codex-rs/app-server/src/codex_message_processor.rs` where I eliminated
    a use of the `send_event_as_notification()` method I am generally trying
    to deprecate (because it blindly maps an `EventMsg` into a
    `JSONNotification`) in favor of `send_server_notification()`, which
    takes a `ServerNotification`, as that is intended to be a custom enum of
    all notification types supported by the app server. So to make this
    update, I had to introduce a new variant of `ServerNotification`,
    `SessionConfigured`, which is a non-backwards compatible change with the
    old `codex mcp`, and clients will have to be updated after the next
    release that contains this PR. Note that
    `codex-rs/app-server/tests/suite/list_resume.rs` also had to be update
    to reflect this change.
    
    I introduced `codex-rs/utils/json-to-toml/src/lib.rs` as a small utility
    crate to avoid some of the copying between `mcp-server` and
    `app-server`.
  • [mcp-server] Expose fuzzy file search in MCP (#2677)
    ## Summary
    Expose a simple fuzzy file search implementation for mcp clients to work
    with
    
    ## Testing
    - [x] Tested locally
  • chore: unify cargo versions (#4044)
    Unify cargo versions at root
  • feat: added SetDefaultModel to JSON-RPC server (#3512)
    This adds `SetDefaultModel`, which takes `model` and `reasoning_effort`
    as optional fields. If set, the field will overwrite what is in the
    user's `config.toml`.
    
    This reuses logic that was added to support the `/model` command in the
    TUI: https://github.com/openai/codex/pull/2799.
  • Simplify auth flow and reconcile differences between ChatGPT and API Key auth (#3189)
    This PR does the following:
    * Adds the ability to paste or type an API key.
    * Removes the `preferred_auth_method` config option. The last login
    method is always persisted in auth.json, so this isn't needed.
    * If OPENAI_API_KEY env variable is defined, the value is used to
    prepopulate the new UI. The env variable is otherwise ignored by the
    CLI.
    * Adds a new MCP server entry point "login_api_key" so we can implement
    this same API key behavior for the VS Code extension.
    <img width="473" height="140" alt="Screenshot 2025-09-04 at 3 51 04 PM"
    src="https://github.com/user-attachments/assets/c11bbd5b-8a4d-4d71-90fd-34130460f9d9"
    />
    <img width="726" height="254" alt="Screenshot 2025-09-04 at 3 51 32 PM"
    src="https://github.com/user-attachments/assets/6cc76b34-309a-4387-acbc-15ee5c756db9"
    />
  • feat: add UserInfo request to JSON-RPC server (#3428)
    This adds a simple endpoint that provides the email address encoded in
    `$CODEX_HOME/auth.json`.
    
    As noted, for now, we do not hit the server to verify this is the user's
    true email address.
  • fix: ensure output of codex-rs/mcp-types/generate_mcp_types.py matches codex-rs/mcp-types/src/lib.rs (#3439)
    https://github.com/openai/codex/pull/3395 updated `mcp-types/src/lib.rs`
    by hand, but that file is generated code that is produced by
    `mcp-types/generate_mcp_types.py`. Unfortunately, we do not have
    anything in CI to verify this right now, but I will address that in a
    subsequent PR.
    
    #3395 ended up introducing a change that added a required field when
    deserializing `InitializeResult`, breaking Codex when used as an MCP
    client, so the quick fix in #3436 was to make the new field `Optional`
    with `skip_serializing_if = "Option::is_none"`, but that did not address
    the problem that `mcp-types/generate_mcp_types.py` and
    `mcp-types/src/lib.rs` are out of sync.
    
    This PR gets things back to where they are in sync. It removes the
    custom `mcp_types::McpClientInfo` type that was added to
    `mcp-types/src/lib.rs` and forces us to use the generated
    `mcp_types::Implementation` type. Though this PR also updates
    `generate_mcp_types.py` to generate the additional `user_agent:
    Optional<String>` field on `Implementation` so that we can continue to
    specify it when Codex operates as an MCP server.
    
    However, this also requires us to specify `user_agent: None` when Codex
    operates as an MCP client.
    
    We may want to introduce our own `InitializeResult` type that is
    specific to when we run as a server to avoid this in the future, but my
    immediate goal is just to get things back in sync.
  • Improved resiliency of two auth-related tests (#3427)
    This PR improves two existing auth-related tests. They were failing when
    run in an environment where an `OPENAI_API_KEY` env variable was
    defined. The change makes them more resilient.
  • Set a user agent suffix when used as a mcp server (#3395)
    This automatically adds a user agent suffix whenever the CLI is used as
    a MCP server
  • feat: add ArchiveConversation to ClientRequest (#3353)
    Adds support for `ArchiveConversation` in the JSON-RPC server that takes
    a `(ConversationId, PathBuf)` pair and:
    
    - verifies the `ConversationId` corresponds to the rollout id at the
    `PathBuf`
    - if so, invokes
    `ConversationManager.remove_conversation(ConversationId)`
    - if the `CodexConversation` was in memory, send `Shutdown` and wait for
    `ShutdownComplete` with a timeout
    - moves the `.jsonl` file to `$CODEX_HOME/archived_sessions`
    
    ---------
    
    Co-authored-by: Gabriel Peal <gabriel@openai.com>
  • feat: Run cargo shear during CI (#3338)
    Run cargo shear as part of the CI to ensure no unused dependencies
  • Add a getUserAgent MCP method (#3320)
    This will allow the extension to pass this user agent + a suffix for its
    requests
  • MCP: add session resume + history listing; (#3185)
    # External (non-OpenAI) Pull Request Requirements
    
    Before opening this Pull Request, please read the dedicated
    "Contributing" markdown file or your PR may be closed:
    https://github.com/openai/codex/blob/main/docs/contributing.md
    
    If your PR conforms to our contribution guidelines, replace this text
    with a detailed and high quality description of your changes.
  • [mcp-server] Update read config interface (#3093)
    ## Summary
    Follow-up to #3056
    
    This PR updates the mcp-server interface for reading the config settings
    saved by the user. At risk of introducing _another_ Config struct, I
    think it makes sense to avoid tying our protocol to ConfigToml, as its
    become a bit unwieldy. GetConfigTomlResponse was a de-facto struct for
    this already - better to make it explicit, in my opinion.
    
    This is technically a breaking change of the mcp-server protocol, but
    given the previous interface was introduced so recently in #2725, and we
    have not yet even started to call it, I propose proceeding with the
    breaking change - but am open to preserving the old endpoint.
    
    ## Testing
    - [x] Added additional integration test coverage
  • chore: print stderr from MCP server to test output using eprintln! (#2849)
    Related to https://github.com/openai/codex/pull/2848, I don't see the
    stderr from `codex mcp` colocated with the other stderr from
    `test_shell_command_approval_triggers_elicitation()` when it fails even
    though we have `RUST_LOG=debug` set when we spawn `codex mcp`:
    
    
    https://github.com/openai/codex/blob/1e9e703b969d3f0965b31d1cc3d70fed3ebdd6f6/codex-rs/mcp-server/tests/common/mcp_process.rs#L65
    
    Let's try this new logic which should be more explicit.
  • chore: try to make it easier to debug the flakiness of test_shell_command_approval_triggers_elicitation (#2848)
    `test_shell_command_approval_triggers_elicitation()` is one of a number
    of integration tests that we have observed to be flaky on GitHub CI, so
    this PR tries to reduce the flakiness _and_ to provide us with more
    information when it flakes. Specifically:
    
    - Changed the command that we use to trigger the elicitation from `git
    init` to `python3 -c 'import pathlib; pathlib.Path(r"{}").touch()'`
    because running `git` seems more likely to invite variance.
    - Increased the timeout to wait for the task response from 10s to 20s.
    - Added more logging.
  • [mcp-server] Add GetConfig endpoint (#2725)
    ## Summary
    Adds a GetConfig request to the MCP Protocol, so MCP clients can
    evaluate the resolved config.toml settings which the harness is using.
    
    ## Testing
    - [x] Added an end to end test of the endpoint
  • Add AuthManager and enhance GetAuthStatus command (#2577)
    This PR adds a central `AuthManager` struct that manages the auth
    information used across conversations and the MCP server. Prior to this,
    each conversation and the MCP server got their own private snapshots of
    the auth information, and changes to one (such as a logout or token
    refresh) were not seen by others.
    
    This is especially problematic when multiple instances of the CLI are
    run. For example, consider the case where you start CLI 1 and log in to
    ChatGPT account X and then start CLI 2 and log out and then log in to
    ChatGPT account Y. The conversation in CLI 1 is still using account X,
    but if you create a new conversation, it will suddenly (and
    unexpectedly) switch to account Y.
    
    With the `AuthManager`, auth information is read from disk at the time
    the `ConversationManager` is constructed, and it is cached in memory.
    All new conversations use this same auth information, as do any token
    refreshes.
    
    The `AuthManager` is also used by the MCP server's GetAuthStatus
    command, which now returns the auth method currently used by the MCP
    server.
    
    This PR also includes an enhancement to the GetAuthStatus command. It
    now accepts two new (optional) input parameters: `include_token` and
    `refresh_token`. Callers can use this to request the in-use auth token
    and can optionally request to refresh the token.
    
    The PR also adds tests for the login and auth APIs that I recently added
    to the MCP server.
  • chore: move mcp-server/src/wire_format.rs to protocol/src/mcp_protocol.rs (#2423)
    The existing `wire_format.rs` should share more types with the
    `codex-protocol` crate (like `AskForApproval` instead of maintaining a
    parallel `CodexToolCallApprovalPolicy` enum), so this PR moves
    `wire_format.rs` into `codex-protocol`, renaming it as
    `mcp-protocol.rs`. We also de-dupe types, where appropriate.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2423).
    * #2424
    * __->__ #2423
  • feat: introduce ClientRequest::SendUserTurn (#2345)
    This adds a new request type, `SendUserTurn`, that makes it possible to
    submit a `Op::UserTurn` operation (introduced in #2329) to a
    conversation. This PR also adds a new integration test that verifies
    that changing from `AskForApproval::UnlessTrusted` to
    `AskForApproval::Never` mid-conversation ensures that an elicitation is
    no longer sent for running `python3 -c print(42)`.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2345).
    * __->__ #2345
    * #2329
    * #2343
    * #2340
    * #2338
  • fix: try to fix flakiness in test_shell_command_approval_triggers_elicitation (#2344)
    I still see flakiness in
    `test_shell_command_approval_triggers_elicitation()` on occasion where
    `MockServer` claims it has not received all of its expected requests.
    
    I recently introduced a similar type of test in #2264,
    `test_codex_jsonrpc_conversation_flow()`, which I have not seen flake
    (yet!), so this PR pulls over two things I did in that test:
    
    - increased `worker_threads` from `2` to `4`
    - added an assertion to make sure the `task_complete` notification is
    received
    
    Honestly, I'm still not sure why `MockServer` claims it sometimes does
    not receive all its expected requests given that we assert that the
    final `JSONRPCResponse` is read on the stream, but let's give this a
    shot.
    
    Assuming this fixes things, my hypothesis is that the increase in
    `worker_threads` helps because perhaps there are async tasks in
    `MockServer` that do not reliably complete fully when there are not
    enough threads available? If that is correct, it seems like the test
    would still be flaky, though perhaps with lower frequency?
  • fix: verify notifications are sent with the conversationId set (#2278)
    This updates `CodexMessageProcessor` so that each notification it sends
    for a `EventMsg` from a `CodexConversation` such that:
    
    - The `params` always has an appropriate `conversationId` field.
    - The `method` is now includes the name of the `EventMsg` type rather
    than using `codex/event` as the `method` type for all notifications. (We
    currently prefix the method name with `codex/event/`, but I think that
    should go away once we formalize the notification schema in
    `wire_format.rs`.)
    
    As part of this, we update `test_codex_jsonrpc_conversation_flow()` to
    verify that the `task_finished` notification has made it through the
    system instead of sleeping for 5s and "hoping" the server finished
    processing the task. Note we have seen some flakiness in some of our
    other, similar integration tests, and I expect adding a similar check
    would help in those cases, as well.
  • feat: support traditional JSON-RPC request/response in MCP server (#2264)
    This introduces a new set of request types that our `codex mcp`
    supports. Note that these do not conform to MCP tool calls so that
    instead of having to send something like this:
    
    ```json
    {
      "jsonrpc": "2.0",
      "method": "tools/call",
      "id": 42,
      "params": {
        "name": "newConversation",
        "arguments": {
          "model": "gpt-5",
          "approvalPolicy": "on-request"
        }
      }
    }
    ```
    
    we can send something like this:
    
    
    ```json
    {
      "jsonrpc": "2.0",
      "method": "newConversation",
      "id": 42,
      "params": {
        "model": "gpt-5",
        "approvalPolicy": "on-request"
      }
    }
    ```
    
    Admittedly, this new format is not a valid MCP tool call, but we are OK
    with that right now. (That is, not everything we might want to request
    of `codex mcp` is something that is appropriate for an autonomous agent
    to do.)
    
    To start, this introduces four request types:
    
    - `newConversation`
    - `sendUserMessage`
    - `addConversationListener`
    - `removeConversationListener`
    
    The new `mcp-server/tests/codex_message_processor_flow.rs` shows how
    these can be used.
    
    The types are defined on the `CodexRequest` enum, so we introduce a new
    `CodexMessageProcessor` that is responsible for dealing with requests
    from this enum. The top-level `MessageProcessor` has been updated so
    that when `process_request()` is called, it first checks whether the
    request conforms to `CodexRequest` and dispatches it to
    `CodexMessageProcessor` if so.
    
    Note that I also decided to use `camelCase` for the on-the-wire format,
    as that seems to be the convention for MCP.
    
    For the moment, the new protocol is defined in `wire_format.rs` within
    the `mcp-server` crate, but in a subsequent PR, I will probably move it
    to its own crate to ensure the protocol has minimal dependencies and
    that we can codegen a schema from it.
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2264).
    * #2278
    * __->__ #2264
  • MCP: add conversation.create tool [Stack 2/2] (#1783)
    Introduce conversation.create handler (handle_create_conversation) and
    wire it in MessageProcessor.
    
    Stack:
    Top: #1783 
    Bottom: #1784
    
    ---------
    
    Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
  • MCP server: route structured tool-call requests and expose mcp_protocol [Stack 2/3] (#1751)
    - Expose mcp_protocol from mcp-server for reuse in tests and callers.
    - In MessageProcessor, detect structured ToolCallRequestParams in
    tools/call and forward to a new handler.
    - Add handle_new_tool_calls scaffold (returns error for now).
    - Test helper: add send_send_user_message_tool_call to McpProcess to
    send ConversationSendMessage requests;
    
    This is the second PR in a stack.
    Stack:
    Final: #1686
    Intermediate: #1751
    First: #1750
  • Serializing the eventmsg type to snake_case (#1709)
    This was an abrupt change on our clients. We need to serialize as
    snake_case.
  • Changing method in MCP notifications (#1684)
    - Changing the codex/event type
  • fix: create separate test_support crates to eliminate #[allow(dead_code)] (#1667)
    Because of a quirk of how implementation tests work in Rust, we had a
    number of `#[allow(dead_code)]` annotations that were misleading because
    the functions _were_ being used, just not by all integration tests in a
    `tests/` folder, so when compiling the test that did not use the
    function, clippy would complain that it was unused.
    
    This fixes things by create a "test_support" crate under the `tests/`
    folder that is imported as a dev dependency for the respective crate.
  • Add support for custom base instructions (#1645)
    Allows providing custom instructions file as a config parameter and
    custom instruction text via MCP tool call.
  • Add an elicitation for approve patch and refactor tool calls (#1642)
    1. Added an elicitation for `approve-patch` which is very similar to
    `approve-exec`.
    2. Extracted both elicitations to their own files to prevent
    `codex_tool_runner` from blowing up in size.