Commit Graph

21 Commits

  • Add cloud tasks (#3197)
    Adds a TUI for managing, applying, and creating cloud tasks
  • [MCP] Introduce an experimental official rust sdk based mcp client (#4252)
    The [official Rust
    SDK](https://github.com/modelcontextprotocol/rust-sdk/tree/57fc428c578a1a3fe851ee0838bf068bda120eb3)
    has come a long way since we first started our mcp client implementation
    5 months ago and, today, it is much more complete than our own
    stdio-only implementation.
    
    This PR introduces a new config flag, `experimental_use_rmcp_client`,
    which, when enabled, uses a new MCP client powered by the SDK instead of our own.
    
    To keep this PR simple, I've only implemented the same stdio MCP
    functionality that we had but will expand on it with future PRs.
    
    ---------
    
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • chore: unify cargo versions (#4044)
    Unify cargo versions at root
  • fix: ensure output of codex-rs/mcp-types/generate_mcp_types.py matches codex-rs/mcp-types/src/lib.rs (#3439)
    https://github.com/openai/codex/pull/3395 updated `mcp-types/src/lib.rs`
    by hand, but that file is generated code that is produced by
    `mcp-types/generate_mcp_types.py`. Unfortunately, we do not have
    anything in CI to verify this right now, but I will address that in a
    subsequent PR.
    
    #3395 ended up introducing a change that added a required field when
    deserializing `InitializeResult`, breaking Codex when used as an MCP
    client, so the quick fix in #3436 was to make the new field an `Option`
    with `skip_serializing_if = "Option::is_none"`, but that did not address
    the problem that `mcp-types/generate_mcp_types.py` and
    `mcp-types/src/lib.rs` are out of sync.
    
    This PR brings the two back in sync. It removes the
    custom `mcp_types::McpClientInfo` type that was added to
    `mcp-types/src/lib.rs` and forces us to use the generated
    `mcp_types::Implementation` type. It also updates
    `generate_mcp_types.py` to generate the additional `user_agent:
    Option<String>` field on `Implementation` so that we can continue to
    specify it when Codex operates as an MCP server.
    
    However, this also requires us to specify `user_agent: None` when Codex
    operates as an MCP client.
    
    We may want to introduce our own `InitializeResult` type that is
    specific to when we run as a server to avoid this in the future, but my
    immediate goal is just to get things back in sync.
  • Set a user agent suffix when used as a mcp server (#3395)
    This automatically adds a user agent suffix whenever the CLI is used as
    an MCP server.
  • fix: remove unnecessary flush() calls (#2873)
    Because we are writing to a pipe, these `flush()` calls are unnecessary,
    so removing these saves us one syscall per write in these two cases.
  • fix: use std::env::args_os instead of std::env::args (#1698)
    Apparently `std::env::args()` will panic during iteration if any
    argument to the process is not valid Unicode:
    
    https://doc.rust-lang.org/std/env/fn.args.html
    
    Let's avoid the risk and just go with `std::env::args_os()`.
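    
    A minimal sketch of the pattern, assuming the arguments only need to be
    displayed or compared as text. `collect_args_lossy` is a hypothetical
    helper for illustration, not code from this PR:
    
    ```rust
    use std::ffi::OsString;
    
    /// Hypothetical helper: converts raw OS arguments to `String`s, replacing
    /// any invalid Unicode with U+FFFD instead of panicking.
    fn collect_args_lossy<I: IntoIterator<Item = OsString>>(args: I) -> Vec<String> {
        args.into_iter()
            .map(|arg| arg.to_string_lossy().into_owned())
            .collect()
    }
    
    fn main() {
        // `std::env::args_os()` yields `OsString`s, so iteration does not
        // panic on arguments that are not valid Unicode.
        for arg in collect_args_lossy(std::env::args_os()) {
            println!("{arg}");
        }
    }
    ```
    
    Code that needs the exact bytes (for example, file paths) should keep the
    `OsString` values rather than converting them lossily.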
    
  • chore: support MCP schema 2025-06-18 (#1621)
    This updates the schema in `generate_mcp_types.py` from `2025-03-26` to
    `2025-06-18`, regenerates `mcp-types/src/lib.rs`, and then updates all
    the code that uses `mcp-types` to honor the changes.
    
    Ran
    
    ```
    npx @modelcontextprotocol/inspector just codex mcp
    ```
    
    and verified that I was able to invoke the `codex` tool, as expected.
    
    
  • chore(rs): update dependencies (#1494)
    ### Chores
    - Update cargo dependencies
    - Remove unused cargo dependencies
    - Fix clippy warnings
    - Update Dockerfile (package.json requires node 22)
    - Let Dependabot update bun, cargo, devcontainers, docker,
    github-actions, npm (nix still not supported)
    
    ### TODO
    - Upgrade dependencies with breaking changes
    
    ```shell
    $ cargo update --verbose
       Unchanged crossterm v0.28.1 (available: v0.29.0)
       Unchanged schemars v0.8.22 (available: v1.0.4)
    ```
  • fix: honor RUST_LOG in mcp-client CLI and default to DEBUG (#1149)
    We had `debug!()` logging statements already, but they weren't being
    printed because `tracing_subscriber` was not set up.
  • chore: pin Rust version to 1.86 and use io::Error::other to prepare for 1.87 (#947)
    Previously, our GitHub actions specified the Rust toolchain as
    `dtolnay/rust-toolchain@stable`, which meant the version could change
    out from under us. In this case, the move from 1.86 to 1.87 introduced
    new clippy warnings, causing build failures.
    
    Because it will take a little time to fix all the new clippy warnings,
    this PR pins things to 1.86 for now to unbreak the build.
    
    It also replaces calls of the form `io::Error::new(io::ErrorKind::Other, ...)`
    with `io::Error::other(...)` in preparation for 1.87.
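    
    For illustration, the two spellings side by side. The function names
    `old_style` and `new_style` are hypothetical and exist only for this
    sketch:
    
    ```rust
    use std::io;
    
    /// Older spelling: explicit `ErrorKind::Other` plus a payload.
    fn old_style(msg: &str) -> io::Error {
        io::Error::new(io::ErrorKind::Other, msg.to_string())
    }
    
    /// Newer spelling: same kind and payload, less ceremony.
    fn new_style(msg: &str) -> io::Error {
        io::Error::other(msg.to_string())
    }
    
    fn main() {
        // Both errors report `ErrorKind::Other` and display the same message.
        println!("{:?} / {}", old_style("boom").kind(), old_style("boom"));
        println!("{:?} / {}", new_style("boom").kind(), new_style("boom"));
    }
    ```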
  • fix: navigate initialization phase before tools/list request in MCP client (#904)
    Apparently the MCP server implemented in JavaScript did not require the
    `initialize` handshake before responding to `tools/list` and `tools/call`
    requests, so I missed this.
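    
    A rough sketch of the ordering the client has to respect, written as a
    tiny state machine. The `Phase` enum and `advance` function are
    hypothetical names for illustration and are not the `mcp-client` API;
    only the method names come from the MCP protocol:
    
    ```rust
    /// Hypothetical client-side phases of an MCP session.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum Phase {
        Start,
        InitializeSent,
        Ready,
    }
    
    /// Returns the phase after sending `method`, or an error if the message
    /// would be out of order.
    fn advance(phase: Phase, method: &str) -> Result<Phase, String> {
        match (phase, method) {
            (Phase::Start, "initialize") => Ok(Phase::InitializeSent),
            // Sent by the client once the server has answered `initialize`.
            (Phase::InitializeSent, "notifications/initialized") => Ok(Phase::Ready),
            // Ordinary requests such as `tools/list` are allowed only when ready.
            (Phase::Ready, m) if m != "initialize" => Ok(Phase::Ready),
            (p, m) => Err(format!("cannot send `{m}` in phase {p:?}")),
        }
    }
    
    fn main() {
        let mut phase = Phase::Start;
        for method in ["initialize", "notifications/initialized", "tools/list"] {
            phase = advance(phase, method).expect("messages are in order");
            println!("sent {method}, now {phase:?}");
        }
    }
    ```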
  • feat: experimental env var: CODEX_SANDBOX_NETWORK_DISABLED (#879)
    When using Codex to develop Codex itself, I noticed that sometimes it
    would try to add `#[ignore]` to the following tests:
    
    ```
    keeps_previous_response_id_between_tasks()
    retries_on_early_close()
    ```
    
    Both of these tests start a `MockServer` that launches an HTTP server on
    an ephemeral port and requires network access to hit it, which the
    Seatbelt policy associated with `--full-auto` correctly denies. If I
    wasn't paying attention to the code that Codex was generating, one of
    these `#[ignore]` annotations could have slipped into the codebase,
    effectively disabling the test for everyone.
    
    To that end, this PR enables an experimental environment variable named
    `CODEX_SANDBOX_NETWORK_DISABLED` that is set to `1` if the
    `SandboxPolicy` used to spawn the process does not have full network
    access. I say it is "experimental" because I'm not convinced this API is
    quite right, but we need to start somewhere. (It might be more
    appropriate to have an env var like `CODEX_SANDBOX=full-auto`, but the
    challenge is that our newer `SandboxPolicy` abstraction does not map to
    a simple set of enums like in the TypeScript CLI.)
    
    We leverage this new functionality by adding the following code to the
    aforementioned tests as a way to "dynamically disable" them:
    
    ```rust
    if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
        println!(
            "Skipping test because it cannot execute when network is disabled in a Codex sandbox."
        );
        return;
    }
    ```
    
    We can use the `debug seatbelt --full-auto` command to verify that
    `cargo test` fails when run under Seatbelt prior to this change:
    
    ```
    $ cargo run --bin codex -- debug seatbelt --full-auto -- cargo test
    ---- keeps_previous_response_id_between_tasks stdout ----
    
    thread 'keeps_previous_response_id_between_tasks' panicked at /Users/mbolin/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/wiremock-0.6.3/src/mock_server/builder.rs:107:46:
    Failed to bind an OS port for a mock server.: Os { code: 1, kind: PermissionDenied, message: "Operation not permitted" }
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    
    
    failures:
        keeps_previous_response_id_between_tasks
    
    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
    
    error: test failed, to rerun pass `-p codex-core --test previous_response_id`
    ```
    
    After this change, the above command succeeds! This means that,
    going forward, when Codex operates on Codex itself and runs `cargo
    test`, only "real failures" should cause the command to fail.
    
    As part of this change, I decided to tighten up the codepaths for
    running `exec()` for shell tool calls. In particular, we do it in `core`
    for the main Codex business logic itself, but we also expose this logic
    via `debug` subcommands in the CLI in the `cli` crate. The logic for the
    `debug` subcommands was not quite as faithful to the true business logic
    as I would have liked, so I:
    
    * refactored a bit of the Linux code, splitting `linux.rs` into
    `linux_exec.rs` and `landlock.rs` in the `core` crate.
    * gated less code behind `#[cfg(target_os = "linux")]` because such
    code does not get built by default when I develop on Mac, which means I
    either have to build the code in Docker or wait for CI signal.
    * introduced `macro_rules! configure_command` in `exec.rs` so we can
    have both sync and async versions of this code. The synchronous version
    seems more appropriate for straight threads or potentially fork/exec.
  • Workspace lints and disallow unwrap (#855)
    Sets submodules to use workspace lints. Adds a workspace-level lint that
    denies `unwrap`, which found a couple of cases where we could have
    propagated errors. I also manually labeled the uses that looked fine to me.
  • fix: add optional timeout to McpClient::send_request() (#852)
    We now impose a 10s timeout on the initial `tools/list` request to an
    MCP server. We do not apply a timeout for other types of requests yet,
    but we should start enforcing those, as well.
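    
    The idea of an optional timeout can be sketched with the standard
    library alone. This is a synchronous analogue under stated assumptions:
    `wait_for_response` is a hypothetical stand-in, and the real
    `McpClient::send_request()` signature is not shown here:
    
    ```rust
    use std::sync::mpsc;
    use std::thread;
    use std::time::Duration;
    
    /// Hypothetical stand-in for a request: waits for a response on a
    /// channel, giving up after `timeout` if one is supplied.
    fn wait_for_response(
        rx: &mpsc::Receiver<String>,
        timeout: Option<Duration>,
    ) -> Result<String, String> {
        match timeout {
            Some(limit) => rx
                .recv_timeout(limit)
                .map_err(|e| format!("request failed: {e}")),
            // No timeout: block until a response arrives or the sender hangs up.
            None => rx.recv().map_err(|e| format!("request failed: {e}")),
        }
    }
    
    fn main() {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            let _ = tx.send(String::from("{\"tools\":[]}"));
        });
        // The PR applies a 10s limit to the initial `tools/list` request.
        println!("{:?}", wait_for_response(&rx, Some(Duration::from_secs(10))));
    }
    ```
    
    Passing `None` preserves the old behavior for request types that do not
    have a timeout yet.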
  • Update submodules version to come from the workspace (#850)
    Tie the version of submodules to the workspace version.
  • Update cargo to 2024 edition (#842)
    Some effects of this change:
    - New formatting changes across many files. No functionality changes
    should occur from that.
    - Calls to `std::env::set_var` are now considered unsafe. Since this only
    happens in tests, we wrap them in `unsafe` blocks.
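    
    A minimal sketch of the wrapping described above. `set_test_var` is a
    hypothetical helper, and the variable name is made up for illustration:
    
    ```rust
    use std::env;
    
    /// Hypothetical test helper: sets a variable for the current process.
    fn set_test_var(key: &str, value: &str) {
        // SAFETY: this sketch assumes no other thread reads or writes the
        // environment concurrently; upholding that is the caller's job.
        unsafe {
            env::set_var(key, value);
        }
    }
    
    fn main() {
        set_test_var("CODEX_EXAMPLE_VAR", "1");
        println!("{:?}", env::var("CODEX_EXAMPLE_VAR"));
    }
    ```
    
    On editions before 2024 the `unsafe` block is merely redundant and the
    compiler warns about it; on the 2024 edition it is required.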
  • fix: build all crates individually as part of CI (#833)
    I discovered that `cargo build` worked for the entire workspace, but not
    for the `mcp-client` or `core` crates.
    
    * `mcp-client` failed to build because it underspecified the set of
    features it needed from `tokio`.
    * `core` failed to build because it was using a "feature" of its own
    crate in the default, no-feature version.
     
    This PR fixes the builds and adds a check in CI to defend against this
    sort of thing going forward.
  • feat: update McpClient::new_stdio_client() to accept an env (#831)
    Cleans up the signature for `new_stdio_client()` to more closely mirror
    how MCP servers are declared in config files (`command`, `args`, `env`).
    Also takes a cue from Claude Code where the MCP server is launched with
    a restricted `env` so that it only includes "safe" things like `USER`
    and `PATH` (see the `create_env_for_mcp_server()` function introduced in
    this PR for details) by default, as it is common for developers to have
    sensitive API keys present in their environment that should only be
    forwarded to the MCP server when the user has explicitly configured it
    to do so.
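    
    The restricted-environment idea can be sketched roughly as follows. The
    `SAFE_VARS` list and `build_server_env` function are assumptions made
    for illustration; the actual allowlist and logic live in
    `create_env_for_mcp_server()`:
    
    ```rust
    use std::collections::HashMap;
    
    /// Illustrative allowlist only; the real list is defined in the PR.
    const SAFE_VARS: &[&str] = &["USER", "PATH"];
    
    /// Hypothetical sketch: keep allowlisted variables from `parent`, then
    /// layer on whatever the user explicitly configured for this server.
    fn build_server_env(
        parent: &HashMap<String, String>,
        configured: &HashMap<String, String>,
    ) -> HashMap<String, String> {
        let mut env: HashMap<String, String> = parent
            .iter()
            .filter(|(key, _)| SAFE_VARS.contains(&key.as_str()))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect();
        // Explicit configuration wins over inherited values.
        env.extend(configured.iter().map(|(k, v)| (k.clone(), v.clone())));
        env
    }
    
    fn main() {
        let parent = HashMap::from([
            ("PATH".to_string(), "/usr/bin".to_string()),
            ("SECRET_API_KEY".to_string(), "do-not-forward".to_string()),
        ]);
        let configured = HashMap::from([("SERVER_TOKEN".to_string(), "abc".to_string())]);
        let env = build_server_env(&parent, &configured);
        // `SECRET_API_KEY` is dropped; `PATH` and `SERVER_TOKEN` remain.
        println!("{} variables forwarded", env.len());
    }
    ```
    
    The resulting map could then be handed to a process builder after
    clearing the inherited environment, so that nothing leaks by default.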
    
  • feat: initial McpClient for Rust (#822)
    This PR introduces an initial `McpClient` that we will use to give Codex
    itself programmatic access to foreign MCPs. This does not wire it up in
    Codex itself yet, but the new `mcp-client` crate includes a `main.rs`
    for basic testing for now.
    
    Manually tested by sending a `tools/list` request to Codex's own MCP
    server:
    
    ```
    codex-rs$ cargo build
    codex-rs$ cargo run --bin codex-mcp-client ./target/debug/codex-mcp-server
    {
      "tools": [
        {
          "description": "Run a Codex session. Accepts configuration parameters matching the Codex Config struct.",
          "inputSchema": {
            "properties": {
              "approval-policy": {
                "description": "Execution approval policy expressed as the kebab-case variant name (`unless-allow-listed`, `auto-edit`, `on-failure`, `never`).",
                "enum": [
                  "auto-edit",
                  "unless-allow-listed",
                  "on-failure",
                  "never"
                ],
                "type": "string"
              },
              "cwd": {
                "description": "Working directory for the session. If relative, it is resolved against the server process's current working directory.",
                "type": "string"
              },
              "disable-response-storage": {
                "description": "Disable server-side response storage.",
                "type": "boolean"
              },
              "model": {
                "description": "Optional override for the model name (e.g. \"o3\", \"o4-mini\")",
                "type": "string"
              },
              "prompt": {
                "description": "The *initial user prompt* to start the Codex conversation.",
                "type": "string"
              },
              "sandbox-permissions": {
                "description": "Sandbox permissions using the same string values accepted by the CLI (e.g. \"disk-write-cwd\", \"network-full-access\").",
                "items": {
                  "enum": [
                    "disk-full-read-access",
                    "disk-write-cwd",
                    "disk-write-platform-user-temp-folder",
                    "disk-write-platform-global-temp-folder",
                    "disk-full-write-access",
                    "network-full-access"
                  ],
                  "type": "string"
                },
                "type": "array"
              }
            },
            "required": [
              "prompt"
            ],
            "type": "object"
          },
          "name": "codex"
        }
      ]
    }
    ```