Commit Graph

58 Commits

  • feat: stream exec stdout events (#1786)
    ## Summary
    - stream command stdout as `ExecCommandStdout` events
    - forward streamed stdout to clients and ignore it in the human output
    processor
    - adjust call sites for new streaming API
  • Auto format toml (#1745)
    Add recommended extension and configure it to auto format prompt.
  • fix: run apply_patch calls through the sandbox (#1705)
    Building on the work of https://github.com/openai/codex/pull/1702, this
    changes how a shell call to `apply_patch` is handled.
    
    Previously, a shell call to `apply_patch` was always handled in-process,
    never leveraging a sandbox. To determine whether the `apply_patch`
    operation could be auto-approved, the
    `is_write_patch_constrained_to_writable_paths()` function would check if
    all the paths listed in the patch were writable. If so, the agent would
    apply the changes listed in the patch.
    
    Unfortunately, this approach afforded a loophole: symlinks!
    
    * For a soft link, we could fix this issue by tracing the link and
    checking whether the target is in the set of writable paths, however...
    * ...For a hard link, things are not as simple. We can run `stat FILE`
    to see if the number of links is greater than 1, but then we would have
    to do something potentially expensive like `find . -inum <inode_number>`
    to find the other paths for `FILE`. Further, even if this worked, this
    approach runs the risk of a
    [TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
    race condition, so it is not robust.
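
    To illustrate why a path-only check falls short, here is a minimal Rust
    sketch (Unix-only, standard library only). The helper names
    `resolves_inside` and `has_other_hard_links` are hypothetical and are not
    part of Codex, and any such check remains subject to the TOCTOU race
    noted above.

    ```rust
    use std::fs;
    use std::io;
    use std::os::unix::fs::MetadataExt;
    use std::path::Path;

    /// Hypothetical helper: follow symlinks and report whether the resolved
    /// path lies under `writable_root`.
    fn resolves_inside(path: &Path, writable_root: &Path) -> io::Result<bool> {
        let resolved = fs::canonicalize(path)?;
        let root = fs::canonicalize(writable_root)?;
        Ok(resolved.starts_with(&root))
    }

    /// Hypothetical helper: a link count above 1 means the same inode is
    /// reachable through at least one other path, which may lie outside the
    /// writable root. Finding that other path requires a filesystem scan.
    fn has_other_hard_links(path: &Path) -> io::Result<bool> {
        Ok(fs::metadata(path)?.nlink() > 1)
    }
    ```

    Note that a hard link placed inside the writable root still passes
    `resolves_inside`, because canonicalization only follows symlinks. That
    is why the link count has to be checked separately.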
    
    The solution, implemented in this PR, is to turn the virtual execution
    of the `apply_patch` CLI into an _actual_ execution using `codex
    --codex-run-as-apply-patch PATCH`, which we can run under the sandbox
    the user specified, just like any other `shell` call.
    
    This, of course, assumes that the sandbox prevents writing through
    symlinks as a mechanism to write to folders that are not in the writable
    set configured by the sandbox. I verified this by testing the following
    on both Mac and Linux:
    
    ```shell
    #!/usr/bin/env bash
    set -euo pipefail
    
    # Can running a command in SANDBOX_DIR write a file in EXPLOIT_DIR?
    
    # Codex is run in SANDBOX_DIR, so writes should be constrained to this directory.
    SANDBOX_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
    # EXPLOIT_DIR is outside of SANDBOX_DIR, so let's see if we can write to it.
    EXPLOIT_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
    
    echo "SANDBOX_DIR: $SANDBOX_DIR"
    echo "EXPLOIT_DIR: $EXPLOIT_DIR"
    
    cleanup() {
      # Only remove if it looks sane and still exists
      [[ -n "${SANDBOX_DIR:-}" && -d "$SANDBOX_DIR" ]] && rm -rf -- "$SANDBOX_DIR"
      [[ -n "${EXPLOIT_DIR:-}" && -d "$EXPLOIT_DIR" ]] && rm -rf -- "$EXPLOIT_DIR"
    }
    
    trap cleanup EXIT
    
    echo "I am the original content" > "${EXPLOIT_DIR}/original.txt"
    
    # Drop the -s to test hard links.
    ln -s "${EXPLOIT_DIR}/original.txt" "${SANDBOX_DIR}/link-to-original.txt"
    
    cat "${SANDBOX_DIR}/link-to-original.txt"
    
    if [[ "$(uname)" == "Linux" ]]; then
        SANDBOX_SUBCOMMAND=landlock
    else
        SANDBOX_SUBCOMMAND=seatbelt
    fi
    
    # Attempt the exploit
    cd "${SANDBOX_DIR}"
    
    codex debug "${SANDBOX_SUBCOMMAND}" bash -lc "echo pwned > ./link-to-original.txt" || true
    
    cat "${EXPLOIT_DIR}/original.txt"
    ```
    
    Admittedly, this change merits a proper integration test, but I think I
    will have to do that in a follow-up PR.
  • remove conversation history widget (#1727)
    this widget is no longer used.
  • Add an experimental plan tool (#1726)
    This adds a tool the model can call to update a plan. The tool doesn't
    actually _do_ anything but it gives clients a chance to read and render
    the structured plan. We will likely iterate on the prompt and tools
    exposed for planning over time.
  • Relative instruction file (#1722)
    Passing in an instruction file with a bad path led to silent failures,
    and relative instruction paths were handled in an unintuitive fashion.
  • fix: support special --codex-run-as-apply-patch arg (#1702)
    This introduces some special behavior to the CLIs that are using the
    `codex-arg0` crate where if `arg1` is `--codex-run-as-apply-patch`, then
    it will run as if `apply_patch arg2` were invoked. This is important
    because it means we can do things like:
    
    ```shell
    SANDBOX_TYPE=landlock # or seatbelt for macOS
    codex debug "${SANDBOX_TYPE}" -- codex --codex-run-as-apply-patch PATCH
    ```
    
    which gives us a way to run `apply_patch` while ensuring it adheres to
    the sandbox the user specified.
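
    The dispatch described above might be sketched as follows. This is an
    illustration only: the `Dispatch` enum and `dispatch()` function are
    hypothetical names, not the `codex-arg0` API, and falling back to the
    normal CLI when the patch argument is missing is an assumption.

    ```rust
    /// How the process should behave, decided from its argument vector.
    #[derive(Debug, PartialEq)]
    enum Dispatch {
        /// Run as if `apply_patch <patch>` had been invoked.
        ApplyPatch { patch: String },
        /// Continue with the normal CLI entry point.
        NormalCli,
    }

    const APPLY_PATCH_ARG1: &str = "--codex-run-as-apply-patch";

    /// `args` is the full argv, with arg0 at index 0.
    fn dispatch(args: &[String]) -> Dispatch {
        match args {
            // arg1 is the special flag and arg2 is the patch.
            [_, flag, patch, ..] if flag.as_str() == APPLY_PATCH_ARG1 => {
                Dispatch::ApplyPatch { patch: patch.clone() }
            }
            // Assumption: anything else is handled by the regular CLI.
            _ => Dispatch::NormalCli,
        }
    }
    ```

    A `main()` would call `dispatch(&std::env::args().collect::<Vec<_>>())`
    before doing any other argument parsing.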
    
    While it would be nice to use the `arg0` trick like we are currently
    doing for `codex-linux-sandbox`, there is no way to specify the `arg0`
    for the underlying command when running under `/usr/bin/sandbox-exec`,
    so it will not work for us in this case.
    
    Admittedly, we could have also supported this via a custom environment
    variable (e.g., `CODEX_ARG0`), but since environment variables are
    inherited by child processes, that seemed like a potentially leakier
    abstraction.
    
    This change, as well as our existing reliance on checking `arg0`, places
    additional requirements on those who include `codex-core`. Its
    `README.md` has been updated to reflect this.
    
    While we could have just added an `apply-patch` subcommand to the
    `codex` multitool CLI, that would not be sufficient for the standalone
    `codex-exec` CLI, which is something that we distribute as part of our
    GitHub releases for those who know they will not be using the TUI and
    therefore prefer to use a slightly smaller executable:
    
    https://github.com/openai/codex/releases/tag/rust-v0.10.0
    
    To that end, this PR adds an integration test to ensure that the
    `--codex-run-as-apply-patch` option works with the standalone
    `codex-exec` CLI.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1702).
    * #1705
    * #1703
    * __->__ #1702
    * #1698
    * #1697
  • chore: update Codex::spawn() to return a struct instead of a tuple (#1677)
    Also update `init_codex()` to return a `struct` instead of a tuple.
  • Update render name in tui for approval_policy to match with config values (#1675)
    Currently, Codex on start shows the value for the approval policy as the
    name of the
    [AskForApproval](https://github.com/openai/codex/blob/2437a8d17a0cf972d1a6e7f303d469b6e2f57eae/codex-rs/core/src/protocol.rs#L128)
    enum, which differs from
    [approval_policy](https://github.com/openai/codex/blob/2437a8d17a0cf972d1a6e7f303d469b6e2f57eae/codex-rs/config.md#approval_policy)
    config values.
    E.g. "untrusted" becomes "UnlessTrusted", "on-failure" -> "OnFailure",
    "never" -> "Never".
    This PR changes render names of the approval policy to match with
    configuration values.
  • Flaky CI fix (#1647)
    Flushing before sending `TaskCompleteEvent` and ending the submission
    loop to avoid race conditions.
  • Add support for custom base instructions (#1645)
    Allows providing a custom instructions file as a config parameter and
    custom instruction text via MCP tool call.
  • [mcp-server] Add reply tool call (#1643)
    ## Summary
    Adds a new mcp tool call, `codex-reply`, so we can continue existing
    sessions. This is a first draft and does not yet support sessions from
    previous processes.
    
    ## Testing
    - [x] tested with mcp client
  • feat: add --json flag to codex exec (#1603)
    This is designed to facilitate programmatic use of Codex in a more
    lightweight way than using `codex mcp`.
    
    Passing `--json` to `codex exec` will print each event as a line of JSON
    to stdout. Note that it does not print the individual tokens as they are
    streamed, only full messages, as this is aimed at programmatic use
    rather than to power UI.
    
    <img width="1348" height="1307" alt="image"
    src="https://github.com/user-attachments/assets/fc7908de-b78d-46e4-a6ff-c85de28415c7"
    />
    
    I changed the existing `EventProcessor` into a trait and moved the
    implementation to `EventProcessorWithHumanOutput`. Then I introduced an
    alternative implementation, `EventProcessorWithJsonOutput`. The `--json`
    flag determines which implementation to use.
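
    The shape of that refactor might look roughly like the sketch below. The
    type names come from the description above, but the `Event` struct, the
    `process_event()` method, and `make_processor()` are hypothetical
    simplifications, not the real signatures.

    ```rust
    /// Hypothetical, simplified event; the real protocol has many variants.
    struct Event {
        id: String,
        message: String,
    }

    trait EventProcessor {
        /// Returns the line to print to stdout for this event.
        fn process_event(&mut self, event: &Event) -> String;
    }

    struct EventProcessorWithHumanOutput;
    struct EventProcessorWithJsonOutput;

    impl EventProcessor for EventProcessorWithHumanOutput {
        fn process_event(&mut self, event: &Event) -> String {
            format!("[{}] {}", event.id, event.message)
        }
    }

    impl EventProcessor for EventProcessorWithJsonOutput {
        fn process_event(&mut self, event: &Event) -> String {
            // `{:?}` quotes and escapes the strings, which is adequate for
            // this sketch; real code would use a JSON serializer.
            format!("{{\"id\":{:?},\"msg\":{:?}}}", event.id, event.message)
        }
    }

    /// The `--json` flag selects the implementation.
    fn make_processor(json: bool) -> Box<dyn EventProcessor> {
        if json {
            Box::new(EventProcessorWithJsonOutput)
        } else {
            Box::new(EventProcessorWithHumanOutput)
        }
    }
    ```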
  • Add streaming to exec and tui (#1594)
    Added support for streaming in `tui`
    Added support for streaming in `exec`
    
    
    https://github.com/user-attachments/assets/4215892e-d940-452c-a1d0-416ed0cf14eb
  • support deltas in core (#1587)
    - Added support for message and reasoning deltas
    - Deferred adding support in the CLI and TUI until later
    - Commented out a failing test (wrong merge) that needs a fix in a separate
    PR.
    
    Side note: I think we need to disable merging when CI doesn't pass.
  • feat: add new config option: model_supports_reasoning_summaries (#1524)
    As noted in the updated docs, this makes it so that you can set:
    
    ```toml
    model_supports_reasoning_summaries = true
    ```
    
    as a way of overriding the existing heuristic for when to set the
    `reasoning` field on a sampling request:
    
    
    https://github.com/openai/codex/blob/341c091c5b09dc706ab5c7d629516e6ef5aaf902/codex-rs/core/src/client_common.rs#L152-L166
  • chore(rs): update dependencies (#1494)
    ### Chores
    - Update cargo dependencies
    - Remove unused cargo dependencies
    - Fix clippy warnings
    - Update Dockerfile (package.json requires node 22)
    - Let Dependabot update bun, cargo, devcontainers, docker,
    github-actions, npm (nix still not supported)
    
    ### TODO
    - Upgrade dependencies with breaking changes
    
    ```shell
    $ cargo update --verbose
       Unchanged crossterm v0.28.1 (available: v0.29.0)
       Unchanged schemars v0.8.22 (available: v1.0.4)
    ```
  • feat: add support for --sandbox flag (#1476)
    At a high level, we try to design `config.toml` so that you don't have
    to "comment out a lot of stuff" when testing different options.
    
    Previously, defining a sandbox policy was somewhat at odds with this
    principle because you would define the policy as attributes of
    `[sandbox]` like so:
    
    ```toml
    [sandbox]
    mode = "workspace-write"
    writable_roots = [ "/tmp" ]
    ```
    
    but if you wanted to temporarily change to a read-only sandbox, you
    might feel compelled to modify your file to be:
    
    ```toml
    [sandbox]
    mode = "read-only"
    # mode = "workspace-write"
    # writable_roots = [ "/tmp" ]
    ```
    
    Technically, commenting out `writable_roots` would not be strictly
    necessary, as `mode = "read-only"` would ignore `writable_roots`, but
    it's still a reasonable thing to do to keep things tidy.
    
    Currently, the various values for `mode` do not support that many
    attributes, so this is not that hard to maintain, but one could imagine
    this becoming more complex in the future.
    
    In this PR, we change Codex CLI so that it no longer recognizes
    `[sandbox]`. Instead, it introduces a top-level option, `sandbox_mode`,
    and `[sandbox_workspace_write]` is used to further configure the sandbox
    when `sandbox_mode = "workspace-write"` is used:
    
    ```toml
    sandbox_mode = "workspace-write"
    
    [sandbox_workspace_write]
    writable_roots = [ "/tmp" ]
    ```
    
    This feels a bit more future-proof in that it is less tedious to
    configure different sandboxes:
    
    ```toml
    sandbox_mode = "workspace-write"
    
    [sandbox_read_only]
    # read-only options here...
    
    [sandbox_workspace_write]
    writable_roots = [ "/tmp" ]
    
    [sandbox_danger_full_access]
    # danger-full-access options here...
    ```
    
    In this scheme, you never need to comment out the configuration for an
    individual sandbox type: you only need to redefine `sandbox_mode`.
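
    As a rough model of this scheme, the following sketch keeps the per-mode
    settings side by side and lets `sandbox_mode` alone pick which ones
    apply. The `SandboxMode` and `SandboxConfig` types and the
    `effective_writable_roots()` helper are hypothetical and do not mirror
    the actual `Config` struct.

    ```rust
    use std::str::FromStr;

    #[derive(Debug, Clone, Copy, PartialEq)]
    enum SandboxMode {
        ReadOnly,
        WorkspaceWrite,
        DangerFullAccess,
    }

    impl FromStr for SandboxMode {
        type Err = String;

        /// Accepts the same strings as `sandbox_mode` in `config.toml`.
        fn from_str(s: &str) -> Result<Self, Self::Err> {
            match s {
                "read-only" => Ok(SandboxMode::ReadOnly),
                "workspace-write" => Ok(SandboxMode::WorkspaceWrite),
                "danger-full-access" => Ok(SandboxMode::DangerFullAccess),
                other => Err(format!("unknown sandbox_mode: {other}")),
            }
        }
    }

    /// Hypothetical config: `[sandbox_workspace_write]` stays defined no
    /// matter which mode is currently selected.
    struct SandboxConfig {
        mode: SandboxMode,
        workspace_write_roots: Vec<String>,
    }

    /// Writable roots only take effect in workspace-write mode.
    fn effective_writable_roots(cfg: &SandboxConfig) -> &[String] {
        match cfg.mode {
            SandboxMode::WorkspaceWrite => cfg.workspace_write_roots.as_slice(),
            _ => &[],
        }
    }
    ```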
    
    Relatedly, previous to this change, a user had to do `-c
    sandbox.mode=read-only` to change the mode on the command line. With
    this change, things are arguably a bit cleaner because the equivalent
    option is `-c sandbox_mode=read-only` (and now `-c
    sandbox_workspace_write=...` can be set separately).
    
    Though more importantly, we introduce the `-s/--sandbox` option to the
    CLI, which maps directly to `sandbox_mode` in `config.toml`, making
    config override behavior easier to reason about. Moreover, as you can
    see in the updates to the various Markdown files, it is much easier to
    explain how to configure sandboxing when things like `--sandbox
    read-only` can be used as an example.
    
    Relatedly, this cleanup also made it straightforward to add support for
    a `sandbox` option for Codex when used as an MCP server (see the changes
    to `mcp-server/src/codex_tool_config.rs`).
    
    Fixes https://github.com/openai/codex/issues/1248.
  • feat: show number of tokens remaining in UI (#1388)
    When using the OpenAI Responses API, we now record the `usage` field for
    a `"response.completed"` event, which includes metrics about the number
    of tokens consumed. We also introduce `openai_model_info.rs`, which
    includes current data about the most common OpenAI models available via
    the API (specifically `context_window` and `max_output_tokens`). If
    Codex does not recognize the model, you can set `model_context_window`
    and `model_max_output_tokens` explicitly in `config.toml`.
    
    We then introduce a new event type to `protocol.rs`, `TokenCount`,
    which includes the `TokenUsage` for the most recent turn.
    
    Finally, we update the TUI to record the running sum of tokens used so
    the percentage of available context window remaining can be reported via
    the placeholder text for the composer:
    
    ![Screenshot 2025-06-25 at 11 20
    55 PM](https://github.com/user-attachments/assets/6fd6982f-7247-4f14-84b2-2e600cb1fd49)
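
    The arithmetic behind that placeholder can be sketched as below. The
    `TokenTally` type and `percent_of_context_remaining()` function are
    hypothetical, and the exact formula the TUI uses may differ (for
    example, in how it rounds).

    ```rust
    /// Hypothetical running sum of tokens used across turns.
    #[derive(Default)]
    struct TokenTally {
        total: u64,
    }

    impl TokenTally {
        /// Add the usage reported for the most recent turn.
        fn record(&mut self, turn_tokens: u64) {
            self.total = self.total.saturating_add(turn_tokens);
        }
    }

    /// Percentage of the context window still available, rounded down.
    /// Returns 0 when the window is unknown (0) or already exhausted.
    fn percent_of_context_remaining(context_window: u64, tokens_used: u64) -> u8 {
        if context_window == 0 {
            return 0;
        }
        let remaining = context_window.saturating_sub(tokens_used);
        ((remaining * 100) / context_window) as u8
    }
    ```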
    
    We could certainly get much fancier with this (such as reporting the
    estimated cost of the conversation), but for now, we are just trying to
    achieve feature parity with the TypeScript CLI.
    
    Though arguably this improves upon the TypeScript CLI, as the TypeScript
    CLI uses heuristics to estimate the number of tokens used rather than
    using the `usage` information directly:
    
    
    https://github.com/openai/codex/blob/296996d74e345b1b05d8c3451a06ace21c5ada96/codex-cli/src/utils/approximate-tokens-used.ts#L3-L16
    
    Fixes https://github.com/openai/codex/issues/1242
  • feat: add --dangerously-bypass-approvals-and-sandbox (#1384)
    This PR reworks `assess_command_safety()` so that the combination of
    `AskForApproval::Never` and `SandboxPolicy::DangerFullAccess` ensures
    that commands are run without _any_ sandbox and the user should never be
    prompted. In turn, it adds support for a new
    `--dangerously-bypass-approvals-and-sandbox` flag (that cannot be used
    with `--approval-policy` or `--full-auto`) that sets both of those
    options.
    
    Fixes https://github.com/openai/codex/issues/1254
  • chore: improve docstring for --full-auto (#1379)
    Reference `-c sandbox.mode=workspace-write` in the docstring so that users
    can read the config docs for `sandbox` for more information.
  • fix: pretty-print the sandbox config in the TUI/exec modes (#1376)
    Now that https://github.com/openai/codex/pull/1373 simplified the
    sandbox config, we can print something much simpler in the TUI (and in
    `codex exec`) to summarize the sandbox config.
    
    Before:
    
    ![Screenshot 2025-06-24 at 5 45
    52 PM](https://github.com/user-attachments/assets/b7633efb-a619-43e1-9abe-7bb0be2d0ec0)
    
    With this change:
    
    ![Screenshot 2025-06-24 at 5 46
    44 PM](https://github.com/user-attachments/assets/8d099bdd-a429-4796-a08d-70931d984e4f)
    
    For reference, my `config.toml` contains:
    
    ```toml
    [sandbox]
    mode = "workspace-write"
    writable_roots = ["/tmp", "/Users/mbolin/.pyenv/shims"]
    ```
    
    Fixes https://github.com/openai/codex/issues/1248
  • feat: redesign sandbox config (#1373)
    This is a major redesign of how sandbox configuration works and aims to
    fix https://github.com/openai/codex/issues/1248. Specifically, it
    replaces `sandbox_permissions` in `config.toml` (and the
    `-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
    three variants:
    
    ```toml
    # Safest option: full disk is read-only, but writes and network access are disallowed.
    [sandbox]
    mode = "read-only"
    
    # The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
    # writable_roots can be used to specify additional writable folders.
    [sandbox]
    mode = "workspace-write"
    writable_roots = []  # Optional, defaults to the empty list.
    network_access = false  # Optional, defaults to false.
    
    # Disable sandboxing: use at your own risk!!!
    [sandbox]
    mode = "danger-full-access"
    ```
    
    This should make sandboxing easier to reason about. While we have
    dropped support for `-s`, the way it works now is:
    
    - no flags => `read-only`
    - `--full-auto` => `workspace-write`
    - currently, there is no way to specify `danger-full-access` via a CLI
    flag, but we will revisit that as part of
    https://github.com/openai/codex/issues/1254
    
    Outstanding issue:
    
    - As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
    still conflating sandbox preferences with approval preferences in that
    case, which needs to be cleaned up.
  • feat: make reasoning effort/summaries configurable (#1199)
    Previous to this PR, we always set `reasoning` when making a request
    using the Responses API:
    
    
    https://github.com/openai/codex/blob/d7245cbbc9d8ff5446da45e5951761103492476d/codex-rs/core/src/client.rs#L108-L111
    
    Though if you tried to use the Rust CLI with `--model gpt-4.1`, this
    would fail with:
    
    ```shell
    "Unsupported parameter: 'reasoning.effort' is not supported with this model."
    ```
    
    We take a cue from the TypeScript CLI, which does a check on the model
    name:
    
    
    https://github.com/openai/codex/blob/d7245cbbc9d8ff5446da45e5951761103492476d/codex-cli/src/utils/agent/agent-loop.ts#L786-L789
    
    This PR does a similar check, though also adds support for the following
    config options:
    
    ```
    model_reasoning_effort = "low" | "medium" | "high" | "none"
    model_reasoning_summary = "auto" | "concise" | "detailed" | "none"
    ```
    
    This way, if you have a model whose name happens to start with `"o"` (or
    `"codex"`?), you can set these to `"none"` to explicitly disable
    reasoning, if necessary. (That said, it seems unlikely anyone would use
    the Responses API with non-OpenAI models, but we provide an escape
    hatch, anyway.)
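
    A minimal sketch of that decision, assuming the name check is a prefix
    test on `"o"` or `"codex"` as described above. The function and enum
    names are hypothetical and the real check may differ.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum ReasoningEffort {
        Low,
        Medium,
        High,
        /// Corresponds to `model_reasoning_effort = "none"`.
        None,
    }

    /// Heuristic on the model name (an assumption based on the text above).
    fn model_name_suggests_reasoning(model: &str) -> bool {
        model.starts_with('o') || model.starts_with("codex")
    }

    /// Returns the effort to send, or `Option::None` to omit the
    /// `reasoning` field from the request entirely.
    fn reasoning_effort_for_request(
        model: &str,
        configured: ReasoningEffort,
    ) -> Option<ReasoningEffort> {
        if configured == ReasoningEffort::None || !model_name_suggests_reasoning(model) {
            None
        } else {
            Some(configured)
        }
    }
    ```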
    
    This PR also updates both the TUI and `codex exec` to show `reasoning
    effort` and `reasoning summaries` in the header.
  • feat: show the version when starting Codex (#1182)
    The TypeScript version of the CLI shows the version when it starts up,
    which is helpful when users share screenshots (and nice to know, as a
    user).
  • feat: add hide_agent_reasoning config option (#1181)
    This PR introduces a `hide_agent_reasoning` config option (that defaults
    to `false`) that users can enable to make the output less verbose by
    suppressing reasoning output.
    
    To test, verified that this includes agent reasoning in the output:
    
    ```
    echo hello | just exec
    ```
    
    whereas this does not:
    
    ```
    echo hello | just exec --config hide_agent_reasoning=true
    ```
  • feat: dim the timestamp in the exec output (#1180)
    This required changing `ts_println!()` to take `$self:ident`, which is a
    bit more verbose, but the usability improvement seems worth it.
    
    Also eliminated an unnecessary `.to_string()` while here.
  • feat: grab-bag of improvements to exec output (#1179)
    Fixes:
    
    * Instantiate `EventProcessor` earlier in `lib.rs` so
    `print_config_summary()` can be an instance method of it and leverage
    its various `Style` fields to ensure it honors `with_ansi` properly.
    * After printing the config summary, print out user's prompt with the
    heading `User instructions:`. As noted in the comment, now that we can
    read the instructions via stdin as of #1178, it is helpful to the user
    to ensure they know what instructions were given to Codex.
    * Use same colors/bold/italic settings for headers as the TUI, making
    the output a bit easier to read.
  • feat: for codex exec, if PROMPT is not specified, read from stdin if not a TTY (#1178)
    This attempts to make `codex exec` more flexible in how the prompt can
    be passed:
    
    * as before, it can be passed as a single string argument
    * if `-` is passed as the value, the prompt is read from stdin
    * if no argument is passed _and stdin is a tty_, prints a warning to
    stderr that no prompt was specified and exits non-zero.
    * if no argument is passed _and stdin is NOT a tty_, prints `Reading
    prompt from stdin...` to stderr to let the user know that Codex will
    wait until it reads EOF from stdin to proceed. (You can repro this case
    by doing `yes | just exec` since stdin is not a TTY in that case but it
    also never reaches EOF).
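
    The four cases above can be expressed as a small pure function. This is
    a sketch with hypothetical names (`PromptSource`,
    `resolve_prompt_source`); in a real CLI, `stdin_is_tty` could come from
    `std::io::IsTerminal::is_terminal()` on `std::io::stdin()`.

    ```rust
    #[derive(Debug, PartialEq)]
    enum PromptSource {
        /// The prompt was passed as a string argument.
        Argument(String),
        /// `-` was passed, so read the prompt from stdin.
        ExplicitStdin,
        /// No argument and stdin is not a TTY: print the
        /// "Reading prompt from stdin..." notice to stderr, then read.
        ImplicitStdin,
        /// No argument and stdin is a TTY: warn on stderr, exit non-zero.
        MissingPrompt,
    }

    fn resolve_prompt_source(arg: Option<&str>, stdin_is_tty: bool) -> PromptSource {
        match arg {
            Some("-") => PromptSource::ExplicitStdin,
            Some(prompt) => PromptSource::Argument(prompt.to_string()),
            None if stdin_is_tty => PromptSource::MissingPrompt,
            None => PromptSource::ImplicitStdin,
        }
    }
    ```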
  • fix: introduce ResponseInputItem::McpToolCallOutput variant (#1151)
    The output of an MCP server tool call can be one of several types, but
    to date, we treated all outputs as text by showing the serialized JSON
    as the "tool output" in Codex:
    
    
    https://github.com/openai/codex/blob/25a9949c49194d5a64de54a11bcc5b4724ac9bd5/codex-rs/mcp-types/src/lib.rs#L96-L101
    
    This PR adds support for the `ImageContent` variant so we can now
    display an image output from an MCP tool call.
    
    In making this change, we introduce a new
    `ResponseInputItem::McpToolCallOutput` variant so that we can work with
    the `mcp_types::CallToolResult` directly when the function call is made
    to an MCP server.
    
    Though arguably the more significant change is the introduction of
    `HistoryCell::CompletedMcpToolCallWithImageOutput`, which is a cell that
    uses `ratatui_image` to render an image into the terminal. To support
    this, we introduce `ImageRenderCache`, cache a
    `ratatui_image::picker::Picker`, and add `ensure_image_cache()` to cache the
    appropriate scaled image data and dimensions based on the current
    terminal size.
    
    To test, I created a minimal `package.json`:
    
    ```json
    {
      "name": "kitty-mcp",
      "version": "1.0.0",
      "type": "module",
      "description": "MCP that returns image of kitty",
      "main": "index.js",
      "dependencies": {
        "@modelcontextprotocol/sdk": "^1.12.0"
      }
    }
    ```
    
    with the following `index.js` to define the MCP server:
    
    ```js
    #!/usr/bin/env node
    
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { readFile } from "node:fs/promises";
    import { join } from "node:path";
    
    const IMAGE_URI = "image://Ada.png";
    
    const server = new McpServer({
      name: "Demo",
      version: "1.0.0",
    });
    
    server.tool(
      "get-cat-image",
      "If you need a cat image, this tool will provide one.",
      async () => ({
        content: [
          { type: "image", data: await getAdaPngBase64(), mimeType: "image/png" },
        ],
      })
    );
    
    server.resource("Ada the Cat", IMAGE_URI, async (uri) => {
      const base64Image = await getAdaPngBase64();
      return {
        contents: [
          {
            uri: uri.href,
            mimeType: "image/png",
            blob: base64Image,
          },
        ],
      };
    });
    
    async function getAdaPngBase64() {
      const __dirname = new URL(".", import.meta.url).pathname;
      // From https://github.com/benjajaja/ratatui-image/blob/9705ce2c59ec669abbce2924cbfd1f5ae22c9860/assets/Ada.png
      const filePath = join(__dirname, "Ada.png");
      const imageData = await readFile(filePath);
      const base64Image = imageData.toString("base64");
      return base64Image;
    }
    
    const transport = new StdioServerTransport();
    await server.connect(transport);
    ```
    
    With the local changes from this PR, I added the following to my
    `config.toml`:
    
    ```toml
    [mcp_servers.kitty]
    command = "node"
    args = ["/Users/mbolin/code/kitty-mcp/index.js"]
    ```
    
    Running the TUI from source:
    
    ```
    cargo run --bin codex -- --model o3 'I need a picture of a cat'
    ```
    
    I get:
    
    <img width="732" alt="image"
    src="https://github.com/user-attachments/assets/bf80b721-9ca0-4d81-aec7-77d6899e2869"
    />
    
    Now, that said, I have only tested in iTerm and there is definitely some
    funny business with getting an accurate character-to-pixel ratio
    (sometimes the `CompletedMcpToolCallWithImageOutput` thinks it needs 10
    rows to render instead of 4), so there is still work to be done here.
  • feat: add support for -c/--config to override individual config items (#1137)
    This PR introduces support for `-c`/`--config` so users can override
    individual config values on the command line using `--config
    name=value`. Example:
    
    ```
    codex --config model=o4-mini
    ```
    
    Making it possible to set arbitrary config values on the command line
    results in a more flexible configuration scheme and makes it easier to
    provide single-line examples that can be copy-pasted from documentation.
    
    Effectively, it means there are four levels of configuration for some
    values:
    
    - Default value (e.g., `model` currently defaults to `o4-mini`)
    - Value in `config.toml` (e.g., user could override the default to be
    `model = "o3"` in their `config.toml`)
    - Specifying `-c` or `--config` to override `model` (e.g., user can
    include `-c model=o3` in their list of args to Codex)
    - If available, a config-specific flag can be used, which takes
    precedence over `-c` (e.g., user can specify `--model o3` in their list
    of args to Codex)
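
    That precedence order can be summarized in one function. This sketch
    uses `model` as the example, and `resolve_model()` is a hypothetical
    helper rather than the actual `ConfigOverrides` logic.

    ```rust
    /// Later sources win: default < config.toml < `-c model=...` < `--model`.
    fn resolve_model(
        default: &str,
        config_toml: Option<&str>,
        c_override: Option<&str>,
        dedicated_flag: Option<&str>,
    ) -> String {
        dedicated_flag
            .or(c_override)
            .or(config_toml)
            .unwrap_or(default)
            .to_string()
    }
    ```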
    
    Now that it is possible to specify anything that could be configured in
    `config.toml` on the command line using `-c`, we do not need to have a
    custom flag for every possible config option (which can clutter the
    output of `--help`). To that end, as part of this PR, we drop support
    for the `--disable-response-storage` flag, as users can now specify `-c
    disable_response_storage=true` to get the equivalent functionality.
    
    Under the hood, this works by loading the `config.toml` into a
    `toml::Value`. Then for each `key=value`, we create a small synthetic
    TOML file with `value` so that we can run the TOML parser to get the
    equivalent `toml::Value`. We then parse `key` to determine the point in
    the original `toml::Value` to do the insert/replace. Once all of the
    overrides from `-c` args have been applied, the `toml::Value` is
    deserialized into a `ConfigToml` and then the `ConfigOverrides` are
    applied, as before.
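
    The insert/replace step can be sketched without the `toml` crate by
    using a tiny stand-in `Value` type. Everything here is hypothetical: the
    real code works on `toml::Value` and parses each value with the TOML
    parser, whereas `parse_value()` below only distinguishes booleans from
    strings.

    ```rust
    use std::collections::BTreeMap;

    /// Stand-in for `toml::Value`, reduced to what the sketch needs.
    #[derive(Debug, Clone, PartialEq)]
    enum Value {
        String(String),
        Boolean(bool),
        Table(BTreeMap<String, Value>),
    }

    /// Simplified value parsing (an assumption, not the TOML grammar).
    fn parse_value(raw: &str) -> Value {
        match raw {
            "true" => Value::Boolean(true),
            "false" => Value::Boolean(false),
            other => Value::String(other.trim_matches('"').to_string()),
        }
    }

    /// Apply one `key=value` override, where `key` may be dotted.
    fn apply_override(root: &mut Value, dotted_key: &str, value: Value) {
        let path: Vec<&str> = dotted_key.split('.').collect();
        insert_at(root, &path, value);
    }

    fn insert_at(node: &mut Value, path: &[&str], value: Value) {
        let Some((first, rest)) = path.split_first() else { return };
        // A non-table node on the path is replaced by an empty table.
        if !matches!(*node, Value::Table(_)) {
            *node = Value::Table(BTreeMap::new());
        }
        if let Value::Table(table) = node {
            if rest.is_empty() {
                table.insert((*first).to_string(), value);
            } else {
                let child = table
                    .entry((*first).to_string())
                    .or_insert_with(|| Value::Table(BTreeMap::new()));
                insert_at(child, rest, value);
            }
        }
    }
    ```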
  • fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086)
    Historically, we spawned the Seatbelt and Landlock sandboxes in
    substantially different ways:
    
    For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
    specified as an arg followed by the original command:
    
    
    https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec.rs#L147-L219
    
    For **Landlock/Seccomp**, we would do
    `tokio::runtime::Builder::new_current_thread()`, _invoke
    Landlock/Seccomp APIs to modify the permissions of that new thread_, and
    then spawn the command:
    
    
    https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec_linux.rs#L28-L49
    
    While it is neat that Landlock/Seccomp supports applying a policy to
    only one thread without having to apply it to the entire process, it
    requires us to maintain two different codepaths and is a bit harder to
    reason about. The tipping point was
    https://github.com/openai/codex/pull/1061, in which we had to start
    building up the `env` in an unexpected way for the existing
    Landlock/Seccomp approach to continue to work.
    
    This PR overhauls things so that we do similar things for Mac and Linux.
    It turned out that we were already building our own "helper binary"
    comparable to Mac's `sandbox-exec` as part of the `cli` crate:
    
    
    https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/cli/Cargo.toml#L10-L12
    
    We originally created this to build a small binary to include with the
    Node.js version of the Codex CLI to provide support for Linux
    sandboxing.
    
    Though the catch is that, at this point, we still want to deploy
    the Rust version of Codex as a single, standalone binary rather than a
    CLI and a supporting sandboxing binary. To satisfy this goal, we use
    "the arg0 trick," in which we:
    
    * use `std::env::current_exe()` to get the path to the CLI that is
    currently running
    * use the CLI as the `program` for the `Command`
    * set `"codex-linux-sandbox"` as arg0 for the `Command`
    
    A CLI that supports sandboxing should check arg0 at the start of the
    program. If it is `"codex-linux-sandbox"`, it must invoke
    `codex_linux_sandbox::run_main()`, which runs the CLI as if it were
    `codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
    appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn
    the original command, so we do _replace_ the process rather than spawn a
    subprocess. Incidentally, we do this before starting the Tokio runtime,
    so the process should only have one thread when `execvp(3)` is called.
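
    Both halves of the arg0 trick can be sketched with the standard library
    (Unix-only). The helper names are hypothetical, the sketch omits the
    sandbox configuration flags that the real helper also receives, and
    comparing only the file name of arg0 is an assumption.

    ```rust
    use std::os::unix::process::CommandExt;
    use std::path::Path;
    use std::process::Command;

    const SANDBOX_ARG0: &str = "codex-linux-sandbox";

    /// Spawning side: re-run `exe` (normally `std::env::current_exe()`)
    /// with arg0 overridden so it starts up as the sandbox helper.
    fn sandbox_command(exe: &Path, original_command: &[String]) -> Command {
        let mut cmd = Command::new(exe);
        cmd.arg0(SANDBOX_ARG0);
        cmd.args(original_command);
        cmd
    }

    /// Receiving side: checked at the very start of `main()`, before any
    /// runtime threads exist.
    fn invoked_as_sandbox(arg0: &str) -> bool {
        Path::new(arg0).file_name().and_then(|name| name.to_str()) == Some(SANDBOX_ARG0)
    }
    ```

    A `main()` would read arg0 from `std::env::args_os().next()` and branch
    on `invoked_as_sandbox()` before parsing any other arguments.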
    
    Because the `core` crate that needs to spawn the Linux sandboxing is not
    a CLI in its own right, this means that every CLI that includes `core`
    and relies on this behavior has to (1) implement it and (2) provide the
    path to the sandboxing executable. While the path is almost always
    `std::env::current_exe()`, we needed to make this configurable for
    integration tests, so `Config` now has a `codex_linux_sandbox_exe:
    Option<PathBuf>` property to facilitate threading this through,
    introduced in https://github.com/openai/codex/pull/1089.
    
    This common pattern is now captured in
    `codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
    functions that should use it have been updated as part of this PR.
    
    The `codex-linux-sandbox` crate added to the Cargo workspace as part of
    this PR now has the bulk of the Landlock/Seccomp logic, which makes
    `core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
    `core/src/landlock.rs` were removed/ported as part of this PR. I also
    moved the unit tests for this code into an integration test,
    `linux-sandbox/tests/landlock.rs`, in which I use
    `env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
    `codex_linux_sandbox_exe` since `std::env::current_exe()` is not
    appropriate in that case.
  • feat: add codex_linux_sandbox_exe: Option<PathBuf> field to Config (#1089)
    https://github.com/openai/codex/pull/1086 is a work-in-progress to make
    Linux sandboxing work more like Seatbelt where, for the command we want
    to sandbox, we build up the command and then hand it, and some sandbox
    configuration flags, to another command to set up the sandbox and then
    run it.
    
    In the case of Seatbelt, macOS ships this helper binary at
    `/usr/bin/sandbox-exec`. For Linux, we have to build our own and
    pass it through (which is what #1086 does), so this makes the new
    `codex_linux_sandbox_exe` field available on `Config` so that it will
    later be available in `exec.rs` when we need it in #1086.
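    
    One way a caller might consume the new field, as a sketch. The `Config`
    below is a one-field stand-in, `resolve_sandbox_exe()` is a hypothetical
    helper, and falling back to `std::env::current_exe()` when the field is
    `None` is an assumption.
    
    ```rust
    use std::io;
    use std::path::PathBuf;
    
    /// Minimal stand-in for the real `Config`; only the new field is modeled.
    struct Config {
        codex_linux_sandbox_exe: Option<PathBuf>,
    }
    
    /// Prefers the configured path; otherwise uses the running executable.
    /// The fallback is an assumption made for this sketch.
    fn resolve_sandbox_exe(config: &Config) -> io::Result<PathBuf> {
        match &config.codex_linux_sandbox_exe {
            Some(path) => Ok(path.clone()),
            None => std::env::current_exe(),
        }
    }
    ```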
  • feat: show Config overview at start of exec (#1073)
    Now the `exec` output starts with something like:
    
    ```
    --------
    workdir:  /Users/mbolin/code/codex/codex-rs
    model:  o3
    provider:  openai
    approval:  Never
    sandbox:  SandboxPolicy { permissions: [DiskFullReadAccess, DiskWritePlatformUserTempFolder, DiskWritePlatformGlobalTempFolder, DiskWriteCwd, DiskWriteFolder { folder: "/Users/mbolin/.pyenv/shims" }] }
    --------
    ```
    
    which makes it easier to reason about when looking at logs.
  • feat: experimental --output-last-message flag to exec subcommand (#1037)
    This introduces an experimental `--output-last-message` flag that can be
    used to identify a file where the final message from the agent will be
    written. Two use cases:
    
    - Ultimately, we will likely add a `--quiet` option to `exec`, but even
    if the user does not want any output written to the terminal, they
    probably want to know what the agent did. Writing the output to a file
    makes it possible to get that information in a clean way.
    - Relatedly, when using `exec` in CI, it is easier to review the
    transcript written "normally," (i.e., not as JSON or something with
    extra escapes), but getting programmatic access to the last message is
    likely helpful, so writing the last message to a file gets the best of
    both worlds.
    
    I am calling this "experimental" because it is possible that we are
    overfitting and will want a more general solution to this problem that
    would justify removing this flag.
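    
    The behavior amounts to roughly the following sketch, where
    `write_last_message()` is a hypothetical helper and the optional path
    stands in for the value of `--output-last-message`:
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::Path;
    
    /// Writes the agent's final message to `output_file` when one was given.
    /// Returns whether a file was written.
    fn write_last_message(output_file: Option<&Path>, last_message: &str) -> io::Result<bool> {
        match output_file {
            Some(path) => {
                // Replaces any existing contents of the file.
                fs::write(path, last_message)?;
                Ok(true)
            }
            None => Ok(false),
        }
    }
    ```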
  • chore: update exec crate to use std::time instead of chrono (#952)
    When I originally wrote `elapsed.rs`, I realized we were using both
    `std::time` and `chrono` with no real benefit of having both. We should
    try to keep the `exec` subcommand trim (as it is also buildable as a
    standalone executable), so this helps tighten things up.
  • feat: record messages from user in ~/.codex/history.jsonl (#939)
    This is a large change to support a "history" feature like you would
    expect in a shell like Bash.
    
    History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
    is a JSONL file, it is straightforward to append new entries (as opposed
    to the TypeScript CLI, which uses `$CODEX_HOME/history.json`, where
    staying valid JSON means rewriting the entire file for each new entry). Because
    it is possible for there to be multiple instances of Codex CLI writing
    to `history.jsonl` at once, we use advisory file locking when working
    with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
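    
    To illustrate why JSONL makes appends cheap, here is a Unix-only sketch
    that opens the file in append mode with owner-only permissions. The
    `append_history_line()` helper is hypothetical, the caller is assumed to
    pass an already-serialized JSON object, and the advisory locking that
    `message_history.rs` performs is deliberately left out.
    
    ```rust
    use std::fs::OpenOptions;
    use std::io::{self, Write};
    use std::os::unix::fs::OpenOptionsExt;
    use std::path::Path;
    
    /// Appends one already-serialized JSON object as a single line.
    /// Advisory locking is intentionally omitted from this sketch.
    fn append_history_line(path: &Path, json_line: &str) -> io::Result<()> {
        let mut file = OpenOptions::new()
            .append(true)
            .create(true)
            .mode(0o600) // only applied when the file is created
            .open(path)?;
        // Build the full line first so the newline is not written separately.
        let mut line = String::with_capacity(json_line.len() + 1);
        line.push_str(json_line);
        line.push('\n');
        file.write_all(line.as_bytes())
    }
    ```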
    
    Because we believe history is a sufficiently useful feature, we enable
    it by default. To provide some safety, though, we set the file
    permissions of `history.jsonl` to `0o600` so that other users on the
    system cannot read the user's history. We do not yet support a default
    list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
    
    
    https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
    
    We are going to take a more conservative approach to this list in the
    Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
    sensitive information like API tokens, it would also exclude valuable
    information such as references to Git commits.
    
    As noted in the updated documentation, users can opt out of history by
    adding the following to `config.toml`:
    
    ```toml
    [history]
    persistence = "none"
    ```
    
    Because `history.jsonl` could, in theory, be quite large, we take a[n
    arguably overly pedantic] approach in reading history entries into
    memory. Specifically, we start by telling the client the current number
    of entries in the history file (`history_entry_count`) as well as the
    inode (`history_log_id`) of `history.jsonl` (see the new fields on
    `SessionConfiguredEvent`).
    
    The client is responsible for keeping new entries in memory to create a
    "local history," but if the user hits up enough times to go "past" the
    end of local history, then the client should use the new
    `GetHistoryEntryRequest` in the protocol to fetch older entries.
    Specifically, it should pass the `history_log_id` it was given
    originally and work backwards from `history_entry_count`. (It should
    really fetch history in batches rather than one-at-a-time, but that is
    something we can improve upon in subsequent PRs.)
    
    The motivation behind this crazy scheme is that it is designed to defend
    against:
    
    * The `history.jsonl` being truncated during the session such that the
    index into the history is no longer consistent with what had been read
    up to that point. We do not yet have logic to enforce a `max_bytes` for
    `history.jsonl`, but once we do, we will aspire to implement it in a way
    that should result in a new inode for the file on most systems.
    * New items from concurrent Codex CLI sessions appending to the history.
    Because, in the absence of truncation, `history.jsonl` is an append-only
    log, so long as the client reads backwards from `history_entry_count`,
    it should always get a consistent view of history. (That said, it will
    not be able to read _new_ commands from concurrent sessions, but perhaps
    we will introduce a `/` command to reload latest history or something
    down the road.)
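    
    The inode check at the heart of this scheme can be sketched as follows
    (Unix only). `lookup_history_entry()` is a hypothetical helper rather
    than the actual handler for `GetHistoryEntryRequest`, and reading the
    whole file to find one line is a simplification.
    
    ```rust
    use std::fs::File;
    use std::io::{self, Read};
    use std::os::unix::fs::MetadataExt;
    use std::path::Path;
    
    /// Returns the entry at zero-based `offset`, or `None` when the file's
    /// inode no longer matches `log_id` or the offset is out of range.
    fn lookup_history_entry(path: &Path, log_id: u64, offset: usize) -> io::Result<Option<String>> {
        let mut file = File::open(path)?;
        // A different inode means the file was replaced, so offsets handed
        // out earlier in the session no longer refer to the same entries.
        if file.metadata()?.ino() != log_id {
            return Ok(None);
        }
        let mut contents = String::new();
        file.read_to_string(&mut contents)?;
        Ok(contents.lines().nth(offset).map(str::to_owned))
    }
    ```
    
    A client holding `history_entry_count` would call this with offsets
    counting down from `history_entry_count - 1`.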
    
    Admittedly, my testing of this feature thus far has been fairly light. I
    expect we will find bugs and introduce enhancements/fixes going forward.
  • chore: handle all cases for EventMsg (#936)
    For now, this removes the `#[non_exhaustive]` attribute on `EventMsg` so
    that we are forced to handle all `EventMsg` by default. (We may revisit
    this if/when we publish `core/` as a `lib` crate.) For now, it is
    helpful to have this as a forcing function because we have effectively
    two UIs (`tui` and `exec`) and usually when we add a new variant to
    `EventMsg`, we want to be sure that we update both.
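    
    For context, `#[non_exhaustive]` requires downstream crates to include a
    wildcard arm when matching, which lets a new variant slip through
    unnoticed. Without the attribute, a `match` like the one sketched below
    stops compiling until the new variant is handled. The enum here is a
    simplified stand-in with illustrative variants.
    
    ```rust
    /// Simplified stand-in for the real enum; variants are illustrative.
    enum EventMsg {
        TaskStarted,
        AgentMessage,
        Error,
    }
    
    /// With no `_` arm, adding a variant to `EventMsg` makes this function
    /// fail to compile until the new case is handled.
    fn describe(msg: &EventMsg) -> &'static str {
        match msg {
            EventMsg::TaskStarted => "task started",
            EventMsg::AgentMessage => "agent message",
            EventMsg::Error => "error",
        }
    }
    ```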
  • fix: change EventMsg enum so every variant takes a single struct (#925)
    https://github.com/openai/codex/pull/922 did this for the
    `SessionConfigured` enum variant, and I think it is generally helpful to
    be able to work with the value of each enum variant as its own type,
    so this converts the remaining variants and updates all of the
    call sites.
    
    Added a simple unit test to verify that the JSON-serialized version of
    `Event` does not have any unexpected nesting.
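    
    A sketch of the shape this gives the enum. `ErrorEvent` and the fields
    on both payload structs are hypothetical. The point is that each payload
    can be passed to a function as its own type.
    
    ```rust
    /// Payload structs; the fields are illustrative only.
    struct SessionConfiguredEvent {
        model: String,
    }
    
    struct ErrorEvent {
        message: String,
    }
    
    /// Every variant wraps exactly one struct.
    enum EventMsg {
        SessionConfigured(SessionConfiguredEvent),
        Error(ErrorEvent),
    }
    
    /// Takes the payload type directly instead of a list of loose fields.
    fn render_error(event: &ErrorEvent) -> String {
        format!("ERROR: {}", event.message)
    }
    
    fn render(msg: &EventMsg) -> String {
        match msg {
            EventMsg::SessionConfigured(event) => format!("model: {}", event.model),
            EventMsg::Error(event) => render_error(event),
        }
    }
    ```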
  • feat: introduce --profile for Rust CLI (#921)
    This introduces a much-needed "profile" concept where users can specify
    a collection of options under one name and then pass that via
    `--profile` to the CLI.
    
    This PR introduces the `ConfigProfile` struct and makes it a field of
    `ConfigToml`. It further updates
    `Config::load_from_base_config_with_overrides()` to respect
    `ConfigProfile`, overriding default values where appropriate. A detailed
    unit test is added at the end of `config.rs` to verify this behavior.
    
    Details on how to use this feature have also been added to
    `codex-rs/README.md`.
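    
    The layering can be pictured with the sketch below. The precedence shown
    (command-line override, then profile, then base config, then a built-in
    default) is an assumption, and `resolve_model()` and the
    `"default-model"` placeholder are hypothetical.
    
    ```rust
    /// Options a profile may override; `None` means "not set in this profile".
    struct ConfigProfile {
        model: Option<String>,
    }
    
    /// Assumed precedence: CLI override, profile, base config, default.
    fn resolve_model(
        cli_override: Option<&str>,
        profile: Option<&ConfigProfile>,
        base: Option<&str>,
    ) -> String {
        cli_override
            .map(str::to_owned)
            .or_else(|| profile.and_then(|p| p.model.clone()))
            .or_else(|| base.map(str::to_owned))
            .unwrap_or_else(|| "default-model".to_owned())
    }
    ```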
  • feat: support the chat completions API in the Rust CLI (#862)
    This is a substantial PR to add support for the chat completions API,
    which in turn makes it possible to use non-OpenAI model providers (just
    like in the TypeScript CLI):
    
    * It moves a number of structs from `client.rs` to `client_common.rs` so
    they can be shared.
    * It introduces support for the chat completions API in
    `chat_completions.rs`.
    * It updates `ModelProviderInfo` so that `env_key` is `Option<String>`
    instead of `String` (e.g., for ollama) and adds a `wire_api` field.
    * It updates `client.rs` to choose between `stream_responses()` and
    `stream_chat_completions()` based on the `wire_api` for the
    `ModelProviderInfo`.
    * It updates the `exec` and TUI CLIs to no longer fail if the
    `OPENAI_API_KEY` environment variable is not set.
    * It updates the TUI so that `EventMsg::Error` is displayed more
    prominently when it occurs, particularly now that it is important to
    alert users to the `CodexErr::EnvVar` variant.
    * `CodexErr::EnvVar` was updated to include an optional `instructions`
    field so we can preserve the behavior where we direct users to
    https://platform.openai.com if `OPENAI_API_KEY` is not set.
    * Cleaned up the "welcome message" in the TUI to ensure the model
    provider is displayed.
    * Updated the docs in `codex-rs/README.md`.
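    
    The `wire_api` dispatch and the optional `env_key` from the list above
    can be sketched like this. The struct is reduced to the two fields in
    question, and the `WireApi` variant names, `stream_fn_name()`, and
    `api_key()` are hypothetical.
    
    ```rust
    /// Which protocol a provider speaks; variant names are assumptions.
    enum WireApi {
        Responses,
        Chat,
    }
    
    /// Reduced stand-in for the real struct.
    struct ModelProviderInfo {
        env_key: Option<String>,
        wire_api: WireApi,
    }
    
    /// Names the streaming function the client would pick for a provider.
    fn stream_fn_name(provider: &ModelProviderInfo) -> &'static str {
        match provider.wire_api {
            WireApi::Responses => "stream_responses",
            WireApi::Chat => "stream_chat_completions",
        }
    }
    
    /// Returns `Ok(None)` when the provider needs no key, and `Err` with the
    /// variable name when a key is required but `lookup` finds nothing.
    fn api_key(
        provider: &ModelProviderInfo,
        lookup: impl Fn(&str) -> Option<String>,
    ) -> Result<Option<String>, String> {
        match &provider.env_key {
            None => Ok(None),
            Some(var) => lookup(var.as_str()).map(Some).ok_or_else(|| var.clone()),
        }
    }
    ```
    
    In real use, `lookup` would be something like
    `|name| std::env::var(name).ok()`.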
    
    To exercise the chat completions API from OpenAI models, I added the
    following to my `config.toml`:
    
    ```toml
    model = "gpt-4o"
    model_provider = "openai-chat-completions"
    
    [model_providers.openai-chat-completions]
    name = "OpenAI using Chat Completions"
    base_url = "https://api.openai.com/v1"
    env_key = "OPENAI_API_KEY"
    wire_api = "chat"
    ```
    
    Though to test a non-OpenAI provider, I installed ollama with mistral
    locally on my Mac because ChatGPT said that would be a good match for my
    hardware:
    
    ```shell
    brew install ollama
    ollama serve
    ollama pull mistral
    ```
    
    Then I added the following to my `~/.codex/config.toml`:
    
    ```toml
    model = "mistral"
    model_provider = "ollama"
    ```
    
    Note this code could certainly use more test coverage, but I want to get
    this in so folks can start playing with it.
    
    For reference, I believe https://github.com/openai/codex/pull/247 was
    roughly the comparable PR on the TypeScript side.
  • Workspace lints and disallow unwrap (#855)
    Sets submodules to use workspace lints. Added denying unwrap as a
    workspace level lint, which found a couple of cases where we could have
    propagated errors. Also manually labeled ones that were fine by my eye.
  • feat: read model_provider and model_providers from config.toml (#853)
    This is the first step in supporting other model providers in the Rust
    CLI. Specifically, this PR adds support for the new entries in `Config`
    and `ConfigOverrides` to specify a `ModelProviderInfo`, which is the
    basic config needed for an LLM provider. This PR does not get us all the
    way there yet because `client.rs` still categorically appends
    `/responses` to the URL and expects the endpoint to support the OpenAI
    Responses API. Will fix that next!
  • Update cargo to 2024 edition (#842)
    Some effects of this change:
    - New formatting changes across many files. No functionality changes
    should occur from that.
    - Calls to `std::env::set_var` are now considered unsafe. Since these
    only happen in tests, we wrap them in `unsafe` blocks.
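    
    A sketch of what that wrapping looks like, using a hypothetical test
    helper. In the 2024 edition, `std::env::set_var` is an `unsafe fn`
    because modifying the environment can race with other threads.
    
    ```rust
    use std::env;
    
    /// Hypothetical test helper that sets a variable for the current process.
    #[allow(unused_unsafe)] // avoids a warning when built with an older edition
    fn set_env_for_test(key: &str, value: &str) {
        // SAFETY: assumed to run while no other thread touches the environment.
        unsafe {
            env::set_var(key, value);
        }
    }
    ```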
  • chore: introduce codex-common crate (#843)
    I started this PR because I wanted to share the `format_duration()`
    utility function in `codex-rs/exec/src/event_processor.rs` with the TUI.
    The question was: where to put it?
    
    `core` should have as few dependencies as possible, and moving it there
    would introduce a dependency on `chrono`, which seemed undesirable.
    `core` already had this `cli` feature to deal with a similar situation
    around sharing common utility functions, so I decided to:
    
    * make `core` feature-free
    * introduce `common`
    * `common` can have as many "special interest" features as it needs,
    each of which can declare their own deps
    * the first two features of common are `cli` and `elapsed`
    
    In practice, this meant updating a number of `Cargo.toml` files,
    replacing this line:
    
    ```toml
    codex-core = { path = "../core", features = ["cli"] }
    ```
    
    with these:
    
    ```toml
    codex-core = { path = "../core" }
    codex-common = { path = "../common", features = ["cli"] }
    ```
    
    Moving `format_duration()` into its own file gave it some "breathing
    room" to add a unit test, so I had Codex generate some tests and new
    support for durations over 1 minute.
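    
    For a sense of what such a helper does, here is a sketch built on
    `std::time::Duration`. The output formats are assumptions and are not
    taken from the actual `format_duration()`.
    
    ```rust
    use std::time::Duration;
    
    /// Assumed formats: whole milliseconds under one second, two-decimal
    /// seconds under one minute, and minutes plus zero-padded seconds above.
    fn format_duration(duration: Duration) -> String {
        let millis = duration.as_millis();
        if millis < 1_000 {
            format!("{}ms", millis)
        } else if millis < 60_000 {
            format!("{:.2}s", millis as f64 / 1_000.0)
        } else {
            let total_secs = millis / 1_000;
            format!("{}m{:02}s", total_secs / 60, total_secs % 60)
        }
    }
    ```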
  • feat: show MCP tool calls in codex exec subcommand (#841)
    This is analogous to the change for the TUI in
    https://github.com/openai/codex/pull/836, but for `codex exec`.
    
    To test, I ran:
    
    ```
    cargo run --bin codex-exec -- 'what is the weather in wellesley ma tomorrow'
    ```
    
    and saw:
    
    
    ![image](https://github.com/user-attachments/assets/5714e07f-88c7-4dd9-aa0d-be54c1670533)
  • fix: is_inside_git_repo should take the directory as a param (#809)
    https://github.com/openai/codex/pull/800 made `cwd` a property of
    `Config` and made it so the `cwd` is not necessarily
    `std::env::current_dir()`. As such, `is_inside_git_repo()` should check
    `Config.cwd` rather than `std::env::current_dir()`.
    
    This PR updates `is_inside_git_repo()` to take `Config` instead of an
    arbitrary `PathBuf` to force the check to operate on a `Config` where
    `cwd` has been resolved to what the user specified.
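    
    A sketch of a check with this signature. The `Config` here is a
    one-field stand-in, and walking up the ancestors of `cwd` looking for a
    `.git` entry is an assumed strategy that may differ from the real
    implementation.
    
    ```rust
    use std::path::PathBuf;
    
    /// Minimal stand-in for the real `Config`.
    struct Config {
        cwd: PathBuf,
    }
    
    /// Returns true if `config.cwd` or any of its ancestors contains `.git`.
    fn is_inside_git_repo(config: &Config) -> bool {
        config
            .cwd
            .ancestors()
            .any(|dir| dir.join(".git").exists())
    }
    ```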
  • feat: make cwd a required field of Config so we stop assuming std::env::current_dir() in a session (#800)
    In order to expose Codex via an MCP server, I realized that we should be
    taking `cwd` as a parameter rather than assuming
    `std::env::current_dir()` as the `cwd`. Specifically, the user may want
    to start a session in a directory other than the one where the MCP
    server has been started.
    
    This PR makes `cwd: PathBuf` a required field of `Session` and threads
    it all the way through, though I think there is still an issue with not
    honoring `workdir` for `apply_patch`, which is something we also had to
    fix in the TypeScript version: https://github.com/openai/codex/pull/556.
    
    This also adds `-C`/`--cd` to change the cwd via the command line.
    
    To test, I ran:
    
    ```
    cargo run --bin codex -- exec -C /tmp 'show the output of ls'
    ```
    
    and verified it showed the contents of my `/tmp` folder instead of
    `$PWD`.
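    
    The underlying idea is that a spawned command gets an explicit working
    directory instead of inheriting the process's own. A sketch with a
    hypothetical `pwd_in()` helper, assuming a `pwd` executable is on the
    `PATH`:
    
    ```rust
    use std::io;
    use std::path::{Path, PathBuf};
    use std::process::Command;
    
    /// Runs `pwd` with `cwd` as its working directory and returns the path
    /// it printed.
    fn pwd_in(cwd: &Path) -> io::Result<PathBuf> {
        let output = Command::new("pwd").current_dir(cwd).output()?;
        let printed = String::from_utf8_lossy(&output.stdout);
        Ok(PathBuf::from(printed.trim_end()))
    }
    ```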