Commit Graph

18 Commits

  • feat: use the arg0 trick with apply_patch (#2646)
    Historically, Codex CLI has treated `apply_patch` (and its occasional
    misspelling, `applypatch`) as a "virtual CLI," intercepting it when it
    appears as the first arg to `command` for the `"container.exec"`,
    `"shell"`, or `"local_shell"` tools.
    
    This approach has a known limitation: if, say, the model creates a
    Python script that runs `apply_patch` and then tries to run that
    script, we have no insight into what the model is trying to do, and the
    Python script fails because `apply_patch` was never really on the
    `PATH`.
    
    One way to solve this problem is to require users to install an
    `apply_patch` executable alongside the `codex` executable (or at least
    put it someplace where Codex can discover it). To keep Codex CLI
    as a standalone executable, we instead exploit "the arg0 trick" where we create
    a temporary directory with an entry named `apply_patch` and prepend that
    directory to the `PATH` for the duration of the invocation of Codex.
    
    - On UNIX, `apply_patch` is a symlink to `codex`, which now behaves
    like `apply_patch` if arg0 is `apply_patch` (or `applypatch`).
    - On Windows, `apply_patch.bat` is a batch script that runs `codex
    --codex-run-as-apply-patch %*`, as Codex also changes its behavior if
    the first argument is `--codex-run-as-apply-patch`.
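    
    A minimal sketch of the dispatch described above, not the actual Codex
    code: `Mode` and `detect_mode()` are hypothetical names, and only the
    two triggers named in this message (arg0 and the
    `--codex-run-as-apply-patch` flag) are modeled.
    
    ```rust
    use std::path::Path;
    
    /// Which personality the single executable adopts for this invocation.
    #[derive(Debug, PartialEq)]
    enum Mode {
        ApplyPatch,
        Codex,
    }
    
    /// Hypothetical helper: choose a mode from the raw argument vector.
    fn detect_mode(args: &[String]) -> Mode {
        // arg0 is the path the process was invoked through. When invoked via
        // a symlink, it carries the symlink's name, not the real binary's.
        let invoked_as = args
            .first()
            .and_then(|arg0| Path::new(arg0).file_name())
            .and_then(|name| name.to_str())
            .unwrap_or("");
        if invoked_as == "apply_patch" || invoked_as == "applypatch" {
            return Mode::ApplyPatch;
        }
        // The Windows batch script passes an explicit flag instead.
        if args.get(1).map(String::as_str) == Some("--codex-run-as-apply-patch") {
            return Mode::ApplyPatch;
        }
        Mode::Codex
    }
    
    fn main() {
        let args: Vec<String> = std::env::args().collect();
        println!("{:?}", detect_mode(&args));
    }
    ```
    
    Keying on the file name of arg0 rather than the full path means the
    check does not depend on where the temporary directory lives.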
  • [apply-patch] Fix applypatch for heredocs (#2477)
    ## Summary
    Follow up to #2186 for #2072 - we added handling for `applypatch` in
    default commands, but forgot to add detection to the heredocs logic.
    
    ## Testing
    - [x] Added unit tests
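    
    To illustrate the heredoc case, here is a deliberately simplified,
    string-based sketch. The real detection lives in the heredoc logic of
    the `apply-patch` crate; `extract_heredoc_body()` below is a
    hypothetical stand-in that assumes the script has the exact shape
    `NAME <<'DELIM'`, a body, and a closing `DELIM` line.
    
    ```rust
    /// Returns the heredoc body if the script invokes `apply_patch` or
    /// `applypatch` with a heredoc; otherwise returns `None`.
    fn extract_heredoc_body(script: &str) -> Option<&str> {
        let (first_line, rest) = script.split_once('\n')?;
        let (name, redirect) = first_line.trim().split_once(char::is_whitespace)?;
        // Accept the canonical name and the misspelling alike.
        if name != "apply_patch" && name != "applypatch" {
            return None;
        }
        let delim = redirect
            .trim()
            .strip_prefix("<<")?
            .trim_matches(|c| c == '\'' || c == '"');
        if delim.is_empty() {
            return None;
        }
        // The body ends at the first line that is exactly the delimiter.
        let mut end = 0;
        for line in rest.split_inclusive('\n') {
            if line.trim_end_matches('\n') == delim {
                return Some(rest[..end].trim_end_matches('\n'));
            }
            end += line.len();
        }
        None
    }
    
    fn main() {
        let script = "applypatch <<'EOF'\n*** Begin Patch\n*** End Patch\nEOF\n";
        println!("{:?}", extract_heredoc_body(script));
    }
    ```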
  • chore: upgrade to Rust 1.89 (#2465)
    Codex created this PR from the following prompt:
    
    > upgrade this entire repo to Rust 1.89. Note that this requires
    updating codex-rs/rust-toolchain.toml as well as the workflows in
    .github/. Make sure that things are "clippy clean" as this change will
    likely uncover new Clippy errors. `just fmt` and `cargo clippy --tests`
    are sufficient to check for correctness
    
    Note this modifies a lot of lines because it folds nested `if`
    statements using `&&`.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2465).
    * #2467
    * __->__ #2465
  • Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
    This PR:
    * Added the clippy.toml to configure allowable expect / unwrap usage in
    tests
    * Removed as many expect/allow lines as possible from tests
    * Moved a bunch of allows to expects where possible
    
    Note: in integration tests, non-`#[test]` helper functions are not
    covered by this, so we had to leave a few lingering
    `expect(expect_used)` checks around.
  • [apply-patch] Support applypatch command string (#2186)
    ## Summary
    GPT-OSS and `gpt-5-mini` have training artifacts that cause the models
    to occasionally use `applypatch` instead of `apply_patch`. I think
    long-term we'll want to provide `apply_patch` as a first class tool, but
    for now let's silently handle this case to avoid hurting model
    performance.
    
    ## Testing
    - [x] Added unit test
  • Propagate apply_patch filesystem errors (#1892)
    ## Summary
    We have been returning `exit code 0` from the apply patch command when
    writes fail, which causes our `exec` harness to pass back confusing
    messages to the model. Instead, we should loudly fail so that the
    harness and the model can handle these errors appropriately.
    
    Also adds a test to confirm this behavior.
    
    ## Testing
    - `cargo test -p codex-apply-patch`
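    
    A hedged sketch of the behavior change, using hypothetical helpers
    rather than the crate's real functions: the write error is returned to
    the caller and mapped to a non-zero exit status instead of being
    swallowed.
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::Path;
    use std::process::ExitCode;
    
    /// Hypothetical stand-in for the write step of applying a patch.
    fn write_patched_file(path: &Path, new_content: &str) -> io::Result<()> {
        // Return the error to the caller rather than ignoring it.
        fs::write(path, new_content)
    }
    
    /// Map the outcome to a process exit status: 0 on success, 1 on failure.
    fn exit_status(result: &io::Result<()>) -> u8 {
        match result {
            Ok(()) => 0,
            Err(_) => 1,
        }
    }
    
    fn main() -> ExitCode {
        // The parent directory does not exist, so this write should fail.
        let target = std::env::temp_dir()
            .join("no-such-dir-for-sketch")
            .join("file.txt");
        let result = write_patched_file(&target, "hello\n");
        if let Err(err) = &result {
            eprintln!("failed to write {}: {}", target.display(), err);
        }
        ExitCode::from(exit_status(&result))
    }
    ```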
  • fix: run apply_patch calls through the sandbox (#1705)
    Building on the work of https://github.com/openai/codex/pull/1702, this
    changes how a shell call to `apply_patch` is handled.
    
    Previously, a shell call to `apply_patch` was always handled in-process,
    never leveraging a sandbox. To determine whether the `apply_patch`
    operation could be auto-approved, the
    `is_write_patch_constrained_to_writable_paths()` function would check if
    all the paths listed in the patch were writable. If so, the agent would
    apply the changes listed in the patch.
    
    Unfortunately, this approach afforded a loophole: symlinks!
    
    * For a soft link, we could fix this issue by tracing the link and
    checking whether the target is in the set of writable paths, however...
    * ...For a hard link, things are not as simple. We can run `stat FILE`
    to see if the number of links is greater than 1, but then we would have
    to do something potentially expensive like `find . -inum <inode_number>`
    to find the other paths for `FILE`. Further, even if this worked, this
    approach runs the risk of a
    [TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
    race condition, so it is not robust.
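    
    For illustration only, the two checks discussed above might look like
    the following sketch (hypothetical helper names, Unix only). This is
    the approach the PR rejects: both functions inspect the filesystem
    before the write happens, so they stay exposed to the TOCTOU race, and
    resolving symlinks says nothing about hard links.
    
    ```rust
    use std::fs;
    use std::io;
    use std::os::unix::fs::MetadataExt;
    use std::path::Path;
    
    /// True if `path`, after following symlinks, lies under `writable_root`.
    fn resolves_inside(path: &Path, writable_root: &Path) -> io::Result<bool> {
        let real = fs::canonicalize(path)?;
        let root = fs::canonicalize(writable_root)?;
        Ok(real.starts_with(&root))
    }
    
    /// True if the link count is above 1, i.e. other paths name this inode.
    /// Finding those other paths would still require a filesystem search.
    fn has_other_hard_links(path: &Path) -> io::Result<bool> {
        Ok(fs::metadata(path)?.nlink() > 1)
    }
    
    fn main() -> io::Result<()> {
        let cwd = std::env::current_dir()?;
        println!("cwd inside itself: {}", resolves_inside(&cwd, &cwd)?);
        println!("cwd has other links: {}", has_other_hard_links(&cwd)?);
        Ok(())
    }
    ```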
    
    The solution, implemented in this PR, is to turn the virtual execution
    of the `apply_patch` CLI into an _actual_ execution using `codex
    --codex-run-as-apply-patch PATCH`, which we can run under the sandbox
    the user specified, just like any other `shell` call.
    
    This, of course, assumes that the sandbox prevents writing through
    symlinks as a mechanism to write to folders that are not in the writable
    set configured by the sandbox. I verified this by testing the following
    on both Mac and Linux:
    
    ```shell
    #!/usr/bin/env bash
    set -euo pipefail
    
    # Can running a command in SANDBOX_DIR write a file in EXPLOIT_DIR?
    
    # Codex is run in SANDBOX_DIR, so writes should be constrained to this directory.
    SANDBOX_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
    # EXPLOIT_DIR is outside of SANDBOX_DIR, so let's see if we can write to it.
    EXPLOIT_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
    
    echo "SANDBOX_DIR: $SANDBOX_DIR"
    echo "EXPLOIT_DIR: $EXPLOIT_DIR"
    
    cleanup() {
      # Only remove if it looks sane and still exists
      [[ -n "${SANDBOX_DIR:-}" && -d "$SANDBOX_DIR" ]] && rm -rf -- "$SANDBOX_DIR"
      [[ -n "${EXPLOIT_DIR:-}" && -d "$EXPLOIT_DIR" ]] && rm -rf -- "$EXPLOIT_DIR"
    }
    
    trap cleanup EXIT
    
    echo "I am the original content" > "${EXPLOIT_DIR}/original.txt"
    
    # Drop the -s to test hard links.
    ln -s "${EXPLOIT_DIR}/original.txt" "${SANDBOX_DIR}/link-to-original.txt"
    
    cat "${SANDBOX_DIR}/link-to-original.txt"
    
    if [[ "$(uname)" == "Linux" ]]; then
        SANDBOX_SUBCOMMAND=landlock
    else
        SANDBOX_SUBCOMMAND=seatbelt
    fi
    
    # Attempt the exploit
    cd "${SANDBOX_DIR}"
    
    codex debug "${SANDBOX_SUBCOMMAND}" bash -lc "echo pwned > ./link-to-original.txt" || true
    
    cat "${EXPLOIT_DIR}/original.txt"
    ```
    
    Admittedly, this change merits a proper integration test, but I think I
    will have to do that in a follow-up PR.
  • chore(rs): update dependencies (#1494)
    ### Chores
    - Update cargo dependencies
    - Remove unused cargo dependencies
    - Fix clippy warnings
    - Update Dockerfile (package.json requires node 22)
    - Let Dependabot update bun, cargo, devcontainers, docker,
    github-actions, npm (nix still not supported)
    
    ### TODO
    - Upgrade dependencies with breaking changes
    
    ```shell
    $ cargo update --verbose
       Unchanged crossterm v0.28.1 (available: v0.29.0)
       Unchanged schemars v0.8.22 (available: v1.0.4)
    ```
  • fix: provide tolerance for apply_patch tool (#993)
    As explained in detail in the doc comment for `ParseMode::Lenient`, we
    have observed that GPT-4.1 does not always generate a valid invocation
    of `apply_patch`. Fortunately, the error is predictable, so we introduce
    some new logic to the `codex-apply-patch` crate to recover from this
    error.
    
    Because we would like to avoid this becoming a de facto standard (it
    would be incompatible if `apply_patch` were provided as an actual
    executable, unless the executable also introduced the lenient
    behavior), we require passing `ParseMode::Lenient` to
    `parse_patch_text()` to make it clear that the caller is opting into
    supporting this special case.
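    
    A rough sketch of the opt-in shape, not the crate's implementation.
    The names `ParseMode::Lenient` and `parse_patch_text()` come from this
    message; everything else is assumed. In particular, this message does
    not say what the tolerated error is, so the heredoc wrapper handled
    below is purely a hypothetical placeholder for it, and the signature
    and return type are simplified.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq)]
    enum ParseMode {
        Strict,
        Lenient,
    }
    
    const BEGIN: &str = "*** Begin Patch";
    const END: &str = "*** End Patch";
    
    /// Returns the lines between the begin/end markers, or an error message.
    fn parse_patch_text(text: &str, mode: ParseMode) -> Result<Vec<&str>, String> {
        let mut lines: Vec<&str> = text.trim().lines().collect();
        // ASSUMPTION: the tolerated deviation is a `<<DELIM ... DELIM` wrapper.
        // Strict callers never take this branch.
        if mode == ParseMode::Lenient && lines.len() >= 2 && lines[0].starts_with("<<") {
            let first: &str = lines[0];
            let last: &str = lines[lines.len() - 1];
            let delim = first
                .trim_start_matches('<')
                .trim_matches(|c| c == '\'' || c == '"');
            if last == delim {
                lines = lines[1..lines.len() - 1].to_vec();
            }
        }
        match lines.as_slice() {
            [first, body @ .., last] if *first == BEGIN && *last == END => Ok(body.to_vec()),
            _ => Err(format!("expected '{}' ... '{}'", BEGIN, END)),
        }
    }
    
    fn main() {
        let wrapped = "<<'EOF'\n*** Begin Patch\n*** End Patch\nEOF";
        println!("{:?}", parse_patch_text(wrapped, ParseMode::Strict));
        println!("{:?}", parse_patch_text(wrapped, ParseMode::Lenient));
    }
    ```
    
    Making the mode an explicit argument keeps the strict grammar as the
    default contract, so a caller has to ask for the recovery path by name.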
    
    Note the analogous change to the TypeScript CLI was
    https://github.com/openai/codex/pull/930. In addition to changing the
    accepted input to `apply_patch`, it also introduced additional
    instructions for the model, which we include in this PR.
    
    Note that `apply-patch` does not depend on either `regex` or
    `regex-lite`, so some of the checks are slightly more verbose to avoid
    introducing this dependency.
    
    That said, this PR does not leverage the existing
    `extract_heredoc_body_from_apply_patch_command()`, which depends on
    `tree-sitter` and `tree-sitter-bash`:
    
    
    https://github.com/openai/codex/blob/5a5aa899143f9b9ef606692c401b010368b15bdb/codex-rs/apply-patch/src/lib.rs#L191-L246
    
    though perhaps it should.
  • fix: apply patch issue when using different cwd (#942)
    If you run a codex instance whose session working directory differs
    from the directory where you launched the codex binary, it won't be
    able to apply patches correctly, even if the sandbox policy allows it.
    This manifests as weird behaviours, such as
    
    * Reading the same filename in the binary working directory, and
    overwriting it in the session working directory. e.g. if you have a
    `readme` in both folders it will overwrite the readme in the session
    working directory with the readme in the binary working directory
    *applied with the suggested patch*.
    * The LLM ends up in weird loops trying to verify and debug why the
    apply_patch won't work, and it can end up applying patches by
    manually writing Python or JavaScript if it figures out that either is
    supported by the system.
    
    I added a test-case to ensure that the patch contents are based on the
    cwd.
    
    ## Issue: mixing relative & absolute paths in apply_patch
    
    1. The apply_patch tool uses relative paths based on the session working
    directory.
    2. `unified_diff_from_chunks` eventually ends up [reading the source
    file](https://github.com/reflectionai/codex/blob/main/codex-rs/apply-patch/src/lib.rs#L410)
    to figure out what the diff is, by using the relative path.
    3. The changes are targeted using an absolute path derived from the
    current working directory.
    
    The end result when the session working directory differs from the
    binary working directory: we get the diff for a file relative to the
    binary working directory, and apply it to a file in the session working
    directory.
  • Disallow expect via lints (#865)
    Adds `expect()` as a denied lint. Same deal applies with `unwrap()`
    where we now need to put `#[expect(...)]` on ones that we legit want.
    Took care to enable `expect()` in test contexts.
    
    # Tests
    
    ```
    cargo fmt
    cargo clippy --all-features --all-targets --no-deps -- -D warnings
    cargo test
    ```
  • Workspace lints and disallow unwrap (#855)
    Sets submodules to use workspace lints. Added denying unwrap as a
    workspace level lint, which found a couple of cases where we could have
    propagated errors. Also manually labeled ones that were fine by my eye.
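    
    A small sketch of the two outcomes described above, with hypothetical
    function names: one call site where the error is propagated instead of
    unwrapped, and one where the `unwrap()` is judged fine and labeled.
    
    ```rust
    use std::num::ParseIntError;
    
    /// Propagates the parse failure to the caller instead of panicking.
    fn parse_port(text: &str) -> Result<u16, ParseIntError> {
        let port = text.trim().parse::<u16>()?;
        Ok(port)
    }
    
    /// A case judged fine by eye: the literal is known to parse.
    #[allow(clippy::unwrap_used)]
    fn default_port() -> u16 {
        "8080".parse().unwrap()
    }
    
    fn main() {
        println!("{:?}", parse_port("http"));
        println!("{}", default_port());
    }
    ```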
  • Update cargo to 2024 edition (#842)
    Some effects of this change:
    - New formatting changes across many files. No functionality changes
    should occur from that.
    - Calls to `std::env::set_var` are now considered unsafe. Since these
    only happen in tests, we wrap them in `unsafe` blocks.
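    
    As a sketch of what such a wrapped call looks like (the helper and
    the variable name are made up for illustration; `std::env::set_var` is
    the standard library function that the 2024 edition marks as unsafe):
    
    ```rust
    use std::env;
    
    /// Test-style helper: set a variable, then read it back.
    fn set_and_read(key: &str, value: &str) -> Option<String> {
        // SAFETY: sound only while no other thread is reading or writing
        // the process environment at the same time.
        unsafe {
            env::set_var(key, value);
        }
        env::var(key).ok()
    }
    
    fn main() {
        println!("{:?}", set_and_read("SKETCH_EXAMPLE_VAR", "1"));
    }
    ```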
  • fix: ensure apply_patch resolves relative paths against workdir or project cwd (#810)
    https://github.com/openai/codex/pull/800 kicked off some work to be more
    disciplined about honoring the `cwd` param passed in rather than
    assuming `std::env::current_dir()` as the `cwd`. As part of this, we
    need to ensure `apply_patch` calls honor the appropriate `cwd` as well,
    which is significant if the paths in the `apply_patch` arg are not
    absolute paths themselves. When they are relative:
    
    - The `apply_patch` function call can contain an optional `workdir`
    param, so:
      - If specified and is an absolute path, it should be used to resolve
      relative paths
      - If specified and is a relative path, it should be resolved against
      `Config.cwd`, and then any relative paths will be resolved against
      the result
    - If `workdir` is not specified on the function call, relative paths
    should be resolved against `Config.cwd`
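    
    The rules in the list above can be sketched as a single function.
    `resolve_patch_path()` is a hypothetical name, `config_cwd` stands in
    for `Config.cwd`, and the example paths assume a Unix-style layout.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Resolve a path from a patch against `workdir` or the configured cwd.
    fn resolve_patch_path(
        config_cwd: &Path,
        workdir: Option<&Path>,
        patch_path: &Path,
    ) -> PathBuf {
        let base = match workdir {
            // An absolute workdir is used as-is.
            Some(dir) if dir.is_absolute() => dir.to_path_buf(),
            // A relative workdir is first resolved against the configured cwd.
            Some(dir) => config_cwd.join(dir),
            // No workdir: fall back to the configured cwd.
            None => config_cwd.to_path_buf(),
        };
        // `join` returns `patch_path` unchanged if it is already absolute.
        base.join(patch_path)
    }
    
    fn main() {
        let cwd = Path::new("/repo");
        let resolved = resolve_patch_path(cwd, Some(Path::new("sub")), Path::new("a.rs"));
        println!("{}", resolved.display());
    }
    ```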
    
    Note that we had a similar issue in the TypeScript CLI that was fixed in
    https://github.com/openai/codex/pull/556.
    
    As part of the fix, this PR introduces `ApplyPatchAction` so clients can
    deal with that instead of the raw `HashMap<PathBuf,
    ApplyPatchFileChange>`. This enables us to enforce, by construction,
    that all paths contained in the `ApplyPatchAction` are absolute paths.
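    
    One way to get that guarantee "by construction" is a private field
    plus a single constructor. The sketch below borrows the type names from
    this message, but the variants, fields, and method signatures are
    assumptions, not the real definitions.
    
    ```rust
    mod action {
        use std::collections::HashMap;
        use std::path::{Path, PathBuf};
    
        /// Hypothetical, simplified set of change kinds.
        #[derive(Debug, PartialEq)]
        pub enum ApplyPatchFileChange {
            Add { content: String },
            Delete,
        }
    
        /// The map is private, so code outside this module can only obtain
        /// a value through `new`, which makes every key absolute.
        #[derive(Debug)]
        pub struct ApplyPatchAction {
            changes: HashMap<PathBuf, ApplyPatchFileChange>,
        }
    
        impl ApplyPatchAction {
            /// Returns `None` if `cwd` itself is not absolute.
            pub fn new(
                cwd: &Path,
                raw: HashMap<PathBuf, ApplyPatchFileChange>,
            ) -> Option<Self> {
                if !cwd.is_absolute() {
                    return None;
                }
                // Joining onto an absolute base always yields an absolute path.
                let changes = raw.into_iter().map(|(p, c)| (cwd.join(p), c)).collect();
                Some(Self { changes })
            }
    
            pub fn changes(&self) -> &HashMap<PathBuf, ApplyPatchFileChange> {
                &self.changes
            }
        }
    }
    
    fn main() {
        let mut raw = std::collections::HashMap::new();
        raw.insert(
            std::path::PathBuf::from("src/a.rs"),
            action::ApplyPatchFileChange::Add { content: "fn main() {}\n".to_string() },
        );
        let built = action::ApplyPatchAction::new(std::path::Path::new("/repo"), raw);
        println!("{:?}", built);
    }
    ```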
  • fix: eliminate runtime dependency on patch(1) for apply_patch (#718)
    When processing an `apply_patch` tool call, we were already computing
    the new file content in order to compute the unified diff. Before this
    PR, we were shelling out to `patch(1)` to apply the unified diff once
    the user accepted the change, but this updates the code to just retain
    the new file content and use it to write the file when the user accepts.
    This simplifies deployment because it no longer assumes `patch(1)` is on
    the host.
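    
    In sketch form, with a hypothetical `PendingWrite` type standing in
    for whatever the agent actually retains between showing the diff and
    the user accepting it:
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::PathBuf;
    
    /// Hypothetical record kept while the change awaits approval.
    struct PendingWrite {
        path: PathBuf,
        /// Full new file content, already computed to produce the diff.
        new_content: String,
    }
    
    impl PendingWrite {
        /// Called once the user accepts: write the retained content directly,
        /// with no external `patch` process involved.
        fn apply(&self) -> io::Result<()> {
            fs::write(&self.path, &self.new_content)
        }
    }
    
    fn main() -> io::Result<()> {
        let pending = PendingWrite {
            path: std::env::temp_dir().join("pending-write-sketch.txt"),
            new_content: "updated content\n".to_string(),
        };
        pending.apply()
    }
    ```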
    
    Note this change is internal to the Codex agent and does not affect
    `protocol.rs`.
  • feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
    As stated in `codex-rs/README.md`:
    
    Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
    run it. For a number of users, this runtime requirement inhibits
    adoption: they would be better served by a standalone executable. As
    maintainers, we want Codex to run efficiently in a wide range of
    environments with minimal overhead. We also want to take advantage of
    operating system-specific APIs to provide better sandboxing, where
    possible.
    
    To that end, we are moving forward with a Rust implementation of Codex
    CLI contained in this folder, which has the following benefits:
    
    - The CLI compiles to small, standalone, platform-specific binaries.
    - Can make direct, native calls to
    [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
    [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
    order to support sandboxing on Linux.
    - No runtime garbage collection, resulting in lower memory consumption
    and better, more predictable performance.
    
    Currently, the Rust implementation is materially behind the TypeScript
    implementation in functionality, so continue to use the TypeScript
    implementation for the time being. We will publish native executables via
    GitHub Releases as soon as we feel the Rust version is usable.