Commit Graph

8 Commits

  • chore: enable clippy::redundant_clone (#3489)
    Created this PR by:
    
    - adding `redundant_clone` to `[workspace.lints.clippy]` in
    `cargo-rs/Cargol.toml`
    - running `cargo clippy --tests --fix`
    - running `just fmt`
    
    Though I had to clean up one instance of the following that resulted:
    
    ```rust
    let codex = codex;
    ```
  • parse cd foo && ... for exec and apply_patch (#3083)
    sometimes the model likes to run "cd foo && ..." instead of using the
    workdir parameter of exec. handle them roughly the same.
  • Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
    This PR:
    * Added the clippy.toml to configure allowable expect / unwrap usage in
    tests
    * Removed as many expect/allow lines as possible from tests
    * moved a bunch of allows to expects where possible
    
    Note: in integration tests, non `#[test]` helper functions are not
    covered by this so we had to leave a few lingering `expect(expect_used`
    checks around
  • fix: run apply_patch calls through the sandbox (#1705)
    Building on the work of https://github.com/openai/codex/pull/1702, this
    changes how a shell call to `apply_patch` is handled.
    
    Previously, a shell call to `apply_patch` was always handled in-process,
    never leveraging a sandbox. To determine whether the `apply_patch`
    operation could be auto-approved, the
    `is_write_patch_constrained_to_writable_paths()` function would check if
    all the paths listed in the paths were writable. If so, the agent would
    apply the changes listed in the patch.
    
    Unfortunately, this approach afforded a loophole: symlinks!
    
    * For a soft link, we could fix this issue by tracing the link and
    checking whether the target is in the set of writable paths, however...
    * ...For a hard link, things are not as simple. We can run `stat FILE`
    to see if the number of links is greater than 1, but then we would have
    to do something potentially expensive like `find . -inum <inode_number>`
    to find the other paths for `FILE`. Further, even if this worked, this
    approach runs the risk of a
    [TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
    race condition, so it is not robust.
    
    The solution, implemented in this PR, is to take the virtual execution
    of the `apply_patch` CLI into an _actual_ execution using `codex
    --codex-run-as-apply-patch PATCH`, which we can run under the sandbox
    the user specified, just like any other `shell` call.
    
    This, of course, assumes that the sandbox prevents writing through
    symlinks as a mechanism to write to folders that are not in the writable
    set configured by the sandbox. I verified this by testing the following
    on both Mac and Linux:
    
    ```shell
    #!/usr/bin/env bash
    set -euo pipefail
    
    # Can running a command in SANDBOX_DIR write a file in EXPLOIT_DIR?
    
    # Codex is run in SANDBOX_DIR, so writes should be constrianed to this directory.
    SANDBOX_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
    # EXPLOIT_DIR is outside of SANDBOX_DIR, so let's see if we can write to it.
    EXPLOIT_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
    
    echo "SANDBOX_DIR: $SANDBOX_DIR"
    echo "EXPLOIT_DIR: $EXPLOIT_DIR"
    
    cleanup() {
      # Only remove if it looks sane and still exists
      [[ -n "${SANDBOX_DIR:-}" && -d "$SANDBOX_DIR" ]] && rm -rf -- "$SANDBOX_DIR"
      [[ -n "${EXPLOIT_DIR:-}" && -d "$EXPLOIT_DIR" ]] && rm -rf -- "$EXPLOIT_DIR"
    }
    
    trap cleanup EXIT
    
    echo "I am the original content" > "${EXPLOIT_DIR}/original.txt"
    
    # Drop the -s to test hard links.
    ln -s "${EXPLOIT_DIR}/original.txt" "${SANDBOX_DIR}/link-to-original.txt"
    
    cat "${SANDBOX_DIR}/link-to-original.txt"
    
    if [[ "$(uname)" == "Linux" ]]; then
        SANDBOX_SUBCOMMAND=landlock
    else
        SANDBOX_SUBCOMMAND=seatbelt
    fi
    
    # Attempt the exploit
    cd "${SANDBOX_DIR}"
    
    codex debug "${SANDBOX_SUBCOMMAND}" bash -lc "echo pwned > ./link-to-original.txt" || true
    
    cat "${EXPLOIT_DIR}/original.txt"
    ```
    
    Admittedly, this change merits a proper integration test, but I think I
    will have to do that in a follow-up PR.
  • fix: provide tolerance for apply_patch tool (#993)
    As explained in detail in the doc comment for `ParseMode::Lenient`, we
    have observed that GPT-4.1 does not always generate a valid invocation
    of `apply_patch`. Fortunately, the error is predictable, so we introduce
    some new logic to the `codex-apply-patch` crate to recover from this
    error.
    
    Because we would like to avoid this becoming a de facto standard (as it
    would be incompatible if `apply_patch` were provided as an actual
    executable, unless we also introduced the lenient behavior in the
    executable, as well), we require passing `ParseMode::Lenient` to
    `parse_patch_text()` to make it clear that the caller is opting into
    supporting this special case.
    
    Note the analogous change to the TypeScript CLI was
    https://github.com/openai/codex/pull/930. In addition to changing the
    accepted input to `apply_patch`, it also introduced additional
    instructions for the model, which we include in this PR.
    
    Note that `apply-patch` does not depend on either `regex` or
    `regex-lite`, so some of the checks are slightly more verbose to avoid
    introducing this dependency.
    
    That said, this PR does not leverage the existing
    `extract_heredoc_body_from_apply_patch_command()`, which depends on
    `tree-sitter` and `tree-sitter-bash`:
    
    
    https://github.com/openai/codex/blob/5a5aa899143f9b9ef606692c401b010368b15bdb/codex-rs/apply-patch/src/lib.rs#L191-L246
    
    though perhaps it should.
  • fix: apply patch issue when using different cwd (#942)
    If you run a codex instance outside of the current working directory
    from where you launched the codex binary it won't be able to apply
    patches correctly, even if the sandbox policy allows it. This manifests
    weird behaviours, such as
    
    * Reading the same filename in the binary working directory, and
    overwriting it in the session working directory. e.g. if you have a
    `readme` in both folders it will overwrite the readme in the session
    working directory with the readme in the binary working directory
    *applied with the suggested patch*.
    * The LLM ends up in weird loops trying to verify and debug why the
    apply_patch won't work, and it can result in it applying patches by
    manually writing python or javascript if it figures out that either is
    supported by the system instead.
    
    I added a test-case to ensure that the patch contents are based on the
    cwd.
    
    ## Issue: mixing relative & absolute paths in apply_patch
    
    1. The apply_patch tool use relative paths based on the session working
    directory.
    2. `unified_diff_from_chunks` eventually ends up [reading the source
    file](https://github.com/reflectionai/codex/blob/main/codex-rs/apply-patch/src/lib.rs#L410)
    to figure out what the diff is, by using the relative path.
    3. The changes are targeted using an absolute path derived from the
    current working directory.
    
    The end-result in case session working directory differs from the binary
    working directory: we get the diff for a file relative to the binary
    working directory, and apply it on a file in the session working
    directory.
  • Update cargo to 2024 edition (#842)
    Some effects of this change:
    - New formatting changes across many files. No functionality changes
    should occur from that.
    - Calls to `set_env` are considered unsafe, since this only happens in
    tests we wrap them in `unsafe` blocks
  • feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
    As stated in `codex-rs/README.md`:
    
    Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
    run it. For a number of users, this runtime requirement inhibits
    adoption: they would be better served by a standalone executable. As
    maintainers, we want Codex to run efficiently in a wide range of
    environments with minimal overhead. We also want to take advantage of
    operating system-specific APIs to provide better sandboxing, where
    possible.
    
    To that end, we are moving forward with a Rust implementation of Codex
    CLI contained in this folder, which has the following benefits:
    
    - The CLI compiles to small, standalone, platform-specific binaries.
    - Can make direct, native calls to
    [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
    [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
    order to support sandboxing on Linux.
    - No runtime garbage collection, resulting in lower memory consumption
    and better, more predictable performance.
    
    Currently, the Rust implementation is materially behind the TypeScript
    implementation in functionality, so continue to use the TypeScript
    implmentation for the time being. We will publish native executables via
    GitHub Releases as soon as we feel the Rust version is usable.