Commit Graph

7 Commits

  • exec-server: additional context for errors (#7935)
    Add a .context() on some exec-server errors for debugging CI flakes.
    
    Also, set `"login": false` in the test so that it is not affected by the
    user's profile.
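    
    To illustrate why attaching context helps when chasing CI flakes, here
    is a minimal, std-only Rust sketch. The `Context` trait and
    `ContextError` type are hypothetical stand-ins written for this example
    (the commit does not say which crate provides `.context()`), and
    `read_config` and its path are made up.
    
    ```rust
    use std::fmt;
    use std::fs;
    use std::io;

    /// Hypothetical error type pairing a human-readable message with the
    /// underlying I/O error. Not the type used by exec-server.
    #[derive(Debug)]
    struct ContextError {
        context: String,
        source: io::Error,
    }

    impl fmt::Display for ContextError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "{}: {}", self.context, self.source)
        }
    }

    impl std::error::Error for ContextError {
        fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
            Some(&self.source)
        }
    }

    /// Hypothetical extension trait that adds `.context()` to `io::Result`.
    trait Context<T> {
        fn context(self, msg: &str) -> Result<T, ContextError>;
    }

    impl<T> Context<T> for Result<T, io::Error> {
        fn context(self, msg: &str) -> Result<T, ContextError> {
            self.map_err(|source| ContextError {
                context: msg.to_string(),
                source,
            })
        }
    }

    /// Illustrative helper; the message and path are assumptions.
    fn read_config(path: &str) -> Result<String, ContextError> {
        fs::read_to_string(path).context("failed to read exec-server test config")
    }

    fn main() {
        // The file is expected to be missing, so the error path is exercised.
        if let Err(err) = read_config("/nonexistent/config.json") {
            eprintln!("{}", err);
        }
    }
    ```
    
    With the added message, a failure in a CI log names the step that failed
    as well as the low-level I/O error.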
  • fix: policy/*.codexpolicy -> rules/*.rules (#7888)
    We decided that `*.rules` is a more fitting (and concise) file extension
    than `*.codexpolicy`, so we are changing the file extension for the
    "execpolicy" effort. We are also changing the subfolder of `$CODEX_HOME`
    from `policy` to `rules` to match.
    
    This PR updates the in-repo docs and we will update the public docs once
    the next CLI release goes out.
    
    Locally, I created `~/.codex/rules/default.rules` with the following
    contents:
    
    ```
    prefix_rule(pattern=["gh", "pr", "view"])
    ```
    
    And then I asked Codex to run:
    
    ```
    gh pr view 7888 --json title,body,comments
    ```
    
    and it was able to!
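    
    For readers unfamiliar with prefix rules, the matching behavior seen
    above can be modeled roughly as follows. This is a simplified sketch
    under the assumption that a rule matches when the command's argv starts
    with the pattern's tokens; `PrefixRule` and `matches` are hypothetical
    names, and the real execpolicy implementation may support more than
    this.
    
    ```rust
    /// Hypothetical model of a prefix rule: a command matches when its argv
    /// starts with every token of `pattern`, in order.
    struct PrefixRule {
        pattern: Vec<String>,
    }

    impl PrefixRule {
        fn new(pattern: &[&str]) -> Self {
            PrefixRule {
                pattern: pattern.iter().map(|s| s.to_string()).collect(),
            }
        }

        /// Returns true when `argv` begins with the rule's pattern.
        fn matches(&self, argv: &[&str]) -> bool {
            argv.len() >= self.pattern.len()
                && self
                    .pattern
                    .iter()
                    .zip(argv)
                    .all(|(want, got)| want.as_str() == *got)
        }
    }

    fn main() {
        let rule = PrefixRule::new(&["gh", "pr", "view"]);
        let viewed = ["gh", "pr", "view", "7888", "--json", "title,body,comments"];
        let merged = ["gh", "pr", "merge", "7888"];
        // The command from this PR description starts with the pattern.
        println!("{}", rule.matches(&viewed));
        // A different `gh pr` subcommand does not.
        println!("{}", rule.matches(&merged));
    }
    ```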
  • fix: add a hopefully-temporary sleep to reduce test flakiness (#7848)
    Let's see if this `sleep()` call is good enough to fix the test
    flakiness we currently see in CI. It will take me some time to upstream
    a proper fix, and I would prefer not to disable this test in the
    interim.
  • fix: ensure accept_elicitation_for_prompt_rule() test passes locally (#7832)
    When I originally introduced `accept_elicitation_for_prompt_rule()` in
    https://github.com/openai/codex/pull/7617, it worked for me locally
    because I had run `codex-rs/exec-server/tests/suite/bash` once myself,
    which had the side-effect of installing the corresponding DotSlash
    artifact.
    
    In CI, I added explicit logic to do this as part of
    `.github/workflows/rust-ci.yml`, which meant the test also passed in CI,
    but this logic should have been done as part of the test so that it
    would work locally for devs who had not installed the DotSlash artifact
    for `codex-rs/exec-server/tests/suite/bash` before. This PR updates the
    test to do this (and deletes the setup logic from `rust-ci.yml`),
    creating a new `DOTSLASH_CACHE` in a temp directory so that this is
    handled independently for each test.
    
    While here, also added a check to ensure that the `codex` binary has
    been built prior to running the test, as we have to ensure it is
    symlinked as `codex-linux-sandbox` on Linux in order for the integration
    test to work on that platform.
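    
    The per-test cache setup described above could look roughly like the
    following std-only sketch. It is not the code from this PR:
    `fresh_cache_dir` and `command_with_cache` are hypothetical helpers, and
    the directory naming scheme is an assumption. Only the `DOTSLASH_CACHE`
    variable and the DotSlash file path come from the description.
    
    ```rust
    use std::env;
    use std::fs;
    use std::io;
    use std::path::{Path, PathBuf};
    use std::process::Command;
    use std::sync::atomic::{AtomicU64, Ordering};

    /// Counter that keeps directory names unique within one test process.
    static NEXT_ID: AtomicU64 = AtomicU64::new(0);

    /// Creates a fresh directory under the system temp dir. The naming
    /// scheme is an assumption made for this sketch.
    fn fresh_cache_dir() -> io::Result<PathBuf> {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let name = format!("dotslash-cache-{}-{}", std::process::id(), id);
        let dir = env::temp_dir().join(name);
        fs::create_dir_all(&dir)?;
        Ok(dir)
    }

    /// Builds a command that runs `program` with `DOTSLASH_CACHE` pointing
    /// at `cache_dir`, so each test fetches its artifacts independently.
    fn command_with_cache(program: &Path, cache_dir: &Path) -> Command {
        let mut cmd = Command::new(program);
        cmd.env("DOTSLASH_CACHE", cache_dir);
        cmd
    }

    fn main() -> io::Result<()> {
        let cache = fresh_cache_dir()?;
        // The command is only constructed here, not spawned.
        let bash = Path::new("codex-rs/exec-server/tests/suite/bash");
        let cmd = command_with_cache(bash, &cache);
        println!("{:?}", cmd);
        fs::remove_dir_all(&cache)
    }
    ```
    
    Giving each test its own cache directory trades some repeated downloads
    for isolation between tests.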
  • fix: allow sendmsg(2) and recvmsg(2) syscalls in our Linux sandbox (#7779)
    This changes our default Landlock policy to allow `sendmsg(2)` and
    `recvmsg(2)` syscalls. We believe these were originally denied out of an
    abundance of caution, but given that `send(2)` and `recv(2)` are allowed
    today [which provide comparable capability to the `*msg` equivalents],
    we do not believe allowing them grants any privileges beyond what we
    already allow.
    
    Rather than using the syscall as the security boundary, preventing
    access to the potentially hazardous file descriptor in the first place
    seems like the right layer of defense.
    
    In particular, this makes it possible for `shell-tool-mcp` to run on
    Linux when using a read-only sandbox for the Bash process, as
    demonstrated by `accept_elicitation_for_prompt_rule()` now succeeding in
    CI.
  • fix: add integration tests for codex-exec-mcp-server with execpolicy (#7617)
    This PR introduces integration tests that run
    [codex-shell-tool-mcp](https://www.npmjs.com/package/@openai/codex-shell-tool-mcp)
    as a user would. Note that this requires running our fork of Bash, so we
    introduce a [DotSlash](https://dotslash-cli.com/) file for `bash` so
    that we can run the integration tests on multiple platforms without
    having to check the binaries into the repository. (As noted in the
    DotSlash file, it is slightly more heavyweight than necessary, which may
    be worth addressing as disk space in CI is limited:
    https://github.com/openai/codex/pull/7678.)
    
    To start, this PR adds two tests:
    
    - `list_tools()` makes the `list_tools` request to the MCP server and
    verifies we get the expected response
    - `accept_elicitation_for_prompt_rule()` defines a `prefix_rule()` with
    `decision="prompt"` and verifies the elicitation flow works as expected
    
    However, the `accept_elicitation_for_prompt_rule()` test currently **only
    works on macOS**, as this PR reveals that there are issues when running
    the Bash fork in a read-only sandbox on Linux. This will have to be
    fixed in a follow-up PR.
    
    Incidentally, getting this test to run correctly on macOS also requires
    a recent fix we made to `brew` that hasn't hit a mainline release yet,
    so getting CI green in this PR required
    https://github.com/openai/codex/pull/7680.