Commit Graph

23 Commits

  • feat: stream exec stdout events (#1786)
    ## Summary
    - stream command stdout as `ExecCommandStdout` events
    - forward streamed stdout to clients and ignore in human output
    processor
    - adjust call sites for new streaming API
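    
    A minimal sketch of the streaming idea, not the code from this PR: the reader forwards each chunk as an event as soon as it arrives instead of buffering until the command exits. Only the `ExecCommandStdout` name comes from the summary above; the `ExecEvent` enum, its fields, and `stream_stdout()` are hypothetical.
    
    ```rust
    use std::io::{self, Read};
    use std::sync::mpsc::Sender;

    /// Hypothetical event type; only the `ExecCommandStdout` name is from the PR.
    #[derive(Debug, PartialEq)]
    pub enum ExecEvent {
        ExecCommandStdout { call_id: String, chunk: Vec<u8> },
    }

    /// Reads `stdout` chunk by chunk and sends each chunk as an event.
    /// Returns the total number of bytes read.
    pub fn stream_stdout<R: Read>(
        mut stdout: R,
        call_id: &str,
        tx: &Sender<ExecEvent>,
    ) -> io::Result<usize> {
        let mut buf = [0u8; 8192];
        let mut total = 0;
        loop {
            let n = stdout.read(&mut buf)?;
            if n == 0 {
                return Ok(total); // EOF: the child closed its stdout.
            }
            total += n;
            let event = ExecEvent::ExecCommandStdout {
                call_id: call_id.to_string(),
                chunk: buf[..n].to_vec(),
            };
            // A closed receiver means nobody is listening; stop forwarding.
            if tx.send(event).is_err() {
                return Ok(total);
            }
        }
    }
    ```
    
    A consumer that renders output for humans can simply skip these events, while a client that wants live output reads them from the channel.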
  • chore: refactor exec.rs: create separate seatbelt.rs and spawn.rs files (#1762)
    At 550 lines, `exec.rs` was a bit large. In particular, I found it hard
    to locate the Seatbelt-related code quickly without a file with
    `seatbelt` in the name, so this PR refactors things as follows:
    
    - `spawn_command_under_seatbelt()` and dependent code moves to a new
    `seatbelt.rs` file
    - `spawn_child_async()` and dependent code moves to a new `spawn.rs`
    file
  • fix: use PR_SET_PDEATHSIG so to ensure child processes are killed in a timely manner (#1626)
    Some users have reported issues where child processes are not cleaned up
    after Codex exits (e.g., https://github.com/openai/codex/issues/1570).
    
    This is generally a tricky issue on operating systems: if a parent
    process receives `SIGKILL`, then it terminates immediately and cannot
    communicate with the child.
    
    **It only helps on Linux**, but this PR introduces the use of `prctl(2)`
    so that if the parent process dies, `SIGTERM` will be delivered to the
    child process. Previously, I believe that if Codex spawned a
    long-running process (like `tsc --watch`) and the Codex process received
    `SIGKILL`, the `tsc --watch` process would be reparented to the init
    process and would never be killed. Now with the use of `prctl(2)`, the
    `tsc --watch` process should receive `SIGTERM` in that scenario.
    
    We still need to come up with a solution for macOS. I've started to look
    at `launchd`, but I'm researching a number of options.
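    
    A sketch of the `prctl(2)` approach, assuming it is applied in a `pre_exec` hook; the function name is hypothetical and `prctl` is declared by hand only to keep the example dependency-free (the `libc` crate exposes the same call). The constants are the Linux values of `PR_SET_PDEATHSIG` and `SIGTERM`.
    
    ```rust
    use std::process::Command;

    /// Returns a `Command` whose child asks the Linux kernel to send it
    /// SIGTERM when the parent dies. Hypothetical helper, not the PR's code.
    #[cfg(target_os = "linux")]
    pub fn command_with_parent_death_signal(program: &str) -> Command {
        use std::io;
        use std::os::raw::{c_int, c_ulong};
        use std::os::unix::process::CommandExt;

        // Values from <linux/prctl.h> and <signal.h> on Linux.
        const PR_SET_PDEATHSIG: c_int = 1;
        const SIGTERM: c_ulong = 15;

        unsafe extern "C" {
            // Hand-written declaration of the libc function.
            fn prctl(option: c_int, ...) -> c_int;
        }

        let mut cmd = Command::new(program);
        let hook = || -> io::Result<()> {
            // Runs in the child after fork() and before exec().
            let rc = unsafe { prctl(PR_SET_PDEATHSIG, SIGTERM) };
            if rc == -1 {
                return Err(io::Error::last_os_error());
            }
            Ok(())
        };
        // SAFETY: the hook only issues a single syscall.
        unsafe { cmd.pre_exec(hook) };
        cmd
    }

    /// On other platforms this sketch falls back to a plain `Command`.
    #[cfg(not(target_os = "linux"))]
    pub fn command_with_parent_death_signal(program: &str) -> Command {
        Command::new(program)
    }
    ```
    
    One caveat of this pattern: if the parent dies between `fork()` and the `prctl()` call, no signal is delivered, so a robust version would also compare `getppid()` against the expected parent afterwards.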
  • feat: redesign sandbox config (#1373)
    This is a major redesign of how sandbox configuration works and aims to
    fix https://github.com/openai/codex/issues/1248. Specifically, it
    replaces `sandbox_permissions` in `config.toml` (and the
    `-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
    three variants:
    
    ```toml
    # Safest option: the full disk is readable, but writes and network access are disallowed.
    [sandbox]
    mode = "read-only"
    
    # The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
    # writable_roots can be used to specify additional writable folders.
    [sandbox]
    mode = "workspace-write"
    writable_roots = []  # Optional, defaults to the empty list.
    network_access = false  # Optional, defaults to false.
    
    # Disable sandboxing: use at your own risk!!!
    [sandbox]
    mode = "danger-full-access"
    ```
    
    This should make sandboxing easier to reason about. While we have
    dropped support for `-s`, the way it works now is:
    
    - no flags => `read-only`
    - `--full-auto` => `workspace-write`
    - currently, there is no way to specify `danger-full-access` via a CLI
    flag, but we will revisit that as part of
    https://github.com/openai/codex/issues/1254
    
    Outstanding issue:
    
    - As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
    still conflating sandbox preferences with approval preferences in that
    case, which needs to be cleaned up.
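    
    To make the three variants and the flag mapping concrete, here is a rough model in Rust. The type and function names are hypothetical; only the mode names, `writable_roots`, `network_access`, and the flag behavior come from the description above.
    
    ```rust
    use std::path::PathBuf;

    /// Mirrors the three `mode` values of the `[sandbox]` table.
    #[derive(Debug, Clone, PartialEq)]
    pub enum SandboxMode {
        ReadOnly,
        WorkspaceWrite {
            writable_roots: Vec<PathBuf>,
            network_access: bool,
        },
        DangerFullAccess,
    }

    /// Maps CLI flags to a mode: no flags => read-only,
    /// `--full-auto` => workspace-write with the documented defaults.
    pub fn mode_from_flags(full_auto: bool) -> SandboxMode {
        if full_auto {
            SandboxMode::WorkspaceWrite {
                writable_roots: Vec::new(),
                network_access: false,
            }
        } else {
            SandboxMode::ReadOnly
        }
    }

    impl SandboxMode {
        /// Whether commands run under this mode may use the network.
        pub fn allows_network(&self) -> bool {
            match self {
                SandboxMode::ReadOnly => false,
                SandboxMode::WorkspaceWrite { network_access, .. } => *network_access,
                SandboxMode::DangerFullAccess => true,
            }
        }
    }
    ```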
  • fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086)
    Historically, we spawned the Seatbelt and Landlock sandboxes in
    substantially different ways:
    
    For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
    specified as an arg followed by the original command:
    
    
    https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec.rs#L147-L219
    
    For **Landlock/Seccomp**, we would do
    `tokio::runtime::Builder::new_current_thread()`, _invoke
    Landlock/Seccomp APIs to modify the permissions of that new thread_, and
    then spawn the command:
    
    
    https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec_linux.rs#L28-L49
    
    While it is neat that Landlock/Seccomp supports applying a policy to
    only one thread without having to apply it to the entire process, it
    requires us to maintain two different codepaths and is a bit harder to
    reason about. The tipping point was
    https://github.com/openai/codex/pull/1061, in which we had to start
    building up the `env` in an unexpected way for the existing
    Landlock/Seccomp approach to continue to work.
    
    This PR overhauls things so that we do similar things for Mac and Linux.
    It turned out that we were already building our own "helper binary"
    comparable to Mac's `sandbox-exec` as part of the `cli` crate:
    
    
    https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/cli/Cargo.toml#L10-L12
    
    We originally created this to build a small binary to include with the
    Node.js version of the Codex CLI to provide support for Linux
    sandboxing.
    
    The sticking point, though, is that, at this point, we still want to deploy
    the Rust version of Codex as a single, standalone binary rather than a
    CLI and a supporting sandboxing binary. To satisfy this goal, we use
    "the arg0 trick," in which we:
    
    * use `std::env::current_exe()` to get the path to the CLI that is
    currently running
    * use the CLI as the `program` for the `Command`
    * set `"codex-linux-sandbox"` as arg0 for the `Command`
    
    A CLI that supports sandboxing should check arg0 at the start of the
    program. If it is `"codex-linux-sandbox"`, it must invoke
    `codex_linux_sandbox::run_main()`, which runs the CLI as if it were
    `codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
    appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn
    the original command, so we do _replace_ the process rather than spawn a
    subprocess. Incidentally, we do this before starting the Tokio runtime,
    so the process should only have one thread when `execvp(3)` is called.
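    
    The arg0 trick can be sketched roughly as follows. The helper names are hypothetical and the argument layout passed to the sandbox is an assumption; only the `"codex-linux-sandbox"` arg0 value and the use of the current executable as `program` come from the description above.
    
    ```rust
    use std::ffi::OsStr;
    use std::path::Path;
    use std::process::Command;

    const SANDBOX_ARG0: &str = "codex-linux-sandbox";

    /// True when the process was started under the sandbox helper's name.
    /// A CLI would call this with `std::env::args_os().next()` first thing
    /// in `main()` and hand off to the sandbox entry point if it matches.
    pub fn invoked_as_sandbox(arg0: &OsStr) -> bool {
        Path::new(arg0).file_name() == Some(OsStr::new(SANDBOX_ARG0))
    }

    /// Builds a command that re-runs `exe` (normally the current executable)
    /// with arg0 overridden, followed by the original command line.
    #[cfg(unix)]
    pub fn sandbox_command(exe: &Path, original: &[String]) -> Command {
        use std::os::unix::process::CommandExt;
        let mut cmd = Command::new(exe);
        cmd.arg0(SANDBOX_ARG0).args(original);
        cmd
    }
    ```
    
    Because the check only looks at the final path component, it works whether arg0 is a bare name or a full path.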
    
    Because the `core` crate that needs to spawn the Linux sandboxing is not
    a CLI in its own right, this means that every CLI that includes `core`
    and relies on this behavior has to (1) implement it and (2) provide the
    path to the sandboxing executable. While the path is almost always
    `std::env::current_exe()`, we needed to make this configurable for
    integration tests, so `Config` now has a `codex_linux_sandbox_exe:
    Option<PathBuf>` property to facilitate threading this through,
    introduced in https://github.com/openai/codex/pull/1089.
    
    This common pattern is now captured in
    `codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
    functions that should use it have been updated as part of this PR.
    
    The `codex-linux-sandbox` crate added to the Cargo workspace as part of
    this PR now has the bulk of the Landlock/Seccomp logic, which makes
    `core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
    `core/src/landlock.rs` were removed/ported as part of this PR. I also
    moved the unit tests for this code into an integration test,
    `linux-sandbox/tests/landlock.rs`, in which I use
    `env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
    `codex_linux_sandbox_exe` since `std::env::current_exe()` is not
    appropriate in that case.
  • feat: introduce support for shell_environment_policy in config.toml (#1061)
    To date, when handling `shell` and `local_shell` tool calls, we were
    spawning new processes using the environment inherited from the Codex
    process itself. This means that the sensitive `OPENAI_API_KEY` that
    Codex needs to talk to OpenAI models was made available to everything
    run by `shell` and `local_shell`. While there are cases where that might
    be useful, it does not seem like a good default.
    
    This PR introduces a `shell_environment_policy` config option to
    control the `env` used with these tool calls. It is inevitably a bit
    complex so that it is possible to override individual components of the
    policy without having to restate the entire thing.
    
    Details are in the updated `README.md` in this PR, but here is the
    relevant bit that explains the individual fields of
    `shell_environment_policy`:
    
    | Field | Type | Default | Description |
    | ------------------------- | -------------------------- | ------- | ----------- |
    | `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
    | `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
    | `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. |
    | `set` | table&lt;string,string&gt; | `{}` | Explicit key/value overrides or additions – always win over inherited values. |
    | `include_only` | array&lt;string&gt; | `[]` | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) |
    
    
    In particular, note that the default is `inherit = "core"`, so:
    
    * if you have extra env variables that you want to inherit from the
    parent process, use `inherit = "all"` and then specify `include_only`
    * if you have extra env variables where you want to hardcode the values,
    the default `inherit = "core"` will work fine, but then you need to
    specify `set`
    
    This configuration is not battle-tested, so we will probably still have
    to play with it a bit. `core/src/exec_env.rs` has the critical business
    logic as well as unit tests.
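    
    As a rough illustration of how the fields in the table compose, here is a simplified sketch. It is not the logic in `core/src/exec_env.rs`: the type and function names are hypothetical, the `core` list is cut down to the three variables named in the table, and the glob matcher only understands `*`.
    
    ```rust
    use std::collections::BTreeMap;

    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum Inherit {
        Core,
        All,
        None,
    }

    #[derive(Debug)]
    pub struct ShellEnvironmentPolicy {
        pub inherit: Inherit,
        pub ignore_default_excludes: bool,
        pub exclude: Vec<String>,
        pub set: BTreeMap<String, String>,
        pub include_only: Vec<String>,
    }

    impl Default for ShellEnvironmentPolicy {
        fn default() -> Self {
            Self {
                inherit: Inherit::Core,
                ignore_default_excludes: false,
                exclude: Vec::new(),
                set: BTreeMap::new(),
                include_only: Vec::new(),
            }
        }
    }

    /// Case-insensitive glob match supporting only `*` (an assumed subset).
    fn glob_match(pattern: &str, name: &str) -> bool {
        fn go(p: &[char], n: &[char]) -> bool {
            match p.split_first() {
                None => n.is_empty(),
                Some((&'*', rest)) => (0..=n.len()).any(|i| go(rest, &n[i..])),
                Some((c, rest)) => n
                    .split_first()
                    .map_or(false, |(d, tail)| c == d && go(rest, tail)),
            }
        }
        let p: Vec<char> = pattern.to_lowercase().chars().collect();
        let n: Vec<char> = name.to_lowercase().chars().collect();
        go(&p, &n)
    }

    pub fn build_env(
        parent: &BTreeMap<String, String>,
        policy: &ShellEnvironmentPolicy,
    ) -> BTreeMap<String, String> {
        // Assumed subset of the `core` template; the real list is longer.
        const CORE: [&str; 3] = ["HOME", "PATH", "USER"];
        // 1. Starting template.
        let mut env: BTreeMap<String, String> = match policy.inherit {
            Inherit::All => parent.clone(),
            Inherit::None => BTreeMap::new(),
            Inherit::Core => parent
                .iter()
                .filter(|(k, _)| CORE.contains(&k.as_str()))
                .map(|(k, v)| (k.clone(), v.clone()))
                .collect(),
        };
        // 2. Default excludes: names containing KEY, SECRET, or TOKEN.
        if !policy.ignore_default_excludes {
            env.retain(|k, _| {
                let upper = k.to_uppercase();
                !["KEY", "SECRET", "TOKEN"].iter().any(|s| upper.contains(*s))
            });
        }
        // 3. User-supplied exclude patterns.
        env.retain(|k, _| !policy.exclude.iter().any(|p| glob_match(p, k)));
        // 4. Explicit values always win over inherited ones.
        env.extend(policy.set.iter().map(|(k, v)| (k.clone(), v.clone())));
        // 5. If present, the include_only list is the final filter.
        if !policy.include_only.is_empty() {
            env.retain(|k, _| policy.include_only.iter().any(|p| glob_match(p, k)));
        }
        env
    }
    ```
    
    With the default policy, a parent environment containing `OPENAI_API_KEY` yields a child environment without it, which is the behavior this PR is after.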
    
    Though if nothing else, previous to this change:
    
    ```
    $ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
    # ...prints OPENAI_API_KEY...
    ```
    
    But after this change it does not print anything (as desired).
    
    One final thing to call out about this PR is that the
    `configure_command!` macro we use in `core/src/exec.rs` has to do some
    complex logic with respect to how it builds up the `env` for the process
    being spawned under Landlock/seccomp. Specifically, doing
    `cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
    the most intuitive way to do it) caused the Landlock unit tests to fail
    because the processes spawned by the unit tests started failing in
    unexpected ways! If we forgo `env_clear()` in favor of updating env vars
    one at a time, the tests still pass. The comment in the code talks about
    this a bit, and while I would like to investigate this more, I need to
    move on for the moment, but I do plan to come back to it to fully
    understand what is going on. For example, this suggests that we might
    not be able to spawn a C program that calls `env_clear()`, which would
    be...weird. We may still have to fiddle with our Landlock config if that
    is the case.
  • chore: pin Rust version to 1.86 and use io::Error::other to prepare for 1.87 (#947)
    Previously, our GitHub actions specified the Rust toolchain as
    `dtolnay/rust-toolchain@stable`, which meant the version could change
    out from under us. In this case, the move from 1.86 to 1.87 introduced
    new clippy warnings, causing build failures.
    
    Because it will take a little time to fix all the new clippy warnings,
    this PR pins things to 1.86 for now to unbreak the build.
    
    It also replaces `io::Error::new(io::ErrorKind::Other)` with
    `io::Error::other()` in preparation for 1.87.
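    
    For reference, the two spellings side by side (the wrapper function names are just for illustration). Both produce an error of kind `Other` with the same message:
    
    ```rust
    use std::io;

    /// The longer spelling that this PR replaces.
    pub fn wrap_old(msg: &str) -> io::Error {
        io::Error::new(io::ErrorKind::Other, msg.to_string())
    }

    /// The equivalent shorthand.
    pub fn wrap_new(msg: &str) -> io::Error {
        io::Error::other(msg.to_string())
    }
    ```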
  • Add codespell support (config, workflow to detect/not fix) and make it fix some typos (#903)
    More about codespell: https://github.com/codespell-project/codespell .
    
    I have personally introduced it to dozens if not hundreds of projects
    already, and so far have received only positive feedback.
    
    The CI workflow has 'permissions' set only to 'read', so it should also be safe.
    
    Let me know if you just want to take the typo fixes and get rid of the CI workflow.
    
    ---------
    
    Signed-off-by: Yaroslav O. Halchenko <debian@onerussian.com>
  • Disallow expect via lints (#865)
    Adds `expect()` as a denied lint. The same deal applies as with `unwrap()`:
    we now need to put `#[expect(...)]` on the ones that we legitimately want. Took
    care to allow `expect()` in test contexts.
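    
    A small illustration of what this looks like at a call site, assuming the lint in question is Clippy's `clippy::expect_used`; the functions are made up for the example.
    
    ```rust
    use std::num::ParseIntError;

    /// Preferred under the lint: propagate the error instead of panicking.
    pub fn parse_port(s: &str) -> Result<u16, ParseIntError> {
        s.trim().parse::<u16>()
    }

    /// A call site where panicking is judged acceptable, opted out explicitly.
    #[expect(clippy::expect_used)]
    pub fn default_port() -> u16 {
        "8080".parse::<u16>().expect("literal is a valid port")
    }
    ```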
    
    # Tests
    
    ```
    cargo fmt
    cargo clippy --all-features --all-targets --no-deps -- -D warnings
    cargo test
    ```
  • feat: experimental env var: CODEX_SANDBOX_NETWORK_DISABLED (#879)
    When using Codex to develop Codex itself, I noticed that sometimes it
    would try to add `#[ignore]` to the following tests:
    
    ```
    keeps_previous_response_id_between_tasks()
    retries_on_early_close()
    ```
    
    Both of these tests start a `MockServer` that launches an HTTP server on
    an ephemeral port and requires network access to hit it, which the
    Seatbelt policy associated with `--full-auto` correctly denies. If I
    wasn't paying attention to the code that Codex was generating, one of
    these `#[ignore]` annotations could have slipped into the codebase,
    effectively disabling the test for everyone.
    
    To that end, this PR enables an experimental environment variable named
    `CODEX_SANDBOX_NETWORK_DISABLED` that is set to `1` if the
    `SandboxPolicy` used to spawn the process does not have full network
    access. I say it is "experimental" because I'm not convinced this API is
    quite right, but we need to start somewhere. (It might be more
    appropriate to have an env var like `CODEX_SANDBOX=full-auto`, but the
    challenge is that our newer `SandboxPolicy` abstraction does not map to
    a simple set of enums like in the TypeScript CLI.)
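    
    On the spawning side, the idea is roughly the following. This is a sketch: the `SandboxPolicy` struct here is a hypothetical stand-in with a single field, and `tag_network_state()` is not a function from this PR.
    
    ```rust
    use std::process::Command;

    pub const CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR: &str = "CODEX_SANDBOX_NETWORK_DISABLED";

    /// Hypothetical stand-in for the real `SandboxPolicy`.
    pub struct SandboxPolicy {
        pub full_network_access: bool,
    }

    /// Sets the env var to `1` on the child when the policy does not grant
    /// full network access; otherwise leaves the command untouched.
    pub fn tag_network_state(cmd: &mut Command, policy: &SandboxPolicy) {
        if !policy.full_network_access {
            cmd.env(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR, "1");
        }
    }
    ```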
    
    We leverage this new functionality by adding the following code to the
    aforementioned tests as a way to "dynamically disable" them:
    
    ```rust
    if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
        println!(
            "Skipping test because it cannot execute when network is disabled in a Codex sandbox."
        );
        return;
    }
    ```
    
    We can use the `debug seatbelt --full-auto` command to verify that
    `cargo test` fails when run under Seatbelt prior to this change:
    
    ```
    $ cargo run --bin codex -- debug seatbelt --full-auto -- cargo test
    ---- keeps_previous_response_id_between_tasks stdout ----
    
    thread 'keeps_previous_response_id_between_tasks' panicked at /Users/mbolin/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/wiremock-0.6.3/src/mock_server/builder.rs:107:46:
    Failed to bind an OS port for a mock server.: Os { code: 1, kind: PermissionDenied, message: "Operation not permitted" }
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    
    
    failures:
        keeps_previous_response_id_between_tasks
    
    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
    
    error: test failed, to rerun pass `-p codex-core --test previous_response_id`
    ```
    
    Though after this change, the above command succeeds! This means that,
    going forward, when Codex operates on Codex itself, when it runs `cargo
    test`, only "real failures" should cause the command to fail.
    
    As part of this change, I decided to tighten up the codepaths for
    running `exec()` for shell tool calls. In particular, we do it in `core`
    for the main Codex business logic itself, but we also expose this logic
    via `debug` subcommands in the CLI in the `cli` crate. The logic for the
    `debug` subcommands was not quite as faithful to the true business logic
    as I liked, so I:
    
    * refactored a bit of the Linux code, splitting `linux.rs` into
    `linux_exec.rs` and `landlock.rs` in the `core` crate.
    * gated less code behind `#[cfg(target_os = "linux")]` because such
    code does not get built by default when I develop on Mac, which means I
    either have to build the code in Docker or wait for CI signal
    * introduced `macro_rules! configure_command` in `exec.rs` so we can
    have both sync and async versions of this code. The synchronous version
    seems more appropriate for straight threads or potentially fork/exec.
  • chore: refactor exec() into spawn_child() and consume_truncated_output() (#878)
    This PR is a straight refactor so that creating the `Child` process for
    a `shell` tool call and consuming its output can be separate concerns.
    For the actual tool call, we will always apply
    `consume_truncated_output()`, but for the top-level debug commands in
    the CLI (e.g., `debug seatbelt` and `debug landlock`), we only want to
    use the `spawn_child()` part of `exec()`.
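    
    The split can be pictured like this. The signatures are simplified guesses for illustration, not the ones in `exec.rs`: spawning only creates the `Child`, and a separate function drains a stream while keeping at most `max_bytes` of it.
    
    ```rust
    use std::io::{self, Read};
    use std::process::{Child, Command, Stdio};

    /// Creates the child with piped stdout and does nothing else.
    pub fn spawn_child(program: &str, args: &[&str]) -> io::Result<Child> {
        Command::new(program)
            .args(args)
            .stdout(Stdio::piped())
            .spawn()
    }

    /// Reads the stream to EOF but keeps only the first `max_bytes`.
    /// Draining the rest avoids blocking a child that is still writing.
    pub fn consume_truncated_output<R: Read>(
        mut reader: R,
        max_bytes: usize,
    ) -> io::Result<Vec<u8>> {
        let mut kept = Vec::new();
        let mut buf = [0u8; 8192];
        loop {
            let n = reader.read(&mut buf)?;
            if n == 0 {
                return Ok(kept);
            }
            let room = max_bytes - kept.len();
            kept.extend_from_slice(&buf[..n.min(room)]);
        }
    }
    ```
    
    A debug subcommand would call only `spawn_child()` and let the child inherit or stream its output, whereas the tool call path would pass `child.stdout` to `consume_truncated_output()`.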
    
    We want the subcommands to match the `shell` tool call usage as
    faithfully as possible. This becomes more important when we introduce a
    new parameter to `spawn_child()` in
    https://github.com/openai/codex/pull/879.
    
  • Workspace lints and disallow unwrap (#855)
    Sets submodules to use workspace lints. Added denying unwrap as a
    workspace level lint, which found a couple of cases where we could have
    propagated errors. Also manually labeled ones that were fine by my eye.
  • feat: make cwd a required field of Config so we stop assuming std::env::current_dir() in a session (#800)
    In order to expose Codex via an MCP server, I realized that we should be
    taking `cwd` as a parameter rather than assuming
    `std::env::current_dir()` as the `cwd`. Specifically, the user may want
    to start a session in a directory other than the one where the MCP
    server has been started.
    
    This PR makes `cwd: PathBuf` a required field of `Session` and threads
    it all the way through, though I think there is still an issue with not
    honoring `workdir` for `apply_patch`, which is something we also had to
    fix in the TypeScript version: https://github.com/openai/codex/pull/556.
    
    This also adds `-C`/`--cd` to change the cwd via the command line.
    
    To test, I ran:
    
    ```
    cargo run --bin codex -- exec -C /tmp 'show the output of ls'
    ```
    
    and verified it showed the contents of my `/tmp` folder instead of
    `$PWD`.
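    
    One way to picture how `-C`/`--cd` could resolve to the session's `cwd` (a hypothetical helper, not the code in this PR): an explicit directory wins, a relative one is resolved against the process's directory, and the process's directory is only the fallback.
    
    ```rust
    use std::path::{Path, PathBuf};

    /// Resolves the session cwd from an optional `--cd` value.
    pub fn resolve_cwd(cli_cd: Option<&Path>, process_cwd: &Path) -> PathBuf {
        match cli_cd {
            // `join` returns `dir` unchanged when it is absolute.
            Some(dir) => process_cwd.join(dir),
            None => process_cwd.to_path_buf(),
        }
    }
    ```
    
    The resolved path would then be stored once and passed to every spawned command via `Command::current_dir()` rather than re-reading `std::env::current_dir()`.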
  • fix: overhaul SandboxPolicy and config loading in Rust (#732)
    Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
    
    
    https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
    
    Specifically:
    
    * It was an `enum` and therefore options were mutually exclusive as
    opposed to additive.
    * It defined things in terms of what the agent _could not_ do as opposed
    to what they _could_ do. This made things hard to support because we
    would prefer to build up a sandbox config by starting with something
    extremely restrictive and only granting permissions for things the user
    has explicitly allowed.
    
    This PR changes things substantially by redefining the policy in terms
    of two concepts:
    
    * A `SandboxPermission` enum that defines permissions that can be
    granted to the agent/sandbox.
    * A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
    but externally exposes a simpler API that can be used to configure
    Seatbelt/Landlock.
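    
    A rough sketch of the additive shape described above. The variant names and most methods are hypothetical; `has_full_disk_read_access()` is the one accessor this description mentions by name.
    
    ```rust
    use std::path::PathBuf;

    /// Individual permissions that can be granted (variant names assumed).
    #[derive(Debug, Clone, PartialEq)]
    pub enum SandboxPermission {
        DiskFullReadAccess,
        DiskWriteFolder { folder: PathBuf },
        NetworkFullAccess,
    }

    /// Starts from nothing and only gains what is explicitly granted.
    #[derive(Debug, Clone, Default)]
    pub struct SandboxPolicy {
        permissions: Vec<SandboxPermission>,
    }

    impl SandboxPolicy {
        pub fn grant(mut self, permission: SandboxPermission) -> Self {
            if !self.permissions.contains(&permission) {
                self.permissions.push(permission);
            }
            self
        }

        pub fn has_full_disk_read_access(&self) -> bool {
            self.permissions.contains(&SandboxPermission::DiskFullReadAccess)
        }

        pub fn has_full_network_access(&self) -> bool {
            self.permissions.contains(&SandboxPermission::NetworkFullAccess)
        }

        /// Folders the sandbox should allow writes to.
        pub fn writable_folders(&self) -> Vec<&PathBuf> {
            self.permissions
                .iter()
                .filter_map(|p| match p {
                    SandboxPermission::DiskWriteFolder { folder } => Some(folder),
                    _ => None,
                })
                .collect()
        }
    }
    ```
    
    The Seatbelt and Landlock backends would then query only the accessor methods, never the raw vector.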
    
    Previous to this PR, we supported a `--sandbox` flag that effectively
    mapped to an enum value in `SandboxPolicy`. Though now that
    `SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
    `--sandbox` flag no longer makes sense. While I could have turned it
    into a flag that the user can specify multiple times, I think the
    current values to use with such a flag are long and potentially messy,
    so for the moment, I have dropped support for `--sandbox` altogether and
    we can bring it back once we have figured out the naming thing.
    
    Since `--sandbox` is gone, users now have to specify `--full-auto` to
    get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
    way to specify the equivalent of `--full-auto` in your `config.toml`
    right now, so we will have to revisit that, as well.
    
    Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
    changed considerably, I had to overhaul how config loading works, as
    well. There are now two distinct concepts, `ConfigToml` and `Config`:
    
    * `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
    might expect, every field is an `Option` and it is `#[derive(Deserialize,
    Default)]`. Consistent use of `Option` makes it clear what the user
    has specified explicitly.
    * `Config` is the "normalized config" and is produced by merging
    `ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
    `Option<Vec<SandboxPermission>>`, `Config` presents only the final
    `SandboxPolicy`.
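    
    The merge can be sketched as follows, with overrides taking precedence over `config.toml`, which takes precedence over built-in defaults. The fields shown, the `from_parts()` name, and the default strings are placeholders, and `Deserialize` is omitted to keep the example dependency-free.
    
    ```rust
    /// What the user wrote in config.toml: every field is optional.
    #[derive(Debug, Default, Clone)]
    pub struct ConfigToml {
        pub model: Option<String>,
        pub approval_policy: Option<String>,
    }

    /// Values derived from CLI flags: also optional.
    #[derive(Debug, Default, Clone)]
    pub struct ConfigOverrides {
        pub model: Option<String>,
        pub approval_policy: Option<String>,
    }

    /// The normalized config: no `Option`s left.
    #[derive(Debug, PartialEq)]
    pub struct Config {
        pub model: String,
        pub approval_policy: String,
    }

    impl Config {
        pub fn from_parts(toml: ConfigToml, overrides: ConfigOverrides) -> Config {
            Config {
                // Override first, then config.toml, then a placeholder default.
                model: overrides
                    .model
                    .or(toml.model)
                    .unwrap_or_else(|| "default-model".to_string()),
                approval_policy: overrides
                    .approval_policy
                    .or(toml.approval_policy)
                    .unwrap_or_else(|| "default-approval".to_string()),
            }
        }
    }
    ```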
    
    The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
    special attention to ensure we are faithfully mapping the
    `SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
    
    Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
    renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
    `(allow file-read*)` has been removed from the `.sbpl` file as now this
    is added to the policy in `core/src/exec.rs` when
    `sandbox_policy.has_full_disk_read_access()` is `true`.
  • fix: tighten up check for /usr/bin/sandbox-exec (#710)
    * In both TypeScript and Rust, we now invoke `/usr/bin/sandbox-exec`
    explicitly rather than whatever `sandbox-exec` happens to be on the
    `PATH`.
    * Changed `isSandboxExecAvailable` to use `access()` rather than
    `command -v` so that:
      * We only do the check once over the lifetime of the Codex process.
      * The check is specific to `/usr/bin/sandbox-exec`.
      * We now do a syscall rather than incur the overhead of spawning a
      process, dealing with timeouts, etc.
    
    I think there is still room for improvement here where we should move
    the `isSandboxExecAvailable` check earlier in the CLI, ideally right
    after we do arg parsing to verify that we can provide the Seatbelt
    sandbox if that is what the user has requested.
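    
    The "check once, for one specific path" idea looks roughly like this in Rust. Note the substitution: this sketch inspects file metadata instead of calling `access(2)`, and the function names are illustrative rather than taken from either codebase.
    
    ```rust
    use std::path::Path;
    use std::sync::OnceLock;

    const SANDBOX_EXEC: &str = "/usr/bin/sandbox-exec";

    /// True if `path` is a regular file with an execute bit set.
    pub fn is_executable_file(path: &Path) -> bool {
        match std::fs::metadata(path) {
            Ok(meta) => meta.is_file() && has_exec_bit(&meta),
            Err(_) => false,
        }
    }

    #[cfg(unix)]
    fn has_exec_bit(meta: &std::fs::Metadata) -> bool {
        use std::os::unix::fs::PermissionsExt;
        meta.permissions().mode() & 0o111 != 0
    }

    #[cfg(not(unix))]
    fn has_exec_bit(_meta: &std::fs::Metadata) -> bool {
        true
    }

    /// Performs the filesystem check at most once per process and caches it.
    pub fn is_sandbox_exec_available() -> bool {
        static AVAILABLE: OnceLock<bool> = OnceLock::new();
        *AVAILABLE.get_or_init(|| is_executable_file(Path::new(SANDBOX_EXEC)))
    }
    ```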
  • feat: load defaults into Config and introduce ConfigOverrides (#677)
    This changes how instantiating `Config` works and also adds
    `approval_policy` and `sandbox_policy` as fields. The idea is:
    
    * All fields of `Config` have appropriate default values.
    * `Config` is initially loaded from `~/.codex/config.toml`, so values in
    `config.toml` will override those defaults.
    * Clients must instantiate `Config` via
    `Config::load_with_overrides(ConfigOverrides)` where `ConfigOverrides`
    has optional overrides that are expected to be settable based on CLI
    flags.
    
    The `Config` should be defined early in the program and then passed
    down. Now functions like `init_codex()` take fewer individual parameters
    because they can just take a `Config`.
    
    Also, `Config::load()` used to fail silently if `~/.codex/config.toml`
    had a parse error and fell back to the default config. This seemed
    really bad because it wasn't clear why the values in my `config.toml`
    weren't getting picked up. I changed things so that
    `load_with_overrides()` returns `Result<Config>` and verified that the
    various CLIs print a reasonable error if `config.toml` is malformed.
    
    Finally, I also updated the TUI to show which **sandbox** value is being
    used, as we do for other key values like **model** and **approval**.
    This was also a reminder that the various values of `--sandbox` are
    honored on Linux but not macOS today, so I added some TODOs about fixing
    that.
  • fix: small fixes so Codex compiles on Windows (#673)
    Small fixes required:
    
    * `ExitStatusExt` differs because UNIX expects the exit code to be `i32`
    whereas Windows uses `u32`
    * Marking a file "executable only by owner" is a bit more involved on
    Windows. We just do something approximate for now (and add a TODO) to
    get things compiling.
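    
    To illustrate the `ExitStatusExt` difference, here is a hypothetical helper that builds an `ExitStatus` for a given exit code on either platform. On Unix, `from_raw()` takes an `i32` wait status (on Linux and macOS a normal exit code sits in bits 8..16), while on Windows it takes the `u32` exit code directly.
    
    ```rust
    use std::process::ExitStatus;

    #[cfg(unix)]
    pub fn synthetic_exit_status(code: u8) -> ExitStatus {
        use std::os::unix::process::ExitStatusExt;
        // Encode "exited normally with `code`" as a wait status.
        ExitStatus::from_raw(i32::from(code) << 8)
    }

    #[cfg(windows)]
    pub fn synthetic_exit_status(code: u8) -> ExitStatus {
        use std::os::windows::process::ExitStatusExt;
        ExitStatus::from_raw(u32::from(code))
    }
    ```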
    
    I created this PR on my personal Windows machine and `cargo test` and
    `cargo clippy` succeed. Once this is in, I'll rebase
    https://github.com/openai/codex/pull/665 on top so Windows stays fixed!
  • [codex-rs] Improve linux sandbox timeouts (#662)
    * Fixes flaking rust unit test
    * Adds explicit sandbox exec timeout handling
  • [codex-rs] Reliability pass on networking (#658)
    We currently see a behavior that looks like this:
    ```
    2025-04-25T16:52:24.552789Z  WARN codex_core::codex: stream disconnected - retrying turn (1/10 in 232ms)...
    codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 1/10 in 232ms…" }
    2025-04-25T16:52:54.789885Z  WARN codex_core::codex: stream disconnected - retrying turn (2/10 in 418ms)...
    codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 2/10 in 418ms…" }
    ```
    
    This PR contains a few different fixes that attempt to resolve/improve
    this:
    1. **Remove overall client timeout.** I think
    [this](https://github.com/openai/codex/pull/658/files#diff-c39945d3c42f29b506ff54b7fa2be0795b06d7ad97f1bf33956f60e3c6f19c19L173)
    is perhaps the big fix -- it looks to me like this was actually timing
    out even if events were still coming through, and that was causing a
    disconnect right in the middle of a healthy stream.
    2. **Cap response sizes.** We were frequently sending MUCH larger
    responses than the upstream TypeScript `codex`, and that was definitely
    not helping. [Fix
    here](https://github.com/openai/codex/pull/658/files#diff-d792bef59aa3ee8cb0cbad8b176dbfefe451c227ac89919da7c3e536a9d6cdc0R21-R26)
    for that one.
    3. **Much higher idle timeout.** Our idle timeout value was much lower
    than the one in the TypeScript version.
    4. **Sub-linear backoff.** We were backing off much too aggressively;
    [this](https://github.com/openai/codex/pull/658/files#diff-5d5959b95c6239e6188516da5c6b7eb78154cd9cfedfb9f753d30a7b6d6b8b06R30-R33)
    makes it sub-exponential but maintains the jitter and such.
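    
    For illustration only, here is one retry schedule that grows more slowly than doubling while keeping jitter. This is not the formula from the PR: the base delay, the cap, the quadratic growth, and the jitter range are all made-up values, and the jitter factor is passed in so the function stays deterministic.
    
    ```rust
    use std::time::Duration;

    /// Delay before retry number `attempt` (1-based).
    /// `jitter` is a multiplier that the caller draws at random; it is
    /// clamped to [0.9, 1.1] here.
    pub fn backoff(attempt: u32, jitter: f64) -> Duration {
        const BASE_MS: u64 = 200; // assumed base delay
        const MAX_MS: u64 = 30_000; // assumed upper bound
        let n = u64::from(attempt.max(1));
        // Quadratic growth: slower than exponential for large `attempt`.
        let grown = BASE_MS.saturating_mul(n.saturating_mul(n)).min(MAX_MS);
        let jitter = jitter.clamp(0.9, 1.1);
        Duration::from_millis((grown as f64 * jitter) as u64)
    }
    ```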
    
    I was seeing that `stream error: stream disconnected` behavior
    constantly, and anecdotally I can no longer reproduce. It feels much
    snappier.
  • [codex-rs] More fine-grained sandbox flag support on Linux (#632)
    ##### What/Why
    This PR makes it so that on Linux we actually respect the different
    types of `--sandbox` flag, such that users can apply network and
    filesystem restrictions in combination (currently the only supported
    behavior), or just pick one or the other.
    
    We should add similar support for macOS in a future PR.
    
    ##### Testing
    From Linux devbox, updated tests to use more specific flags:
    ```
    test linux::tests_linux::sandbox_blocks_ping ... ok
    test linux::tests_linux::sandbox_blocks_getent ... ok
    test linux::tests_linux::test_root_read ... ok
    test linux::tests_linux::test_dev_null_write ... ok
    test linux::tests_linux::sandbox_blocks_dev_tcp_redirection ... ok
    test linux::tests_linux::sandbox_blocks_ssh ... ok
    test linux::tests_linux::test_writable_root ... ok
    test linux::tests_linux::sandbox_blocks_curl ... ok
    test linux::tests_linux::sandbox_blocks_wget ... ok
    test linux::tests_linux::sandbox_blocks_nc ... ok
    test linux::tests_linux::test_root_write - should panic ... ok
    ```
    
    ##### Todo
    - [ ] Add negative tests (e.g. confirm you can hit the network if you
    configure filesystem only restrictions)
  • feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
    As stated in `codex-rs/README.md`:
    
    Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
    run it. For a number of users, this runtime requirement inhibits
    adoption: they would be better served by a standalone executable. As
    maintainers, we want Codex to run efficiently in a wide range of
    environments with minimal overhead. We also want to take advantage of
    operating system-specific APIs to provide better sandboxing, where
    possible.
    
    To that end, we are moving forward with a Rust implementation of Codex
    CLI contained in this folder, which has the following benefits:
    
    - The CLI compiles to small, standalone, platform-specific binaries.
    - Can make direct, native calls to
    [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
    [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
    order to support sandboxing on Linux.
    - No runtime garbage collection, resulting in lower memory consumption
    and better, more predictable performance.
    
    Currently, the Rust implementation is materially behind the TypeScript
    implementation in functionality, so continue to use the TypeScript
    implementation for the time being. We will publish native executables via
    GitHub Releases as soon as we feel the Rust version is usable.