Commit Graph

13 Commits

  • fix: preserve zsh-fork escalation fds across unified-exec spawn paths (#13644)
    ## Why
    
    `zsh-fork` sessions launched through unified-exec need the escalation
    socket to survive the wrapper -> server -> child handoff so later
    intercepted `exec()` calls can still reach the escalation server.
    
    The inherited-fd spawn path also needs to avoid closing Rust's internal
    exec-error pipe, and the shell-escalation handoff needs to tolerate the
    receive-side case where a transferred fd is installed into the same
    stdio slot it will be mapped onto.
    
    ## What Changed
    
    - Added `SpawnLifecycle::inherited_fds()` in
    `codex-rs/core/src/unified_exec/process.rs` and threaded inherited fds
    through `codex-rs/core/src/unified_exec/process_manager.rs` so
    unified-exec can preserve required descriptors across both PTY and
    no-stdin pipe spawn paths.
    - Updated `codex-rs/core/src/tools/runtimes/shell/zsh_fork_backend.rs`
    to expose the escalation socket fd through the spawn lifecycle.
    - Added inherited-fd-aware spawn helpers in
    `codex-rs/utils/pty/src/pty.rs` and `codex-rs/utils/pty/src/pipe.rs`,
    including Unix pre-exec fd pruning that preserves requested inherited
    fds while leaving `FD_CLOEXEC` descriptors alone. The pruning helper is
    now named `close_inherited_fds_except()` to better describe that
    behavior.
    - Updated `codex-rs/shell-escalation/src/unix/escalate_client.rs` to
    duplicate local stdio before transfer and send destination stdio numbers
    in `SuperExecMessage`, so the wrapper keeps using its own
    `stdin`/`stdout`/`stderr` until the escalated child takes over.
    - Updated `codex-rs/shell-escalation/src/unix/escalate_server.rs` so the
    server accepts the overlap case where a received fd reuses the same
    stdio descriptor number that the child setup will target with `dup2`.
    - Added comments around the PTY stdio wiring and the overlap regression
    helper to make the fd handoff and controlling-terminal setup easier to
    follow.
    
    ## Verification
    
    - `cargo test -p codex-utils-pty`
    - covers preserved-fd PTY spawn behavior, PTY resize, Python REPL
    continuity, exec-failure reporting, and the no-stdin pipe path
    - `cargo test -p codex-shell-escalation`
    - covers duplicated-fd transfer on the client side and verifies the
    overlap case by passing a pipe-backed stdin payload through the
    server-side `dup2` path
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13644).
    * #14624
    * __->__ #13644
  • start of hooks engine (#13276)
    (Experimental)
    
    This PR adds a first MVP for hooks, with SessionStart and Stop
    
    The core design is:
    
    - hooks live in a dedicated engine under codex-rs/hooks
    - each hook type has its own event-specific file
    - hook execution is synchronous and blocks normal turn progression while
    running
    - matching hooks run in parallel, then their results are aggregated into
    a normalized HookRunSummary
    
    On the AppServer side, hooks are exposed as operational metadata rather
    than transcript-native items:
    
    - new live notifications: hook/started, hook/completed
    - persisted/replayed hook results live on Turn.hookRuns
    - we intentionally did not add hook-specific ThreadItem variants
    
    Hooks messages are not persisted, they remain ephemeral. The context
    changes they add are (they get appended to the user's prompt)
  • fix(ci): restore guardian coverage and bazel unit tests (#13912)
    ## Summary
    - restore the guardian review request snapshot test and its tracked
    snapshot after it was dropped from `main`
    - make Bazel Rust unit-test wrappers resolve runfiles correctly on
    manifest-only platforms like macOS and point Insta at the real workspace
    root
    - harden the shell-escalation socket-closure assertion so the musl Bazel
    test no longer depends on fd reuse behavior
    
    ## Verification
    - cargo test -p codex-core
    guardian_review_request_layout_matches_model_visible_request_snapshot
    - cargo test -p codex-shell-escalation
    - bazel test //codex-rs/exec:exec-unit-tests
    //codex-rs/shell-escalation:shell-escalation-unit-tests
    
    Supersedes #13894.
    
    ---------
    
    Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
    Co-authored-by: viyatb-oai <viyatb@openai.com>
    Co-authored-by: Codex <noreply@openai.com>
  • refactor: prepare unified exec for zsh-fork backend (#13392)
    ## Why
    
    `shell_zsh_fork` already provides stronger guarantees around which
    executables receive elevated permissions. To reuse that machinery from
    unified exec without pushing Unix-specific escalation details through
    generic runtime code, the escalation bootstrap and session lifetime
    handling need a cleaner boundary.
    
    That boundary also needs to be safe for long-lived sessions: when an
    intercepted shell session is closed or pruned, any in-flight approval
    workers and any already-approved escalated child they spawned must be
    torn down with the session, and the inherited escalation socket must not
    leak into unrelated subprocesses.
    
    ## What Changed
    
    - Extracted a reusable `EscalationSession` and
    `EscalateServer::start_session(...)` in `shell-escalation` so callers
    can get the wrapper/socket env overlay and keep the escalation server
    alive without immediately running a one-shot command.
    - Documented that `EscalationSession::env()` and
    `ShellCommandExecutor::run(...)` exchange only that env overlay, which
    callers must merge into their own base shell environment.
    - Clarified the prepared-exec helper boundary in `core` by naming the
    new helper APIs around `ExecRequest`, while keeping the legacy
    `execute_env(...)` entrypoints as thin compatibility wrappers for
    existing callers that still use the older naming.
    - Added a small post-spawn hook on the prepared execution path so the
    parent copy of the inheritable escalation socket is closed immediately
    after both the existing one-shot shell-command spawn and the
    unified-exec spawn.
    - Made session teardown explicit with session-scoped cancellation:
    dropping an `EscalationSession` or canceling its parent request now
    stops intercept workers, and the server-spawned escalated child uses
    `kill_on_drop(true)` so teardown cannot orphan an already-approved
    child.
    - Added `UnifiedExecBackendConfig` plumbing through `ToolsConfig`, a
    `shell::zsh_fork_backend` facade, and an opaque unified-exec
    spawn-lifecycle hook so unified exec can prepare a wrapped `zsh -c/-lc`
    request without storing `EscalationSession` directly in generic
    process/runtime code.
    - Kept the existing `shell_command` zsh-fork behavior intact on top of
    the new bootstrap path. Tool selection is unchanged in this PR: when
    `shell_zsh_fork` is enabled, `ShellCommand` still wins over
    `exec_command`.
    
    ## Verification
    
    - `cargo test -p codex-shell-escalation`
      - includes coverage for `start_session_exposes_wrapper_env_overlay`
      - includes coverage for `exec_closes_parent_socket_after_shell_spawn`
    - includes coverage for
    `dropping_session_aborts_intercept_workers_and_kills_spawned_child`
    - `cargo test -p codex-core
    shell_zsh_fork_prefers_shell_command_over_unified_exec`
    - `cargo test -p codex-core --test all
    shell_zsh_fork_prompts_for_skill_script_execution`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13392).
    * #13432
    * __->__ #13392
  • chore: Nest skill and protocol network permissions under network.enabled (#13427)
    ## Summary
    
    Changes the permission profile shape from a bare network boolean to a
    nested object.
    
    Before:
    
    ```yaml
    permissions:
      network: true
    ```
    
    After:
    
    ```yaml
    permissions:
      network:
        enabled: true
    ```
    
    This also updates the shared Rust and app-server protocol types so
    `PermissionProfile.network` is no longer `Option<bool>`, but
    `Option<NetworkPermissions>` with `enabled: Option<bool>`.
    
    ## What Changed
    
    - Updated `PermissionProfile` in `codex-rs/protocol/src/models.rs`:
    - `pub network: Option<bool>` -> `pub network:
    Option<NetworkPermissions>`
    - Added `NetworkPermissions` with:
      - `pub enabled: Option<bool>`
    - Changed emptiness semantics so `network` is only considered empty when
    `enabled` is `None`
    - Updated skill metadata parsing to accept `permissions.network.enabled`
    - Updated core permission consumers to read
    `network.enabled.unwrap_or(false)` where a concrete boolean is needed
    - Updated app-server v2 protocol types and regenerated schema/TypeScript
    outputs
    - Updated docs to mention `additionalPermissions.network.enabled`
  • fix: use https://git.savannah.gnu.org/git/bash instead of https://github.com/bolinfest/bash (#13057)
    Historically, we cloned the Bash repo from
    https://github.com/bminor/bash, but for whatever reason, it was removed
    at some point.
    
    I had a local clone of it, so I pushed it to
    https://github.com/bolinfest/bash so that we could continue running our
    CI job. I did this in https://github.com/openai/codex/pull/9563, and as
    you can see, I did not tamper with the commit hash we used as the basis
    of this build.
    
    Using a personal fork is not great, so this PR changes the CI job to use
    what appears to be considered the source of truth for Bash, which is
    https://git.savannah.gnu.org/git/bash.git.
    
    Though in testing this out, it appears this Git server does not support
    the combination of `git clone --depth 1
    https://git.savannah.gnu.org/git/bash` and `git fetch --depth 1 origin
    a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b`, as it fails with the
    following error:
    
    ```
    error: Server does not allow request for unadvertised object a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
    ```
    
    so unfortunately this means that we have to do a full clone instead of a
    shallow clone in our CI jobs, which will be a bit slower.
    
    Also updated `codex-rs/shell-escalation/README.md` to reflect this
    change.
  • feat: include sandbox config with escalation request (#12839)
    ## Why
    
    Before this change, an escalation approval could say that a command
    should be rerun, but it could not carry the sandbox configuration that
    should still apply when the escalated command is actually spawned.
    
    That left an unsafe gap in the `zsh-fork` skill path: skill scripts
    under `scripts/` that did not declare permissions could be escalated
    without a sandbox, and scripts that did declare permissions could lose
    their bounded sandbox on rerun or cached session approval.
    
    This PR extends the escalation protocol so approvals can optionally
    carry sandbox configuration all the way through execution. That lets the
    shell runtime preserve the intended sandbox instead of silently widening
    access.
    
    We likely want a single permissions type for this codepath eventually,
    probably centered on `Permissions`. For now, the protocol needs to
    represent both the existing `PermissionProfile` form and the fuller
    `Permissions` form, so this introduces a temporary disjoint union,
    `EscalationPermissions`, to carry either one.
    
    Further, this means that today, a skill either:
    
    - does not declare any permissions, in which case it is run using the
    default sandbox for the turn
    - specifies permissions, in which case the skill is run using that exact
    sandbox, which might be more restrictive than the default sandbox for
    the turn
    
    We will likely change the skill's permissions to be additive to the
    existing permissions for the turn.
    
    ## What Changed
    
    - Added `EscalationPermissions` to `codex-protocol` so escalation
    requests can carry either a `PermissionProfile` or a full `Permissions`
    payload.
    - Added an explicit `EscalationExecution` mode to the shell escalation
    protocol so reruns distinguish between `Unsandboxed`, `TurnDefault`, and
    `Permissions(...)` instead of overloading `None`.
    - Updated `zsh-fork` shell reruns to resolve `TurnDefault` at execution
    time, which keeps ordinary `UseDefault` commands on the turn sandbox and
    preserves turn-level macOS seatbelt profile extensions.
    - Updated the `zsh-fork` skill path so a skill with no declared
    permissions inherits the conversation's effective sandbox instead of
    escalating unsandboxed.
    - Updated the `zsh-fork` skill path so a skill with declared permissions
    reruns with exactly those permissions, including when a cached session
    approval is reused.
    
    ## Testing
    
    - Added unit coverage in
    `core/src/tools/runtimes/shell/unix_escalation.rs` for the explicit
    `UseDefault` / `RequireEscalated` / `WithAdditionalPermissions`
    execution mapping.
    - Added unit coverage in
    `core/src/tools/runtimes/shell/unix_escalation.rs` for macOS seatbelt
    extension preservation in both the `TurnDefault` and
    explicit-permissions rerun paths.
    - Added integration coverage in `core/tests/suite/skill_approval.rs` for
    permissionless skills inheriting the turn sandbox and explicit skill
    permissions remaining bounded across cached approval reuse.
  • fix: keep shell escalation exec paths absolute (#12750)
    ## Why
    
    In the `shell_zsh_fork` flow, `codex-shell-escalation` receives the
    executable path exactly as the shell passed it to `execve()`. That path
    is not guaranteed to be absolute.
    
    For commands such as `./scripts/hello-mbolin.sh`, if the shell was
    launched with a different `workdir`, resolving the intercepted `file`
    against the server process working directory makes policy checks and
    skill matching inspect the wrong executable. This change pushes that fix
    a step further by keeping the normalized path typed as `AbsolutePathBuf`
    throughout the rest of the escalation pipeline.
    
    That makes the absolute-path invariant explicit, so later code cannot
    accidentally treat the resolved executable path as an arbitrary
    `PathBuf`.
    
    ## What Changed
    
    - record the wrapper process working directory as an `AbsolutePathBuf`
    - update the escalation protocol so `workdir` is explicitly absolute
    while `file` remains the raw intercepted exec path
    - resolve a relative intercepted `file` against the request `workdir` as
    soon as the server receives the request
    - thread `AbsolutePathBuf` through `EscalationPolicy`,
    `CoreShellActionProvider`, and command normalization helpers so the
    resolved executable path stays type-checked as absolute
    - replace the `path-absolutize` dependency in `codex-shell-escalation`
    with `codex-utils-absolute-path`
    - add a regression test that covers a relative `file` with a distinct
    `workdir`
    
    ## Verification
    
    - `cargo test -p codex-shell-escalation`
  • fix: make EscalateServer public and remove shell escalation wrappers (#12724)
    ## Why
    
    `codex-shell-escalation` exposed a `codex-core`-specific adapter layer
    (`ShellActionProvider`, `ShellPolicyFactory`, and `run_escalate_server`)
    that existed only to bridge `codex-core` to `EscalateServer`. That
    indirection increased API surface and obscured crate ownership without
    adding behavior.
    
    This change moves orchestration into `codex-core` so boundaries are
    clearer: `codex-shell-escalation` provides reusable escalation
    primitives, and `codex-core` provides shell-tool policy decisions.
    
    Admittedly, @pakrym rightfully requested this sort of cleanup as part of
    https://github.com/openai/codex/pull/12649, though this avoids moving
    all of `codex-shell-escalation` into `codex-core`.
    
    ## What changed
    
    - Made `EscalateServer` public and exported it from `shell-escalation`.
    - Removed the adapter layer from `shell-escalation`:
      - deleted `shell-escalation/src/unix/core_shell_escalation.rs`
    - removed exports for `ShellActionProvider`, `ShellPolicyFactory`,
    `EscalationPolicyFactory`, and `run_escalate_server`
    - Updated `core/src/tools/runtimes/shell/unix_escalation.rs` to:
      - create `Stopwatch`/cancellation in `codex-core`
      - instantiate `EscalateServer` directly
      - implement `EscalationPolicy` directly on `CoreShellActionProvider`
    
    Net effect: same escalation flow with fewer wrappers and a smaller
    public API.
    
    ## Verification
    
    - Manually reviewed the old vs. new escalation call flow to confirm
    timeout/cancellation behavior and approval policy decisions are
    preserved while removing wrapper types.
  • feat: run zsh fork shell tool via shell-escalation (#12649)
    ## Why
    
    This PR switches the `shell_command` zsh-fork path over to
    `codex-shell-escalation` so the new shell tool can use the shared
    exec-wrapper/escalation protocol instead of the `zsh_exec_bridge`
    implementation that was introduced in
    https://github.com/openai/codex/pull/12052. `zsh_exec_bridge` relied on
    UNIX domain sockets, which is not as tamper-proof as the FD-based
    approach in `codex-shell-escalation`.
    
    ## What Changed
    
    - Added a Unix zsh-fork runtime adapter in `core`
    (`core/src/tools/runtimes/shell/unix_escalation.rs`) that:
    - runs zsh-fork commands through
    `codex_shell_escalation::run_escalate_server`
      - bridges exec-policy / approval decisions into `ShellActionProvider`
    - executes escalated commands via a `ShellCommandExecutor` that calls
    `process_exec_tool_call`
    - Updated `ShellRuntime` / `ShellCommandHandler` / tool spec wiring to
    select a `shell_command` backend (`classic` vs `zsh-fork`) while leaving
    the generic `shell` tool path unchanged.
    - Removed the `zsh_exec_bridge`-based session service and deleted
    `core/src/zsh_exec_bridge/mod.rs`.
    - Moved exec-wrapper entrypoint dispatch to `arg0` by handling the
    `codex-execve-wrapper` arg0 alias there, and removed the old
    `codex_core::maybe_run_zsh_exec_wrapper_mode()` hooks from `cli` and
    `app-server` mains.
    - Added the needed `codex-shell-escalation` dependencies for `core` and
    `arg0`.
    
    ## Tests
    
    - `cargo test -p codex-core
    shell_zsh_fork_prefers_shell_command_over_unified_exec`
    - `cargo test -p codex-app-server turn_start_shell_zsh_fork --
    --nocapture`
    - verifies zsh-fork command execution and approval flows through the new
    backend
    - includes subcommand approve/decline coverage using the shared zsh
    DotSlash fixture in `app-server/tests/suite/zsh`
    - To test manually, I added the following to `~/.codex/config.toml`:
    
    ```toml
    zsh_path = "/Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh"
    
    [features]
    shell_zsh_fork = true
    ```
    
    Then I ran `just c` to run the dev build of Codex with these changes and
    sent it the message:
    
    ```
    run `echo $0`
    ```
    
    And it replied with:
    
    ```
      echo $0 printed:
    
      /Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh
    
      In this tool context, $0 reflects the script path used to invoke the shell, not just zsh.
    ```
    
    so the tool appears to be wired up correctly.
    
    ## Notes
    
    - The zsh subcommand-decline integration test now uses `rm` under a
    `WorkspaceWrite` sandbox. The previous `/usr/bin/true` scenario is
    auto-allowed by the new `shell-escalation` policy path, which no longer
    produces subcommand approval prompts.
  • refactor: decouple shell-escalation from codex-core (#12638)
    ## Why
    
    After removing `exec-server`, the next step is to wire a new shell tool
    to `codex-rs/shell-escalation` directly.
    
    That is blocked while `codex-shell-escalation` depends on `codex-core`,
    because the new integration would require `codex-core` to depend on
    `codex-shell-escalation` and create a dependency cycle.
    
    This change ports the reusable pieces from the earlier prep work, but
    drops the old compatibility shim because `exec-server`/MCP support is
    already gone.
    
    ## What Changed
    
    ### Decouple `shell-escalation` from `codex-core`
    
    - Introduce a crate-local `SandboxState` in `shell-escalation`
    - Introduce a `ShellCommandExecutor` trait so callers provide process
    execution/sandbox integration
    - Update `EscalateServer::exec(...)` and `run_escalate_server(...)` to
    use the injected executor
    - Remove the direct `codex_core::exec::process_exec_tool_call(...)` call
    from `shell-escalation`
    - Remove the `codex-core` dependency from `codex-shell-escalation`
    
    ### Restore reusable policy adapter exports
    
    - Re-enable `unix::core_shell_escalation`
    - Export `ShellActionProvider` and `ShellPolicyFactory` from
    `shell-escalation`
    - Keep the crate root API simple (no `legacy_api` compatibility layer)
    
    ### Port socket fixes from the earlier prep commit
    
    - Use `socket2::Socket::pair_raw(...)` for AF_UNIX socketpairs and
    restore `CLOEXEC` explicitly on both endpoints
    - Keep `CLOEXEC` cleared only on the single datagram client FD that is
    intentionally passed across `exec`
    - Clean up `tokio::AsyncFd::try_io(...)` error handling in the socket
    helpers
    
    ## Verification
    
    - `cargo shear`
    - `cargo clippy -p codex-shell-escalation --tests`
    - `cargo test -p codex-shell-escalation`
  • refactor: delete exec-server and move execve wrapper into shell-escalation (#12632)
    ## Why
    
    We already plan to remove the shell-tool MCP path, and doing that
    cleanup first makes the follow-on `shell-escalation` work much simpler.
    
    This change removes the last remaining reason to keep
    `codex-rs/exec-server` around by moving the `codex-execve-wrapper`
    binary and shared shell test fixtures to the crates/tests that now own
    that functionality.
    
    ## What Changed
    
    ### Delete `codex-rs/exec-server`
    
    - Remove the `exec-server` crate, including the MCP server binary,
    MCP-specific modules, and its test support/test suite
    - Remove `exec-server` from the `codex-rs` workspace and update
    `Cargo.lock`
    
    ### Move `codex-execve-wrapper` into `codex-rs/shell-escalation`
    
    - Move the wrapper implementation into `shell-escalation`
    (`src/unix/execve_wrapper.rs`)
    - Add the `codex-execve-wrapper` binary entrypoint under
    `shell-escalation/src/bin/`
    - Update `shell-escalation` exports/module layout so the wrapper
    entrypoint is hosted there
    - Move the wrapper README content from `exec-server` to
    `shell-escalation/README.md`
    
    ### Move shared shell test fixtures to `app-server`
    
    - Move the DotSlash `bash`/`zsh` test fixtures from
    `exec-server/tests/suite/` to `app-server/tests/suite/`
    - Update `app-server` zsh-fork tests to reference the new fixture paths
    
    ### Keep `shell-tool-mcp` as a shell-assets package
    
    - Update `.github/workflows/shell-tool-mcp.yml` packaging so the npm
    artifact contains only patched Bash/Zsh payloads (no Rust binaries)
    - Update `shell-tool-mcp/package.json`, `shell-tool-mcp/src/index.ts`,
    and docs to reflect the shell-assets-only package shape
    - `shell-tool-mcp-ci.yml` does not need changes because it is already
    JS-only
    
    ## Verification
    
    - `cargo shear`
    - `cargo clippy -p codex-shell-escalation --tests`
    - `just clippy`
  • refactor: normalize unix module layout for exec-server and shell-escalation (#12556)
    ## Why
    Shell execution refactoring in `exec-server` had become split between
    duplicated code paths, which blocked a clean introduction of the new
    reusable shell escalation flow. This commit creates a dedicated
    foundation crate so later shell tooling changes can share one
    implementation.
    
    ## What changed
    - Added the `codex-shell-escalation` crate and moved the core escalation
    pieces (`mcp` protocol/socket/session flow, policy glue) that were
    previously in `exec-server` into it.
    - Normalized `exec-server` Unix structure under a dedicated `unix`
    module layout and kept non-Unix builds narrow.
    - Wired crate/build metadata so `shell-escalation` is a first-class
    workspace dependency for follow-on integration work.
    
    ## Verification
    - Built and linted the stack at this commit point with `just clippy`.
    
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12556).
    * #12584
    * #12583
    * __->__ #12556