Commit Graph

101 Commits

  • Add Windows sandbox unified exec runtime support (#15578)
    ## Summary
    
    This is the runtime/foundation half of the Windows sandbox unified-exec
    work.
    
    - add Windows sandbox `unified_exec` session support in
    `windows-sandbox-rs` for both:
      - the legacy restricted-token backend
      - the elevated runner backend
    - extend the PTY/process runtime so driver-backed sessions can support:
      - stdin streaming
      - stdout/stderr separation
      - exit propagation
      - PTY resize hooks
    - add Windows sandbox runtime coverage in `codex-windows-sandbox` /
    `codex-utils-pty`
    
    This PR does **not** enable Windows sandbox `UnifiedExec` for product
    callers yet because hooking this up to app-server comes in the next PR.
    
    Windows sandbox advertising is intentionally kept aligned with `main`,
    so sandboxed Windows callers still fall back to `ShellCommand`.
    
    This PR isolates the runtime/session layer so it can be reviewed
    independently from product-surface enablement.
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
    Co-authored-by: Codex <noreply@openai.com>
  • feat(permissions): add glob deny-read policy support (#15979)
    ## Summary
    - adds first-class filesystem policy entries for deny-read glob patterns
    - parses config such as :project_roots { "**/*.env" = "none" } into
    pattern entries
    - enforces deny-read patterns in direct read/list helpers
    - fails closed for sandbox execution until platform backends enforce
    glob patterns in #18096
    - preserves split filesystem policy in turn context only when it cannot
    be reconstructed from legacy sandbox policy
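    
    A minimal sketch of how a deny-read pattern such as `**/*.env` could be
    checked in a direct read helper. The matcher is hand-rolled for
    illustration, and `glob_match` and `is_read_denied` are hypothetical
    names, not the functions added by this PR. It assumes `/`-separated
    relative paths, `*` and `?` within one segment, and `**` spanning any
    number of segments.
    
    ```rust
    /// Match a single path segment against a pattern segment (`*` and `?` only).
    /// Matching is byte-wise, so `?` consumes one byte, not one character.
    fn segment_match(pat: &[u8], seg: &[u8]) -> bool {
        match pat.split_first() {
            None => seg.is_empty(),
            // `*` matches any run of bytes; it cannot cross `/` because this
            // function only ever sees one segment at a time.
            Some((&b'*', rest)) => (0..=seg.len()).any(|i| segment_match(rest, &seg[i..])),
            Some((&b'?', rest)) => !seg.is_empty() && segment_match(rest, &seg[1..]),
            Some((c, rest)) => seg.first() == Some(c) && segment_match(rest, &seg[1..]),
        }
    }
    
    /// Match whole segment lists; `**` consumes zero or more path segments.
    fn segments_match(pat: &[&str], path: &[&str]) -> bool {
        let (first, rest) = match pat.split_first() {
            Some(split) => split,
            None => return path.is_empty(),
        };
        if *first == "**" {
            return (0..=path.len()).any(|i| segments_match(rest, &path[i..]));
        }
        match path.split_first() {
            Some((seg, tail)) => {
                segment_match(first.as_bytes(), seg.as_bytes()) && segments_match(rest, tail)
            }
            None => false,
        }
    }
    
    /// Hypothetical entry point: does `pattern` match the relative `path`?
    fn glob_match(pattern: &str, path: &str) -> bool {
        let pat: Vec<&str> = pattern.split('/').collect();
        let segs: Vec<&str> = path.split('/').collect();
        segments_match(&pat, &segs)
    }
    
    /// A read is denied as soon as any deny-read pattern matches.
    fn is_read_denied(deny_patterns: &[&str], relative_path: &str) -> bool {
        deny_patterns.iter().any(|p| glob_match(p, relative_path))
    }
    ```
    
    With these assumptions, `is_read_denied(&["**/*.env"], "config/prod.env")`
    is true while `src/main.rs` stays readable.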
    
    ## Stack
    1. This PR - glob deny-read policy/config/direct-tool support
    2. #18096 - macOS and Linux sandbox enforcement
    3. #17740 - managed deny-read requirements
    
    ## Verification
    - just fmt
    - cargo check -p codex-core -p codex-sandboxing --tests
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • fix: cleanup the contract of the general-purpose exec() function (#17870)
    `exec()` had a number of arguments that were unused, making the function
    signature misleading. This PR aims to clean things up to clarify the
    role of this function and to clarify which fields of `ExecParams` are
    unused and why.
  • Spread AbsolutePathBuf (#17792)
    Mechanical change to promote absolute paths through code.
  • Build remote exec env from exec-server policy (#17216)
    ## Summary
    - add an exec-server `envPolicy` field; when present, the server starts
    from its own process env and applies the shell environment policy there
    - keep `env` as the exact environment for local/embedded starts, but
    make it an overlay for remote unified-exec starts
    - move the shell-environment-policy builder into `codex-config` so Core
    and exec-server share the inherit/filter/set/include behavior
    - overlay only runtime/sandbox/network deltas from Core onto the
    exec-server-derived env
    
    ## Why
    Remote unified exec was materializing the shell env inside Core and
    forwarding the whole map to exec-server, so remote processes could
    inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
    base env on the executor while preserving Core-owned runtime additions
    like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
    sandbox marker env.
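    
    The layering described above can be sketched roughly as follows. This
    is an illustration only: `EnvPolicy`, `apply_policy`, `env_from`, and
    `build_remote_env` are hypothetical names, and the policy is reduced to
    an allow-list plus fixed `set` values rather than the full
    inherit/filter/set/include behavior.
    
    ```rust
    use std::collections::BTreeMap;
    
    type Env = BTreeMap<String, String>;
    
    /// Hypothetical, simplified policy: keep allow-listed variables, then
    /// apply fixed values.
    struct EnvPolicy {
        include_only: Vec<String>,
        set: Env,
    }
    
    /// Small helper for building an environment map from string pairs.
    fn env_from(pairs: &[(&str, &str)]) -> Env {
        pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect()
    }
    
    /// Runs on the executor: start from the executor's own process env.
    fn apply_policy(process_env: &Env, policy: &EnvPolicy) -> Env {
        let mut env: Env = process_env
            .iter()
            .filter(|(k, _)| policy.include_only.iter().any(|allowed| allowed == *k))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect();
        env.extend(policy.set.clone());
        env
    }
    
    /// Core sends only its runtime deltas; they are overlaid last so they win.
    fn build_remote_env(executor_env: &Env, policy: &EnvPolicy, core_overlay: &Env) -> Env {
        let mut env = apply_policy(executor_env, policy);
        env.extend(core_overlay.iter().map(|(k, v)| (k.clone(), v.clone())));
        env
    }
    ```
    
    In this shape the orchestrator's `HOME` and `PATH` never reach the
    remote process, because the overlay carries only entries such as
    `CODEX_THREAD_ID`.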
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - `cargo test -p codex-exec-server --lib`
    - `cargo test -p codex-core --lib unified_exec::process_manager::tests`
    - `cargo test -p codex-core --lib exec_env::tests`
    - `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
    matched 0 tests)
    - `cargo test -p codex-config --lib shell_environment` (compile-only;
    filter matched 0 tests)
    - `just bazel-lock-update`
    
    ## Known local validation issue
    - `just bazel-lock-check` is not runnable in this checkout: it invokes
    `./scripts/check-module-bazel-lock.sh`, which is missing.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
    Co-authored-by: pakrym-oai <pakrym@openai.com>
  • fix: support split carveouts in windows elevated sandbox (#14568)
    ## Summary
    - preserve legacy Windows elevated sandbox behavior for existing
    policies
    - add elevated-only support for split filesystem policies that can be
    represented as readable-root overrides, writable-root overrides, and
    extra deny-write carveouts
    - resolve those elevated filesystem overrides during sandbox transform
    and thread them through setup and policy refresh
    - keep failing closed for explicit unreadable (`none`) carveouts and
    reopened writable descendants under read-only carveouts
    - for explicit read-only-under-writable-root carveouts, materialize
    missing carveout directories during elevated setup before applying the
    deny-write ACL
    - document the elevated vs restricted-token support split in the core
    README
    
    ## Example
    Given a split filesystem policy like:
    
    ```toml
    ":root" = "read"
    ":cwd" = "write"
    "./docs" = "read"
    "C:/scratch" = "write"
    ```
    
    the elevated backend now provisions the readable-root overrides,
    writable-root overrides, and extra deny-write carveouts during setup and
    refresh instead of collapsing back to the legacy workspace-only shape.
    
    If a read-only carveout under a writable root is missing at setup time,
    elevated setup creates that carveout as an empty directory before
    applying its deny-write ACE; otherwise the sandboxed command could
    create it later and bypass the carveout. This is only for explicit
    policy carveouts. Best-effort workspace protections like `.codex/` and
    `.agents/` still skip missing directories.
    
    A policy like:
    
    ```toml
    "/workspace" = "write"
    "/workspace/docs" = "read"
    "/workspace/docs/tmp" = "write"
    ```
    
    still fails closed, because the elevated backend does not reopen
    writable descendants under read-only carveouts yet.
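    
    A rough sketch of the fail-closed check implied by the two examples,
    assuming policy entries have already been resolved to absolute paths.
    `Access`, `Unsupported`, and `check_supported` are hypothetical names
    for illustration, not the types used in the sandbox transform.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum Access {
        None,
        Read,
        Write,
    }
    
    #[derive(Debug, PartialEq, Eq)]
    enum Unsupported {
        UnreadableCarveout(PathBuf),
        ReopenedWritableDescendant(PathBuf),
    }
    
    /// True when `child` is below `parent` and not equal to it.
    fn strictly_under(child: &Path, parent: &Path) -> bool {
        child != parent && child.starts_with(parent)
    }
    
    /// A read-only carveout is a `Read` entry nested under some `Write` entry.
    fn is_read_only_carveout(path: &Path, entries: &[(PathBuf, Access)]) -> bool {
        entries
            .iter()
            .any(|(root, access)| *access == Access::Write && strictly_under(path, root))
    }
    
    /// Reject the shapes this sketch treats as unsupported; accept the rest.
    fn check_supported(entries: &[(PathBuf, Access)]) -> Result<(), Unsupported> {
        for (path, access) in entries {
            match access {
                Access::None => return Err(Unsupported::UnreadableCarveout(path.clone())),
                Access::Read => {}
                Access::Write => {
                    // A writable path below a read-only carveout would reopen it.
                    let reopened = entries.iter().any(|(other, other_access)| {
                        *other_access == Access::Read
                            && is_read_only_carveout(other, entries)
                            && strictly_under(path, other)
                    });
                    if reopened {
                        return Err(Unsupported::ReopenedWritableDescendant(path.clone()));
                    }
                }
            }
        }
        Ok(())
    }
    ```
    
    Note that a writable `:cwd` under a readable `:root` passes, because
    `:root` is not itself nested under a writable root.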
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Use AbsolutePathBuf for exec cwd plumbing (#17063)
    ## Summary
    - Carry `AbsolutePathBuf` through tool cwd parsing/resolution instead of
    resolving workdirs to raw `PathBuf`s.
    - Type exec/sandbox request cwd fields as `AbsolutePathBuf` through
    `ExecParams`, `ExecRequest`, `SandboxCommand`, and unified exec runtime
    requests.
    - Keep `PathBuf` conversions at external/event boundaries and update
    existing tests/fixtures for the typed cwd.
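    
    The idea can be illustrated with a small newtype. `AbsPath` and
    `ExecRequestSketch` below are hypothetical stand-ins; the real
    `AbsolutePathBuf` API may look different.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Hypothetical stand-in for an absolute-path newtype.
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct AbsPath(PathBuf);
    
    impl AbsPath {
        /// Accept a path only if it is already absolute.
        fn from_absolute(path: PathBuf) -> Option<AbsPath> {
            path.is_absolute().then(|| AbsPath(path))
        }
    
        /// Resolve a tool-supplied workdir against an absolute base.
        /// `Path::join` returns `candidate` unchanged when it is absolute.
        fn resolve(base: &AbsPath, candidate: impl AsRef<Path>) -> AbsPath {
            AbsPath(base.0.join(candidate))
        }
    
        fn as_path(&self) -> &Path {
            &self.0
        }
    
        /// Conversion back to `PathBuf`, intended for external/event boundaries.
        fn into_path_buf(self) -> PathBuf {
            self.0
        }
    }
    
    /// Request types carry the typed cwd, so a relative cwd cannot be built.
    struct ExecRequestSketch {
        cwd: AbsPath,
    }
    ```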
    
    ## Validation
    - `cargo check -p codex-core --tests`
    - `cargo check -p codex-sandboxing --tests`
    - `cargo test -p codex-sandboxing`
    - `cargo test -p codex-core --lib tools::handlers::`
    - `just fix -p codex-sandboxing`
    - `just fix -p codex-core`
    - `just fmt`
    
    Full `codex-core` test suite was not run locally; per repo guidance I
    kept local validation targeted.
  • [codex] reduce module visibility (#16978)
    ## Summary
    - reduce public module visibility across Rust crates, preferring private
    or crate-private modules with explicit crate-root public exports
    - update external call sites and tests to use the intended public crate
    APIs instead of reaching through module trees
    - add the module visibility guideline to AGENTS.md
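    
    As a generic illustration of that guideline (the module and item names
    here are invented, not taken from the codebase), a crate can keep its
    module private and re-export only the intended API at the crate root:
    
    ```rust
    // Private module: code outside this crate cannot name `policy::...` paths.
    mod policy {
        #[derive(Debug, PartialEq, Eq)]
        pub struct Limits {
            pub max_retries: u32,
        }
    
        // Crate-private helper: usable anywhere in this crate, never exported.
        pub(crate) fn default_retries() -> u32 {
            3
        }
    
        pub fn default_limits() -> Limits {
            Limits {
                max_retries: default_retries(),
            }
        }
    }
    
    // Explicit crate-root exports form the public surface.
    pub use policy::{default_limits, Limits};
    ```
    
    External call sites then import `default_limits` and `Limits` from the
    crate root instead of reaching through the module tree.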
    
    ## Validation
    - `cargo check --workspace --all-targets --message-format=short` passed
    before the final fix/format pass
    - `just fix` completed successfully
    - `just fmt` completed successfully
    - `git diff --check` passed
  • remove temporary ownership re-exports (#16626)
    Stacked on #16508.
    
    This removes the temporary `codex-core` / `codex-login` re-export shims
    from the ownership split and rewrites callsites to import directly from
    `codex-model-provider-info`, `codex-models-manager`, `codex-api`,
    `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.
    
    No behavior change intended; this is the mechanical import cleanup layer
    split out from the ownership move.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • extract models manager and related ownership from core (#16508)
    ## Summary
    - split `models-manager` out of `core` and add `ModelsManagerConfig`
    plus `Config::to_models_manager_config()` so model metadata paths stop
    depending on `core::Config`
    - move login-owned/auth-owned code out of `core` into `codex-login`,
    move model provider config into `codex-model-provider-info`, move API
    bridge mapping into `codex-api`, move protocol-owned types/impls into
    `codex-protocol`, and move response debug helpers into a dedicated
    `response-debug-context` crate
    - move feedback tag emission into `codex-feedback`, relocate tests to
    the crates that now own the code, and keep broad temporary re-exports so
    this PR avoids a giant import-only rewrite
    
    ## Major moves and decisions
    - created `codex-models-manager` as the owner for model
    cache/catalog/config/model info logic, including the new
    `ModelsManagerConfig` struct
    - created `codex-model-provider-info` as the owner for provider config
    parsing/defaults and kept temporary `codex-login`/`codex-core`
    re-exports for old import paths
    - moved `api_bridge` error mapping + `CoreAuthProvider` into
    `codex-api`, while `codex-login::api_bridge` temporarily re-exports
    those symbols and keeps the `auth_provider_from_auth` wrapper
    - moved `auth_env_telemetry` and `provider_auth` ownership to
    `codex-login`
    - moved `CodexErr` ownership to `codex-protocol::error`, plus
    `StreamOutput`, `bytes_to_string_smart`, and network policy helpers to
    protocol-owned modules
    - created `codex-response-debug-context` for
    `extract_response_debug_context`, `telemetry_transport_error_message`,
    and related response-debug plumbing instead of leaving that behavior in
    `core`
    - moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and
    `emit_feedback_request_tags_with_auth_env` to `codex-feedback`
    - deferred removal of temporary re-exports and the mechanical import
    rewrites to a stacked follow-up PR so this PR stays reviewable
    
    ## Test moves
    - moved auth refresh coverage from `core/tests/suite/auth_refresh.rs` to
    `login/tests/suite/auth_refresh.rs`
    - moved text encoding coverage from
    `core/tests/suite/text_encoding_fix.rs` to
    `protocol/src/exec_output_tests.rs`
    - moved model info override coverage from
    `core/tests/suite/model_info_overrides.rs` to
    `models-manager/src/model_info_overrides_tests.rs`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • core: remove cross-crate re-exports from lib.rs (#16512)
    ## Why
    
    `codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
    which made downstream crates depend on `codex-core` as a proxy module
    instead of the actual owner crate.
    
    Removing those forwards makes crate boundaries explicit and lets leaf
    crates drop unnecessary `codex-core` dependencies. In this PR, this
    reduces the dependency on `codex-core` to `codex-login` in the following
    files:
    
    ```
    codex-rs/backend-client/Cargo.toml
    codex-rs/mcp-server/tests/common/Cargo.toml
    ```
    
    ## What
    
    - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
    `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
    `codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
    `codex-tools`, and `codex-utils-path`.
    - Delete the `default_client` forwarding shim in `codex-rs/core`.
    - Update in-crate and downstream callsites to import directly from the
    owning `codex-*` crate.
    - Add direct Cargo dependencies where callsites now target the owner
    crate, and remove `codex-core` from `codex-rs/backend-client`.
  • fix(core) rm execute_exec_request sandbox_policy (#16422)
    ## Summary
    In #11871 we started consolidating on ExecRequest.sandbox_policy instead
    of passing in a separate policy object that theoretically could differ
    (but did not). This finishes that parameter cleanup.
    
    This should be a simple noop, since all 3 callsites of this function
    already used a cloned object from the ExecRequest value.
    
    ## Testing
    - [x] Existing tests pass
  • feat(windows-sandbox): add network proxy support (#12220)
    ## Summary
    
    This PR makes Windows sandbox proxying enforceable by routing proxy-only
    runs through the existing `offline` sandbox user and reserving direct
    network access for the existing `online` sandbox user.
    
    In brief:
    
    - if a Windows sandbox run should be proxy-enforced, we run it as the
    `offline` user
    - the `offline` user gets firewall rules that block direct outbound
    traffic and only permit the configured localhost proxy path
    - if a Windows sandbox run should have true direct network access, we
    run it as the `online` user
    - no new sandbox identity is introduced
    
    This brings Windows in line with the intended model: proxy use is not
    just env-based, it is backed by OS-level egress controls. Windows
    already has two sandbox identities:
    
    - `offline`: intended to have no direct network egress
    - `online`: intended to have full network access
    
    This PR makes proxy-enforced runs use that model directly.
    
    ### Proxy-enforced runs
    
    When proxy enforcement is active:
    
    - the run is assigned to the `offline` identity
    - setup extracts the loopback proxy ports from the sandbox env
    - Windows setup programs firewall rules for the `offline` user that:
      - block all non-loopback outbound traffic
      - block loopback UDP
      - block loopback TCP except for the configured proxy ports
    - optionally allow broader localhost access when `allow_local_binding=1`
    
    So the sandboxed process can only talk to the local proxy. It cannot
    open direct outbound sockets or do local UDP-based DNS on its own. The
    proxy then performs the real outbound network access outside that
    restricted sandbox identity.
    
    ### Direct-network runs
    
    When proxy enforcement is not active and full network access is allowed:
    
    - the run is assigned to the `online` identity
    - no proxy-only firewall restrictions are applied
    - the process gets normal direct network access
    
    ### Unelevated vs elevated
    
    The restricted-token / unelevated path cannot enforce per-identity
    firewall policy by itself.
    
    So for Windows proxy-enforced runs, we transparently use the logon-user
    sandbox path under the hood, even if the caller started from the
    unelevated mode. That keeps enforcement real instead of best-effort.
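    
    The identity choice and the `offline` firewall rules can be modeled as
    a pair of pure functions. This is a sketch for reasoning about the
    policy, not the Windows firewall code: all names are hypothetical, it
    assumes that `allow_local_binding=1` opens all loopback traffic, and it
    assumes that a run with neither proxying nor full network access stays
    on `offline`.
    
    ```rust
    use std::net::IpAddr;
    
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum Protocol {
        Tcp,
        Udp,
    }
    
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum SandboxIdentity {
        Offline,
        Online,
    }
    
    /// Proxy-enforced runs always use `offline`; only true direct-network
    /// runs use `online`.
    fn pick_identity(proxy_enforced: bool, full_network_allowed: bool) -> SandboxIdentity {
        if !proxy_enforced && full_network_allowed {
            SandboxIdentity::Online
        } else {
            SandboxIdentity::Offline
        }
    }
    
    /// Model of the rules programmed for the `offline` user.
    struct OfflineRules {
        proxy_ports: Vec<u16>,
        allow_local_binding: bool,
    }
    
    fn offline_allows(rules: &OfflineRules, dest: IpAddr, port: u16, proto: Protocol) -> bool {
        // Block all non-loopback outbound traffic.
        if !dest.is_loopback() {
            return false;
        }
        // Assumption: the opt-in widens access to all of localhost.
        if rules.allow_local_binding {
            return true;
        }
        match proto {
            // Block loopback UDP, which also rules out local UDP-based DNS.
            Protocol::Udp => false,
            // Block loopback TCP except for the configured proxy ports.
            Protocol::Tcp => rules.proxy_ports.contains(&port),
        }
    }
    ```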
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • permissions: remove macOS seatbelt extension profiles (#15918)
    ## Why
    
    `PermissionProfile` should only describe the per-command permissions we
    still want to grant dynamically. Keeping
    `MacOsSeatbeltProfileExtensions` in that surface forced extra macOS-only
    approval, protocol, schema, and TUI branches for a capability we no
    longer want to expose.
    
    ## What changed
    
    - Removed the macOS-specific permission-profile types from
    `codex-protocol`, the app-server v2 API, and the generated
    schema/TypeScript artifacts.
    - Deleted the core and sandboxing plumbing that threaded
    `MacOsSeatbeltProfileExtensions` through execution requests and seatbelt
    construction.
    - Simplified macOS seatbelt generation so it always includes the fixed
    read-only preferences allowlist instead of carrying a configurable
    profile extension.
    - Removed the macOS additional-permissions UI/docs/test coverage and
    deleted the obsolete macOS permission modules.
    - Tightened `request_permissions` intersection handling so explicitly
    empty requested read lists are preserved only when that field was
    actually granted, avoiding zero-grant responses being stored as active
    permissions.
  • sandboxing: use OsString for SandboxCommand.program (#15897)
    ## Why
    
    `SandboxCommand.program` represents an executable path, but keeping it
    as `String` forced path-backed callers to run `to_string_lossy()` before
    the sandbox layer ever touched the command. That loses fidelity earlier
    than necessary and adds avoidable conversions in runtimes that already
    have a `PathBuf`.
    
    ## What changed
    
    - Changed `SandboxCommand.program` to `OsString`.
    - Updated `SandboxManager::transform` to keep the program and argv in
    `OsString` form until the `SandboxExecRequest` conversion boundary.
    - Switched the path-backed `apply_patch` and `js_repl` runtimes to pass
    `into_os_string()` instead of `to_string_lossy()`.
    - Updated the remaining string-backed builders and tests to match the
    new type while preserving the existing Linux helper `arg0` behavior.
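    
    A small sketch of the difference, using a hypothetical
    `SandboxCommandSketch` rather than the real struct: path-backed callers
    hand over the `OsString` directly, and any lossy conversion is deferred
    to the boundary that actually needs a `String`.
    
    ```rust
    use std::ffi::OsString;
    use std::path::PathBuf;
    
    /// Hypothetical, reduced version of a sandbox command.
    struct SandboxCommandSketch {
        program: OsString,
        args: Vec<OsString>,
    }
    
    /// Path-backed caller: no UTF-8 validation, no replacement characters.
    fn command_for(executable: PathBuf, args: &[&str]) -> SandboxCommandSketch {
        SandboxCommandSketch {
            program: executable.into_os_string(),
            args: args.iter().map(|a| OsString::from(*a)).collect(),
        }
    }
    
    /// Lossy conversion happens only where a `String` is required,
    /// for example when rendering the command for display.
    fn display_program(cmd: &SandboxCommandSketch) -> String {
        cmd.program.to_string_lossy().into_owned()
    }
    ```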
    
    ## Verification
    
    - `cargo test -p codex-sandboxing`
    - `just argument-comment-lint -p codex-core -p codex-sandboxing`
    - `cargo test -p codex-core` currently fails in unrelated existing
    config tests: `config::tests::approvals_reviewer_*` and
    `config::tests::smart_approvals_alias_*`
  • fix: support split carveouts in windows restricted-token sandbox (#14172)
    ## Summary
    - keep legacy Windows restricted-token sandboxing as the supported
    baseline
    - support the split-policy subset that restricted-token can enforce
    directly today
    - support full-disk read, the same writable root set as legacy
    `WorkspaceWrite`, and extra read-only carveouts under those writable
    roots via additional deny-write ACLs
    - continue to fail closed for unsupported split-only shapes, including
    explicit unreadable (`none`) carveouts, reopened writable descendants
    under read-only carveouts, and writable root sets that do not match the
    legacy workspace roots
    
    ## Example
    Given a filesystem policy like:
    
    ```toml
    ":root" = "read"
    ":cwd" = "write"
    "./docs" = "read"
    ```
    
    the restricted-token backend can keep the workspace writable while
    denying writes under `docs` by layering an extra deny-write carveout on
    top of the legacy workspace-write roots.
    
    A policy like:
    
    ```toml
    "/workspace" = "write"
    "/workspace/docs" = "read"
    "/workspace/docs/tmp" = "write"
    ```
    
    still fails closed, because the unelevated backend cannot reopen the
    nested writable descendant safely.
    
    ## Stack
    -> fix: support split carveouts in windows restricted-token sandbox
    #14172
    fix: support split carveouts in windows elevated sandbox #14568
  • Drop sandbox_permissions from sandbox exec requests (#15665)
    ## Summary
    - drop `sandbox_permissions` from the sandboxing `ExecOptions` and
    `ExecRequest` adapter types
    - remove the now-unused plumbing from shell, unified exec, JS REPL, and
    apply-patch runtime call sites
    - default reconstructed `ExecParams` to `SandboxPermissions::UseDefault`
    where the lower-level API still requires the field
    
    ## Testing
    - `just fmt`
    - `just argument-comment-lint`
    - `cargo test -p codex-core` (still running locally; first failures
    observed in `suite::cli_stream::responses_mode_stream_cli`,
    `suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_config_override`,
    and
    `suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_env_fallback`)
  • Use released DotSlash package for argument-comment lint (#15199)
    ## Why
    The argument-comment lint now has a packaged DotSlash artifact from
    [#15198](https://github.com/openai/codex/pull/15198), so the normal repo
    lint path should use that released payload instead of rebuilding the
    lint from source every time.
    
    That keeps `just clippy` and CI aligned with the shipped artifact while
    preserving a separate source-build path for people actively hacking on
    the lint crate.
    
    The current alpha package also exposed two integration wrinkles that the
    repo-side prebuilt wrapper needs to smooth over:
    - the bundled Dylint library filename includes the host triple, for
    example `@nightly-2025-09-18-aarch64-apple-darwin`, and Dylint derives
    `RUSTUP_TOOLCHAIN` from that filename
    - on Windows, Dylint's driver path also expects `RUSTUP_HOME` to be
    present in the environment
    
    Without those adjustments, the prebuilt CI jobs fail during `cargo
    metadata` or driver setup. This change makes the checked-in prebuilt
    wrapper normalize the packaged library name to the plain
    `nightly-2025-09-18` channel before invoking `cargo-dylint`, and it
    teaches both the wrapper and the packaged runner source to infer
    `RUSTUP_HOME` from `rustup show home` when the environment does not
    already provide it.
    
    After the prebuilt Windows lint job started running successfully, it
    also surfaced a handful of existing anonymous literal callsites in
    `windows-sandbox-rs`. This PR now annotates those callsites so the new
    cross-platform lint job is green on the current tree.
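    
    The channel normalization can be pictured with a sketch like the one
    below. It is not the wrapper's actual code: `normalize_nightly_channel`
    is a hypothetical helper that only handles the toolchain portion of the
    name (`nightly-YYYY-MM-DD` optionally followed by a host triple), not
    the full library filename.
    
    ```rust
    /// Strip a trailing host triple from a dated nightly channel name.
    /// Returns `None` when the input is not of the form `nightly-YYYY-MM-DD[-triple]`.
    fn normalize_nightly_channel(name: &str) -> Option<String> {
        // Split into at most five pieces so the triple stays in one chunk.
        let mut parts = name.splitn(5, '-');
        let channel = parts.next()?;
        let (year, month, day) = (parts.next()?, parts.next()?, parts.next()?);
        let numeric = |s: &str, len: usize| s.len() == len && s.bytes().all(|b| b.is_ascii_digit());
        if channel == "nightly" && numeric(year, 4) && numeric(month, 2) && numeric(day, 2) {
            Some(format!("{channel}-{year}-{month}-{day}"))
        } else {
            None
        }
    }
    ```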
    
    ## What Changed
    - checked in the current
    `tools/argument-comment-lint/argument-comment-lint` DotSlash manifest
    - kept `tools/argument-comment-lint/run.sh` as the source-build wrapper
    for lint development
    - added `tools/argument-comment-lint/run-prebuilt-linter.sh` as the
    normal enforcement path, using the checked-in DotSlash package and
    bundled `cargo-dylint`
    - updated `just clippy` and `just argument-comment-lint` to use the
    prebuilt wrapper
    - split `.github/workflows/rust-ci.yml` so source-package checks live in
    a dedicated `argument_comment_lint_package` job, while the released lint
    runs in an `argument_comment_lint_prebuilt` matrix on Linux, macOS, and
    Windows
    - kept the pinned `nightly-2025-09-18` toolchain install in the prebuilt
    CI matrix, since the prebuilt package still relies on rustup-provided
    toolchain components
    - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` to
    normalize host-qualified nightly library filenames, keep the `rustup`
    shim directory ahead of direct toolchain `cargo` binaries, and export
    `RUSTUP_HOME` when needed for Windows Dylint driver setup
    - updated `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
    so future published DotSlash artifacts apply the same nightly-filename
    normalization and `RUSTUP_HOME` inference internally
    - fixed the remaining Windows lint violations in
    `codex-rs/windows-sandbox-rs` by adding the required `/*param*/`
    comments at the reported callsites
    - documented the checked-in DotSlash file, wrapper split, archive
    layout, nightly prerequisite, and Windows `RUSTUP_HOME` requirement in
    `tools/argument-comment-lint/README.md`
  • feat: support restricted ReadOnlyAccess in elevated Windows sandbox (#14610)
    ## Summary
    - support legacy `ReadOnlyAccess::Restricted` on Windows in the elevated
    setup/runner backend
    - keep the unelevated restricted-token backend on the legacy full-read
    model only, and fail closed for restricted read-only policies there
    - keep the legacy full-read Windows path unchanged while deriving
    narrower read roots only for elevated restricted-read policies
    - honor `include_platform_defaults` by adding backend-managed Windows
    system roots only when requested, while always keeping helper roots and
    the command `cwd` readable
    - preserve `workspace-write` semantics by keeping writable roots
    readable when restricted read access is in use in the elevated backend
    - document the current Windows boundary: legacy `SandboxPolicy` is
    supported on both backends, while richer split-only carveouts still fail
    closed instead of running with weaker enforcement
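    
    A sketch of how the narrower read roots might be assembled for an
    elevated restricted-read policy, following the rules listed above.
    `ReadRootInputs` and `restricted_read_roots` are hypothetical names,
    and the paths used with them are placeholders.
    
    ```rust
    use std::collections::BTreeSet;
    use std::path::PathBuf;
    
    /// Hypothetical inputs for deriving read roots.
    struct ReadRootInputs {
        include_platform_defaults: bool,
        platform_defaults: Vec<PathBuf>,
        helper_roots: Vec<PathBuf>,
        cwd: PathBuf,
        readable_roots: Vec<PathBuf>,
        writable_roots: Vec<PathBuf>,
    }
    
    fn restricted_read_roots(inputs: &ReadRootInputs) -> BTreeSet<PathBuf> {
        let mut roots = BTreeSet::new();
        // Helper roots and the command cwd are always readable.
        roots.extend(inputs.helper_roots.iter().cloned());
        roots.insert(inputs.cwd.clone());
        // Roots the policy explicitly marks readable.
        roots.extend(inputs.readable_roots.iter().cloned());
        // Writable roots stay readable to preserve workspace-write semantics.
        roots.extend(inputs.writable_roots.iter().cloned());
        // Backend-managed system roots are added only on request.
        if inputs.include_platform_defaults {
            roots.extend(inputs.platform_defaults.iter().cloned());
        }
        roots
    }
    ```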
    
    ## Testing
    - `cargo test -p codex-windows-sandbox`
    - `cargo check -p codex-windows-sandbox --tests --target
    x86_64-pc-windows-msvc`
    - `cargo clippy -p codex-windows-sandbox --tests --target
    x86_64-pc-windows-msvc -- -D warnings`
    - `cargo test -p codex-core windows_restricted_token_`
    
    ## Notes
    - local `cargo test -p codex-windows-sandbox` on macOS only exercises
    the non-Windows stubs; the Windows-targeted compile and clippy runs
    provide the local signal, and GitHub Windows CI exercises the runtime
    path
  • Apply argument comment lint across codex-rs (#14652)
    ## Why
    
    Once the repo-local lint exists, `codex-rs` needs to follow the
    checked-in convention and CI needs to keep it from drifting. This commit
    applies the fallback `/*param*/` style consistently across existing
    positional literal call sites without changing those APIs.
    
    The longer-term preference is still to avoid APIs that require comments
    by choosing clearer parameter types and call shapes. This PR is
    intentionally the mechanical follow-through for the places where the
    existing signatures stay in place.
    
    After rebasing onto newer `main`, the rollout also had to cover newly
    introduced `tui_app_server` call sites. That made it clear the first cut
    of the CI job was too expensive for the common path: it was spending
    almost as much time installing `cargo-dylint` and re-testing the lint
    crate as a representative test job spends running product tests. The CI
    update keeps the full workspace enforcement but trims that extra
    overhead from ordinary `codex-rs` PRs.
    
    ## What changed
    
    - keep a dedicated `argument_comment_lint` job in `rust-ci`
    - mechanically annotate remaining opaque positional literals across
    `codex-rs` with exact `/*param*/` comments, including the rebased
    `tui_app_server` call sites that now fall under the lint
    - keep the checked-in style aligned with the lint policy by using
    `/*param*/` and leaving string and char literals uncommented
    - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
    registry/git metadata in the lint job
    - split changed-path detection so the lint crate's own `cargo test` step
    runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
    - continue to run the repo wrapper over the `codex-rs` workspace, so
    product-code enforcement is unchanged
    
    Most of the code changes in this commit are intentionally mechanical
    comment rewrites or insertions driven by the lint itself.
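    
    For readers unfamiliar with the convention, a made-up example of the
    `/*param*/` style (the functions are illustrative and do not exist in
    `codex-rs`):
    
    ```rust
    fn resize_pty(cols: u16, rows: u16, notify_child: bool) -> String {
        format!("{cols}x{rows} notify={notify_child}")
    }
    
    fn greet(name: &str) -> String {
        format!("hello {name}")
    }
    
    fn describe_default_resize() -> String {
        // Opaque positional literals carry the exact parameter name.
        resize_pty(/*cols*/ 80, /*rows*/ 24, /*notify_child*/ true)
    }
    
    fn default_greeting() -> String {
        // String and char literals are left uncommented.
        greet("codex")
    }
    ```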
    
    ## Verification
    
    - `./tools/argument-comment-lint/run.sh --workspace`
    - `cargo test -p codex-tui-app-server -p codex-tui`
    - parsed `.github/workflows/rust-ci.yml` locally with PyYAML
    
    ---
    
    * -> #14652
    * #14651
  • Use a private desktop for Windows sandbox instead of Winsta0\Default (#14400)
    ## Summary
    - launch Windows sandboxed children on a private desktop instead of
    `Winsta0\Default`
    - make private desktop the default while keeping
    `windows.sandbox_private_desktop=false` as the escape hatch
    - centralize process launch through the shared
    `create_process_as_user(...)` path
    - scope the private desktop ACL to the launching logon SID
    
    ## Why
    Today sandboxed Windows commands run on the visible shared desktop. That
    leaves an avoidable same-desktop attack surface for window interaction,
    spoofing, and related UI/input issues. This change moves sandboxed
    commands onto a dedicated per-launch desktop by default so the sandbox
    no longer shares `Winsta0\Default` with the user session.
    
    The implementation stays conservative on security: there is no silent
    fallback to `Winsta0\Default`.
    
    If private-desktop setup fails on a machine, users can still opt out
    explicitly with `windows.sandbox_private_desktop=false`.
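    
    The opt-out behavior can be sketched as below. `choose_desktop` is a
    hypothetical helper, and the closure stands in for whatever creates the
    private desktop; the point is only that a setup failure surfaces as an
    error rather than quietly selecting the shared desktop.
    
    ```rust
    const SHARED_DESKTOP: &str = "Winsta0\\Default";
    
    /// Pick the desktop name for a sandboxed launch.
    fn choose_desktop<F>(private_desktop: bool, create_private: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        if !private_desktop {
            // Explicit opt-out via configuration.
            return Ok(SHARED_DESKTOP.to_string());
        }
        // Any error propagates to the caller; there is no fallback branch.
        create_private()
    }
    ```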
    
    ## Validation
    - `cargo build -p codex-cli`
    - elevated-path `codex exec` desktop-name probe returned
    `CodexSandboxDesktop-*`
    - elevated-path `codex exec` smoke sweep for shell commands, nested
    `pwsh`, jobs, and hidden `notepad` launch
    - unelevated-path full private-desktop compatibility sweep via `codex
    exec` with `-c windows.sandbox=unelevated`
  • fix: move inline codex-rs/core unit tests into sibling files (#14444)
    ## Why
    PR #13783 moved the `codex.rs` unit tests into `codex_tests.rs`. This
    applies the same extraction pattern across the rest of `codex-rs/core`
    so the production modules stay focused on runtime code instead of large
    inline test blocks.
    
    Keeping the tests in sibling files also makes follow-up edits easier to
    review because product changes no longer have to share a file with
    hundreds or thousands of lines of test scaffolding.
    
    ## What changed
    - replaced each inline `mod tests { ... }` in `codex-rs/core/src/**`
    with a path-based module declaration
    - moved each extracted unit test module into a sibling `*_tests.rs`
    file, using `mod_tests.rs` for `mod.rs` modules
    - preserved the existing `cfg(...)` guards and module-local structure so
    the refactor remains structural rather than behavioral
    
    ## Testing
    - `cargo test -p codex-core --lib` (`1653 passed; 0 failed; 5 ignored`)
    - `just fix -p codex-core`
    - `cargo fmt --check`
    - `cargo shear`
  • refactor: make bubblewrap the default Linux sandbox (#13996)
    ## Summary
    - make bubblewrap the default Linux sandbox and keep
    `use_legacy_landlock` as the only override
    - remove `use_linux_sandbox_bwrap` from feature, config, schema, and
    docs surfaces
    - update Linux sandbox selection, CLI/config plumbing, and related
    tests/docs to match the new default
    - fold in the follow-up CI fixes for request-permissions responses and
    Linux read-only sandbox error text
  • safety: honor filesystem policy carveouts in apply_patch (#13445)
    ## Why
    
    `apply_patch` safety approval was still checking writable paths through
    the legacy `SandboxPolicy` projection.
    
    That can hide explicit `none` carveouts when a split filesystem policy
    projects back to compatibility `ExternalSandbox`, which leaves one more
    approval path that can auto-approve writes inside paths that are
    intentionally blocked.
    
    ## What changed
    
    - passed `turn.file_system_sandbox_policy` into `assess_patch_safety`
    - changed writable-path checks to derive effective access from
    `FileSystemSandboxPolicy` instead of the legacy `SandboxPolicy`
    - made those checks reject explicit unreadable roots before considering
    broad write access or writable roots
    - added regression coverage showing that an `ExternalSandbox`
    compatibility projection still asks for approval when the split
    filesystem policy blocks a subpath
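    
    The ordering of those checks matters, and a reduced sketch makes it
    visible. `FileSystemPolicySketch` and `can_auto_approve_write` are
    hypothetical names; the real `FileSystemSandboxPolicy` is richer than
    three fields.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Hypothetical, flattened view of a split filesystem policy.
    struct FileSystemPolicySketch {
        full_disk_write: bool,
        writable_roots: Vec<PathBuf>,
        unreadable_roots: Vec<PathBuf>,
    }
    
    /// Returns true only when a write to `target` needs no approval.
    fn can_auto_approve_write(policy: &FileSystemPolicySketch, target: &Path) -> bool {
        // 1. Explicit `none` carveouts win over everything else.
        if policy.unreadable_roots.iter().any(|root| target.starts_with(root)) {
            return false;
        }
        // 2. Only then consider broad write access.
        if policy.full_disk_write {
            return true;
        }
        // 3. Finally fall back to the writable roots.
        policy.writable_roots.iter().any(|root| target.starts_with(root))
    }
    ```
    
    Checking `full_disk_write` first would auto-approve a write inside a
    blocked subpath, which is the failure mode this change closes.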
    
    ## Verification
    
    - `cargo test -p codex-core safety::tests::`
    - `cargo test -p codex-core test_sandbox_config_parsing`
    - `cargo clippy -p codex-core --all-targets -- -D warnings`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13445).
    * #13453
    * #13452
    * #13451
    * #13449
    * #13448
    * __->__ #13445
    * #13440
    * #13439
    
    ---------
    
    Co-authored-by: viyatb-oai <viyatb@openai.com>
  • sandboxing: plumb split sandbox policies through runtime (#13439)
    ## Why
    
    `#13434` introduces split `FileSystemSandboxPolicy` and
    `NetworkSandboxPolicy`, but the runtime still made most execution-time
    sandbox decisions from the legacy `SandboxPolicy` projection.
    
    That projection loses information about combinations like unrestricted
    filesystem access with restricted network access. In practice, that
    means the runtime can choose the wrong platform sandbox behavior or set
    the wrong network-restriction environment for a command even when config
    has already separated those concerns.
    
    This PR carries the split policies through the runtime so sandbox
    selection, process spawning, and exec handling can consult the policy
    that actually matters.
    
    ## What changed
    
    - threaded `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` through
    `TurnContext`, `ExecRequest`, sandbox attempts, shell escalation state,
    unified exec, and app-server exec overrides
    - updated sandbox selection in `core/src/sandboxing/mod.rs` and
    `core/src/exec.rs` to key off `FileSystemSandboxPolicy.kind` plus
    `NetworkSandboxPolicy`, rather than inferring behavior only from the
    legacy `SandboxPolicy`
    - updated process spawning in `core/src/spawn.rs` and the platform
    wrappers to use `NetworkSandboxPolicy` when deciding whether to set
    `CODEX_SANDBOX_NETWORK_DISABLED`
    - kept additional-permissions handling and legacy `ExternalSandbox`
    compatibility projections aligned with the split policies, including
    explicit user-shell execution and Windows restricted-token routing
    - updated callers across `core`, `app-server`, and `linux-sandbox` to
    pass the split policies explicitly
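
    As a rough sketch of the spawn-side decision, assume two simplified
    stand-ins for the split policies. `FsKind`, `NetPolicy`, and
    `sandbox_env` are hypothetical names, and the value `"1"` is an
    assumption; only the variable name `CODEX_SANDBOX_NETWORK_DISABLED`
    comes from the description above.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical stand-in for the filesystem policy kind.
    #[derive(Clone, Copy, PartialEq, Eq, Debug)]
    enum FsKind {
        Restricted,
        Unrestricted,
    }

    /// Hypothetical stand-in for the network policy.
    #[derive(Clone, Copy, PartialEq, Eq, Debug)]
    enum NetPolicy {
        Restricted,
        Enabled,
    }

    /// Builds the extra environment for a spawned command. Only the network
    /// policy is consulted; the filesystem kind is deliberately ignored.
    fn sandbox_env(_fs: FsKind, net: NetPolicy) -> HashMap<String, String> {
        let mut env = HashMap::new();
        if net == NetPolicy::Restricted {
            // The value "1" is an assumption made for this sketch.
            env.insert(
                "CODEX_SANDBOX_NETWORK_DISABLED".to_string(),
                "1".to_string(),
            );
        }
        env
    }
    ```

    The combination `FsKind::Unrestricted` with `NetPolicy::Restricted` is
    the case a single legacy projection cannot express, and here it still
    yields the network-disabled variable.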
    
    ## Verification
    
    - added regression coverage in `core/tests/suite/user_shell_cmd.rs` to
    verify `RunUserShellCommand` does not inherit
    `CODEX_SANDBOX_NETWORK_DISABLED` from the active turn
    - added coverage in `core/src/exec.rs` for Windows restricted-token
    sandbox selection when the legacy projection is `ExternalSandbox`
    - updated Linux sandbox coverage in
    `linux-sandbox/tests/suite/landlock.rs` to exercise the split-policy
    exec path
    - verified the current PR state with `just clippy`
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13439).
    * #13453
    * #13452
    * #13451
    * #13449
    * #13448
    * #13445
    * #13440
    * __->__ #13439
    
    ---------
    
    Co-authored-by: viyatb-oai <viyatb@openai.com>
  • app-server: Add streaming and tty/pty capabilities to command/exec (#13640)
    * Add an ability to stream stdin, stdout, and stderr
    * Streaming of stdout and stderr has a configurable cap for total amount
    of transmitted bytes (with an ability to disable it)
    * Add support for overriding environment variables
    * Add an ability to terminate running applications (using
    `command/exec/terminate`)
    * Add TTY/PTY support, with an ability to resize the terminal (using
    `command/exec/resize`)
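
    One way to picture the configurable output cap is a byte budget that
    each streamed chunk is charged against. This is a sketch under
    assumptions: `OutputCap` and `admit` are hypothetical names, and the
    actual app-server implementation may track the cap differently.

    ```rust
    /// Hypothetical byte budget for a streamed output channel.
    /// `None` means the cap is disabled.
    struct OutputCap {
        remaining: Option<usize>,
    }

    impl OutputCap {
        fn new(cap: Option<usize>) -> Self {
            OutputCap { remaining: cap }
        }

        /// Returns the prefix of `chunk` that still fits under the cap and
        /// charges it against the budget.
        fn admit<'a>(&mut self, chunk: &'a [u8]) -> &'a [u8] {
            match self.remaining.as_mut() {
                None => chunk,
                Some(left) => {
                    let n = chunk.len().min(*left);
                    *left -= n;
                    &chunk[..n]
                }
            }
        }
    }
    ```

    With a cap of 5 bytes, the chunks `abc` and `defg` are forwarded as
    `abc` and `de`, and later chunks are dropped entirely.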
  • refactor: prepare unified exec for zsh-fork backend (#13392)
    ## Why
    
    `shell_zsh_fork` already provides stronger guarantees around which
    executables receive elevated permissions. To reuse that machinery from
    unified exec without pushing Unix-specific escalation details through
    generic runtime code, the escalation bootstrap and session lifetime
    handling need a cleaner boundary.
    
    That boundary also needs to be safe for long-lived sessions: when an
    intercepted shell session is closed or pruned, any in-flight approval
    workers and any already-approved escalated child they spawned must be
    torn down with the session, and the inherited escalation socket must not
    leak into unrelated subprocesses.
    
    ## What Changed
    
    - Extracted a reusable `EscalationSession` and
    `EscalateServer::start_session(...)` in `shell-escalation` so callers
    can get the wrapper/socket env overlay and keep the escalation server
    alive without immediately running a one-shot command.
    - Documented that `EscalationSession::env()` and
    `ShellCommandExecutor::run(...)` exchange only that env overlay, which
    callers must merge into their own base shell environment.
    - Clarified the prepared-exec helper boundary in `core` by naming the
    new helper APIs around `ExecRequest`, while keeping the legacy
    `execute_env(...)` entrypoints as thin compatibility wrappers for
    existing callers that still use the older naming.
    - Added a small post-spawn hook on the prepared execution path so the
    parent copy of the inheritable escalation socket is closed immediately
    after both the existing one-shot shell-command spawn and the
    unified-exec spawn.
    - Made session teardown explicit with session-scoped cancellation:
    dropping an `EscalationSession` or canceling its parent request now
    stops intercept workers, and the server-spawned escalated child uses
    `kill_on_drop(true)` so teardown cannot orphan an already-approved
    child.
    - Added `UnifiedExecBackendConfig` plumbing through `ToolsConfig`, a
    `shell::zsh_fork_backend` facade, and an opaque unified-exec
    spawn-lifecycle hook so unified exec can prepare a wrapped `zsh -c/-lc`
    request without storing `EscalationSession` directly in generic
    process/runtime code.
    - Kept the existing `shell_command` zsh-fork behavior intact on top of
    the new bootstrap path. Tool selection is unchanged in this PR: when
    `shell_zsh_fork` is enabled, `ShellCommand` still wins over
    `exec_command`.
    
    ## Verification
    
    - `cargo test -p codex-shell-escalation`
      - includes coverage for `start_session_exposes_wrapper_env_overlay`
      - includes coverage for `exec_closes_parent_socket_after_shell_spawn`
      - includes coverage for
      `dropping_session_aborts_intercept_workers_and_kills_spawned_child`
    - `cargo test -p codex-core
    shell_zsh_fork_prefers_shell_command_over_unified_exec`
    - `cargo test -p codex-core --test all
    shell_zsh_fork_prompts_for_skill_script_execution`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13392).
    * #13432
    * __->__ #13392
  • fix: use AbsolutePathBuf for permission profile file roots (#12970)
    ## Why
    `PermissionProfile` should describe filesystem roots as absolute paths
    at the type level. Using `PathBuf` in `FileSystemPermissions` made the
    shared type too permissive and blurred together three different
    deserialization cases:
    
    - skill metadata in `agents/openai.yaml`, where relative paths should
    resolve against the skill directory
    - app-server API payloads, where callers should have to send absolute
    paths
    - local tool-call payloads for commands like `shell_command` and
    `exec_command`, where `additional_permissions.file_system` may
    legitimately be relative to the command `workdir`
    
    This change tightens the shared model without regressing the existing
    local command flow.
    
    ## What Changed
    - changed `protocol::models::FileSystemPermissions` and the app-server
    `AdditionalFileSystemPermissions` mirror to use `AbsolutePathBuf`
    - wrapped skill metadata deserialization in `AbsolutePathBufGuard`, so
    relative permission roots in `agents/openai.yaml` resolve against the
    containing skill directory
    - kept app-server/API deserialization strict, so relative
    `additionalPermissions.fileSystem.*` paths are rejected at the boundary
    - restored cwd/workdir-relative deserialization for local tool-call
    payloads by parsing `shell`, `shell_command`, and `exec_command`
    arguments under an `AbsolutePathBufGuard` rooted at the resolved command
    working directory
    - simplified runtime additional-permission normalization so it only
    canonicalizes and deduplicates absolute roots instead of trying to
    recover relative ones later
    - updated the app-server schema fixtures, `app-server/README.md`, and
    the affected transport/TUI tests to match the final behavior
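
    The three deserialization cases differ only in which base directory, if
    any, a relative path may be resolved against. A minimal sketch of that
    rule, assuming Unix-style paths; `resolve_root` is a hypothetical helper
    and not the `AbsolutePathBufGuard` mechanism itself:

    ```rust
    use std::path::{Path, PathBuf};

    /// Resolves `raw` to an absolute path. With `base = None` the caller is
    /// in strict mode and relative input is rejected; otherwise relative
    /// input is joined onto `base`.
    fn resolve_root(raw: &str, base: Option<&Path>) -> Result<PathBuf, String> {
        let path = Path::new(raw);
        if path.is_absolute() {
            return Ok(path.to_path_buf());
        }
        match base {
            Some(dir) => Ok(dir.join(path)),
            None => Err(format!("relative path rejected: {raw}")),
        }
    }
    ```

    Skill metadata would pass the skill directory as `base`, local tool
    calls would pass the resolved command working directory, and the
    app-server boundary would pass `None`.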
  • feat: include sandbox config with escalation request (#12839)
    ## Why
    
    Before this change, an escalation approval could say that a command
    should be rerun, but it could not carry the sandbox configuration that
    should still apply when the escalated command is actually spawned.
    
    That left an unsafe gap in the `zsh-fork` skill path: skill scripts
    under `scripts/` that did not declare permissions could be escalated
    without a sandbox, and scripts that did declare permissions could lose
    their bounded sandbox on rerun or cached session approval.
    
    This PR extends the escalation protocol so approvals can optionally
    carry sandbox configuration all the way through execution. That lets the
    shell runtime preserve the intended sandbox instead of silently widening
    access.
    
    We likely want a single permissions type for this codepath eventually,
    probably centered on `Permissions`. For now, the protocol needs to
    represent both the existing `PermissionProfile` form and the fuller
    `Permissions` form, so this introduces a temporary disjoint union,
    `EscalationPermissions`, to carry either one.
    
    Further, this means that today, a skill either:
    
    - does not declare any permissions, in which case it is run using the
    default sandbox for the turn
    - specifies permissions, in which case the skill is run using that exact
    sandbox, which might be more restrictive than the default sandbox for
    the turn
    
    We will likely change the skill's permissions to be additive to the
    existing permissions for the turn.
    
    ## What Changed
    
    - Added `EscalationPermissions` to `codex-protocol` so escalation
    requests can carry either a `PermissionProfile` or a full `Permissions`
    payload.
    - Added an explicit `EscalationExecution` mode to the shell escalation
    protocol so reruns distinguish between `Unsandboxed`, `TurnDefault`, and
    `Permissions(...)` instead of overloading `None`.
    - Updated `zsh-fork` shell reruns to resolve `TurnDefault` at execution
    time, which keeps ordinary `UseDefault` commands on the turn sandbox and
    preserves turn-level macOS seatbelt profile extensions.
    - Updated the `zsh-fork` skill path so a skill with no declared
    permissions inherits the conversation's effective sandbox instead of
    escalating unsandboxed.
    - Updated the `zsh-fork` skill path so a skill with declared permissions
    reruns with exactly those permissions, including when a cached session
    approval is reused.
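
    The three rerun modes can be sketched as an enum that is resolved only
    when the command is spawned. The variant names follow the description
    above; `SandboxSpec` and `resolve` are hypothetical, and the real
    payload type is `EscalationPermissions` rather than a plain string.

    ```rust
    /// Hypothetical placeholder for whatever sandbox description applies.
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct SandboxSpec(String);

    /// Sketch of the explicit rerun modes.
    #[derive(Clone, Debug, PartialEq, Eq)]
    enum EscalationExecution {
        Unsandboxed,
        TurnDefault,
        Permissions(SandboxSpec),
    }

    /// Resolves a mode to the sandbox used at spawn time. `TurnDefault` is
    /// resolved lazily against the turn's sandbox; `None` means no sandbox.
    fn resolve(
        mode: &EscalationExecution,
        turn_default: &SandboxSpec,
    ) -> Option<SandboxSpec> {
        match mode {
            EscalationExecution::Unsandboxed => None,
            EscalationExecution::TurnDefault => Some(turn_default.clone()),
            EscalationExecution::Permissions(spec) => Some(spec.clone()),
        }
    }
    ```

    Making `Unsandboxed` its own variant means "no sandbox" has to be chosen
    explicitly instead of being the meaning of a missing value.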
    
    ## Testing
    
    - Added unit coverage in
    `core/src/tools/runtimes/shell/unix_escalation.rs` for the explicit
    `UseDefault` / `RequireEscalated` / `WithAdditionalPermissions`
    execution mapping.
    - Added unit coverage in
    `core/src/tools/runtimes/shell/unix_escalation.rs` for macOS seatbelt
    extension preservation in both the `TurnDefault` and
    explicit-permissions rerun paths.
    - Added integration coverage in `core/tests/suite/skill_approval.rs` for
    permissionless skills inheriting the turn sandbox and explicit skill
    permissions remaining bounded across cached approval reuse.
  • Revert "Ensure shell command skills trigger approval (#12697)" (#12721)
    This reverts commit daf0f03ac8.
    
  • Ensure shell command skills trigger approval (#12697)
    Summary
    - detect skill-invoking shell commands based on the original command
    string, request approvals when needed, and cache positive decisions per
    session
    - keep implicit skill invocation emitted after approval and keep skill
    approval decline messaging centralized to the shell handler
    - expand and adjust skill approval tests to cover shell-based skill
    scripts while matching the new detection expectations
    
    Testing
    - Not run (not requested)
  • feat(core) Introduce Feature::RequestPermissions (#11871)
    ## Summary
    Introduces the initial implementation of Feature::RequestPermissions.
    RequestPermissions allows the model to request that a command be run
    inside the sandbox, with additional permissions, like writing to a
    specific folder. Eventually this will include other rules as well, and
    the ability to persist these permissions, but this PR is already quite
    large - let's get the core flow working and go from there!
    
    <img width="1279" height="541" alt="Screenshot 2026-02-15 at 2 26 22 PM"
    src="https://github.com/user-attachments/assets/0ee3ec0f-02ec-4509-91a2-809ac80be368"
    />
    
    ## Testing
    - [x] Added tests
    - [x] Tested locally
    - [x] Feature
  • Refactor network approvals to host/protocol/port scope (#12140)
    ## Summary
    Simplify network approvals by removing per-attempt proxy correlation and
    moving to session-level approval dedupe keyed by (host, protocol, port).
    Instead of encoding attempt IDs into proxy credentials/URLs, we now
    treat approvals as a destination policy decision.
    
    - Concurrent calls to the same destination share one approval prompt.
    - Different destinations (or same host on different ports) get separate
    prompts.
    - "Allow once" approves the current queued request group only.
    - "Allow for session" caches that (host, protocol, port) key and
    auto-allows future matching requests.
    - The "Never" policy continues to deny without prompting.
    
    Example:
    - 3 calls:
      - a.com:443
      - b.com:443
      - a.com:443
    => 2 prompts total (a, b); the second a.com call waits on the first decision.
    - a.com:80 is treated separately from a.com:443
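
    A simplified, single-threaded sketch of the session-level dedupe keyed
    by (host, protocol, port). `SessionApprovals` and `check` are
    hypothetical names, and this version does not model concurrent requests
    waiting on one in-flight prompt; it only shows how the key decides when
    a prompt is needed.

    ```rust
    use std::collections::HashSet;

    /// Approval key: one decision per (host, protocol, port).
    type Destination = (String, String, u16);

    /// Hypothetical session cache for "allow for session" decisions.
    #[derive(Default)]
    struct SessionApprovals {
        allowed: HashSet<Destination>,
        prompts_shown: usize,
    }

    impl SessionApprovals {
        /// Returns true if the request may proceed. Calls `ask` (the prompt)
        /// only when the destination is not already approved.
        fn check(
            &mut self,
            host: &str,
            protocol: &str,
            port: u16,
            ask: impl FnOnce() -> bool,
        ) -> bool {
            let key = (host.to_string(), protocol.to_string(), port);
            if self.allowed.contains(&key) {
                return true;
            }
            self.prompts_shown += 1;
            let approved = ask();
            if approved {
                self.allowed.insert(key);
            }
            approved
        }
    }
    ```

    Three requests to two distinct destinations on the same port produce two
    prompts here, and the same host on a different port prompts again.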
    
    ## Testing
    - `just fmt` (in `codex-rs`)
    - `cargo test -p codex-core tools::network_approval::tests`
    - `cargo test -p codex-core` (unit tests pass; existing
    integration-suite failures remain in this environment)
  • feat(core): zsh exec bridge (#12052)
    zsh fork PR stack:
    - https://github.com/openai/codex/pull/12051 
    - https://github.com/openai/codex/pull/12052 👈 
    
    ### Summary
    This PR introduces a feature-gated native shell runtime path that routes
    shell execution through a patched zsh exec bridge, removing MCP-specific
    behavior from the shell hot path while preserving existing
    CommandExecution lifecycle semantics.
    
    When shell_zsh_fork is enabled, shell commands run via patched zsh with
    per-`execve` interception through EXEC_WRAPPER. Core receives wrapper
    IPC requests over a Unix socket, applies existing approval policy, and
    returns allow/deny before the subcommand executes.
    
    ### What’s included
    **1) New zsh exec bridge runtime in core**
    - Wrapper-mode entrypoint (maybe_run_zsh_exec_wrapper_mode) for
    EXEC_WRAPPER invocations.
    - Per-execution Unix-socket IPC handling for wrapper requests/responses.
    - Approval callback integration using existing core approval
    orchestration.
    - Streaming stdout/stderr deltas to existing command output event
    pipeline.
    - Error handling for malformed IPC, denial/abort, and execution
    failures.
    
    **2) Session lifecycle integration**
    SessionServices now owns a `ZshExecBridge`.
    Session startup initializes bridge state; shutdown tears it down
    cleanly.
    
    **3) Shell runtime routing (feature-gated)**
    When `shell_zsh_fork` is enabled:
    - Build execution env/spec as usual.
    - Add wrapper socket env wiring.
    - Execute via `zsh_exec_bridge.execute_shell_request(...)` instead of
    the regular shell path.
    - Non-zsh-fork behavior remains unchanged.
    
    **4) Config + feature wiring**
    - Added `Feature::ShellZshFork` (under development).
    - Added config support for `zsh_path` (optional absolute path to patched
    zsh):
      - `Config`, `ConfigToml`, `ConfigProfile`, overrides, and schema.
    - Session startup validates that `zsh_path` exists and is usable when
    zsh-fork is enabled.
    - Added startup test for missing `zsh_path` failure mode.
    
    **5) Seatbelt/sandbox updates for wrapper IPC**
    - Extended seatbelt policy generation to optionally allow outbound
    connection to explicitly permitted Unix sockets.
    - Wired sandboxing path to pass wrapper socket path through to seatbelt
    policy generation.
    - Added/updated seatbelt tests for explicit socket allow rule and
    argument emission.
    
    **6) Runtime entrypoint hooks**
    - This allows the same binary to act as the zsh wrapper subprocess when
    invoked via `EXEC_WRAPPER`.
    
    **7) Tool selection behavior**
    - ToolsConfig now prefers ShellCommand type when shell_zsh_fork is
    enabled.
    - Added test coverage for precedence with unified-exec enabled.
  • feat(core): add structured network approval plumbing and policy decision model (#11672)
    ### Description
    #### Summary
    Introduces the core plumbing required for structured network approvals.
    
    #### What changed
    - Added structured network policy decision modeling in core.
    - Added approval payload/context types needed for network approval
    semantics.
    - Wired shell/unified-exec runtime plumbing to consume structured
    decisions.
    - Updated related core error/event surfaces for structured handling.
    - Updated protocol plumbing used by core approval flow.
    - Included small CLI debug sandbox compatibility updates needed by this
    layer.
    
    #### Why
    Establishes the minimal backend foundation for network approvals without
    yet changing high-level orchestration or TUI behavior.
    
    #### Notes
    - Behavior remains constrained by existing requirements/config gating.
    - Follow-up PRs in the stack handle orchestration, UX, and app-server
    integration.
    
    ---------
    
    Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
  • feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
    `SandboxPolicy::ReadOnly` previously implied broad read access and could
    not express a narrower read surface.
    This change introduces an explicit read-access model so we can support
    user-configurable read restrictions in follow-up work, while preserving
    current behavior today.
    
    It also ensures unsupported backends fail closed for restricted-read
    policies instead of silently granting broader access than intended.
    
    ## What
    
    - Added `ReadOnlyAccess` in protocol with:
      - `Restricted { include_platform_defaults, readable_roots }`
      - `FullAccess`
    - Updated `SandboxPolicy` to carry read-access configuration:
      - `ReadOnly { access: ReadOnlyAccess }`
      - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
    - Preserved existing behavior by defaulting current construction paths
    to `ReadOnlyAccess::FullAccess`.
    - Threaded the new fields through sandbox policy consumers and call
    sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
    related tests.
    - Updated Seatbelt policy generation to honor restricted read roots by
    emitting scoped read rules when full read access is not granted.
    - Added fail-closed behavior on Linux and Windows backends when
    restricted read access is requested but not yet implemented there
    (`UnsupportedOperation`).
    - Regenerated app-server protocol schema and TypeScript artifacts,
    including `ReadOnlyAccess`.
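
    A sketch of the read-access model and the fail-closed rule. The variant
    and field names come from the list above, but the field types and the
    `check_backend` helper are assumptions made for illustration.

    ```rust
    use std::path::PathBuf;

    /// Shape taken from the PR description; field types are assumptions.
    #[derive(Clone, Debug, PartialEq, Eq)]
    enum ReadOnlyAccess {
        Restricted {
            include_platform_defaults: bool,
            readable_roots: Vec<PathBuf>,
        },
        FullAccess,
    }

    /// Hypothetical capability check: a backend that cannot enforce
    /// restricted reads refuses to run instead of granting full read access.
    fn check_backend(
        access: &ReadOnlyAccess,
        supports_restricted_read: bool,
    ) -> Result<(), String> {
        match access {
            ReadOnlyAccess::FullAccess => Ok(()),
            ReadOnlyAccess::Restricted { .. } if supports_restricted_read => Ok(()),
            ReadOnlyAccess::Restricted { .. } => {
                Err("UnsupportedOperation: restricted read access".to_string())
            }
        }
    }
    ```

    Because existing construction paths default to `FullAccess`, the error
    branch is reached only once a caller opts in to `Restricted`.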
    
    ## Compatibility / rollout
    
    - Runtime behavior remains unchanged by default (`FullAccess`).
    - API/schema changes are in place so future config wiring can enable
    restricted read access without another policy-shape migration.
  • feat(sandbox): enforce proxy-aware network routing in sandbox (#11113)
    ## Summary
    - expand proxy env injection to cover common tool env vars
    (`HTTP_PROXY`/`HTTPS_PROXY`/`ALL_PROXY`/`NO_PROXY` families +
    tool-specific variants)
    - harden macOS Seatbelt network policy generation to route through
    inferred loopback proxy endpoints and fail closed when proxy env is
    malformed
    - thread proxy-aware Linux sandbox flags and add minimal bwrap netns
    isolation hook for restricted non-proxy runs
    - add/refresh tests for proxy env wiring, Seatbelt policy generation,
    and Linux sandbox argument wiring
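
    The env injection in the first bullet might look roughly like this.
    `proxy_env` is a hypothetical helper, emitting both upper- and
    lower-case spellings is an assumption about what "families" covers, and
    the tool-specific variants are omitted.

    ```rust
    use std::collections::BTreeMap;

    /// Builds proxy variables for a child process. Both spellings of each
    /// variable are set because tools disagree on which one they read.
    fn proxy_env(proxy_url: &str, no_proxy: &str) -> BTreeMap<String, String> {
        let mut env = BTreeMap::new();
        for name in ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY"] {
            env.insert(name.to_string(), proxy_url.to_string());
            env.insert(name.to_lowercase(), proxy_url.to_string());
        }
        env.insert("NO_PROXY".to_string(), no_proxy.to_string());
        env.insert("no_proxy".to_string(), no_proxy.to_string());
        env
    }
    ```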
  • feat: include NetworkConfig through ExecParams (#11105)
    This PR adds the following field to `Config`:
    
    ```rust
    pub network: Option<NetworkProxy>,
    ```
    
    Though for the moment, it will always be initialized as `None` (this
    will be addressed in a subsequent PR).
    
    This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we spawn a process.
  • add sandbox policy and sandbox name to codex.tool.call metrics (#10711)
    This will give visibility into the comparative success rate of the
    Windows sandbox implementations compared to other platforms.
  • feat(linux-sandbox): add bwrap support (#9938)
    ## Summary
    This PR introduces a gated Bubblewrap (bwrap) Linux sandbox path. The
    current Linux sandbox path relies on in-process restrictions (including
    Landlock). Bubblewrap gives us a more uniform filesystem isolation
    model, in particular explicit writable roots with the option to make
    some directories read-only, along with granular network controls.
    
    This is behind a feature flag so we can validate behavior safely before
    making it the default.
    
    - Added temporary rollout flag:
      - `features.use_linux_sandbox_bwrap`
    - Preserved existing default path when the flag is off.
    - In Bubblewrap mode:
      - Added internal retry without /proc when /proc mount is not permitted
      by the host/container.
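
    The /proc fallback can be sketched as a two-attempt launch. This is an
    illustration only: `bwrap_args` and `run_with_proc_fallback` are
    hypothetical, the argument list is heavily abbreviated, and this sketch
    retries on any failure, whereas the PR describes retrying specifically
    when the /proc mount is not permitted.

    ```rust
    /// Builds an abbreviated bwrap argument list. `--unshare-pid` and
    /// `--proc /proc` are real bwrap flags; everything else is elided.
    fn bwrap_args(mount_proc: bool) -> Vec<String> {
        let mut args = vec!["--unshare-pid".to_string()];
        if mount_proc {
            args.extend(["--proc".to_string(), "/proc".to_string()]);
        }
        args
    }

    /// Runs `launch` with /proc mounted and, if that fails, retries once
    /// without it. `launch` stands in for spawning the sandboxed process.
    fn run_with_proc_fallback<T, E>(
        mut launch: impl FnMut(&[String]) -> Result<T, E>,
    ) -> Result<T, E> {
        match launch(&bwrap_args(true)) {
            Ok(out) => Ok(out),
            Err(_) => launch(&bwrap_args(false)),
        }
    }
    ```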
  • emit a metric when we can't spawn powershell (#10125)
    This will help diagnose and measure the impact of a user-reported bug
    with the elevated sandbox and powershell
  • remove sandbox globals. (#9797)
    Threads sandbox updates through OverrideTurnContext for active turn
    Passes computed sandbox type into safety/exec
  • Fix: cap aggregated exec output consistently (#9759)
    ## WHAT?
    - Bias aggregated output toward stderr under contention (2/3 stderr, 1/3
    stdout) while keeping the 1 MiB cap.
    - Rebalance unused stderr share back to stdout when stderr is tiny to
    avoid underfilling.
    - Add tests for contention, small-stderr rebalance, and under-cap
    ordering (stdout then stderr).
    
    ## WHY?
    - Review feedback requested stderr priority under contention.
    - Avoid underfilled aggregated output when stderr is small while
    preserving a consistent cap across exec paths.
    
    ## HOW?
    - Update `aggregate_output` to compute stdout/stderr shares, then
    reassign unused capacity to the other stream.
    - Use the helper in both Windows and async exec paths.
    - Add regression tests for contention/rebalance and under-cap ordering.
    
    ## BEFORE
    ```rust
    // Best-effort aggregate: stdout then stderr (capped).
    let mut aggregated = Vec::with_capacity(
        stdout
            .text
            .len()
            .saturating_add(stderr.text.len())
            .min(EXEC_OUTPUT_MAX_BYTES),
    );
    append_capped(&mut aggregated, &stdout.text, EXEC_OUTPUT_MAX_BYTES);
    append_capped(&mut aggregated, &stderr.text, EXEC_OUTPUT_MAX_BYTES);
    let aggregated_output = StreamOutput {
        text: aggregated,
        truncated_after_lines: None,
    };
    ```
    
    ## AFTER
    ```rust
    fn aggregate_output(
        stdout: &StreamOutput<Vec<u8>>,
        stderr: &StreamOutput<Vec<u8>>,
    ) -> StreamOutput<Vec<u8>> {
        let total_len = stdout.text.len().saturating_add(stderr.text.len());
        let max_bytes = EXEC_OUTPUT_MAX_BYTES;
        let mut aggregated = Vec::with_capacity(total_len.min(max_bytes));
    
        if total_len <= max_bytes {
            aggregated.extend_from_slice(&stdout.text);
            aggregated.extend_from_slice(&stderr.text);
            return StreamOutput {
                text: aggregated,
                truncated_after_lines: None,
            };
        }
    
        // Under contention, reserve 1/3 for stdout and 2/3 for stderr; rebalance unused stderr to stdout.
        let want_stdout = stdout.text.len().min(max_bytes / 3);
        let want_stderr = stderr.text.len();
        let stderr_take = want_stderr.min(max_bytes.saturating_sub(want_stdout));
        let remaining = max_bytes.saturating_sub(want_stdout + stderr_take);
        let stdout_take = want_stdout + remaining.min(stdout.text.len().saturating_sub(want_stdout));
    
        aggregated.extend_from_slice(&stdout.text[..stdout_take]);
        aggregated.extend_from_slice(&stderr.text[..stderr_take]);
    
        StreamOutput {
            text: aggregated,
            truncated_after_lines: None,
        }
    }
    ```
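
    To make the split easier to check by hand, the same arithmetic can be
    pulled into a helper that takes the cap as a parameter. `split_budget`
    is a hypothetical name used only for this worked example; it mirrors the
    byte counts computed in `aggregate_output` above.

    ```rust
    /// Returns (stdout_take, stderr_take) for the given lengths and cap,
    /// using the same arithmetic as `aggregate_output`.
    fn split_budget(
        stdout_len: usize,
        stderr_len: usize,
        max_bytes: usize,
    ) -> (usize, usize) {
        if stdout_len.saturating_add(stderr_len) <= max_bytes {
            return (stdout_len, stderr_len);
        }
        let want_stdout = stdout_len.min(max_bytes / 3);
        let stderr_take = stderr_len.min(max_bytes.saturating_sub(want_stdout));
        let remaining = max_bytes.saturating_sub(want_stdout + stderr_take);
        let stdout_take =
            want_stdout + remaining.min(stdout_len.saturating_sub(want_stdout));
        (stdout_take, stderr_take)
    }
    ```

    With a cap of 9 bytes, 10 bytes on each stream gives `(3, 6)`, which is
    the 1/3 and 2/3 split. If stderr has only 1 byte, the result is
    `(8, 1)`: the unused stderr share is handed back to stdout so the cap is
    still filled.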
    
    ## TESTS
    - [x] `just fmt`
    - [x] `just fix -p codex-core`
    - [x] `cargo test -p codex-core aggregate_output_`
    - [x] `cargo test -p codex-core`
    - [x] `cargo test --all-features`
    
    ## FIXES
    Fixes #9758
  • fix: memory leak issue (#9543)
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • feat: introduce ExternalSandbox policy (#8290)
    ## Description
    
    Introduces the `ExternalSandbox` policy for the case where the sandbox
    is defined by the outside environment. It effectively translates to
    `SandboxMode#DangerFullAccess` for the file system (since the sandbox is
    configured at the container level) plus a configurable `network_access`
    (either Restricted or Enabled by the outside environment).
    
    For example, you can configure the `ExternalSandbox` policy as part of
    the `sendUserTurn` v1 app_server API:
    
    ```
    {
      "conversationId": <id>,
      "cwd": <cwd>,
      "approvalPolicy": "never",
      "sandboxPolicy": {
        "type": "external-sandbox",
        "network_access": "enabled"/"restricted"
      },
      "model": <model>,
      "effort": <effort>,
      ....
    }
    ```
  • refactoring with_escalated_permissions to use SandboxPermissions instead (#7750)
    This will be helpful in the future if we want more granularity when
    requesting escalated permissions. For example, when running in a
    read-only sandbox, the model can request escalation to a sandbox that
    allows writes.