77 Commits

  • Add network environment ID plumbing (#28766)
    ## Why
    
    Prepare network approval scoping to distinguish execution environments
    without changing behavior yet.
    
    ## What changed
    
    - Add optional environment IDs to network policy requests.
    - Add optional network environment IDs to exec and sandbox request
    structs.
    - Thread default None values through existing construction points.
    - Fix stale constructor call sites that caused the CI compile failures.
    
    ## Not included
    
    - Per-environment proxy listeners.
    - Network approval cache or prompt behavior changes.
    - Ambiguous request attribution handling.
    
    Those behavior changes moved to stacked follow-up #28899.
    
    ## Validation
    
    - just fmt
    - CI will run tests and clippy
  • unified-exec: preserve PathUri through exec-server (#28681)
    ## Why
    
    It should be possible for app-server to handle "foreign" OS paths in
    unified_exec working directories, allowing e.g. a Linux app-server to
    run processes on e.g. a Windows exec-server.
    
    ## What
    
    Convert the core unified_exec cwd values to use `PathUri`.
    
    Adds fallible path conversion in several places to try to minimize the
    scope of this change. The only time this change suppresses errors from
    converting `PathUri` to an `AbsolutePathBuf` is when the turn is
    configured with no sandboxing at all to allow us to make progress
    testing without sandboxing.
    
    Future changes to apply_patch and sandboxing will clean up these error
    paths.
    
    A tool's cwd is resolved from joining a model-provided workdir to the
    environment's cwd. When using `AbsolutePathBuf::join()`, an
    absolute-path workdir would overwrite the environment's cwd and we would
    resolve permissions/sandboxing against the model-provided path. This
    change extends `PathUri::join()` to also treat an absolute rhs as an
    override of the base/lhs.
    
    This also removes some coverage from the remove_env_windows tests until
    a follow-up converts foreign paths in command exec events correctly.
    
    ## Breaking Changes
    
    When using `AbsolutePathBuf::join()` for workdir resolution, we ended up
    resolving tilde-prefixed paths against the app-server's `$HOME`, e.g.
    `~/foo/bar` becomes `/home/anp/foo/bar`. It's difficult to do this with
    `PathUri` joining, so after offline discussion this PR no longer
    implements it.
    
    A quick check of some power users' rollouts suggests that models don't
    actually generate home-prefixed absolute working directories for their
    spawns, so this shouldn't have any real blast radius.
  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
  • [codex] Bind shell snapshots to retained thread environments (#28421)
    ## Why
    
    Shell snapshots are currently session-scoped even though shell and cwd
    are properties of a selected turn environment. That makes snapshot
    refresh depend on separate session-cwd plumbing, prevents retained
    environments from retaining their snapshot work, and can make snapshot
    construction use a different shell than command execution.
    
    This follows #27955 by making the retained thread-environment service
    own environment snapshot lifecycles. Session configuration remains the
    requested selection state, while `ThreadEnvironments` remains the source
    of successfully resolved environments.
    
    ## What changed
    
    - Configure the shell-snapshot builder before initial environment
    resolution.
    - Start each local environment snapshot task when its `TurnEnvironment`
    is built and retain that shared task while environment ID and cwd still
    match.
    - Inherit retained environment snapshots into spawned child threads.
    - Carry the selected `TurnEnvironment` through shell runtimes so
    snapshot construction and command execution use the same
    environment-specific shell and cwd.
    - Load project instructions and warm plugins/skills after initial
    environment resolution.
    - Continue decoding invalid UTF-8 instruction files lossily without
    emitting a startup warning.
    - Keep requested selections in `SessionConfiguration`; failed or
    duplicate resolutions only affect the resolved environment snapshot.
    
    ## Validation
    
    - `cargo check -p codex-core --tests`
    - `just test -p codex-home instructions` (6 passed)
    - Focused environment, instruction, shell-snapshot, and user-shell tests
    (84 passed)
    - Focused shell-snapshot, user-shell, and unified-exec tests (126
    passed; two event-timing tests passed on retry)
  • Preserve hook trust bypass in codex exec threads (#26434)
    Addresses #26383 and #26452
    
    ## Summary
    
    `codex exec --dangerously-bypass-hook-trust` printed the bypass warning,
    but valid untrusted hooks still did not run.
    
    Exec applied the flag to its initial config, then lost it when
    app-server reloaded config for the new or resumed thread.
    
    ## Fix
    
    Forward `bypass_hook_trust: true` through the existing thread request
    config override for both start and resume.
    
    The override is omitted when the flag is not enabled, preserving normal
    trust behavior.
    
    ## Testing
    
    Added:
    
    - A test confirming start and resume preserve the override.
    - An end-to-end exec test confirming a `SessionStart` hook runs and
    creates a marker file.
  • [codex] Load user instructions through an injected provider (#27101)
    ## Why
    
    We want to remove implicit use of `$CODEX_HOME` from `codex-core` and
    make embedders responsible for supplying user-level instructions. This
    also ensures user instructions load when no primary environment is
    selected.
    
    ## What changed
    
    Stacked on #27415, which makes `codex exec` surface thread-scoped
    runtime warnings.
    
    - Added `UserInstructionsProvider` to `codex-extension-api`, with
    absolute source attribution and recoverable loading warnings.
    - Added `codex-home` with the filesystem-backed provider for
    `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback,
    trimming, lossy UTF-8 handling, and the existing uncapped global
    instruction size.
    - Removed global instruction loading from `Config` and require
    `ThreadManager` callers to inject a provider.
    - Load provider instructions once for each fresh root runtime, including
    runtimes without a primary environment. Running sessions retain their
    snapshot, while child agents inherit the parent snapshot without
    invoking the provider.
    - Keep provider instructions separate while loading project `AGENTS.md`,
    then assemble the model-visible instructions with the existing ordering,
    source attribution, warning, and turn-context behavior.
    - Wired the Codex home provider through the CLI, app server, MCP server,
    core facade, and thread-manager sample.
    
    ## Validation
    
    - `just test -p codex-home -p codex-extension-api`
    - `just test -p codex-core agents_md`
    - `just test -p codex-core guardian`
    - `just test -p codex-app-server
    thread_start_without_selected_environment_includes_only_global_instruction_source`
    - `just test -p codex-exec warning`
    - `just bazel-lock-check`
  • [codex] Surface runtime warnings in codex exec (#27415)
    ## Why
    
    `codex exec` drops thread-scoped warning notifications. Warnings
    discovered while a thread starts, including unreadable or invalid UTF-8
    project `AGENTS.md` files, therefore become silent.
    
    ## What changed
    
    - Process global and primary-thread warning notifications while
    continuing to ignore warnings from unrelated threads.
    - Render runtime warnings in human output and expose them through the
    existing non-fatal error item in JSONL output.
    - Add focused routing, rendering, and malformed project-instruction
    coverage.
  • Route AGENTS.md loading through environment filesystems (#26205)
    ## Why
    
    Workspace-specific `AGENTS.md` loading needs to use the selected
    environment filesystem so remote workspaces and child agents read
    instructions from their actual environment instead of the host
    filesystem. The app-server should report the same instruction sources
    the initialized thread actually loaded, rather than independently
    rescanning configuration and filesystem state.
    
    ## What changed
    
    - Introduce `LoadedAgentsMd` to retain ordered user, project, and
    internal instructions with their provenance.
    - Load and canonicalize workspace `AGENTS.md` paths through the primary
    `EnvironmentManager` environment, then render the loaded instructions
    when constructing turn context.
    - Expose cached loaded instruction sources from initialized threads and
    use them for app-server start, resume, and fork responses.
    - Preserve global `CODEX_HOME` loading and separator behavior while
    excluding empty project files that did not supply model-visible
    instructions.
    - Add integration coverage for CLI injection, selected-environment
    provenance and rendering, empty environment selection, and cached
    sources on loaded-thread resume.
    
    ## Validation
    
    - `just test -p codex-core agents_md`
    - `just test -p codex-core
    selected_environment_sources_match_model_visible_instructions`
    - `just test -p codex-exec agents_md`
    - `just test -p codex-app-server instruction_sources`
    - `just test -p codex-app-server --status-level fail`
  • Preserve auto-review approval policy in codex exec (#23763)
    ## Why
    
    `codex exec` was forcing headless runs to `approval_policy = "never"`
    even when the resolved reviewer was `auto_review`. That prevented
    unattended exec workflows from reaching the reviewed MCP write path they
    were configured to use.
    
    ## What changed
    
    - Keep the existing headless `never` default for ordinary exec runs.
    - Re-resolve exec config without that synthetic override when the final
    reviewer resolves to `AutoReview`, so configured or requirements-driven
    approval policy is preserved.
    - Add regression coverage for:
      - `auto_review` plus `on-request` from user config
    - requirements-driven `AutoReview`, asserting exec’s final approval
    policy matches the no-override control config exactly
    
    ## Validation
    
    - `just fmt`
    - `cargo test -p codex-exec`
  • windows-sandbox: pass workspace roots to runner (#24108)
    ## Why
    
    #23813 switches the Windows sandbox runner path to `PermissionProfile`,
    but it still left one runtime anchor for resolving symbolic
    `:workspace_roots` entries. That is not enough once a turn has multiple
    effective workspace roots: exact entries and deny globs under
    `:workspace_roots` need to be materialized for every runtime root before
    the command runner chooses token mode or builds ACL plans.
    
    ## What Changed
    
    - Replaces the Windows runner/setup `permission_profile_cwd` plumbing
    with `workspace_roots: Vec<AbsolutePathBuf>`.
    - Resolves Windows-local `PermissionProfile` data with
    `materialize_project_roots_with_workspace_roots(...)` instead of the
    single-cwd helper.
    - Threads `Config::effective_workspace_roots()` through core execution,
    unified exec, TUI setup/read-grant flows, app-server setup, app-server
    `command/exec`, and `debug sandbox` on Windows.
    - Preserves those workspace roots through the zsh-fork escalation
    executor instead of rebuilding them from `sandbox_policy_cwd`.
    - Makes `ExecRequest::new(...)` and the remaining
    `build_exec_request(...)` helper path take
    `windows_sandbox_workspace_roots` explicitly so new call sites cannot
    silently fall back to `vec![cwd]`.
    - Clarifies the `debug sandbox` non-Windows comment: remaining
    cwd-dependent resolution still uses `sandbox_policy_cwd`, while
    `:workspace_roots` entries are already materialized from config roots.
    - Updates elevated runner IPC `SpawnRequest` to send `workspace_roots`
    and bumps the framed IPC protocol version to `3` for the payload shape
    change.
    - Adds Windows-local resolver coverage for expanding exact and glob
    `:workspace_roots` entries across multiple roots, plus core helper
    coverage proving explicit roots are preserved.
    
    ## Verification
    
    - `cargo check -p codex-windows-sandbox -p codex-core -p codex-tui -p
    codex-cli -p codex-app-server`
    - `cargo test -p codex-windows-sandbox`
    - `cargo test -p codex-core windows_sandbox`
    - `cargo test -p codex-core unix_escalation`
    - `cargo test -p codex-app-server windows_sandbox`
    - `cargo test -p codex-tui windows_sandbox`
    - `cargo test -p codex-cli debug_sandbox`
    - `just test -p codex-core unified_exec`
    - `just test -p codex-core
    build_exec_request_preserves_windows_workspace_roots`
    - `env -u CODEX_NETWORK_PROXY_ACTIVE -u
    CODEX_NETWORK_ALLOW_LOCAL_BINDING just test -p codex-app-server --lib
    command_exec`
    - `just test -p codex-windows-sandbox`
    - `just test -p codex-exec sandbox`
    - `just fix -p codex-core -p codex-app-server -p codex-windows-sandbox`
    
    A local macOS cross-check with `cargo check --target
    x86_64-pc-windows-msvc ...` did not reach crate Rust code because native
    dependencies require Windows SDK headers (`windows.h` / `assert.h`) in
    this environment; Windows CI remains the real target validation.
    
    Two local targeted filters compile but do not run assertions on macOS:
    `env -u CODEX_NETWORK_PROXY_ACTIVE -u CODEX_NETWORK_ALLOW_LOCAL_BINDING
    just test -p codex-app-server --lib command_exec_processor` matched zero
    tests, and `just test -p codex-linux-sandbox landlock` matched zero
    tests because the landlock suite is Linux-only.
  • Support --output-schema for exec resume (#23123)
    ## Why
    
    `codex exec resume` should have the same structured-output support as
    top-level `codex exec`. Without `--output-schema`, multi-turn automation
    has to choose between resumed session context and schema-validated JSON
    output.
    
    Fixes #22998.
    
    ## What changed
    
    - Marked `--output-schema` as a global `codex exec` flag so it can be
    passed after `resume`.
    - Reused the existing output schema plumbing so resumed turns attach the
    schema to the final response request while preserving session context.
  • test: construct permission profiles directly (#23030)
    ## Why
    
    `SandboxPolicy` is now a legacy compatibility shape, but several tests
    still built a `SandboxPolicy` only to immediately convert it into
    `PermissionProfile` for APIs that already accept canonical runtime
    permissions. Those detours make it harder to audit where legacy sandbox
    policy is still required, because boundary-only usages are mixed
    together with ordinary test setup.
    
    ## What Changed
    
    - Updated tests in `codex-core`, `codex-exec`, `codex-analytics`, and
    `codex-config` to construct `PermissionProfile` values directly when the
    code under test takes a permission profile.
    - Changed exec-policy, request-permissions, session, and sandbox test
    helpers to pass `PermissionProfile` through instead of converting from
    `SandboxPolicy` internally.
    - Left `SandboxPolicy` in place where tests are explicitly exercising
    legacy compatibility or request/response boundaries.
    
    ## Test Plan
    
    - `cargo test -p codex-analytics -p codex-config`
    - `cargo test -p codex-core --lib safety::tests`
    - `cargo test -p codex-core --lib exec_policy::tests::`
    - `cargo test -p codex-core --lib exec::tests`
    - `cargo test -p codex-core --lib guardian_review_session_config`
    - `cargo test -p codex-core --lib tools::network_approval::tests`
    - `cargo test -p codex-core --lib
    tools::runtimes::shell::unix_escalation::tests`
    - `cargo test -p codex-core --lib managed_network`
    - `cargo test -p codex-core --test all request_permissions::`
    - `cargo test -p codex-exec sandbox`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23030).
    * #23036
    * __->__ #23030
  • tests: avoid ambient temp sandbox roots (#22576)
    ## Why
    Some sandboxed integration tests enabled both ambient temp roots
    (`TMPDIR` and literal `/tmp`) even though they were not testing
    temp-root behavior. On Linux bwrap, making `/tmp` writable causes
    protected metadata mount targets such as `/tmp/.git`, `/tmp/.agents`,
    and `/tmp/.codex` to be synthesized. If a run is interrupted, those
    top-level markers can be left behind and contaminate later tests.
    
    ## What changed
    For the incidental integration tests that do not need ambient temp-root
    access, set `exclude_tmpdir_env_var` and `exclude_slash_tmp` to `true`.
    Dedicated protected-metadata coverage remains in the lower-level sandbox
    tests that use isolated temp roots.
    
    ## Verification
    Focused remote devbox repros passed with a watcher polling `/tmp/.git`,
    `/tmp/.agents`, and `/tmp/.codex`; no leaked markers were observed.
  • Remove CODEX_RS_SSE_FIXTURE test hook (#22413)
    ## Why
    
    `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
    bypass the normal Responses transport by reading SSE from local files.
    That kept test-only behavior wired through production client code. The
    affected tests can stay hermetic by using the existing
    `core_test_support::responses` mock server and passing `openai_base_url`
    instead.
    
    ## What Changed
    
    - Removed the `CODEX_RS_SSE_FIXTURE` flag,
    `codex_api::stream_from_fixture`, the `env-flags` dependency, and the
    checked-in SSE fixture files.
    - Repointed the affected core, exec, and TUI tests at `MockServer` with
    the existing SSE event constructors.
    - Removed the Bazel test data plumbing for the deleted fixtures and
    refreshed cargo/Bazel lock state.
    
    ## Verification
    
    - `cargo build -p codex-cli`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core --test all responses_api_stream_cli`
    - `cargo test -p codex-core --test all
    integration_creates_and_checks_session_file`
    - `cargo test -p codex-exec --test all ephemeral`
    - `cargo test -p codex-exec --test all resume`
    - `cargo test -p codex-tui --test all
    resume_startup_does_not_consume_model_availability_nux_count`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
    - `git diff --check`
  • [codex] Delete function-style apply_patch (#21651)
    ## Why
    
    `apply_patch` is now a freeform/custom tool. Keeping the old
    JSON/function-style registration and parsing path left another way for
    models and tests to invoke `apply_patch`, which made the tool surface
    harder to reason about.
    
    ## What changed
    
    - Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
    spec, and handler support for function payloads.
    - Kept `apply_patch_tool_type = freeform` as the supported model
    metadata path, including Bedrock catalog metadata.
    - Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
    calls.
    
    ## Verification
    
    - `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
    - `cargo test -p codex-core tools::handlers::apply_patch --lib`
    - `cargo test -p codex-core --test all
    apply_patch_tool_executes_and_emits_patch_events`
    - `cargo test -p codex-core --test all
    apply_patch_reports_parse_diagnostics`
    - `cargo test -p codex-exec test_apply_patch_tool`
    - `just fix -p codex-core`
    - `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
    codex-exec`
  • permissions: migrate approval and sandbox consumers to profiles (#19393)
    ## Why
    
    Runtime decisions should not infer permissions from the lossy legacy
    sandbox projection once `PermissionProfile` is available. In particular,
    `Disabled` and `External` need to remain distinct, and managed profiles
    with split filesystem or deny-read rules should not be collapsed before
    approval, network, safety, or analytics code makes decisions.
    
    ## What Changed
    
    - Changes managed network proxy setup and network approval logic to use
    `PermissionProfile` when deciding whether a managed sandbox is active.
    - Migrates patch safety, Guardian/user-shell approval paths, Landlock
    helper setup, analytics sandbox classification, and selected
    turn/session code to profile-backed permissions.
    - Validates command-level profile overrides against the constrained
    `PermissionProfile` rather than a strict `SandboxPolicy` round trip.
    - Preserves configured deny-read restrictions when command profiles are
    narrowed.
    - Adds coverage for profile-backed trust, network proxy/approval
    behavior, patch safety, analytics classification, and command-profile
    narrowing.
    
    ## Verification
    
    - `cargo test -p codex-core direct_write_roots`
    - `cargo test -p codex-core runtime_roots_to_legacy_projection`
    - `cargo test -p codex-app-server
    requested_permissions_trust_project_uses_permission_profile_intent`
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19393).
    * #19395
    * #19394
    * __->__ #19393
  • permissions: make runtime config profile-backed (#19606)
    ## Why
    
    This supersedes #19391. During stack repair, GitHub marked #19391 as
    merged into a temporary stack branch rather than into `main`, so the
    runtime-config change needed a fresh PR.
    
    `PermissionProfile` is now the canonical permissions shape after #19231
    because it can distinguish `Managed`, `Disabled`, and `External`
    enforcement while also carrying filesystem rules that legacy
    `SandboxPolicy` cannot represent cleanly. Core config and session state
    still needed to accept profile-backed permissions without forcing every
    profile through the strict legacy bridge, which rejected valid runtime
    profiles such as direct write roots.
    
    The unrelated CI/test hardening that previously rode along with this PR
    has been split into #19683 so this PR stays focused on the permissions
    model migration.
    
    ## What Changed
    
    - Adds `Permissions.permission_profile` and
    `SessionConfiguration.permission_profile` as constrained runtime state,
    while keeping `sandbox_policy` as a legacy compatibility projection.
    - Introduces profile setters that keep `PermissionProfile`, split
    filesystem/network policies, and legacy `SandboxPolicy` projections
    synchronized.
    - Uses a compatibility projection for requirement checks and legacy
    consumers instead of rejecting profiles that cannot round-trip through
    `SandboxPolicy` exactly.
    - Updates config loading, config overrides, session updates, turn
    context plumbing, prompt permission text, sandbox tags, and exec request
    construction to carry profile-backed runtime permissions.
    - Preserves configured deny-read entries and `glob_scan_max_depth` when
    command/session profiles are narrowed.
    - Adds `PermissionProfile::read_only()` and
    `PermissionProfile::workspace_write()` presets that match legacy
    defaults.
    
    ## Verification
    
    - `cargo test -p codex-core direct_write_roots`
    - `cargo test -p codex-core runtime_roots_to_legacy_projection`
    - `cargo test -p codex-app-server
    requested_permissions_trust_project_uses_permission_profile_intent`
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19606).
    * #19395
    * #19394
    * #19393
    * #19392
    * __->__ #19606
  • permissions: remove legacy read-only access modes (#19449)
    ## Why
    
    `ReadOnlyAccess` was a transitional legacy shape on `SandboxPolicy`:
    `FullAccess` meant the historical read-only/workspace-write modes could
    read the full filesystem, while `Restricted` tried to carry partial
    readable roots. The partial-read model now belongs in
    `FileSystemSandboxPolicy` and `PermissionProfile`, so keeping it on
    `SandboxPolicy` makes every legacy projection reintroduce lossy
    read-root bookkeeping and creates unnecessary noise in the rest of the
    permissions migration.
    
    This PR makes the legacy policy model narrower and explicit:
    `SandboxPolicy::ReadOnly` and `SandboxPolicy::WorkspaceWrite` represent
    the old full-read sandbox modes only. Split readable roots, deny-read
    globs, and platform-default/minimal read behavior stay in the runtime
    permissions model.
    
    ## What changed
    
    - Removes `ReadOnlyAccess` from
    `codex_protocol::protocol::SandboxPolicy`, including the generated
    `access` and `readOnlyAccess` API fields.
    - Updates legacy policy/profile conversions so restricted filesystem
    reads are represented only by `FileSystemSandboxPolicy` /
    `PermissionProfile` entries.
    - Keeps app-server v2 compatible with legacy `fullAccess` read-access
    payloads by accepting and ignoring that no-op shape, while rejecting
    legacy `restricted` read-access payloads instead of silently widening
    them to full-read legacy policies.
    - Carries Windows sandbox platform-default read behavior with an
    explicit override flag instead of depending on
    `ReadOnlyAccess::Restricted`.
    - Refreshes generated app-server schema/types and updates tests/docs for
    the simplified legacy policy shape.
    
    ## Verification
    
    - `cargo check -p codex-app-server-protocol --tests`
    - `cargo check -p codex-windows-sandbox --tests`
    - `cargo test -p codex-app-server-protocol sandbox_policy_`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19449).
    * #19395
    * #19394
    * #19393
    * #19392
    * #19391
    * __->__ #19449
  • permissions: make legacy profile conversion cwd-free (#19414)
    ## Why
    
    The profile conversion path still required a `cwd` even when it was only
    translating a legacy `SandboxPolicy` into a `PermissionProfile`. That
    made profile producers invent an ambient `cwd`, which is exactly the
    anchoring we are trying to remove from permission-profile data. A legacy
    workspace-write policy can be represented symbolically instead: `:cwd =
    write` plus read-only `:project_roots` metadata subpaths.
    
    This PR creates that cwd-free base so the rest of the stack can stop
    threading cwd through profile construction. Callers that actually need a
    concrete runtime filesystem policy for a specific cwd still have an
    explicitly named cwd-bound conversion.
    
    ## What Changed
    
    - `PermissionProfile::from_legacy_sandbox_policy` now takes only
    `&SandboxPolicy`.
    - `FileSystemSandboxPolicy::from_legacy_sandbox_policy` is now the
    symbolic, cwd-free projection for profiles.
    - The old concrete projection is retained as
    `FileSystemSandboxPolicy::from_legacy_sandbox_policy_for_cwd` for
    runtime/boundary code that must materialize legacy cwd behavior.
    - Workspace-write profiles preserve `CurrentWorkingDirectory` and
    `ProjectRoots` special entries instead of materializing cwd into
    absolute paths.
    
    ## Verification
    
    - `cargo check -p codex-protocol -p codex-core -p
    codex-app-server-protocol -p codex-app-server -p codex-exec -p
    codex-exec-server -p codex-tui -p codex-sandboxing -p
    codex-linux-sandbox -p codex-analytics --tests`
    - `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol
    -p codex-app-server -p codex-exec -p codex-exec-server -p codex-tui -p
    codex-sandboxing -p codex-linux-sandbox -p codex-analytics`
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19414).
    * #19395
    * #19394
    * #19393
    * #19392
    * #19391
    * __->__ #19414
  • sandbox: remove dead seatbelt helper and update tests (#17859)
    ## Why
    
    `spawn_command_under_seatbelt()` in `codex-rs/core/src/seatbelt.rs` had
    fallen out of production use and was only referenced by test-only
    wrappers. That left us with sandbox tests that could stay green even if
    the actual seatbelt exec path regressed, because production shell
    execution now flows through `SandboxManager::transform()` and
    `ExecRequest::from_sandbox_exec_request()` instead of that helper.
    
    Removing the dead helper also exposed one downstream `codex-exec`
    integration test that still imported it, which broke `just clippy`.
    
    ## What Changed
    
    - Removed `codex-rs/core/src/seatbelt.rs` and stopped exporting
    `codex_core::seatbelt`.
    - Removed the redundant `codex-rs/core/tests/suite/seatbelt.rs` coverage
    that only exercised the dead helper.
    - Kept the `openpty` regression check, but moved it into
    `codex-rs/core/tests/suite/exec.rs` so it now runs through
    `process_exec_tool_call()`.
    - Fixed the seatbelt denial test in `codex-rs/core/tests/suite/exec.rs`
    to use `/usr/bin/touch`, so it actually exercises the sandbox instead of
    a nonexistent path.
    - Updated `codex-rs/exec/tests/suite/sandbox.rs` on macOS to build the
    sandboxed command through `build_exec_request()` and spawn the
    transformed command, instead of importing the removed helper.
    - Left the lower-level seatbelt policy coverage in
    `codex-rs/sandboxing/src/seatbelt_tests.rs`, where the policy generator
    is still covered directly.
    
    ## Verification
    
    - `cargo test -p codex-core suite::exec::`
    - `cargo test -p codex-exec`
    - `cargo clippy -p codex-exec --tests -- -D warnings`
  • Spread AbsolutePathBuf (#17792)
    Mechanical change to promote absolute paths through code.
  • [codex] reduce module visibility (#16978)
    ## Summary
    - reduce public module visibility across Rust crates, preferring private
    or crate-private modules with explicit crate-root public exports
    - update external call sites and tests to use the intended public crate
    APIs instead of reaching through module trees
    - add the module visibility guideline to AGENTS.md
    
    ## Validation
    - `cargo check --workspace --all-targets --message-format=short` passed
    before the final fix/format pass
    - `just fix` completed successfully
    - `just fmt` completed successfully
    - `git diff --check` passed
  • Remove OPENAI_BASE_URL config fallback (#16720)
    The `OPENAI_BASE_URL` environment variable has been a significant
    support issue, so we decided to deprecate it in favor of an
    `openai_base_url` config key. We've had the deprecation warning in place
    for about a month, so users have had time to migrate to the new
    mechanism. This PR removes support for `OPENAI_BASE_URL` entirely.
  • core: remove cross-crate re-exports from lib.rs (#16512)
    ## Why
    
    `codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
    which made downstream crates depend on `codex-core` as a proxy module
    instead of the actual owner crate.
    
    Removing those forwards makes crate boundaries explicit and lets leaf
    crates drop unnecessary `codex-core` dependencies. In this PR, this
    reduces the dependency on `codex-core` to `codex-login` in the following
    files:
    
    ```
    codex-rs/backend-client/Cargo.toml
    codex-rs/mcp-server/tests/common/Cargo.toml
    ```
    
    ## What
    
    - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
    `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
    `codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
    `codex-tools`, and `codex-utils-path`.
    - Delete the `default_client` forwarding shim in `codex-rs/core`.
    - Update in-crate and downstream callsites to import directly from the
    owning `codex-*` crate.
    - Add direct Cargo dependencies where callsites now target the owner
    crate, and remove `codex-core` from `codex-rs/backend-client`.
  • fix: fix comment linter lint violations in Linux-only code (#16118)
    https://github.com/openai/codex/pull/16071 took care of this for
    Windows, so this takes care of things for Linux.
    
    We don't touch the CI jobs in this PR because
    https://github.com/openai/codex/pull/16106 is going to be the real fix
    there (including a major speedup!).
  • Support Codex CLI stdin piping for codex exec (#15917)
    # Summary
    
    Claude Code supports a useful prompt-plus-stdin workflow:
    
    ```bash
    echo "complex input..." | claude -p "summarize concisely"
    ```
    
    Codex previously did not support the equivalent `codex exec` form. While
    `codex exec` could read the prompt from stdin, it could not combine
    piped input with an explicit prompt argument.
    
    This change adds that missing workflow:
    
    ```bash
    echo "complex input..." | codex exec "summarize concisely"
    ```
    
    With this change, when `codex exec` receives both a positional prompt
    and piped stdin, the prompt remains the instruction and stdin is passed
    along as structured `<stdin>...</stdin>` context.
    
    Example:
    
    ```bash
    curl https://jsonplaceholder.typicode.com/comments \
      | ./target/debug/codex exec --skip-git-repo-check "format the top 20 items into a markdown table" \
      > table.md
    ```
    
    This PR also adds regression coverage for:
    - prompt argument + piped stdin
    - legacy stdin-as-prompt behavior
    - `codex exec -` forced-stdin behavior
    - empty-stdin error cases
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
    ## Why
    
    `argument-comment-lint` was green in CI even though the repo still had
    many uncommented literal arguments. The main gap was target coverage:
    the repo wrapper did not force Cargo to inspect test-only call sites, so
    examples like the `latest_session_lookup_params(true, ...)` tests in
    `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.
    
    This change cleans up the existing backlog, makes the default repo lint
    path cover all Cargo targets, and starts rolling that stricter CI
    enforcement out on the platform where it is currently validated.
    
    ## What changed
    
    - mechanically fixed existing `argument-comment-lint` violations across
    the `codex-rs` workspace, including tests, examples, and benches
    - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
    `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
    `--all-targets` unless the caller explicitly narrows the target set
    - fixed both wrappers so forwarded cargo arguments after `--` are
    preserved with a single separator
    - documented the new default behavior in
    `tools/argument-comment-lint/README.md`
    - updated `rust-ci` so the macOS lint lane keeps the plain wrapper
    invocation and therefore enforces `--all-targets`, while Linux and
    Windows temporarily pass `-- --lib --bins`
    
    That temporary CI split keeps the stricter all-targets check where it is
    already cleaned up, while leaving room to finish the remaining Linux-
    and Windows-specific target-gated cleanup before enabling
    `--all-targets` on those runners. The Linux and Windows failures on the
    intermediate revision were caused by the wrapper forwarding bug, not by
    additional lint findings in those lanes.
    
    ## Validation
    
    - `bash -n tools/argument-comment-lint/run.sh`
    - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
    - shell-level wrapper forwarding check for `-- --lib --bins`
    - shell-level wrapper forwarding check for `-- --tests`
    - `just argument-comment-lint`
    - `cargo test` in `tools/argument-comment-lint`
    - `cargo test -p codex-terminal-detection`
    
    ## Follow-up
    
    - Clean up remaining Linux-only target-gated callsites, then switch the
    Linux lint lane back to the plain wrapper invocation.
    - Clean up remaining Windows-only target-gated callsites, then switch
    the Windows lint lane back to the plain wrapper invocation.
  • Protect first-time project .codex creation across Linux and macOS sandboxes (#15067)
    ## Problem
    
    Codex already treated an existing top-level project `./.codex` directory
    as protected, but there was a gap on first creation.
    
    If `./.codex` did not exist yet, a turn could create files under it,
    such as `./.codex/config.toml`, without going through the same approval
    path as later modifications. That meant the initial write could bypass
    the intended protection for project-local Codex state.
    
    ## What this changes
    
    This PR closes that first-creation gap in the Unix enforcement layers:
    
    - `codex-protocol`
    - treat the top-level project `./.codex` path as a protected carveout
    even when it does not exist yet
    - avoid injecting the default carveout when the user already has an
    explicit rule for that exact path
    - macOS Seatbelt
    - deny writes to both the exact protected path and anything beneath it,
    so creating `./.codex` itself is blocked in addition to writes inside it
    - Linux bubblewrap
    - preserve the same protected-path behavior for first-time creation
    under `./.codex`
    - tests
    - add protocol regressions for missing `./.codex` and explicit-rule
    collisions
    - add Unix sandbox coverage for blocking first-time `./.codex` creation
      - tighten Seatbelt policy assertions around excluded subpaths
    
    ## Scope
    
    This change is intentionally scoped to protecting the top-level project
    `.codex` subtree from agent writes.
    
    It does not make `.codex` unreadable, and it does not change the product
    behavior around loading project skills from `.codex` when project config
    is untrusted.
    
    ## Why this shape
    
    The fix is pointed rather than broad:
    - it preserves the current model of “project `.codex` is protected from
    writes”
    - it closes the security-relevant first-write hole
    - it avoids folding a larger permissions-model redesign into this PR
    
    ## Validation
    
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-sandboxing seatbelt`
    - `cargo test -p codex-exec --test all
    sandbox_blocks_first_time_dot_codex_creation -- --nocapture`
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • fix: fix old system bubblewrap compatibility without falling back to vendored bwrap (#15693)
    Fixes #15283.
    
    ## Summary
    Older system bubblewrap builds reject `--argv0`, which makes our Linux
    sandbox fail before the helper can re-exec. This PR keeps using system
    `/usr/bin/bwrap` whenever it exists and only falls back to vendored
    bwrap when the system binary is missing. That matters on stricter
    AppArmor hosts, where the distro bwrap package also provides the policy
    setup needed for user namespaces.
    
    For old system bwrap, we avoid `--argv0` instead of switching binaries:
    - pass the sandbox helper a full-path `argv0`,
    - keep the existing `current_exe() + --argv0` path when the selected
    launcher supports it,
    - otherwise omit `--argv0` and re-exec through the helper's own
    `argv[0]` path, whose basename still dispatches as
    `codex-linux-sandbox`.
    
    Also updates the launcher/warning tests and docs so they match the new
    behavior: present-but-old system bwrap uses the compatibility path, and
    only absent system bwrap falls back to vendored.
    
    ### Validation
    
    1. Install Ubuntu 20.04 in a VM
    2. Compile codex and run without bubblewrap installed - see a warning
    about falling back to the vendored bwrap
    3. Install bwrap and verify version is 0.4.0 without `argv0` support
    4. run codex and use apply_patch tool without errors
    
    <img width="802" height="631" alt="Screenshot 2026-03-25 at 11 48 36 PM"
    src="https://github.com/user-attachments/assets/77248a29-aa38-4d7c-9833-496ec6a458b8"
    />
    <img width="807" height="634" alt="Screenshot 2026-03-25 at 11 47 32 PM"
    src="https://github.com/user-attachments/assets/5af8b850-a466-489b-95a6-455b76b5050f"
    />
    <img width="812" height="635" alt="Screenshot 2026-03-25 at 11 45 45 PM"
    src="https://github.com/user-attachments/assets/438074f0-8435-4274-a667-332efdd5cb57"
    />
    <img width="801" height="623" alt="Screenshot 2026-03-25 at 11 43 56 PM"
    src="https://github.com/user-attachments/assets/0dc8d3f5-e8cf-4218-b4b4-a4f7d9bf02e3"
    />
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • core: bundle settings diff updates into one dev/user envelope (#12417)
    ## Summary
    - bundle contextual prompt injection into at most one developer message
    plus one contextual user message in both:
      - per-turn settings updates
      - initial context insertion
    - preserve `<model_switch>` across compaction by rebuilding it through
    canonical initial-context injection, instead of relying on
    strip/reattach hacks
    - centralize contextual user fragment detection in one shared definition
    table and reuse it for parsing/compaction logic
    - keep `AGENTS.md` in its natural serialized format:
      - `# AGENTS.md instructions for {dirname}`
      - `<INSTRUCTIONS>...</INSTRUCTIONS>`
    - simplify related tests/helpers and accept the expected snapshot/layout
    updates from bundled multi-part messages
    
    ## Why
    The goal is to converge toward a simpler, more intentional prompt shape
    where contextual updates are consistently represented as one developer
    envelope plus one contextual user envelope, while keeping parsing and
    compaction behavior aligned with that representation.
    
    ## Notable details
    - the temporary `SettingsUpdateEnvelope` wrapper was removed; these
    paths now return `Vec<ResponseItem>` directly
    - local/remote compaction no longer rely on model-switch strip/restore
    helpers
    - contextual user detection is now driven by shared fragment definitions
    instead of ad hoc matcher assembly
    - AGENTS/user instructions are still the same logical context; only the
    synthetic `<user_instructions>` wrapper was replaced by the natural
    AGENTS text format
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-app-server
    codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages
    -- --exact`
    - `cargo test -p codex-core
    compact::tests::collect_user_messages_filters_session_prefix_entries
    --lib -- --exact`
    - `cargo test -p codex-core --test all
    'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::client::includes_apps_guidance_as_developer_message_when_enabled'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::client::includes_developer_instructions_message_in_request' --
    --exact`
    - `cargo test -p codex-core --test all
    'suite::client::includes_user_instructions_message_in_request' --
    --exact`
    - `cargo test -p codex-core --test all
    'suite::client::resume_includes_initial_messages_and_sends_prior_items'
    -- --exact`
    - `cargo test -p codex-core --test all
    'suite::review::review_input_isolated_from_parent_history' -- --exact`
    - `cargo test -p codex-exec --test all
    'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' --
    --exact`
    - `cargo test -p core_test_support
    context_snapshot::tests::full_text_mode_preserves_unredacted_text --
    --exact`
    
    ## Notes
    - I also ran several targeted `compact`, `compact_remote`,
    `prompt_caching`, `model_visible_layout`, and `event_mapping` tests
    while iterating on prompt-shape changes.
    - I have not claimed a clean full-workspace `cargo test` from this
    environment because local sandbox/resource conditions have previously
    produced unrelated failures in large workspace runs.
  • fix(exec) Patch resume test race condition (#12648)
    ## Summary
    The test exec_resume_last_respects_cwd_filter_and_all_flag makes one
    session “newest” by resuming it, but rollout updated_at is stored/sorted
    at second precision. On fast CI (especially Windows), the touch could
    land in the same second as initial session creation, making ordering
    nondeterministic.
    
    This change adds a short sleep before the recency-touch step so the
    resumed session is guaranteed to have a later updated_at, preserving the
    intended assertion without changing product behavior.
  • fix: codex-arg0 no longer depends on codex-core (#12434)
    ## Why
    
    `codex-rs/arg0` only needed two things from `codex-core`:
    
    - the `find_codex_home()` wrapper
    - the special argv flag used for the internal `apply_patch`
    self-invocation path
    
    That made `codex-arg0` depend on `codex-core` for a very small surface
    area. This change removes that dependency edge and moves the shared
    `apply_patch` invocation flag to a more natural boundary
    (`codex-apply-patch`) while keeping the contract explicitly documented.
    
    ## What Changed
    
    - Moved the internal `apply_patch` argv[1] flag constant out of
    `codex-core` and into `codex-apply-patch`.
    - Renamed the constant to `CODEX_CORE_APPLY_PATCH_ARG1` and documented
    that it is part of the Codex core process-invocation contract (even
    though it now lives in `codex-apply-patch`).
    - Updated `arg0`, the core apply-patch runtime, and the `codex-exec`
    apply-patch test to import the constant from `codex-apply-patch`.
    - Updated `codex-rs/arg0` to call
    `codex_utils_home_dir::find_codex_home()` directly instead of
    `codex_core::config::find_codex_home()`.
    - Removed the `codex-core` dependency from `codex-rs/arg0` and added the
    needed direct dependency on `codex-utils-home-dir`.
    - Added `codex-apply-patch` as a dev-dependency for `codex-rs/exec`
    tests (the apply-patch test now imports the moved constant directly).
    
    ## Verification
    
    - `cargo test -p codex-apply-patch`
    - `cargo test -p codex-arg0`
    - `cargo test -p codex-core --lib apply_patch`
    - `cargo test -p codex-exec
    test_standalone_exec_cli_can_use_apply_patch`
    - `cargo shear`
  • chore: remove codex-core public protocol/shell re-exports (#12432)
    ## Why
    
    `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
    from `codex-protocol` and `codex-shell-command`. That made it easy for
    workspace crates to import those APIs through `codex-core`, which in
    turn hides dependency edges and makes it harder to reduce compile-time
    coupling over time.
    
    This change removes those public re-exports so call sites must import
    from the source crates directly. Even when a crate still depends on
    `codex-core` today, this makes dependency boundaries explicit and
    unblocks future work to drop `codex-core` dependencies where possible.
    
    ## What Changed
    
    - Removed public re-exports from `codex-rs/core/src/lib.rs` for:
    - `codex_protocol::protocol` and related protocol/model types (including
    `InitialHistory`)
      - `codex_protocol::config_types` (`protocol_config_types`)
    - `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
    parse_command, powershell}`
    - Migrated workspace Rust call sites to import directly from:
      - `codex_protocol::protocol`
      - `codex_protocol::config_types`
      - `codex_protocol::models`
      - `codex_shell_command`
    - Added explicit `Cargo.toml` dependencies (`codex-protocol` /
    `codex-shell-command`) in crates that now import those crates directly.
    - Kept `codex-core` internal modules compiling by using `pub(crate)`
    aliases in `core/src/lib.rs` (internal-only, not part of the public
    API).
    - Updated the two utility crates that can already drop a `codex-core`
    dependency edge entirely:
      - `codex-utils-approval-presets`
      - `codex-utils-cli`
    
    ## Verification
    
    - `cargo test -p codex-utils-approval-presets`
    - `cargo test -p codex-utils-cli`
    - `cargo check --workspace --all-targets`
    - `just clippy`
  • feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
    `SandboxPolicy::ReadOnly` previously implied broad read access and could
    not express a narrower read surface.
    This change introduces an explicit read-access model so we can support
    user-configurable read restrictions in follow-up work, while preserving
    current behavior today.
    
    It also ensures unsupported backends fail closed for restricted-read
    policies instead of silently granting broader access than intended.
    
    ## What
    
    - Added `ReadOnlyAccess` in protocol with:
      - `Restricted { include_platform_defaults, readable_roots }`
      - `FullAccess`
    - Updated `SandboxPolicy` to carry read-access configuration:
      - `ReadOnly { access: ReadOnlyAccess }`
      - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
    - Preserved existing behavior by defaulting current construction paths
    to `ReadOnlyAccess::FullAccess`.
    - Threaded the new fields through sandbox policy consumers and call
    sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
    related tests.
    - Updated Seatbelt policy generation to honor restricted read roots by
    emitting scoped read rules when full read access is not granted.
    - Added fail-closed behavior on Linux and Windows backends when
    restricted read access is requested but not yet implemented there
    (`UnsupportedOperation`).
    - Regenerated app-server protocol schema and TypeScript artifacts,
    including `ReadOnlyAccess`.
    
    ## Compatibility / rollout
    
    - Runtime behavior remains unchanged by default (`FullAccess`).
    - API/schema changes are in place so future config wiring can enable
    restricted read access without another policy-shape migration.
  • feat(sandbox): enforce proxy-aware network routing in sandbox (#11113)
    ## Summary
    - expand proxy env injection to cover common tool env vars
    (`HTTP_PROXY`/`HTTPS_PROXY`/`ALL_PROXY`/`NO_PROXY` families +
    tool-specific variants)
    - harden macOS Seatbelt network policy generation to route through
    inferred loopback proxy endpoints and fail closed when proxy env is
    malformed
    - thread proxy-aware Linux sandbox flags and add minimal bwrap netns
    isolation hook for restricted non-proxy runs
    - add/refresh tests for proxy env wiring, Seatbelt policy generation,
    and Linux sandbox argument wiring
  • Handle required MCP startup failures across components (#10902)
    Summary
    - add a `required` flag for MCP servers everywhere config/CLI data is
    touched so mandatory helpers can be round-tripped
    - have `codex exec` and `codex app-server` thread start/resume fail fast
    when required MCPs fail to initialize
  • feat(linux-sandbox): add bwrap support (#9938)
    ## Summary
    This PR introduces a gated Bubblewrap (bwrap) Linux sandbox path. The
    curent Linux sandbox path relies on in-process restrictions (including
    Landlock). Bubblewrap gives us a more uniform filesystem isolation
    model, especially explicit writable roots with the option to make some
    directories read-only and granular network controls.
    
    This is behind a feature flag so we can validate behavior safely before
    making it the default.
    
    - Added temporary rollout flag:
      - `features.use_linux_sandbox_bwrap`
    - Preserved existing default path when the flag is off.
    - In Bubblewrap mode:
    - Added internal retry without /proc when /proc mount is not permitted
    by the host/container.
  • [bazel] Improve runfiles handling (#10098)
    we can't use runfiles directory on Windows due to path lengths, so swap
    to manifest strategy. Parsing the manifest is a bit complex and the
    format is changing in Bazel upstream, so pull in the official Rust
    library (via a small hack to make it importable...) and cleanup all the
    associated logic to work cleanly in both bazel and cargo without extra
    confusion
  • Fix flakey resume test (#9789)
    Sessions' `updated_at` times are truncated to seconds, with the UUID
    session ID used to break ties. If the two test sessions are created in
    the same second, AND the session B UUID < session A UUID, the test
    fails.
    
    Fix this by mutating the session mtimes, from which we derive the
    updated_at time, to ensure session B is updated_at later than session A.
  • Made codex exec resume --last consistent with codex resume --last (#9352)
    PR #9245 made `codex resume --last` honor cwd, but I forgot to make the
    same change for `codex exec resume --last`. This PR fixes the
    inconsistency.
    
    This addresses #8700
  • feat: introduce find_resource! macro that works with Cargo or Bazel (#8879)
    To support Bazelification in https://github.com/openai/codex/pull/8875,
    this PR introduces a new `find_resource!` macro that we use in place of
    our existing logic in tests that looks for resources relative to the
    compile-time `CARGO_MANIFEST_DIR` env var.
    
    To make this work, we plan to add the following to all `rust_library()`
    and `rust_test()` Bazel rules in the project:
    
    ```
    rustc_env = {
        "BAZEL_PACKAGE": native.package_name(),
    },
    ```
    
    Our new `find_resource!` macro reads this value via
    `option_env!("BAZEL_PACKAGE")` so that the Bazel package _of the code
    using `find_resource!`_ is injected into the code expanded from the
    macro. (If `find_resource()` were a function, then
    `option_env!("BAZEL_PACKAGE")` would always be
    `codex-rs/utils/cargo-bin`, which is not what we want.)
    
    Note we only consider the `BAZEL_PACKAGE` value when the `RUNFILES_DIR`
    environment variable is set at runtime, indicating that the test is
    being run by Bazel. In this case, we have to concatenate the runtime
    `RUNFILES_DIR` with the compile-time `BAZEL_PACKAGE` value to build the
    path to the resource.
    
    In testing this change, I discovered one funky edge case in
    `codex-rs/exec-server/tests/common/lib.rs` where we have to _normalize_
    (but not canonicalize!) the result from `find_resource!` because the
    path contains a `common/..` component that does not exist on disk when
    the test is run under Bazel, so it must be semantically normalized using
    the [`path-absolutize`](https://crates.io/crates/path-absolutize) crate
    before it is passed to `dotslash fetch`.
    
    Because this new behavior may be non-obvious, this PR also updates
    `AGENTS.md` to make humans/Codex aware that this API is preferred.
  • Allow global exec flags after resume and fix CI codex build/timeout (#8440)
    **Motivation**
    - Bring `codex exec resume` to parity with top‑level flags so global
    options (git check bypass, json, model, sandbox toggles) work after the
    subcommand, including when outside a git repo.
    
    **Description**
    - Exec CLI: mark `--skip-git-repo-check`, `--json`, `--model`,
    `--full-auto`, and `--dangerously-bypass-approvals-and-sandbox` as
    global so they’re accepted after `resume`.
    - Tests: add `exec_resume_accepts_global_flags_after_subcommand` to
    verify those flags work when passed after `resume`.
    
    **Testing**
    - `just fmt`
    - `cargo test -p codex-exec` (pass; ran with elevated perms to allow
    network/port binds)
    - Manual: exercised `codex exec resume` with global flags after the
    subcommand to confirm behavior.
  • feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496)
    This PR introduces a `codex-utils-cargo-bin` utility crate that
    wraps/replaces our use of `assert_cmd::Command` and
    `escargot::CargoBuild`.
    
    As you can infer from the introduction of `buck_project_root()` in this
    PR, I am attempting to make it possible to build Codex under
    [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
    achieve faster incremental local builds (largely due to Buck2's
    [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
    build strategy, as well as benefits from its local build daemon) as well
    as faster CI builds if we invest in remote execution and caching.
    
    See
    https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
    for more details about the performance advantages of Buck2.
    
    Buck2 enforces stronger requirements in terms of build and test
    isolation. It discourages assumptions about absolute paths (which is key
    to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
    variables that Cargo provides are absolute paths (which
    `assert_cmd::Command` reads), this is a problem for Buck2, which is why
    we need this `codex-utils-cargo-bin` utility.
    
    My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
    passed to a `rust_test()` build rule as relative paths.
    `codex-utils-cargo-bin` will resolve these values to absolute paths,
    when necessary.
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
    * #8498
    * __->__ #8496
  • feat: change ConfigLayerName into a disjoint union rather than a simple enum (#8095)
    This attempts to tighten up the types related to "config layers."
    Currently, `ConfigLayerEntry` is defined as follows:
    
    
    https://github.com/openai/codex/blob/bef36f4ae765f471d7cd69372fcf1b92c8f0367a/codex-rs/core/src/config_loader/state.rs#L19-L25
    
    but the `source` field is a bit of a lie, as:
    
    - for `ConfigLayerName::Mdm`, it is
    `"com.openai.codex/config_toml_base64"`
    - for `ConfigLayerName::SessionFlags`, it is `"--config"`
    - for `ConfigLayerName::User`, it is `"config.toml"` (just the file
    name, not the path to the `config.toml` on disk that was read)
    - for `ConfigLayerName::System`, it seems like it is usually
    `/etc/codex/managed_config.toml` in practice, though on Windows, it is
    `%CODEX_HOME%/managed_config.toml`:
    
    
    https://github.com/openai/codex/blob/bef36f4ae765f471d7cd69372fcf1b92c8f0367a/codex-rs/core/src/config_loader/layer_io.rs#L84-L101
    
    All that is to say, in three out of the four `ConfigLayerName`, `source`
    is a `PathBuf` that is not an absolute path (or even a true path).
    
    This PR tries to uplevel things by eliminating `source` from
    `ConfigLayerEntry` and turning `ConfigLayerName` into a disjoint union
    named `ConfigLayerSource` that has the appropriate metadata for each
    variant, favoring the use of `AbsolutePathBuf` where appropriate:
    
    ```rust
    pub enum ConfigLayerSource {
        /// Managed preferences layer delivered by MDM (macOS only).
        #[serde(rename_all = "camelCase")]
        #[ts(rename_all = "camelCase")]
        Mdm { domain: String, key: String },
        /// Managed config layer from a file (usually `managed_config.toml`).
        #[serde(rename_all = "camelCase")]
        #[ts(rename_all = "camelCase")]
        System { file: AbsolutePathBuf },
        /// Session-layer overrides supplied via `-c`/`--config`.
        SessionFlags,
        /// User config layer from a file (usually `config.toml`).
        #[serde(rename_all = "camelCase")]
        #[ts(rename_all = "camelCase")]
        User { file: AbsolutePathBuf },
    }
    ```
  • fix: introduce AbsolutePathBuf as part of sandbox config (#7856)
    Changes the `writable_roots` field of the `WorkspaceWrite` variant of
    the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`.
    This is helpful because now callers can be sure the value is an absolute
    path rather than a relative one. (Though when using an absolute path in
    a Seatbelt config policy, we still have to _canonicalize_ it first.)
    
    Because `writable_roots` can be read from a config file, it is important
    that we are able to resolve relative paths properly using the parent
    folder of the config file as the base path.
  • seatbelt: allow openpty() (#7507)
    This allows `openpty(3)` to run in the default sandbox. Also permit
    reading `kern.argmax`, which is the maximum number of arguments to
    exec().