23 Commits

  • chore(features) rm Feature::ApplyPatchFreeform (#22711)
    ## Summary
    Removes the feature since this is effectively on by default in all cases
    where we should use it, or can be configured via models.json.
    
    ## Testing
    - [x] unit tests pass
  • [codex] Delete function-style apply_patch (#21651)
    ## Why
    
    `apply_patch` is now a freeform/custom tool. Keeping the old
    JSON/function-style registration and parsing path left another way for
    models and tests to invoke `apply_patch`, which made the tool surface
    harder to reason about.
    
    ## What changed
    
    - Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
    spec, and handler support for function payloads.
    - Kept `apply_patch_tool_type = freeform` as the supported model
    metadata path, including Bedrock catalog metadata.
    - Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
    calls.
    
    ## Verification
    
    - `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
    - `cargo test -p codex-core tools::handlers::apply_patch --lib`
    - `cargo test -p codex-core --test all
    apply_patch_tool_executes_and_emits_patch_events`
    - `cargo test -p codex-core --test all
    apply_patch_reports_parse_diagnostics`
    - `cargo test -p codex-exec test_apply_patch_tool`
    - `just fix -p codex-core`
    - `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
    codex-exec`
  • [codex] Remove unused event messages (#20511)
    ## Why
    
    Several legacy `EventMsg` variants were still emitted or mapped even
    though clients either ignored them or had moved to item/lifecycle
    events. `Op::Undo` had also degraded to an unavailable shim, so this
    removes that dead task path instead of preserving a command that cannot
    do useful work.
    
    `McpStartupComplete`, `WebSearchBegin`, and `ImageGenerationBegin` are
    intentionally kept because useful consumers still depend on them: MCP
    startup completion drives readiness behavior, and the begin events let
    app-server/core consumers surface in-progress web-search and
    image-generation items before the final payload arrives.
    
    ## What Changed
    
    - Removed weak legacy event variants and payloads from `codex-protocol`,
    including legacy agent deltas, background events, and undo lifecycle
    events.
    - Kept/restored `EventMsg::McpStartupComplete`,
    `EventMsg::WebSearchBegin`, and `EventMsg::ImageGenerationBegin` with
    serializer and emission coverage.
    - Updated core, rollout, MCP server, app-server thread history,
    review/delegate filtering, and tests to rely on the useful replacement
    events that remain.
    - Removed `Op::Undo`, `UndoTask`, the undo test module, and stale TUI
    slash-command comments.
    - Stopped agent job/background progress and compaction retry notices
    from emitting `BackgroundEvent` payloads.
    
    ## Verification
    
    - `cargo check -p codex-protocol -p codex-app-server-protocol -p
    codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
    - `cargo test -p codex-protocol -p codex-app-server-protocol -p
    codex-rollout -p codex-rollout-trace -p codex-mcp-server`
    - `cargo test -p codex-core --test all suite::items`
    - `just fix -p codex-protocol -p codex-app-server-protocol -p codex-core
    -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
    - Earlier coverage on this PR also included `codex-mcp`, `codex-tui`,
    core library tests, MCP/plugin/delegate/review/agent job tests, and MCP
    startup TUI tests.
  • core tests: configure profiles directly (#20015)
    ## Summary
    - Replace legacy sandbox config setup in delegate and telemetry tests
    with direct `PermissionProfile` configuration.
    - Move no-sandbox and read-only test turns in `tools.rs`,
    `code_mode.rs`, `user_shell_cmd.rs`, and `model_visible_layout.rs` from
    legacy `SandboxPolicy` values to `PermissionProfile` helpers, while
    leaving the deny-glob read-only compatibility case for a later targeted
    cleanup.
    - Use `PermissionProfile::read_only()` where tests need managed
    read-only behavior and `PermissionProfile::Disabled` where they
    intentionally need no sandbox.
    - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 27
    files after #20013 to 22 files.
    
    ## Testing
    - `cargo check -p codex-core --tests`
    - `just fmt`
  • Update models.json (#18586)
    - Replace the active models-manager catalog with the deleted core
    catalog contents.
    - Replace stale hardcoded test model slugs with current bundled model
    slugs.
    - Keep this as a stacked change on top of the cleanup PR.
  • chore: remove codex-core public protocol/shell re-exports (#12432)
    ## Why
    
    `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
    from `codex-protocol` and `codex-shell-command`. That made it easy for
    workspace crates to import those APIs through `codex-core`, which in
    turn hides dependency edges and makes it harder to reduce compile-time
    coupling over time.
    
    This change removes those public re-exports so call sites must import
    from the source crates directly. Even when a crate still depends on
    `codex-core` today, this makes dependency boundaries explicit and
    unblocks future work to drop `codex-core` dependencies where possible.
    
    ## What Changed
    
    - Removed public re-exports from `codex-rs/core/src/lib.rs` for:
    - `codex_protocol::protocol` and related protocol/model types (including
    `InitialHistory`)
      - `codex_protocol::config_types` (`protocol_config_types`)
    - `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
    parse_command, powershell}`
    - Migrated workspace Rust call sites to import directly from:
      - `codex_protocol::protocol`
      - `codex_protocol::config_types`
      - `codex_protocol::models`
      - `codex_shell_command`
    - Added explicit `Cargo.toml` dependencies (`codex-protocol` /
    `codex-shell-command`) in crates that now import those crates directly.
    - Kept `codex-core` internal modules compiling by using `pub(crate)`
    aliases in `core/src/lib.rs` (internal-only, not part of the public
    API).
    - Updated the two utility crates that can already drop a `codex-core`
    dependency edge entirely:
      - `codex-utils-approval-presets`
      - `codex-utils-cli`
    
    ## Verification
    
    - `cargo test -p codex-utils-approval-presets`
    - `cargo test -p codex-utils-cli`
    - `cargo check --workspace --all-targets`
    - `just clippy`
  • feat(core): plumb distinct approval ids for command approvals (#12051)
    zsh fork PR stack:
    - https://github.com/openai/codex/pull/12051 👈 
    - https://github.com/openai/codex/pull/12052
    
    With upcoming support for a fork of zsh that allows us to intercept
    `execve` and run execpolicy checks for each subcommand as part of a
    `CommandExecution`, it will be possible for there to be multiple
    approval requests for a shell command like `/path/to/zsh -lc 'git status
    && rg \"TODO\" src && make test'`.
    
    To support that, this PR introduces a new `approval_id` field across
    core, protocol, and app-server so that we can associate approvals
    properly for subcommands.
  • feat: introduce Permissions (#11633)
    ## Why
    We currently carry multiple permission-related concepts directly on
    `Config` for shell/unified-exec behavior (`approval_policy`,
    `sandbox_policy`, `network`, `shell_environment_policy`,
    `windows_sandbox_mode`).
    
    Consolidating these into one in-memory struct makes permission handling
    easier to reason about and sets up the next step: supporting named
    permission profiles (`[permissions.PROFILE_NAME]`) without changing
    behavior now.
    
    This change is mostly mechanical: it updates existing callsites to go
    through `config.permissions`, but it does not yet refactor those
    callsites to take a single `Permissions` value in places where multiple
    permission fields are still threaded separately.
    
    This PR intentionally **does not** change the on-disk `config.toml`
    format yet and keeps compatibility with legacy config keys.
    
    ## What Changed
    - Introduced `Permissions` in `core/src/config/mod.rs`.
    - Added `Config::permissions` and moved effective runtime permission
    fields under it:
      - `approval_policy`
      - `sandbox_policy`
      - `network`
      - `shell_environment_policy`
      - `windows_sandbox_mode`
    - Updated config loading/building so these effective values are still
    derived from the same existing config inputs and constraints.
    - Updated Windows sandbox helpers/resolution to read/write via
    `permissions`.
    - Threaded the new field through all permission consumers across core
    runtime, app-server, CLI/exec, TUI, and sandbox summary code.
    - Updated affected tests to reference `config.permissions.*`.
    - Renamed the struct/field from
    `EffectivePermissions`/`effective_permissions` to
    `Permissions`/`permissions` and aligned variable naming accordingly.
    
    ## Verification
    - `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server
    -p codex-exec -p codex-utils-sandbox-summary`
    - `cargo build -p codex-core -p codex-tui -p codex-cli -p
    codex-app-server -p codex-exec -p codex-utils-sandbox-summary`
  • feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
    `SandboxPolicy::ReadOnly` previously implied broad read access and could
    not express a narrower read surface.
    This change introduces an explicit read-access model so we can support
    user-configurable read restrictions in follow-up work, while preserving
    current behavior today.
    
    It also ensures unsupported backends fail closed for restricted-read
    policies instead of silently granting broader access than intended.
    
    ## What
    
    - Added `ReadOnlyAccess` in protocol with:
      - `Restricted { include_platform_defaults, readable_roots }`
      - `FullAccess`
    - Updated `SandboxPolicy` to carry read-access configuration:
      - `ReadOnly { access: ReadOnlyAccess }`
      - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
    - Preserved existing behavior by defaulting current construction paths
    to `ReadOnlyAccess::FullAccess`.
    - Threaded the new fields through sandbox policy consumers and call
    sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
    related tests.
    - Updated Seatbelt policy generation to honor restricted read roots by
    emitting scoped read rules when full read access is not granted.
    - Added fail-closed behavior on Linux and Windows backends when
    restricted read access is requested but not yet implemented there
    (`UnsupportedOperation`).
    - Regenerated app-server protocol schema and TypeScript artifacts,
    including `ReadOnlyAccess`.
    
    ## Compatibility / rollout
    
    - Runtime behavior remains unchanged by default (`FullAccess`).
    - API/schema changes are in place so future config wiring can enable
    restricted read access without another policy-shape migration.
  • Fix: update parallel tool call exec approval to approve on request id (#11162)
    ### Summary
    
    In parallel tool call, exec command approvals were not approved at
    request level but at a turn level. i.e. when a single request is
    approved, the system currently treats all requests in turn as approved.
    
    ### Before
    
    https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8
    
    ### After
    
    https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc
  • feat: support allowed_sandbox_modes in requirements.toml (#8298)
    This adds support for `allowed_sandbox_modes` in `requirements.toml` and
    provides legacy support for constraining sandbox modes in
    `managed_config.toml`. This is converted to `Constrained<SandboxPolicy>`
    in `ConfigRequirements` and applied to `Config` such that constraints
    are enforced throughout the harness.
    
    Note that, because `managed_config.toml` is deprecated, we do not add
    support for the new `external-sandbox` variant recently introduced in
    https://github.com/openai/codex/pull/8290. As noted, that variant is not
    supported in `config.toml` today, but can be configured programmatically
    via app server.
  • feat: Constrain values for approval_policy (#7778)
    Constrain `approval_policy` through new `admin_policy` config.
    
    This PR will:
    1. Add a `admin_policy` section to config, with a single field (for now)
    `allowed_approval_policies`. This list constrains the set of
    user-settable `approval_policy`s.
    2. Introduce a new `Constrained<T>` type, which combines a current value
    and a validator function. The validator function ensures disallowed
    values are not set.
    3. Change the type of `approval_policy` on `Config` and
    `SessionConfiguration` from `AskForApproval` to
    `Constrained<AskForApproval>`. The validator function is set by the
    values passed into `allowed_approval_policies`.
    4. `GenericDisplayRow`: add a `disabled_reason: Option<String>`. When
    set, it disables selection of the value and indicates as such in the
    menu. This also makes it unselectable with arrow keys or numbers. This
    is used in the `/approvals` menu.
    
    Follow ups are:
    1. Do the same thing to `sandbox_policy`.
    2. Propagate the allowed set of values through app-server for the
    extension (though already this should prevent app-server from setting
    this values, it's just that we want to disable UI elements that are
    unsettable).
    
    Happy to split this PR up if you prefer, into the logical numbered areas
    above. Especially if there are parts we want to gavel on separately
    (e.g. admin_policy).
    
    Disabled full access:
    <img width="1680" height="380" alt="image"
    src="https://github.com/user-attachments/assets/1fb61c8c-1fcb-4dc4-8355-2293edb52ba0"
    />
    
    Disabled `--yolo` on startup:
    <img width="749" height="76" alt="image"
    src="https://github.com/user-attachments/assets/0a1211a0-6eb1-40d6-a1d7-439c41e94ddb"
    />
    
    CODEX-4087
  • refactoring with_escalated_permissions to use SandboxPermissions instead (#7750)
    helpful in the future if we want more granularity for requesting
    escalated permissions:
    e.g when running in readonly sandbox, model can request to escalate to a
    sandbox that allows writes
  • ignore deltas in codex_delegate (#6208)
    ignore legacy deltas in codex-delegate to avoid this
    [issue](https://github.com/openai/codex/pull/6202).
  • Delegate review to codex instance (#5572)
    In this PR, I am exploring migrating task kind to an invocation of
    Codex. The main reason would be getting rid off multiple
    `ConversationHistory` state and streamlining our context/history
    management.
    
    This approach depends on opening a channel between the sub-codex and
    codex. This channel is responsible for forwarding `interactive`
    (`approvals`) and `non-interactive` events. The `task` is responsible
    for handling those events.
    
    This opens the door for implementing `codex as a tool`, replacing
    `compact` and `review`, and potentially subagents.
    
    One consideration is this code is very similar to `app-server` specially
    in the approval part. If in the future we wanted an interactive
    `sub-codex` we should consider using `codex-mcp`