Commit Graph

578 Commits

  • app-server-protocol: remove PermissionProfile from API (#22924)
    ## Why
    
    The app server API should expose permission profile identity, not the
    lower-level runtime permission model. `PermissionProfile` is the
    compiled sandbox/network representation that the server uses internally;
    exposing it through app-server-protocol forces clients to understand
    details that should remain implementation-level.
    
    The API boundary should prefer `ActivePermissionProfile`: a stable
    profile id, plus future parent-profile metadata, that clients can pass
    back when they want to select the same active permissions. This also
    avoids schema generation collisions between the app-server v2 API type
    space and the core protocol model.
    
    Incidentally, while this PR makes a number of changes to `command/exec`, note
    that we are hoping to deprecate this API in favor of `process/spawn`, so
    we don't need to be too finicky about these changes.
    
    ## What Changed
    
    - Removed `PermissionProfile` from the app-server-protocol API surface,
    including generated schema and TypeScript exports.
    - Changed `CommandExecParams.permissionProfile` to
    `ActivePermissionProfile`.
    - Resolve command exec profile ids through `ConfigManager` for the
    command cwd, matching turn override selection semantics.
    - Updated downstream TUI tests/helpers to use core permission types
    directly instead of app-server-protocol `PermissionProfile` shims.
  • Preserve image detail in app-server inputs (#20693)
    ## Summary
    
    - Add optional image detail to user image inputs across core, app-server
    v2, thread history/event mapping, and the generated app-server
    schemas/types.
    - Preserve requested detail when serializing Responses image inputs:
    omitted detail stays on the existing `high` default, while explicit
    `original` keeps local images on the original-resolution path.
    - Support `high`/`original` consistently for tool image outputs,
    including MCP `codex/imageDetail`, code-mode image helpers, and
    `view_image`.
  • [codex] Use compaction_trigger item for remote compaction v2 (#22809)
    ## Why
    
    Remote compaction v2 was still using `context_compaction` as both the
    request trigger and the compacted output shape. The Responses API now
    has the landed contract for this flow: Codex sends a dedicated `{
    "type": "compaction_trigger" }` input item, and the backend returns the
    standard `compaction` output item with encrypted content.
    
    This aligns the v2 path with that wire contract while preserving the
    existing local compacted-history post-processing behavior.
    
    ## What changed
    
    - Add `ResponseItem::CompactionTrigger` and regenerate the app-server
    protocol schema fixtures.
    - Send `compaction_trigger` from `remote_compaction_v2` instead of a
    payload-less `context_compaction`.
    - Collect exactly one backend `compaction` output item, then reuse the
    existing compacted-history rebuilding path.
    - Treat the trigger item as a transient request marker rather than model
    output or persisted rollout/memory content.
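
    As a rough illustration of the "collect exactly one `compaction` item" rule above, here is a minimal sketch. The `OutputItem` and `CompactionError` types and the `collect_single_compaction` helper are hypothetical stand-ins, not the types used in `codex-core`; the sketch only shows the shape of the check.

    ```rust
    /// Hypothetical, simplified stand-in for items returned by the backend.
    #[derive(Debug, Clone, PartialEq)]
    enum OutputItem {
        Compaction { encrypted_content: String },
        Message { text: String },
    }

    /// Hypothetical error type for this sketch.
    #[derive(Debug, PartialEq)]
    enum CompactionError {
        Missing,
        Multiple(usize),
    }

    /// Returns the encrypted content of the single `compaction` item,
    /// or an error when the response holds zero or several of them.
    fn collect_single_compaction(items: &[OutputItem]) -> Result<String, CompactionError> {
        // Keep only the compaction items; everything else is ignored here.
        let found: Vec<&str> = items
            .iter()
            .filter_map(|item| match item {
                OutputItem::Compaction { encrypted_content } => Some(encrypted_content.as_str()),
                OutputItem::Message { .. } => None,
            })
            .collect();
        match found.as_slice() {
            [] => Err(CompactionError::Missing),
            [only] => Ok((*only).to_string()),
            many => Err(CompactionError::Multiple(many.len())),
        }
    }
    ```

    Under these assumptions, a response with one message and one compaction item yields the encrypted content, while an empty response yields `CompactionError::Missing`.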
    
    ## Verification
    
    - `cargo test -p codex-protocol compaction_trigger`
    - `cargo test -p codex-core remote_compact_v2`
    - `cargo test -p codex-core compact_remote_v2`
    - `cargo test -p codex-core
    responses_websocket_sends_response_processed_after_remote_compaction_v2`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol schema_fixtures`
  • app-server: use permission ids and runtime workspace roots (#22611)
    ## Why
    
    This PR builds on [#22610](https://github.com/openai/codex/pull/22610)
    and is the app-server side of the migration from mutable per-turn
    `SandboxPolicy` replacement toward selecting immutable permission
    profiles by id plus mutable runtime workspace roots.
    
    Once permission profiles can carry their own immutable
    `workspace_roots`, app-server no longer needs to mutate the selected
    `PermissionProfile` just to represent thread-specific filesystem
    context. The mutable part now lives on the thread as explicit
    `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until
    the sandbox is realized for a turn.
    
    ## What Changed
    
    - Replaced the v2 permission-selection wrapper surface with plain
    profile ids for `thread/start`, `thread/resume`, `thread/fork`, and
    `turn/start`.
    - Removed the API surface for profile modifications
    (`PermissionProfileSelectionParams`,
    `PermissionProfileModificationParams`,
    `ActivePermissionProfileModification`).
    - Added experimental `runtimeWorkspaceRoots` fields to the thread
    lifecycle and turn-start APIs.
    - Threaded runtime workspace roots through core session/thread
    snapshots, turn overrides, app-server request handling, and command
    execution permission resolution.
    - Kept session permission state symbolic so later runtime root updates
    and cwd-only implicit-root retargeting rebind `:workspace_roots`
    correctly.
    - Updated the embedded clients just enough to send and restore the new
    thread state.
    - Refreshed the generated schema/TypeScript artifacts and the app-server
    README to match the new contract.
    
    ## Verification
    
    Targeted coverage for this layer lives in:
    
    - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_start.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_resume.rs`
    - `codex-rs/app-server/tests/suite/v2/turn_start.rs`
    - `codex-rs/core/src/session/tests.rs`
    
    The key regression checks exercise that:
    
    - `runtimeWorkspaceRoots` resolve against the effective cwd on thread
    start.
    - Profile-declared workspace roots are excluded from the runtime
    workspace roots returned by app-server.
    - A turn-level runtime workspace-root update persists onto the thread
    and is returned by `thread/resume`.
    - A named permission profile selected on one turn remains symbolic so a
    later runtime-root-only turn update changes the actual sandbox writes.
    - A cwd-only turn update retargets the implicit runtime cwd root while
    preserving additional runtime roots.
    - The protocol fixtures and generated client artifacts stay in sync with
    the string-based permission selection contract.
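
    The first two regression checks can be pictured with a small sketch. The `resolve_runtime_roots` helper below is hypothetical and does not mirror the app-server implementation; it assumes Unix-style paths and does no normalization of `.` or `..` segments.

    ```rust
    use std::path::{Path, PathBuf};

    /// Resolves requested runtime roots against `cwd` and drops any root
    /// that the selected profile already declares (hypothetical helper).
    fn resolve_runtime_roots(
        cwd: &Path,
        requested: &[&str],
        profile_roots: &[PathBuf],
    ) -> Vec<PathBuf> {
        let mut resolved: Vec<PathBuf> = Vec::new();
        for raw in requested {
            // `Path::join` anchors relative inputs at `cwd` and keeps
            // absolute inputs unchanged.
            let root = cwd.join(raw);
            if profile_roots.contains(&root) || resolved.contains(&root) {
                continue;
            }
            resolved.push(root);
        }
        resolved
    }
    ```

    With `cwd = /work/codex`, a request for `docs`, `/srv/data`, and `/work/openai`, and a profile that declares `/work/openai`, this sketch would return `/work/codex/docs` and `/srv/data` only.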
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611).
    * #22612
    * __->__ #22611
  • permissions: support workspace roots in profiles (#22610)
    ## Why
    
    This is the configuration/model half of the alternative permissions
    migration we discussed as a comparison point for
    [#22401](https://github.com/openai/codex/pull/22401) and
    [#22402](https://github.com/openai/codex/pull/22402).
    
    The old `workspace-write` model mixes three concerns that we want to
    keep separate:
    - reusable profile rules that should stay immutable once selected
    - user/runtime workspace roots from `cwd`, `--add-dir`, and legacy
    workspace-write config
    - internal Codex writable roots such as memories, which should not be
    shown as user workspace roots
    
    This PR gives permission profiles first-class `workspace_roots` so users
    can opt multiple repositories into the same `:workspace_roots` rules
    without using broad absolute-path write grants. It also starts
    separating the raw selected profile from the effective runtime profile
    by making `Permissions` expose explicit accessors instead of public
    mutable fields.
    
    A representative `config.toml` looks like this:
    
    ```toml
    default_permissions = "dev"
    
    [permissions.dev.workspace_roots]
    "~/code/openai" = true
    "~/code/developers-website" = true
    
    [permissions.dev.filesystem.":workspace_roots"]
    "." = "write"
    ".codex" = "read"
    ".git" = "read"
    ".vscode" = "read"
    ```
    
    If Codex starts in `~/code/codex` with that profile selected, the
    effective workspace-root set becomes:
    - `~/code/codex` from the runtime `cwd`
    - `~/code/openai` from the profile
    - `~/code/developers-website` from the profile
    
    The `:workspace_roots` rules are materialized across each root, so
    `.git`, `.codex`, and `.vscode` stay scoped the same way everywhere.
    Runtime additions such as `--add-dir` can still layer on later stack
    entries without mutating the selected profile.
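
    To make the materialization step concrete, here is a minimal sketch of how symbolic `:workspace_roots` rules could be expanded over an effective root set. The `Access` enum and the `effective_roots` and `materialize` helpers are hypothetical and much simpler than the real `PermissionProfile` / `FileSystemSandboxPolicy` helpers; deny-read globs are not modeled.

    ```rust
    use std::path::PathBuf;

    /// Hypothetical access level for this sketch.
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum Access {
        Read,
        Write,
    }

    /// Effective root set: the runtime cwd first, then profile-declared
    /// roots, skipping duplicates.
    fn effective_roots(cwd: PathBuf, profile_roots: &[PathBuf]) -> Vec<PathBuf> {
        let mut roots = vec![cwd];
        for root in profile_roots {
            if !roots.contains(root) {
                roots.push(root.clone());
            }
        }
        roots
    }

    /// Expands each symbolic rule (relative path, access) under every root.
    fn materialize(roots: &[PathBuf], rules: &[(&str, Access)]) -> Vec<(PathBuf, Access)> {
        let mut out = Vec::new();
        for root in roots {
            for (rel, access) in rules {
                // "." stands for the root itself; other entries are joined onto it.
                let path = if *rel == "." { root.clone() } else { root.join(rel) };
                out.push((path, *access));
            }
        }
        out
    }
    ```

    In this sketch, two roots and the rules `"." = write`, `".git" = read` produce four concrete entries, so `.git` is read-only under every root rather than only under the cwd.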
    
    ## Stack Shape
    
    This PR intentionally stops before the profile-identity cleanup in
    [#22683](https://github.com/openai/codex/pull/22683) so the base review
    stays focused on config loading, workspace-root materialization, and
    compatibility with legacy `workspace-write`.
    
    The representation in this PR is therefore transitional: `Permissions`
    carries enough state to distinguish the raw constrained profile from the
    effective runtime profile, and there are still call sites that must keep
    the active profile identity and constrained profile value in sync. The
    follow-up PR replaces that with a single resolved profile state
    (`ResolvedPermissionProfile` / `PermissionProfileState`) that keeps the
    profile id, immutable `PermissionProfile`, and profile-declared
    workspace roots together. That follow-up removes APIs such as
    `set_constrained_permission_profile_with_active_profile()` where
    separate arguments could drift out of sync.
    
    Downstream PRs then build on this base to switch app-server turn updates
    to profile ids plus runtime workspace roots and to finish the
    user-visible summary behavior. Reviewers should judge this PR as the
    workspace-roots foundation, not as the final in-memory shape of selected
    permission profiles.
    
    ## Review Guide
    
    Suggested review order:
    
    1. Start with `codex-rs/core/src/config/mod.rs`.
    This is the main shape change in the base slice. `Permissions` now
    stores a private raw `Constrained<PermissionProfile>` plus runtime
    `workspace_roots`. Callers use `permission_profile()` when they need the
    raw constrained value and `effective_permission_profile()` when they
    need a materialized runtime profile. As noted above,
    [#22683](https://github.com/openai/codex/pull/22683) replaces this
    transitional shape with a resolved profile state that keeps identity and
    profile data together.
    
    2. Review `codex-rs/config/src/permissions_toml.rs` and
    `codex-rs/core/src/config/permissions.rs`.
    These add `[permissions.<id>.workspace_roots]`, resolve enabled entries
    relative to the policy cwd, and keep `:workspace_roots` deny-read glob
    patterns symbolic until the actual roots are known.
    
    3. Review `codex-rs/protocol/src/permissions.rs` and
    `codex-rs/protocol/src/models.rs`.
    These add the policy/profile materialization helpers that expand exact
    `:workspace_roots` entries and scoped deny-read globs over every
    workspace root. This is also where `ActivePermissionProfileModification`
    is removed from the core model.
    
    4. Review the legacy bridge in
    `Config::load_from_base_config_with_overrides` and
    `Config::set_legacy_sandbox_policy`.
    This is where legacy `workspace-write` roots become runtime workspace
    roots, while Codex internal writable roots stay internal and do not
    appear as user-facing workspace roots.
    
    5. Then skim downstream call sites.
    The interesting pattern is raw-vs-effective access: state/proxy/bwrap
    paths keep the raw constrained profile, while execution, summaries, and
    user-visible status use the effective profile and workspace-root list.
    
    ## What Changed
    
    - added `[permissions.<id>.workspace_roots]` to the config model and
    schema
    - added runtime `workspace_roots` state to `Config`/`Permissions` and
    `ConfigOverrides`
    - made `Permissions` profile fields private and replaced direct mutation
    with accessors/setters
    - added `PermissionProfile` and `FileSystemSandboxPolicy` helpers for
    materializing `:workspace_roots` exact paths and deny-read globs across
    all roots
    - moved legacy additional writable roots into runtime workspace-root
    state instead of active profile modifications
    - removed `ActivePermissionProfileModification` and its app-server
    protocol/schema export
    - updated sandbox/status summary paths so internal writable roots are
    not reported as user workspace roots
    
    ## Verification Strategy
    
    The targeted tests cover the behavior at the layers where regressions
    are most likely:
    - `codex-rs/core/src/config/config_tests.rs` verifies config loading,
    legacy workspace-root seeding, effective profile materialization, and
    memory-root handling.
    - `codex-rs/core/src/config/permissions_tests.rs` verifies profile
    `workspace_roots` parsing and `:workspace_roots` scoped/glob
    compilation.
    - `codex-rs/protocol/src/permissions.rs` unit tests verify exact and
    glob materialization over multiple workspace roots.
    - `codex-rs/tui/src/status/tests.rs` and
    `codex-rs/utils/sandbox-summary/src/sandbox_summary.rs` verify the
    user-facing summaries show effective workspace roots and hide internal
    writes.
    
    I also ran `cargo check --tests` locally after the latest stack refresh
    to catch cross-crate API breakage from the private-field/accessor
    changes.
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22610).
    * #22612
    * #22611
    * #22683
    * __->__ #22610
  • permissions: canonicalize workspace_roots and danger-full-access names (#22624)
    ## Why
    
    This is a small precursor to the larger permissions-migration work. Both
    the comparison stack in
    [#22401](https://github.com/openai/codex/pull/22401) /
    [#22402](https://github.com/openai/codex/pull/22402) and the alternate
    stack in [#22610](https://github.com/openai/codex/pull/22610) /
    [#22611](https://github.com/openai/codex/pull/22611) /
    [#22612](https://github.com/openai/codex/pull/22612) are easier to
    review if the terminology is already settled underneath them.
    
    Because `:project_roots` and `:danger-no-sandbox` have not shipped as
    stable user-facing surface area, carrying them forward as aliases would
    just add more migration logic to the later stacks. This PR removes that
    ambiguity now so the follow-on work can rely on one spelling for each
    built-in concept.
    
    ## What Changed
    
    - renamed the config-facing special filesystem key from `:project_roots`
    to `:workspace_roots`
    - dropped unpublished `:project_roots` parsing support in
    `core/src/config/permissions.rs`, so new config only recognizes
    `:workspace_roots`
    - renamed the built-in full-access permission profile id from
    `:danger-no-sandbox` to `:danger-full-access`
    - dropped unpublished `:danger-no-sandbox` support entirely, including
    the old active-profile canonicalization path, and added explicit
    rejection coverage for the legacy id
    - introduced shared built-in permission-profile id constants in
    `codex-rs/protocol/src/models.rs`
    - updated `core`, `app-server`, and `tui` call sites that special-case
    built-in profiles to use the shared constants and canonical ids
    - updated tests and the Linux sandbox README to use `:workspace_roots` /
    `:danger-full-access`
    
    ## Verification
    
    I focused verification on the three places this rename can regress:
    config parsing, active-profile identity surfaced back out of `core`, and
    user/server call sites that special-case built-in profiles.
    
    Targeted checks:
    
    -
    `config::tests::default_permissions_can_select_builtin_profile_without_permissions_table`
    -
    `config::tests::default_permissions_read_only_applies_additional_writable_roots_as_modifications`
    -
    `config::tests::default_permissions_can_select_builtin_full_access_profile`
    - `config::tests::legacy_danger_no_sandbox_is_rejected`
    - `workspace_root` filtered `codex-core` tests
    -
    `request_processors::thread_processor::thread_processor_tests::thread_processor_behavior_tests::requested_permissions_trust_project_uses_permission_profile_intent`
    -
    `suite::v2::turn_start::turn_start_rejects_invalid_permission_selection_before_starting_turn`
    - `status::tests::status_snapshot_shows_auto_review_permissions`
    -
    `status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access`
    -
    `app_server_session::tests::embedded_turn_permissions_use_active_profile_selection`
  • feat: add layered --profile-v2 config files (#17141)
    ## Why
    
    `--profile-v2 <name>` gives launchers and runtime entry points a named
    profile config without making each profile duplicate the base user
    config. The base `$CODEX_HOME/config.toml` still loads first, then
    `$CODEX_HOME/<name>.config.toml` layers above it and becomes the active
    writable user config for that session.
    
    That keeps shared defaults, plugin/MCP setup, and managed/user
    constraints in one place while letting a named profile override only the
    pieces that need to differ.
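
    The layering described above can be sketched as follows. This is only an illustration under loud assumptions: config layers are modeled as flat string maps rather than TOML tables, the name check in `is_plain_profile_name` is a guess at what "validated plain names" might mean, and none of these helpers exist in Codex.

    ```rust
    use std::collections::BTreeMap;

    /// Hypothetical flat representation of one config layer.
    type Layer = BTreeMap<String, String>;

    /// Assumed rule: a plain name uses only ASCII letters, digits, `-`, and `_`.
    fn is_plain_profile_name(name: &str) -> bool {
        !name.is_empty()
            && name
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
    }

    /// Builds the `<name>.config.toml` file name, rejecting non-plain names.
    fn profile_file_name(name: &str) -> Option<String> {
        if is_plain_profile_name(name) {
            Some(format!("{}.config.toml", name))
        } else {
            None
        }
    }

    /// Base layer loads first; the profile layer overrides matching keys.
    fn merge_layers(base: &Layer, profile: &Layer) -> Layer {
        let mut merged = base.clone();
        for (key, value) in profile {
            merged.insert(key.clone(), value.clone());
        }
        merged
    }
    ```

    Here a profile that sets only `model` would override the base `model` while every other base key carries through unchanged.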
    
    ## What Changed
    
    - Added the shared `--profile-v2 <name>` runtime option with validated
    plain names, now represented by `ProfileV2Name`.
    - Extended config layer state so the base user config and selected
    profile config are both `User` layers; APIs expose the active user layer
    and merged effective user config.
    - Threaded profile selection through runtime entry points: `codex`,
    `codex exec`, `codex review`, `codex resume`, `codex fork`, and `codex
    debug prompt-input`.
    - Made user-facing config writes go to the selected profile file when
    active, including TUI/settings persistence, app-server config writes,
    and MCP/app tool approval persistence.
    - Made plugin, marketplace, MCP, hooks, and config reload paths read
    from the merged user config so base and profile layers both participate.
    - Updated app-server config layer schemas to mark profile-backed user
    layers.
    
    ## Limits
    
    `--profile-v2` is still rejected for config-management subcommands such
    as feature, MCP, and marketplace edits. Those paths remain tied to the
    base `config.toml` until they have explicit profile-selection semantics.
    
    Some adjacent background writes may still update base or global state
    rather than the selected profile:
    
    - marketplace auto-upgrade metadata
    - automatic MCP dependency installs from skills
    - remote plugin sync or uninstall config edits
    - personality migration marker/default writes
    
    ## Verification
    
    Added targeted coverage for profile name validation, layer
    ordering/merging, selected-profile writes, app-server config writes,
    session hot reload, plugin config merging, hooks/config fixture updates,
    and MCP/app approval persistence.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Remove unused legacy shell tools (#22246)
    ## Why
    
    Recent session history showed no active use of the raw `shell`,
    `local_shell`, or `container.exec` execution surfaces. Keeping those
    handlers/specs wired into core leaves duplicate shell execution paths
    alongside the supported `shell_command` and unified exec tools.
    
    ## What changed
    
    - Removed the raw `shell` handler/spec and its `ShellToolCallParams`
    protocol helper.
    - Removed the legacy `local_shell` and `container.exec` handler/spec
    plumbing while preserving persisted-history compatibility for old
    response items.
    - Normalized model/config `default` and `local` shell selections to
    `shell_command`.
    - Pruned tests that exercised removed raw-shell/local-shell/apply-patch
    variants and kept coverage on `shell_command`, unified exec, and
    freeform `apply_patch`.
    
    ## Verification
    
    - `git diff --check`
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core tools::handlers::shell`
    - `cargo test -p codex-core tools::spec`
    - `cargo test -p codex-core tools::router`
    - `cargo test -p codex-core
    active_call_preserves_triggering_command_context`
    - `cargo test -p codex-core guardian_tests`
    - `cargo test -p codex-core --test all shell_serialization`
    - `cargo test -p codex-core --test all apply_patch_cli`
    - `cargo test -p codex-core --test all shell_command_`
    - `cargo test -p codex-core --test all local_shell`
    - `cargo test -p codex-core --test all otel::`
    - `cargo test -p codex-core --test all hooks::`
    - `just fix -p codex-core`
    - `just fix -p codex-tools`
  • feat(tui): remove Zellij TUI workarounds (#22214)
    ## Why
    
    We added Zellij-specific TUI workarounds because older Zellij behavior
    did not work with Codex's normal terminal model:
    
    - #8555 made `tui.alternate_screen = "auto"` disable alternate screen in
    Zellij so transcript history stayed available.
    - #16578 avoided scroll-region operations in Zellij by emitting raw
    newlines and using a separate composer styling path.
    
    This PR removes both workarounds because the latest Zellij release
    tested locally (`zellij 0.44.1`) works correctly with Codex's standard
    TUI behavior: normal alternate-screen handling, redraw, and history
    insertion.
    
    ## What Changed
    
    - Removed the `InsertHistoryMode::Zellij` path and the Zellij-only
    newline scrollback insertion behavior.
    - Removed cached `is_zellij` state from the TUI and composer.
    - Removed Zellij-specific composer styling, the helper snapshot, and the
    `TerminalInfo::is_zellij()` convenience method that only served this
    workaround.
    - Changed `tui.alternate_screen = "auto"` to use alternate screen for
    Zellij too; `--no-alt-screen` and `tui.alternate_screen = "never"` still
    preserve the inline mode escape hatch.
    - Updated the generated config schema description for
    `tui.alternate_screen`.
    
    ## How to Test
    
    Manual smoke path used with `zellij 0.44.1`:
    
    1. Build and run this branch inside a Zellij `0.44.1` session with
    default config.
    2. Start Codex normally and produce enough assistant/tool output to
    create scrollback.
    3. Confirm the transcript remains readable, the composer renders
    normally, and scrolling through terminal history works.
    4. Resize the Zellij pane while output exists and confirm the TUI
    redraws without duplicated, missing, or stale rows.
    5. Compare with `--no-alt-screen` or `-c tui.alternate_screen=never` if
    you want to verify the inline fallback still works.
    
    Targeted tests:
    - `just write-config-schema`
    - `just fmt`
    - `just fix -p codex-tui`
    - `cargo test -p codex-terminal-detection`
    - `cargo test -p codex-tui alternate_screen_auto_uses_alt_screen`
    
    Attempted but did not complete locally:
    - `cargo test -p codex-tui` built and ran the new test successfully,
    then failed later on unrelated local failures in
    `status_permissions_full_disk_managed_*` and a stack overflow in
    `tests::fork_last_filters_latest_session_by_cwd_unless_show_all`.
    
    ## Documentation
    
    No developers.openai.com Codex documentation update is needed for this
    revert.
  • feat(sandbox): add Windows deny-read parity (#18202)
    ## Why
    
    The split filesystem policy stack already supports exact and glob
    `access = none` read restrictions on macOS and Linux. Windows still
    needed subprocess handling for those deny-read policies without claiming
    enforcement from a backend that cannot provide it.
    
    ## Key finding
    
    The unelevated restricted-token backend cannot safely enforce deny-read
    overlays. Its `WRITE_RESTRICTED` token model is authoritative for write
    checks, not read denials, so this PR intentionally fails that backend
    closed when deny-read overrides are present instead of claiming
    unsupported enforcement.
    
    ## What changed
    
    This PR adds the Windows deny-read enforcement layer and makes the
    backend split explicit:
    
    - Resolves Windows deny-read filesystem policy entries into concrete ACL
    targets.
    - Preserves exact missing paths so they can be materialized and denied
    before an enforceable sandboxed process starts.
    - Snapshot-expands existing glob matches into ACL targets for Windows
    subprocess enforcement.
    - Honors `glob_scan_max_depth` when expanding Windows deny-read globs.
    - Plans both the configured lexical path and the canonical target for
    existing paths so reparse-point aliases are covered.
    - Threads deny-read overrides through the elevated/logon-user Windows
    sandbox backend and unified exec.
    - Applies elevated deny-read ACLs synchronously before command launch
    rather than delegating them to the background read-grant helper.
    - Reconciles persistent deny-read ACEs per sandbox principal so policy
    changes do not leave stale deny-read ACLs behind.
    - Fails closed on the unelevated restricted-token backend when deny-read
    overrides are present, because its `WRITE_RESTRICTED` token model is not
    authoritative for read denials.
    
    ## Landed prerequisites
    
    These prerequisite PRs are already on `main`:
    
    1. #15979 `feat(permissions): add glob deny-read policy support`
    2. #18096 `feat(sandbox): add glob deny-read platform enforcement`
    3. #17740 `feat(config): support managed deny-read requirements`
    
    This PR targets `main` directly and contains only the Windows deny-read
    enforcement layer.
    
    ## Implementation notes
    
    - Exact deny-read paths remain enforceable on the elevated path even
    when they do not exist yet: Windows materializes the missing path before
    applying the deny ACE, so the sandboxed command cannot create and read
    it during the same run.
    - Existing exact deny paths are preserved lexically until the ACL
    planner, which then adds the canonical target as a second ACL target
    when needed. That keeps both the configured alias and the resolved
    object covered.
    - Windows ACLs do not consume Codex glob syntax directly, so glob
    deny-read entries are expanded to the concrete matches that exist before
    process launch.
    - Glob traversal deduplicates directory visits within each pattern walk
    to avoid cycles, without collapsing distinct lexical roots that happen
    to resolve to the same target.
    - Persistent deny-read ACL state is keyed by sandbox principal SID, so
    cleanup only removes ACEs owned by the same backend principal.
    - Deny-read ACEs are fail-closed on the elevated path: setup aborts if
    mandatory deny-read ACL application fails.
    - Unelevated restricted-token sessions reject deny-read overrides early
    instead of running with a silently unenforceable read policy.
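
    Two of the notes above, failing closed on the unelevated backend and covering both the lexical path and its canonical target, can be sketched like this. All names here (`WindowsBackend`, `SandboxSetupError`, `check_deny_read_support`, `plan_acl_targets`) are hypothetical and the sketch touches no Windows APIs; the caller is assumed to supply the canonical path.

    ```rust
    use std::path::{Path, PathBuf};

    /// Hypothetical backend selector for this sketch.
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum WindowsBackend {
        Elevated,
        RestrictedToken,
    }

    /// Hypothetical setup error.
    #[derive(Debug, PartialEq)]
    enum SandboxSetupError {
        DenyReadUnsupported,
    }

    /// Rejects deny-read overrides early on a backend that cannot enforce them.
    fn check_deny_read_support(
        backend: WindowsBackend,
        deny_read_targets: &[String],
    ) -> Result<(), SandboxSetupError> {
        match backend {
            WindowsBackend::RestrictedToken if !deny_read_targets.is_empty() => {
                Err(SandboxSetupError::DenyReadUnsupported)
            }
            _ => Ok(()),
        }
    }

    /// Plans the configured lexical path plus the canonical target when it differs.
    fn plan_acl_targets(lexical: &Path, canonical: Option<&Path>) -> Vec<PathBuf> {
        let mut targets = vec![lexical.to_path_buf()];
        if let Some(resolved) = canonical {
            if resolved != lexical {
                targets.push(resolved.to_path_buf());
            }
        }
        targets
    }
    ```

    The point of returning an error rather than a warning is that the command never starts under a read policy the backend cannot honor.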
    
    ## Verification
    
    - `cargo test -p codex-core
    windows_restricted_token_rejects_unreadable_split_carveouts`
    - `just fmt`
    - `just fix -p codex-core`
    - `just fix -p codex-windows-sandbox`
    - GitHub Actions rerun is in progress on the pushed head.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Reapply "Move skills watcher to app-server" (#21652)
    ## Why
    
    PR #21460 reverted the earlier move of skills change watching from
    `codex-core` into app-server. This reapplies that boundary change so
    app-server owns client-facing `skills/changed` notifications and core no
    longer carries the watcher.
    
    ## What
    
    - Restore the app-server `SkillsWatcher` and register it from thread
    listener setup.
    - Remove the core-owned skills watcher and its core live-reload
    integration surface.
    - Restore app-server coverage for `skills/changed` notifications after a
    watched skill file changes.
    
    ## Validation
    
    - `cargo test -p codex-app-server --test all
    suite::v2::skills_list::skills_changed_notification_is_emitted_after_skill_change
    -- --exact --nocapture`
    - `cargo test -p codex-core --lib --no-run`
  • Enable --deny-warnings for cargo shear (#21616)
    ## Summary
    
    In https://github.com/openai/codex/pull/21584, we disabled doctests for
    crates that lack any doctests. We can enforce that property via `cargo
    shear --deny-warnings`: crates that lack doctests will be flagged if
    doctests are enabled, and crates with doctests will be flagged if
    doctests are disabled.
    
    A few additional notes:
    
    - By adding `--deny-warnings`, `cargo shear` also flagged a number of
    modules that were not reachable at all. Some of those have been removed.
    - This PR removes a usage of `windows_modules!` (since `cargo shear` and
    `rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os =
    "windows")]` attributes. As a consequence, many of these files exhibit churn
    in this PR, since they weren't being formatted by `rustfmt` at all on
    main.
    - Again, to make the code more analyzable, this PR also removes some
    usages of `#[path = "cwd_junction.rs"]` in favor of a more standard
    module structure. The bin sidecar structure is still retained, but,
    e.g., `windows-sandbox-rs/src/bin/command_runner.rs` was moved to
    `windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Delete function-style apply_patch (#21651)
    ## Why
    
    `apply_patch` is now a freeform/custom tool. Keeping the old
    JSON/function-style registration and parsing path left another way for
    models and tests to invoke `apply_patch`, which made the tool surface
    harder to reason about.
    
    ## What changed
    
    - Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
    spec, and handler support for function payloads.
    - Kept `apply_patch_tool_type = freeform` as the supported model
    metadata path, including Bedrock catalog metadata.
    - Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
    calls.
    
    ## Verification
    
    - `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
    - `cargo test -p codex-core tools::handlers::apply_patch --lib`
    - `cargo test -p codex-core --test all
    apply_patch_tool_executes_and_emits_patch_events`
    - `cargo test -p codex-core --test all
    apply_patch_reports_parse_diagnostics`
    - `cargo test -p codex-exec test_apply_patch_tool`
    - `just fix -p codex-core`
    - `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
    codex-exec`
  • Remove ToolName display helper (#21465)
    ## Why
    
    `ToolName::display()` made it too easy to flatten tool identity and
    accidentally compare rendered strings. Tool identity should stay
    structural until a legacy string boundary actually requires the
    flattened spelling.
    
    ## What
    
    - Removes `ToolName::display()` and relies on the existing `Display`
    impl for messages and errors.
    - Adds structural ordering for `ToolName` and uses it for
    sorting/deduping deferred tools.
    - Carries `ToolName` through tool/sandbox plumbing, flattening only at
    legacy boundaries such as hook payloads, telemetry tags, and Responses
    tool names.
    - Updates MCP normalization tests to assert `ToolName` structure instead
    of rendered strings.
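    
    The motivation can be illustrated with a minimal sketch. The
    two-field `ToolName` shape and the `__` separator below are
    assumptions made for illustration, not the real definition; the point
    is only that two distinct identities can flatten to the same string,
    so sorting and deduping should use the derived structural ordering.
    
    ```rust
    // Hypothetical shape: the real `ToolName` may carry different fields.
    #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
    pub struct ToolName {
        pub namespace: Option<String>,
        pub name: String,
    }
    
    impl std::fmt::Display for ToolName {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            // Flattened spelling, meant only for messages and legacy boundaries.
            // The `__` separator is an assumption for this sketch.
            match &self.namespace {
                Some(ns) => write!(f, "{ns}__{}", self.name),
                None => write!(f, "{}", self.name),
            }
        }
    }
    
    /// Sort and dedupe by structure, never by the rendered string.
    pub fn sort_dedup(mut tools: Vec<ToolName>) -> Vec<ToolName> {
        tools.sort();
        tools.dedup();
        tools
    }
    ```
    
    With this shape, namespace `a` plus name `b__c` and namespace `a__b`
    plus name `c` render identically but still compare as different
    tools.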
    
    ## Testing
    
    - `cargo test -p codex-mcp test_normalize_tools`
    - `cargo test -p codex-core unavailable_tool`
    - `just fix -p codex-protocol`
    - `just fix -p codex-mcp`
    - `just fix -p codex-core`
  • [codex] Generalize service tier slash commands (#21745)
    ## Why
    
    `/fast` was wired as a one-off slash command even though model metadata
    now exposes service tiers as catalog data. That meant adding another
    tier, such as a slower/cheaper tier, would require more hardcoded TUI
    plumbing instead of letting the model catalog drive the available
    commands.
    
    This change makes service-tier commands data-driven: each advertised
    `service_tiers` entry becomes a `/name` command using the catalog
    description, while the request path sends the tier `id` only when the
    selected model supports it.
    
    ## What Changed
    
    - Removed the hardcoded `/fast` slash-command variant and introduced
    dynamic service-tier command items in the composer and command popup.
    - Added toggle behavior for service-tier commands: invoking `/name`
    selects that tier, and invoking it again clears the selection.
    - Preserved the existing Fast-mode keybinding/status affordances by
    resolving the current model tier whose name is `fast`, while still
    sending the tier request value such as `priority`.
    - Persisted service-tier selections as raw request strings so non-fast
    tiers can round-trip through config.
    - Updated the Bedrock catalog entry to advertise fast support through
    `service_tiers` with `id: "priority"` and `name: "fast"`.
    - Added defensive filtering in core so unsupported selected service
    tiers are omitted from `/responses` requests.
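    
    The toggle and filtering behavior described above can be sketched as
    follows. The `ServiceTier` struct and both function names are
    hypothetical; only the `id`/`name` split and the `priority`/`fast`
    values come from the description.
    
    ```rust
    // Hypothetical type for illustration; field names follow the PR text.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct ServiceTier {
        pub id: String,   // request value, e.g. "priority"
        pub name: String, // slash-command name, e.g. "fast"
    }
    
    /// Invoking `/name` selects the tier; invoking it again clears it.
    pub fn toggle(selected: Option<String>, tier: &ServiceTier) -> Option<String> {
        if selected.as_deref() == Some(tier.id.as_str()) {
            None
        } else {
            Some(tier.id.clone())
        }
    }
    
    /// Return the tier id to send only if the current model advertises it.
    pub fn tier_for_request<'a>(
        selected: Option<&'a str>,
        supported: &[ServiceTier],
    ) -> Option<&'a str> {
        selected.filter(|id| supported.iter().any(|tier| tier.id == *id))
    }
    ```
    
    Storing the selection as the raw `id` string rather than an enum is
    what lets non-fast tiers round-trip through config in this sketch.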
    
    ## Validation
    
    - Added/updated coverage for dynamic service-tier slash command lookup,
    popup descriptions, composer dispatch, TUI fast toggling, and
    unsupported-tier omission in core request construction.
    - Local tests were not run per request.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-analytics] plumb protocol-native review timing (#21434)
    ## Why
    
    We want terminal tool review analytics, but the reducer should not stamp
    review timing from its own wall clock.
    
    This PR plumbs review timing through the real protocol and app-server
    seams so downstream analytics can consume the emitter's timestamps
    directly. Guardian reviews keep their enriched `started_at` /
    `completed_at` analytics fields by deriving those legacy second-based
    values from the same protocol-native millisecond lifecycle timestamps,
    rather than sampling a separate analytics clock.
    
    ## What changed
    
    - add `started_at_ms` to user approval request payloads
    - add `started_at_ms` / `completed_at_ms` to guardian review
    notifications
    - preserve Guardian review `started_at` / `completed_at` enrichment from
    the protocol-native timing source
    - stamp typed `ServerResponse` analytics facts with app-server-observed
    `completed_at_ms`
    - thread the new timing fields through core, protocol, app-server, TUI,
    and analytics fixtures
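    
    As a rough sketch of deriving the legacy second-based fields from
    the millisecond timestamps: the `ReviewTiming` struct and
    `legacy_seconds` helper are hypothetical, and truncating (rather than
    rounding) to whole seconds is an assumption.
    
    ```rust
    /// Hypothetical holder for protocol-native Unix millisecond timestamps.
    #[derive(Debug, Clone, Copy)]
    pub struct ReviewTiming {
        pub started_at_ms: i64,
        pub completed_at_ms: i64,
    }
    
    impl ReviewTiming {
        /// Legacy `started_at` / `completed_at` in seconds, derived from the
        /// same source instead of sampling a second clock.
        /// Assumption: sub-second precision is dropped by flooring.
        pub fn legacy_seconds(&self) -> (i64, i64) {
            (
                self.started_at_ms.div_euclid(1000),
                self.completed_at_ms.div_euclid(1000),
            )
        }
    }
    ```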
    
    ## Verification
    
    - `cargo test -p codex-app-server outgoing_message --manifest-path
    codex-rs/Cargo.toml`
    - `cargo test -p codex-app-server-protocol guardian --manifest-path
    codex-rs/Cargo.toml`
    - `cargo test -p codex-tui guardian --manifest-path codex-rs/Cargo.toml`
    - `cargo test -p codex-analytics analytics_client_tests --manifest-path
    codex-rs/Cargo.toml`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21434).
    * #18748
    * __->__ #21434
    * #18747
    * #17090
    * #17089
    * #20514
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` entails running both standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
    
    For the collection of crates below, this speeds up test execution by
    >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add compact lifecycle hooks (started by vincentkoc - external contrib) (#19905)
    Based on work from Vincent K -
    https://github.com/openai/codex/pull/19060
    
    <img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x"
    src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634"
    />
    
    ## Why
    
    Compaction rewrites the conversation context that future model turns
    receive, but hooks currently have no deterministic lifecycle point
    around that rewrite. This adds compact lifecycle hooks so users can
    audit manual and automatic compaction, surface hook messages in the UI,
    and run post-compaction follow-up without overloading tool or prompt
    hooks.
    
    ## What Changed
    
    - Added `PreCompact` and `PostCompact` hook events across hook config,
    discovery, dispatch, generated schemas, app-server notifications,
    analytics, and TUI hook rendering.
    - Added trigger matching for compact hooks with the documented `manual`
    and `auto` matcher values.
    - Wired `PreCompact` before both local and remote compaction, and
    `PostCompact` after successful local or remote compaction.
    - Kept compact hook command input to lifecycle metadata: session id,
    Codex turn id, transcript path, cwd, hook event name, model, and
    trigger.
    - Made compact stdout handling consistent with other hooks: plain stdout
    is ignored as debug output, while malformed JSON-looking stdout is
    reported as failed hook output.
    - Added integration coverage for compact hook dispatch, trigger
    matching, post-compact execution, and the audited behavior that
    `decision:"block"` does not block compaction.
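    
    Trigger matching for compact hooks might look like the sketch below.
    The enum and function names are hypothetical, and treating an absent
    matcher as "match every trigger" is an assumption; only the `manual`
    and `auto` matcher values come from the description.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum CompactTrigger {
        Manual,
        Auto,
    }
    
    impl CompactTrigger {
        /// Matcher spelling documented for compact hooks.
        pub fn as_str(self) -> &'static str {
            match self {
                CompactTrigger::Manual => "manual",
                CompactTrigger::Auto => "auto",
            }
        }
    }
    
    /// Decide whether a configured hook applies to this compaction.
    /// Assumption: a hook without a matcher runs for every trigger.
    pub fn matches_trigger(matcher: Option<&str>, trigger: CompactTrigger) -> bool {
        match matcher {
            None => true,
            Some(value) => value == trigger.as_str(),
        }
    }
    ```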
    
    ## Out of Scope
    
    - Hook-specific compaction blocking is not implemented;
    `decision:"block"` and exit-code-2 blocking semantics are intentionally
    unsupported for `PreCompact`.
    - Custom compaction instructions are not exposed to compact hooks in
    this PR.
    - Compact summaries, summary character counts, and summary previews are
    not exposed to compact hooks in this PR.
    
    ## Verification
    
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-core
    manual_pre_compact_block_decision_does_not_block_compaction`
    - `cargo test -p codex-app-server hooks_list`
    - `cargo test -p codex-core config_schema_matches_fixture`
    - `cargo test -p codex-tui hooks_browser`
    
    ## Docs
    
    The developer documentation for Codex hooks should be updated alongside
    this feature to document `PreCompact` and `PostCompact`, the
    `manual`/`auto` matcher values, and the compact hook payload fields.
    
    ---------
    
    Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
  • Move skills watcher to app-server (#21287)
    ## Why
    
    Skills update notifications are app-server API behavior, but the watcher
    lived in `codex-core` and surfaced through
    `EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core
    focused on thread execution and lets app-server own both cache
    invalidation and the `skills/changed` notification.
    
    ## What changed
    
    - Added an app-server-owned skills watcher that watches local skill
    roots, clears the shared skills cache, and emits `skills/changed`
    directly.
    - Registers skill watches from the common app-server thread listener
    attach path, including direct starts, resumes, and app-server-observed
    child or forked threads.
    - Stores the `WatchRegistration` on `ThreadState`, so listener
    replacement, thread teardown, idle unload, and app-server shutdown
    deregister by dropping the RAII guard.
    - Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the
    old core live-reload test.
    - Extended the app-server skills change test to verify a cached skills
    list is refreshed after a filesystem change without forcing reload.
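    
    The RAII deregistration pattern can be sketched in miniature. This
    is not the actual watcher: `SkillsWatcher`, its fields, and the
    `watched` helper are hypothetical, and the sketch tracks each root
    once rather than reference-counting shared roots.
    
    ```rust
    use std::collections::HashSet;
    use std::path::PathBuf;
    use std::sync::{Arc, Mutex};
    
    /// Hypothetical watcher that tracks the set of watched skill roots.
    #[derive(Default, Clone)]
    pub struct SkillsWatcher {
        roots: Arc<Mutex<HashSet<PathBuf>>>,
    }
    
    /// Guard that deregisters its root when dropped.
    pub struct WatchRegistration {
        roots: Arc<Mutex<HashSet<PathBuf>>>,
        root: PathBuf,
    }
    
    impl SkillsWatcher {
        pub fn register(&self, root: PathBuf) -> WatchRegistration {
            self.roots.lock().unwrap().insert(root.clone());
            WatchRegistration { roots: Arc::clone(&self.roots), root }
        }
    
        pub fn watched(&self) -> usize {
            self.roots.lock().unwrap().len()
        }
    }
    
    impl Drop for WatchRegistration {
        fn drop(&mut self) {
            // Ignore a poisoned lock here; panicking inside drop is worse.
            if let Ok(mut roots) = self.roots.lock() {
                roots.remove(&self.root);
            }
        }
    }
    ```
    
    Storing the guard on per-thread state means every teardown path
    deregisters the watch simply by dropping that state.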
    
    ## Validation
    
    - `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p
    codex-rollout -p codex-rollout-trace`
    - `cargo test -p codex-app-server
    skills_changed_notification_is_emitted_after_skill_change`
  • Route opted-in MCP elicitations through Guardian (#19431)
    # Motivation
    
    Browser Use origin-access prompts are MCP elicitations, not direct
    tool-call approval prompts, so they were bypassing the Guardian approval
    path. We need a generic opt-in that lets eligible MCP elicitations use
    Guardian when the current turn already routes approvals there.
    
    # Description
    
    Add a generic elicitation reviewer hook in codex-mcp and wire codex-core
    to pass a Guardian reviewer callback when creating the MCP connection
    manager. The reviewer validates explicit mcp_tool_call opt-in metadata,
    builds a Guardian MCP tool-call review request from
    server/tool/connector metadata and tool params, and maps Guardian
    approval, denial, timeout, and cancellation decisions back to MCP
    elicitation responses.
    
    The new option to trigger this in the `_meta` object is:
    ```
    "codex_request_type": "approval_request",
    ```
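    
    A rough sketch of the opt-in check and the decision mapping follows.
    All type and function names are hypothetical, `_meta` is modeled as a
    flat string map for brevity, and the specific mapping (in particular
    timeout to cancel) is an assumption rather than the implemented
    behavior.
    
    ```rust
    use std::collections::HashMap;
    
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum GuardianDecision {
        Approved,
        Denied,
        TimedOut,
        Cancelled,
    }
    
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ElicitationAction {
        Accept,
        Decline,
        Cancel,
    }
    
    /// Only elicitations that explicitly opt in via `_meta` are reviewed.
    pub fn opts_in(meta: &HashMap<String, String>) -> bool {
        meta.get("codex_request_type").map(String::as_str) == Some("approval_request")
    }
    
    /// Assumed mapping from a Guardian decision to an elicitation response.
    pub fn to_elicitation(decision: GuardianDecision) -> ElicitationAction {
        match decision {
            GuardianDecision::Approved => ElicitationAction::Accept,
            GuardianDecision::Denied => ElicitationAction::Decline,
            GuardianDecision::TimedOut | GuardianDecision::Cancelled => ElicitationAction::Cancel,
        }
    }
    ```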
    
    # Testing
    
    - RUST_MIN_STACK=8388608 NEXTEST_STATUS_LEVEL=leak cargo nextest run
    --no-fail-fast --cargo-profile ci-test --test-threads 2
    - cargo clippy --tests -- -D warnings
    - cargo fmt -- --config imports_granularity=Item --check
    - cargo shear
    - pnpm run format
    - python3 .github/scripts/verify_cargo_workspace_manifests.py
    - python3 .github/scripts/verify_tui_core_boundary.py
    - python3 .github/scripts/verify_bazel_clippy_lints.py
    - git diff --check
  • Remove core MCP list tools op (#21281)
    ## Why
    
    The core `Op::ListMcpTools` request path is no longer needed. Keeping it
    around left a dead request/response surface alongside the app-server MCP
    inventory APIs that own current server status listing.
    
    ## What Changed
    
    - Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the
    core handler that built the MCP snapshot response.
    - Removed the now-unused `codex-mcp` snapshot wrapper/export and passive
    event handling arms in rollout and MCP-server consumers.
    - Updated tests that used the old op as a synchronization hook to wait
    on existing startup/skills events, and deleted the plugin test that only
    exercised the removed listing op.
    
    ## Validation
    
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-mcp`
    - `cargo test -p codex-rollout -p codex-rollout-trace -p
    codex-mcp-server`
    - `cargo test -p codex-core --test all
    pending_input::queued_inter_agent_mail`
    - `cargo test -p codex-core --test all
    rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta`
    - `cargo test -p codex-core --test all
    rmcp_client::stdio_image_responses`
    - `just fix -p codex-core -p codex-protocol -p codex-mcp -p
    codex-rollout -p codex-rollout-trace -p codex-mcp-server`
  • Move message history out of core (#21278)
    ## Why
    
    Message history was implemented inside `codex-core` and surfaced through
    core protocol ops and `SessionConfiguredEvent` fields even though the
    current consumer is TUI-local prompt recall. That made core own UI
    history persistence and exposed `history_log_id` / `history_entry_count`
    through surfaces that app-server and other clients do not need.
    
    This change moves message history persistence out of core and keeps the
    recall plumbing local to the TUI.
    
    ## What changed
    
    - Added a new `codex-message-history` crate for appending, looking up,
    trimming, and reading metadata from `history.jsonl`.
    - Removed core protocol history ops/events: `AddToHistory`,
    `GetHistoryEntryRequest`, and `GetHistoryEntryResponse`.
    - Removed `history_log_id` and `history_entry_count` from
    `SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly.
    - Updated the TUI to dispatch local app events for message-history
    append/lookup and keep its persistent-history metadata in TUI session
    state.
    
    ## Validation
    
    - `cargo test -p codex-message-history -p codex-protocol`
    - `cargo test -p codex-exec event_processor_with_json_output`
    - `cargo test -p codex-mcp-server outgoing_message`
    - `cargo test -p codex-tui`
    - `just fix -p codex-message-history -p codex-protocol -p codex-core -p
    codex-tui -p codex-exec -p codex-mcp-server`
  • 2- Use string service tiers in session protocol (#20971)
    ## Summary
    - break service tier session/op/app-server protocol fields from the
    closed enum to string tier ids
    - send the service tier string directly through model requests, prewarm,
    compaction, memories, and TUI/app-server turn starts
    - regenerate app-server protocol JSON/TypeScript schemas, removing the
    standalone ServiceTier TS enum
    
    ## Verification
    - just fmt
    - cargo check -p codex-core -p codex-app-server -p codex-tui
    - just write-app-server-schema
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add session_id (#20437)
    ## Summary
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
    TLDR:
    We update the meaning of session ids and thread ids:
    * thread_id stays as now
    * session_id becomes a shared id between every thread under a /root
    thread (i.e. every sub-agent shares the same session id)
    
    This PR introduces an explicit `SessionId` and threads it through the
    protocol/client boundary so `session_id` and `thread_id` can diverge
    when they need to, while preserving compatibility for older serialized
    `session_configured` events.
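    
    The relationship between the two ids can be sketched as follows. The
    `ThreadIdentity` wrapper and its constructors are hypothetical, and
    seeding a root thread's session id from its thread id is an
    assumption made for the sketch.
    
    ```rust
    #[derive(Debug, Clone, PartialEq, Eq, Hash)]
    pub struct SessionId(pub String);
    
    #[derive(Debug, Clone, PartialEq, Eq, Hash)]
    pub struct ThreadId(pub String);
    
    /// Hypothetical pairing of the two ids carried by a thread.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct ThreadIdentity {
        pub session_id: SessionId,
        pub thread_id: ThreadId,
    }
    
    impl ThreadIdentity {
        /// Root thread. Assumption: the session id starts equal to the thread id.
        pub fn root(thread_id: &str) -> Self {
            Self {
                session_id: SessionId(thread_id.to_string()),
                thread_id: ThreadId(thread_id.to_string()),
            }
        }
    
        /// Sub-agent thread: fresh thread id, inherited session id.
        pub fn spawn_sub_agent(&self, thread_id: &str) -> Self {
            Self {
                session_id: self.session_id.clone(),
                thread_id: ThreadId(thread_id.to_string()),
            }
        }
    }
    ```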
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-analytics] rework thread_source for thread analytics (#20949)
    ## Summary
    - make `thread_source` an explicit optional thread-level field on
    `thread/start`, `thread/fork`, and returned thread payloads
    - persist `thread_source` in rollout/session metadata so resumed live
    threads retain the original value
    - replace the old best-effort `session_source` -> `thread_source`
    mapping with an explicit caller-supplied analytics classification
    
    ## Why
    Before this change, analytics `thread_source` was populated by a
    best-effort mapping from `session_source`. `session_source` describes
    the runtime/client surface, not the actual thread-level origin, so that
    projection was not accurate enough to distinguish cases such as `user`,
    `subagent`, `memory_consolidation`, and future thread origins reliably.
    
    Making `thread_source` explicit keeps one thread-level analytics field
    while letting callers provide the real classification directly instead
    of recovering it indirectly from `session_source`.
    
    ## Impact
    For new analytics events, `thread_source` now reflects the explicit
    thread-level classification supplied by the caller rather than an
    inferred value derived from `session_source`. Existing protocol fields
    remain optional; callers that omit `threadSource` now produce `null`
    instead of a best-effort inferred value.
    
    ## Validation
    - `just write-app-server-schema`
    - `cargo test -p codex-analytics -p codex-core -p
    codex-app-server-protocol --no-run`
    - `cargo test -p codex-app-server-protocol
    generated_ts_optional_nullable_fields_only_in_params`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `cargo test -p codex-core
    resume_stopped_thread_from_rollout_preserves_thread_source`
  • [codex] Remove legacy ListSkills op (#21282)
    ## Why
    
    `skills/list` is already exposed through app-server v2 and covered by
    the app-server test suite. Keeping the separate core `Op::ListSkills`
    path leaves a duplicate legacy protocol surface that no longer needs to
    be maintained.
    
    ## What Changed
    
    - Removed `Op::ListSkills` and `EventMsg::ListSkillsResponse` from the
    core protocol.
    - Deleted the corresponding core session handler and stale core
    integration tests.
    - Removed rollout/MCP ignore branches and protocol v1 docs references
    for the deleted event/op.
    - Left app-server `skills/list` and its existing coverage intact.
    
    ## Validation
    
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-core --test all suite::skills`
    - `cargo check -p codex-mcp-server -p codex-rollout -p
    codex-rollout-trace`
    - `just fix -p codex-core`
  • [codex] Remove unused ListModels op (#21276)
    ## Why
    
    The core protocol still exposed a `ListModels` submission op even though
    no client sends it and the core submission loop treated it as an ignored
    unknown op. Keeping the dead variant made the protocol surface look
    supported while the active model listing API is the app-server
    `model/list` JSON-RPC request.
    
    ## What Changed
    
    - Removed the unused `Op::ListModels` variant from `codex-rs/protocol`.
    - Removed its `Op::kind()` mapping.
    
    The existing app-server `model/list` endpoint is unchanged.
    
    ## Verification
    
    - `cargo test -p codex-protocol`
  • [codex] Move thread naming to app server (#21260)
    ## Why
    
    Thread names are app-server metadata now, backed by the thread store and
    sqlite state database. Keeping a core `SetThreadName` op plus a rollout
    `thread_name_updated` event made rename persistence live in the wrong
    layer and required historical replay support for an event that new
    app-server flows should not write.
    
    ## What changed
    
    - Removed `Op::SetThreadName` and `EventMsg::ThreadNameUpdated` from the
    core protocol and deleted the core handler path that appended rename
    events to rollouts.
    - Updated app-server `thread/name/set` so both loaded and unloaded
    threads write through thread-store metadata and app-server emits
    `thread/name/updated` notifications.
    - Updated local thread-store name metadata updates to write sqlite title
    metadata and the legacy thread-name index without appending rollout
    events.
    - Removed state extraction and rollout handling for the deleted
    thread-name event.
    
    ## Validation
    
    - `cargo test -p codex-app-server thread_name_updated_broadcasts`
    - `cargo test -p codex-app-server
    thread_name_set_is_reflected_in_read_list_and_resume`
    - `cargo test -p codex-thread-store
    update_thread_metadata_sets_name_on_active_rollout_and_indexes_name`
    - `cargo test -p codex-state`
    - `cargo check -p codex-mcp-server -p codex-rollout-trace`
    - `just fix -p codex-app-server -p codex-thread-store -p codex-state -p
    codex-mcp-server -p codex-rollout-trace`
    
    ## Docs
    
    No external documentation update is expected for this internal ownership
    change.
  • hook trust metadata and enforcement (#20321)
    # Why
    
    We want shared hook trust that both the app and the TUI can build on,
    but the metadata is only useful if runtime behavior agrees with it. This
    PR adds a single backend trust model for hooks so unmanaged hooks cannot
    run until the current definition has been reviewed, while managed hooks
    remain runnable and non-configurable.
    
    # What
    
    - persist `trusted_hash` alongside hook state in `config.toml`
    - expose `currentHash` and derived `trustStatus` through `hooks/list`
    - derive trust from normalized hook definitions so equivalent hooks from
    `config.toml` and `hooks.json` share the same trust identity
    - gate unmanaged hooks on trust before they enter the runnable handler
    set
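    
    A minimal sketch of the trust gate, under stated assumptions:
    `HookDef`, the `TrustStatus` variants, and the whitespace-trimming
    normalization are hypothetical, and `DefaultHasher` stands in for
    whatever stable digest the real implementation uses (it is not stable
    across Rust releases, so it would be unsuitable for persisted hashes).
    
    ```rust
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};
    
    /// Hypothetical normalized hook definition.
    pub struct HookDef {
        pub event: String,
        pub command: String,
    }
    
    /// Hash the normalized definition so equivalent hooks share one identity.
    pub fn current_hash(def: &HookDef) -> String {
        let mut hasher = DefaultHasher::new();
        def.event.trim().hash(&mut hasher);
        def.command.trim().hash(&mut hasher);
        format!("{:016x}", hasher.finish())
    }
    
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum TrustStatus {
        Managed,
        Trusted,
        Untrusted,
    }
    
    /// Derive trust from the current hash and the persisted `trusted_hash`.
    pub fn trust_status(managed: bool, current: &str, trusted_hash: Option<&str>) -> TrustStatus {
        if managed {
            TrustStatus::Managed
        } else if trusted_hash == Some(current) {
            TrustStatus::Trusted
        } else {
            TrustStatus::Untrusted
        }
    }
    
    /// Unmanaged hooks enter the runnable set only once trusted.
    pub fn is_runnable(status: TrustStatus) -> bool {
        matches!(status, TrustStatus::Managed | TrustStatus::Trusted)
    }
    ```
    
    Because trust is keyed on the current hash, editing a hook changes
    its hash and drops it back to untrusted until it is reviewed again.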
    
    # Reviewer Notes
    
    - key file to review is `codex-rs/hooks/src/engine/discovery.rs`
    - the only **core** change is schema related
  • 1- Add model service tiers metadata (#20969)
    ## Why
    
    The model list needs to carry display-ready service tier metadata so
    clients can render tier choices with stable IDs, names, and
    descriptions. A raw speed-tier string list is not enough for richer UI
    copy or future tier labels.
    
    ## What changed
    
    - Added `ModelServiceTier` to shared model metadata with string `id`,
    `name`, and `description` fields.
    - Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving
    empty defaults for older cached model payloads.
    - Exposed `serviceTiers` on app-server v2 `Model` responses and threaded
    it through TUI app-server model conversion.
    - Marked legacy `additional_speed_tiers` / `additionalSpeedTiers`
    metadata as deprecated in source and generated schema output.
    - Regenerated app-server protocol JSON schema and TypeScript fixtures,
    including `ModelServiceTier.ts`.
    
    ## Verification
    
    - Ran `just write-app-server-schema`.
    - Did not run local tests per repo instruction; relying on PR CI.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-analytics] add item lifecycle timing (#20514)
    ## Why
    
    Tool families already disagree on what their existing `duration` fields
    mean, so lifecycle latency should live on the shared item envelope
    instead of being inferred from per-tool execution fields. Carrying that
    envelope through app-server notifications gives downstream consumers one
    reusable timing signal without pretending every tool has the same
    execution semantics.
    
    ## What changed
    
    - Adds `started_at_ms` to core `ItemStartedEvent` values and
    `completed_at_ms` to core `ItemCompletedEvent` values.
    - Populates those timestamps in the shared session lifecycle emitters,
    so protocol-native items get timing without each producer tracking its
    own clock state.
    - Exposes `startedAtMs` on app-server `item/started` notifications and
    `completedAtMs` on `item/completed` notifications.
    - Maps the lifecycle timestamps through the app-server boundary while
    leaving legacy-converted notifications nullable when no lifecycle
    timestamp exists.
    - Regenerates the app-server JSON schema and TypeScript fixtures for the
    notification-envelope change and updates downstream fixtures that
    construct those notifications directly.
    - Extends the existing web-search and image-generation integration flows
    to assert the new lifecycle timestamps on the native item events.
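    
    A downstream consumer could compute lifecycle latency from the two
    envelope timestamps roughly as below. The function name is
    hypothetical; the nullable inputs mirror the legacy-converted
    notifications that carry no lifecycle timestamp.
    
    ```rust
    /// Latency between `item/started` and `item/completed`, if both
    /// timestamps are present and ordered.
    pub fn lifecycle_latency_ms(
        started_at_ms: Option<i64>,
        completed_at_ms: Option<i64>,
    ) -> Option<i64> {
        let (start, end) = (started_at_ms?, completed_at_ms?);
        // Guard against clock anomalies instead of reporting a negative value.
        (end >= start).then(|| end - start)
    }
    ```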
    
    ## Verification
    
    - `cargo check -p codex-protocol -p codex-core -p
    codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec
    -p codex-app-server-client`
    - `cargo test -p codex-core --test all web_search_item_is_emitted`
    - `cargo test -p codex-core --test all
    image_generation_call_event_is_emitted`
    - `cargo test -p codex-app-server-protocol`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514).
    * #18748
    * #18747
    * #17090
    * #17089
    * __->__ #20514
  • feat: add remote compaction v2 Responses client path (#20773)
    ## Why
    
    This adds the `remote_compaction_v2` client path so remote compaction
    can run through the normal Responses stream and install a
    `context_compaction` item that triggers a compaction.
    
    The goal is to migrate some of the compaction logic on the client side.
    
    We keep the v2 transport behind a feature flag while letting follow-up
    requests reuse the compacted context instead of falling back to the
    legacy compaction item shape.
    
    ## What changed
    
    - add `ResponseItem::ContextCompaction` and refresh the generated
    app-server / schema / TypeScript fixtures that expose response items on
    the wire
    - add `core/src/compact_remote_v2.rs` to send compaction through the
    standard streamed Responses client, require exactly one
    `context_compaction` output item, and install that item into compacted
    history
    - route manual compact and auto-compaction through the v2 path when
    `remote_compaction_v2` is enabled, while keeping the existing remote
    compaction path as the fallback
    - preserve the new item type across history retention, follow-up request
    construction, telemetry, rollout persistence, and rollout-trace
    normalization
    - add targeted coverage for the feature flag, `context_compaction`
    serialization, rollout-trace normalization, and remote-compaction
    follow-up behavior
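    
    The "exactly one `context_compaction` output item" requirement can
    be sketched as a small validation step. The simplified `ResponseItem`
    enum, its `payload` field, and the error strings are hypothetical
    stand-ins for the real protocol types.
    
    ```rust
    /// Simplified, hypothetical stand-in for the protocol's response items.
    #[derive(Debug, Clone, PartialEq)]
    pub enum ResponseItem {
        Message(String),
        ContextCompaction { payload: String },
    }
    
    /// Return the single compaction item, or an error if there are zero
    /// or several. Other item kinds in the stream are ignored.
    pub fn extract_compaction(output: &[ResponseItem]) -> Result<&ResponseItem, String> {
        let mut found = output
            .iter()
            .filter(|item| matches!(item, ResponseItem::ContextCompaction { .. }));
        match (found.next(), found.next()) {
            (Some(item), None) => Ok(item),
            (None, _) => Err("no context_compaction item in output".to_string()),
            (Some(_), Some(_)) => Err("multiple context_compaction items in output".to_string()),
        }
    }
    ```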
    
    ## Verification
    
    - added protocol tests for `context_compaction`
    serialization/deserialization in `protocol/src/models.rs`
    - added rollout-trace coverage for `context_compaction` normalization in
    `rollout-trace/src/reducer/conversation_tests.rs`
    - added remote compaction integration coverage for v2 follow-up reuse
    and mixed compaction output streams in
    `core/tests/suite/compact_remote.rs`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Emit MCP tool calls as turn items (#20677)
    ## Why
    
    `McpToolCall` was still an app-server item synthesized from deprecated
    legacy begin/end events. Recent item migrations moved this ownership
    into core `TurnItem`s, so MCP tool calls now follow the same canonical
    lifecycle and leave legacy events as compatibility fanout.
    
    Keeping the core item close to the v2 `ThreadItem::McpToolCall` shape
    also avoids spreading MCP result semantics across app-server conversion
    code. Core now owns whether a completed call is `completed` or `failed`,
    and whether the payload is a tool result or an error.
    
    ## What changed
    
    - Added core `TurnItem::McpToolCall` with flattened `server`, `tool`,
    `arguments`, `status`, `result`, and `error` fields.
    - Updated MCP tool call emitters, including MCP resource tools, to emit
    `ItemStarted`/`ItemCompleted` around directly constructed core MCP
    items.
    - Updated app-server v2 conversion to project the core MCP item into
    `ThreadItem::McpToolCall` without deriving status or splitting `Result`
    locally.
    - Ignored live deprecated MCP legacy fanout in app-server v2 to avoid
    duplicate item notifications, while keeping thread history replay on the
    legacy event path.
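    
    To illustrate core owning the status decision, here is a minimal
    sketch. The field list follows the description, but the field types,
    the status enum, and both constructors are assumptions.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum McpToolCallStatus {
        InProgress,
        Completed,
        Failed,
    }
    
    /// Hypothetical flattened item; strings stand in for the real payload types.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct McpToolCallItem {
        pub server: String,
        pub tool: String,
        pub arguments: String,
        pub status: McpToolCallStatus,
        pub result: Option<String>,
        pub error: Option<String>,
    }
    
    impl McpToolCallItem {
        /// Item as emitted with `ItemStarted`.
        pub fn started(server: &str, tool: &str, arguments: &str) -> Self {
            Self {
                server: server.to_string(),
                tool: tool.to_string(),
                arguments: arguments.to_string(),
                status: McpToolCallStatus::InProgress,
                result: None,
                error: None,
            }
        }
    
        /// Item as emitted with `ItemCompleted`: the `Result` is split once,
        /// here, so converters never re-derive the status.
        pub fn completed(mut self, outcome: Result<String, String>) -> Self {
            match outcome {
                Ok(result) => {
                    self.status = McpToolCallStatus::Completed;
                    self.result = Some(result);
                }
                Err(error) => {
                    self.status = McpToolCallStatus::Failed;
                    self.error = Some(error);
                }
            }
            self
        }
    }
    ```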
    
    ## Verification
    
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core --lib mcp_tool_call`
    - `cargo check -p codex-app-server`
    - `cargo test -p codex-app-server
    mcp_tool_call_completion_notification_contains_truncated_large_result`
  • [codex] Emit image view as core item (#20512)
    ## Why
    
    Image-view results should be represented as a core-produced turn item
    instead of being reconstructed by app-server. At the same time, existing
    rollout/history paths still understand the legacy `ViewImageToolCall`
    event, so this keeps that event as compatibility output generated from
    the new item lifecycle.
    
    ## What changed
    
    - Added `TurnItem::ImageView` to `codex-protocol`.
    - Emitted image-view item start/completion directly from the core
    `view_image` handler.
    - Kept `ViewImageToolCall` as a legacy event, now generated from
    completed `TurnItem::ImageView` items.
    - Kept `thread_history.rs` on the legacy `ViewImageToolCall` replay
    path, with `ImageView` item lifecycle events ignored there.
    - Updated app-server protocol conversion, rollout persistence, and
    affected exhaustive event matches for the new item plus legacy fan-out
    shape.
    
    ## Verification
    
    - `cargo test -p codex-protocol -p codex-app-server-protocol -p
    codex-rollout -p codex-rollout-trace -p codex-mcp-server -p
    codex-app-server --lib`
    - `cargo test -p codex-core --test all
    view_image_tool_attaches_local_image`
    - `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol
    -p codex-app-server -p codex-rollout -p codex-rollout-trace -p
    codex-mcp-server`
    - `git diff --check`
  • Move apply-patch file changes into turn items (#20540)
    ## Why
    
    Apply-patch file changes are now part of the core turn item stream, so
    v2 clients can consume the same first-class item lifecycle path used by
    other turn items instead of relying on app-server-specific remapping
    from legacy patch events.
    
    ## What changed
    
    - Added a core `TurnItem::FileChange` carrying apply-patch changes and
    completion metadata.
    - Updated the apply-patch tool emitter to send `ItemStarted` /
    `ItemCompleted` with the new `FileChange` item while preserving legacy
    `PatchApplyBegin` / `PatchApplyEnd` fan-out.
    - Updated app-server v2 conversion to render the new core item directly
    and stopped `event_mapping` from remapping old patch begin/end events
    into item notifications.
    - Kept thread history reconstruction based on the existing old
    apply-patch events for rollout compatibility.
    
    ## Verification
    
    - `cargo test -p codex-protocol -p codex-app-server-protocol`
    - `cargo test -p codex-core --test all
    apply_patch_tool_executes_and_emits_patch_events`
    - `cargo test -p codex-app-server bespoke_event_handling`
  • [codex] Remove unused event messages (#20511)
    ## Why
    
    Several legacy `EventMsg` variants were still emitted or mapped even
    though clients either ignored them or had moved to item/lifecycle
    events. `Op::Undo` had also degraded to an unavailable shim, so this
    removes that dead task path instead of preserving a command that cannot
    do useful work.
    
    `McpStartupComplete`, `WebSearchBegin`, and `ImageGenerationBegin` are
    intentionally kept because useful consumers still depend on them: MCP
    startup completion drives readiness behavior, and the begin events let
    app-server/core consumers surface in-progress web-search and
    image-generation items before the final payload arrives.
    
    ## What Changed
    
    - Removed weak legacy event variants and payloads from `codex-protocol`,
    including legacy agent deltas, background events, and undo lifecycle
    events.
    - Kept/restored `EventMsg::McpStartupComplete`,
    `EventMsg::WebSearchBegin`, and `EventMsg::ImageGenerationBegin` with
    serializer and emission coverage.
    - Updated core, rollout, MCP server, app-server thread history,
    review/delegate filtering, and tests to rely on the useful replacement
    events that remain.
    - Removed `Op::Undo`, `UndoTask`, the undo test module, and stale TUI
    slash-command comments.
    - Stopped agent job/background progress and compaction retry notices
    from emitting `BackgroundEvent` payloads.
    
    ## Verification
    
    - `cargo check -p codex-protocol -p codex-app-server-protocol -p
    codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
    - `cargo test -p codex-protocol -p codex-app-server-protocol -p
    codex-rollout -p codex-rollout-trace -p codex-mcp-server`
    - `cargo test -p codex-core --test all suite::items`
    - `just fix -p codex-protocol -p codex-app-server-protocol -p codex-core
    -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
    - Earlier coverage on this PR also included `codex-mcp`, `codex-tui`,
    core library tests, MCP/plugin/delegate/review/agent job tests, and MCP
    startup TUI tests.
  • Add /hooks browser for lifecycle hooks (#19882)
    ## Why
    
    `hooks/list` and `hooks/config/write` give us read/write access to hooks
    and their state. This hooks up the TUI as a client so users can inspect
    and manage that state directly.
    
    ## What
    
    - add a two-page `/hooks` browser in the TUI: an event overview with
    installed/active counts, followed by a per-event handler page with
    toggle controls and detail rendering
    - thread managed-state metadata through hook discovery and `hooks/list`
    so the UI can label admin-managed hooks and suppress toggles for them
    - persist hook toggles through the existing config-write path and add
    snapshot coverage for the event list, handler list, managed-hook, and
    empty states
    
    ## Stack
    
    1. openai/codex#19705
    2. openai/codex#19778
    3. openai/codex#19840
    4. This PR - openai/codex#19882
    
    ## Reviewer Notes
    
    - Main UI logic is in
    `codex-rs/tui/src/bottom_pane/hooks_browser_view.rs`; most of the diff
    is the new view plus its snapshot coverage
    - Request / write plumbing for opening the browser and persisting
    toggles is in `codex-rs/tui/src/app/background_requests.rs` and
    `codex-rs/tui/src/chatwidget/hooks.rs`
    - Outside the TUI, the only behavioral change in this PR is threading
    `is_managed` through hook discovery and `hooks/list` so managed hooks
    render as non-toggleable
    - The `codex-rs/tui/src/status/snapshots/` churn is unrelated merge
    fallout from the stacked base branch's newer permission-label rendering
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • realtime: rename provider session ids (#20361)
    ## Summary
    
    Codex is repurposing `session` to mean a thread group, so the realtime
    provider session id should no longer use `session_id` / `sessionId` in
    Codex-facing protocol payloads. This PR renames that provider-specific
    field to `realtime_session_id` / `realtimeSessionId` and intentionally
    breaks clients that still send the old field names.
    
    ## What Changed
    
    - Renamed realtime provider session fields in `ConversationStartParams`,
    `RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`.
    - Renamed app-server v2 realtime request and notification fields to
    `realtimeSessionId`.
    - Removed legacy serde aliases for `session_id` / `sessionId`; clients
    must send the new names.
    - Propagated the rename through core realtime startup, app-server
    adapters, codex-api websocket handling, and TUI realtime state.
    - Regenerated app-server protocol schema/TypeScript outputs and updated
    app-server README examples.
    - Kept upstream Realtime API concepts unchanged: provider `session.id`
    parsing and `x-session-id` headers still use the upstream wire names.
    
    ## Testing
    
    - CI is running on the latest pushed commit.
    - Earlier local verification on this PR:
      - `cargo test -p codex-protocol`
      - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core
        realtime_conversation`
      - `cargo test -p codex-app-server-protocol`
      - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server
        realtime_conversation`
      - attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local
        linker bus error while linking the test binary)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add persisted hook enablement state (#19840)
    ## Why
    
    After `hooks/list` exposes the hook inventory, clients need a way to
    persist user hook preferences, make those changes effective in
    already-open sessions, and distinguish user-controllable hooks from
    managed requirements without adding another bespoke app-server write
    API.
    
    ## What
    
    - Extends `hooks/list` entries with effective `enabled` state.
    - Persists user-level hook state under `hooks.state.<hook-id>` so the
    model can grow beyond a single boolean over time.
    - Uses the existing `config/batchWrite` path for hook state updates
    instead of introducing a dedicated hook write RPC.
    - Refreshes live session hook engines after config writes so
    already-open threads observe updated enablement without a restart.
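
    One way to picture how an effective `enabled` value could be derived from layered `hooks.state.<hook-id>` entries is sketched below. The `HookState` struct, the `Layer` alias, the `effective_enabled` helper, the "enabled by default" fallback, and the rule that managed hooks ignore user state are assumptions of this example, not the resolution logic in `hooks/src/config_rules.rs`.

    ```rust
    use std::collections::HashMap;

    // Hypothetical per-hook user state. A struct rather than a bare bool,
    // so more fields can be added later.
    #[derive(Debug, Clone, Default)]
    struct HookState {
        enabled: Option<bool>,
    }

    // Hypothetical config layer: hook id -> state found under `hooks.state`.
    type Layer = HashMap<String, HookState>;

    // `layers` is ordered from lowest to highest precedence.
    // Assumption: managed hooks are always on and cannot be toggled by users.
    // Assumption: a hook with no recorded state is enabled.
    fn effective_enabled(hook_id: &str, is_managed: bool, layers: &[Layer]) -> bool {
        if is_managed {
            return true;
        }
        layers
            .iter()
            .rev() // highest-precedence layer first
            .find_map(|layer| layer.get(hook_id).and_then(|state| state.enabled))
            .unwrap_or(true)
    }
    ```

    Keeping the state as a struct mirrors the stated goal of letting the model grow beyond a single boolean.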
    
    ## Stack
    
    1. openai/codex#19705
    2. openai/codex#19778
    3. This PR - openai/codex#19840
    4. openai/codex#19882
    
    ## Reviewer Notes
    
    The generated schema files account for much of the raw diff. The core
    behavior is in:
    
    - `hooks/src/config_rules.rs`, which resolves per-hook user state from
    the config layer stack.
    - `hooks/src/engine/discovery.rs`, which projects effective enablement
    into `hooks/list` from source-derived managedness.
    - `config/src/hook_config.rs`, which defines the new `hooks.state`
    representation.
    - `core/src/session/mod.rs`, which rebuilds live hook state after user
    config reloads.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • expand the set of core shell env vars for Windows. (#20089)
    https://github.com/openai/codex/issues/13917 and
    https://github.com/openai/codex/issues/18248 correctly identify that
    
    ```
    [shell_environment_policy]
    inherit = "core"
    ```
    is not functional on Windows because it carries an insufficient set of
    env vars.
    This PR expands that set to match the more functional one used by the
    MCP client.
  • test protocol: lock inter-agent commentary phase (#20046)
    ## Summary
    - add a regression test for
    `InterAgentCommunication::to_response_input_item`
    - assert replayed inter-agent messages keep `phase:
    Some(MessagePhase::Commentary)`
    
    ## Test plan
    - `cargo test -p codex-protocol`
    - `just argument-comment-lint`
  • Discover hooks bundled with plugins (#19705)
    ## Why
    
    Plugins can bundle lifecycle hooks, but Codex previously only discovered
    hooks from user, project, and managed config layers. This adds the
    plugin discovery and runtime plumbing needed for plugin-bundled hooks
    while keeping execution behind the `plugin_hooks` feature flag.
    
    ## What
    
    - Discovers plugin hook sources from each plugin's default
    `hooks/hooks.json`.
    - Supports `plugin.json` manifest `hooks` entries as either relative
    paths or inline hook objects.
    - Plumbs discovered plugin hook sources through plugin loading into the
    hook runtime when `plugin_hooks` is enabled.
    - Marks plugin-originated hook runs as `HookSource::Plugin`.
    - Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook
    command environments.
    - Updates generated schemas and hook source metadata for the plugin hook
    source.
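
    The default hook path and the environment injection above can be sketched with the standard library alone. The helper names `default_hooks_path` and `plugin_hook_command` are hypothetical, and setting both variables to the same plugin root directory is an assumption of this example.

    ```rust
    use std::path::{Path, PathBuf};
    use std::process::Command;

    // Default location of a plugin's bundled hooks, relative to the plugin root.
    fn default_hooks_path(plugin_root: &Path) -> PathBuf {
        plugin_root.join("hooks").join("hooks.json")
    }

    // Build a hook command with both root variables set.
    // Assumption: both variables carry the same plugin root directory.
    fn plugin_hook_command(program: &str, plugin_root: &Path) -> Command {
        let mut cmd = Command::new(program);
        cmd.env("PLUGIN_ROOT", plugin_root);
        cmd.env("CLAUDE_PLUGIN_ROOT", plugin_root);
        cmd
    }
    ```

    The command is only constructed here, not spawned; a caller would still decide when and whether to run it, for example depending on the `plugin_hooks` feature flag.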
    
    ## Stack
    
    1. This PR - openai/codex#19705
    2. openai/codex#19778
    3. openai/codex#19840
    4. openai/codex#19882
    
    ## Reviewer Notes
    
    - Core logic is in `codex-rs/core-plugins/src/loader.rs` and
    `codex-rs/hooks/src/engine/discovery.rs`
    - Moved existing tests and added new ones to
    `codex-rs/core-plugins/src/loader_tests.rs`, hence the large diff there
    - Otherwise mostly plumbing and minor schema updates
    
    ### Core Changes
    
    The `codex-rs/core` changes are limited to wiring plugin hook support
    into existing core flows:
    
    - `core/src/session/session.rs` conditionally pulls effective plugin
    hook sources and plugin hook load warnings from `PluginsManager` when
    `plugin_hooks` is enabled, then passes them into `HooksConfig`.
    - `core/src/hook_runtime.rs` adds the `plugin` metric tag for
    `HookSource::Plugin`.
    - `core/config.schema.json` picks up the new `plugin_hooks` feature
    flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the
    added plugin hook fields.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Load cloud requirements for agent identity (#19708)
    ## Why
    
    Agent Identity sessions can represent Business and Enterprise ChatGPT
    workspaces, but cloud requirements were skipped before fetch. That meant
    workspace-managed requirements were not loaded for Agent Identity even
    when the JWT carried the same account identity and plan information that
    normal ChatGPT token auth exposes.
    
    This PR now sits on top of the Agent Identity stack through
    [#19764](https://github.com/openai/codex/pull/19764). Because
    [#19763](https://github.com/openai/codex/pull/19763) moved task
    registration into Agent Identity auth loading, cloud requirements no
    longer needs a separate runtime-initialization step before building the
    backend client.
    
    ## What changed
    
    - Stop skipping `CodexAuth::AgentIdentity` in the cloud requirements
    loader.
    - Share the cloud requirements eligibility check between startup load
    and background cache refresh.
    - Rely on eagerly loaded Agent Identity auth so backend requests can
    attach task-scoped `AgentAssertion` headers.
    - Decode Agent Identity JWT `plan_type` as the auth-layer plan type,
    then convert it through a shared `auth::PlanType` -> `account::PlanType`
    mapping.
    - Add the missing serde alias for the `education` plan string and add
    coverage for raw Agent Identity plan aliases such as `hc` and
    `education`.
    
    ## Testing
    
    - `cargo test -p codex-agent-identity -p codex-login -p
    codex-cloud-requirements -p codex-protocol`
  • [sandbox] Enforce protected workspace metadata paths (#19846)
    ## Summary
    
    Make FileSystemSandboxPolicy the semantic source of truth for project
    root metadata protection. Under writable roots, `.git`, `.codex`, and
    `.agents` stay protected unless user policy grants an explicit write
    rule for that metadata path.
    
    ## Scope
    
    1. Add `protected_metadata_names` to `WritableRoot`.
    2. Teach `FileSystemSandboxPolicy::can_write_path_with_cwd` to reject
    protected metadata writes under writable roots unless explicitly
    allowed.
    3. Default workspace write profiles to protect `.git`, `.codex`, and
    `.agents`.
    4. Add the Linux fallback setup needed before Linux enforcement lands
    later in the stack.
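
    The rule in steps 1 to 3 can be illustrated with a simplified check. This is a sketch under stated assumptions, not `FileSystemSandboxPolicy::can_write_path_with_cwd`: the `WritableRoot` shape is reduced to two fields, the `can_write` helper and its `explicit_write_rules` parameter are hypothetical, paths are assumed to be absolute and already normalized, and only the first path component directly under a writable root is treated as metadata.

    ```rust
    use std::path::{Path, PathBuf};

    // Hypothetical, reduced form of a writable root.
    struct WritableRoot {
        root: PathBuf,
        protected_metadata_names: Vec<String>,
    }

    // Returns true when `path` may be written.
    // Assumption: `explicit_write_rules` lists paths the user policy explicitly allows.
    fn can_write(path: &Path, roots: &[WritableRoot], explicit_write_rules: &[PathBuf]) -> bool {
        // An explicit user rule for the path (or an ancestor) opts back in.
        if explicit_write_rules.iter().any(|rule| path.starts_with(rule)) {
            return true;
        }
        roots.iter().any(|writable| match path.strip_prefix(&writable.root) {
            Ok(relative) => {
                // Name of the entry directly under the writable root, if any.
                let first = relative
                    .components()
                    .next()
                    .and_then(|component| component.as_os_str().to_str());
                // Reject writes that land inside a protected metadata directory.
                !matches!(
                    first,
                    Some(name) if writable
                        .protected_metadata_names
                        .iter()
                        .any(|protected| protected.as_str() == name)
                )
            }
            // The path is not under this writable root at all.
            Err(_) => false,
        })
    }
    ```

    With a root of `/work` protecting `.git`, `.codex`, and `.agents`, a write to `/work/src/main.rs` is allowed, a write to `/work/.git/config` is rejected, and adding `/work/.codex` as an explicit rule re-enables writes under that directory.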
    
    ## Reviewer Focus
    
    1. The policy decision belongs in FileSystemSandboxPolicy, not shell
    command parsing.
    2. Legacy SandboxPolicy remains a compatibility projection, not the
    source of the new rule.
    3. Explicit user write rules can still opt into these metadata paths.
    
    ## Stack
    
    1. Policy primitive: this PR
    2. macOS Seatbelt adapter: #19847
    3. Shell preflight UX: #19848
    4. Runtime profile propagation: #19849
    5. Linux bubblewrap adapter: #19852
    
    ## Validation
    
    1. codex protocol permissions tests
    2. formatting for codex protocol and codex linux sandbox
    3. diff whitespace check
  • feat: split memories part 2 (#19860)
    Keep extracting memories out of core and move the write trigger into the
    app-server.
    This is temporary; it should move to the client level as a follow-up.
    This makes core fully independent of `codex-memories-write`.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • permissions: make SessionConfigured profile-only (#19774)
    ## Why
    
    `SessionConfiguredEvent` is the internal event that tells clients what
    permissions are active for a session. Emitting both `sandbox_policy` and
    `permission_profile` leaves two possible authorities and forces every
    consumer to decide which one to honor. At this point in the migration,
    the profile is expressive enough to represent managed, disabled, and
    external sandbox enforcement, so the internal event can be profile-only.
    
    The wire compatibility concern is older serialized events or rollout
    data that only contain `sandbox_policy`; those still need to
    deserialize.
    
    ## What Changed
    
    - Removes `sandbox_policy` from `SessionConfiguredEvent` and makes
    `permission_profile` required.
    - Adds custom deserialization so old payloads with only `sandbox_policy`
    are upgraded to a cwd-anchored `PermissionProfile`.
    - Updates core event emission and TUI session handling to sync
    permissions from the profile directly.
    - Updates app-server response construction to derive the legacy
    `sandbox` response field from the active thread snapshot instead of from
    `SessionConfiguredEvent`.
    - Updates yolo-mode display logic to treat both
    `PermissionProfile::Disabled` and managed unrestricted filesystem plus
    enabled network as full-access, while still preserving the distinction
    between no sandbox and external sandboxing.
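
    The backward-compatible upgrade path can be sketched without any serialization library. Everything below is illustrative: the `LegacySandboxPolicy` variants, the reduced `PermissionProfile` shape, the `RawSessionConfigured` intermediate struct, the `upgrade_session_configured` helper, and especially the legacy-to-profile mapping are assumptions of this example rather than the real custom deserializer.

    ```rust
    use std::path::{Path, PathBuf};

    // Hypothetical legacy policy as it might appear in old payloads.
    #[derive(Debug, Clone, PartialEq)]
    enum LegacySandboxPolicy {
        FullAccess,
        ReadOnly,
        WorkspaceWrite,
    }

    // Hypothetical, reduced permission profile.
    #[derive(Debug, Clone, PartialEq)]
    enum PermissionProfile {
        Disabled,
        Managed { writable_roots: Vec<PathBuf>, network_enabled: bool },
    }

    // Both fields optional, as they would be right after parsing a raw payload.
    struct RawSessionConfigured {
        permission_profile: Option<PermissionProfile>,
        sandbox_policy: Option<LegacySandboxPolicy>,
    }

    // Prefer the profile; fall back to upgrading a legacy policy anchored at `cwd`.
    fn upgrade_session_configured(
        raw: RawSessionConfigured,
        cwd: &Path,
    ) -> Result<PermissionProfile, String> {
        match (raw.permission_profile, raw.sandbox_policy) {
            (Some(profile), _) => Ok(profile),
            // The mapping below is an assumption made for this sketch.
            (None, Some(LegacySandboxPolicy::FullAccess)) => Ok(PermissionProfile::Disabled),
            (None, Some(LegacySandboxPolicy::ReadOnly)) => Ok(PermissionProfile::Managed {
                writable_roots: Vec::new(),
                network_enabled: false,
            }),
            (None, Some(LegacySandboxPolicy::WorkspaceWrite)) => Ok(PermissionProfile::Managed {
                writable_roots: vec![cwd.to_path_buf()],
                network_enabled: false,
            }),
            (None, None) => Err("missing permission_profile and sandbox_policy".to_string()),
        }
    }
    ```

    The point of the sketch is the precedence: a payload that carries `permission_profile` is taken as-is, and only payloads that lack it go through the legacy upgrade.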
    
    ## Verification
    
    - `cargo test -p codex-protocol session_configured_event --lib`
    - `cargo test -p codex-protocol serialize_event --lib`
    - `cargo test -p codex-exec session_configured --lib`
    - `cargo test -p codex-app-server
    thread_response_permission_profile_preserves_enforcement --lib`
    - `cargo test -p codex-core
    session_configured_reports_permission_profile_for_external_sandbox
    --lib`
    - `cargo test -p codex-tui session_configured --lib`
    - `cargo test -p codex-tui
    yolo_mode_includes_managed_full_access_profiles --lib`

    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19774).
    * #19900
    * #19899
    * #19776
    * #19775
    * __->__ #19774
  • Remove ghost snapshots (#19481)
    ## Summary
    - Remove `ghost_snapshot` / `GhostCommit` from the Responses API surface
    and generated SDK/schema artifacts.
    - Keep legacy config loading compatible, but make undo a no-op that
    reports the feature is unavailable.
    - Clean up core history, compaction, telemetry, rollout, and tests to
    stop carrying ghost snapshot items.
    
    ## Testing
    - Unit tests passed for `codex-protocol`, `codex-core` targeted undo and
    compaction flows, `codex-rollout`, and `codex-app-server-protocol`.
    - Regenerated config and app-server schemas plus Python SDK artifacts
    and verified they match the checked-in outputs.