Commit Graph

3718 Commits

  • fix(exec-policy) No empty command lists (#11397)
    ## Summary
    This should rarely, if ever, happen in practice. But regardless, we
    should never provide an empty list of `commands` to ExecPolicy. This PR
    is almost entirely adding tests around these cases.
    
    ## Testing
    - [x] Adds a bunch of unit tests for this
  • Remove deterministic_process_ids feature to avoid duplicate codex-core builds (#11393)
    ## Why
    
    `codex-core` enabled `deterministic_process_ids` through a self
    dev-dependency. That forced a second feature-resolved build of the same
    crate, which increased compile time and test latency.
    
    ## What Changed
    
    - Removed the `deterministic_process_ids` feature from
    `codex-rs/core/Cargo.toml`.
    - Removed the self dev-dependency on `codex-core` that enabled that
    feature.
    - Removed the Bazel `deterministic_process_ids` crate feature for
    `codex-core`.
    - Added a test-only `AtomicBool` override in unified exec process-id
    allocation.
    - Added a test-support setter for that override and re-exported it from
    `codex-core`.
    - Enabled deterministic process IDs in integration tests via
    `core_test_support` ctor.
    
    ## Behavior
    
    - Production behavior remains random process IDs.
    - Unit tests remain deterministic via `cfg(test)`.
    - Integration tests remain deterministic via explicit test-support
    initialization.
    
    ## Validation
    
    - `just fmt`
    - `cargo test -p codex-core unified_exec::`
    - `cargo test -p codex-core --test all unified_exec -- --test-threads=1`
    - `cargo tree -p codex-core -e features` (verified the removed feature
    path)
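    
    The override described under "What Changed" can be pictured with the sketch below. It is only an illustration under assumptions: the static and function names are hypothetical, and the standard library's `RandomState` stands in for whatever random source production code actually uses.
    
    ```rust
    use std::collections::hash_map::RandomState;
    use std::hash::{BuildHasher, Hasher};
    use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
    
    // Hypothetical names; the real codex-core items may differ.
    static FORCE_DETERMINISTIC_IDS: AtomicBool = AtomicBool::new(false);
    static NEXT_DETERMINISTIC_ID: AtomicU64 = AtomicU64::new(1000);
    
    /// Test-support setter: integration tests would call this once at startup.
    pub fn set_deterministic_process_ids_for_tests(enabled: bool) {
        FORCE_DETERMINISTIC_IDS.store(enabled, Ordering::SeqCst);
    }
    
    /// Allocates a process ID: sequential when `cfg(test)` or the override is
    /// active, random otherwise.
    pub fn allocate_process_id() -> u64 {
        if cfg!(test) || FORCE_DETERMINISTIC_IDS.load(Ordering::SeqCst) {
            NEXT_DETERMINISTIC_ID.fetch_add(1, Ordering::SeqCst)
        } else {
            // Std-only randomness: each RandomState carries randomly seeded keys.
            RandomState::new().build_hasher().finish()
        }
    }
    ```
    
    Because the switch is a runtime flag rather than a Cargo feature, the crate is compiled once and tests opt in at startup.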
  • tui: queue non-pending rollback trims in app-event order (#11373)
    ## Summary
    
    This PR fixes TUI transcript-sync behavior for
    `EventMsg::ThreadRolledBack` and makes rollback application order
    deterministic.
    
    Previously, rollback handling depended on `pending_rollback`:
    
    - if `pending_rollback` was set (local backtrack), TUI trimmed correctly
    - otherwise, replayed/external rollbacks were either ignored or could be
    applied at the wrong time relative to queued transcript inserts
    
    This change keeps the local backtrack path intact and routes non-pending
    rollbacks through the app event queue so rollback trims are applied in
    FIFO order with transcript cell inserts.
    
    ## What changed
    
    - Added/used `trim_transcript_cells_drop_last_n_user_turns(...)` for
    rollback-by-`num_turns` semantics.
    - Renamed rollback app event:
      - `AppEvent::ApplyReplayedThreadRollback` ->
        `AppEvent::ApplyThreadRollback`
    - Replay path (`ChatWidget`) now emits `ApplyThreadRollback`.
    - Live non-pending rollback path (`App::handle_backtrack_event`) now
    emits `ApplyThreadRollback` instead of trimming immediately.
    - App-level event handler applies `ApplyThreadRollback` after queued
    `InsertHistoryCell` events and schedules redraw only when a trim
    occurred.
    - When a trim occurs with an overlay open, TUI now syncs transcript
    overlay committed cells, clamps backtrack preview selection, and clears
    stale `deferred_history_lines` so closed overlays do not re-append
    rolled-back lines.
    - Clarified inline comments around the `pending_rollback` branch so
    future readers can reason about why there are two paths.
    
    ## Why queueing matters
    
    During resume/replay, transcript cells are populated via queued
    `InsertHistoryCell` app events. If a rollback is applied immediately
    outside that queue, it can run against an incomplete transcript and
    under-trim. Queueing non-pending rollbacks ensures consistent ordering
    and correct final transcript state.
    
    ## Behavior by rollback source
    
    - `pending_rollback = Some(...)` (local backtrack requested by this
    TUI):
      - use `finish_pending_backtrack()` and the stored selection boundary
    - `pending_rollback = None` (replay/external/non-local rollback):
      - enqueue `AppEvent::ApplyThreadRollback { num_turns }` and trim in
        app-event order
    
    ## Tests
    
    Added/updated tests covering ordering and semantics:
    
    - `app_backtrack::tests::trim_drop_last_n_user_turns_applies_rollback_semantics`
    - `app_backtrack::tests::trim_drop_last_n_user_turns_allows_overflow`
    - `app::tests::replayed_initial_messages_apply_rollback_in_queue_order`
    - `app::tests::live_rollback_during_replay_is_applied_in_app_event_order`
    - `app::tests::queued_rollback_syncs_overlay_and_clears_deferred_history`
    - `chatwidget::tests::replayed_thread_rollback_emits_ordered_app_event`
    
    Validation run:
    
    - `just fmt`
    - `cargo test -p codex-tui`
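    
    The ordering argument from "Why queueing matters" can be modeled with the sketch below. It makes several assumptions: `Cell`, `drain_events`, and the simplified trim function are hypothetical stand-ins, and the overflow handling (an oversized `n` removes every user turn) is this sketch's guess, not necessarily what the TUI does.
    
    ```rust
    use std::collections::VecDeque;
    
    // Simplified stand-ins for the TUI types named in the summary.
    #[derive(Debug, Clone, PartialEq)]
    pub enum Cell {
        User(String),
        Agent(String),
    }
    
    #[derive(Debug)]
    pub enum AppEvent {
        InsertHistoryCell(Cell),
        ApplyThreadRollback { num_turns: usize },
    }
    
    /// Drops the last `n` user turns: the n-th user cell from the end and
    /// everything after it. Returns true if any cell was removed.
    pub fn trim_drop_last_n_user_turns(cells: &mut Vec<Cell>, n: usize) -> bool {
        if n == 0 {
            return false;
        }
        let user_positions: Vec<usize> = cells
            .iter()
            .enumerate()
            .filter(|(_, cell)| matches!(cell, Cell::User(_)))
            .map(|(index, _)| index)
            .collect();
        if user_positions.is_empty() {
            return false;
        }
        // Assumption: an `n` larger than the number of user turns removes all of them.
        let keep = user_positions.len().saturating_sub(n);
        cells.truncate(user_positions[keep]);
        true
    }
    
    /// Applies queued events in FIFO order, so a rollback only ever sees the
    /// inserts that were queued before it. Returns true if a redraw is needed.
    pub fn drain_events(queue: &mut VecDeque<AppEvent>, cells: &mut Vec<Cell>) -> bool {
        let mut needs_redraw = false;
        while let Some(event) = queue.pop_front() {
            match event {
                AppEvent::InsertHistoryCell(cell) => cells.push(cell),
                AppEvent::ApplyThreadRollback { num_turns } => {
                    needs_redraw |= trim_drop_last_n_user_turns(cells, num_turns);
                }
            }
        }
        needs_redraw
    }
    ```
    
    If the rollback were applied immediately instead of being queued, it would run before the earlier inserts had landed and could trim too little.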
  • Prefer websocket transport when model opts in (#11386)
    Summary
    - add a `prefer_websockets` field to `ModelInfo`, defaulting to `false`
    in all fixtures and constructors
    - wire the new flag into websocket selection so models that opt in
    always use websocket transport even when the feature gate is off
    
    Testing
    - Not run (not requested)
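    
    A rough sketch of the selection rule described in the summary, assuming a two-way choice between websocket and HTTP transport. The `Transport` enum, `select_transport`, and the `slug` field are hypothetical; only `ModelInfo` and `prefer_websockets` are named above.
    
    ```rust
    /// Minimal stand-in for the model metadata; only `prefer_websockets`
    /// comes from the summary.
    #[derive(Debug, Clone)]
    pub struct ModelInfo {
        pub slug: String,
        pub prefer_websockets: bool,
    }
    
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Transport {
        Websocket,
        Http,
    }
    
    /// A model that opts in gets websockets even when the feature gate is off.
    pub fn select_transport(model: &ModelInfo, websocket_feature_enabled: bool) -> Transport {
        if websocket_feature_enabled || model.prefer_websockets {
            Transport::Websocket
        } else {
            Transport::Http
        }
    }
    ```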
  • Update models.json (#11376)
    Automated update of models.json.
    
    ---------
    
    Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
    Co-authored-by: sayan-oai <sayan@openai.com>
  • chore: rename codex-command to codex-shell-command (#11378)
    This addresses some post-merge feedback on
    https://github.com/openai/codex/pull/11361:
    
    - crate rename
    - reuse `detect_shell_type()` utility
  • feat: prevent double backfill (#11377)
    ## Summary
    
    Add a DB-backed lease to prevent duplicate `.sqlite` backfill workers
    from running concurrently.
    
    ### What changed
    - Added `StateRuntime::try_claim_backfill(lease_seconds)`, which
    atomically claims backfill only when:
      - backfill is not complete, and
      - no fresh running worker currently owns it.
    - Updated `backfill_sessions` to use the claim API and exit early when
    another worker already holds the lease.
    - Added runtime tests covering:
      - singleton claim behavior,
      - stale lease takeover,
      - claim blocked after complete.
    - Set backfill lease to 900s in production and 1s in tests.
    
    ### Why
    
    This avoids duplicate backfill work and reduces backfill status churn
    under concurrent startup, while preserving the current best-effort
    fallback behavior.
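    
    The claim rules listed under "What changed" can be sketched in memory as below. This is an assumption-laden model: the real claim is an atomic database operation, and `BackfillState`, `BackfillStatus`, and the free-function signature are hypothetical.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum BackfillStatus {
        Pending,
        Running,
        Complete,
    }
    
    /// In-memory stand-in for the persisted backfill row.
    #[derive(Debug, Clone)]
    pub struct BackfillState {
        pub status: BackfillStatus,
        pub updated_at_secs: u64,
    }
    
    /// Returns true if the caller now owns the backfill lease.
    /// A claim fails when backfill is complete or another worker's lease is fresh.
    pub fn try_claim_backfill(state: &mut BackfillState, now_secs: u64, lease_seconds: u64) -> bool {
        if state.status == BackfillStatus::Complete {
            return false;
        }
        let lease_is_fresh = state.status == BackfillStatus::Running
            && now_secs.saturating_sub(state.updated_at_secs) < lease_seconds;
        if lease_is_fresh {
            return false;
        }
        // Either nobody was running or the previous lease went stale: take over.
        state.status = BackfillStatus::Running;
        state.updated_at_secs = now_secs;
        true
    }
    ```
    
    The three runtime tests listed above map onto this model: a second claim inside the lease fails, a claim after the lease expires succeeds, and any claim after completion fails.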
  • ci: fall back to local Bazel on forks without BuildBuddy key (#11359)
    ## Summary
    - detect whether BUILDBUDDY_API_KEY is present in Bazel CI
    - keep existing remote BuildBuddy path when key is available
    - add a local fallback path for fork PRs without secrets by clearing
    remote cache/executor/BES endpoints
    - document each fallback flag inline with links to Bazel docs
    
    ## Testing
    - `ruby -e 'require "yaml"; YAML.load_file(".github/workflows/bazel.yml"); puts "ok"'`
    - verified Bazel docs/flag references used in workflow comments
  • Enable SOCKS defaults for common local network proxy use cases (#11362)
    ## Summary
    - enable local-use defaults in network proxy settings: SOCKS5 on, SOCKS5
    UDP on, upstream proxying on, and local binding on
    - add a regression test that asserts the full
    `NetworkProxySettings::default()` baseline
    - fix managed listener reservation: previously a loopback SOCKS listener
    was always reserved, even when `enable_socks5 = false`; now it is
    reserved only when SOCKS is enabled
    - fix `/debug-config` env output for SOCKS-disabled sessions: `ALL_PROXY`
    now shows the HTTP proxy URL when SOCKS is disabled (instead of
    incorrectly showing `socks5h://...`)
    
    
    ## Validation
    - just fmt
    - cargo test -p codex-network-proxy
    - cargo clippy -p codex-network-proxy --all-targets
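    
    The defaults and the two fixes in the summary might look roughly like the sketch below. Only `NetworkProxySettings` and `enable_socks5` are named above; the other field names, both helper functions, and the `http://` scheme for the HTTP proxy URL are assumptions made for illustration.
    
    ```rust
    /// Sketch of the proxy settings. Field names other than `enable_socks5`
    /// are hypothetical.
    #[derive(Debug, Clone, PartialEq)]
    pub struct NetworkProxySettings {
        pub enable_socks5: bool,
        pub enable_socks5_udp: bool,
        pub allow_upstream_proxy: bool,
        pub allow_local_binding: bool,
    }
    
    impl Default for NetworkProxySettings {
        /// Local-use defaults: everything listed in the summary is on.
        fn default() -> Self {
            Self {
                enable_socks5: true,
                enable_socks5_udp: true,
                allow_upstream_proxy: true,
                allow_local_binding: true,
            }
        }
    }
    
    /// Value reported for ALL_PROXY: the SOCKS URL when SOCKS is enabled,
    /// otherwise the HTTP proxy URL.
    pub fn all_proxy_url(settings: &NetworkProxySettings, http_addr: &str, socks_addr: &str) -> String {
        if settings.enable_socks5 {
            format!("socks5h://{socks_addr}")
        } else {
            format!("http://{http_addr}")
        }
    }
    
    /// A loopback SOCKS listener is reserved only when SOCKS is enabled.
    pub fn reserved_socks_listener(settings: &NetworkProxySettings, loopback_addr: &str) -> Option<String> {
        settings.enable_socks5.then(|| loopback_addr.to_string())
    }
    ```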
  • feat: mem v2 - PR4 (#11369)
    # Memories migration plan (simplified global workflow)
    
    ## Target behavior
    
    - One shared memory root only: `~/.codex/memories/`.
    - No per-cwd memory buckets, no cwd hash handling.
    - Phase 1 candidate rules:
      - Not currently being processed unless the job lease is stale.
      - Rollout updated within the max-age window (currently 30 days).
      - Rollout idle for at least 12 hours (new constant).
    - Global cap: at most 64 stage-1 jobs in `running` state at any time
    (new invariant).
    - Stage-1 model output shape (new):
      - `rollout_slug` (accepted but ignored for now).
      - `rollout_summary`.
      - `raw_memory`.
    - Phase-1 artifacts written under the shared root:
      - `rollout_summaries/<thread_id>.md` for each rollout summary.
      - `raw_memories.md` containing appended/merged raw memory paragraphs.
    - Phase 2 runs one consolidation agent for the shared `memories/`
    directory.
    - Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.
    
    ## Current code map
    
    - Core startup pipeline: `core/src/memories/startup/mod.rs`.
    - Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
    `core/src/memories/stage_one.rs`, templates in
    `core/templates/memories/`.
    - File materialization: `core/src/memories/storage.rs`,
    `core/src/memories/layout.rs`.
    - Scope routing (cwd/user): `core/src/memories/scope.rs`,
    `core/src/memories/startup/mod.rs`.
    - DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.
    
    ## PR plan
    
    ## PR 1: Correct phase-1 selection invariants (no behavior-breaking
    layout changes yet)
    
    - Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
    `core/src/memories/mod.rs`.
    - Thread this into `state::claim_stage1_jobs_for_startup(...)`.
    - Enforce idle-time filter in DB selection logic (not only in-memory
    filtering after `scan_limit`) so eligible threads are not starved by
    very recent threads.
    - Enforce global running cap of 64 at claim time in DB logic:
      - Count fresh `memory_stage1` running jobs.
      - Only allow new claims while count < cap.
      - Keep stale-lease takeover behavior intact.
    - Add/adjust tests in `state/src/runtime.rs`:
      - Idle filter inclusion/exclusion around 12h boundary.
      - Global running-cap guarantee.
      - Existing stale/fresh ownership behavior still passes.
    
    Acceptance criteria:
    - Startup never creates more than 64 fresh `memory_stage1` running jobs.
    - Threads updated <12h ago are skipped.
    - Threads older than 30d are skipped.
    
    ## PR 2: Stage-1 output contract + storage artifacts
    (forward-compatible)
    
    - Update parser/types to accept the new structured output while keeping
    backward compatibility:
    - Add `rollout_slug` (optional for now).
    - Add `rollout_summary`.
    - Keep alias support for legacy `summary` and `rawMemory` until prompt
    swap completes.
    - Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
    include the new keys.
    - Update prompt templates:
    - `core/templates/memories/stage_one_system.md`.
    - `core/templates/memories/stage_one_input.md`.
    - Replace storage model in `core/src/memories/storage.rs`:
    - Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
    files).
    - Introduce `raw_memories.md` aggregator writer from DB rows.
    - Keep deterministic rebuild behavior from DB outputs so files can
    always be regenerated.
    - Update consolidation prompt template to reference `rollout_summaries/`
    + `raw_memories.md` inputs.
    
    Acceptance criteria:
    - Stage-1 accepts both old and new output keys during migration.
    - Phase-1 artifacts are generated in new format from DB state.
    - No dependence on per-thread files in `raw_memories/`.
    
    ## PR 3: Remove per-cwd memories and move to one global memory root
    
    - Simplify layout in `core/src/memories/layout.rs`:
    - Single root: `codex_home/memories`.
    - Remove cwd-hash bucket helpers and normalization logic used only for
    memory pathing.
    - Remove scope branching from startup phase-2 dispatch path:
    - No cwd/user mapping in `core/src/memories/startup/mod.rs`.
    - One target root for consolidation.
    - In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
    consolidation scope.
    - Keep one logical consolidation scope/job key (global/user) to avoid a
    risky schema rewrite in same PR.
    - Add one-time migration helper (core side) to preserve current shared
    memory output:
    - If `~/.codex/memories/user/memory` exists and new root is empty,
    move/copy contents into `~/.codex/memories`.
    - Leave old hashed cwd buckets untouched for now (safe/non-destructive
    migration).
    
    Acceptance criteria:
    - New runs only read/write `~/.codex/memories`.
    - No new cwd-scoped consolidation jobs are enqueued.
    - Existing user-shared memory content is preserved.
    
    ## PR 4: Phase-2 global lock simplification and cleanup
    
    - Replace multi-scope dispatch with a single global consolidation claim
    path:
    - Either reuse jobs table with one fixed key, or add a tiny dedicated
    lock helper; keep 1h lease.
    - Ensure at most one consolidation agent can run at once.
    - Keep heartbeat + stale lock recovery semantics in
    `core/src/memories/startup/watch.rs`.
    - Remove dead scope code and legacy constants no longer used.
    - Update tests:
    - One-agent-at-a-time behavior.
    - Lock expiry allows takeover after stale lease.
    
    Acceptance criteria:
    - Exactly one phase-2 consolidation agent can be active cluster-wide
    (per local DB).
    - Stale lock recovers automatically.
    
    ## PR 5: Final cleanup and docs
    
    - Remove legacy artifacts and references:
    - `raw_memories/` and `memory_summary.md` assumptions from
    prompts/comments/tests.
    - Scope constants for cwd memory pathing in core/state if fully unused.
    - Update docs under `docs/` for memory workflow and directory layout.
    - Add a brief operator note for rollout: compatibility window for old
    stage-1 JSON keys and when to remove aliases.
    
    Acceptance criteria:
    - Code and docs reflect only the simplified global workflow.
    - No stale references to per-cwd memory buckets.
    
    ## Notes on sequencing
    
    - PR 1 is safest first because it improves correctness without changing
    external artifact layout.
    - PR 2 keeps parser compatibility so prompt deployment can happen
    independently.
    - PR 3 and PR 4 split filesystem/scope simplification from locking
    simplification to reduce blast radius.
    - PR 5 is intentionally cleanup-only.
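    
    The phase-1 selection rules from the plan (idle for at least 12 hours, updated within 30 days, no fresh lease, at most 64 running jobs) can be sketched as follows. This is an in-memory illustration only: the plan puts this filtering in DB selection logic, and every name except `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS` is hypothetical.
    
    ```rust
    pub const PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12; // named in the plan
    pub const PHASE_ONE_MAX_ROLLOUT_AGE_DAYS: i64 = 30; // hypothetical name
    pub const MAX_RUNNING_STAGE1_JOBS: usize = 64; // hypothetical name
    
    /// Hypothetical view of one rollout as seen by the claim logic.
    #[derive(Debug, Clone)]
    pub struct Candidate {
        pub thread_id: String,
        pub updated_at_secs: i64,
        /// True when another worker holds a lease that has not gone stale.
        pub has_fresh_lease: bool,
    }
    
    /// A rollout qualifies when nobody is actively processing it, it has been
    /// idle for at least 12 hours, and it was updated within the last 30 days.
    pub fn is_eligible(candidate: &Candidate, now_secs: i64) -> bool {
        let idle_secs = now_secs - candidate.updated_at_secs;
        !candidate.has_fresh_lease
            && idle_secs >= PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS * 3_600
            && idle_secs <= PHASE_ONE_MAX_ROLLOUT_AGE_DAYS * 86_400
    }
    
    /// Claims eligible rollouts without letting the number of fresh running
    /// jobs exceed the global cap.
    pub fn claim_stage1_jobs(
        candidates: &[Candidate],
        now_secs: i64,
        fresh_running: usize,
    ) -> Vec<&Candidate> {
        let free_slots = MAX_RUNNING_STAGE1_JOBS.saturating_sub(fresh_running);
        candidates
            .iter()
            .filter(|candidate| is_eligible(candidate, now_secs))
            .take(free_slots)
            .collect()
    }
    ```
    
    Filtering before applying the limit reflects the plan's point that very recent threads should not consume the scan budget and starve eligible ones.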
  • Split command parsing/safety out of codex-core into new codex-command (#11361)
    `codex-core` had accumulated command parsing and command safety logic
    (`bash`, `powershell`, `parse_command`, and `command_safety`) that is
    logically cohesive but orthogonal to most core session/runtime logic.
    Keeping this code in `codex-core` made the crate increasingly monolithic
    and raised iteration cost for unrelated core changes.
    
    This change extracts that surface into a dedicated crate,
    `codex-command`, while preserving existing `codex_core::...` call sites
    via re-exports.
    
    ## Why this refactor
    
    During analysis, command parsing/safety stood out as a good first split
    because it has:
    
    - a clear domain boundary (shell parsing + safety classification)
    - relatively self-contained dependencies (notably `tree-sitter` /
    `tree-sitter-bash`)
    - a meaningful standalone test surface (`134` tests moved with the
    crate)
    - many downstream uses that benefit from independent compilation and
    caching
    
    The practical problem was build latency from a large `codex-core`
    compile/test graph. Clean-build timings before and after this split
    showed measurable wins:
    
    - `cargo check -p codex-core`: `57.08s` -> `53.54s` (~`6.2%` faster)
    - `cargo test -p codex-core --no-run`: `2m39.9s` -> `2m20s` (~`12.4%`
    faster)
    - `codex-core lib` compile unit: `57.18s` -> `49.67s` (~`13.1%` faster)
    - `codex-core lib(test)` compile unit: `60.87s` -> `53.21s` (~`12.6%`
    faster)
    
    This gives a concrete reduction in core build overhead without changing
    behavior.
    
    ## What changed
    
    ### New crate
    
    - Added `codex-rs/command` as workspace crate `codex-command`.
    - Added:
      - `command/src/lib.rs`
      - `command/src/bash.rs`
      - `command/src/powershell.rs`
      - `command/src/parse_command.rs`
      - `command/src/command_safety/*`
      - `command/src/shell_detect.rs`
      - `command/BUILD.bazel`
    
    ### Code moved out of `codex-core`
    
    - Moved modules from `core/src` into `command/src`:
      - `bash.rs`
      - `powershell.rs`
      - `parse_command.rs`
      - `command_safety/*`
    
    ### Dependency graph updates
    
    - Added workspace member/dependency entries for `codex-command` in
    `codex-rs/Cargo.toml`.
    - Added `codex-command` dependency to `codex-rs/core/Cargo.toml`.
    - Removed `tree-sitter` and `tree-sitter-bash` from `codex-core` direct
    deps (now owned by `codex-command`).
    
    ### API compatibility for callers
    
    To avoid immediate downstream churn, `codex-core` now re-exports the
    moved modules/functions:
    
    - `codex_command::bash`
    - `codex_command::powershell`
    - `codex_command::parse_command`
    - `codex_command::is_safe_command`
    - `codex_command::is_dangerous_command`
    
    This keeps existing `codex_core::...` paths working while enabling
    gradual migration to direct `codex-command` usage.
    
    ### Internal decoupling detail
    
    - Added `command::shell_detect` so moved `bash`/`powershell` logic no
    longer depends on core shell internals.
    - Adjusted PowerShell helper visibility in `codex-command` for existing
    core test usage (`UTF8` prefix helper + executable discovery functions).
    
    ## Validation
    
    - `just fmt`
    - `just fix -p codex-command -p codex-core`
    - `cargo test -p codex-command` (`134` passed)
    - `cargo test -p codex-core --no-run`
    - `cargo test -p codex-core shell_command_handler`
    
    ## Notes / follow-up
    
    This commit intentionally prioritizes boundary extraction and
    compatibility. A follow-up can migrate downstream crates to depend
    directly on `codex-command` (instead of through `codex-core` re-exports)
    to realize additional incremental build wins.
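    
    The re-export approach described under "API compatibility for callers" can be illustrated in a single file, with modules standing in for the two crates. The function bodies are toy placeholders and `parse_words` is a hypothetical name; the real parsing relies on `tree-sitter-bash`.
    
    ```rust
    // Stand-in for the new `codex-command` crate.
    pub mod codex_command {
        pub mod bash {
            /// Toy parser that splits on whitespace (hypothetical helper).
            pub fn parse_words(script: &str) -> Vec<String> {
                script.split_whitespace().map(str::to_string).collect()
            }
        }
    
        /// Toy classifier: treats a couple of read-only commands as safe.
        pub fn is_safe_command(command: &[String]) -> bool {
            matches!(
                command.first().map(String::as_str),
                Some("ls") | Some("pwd")
            )
        }
    }
    
    // Stand-in for `codex-core`: it re-exports the moved items so existing
    // `codex_core::...` paths keep compiling.
    pub mod codex_core {
        pub use super::codex_command::bash;
        pub use super::codex_command::is_safe_command;
    }
    ```
    
    Callers that still write `codex_core::bash::...` resolve to the same items as `codex_command::bash::...`, which is what allows the migration to be gradual.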
  • Update models.json (#11274)
    Automated update of models.json.
    
    ---------
    
    Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
    Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
    Co-authored-by: Sayan Sisodiya <sayan@openai.com>
  • feat: mem v2 - PR3 (#11366)
    # Memories migration plan (simplified global workflow)
    
    ## Target behavior
    
    - One shared memory root only: `~/.codex/memories/`.
    - No per-cwd memory buckets, no cwd hash handling.
    - Phase 1 candidate rules:
      - Not currently being processed unless the job lease is stale.
      - Rollout updated within the max-age window (currently 30 days).
      - Rollout idle for at least 12 hours (new constant).
    - Global cap: at most 64 stage-1 jobs in `running` state at any time
    (new invariant).
    - Stage-1 model output shape (new):
      - `rollout_slug` (accepted but ignored for now).
      - `rollout_summary`.
      - `raw_memory`.
    - Phase-1 artifacts written under the shared root:
      - `rollout_summaries/<thread_id>.md` for each rollout summary.
      - `raw_memories.md` containing appended/merged raw memory paragraphs.
    - Phase 2 runs one consolidation agent for the shared `memories/`
    directory.
    - Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.
    
    ## Current code map
    
    - Core startup pipeline: `core/src/memories/startup/mod.rs`.
    - Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
    `core/src/memories/stage_one.rs`, templates in
    `core/templates/memories/`.
    - File materialization: `core/src/memories/storage.rs`,
    `core/src/memories/layout.rs`.
    - Scope routing (cwd/user): `core/src/memories/scope.rs`,
    `core/src/memories/startup/mod.rs`.
    - DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.
    
    ## PR plan
    
    ## PR 1: Correct phase-1 selection invariants (no behavior-breaking
    layout changes yet)
    
    - Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
    `core/src/memories/mod.rs`.
    - Thread this into `state::claim_stage1_jobs_for_startup(...)`.
    - Enforce idle-time filter in DB selection logic (not only in-memory
    filtering after `scan_limit`) so eligible threads are not starved by
    very recent threads.
    - Enforce global running cap of 64 at claim time in DB logic:
      - Count fresh `memory_stage1` running jobs.
      - Only allow new claims while count < cap.
      - Keep stale-lease takeover behavior intact.
    - Add/adjust tests in `state/src/runtime.rs`:
      - Idle filter inclusion/exclusion around 12h boundary.
      - Global running-cap guarantee.
      - Existing stale/fresh ownership behavior still passes.
    
    Acceptance criteria:
    - Startup never creates more than 64 fresh `memory_stage1` running jobs.
    - Threads updated <12h ago are skipped.
    - Threads older than 30d are skipped.
    
    ## PR 2: Stage-1 output contract + storage artifacts
    (forward-compatible)
    
    - Update parser/types to accept the new structured output while keeping
    backward compatibility:
    - Add `rollout_slug` (optional for now).
    - Add `rollout_summary`.
    - Keep alias support for legacy `summary` and `rawMemory` until prompt
    swap completes.
    - Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
    include the new keys.
    - Update prompt templates:
    - `core/templates/memories/stage_one_system.md`.
    - `core/templates/memories/stage_one_input.md`.
    - Replace storage model in `core/src/memories/storage.rs`:
    - Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
    files).
    - Introduce `raw_memories.md` aggregator writer from DB rows.
    - Keep deterministic rebuild behavior from DB outputs so files can
    always be regenerated.
    - Update consolidation prompt template to reference `rollout_summaries/`
    + `raw_memories.md` inputs.
    
    Acceptance criteria:
    - Stage-1 accepts both old and new output keys during migration.
    - Phase-1 artifacts are generated in new format from DB state.
    - No dependence on per-thread files in `raw_memories/`.
    
    ## PR 3: Remove per-cwd memories and move to one global memory root
    
    - Simplify layout in `core/src/memories/layout.rs`:
    - Single root: `codex_home/memories`.
    - Remove cwd-hash bucket helpers and normalization logic used only for
    memory pathing.
    - Remove scope branching from startup phase-2 dispatch path:
    - No cwd/user mapping in `core/src/memories/startup/mod.rs`.
    - One target root for consolidation.
    - In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
    consolidation scope.
    - Keep one logical consolidation scope/job key (global/user) to avoid a
    risky schema rewrite in same PR.
    - Add one-time migration helper (core side) to preserve current shared
    memory output:
    - If `~/.codex/memories/user/memory` exists and new root is empty,
    move/copy contents into `~/.codex/memories`.
    - Leave old hashed cwd buckets untouched for now (safe/non-destructive
    migration).
    
    Acceptance criteria:
    - New runs only read/write `~/.codex/memories`.
    - No new cwd-scoped consolidation jobs are enqueued.
    - Existing user-shared memory content is preserved.
    
    ## PR 4: Phase-2 global lock simplification and cleanup
    
    - Replace multi-scope dispatch with a single global consolidation claim
    path:
    - Either reuse jobs table with one fixed key, or add a tiny dedicated
    lock helper; keep 1h lease.
    - Ensure at most one consolidation agent can run at once.
    - Keep heartbeat + stale lock recovery semantics in
    `core/src/memories/startup/watch.rs`.
    - Remove dead scope code and legacy constants no longer used.
    - Update tests:
    - One-agent-at-a-time behavior.
    - Lock expiry allows takeover after stale lease.
    
    Acceptance criteria:
    - Exactly one phase-2 consolidation agent can be active cluster-wide
    (per local DB).
    - Stale lock recovers automatically.
    
    ## PR 5: Final cleanup and docs
    
    - Remove legacy artifacts and references:
    - `raw_memories/` and `memory_summary.md` assumptions from
    prompts/comments/tests.
    - Scope constants for cwd memory pathing in core/state if fully unused.
    - Update docs under `docs/` for memory workflow and directory layout.
    - Add a brief operator note for rollout: compatibility window for old
    stage-1 JSON keys and when to remove aliases.
    
    Acceptance criteria:
    - Code and docs reflect only the simplified global workflow.
    - No stale references to per-cwd memory buckets.
    
    ## Notes on sequencing
    
    - PR 1 is safest first because it improves correctness without changing
    external artifact layout.
    - PR 2 keeps parser compatibility so prompt deployment can happen
    independently.
    - PR 3 and PR 4 split filesystem/scope simplification from locking
    simplification to reduce blast radius.
    - PR 5 is intentionally cleanup-only.
  • feat: mem v2 - PR2 (#11365)
    # Memories migration plan (simplified global workflow)
    
    ## Target behavior
    
    - One shared memory root only: `~/.codex/memories/`.
    - No per-cwd memory buckets, no cwd hash handling.
    - Phase 1 candidate rules:
      - Not currently being processed unless the job lease is stale.
      - Rollout updated within the max-age window (currently 30 days).
      - Rollout idle for at least 12 hours (new constant).
    - Global cap: at most 64 stage-1 jobs in `running` state at any time
    (new invariant).
    - Stage-1 model output shape (new):
      - `rollout_slug` (accepted but ignored for now).
      - `rollout_summary`.
      - `raw_memory`.
    - Phase-1 artifacts written under the shared root:
      - `rollout_summaries/<thread_id>.md` for each rollout summary.
      - `raw_memories.md` containing appended/merged raw memory paragraphs.
    - Phase 2 runs one consolidation agent for the shared `memories/`
    directory.
    - Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.
    
    ## Current code map
    
    - Core startup pipeline: `core/src/memories/startup/mod.rs`.
    - Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
    `core/src/memories/stage_one.rs`, templates in
    `core/templates/memories/`.
    - File materialization: `core/src/memories/storage.rs`,
    `core/src/memories/layout.rs`.
    - Scope routing (cwd/user): `core/src/memories/scope.rs`,
    `core/src/memories/startup/mod.rs`.
    - DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.
    
    ## PR plan
    
    ## PR 1: Correct phase-1 selection invariants (no behavior-breaking
    layout changes yet)
    
    - Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
    `core/src/memories/mod.rs`.
    - Thread this into `state::claim_stage1_jobs_for_startup(...)`.
    - Enforce idle-time filter in DB selection logic (not only in-memory
    filtering after `scan_limit`) so eligible threads are not starved by
    very recent threads.
    - Enforce global running cap of 64 at claim time in DB logic:
      - Count fresh `memory_stage1` running jobs.
      - Only allow new claims while count < cap.
      - Keep stale-lease takeover behavior intact.
    - Add/adjust tests in `state/src/runtime.rs`:
      - Idle filter inclusion/exclusion around 12h boundary.
      - Global running-cap guarantee.
      - Existing stale/fresh ownership behavior still passes.
    
    Acceptance criteria:
    - Startup never creates more than 64 fresh `memory_stage1` running jobs.
    - Threads updated <12h ago are skipped.
    - Threads older than 30d are skipped.
    
    ## PR 2: Stage-1 output contract + storage artifacts
    (forward-compatible)
    
    - Update parser/types to accept the new structured output while keeping
    backward compatibility:
    - Add `rollout_slug` (optional for now).
    - Add `rollout_summary`.
    - Keep alias support for legacy `summary` and `rawMemory` until prompt
    swap completes.
    - Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
    include the new keys.
    - Update prompt templates:
    - `core/templates/memories/stage_one_system.md`.
    - `core/templates/memories/stage_one_input.md`.
    - Replace storage model in `core/src/memories/storage.rs`:
    - Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
    files).
    - Introduce `raw_memories.md` aggregator writer from DB rows.
    - Keep deterministic rebuild behavior from DB outputs so files can
    always be regenerated.
    - Update consolidation prompt template to reference `rollout_summaries/`
    + `raw_memories.md` inputs.
    
    Acceptance criteria:
    - Stage-1 accepts both old and new output keys during migration.
    - Phase-1 artifacts are generated in new format from DB state.
    - No dependence on per-thread files in `raw_memories/`.
    
    ## PR 3: Remove per-cwd memories and move to one global memory root
    
    - Simplify layout in `core/src/memories/layout.rs`:
    - Single root: `codex_home/memories`.
    - Remove cwd-hash bucket helpers and normalization logic used only for
    memory pathing.
    - Remove scope branching from startup phase-2 dispatch path:
    - No cwd/user mapping in `core/src/memories/startup/mod.rs`.
    - One target root for consolidation.
    - In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
    consolidation scope.
    - Keep one logical consolidation scope/job key (global/user) to avoid a
    risky schema rewrite in same PR.
    - Add one-time migration helper (core side) to preserve current shared
    memory output:
    - If `~/.codex/memories/user/memory` exists and new root is empty,
    move/copy contents into `~/.codex/memories`.
    - Leave old hashed cwd buckets untouched for now (safe/non-destructive
    migration).
    
    Acceptance criteria:
    - New runs only read/write `~/.codex/memories`.
    - No new cwd-scoped consolidation jobs are enqueued.
    - Existing user-shared memory content is preserved.
    
    ## PR 4: Phase-2 global lock simplification and cleanup
    
    - Replace multi-scope dispatch with a single global consolidation claim
    path:
    - Either reuse jobs table with one fixed key, or add a tiny dedicated
    lock helper; keep 1h lease.
    - Ensure at most one consolidation agent can run at once.
    - Keep heartbeat + stale lock recovery semantics in
    `core/src/memories/startup/watch.rs`.
    - Remove dead scope code and legacy constants no longer used.
    - Update tests:
    - One-agent-at-a-time behavior.
    - Lock expiry allows takeover after stale lease.
    
    Acceptance criteria:
    - Exactly one phase-2 consolidation agent can be active cluster-wide
    (per local DB).
    - Stale lock recovers automatically.
    
    ## PR 5: Final cleanup and docs
    
    - Remove legacy artifacts and references:
    - `raw_memories/` and `memory_summary.md` assumptions from
    prompts/comments/tests.
    - Scope constants for cwd memory pathing in core/state if fully unused.
    - Update docs under `docs/` for memory workflow and directory layout.
    - Add a brief operator note for rollout: compatibility window for old
    stage-1 JSON keys and when to remove aliases.
    
    Acceptance criteria:
    - Code and docs reflect only the simplified global workflow.
    - No stale references to per-cwd memory buckets.
    
    ## Notes on sequencing
    
    - PR 1 is safest first because it improves correctness without changing
    external artifact layout.
    - PR 2 keeps parser compatibility so prompt deployment can happen
    independently.
    - PR 3 and PR 4 split filesystem/scope simplification from locking
    simplification to reduce blast radius.
    - PR 5 is intentionally cleanup-only.
  • feat: mem v2 - PR1 (#11364)
    # Memories migration plan (simplified global workflow)
    
    ## Target behavior
    
    - One shared memory root only: `~/.codex/memories/`.
    - No per-cwd memory buckets, no cwd hash handling.
    - Phase 1 candidate rules:
    - Not currently being processed unless the job lease is stale.
    - Rollout updated within the max-age window (currently 30 days).
    - Rollout idle for at least 12 hours (new constant).
    - Global cap: at most 64 stage-1 jobs in `running` state at any time
    (new invariant).
    - Stage-1 model output shape (new):
    - `rollout_slug` (accepted but ignored for now).
    - `rollout_summary`.
    - `raw_memory`.
    - Phase-1 artifacts written under the shared root:
    - `rollout_summaries/<thread_id>.md` for each rollout summary.
    - `raw_memories.md` containing appended/merged raw memory paragraphs.
    - Phase 2 runs one consolidation agent for the shared `memories/`
    directory.
    - Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.
    
    ## Current code map
    
    - Core startup pipeline: `core/src/memories/startup/mod.rs`.
    - Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
    `core/src/memories/stage_one.rs`, templates in
    `core/templates/memories/`.
    - File materialization: `core/src/memories/storage.rs`,
    `core/src/memories/layout.rs`.
    - Scope routing (cwd/user): `core/src/memories/scope.rs`,
    `core/src/memories/startup/mod.rs`.
    - DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.
    
    ## PR plan
    
    ## PR 1: Correct phase-1 selection invariants (no behavior-breaking
    layout changes yet)
    
    - Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
    `core/src/memories/mod.rs`.
    - Thread this into `state::claim_stage1_jobs_for_startup(...)`.
    - Enforce idle-time filter in DB selection logic (not only in-memory
    filtering after `scan_limit`) so eligible threads are not starved by
    very recent threads.
    - Enforce global running cap of 64 at claim time in DB logic:
    - Count fresh `memory_stage1` running jobs.
    - Only allow new claims while count < cap.
    - Keep stale-lease takeover behavior intact.
    - Add/adjust tests in `state/src/runtime.rs`:
    - Idle filter inclusion/exclusion around 12h boundary.
    - Global running-cap guarantee.
    - Existing stale/fresh ownership behavior still passes.
    
    Acceptance criteria:
    - Startup never creates more than 64 fresh `memory_stage1` running jobs.
    - Threads updated <12h ago are skipped.
    - Threads older than 30d are skipped.
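
    The selection rules for PR 1 can be sketched as follows. This is a
    minimal in-memory illustration, not the DB query itself: the function
    name, the `(thread_id, updated_at)` input shape, and the
    most-recent-first ordering are assumptions; only the 12h, 30d, and 64
    limits come from the plan.

```python
from datetime import datetime, timedelta

# Limits taken from the plan above; all names here are hypothetical.
PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS = 12
PHASE_ONE_MAX_ROLLOUT_AGE_DAYS = 30
PHASE_ONE_MAX_RUNNING_JOBS = 64


def select_stage1_candidates(threads, fresh_running_jobs: int, now: datetime):
    """Return thread ids to claim, most recently updated first.

    `threads` is an iterable of (thread_id, updated_at) pairs and
    `fresh_running_jobs` counts running stage-1 jobs whose lease is not stale.
    """
    min_idle = timedelta(hours=PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS)
    max_age = timedelta(days=PHASE_ONE_MAX_ROLLOUT_AGE_DAYS)
    # Filter before applying any limit, so very recent threads cannot
    # crowd eligible ones out of the selection.
    eligible = [
        (thread_id, updated_at)
        for thread_id, updated_at in threads
        if min_idle <= now - updated_at <= max_age
    ]
    eligible.sort(key=lambda item: item[1], reverse=True)
    # Only claim while the global running count stays under the cap.
    free_slots = max(0, PHASE_ONE_MAX_RUNNING_JOBS - fresh_running_jobs)
    return [thread_id for thread_id, _ in eligible[:free_slots]]
```

    Applying the idle and age filters before the cap mirrors the plan's
    point that filtering after `scan_limit` could starve eligible threads.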
    
    ## PR 2: Stage-1 output contract + storage artifacts
    (forward-compatible)
    
    - Update parser/types to accept the new structured output while keeping
    backward compatibility:
    - Add `rollout_slug` (optional for now).
    - Add `rollout_summary`.
    - Keep alias support for legacy `summary` and `rawMemory` until prompt
    swap completes.
    - Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
    include the new keys.
    - Update prompt templates:
    - `core/templates/memories/stage_one_system.md`.
    - `core/templates/memories/stage_one_input.md`.
    - Replace storage model in `core/src/memories/storage.rs`:
    - Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
    files).
    - Introduce `raw_memories.md` aggregator writer from DB rows.
    - Keep deterministic rebuild behavior from DB outputs so files can
    always be regenerated.
    - Update consolidation prompt template to reference `rollout_summaries/`
    + `raw_memories.md` inputs.
    
    Acceptance criteria:
    - Stage-1 accepts both old and new output keys during migration.
    - Phase-1 artifacts are generated in new format from DB state.
    - No dependence on per-thread files in `raw_memories/`.
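
    A backward-compatible parser along these lines might look like the
    sketch below. The mapping of legacy `summary` to `rollout_summary` and
    `rawMemory` to `raw_memory` is an assumption (the plan names the
    aliases but not their targets), and `parse_stage_one_output` is a
    hypothetical name.

```python
import json

# Assumed alias mapping; the plan lists the legacy keys but does not
# state which new field each one feeds.
LEGACY_ALIASES = {"summary": "rollout_summary", "rawMemory": "raw_memory"}


def parse_stage_one_output(text):
    """Parse stage-1 model output, accepting both old and new key names."""
    data = json.loads(text)
    for legacy_key, new_key in LEGACY_ALIASES.items():
        # Prefer the new key when both are present.
        if new_key not in data and legacy_key in data:
            data[new_key] = data[legacy_key]
    missing = [k for k in ("rollout_summary", "raw_memory") if k not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return {
        "rollout_slug": data.get("rollout_slug"),  # optional, ignored for now
        "rollout_summary": data["rollout_summary"],
        "raw_memory": data["raw_memory"],
    }
```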
    
    ## PR 3: Remove per-cwd memories and move to one global memory root
    
    - Simplify layout in `core/src/memories/layout.rs`:
    - Single root: `codex_home/memories`.
    - Remove cwd-hash bucket helpers and normalization logic used only for
    memory pathing.
    - Remove scope branching from startup phase-2 dispatch path:
    - No cwd/user mapping in `core/src/memories/startup/mod.rs`.
    - One target root for consolidation.
    - In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
    consolidation scope.
    - Keep one logical consolidation scope/job key (global/user) to avoid a
    risky schema rewrite in same PR.
    - Add one-time migration helper (core side) to preserve current shared
    memory output:
    - If `~/.codex/memories/user/memory` exists and new root is empty,
    move/copy contents into `~/.codex/memories`.
    - Leave old hashed cwd buckets untouched for now (safe, non-destructive
    migration).
    
    Acceptance criteria:
    - New runs only read/write `~/.codex/memories`.
    - No new cwd-scoped consolidation jobs are enqueued.
    - Existing user-shared memory content is preserved.
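
    The one-time migration could be sketched as below. It is an
    illustration under stated assumptions: `migrate_shared_memory` is a
    hypothetical name, the helper copies rather than moves, and "empty" is
    read as "contains nothing except the legacy `user/` directory", since
    the legacy path sits inside the new root.

```python
import shutil
from pathlib import Path


def migrate_shared_memory(codex_home):
    """Copy legacy shared memory into the new root; return True if it ran."""
    new_root = Path(codex_home) / "memories"
    legacy_dir = new_root / "user" / "memory"
    if not legacy_dir.is_dir():
        return False
    # Assumption: the new root counts as empty when its only entry is
    # the legacy `user/` directory.
    if any(entry.name != "user" for entry in new_root.iterdir()):
        return False
    for entry in legacy_dir.iterdir():
        target = new_root / entry.name
        if target.exists():
            continue  # never overwrite anything already in the new root
        if entry.is_dir():
            shutil.copytree(entry, target)
        else:
            shutil.copy2(entry, target)
    return True
```

    Copying leaves the legacy directory in place, which keeps the
    migration non-destructive; a second call is a no-op once the new root
    has content.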
    
    ## PR 4: Phase-2 global lock simplification and cleanup
    
    - Replace multi-scope dispatch with a single global consolidation claim
    path:
    - Either reuse jobs table with one fixed key, or add a tiny dedicated
    lock helper; keep 1h lease.
    - Ensure at most one consolidation agent can run at once.
    - Keep heartbeat + stale lock recovery semantics in
    `core/src/memories/startup/watch.rs`.
    - Remove dead scope code and legacy constants no longer used.
    - Update tests:
    - One-agent-at-a-time behavior.
    - Lock expiry allows takeover after stale lease.
    
    Acceptance criteria:
    - Exactly one phase-2 consolidation agent can be active cluster-wide
    (per local DB).
    - Stale lock recovers automatically.
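
    The lease semantics described for PR 4 can be illustrated with an
    in-memory stand-in. The real lock is DB-backed; `ConsolidationLock`
    and its methods are hypothetical names, and only the 1h lease,
    heartbeat, and stale takeover come from the plan.

```python
import time

LEASE_SECONDS = 60 * 60  # 1h lease, per the plan


class ConsolidationLock:
    """In-memory stand-in for a DB-backed single-holder lease."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._owner = None
        self._expires_at = 0.0

    def try_claim(self, owner):
        """Claim the lock if it is free, stale, or already held by `owner`."""
        now = self._clock()
        if self._owner not in (None, owner) and now < self._expires_at:
            return False  # another owner holds a fresh lease
        self._owner = owner
        self._expires_at = now + LEASE_SECONDS
        return True

    def heartbeat(self, owner):
        """Extend the lease; fails if it expired or another owner took over."""
        now = self._clock()
        if self._owner != owner or now >= self._expires_at:
            return False
        self._expires_at = now + LEASE_SECONDS
        return True
```

    Injecting the clock keeps the expiry and takeover paths testable
    without waiting an hour.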
    
    ## PR 5: Final cleanup and docs
    
    - Remove legacy artifacts and references:
    - `raw_memories/` and `memory_summary.md` assumptions from
    prompts/comments/tests.
    - Scope constants for cwd memory pathing in core/state if fully unused.
    - Update docs under `docs/` for memory workflow and directory layout.
    - Add a brief operator note for rollout: compatibility window for old
    stage-1 JSON keys and when to remove aliases.
    
    Acceptance criteria:
    - Code and docs reflect only the simplified global workflow.
    - No stale references to per-cwd memory buckets.
    
    ## Notes on sequencing
    
    - PR 1 is safest first because it improves correctness without changing
    external artifact layout.
    - PR 2 keeps parser compatibility so prompt deployment can happen
    independently.
    - PR 3 and PR 4 split filesystem/scope simplification from locking
    simplification to reduce blast radius.
    - PR 5 is intentionally cleanup-only.
  • Use thin LTO for alpha Rust release builds (#11348)
    We are looking to speed up build times for alpha releases, but we do not
    want to completely compromise on runtime performance by shipping debug
    builds. This PR changes our CI so that alpha releases build with
    `lto="thin"` instead of `lto="fat"`.
    
    Specifically, this change keeps `[profile.release] lto = "fat"` as the
    default in `Cargo.toml`, but overrides LTO in CI using
    `CARGO_PROFILE_RELEASE_LTO`:
    - `rust-release.yml`: use `thin` for `-alpha` tags, otherwise `fat`
    - `shell-tool-mcp.yml`: use `thin` for `-alpha` versions, otherwise
    `fat`
    
    Tradeoffs:
    - Alpha binaries may be somewhat larger and/or slightly slower than
    fat-LTO builds
    - LTO policy now lives in workflow logic for two pipelines, so
    consistency must be maintained across both files
    
    Note `CARGO_PROFILE_<name>_LTO` is documented on
    https://doc.rust-lang.org/cargo/reference/environment-variables.html#configuration-environment-variables.
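
    The tag-based selection can be sketched as follows. This is an
    illustration in Python rather than the workflows' actual logic:
    `lto_mode_for_tag` and `release_build_env` are hypothetical names, and
    only `CARGO_PROFILE_RELEASE_LTO` and the `-alpha` rule come from the
    description above.

```python
import os


def lto_mode_for_tag(tag):
    """Return "thin" for alpha tags and "fat" otherwise."""
    return "thin" if "-alpha" in tag else "fat"


def release_build_env(tag, base_env=None):
    """Build an environment for `cargo build --release` with the LTO override."""
    env = dict(os.environ if base_env is None else base_env)
    # Cargo reads this variable as an override for [profile.release] lto.
    env["CARGO_PROFILE_RELEASE_LTO"] = lto_mode_for_tag(tag)
    return env
```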
  • Strip unsupported images from prompt history to guard against model switch (#11349)
    - Make `ContextManager::for_prompt` modality-aware and strip input_image
    content when the active model is text-only.
    - Add a test for switching from a multimodal model to a text-only model.
  • include sandbox (seatbelt, elevated, etc.) in turn metadata header (#10946)
    This will help us understand retention/usage for folks who use the
    Windows (or any other) sandboxes
  • fix(core): canonicalize wrapper approvals and support heredoc prefix … (#10941)
    ## Summary
    - Reduced repeated approvals for equivalent wrapper commands and fixed
    execpolicy matching for heredoc-style shell invocations, with minimal
    behavior change and fail-closed defaults.
    
    ## Fixes
    1. Canonicalized approval matching for wrappers so equivalent commands
    map to the same approval intent.
    2. Added heredoc-aware prefix extraction for execpolicy so commands like
    `python3 <<'PY' ... PY` match rules such as `prefix_rule(["python3"],
    ...)`.
    3. Kept fallback behavior conservative: if parsing is ambiguous,
    existing prompt behavior is preserved.
    
    ## Edge Cases Covered
    - Wrapper path/name differences: `/bin/bash` vs `bash`, `/bin/zsh` vs
    `zsh`.
    - Shell modes: `-c` and `-lc`.
    - Heredoc forms: quoted delimiter (`<<'PY'`) and unquoted delimiter (`<<
    PY`).
    - Multi-command heredoc scripts are rejected by the fallback.
    - Non-heredoc redirections (`>`, etc.) are not treated as heredoc prefix
    matches.
    - Complex scripts still fall back to prior behavior rather than
    expanding permissions.
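
    The wrapper-path part of the canonicalization might look roughly like
    this sketch. It covers only the `/bin/bash` vs `bash` case; the names
    are hypothetical, heredoc handling is out of scope, and whether `-c`
    and `-lc` should share one key is not stated above, so the flag is
    kept as-is.

```python
import posixpath

# Hypothetical sets, based on the edge cases listed above.
SHELL_WRAPPERS = {"bash", "zsh"}
SHELL_COMMAND_FLAGS = {"-c", "-lc"}


def canonical_approval_key(argv):
    """Map path variants of the same wrapper invocation to one key.

    Unrecognized shapes fall back to the argv unchanged, so nothing is
    treated as equivalent unless it clearly matches.
    """
    if (
        len(argv) == 3
        and posixpath.basename(argv[0]) in SHELL_WRAPPERS
        and argv[1] in SHELL_COMMAND_FLAGS
    ):
        return (posixpath.basename(argv[0]), argv[1], argv[2])
    return tuple(argv)
```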
    
    ---------
    
    Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
  • Extract tool building (#11337)
    Make it clear what inputs go into building tools and allow for easy reuse
    for the pre-warm request
  • Sanitize MCP image output for text-only models (#11346)
    - Replace image blocks in MCP tool results with a text placeholder when
    the active model does not accept image input.
    - Add an e2e rmcp test to verify sanitized tool output is what gets sent
    back to the model.
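
    As a rough illustration of the replacement step, assuming tool results
    arrive as a list of MCP-style content blocks with a `type` field. The
    placeholder wording and the function name are hypothetical.

```python
# Hypothetical placeholder text; the description does not give the wording.
IMAGE_PLACEHOLDER = "[image omitted: the active model does not accept image input]"


def sanitize_tool_result(blocks, model_accepts_images):
    """Replace image blocks with a text placeholder for text-only models."""
    if model_accepts_images:
        return list(blocks)
    return [
        {"type": "text", "text": IMAGE_PLACEHOLDER}
        if block.get("type") == "image"
        else block
        for block in blocks
    ]
```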
  • Always expose view_image and return unsupported image-input error (#11336)
    - Keep `view_image` in the advertised tool list for all models.
    - Return a clear error when the current model does not support image
    inputs, and cover it with a unit test.
  • Compare full request for websockets incrementality (#11343)
    Tools can dynamically change mid-turn now. We need to be more thorough
    about reusing incremental connections.
  • core: remove stale apply_patch SandboxPolicy TODO in seatbelt (#11345)
    The `TODO` in `core/src/seatbelt.rs` claimed that `apply_patch` still needed to honor `SandboxPolicy`. That was true when the comment was added, but it is no longer true.
    
    Analysis:
    - The TODO was introduced in #1762, when seatbelt code was split out of `exec.rs`.
    - `apply_patch` sandboxing was later implemented in #1705.
    - Today, `apply_patch` calls are routed through the tool orchestrator and delegated to `ApplyPatchRuntime`, which executes via `execute_env()` using the active sandbox attempt policy.
    - On macOS, the sandbox transform path for that execution still builds seatbelt args with `create_seatbelt_command_args(command, policy, sandbox_policy_cwd)`, so the same `SandboxPolicy` gates `apply_patch` writes and network behavior.
    
    Because this behavior is already enforced, the TODO is stale and removing it avoids implying missing sandbox coverage where none exists.
    
    No functional behavior change; comment-only cleanup.
  • test(core): stabilize ARM bazel remote-model and parallelism tests (#11330)
    ## Summary
    - keep wiremock MockServer handles alive through async assertions in
    remote model suite tests
    - assert /models request count in remote_models_hide_picker_only_models
    - use a slightly higher parallel timing threshold on aarch64 while
    keeping existing x86 threshold
    
    ## Validation
    - just fmt
    - targeted tests:
    - cargo test -p codex-core --test all
    suite::remote_models::remote_models_merge_replaces_overlapping_model --
    --exact
    - cargo test -p codex-core --test all
    suite::remote_models::remote_models_hide_picker_only_models -- --exact
    - cargo test -p codex-core --test all
    suite::tool_parallelism::shell_tools_run_in_parallel -- --exact
    - soak loop: 40 iterations of all three targeted tests
    
    ## Notes
    - cargo test -p codex-core has one unrelated local-env failure in
    shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file from
    exported certificate env content in this workspace.
    - local bazel test //codex-rs/core:core-all-test failed to build due to
    missing rust-objcopy in this host toolchain.
  • Use @openai/codex dist-tags for platform binaries instead of separate package names (#11339)
    https://github.com/openai/codex/pull/11318 introduced logic to publish
    platform artifacts as separate npm packages (for example,
    `@openai/codex-darwin-arm64`, `@openai/codex-linux-x64`, etc.). That
    requires provisioning and maintaining multiple package entries in npm,
    which we want to avoid.
    
    We still need to keep the package-size mitigation (platform-specific
    payloads), but we want that layout to live under a single npm package
    namespace (`@openai/codex`) using dist-tags.
    
    We also need to preserve pre-release workflows where users install
    `@openai/codex@alpha` and get platform-appropriate binaries.
    
    Additionally, we want GitHub Release assets to group Codex npm tarballs
    together, so platform tarballs should follow the same `codex-npm-*`
    filename prefix as the main Codex tarball.
    
    ## Release Strategy (New Scheme)
    
    We publish **one npm package name for Codex binaries** (`@openai/codex`)
    and use **dist-tags** to select platform-specific payloads. This avoids
    creating separate platform package names while keeping the package size
    split by platform.
    
    ### What gets published
    
    #### Mainline release (`x.y.z`)
    
    - `@openai/codex@latest` (meta package)
    - `@openai/codex@darwin-arm64`
    - `@openai/codex@darwin-x64`
    - `@openai/codex@linux-arm64`
    - `@openai/codex@linux-x64`
    - `@openai/codex@win32-arm64`
    - `@openai/codex@win32-x64`
    - `@openai/codex-responses-api-proxy@latest`
    - `@openai/codex-sdk@latest`
    
    #### Alpha release (`x.y.z-alpha.N`)
    
    - `@openai/codex@alpha` (meta package)
    - `@openai/codex@alpha-darwin-arm64`
    - `@openai/codex@alpha-darwin-x64`
    - `@openai/codex@alpha-linux-arm64`
    - `@openai/codex@alpha-linux-x64`
    - `@openai/codex@alpha-win32-arm64`
    - `@openai/codex@alpha-win32-x64`
    - `@openai/codex-responses-api-proxy@alpha`
    - `@openai/codex-sdk@alpha`
    
    As an example, the `package.json` for `@openai/codex@alpha` (using
    `0.99.0-alpha.17` as the `version`) would be:
    
    ```
    {
      "name": "@openai/codex",
      "version": "0.99.0-alpha.17",
      "license": "Apache-2.0",
      "bin": {
        "codex": "bin/codex.js"
      },
      "type": "module",
      "engines": {
        "node": ">=16"
      },
      "files": [
        "bin"
      ],
      "repository": {
        "type": "git",
        "url": "git+https://github.com/openai/codex.git",
        "directory": "codex-cli"
      },
      "packageManager": "pnpm@10.28.2+sha512.41872f037ad22f7348e3b1debbaf7e867cfd448f2726d9cf74c08f19507c31d2c8e7a11525b983febc2df640b5438dee6023ebb1f84ed43cc2d654d2bc326264",
      "optionalDependencies": {
        "@openai/codex-linux-x64": "npm:@openai/codex@0.99.0-alpha.17-linux-x64",
        "@openai/codex-linux-arm64": "npm:@openai/codex@0.99.0-alpha.17-linux-arm64",
        "@openai/codex-darwin-x64": "npm:@openai/codex@0.99.0-alpha.17-darwin-x64",
        "@openai/codex-darwin-arm64": "npm:@openai/codex@0.99.0-alpha.17-darwin-arm64",
        "@openai/codex-win32-x64": "npm:@openai/codex@0.99.0-alpha.17-win32-x64",
        "@openai/codex-win32-arm64": "npm:@openai/codex@0.99.0-alpha.17-win32-arm64"
      }
    }
    ```
    
    Note that the keys in `optionalDependencies` have "clean" names, but the
    values have the tag embedded.
    
    ### Important note
    
    **Note:** Because we never created the new platform package names on npm
    (for example,
    `@openai/codex-darwin-arm64`) since #11318 landed, there are no extra
    npm packages to clean up.
    
    ## What changed
    
    ### 1. Stage platform tarballs as `@openai/codex` with platform-specific
    versions
    
    File: `codex-cli/scripts/build_npm_package.py`
    
    - Added `CODEX_NPM_NAME = "@openai/codex"` and platform metadata
      `npm_tag` values:
      - `darwin-arm64`, `darwin-x64`, `linux-arm64`, `linux-x64`,
        `win32-arm64`, `win32-x64`
    - For platform package staging (`codex-<platform>` inputs), switched
      generated `package.json` from:
      - `name = @openai/codex-<platform>`

      to:
      - `name = @openai/codex`
    - Added `compute_platform_package_version(version, platform_tag)` so
      platform tarballs have unique versions
      (`<release-version>-<platform-tag>`), which is required because npm
      forbids re-publishing the same `name@version`.
    
    ### 2. Point meta package optional dependencies at dist-tags on
    `@openai/codex`
    
    File: `codex-cli/scripts/build_npm_package.py`
    
    - Updated `optionalDependencies` generation for the main `codex` package
      to use npm alias syntax:
      - key remains alias package name (for example,
        `@openai/codex-darwin-arm64`) so runtime lookup behavior is unchanged
      - value now resolves to `@openai/codex` by dist-tag
    - Stable releases emit tags like `npm:@openai/codex@darwin-arm64`.
    - Alpha releases (`x.y.z-alpha.N`) emit tags like
      `npm:@openai/codex@alpha-darwin-arm64`.
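
    Sections 1 and 2 can be sketched together as follows.
    `CODEX_NPM_NAME` and `compute_platform_package_version` are named in
    this description, but their bodies here are assumptions, as are
    `PLATFORM_TAGS`, `platform_dist_tag`, and `optional_dependencies`. The
    alias values follow the dist-tag form described in section 2.

```python
CODEX_NPM_NAME = "@openai/codex"
PLATFORM_TAGS = [
    "darwin-arm64", "darwin-x64",
    "linux-arm64", "linux-x64",
    "win32-arm64", "win32-x64",
]


def compute_platform_package_version(version, platform_tag):
    """Give each platform tarball a unique version under one package name."""
    return f"{version}-{platform_tag}"


def platform_dist_tag(version, platform_tag):
    """Alpha releases (x.y.z-alpha.N) get an `alpha-` prefixed dist-tag."""
    return f"alpha-{platform_tag}" if "-alpha." in version else platform_tag


def optional_dependencies(version):
    """Map clean alias names to dist-tagged `@openai/codex` targets."""
    return {
        f"{CODEX_NPM_NAME}-{tag}": f"npm:{CODEX_NPM_NAME}@{platform_dist_tag(version, tag)}"
        for tag in PLATFORM_TAGS
    }
```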
    
    ### 3. Publish with per-tarball dist-tags in release CI
    
    File: `.github/workflows/rust-release.yml`
    
    - Reworked npm publish logic to derive the publish tag per tarball
      filename:
      - platform tarballs publish with `<platform>` tags for stable releases
      - platform tarballs publish with `alpha-<platform>` tags for alpha
        releases
      - top-level tarballs (`codex`, `codex-responses-api-proxy`, `codex-sdk`)
        continue using the existing channel tag policy (`latest` implicit
        for stable, `alpha` for alpha)
    - Added fail-fast behavior for unexpected tarball names to avoid silent
      mispublishes.
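
    The per-tarball tag derivation might be sketched like this. The
    workflow itself is not Python, and several details are assumptions:
    the top-level filename pattern `<package>-npm-<version>.tgz`, the
    filename carrying the release version, and the name `publish_tag`. The
    platform filename pattern is the renamed one from section 4.

```python
import re

# Platform pattern follows the renamed tarballs; top-level pattern is assumed.
PLATFORM_RE = re.compile(
    r"^codex-npm-((?:darwin|linux|win32)-(?:arm64|x64))-(.+)\.tgz$"
)
TOP_LEVEL_RE = re.compile(
    r"^(?:codex|codex-responses-api-proxy|codex-sdk)-npm-(.+)\.tgz$"
)


def publish_tag(filename):
    """Return the dist-tag to publish with, or None for the implicit `latest`."""
    match = PLATFORM_RE.match(filename)
    if match:
        platform, version = match.groups()
        return f"alpha-{platform}" if "-alpha." in version else platform
    match = TOP_LEVEL_RE.match(filename)
    if match:
        return "alpha" if "-alpha." in match.group(1) else None
    # Fail fast instead of publishing under a guessed tag.
    raise ValueError(f"unexpected npm tarball name: {filename}")
```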
    
    ### 4. Normalize Codex platform tarball filenames for GitHub Release
    grouping
    
    Files: `scripts/stage_npm_packages.py`,
    `.github/workflows/rust-release.yml`
    
    - Renamed staged platform tarball filenames from:
      - `codex-linux-<arch>-npm-<version>.tgz`
      - `codex-darwin-<arch>-npm-<version>.tgz`
      - `codex-win32-<arch>-npm-<version>.tgz`
    - To:
      - `codex-npm-linux-<arch>-<version>.tgz`
      - `codex-npm-darwin-<arch>-<version>.tgz`
      - `codex-npm-win32-<arch>-<version>.tgz`
    
    This keeps all Codex npm artifacts grouped under a common `codex-npm-`
    prefix in GitHub Releases.
    
    ### 5. Documentation update
    
    File: `codex-cli/scripts/README.md`
    
    - Updated staging docs to clarify that platform-native variants are
    published as dist-tagged
      `@openai/codex` artifacts rather than separate npm package names.
    
    ## Resulting behavior
    
    - Mainline release:
      - `@openai/codex@latest` resolves the meta package
      - meta package optional dependencies resolve
        `@openai/codex@<platform-tag>`
    - Alpha release:
      - users can continue installing `@openai/codex@alpha`
      - alpha meta package optional dependencies resolve
        `@openai/codex@alpha-<platform-tag>`
    - Release assets:
      - Codex npm tarballs share `codex-npm-` prefix for cleaner grouping in
        GitHub Releases
    
    This preserves platform-specific payload distribution while avoiding
    separate npm package names and
    improves release-asset discoverability.
    
    ## Validation notes
    
    - Verified staged `package.json` output for stable and alpha meta
    packages includes expected alias targets.
    - Verified staged platform package manifests are `name=@openai/codex`
    with unique platform-suffixed versions.
    - Verified publish tag derivation maps renamed platform tarballs to
    expected stable and alpha dist-tags.
  • Treat first rollout session_meta as canonical thread identity (#11241)
    During thread/fork, the new rollout includes the fork’s own session_meta
    plus copied history that can contain older session_meta entries from the
    source thread. thread/list was overwriting metadata on later
    session_meta lines, so a fork could be reported with the source thread’s
    thread_id. This fix only uses the first session_meta, so the fork keeps
    its own ID.
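
    A minimal sketch of the "first `session_meta` wins" rule, assuming the
    rollout is a JSONL stream. The record shape (`type` plus `payload.id`)
    and the function name are hypothetical and used only for illustration.

```python
import json


def thread_id_from_rollout(lines):
    """Return the id from the first session_meta record, ignoring later ones."""
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("type") == "session_meta":
            # Later session_meta records may be copied from the source
            # thread's history, so stop at the first one.
            return record["payload"]["id"]
    return None
```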
  • feat: opt-out of events in the app-server (#11319)
    Add `optOutNotificationMethods` in the app-server to opt out of events
    based on exact method matching
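
    Exact method matching could be sketched as below. `NotificationFilter`
    and `should_send` are hypothetical names; the point is only that a
    method is suppressed when its full name was opted out, not when it
    shares a prefix with an opted-out name.

```python
class NotificationFilter:
    """Drops notifications whose method was opted out, by exact match only."""

    def __init__(self, opt_out_notification_methods=()):
        self._opted_out = frozenset(opt_out_notification_methods)

    def should_send(self, method):
        # Set membership gives exact matching: no prefix or glob handling.
        return method not in self._opted_out
```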
  • [apps] Improve app installation flow. (#11249)
    - [x] Add buttons to start the installation flow and verify installation
    completes.
    - [x] Hard refresh apps list when the /apps view opens.
  • Fix: update parallel tool call exec approval to approve on request id (#11162)
    ### Summary
    
    With parallel tool calls, exec command approvals were applied at the
    turn level rather than the request level: once a single request was
    approved, the system treated all requests in the turn as approved.
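
    The difference can be illustrated with a small sketch that keys
    approvals on the request id. `ExecApprovals` and its methods are
    hypothetical names, not the actual implementation.

```python
class ExecApprovals:
    """Tracks exec approvals per request instead of per turn."""

    def __init__(self):
        self._approved = set()

    def approve(self, turn_id, request_id):
        # Key on the request id so sibling requests in the same turn
        # still need their own approval.
        self._approved.add((turn_id, request_id))

    def is_approved(self, turn_id, request_id):
        return (turn_id, request_id) in self._approved
```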
    
    ### Before
    
    https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8
    
    ### After
    
    https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc
  • fix(protocol): approval policy never prompt (#11288)
    This removes overly directed language about how the model should behave
    when it's in `approval_policy=never` mode.
    
    ---------
    
    Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
  • tui: keep history recall cursor at line end (#11295)
    ## Summary
    - keep cursor at end-of-line after Up/Down history recall
    - allow continued history navigation when recalled text cursor is at
    start or end boundary
    - add regression tests and document the history cursor contract in
    composer docs
    
    ## Testing
    - just fmt
    - cargo test -p codex-tui --lib
    history_navigation_leaves_cursor_at_end_of_line
    - cargo test -p codex-tui --lib
    should_handle_navigation_when_cursor_is_at_line_boundaries
    - cargo test -p codex-tui *(fails in existing integration test
    `suite::no_panic_on_startup::malformed_rules_should_not_panic` because
    `target/debug/codex` is not present in this environment)*
  • Remove ApiPrompt (#11265)
    Keep things simple and build a full Responses API request right
    in the model client
  • Fix pending input test waiting logic (#11322)
    ## Summary
    - remove redundant user message wait that could time out and cause
    flakiness
    - rely on the existing turn-complete wait to ensure the follow-up
    request is observed
    
    ## Testing
    - Not run (not requested)
  • feat: phase 2 consolidation (#11306)
    Consolidation phase of memories
    
    Cleaning and better handling of concurrency
  • Extract hooks into dedicated crate (#11311)
    Summary
    - move `core/src/hooks` implementation into a new `codex-hooks` crate
    with its own manifest
    - update `codex-rs` workspace and `codex-core` crate to depend on the
    extracted `hooks` crate and wire up the shared APIs
    - ensure references, modules, and lockfile reflect the new crate layout
    
    Testing
    - Not run (not requested)
  • feat: align memory phase 1 and make it stronger (#11300)
    ## Align with the new phase-1 design
    
    Basically we now run phase 1 in parallel by considering:
    * Max 64 rollouts
    * Max 1 month old
    * Consider the most recent first
    
    This PR also adds stronger parallelization capabilities through
    stale-job detection, retry policies, ownership of computation to
    prevent double computation, etc.
  • memories: add extraction and prompt module foundation (#11200)
    ## Summary
    - add the new `core/src/memories` module (phase-one parsing, rollout
    filtering, storage, selection, prompts)
    - add Askama-backed memory templates for stage-one input/system and
    consolidation prompts
    - add module tests for parsing, filtering, path bucketing, and summary
    maintenance
    
    ## Testing
    - just fmt
    - cargo test -p codex-core --lib memories::
  • feat: retain NetworkProxy, when appropriate (#11207)
    As of this PR, `SessionServices` retains an
    `Option<StartedNetworkProxy>`, if appropriate.
    
    Now the `network` field on `Config` is `Option<NetworkProxySpec>`
    instead of `Option<NetworkProxy>`.
    
    Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
    create the `StartedNetworkProxy`, which is a new struct that retains the
    `NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
    implemented for `NetworkProxyHandle` to ensure the proxies are shut down
    when it is dropped.)
    
    The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
    the appropriate places.
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
    * #11285
    * __->__ #11207
  • chore: put crypto provider logic in a shared crate (#11294)
    Ensures a process-wide rustls crypto provider is installed.
    
    Both the `codex-network-proxy` and `codex-api` crates need this.