codex

fix(exec-policy) No empty command lists (#11397 )

## Summary
This should rarely, if ever, happen in practice. But regardless, we
should never provide an empty list of `commands` to ExecPolicy. This PR
is almost entirely adding test around these cases.

## Testing
- [x] Adds a bunch of unit tests for this

Dylan Hurd · 2026-02-10 19:22:23 -08:00

cc8c293378

Remove deterministic_process_ids feature to avoid duplicate codex-core builds (#11393 )

## Why

`codex-core` enabled `deterministic_process_ids` through a self
dev-dependency.
That forced a second feature-resolved build of the same crate, which
increased
compile time and test latency.

## What Changed

- Removed the `deterministic_process_ids` feature from
`codex-rs/core/Cargo.toml`.
- Removed the self dev-dependency on `codex-core` that enabled that
feature.
- Removed the Bazel `deterministic_process_ids` crate feature for
`codex-core`.
- Added a test-only `AtomicBool` override in unified exec process-id
allocation.
- Added a test-support setter for that override and re-exported it from
`codex-core`.
- Enabled deterministic process IDs in integration tests via
`core_test_support` ctor.

## Behavior

- Production behavior remains random process IDs.
- Unit tests remain deterministic via `cfg(test)`.
- Integration tests remain deterministic via explicit test-support
initialization.

## Validation

- `just fmt`
- `cargo test -p codex-core unified_exec::`
- `cargo test -p codex-core --test all unified_exec -- --test-threads=1`
- `cargo tree -p codex-core -e features` (verified the removed feature
path)

Michael Bolin · 2026-02-10 19:07:01 -08:00

b68a84ee8e

tui: queue non-pending rollback trims in app-event order (#11373 )

## Summary

This PR fixes TUI transcript-sync behavior for
`EventMsg::ThreadRolledBack` and makes rollback application order
deterministic.

Previously, rollback handling depended on `pending_rollback`:

- if `pending_rollback` was set (local backtrack), TUI trimmed correctly
- otherwise, replayed/external rollbacks were either ignored or could be
applied at the wrong time relative to queued transcript inserts

This change keeps the local backtrack path intact and routes non-pending
rollbacks through the app event queue so rollback trims are applied in
FIFO order with transcript cell inserts.

## What changed

- Added/used `trim_transcript_cells_drop_last_n_user_turns(...)` for
rollback-by-`num_turns` semantics.
- Renamed rollback app event:
- `AppEvent::ApplyReplayedThreadRollback` ->
`AppEvent::ApplyThreadRollback`
- Replay path (`ChatWidget`) now emits `ApplyThreadRollback`.
- Live non-pending rollback path (`App::handle_backtrack_event`) now
emits `ApplyThreadRollback` instead of trimming immediately.
- App-level event handler applies `ApplyThreadRollback` after queued
`InsertHistoryCell` events and schedules redraw only when a trim
occurred.
- When a trim occurs with an overlay open, TUI now syncs transcript
overlay committed cells, clamps backtrack preview selection, and clears
stale `deferred_history_lines` so closed overlays do not re-append
rolled-back lines.
- Clarified inline comments around the `pending_rollback` branch so
future readers can reason about why there are two paths.

## Why queueing matters

During resume/replay, transcript cells are populated via queued
`InsertHistoryCell` app events. If a rollback is applied immediately
outside that queue, it can run against an incomplete transcript and
under-trim. Queueing non-pending rollbacks ensures consistent ordering
and correct final transcript state.

## Behavior by rollback source

- `pending_rollback = Some(...)` (local backtrack requested by this
TUI):
  - use `finish_pending_backtrack()` and the stored selection boundary
- `pending_rollback = None` (replay/external/non-local rollback):
- enqueue `AppEvent::ApplyThreadRollback { num_turns }` and trim in
app-event order

## Tests

Added/updated tests covering ordering and semantics:

-
`app_backtrack::tests::trim_drop_last_n_user_turns_applies_rollback_semantics`
- `app_backtrack::tests::trim_drop_last_n_user_turns_allows_overflow`
- `app::tests::replayed_initial_messages_apply_rollback_in_queue_order`
-
`app::tests::live_rollback_during_replay_is_applied_in_app_event_order`
-
`app::tests::queued_rollback_syncs_overlay_and_clears_deferred_history`
- `chatwidget::tests::replayed_thread_rollback_emits_ordered_app_event`

Validation run:

- `just fmt`
- `cargo test -p codex-tui`

Charley Cunningham · 2026-02-10 18:53:43 -08:00

8b46c0ce00

Prefer websocket transport when model opts in (#11386 )

Summary
- add a `prefer_websockets` field to `ModelInfo`, defaulting to `false`
in all fixtures and constructors
- wire the new flag into websocket selection so models that opt in
always use websocket transport even when the feature gate is off

Testing
- Not run (not requested)

pakrym-oai · 2026-02-10 18:50:48 -08:00

c68999ee6d

Disable very flaky tests (#11394 )

Collected from last 20 builds of main in
https://github.com/openai/codex/commits/main/.

pakrym-oai · 2026-02-10 18:50:11 -08:00

bfd4e2112c

Update models.json (#11376 )

Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: sayan-oai <sayan@openai.com>

github-actions[bot] · 2026-02-10 17:25:35 -08:00

f101300dba

chore: rename codex-command to codex-shell-command (#11378 )

This addresses some post-merge feedback on
https://github.com/openai/codex/pull/11361:

- crate rename
- reuse `detect_shell_type()` utility

Michael Bolin · 2026-02-10 17:03:46 -08:00

d44f4205fb

feat: prevent double backfill (#11377 )

## Summary

Add a DB-backed lease to prevent duplicate `.sqlite` backfill workers
from running concurrently.

### What changed
- Added StateRuntime::try_claim_backfill(lease_seconds) that atomically
claims backfill only when:
  - backfill is not complete, and
  - no fresh running worker currently owns it.
- Updated backfill_sessions to use the claim API and exit early when
another worker already holds the lease.
- Added runtime tests covering:
  - singleton claim behavior,
  - stale lease takeover,
  - claim blocked after complete.
- Set backfill lease to 900s in production and 1s in tests.

### Why

This avoids duplicate backfill work and reduces backfill status churn
under concurrent startup, while preserving
current best-effort fallback behavior.

jif-oai · 2026-02-11 00:24:20 +00:00

87bbfc50a1

feat: mem v2 - PR6 (consolidation) (#11374 )

jif-oai · 2026-02-11 00:02:57 +00:00

674799d356

feat: mem v2 - PR5 (#11372 )

jif-oai · 2026-02-10 23:22:55 +00:00

2c9be54c9a

ci: fall back to local Bazel on forks without BuildBuddy key (#11359 )

## Summary
- detect whether BUILDBUDDY_API_KEY is present in Bazel CI
- keep existing remote BuildBuddy path when key is available
- add a local fallback path for fork PRs without secrets by clearing
remote cache/executor/BES endpoints
- document each fallback flag inline with links to Bazel docs

## Testing
- ruby -e 'require "yaml";
YAML.load_file(".github/workflows/bazel.yml"); puts "ok"'
- verified Bazel docs/flag references used in workflow comments

Josh McKinney · 2026-02-10 23:19:55 +00:00

34fb4b6e63

Enable SOCKS defaults for common local network proxy use cases (#11362 )

## Summary
- enable local-use defaults in network proxy settings: SOCKS5 on, SOCKS5
UDP on, upstream proxying on, and local binding on
- add a regression test that asserts the full
`NetworkProxySettings::default()` baseline
- Fixed managed listener reservation behavior.
Before: we always reserved a loopback SOCKS listener, even when
enable_socks5 = false.
Now: SOCKS listener is only reserved when SOCKS is enabled.
- Fixed /debug-config env output for SOCKS-disabled sessions.
ALL_PROXY now shows the HTTP proxy URL when SOCKS is disabled (instead
of incorrectly showing socks5h://...).


## Validation
- just fmt
- cargo test -p codex-network-proxy
- cargo clippy -p codex-network-proxy --all-targets

viyatb-oai · 2026-02-10 15:13:52 -08:00

1d47927aa0

feat: mem v2 - PR4 (#11369 )

# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.

jif-oai · 2026-02-10 23:10:35 +00:00

623d3f4071

# Split command parsing/safety out of codex-core into new codex-command (#11361 )

`codex-core` had accumulated command parsing and command safety logic
(`bash`, `powershell`, `parse_command`, and `command_safety`) that is
logically cohesive but orthogonal to most core session/runtime logic.
Keeping this code in `codex-core` made the crate increasingly monolithic
and raised iteration cost for unrelated core changes.

This change extracts that surface into a dedicated crate,
`codex-command`, while preserving existing `codex_core::...` call sites
via re-exports.

## Why this refactor

During analysis, command parsing/safety stood out as a good first split
because it has:

- a clear domain boundary (shell parsing + safety classification)
- relatively self-contained dependencies (notably `tree-sitter` /
`tree-sitter-bash`)
- a meaningful standalone test surface (`134` tests moved with the
crate)
- many downstream uses that benefit from independent compilation and
caching

The practical problem was build latency from a large `codex-core`
compile/test graph. Clean-build timings before and after this split
showed measurable wins:

- `cargo check -p codex-core`: `57.08s` -> `53.54s` (~`6.2%` faster)
- `cargo test -p codex-core --no-run`: `2m39.9s` -> `2m20s` (~`12.4%`
faster)
- `codex-core lib` compile unit: `57.18s` -> `49.67s` (~`13.1%` faster)
- `codex-core lib(test)` compile unit: `60.87s` -> `53.21s` (~`12.6%`
faster)

This gives a concrete reduction in core build overhead without changing
behavior.

## What changed

### New crate

- Added `codex-rs/command` as workspace crate `codex-command`.
- Added:
  - `command/src/lib.rs`
  - `command/src/bash.rs`
  - `command/src/powershell.rs`
  - `command/src/parse_command.rs`
  - `command/src/command_safety/*`
  - `command/src/shell_detect.rs`
  - `command/BUILD.bazel`

### Code moved out of `codex-core`

- Moved modules from `core/src` into `command/src`:
  - `bash.rs`
  - `powershell.rs`
  - `parse_command.rs`
  - `command_safety/*`

### Dependency graph updates

- Added workspace member/dependency entries for `codex-command` in
`codex-rs/Cargo.toml`.
- Added `codex-command` dependency to `codex-rs/core/Cargo.toml`.
- Removed `tree-sitter` and `tree-sitter-bash` from `codex-core` direct
deps (now owned by `codex-command`).

### API compatibility for callers

To avoid immediate downstream churn, `codex-core` now re-exports the
moved modules/functions:

- `codex_command::bash`
- `codex_command::powershell`
- `codex_command::parse_command`
- `codex_command::is_safe_command`
- `codex_command::is_dangerous_command`

This keeps existing `codex_core::...` paths working while enabling
gradual migration to direct `codex-command` usage.

### Internal decoupling detail

- Added `command::shell_detect` so moved `bash`/`powershell` logic no
longer depends on core shell internals.
- Adjusted PowerShell helper visibility in `codex-command` for existing
core test usage (`UTF8` prefix helper + executable discovery functions).

## Validation

- `just fmt`
- `just fix -p codex-command -p codex-core`
- `cargo test -p codex-command` (`134` passed)
- `cargo test -p codex-core --no-run`
- `cargo test -p codex-core shell_command_handler`

## Notes / follow-up

This commit intentionally prioritizes boundary extraction and
compatibility. A follow-up can migrate downstream crates to depend
directly on `codex-command` (instead of through `codex-core` re-exports)
to realize additional incremental build wins.

Michael Bolin · 2026-02-10 14:43:16 -08:00

d8f9bb65e2

Update models.json (#11274 )

Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
Co-authored-by: Sayan Sisodiya <sayan@openai.com>

github-actions[bot] · 2026-02-10 14:28:18 -08:00

3626399811

feat: mem v2 - PR3 (#11366 )

# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.

jif-oai · 2026-02-10 22:12:50 +00:00

3419660767

feat: mem v2 - PR2 (#11365 )

# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.

jif-oai · 2026-02-10 21:50:53 +00:00

0229dc5ccf

feat: mem v2 - PR1 (#11364 )

# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.

jif-oai · 2026-02-10 21:29:06 +00:00

07da740c8a

chore: unify memory job flow (#11334 )

jif-oai · 2026-02-10 20:26:39 +00:00

a6e9469fa4

Use thin LTO for alpha Rust release builds (#11348 )

We are looking to speed up build times for alpha releases, but we do not
want to completely compromise on runtime performance by shipping debug
builds. This PR changes our CI so that alpha releases build with
`lto="thin"` instead of `lto="fat"`.

Specifically, this change keeps `[profile.release] lto = "fat"` as the
default in `Cargo.toml`, but overrides LTO in CI using
`CARGO_PROFILE_RELEASE_LTO`:
- `rust-release.yml`: use `thin` for `-alpha` tags, otherwise `fat`
- `shell-tool-mcp.yml`: use `thin` for `-alpha` versions, otherwise
`fat`

Tradeoffs:
- Alpha binaries may be somewhat larger and/or slightly slower than
fat-LTO builds
- LTO policy now lives in workflow logic for two pipelines, so
consistency must be maintained across both files

Note `CARGO_PROFILE_<name>_LTO` is documented on
https://doc.rust-lang.org/cargo/reference/environment-variables.html#configuration-environment-variables.

Michael Bolin · 2026-02-10 11:59:03 -08:00

58a59a2dae

Strip unsupported images from prompt history to guard against model switch (#11349 )

- Make `ContextManager::for_prompt` modality-aware and strip input_image
content when the active model is text-only.
- Added a test for multi-model -> text-only model switch

Ahmed Ibrahim · 2026-02-10 11:58:00 -08:00

5e01450963

include sandbox (seatbelt, elevated, etc.) as in turn metadata header (#10946 )

This will help us understand retention/usage for folks who use the
Windows (or any other) sandboxes

iceweasel-oai · 2026-02-10 19:50:07 +00:00

82f93a13b2

fix(core): canonicalize wrapper approvals and support heredoc prefix … (#10941 )

## Summary
- Reduced repeated approvals for equivalent wrapper commands and fixed
execpolicy matching for heredoc-style shell invocations, with minimal
behavior change and fail-closed defaults.

## Fixes
1. Canonicalized approval matching for wrappers so equivalent commands
map to the same approval intent.
2. Added heredoc-aware prefix extraction for execpolicy so commands like
`python3 <<'PY' ... PY` match rules such as `prefix_rule(["python3"],
...)`.
3. Kept fallback behavior conservative: if parsing is ambiguous,
existing prompt behavior is preserved.

## Edge Cases Covered
- Wrapper path/name differences: `/bin/bash` vs `bash`, `/bin/zsh` vs
`zsh`.
- Shell modes: `-c` and `-lc`.
- Heredoc forms: quoted delimiter (`<<'PY'`) and unquoted delimiter (`<<
PY`).
- Multi-command heredoc scripts are rejected by the fallback
- Non-heredoc redirections (`>`, etc.) are not treated as heredoc prefix
matches.
- Complex scripts still fall back to prior behavior rather than
expanding permissions.

---------

Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>

viyatb-oai · 2026-02-10 11:46:40 -08:00

62d0f302fd

Extract tool building (#11337 )

Make it clear what input go into building tools and allow for easy reuse
for pre-warm request

pakrym-oai · 2026-02-10 11:45:23 -08:00

e4b5384539

Sanitize MCP image output for text-only models (#11346 )

- Replace image blocks in MCP tool results with a text placeholder when
the active model does not accept image input.
- Add an e2e rmcp test to verify sanitized tool output is what gets sent
back to the model.

Ahmed Ibrahim · 2026-02-10 11:25:32 -08:00

9c4656000f

Always expose view_image and return unsupported image-input error (#11336 )

- Keep `view_image` in the advertised tool list for all models.
- Return a clear error when the current model does not support image
inputs, and cover it with a unit test.

Ahmed Ibrahim · 2026-02-10 11:25:12 -08:00

6e96e4837e

fix: reduce usage of open_if_present (#11344 )

jif-oai · 2026-02-10 19:25:07 +00:00

847a6092e6

Compare full request for websockets incrementality (#11343 )

Tools can dynamically change mid-turn now. We need to be more thorough
about reusing incremental connections.

pakrym-oai · 2026-02-10 19:14:36 +00:00

0639c33892

core: remove stale apply_patch SandboxPolicy TODO in seatbelt (#11345 )

The `TODO` in `core/src/seatbelt.rs` claimed that `apply_patch` still needed to honor `SandboxPolicy`. That was true when the comment was added, but it is no longer true.

Analysis:
- The TODO was introduced in #1762, when seatbelt code was split out of `exec.rs`.
- `apply_patch` sandboxing was later implemented in #1705.
- Today, `apply_patch` calls are routed through the tool orchestrator and delegated to `ApplyPatchRuntime`, which executes via `execute_env()` using the active sandbox attempt policy.
- On macOS, the sandbox transform path for that execution still builds seatbelt args with `create_seatbelt_command_args(command, policy, sandbox_policy_cwd)`, so the same `SandboxPolicy` gates `apply_patch` writes and network behavior.

Because this behavior is already enforced, the TODO is stale and removing it avoids implying missing sandbox coverage where none exists.

No functional behavior change; comment-only cleanup.

Michael Bolin · 2026-02-10 19:10:02 +00:00

548afa5749

test(core): stabilize ARM bazel remote-model and parallelism tests (#11330 )

## Summary
- keep wiremock MockServer handles alive through async assertions in
remote model suite tests
- assert /models request count in remote_models_hide_picker_only_models
- use a slightly higher parallel timing threshold on aarch64 while
keeping existing x86 threshold

## Validation
- just fmt
- targeted tests:
- cargo test -p codex-core --test all
suite::remote_models::remote_models_merge_replaces_overlapping_model --
--exact
- cargo test -p codex-core --test all
suite::remote_models::remote_models_hide_picker_only_models -- --exact
- cargo test -p codex-core --test all
suite::tool_parallelism::shell_tools_run_in_parallel -- --exact
- soak loop: 40 iterations of all three targeted tests

## Notes
- cargo test -p codex-core has one unrelated local-env failure in
shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file from
exported certificate env content in this workspace.
- local bazel test //codex-rs/core:core-all-test failed to build due
missing rust-objcopy in this host toolchain.

Dylan Hurd · 2026-02-10 10:57:50 -08:00

f3bbcc987d

# Use @openai/codex dist-tags for platform binaries instead of separate package names (#11339 )

https://github.com/openai/codex/pull/11318 introduced logic to publish
platform artifacts as separate npm packages (for example,
`@openai/codex-darwin-arm64`, `@openai/codex-linux-x64`, etc.). That
requires provisioning and maintaining multiple package entries in npm,
which we want to avoid.

We still need to keep the package-size mitigation (platform-specific
payloads), but we want that layout to live under a single npm package
namespace (`@openai/codex`) using dist-tags.

We also need to preserve pre-release workflows where users install
`@openai/codex@alpha` and get platform-appropriate binaries.

Additionally, we want GitHub Release assets to group Codex npm tarballs
together, so platform tarballs should follow the same `codex-npm-*`
filename prefix as the main Codex tarball.

## Release Strategy (New Scheme)

We publish **one npm package name for Codex binaries** (`@openai/codex`)
and use **dist-tags** to select platform-specific payloads. This avoids
creating separate platform package names while keeping the package size
split by platform.

### What gets published

#### Mainline release (`x.y.z`)

- `@openai/codex@latest` (meta package)
- `@openai/codex@darwin-arm64`
- `@openai/codex@darwin-x64`
- `@openai/codex@linux-arm64`
- `@openai/codex@linux-x64`
- `@openai/codex@win32-arm64`
- `@openai/codex@win32-x64`
- `@openai/codex-responses-api-proxy@latest`
- `@openai/codex-sdk@latest`

#### Alpha release (`x.y.z-alpha.N`)

- `@openai/codex@alpha` (meta package)
- `@openai/codex@alpha-darwin-arm64`
- `@openai/codex@alpha-darwin-x64`
- `@openai/codex@alpha-linux-arm64`
- `@openai/codex@alpha-linux-x64`
- `@openai/codex@alpha-win32-arm64`
- `@openai/codex@alpha-win32-x64`
- `@openai/codex-responses-api-proxy@alpha`
- `@openai/codex-sdk@alpha`

As an example, the `package.json` for `@openai/codex@alpha` (using
`0.99.0-alpha.17` as the `version`) would be:

```
{
  "name": "@openai/codex",
  "version": "0.99.0-alpha.17",
  "license": "Apache-2.0",
  "bin": {
    "codex": "bin/codex.js"
  },
  "type": "module",
  "engines": {
    "node": ">=16"
  },
  "files": [
    "bin"
  ],
  "repository": {
    "type": "git",
    "url": "git+https://github.com/openai/codex.git",
    "directory": "codex-cli"
  },
  "packageManager": "pnpm@10.28.2+sha512.41872f037ad22f7348e3b1debbaf7e867cfd448f2726d9cf74c08f19507c31d2c8e7a11525b983febc2df640b5438dee6023ebb1f84ed43cc2d654d2bc326264",
  "optionalDependencies": {
    "@openai/codex-linux-x64": "npm:@openai/codex@0.99.0-alpha.17-linux-x64",
    "@openai/codex-linux-arm64": "npm:@openai/codex@0.99.0-alpha.17-linux-arm64",
    "@openai/codex-darwin-x64": "npm:@openai/codex@0.99.0-alpha.17-darwin-x64",
    "@openai/codex-darwin-arm64": "npm:@openai/codex@0.99.0-alpha.17-darwin-arm64",
    "@openai/codex-win32-x64": "npm:@openai/codex@0.99.0-alpha.17-win32-x64",
    "@openai/codex-win32-arm64": "npm:@openai/codex@0.99.0-alpha.17-win32-arm64"
  }
}
```

Note that the keys in `optionalDependencies` have "clean" names, but the
values have the tag embedded.

### Important note

**Note:** Because we never created the new platform package names on npm
(for example,
`@openai/codex-darwin-arm64`) since #11318 landed, there are no extra
npm packages to clean up.

## What changed

### 1. Stage platform tarballs as `@openai/codex` with platform-specific
versions

File: `codex-cli/scripts/build_npm_package.py`

- Added `CODEX_NPM_NAME = "@openai/codex"` and platform metadata
`npm_tag` values:
- `darwin-arm64`, `darwin-x64`, `linux-arm64`, `linux-x64`,
`win32-arm64`, `win32-x64`
- For platform package staging (`codex-<platform>` inputs), switched
generated `package.json` from:
  - `name = @openai/codex-<platform>`
  to:
  - `name = @openai/codex`
- Added `compute_platform_package_version(version, platform_tag)` so
platform tarballs have unique
versions (`<release-version>-<platform-tag>`), which is required because
npm forbids re-publishing
  the same `name@version`.

### 2. Point meta package optional dependencies at dist-tags on
`@openai/codex`

File: `codex-cli/scripts/build_npm_package.py`

- Updated `optionalDependencies` generation for the main `codex` package
to use npm alias syntax:
- key remains alias package name (for example,
`@openai/codex-darwin-arm64`) so runtime lookup behavior is unchanged
  - value now resolves to `@openai/codex` by dist-tag
- Stable releases emit tags like `npm:@openai/codex@darwin-arm64`.
- Alpha releases (`x.y.z-alpha.N`) emit tags like
`npm:@openai/codex@alpha-darwin-arm64`.

### 3. Publish with per-tarball dist-tags in release CI

File: `.github/workflows/rust-release.yml`

- Reworked npm publish logic to derive the publish tag per tarball
filename:
  - platform tarballs publish with `<platform>` tags for stable releases
- platform tarballs publish with `alpha-<platform>` tags for alpha
releases
- top-level tarballs (`codex`, `codex-responses-api-proxy`, `codex-sdk`)
continue using
the existing channel tag policy (`latest` implicit for stable, `alpha`
for alpha)
- Added fail-fast behavior for unexpected tarball names to avoid silent
mispublishes.

### 4. Normalize Codex platform tarball filenames for GitHub Release
grouping

Files: `scripts/stage_npm_packages.py`,
`.github/workflows/rust-release.yml`

- Renamed staged platform tarball filenames from:
  - `codex-linux-<arch>-npm-<version>.tgz`
  - `codex-darwin-<arch>-npm-<version>.tgz`
  - `codex-win32-<arch>-npm-<version>.tgz`
- To:
  - `codex-npm-linux-<arch>-<version>.tgz`
  - `codex-npm-darwin-<arch>-<version>.tgz`
  - `codex-npm-win32-<arch>-<version>.tgz`

This keeps all Codex npm artifacts grouped under a common `codex-npm-`
prefix in GitHub Releases.

### 5. Documentation update

File: `codex-cli/scripts/README.md`

- Updated staging docs to clarify that platform-native variants are
published as dist-tagged
  `@openai/codex` artifacts rather than separate npm package names.

## Resulting behavior

- Mainline release:
  - `@openai/codex@latest` resolves the meta package
- meta package optional dependencies resolve
`@openai/codex@<platform-tag>`
- Alpha release:
  - users can continue installing `@openai/codex@alpha`
- alpha meta package optional dependencies resolve
`@openai/codex@alpha-<platform-tag>`
- Release assets:
- Codex npm tarballs share `codex-npm-` prefix for cleaner grouping in
GitHub Releases

This preserves platform-specific payload distribution while avoiding
separate npm package names and
improves release-asset discoverability.

## Validation notes

- Verified staged `package.json` output for stable and alpha meta
packages includes expected alias targets.
- Verified staged platform package manifests are `name=@openai/codex`
with unique platform-suffixed versions.
- Verified publish tag derivation maps renamed platform tarballs to
expected stable and alpha dist-tags.

Michael Bolin · 2026-02-10 10:33:47 -08:00

d9c014efce

Treat first rollout session_meta as canonical thread identity (#11241 )

During thread/fork, the new rollout includes the fork’s own session_meta
plus copied history that can contain older session_meta entries from the
source thread. thread/list was overwriting metadata on later
session_meta lines, so a fork could be reported with the source thread’s
thread_id. This fix only uses the first session_meta, so the fork keeps
its own ID.

guinness-oai · 2026-02-10 10:32:11 -08:00

099ed802b2

feat: opt-out of events in the app-server (#11319 )

Add `optOutNotificationMethods` in the app-server to opt-out events
based on exact method matching

jif-oai · 2026-02-10 18:04:52 +00:00

a364dd8b56

[apps] Improve app installation flow. (#11249 )

- [x] Add buttons to start the installation flow and verify installation
completes.
- [x] Hard refresh apps list when the /apps view opens.

Matthew Zeng · 2026-02-10 17:59:43 +00:00

48e415bdef

Fix: update parallel tool call exec approval to approve on request id (#11162 )

### Summary

In parallel tool call, exec command approvals were not approved at
request level but at a turn level. i.e. when a single request is
approved, the system currently treats all requests in turn as approved.

### Before

https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8

### After

https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc

Shijie Rao · 2026-02-10 09:38:00 -08:00

c4b771a16f

Revert "Add app-server transport layer with websocket support (#10693 )" (#11323 )

Suspected cause of deadlocking bug

Max Johnson · 2026-02-10 17:37:49 +00:00

47356ff83c

fix(protocol): approval policy never prompt (#11288 )

This removes overly directed language about how the model should behave
when it's in `approval_policy=never` mode.

---------

Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>

Fouad Matin · 2026-02-10 09:27:46 -08:00

693bac1851

tui: keep history recall cursor at line end (#11295 )

## Summary
- keep cursor at end-of-line after Up/Down history recall
- allow continued history navigation when recalled text cursor is at
start or end boundary
- add regression tests and document the history cursor contract in
composer docs

## Testing
- just fmt
- cargo test -p codex-tui --lib
history_navigation_leaves_cursor_at_end_of_line
- cargo test -p codex-tui --lib
should_handle_navigation_when_cursor_is_at_line_boundaries
- cargo test -p codex-tui *(fails in existing integration test
`suite::no_panic_on_startup::malformed_rules_should_not_panic` because
`target/debug/codex` is not present in this environment)*

Josh McKinney · 2026-02-10 17:21:46 +00:00

e704f488bd

Remove ApiPrompt (#11265 )

Keep things simple and build a full Responses API request request right
in the model client

pakrym-oai · 2026-02-10 16:12:31 +00:00

3322b99900

Fix pending input test waiting logic (#11322 )

## Summary
- remove redundant user message wait that could time out and cause
flakiness
- rely on the existing turn-complete wait to ensure the follow-up
request is observed

## Testing
- Not run (not requested)

jif-oai · 2026-02-10 15:40:53 +00:00

59c625458b

chore: split NPM packages (#11318 )

jif-oai · 2026-02-10 14:49:53 +00:00

c19969c676

feat: phase 2 consolidation (#11306 )

Consolidation phase of memories

Cleaning and better handling of concurrency

jif-oai · 2026-02-10 14:31:16 +00:00

e57892b211

Extract hooks into dedicated crate (#11311 )

Summary
- move `core/src/hooks` implementation into a new `codex-hooks` crate
with its own manifest
- update `codex-rs` workspace and `codex-core` crate to depend on the
extracted `hooks` crate and wire up the shared APIs
- ensure references, modules, and lockfile reflect the new crate layout

Testing
- Not run (not requested)

jif-oai · 2026-02-10 13:42:17 +00:00

d735df1f50

feat: align memory phase 1 and make it stronger (#11300 )

## Align with the new phase-1 design

Basically we know run phase 1 in parallel by considering:
* Max 64 rollouts
* Max 1 month old
* Consider the most recent first

This PR also adds stronger parallelization capabilities by detecting
stale jobs, retry policies, ownership of computation to prevent double
computations etc etc

jif-oai · 2026-02-10 13:42:09 +00:00

1d5eba0090

Fix spawn_agent input type (#11304 )

jif-oai · 2026-02-10 12:16:39 +00:00

223fadc760

feat: add connector capabilities to sub-agents (#11191 )

jif-oai · 2026-02-10 11:53:01 +00:00

87ccc5bbae

memories: add extraction and prompt module foundation (#11200 )

## Summary
- add the new `core/src/memories` module (phase-one parsing, rollout
filtering, storage, selection, prompts)
- add Askama-backed memory templates for stage-one input/system and
consolidation prompts
- add module tests for parsing, filtering, path bucketing, and summary
maintenance

## Testing
- just fmt
- cargo test -p codex-core --lib memories::

jif-oai · 2026-02-10 10:10:24 +00:00

6049ff02a0

feat: retain NetworkProxy, when appropriate (#11207 )

As of this PR, `SessionServices` retains a
`Option<StartedNetworkProxy>`, if appropriate.

Now the `network` field on `Config` is `Option<NetworkProxySpec>`
instead of `Option<NetworkProxy>`.

Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
create the `StartedNetworkProxy`, which is a new struct that retains the
`NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
implemented for `NetworkProxyHandle` to ensure the proxies are shutdown
when it is dropped.)

The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
the appropriate places.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
* #11285
* __->__ #11207

Michael Bolin · 2026-02-10 02:09:23 -08:00

44ebf4588f

chore: put crypto provider logic in a shared crate (#11294 )

Ensures a process-wide rustls crypto provider is installed.

Both the `codex-network-proxy` and `codex-api` crates need this.

Michael Bolin · 2026-02-10 01:04:31 -08:00

8e240a13be

feat: support configurable metric_exporter (#10940 )

alexsong-oai · 2026-02-10 08:14:28 +00:00

9fded117ac

3718 Commits