codex

turn metadata followups (#11782 )

some trivial simplifications from #11677

pash-openai · 2026-02-13 14:59:16 -08:00

a5e8e69d18

turn metadata: per-turn non-blocking (#11677 )

pash-openai · 2026-02-13 12:48:29 -08:00

6c0a924203

support app usage analytics (#11687 )

Emit app mentioned and app used events. Dedup by (turn_id, connector_id)

Example event params:
{
    "event_type": "codex_app_used",
    "connector_id": "asdk_app_xxx",
    "thread_id": "019c5527-36d4-xxx",
    "turn_id": "019c552c-cd17-xxx",
    "app_name": "Slack (OpenAI Internal)",
    "product_client_id": "codex_cli_rs",
    "invoke_type": "explicit",
    "model_slug": "gpt-5.3-codex"
}

alexsong-oai · 2026-02-13 12:00:16 -08:00

e71760fc64

Add js_repl kernel crash diagnostics (#11666 )

## Summary

This PR improves `js_repl` crash diagnostics so kernel failures are
debuggable without weakening timeout/reset guarantees.

## What Changed

- Added bounded kernel stderr capture and truncation logic (line + byte
caps).
- Added structured kernel snapshots (`pid`, exit status, stderr tail)
for failure paths.
- Enriched model-visible kernel-failure errors with a structured
diagnostics payload:
  - `js_repl diagnostics: {...}`
  - Included only for likely kernel-failure write/EOF cases.
- Improved logging around kernel write failures, unexpected exits, and
kill/wait paths.
- Added/updated unit tests for:
  - UTF-8-safe truncation
  - stderr tail bounds
  - structured diagnostics shape/truncation
  - conditional diagnostics emission
  - timeout kill behavior
  - forced kernel-failure diagnostics

## Why

Before this, failures like broken pipe / unexpected kernel exit often
surfaced as generic errors with little context. This change preserves
existing behavior but adds actionable diagnostics while keeping output
bounded.

## Scope

- Code changes are limited to:
-
`/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs`

## Validation

- `cargo clippy -p codex-core --all-targets -- -D warnings`
- Targeted `codex-core` js_repl unit tests (including new
diagnostics/timeout coverage)
- Tried starting a long running js_repl command (sleep for 10 minutes),
verified error output was as expected after killing the node process.

#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/11666
- ⏳ `2` https://github.com/openai/codex/pull/10673
- ⏳ `3` https://github.com/openai/codex/pull/10670

Curtis 'Fjord' Hawthorne · 2026-02-13 11:57:11 -08:00

a02342c9e1

chore: mini (#11772 )

https://github.com/openai/codex/issues/11764

jif-oai · 2026-02-13 19:30:49 +00:00

c54a4ec078

Update read_path prompt (#11763 )

## Summary

- Created branch zuxin/read-path-update from main.
- Copied codex-rs/core/templates/memories/read_path.md from the current
branch.
- Committed the content change.

## Testing
Not run (content copy + commit only).

zuxin-oai · 2026-02-13 18:34:54 +00:00

b934ffcaaa

Report syntax errors in rules file (#11686 )

Currently, if there are syntax errors detected in the starlark rules
file, the entire policy is silently ignored by the CLI. The app server
correctly emits a message that can be displayed in a GUI.

This PR changes the CLI (both the TUI and non-interactive exec) to fail
when the rules file can't be parsed. It then prints out an error message
and exits with a non-zero exit code. This is consistent with the
handling of errors in the config file.

This addresses #11603

Eric Traut · 2026-02-13 10:33:40 -08:00

b98c810328

feat(tui): prevent macOS idle sleep while turns run (#11711 )

## Summary
- add a shared `codex-core` sleep inhibitor that uses native macOS IOKit
assertions (`IOPMAssertionCreateWithName` / `IOPMAssertionRelease`)
instead of spawning `caffeinate`
- wire sleep inhibition to turn lifecycle in `tui` (`TurnStarted`
enables; `TurnComplete` and abort/error finalization disable)
- gate this behavior behind a `/experimental` feature toggle
(`[features].prevent_idle_sleep`) instead of a dedicated `[tui]` config
flag
- expose the toggle in `/experimental` on macOS; keep it under
development on other platforms
- keep behavior no-op on non-macOS targets

<img width="1326" height="577" alt="image"
src="https://github.com/user-attachments/assets/73fac06b-97ae-46a2-800a-30f9516cf8a3"
/>

## Testing
- `cargo check -p codex-core -p codex-tui`
- `cargo test -p codex-core sleep_inhibitor::tests -- --nocapture`
- `cargo test -p codex-core
tui_config_missing_notifications_field_defaults_to_enabled --
--nocapture`
- `cargo test -p codex-core prevent_idle_sleep_is_ -- --nocapture`

## Semantics and API references
- This PR targets `caffeinate -i` semantics: prevent *idle system sleep*
while allowing display idle sleep.
- `caffeinate -i` mapping in Apple open source (`assertionMap`):
  - `kIdleAssertionFlag -> kIOPMAssertionTypePreventUserIdleSystemSleep`
- Source:
https://github.com/apple-oss-distributions/PowerManagement/blob/PowerManagement-1846.60.12/caffeinate/caffeinate.c#L52-L54
- Apple IOKit docs for assertion types and API:
-
https://developer.apple.com/documentation/iokit/iopmlib_h/iopmassertiontypes
-
https://developer.apple.com/documentation/iokit/1557092-iopmassertioncreatewithname
  - https://developer.apple.com/library/archive/qa/qa1340/_index.html

## Codex Electron vs this PR (full stack path)
- Codex Electron app requests sleep blocking with
`powerSaveBlocker.start("prevent-app-suspension")`:
-
https://github.com/openai/codex/blob/main/codex/codex-vscode/electron/src/electron-message-handler.ts
- Electron maps that string to Chromium wake lock type
`kPreventAppSuspension`:
-
https://github.com/electron/electron/blob/main/shell/browser/api/electron_api_power_save_blocker.cc
- Chromium macOS backend maps wake lock types to IOKit assertion
constants and calls IOKit:
  - `kPreventAppSuspension -> kIOPMAssertionTypeNoIdleSleep`
- `kPreventDisplaySleep / kPreventDisplaySleepAllowDimming ->
kIOPMAssertionTypeNoDisplaySleep`
-
https://github.com/chromium/chromium/blob/main/services/device/wake_lock/power_save_blocker/power_save_blocker_mac.cc

## Why this PR uses a different macOS constant name
- This PR uses `"PreventUserIdleSystemSleep"` directly, via
`IOPMAssertionCreateWithName`, in
`codex-rs/core/src/sleep_inhibitor.rs`.
- Apple’s IOKit header documents `kIOPMAssertionTypeNoIdleSleep` as
deprecated and recommends `kIOPMAssertPreventUserIdleSystemSleep` /
`kIOPMAssertionTypePreventUserIdleSystemSleep`:
-
https://github.com/apple-oss-distributions/IOKitUser/blob/IOKitUser-100222.60.2/pwr_mgt.subproj/IOPMLib.h#L1000-L1030
- So Chromium and this PR are using different constant names, but
semantically equivalent idle-system-sleep prevention behavior.

## Future platform support
The architecture is intentionally set up for multi-platform extensions:
- UI code (`tui`) only calls `SleepInhibitor::set_turn_running(...)` on
turn lifecycle boundaries.
- Platform-specific behavior is isolated in
`codex-rs/core/src/sleep_inhibitor.rs` behind `cfg(...)` blocks.
- Feature exposure is centralized in `core/src/features.rs` and surfaced
via `/experimental`.
- Adding new OS backends should not require additional TUI wiring; only
the backend internals and feature stage metadata need to change.

Potential follow-up implementations:
- Windows:
- Add a backend using Win32 power APIs
(`SetThreadExecutionState(ES_CONTINUOUS | ES_SYSTEM_REQUIRED)` as
baseline).
- Optionally move to `PowerCreateRequest` / `PowerSetRequest` /
`PowerClearRequest` for richer assertion semantics.
- Linux:
- Add a backend using logind inhibitors over D-Bus
(`org.freedesktop.login1.Manager.Inhibit` with `what="sleep"`).
  - Keep a no-op fallback where logind/D-Bus is unavailable.

This PR keeps the cross-platform API surface minimal so future PRs can
add Windows/Linux support incrementally with low churn.

---------

Co-authored-by: jif-oai <jif@openai.com>

Yaroslav Volovich · 2026-02-13 10:31:39 -08:00

32da5eb358

fix: reduce flakiness of compact_resume_after_second_compaction_preserves_history (#11663 )

## Why
`compact_resume_after_second_compaction_preserves_history` has been
intermittently flaky in Windows CI.

The test had two one-shot request matchers in the second compact/resume
phase that could overlap, and it waited for the first `Warning` event
after compaction. In practice, that made the test sensitive to
platform/config-specific prompt shape and unrelated warning timing.

## What Changed
- Hardened the second compaction matcher in
`codex-rs/core/tests/suite/compact_resume_fork.rs` so it accepts
expected compact-request variants while explicitly excluding the
`AFTER_SECOND_RESUME` payload.
- Updated `compact_conversation()` to wait for the specific compaction
warning (`COMPACT_WARNING_MESSAGE`) rather than any `Warning` event.
- Added an inline comment explaining why the matcher is intentionally
broad but disjoint from the follow-up resume matcher.

## Test Plan
- `cargo test -p codex-core --test all
suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
-- --exact`
- Repeated the same test in a loop (40 runs) to check for local
nondeterminism.

Michael Bolin · 2026-02-13 09:51:22 -08:00

2383978a2c

core: limit search_tool_bm25 to Apps and clarify discovery guidance (#11669 )

## Summary
- Limit `search_tool_bm25` indexing to `codex_apps` tools only, so
non-Apps MCP servers are no longer discoverable through this search
path.
- Move search-tool discovery guidance into the `search_tool_bm25` tool
description (via template include) instead of injecting it as a separate
developer message.
- Update Apps discovery guidance wording to clarify when to use
`search_tool_bm25` for Apps-backed systems (for example Slack, Google
Drive, Jira, Notion) and when to call tools directly.
- Remove dead `core` helper code (`filter_codex_apps_mcp_tools` and
`codex_apps_connector_id`) that is no longer used after the
tool-selection refactor.
- Update `core` search-tool tests to assert codex-apps-only behavior and
to validate guidance from the tool description.

## Validation
- ✅ `just fmt`
- ✅ `cargo test -p codex-core search_tool`
- ⚠️ `cargo test -p codex-core` was attempted, but the run repeatedly
stalled on
`tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool`.

## Tickets
- None

Anton Panasenko · 2026-02-13 09:32:46 -08:00

38c442ca7f

Fix memories output schema requirements (#11748 )

Summary
- make the phase1 memories schema require `rollout_slug` while still
allowing it to be `null`
- update the corresponding test to check the required fields and
nullable type list

Testing
- Not run (not requested)

jif-oai · 2026-02-13 16:17:21 +00:00

c0749c349f

chore: move explorer to spark (#11745 )

jif-oai · 2026-02-13 16:13:24 +00:00

561fc14045

feat: add slug in name (#11739 )

jif-oai · 2026-02-13 15:24:03 +00:00

db66d827be

feat: memories config (#11731 )

jif-oai · 2026-02-13 14:18:15 +00:00

e00080cea3

chore: streamline phase 2 (#11712 )

jif-oai · 2026-02-13 13:21:11 +00:00

36541876f4

Lower missing rollout log level (#11722 )

Fix this: https://github.com/openai/codex/issues/11634

jif-oai · 2026-02-13 12:59:17 +00:00

feae389942

feat: add token usage on memories (#11618 )

Add aggregated token usage metrics on phase 1 of memories

jif-oai · 2026-02-13 09:31:20 +00:00

e5e40e2d4b

chore(core) Restrict model-suggested rules (#11671 )

## Summary
If the model suggests a bad rule, don't show it to the user. This does
not impact the parsing of existing rules, just the ones we show.

## Testing
- [x] Added unit tests
- [x] Ran locally

Dylan Hurd · 2026-02-12 23:57:53 -08:00

e6e4c5fa3a

[apps] Fix app loading logic. (#11518 )

When `app/list` is called with `force_refetch=True`, we should seed the
results with what is already cached instead of starting from an empty
list. Otherwise when we send app/list/updated events, the client will
first see an empty list of accessible apps and then get the updated one.

Matthew Zeng · 2026-02-13 03:55:10 +00:00

f93037f55d

chore(approvals) More approvals scenarios (#11660 )

## Summary
Add some additional tests to approvals flow

## Testing
- [x] these are tests

Dylan Hurd · 2026-02-12 19:54:54 -08:00

35692e99c1

Added a test to verify that feature flags that are enabled by default are stable (#11275 )

We've had a few cases recently where someone enabled a feature flag for
a feature that's still under development or experimental. This test
should prevent this.

Eric Traut · 2026-02-12 17:53:15 -08:00

537102e657

Remove git commands from dangerous command checks (#11510 )

### Motivation

- Git subcommand matching was being classified as "dangerous" and caused
benign developer workflows (for example `git push --force-with-lease`)
to be blocked by the preflight policy.
- The change aligns behavior with the intent to reserve the dangerous
checklist for truly destructive shell ops (e.g. `rm -rf`) and avoid
surprising developer-facing blocks.

### Description

- Remove git-specific subcommand checks from
`is_dangerous_to_call_with_exec` in
`codex-rs/shell-command/src/command_safety/is_dangerous_command.rs`,
leaving only explicit `rm` and `sudo` passthrough checks.
- Deleted the git-specific helper logic that classified `reset`,
`branch`-delete, `push` (force/delete/refspec) and `clean --force` as
dangerous.
- Updated unit tests in the same file to assert that various `git
reset`/`git branch`/`git push`/`git clean` variants are no longer
classified as dangerous.
- Kept `find_git_subcommand` (used by safe-command classification)
intact so safe/unsafe parsing elsewhere remains functional.

### Testing

- Ran formatter with `just fmt` successfully.  
- Ran unit tests with `cargo test -p codex-shell-command` and all tests
passed (`144 passed; 0 failed`).

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_698d19dedb4883299c3ceb5bbc6a0dcf)

Josh McKinney · 2026-02-13 01:33:02 +00:00

fc073c9c5b

Persist complete TurnContextItem state via canonical conversion (#11656 )

## Summary

This PR delivers the first small, shippable step toward model-visible
state diffing by making
`TurnContextItem` more complete and standardizing how it is built.

Specifically, it:
- Adds persisted network context to `TurnContextItem`.
- Introduces a single canonical `TurnContext -> TurnContextItem`
conversion path.
- Routes existing rollout write sites through that canonical conversion
helper.

No context injection/diff behavior changes are included in this PR.

## Why this change

The design goal is to make `TurnContextItem` the canonical source of
truth for context-diff
decisions.
Before this PR:
- `TurnContextItem` did not include all TurnContext-derived environment
inputs needed for v1
completeness.
- Construction was duplicated at multiple write sites.

This PR addresses both with a minimal, reviewable change.

## Changes

### 1) Extend `TurnContextItem` with network state
- Added `TurnContextNetworkItem { allowed_domains, denied_domains }`.
- Added `network: Option<TurnContextNetworkItem>` to `TurnContextItem`.
- Kept backward compatibility by making the new field optional and
skipped when absent.

Files:
- `codex-rs/protocol/src/protocol.rs`

### 2) Canonical conversion helper
- Added `TurnContext::to_turn_context_item(collaboration_mode)` in core.
- Added internal helper to derive network fields from
`config_layer_stack.requirements().network`.

Files:
- `codex-rs/core/src/codex.rs`

### 3) Use canonical conversion at rollout write sites
- Replaced ad hoc `TurnContextItem { ... }` construction with
`to_turn_context_item(...)` in:
  - sampling request path
  - compaction path

Files:
- `codex-rs/core/src/codex.rs`
- `codex-rs/core/src/compact.rs`

### 4) Update fixtures/tests for new optional field
- Updated existing `TurnContextItem` literals in tests to include
`network: None`.
- Added protocol tests for:
  - deserializing old payloads with no `network`
  - serializing when `network` is present

Files:
- `codex-rs/core/tests/suite/resume_warning.rs`
- No replay/diff logic changes.
- Persisted rollout `TurnContextItem` now carries additional network
context when available.
- Older rollout lines without `network` remain readable.

Charley Cunningham · 2026-02-12 17:22:44 -08:00

f24669d444

Add new apps_mcp_gateway (#11630 )

Adds a new apps_mcp_gateway flag to route Apps MCP calls through
https://api.openai.com/v1/connectors/mcp/ when enabled, while keeping
legacy MCP routing as default.

canvrno-oai · 2026-02-12 16:54:11 -08:00

46b2da35d5

[apps] Add is_enabled to app info. (#11417 )

- [x] Add is_enabled to app info and the response of `app/list`.
- [x] Update TUI to have Enable/Disable button on the app detail page.

Matthew Zeng · 2026-02-13 00:30:52 +00:00

c37560069a

Add js_repl_tools_only model and routing restrictions (#10671 )

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.


#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/10674
- ✅ `2` https://github.com/openai/codex/pull/10672
- 👉 `3` https://github.com/openai/codex/pull/10671
- ⏳ `4` https://github.com/openai/codex/pull/10673
- ⏳ `5` https://github.com/openai/codex/pull/10670

Curtis 'Fjord' Hawthorne · 2026-02-12 15:41:05 -08:00

0dcfc59171

Remove absolute path in rollout_summary (#11622 )

Wendy Jiao · 2026-02-12 23:32:41 +00:00

a7ce2a1c31

[feat] add seatbelt permission files (#11639 )

Add seatbelt permission extension abstraction as permission files for
seatbelt profiles. This should complement our current sandbox policy

Celia Chen · 2026-02-12 23:30:22 +00:00

dfd1e199a0

feat: introduce Permissions (#11633 )

## Why
We currently carry multiple permission-related concepts directly on
`Config` for shell/unified-exec behavior (`approval_policy`,
`sandbox_policy`, `network`, `shell_environment_policy`,
`windows_sandbox_mode`).

Consolidating these into one in-memory struct makes permission handling
easier to reason about and sets up the next step: supporting named
permission profiles (`[permissions.PROFILE_NAME]`) without changing
behavior now.

This change is mostly mechanical: it updates existing callsites to go
through `config.permissions`, but it does not yet refactor those
callsites to take a single `Permissions` value in places where multiple
permission fields are still threaded separately.

This PR intentionally **does not** change the on-disk `config.toml`
format yet and keeps compatibility with legacy config keys.

## What Changed
- Introduced `Permissions` in `core/src/config/mod.rs`.
- Added `Config::permissions` and moved effective runtime permission
fields under it:
  - `approval_policy`
  - `sandbox_policy`
  - `network`
  - `shell_environment_policy`
  - `windows_sandbox_mode`
- Updated config loading/building so these effective values are still
derived from the same existing config inputs and constraints.
- Updated Windows sandbox helpers/resolution to read/write via
`permissions`.
- Threaded the new field through all permission consumers across core
runtime, app-server, CLI/exec, TUI, and sandbox summary code.
- Updated affected tests to reference `config.permissions.*`.
- Renamed the struct/field from
`EffectivePermissions`/`effective_permissions` to
`Permissions`/`permissions` and aligned variable naming accordingly.

## Verification
- `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server
-p codex-exec -p codex-utils-sandbox-summary`
- `cargo build -p codex-core -p codex-tui -p codex-cli -p
codex-app-server -p codex-exec -p codex-utils-sandbox-summary`

Michael Bolin · 2026-02-12 14:42:54 -08:00

a4cc1a4a85

Better error message for model limit hit. (#11636 )

<img width="553" height="147" alt="image"
src="https://github.com/user-attachments/assets/f04cdebd-608a-4055-a413-fae92aaf04e5"
/>

xl-openai · 2026-02-12 14:10:30 -08:00

d7cb70ed26

chore(core) Deprecate approval_policy: on-failure (#11631 )

## Summary
In an effort to start simplifying our sandbox setup, we're announcing
this approval_policy as deprecated. In general, it performs worse than
`on-request`, and we're focusing on making fewer sandbox configurations
perform much better.

## Testing
- [x] Tested locally
- [x] Existing tests pass

Dylan Hurd · 2026-02-12 13:23:30 -08:00

4668feb43a

add a slash command to grant sandbox read access to inaccessible directories (#11512 )

There is an edge case where a directory is not readable by the sandbox.
In practice, we've seen very little of it, but it can happen so this
slash command unlocks users when it does.

Future idea is to make this a tool that the agent knows about so it can
be more integrated.

iceweasel-oai · 2026-02-12 12:48:36 -08:00

5c3ca73914

Add js_repl host helpers and exec end events (#10672 )

## Summary

This PR adds host-integrated helper APIs for `js_repl` and updates model
guidance so the agent can use them reliably.

### What’s included

- Add `codex.tool(name, args?)` in the JS kernel so `js_repl` can call
normal Codex tools.
- Keep persistent JS state and scratch-path helpers available:
  - `codex.state`
  - `codex.tmpDir`
- Wire `js_repl` tool calls through the standard tool router path.
- Add/align `js_repl` execution completion/end event behavior with
existing tool logging patterns.
- Update dynamic prompt injection (`project_doc`) to document:
  - how to call `codex.tool(...)`
  - raw output behavior
- image flow via `view_image` (`codex.tmpDir` +
`codex.tool("view_image", ...)`)
- stdio safety guidance (`console.log` / `codex.tool`, avoid direct
`process.std*`)

## Why

- Standardize JS-side tool usage on `codex.tool(...)`
- Make `js_repl` behavior more consistent with existing tool execution
and event/logging patterns.
- Give the model enough runtime guidance to use `js_repl` safely and
effectively.

## Testing

- Added/updated unit and runtime tests for:
  - `codex.tool` calls from `js_repl` (including shell/MCP paths)
  - image handoff flow via `view_image`
  - prompt-injection text for `js_repl` guidance
  - execution/end event behavior and related regression coverage




#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/10674
- 👉 `2` https://github.com/openai/codex/pull/10672
- ⏳ `3` https://github.com/openai/codex/pull/10671
- ⏳ `4` https://github.com/openai/codex/pull/10673
- ⏳ `5` https://github.com/openai/codex/pull/10670

Curtis 'Fjord' Hawthorne · 2026-02-12 12:10:25 -08:00

466be55abc

feat(app-server): experimental flag to persist extended history (#11227 )

This PR adds an experimental `persist_extended_history` bool flag to
app-server thread APIs so rollout logs can retain a richer set of
EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e.
on `thread/resume`).

### Motivation
Today, our rollout recorder only persists a small subset (e.g. user
message, reasoning, assistant message) of `EventMsg` types, dropping a
good number (like command exec, file change, etc.) that are important
for reconstructing full item history for `thread/resume`, `thread/read`,
and `thread/fork`.

Some clients want to be able to resume a thread without lossiness. This
lossiness is primarily a UI thing, since what the model sees are
`ResponseItem` and not `EventMsg`.

### Approach
This change introduces an opt-in `persist_full_history` flag to preserve
those events when you start/resume/fork a thread (defaults to `false`).

This is done by adding an `EventPersistenceMode` to the rollout
recorder:
- `Limited` (existing behavior, default)
- `Extended` (new opt-in behavior)

In `Extended` mode, persist additional `EventMsg` variants needed for
non-lossy app-server `ThreadItem` reconstruction. We now store the
following ThreadItems that we didn't before:
- web search
- command execution
- patch/file changes
- MCP tool calls
- image view calls
- collab tool outcomes
- context compaction
- review mode enter/exit

For **command executions** in particular, we truncate the output using
the existing `truncate_text` from core to store an upper bound of 10,000
bytes, which is also the default value for truncating tool outputs shown
to the model. This keeps the size of the rollout file and command
execution items returned over the wire reasonable.

And we also persist `EventMsg::Error` which we can now map back to the
Turn's status and populates the Turn's error metadata.

#### Updates to EventMsgs
To truly make `thread/resume` non-lossy, we also needed to persist the
`status` on `EventMsg::CommandExecutionEndEvent` and
`EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a
command failed or was declined (similar for apply_patch). These
EventMsgs were never persisted before so I made it a required field.

Owen Lin · 2026-02-12 19:34:22 +00:00

efc8d45750

Parse first order skill/connector mentions (#11547 )

This PR introduces a skill-expansion mechanism for mentions so nested or
skill or connection mentions are expanded if present in skills invoked
by the user. This keeps behavior aligned with existing mention handling
while extending coverage to deeper scenarios. With these changes, users
can create skills that invoke connectors, and skills that invoke other
skills.

Replaces #10863, which is not needed with the addition of
[search_tool_bm25](https://github.com/openai/codex/issues/10657)

canvrno-oai · 2026-02-12 10:55:22 -08:00

22fa283511

fix: fmt (#11619 )

jif-oai · 2026-02-12 18:13:00 +00:00

545b266839

chore: reduce concurrency of memories (#11614 )

jif-oai · 2026-02-12 17:55:21 +00:00

b3674dcce0

Add cwd to memory files (#11591 )

Add cwd to memory files so that model can deal with multi cwd memory
better.

---------

Co-authored-by: jif-oai <jif@openai.com>

Wendy Jiao · 2026-02-12 17:46:49 +00:00

88c5ca2573

exclude developer messages from phase-1 memory input (#11608 )

Co-authored-by: jif-oai <jif@openai.com>

Wendy Jiao · 2026-02-12 17:43:38 +00:00

82acd815e4

fix(core) model_info preserves slug (#11602 )

## Summary
Preserve the specified model slug when we get a prefix-based match

## Testing
- [x] added unit test

---------

Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>

Dylan Hurd · 2026-02-12 09:43:32 -08:00

f39f506700

chore: drop and clean from phase 1 (#11605 )

This PR is mostly cleaning and simplifying phase 1 of memories

jif-oai · 2026-02-12 17:23:00 +00:00

f741fad5c0

Fix config test on macOS (#11579 )

When running these tests locally, you may have system-wide config or
requirements files. This makes the tests ignore these files.

gt-oai · 2026-02-12 15:56:48 +00:00

d8b130d9a4

feat: metrics to memories (#11593 )

jif-oai · 2026-02-12 15:28:48 +00:00

aeaa68347f

chore: clean consts (#11590 )

jif-oai · 2026-02-12 14:44:40 +00:00

04b60d65b3

feat: truncate with model infos (#11577 )

jif-oai · 2026-02-12 13:16:40 +00:00

44b92f9a85

Ensure list_threads drops stale rollout files (#11572 )

Summary
- trim `state_db::list_threads_db` results to entries whose rollout
files still exist, logging and recording a discrepancy for dropped rows
- delete stale metadata rows from the SQLite store so future calls don’t
surface invalid paths
- add regression coverage in `recorder.rs` to verify stale DB paths are
dropped when the file is missing

jif-oai · 2026-02-12 12:49:31 +00:00

adad23f743

feat: mem drop cot (#11571 )

Drop CoT and compaction for memory building

jif-oai · 2026-02-12 11:41:04 +00:00

befe4fbb02

Fix flaky pre_sampling_compact switch test (#11573 )

Summary
- address the nondeterministic behavior observed in
`pre_sampling_compact_runs_on_switch_to_smaller_context_model` so it no
longer fails intermittently during model switches
- ensure the surrounding sampling logic consistently handles the
smaller-context case that the test exercises

Testing
- Not run (not requested)

jif-oai · 2026-02-12 11:40:48 +00:00

3cd93c00ac

feat: mem slash commands (#11569 )

Add 2 slash commands for memories:
* `/m_drop` delete all the memories
* `/m_update` update the memories with phase 1 and 2

jif-oai · 2026-02-12 10:39:43 +00:00

a0dab25c68

Fix test flake (#11448 )

Flaking with

```
   Nextest run ID 6b7ff5f7-57f6-4c9c-8026-67f08fa2f81f with nextest profile: default
      Starting 3282 tests across 118 binaries (21 tests skipped)
          FAIL [  14.548s] (1367/3282) codex-core::all suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input
    stdout ───

      running 1 test
      test suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input ... FAILED

      failures:

      failures:
          suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input

      test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 522 filtered out; finished in 14.41s

    stderr ───

      thread 'suite::apply_patch_cli::apply_patch_cli_can_use_shell_command_output_as_patch_input' (15632) panicked at C:\a\codex\codex\codex-rs\core\tests\common\lib.rs:186:14:
      timeout waiting for event: Elapsed(())
      stack backtrace:
      read_output:
      Exit code: 0
      Wall time: 8.5 seconds
      Output:
      line1
      naïve café
      line3

      stdout:
      line1
      naïve café
      line3
      patch:
      *** Begin Patch
      *** Add File: target.txt
      +line1
      +naïve café
      +line3
      *** End Patch
      note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```

gt-oai · 2026-02-12 09:37:24 +00:00

4027f1f1a4

1727 Commits