codex/codex-rs/thread-store
Michael Bolin 01f89c8c59 core: persist initial context window metadata (#29519)
## Why

PR #29494 made context-window IDs visible to the model by wrapping the
token-budget window payload in `<context_window>`, but rollout JSONL
consumers still could not see the initial window identity by tailing the
session file. Compacted rollout items carry window IDs only after
compaction has happened, so a session with no compaction had no durable
JSONL record for window 0.

This change gives tailing consumers a stable initial-window record at
session creation time.

## What Changed

- Added `session_meta.context_window.window_id` for the initial
context-window identity.
- `CreateThreadParams` now requires `initial_window_id: String`, so
thread-store callers cannot accidentally create new threads without
window-0 metadata.
- Live thread creation derives the persisted initial window ID from the
same `AutoCompactWindowIds` used to initialize `SessionState`, keeping
runtime state and JSONL metadata aligned.
- Rollout reconstruction uses `session_meta.context_window.window_id` as
the initial-window fallback and derives `window_number = 0`,
`first_window_id = window_id`, and `previous_window_id = None`
internally.
- Fork reconstruction intentionally uses the same rollout reconstruction
path; consumers that need to distinguish copied initial-window metadata
can use the rollout `thread_id`.
- Legacy compactions without `window_number` still use compaction-count
fallback accounting instead of being reset to window 0 by the
initial-window fallback.
- Compacted rollout metadata still takes precedence once compaction
records exist, preserving the richer chain fields there.
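
The precedence rules in the list above can be sketched as a single function. This is a minimal illustration, not the actual reconstruction code: `WindowChain`, `CompactedItem`, and `reconstruct_window_chain` are hypothetical names, only the last compaction is inspected, and the IDs reported for legacy compactions (left as `None` here) are an assumption.

```rust
// Hypothetical, simplified stand-ins for rollout records; not the real codex types.

/// Window-chain state recovered from a rollout.
#[derive(Debug, Clone, PartialEq)]
struct WindowChain {
    window_number: u64,
    window_id: Option<String>,
    first_window_id: Option<String>,
    previous_window_id: Option<String>,
}

/// Window fields carried by one `compacted` rollout item.
/// Legacy items are modeled as having no `window_number`.
#[derive(Debug, Clone, Default)]
struct CompactedItem {
    window_number: Option<u64>,
    window_id: Option<String>,
    first_window_id: Option<String>,
    previous_window_id: Option<String>,
}

/// `initial_window_id` stands for `session_meta.context_window.window_id`.
fn reconstruct_window_chain(
    initial_window_id: Option<&str>,
    compactions: &[CompactedItem],
) -> WindowChain {
    // No compaction yet: fall back to the initial window and derive the rest.
    let Some(last) = compactions.last() else {
        return WindowChain {
            window_number: 0,
            window_id: initial_window_id.map(str::to_owned),
            first_window_id: initial_window_id.map(str::to_owned),
            previous_window_id: None,
        };
    };
    match last.window_number {
        // Compacted metadata takes precedence and keeps its richer chain fields.
        Some(window_number) => WindowChain {
            window_number,
            window_id: last.window_id.clone(),
            first_window_id: last.first_window_id.clone(),
            previous_window_id: last.previous_window_id.clone(),
        },
        // Legacy compaction: count compactions instead of resetting to window 0.
        // Leaving the IDs empty here is an assumption of this sketch.
        None => WindowChain {
            window_number: compactions.len() as u64,
            window_id: None,
            first_window_id: None,
            previous_window_id: None,
        },
    }
}
```

With no compactions the result is window 0 with `first_window_id` equal to the initial ID; two legacy compactions yield `window_number = 2` rather than 0.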

## JSONL Shape

Real rollout JSONL is one object per line. This example is expanded for
readability, but shows the new initial `session_meta.context_window`
record followed by the existing compacted rollout item shape that also
carries window IDs:

```jsonl
{
  "timestamp": "2026-06-22T12:00:00.000Z",
  "type": "session_meta",
  "payload": {
    "session_id": "<THREAD_ID>",
    "id": "<THREAD_ID>",
    "timestamp": "2026-06-22T12:00:00.000Z",
    "cwd": "/repo",
    "originator": "codex",
    "cli_version": "0.0.0",
    "source": "cli",
    "model_provider": "<MODEL_PROVIDER>",
    "context_window": {
      "window_id": "<INITIAL_WINDOW_ID>"
    }
  }
}
...
{
  "timestamp": "2026-06-22T12:34:56.000Z",
  "type": "compacted",
  "payload": {
    "message": "<COMPACTION_SUMMARY>",
    "replacement_history": [
      "..."
    ],
    "window_number": 1,
    "first_window_id": "<INITIAL_WINDOW_ID>",
    "previous_window_id": "<INITIAL_WINDOW_ID>",
    "window_id": "<NEXT_WINDOW_ID>"
  }
}
```

The nested `context_window` object is intentional: it gives rollout
consumers a stable namespace for context-window metadata while only
writing the non-derivable initial `window_id`. For the initial window,
`window_number`, `first_window_id`, and `previous_window_id` are derived
internally instead of being written to the rollout.
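
For a tailing consumer, reading the initial window ID from a `session_meta` line might look like the sketch below. It is deliberately naive and std-only so that it stays self-contained: it assumes each line is compact JSON with no whitespace between tokens and that the ID contains no escaped quotes. `initial_window_id` is a hypothetical helper; a real consumer would more likely use a JSON parser such as `serde_json`.

```rust
/// Naive extraction of `session_meta.context_window.window_id` from one rollout line.
/// Assumes compact JSON (no spaces around `:`) and an ID without escaped quotes.
fn initial_window_id(line: &str) -> Option<&str> {
    // Only `session_meta` records carry the initial-window metadata.
    if !line.contains("\"type\":\"session_meta\"") {
        return None;
    }
    // Jump to the nested `context_window` object, then to its `window_id` value.
    let object_start = line.find("\"context_window\":{")?;
    let rest = &line[object_start..];
    let key = "\"window_id\":\"";
    let value_start = rest.find(key)? + key.len();
    let value = &rest[value_start..];
    // The value ends at the next double quote.
    let value_end = value.find('"')?;
    Some(&value[..value_end])
}
```

Lines of any other type, and `session_meta` lines written before this change (no `context_window` object), return `None`, so the consumer can fall back to whatever it did previously.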

## Verification

- `just test -p codex-protocol`
- `just test -p codex-rollout recorder_materializes_on_flush_with_pending_items`
- `just test -p codex-core reconstruct_history`
- `just test -p codex-core record_initial_history_reconstructs_forked_transcript`
- `just test -p codex-thread-store`
- `just test -p codex-state`
- `just test -p codex-app-server thread_read_returns_summary_without_turns`
- `just test -p codex-rollout persistence_metrics`
01f89c8c59 · 2026-06-23 21:50:50 +00:00

Thread Store

codex-thread-store is the storage boundary for Codex threads. It defines the ThreadStore trait plus local and in-memory implementations. Other storage implementations may live outside this repository.

Responsibilities

  • ThreadStore::append_items is the raw canonical history append API. It does not infer metadata from item contents.
  • ThreadStore::update_thread_metadata is the only thread metadata write API. It accepts a single literal metadata patch shape, regardless of whether the caller is applying a user/API mutation or facts derived above the store from appended history.
  • LiveThread is the preferred API for active session persistence. It owns a per-thread metadata sync helper, applies the rollout persistence policy, appends canonical history, and then sends metadata patches through ThreadStore::update_thread_metadata.
  • ThreadManager routes metadata mutations for loaded and cold threads through one entrypoint. Loaded threads use their LiveThread; cold threads go directly to the store.
  • LocalThreadStore persists history through codex-rollout JSONL files and persists queryable metadata through the SQLite state database when available. Explicit metadata mutations on local threads also keep the JSONL and name-index data compatible, so old or SQLite-less local storage stays readable.
  • RolloutRecorder is the local JSONL writer. It writes already-canonical items for ThreadStore::append_items; it no longer decides metadata updates for live thread-store appends.
  • core/session creates or resumes LiveThread handles and does not need to know whether persistence is backed by local files or another store.
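
The split between raw appends and explicit metadata patches can be illustrated with a small synchronous sketch. Everything here is a simplified, hypothetical stand-in: the real `ThreadStore` trait, `LiveThread`, and their signatures differ, and the `name`/`preview` fields, the "skip empty items" policy, and the "first item becomes the preview" rule are assumptions made up for the example.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; the real codex-thread-store types differ.

#[derive(Debug, Default, Clone, PartialEq)]
struct ThreadMetadata {
    name: Option<String>,
    preview: Option<String>,
}

/// Literal patch: `Some` overwrites a field, `None` leaves it untouched.
#[derive(Debug, Default, Clone)]
struct MetadataPatch {
    name: Option<String>,
    preview: Option<String>,
}

trait ThreadStore {
    /// Raw history append; must not infer metadata from `items`.
    fn append_items(&mut self, thread_id: &str, items: &[String]);
    /// The only metadata write path.
    fn update_thread_metadata(&mut self, thread_id: &str, patch: MetadataPatch);
}

#[derive(Default)]
struct InMemoryThreadStore {
    history: HashMap<String, Vec<String>>,
    metadata: HashMap<String, ThreadMetadata>,
}

impl ThreadStore for InMemoryThreadStore {
    fn append_items(&mut self, thread_id: &str, items: &[String]) {
        self.history
            .entry(thread_id.to_owned())
            .or_default()
            .extend_from_slice(items);
    }

    fn update_thread_metadata(&mut self, thread_id: &str, patch: MetadataPatch) {
        let meta = self.metadata.entry(thread_id.to_owned()).or_default();
        if let Some(name) = patch.name {
            meta.name = Some(name);
        }
        if let Some(preview) = patch.preview {
            meta.preview = Some(preview);
        }
    }
}

/// Stand-in for `LiveThread`: metadata observation lives here, above the store.
struct LiveThread<S: ThreadStore> {
    store: S,
    thread_id: String,
    preview_sent: bool,
}

impl<S: ThreadStore> LiveThread<S> {
    fn new(store: S, thread_id: &str) -> Self {
        Self { store, thread_id: thread_id.to_owned(), preview_sent: false }
    }

    fn record(&mut self, items: &[String]) {
        // Hypothetical persistence policy: skip empty items.
        let kept: Vec<String> = items.iter().filter(|item| !item.is_empty()).cloned().collect();
        // 1. Append canonical history; the store stores it as-is.
        self.store.append_items(&self.thread_id, &kept);
        // 2. Derive a fact from the history and send it as an explicit patch.
        if !self.preview_sent {
            if let Some(first) = kept.first() {
                let patch = MetadataPatch { preview: Some(first.clone()), ..Default::default() };
                self.store.update_thread_metadata(&self.thread_id, patch);
                self.preview_sent = true;
            }
        }
    }
}
```

Calling `append_items` directly on the store leaves its metadata untouched, while `LiveThread::record` appends the filtered items and then issues one patch. Keeping the derivation in the wrapper means a different store implementation only has to persist what it is told.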

Direction

New metadata observation semantics should live above ThreadStore. Stores persist explicit metadata fields, but raw history appends remain history-only.