192 Commits

  • ensure thread.history_mode is immutable (#30261)
    ## Description
    
    This PR makes `thread.history_mode` immutable after the thread's
    canonical first `SessionMeta` has been written. Later same-thread
    `SessionMeta` lines are compatibility metadata writes, not a new thread
    definition.
    
    Without this, an older binary could append a `SessionMeta` that omits
    `history_mode`; when a newer binary replays it, serde defaults that
    missing field to `legacy` and SQLite could downgrade a paginated thread.
    
    ## Why
    
    `history_mode` is the persisted thread storage contract.
    Paginated-thread fail-closed behavior and SQLite memory filtering depend
    on it staying aligned with canonical rollout metadata, especially when
    multiple Codex binary versions can touch the same local rollout.
    
    ## What changed
    
    - Stop generic rollout metadata replay from overwriting `history_mode`
    from later `SessionMeta` items.
    - Remove `history_mode` from `ThreadMetadataPatch`, so mutable metadata
    sync and app-server metadata updates cannot rewrite it.
    - When local metadata sync has to recreate a missing SQLite row, recover
    `history_mode` from the rollout's canonical first `SessionMeta` instead
    of from a mutable patch.
    - Keep the in-memory thread store using the created thread's canonical
    `history_mode` instead of metadata patches.
    - Fill the one remaining core test `CreateThreadParams` initializer with
    the new `history_mode` field; Bazel CI caught this after the parent
    history-mode PR landed.
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-thread-store`
    - `just test -p codex-state
    session_meta_does_not_set_model_or_reasoning_effort`
  • feat(app-server): add history_mode to thread (#29927)
    ## Description
    
    This PR adds a new `historyMode = "legacy" | "paginated"` to `Thread`.
    This will be stored in `SessionMeta` in the JSONL rollout file and as a
    new column in the SQLite thread_metadata table, and exposed on
    `thread/start` and on the `Thread` object in app-server.
    
    ## What changed
    
    - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`,
    defaulting old and new SessionMeta to `legacy`.
    - Carried `history_mode` through core session config, ThreadStore stored
    metadata, local/in-memory stores, rollout metadata extraction, and the
    existing SQLite `threads` table.
    - Added experimental `historyMode` to app-server v2 `Thread` and
    `thread/start`.
    - Made paginated stored threads metadata-discoverable but unsupported
    for legacy full-history reads, `load_history`, live resume, and create
    paths.
    - Regenerated app-server schema fixtures and added
    protocol/state/thread-store/app-server coverage for persistence and
    fail-closed behavior.
    
    ## Compatibility floor
    Because users may be running various versions of Codex binaries on the
    same machine (TUI, Codex App, etc.), we will need to establish a
    compatibility floor for upcoming paginated threads, which will change
    how thread storage reads and writes work.
    
    The overall plan here:
    ```
    Release N:
    - Add historyMode to SessionMeta / Thread / SQLite metadata.
    - Teach binaries to understand paginated threads.
    - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread.
    - Default remains `"legacy"`.
    
    Release N+1:
    - First-party clients start opting into paginated threads where appropriate.
    - Internal dogfood / staged rollout.
    - Measure old-client usage and paginated-thread unsupported errors.
    
    Release N+2:
    - Only after Release N+ is overwhelmingly deployed, make paginated the default.
    - Accept that a small tail of N-1-or-older binaries may not understand paginated threads.
    ```
    
    The important behavior change is fail-closed handling for a binary that
    encounters a persisted `paginated` thread before it knows how to fully
    support paginated history. In app-server, if a thread is `paginated`, we
    will:
    
    - allow metadata-only discovery paths like `thread/list` and
    `thread/read(includeTurns=false)`, so clients can still see the thread
    and inspect its `historyMode`
    - reject legacy full-history/live-thread paths like
    `thread/read(includeTurns=true)` and `thread/resume` with an unsupported
    JSON-RPC error
    - avoid silently treating an unknown or future `historyMode` as `legacy`
    
    Under the hood, the ThreadStore layer also rejects legacy operations
    that would need to load or replay the full thread history for a
    paginated thread. That gives us the behavior we want for Release N:
    future paginated threads are visible, but this binary fails closed
    instead of trying to operate on them as if they were legacy threads.
  • Persist selected capability roots and resolve availability per model step (#29856)
    ## Why
    
    `selectedCapabilityRoots` is durable thread intent: “use this capability
    root from environment `worker`.”
    
    The important product assumption is:
    
    > One environment ID always names the same logical executor and stable
    contents.
    
    `worker` does not silently change from executor A to an unrelated
    executor B. The process-local connection handle for `worker` can still
    be replaced while Codex is running, though, for example when
    `environment/add` registers a fresh handle for the same logical
    environment.
    
    The thread should persist only the stable selection. Each model step
    should pair that selection with the exact ready handle captured for that
    step.
    
    ## The boundary
    
    ```text
    persisted thread intent
      plugin@1 -> environment "worker"
                    |
                    | capture the current step
                    v
    model-step view
      unavailable, or
      plugin@1 + worker's exact captured ready handle
    ```
    
    The environment ID is the stable identity and cache key. The
    `Arc<Environment>` is only a process-local handle retained so consumers
    of one model step use the same captured environment. It is never
    persisted and it does not imply different environment contents.
    
    ## What changes
    
    ### Persist the stable selection
    
    Selected roots are written into `SessionMeta` and restored with the
    thread. Forked subagents inherit the same selections, including
    bounded-history forks.
    
    Only stable data is persisted: root ID, environment ID, and root path.
    
    ### Capture readiness together with the exact handle
    
    The environment snapshot records:
    
    ```rust
    environment_id -> Some(Arc<Environment>) // ready in this step
    environment_id -> None                   // still starting in this step
    ```
    
    This prevents readiness and execution from coming from different
    registry snapshots.
    
    For example:
    
    ```text
    step snapshot: worker -> handle A, ready
    environment/add: worker -> fresh handle B for the same logical environment
    current step: plugin@1 still uses captured handle A
    ```
    
    Without carrying handle A in the snapshot, the resolver could combine “A
    was ready” with handle B and treat B as ready before it had finished
    starting.
    
    This does not change cache invalidation. Stable capability metadata
    remains identified by environment ID and capability root. Replacing a
    process-local handle under the same stable environment ID does not
    invalidate or rediscover that metadata.
    
    ### Resolve availability per model step
    
    - A ready captured environment produces resolved roots using its
    captured handle.
    - A starting, missing, or failed environment is omitted from that step.
    - A selected lazy environment that is outside the turn's captured
    environment set is asked to start, and a later step can observe it as
    ready.
    - No capability files are scanned here.
    
    Transient transport disconnects remain the remote client's reconnect
    concern. This PR models initial attachment/readiness; it does not add
    live socket-connectivity state.
    
    ## Example
    
    ```text
    thread selection: plugin@1 -> environment "worker"
    
    step 1: worker is starting -> plugin@1 unavailable
    step 2: worker is ready    -> plugin@1 resolves through worker's captured handle
    step 3: fresh local handle -> current step remains pinned; a later step captures its own view
    ```
    
    Temporary unavailability does not discard the durable selection. Later
    PRs can retain stable metadata caches while projecting only currently
    available capabilities into model-visible World State.
    
    ## Compatibility
    
    The app-server request shape does not change. Older rollouts without
    `selected_capability_roots` deserialize to an empty list.
    
    ## Stack
    
    1. **This PR:** persist stable selected roots and resolve them through
    an exact model-step handle.
    2. #29960: cache stable skill metadata and project available skills into
    World State.
    3. #29946: cache stable plugin declarations and manage the separate live
    MCP runtime.
  • [2/3] core: persist world state in rollouts (#29835)
    ## Why
    
    `WorldState` currently remembers its model-visible diff baseline only in
    memory. That leaves no durable source for restoring the exact baseline
    after resume, fork, rollback, or compaction.
    
    This is the second PR in the WorldState persistence stack, built on
    #29833 and following #29249. It records durable state transitions; the
    next PR will replay them during rollout reconstruction.
    
    ## What
    
    - Add a `world_state` rollout item containing either a full snapshot or
    an RFC 7386 JSON Merge Patch.
    - Persist a full snapshot after initial context and after compaction
    establishes a new context window.
    - Persist non-empty patches when later sampling steps or turns advance
    the WorldState baseline.
    - Write model-visible history before its matching WorldState record, so
    an interrupted write can only cause a safe repeated update on replay.
    - Preserve WorldState records for full-history forks while excluding
    them from thread previews, metadata, and app-server history
    materialization.
    
    Older binaries read rollout lines independently, so they skip the
    unknown `world_state` records while retaining the rest of the thread.
    
    ## Testing
    
    - `just test -p codex-core
    snapshot_merge_patch_changes_and_removes_nested_values`
    - `just test -p codex-core
    world_state_baseline_deduplicates_until_history_is_replaced`
    - `just test -p codex-core
    deferred_executor_compaction_preserves_then_updates_environment_once`
    - `just test -p codex-protocol`
    - `just test -p codex-rollout`
    - `just test -p codex-state`
    - `just test -p codex-thread-store`
    - `just test -p codex-app-server-protocol`
  • feat(app-server): list descendant threads by ancestor (#29591)
    ## Why
    
    `thread/list` can filter direct children with `parentThreadId`, but
    clients cannot request an entire spawned subtree. Discovering every
    descendant requires repeated client-side requests and gives up the
    database's existing filtering and pagination path.
    
    ## What changed
    
    Experimental clients can use `ancestorThreadId` to return strict
    descendants at any depth while `parentThreadId` retains its direct-child
    meaning. The filters are mutually exclusive, the ancestor is excluded,
    and every result preserves its immediate `parentThreadId` so callers can
    reconstruct the tree.
    
    ## How it works
    
    - **Explicit relationship:** Internal list parameters distinguish direct
    children from transitive descendants without changing the meaning of
    `parentThreadId`.
    - **Existing graph:** Persisted parent-child spawn edges remain the
    source of truth, so descendant lookup needs no schema migration or
    ancestry cache.
    - **Indexed traversal:** A recursive SQLite query starts from the
    parent-edge index, walks each generation, and applies thread filters,
    sorting, and cursor pagination in the same database request.
    - **Reconstructable results:** The response stays flat and normally
    ordered while carrying each descendant's immediate parent.
    
    ## Verification
    
    Ran 550 tests across the protocol, state, rollout, and thread-store
    crates, then reran the four focused state, store, and app-server
    descendant-listing tests after the final diff reduction. Scoped Clippy
    and formatting checks passed. Stable and experimental schema generation
    was checked; the stable fixtures remain unchanged while the experimental
    schema includes the new field.
  • Persist agent messages as response items (#29829)
    ## Why
    
    Inter-agent messages are recorded in live history as
    `ResponseItem::AgentMessage`, but rollouts stored
    `InterAgentCommunication` and rebuilt the response item during resume.
    This made the rollout differ from the actual Responses history.
    
    ## What changed
    
    - store the prepared `agent_message` response item directly
    - keep `trigger_turn` in a small local metadata record for fork
    truncation
    - keep reading older `inter_agent_communication` rollout items
  • core: persist initial context window metadata (#29519)
    ## Why
    
    PR #29494 made context-window IDs visible to the model by wrapping the
    token-budget window payload in `<context_window>`, but rollout JSONL
    consumers still could not see the initial window identity by tailing the
    session file. Compacted rollout items carry window IDs only after
    compaction has happened, so a session with no compaction had no durable
    JSONL record for window 0.
    
    This change gives tailing consumers a stable initial-window record at
    session creation time.
    
    ## What Changed
    
    - Added `session_meta.context_window.window_id` for the initial
    context-window identity.
    - `CreateThreadParams` now requires `initial_window_id: String`, so
    thread-store callers cannot accidentally create new threads without
    window-0 metadata.
    - Live thread creation derives the persisted initial window ID from the
    same `AutoCompactWindowIds` used to initialize `SessionState`, keeping
    runtime state and JSONL metadata aligned.
    - Rollout reconstruction uses `session_meta.context_window.window_id` as
    the initial-window fallback and derives `window_number = 0`,
    `first_window_id = window_id`, and `previous_window_id = None`
    internally.
    - Fork reconstruction intentionally uses the same rollout reconstruction
    path; consumers that need to distinguish copied initial-window metadata
    can use the rollout `thread_id`.
    - Legacy compactions without `window_number` still use compaction-count
    fallback accounting instead of being reset to window 0 by the
    initial-window fallback.
    - Compacted rollout metadata still takes precedence once compaction
    records exist, preserving the richer chain fields there.
    
    ## JSONL Shape
    
    Real rollout JSONL is one object per line. This example is expanded for
    readability, but shows the new initial `session_meta.context_window`
    record followed by the existing compacted rollout item shape that also
    carries window IDs:
    
    ```jsonl
    {
      "timestamp": "2026-06-22T12:00:00.000Z",
      "type": "session_meta",
      "payload": {
        "session_id": "<THREAD_ID>",
        "id": "<THREAD_ID>",
        "timestamp": "2026-06-22T12:00:00.000Z",
        "cwd": "/repo",
        "originator": "codex",
        "cli_version": "0.0.0",
        "source": "cli",
        "model_provider": "<MODEL_PROVIDER>",
        "context_window": {
          "window_id": "<INITIAL_WINDOW_ID>"
        }
      }
    }
    ...
    {
      "timestamp": "2026-06-22T12:34:56.000Z",
      "type": "compacted",
      "payload": {
        "message": "<COMPACTION_SUMMARY>",
        "replacement_history": [
          "..."
        ],
        "window_number": 1,
        "first_window_id": "<INITIAL_WINDOW_ID>",
        "previous_window_id": "<INITIAL_WINDOW_ID>",
        "window_id": "<NEXT_WINDOW_ID>"
      }
    }
    ```
    
    The nested `context_window` object is intentional: it gives rollout
    consumers a stable namespace for context-window metadata while only
    writing the non-derivable initial `window_id`. For the initial window,
    `window_number`, `first_window_id`, and `previous_window_id` are derived
    internally instead of being written to the rollout.
    
    ## Verification
    
    - `just test -p codex-protocol`
    - `just test -p codex-rollout
    recorder_materializes_on_flush_with_pending_items`
    - `just test -p codex-core reconstruct_history`
    - `just test -p codex-core
    record_initial_history_reconstructs_forked_transcript`
    - `just test -p codex-thread-store`
    - `just test -p codex-state`
    - `just test -p codex-app-server
    thread_read_returns_summary_without_turns`
    - `just test -p codex-rollout persistence_metrics`
  • Stop persisting bridged log events (#29599)
    ## Why
    
    The 0.142.0 persistent-log filter disables target=log, but bridged log
    records are filtered using their original dependency target before
    tracing-log emits them as target=log. This allowed high-volume
    dependency TRACE events to keep reaching SQLite.
    
    This is a follow-up to #28224.
    
    ## What changed
    
    - Reject bridged target=log events inside the SQLite sink before
    formatting or queueing them.
  • core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
    ## Description
    This PR cuts Codex over from generic `ResponseItem.metadata` (introduced
    here: https://github.com/openai/codex/pull/28355) to
    `ResponseItem.internal_chat_message_metadata_passthrough`, which is the
    blessed path and has strongly-typed keys.
    
    For now we have to drop this MAv2 usage of `metadata`:
    https://github.com/openai/codex/pull/28561 until we figure out where
    that should live.
  • Filter noisy targets from persistent logs (#29457)
    ## Why
    
    The local SQLite log sink currently enables TRACE for every target. This
    persists high-volume dependency logs bridged through `target=log` and
    duplicates OpenTelemetry mirror events in `codex_otel.log_only` and
    `codex_otel.trace_safe`.
    
    These records rapidly consume the per-partition log budget and cause
    unnecessary SQLite insert-and-prune churn.
    
    ## What changed
    
    - Keep TRACE persistence for other targets.
    - Exclude bridged `target=log` events from the SQLite sink.
    - Exclude the two `codex_otel` mirror targets from the SQLite sink.
    - Share the same filter between app-server and TUI.
    
    Remote OpenTelemetry export and metrics are unchanged.
  • Persist session IDs across thread resume (#29327)
    ## Summary
    
    A cold-resumed subagent kept its durable thread ID but could receive a
    new session ID, splitting one agent tree across multiple sessions after
    a restart.
    
    Persist the root session ID in every rollout `SessionMeta`, carry it
    through thread creation, and restore it before initializing the resumed
    `Session` and `AgentControl`.
    
    ## Behavior
    
    For a nested agent tree:
    
    ```text
    root session R
      parent thread P
        child thread C
    ```
    
    The child rollout stores:
    
    ```text
    session_id:       R
    parent_thread_id: P
    id:               C
    ```
    
    After a cold resume, the child still belongs to root session `R` while
    its immediate parent remains `P`. The integration coverage uses distinct
    values for all three IDs so it catches restoring the session from
    `parent_thread_id`.
    
    ## Legacy rollouts
    
    Previous rollouts have `id` but no `session_id`. `SessionMetaLine`
    deserialization treats a missing `session_id` as `id`, keeping those
    files readable, listable, and resumable. When a legacy subagent is
    resumed through its root, that synthesized child ID no longer overrides
    the inherited root-scoped `AgentControl`. New rollouts always persist
    the explicit root session ID.
  • Add per-turn multi-agent mode (#28685)
    ## Why
    
    Multi-agent v2 currently carries an explicit-request-only delegation
    rule in its static usage hint. That provides a safe default, but it
    prevents clients from selecting proactive delegation per turn without
    changing static guidance or rewriting prior model context.
    
    This change makes delegation mode a session selection that can be
    updated through `turn/start`, while deriving the effective model-visible
    mode separately for each turn. Eligible multi-agent v2 turns remain
    explicit-request-only unless proactive mode is both selected and
    enabled.
    
    ## What changed
    
    - Add the experimental `turn/start.multiAgentMode` parameter with
    `explicitRequestOnly` and `proactive` values. Omission retains the
    loaded session's current optional selection.
    - Add the default-off `features.multi_agent_mode` feature gate. Eligible
    multi-agent v2 turns use the selected mode when enabled; an unset
    selection or disabled gate resolves to `explicitRequestOnly`.
    - Treat mode prompting as inapplicable for multi-agent v1 and other
    unsupported session configurations, producing no multi-agent mode
    developer message rather than rejecting the turn.
    - Move the explicit-request-only rule out of the static v2 usage hint
    and into a bounded, tagged developer context fragment.
    - Emit the effective mode in initial context and only when that
    effective mode changes on later turns.
    - Persist the effective mode in `TurnContextItem` as the durable
    baseline for resume and context-update comparisons.
    
    Historical rollout items are not rewritten. Later mode developer
    messages establish the current rule incrementally.
    
    ## Not covered
    
    - Initial selection through `thread/start` and selected-mode reporting
    from thread lifecycle/settings APIs; those are isolated in the stacked
    #28792.
    - A TUI control or slash command for selecting the mode.
    - Persisting a preferred mode to `config.toml`; selection remains
    session/turn scoped.
    - Changes to multi-agent concurrency limits, tool availability, or model
    catalog capability declarations.
    - Rewriting historical rollout prompt items. Cold resume restores the
    latest persisted effective mode when available while leaving historical
    developer messages intact.
    
    ## Verification
    
    - `CARGO_INCREMENTAL=0 just test -p codex-core multi_agent_mode`
    - Focused app-server coverage verifies that `turn/start.multiAgentMode`
    produces proactive developer instructions for an eligible v2 turn.
    
    ## Stack
    
    Followed by #28792, which adds `thread/start` initialization and
    lifecycle/settings observability.
  • [codex] Restore thread recency with compatible migration history (#28671)
    ## Summary
    
    - Revert #28655, restoring the thread `recencyAt` behavior introduced by
    #27910.
    - Move `threads_recency_at` to migration 0039 so it no longer collides
    with `external_agent_config_imports` at version 0038.
    - Repair databases that already applied the recency migration as version
    38 by moving the matching migration-history row to version 39 before
    SQLx validation. The current version-38 migration can then apply
    normally.
    
    ## Validation
    
    - `just test -p codex-state
    migrations::tests::repairs_recency_migration_that_was_applied_as_version_38`
    - `just test -p codex-state -p codex-rollout -p codex-thread-store -p
    codex-app-server-protocol -p codex-tui`: 3,439 passed; six TUI tests
    could not open the machine's existing read-only incident database at
    `~/.codex/sqlite/state_5.sqlite`.
    - `just fix -p codex-state`
    - `just fmt`
    - Verified that state migration versions are unique.
  • Revert thread recencyAt for sidebar ordering (#28655)
    ## Why
    
    Revert #27910 to remove the newly introduced thread `recencyAt`
    persistence and API behavior from `main`.
    
    ## What changed
    
    This reverts commit `fac3158c2a783095768076489815f361fa9b0db4`,
    including the state migration, thread-store propagation, app-server API
    surface, generated schemas, and related tests.
    
    ## Validation
    
    Not run before opening; relying on CI for the initial fast signal.
  • [codex] core: restore absolute turn context cwd (#28629)
    ## Why
    
    #28152 jumped the gun on moving the rollout format to store URIs, and
    would likely break compat with some features that don't go through the
    same types as the core logic.
    
    ## What
    
    Make `TurnContextItem.cwd` an `AbsolutePathBuf` again, remove test added
    for `PathUri` serialization in rollouts. Also drops a bunch of error
    paths that are no longer needed.
  • Add thread recencyAt for sidebar ordering (#27910)
    ## Summary
    
    Add a server-owned `recencyAt` timestamp and `recency_at` thread-list
    sort key for product recency ordering while preserving the existing
    meaning of `updatedAt` as the latest persisted thread mutation.
    
    This is the server-side alternative to #27697. Rather than narrowing
    `updatedAt`, clients can sort the sidebar by `recency_at` and continue
    treating `updatedAt` as mutation time.
    
    Paired Codex Apps PR:
    [openai/openai#1024599](https://github.com/openai/openai/pull/1024599)
    
    ## Contract
    
    - `recencyAt` initializes when a thread is created.
    - A turn start advances `recencyAt` monotonically.
    - Commentary, agent output, tool results, token/accounting updates, turn
    completion, archive, unarchive, resume, and generic metadata writes do
    not advance it.
    - `updatedAt` retains its existing behavior and continues to advance for
    persisted thread mutations.
    - Current servers populate `recencyAt`; the response field is optional
    in generated TypeScript so clients connected to older servers can fall
    back to `updatedAt`.
    - Filesystem-only fallback uses existing updated/mtime ordering when
    SQLite is unavailable.
    
    ## Persistence and compatibility
    
    Migration 0038 adds second- and millisecond-precision recency columns,
    backfills them from the existing updated timestamp, creates list
    indexes, and includes an insert trigger so older binaries writing to a
    migrated database seed recency without causing later mutations to
    advance it.
    
    Generic metadata upserts preserve existing recency values. Turn-start
    updates use a dedicated monotonic touch, and process-local allocation
    keeps millisecond cursor values unique. State DB list, search, read,
    filtered-list repair, rollout fallback propagation, and app-server
    conversions all carry the new field.
    
    ## API
    
    `Thread` responses include:
    
    ```ts
    recencyAt?: number
    ```
    
    `thread/list` and `thread/search` accept:
    
    ```json
    { "sortKey": "recency_at" }
    ```
    
    Generated TypeScript and JSON schemas are included.
    
    ## Validation
    
    - `just test -p codex-state` — 146 passed
    - `just test -p codex-rollout` — 69 passed
    - `just test -p codex-thread-store` — 81 passed
    - `just test -p codex-app-server-protocol` — 231 passed
    - Focused app-server list ordering, response mapping, archive/unarchive,
    and resume lifecycle tests passed
    - Scoped `just fix` for state, rollout, thread-store,
    app-server-protocol, and app-server
    - `just fmt`
    - `git diff --check`
    - Independent correctness, simplicity, elegance, security, and
    test-quality reviews; actionable ordering, lifecycle, query-projection,
    and timestamp-uniqueness findings were addressed
  • core: render remote environment cwd natively (#28152)
    ## Why
    
    Model-visible `<environment_context>` should match the environment of
    the executor, not of the app server.
    
    Stacked on #28146.
    
    ## What
    
    - Keep selected environment cwd values as `PathUri` while building
    environment context.
    - Render cwd text using the path convention represented by the URI, with
    the canonical URI as a fallback.
    - Preserve compatibility with legacy `TurnContextItem.cwd` values when
    reconstructing and diffing context.
    - Extend the Wine-backed remote Windows test to assert that the model
    sees `powershell` and `C:\windows`.
  • [codex] Record external agent import results (#28396)
    ## Summary
    - restore `externalAgentConfig/import/progress` notifications while
    keeping `externalAgentConfig/import/completed` as the must-deliver event
    - persist completed external-agent config imports in state DB by
    `importId`, including concrete success/failure details for config,
    AGENTS.md, skills, plugins, MCP servers, subagents, hooks, commands, and
    sessions
    - add `externalAgentConfig/import/readHistories` so clients can recover
    persisted import results after missing the live completion notification
    - include `errorType` on import failures in protocol
    responses/notifications and persisted DB JSON so future code can
    classify failures without another wire/storage shape change
    
    ## Validation
    - `git diff --check`
    - `just test -p codex-state external_agent_config_imports`
    - `just test -p codex-app-server-protocol`
    - `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-sqlite-read-details
    just test -p codex-app-server
    external_agent_config_import_sends_completion_notification_for_sync_only_import`
    
    Also ran earlier broader checks before publishing:
    - `just test -p codex-state`
    -
    `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-external-agent-test-sqlite
    just test -p codex-app-server external_agent_config`
    - `just test -p codex-external-agent-migration`
  • Use PathUri in filesystem permission paths for exec-server (#28165)
    ## Why
    
    Progress towards letting app-server and exec-server run on different
    platforms, specifically for sandbox configuration.
    
    ## What
    
    - Make the filesystem path containment hierarchy generic, defaulting to
    `AbsolutePathBuf` for now.
    - Have clients specify `AbsolutePathBuf` or `PathUri` directly where
    needed.
    - Use `PathUri` throughout exec-server filesystem protocol and trait
    boundaries.
    - Implement `From` for conversion to path URIs and `TryFrom` for
    fallible conversion to absolute paths through the generic type
    hierarchy.
  • feat(core): add metadata field to ResponseItem (#28355)
    ## Description
    
    This PR adds an optional `metadata` field to `ResponseItem` for
    Responses API calls. Only mechanical plumbing, no actual values
    populated and sent yet. Turns out just adding a new field to
    `ResponseItem` has quite a large blast radius already.
    
    This change is backwards compatible because `metadata` is optional and
    omitted when absent, so existing response items and rollout history
    without it still deserialize and requests that do not set it keep the
    same wire shape. For provider compatibility, we strip out `metadata`
    before non-OpenAI Responses requests so Azure and AWS Bedrock never see
    this field.
    
    My followup PR here will actually make use of it to start storing and
    passing along `turn_id`: https://github.com/openai/codex/pull/28360
    
    ## What changed
    
    - Added `ResponseItemMetadata` with optional `turn_id`, plus optional
    `metadata` on Responses API item variants and inter-agent communication.
    - Preserved item metadata through response-item rewrites such as
    truncation, missing tool-output synthesis, compaction history
    rebuilding, visible-history conversion, rollout/resume, and generated
    app-server schemas/types.
    - Strip item metadata from non-OpenAI Responses requests while
    preserving it for OpenAI-shaped requests.
    - Updated the mechanical fixture/test construction churn required by the
    new optional field.
  • feat(app-server): filter threads by parent (#26662)
    ## Why
    
    Clients that display or coordinate spawned subagents need an
    authoritative snapshot of a thread's immediate spawned children when
    they connect to app-server or recover after missing live events.
    `thread/list` cannot query by parent, so clients must otherwise scan
    unrelated threads or reconstruct relationships from rollout history and
    transient events.
    
    The direct spawn relationship already exists in persisted
    `thread_spawn_edges` state. Review and Guardian threads do not
    participate in that lifecycle and are intentionally outside this
    filter's scope.
    
    ## What changed
    
    This adds an experimental `parentThreadId` filter to `thread/list`.
    Parent-filtered requests return direct spawned children from persisted
    state while preserving the existing response shape, explicit filters,
    sorting, and timestamp-only cursor behavior. The lookup does not read
    rollout transcripts or recursively return descendants.
    
    Supersedes #25112 with the narrower `thread/list` filter approach.
    
    ## How it works
    
    1. An experimental client passes a valid thread ID as `parentThreadId`.
    2. App-server routes the list through the existing thread-store and
    state-database boundaries.
    3. SQLite selects threads whose IDs have a direct persisted spawn edge
    from that parent.
    4. Omitted provider and source filters include all values; explicit
    filters keep ordinary `thread/list` semantics.
    5. Grandchildren, Review threads, and Guardian threads are excluded.
    
    ## Verification
    
    State (144 tests), rollout (69 tests), and focused app-server
    thread-list (31 tests) suites passed. Scoped Clippy checks and
    repository formatting also passed. Coverage includes direct spawned
    children, omitted grandchildren, pagination, malformed IDs, mixed source
    kinds, explicit filters, and operation without rollout files.
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • [codex] Pin bundled SQLite to fixed WAL-reset version (#27992)
    ## Summary
    
    Prevent dependency refreshes from silently downgrading Codex's bundled
    SQLite to a release affected by the WAL-reset corruption bug.
    
    SQLx 0.9 accepts a broad `libsqlite3-sys` range. An unrelated lock
    refresh therefore moved Codex from `libsqlite3-sys 0.37.0` back to
    `0.35.0`, changing the bundled SQLite runtime from 3.51.3 to 3.50.2.
    SQLite documents the affected versions and fix in [The WAL Reset
    Bug](https://www.sqlite.org/wal.html#the_wal_reset_bug) and the [SQLite
    3.51.3 changelog](https://www.sqlite.org/changes.html#version_3_51_3).
  • Support plaintext agent messages (#27830)
    ## Why
    
    Multi-agent v2 `send_message` deliveries already reach the receiving
    model as typed `agent_message` items with encrypted content.
    Child-completion notifications are generated by Codex itself, so their
    content is plaintext and previously fell back to a serialized JSON
    envelope inside an assistant message.
    
    With plaintext `input_text` supported for `agent_message`, both delivery
    paths can use the same model-visible type while preserving explicit
    author and recipient metadata.
    
    ## What changed
    
    - add plaintext `input_text` support to `AgentMessageInputContent` and
    regenerate the affected app-server schemas
    - preserve `InterAgentCommunication` as structured mailbox input instead
    of converting it to assistant text
    - record delivered communications as typed `agent_message` history items
    - persist a dedicated rollout item so local delivery metadata such as
    `trigger_turn` remains available without leaking into the Responses
    request
    - reconstruct typed agent messages on resume and preserve fork-turn
    truncation behavior
    - remove request-time assistant-content parsing
    - preserve plaintext and encrypted inter-agent deliveries in stage-one
    memory inputs
    - normalize and link plaintext and encrypted agent messages in rollout
    traces without treating inbound messages as child results
    - cover the real MultiAgent V2 child-completion path end to end with
    deterministic mailbox synchronization
    
    ## Verification
    
    - `just test -p codex-core
    plaintext_multi_agent_v2_completion_sends_agent_message`
    - `just test -p codex-core input_queue_drains_mailbox_in_delivery_order
    record_initial_history_reconstructs_typed_inter_agent_message
    fork_turn_positions_use_inter_agent_delivery_metadata`
    - `just test -p codex-memories-write
    serializes_inter_agent_communications_for_memory`
    - `just test -p codex-rollout-trace
    agent_messages_preserve_routing_and_content
    sub_agent_started_activity_creates_spawn_edge`
    - `just test -p codex-rollout-trace
    agent_result_edge_falls_back_to_child_thread_without_result_message`
    - `just test -p codex-protocol -p codex-rollout -p
    codex-app-server-protocol`
  • feat(app-server): persist remote-control desired state (#27445)
    ## Why
    
    Remote-control runtime enablement and persisted enrollment preference
    were represented by separate flags. That made startup rehydration, RPC
    persistence, and new-enrollment seeding race with one another, and it
    did not cleanly distinguish runtime-only CLI or daemon starts from
    durable app-server RPC changes.
    
    ## What Changed
    
    - Replace the parallel enablement, seed, and rehydration flags with one
    transport-owned `RemoteControlDesiredState`.
    - Add nullable enrollment-scoped persistence and preserve existing
    preferences during enrollment upserts.
    - Rehydrate plain startup only after auth and client scope resolve,
    without overwriting a concurrent RPC transition.
    - Make ordinary `remoteControl/enable` and `remoteControl/disable`
    durable while retaining `ephemeral: true` for runtime-only callers.
    - Have the daemon explicitly request ephemeral enablement and regenerate
    the app-server schemas.
    
    ## Verification
    
    - Covered migration and `NULL`/`0`/`1` persistence round trips.
    - Covered plain-start rehydration and runtime-only versus durable
    enrollment seeding.
    - Covered durable enable, durable disable, and ephemeral enable through
    app-server RPC.
    - Covered the daemon's exact `{ "ephemeral": true }` request payload.
    
    Related issue: N/A (internal remote-control persistence architecture
    change).
  • [codex] Compact when comp_hash changes (#27520)
    ## Summary
    - snapshot `comp_hash` into `TurnContext` when the turn is created and
    use that snapshot as the downstream source of truth
    - persist the turn hash in rollout context and recover it into
    previous-turn settings during resume and fork replay
    - compact existing history with the previous model only when both
    adjacent turns provide hashes and the values differ
    - record `comp_hash_changed` as the compaction reason
    - cover ordinary transitions, resume, and missing-hash compatibility
    with end-to-end tests
    
    ## Why
    History produced under one compaction-compatible model configuration may
    not be safe to carry directly into another. Compacting at the turn
    boundary converts that history before context updates and the new user
    message are added. Persisting the turn snapshot in `TurnContextItem`
    makes the same protection work after resuming a rollout.
    
    A missing hash is not treated as evidence of incompatibility. `None →
    Some`, `Some → None`, and `None → None` do not trigger compaction; only
    `Some(previous) → Some(current)` with unequal values does.
    
    ## Stack
    - depends on #27532
    - #27532 is based directly on `main`
    
    ## Testing
    - `just test -p codex-core pre_sampling_compact_` — 6 passed
    - `just test -p codex-core
    turn_context_item_uses_turn_context_comp_hash_snapshot` — passed
    - `just fix -p codex-core -p codex-protocol -p codex-analytics -p
    codex-models-manager`
  • fix: Auto-recover from corrupted sqlite databases (#26859)
    Further investigation of the sqlite incidents showed that the problems
    are due to corruption from the older version of SQLite that we recently
    upgraded, and that the data is truly corrupted in the root database --
    recovery of all data is not possible. Given that the data is
    reconstructable from the rollouts on disk, we should just auto-backup
    the database and let codex rebuild the rollout info from the disk
    rollouts.
    
    The new behavior is that appserver auto-backs-up and rebuilds (with logs
    reflecting that behavior). The CLI now pops a message letting you know
    this happened and the paths of the backed-up corrupt db and the new
    database. There is also context added so that the desktop app can read
    the rebuild info from it and inform the user with it.
  • Add app-server thread/delete API (#25018)
    ## Why
    
    Clients can archive and unarchive threads today, but there is no
    app-server API for permanently removing a thread. Deletion also needs to
    cover the full session tree: deleting a main thread should remove
    spawned subagent threads and the related local metadata instead of
    leaving orphaned rollout files, goals, or subagent state behind.
    
    ## What
    
    - Adds the v2 `thread/delete` request and `thread/deleted` notification,
    with the response shape kept consistent with `thread/archive`.
    - Implements local hard delete for active and archived rollout files.
    - Deletes the requested thread's state DB row as the commit point, then
    best-effort cleans associated state including spawned descendants,
    goals, spawn edges, logs, dynamic tools, and agent job assignments.
    - Updates app-server API docs and generated protocol schema/TypeScript
    fixtures.
  • Index visible thread list ordering (#27391)
    ## Summary
    
    - add partial SQLite indexes for visible thread lists ordered by
    creation or update time
    - match the `archived` and non-empty `preview` filters used by
    `thread/list`
    - add query-plan coverage for both supported sort orders
    
    ## Query performance
    
    Benchmarked the production query shape on a snapshot of my database with
    ~10k threads before and after applying these indexes. The query selected
    the full thread projection with `archived = 0`, `preview <> ''`, the
    `openai` provider filter, and a page size of 201. Results are the mean
    of 30 runs after 5 warmups:
    
    | Query | Before | After | Speedup |
    | --- | ---: | ---: | ---: |
    | First page, `created_at_ms DESC` | 132.3 ms | 15.1 ms | 8.78x |
    | First page, `updated_at_ms DESC` | 123.6 ms | 15.5 ms | 7.99x |
    | Cursor page near row 4,000, `created_at_ms DESC` | 51.8 ms | 16.8 ms |
    3.07x |
    | Cursor page near row 4,000, `updated_at_ms DESC` | 52.4 ms | 17.1 ms |
    3.06x |
    
    Before this change, SQLite used `idx_threads_archived`, filtered the
    candidate rows, and built a temporary B-tree for the requested ordering.
    With the partial indexes, SQLite reads matching visible rows directly in
    timestamp order and stops at the page limit. `EXPLAIN QUERY PLAN` no
    longer reports `USE TEMP B-TREE FOR ORDER BY`.
    
    The result rows were identical before and after. The two partial indexes
    occupy approximately 168 KiB combined on this snapshot.
    
    ## Performance under contention
    
    I noticed this issue on a database with high-contention and tried to use
    simulated contention to validate the performance in that context.
    
    A synthetic SQLite benchmark ran five concurrent readers, matching the
    state database pool size, and fetched 101 rows per query. Results are
    the median of three runs on fresh copies of the same database snapshot:
    
    | Query | Before | After |
    | --- | ---: | ---: |
    | `created_at_ms` mean latency under saturation | 328 ms | 12 ms |
    | `created_at_ms` throughput | 16 queries/s | 412 queries/s |
    | `updated_at_ms` mean latency under saturation | 336 ms | 14 ms |
    | `updated_at_ms` throughput | 15 queries/s | 357 queries/s |
    
    For a burst of 100 queries queued through five connections, p95
    completion time fell from 6.90 seconds to 226 ms for `created_at_ms`,
    and from 6.31 seconds to 473 ms for `updated_at_ms`.
    
    ## Validation
    
    - `just test -p codex-state` (135 tests passed)
    - query-plan regression covers created-at and updated-at ordering,
    requires the corresponding index, and rejects `TEMP B-TREE`
    - `just fmt`
  • [codex-analytics] emit goal lifecycle analytics (#27078)
    ## Why
    - Currently, there is no analytics event for `/goal` behavior
    - Existing events cannot identify goal execution or its resulting
    outcome
    - The original update in
    [#26182](https://github.com/openai/codex/pull/26182) was implemented
    before `/goal` moved into `codex-goal-extension`.
    
    ## What Changed
    - Adds `codex_goal_event` serialization and enrichment to
    `codex-analytics`
    - Emits goal events from the canonical `codex-goal-extension` mutation
    and accounting paths:
      - `created` when a new logical goal is persisted
      - `usage_accounted` when cumulative goal usage is persisted
      - `status_changed` when the stored goal status changes
      - `cleared` when the goal is deleted
    - Preserves causal `turn_id` for turn driven events and uses null
    attribution for external or idle lifecycle events
    - Changes goal deletion to return the deleted row so `cleared` retains
    the stable goal ID
    
    ## Event Details
    
    Includes standard analytics metadata along with goal specific fields:
    - `goal_id`: Stable ID stored in the local SQLite goal row and shared
    across the goal's events
    - `event_kind`: Observed operation (see the 4 lifecycle events cited in
    the above bullet)
    - `goal_status`: Resulting or last stored status: `active`, `paused`,
    `blocked`, `usage_limited`, etc.
      - `has_token_budget`: Indicates whether a token budget is configured
      - `turn_id`: Causal turn ID, or null when no causal turn exists
    - `cumulative_tokens_accounted`: Cumulative tokens on `usage_accounted`
    events; null otherwise
    - `cumulative_time_accounted_seconds`: Cumulative active time on
    `usage_accounted` events; null otherwise
    
    ## Validation
    - `just test -p codex-analytics -p codex-state -p codex-goal-extension`
    - `just test -p codex-core -E 'test(/goal/)'`
    - `just test -p codex-app-server`
    - `cargo build -p codex-analytics -p codex-core -p codex-state -p
    codex-app-server`
  • Allow creating a new goal after completion (#26681)
    ## Why
    
    Users have indicated that they want an agent to be able to create a new
    goal for itself after completing the previous goal. Currently, that's
    not possible because agents cannot overwrite an existing goal even if
    it's complete. This PR removes this limitation and allows `create_goal`
    to overwrite an existing goal if it is in the `complete` state.
    
    ## What changed
    
    `create_goal` now replaces the existing goal only when its status is
    `complete`. The replacement is performed atomically in the goal store,
    creates a fresh active goal with reset usage, and continues to reject
    creation while any unfinished goal exists. App server clients see a
    single `thread/goal/updated` event when the previous goal is replaced
    with the new one.
    
    The tool description and error message now reflect these semantics.
    
    ## What didn't change
    
    Agents are not allowed to create a new goal (overwrite their existing
    goal) if an existing goal is still active, blocked, paused, or in any
    other state other than "completed".
  • [codex-analytics] add extensible feature thread sources (#27063)
    ## Why
    - `ThreadSource` currently defines a closed set of core-owned values
    - Product features also create threads for background or scheduled work
    - Adding every product-specific value to the core enum would require
    repeated `codex-rs` protocol changes
    - Feature-backed values let product callers provide precise attribution
    while preserving the existing core classifications
    
    ## What Changed
    - Adds `ThreadSource::Feature(String)` for app-owned thread source
    values
    - Represents all app-server v2 thread sources as scalar strings, so a
    feature source is supplied as `"automation"`
    - Persists and emits the feature's plain string label, so `"automation"`
    produces `thread_source="automation"` in analytics
    - Keeps `user`, `subagent`, and `memory_consolidation` as explicit
    core-owned values and regenerates the app-server schemas and TypeScript
    bindings
    
    ## Verification
    - `just write-app-server-schema`
    - `cargo check --workspace`
    - `just test -p codex-protocol
    feature_thread_source_serializes_as_its_app_owned_label`
    - `just test -p codex-app-server-protocol
    thread_sources_round_trip_as_scalar_labels`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `just fmt`
  • Avoid no-op backfill state writes (#26420)
    ## Summary
    
    - avoid acquiring SQLite's writer slot when the singleton backfill row
    already exists
    - preserve race-safe repair when the row is missing
    - add regressions for writer contention and missing-row repair
    
    ## Why
    
    State runtime initialization and backfill-state reads previously
    executed
    `INSERT ... ON CONFLICT DO NOTHING` even in the steady state. SQLite
    still
    enters the writer path for that statement, so TUI and app-server startup
    could
    wait behind another writer for up to the configured five-second busy
    timeout.
    
    ## Validation
    
    - `just test -p codex-state` (134 tests passed)
    - `just fix -p codex-state`
    - `just fmt`
  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • Reduce SQLite contention from OpenTelemetry SDK debug logs (#26396)
    ## Summary
    
    - skip `opentelemetry_sdk` DEBUG and TRACE events before formatting or
    queueing them for the SQLite log sink
    - preserve INFO, WARN, and ERROR events from the SDK, along with TRACE
    events from application targets
    - add a persistence-level regression test for the target and level
    policy
    
    ## Why
    
    OpenTelemetry's batch log processor emits internal
    `BatchLogProcessor.ExportingDueToTimer` meta-events every second per
    Codex process. In measured high-fanout `logs_2.sqlite` databases,
    low-level `opentelemetry_sdk` events accounted for over 30% of retained
    rows (30-60% on the machines of people I asked to check).
    
    Persisting this SDK bookkeeping across many processes adds substantial
    write volume and contention without representing application activity.
    
    ## Validation
    
    - `just test -p codex-state` (132/132 tests passed, plus bench smoke)
    - `just fix -p codex-state`
    - `just fmt`
  • Add multi-agent runtime metadata types (#25720)
    Stack split from #25708. Original PR intentionally left open. This first
    PR adds the multi-agent runtime metadata types and catalog plumbing used
    by the rest of the stack.
  • Preserve renamed thread titles during reconciliation (#25624)
    ## Summary
    - preserve existing explicit SQLite thread titles during rollout
    reconciliation/backfill when the incoming rollout title is only
    first-message-derived
    - keep stale inferred-title repair behavior while avoiding session-index
    scans during startup backfill
    - add a regression test for renamed titles surviving reconcile
    
    ## Testing
    - just fmt
    - just test -p codex-rollout
    - just test -p codex-state
  • store and expose parent_thread_id on Threads (#25113)
    ## Why
    
    This PR
    https://github.com/openai/codex/pull/24161#discussion_r3325692763
    revealed a subagent data modeling issue, where we overloaded
    `forked_from_id` to also mean `parent_thread_id`. That's incorrect since
    guardian and review subagents can be a subagent and NOT fork the main
    thread's history.
    
    The solution here is to explicitly store a new `parent_thread_id` on
    `SessionMeta`, alongside `forked_from_id` which already exists. While
    we're at it, also expose it in the app-server protocol on the `Thread`
    object.
    
    A thread->subagent relationship and a fork of thread history are
    orthogonal concepts.
    
    ## What Changed
    
    - Added top-level `parent_thread_id` persistence on `SessionMeta` and
    runtime/session plumbing through `SessionConfiguredEvent`,
    `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`,
    `TurnContext`, and `ModelClient`.
    - Made turn metadata, request headers, analytics, and subagent-start
    events read the separate runtime/top-level parent field instead of
    deriving general parent lineage from `SessionSource` or
    `forked_from_thread_id`.
    - Passed parent lineage separately at delegated subagent, review,
    guardian, agent-job, and multi-agent spawn construction sites;
    copied-history fork lineage remains derived only from `InitialHistory`.
    - Persisted and exposed parent lineage through rollout/thread-store
    projections and app-server v2 `Thread.parentThreadId`.
    - Updated app-server README text and regenerated app-server schema
    fixtures for the additive `parentThreadId` response field.
  • thread-store: store permission profiles (#23165)
    ## Why
    
    `SandboxPolicy` is the legacy compatibility shape, but
    `codex-thread-store` still exposed it through `StoredThread`,
    `ThreadMetadataPatch`, and live metadata sync. That kept thread-store
    consumers tied to the legacy representation and meant richer permission
    profile data could not round-trip through thread metadata or cold
    rollout reconciliation.
    
    ## What Changed
    
    - Replaced thread-store `sandbox_policy` API fields with canonical
    `PermissionProfile` fields.
    - Persist new permission-profile metadata as canonical JSON in the
    existing SQLite metadata slot while continuing to read older legacy
    sandbox policy values.
    - Updated local, in-memory, live metadata sync, and rollout extraction
    paths to propagate `TurnContextItem::permission_profile()`.
    - Re-materialize legacy permission metadata against the final rollout
    cwd when rollout-derived metadata replaces stale SQLite summaries.
    - Updated affected app-server and core test constructors to build
    `PermissionProfile` values directly.
    
    ## Test Plan
    
    - `cargo test -p codex-state`
    - `cargo test -p codex-thread-store`
    - `cargo test -p codex-app-server
    summary_from_stored_thread_preserves_millisecond_precision --lib`
    - `cargo test -p codex-core realtime_context --lib`
  • Add subagent lineage metadata for responsesapi (#24161)
    ## Why
    
    We recently added `forked_from_thread_id` which lets us trace where a
    thread's _context_ comes from, but we also want to understand subagent
    lineage (e.g. which parent thread spawned this subagent? what kind of
    subagent is it?) which is orthogonal.
    
    This PR adds `parent_thread_id` and `subagent_kind` to the
    `x-codex-turn-metadata` header sent to ResponsesAPI.
    
    ## What changed
    
    - Adds `parent_thread_id` and `subagent_kind` to core-owned
    `x-codex-turn-metadata`.
    - Restores persisted `SessionSource` and `ThreadSource` from resumed
    session metadata so cold-resumed subagent threads keep their lineage on
    later Responses API requests.
    - Centralizes parent-thread extraction on `SessionSource` /
    `SubAgentSource` and reuses it in the Responses client, analytics, agent
    control, and state parsing paths.
    - Extends reserved-key, git-enrichment, thread-spawn, and app-server v2
    metadata coverage for the new lineage fields.
    
    ## Verification
    
    - Not run locally per request.
    - Added focused coverage in `core/src/turn_metadata_tests.rs` and
    `app-server/tests/suite/v2/client_metadata.rs`.
  • Surface filesystem permission profiles in prompt context (#23924)
    ## Summary
    Some permission profiles can encode filesystem reads that should remain
    unavailable to the agent. Before this change, the model-visible context
    and automatic approval review prompt summarized the effective
    permissions as a legacy sandbox mode, which can omit permission-profile
    filesystem entries from escalation decisions.
    
    For example, a profile can grant workspace access while denying a
    private subtree across every workspace root:
    
    ```toml
    default_permissions = "restricted-workspace"
    
    [permissions.restricted-workspace.workspace_roots]
    "/Users/alice/project" = true
    "/Users/alice/other-project" = true
    
    [permissions.restricted-workspace.filesystem]
    ":minimal" = "read"
    
    [permissions.restricted-workspace.filesystem.":workspace_roots"]
    "." = "write"
    "private" = "deny"
    "private/**" = "deny"
    ```
    
    The context window now describes the workspace roots and effective
    filesystem side of the `PermissionProfile` directly, with deny entries
    marked as non-escalatable:
    
    ```xml
    <environment_context>
      <cwd>/Users/alice/project</cwd>
      <shell>zsh</shell>
      <filesystem><workspace_roots><root>/Users/alice/project</root><root>/Users/alice/other-project</root></workspace_roots><permission_profile type="managed"><file_system type="restricted"><entry access="read"><special>:minimal</special></entry><entry access="write"><path>/Users/alice/project</path></entry><entry access="write"><path>/Users/alice/other-project</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/project/private</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/other-project/private</path></entry><entry access="deny" escalatable="false"><glob>/Users/alice/project/private/**</glob></entry><entry access="deny" escalatable="false"><glob>/Users/alice/other-project/private/**</glob></entry></file_system></permission_profile></filesystem>
    </environment_context>
    ```
    
    Managed requirements can impose the same kind of deny-read restriction:
    
    ```toml
    [permissions.filesystem]
    deny_read = [
      "/Users/alice/project/private",
      "/Users/alice/project/private/**",
    ]
    ```
    
    The automatic approval review prompt also receives the parent turn's
    denied-read context, so review decisions can account for the active
    permission profile.
    
    ## What Changed
    - Render the effective filesystem profile in `<environment_context>`,
    including profile type, filesystem entries, workspace roots, and
    non-escalatable deny entries.
    - Persist effective `workspace_roots` in `TurnContextItem` so
    resumed/replayed context does not have to bind `:workspace_roots`
    through legacy `cwd` fallback.
    - Add explicit permission instructions that denied reads are policy
    restrictions, not escalation targets.
    - Pass the parent turn's denied-read context into automatic approval
    reviews.
    - Add targeted coverage for prompt rendering, workspace-root
    materialization, replay context, and review prompt context.
    - Keep the prompt-context test expectations platform-aware so the same
    filesystem rendering assertions pass on Unix and Windows paths.
    
    ## Testing
    - `just test -p codex-core
    context::environment_context::tests::serialize_environment_context_with_full_filesystem_profile`
    - `just test -p codex-core
    context::environment_context::tests::turn_context_item_filesystem_uses_workspace_roots_instead_of_cwd`
    - `just test -p codex-core
    context::permissions_instructions::permissions_instructions_tests::builds_permissions_from_profile_with_denied_reads`
    - `just fix -p codex-core`
    
    I also attempted `just test -p codex-core`; the changed prompt-context
    tests passed, but the full local run did not complete cleanly in this
    sandboxed macOS environment due unrelated user-shell `CODEX_SANDBOX*`
    expectations and integration-test timeouts.
  • [codex] Add user input client ids (#24653)
    ## Summary
    
    Adds an optional `clientId` field to app-server v2 `UserInput` and
    carries it through the core `UserInput` model so clients can correlate
    echoed user input items without relying on payload equality.
    
    ## Details
    
    - Adds `client_id: Option<String>` to core `UserInput` variants.
    - Exposes the v2 app-server field as `clientId` on the wire and in
    generated TypeScript.
    - Preserves the id when converting between app-server v2 and core
    protocol types.
    - Regenerates app-server schema fixtures.
    
    ## Validation
    
    - `just fmt`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-protocol`
    - `git diff --check`
  • [codex] Remove redundant SQLite dynamic tool storage (#24819)
    ## Why
    
    Dynamic tools are defined at thread start and already stored in rollout
    `SessionMeta`, which restores resumed and forked sessions. Persisting
    the same tools through SQLite creates a second runtime persistence path
    that is unnecessary prework for the explicit namespace refactor.
    
    ## What changed
    
    - Restore missing thread-start dynamic tools directly from rollout
    history, including when SQLite is enabled.
    - Remove SQLite dynamic-tool reads, writes, backfill, and thread
    metadata patch plumbing.
    - Add SQLite-enabled resume integration coverage that verifies a
    rollout-defined dynamic tool is still sent after resume.
    
    ## Compatibility
    
    The existing `thread_dynamic_tools` table is intentionally not dropped
    even though it's now unused. Older Codex binaries are allowed to open
    databases migrated by newer binaries and still reference this table;
    dropping it would break that mixed-version path. See
    [here](https://github.com/openai/codex/blob/main/codex-rs/state/src/migrations.rs#L10-L11).
    
    ## Verification
    
    - `just test -p codex-state -p codex-rollout -p codex-thread-store`
    - `just test -p codex-core --test all
    resume_restores_dynamic_tools_from_rollout_with_sqlite_enabled`
  • Bump SQLx to pick up newer bundled SQLite (#24728)
    ## Why
    
    Codex stores thread, log, goal, and memory state in bundled SQLite
    databases through SQLx. We have a suspected SQLite WAL-reset corruption
    issue under heavy concurrent writer load, especially when multiple
    subagents are active. The existing `sqlx 0.8.6` dependency kept us on an
    older `libsqlite3-sys` / bundled SQLite, so this PR moves the SQLx stack
    far enough forward to pick up the newer bundled SQLite library.
    
    ## What changed
    
    - Bump the workspace `sqlx` dependency to `0.9.0`.
    - Use the SQLx 0.9 feature names explicitly: `runtime-tokio`,
    `tls-rustls`, and `sqlite-bundled`.
    - Update `Cargo.lock` so `sqlx-sqlite` resolves through `libsqlite3-sys
    0.37.0`.
    - Refresh `MODULE.bazel.lock` for the dependency changes.
    - Adapt `codex-state` to SQLx 0.9:
    - build dynamic state queries with `QueryBuilder<Sqlite>` instead of
    passing dynamic `String`s to `sqlx::query`;
    - remove the old `QueryBuilder` lifetime parameter from helper
    signatures;
    - preserve SQLx's new `Migrator` fields when constructing runtime
    migrators.
    
    ## Verification
    
    - `just test -p codex-state`
    - `just bazel-lock-check`
    - `cargo check -p codex-state --tests`
  • Uprev Rust toolchain pins to 1.95.0 (#24684)
    ## Summary
    - Bump the workspace Rust toolchain from `1.93.0` to `1.95.0` across
    Cargo, Bazel, CI, release workflows, devcontainers, and the Codex
    environment config.
    - Refresh `MODULE.bazel.lock` so the Bazel Rust toolchain artifacts
    match the new version.
    - Leave purpose-specific toolchains unchanged, including the
    `argument-comment-lint` nightly and the upstream `rusty_v8` `1.91.0`
    build pin.
    - Includes fixes for new lints from `just fix` and a few codex-authored
    fixes for lints without a suggestion.
  • Move memory state to a dedicated SQLite DB (#24591)
    ## Summary
    
    Generated memory rows and their stage-one/stage-two job state currently
    live in `state_5.sqlite` alongside thread metadata. That makes memory
    cleanup and regeneration share the main state schema even though those
    rows are memory-pipeline data and can be rebuilt independently from the
    durable thread records.
    
    This PR moves the memory-owned tables into a dedicated
    `memories_1.sqlite` runtime database while keeping thread metadata in
    `state_5.sqlite`.
    
    ## Changes
    
    - Adds a separate memories DB runtime, migrator, path helpers, telemetry
    kind, and Bazel compile data for `state/memory_migrations`.
    - Introduces `MemoryStore` behind `StateRuntime::memories()` and moves
    memory table/job operations onto that store.
    - Drops the old memory tables from the state DB and recreates their
    schema in `state/memory_migrations/0001_memories.sql`.
    - Updates memory startup, citation usage tracking, rollout pollution
    handling, `debug clear-memories`, and app-server `memory/reset` to
    operate through the memories DB.
    - Preserves cross-DB behavior by hydrating thread metadata from the
    state DB when selecting visible memory outputs and checking stage-one
    staleness.
    
    ## Verification
    
    - Added/updated `codex-state` tests for deleted-thread memory visibility
    and already-polluted phase-two enqueue behavior.
    - Updated `debug clear-memories`, app-server `memory/reset`, and
    memories startup tests to seed and assert memory rows through
    `memories_1.sqlite`.
  • Add doctor thread inventory audit (#24305)
    ## Why
    
    Users have been reporting missing sessions in the app. The app server
    thread listing is backed by the SQLite state DB, but the durable source
    of truth for a thread still exists on disk as rollout JSONL. When the
    state DB is incomplete, doctor should be able to show the mismatch
    directly instead of leaving users with a generic state health result.
    
    ## What changed
    
    This adds a `threads` doctor check that compares active and archived
    rollout files under `CODEX_HOME` with rows in the SQLite `threads`
    table. The check reports missing rollout rows, stale DB rows, archive
    flag mismatches, duplicate rollout thread IDs, duplicate DB paths,
    source/provider summaries, and bounded samples of affected rollout
    paths.
    
    It also adds a read-only state audit helper in `codex-rs/state` so
    doctor can inspect thread rows without creating, migrating, or repairing
    the database.
    
    ## Sample output
    
    ```text
      ⚠ threads      rollout files are missing from the state DB
          default model provider   openai
          rollout DB active files  3910
          rollout DB archived files 2037
          rollout DB scan errors   0
          rollout DB malformed file names 0
          rollout DB scan cap reached false
          rollout DB rows          5499
          rollout DB active rows   3462
          rollout DB archived rows 2037
          rollout DB missing active rows 448
          rollout DB missing archived rows 0
          rollout DB stale rows    0
          rollout DB archive mismatches 0
          rollout DB duplicate rollout thread ids 0
          rollout DB duplicate DB paths 0
          rollout DB model providers openai=5359, lmstudio=35, mock_provider=33, lite_llm=26, proxy=26, ollama=15, lms=4, local-usage-limit=1
          rollout DB sources       vscode=2587, cli=1494, subagent:thread_spawn=577, subagent:other=502, exec=281, subagent:memory_consolidation=46, subagent:review=9, unknown=3
          rollout DB missing active sample ~/.codex/sessions/2026/0…857e-a923c712e066.jsonl
          rollout DB missing active sample ~/.codex/sessions/2025/0…877a-766dff25c68d.jsonl
          rollout DB missing active sample ~/.codex/sessions/2025/0…a8b1-7bbadc836f6e.jsonl
          rollout DB missing active sample ~/.codex/sessions/2025/0…a218-e6197f3f62f8.jsonl
          rollout DB missing active sample ~/.codex/sessions/2025/0…9011-7e30784f9932.jsonl
    ```
  • fix: main (#23675)
    Fix main due to conflicting merges
    This is only fixing some imports and mechanics
  • feat: rename 2 (#23668)
    Just a mechanical renaming
  • feat: rename 3 (#23669)
    Just a mechanical renaming