24 Commits

  • [codex] Inject agent graph store into ThreadManager (#29736)
    Pick up the AgentGraphStore migration.
    
    - Inject an explicit optional agent graph store into `ThreadManager` 
    - Move all calls to spawn, close, recursive resume, and
    subtree/archive/delete/feedback traversal through it
    - Keep using `LocalAgentGraphStore` when SQLite is available
    
    This required some changes to the interface to deal with futures:
    
    - The interface now matches `ThreadStore`'s object-safe pattern by
    returning a boxed `AgentGraphStoreFuture` directly, allowing
    `ThreadManager` to hold `Arc<dyn AgentGraphStore>`
    
    *Slight behavior change!* Unfiltered subtree enumeration now performs a
    single all-status breadth-first traversal, so a closed grandchild
    beneath an open edge is included; the previous Open-then-Closed
    traversals could not cross mixed-status paths and silently omitted it.
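
    A minimal sketch of the two points above, assuming a much smaller store
    than the real one: a trait that stays object-safe by returning a boxed
    future, and a single all-status breadth-first walk. `EdgeStatus`,
    `children`, `InMemoryStore`, `subtree`, `demo_store`, and `block_on` are
    hypothetical stand-ins, not the actual `AgentGraphStore` API.

    ```rust
    use std::collections::{HashMap, VecDeque};
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};

    /// Hypothetical edge status; the real store's type may differ.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum EdgeStatus {
        Open,
        Closed,
    }

    /// Boxed future alias: keeps the trait object-safe without `async fn`.
    pub type AgentGraphStoreFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

    /// Simplified stand-in for the store interface (one method only).
    pub trait AgentGraphStore: Send + Sync {
        fn children<'a>(
            &'a self,
            parent: &'a str,
        ) -> AgentGraphStoreFuture<'a, Vec<(String, EdgeStatus)>>;
    }

    #[derive(Default)]
    pub struct InMemoryStore {
        pub edges: HashMap<String, Vec<(String, EdgeStatus)>>,
    }

    impl AgentGraphStore for InMemoryStore {
        fn children<'a>(
            &'a self,
            parent: &'a str,
        ) -> AgentGraphStoreFuture<'a, Vec<(String, EdgeStatus)>> {
            Box::pin(async move { self.edges.get(parent).cloned().unwrap_or_default() })
        }
    }

    /// Breadth-first subtree walk. `None` follows edges of every status in one pass.
    pub async fn subtree(
        store: &dyn AgentGraphStore,
        root: &str,
        filter: Option<EdgeStatus>,
    ) -> Vec<String> {
        let mut found = Vec::new();
        let mut queue = VecDeque::from([root.to_string()]);
        while let Some(parent) = queue.pop_front() {
            for (child, status) in store.children(&parent).await {
                if filter.map_or(true, |wanted| wanted == status) {
                    found.push(child.clone());
                    queue.push_back(child);
                }
            }
        }
        found
    }

    /// Builds: root --Open--> child --Closed--> grandchild
    pub fn demo_store() -> Arc<dyn AgentGraphStore> {
        let mut store = InMemoryStore::default();
        store.edges.insert("root".into(), vec![("child".into(), EdgeStatus::Open)]);
        store.edges.insert("child".into(), vec![("grandchild".into(), EdgeStatus::Closed)]);
        Arc::new(store)
    }

    struct NoopWake;

    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }

    /// Tiny busy-polling executor, sufficient here because these futures never wait.
    pub fn block_on<F: Future>(future: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut future = Box::pin(future);
        loop {
            if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
                return value;
            }
        }
    }
    ```

    With this graph, the unfiltered walk returns both `child` and
    `grandchild`, while an Open-only walk followed by a Closed-only walk
    finds just `child`.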
  • feat(exec-server): add Noise rendezvous environment (#28774)
    ## Why
    
    Codex can run a remote exec server through the Noise relay, but the
    normal environment-manager path could not establish an
    environment-registry-backed harness connection. Signed rendezvous URLs
    and harness authorizations are short-lived, so reconnects must fetch a
    fresh bundle instead of retaining stale connection credentials. A
    stalled registry request must also fail within the regular remote
    connection deadline, without exposing these credentials in debug logs.
    
    Issue: N/A (internal environment-service integration).
    
    ## What Changed
    
    - Add environment-manager configuration for a registry-backed Noise
      rendezvous environment.
    - Request a fresh bundle from
      `/cloud/environment/{environment_id}/connect` for every physical
      harness connection, using the existing 10-second remote connection
      timeout.
    - Share the Environment Registry register, connect, and validate wire
      payloads through `codex-exec-server` and `codex-core-api`.
    - Redact the signed rendezvous URL and harness authorization from the
      public connect response's `Debug` output.
    - Add focused coverage for registry bundle retrieval, stalled requests,
      and credential redaction.
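
    The `Debug` redaction can be sketched with a hand-written `fmt::Debug`
    impl. This is an illustration only: `ConnectResponse` and its field
    names are assumptions, not the real wire model.

    ```rust
    use std::fmt;

    /// Hypothetical connect response; the field names are illustrative.
    pub struct ConnectResponse {
        pub environment_id: String,
        pub rendezvous_url: String,
        pub harness_authorization: String,
    }

    // Implemented by hand instead of `#[derive(Debug)]` so that the
    // short-lived credentials never reach logs through `{:?}`.
    impl fmt::Debug for ConnectResponse {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            f.debug_struct("ConnectResponse")
                .field("environment_id", &self.environment_id)
                .field("rendezvous_url", &"<redacted>")
                .field("harness_authorization", &"<redacted>")
                .finish()
        }
    }
    ```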
  • exec-server: expose environment registry payloads (#28651)
    ## Why
    
    Services that proxy the exec-server environment registry endpoints need
    to deserialize and forward the same Noise registration and harness-key
    validation payloads. Those wire models currently live as private,
    serialize-only structs in `exec-server`, which forces consumers to
    duplicate the contract.
    
    ## What changed
    
    - Add owned serde models for registration and harness-key validation
    requests and responses.
    - Use those models in the existing exec-server registry client.
    - Re-export the models from `codex-exec-server` and `codex-core-api`.
    - Keep the harness authorization request free of a derived `Debug`
    implementation so it is not accidentally logged.
    
    ## Testing
    
    - Focused exec-server registration and harness-key validation tests: 2
    passed.
    - `cargo check -p codex-core-api`
    
    The full `codex-exec-server` suite compiled and ran 254 tests: 222
    passed, while 32 existing filesystem sandbox tests could not run under
    the nested macOS sandbox (`sandbox_apply: Operation not permitted`).
    
    Co-authored-by: Codex <noreply@openai.com>
  • Replace SkillsManager with SkillsService (#28705)
    ## Why
    
    Host skill discovery was still exposed as a manager even though it is a
    process-owned service shared by sessions, the app-server catalog, and
    file-watcher invalidation. The skills extension also consumed an ad hoc
    loaded-skills wrapper instead of a named immutable snapshot.
    
    ## What changed
    
    - replace `SkillsManager` with concrete `SkillsService`
    - make the service cache and return immutable `HostSkillsSnapshot`
    values
    - migrate the skills extension host provider to the snapshot boundary
    - migrate app-server catalog, watcher, and invalidation paths to the
    service
    
    This keeps the service limited to host discovery, caching, roots, and
    invalidation. Catalog rendering and invocation remain extension
    responsibilities for the next stacked change.
  • exec-server: add Noise relay transport (#26242)
    ## Why
    
    Rendezvous forwards traffic between the orchestrator and exec-server.
    The endpoints need to authenticate each other and encrypt that traffic
    without trusting Rendezvous with plaintext or endpoint keys.
    
    ## Changes
    
    - Adds a hybrid Noise IK channel through Clatter using X25519,
    ML-KEM-768, AES-256-GCM, and SHA-256.
    - Binds each handshake to `environment_id`, `executor_registration_id`,
    and `stream_id`.
    - Pins the registry-provided executor key and carries the harness
    authorization inside the encrypted handshake.
    - Orders relay frames before consuming Noise nonces and fragments large
    JSON-RPC messages into bounded records.
    - Bounds handshake payloads, frames, streams, and message reassembly.
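
    The fragmentation and bounded-reassembly idea can be sketched as
    follows. This covers framing only, not the Noise encryption, and the
    one-byte `MORE`/`FINAL` marker layout, the function names, and the
    error type are all assumptions made for the example rather than the
    actual record format.

    ```rust
    /// Hypothetical one-byte record markers; the real wire format may differ.
    const MORE: u8 = 0;
    const FINAL: u8 = 1;

    #[derive(Debug, PartialEq, Eq)]
    pub enum ReassemblyError {
        EmptyRecord,
        UnknownMarker(u8),
        MessageTooLarge,
        TrailingRecord,
        Truncated,
    }

    /// Splits `message` into records carrying at most `max_payload` bytes each.
    pub fn fragment(message: &[u8], max_payload: usize) -> Vec<Vec<u8>> {
        assert!(max_payload > 0, "records must carry at least one byte");
        let mut chunks: Vec<&[u8]> = message.chunks(max_payload).collect();
        if chunks.is_empty() {
            chunks.push(&[]); // an empty message still needs one FINAL record
        }
        let last = chunks.len() - 1;
        chunks
            .iter()
            .enumerate()
            .map(|(index, chunk)| {
                let mut record = Vec::with_capacity(chunk.len() + 1);
                record.push(if index == last { FINAL } else { MORE });
                record.extend_from_slice(chunk);
                record
            })
            .collect()
    }

    /// Rebuilds a message, refusing to buffer more than `max_message` bytes.
    pub fn reassemble(
        records: &[Vec<u8>],
        max_message: usize,
    ) -> Result<Vec<u8>, ReassemblyError> {
        let mut message = Vec::new();
        for (index, record) in records.iter().enumerate() {
            let (&marker, payload) =
                record.split_first().ok_or(ReassemblyError::EmptyRecord)?;
            if message.len() + payload.len() > max_message {
                return Err(ReassemblyError::MessageTooLarge);
            }
            message.extend_from_slice(payload);
            match marker {
                MORE => {}
                FINAL if index + 1 == records.len() => return Ok(message),
                FINAL => return Err(ReassemblyError::TrailingRecord),
                other => return Err(ReassemblyError::UnknownMarker(other)),
            }
        }
        Err(ReassemblyError::Truncated)
    }
    ```

    Checking the size before extending the buffer means an oversized
    sender is rejected without the receiver ever holding more than
    `max_message` bytes.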
    
    Runtime activation is in
    [openai/codex#26245](https://github.com/openai/codex/pull/26245).
    
    ## Stack
    
    1. **[openai/codex#26242](https://github.com/openai/codex/pull/26242)**:
    Noise channel and relay transport
    2. [openai/codex#26245](https://github.com/openai/codex/pull/26245):
    remote registration and runtime activation
    
    ## Verification
    
    - `just test -p codex-exec-server`
    - Oversized initiator payload regression coverage
    - `just fix -p codex-exec-server`
    - `just bazel-lock-check`
    - `cargo shear`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Represent dynamic tools with explicit namespaces internally (#27365)
    Follow-up to #27356.
    
    ## Stack note
    
    This PR changes Codex's internal dynamic-tool shape while leaving
    `thread/start` unchanged. App-server therefore converts the existing
    per-tool input into explicit functions and namespaces before passing it
    to core.
    
    [#27371](https://github.com/openai/codex/pull/27371) updates
    `thread/start` to use the same explicit shape and removes this temporary
    conversion.
    
    ## Why
    
    Dynamic tools repeat namespace metadata on every function. Core should
    keep one explicit namespace with its member tools so descriptions and
    membership stay consistent across sessions and runtime planning.
    
    ## What changed
    
    - Represent dynamic tools as top-level functions or explicit namespaces
    in protocol and session state.
    - Read old flat rollout metadata and write the canonical hierarchy.
    - Flatten namespace members only when registering callable tools.
    - Keep `thread/start.dynamicTools` flat for now and normalize it at the
    app-server boundary.
    
    New builds can read old rollout metadata. Older builds cannot read newly
    written hierarchical metadata.
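
    A rough sketch of the normalize-then-flatten flow described above. The
    types here (`LegacyTool`, `DynamicTool`) and both function names are
    simplified assumptions for illustration, not the protocol types.

    ```rust
    /// Hypothetical flat input: namespace metadata repeated on every tool.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct LegacyTool {
        pub name: String,
        pub namespace: Option<String>,
        pub namespace_description: Option<String>,
    }

    /// Hypothetical canonical shape: top-level functions or explicit namespaces.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub enum DynamicTool {
        Function { name: String },
        Namespace { name: String, description: Option<String>, tools: Vec<String> },
    }

    /// Groups flat tools into one entry per namespace, in first-seen order.
    pub fn normalize(legacy: Vec<LegacyTool>) -> Vec<DynamicTool> {
        let mut out: Vec<DynamicTool> = Vec::new();
        for tool in legacy {
            let Some(namespace) = tool.namespace else {
                out.push(DynamicTool::Function { name: tool.name });
                continue;
            };
            let position = out.iter().position(|entry| {
                matches!(entry, DynamicTool::Namespace { name, .. } if *name == namespace)
            });
            match position {
                Some(index) => {
                    if let DynamicTool::Namespace { tools, .. } = &mut out[index] {
                        tools.push(tool.name);
                    }
                }
                None => out.push(DynamicTool::Namespace {
                    name: namespace,
                    description: tool.namespace_description,
                    tools: vec![tool.name],
                }),
            }
        }
        out
    }

    /// Flattens the hierarchy into (namespace, tool) pairs for registration.
    pub fn flatten(tools: &[DynamicTool]) -> Vec<(Option<&str>, &str)> {
        let mut callable = Vec::new();
        for tool in tools {
            match tool {
                DynamicTool::Function { name } => callable.push((None, name.as_str())),
                DynamicTool::Namespace { name, tools, .. } => {
                    callable.extend(tools.iter().map(|t| (Some(name.as_str()), t.as_str())));
                }
            }
        }
        callable
    }
    ```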
    
    ## Test plan
    
    - `just test -p codex-app-server
    thread_start_normalizes_legacy_dynamic_tools_into_model_request`
    - `just test -p codex-protocol
    session_meta_normalizes_legacy_dynamic_tools`
    - `just test -p codex-core
    resume_restores_dynamic_tools_from_rollout_with_sqlite_enabled`
    - `just test -p codex-core
    tool_search_returns_deferred_dynamic_tool_and_routes_follow_up_call`
    - `just test -p codex-core code_mode_can_call_hidden_dynamic_tools`
    - `just test -p codex-tools`
  • feat: use encrypted local secrets for CLI auth (#27539)
    ## Why
    
    Windows Credential Manager limits generic credential blobs to 2,560
    bytes. Large serialized ChatGPT auth payloads can exceed that limit, so
    keyring-mode CLI auth needs a backend that keeps only the encryption key
    in the OS keyring and stores the payload in Codex's encrypted
    local-secrets file.
    
    This is the third PR in the encrypted-auth stack:
    
    1. #27504 — feature and config selection
    2. #27535 — auth-specific local-secrets namespaces
    3. This PR — CLI auth implementation and activation
    4. MCP OAuth implementation and activation
    
    ## What Changed
    
    - Added encrypted CLI-auth storage using the `CliAuth` secrets
    namespace.
    - Preserved direct keyring storage for platforms/configurations where it
    remains selected.
    - Selected the backend consistently for login, logout, refresh,
    device-code login, auth loading, and login restrictions.
    - Threaded resolved bootstrap/full config through CLI, exec, TUI,
    app-server account handling, cloud config, and cloud tasks.
    - Removed stale `auth.json` fallback data after successful encrypted
    saves and removed encrypted, direct-keyring, and fallback data during
    logout.
    - Added storage and integration coverage for both direct and encrypted
    keyring modes.
    
    MCP OAuth persistence is intentionally left to the next PR.
    
    ## Validation
    
    - `just test -p codex-login` — 131 passed
    - `just test -p codex-cli` — 280 passed
    - `just test -p codex-app-server v2::account` — 25 passed
    - `just test -p codex-cloud-config service` — 21 passed, 7 skipped
    - `just fix -p codex-login`
    - `just fix -p codex-cli`
    - `just fmt`
  • [codex] Load user instructions through an injected provider (#27101)
    ## Why
    
    We want to remove implicit use of `$CODEX_HOME` from `codex-core` and
    make embedders responsible for supplying user-level instructions. This
    also ensures user instructions load when no primary environment is
    selected.
    
    ## What changed
    
    Stacked on #27415, which makes `codex exec` surface thread-scoped
    runtime warnings.
    
    - Added `UserInstructionsProvider` to `codex-extension-api`, with
    absolute source attribution and recoverable loading warnings.
    - Added `codex-home` with the filesystem-backed provider for
    `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback,
    trimming, lossy UTF-8 handling, and the existing uncapped global
    instruction size.
    - Removed global instruction loading from `Config` and require
    `ThreadManager` callers to inject a provider.
    - Load provider instructions once for each fresh root runtime, including
    runtimes without a primary environment. Running sessions retain their
    snapshot, while child agents inherit the parent snapshot without
    invoking the provider.
    - Keep provider instructions separate while loading project `AGENTS.md`,
    then assemble the model-visible instructions with the existing ordering,
    source attribution, warning, and turn-context behavior.
    - Wired the Codex home provider through the CLI, app server, MCP server,
    core facade, and thread-manager sample.
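
    As an illustration of the provider boundary, here is a synchronous
    sketch of a filesystem-backed provider. The trait shape,
    `UserInstructions`, and `CodexHomeProvider` are assumptions; the real
    `UserInstructionsProvider` also reports recoverable warnings, which
    this omits. Treating an empty or unreadable override file as "fall
    back to `AGENTS.md`" is likewise an assumption.

    ```rust
    use std::fs;
    use std::path::PathBuf;

    /// Hypothetical result type: the instruction text plus where it came from.
    pub struct UserInstructions {
        pub source: PathBuf,
        pub text: String,
    }

    /// Simplified stand-in for the injected provider interface.
    pub trait UserInstructionsProvider {
        fn load(&self) -> Option<UserInstructions>;
    }

    /// Reads instructions from a Codex home directory (assumed absolute).
    pub struct CodexHomeProvider {
        pub codex_home: PathBuf,
    }

    impl UserInstructionsProvider for CodexHomeProvider {
        fn load(&self) -> Option<UserInstructions> {
            // The override file takes precedence over the regular file.
            for file_name in ["AGENTS.override.md", "AGENTS.md"] {
                let source = self.codex_home.join(file_name);
                let Ok(bytes) = fs::read(&source) else { continue };
                // Lossy decoding keeps files with invalid UTF-8 usable.
                let text = String::from_utf8_lossy(&bytes).trim().to_string();
                if !text.is_empty() {
                    return Some(UserInstructions { source, text });
                }
            }
            None
        }
    }
    ```

    A caller would construct the provider once at the host boundary and
    hand it to the thread manager, so core never reads `$CODEX_HOME`
    itself.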
    
    ## Validation
    
    - `just test -p codex-home -p codex-extension-api`
    - `just test -p codex-core agents_md`
    - `just test -p codex-core guardian`
    - `just test -p codex-app-server
    thread_start_without_selected_environment_includes_only_global_instruction_source`
    - `just test -p codex-exec warning`
    - `just bazel-lock-check`
  • Route AGENTS.md loading through environment filesystems (#26205)
    ## Why
    
    Workspace-specific `AGENTS.md` loading needs to use the selected
    environment filesystem so remote workspaces and child agents read
    instructions from their actual environment instead of the host
    filesystem. The app-server should report the same instruction sources
    the initialized thread actually loaded, rather than independently
    rescanning configuration and filesystem state.
    
    ## What changed
    
    - Introduce `LoadedAgentsMd` to retain ordered user, project, and
    internal instructions with their provenance.
    - Load and canonicalize workspace `AGENTS.md` paths through the primary
    `EnvironmentManager` environment, then render the loaded instructions
    when constructing turn context.
    - Expose cached loaded instruction sources from initialized threads and
    use them for app-server start, resume, and fork responses.
    - Preserve global `CODEX_HOME` loading and separator behavior while
    excluding empty project files that did not supply model-visible
    instructions.
    - Add integration coverage for CLI injection, selected-environment
    provenance and rendering, empty environment selection, and cached
    sources on loaded-thread resume.
    
    ## Validation
    
    - `just test -p codex-core agents_md`
    - `just test -p codex-core
    selected_environment_sources_match_model_visible_instructions`
    - `just test -p codex-exec agents_md`
    - `just test -p codex-app-server instruction_sources`
    - `just test -p codex-app-server --status-level fail`
  • Add body_after_prefix auto-compact token limit scope (#22870)
    ## Why
    
    `model_auto_compact_token_limit` has only been able to budget the full
    active context. That makes it hard to set a small "growth since
    compaction" budget for sessions that preserve a large carried window
    prefix: the preserved prefix can consume the whole budget and force
    immediate repeated compaction.
    
    This PR adds an opt-in `body_after_prefix` scope so callers can apply
    `model_auto_compact_token_limit` to sampled output and later growth
    after the current carried prefix, while still forcing compaction before
    the full model context window is exhausted.
    
    ## What changed
    
    - Adds `AutoCompactTokenLimitScope` with the existing `total` behavior
    as the default and a new `body_after_prefix` mode:
    [`config_types.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/protocol/src/config_types.rs#L24-L37).
    - Threads `model_auto_compact_token_limit_scope` through config loading,
    `Config`, `core-api`, and app-server v2 schema/TypeScript generation.
    - Records the first observed input-token count for a `body_after_prefix`
    compaction window and uses it as the baseline when deciding whether the
    scoped auto-compaction budget is exhausted:
    [`turn.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/core/src/session/turn.rs#L743-L781).
    - Keeps a hard context-window cap in `body_after_prefix`, so scoped
    budgeting cannot let the active context overrun the usable window.
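
    The decision described above can be sketched as a pure function. Only
    the enum name and the two scope names come from this PR;
    `should_auto_compact`, its parameters, and the `>=` comparisons are
    assumptions, and the linked `turn.rs` is the authoritative logic.

    ```rust
    #[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
    pub enum AutoCompactTokenLimitScope {
        /// Existing behavior: budget the full active context.
        #[default]
        Total,
        /// Budget only the growth after the carried prefix.
        BodyAfterPrefix,
    }

    /// Hypothetical helper deciding whether to compact before the next turn.
    ///
    /// `baseline_input_tokens` is the first input-token count observed in the
    /// current compaction window, if one has been recorded yet.
    pub fn should_auto_compact(
        scope: AutoCompactTokenLimitScope,
        token_limit: u64,
        context_window: u64,
        baseline_input_tokens: Option<u64>,
        active_tokens: u64,
    ) -> bool {
        // Hard cap: never let the active context overrun the usable window.
        if active_tokens >= context_window {
            return true;
        }
        match scope {
            AutoCompactTokenLimitScope::Total => active_tokens >= token_limit,
            AutoCompactTokenLimitScope::BodyAfterPrefix => {
                // With no baseline yet, the current count becomes the baseline.
                let baseline = baseline_input_tokens.unwrap_or(active_tokens);
                active_tokens.saturating_sub(baseline) >= token_limit
            }
        }
    }
    ```

    For example, with a 50,000-token carried prefix and a 10,000-token
    limit, `Total` asks for compaction immediately, while
    `BodyAfterPrefix` waits until the context has grown by 10,000 tokens
    past the baseline.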
    
    ## Verification
    
    Added compact-suite coverage for the two key behaviors:
    `body_after_prefix` does not re-compact just because the carried prefix
    is larger than the scoped budget, and it still compacts when the total
    active context reaches the configured context window:
    [`compact.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/core/tests/suite/compact.rs#L3003-L3128).
  • feat(tui): add ambient terminal pets (#21206)
    ## Why
    
    The Codex App has animated pets, but the TUI had no equivalent ambient
    companion surface. This brings that experience into terminal Codex while
    keeping the main chat flow usable: the pet should feel present, but it
    cannot cover transcript text, composer input, approvals, or picker
    content.
    
    The feature also needs to be terminal-aware. Different terminals support
    different image protocols, tmux can interfere with image rendering, and
    some users will want pets disabled entirely or anchored differently
    depending on their layout.
    
    <table>
    <tr><td>
    <img width="4110" height="2584" alt="CleanShot 2026-05-05 at 12 41
    45@2x"
    src="https://github.com/user-attachments/assets/68a1fcbc-2104-48d6-b834-69c6aaa95cdf"
    />
    <p align="center">macOS - Ghostty, iTerm2 and WezTerm with Custom
    Pet</p>
    </td></tr>
    <tr><td>
    <p align="center">Windows Terminal</p>
    </td></tr>
    <tr><td>
    <img width="3902" height="2752" alt="CleanShot 2026-05-05 at 12 39
    02@2x"
    src="https://github.com/user-attachments/assets/300e2931-6b00-467e-91cb-ab8e28470500"
    />
    <p align="center">Linux - WezTerm and Ghostty</p>
    </td></tr>
    </table>
    
    ## What Changed
    
    - Add a TUI ambient pet renderer in `codex-rs/tui/src/pets/`.
    - Port the app-style pet animation states so the sprite changes with
    task status, waiting-for-input states, review/ready states, and
    failures.
    - Add `/pets` selection UI with a preview pane, loading state, built-in
    pet choices, and a first-row `Disable terminal pets` option.
    - Download built-in pet spritesheets on demand from the same public CDN
    path already used by Android, under
    `https://persistent.oaistatic.com/codex/pets/v1/...`, and cache them
    locally under `~/.codex/cache/tui-pets/`.
    - Keep custom pets local.
    - Add config support for pet selection, disabling pets, and choosing
    whether the pet follows the composer bottom or anchors to the terminal
    bottom.
    - Reserve layout space around the pet so transcript wrapping, live
    responses, and composer input do not render underneath the sprite.
    - Gate image rendering by terminal capability, disable image pets under
    tmux, and support both Kitty Graphics and SIXEL terminals.
    - Add redraw cleanup for terminal image artifacts, including sixel cell
    clearing.
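
    The capability gating and layout reservation can be sketched as two
    small pure functions. Everything here is hypothetical: the type and
    function names, the preference for Kitty Graphics over SIXEL when both
    are available, and the idea that capabilities arrive as plain
    booleans. How the TUI actually probes the terminal is not shown.

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum ImageProtocol {
        KittyGraphics,
        Sixel,
    }

    /// Hypothetical summary of what the host terminal reported.
    #[derive(Clone, Copy, Debug, Default)]
    pub struct TerminalCapabilities {
        pub inside_tmux: bool,
        pub kitty_graphics: bool,
        pub sixel: bool,
    }

    /// Returns the protocol to draw the pet with, or `None` to draw nothing.
    pub fn pick_image_protocol(
        caps: TerminalCapabilities,
        pets_enabled: bool,
    ) -> Option<ImageProtocol> {
        if !pets_enabled || caps.inside_tmux {
            return None; // image pets are disabled under tmux
        }
        if caps.kitty_graphics {
            Some(ImageProtocol::KittyGraphics)
        } else if caps.sixel {
            Some(ImageProtocol::Sixel)
        } else {
            None
        }
    }

    /// Columns left for wrapped text once space for the pet is reserved.
    pub fn text_columns(terminal_columns: u16, pet_columns: u16, gap: u16, pet_visible: bool) -> u16 {
        if pet_visible {
            terminal_columns.saturating_sub(pet_columns.saturating_add(gap))
        } else {
            terminal_columns
        }
    }
    ```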
    
    ## Current Scope
    
    - This is an initial TUI version of ambient pets, not full App parity.
    - It focuses on ambient sprite rendering, `/pets` selection, custom
    pets, terminal capability gating, and on-demand CDN-backed built-in
    assets.
    - The ambient text overlay is currently disabled, so the TUI renders the
    pet sprite without extra status text beside it.
    
    ## How to Test
    
    1. Start Codex TUI in a terminal with image support.
    2. Run `/pets`.
    3. Confirm the picker shows built-in pets plus custom pets, and the
    first item is `Disable terminal pets`.
    4. On a fresh `~/.codex/cache/tui-pets/`, move onto a built-in pet and
    confirm the first preview downloads the spritesheet from the shared
    Codex pets CDN and renders successfully.
    5. Move through the pet list and confirm subsequent built-in previews
    use the local cache.
    6. Select a pet, then send and receive messages. Confirm transcript and
    composer text wrap before the pet instead of rendering underneath the
    sprite.
    7. Change the pet anchor setting and confirm the pet can either follow
    the composer bottom or sit at the terminal bottom.
    8. Return to `/pets`, choose `Disable terminal pets`, and confirm the
    sprite disappears cleanly.
    
    Targeted tests:
    - `cargo test -p codex-tui ambient_pet_`
    - `cargo test -p codex-tui
    resize_reflow_wraps_transcript_early_when_pet_is_enabled`
    - `cargo insta pending-snapshots`
  • extension: wire extension registries into sessions (#21737)
    ## Why
    
    [#21736](https://github.com/openai/codex/pull/21736) introduces the
    typed extension API, but the runtime does not yet carry a registry
    through thread/session startup or give contributors host-owned stores to
    read from. This PR wires that host-side path so later feature migrations
    can move product-specific behavior behind typed contributions without
    adding another bespoke seam directly to `codex-core`.
    
    ## What changed
    
    - Thread `ExtensionRegistry<Config>` through `ThreadManager`,
    `CodexSpawnArgs`, `Session`, and sub-agent spawn paths.
    - Wire `ThreadStartContributor` and `ContextContributor`.
    - Expose the small supporting surface needed by non-core callers that
    construct threads directly, including `empty_extension_registry()`
    through `codex-core-api`.
    
    This PR lands the host plumbing only: the app-server registry is still
    empty, and concrete feature migrations are intended to follow
    separately.
  • Load configured environments from CODEX_HOME (#20667)
    ## Why
    
    The earlier PRs add stdio transport support and the config-backed
    environment provider, but the feature remains inert until normal Codex
    entrypoints construct `EnvironmentManager` with enough context to
    discover `CODEX_HOME/environments.toml`. This final stack PR activates
    the provider while preserving the legacy `CODEX_EXEC_SERVER_URL`
    fallback when no environments file exists.
    
    **Stack position:** this is PR 5 of 5. It is the product wiring PR that
    activates the configured environment provider added in PR 4.
    
    ## What Changed
    
    - Thread `codex_home` into `EnvironmentManagerArgs`.
    - Change `EnvironmentManager::new(...)` to load the provider from
    `CODEX_HOME`.
    - Preserve legacy behavior by falling back to
    `DefaultEnvironmentProvider::from_env()` when `environments.toml` is
    absent.
    - Make `environments.toml`-backed managers start new threads with all
    configured environments, default first, while keeping the legacy env-var
    path single-default.
    - Update the app-server, TUI, exec, MCP server, connector, prompt-debug,
    and thread-manager-sample callsites to pass `codex_home` and handle
    provider-loading errors.
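
    The selection rule in the list above can be sketched like this.
    `ProviderChoice` and `choose_provider` are hypothetical names; the
    real code builds provider objects and parses the TOML, which this
    sketch does not attempt.

    ```rust
    use std::path::{Path, PathBuf};

    /// Hypothetical outcome of provider selection.
    #[derive(Debug, PartialEq, Eq)]
    pub enum ProviderChoice {
        /// `environments.toml` exists under the Codex home directory.
        EnvironmentsToml { path: PathBuf },
        /// Legacy path driven by the `CODEX_EXEC_SERVER_URL` value, if any.
        LegacyEnv { exec_server_url: Option<String> },
    }

    impl ProviderChoice {
        /// Only the TOML-backed provider starts threads with every environment.
        pub fn starts_threads_with_all_environments(&self) -> bool {
            matches!(self, ProviderChoice::EnvironmentsToml { .. })
        }
    }

    /// Picks the provider from the presence of `environments.toml`.
    pub fn choose_provider(codex_home: &Path, legacy_url: Option<String>) -> ProviderChoice {
        let path = codex_home.join("environments.toml");
        if path.is_file() {
            ProviderChoice::EnvironmentsToml { path }
        } else {
            ProviderChoice::LegacyEnv { exec_server_url: legacy_url }
        }
    }
    ```

    A caller could pass `std::env::var("CODEX_EXEC_SERVER_URL").ok()` as
    `legacy_url`. Keying the multi-environment behavior on the provider
    kind, not on the environment count, mirrors the self-review note
    below.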
    
    ## Self-Review Notes
    
    - The multi-environment startup path is intentionally tied to the
    `environments.toml` provider. Using `>1` configured environment as the
    only signal would also expand the legacy `CODEX_EXEC_SERVER_URL`
    provider because it keeps `local` addressable alongside `remote`.
    - The startup environment list is still derived inside
    `EnvironmentManager`; the provider only says whether its snapshot should
    start new threads with all configured environments.
    - The thread-manager sample was updated to pass the current
    `ThreadManager::new(...)` installation id argument so the stack compiles
    under Bazel.
    
    ## Stack
    
    - 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
    listener
    - 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
    client transport
    - 3. https://github.com/openai/codex/pull/20665 - Make environment
    providers own default selection
    - 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
    environments TOML provider
    - **5. This PR:** https://github.com/openai/codex/pull/20667 - Load
    configured environments from CODEX_HOME
    
    Split from original draft: https://github.com/openai/codex/pull/20508
    
    ## Validation
    
    - `just fmt`
    - `git diff --check`
    - `bazel build --config=remote --strategy=remote
    --remote_download_toplevel
    //codex-rs/thread-manager-sample:codex-thread-manager-sample`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel
    //codex-rs/exec-server:exec-server-unit-tests`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel --test_sharding_strategy=disabled
    --test_arg=default_thread_environment_selections_use_manager_default_id
    //codex-rs/core:core-unit-tests`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel --test_sharding_strategy=disabled
    --test_arg=start_thread_uses_all_default_environments_from_codex_home
    //codex-rs/core:core-unit-tests`
    
    ## Documentation
    
    This activates `CODEX_HOME/environments.toml`; user-facing documentation
    should be added before this stack is treated as a documented public
    workflow.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` entails both running standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
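
    For reference, the setting lives in the library target section of a
    crate's `Cargo.toml`. The fragment below is a generic illustration,
    not copied from any crate in this PR:

    ```toml
    [lib]
    doctest = false
    ```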
    
    For the collection of crates below, this speeds up test execution by >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • Revert state DB injection and agent graph store (#21481)
    ## Why
    
    Reverts #20689 to restore the previous optional state DB plumbing. The
    conflict resolution keeps the newer installation ID and session/thread
    identity changes that landed after #20689, while removing the mandatory
    state DB and agent graph store dependency from ThreadManager
    construction.
    
    ## What changed
    
    - Restored `Option<StateDbHandle>` through app-server, MCP server,
    prompt debug, and test entry points.
    - Removed the `codex-core` dependency on `codex-agent-graph-store` and
    reverted descendant lookup back to the existing state DB path when
    available.
    - Kept newer `installation_id` forwarding by passing it beside the
    optional DB handle.
    - Kept local thread-name updates working when the optional state DB
    handle is absent.
    
    ## Validation
    
    - `git diff --check`
    - `cargo test -p codex-thread-store`
    - `cargo test -p codex-state -p codex-rollout -p
    codex-app-server-protocol`
    - Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
    codex-app-server -p codex-app-server-client -p codex-mcp-server -p
    codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
    ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
    2026-01-19)` on `aarch64-apple-darwin`.
  • Move installation ID resolution out of core startup (#21182)
    ## Summary
    
    - resolve or inject the installation ID before core startup and pass it
    through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain
    `String`
    - keep child sessions on the parent installation ID instead of
    rediscovering it inside core
    - propagate installation ID startup failures in `mcp-server` instead of
    panicking
    
    ## Why
    
    Core was still touching the filesystem on the session startup path to
    discover `installation_id`. This moves that work to the outer host
    boundary so core no longer depends on `codex_home` reads during session
    construction.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Inject state DB, agent graph store (#20689)
    ## Why
    
    We want the agent graph store to be passed down the stack as a real
    dependency, the same way we already treat the thread store.
    
    This will let us support implementations other than the local
    SQLite-backed one. Right now most code instantiates a state DB and an
    agent graph store just-in-time. Ideally, we would not depend on the
    state DB directly but only read through the higher-level interfaces.
    
    This change makes the dependency boundaries explicit and moves state DB
    initialization to process bootstrap instead of hiding it inside local
    store implementations.
    
    ## What changed
    
    - `ThreadManager` now requires a `StateDbHandle` and an
    `AgentGraphStore` at construction time instead of treating them as
    optional internals.
    - The local store constructors no longer lazily initialize SQLite.
    Callers now initialize the state DB once per process and use that shared
    handle to build:
      - `LocalThreadStore`
      - `LocalAgentGraphStore`
    - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
    thread-manager sample) now initialize the state DB up front and inject
    the resulting handle down the stack.
    - `app-server` now consistently uses its process-scoped state DB handle
    instead of reopening SQLite or trying to recover it from loaded threads.
    - Device-key storage now reuses the shared state DB handle instead of
    maintaining its own lazy opener.
    - The thread archive / descendant traversal paths now use the injected
    `AgentGraphStore` instead of reaching through local
    thread-store-specific state.
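
    The bootstrap shape described above can be sketched as follows. All
    types are simplified stand-ins that reuse names from this PR;
    modeling `StateDbHandle` as an `Arc` and the `bootstrap` and `Stores`
    helpers are assumptions for the example.

    ```rust
    use std::sync::Arc;

    /// Stand-in for the process-wide SQLite state.
    pub struct StateDb {
        pub path: String,
    }

    /// Assumed to be a cheaply clonable shared handle.
    pub type StateDbHandle = Arc<StateDb>;

    pub struct LocalThreadStore {
        db: StateDbHandle,
    }

    pub struct LocalAgentGraphStore {
        db: StateDbHandle,
    }

    impl LocalThreadStore {
        /// Takes an already-initialized handle; nothing is opened lazily.
        pub fn new(db: StateDbHandle) -> Self {
            Self { db }
        }

        pub fn db(&self) -> &StateDbHandle {
            &self.db
        }
    }

    impl LocalAgentGraphStore {
        pub fn new(db: StateDbHandle) -> Self {
            Self { db }
        }

        pub fn db(&self) -> &StateDbHandle {
            &self.db
        }
    }

    pub struct Stores {
        pub thread_store: LocalThreadStore,
        pub agent_graph_store: LocalAgentGraphStore,
    }

    /// Builds both local stores from the one handle created at process start.
    pub fn bootstrap(db: StateDbHandle) -> Stores {
        Stores {
            thread_store: LocalThreadStore::new(Arc::clone(&db)),
            agent_graph_store: LocalAgentGraphStore::new(db),
        }
    }
    ```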
    
    ## Verification
    
    - `cargo check -p codex-core -p codex-thread-store -p codex-app-server
    -p codex-mcp-server -p codex-thread-manager-sample --tests`
    - `cargo test -p codex-thread-store`
    - `cargo test -p codex-core
    thread_manager_accepts_separate_agent_graph_store_and_thread_store --
    --nocapture`
    - `cargo test -p codex-app-server
    thread_archive_archives_spawned_descendants -- --nocapture`
  • feat(tui): redesign session picker (#20065)
    ## Why
    
    The resume/fork picker is becoming the main way users recover previous
    work, but the old fixed table made sessions hard to scan once thread
    names, branches, working directories, and timestamps all mattered. This
    redesign makes the picker denser by default, easier to search, and safer
    to inspect before resuming or forking.
    
    <table>
    <tr>
    <td>
    <img width="1660" height="1103" alt="CleanShot 2026-05-03 at 12 34 10"
    src="https://github.com/user-attachments/assets/313ede1d-1da4-4863-acd2-56b3e27e9703"
    />
    </td>
    <td>
    <img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 34 15"
    src="https://github.com/user-attachments/assets/cfde7d5c-bab0-4994-a807-254e53f344ea"
    />
    </td>
    </tr>
    <tr>
    <td>
    <img width="1664" height="1107" alt="CleanShot 2026-05-03 at 12 39 22"
    src="https://github.com/user-attachments/assets/e1ee58ca-4dc5-4a35-ae0f-47562da3974c"
    />
    </td>
    <td>
    <img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 35 09"
    src="https://github.com/user-attachments/assets/9c888072-eedf-4f45-985c-0c14df28bcc7"
    />
    </td>
    </tr>
    </table>
    
    ## What Changed
    
    - Replaces the old session table with responsive session rows that
    prioritize the session name or preview, then show timestamp, cwd, and
    branch metadata.
    - Makes dense view the default while keeping comfortable view available
    through `Ctrl+O`.
    - Persists the picker view preference in `[tui].session_picker_view`,
    including active profile-scoped config.
    - Adds sort/filter controls for updated time, created time, cwd, and all
    sessions.
    - Expands search matching across session name, preview, thread id,
    branch, and cwd.
    - Makes `Esc` safer in search mode: it clears an active query before
    starting a new session.
    - Adds lazy transcript inspection:
      - `Space` expands recent transcript context inline.
      - `Ctrl+T` opens a transcript overlay.
      - raw reasoning visibility follows `show_raw_agent_reasoning`.
    - Keeps remote cwd filtering server-side for remote app-server sessions
    so local path normalization does not incorrectly hide remote results.
    - Updates snapshots and config schema for the new picker states and
    config option.
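
    The expanded search matching above can be sketched roughly as follows.
    This is a minimal illustration, not the picker's actual code:
    `SessionRow`, `row_matches`, and `filter_rows` are hypothetical names,
    and case-insensitive substring matching is an assumption (the real
    matcher may rank or tokenize differently).

    ```rust
    /// Hypothetical row shape; the real picker types are not shown in this change.
    #[derive(Debug, Clone)]
    pub struct SessionRow {
        pub name: Option<String>,
        pub preview: String,
        pub thread_id: String,
        pub branch: Option<String>,
        pub cwd: String,
    }

    /// Returns true when `query` is a case-insensitive substring of any
    /// searchable field: name, preview, thread id, branch, or cwd.
    /// An empty or whitespace-only query matches every row.
    pub fn row_matches(row: &SessionRow, query: &str) -> bool {
        let needle = query.trim().to_lowercase();
        if needle.is_empty() {
            return true;
        }
        let fields = [
            row.name.as_deref(),
            Some(row.preview.as_str()),
            Some(row.thread_id.as_str()),
            row.branch.as_deref(),
            Some(row.cwd.as_str()),
        ];
        fields
            .into_iter()
            .flatten()
            .any(|field| field.to_lowercase().contains(&needle))
    }

    /// Keeps only the rows that match `query`, preserving their order.
    pub fn filter_rows<'a>(rows: &'a [SessionRow], query: &str) -> Vec<&'a SessionRow> {
        rows.iter().filter(|row| row_matches(row, query)).collect()
    }
    ```

    Treating missing names and branches as `None` means an unnamed
    session can still be found through its preview, thread id, or cwd.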
    
    ## How to Test
    
    1. Start Codex in a repo with several saved sessions.
    2. Open the resume picker with `Ctrl+R` or another resume picker entry point.
    3. Confirm the picker opens in dense mode and shows session name or
    preview, timestamp, cwd, and branch metadata.
    4. Press `Ctrl+O` and confirm it switches between dense and comfortable
    views.
    5. Restart Codex and confirm the selected view persists.
    6. Type a query that matches a branch, cwd, thread id, or session name;
    confirm matching sessions appear.
    7. Press `Esc` while the query is non-empty and confirm it clears search
    instead of starting a new session.
    8. Select a session and press `Space`; confirm recent transcript context
    expands inline.
    9. Press `Ctrl+T`; confirm the transcript overlay opens and respects
    raw-reasoning visibility settings.
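
    The `Esc` behavior checked in step 7 amounts to a small two-state
    decision. The sketch below assumes hypothetical `SearchState` and
    `EscOutcome` types; the real TUI key handling is not shown in this
    change and likely carries more state.

    ```rust
    /// What the picker should do in response to `Esc` (hypothetical type).
    #[derive(Debug, PartialEq, Eq)]
    pub enum EscOutcome {
        /// A non-empty query was cleared; the picker stays open.
        ClearedQuery,
        /// The query was already empty, so `Esc` starts a new session.
        StartNewSession,
    }

    /// Minimal stand-in for the picker's search-mode state.
    #[derive(Debug, Default)]
    pub struct SearchState {
        pub query: String,
    }

    impl SearchState {
        /// Clears an active query first; only an empty query lets `Esc`
        /// fall through to starting a new session.
        pub fn on_esc(&mut self) -> EscOutcome {
            if self.query.is_empty() {
                EscOutcome::StartNewSession
            } else {
                self.query.clear();
                EscOutcome::ClearedQuery
            }
        }
    }
    ```

    With this shape, an accidental `Esc` while typing costs only the
    query text, and a second `Esc` is needed to leave the picker.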
    
    Targeted tests:
    - `cargo test -p codex-tui resume_picker --no-fail-fast`
    - `cargo test -p codex-core
    runtime_config_resolves_session_picker_view_default_and_override`
    - `cargo test -p codex-core profile_tui_rejects_unsupported_settings`
    - `cargo check -p codex-thread-manager-sample`
    - `cargo insta pending-snapshots`
  • state: pass state db handles through consumers (#20561)
    ## Why
    
    SQLite state was still being opened from consumer paths, including lazy
    `OnceCell`-backed thread-store call sites. That let one process
    construct multiple state DB connections for the same Codex home, which
    makes SQLite lock contention and `database is locked` failures much
    easier to hit.
    
    State DB lifetime should be chosen by main-like entrypoints and tests,
    then passed through explicitly. Consumers should use the supplied
    `Option<StateDbHandle>` or `StateDbHandle` and keep their existing
    filesystem fallback or error behavior when no handle is available.
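
    As a rough illustration of that ownership rule, the sketch below has
    an entrypoint create one handle and hand clones of it to consumers.
    `StateDbHandle` here is a stand-in type alias over an in-memory map,
    and `SessionNameLookup`, `NameSource`, and `filesystem_lookup` are
    hypothetical; none of this is the real codex API.

    ```rust
    use std::collections::HashMap;
    use std::sync::Arc;

    /// Stand-in for the real `StateDbHandle`: a shared, read-only map
    /// from thread id to session name.
    pub type StateDbHandle = Arc<HashMap<String, String>>;

    /// Records which backend answered a lookup (illustration only).
    #[derive(Debug, PartialEq, Eq)]
    pub enum NameSource {
        StateDb,
        Filesystem,
    }

    /// Hypothetical consumer: it never opens the DB itself and only
    /// uses the handle it was given.
    pub struct SessionNameLookup {
        state_db: Option<StateDbHandle>,
    }

    impl SessionNameLookup {
        pub fn new(state_db: Option<StateDbHandle>) -> Self {
            Self { state_db }
        }

        /// Uses the supplied handle when present; otherwise keeps the
        /// filesystem fallback.
        pub fn lookup(&self, thread_id: &str) -> Option<(String, NameSource)> {
            match &self.state_db {
                Some(db) => db
                    .get(thread_id)
                    .cloned()
                    .map(|name| (name, NameSource::StateDb)),
                None => filesystem_lookup(thread_id).map(|name| (name, NameSource::Filesystem)),
            }
        }
    }

    /// Placeholder for a filesystem scan; it fabricates a name so the
    /// sketch stays self-contained.
    fn filesystem_lookup(thread_id: &str) -> Option<String> {
        if thread_id.is_empty() {
            None
        } else {
            Some(format!("rollout-{thread_id}"))
        }
    }
    ```

    Because every consumer receives a clone of the same `Arc`, the
    process holds a single underlying state object no matter how many
    consumers are constructed.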
    
    The startup path also needs to keep the rollout crate in charge of
    SQLite state initialization. Opening `codex_state::StateRuntime`
    directly bypasses rollout metadata backfill, so entrypoints should
    initialize through `codex_rollout::state_db` and receive a handle only
    after required rollout backfills have completed.
    
    ## What Changed
    
    - Initialize the state DB in main-like entrypoints for CLI, TUI,
    app-server, exec, MCP server, and the thread-manager sample.
    - Pass `Option<StateDbHandle>` through `ThreadManager`,
    `LocalThreadStore`, app-server processors, TUI app wiring, rollout
    listing/recording, personality migration, shell snapshot cleanup,
    session-name lookup, and memory/device-key consumers.
    - Remove the lazy local state DB wrapper from the thread store so
    non-test consumers use only the supplied handle or their existing
    fallback path.
    - Make `codex_rollout::state_db::init` the local state startup path: it
    opens/migrates SQLite, runs rollout metadata backfill when needed, waits
    for concurrent backfill workers up to a bounded timeout, verifies
    completion, and then returns the initialized handle.
    - Keep optional/non-owning SQLite helpers, such as remote TUI local
    reads, as open-only paths that do not run startup backfill.
    - Switch app-server startup from direct
    `codex_state::StateRuntime::init` to the rollout state initializer so
    app-server cannot skip rollout backfill.
    - Collapse split rollout lookup/list APIs so callers use the normal
    methods with an optional state handle instead of `_with_state_db`
    variants.
    - Restore `getConversationSummary(ThreadId)` to delegate through
    `ThreadStore::read_thread` instead of a LocalThreadStore-specific
    rollout path special case.
    - Keep DB-backed rollout path lookup keyed on the DB row and file
    existence, without imposing the filesystem filename convention on
    existing DB rows.
    - Verify readable DB-backed rollout paths against `session_meta.id`
    before returning them, so a stale SQLite row that points at another
    thread's JSONL falls back to filesystem search and read-repairs the DB
    row.
    - Keep `debug prompt-input` filesystem-only so a one-off debug command
    does not initialize or backfill SQLite state just to print prompt input.
    - Keep goal-session test Codex homes alive only in the goal-specific
    helper, rather than leaking tempdirs from the shared session test
    helper.
    - Update tests and call sites to pass explicit state handles where DB
    behavior is expected and explicit `None` where filesystem-only behavior
    is intended.
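
    The bounded wait for concurrent backfill workers described above
    might look roughly like the following. This is a synchronous sketch
    using only the standard library; `BackfillStatus`, `InitError`, and
    `wait_for_backfill` are hypothetical names, and the real initializer
    is presumably async and checks a lease in SQLite rather than calling
    a closure.

    ```rust
    use std::thread;
    use std::time::{Duration, Instant};

    /// Whether another worker is still backfilling (hypothetical type).
    #[derive(Debug, PartialEq, Eq, Clone, Copy)]
    pub enum BackfillStatus {
        Running,
        Complete,
    }

    /// Startup failure surfaced to the entrypoint (hypothetical type).
    #[derive(Debug, PartialEq, Eq)]
    pub enum InitError {
        BackfillTimedOut,
    }

    /// Polls `poll` until it reports completion or `timeout` elapses.
    /// The handle should only be returned to callers after `Ok(())`.
    pub fn wait_for_backfill(
        mut poll: impl FnMut() -> BackfillStatus,
        timeout: Duration,
        interval: Duration,
    ) -> Result<(), InitError> {
        let deadline = Instant::now() + timeout;
        loop {
            if poll() == BackfillStatus::Complete {
                return Ok(());
            }
            let now = Instant::now();
            if now >= deadline {
                return Err(InitError::BackfillTimedOut);
            }
            // Never sleep past the deadline.
            thread::sleep(interval.min(deadline - now));
        }
    }
    ```

    Returning an error on timeout, rather than blocking forever, is what
    keeps a stuck backfill lease from hanging startup.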
    
    ## Validation
    
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
    codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
    codex-tui -p codex-exec -p codex-cli --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout state_db_`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout try_init_ -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
    codex-rollout --lib -- -D warnings`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store
    read_thread_falls_back_when_sqlite_path_points_to_another_thread --
    --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    shell_snapshot`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all personality_migration`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find::find_prefers_sqlite_path_by_id --
    --nocapture`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    interrupt_accounts_active_goal_before_pausing`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server get_auth_status -- --test-threads=1`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server --lib`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
    -p codex-app-server --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
    -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
    codex-exec -p codex-cli`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
    codex-app-server`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
    codex-rollout`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
    - `just argument-comment-lint -p codex-core`
    - `just argument-comment-lint -p codex-rollout`
    
    Focused coverage added in `codex-rollout`:
    
    - `recorder::tests::state_db_init_backfills_before_returning` verifies
    the rollout metadata row exists before startup init returns.
    - `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
    verifies startup waits for another worker to finish backfill instead of
    disabling the handle for the process.
    -
    `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
    verifies startup does not hang indefinitely on a stuck backfill lease.
    -
    `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
    verifies DB-backed lookup accepts valid existing rollout paths even when
    the filename does not include the thread UUID.
    -
    `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
    verifies DB-backed lookup ignores a stale row whose existing path
    belongs to another thread and read-repairs the row after filesystem
    fallback.
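
    The two `find_thread_path` cases above can be sketched as a single
    lookup that verifies the DB row before trusting it. The signature
    below is an assumption made for illustration: the DB is modeled as a
    `HashMap`, and reading `session_meta.id` and searching the
    filesystem are passed in as closures so the example stays
    self-contained.

    ```rust
    use std::collections::HashMap;
    use std::path::{Path, PathBuf};

    /// Looks up a rollout path for `thread_id`.
    ///
    /// * `db_rows` stands in for the SQLite rows (thread id to path).
    /// * `read_session_meta_id` returns the id recorded in the file at a
    ///   path, or `None` when the file is missing or unreadable.
    /// * `filesystem_search` is the fallback scan.
    pub fn find_thread_path(
        thread_id: &str,
        db_rows: &mut HashMap<String, PathBuf>,
        read_session_meta_id: impl Fn(&Path) -> Option<String>,
        filesystem_search: impl Fn(&str) -> Option<PathBuf>,
    ) -> Option<PathBuf> {
        // Accept the DB path only when the file's own metadata agrees,
        // regardless of what the filename looks like.
        if let Some(path) = db_rows.get(thread_id) {
            if read_session_meta_id(path).as_deref() == Some(thread_id) {
                return Some(path.clone());
            }
        }
        // Stale or missing row: fall back to the filesystem, then
        // read-repair the row so the next lookup hits the DB.
        let found = filesystem_search(thread_id)?;
        db_rows.insert(thread_id.to_string(), found.clone());
        Some(found)
    }
    ```

    Checking the recorded id instead of the filename is what lets a
    non-canonical filename pass while a row pointing at another thread's
    file is rejected.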
    
    Focused coverage updated in `codex-core`:
    
    - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
    DB-preferred rollout file with matching `session_meta.id`, so it still
    verifies that valid SQLite paths win without depending on stale/empty
    rollout contents.
    
    `cargo test -p codex-app-server thread_list_respects_search_term_filter
    -- --test-threads=1 --nocapture` was attempted locally but timed out
    waiting for the app-server test harness `initialize` response before
    reaching the changed thread-list code path.
    
    `bazel test //codex-rs/thread-store:thread-store-unit-tests
    --test_output=errors` was attempted locally after the thread-store fix,
    but this container failed before target analysis while fetching `v8+`
    through BuildBuddy/direct GitHub. The equivalent local crate coverage,
    including `cargo test -p codex-thread-store`, passes.
    
    A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
    also requires system `libcap.pc` for `codex-linux-sandbox`; the
    follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
    this container.
  • Move item event mapping into app-server-protocol (#20299)
    ## Why
    
    Follow-up to #20291.
    
    The v2 item-event-to-notification translation had been embedded in
    `app-server/src/bespoke_event_handling.rs`, which made it hard to reuse
    anywhere else. This PR moves that stateless mapping into shared protocol
    code so other entry points can produce the same `ServerNotification`
    payloads without copying app-server logic.
    
    That also lets `thread-manager-sample` demonstrate the same notification
    surface that the app server exposes, instead of only printing the final
    assistant message.
    
    ## What changed
    
    - move `item_event_to_server_notification` into
    `codex-app-server-protocol::protocol::event_mapping`
    - keep the mapper tests next to the shared implementation in
    `codex-app-server-protocol`
    - re-export the mapper from `codex-core-api` so lightweight consumers
    can use it without reaching into `app-server-protocol` directly
    - simplify `app-server/src/bespoke_event_handling.rs` so it delegates
    the stateless event-to-notification projection to the shared helper
    - update `thread-manager-sample` to:
      - print mapped notifications as newline-delimited JSON
      - use the shared mapper through `codex-core-api`
      - enable the default feature set so the sample exposes the normal tool
        surface
      - use a `read_only` permission profile so shell commands can run in the
        sample without widening permissions
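
    To make "stateless mapping" concrete, here is a minimal sketch of a
    pure event-to-notification function plus newline-delimited JSON
    output. Everything in it is hypothetical: `ItemEvent`,
    `Notification`, the method strings, and the field names are invented
    for illustration and do not reflect the real `ServerNotification`
    schema or the signature of `item_event_to_server_notification`. JSON
    is formatted by hand only to keep the example dependency-free.

    ```rust
    /// Hypothetical core-side event.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum ItemEvent {
        Started { item_id: String },
        Completed { item_id: String },
        /// An event with no client-facing notification.
        Heartbeat,
    }

    /// Hypothetical protocol-side notification.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Notification {
        pub method: &'static str,
        pub item_id: String,
    }

    /// Pure projection: the output depends only on the input event, so
    /// any entry point can call it and get the same payload.
    pub fn item_event_to_notification(event: &ItemEvent) -> Option<Notification> {
        match event {
            ItemEvent::Started { item_id } => Some(Notification {
                method: "item/started",
                item_id: item_id.clone(),
            }),
            ItemEvent::Completed { item_id } => Some(Notification {
                method: "item/completed",
                item_id: item_id.clone(),
            }),
            ItemEvent::Heartbeat => None,
        }
    }

    /// Escapes the characters that must not appear raw in a JSON string.
    fn escape_json(input: &str) -> String {
        let mut out = String::with_capacity(input.len());
        for ch in input.chars() {
            match ch {
                '"' => out.push_str("\\\""),
                '\\' => out.push_str("\\\\"),
                '\n' => out.push_str("\\n"),
                c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
                c => out.push(c),
            }
        }
        out
    }

    /// Renders one notification as a single JSON line (no trailing newline).
    pub fn to_json_line(notification: &Notification) -> String {
        format!(
            "{{\"method\":\"{}\",\"itemId\":\"{}\"}}",
            escape_json(notification.method),
            escape_json(&notification.item_id)
        )
    }
    ```

    A consumer in the style of the sample would map each event, skip the
    `None` results, and print one `to_json_line` result per line.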
    
    ## Testing
    
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core-api`
    - `cargo test -p codex-app-server bespoke_event_handling::tests`
    - `cargo test -p codex-thread-manager-sample`
    - `cargo run -p codex-thread-manager-sample -- "briefly explore the repo
    with pwd and ls, then summarize it"`
  • Reduce the surface of collaboration modes (#20149)
    Collaboration modes were slightly invasive, reaching into both
    ThreadManager construction and ModelProvider.
  • Add codex-core public API listing (#20243)
    Summary:
    - Add a checked-in codex-core public API listing generated by
    cargo-public-api.
    - Add scripts/regen-public-api.sh with an embedded crate list,
    auto-install for cargo-public-api 0.51.0, pinned nightly, and --check
    mode.
    - Add Rust CI jobs on the codex Linux x64 runner pool to verify the
    listing stays up to date.
    
    Testing:
    - bash -n scripts/regen-public-api.sh
    - just regen-public-api --check
    - yq '.' .github/workflows/rust-ci.yml
    .github/workflows/rust-ci-full.yml
    - git diff --check