11 Commits

  • [codex] Instrument rollout persistence bytes (#29498)
    - Add 1%-sampled rollout persistence metrics that report per-item and
    per-thread JSON byte totals before and after filtering when metrics
    export is enabled.
    - Tag each item with its exact response or event variant, including
    nested turn-item kinds for conditionally persisted completion events, so
    aggregate cloud-storage impact can be estimated by policy choice.
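
    A minimal sketch of the accounting described above, not the actual
    implementation: `ByteTotals`, `ThreadByteMetrics`, and `is_sampled` are
    hypothetical names, and hashing the thread id to pick roughly 1% of
    threads is an assumption about how the sampling could work.

    ```rust
    use std::collections::hash_map::DefaultHasher;
    use std::collections::BTreeMap;
    use std::hash::{Hash, Hasher};

    /// Byte totals for one item variant, before and after the persistence filter.
    #[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
    pub struct ByteTotals {
        pub items: u64,
        pub bytes_before: u64,
        pub bytes_after: u64,
    }

    /// Hypothetical sampler: keeps roughly 1 in 100 threads by hashing the
    /// thread id, so every item of a sampled thread is counted.
    pub fn is_sampled(thread_id: &str) -> bool {
        let mut hasher = DefaultHasher::new();
        thread_id.hash(&mut hasher);
        hasher.finish() % 100 == 0
    }

    /// Accumulates per-variant totals for a single thread.
    #[derive(Debug, Default)]
    pub struct ThreadByteMetrics {
        by_variant: BTreeMap<String, ByteTotals>,
    }

    impl ThreadByteMetrics {
        /// `json` is the serialized item; `persisted` says whether the
        /// filter kept it.
        pub fn record(&mut self, variant: &str, json: &str, persisted: bool) {
            let len = json.len() as u64;
            let totals = self.by_variant.entry(variant.to_string()).or_default();
            totals.items += 1;
            totals.bytes_before += len;
            if persisted {
                totals.bytes_after += len;
            }
        }

        /// Totals for one variant tag, if any item with that tag was recorded.
        pub fn variant(&self, variant: &str) -> Option<ByteTotals> {
            self.by_variant.get(variant).copied()
        }

        /// Per-thread totals summed across all variants.
        pub fn thread_total(&self) -> ByteTotals {
            let mut total = ByteTotals::default();
            for t in self.by_variant.values() {
                total.items += t.items;
                total.bytes_before += t.bytes_before;
                total.bytes_after += t.bytes_after;
            }
            total
        }
    }
    ```

    Comparing `bytes_before` with `bytes_after` per variant tag is what
    would let storage impact be estimated for a given persistence policy.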
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
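
    The two replacement shapes named above can be sketched as follows.
    This is an illustration under assumed names (`Greeter`, `DynGreeter`,
    `English`, and the tiny std-only `block_on` helper are all
    hypothetical), not code from this change.

    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};

    /// Statically dispatched: native return-position `impl Future` with an
    /// explicit `Send` bound in the trait signature.
    pub trait Greeter {
        fn greet(&self, name: &str) -> impl Future<Output = String> + Send;
    }

    /// Boxed `Send` future for traits that must remain object safe.
    pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

    pub trait DynGreeter {
        fn greet_dyn<'a>(&'a self, name: &'a str) -> BoxFuture<'a, String>;
    }

    pub struct English;

    impl English {
        // The former `async_trait` body, outlined into an inherent method.
        async fn greet_inner(&self, name: &str) -> String {
            format!("hello, {name}")
        }
    }

    impl Greeter for English {
        fn greet(&self, name: &str) -> impl Future<Output = String> + Send {
            self.greet_inner(name)
        }
    }

    impl DynGreeter for English {
        fn greet_dyn<'a>(&'a self, name: &'a str) -> BoxFuture<'a, String> {
            Box::pin(self.greet_inner(name))
        }
    }

    struct NoopWake;

    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }

    /// Demo-only executor: polls in a loop, which is adequate for futures
    /// that never actually wait on I/O.
    pub fn block_on<F: Future>(fut: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut fut = Box::pin(fut);
        loop {
            if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
                return out;
            }
        }
    }
    ```

    With the `Send` bound written in the trait, a non-`Send` future in an
    implementation fails to compile at the impl rather than at a distant
    spawn site.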
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • [codex] Add rollout-backed thread content search (#23519)
    ## Summary
    - add experimental `thread/search` for local rollout-backed thread
    search using `rg` over JSONL rollouts
    - return search-specific result rows with optional previews instead of
    storing preview data on `StoredThread` or ordinary `Thread` responses
    - keep `thread/list` separate from full-content search and document the
    new app-server surface
    
    ## Testing
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server thread_search_returns_content_and_title_matches -- --nocapture`
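
    A rough sketch of the search shape from the summary, under stated
    assumptions: `ThreadSearchRow`, `rg_command`, and `rows_from_stdout`
    are hypothetical names, and the particular `rg` flags chosen here are
    a guess at a reasonable invocation, not necessarily the ones this
    change uses.

    ```rust
    use std::path::{Path, PathBuf};
    use std::process::Command;

    /// Hypothetical search-specific row. The optional preview lives here
    /// rather than on the stored thread type.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct ThreadSearchRow {
        pub rollout_path: PathBuf,
        pub preview: Option<String>,
    }

    /// Builds (but does not run) an `rg` invocation that lists the JSONL
    /// rollout files containing `query` as a literal string.
    pub fn rg_command(query: &str, rollouts_dir: &Path) -> Command {
        let mut cmd = Command::new("rg");
        cmd.arg("--files-with-matches")
            .arg("--fixed-strings")
            .arg("--glob")
            .arg("*.jsonl")
            .arg("--") // end of flags, so a query starting with '-' is safe
            .arg(query)
            .arg(rollouts_dir);
        cmd
    }

    /// Converts rg's newline-separated file list into rows without previews.
    pub fn rows_from_stdout(stdout: &str) -> Vec<ThreadSearchRow> {
        stdout
            .lines()
            .filter(|line| !line.trim().is_empty())
            .map(|line| ThreadSearchRow {
                rollout_path: PathBuf::from(line),
                preview: None,
            })
            .collect()
    }
    ```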
  • Unify thread metadata updates above store (#22236)
    - make ThreadStore::update_thread_metadata accept a broad range of
    metadata patches
    - keep ThreadStore::append_items as raw canonical history append (no
    metadata side effects)
    - in the local store, write these metadata updates to a combination of
    sqlite and rollout jsonl files for backwards compatibility. The local
    implementation special-cases which fields go into jsonl and which into
    sqlite, confining the awkwardness to just this implementation
    - in remote stores, persist the metadata directly to a database, with
    no special casing required
    - move the "implicit metadata updates triggered by appending rollout
    items" from the RolloutRecorder (which is local-threadstore-specific) to
    the LiveThread layer above the ThreadStore, inside of a private helper
    utility called ThreadMetadataSync. LiveThread calls ThreadStore
    append_items and update_metadata separately.
    - Add a generic update metadata method to ThreadManager that works on
    both live threads and "cold" threads
    - Call that ThreadManager method from app server code, so app server
    doesn't need to worry about whether the thread is live or not
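
    The split between raw appends and metadata patches could look roughly
    like this. It is a sketch, not the real signatures: the patch fields,
    `InMemoryStore`, `append_and_sync`, and the "first item becomes the
    title" rule are all assumptions made for illustration.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical patch: `None` leaves the field untouched.
    #[derive(Debug, Default, Clone, PartialEq, Eq)]
    pub struct ThreadMetadataPatch {
        pub title: Option<String>,
        pub archived: Option<bool>,
    }

    #[derive(Debug, Default, Clone, PartialEq, Eq)]
    pub struct ThreadMetadata {
        pub title: Option<String>,
        pub archived: bool,
    }

    pub trait ThreadStore {
        /// Raw history append with no metadata side effects.
        fn append_items(&mut self, thread_id: &str, items: &[String]);
        fn update_thread_metadata(&mut self, thread_id: &str, patch: ThreadMetadataPatch);
    }

    #[derive(Default)]
    pub struct InMemoryStore {
        pub history: HashMap<String, Vec<String>>,
        pub metadata: HashMap<String, ThreadMetadata>,
    }

    impl ThreadStore for InMemoryStore {
        fn append_items(&mut self, thread_id: &str, items: &[String]) {
            self.history
                .entry(thread_id.to_string())
                .or_default()
                .extend_from_slice(items);
        }

        fn update_thread_metadata(&mut self, thread_id: &str, patch: ThreadMetadataPatch) {
            let meta = self.metadata.entry(thread_id.to_string()).or_default();
            if let Some(title) = patch.title {
                meta.title = Some(title);
            }
            if let Some(archived) = patch.archived {
                meta.archived = archived;
            }
        }
    }

    /// Stand-in for the private helper above the store that derives
    /// implicit metadata updates from appended items.
    #[derive(Default)]
    pub struct ThreadMetadataSync {
        has_title: bool,
    }

    impl ThreadMetadataSync {
        /// Hypothetical rule: the first appended item becomes the title.
        pub fn patch_for(&mut self, items: &[String]) -> Option<ThreadMetadataPatch> {
            if self.has_title {
                return None;
            }
            let first = items.first()?;
            self.has_title = true;
            Some(ThreadMetadataPatch {
                title: Some(first.clone()),
                ..Default::default()
            })
        }
    }

    /// The layer above the store calls append and metadata update separately.
    pub fn append_and_sync(
        store: &mut dyn ThreadStore,
        sync: &mut ThreadMetadataSync,
        thread_id: &str,
        items: &[String],
    ) {
        store.append_items(thread_id, items);
        if let Some(patch) = sync.patch_for(items) {
            store.update_thread_metadata(thread_id, patch);
        }
    }
    ```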
  • [codex] Remove remote thread store implementation (#21596)
    Remove the remote thread-store backend and checked-in protobuf
    artifacts. We've moved these into another crate that links against this
    one.
    
    Also remove the config settings for thread store backend selection,
    since we'll instead pass an instantiated thread store into the core-api
    crate's main entrypoint.
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` entails both running standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
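
    For reference, the setting is a per-target field in a crate's
    `Cargo.toml`. The fragment below assumes the crate has a library
    target; it is a generic illustration, not a copy of any manifest
    touched here.

    ```toml
    [lib]
    doctest = false
    ```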
    
    For the collection of crates below, this speeds up test execution by
    >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Route live thread writes through ThreadStore (#18882)
    Begin migrating the thread write codepaths to ThreadStore.
    
    This starts using ThreadStore inside of core session code, not only in
    the app server code.
    
    Rework the interfaces around thread recording/persistence. We're left
    with the following:
    
    * `ThreadManager`: owns the process-level registry of loaded threads and
    handles cross-thread orchestration: start, resume, fork, lookup, remove,
    and route ops to running CodexThreads.
    * `CodexThread`: represents one loaded/running thread from the outside.
    It is the handle app-server and callers use to submit ops, inspect
    session metadata, and shut the thread down.
    * `LiveThread`: session-owned persistence lifecycle handle for one
    active thread. Core session code uses it to append rollout items,
    materialize lazy persistence, flush, shutdown, discard init-failed
    writers, and load that thread’s persisted history.
    * `ThreadStore`: storage backend abstraction. It answers “how are
    threads persisted, read, listed, updated, archived?” Local and remote
    implementations live behind this trait.
    * `LocalThreadStore`: local ThreadStore implementation. It owns the
    file/sqlite-specific details and keeps RolloutRecorder as a local
    implementation detail.
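
    As a rough picture of how `LiveThread` sits on top of `ThreadStore`,
    here is a simplified synchronous sketch. The method names, the
    in-memory store, and the reading of "lazy persistence" as "buffer
    items until the thread is materialized" are assumptions for
    illustration, not the actual interfaces.

    ```rust
    use std::collections::HashMap;
    use std::sync::{Arc, Mutex};

    /// Storage backend abstraction (heavily reduced).
    pub trait ThreadStore: Send + Sync {
        fn append_items(&self, thread_id: &str, items: Vec<String>);
        fn read_thread(&self, thread_id: &str) -> Option<Vec<String>>;
    }

    /// Stand-in backend that keeps everything in memory.
    #[derive(Default)]
    pub struct InMemoryThreadStore {
        threads: Mutex<HashMap<String, Vec<String>>>,
    }

    impl ThreadStore for InMemoryThreadStore {
        fn append_items(&self, thread_id: &str, items: Vec<String>) {
            self.threads
                .lock()
                .unwrap()
                .entry(thread_id.to_string())
                .or_default()
                .extend(items);
        }

        fn read_thread(&self, thread_id: &str) -> Option<Vec<String>> {
            self.threads.lock().unwrap().get(thread_id).cloned()
        }
    }

    /// Session-owned persistence handle for one active thread.
    pub struct LiveThread {
        thread_id: String,
        store: Arc<dyn ThreadStore>,
        pending: Vec<String>,
        materialized: bool,
    }

    impl LiveThread {
        pub fn new(thread_id: &str, store: Arc<dyn ThreadStore>) -> Self {
            Self {
                thread_id: thread_id.to_string(),
                store,
                pending: Vec::new(),
                materialized: false,
            }
        }

        /// Buffers the item until the thread is materialized, then writes
        /// straight through to the store.
        pub fn append(&mut self, item: &str) {
            if self.materialized {
                self.store
                    .append_items(&self.thread_id, vec![item.to_string()]);
            } else {
                self.pending.push(item.to_string());
            }
        }

        /// Creates the persisted thread and writes out buffered items.
        pub fn materialize(&mut self) {
            if !self.materialized {
                self.materialized = true;
                let pending = std::mem::take(&mut self.pending);
                self.store.append_items(&self.thread_id, pending);
            }
        }

        pub fn load_history(&self) -> Vec<String> {
            self.store.read_thread(&self.thread_id).unwrap_or_default()
        }
    }
    ```

    Because `LiveThread` only holds an `Arc<dyn ThreadStore>`, the session
    code does not need to know which backend is behind it.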
    
    This is a few too many Thread abstractions for my liking, but they do
    all represent different concepts / needs / layers.
    
    Migration note: in places where the core code explicitly requires a
    path, rather than a thread ID, throw an error if we're running with a
    remote store.
    
    Cover the new local live-writer lifecycle with focused tests and
    preserve app-server thread-start behavior, including ephemeral pathless
    sessions.
  • codex: route thread/read persistence through thread store (#18352)
    Summary
    - replace the thread/read persisted-load helper with
    ThreadStore::read_thread
    - move SQLite/rollout summary, name, fork metadata, and history loading
    for persisted reads into LocalThreadStore
    - leave getConversationSummary unchanged for a later PR
    
    Context
    - Replaces closed stacked PR #18232 after PR #18231 merged and its base
    branch was deleted.
  • [codex] Add remote thread store implementation (#17826)
    - Add a "remote" thread store implementation
    - Implement the remote thread store as a thin wrapper that makes grpc
    calls to a configurable service endpoint
    - Implement only the thread/list method to start
    - Encode the grpc method/param shape as protobufs in the remote
    implementation
    
    A wart: the proto generation script is an "example" binary target. This
    is an example target only because Cargo lets examples use
    dev-dependencies, which keeps tonic-prost-build out of the normal
    codex-thread-store dependency surface. A regular bin would either need
    to add proto generation deps as normal runtime deps, or use a
    feature-gated optional dep, which this repo’s manifest checks explicitly
    reject.
  • [codex] Add local thread store listing (#17824)
    Builds on top of #17659 
    
    Move the filesystem + sqlite thread listing-related operations inside of
    a local ThreadStore implementation and call ThreadStore from the places
    that used to perform these filesystem/sqlite operations.
    
    This is the first of a series of PRs that will implement the rest of the
    local ThreadStore.
    
    Testing:
    - added unit tests for the thread store implementation
    - adjusted some unit tests in the realtime + personality packages whose
    callsites changed. Specifically I'm trying to hide ThreadMetadata inside
    of the local implementation and make ThreadMetadata a sqlite
    implementation detail concern rather than a public interface, preferring
    the more generic StoredThread interface instead
    - added a corner case test for the personality migration package that
    wasn't covered by the existing test suite
    - adjust the behavior of searched thread listing to run the existing
    local rollout repair/backfill pass _before_ querying SQLite results, so
    callers using ThreadStore::list_threads do not miss matches after a
    partial metadata warm-up
  • ThreadStore interface (#17659)
    Introduce a ThreadStore interface for mediating access to the filesystem
    (rollout jsonl files + sqlite db) based thread storage.
    
    In later PRs we'll move the existing fs code behind a "local"
    implementation of this ThreadStore interface.
    
    This PR should be a no-op behaviorally; it only introduces the
    interface.