121 Commits

  • feat(app-server): add history_mode to thread (#29927)
    ## Description
    
    This PR adds a new `historyMode = "legacy" | "paginated"` to `Thread`.
    This will be stored in `SessionMeta` in the JSONL rollout file and as a
    new column in the SQLite thread_metadata table, and exposed on
    `thread/start` and on the `Thread` object in app-server.
    
    ## What changed
    
    - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`,
    defaulting old and new SessionMeta to `legacy`.
    - Carried `history_mode` through core session config, ThreadStore stored
    metadata, local/in-memory stores, rollout metadata extraction, and the
    existing SQLite `threads` table.
    - Added experimental `historyMode` to app-server v2 `Thread` and
    `thread/start`.
    - Made paginated stored threads metadata-discoverable but unsupported
    for legacy full-history reads, `load_history`, live resume, and create
    paths.
    - Regenerated app-server schema fixtures and added
    protocol/state/thread-store/app-server coverage for persistence and
    fail-closed behavior.
    
    ## Compatibility floor
    Because users may be running various versions of Codex binaries on the
    same machine (TUI, Codex App, etc.), we will need to establish a
    compatibility floor for upcoming paginated threads, which will change
    how thread storage reads and writes work.
    
    The overall plan here:
    ```
    Release N:
    - Add historyMode to SessionMeta / Thread / SQLite metadata.
    - Teach binaries to understand paginated threads.
    - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread.
    - Default remains `"legacy"`.
    
    Release N+1:
    - First-party clients start opting into paginated threads where appropriate.
    - Internal dogfood / staged rollout.
    - Measure old-client usage and paginated-thread unsupported errors.
    
    Release N+2:
    - Only after Release N+ is overwhelmingly deployed, make paginated the default.
    - Accept that a small tail of N-1-or-older binaries may not understand paginated threads.
    ```
    
    The important behavior change is fail-closed handling for a binary that
    encounters a persisted `paginated` thread before it knows how to fully
    support paginated history. In app-server, if a thread is `paginated`, we
    will:
    
    - allow metadata-only discovery paths like `thread/list` and
    `thread/read(includeTurns=false)`, so clients can still see the thread
    and inspect its `historyMode`
    - reject legacy full-history/live-thread paths like
    `thread/read(includeTurns=true)` and `thread/resume` with an unsupported
    JSON-RPC error
    - avoid silently treating an unknown or future `historyMode` as `legacy`
    
    Under the hood, the ThreadStore layer also rejects legacy operations
    that would need to load or replay the full thread history for a
    paginated thread. That gives us the behavior we want for Release N:
    future paginated threads are visible, but this binary fails closed
    instead of trying to operate on them as if they were legacy threads.
  • feat: add provider-aware model fallback to thread start (#29942)
    ## Why
    
    Helper threads such as task title generation can request a model ID that
    is valid for the default OpenAI provider but unavailable from the active
    provider. With Amazon Bedrock, `gpt-5.4-mini` is rejected while the
    provider static catalog exposes Bedrock model IDs such as
    `openai.gpt-5.5` and `openai.gpt-5.4`. This causes repeated background
    404s and can surface a misleading turn error even when the main turn
    succeeds.
    
    Clients need an explicit way to ask app-server to resolve an unavailable
    helper model to the active provider default. That fallback must remain
    limited to providers with an authoritative static catalog so custom or
    dynamically discovered model IDs are not rewritten based on an
    incomplete catalog.
    
    Fixes #28741.
    
    ## What changed
    
    - Add the experimental `allowProviderModelFallback` option to
    `thread/start`, defaulting to `false` to preserve existing behavior.
    - Thread the option through thread creation and model selection.
    - When enabled for a static model manager, preserve requested models
    present in the catalog and replace unavailable models with the provider
    default.
    - Continue preserving explicit model IDs for dynamic model managers
    without fetching a catalog solely to validate them.
    - Document the new `thread/start` behavior in the app-server API
    overview.
    
    ## Test
    Temporary test-client harness:
    ```
    ThreadStartParams {
        model: Some("gpt-5.4-mini".to_string()),
        allow_provider_model_fallback: true,
        ..Default::default()
    }
    ```
    Command:
    ```
    CODEX_HOME=/tmp/codex-bedrock-thread-start-home \
    CODEX_E2E_BEDROCK_THREAD_START_ONLY=1 \
    ./target/debug/codex-app-server-test-client \
      --codex-bin ./target/debug/codex \
      -c 'model_provider="amazon-bedrock"' \
      send-message-v2 --experimental-api ignored
    ```
    Relevant output:
    ```
    > "method": "thread/start",
    > "params": {
    >   "model": "gpt-5.4-mini",
    >   "modelProvider": null,
    >   "allowProviderModelFallback": true,
    >   ...
    > }
    
    < "result": {
    <   "model": "openai.gpt-5.5",
    <   "modelProvider": "amazon-bedrock",
    <   ...
    < }
    ```
  • [codex] Add Ultra reasoning effort (#29899)
    ## Why
    
    Ultra should be one user-facing reasoning selection for work that
    benefits from both maximum reasoning and proactive multi-agent
    delegation. Without it, clients must coordinate maximum reasoning with
    the experimental `multiAgentMode` setting, even though the inference
    backend still expects its existing `max` effort value.
    
    This change makes reasoning effort the source of truth: clients select
    `ultra`, core derives proactive multi-agent behavior when the turn is
    eligible for multi-agent V2, and inference requests continue to use the
    backend-compatible `max` value.
    
    ## What changed
    
    - Add `ultra` as a first-class reasoning effort and preserve
    model-catalog ordering when exposing it to clients.
    - Convert `ultra` to `max` at the inference request boundary, including
    Responses HTTP/WebSocket requests, startup prewarm, compaction, and
    memory summarization.
    - Derive effective multi-agent mode per turn from effective reasoning
    effort:
      - eligible multi-agent V2 + `ultra` → `proactive`
      - eligible multi-agent V2 + any other effort → `explicitRequestOnly`
    - V1 or otherwise ineligible sessions → no multi-agent mode instruction
    - Keep the derived effective mode in turn context history so successive
    turns can emit a developer-message update only when the effective mode
    changes.
    - Remove selected multi-agent mode from core session configuration, turn
    construction, thread settings, resume/fork restoration, and subagent
    spawn plumbing. Subagents inherit reasoning effort and derive their own
    effective mode.
    - Retain the experimental app-server `multiAgentMode` fields for wire
    compatibility while marking them deprecated. Request values are accepted
    but ignored; compatibility response fields report `explicitRequestOnly`.
    - Display Ultra in the TUI using the order supplied by `model/list`.
    
    ## Validation
    
    - `just test -p codex-core ultra_reasoning_uses_max_for_requests`
    - `just test -p codex-tui model_reasoning_selection_popup`
  • [codex] Inject agent graph store into ThreadManager (#29736)
    Pick up the AgentGraphStore migration.
    
    - Inject an explicit optional agent graph store into `ThreadManager` 
    - Move all calls to spawn, close, recursive resume, and
    subtree/archive/delete/feedback traversal through it
    - Keep using  `LocalAgentGraphStore` when SQLite is available
    
    This required some changes to the interface to deal with futures:
    
    - The interface now matches `ThreadStore`'s object-safe pattern by
    returning a boxed `AgentGraphStoreFuture` directly, allowing
    `ThreadManager` to hold `Arc<dyn AgentGraphStore>`
    
    *Slight behavior change!* Unfiltered subtree enumeration now performs a
    single all-status breadth-first traversal, so a closed grandchild
    beneath an open edge is included; the previous Open-then-Closed
    traversals could not cross mixed-status paths and silently omitted it.
  • test: add app-server auto environment helper (#29746)
    ## Why
    
    Start moving towards app-server tests defaulting to running against
    remote & foreign OS executors. To do so we need a point of indirection
    similar to core integration tests' `build_with_auto_env`, but with the
    flexibility of letting tests control environment registration if they
    need to.
    
    ## What
    
    This adds:
    
    - `TestAppServer::new_with_auto_env()` for constructing an app server
    with a default environment defined by the test runner (e.g. bazel)
    - `TestAppServer::auto_env_params()` for tests to easily acquire turn
    env params tailored to the automatic environment
    - `TestAppServer::send_thread_start_request_with_auto_env()` to make it
    easy for tests to start a thread using the automatic environment
    
    The above methods all fail if the test calling them has set up an
    environment where the automatic environment configuration conflicts with
    test-created state.
    
    ## Validation
    
    Adds a couple of basic smoke tests to the app-server test suite.
    Follow-ups will migrate more tests to use it.
  • core tests: rename automatic environment builder (#29728)
    ## Why
    
    Use a clearer name for what happens when this helper sets up a test
    environment.
    
    ## What
    
    - Rename the builder and its harness wrapper to use `auto_env` instead
    of `remote_env` because the helper will set up a local environment if
    configured by the build system.
  • test: branch on target OS instead of runner flavor (#29712)
    ## Why
    
    Core tests should branch on the executor's operating system, not on
    runner details such as Docker or Wine. This keeps platform behavior
    stable as new test backends are added and reserves Wine-specific skips
    for actual runner debt.
    
    ## What
    
    - Add `TestTargetOs` and target/host-aware skip helpers while keeping
    `TestEnvironment` internal.
    - Replace topology enum access with remote predicates and a narrow
    Docker accessor.
    - Migrate OS-semantic Wine skips, preserve runner-specific gaps, and
    document the skip taxonomy.
    
    ## Validation
    
    - `just test -p core_test_support`
    - `just test -p codex-core
    remote_test_env_can_connect_and_use_filesystem`
    - `bazel test //codex-rs/core:core-all-wine-exec-test
    --test_output=errors` reached test execution; unrelated existing
    view-image, path, and timing failures remain.
    - `just test -p codex-core` and `just test` reached broad test
    execution; this checkout has unrelated helper, sandbox, and timing
    failures.
  • path-uri: clarify host-native path conversion (#29501)
    ## Why
    
    Downstream refactors are producing confusing code with this
    functionality having a very generic name. Encoding the specific
    conversion approach in the method name makes it clearer.
    
    ## What
    
    Rename `PathUri::from_path` to `PathUri::from_host_native_path` and
    update its Rust call sites.
  • Expose thread-level multi-agent mode (#28792)
    ## Why
    
    Once multi-agent mode can be selected per turn, clients also need to
    choose the initial selection when creating a thread and observe that
    selection through lifecycle and settings APIs.
    
    The selected value is intentionally distinct from the effective
    model-visible value: no client selection is represented as `null`, even
    though an eligible multi-agent v2 turn derives `explicitRequestOnly` as
    its effective default.
    
    ## What changed
    
    - Add the optional experimental `thread/start.multiAgentMode` parameter
    and pass it through thread creation.
    - Preserve an omitted initial value as an unset selection rather than
    eagerly storing `explicitRequestOnly`.
    - Apply an explicit `thread/start` selection to the first turn through
    the session configuration established at thread creation.
    - Restore the latest persisted effective mode as the selected baseline
    on cold resume when rollout history contains one.
    - Inherit the optional selected mode from a loaded parent when creating
    related runtime threads.
    - Return the current selected `multiAgentMode` from `thread/start`,
    `thread/resume`, `thread/fork`, and thread settings, using `null` when
    no mode is selected.
    - Keep lifecycle reporting independent from model capability and feature
    eligibility; core turn construction remains responsible for calculating
    and persisting the effective mode.
    
    ## Not covered
    
    - Clearing an existing loaded-session selection back to unset through
    `turn/start`; omitted or `null` currently retains the session's
    selection.
    - A TUI control, slash command, or `config.toml` preference.
    
    ## Verification
    
    - `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol`
    - `CARGO_INCREMENTAL=0 just test -p codex-app-server multi_agent_mode`
    
    The focused app-server coverage verifies explicit `thread/start`
    initialization, first-turn prompting, nullable reporting for an omitted
    selection, and retention of selections that are not currently
    runtime-eligible.
    
    ## Stack
    
    Stacked on #28685. This PR contains only the thread initialization and
    lifecycle/settings API layer.
  • core: load AGENTS.md from foreign environments (#28958)
    ## Why
    
    Make it possible to load AGENTS.md from remote exec-servers whose OS is
    different than app-server.
    
    ## What
    
    - keep `AGENTS.md` discovery and provenance as `PathUri`, with
    root-aware parent and ancestor traversal
    - expose lifecycle instruction sources as legacy app-server path strings
    in events while retaining `PathUri` internally
    - preserve and test mixed POSIX and Windows paths in model context and
    TUI status output
    - cover remote Windows loading end to end by seeding the Wine prefix
    through host filesystem APIs
    - fix bug in `PathUri`'s parent() implementation that would erase
    Windows drive letters
  • current time reminders impl for system clock (varlatency 2/n) (#28824)
    Stacked on #28822.
    
    ## Summary
    
    - add a host-injectable current-time provider with a built-in system
    implementation
    - record UTC developer reminders in history immediately before due model
    requests
    - keep cadence state per session and force a refresh after compaction
    
    This does NOT include the app server client <-> server clock logic. This
    PR is only for the reminder message & system clock that will be used in
    prod.
    
    ## Testing
    
    - `just test -p codex-core varlatency_`
    - `just clippy -p codex-core -p codex-app-server -p codex-mcp-server -p
    codex-thread-manager-sample`
    - `just fmt`
  • Support openai/form extended form elicitations (#27500)
    # Summary
    Allow App Server clients to opt into `openai/form` MCP elicitations.
  • [tests] Keep Apps out of generic core test harness (#28508)
    ## Summary
    
    - disable the stable Apps feature in the generic `test_codex()`
    integration-test harness
    - keep Apps-specific tests explicit: their builders re-enable Apps and
    point it at a local mock server
    
    ## Why
    
    Generic tests that use dummy ChatGPT auth were also enabling the
    host-owned `codex_apps` MCP server. That made unrelated tests contact
    `chatgpt.com` and wait for MCP startup, causing the Bazel timeouts
    observed on #28368.
    
    The generic harness should be hermetic and should not start an external
    service that the test did not request. This is test-only; production
    Apps behavior is unchanged. The broader optional-MCP startup behavior is
    being handled separately in #28407.
    
    ## Testing
    
    - `just test -p codex-core -E
    'test(pre_sampling_compact_runs_when_comp_hash_changes) |
    test(model_switch_to_smaller_model_updates_token_context_window) |
    test(codex_apps_file_params_upload_local_paths_before_mcp_tool_call)'`
    - `just fix -p codex-core`
    - `just fmt`
  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
  • Run core integration tests against a Wine-backed Windows executor (#28401)
    ## Why
    
    We want to exercise a linux app-server against a windows exec-server
    without having to repeat every test case. This approach has slight
    precedent in the remote docker test setup.
    
    ## What
    
    Run the shared `codex-core` integration suite against Windows
    exec-server behavior from Linux. This makes cross-OS path and shell
    regressions visible while keeping unsupported cases owned by individual
    tests.
    
    - Add `local`, `docker`, and `wine-exec` test environment selection with
    legacy Docker compatibility.
    - Extend `codex_rust_crate` to generate a sharded Wine-exec variant
    using a cross-built Windows server and pinned Bazel Wine/PowerShell
    runtimes.
    - Teach remote-aware helpers about Windows paths and track temporary
    incompatibilities with source-local `skip_if_wine_exec!` calls and
    follow-up reasons.
  • [codex] exec-server honors remote environment cwd and shell (#28122)
    ## Why
    
    Next slice needed to make progress on the `remote_env_windows` test is
    to support passing a Windows cwd for the remote environment and using
    that environment's native shell. This lets the test run a real Windows
    process instead of only recording an early path or shell mismatch.
    
    ## What
    
    - change `TurnEnvironmentSelection.cwd` from `AbsolutePathBuf` to
    `PathUri`
    - convert local cwd values to URIs when constructing selections
    - preserve a remote primary cwd instead of replacing it with the local
    legacy fallback
    - prefer the selected environment's discovered shell for unified exec,
    falling back to the session shell when unavailable
    - convert back to a host-native absolute path at current native-only
    consumer boundaries
    - reject or deny unsupported foreign cwd values at the existing
    request-permissions boundary, with TODOs for its future migration
    - extend the hermetic Wine test to execute Windows PowerShell in
    `C:\windows` and verify successful process completion
    - record the current app-server rejection against the same Wine-backed
    remote Windows fixture when its cwd is supplied as a native Windows path
  • [codex] make PathUri::from_abs_path infallible (#27976)
    ## Why
    
    `PathUri::from_abs_path` can fail for absolute paths that do not have a
    normal `file:` URI representation, forcing filesystem call sites to
    handle a conversion error even though the original path can be preserved
    losslessly.
    
    ## What
    
    Make `from_abs_path` infallible and migrate its callers. Unrepresentable
    paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or
    Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The
    leading encoded null reserves a namespace that cannot collide with a
    real Unix or Windows path, and fallback URIs remain opaque to lexical
    path operations.
    
    ## Validation
    
    Added path-URI coverage for Unix null and non-UTF-8 paths, Windows
    device/verbatim and non-Unicode paths, serialization, malformed
    fallbacks, opaque lexical operations, invalid native payloads, and
    literal `/bad/path` collision resistance.
  • [codex] Load user instructions through an injected provider (#27101)
    ## Why
    
    We want to remove implicit use of `$CODEX_HOME` from `codex-core` and
    make embedders responsible for supplying user-level instructions. This
    also ensures user instructions load when no primary environment is
    selected.
    
    ## What changed
    
    Stacked on #27415, which makes `codex exec` surface thread-scoped
    runtime warnings.
    
    - Added `UserInstructionsProvider` to `codex-extension-api`, with
    absolute source attribution and recoverable loading warnings.
    - Added `codex-home` with the filesystem-backed provider for
    `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback,
    trimming, lossy UTF-8 handling, and the existing uncapped global
    instruction size.
    - Removed global instruction loading from `Config` and require
    `ThreadManager` callers to inject a provider.
    - Load provider instructions once for each fresh root runtime, including
    runtimes without a primary environment. Running sessions retain their
    snapshot, while child agents inherit the parent snapshot without
    invoking the provider.
    - Keep provider instructions separate while loading project `AGENTS.md`,
    then assemble the model-visible instructions with the existing ordering,
    source attribution, warning, and turn-context behavior.
    - Wired the Codex home provider through the CLI, app server, MCP server,
    core facade, and thread-manager sample.
    
    ## Validation
    
    - `just test -p codex-home -p codex-extension-api`
    - `just test -p codex-core agents_md`
    - `just test -p codex-core guardian`
    - `just test -p codex-app-server
    thread_start_without_selected_environment_includes_only_global_instruction_source`
    - `just test -p codex-exec warning`
    - `just bazel-lock-check`
  • [codex] migrate ExecutorFileSystem paths to PathUri (#27424)
    ## Why
    
    We're moving exec-server to use PathUri for its internal path
    representations.
    
    ## What
    
    Move `ExecutorFileSystem` APIs to use `PathUri` instead of
    `AbsolutePathBuf`. Future changes will convert higher-level parts of
    exec-server.
  • Pair thread environment settings (#26687)
    ## Why
    
    Thread cwd and environment selections are a single logical setting in
    core: updating one without the other can silently desynchronize the
    next-turn execution context. This change makes that relationship
    explicit in the internal thread settings flow while preserving the
    existing app-server public API shape.
    
    ## What changed
    
    - Moved the cwd/environment pair through internal
    `ThreadSettingsOverrides.environment_settings` instead of a top-level
    internal `cwd` field.
    - Kept `thread/settings/update` public params unchanged, with app-server
    translating top-level `cwd` into the paired internal settings shape.
    - Moved `Op::UserInput` environment overrides into thread settings so
    user turns and settings updates use the same core path.
    - Updated core, app-server, MCP, memories, sample, and test callsites to
    construct the paired settings shape.
    
    ## Verification
    
    - `git diff --check`
    - Local test run starting after PR creation.
  • [codex] Use standalone tools for Responses Lite (#26490)
    ## Summary
    
    Responses Lite does not execute hosted Responses tools, so models using
    it must route web search and image generation through Codex-owned
    executors & standalone Response's API endpoints.
    
    This PR is stacked on #26487.
    
    ## Validation
    
    - `cargo test -p codex-core responses_lite_ --lib`
    - `cargo test -p codex-core
    standalone_executors_remain_hidden_without_flags_or_responses_lite
    --lib`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates --lib`
    - `cargo test -p codex-web-search-extension -p
    codex-image-generation-extension`
    - `cargo test -p codex-app-server --test all standalone_`
    - `cargo fmt --all -- --check`
  • Require absolute cwd in thread settings (#26532)
    ## Why
    
    Thread settings cwd overrides are expected to be resolved before they
    enter core. Keeping this boundary as a plain `PathBuf` made it easy for
    core/session code to keep fallback normalization and relative-path
    resolution logic in places that should only receive an already-resolved
    cwd.
    
    This is intentionally the absolute-cwd-only slice: it does not change
    environment selection stickiness or cwd-to-default-environment fallback
    behavior.
    
    ## What changed
    
    - Changes `ThreadSettingsOverrides.cwd`,
    `CodexThreadSettingsOverrides.cwd`, and `SessionSettingsUpdate.cwd` to
    use `AbsolutePathBuf`.
    - Removes core-side cwd normalization/resolution from session settings
    updates.
    - Updates affected core/app-server test helpers and callsites to pass
    existing absolute cwd values or use `abs()` helpers.
    
    ## Validation
    
    Opening as draft so CI can start while local validation continues.
  • Switch runtime to cloud config bundle (#24622)
    ## Summary
    
    - Adapts the moved `codex-cloud-config` crate from the legacy cloud
    requirements endpoint to the new config bundle endpoint.
    - Switches runtime consumers from `CloudRequirementsLoader` to
    `CloudConfigBundleLoader` so one shared bundle supplies cloud-delivered
    config and requirements.
    - Removes the legacy cloud requirements domain loader path.
    
    ## Details
    
    This intentionally keeps `codex-cloud-config` monolithic for review
    lineage: the previous PR establishes the crate move, and this PR shows
    the behavior change against that moved implementation. A follow-up PR
    splits the module back into focused files.
    
    The new bundle path preserves the important cloud requirements loader
    semantics where intended: account-scoped signed cache, 30 minute TTL, 5
    minute refresh cadence, retry/backoff, auth recovery, and fail-closed
    startup loading. The cached payload changes from a single requirements
    TOML string to the backend-delivered bundle, and validation rejects
    malformed config or requirements fragments before cache write/use.
  • Add experimental turn additional context (#24154)
    ## Summary
    
    Adds experimental `additionalContext` support to `turn/start` and
    `turn/steer` so clients can provide ephemeral external context, such as
    browser or automation state, without turning that plumbing into a
    visible user prompt or triggering user-prompt lifecycle behavior.
    
    ## API Shape
    
    The parameter shape is:
    
    ```ts
    additionalContext?: Record<string, {
      value: string
      kind: "untrusted" | "application"
    }> | null
    ```
    
    Example:
    
    ```json
    {
      "additionalContext": {
        "browser_info": {
          "value": "Active tab is CI failures.",
          "kind": "untrusted"
        },
        "automation_info": {
          "value": "CI rerun is in progress.",
          "kind": "application"
        }
      }
    }
    ```
    
    The keys are opaque and caller-defined.
    
    ## Context Injection
    
    When provided, accepted entries are inserted into model context as
    hidden contextual message items, not as visible thread user-message
    items.
    
    `kind: "untrusted"` entries are inserted with role `user`:
    
    ```text
    <external_${key}>${value}</external_${key}>
    ```
    
    `kind: "application"` entries are inserted with role `developer`:
    
    ```text
    <${key}>${value}</${key}>
    ```
    
    Values are not escaped. Each value is truncated to 1k approximate tokens
    before wrapping.
    
    For `turn/start`, accepted additional context is inserted before normal
    user input. For `turn/steer`, additional context is merged only when the
    steer includes non-empty user input; context-only steers still reject as
    empty input.
    
    ## Dedupe Strategy
    
    `AdditionalContextStore` lives on session state and stores the latest
    complete additional-context map.
    
    Each `turn/start` or non-empty `turn/steer` treats its
    `additionalContext` as the current complete set of values. Entries are
    injected only when the key is new or the exact entry for that key
    changed, including `value` or `kind`. After merging, the store is
    replaced with the provided map, so omitted keys are removed from the
    retained set and can be injected again later if reintroduced.
    
    Omitting `additionalContext`, passing `null`, or passing an empty object
    resets the store to empty and injects nothing.
    
    ## What Changed
    
    - Threads experimental v2 `additionalContext` through app-server into
    core turn start and steer handling.
    - Adds separate contextual fragment types for untrusted user-role
    context and application developer-role context.
    - Uses pending response input items so additional context can be
    combined with normal user input without treating it as prompt text.
    - Adds integration coverage for start/steer flow, role routing,
    dedupe/reset behavior, deletion/re-add behavior, hook-blocked input
    behavior, empty context-only steer rejection, external-fragment marker
    matching, and truncation.
  • Honor client-resolved service tier defaults (#23537)
    ## Why
    
    Model catalog responses can now advertise a nullable
    `default_service_tier` for each model. Codex needs to preserve three
    distinct states all the way from config/app-server inputs to inference:
    
    - no explicit service tier, so the client may apply the current model
    catalog default when FastMode is enabled
    - explicit `default`, meaning the user intentionally wants standard
    routing
    - explicit catalog tier ids such as `priority`, `flex`, or future tiers
    
    Keeping those states distinct prevents the UI from showing one tier
    while core sends another, especially after model switches or app-server
    `thread/start` / `turn/start` updates.
    
    ## What Changed
    
    - Plumbed `default_service_tier` through model catalog protocol types,
    app-server model responses, generated schemas, model cache fixtures, and
    provider/model-manager conversions.
    - Added the request-only `default` service tier sentinel and normalized
    legacy config spelling so `fast` in `config.toml` still materializes as
    the runtime/request id `priority`.
    - Moved catalog default resolution to the TUI/client side, including
    recomputing the effective service tier when model/FastMode-dependent
    surfaces change.
    - Updated app-server thread lifecycle config construction so
    `serviceTier: null` preserves explicit standard-routing intent by
    mapping to `default` instead of internal `None`.
    - Kept core responsible for validating explicit tiers against the
    current model and stripping `default` before `/v1/responses`, without
    applying catalog defaults itself.
    
    ## Validation
    
    - `CARGO_INCREMENTAL=0 cargo build -p codex-cli`
    - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list`
    - `cargo test -p codex-tui service_tier`
    - `cargo test -p codex-protocol service_tier_for_request`
    - `cargo test -p codex-core get_service_tier`
    - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core
    service_tier`
  • Make local environment optional in EnvironmentManager (#23369)
    ## Summary
    - make `EnvironmentManager` local environment/runtime paths optional
    - simplify constructor surface around snapshot materialization
    - rename local env accessors to `require_local_environment` /
    `try_local_environment`
    
    ## Validation
    - devbox Bazel build for touched crate surfaces
    - `//codex-rs/exec-server:exec-server-unit-tests`
    - `//codex-rs/app-server-client:app-server-client-unit-tests`
    - filtered touched `//codex-rs/core:core-unit-tests` cases
  • [3 of 7] Remove UserTurn (#23075)
    **Stack position:** [3 of 7]
    
    ## Summary
    
    This PR finishes the input-op consolidation by moving the remaining
    `Op::UserTurn` callers onto `Op::UserInput` and deleting `Op::UserTurn`.
    This touches a lot of files, but it is a low-risk mechanical migration.
    
    ## Stack
    
    1. [1 of 7] [Add thread settings to
    UserInput](https://github.com/openai/codex/pull/23080)
    2. [2 of 7] [Remove
    UserInputWithTurnContext](https://github.com/openai/codex/pull/23081)
    3. [3 of 7] [Remove
    UserTurn](https://github.com/openai/codex/pull/23075) (this PR)
    4. [4 of 7] [Placeholder for OverrideTurnContext
    cleanup](https://github.com/openai/codex/pull/23087)
    5. [5 of 7] [Replace OverrideTurnContext with
    ThreadSettings](https://github.com/openai/codex/pull/22508)
    6. [6 of 7] [Add app-server thread settings
    API](https://github.com/openai/codex/pull/22509)
    7. [7 of 7] [Sync TUI thread
    settings](https://github.com/openai/codex/pull/22510)
  • [codex] Remove legacy shell output formatting paths (#22706)
    ## Why
    
    The client and tool pipeline still carried compatibility code for legacy
    structured shell output. Current shell and apply_patch responses are
    already plain text for model consumption, so keeping a
    JSON-serialization path plus shell-item rewrite logic makes the request
    formatter and tests preserve a format we do not need anymore.
    
    ## What Changed
    
    - Removed the client-side shell output rewrite from
    `core/src/client_common.rs`.
    - Removed the structured exec-output formatter and the shell `freeform`
    switch so tool emitters use one model-facing formatter.
    - Collapsed apply_patch/shell serialization tests around the remaining
    plain-text output expectations and removed duplicate one-variant
    parameterized cases.
    - Kept the `ApplyPatchModelOutput::ShellCommandViaHeredoc` compatibility
    input shape, but no longer treats it as a separate output-format mode.
    
    ## Validation
    
    - `cargo test -p codex-core client_common`
    - `cargo test -p codex-core shell_serialization`
    - `cargo test -p codex-core apply_patch_cli`
    - `just fix -p codex-core`
    
    ## Documentation
    
    No external Codex documentation update is needed.
  • chore(features) rm Feature::ApplyPatchFreeform (#22711)
    ## Summary
    Removes the feature since this is effectively on by default in all cases
    where we should use it, or can be configured via models.json.
    
    ## Testing
    - [x] unit tests pass
  • Fix remote environment test fixtures (#22572)
    ## Why
    The Docker remote-env coverage was failing before it reached the
    behavior those tests are meant to exercise. The remote-aware test
    fixture only registered the remote environment, so tests that
    intentionally select both `local` and `remote` could not start a turn.
    After that was fixed, two tests exposed stale fixtures: the approval
    test was auto-approving under workspace-write, and the remote
    `view_image` test was writing invalid PNG bytes.
    
    ## What Changed
    - Added `EnvironmentManager::create_for_tests_with_local(...)` so tests
    can keep the provider default while also selecting `local` explicitly.
    - Updated `build_remote_aware()` to use that test-only manager when a
    remote exec-server URL is present.
    - Changed the remote apply-patch approval helper to use
    `SandboxPolicy::new_read_only_policy()` so the test actually exercises
    approval caching per environment.
    - Replaced the hardcoded remote `view_image` PNG blob with the existing
    `png_bytes(...)` helper so the test uses a valid image fixture.
    
    ## Validation
    Ran these isolated Docker remote-env tests on the devbox with
    `$remote-tests` setup:
    -
    `suite::remote_env::apply_patch_freeform_routes_to_selected_remote_environment`
    -
    `suite::remote_env::apply_patch_approvals_are_remembered_per_environment`
    -
    `suite::remote_env::apply_patch_intercepted_exec_command_routes_to_selected_remote_environment`
    -
    `suite::remote_env::exec_command_routes_to_selected_remote_environment`
    - `suite::view_image::view_image_routes_to_selected_remote_environment`
    
    All five pass.
  • [codex] Remove unused legacy shell tools (#22246)
    ## Why
    
    Recent session history showed no active use of the raw `shell`,
    `local_shell`, or `container.exec` execution surfaces. Keeping those
    handlers/specs wired into core leaves duplicate shell execution paths
    alongside the supported `shell_command` and unified exec tools.
    
    ## What changed
    
    - Removed the raw `shell` handler/spec and its `ShellToolCallParams`
    protocol helper.
    - Removed the legacy `local_shell` and `container.exec` handler/spec
    plumbing while preserving persisted-history compatibility for old
    response items.
    - Normalized model/config `default` and `local` shell selections to
    `shell_command`.
    - Pruned tests that exercised removed raw-shell/local-shell/apply-patch
    variants and kept coverage on `shell_command`, unified exec, and
    freeform `apply_patch`.
    
    ## Verification
    
    - `git diff --check`
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core tools::handlers::shell`
    - `cargo test -p codex-core tools::spec`
    - `cargo test -p codex-core tools::router`
    - `cargo test -p codex-core
    active_call_preserves_triggering_command_context`
    - `cargo test -p codex-core guardian_tests`
    - `cargo test -p codex-core --test all shell_serialization`
    - `cargo test -p codex-core --test all apply_patch_cli`
    - `cargo test -p codex-core --test all shell_command_`
    - `cargo test -p codex-core --test all local_shell`
    - `cargo test -p codex-core --test all otel::`
    - `cargo test -p codex-core --test all hooks::`
    - `just fix -p codex-core`
    - `just fix -p codex-tools`
  • extension: wire extension registries into sessions (#21737)
    ## Why
    
    [#21736](https://github.com/openai/codex/pull/21736) introduces the
    typed extension API, but the runtime does not yet carry a registry
    through thread/session startup or give contributors host-owned stores to
    read from. This PR wires that host-side path so later feature migrations
    can move product-specific behavior behind typed contributions without
    adding another bespoke seam directly to `codex-core`.
    
    ## What changed
    
    - Thread `ExtensionRegistry<Config>` through `ThreadManager`,
    `CodexSpawnArgs`, `Session`, and sub-agent spawn paths.
    - Wire `ThreadStartContributor` and `ContextContributor`
    - Expose the small supporting surface needed by non-core callers that
    construct threads directly, including `empty_extension_registry()`
    through `codex-core-api`.
    
    This PR lands the host plumbing only: the app-server registry is still
    empty, and concrete feature migrations are intended to follow
    separately.
  • tests: cover sandbox link write behavior (#21819)
    ## Why
    
    [PR #1705](https://github.com/openai/codex/pull/1705) moved
    `apply_patch` execution under the configured sandbox and called out the
    need for integration coverage. We already covered textual `../` escapes,
    but did not have coverage for link aliases that live inside a writable
    workspace while pointing at, or aliasing, files visible outside it.
    
    This PR locks in the current sandbox boundary without changing
    production write semantics. Symlink escapes into a read-only outside
    root should fail and leave the outside file unchanged. Existing hard
    links are characterized separately: if a user-created hard link already
    exists inside the writable root, sandboxed writes preserve normal
    hard-link semantics rather than replacing the link and silently breaking
    that relationship.
    
    ## What Changed
    
    - Added
    `apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
    to verify `apply_patch` cannot update a symlink that targets a file
    outside the writable workspace.
    - Added `apply_patch_cli_preserves_existing_hard_link_outside_workspace`
    to verify `apply_patch` intentionally writes through an existing hard
    link and does not unlink or replace it.
    - Added `file_system_sandboxed_write_preserves_existing_hard_link` to
    verify sandboxed `fs/writeFile` preserves an existing hard link and
    writes the shared inode.
    
    ## Testing
    
    - `cargo test -p codex-exec-server file_system_sandboxed_write`
    - `cargo test -p codex-core
    apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
    - `cargo test -p codex-core
    apply_patch_cli_preserves_existing_hard_link_outside_workspace`
    - `just fix -p codex-exec-server -p codex-core`
    - `just fix -p codex-core`
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21819).
    * #21845
    * __->__ #21819
  • [codex] Delete function-style apply_patch (#21651)
    ## Why
    
    `apply_patch` is now a freeform/custom tool. Keeping the old
    JSON/function-style registration and parsing path left another way for
    models and tests to invoke `apply_patch`, which made the tool surface
    harder to reason about.
    
    ## What changed
    
    - Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
    spec, and handler support for function payloads.
    - Kept `apply_patch_tool_type = freeform` as the supported model
    metadata path, including Bedrock catalog metadata.
    - Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
    calls.
    
    ## Verification
    
    - `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
    - `cargo test -p codex-core tools::handlers::apply_patch --lib`
    - `cargo test -p codex-core --test all
    apply_patch_tool_executes_and_emits_patch_events`
    - `cargo test -p codex-core --test all
    apply_patch_reports_parse_diagnostics`
    - `cargo test -p codex-exec test_apply_patch_tool`
    - `just fix -p codex-core`
    - `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
    codex-exec`
  • [codex] request desktop attestation from app (#20619)
    ## Summary
    
    TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
    attestation token and attach it as `x-oai-attestation` on the scoped
    ChatGPT Codex request paths.
    
    ![DeviceCheck attestation
    interface](https://raw.githubusercontent.com/openai/codex/dev/jm/devicecheck-diagram-assets/pr-assets/devicecheck-attestation-interface.png)
    
    ## Details
    
    This PR teaches the Codex app-server runtime how to request and attach
    an attestation token. It does not generate DeviceCheck tokens directly;
    instead, it relies on the connected desktop app to advertise that it can
    generate attestation and then asks that app for a fresh header value
    when needed.
    
    The flow is:
    
    1. The Codex desktop app connects to app-server.
    2. During `initialize`, the app can advertise that it supports
    `requestAttestation`.
    3. Before app-server calls selected ChatGPT Codex endpoints, it sends
    the internal server request `attestation/generate` to the app.
    4. app-server receives a pre-encoded header value back.
    5. app-server forwards that value as `x-oai-attestation` on the scoped
    outbound requests.
    
    The code in this repo is mostly protocol and runtime plumbing: it adds
    the app-server request/response shape, introduces an attestation
    provider in core, wires that provider into Responses / compaction /
    realtime setup paths, and covers the intended scoping with tests. The
    signed macOS DeviceCheck generation remains owned by the desktop app PR.
    
    ## Related PR
    
    - Codex desktop app implementation:
    https://github.com/openai/openai/pull/878649
    
    ## Validation
    
    <details>
    <summary>Tests run</summary>
    
    ```sh
    cargo test -p codex-app-server-protocol
    cargo test -p codex-core attestation --lib
    cargo test -p codex-app-server --lib attestation
    ```
    
    Also ran:
    
    ```sh
    just fix -p codex-core
    just fix -p codex-app-server
    just fix -p codex-app-server-protocol
    just fmt
    just write-app-server-schema
    ```
    
    </details>
    
    <details>
    <summary>E2E DeviceCheck validation</summary>
    
    First validated the signed desktop app boundary directly: launched a
    packaged signed `Codex.app`, sent `attestation/generate`, decoded the
    returned `v1.` attestation header, and validated the extracted
    DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
    bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
    `is_ok: true`.
    
    Then ran the fuller app + app-server flow. The packaged `Codex.app`
    launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
    MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
    requested `attestation/generate` from the real Electron app process, and
    the intercepted `/backend-api/codex/responses` traffic included
    `x-oai-attestation` on both routes:
    
    ```text
    GET  /backend-api/codex/responses  Upgrade: websocket  x-oai-attestation: present
    POST /backend-api/codex/responses  Upgrade: none       x-oai-attestation: present
    ```
    
    The captured header decoded to a DeviceCheck token that also validated
    with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
    team `2DC432GLL2`).
    
    </details>
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add CODEX_HOME environments TOML provider (#20666)
    ## Why
    
    After stdio transports and provider-owned defaults exist, Codex needs a
    config-backed provider that can describe more than the single legacy
    `CODEX_EXEC_SERVER_URL` remote. This PR adds that provider without
    activating it in product entrypoints yet, keeping parser/validation
    review separate from runtime wiring.
    
    **Stack position:** this is PR 4 of 5. It builds on PR 3's
    provider/default model and adds the `environments.toml` provider used by
    PR 5.
    
    ## What Changed
    
    - Add `environment_toml.rs` as the TOML-specific home for parsing,
    validation, and provider construction.
    - Keep the TOML schema/provider structs private; the public constructor
    added here is `EnvironmentManager::from_codex_home(...)`.
    - Add `TomlEnvironmentProvider`, including validation for:
      - reserved ids such as `local` and `none`
      - duplicate ids
      - unknown explicit defaults
      - empty programs or URLs
      - exactly one of `url` or `program` per configured environment
    - Support websocket environments with `url = "ws://..."` / `wss://...`.
    - Support stdio-command environments with `program = "..."`.
    - Add helpers to load `environments.toml` from `CODEX_HOME`, but do not
    wire entrypoints to call them yet.
    - Add the `toml` dependency for parsing.
    
    ## Stack
    
    - 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
    listener
    - 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
    client transport
    - 3. https://github.com/openai/codex/pull/20665 - Make environment
    providers own default selection
    - **4. This PR:** https://github.com/openai/codex/pull/20666 - Add
    CODEX_HOME environments TOML provider
    - 5. https://github.com/openai/codex/pull/20667 - Load configured
    environments from CODEX_HOME
    
    Split from original draft: https://github.com/openai/codex/pull/20508
    
    ## Validation
    
    Not run locally; this was split out of the original draft stack.
    
    ## Documentation
    
    This introduces the config shape for `environments.toml`; user-facing
    documentation should be added before this stack is treated as a
    documented public workflow.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Revert state DB injection and agent graph store (#21481)
    ## Why
    
    Reverts #20689 to restore the previous optional state DB plumbing. The
    conflict resolution keeps the newer installation ID and session/thread
    identity changes that landed after #20689, while removing the mandatory
    state DB and agent graph store dependency from ThreadManager
    construction.
    
    ## What changed
    
    - Restored `Option<StateDbHandle>` through app-server, MCP server,
    prompt debug, and test entry points.
    - Removed the `codex-core` dependency on `codex-agent-graph-store` and
    reverted descendant lookup back to the existing state DB path when
    available.
    - Kept newer `installation_id` forwarding by passing it beside the
    optional DB handle.
    - Kept local thread-name updates working when the optional state DB
    handle is absent.
    
    ## Validation
    
    - `git diff --check`
    - `cargo test -p codex-thread-store`
    - `cargo test -p codex-state -p codex-rollout -p
    codex-app-server-protocol`
    - Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
    codex-app-server -p codex-app-server-client -p codex-mcp-server -p
    codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
    ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
    2026-01-19)` on `aarch64-apple-darwin`.
  • 2- Use string service tiers in session protocol (#20971)
    ## Summary
    - break service tier session/op/app-server protocol fields from the
    closed enum to string tier ids
    - send the service tier string directly through model requests, prewarm,
    compaction, memories, and TUI/app-server turn starts
    - regenerate app-server protocol JSON/TypeScript schemas, removing the
    standalone ServiceTier TS enum
    
    ## Verification
    - just fmt
    - cargo check -p codex-core -p codex-app-server -p codex-tui
    - just write-app-server-schema
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Move installation ID resolution out of core startup (#21182)
    ## Summary
    
    - resolve or inject the installation ID before core startup and pass it
    through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain
    `String`
    - keep child sessions on the parent installation ID instead of
    rediscovering it inside core
    - propagate installation ID startup failures in `mcp-server` instead of
    panicking
    
    ## Why
    
    Core was still touching the filesystem on the session startup path to
    discover `installation_id`. This moves that work to the outer host
    boundary so core no longer depends on `codex_home` reads during session
    construction.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Inject state DB, agent graph store (#20689)
    ## Why
    
    We want the agent graph store to be passed down the stack as a real
    dependency, the same way we already treat the thread store.
    
    This will let us inject the agent graph store as a real dependency and
    support implementations other than the local SQLite-backed one. Right
    now most code instantiates a state DB and an agent graph store
    just-in-time. Ideally, we would not depend on the state DB directly but
    only read through the higher-level interfaces.
    
    This change makes the dependency boundaries explicit and moves state DB
    initialization to process bootstrap instead of hiding it inside local
    store implementations.
    
    ## What changed
    
    - `ThreadManager` now requires a `StateDbHandle` and an
    `AgentGraphStore` at construction time instead of treating them as
    optional internals.
    - The local store constructors no longer lazily initialize SQLite.
    Callers now initialize the state DB once per process and use that shared
    handle to build:
      - `LocalThreadStore`
      - `LocalAgentGraphStore`
    - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
    thread-manager sample) now initialize the state DB up front and inject
    the resulting handle down the stack.
    - `app-server` now consistently uses its process-scoped state DB handle
    instead of reopening SQLite or trying to recover it from loaded threads.
    - Device-key storage now reuses the shared state DB handle instead of
    maintaining its own lazy opener.
    - The thread archive / descendant traversal paths now use the injected
    `AgentGraphStore` instead of reaching through local
    thread-store-specific state.
    
    ## Verification
    
    - `cargo check -p codex-core -p codex-thread-store -p codex-app-server
    -p codex-mcp-server -p codex-thread-manager-sample --tests`
    - `cargo test -p codex-thread-store`
    - `cargo test -p codex-core
    thread_manager_accepts_separate_agent_graph_store_and_thread_store --
    --nocapture`
    - `cargo test -p codex-app-server
    thread_archive_archives_spawned_descendants -- --nocapture`
  • state: pass state db handles through consumers (#20561)
    ## Why
    
    SQLite state was still being opened from consumer paths, including lazy
    `OnceCell`-backed thread-store call sites. That let one process
    construct multiple state DB connections for the same Codex home, which
    makes SQLite lock contention and `database is locked` failures much
    easier to hit.
    
    State DB lifetime should be chosen by main-like entrypoints and tests,
    then passed through explicitly. Consumers should use the supplied
    `Option<StateDbHandle>` or `StateDbHandle` and keep their existing
    filesystem fallback or error behavior when no handle is available.
    
    The startup path also needs to keep the rollout crate in charge of
    SQLite state initialization. Opening `codex_state::StateRuntime`
    directly bypasses rollout metadata backfill, so entrypoints should
    initialize through `codex_rollout::state_db` and receive a handle only
    after required rollout backfills have completed.
    
    ## What Changed
    
    - Initialize the state DB in main-like entrypoints for CLI, TUI,
    app-server, exec, MCP server, and the thread-manager sample.
    - Pass `Option<StateDbHandle>` through `ThreadManager`,
    `LocalThreadStore`, app-server processors, TUI app wiring, rollout
    listing/recording, personality migration, shell snapshot cleanup,
    session-name lookup, and memory/device-key consumers.
    - Remove the lazy local state DB wrapper from the thread store so
    non-test consumers use only the supplied handle or their existing
    fallback path.
    - Make `codex_rollout::state_db::init` the local state startup path: it
    opens/migrates SQLite, runs rollout metadata backfill when needed, waits
    for concurrent backfill workers up to a bounded timeout, verifies
    completion, and then returns the initialized handle.
    - Keep optional/non-owning SQLite helpers, such as remote TUI local
    reads, as open-only paths that do not run startup backfill.
    - Switch app-server startup from direct
    `codex_state::StateRuntime::init` to the rollout state initializer so
    app-server cannot skip rollout backfill.
    - Collapse split rollout lookup/list APIs so callers use the normal
    methods with an optional state handle instead of `_with_state_db`
    variants.
    - Restore `getConversationSummary(ThreadId)` to delegate through
    `ThreadStore::read_thread` instead of a LocalThreadStore-specific
    rollout path special case.
    - Keep DB-backed rollout path lookup keyed on the DB row and file
    existence, without imposing the filesystem filename convention on
    existing DB rows.
    - Verify readable DB-backed rollout paths against `session_meta.id`
    before returning them, so a stale SQLite row that points at another
    thread's JSONL falls back to filesystem search and read-repairs the DB
    row.
    - Keep `debug prompt-input` filesystem-only so a one-off debug command
    does not initialize or backfill SQLite state just to print prompt input.
    - Keep goal-session test Codex homes alive only in the goal-specific
    helper, rather than leaking tempdirs from the shared session test
    helper.
    - Update tests and call sites to pass explicit state handles where DB
    behavior is expected and explicit `None` where filesystem-only behavior
    is intended.
    
    ## Validation
    
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
    codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
    codex-tui -p codex-exec -p codex-cli --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout state_db_`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout try_init_ -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
    codex-rollout --lib -- -D warnings`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store
    read_thread_falls_back_when_sqlite_path_points_to_another_thread --
    --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    shell_snapshot`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all personality_migration`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find::find_prefers_sqlite_path_by_id --
    --nocapture`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    interrupt_accounts_active_goal_before_pausing`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server get_auth_status -- --test-threads=1`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server --lib`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
    -p codex-app-server --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
    -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
    codex-exec -p codex-cli`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
    codex-app-server`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
    codex-rollout`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
    - `just argument-comment-lint -p codex-core`
    - `just argument-comment-lint -p codex-rollout`
    
    Focused coverage added in `codex-rollout`:
    
    - `recorder::tests::state_db_init_backfills_before_returning` verifies
    the rollout metadata row exists before startup init returns.
    - `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
    verifies startup waits for another worker to finish backfill instead of
    disabling the handle for the process.
    -
    `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
    verifies startup does not hang indefinitely on a stuck backfill lease.
    -
    `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
    verifies DB-backed lookup accepts valid existing rollout paths even when
    the filename does not include the thread UUID.
    -
    `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
    verifies DB-backed lookup ignores a stale row whose existing path
    belongs to another thread and read-repairs the row after filesystem
    fallback.
    
    Focused coverage updated in `codex-core`:
    
    - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
    DB-preferred rollout file with matching `session_meta.id`, so it still
    verifies that valid SQLite paths win without depending on stale/empty
    rollout contents.
    
    `cargo test -p codex-app-server thread_list_respects_search_term_filter
    -- --test-threads=1 --nocapture` was attempted locally but timed out
    waiting for the app-server test harness `initialize` response before
    reaching the changed thread-list code path.
    
    `bazel test //codex-rs/thread-store:thread-store-unit-tests
    --test_output=errors` was attempted locally after the thread-store fix,
    but this container failed before target analysis while fetching `v8+`
    through BuildBuddy/direct GitHub. The equivalent local crate coverage,
    including `cargo test -p codex-thread-store`, passes.
    
    A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
    also requires system `libcap.pc` for `codex-linux-sandbox`; the
    follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
    this container.
  • Make thread store process-scoped (#19474)
    - Build one app-server process ThreadStore from startup config and share
    it with ThreadManager and CodexMessageProcessor.
    - Remove per-thread/fork store reconstruction so effective thread config
    cannot switch the persistence backend.
    - Add params to ThreadStore create/resume for specifying thread
    metadata, since otherwise the metadata from store creation would be used
    (incorrectly).
  • Reduce the surface of collaboration modes (#20149)
    Collaboration modes were slightly invasive both into ThreadManager
    construction and ModelProvider
  • Add ThreadManager sample crate (#20141)
    Summary:
    - Add codex-thread-manager-sample, a one-shot binary that starts a
    ThreadManager thread, submits a prompt, and prints the final assistant
    output.
    - Pass ThreadStore into ThreadManager::new and expose
    thread_store_from_config for existing callsites.
    - Build the sample Config directly with only --model and prompt inputs.
    
    Verification:
    - just fmt
    - cargo check -p codex-thread-manager-sample -p codex-app-server -p
    codex-mcp-server
    - git diff --check
    
    Tests: Not run per request.
  • Add environment provider snapshot (#20058)
    ## Summary
    - Change `EnvironmentProvider` to return concrete `Environment`
    instances instead of `EnvironmentConfigurations`.
    - Make `DefaultEnvironmentProvider` provide the provider-visible `local`
    environment plus optional `remote` environment from
    `CODEX_EXEC_SERVER_URL`.
    - Keep `EnvironmentManager` as the concrete cache while exposing its own
    explicit local environment for `local_environment()` fallback paths.
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • core tests: build user turns from permission profiles (#20011)
    ## Summary
    - Add `turn_permission_fields()` so tests that construct `Op::UserTurn`
    directly can provide a canonical `PermissionProfile` while still filling
    the required legacy `sandbox_policy` compatibility field.
    - Migrate direct user-turn construction in core integration tests from
    `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled`.
    - Continue reducing direct `SandboxPolicy` usage in
    `codex-rs/core/tests`, from 41 files after #20010 to 32 files in this
    PR.
    
    ## Testing
    - `cargo check -p codex-core --tests`
    - `just fmt`
    - `just fix -p core_test_support`
    - `just fix -p codex-core`
  • core tests: submit turns with permission profiles (#20010)
    ## Summary
    
    - Add `PermissionProfile`-based turn submission helpers to
    `core_test_support`, while keeping the legacy `SandboxPolicy` helper for
    tests that intentionally exercise legacy fallback behavior.
    - Switch the default `TestCodex::submit_turn()` path to send a real
    `PermissionProfile` plus the required legacy compatibility projection in
    `Op::UserTurn`.
    - Migrate straightforward app/search/shell/truncation tests from
    `SandboxPolicy::{DangerFullAccess, ReadOnly}` to
    `PermissionProfile::{Disabled, read_only}`.
    - Add a TUI compatibility projection helper for legacy app-server fields
    so non-legacy writable roots are preserved instead of being downgraded
    to read-only.
    - Fix remote start/resume/fork sandbox-mode projection to classify any
    managed profile with writable roots as workspace-write, not only
    profiles that can write `cwd`.
    - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 47
    files to 41 files without changing production behavior.
    
    ## Testing
    
    - `cargo check -p codex-core --tests`
    - `cargo test -p codex-tui
    compatibility_profile_preserves_unbridgeable_write_roots`
    - `cargo test -p codex-tui
    sandbox_mode_preserves_non_cwd_write_roots_for_remote_sessions`
    - `just fmt`
    - `just fix -p core_test_support`
    - `just fix -p codex-core`
  • permissions: add built-in default profiles (#19900)
    ## Why
    
    The migration away from `SandboxPolicy` needs new configs to start from
    permissions profiles instead of deriving profiles from legacy sandbox
    modes. Existing users can have empty `config.toml` files, and we should
    not rewrite user-owned config files that may live in shared
    repositories.
    
    This PR introduces built-in profile names so an empty config can resolve
    to a canonical `PermissionProfile`, while explicit named `[permissions]`
    profiles still behave predictably.
    
    ## What changed
    
    - Adds built-in `default_permissions` profile names:
      - `:read-only` maps to `PermissionProfile::read_only()`.
    - `:workspace` maps to the workspace-write profile, including
    project-root metadata carveouts.
    - `:danger-no-sandbox` maps to `PermissionProfile::Disabled`, preserving
    the distinction between no sandbox and a broad managed sandbox.
    - Reserves the `:` prefix for built-in profiles so user-defined
    `[permissions]` profiles cannot collide with future built-ins.
    - Allows `default_permissions` to reference a built-in profile without
    requiring a `[permissions]` table.
    - Makes an otherwise empty config choose a built-in profile by
    trust/platform context: trusted or untrusted project roots use
    `:workspace` when the platform supports that sandbox, while roots
    without a trust decision use `:read-only`.
    - Keeps legacy `sandbox_mode` configs on the legacy path, and still
    rejects user-defined `[permissions]` profiles that omit
    `default_permissions` so we do not silently guess among custom profiles.
    - Preserves compatibility behavior for implicit defaults: bare
    `network.enabled = true` allows runtime network without starting the
    managed proxy, explicit profile proxy policy still starts the proxy, and
    implicit workspace/add-dir roots keep legacy metadata carveouts.
    
    ## Verification
    
    - `cargo test -p codex-core builtin --lib`
    - `cargo test -p codex-core profile_network_proxy_config`
    - `cargo test -p codex-core
    implicit_builtin_workspace_profile_preserves_add_dir_metadata_carveouts`
    - `cargo test -p codex-core
    permissions_profiles_network_enabled_allows_runtime_network_without_proxy`
    - `cargo test -p codex-core
    permissions_profiles_proxy_policy_starts_managed_network_proxy`
    
    ## Documentation
    
    Public Codex config docs should mention these built-in names when the
    `[permissions]` config format is ready to document as stable.
    
    
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19900).
    * #20041
    * #20040
    * #20037
    * #20035
    * #20034
    * #20033
    * #20032
    * #20030
    * #20028
    * #20027
    * #20026
    * #20024
    * #20021
    * #20018
    * #20016
    * #20015
    * #20013
    * #20011
    * #20010
    * #20008
    * __->__ #19900
  • tui: carry permission profiles on user turns (#18285)
    ## Why
    
    Per-turn permission overrides should use the same canonical profile
    abstraction as session configuration. That lets TUI submissions preserve
    exact configured permissions without round-tripping through legacy
    sandbox fields.
    
    ## What changed
    
    This adds `permission_profile` to user-turn operations, threads it
    through TUI/app-server submission paths, fills the new field in existing
    test fixtures, and adds coverage that composer submission includes the
    configured profile.
    
    ## Verification
    
    - `cargo test -p codex-tui permissions -- --nocapture`
    - `cargo test -p codex-core --test all permissions_messages --
    --nocapture`
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18285).
    * #18288
    * #18287
    * #18286
    * __->__ #18285
  • codex: support hooks in config.toml and requirements.toml (#18893)
    ## Summary
    
    Support the existing hooks schema in inline TOML so hooks can be
    configured from both `config.toml` and enterprise-managed
    `requirements.toml` without requiring a separate `hooks.json` payload.
    
    This gives enterprise admins a way to ship managed hook policy through
    the existing requirements channel while still leaving script delivery to
    MDM or other device-management tooling, and it keeps `hooks.json`
    working unchanged for existing users.
    
    This also lays the groundwork for follow-on managed filtering work such
    as #15937, while continuing to respect project trust gating from #14718.
    It does **not** implement `allow_managed_hooks_only` itself.
    
    NOTE: yes, it's a bit unfortunate that the toml isn't formatted as
    closely as normal to our default styling. This is because we're trying
    to stay compatible with the spec for plugins/hooks that we'll need to
    support & the main usecase here is embedding into requirements.toml
    
    ## What changed
    
    - moved the shared hook serde model out of `codex-rs/hooks` into
    `codex-rs/config` so the same schema can power `hooks.json`, inline
    `config.toml` hooks, and managed `requirements.toml` hooks
    - added `hooks` support to both `ConfigToml` and
    `ConfigRequirementsToml`, including requirements-side `managed_dir` /
    `windows_managed_dir`
    - treated requirements-managed hooks as one constrained value via
    `Constrained`, so managed hook policy is merged atomically and cannot
    drift across requirement sources
    - updated hook discovery to load requirements-managed hooks first, then
    per-layer `hooks.json`, then per-layer inline TOML hooks, with a warning
    when a single layer defines both representations
    - threaded managed hook metadata through discovered handlers and exposed
    requirements hooks in app-server responses, generated schemas, and
    `/debug-config`
    - added hook/config coverage in `codex-rs/config`, `codex-rs/hooks`,
    `codex-rs/core/src/config_loader/tests.rs`, and
    `codex-rs/core/tests/suite/hooks.rs`
    
    ## Testing
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-app-server config_api`
    
    ## Documentation
    
    Companion updates are needed in the developers website repo for:
    
    - the hooks guide
    - the config reference, sample, basic, and advanced pages
    - the enterprise managed configuration guide
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>