60 Commits

  • Support openai/form extended form elicitations (#27500)
    # Summary
    Allow App Server clients to opt into `openai/form` MCP elicitations.
  • [codex] Add external agent import result accounting (#28008)
    ## Why
    
    External-agent imports can complete synchronously or continue in the
    background for plugins/sessions. Clients need a stable import id to
    correlate the immediate response with the eventual completion
    notification, and the completion payload needs enough accounting to show
    which artifact types succeeded or failed without hiding partial
    failures.
    
    ## What Changed
    
    - `externalAgentConfig/import` now returns an `importId`;
    `externalAgentConfig/import/completed` includes the same `importId` plus
    type-level `itemResults`.
    - Completed `itemResults` report `successCount`, `errorCount`,
    `successes`, and `rawErrors` for each migrated item type.
    - Added protocol/schema/TypeScript types for import successes, raw
    errors, and type-level results. No progress notification is included in
    the final PR.
    - `ExternalAgentConfigService::import` now returns an outcome object
    with synchronous item results and pending plugin imports.
    - Plugin import outcomes track succeeded/failed marketplaces, plugin
    ids, and raw errors. Plugin failures can be reported in completed
    accounting while later migration items continue.
    - Non-plugin synchronous import failures still fail the request, so
    invalid config/skills-style failures are not reported as a successful
    import response.
    - Session imports now return item results. Successful imports include
    the source session path and imported thread id; prepare, persist,
    ledger, and source-validation failures become raw errors in completion
    accounting where the import can continue.
    - The request processor generates the `importId`, aggregates synchronous
    results with background plugin/session results, and sends a single
    completed notification when all selected work is done.
    - App-server docs and generated schema fixtures were updated for the new
    response/completed payload shapes.
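    
    As a rough illustration of the accounting described above, the sketch
    below groups per-item outcomes by item type under one import id. The
    struct and function names (`ItemResult`, `ImportCompleted`, `record`,
    `sample_completed`) are hypothetical stand-ins, not the actual protocol
    types; only the field meanings (`importId`, `successCount`,
    `errorCount`, `successes`, `rawErrors`) come from the description.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Hypothetical per-type accounting mirroring the `itemResults` fields.
    #[derive(Debug, Default)]
    struct ItemResult {
        successes: Vec<String>,
        raw_errors: Vec<String>,
    }
    
    impl ItemResult {
        fn success_count(&self) -> usize {
            self.successes.len()
        }
        fn error_count(&self) -> usize {
            self.raw_errors.len()
        }
    }
    
    /// Hypothetical completed payload: one per `import_id`.
    #[derive(Debug)]
    struct ImportCompleted {
        import_id: String,
        item_results: BTreeMap<String, ItemResult>,
    }
    
    impl ImportCompleted {
        fn new(import_id: &str) -> Self {
            Self { import_id: import_id.to_string(), item_results: BTreeMap::new() }
        }
    
        /// Records one outcome without hiding partial failures.
        fn record(&mut self, item_type: &str, outcome: Result<&str, &str>) {
            let entry = self.item_results.entry(item_type.to_string()).or_default();
            match outcome {
                Ok(item) => entry.successes.push(item.to_string()),
                Err(raw) => entry.raw_errors.push(raw.to_string()),
            }
        }
    }
    
    /// Made-up sample data: one plugin succeeds, one fails, one session succeeds.
    fn sample_completed() -> ImportCompleted {
        let mut completed = ImportCompleted::new("import-1");
        completed.record("plugins", Ok("plugin-a"));
        completed.record("plugins", Err("marketplace unreachable"));
        completed.record("sessions", Ok("thread-123"));
        completed
    }
    
    fn main() {
        let completed = sample_completed();
        println!("importId = {}", completed.import_id);
        for (item_type, result) in &completed.item_results {
            println!("{item_type}: {} ok, {} failed", result.success_count(), result.error_count());
        }
    }
    ```
    
    A client could use the shared id to match the immediate response with
    the later completed notification, then render the per-type counts.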
    
    ## Validation
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server-client event_requires_delivery`
    - `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-review-sync-error
    just test -p codex-app-server
    external_agent_config_import_returns_error_for_failed_sync_import`
    - `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-review-external-agent
    just test -p codex-app-server external_agent_config`
    
    Note: local sandbox validation used `CODEX_SQLITE_HOME` because the
    default sqlite state path is read-only in this environment.
  • Add request_user_input auto-resolution window contract (#27256)
    ## Why
    
    `request_user_input` is moving beyond its original plan-mode-only
    workflow, and future default/goal-mode usage needs a way for the model
    to ask helpful but non-blocking questions without forcing the turn to
    wait forever. This PR adds an explicit `autoResolutionMs` contract so a
    later client/runtime change can auto-resolve unanswered prompts after a
    bounded window while leaving truly blocking questions unchanged.
    
    This is contract plumbing only; it does not implement the client-side
    timer or auto-selection behavior, and the model-facing description
    treats the field as reserved unless the current runtime explicitly
    supports auto-resolution.
    
    ## What Changed
    
    - Added optional `autoResolutionMs` to the model-facing
    `request_user_input` args and core `RequestUserInputEvent`.
    - Added model-facing schema text for `autoResolutionMs` while marking it
    reserved for runtimes that explicitly support auto-resolution.
    - Bounded `autoResolutionMs` to `60_000..=240_000` ms during argument
    normalization by clamping out-of-range model-provided values.
    - Propagated the field through app-server v2
    `ToolRequestUserInputParams`, app-server request forwarding, generated
    TypeScript, and JSON schema fixtures.
    - Updated app-server, core, protocol, and TUI call sites/tests so
    omitted values preserve existing `None`/`null` behavior and coverage
    verifies a `Some(60_000)` round trip.
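    
    The clamping rule can be sketched in a few lines. This is a minimal
    illustration, assuming a hypothetical helper named
    `normalize_auto_resolution_ms` and a `u64` millisecond type; only the
    bounds and the "omitted stays `None`" behavior come from the
    description above.
    
    ```rust
    /// Bounds taken from the description: `60_000..=240_000` ms.
    const MIN_AUTO_RESOLUTION_MS: u64 = 60_000;
    const MAX_AUTO_RESOLUTION_MS: u64 = 240_000;
    
    /// Hypothetical helper: omitted values stay `None`; provided values
    /// are clamped into the allowed window rather than rejected.
    fn normalize_auto_resolution_ms(value: Option<u64>) -> Option<u64> {
        value.map(|ms| ms.clamp(MIN_AUTO_RESOLUTION_MS, MAX_AUTO_RESOLUTION_MS))
    }
    
    fn main() {
        for input in [None, Some(5_000), Some(60_000), Some(900_000)] {
            println!("{:?} -> {:?}", input, normalize_auto_resolution_ms(input));
        }
    }
    ```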
    
    ## Verification
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-core request_user_input`
    - `just test -p codex-app-server request_user_input_round_trip`
    - `just test -p codex-tui request_user_input`
    - `just test -p codex-protocol`
  • [1 of 3] Support long raw TUI goal objectives (#27508)
    ## Stack
    
    1. **[1 of 3] Support long raw TUI goal objectives** - this PR
    2. [2 of 3] Support long pasted text in TUI goals - #27509
    3. [3 of 3] Support images in TUI goals - #27510
    
    ## Why
    
    `thread/goal/set` limits persisted objective text to 4000 characters.
    The TUI used to reject raw `/goal` objectives above that limit, even
    though the client can make them usable by writing the long text to a
    file and storing a short objective that points at that file.
    
    This also needs to work for remote app-server sessions: filesystem API
    calls must create files on the app-server host, and the stored path must
    be meaningful to the agent on that host.
    
    ## What Changed
    
    - Adds an app-server-host path helper so TUI code can build paths that
    are resolved on the app-server host rather than the TUI host.
    - Adds TUI app-server session helpers for `fs/createDirectory`,
    `fs/writeFile`, `fs/readFile`, and `fs/remove` that work for embedded
    and remote app-server sessions without changing the app-server protocol.
    - Materializes oversized raw `/goal` objectives into
    `$CODEX_HOME/attachments/<uuid>/goal-objective.md` through the
    app-server filesystem APIs, then stores a short, readable objective that
    directs the agent to that file.
    - Reads managed objective files back for `/goal edit`. Other goal UI
    renders the readable stored objective normally, without
    managed-file-specific presentation logic.
    - Recognizes managed references only when they name the expected
    generated file under the app server's reported `$CODEX_HOME`, and cleans
    up newly materialized files when goal replacement or setting does not
    complete.
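    
    The inline-versus-managed decision can be sketched as below. The names
    `ObjectivePlan` and `plan_objective` are hypothetical, the `id`
    argument stands in for the generated `<uuid>`, and the sketch assumes
    the limit counts Unicode scalar values and that the app-server host
    uses `/` as its path separator. It only plans the write; the actual
    change goes through the app-server filesystem APIs.
    
    ```rust
    /// Limit quoted above for `thread/goal/set` objective text.
    const MAX_OBJECTIVE_CHARS: usize = 4000;
    
    #[derive(Debug, PartialEq)]
    enum ObjectivePlan {
        /// Short enough to store as-is.
        Inline(String),
        /// Too long: write `contents` to `path` and store a short pointer.
        Managed { path: String, contents: String },
    }
    
    /// Hypothetical planner. `codex_home` is the app server's reported home.
    fn plan_objective(text: &str, codex_home: &str, id: &str) -> ObjectivePlan {
        // Assumption: the limit counts Unicode scalar values, not bytes.
        if text.chars().count() <= MAX_OBJECTIVE_CHARS {
            ObjectivePlan::Inline(text.to_string())
        } else {
            ObjectivePlan::Managed {
                path: format!("{codex_home}/attachments/{id}/goal-objective.md"),
                contents: text.to_string(),
            }
        }
    }
    
    fn main() {
        for len in [3_999, 4_000, 4_001] {
            let plan = plan_objective(&"a".repeat(len), "/srv/codex", "example-id");
            let kind = match plan {
                ObjectivePlan::Inline(_) => "inline",
                ObjectivePlan::Managed { .. } => "managed file",
            };
            println!("{len} chars -> {kind}");
        }
    }
    ```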
    
    ## Verification
    
    - Added/updated TUI tests for raw oversized `/goal` submission, large
    inline-paste expansion, queued oversized goals, app-facing
    materialization before `thread/goal/set`, managed-path validation,
    editing, and cleanup.
    - Added/updated app-server-client remote coverage for initialized remote
    Codex home handling.
    
    ## Manual Testing
    
    - Ran the real TUI against a Unix-socket app server with different local
    and server `$CODEX_HOME` directories. Oversized goals wrote only under
    the server home, and persisted references used the server-canonical path
    rather than the TUI path.
    - Exercised 3,999-, 4,000-, and 4,001-character raw objectives. The
    first two stayed inline without new files; the 4,001-character objective
    became a managed objective file.
    - Submitted a larger 8,275-character objective, verified its full
    contents on the app-server host, and observed the goal continuation open
    the referenced server-side file.
    - Opened `/goal edit` for a managed objective and verified the full text
    was restored through remote `fs/readFile`.
    - Submitted an oversized replacement while a goal was active, verified
    no file was written before confirmation, then canceled and confirmed
    that the existing goal and attachment count were unchanged.
  • Remove TUI legacy Windows sandbox dependency (#27490)
    ## Why
    
    This is part of an ongoing attempt to eliminate the TUI's direct
    dependency on core features. When we moved the TUI to the app server, we
    left a `legacy_core` shim that re-exported some remaining core symbols
    for the TUI. The intent was to eventually remove all of these.
    
    In this PR, we remove the symbols related to the Windows sandbox.
    
    The change should be behavior-neutral and low risk because it's just
    refactoring and removal of code that is now effectively dead.
    
    When working on this PR, I noticed a big existing problem that affects
    mixed-platform remoting. For example, if you run the TUI on a Linux box
    and remote into a Windows box, the TUI logic doesn't handle Windows
    sandbox setup properly. Fixing this is beyond the scope of this
    PR, but I've left a TODO comment in place so we don't forget.
    
    ## What changed
    
    - Move the remaining TUI-specific sandbox level, setup, telemetry, and
    read-root helpers into `codex-tui`, calling `codex-windows-sandbox`
    directly.
    - Remove the Windows sandbox namespace and read-root grant re-exports
    from the client-side `legacy_core` facade.
    - Remove the dormant pre-elevation prompt fallback guarded by the
    permanently enabled `ELEVATED_SANDBOX_NUX_ENABLED` switch. The reachable
    elevated and non-elevated setup flows remain unchanged.
  • Trim TUI legacy telemetry and migration dependencies (#27487)
    ## Why
    
    The TUI still reached through `codex-app-server-client::legacy_core` for
    process telemetry setup and personality migration, exposing core-only
    details after the TUI moved onto the app-server layer.
    
    This is part of our ongoing efforts to whittle away at the `legacy_core`
    shim that was left over after migrating the TUI to the app server.
    
    This change is just a refactor/rename and should be behavior-neutral and
    low risk.
    
    ## What changed
    
    - expose OTEL provider construction through the app-server client and
    keep the small process/SQLite telemetry adapters local to the TUI
    - collapse personality migration results to the config-reload decision
    the TUI needs
    - remove the `legacy_core::otel_init` and
    `legacy_core::personality_migration` subnamespaces
  • Remove TUI legacy core test_support dependencies (#27484)
    ## Why
    
    The TUI now sits on the app-server layer, but
    `app-server-client::legacy_core` still exposed core test helpers solely
    for TUI tests. We've been whittling away the remaining dependencies.
    This is the next step on that journey.
    
    There is no functional change, just a refactor that affects only test
    code, so it should be low risk.
    
    ## What changed
    
    - remove the `legacy_core::test_support` re-export and call
    model-manager test helpers directly
    - keep the bundled model-preset cache local to TUI test support
    - import constraint types directly from `codex-config`
  • [codex] add /import for external agents (#27071)
    ## Why
    
    External-agent import should be discoverable and deliberate without
    blocking startup or claiming the public `codex [PROMPT]` CLI namespace.
    The slash command keeps the flow local to the interactive TUI and reuses
    the existing app-server import API.
    
    ## What changed
    
    - add the user-facing `/import` slash command
    - detect external-agent importable items only when the command is
    invoked
    - run imports through the embedded local app-server
    - show start and completion messages, refresh configuration, and block
    duplicate imports while one is pending
    - reject the flow for unsupported remote and local-daemon sessions
    
    ## Validation
    
    - `just test -p codex-tui external_agent_config_migration` (10 passed)
    - manually exercised an isolated TUI fixture with existing
    external-agent setup and session data using a fresh `CODEX_HOME`
    - verified picker customization, plugin and session detection, import
    completion, repeated invocation, and imported-session resume context
    - the broader `just test -p codex-tui` run passed 2,805 tests, with 2
    unrelated guardian feature-flag failures and 4 skipped tests
    
    ## Draft follow-ups
    
    - review whether completion messaging should remain attached to the
    initiating chat if the user switches chats during an import
    - review shutdown semantics for an in-progress background import
    
    ## Stack
    
    1. [#27064](https://github.com/openai/codex/pull/27064): remove the
    startup migration flow
    2. [#27065](https://github.com/openai/codex/pull/27065): extract the
    picker renderer
    3. [#27070](https://github.com/openai/codex/pull/27070): add the
    external-agent import picker UX
    4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow
    through `/import`
    
    **This PR is stack item 4.** Draft while the lower stack dependencies
    are reviewed.
  • Reduce TUI legacy core dependencies (#26711)
    ## Why
    
    The TUI still reached through `app-server-client::legacy_core` for
    thread-name normalization and project-instruction filename details. In
    particular, checking the TUI's local filesystem for `/init` is incorrect
    for remote app-server sessions, where the server owns the working
    directory and instruction discovery.
    
    ## What changed
    
    - use the instruction source paths supplied by the app server to decide
    whether `/init` should avoid overwriting project instructions
    - keep the small thread-name normalization helper local to the TUI
    - remove the now-unused instruction filename constants, utility module,
    and other unused `legacy_core` re-exports
    - make status helper tests independent of concrete instruction filenames
    
    ## Verification
    
    - `just test -p codex-app-server-client`
    - `just test -p codex-tui
    slash_init_skips_when_project_instructions_are_loaded`
    - `just test -p codex-tui` ran 2,799 tests; 2,797 passed and two
    unrelated guardian feature-flag tests failed reproducibly in untouched
    code
    
    ### Manual test
    
    Started an app server over WebSocket with a remote workspace containing
    `AGENTS.md`, then connected the TUI using `--remote`. After confirming
    `thread/start` returned the file in `instructionSources`, deleted
    `AGENTS.md` and ran `/init` in the existing session.
    
    The TUI still reported that project instructions already existed and
    skipped `/init`. The trace contained no `turn/start` request, confirming
    the decision came from app-server session state rather than a new
    client-local filesystem check.
  • Switch runtime to cloud config bundle (#24622)
    ## Summary
    
    - Adapts the moved `codex-cloud-config` crate from the legacy cloud
    requirements endpoint to the new config bundle endpoint.
    - Switches runtime consumers from `CloudRequirementsLoader` to
    `CloudConfigBundleLoader` so one shared bundle supplies cloud-delivered
    config and requirements.
    - Removes the legacy cloud requirements domain loader path.
    
    ## Details
    
    This intentionally keeps `codex-cloud-config` monolithic for review
    lineage: the previous PR establishes the crate move, and this PR shows
    the behavior change against that moved implementation. A follow-up PR
    splits the module back into focused files.
    
    The new bundle path preserves the important cloud requirements loader
    semantics where intended: account-scoped signed cache, 30 minute TTL, 5
    minute refresh cadence, retry/backoff, auth recovery, and fail-closed
    startup loading. The cached payload changes from a single requirements
    TOML string to the backend-delivered bundle, and validation rejects
    malformed config or requirements fragments before cache write/use.
  • Show remote connection details in /status (#24420)
    ## Summary
    
    Fixes #24411.
    
    `/status` currently has no way to show when the TUI is talking to Codex
    through a remote transport. That makes embedded local sessions, local
    daemon sessions, and true remote sessions look the same, and it hides
    the remote server version when debugging connection-specific behavior.
    
    This PR adds a single `Remote` row for non-embedded connections only.
    The row shows the sanitized connection address and a dimmed version
    parenthetical, preserving the existing status output for embedded local
    sessions.
    
    <img width="791" height="144" alt="image"
    src="https://github.com/user-attachments/assets/529d7940-1c45-4586-8b06-f20a1f04b771"
    />
    
    
    ## Verification
    
    - Manually validated the new row when connecting remotely, either
    implicitly through the local daemon or explicitly.
  • Add thread/settings/update app-server API (#23502)
    ## Why
    
    App-server clients need a way to update a thread's next-turn settings
    without starting a turn, adding transcript content, or waiting for turn
    lifecycle events. This gives settings UI a direct path for durable
    thread settings while clients observe the eventual effective state
    through a notification.
    
    This is a simplified rework of PR
    https://github.com/openai/codex/pull/22509. In particular, it changes
    the `thread/settings/update` API to return immediately rather than
    waiting and returning the effective (updated) thread settings. This
    makes the new API consistent with `turn/start` and greatly reduces the
    complexity of the implementation relative to the earlier attempt.
    
    ## What Changed
    
    - Adds experimental `thread/settings/update` with partial-update request
    fields and an empty acknowledgment response.
    - Adds experimental `thread/settings/updated`, carrying full effective
    `ThreadSettings` and scoped by `threadId` to subscribed clients for the
    affected thread.
    - Shares durable settings validation with `turn/start`, including
    `sandboxPolicy` plus `permissions` rejection and `serviceTier: null`
    clearing.
    - Emits the same settings notification when `turn/start` overrides
    change the stored effective thread settings.
    - Regenerates app-server protocol schema fixtures and updates
    `app-server/README.md`.
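    
    One way to picture the partial-update semantics, including
    `serviceTier: null` clearing, is a nested `Option`. The sketch below is
    an assumption-laden illustration: `SettingsUpdate`, `apply_update`, and
    the `model` field are hypothetical, and the real `ThreadSettings` has
    more fields than shown here.
    
    ```rust
    /// Hypothetical subset of the effective thread settings.
    #[derive(Debug, Clone, PartialEq)]
    struct ThreadSettings {
        model: String,
        service_tier: Option<String>,
    }
    
    /// Hypothetical partial update. The outer `Option` means "field present
    /// in the request"; the inner one separates a value from explicit `null`.
    #[derive(Debug, Default)]
    struct SettingsUpdate {
        model: Option<String>,
        service_tier: Option<Option<String>>,
    }
    
    /// Merges a partial update into the current settings and returns the
    /// full effective settings, as a notification payload would carry them.
    fn apply_update(current: &ThreadSettings, update: SettingsUpdate) -> ThreadSettings {
        ThreadSettings {
            model: update.model.unwrap_or_else(|| current.model.clone()),
            service_tier: match update.service_tier {
                None => current.service_tier.clone(), // field omitted: keep
                Some(explicit) => explicit,           // value or null: replace
            },
        }
    }
    
    fn main() {
        let current = ThreadSettings {
            model: "example-model".to_string(),
            service_tier: Some("example-tier".to_string()),
        };
        // Explicit null clears the tier; the omitted model is preserved.
        let cleared = apply_update(
            &current,
            SettingsUpdate { service_tier: Some(None), ..Default::default() },
        );
        println!("{cleared:?}");
    }
    ```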
  • Make local environment optional in EnvironmentManager (#23369)
    ## Summary
    - make `EnvironmentManager` local environment/runtime paths optional
    - simplify constructor surface around snapshot materialization
    - rename local env accessors to `require_local_environment` /
    `try_local_environment`
    
    ## Validation
    - devbox Bazel build for touched crate surfaces
    - `//codex-rs/exec-server:exec-server-unit-tests`
    - `//codex-rs/app-server-client:app-server-client-unit-tests`
    - filtered touched `//codex-rs/core:core-unit-tests` cases
  • config: add strict config parsing (#20559)
    ## Why
    
    Codex intentionally ignores unknown `config.toml` fields by default so
    older and newer config files keep working across versions. That leniency
    also makes typo detection hard because misspelled or misplaced keys
    disappear silently.
    
    This change adds an opt-in strict config mode so users and tooling can
    fail fast on unrecognized config fields without changing the default
    permissive behavior.
    
    This feature is possible because `serde_ignored` exposes the exact
    signal Codex needs: it lets Codex run ordinary Serde deserialization
    while recording fields Serde would otherwise ignore. That avoids
    requiring `#[serde(deny_unknown_fields)]` across every config type and
    keeps strict validation opt-in around the existing config model.
    
    ## What Changed
    
    ### Added strict config validation
    
    - Added `serde_ignored`-based validation for `ConfigToml` in
    `codex-rs/config/src/strict_config.rs`.
    - Combined `serde_ignored` with `serde_path_to_error` so strict mode
    preserves typed config error paths while also collecting fields Serde
    would otherwise ignore.
    - Added strict-mode validation for unknown `[features]` keys, including
    keys that would otherwise be accepted by `FeaturesToml`'s flattened
    boolean map.
    - Kept typed config errors ahead of ignored-field reporting, so
    malformed known fields are reported before unknown-field diagnostics.
    - Added source-range diagnostics for top-level and nested unknown config
    fields, including non-file managed preference source names.
    
    ### Kept parsing single-pass per source
    
    - Reworked file and managed-config loading so strict validation reuses
    the already parsed `TomlValue` for that source.
    - For actual config files and managed config strings, the loader now
    reads once, parses once, and validates that same parsed value instead of
    deserializing multiple times.
    - Validated `-c` / `--config` override layers with the same
    base-directory context used for normal relative-path resolution, so
    unknown override keys are still reported when another override contains
    a relative path.
    
    ### Scoped `--strict-config` to config-heavy entry points
    
    - Added support for `--strict-config` on the main config-loading entry
    points where it is most useful:
      - `codex`
      - `codex resume`
      - `codex fork`
      - `codex exec`
      - `codex review`
      - `codex mcp-server`
      - `codex app-server` when running the server itself
      - the standalone `codex-app-server` binary
      - the standalone `codex-exec` binary
    - Commands outside that set now reject `--strict-config` early with
    targeted errors instead of accepting it everywhere through shared CLI
    plumbing.
    - `codex app-server` subcommands such as `proxy`, `daemon`, and
    `generate-*` are intentionally excluded from the first rollout.
    - When app-server strict mode sees invalid config, app-server exits with
    the config error instead of logging a warning and continuing with
    defaults.
    - Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli`
    instead of extending shared `ReviewArgs`, so `--strict-config` stays on
    the outer config-loading command surface and does not become part of the
    reusable review payload used by `codex exec review`.
    
    ### Coverage
    
    - Added tests for top-level and nested unknown config fields, unknown
    `[features]` keys, typed-error precedence, source-location reporting,
    and non-file managed preference source names.
    - Added CLI coverage showing invalid `--enable`, invalid `--disable`,
    and unknown `-c` overrides still error when `--strict-config` is
    present, including compound-looking feature names such as
    `multi_agent_v2.subagent_usage_hint_text`.
    - Added integration coverage showing both `codex app-server
    --strict-config` and standalone `codex-app-server --strict-config` exit
    with an error for unknown config fields instead of starting with
    fallback defaults.
    - Added coverage showing unsupported command surfaces reject
    `--strict-config` with explicit errors.
    
    ## Example Usage
    
    Run Codex with strict config validation enabled:
    
    ```shell
    codex --strict-config
    ```
    
    Strict config mode is also available on the supported config-heavy
    subcommands:
    
    ```shell
    codex --strict-config exec "explain this repository"
    codex review --strict-config --uncommitted
    codex mcp-server --strict-config
    codex app-server --strict-config --listen off
    codex-app-server --strict-config --listen off
    ```
    
    For example, if `~/.codex/config.toml` contains a typo in a key name:
    
    ```toml
    model = "gpt-5"
    approval_polic = "on-request"
    ```
    
    then `codex --strict-config` reports the misspelled key instead of
    silently ignoring it. The path is shortened to `~` here for readability:
    
    ```text
    $ codex --strict-config
    Error loading config.toml:
    ~/.codex/config.toml:2:1: unknown configuration field `approval_polic`
      |
    2 | approval_polic = "on-request"
      | ^^^^^^^^^^^^^^
    ```
    
    Without `--strict-config`, Codex keeps the existing permissive behavior
    and ignores the unknown key.
    
    Strict config mode also validates ad-hoc `-c` / `--config` overrides:
    
    ```text
    $ codex --strict-config -c foo=bar
    Error: unknown configuration field `foo` in -c/--config override
    
    $ codex --strict-config -c features.foo=true
    Error: unknown configuration field `features.foo` in -c/--config override
    ```
    
    Invalid feature toggles are rejected too, including values that look
    like nested config paths:
    
    ```text
    $ codex --strict-config --enable does_not_exist
    Error: Unknown feature flag: does_not_exist
    
    $ codex --strict-config --disable does_not_exist
    Error: Unknown feature flag: does_not_exist
    
    $ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text
    Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text
    ```
    
    Unsupported commands reject the flag explicitly:
    
    ```text
    $ codex --strict-config cloud list
    Error: `--strict-config` is not supported for `codex cloud`
    ```
    
    ## Verification
    
    The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid
    `--disable`, the compound `multi_agent_v2.subagent_usage_hint_text`
    case, unknown `-c` overrides, app-server strict startup failure through
    `codex app-server`, and rejection for unsupported commands such as
    `codex cloud`, `codex mcp`, `codex remote-control`, and `codex
    app-server proxy`.
    
    The config and config-loader tests cover unknown top-level fields,
    unknown nested fields, unknown `[features]` keys, source-location
    reporting, non-file managed config sources, and `-c` validation for keys
    such as `features.foo`.
    
    The app-server test suite covers standalone `codex-app-server
    --strict-config` startup failure for an unknown config field.
    
    ## Documentation
    
    The Codex CLI docs on developers.openai.com/codex should mention
    `--strict-config` as an opt-in validation mode for supported
    config-heavy entry points once this ships.
  • Add support for UDS in codex --remote (#22414)
    ## Why
    
    Adds support for Unix domain socket (UDS) connections in `codex --remote`.
    
    The TUI now also connects to the local app-server over UDS by default
    when the app-server is running and listening on a UDS.
    
    ## What Changed
    
    - Introduced `RemoteAppServerEndpoint` with `WebSocket` and `UnixSocket`
    variants.
    - Reused the existing JSON-RPC-over-WebSocket protocol over either a TCP
    WebSocket stream or a UDS stream.
    - Updated `codex --remote` to accept `ws://host:port`,
    `wss://host:port`, `unix://`, and `unix://PATH`.
    - Kept `--remote-auth-token-env` restricted to `wss://` and loopback
    `ws://` remotes.
    - Added a fast TUI startup probe for the default daemon socket, falling
    back to the embedded app server when the daemon is absent or
    unresponsive.
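    
    The accepted address forms can be sketched as a small parser. Only the
    enum name and its two variants come from the description; the variant
    payloads, the `parse_remote` function, and the reading of a bare
    `unix://` as "use the default daemon socket" are assumptions made for
    illustration.
    
    ```rust
    #[derive(Debug, PartialEq)]
    enum RemoteAppServerEndpoint {
        WebSocket { url: String },
        /// `None` stands for the default daemon socket (an assumption
        /// about what a bare `unix://` means).
        UnixSocket { path: Option<String> },
    }
    
    /// Hypothetical parser for the address forms listed above.
    fn parse_remote(addr: &str) -> Result<RemoteAppServerEndpoint, String> {
        if let Some(path) = addr.strip_prefix("unix://") {
            let path = (!path.is_empty()).then(|| path.to_string());
            return Ok(RemoteAppServerEndpoint::UnixSocket { path });
        }
        for scheme in ["ws://", "wss://"] {
            if let Some(rest) = addr.strip_prefix(scheme) {
                if rest.is_empty() {
                    return Err(format!("missing host:port in `{addr}`"));
                }
                return Ok(RemoteAppServerEndpoint::WebSocket { url: addr.to_string() });
            }
        }
        Err(format!("unsupported remote address `{addr}`"))
    }
    
    fn main() {
        for addr in ["ws://127.0.0.1:4500", "unix://", "unix:///tmp/codex.sock", "http://x"] {
            println!("{addr} -> {:?}", parse_remote(addr));
        }
    }
    ```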
    
    ## Verification
    
    - Manually verified that the updated remote flow works.
    - Added coverage for UDS remote round trips, WebSocket auth headers,
    auth-token transport policy, remote address parsing, and missing-daemon
    fallback.
    - Ran focused remote test coverage locally.
  • [codex] request desktop attestation from app (#20619)
    ## Summary
    
    TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
    attestation token and attach it as `x-oai-attestation` on the scoped
    ChatGPT Codex request paths.
    
    ![DeviceCheck attestation
    interface](https://raw.githubusercontent.com/openai/codex/dev/jm/devicecheck-diagram-assets/pr-assets/devicecheck-attestation-interface.png)
    
    ## Details
    
    This PR teaches the Codex app-server runtime how to request and attach
    an attestation token. It does not generate DeviceCheck tokens directly;
    instead, it relies on the connected desktop app to advertise that it can
    generate attestation and then asks that app for a fresh header value
    when needed.
    
    The flow is:
    
    1. The Codex desktop app connects to app-server.
    2. During `initialize`, the app can advertise that it supports
    `requestAttestation`.
    3. Before app-server calls selected ChatGPT Codex endpoints, it sends
    the internal server request `attestation/generate` to the app.
    4. app-server receives a pre-encoded header value back.
    5. app-server forwards that value as `x-oai-attestation` on the scoped
    outbound requests.
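    
    The flow above can be sketched with a small trait standing in for the
    connected desktop app. Everything except the header name, the
    `requestAttestation` capability, and the `attestation/generate` request
    is hypothetical: `AttestationClient`, `attestation_headers`, `FakeApp`,
    and the token value are illustrative only.
    
    ```rust
    /// Hypothetical stand-in for the connected desktop app.
    trait AttestationClient {
        /// Whether `initialize` advertised `requestAttestation` support.
        fn supports_attestation(&self) -> bool;
        /// Models the `attestation/generate` server request. Returns a
        /// pre-encoded header value, or `None` if the app could not provide one.
        fn generate(&self) -> Option<String>;
    }
    
    /// Builds the extra headers for one outbound request. The header is
    /// attached only for in-scope endpoints and capable clients.
    fn attestation_headers(
        client: &dyn AttestationClient,
        endpoint_in_scope: bool,
    ) -> Vec<(String, String)> {
        if !endpoint_in_scope || !client.supports_attestation() {
            return Vec::new();
        }
        match client.generate() {
            Some(value) => vec![("x-oai-attestation".to_string(), value)],
            None => Vec::new(),
        }
    }
    
    /// Fake app used only for this sketch.
    struct FakeApp {
        supported: bool,
    }
    
    impl AttestationClient for FakeApp {
        fn supports_attestation(&self) -> bool {
            self.supported
        }
        fn generate(&self) -> Option<String> {
            Some("v1.example-token".to_string()) // made-up value
        }
    }
    
    fn main() {
        let app = FakeApp { supported: true };
        println!("in scope:     {:?}", attestation_headers(&app, true));
        println!("out of scope: {:?}", attestation_headers(&app, false));
    }
    ```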
    
    The code in this repo is mostly protocol and runtime plumbing: it adds
    the app-server request/response shape, introduces an attestation
    provider in core, wires that provider into Responses / compaction /
    realtime setup paths, and covers the intended scoping with tests. The
    signed macOS DeviceCheck generation remains owned by the desktop app PR.
    
    ## Related PR
    
    - Codex desktop app implementation:
    https://github.com/openai/openai/pull/878649
    
    ## Validation
    
    <details>
    <summary>Tests run</summary>
    
    ```sh
    cargo test -p codex-app-server-protocol
    cargo test -p codex-core attestation --lib
    cargo test -p codex-app-server --lib attestation
    ```
    
    Also ran:
    
    ```sh
    just fix -p codex-core
    just fix -p codex-app-server
    just fix -p codex-app-server-protocol
    just fmt
    just write-app-server-schema
    ```
    
    </details>
    
    <details>
    <summary>E2E DeviceCheck validation</summary>
    
    First validated the signed desktop app boundary directly: launched a
    packaged signed `Codex.app`, sent `attestation/generate`, decoded the
    returned `v1.` attestation header, and validated the extracted
    DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
    bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
    `is_ok: true`.
    
    Then ran the fuller app + app-server flow. The packaged `Codex.app`
    launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
    MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
    requested `attestation/generate` from the real Electron app process, and
    the intercepted `/backend-api/codex/responses` traffic included
    `x-oai-attestation` on both routes:
    
    ```text
    GET  /backend-api/codex/responses  Upgrade: websocket  x-oai-attestation: present
    POST /backend-api/codex/responses  Upgrade: none       x-oai-attestation: present
    ```
    
    The captured header decoded to a DeviceCheck token that also validated
    with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
    team `2DC432GLL2`).
    
    </details>
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Load configured environments from CODEX_HOME (#20667)
    ## Why
    
    The earlier PRs add stdio transport support and the config-backed
    environment provider, but the feature remains inert until normal Codex
    entrypoints construct `EnvironmentManager` with enough context to
    discover `CODEX_HOME/environments.toml`. This final stack PR activates
    the provider while preserving the legacy `CODEX_EXEC_SERVER_URL`
    fallback when no environments file exists.
    
    **Stack position:** this is PR 5 of 5. It is the product wiring PR that
    activates the configured environment provider added in PR 4.
    
    ## What Changed
    
    - Thread `codex_home` into `EnvironmentManagerArgs`.
    - Change `EnvironmentManager::new(...)` to load the provider from
    `CODEX_HOME`.
    - Preserve legacy behavior by falling back to
    `DefaultEnvironmentProvider::from_env()` when `environments.toml` is
    absent.
    - Make `environments.toml`-backed managers start new threads with all
    configured environments, default first, while keeping the legacy env-var
    path single-default.
    - Update the app-server, TUI, exec, MCP server, connector, prompt-debug,
    and thread-manager-sample callsites to pass `codex_home` and handle
    provider-loading errors.
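
    The fallback and startup rules above can be sketched roughly as follows. This is an illustration only, not the actual `EnvironmentManager` code: `ProviderKind`, `select_provider`, and `startup_environment_ids` are hypothetical names, and the sketch only checks that `environments.toml` exists rather than parsing it.

    ```rust
    use std::path::Path;

    /// Hypothetical stand-in for the two provider sources described above.
    #[derive(Debug, PartialEq)]
    enum ProviderKind {
        /// `CODEX_HOME/environments.toml` exists and drives configuration.
        EnvironmentsToml,
        /// No file: fall back to the legacy `CODEX_EXEC_SERVER_URL` path.
        LegacyEnvVar,
    }

    fn select_provider(codex_home: &Path) -> ProviderKind {
        if codex_home.join("environments.toml").is_file() {
            ProviderKind::EnvironmentsToml
        } else {
            ProviderKind::LegacyEnvVar
        }
    }

    /// Environment ids a new thread starts with: every configured environment
    /// (default first) for the TOML provider, only the default otherwise.
    fn startup_environment_ids(kind: &ProviderKind, default_id: &str, all_ids: &[&str]) -> Vec<String> {
        match kind {
            ProviderKind::EnvironmentsToml => {
                let mut ids = vec![default_id.to_string()];
                ids.extend(all_ids.iter().filter(|id| **id != default_id).map(|id| id.to_string()));
                ids
            }
            ProviderKind::LegacyEnvVar => vec![default_id.to_string()],
        }
    }
    ```

    Keying the multi-environment behavior on the provider kind, rather than on the number of environments, mirrors the first self-review note: the legacy provider can expose both `local` and `remote` yet should still start threads with a single default.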
    
    ## Self-Review Notes
    
    - The multi-environment startup path is intentionally tied to the
    `environments.toml` provider. Using `>1` configured environment as the
    only signal would also expand the legacy `CODEX_EXEC_SERVER_URL`
    provider because it keeps `local` addressable alongside `remote`.
    - The startup environment list is still derived inside
    `EnvironmentManager`; the provider only says whether its snapshot should
    start new threads with all configured environments.
    - The thread-manager sample was updated to pass the current
    `ThreadManager::new(...)` installation id argument so the stack compiles
    under Bazel.
    
    ## Stack
    
    - 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
    listener
    - 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
    client transport
    - 3. https://github.com/openai/codex/pull/20665 - Make environment
    providers own default selection
    - 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
    environments TOML provider
    - **5. This PR:** https://github.com/openai/codex/pull/20667 - Load
    configured environments from CODEX_HOME
    
    Split from original draft: https://github.com/openai/codex/pull/20508
    
    ## Validation
    
    - `just fmt`
    - `git diff --check`
    - `bazel build --config=remote --strategy=remote
    --remote_download_toplevel
    //codex-rs/thread-manager-sample:codex-thread-manager-sample`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel
    //codex-rs/exec-server:exec-server-unit-tests`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel --test_sharding_strategy=disabled
    --test_arg=default_thread_environment_selections_use_manager_default_id
    //codex-rs/core:core-unit-tests`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel --test_sharding_strategy=disabled
    --test_arg=start_thread_uses_all_default_environments_from_codex_home
    //codex-rs/core:core-unit-tests`
    
    ## Documentation
    
    This activates `CODEX_HOME/environments.toml`; user-facing documentation
    should be added before this stack is treated as a documented public
    workflow.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` entails running both standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
    
    For the collection of crates below, this speeds up test
    execution by >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, the speedup is >2x. Before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
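
    As a quick check, the quoted speedups can be recomputed from the reported means. The helper below is a trivial sketch (`speedup` is a hypothetical name) that divides the "before" mean by the "after" mean, both in milliseconds.

    ```rust
    /// Ratio of the "before" mean to the "after" mean, both in milliseconds.
    fn speedup(before_ms: f64, after_ms: f64) -> f64 {
        before_ms / after_ms
    }
    ```

    With the numbers above, `speedup(1849.0, 428.6)` is a little over 4 and `speedup(491.1, 213.9)` is a little over 2. Note that the multi-crate "before" run reports σ = 4.455 s and a 14.529 s maximum, so that ratio is sensitive to outliers; the single-crate runs have much tighter spreads.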
    
    Co-authored-by: Codex <noreply@openai.com>
  • Revert state DB injection and agent graph store (#21481)
    ## Why
    
    Reverts #20689 to restore the previous optional state DB plumbing. The
    conflict resolution keeps the newer installation ID and session/thread
    identity changes that landed after #20689, while removing the mandatory
    state DB and agent graph store dependency from ThreadManager
    construction.
    
    ## What changed
    
    - Restored `Option<StateDbHandle>` through app-server, MCP server,
    prompt debug, and test entry points.
    - Removed the `codex-core` dependency on `codex-agent-graph-store` and
    reverted descendant lookup back to the existing state DB path when
    available.
    - Kept newer `installation_id` forwarding by passing it beside the
    optional DB handle.
    - Kept local thread-name updates working when the optional state DB
    handle is absent.
    
    ## Validation
    
    - `git diff --check`
    - `cargo test -p codex-thread-store`
    - `cargo test -p codex-state -p codex-rollout -p
    codex-app-server-protocol`
    - Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
    codex-app-server -p codex-app-server-client -p codex-mcp-server -p
    codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
    ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
    2026-01-19)` on `aarch64-apple-darwin`.
  • Move message history out of core (#21278)
    ## Why
    
    Message history was implemented inside `codex-core` and surfaced through
    core protocol ops and `SessionConfiguredEvent` fields even though the
    current consumer is TUI-local prompt recall. That made core own UI
    history persistence and exposed `history_log_id` / `history_entry_count`
    through surfaces that app-server and other clients do not need.
    
    This change moves message history persistence out of core and keeps the
    recall plumbing local to the TUI.
    
    ## What changed
    
    - Added a new `codex-message-history` crate for appending, looking up,
    trimming, and reading metadata from `history.jsonl`.
    - Removed core protocol history ops/events: `AddToHistory`,
    `GetHistoryEntryRequest`, and `GetHistoryEntryResponse`.
    - Removed `history_log_id` and `history_entry_count` from
    `SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly.
    - Updated the TUI to dispatch local app events for message-history
    append/lookup and keep its persistent-history metadata in TUI session
    state.
    
    ## Validation
    
    - `cargo test -p codex-message-history -p codex-protocol`
    - `cargo test -p codex-exec event_processor_with_json_output`
    - `cargo test -p codex-mcp-server outgoing_message`
    - `cargo test -p codex-tui`
    - `just fix -p codex-message-history -p codex-protocol -p codex-core -p
    codex-tui -p codex-exec -p codex-mcp-server`
  • test: isolate app-server-client in-process test state (#21328)
    ## Why
    
    The in-process `app-server-client` tests were still building their
    configs from the ambient `codex_home` and letting the embedded app
    server create its own state DB when `state_db` was absent. That matters
    because in-process startup falls back to
    `init_state_db_from_config(...)` in that case, so tests can otherwise
    share persisted state instead of getting isolated fixtures:
    [`app-server/src/in_process.rs`](https://github.com/openai/codex/blob/a98623511ba433154ec811fc63091617f5945438/codex-rs/app-server/src/in_process.rs#L368-L373).
    
    ## What changed
    
    - Give each in-process test client its own temporary `codex_home`.
    - Initialize the matching state DB from that per-client config and pass
    it into the client explicitly.
    - Keep the temp directory alive for the lifetime of the test client
    through a small `TestClient` wrapper.
    - Add `tempfile` as a dev dependency for the new harness.
    
    The updated setup lives in
    [`app-server-client/src/lib.rs`](https://github.com/openai/codex/blob/35c1133d45d10931914dbb88a1246a195d025ff6/codex-rs/app-server-client/src/lib.rs#L982-L1055).
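
    The ownership idea behind the wrapper can be sketched as below. This is a simplified illustration, not the code linked above: `TempHome` is a hand-rolled guard standing in for the `tempfile` crate, and `Client` and its fields are hypothetical.

    ```rust
    use std::fs;
    use std::path::{Path, PathBuf};
    use std::sync::atomic::{AtomicU64, Ordering};

    /// Minimal temp-dir guard (the PR uses the `tempfile` crate instead).
    struct TempHome {
        path: PathBuf,
    }

    impl TempHome {
        fn new() -> std::io::Result<Self> {
            static NEXT: AtomicU64 = AtomicU64::new(0);
            let n = NEXT.fetch_add(1, Ordering::Relaxed);
            let name = format!("codex-home-sketch-{}-{}", std::process::id(), n);
            let path = std::env::temp_dir().join(name);
            fs::create_dir_all(&path)?;
            Ok(Self { path })
        }
    }

    impl Drop for TempHome {
        fn drop(&mut self) {
            // Best-effort cleanup when the owning test client goes away.
            let _ = fs::remove_dir_all(&self.path);
        }
    }

    /// Hypothetical client: only remembers which `codex_home` it was built with.
    struct Client {
        codex_home: PathBuf,
    }

    /// Wrapper that owns the directory so it outlives every use of `client`.
    struct TestClient {
        client: Client,
        _codex_home: TempHome,
    }

    impl TestClient {
        fn start() -> std::io::Result<Self> {
            let home = TempHome::new()?;
            let client = Client { codex_home: home.path.clone() };
            Ok(Self { client, _codex_home: home })
        }

        fn codex_home(&self) -> &Path {
            &self.client.codex_home
        }
    }
    ```

    Because struct fields drop in declaration order, `client` is dropped before the directory guard, so the client never observes a deleted `codex_home`.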
    
    ## Testing
    
    - Existing `codex-app-server-client` tests continue to exercise the
    updated in-process client path through the isolated helper.
  • Inject state DB, agent graph store (#20689)
    ## Why
    
    We want the agent graph store to be passed down the stack as a real
    dependency, the same way we already treat the thread store.
    
    Doing so lets us support implementations other than the local
    SQLite-backed one. Right now most code instantiates a state DB and an
    agent graph store just-in-time. Ideally, we would not depend on the
    state DB directly but only read through the higher-level interfaces.
    
    This change makes the dependency boundaries explicit and moves state DB
    initialization to process bootstrap instead of hiding it inside local
    store implementations.
    
    ## What changed
    
    - `ThreadManager` now requires a `StateDbHandle` and an
    `AgentGraphStore` at construction time instead of treating them as
    optional internals.
    - The local store constructors no longer lazily initialize SQLite.
    Callers now initialize the state DB once per process and use that shared
    handle to build:
      - `LocalThreadStore`
      - `LocalAgentGraphStore`
    - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
    thread-manager sample) now initialize the state DB up front and inject
    the resulting handle down the stack.
    - `app-server` now consistently uses its process-scoped state DB handle
    instead of reopening SQLite or trying to recover it from loaded threads.
    - Device-key storage now reuses the shared state DB handle instead of
    maintaining its own lazy opener.
    - The thread archive / descendant traversal paths now use the injected
    `AgentGraphStore` instead of reaching through local
    thread-store-specific state.
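
    A minimal sketch of this constructor-injection shape follows. The trait and struct here are heavily simplified assumptions: the real `AgentGraphStore` and `ThreadManager` have different signatures, `InMemoryGraph` is hypothetical, and the traversal assumes the spawn graph is acyclic.

    ```rust
    use std::collections::HashMap;
    use std::sync::Arc;

    /// Hypothetical, simplified version of the store interface.
    trait AgentGraphStore {
        /// Direct children spawned by `thread_id`.
        fn children(&self, thread_id: &str) -> Vec<String>;
    }

    /// In-memory implementation, standing in for a SQLite-backed one.
    struct InMemoryGraph {
        edges: HashMap<String, Vec<String>>,
    }

    impl AgentGraphStore for InMemoryGraph {
        fn children(&self, thread_id: &str) -> Vec<String> {
            self.edges.get(thread_id).cloned().unwrap_or_default()
        }
    }

    /// The manager receives its store at construction time instead of
    /// creating one lazily.
    struct ThreadManager {
        graph: Arc<dyn AgentGraphStore>,
    }

    impl ThreadManager {
        fn new(graph: Arc<dyn AgentGraphStore>) -> Self {
            Self { graph }
        }

        /// Depth-first walk over all spawned descendants of `root`.
        fn descendants(&self, root: &str) -> Vec<String> {
            let mut out = Vec::new();
            let mut stack = self.graph.children(root);
            while let Some(id) = stack.pop() {
                stack.extend(self.graph.children(&id));
                out.push(id);
            }
            out
        }
    }
    ```

    Because the manager only sees the trait object, a non-SQLite implementation can be swapped in without touching the traversal code.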
    
    ## Verification
    
    - `cargo check -p codex-core -p codex-thread-store -p codex-app-server
    -p codex-mcp-server -p codex-thread-manager-sample --tests`
    - `cargo test -p codex-thread-store`
    - `cargo test -p codex-core
    thread_manager_accepts_separate_agent_graph_store_and_thread_store --
    --nocapture`
    - `cargo test -p codex-app-server
    thread_archive_archives_spawned_descendants -- --nocapture`
  • add turn items view to app-server turns (#21063)
    ## Why
    
    `Turn.items` currently overloads an empty array to mean either that no
    items exist or that the server intentionally did not load them for this
    response. That ambiguity blocks future lazy-loading work where clients
    need to distinguish unloaded, summary, and fully hydrated turn payloads.
    
    ## What changed
    
    - add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full`
    variants
    - add required `itemsView` metadata to app-server `Turn` payloads
    - mark reconstructed persisted history as `full` and live shell-style
    turn payloads as `notLoaded`
    - keep current `thread/turns/list` behavior unchanged and document that
    it still returns `full` turns today
    - regenerate the JSON and TypeScript protocol fixtures
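
    To illustrate how the new metadata removes the ambiguity, here is a small sketch. The variant names come from this PR; the `Turn` struct is a trimmed-down assumption, and `wire_name` and `is_known_empty` are hypothetical helpers rather than part of the protocol.

    ```rust
    /// Mirrors the three variants named in this PR; wire names use camelCase.
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum TurnItemsView {
        NotLoaded,
        Summary,
        Full,
    }

    impl TurnItemsView {
        fn wire_name(self) -> &'static str {
            match self {
                TurnItemsView::NotLoaded => "notLoaded",
                TurnItemsView::Summary => "summary",
                TurnItemsView::Full => "full",
            }
        }
    }

    /// Hypothetical, trimmed-down turn payload.
    struct Turn {
        items: Vec<String>,
        items_view: TurnItemsView,
    }

    /// An empty `items` list only means "no items exist" when the view is `Full`.
    fn is_known_empty(turn: &Turn) -> bool {
        turn.items.is_empty() && turn.items_view == TurnItemsView::Full
    }
    ```

    A client can then treat an empty `notLoaded` turn as "fetch later" instead of rendering it as a turn with no items.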
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server thread_read_can_include_turns`
    - `cargo test -p codex-app-server
    thread_turns_list_can_page_backward_and_forward`
    - `cargo test -p codex-app-server
    thread_resume_rejects_history_when_thread_is_running`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fmt`
  • [codex-analytics] add item lifecycle timing (#20514)
    ## Why
    
    Tool families already disagree on what their existing `duration` fields
    mean, so lifecycle latency should live on the shared item envelope
    instead of being inferred from per-tool execution fields. Carrying that
    envelope through app-server notifications gives downstream consumers one
    reusable timing signal without pretending every tool has the same
    execution semantics.
    
    ## What changed
    
    - Adds `started_at_ms` to core `ItemStartedEvent` values and
    `completed_at_ms` to core `ItemCompletedEvent` values.
    - Populates those timestamps in the shared session lifecycle emitters,
    so protocol-native items get timing without each producer tracking its
    own clock state.
    - Exposes `startedAtMs` on app-server `item/started` notifications and
    `completedAtMs` on `item/completed` notifications.
    - Maps the lifecycle timestamps through the app-server boundary while
    leaving legacy-converted notifications nullable when no lifecycle
    timestamp exists.
    - Regenerates the app-server JSON schema and TypeScript fixtures for the
    notification-envelope change and updates downstream fixtures that
    construct those notifications directly.
    - Extends the existing web-search and image-generation integration flows
    to assert the new lifecycle timestamps on the native item events.
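
    A downstream consumer might derive lifecycle latency from the two timestamps roughly like this. The function name and the `i64` millisecond type are assumptions for illustration; the PR only establishes that the fields exist and may be null for legacy-converted notifications.

    ```rust
    /// Lifecycle latency in milliseconds, or `None` when either timestamp is
    /// missing (as with legacy-converted notifications) or the completion
    /// timestamp precedes the start timestamp.
    fn lifecycle_latency_ms(started_at_ms: Option<i64>, completed_at_ms: Option<i64>) -> Option<i64> {
        let elapsed = completed_at_ms?.checked_sub(started_at_ms?)?;
        (elapsed >= 0).then_some(elapsed)
    }
    ```

    Returning `None` instead of a guessed value keeps the envelope timing separate from any per-tool `duration` field.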
    
    ## Verification
    
    - `cargo check -p codex-protocol -p codex-core -p
    codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec
    -p codex-app-server-client`
    - `cargo test -p codex-core --test all web_search_item_is_emitted`
    - `cargo test -p codex-core --test all
    image_generation_call_event_is_emitted`
    - `cargo test -p codex-app-server-protocol`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514).
    * #18748
    * #18747
    * #17090
    * #17089
    * __->__ #20514
  • state: pass state db handles through consumers (#20561)
    ## Why
    
    SQLite state was still being opened from consumer paths, including lazy
    `OnceCell`-backed thread-store call sites. That let one process
    construct multiple state DB connections for the same Codex home, which
    makes SQLite lock contention and `database is locked` failures much
    easier to hit.
    
    State DB lifetime should be chosen by main-like entrypoints and tests,
    then passed through explicitly. Consumers should use the supplied
    `Option<StateDbHandle>` or `StateDbHandle` and keep their existing
    filesystem fallback or error behavior when no handle is available.
    
    The startup path also needs to keep the rollout crate in charge of
    SQLite state initialization. Opening `codex_state::StateRuntime`
    directly bypasses rollout metadata backfill, so entrypoints should
    initialize through `codex_rollout::state_db` and receive a handle only
    after required rollout backfills have completed.
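
    The "initialize once, pass explicitly" pattern can be sketched as follows. Everything here is a simplified assumption: the real `StateDbHandle` is not necessarily an `Arc`, and `StateDb` and `SessionNameLookup` are hypothetical stand-ins for the actual consumers.

    ```rust
    use std::sync::Arc;

    /// Hypothetical stand-in for the real state DB runtime.
    struct StateDb {
        thread_names: std::collections::HashMap<String, String>,
    }

    /// Shared, cheaply clonable handle (the real `StateDbHandle` may differ).
    type StateDbHandle = Arc<StateDb>;

    /// A consumer that accepts an optional handle rather than opening SQLite itself.
    struct SessionNameLookup {
        state_db: Option<StateDbHandle>,
    }

    impl SessionNameLookup {
        fn new(state_db: Option<StateDbHandle>) -> Self {
            Self { state_db }
        }

        /// Use the DB when a handle was supplied; otherwise use the fallback.
        fn name_for(
            &self,
            thread_id: &str,
            filesystem_fallback: impl Fn(&str) -> Option<String>,
        ) -> Option<String> {
            match &self.state_db {
                Some(db) => db.thread_names.get(thread_id).cloned(),
                None => filesystem_fallback(thread_id),
            }
        }
    }
    ```

    An entrypoint would create one handle and clone it into each consumer, so the process holds a single connection instead of one per call site.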
    
    ## What Changed
    
    - Initialize the state DB in main-like entrypoints for CLI, TUI,
    app-server, exec, MCP server, and the thread-manager sample.
    - Pass `Option<StateDbHandle>` through `ThreadManager`,
    `LocalThreadStore`, app-server processors, TUI app wiring, rollout
    listing/recording, personality migration, shell snapshot cleanup,
    session-name lookup, and memory/device-key consumers.
    - Remove the lazy local state DB wrapper from the thread store so
    non-test consumers use only the supplied handle or their existing
    fallback path.
    - Make `codex_rollout::state_db::init` the local state startup path: it
    opens/migrates SQLite, runs rollout metadata backfill when needed, waits
    for concurrent backfill workers up to a bounded timeout, verifies
    completion, and then returns the initialized handle.
    - Keep optional/non-owning SQLite helpers, such as remote TUI local
    reads, as open-only paths that do not run startup backfill.
    - Switch app-server startup from direct
    `codex_state::StateRuntime::init` to the rollout state initializer so
    app-server cannot skip rollout backfill.
    - Collapse split rollout lookup/list APIs so callers use the normal
    methods with an optional state handle instead of `_with_state_db`
    variants.
    - Restore `getConversationSummary(ThreadId)` to delegate through
    `ThreadStore::read_thread` instead of a LocalThreadStore-specific
    rollout path special case.
    - Keep DB-backed rollout path lookup keyed on the DB row and file
    existence, without imposing the filesystem filename convention on
    existing DB rows.
    - Verify readable DB-backed rollout paths against `session_meta.id`
    before returning them, so a stale SQLite row that points at another
    thread's JSONL falls back to filesystem search and read-repairs the DB
    row.
    - Keep `debug prompt-input` filesystem-only so a one-off debug command
    does not initialize or backfill SQLite state just to print prompt input.
    - Keep goal-session test Codex homes alive only in the goal-specific
    helper, rather than leaking tempdirs from the shared session test
    helper.
    - Update tests and call sites to pass explicit state handles where DB
    behavior is expected and explicit `None` where filesystem-only behavior
    is intended.
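
    The stale-row handling described in the list above can be modeled with two maps. This is a toy sketch under stated assumptions: `db_rows` stands in for the SQLite table (thread id to rollout path), `Files` stands in for the filesystem (path to the `session_meta.id` stored in that file), and the function body is illustrative rather than the real `find_thread_path`.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical model: rollout path -> `session_meta.id` stored in that file.
    type Files = HashMap<String, String>;

    /// Resolve a thread's rollout path, trusting the DB row only when the file
    /// exists and its `session_meta.id` matches; otherwise search the
    /// filesystem and read-repair the row.
    fn find_thread_path(
        db_rows: &mut HashMap<String, String>,
        files: &Files,
        thread_id: &str,
    ) -> Option<String> {
        if let Some(path) = db_rows.get(thread_id) {
            if files.get(path).map(String::as_str) == Some(thread_id) {
                return Some(path.clone());
            }
        }
        // Filesystem fallback: scan for a file whose metadata names this thread.
        let found = files
            .iter()
            .find(|(_, id)| id.as_str() == thread_id)
            .map(|(path, _)| path.clone())?;
        db_rows.insert(thread_id.to_string(), found.clone());
        Some(found)
    }
    ```

    The check keys on file existence and the stored id, not on the filename, so a valid row whose path lacks the thread UUID is still accepted.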
    
    ## Validation
    
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
    codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
    codex-tui -p codex-exec -p codex-cli --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout state_db_`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout try_init_ -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
    codex-rollout --lib -- -D warnings`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store
    read_thread_falls_back_when_sqlite_path_points_to_another_thread --
    --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    shell_snapshot`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all personality_migration`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find::find_prefers_sqlite_path_by_id --
    --nocapture`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    interrupt_accounts_active_goal_before_pausing`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server get_auth_status -- --test-threads=1`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server --lib`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
    -p codex-app-server --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
    -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
    codex-exec -p codex-cli`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
    codex-app-server`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
    codex-rollout`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
    - `just argument-comment-lint -p codex-core`
    - `just argument-comment-lint -p codex-rollout`
    
    Focused coverage added in `codex-rollout`:
    
    - `recorder::tests::state_db_init_backfills_before_returning` verifies
    the rollout metadata row exists before startup init returns.
    - `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
    verifies startup waits for another worker to finish backfill instead of
    disabling the handle for the process.
    -
    `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
    verifies startup does not hang indefinitely on a stuck backfill lease.
    -
    `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
    verifies DB-backed lookup accepts valid existing rollout paths even when
    the filename does not include the thread UUID.
    -
    `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
    verifies DB-backed lookup ignores a stale row whose existing path
    belongs to another thread and read-repairs the row after filesystem
    fallback.
    
    Focused coverage updated in `codex-core`:
    
    - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
    DB-preferred rollout file with matching `session_meta.id`, so it still
    verifies that valid SQLite paths win without depending on stale/empty
    rollout contents.
    
    `cargo test -p codex-app-server thread_list_respects_search_term_filter
    -- --test-threads=1 --nocapture` was attempted locally but timed out
    waiting for the app-server test harness `initialize` response before
    reaching the changed thread-list code path.
    
    `bazel test //codex-rs/thread-store:thread-store-unit-tests
    --test_output=errors` was attempted locally after the thread-store fix,
    but this container failed before target analysis while fetching `v8+`
    through BuildBuddy/direct GitHub. The equivalent local crate coverage,
    including `cargo test -p codex-thread-store`, passes.
    
    A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
    also requires system `libcap.pc` for `codex-linux-sandbox`; the
    follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
    this container.
  • feat: export and replay effective config locks (#20405)
    ## Why
    
    The goal is reproducibility. A hand-written `config.toml` is not enough to
    recreate what a Codex session actually ran with, because layered config,
    CLI overrides, defaults, feature aliases, resolved feature config,
    prompt setup, and model-catalog/session values can all affect the final
    runtime behavior.
    
    This PR adds an effective config lockfile path: one run can export the
    resolved session config, and a later run can replay that lockfile and
    fail early if the regenerated effective config drifts.
    
    ## What Changed
    
    - Add a dedicated `ConfigLockfileToml` wrapper with top-level lockfile
    metadata plus the replayable config:
    
      ```toml
      version = 1
      codex_version = "..."
    
      [config]
      # effective ConfigToml fields
      ```
    
    - Keep lockfile metadata out of regular `ConfigToml`; replay loads
    `ConfigLockfileToml` and then uses its nested `config` as the
    authoritative config layer.
    - Add `debug.config_lockfile.export_dir` to write
    `<thread_id>.config.lock.toml` when a root session starts.
    - Add `debug.config_lockfile.load_path` to replay a saved lockfile and
    validate the regenerated session lockfile against it.
    - Add `debug.config_lockfile.allow_codex_version_mismatch` to optionally
    tolerate Codex binary version drift while still comparing the rest of
    the lockfile.
    - Add `debug.config_lockfile.save_fields_resolved_from_model_catalog` so
    lock creation can either save model-catalog/session-resolved fields or
    intentionally leave those fields dynamic.
    - Build lockfiles from the effective config plus resolved runtime values
    such as model selection, reasoning settings, prompts, service tier, web
    search mode, feature states/config, memories config, skill instructions,
    and agent limits.
    - Materialize feature aliases and custom feature config into the
    lockfile so replay compares canonical resolved behavior instead of
    user-authored alias shape.
    - Strip profile/debug/file-include/environment-specific inputs from
    generated lockfiles so they contain replayable values rather than the
    inputs that produced those values.
    - Surface JSON-RPC server error code/data in app-server client and TUI
    bootstrap errors so config-lock replay failures include the actual TOML
    diff.
    - Regenerate the config schema for the new debug config keys.
    
    ## Review Notes
    
    The main flow is split across these files:
    
    - `config/src/config_toml.rs`: lockfile/debug TOML shapes.
    - `core/src/config/mod.rs`: loading `debug.config_lockfile.*`, replaying
    a lockfile as a config layer, and preserving the expected lockfile for
    validation.
    - `core/src/session/config_lock.rs`: exporting the current session
    lockfile and materializing resolved session/config values.
    - `core/src/config_lock.rs`: lockfile parsing, metadata/version checks,
    replay comparison, and diff formatting.
    
    ## Usage
    
    Export a lockfile from a normal session:
    
    ```sh
    codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"'
    ```
    
    Export a lockfile without saving model-catalog/session-resolved fields:
    
    ```sh
    codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"' \
      -c 'debug.config_lockfile.save_fields_resolved_from_model_catalog=false'
    ```
    
    Replay a saved lockfile in a later session:
    
    ```sh
    codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"'
    ```
    
    If replay resolves to a different effective config, startup fails with a
    TOML diff.
    
    To tolerate Codex binary version drift during replay:
    
    ```sh
    codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"' \
      -c 'debug.config_lockfile.allow_codex_version_mismatch=true'
    ```
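
    Conceptually, replay validation compares the saved lockfile with the regenerated one and reports any drift. The sketch below is a simplified assumption of that comparison: `Lockfile` flattens the config to string key/value pairs, and `replay_diff` is a hypothetical helper, not the logic in `core/src/config_lock.rs`.

    ```rust
    use std::collections::{BTreeMap, BTreeSet};

    /// Hypothetical flattened lockfile: metadata plus dotted-key config values.
    struct Lockfile {
        codex_version: String,
        config: BTreeMap<String, String>,
    }

    /// Compare a saved lockfile with the regenerated one and return a list of
    /// human-readable differences; an empty list means the replay matches.
    fn replay_diff(saved: &Lockfile, regenerated: &Lockfile, allow_version_mismatch: bool) -> Vec<String> {
        let mut diff = Vec::new();
        if !allow_version_mismatch && saved.codex_version != regenerated.codex_version {
            diff.push(format!(
                "codex_version: {} -> {}",
                saved.codex_version, regenerated.codex_version
            ));
        }
        // Visit the union of keys so added and removed entries are both reported.
        let keys: BTreeSet<&String> = saved.config.keys().chain(regenerated.config.keys()).collect();
        for key in keys {
            let (old, new) = (saved.config.get(key), regenerated.config.get(key));
            if old != new {
                diff.push(format!("{key}: {old:?} -> {new:?}"));
            }
        }
        diff
    }
    ```

    In this model, the version-mismatch flag only suppresses the metadata comparison; config drift is still reported, matching the description of `allow_codex_version_mismatch` above.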
    
    ## Limitations
    
    This does not support custom rules/network policies.
    
    ## Verification
    
    - `cargo test -p codex-core config_lock`
    - `cargo test -p codex-config`
    - `cargo test -p codex-thread-manager-sample`
  • Add environment provider snapshot (#20058)
    ## Summary
    - Change `EnvironmentProvider` to return concrete `Environment`
    instances instead of `EnvironmentConfigurations`.
    - Make `DefaultEnvironmentProvider` provide the provider-visible `local`
    environment plus optional `remote` environment from
    `CODEX_EXEC_SERVER_URL`.
    - Keep `EnvironmentManager` as the concrete cache while exposing its own
    explicit local environment for `local_environment()` fallback paths.
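
    The shape of the default provider's snapshot can be sketched as below. The `Environment` struct is a simplified assumption, `default_environments` is a hypothetical name, and treating a blank URL as "unset" is a choice made for this sketch, not confirmed behavior.

    ```rust
    /// Hypothetical, simplified environment description.
    #[derive(Debug, PartialEq)]
    struct Environment {
        id: &'static str,
        exec_server_url: Option<String>,
    }

    /// Always expose `local`; add `remote` only when a non-empty URL was
    /// provided (in the real provider it comes from `CODEX_EXEC_SERVER_URL`).
    fn default_environments(exec_server_url: Option<&str>) -> Vec<Environment> {
        let mut envs = vec![Environment { id: "local", exec_server_url: None }];
        if let Some(url) = exec_server_url.map(str::trim).filter(|url| !url.is_empty()) {
            envs.push(Environment { id: "remote", exec_server_url: Some(url.to_string()) });
        }
        envs
    }
    ```

    Passing the URL in as a parameter, rather than reading the environment variable inside the function, keeps the sketch easy to test.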
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Allow large remote app-server resume responses (#19920)
    ## Why
    
    Remote TUI resume uses the app-server websocket client. That client
    inherited tungstenite's default `16 MiB` frame limit, so a large saved
    session could make `thread/resume` return a single JSON-RPC response
    frame that the client rejected before the TUI could deserialize or
    render it.
    
    Fixes #19837
    
    ## What Changed
    
    - Configure the remote app-server websocket client with a bounded `128
    MiB` max frame/message size.
    - Preserve the concrete remote worker exit reason when completing
    pending requests after a transport/read failure instead of replacing it
    with a generic channel-closed error.
    - Add a regression test that sends a single `>16 MiB` JSON-RPC response
    frame and verifies the typed request succeeds.
    
    Note: This isn't a perfect fix. It really just moves the limit to a much
    larger value. I looked at a bunch of other potential fixes (both
    server-side and client-side), and they all involved significant
    complexity, had backward-compatibility impact, or impacted performance
    of common use cases. This simple fix should address the vast majority of
    remote use cases.
    
    ## Verification
    
    I reproduced the problem locally using a long rollout and verified that
    the fix addresses the connection drop.
  • [codex] Move config loading into codex-config (#19487)
    ## Why
    
    Config loading had become split across crates: `codex-config` owned the
    config types and merge logic, while `codex-core` still owned the loader
    that assembled the layer stack. This change consolidates that
    responsibility in `codex-config`, so the crate that defines config
    behavior also owns how configs are discovered and loaded.
    
    To make that move possible without reintroducing the old dependency
    cycle, the shell-environment policy types and helpers that
    `codex-exec-server` needs now live in `codex-protocol` instead of
    flowing through `codex-config`.
    
    This also makes the migrated loader tests more deterministic on machines
    that already have managed or system Codex config installed by letting
    tests override the system config and requirements paths instead of
    reading the host's `/etc/codex`.
    
    ## What Changed
    
    - moved the config loader implementation from `codex-core` into
    `codex-config::loader` and deleted the old `core::config_loader` module
    instead of leaving a compatibility shim
    - moved shell-environment policy types and helpers into
    `codex-protocol`, then updated `codex-exec-server` and other downstream
    crates to import them from their new home
    - updated downstream callers to use loader/config APIs from
    `codex-config`
    - added test-only loader overrides for system config and requirements
    paths so loader-focused tests do not depend on host-managed config state
    - cleaned up now-unused dependency entries and platform-specific cfgs
    that were surfaced by post-push CI
    
    ## Testing
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-core config_loader_tests::`
    - `cargo test -p codex-protocol -p codex-exec-server -p
    codex-cloud-requirements -p codex-rmcp-client --lib`
    - `cargo test --lib -p codex-app-server-client -p codex-exec`
    - `cargo test --no-run --lib -p codex-app-server`
    - `cargo test -p codex-linux-sandbox --lib`
    - `cargo shear`
    - `just bazel-lock-check`
    
    ## Notes
    
    - I did not chase unrelated full-suite failures outside the migrated
    loader surface.
    - `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
    failures on this machine, and Windows CI still shows unrelated
    long-running or timing-out test noise outside the loader migration itself.
  • Add remote thread config endpoint (#18908)
    ## Why
    
    App-server needs a way to fetch thread-scoped config from the remote
    thread config service when the user config opts into that behavior. This
    mirrors the existing experimental remote thread store endpoint while
    keeping local/noop behavior as the default.
    
    Startup paths also need to avoid silently dropping the remote config
    endpoint after the first config load. The stdio app-server path
    discovers the endpoint from the initial config and installs the real
    thread config loader for later config builds, while in-process clients
    used by TUI/exec now select the same remote loader directly from their
    provided config.
    
    ## What changed
    
    - Added `experimental_thread_config_endpoint` to `ConfigToml`, `Config`,
    and `core/config.schema.json`.
    - Added config parsing coverage for the new setting.
    - Updated app-server startup to select `RemoteThreadConfigLoader` from
    the initially loaded config, falling back to `NoopThreadConfigLoader`
    when unset.
    - Let `ConfigManager` replace its thread config loader after startup
    discovery so later config loads use the selected loader.
    - Updated in-process app-server client startup to pass
    `RemoteThreadConfigLoader` when its config has
    `experimental_thread_config_endpoint` set.
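    
    The selection rule above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `ThreadConfigLoaderChoice` enum, the reduced `Config` struct, and the `replace_thread_config_loader` method name are assumptions made for this example; only the loader names and the `experimental_thread_config_endpoint` setting come from the PR description.
    
    ```rust
    /// Hypothetical stand-in for the two loaders named in the PR.
    #[derive(Debug, Clone, PartialEq)]
    pub enum ThreadConfigLoaderChoice {
        Remote { endpoint: String },
        Noop,
    }
    
    /// Hypothetical slice of the parsed config; the real `Config` has many more fields.
    #[derive(Default)]
    pub struct Config {
        pub experimental_thread_config_endpoint: Option<String>,
    }
    
    /// Picks the remote loader when the endpoint is set, otherwise the no-op loader.
    pub fn select_thread_config_loader(config: &Config) -> ThreadConfigLoaderChoice {
        match config.experimental_thread_config_endpoint.as_deref() {
            Some(endpoint) => ThreadConfigLoaderChoice::Remote {
                endpoint: endpoint.to_string(),
            },
            None => ThreadConfigLoaderChoice::Noop,
        }
    }
    
    /// Hypothetical manager that starts with the no-op loader and lets startup
    /// discovery swap in the selected loader for later config loads.
    pub struct ConfigManager {
        loader: ThreadConfigLoaderChoice,
    }
    
    impl ConfigManager {
        pub fn new() -> Self {
            Self { loader: ThreadConfigLoaderChoice::Noop }
        }
    
        pub fn replace_thread_config_loader(&mut self, loader: ThreadConfigLoaderChoice) {
            self.loader = loader;
        }
    
        pub fn loader(&self) -> &ThreadConfigLoaderChoice {
            &self.loader
        }
    }
    ```
    
    In this sketch, startup would call `select_thread_config_loader` on the initially loaded config and pass the result to `replace_thread_config_loader`, so later loads do not silently fall back to the no-op loader.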
    
    ## Verification
    
    - Added `experimental_thread_config_endpoint_loads_from_config_toml`.
    - Added
    `runtime_start_args_use_remote_thread_config_loader_when_configured`.
    - Ran `cargo check -p codex-app-server --lib`.
    - Ran `cargo test -p codex-app-server-client`.
  • Move marketplace add/remove and startup sync out of core. (#19099)
    Move more things to core-plugins.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • TUI: Keep remote app-server events draining (#18932)
    Addresses #18860
    
    Problem: Remote app-server clients could stop draining websocket events
    when their bounded local event channel filled, leaving clients stuck on
    stale in-progress turns after a disconnect.
    
    Solution: Use an unbounded local event channel for the remote client so
    the websocket reader can keep forwarding disconnect and progress events
    instead of blocking or dropping them.
    
    Why this is reasonable: This does not make the remote websocket itself
    unbounded. The changed queue lives inside the remote client, between the
    task that reads the remote websocket and the API consumer in the same
    client process. Once an event has been received from the remote server,
    preserving it is preferable to blocking websocket reads or dropping
    disconnect/lifecycle events; network-level backpressure still happens at
    the websocket boundary if the remote side outpaces the client.
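    
    The difference between the two queue types can be sketched with standard-library channels. This is only an illustration under stated assumptions: the real client presumably uses async channels, and the `Event` type and function names here are hypothetical.
    
    ```rust
    use std::sync::mpsc::{channel, sync_channel, TrySendError};
    
    /// Hypothetical event type standing in for the remote client's events.
    #[derive(Debug, PartialEq)]
    pub enum Event {
        Progress(u32),
        Disconnected,
    }
    
    /// With a bounded queue of `capacity`, reports whether a final `Disconnected`
    /// event can still be enqueued after `backlog` unconsumed progress events.
    pub fn bounded_accepts_disconnect(capacity: usize, backlog: u32) -> bool {
        // `_rx` stays alive so sends fail only because the queue is full.
        let (tx, _rx) = sync_channel::<Event>(capacity);
        for i in 0..backlog {
            // A full queue rejects the event; a blocking send would stall the reader instead.
            let _ = tx.try_send(Event::Progress(i));
        }
        !matches!(tx.try_send(Event::Disconnected), Err(TrySendError::Full(_)))
    }
    
    /// With an unbounded queue, every received event is preserved in order.
    pub fn unbounded_drain(backlog: u32) -> Vec<Event> {
        let (tx, rx) = channel::<Event>();
        for i in 0..backlog {
            tx.send(Event::Progress(i)).expect("receiver is alive");
        }
        tx.send(Event::Disconnected).expect("receiver is alive");
        drop(tx); // close the channel so the iterator below terminates
        rx.iter().collect()
    }
    ```
    
    With a slow consumer, the bounded variant either drops or blocks on the disconnect event, while the unbounded variant always delivers it after the backlog.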
  • Fix remote app-server shutdown race (#18936)
    ## Why
    
    A Mac Bazel CI run saw `remote_notifications_arrive_over_websocket` fail
    during shutdown with `remote app-server shutdown channel is closed`
    (https://app.buildbuddy.io/invocation/9dac05d6-ae20-40f9-b627-fca6e91cf127).
    The remote websocket worker can legitimately finish while `shutdown()`
    is waiting for the shutdown acknowledgement: after the test server sends
    a notification and exits, the worker may deliver the required disconnect
    event, observe that the caller has dropped the event receiver, and exit
    before it sends the shutdown one-shot.
    
    That state is already terminal cleanup, not a failed shutdown, so
    callers should not see a `BrokenPipe` from the acknowledgement channel.
    
    ## What Changed
    
    - Treat a closed remote shutdown acknowledgement as an already-exited
    worker while still propagating websocket close errors when the worker
    returns them.
    - Added a deterministic regression test for the interleaving where the
    shutdown command is received and the worker exits before replying.
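    
    A minimal sketch of the acknowledgement handling described above, using a standard-library channel in place of the real one-shot. The `ShutdownError` type and function names are assumptions made for this example.
    
    ```rust
    use std::sync::mpsc::{channel, Receiver};
    
    /// Hypothetical error type for this sketch.
    #[derive(Debug, PartialEq)]
    pub enum ShutdownError {
        WebsocketClose(String),
    }
    
    /// Waits for the worker's shutdown acknowledgement.
    pub fn wait_for_shutdown_ack(
        ack: Receiver<Result<(), String>>,
    ) -> Result<(), ShutdownError> {
        match ack.recv() {
            // The worker acknowledged a clean shutdown.
            Ok(Ok(())) => Ok(()),
            // The worker reported a websocket close error: still propagate it.
            Ok(Err(message)) => Err(ShutdownError::WebsocketClose(message)),
            // The sender was dropped without a reply: the worker already exited,
            // which is terminal cleanup rather than a failed shutdown.
            Err(_) => Ok(()),
        }
    }
    
    /// Simulates the race: the worker exits before sending the acknowledgement.
    pub fn ack_channel_closed_without_reply() -> Result<(), ShutdownError> {
        let (tx, rx) = channel::<Result<(), String>>();
        drop(tx);
        wait_for_shutdown_ack(rx)
    }
    
    /// Simulates a worker that replies with a close error before exiting.
    pub fn worker_reports_close_error() -> Result<(), ShutdownError> {
        let (tx, rx) = channel::<Result<(), String>>();
        tx.send(Err("close failed".to_string())).expect("receiver is alive");
        wait_for_shutdown_ack(rx)
    }
    ```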
    
    ## Verification
    
    - `cargo test -p codex-app-server-client`
    - New test:
    `remote::tests::shutdown_tolerates_worker_exit_after_command_is_queued`
  • Support multiple managed environments (#18401)
    ## Summary
    - refactor EnvironmentManager to own keyed environments with
    default/local lookup helpers
    - keep remote exec-server client creation lazy until exec/fs use
    - preserve disabled agent environment access separately from internal
    local environment access
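    
    One way to picture keyed environments with lazy client creation is sketched below. Apart from the `EnvironmentManager` name, everything here (types, fields, methods, the `"local"` key) is hypothetical and only illustrates the idea; it does not model the disabled-agent access distinction.
    
    ```rust
    use std::collections::HashMap;
    use std::sync::OnceLock;
    
    /// Hypothetical remote exec-server client.
    pub struct RemoteClient {
        pub url: String,
    }
    
    /// Hypothetical environment: local ones have no remote URL.
    pub struct Environment {
        remote_url: Option<String>,
        client: OnceLock<RemoteClient>,
    }
    
    impl Environment {
        pub fn local() -> Self {
            Self { remote_url: None, client: OnceLock::new() }
        }
    
        pub fn remote(url: &str) -> Self {
            Self { remote_url: Some(url.to_string()), client: OnceLock::new() }
        }
    
        /// Creates the client on first exec/fs use and reuses it afterwards.
        pub fn client(&self) -> Option<&RemoteClient> {
            let url = self.remote_url.as_ref()?;
            Some(self.client.get_or_init(|| RemoteClient { url: url.clone() }))
        }
    
        pub fn client_created(&self) -> bool {
            self.client.get().is_some()
        }
    }
    
    pub struct EnvironmentManager {
        environments: HashMap<String, Environment>,
        default_key: String,
    }
    
    impl EnvironmentManager {
        /// Always registers a local environment under the hypothetical key "local".
        pub fn new(default_key: &str) -> Self {
            let mut environments = HashMap::new();
            environments.insert("local".to_string(), Environment::local());
            Self { environments, default_key: default_key.to_string() }
        }
    
        pub fn insert(&mut self, key: &str, environment: Environment) {
            self.environments.insert(key.to_string(), environment);
        }
    
        pub fn get(&self, key: &str) -> Option<&Environment> {
            self.environments.get(key)
        }
    
        pub fn default_environment(&self) -> Option<&Environment> {
            self.get(&self.default_key)
        }
    
        pub fn local(&self) -> Option<&Environment> {
            self.get("local")
        }
    }
    ```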
    
    ## Validation
    - not run (per Codex worktree instruction to avoid tests/builds unless
    requested)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add session config loader interface (#18208)
    ## Why
    
    Cloud-hosted sessions need a way for the service that starts or manages
    a thread to provide session-owned config without treating all config as
    if it came from the same user/project/workspace TOML stack.
    
    The important boundary is ownership: some values should be controlled by
    the session/orchestrator, some by the authenticated user, and later some
    may come from the executor. The earlier broad config-store shape made
    that boundary too fuzzy and overlapped heavily with the existing
    filesystem-backed config loader. This PR starts with the smaller piece
    we need now: a typed session config loader that can feed the existing
    config layer stack while preserving the normal precedence and merge
    behavior.
    
    ## What Changed
    
    - Added `ThreadConfigLoader` and related typed payloads in
    `codex-config`.
      - `SessionThreadConfig` currently supports `model_provider`,
        `model_providers`, and feature flags.
      - `UserThreadConfig` is present as an ownership boundary, but does not
        yet add TOML-backed fields.
      - `NoopThreadConfigLoader` preserves existing behavior when no external
        loader is configured.
      - `StaticThreadConfigLoader` supports tests and simple callers.
    
    - Taught thread config sources to produce ordinary `ConfigLayerEntry`
    values so the existing `ConfigLayerStack` remains the place where
    precedence and merging happen.
    
    - Wired the loader through `ConfigBuilder`, the config loader, and
    app-server startup paths so app-server can provide session-owned config
    before deriving a thread config.
    
    - Added coverage for:
      - translating typed thread config into config layers,
      - inserting thread config layers into the stack at the right precedence,
      - applying session-provided model provider and feature settings when
        app-server derives config from thread params.
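    
    The loader-to-layer flow can be sketched roughly as follows. The loader names come from the PR, but the trait signature, the flat `Layer` map, and the helper functions are simplifying assumptions: the real trait is likely async and typed, and the real stack uses `ConfigLayerEntry` and `ConfigLayerStack`.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Hypothetical, simplified config layer: a flat key/value map.
    pub type Layer = BTreeMap<String, String>;
    
    /// Hypothetical shape of the loader interface.
    pub trait ThreadConfigLoader {
        fn load(&self) -> Vec<Layer>;
    }
    
    /// Contributes no layers, so existing behavior is preserved.
    pub struct NoopThreadConfigLoader;
    
    impl ThreadConfigLoader for NoopThreadConfigLoader {
        fn load(&self) -> Vec<Layer> {
            Vec::new()
        }
    }
    
    /// Returns a fixed set of layers, which is convenient in tests.
    pub struct StaticThreadConfigLoader {
        pub layers: Vec<Layer>,
    }
    
    impl ThreadConfigLoader for StaticThreadConfigLoader {
        fn load(&self) -> Vec<Layer> {
            self.layers.clone()
        }
    }
    
    /// Merges layers from lowest to highest precedence; later layers override earlier keys.
    pub fn merge_layers(layers: &[Layer]) -> Layer {
        let mut merged = Layer::new();
        for layer in layers {
            for (key, value) in layer {
                merged.insert(key.clone(), value.clone());
            }
        }
        merged
    }
    
    /// Stacks session-owned layers above the request-provided layer, then merges.
    pub fn effective_config(request: Layer, loader: &dyn ThreadConfigLoader) -> Layer {
        let mut stack = vec![request];
        stack.extend(loader.load());
        merge_layers(&stack)
    }
    
    /// Small helper for building a layer from string pairs.
    pub fn layer(pairs: &[(&str, &str)]) -> Layer {
        pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
    }
    ```
    
    Keeping the merge in one place mirrors the design choice described above: loaders only produce layers, and precedence is decided by the stack.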
    
    ## Follow-Ups
    
    This intentionally stops short of adding the remote/service transport.
    The next pieces are expected to be:
    
    1. Define the proto/API shape for this interface.
    2. Add a client implementation that can source session config from the
    service side.
    
    ## Verification
    
    - Added unit coverage in `codex-config` for the loader and layer
    conversion.
    - Added `codex-core` config loader coverage for thread config layer
    precedence.
    - Added app-server coverage that verifies session thread config wins
    over request-provided config for model provider and feature settings.
  • Remove simple TUI legacy_core reexports (#18631)
    ## Problem
    The TUI still imported path utilities and config-loader symbols through
    app-server-client's legacy_core facade even though those APIs already
    exist in utility/config crates. This is part of our ongoing effort to
    whittle away at these old dependencies.
    
    ## Solution
    Rewire imports to avoid the TUI directly importing from the core crate
    and instead import from common lower-level crates. This PR doesn't
    include any functional changes; it's just a simple rewiring.
  • TUI: remove simple legacy_core re-exports (#18605)
    ## Summary
    
    The TUI still imported several symbols through the transitional
    app-server-client `legacy_core` facade even though those symbols are
    already owned by smaller crates. This PR narrows that facade by rewiring
    those imports directly to their owner crates.
    
    ## Changes
    
    No functional changes, just import rewiring. This is part of our ongoing
    effort to whittle away at the `legacy_core` namespace, which represents
    all of the remaining symbols that the TUI imports from the core.
  • Refactor AGENTS.md discovery into AgentsMdManager (#18035)
    Encapsulate AGENTS.md processing in `AgentsMdManager` and drop
    `user_instructions_path` from config.
  • Async config loading (#18022)
    Parts of the config will come from the executor. Prepare for that by
    making the config loading methods async.
  • Run exec-server fs operations through sandbox helper (#17294)
    ## Summary
    - run exec-server filesystem RPCs requiring sandboxing through a
    `codex-fs` arg0 helper over stdin/stdout
    - keep direct local filesystem execution for `DangerFullAccess` and
    external sandbox policies
    - remove the standalone exec-server binary path in favor of top-level
    arg0 dispatch/runtime paths
    - add sandbox escape regression coverage for local and remote filesystem
    paths
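    
    The routing decision in the first two bullets can be sketched as a simple match. Only `DangerFullAccess` is named in the PR; the other policy variants, the `FsRoute` enum, and the function name are hypothetical placeholders for this illustration.
    
    ```rust
    /// Sandbox policies; every variant except `DangerFullAccess` is a hypothetical name.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum SandboxPolicy {
        DangerFullAccess,
        ExternalSandbox,
        ReadOnly,
        WorkspaceWrite,
    }
    
    /// Where a filesystem RPC is executed.
    #[derive(Debug, PartialEq)]
    pub enum FsRoute {
        /// Run directly in the exec-server process.
        Direct,
        /// Run through the sandboxed helper process over stdin/stdout.
        SandboxHelper,
    }
    
    /// Chooses the execution route for a filesystem operation.
    pub fn route_fs_operation(policy: SandboxPolicy) -> FsRoute {
        match policy {
            // No sandbox is applied by Codex itself, so direct execution is kept.
            SandboxPolicy::DangerFullAccess | SandboxPolicy::ExternalSandbox => FsRoute::Direct,
            // Everything that requires sandboxing goes through the helper.
            SandboxPolicy::ReadOnly | SandboxPolicy::WorkspaceWrite => FsRoute::SandboxHelper,
        }
    }
    ```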
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - remote devbox: `cd codex-rs && bazel test --bes_backend=
    --bes_results_url= //codex-rs/exec-server:all` (6/6 passed)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • TUI: enforce core boundary (#17399)
    Problem: The TUI still depended on `codex-core` directly in a number of
    places, and we had no enforcement to keep this problem from getting
    worse.
    
    Solution: Route TUI core access through
    `codex-app-server-client::legacy_core`, add CI enforcement for that
    boundary, and re-export this legacy bridge inside the TUI as
    `crate::legacy_core` so the remaining call sites stay readable. There is
    no functional change in this PR — just changes to import targets.
    
    Over time, we can whittle away at the remaining symbols in this legacy
    namespace with the eventual goal of removing them all. In the meantime,
    this linter rule will prevent us from inadvertently importing new
    symbols from core.
  • Revert "Option to Notify Workspace Owner When Usage Limit is Reached" (#17391)
    Reverts openai/codex#16969
    
    #sev3-2026-04-10-accountscheckversion-500s-for-openai-workspace-7300
  • Option to Notify Workspace Owner When Usage Limit is Reached (#16969)
    ## Summary
    - Replace the manual `/notify-owner` flow with an inline confirmation
    prompt when a usage-based workspace member hits a credits-depleted
    limit.
    - Fetch the current workspace role from the live ChatGPT
    `accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches
    the desktop and web clients.
    - Keep owner, member, and spend-cap messaging distinct so we only offer
    the owner nudge when the workspace is actually out of credits.
    
    ## What Changed
    - `backend-client`
      - Added a typed fetch for the current account role from
        `accounts/check`.
      - Mapped backend role values into a Rust workspace-role enum.
    - `app-server` and protocol
      - Added `workspaceRole` to `account/read` and `account/updated`.
      - Derived `isWorkspaceOwner` from the live role, with a fallback to the
        cached token claim when the role fetch is unavailable.
    - `tui`
      - Removed the explicit `/notify-owner` slash command.
      - When a member is blocked because the workspace is out of credits, the
        error now prompts:
        - `Your workspace is out of credits. Request more from your workspace owner? [y/N]`
      - Choosing `y` sends the existing owner-notification request.
      - Choosing `n`, pressing `Esc`, or accepting the default selection
        dismisses the prompt without sending anything.
      - Selection popups now honor explicit item shortcuts, which is how the
        `y` / `n` interaction is wired.
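    
    The role fallback and the prompt outcome can be sketched as below. The enum variants, the backend role strings, and the function names are assumptions made for this example; the PR does not state the actual backend values.
    
    ```rust
    /// Hypothetical workspace-role enum.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum WorkspaceRole {
        Owner,
        Admin,
        Member,
    }
    
    /// Maps a backend role string to the enum; the string values are assumed.
    pub fn parse_role(raw: &str) -> Option<WorkspaceRole> {
        match raw {
            "owner" => Some(WorkspaceRole::Owner),
            "admin" => Some(WorkspaceRole::Admin),
            "member" => Some(WorkspaceRole::Member),
            _ => None,
        }
    }
    
    /// Prefers the live role; falls back to the cached token claim when the
    /// best-effort role fetch is unavailable.
    pub fn is_workspace_owner(live_role: Option<WorkspaceRole>, cached_claim: bool) -> bool {
        match live_role {
            Some(role) => role == WorkspaceRole::Owner,
            None => cached_claim,
        }
    }
    
    /// Only an explicit `y` sends the owner notification. `None` models `Esc`
    /// or accepting the default selection; any other key dismisses the prompt.
    pub fn should_send_nudge(key: Option<char>) -> bool {
        matches!(key, Some('y'))
    }
    ```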
    
    ## Reviewer Notes
    - The main behavior change is scoped to usage-based workspace members
    whose workspace credits are depleted.
    - Spend-cap reached should not show the owner-notification prompt.
    - Owners and admins should continue to see `/usage` guidance instead of
    the member prompt.
    - The live role fetch is best-effort; if it fails, we fall back to the
    existing token-derived ownership signal.
    
    ## Testing
    - Manual verification
      - Workspace owner does not see the member prompt.
      - Workspace member with depleted credits sees the confirmation prompt
        and can send the nudge with `y`.
      - Workspace member with spend cap reached does not see the
        owner-notification prompt.
    
    ### Workspace member out of usage
    
    https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1
    
    ### Workspace owner
    <img width="1728" height="1086" alt="Screenshot 2026-04-09 at 11 48 22 AM" src="https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6" />
  • Install rustls provider for remote websocket client (#17288)
    Addresses #17283
    
    Problem: `codex --remote wss://...` could panic because
    app-server-client did not install rustls' process-level crypto provider
    before opening TLS websocket connections.
    
    Solution: Add the existing rustls provider utility dependency and
    install it before the remote websocket connect.
  • [codex] Support remote exec cwd in TUI startup (#17142)
    When running with a remote executor, the cwd is a path on the remote
    machine. Today we check that the directory exists locally on startup and
    attempt to load config from it.
    
    For remote executors, skip that check and the local config load.
  • [codex-analytics] add protocol-native turn timestamps (#16638)
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
    * #16870
    * #16706
    * #16659
    * #16641
    * #16640
    * __->__ #16638
  • chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
    ## Why
    
    `argument-comment-lint` was green in CI even though the repo still had
    many uncommented literal arguments. The main gap was target coverage:
    the repo wrapper did not force Cargo to inspect test-only call sites, so
    examples like the `latest_session_lookup_params(true, ...)` tests in
    `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.
    
    This change cleans up the existing backlog, makes the default repo lint
    path cover all Cargo targets, and starts rolling that stricter CI
    enforcement out on the platform where it is currently validated.
    
    ## What changed
    
    - mechanically fixed existing `argument-comment-lint` violations across
    the `codex-rs` workspace, including tests, examples, and benches
    - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
    `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
    `--all-targets` unless the caller explicitly narrows the target set
    - fixed both wrappers so forwarded cargo arguments after `--` are
    preserved with a single separator
    - documented the new default behavior in
    `tools/argument-comment-lint/README.md`
    - updated `rust-ci` so the macOS lint lane keeps the plain wrapper
    invocation and therefore enforces `--all-targets`, while Linux and
    Windows temporarily pass `-- --lib --bins`
    
    That temporary CI split keeps the stricter all-targets check where it is
    already cleaned up, while leaving room to finish the remaining Linux-
    and Windows-specific target-gated cleanup before enabling
    `--all-targets` on those runners. The Linux and Windows failures on the
    intermediate revision were caused by the wrapper forwarding bug, not by
    additional lint findings in those lanes.
    
    ## Validation
    
    - `bash -n tools/argument-comment-lint/run.sh`
    - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
    - shell-level wrapper forwarding check for `-- --lib --bins`
    - shell-level wrapper forwarding check for `-- --tests`
    - `just argument-comment-lint`
    - `cargo test` in `tools/argument-comment-lint`
    - `cargo test -p codex-terminal-detection`
    
    ## Follow-up
    
    - Clean up remaining Linux-only target-gated callsites, then switch the
    Linux lint lane back to the plain wrapper invocation.
    - Clean up remaining Windows-only target-gated callsites, then switch
    the Windows lint lane back to the plain wrapper invocation.
  • Remove the legacy TUI split (#15922)
    This is part 1 of 2 PRs that will delete the `tui` /
    `tui_app_server` split. This part simply deletes the existing `tui`
    directory and marks the `tui_app_server` feature flag as removed. I left
    the `tui_app_server` feature flag in place for now so its presence
    doesn't result in an error. It is simply ignored.
    
    Part 2 will rename the `tui_app_server` directory to `tui`. I did this as
    two parts to reduce visible code churn.