Commit Graph

446 Commits

  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • Route AGENTS.md loading through environment filesystems (#26205)
    ## Why
    
    Workspace-specific `AGENTS.md` loading needs to use the selected
    environment filesystem so remote workspaces and child agents read
    instructions from their actual environment instead of the host
    filesystem. The app-server should report the same instruction sources
    the initialized thread actually loaded, rather than independently
    rescanning configuration and filesystem state.
    
    ## What changed
    
    - Introduce `LoadedAgentsMd` to retain ordered user, project, and
    internal instructions with their provenance.
    - Load and canonicalize workspace `AGENTS.md` paths through the primary
    `EnvironmentManager` environment, then render the loaded instructions
    when constructing turn context.
    - Expose cached loaded instruction sources from initialized threads and
    use them for app-server start, resume, and fork responses.
    - Preserve global `CODEX_HOME` loading and separator behavior while
    excluding empty project files that did not supply model-visible
    instructions.
    - Add integration coverage for CLI injection, selected-environment
    provenance and rendering, empty environment selection, and cached
    sources on loaded-thread resume.
    
    ## Validation
    
    - `just test -p codex-core agents_md`
    - `just test -p codex-core
    selected_environment_sources_match_model_visible_instructions`
    - `just test -p codex-exec agents_md`
    - `just test -p codex-app-server instruction_sources`
    - `just test -p codex-app-server --status-level fail`
  • Switch runtime to cloud config bundle (#24622)
    ## Summary
    
    - Adapts the moved `codex-cloud-config` crate from the legacy cloud
    requirements endpoint to the new config bundle endpoint.
    - Switches runtime consumers from `CloudRequirementsLoader` to
    `CloudConfigBundleLoader` so one shared bundle supplies cloud-delivered
    config and requirements.
    - Removes the legacy cloud requirements domain loader path.
    
    ## Details
    
    This intentionally keeps `codex-cloud-config` monolithic for review
    lineage: the previous PR establishes the crate move, and this PR shows
    the behavior change against that moved implementation. A follow-up PR
    splits the module back into focused files.
    
    The new bundle path preserves the important cloud requirements loader
    semantics where intended: account-scoped signed cache, 30 minute TTL, 5
    minute refresh cadence, retry/backoff, auth recovery, and fail-closed
    startup loading. The cached payload changes from a single requirements
    TOML string to the backend-delivered bundle, and validation rejects
    malformed config or requirements fragments before cache write/use.
  • Move cloud requirements crate to cloud config (#24621)
    ## Summary
    
    - Moves the existing `codex-cloud-requirements` crate to
    `codex-cloud-config`.
    - Updates workspace dependencies and imports to the new crate name.
    - Intentionally keeps runtime behavior unchanged: this still fetches the
    legacy cloud requirements endpoint.
    
    ## Details
    
    This PR exists to make the lineage obvious before the bundle migration.
    GitHub should show the old `codex-rs/cloud-requirements/src/lib.rs`
    implementation as moved to `codex-rs/cloud-config/src/lib.rs`, rather
    than as unrelated new code.
    
    The follow-up PR adapts this moved crate to the new config bundle API
    and switches runtime consumers over.
  • Preserve auto-review approval policy in codex exec (#23763)
    ## Why
    
    `codex exec` was forcing headless runs to `approval_policy = "never"`
    even when the resolved reviewer was `auto_review`. That prevented
    unattended exec workflows from reaching the reviewed MCP write path they
    were configured to use.
    
    ## What changed
    
    - Keep the existing headless `never` default for ordinary exec runs.
    - Re-resolve exec config without that synthetic override when the final
    reviewer resolves to `AutoReview`, so configured or requirements-driven
    approval policy is preserved.
    - Add regression coverage for:
      - `auto_review` plus `on-request` from user config
    - requirements-driven `AutoReview`, asserting exec’s final approval
    policy matches the no-override control config exactly
    
    ## Validation
    
    - `just fmt`
    - `cargo test -p codex-exec`
  • store and expose parent_thread_id on Threads (#25113)
    ## Why
    
    This PR
    https://github.com/openai/codex/pull/24161#discussion_r3325692763
    revealed a subagent data modeling issue, where we overloaded
    `forked_from_id` to also mean `parent_thread_id`. That's incorrect since
    guardian and review subagents can be a subagent and NOT fork the main
    thread's history.
    
    The solution here is to explicitly store a new `parent_thread_id` on
    `SessionMeta`, alongside `forked_from_id` which already exists. While
    we're at it, also expose it in the app-server protocol on the `Thread`
    object.
    
    A thread->subagent relationship and a fork of thread history are
    orthogonal concepts.
    
    ## What Changed
    
    - Added top-level `parent_thread_id` persistence on `SessionMeta` and
    runtime/session plumbing through `SessionConfiguredEvent`,
    `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`,
    `TurnContext`, and `ModelClient`.
    - Made turn metadata, request headers, analytics, and subagent-start
    events read the separate runtime/top-level parent field instead of
    deriving general parent lineage from `SessionSource` or
    `forked_from_thread_id`.
    - Passed parent lineage separately at delegated subagent, review,
    guardian, agent-job, and multi-agent spawn construction sites;
    copied-history fork lineage remains derived only from `InitialHistory`.
    - Persisted and exposed parent lineage through rollout/thread-store
    projections and app-server v2 `Thread.parentThreadId`.
    - Updated app-server README text and regenerated app-server schema
    fixtures for the additive `parentThreadId` response field.
  • windows-sandbox: pass workspace roots to runner (#24108)
    ## Why
    
    #23813 switches the Windows sandbox runner path to `PermissionProfile`,
    but it still left one runtime anchor for resolving symbolic
    `:workspace_roots` entries. That is not enough once a turn has multiple
    effective workspace roots: exact entries and deny globs under
    `:workspace_roots` need to be materialized for every runtime root before
    the command runner chooses token mode or builds ACL plans.
    
    ## What Changed
    
    - Replaces the Windows runner/setup `permission_profile_cwd` plumbing
    with `workspace_roots: Vec<AbsolutePathBuf>`.
    - Resolves Windows-local `PermissionProfile` data with
    `materialize_project_roots_with_workspace_roots(...)` instead of the
    single-cwd helper.
    - Threads `Config::effective_workspace_roots()` through core execution,
    unified exec, TUI setup/read-grant flows, app-server setup, app-server
    `command/exec`, and `debug sandbox` on Windows.
    - Preserves those workspace roots through the zsh-fork escalation
    executor instead of rebuilding them from `sandbox_policy_cwd`.
    - Makes `ExecRequest::new(...)` and the remaining
    `build_exec_request(...)` helper path take
    `windows_sandbox_workspace_roots` explicitly so new call sites cannot
    silently fall back to `vec![cwd]`.
    - Clarifies the `debug sandbox` non-Windows comment: remaining
    cwd-dependent resolution still uses `sandbox_policy_cwd`, while
    `:workspace_roots` entries are already materialized from config roots.
    - Updates elevated runner IPC `SpawnRequest` to send `workspace_roots`
    and bumps the framed IPC protocol version to `3` for the payload shape
    change.
    - Adds Windows-local resolver coverage for expanding exact and glob
    `:workspace_roots` entries across multiple roots, plus core helper
    coverage proving explicit roots are preserved.
    
    ## Verification
    
    - `cargo check -p codex-windows-sandbox -p codex-core -p codex-tui -p
    codex-cli -p codex-app-server`
    - `cargo test -p codex-windows-sandbox`
    - `cargo test -p codex-core windows_sandbox`
    - `cargo test -p codex-core unix_escalation`
    - `cargo test -p codex-app-server windows_sandbox`
    - `cargo test -p codex-tui windows_sandbox`
    - `cargo test -p codex-cli debug_sandbox`
    - `just test -p codex-core unified_exec`
    - `just test -p codex-core
    build_exec_request_preserves_windows_workspace_roots`
    - `env -u CODEX_NETWORK_PROXY_ACTIVE -u
    CODEX_NETWORK_ALLOW_LOCAL_BINDING just test -p codex-app-server --lib
    command_exec`
    - `just test -p codex-windows-sandbox`
    - `just test -p codex-exec sandbox`
    - `just fix -p codex-core -p codex-app-server -p codex-windows-sandbox`
    
    A local macOS cross-check with `cargo check --target
    x86_64-pc-windows-msvc ...` did not reach crate Rust code because native
    dependencies require Windows SDK headers (`windows.h` / `assert.h`) in
    this environment; Windows CI remains the real target validation.
    
    Two local targeted filters compile but do not run assertions on macOS:
    `env -u CODEX_NETWORK_PROXY_ACTIVE -u CODEX_NETWORK_ALLOW_LOCAL_BINDING
    just test -p codex-app-server --lib command_exec_processor` matched zero
    tests, and `just test -p codex-linux-sandbox landlock` matched zero
    tests because the landlock suite is Linux-only.
  • [codex] Add user input client ids (#24653)
    ## Summary
    
    Adds an optional `clientId` field to app-server v2 `UserInput` and
    carries it through the core `UserInput` model so clients can correlate
    echoed user input items without relying on payload equality.
    
    ## Details
    
    - Adds `client_id: Option<String>` to core `UserInput` variants.
    - Exposes the v2 app-server field as `clientId` on the wire and in
    generated TypeScript.
    - Preserves the id when converting between app-server v2 and core
    protocol types.
    - Regenerates app-server schema fixtures.
    
    ## Validation
    
    - `just fmt`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-protocol`
    - `git diff --check`
  • Add experimental turn additional context (#24154)
    ## Summary
    
    Adds experimental `additionalContext` support to `turn/start` and
    `turn/steer` so clients can provide ephemeral external context, such as
    browser or automation state, without turning that plumbing into a
    visible user prompt or triggering user-prompt lifecycle behavior.
    
    ## API Shape
    
    The parameter shape is:
    
    ```ts
    additionalContext?: Record<string, {
      value: string
      kind: "untrusted" | "application"
    }> | null
    ```
    
    Example:
    
    ```json
    {
      "additionalContext": {
        "browser_info": {
          "value": "Active tab is CI failures.",
          "kind": "untrusted"
        },
        "automation_info": {
          "value": "CI rerun is in progress.",
          "kind": "application"
        }
      }
    }
    ```
    
    The keys are opaque and caller-defined.
    
    ## Context Injection
    
    When provided, accepted entries are inserted into model context as
    hidden contextual message items, not as visible thread user-message
    items.
    
    `kind: "untrusted"` entries are inserted with role `user`:
    
    ```text
    <external_${key}>${value}</external_${key}>
    ```
    
    `kind: "application"` entries are inserted with role `developer`:
    
    ```text
    <${key}>${value}</${key}>
    ```
    
    Values are not escaped. Each value is truncated to 1k approximate tokens
    before wrapping.
    
    For `turn/start`, accepted additional context is inserted before normal
    user input. For `turn/steer`, additional context is merged only when the
    steer includes non-empty user input; context-only steers still reject as
    empty input.
    
    ## Dedupe Strategy
    
    `AdditionalContextStore` lives on session state and stores the latest
    complete additional-context map.
    
    Each `turn/start` or non-empty `turn/steer` treats its
    `additionalContext` as the current complete set of values. Entries are
    injected only when the key is new or the exact entry for that key
    changed, including `value` or `kind`. After merging, the store is
    replaced with the provided map, so omitted keys are removed from the
    retained set and can be injected again later if reintroduced.
    
    Omitting `additionalContext`, passing `null`, or passing an empty object
    resets the store to empty and injects nothing.
    
    ## What Changed
    
    - Threads experimental v2 `additionalContext` through app-server into
    core turn start and steer handling.
    - Adds separate contextual fragment types for untrusted user-role
    context and application developer-role context.
    - Uses pending response input items so additional context can be
    combined with normal user input without treating it as prompt text.
    - Adds integration coverage for start/steer flow, role routing,
    dedupe/reset behavior, deletion/re-add behavior, hook-blocked input
    behavior, empty context-only steer rejection, external-fragment marker
    matching, and truncation.
  • package: include zsh fork in Codex package (#23756)
    ## Why
    
    The package layout gives Codex a stable place for runtime helpers that
    should travel with the entrypoint. `shell_zsh_fork` still required users
    to configure `zsh_path` manually, even though we already publish
    prebuilt zsh fork artifacts.
    
    This PR builds on #24129 and uses the shared DotSlash artifact fetcher
    to include the zsh fork in Codex packages when a matching target
    artifact exists. Packaged Codex builds can then discover the bundled
    fork automatically; the user/profile `zsh_path` override is removed so
    the feature uses the package-managed artifact instead of a legacy path
    knob.
    
    ## What Changed
    
    - Added `scripts/codex_package/codex-zsh`, a checked-in DotSlash
    manifest for the current macOS arm64 and Linux zsh fork artifacts.
    - Taught `scripts/build_codex_package.py` to fetch the matching zsh fork
    artifact and install it at `codex-resources/zsh/bin/zsh` when available
    for the selected target.
    - Added package layout validation for the optional bundled zsh resource.
    - Added `InstallContext::bundled_zsh_path()` and
    `InstallContext::bundled_zsh_bin_dir()` for package-layout resource
    discovery.
    - Threaded the packaged zsh path through config loading as the runtime
    `zsh_path` for packaged installs, and removed the config/profile/CLI
    override path.
    - Kept the packaged default zsh override typed as `AbsolutePathBuf`
    until the existing runtime `Config::zsh_path` boundary.
    - Updated app-server zsh-fork integration tests to spawn
    `codex-app-server` from a temporary package layout with
    `codex-resources/zsh/bin/zsh`, matching the new packaged discovery path
    instead of setting `zsh_path` in config.
    - Switched package executable copying from metadata-preserving `copy2()`
    to `copyfile()` plus explicit executable bits, which avoids macOS
    file-flag failures when local smoke tests use system binaries as inputs.
    
    ## Testing
    
    To verify that the `zsh` executable from the Codex package is picked up
    correctly, first I ran:
    
    ```shell
    ./scripts/build_codex_package.py
    ```
    
    which created:
    
    ```
    /private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/
    ```
    
    so then I ran:
    
    ```
    /private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/bin/codex exec --enable shell_zsh_fork 'run `echo $0`'
    ```
    
    which reported the following, as expected:
    
    ```
    /private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/codex-resources/zsh/bin/zsh
    ```
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23756).
    * #23768
    * __->__ #23756
  • config: remove legacy profile write paths (#24055)
    ## Why
    
    [#23883](https://github.com/openai/codex/pull/23883) moved the
    user-facing `--profile` flag onto profile v2 and
    [#23886](https://github.com/openai/codex/pull/23886) removed CLI
    forwarding for the legacy profile-v1 path. Core and TUI config
    persistence still carried `active_profile` and
    `ConfigEditsBuilder::with_profile`, which let later writes continue
    targeting legacy `[profiles.<name>]` tables after profile selection
    moved to profile-v2 config files.
    
    ## What
    
    - Remove legacy profile routing from
    [`ConfigEditsBuilder`](https://github.com/openai/codex/blob/4b38e9c22e762261d7f7eef49d8a21792e241a06/codex-rs/core/src/config/edit.rs#L1064-L1294),
    so core config edits no longer carry `with_profile` or infer
    `[profiles.*]` write targets from a `profile` key.
    - Drop `active_profile` plumbing from runtime `Config`, TUI
    startup/state, app-server config override forwarding, and Windows
    sandbox setup persistence.
    - Make app-server-backed TUI config edits use unscoped model,
    service-tier, feature, Auto-review, plan-mode, and Windows sandbox paths
    through
    [`tui/src/config_update.rs`](https://github.com/openai/codex/blob/4b38e9c22e762261d7f7eef49d8a21792e241a06/codex-rs/tui/src/config_update.rs#L43-L112).
    - Update config edit coverage so legacy `profile` state stays untouched
    by direct model writes, and remove tests whose only contract was the
    deleted profile-scoped persistence path.
    
    ## Testing
    
    - Not run locally.
  • config: remove legacy profile v1 resolution (#24051)
    ## Why
    
    [#23883](https://github.com/openai/codex/pull/23883) moved user-facing
    `--profile` selection onto profile v2, and
    [#23886](https://github.com/openai/codex/pull/23886) removed the old CLI
    `config_profile` override path. Core still had a second legacy path:
    `profile = "..."` could select `[profiles.*]` values while runtime
    config was built. Keeping that resolver alive preserves the old
    precedence model and profile-carrying surfaces even though profile
    selection now points at `$CODEX_HOME/<name>.config.toml`.
    
    ## What
    
    - Reject legacy top-level `profile = "..."` config while loading runtime
    config, with an error that points callers at `--profile <name>` and
    `<name>.config.toml` in the [core load
    path](https://github.com/openai/codex/blob/3d923366eca10a29143623124c6c6e538f058269/codex-rs/core/src/config/mod.rs#L2524-L2531).
    - Remove the remaining profile-v1 merge points from runtime config
    resolution, including features, permissions, model/provider selection,
    web search, Windows sandbox settings, TUI settings, role reloads, and
    OSS provider lookup.
    - Drop the leftover profile override surface from
    [`ConfigOverrides`](https://github.com/openai/codex/blob/3d923366eca10a29143623124c6c6e538f058269/codex-rs/core/src/config/mod.rs#L2118-L2148)
    and from the MCP server `codex` tool schema.
    - Prune profile-precedence tests that only exercised the removed
    resolver and replace them with rejection coverage for the legacy
    selector.
    
    ## Testing
    
    - Not run in this metadata pass.
    - Added
    [`legacy_profile_selection_is_rejected`](https://github.com/openai/codex/blob/3d923366eca10a29143623124c6c6e538f058269/codex-rs/core/src/config/config_tests.rs#L7942-L7965)
    coverage for the new runtime guard.
  • cli: remove legacy profile v1 plumbing (#23886)
    ## Why
    
    [#23883](https://github.com/openai/codex/pull/23883) moved the
    user-facing `--profile` flag onto profile v2. The shared CLI option
    layer still carried the old `config_profile` slot and several CLI
    entrypoints still copied that value into legacy config overrides.
    Leaving that path around makes the CLI surface look like it still
    selects legacy `[profiles.*]` state even though `--profile` now means
    `$CODEX_HOME/<name>.config.toml`.
    
    ## What
    
    - Remove the legacy `config_profile` field and merge/copy path from
    [`SharedCliOptions`](https://github.com/openai/codex/blob/95baaf72920c8db22097df8d15a0bb76c84528b6/codex-rs/utils/cli/src/shared_options.rs#L8-L177).
    - Stop forwarding profile-v1 overrides from CLI, exec, TUI, doctor,
    debug, feature, and exec-server paths; runtime profile selection remains
    on `config_profile_v2` through
    [`loader_overrides_for_profile`](https://github.com/openai/codex/blob/95baaf72920c8db22097df8d15a0bb76c84528b6/codex-rs/cli/src/main.rs#L1606-L1619).
    - Resolve local OSS provider selection from the base config in exec and
    TUI now that the legacy profile argument is gone.
    
    ## Testing
    
    - Not run (cleanup-only follow-up to #23883).
  • [codex] Add plugin id to MCP tool call items (#23737)
    Add owning plugin id to MCP tool call items so we can better filter them
    at plugin level.
    
    ## Summary
    - add optional `plugin_id` to MCP tool-call items and legacy begin/end
    events
    - propagate plugin metadata into emitted core items and app-server v2
    `ThreadItem::McpToolCall`
    - preserve plugin ids through app-server replay/redaction paths and
    regenerate v2 schema fixtures
    
    ## Testing
    - `just write-app-server-schema`
    - `just fmt`
    - `just fix -p codex-core`
    - `cargo test -p codex-protocol -p codex-app-server-protocol`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core mcp_tool_call_item_includes_plugin_id --lib`
    - `cargo check -p codex-tui --tests`
    - `cargo check -p codex-app-server --tests`
    - `git diff --check`
    
    ## Notes
    - `just fix -p codex-core` completed with two non-fatal
    `too_many_arguments` warnings on the touched MCP notification helpers.
    - A broader `cargo test -p codex-core` run passed core unit tests, then
    hit shell/sandbox/snapshot failures in the integration target.
    - A broader app-server downstream run hit the existing
    `in_process::tests::in_process_start_clamps_zero_channel_capacity` stack
    overflow; `cargo test -p codex-exec` also hit the existing sandbox
    expectation mismatch in
    `thread_lifecycle_params_include_legacy_sandbox_when_no_active_profile`.
  • Make local environment optional in EnvironmentManager (#23369)
    ## Summary
    - make `EnvironmentManager` local environment/runtime paths optional
    - simplify constructor surface around snapshot materialization
    - rename local env accessors to `require_local_environment` /
    `try_local_environment`
    
    ## Validation
    - devbox Bazel build for touched crate surfaces
    - `//codex-rs/exec-server:exec-server-unit-tests`
    - `//codex-rs/app-server-client:app-server-client-unit-tests`
    - filtered touched `//codex-rs/core:core-unit-tests` cases
  • app-server: use profile ids in v2 permission params (#23360)
    ## Why
    
    The v2 app-server permission profile fields are experimental, but the
    previous migration kept a legacy object payload for profile selection.
    That made clients aware of server-owned `activePermissionProfile`
    metadata such as `extends`, and it kept a
    `legacy_additional_writable_roots` path even though
    `runtimeWorkspaceRoots` now owns runtime workspace-root selection.
    
    This PR makes the client contract match the intended model: clients
    select a permission profile by id, and the server resolves and reports
    active profile provenance in response payloads.
    
    Follow-up to #22611.
    
    ## What Changed
    
    - Changed `thread/start`, `thread/resume`, `thread/fork`, and
    `turn/start` permission profile selection to plain profile id strings.
    - Changed `command/exec.permissionProfile` to a plain profile id string
    for the same client/server ownership split.
    - Removed `PermissionProfileSelectionParams` and the legacy `{ type:
    "profile", modifications: [...] }` compatibility deserializer.
    - Updated app-server, TUI, and `codex exec` call sites to send only ids,
    while keeping `activePermissionProfile` as server response metadata.
    - Updated app-server docs and schema fixtures for the revised
    `command/exec.permissionProfile` shape.
    
    ## Verification
    
    - `cargo test -p codex-app-server-protocol`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-app-server`
    - `cargo test -p codex-exec`
    - `RUST_MIN_STACK=8388608 cargo test -p codex-tui`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23360).
    * #23368
    * __->__ #23360
  • [codex-analytics] preserve user thread source for exec threads (#23376)
    ## Why
    - Follows #20949.
    - The above moved `thread_source` attribution from the reducer to
    explicit caller provided metadata
    - The `codex exec` path still omitted this metadata, leaving
    exec-created threads without `thread_source`
    
    
    ## What Changed
    - Ensures exec threads are marked as user created (`thread_source =
    "user"`)
    - Preserves thread-source metadata in exec’s startup session event
    
    
    ## Verification
    - Updated unit tests to validate exec `thread_source` propagation.
    - `cargo +1.93.0 test -p codex-exec --manifest-path codex-rs/Cargo.toml`
    - `cargo +1.93.1 build -p codex-cli --manifest-path codex-rs/Cargo.toml`
    - Validated locally with a freshly built `codex exec` run:
      - Startup logs showed `thread_source: Some(User)`.
      - Rollout metadata recorded `"thread_source":"user"`.
  • Support --output-schema for exec resume (#23123)
    ## Why
    
    `codex exec resume` should have the same structured-output support as
    top-level `codex exec`. Without `--output-schema`, multi-turn automation
    has to choose between resumed session context and schema-validated JSON
    output.
    
    Fixes #22998.
    
    ## What changed
    
    - Marked `--output-schema` as a global `codex exec` flag so it can be
    passed after `resume`.
    - Reused the existing output schema plumbing so resumed turns attach the
    schema to the final response request while preserving session context.
  • [codex] preserve MCP result meta in McpToolCallItemResult (#22946)
    ## Summary
    
    https://openai.slack.com/archives/C0ARA9UAQEA/p1778890981647319?thread_ts=1778888537.934319&cid=C0ARA9UAQEA
    
    
    - Add `_meta` to exec JSONL MCP tool call result events.
    - Copy MCP result metadata through the JSONL event conversion.
    - Add a focused test that verifies `_meta` is serialized as `_meta` and
    not `meta`.
    
    
    ## Verification
    
    https://www.notion.so/openai/Miaolin-0516-_meta-population-debug-3628e50b62b08074b365e0ce1ffb8f74
  • test: construct permission profiles directly (#23030)
    ## Why
    
    `SandboxPolicy` is now a legacy compatibility shape, but several tests
    still built a `SandboxPolicy` only to immediately convert it into
    `PermissionProfile` for APIs that already accept canonical runtime
    permissions. Those detours make it harder to audit where legacy sandbox
    policy is still required, because boundary-only usages are mixed
    together with ordinary test setup.
    
    ## What Changed
    
    - Updated tests in `codex-core`, `codex-exec`, `codex-analytics`, and
    `codex-config` to construct `PermissionProfile` values directly when the
    code under test takes a permission profile.
    - Changed exec-policy, request-permissions, session, and sandbox test
    helpers to pass `PermissionProfile` through instead of converting from
    `SandboxPolicy` internally.
    - Left `SandboxPolicy` in place where tests are explicitly exercising
    legacy compatibility or request/response boundaries.
    
    ## Test Plan
    
    - `cargo test -p codex-analytics -p codex-config`
    - `cargo test -p codex-core --lib safety::tests`
    - `cargo test -p codex-core --lib exec_policy::tests::`
    - `cargo test -p codex-core --lib exec::tests`
    - `cargo test -p codex-core --lib guardian_review_session_config`
    - `cargo test -p codex-core --lib tools::network_approval::tests`
    - `cargo test -p codex-core --lib
    tools::runtimes::shell::unix_escalation::tests`
    - `cargo test -p codex-core --lib managed_network`
    - `cargo test -p codex-core --test all request_permissions::`
    - `cargo test -p codex-exec sandbox`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23030).
    * #23036
    * __->__ #23030
  • Preserve image detail in app-server inputs (#20693)
    ## Summary
    
    - Add optional image detail to user image inputs across core, app-server
    v2, thread history/event mapping, and the generated app-server
    schemas/types.
    - Preserve requested detail when serializing Responses image inputs:
    omitted detail stays on the existing `high` default, while explicit
    `original` keeps local images on the original-resolution path.
    - Support `high`/`original` consistently for tool image outputs,
    including MCP `codex/imageDetail`, code-mode image helpers, and
    `view_image`.
  • app-server: stop returning thread permission profiles (#22792)
    ## Why
    
    The app-server thread lifecycle API should no longer expose the full
    `PermissionProfile` value. After the permissions-profile migration,
    clients should round-trip only the active profile identity through
    `activePermissionProfile` and `permissions` when that identity is known.
    
    The full profile is server-side config. Treating a response-derived
    legacy sandbox projection as a new local profile can lose named-profile
    restrictions and accidentally widen permissions on the next turn. The
    legacy `sandbox` response field remains only as the
    compatibility/display fallback.
    
    ## What Changed
    
    - Removed `permissionProfile` from `ThreadStartResponse`,
    `ThreadResumeResponse`, and `ThreadForkResponse`.
    - Stopped populating that field in app-server thread start/resume/fork
    responses.
    - Updated embedded exec/TUI response mapping to derive display
    permission state from local config or the legacy sandbox fallback
    instead of a response profile value.
    - Added a TUI turn override shape that distinguishes preserving server
    permissions, selecting an active profile id, and sending a legacy
    sandbox for an explicit local override.
    - Preserved remote app-server permissions across turns by sending
    `permissions` only when an `activePermissionProfile` id is known, and
    otherwise sending no sandbox override unless the user selected a local
    override.
    - Kept embedded `thread/resume` hydration server-authored when
    `activePermissionProfile` is absent, which matches the live-thread
    attach path where the server ignores requested overrides.
    - Updated the app-server README to remove the obsolete lifecycle
    response `permissionProfile` reference. The remaining
    `permissionProfile` README references are request-side permission
    overrides.
    - Regenerated app-server JSON schema and TypeScript fixtures.
    - Kept the generated typed response enum exempt from
    `large_enum_variant`, matching the existing payload enum exemption after
    the lifecycle response variants shrank.
    
    ## How To Review
    
    Start with `codex-rs/app-server-protocol/src/protocol/v2/thread.rs` to
    confirm the response shape, then check the response construction in
    `codex-rs/app-server/src/request_processors`. The generated schema and
    TypeScript fixture changes are mechanical follow-through from the
    protocol removal.
    
    The TUI behavior is the delicate part: review
    `codex-rs/tui/src/app_server_session.rs` for response hydration and
    turn-start override projection, then
    `codex-rs/tui/src/app/thread_routing.rs` for the decision about whether
    the next turn should preserve the server snapshot, send an active
    profile id, or send a legacy sandbox for an explicit local override.
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol
    thread_lifecycle_responses_default_missing_optional_fields`
    - `cargo test -p codex-exec
    session_configured_from_thread_response_uses_permission_profile_from_config`
    - `cargo test -p codex-tui --lib thread_response`
    - `cargo test -p codex-tui turn_permissions_`
    - `cargo test -p codex-tui
    resume_response_restores_turns_from_thread_items`
    - `cargo test -p codex-analytics
    track_response_only_enqueues_analytics_relevant_responses`
    - `just fix -p codex-analytics`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-tui`
    - `just argument-comment-lint`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22792).
    * #22795
    * __->__ #22792
  • tui/exec: show effective workspace roots in summaries (#22612)
    ## Why
    
    This PR builds on [#22611](https://github.com/openai/codex/pull/22611).
    
    After `runtimeWorkspaceRoots` moved onto thread state, the user-facing
    summaries were still inconsistent about which roots they showed. In
    particular, `/status` and the exec startup summary could under-report
    extra workspace roots from `--add-dir` or from profile-defined
    `workspace_roots`, which made the new model look incorrect even when the
    permissions themselves were right.
    
    ## What Changed
    
    - switched the TUI status surfaces to summarize against
    `Config::effective_workspace_roots()`
    - updated the exec human-output summary to render from the effective
    permission profile instead of the raw constrained profile
    - added focused regressions for both the TUI and exec code paths so
    extra workspace roots stay visible in user-facing summaries
    
    ## Verification
    
    Targeted coverage for this follow-up lives in:
    - `codex-rs/tui/src/status/tests.rs`
    - `codex-rs/exec/src/event_processor_with_human_output_tests.rs`
    
    The added regressions verify that:
    - status output includes profile-defined workspace roots in the
    effective permissions summary
    - exec startup output includes runtime workspace roots instead of
    collapsing back to `cwd` only
  • app-server: use permission ids and runtime workspace roots (#22611)
    ## Why
    
    This PR builds on [#22610](https://github.com/openai/codex/pull/22610)
    and is the app-server side of the migration from mutable per-turn
    `SandboxPolicy` replacement toward selecting immutable permission
    profiles by id plus mutable runtime workspace roots.
    
    Once permission profiles can carry their own immutable
    `workspace_roots`, app-server no longer needs to mutate the selected
    `PermissionProfile` just to represent thread-specific filesystem
    context. The mutable part now lives on the thread as explicit
    `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until
    the sandbox is realized for a turn.
    
    ## What Changed
    
    - Replaced the v2 permission-selection wrapper surface with plain
    profile ids for `thread/start`, `thread/resume`, `thread/fork`, and
    `turn/start`.
    - Removed the API surface for profile modifications
    (`PermissionProfileSelectionParams`,
    `PermissionProfileModificationParams`,
    `ActivePermissionProfileModification`).
    - Added experimental `runtimeWorkspaceRoots` fields to the thread
    lifecycle and turn-start APIs.
    - Threaded runtime workspace roots through core session/thread
    snapshots, turn overrides, app-server request handling, and command
    execution permission resolution.
    - Kept session permission state symbolic so later runtime root updates
    and cwd-only implicit-root retargeting rebind `:workspace_roots`
    correctly.
    - Updated the embedded clients just enough to send and restore the new
    thread state.
    - Refreshed the generated schema/TypeScript artifacts and the app-server
    README to match the new contract.
    
    ## Verification
    
    Targeted coverage for this layer lives in:
    
    - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_start.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_resume.rs`
    - `codex-rs/app-server/tests/suite/v2/turn_start.rs`
    - `codex-rs/core/src/session/tests.rs`
    
    The key regression checks exercise that:
    
    - `runtimeWorkspaceRoots` resolve against the effective cwd on thread
    start.
    - Profile-declared workspace roots are excluded from the runtime
    workspace roots returned by app-server.
    - A turn-level runtime workspace-root update persists onto the thread
    and is returned by `thread/resume`.
    - A named permission profile selected on one turn remains symbolic so a
    later runtime-root-only turn update changes the actual sandbox writes.
    - A cwd-only turn update retargets the implicit runtime cwd root while
    preserving additional runtime roots.
    - The protocol fixtures and generated client artifacts stay in sync with
    the string-based permission selection contract.
    
    
    
    
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611).
    * #22612
    * __->__ #22611
  • permissions: resolve profile identity with constraints (#22683)
    ## Why
    
    This PR is the invariant-cleanup layer that follows the workspace-roots
    base merged in [#22610](https://github.com/openai/codex/pull/22610).
    
    #22610 adds `[permissions.<id>.workspace_roots]` and keeps runtime
    workspace roots separate from the raw permission profile, but its
    in-memory representation is intentionally transitional: `Permissions`
    still carries the selected profile identity next to a constrained
    `PermissionProfile`. That makes APIs such as
    `set_constrained_permission_profile_with_active_profile()` fragile
    because the id and value only mean the right thing when every caller
    keeps them in sync.
    
    This PR introduces a single resolved profile state so profile identity,
    `extends`, the profile value, and profile-declared workspace roots
    travel together. The next PR,
    [#22611](https://github.com/openai/codex/pull/22611), builds on this by
    changing the app-server turn API to select permission profiles by id
    plus runtime workspace roots.
    
    ## Stack Context
    
    - #22610, now merged: adds profile-declared `workspace_roots`, runtime
    workspace roots, and `:workspace_roots` materialization.
    - This PR: replaces the parallel active-profile/profile-value fields
    with `PermissionProfileState`.
    - #22611: switches app-server turn updates toward profile ids plus
    runtime workspace roots.
    - #22612: updates TUI/exec summaries to show the effective workspace
    roots.
    
    Keeping this separate from #22611 is deliberate: reviewers can validate
    the internal state invariant before reviewing the app-server protocol
    migration.
    
    ## What Changed
    
    - Added `ResolvedPermissionProfile::{Legacy, BuiltIn, Named}` and
    `PermissionProfileState`.
    - Typed built-in profile ids with `BuiltInPermissionProfileId`.
    - Moved selected profile identity and profile-declared workspace roots
    into the resolved state.
    - Replaced `Permissions` parallel profile fields with one
    `permission_profile_state`.
    - Removed `set_constrained_permission_profile_with_active_profile()`
    from session sync paths.
    - Kept trusted session replay/`SessionConfigured` compatibility through
    explicit session snapshot helpers.
    - Updated session configuration, MCP initialization, app-server, exec,
    TUI, and guardian call sites to consume `&PermissionProfile` directly.
    
    ## Review Guide
    
    Start with `codex-rs/core/src/config/resolved_permission_profile.rs`; it
    is the new invariant boundary. Then review
    `codex-rs/core/src/config/mod.rs` to see how config loading records
    active profile identity and profile workspace roots. The remaining
    call-site changes are mostly mechanical fallout from
    `Permissions::permission_profile()` returning `&PermissionProfile`
    instead of `&Constrained<PermissionProfile>`.
    
    ## Verification
    
    The existing config/session coverage now constructs and asserts through
    `PermissionProfileState`. The workspace-root config test also asserts
    that profile-declared roots are preserved in the resolved state, which
    is the behavior #22611 relies on when runtime roots become mutable
    through the app-server API.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22683).
    * #22612
    * #22611
    * __->__ #22683
  • permissions: support workspace roots in profiles (#22610)
    ## Why
    
    This is the configuration/model half of the alternative permissions
    migration we discussed as a comparison point for
    [#22401](https://github.com/openai/codex/pull/22401) and
    [#22402](https://github.com/openai/codex/pull/22402).
    
    The old `workspace-write` model mixes three concerns that we want to
    keep separate:
    - reusable profile rules that should stay immutable once selected
    - user/runtime workspace roots from `cwd`, `--add-dir`, and legacy
    workspace-write config
    - internal Codex writable roots such as memories, which should not be
    shown as user workspace roots
    
    This PR gives permission profiles first-class `workspace_roots` so users
    can opt multiple repositories into the same `:workspace_roots` rules
    without using broad absolute-path write grants. It also starts
    separating the raw selected profile from the effective runtime profile
    by making `Permissions` expose explicit accessors instead of public
    mutable fields.
    
    A representative `config.toml` looks like this:
    
    ```toml
    default_permissions = "dev"
    
    [permissions.dev.workspace_roots]
    "~/code/openai" = true
    "~/code/developers-website" = true
    
    [permissions.dev.filesystem.":workspace_roots"]
    "." = "write"
    ".codex" = "read"
    ".git" = "read"
    ".vscode" = "read"
    ```
    
    If Codex starts in `~/code/codex` with that profile selected, the
    effective workspace-root set becomes:
    - `~/code/codex` from the runtime `cwd`
    - `~/code/openai` from the profile
    - `~/code/developers-website` from the profile
    
    The `:workspace_roots` rules are materialized across each root, so
    `.git`, `.codex`, and `.vscode` stay scoped the same way everywhere.
    Runtime additions such as `--add-dir` can still layer on later stack
    entries without mutating the selected profile.
    
    ## Stack Shape
    
    This PR intentionally stops before the profile-identity cleanup in
    [#22683](https://github.com/openai/codex/pull/22683) so the base review
    stays focused on config loading, workspace-root materialization, and
    compatibility with legacy `workspace-write`.
    
    The representation in this PR is therefore transitional: `Permissions`
    carries enough state to distinguish the raw constrained profile from the
    effective runtime profile, and there are still call sites that must keep
    the active profile identity and constrained profile value in sync. The
    follow-up PR replaces that with a single resolved profile state
    (`ResolvedPermissionProfile` / `PermissionProfileState`) that keeps the
    profile id, immutable `PermissionProfile`, and profile-declared
    workspace roots together. That follow-up removes APIs such as
    `set_constrained_permission_profile_with_active_profile()` where
    separate arguments could drift out of sync.
    
    Downstream PRs then build on this base to switch app-server turn updates
    to profile ids plus runtime workspace roots and to finish the
    user-visible summary behavior. Reviewers should judge this PR as the
    workspace-roots foundation, not as the final in-memory shape of selected
    permission profiles.
    
    ## Review Guide
    
    Suggested review order:
    
    1. Start with `codex-rs/core/src/config/mod.rs`.
    This is the main shape change in the base slice. `Permissions` now
    stores a private raw `Constrained<PermissionProfile>` plus runtime
    `workspace_roots`. Callers use `permission_profile()` when they need the
    raw constrained value and `effective_permission_profile()` when they
    need a materialized runtime profile. As noted above,
    [#22683](https://github.com/openai/codex/pull/22683) replaces this
    transitional shape with a resolved profile state that keeps identity and
    profile data together.
    
    2. Review `codex-rs/config/src/permissions_toml.rs` and
    `codex-rs/core/src/config/permissions.rs`.
    These add `[permissions.<id>.workspace_roots]`, resolve enabled entries
    relative to the policy cwd, and keep `:workspace_roots` deny-read glob
    patterns symbolic until the actual roots are known.
    
    3. Review `codex-rs/protocol/src/permissions.rs` and
    `codex-rs/protocol/src/models.rs`.
    These add the policy/profile materialization helpers that expand exact
    `:workspace_roots` entries and scoped deny-read globs over every
    workspace root. This is also where `ActivePermissionProfileModification`
    is removed from the core model.
    
    4. Review the legacy bridge in
    `Config::load_from_base_config_with_overrides` and
    `Config::set_legacy_sandbox_policy`.
    This is where legacy `workspace-write` roots become runtime workspace
    roots, while Codex internal writable roots stay internal and do not
    appear as user-facing workspace roots.
    
    5. Then skim downstream call sites.
    The interesting pattern is raw-vs-effective access: state/proxy/bwrap
    paths keep the raw constrained profile, while execution, summaries, and
    user-visible status use the effective profile and workspace-root list.
    
    ## What Changed
    
    - added `[permissions.<id>.workspace_roots]` to the config model and
    schema
    - added runtime `workspace_roots` state to `Config`/`Permissions` and
    `ConfigOverrides`
    - made `Permissions` profile fields private and replaced direct mutation
    with accessors/setters
    - added `PermissionProfile` and `FileSystemSandboxPolicy` helpers for
    materializing `:workspace_roots` exact paths and deny-read globs across
    all roots
    - moved legacy additional writable roots into runtime workspace-root
    state instead of active profile modifications
    - removed `ActivePermissionProfileModification` and its app-server
    protocol/schema export
    - updated sandbox/status summary paths so internal writable roots are
    not reported as user workspace roots
    
    ## Verification Strategy
    
    The targeted tests cover the behavior at the layers where regressions
    are most likely:
    - `codex-rs/core/src/config/config_tests.rs` verifies config loading,
    legacy workspace-root seeding, effective profile materialization, and
    memory-root handling.
    - `codex-rs/core/src/config/permissions_tests.rs` verifies profile
    `workspace_roots` parsing and `:workspace_roots` scoped/glob
    compilation.
    - `codex-rs/protocol/src/permissions.rs` unit tests verify exact and
    glob materialization over multiple workspace roots.
    - `codex-rs/tui/src/status/tests.rs` and
    `codex-rs/utils/sandbox-summary/src/sandbox_summary.rs` verify the
    user-facing summaries show effective workspace roots and hide internal
    writes.
    
    I also ran `cargo check --tests` locally after the latest stack refresh
    to catch cross-crate API breakage from the private-field/accessor
    changes.
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22610).
    * #22612
    * #22611
    * #22683
    * __->__ #22610
  • test: isolate exec review policy config test (#22512)
    ## Why
    
    
    `thread_start_params_include_review_policy_when_review_policy_is_manual_only`
    builds a `Config` with a temporary `CODEX_HOME`, but
    `ConfigBuilder::default()` can still load host-managed configuration. On
    local macOS machines with enterprise-managed Codex config, that host
    state can leak into the test and change the resulting config, even
    though CI does not have the same managed config source.
    
    This makes the test environment-dependent: it can pass in CI while
    failing locally for developers who have managed configuration installed.
    
    ## What Changed
    
    - Updated `codex-rs/exec/src/lib_tests.rs` so the test calls
    `LoaderOverrides::without_managed_config_for_tests()` through
    `ConfigBuilder::loader_overrides(...)`.
    - Left the rest of the test setup intact, including the temporary
    `CODEX_HOME`, temporary cwd, and explicit `approvals_reviewer` harness
    override.
    
    ## Verification
    
    ```shell
    cargo test -p codex-exec thread_start_params_include_review_policy_when_review_policy_is_manual_only
    ```
  • tests: avoid ambient temp sandbox roots (#22576)
    ## Why
    Some sandboxed integration tests enabled both ambient temp roots
    (`TMPDIR` and literal `/tmp`) even though they were not testing
    temp-root behavior. On Linux bwrap, making `/tmp` writable causes
    protected metadata mount targets such as `/tmp/.git`, `/tmp/.agents`,
    and `/tmp/.codex` to be synthesized. If a run is interrupted, those
    top-level markers can be left behind and contaminate later tests.
    
    ## What changed
    For the incidental integration tests that do not need ambient temp-root
    access, set `exclude_tmpdir_env_var` and `exclude_slash_tmp` to `true`.
    Dedicated protected-metadata coverage remains in the lower-level sandbox
    tests that use isolated temp roots.
    
    ## Verification
    Focused remote devbox repros passed with a watcher polling `/tmp/.git`,
    `/tmp/.agents`, and `/tmp/.codex`; no leaked markers were observed.
  • feat: add layered --profile-v2 config files (#17141)
    ## Why
    
    `--profile-v2 <name>` gives launchers and runtime entry points a named
    profile config without making each profile duplicate the base user
    config. The base `$CODEX_HOME/config.toml` still loads first, then
    `$CODEX_HOME/<name>.config.toml` layers above it and becomes the active
    writable user config for that session.
    
    That keeps shared defaults, plugin/MCP setup, and managed/user
    constraints in one place while letting a named profile override only the
    pieces that need to differ.
    
    ## What Changed
    
    - Added the shared `--profile-v2 <name>` runtime option with validated
    plain names, now represented by `ProfileV2Name`.
    - Extended config layer state so the base user config and selected
    profile config are both `User` layers; APIs expose the active user layer
    and merged effective user config.
    - Threaded profile selection through runtime entry points: `codex`,
    `codex exec`, `codex review`, `codex resume`, `codex fork`, and `codex
    debug prompt-input`.
    - Made user-facing config writes go to the selected profile file when
    active, including TUI/settings persistence, app-server config writes,
    and MCP/app tool approval persistence.
    - Made plugin, marketplace, MCP, hooks, and config reload paths read
    from the merged user config so base and profile layers both participate.
    - Updated app-server config layer schemas to mark profile-backed user
    layers.
    
    ## Limits
    
    `--profile-v2` is still rejected for config-management subcommands such
    as feature, MCP, and marketplace edits. Those paths remain tied to the
    base `config.toml` until they have explicit profile-selection semantics.
    
    Some adjacent background writes may still update base or global state
    rather than the selected profile:
    
    - marketplace auto-upgrade metadata
    - automatic MCP dependency installs from skills
    - remote plugin sync or uninstall config edits
    - personality migration marker/default writes
    
    ## Verification
    
    Added targeted coverage for profile name validation, layer
    ordering/merging, selected-profile writes, app-server config writes,
    session hot reload, plugin config merging, hooks/config fixture updates,
    and MCP/app approval persistence.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore(config) rm experimental_use_freeform_apply_patch (#22565)
    ## Summary
    Get rid of the `experimental_use_freeform_apply_patch` config option,
    since it is now encoded in model config. No deprecation message since it
    has been experimental this entire time.
    
    ## Testing
    - [x] Updated unit tests
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • config: add strict config parsing (#20559)
    ## Why
    
    Codex intentionally ignores unknown `config.toml` fields by default so
    older and newer config files keep working across versions. That leniency
    also makes typo detection hard because misspelled or misplaced keys
    disappear silently.
    
    This change adds an opt-in strict config mode so users and tooling can
    fail fast on unrecognized config fields without changing the default
    permissive behavior.
    
    This feature is possible because `serde_ignored` exposes the exact
    signal Codex needs: it lets Codex run ordinary Serde deserialization
    while recording fields Serde would otherwise ignore. That avoids
    requiring `#[serde(deny_unknown_fields)]` across every config type and
    keeps strict validation opt-in around the existing config model.
    
    ## What Changed
    
    ### Added strict config validation
    
    - Added `serde_ignored`-based validation for `ConfigToml` in
    `codex-rs/config/src/strict_config.rs`.
    - Combined `serde_ignored` with `serde_path_to_error` so strict mode
    preserves typed config error paths while also collecting fields Serde
    would otherwise ignore.
    - Added strict-mode validation for unknown `[features]` keys, including
    keys that would otherwise be accepted by `FeaturesToml`'s flattened
    boolean map.
    - Kept typed config errors ahead of ignored-field reporting, so
    malformed known fields are reported before unknown-field diagnostics.
    - Added source-range diagnostics for top-level and nested unknown config
    fields, including non-file managed preference source names.
    
    ### Kept parsing single-pass per source
    
    - Reworked file and managed-config loading so strict validation reuses
    the already parsed `TomlValue` for that source.
    - For actual config files and managed config strings, the loader now
    reads once, parses once, and validates that same parsed value instead of
    deserializing multiple times.
    - Validated `-c` / `--config` override layers with the same
    base-directory context used for normal relative-path resolution, so
    unknown override keys are still reported when another override contains
    a relative path.
    
    ### Scoped `--strict-config` to config-heavy entry points
    
    - Added support for `--strict-config` on the main config-loading entry
    points where it is most useful:
      - `codex`
      - `codex resume`
      - `codex fork`
      - `codex exec`
      - `codex review`
      - `codex mcp-server`
      - `codex app-server` when running the server itself
      - the standalone `codex-app-server` binary
      - the standalone `codex-exec` binary
    - Commands outside that set now reject `--strict-config` early with
    targeted errors instead of accepting it everywhere through shared CLI
    plumbing.
    - `codex app-server` subcommands such as `proxy`, `daemon`, and
    `generate-*` are intentionally excluded from the first rollout.
    - When app-server strict mode sees invalid config, app-server exits with
    the config error instead of logging a warning and continuing with
    defaults.
    - Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli`
    instead of extending shared `ReviewArgs`, so `--strict-config` stays on
    the outer config-loading command surface and does not become part of the
    reusable review payload used by `codex exec review`.
    
    ### Coverage
    
    - Added tests for top-level and nested unknown config fields, unknown
    `[features]` keys, typed-error precedence, source-location reporting,
    and non-file managed preference source names.
    - Added CLI coverage showing invalid `--enable`, invalid `--disable`,
    and unknown `-c` overrides still error when `--strict-config` is
    present, including compound-looking feature names such as
    `multi_agent_v2.subagent_usage_hint_text`.
    - Added integration coverage showing both `codex app-server
    --strict-config` and standalone `codex-app-server --strict-config` exit
    with an error for unknown config fields instead of starting with
    fallback defaults.
    - Added coverage showing unsupported command surfaces reject
    `--strict-config` with explicit errors.
    
    ## Example Usage
    
    Run Codex with strict config validation enabled:
    
    ```shell
    codex --strict-config
    ```
    
    Strict config mode is also available on the supported config-heavy
    subcommands:
    
    ```shell
    codex --strict-config exec "explain this repository"
    codex review --strict-config --uncommitted
    codex mcp-server --strict-config
    codex app-server --strict-config --listen off
    codex-app-server --strict-config --listen off
    ```
    
    For example, if `~/.codex/config.toml` contains a typo in a key name:
    
    ```toml
    model = "gpt-5"
    approval_polic = "on-request"
    ```
    
    then `codex --strict-config` reports the misspelled key instead of
    silently ignoring it. The path is shortened to `~` here for readability:
    
    ```text
    $ codex --strict-config
    Error loading config.toml:
    ~/.codex/config.toml:2:1: unknown configuration field `approval_polic`
      |
    2 | approval_polic = "on-request"
      | ^^^^^^^^^^^^^^
    ```
    
    Without `--strict-config`, Codex keeps the existing permissive behavior
    and ignores the unknown key.
    
    Strict config mode also validates ad-hoc `-c` / `--config` overrides:
    
    ```text
    $ codex --strict-config -c foo=bar
    Error: unknown configuration field `foo` in -c/--config override
    
    $ codex --strict-config -c features.foo=true
    Error: unknown configuration field `features.foo` in -c/--config override
    ```
    
    Invalid feature toggles are rejected too, including values that look
    like nested config paths:
    
    ```text
    $ codex --strict-config --enable does_not_exist
    Error: Unknown feature flag: does_not_exist
    
    $ codex --strict-config --disable does_not_exist
    Error: Unknown feature flag: does_not_exist
    
    $ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text
    Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text
    ```
    
    Unsupported commands reject the flag explicitly:
    
    ```text
    $ codex --strict-config cloud list
    Error: `--strict-config` is not supported for `codex cloud`
    ```
    
    ## Verification
    
    The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid
    `--disable`, the compound `multi_agent_v2.subagent_usage_hint_text`
    case, unknown `-c` overrides, app-server strict startup failure through
    `codex app-server`, and rejection for unsupported commands such as
    `codex cloud`, `codex mcp`, `codex remote-control`, and `codex
    app-server proxy`.
    
    The config and config-loader tests cover unknown top-level fields,
    unknown nested fields, unknown `[features]` keys, source-location
    reporting, non-file managed config sources, and `-c` validation for keys
    such as `features.foo`.
    
    The app-server test suite covers standalone `codex-app-server
    --strict-config` startup failure for an unknown config field.
    
    ## Documentation
    
    The Codex CLI docs on developers.openai.com/codex should mention
    `--strict-config` as an opt-in validation mode for supported
    config-heavy entry points once this ships.
  • add --dangerously-bypass-hook-trust CLI flag (#21768)
    # Why
    
    Hook trust happens through the TUI in `/hooks` so it can block
    non-interactive use cases. This flag will allow users that are using
    codex headlessly to bypass hooks when they want to.
    
    # What
    
    This adds one invocation-scoped escape hatch.
    
    - the CLI flag sets a runtime-only `bypass_hook_trust` override; there
    is no durable `config.toml` setting
    - hook discovery still respects normal enablement, so explicitly
    disabled hooks remain disabled
    - we show a `--dangerously-bypass-hook-trust is enabled. Enabled hooks
    may run without review for this invocation.` message on startup so
    accidental use is visible in both interactive and exec flows
    
    This keeps “enabled” and “trusted” as separate concepts in the normal
    path, while giving CI/E2E callers a stable way to opt into the
    exceptional path when they already control the hook set.
  • Remove CODEX_RS_SSE_FIXTURE test hook (#22413)
    ## Why
    
    `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
    bypass the normal Responses transport by reading SSE from local files.
    That kept test-only behavior wired through production client code. The
    affected tests can stay hermetic by using the existing
    `core_test_support::responses` mock server and passing `openai_base_url`
    instead.
    
    ## What Changed
    
    - Removed the `CODEX_RS_SSE_FIXTURE` flag,
    `codex_api::stream_from_fixture`, the `env-flags` dependency, and the
    checked-in SSE fixture files.
    - Repointed the affected core, exec, and TUI tests at `MockServer` with
    the existing SSE event constructors.
    - Removed the Bazel test data plumbing for the deleted fixtures and
    refreshed cargo/Bazel lock state.
    
    ## Verification
    
    - `cargo build -p codex-cli`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core --test all responses_api_stream_cli`
    - `cargo test -p codex-core --test all
    integration_creates_and_checks_session_file`
    - `cargo test -p codex-exec --test all ephemeral`
    - `cargo test -p codex-exec --test all resume`
    - `cargo test -p codex-tui --test all
    resume_startup_does_not_consume_model_availability_nux_count`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
    - `git diff --check`
  • Add process-scoped SQLite telemetry (#22154)
    ## Summary
    - add SQLite init, backfill-gate, and fallback telemetry without
    introducing a cross-cutting state-db access wrapper
    - install one process-scoped telemetry sink after OTEL startup and let
    low-level state/rollout paths emit through it directly
    - add process-start metrics for the process owners that initialize
    SQLite
    
    ---------
    
    Co-authored-by: Owen Lin <owen@openai.com>
  • [codex] Delete function-style apply_patch (#21651)
    ## Why
    
    `apply_patch` is now a freeform/custom tool. Keeping the old
    JSON/function-style registration and parsing path left another way for
    models and tests to invoke `apply_patch`, which made the tool surface
    harder to reason about.
    
    ## What changed
    
    - Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
    spec, and handler support for function payloads.
    - Kept `apply_patch_tool_type = freeform` as the supported model
    metadata path, including Bedrock catalog metadata.
    - Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
    calls.
    
    ## Verification
    
    - `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
    - `cargo test -p codex-core tools::handlers::apply_patch --lib`
    - `cargo test -p codex-core --test all
    apply_patch_tool_executes_and_emits_patch_events`
    - `cargo test -p codex-core --test all
    apply_patch_reports_parse_diagnostics`
    - `cargo test -p codex-exec test_apply_patch_tool`
    - `just fix -p codex-core`
    - `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
    codex-exec`
  • [codex] request desktop attestation from app (#20619)
    ## Summary
    
    TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
    attestation token and attach it as `x-oai-attestation` on the scoped
    ChatGPT Codex request paths.
    
    ![DeviceCheck attestation
    interface](https://raw.githubusercontent.com/openai/codex/dev/jm/devicecheck-diagram-assets/pr-assets/devicecheck-attestation-interface.png)
    
    ## Details
    
    This PR teaches the Codex app-server runtime how to request and attach
    an attestation token. It does not generate DeviceCheck tokens directly;
    instead, it relies on the connected desktop app to advertise that it can
    generate attestation and then asks that app for a fresh header value
    when needed.
    
    The flow is:
    
    1. The Codex desktop app connects to app-server.
    2. During `initialize`, the app can advertise that it supports
    `requestAttestation`.
    3. Before app-server calls selected ChatGPT Codex endpoints, it sends
    the internal server request `attestation/generate` to the app.
    4. app-server receives a pre-encoded header value back.
    5. app-server forwards that value as `x-oai-attestation` on the scoped
    outbound requests.
    
    The code in this repo is mostly protocol and runtime plumbing: it adds
    the app-server request/response shape, introduces an attestation
    provider in core, wires that provider into Responses / compaction /
    realtime setup paths, and covers the intended scoping with tests. The
    signed macOS DeviceCheck generation remains owned by the desktop app PR.
    
    ## Related PR
    
    - Codex desktop app implementation:
    https://github.com/openai/openai/pull/878649
    
    ## Validation
    
    <details>
    <summary>Tests run</summary>
    
    ```sh
    cargo test -p codex-app-server-protocol
    cargo test -p codex-core attestation --lib
    cargo test -p codex-app-server --lib attestation
    ```
    
    Also ran:
    
    ```sh
    just fix -p codex-core
    just fix -p codex-app-server
    just fix -p codex-app-server-protocol
    just fmt
    just write-app-server-schema
    ```
    
    </details>
    
    <details>
    <summary>E2E DeviceCheck validation</summary>
    
    First validated the signed desktop app boundary directly: launched a
    packaged signed `Codex.app`, sent `attestation/generate`, decoded the
    returned `v1.` attestation header, and validated the extracted
    DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
    bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
    `is_ok: true`.
    
    Then ran the fuller app + app-server flow. The packaged `Codex.app`
    launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
    MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
    requested `attestation/generate` from the real Electron app process, and
    the intercepted `/backend-api/codex/responses` traffic included
    `x-oai-attestation` on both routes:
    
    ```text
    GET  /backend-api/codex/responses  Upgrade: websocket  x-oai-attestation: present
    POST /backend-api/codex/responses  Upgrade: none       x-oai-attestation: present
    ```
    
    The captured header decoded to a DeviceCheck token that also validated
    with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
    team `2DC432GLL2`).
    
    </details>
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Load configured environments from CODEX_HOME (#20667)
    ## Why
    
    The earlier PRs add stdio transport support and the config-backed
    environment provider, but the feature remains inert until normal Codex
    entrypoints construct `EnvironmentManager` with enough context to
    discover `CODEX_HOME/environments.toml`. This final stack PR activates
    the provider while preserving the legacy `CODEX_EXEC_SERVER_URL`
    fallback when no environments file exists.
    
    **Stack position:** this is PR 5 of 5. It is the product wiring PR that
    activates the configured environment provider added in PR 4.
    
    ## What Changed
    
    - Thread `codex_home` into `EnvironmentManagerArgs`.
    - Change `EnvironmentManager::new(...)` to load the provider from
    `CODEX_HOME`.
    - Preserve legacy behavior by falling back to
    `DefaultEnvironmentProvider::from_env()` when `environments.toml` is
    absent.
    - Make `environments.toml`-backed managers start new threads with all
    configured environments, default first, while keeping the legacy env-var
    path single-default.
    - Update the app-server, TUI, exec, MCP server, connector, prompt-debug,
    and thread-manager-sample callsites to pass `codex_home` and handle
    provider-loading errors.
    
    ## Self-Review Notes
    
    - The multi-environment startup path is intentionally tied to the
    `environments.toml` provider. Using `>1` configured environment as the
    only signal would also expand the legacy `CODEX_EXEC_SERVER_URL`
    provider because it keeps `local` addressable alongside `remote`.
    - The startup environment list is still derived inside
    `EnvironmentManager`; the provider only says whether its snapshot should
    start new threads with all configured environments.
    - The thread-manager sample was updated to pass the current
    `ThreadManager::new(...)` installation id argument so the stack compiles
    under Bazel.
    
    ## Stack
    
    - 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
    listener
    - 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
    client transport
    - 3. https://github.com/openai/codex/pull/20665 - Make environment
    providers own default selection
    - 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
    environments TOML provider
    - **5. This PR:** https://github.com/openai/codex/pull/20667 - Load
    configured environments from CODEX_HOME
    
    Split from original draft: https://github.com/openai/codex/pull/20508
    
    ## Validation
    
    - `just fmt`
    - `git diff --check`
    - `bazel build --config=remote --strategy=remote
    --remote_download_toplevel
    //codex-rs/thread-manager-sample:codex-thread-manager-sample`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel
    //codex-rs/exec-server:exec-server-unit-tests`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel --test_sharding_strategy=disabled
    --test_arg=default_thread_environment_selections_use_manager_default_id
    //codex-rs/core:core-unit-tests`
    - `bazel test --config=remote --strategy=remote
    --remote_download_toplevel --test_sharding_strategy=disabled
    --test_arg=start_thread_uses_all_default_environments_from_codex_home
    //codex-rs/core:core-unit-tests`
    
    ## Documentation
    
    This activates `CODEX_HOME/environments.toml`; user-facing documentation
    should be added before this stack is treated as a documented public
    workflow.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Remove exec research preview banner wording (#21683)
    ## Why
    
    `codex exec` still included the stale `research preview` label in its
    human-readable startup banner, which makes the CLI look older and less
    current than it is.
    
    Fixes #21444.
    
    ## What Changed
    
    Removed the hard-coded ` (research preview)` suffix from the `OpenAI
    Codex v<version>` startup banner in
    `codex-rs/exec/src/event_processor_with_human_output.rs`.
    
    ## Validation
    
    Local validation was not required for this one-line startup banner text
    cleanup.
  • Disable empty Cargo test targets (#21584)
    ## Summary
    
    `cargo test` has entails both running standard Rust tests and doctests.
    It turns out that the doctest discovery is fairly slow, and it's a cost
    you pay even for crates that don't include any doctests.
    
    This PR disables doctests with `doctest = false` for crates that lack
    any doctests.
    
    For the collection of crates below, this speeds up test execution by
    >4x.
    
    E.g., before this PR:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
      Range (min … max):    0.418 s … 14.529 s    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
      Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
      Range (min … max):   418.0 ms … 436.8 ms    10 runs
    ```
    
    For a single crate, with >2x speedup, before:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
      Range (min … max):   480.9 ms … 512.0 ms    10 runs
    ```
    
    And after:
    
    ```
    Benchmark 1: cargo test -p codex-utils-string
      Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
      Range (min … max):   206.8 ms … 221.0 ms    13 runs
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • Avoid noisy OTEL diagnostics in codex exec (#21107)
    `codex exec` should not print OpenTelemetry exporter self-diagnostics to
    stderr by default. Suppress the SDK and OTLP exporter targets unless
    callers
    explicitly opt in with `RUST_LOG`.
    
    Also stop defaulting the trace exporter to the log exporter, since OTLP
    HTTP
    endpoints are signal-specific and a logs endpoint is not valid for
    spans.
    
    Co-authored-by: Codex <noreply@openai.com>
    
    Co-authored-by: Codex <noreply@openai.com>
  • Move message history out of core (#21278)
    ## Why
    
    Message history was implemented inside `codex-core` and surfaced through
    core protocol ops and `SessionConfiguredEvent` fields even though the
    current consumer is TUI-local prompt recall. That made core own UI
    history persistence and exposed `history_log_id` / `history_entry_count`
    through surfaces that app-server and other clients do not need.
    
    This change moves message history persistence out of core and keeps the
    recall plumbing local to the TUI.
    
    ## What changed
    
    - Added a new `codex-message-history` crate for appending, looking up,
    trimming, and reading metadata from `history.jsonl`.
    - Removed core protocol history ops/events: `AddToHistory`,
    `GetHistoryEntryRequest`, and `GetHistoryEntryResponse`.
    - Removed `history_log_id` and `history_entry_count` from
    `SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly.
    - Updated the TUI to dispatch local app events for message-history
    append/lookup and keep its persistent-history metadata in TUI session
    state.
    
    ## Validation
    
    - `cargo test -p codex-message-history -p codex-protocol`
    - `cargo test -p codex-exec event_processor_with_json_output`
    - `cargo test -p codex-mcp-server outgoing_message`
    - `cargo test -p codex-tui`
    - `just fix -p codex-message-history -p codex-protocol -p codex-core -p
    codex-tui -p codex-exec -p codex-mcp-server`
  • 2- Use string service tiers in session protocol (#20971)
    ## Summary
    - break service tier session/op/app-server protocol fields from the
    closed enum to string tier ids
    - send the service tier string directly through model requests, prewarm,
    compaction, memories, and TUI/app-server turn starts
    - regenerate app-server protocol JSON/TypeScript schemas, removing the
    standalone ServiceTier TS enum
    
    ## Verification
    - just fmt
    - cargo check -p codex-core -p codex-app-server -p codex-tui
    - just write-app-server-schema
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat(app-server): move v2 sessionId onto Thread (#21336)
    ## Why
    
    `session_id` and `thread_id` are separate identities after #20437, but
    app-server only surfaced `sessionId` on the `thread/start`,
    `thread/resume`, and `thread/fork` response envelopes. Other
    thread-bearing surfaces such as `thread/list`, `thread/read`,
    `thread/started`, `thread/rollback`, `thread/metadata/update`, and
    `thread/unarchive` either lacked the grouping key or forced clients to
    special-case those three responses.
    
    Making `sessionId` part of the reusable `Thread` payload gives every v2
    API surface one place to expose session-tree identity.
    
    ## Mental model
      1. thread.sessionId lives on `Thread`
    2. It is a view/runtime identity for the current live session tree, not
    durable stored lineage metadata
    3. When app-server has a live loaded thread, it copies the real value
    from core’s session_configured.session_id
    4. When it only has stored/unloaded data, it falls back to
    thread.sessionId = thread.id
    
    ## What changed
    
    - Added `sessionId` to the v2
    [`Thread`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server-protocol/src/protocol/v2/thread_data.rs#L105-L109).
    - Removed the duplicate top-level `sessionId` fields from
    `thread/start`, `thread/resume`, and `thread/fork`; clients should now
    read `response.thread.sessionId`.
    - Populated `thread.sessionId` when building live thread responses,
    replaying loaded threads, and returning stored-thread summaries so the
    field is present across start, resume, fork, list, read, rollback,
    metadata-update, unarchive, and `thread/started` paths. See
    [`load_thread_from_resume_source_or_send_internal`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/src/request_processors/thread_processor.rs#L2824-L2918)
    and
    [`thread_from_stored_thread`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/src/request_processors/thread_processor.rs#L3671-L3719).
    - Preserved the stored-thread fallback: if a thread has not been loaded
    into a live session tree yet, `thread.sessionId` falls back to
    `thread.id`; once the thread is live again, the field reports the active
    session tree root.
    - Regenerated the JSON/TypeScript schemas and updated the app-server
    README examples to show
    [`thread.sessionId`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/README.md#L306-L310)
    on the thread object.
  • feat: add session_id (#20437)
    ## Summary
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
    TLDR:
    We update the meaning of session ids and thread ids:
    * thread_id stays as now
    * session_id become a shared id between every thread under a /root
    thread (i.e. every sub-agent share the same session id)
    
    This PR introduces an explicit `SessionId` and threads it through the
    protocol/client boundary so `session_id` and `thread_id` can diverge
    when they need to, while preserving compatibility for older serialized
    `session_configured` events.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-analytics] rework thread_source for thread analytics (#20949)
    ## Summary
    - make `thread_source` an explicit optional thread-level field on
    `thread/start`, `thread/fork`, and returned thread payloads
    - persist `thread_source` in rollout/session metadata so resumed live
    threads retain the original value
    - replace the old best-effort `session_source` -> `thread_source`
    mapping with an explicit caller-supplied analytics classification
    
    ## Why
    Before this change, analytics `thread_source` was populated by a
    best-effort mapping from `session_source`. `session_source` describes
    the runtime/client surface, not the actual thread-level origin, so that
    projection was not accurate enough to distinguish cases such as `user`,
    `subagent`, `memory_consolidation`, and future thread origins reliably.
    
    Making `thread_source` explicit keeps one thread-level analytics field
    while letting callers provide the real classification directly instead
    of recovering it indirectly from `session_source`.
    
    ## Impact
    For new analytics events, `thread_source` now reflects the explicit
    thread-level classification supplied by the caller rather than an
    inferred value derived from `session_source`. Existing protocol fields
    remain optional; callers that omit `threadSource` now produce `null`
    instead of a best-effort inferred value.
    
    ## Validation
    - `just write-app-server-schema`
    - `cargo test -p codex-analytics -p codex-core -p
    codex-app-server-protocol --no-run`
    - `cargo test -p codex-app-server-protocol
    generated_ts_optional_nullable_fields_only_in_params`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `cargo test -p codex-core
    resume_stopped_thread_from_rollout_preserves_thread_source`
  • add turn items view to app-server turns (#21063)
    ## Why
    
    `Turn.items` currently overloads an empty array to mean either that no
    items exist or that the server intentionally did not load them for this
    response. That ambiguity blocks future lazy-loading work where clients
    need to distinguish unloaded, summary, and fully hydrated turn payloads.
    
    ## What changed
    
    - add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full`
    variants
    - add required `itemsView` metadata to app-server `Turn` payloads
    - mark reconstructed persisted history as `full` and live shell-style
    turn payloads as `notLoaded`
    - keep current `thread/turns/list` behavior unchanged and document that
    it still returns `full` turns today
    - regenerate the JSON and TypeScript protocol fixtures
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server thread_read_can_include_turns`
    - `cargo test -p codex-app-server
    thread_turns_list_can_page_backward_and_forward`
    - `cargo test -p codex-app-server
    thread_resume_rejects_history_when_thread_is_running`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fmt`
  • [codex-analytics] add item lifecycle timing (#20514)
    ## Why
    
    Tool families already disagree on what their existing `duration` fields
    mean, so lifecycle latency should live on the shared item envelope
    instead of being inferred from per-tool execution fields. Carrying that
    envelope through app-server notifications gives downstream consumers one
    reusable timing signal without pretending every tool has the same
    execution semantics.
    
    ## What changed
    
    - Adds `started_at_ms` to core `ItemStartedEvent` values and
    `completed_at_ms` to core `ItemCompletedEvent` values.
    - Populates those timestamps in the shared session lifecycle emitters,
    so protocol-native items get timing without each producer tracking its
    own clock state.
    - Exposes `startedAtMs` on app-server `item/started` notifications and
    `completedAtMs` on `item/completed` notifications.
    - Maps the lifecycle timestamps through the app-server boundary while
    leaving legacy-converted notifications nullable when no lifecycle
    timestamp exists.
    - Regenerates the app-server JSON schema and TypeScript fixtures for the
    notification-envelope change and updates downstream fixtures that
    construct those notifications directly.
    - Extends the existing web-search and image-generation integration flows
    to assert the new lifecycle timestamps on the native item events.
    
    ## Verification
    
    - `cargo check -p codex-protocol -p codex-core -p
    codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec
    -p codex-app-server-client`
    - `cargo test -p codex-core --test all web_search_item_is_emitted`
    - `cargo test -p codex-core --test all
    image_generation_call_event_is_emitted`
    - `cargo test -p codex-app-server-protocol`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514).
    * #18748
    * #18747
    * #17090
    * #17089
    * __->__ #20514
  • state: pass state db handles through consumers (#20561)
    ## Why
    
    SQLite state was still being opened from consumer paths, including lazy
    `OnceCell`-backed thread-store call sites. That let one process
    construct multiple state DB connections for the same Codex home, which
    makes SQLite lock contention and `database is locked` failures much
    easier to hit.
    
    State DB lifetime should be chosen by main-like entrypoints and tests,
    then passed through explicitly. Consumers should use the supplied
    `Option<StateDbHandle>` or `StateDbHandle` and keep their existing
    filesystem fallback or error behavior when no handle is available.
    
    The startup path also needs to keep the rollout crate in charge of
    SQLite state initialization. Opening `codex_state::StateRuntime`
    directly bypasses rollout metadata backfill, so entrypoints should
    initialize through `codex_rollout::state_db` and receive a handle only
    after required rollout backfills have completed.
    
    ## What Changed
    
    - Initialize the state DB in main-like entrypoints for CLI, TUI,
    app-server, exec, MCP server, and the thread-manager sample.
    - Pass `Option<StateDbHandle>` through `ThreadManager`,
    `LocalThreadStore`, app-server processors, TUI app wiring, rollout
    listing/recording, personality migration, shell snapshot cleanup,
    session-name lookup, and memory/device-key consumers.
    - Remove the lazy local state DB wrapper from the thread store so
    non-test consumers use only the supplied handle or their existing
    fallback path.
    - Make `codex_rollout::state_db::init` the local state startup path: it
    opens/migrates SQLite, runs rollout metadata backfill when needed, waits
    for concurrent backfill workers up to a bounded timeout, verifies
    completion, and then returns the initialized handle.
    - Keep optional/non-owning SQLite helpers, such as remote TUI local
    reads, as open-only paths that do not run startup backfill.
    - Switch app-server startup from direct
    `codex_state::StateRuntime::init` to the rollout state initializer so
    app-server cannot skip rollout backfill.
    - Collapse split rollout lookup/list APIs so callers use the normal
    methods with an optional state handle instead of `_with_state_db`
    variants.
    - Restore `getConversationSummary(ThreadId)` to delegate through
    `ThreadStore::read_thread` instead of a LocalThreadStore-specific
    rollout path special case.
    - Keep DB-backed rollout path lookup keyed on the DB row and file
    existence, without imposing the filesystem filename convention on
    existing DB rows.
    - Verify readable DB-backed rollout paths against `session_meta.id`
    before returning them, so a stale SQLite row that points at another
    thread's JSONL falls back to filesystem search and read-repairs the DB
    row.
    - Keep `debug prompt-input` filesystem-only so a one-off debug command
    does not initialize or backfill SQLite state just to print prompt input.
    - Keep goal-session test Codex homes alive only in the goal-specific
    helper, rather than leaking tempdirs from the shared session test
    helper.
    - Update tests and call sites to pass explicit state handles where DB
    behavior is expected and explicit `None` where filesystem-only behavior
    is intended.
    
    ## Validation
    
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
    codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
    codex-tui -p codex-exec -p codex-cli --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout state_db_`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout find_thread_path -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout try_init_ -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-rollout`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
    codex-rollout --lib -- -D warnings`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store
    read_thread_falls_back_when_sqlite_path_points_to_another_thread --
    --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-thread-store`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    shell_snapshot`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all personality_migration`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find::find_prefers_sqlite_path_by_id --
    --nocapture`
    - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    --test all rollout_list_find -- --nocapture`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
    interrupt_accounts_active_goal_before_pausing`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server get_auth_status -- --test-threads=1`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
    codex-app-server --lib`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
    -p codex-app-server --tests`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
    -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
    codex-exec -p codex-cli`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
    codex-app-server`
    - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
    codex-rollout`
    - `CODEX_SKIP_VENDORED_BWRAP=1
    CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
    - `just argument-comment-lint -p codex-core`
    - `just argument-comment-lint -p codex-rollout`
    
    Focused coverage added in `codex-rollout`:
    
    - `recorder::tests::state_db_init_backfills_before_returning` verifies
    the rollout metadata row exists before startup init returns.
    - `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
    verifies startup waits for another worker to finish backfill instead of
    disabling the handle for the process.
    -
    `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
    verifies startup does not hang indefinitely on a stuck backfill lease.
    -
    `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
    verifies DB-backed lookup accepts valid existing rollout paths even when
    the filename does not include the thread UUID.
    -
    `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
    verifies DB-backed lookup ignores a stale row whose existing path
    belongs to another thread and read-repairs the row after filesystem
    fallback.
    
    Focused coverage updated in `codex-core`:
    
    - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
    DB-preferred rollout file with matching `session_meta.id`, so it still
    verifies that valid SQLite paths win without depending on stale/empty
    rollout contents.
    
    `cargo test -p codex-app-server thread_list_respects_search_term_filter
    -- --test-threads=1 --nocapture` was attempted locally but timed out
    waiting for the app-server test harness `initialize` response before
    reaching the changed thread-list code path.
    
    `bazel test //codex-rs/thread-store:thread-store-unit-tests
    --test_output=errors` was attempted locally after the thread-store fix,
    but this container failed before target analysis while fetching `v8+`
    through BuildBuddy/direct GitHub. The equivalent local crate coverage,
    including `cargo test -p codex-thread-store`, passes.
    
    A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
    also requires system `libcap.pc` for `codex-linux-sandbox`; the
    follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
    this container.
  • chore(cli) deprecate --full-auto (#20133)
    ## Summary
    Starts the process of getting rid of `--full-auto`, with some
    concessions:
    1. Fully removes the command from the tui, since it just resolves to the
    default permissions there, and encourages users to use the one-time
    trust flow if they're not in a trusted repo.
    2. Marks the command as deprecated in `codex exec`, in case users are
    actively relying on this. We'll remove in an upcoming n+X release.
    3. Cleans up some of the `codex sandbox` cli logic, to keep supporting
    legacy sandbox policies for now.
    
    This isn't the cleanest setup, but I think it is worthwhile to warn
    users for one release before hard-removing it.
    
    ## Testing 
    - [x] Updated unit tests