82 Commits

  • feat(app-server): add history_mode to thread (#29927)
    ## Description
    
    This PR adds a new `historyMode = "legacy" | "paginated"` to `Thread`.
    This will be stored in `SessionMeta` in the JSONL rollout file and as a
    new column in the SQLite thread_metadata table, and exposed on
    `thread/start` and on the `Thread` object in app-server.
    
    ## What changed
    
    - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`,
    defaulting old and new SessionMeta to `legacy`.
    - Carried `history_mode` through core session config, ThreadStore stored
    metadata, local/in-memory stores, rollout metadata extraction, and the
    existing SQLite `threads` table.
    - Added experimental `historyMode` to app-server v2 `Thread` and
    `thread/start`.
    - Made paginated stored threads metadata-discoverable but unsupported
    for legacy full-history reads, `load_history`, live resume, and create
    paths.
    - Regenerated app-server schema fixtures and added
    protocol/state/thread-store/app-server coverage for persistence and
    fail-closed behavior.
    
    ## Compatibility floor
    Because users may be running various versions of Codex binaries on the
    same machine (TUI, Codex App, etc.), we will need to establish a
    compatibility floor for upcoming paginated threads, which will change
    how thread storage reads and writes work.
    
    The overall plan here:
    ```
    Release N:
    - Add historyMode to SessionMeta / Thread / SQLite metadata.
    - Teach binaries to understand paginated threads.
    - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread.
    - Default remains `"legacy"`.
    
    Release N+1:
    - First-party clients start opting into paginated threads where appropriate.
    - Internal dogfood / staged rollout.
    - Measure old-client usage and paginated-thread unsupported errors.
    
    Release N+2:
    - Only after Release N+ is overwhelmingly deployed, make paginated the default.
    - Accept that a small tail of N-1-or-older binaries may not understand paginated threads.
    ```
    
    The important behavior change is fail-closed handling for a binary that
    encounters a persisted `paginated` thread before it knows how to fully
    support paginated history. In app-server, if a thread is `paginated`, we
    will:
    
    - allow metadata-only discovery paths like `thread/list` and
    `thread/read(includeTurns=false)`, so clients can still see the thread
    and inspect its `historyMode`
    - reject legacy full-history/live-thread paths like
    `thread/read(includeTurns=true)` and `thread/resume` with an unsupported
    JSON-RPC error
    - avoid silently treating an unknown or future `historyMode` as `legacy`
    
    Under the hood, the ThreadStore layer also rejects legacy operations
    that would need to load or replay the full thread history for a
    paginated thread. That gives us the behavior we want for Release N:
    future paginated threads are visible, but this binary fails closed
    instead of trying to operate on them as if they were legacy threads.
  • Expose MCP app identity in app context (#29934)
    ## Why
    
    MCP tool-call events need to expose trusted app identity and action
    metadata directly so v2 clients do not have to infer it from tool names
    or resource URIs.
    
    ## What changed
    
    - Add optional `appName`, `templateId`, and `actionName` fields to MCP
    tool-call `appContext`.
    - Populate `appName` and `templateId` from trusted Codex Apps metadata,
    and derive `actionName` from the trusted app resource metadata.
    - Preserve all three fields through core events, legacy protocol events,
    persisted thread history, resume redaction, and app-server v2 responses.
    - Document the public `appContext` fields in
    `codex-rs/app-server/README.md`.
    - Regenerate app-server JSON and TypeScript schemas and add coverage for
    serialization, persistence, redaction, and metadata propagation.
    
    ## Validation
    
    - `just test -p codex-app-server-protocol mcp_tool_call`
    - `just test -p codex-core
    mcp_tool_call_item_metadata_only_trusts_codex_apps_identity
    mcp_tool_call_item_includes_app_identity`
    - `just write-app-server-schema`
    
    ---------
    
    Co-authored-by: Martin Au-Yeung <280153141+martinauyeung-oai@users.noreply.github.com>
  • [codex] rename rollout budget error to session budget error (#29744)
    ## Summary
    
    - rename the rollout-budget exhaustion error from
    `RolloutBudgetExceeded` to `SessionBudgetExceeded`
    - expose the matching app-server v2 wire value as
    `sessionBudgetExceeded`
    - regenerate JSON/TypeScript schema fixtures and update the app-server
    docs and focused tests
    
    This is a naming-only follow-up to #29715 based on [Pavel's review
    suggestion](https://github.com/openai/codex/pull/29715#discussion_r3463183480).
    Runtime behavior is unchanged.
    
    ## Tests
    
    - `just test -p codex-core rollout_budget`
    - `just test -p codex-app-server-protocol`
    - `just fmt`
    - `just write-app-server-schema`
  • [codex] surface rollout budget exhaustion (#29715)
    ## Summary
    - surface shared rollout-budget exhaustion as
    `CodexErr::RolloutBudgetExceeded` instead of a generic interrupted turn
    - map it through the existing `CodexErrorInfo` and app-server v2
    `codexErrorInfo` path
    - keep local compaction from retrying after the shared rollout budget is
    exhausted
    
    This gives app-server clients a stable `rolloutBudgetExceeded` error
    they can classify without guessing from `status="interrupted"`.
    
    ## Tests
    - `just test -p codex-core rollout_budget`
  • core: resolve view_image paths in selected environment (#29526)
    ## Why
    
    view_image needs to support foreign OS remote executors.
    
    ## What
    
    - resolve image paths against the selected environment as `PathUri` and
    read them through that environment's filesystem
    - keep app-server's public path field wire-compatible as
    `LegacyAppPathString`, with purpose-specific UI rendering
    - cover relative and absolute target-native paths in the core
    integration test and run the full `view_image` suite under wine-exec
    without skips
  • core: add extra metadata field to Thread struct (#29675)
    # Summary
    
    Adds a field Thread.extras that can be used to hold arbitrary metadata
    specific to a given thread.
  • chore(core) rm AskForApproval::OnFailure (#28418)
    ## Summary
    Deletes the OnFailure variant of the `AskForApproval` enum. This option
    has been deprecated since #11631.
    
    ## Testing
    - [x] Tests pass
  • app-server: document thread and turn IDs are UUID7 (#27714)
    It's actually a very nice property that these are UUID7s, so documenting
    them so we think twice before changing it away from UUID7s in the
    future.
  • Simplify multi-agent mode controls (#29324)
    ## Why
    
    Multi-agent delegation policy was split across `multiAgentMode`,
    `features.multi_agent_mode`, and `usage_hint_enabled`. These controls
    could disagree: a requested mode could be downgraded by the feature
    flag, and disabling usage hints also disabled mode instructions.
    
    Some clients also need multi-agent tools without adding
    delegation-policy text to model context. The previous two-mode API could
    not express that directly.
    
    ## What changed
    
    `multiAgentMode` is now the only live delegation-policy control:
    
    | Mode | Behavior |
    | --- | --- |
    | `none` | Keep multi-agent tools available without adding mode
    instructions. |
    | `explicitRequestOnly` | Only delegate after an explicit user request.
    |
    | `proactive` | Delegate when parallel work materially improves speed or
    quality. |
    
    - new threads default to `explicitRequestOnly`; omitting the mode on
    later turns keeps the current value
    - thread start, resume, fork, and settings responses always report the
    concrete current mode instead of `null`
    - mode selection remains sticky across turns and resume
    - usage-hint text no longer controls whether mode instructions apply
    - `features.multi_agent_mode` and `usage_hint_enabled` remain accepted
    as ignored compatibility settings so existing configs continue to load
    - app-server documentation and generated schemas describe the three-mode
    API
    
    ## Tests
    
    - `just test -p codex-core multi_agent_mode`
    - `just test -p codex-core multi_agent_v2_config_from_feature_table`
    - `just test -p codex-core spawn_agent_description`
    - `just test -p codex-features`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server multi_agent_mode`
  • Expose thread-level multi-agent mode (#28792)
    ## Why
    
    Once multi-agent mode can be selected per turn, clients also need to
    choose the initial selection when creating a thread and observe that
    selection through lifecycle and settings APIs.
    
    The selected value is intentionally distinct from the effective
    model-visible value: no client selection is represented as `null`, even
    though an eligible multi-agent v2 turn derives `explicitRequestOnly` as
    its effective default.
    
    ## What changed
    
    - Add the optional experimental `thread/start.multiAgentMode` parameter
    and pass it through thread creation.
    - Preserve an omitted initial value as an unset selection rather than
    eagerly storing `explicitRequestOnly`.
    - Apply an explicit `thread/start` selection to the first turn through
    the session configuration established at thread creation.
    - Restore the latest persisted effective mode as the selected baseline
    on cold resume when rollout history contains one.
    - Inherit the optional selected mode from a loaded parent when creating
    related runtime threads.
    - Return the current selected `multiAgentMode` from `thread/start`,
    `thread/resume`, `thread/fork`, and thread settings, using `null` when
    no mode is selected.
    - Keep lifecycle reporting independent from model capability and feature
    eligibility; core turn construction remains responsible for calculating
    and persisting the effective mode.
    
    ## Not covered
    
    - Clearing an existing loaded-session selection back to unset through
    `turn/start`; omitted or `null` currently retains the session's
    selection.
    - A TUI control, slash command, or `config.toml` preference.
    
    ## Verification
    
    - `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol`
    - `CARGO_INCREMENTAL=0 just test -p codex-app-server multi_agent_mode`
    
    The focused app-server coverage verifies explicit `thread/start`
    initialization, first-turn prompting, nullable reporting for an omitted
    selection, and retention of selections that are not currently
    runtime-eligible.
    
    ## Stack
    
    Stacked on #28685. This PR contains only the thread initialization and
    lifecycle/settings API layer.
  • core: load AGENTS.md from foreign environments (#28958)
    ## Why
    
    Make it possible to load AGENTS.md from remote exec-servers whose OS is
    different than app-server.
    
    ## What
    
    - keep `AGENTS.md` discovery and provenance as `PathUri`, with
    root-aware parent and ancestor traversal
    - expose lifecycle instruction sources as legacy app-server path strings
    in events while retaining `PathUri` internally
    - preserve and test mixed POSIX and Windows paths in model context and
    TUI status output
    - cover remote Windows loading end to end by seeding the Wine prefix
    through host filesystem APIs
    - fix bug in `PathUri`'s parent() implementation that would erase
    Windows drive letters
  • Emit Trusted MCP App Identity on Tool-Call Items (#27132)
    ## Summary
    
    - Add optional `appContext` to app-server MCP tool-call items with
    trusted `connectorId`, `linkId`, and `mcpAppResourceUri` metadata.
    - Preserve that context across tool-call events, persisted history,
    reconnects, and thread resume.
    - Keep the deprecated top-level `mcpAppResourceUri` temporarily for
    client migration.
    
    The consumer contract is `{ appContext: { connectorId, linkId,
    mcpAppResourceUri }, tool }`.
    
    ## Validation
    
    - Full GitHub Actions suite passes, including CLA, Bazel tests, clippy,
    release builds, and argument-comment lint.
    
    ---------
    
    Co-authored-by: martinauyeung-oai <280153141+martinauyeung-oai@users.noreply.github.com>
  • unified-exec: retain PathUri in command events (#28780)
    ## Why
    
    App-server must report command events containing foreign-platform paths
    without changing existing client or rollout path-string formats.
    
    ## What changed
    
    - retain `PathUri` through exec command begin/end events
    - convert cwd values to `LegacyAppPathString` at the app-server
    compatibility boundary
    - drop command actions with foreign paths and log them
    - serialize rollout-trace cwd values using their inferred native path
    representation
    - restore Wine coverage for retained Windows cwd values and successful
    completion
  • [codex] Restore thread recency with compatible migration history (#28671)
    ## Summary
    
    - Revert #28655, restoring the thread `recencyAt` behavior introduced by
    #27910.
    - Move `threads_recency_at` to migration 0039 so it no longer collides
    with `external_agent_config_imports` at version 0038.
    - Repair databases that already applied the recency migration as version
    38 by moving the matching migration-history row to version 39 before
    SQLx validation. The current version-38 migration can then apply
    normally.
    
    ## Validation
    
    - `just test -p codex-state
    migrations::tests::repairs_recency_migration_that_was_applied_as_version_38`
    - `just test -p codex-state -p codex-rollout -p codex-thread-store -p
    codex-app-server-protocol -p codex-tui`: 3,439 passed; six TUI tests
    could not open the machine's existing read-only incident database at
    `~/.codex/sqlite/state_5.sqlite`.
    - `just fix -p codex-state`
    - `just fmt`
    - Verified that state migration versions are unique.
  • Revert thread recencyAt for sidebar ordering (#28655)
    ## Why
    
    Revert #27910 to remove the newly introduced thread `recencyAt`
    persistence and API behavior from `main`.
    
    ## What changed
    
    This reverts commit `fac3158c2a783095768076489815f361fa9b0db4`,
    including the state migration, thread-store propagation, app-server API
    surface, generated schemas, and related tests.
    
    ## Validation
    
    Not run before opening; relying on CI for the initial fast signal.
  • Add thread recencyAt for sidebar ordering (#27910)
    ## Summary
    
    Add a server-owned `recencyAt` timestamp and `recency_at` thread-list
    sort key for product recency ordering while preserving the existing
    meaning of `updatedAt` as the latest persisted thread mutation.
    
    This is the server-side alternative to #27697. Rather than narrowing
    `updatedAt`, clients can sort the sidebar by `recency_at` and continue
    treating `updatedAt` as mutation time.
    
    Paired Codex Apps PR:
    [openai/openai#1024599](https://github.com/openai/openai/pull/1024599)
    
    ## Contract
    
    - `recencyAt` initializes when a thread is created.
    - A turn start advances `recencyAt` monotonically.
    - Commentary, agent output, tool results, token/accounting updates, turn
    completion, archive, unarchive, resume, and generic metadata writes do
    not advance it.
    - `updatedAt` retains its existing behavior and continues to advance for
    persisted thread mutations.
    - Current servers populate `recencyAt`; the response field is optional
    in generated TypeScript so clients connected to older servers can fall
    back to `updatedAt`.
    - Filesystem-only fallback uses existing updated/mtime ordering when
    SQLite is unavailable.
    
    ## Persistence and compatibility
    
    Migration 0038 adds second- and millisecond-precision recency columns,
    backfills them from the existing updated timestamp, creates list
    indexes, and includes an insert trigger so older binaries writing to a
    migrated database seed recency without causing later mutations to
    advance it.
    
    Generic metadata upserts preserve existing recency values. Turn-start
    updates use a dedicated monotonic touch, and process-local allocation
    keeps millisecond cursor values unique. State DB list, search, read,
    filtered-list repair, rollout fallback propagation, and app-server
    conversions all carry the new field.
    
    ## API
    
    `Thread` responses include:
    
    ```ts
    recencyAt?: number
    ```
    
    `thread/list` and `thread/search` accept:
    
    ```json
    { "sortKey": "recency_at" }
    ```
    
    Generated TypeScript and JSON schemas are included.
    
    ## Validation
    
    - `just test -p codex-state` — 146 passed
    - `just test -p codex-rollout` — 69 passed
    - `just test -p codex-thread-store` — 81 passed
    - `just test -p codex-app-server-protocol` — 231 passed
    - Focused app-server list ordering, response mapping, archive/unarchive,
    and resume lifecycle tests passed
    - Scoped `just fix` for state, rollout, thread-store,
    app-server-protocol, and app-server
    - `just fmt`
    - `git diff --check`
    - Independent correctness, simplicity, elegance, security, and
    test-quality reviews; actionable ordering, lifecycle, query-projection,
    and timestamp-uniqueness findings were addressed
  • [codex] Add interruptible sleep tool (#28429)
    ## Why
    
    Models sometimes need to pause briefly while waiting for external work,
    but using a shell command for that delay ties the wait to a process and
    does not naturally resume when new turn input arrives.
    
    ## What changed
    
    - add a built-in `sleep` tool behind the under-development `sleep_tool`
    feature
    - accept a bounded `duration_ms` argument, matching the millisecond
    convention used by unified exec
    - end the sleep early when either steered user input or mailbox input
    arrives
    - include elapsed wall-clock time in completed and interrupted outputs
    - emit a dedicated core `SleepItem` through `item/started` and
    `item/completed`
    - expose the sleep item as app-server v2 `ThreadItem::Sleep` and retain
    it in reconstructed thread history
    - regenerate the configuration schema for the new feature flag
    - regenerate app-server JSON and TypeScript schema fixtures
    
    ## Test plan
    
    - `just test -p codex-core sleep_tool_follows_feature_gate`
    - `just test -p codex-core any_new_input_interrupts_sleep`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    sleep_emits_started_and_completed_items`
  • [codex-analytics] add extensible feature thread sources (#27063)
    ## Why
    - `ThreadSource` currently defines a closed set of core-owned values
    - Product features also create threads for background or scheduled work
    - Adding every product-specific value to the core enum would require
    repeated `codex-rs` protocol changes
    - Feature-backed values let product callers provide precise attribution
    while preserving the existing core classifications
    
    ## What Changed
    - Adds `ThreadSource::Feature(String)` for app-owned thread source
    values
    - Represents all app-server v2 thread sources as scalar strings, so a
    feature source is supplied as `"automation"`
    - Persists and emits the feature's plain string label, so `"automation"`
    produces `thread_source="automation"` in analytics
    - Keeps `user`, `subagent`, and `memory_consolidation` as explicit
    core-owned values and regenerates the app-server schemas and TypeScript
    bindings
    
    ## Verification
    - `just write-app-server-schema`
    - `cargo check --workspace`
    - `just test -p codex-protocol
    feature_thread_source_serializes_as_its_app_owned_label`
    - `just test -p codex-app-server-protocol
    thread_sources_round_trip_as_scalar_labels`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `just fmt`
  • multi-agent: add path-based v2 activity tracking (#27007)
    ## Why
    
    Multi-agent v2 identifies agents by canonical paths, but its tool
    handlers still emitted the larger legacy collaboration begin/end events
    built around nickname and role metadata. App-server, rollout-trace,
    analytics, and TUI consumers therefore lacked one compact path-based
    completion signal that behaved consistently across live events and
    replay.
    
    The TUI also needs a bounded `/agent` status surface for v2 agents. It
    should use recent local activity for previews, refresh liveness without
    loading full histories, and keep the legacy picker available when no
    path-backed v2 agent is known.
    
    ## What changed
    
    - Replace the v2 `spawn_agent`, `send_message`, `followup_task`, and
    `interrupt_agent` legacy lifecycle emissions with a success-only
    `SubAgentActivity` event. The event records the tool call ID, occurrence
    time, affected thread, canonical agent path, and `started`,
    `interacted`, or `interrupted` kind.
    - Expose the activity as a completion-only app-server v2
    `subAgentActivity` thread item in live notifications and reconstructed
    history, regenerate the protocol schemas, and count it in sub-agent tool
    analytics.
    - Track canonical paths from live activity and loaded-thread metadata in
    the TUI, and render the activity in live and replayed transcripts.
    - Make `/agent` list running path-backed agents with summaries from
    bounded local event buffers. Each summary is capped at 240 graphemes,
    the scan is capped at six recent items, only the last three wrapped
    lines are shown, and command output is omitted. Liveness falls back to
    metadata-only `thread/read` when local turn state is unavailable.
    - Persist the activity as a terminal rollout-trace runtime payload and
    reduce it to the corresponding spawn, send, follow-up, or close
    interaction edge. `interrupt_agent` is classified as a close-edge
    operation.
    - Preserve the legacy picker when no path-backed v2 agent is known.
    
    ## Compatibility
    
    App-server v2 clients that consumed `collabAgentToolCall` begin/end
    pairs for these tools must handle the new completion-only
    `subAgentActivity` item. Legacy v1 collaboration behavior is unchanged.
    
    ## Screenshot
    
    <img width="684" height="288" alt="Screenshot 2026-06-08 at 15 40 47"
    src="https://github.com/user-attachments/assets/194b3cd0-619d-45fb-b587-cf3e2b1b8a1d"
    />
    
    ## Testing
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-rollout-trace`
    - Added focused coverage for activity analytics, terminal trace
    serialization, spawn-edge reduction, `interrupt_agent` classification,
    TUI status rendering without aggregated command output, and clearing
    stale running state after a completed turn.
  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • store and expose parent_thread_id on Threads (#25113)
    ## Why
    
    This PR
    https://github.com/openai/codex/pull/24161#discussion_r3325692763
    revealed a subagent data modeling issue, where we overloaded
    `forked_from_id` to also mean `parent_thread_id`. That's incorrect since
    guardian and review subagents can be a subagent and NOT fork the main
    thread's history.
    
    The solution here is to explicitly store a new `parent_thread_id` on
    `SessionMeta`, alongside `forked_from_id` which already exists. While
    we're at it, also expose it in the app-server protocol on the `Thread`
    object.
    
    A thread->subagent relationship and a fork of thread history are
    orthogonal concepts.
    
    ## What Changed
    
    - Added top-level `parent_thread_id` persistence on `SessionMeta` and
    runtime/session plumbing through `SessionConfiguredEvent`,
    `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`,
    `TurnContext`, and `ModelClient`.
    - Made turn metadata, request headers, analytics, and subagent-start
    events read the separate runtime/top-level parent field instead of
    deriving general parent lineage from `SessionSource` or
    `forked_from_thread_id`.
    - Passed parent lineage separately at delegated subagent, review,
    guardian, agent-job, and multi-agent spawn construction sites;
    copied-history fork lineage remains derived only from `InitialHistory`.
    - Persisted and exposed parent lineage through rollout/thread-store
    projections and app-server v2 `Thread.parentThreadId`.
    - Updated app-server README text and regenerated app-server schema
    fixtures for the additive `parentThreadId` response field.
  • [codex] Add user input client ids (#24653)
    ## Summary
    
    Adds an optional `clientId` field to app-server v2 `UserInput` and
    carries it through the core `UserInput` model so clients can correlate
    echoed user input items without relying on payload equality.
    
    ## Details
    
    - Adds `client_id: Option<String>` to core `UserInput` variants.
    - Exposes the v2 app-server field as `clientId` on the wire and in
    generated TypeScript.
    - Preserves the id when converting between app-server v2 and core
    protocol types.
    - Regenerates app-server schema fixtures.
    
    ## Validation
    
    - `just fmt`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-protocol`
    - `git diff --check`
  • Restore legacy image detail values (#24644)
    ## Why
    
    Older persisted rollouts can contain `input_image.detail` values of
    `auto` or `low` from before `ImageDetail` was narrowed to
    `high`/`original`. Current deserialization rejects those values, which
    can make resume skip later compacted checkpoints and reconstruct an
    oversized raw suffix before the next compaction attempt.
    
    Confirmed Sentry reports fixed by this compatibility path:
    
    - [CODEX-1H3F](https://openai.sentry.io/issues/7500642496/)
    - [CODEX-1H6N](https://openai.sentry.io/issues/7501025347/)
    - [CODEX-1JDP](https://openai.sentry.io/issues/7504549065/)
    - [CODEX-1HW6](https://openai.sentry.io/issues/7503407986/)
    
    ## Background
    
    [openai/codex#20693](https://github.com/openai/codex/pull/20693) added
    image-detail plumbing for app-server `UserInput` so input images could
    explicitly request `detail: original`. The Slack discussion behind that
    PR was about ScreenSpot / bridge evals where user input images were
    resized, while tool output images already had MCP/code-mode ways to
    request image detail.
    
    In review, the intended new API surface was narrowed to `high` and
    `original`: default to `high`, allow `original` when callers need
    unchanged image handling, and avoid encouraging new `auto` or `low`
    usage. That policy still makes sense for newly emitted values.
    
    The missing compatibility piece is persisted history. Older rollouts can
    already contain `auto` and `low`, and resume reconstructs typed history
    by deserializing those rollout records. Rejecting old values at that
    boundary causes valid compacted checkpoints to be skipped. This PR
    restores `auto` and `low` as real variants so old records deserialize
    and round-trip without being rewritten as `high`, while product paths
    can continue to default to `high` and avoid emitting `auto` for new
    behavior.
    
    ## What changed
    
    - Restored `ImageDetail::Auto` and `ImageDetail::Low` as first-class
    protocol values.
    - Preserved `auto`/`low` through rollout deserialization, MCP image
    metadata, code-mode image output, and schema/type generation.
    - Kept local image byte handling conservative: only `original` switches
    to original-resolution loading; `auto`/`low`/`high` continue through the
    resize-to-fit path while retaining their detail value.
    - Added regression coverage for enum round-tripping and code-mode `low`
    detail handling.
    
    ## Testing
    
    - `just write-app-server-schema`
    - `just test -p codex-protocol`
    - `just test -p codex-tools`
    - `just test -p codex-code-mode`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-core
    suite::rmcp_client::stdio_image_responses_preserve_original_detail_metadata`
    - `just test -p codex-core
    suite::code_mode::code_mode_can_use_mcp_image_result_with_image_helper`
    - Loaded broken rollouts on local fixed builds, and started/completed
    new turns.
    
    I also attempted `just test -p codex-core`; the local broad run did not
    finish green: 2559 tests run, 2467 passed, 55 flaky, 91 failed, 1 timed
    out. The failures were broad timeout/deadline failures across unrelated
    areas; targeted changed-path core tests above passed.
  • [codex] Add plugin id to MCP tool call items (#23737)
    Add owning plugin id to MCP tool call items so we can better filter them
    at plugin level.
    
    ## Summary
    - add optional `plugin_id` to MCP tool-call items and legacy begin/end
    events
    - propagate plugin metadata into emitted core items and app-server v2
    `ThreadItem::McpToolCall`
    - preserve plugin ids through app-server replay/redaction paths and
    regenerate v2 schema fixtures
    
    ## Testing
    - `just write-app-server-schema`
    - `just fmt`
    - `just fix -p codex-core`
    - `cargo test -p codex-protocol -p codex-app-server-protocol`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core mcp_tool_call_item_includes_plugin_id --lib`
    - `cargo check -p codex-tui --tests`
    - `cargo check -p codex-app-server --tests`
    - `git diff --check`
    
    ## Notes
    - `just fix -p codex-core` completed with two non-fatal
    `too_many_arguments` warnings on the touched MCP notification helpers.
    - A broader `cargo test -p codex-core` run passed core unit tests, then
    hit shell/sandbox/snapshot failures in the integration target.
    - A broader app-server downstream run hit the existing
    `in_process::tests::in_process_start_clamps_zero_channel_capacity` stack
    overflow; `cargo test -p codex-exec` also hit the existing sandbox
    expectation mismatch in
    `thread_lifecycle_params_include_legacy_sandbox_when_no_active_profile`.
  • feat(permissions): resolve permission profile inheritance (#22270)
    ## Stack
    
    This is the foundation PR for the permission-profile inheritance stack.
    
    - This PR adds config-level `extends` resolution and merge semantics.
    - Follow-up: #23705 applies resolved profiles at runtime and updates the
    active-profile protocol surfaces.
    
    ## Why
    
    Permission profiles are starting to carry enough policy that
    copy-pasting near-identical definitions becomes hard to review and easy
    to drift. Before the runtime can consume inherited profiles, the config
    layer needs one explicit resolver that can merge parent chains and
    reject unsafe or invalid inheritance shapes.
    
    ## What changed
    
    - Add `extends` to permission-profile TOML and resolve parent chains in
    inheritance order.
    - Merge inherited profile TOML with the existing config merge behavior
    while preserving the permission-specific normalization needed for
    network domain keys.
    - Keep parent descriptions out of resolved child profiles and record
    inherited profile names separately for downstream consumers.
    - Reject undefined parents, unsupported built-in parents, and
    inheritance cycles with targeted errors.
    - Cover resolver behavior with TOML fixture tests and refresh the
    generated config schema.
    
    ## Validation
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-core permissions_profiles_`
  • Preserve image detail in app-server inputs (#20693)
    ## Summary
    
    - Add optional image detail to user image inputs across core, app-server
    v2, thread history/event mapping, and the generated app-server
    schemas/types.
    - Preserve requested detail when serializing Responses image inputs:
    omitted detail stays on the existing `high` default, while explicit
    `original` keeps local images on the original-resolution path.
    - Support `high`/`original` consistently for tool image outputs,
    including MCP `codex/imageDetail`, code-mode image helpers, and
    `view_image`.
  • app-server: stop returning thread permission profiles (#22792)
    ## Why
    
    The app-server thread lifecycle API should no longer expose the full
    `PermissionProfile` value. After the permissions-profile migration,
    clients should round-trip only the active profile identity through
    `activePermissionProfile` and `permissions` when that identity is known.
    
    The full profile is server-side config. Treating a response-derived
    legacy sandbox projection as a new local profile can lose named-profile
    restrictions and accidentally widen permissions on the next turn. The
    legacy `sandbox` response field remains only as the
    compatibility/display fallback.
    
    ## What Changed
    
    - Removed `permissionProfile` from `ThreadStartResponse`,
    `ThreadResumeResponse`, and `ThreadForkResponse`.
    - Stopped populating that field in app-server thread start/resume/fork
    responses.
    - Updated embedded exec/TUI response mapping to derive display
    permission state from local config or the legacy sandbox fallback
    instead of a response profile value.
    - Added a TUI turn override shape that distinguishes preserving server
    permissions, selecting an active profile id, and sending a legacy
    sandbox for an explicit local override.
    - Preserved remote app-server permissions across turns by sending
    `permissions` only when an `activePermissionProfile` id is known, and
    otherwise sending no sandbox override unless the user selected a local
    override.
    - Kept embedded `thread/resume` hydration server-authored when
    `activePermissionProfile` is absent, which matches the live-thread
    attach path where the server ignores requested overrides.
    - Updated the app-server README to remove the obsolete lifecycle
    response `permissionProfile` reference. The remaining
    `permissionProfile` README references are request-side permission
    overrides.
    - Regenerated app-server JSON schema and TypeScript fixtures.
    - Kept the generated typed response enum exempt from
    `large_enum_variant`, matching the existing payload enum exemption after
    the lifecycle response variants shrank.
    
    ## How To Review
    
    Start with `codex-rs/app-server-protocol/src/protocol/v2/thread.rs` to
    confirm the response shape, then check the response construction in
    `codex-rs/app-server/src/request_processors`. The generated schema and
    TypeScript fixture changes are mechanical follow-through from the
    protocol removal.
    
    The TUI behavior is the delicate part: review
    `codex-rs/tui/src/app_server_session.rs` for response hydration and
    turn-start override projection, then
    `codex-rs/tui/src/app/thread_routing.rs` for the decision about whether
    the next turn should preserve the server snapshot, send an active
    profile id, or send a legacy sandbox for an explicit local override.
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol
    thread_lifecycle_responses_default_missing_optional_fields`
    - `cargo test -p codex-exec
    session_configured_from_thread_response_uses_permission_profile_from_config`
    - `cargo test -p codex-tui --lib thread_response`
    - `cargo test -p codex-tui turn_permissions_`
    - `cargo test -p codex-tui
    resume_response_restores_turns_from_thread_items`
    - `cargo test -p codex-analytics
    track_response_only_enqueues_analytics_relevant_responses`
    - `just fix -p codex-analytics`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-tui`
    - `just argument-comment-lint`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22792).
    * #22795
    * __->__ #22792
  • permissions: support workspace roots in profiles (#22610)
    ## Why
    
    This is the configuration/model half of the alternative permissions
    migration we discussed as a comparison point for
    [#22401](https://github.com/openai/codex/pull/22401) and
    [#22402](https://github.com/openai/codex/pull/22402).
    
    The old `workspace-write` model mixes three concerns that we want to
    keep separate:
    - reusable profile rules that should stay immutable once selected
    - user/runtime workspace roots from `cwd`, `--add-dir`, and legacy
    workspace-write config
    - internal Codex writable roots such as memories, which should not be
    shown as user workspace roots
    
    This PR gives permission profiles first-class `workspace_roots` so users
    can opt multiple repositories into the same `:workspace_roots` rules
    without using broad absolute-path write grants. It also starts
    separating the raw selected profile from the effective runtime profile
    by making `Permissions` expose explicit accessors instead of public
    mutable fields.
    
    A representative `config.toml` looks like this:
    
    ```toml
    default_permissions = "dev"
    
    [permissions.dev.workspace_roots]
    "~/code/openai" = true
    "~/code/developers-website" = true
    
    [permissions.dev.filesystem.":workspace_roots"]
    "." = "write"
    ".codex" = "read"
    ".git" = "read"
    ".vscode" = "read"
    ```
    
    If Codex starts in `~/code/codex` with that profile selected, the
    effective workspace-root set becomes:
    - `~/code/codex` from the runtime `cwd`
    - `~/code/openai` from the profile
    - `~/code/developers-website` from the profile
    
    The `:workspace_roots` rules are materialized across each root, so
    `.git`, `.codex`, and `.vscode` stay scoped the same way everywhere.
    Runtime additions such as `--add-dir` can still layer on later stack
    entries without mutating the selected profile.
    
    ## Stack Shape
    
    This PR intentionally stops before the profile-identity cleanup in
    [#22683](https://github.com/openai/codex/pull/22683) so the base review
    stays focused on config loading, workspace-root materialization, and
    compatibility with legacy `workspace-write`.
    
    The representation in this PR is therefore transitional: `Permissions`
    carries enough state to distinguish the raw constrained profile from the
    effective runtime profile, and there are still call sites that must keep
    the active profile identity and constrained profile value in sync. The
    follow-up PR replaces that with a single resolved profile state
    (`ResolvedPermissionProfile` / `PermissionProfileState`) that keeps the
    profile id, immutable `PermissionProfile`, and profile-declared
    workspace roots together. That follow-up removes APIs such as
    `set_constrained_permission_profile_with_active_profile()` where
    separate arguments could drift out of sync.
    
    Downstream PRs then build on this base to switch app-server turn updates
    to profile ids plus runtime workspace roots and to finish the
    user-visible summary behavior. Reviewers should judge this PR as the
    workspace-roots foundation, not as the final in-memory shape of selected
    permission profiles.
    
    ## Review Guide
    
    Suggested review order:
    
    1. Start with `codex-rs/core/src/config/mod.rs`.
    This is the main shape change in the base slice. `Permissions` now
    stores a private raw `Constrained<PermissionProfile>` plus runtime
    `workspace_roots`. Callers use `permission_profile()` when they need the
    raw constrained value and `effective_permission_profile()` when they
    need a materialized runtime profile. As noted above,
    [#22683](https://github.com/openai/codex/pull/22683) replaces this
    transitional shape with a resolved profile state that keeps identity and
    profile data together.
    
    2. Review `codex-rs/config/src/permissions_toml.rs` and
    `codex-rs/core/src/config/permissions.rs`.
    These add `[permissions.<id>.workspace_roots]`, resolve enabled entries
    relative to the policy cwd, and keep `:workspace_roots` deny-read glob
    patterns symbolic until the actual roots are known.
    
    3. Review `codex-rs/protocol/src/permissions.rs` and
    `codex-rs/protocol/src/models.rs`.
    These add the policy/profile materialization helpers that expand exact
    `:workspace_roots` entries and scoped deny-read globs over every
    workspace root. This is also where `ActivePermissionProfileModification`
    is removed from the core model.
    
    4. Review the legacy bridge in
    `Config::load_from_base_config_with_overrides` and
    `Config::set_legacy_sandbox_policy`.
    This is where legacy `workspace-write` roots become runtime workspace
    roots, while Codex internal writable roots stay internal and do not
    appear as user-facing workspace roots.
    
    5. Then skim downstream call sites.
    The interesting pattern is raw-vs-effective access: state/proxy/bwrap
    paths keep the raw constrained profile, while execution, summaries, and
    user-visible status use the effective profile and workspace-root list.
    
    ## What Changed
    
    - added `[permissions.<id>.workspace_roots]` to the config model and
    schema
    - added runtime `workspace_roots` state to `Config`/`Permissions` and
    `ConfigOverrides`
    - made `Permissions` profile fields private and replaced direct mutation
    with accessors/setters
    - added `PermissionProfile` and `FileSystemSandboxPolicy` helpers for
    materializing `:workspace_roots` exact paths and deny-read globs across
    all roots
    - moved legacy additional writable roots into runtime workspace-root
    state instead of active profile modifications
    - removed `ActivePermissionProfileModification` and its app-server
    protocol/schema export
    - updated sandbox/status summary paths so internal writable roots are
    not reported as user workspace roots
    
    ## Verification Strategy
    
    The targeted tests cover the behavior at the layers where regressions
    are most likely:
    - `codex-rs/core/src/config/config_tests.rs` verifies config loading,
    legacy workspace-root seeding, effective profile materialization, and
    memory-root handling.
    - `codex-rs/core/src/config/permissions_tests.rs` verifies profile
    `workspace_roots` parsing and `:workspace_roots` scoped/glob
    compilation.
    - `codex-rs/protocol/src/permissions.rs` unit tests verify exact and
    glob materialization over multiple workspace roots.
    - `codex-rs/tui/src/status/tests.rs` and
    `codex-rs/utils/sandbox-summary/src/sandbox_summary.rs` verify the
    user-facing summaries show effective workspace roots and hide internal
    writes.
    
    I also ran `cargo check --tests` locally after the latest stack refresh
    to catch cross-crate API breakage from the private-field/accessor
    changes.
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22610).
    * #22612
    * #22611
    * #22683
    * __->__ #22610
  • 2- Use string service tiers in session protocol (#20971)
    ## Summary
    - break service tier session/op/app-server protocol fields from the
    closed enum to string tier ids
    - send the service tier string directly through model requests, prewarm,
    compaction, memories, and TUI/app-server turn starts
    - regenerate app-server protocol JSON/TypeScript schemas, removing the
    standalone ServiceTier TS enum
    
    ## Verification
    - just fmt
    - cargo check -p codex-core -p codex-app-server -p codex-tui
    - just write-app-server-schema
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat(app-server): move v2 sessionId onto Thread (#21336)
    ## Why
    
    `session_id` and `thread_id` are separate identities after #20437, but
    app-server only surfaced `sessionId` on the `thread/start`,
    `thread/resume`, and `thread/fork` response envelopes. Other
    thread-bearing surfaces such as `thread/list`, `thread/read`,
    `thread/started`, `thread/rollback`, `thread/metadata/update`, and
    `thread/unarchive` either lacked the grouping key or forced clients to
    special-case those three responses.
    
    Making `sessionId` part of the reusable `Thread` payload gives every v2
    API surface one place to expose session-tree identity.
    
    ## Mental model
      1. thread.sessionId lives on `Thread`
    2. It is a view/runtime identity for the current live session tree, not
    durable stored lineage metadata
    3. When app-server has a live loaded thread, it copies the real value
    from core’s session_configured.session_id
    4. When it only has stored/unloaded data, it falls back to
    thread.sessionId = thread.id
    
    ## What changed
    
    - Added `sessionId` to the v2
    [`Thread`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server-protocol/src/protocol/v2/thread_data.rs#L105-L109).
    - Removed the duplicate top-level `sessionId` fields from
    `thread/start`, `thread/resume`, and `thread/fork`; clients should now
    read `response.thread.sessionId`.
    - Populated `thread.sessionId` when building live thread responses,
    replaying loaded threads, and returning stored-thread summaries so the
    field is present across start, resume, fork, list, read, rollback,
    metadata-update, unarchive, and `thread/started` paths. See
    [`load_thread_from_resume_source_or_send_internal`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/src/request_processors/thread_processor.rs#L2824-L2918)
    and
    [`thread_from_stored_thread`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/src/request_processors/thread_processor.rs#L3671-L3719).
    - Preserved the stored-thread fallback: if a thread has not been loaded
    into a live session tree yet, `thread.sessionId` falls back to
    `thread.id`; once the thread is live again, the field reports the active
    session tree root.
    - Regenerated the JSON/TypeScript schemas and updated the app-server
    README examples to show
    [`thread.sessionId`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/README.md#L306-L310)
    on the thread object.
  • feat: add session_id (#20437)
    ## Summary
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
    TLDR:
    We update the meaning of session ids and thread ids:
    * thread_id stays as now
    * session_id become a shared id between every thread under a /root
    thread (i.e. every sub-agent share the same session id)
    
    This PR introduces an explicit `SessionId` and threads it through the
    protocol/client boundary so `session_id` and `thread_id` can diverge
    when they need to, while preserving compatibility for older serialized
    `session_configured` events.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-analytics] rework thread_source for thread analytics (#20949)
    ## Summary
    - make `thread_source` an explicit optional thread-level field on
    `thread/start`, `thread/fork`, and returned thread payloads
    - persist `thread_source` in rollout/session metadata so resumed live
    threads retain the original value
    - replace the old best-effort `session_source` -> `thread_source`
    mapping with an explicit caller-supplied analytics classification
    
    ## Why
    Before this change, analytics `thread_source` was populated by a
    best-effort mapping from `session_source`. `session_source` describes
    the runtime/client surface, not the actual thread-level origin, so that
    projection was not accurate enough to distinguish cases such as `user`,
    `subagent`, `memory_consolidation`, and future thread origins reliably.
    
    Making `thread_source` explicit keeps one thread-level analytics field
    while letting callers provide the real classification directly instead
    of recovering it indirectly from `session_source`.
    
    ## Impact
    For new analytics events, `thread_source` now reflects the explicit
    thread-level classification supplied by the caller rather than an
    inferred value derived from `session_source`. Existing protocol fields
    remain optional; callers that omit `threadSource` now produce `null`
    instead of a best-effort inferred value.
    
    ## Validation
    - `just write-app-server-schema`
    - `cargo test -p codex-analytics -p codex-core -p
    codex-app-server-protocol --no-run`
    - `cargo test -p codex-app-server-protocol
    generated_ts_optional_nullable_fields_only_in_params`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `cargo test -p codex-core
    resume_stopped_thread_from_rollout_preserves_thread_source`
  • add turn items view to app-server turns (#21063)
    ## Why
    
    `Turn.items` currently overloads an empty array to mean either that no
    items exist or that the server intentionally did not load them for this
    response. That ambiguity blocks future lazy-loading work where clients
    need to distinguish unloaded, summary, and fully hydrated turn payloads.
    
    ## What changed
    
    - add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full`
    variants
    - add required `itemsView` metadata to app-server `Turn` payloads
    - mark reconstructed persisted history as `full` and live shell-style
    turn payloads as `notLoaded`
    - keep current `thread/turns/list` behavior unchanged and document that
    it still returns `full` turns today
    - regenerate the JSON and TypeScript protocol fixtures
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server thread_read_can_include_turns`
    - `cargo test -p codex-app-server
    thread_turns_list_can_page_backward_and_forward`
    - `cargo test -p codex-app-server
    thread_resume_rejects_history_when_thread_is_running`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fmt`
  • app-server-protocol: mark permission profiles experimental (#19899)
    ## Why
    
    `PermissionProfile` is now the canonical internal permissions
    representation, but the app-server wire shape is still intentionally
    unstable while the migration continues. Stable app-server clients should
    not see or generate code for these fields until the wire format settles.
    
    ## What changed
    
    - Marks every app-server v2 field that sends `PermissionProfile` as
    experimental, including `command/exec`, `thread/start`, `thread/resume`,
    `thread/fork`, and `turn/start` request/response payloads.
    - Enables per-field experimental inspection for `command/exec`, so
    `permissionProfile` is gated without making the entire method
    experimental.
    - Fixes the generated TypeScript schema filter to be comment-aware. The
    previous scanner treated apostrophes inside doc comments as string
    delimiters, so some experimental fields leaked into stable TypeScript
    even though stable JSON was filtered correctly.
    
    ## Verification
    
    - `cargo test -p codex-app-server-protocol`
    
    
    
    
    
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19899).
    * #19900
    * __->__ #19899
  • permissions: remove cwd special path (#19841)
    ## Why
    
    The experimental `PermissionProfile` API had both `:cwd` and
    `:project_roots` special filesystem paths, which made the permission
    root ambiguous. This PR removes the unstable `current_working_directory`
    special path before the permissions API is stabilized, so callers use
    `:project_roots` for symbolic project-root access.
    
    ## What changed
    
    - Removes `FileSystemSpecialPath::CurrentWorkingDirectory` from protocol
    and app-server protocol models, plus regenerated app-server
    JSON/TypeScript schemas.
    - Replaces internal `:cwd` permission entries with `:project_roots`
    entries.
    - Keeps the existing cwd-update behavior for legacy-shaped
    workspace-write profiles, while removing the deleted
    `CurrentWorkingDirectory` case from that compatibility path.
    - Keeps `PermissionProfile::workspace_write()` as the reusable symbolic
    workspace-write helper, with docs noting that `:project_roots` entries
    resolve at enforcement time.
    - Updates app-server docs/examples and approval UI labeling to stop
    advertising `:cwd` as a permission token.
    
    ## Compatibility
    
    Persisted rollout items may contain the old
    `{"kind":"current_working_directory"}` tag from earlier experimental
    `permissionProfile` snapshots. This PR keeps that tag as a
    deserialize-only alias for `ProjectRoots { subpath: None }`, while
    continuing to serialize only the new `project_roots` tag.
    
    ## Follow-up
    
    This PR intentionally does not introduce an explicit project-root set on
    `SessionConfiguration` or runtime sandbox resolution. Today, the
    resolver still uses the active cwd as the single implicit project root.
    A follow-up should model project roots separately from tool cwd so
    `:project_roots` entries can resolve against the configured project
    roots, and resolve to no entries when there are no project roots.
    
    ## Verification
    
    - `cargo test -p codex-protocol permissions:: --lib`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-sandboxing -p codex-exec-server --lib`
    - `cargo test -p codex-core session_configuration_apply_ --lib`
    - `cargo test -p codex-app-server
    command_exec_permission_profile_project_roots_use_command_cwd --test
    all`
    - `cargo test -p codex-tui
    thread_read_session_state_does_not_reuse_primary_permission_profile
    --lib`
    - `cargo test -p codex-tui
    preset_matching_accepts_workspace_write_with_extra_roots --lib`
    - `cargo test -p codex-config --lib`
  • permissions: remove legacy read-only access modes (#19449)
    ## Why
    
    `ReadOnlyAccess` was a transitional legacy shape on `SandboxPolicy`:
    `FullAccess` meant the historical read-only/workspace-write modes could
    read the full filesystem, while `Restricted` tried to carry partial
    readable roots. The partial-read model now belongs in
    `FileSystemSandboxPolicy` and `PermissionProfile`, so keeping it on
    `SandboxPolicy` makes every legacy projection reintroduce lossy
    read-root bookkeeping and creates unnecessary noise in the rest of the
    permissions migration.
    
    This PR makes the legacy policy model narrower and explicit:
    `SandboxPolicy::ReadOnly` and `SandboxPolicy::WorkspaceWrite` represent
    the old full-read sandbox modes only. Split readable roots, deny-read
    globs, and platform-default/minimal read behavior stay in the runtime
    permissions model.
    
    ## What changed
    
    - Removes `ReadOnlyAccess` from
    `codex_protocol::protocol::SandboxPolicy`, including the generated
    `access` and `readOnlyAccess` API fields.
    - Updates legacy policy/profile conversions so restricted filesystem
    reads are represented only by `FileSystemSandboxPolicy` /
    `PermissionProfile` entries.
    - Keeps app-server v2 compatible with legacy `fullAccess` read-access
    payloads by accepting and ignoring that no-op shape, while rejecting
    legacy `restricted` read-access payloads instead of silently widening
    them to full-read legacy policies.
    - Carries Windows sandbox platform-default read behavior with an
    explicit override flag instead of depending on
    `ReadOnlyAccess::Restricted`.
    - Refreshes generated app-server schema/types and updates tests/docs for
    the simplified legacy policy shape.
    
    ## Verification
    
    - `cargo check -p codex-app-server-protocol --tests`
    - `cargo check -p codex-windows-sandbox --tests`
    - `cargo test -p codex-app-server-protocol sandbox_policy_`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19449).
    * #19395
    * #19394
    * #19393
    * #19392
    * #19391
    * __->__ #19449
  • permissions: make profiles represent enforcement (#19231)
    ## Why
    
    `PermissionProfile` is becoming the canonical permissions abstraction,
    but the old shape only carried optional filesystem and network fields.
    It could describe allowed access, but not who is responsible for
    enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy
    when profiles were exported, cached, or round-tripped through app-server
    APIs.
    
    The important model change is that active permissions are now a disjoint
    union over the enforcement mode. Conceptually:
    
    ```rust
    pub enum PermissionProfile {
        Managed {
            file_system: FileSystemSandboxPolicy,
            network: NetworkSandboxPolicy,
        },
        Disabled,
        External {
            network: NetworkSandboxPolicy,
        },
    }
    ```
    
    This distinction matters because `Disabled` means Codex should apply no
    outer sandbox at all, while `External` means filesystem isolation is
    owned by an outside caller. Those are not equivalent to a broad managed
    sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an
    inner sandbox may require the outer Codex layer to use no sandbox rather
    than a permissive one.
    
    ## How Existing Modeling Maps
    
    Legacy `SandboxPolicy` remains a boundary projection, but it now maps
    into the higher-fidelity profile model:
    
    - `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed`
    with restricted filesystem entries plus the corresponding network
    policy.
    - `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving
    the “no outer sandbox” intent instead of treating it as a lax managed
    sandbox.
    - `ExternalSandbox { network_access }` maps to
    `PermissionProfile::External { network }`, preserving external
    filesystem enforcement while still carrying the active network policy.
    - Split runtime policies that legacy `SandboxPolicy` cannot faithfully
    express, such as managed unrestricted filesystem plus restricted
    network, stay `Managed` instead of being collapsed into
    `ExternalSandbox`.
    - Per-command/session/turn grants remain partial overlays via
    `AdditionalPermissionProfile`; full `PermissionProfile` is reserved for
    complete active runtime permissions.
    
    ## What Changed
    
    - Change active `PermissionProfile` into a tagged union: `managed`,
    `disabled`, and `external`.
    - Keep partial permission grants separate with
    `AdditionalPermissionProfile` for command/session/turn overlays.
    - Represent managed filesystem permissions as either `restricted`
    entries or `unrestricted`; `glob_scan_max_depth` is non-zero when
    present.
    - Preserve old rollout compatibility by accepting the pre-tagged `{
    network, file_system }` profile shape during deserialization.
    - Preserve fidelity for important edge cases: `DangerFullAccess`
    round-trips as `disabled`, `ExternalSandbox` round-trips as `external`,
    and managed unrestricted filesystem + restricted network stays managed
    instead of being mistaken for external enforcement.
    - Preserve configured deny-read entries and bounded glob scan depth when
    full profiles are projected back into runtime policies, including
    unrestricted replacements that now become `:root = write` plus deny
    entries.
    - Regenerate the experimental app-server v2 JSON/TypeScript schema and
    update the `command/exec` README example for the tagged
    `permissionProfile` shape.
    
    ## Compatibility
    
    Legacy `SandboxPolicy` remains available at config/API boundaries as the
    compatibility projection. Existing rollout lines with the old
    `PermissionProfile` shape continue to load. The app-server
    `permissionProfile` field is experimental, so its v2 wire shape is
    intentionally updated to match the higher-fidelity model.
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo check --tests`
    - `cargo test -p codex-protocol permission_profile`
    - `cargo test -p codex-protocol
    preserving_deny_entries_keeps_unrestricted_policy_enforceable`
    - `cargo test -p codex-app-server-protocol
    permission_profile_file_system_permissions`
    - `cargo test -p codex-app-server-protocol serialize_client_response`
    - `cargo test -p codex-core
    session_configured_reports_permission_profile_for_external_sandbox`
    - `just fix`
    - `just fix -p codex-protocol`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-core`
    - `just fix -p codex-app-server`
  • Add safety check notification and error handling (#19055)
    Adds a new app-server notification that fires when a user account has
    been flagged for potential safety reasons.
  • Rebrand approvals reviewer config to auto-review (#18504)
    ### Why
    
    Auto-review is the user-facing name for the approvals reviewer, but the
    config/API value still exposed the old `guardian_subagent` name. That
    made new configs and generated schemas point users at Guardian
    terminology even though the intended product surface is Auto-review.
    
    This PR updates the external `approvals_reviewer` value while preserving
    compatibility for existing configs and clients.
    
    ### What changed
    
    - Makes `auto_review` the canonical serialized value for
    `approvals_reviewer`.
    - Keeps `guardian_subagent` accepted as a legacy alias.
    - Keeps `user` accepted and serialized as `user`.
    - Updates generated config and app-server schemas so
    `approvals_reviewer` includes:
      - `user`
      - `auto_review`
      - `guardian_subagent`
    - Updates app-server README docs for the reviewer value.
    - Updates analytics and config requirements tests for the canonical
    auto_review value.
    
    
    ### Compatibility
    
    Existing configs and API payloads using:
    
    ```toml
    approvals_reviewer = "guardian_subagent"
    ```
    
    continue to load and map to the Auto-review reviewer behavior. 
    
    New serialization emits: 
    ```toml
    approvals_reviewer = "auto_review" 
    ```
    
    This PR intentionally does not rename the [features].guardian_approval
    key or broad internal Guardian symbols. Those are split out for a
    follow-up PR to keep this migration small and avoid touching large
    TUI/internal surfaces.
    
    **Verification**
    cargo test -p codex-protocol
    approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
    cargo test -p codex-app-server-protocol
    approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
  • app-server: expose thread permission profiles (#18278)
    ## Why
    
    The `PermissionProfile` migration needs app-server clients to see the
    same constrained permission model that core is using at runtime. Before
    this PR, thread lifecycle responses only exposed the legacy
    `SandboxPolicy` shape, so clients still had to infer active permissions
    from sandbox fields. That makes downstream resume, fork, and override
    flows harder to make `PermissionProfile`-first.
    
    External sandbox policies are intentionally excluded from this canonical
    view. External enforcement cannot be round-tripped as a
    `PermissionProfile`, and exposing a lossy root-write profile would let
    clients accidentally change sandbox semantics if they echo the profile
    back later.
    
    ## What changed
    
    - Adds the app-server v2 `PermissionProfile` wire shape, including
    filesystem permissions and glob scan depth metadata.
    - Adds `PermissionProfileNetworkPermissions` so the profile response
    does not expose active network state through the older
    additional-permissions naming.
    - Returns `permissionProfile` from thread start, resume, and fork
    responses when the active sandbox can be represented as a
    `PermissionProfile`.
    - Keeps legacy `sandbox` in those responses for compatibility and
    documents `permissionProfile` as canonical when present.
    - Makes lifecycle `permissionProfile` nullable and returns `null` for
    `ExternalSandbox` to avoid exposing a lossy profile.
    - Regenerates the app-server JSON schema and TypeScript fixtures.
    
    ## Verification
    
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server
    thread_response_permission_profile_omits_external_sandbox --
    --nocapture`
    - `cargo check --tests -p codex-analytics -p codex-exec -p codex-tui`
    - `just fix -p codex-app-server-protocol -p codex-app-server -p
    codex-analytics -p codex-exec -p codex-tui`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18278).
    * #18279
    * __->__ #18278
  • [tool search] support namespaced deferred dynamic tools (#18413)
    Deferred dynamic tools need to round-trip a namespace so a tool returned
    by `tool_search` can be called through the same registry key that core
    uses for dispatch.
    
    This change adds namespace support for dynamic tool specs/calls,
    persists it through app-server thread state, and routes dynamic tool
    calls by full `ToolName` while still sending the app the leaf tool name.
    Deferred dynamic tools must provide a namespace; non-deferred dynamic
    tools may remain top-level.
    
    It also introduces `LoadableToolSpec` as the shared
    function-or-namespace Responses shape used by both `tool_search` output
    and dynamic tool registration, so dynamic tools use the same wrapping
    logic in both paths.
    
    Validation:
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core tool_search`
    
    ---------
    
    Co-authored-by: Sayan Sisodiya <sayan@openai.com>
  • [codex][mcp] Add resource uri meta to tool call item. (#17831)
    - [x] Add resource uri meta to tool call item so that the app-server
    client can start prefetching resources immediately without loading mcp
    server status.
  • Spread AbsolutePathBuf (#17792)
    Mechanical change to promote absolute paths through code.
  • Expose instruction sources (AGENTS.md) via app server (#17506)
    Addresses #17498
    
    Problem: The TUI derived /status instruction source paths from the local
    client environment, which could show stale <none> output or incorrect
    paths when connected to a remote app server.
    
    Solution: Add an app-server v2 instructionSources snapshot to thread
    start/resume/fork responses, default it to an empty list when older
    servers omit it, and render TUI /status from that server-provided
    session data.
    
    Additional context: The app-server field is intentionally named
    instructionSources rather than AGENTS.md-specific terminology because
    the loaded instruction sources can include global instructions, project
    AGENTS.md files, AGENTS.override.md, user-defined instruction files, and
    future dynamic sources.
  • [codex-analytics] add protocol-native turn timestamps (#16638)
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
    * #16870
    * #16706
    * #16659
    * #16641
    * #16640
    * __->__ #16638
  • Fix fork source display in /status (expose forked_from_id in app server) (#16596)
    Addresses #16560
    
    Problem: `/status` stopped showing the source thread id in forked TUI
    sessions after the app-server migration.
    
    Solution: Carry fork source ids through app-server v2 thread data and
    the TUI session adapter, and update TUI fixtures so `/status` matches
    the old TUI behavior.
  • tui: queue follow-ups during manual /compact (#15259)
    ## Summary
    - queue input after the user submits `/compact` until that manual
    compact turn ends
    - mirror the same behavior in the app-server TUI
    - add regressions for input queued before compact starts and while it is
    running
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: change multi-agent to use path-like system instead of uuids (#15313)
    This PR add an URI-based system to reference agents within a tree. This
    comes from a sync between research and engineering.
    
    The main agent (the one manually spawned by a user) is always called
    `/root`. Any sub-agent spawned by it will be `/root/agent_1` for example
    where `agent_1` is chosen by the model.
    
    Any agent can contact any agents using the path.
    
    Paths can be used either in absolute or relative to the calling agents
    
    Resume is not supported for now on this new path