890 Commits

  • [codex] Use model metadata for skills usage instructions (#29740)
    ## Summary
    
    - add a false-by-default `include_skills_usage_instructions` model
    metadata field
    - enable the field for the bundled `gpt-5.5` model metadata
    - consume the metadata in both core and extension skill rendering
    - remove hardcoded legacy-model matching and its marker plumbing
  • [codex] Enable remote plugins by default (#30297)
    ## Summary
    
    - enable the remote plugin feature by default
    - promote the remote plugin feature from under development to stable
    - preserve the existing `features.remote_plugin` override for explicitly
    disabling it
    - keep legacy disabled-path coverage explicit in TUI and app-server
    tests
    
    ## Impact
    
    Remote plugin functionality is enabled by default for configurations
    that do not set the feature flag. The existing Codex backend
    authentication gate still applies.
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-features`
    - `just test -p codex-tui
    plugins_popup_remote_section_fallback_states_snapshot`
    - targeted `codex-app-server` plugin-list and skills-list tests
    - `git diff --check`
    
    The full TUI and app-server suites were also exercised locally. All
    remote-plugin-related coverage passed; unrelated local
    sandbox/test-binary failures remain outside this change.
  • [app-server] expose environment info RPC (#30291)
    ## Why
    
    App-server clients that configure named execution environments need to
    discover an environment's shell and working directory before selecting
    it for a thread or turn. Because the environment can run on a different
    operating system than app-server, its working directory is represented
    as a canonical `file:` URI rather than a host-local path string. The
    probe also needs a bounded response time: an exec-server that completes
    initialization but never answers `environment/info` must not hold the
    environment serialization queue indefinitely.
    
    ## What changed
    
    - Add an experimental `environment/info` app-server RPC for named
    environments.
    - Route the probe through the managed environment connection and return
    target-native shell metadata plus the default working directory as a
    `PathUri`.
    - Return connection and protocol failures as JSON-RPC errors.
    - Bound the exec-server probe response to 30 seconds and remove
    timed-out calls from the pending-request table so later environment
    mutations can proceed.
    - Cover successful responses, omitted working directories, unknown
    environments, connection failures, and pending-call cleanup.
    
    ## Protocol examples
    
    Request:
    
    ```json
    {
      "id": 42,
      "method": "environment/info",
      "params": {
        "environmentId": "remote-a"
      }
    }
    ```
    
    Successful response:
    
    ```json
    {
      "id": 42,
      "result": {
        "shell": {
          "name": "zsh",
          "path": "/bin/zsh"
        },
        "cwd": "file:///workspace"
      }
    }
    ```
    
    If the exec-server initializes but does not answer the probe within 30
    seconds:
    
    ```json
    {
      "id": 42,
      "error": {
        "code": -32603,
        "message": "failed to get info for environment `remote-a`: exec-server protocol error: timed out waiting for exec-server `environment/info` response after 30s"
      }
    }
    ```
    
    ## Testing
    
    - App-server integration coverage for successful info (including omitted
    `cwd`), unknown environments, and connection failures.
    - Exec-server RPC coverage verifying a timed-out call is removed from
    the pending-request table.
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • app-server: structure and test JSON shutdown logs (#30314)
    ## Why
    
    `LOG_FORMAT=json` and `RUST_LOG` are supported by app-server, but the
    behavior was only covered indirectly. We should verify the actual JSONL
    written by both user-facing entry points: `codex app-server` and the
    standalone `codex-app-server` binary.
    
    The existing processor shutdown message also always said the channel
    closed, even though the processor can exit for several different
    reasons. Structured fields make that event more accurate and useful to
    log consumers.
    
    ## What changed
    
    - Record the processor `exit_reason`, remaining connection count, and
    forced-shutdown state as structured tracing fields.
    - Add a shared process-test helper that enables JSON logging, validates
    every stderr line as JSON, and verifies the top-level timestamp is RFC
    3339.
    - Cover both `codex app-server` and `codex-app-server`, asserting the
    stable `level`, `fields`, and `target` payload.
    
    ## Test plan
    
    - `just test -p codex-app-server
    standalone_app_server_emits_json_info_events`
    - `just test -p codex-cli app_server_emits_json_info_events`
  • feat(app-server): add optional turn_id to thread/fork (#30277)
    ## Description
    
    This adds stable optional `turnId` support to `thread/fork`. When
    supplied, the fork copies persisted history through that terminal turn,
    inclusive, and drops later turns from the new thread.
    
    Omitting or passing `null` preserves the existing full-history fork
    behavior, including the interruption marker when the stored source
    history ends mid-turn.
    
    ## Why
    
    We're deprecating `thread/rollback` and this will help certain UX use
    cases work around it by using `thread/fork` + `turn_id` instead.
  • [codex] allow AGENTS.md and skills to authorize delegation (#30274)
    Prompt update of MAv2 to include agents.md and skills more explicitly
    
    should mimic: https://github.com/openai/codex/pull/27919
  • [codex] Add managed new-thread model settings (#29683)
    ## Why
    
    Admins need persistent defaults for the model, reasoning effort, and
    service tier shown when the Desktop App creates a new thread. These are
    initialization defaults rather than runtime constraints: the App should
    use them to initialize its draft while still allowing a user to make an
    explicit selection.
    
    The app-server therefore needs to expose the managed values before
    thread creation without changing `thread/start` behavior for other
    clients.
    
    ## What changed
    
    - Parse `model`, `model_reasoning_effort`, and `service_tier` from
    `[models.new_thread]` in `requirements.toml`.
    - Compose the `models` requirements through the existing
    requirements-layer precedence rules.
    - Expose the resolved values through `configRequirements/read` as
    `requirements.models.newThread`.
    - Add the corresponding app-server protocol types and regenerate the
    JSON and TypeScript schema fixtures.
    - Document the new `configRequirements/read` fields in the app-server
    README.
    
    ## Scope
    
    This PR is data plumbing only. It does not apply these values during
    `thread/start` and does not change thread creation for existing
    app-server clients, resumed or forked sessions, internal or subagent
    sessions, `codex exec`, or the TUI. A companion Desktop App change owns
    draft initialization, sends the effective settings for ordinary and
    prewarmed starts, and preserves explicit user changes.
    
    ## Validation
    
    - Requirements deserialization coverage for `[models.new_thread]`
    - Requirements-layer precedence coverage
    - App-server API mapping coverage
    - `configRequirements/read` integration coverage
    - Regenerated app-server JSON and TypeScript schema fixtures
  • feat(app-server): add history_mode to thread (#29927)
    ## Description
    
    This PR adds a new `historyMode = "legacy" | "paginated"` to `Thread`.
    This will be stored in `SessionMeta` in the JSONL rollout file and as a
    new column in the SQLite thread_metadata table, and exposed on
    `thread/start` and on the `Thread` object in app-server.
    
    ## What changed
    
    - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`,
    defaulting old and new SessionMeta to `legacy`.
    - Carried `history_mode` through core session config, ThreadStore stored
    metadata, local/in-memory stores, rollout metadata extraction, and the
    existing SQLite `threads` table.
    - Added experimental `historyMode` to app-server v2 `Thread` and
    `thread/start`.
    - Made paginated stored threads metadata-discoverable but unsupported
    for legacy full-history reads, `load_history`, live resume, and create
    paths.
    - Regenerated app-server schema fixtures and added
    protocol/state/thread-store/app-server coverage for persistence and
    fail-closed behavior.
    
    ## Compatibility floor
    Because users may be running various versions of Codex binaries on the
    same machine (TUI, Codex App, etc.), we will need to establish a
    compatibility floor for upcoming paginated threads, which will change
    how thread storage reads and writes work.
    
    The overall plan here:
    ```
    Release N:
    - Add historyMode to SessionMeta / Thread / SQLite metadata.
    - Teach binaries to understand paginated threads.
    - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread.
    - Default remains `"legacy"`.
    
    Release N+1:
    - First-party clients start opting into paginated threads where appropriate.
    - Internal dogfood / staged rollout.
    - Measure old-client usage and paginated-thread unsupported errors.
    
    Release N+2:
    - Only after Release N+ is overwhelmingly deployed, make paginated the default.
    - Accept that a small tail of N-1-or-older binaries may not understand paginated threads.
    ```
    
    The important behavior change is fail-closed handling for a binary that
    encounters a persisted `paginated` thread before it knows how to fully
    support paginated history. In app-server, if a thread is `paginated`, we
    will:
    
    - allow metadata-only discovery paths like `thread/list` and
    `thread/read(includeTurns=false)`, so clients can still see the thread
    and inspect its `historyMode`
    - reject legacy full-history/live-thread paths like
    `thread/read(includeTurns=true)` and `thread/resume` with an unsupported
    JSON-RPC error
    - avoid silently treating an unknown or future `historyMode` as `legacy`
    
    Under the hood, the ThreadStore layer also rejects legacy operations
    that would need to load or replay the full thread history for a
    paginated thread. That gives us the behavior we want for Release N:
    future paginated threads are visible, but this binary fails closed
    instead of trying to operate on them as if they were legacy threads.
  • Test selected capabilities across unavailable resume (#30215)
    ## Why
    
    The selected-capability integration test already covers initial
    attachment and cold resume, but it resumes while the selected executor
    is still reachable.
    
    That leaves an important World State transition untested: a thread
    remembers its selected capability root, resumes while that environment
    is unavailable, and later sees the same stable environment return.
    
    ## What this tests
    
    This extends the existing end-to-end scenario:
    
    ```text
    selected executor available
            ↓
    app-server stops and the executor goes away
            ↓
    thread resumes with the executor unavailable
            ↓
    skills, selected MCP tools, and connector attribution are absent
            ↓
    the same environment ID is attached again
            ↓
    skills, MCP tools, and connector attribution return
    ```
    
    The test also checks that the unavailable snapshot explicitly tells the
    model that no selected-environment skills are currently available. After
    reattachment, it invokes the selected skill again and verifies that a
    new executor-owned MCP process starts.
    
    ## Scope
    
    This is test-only. It keeps the existing assumption that an environment
    ID refers to stable capability contents. It does not add package-file
    invalidation or live transport reconnect behavior.
  • Test selected capabilities across availability and resume (#30157)
    ## Why
    
    This stack crosses World State, executor skills, selected plugin
    metadata, MCP processes, connectors, dynamic environments, and resume.
    This PR adds two end-to-end scenarios that validate those pieces
    together.
    
    Both tests enable `deferred_executor`, so they exercise the real
    delayed-environment path.
    
    ## Scenario 1: availability across turns and resume
    
    ```text
    1. Start a thread with one selected plugin root bound to E1.
    2. E1 is unavailable.
       - executor skill is absent
       - selected MCP is absent
       - connector has no selected-plugin attribution
    3. Start E1 and register the same stable environment ID.
    4. Start a new turn.
       - the executor skill appears through World State
       - its body beats a colliding host skill
       - the selected MCP tool is advertised and executes inside E1
       - the connector is attributed to the selected plugin
    5. Start another turn without changing E1.
       - the MCP PID stays the same, proving runtime reuse
    6. Restart app-server and resume the thread.
       - durable selected-root intent is restored
       - skills, MCP, and connector attribution are restored
       - a new MCP PID proves ephemeral process state was rebuilt
    ```
    
    ## Scenario 2: availability changes inside one turn
    
    ```text
    1. Start a turn while E1 is unavailable.
    2. The first model sample sees no executor skill, MCP, or selected connector.
    3. The turn pauses on request_user_input.
    4. Start E1 and register it while that same turn is still active.
    5. Continue the turn.
    6. The very next model sample sees:
       - the executor skill catalog
       - the selected MCP tool
       - selected-plugin connector attribution
    7. The model calls the MCP, and its output proves execution happened inside E1.
    ```
    
    This second scenario specifically protects the aeon-style behavior:
    capability state is captured again for every sampling step, not only at
    the next user turn.
    
    ## Scope
    
    These are integration tests only. They do not add a combinatorial matrix
    for unsupported plugin-file mutation, environment generations, transport
    disconnects, or delayed `required = true` executor MCPs.
  • Expose MCP app identity in app context (#29934)
    ## Why
    
    MCP tool-call events need to expose trusted app identity and action
    metadata directly so v2 clients do not have to infer it from tool names
    or resource URIs.
    
    ## What changed
    
    - Add optional `appName`, `templateId`, and `actionName` fields to MCP
    tool-call `appContext`.
    - Populate `appName` and `templateId` from trusted Codex Apps metadata,
    and derive `actionName` from the trusted app resource metadata.
    - Preserve all three fields through core events, legacy protocol events,
    persisted thread history, resume redaction, and app-server v2 responses.
    - Document the public `appContext` fields in
    `codex-rs/app-server/README.md`.
    - Regenerate app-server JSON and TypeScript schemas and add coverage for
    serialization, persistence, redaction, and metadata propagation.
    
    ## Validation
    
    - `just test -p codex-app-server-protocol mcp_tool_call`
    - `just test -p codex-core
    mcp_tool_call_item_metadata_only_trusts_codex_apps_identity
    mcp_tool_call_item_includes_app_identity`
    - `just write-app-server-schema`
    
    ---------
    
    Co-authored-by: Martin Au-Yeung <280153141+martinauyeung-oai@users.noreply.github.com>
  • Keep MCP elicitation routable across runtime refreshes (#30127)
    ## Why
    
    An MCP tool call can still be waiting for an elicitation response when
    an environment update replaces the thread's MCP runtime.
    
    Before this change:
    
    ```text
    runtime A starts a tool call and asks the user
    environment becomes ready, so runtime B is published
    client answers the prompt through runtime B
    runtime B cannot find runtime A's pending responder
    ```
    
    The response is lost and the original tool call stays blocked.
    
    ## What changed
    
    All MCP runtimes for one thread now share a small elicitation router:
    
    ```text
    runtime A ---\
                   shared router: response token -> exact pending responder
    runtime B ---/
    ```
    
    When Codex surfaces an MCP elicitation, it assigns a unique opaque
    response token. The router records which pending request owns that
    token. A replacement runtime reuses the same router, so the latest
    runtime can deliver a response to a request started by the previous
    runtime.
    
    The Codex-owned token also prevents two runtime connections that reuse
    the same MCP server request ID from receiving each other's responses.
    
    This does not retain or search old MCP managers. Only the pending
    responder map is shared.
    
    ## Covered scenario
    
    The integration test exercises the complete failure mode:
    
    1. A thread starts while its selected environment is still unavailable.
    2. A configured MCP server starts a tool call and asks the client for
    input.
    3. The environment becomes ready, causing Codex to publish a replacement
    MCP runtime.
    4. The client answers the original prompt after the replacement.
    5. The original tool call receives that answer and completes.
    
    A focused routing test also creates two runtimes with the same server
    request ID and verifies that each response reaches the exact request
    that emitted its token.
    
    ## Scope
    
    This PR changes only elicitation response routing across MCP runtime
    replacement. It does not change when runtimes are rebuilt, which
    environments contribute MCP configuration, or how environment
    availability is detected.
  • [codex] Attribute app-server analytics by thread originator (#29935)
    ## Why
    
    Desktop Work threads and regular Codex threads can share the same
    app-server connection. App-server analytics currently copy
    `product_client_id` from connection metadata for every thread-scoped
    event, so Work thread activity is attributed to the Desktop connection
    instead of the thread's resolved originator. This prevents analytics
    from distinguishing the two products on a shared connection.
    
    ## What changed
    
    - Publish the resolved originator after a thread is materialized,
    covering new, resumed, forked, and subagent threads.
    - Store that originator in the analytics reducer's existing per-thread
    state.
    - Override only `app_server_client.product_client_id` for thread, turn,
    tool, review, goal, guardian, and compaction events while preserving the
    connection's client name, version, and transport metadata.
    - Fall back to the connection-wide product client ID when a thread has
    no originator override.
    - Preserve persisted originators in thread initialization analytics for
    resume and fork flows.
    
    ## Validation
    
    - `just test -p codex-analytics
    thread_originator_overrides_shared_connection_across_thread_events
    subagent_events_keep_thread_originator_with_explicit_turn_connection`
    - `just test -p codex-app-server
    turn_start_tracks_thread_originator_in_analytics
    thread_start_tracks_thread_initialized_analytics
    thread_fork_tracks_thread_initialized_analytics
    thread_resume_tracks_thread_initialized_analytics`
    - `just test -p codex-core thread_manager`
  • Project selected plugin runtime by environment availability (#30093)
    ## Why
    
    Selected plugin metadata is stable, but MCP processes are live runtime
    state. They need different lifetimes:
    
    - the MCP extension caches manifest, MCP, and connector declarations for
    each stable selected root;
    - each model step projects that cached metadata through the roots that
    resolved as ready for that exact step;
    - the MCP manager is rebuilt only when that availability projection
    changes.
    
    This matches executor skills: both features consume the same resolved
    step roots instead of inferring readiness from the turn's selected
    environments.
    
    ## Behavior
    
    ```text
    E1 not ready for this step
      -> no E1 MCP servers or connectors
      -> cached plugin metadata stays in ext/mcp
    
    E1 becomes ready
      -> reuse cached metadata
      -> publish one MCP runtime containing E1 capabilities
    
    same ready roots on the next step
      -> reuse the exact runtime; no rediscovery and no MCP restart
    
    resume
      -> create new extension thread state and a new MCP runtime
    ```
    
    All model-facing consumers use the same step snapshot:
    
    ```text
    resolved selected roots
            |
            v
    extension MCP/connector projection
            |
            v
    { MCP config, connector snapshot, MCP manager }
            |
            +-> advertise model tools
            +-> build app/connector tools
            +-> execute MCP calls
    ```
    
    ## Cache contract
    
    The existing MCP extension owns a cache keyed by the full
    `SelectedCapabilityRoot`:
    
    ```rust
    let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
    ```
    
    The cache lives with extension thread state. Environment availability
    filters projection but does not invalidate metadata. Resume creates new
    thread state. There is no file watcher or executor generation because
    contents behind a stable environment/root are assumed stable.
    
    ## What changes
    
    - Keeps executor plugin discovery and cached metadata in `ext/mcp`.
    - Caches MCP and connector declarations together per selected root.
    - Uses the step's already-resolved capability roots, including lazy
    environments that are not turn environments.
    - Reuses the current MCP runtime when the ready-root projection is
    unchanged.
    - Uses the same step MCP manager and connector snapshot for
    model-visible tools and execution.
    - Resolves direct thread-scoped MCP requests from the current
    selected-root projection.
    
    ## Deliberately out of scope
    
    - `app/list` remains based on the latest global host-plugin state; this
    PR does not make its response or notifications thread-specific.
    - `required = true` startup semantics do not apply to delayed executor
    MCP activation.
    - No filesystem/content invalidation.
    - No transport-disconnect watcher.
    - No executor generations or environment replacement semantics.
    - No client sharing across complete manager replacements.
    
    ## Stack
    
    1. Extension-owned World State sections.
    2. Project executor skills through World State.
    3. Pin one MCP runtime to each model step.
    4. **This PR:** project selected MCP and connector state from
    extension-owned metadata.
    5. Integration coverage for selected capability availability and resume.
    
    ## Verification
    
    -
    `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
    - The stacked integration PR covers unavailable to ready activation,
    unchanged-runtime reuse, skills, MCP tools, connector attribution, and
    cold resume.
  • [codex] Surface MCP reauthentication-required startup failures (#29877)
    ## Summary
    
    - distinguish expired, non-refreshable stored MCP OAuth credentials from
    first-time missing credentials
    - carry a typed `failureReason: "reauthenticationRequired"` on the
    existing `mcpServer/startupStatus/updated` notification only when user
    action is required
    - keep the public MCP auth-status API unchanged and regenerate the
    app-server protocol schemas and documentation
    
    ## Why
    
    An MCP server with an expired access token and no usable refresh token
    currently fails startup without giving clients a reliable, typed
    recovery signal.
    
    The existing startup-status notification is the natural place to carry
    this state. Its nullable `failureReason` keeps the recovery reason
    attached to the failed startup transition without adding a one-off
    notification. Internally, Codex distinguishes first-time login from
    reauthentication and emits the reason only when the startup error itself
    requires authentication.
    
    ## User impact
    
    App clients can prompt an existing user to reconnect an MCP server when
    automatic recovery is impossible by handling a failed
    `mcpServer/startupStatus/updated` notification whose `failureReason` is
    `reauthenticationRequired`. Starting, ready, cancelled, unrelated
    failures, and first-time setup carry no reauthentication reason.
    
    ## Companion app PR
    
    - openai/openai#1069582
    
    ## Validation
    
    - `just test -p codex-app-server-protocol` — 248 passed; schema fixture
    tests passed
    - `cargo check -p codex-app-server -p codex-tui`
    - `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped
    - `just test -p codex-protocol -p codex-app-server-protocol -p
    codex-mcp` — 579 passed
    - `just write-app-server-schema`
    - `just fmt`
  • fix(app-server): suppress TUI rollback warning (#30124)
    ## Why
    
    The TUI uses `thread/rollback` internally for user-facing flows such as
    prompt cancellation/backtracking. After `thread/rollback` was marked
    deprecated, those internal calls started surfacing `deprecationNotice`
    messages in the TUI, even though the user did not explicitly call the
    deprecated app-server API.
    
    The endpoint should remain deprecated for external app-server clients,
    but the built-in `codex-tui` client should not show this
    implementation-detail warning during normal interaction.
    
    ## What changed
    
    - Pass the initialized app-server client name into the `thread/rollback`
    request processor.
    - Suppress the `thread/rollback` deprecation notice only for
    `codex-tui`.
    - Preserve the existing `deprecationNotice` behavior for non-TUI
    clients.
    - Add regression coverage for the `codex-tui` suppression path.
    
    ## How to Test
    
    1. Start Codex TUI from this branch.
    2. Type text into the composer and press `Esc` to cancel/backtrack.
    3. Confirm the TUI restores/cancels the prompt without showing
    `thread/rollback is deprecated and will be removed soon`.
    4. Also verify an external app-server client that calls
    `thread/rollback` still receives `deprecationNotice`.
    
    Targeted tests:
    
    - `just test -p codex-app-server thread_rollback`
    - `just argument-comment-lint`
  • [codex] poll external clock during sleep (#30113)
    ## Summary
    
    - make the external app-server time provider establish sleep deadlines
    using `currentTime/read`
    - poll the external clock once per second and complete `clock.sleep`
    when the deadline is reached
    - keep the system-clock timer and existing steer/agent-message
    interruption behavior unchanged
    
    ## Why
    
    This lets training control `clock.sleep` through its existing external
    simulated clock without adding separate sleep/wake protocol methods.
    
    ## Testing
    
    - `just fmt`
    - `just test -p codex-app-server
    external_sleep_polls_current_time_and_emits_items`
  • feat: add provider-aware model fallback to thread start (#29942)
    ## Why
    
    Helper threads such as task title generation can request a model ID that
    is valid for the default OpenAI provider but unavailable from the active
    provider. With Amazon Bedrock, `gpt-5.4-mini` is rejected while the
    provider static catalog exposes Bedrock model IDs such as
    `openai.gpt-5.5` and `openai.gpt-5.4`. This causes repeated background
    404s and can surface a misleading turn error even when the main turn
    succeeds.
    
    Clients need an explicit way to ask app-server to resolve an unavailable
    helper model to the active provider default. That fallback must remain
    limited to providers with an authoritative static catalog so custom or
    dynamically discovered model IDs are not rewritten based on an
    incomplete catalog.
    
    Fixes #28741.
    
    ## What changed
    
    - Add the experimental `allowProviderModelFallback` option to
    `thread/start`, defaulting to `false` to preserve existing behavior.
    - Thread the option through thread creation and model selection.
    - When enabled for a static model manager, preserve requested models
    present in the catalog and replace unavailable models with the provider
    default.
    - Continue preserving explicit model IDs for dynamic model managers
    without fetching a catalog solely to validate them.
    - Document the new `thread/start` behavior in the app-server API
    overview.
    
    ## Test
    Temporary test-client harness:
    ```
    ThreadStartParams {
        model: Some("gpt-5.4-mini".to_string()),
        allow_provider_model_fallback: true,
        ..Default::default()
    }
    ```
    Command:
    ```
    CODEX_HOME=/tmp/codex-bedrock-thread-start-home \
    CODEX_E2E_BEDROCK_THREAD_START_ONLY=1 \
    ./target/debug/codex-app-server-test-client \
      --codex-bin ./target/debug/codex \
      -c 'model_provider="amazon-bedrock"' \
      send-message-v2 --experimental-api ignored
    ```
    Relevant output:
    ```
    > "method": "thread/start",
    > "params": {
    >   "model": "gpt-5.4-mini",
    >   "modelProvider": null,
    >   "allowProviderModelFallback": true,
    >   ...
    > }
    
    < "result": {
    <   "model": "openai.gpt-5.5",
    <   "modelProvider": "amazon-bedrock",
    <   ...
    < }
    ```
  • Persist selected capability roots and resolve availability per model step (#29856)
    ## Why
    
    `selectedCapabilityRoots` is durable thread intent: “use this capability
    root from environment `worker`.”
    
    The important product assumption is:
    
    > One environment ID always names the same logical executor and stable
    contents.
    
    `worker` does not silently change from executor A to an unrelated
    executor B. The process-local connection handle for `worker` can still
    be replaced while Codex is running, though, for example when
    `environment/add` registers a fresh handle for the same logical
    environment.
    
    The thread should persist only the stable selection. Each model step
    should pair that selection with the exact ready handle captured for that
    step.
    
    ## The boundary
    
    ```text
    persisted thread intent
      plugin@1 -> environment "worker"
                    |
                    | capture the current step
                    v
    model-step view
      unavailable, or
      plugin@1 + worker's exact captured ready handle
    ```
    
    The environment ID is the stable identity and cache key. The
    `Arc<Environment>` is only a process-local handle retained so consumers
    of one model step use the same captured environment. It is never
    persisted and it does not imply different environment contents.
    
    ## What changes
    
    ### Persist the stable selection
    
    Selected roots are written into `SessionMeta` and restored with the
    thread. Forked subagents inherit the same selections, including
    bounded-history forks.
    
    Only stable data is persisted: root ID, environment ID, and root path.
    
    ### Capture readiness together with the exact handle
    
    The environment snapshot records:
    
    ```rust
    environment_id -> Some(Arc<Environment>) // ready in this step
    environment_id -> None                   // still starting in this step
    ```
    
    This prevents readiness and execution from coming from different
    registry snapshots.
    
    For example:
    
    ```text
    step snapshot: worker -> handle A, ready
    environment/add: worker -> fresh handle B for the same logical environment
    current step: plugin@1 still uses captured handle A
    ```
    
    Without carrying handle A in the snapshot, the resolver could combine “A
    was ready” with handle B and treat B as ready before it had finished
    starting.
    
    This does not change cache invalidation. Stable capability metadata
    remains identified by environment ID and capability root. Replacing a
    process-local handle under the same stable environment ID does not
    invalidate or rediscover that metadata.
    
    ### Resolve availability per model step
    
    - A ready captured environment produces resolved roots using its
    captured handle.
    - A starting, missing, or failed environment is omitted from that step.
    - A selected lazy environment that is outside the turn's captured
    environment set is asked to start, and a later step can observe it as
    ready.
    - No capability files are scanned here.
    
    Transient transport disconnects remain the remote client's reconnect
    concern. This PR models initial attachment/readiness; it does not add
    live socket-connectivity state.
    
    ## Example
    
    ```text
    thread selection: plugin@1 -> environment "worker"
    
    step 1: worker is starting -> plugin@1 unavailable
    step 2: worker is ready    -> plugin@1 resolves through worker's captured handle
    step 3: fresh local handle -> current step remains pinned; a later step captures its own view
    ```
    
    Temporary unavailability does not discard the durable selection. Later
    PRs can retain stable metadata caches while projecting only currently
    available capabilities into model-visible World State.
    
    ## Compatibility
    
    The app-server request shape does not change. Older rollouts without
    `selected_capability_roots` deserialize to an empty list.
    
    ## Stack
    
    1. **This PR:** persist stable selected roots and resolve them through
    an exact model-step handle.
    2. #29960: cache stable skill metadata and project available skills into
    World State.
    3. #29946: cache stable plugin declarations and manage the separate live
    MCP runtime.
  • chore(app-server): mark thread/rollback as deprecated (#29928)
    We will drop support for this in the near future due to the complexity
    it introduces.
  • Test executor-routed MCP OAuth token exchange (#29656)
    ## Why
    
    #28529 proves OAuth discovery uses the selected executor, but its
    end-to-end test stops before the callback and token exchange.
    
    ## What changed
    
    - add an executor-only mock token endpoint
    - complete the OAuth callback using the authorization URL's `state` and
    `redirect_uri`
    - assert the PKCE token exchange reaches the executor-only endpoint
    - assert the completion notification reports the selected thread and
    succeeds
    
    Depends on #28529.
  • Support OAuth for HTTP MCP servers from selected executor plugins (#28529)
    ## Why
    
    #28522 routes selected-plugin HTTP MCP traffic through the owning
    executor, but OAuth bootstrap and refresh still used host-local clients.
    Executor-only servers therefore cannot complete discovery or login
    through the same network boundary as the MCP connection.
    
    ## What changed
    
    - adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient`
    contract
    - let RMCP own discovery, dynamic registration, PKCE, token exchange,
    and refresh
    - route auth status, persisted-token startup, and app-server login
    through the server runtime while preserving the existing local discovery
    path
    - add optional `threadId` to `mcpServer/oauth/login` and echo it in the
    completion notification
    - implement RMCP's redirect policy and 1 MiB OAuth response limit over
    executor HTTP
    - cover selected-thread OAuth discovery and login through an
    executor-only route
    
    Depends on #28522.
  • Support HTTP MCP servers from selected executor plugins (#28522)
    ## Why
    
    Selected executor plugins can declare both stdio and Streamable HTTP MCP
    servers, but only stdio registrations were retained. That silently drops
    part of the plugin's tool surface and prevents HTTP traffic from using
    the owning executor's network.
    
    ## What changed
    
    - retain selected-plugin Streamable HTTP MCP declarations alongside
    stdio declarations
    - route their HTTP clients through the owning executor environment
    - preserve local auth-header environment references while rejecting them
    for executor-hosted declarations
    - cover thread isolation, refresh, and an executor-only HTTP route end
    to end
  • [codex] Add Ultra reasoning effort (#29899)
    ## Why
    
    Ultra should be one user-facing reasoning selection for work that
    benefits from both maximum reasoning and proactive multi-agent
    delegation. Without it, clients must coordinate maximum reasoning with
    the experimental `multiAgentMode` setting, even though the inference
    backend still expects its existing `max` effort value.
    
    This change makes reasoning effort the source of truth: clients select
    `ultra`, core derives proactive multi-agent behavior when the turn is
    eligible for multi-agent V2, and inference requests continue to use the
    backend-compatible `max` value.
    
    ## What changed
    
    - Add `ultra` as a first-class reasoning effort and preserve
    model-catalog ordering when exposing it to clients.
    - Convert `ultra` to `max` at the inference request boundary, including
    Responses HTTP/WebSocket requests, startup prewarm, compaction, and
    memory summarization.
    - Derive effective multi-agent mode per turn from effective reasoning
    effort:
      - eligible multi-agent V2 + `ultra` → `proactive`
      - eligible multi-agent V2 + any other effort → `explicitRequestOnly`
    - V1 or otherwise ineligible sessions → no multi-agent mode instruction
    - Keep the derived effective mode in turn context history so successive
    turns can emit a developer-message update only when the effective mode
    changes.
    - Remove selected multi-agent mode from core session configuration, turn
    construction, thread settings, resume/fork restoration, and subagent
    spawn plumbing. Subagents inherit reasoning effort and derive their own
    effective mode.
    - Retain the experimental app-server `multiAgentMode` fields for wire
    compatibility while marking them deprecated. Request values are accepted
    but ignored; compatibility response fields report `explicitRequestOnly`.
    - Display Ultra in the TUI using the order supplied by `model/list`.
    
    ## Validation
    
    - `just test -p codex-core ultra_reasoning_uses_max_for_requests`
    - `just test -p codex-tui model_reasoning_selection_popup`
  • [codex] Populate remote plugin local versions (#29956)
    # What
    
    - Carry installed remote release versions through remote plugin
    summaries as `localVersion`.
    - Keep the app-server mapping a pure adapter by populating that value in
    the remote catalog layer.
    
    # Why
    
    Remote plugin summaries always returned `localVersion: null` even after
    their versioned bundles had been installed locally. Consumers such as
    scheduled-task template discovery use `localVersion` to resolve a
    plugin's materialized root, so templates from remote curated plugins
    were silently skipped.
  • [codex] nest sleep config under current time reminder (#29910)
    ## Summary
    
    - move sleep tool enablement from top-level `[features].sleep_tool` to
    `[features.current_time_reminder].sleep_tool`
    - remove the standalone `Feature::SleepTool` flag and gate `clock.sleep`
    from resolved current-time configuration
    - update config schema, config-lock materialization, and existing sleep
    coverage
    
    Stacked on #29907.
  • [codex] namespace sleep under clock (#29907)
    ## Summary
    
    - expose the interruptible sleep tool as `clock.sleep` instead of
    top-level `sleep`
    - keep `clock.curr_time` and `clock.sleep` in the same model-visible
    namespace when both features are enabled
    - update existing core and app-server integration coverage to issue
    namespaced sleep calls
    
    ## Why
    
    Sleep is a clock operation. Grouping it with `clock.curr_time` gives the
    model a more coherent tool surface without changing the sleep feature
    gate or runtime behavior.
    
    ## Validation
    
    - `just test -p codex-core sleep_tool_follows_feature_gate`
    - `just test -p codex-core any_new_input_interrupts_sleep`
    - `just test -p codex-app-server
    sleep_emits_started_and_completed_items`
  • [apps] Thread structured icon assets through app list (#29889)
    ## Summary
    
    - Add `iconAssets` and `iconDarkAssets` to the app-list protocol.
    - Preserve structured icons through directory merging and the connector,
    app-
      server, and TUI boundaries.
    - Keep legacy logo URLs unchanged as compatibility fallbacks.
    - Update generated protocol schemas and TypeScript types.
  • feat(app-server): list descendant threads by ancestor (#29591)
    ## Why
    
    `thread/list` can filter direct children with `parentThreadId`, but
    clients cannot request an entire spawned subtree. Discovering every
    descendant requires repeated client-side requests and gives up the
    database's existing filtering and pagination path.
    
    ## What changed
    
    Experimental clients can use `ancestorThreadId` to return strict
    descendants at any depth while `parentThreadId` retains its direct-child
    meaning. The filters are mutually exclusive, the ancestor is excluded,
    and every result preserves its immediate `parentThreadId` so callers can
    reconstruct the tree.
    
    ## How it works
    
    - **Explicit relationship:** Internal list parameters distinguish direct
    children from transitive descendants without changing the meaning of
    `parentThreadId`.
    - **Existing graph:** Persisted parent-child spawn edges remain the
    source of truth, so descendant lookup needs no schema migration or
    ancestry cache.
    - **Indexed traversal:** A recursive SQLite query starts from the
    parent-edge index, walks each generation, and applies thread filters,
    sorting, and cursor pagination in the same database request.
    - **Reconstructable results:** The response stays flat and normally
    ordered while carrying each descendant's immediate parent.
    
    ## Verification
    
    Ran 550 tests across the protocol, state, rollout, and thread-store
    crates, then reran the four focused state, store, and app-server
    descendant-listing tests after the final diff reduction. Scoped Clippy
    and formatting checks passed. Stable and experimental schema generation
    was checked; the stable fixtures remain unchanged while the experimental
    schema includes the new field.
  • test: use automatic environments in app-server integration tests (#29789)
    ## Why
    
    Topology-neutral app-server integration tests should exercise automatic
    environment selection so the same setup covers local and remote
    executors.
    
    ## What
    
    Migrate eligible tests to `TestAppServer::new_with_auto_env()` and
    `send_thread_start_request_with_auto_env()`. Leave explicit-topology
    tests unchanged, and skip the request-permissions case on Windows with a
    TODO for cross-platform tool routing.
    
    ## Validation
    
    - `just test -p codex-app-server`
    - `bazel test //codex-rs/app-server:app-server-all-wine-exec-test
    --test_output=errors`
    
    Stacked on #29788.
  • test: run app-server integration tests under Wine (#29788)
    ## Why
    
    Made a mistake when carving #29746 out of my local changes and the test
    was missing from the build graph. Oops!
    
    ## What
    
    Enable the app-server Wine exec test target. Remove the `manual` tag
    from generated Wine-exec test variants so wildcard Bazel test
    invocations select them. Refactor the smoke test to ensure it passes
    with current Windows support.
  • auth: move domain mode below app wire types (#29721)
    ## Why
    
    Authentication mode is a domain concept used by login, model selection,
    telemetry, and transports. Keeping the canonical type in app-server
    protocol forces those lower-level crates to depend on an unrelated wire
    API.
    
    ## What changed
    
    - Added canonical `codex_protocol::auth::AuthMode` domain values.
    - Kept the app-server wire DTO unchanged and added an explicit app-side
    conversion.
    - Removed production app-server-protocol dependencies from login,
    model-provider-info, models-manager, and otel call paths.
    
    ## Stack
    
    This is PR 2 of 6, stacked on [PR
    #29714](https://github.com/openai/codex/pull/29714). Review only the
    delta from `codex/split-json-rpc-protocols`. Next: [PR
    #29722](https://github.com/openai/codex/pull/29722).
    
    ## Validation
    
    - Auth and login coverage passed in the focused protocol/domain test
    run.
    - App-server account and auth conversion coverage passed.
  • [codex] Ignore local curated plugins when remote catalog is active (#29765)
    ## Summary
    
    - suppress configured `openai-curated` plugins when the remote plugin
    feature is enabled and auth uses the Codex backend
    - preserve `openai-api-curated` and non-Codex-backend behavior while
    including remote catalog activation in the plugin load cache key
    - add core plugin coverage and an app-server integration test for
    runtime feature enablement
    
    ## Why
    
    The Codex app enables remote plugins through process-local runtime
    feature enablement, which can happen after app-server startup tasks have
    already observed legacy local plugin state. The existing conflict logic
    only preferred a remote plugin when the same plugin was already
    installed remotely, so a configured legacy-only plugin could continue
    exposing skills and other capabilities from `openai-curated`.
    
    ## Impact
    
    When the remote catalog is active, legacy `openai-curated` plugins no
    longer contribute skills, MCP servers, apps, or hooks. Remote installed
    plugins continue to load normally, and `openai-api-curated` remains
    unaffected. This does not change remote fetch, bundle sync, or uninstall
    behavior.
    
    ## Validation
    
    - `just test -p codex-core-plugins
    remote_global_catalog_ignores_local_curated_plugins
    remote_plugin_feature_keeps_local_curated_without_codex_backend`
    - `just test -p codex-app-server
    runtime_remote_plugin_enablement_excludes_local_curated_plugin_skills`
    - `just fmt`
    - `git diff --check`
  • Let image generation extension hosts control output persistence (#29711)
    ## Why
    
    Some extension hosts need generated images returned without writing them
    to the local filesystem or giving the model a local path.
    
    ## What changed
    
    **tl;dr**: we now conduct all extension operations in the image gen
    extension
    
    - Let hosts provide an optional image save root when installing the
    extension.
    - Save images and return path hints only when a save root is configured.
    - Return image data without saving or adding a path hint when no save
    root is configured.
    - Preserve the extension-provided `saved_path` instead of persisting
    extension images again in core.
    - Leave built-in image generation unchanged.
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-app-server
    standalone_image_generation_returns_saved_path_hint_to_model`
    - `just test -p codex-core
    extension_tool_uses_granted_turn_permissions_without_local_persistence`
    - `just test -p codex-core tools::handlers::extension_tools::tests`
    - tested on CODEX CLI on both save_root: CODEX_HOME and None 
    - tested on CODEX APP on both as well
  • test: add app-server auto environment helper (#29746)
    ## Why
    
    Start moving towards app-server tests defaulting to running against
    remote & foreign OS executors. To do so we need a point of indirection
    similar to core integration tests' `build_with_auto_env`, but with the
    flexibility of letting tests control environment registration if they
    need to.
    
    ## What
    
    This adds:
    
    - `TestAppServer::new_with_auto_env()` for constructing an app server
    with a default environment defined by the test runner (e.g. bazel)
    - `TestAppServer::auto_env_params()` for tests to easily acquire turn
    env params tailored to the automatic environment
    - `TestAppServer::send_thread_start_request_with_auto_env()` to make it
    easy for tests to start a thread using the automatic environment
    
    The above methods all fail if the test calling them has set up an
    environment where the automatic environment configuration conflicts with
    test-created state.
    
    ## Validation
    
    Adds a couple of basic smoke tests to the app-server test suite.
    Follow-ups will migrate more tests to use it.
  • Support thread-level originator overrides (#29477)
    ## Why
    
    Work(TPP) threads can be launched from the Desktop app, but if they all
    keep the Desktop app's default originator then downstream attribution
    cannot distinguish local Work launches from cloud-backed Work launches.
    `thread/start.serviceName` already carries that launch signal, while
    `SessionMeta.originator` is the durable thread-level value that survives
    resume and fork.
    
    This change converts the Desktop Work service names into an effective
    originator at thread creation time, persists that originator with the
    thread, and keeps using it for later model requests and memory writes.
    
    ## What changed
    
    - Map `CODEX_WORK_LOCAL` and `CODEX_WORK_CLOUD` service names to
    per-thread originators, while preserving
    `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` as the highest-precedence override.
    - Persist the effective originator in `SessionMeta.originator`, read it
    back on resume/fork, and inherit the parent originator for subagent
    spawns when there is no persisted session metadata.
    - Handle truncated `SpawnAgentForkMode::LastNTurns` forks by falling
    back to the live parent originator when the forked history no longer
    includes `SessionMeta`.
    - Thread the per-thread originator through Responses headers,
    websocket/compaction request paths, thread-store creation, rollout
    metadata, and memory stage-one telemetry.
    
    ## Verification
    
    - `just test -p codex-core
    agent::control::tests::spawn_thread_subagent_inherits_parent_originator_without_fork
    agent::control::tests::spawn_thread_subagent_fork_last_n_turns_inherits_parent_originator_without_session_meta
    thread_manager::tests::originator_override_precedes_service_name_remapping`
    - `just test -p codex-core
    agent::control::tests::resume_thread_subagent_restores_stored_metadata_and_effective_multi_agent_mode`
    - `just test -p codex-memories-write`
    - `just fix -p codex-core -p codex-memories-write`
    - `git diff --check`
  • Make selected plugin roots URI-native (#28918)
    ## Why
    
    Selected capability roots belong to the executor filesystem, not the
    app-server host. Converting their path strings into the host's native
    `Path` breaks whenever the two machines use different path conventions,
    such as a Windows executor behind a Unix app-server.
    
    This PR establishes `PathUri` as the selected-plugin boundary so the
    executor remains authoritative for its paths.
    
    ## What changed
    
    - Require `selectedCapabilityRoots[].location.path` to be a canonical
    `file:` URI and deserialize it directly as `PathUri`; native path
    strings are rejected.
    - Update the app-server schema, generated TypeScript, examples, and
    request coverage for the URI contract.
    - Keep selected roots, resolved plugin locations, manifest paths, and
    manifest resources as `PathUri`.
    - Inspect and read plugin roots and manifests only through the selected
    environment's `ExecutorFileSystem`.
    - Parse executor manifests with the shared URI-native parser from #29620
    instead of projecting them onto the host filesystem.
    - Enforce resource containment lexically and preserve the root URI's
    POSIX or Windows path convention.
    - Cover foreign Windows plugin roots and URI-native manifest resources.
    
    ```text
    thread/start
      selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo"
                                  | PathUri
                                  v
                        ExecutorFileSystem
                                  |
                                  +--> plugin.json
                                  +--> manifest resources
    ```
    
    This PR stops at the shared selected-plugin representation. The next two
    PRs remove the remaining host-path projections in the skill and MCP
    consumers.
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. **This PR** — keep selected plugin roots and resources URI-native.
    4. #29626 — load executor skills without host path conversion.
    5. #29628 — resolve executor MCP working directories without host path
    conversion.
  • core: persist initial context window metadata (#29519)
    ## Why
    
    PR #29494 made context-window IDs visible to the model by wrapping the
    token-budget window payload in `<context_window>`, but rollout JSONL
    consumers still could not see the initial window identity by tailing the
    session file. Compacted rollout items carry window IDs only after
    compaction has happened, so a session with no compaction had no durable
    JSONL record for window 0.
    
    This change gives tailing consumers a stable initial-window record at
    session creation time.
    
    ## What Changed
    
    - Added `session_meta.context_window.window_id` for the initial
    context-window identity.
    - `CreateThreadParams` now requires `initial_window_id: String`, so
    thread-store callers cannot accidentally create new threads without
    window-0 metadata.
    - Live thread creation derives the persisted initial window ID from the
    same `AutoCompactWindowIds` used to initialize `SessionState`, keeping
    runtime state and JSONL metadata aligned.
    - Rollout reconstruction uses `session_meta.context_window.window_id` as
    the initial-window fallback and derives `window_number = 0`,
    `first_window_id = window_id`, and `previous_window_id = None`
    internally.
    - Fork reconstruction intentionally uses the same rollout reconstruction
    path; consumers that need to distinguish copied initial-window metadata
    can use the rollout `thread_id`.
    - Legacy compactions without `window_number` still use compaction-count
    fallback accounting instead of being reset to window 0 by the
    initial-window fallback.
    - Compacted rollout metadata still takes precedence once compaction
    records exist, preserving the richer chain fields there.
    
    ## JSONL Shape
    
    Real rollout JSONL is one object per line. This example is expanded for
    readability, but shows the new initial `session_meta.context_window`
    record followed by the existing compacted rollout item shape that also
    carries window IDs:
    
    ```jsonl
    {
      "timestamp": "2026-06-22T12:00:00.000Z",
      "type": "session_meta",
      "payload": {
        "session_id": "<THREAD_ID>",
        "id": "<THREAD_ID>",
        "timestamp": "2026-06-22T12:00:00.000Z",
        "cwd": "/repo",
        "originator": "codex",
        "cli_version": "0.0.0",
        "source": "cli",
        "model_provider": "<MODEL_PROVIDER>",
        "context_window": {
          "window_id": "<INITIAL_WINDOW_ID>"
        }
      }
    }
    ...
    {
      "timestamp": "2026-06-22T12:34:56.000Z",
      "type": "compacted",
      "payload": {
        "message": "<COMPACTION_SUMMARY>",
        "replacement_history": [
          "..."
        ],
        "window_number": 1,
        "first_window_id": "<INITIAL_WINDOW_ID>",
        "previous_window_id": "<INITIAL_WINDOW_ID>",
        "window_id": "<NEXT_WINDOW_ID>"
      }
    }
    ```
    
    The nested `context_window` object is intentional: it gives rollout
    consumers a stable namespace for context-window metadata while only
    writing the non-derivable initial `window_id`. For the initial window,
    `window_number`, `first_window_id`, and `previous_window_id` are derived
    internally instead of being written to the rollout.
    
    ## Verification
    
    - `just test -p codex-protocol`
    - `just test -p codex-rollout
    recorder_materializes_on_flush_with_pending_items`
    - `just test -p codex-core reconstruct_history`
    - `just test -p codex-core
    record_initial_history_reconstructs_forked_transcript`
    - `just test -p codex-thread-store`
    - `just test -p codex-state`
    - `just test -p codex-app-server
    thread_read_returns_summary_without_turns`
    - `just test -p codex-rollout persistence_metrics`
  • test: branch on target OS instead of runner flavor (#29712)
    ## Why
    
    Core tests should branch on the executor's operating system, not on
    runner details such as Docker or Wine. This keeps platform behavior
    stable as new test backends are added and reserves Wine-specific skips
    for actual runner debt.
    
    ## What
    
    - Add `TestTargetOs` and target/host-aware skip helpers while keeping
    `TestEnvironment` internal.
    - Replace topology enum access with remote predicates and a narrow
    Docker accessor.
    - Migrate OS-semantic Wine skips, preserve runner-specific gaps, and
    document the skip taxonomy.
    
    ## Validation
    
    - `just test -p core_test_support`
    - `just test -p codex-core
    remote_test_env_can_connect_and_use_filesystem`
    - `bazel test //codex-rs/core:core-all-wine-exec-test
    --test_output=errors` reached test execution; unrelated existing
    view-image, path, and timing failures remain.
    - `just test -p codex-core` and `just test` reached broad test
    execution; this checkout has unrelated helper, sandbox, and timing
    failures.
  • feat(app-server): thread/turns/items/list -> thread/items/list (#29705)
    ## Description
    
    Rename the experimental app-server item pagination API from
    `thread/turns/items/list` to `thread/items/list` and make `turnId`
    optional. Clients can now page persisted items across a thread, or still
    filter to one turn when needed.
    
    ## What changed
    
    - Rename the request/response protocol types and JSON-RPC method to
    `ThreadItemsList*` / `thread/items/list`.
    - Pass optional `turnId` through to `ThreadStore::list_items`.
    - Update app-server docs and focused protocol/app-server tests.
    
    ## Validation
    
    - `just test -p codex-app-server-protocol thread_items_list_round_trips`
    - `just test -p codex-app-server thread_items_list_returns_unsupported`
  • Separate local and remote plugin analytics IDs (#29495)
    ## Why
    
    Plugin analytics overloaded `plugin_id`: most events used the Codex
    `<plugin>@<marketplace>` identity, while remote install events used the
    backend plugin ID. That makes the same field change meaning across event
    types and complicates downstream identity resolution.
    
    This change makes the contract unambiguous:
    
    - `plugin_id`: the local Codex `<plugin>@<marketplace>` identity, when
    resolved
    - `remote_plugin_id`: the backend plugin identity, when available
    
    For a remote install failure that happens before plugin details resolve,
    `plugin_id` is `null` and `remote_plugin_id` remains populated.
    
    ## What changed
    
    All six plugin analytics events use the same identity contract:
    
    - `codex_plugin_installed`
    - `codex_plugin_install_failed`
    - `codex_plugin_uninstalled`
    - `codex_plugin_enabled`
    - `codex_plugin_disabled`
    - `codex_plugin_used`
    
    Remote identity is resolved from the current installed-plugin snapshot
    first, with persisted install metadata as fallback. The telemetry
    metadata type keeps local identity optional for failures that occur
    before remote details are available.
    
    The app-server test client's manual analytics smokes now find remote
    mutation events through `remote_plugin_id` and validate that `plugin_id`
    remains local.
    
    ## Remote uninstall
    
    Resolve and capture telemetry metadata before removing the local plugin
    cache, then emit `codex_plugin_uninstalled` after the backend confirms
    success. The event is also emitted when backend uninstall succeeds but
    local cache cleanup reports `CacheRemove`.
    
    If a concurrent remote-cache refresh removes the local bundle before
    telemetry capture, the already-fetched remote plugin detail supplies
    fallback capability metadata.
    
    ## Validation
    
    - `just test -p codex-analytics` — 82 passed
    - `just test -p codex-core-plugins` — 271 passed
    - `just test -p codex-app-server-test-client` — 5 passed
    - `just test -p codex-plugin` — 3 passed
    - `just test -p codex-app-server plugin_install` — 37 passed
    - `just test -p codex-app-server plugin_uninstall` — 10 passed
    
    The production app-server install/uninstall flow was also exercised
    against `plugins~Plugin_f1b845ac33888191ac156169c58733c2`
    (`build-ios-apps@openai-curated-remote`), and the plugin's original
    uninstalled state was restored.
  • [core] debounce current-time reminders by elapsed time (#29659)
    ## Summary
    - rename `reminder_interval_model_requests` to
    `reminder_interval_seconds`
    - read the configured time provider before every model request and
    inject a reminder only after the configured number of seconds has
    elapsed
    - preserve immediate first delivery and forced delivery after compaction
    changes the context window
    
    ## Tests
    - `just test -p codex-core current_time_reminder`
  • Namespace multi-agent v2 tools under collaboration (#29067)
    ## Summary
    
    Multi-agent v2 tools now use the fixed `collaboration` namespace when
    namespace tools are available. This keeps the model-visible hint and the
    actual tool surface aligned around `functions.collaboration.*`, without
    exposing an unshipped namespace knob to users.
    
    The PR also removes the old `features.multi_agent_v2.tool_namespace`
    config/schema surface, updates the MAv2 test fixtures for namespaced
    calls, and fixes stale `TurnContext.features` references that were
    breaking `codex-core` builds.
    
    ## Changes
    
    - Expose MAv2 tools under `collaboration` instead of relying on a
    configurable namespace.
    - Remove `tool_namespace` from MAv2 TOML config, resolved config,
    validation, schema, and tests.
    - Update tool-planning and integration fixtures to assert or emit
    namespaced MAv2 tool calls.
    - Read feature state through `TurnContext.config.features` in the
    multi-agent mode context paths.
    
    ## Testing
    
    - `just write-config-schema`
    - `just test -p codex-features`
  • [codex] reject remote images at app-server ingress (#29419)
    ## Stack
    
    Stacked on #29417. Review and land that PR first.
    
    ## Summary
    
    - reject HTTP(S) image URLs in the handlers for `turn/start` and
    `turn/steer`
    - validate `thread/inject_items` after its existing
    JSON-to-`ResponseItem` conversion, so each item is deserialized once
    - turn invalid dynamic-tool image responses into the existing
    unsuccessful text fallback; the model receives the validation message as
    the function output
    - leave `thread/resume.history` compatible with legacy history; #29417
    replaces remote images before model input
    - continue accepting inline data URLs and `localImage` inputs
    - keep this policy in app-server; this PR does not add a shared protocol
    API or change core image preparation
    
    ## Test plan
    
    - `just test -p codex-app-server -E
    'test(/request_handlers_reject_remote_image_urls|dynamic_tool_remote_image_response_becomes_model_visible_error|dynamic_tool_call_round_trip_sends_content_items_to_model|turn_start_tracks_turn_event_analytics|standalone_image_edit_uses_recent_pathless_image/)'`
    (5 passed)
    - `just fix -p codex-app-server`
    - `just fmt`
  • feat(core): store turn_id on ResponseItem metadata (#28360)
    ## Description
    
    This PR is a followup to https://github.com/openai/codex/pull/28355 and
    starts assigning `internal_chat_message_metadata_passthrough.turn_id` to
    durable Responses API items created during a turn.
    
    The goal is that those items keep the `turn_id` that introduced them
    when Codex resends stateless HTTP context, reconstructs history for
    resume/fork paths, or reuses websocket response state.
    
    ## What changed
    
    - Set `internal_chat_message_metadata_passthrough.turn_id` when missing
    as response items enter durable history, initial/replacement history,
    inter-agent communication history, and local compaction summaries.
    - Preserve existing item turn IDs instead of overwriting them during
    persistence, resume reconstruction, compaction, forked history, and
    websocket incremental reuse.
    - Keep `compaction_trigger` fieldless because it is a request control,
    not a durable response item.
    - Update focused history/request assertions and fixtures for stateless
    requests, websocket incrementals, compaction, thread injection, prompt
    debug, and related CI coverage.
  • [codex] replace remote images with model-visible error text (#29417)
    ## What
    
    This PR will extend the existing centralized image-preparation path to
    replace HTTP(S) image inputs with a model visible error message. It
    won't "ruin" and break existing rollouts, but it will deprecate support
    for the pathway. App server clients should no longer use HTTP image urls
    if they'd like to upgrade.
    
    The HTTP image url pathway is currently resolved in the responsesapi. It
    is slow and not reccomended.
    
    ## Behavior
    
    - HTTP(S) image URL: replace with `input_text`
    - data URL: use the existing decode and resize path
    - other image URL schemes: leave unchanged
    
    This intentionally does not change app-server ingress. That validation
    remains a follow-up.
    
    ## Test plan
    
    - `just test -p codex-core -E
    'test(/image_preparation|prepares_image_failures_before_history_insertion|prepares_resumed_history_before_installing_it|responses_lite_prepares_images/)'`
    — 7 passed
    - `just fix -p codex-core`
    - `just fmt`
  • [plugins] Add dark-mode logo metadata (#29488)
    Adds additive dark-mode plugin logo metadata across manifests, remote
    catalogs, and the app-server protocol while keeping uninstalled Git
    listings free of synthetic local paths.
    
    Supersedes #28945. This replacement uses an upstream branch so trusted
    CI can use the repository-provided remote Bazel configuration.
    
    ## Current state
    
    Plugin interfaces expose only the default logo asset. Clients therefore
    cannot select a dedicated dark-mode logo even when a plugin provides
    one.
    
    ## What this PR changes
    
    - Adds nullable `logoDark` and `logoUrlDark` fields to
    `PluginInterface`.
    - Resolves local `interface.logoDark` assets and maps remote
    `logo_url_dark` values.
    - Removes path-backed interface assets, including `logoDark`, from
    uninstalled Git fallback listings until the plugin has a real local
    root.
    - Updates the bundled plugin validator and manifest reference.
    - Regenerates the app-server JSON schemas and TypeScript types.
    
    Local manifests expose `interface.logoDark` as a package-relative asset
    path. Remote catalog responses expose `logo_url_dark`. These values map
    into separate app-server fields so clients can preserve local-path and
    remote-URL handling.
    
    ## Risk
    
    The fields are additive and nullable, so existing clients retain their
    current logo behavior. The main risks are an incomplete mapping path or
    exposing a synthetic local path for an uninstalled Git plugin.
    Local-manifest, remote-catalog, fallback-listing, protocol
    serialization, and app-server integration tests cover those paths.
    
    Spiciness: 2/5
    
    ## Testing
    
    - `just write-app-server-schema`
    - `just fmt`
    - Regression test first failed with `logo_dark` resolved to
    `/assets/logo-dark.png`, then passed after the fallback-listing fix.
    - `just test -p codex-core-plugins` (267 tests passed)
    - `just test -p codex-app-server 'suite::v2::plugin'` (114 tests passed)
    - `just test -p codex-app-server-protocol -p codex-core-plugins -p
    codex-plugin -p codex-skills` (517 tests passed before the follow-up)
    - `just test -p codex-tui plugin` (47 tests passed)
    - Validated a local plugin manifest containing `interface.logoDark` with
    the bundled validator.
    
    ## Manual verification
    
    Create a local plugin with both `interface.logo` and
    `interface.logoDark`, then call `plugin/list` or `plugin/read`. Confirm
    the response contains separate `logo` and `logoDark` paths. For a remote
    catalog entry, confirm `logoUrlDark` is populated from `logo_url_dark`.
    For an uninstalled Git marketplace entry, confirm path-backed interface
    assets remain absent until installation.
    
    Issue: N/A - coordinated maintainer change.
  • [codex] fetch featured IDs for remote plugins (#29485)
    ## Summary
    
    - fetch featured plugin IDs when the loaded catalog includes
    `openai-curated-remote`
    - extend the existing remote marketplace regression test to cover the
    featured IDs response
    
    ## Why
    
    When the remote plugin catalog was enabled, app-server loaded
    `openai-curated-remote` but skipped `/plugins/featured` because the
    request processor only fetched featured IDs for the local
    `openai-curated` marketplace. As a result, the desktop app could not
    render the backend-curated remote featured set.
    
    This keeps the existing local behavior and also returns the curated
    ranking for remote plugins.
    
    ## Test plan
    
    - `just fmt`
    - `git diff --check`
    - `just test -p codex-app-server
    plugin_list_includes_remote_marketplaces_when_remote_plugin_enabled`
  • permission profiles: expose availability to clients (#26678)
    ## Why
    
    `permissionProfile/list` currently advertises every built-in and
    configured profile even when effective enterprise requirements prevent
    selecting it. That forces each client to reconstruct policy from
    lower-level requirement fields, which is easy to miss and difficult to
    keep consistent.
    
    The catalog should remain complete so clients can explain that an option
    was disabled by an administrator, while also reporting whether each
    profile is selectable.
    
    ## What
    
    - Add an `allowed` field to each permission profile summary.
    - Build a shared catalog from the effective config and current
    requirements, including `allowed_sandbox_modes`, `allowed_permissions`,
    and filesystem restrictions.
    - Use the shared catalog in app-server and the TUI so disallowed
    profiles remain visible but cannot be selected.
    - Use the canonical `:danger-full-access` profile ID in the TUI.
    - Update the app-server schemas, API documentation, behavioral tests,
    and TUI snapshots.
    
    ## Scope
    
    This PR targets `main` directly and is independent of #24852. It
    preserves the current behavior where built-in profiles are constrained
    by sandbox-mode requirements and `allowed_permissions` applies to
    configured profiles.
    
    ## Testing
    
    - `just test -p codex-core
    permission_profile_catalog_marks_profiles_disallowed_by_requirements`
    - `just test -p codex-app-server permission_profile_list`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-tui profile_permissions`
    - `just fix -p codex-core`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fix -p codex-tui`
    - `just fmt`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
    Co-authored-by: Joey Trasatti <joey.trasatti@openai.com>
  • Allow ChatGPT accounts without email (#28991)
    # Summary
    
    Codex required every ChatGPT account to have an email address. A
    service-account personal access token can return valid account metadata
    without one, so PAT login failed while decoding the metadata response.
    
    This change makes email optional in the account metadata type that owns
    it and preserves that absence through authentication, provider account
    state, the app-server API, generated clients, and TUI bootstrap.
    Existing accounts with email addresses keep the same behavior.
    
    ## Behavior-changing call sites
    
    | Call site | Behavior after this change |
    | --- | --- |
    | `login/src/auth/personal_access_token.rs` | PAT metadata accepts a
    missing or null email and retains `None`. |
    | `agent-identity/src/lib.rs` | Agent Identity JWT claims accept an
    omitted email. |
    | `login/src/auth/storage.rs` and `login/src/auth/agent_identity.rs` |
    Stored and managed Agent Identity records carry `Option<String>`.
    Deserialization maps the legacy empty-string sentinel to `None`. |
    | `login/src/auth/manager.rs` | `get_account_email` returns the stored
    option, and managed identity bootstrap no longer converts `None` to an
    empty string. |
    | `model-provider/src/provider.rs` and `protocol/src/account.rs` | A
    ChatGPT provider account requires a plan type but may carry no email. |
    | `app-server-protocol/src/protocol/v2/account.rs` | `account/read`
    keeps the `email` field on the wire and returns `null` when the account
    has no email. Generated TypeScript and JSON schemas describe a required,
    nullable field. |
    | `sdk/python/src/openai_codex/generated/v2_all.py` | The generated
    Python `ChatgptAccount` model accepts `None` for email. |
    | `tui/src/app_server_session.rs` | Email-less ChatGPT accounts
    bootstrap normally, keep external feedback routing, omit account-email
    telemetry, and display the plan in account status. |
    
    ## Design decisions
    
    - Missing email remains `None` at every layer. The code never uses an
    empty string as a substitute.
    - The app-server response includes `"email": null` instead of omitting
    the field. Clients retain a stable response shape.
    - Plan type remains required for provider account state. This change
    relaxes only the email assumption.
    
    ## Testing
    
    Tests: affected test targets compile, scoped Clippy and formatting pass,
    a focused TUI snapshot covers plan-only account status, real
    before/after PAT login smoke covers metadata without email, app-server
    smoke covers `account/read` with `email: null`, and a regression smoke
    covers an existing email-bearing PAT. Unit tests run in CI.
    
    ## Evidence
    
    Visual smoke evidence will be attached here.