59 Commits

  • feat(app-server): add history_mode to thread (#29927)
    ## Description
    
    This PR adds a new `historyMode = "legacy" | "paginated"` field to `Thread`.
    This will be stored in `SessionMeta` in the JSONL rollout file and as a
    new column in the SQLite thread_metadata table, and exposed on
    `thread/start` and on the `Thread` object in app-server.
    
    ## What changed
    
    - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`,
    defaulting old and new SessionMeta to `legacy`.
    - Carried `history_mode` through core session config, ThreadStore stored
    metadata, local/in-memory stores, rollout metadata extraction, and the
    existing SQLite `threads` table.
    - Added experimental `historyMode` to app-server v2 `Thread` and
    `thread/start`.
    - Made paginated stored threads metadata-discoverable but unsupported
    for legacy full-history reads, `load_history`, live resume, and create
    paths.
    - Regenerated app-server schema fixtures and added
    protocol/state/thread-store/app-server coverage for persistence and
    fail-closed behavior.
    
    ## Compatibility floor
    Because users may be running various versions of Codex binaries on the
    same machine (TUI, Codex App, etc.), we will need to establish a
    compatibility floor for upcoming paginated threads, which will change
    how thread storage reads and writes work.
    
    The overall plan here:
    ```
    Release N:
    - Add historyMode to SessionMeta / Thread / SQLite metadata.
    - Teach binaries to understand paginated threads.
    - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread.
    - Default remains `"legacy"`.
    
    Release N+1:
    - First-party clients start opting into paginated threads where appropriate.
    - Internal dogfood / staged rollout.
    - Measure old-client usage and paginated-thread unsupported errors.
    
    Release N+2:
    - Only after Release N+ is overwhelmingly deployed, make paginated the default.
    - Accept that a small tail of N-1-or-older binaries may not understand paginated threads.
    ```
    
    The important behavior change is fail-closed handling for a binary that
    encounters a persisted `paginated` thread before it knows how to fully
    support paginated history. In app-server, if a thread is `paginated`, we
    will:
    
    - allow metadata-only discovery paths like `thread/list` and
    `thread/read(includeTurns=false)`, so clients can still see the thread
    and inspect its `historyMode`
    - reject legacy full-history/live-thread paths like
    `thread/read(includeTurns=true)` and `thread/resume` with an unsupported
    JSON-RPC error
    - avoid silently treating an unknown or future `historyMode` as `legacy`
    
    Under the hood, the ThreadStore layer also rejects legacy operations
    that would need to load or replay the full thread history for a
    paginated thread. That gives us the behavior we want for Release N:
    future paginated threads are visible, but this binary fails closed
    instead of trying to operate on them as if they were legacy threads.
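    
    The fail-closed rule above can be sketched roughly as follows. This is
    a minimal illustration, not the PR's code: `HistoryMode`, `ThreadOp`,
    `GateError`, `parse_history_mode`, and `check_op` are hypothetical
    names, and the real implementation returns an unsupported JSON-RPC
    error rather than this enum.
    
    ```rust
    /// Hypothetical stand-in for the PR's `ThreadHistoryMode`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum HistoryMode {
        Legacy,
        Paginated,
    }

    /// Illustrative operations; the names loosely mirror the app-server paths above.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ThreadOp {
        List,          // thread/list
        ReadMetadata,  // thread/read(includeTurns=false)
        ReadWithTurns, // thread/read(includeTurns=true)
        Resume,        // thread/resume
    }

    #[derive(Debug, PartialEq, Eq)]
    pub enum GateError {
        UnknownHistoryMode(String),
        Unsupported { op: ThreadOp, mode: HistoryMode },
    }

    /// A missing value is an old rollout and defaults to legacy.
    /// An unrecognized value is an error; it is never silently treated as legacy.
    pub fn parse_history_mode(raw: Option<&str>) -> Result<HistoryMode, GateError> {
        match raw {
            None | Some("legacy") => Ok(HistoryMode::Legacy),
            Some("paginated") => Ok(HistoryMode::Paginated),
            Some(other) => Err(GateError::UnknownHistoryMode(other.to_string())),
        }
    }

    /// Gate for a binary that only implements the legacy history contract.
    pub fn check_op(mode: HistoryMode, op: ThreadOp) -> Result<(), GateError> {
        match (mode, op) {
            (HistoryMode::Legacy, _) => Ok(()),
            // Metadata-only discovery stays available for paginated threads.
            (HistoryMode::Paginated, ThreadOp::List | ThreadOp::ReadMetadata) => Ok(()),
            // Anything that needs full history or a live thread fails closed.
            (HistoryMode::Paginated, _) => Err(GateError::Unsupported { op, mode }),
        }
    }
    ```
    
    Keeping parsing separate from gating means an unknown future mode
    surfaces as its own error instead of falling into the legacy branch.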
  • Persist selected capability roots and resolve availability per model step (#29856)
    ## Why
    
    `selectedCapabilityRoots` is durable thread intent: “use this capability
    root from environment `worker`.”
    
    The important product assumption is:
    
    > One environment ID always names the same logical executor and stable
    contents.
    
    `worker` does not silently change from executor A to an unrelated
    executor B. The process-local connection handle for `worker` can still
    be replaced while Codex is running, though, for example when
    `environment/add` registers a fresh handle for the same logical
    environment.
    
    The thread should persist only the stable selection. Each model step
    should pair that selection with the exact ready handle captured for that
    step.
    
    ## The boundary
    
    ```text
    persisted thread intent
      plugin@1 -> environment "worker"
                    |
                    | capture the current step
                    v
    model-step view
      unavailable, or
      plugin@1 + worker's exact captured ready handle
    ```
    
    The environment ID is the stable identity and cache key. The
    `Arc<Environment>` is only a process-local handle retained so consumers
    of one model step use the same captured environment. It is never
    persisted and it does not imply different environment contents.
    
    ## What changes
    
    ### Persist the stable selection
    
    Selected roots are written into `SessionMeta` and restored with the
    thread. Forked subagents inherit the same selections, including
    bounded-history forks.
    
    Only stable data is persisted: root ID, environment ID, and root path.
    
    ### Capture readiness together with the exact handle
    
    The environment snapshot records:
    
    ```rust
    environment_id -> Some(Arc<Environment>) // ready in this step
    environment_id -> None                   // still starting in this step
    ```
    
    This prevents readiness and execution from coming from different
    registry snapshots.
    
    For example:
    
    ```text
    step snapshot: worker -> handle A, ready
    environment/add: worker -> fresh handle B for the same logical environment
    current step: plugin@1 still uses captured handle A
    ```
    
    Without carrying handle A in the snapshot, the resolver could combine “A
    was ready” with handle B and treat B as ready before it had finished
    starting.
    
    This does not change cache invalidation. Stable capability metadata
    remains identified by environment ID and capability root. Replacing a
    process-local handle under the same stable environment ID does not
    invalidate or rediscover that metadata.
    
    ### Resolve availability per model step
    
    - A ready captured environment produces resolved roots using its
    captured handle.
    - A starting, missing, or failed environment is omitted from that step.
    - A selected lazy environment that is outside the turn's captured
    environment set is asked to start, and a later step can observe it as
    ready.
    - No capability files are scanned here.
    
    Transient transport disconnects remain the remote client's reconnect
    concern. This PR models initial attachment/readiness; it does not add
    live socket-connectivity state.
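    
    A rough sketch of the per-step resolution rule, assuming the snapshot
    shape shown earlier. `Environment`, `SelectedRoot`, `StepSnapshot`,
    `ResolvedRoot`, and `resolve_for_step` are hypothetical stand-ins, and
    the lazy-start request for environments outside the captured set is
    not modeled here.
    
    ```rust
    use std::collections::HashMap;
    use std::sync::Arc;

    /// Hypothetical stand-in for the process-local environment handle.
    #[derive(Debug)]
    pub struct Environment {
        pub label: &'static str,
    }

    /// Durable selection: only stable data is kept here.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct SelectedRoot {
        pub root_id: String,
        pub environment_id: String,
        pub root_path: String,
    }

    /// Per-step snapshot: `Some(handle)` means ready, `None` means still starting.
    pub type StepSnapshot = HashMap<String, Option<Arc<Environment>>>;

    /// A selection paired with the exact handle captured for this step.
    #[derive(Debug)]
    pub struct ResolvedRoot {
        pub root_id: String,
        pub handle: Arc<Environment>,
    }

    /// Resolves selections against one snapshot. Starting and missing
    /// environments are omitted for this step; the durable selection is untouched.
    pub fn resolve_for_step(selected: &[SelectedRoot], snapshot: &StepSnapshot) -> Vec<ResolvedRoot> {
        selected
            .iter()
            .filter_map(|root| {
                let handle = snapshot.get(&root.environment_id)?.as_ref()?;
                Some(ResolvedRoot {
                    root_id: root.root_id.clone(),
                    handle: Arc::clone(handle),
                })
            })
            .collect()
    }
    ```
    
    Because readiness and the handle live in the same `Option`, a resolver
    written this way cannot pair "A was ready" with a different handle B.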
    
    ## Example
    
    ```text
    thread selection: plugin@1 -> environment "worker"
    
    step 1: worker is starting -> plugin@1 unavailable
    step 2: worker is ready    -> plugin@1 resolves through worker's captured handle
    step 3: fresh local handle -> current step remains pinned; a later step captures its own view
    ```
    
    Temporary unavailability does not discard the durable selection. Later
    PRs can retain stable metadata caches while projecting only currently
    available capabilities into model-visible World State.
    
    ## Compatibility
    
    The app-server request shape does not change. Older rollouts without
    `selected_capability_roots` deserialize to an empty list.
    
    ## Stack
    
    1. **This PR:** persist stable selected roots and resolve them through
    an exact model-step handle.
    2. #29960: cache stable skill metadata and project available skills into
    World State.
    3. #29946: cache stable plugin declarations and manage the separate live
    MCP runtime.
  • Represent MCP authentication with an enum (#29924)
    ## Why
    
    MCP authentication has distinct OAuth and ChatGPT-session flows.
    Representing that choice as `use_chatgpt_auth` makes one flow implicit
    and allows the configuration model to express the distinction only
    through a boolean.
    
    ChatGPT credential forwarding also needs a first-party trust boundary. A
    configurable `chatgpt_base_url` controls routing, but must not grant an
    MCP server permission to receive session credentials.
    
    This change builds on #29733, where the boolean was introduced.
    
    ## What changed
    
    - Replace `use_chatgpt_auth` with an `auth` field backed by the
    exhaustive `McpServerAuth` enum.
    - Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining
    the default.
    - Trust only the origin derived from the existing hardcoded
    `CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
    - Keep configured bearer tokens and authorization headers ahead of the
    selected authentication flow.
    - Update config writers, schema output, fixtures, and integration-test
    setup to use the enum.
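    
    As a sketch of the resulting decision, under stated assumptions: the
    enum below mirrors the described `McpServerAuth` values, but
    `TRUSTED_BASE_URL` is a placeholder (the real `CHATGPT_CODEX_BASE_URL`
    value is not shown here), and `parse_auth`, `origin`, and
    `may_forward_chatgpt_session` are hypothetical helpers. The origin
    parser is deliberately conservative and does not handle every URL form
    (for example, IPv6 literals without a port are rejected).
    
    ```rust
    /// Hypothetical mirror of the `McpServerAuth` enum; OAuth is the default.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
    pub enum McpServerAuth {
        #[default]
        OAuth,
        ChatGpt,
    }

    /// Placeholder for the hardcoded base URL; the real value is an assumption here.
    pub const TRUSTED_BASE_URL: &str = "https://chatgpt.example/backend-api/codex";

    /// Parses the `auth` config value, defaulting to OAuth when omitted.
    pub fn parse_auth(raw: Option<&str>) -> Result<McpServerAuth, String> {
        match raw {
            None => Ok(McpServerAuth::default()),
            Some("oauth") => Ok(McpServerAuth::OAuth),
            Some("chatgpt") => Ok(McpServerAuth::ChatGpt),
            Some(other) => Err(format!("unknown auth value: {other}")),
        }
    }

    /// Extracts a lowercase `(scheme, host, port)` origin from an HTTP(S) URL.
    fn origin(url: &str) -> Option<(String, String, u16)> {
        let (scheme, rest) = url.split_once("://")?;
        let scheme = scheme.to_ascii_lowercase();
        let default_port = match scheme.as_str() {
            "http" => 80,
            "https" => 443,
            _ => return None,
        };
        let authority = rest.split(&['/', '?', '#'][..]).next()?;
        // Reject userinfo so `trusted@attacker` cannot pass as the trusted host.
        if authority.is_empty() || authority.contains('@') {
            return None;
        }
        let (host, port) = match authority.rsplit_once(':') {
            Some((host, port)) => (host, port.parse().ok()?),
            None => (authority, default_port),
        };
        if host.is_empty() {
            return None;
        }
        Some((scheme, host.to_ascii_lowercase(), port))
    }

    /// Configured authorization wins; otherwise ChatGPT credentials are only
    /// forwarded to the trusted origin. The configurable `chatgpt_base_url`
    /// is intentionally not an input to this decision.
    pub fn may_forward_chatgpt_session(
        auth: McpServerAuth,
        server_url: &str,
        has_configured_authorization: bool,
    ) -> bool {
        if has_configured_authorization || auth != McpServerAuth::ChatGpt {
            return false;
        }
        match (origin(server_url), origin(TRUSTED_BASE_URL)) {
            (Some(server), Some(trusted)) => server == trusted,
            _ => false,
        }
    }
    ```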
    
    ## Verification
    
    Integration coverage exercises the complete streamable HTTP startup path
    in two independent configurations:
    
    - A directly constructed MCP configuration verifies that matching an
    overridden `chatgpt_base_url` does not grant ChatGPT auth.
    - A persisted `config.toml` containing an attacker-controlled
    `chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary
    through normal config parsing.
    
    Both tests complete MCP initialization and tool listing and assert that
    the full captured request sequence contains no authorization headers.
    Separate integration coverage verifies that configured authorization
    takes precedence over ChatGPT auth.
  • Allow ChatGPT-hosted MCP servers to use session auth (#29733)
    ## Why
    
    ChatGPT session authentication was inferred from the reserved Codex Apps
    server name. That couples credential routing to Codex Apps-specific
    behavior and prevents other MCP endpoints hosted by ChatGPT from
    explicitly using the current session.
    
    The opt-in also needs a clear security boundary: an arbitrary MCP
    configuration must not be able to redirect ChatGPT credentials to
    another origin.
    
    ## What changed
    
    - Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
    `false`.
    - Honor the setting only when the parsed server URL has the same HTTP(S)
    origin as the configured `chatgpt_base_url`; otherwise remove the
    capability before startup.
    - Resolve bearer tokens and static or environment-backed authorization
    headers before selecting authentication, with configured authorization
    taking precedence over ChatGPT session auth.
    - Enable the setting for the built-in Codex Apps and hosted plugin
    runtime endpoints while keeping Codex Apps caching and tool
    normalization scoped to the reserved server.
    - Persist the setting through MCP config rewrite paths and expose it in
    the generated config schema.
    - Load the current login state for `codex mcp list` so reported auth
    status matches runtime behavior.
    
    ## Verification
    
    Core integration coverage exercises the complete streamable HTTP MCP
    startup path and verifies that:
    
    - a same-origin opted-in server receives the current ChatGPT access
    token;
    - an explicitly configured authorization header takes precedence;
    - a different-origin server completes MCP initialization and tool
    listing without receiving any ChatGPT authorization header.
  • core: persist initial context window metadata (#29519)
    ## Why
    
    PR #29494 made context-window IDs visible to the model by wrapping the
    token-budget window payload in `<context_window>`, but rollout JSONL
    consumers still could not see the initial window identity by tailing the
    session file. Compacted rollout items carry window IDs only after
    compaction has happened, so a session with no compaction had no durable
    JSONL record for window 0.
    
    This change gives tailing consumers a stable initial-window record at
    session creation time.
    
    ## What Changed
    
    - Added `session_meta.context_window.window_id` for the initial
    context-window identity.
    - `CreateThreadParams` now requires `initial_window_id: String`, so
    thread-store callers cannot accidentally create new threads without
    window-0 metadata.
    - Live thread creation derives the persisted initial window ID from the
    same `AutoCompactWindowIds` used to initialize `SessionState`, keeping
    runtime state and JSONL metadata aligned.
    - Rollout reconstruction uses `session_meta.context_window.window_id` as
    the initial-window fallback and derives `window_number = 0`,
    `first_window_id = window_id`, and `previous_window_id = None`
    internally.
    - Fork reconstruction intentionally uses the same rollout reconstruction
    path; consumers that need to distinguish copied initial-window metadata
    can use the rollout `thread_id`.
    - Legacy compactions without `window_number` still use compaction-count
    fallback accounting instead of being reset to window 0 by the
    initial-window fallback.
    - Compacted rollout metadata still takes precedence once compaction
    records exist, preserving the richer chain fields there.
    
    ## JSONL Shape
    
    Real rollout JSONL is one object per line. This example is expanded for
    readability, but shows the new initial `session_meta.context_window`
    record followed by the existing compacted rollout item shape that also
    carries window IDs:
    
    ```jsonl
    {
      "timestamp": "2026-06-22T12:00:00.000Z",
      "type": "session_meta",
      "payload": {
        "session_id": "<THREAD_ID>",
        "id": "<THREAD_ID>",
        "timestamp": "2026-06-22T12:00:00.000Z",
        "cwd": "/repo",
        "originator": "codex",
        "cli_version": "0.0.0",
        "source": "cli",
        "model_provider": "<MODEL_PROVIDER>",
        "context_window": {
          "window_id": "<INITIAL_WINDOW_ID>"
        }
      }
    }
    ...
    {
      "timestamp": "2026-06-22T12:34:56.000Z",
      "type": "compacted",
      "payload": {
        "message": "<COMPACTION_SUMMARY>",
        "replacement_history": [
          "..."
        ],
        "window_number": 1,
        "first_window_id": "<INITIAL_WINDOW_ID>",
        "previous_window_id": "<INITIAL_WINDOW_ID>",
        "window_id": "<NEXT_WINDOW_ID>"
      }
    }
    ```
    
    The nested `context_window` object is intentional: it gives rollout
    consumers a stable namespace for context-window metadata while only
    writing the non-derivable initial `window_id`. For the initial window,
    `window_number`, `first_window_id`, and `previous_window_id` are derived
    internally instead of being written to the rollout.
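    
    The derivation and precedence rules can be sketched like this.
    `WindowChain` and `reconstruct` are hypothetical names, the field types
    are guesses from the JSONL example, and the legacy compaction-count
    fallback is not modeled.
    
    ```rust
    /// Hypothetical in-memory view of the context-window chain fields.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct WindowChain {
        pub window_id: String,
        pub window_number: u64,
        pub first_window_id: String,
        pub previous_window_id: Option<String>,
    }

    impl WindowChain {
        /// Derives the initial-window fields from the single persisted `window_id`.
        pub fn initial(window_id: &str) -> Self {
            Self {
                window_id: window_id.to_string(),
                window_number: 0,
                first_window_id: window_id.to_string(),
                previous_window_id: None,
            }
        }
    }

    /// Compacted rollout metadata wins when present; otherwise fall back to
    /// `session_meta.context_window.window_id`, if the rollout has one.
    pub fn reconstruct(
        initial_window_id: Option<&str>,
        latest_compacted: Option<WindowChain>,
    ) -> Option<WindowChain> {
        latest_compacted.or_else(|| initial_window_id.map(WindowChain::initial))
    }
    ```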
    
    ## Verification
    
    - `just test -p codex-protocol`
    - `just test -p codex-rollout
    recorder_materializes_on_flush_with_pending_items`
    - `just test -p codex-core reconstruct_history`
    - `just test -p codex-core
    record_initial_history_reconstructs_forked_transcript`
    - `just test -p codex-thread-store`
    - `just test -p codex-state`
    - `just test -p codex-app-server
    thread_read_returns_summary_without_turns`
    - `just test -p codex-rollout persistence_metrics`
  • Persist session IDs across thread resume (#29327)
    ## Summary
    
    A cold-resumed subagent kept its durable thread ID but could receive a
    new session ID, splitting one agent tree across multiple sessions after
    a restart.
    
    Persist the root session ID in every rollout `SessionMeta`, carry it
    through thread creation, and restore it before initializing the resumed
    `Session` and `AgentControl`.
    
    ## Behavior
    
    For a nested agent tree:
    
    ```text
    root session R
      parent thread P
        child thread C
    ```
    
    The child rollout stores:
    
    ```text
    session_id:       R
    parent_thread_id: P
    id:               C
    ```
    
    After a cold resume, the child still belongs to root session `R` while
    its immediate parent remains `P`. The integration coverage uses distinct
    values for all three IDs so it catches restoring the session from
    `parent_thread_id`.
    
    ## Legacy rollouts
    
    Previous rollouts have `id` but no `session_id`. `SessionMetaLine`
    deserialization treats a missing `session_id` as `id`, keeping those
    files readable, listable, and resumable. When a legacy subagent is
    resumed through its root, that synthesized child ID no longer overrides
    the inherited root-scoped `AgentControl`. New rollouts always persist
    the explicit root session ID.
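    
    A minimal sketch of the legacy fallback, assuming a two-stage shape
    where a raw line is normalized after parsing. `RawSessionMeta`,
    `SessionMeta`, and `normalize` are hypothetical; the actual
    `SessionMetaLine` deserialization may be structured differently.
    
    ```rust
    /// Hypothetical raw shape: legacy rollouts have `id` but no `session_id`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct RawSessionMeta {
        pub id: String,
        pub session_id: Option<String>,
        pub parent_thread_id: Option<String>,
    }

    /// Normalized shape: `session_id` is always present.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct SessionMeta {
        pub id: String,
        pub session_id: String,
        pub parent_thread_id: Option<String>,
    }

    /// Treats a missing `session_id` as `id`. It is never taken from
    /// `parent_thread_id`, which names the immediate parent, not the root session.
    pub fn normalize(raw: RawSessionMeta) -> SessionMeta {
        let session_id = match raw.session_id {
            Some(session_id) => session_id,
            None => raw.id.clone(),
        };
        SessionMeta {
            id: raw.id,
            session_id,
            parent_thread_id: raw.parent_thread_id,
        }
    }
    ```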
  • Represent dynamic tools with explicit namespaces internally (#27365)
    Follow-up to #27356.
    
    ## Stack note
    
    This PR changes Codex's internal dynamic-tool shape while leaving
    `thread/start` unchanged. App-server therefore converts the existing
    per-tool input into explicit functions and namespaces before passing it
    to core.
    
    [#27371](https://github.com/openai/codex/pull/27371) updates
    `thread/start` to use the same explicit shape and removes this temporary
    conversion.
    
    ## Why
    
    Dynamic tools repeat namespace metadata on every function. Core should
    keep one explicit namespace with its member tools so descriptions and
    membership stay consistent across sessions and runtime planning.
    
    ## What changed
    
    - Represent dynamic tools as top-level functions or explicit namespaces
    in protocol and session state.
    - Read old flat rollout metadata and write the canonical hierarchy.
    - Flatten namespace members only when registering callable tools.
    - Keep `thread/start.dynamicTools` flat for now and normalize it at the
    app-server boundary.
    
    New builds can read old rollout metadata. Older builds cannot read newly
    written hierarchical metadata.
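    
    To illustrate the two directions (normalize on read, flatten on
    registration), here is a sketch under assumed shapes. `FlatTool`,
    `DynamicTool`, `normalize`, and `callable_tools` are hypothetical, as
    are the legacy field names; the sketch returns `(namespace, name)`
    pairs rather than guessing how callable names are joined.
    
    ```rust
    /// Hypothetical legacy shape: namespace metadata repeated on every function.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct FlatTool {
        pub name: String,
        pub namespace: Option<String>,
        pub namespace_description: Option<String>,
    }

    /// Hypothetical canonical shape: top-level functions or explicit namespaces.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum DynamicTool {
        Function { name: String },
        Namespace { name: String, description: Option<String>, members: Vec<String> },
    }

    /// Groups flat tools into one namespace entry each, in first-seen order.
    /// The first description seen for a namespace is kept.
    pub fn normalize(flat: &[FlatTool]) -> Vec<DynamicTool> {
        let mut out: Vec<DynamicTool> = Vec::new();
        for tool in flat {
            let Some(ns) = tool.namespace.as_ref() else {
                out.push(DynamicTool::Function { name: tool.name.clone() });
                continue;
            };
            let existing = out
                .iter()
                .position(|t| matches!(t, DynamicTool::Namespace { name, .. } if name == ns));
            match existing {
                Some(index) => {
                    if let DynamicTool::Namespace { members, .. } = &mut out[index] {
                        members.push(tool.name.clone());
                    }
                }
                None => out.push(DynamicTool::Namespace {
                    name: ns.clone(),
                    description: tool.namespace_description.clone(),
                    members: vec![tool.name.clone()],
                }),
            }
        }
        out
    }

    /// Flattens namespace members only at registration time.
    pub fn callable_tools(tools: &[DynamicTool]) -> Vec<(Option<&str>, &str)> {
        let mut out = Vec::new();
        for tool in tools {
            match tool {
                DynamicTool::Function { name } => out.push((None, name.as_str())),
                DynamicTool::Namespace { name, members, .. } => {
                    for member in members {
                        out.push((Some(name.as_str()), member.as_str()));
                    }
                }
            }
        }
        out
    }
    ```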
    
    ## Test plan
    
    - `just test -p codex-app-server
    thread_start_normalizes_legacy_dynamic_tools_into_model_request`
    - `just test -p codex-protocol
    session_meta_normalizes_legacy_dynamic_tools`
    - `just test -p codex-core
    resume_restores_dynamic_tools_from_rollout_with_sqlite_enabled`
    - `just test -p codex-core
    tool_search_returns_deferred_dynamic_tool_and_routes_follow_up_call`
    - `just test -p codex-core code_mode_can_call_hidden_dynamic_tools`
    - `just test -p codex-tools`
  • Pair thread environment settings (#26687)
    ## Why
    
    Thread cwd and environment selections are a single logical setting in
    core: updating one without the other can silently desynchronize the
    next-turn execution context. This change makes that relationship
    explicit in the internal thread settings flow while preserving the
    existing app-server public API shape.
    
    ## What changed
    
    - Moved the cwd/environment pair through internal
    `ThreadSettingsOverrides.environment_settings` instead of a top-level
    internal `cwd` field.
    - Kept `thread/settings/update` public params unchanged, with app-server
    translating top-level `cwd` into the paired internal settings shape.
    - Moved `Op::UserInput` environment overrides into thread settings so
    user turns and settings updates use the same core path.
    - Updated core, app-server, MCP, memories, sample, and test callsites to
    construct the paired settings shape.
    
    ## Verification
    
    - `git diff --check`
    - Local test run starting after PR creation.
  • [codex] Exclude external tool output from memories (#26821)
    ## Summary
    
    - Add `contains_external_context()` to tool output so other tools can be
    opted out of influencing memory when `disable_on_external_context=true`.
    - Classify standalone web-search output as external context (to match
    the behavior of hosted web search).
    - Verify with an integration test.
  • Require absolute cwd in thread settings (#26532)
    ## Why
    
    Thread settings cwd overrides are expected to be resolved before they
    enter core. Keeping this boundary as a plain `PathBuf` made it easy for
    core/session code to keep fallback normalization and relative-path
    resolution logic in places that should only receive an already-resolved
    cwd.
    
    This is intentionally the absolute-cwd-only slice: it does not change
    environment selection stickiness or cwd-to-default-environment fallback
    behavior.
    
    ## What changed
    
    - Changes `ThreadSettingsOverrides.cwd`,
    `CodexThreadSettingsOverrides.cwd`, and `SessionSettingsUpdate.cwd` to
    use `AbsolutePathBuf`.
    - Removes core-side cwd normalization/resolution from session settings
    updates.
    - Updates affected core/app-server test helpers and callsites to pass
    existing absolute cwd values or use `abs()` helpers.
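    
    The boundary this establishes can be sketched with a newtype. `AbsPath`
    below is a hypothetical stand-in, not the real `AbsolutePathBuf` API,
    and `SettingsOverrides` is an illustrative struct; the point is only
    that relative paths are rejected or resolved before the value reaches
    core.
    
    ```rust
    use std::path::{Path, PathBuf};

    /// Hypothetical stand-in for `AbsolutePathBuf`: can only hold an absolute path.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct AbsPath(PathBuf);

    impl AbsPath {
        /// Rejects relative paths, returning the offending path as the error.
        pub fn new(path: impl Into<PathBuf>) -> Result<Self, PathBuf> {
            let path = path.into();
            if path.is_absolute() {
                Ok(Self(path))
            } else {
                Err(path)
            }
        }

        /// Resolves a possibly relative path against an absolute base.
        /// Callers at the boundary do this once; core never needs to.
        pub fn resolve(base: &AbsPath, path: impl AsRef<Path>) -> Self {
            Self(base.0.join(path))
        }

        pub fn as_path(&self) -> &Path {
            &self.0
        }
    }

    /// Illustrative overrides struct: the type makes a relative cwd unrepresentable.
    #[derive(Debug, Default)]
    pub struct SettingsOverrides {
        pub cwd: Option<AbsPath>,
    }
    ```
    
    With the type carrying the invariant, core-side fallback normalization
    has nothing left to normalize.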
    
    ## Validation
    
    Opening as draft so CI can start while local validation continues.
  • Add multi-agent runtime metadata types (#25720)
    Stack split from #25708. Original PR intentionally left open. This first
    PR adds the multi-agent runtime metadata types and catalog plumbing used
    by the rest of the stack.
  • app-server: remove experimental persist_extended_history bool flag (#25712)
    ## Summary
    
    Remove the dead experimental `persistExtendedHistory` app-server flag
    and collapse rollout persistence to the single policy app-server already
    used.
    
    ## What Changed
    
    - Removed `persistExtendedHistory` from v2 thread start/resume/fork
    params and deleted its deprecation notice path.
    - Removed the persistence-mode enums and plumbing through core, rollout,
    and thread-store.
    - Made rollout filtering mode-free, keeping the existing limited
    persisted-history behavior.
    
    ## Test Plan
    
    - `just write-app-server-schema`
    - `cargo nextest run --no-fail-fast -p codex-app-server-protocol
    schema_fixtures`
    - `cargo nextest run --no-fail-fast -p codex-app-server
    thread_shell_command_history_responses_exclude_persisted_command_executions`
    - `cargo nextest run --no-fail-fast -p codex-rollout -p
    codex-thread-store`
    - final `rg` for removed flag/type names
  • store and expose parent_thread_id on Threads (#25113)
    ## Why
    
    The review discussion at
    https://github.com/openai/codex/pull/24161#discussion_r3325692763
    revealed a subagent data modeling issue: we overloaded
    `forked_from_id` to also mean `parent_thread_id`. That's incorrect because
    guardian and review subagents can be subagents and NOT fork the main
    thread's history.
    
    The solution here is to explicitly store a new `parent_thread_id` on
    `SessionMeta`, alongside `forked_from_id` which already exists. While
    we're at it, also expose it in the app-server protocol on the `Thread`
    object.
    
    A thread->subagent relationship and a fork of thread history are
    orthogonal concepts.
    
    ## What Changed
    
    - Added top-level `parent_thread_id` persistence on `SessionMeta` and
    runtime/session plumbing through `SessionConfiguredEvent`,
    `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`,
    `TurnContext`, and `ModelClient`.
    - Made turn metadata, request headers, analytics, and subagent-start
    events read the separate runtime/top-level parent field instead of
    deriving general parent lineage from `SessionSource` or
    `forked_from_thread_id`.
    - Passed parent lineage separately at delegated subagent, review,
    guardian, agent-job, and multi-agent spawn construction sites;
    copied-history fork lineage remains derived only from `InitialHistory`.
    - Persisted and exposed parent lineage through rollout/thread-store
    projections and app-server v2 `Thread.parentThreadId`.
    - Updated app-server README text and regenerated app-server schema
    fixtures for the additive `parentThreadId` response field.
  • [codex] Wait for MCP readiness in core integration tests (#24964)
    Ensures MCP-backed `codex-core` integration tests exercise initialized
    servers instead of racing server startup.
    
    I've been idly investigating a few flakes and the failure modes are much
    more confusing when a tool call fails because of a failed server start
    than when the failed server start causes the test to fail directly.
  • [codex] Add user input client ids (#24653)
    ## Summary
    
    Adds an optional `clientId` field to app-server v2 `UserInput` and
    carries it through the core `UserInput` model so clients can correlate
    echoed user input items without relying on payload equality.
    
    ## Details
    
    - Adds `client_id: Option<String>` to core `UserInput` variants.
    - Exposes the v2 app-server field as `clientId` on the wire and in
    generated TypeScript.
    - Preserves the id when converting between app-server v2 and core
    protocol types.
    - Regenerates app-server schema fixtures.
    
    ## Validation
    
    - `just fmt`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-protocol`
    - `git diff --check`
  • [codex] Remove redundant SQLite dynamic tool storage (#24819)
    ## Why
    
    Dynamic tools are defined at thread start and already stored in rollout
    `SessionMeta`, which restores resumed and forked sessions. Persisting
    the same tools through SQLite creates a second runtime persistence path
    that is unnecessary prework for the explicit namespace refactor.
    
    ## What changed
    
    - Restore missing thread-start dynamic tools directly from rollout
    history, including when SQLite is enabled.
    - Remove SQLite dynamic-tool reads, writes, backfill, and thread
    metadata patch plumbing.
    - Add SQLite-enabled resume integration coverage that verifies a
    rollout-defined dynamic tool is still sent after resume.
    
    ## Compatibility
    
    The existing `thread_dynamic_tools` table is intentionally not dropped
    even though it's now unused. Older Codex binaries are allowed to open
    databases migrated by newer binaries and still reference this table;
    dropping it would break that mixed-version path. See
    [here](https://github.com/openai/codex/blob/main/codex-rs/state/src/migrations.rs#L10-L11).
    
    ## Verification
    
    - `just test -p codex-state -p codex-rollout -p codex-thread-store`
    - `just test -p codex-core --test all
    resume_restores_dynamic_tools_from_rollout_with_sqlite_enabled`
  • Add experimental turn additional context (#24154)
    ## Summary
    
    Adds experimental `additionalContext` support to `turn/start` and
    `turn/steer` so clients can provide ephemeral external context, such as
    browser or automation state, without turning that plumbing into a
    visible user prompt or triggering user-prompt lifecycle behavior.
    
    ## API Shape
    
    The parameter shape is:
    
    ```ts
    additionalContext?: Record<string, {
      value: string
      kind: "untrusted" | "application"
    }> | null
    ```
    
    Example:
    
    ```json
    {
      "additionalContext": {
        "browser_info": {
          "value": "Active tab is CI failures.",
          "kind": "untrusted"
        },
        "automation_info": {
          "value": "CI rerun is in progress.",
          "kind": "application"
        }
      }
    }
    ```
    
    The keys are opaque and caller-defined.
    
    ## Context Injection
    
    When provided, accepted entries are inserted into model context as
    hidden contextual message items, not as visible thread user-message
    items.
    
    `kind: "untrusted"` entries are inserted with role `user`:
    
    ```text
    <external_${key}>${value}</external_${key}>
    ```
    
    `kind: "application"` entries are inserted with role `developer`:
    
    ```text
    <${key}>${value}</${key}>
    ```
    
    Values are not escaped. Each value is truncated to approximately 1k
    tokens before wrapping.
    
    For `turn/start`, accepted additional context is inserted before normal
    user input. For `turn/steer`, additional context is merged only when the
    steer includes non-empty user input; context-only steers are still
    rejected as empty input.
    
    ## Dedupe Strategy
    
    `AdditionalContextStore` lives on session state and stores the latest
    complete additional-context map.
    
    Each `turn/start` or non-empty `turn/steer` treats its
    `additionalContext` as the current complete set of values. Entries are
    injected only when the key is new or the exact entry for that key
    changed, including `value` or `kind`. After merging, the store is
    replaced with the provided map, so omitted keys are removed from the
    retained set and can be injected again later if reintroduced.
    
    Omitting `additionalContext`, passing `null`, or passing an empty object
    resets the store to empty and injects nothing.
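    
    A sketch of the dedupe rule, assuming an ordered map for deterministic
    injection order. `Kind`, `Entry`, `Fragment`, and the `merge` method are
    hypothetical; only the `AdditionalContextStore` name and the wrapping
    formats come from this description. Truncation and the empty-steer
    rejection are not modeled.
    
    ```rust
    use std::collections::BTreeMap;

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Kind {
        Untrusted,
        Application,
    }

    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Entry {
        pub value: String,
        pub kind: Kind,
    }

    /// A hidden contextual item: the role it is inserted with plus its wrapped text.
    #[derive(Debug, PartialEq, Eq)]
    pub struct Fragment {
        pub role: &'static str,
        pub text: String,
    }

    /// Holds the latest complete additional-context map.
    #[derive(Default)]
    pub struct AdditionalContextStore {
        latest: BTreeMap<String, Entry>,
    }

    impl AdditionalContextStore {
        /// Treats `provided` as the complete current set. Returns fragments only
        /// for keys that are new or whose entry changed, then replaces the store,
        /// so omitted keys are forgotten and can be injected again later.
        pub fn merge(&mut self, provided: Option<BTreeMap<String, Entry>>) -> Vec<Fragment> {
            let provided = provided.unwrap_or_default();
            let fragments: Vec<Fragment> = provided
                .iter()
                .filter(|(key, entry)| self.latest.get(*key) != Some(*entry))
                .map(|(key, entry)| render(key, entry))
                .collect();
            self.latest = provided;
            fragments
        }
    }

    /// Wraps a value without escaping, using the role that matches its kind.
    fn render(key: &str, entry: &Entry) -> Fragment {
        match entry.kind {
            Kind::Untrusted => Fragment {
                role: "user",
                text: format!("<external_{key}>{}</external_{key}>", entry.value),
            },
            Kind::Application => Fragment {
                role: "developer",
                text: format!("<{key}>{}</{key}>", entry.value),
            },
        }
    }
    ```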
    
    ## What Changed
    
    - Threads experimental v2 `additionalContext` through app-server into
    core turn start and steer handling.
    - Adds separate contextual fragment types for untrusted user-role
    context and application developer-role context.
    - Uses pending response input items so additional context can be
    combined with normal user input without treating it as prompt text.
    - Adds integration coverage for start/steer flow, role routing,
    dedupe/reset behavior, deletion/re-add behavior, hook-blocked input
    behavior, empty context-only steer rejection, external-fragment marker
    matching, and truncation.
  • Move MCP tool naming mode into manager (#21576)
    ## Why
    
    The `non_prefixed_mcp_tool_names` feature should be applied where MCP
    tools become model-visible, not by remapping names later in core.
    Keeping the decision in `McpConnectionManager` construction makes
    `ToolInfo` the single shaped view that spec building, deferred tool
    search, routing, and unavailable-tool placeholders can consume directly.
    
    This also preserves the existing external behavior while the feature is
    off, and keeps the feature-on behavior for code mode and hooks explicit
    at the manager boundary.
    
    ## What Changed
    
    - Add `McpToolNameMode` to `codex-mcp` and flow it through `McpConfig`
    into `McpConnectionManager::new`.
    - Normalize MCP `ToolInfo` names in the manager using either
    legacy-prefixed namespaces or non-prefixed namespaces; the legacy path
    adds `mcp__` without restoring the old trailing namespace suffix.
    - Remove the core-side MCP name remapping path so specs, tool search,
    session resolution, and unavailable-tool placeholder construction use
    the manager-provided `ToolName` values directly.
    - Keep code mode flattening on the `__` namespace separator.
    - Preserve hook compatibility by giving non-prefixed MCP hook names
    legacy `mcp__...` matcher aliases.
    - Add/adjust integration and unit coverage for non-prefixed code-mode
    behavior, hook matching with the feature on and off, and manager-level
    legacy prefixing.
    
    ## Testing
    
    - `cargo test -p codex-mcp --lib`
    - `cargo test -p codex-core --lib tools::spec::tests -- --nocapture`
    - `cargo test -p codex-core --lib mcp_tools -- --nocapture`
    - `cargo test -p codex-core --lib mcp_tool_exposure -- --nocapture`
    - `cargo test -p codex-core --test all mcp_tool -- --nocapture`
    - `cargo test -p codex-core --test all search_tool -- --nocapture`
    - `cargo test -p codex-core --test all hooks_mcp -- --nocapture`
    - `cargo test -p codex-core --test all
    code_mode_uses_non_prefixed_mcp_tool_names_when_feature_enabled --
    --nocapture`
    - `cargo test -p codex-tools`
    - `cargo test -p codex-features`
  • Route MCP servers through explicit environments (#23583)
    ## Summary
    - route each configured MCP server through an explicit per-server
    `environment_id` instead of a manager-wide remote toggle
    - default omitted `environment_id` to `local`, resolve named ids through
    `EnvironmentManager`, and fail only the affected MCP server when an
    explicit id is unknown
    - keep local stdio on the existing local launcher path for now, while
    named-environment stdio uses the selected environment backend and
    requires an absolute `cwd`
    - allow local HTTP MCP servers to keep using the ambient HTTP client
    when no local `Environment` is configured; named-environment HTTP MCPs
    use that environment's HTTP client
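
    As a rough sketch of the per-server routing described above (not the actual implementation): `McpServerConfig`, `Route`, and `route_server` are hypothetical names, and the list of known environment ids stands in for a lookup through `EnvironmentManager`.

    ```rust
    use std::path::Path;

    /// Hypothetical, trimmed-down view of one configured MCP server.
    pub struct McpServerConfig {
        pub environment_id: Option<String>,
        pub is_stdio: bool,
        pub cwd: Option<String>,
    }

    #[derive(Debug, PartialEq, Eq)]
    pub enum Route {
        Local,
        Named(String),
    }

    /// Resolve one server's route; an error here should fail only this server.
    pub fn route_server(cfg: &McpServerConfig, known_envs: &[&str]) -> Result<Route, String> {
        // An omitted `environment_id` defaults to `local`.
        let id = cfg.environment_id.as_deref().unwrap_or("local");
        if id == "local" {
            return Ok(Route::Local);
        }
        if !known_envs.contains(&id) {
            return Err(format!("unknown environment_id `{id}`"));
        }
        // Named-environment stdio requires an absolute `cwd`.
        let cwd_is_absolute = cfg
            .cwd
            .as_deref()
            .map_or(false, |cwd| Path::new(cwd).is_absolute());
        if cfg.is_stdio && !cwd_is_absolute {
            return Err(format!("stdio server in `{id}` requires an absolute cwd"));
        }
        Ok(Route::Named(id.to_string()))
    }
    ```

    Returning a `Result` per server keeps one bad `environment_id` from affecting the other configured servers.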
    
    ## Validation
    - devbox Bazel build: `bazel build --bes_backend= --bes_results_url=
    //codex-rs/cli:codex //codex-rs/rmcp-client:test_stdio_server
    //codex-rs/rmcp-client:test_streamable_http_server`
    - devbox app-server config matrix with real `config.toml` /
    `environments.toml` files covering omitted local, explicit local,
    omitted local under remote default, explicit remote stdio, local HTTP
    without local env, explicit remote HTTP, local stdio without local env,
    unknown explicit env, and remote stdio without `cwd`
  • [3 of 7] Remove UserTurn (#23075)
    **Stack position:** [3 of 7]
    
    ## Summary
    
    This PR finishes the input-op consolidation by moving the remaining
    `Op::UserTurn` callers onto `Op::UserInput` and deleting `Op::UserTurn`.
    This touches a lot of files, but it is a low-risk mechanical migration.
    
    ## Stack
    
    1. [1 of 7] [Add thread settings to
    UserInput](https://github.com/openai/codex/pull/23080)
    2. [2 of 7] [Remove
    UserInputWithTurnContext](https://github.com/openai/codex/pull/23081)
    3. [3 of 7] [Remove
    UserTurn](https://github.com/openai/codex/pull/23075) (this PR)
    4. [4 of 7] [Placeholder for OverrideTurnContext
    cleanup](https://github.com/openai/codex/pull/23087)
    5. [5 of 7] [Replace OverrideTurnContext with
    ThreadSettings](https://github.com/openai/codex/pull/22508)
    6. [6 of 7] [Add app-server thread settings
    API](https://github.com/openai/codex/pull/22509)
    7. [7 of 7] [Sync TUI thread
    settings](https://github.com/openai/codex/pull/22510)
  • Preserve image detail in app-server inputs (#20693)
    ## Summary
    
    - Add optional image detail to user image inputs across core, app-server
    v2, thread history/event mapping, and the generated app-server
    schemas/types.
    - Preserve requested detail when serializing Responses image inputs:
    omitted detail stays on the existing `high` default, while explicit
    `original` keeps local images on the original-resolution path.
    - Support `high`/`original` consistently for tool image outputs,
    including MCP `codex/imageDetail`, code-mode image helpers, and
    `view_image`.
  • Support explicit MCP OAuth client IDs (#22575)
    ## Why
    Some MCP OAuth providers require a pre-registered public client ID and
    cannot rely on dynamic client registration. Codex already supports MCP
    OAuth, but it had no way to supply that client ID from config into the
    PKCE flow.
    
    ## What changed
    - add `oauth.client_id` under `[mcp_servers.<server>]` config, including
    config editing and schema generation
    - thread the configured client ID through CLI, app-server, plugin login,
    and MCP skill dependency OAuth entrypoints
    - configure RMCP authorization with the explicit client when present,
    while preserving the existing dynamic-registration path when it is
    absent
    - add focused coverage for config parsing/serialization and OAuth URL
    generation
    
    ## Verification
    - `cargo test -p codex-config -p codex-rmcp-client -p codex-mcp -p
    codex-core-plugins`
    - `cargo test -p codex-core blocking_replace_mcp_servers_round_trips
    --lib`
    - `cargo test -p codex-core
    replace_mcp_servers_streamable_http_serializes_oauth_resource --lib`
    - `cargo test -p codex-core config_schema_matches_fixture --lib`
    
    ## Notes
    Broader local package runs still hit unrelated pre-existing stack
    overflows in:
    - `codex-app-server::in_process_start_clamps_zero_channel_capacity`
    -
    `codex-core::resume_agent_from_rollout_uses_edge_data_when_descendant_metadata_source_is_stale`
  • chore: drop built-in MCPs (#22173)
    Drop something that was never used
  • feat: make built-in MCPs first-class runtime servers (#21356)
    ## DISCLAIMER
    This is experimental; no production service should rely on it.
    
    ## Why
    
    Built-in MCPs are product-owned runtime capabilities, but they were
    previously flattened into the same config-backed stdio path as
    user-configured servers. That made them depend on a hidden `codex
    builtin-mcp` re-exec path, exposed them through config-oriented CLI
    flows, and erased distinctions the runtime needs to preserve—most
    notably whether an MCP call should count as external context for
    memory-mode pollution.
    
    ## What changed
    
    - Model product-owned built-ins separately from config-backed MCP
    servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
    - Launch built-ins in process through a reusable async transport instead
    of the hidden `builtin-mcp` stdio subcommand.
    - Keep config-oriented CLI operations such as `codex mcp
    list/get/login/logout` scoped to configured servers, while merging
    built-ins only into the effective runtime server set.
    - Retain server metadata after launch so parallel-tool support and
    context classification come from the live server set; built-in
    `memories` is now classified as local Codex state rather than external
    context.
    
    ## Test plan
    
    - `cargo test -p codex-mcp`
    - `cargo test -p codex-core --test suite
    builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add session_id (#20437)
    ## Summary
    
    Related to
    https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
    TL;DR:
    We update the meaning of session IDs and thread IDs:
    * thread_id stays as it is now
    * session_id becomes an ID shared by every thread under a /root
    thread (i.e. every sub-agent shares the same session ID)
    
    This PR introduces an explicit `SessionId` and threads it through the
    protocol/client boundary so `session_id` and `thread_id` can diverge
    when they need to, while preserving compatibility for older serialized
    `session_configured` events.
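
    The relationship between the two ids can be illustrated with a small sketch. The types below are simplified stand-ins rather than the real protocol types, and the choice to seed a root thread's session id from its own thread id is an assumption made only for illustration.

    ```rust
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct SessionId(pub String);

    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct ThreadId(pub String);

    /// Hypothetical pairing of the two ids carried by a thread.
    #[derive(Clone, Debug)]
    pub struct ThreadIdentity {
        pub session_id: SessionId,
        pub thread_id: ThreadId,
    }

    impl ThreadIdentity {
        /// Assumption: a root thread seeds the session id from its own thread id.
        pub fn root(thread_id: &str) -> Self {
            Self {
                session_id: SessionId(thread_id.to_string()),
                thread_id: ThreadId(thread_id.to_string()),
            }
        }

        /// A sub-agent gets its own thread id but inherits the session id.
        pub fn spawn_sub_agent(&self, thread_id: &str) -> Self {
            Self {
                session_id: self.session_id.clone(),
                thread_id: ThreadId(thread_id.to_string()),
            }
        }
    }
    ```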
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex-analytics] rework thread_source for thread analytics (#20949)
    ## Summary
    - make `thread_source` an explicit optional thread-level field on
    `thread/start`, `thread/fork`, and returned thread payloads
    - persist `thread_source` in rollout/session metadata so resumed live
    threads retain the original value
    - replace the old best-effort `session_source` -> `thread_source`
    mapping with an explicit caller-supplied analytics classification
    
    ## Why
    Before this change, analytics `thread_source` was populated by a
    best-effort mapping from `session_source`. `session_source` describes
    the runtime/client surface, not the actual thread-level origin, so that
    projection was not accurate enough to distinguish cases such as `user`,
    `subagent`, `memory_consolidation`, and future thread origins reliably.
    
    Making `thread_source` explicit keeps one thread-level analytics field
    while letting callers provide the real classification directly instead
    of recovering it indirectly from `session_source`.
    
    ## Impact
    For new analytics events, `thread_source` now reflects the explicit
    thread-level classification supplied by the caller rather than an
    inferred value derived from `session_source`. Existing protocol fields
    remain optional; callers that omit `threadSource` now produce `null`
    instead of a best-effort inferred value.
    
    ## Validation
    - `just write-app-server-schema`
    - `cargo test -p codex-analytics -p codex-core -p
    codex-app-server-protocol --no-run`
    - `cargo test -p codex-app-server-protocol
    generated_ts_optional_nullable_fields_only_in_params`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `cargo test -p codex-core
    resume_stopped_thread_from_rollout_preserves_thread_source`
  • core tests: migrate more turns to permission profiles (#20013)
    ## Summary
    - Migrate another batch of direct `Op::UserTurn` test construction from
    legacy `SandboxPolicy` values to `PermissionProfile` inputs via
    `turn_permission_fields()`.
    - Replace a one-off read-only `SandboxPolicy` bridge in the macOS exec
    test with `PermissionProfile::read_only()`.
    - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 32
    files at the start of the cleanup stack to 27 files.
    
    ## Testing
    - `cargo check -p codex-core --tests`
    - `just fmt`
    - `just fix -p codex-core`
  • tui: carry permission profiles on user turns (#18285)
    ## Why
    
    Per-turn permission overrides should use the same canonical profile
    abstraction as session configuration. That lets TUI submissions preserve
    exact configured permissions without round-tripping through legacy
    sandbox fields.
    
    ## What changed
    
    This adds `permission_profile` to user-turn operations, threads it
    through TUI/app-server submission paths, fills the new field in existing
    test fixtures, and adds coverage that composer submission includes the
    configured profile.
    
    ## Verification
    
    - `cargo test -p codex-tui permissions -- --nocapture`
    - `cargo test -p codex-core --test all permissions_messages --
    --nocapture`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18285).
    * #18288
    * #18287
    * #18286
    * __->__ #18285
  • Add turn-scoped environment selections (#18416)
    ## Summary
    - add experimental turn/start.environments params for per-turn
    environment id + cwd selections
    - pass selections through core protocol ops and resolve them with
    EnvironmentManager before TurnContext creation
    - treat omitted selections as default behavior, empty selections as no
    environment, and non-empty selections as using the first
    environment/cwd as the turn primary
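
    The three cases in the last bullet map naturally onto an `Option<Vec<_>>`. A minimal sketch, with hypothetical type and function names that are not taken from the PR:

    ```rust
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct EnvironmentSelection {
        pub environment_id: String,
        pub cwd: String,
    }

    #[derive(Debug, PartialEq, Eq)]
    pub enum TurnEnvironments {
        /// Selections were omitted: keep the default behavior.
        UseDefault,
        /// Selections were present but empty: run with no environment.
        NoEnvironment,
        /// The first selection becomes the turn primary.
        Selected {
            primary: EnvironmentSelection,
            others: Vec<EnvironmentSelection>,
        },
    }

    pub fn resolve_turn_environments(
        selections: Option<Vec<EnvironmentSelection>>,
    ) -> TurnEnvironments {
        match selections {
            None => TurnEnvironments::UseDefault,
            Some(list) => {
                let mut iter = list.into_iter();
                match iter.next() {
                    None => TurnEnvironments::NoEnvironment,
                    Some(primary) => TurnEnvironments::Selected {
                        primary,
                        others: iter.collect(),
                    },
                }
            }
        }
    }
    ```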
    
    ## Testing
    - ran `just fmt`
    - ran `just write-app-server-schema`
    - not run: unit tests for this stacked PR
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [tool search] support namespaced deferred dynamic tools (#18413)
    Deferred dynamic tools need to round-trip a namespace so a tool returned
    by `tool_search` can be called through the same registry key that core
    uses for dispatch.
    
    This change adds namespace support for dynamic tool specs/calls,
    persists it through app-server thread state, and routes dynamic tool
    calls by full `ToolName` while still sending the app the leaf tool name.
    Deferred dynamic tools must provide a namespace; non-deferred dynamic
    tools may remain top-level.
    
    It also introduces `LoadableToolSpec` as the shared
    function-or-namespace Responses shape used by both `tool_search` output
    and dynamic tool registration, so dynamic tools use the same wrapping
    logic in both paths.
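
    A minimal sketch of the namespace rule and the routing split described above. The struct shapes and the helpers `validate` and `leaf_name_for_call` are hypothetical and only illustrate the idea of dispatching by full name while forwarding the leaf name.

    ```rust
    use std::collections::HashMap;

    /// Simplified stand-in for `ToolName`: an optional namespace plus a leaf name.
    #[derive(Clone, Debug, PartialEq, Eq, Hash)]
    pub struct ToolName {
        pub namespace: Option<String>,
        pub name: String,
    }

    /// Hypothetical dynamic tool registration.
    pub struct DynamicToolSpec {
        pub tool: ToolName,
        pub deferred: bool,
    }

    /// Deferred tools must carry a namespace; non-deferred tools may be top-level.
    pub fn validate(spec: &DynamicToolSpec) -> Result<(), String> {
        if spec.deferred && spec.tool.namespace.is_none() {
            return Err(format!(
                "deferred dynamic tool `{}` must provide a namespace",
                spec.tool.name
            ));
        }
        Ok(())
    }

    /// Look the call up by its full name, then hand the app only the leaf name.
    pub fn leaf_name_for_call<'a>(
        registry: &'a HashMap<ToolName, DynamicToolSpec>,
        called: &ToolName,
    ) -> Option<&'a str> {
        registry.get(called).map(|spec| spec.tool.name.as_str())
    }
    ```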
    
    Validation:
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core tool_search`
    
    ---------
    
    Co-authored-by: Sayan Sisodiya <sayan@openai.com>
  • feat: config aliases (#18140)
    Rename `no_memories_if_mcp_or_web_search` →
    `disable_on_external_context` with backward compatibility
    
    While doing so, we add a key alias system to our layer merging system.
    This avoids the case where a company-managed config uses an old name
    while the user has the new name in their local config (which would
    make deserialization fail).
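
    One way to picture the alias step is to canonicalize keys in every layer before merging. The sketch below is an assumption about the general approach, not the actual merge code; the flat string map and the helper names are hypothetical, and only the two key names come from this change.

    ```rust
    use std::collections::BTreeMap;

    /// Old key name -> canonical key name.
    const KEY_ALIASES: &[(&str, &str)] = &[(
        "no_memories_if_mcp_or_web_search",
        "disable_on_external_context",
    )];

    fn canonical_key(key: &str) -> &str {
        for (old, new) in KEY_ALIASES {
            if *old == key {
                return *new;
            }
        }
        key
    }

    /// Merge layers in order (later layers win), rewriting aliased keys first so
    /// an old name in one layer and a new name in another land on the same entry.
    pub fn merge_layers(layers: &[BTreeMap<String, String>]) -> BTreeMap<String, String> {
        let mut merged = BTreeMap::new();
        for layer in layers {
            for (key, value) in layer {
                merged.insert(canonical_key(key).to_string(), value.clone());
            }
        }
        merged
    }
    ```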
  • Add server-level approval defaults for custom MCP servers (#17843)
    ## Summary
    - Add `default_tools_approval_mode` support for custom MCP server
    configs, matching the existing `codex_apps` behavior
    - Apply approval precedence as per-tool override, then server default,
    then `auto`
    - Update config serialization, CLI display, schema generation, docs, and
    tests
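
    The precedence in the second bullet reduces to a chain of fallbacks. A minimal sketch: `auto` comes from this summary, while the `Prompt` and `Approve` variants and the function name are hypothetical placeholders.

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum ApprovalMode {
        Auto,
        // Hypothetical non-default modes, used only to exercise the precedence.
        Prompt,
        Approve,
    }

    /// Per-tool override first, then the server default, then `auto`.
    pub fn effective_approval_mode(
        tool_override: Option<ApprovalMode>,
        server_default: Option<ApprovalMode>,
    ) -> ApprovalMode {
        tool_override
            .or(server_default)
            .unwrap_or(ApprovalMode::Auto)
    }
    ```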
    
    ## Testing
    - `cargo check -p codex-config`
    - `cargo check -p codex-core`
    - `just write-config-schema`
    - `just fmt`
    - `cargo test -p codex-config`
    - Targeted `codex-core` tests for config parsing, config writes, and MCP
    approval precedence
    - `just fix -p codex-config -p codex-core`
  • [1/8] Add MCP server environment config (#18085)
    ## Summary
    - Add an MCP server environment setting with local as the default.
    - Thread the default through config serialization, schema generation,
    and existing config fixtures.
    
    ## Stack
    ```text
    o  #18027 [8/8] Fail exec client operations after disconnect
    │
    o  #18025 [7/8] Cover MCP stdio tests with executor placement
    │
    o  #18089 [6/8] Wire remote MCP stdio through executor
    │
    o  #18088 [5/8] Add executor process transport for MCP stdio
    │
    o  #18087 [4/8] Abstract MCP stdio server launching
    │
    o  #18020 [3/8] Add pushed exec process events
    │
    o  #18086 [2/8] Support piped stdin in exec process API
    │
    @  #18085 [1/8] Add MCP server environment config
    │
    o  main
    ```
    
    Co-authored-by: Codex <noreply@openai.com>
  • register all mcp tools with namespace (#17404)
    stacked on #17402.
    
    MCP tools returned by `tool_search` (deferred tools) get registered in
    our `ToolRegistry` with a different format than directly available
    tools. this leads to two different ways of accessing MCP tools from our
    tool catalog, only one of which works for each. fix this by registering
    all MCP tools with the namespace format, since this info is already
    available.
    
    also, direct MCP tools are registered to responsesapi without a
    namespace, while deferred MCP tools have a namespace. this means we can
    receive MCP `FunctionCall`s in both formats from namespaces. fix this by
    always registering MCP tools with namespace, regardless of deferral
    status.
    
    make code mode track `ToolName` provenance of tools so it can map the
    literal JS function name string to the correct `ToolName` for
    invocation, rather than supporting both in core.
    
    this lets us unify to a single canonical `ToolName` representation for
    each MCP tool and force everywhere to use that one, without supporting
    fallbacks.
  • Add supports_parallel_tool_calls flag to included mcps (#17667)
    ## Why
    
    For more advanced MCP usage, we want the model to be able to emit
    parallel MCP tool calls and have Codex execute eligible ones
    concurrently, instead of forcing all MCP calls through the serial block.
    
    The main design choice was where to thread the config. I made this
    server-level because parallel safety depends on the MCP server
    implementation. Codex reads the flag from `mcp_servers`, threads the
    opted-in server names into `ToolRouter`, and checks the parsed
    `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
    on model-visible tool names, which can be incomplete in
    deferred/search-tool paths or ambiguous for similarly named
    servers/tools.
    
    ## What was added
    
    Added `supports_parallel_tool_calls` for MCP servers.
    
    Before:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    ```
    
    After:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    supports_parallel_tool_calls = true
    ```
    
    MCP calls remain serial by default. Only tools from opted-in servers are
    eligible to run in parallel. Docs also now warn to enable this only when
    the server’s tools are safe to run concurrently, especially around
    shared state or read/write races.
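
    The execution-time check can be sketched as a lookup of the payload's server name in the set of opted-in servers. `ToolPayload::Mcp { server, .. }` is named in this description; the other fields, the `Function` variant, and the helper name are assumptions for illustration.

    ```rust
    use std::collections::HashSet;

    /// Simplified stand-in for the parsed tool payload.
    pub enum ToolPayload {
        Mcp { server: String, tool: String },
        Function { name: String },
    }

    /// Only MCP calls whose exact server name opted in are eligible to run in
    /// parallel; other MCP calls stay serial. Non-MCP calls are out of scope for
    /// this sketch and simply report `false`.
    pub fn mcp_call_supports_parallel(
        payload: &ToolPayload,
        parallel_servers: &HashSet<String>,
    ) -> bool {
        match payload {
            ToolPayload::Mcp { server, .. } => parallel_servers.contains(server),
            ToolPayload::Function { .. } => false,
        }
    }
    ```

    Keying on the parsed server name rather than a model-visible tool name keeps the decision independent of how the tool was surfaced to the model.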
    
    ## Testing
    
    Tested with a local stdio MCP server exposing real delay tools. The
    model/Responses side was mocked only to deterministically emit two MCP
    calls in the same turn.
    
    Each test called `query_with_delay` and `query_with_delay_2` with `{
    "seconds": 25 }`.
    
    | Build/config | Observed | Wall time |
    | --- | --- | --- |
    | main with flag enabled | serial | `58.79s` |
    | PR with flag enabled | parallel | `31.73s` |
    | PR without flag | serial | `56.70s` |
    
    PR with flag enabled showed both tools start before either completed;
    main and PR-without-flag completed the first delay before starting the
    second.
    
    Also added an integration test.
    
    Additional checks:
    
    - `cargo test -p codex-tools` passed
    - `cargo test -p codex-core
    mcp_parallel_support_uses_exact_payload_server` passed
    - `git diff --check` passed
  • Forward app-server turn clientMetadata to Responses (#16009)
    ## Summary
    App-server v2 already receives turn-scoped `clientMetadata`, but the
    Rust app-server was dropping it before the outbound Responses request.
    This change keeps the fix lightweight by threading that metadata through
    the existing turn-metadata path rather than inventing a new transport.
    
    ## What we're trying to do and why
    We want turn-scoped metadata from the app-server protocol layer,
    especially fields like Hermes/GAAS run IDs, to survive all the way to
    the actual Responses API request so it is visible in downstream
    websocket request logging and analytics.
    
    The specific bug was:
    - app-server protocol uses camelCase `clientMetadata`
    - Responses transport already has an existing turn metadata carrier:
    `x-codex-turn-metadata`
    - websocket transport already rewrites that header into
    `request.request_body.client_metadata["x-codex-turn-metadata"]`
    - but the Rust app-server never parsed or stored `clientMetadata`, so
    nothing from the app-server request was making it into that existing
    path
    
    This PR fixes that without adding a new header or a second metadata
    channel.
    
    ## How we did it
    ### Protocol surface
    - Add optional `clientMetadata` to v2 `TurnStartParams` and
    `TurnSteerParams`
    - Regenerate the JSON schema / TypeScript fixtures
    - Update app-server docs to describe the field and its behavior
    
    ### Runtime plumbing
    - Add a dedicated core op for app-server user input carrying turn-scoped
    metadata: `Op::UserInputWithClientMetadata`
    - Wire `turn/start` and `turn/steer` through that op / signature path
    instead of dropping the metadata at the message-processor boundary
    - Store the metadata in `TurnMetadataState`
    
    ### Transport behavior
    - Reuse the existing serialized `x-codex-turn-metadata` payload
    - Merge the new app-server `clientMetadata` into that JSON additively
    - Do **not** replace built-in reserved fields already present in the
    turn metadata payload
    - Keep websocket behavior unchanged at the outer shape level: it still
    sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON
    string now contains the merged fields
    - Keep HTTP fallback behavior unchanged except that the existing
    `x-codex-turn-metadata` header now includes the merged fields too
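
    The additive merge can be sketched as follows. This is an illustration under simplifying assumptions, not the `TurnMetadataState` code: both sides are modeled as flat string maps, and `merge_client_metadata` is a hypothetical name.

    ```rust
    use std::collections::BTreeMap;

    /// Add client-supplied fields to the turn metadata without replacing any
    /// built-in field that is already present.
    pub fn merge_client_metadata(
        built_in: &BTreeMap<String, String>,
        client: &BTreeMap<String, String>,
    ) -> BTreeMap<String, String> {
        let mut merged = built_in.clone();
        for (key, value) in client {
            // `or_insert_with` keeps the existing built-in value on a key clash.
            merged.entry(key.clone()).or_insert_with(|| value.clone());
        }
        merged
    }
    ```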
    
    ### Request shape before / after
    Before, a websocket `response.create` looked like:
    ```json
    {
      "type": "response.create",
      "client_metadata": {
        "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}"
      }
    }
    ```
    Even if the app-server caller supplied `clientMetadata`, it was not
    represented there.
    
    After, the same request shape is preserved, but the serialized payload
    now includes the new turn-scoped fields:
    ```json
    {
      "type": "response.create",
      "client_metadata": {
        "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}"
      }
    }
    ```
    
    ## Validation
    ### Targeted tests added / updated
    - protocol round-trip coverage for `clientMetadata` on `turn/start` and
    `turn/steer`
    - protocol round-trip coverage for `Op::UserInputWithClientMetadata`
    - `TurnMetadataState` merge test proving client metadata is added
    without overwriting reserved built-in fields
    - websocket request-shape test proving outbound `response.create`
    contains merged metadata inside
    `client_metadata["x-codex-turn-metadata"]`
    - app-server integration tests proving:
      - `turn/start` forwards `clientMetadata` into the outbound Responses
      request path
      - websocket warmup + real turn request both behave correctly
      - `turn/steer` updates the follow-up request metadata
    
    ### Commands run
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-core
    turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields
    --lib`
    - `cargo test -p codex-core --test all
    responses_websocket_preserves_custom_turn_metadata_fields`
    - `cargo test -p codex-app-server --test all client_metadata`
    - `cargo test -p codex-app-server --test all
    turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2
    -- --nocapture`
    - `just fmt`
    - `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol
    -p codex-app-server`
    - `just fix -p codex-exec -p codex-tui-app-server`
    - `just argument-comment-lint`
    
    ### Full suite note
    `cargo test` in `codex-rs` still fails in:
    -
    `suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request`
    
    I verified that same failure on a clean detached `HEAD` worktree with an
    isolated `CARGO_TARGET_DIR`, so it is not caused by this patch.
  • Preserve null developer instructions (#16976)
    Preserve explicit null developer-instruction overrides across app-server
    resume and fork flows.
  • [codex] Remove codex-core config type shim (#16529)
    ## Why
    
    This finishes the config-type move out of `codex-core` by removing the
    temporary compatibility shim in `codex_core::config::types`. Callers now
    depend on `codex-config` directly, which keeps these config model types
    owned by the config crate instead of re-expanding `codex-core` as a
    transitive API surface.
    
    ## What Changed
    
    - Removed the `codex-rs/core/src/config/types.rs` re-export shim and the
    `core::config::ApprovalsReviewer` re-export.
    - Updated `codex-core`, `codex-cli`, `codex-tui`, `codex-app-server`,
    `codex-mcp-server`, and `codex-linux-sandbox` call sites to import
    `codex_config::types` directly.
    - Added explicit `codex-config` dependencies to downstream crates that
    previously relied on the `codex-core` re-export.
    - Regenerated `codex-rs/core/config.schema.json` after updating the
    config docs path reference.
  • [mcp] Improve custom MCP elicitation (#15800)
    - [x] Support don't ask again for custom MCP tool calls.
    - [x] Don't run arc in yolo mode.
    - [x] Run arc for custom MCP tools in always allow mode.
  • chore(core) Add approvals reviewer to UserTurn (#15426)
    ## Summary
    Adds support for `approvals_reviewer` to `Op::UserTurn` so we can migrate
    `[CodexMessageProcessor::turn_start]` to use `Op::UserTurn`.
    
    ## Testing
    - [x] Adds quick test for the new field
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: change multi-agent to use path-like system instead of uuids (#15313)
    This PR adds a URI-based system to reference agents within a tree. This
    comes from a sync between research and engineering.
    
    The main agent (the one manually spawned by a user) is always called
    `/root`. Any sub-agent spawned by it will be, for example,
    `/root/agent_1`, where `agent_1` is chosen by the model.
    
    Any agent can contact any other agent using its path.
    
    Paths can be either absolute or relative to the calling agent.
    
    Resume is not supported on this new path for now.
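
    A minimal sketch of absolute versus relative resolution. It assumes a relative path is joined onto the calling agent's own path, which this description does not spell out, and `resolve_agent_path` is a hypothetical helper.

    ```rust
    /// Resolve `target` as seen from the agent at `caller`.
    /// Absolute targets start with `/`; anything else is treated as relative and,
    /// by assumption, appended to the caller's path.
    pub fn resolve_agent_path(caller: &str, target: &str) -> String {
        if target.starts_with('/') {
            target.to_string()
        } else {
            format!("{}/{}", caller.trim_end_matches('/'), target)
        }
    }
    ```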
  • Split features into codex-features crate (#15253)
    - Split the feature system into a new `codex-features` crate.
    - Cut `codex-core` and workspace consumers over to the new config and
    warning APIs.
    
    Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
    Co-authored-by: Codex <noreply@openai.com>
  • Align SQLite feedback logs with feedback formatter (#13494)
    ## Summary
    - store a pre-rendered `feedback_log_body` in SQLite so `/feedback`
    exports keep span prefixes and structured event fields
    - render SQLite feedback exports with timestamps and level prefixes to
    match the old in-memory feedback formatter, while preserving existing
    trailing newlines
    - count `feedback_log_body` in the SQLite retention budget so structured
    or span-prefixed rows still prune correctly
    - bound `/feedback` row loading in SQL with the retention estimate, then
    apply exact whole-line truncation in Rust so uploads stay capped without
    splitting lines
    
    ## Details
    - add a `feedback_log_body` column to `logs` and backfill it from
    `message` for existing rows
    - capture span names plus formatted span and event fields at write time,
    since SQLite does not retain enough structure to reconstruct the old
    formatter later
    - keep SQLite feedback queries scoped to the requested thread plus
    same-process threadless rows
    - restore a SQL-side cumulative `estimated_bytes` cap for feedback
    export queries so over-retained partitions do not load every matching
    row before truncation
    - add focused formatting coverage for exported feedback lines and parity
    coverage against `tracing_subscriber`
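
    The whole-line truncation step can be sketched like this. It is an illustration only: the function name is hypothetical, and keeping the newest lines when the cap is hit is an assumption, since the description does not say which end is dropped.

    ```rust
    /// Keep as many complete lines as fit within `max_bytes`, never splitting a
    /// line. Each line is charged one extra byte for its trailing newline.
    pub fn truncate_whole_lines(lines: &[&str], max_bytes: usize) -> Vec<String> {
        let mut kept = Vec::new();
        let mut used = 0usize;
        // Assumption: walk from the newest line backwards and stop at the cap.
        for line in lines.iter().rev() {
            let cost = line.len() + 1;
            if used + cost > max_bytes {
                break;
            }
            used += cost;
            kept.push((*line).to_string());
        }
        // Restore chronological order.
        kept.reverse();
        kept
    }
    ```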
    
    ## Testing
    - cargo test -p codex-state
    - just fix -p codex-state
    - just fmt
    
    codex author: `codex resume 019ca1b0-0ecc-78b1-85eb-6befdd7e4f1f`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • dynamic tool calls: add param exposeToContext to optionally hide tool (#14501)
    This extends dynamic_tool_calls to let us hide a tool from the
    model context while still using it as part of the general tool-calling
    runtime (for example, from js_repl/code_mode).
  • config: enforce enterprise feature requirements (#13388)
    ## Why
    
    Enterprises can already constrain approvals, sandboxing, and web search
    through `requirements.toml` and MDM, but feature flags were still only
    configurable as managed defaults. That meant an enterprise could suggest
    feature values, but it could not actually pin them.
    
    This change closes that gap and makes enterprise feature requirements
    behave like the other constrained settings. The effective feature set
    now stays consistent with enterprise requirements during config load,
    when config writes are validated, and when runtime code mutates feature
    flags later in the session.
    
    It also tightens the runtime API for managed features. `ManagedFeatures`
    now follows the same constraint-oriented shape as `Constrained<T>`
    instead of exposing panic-prone mutation helpers, and production code
    can no longer construct it through an unconstrained `From<Features>`
    path.
    
    The PR also hardens the `compact_resume_fork` integration coverage on
    Windows. After the feature-management changes,
    `compact_resume_after_second_compaction_preserves_history` was
    overflowing the libtest/Tokio thread stacks on Windows, so the test now
    uses an explicit larger-stack harness as a pragmatic mitigation. That
    may not be the ideal root-cause fix, and it merits a parallel
    investigation into whether part of the async future chain should be
    boxed to reduce stack pressure instead.
    
    ## What Changed
    
    Enterprises can now pin feature values in `requirements.toml` with the
    requirements-side `features` table:
    
    ```toml
    [features]
    personality = true
    unified_exec = false
    ```
    
    Only canonical feature keys are allowed in the requirements `features`
    table; omitted keys remain unconstrained.
    
    - Added a requirements-side pinned feature map to
    `ConfigRequirementsToml`, threaded it through source-preserving
    requirements merge and normalization in `codex-config`, and made the
    TOML surface use `[features]` (while still accepting legacy
    `[feature_requirements]` for compatibility).
    - Exposed `featureRequirements` from `configRequirements/read`,
    regenerated the JSON/TypeScript schema artifacts, and updated the
    app-server README.
    - Wrapped the effective feature set in `ManagedFeatures`, backed by
    `ConstrainedWithSource<Features>`, and changed its API to mirror
    `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`,
    and result-returning `enable` / `disable` / `set_enabled` helpers.
    - Removed the legacy-usage and bulk-map passthroughs from
    `ManagedFeatures`; callers that need those behaviors now mutate a plain
    `Features` value and reapply it through `set(...)`, so the constrained
    wrapper remains the enforcement boundary.
    - Removed the production loophole for constructing unconstrained
    `ManagedFeatures`. Non-test code now creates it through the configured
    feature-loading path, and `impl From<Features> for ManagedFeatures` is
    restricted to `#[cfg(test)]`.
    - Rejected legacy feature aliases in enterprise feature requirements,
    and return a load error when a pinned combination cannot survive
    dependency normalization.
    - Validated config writes against enterprise feature requirements before
    persisting changes, including explicit conflicting writes and
    profile-specific feature states that normalize into invalid
    combinations.
    - Updated runtime and TUI feature-toggle paths to use the constrained
    setter API and to persist or apply the effective post-constraint value
    rather than the requested value.
    - Updated the `core_test_support` Bazel target to include the bundled
    core model-catalog fixtures in its runtime data, so helper code that
    resolves `core/models.json` through runfiles works in remote Bazel test
    environments.
    - Renamed the core config test coverage to emphasize that effective
    feature values are normalized at runtime, while conflicting persisted
    config writes are rejected.
    - Ran `compact_resume_after_second_compaction_preserves_history` inside
    an explicit 8 MiB test thread and Tokio runtime worker stack, following
    the existing larger-stack integration-test pattern, to keep the Windows
    `compact_resume_fork` test slice from aborting while a parallel
    investigation continues into whether some of the underlying async
    futures should be boxed.
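
    The constraint-oriented setter shape can be sketched as below. This is a simplified illustration, not the real `ManagedFeatures`: it uses plain string keys and `Result<(), String>` in place of `ConstraintResult<()>`, and it ignores dependency normalization.

    ```rust
    use std::collections::BTreeMap;

    /// Hypothetical, simplified feature set with enterprise-pinned values.
    pub struct ManagedFeatures {
        values: BTreeMap<String, bool>,
        pinned: BTreeMap<String, bool>,
    }

    impl ManagedFeatures {
        /// Pinned values override whatever the user configured.
        pub fn new(mut values: BTreeMap<String, bool>, pinned: BTreeMap<String, bool>) -> Self {
            for (key, enabled) in &pinned {
                values.insert(key.clone(), *enabled);
            }
            Self { values, pinned }
        }

        /// Check a change against the pins without applying it.
        pub fn can_set(&self, key: &str, enabled: bool) -> Result<(), String> {
            match self.pinned.get(key) {
                Some(required) if *required != enabled => Err(format!(
                    "feature `{key}` is pinned to {required} by requirements"
                )),
                _ => Ok(()),
            }
        }

        /// Apply a change only if the constraint allows it; never panics.
        pub fn set_enabled(&mut self, key: &str, enabled: bool) -> Result<(), String> {
            self.can_set(key, enabled)?;
            self.values.insert(key.to_string(), enabled);
            Ok(())
        }

        pub fn is_enabled(&self, key: &str) -> bool {
            self.values.get(key).copied().unwrap_or(false)
        }
    }
    ```

    Returning a `Result` from every mutation lets callers persist or display the effective post-constraint value instead of the requested one.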
    
    ## Verification
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-core feature_requirements_ -- --nocapture`
    - `cargo test -p codex-core
    load_requirements_toml_produces_expected_constraints -- --nocapture`
    - `cargo test -p codex-core
    compact_resume_after_second_compaction_preserves_history -- --nocapture`
    - `cargo test -p codex-core compact_resume_fork -- --nocapture`
    - Re-ran the built `codex-core` `tests/all` binary with
    `RUST_MIN_STACK=262144` for
    `compact_resume_after_second_compaction_preserves_history` to confirm
    the explicit-stack harness fixes the deterministic low-stack repro.
    - `cargo test -p codex-core`
      - This still fails locally in unrelated integration areas that expect
      the `codex` / `test_stdio_server` binaries or hit existing
      `search_tool` wiremock mismatches.
    
    ## Docs
    
    `developers.openai.com/codex` should document the requirements-side
    `[features]` table for enterprise and MDM-managed configuration,
    including that it only accepts canonical feature keys and that
    conflicting config writes are rejected.
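
    As a rough illustration of the two rules the docs should cover, the
    sketch below checks a requirements-side feature table for non-canonical
    keys and rejects a config write that contradicts a pinned value. All
    names here (`CANONICAL_KEYS`, `FeatureError`, `validate_requirements`,
    `validate_write`) are hypothetical, and the key list is an assumption
    for the example; it does not reproduce the real `ManagedFeatures` API
    or dependency normalization.

    ```rust
    use std::collections::BTreeMap;

    /// Assumed canonical feature keys, for illustration only.
    const CANONICAL_KEYS: &[&str] = &["collab", "sqlite"];

    #[derive(Debug, PartialEq)]
    enum FeatureError {
        /// The requirements table used a key that is not canonical,
        /// for example a legacy alias.
        NonCanonicalKey(String),
        /// A write asked for the opposite of a pinned requirement.
        ConflictsWithRequirement { key: String, required: bool },
    }

    /// Checks that every key in the requirements table is canonical.
    fn validate_requirements(
        reqs: &BTreeMap<String, bool>,
    ) -> Result<(), FeatureError> {
        for key in reqs.keys() {
            if !CANONICAL_KEYS.contains(&key.as_str()) {
                return Err(FeatureError::NonCanonicalKey(key.clone()));
            }
        }
        Ok(())
    }

    /// Rejects a config write whose value contradicts a pinned requirement.
    /// Unpinned keys are accepted unchanged.
    fn validate_write(
        reqs: &BTreeMap<String, bool>,
        key: &str,
        requested: bool,
    ) -> Result<(), FeatureError> {
        match reqs.get(key) {
            Some(&required) if required != requested => {
                Err(FeatureError::ConflictsWithRequirement {
                    key: key.to_string(),
                    required,
                })
            }
            _ => Ok(()),
        }
    }
    ```

    The point of the sketch is the ordering: the write is validated before
    anything is persisted, so a conflicting value never reaches disk.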
  • add fast mode toggle (#13212)
    - add a local Fast mode setting in codex-core (stored locally on disk,
    similar to how the model ID is currently stored)
    - send `service_tier=priority` on requests when Fast is enabled
    - add `/fast` in the TUI and persist it locally
    - gate it behind a feature flag
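
    A minimal sketch of the request-side behavior, assuming a simple
    key/value parameter list: `LocalSettings` and `request_params` are
    hypothetical names, and only the `service_tier=priority` pair comes
    from the description above.

    ```rust
    /// Hypothetical stand-in for the locally persisted Fast mode setting.
    #[derive(Clone, Copy, Debug, PartialEq)]
    struct LocalSettings {
        fast_mode: bool,
    }

    /// Builds request parameters, adding `service_tier=priority` only
    /// when Fast mode is enabled.
    fn request_params(
        model: &str,
        settings: LocalSettings,
    ) -> Vec<(String, String)> {
        let mut params = vec![("model".to_string(), model.to_string())];
        if settings.fast_mode {
            params.push((
                "service_tier".to_string(),
                "priority".to_string(),
            ));
        }
        params
    }
    ```

    With Fast mode off, the parameter is omitted entirely rather than sent
    with a default value; whether the real client does the same is an
    assumption.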
  • feat: polluted memories (#13008)
    Add a feature flag to disable memory creation for "polluted"
  • Agent jobs (spawn_agents_on_csv) + progress UI (#10935)
    ## Summary
    - Add agent job support: spawn a batch of sub-agents from CSV, auto-run,
    auto-export, and store results in SQLite.
    - Simplify workflow: remove run/resume/get-status/export tools; spawn is
    deterministic and completes in one call.
    - Improve exec UX: stable, single-line progress bar with ETA; suppress
    sub-agent chatter in exec.
    
    ## Why
    Enables map-reduce style workflows over arbitrarily large repos using
    the existing Codex orchestrator. The simplified workflow addresses
    review feedback about overly complex job controls and non-deterministic
    monitoring.
    
    ## Demo (progress bar)
    ```
    ./codex-rs/target/debug/codex exec \
      --enable collab \
      --enable sqlite \
      --full-auto \
      --progress-cursor \
      -c agents.max_threads=16 \
      -C /Users/daveaitel/code/codex \
      - <<'PROMPT'
    Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows:
    path = item-01..item-30, area = test.
    
    Then call spawn_agents_on_csv with:
    - csv_path: /tmp/agent_job_progress_demo.csv
    - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1."
    - output_csv_path: /tmp/agent_job_progress_demo_out.csv
    PROMPT
    ```
    
    ## Review feedback addressed
    - Auto-start jobs on spawn; removed run/resume/status/export tools.
    - Auto-export on success.
    - More descriptive tool spec + clearer prompts.
    - Avoid deadlocks on spawn failure; pending/running handled safely.
    - Progress bar no longer scrolls; stable single-line redraw.
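
    The stable single-line redraw could look something like the sketch
    below. It is not the implementation from this change: `render_progress`
    and `redraw` are hypothetical names, the bar format is invented, and
    the ETA is a plain extrapolation from the average time per finished
    item.

    ```rust
    use std::io::{self, Write};
    use std::time::Duration;

    /// Renders one progress line such as `[#####-----] 15/30 ETA 10s`.
    fn render_progress(
        done: usize,
        total: usize,
        elapsed: Duration,
        width: usize,
    ) -> String {
        let fraction = if total == 0 {
            1.0
        } else {
            done as f64 / total as f64
        };
        let filled = ((fraction * width as f64).round() as usize).min(width);
        let bar = format!("{}{}", "#".repeat(filled), "-".repeat(width - filled));
        // No ETA until one item has finished, and none once all are done.
        if done == 0 || done >= total {
            return format!("[{}] {}/{}", bar, done, total);
        }
        let eta = elapsed.mul_f64((total - done) as f64 / done as f64);
        format!("[{}] {}/{} ETA {}s", bar, done, total, eta.as_secs())
    }

    /// Redraws in place: carriage return, ANSI "erase line", then the text.
    /// No newline is written, so the output does not scroll.
    fn redraw<W: Write>(out: &mut W, line: &str) -> io::Result<()> {
        write!(out, "\r\x1b[2K{}", line)?;
        out.flush()
    }
    ```

    Calling `redraw(&mut io::stderr(), &render_progress(...))` after each
    completed item keeps the bar on one line; a final newline would be
    written once when the job ends.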
    
    ## Tests
    - `cd codex-rs && cargo test -p codex-exec`
    - `cd codex-rs && cargo build -p codex-cli`