156 Commits

  • [codex] Use model metadata for skills usage instructions (#29740)
    ## Summary
    
    - add a false-by-default `include_skills_usage_instructions` model
    metadata field
    - enable the field for the bundled `gpt-5.5` model metadata
    - consume the metadata in both core and extension skill rendering
    - remove hardcoded legacy-model matching and its marker plumbing
  • Preserve namespaces on custom tool calls (#30302)
    ## Summary
    
    - Preserve the optional namespace on custom tool calls during response
    deserialization and app-server replay.
    - Use the namespaced tool identifier for streaming argument handling and
    tool dispatch.
    - Regenerate app-server protocol schemas.
    - Add regression tests covering namespace serialization and routing.
    
    ## Testing
    
    - Ran affected protocol and app-server test suites.
    - Ran the full core test suite; two load-sensitive timing tests passed
    when rerun individually.
    - Ran Clippy and formatting checks.
    - Verified with a local end-to-end app-server replay that the namespace
    is preserved through the complete request/response flow.
  • [codex] allow CCA image generation and web search extensions (#29909)
    ## Summary
    
    - allow the standalone image-generation and web-search extensions for
    the actor-authorized provider shape used by CCA
    - preserve builtin `image_generation` and `web_search` for older models
    and existing flows
    - keep ordinary non-OpenAI providers excluded from both extensions
    - remove only the image extension local managed-AuthManager requirement
    that CCA cannot satisfy
    - share actor-authorization detection through `ModelProviderInfo`
    - keep Core tests focused on routing behavior and cover header-shape
    edge cases in `model-provider-info`
    - add a Responses Lite regression that verifies both
    `image_gen.imagegen` and `web.run`
    
    ## Why
    
    CCA uses a provider named `local` with `requires_openai_auth: false` and
    a non-empty `x-openai-actor-authorization` header. Core accepts that
    provider shape, but both extension provider-name gates rejected it;
    image generation additionally required a Codex-managed login.
    
    The standalone paths must coexist with existing builtin tools. New
    Responses Lite models can receive `image_gen.imagegen` and `web.run`,
    while older models continue using builtin tools.
    
    ## Impact
    
    This enables both standalone extensions for CCA once installed
    downstream, without removing or changing builtin-tool compatibility for
    older models.
    
    ## Validation
    
    - `just test -p codex-core
    responses_lite_exposes_standalone_tools_for_actor_authorized_provider`
    - `just test -p codex-core
    responses_lite_uses_standalone_web_search_and_image_generation`
    - `just test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-model-provider-info`
    - `just fmt`
    - `git diff --check`
  • Reinject missing World State fragments on resume (#30152)
    ## Why
    
    World State restores its structured snapshot on resume so unchanged
    sections do not have to be rendered again. That is safe only when the
    model-visible fragment represented by the snapshot is still present in
    retained history.
    
    For selected executor skills, the failing selected-capability scenario
    exposed this state:
    
    ```text
    persisted World State: selected skill catalog is known
    retained model history: selected skill catalog message is missing
    next diff: unchanged, so emit nothing
    ```
    
    The model resumes without being told about the selected skill catalog.
    
    ## What changed
    
    World State contributions may now optionally describe the concrete
    model-visible fragment that must remain in retained history.
    
    When a persisted snapshot is present:
    
    ```text
    matching retained fragment exists -> trust snapshot, emit nothing
    matching retained fragment missing -> treat section as absent, render current state once
    ```
    
    The skills extension uses this for non-empty selected-environment
    catalogs by matching its exact rendered catalog body. Empty or hidden
    catalogs do not require a fragment.
    
    ## Scope
    
    This does not clear or rebuild the whole World State baseline. It does
    not change skill discovery, cache invalidation, environment
    availability, or MCP runtime behavior. It only keeps a persisted section
    snapshot and its retained model context consistent across resume/history
    reconstruction.
    
    ## Coverage
    
    A focused World State regression test verifies both sides:
    
    - a missing retained fragment is rendered again
    - a matching retained fragment avoids duplicate injection
  • Project selected plugin runtime by environment availability (#30093)
    ## Why
    
    Selected plugin metadata is stable, but MCP processes are live runtime
    state. They need different lifetimes:
    
    - the MCP extension caches manifest, MCP, and connector declarations for
    each stable selected root;
    - each model step projects that cached metadata through the roots that
    resolved as ready for that exact step;
    - the MCP manager is rebuilt only when that availability projection
    changes.
    
    This matches executor skills: both features consume the same resolved
    step roots instead of inferring readiness from the turn's selected
    environments.
    
    ## Behavior
    
    ```text
    E1 not ready for this step
      -> no E1 MCP servers or connectors
      -> cached plugin metadata stays in ext/mcp
    
    E1 becomes ready
      -> reuse cached metadata
      -> publish one MCP runtime containing E1 capabilities
    
    same ready roots on the next step
      -> reuse the exact runtime; no rediscovery and no MCP restart
    
    resume
      -> create new extension thread state and a new MCP runtime
    ```
    
    All model-facing consumers use the same step snapshot:
    
    ```text
    resolved selected roots
            |
            v
    extension MCP/connector projection
            |
            v
    { MCP config, connector snapshot, MCP manager }
            |
            +-> advertise model tools
            +-> build app/connector tools
            +-> execute MCP calls
    ```
    
    ## Cache contract
    
    The existing MCP extension owns a cache keyed by the full
    `SelectedCapabilityRoot`:
    
    ```rust
    let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
    ```
    
    The cache lives with extension thread state. Environment availability
    filters projection but does not invalidate metadata. Resume creates new
    thread state. There is no file watcher or executor generation because
    contents behind a stable environment/root are assumed stable.
    
    ## What changes
    
    - Keeps executor plugin discovery and cached metadata in `ext/mcp`.
    - Caches MCP and connector declarations together per selected root.
    - Uses the step's already-resolved capability roots, including lazy
    environments that are not turn environments.
    - Reuses the current MCP runtime when the ready-root projection is
    unchanged.
    - Uses the same step MCP manager and connector snapshot for
    model-visible tools and execution.
    - Resolves direct thread-scoped MCP requests from the current
    selected-root projection.
    
    ## Deliberately out of scope
    
    - `app/list` remains based on the latest global host-plugin state; this
    PR does not make its response or notifications thread-specific.
    - `required = true` startup semantics do not apply to delayed executor
    MCP activation.
    - No filesystem/content invalidation.
    - No transport-disconnect watcher.
    - No executor generations or environment replacement semantics.
    - No client sharing across complete manager replacements.
    
    ## Stack
    
    1. Extension-owned World State sections.
    2. Project executor skills through World State.
    3. Pin one MCP runtime to each model step.
    4. **This PR:** project selected MCP and connector state from
    extension-owned metadata.
    5. Integration coverage for selected capability availability and resume.
    
    ## Verification
    
    -
    `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
    - The stacked integration PR covers unavailable to ready activation,
    unchanged-runtime reuse, skills, MCP tools, connector attribution, and
    cold resume.
  • Project executor skills through World State (#30088)
    ## Why
    
    A selected executor environment can be unavailable in one model step and
    ready in the next. The model should see its skills only while that
    environment is ready, without rescanning stable files on every sample.
    
    The product assumption is simple:
    
    - an environment ID names one stable logical environment;
    - the selected root contents do not change during the thread.
    
    ## Behavior
    
    ```text
    E1 unavailable -> do not show E1 skills
    E1 ready       -> discover once, cache, show through World State
    E1 unavailable -> hide skills, keep cache
    E1 ready again -> reuse cache, show skills again
    resume         -> create a new thread cache and discover again
    ```
    
    The cache key is the full `SelectedCapabilityRoot`. Availability does
    not invalidate it; dropping the extension's thread state does.
    
    The step supplies the ready selected roots directly. They do not have to
    be turn environments:
    
    ```text
    turn environment: laptop
    selected root:    worker:/plugins/lint-fix
    
    worker ready -> lint-fix skills are visible
    ```
    
    ## What changes
    
    - Keeps executor skill catalogs in the existing skills extension.
    - Passes the roots resolved as ready for the step into World State
    contributors.
    - Loads each ready selected root at most once per thread.
    - Contributes the executor catalog as the `skills` World State section.
    - Uses the exact step catalog for explicit skill selection and body
    reads.
    - Leaves host and orchestrator skill behavior where it already lives.
    
    Taking a step snapshot itself does not add an RPC. Executor filesystem
    calls happen only on the first discovery of a stable root for that
    thread.
    
    ## What does not change
    
    - No filesystem watcher or content-based invalidation.
    - No retry/generation framework.
    - No skill runtime migration into core.
    - No general rewrite of the skills extension.
    
    ## Stack
    
    1. Extension-owned World State sections.
    2. **This PR:** project cached executor skills through World State.
    3. Pin one MCP runtime to each model step.
    4. Project selected MCP/app/connector metadata by environment
    availability.
    5. One end-to-end integration scenario.
  • Let extensions contribute World State sections (#30100)
    ## Why
    
    #29856 already owns the durable thread intent and exact environment
    binding. This PR adds only the small missing extension boundary: an
    extension can contribute one named World State section, while core still
    owns persistence, diffing, and model-visible fragment types.
    
    This lets skills stay in the skills extension instead of moving their
    runtime into core.
    
    ## Shape
    
    ```text
    extension-owned state
            |
            | contribute section id + JSON snapshot + renderer
            v
    core World State
            |
            | compare with the previous snapshot
            v
    no message, or one incremental model-visible update
    ```
    
    The extension API is deliberately small:
    
    ```rust
    fn contribute_world_state(...) -> Vec<WorldStateSectionContribution>
    ```
    
    Core adapts the rendered result to `ContextualUserFragment`, records the
    snapshot, and keeps the existing compaction/resume behavior.
    
    ## What changes
    
    - Adds extension-owned World State section contributions.
    - Calls those contributors from the existing per-step World State
    builder.
    - Restores durable selected capability roots into extension thread state
    on resume.
    - Keeps the actual model-context fragment and rollout machinery in core.
    
    ## What does not change
    
    - No skill or MCP implementation moves out of its extension.
    - No new file watcher, generation, or RPC.
    - No generic migration of existing World State sections.
    - No change to the stable environment-ID assumption from #29856.
    
    ## Example
    
    ```text
    step 1 snapshot: skills = []
    step 2 snapshot: skills = [executor-demo:deploy]
    
    core asks the skills extension to render only that change.
    ```
    
    ## Stack
    
    1. **This PR:** let extensions contribute World State sections.
    2. Project executor skills through the skills extension.
    3. Pin one MCP runtime to each model step.
    4. Project selected MCP/app/connector metadata by environment
    availability.
    5. One end-to-end integration scenario.
  • Support HTTP MCP servers from selected executor plugins (#28522)
    ## Why
    
    Selected executor plugins can declare both stdio and Streamable HTTP MCP
    servers, but only stdio registrations were retained. That silently drops
    part of the plugin's tool surface and prevents HTTP traffic from using
    the owning executor's network.
    
    ## What changed
    
    - retain selected-plugin Streamable HTTP MCP declarations alongside
    stdio declarations
    - route their HTTP clients through the owning executor environment
    - preserve local auth-header environment references while rejecting them
    for executor-hosted declarations
    - cover thread isolation, refresh, and an executor-only HTTP route end
    to end
  • Represent MCP authentication with an enum (#29924)
    ## Why
    
    MCP authentication has distinct OAuth and ChatGPT-session flows.
    Representing that choice as `use_chatgpt_auth` makes one flow implicit
    and allows the configuration model to express the distinction only
    through a boolean.
    
    ChatGPT credential forwarding also needs a first-party trust boundary. A
    configurable `chatgpt_base_url` controls routing, but must not grant an
    MCP server permission to receive session credentials.
    
    This change builds on #29733, where the boolean was introduced.
    
    ## What changed
    
    - Replace `use_chatgpt_auth` with an `auth` field backed by the
    exhaustive `McpServerAuth` enum.
    - Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining
    the default.
    - Trust only the origin derived from the existing hardcoded
    `CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
    - Keep configured bearer tokens and authorization headers ahead of the
    selected authentication flow.
    - Update config writers, schema output, fixtures, and integration-test
    setup to use the enum.
    
    ## Verification
    
    Integration coverage exercises the complete streamable HTTP startup path
    in two independent configurations:
    
    - A directly constructed MCP configuration verifies that matching an
    overridden `chatgpt_base_url` does not grant ChatGPT auth.
    - A persisted `config.toml` containing an attacker-controlled
    `chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary
    through normal config parsing.
    
    Both tests complete MCP initialization and tool listing and assert that
    the full captured request sequence contains no authorization headers.
    Separate integration coverage verifies that configured authorization
    takes precedence over ChatGPT auth.
  • Allow ChatGPT-hosted MCP servers to use session auth (#29733)
    ## Why
    
    ChatGPT session authentication was inferred from the reserved Codex Apps
    server name. That couples credential routing to Codex Apps-specific
    behavior and prevents other MCP endpoints hosted by ChatGPT from
    explicitly using the current session.
    
    The opt-in also needs a clear security boundary: an arbitrary MCP
    configuration must not be able to redirect ChatGPT credentials to
    another origin.
    
    ## What changed
    
    - Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
    `false`.
    - Honor the setting only when the parsed server URL has the same HTTP(S)
    origin as the configured `chatgpt_base_url`; otherwise remove the
    capability before startup.
    - Resolve bearer tokens and static or environment-backed authorization
    headers before selecting authentication, with configured authorization
    taking precedence over ChatGPT session auth.
    - Enable the setting for the built-in Codex Apps and hosted plugin
    runtime endpoints while keeping Codex Apps caching and tool
    normalization scoped to the reserved server.
    - Persist the setting through MCP config rewrite paths and expose it in
    the generated config schema.
    - Load the current login state for `codex mcp list` so reported auth
    status matches runtime behavior.
    
    ## Verification
    
    Core integration coverage exercises the complete streamable HTTP MCP
    startup path and verifies that:
    
    - a same-origin opted-in server receives the current ChatGPT access
    token;
    - an explicitly configured authorization header takes precedence;
    - a different-origin server completes MCP initialization and tool
    listing without receiving any ChatGPT authorization header.
  • Read connector declarations from executor plugins (#29852)
    ## Why
    
    Selected capability roots can live on a different executor and operating
    system from app-server. Their connector declarations must therefore be
    read through the executor that owns the package, without converting
    executor URIs into host paths.
    
    This PR adds that authority-bound reader without activating connectors
    or changing thread startup.
    
    ## What changed
    
    - Add a small `codex-connectors-extension` crate for executor-owned
    connector I/O.
    - Read only the app configuration explicitly declared by the resolved
    plugin manifest.
    - Read through the `ExecutorFileSystem` retained by
    `ResolvedExecutorPlugin`; there is no host-filesystem fallback or
    default-file probe.
    - Keep `PathUri` values intact so Windows, Unix, and remote executor
    paths work from any orchestrator OS.
    - Return full `AppDeclaration` values so the caller retains declaration
    names and categories for routing.
    - Preserve the selected plugin ID and exact executor URI in read and
    parse errors.
    
    The contract is intentionally narrow: selected packages are trusted,
    valid packages and packages that provide connectors explicitly declare
    their app configuration.
    
    ## Stack scope
    
    This PR is stacked on #29851. It only provides the executor-backed
    reader. #29856 resolves selected roots at thread start, freezes their
    connector snapshot, and contains the remote-capable end-to-end authority
    test for the complete path.
  • Keep executor plugin MCP paths URI-native (#29628)
    ## Why
    
    Executor-owned plugin roots are `PathUri`, but MCP config normalization
    still converts them into a native `Path` using the app-server host's
    rules. Relative `cwd` values can therefore resolve against the wrong
    filesystem when host and executor path conventions differ.
    
    This PR keeps executor MCP paths URI-native until the selected
    environment launches the server, while retaining the existing host
    parser behavior.
    
    ## What changed
    
    - Keep one shared MCP normalization path with narrow host-`Path` and
    executor-`PathUri` entrypoints.
    - Preserve native host resolution for locally installed plugin MCP
    configs.
    - For executor configs, default `cwd` to the plugin root and resolve
    relative working directories with the root URI's path convention.
    - Accept explicit executor `file:` URIs only when they remain within the
    selected plugin root.
    - Preserve the selected environment id and existing remote
    environment-variable ownership rules.
    - Route the executor plugin provider through the URI-native entrypoint
    without converting the root on the host.
    - Ensure `codex doctor` does not probe executor-owned stdio commands or
    foreign working directories on the host.
    - Cover foreign Windows roots, relative and absolute executor working
    directories, traversal rejection, runtime resolution, and doctor
    behavior.
    
    ```text
    plugin root:    file:///C:/plugins/demo
    configured cwd: scripts
                      |
                      v
    resolved cwd:  file:///C:/plugins/demo/scripts
                      |
                      v
    launch through the selected executor
    ```
    
    No new provider or filesystem abstraction is introduced.
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. #28918 — keep selected plugin roots and resources URI-native.
    4. #29626 — load executor skills without host path conversion.
    5. **This PR** — resolve executor MCP working directories without host
    path conversion.
  • Let image generation extension hosts control output persistence (#29711)
    ## Why
    
    Some extension hosts need generated images returned without writing them
    to the local filesystem or giving the model a local path.
    
    ## What changed
    
    **tl;dr**: we now conduct all extension operations in the image gen
    extension
    
    - Let hosts provide an optional image save root when installing the
    extension.
    - Save images and return path hints only when a save root is configured.
    - Return image data without saving or adding a path hint when no save
    root is configured.
    - Preserve the extension-provided `saved_path` instead of persisting
    extension images again in core.
    - Leave built-in image generation unchanged.
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-app-server
    standalone_image_generation_returns_saved_path_hint_to_model`
    - `just test -p codex-core
    extension_tool_uses_granted_turn_permissions_without_local_persistence`
    - `just test -p codex-core tools::handlers::extension_tools::tests`
    - tested on CODEX CLI on both save_root: CODEX_HOME and None 
    - tested on CODEX APP on both as well
  • Load executor skills without host path conversion (#29626)
    ## Why
    
    After #28918, selected skill roots are `PathUri`, but the executor skill
    provider still converts them to the app-server host's `AbsolutePathBuf`.
    A foreign Windows root therefore cannot be discovered by a Unix host,
    and the inverse has the same problem.
    
    This PR keeps executor skill discovery and reads on the filesystem that
    owns the selected root while reusing the existing skill rules.
    
    ## What changed
    
    - Generalize the existing skill traversal to operate on `PathUri`
    through `ExecutorFileSystem`, preserving its depth, directory, symlink,
    and sibling-metadata concurrency behavior.
    - Add a small environment skill loader that reuses the shared discovery,
    frontmatter validation, dependency parsing, product policy, and
    prompt-visibility rules.
    - Keep the environment id and entrypoint `PathUri` in the skill catalog,
    then route `skills.read` back through the same environment filesystem.
    - Preserve the executor's path convention when deriving catalog handles,
    including literal backslashes in POSIX filenames.
    - Resolve plugin namespaces from nearby manifests through URI-native
    filesystem reads.
    - Cover foreign Windows roots, executor-owned reads, namespaces,
    metadata, policy, and path identity.
    
    ```text
    selected root (PathUri)
            |
            v
    shared discovery over ExecutorFileSystem
            |
            v
    environment-bound catalog entry --skills.read--> same ExecutorFileSystem
    ```
    
    No second filesystem abstraction or duplicate traversal implementation
    is introduced.
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. #28918 — keep selected plugin roots and resources URI-native.
    4. **This PR** — load executor skills without host path conversion.
    5. #29628 — resolve executor MCP working directories without host path
    conversion.
  • Make selected plugin roots URI-native (#28918)
    ## Why
    
    Selected capability roots belong to the executor filesystem, not the
    app-server host. Converting their path strings into the host's native
    `Path` breaks whenever the two machines use different path conventions,
    such as a Windows executor behind a Unix app-server.
    
    This PR establishes `PathUri` as the selected-plugin boundary so the
    executor remains authoritative for its paths.
    
    ## What changed
    
    - Require `selectedCapabilityRoots[].location.path` to be a canonical
    `file:` URI and deserialize it directly as `PathUri`; native path
    strings are rejected.
    - Update the app-server schema, generated TypeScript, examples, and
    request coverage for the URI contract.
    - Keep selected roots, resolved plugin locations, manifest paths, and
    manifest resources as `PathUri`.
    - Inspect and read plugin roots and manifests only through the selected
    environment's `ExecutorFileSystem`.
    - Parse executor manifests with the shared URI-native parser from #29620
    instead of projecting them onto the host filesystem.
    - Enforce resource containment lexically and preserve the root URI's
    POSIX or Windows path convention.
    - Cover foreign Windows plugin roots and URI-native manifest resources.
    
    ```text
    thread/start
      selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo"
                                  | PathUri
                                  v
                        ExecutorFileSystem
                                  |
                                  +--> plugin.json
                                  +--> manifest resources
    ```
    
    This PR stops at the shared selected-plugin representation. The next two
    PRs remove the remaining host-path projections in the skill and MCP
    consumers.
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. **This PR** — keep selected plugin roots and resources URI-native.
    4. #29626 — load executor skills without host path conversion.
    5. #29628 — resolve executor MCP working directories without host path
    conversion.
  • [codex] Use input items for Responses Lite tools (#27946)
    When using Responses Lite, we should all use `additional_tools` and a
    developer item instead of the top level tools array & instructions
    field. This keeps things 1-to-1.
    
    Forced namespacing for _all_ tools will land in a following PR after
    some coordination & fixes in Responses API (around collisions & return
    items).
    
    The goal is to eventually expand the scope of this to _all_ requests
    from codex, but that will require larger coordination across providers &
    slower rollout.
  • mcp: accept foreign absolute cwd for remote stdio (#29493)
    ## Why
    
    Remote stdio MCP servers can run in an environment whose path convention
    differs from the Codex host. A Windows cwd such as
    `C:\Users\openai\share` is absolute for the executor but was rejected by
    a POSIX orchestrator.
    
    Built on #29501, now merged, which only clarifies the host-native
    `PathUri` constructor name.
    
    ## What changed
    
    - Deserialize MCP cwd values as `LegacyAppPathString` so config does not
    apply host path rules.
    - Interpret that spelling as host-native for local launches and convert
    it to `PathUri` at executor launch.
    - Skip host filesystem and command resolution checks for remote stdio in
    `codex doctor`.
    - Add host-independent config and executor-boundary coverage using the
    foreign path convention for each test platform.
    
    ## Validation
    
    - `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p
    codex-rmcp-client` (408 passed)
    - `just test -p codex-cli -p codex-rmcp-client` (372 passed)
    - `cargo check --workspace --tests`
    - `just test` (11,311 passed; 43 unrelated environment/timing failures)
    - `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p
    codex-mcp-extension -p codex-rmcp-client -p codex-tui`
  • core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
    ## Description
    This PR cuts Codex over from generic `ResponseItem.metadata` (introduced
    here: https://github.com/openai/codex/pull/28355) to
    `ResponseItem.internal_chat_message_metadata_passthrough`, which is the
    blessed path and has strongly-typed keys.
    
    For now we have to drop this MAv2 usage of `metadata`:
    https://github.com/openai/codex/pull/28561 until we figure out where
    that should live.
  • [codex] Preserve skill descriptions outside model context (#29006)
    ## Why
    
    Skill descriptions are used in model-visible lists: the default
    available-skills catalog that supports implicit selection, and the
    on-demand `skills.list` tool response used to discover orchestrator
    skills. A single overlong description should not consume a
    disproportionate share of either list.
    
    Enforcing the 1024-character limit while loading or migrating skills is
    the wrong boundary: it rejects otherwise-valid skills and discards
    metadata that non-model consumers and full skill reads may need. Skill
    metadata and `SKILL.md` content should remain intact; the cap belongs at
    model-visible list rendering boundaries.
    
    ## What changed
    
    - Preserve full `description` and `metadata.short-description` values
    when loading skills.
    - Preserve full external-agent command descriptions during
    `source-command-*` migration instead of skipping commands solely because
    their descriptions exceed 1024 characters.
    - Preserve full normalized orchestrator descriptions in the underlying
    skills catalog.
    - Cap each description at 1024 Unicode characters when rendering the
    default available-skills context in `codex-core-skills` and
    `codex-skills-extension`.
    - Apply the same cap when serializing descriptions in the model-visible
    `skills.list` response.
    - Render truncated descriptions as 1021 original characters plus `...`.
    - Leave explicit `$skill` injection, `skills.read`, underlying metadata,
    and on-disk `SKILL.md` files unchanged and full-fidelity.
    
    ## Implicit skill selection
    
    Codex injects a bounded catalog containing each implicitly allowed
    skill's name, description, and source locator, together with
    instructions to use a skill when the task clearly matches its
    description. The model makes that semantic choice; after selecting a
    skill, it reads the full `SKILL.md` from its filesystem or provider
    resource. Explicit `$skill` mentions remain a separate path that injects
    the full skill instructions. For orchestrator skills, `skills.list`
    provides bounded discovery metadata before `skills.read` returns the
    full selected resource.
    
    ## Test plan
    
    - `just test -p codex-core-skills`
    - `just test -p codex-skills-extension`
    - `just test -p codex-external-agent-migration`
    
    The focused regressions verify that overlong metadata is preserved at
    load and migration boundaries while default available-skills rendering
    and `skills.list` output produce the 1021-character prefix plus `...`.
  • Add config toggles for orchestrator skills and MCP (#28942)
    ## Why
    
    Orchestrator-provided skills and Codex Apps MCP tools add model-visible
    instructions, resources, and tools beyond the local workspace. Hosts
    need config-level switches to disable those orchestrator-owned surfaces
    independently, without disabling regular skills or regular MCP servers.
    
    ## What changed
    
    - Adds `[orchestrator.skills].enabled` and `[orchestrator.mcp].enabled`
    config entries, both defaulting to `true`.
    - Includes the new settings in `config.schema.json` and in the config
    lock so resolved thread configuration preserves the same orchestrator
    exposure decisions.
    - Threads `orchestrator.skills.enabled` through the app-server skills
    extension so disabled orchestrator skills do not expose the `skills`
    namespace or inject orchestrator skill context.
    - Gates Codex Apps MCP exposure, app instructions, and app auth
    eligibility on `orchestrator.mcp.enabled` while leaving non-Codex-Apps
    MCP tools available.
    - Updates the thread-manager sample config to disable both
    orchestrator-owned surfaces.
    
    ## Verification
    
    - Added config parsing, loading, defaulting, and schema coverage for the
    new settings.
    - Added MCP exposure coverage that `orchestrator.mcp.enabled = false`
    removes Codex Apps tools while preserving regular MCP tools.
    - Added app-server coverage that `orchestrator.skills.enabled = false`
    prevents orchestrator skill tools, prompts, and resource reads from
    reaching the model turn.
  • Add indexed web search mode (#28489)
    ## Summary
    
    - Add `web_search = "indexed"` alongside `disabled`, `cached`, and
    `live`.
    - Use that same resolved mode for both hosted and standalone web search.
    - For hosted search, send `index_gated_web_access: true` with external
    web access enabled only when `indexed` is selected.
    - For standalone search, preserve the existing boolean wire values for
    existing modes (`cached` maps to `false` and `live` to `true`) and send
    `"indexed"` only for `indexed`; `disabled` keeps the tool unavailable.
    - Carry the mode through managed configuration requirements and
    generated schemas.
    
    ## Why
    
    Indexed search provides a middle ground between cached-only search and
    unrestricted live page fetching. Search queries can remain live while
    direct page fetches are limited to URLs admitted by the server.
    
    The existing `web_search` setting remains the single source of truth, so
    hosted and standalone executors cannot drift into different access
    modes. Without an explicit `indexed` selection, the existing
    model-visible tool and request shapes are unchanged.
    
    ```toml
    web_search = "indexed"
    
    [features]
    standalone_web_search = true
    ```
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-api` (`126 passed`)
    - `just test -p codex-web-search-extension` (`7 passed`)
    - `just test -p codex-core
    code_mode_can_call_indexed_standalone_web_search` (`1 passed`)
    - Focused configuration, hosted request, standalone request, and
    managed-requirement coverage is included in the PR; remaining suites run
    in CI.
    
    The full workspace test suite was not run locally.
  • [codex] Assign response item IDs when recording history (#28814)
    ## Why
    
    Client-created response items enter history without IDs, so their
    identity is lost across rollout persistence and resume. IDs should be
    assigned once at the history-recording boundary, while IDs returned by
    the server must remain unchanged.
    
    The Responses API validates item IDs using type-specific prefixes.
    Locally generated IDs therefore use the matching prefix plus a
    hyphenated UUIDv7, keeping them valid while distinguishable from
    server-generated IDs. Because this changes persisted history and
    provider request shapes, the behavior is opt-in behind the
    under-development `item_ids` feature. Compaction triggers remain request
    controls whose API shape does not accept an ID.
    
    ## What changed
    
    - Register the disabled-by-default `item_ids` feature and expose it in
    `config.schema.json`.
    - Make supported optional `ResponseItem` IDs serializable and expose
    them in the generated app-server schemas.
    - When `item_ids` is enabled, assign an ID during conversation-history
    preparation if an item has no ID.
    - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API
    item conventions.
    - Preserve existing server IDs without rewriting them.
    - Persist assigned IDs in rollouts and include them in subsequent
    Responses requests.
    - Remove the unsupported ID field from `CompactionTrigger` and document
    why it has no ID.
    - Add integration coverage for enabled ID persistence, preservation of
    server IDs, and omission of generated IDs while the feature is disabled.
    
    `prepare_conversation_items_for_history` is the single response-item ID
    allocation boundary.
    
    ## Test plan
    
    - `just test -p codex-features`
    - `just test -p codex-core
    response_item_ids_persist_across_resume_and_preserve_server_ids`
    - `just test -p codex-core
    non_openai_responses_requests_omit_item_turn_metadata`
    - `just test -p codex-core
    resize_all_images_prepares_failures_before_history_insertion`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
  • [codex] Reuse parsed plugin skills during session startup (#28844)
    ## Summary
    
    - Preserve raw plugin skill-root snapshots in the matching loaded-plugin
    cache entry, keyed by the effective plugin root identity including
    namespace.
    - Pass those snapshots through `SkillsLoadInput` as an optional preload,
    so session startup reuses plugin parsing while ordinary skill loads pass
    `None`.
    - Keep plugin skill loading cohesive: the existing loaders accept the
    optional snapshots directly, and uncached or marketplace-detail paths do
    not create a cache.
    
    ## Why
    
    Plugin discovery already parses plugin skills to determine available
    capabilities. Cold session startup then scanned and parsed the same
    roots again while building the skills snapshot.
    
    This solves the same duplicate-work problem as #28623 while keeping
    ownership narrow: `PluginsManager` creates and owns
    `PluginSkillSnapshots` only for its loaded-plugin cache entry;
    `SkillsService` consumes an optional clone. Entry replacement or
    clearing naturally drops the snapshots, with no separate generation,
    capacity policy, or watcher coupling.
    
    ## Validation
    
    - `cargo clippy -p codex-core-skills --all-targets -- -D warnings`
    - `just test -p codex-core-plugins
    skills_service_reuses_skills_parsed_during_plugin_load`
    - `just test -p codex-core-skills
    namespaces_plugin_skills_using_provided_namespace`
    - `just fmt`
  • Fix goal-first live threads missing from thread/list (#28808)
    Fixes #28263.
    
    ## Why
    
    When a thread starts with `/goal`, the goal extension can update SQLite
    goal state before the thread has any user-turn rollout items.
    `thread/list` and `thread/search` rely on persisted listing metadata, so
    a goal-first live thread could be absent from app-server listings after
    restart even though the goal itself existed.
    
    This regressed when goal handling moved out of core: the core path wrote
    the goal update through the live thread rollout path, while the
    extension-backed app-server path only updated goal state and emitted the
    live notification.
    
    ## What
    
    - Add `GoalSetOutcome::thread_goal_updated_item()` so the goal extension
    owns the canonical `ThreadGoalUpdated` rollout item shape.
    - Expose a narrow `CodexThread::append_rollout_items()` helper that
    appends through the live thread and keeps derived SQLite metadata in
    sync.
    - When app-server sets a goal on an active live thread, persist the goal
    update through that live-thread path.
    - Add an app-server regression test that starts a live thread with
    `thread/goal/set` and verifies it appears in state-DB-only
    `thread/list`.
    
    ## Verification
    
    - `env -u CODEX_SQLITE_HOME just test -p codex-app-server
    goal_first_live_thread_appears_in_state_db_thread_list`
  • Add turn-scoped context contributions (#28911)
    ## Summary
    - keep context injection on a single ContextContributor trait
    - split context injection into thread-scoped and turn-scoped
    contribution methods
    - wire turn-scoped fragments into initial context assembly so extensions
    can contribute context from turn-local state
  • [codex] Pass plugin namespace into skill loading (#28608)
    ## What changed
    
    - retain the parsed plugin manifest namespace on loaded plugins
    - carry that namespace through `PluginSkillRoot` and `SkillRoot`
    - use the provided namespace when qualifying plugin skill names
    - include the namespace in the skills cache key
    
    ## Why
    
    Plugin loading has already parsed `plugin.json`, but skill parsing
    currently walks every `SKILL.md` ancestor and probes/reads the manifest
    again to reconstruct the same namespace. Passing the parsed namespace
    removes those repeated filesystem calls, which are particularly costly
    on remote filesystems.
    
    Context:
    https://openai.slack.com/archives/C0ARA9GF5D4/p1781639496496439?thread_ts=1781202444.891669&cid=C0ARA9GF5D4
    
    ## Impact
    
    Plugin skill names remain unchanged. A regression test uses a
    deliberately different on-disk manifest name to verify that plugin roots
    use the provided parsed namespace.
    
    ## Validation
    
    - `just test -p codex-core-skills -p codex-core-plugins -p codex-plugin
    -p codex-utils-plugins` (352 passed)
    - `just fix -p codex-core-skills -p codex-core-plugins -p codex-plugin
    -p codex-utils-plugins`
    - `just fmt`
  • [codex] Support plugin manifest path lists (#28790)
    ## Summary
    
    Allow plugin manifests to declare `skills` as either a single path
    string or an array of path strings in the core plugin loader.
    
    ## Why
    
    Some plugin packages need to expose skills from more than one directory.
    Before this change, `plugin.json` only accepted a single string for
    `skills`, so manifests like this were ignored as an invalid `skills`
    shape:
    
    ```json
    {
      "skills": ["./skills/abc", "./skills/edk"]
    }
    ```
    
    This keeps the existing single-string form working while adding support
    for the list form. The final scope is intentionally limited to the core
    plugin manifest/load path for `skills`; `apps`, file-backed
    `mcpServers`, and the bundled plugin-creator assets are unchanged in
    this PR.
    
    ## What changed
    
    - Parse `skills` as either a string or an array of strings in
    `plugin.json`.
    - Store resolved skill paths as a list in `PluginManifestPaths`.
    - Load manifest-declared skill roots in addition to the default
    `./skills` root.
    - Deduplicate exact duplicate skill roots before loading.
    - Rely on existing skill-loader dedupe by canonical `SKILL.md` path for
    overlapping roots such as `./skills` plus `./skills/abc`.
    - Update plugin manifest tests to cover:
      - single string `skills`
      - list of string `skills`
      - duplicate skill roots
      - `./skills` as a manifest path
      - explicit child roots like `./skills/abc` and `./skills/edk`
      - overlapping-root dedupe
    
    ## Validation
    
    - `just test -p codex-plugin`
    - `just test -p codex-core-plugins`
    - `just test -p codex-mcp-extension`
    - `git diff --check`
  • [codex] Add optional IDs to response items (#28812)
    ## Why
    
    `ResponseItem` variants do not have a consistent internal ID shape: some
    variants carry required IDs, some carry optional IDs, and some cannot
    represent an ID at all. The existing fields also use inconsistent serde,
    TypeScript, and JSON-schema annotations. A single enum-level access path
    is needed before history recording can assign and retain IDs.
    
    This PR establishes that internal model only. It intentionally does not
    generate or serialize IDs; allocation and wire persistence are isolated
    in the stacked follow-up.
    
    ## What changed
    
    - Give every concrete `ResponseItem` variant an `Option<String>` ID
    field.
    - Apply the same internal-only annotations to every ID field:
    `#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and
    `#[schemars(skip)]`.
    - Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared
    accessors.
    - Preserve IDs when history items are rewritten for truncation.
    - Adapt consumers that previously assumed reasoning and image-generation
    IDs were required.
    - Regenerate app-server schemas so the hidden fields are represented
    consistently.
    
    The serde catch-all `ResponseItem::Other` remains ID-less because it
    must remain a unit variant.
    
    ## Test plan
    
    - `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace
    -p codex-image-generation-extension`
    - `just test -p codex-protocol`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-api -p codex-rollout-trace -p
    codex-image-generation-extension`
    - `just test -p codex-core event_mapping`
  • Replace SkillsManager with SkillsService (#28705)
    ## Why
    
    Host skill discovery was still exposed as a manager even though it is a
    process-owned service shared by sessions, the app-server catalog, and
    file-watcher invalidation. The skills extension also consumed an ad hoc
    loaded-skills wrapper instead of a named immutable snapshot.
    
    ## What changed
    
    - replace `SkillsManager` with concrete `SkillsService`
    - make the service cache and return immutable `HostSkillsSnapshot`
    values
    - migrate the skills extension host provider to the snapshot boundary
    - migrate app-server catalog, watcher, and invalidation paths to the
    service
    
    This keeps the service limited to host discovery, caching, roots, and
    invalidation. Catalog rendering and invocation remain extension
    responsibilities for the next stacked change.
  • [codex] Support object-valued plugin MCP manifests (#28580)
    ## Summary
    This fixes plugin manifest parsing for MCP servers declared as an object
    directly in `plugin.json`.
    
    Before this change, Codex modeled `mcpServers` as only a string path,
    for example:
    
    ```json
    {
      "name": "counter-sample",
      "version": "1.1.1",
      "mcpServers": "./.mcp.json"
    }
    ```
    
    Some migrated plugins instead provide the server map directly in the
    manifest:
    
    ```json
    {
      "name": "counter-sample",
      "version": "1.1.1",
      "description": "Plugin that declares MCP servers in the manifest",
      "mcpServers": {
        "counter": {
          "type": "http",
          "url": "https://sample.example/counter/mcp"
        }
      }
    }
    ```
    
    That object form previously failed during install/load with an error
    like:
    
    ```text
    failed to parse plugin manifest: invalid type: map, expected a string
    ```
    
    ## What changed
    - Add a manifest representation for `mcpServers` as either
    `Path(Resource)` or `Object(map)`.
    - Parse `plugin.json` `mcpServers` as either a string path or an object.
    - Route object-valued MCP server maps through the existing plugin MCP
    config parser instead of adding a second parser.
    - Apply existing per-plugin MCP server policy to object-valued MCP
    servers the same way as file-backed MCP servers.
    - Include object-valued MCP server names in plugin telemetry/capability
    metadata.
    - Support object-valued MCP config for executor plugins without
    requiring a `.mcp.json` filesystem read.
    - Update the bundled plugin-creator validator and `plugin-json-spec.md`
    so generated-plugin validation accepts the same object-valued shape.
    
    ## Compatibility
    Existing plugin manifests that use `"mcpServers": "./.mcp.json"`
    continue to work. Plugins can now also use the object shape shown above.
    
    ## Tests
    Added coverage for the new manifest attribute shape at the install,
    normal load, telemetry, and executor-provider layers:
    
    - `install_accepts_manifest_mcp_server_objects`
    - `load_plugins_loads_manifest_mcp_server_objects`
    - `plugin_telemetry_metadata_uses_manifest_mcp_server_objects`
    - `reads_manifest_object_config_without_executor_file_system_access`
    
    Also smoke-tested the plugin-creator validator against both supported
    forms:
    
    - `mcpServers` as a direct object in `plugin.json`
    - `mcpServers` as `"./.mcp.json"` with a companion `.mcp.json`
    
    ## Validation
    - `just test -p codex-plugin`
    - `just test -p codex-core-plugins`
    - `just test -p codex-mcp-extension`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just fmt`
    - `git diff --check`
    - Focused rename/object-form rerun: `just test -p codex-core-plugins
    manager::tests::load_plugins_loads_manifest_mcp_server_objects
    manager::tests::plugin_telemetry_metadata_uses_manifest_mcp_server_objects
    store::tests::install_accepts_manifest_mcp_server_objects`
    - Focused executor rerun: `just test -p codex-mcp-extension
    executor_plugin::provider::tests::reads_manifest_object_config_without_executor_file_system_access`
    - `python3
    codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py
    /private/tmp/codex-validator-object`
    - `python3
    codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py
    /private/tmp/codex-validator-path`
  • [codex] exec-server: stream files in chunks (#28354)
    ## Why
    
    `fs/readFile` buffers the entire file in one response, which makes large
    remote reads expensive and prevents callers from applying backpressure.
    We need an opt-in streaming path with bounded block sizes while
    preserving the existing single-call API for small and sandboxed reads.
    
    ## What changed
    
    - Add `ExecServerClient::stream`, returning a named `FileReadStream`
    that implements `futures::Stream` and yields immutable 1 MiB byte
    blocks.
    - Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs.
    `fs/readBlock` accepts an explicit offset and length.
    - Keep unsandboxed files open between block reads, cap open handles per
    connection, and clean them up on EOF, error, stream drop, explicit
    close, or connection shutdown.
    - Reject platform-sandboxed streaming opens instead of turning the
    one-shot sandbox helper into a persistent server. Existing `fs/readFile`
    behavior is unchanged.
    
    ## Testing
    
    - `just test -p codex-exec-server`
    - Integration coverage for 1 MiB chunking, exact block-boundary EOF,
    sandbox rejection, and continued reads from the opened file after path
    replacement.
    - Handle-manager coverage for non-sequential offsets, variable block
    lengths, the 128-handle limit, and capacity release after close.
  • feat: render typed envelopes for multi-agent v2 messages (#28368)
    ## Why
    
    Multi-agent v2 messages need a consistent, model-visible envelope that
    identifies what kind of interaction occurred, who sent it, and which
    agent it targets. Previously, encrypted deliveries exposed only
    `encrypted_content`, while child completion used the legacy
    `<subagent_notification>` shape. That meant the client could not
    consistently present `NEW_TASK`, `MESSAGE`, and `FINAL_ANSWER` using the
    same format.
    
    This change adds the routing envelope as plaintext while keeping task
    and message payloads encrypted. No new Responses API field is required:
    an encrypted delivery is represented as an `input_text` header
    immediately followed by its existing `encrypted_content` item.
    
    Every envelope now follows this shape:
    
    ```text
    Message Type: <NEW_TASK | MESSAGE | FINAL_ANSWER>
    Task name: <recipient agent path>
    Sender: <author agent path>
    Payload:
    <message payload>
    ```
    
    ## Message types
    
    ### `NEW_TASK`
    
    `NEW_TASK` is used when the recipient should begin a new turn, including
    an initial `spawn_agent` task and a later `followup_task`.
    
    For a root agent spawning `/root/worker`, the request contains a
    plaintext envelope followed by the encrypted task:
    
    ```json
    {
      "type": "agent_message",
      "author": "/root",
      "recipient": "/root/worker",
      "content": [
        {
          "type": "input_text",
          "text": "Message Type: NEW_TASK\nTask name: /root/worker\nSender: /root\nPayload:\n"
        },
        {
          "type": "encrypted_content",
          "encrypted_content": "<encrypted task payload>"
        }
      ]
    }
    ```
    
    Conceptually, the model receives:
    
    ```text
    Message Type: NEW_TASK
    Task name: /root/worker
    Sender: /root
    Payload:
    Review the authentication changes and report any regressions.
    ```
    
    ### `MESSAGE`
    
    `MESSAGE` is used for a queued `send_message` delivery. It communicates
    with an existing agent without starting a new turn.
    
    For `/root/worker` reporting progress to the root agent, the request
    contains:
    
    ```json
    {
      "type": "agent_message",
      "author": "/root/worker",
      "recipient": "/root",
      "content": [
        {
          "type": "input_text",
          "text": "Message Type: MESSAGE\nTask name: /root\nSender: /root/worker\nPayload:\n"
        },
        {
          "type": "encrypted_content",
          "encrypted_content": "<encrypted message payload>"
        }
      ]
    }
    ```
    
    Conceptually, the model receives:
    
    ```text
    Message Type: MESSAGE
    Task name: /root
    Sender: /root/worker
    Payload:
    The protocol tests pass; I am checking the resume path now.
    ```
    
    ### `FINAL_ANSWER`
    
    `FINAL_ANSWER` is emitted when a child agent reaches a terminal state
    and reports its result to its parent. Completion payloads are already
    available locally, so the complete envelope is represented as plaintext
    rather than as a plaintext header plus encrypted content.
    
    For `/root/worker` completing work for the root agent, the request
    contains:
    
    ```json
    {
      "type": "agent_message",
      "author": "/root/worker",
      "recipient": "/root",
      "content": [
        {
          "type": "input_text",
          "text": "Message Type: FINAL_ANSWER\nTask name: /root\nSender: /root/worker\nPayload:\nNo regressions found."
        }
      ]
    }
    ```
    
    The model-visible form is:
    
    ```text
    Message Type: FINAL_ANSWER
    Task name: /root
    Sender: /root/worker
    Payload:
    No regressions found.
    ```
    
    Errored, shut down, and missing agents also use `FINAL_ANSWER`, with a
    terminal-status description in the payload.
    
    ## What changed
    
    - Render `NEW_TASK` or `MESSAGE` in
    `InterAgentCommunication::to_model_input_item`, based on whether the
    encrypted delivery starts a turn.
    - Replace the multi-agent v2 `<subagent_notification>` completion
    payload with a model-visible `FINAL_ANSWER` envelope.
    - Document `Task name`, `Sender`, and `Payload` consistently in the
    multi-agent developer instructions.
    - Prevent local-only history projections from treating an encrypted
    message's plaintext header as the complete assistant message.
    - Preserve rollout-trace interaction edges when an agent message
    contains both plaintext and encrypted content.
    
    Legacy multi-agent behavior remains unchanged.
    
    ## Verification
    
    - `just test -p codex-protocol`
    - `just test -p codex-rollout-trace`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-core
    encrypted_multi_agent_v2_spawn_sends_agent_message_to_child`
    - `just test -p codex-core
    plaintext_multi_agent_v2_completion_sends_agent_message`
    - `just test -p codex-core
    multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn`
    - `just test -p codex-core
    multi_agent_v2_completion_queues_message_for_direct_parent`
  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
  • Use PathUri in filesystem permission paths for exec-server (#28165)
    ## Why
    
    Progress towards letting app-server and exec-server run on different
    platforms, specifically for sandbox configuration.
    
    ## What
    
    - Make the filesystem path containment hierarchy generic, defaulting to
    `AbsolutePathBuf` for now.
    - Have clients specify `AbsolutePathBuf` or `PathUri` directly where
    needed.
    - Use `PathUri` throughout exec-server filesystem protocol and trait
    boundaries.
    - Implement `From` for conversion to path URIs and `TryFrom` for
    fallible conversion to absolute paths through the generic type
    hierarchy.
  • feat(core): add metadata field to ResponseItem (#28355)
    ## Description
    
    This PR adds an optional `metadata` field to `ResponseItem` for
    Responses API calls. Only mechanical plumbing, no actual values
    populated and sent yet. Turns out just adding a new field to
    `ResponseItem` has quite a large blast radius already.
    
    This change is backwards compatible because `metadata` is optional and
    omitted when absent, so existing response items and rollout history
    without it still deserialize and requests that do not set it keep the
    same wire shape. For provider compatibility, we strip out `metadata`
    before non-OpenAI Responses requests so Azure and AWS Bedrock never see
    this field.
    
    My followup PR here will actually make use of it to start storing and
    passing along `turn_id`: https://github.com/openai/codex/pull/28360
    
    ## What changed
    
    - Added `ResponseItemMetadata` with optional `turn_id`, plus optional
    `metadata` on Responses API item variants and inter-agent communication.
    - Preserved item metadata through response-item rewrites such as
    truncation, missing tool-output synthesis, compaction history
    rebuilding, visible-history conversion, rollout/resume, and generated
    app-server schemas/types.
    - Strip item metadata from non-OpenAI Responses requests while
    preserving it for OpenAI-shaped requests.
    - Updated the mechanical fixture/test construction churn required by the
    new optional field.
  • skills: cache orchestrator resources per thread (#28336)
    ## Why
    
    Hosted orchestrator skills are read through the remote MCP resource
    server. Within one thread, the same catalog or skill resource can be
    requested multiple times by prompt injection and the `skills.list` /
    `skills.read` tools. Re-fetching adds latency and can make those
    surfaces observe different remote contents during the same thread.
    
    This is a follow-up to #28333: orchestrator skills remain limited to
    threads without a local executor, and those threads now get a stable
    per-thread view of the remote skill data they use.
    
    ## What changed
    
    - Reuse the existing per-thread orchestrator catalog snapshot for
    `skills.list` and `skills.read` availability checks.
    - Cache successful orchestrator resource reads by authority, package,
    and resource so prompt injection and tool calls share the same contents.
    - Keep the cache memory-only and bounded to 100 resources and 8 MiB per
    thread.
    - Leave host and executor skill reads unchanged, and do not cache failed
    remote reads.
    
    ## Verification
    
    - Extended the app-server MCP resource integration test to read the same
    hosted skill resource twice and verify that the remote server receives
    one read.
    - The same test verifies that catalog discovery and the selected skill's
    main prompt are each fetched only once per thread.
  • skills: hide orchestrator skills with a local executor (#28333)
    ## Why
    
    App-server threads without a local executor need orchestrator-owned
    skills from the hosted `codex_apps` MCP server. Threads with the local
    executor already discover installed skills from the local filesystem.
    
    After the orchestrator skill provider was enabled for every app-server
    thread, local-executor threads also received the hosted skill catalog
    and the `skills.list` and `skills.read` tools. This changed the existing
    local behavior and could expose a second hosted copy of a skill that was
    already installed locally.
    
    ## What changed
    
    - Expose the thread's selected execution environments to extensions at
    thread startup.
    - Enable orchestrator skills only when the reserved local environment is
    not selected.
    - Apply that decision consistently to hosted skill catalog discovery,
    explicit skill injection, and the `skills.list` and `skills.read` tools.
    
    ## Verification
    
    - The existing no-executor app-server test continues to verify hosted
    skill discovery, invocation, and child-resource reads.
    - A new app-server test verifies that local-executor threads do not
    receive hosted skill context or `skills.*` tools.
  • Discover stdio MCP servers from selected executor plugins (#27870)
    ## Why
    
    **In short:** this PR discovers MCP registrations by reading a selected
    plugin's `.mcp.json` on its executor. #27884 then resolves those
    registrations in the shared catalog.
    
    `thread/start.selectedCapabilityRoots` can select a plugin root owned by
    an executor, and Codex can resolve that package through the executor
    filesystem. MCP declarations inside the selected plugin are still
    ignored.
    
    This PR adds the source-specific discovery layer on top of the
    selected-plugin catalog boundary in #27884:
    
    ```text
    selected capability root
            |
            v
    resolve the plugin through its executor filesystem
            |
            v
    read and normalize its MCP config through the same filesystem
            |
            v
    contribute stdio registrations bound to that environment ID
    ```
    
    The existing MCP launcher and connection manager remain unchanged. MCP
    config parsing is shared with local plugins through #27863.
    
    ## What changed
    
    - Added an executor plugin MCP provider in the MCP extension.
    - Retained only the exact filesystem capability used for package
    resolution and reused it for the selected plugin's MCP config, with no
    host-filesystem fallback or unrelated process/HTTP authority.
    - Read either the manifest-declared MCP config or the default
    `.mcp.json`; a missing default file means the plugin has no MCP servers.
    - Accepted stdio servers only for this first vertical. Executor-owned
    HTTP declarations are skipped with a warning until their placement
    semantics are defined.
    - Normalized stdio registrations with the owning environment's stable
    logical ID and plugin-root working directory.
    - Resolved environment-variable names on the owning executor and
    rejected explicit local forwarding for non-local plugins.
    - Froze discovered declarations once per active thread runtime, then
    applied current managed plugin and MCP requirements when contributing
    them.
    - Carried the selected root ID, display name, and selection order into
    the catalog contribution defined by #27884.
    
    ## Behavior and scope
    
    There is intentionally no production behavior change yet. This PR
    provides the executor provider and contribution boundary, but app-server
    does not install it in this change. Existing local plugin MCP loading is
    unchanged, and no MCP process is launched by this PR alone.
    
    ## Assumptions
    
    - The selected root ID is the plugin policy identity; the manifest
    display name is presentation metadata.
    - An environment ID is a stable logical authority. Reconnection or
    replacement under the same ID does not change ownership.
    - Selected plugin packages and their manifests are trusted inputs.
    - The selected package and MCP discovery snapshot remain frozen for the
    active thread runtime.
    
    ## Follow-up
    
    The next PR installs this contributor in app-server and adds an
    end-to-end test proving that a selected plugin MCP tool launches on its
    owning executor, can be called by the model, survives an explicit MCP
    refresh, and is invisible when its root was not selected.
    
    Resume, fork, environment removal or ID changes, dynamic catalog reload,
    and executor-owned HTTP MCP placement remain separate lifecycle
    decisions.
    
    ## Verification
    
    Focused tests cover executor-only filesystem reads, missing and
    malformed config, stdio filtering and normalization, managed
    requirements, package attribution, and selection order. CI owns
    execution of the test suite.
  • Add selected-plugin precedence and attribution to the MCP catalog (#27884)
    ## Why
    
    **In short:** this PR resolves already-discovered MCP registrations. It
    does not read selected plugins or discover their MCP servers.
    
    The resolved MCP catalog currently builds config and auto-discovered
    plugin registrations before runtime contributors are applied. A
    thread-selected plugin needs a distinct precedence tier in that same
    initial resolution pass: otherwise a disabled lower-precedence winner
    can leave stale name-level state behind, and the winning MCP tools
    cannot be attributed to the selected package reliably.
    
    This PR adds that catalog boundary before executor discovery is
    connected.
    
    ## What changed
    
    - Added an explicit selected-plugin registration tier between
    auto-discovered plugins and explicit config.
    - Collected selected-plugin contributions before the initial catalog
    build, while leaving compatibility and generic extension overlays in
    their existing runtime phase.
    - Retained the winning plugin ID and display name directly on
    plugin-owned catalog registrations.
    - Derived MCP tool provenance from the winning catalog entry instead of
    joining against local-only plugin summaries.
    - Retained the winning selected server's tool approval policy in the
    running connection manager, so a selected registration cannot inherit
    approval behavior from a losing local plugin.
    - Kept remembered approval session-scoped for selected plugins until
    there is an authority-aware persistence contract; Codex will not write
    approval back to an unrelated local plugin.
    - Preserved existing name-level disabled vetoes for discovered plugins
    and config, while keeping a selected package's own disabled registration
    scoped to that registration.
    - Preserved deterministic selection order and existing config,
    compatibility, and extension precedence.
    
    The resulting order is:
    
    ```text
    auto-discovered plugin
      < selected plugin
      < explicit config
      < compatibility registration
      < extension overlay
    ```
    
    ## Behavior and scope
    
    This is a catalog and provenance change only. No production host
    contributes selected-plugin MCP registrations yet, so existing local MCP
    behavior remains unchanged.
    
    The stacked follow-up, #27870, installs the executor plugin provider
    that produces these registrations. App-server activation remains a
    separate final step.
    
    ## Verification
    
    Focused tests cover precedence, deterministic selected-plugin conflicts,
    disabled-veto behavior across catalog phases, managed requirements
    before selected-plugin resolution, winning-server approval policy, and
    attribution when local and selected packages share an ID or server name.
    CI owns execution of the test suite.
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • [codex] make PathUri::from_abs_path infallible (#27976)
    ## Why
    
    `PathUri::from_abs_path` can fail for absolute paths that do not have a
    normal `file:` URI representation, forcing filesystem call sites to
    handle a conversion error even though the original path can be preserved
    losslessly.
    
    ## What
    
    Make `from_abs_path` infallible and migrate its callers. Unrepresentable
    paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or
    Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The
    leading encoded null reserves a namespace that cannot collide with a
    real Unix or Windows path, and fallback URIs remain opaque to lexical
    path operations.
    
    ## Validation
    
    Added path-URI coverage for Unix null and non-UTF-8 paths, Windows
    device/verbatim and non-Unicode paths, serialization, malformed
    fallbacks, opaque lexical operations, invalid native payloads, and
    literal `/bad/path` collision resistance.
  • Support plaintext agent messages (#27830)
    ## Why
    
    Multi-agent v2 `send_message` deliveries already reach the receiving
    model as typed `agent_message` items with encrypted content.
    Child-completion notifications are generated by Codex itself, so their
    content is plaintext and previously fell back to a serialized JSON
    envelope inside an assistant message.
    
    With plaintext `input_text` supported for `agent_message`, both delivery
    paths can use the same model-visible type while preserving explicit
    author and recipient metadata.
    
    ## What changed
    
    - add plaintext `input_text` support to `AgentMessageInputContent` and
    regenerate the affected app-server schemas
    - preserve `InterAgentCommunication` as structured mailbox input instead
    of converting it to assistant text
    - record delivered communications as typed `agent_message` history items
    - persist a dedicated rollout item so local delivery metadata such as
    `trigger_turn` remains available without leaking into the Responses
    request
    - reconstruct typed agent messages on resume and preserve fork-turn
    truncation behavior
    - remove request-time assistant-content parsing
    - preserve plaintext and encrypted inter-agent deliveries in stage-one
    memory inputs
    - normalize and link plaintext and encrypted agent messages in rollout
    traces without treating inbound messages as child results
    - cover the real MultiAgent V2 child-completion path end to end with
    deterministic mailbox synchronization
    
    ## Verification
    
    - `just test -p codex-core
    plaintext_multi_agent_v2_completion_sends_agent_message`
    - `just test -p codex-core input_queue_drains_mailbox_in_delivery_order
    record_initial_history_reconstructs_typed_inter_agent_message
    fork_turn_positions_use_inter_agent_delivery_metadata`
    - `just test -p codex-memories-write
    serializes_inter_agent_communications_for_memory`
    - `just test -p codex-rollout-trace
    agent_messages_preserve_routing_and_content
    sub_agent_started_activity_creates_spawn_edge`
    - `just test -p codex-rollout-trace
    agent_result_edge_falls_back_to_child_thread_without_result_message`
    - `just test -p codex-protocol -p codex-rollout -p
    codex-app-server-protocol`
  • [codex] Add size to internal filesystem metadata (#27927)
    ## Why
    
    `ExecutorFileSystem::get_metadata` reports file kind and timestamps but
    not size. Internal callers that need to enforce a size limit therefore
    have to read the complete file first, which is especially wasteful for
    remote filesystems.
    
    This adds the missing internal metadata so consumers can reject
    oversized files before transferring or buffering them. The field is
    named `size`, matching VS Code's `FileStat.size` filesystem convention.
    
    ## What changed
    
    - add `size: u64` to internal `FileMetadata`
    - populate it from the underlying filesystem metadata
    - carry it through sandbox-helper and remote exec-server responses
    - cover files, directories, symlink targets, and sandboxed reads across
    local and remote filesystem implementations
    
    The new field is intentionally not exposed through the app-server API.
    
    ## Testing
    
    - `just test -p codex-exec-server get_metadata`
    - `just test -p codex-exec-server
    file_system_sandboxed_metadata_and_read_allow_readable_root`
    - `just test -p codex-core-plugins`
    - `just test -p codex-skills-extension`
  • Handle standalone image generation failures as terminal items (#27920)
    ## Why
    
    Standalone image generation emitted a started item but no terminal item
    when the backend failed. Clients could leave the operation unresolved or
    render it as successful.
    
    ## What changed
    
    - Emit a terminal image-generation item with `status: "failed"` when
    generation or editing fails.
    - Skip image persistence for failed terminal items.
    - Render failed image generation distinctly in TUI history.
    - Preserve the status when handling live and replayed terminal items.
    
    ## Looks for TUI, App-Side change needed 
    
    <img width="867" height="89" alt="image"
    src="https://github.com/user-attachments/assets/9e32342f-a982-411e-8498-456639fc468a"
    />
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - App-server image-generation tests
    - Core stream-event tests
    - TUI image-generation lifecycle and snapshot tests
    - Scoped Clippy and formatting
  • Make MCP server contributions thread-scoped (#27670)
    ## Why
    
    `selectedCapabilityRoots` belongs to one thread, but MCP contributors
    previously received only the global Codex config. That left no clean way
    for a selected executor capability to contribute MCP servers to its own
    thread.
    
    ## What this PR does
    
    - Gives MCP contributors a small context containing the config and, for
    a running thread, its frozen host-seeded inputs.
    - Uses the same thread inputs during startup, status queries, refreshes,
    and skill dependency checks.
    - Keeps threadless MCP operations and the existing hosted Apps behavior
    unchanged.
    - Adds coverage showing that two threads resolve independent
    registrations and that later lifecycle mutations do not change the
    frozen MCP inputs.
    
    This PR does not discover plugin manifests, add MCP servers, or launch
    anything new. It only establishes the thread-scoped registration
    boundary.
    
    ## Follow-ups
    
    - Resolve selected executor plugin roots through their owning
    environment filesystem.
    - Convert their stdio MCP declarations into environment-bound
    registrations and add an executor MCP end-to-end test.
    
    ## Verification
    
    - `just fmt`
    - `cargo check --tests -p codex-protocol -p codex-extension-api -p
    codex-mcp-extension -p codex-core -p codex-app-server`
    
    Tests and Clippy were not run.
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • Fix image extension PathUri conversion (#27711)
    ## Why
    
    `main` stopped compiling when #27498 passed an `AbsolutePathBuf` to the
    `ExecutorFileSystem` API migrated to `PathUri` by #27653.
    
    ## What
    
    Convert referenced image paths to `PathUri` before filesystem reads,
    declare the internal path-URI dependency, and refresh `Cargo.lock`.
  • Route image extension reads through turn environments v2 (#27498)
    ## Why
    
    Image generation used `std::fs::read` for referenced image paths, which
    did not support environment-backed filesystems or their sandbox context.
    
    ## What changed
    
    - Expose optional turn environments to extension tool calls.
    - Include each environment’s ID, working directory, filesystem, and
    sandbox context.
    - Read referenced images through the selected environment filesystem.
    - Keep sandbox usage at the extension call site so extensions can choose
    the appropriate access mode.
    - Consolidate image request construction into one async function.
    - Add coverage for successful environment reads and read failures.
    
    ## Validation
    
    - `cargo check -p codex-image-generation-extension --tests`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    `just test -p codex-image-generation-extension` could not complete
    because the build exhausted available disk space.
  • Resolve MCP server registrations through a catalog (#27634)
    ## Why
    
    MCP servers currently come from user config, local plugins,
    compatibility Apps synthesis, and host extensions. Those sources were
    composed by mutating a shared map, leaving registration identity,
    precedence, removal, and provenance implicit in assembly order.
    
    Before adding executor-owned MCPs, Codex needs one durable resolution
    boundary above `McpConnectionManager`. This PR introduces that boundary
    while preserving current server configuration, policy, and runtime
    behavior. Executor-scoped registrations and explicit policy layers
    remain follow-ups.
    
    ## What changed
    
    - Add typed `McpServerRegistration` inputs and an immutable
    `ResolvedMcpCatalog` in `codex-mcp`.
    - Retain each registration's complete `McpServerConfig`, including its
    environment binding, while recording its source and provenance.
    - Preserve the existing structural precedence between plugin, config,
    compatibility, and ordered extension sources.
    - Resolve equal-precedence actions by contribution order; provenance IDs
    are used only for diagnostics and cannot affect the winner.
    - Preserve extension removals and the existing name-scoped `enabled =
    false` veto.
    - Report same-tier conflicts with every contender and the final catalog
    outcome, including whether the winning action registers or removes the
    server.
    - Require MCP contributors to provide a stable diagnostic identity.
    - Derive materialized server maps and plugin ownership from the resolved
    catalog.
    
    `McpConnectionManager`, transport startup, tool calls, and resource
    routing continue to consume the same effective `McpServerConfig` values.
    
    ## Scope
    
    This PR does not add new MCP capabilities or change user-visible
    behavior. It does not add executor plugin discovery, thread-scoped
    registrations, dynamic refresh generations, or new user/managed policy
    semantics.
    
    ## Verification
    
    - Added focused catalog coverage for source precedence, complete
    configuration preservation, disabled vetoes, plugin ownership,
    contribution-order tie breaking, removal outcomes, and conflict
    diagnostics.
    - Extended hosted Apps coverage for ordered extension removal and
    Apps-disabled hosts with and without the hosted extension installed.
    - `cargo check -p codex-mcp --tests -p codex-extension-api -p
    codex-core`