52 Commits

  • Reinject missing World State fragments on resume (#30152)
    ## Why
    
    World State restores its structured snapshot on resume so unchanged
    sections do not have to be rendered again. That is safe only when the
    model-visible fragment represented by the snapshot is still present in
    retained history.
    
    For selected executor skills, the failing selected-capability scenario
    exposed this state:
    
    ```text
    persisted World State: selected skill catalog is known
    retained model history: selected skill catalog message is missing
    next diff: unchanged, so emit nothing
    ```
    
    The model resumes without being told about the selected skill catalog.
    
    ## What changed
    
    World State contributions may now optionally describe the concrete
    model-visible fragment that must remain in retained history.
    
    When a persisted snapshot is present:
    
    ```text
    matching retained fragment exists -> trust snapshot, emit nothing
    matching retained fragment missing -> treat section as absent, render current state once
    ```
    
    The skills extension uses this for non-empty selected-environment
    catalogs by matching its exact rendered catalog body. Empty or hidden
    catalogs do not require a fragment.
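    
    The resume decision above can be sketched as a small pure function. This is a minimal illustration, not the actual implementation: `SectionSnapshot`, `ResumeAction`, and `resume_action` are hypothetical names, and retained history is modeled as plain strings.
    
    ```rust
    /// Hypothetical stand-in for one persisted World State section snapshot.
    struct SectionSnapshot {
        /// Exact model-visible body the snapshot represents, if the
        /// contribution declared one. `None` means no fragment is required
        /// (for example an empty or hidden catalog).
        required_fragment: Option<String>,
    }
    
    /// What the resume path does for one section (assumed names).
    #[derive(Debug, PartialEq)]
    enum ResumeAction {
        /// Keep the snapshot as the baseline; an unchanged diff emits nothing.
        TrustSnapshot,
        /// Drop the baseline so the current state is rendered once.
        TreatAsAbsent,
    }
    
    fn resume_action(snapshot: &SectionSnapshot, retained_history: &[String]) -> ResumeAction {
        match &snapshot.required_fragment {
            // No fragment declared: the snapshot alone is enough.
            None => ResumeAction::TrustSnapshot,
            // The declared fragment must appear verbatim in retained history.
            Some(body) if retained_history.iter().any(|message| message == body) => {
                ResumeAction::TrustSnapshot
            }
            Some(_) => ResumeAction::TreatAsAbsent,
        }
    }
    ```
    
    Matching on the exact rendered body keeps the check independent of how history was compacted or truncated: only the presence of the fragment matters.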
    
    ## Scope
    
    This does not clear or rebuild the whole World State baseline. It does
    not change skill discovery, cache invalidation, environment
    availability, or MCP runtime behavior. It only keeps a persisted section
    snapshot and its retained model context consistent across resume/history
    reconstruction.
    
    ## Coverage
    
    A focused World State regression test verifies both sides:
    
    - a missing retained fragment is rendered again
    - a matching retained fragment avoids duplicate injection
  • Project selected plugin runtime by environment availability (#30093)
    ## Why
    
    Selected plugin metadata is stable, but MCP processes are live runtime
    state. They need different lifetimes:
    
    - the MCP extension caches manifest, MCP, and connector declarations for
    each stable selected root;
    - each model step projects that cached metadata through the roots that
    resolved as ready for that exact step;
    - the MCP manager is rebuilt only when that availability projection
    changes.
    
    This matches executor skills: both features consume the same resolved
    step roots instead of inferring readiness from the turn's selected
    environments.
    
    ## Behavior
    
    ```text
    E1 not ready for this step
      -> no E1 MCP servers or connectors
      -> cached plugin metadata stays in ext/mcp
    
    E1 becomes ready
      -> reuse cached metadata
      -> publish one MCP runtime containing E1 capabilities
    
    same ready roots on the next step
      -> reuse the exact runtime; no rediscovery and no MCP restart
    
    resume
      -> create new extension thread state and a new MCP runtime
    ```
    
    All model-facing consumers use the same step snapshot:
    
    ```text
    resolved selected roots
            |
            v
    extension MCP/connector projection
            |
            v
    { MCP config, connector snapshot, MCP manager }
            |
            +-> advertise model tools
            +-> build app/connector tools
            +-> execute MCP calls
    ```
    
    ## Cache contract
    
    The existing MCP extension owns a cache keyed by the full
    `SelectedCapabilityRoot`:
    
    ```rust
    let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
    ```
    
    The cache lives with extension thread state. Environment availability
    filters projection but does not invalidate metadata. Resume creates new
    thread state. There is no file watcher or executor generation because
    contents behind a stable environment/root are assumed stable.
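    
    A minimal sketch of this contract, under stated assumptions: `RootKey` is a simplified stand-in for `SelectedCapabilityRoot`, `McpProjectionState` and `project` are hypothetical names, and a rebuild counter stands in for replacing the MCP manager.
    
    ```rust
    use std::collections::{BTreeSet, HashMap};
    
    /// Hypothetical simplification of `SelectedCapabilityRoot`:
    /// (environment ID, root path).
    type RootKey = (String, String);
    
    #[derive(Default)]
    struct McpProjectionState {
        /// Declarations cached per selected root; availability never removes them.
        metadata: HashMap<RootKey, Vec<String>>,
        /// Ready-root set the current runtime was built from.
        current: Option<BTreeSet<RootKey>>,
        /// Counts runtime rebuilds so reuse is observable in this sketch.
        rebuilds: u32,
    }
    
    impl McpProjectionState {
        /// Returns the server names visible for this step and rebuilds the
        /// runtime only when the ready-root projection changed.
        fn project(&mut self, ready: &BTreeSet<RootKey>) -> Vec<String> {
            if self.current.as_ref() != Some(ready) {
                self.current = Some(ready.clone());
                self.rebuilds += 1;
            }
            let mut servers: Vec<String> = ready
                .iter()
                .filter_map(|root| self.metadata.get(root))
                .flatten()
                .cloned()
                .collect();
            servers.sort();
            servers
        }
    }
    ```
    
    Dropping the whole `McpProjectionState` models resume: the next thread starts with an empty cache and no runtime.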
    
    ## What changes
    
    - Keeps executor plugin discovery and cached metadata in `ext/mcp`.
    - Caches MCP and connector declarations together per selected root.
    - Uses the step's already-resolved capability roots, including lazy
    environments that are not turn environments.
    - Reuses the current MCP runtime when the ready-root projection is
    unchanged.
    - Uses the same step MCP manager and connector snapshot for
    model-visible tools and execution.
    - Resolves direct thread-scoped MCP requests from the current
    selected-root projection.
    
    ## Deliberately out of scope
    
    - `app/list` remains based on the latest global host-plugin state; this
    PR does not make its response or notifications thread-specific.
    - `required = true` startup semantics do not apply to delayed executor
    MCP activation.
    - No filesystem/content invalidation.
    - No transport-disconnect watcher.
    - No executor generations or environment replacement semantics.
    - No client sharing across complete manager replacements.
    
    ## Stack
    
    1. Extension-owned World State sections.
    2. Project executor skills through World State.
    3. Pin one MCP runtime to each model step.
    4. **This PR:** project selected MCP and connector state from
    extension-owned metadata.
    5. Integration coverage for selected capability availability and resume.
    
    ## Verification
    
    - `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
    - The stacked integration PR covers unavailable to ready activation,
    unchanged-runtime reuse, skills, MCP tools, connector attribution, and
    cold resume.
  • Project executor skills through World State (#30088)
    ## Why
    
    A selected executor environment can be unavailable in one model step and
    ready in the next. The model should see its skills only while that
    environment is ready, without rescanning stable files on every sample.
    
    The product assumption is simple:
    
    - an environment ID names one stable logical environment;
    - the selected root contents do not change during the thread.
    
    ## Behavior
    
    ```text
    E1 unavailable -> do not show E1 skills
    E1 ready       -> discover once, cache, show through World State
    E1 unavailable -> hide skills, keep cache
    E1 ready again -> reuse cache, show skills again
    resume         -> create a new thread cache and discover again
    ```
    
    The cache key is the full `SelectedCapabilityRoot`. Availability does
    not invalidate it; dropping the extension's thread state does.
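    
    The discover-once behavior can be sketched as follows. `SkillCatalogCache` and `visible_skills` are hypothetical names, the key is a simplified (environment ID, root path) pair, and the `discover` closure stands in for the executor filesystem scan.
    
    ```rust
    use std::collections::HashMap;
    
    /// Hypothetical per-thread cache of executor skill catalogs.
    #[derive(Default)]
    struct SkillCatalogCache {
        catalogs: HashMap<(String, String), Vec<String>>,
    }
    
    impl SkillCatalogCache {
        /// Returns the skills to show for one selected root in this step.
        /// `discover` runs at most once per root for the lifetime of the cache.
        fn visible_skills<F>(
            &mut self,
            root: &(String, String),
            ready: bool,
            discover: F,
        ) -> Vec<String>
        where
            F: FnOnce() -> Vec<String>,
        {
            if !ready {
                // Unavailable: hide the skills but keep any cached catalog.
                return Vec::new();
            }
            self.catalogs
                .entry(root.clone())
                .or_insert_with(discover)
                .clone()
        }
    }
    ```
    
    Creating a fresh `SkillCatalogCache` corresponds to the resume row in the table above: the catalog is discovered again.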
    
    The step supplies the ready selected roots directly. They do not have to
    be turn environments:
    
    ```text
    turn environment: laptop
    selected root:    worker:/plugins/lint-fix
    
    worker ready -> lint-fix skills are visible
    ```
    
    ## What changes
    
    - Keeps executor skill catalogs in the existing skills extension.
    - Passes the roots resolved as ready for the step into World State
    contributors.
    - Loads each ready selected root at most once per thread.
    - Contributes the executor catalog as the `skills` World State section.
    - Uses the exact step catalog for explicit skill selection and body
    reads.
    - Leaves host and orchestrator skill behavior where it already lives.
    
    Taking a step snapshot itself does not add an RPC. Executor filesystem
    calls happen only on the first discovery of a stable root for that
    thread.
    
    ## What does not change
    
    - No filesystem watcher or content-based invalidation.
    - No retry/generation framework.
    - No skill runtime migration into core.
    - No general rewrite of the skills extension.
    
    ## Stack
    
    1. Extension-owned World State sections.
    2. **This PR:** project cached executor skills through World State.
    3. Pin one MCP runtime to each model step.
    4. Project selected MCP/app/connector metadata by environment
    availability.
    5. One end-to-end integration scenario.
  • Let extensions contribute World State sections (#30100)
    ## Why
    
    #29856 already owns the durable thread intent and exact environment
    binding. This PR adds only the small missing extension boundary: an
    extension can contribute one named World State section, while core still
    owns persistence, diffing, and model-visible fragment types.
    
    This lets skills stay in the skills extension instead of moving their
    runtime into core.
    
    ## Shape
    
    ```text
    extension-owned state
            |
            | contribute section id + JSON snapshot + renderer
            v
    core World State
            |
            | compare with the previous snapshot
            v
    no message, or one incremental model-visible update
    ```
    
    The extension API is deliberately small:
    
    ```rust
    fn contribute_world_state(...) -> Vec<WorldStateSectionContribution>
    ```
    
    Core adapts the rendered result to `ContextualUserFragment`, records the
    snapshot, and keeps the existing compaction/resume behavior.
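    
    As a rough sketch of the core side of this boundary: the field names below are assumptions (the real `WorldStateSectionContribution` is not shown in this description), the JSON snapshot is kept as a string to avoid dependencies, and the renderer emits the full snapshot rather than an incremental update.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Hypothetical shape of one contribution: section id, serialized
    /// snapshot, and a renderer for the model-visible text.
    struct WorldStateSectionContribution {
        section_id: String,
        /// JSON snapshot kept as a string so the sketch needs no crates.
        snapshot: String,
        render: fn(&str) -> String,
    }
    
    /// Core-side step: compare with the previous snapshot, record the new
    /// one, and return a message only when the section changed.
    fn apply_contribution(
        previous: &mut BTreeMap<String, String>,
        contribution: WorldStateSectionContribution,
    ) -> Option<String> {
        let unchanged = previous.get(&contribution.section_id) == Some(&contribution.snapshot);
        if unchanged {
            return None;
        }
        let message = (contribution.render)(&contribution.snapshot);
        previous.insert(contribution.section_id, contribution.snapshot);
        Some(message)
    }
    ```
    
    The extension supplies only data and a renderer; the comparison and the recorded baseline stay in core, which is the split the diagram describes.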
    
    ## What changes
    
    - Adds extension-owned World State section contributions.
    - Calls those contributors from the existing per-step World State
    builder.
    - Restores durable selected capability roots into extension thread state
    on resume.
    - Keeps the actual model-context fragment and rollout machinery in core.
    
    ## What does not change
    
    - No skill or MCP implementation moves out of its extension.
    - No new file watcher, generation, or RPC.
    - No generic migration of existing World State sections.
    - No change to the stable environment-ID assumption from #29856.
    
    ## Example
    
    ```text
    step 1 snapshot: skills = []
    step 2 snapshot: skills = [executor-demo:deploy]
    
    core asks the skills extension to render only that change.
    ```
    
    ## Stack
    
    1. **This PR:** let extensions contribute World State sections.
    2. Project executor skills through the skills extension.
    3. Pin one MCP runtime to each model step.
    4. Project selected MCP/app/connector metadata by environment
    availability.
    5. One end-to-end integration scenario.
  • Add turn-scoped context contributions (#28911)
    ## Summary
    - keep context injection on a single ContextContributor trait
    - split context injection into thread-scoped and turn-scoped
    contribution methods
    - wire turn-scoped fragments into initial context assembly so extensions
    can contribute context from turn-local state
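    
    A minimal sketch of what a single trait with two contribution scopes might look like. Only the trait name `ContextContributor` comes from this summary; the method names, signatures, and the `TurnNote` example are assumptions for illustration.
    
    ```rust
    /// Hypothetical sketch of one contributor trait with two scopes.
    trait ContextContributor {
        /// Fragments derived from state that lives as long as the thread.
        fn thread_context(&self) -> Vec<String> {
            Vec::new()
        }
    
        /// Fragments derived from state local to the given turn.
        fn turn_context(&self, turn_id: u64) -> Vec<String> {
            let _ = turn_id;
            Vec::new()
        }
    }
    
    /// Example contributor that only has turn-local state.
    struct TurnNote;
    
    impl ContextContributor for TurnNote {
        fn turn_context(&self, turn_id: u64) -> Vec<String> {
            vec![format!("note for turn {turn_id}")]
        }
    }
    
    /// Initial context assembly: thread-scoped fragments first, then
    /// turn-scoped fragments, for each contributor in order.
    fn initial_context(contributors: &[&dyn ContextContributor], turn_id: u64) -> Vec<String> {
        let mut fragments = Vec::new();
        for contributor in contributors {
            fragments.extend(contributor.thread_context());
            fragments.extend(contributor.turn_context(turn_id));
        }
        fragments
    }
    ```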
  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
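    
    For illustration, the before and after forms look roughly like this. The helper names are hypothetical, and the allow is shown per function only to keep the sketch self-contained; the PR applies `#![allow(clippy::expect_used)]` once at each integration-test crate root instead.
    
    ```rust
    use std::num::ParseIntError;
    
    fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
        raw.trim().parse::<u16>()
    }
    
    /// Before: manual panic-based unwrap.
    fn port_verbose(raw: &str) -> u16 {
        parse_port(raw).unwrap_or_else(|err| panic!("port should parse: {err}"))
    }
    
    /// After: the equivalent `expect`.
    #[allow(clippy::expect_used)]
    fn port_concise(raw: &str) -> u16 {
        parse_port(raw).expect("port should parse")
    }
    
    /// After: `expect_err` for the failure side.
    #[allow(clippy::expect_used)]
    fn port_error(raw: &str) -> ParseIntError {
        parse_port(raw).expect_err("port should be rejected")
    }
    ```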
  • skills: hide orchestrator skills with a local executor (#28333)
    ## Why
    
    App-server threads without a local executor need orchestrator-owned
    skills from the hosted `codex_apps` MCP server. Threads with the local
    executor already discover installed skills from the local filesystem.
    
    After the orchestrator skill provider was enabled for every app-server
    thread, local-executor threads also received the hosted skill catalog
    and the `skills.list` and `skills.read` tools. This changed the existing
    local behavior and could expose a second hosted copy of a skill that was
    already installed locally.
    
    ## What changed
    
    - Expose the thread's selected execution environments to extensions at
    thread startup.
    - Enable orchestrator skills only when the reserved local environment is
    not selected.
    - Apply that decision consistently to hosted skill catalog discovery,
    explicit skill injection, and the `skills.list` and `skills.read` tools.
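    
    The gating rule can be sketched as one predicate shared by every consumer. The reserved ID value `"local"` and both function names are assumptions made for this example.
    
    ```rust
    /// Hypothetical reserved ID for the local executor environment.
    const LOCAL_ENVIRONMENT_ID: &str = "local";
    
    /// Orchestrator skills are enabled only when the reserved local
    /// environment is not among the thread's selected environments.
    fn orchestrator_skills_enabled(selected_environments: &[&str]) -> bool {
        !selected_environments
            .iter()
            .any(|id| *id == LOCAL_ENVIRONMENT_ID)
    }
    
    /// One consumer of the decision: which hosted skill tools to advertise.
    fn advertised_skill_tools(selected_environments: &[&str]) -> Vec<&'static str> {
        if orchestrator_skills_enabled(selected_environments) {
            vec!["skills.list", "skills.read"]
        } else {
            Vec::new()
        }
    }
    ```
    
    Routing catalog discovery, explicit injection, and tool exposure through the same predicate is what keeps the three surfaces consistent.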
    
    ## Verification
    
    - The existing no-executor app-server test continues to verify hosted
    skill discovery, invocation, and child-resource reads.
    - A new app-server test verifies that local-executor threads do not
    receive hosted skill context or `skills.*` tools.
  • Add selected-plugin precedence and attribution to the MCP catalog (#27884)
    ## Why
    
    **In short:** this PR resolves already-discovered MCP registrations. It
    does not read selected plugins or discover their MCP servers.
    
    The resolved MCP catalog currently builds config and auto-discovered
    plugin registrations before runtime contributors are applied. A
    thread-selected plugin needs a distinct precedence tier in that same
    initial resolution pass: otherwise a disabled lower-precedence winner
    can leave stale name-level state behind, and the winning MCP tools
    cannot be attributed to the selected package reliably.
    
    This PR adds that catalog boundary before executor discovery is
    connected.
    
    ## What changed
    
    - Added an explicit selected-plugin registration tier between
    auto-discovered plugins and explicit config.
    - Collected selected-plugin contributions before the initial catalog
    build, while leaving compatibility and generic extension overlays in
    their existing runtime phase.
    - Retained the winning plugin ID and display name directly on
    plugin-owned catalog registrations.
    - Derived MCP tool provenance from the winning catalog entry instead of
    joining against local-only plugin summaries.
    - Retained the winning selected server's tool approval policy in the
    running connection manager, so a selected registration cannot inherit
    approval behavior from a losing local plugin.
    - Kept remembered approval session-scoped for selected plugins until
    there is an authority-aware persistence contract; Codex will not write
    approval back to an unrelated local plugin.
    - Preserved existing name-level disabled vetoes for discovered plugins
    and config, while keeping a selected package's own disabled registration
    scoped to that registration.
    - Preserved deterministic selection order and existing config,
    compatibility, and extension precedence.
    
    The resulting order is:
    
    ```text
    auto-discovered plugin
      < selected plugin
      < explicit config
      < compatibility registration
      < extension overlay
    ```
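    
    One way to picture this order is an enum whose derived `Ord` follows declaration order. This is a sketch only: `Tier`, `Registration`, and `winner` are hypothetical names, and disabled vetoes and same-tier conflicts are deliberately left out.
    
    ```rust
    /// Precedence tiers, lowest first; derived `Ord` follows declaration order.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    enum Tier {
        AutoDiscoveredPlugin,
        SelectedPlugin,
        ExplicitConfig,
        CompatibilityRegistration,
        ExtensionOverlay,
    }
    
    /// Hypothetical catalog registration carrying its own attribution.
    struct Registration {
        server: &'static str,
        tier: Tier,
        plugin_id: Option<&'static str>,
    }
    
    /// Picks the highest-tier registration for a server name, so provenance
    /// is read from the winning entry itself.
    fn winner<'a>(registrations: &'a [Registration], server: &str) -> Option<&'a Registration> {
        registrations
            .iter()
            .filter(|registration| registration.server == server)
            .max_by_key(|registration| registration.tier)
    }
    ```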
    
    ## Behavior and scope
    
    This is a catalog and provenance change only. No production host
    contributes selected-plugin MCP registrations yet, so existing local MCP
    behavior remains unchanged.
    
    The stacked follow-up, #27870, installs the executor plugin provider
    that produces these registrations. App-server activation remains a
    separate final step.
    
    ## Verification
    
    Focused tests cover precedence, deterministic selected-plugin conflicts,
    disabled-veto behavior across catalog phases, managed requirements
    before selected-plugin resolution, winning-server approval policy, and
    attribution when local and selected packages share an ID or server name.
    CI owns execution of the test suite.
  • Make MCP server contributions thread-scoped (#27670)
    ## Why
    
    `selectedCapabilityRoots` belongs to one thread, but MCP contributors
    previously received only the global Codex config. That left no clean way
    for a selected executor capability to contribute MCP servers to its own
    thread.
    
    ## What this PR does
    
    - Gives MCP contributors a small context containing the config and, for
    a running thread, its frozen host-seeded inputs.
    - Uses the same thread inputs during startup, status queries, refreshes,
    and skill dependency checks.
    - Keeps threadless MCP operations and the existing hosted Apps behavior
    unchanged.
    - Adds coverage showing that two threads resolve independent
    registrations and that later lifecycle mutations do not change the
    frozen MCP inputs.
    
    This PR does not discover plugin manifests, add MCP servers, or launch
    anything new. It only establishes the thread-scoped registration
    boundary.
    
    ## Follow-ups
    
    - Resolve selected executor plugin roots through their owning
    environment filesystem.
    - Convert their stdio MCP declarations into environment-bound
    registrations and add an executor MCP end-to-end test.
    
    ## Verification
    
    - `just fmt`
    - `cargo check --tests -p codex-protocol -p codex-extension-api -p
    codex-mcp-extension -p codex-core -p codex-app-server`
    
    Tests and Clippy were not run.
  • Route image extension reads through turn environments v2 (#27498)
    ## Why
    
    Image generation used `std::fs::read` for referenced image paths, which
    did not support environment-backed filesystems or their sandbox context.
    
    ## What changed
    
    - Expose optional turn environments to extension tool calls.
    - Include each environment’s ID, working directory, filesystem, and
    sandbox context.
    - Read referenced images through the selected environment filesystem.
    - Keep sandbox usage at the extension call site so extensions can choose
    the appropriate access mode.
    - Consolidate image request construction into one async function.
    - Add coverage for successful environment reads and read failures.
    
    ## Validation
    
    - `cargo check -p codex-image-generation-extension --tests`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    `just test -p codex-image-generation-extension` could not complete
    because the build exhausted available disk space.
  • Resolve MCP server registrations through a catalog (#27634)
    ## Why
    
    MCP servers currently come from user config, local plugins,
    compatibility Apps synthesis, and host extensions. Those sources were
    composed by mutating a shared map, leaving registration identity,
    precedence, removal, and provenance implicit in assembly order.
    
    Before adding executor-owned MCPs, Codex needs one durable resolution
    boundary above `McpConnectionManager`. This PR introduces that boundary
    while preserving current server configuration, policy, and runtime
    behavior. Executor-scoped registrations and explicit policy layers
    remain follow-ups.
    
    ## What changed
    
    - Add typed `McpServerRegistration` inputs and an immutable
    `ResolvedMcpCatalog` in `codex-mcp`.
    - Retain each registration's complete `McpServerConfig`, including its
    environment binding, while recording its source and provenance.
    - Preserve the existing structural precedence between plugin, config,
    compatibility, and ordered extension sources.
    - Resolve equal-precedence actions by contribution order; provenance IDs
    are used only for diagnostics and cannot affect the winner.
    - Preserve extension removals and the existing name-scoped `enabled =
    false` veto.
    - Report same-tier conflicts with every contender and the final catalog
    outcome, including whether the winning action registers or removes the
    server.
    - Require MCP contributors to provide a stable diagnostic identity.
    - Derive materialized server maps and plugin ownership from the resolved
    catalog.
    
    `McpConnectionManager`, transport startup, tool calls, and resource
    routing continue to consume the same effective `McpServerConfig` values.
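    
    A minimal sketch of same-tier resolution with diagnostics. All names here are hypothetical, the `String` payload stands in for a full `McpServerConfig`, and the rule that the later contribution wins within a tier is an assumption; the description only says ties are resolved by contribution order.
    
    ```rust
    #[derive(Debug, Clone, PartialEq)]
    enum Action {
        /// Stands in for registering a full `McpServerConfig`.
        Register(String),
        Remove,
    }
    
    struct Contribution {
        /// Diagnostic identity only; never used to pick the winner.
        provenance: &'static str,
        precedence: u8,
        action: Action,
    }
    
    #[derive(Debug, PartialEq)]
    struct Outcome {
        winner: Action,
        /// Every contender in the winning tier, in contribution order.
        contenders: Vec<&'static str>,
    }
    
    /// Resolves one server name from its contributions, given in
    /// contribution order.
    fn resolve(contributions: &[Contribution]) -> Option<Outcome> {
        let top = contributions.iter().map(|c| c.precedence).max()?;
        let same_tier: Vec<&Contribution> = contributions
            .iter()
            .filter(|c| c.precedence == top)
            .collect();
        // Assumption: within a tier, the last contribution wins.
        let winner = same_tier.last()?.action.clone();
        let contenders = same_tier.iter().map(|c| c.provenance).collect();
        Some(Outcome { winner, contenders })
    }
    ```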
    
    ## Scope
    
    This PR does not add new MCP capabilities or change user-visible
    behavior. It does not add executor plugin discovery, thread-scoped
    registrations, dynamic refresh generations, or new user/managed policy
    semantics.
    
    ## Verification
    
    - Added focused catalog coverage for source precedence, complete
    configuration preservation, disabled vetoes, plugin ownership,
    contribution-order tie breaking, removal outcomes, and conflict
    diagnostics.
    - Extended hosted Apps coverage for ordered extension removal and
    Apps-disabled hosts with and without the hosted extension installed.
    - `cargo check -p codex-mcp --tests -p codex-extension-api -p
    codex-core`
  • [codex] Load user instructions through an injected provider (#27101)
    ## Why
    
    We want to remove implicit use of `$CODEX_HOME` from `codex-core` and
    make embedders responsible for supplying user-level instructions. This
    also ensures user instructions load when no primary environment is
    selected.
    
    ## What changed
    
    Stacked on #27415, which makes `codex exec` surface thread-scoped
    runtime warnings.
    
    - Added `UserInstructionsProvider` to `codex-extension-api`, with
    absolute source attribution and recoverable loading warnings.
    - Added `codex-home` with the filesystem-backed provider for
    `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback,
    trimming, lossy UTF-8 handling, and the existing uncapped global
    instruction size.
    - Removed global instruction loading from `Config` and required
    `ThreadManager` callers to inject a provider.
    - Load provider instructions once for each fresh root runtime, including
    runtimes without a primary environment. Running sessions retain their
    snapshot, while child agents inherit the parent snapshot without
    invoking the provider.
    - Keep provider instructions separate while loading project `AGENTS.md`,
    then assemble the model-visible instructions with the existing ordering,
    source attribution, warning, and turn-context behavior.
    - Wired the Codex home provider through the CLI, app server, MCP server,
    core facade, and thread-manager sample.
    
    ## Validation
    
    - `just test -p codex-home -p codex-extension-api`
    - `just test -p codex-core agents_md`
    - `just test -p codex-core guardian`
    - `just test -p codex-app-server
    thread_start_without_selected_environment_includes_only_global_instruction_source`
    - `just test -p codex-exec warning`
    - `just bazel-lock-check`
  • [codex] Remove async_trait from ToolExecutor (#27304)
    ## Why
    
    We're now [discouraging use of
    `async_trait`](https://github.com/openai/codex/pull/20242).
    
    Removing use of `async_trait` from `ToolExecutor` cuts `codex_core`
    debug test build time by ~78% (from 227.5s to 50.3s) on my machine.
    
    Stacked on #27299, this PR applies the trait change after the handler
    bodies have been outlined.
    
    ## What
    
    Changed `ToolExecutor::handle` to return an explicit boxed
    `ToolExecutorFuture` instead of using `async_trait`.
    
    Updated `ToolExecutor` implementors to return `Box::pin(...)`, reexported
    the future alias through `codex-tools` and `codex-extension-api`, and
    removed the direct `async-trait` dependency from `codex-tools`.
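    
    The shape of the change can be sketched as below. The alias parameters, the `handle` signature, and the `Echo` tool are assumptions for illustration; the real `ToolExecutorFuture` and `ToolExecutor` may differ. `block_on` is a tiny helper for the sketch only and busy-polls, so it suits only futures that never wait on external events.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Hypothetical alias: an explicit boxed, `Send` future.
    type ToolExecutorFuture<'a> =
        Pin<Box<dyn Future<Output = Result<String, String>> + Send + 'a>>;
    
    trait ToolExecutor: Send + Sync {
        /// Returns the boxed future directly instead of relying on `async_trait`.
        fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a>;
    }
    
    struct Echo;
    
    impl ToolExecutor for Echo {
        fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a> {
            Box::pin(async move { Ok::<_, String>(format!("echo: {input}")) })
        }
    }
    
    /// Waker that does nothing; enough for futures that complete immediately.
    struct NoopWake;
    
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Minimal executor for the sketch: polls in a loop until ready.
    fn block_on<T>(mut future: Pin<Box<dyn Future<Output = T> + Send + '_>>) -> T {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        loop {
            if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
                return value;
            }
        }
    }
    ```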
  • Remove async-trait from extension contributors (#27383)
    ## Why
    
    Extension contributors are registered behind `dyn Trait` objects, so
    native `async fn`/RPITIT methods would make these traits
    non-object-safe. Spell out the boxed, `Send` future contract directly so
    `extension-api` no longer needs `async-trait` while retaining the
    existing runtime model.
    
    ## What changed
    
    - add a shared `ExtensionFuture` alias and use it for asynchronous
    contributor methods
    - migrate production and test implementations to return `Box::pin(async
    move { ... })`
    - remove `async-trait` dependencies where they are no longer used,
    keeping it dev-only where unrelated test executors still require it
    
    ## Behavior
    
    No behavior change is intended. Contributor futures remain boxed,
    `Send`, dynamically dispatched, and lazily executed; cancellation and
    callback ordering stay unchanged.
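    
    A sketch of why the boxed alias keeps these traits usable behind `dyn Trait` and keeps execution lazy. Only the name `ExtensionFuture` comes from this description; its parameters, the `Contributor` trait, and `Counting` are assumptions. `block_on` busy-polls and is only meant for this example.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Hypothetical shape of the shared alias: a boxed, `Send` future.
    type ExtensionFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;
    
    /// Usable as `dyn Contributor` because the method returns a concrete
    /// boxed type rather than an opaque `impl Future`.
    trait Contributor: Send + Sync {
        fn contribute<'a>(&'a self) -> ExtensionFuture<'a, String>;
    }
    
    struct Counting {
        name: &'static str,
        runs: AtomicUsize,
    }
    
    impl Contributor for Counting {
        fn contribute<'a>(&'a self) -> ExtensionFuture<'a, String> {
            Box::pin(async move {
                // Runs only when the future is polled, not when it is created.
                self.runs.fetch_add(1, Ordering::SeqCst);
                self.name.to_string()
            })
        }
    }
    
    struct NoopWake;
    
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Minimal executor for the sketch: polls in a loop until ready.
    fn block_on<T>(mut future: ExtensionFuture<'_, T>) -> T {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        loop {
            if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
                return value;
            }
        }
    }
    
    /// Dynamic dispatch over registered contributors, in registration order.
    fn run_all(contributors: &[Box<dyn Contributor>]) -> Vec<String> {
        contributors.iter().map(|c| block_on(c.contribute())).collect()
    }
    ```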
    
    ## Testing
    
    - `just test -p codex-extension-api` (11 passed)
    - affected extension crates (64 passed)
    - targeted `codex-core` contributor tests (14 passed)
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    A broad local `codex-core` run compiled successfully but encountered
    unrelated sandbox and missing test-binary fixture failures; CI will run
    the full checks.
  • Use plugin-service MCP as the hosted plugin runtime (#27198)
    ## Stack
    
    - Base: #27191
    - This PR is the third vertical and should be reviewed against
    `jif/external-plugins-2`, not `main`.
    
    ## Why
    
    #27191 moves the host-owned Apps MCP registration behind an extension
    contributor, but deliberately preserves the existing endpoint-selection
    feature while that contribution contract lands. App-server can therefore
    resolve the server through extensions, yet the hosted plugin endpoint is
    still selected through temporary `apps_mcp_path_override` plumbing.
    
    That is not the long-term plugin model. A plugin can bundle skills,
    connectors, MCP servers, and hooks, and those components do not all need
    the same source or execution environment. In particular, an
    authenticated HTTP MCP server can expose plugin capabilities directly
    from a backend without an executor or an orchestrator filesystem.
    
    This PR completes that hosted vertical. App-server's MCP extension now
    owns the aggregate hosted plugin runtime at `/ps/mcp`. Connector actions
    continue to arrive as MCP tools, while backend-provided skills arrive as
    MCP resources and use Codex's existing resource list/read paths. No
    second backend client, skill filesystem, or generic plugin activation
    framework is introduced.
    
    The backend route remains the hosted implementation. This change
    replaces Codex's temporary endpoint-selection mechanism, not the service
    behind the endpoint.
    
    ## What changed
    
    ### Hosted plugin runtime
    
    The MCP extension now contributes `codex_apps` as the hosted plugin
    runtime rather than as a configurable Apps endpoint:
    
    - `https://chatgpt.com` resolves to
    `https://chatgpt.com/backend-api/ps/mcp`;
    - a bare custom ChatGPT base resolves to `/api/codex/ps/mcp`;
    - the existing product-SKU header and ChatGPT authentication behavior
    are preserved;
    - executor availability is never consulted for this streamable HTTP
    transport.
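    
    The two URL cases above can be sketched as a small function. The function name and the exact matching rule are assumptions: only the two cases named in the list are modeled, and any base other than `https://chatgpt.com` is treated as a bare custom base.
    
    ```rust
    /// Resolves the hosted plugin runtime URL from a ChatGPT base URL.
    /// Assumption: anything other than `https://chatgpt.com` is a bare
    /// custom base.
    fn hosted_plugin_mcp_url(chatgpt_base_url: &str) -> String {
        let base = chatgpt_base_url.trim_end_matches('/');
        if base == "https://chatgpt.com" {
            format!("{base}/backend-api/ps/mcp")
        } else {
            format!("{base}/api/codex/ps/mcp")
        }
    }
    ```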
    
    The same MCP connection carries both component shapes supported by the
    hosted endpoint:
    
    - connector actions are discovered and invoked as MCP tools;
    - hosted skills are enumerated and read as MCP resources through the
    existing `list_mcp_resources` and `read_mcp_resource` paths.
    
    This keeps component access in the subsystem that already owns the
    protocol instead of downloading backend skills into an orchestrator
    filesystem or inventing a parallel hosted-skill client.
    
    ### Explicit runtime ordering
    
    `McpManager` now resolves the reserved `codex_apps` entry in three
    ordered phases:
    
    1. install the legacy Apps fallback for compatibility;
    2. apply ordered extension `Set` or `Remove` overlays;
    3. apply the final ChatGPT-auth gate without synthesizing the server
    again.
    
    This ordering is important:
    
    - an ordinary configured or plugin MCP server cannot claim the
    auth-bearing `codex_apps` name;
    - an extension-contributed hosted runtime wins over the fallback;
    - an extension `Remove` remains authoritative;
    - a host without the MCP extension retains the legacy Apps endpoint and
    current local-only behavior.
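    
    The three phases can be sketched as one function over the reserved entry. `Overlay` and `resolve_codex_apps` are hypothetical names, and a `String` stands in for the server configuration.
    
    ```rust
    /// Hypothetical extension overlay for the reserved `codex_apps` entry.
    #[derive(Debug, Clone, PartialEq)]
    enum Overlay {
        /// Stands in for a full server configuration.
        Set(String),
        Remove,
    }
    
    fn resolve_codex_apps(
        legacy_fallback: &str,
        overlays: &[Overlay],
        chatgpt_auth: bool,
    ) -> Option<String> {
        // Phase 1: install the legacy Apps fallback for compatibility.
        let mut entry = Some(legacy_fallback.to_string());
    
        // Phase 2: apply extension overlays in order; later overlays win.
        for overlay in overlays {
            entry = match overlay {
                Overlay::Set(config) => Some(config.clone()),
                Overlay::Remove => None,
            };
        }
    
        // Phase 3: gate on ChatGPT auth without synthesizing the server again,
        // so a `Remove` from phase 2 stays authoritative.
        if chatgpt_auth {
            entry
        } else {
            None
        }
    }
    ```
    
    With no overlays the fallback survives, which matches hosts that do not install the MCP extension.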
    
    The temporary `legacy_apps_mcp_loader_enabled` coordination flag is no
    longer needed.
    
    ### Remove the path override
    
    The `apps_mcp_path_override` feature and its runtime plumbing are
    removed, including:
    
    - the feature registry entry and structured feature config;
    - `Config` and `McpConfig` fields;
    - config schema output;
    - config-lock materialization;
    - URL override handling in `codex-mcp`.
    
    Existing boolean and structured forms still deserialize as ignored
    compatibility input. They are omitted from new serialized config, and
    config-lock comparison normalizes the removed input so older locks
    remain replayable.
    
    ### App-server coverage
    
    App-server MCP fixtures now serve the hosted route at
    `/api/codex/ps/mcp`. Existing resource-read and tool/elicitation flows
    therefore exercise the extension-owned endpoint rather than succeeding
    through the legacy fallback.
    
    The stack also adds the missing `codex_chatgpt::connectors` re-export
    for the manager-backed connector helper introduced in #27191.
    
    ## Compatibility
    
    - App-server installs the extension and uses `/ps/mcp` for the hosted
    runtime.
    - CLI and other hosts that do not install the extension retain the
    legacy Apps endpoint.
    - Disabling Apps or using non-ChatGPT authentication removes `codex_apps`
    from the effective runtime view.
    - Existing local plugins, local skills, executor-selected skills,
    configured MCP servers, and MCP OAuth behavior are otherwise unchanged.
    - Backend plugin enablement remains account/workspace state owned by the
    hosted endpoint; this PR does not add thread-local backend plugin
    selection.
    
    ## Architectural fit
    
    The stack now proves two independent runtime shapes:
    
    1. #27184 resolves filesystem-backed skills through the executor that
    owns a selected root.
    2. #27191 and this PR resolve a backend-hosted HTTP MCP through an
    extension with no executor.
    
    Together they preserve the intended separation:
    
    - selection identifies a plugin/root when explicit selection is needed;
    - each component's owning extension resolves its concrete access
    mechanism;
    - execution stays with the runtime required by that component;
    - existing skills, MCP, connector, and hook subsystems remain the
    downstream consumers.
    
    ## Planned follow-ups
    
    1. **Executor stdio MCP:** selecting an executor plugin registers a
    manifest-declared stdio MCP server and executes it in the environment
    that owns the plugin.
    2. **Optional backend selection:** only if CCA needs thread-local
    selection distinct from backend account/workspace enablement, add a
    concrete backend-owned capability location and surface those selected
    skills through the skills catalog.
    3. **Connector metadata and hooks:** activate those plugin components
    through their existing owning subsystems, with executor hooks remaining
    environment-bound.
    4. **Propagation and persistence:** define explicit resume, fork,
    subagent, refresh, and environment-removal semantics once selected roots
    have multiple real consumers.
    5. **Local convergence:** migrate legacy local skill, MCP, connector,
    and hook paths behind their owning extensions one vertical at a time,
    then remove duplicate core managers and compatibility plumbing after
    parity.
    
    ## Verification
    
    Coverage in this change exercises:
    
    - extension-owned `/backend-api/ps/mcp` registration without an
    executor;
    - preservation of the legacy endpoint in hosts without the extension;
    - extension `Set` and `Remove` precedence over the legacy fallback;
    - ChatGPT-auth gating for the reserved server;
    - hosted MCP resource reads with and without an active thread;
    - connector tool invocation and MCP elicitation through the hosted
    route;
    - ignored boolean and structured forms of the removed path override;
    - config-lock replay compatibility for the removed feature.
    
    `cargo check -p codex-features -p codex-mcp-extension -p
    codex-app-server` passes. Tests and Clippy were not run locally under
    the current development instruction; CI provides the full validation
    pass.
  • Route hosted Apps MCP through extensions (#27191)
    ## Stack
    
    - Base: #27184
    - This PR is the second vertical and should be reviewed against
    `jif/external-plugins-1`, not `main`.
    
    ## Why
    
    CCA is moving toward a split runtime where the orchestrator may have no
    filesystem or executor, but it still needs to activate remotely hosted
    plugin components. HTTP MCP servers are the simplest complete example:
    they need configuration and host authentication, but they do not need an
    executor process.
    
    The Apps MCP endpoint is currently synthesized by a special-purpose
    loader inside the MCP runtime. That works locally, but it leaves hosted
    MCP activation outside the extension model being established in #27184.
    It also makes the Apps path a poor foundation for plugins whose skills,
    MCP servers, connectors, and hooks may come from different sources or
    execute in different places.
    
    This PR moves that one behavior behind an extension-owned contribution
    while preserving the existing local fallback. It deliberately does not
    introduce a generic plugin activation framework.
    
    ## What changed
    
    ### MCP extension contribution
    
    `codex-extension-api` gains an ordered `McpServerContributor` contract.
    A contributor returns typed `Set` or `Remove` overlays for MCP server
    configuration; later contributors win for the names they own.
    
    The contract stays at the existing MCP configuration boundary.
    Extensions do not create a second connection manager or transport
    abstraction.
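
    The overlay ordering described above can be sketched roughly as follows. This is an illustration only: `ServerConfig`, `Overlay`, and `apply_overlays` are hypothetical names, not the real `McpServerContributor` API, and the sketch does not model which names a contributor owns.

    ```rust
    use std::collections::BTreeMap;

    /// Hypothetical stand-in for one MCP server configuration entry.
    #[derive(Debug, Clone, PartialEq)]
    pub struct ServerConfig {
        pub url: String,
    }

    /// Hypothetical overlay shape: set or remove one named server.
    #[derive(Debug, Clone)]
    pub enum Overlay {
        Set { name: String, config: ServerConfig },
        Remove { name: String },
    }

    /// Apply each contributor's overlays in registration order.
    /// A later overlay for the same name replaces the earlier result.
    pub fn apply_overlays(
        configured: &BTreeMap<String, ServerConfig>,
        contributions: &[Vec<Overlay>],
    ) -> BTreeMap<String, ServerConfig> {
        let mut runtime = configured.clone();
        for overlays in contributions {
            for overlay in overlays {
                match overlay {
                    Overlay::Set { name, config } => {
                        runtime.insert(name.clone(), config.clone());
                    }
                    Overlay::Remove { name } => {
                        runtime.remove(name);
                    }
                }
            }
        }
        runtime
    }
    ```

    Keeping the overlays as plain data over the existing configuration map is what lets the contract stay at the configuration boundary in this sketch: nothing here opens a connection.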
    
    ### Hosted Apps MCP extension
    
    A new `codex-mcp-extension` contributes the reserved `codex_apps` server
    from the existing Apps feature, ChatGPT base URL, path override, and
    product SKU configuration.
    
    When `apps_mcp_path_override` is enabled for `https://chatgpt.com`, the
    resulting streamable HTTP endpoint is
    `https://chatgpt.com/backend-api/ps/mcp`. The existing ChatGPT-auth gate
    remains authoritative, so this server can run in an orchestrator-only
    process without being exposed for API-key sessions.
    
    ### One resolved runtime view
    
    `McpManager` now distinguishes three views:
    
    - **configured:** config- and plugin-backed servers before extension
    overlays;
    - **runtime:** configured servers plus host-installed extension
    contributions;
    - **effective:** runtime servers after auth gating and compatibility
    built-ins.
    
    App-server installs the hosted MCP extension and uses the runtime view
    for thread startup, refresh, status, threadless resource reads,
    connector discovery, and MCP OAuth lookup. This keeps
    `mcpServer/oauth/login` consistent with the servers exposed by the other
    MCP APIs. The hosted Apps server itself continues to use existing
    ChatGPT host authentication rather than MCP OAuth.
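
    As a rough sketch of the last step, the effective view can be thought of as a filter over the runtime view. The function below is an assumption for illustration: it models servers as names only, treats ChatGPT auth as a boolean, and leaves out the compatibility built-ins.

    ```rust
    use std::collections::BTreeSet;

    /// Name of the reserved hosted Apps server, as given in this description.
    const RESERVED_APPS_SERVER: &str = "codex_apps";

    /// Hypothetical helper: derive the effective server names from the runtime view.
    /// The reserved Apps server is kept only when ChatGPT auth is present.
    pub fn effective_servers(
        runtime: &BTreeSet<String>,
        has_chatgpt_auth: bool,
    ) -> BTreeSet<String> {
        runtime
            .iter()
            .filter(|name| has_chatgpt_auth || name.as_str() != RESERVED_APPS_SERVER)
            .cloned()
            .collect()
    }
    ```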
    
    ## Compatibility
    
    Hosts that do not install the MCP extension retain the existing Apps MCP
    synthesis path. This preserves current local-only, CLI, and
    standalone-host behavior while app-server exercises the extension path.
    
    Disabling Apps removes the reserved `codex_apps` entry, and losing
    ChatGPT auth removes it from the effective runtime view. Executor
    availability is not consulted for this HTTP transport.
    
    ## Follow-ups
    
    The next vertical will resolve a manifest-declared stdio MCP server from
    an executor-selected plugin root and execute it in the environment that
    owns that root. Later verticals can add backend-owned skills, connector
    metadata, hooks, durable selection semantics, and incremental local
    convergence without changing the component-specific runtime boundaries
    introduced here.
    
    ## Verification
    
    Focused coverage was added for:
    
    - contributing the hosted Apps MCP at `/backend-api/ps/mcp` without an
    executor;
    - requiring ChatGPT auth in the effective runtime view;
    - removing a reserved configured Apps server when the Apps feature is
    disabled.
    
    `cargo check -p codex-app-server -p codex-mcp-extension -p
    codex-extension-api -p codex-mcp` passed. Tests and Clippy were not run
    locally under the current development instruction; CI provides the full
    validation pass.
  • [codex] Test extension API contracts (#26835)
    ## Why
    
    `codex-extension-api` defines contracts shared by extension crates and
    their hosts, but it had no direct test suite. Host and feature tests
    cover downstream behavior, while regressions in the API crate's own
    typed state, registry ordering, and capability adapters could go
    unnoticed.
    
    ## What
    
    - Add public-surface integration tests for `ExtensionData`, including
    concurrent initialization and poison recovery.
    - Cover contributor registration order, approval short-circuiting, event
    sink retention, no-op response injection, and closure-based agent
    spawning.
    - Add the test-only dependencies used by the suite.
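
    For readers unfamiliar with the two `ExtensionData` properties named above, here is a minimal standard-library sketch of what concurrent initialization and poison recovery mean. `Slot` and `init_concurrently` are hypothetical and are not the real `ExtensionData` implementation or its tests.

    ```rust
    use std::sync::{Arc, Mutex, PoisonError};
    use std::thread;

    /// Hypothetical lazily initialized slot guarded by a mutex.
    pub struct Slot<T> {
        value: Mutex<Option<T>>,
    }

    impl<T: Clone> Slot<T> {
        pub fn new() -> Self {
            Self { value: Mutex::new(None) }
        }

        /// Return the stored value, initializing it at most once.
        /// If an earlier holder panicked, take the guard back instead of failing.
        pub fn get_or_init(&self, init: impl FnOnce() -> T) -> T {
            let mut guard = self.value.lock().unwrap_or_else(PoisonError::into_inner);
            guard.get_or_insert_with(init).clone()
        }
    }

    /// Race several workers on the same slot; every worker observes one winner.
    pub fn init_concurrently(slot: &Arc<Slot<u32>>, workers: u32) -> Vec<u32> {
        let handles: Vec<_> = (0..workers)
            .map(|candidate| {
                let slot = Arc::clone(slot);
                thread::spawn(move || slot.get_or_init(|| candidate))
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .collect()
    }
    ```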
    
    ## Validation
    
    - `just test -p codex-extension-api`
    - `just argument-comment-lint -p codex-extension-api`
    - `just bazel-lock-check`
  • Load selected executor skills through extensions (#27184)
    ## Why
    
    CCA is moving toward a split runtime where the orchestrator may not have
    a filesystem, while executors can expose preinstalled plugins and
    skills. A thread therefore needs to select capabilities without asking
    app-server or core to interpret executor-owned paths through the
    orchestrator's filesystem.
    
    The longer-term model is broader than executor skills:
    
    - A plugin is a bundle of skills, MCP servers, connectors/apps, and
    hooks.
    - A plugin root can be local, executor-owned, or hosted by a backend.
    - Components inside one plugin can use different access and execution
    mechanisms. A skill may be read from a filesystem or through backend
    tools; an HTTP MCP server can run without an executor; a stdio MCP
    server or hook needs an execution environment.
    - Core should carry generic extension initialization data. The extension
    that owns a component should discover it, expose it to the model, and
    invoke it through the appropriate runtime.
    
    This PR establishes that architecture through one complete vertical:
    selecting a root on an executor, discovering the skills beneath it,
    exposing those skills to the model, and reading an explicitly invoked
    `SKILL.md` through the same executor.
    
    ## Contract
    
    `thread/start` gains an experimental `selectedCapabilityRoots` field:
    
    ```json
    {
      "selectedCapabilityRoots": [
        {
          "id": "deploy-plugin@1",
          "location": {
            "type": "environment",
            "environmentId": "workspace",
            "path": "/opt/codex/plugins/deploy"
          }
        }
      ]
    }
    ```
    
    The root is intentionally not classified as a "plugin" or "skill" in the
    API. It can point at a standalone skill, a directory containing several
    skills, or a plugin containing skills and other components. This PR only
    teaches the skills extension how to consume it; later extensions can
    resolve MCP, connector, and hook components from the same selection.
    
    The platform-supplied `id` is stable selection identity. The location
    says which runtime owns the root and gives that runtime an opaque path.
    App-server does not inspect or canonicalize the path.
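
    One way to picture the contract on the receiving side is the sketch below. The Rust type and function names are assumptions made for illustration (the real protocol types are not shown in this description). The point is that the path is carried as an opaque string and only grouped by the owning environment.

    ```rust
    use std::collections::BTreeMap;

    /// Hypothetical mirror of the `location` object; only the environment
    /// variant exists in this first contract.
    #[derive(Debug, Clone, PartialEq)]
    pub enum CapabilityLocation {
        Environment { environment_id: String, path: String },
    }

    /// Hypothetical mirror of one `selectedCapabilityRoots` entry.
    #[derive(Debug, Clone, PartialEq)]
    pub struct SelectedCapabilityRoot {
        pub id: String,
        pub location: CapabilityLocation,
    }

    /// Group (id, path) pairs by owning environment without touching the path.
    pub fn roots_by_environment(
        roots: &[SelectedCapabilityRoot],
    ) -> BTreeMap<String, Vec<(String, String)>> {
        let mut grouped: BTreeMap<String, Vec<(String, String)>> = BTreeMap::new();
        for root in roots {
            match &root.location {
                CapabilityLocation::Environment { environment_id, path } => {
                    grouped
                        .entry(environment_id.clone())
                        .or_default()
                        .push((root.id.clone(), path.clone()));
                }
            }
        }
        grouped
    }
    ```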
    
    ## What changed
    
    ### Generic thread extension initialization
    
    App-server converts selected roots into `ExtensionDataInit`. Core
    carries that generic initialization value until the final thread ID is
    known, then creates thread-scoped `ExtensionData` before lifecycle
    contributors run.
    
    This keeps `Session` and core independent of the capability-selection
    contract. The initialization value is consumed during construction; it
    is not retained as another long-lived `Session` field.
    
    ### Executor-backed skills
    
    The skills extension now owns an `ExecutorSkillProvider` that:
    
    - resolves the selected environment through `EnvironmentManager`
    - discovers, canonicalizes, and reads skills through that environment's
    `ExecutorFileSystem`
    - contributes the bounded selected-skill catalog as stable developer
    context
    - reads an explicitly invoked skill body through the authority that
    listed it
    - warns when an environment or root is unavailable
    - never falls back to the orchestrator filesystem for an executor-owned
    root
    
    Skill catalog and instruction fragments have hard byte bounds, which
    also bound them below the 10K-token per-item context limit. If a
    selected executor skill has the same name as a legacy local skill, the
    executor selection owns that invocation and the local body is not
    injected a second time.
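
    A hard byte bound on a text fragment can be enforced in several ways. The sketch below shows one of them, truncation at a UTF-8 character boundary. It is an assumption for illustration: this description does not say whether over-long fragments are truncated or rejected, and `truncate_to_byte_bound` is a hypothetical helper.

    ```rust
    /// Return the longest prefix of `text` that fits in `max_bytes`
    /// without splitting a multi-byte UTF-8 character.
    pub fn truncate_to_byte_bound(text: &str, max_bytes: usize) -> &str {
        if text.len() <= max_bytes {
            return text;
        }
        let mut end = max_bytes;
        // Index 0 is always a character boundary, so this loop terminates.
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        &text[..end]
    }
    ```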
    
    Existing local and bundled skill loading remains in place. Omitting
    `selectedCapabilityRoots` therefore preserves current local-only
    behavior.
    
    ## Current semantics
    
    - Only environment-owned locations are represented in this first
    contract.
    - Roots are resolved by the destination extension, not by app-server or
    core.
    - An unavailable executor or invalid root produces a warning and no
    capabilities from that root; it does not trigger a local-filesystem
    fallback.
    - Selection applies to a newly started active thread.
    - MCP servers, connectors, and hooks beneath a selected plugin root are
    not activated yet.
    - Selection is not yet persisted or inherited across resume, fork, or
    subagent creation. Existing local capabilities continue to behave as
    they do today in those flows.
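
    The warning-without-fallback rule above can be sketched as a small resolver. Everything here is hypothetical: `Environments` models an executor as a map from root path to skill names, which stands in for the real `EnvironmentManager` and `ExecutorFileSystem`.

    ```rust
    use std::collections::BTreeMap;

    /// Hypothetical model: environment id -> root path -> skill names under that root.
    pub type Environments = BTreeMap<String, BTreeMap<String, Vec<String>>>;

    #[derive(Debug, Default, PartialEq)]
    pub struct Resolution {
        pub skills: Vec<String>,
        pub warnings: Vec<String>,
    }

    /// Resolve one selected root. A missing environment or root yields a warning
    /// and no skills; nothing is ever read from a local filesystem instead.
    pub fn resolve_root(
        environments: &Environments,
        environment_id: &str,
        path: &str,
    ) -> Resolution {
        let Some(roots) = environments.get(environment_id) else {
            return Resolution {
                skills: Vec::new(),
                warnings: vec![format!("environment `{environment_id}` is unavailable")],
            };
        };
        match roots.get(path) {
            Some(skills) => Resolution { skills: skills.clone(), warnings: Vec::new() },
            None => Resolution {
                skills: Vec::new(),
                warnings: vec![format!("root `{path}` is invalid in `{environment_id}`")],
            },
        }
    }
    ```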
    
    ## Planned vertical follow-ups
    
    1. **Hosted HTTP MCP:** add an extension-backed HTTP MCP source that
    works without an executor, then replace the special-purpose MCP plugins
    loader with that implementation.
    2. **Executor MCP:** register and execute stdio MCP servers through the
    environment that owns the selected plugin root.
    3. **Backend skills:** add a hosted skill source whose catalog and
    bodies are accessed through extension tools rather than a filesystem.
    4. **Connectors and hooks:** activate those components through their
    owning extensions, using the same selected-root boundary and
    component-specific runtime.
    5. **Durable selection:** define the desired-selection lifecycle,
    persist it, and make resume, fork, and subagent inheritance explicit
    rather than accidental.
    6. **Local convergence:** incrementally route existing local plugin,
    skill, and MCP loading through the same extension model while preserving
    current local behavior.
    
    Each follow-up remains reviewable as an end-to-end capability. The
    platform selects roots, generic thread extension data carries the
    selection, and the owning extension resolves and operates its component.
    
    ## Verification
    
    Coverage added for:
    
    - app-server end-to-end discovery and explicit invocation of a skill
    inside an executor-selected plugin root
    - exclusive invocation when a selected executor skill collides with a
    local skill name
    - executor filesystem authority for discovery, canonicalization, and
    reads
    - thread extension initialization before lifecycle contributors run
    - stable executor catalog context, explicit invocation, context
    rebuilding, hidden skills, and preserved host/remote catalog behavior
    
    Targeted protocol, core-skills, skills-extension, core lifecycle, and
    app-server executor-skill tests were run during development.
  • Bridge host-loaded skills into the skills extension (#26172)
    ## Why
    
    The skills extension needs to become the path that exposes local host
    skills without losing the behavior already owned by core skill loading.
    Host skill discovery is not just `$CODEX_HOME/skills`: it also includes
    config layers, bundled-skill settings, plugin roots, runtime extra
    roots, and the filesystem for the selected primary environment.
    
    Rather than making the extension reload host skills and risk drifting
    from that authoritative load, this PR bridges the already-loaded
    per-turn skills outcome into the extension. That lets the extension
    advertise host skills and inject explicit `$skill` prompts while
    preserving the same roots, disabled/hidden state, rendered paths, and
    environment-backed file reads that the legacy path uses.
    
    ## What Changed
    
    - Adds `HostLoadedSkills` in `core-skills` to wrap the turn's
    `SkillLoadOutcome` and read `SKILL.md` through the filesystem that
    loaded that skill.
    - Stores `HostLoadedSkills` in turn extension data for normal turns and
    review turns, so the skills extension can consume the loaded host
    catalog without reloading it.
    - Adds `HostSkillProvider` under `ext/skills/src/provider/host.rs`,
    mapping host-loaded skill metadata into the skills-extension
    catalog/read contract.
    - Registers the host provider by default from
    `codex_skills_extension::install()`.
    - Preserves host skill metadata such as dependencies, disabled state,
    hidden-from-prompt policy, and slash-normalized display paths.
    - Passes host-loaded skills through `SkillListQuery` and
    `SkillReadRequest` so explicit skill invocation reads only resources
    from the loaded host catalog.
    - Adds integration coverage for a real legacy
    `$CODEX_HOME/skills/.../SKILL.md` skill being listed and injected
    through the installed extension.
    
    ## Testing
    
    - Added `installed_extension_loads_host_skills_from_legacy_roots` in
    `ext/skills/tests/skills_extension.rs`.
    - `just test -p codex-skills-extension`
  • skills: resolve per-turn catalogs from turn input context (#26106)
    ## Why
    
    The skills extension needs the resolved turn environments to build a
    real per-turn `SkillListQuery`. The previous `TurnLifecycleContributor`
    hook only had a turn id, so it could only seed a placeholder query and
    never carry the executor authorities that executor-scoped skill routing
    will need.
    
    Moving catalog resolution onto `TurnInputContributor` puts the skills
    extension on the same turn-preparation path that already has the
    environment ids and working directories for the submitted turn, while
    keeping the actual prompt injection work for follow-up changes.
    
    ## What changed
    
    - switch `ext/skills` from `TurnLifecycleContributor` to
    `TurnInputContributor`
    - build `executor_authorities` from `TurnInputContext.environments` and
    pass them through `SkillListQuery`
    - keep storing the resolved catalog in `SkillsTurnState`, but drop the
    placeholder query helper that no longer matches the real data flow
    - update the extension TODOs to reflect that per-turn catalog resolution
    now happens in the turn-input contributor, and that prompt/context
    injection still needs to move later
    
    ## Testing
    
    - Not run locally.
  • feat: add extension turn-input contributors (#25959)
    ## Disclaimer
    Do not use for now.
    
    ## Why
    
    Extensions can already contribute prompt fragments and request same-turn
    item injection, but there was no host-owned hook for contributing
    structured `ResponseItem`s while Codex is assembling a new turn's
    initial model input. This change adds that seam so extensions can attach
    turn-local input that depends on the submitted user input and resolved
    turn environments without routing through prompt text or late injection.
    
    ## What changed
    
    - add `TurnInputContributor` to `codex_extension_api` and export the new
    `TurnInputContext` / `TurnInputEnvironment` types it receives
    - teach `ExtensionRegistry` to register and expose turn-input
    contributors alongside the existing extension hooks
    - call registered turn-input contributors from
    `core/src/session/turn.rs` while building the initial injected input for
    a turn, then append their returned `ResponseItem`s after the skill and
    plugin injections
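
    The ordering in the last bullet can be sketched as below. The types are simplified stand-ins with hypothetical fields, and the trait is synchronous here for brevity; the real `TurnInputContributor` signature is not shown in this description.

    ```rust
    /// Simplified stand-in for a structured model input item.
    #[derive(Debug, Clone, PartialEq)]
    pub struct ResponseItem(pub String);

    /// Simplified stand-in for the context a contributor receives.
    pub struct TurnInputContext {
        pub user_input: String,
        pub environment_ids: Vec<String>,
    }

    pub trait TurnInputContributor {
        fn contribute(&self, context: &TurnInputContext) -> Vec<ResponseItem>;
    }

    /// Example contributor: one item per resolved turn environment.
    pub struct EnvironmentNote;

    impl TurnInputContributor for EnvironmentNote {
        fn contribute(&self, context: &TurnInputContext) -> Vec<ResponseItem> {
            context
                .environment_ids
                .iter()
                .map(|id| ResponseItem(format!("environment: {id}")))
                .collect()
        }
    }

    /// Contributor items are appended after the skill and plugin injections,
    /// in contributor registration order.
    pub fn build_initial_input(
        skill_and_plugin_items: Vec<ResponseItem>,
        contributors: &[Box<dyn TurnInputContributor>],
        context: &TurnInputContext,
    ) -> Vec<ResponseItem> {
        let mut input = skill_and_plugin_items;
        for contributor in contributors {
            input.extend(contributor.contribute(context));
        }
        input
    }
    ```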
  • Route standalone image generation through host finalization md (#25176)
    ## Why
    
    Standalone image-generation extensions emitted turn items through the
    low-level event path, bypassing host-owned finalization such as image
    persistence and contributor processing. At the same time, the
    generated-image save-path hint must remain visible to the model through
    the extension tool's `FunctionCallOutput`, rather than the legacy
    built-in developer-message path.
    
    ## What changed
    
    - Extended `ExtensionTurnItem` to support image-generation items while
    keeping the extension-facing emitter API limited to `emit_started` and
    `emit_completed`.
    - Routed extension completion through core `finalize_turn_item`, so
    standalone image-generation items receive host-owned processing and
    persisted `saved_path` values before publication.
    - Kept legacy built-in image generation on its existing
    developer-message hint path, while standalone image generation returns
    its deterministic saved-path hint in `FunctionCallOutput`.
    - Shared the image artifact path and output-hint formatting used by core
    and the image-generation extension.
    - Passed thread identity through extension tool calls so standalone
    image generation can construct the same intended artifact path as core.
    - Added an app-server integration test covering real standalone image
    generation, saved artifact publication, model-visible output hint
    wiring, and absence of the legacy developer-message hint.
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-web-search-extension`
    - `just test -p codex-goal-extension`
    - `just test -p codex-memories-extension`
    - Targeted `codex-core` tests for image save history, extension
    completion finalization, and contributor execution
    - `just test -p codex-app-server
    standalone_image_generation_returns_saved_path_hint_to_model`
    - `just fix -p codex-core`
    - `just fix -p codex-image-generation-extension`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
  • Route extension image generation through the native image completion pipeline (#24972)
    ## Why
    
    The standalone `image_gen.imagegen` extension should behave like native
    image generation for artifact persistence and UI completion, while
    returning its save-location guidance as part of the tool result instead
    of injecting a developer message.
    
    ## What Changed
    
    - Added an image-generation completion hook for extension tools so core
    can persist generated images and emit the existing `ImageGeneration`
    lifecycle events.
    - Reused core image artifact persistence for extension output and
    removed extension-local save-path/file-writing logic.
    - Split shared image persistence from built-in finalization so native
    image generation keeps its existing developer-message instruction
    behavior.
    - Returned the generated image save-location instruction through the
    extension `FunctionCallOutput`, alongside the generated image input for
    model follow-up.
    - Preserved the existing image-generation event shape for current UI and
    replay compatibility.
    - Avoided cloning the full generated-image base64 payload when emitting
    the in-progress image item.
    - Removed dependencies no longer needed after moving persistence out of
    the extension crate.
    
    ## Fast Follow
    - Adjust the existing Extension API and add a general `TurnItem`
    finalization path so the code can be reused.
    
    ## Validation
    
    - Ran `just fmt`.
    - Ran `just bazel-lock-update`.
    - Ran `just bazel-lock-check`.
    - Ran `just test -p codex-tools -p codex-extension-api -p
    codex-image-generation-extension`.
    - Ran `just test -p codex-core
    image_generation_publication_is_finalized_by_core`.
    - Ran `just test -p codex-core
    handle_output_item_done_records_image_save_history_message`.
    - Ran `just fix -p codex-tools -p codex-extension-api -p codex-core -p
    codex-image-generation-extension`.
  • extension-api: add TurnItemEmitter to tool calls (#24813)
    ## Why
    Extension-contributed tools need to emit visible turn items through
    Codex's normal event and persistence pipeline.
    
    ## What
    - Add `TurnItemEmitter` to extension `ToolCall`s and route the core
    implementation through `Session::emit_turn_item_*`.
    - Hold weak session and turn references so retained tool calls cannot
    keep host state alive.
    - Provide a no-op emitter for extension test callers.
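
    The weak-reference point can be illustrated with a small sketch. `Session` and this `TurnItemEmitter` are simplified, hypothetical stand-ins: the emitter keeps only a `Weak` handle, so a retained tool call does not extend the session's lifetime.

    ```rust
    use std::sync::{Arc, Mutex, Weak};

    /// Simplified stand-in for host session state.
    pub struct Session {
        pub emitted: Mutex<Vec<String>>,
    }

    /// Hypothetical emitter that does not keep the session alive.
    pub struct TurnItemEmitter {
        session: Weak<Session>,
    }

    impl TurnItemEmitter {
        pub fn new(session: &Arc<Session>) -> Self {
            Self { session: Arc::downgrade(session) }
        }

        /// Record the item if the session still exists; report whether it did.
        pub fn emit_completed(&self, item: &str) -> bool {
            match self.session.upgrade() {
                Some(session) => {
                    session.emitted.lock().unwrap().push(item.to_string());
                    true
                }
                None => false,
            }
        }
    }
    ```

    After the last `Arc<Session>` is dropped, `emit_completed` in this sketch becomes a no-op that returns `false`.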
    
    ## Test Plan
    - `just test -p codex-core -E
    'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • Add turn error lifecycle contributor (#24916)
    Summary
    - Add TurnErrorInput and TurnLifecycleContributor::on_turn_error to the
    extension API.
    - Emit the turn-error lifecycle from core turn error paths, including
    usage limit failures.
    - Add direct lifecycle coverage for the emitted error facts and stores.
    
    Tests
    - just fmt
    - git diff --check
    - Not run: full tests or clippy (per instructions)
  • Add thread start contributor facts (#24915)
    Summary: add session source and persistent-state availability to
    ThreadStartInput; populate them from session init; update existing goal
    test harness constructors. Tests: just fmt; git diff --check. No full
    tests or clippy run per request.
  • feat: add thread idle lifecycle hook (#24744)
    ## Why
    
    Extensions can currently observe thread start, resume, and stop, but
    they do not have a lifecycle point for the host to say that immediately
    pending thread work has drained. That makes idle follow-up behavior
    harder to express as extension-owned logic instead of host-specific
    plumbing.
    
    This adds an explicit idle lifecycle hook so an extension can react when
    a thread becomes idle while the host keeps ownership of whether any
    submitted follow-up input starts a turn, is queued, or is ignored.
    
    ## What changed
    
    - Added `ThreadIdleInput` with access to the session-scoped and
    thread-scoped extension stores.
    - Added a default `on_thread_idle` method to
    `ThreadLifecycleContributor`.
    - Re-exported `ThreadIdleInput` from the extension API surface.
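
    A default method is what keeps existing contributors compiling unchanged. The sketch below shows the pattern with simplified, hypothetical shapes: the hook is synchronous here, and `ThreadIdleInput` carries only a thread id instead of the extension stores.

    ```rust
    use std::sync::atomic::{AtomicUsize, Ordering};

    /// Simplified stand-in; the real input exposes extension stores.
    pub struct ThreadIdleInput<'a> {
        pub thread_id: &'a str,
    }

    pub trait ThreadLifecycleContributor {
        /// Default no-op, so existing implementors need no changes.
        fn on_thread_idle(&self, _input: &ThreadIdleInput<'_>) {}
    }

    /// An existing contributor that never mentions the new hook.
    pub struct Legacy;

    impl ThreadLifecycleContributor for Legacy {}

    /// A contributor that opts in and counts idle notifications.
    pub struct IdleCounter {
        pub idle_count: AtomicUsize,
    }

    impl ThreadLifecycleContributor for IdleCounter {
        fn on_thread_idle(&self, _input: &ThreadIdleInput<'_>) {
            self.idle_count.fetch_add(1, Ordering::SeqCst);
        }
    }

    /// Host side: notify every contributor that the thread is idle.
    pub fn notify_idle(contributors: &[&dyn ThreadLifecycleContributor], thread_id: &str) {
        let input = ThreadIdleInput { thread_id };
        for contributor in contributors {
            contributor.on_thread_idle(&input);
        }
    }
    ```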
    
    ## Testing
    
    Not run; this only extends the extension API trait surface with a
    default hook and exported input type.
  • fix: dont compact standalone websearch schema (#24660)
    add new `parse_tool_input_schema_without_compaction` to bypass the
    existing compaction/trimming of client-provided tool schemas that are
    over 4k bytes.
    
    we want this for standalone web search to keep field guidance/metadata
    on certain fields; this keeps us closer to parity with the existing
    hosted tool schema (which didn't go through this 4k byte filter).
  • Expose conversation history to extension tools (#23963)
    ## Why
    
    Extension tools that need conversation context should be able to read it
    from the live tool invocation instead of reaching into thread
    persistence themselves.
    
    ## What changed
    
    - Add a `ConversationHistory` snapshot to extension `ToolCall`s and
    populate it from the current raw in-memory response history.
    - Expose all history items at this boundary so each extension can filter
    and bound the subset it needs before consuming or forwarding it.
    - Cover the adapter and registry dispatch paths and update existing
    extension tests that construct `ToolCall` literals.
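
    Since all history items are exposed, filtering and bounding is left to each extension. A minimal sketch of that step follows. `HistoryItem` and `recent_user_messages` are hypothetical and much simpler than the real history types.

    ```rust
    /// Simplified stand-in for raw history items.
    #[derive(Debug, Clone, PartialEq)]
    pub enum HistoryItem {
        UserMessage(String),
        AssistantMessage(String),
        ToolOutput(String),
    }

    /// Keep at most `limit` of the most recent user messages, oldest first.
    pub fn recent_user_messages(history: &[HistoryItem], limit: usize) -> Vec<&str> {
        let mut picked: Vec<&str> = history
            .iter()
            .rev()
            .filter_map(|item| match item {
                HistoryItem::UserMessage(text) => Some(text.as_str()),
                _ => None,
            })
            .take(limit)
            .collect();
        picked.reverse();
        picked
    }
    ```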
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-extension-api`
    - `cargo test -p codex-goal-extension`
    - `cargo test -p codex-memories-extension`
    - `cargo test -p codex-core passes_turn_fields_to_extension_call`
    - `cargo test -p codex-core
    extension_tool_executors_are_model_visible_and_dispatchable`
  • [codex] Steer budget-limited goal extension turns (#23718)
    ## What
    - Add a small extension capability for injecting model-visible response
    items into the active turn
    - Have the goal extension inject hidden goal-context steering when
    tool-finish accounting reaches `BudgetLimited`
    - Cover the extension backend path with an assertion on the injected
    steering item
    
    ## Why
    PR #23696 persists and emits the budget-limited goal update from
    tool-finish accounting, but it leaves the model unaware of that
    transition. The existing core runtime steers the model to wrap up in
    this case; the extension path should do the same through an explicit
    host capability.
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-goal-extension`
    - `cargo test -p codex-extension-api`
  • feat: expose turn-start metadata to extensions (#23688)
    ## Why
    
    The goal extension needs more context when a turn starts than
    `turn_store` alone provides.
    
    In particular, goal accounting needs the stable turn id, the effective
    collaboration mode, and the cumulative token-usage baseline captured at
    turn start so it can:
    
    - suppress goal accounting for plan-mode turns
    - compute exact per-turn deltas from cumulative `total_token_usage`
    snapshots instead of relying on the most recent usage event alone
    - keep the extension-owned accounting path aligned with the host turn
    lifecycle
    
    ## What
    
    - extend `codex_extension_api::TurnStartInput` to expose `turn_id`,
    `collaboration_mode`, and `token_usage_at_turn_start`
    - pass the full `TurnContext` plus the captured token-usage baseline
    through the turn-start lifecycle emission path
    - initialize goal turn accounting from the turn-start baseline and
    collaboration mode
    - switch goal token accounting to compute deltas from cumulative
    `total_token_usage` snapshots
    - add coverage for the new turn-start lifecycle fields and for
    goal-accounting baseline behavior
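
    The baseline-and-delta arithmetic can be sketched as follows. This is a simplification under stated assumptions: token usage is a single `u64` rather than the real usage struct, and `GoalTurnAccounting` and `CollaborationMode` are hypothetical shapes.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum CollaborationMode {
        Default,
        Plan,
    }

    /// Hypothetical per-turn accounting seeded at turn start.
    pub struct GoalTurnAccounting {
        baseline_total: u64,
        mode: CollaborationMode,
    }

    impl GoalTurnAccounting {
        pub fn at_turn_start(token_usage_at_turn_start: u64, mode: CollaborationMode) -> Self {
            Self { baseline_total: token_usage_at_turn_start, mode }
        }

        /// Tokens attributed to this turn, given the latest cumulative total.
        /// Plan-mode turns are suppressed and always report zero.
        pub fn tokens_used(&self, total_token_usage: u64) -> u64 {
            if self.mode == CollaborationMode::Plan {
                return 0;
            }
            total_token_usage.saturating_sub(self.baseline_total)
        }
    }
    ```

    For example, with a baseline of 1000 and cumulative snapshots of 1200 and then 1500, the turn's usage is 200 and then 500, regardless of how many usage events arrived in between.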
    
    ## Testing
    
    - added `turn_start_lifecycle_exposes_turn_metadata_and_token_baseline`
    in `codex-rs/core/src/session/tests.rs`
    - added `ext/goal/tests/accounting.rs` coverage for baseline-aware goal
    accounting and plan-mode suppression
  • Add tool lifecycle extension contributor (#23309)
    ## Why
    
    Extensions that need to track runtime progress currently have no typed
    host signal for tool execution. The goal extension in particular needs
    to observe tool attempts without inspecting tool payloads, owning tool
    implementations, or staying coupled to core-only runtime plumbing.
    
    This adds a narrow lifecycle contributor API for host-owned tool
    execution: extensions can observe when an accepted tool call starts and
    how it finishes, while policy hooks and tool handlers continue to own
    payload rewriting, blocking, and execution.
    
    Relevant code:
    
    -
    [`ToolLifecycleContributor`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors.rs#L119)
    defines the extension-facing observer contract.
    -
    [`tool_lifecycle.rs`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors/tool_lifecycle.rs)
    defines the typed start/finish inputs, source, and outcome enums.
    - [`notify_tool_start` /
    `notify_tool_finish`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/core/src/tools/lifecycle.rs)
    bridges core tool dispatch into the extension registry.
    
    ## What Changed
    
    - Added `ToolLifecycleContributor` to `codex-extension-api`, including:
      - `ToolStartInput`
      - `ToolFinishInput`
      - `ToolCallSource`
      - `ToolCallOutcome`
    - Added registration and lookup support on `ExtensionRegistryBuilder` /
    `ExtensionRegistry`.
    - Wired core tool dispatch to notify lifecycle contributors for:
      - accepted tool starts
      - completed tool calls, including the tool output success marker
      - pre-tool-use blocks
      - failures before or after the handler runs
      - cancellation/abort in the parallel tool path
    - Registered the goal extension as a lifecycle contributor and added the
    outcome filter it will use for goal progress accounting.
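
    A rough sketch of the observer flow follows. The type names match the list above, but their fields and variants are assumptions made for illustration, the trait is synchronous here, and `dispatch_with_lifecycle` is a hypothetical stand-in for core dispatch that covers only the success and handler-failure paths.

    ```rust
    use std::sync::Mutex;

    /// Assumed outcome variants, loosely following the cases listed above.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum ToolCallOutcome {
        Completed { success: bool },
        Blocked,
        Failed,
        Cancelled,
    }

    pub struct ToolStartInput<'a> {
        pub call_id: &'a str,
        pub tool_name: &'a str,
    }

    pub struct ToolFinishInput<'a> {
        pub call_id: &'a str,
        pub outcome: ToolCallOutcome,
    }

    /// Observer-only contract: no access to tool payloads.
    pub trait ToolLifecycleContributor {
        fn on_tool_start(&self, _input: &ToolStartInput<'_>) {}
        fn on_tool_finish(&self, _input: &ToolFinishInput<'_>) {}
    }

    /// Example observer that records the notification order.
    pub struct Recorder {
        pub log: Mutex<Vec<String>>,
    }

    impl ToolLifecycleContributor for Recorder {
        fn on_tool_start(&self, input: &ToolStartInput<'_>) {
            let entry = format!("start:{}:{}", input.call_id, input.tool_name);
            self.log.lock().unwrap().push(entry);
        }

        fn on_tool_finish(&self, input: &ToolFinishInput<'_>) {
            let entry = format!("finish:{}:{:?}", input.call_id, input.outcome);
            self.log.lock().unwrap().push(entry);
        }
    }

    /// Notify observers around a handler whose `Ok` value is the success marker.
    pub fn dispatch_with_lifecycle(
        observers: &[&dyn ToolLifecycleContributor],
        call_id: &str,
        tool_name: &str,
        handler: impl FnOnce() -> Result<bool, String>,
    ) -> ToolCallOutcome {
        let start = ToolStartInput { call_id, tool_name };
        for observer in observers {
            observer.on_tool_start(&start);
        }
        let outcome = match handler() {
            Ok(success) => ToolCallOutcome::Completed { success },
            Err(_) => ToolCallOutcome::Failed,
        };
        let finish = ToolFinishInput { call_id, outcome };
        for observer in observers {
            observer.on_tool_finish(&finish);
        }
        outcome
    }
    ```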
    
    ## Test Coverage
    
    - Added `dispatch_notifies_tool_lifecycle_contributors` to cover
    lifecycle notification ordering and outcomes for successful and
    handler-failed tool calls.
  • chore: make token usage async (#23305)
    Make the `TokenUsageContributor` async. This will be required for a
    future extension, and it is basically free.
  • feat: add extension event sink capability (#23293)
    ## Why
    
    Extensions can already expose typed contributions and receive host
    capabilities such as `AgentSpawner`, but they do not have a typed way to
    send protocol events back through the host. Extensions that need to
    surface progress or status should not have to own persistence, ordering,
    transport fanout, or logging decisions themselves.
    
    ## What
    
    - Add `ExtensionEventSink`, a host-provided fire-and-forget sink for
    `codex_protocol::protocol::Event`.
    - Add `NoopExtensionEventSink` so hosts that do not expose extension
    event emission keep the existing empty-registry behavior.
    - Store the sink on `ExtensionRegistryBuilder` / `ExtensionRegistry`,
    with `with_event_sink(...)` and `event_sink()` accessors, and re-export
    the new capability from `codex-extension-api`.
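
    The shape of this capability can be sketched as below. The trait and accessor names follow the bullets above, but the signatures, the `Event` stand-in, and `RecordingSink` are assumptions for illustration rather than the real API.

    ```rust
    use std::sync::{Arc, Mutex};

    /// Simplified stand-in for `codex_protocol::protocol::Event`.
    #[derive(Debug, Clone, PartialEq)]
    pub struct Event {
        pub message: String,
    }

    /// Fire-and-forget: the caller gets no result back.
    pub trait ExtensionEventSink: Send + Sync {
        fn emit(&self, event: Event);
    }

    /// Default sink for hosts that do not expose event emission.
    pub struct NoopExtensionEventSink;

    impl ExtensionEventSink for NoopExtensionEventSink {
        fn emit(&self, _event: Event) {}
    }

    /// Example host sink that stores what it receives.
    pub struct RecordingSink {
        pub seen: Mutex<Vec<Event>>,
    }

    impl ExtensionEventSink for RecordingSink {
        fn emit(&self, event: Event) {
            self.seen.lock().unwrap().push(event);
        }
    }

    pub struct ExtensionRegistry {
        event_sink: Arc<dyn ExtensionEventSink>,
    }

    impl ExtensionRegistry {
        pub fn event_sink(&self) -> &Arc<dyn ExtensionEventSink> {
            &self.event_sink
        }
    }

    pub struct ExtensionRegistryBuilder {
        event_sink: Arc<dyn ExtensionEventSink>,
    }

    impl ExtensionRegistryBuilder {
        /// Starts with the no-op sink, matching the empty-registry behavior.
        pub fn new() -> Self {
            Self { event_sink: Arc::new(NoopExtensionEventSink) }
        }

        pub fn with_event_sink(mut self, sink: Arc<dyn ExtensionEventSink>) -> Self {
            self.event_sink = sink;
            self
        }

        pub fn build(self) -> ExtensionRegistry {
            ExtensionRegistry { event_sink: self.event_sink }
        }
    }
    ```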
    
    ## Testing
    
    - Not run locally; PR metadata/body update only.
  • Make extension lifecycle hooks async (#23291)
    ## Why
    
    Extension lifecycle hooks sit on the host/extension boundary, but the
    current trait surface only allows synchronous callbacks. That forces
    extensions that need to seed, rehydrate, observe, or flush
    extension-owned state during thread and turn transitions to either block
    inside the callback or move async work into separate host plumbing.
    
    This PR makes those lifecycle callbacks awaitable so extension
    implementations can perform async work directly at the lifecycle point
    where the host already has the relevant session, thread, or turn stores
    available.
    
    ## What changed
    
    - Makes `ThreadLifecycleContributor` and `TurnLifecycleContributor`
    async in `codex-extension-api`.
    - Awaits thread start/resume/stop and turn start/stop/abort lifecycle
    callbacks from `codex-core`.
    - Updates the guardian and memories extensions to implement the async
    lifecycle trait surface.
    - Updates the existing lifecycle tests to use async contributor
    implementations.
    - Adds `async-trait` to the crates that now expose or implement these
    async object-safe lifecycle traits.
    
    ## Testing
    
    - Existing `codex-core` lifecycle tests were updated to cover async
    implementations for thread stop and turn abort ordering.
  • Simplify tool executor and registry plumbing (#22636)
    ## Why
    
    The tool runtime path still had a typed output associated type on
    `ToolExecutor`, plus a core-only `RegisteredTool` adapter and
    extension-only executor aliases. That made every new shared tool runtime
    carry extra adapter plumbing before it could participate in core
    dispatch, extension tools, hook payloads, telemetry, and model-visible
    spec generation.
    
    This PR moves output erasure to the shared executor boundary so core and
    extension tools can use the same execution contract directly.
    
    ## What Changed
    
    - Changed `codex_tools::ToolExecutor` to return `Box<dyn ToolOutput>`
    instead of an associated `Output` type.
    - Removed the extension-specific `ExtensionToolExecutor` /
    `ExtensionToolOutput` aliases and exposed `ToolExecutor<ToolCall>` plus
    `ToolOutput` through `codex-extension-api`.
    - Reworked core tool registration around `CoreToolRuntime` and
    `ToolRegistry::from_tools`, removing the extra `RegisteredTool` /
    `ToolRegistryBuilder` layer.
    - Consolidated model-visible spec planning and registry construction in
    `core/src/tools/spec_plan.rs`, including deferred tool search and
    code-mode-only filtering.
    - Added `ToolOutput` helpers for post-tool-use hook ids and inputs so
    MCP, unified exec, extension, and other boxed outputs preserve the same
    hook payload behavior.
    - Updated core handlers, memories tools, and the related
    registry/spec/router tests to use the simplified contract.
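
    To illustrate why erasing the output type removes adapter plumbing, here
    is a simplified, synchronous sketch. The method names (`name`, `handle`,
    `to_model_text`, `dispatch`), the `&str` input, and the `Echo` and
    `FakeShell` tools are assumptions for the example; the real
    `ToolExecutor` is async and has a richer signature.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical, reduced output trait: anything that can be shown to the model.
    pub trait ToolOutput {
        fn to_model_text(&self) -> String;
    }

    pub struct TextOutput(pub String);
    pub struct ExitStatusOutput(pub i32);

    impl ToolOutput for TextOutput {
        fn to_model_text(&self) -> String {
            self.0.clone()
        }
    }

    impl ToolOutput for ExitStatusOutput {
        fn to_model_text(&self) -> String {
            format!("exit code {}", self.0)
        }
    }

    /// Executors return a boxed trait object instead of an associated
    /// `Output` type, so tools with different outputs share one object type.
    pub trait ToolExecutor {
        fn name(&self) -> &str;
        fn handle(&self, input: &str) -> Box<dyn ToolOutput>;
    }

    pub struct Echo;

    impl ToolExecutor for Echo {
        fn name(&self) -> &str {
            "echo"
        }
        fn handle(&self, input: &str) -> Box<dyn ToolOutput> {
            Box::new(TextOutput(input.to_string()))
        }
    }

    pub struct FakeShell;

    impl ToolExecutor for FakeShell {
        fn name(&self) -> &str {
            "shell"
        }
        fn handle(&self, _input: &str) -> Box<dyn ToolOutput> {
            // Pretends the command succeeded; no process is spawned.
            Box::new(ExitStatusOutput(0))
        }
    }

    pub struct ToolRegistry {
        tools: HashMap<String, Box<dyn ToolExecutor>>,
    }

    impl ToolRegistry {
        /// No per-tool adapter is needed: every executor already has the same type.
        pub fn from_tools(tools: Vec<Box<dyn ToolExecutor>>) -> Self {
            let mut map = HashMap::new();
            for tool in tools {
                let name = tool.name().to_string();
                map.insert(name, tool);
            }
            Self { tools: map }
        }

        pub fn dispatch(&self, name: &str, input: &str) -> Option<String> {
            self.tools.get(name).map(|tool| tool.handle(input).to_model_text())
        }
    }
    ```

    With an associated `Output` type, `Echo` and `FakeShell` would be two
    distinct trait-object types and could not sit in the same map without a
    wrapper.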
    
    ## Test Coverage
    
    - Updated coverage for tool spec planning, registry lookup, deferred
    tool search registration, extension tool routing, post-tool-use hook
    payloads, dispatch tracing, guardian output extraction, and memories
    extension tool execution.
  • feat: make ToolExecutor an async trait (#22560)
    ## Why
    
    `codex_tools::ToolExecutor` keeps a tool spec attached to its runtime
    handler, but extension tools still carried a parallel
    `ExtensionToolFuture` / `ExtensionToolExecutor` shape. That made
    extension-owned tools look different from host tools even though
    routing, registration, and execution need the same abstraction.
    
    This PR makes the shared executor contract directly async and lets
    extension tools implement it too, so host tools and extension tools can
    move through the same registration path.
    
    ## What changed
    
    - Changed `ToolExecutor::handle` to an `async fn` using `async-trait`,
    and updated built-in tool handlers to implement the async trait
    directly.
    - Replaced the bespoke `ExtensionToolFuture` contract with a marker
    `ExtensionToolExecutor` over `ToolExecutor<ToolCall, Output =
    JsonToolOutput>`, re-exporting `ToolExecutor` from
    `codex-extension-api`.
    - Updated the memories extension tools to implement the shared executor
    trait.
    - Split tool-router construction into collected executors plus hosted
    model specs, keeping hosted tools like web search and image generation
    separate from executable handlers.
    - Updated spec/router tests and extension-tool stubs for the new
    executor shape.
    
    ## Verification
    
    - Not run locally.
  • fix: main (#22503)
    Fix `main` after a conflicting merge.
  • feat: add config-change extension contributor (#22488)
    ## Why
    
    Extensions can observe thread and turn lifecycle events today, but there
    was no single host-owned hook for changes to the effective thread
    configuration. That makes features that need to react to model,
    permission, or tool-suggest updates either depend on individual mutation
    paths or risk going stale after runtime config refreshes.
    
    This adds a typed config-change contributor so extension-owned state can
    stay synchronized with the effective thread config while the host
    remains responsible for deciding when config changed.
    
    ## What Changed
    
    - Added `ConfigContributor<C>` to `codex_extension_api`, with
    before/after immutable snapshots of the effective config plus
    session/thread extension stores.
    - Added registry builder/accessor support through `config_contributor`
    and `config_contributors`.
    - Emits config-change callbacks after committed updates from session
    settings, per-turn setting updates, and `refresh_runtime_config`.
    - Builds effective config snapshots only when config contributors are
    registered, and suppresses no-op callbacks when the before/after
    snapshots are equal.
    - Added a core session regression test that verifies contributors
    observe both model changes and user-layer runtime config changes,
    including access to session and thread extension stores.
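
    The snapshot and no-op suppression rules above can be sketched roughly as
    follows. `EffectiveConfig`, `Host`, `set_model`, `ModelLog`, and the
    `on_config_changed` method name are hypothetical; only
    `ConfigContributor<C>` and the before/after snapshot idea come from the
    PR description, and the extension stores are omitted.

    ```rust
    use std::sync::{Arc, Mutex};

    /// Hypothetical effective-config snapshot; the real trait is generic over `C`.
    #[derive(Debug, Clone, PartialEq)]
    pub struct EffectiveConfig {
        pub model: String,
    }

    pub trait ConfigContributor<C>: Send + Sync {
        /// Receives immutable before/after snapshots of the effective config.
        fn on_config_changed(&self, before: &C, after: &C);
    }

    pub struct Host {
        config: EffectiveConfig,
        contributors: Vec<Arc<dyn ConfigContributor<EffectiveConfig>>>,
    }

    impl Host {
        pub fn new(
            model: &str,
            contributors: Vec<Arc<dyn ConfigContributor<EffectiveConfig>>>,
        ) -> Self {
            Self { config: EffectiveConfig { model: model.to_string() }, contributors }
        }

        pub fn set_model(&mut self, model: &str) {
            // Build the "before" snapshot only when someone is listening.
            let before = (!self.contributors.is_empty()).then(|| self.config.clone());

            // Commit the update first; callbacks run after the commit.
            self.config.model = model.to_string();

            if let Some(before) = before {
                let after = self.config.clone();
                // Suppress no-op callbacks when nothing effectively changed.
                if before != after {
                    for contributor in &self.contributors {
                        contributor.on_config_changed(&before, &after);
                    }
                }
            }
        }
    }

    /// Illustrative contributor that records (old model, new model) pairs.
    #[derive(Default)]
    pub struct ModelLog {
        pub seen: Mutex<Vec<(String, String)>>,
    }

    impl ConfigContributor<EffectiveConfig> for ModelLog {
        fn on_config_changed(&self, before: &EffectiveConfig, after: &EffectiveConfig) {
            self.seen
                .lock()
                .unwrap()
                .push((before.model.clone(), after.model.clone()));
        }
    }
    ```

    Setting the model to its current value produces no callback in this
    sketch, which mirrors the equality check on snapshots described above.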
    
    ## Validation
    
    Added `config_change_contributor_observes_effective_config_changes` in
    `codex-rs/core/src/session/tests.rs` to cover the new contributor path.
  • Make context contributors async (#22491)
    ## Summary
    - make `ContextContributor` return a boxed `Send` future
    - await context contributors during initial context assembly
    - update existing contributors and extension-api examples for the async
    contract
    
    ## Testing
    - cargo test -p codex-extension-api --examples
    - cargo test -p codex-git-attribution
    - cargo test -p codex-core
    build_initial_context_includes_git_attribution_from_extensions --
    --nocapture
    - cargo test -p codex-core
    build_initial_context_omits_git_attribution_when_feature_is_disabled --
    --nocapture
    - cargo test -p codex-core (fails in unrelated
    agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns
    stack overflow)
    - just fix -p codex-extension-api
    - just fix -p codex-git-attribution
    - just fix -p codex-core
    - cargo clippy -p codex-extension-api --examples
  • feat: move extension scope ids into ExtensionData (#22490)
    ## Summary
    - add a scoped `level_id` to `ExtensionData` and expose it through
    `level_id()`
    - remove `thread_id`/`turn_id` parameters from extension contributor
    inputs where the scoped `ExtensionData` already carries that identity
    - move turn-scoped extension data onto `TurnContext` so token usage and
    lifecycle contributors can share the same turn store
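
    A minimal sketch of a store that carries its own scope id, assuming a
    string id and a type-keyed map. Only `ExtensionData` and `level_id()` are
    named in the summary; `new`, `insert`, `get`, `ContributorInput`, and
    `describe` are hypothetical helpers for illustration.

    ```rust
    use std::any::{Any, TypeId};
    use std::collections::HashMap;

    /// Scoped store: the id of the session, thread, or turn it belongs to
    /// travels with the data instead of being passed separately.
    pub struct ExtensionData {
        level_id: String,
        values: HashMap<TypeId, Box<dyn Any>>,
    }

    impl ExtensionData {
        pub fn new(level_id: impl Into<String>) -> Self {
            Self { level_id: level_id.into(), values: HashMap::new() }
        }

        pub fn level_id(&self) -> &str {
            &self.level_id
        }

        /// Stores one value per concrete type, replacing any previous value.
        pub fn insert<T: Any>(&mut self, value: T) {
            self.values.insert(TypeId::of::<T>(), Box::new(value));
        }

        pub fn get<T: Any>(&self) -> Option<&T> {
            self.values
                .get(&TypeId::of::<T>())
                .and_then(|value| value.downcast_ref::<T>())
        }
    }

    /// Contributor input without separate `thread_id` / `turn_id` fields.
    pub struct ContributorInput<'a> {
        pub thread_store: &'a ExtensionData,
        pub turn_store: &'a ExtensionData,
    }

    /// The identity is recovered from the stores themselves.
    pub fn describe(input: &ContributorInput<'_>) -> String {
        format!(
            "thread={} turn={}",
            input.thread_store.level_id(),
            input.turn_store.level_id()
        )
    }
    ```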
    
    ## Testing
    - cargo check -p codex-extension-api -p codex-core --tests
    - cargo test -p codex-extension-api
    - cargo test -p codex-guardian
    - cargo test -p codex-core --lib
    record_token_usage_info_notifies_extension_contributors
    - cargo test -p codex-core --lib
    submission_loop_channel_close_emits_thread_stop_lifecycle
    - cargo test -p codex-core --lib
    submission_loop_channel_close_aborts_active_turn_before_thread_stop_lifecycle
    - just fix -p codex-extension-api
    - just fix -p codex-guardian
    - just fix -p codex-core
    - just fmt
    
    ## Note
    - Attempted cargo test -p codex-core; it aborted in
    agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns
    with the existing stack overflow before the full suite completed.
  • feat: add token usage contributor hook (#22485)
    ## Why
    
    Extensions need a stable place to observe token accounting after Codex
    folds model-provider usage into the session's cached `TokenUsageInfo`.
    Without a contributor hook, extension-owned features that need last-turn
    or cumulative token usage have to duplicate session plumbing or infer
    state from client-facing `TokenCount` notifications.
    
    ## What changed
    
    - Added `TokenUsageContributor` to `codex-extension-api`, passing
    session/thread `ExtensionData`, `ThreadId`, turn id, and the current
    `TokenUsageInfo`.
    - Added registry builder/storage support for token-usage contributors.
    - Invoked registered contributors from
    `Session::record_token_usage_info` after the session token cache is
    updated and before the client `TokenCount` notification is emitted.
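
    The ordering described in the last bullet (cache update, then
    contributors, then client notification) can be sketched as below. The
    `TokenUsageInfo` fields, the `on_token_usage` method name, the shared
    `Trace`, and the simplified `Session` are assumptions for illustration;
    the extension stores and ids are omitted.

    ```rust
    use std::cell::RefCell;
    use std::rc::Rc;

    /// Hypothetical, reduced token accounting snapshot.
    #[derive(Debug, Clone, Default, PartialEq)]
    pub struct TokenUsageInfo {
        pub total_tokens: u64,
        pub last_turn_tokens: u64,
    }

    pub trait TokenUsageContributor {
        fn on_token_usage(&self, info: &TokenUsageInfo);
    }

    /// Shared log used only to make the call order observable.
    pub type Trace = Rc<RefCell<Vec<String>>>;

    pub struct TraceContributor {
        pub trace: Trace,
    }

    impl TokenUsageContributor for TraceContributor {
        fn on_token_usage(&self, info: &TokenUsageInfo) {
            self.trace
                .borrow_mut()
                .push(format!("contributor sees total={}", info.total_tokens));
        }
    }

    pub struct Session {
        cache: TokenUsageInfo,
        contributors: Vec<Box<dyn TokenUsageContributor>>,
        trace: Trace,
    }

    impl Session {
        pub fn new(trace: Trace) -> Self {
            Self { cache: TokenUsageInfo::default(), contributors: Vec::new(), trace }
        }

        pub fn add_contributor(&mut self, contributor: Box<dyn TokenUsageContributor>) {
            self.contributors.push(contributor);
        }

        pub fn record_token_usage_info(&mut self, turn_tokens: u64) {
            // 1. Fold provider usage into the session's cached info.
            self.cache.last_turn_tokens = turn_tokens;
            self.cache.total_tokens += turn_tokens;

            // 2. Contributors observe the already-updated cache.
            for contributor in &self.contributors {
                contributor.on_token_usage(&self.cache);
            }

            // 3. Only then is the client-facing notification emitted.
            self.trace
                .borrow_mut()
                .push(format!("TokenCount total={}", self.cache.total_tokens));
        }
    }
    ```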
    
    ## Testing
    
    - Added `record_token_usage_info_notifies_extension_contributors`,
    covering cumulative token usage updates and access to both extension
    stores.
  • feat: add turn lifecycle contributors (#22480)
    ## Why
    
    Extensions can already contribute prompt, tool, turn-item, and
    thread-lifecycle behavior, but there was no explicit host-owned hook for
    per-turn setup and cleanup. That makes extension-private turn state
    awkward: an extension either has to stash it outside the turn lifecycle
    or depend on core runtime objects.
    
    This adds a small turn lifecycle boundary. Extensions receive stable
    identifiers plus the existing session, thread, and turn `ExtensionData`
    stores, while core keeps owning task scheduling, cancellation, and turn
    teardown.
    
    ## What Changed
    
    - Added `TurnLifecycleContributor` with `on_turn_start`, `on_turn_stop`,
    and `on_turn_abort` callbacks in `codex-rs/ext/extension-api`.
    - Added typed `TurnStartInput`, `TurnStopInput`, and `TurnAbortInput`
    payloads that expose `thread_id`, `turn_id`, `session_store`,
    `thread_store`, and `turn_store`.
    - Registered and re-exported turn lifecycle contributors through
    `ExtensionRegistry` and `ExtensionRegistryBuilder`.
    - Wired `Session` to emit turn start, stop, and abort callbacks from the
    existing turn/task lifecycle paths.
    - Carried the turn-scoped `ExtensionData` through `RunningTask` and
    `RemovedTask` so stop/abort callbacks receive the same turn store
    created at turn start.
    
    ## Verification
    
    - Not run locally.
  • feat: add thread lifecycle contributor hooks (#22476)
    ## Why
    
    Extensions that need thread-scoped state currently only get a start-time
    callback. That is enough for seeding stores, but it leaves the host
    without a shared extension seam for later thread rehydrate and flush
    work as thread ownership evolves. This PR turns that start-only seam
    into a host-owned thread lifecycle contributor contract so
    extension-private state can stay behind the extension API instead of
    leaking extra orchestration through core.
    
    ## What changed
    
    - Replaced `ThreadStartContributor` with `ThreadLifecycleContributor`
    and added typed lifecycle inputs for thread start, resume, and stop. The
    contract lives in
    [`contributors/thread_lifecycle.rs`](https://github.com/openai/codex/blob/d0e9211f70e58d6b07ef07e84f359d1b9aa25955/codex-rs/ext/extension-api/src/contributors/thread_lifecycle.rs#L1-L64).
    - Kept the existing start-time behavior intact by routing session
    construction through `on_thread_start`.
    - Invoked `on_thread_stop` during session shutdown before thread-scoped
    extension state is dropped, while isolating contributor failures behind
    warning logs.
    - Migrated `git-attribution` and `guardian` onto the lifecycle
    registration path.
    - Renamed the extension registry plumbing from start-specific
    contributors to lifecycle-specific contributors.
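
    One way to picture "isolating contributor failures behind warning logs"
    is the synchronous sketch below, where a failing `on_thread_stop` is
    turned into a warning string and later contributors still run. The
    `Result<(), String>` signatures, the `name` method, `Flusher`, `Failing`,
    and `stop_thread` are all assumptions; the real contract uses typed
    lifecycle inputs.

    ```rust
    use std::cell::Cell;

    /// Hypothetical, reduced lifecycle trait with default no-op callbacks.
    pub trait ThreadLifecycleContributor {
        fn name(&self) -> &str;
        fn on_thread_start(&self) -> Result<(), String> {
            Ok(())
        }
        fn on_thread_resume(&self) -> Result<(), String> {
            Ok(())
        }
        fn on_thread_stop(&self) -> Result<(), String> {
            Ok(())
        }
    }

    /// Flushes its state on stop and remembers that it did so.
    pub struct Flusher {
        pub flushed: Cell<bool>,
    }

    impl ThreadLifecycleContributor for Flusher {
        fn name(&self) -> &str {
            "flusher"
        }
        fn on_thread_stop(&self) -> Result<(), String> {
            self.flushed.set(true);
            Ok(())
        }
    }

    /// Always fails on stop, to exercise the isolation path.
    pub struct Failing;

    impl ThreadLifecycleContributor for Failing {
        fn name(&self) -> &str {
            "failing"
        }
        fn on_thread_stop(&self) -> Result<(), String> {
            Err("disk full".to_string())
        }
    }

    /// Runs every stop callback; errors become warnings instead of aborting
    /// shutdown. The returned strings stand in for warning log lines.
    pub fn stop_thread(contributors: &[&dyn ThreadLifecycleContributor]) -> Vec<String> {
        let mut warnings = Vec::new();
        for contributor in contributors {
            if let Err(err) = contributor.on_thread_stop() {
                warnings.push(format!(
                    "warning: {} failed on thread stop: {err}",
                    contributor.name()
                ));
            }
        }
        warnings
    }
    ```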
    
    ## Notes
    
    `on_thread_resume` is introduced at the API boundary here so extensions
    can target the final lifecycle shape; host resume dispatch can be wired
    where that runtime path is finalized.
  • Refactor extension tools onto shared ToolExecutor (#22369)
    ## Why
    
    Extension tools were split across two public runtime contracts:
    `codex-tool-api` exposed `ToolBundle` plus its own call/spec/error
    types, while core native tools used `codex_tools::ToolExecutor`. That
    made contributed tool specs and execution behavior easy to drift apart
    and added another crate boundary for what should be one executable-tool
    seam.
    
    This PR makes `ToolExecutor` the single runtime contract and keeps
    extension-specific pinning in `codex-extension-api`.
    
    ## Remaining todo
    
    https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5
    Either make `Invocation` generic, or extract the `ToolCall` and clean up
    `ToolInvocation`.
    
    ## What changed
    
    - Removed the `codex-tool-api` workspace crate and its dependencies from
    core and `codex-extension-api`.
    - Made `codex_tools::ToolExecutor` object-safe with `async_trait` so
    extension contributors can return a dyn executor.
    - Added the extension-facing aliases under
    `ext/extension-api/src/contributors/tools.rs`, including
    `ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output =
    ExtensionToolOutput>`.
    - Changed `ToolContributor::tools` to return extension executors
    directly instead of `ToolBundle`s.
    - Updated core’s extension tool handler/registry/router path to adapt
    those extension executors into the existing native `ToolInvocation`
    runtime path.
    - Added focused coverage for extension tools being registered,
    model-visible, dispatchable, and not replacing built-in tools.
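
    The alias in the list above pins both the input and the associated
    output type on a trait object. A simplified, synchronous sketch of that
    pinning, plus one possible "do not replace built-ins" rule, is shown
    below. The trait body, `NamedTool`, `Registry`, and the
    first-registration-wins policy are assumptions; the PR only states that
    extension tools do not replace built-in tools, and core adapts extension
    executors rather than sharing a single map as done here.

    ```rust
    use std::collections::HashMap;

    pub struct ToolCall {
        pub arguments: String,
    }

    pub struct ExtensionToolOutput(pub String);

    /// Reduced executor trait that still has an associated `Output` type.
    pub trait ToolExecutor<I> {
        type Output;
        fn name(&self) -> &str;
        fn handle(&self, call: I) -> Self::Output;
    }

    /// Pinning `I` and `Output` makes the trait usable as one `dyn` type.
    pub type ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output = ExtensionToolOutput>;

    /// Illustrative tool that prefixes the call arguments with a fixed reply.
    pub struct NamedTool {
        pub name: &'static str,
        pub reply: &'static str,
    }

    impl ToolExecutor<ToolCall> for NamedTool {
        type Output = ExtensionToolOutput;

        fn name(&self) -> &str {
            self.name
        }

        fn handle(&self, call: ToolCall) -> ExtensionToolOutput {
            ExtensionToolOutput(format!("{}:{}", self.reply, call.arguments))
        }
    }

    pub struct Registry {
        tools: HashMap<String, Box<ExtensionToolExecutor>>,
    }

    impl Registry {
        pub fn with_builtins(builtins: Vec<Box<ExtensionToolExecutor>>) -> Self {
            let mut tools = HashMap::new();
            for tool in builtins {
                let name = tool.name().to_string();
                tools.insert(name, tool);
            }
            Self { tools }
        }

        /// Returns `false` and keeps the existing tool when the name is taken.
        pub fn register_extension(&mut self, tool: Box<ExtensionToolExecutor>) -> bool {
            if self.tools.contains_key(tool.name()) {
                return false;
            }
            let name = tool.name().to_string();
            self.tools.insert(name, tool);
            true
        }

        pub fn call(&self, name: &str, arguments: &str) -> Option<String> {
            self.tools
                .get(name)
                .map(|tool| tool.handle(ToolCall { arguments: arguments.to_string() }).0)
        }
    }
    ```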
    
    ## Verification
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-extension-api`
  • extension-api: add approval review contributor flow (#22344)
    ## Why
    
    `codex-extension-api` needs an approval hook that lets an installed
    extension own a rendered approval-review prompt and produce the final
    `ReviewDecision`. The prior interceptor stub only exposed a yes/no claim
    and did not model the review result itself, which left the host with the
    missing half of the control flow.
    
    ## What changed
    
    - Replaces `ApprovalInterceptorContributor` with
    [`ApprovalReviewContributor`](https://github.com/openai/codex/blob/c49d17531e15057a373a9b17f410cafb6299d0c1/codex-rs/ext/extension-api/src/contributors.rs#L43-L55),
    which may claim a rendered prompt and return an async `ReviewDecision`.
    - Re-exports the new contributor and future types from `extension-api`.
    - Adds registry support through `approval_review_contributor(...)` plus
    [`ExtensionRegistry::approval_review(...)`](https://github.com/openai/codex/blob/c49d17531e15057a373a9b17f410cafb6299d0c1/codex-rs/ext/extension-api/src/registry.rs#L90-L101),
    which returns the first installed contributor that claims the prompt.
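
    The "first installed contributor that claims the prompt" rule can be
    sketched synchronously as follows. The two-variant `ReviewDecision`, the
    `review` method returning `Option`, and the `DenyKeyword` and
    `ApproveAll` contributors are assumptions for illustration; the real
    contributor returns an async decision and the real enum may have more
    variants.

    ```rust
    /// Hypothetical, reduced decision type.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum ReviewDecision {
        Approved,
        Denied,
    }

    /// Sync simplification: `None` means "this contributor does not claim
    /// the prompt", `Some` is the final decision.
    pub trait ApprovalReviewContributor {
        fn review(&self, rendered_prompt: &str) -> Option<ReviewDecision>;
    }

    /// Claims only prompts that mention a keyword, and denies them.
    pub struct DenyKeyword(pub &'static str);

    impl ApprovalReviewContributor for DenyKeyword {
        fn review(&self, rendered_prompt: &str) -> Option<ReviewDecision> {
            rendered_prompt
                .contains(self.0)
                .then_some(ReviewDecision::Denied)
        }
    }

    /// Claims every prompt and approves it.
    pub struct ApproveAll;

    impl ApprovalReviewContributor for ApproveAll {
        fn review(&self, _rendered_prompt: &str) -> Option<ReviewDecision> {
            Some(ReviewDecision::Approved)
        }
    }

    #[derive(Default)]
    pub struct ExtensionRegistry {
        approval_reviewers: Vec<Box<dyn ApprovalReviewContributor>>,
    }

    impl ExtensionRegistry {
        /// Installs a contributor; installation order decides priority.
        pub fn approval_review_contributor(
            mut self,
            contributor: Box<dyn ApprovalReviewContributor>,
        ) -> Self {
            self.approval_reviewers.push(contributor);
            self
        }

        /// Returns the decision of the first contributor that claims the prompt.
        pub fn approval_review(&self, rendered_prompt: &str) -> Option<ReviewDecision> {
            self.approval_reviewers
                .iter()
                .find_map(|contributor| contributor.review(rendered_prompt))
        }
    }
    ```

    When no contributor claims the prompt, the sketch returns `None`, leaving
    the host free to fall back to its own approval flow.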
  • feat: guardian as an extension (contributors part) (#22216)
    Part 1 of guardian as an extension. This binds all the logic needed to
    spawn another agent from an extension, and it adds `ThreadId` to the
    start thread collaborator.
  • feat: wire extension tool bundles into core (#22147)
    ## Why
    
    This is the next narrow step toward moving concrete tool families out of
    core. After #22138 introduced `codex-tool-api`, we still needed a real
    end-to-end seam that lets an extension own an executable tool definition
    once and have core install it without the temporary `extension-api`
    wrapper or a dependency on `codex-tools`.
    
    `codex-tool-api` is the small extension-facing execution contract, while
    `codex-tools` still has a different job: host-side shared tool metadata
    and planning logic that is not “run this contributed tool”, like spec
    shaping, namespaces, discovery, code-mode augmentation, and
    MCP/dynamic-to-Responses API conversion.
    
    ## What changed
    
    - Moved the shared leaf tool-spec and JSON Schema types into
    `codex-tool-api`, so the executable contract now lives with
    [`ToolBundle`](https://github.com/openai/codex/blob/c538758095337d4fe0a52a172363ccede4066bda/codex-rs/tool-api/src/bundle.rs#L19-L70).
    - Replaced the temporary extension-side tool wrapper with direct
    `ToolBundle` use in `codex-extension-api`.
    - Taught core to collect contributed bundles, include them in spec
    planning, register them through
    [`ToolRegistryBuilder::register_tool_bundle`](https://github.com/openai/codex/blob/c538758095337d4fe0a52a172363ccede4066bda/codex-rs/core/src/tools/registry.rs#L653-L667),
    and dispatch them through the existing router/runtime path.
    - Added focused coverage for contributed tools becoming model-visible
    and dispatchable, plus spec-planning coverage for contributed function
    and freeform tools.
    
    ## Verification
    
    - Added `extension_tool_bundles_are_model_visible_and_dispatchable` in
    `core/src/tools/router_tests.rs`.
    - Added spec-plan coverage in `core/src/tools/spec_plan_tests.rs` for
    contributed extension bundles.
    
    ## Related
    
    - Follow-up to #22138