codex

Reinject missing World State fragments on resume (#30152 )

## Why

World State restores its structured snapshot on resume so unchanged
sections do not have to be rendered again. That is safe only when the
model-visible fragment represented by the snapshot is still present in
retained history.

For selected executor skills, the failing selected-capability scenario
exposed this state:

```text
persisted World State: selected skill catalog is known
retained model history: selected skill catalog message is missing
next diff: unchanged, so emit nothing
```

The model resumes without being told about the selected skill catalog.

## What changed

World State contributions may now optionally describe the concrete
model-visible fragment that must remain in retained history.

When a persisted snapshot is present:

```text
matching retained fragment exists -> trust snapshot, emit nothing
matching retained fragment missing -> treat section as absent, render current state once
```

The skills extension uses this for non-empty selected-environment
catalogs by matching its exact rendered catalog body. Empty or hidden
catalogs do not require a fragment.

## Scope

This does not clear or rebuild the whole World State baseline. It does
not change skill discovery, cache invalidation, environment
availability, or MCP runtime behavior. It only keeps a persisted section
snapshot and its retained model context consistent across resume/history
reconstruction.

## Coverage

A focused World State regression test verifies both sides:

- a missing retained fragment is rendered again
- a matching retained fragment avoids duplicate injection

jif · 2026-06-26 02:18:00 +01:00

723b23efd0

Project selected plugin runtime by environment availability (#30093 )

## Why

Selected plugin metadata is stable, but MCP processes are live runtime
state. They need different lifetimes:

- the MCP extension caches manifest, MCP, and connector declarations for
each stable selected root;
- each model step projects that cached metadata through the roots that
resolved as ready for that exact step;
- the MCP manager is rebuilt only when that availability projection
changes.

This matches executor skills: both features consume the same resolved
step roots instead of inferring readiness from the turn's selected
environments.

## Behavior

```text
E1 not ready for this step
  -> no E1 MCP servers or connectors
  -> cached plugin metadata stays in ext/mcp

E1 becomes ready
  -> reuse cached metadata
  -> publish one MCP runtime containing E1 capabilities

same ready roots on the next step
  -> reuse the exact runtime; no rediscovery and no MCP restart

resume
  -> create new extension thread state and a new MCP runtime
```

All model-facing consumers use the same step snapshot:

```text
resolved selected roots
        |
        v
extension MCP/connector projection
        |
        v
{ MCP config, connector snapshot, MCP manager }
        |
        +-> advertise model tools
        +-> build app/connector tools
        +-> execute MCP calls
```

## Cache contract

The existing MCP extension owns a cache keyed by the full
`SelectedCapabilityRoot`:

```rust
let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
```

The cache lives with extension thread state. Environment availability
filters projection but does not invalidate metadata. Resume creates new
thread state. There is no file watcher or executor generation because
contents behind a stable environment/root are assumed stable.

## What changes

- Keeps executor plugin discovery and cached metadata in `ext/mcp`.
- Caches MCP and connector declarations together per selected root.
- Uses the step's already-resolved capability roots, including lazy
environments that are not turn environments.
- Reuses the current MCP runtime when the ready-root projection is
unchanged.
- Uses the same step MCP manager and connector snapshot for
model-visible tools and execution.
- Resolves direct thread-scoped MCP requests from the current
selected-root projection.

## Deliberately out of scope

- `app/list` remains based on the latest global host-plugin state; this
PR does not make its response or notifications thread-specific.
- `required = true` startup semantics do not apply to delayed executor
MCP activation.
- No filesystem/content invalidation.
- No transport-disconnect watcher.
- No executor generations or environment replacement semantics.
- No client sharing across complete manager replacements.

## Stack

1. Extension-owned World State sections.
2. Project executor skills through World State.
3. Pin one MCP runtime to each model step.
4. **This PR:** project selected MCP and connector state from
extension-owned metadata.
5. Integration coverage for selected capability availability and resume.

## Verification

-
`selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
- The stacked integration PR covers unavailable to ready activation,
unchanged-runtime reuse, skills, MCP tools, connector attribution, and
cold resume.

jif · 2026-06-26 01:36:44 +01:00

3095ea9c3d

Project executor skills through World State (#30088 )

## Why

A selected executor environment can be unavailable in one model step and
ready in the next. The model should see its skills only while that
environment is ready, without rescanning stable files on every sample.

The product assumption is simple:

- an environment ID names one stable logical environment;
- the selected root contents do not change during the thread.

## Behavior

```text
E1 unavailable -> do not show E1 skills
E1 ready       -> discover once, cache, show through World State
E1 unavailable -> hide skills, keep cache
E1 ready again -> reuse cache, show skills again
resume         -> create a new thread cache and discover again
```

The cache key is the full `SelectedCapabilityRoot`. Availability does
not invalidate it; dropping the extension's thread state does.

The step supplies the ready selected roots directly. They do not have to
be turn environments:

```text
turn environment: laptop
selected root:    worker:/plugins/lint-fix

worker ready -> lint-fix skills are visible
```

## What changes

- Keeps executor skill catalogs in the existing skills extension.
- Passes the roots resolved as ready for the step into World State
contributors.
- Loads each ready selected root at most once per thread.
- Contributes the executor catalog as the `skills` World State section.
- Uses the exact step catalog for explicit skill selection and body
reads.
- Leaves host and orchestrator skill behavior where it already lives.

Taking a step snapshot itself does not add an RPC. Executor filesystem
calls happen only on the first discovery of a stable root for that
thread.

## What does not change

- No filesystem watcher or content-based invalidation.
- No retry/generation framework.
- No skill runtime migration into core.
- No general rewrite of the skills extension.

## Stack

1. Extension-owned World State sections.
2. **This PR:** project cached executor skills through World State.
3. Pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment
availability.
5. One end-to-end integration scenario.

jif · 2026-06-26 00:13:43 +01:00

5eebeb8169

Let extensions contribute World State sections (#30100 )

## Why

#29856 already owns the durable thread intent and exact environment
binding. This PR adds only the small missing extension boundary: an
extension can contribute one named World State section, while core still
owns persistence, diffing, and model-visible fragment types.

This lets skills stay in the skills extension instead of moving their
runtime into core.

## Shape

```text
extension-owned state
        |
        | contribute section id + JSON snapshot + renderer
        v
core World State
        |
        | compare with the previous snapshot
        v
no message, or one incremental model-visible update
```

The extension API is deliberately small:

```rust
fn contribute_world_state(...) -> Vec<WorldStateSectionContribution>
```

Core adapts the rendered result to `ContextualUserFragment`, records the
snapshot, and keeps the existing compaction/resume behavior.

## What changes

- Adds extension-owned World State section contributions.
- Calls those contributors from the existing per-step World State
builder.
- Restores durable selected capability roots into extension thread state
on resume.
- Keeps the actual model-context fragment and rollout machinery in core.

## What does not change

- No skill or MCP implementation moves out of its extension.
- No new file watcher, generation, or RPC.
- No generic migration of existing World State sections.
- No change to the stable environment-ID assumption from #29856.

## Example

```text
step 1 snapshot: skills = []
step 2 snapshot: skills = [executor-demo:deploy]

core asks the skills extension to render only that change.
```

## Stack

1. **This PR:** let extensions contribute World State sections.
2. Project executor skills through the skills extension.
3. Pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment
availability.
5. One end-to-end integration scenario.

jif · 2026-06-25 22:23:51 +01:00

c9e6d9783d

Add turn-scoped context contributions (#28911 )

## Summary
- keep context injection on a single ContextContributor trait
- split context injection into thread-scoped and turn-scoped
contribution methods
- wire turn-scoped fragments into initial context assembly so extensions
can contribute context from turn-local state

jif · 2026-06-18 19:40:28 +02:00

9684ec25be

[codex] Use expect in integration tests (#28441 )

The workspace denies `clippy::expect_used` in production. Although
`clippy.toml` allows `expect` in tests, Bazel Clippy compiles
integration-test helper code in a way that does not receive that
exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
and equivalent `match`/`let else` forms.

This allows `clippy::expect_used` once at each integration-test crate
root (including aggregated suites and test-support libraries), then
replaces manual panic-based Result and Option unwraps with
`expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
crate roots. Intentional assertion and unexpected-variant panics remain
unchanged, and the production `expect_used = "deny"` lint remains in
place.

The cleanup is mechanical and net-negative in line count.

pakrym-oai · 2026-06-15 21:53:47 -07:00

e752f7b4ae

skills: hide orchestrator skills with a local executor (#28333 )

## Why

App-server threads without a local executor need orchestrator-owned
skills from the hosted `codex_apps` MCP server. Threads with the local
executor already discover installed skills from the local filesystem.

After the orchestrator skill provider was enabled for every app-server
thread, local-executor threads also received the hosted skill catalog
and the `skills.list` and `skills.read` tools. This changed the existing
local behavior and could expose a second hosted copy of a skill that was
already installed locally.

## What changed

- Expose the thread's selected execution environments to extensions at
thread startup.
- Enable orchestrator skills only when the reserved local environment is
not selected.
- Apply that decision consistently to hosted skill catalog discovery,
explicit skill injection, and the `skills.list` and `skills.read` tools.

## Verification

- The existing no-executor app-server test continues to verify hosted
skill discovery, invocation, and child-resource reads.
- A new app-server test verifies that local-executor threads do not
receive hosted skill context or `skills.*` tools.

jif · 2026-06-15 17:15:45 +02:00

6d9df687a5

Add selected-plugin precedence and attribution to the MCP catalog (#27884 )

## Why

**In short:** this PR resolves already-discovered MCP registrations. It
does not read selected plugins or discover their MCP servers.

The resolved MCP catalog currently builds config and auto-discovered
plugin registrations before runtime contributors are applied. A
thread-selected plugin needs a distinct precedence tier in that same
initial resolution pass: otherwise a disabled lower-precedence winner
can leave stale name-level state behind, and the winning MCP tools
cannot be attributed to the selected package reliably.

This PR adds that catalog boundary before executor discovery is
connected.

## What changed

- Added an explicit selected-plugin registration tier between
auto-discovered plugins and explicit config.
- Collected selected-plugin contributions before the initial catalog
build, while leaving compatibility and generic extension overlays in
their existing runtime phase.
- Retained the winning plugin ID and display name directly on
plugin-owned catalog registrations.
- Derived MCP tool provenance from the winning catalog entry instead of
joining against local-only plugin summaries.
- Retained the winning selected server's tool approval policy in the
running connection manager, so a selected registration cannot inherit
approval behavior from a losing local plugin.
- Kept remembered approval session-scoped for selected plugins until
there is an authority-aware persistence contract; Codex will not write
approval back to an unrelated local plugin.
- Preserved existing name-level disabled vetoes for discovered plugins
and config, while keeping a selected package's own disabled registration
scoped to that registration.
- Preserved deterministic selection order and existing config,
compatibility, and extension precedence.

The resulting order is:

```text
auto-discovered plugin
  < selected plugin
  < explicit config
  < compatibility registration
  < extension overlay
```

## Behavior and scope

This is a catalog and provenance change only. No production host
contributes selected-plugin MCP registrations yet, so existing local MCP
behavior remains unchanged.

The stacked follow-up, #27870, installs the executor plugin provider
that produces these registrations. App-server activation remains a
separate final step.

## Verification

Focused tests cover precedence, deterministic selected-plugin conflicts,
disabled-veto behavior across catalog phases, managed requirements
before selected-plugin resolution, winning-server approval policy, and
attribution when local and selected packages share an ID or server name.
CI owns execution of the test suite.

jif · 2026-06-15 11:10:51 +02:00

c3a479620f

Make MCP server contributions thread-scoped (#27670 )

## Why

`selectedCapabilityRoots` belongs to one thread, but MCP contributors
previously received only the global Codex config. That left no clean way
for a selected executor capability to contribute MCP servers to its own
thread.

## What this PR does

- Gives MCP contributors a small context containing the config and, for
a running thread, its frozen host-seeded inputs.
- Uses the same thread inputs during startup, status queries, refreshes,
and skill dependency checks.
- Keeps threadless MCP operations and the existing hosted Apps behavior
unchanged.
- Adds coverage showing that two threads resolve independent
registrations and that later lifecycle mutations do not change the
frozen MCP inputs.

This PR does not discover plugin manifests, add MCP servers, or launch
anything new. It only establishes the thread-scoped registration
boundary.

## Follow-ups

- Resolve selected executor plugin roots through their owning
environment filesystem.
- Convert their stdio MCP declarations into environment-bound
registrations and add an executor MCP end-to-end test.

## Verification

- `just fmt`
- `cargo check --tests -p codex-protocol -p codex-extension-api -p
codex-mcp-extension -p codex-core -p codex-app-server`

Tests and Clippy were not run.

jif · 2026-06-12 11:20:34 +02:00

693082f3c4

Route image extension reads through turn environments v2 (#27498 )

## Why

Image generation used `std::fs::read` for referenced image paths, which
did not support environment-backed filesystems or their sandbox context.

## What changed

- Expose optional turn environments to extension tool calls.
- Include each environment’s ID, working directory, filesystem, and
sandbox context.
- Read referenced images through the selected environment filesystem.
- Keep sandbox usage at the extension call site so extensions can choose
the appropriate access mode.
- Consolidate image request construction into one async function.
- Add coverage for successful environment reads and read failures.

## Validation

- `cargo check -p codex-image-generation-extension --tests`
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`

`just test -p codex-image-generation-extension` could not complete
because the build exhausted available disk space.

Won Park · 2026-06-11 16:32:52 -07:00

19ce6394af

Resolve MCP server registrations through a catalog (#27634 )

## Why

MCP servers currently come from user config, local plugins,
compatibility Apps synthesis, and host extensions. Those sources were
composed by mutating a shared map, leaving registration identity,
precedence, removal, and provenance implicit in assembly order.

Before adding executor-owned MCPs, Codex needs one durable resolution
boundary above `McpConnectionManager`. This PR introduces that boundary
while preserving current server configuration, policy, and runtime
behavior. Executor-scoped registrations and explicit policy layers
remain follow-ups.

## What changed

- Add typed `McpServerRegistration` inputs and an immutable
`ResolvedMcpCatalog` in `codex-mcp`.
- Retain each registration's complete `McpServerConfig`, including its
environment binding, while recording its source and provenance.
- Preserve the existing structural precedence between plugin, config,
compatibility, and ordered extension sources.
- Resolve equal-precedence actions by contribution order; provenance IDs
are used only for diagnostics and cannot affect the winner.
- Preserve extension removals and the existing name-scoped `enabled =
false` veto.
- Report same-tier conflicts with every contender and the final catalog
outcome, including whether the winning action registers or removes the
server.
- Require MCP contributors to provide a stable diagnostic identity.
- Derive materialized server maps and plugin ownership from the resolved
catalog.

`McpConnectionManager`, transport startup, tool calls, and resource
routing continue to consume the same effective `McpServerConfig` values.

## Scope

This PR does not add new MCP capabilities or change user-visible
behavior. It does not add executor plugin discovery, thread-scoped
registrations, dynamic refresh generations, or new user/managed policy
semantics.

## Verification

- Added focused catalog coverage for source precedence, complete
configuration preservation, disabled vetoes, plugin ownership,
contribution-order tie breaking, removal outcomes, and conflict
diagnostics.
- Extended hosted Apps coverage for ordered extension removal and
Apps-disabled hosts with and without the hosted extension installed.
- `cargo check -p codex-mcp --tests -p codex-extension-api -p
codex-core`

jif · 2026-06-11 21:54:52 +02:00

4a5a676499

[codex] Load user instructions through an injected provider (#27101 )

## Why

We want to remove implicit use of `$CODEX_HOME` from `codex-core` and
make embedders responsible for supplying user-level instructions. This
also ensures user instructions load when no primary environment is
selected.

## What changed

Stacked on #27415, which makes `codex exec` surface thread-scoped
runtime warnings.

- Added `UserInstructionsProvider` to `codex-extension-api`, with
absolute source attribution and recoverable loading warnings.
- Added `codex-home` with the filesystem-backed provider for
`AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback,
trimming, lossy UTF-8 handling, and the existing uncapped global
instruction size.
- Removed global instruction loading from `Config` and require
`ThreadManager` callers to inject a provider.
- Load provider instructions once for each fresh root runtime, including
runtimes without a primary environment. Running sessions retain their
snapshot, while child agents inherit the parent snapshot without
invoking the provider.
- Keep provider instructions separate while loading project `AGENTS.md`,
then assemble the model-visible instructions with the existing ordering,
source attribution, warning, and turn-context behavior.
- Wired the Codex home provider through the CLI, app server, MCP server,
core facade, and thread-manager sample.

## Validation

- `just test -p codex-home -p codex-extension-api`
- `just test -p codex-core agents_md`
- `just test -p codex-core guardian`
- `just test -p codex-app-server
thread_start_without_selected_environment_includes_only_global_instruction_source`
- `just test -p codex-exec warning`
- `just bazel-lock-check`

Adam Perry @ OpenAI · 2026-06-11 19:28:47 +00:00

236b50125d

[codex] Remove async_trait from ToolExecutor (#27304 )

## Why

We're now [discouraging use of
`async_trait`](https://github.com/openai/codex/pull/20242).

Removing use of `async_trait` from `ToolExecutor` yields a `codex_core`
debug test build speedup of ~78% (from 227.5s to 50.3s) on my machine.

Stacked on #27299, this PR applies the trait change after the handler
bodies have been outlined.

## What

Changed `ToolExecutor::handle` to return an explicit boxed
`ToolExecutorFuture` instead of using `async_trait`.

Updated ToolExecutor implementors to return `Box::pin(...)`, reexported
the future alias through `codex-tools` and `codex-extension-api`, and
removed `codex-tools` direct `async-trait` dependency.

Adam Perry @ OpenAI · 2026-06-10 10:26:53 -07:00

2704ecea9a

Remove async-trait from extension contributors (#27383 )

## Why

Extension contributors are registered behind `dyn Trait` objects, so
native `async fn`/RPITIT methods would make these traits
non-object-safe. Spell out the boxed, `Send` future contract directly so
`extension-api` no longer needs `async-trait` while retaining the
existing runtime model.

## What changed

- add a shared `ExtensionFuture` alias and use it for asynchronous
contributor methods
- migrate production and test implementations to return `Box::pin(async
move { ... })`
- remove `async-trait` dependencies where they are no longer used,
keeping it dev-only where unrelated test executors still require it

## Behavior

No behavior change is intended. Contributor futures remain boxed,
`Send`, dynamically dispatched, and lazily executed; cancellation and
callback ordering stay unchanged.

## Testing

- `just test -p codex-extension-api` (11 passed)
- affected extension crates (64 passed)
- targeted `codex-core` contributor tests (14 passed)
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`

A broad local `codex-core` run compiled successfully but encountered
unrelated sandbox and missing test-binary fixture failures; CI will run
the full checks.

jif · 2026-06-10 14:31:09 +02:00

d2f6d23c6c

Use plugin-service MCP as the hosted plugin runtime (#27198 )

## Stack

- Base: #27191
- This PR is the third vertical and should be reviewed against
`jif/external-plugins-2`, not `main`.

## Why

#27191 moves the host-owned Apps MCP registration behind an extension
contributor, but deliberately preserves the existing endpoint-selection
feature while that contribution contract lands. App-server can therefore
resolve the server through extensions, yet the hosted plugin endpoint is
still selected through temporary `apps_mcp_path_override` plumbing.

That is not the long-term plugin model. A plugin can bundle skills,
connectors, MCP servers, and hooks, and those components do not all need
the same source or execution environment. In particular, an
authenticated HTTP MCP server can expose plugin capabilities directly
from a backend without an executor or an orchestrator filesystem.

This PR completes that hosted vertical. App-server's MCP extension now
owns the aggregate hosted plugin runtime at `/ps/mcp`. Connector actions
continue to arrive as MCP tools, while backend-provided skills arrive as
MCP resources and use Codex's existing resource list/read paths. No
second backend client, skill filesystem, or generic plugin activation
framework is introduced.

The backend route remains the hosted implementation. This change
replaces Codex's temporary endpoint-selection mechanism, not the service
behind the endpoint.

## What changed

### Hosted plugin runtime

The MCP extension now contributes `codex_apps` as the hosted plugin
runtime rather than as a configurable Apps endpoint:

- `https://chatgpt.com` resolves to
`https://chatgpt.com/backend-api/ps/mcp`;
- a bare custom ChatGPT base resolves to `/api/codex/ps/mcp`;
- the existing product-SKU header and ChatGPT authentication behavior
are preserved;
- executor availability is never consulted for this streamable HTTP
transport.

The same MCP connection carries both component shapes supported by the
hosted endpoint:

- connector actions are discovered and invoked as MCP tools;
- hosted skills are enumerated and read as MCP resources through the
existing `list_mcp_resources` and `read_mcp_resource` paths.

This keeps component access in the subsystem that already owns the
protocol instead of downloading backend skills into an orchestrator
filesystem or inventing a parallel hosted-skill client.

### Explicit runtime ordering

`McpManager` now resolves the reserved `codex_apps` entry in three
ordered phases:

1. install the legacy Apps fallback for compatibility;
2. apply ordered extension `Set` or `Remove` overlays;
3. apply the final ChatGPT-auth gate without synthesizing the server
again.

This ordering is important:

- an ordinary configured or plugin MCP server cannot claim the
auth-bearing `codex_apps` name;
- an extension-contributed hosted runtime wins over the fallback;
- an extension `Remove` remains authoritative;
- a host without the MCP extension retains the legacy Apps endpoint and
current local-only behavior.

The temporary `legacy_apps_mcp_loader_enabled` coordination flag is no
longer needed.

### Remove the path override

The `apps_mcp_path_override` feature and its runtime plumbing are
removed, including:

- the feature registry entry and structured feature config;
- `Config` and `McpConfig` fields;
- config schema output;
- config-lock materialization;
- URL override handling in `codex-mcp`.

Existing boolean and structured forms still deserialize as ignored
compatibility input. They are omitted from new serialized config, and
config-lock comparison normalizes the removed input so older locks
remain replayable.

### App-server coverage

App-server MCP fixtures now serve the hosted route at
`/api/codex/ps/mcp`. Existing resource-read and tool/elicitation flows
therefore exercise the extension-owned endpoint rather than succeeding
through the legacy fallback.

The stack also adds the missing `codex_chatgpt::connectors` re-export
for the manager-backed connector helper introduced in #27191.

## Compatibility

- App-server installs the extension and uses `/ps/mcp` for the hosted
runtime.
- CLI and other hosts that do not install the extension retain the
legacy Apps endpoint.
- Apps disabled or non-ChatGPT authentication removes `codex_apps` from
the effective runtime view.
- Existing local plugins, local skills, executor-selected skills,
configured MCP servers, and MCP OAuth behavior are otherwise unchanged.
- Backend plugin enablement remains account/workspace state owned by the
hosted endpoint; this PR does not add thread-local backend plugin
selection.

## Architectural fit

The stack now proves two independent runtime shapes:

1. #27184 resolves filesystem-backed skills through the executor that
owns a selected root.
2. #27191 and this PR resolve a backend-hosted HTTP MCP through an
extension with no executor.

Together they preserve the intended separation:

- selection identifies a plugin/root when explicit selection is needed;
- each component's owning extension resolves its concrete access
mechanism;
- execution stays with the runtime required by that component;
- existing skills, MCP, connector, and hook subsystems remain the
downstream consumers.

## Planned follow-ups

1. **Executor stdio MCP:** selecting an executor plugin registers a
manifest-declared stdio MCP server and executes it in the environment
that owns the plugin.
2. **Optional backend selection:** only if CCA needs thread-local
selection distinct from backend account/workspace enablement, add a
concrete backend-owned capability location and surface those selected
skills through the skills catalog.
3. **Connector metadata and hooks:** activate those plugin components
through their existing owning subsystems, with executor hooks remaining
environment-bound.
4. **Propagation and persistence:** define explicit resume, fork,
subagent, refresh, and environment-removal semantics once selected roots
have multiple real consumers.
5. **Local convergence:** migrate legacy local skill, MCP, connector,
and hook paths behind their owning extensions one vertical at a time,
then remove duplicate core managers and compatibility plumbing after
parity.

## Verification

Coverage in this change exercises:

- extension-owned `/backend-api/ps/mcp` registration without an
executor;
- preservation of the legacy endpoint in hosts without the extension;
- extension `Set` and `Remove` precedence over the legacy fallback;
- ChatGPT-auth gating for the reserved server;
- hosted MCP resource reads with and without an active thread;
- connector tool invocation and MCP elicitation through the hosted
route;
- ignored boolean and structured forms of the removed path override;
- config-lock replay compatibility for the removed feature.

`cargo check -p codex-features -p codex-mcp-extension -p
codex-app-server` passes. Tests and Clippy were not run locally under
the current development instruction; CI provides the full validation
pass.

jif · 2026-06-10 12:54:21 +02:00

9cd11e9e62

Route hosted Apps MCP through extensions (#27191 )

## Stack

- Base: #27184
- This PR is the second vertical and should be reviewed against
`jif/external-plugins-1`, not `main`.

## Why

CCA is moving toward a split runtime where the orchestrator may have no
filesystem or executor, but it still needs to activate remotely hosted
plugin components. HTTP MCP servers are the simplest complete example:
they need configuration and host authentication, but they do not need an
executor process.

The Apps MCP endpoint is currently synthesized by a special-purpose
loader inside the MCP runtime. That works locally, but it leaves hosted
MCP activation outside the extension model being established in #27184.
It also makes the Apps path a poor foundation for plugins whose skills,
MCP servers, connectors, and hooks may come from different sources or
execute in different places.

This PR moves that one behavior behind an extension-owned contribution
while preserving the existing local fallback. It deliberately does not
introduce a generic plugin activation framework.

## What changed

### MCP extension contribution

`codex-extension-api` gains an ordered `McpServerContributor` contract.
A contributor returns typed `Set` or `Remove` overlays for MCP server
configuration; later contributors win for the names they own.

The contract stays at the existing MCP configuration boundary.
Extensions do not create a second connection manager or transport
abstraction.

### Hosted Apps MCP extension

A new `codex-mcp-extension` contributes the reserved `codex_apps` server
from the existing Apps feature, ChatGPT base URL, path override, and
product SKU configuration.

When `apps_mcp_path_override` is enabled for `https://chatgpt.com`, the
resulting streamable HTTP endpoint is
`https://chatgpt.com/backend-api/ps/mcp`. The existing ChatGPT-auth gate
remains authoritative, so this server can run in an orchestrator-only
process without being exposed for API-key sessions.

### One resolved runtime view

`McpManager` now distinguishes three views:

- **configured:** config- and plugin-backed servers before extension
overlays;
- **runtime:** configured servers plus host-installed extension
contributions;
- **effective:** runtime servers after auth gating and compatibility
built-ins.

App-server installs the hosted MCP extension and uses the runtime view
for thread startup, refresh, status, threadless resource reads,
connector discovery, and MCP OAuth lookup. This keeps
`mcpServer/oauth/login` consistent with the servers exposed by the other
MCP APIs. The hosted Apps server itself continues to use existing
ChatGPT host authentication rather than MCP OAuth.

## Compatibility

Hosts that do not install the MCP extension retain the existing Apps MCP
synthesis path. This preserves current local-only, CLI, and
standalone-host behavior while app-server exercises the extension path.

Disabling Apps removes the reserved `codex_apps` entry, and losing
ChatGPT auth removes it from the effective runtime view. Executor
availability is not consulted for this HTTP transport.

## Follow-ups

The next vertical will resolve a manifest-declared stdio MCP server from
an executor-selected plugin root and execute it in the environment that
owns that root. Later verticals can add backend-owned skills, connector
metadata, hooks, durable selection semantics, and incremental local
convergence without changing the component-specific runtime boundaries
introduced here.

## Verification

Focused coverage was added for:

- contributing the hosted Apps MCP at `/backend-api/ps/mcp` without an
executor;
- requiring ChatGPT auth in the effective runtime view;
- removing a reserved configured Apps server when the Apps feature is
disabled.

`cargo check -p codex-app-server -p codex-mcp-extension -p
codex-extension-api -p codex-mcp` passed. Tests and Clippy were not run
locally under the current development instruction; CI provides the full
validation pass.

jif · 2026-06-09 22:44:16 +02:00

4ec3b8eeea

[codex] Test extension API contracts (#26835 )

## Why

`codex-extension-api` defines contracts shared by extension crates and
their hosts, but it had no direct test suite. Host and feature tests
cover downstream behavior, while regressions in the API crate's own
typed state, registry ordering, and capability adapters could go
unnoticed.

## What

- Add public-surface integration tests for `ExtensionData`, including
concurrent initialization and poison recovery.
- Cover contributor registration order, approval short-circuiting, event
sink retention, no-op response injection, and closure-based agent
spawning.
- Add the test-only dependencies used by the suite.

## Validation

- `just test -p codex-extension-api`
- `just argument-comment-lint -p codex-extension-api`
- `just bazel-lock-check`

Adam Perry @ OpenAI · 2026-06-09 18:30:24 +00:00

99da697e4c

Load selected executor skills through extensions (#27184 )

## Why

CCA is moving toward a split runtime where the orchestrator may not have
a filesystem, while executors can expose preinstalled plugins and
skills. A thread therefore needs to select capabilities without asking
app-server or core to interpret executor-owned paths through the
orchestrator's filesystem.

The longer-term model is broader than executor skills:

- A plugin is a bundle of skills, MCP servers, connectors/apps, and
hooks.
- A plugin root can be local, executor-owned, or hosted by a backend.
- Components inside one plugin can use different access and execution
mechanisms. A skill may be read from a filesystem or through backend
tools; an HTTP MCP server can run without an executor; a stdio MCP
server or hook needs an execution environment.
- Core should carry generic extension initialization data. The extension
that owns a component should discover it, expose it to the model, and
invoke it through the appropriate runtime.

This PR establishes that architecture through one complete vertical:
selecting a root on an executor, discovering the skills beneath it,
exposing those skills to the model, and reading an explicitly invoked
`SKILL.md` through the same executor.

## Contract

`thread/start` gains an experimental `selectedCapabilityRoots` field:

```json
{
  "selectedCapabilityRoots": [
    {
      "id": "deploy-plugin@1",
      "location": {
        "type": "environment",
        "environmentId": "workspace",
        "path": "/opt/codex/plugins/deploy"
      }
    }
  ]
}
```

The root is intentionally not classified as a "plugin" or "skill" in the
API. It can point at a standalone skill, a directory containing several
skills, or a plugin containing skills and other components. This PR only
teaches the skills extension how to consume it; later extensions can
resolve MCP, connector, and hook components from the same selection.

The platform-supplied `id` is stable selection identity. The location
says which runtime owns the root and gives that runtime an opaque path.
App-server does not inspect or canonicalize the path.

## What changed

### Generic thread extension initialization

App-server converts selected roots into `ExtensionDataInit`. Core
carries that generic initialization value until the final thread ID is
known, then creates thread-scoped `ExtensionData` before lifecycle
contributors run.

This keeps `Session` and core independent of the capability-selection
contract. The initialization value is consumed during construction; it
is not retained as another long-lived `Session` field.

### Executor-backed skills

The skills extension now owns an `ExecutorSkillProvider` that:

- resolves the selected environment through `EnvironmentManager`
- discovers, canonicalizes, and reads skills through that environment's
`ExecutorFileSystem`
- contributes the bounded selected-skill catalog as stable developer
context
- reads an explicitly invoked skill body through the authority that
listed it
- warns when an environment or root is unavailable
- never falls back to the orchestrator filesystem for an executor-owned
root

Skill catalog and instruction fragments have hard byte bounds, which
also bound them below the 10K-token per-item context limit. If a
selected executor skill has the same name as a legacy local skill, the
executor selection owns that invocation and the local body is not
injected a second time.

Existing local and bundled skill loading remains in place. Omitting
`selectedCapabilityRoots` therefore preserves current local-only
behavior.

## Current semantics

- Only environment-owned locations are represented in this first
contract.
- Roots are resolved by the destination extension, not by app-server or
core.
- An unavailable executor or invalid root produces a warning and no
capabilities from that root; it does not trigger a local-filesystem
fallback.
- Selection applies to a newly started active thread.
- MCP servers, connectors, and hooks beneath a selected plugin root are
not activated yet.
- Selection is not yet persisted or inherited across resume, fork, or
subagent creation. Existing local capabilities continue to behave as
they do today in those flows.

## Planned vertical follow-ups

1. **Hosted HTTP MCP:** add an extension-backed HTTP MCP source that
works without an executor, then replace the special-purpose MCP plugins
loader with that implementation.
2. **Executor MCP:** register and execute stdio MCP servers through the
environment that owns the selected plugin root.
3. **Backend skills:** add a hosted skill source whose catalog and
bodies are accessed through extension tools rather than a filesystem.
4. **Connectors and hooks:** activate those components through their
owning extensions, using the same selected-root boundary and
component-specific runtime.
5. **Durable selection:** define the desired-selection lifecycle,
persist it, and make resume, fork, and subagent inheritance explicit
rather than accidental.
6. **Local convergence:** incrementally route existing local plugin,
skill, and MCP loading through the same extension model while preserving
current local behavior.

Each follow-up remains reviewable as an end-to-end capability. The
platform selects roots, generic thread extension data carries the
selection, and the owning extension resolves and operates its component.

## Verification

Coverage added for:

- app-server end-to-end discovery and explicit invocation of a skill
inside an executor-selected plugin root
- exclusive invocation when a selected executor skill collides with a
local skill name
- executor filesystem authority for discovery, canonicalization, and
reads
- thread extension initialization before lifecycle contributors run
- stable executor catalog context, explicit invocation, context
rebuilding, hidden skills, and preserved host/remote catalog behavior

Targeted protocol, core-skills, skills-extension, core lifecycle, and
app-server executor-skill tests were run during development.

jif · 2026-06-09 19:51:54 +02:00

89ac3ec27c

Bridge host-loaded skills into the skills extension (#26172 )

## Why

The skills extension needs to become the path that exposes local host
skills without losing the behavior already owned by core skill loading.
Host skill discovery is not just `$CODEX_HOME/skills`: it also includes
config layers, bundled-skill settings, plugin roots, runtime extra
roots, and the filesystem for the selected primary environment.

Rather than making the extension reload host skills and risk drifting
from that authoritative load, this PR bridges the already-loaded
per-turn skills outcome into the extension. That lets the extension
advertise host skills and inject explicit `$skill` prompts while
preserving the same roots, disabled/hidden state, rendered paths, and
environment-backed file reads that the legacy path uses.

## What Changed

- Adds `HostLoadedSkills` in `core-skills` to wrap the turn's
`SkillLoadOutcome` and read `SKILL.md` through the filesystem that
loaded that skill.
- Stores `HostLoadedSkills` in turn extension data for normal turns and
review turns, so the skills extension can consume the loaded host
catalog without reloading it.
- Adds `HostSkillProvider` under `ext/skills/src/provider/host.rs`,
mapping host-loaded skill metadata into the skills-extension
catalog/read contract.
- Registers the host provider by default from
`codex_skills_extension::install()`.
- Preserves host skill metadata such as dependencies, disabled state,
hidden-from-prompt policy, and slash-normalized display paths.
- Passes host-loaded skills through `SkillListQuery` and
`SkillReadRequest` so explicit skill invocation reads only resources
from the loaded host catalog.
- Adds integration coverage for a real legacy
`$CODEX_HOME/skills/.../SKILL.md` skill being listed and injected
through the installed extension.

## Testing

- Added `installed_extension_loads_host_skills_from_legacy_roots` in
`ext/skills/tests/skills_extension.rs`.
- `just test -p codex-skills-extension`

jif · 2026-06-04 15:28:06 +02:00

d46a98d31a

skills: resolve per-turn catalogs from turn input context (#26106 )

## Why

The skills extension needs the resolved turn environments to build a
real per-turn `SkillListQuery`. The previous `TurnLifecycleContributor`
hook only had a turn id, so it could only seed a placeholder query and
never carry the executor authorities that executor-scoped skill routing
will need.

Moving catalog resolution onto `TurnInputContributor` puts the skills
extension on the same turn-preparation path that already has the
environment ids and working directories for the submitted turn, while
keeping the actual prompt injection work for follow-up changes.

## What changed

- switch `ext/skills` from `TurnLifecycleContributor` to
`TurnInputContributor`
- build `executor_authorities` from `TurnInputContext.environments` and
pass them through `SkillListQuery`
- keep storing the resolved catalog in `SkillsTurnState`, but drop the
placeholder query helper that no longer matches the real data flow
- update the extension TODOs to reflect that per-turn catalog resolution
now happens in the turn-input contributor, and that prompt/context
injection still needs to move later

## Testing

- Not run locally.

jif · 2026-06-03 13:32:55 +02:00

3389fa554e

feat: add extension turn-input contributors (#25959 )

## Disclaimer
Do not use for now

## Why

Extensions can already contribute prompt fragments and request same-turn
item injection, but there was no host-owned hook for contributing
structured `ResponseItem`s while Codex is assembling a new turn's
initial model input. This change adds that seam so extensions can attach
turn-local input that depends on the submitted user input and resolved
turn environments without routing through prompt text or late injection.

## What changed

- add `TurnInputContributor` to `codex_extension_api` and export the new
`TurnInputContext` / `TurnInputEnvironment` types it receives
- teach `ExtensionRegistry` to register and expose turn-input
contributors alongside the existing extension hooks
- call registered turn-input contributors from
`core/src/session/turn.rs` while building the initial injected input for
a turn, then append their returned `ResponseItem`s after the skill and
plugin injections

jif · 2026-06-03 01:33:31 +02:00

271d5cecf2

Route standalone image generation through host finalization md (#25176 )

## Why

Standalone image-generation extensions emitted turn items through the
low-level event path, bypassing host-owned finalization such as image
persistence and contributor processing. At the same time, the
generated-image save-path hint must remain visible to the model through
the extension tool's `FunctionCallOutput`, rather than the legacy
built-in developer-message path.

## What changed

- Extended `ExtensionTurnItem` to support image-generation items while
keeping the extension-facing emitter API limited to `emit_started` and
`emit_completed`.
- Routed extension completion through core `finalize_turn_item`, so
standalone image-generation items receive host-owned processing and
persisted `saved_path` values before publication.
- Kept legacy built-in image generation on its existing
developer-message hint path, while standalone image generation returns
its deterministic saved-path hint in `FunctionCallOutput`.
- Shared the image artifact path and output-hint formatting used by core
and the image-generation extension.
- Passed thread identity through extension tool calls so standalone
image generation can construct the same intended artifact path as core.
- Added an app-server integration test covering real standalone image
generation, saved artifact publication, model-visible output hint
wiring, and absence of the legacy developer-message hint.

## Validation

- `just fmt`
- `just test -p codex-image-generation-extension`
- `just test -p codex-web-search-extension`
- `just test -p codex-goal-extension`
- `just test -p codex-memories-extension`
- Targeted `codex-core` tests for image save history, extension
completion finalization, and contributor execution
- `just test -p codex-app-server
standalone_image_generation_returns_saved_path_hint_to_model`
- `just fix -p codex-core`
- `just fix -p codex-image-generation-extension`
- `just bazel-lock-update`
- `just bazel-lock-check`

Won Park · 2026-06-02 12:00:04 -07:00

57f337a8e9

Route extension image generation through the native image completion pipeline (#24972 )

## Why

The standalone `image_gen.imagegen` extension should behave like native
image generation for artifact persistence and UI completion, while
returning its save-location guidance as part of the tool result instead
of injecting a developer message.

## What Changed

- Added an image-generation completion hook for extension tools so core
can persist generated images and emit the existing `ImageGeneration`
lifecycle events.
- Reused core image artifact persistence for extension output and
removed extension-local save-path/file-writing logic.
- Split shared image persistence from built-in finalization so native
image generation keeps its existing developer-message instruction
behavior.
- Returned the generated image save-location instruction through the
extension `FunctionCallOutput`, alongside the generated image input for
model follow-up.
- Preserved the existing image-generation event shape for current UI and
replay compatibility.
- Avoided cloning the full generated-image base64 payload when emitting
the in-progress image item.
- Removed dependencies no longer needed after moving persistence out of
the extension crate.

## Fast Follow
- Adjust the existing Extension API and add a general `TurnItem`
finalization path for re-usability of code

## Validation

- Ran `just fmt`.
- Ran `just bazel-lock-update`.
- Ran `just bazel-lock-check`.
- Ran `just test -p codex-tools -p codex-extension-api -p
codex-image-generation-extension`.
- Ran `just test -p codex-core
image_generation_publication_is_finalized_by_core`.
- Ran `just test -p codex-core
handle_output_item_done_records_image_save_history_message`.
- Ran `just fix -p codex-tools -p codex-extension-api -p codex-core -p
codex-image-generation-extension`.

Won Park · 2026-05-29 17:33:13 +00:00

10b0399034

extension-api: add TurnItemEmitter to tool calls (#24813 )

## Why
Extension-contributed tools need to emit visible turn items through
Codex's normal event and persistence pipeline.

## What
- Add `TurnItemEmitter` to extension `ToolCall`s and route the core
implementation through `Session::emit_turn_item_*`.
- Hold weak session and turn references so retained tool calls cannot
keep host state alive.
- Provide a no-op emitter for extension test callers.

## Test Plan
- `just test -p codex-core -E
'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`

---------

Co-authored-by: jif-oai <jif@openai.com>

sayan-oai · 2026-05-28 09:13:43 -07:00

2066874415

Add turn error lifecycle contributor (#24916 )

Summary
- Add TurnErrorInput and TurnLifecycleContributor::on_turn_error to the
extension API.
- Emit the turn-error lifecycle from core turn error paths, including
usage limit failures.
- Add direct lifecycle coverage for the emitted error facts and stores.

Tests
- just fmt
- git diff --check
- Not run: full tests or clippy (per instructions)

jif-oai · 2026-05-28 16:13:54 +02:00

e7d156eb08

Add thread start contributor facts (#24915 )

Summary: add session source and persistent-state availability to
ThreadStartInput; populate them from session init; update existing goal
test harness constructors. Tests: just fmt; git diff --check. No full
tests or clippy run per request.

jif-oai · 2026-05-28 16:10:55 +02:00

ec803fe6c7

feat: add thread idle lifecycle hook (#24744 )

## Why

Extensions can currently observe thread start, resume, and stop, but
they do not have a lifecycle point for the host to say that immediately
pending thread work has drained. That makes idle follow-up behavior
harder to express as extension-owned logic instead of host-specific
plumbing.

This adds an explicit idle lifecycle hook so an extension can react when
a thread becomes idle while the host keeps ownership of whether any
submitted follow-up input starts a turn, is queued, or is ignored.

## What changed

- Added `ThreadIdleInput` with access to the session-scoped and
thread-scoped extension stores.
- Added a default `on_thread_idle` method to
`ThreadLifecycleContributor`.
- Re-exported `ThreadIdleInput` from the extension API surface.

## Testing

Not run; this only extends the extension API trait surface with a
default hook and exported input type.

jif-oai · 2026-05-27 15:17:23 +02:00

d2ebb8d8ca

fix: dont compact standalone websearch schema (#24660 )

add new `parse_tool_input_schema_without_compaction` to bypass the
existing compaction/trimming of client-provided tool schemas that are
over 4k bytes.

we want this for standalone web search to keep field guidance/metadata
on certain fields; this keeps us closer to parity with existing hosted
tool schema (which didnt go through this 4k byte filter).

sayan-oai · 2026-05-27 01:05:19 +00:00

9fe55d68e6

Expose conversation history to extension tools (#23963 )

## Why

Extension tools that need conversation context should be able to read it
from the live tool invocation instead of reaching into thread
persistence themselves.

## What changed

- Add a `ConversationHistory` snapshot to extension `ToolCall`s and
populate it from the current raw in-memory response history.
- Expose all history items at this boundary so each extension can filter
and bound the subset it needs before consuming or forwarding it.
- Cover the adapter and registry dispatch paths and update existing
extension tests that construct `ToolCall` literals.

## Test plan

- `cargo test -p codex-tools`
- `cargo test -p codex-extension-api`
- `cargo test -p codex-goal-extension`
- `cargo test -p codex-memories-extension`
- `cargo test -p codex-core passes_turn_fields_to_extension_call`
- `cargo test -p codex-core
extension_tool_executors_are_model_visible_and_dispatchable`

sayan-oai · 2026-05-22 01:11:47 +00:00

7e802b22f1

[codex] Steer budget-limited goal extension turns (#23718 )

## What
- Add a small extension capability for injecting model-visible response
items into the active turn
- Have the goal extension inject hidden goal-context steering when
tool-finish accounting reaches `BudgetLimited`
- Cover the extension backend path with an assertion on the injected
steering item

## Why
PR #23696 persists and emits the budget-limited goal update from
tool-finish accounting, but it leaves the model unaware of that
transition. The existing core runtime steers the model to wrap up in
this case; the extension path should do the same through an explicit
host capability.

## Testing
- `just fmt`
- `cargo test -p codex-goal-extension`
- `cargo test -p codex-extension-api`

jif-oai · 2026-05-21 12:54:00 +02:00

791b69dd53

feat: expose turn-start metadata to extensions (#23688 )

## Why

The goal extension needs more context when a turn starts than
`turn_store` alone provides.

In particular, goal accounting needs the stable turn id, the effective
collaboration mode, and the cumulative token-usage baseline captured at
turn start so it can:

- suppress goal accounting for plan-mode turns
- compute exact per-turn deltas from cumulative `total_token_usage`
snapshots instead of relying on the most recent usage event alone
- keep the extension-owned accounting path aligned with the host turn
lifecycle

## What

- extend `codex_extension_api::TurnStartInput` to expose `turn_id`,
`collaboration_mode`, and `token_usage_at_turn_start`
- pass the full `TurnContext` plus the captured token-usage baseline
through the turn-start lifecycle emission path
- initialize goal turn accounting from the turn-start baseline and
collaboration mode
- switch goal token accounting to compute deltas from cumulative
`total_token_usage` snapshots
- add coverage for the new turn-start lifecycle fields and for
goal-accounting baseline behavior

## Testing

- added `turn_start_lifecycle_exposes_turn_metadata_and_token_baseline`
in `codex-rs/core/src/session/tests.rs`
- added `ext/goal/tests/accounting.rs` coverage for baseline-aware goal
accounting and plan-mode suppression

jif-oai · 2026-05-20 15:54:29 +02:00

59507b8491

feat: async turn item process (#23692 )

Mechanical change

jif-oai · 2026-05-20 15:30:01 +02:00

1392a2a770

feat: async approval contrib (#23690 )

jif-oai · 2026-05-20 15:13:54 +02:00

f64fce61b3

Add tool lifecycle extension contributor (#23309 )

## Why

Extensions that need to track runtime progress currently have no typed
host signal for tool execution. The goal extension in particular needs
to observe tool attempts without inspecting tool payloads, owning tool
implementations, or staying coupled to core-only runtime plumbing.

This adds a narrow lifecycle contributor API for host-owned tool
execution: extensions can observe when an accepted tool call starts and
how it finishes, while policy hooks and tool handlers continue to own
payload rewriting, blocking, and execution.

Relevant code:

-
[`ToolLifecycleContributor`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors.rs#L119)
defines the extension-facing observer contract.
-
[`tool_lifecycle.rs`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors/tool_lifecycle.rs)
defines the typed start/finish inputs, source, and outcome enums.
- [`notify_tool_start` /
`notify_tool_finish`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/core/src/tools/lifecycle.rs)
bridges core tool dispatch into the extension registry.

## What Changed

- Added `ToolLifecycleContributor` to `codex-extension-api`, including:
  - `ToolStartInput`
  - `ToolFinishInput`
  - `ToolCallSource`
  - `ToolCallOutcome`
- Added registration and lookup support on `ExtensionRegistryBuilder` /
`ExtensionRegistry`.
- Wired core tool dispatch to notify lifecycle contributors for:
  - accepted tool starts
  - completed tool calls, including the tool output success marker
  - pre-tool-use blocks
  - failures before or after the handler runs
  - cancellation/abort in the parallel tool path
- Registered the goal extension as a lifecycle contributor and added the
outcome filter it will use for goal progress accounting.

## Test Coverage

- Added `dispatch_notifies_tool_lifecycle_contributors` to cover
lifecycle notification ordering and outcomes for successful and
handler-failed tool calls.

jif-oai · 2026-05-18 21:55:57 +02:00

c69cde3547

chore: make token usage async (#23305 )

Make the `TokenUsageContributor` async. This will be required for future
extension and it's basically free

jif-oai · 2026-05-18 15:59:06 +02:00

b631d92170

feat: add extension event sink capability (#23293 )

## Why

Extensions can already expose typed contributions and receive host
capabilities such as `AgentSpawner`, but they do not have a typed way to
send protocol events back through the host. Extensions that need to
surface progress or status should not have to own persistence, ordering,
transport fanout, or logging decisions themselves.

## What

- Add `ExtensionEventSink`, a host-provided fire-and-forget sink for
`codex_protocol::protocol::Event`.
- Add `NoopExtensionEventSink` so hosts that do not expose extension
event emission keep the existing empty-registry behavior.
- Store the sink on `ExtensionRegistryBuilder` / `ExtensionRegistry`,
with `with_event_sink(...)` and `event_sink()` accessors, and re-export
the new capability from `codex-extension-api`.

## Testing

- Not run locally; PR metadata/body update only.

jif-oai · 2026-05-18 14:08:56 +02:00

6a8173588c

Make extension lifecycle hooks async (#23291 )

## Why

Extension lifecycle hooks sit on the host/extension boundary, but the
current trait surface only allows synchronous callbacks. That forces
extensions that need to seed, rehydrate, observe, or flush
extension-owned state during thread and turn transitions to either block
inside the callback or move async work into separate host plumbing.

This PR makes those lifecycle callbacks awaitable so extension
implementations can perform async work directly at the lifecycle point
where the host already has the relevant session, thread, or turn stores
available.

## What changed

- Makes `ThreadLifecycleContributor` and `TurnLifecycleContributor`
async in `codex-extension-api`.
- Awaits thread start/resume/stop and turn start/stop/abort lifecycle
callbacks from `codex-core`.
- Updates the guardian and memories extensions to implement the async
lifecycle trait surface.
- Updates the existing lifecycle tests to use async contributor
implementations.
- Adds `async-trait` to the crates that now expose or implement these
async object-safe lifecycle traits.

## Testing

- Existing `codex-core` lifecycle tests were updated to cover async
implementations for thread stop and turn abort ordering.

jif-oai · 2026-05-18 13:53:58 +02:00

9531e932ef

Simplify tool executor and registry plumbing (#22636 )

## Why

The tool runtime path still had a typed output associated type on
`ToolExecutor`, plus a core-only `RegisteredTool` adapter and
extension-only executor aliases. That made every new shared tool runtime
carry extra adapter plumbing before it could participate in core
dispatch, extension tools, hook payloads, telemetry, and model-visible
spec generation.

This PR moves output erasure to the shared executor boundary so core and
extension tools can use the same execution contract directly.

## What Changed

- Changed `codex_tools::ToolExecutor` to return `Box<dyn ToolOutput>`
instead of an associated `Output` type.
- Removed the extension-specific `ExtensionToolExecutor` /
`ExtensionToolOutput` aliases and exposed `ToolExecutor<ToolCall>` plus
`ToolOutput` through `codex-extension-api`.
- Reworked core tool registration around `CoreToolRuntime` and
`ToolRegistry::from_tools`, removing the extra `RegisteredTool` /
`ToolRegistryBuilder` layer.
- Consolidated model-visible spec planning and registry construction in
`core/src/tools/spec_plan.rs`, including deferred tool search and
code-mode-only filtering.
- Added `ToolOutput` helpers for post-tool-use hook ids and inputs so
MCP, unified exec, extension, and other boxed outputs preserve the same
hook payload behavior.
- Updated core handlers, memories tools, and the related
registry/spec/router tests to use the simplified contract.

## Test Coverage

- Updated coverage for tool spec planning, registry lookup, deferred
tool search registration, extension tool routing, post-tool-use hook
payloads, dispatch tracing, guardian output extraction, and memories
extension tool execution.

jif-oai · 2026-05-15 11:47:54 +02:00

6f1a01fbdd

feat: make ToolExecutor an async trait (#22560 )

## Why

`codex_tools::ToolExecutor` keeps a tool spec attached to its runtime
handler, but extension tools still carried a parallel
`ExtensionToolFuture` / `ExtensionToolExecutor` shape. That made
extension-owned tools look different from host tools even though
routing, registration, and execution need the same abstraction.

This PR makes the shared executor contract directly async and lets
extension tools implement it too, so host tools and extension tools can
move through the same registration path.

## What changed

- Changed `ToolExecutor::handle` to an `async fn` using `async-trait`,
and updated built-in tool handlers to implement the async trait
directly.
- Replaced the bespoke `ExtensionToolFuture` contract with a marker
`ExtensionToolExecutor` over `ToolExecutor<ToolCall, Output =
JsonToolOutput>`, re-exporting `ToolExecutor` from
`codex-extension-api`.
- Updated the memories extension tools to implement the shared executor
trait.
- Split tool-router construction into collected executors plus hosted
model specs, keeping hosted tools like web search and image generation
separate from executable handlers.
- Updated spec/router tests and extension-tool stubs for the new
executor shape.

## Verification

- Not run locally.

jif-oai · 2026-05-14 11:23:57 +02:00

6d65686313

fix: main (#22503 )

Fix main due to conflicting merge

jif-oai · 2026-05-13 17:28:37 +02:00

441c2f818f

feat: add config-change extension contributor (#22488 )

## Why

Extensions can observe thread and turn lifecycle events today, but there
was no single host-owned hook for changes to the effective thread
configuration. That makes features that need to react to model,
permission, or tool-suggest updates either depend on individual mutation
paths or risk going stale after runtime config refreshes.

This adds a typed config-change contributor so extension-owned state can
stay synchronized with the effective thread config while the host
remains responsible for deciding when config changed.

## What Changed

- Added `ConfigContributor<C>` to `codex_extension_api`, with
before/after immutable snapshots of the effective config plus
session/thread extension stores.
- Added registry builder/accessor support through `config_contributor`
and `config_contributors`.
- Emits config-change callbacks after committed updates from session
settings, per-turn setting updates, and `refresh_runtime_config`.
- Builds effective config snapshots only when config contributors are
registered, and suppresses no-op callbacks when the before/after
snapshots are equal.
- Added a core session regression test that verifies contributors
observe both model changes and user-layer runtime config changes,
including access to session and thread extension stores.

## Validation

Added `config_change_contributor_observes_effective_config_changes` in
`codex-rs/core/src/session/tests.rs` to cover the new contributor path.

jif-oai · 2026-05-13 17:13:34 +02:00

34bb85519f

Make context contributors async (#22491 )

## Summary
- make ContextContributor return a boxed Send future
- await context contributors during initial context assembly
- update existing contributors and extension-api examples for the async
contract

## Testing
- cargo test -p codex-extension-api --examples
- cargo test -p codex-git-attribution
- cargo test -p codex-core
build_initial_context_includes_git_attribution_from_extensions --
--nocapture
- cargo test -p codex-core
build_initial_context_omits_git_attribution_when_feature_is_disabled --
--nocapture
- cargo test -p codex-core (fails in unrelated
agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns
stack overflow)
- just fix -p codex-extension-api
- just fix -p codex-git-attribution
- just fix -p codex-core
- cargo clippy -p codex-extension-api --examples

jif-oai · 2026-05-13 16:43:28 +02:00

68e045a631

feat: move extension scope ids into ExtensionData (#22490 )

## Summary
- add a scoped level_id to ExtensionData and expose it through
level_id()
- remove thread_id/turn_id parameters from extension contributor inputs
where the scoped ExtensionData already carries that identity
- move turn-scoped extension data onto TurnContext so token usage and
lifecycle contributors can share the same turn store

## Testing
- cargo check -p codex-extension-api -p codex-core --tests
- cargo test -p codex-extension-api
- cargo test -p codex-guardian
- cargo test -p codex-core --lib
record_token_usage_info_notifies_extension_contributors
- cargo test -p codex-core --lib
submission_loop_channel_close_emits_thread_stop_lifecycle
- cargo test -p codex-core --lib
submission_loop_channel_close_aborts_active_turn_before_thread_stop_lifecycle
- just fix -p codex-extension-api
- just fix -p codex-guardian
- just fix -p codex-core
- just fmt

## Note
- Attempted cargo test -p codex-core; it aborted in
agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns
with the existing stack overflow before the full suite completed.

jif-oai · 2026-05-13 16:13:16 +02:00

1dcc89f1d4

feat: add token usage contributor hook (#22485 )

## Why

Extensions need a stable place to observe token accounting after Codex
folds model-provider usage into the session's cached `TokenUsageInfo`.
Without a contributor hook, extension-owned features that need last-turn
or cumulative token usage have to duplicate session plumbing or infer
state from client-facing `TokenCount` notifications.

## What changed

- Added `TokenUsageContributor` to `codex-extension-api`, passing
session/thread `ExtensionData`, `ThreadId`, turn id, and the current
`TokenUsageInfo`.
- Added registry builder/storage support for token-usage contributors.
- Invoked registered contributors from
`Session::record_token_usage_info` after the session token cache is
updated and before the client `TokenCount` notification is emitted.

## Testing

- Added `record_token_usage_info_notifies_extension_contributors`,
covering cumulative token usage updates and access to both extension
stores.

jif-oai · 2026-05-13 14:32:23 +02:00

083c1962f9

feat: add turn lifecycle contributors (#22480 )

## Why

Extensions can already contribute prompt, tool, turn-item, and
thread-lifecycle behavior, but there was no explicit host-owned hook for
per-turn setup and cleanup. That makes extension-private turn state
awkward: an extension either has to stash it outside the turn lifecycle
or depend on core runtime objects.

This adds a small turn lifecycle boundary. Extensions receive stable
identifiers plus the existing session, thread, and turn `ExtensionData`
stores, while core keeps owning task scheduling, cancellation, and turn
teardown.

## What Changed

- Added `TurnLifecycleContributor` with `on_turn_start`, `on_turn_stop`,
and `on_turn_abort` callbacks in `codex-rs/ext/extension-api`.
- Added typed `TurnStartInput`, `TurnStopInput`, and `TurnAbortInput`
payloads that expose `thread_id`, `turn_id`, `session_store`,
`thread_store`, and `turn_store`.
- Registered and re-exported turn lifecycle contributors through
`ExtensionRegistry` and `ExtensionRegistryBuilder`.
- Wired `Session` to emit turn start, stop, and abort callbacks from the
existing turn/task lifecycle paths.
- Carried the turn-scoped `ExtensionData` through `RunningTask` and
`RemovedTask` so stop/abort callbacks receive the same turn store
created at turn start.

## Verification

- Not run locally.

jif-oai · 2026-05-13 13:47:27 +02:00

27e67a8c2a

feat: add thread lifecycle contributor hooks (#22476 )

## Why

Extensions that need thread-scoped state currently only get a start-time
callback. That is enough for seeding stores, but it leaves the host
without a shared extension seam for later thread rehydrate and flush
work as thread ownership evolves. This PR turns that start-only seam
into a host-owned thread lifecycle contributor contract so
extension-private state can stay behind the extension API instead of
leaking extra orchestration through core.

## What changed

- Replaced `ThreadStartContributor` with `ThreadLifecycleContributor`
and added typed lifecycle inputs for thread start, resume, and stop. The
contract lives in
[`contributors/thread_lifecycle.rs`](https://github.com/openai/codex/blob/d0e9211f70e58d6b07ef07e84f359d1b9aa25955/codex-rs/ext/extension-api/src/contributors/thread_lifecycle.rs#L1-L64).
- Kept the existing start-time behavior intact by routing session
construction through `on_thread_start`.
- Invoked `on_thread_stop` during session shutdown before thread-scoped
extension state is dropped, while isolating contributor failures behind
warning logs.
- Migrated `git-attribution` and `guardian` onto the lifecycle
registration path.
- Renamed the extension registry plumbing from start-specific
contributors to lifecycle-specific contributors.

## Notes

`on_thread_resume` is introduced at the API boundary here so extensions
can target the final lifecycle shape; host resume dispatch can be wired
where that runtime path is finalized.

jif-oai · 2026-05-13 13:11:30 +02:00

5ab7e6b4c6

Refactor extension tools onto shared ToolExecutor (#22369 )

## Why

Extension tools were split across two public runtime contracts:
`codex-tool-api` exposed `ToolBundle` plus its own call/spec/error
types, while core native tools used `codex_tools::ToolExecutor`. That
made contributed tool specs and execution behavior easy to drift apart
and added another crate boundary for what should be one executable-tool
seam.

This PR makes `ToolExecutor` the single runtime contract and keeps
extension-specific pinning in `codex-extension-api`.

## Remaining todo

https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5
Either generic `Invocation` or sub-extract the `ToolCall` and clean
`ToolInvocation`

## What changed

- Removed the `codex-tool-api` workspace crate and its dependencies from
core and `codex-extension-api`.
- Made `codex_tools::ToolExecutor` object-safe with `async_trait` so
extension contributors can return a dyn executor.
- Added the extension-facing aliases under
`ext/extension-api/src/contributors/tools.rs`, including
`ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output =
ExtensionToolOutput>`.
- Changed `ToolContributor::tools` to return extension executors
directly instead of `ToolBundle`s.
- Updated core’s extension tool handler/registry/router path to adapt
those extension executors into the existing native `ToolInvocation`
runtime path.
- Added focused coverage for extension tools being registered,
model-visible, dispatchable, and not replacing built-in tools.

## Verification

- `cargo test -p codex-tools`
- `cargo test -p codex-extension-api`

jif-oai · 2026-05-13 12:12:06 +02:00

9c5dfa7b1a

extension-api: add approval review contributor flow (#22344 )

## Why

`codex-extension-api` needs an approval hook that lets an installed
extension own a rendered approval-review prompt and produce the final
`ReviewDecision`. The prior interceptor stub only exposed a yes/no claim
and did not model the review result itself, which left the host with the
missing half of the control flow.

## What changed

- Replaces `ApprovalInterceptorContributor` with
[`ApprovalReviewContributor`](https://github.com/openai/codex/blob/c49d17531e15057a373a9b17f410cafb6299d0c1/codex-rs/ext/extension-api/src/contributors.rs#L43-L55),
which may claim a rendered prompt and return an async `ReviewDecision`.
- Re-exports the new contributor and future types from `extension-api`.
- Adds registry support through `approval_review_contributor(...)` plus
[`ExtensionRegistry::approval_review(...)`](https://github.com/openai/codex/blob/c49d17531e15057a373a9b17f410cafb6299d0c1/codex-rs/ext/extension-api/src/registry.rs#L90-L101),
which returns the first installed contributor that claims the prompt.

jif-oai · 2026-05-13 10:39:12 +02:00

155c04ad40

feat: guardian as an extension (contributors part) (#22216 )

Part 1 of guardian as extension. This bind all the logic to spawn
another agent from an extension and it adds `ThreadId` in the start
thread collaborator

jif-oai · 2026-05-12 14:41:45 +02:00

d996f5366f

feat: wire extension tool bundles into core (#22147 )

## Why

This is the next narrow step toward moving concrete tool families out of
core. After #22138 introduced `codex-tool-api`, we still needed a real
end-to-end seam that lets an extension own an executable tool definition
once and have core install it without the temporary `extension-api`
wrapper or a dependency on `codex-tools`.

`codex-tool-api` is the small extension-facing execution contract, while
`codex-tools` still has a different job: host-side shared tool metadata
and planning logic that is not “run this contributed tool”, like spec
shaping, namespaces, discovery, code-mode augmentation, and
MCP/dynamic-to-Responses API conversion

## What changed

- Moved the shared leaf tool-spec and JSON Schema types into
`codex-tool-api`, so the executable contract now lives with
[`ToolBundle`](https://github.com/openai/codex/blob/c538758095337d4fe0a52a172363ccede4066bda/codex-rs/tool-api/src/bundle.rs#L19-L70).
- Replaced the temporary extension-side tool wrapper with direct
`ToolBundle` use in `codex-extension-api`.
- Taught core to collect contributed bundles, include them in spec
planning, register them through
[`ToolRegistryBuilder::register_tool_bundle`](https://github.com/openai/codex/blob/c538758095337d4fe0a52a172363ccede4066bda/codex-rs/core/src/tools/registry.rs#L653-L667),
and dispatch them through the existing router/runtime path.
- Added focused coverage for contributed tools becoming model-visible
and dispatchable, plus spec-planning coverage for contributed function
and freeform tools.

## Verification

- Added `extension_tool_bundles_are_model_visible_and_dispatchable` in
`core/src/tools/router_tests.rs`.
- Added spec-plan coverage in `core/src/tools/spec_plan_tests.rs` for
contributed extension bundles.

## Related

- Follow-up to #22138

jif-oai · 2026-05-11 16:42:29 +02:00

672cc1f669

52 Commits