codex

[codex] Use model metadata for skills usage instructions (#29740)

## Summary

- add a false-by-default `include_skills_usage_instructions` model
metadata field
- enable the field for the bundled `gpt-5.5` model metadata
- consume the metadata in both core and extension skill rendering
- remove hardcoded legacy-model matching and its marker plumbing

ani-oai · 2026-06-29 09:44:36 +09:00

6b5f5743b3

Preserve namespaces on custom tool calls (#30302)

## Summary

- Preserve the optional namespace on custom tool calls during response
deserialization and app-server replay.
- Use the namespaced tool identifier for streaming argument handling and
tool dispatch.
- Regenerate app-server protocol schemas.
- Add regression tests covering namespace serialization and routing.

## Testing

- Ran affected protocol and app-server test suites.
- Ran the full core test suite; two load-sensitive timing tests passed
when rerun individually.
- Ran Clippy and formatting checks.
- Verified with a local end-to-end app-server replay that the namespace
is preserved through the complete request/response flow.

nhamidi-oai · 2026-06-27 09:54:56 -07:00

328e95110c

[codex] allow CCA image generation and web search extensions (#29909)

## Summary

- allow the standalone image-generation and web-search extensions for
the actor-authorized provider shape used by CCA
- preserve builtin `image_generation` and `web_search` for older models
and existing flows
- keep ordinary non-OpenAI providers excluded from both extensions
- remove only the image extension's local managed-AuthManager
requirement that CCA cannot satisfy
- share actor-authorization detection through `ModelProviderInfo`
- keep Core tests focused on routing behavior and cover header-shape
edge cases in `model-provider-info`
- add a Responses Lite regression that verifies both
`image_gen.imagegen` and `web.run`

## Why

CCA uses a provider named `local` with `requires_openai_auth: false` and
a non-empty `x-openai-actor-authorization` header. Core accepts that
provider shape, but both extension provider-name gates rejected it;
image generation additionally required a Codex-managed login.

The standalone paths must coexist with existing builtin tools. New
Responses Lite models can receive `image_gen.imagegen` and `web.run`,
while older models continue using builtin tools.

## Impact

This enables both standalone extensions for CCA once installed
downstream, without removing or changing builtin-tool compatibility for
older models.

## Validation

- `just test -p codex-core
responses_lite_exposes_standalone_tools_for_actor_authorized_provider`
- `just test -p codex-core
responses_lite_uses_standalone_web_search_and_image_generation`
- `just test -p codex-core
hosted_tools_follow_provider_auth_model_and_config_gates`
- `just test -p codex-image-generation-extension`
- `just test -p codex-web-search-extension`
- `just test -p codex-model-provider-info`
- `just fmt`
- `git diff --check`

Won Park · 2026-06-25 18:34:35 -07:00

0d4351c1b8

Reinject missing World State fragments on resume (#30152)

## Why

World State restores its structured snapshot on resume so unchanged
sections do not have to be rendered again. That is safe only when the
model-visible fragment represented by the snapshot is still present in
retained history.

For selected executor skills, the failing selected-capability scenario
exposed this state:

```text
persisted World State: selected skill catalog is known
retained model history: selected skill catalog message is missing
next diff: unchanged, so emit nothing
```

The model resumes without being told about the selected skill catalog.

## What changed

World State contributions may now optionally describe the concrete
model-visible fragment that must remain in retained history.

When a persisted snapshot is present:

```text
matching retained fragment exists -> trust snapshot, emit nothing
matching retained fragment missing -> treat section as absent, render current state once
```

The skills extension uses this for non-empty selected-environment
catalogs by matching its exact rendered catalog body. Empty or hidden
catalogs do not require a fragment.
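
The resume decision above can be sketched as a small pure function. This is an illustration under stated assumptions, not the PR's code: `ResumeAction` and `reconcile_on_resume` are hypothetical names, and retained history is modelled as a plain list of message bodies.

```rust
/// Hypothetical outcome of reconciling one World State section on resume.
#[derive(Debug, PartialEq)]
enum ResumeAction {
    /// The persisted snapshot can be trusted; nothing is emitted.
    TrustSnapshot,
    /// The section is treated as absent, so the current state is rendered once.
    RenderCurrentState,
}

/// Decide what to do for a section that has a persisted snapshot.
///
/// `required_fragment` is the exact model-visible body the section expects to
/// find in retained history, or `None` when no fragment is required (for
/// example, an empty or hidden catalog).
fn reconcile_on_resume(
    required_fragment: Option<&str>,
    retained_history: &[&str],
) -> ResumeAction {
    match required_fragment {
        None => ResumeAction::TrustSnapshot,
        Some(body) if retained_history.iter().any(|msg| *msg == body) => {
            ResumeAction::TrustSnapshot
        }
        Some(_) => ResumeAction::RenderCurrentState,
    }
}
```

Matching on the exact body mirrors the skills extension's behavior described above; a looser match would risk trusting a snapshot whose fragment has since changed.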

## Scope

This does not clear or rebuild the whole World State baseline. It does
not change skill discovery, cache invalidation, environment
availability, or MCP runtime behavior. It only keeps a persisted section
snapshot and its retained model context consistent across resume/history
reconstruction.

## Coverage

A focused World State regression test verifies both sides:

- a missing retained fragment is rendered again
- a matching retained fragment avoids duplicate injection

jif · 2026-06-26 02:18:00 +01:00

723b23efd0

Project selected plugin runtime by environment availability (#30093)

## Why

Selected plugin metadata is stable, but MCP processes are live runtime
state. They need different lifetimes:

- the MCP extension caches manifest, MCP, and connector declarations for
each stable selected root;
- each model step projects that cached metadata through the roots that
resolved as ready for that exact step;
- the MCP manager is rebuilt only when that availability projection
changes.

This matches executor skills: both features consume the same resolved
step roots instead of inferring readiness from the turn's selected
environments.

## Behavior

```text
E1 not ready for this step
  -> no E1 MCP servers or connectors
  -> cached plugin metadata stays in ext/mcp

E1 becomes ready
  -> reuse cached metadata
  -> publish one MCP runtime containing E1 capabilities

same ready roots on the next step
  -> reuse the exact runtime; no rediscovery and no MCP restart

resume
  -> create new extension thread state and a new MCP runtime
```

All model-facing consumers use the same step snapshot:

```text
resolved selected roots
        |
        v
extension MCP/connector projection
        |
        v
{ MCP config, connector snapshot, MCP manager }
        |
        +-> advertise model tools
        +-> build app/connector tools
        +-> execute MCP calls
```

## Cache contract

The existing MCP extension owns a cache keyed by the full
`SelectedCapabilityRoot`:

```rust
let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
```

The cache lives with extension thread state. Environment availability
filters projection but does not invalidate metadata. Resume creates new
thread state. There is no file watcher or executor generation because
contents behind a stable environment/root are assumed stable.
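
A minimal sketch of this contract, assuming metadata for each root has already been cached. `ThreadMcpState`, `RootKey`, and `project_step` are hypothetical names for illustration, and the runtime is reduced to a rebuild counter.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for `SelectedCapabilityRoot`: environment id plus root URI.
type RootKey = (String, String);

/// Thread-scoped state: cached declarations plus the last published projection.
#[derive(Default)]
struct ThreadMcpState {
    /// Declarations cached per selected root; availability never invalidates them.
    metadata: HashMap<RootKey, Vec<String>>,
    /// Ready-root projection that the current runtime was built from.
    current_projection: Option<BTreeSet<RootKey>>,
    /// Counts runtime rebuilds so that reuse is observable.
    rebuilds: u32,
}

impl ThreadMcpState {
    /// Project cached metadata through the roots that are ready for this step
    /// and return the server names visible to the model.
    fn project_step(&mut self, ready_roots: &[RootKey]) -> Vec<String> {
        let projection: BTreeSet<RootKey> = ready_roots
            .iter()
            .filter(|root| self.metadata.contains_key(*root))
            .cloned()
            .collect();
        // Rebuild the runtime only when the availability projection changed.
        if self.current_projection.as_ref() != Some(&projection) {
            self.current_projection = Some(projection.clone());
            self.rebuilds += 1;
        }
        projection
            .iter()
            .flat_map(|root| self.metadata[root].iter().cloned())
            .collect()
    }
}
```

Dropping the whole `ThreadMcpState` value corresponds to resume creating new thread state; nothing else clears the cache in this sketch.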

## What changes

- Keeps executor plugin discovery and cached metadata in `ext/mcp`.
- Caches MCP and connector declarations together per selected root.
- Uses the step's already-resolved capability roots, including lazy
environments that are not turn environments.
- Reuses the current MCP runtime when the ready-root projection is
unchanged.
- Uses the same step MCP manager and connector snapshot for
model-visible tools and execution.
- Resolves direct thread-scoped MCP requests from the current
selected-root projection.

## Deliberately out of scope

- `app/list` remains based on the latest global host-plugin state; this
PR does not make its response or notifications thread-specific.
- `required = true` startup semantics do not apply to delayed executor
MCP activation.
- No filesystem/content invalidation.
- No transport-disconnect watcher.
- No executor generations or environment replacement semantics.
- No client sharing across complete manager replacements.

## Stack

1. Extension-owned World State sections.
2. Project executor skills through World State.
3. Pin one MCP runtime to each model step.
4. **This PR:** project selected MCP and connector state from
extension-owned metadata.
5. Integration coverage for selected capability availability and resume.

## Verification

- `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
- The stacked integration PR covers unavailable-to-ready activation,
unchanged-runtime reuse, skills, MCP tools, connector attribution, and
cold resume.

jif · 2026-06-26 01:36:44 +01:00

3095ea9c3d

Project executor skills through World State (#30088)

## Why

A selected executor environment can be unavailable in one model step and
ready in the next. The model should see its skills only while that
environment is ready, without rescanning stable files on every sample.

The product assumption is simple:

- an environment ID names one stable logical environment;
- the selected root contents do not change during the thread.

## Behavior

```text
E1 unavailable -> do not show E1 skills
E1 ready       -> discover once, cache, show through World State
E1 unavailable -> hide skills, keep cache
E1 ready again -> reuse cache, show skills again
resume         -> create a new thread cache and discover again
```

The cache key is the full `SelectedCapabilityRoot`. Availability does
not invalidate it; dropping the extension's thread state does.
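
The discover-once behavior can be sketched as follows. This is an illustration, not the extension's code: `ThreadSkillCache` and `visible_skills` are hypothetical names, the root key is simplified to a string, and the `discover` closure stands in for the executor filesystem scan.

```rust
use std::collections::HashMap;

/// Hypothetical per-thread skill cache keyed by a stand-in for `SelectedCapabilityRoot`.
#[derive(Default)]
struct ThreadSkillCache {
    catalogs: HashMap<String, Vec<String>>,
    /// Number of discovery scans performed, kept for illustration only.
    discoveries: u32,
}

impl ThreadSkillCache {
    /// Return the skills visible for this step. `discover` runs at most once
    /// per root for the lifetime of this cache.
    fn visible_skills<F>(&mut self, root: &str, ready: bool, discover: F) -> Vec<String>
    where
        F: FnOnce() -> Vec<String>,
    {
        if !ready {
            // Unavailable: hide the skills but keep any cached catalog.
            return Vec::new();
        }
        if !self.catalogs.contains_key(root) {
            self.discoveries += 1;
            self.catalogs.insert(root.to_string(), discover());
        }
        self.catalogs[root].clone()
    }
}
```

Creating a fresh `ThreadSkillCache` models resume: the new thread state starts empty and discovers again.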

The step supplies the ready selected roots directly. They do not have to
be turn environments:

```text
turn environment: laptop
selected root:    worker:/plugins/lint-fix

worker ready -> lint-fix skills are visible
```

## What changes

- Keeps executor skill catalogs in the existing skills extension.
- Passes the roots resolved as ready for the step into World State
contributors.
- Loads each ready selected root at most once per thread.
- Contributes the executor catalog as the `skills` World State section.
- Uses the exact step catalog for explicit skill selection and body
reads.
- Leaves host and orchestrator skill behavior where it already lives.

Taking a step snapshot itself does not add an RPC. Executor filesystem
calls happen only on the first discovery of a stable root for that
thread.

## What does not change

- No filesystem watcher or content-based invalidation.
- No retry/generation framework.
- No skill runtime migration into core.
- No general rewrite of the skills extension.

## Stack

1. Extension-owned World State sections.
2. **This PR:** project cached executor skills through World State.
3. Pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment
availability.
5. One end-to-end integration scenario.

jif · 2026-06-26 00:13:43 +01:00

5eebeb8169

Let extensions contribute World State sections (#30100)

## Why

#29856 already owns the durable thread intent and exact environment
binding. This PR adds only the small missing extension boundary: an
extension can contribute one named World State section, while core still
owns persistence, diffing, and model-visible fragment types.

This lets skills stay in the skills extension instead of moving their
runtime into core.

## Shape

```text
extension-owned state
        |
        | contribute section id + JSON snapshot + renderer
        v
core World State
        |
        | compare with the previous snapshot
        v
no message, or one incremental model-visible update
```

The extension API is deliberately small:

```rust
fn contribute_world_state(...) -> Vec<WorldStateSectionContribution>
```

Core adapts the rendered result to `ContextualUserFragment`, records the
snapshot, and keeps the existing compaction/resume behavior.
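
A minimal sketch of the compare-then-emit step, assuming a much simpler shape than the real API: `SectionContribution` and `WorldState::apply` are hypothetical, the snapshot is a list of strings rather than JSON, and the renderer is inlined as a format string.

```rust
use std::collections::BTreeMap;

/// Hypothetical simplified contribution: a section id plus its snapshot.
struct SectionContribution {
    id: &'static str,
    snapshot: Vec<String>,
}

/// Stand-in for the core side, which remembers the previous snapshot per section.
#[derive(Default)]
struct WorldState {
    previous: BTreeMap<&'static str, Vec<String>>,
}

impl WorldState {
    /// Record the contribution and return one incremental update, or `None`
    /// when the snapshot is unchanged.
    fn apply(&mut self, contribution: SectionContribution) -> Option<String> {
        let old = self.previous.get(contribution.id).cloned().unwrap_or_default();
        if old == contribution.snapshot {
            return None;
        }
        let added: Vec<&str> = contribution
            .snapshot
            .iter()
            .filter(|item| !old.contains(*item))
            .map(String::as_str)
            .collect();
        let removed: Vec<&str> = old
            .iter()
            .filter(|item| !contribution.snapshot.contains(*item))
            .map(String::as_str)
            .collect();
        let update = format!(
            "{}: +[{}] -[{}]",
            contribution.id,
            added.join(", "),
            removed.join(", ")
        );
        self.previous.insert(contribution.id, contribution.snapshot);
        Some(update)
    }
}
```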

## What changes

- Adds extension-owned World State section contributions.
- Calls those contributors from the existing per-step World State
builder.
- Restores durable selected capability roots into extension thread state
on resume.
- Keeps the actual model-context fragment and rollout machinery in core.

## What does not change

- No skill or MCP implementation moves out of its extension.
- No new file watcher, generation, or RPC.
- No generic migration of existing World State sections.
- No change to the stable environment-ID assumption from #29856.

## Example

```text
step 1 snapshot: skills = []
step 2 snapshot: skills = [executor-demo:deploy]

core asks the skills extension to render only that change.
```

## Stack

1. **This PR:** let extensions contribute World State sections.
2. Project executor skills through the skills extension.
3. Pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment
availability.
5. One end-to-end integration scenario.

jif · 2026-06-25 22:23:51 +01:00

c9e6d9783d

Support HTTP MCP servers from selected executor plugins (#28522)

## Why

Selected executor plugins can declare both stdio and Streamable HTTP MCP
servers, but only stdio registrations were retained. That silently drops
part of the plugin's tool surface and prevents HTTP traffic from using
the owning executor's network.

## What changed

- retain selected-plugin Streamable HTTP MCP declarations alongside
stdio declarations
- route their HTTP clients through the owning executor environment
- preserve local auth-header environment references while rejecting them
for executor-hosted declarations
- cover thread isolation, refresh, and an executor-only HTTP route end
to end

jif · 2026-06-25 10:10:36 +01:00

6368937939

Represent MCP authentication with an enum (#29924)

## Why

MCP authentication has distinct OAuth and ChatGPT-session flows.
Representing that choice as `use_chatgpt_auth` makes one flow implicit
and allows the configuration model to express the distinction only
through a boolean.

ChatGPT credential forwarding also needs a first-party trust boundary. A
configurable `chatgpt_base_url` controls routing, but must not grant an
MCP server permission to receive session credentials.

This change builds on #29733, where the boolean was introduced.

## What changed

- Replace `use_chatgpt_auth` with an `auth` field backed by the
exhaustive `McpServerAuth` enum.
- Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining
the default.
- Trust only the origin derived from the existing hardcoded
`CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
- Keep configured bearer tokens and authorization headers ahead of the
selected authentication flow.
- Update config writers, schema output, fixtures, and integration-test
setup to use the enum.
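
The enum and the precedence rule can be sketched without any config machinery. Only `McpServerAuth` and the two string values come from the description above; the variant names, `Credential`, `select_credential`, and the choice to send nothing for an untrusted `chatgpt` server are assumptions made for illustration.

```rust
use std::str::FromStr;

/// Sketch of an exhaustive auth selector; OAuth is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum McpServerAuth {
    #[default]
    OAuth,
    ChatGpt,
}

impl FromStr for McpServerAuth {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "oauth" => Ok(Self::OAuth),
            "chatgpt" => Ok(Self::ChatGpt),
            other => Err(format!("unknown MCP auth mode: {other}")),
        }
    }
}

/// Hypothetical resolved credential source for one server.
#[derive(Debug, PartialEq, Eq)]
enum Credential {
    ConfiguredHeader(String),
    ChatGptSession,
    OAuthFlow,
    NoCredential,
}

/// Configured authorization wins; ChatGPT session auth additionally requires
/// that the server sits on the trusted first-party origin.
fn select_credential(
    configured_header: Option<&str>,
    auth: McpServerAuth,
    is_trusted_origin: bool,
) -> Credential {
    if let Some(header) = configured_header {
        return Credential::ConfiguredHeader(header.to_string());
    }
    match auth {
        McpServerAuth::OAuth => Credential::OAuthFlow,
        McpServerAuth::ChatGpt if is_trusted_origin => Credential::ChatGptSession,
        McpServerAuth::ChatGpt => Credential::NoCredential,
    }
}
```

Because the `match` is exhaustive, adding a third flow later forces every call site to decide how to handle it, which is the main advantage over a boolean.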

## Verification

Integration coverage exercises the complete streamable HTTP startup path
in two independent configurations:

- A directly constructed MCP configuration verifies that matching an
overridden `chatgpt_base_url` does not grant ChatGPT auth.
- A persisted `config.toml` containing an attacker-controlled
`chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary
through normal config parsing.

Both tests complete MCP initialization and tool listing and assert that
the full captured request sequence contains no authorization headers.
Separate integration coverage verifies that configured authorization
takes precedence over ChatGPT auth.

Ahmed Ibrahim · 2026-06-24 19:51:51 -07:00

f8937b7d86

Allow ChatGPT-hosted MCP servers to use session auth (#29733)

## Why

ChatGPT session authentication was inferred from the reserved Codex Apps
server name. That couples credential routing to Codex Apps-specific
behavior and prevents other MCP endpoints hosted by ChatGPT from
explicitly using the current session.

The opt-in also needs a clear security boundary: an arbitrary MCP
configuration must not be able to redirect ChatGPT credentials to
another origin.

## What changed

- Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to
`false`.
- Honor the setting only when the parsed server URL has the same HTTP(S)
origin as the configured `chatgpt_base_url`; otherwise remove the
capability before startup.
- Resolve bearer tokens and static or environment-backed authorization
headers before selecting authentication, with configured authorization
taking precedence over ChatGPT session auth.
- Enable the setting for the built-in Codex Apps and hosted plugin
runtime endpoints while keeping Codex Apps caching and tool
normalization scoped to the reserved server.
- Persist the setting through MCP config rewrite paths and expose it in
the generated config schema.
- Load the current login state for `codex mcp list` so reported auth
status matches runtime behavior.
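
The same-origin rule can be sketched as follows. This is a deliberately simplified illustration, not the PR's implementation: `Origin`, `http_origin`, and `may_use_chatgpt_auth` are hypothetical names, the hand-written parser ignores IPv6 literals and percent-encoding, and production code would use a proper URL parser.

```rust
/// Origin triple: scheme, host, and effective port.
#[derive(Debug, PartialEq, Eq)]
struct Origin {
    scheme: String,
    host: String,
    port: u16,
}

/// Extract an HTTP(S) origin from a URL string, or `None` if it is not HTTP(S).
fn http_origin(url: &str) -> Option<Origin> {
    let (scheme, rest) = url.split_once("://")?;
    let scheme = scheme.to_ascii_lowercase();
    let default_port: u16 = match scheme.as_str() {
        "http" => 80,
        "https" => 443,
        _ => return None,
    };
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    // Drop userinfo so `https://a.example@b.example` resolves to `b.example`.
    let host_port = authority.rsplit('@').next()?;
    let (host, port) = match host_port.rsplit_once(':') {
        Some((host, port)) => (host, port.parse::<u16>().ok()?),
        None => (host_port, default_port),
    };
    if host.is_empty() {
        return None;
    }
    Some(Origin { scheme, host: host.to_ascii_lowercase(), port })
}

/// Honor the opt-in only when both URLs share the same HTTP(S) origin.
fn may_use_chatgpt_auth(opted_in: bool, server_url: &str, chatgpt_base_url: &str) -> bool {
    opted_in
        && match (http_origin(server_url), http_origin(chatgpt_base_url)) {
            (Some(server), Some(base)) => server == base,
            _ => false,
        }
}
```

Anything the parser cannot understand yields `None`, so the check fails closed and the capability is removed.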

## Verification

Core integration coverage exercises the complete streamable HTTP MCP
startup path and verifies that:

- a same-origin opted-in server receives the current ChatGPT access
token;
- an explicitly configured authorization header takes precedence;
- a different-origin server completes MCP initialization and tool
listing without receiving any ChatGPT authorization header.

Ahmed Ibrahim · 2026-06-24 19:21:28 -07:00

4c0706e24a

Read connector declarations from executor plugins (#29852)

## Why

Selected capability roots can live on a different executor and operating
system from app-server. Their connector declarations must therefore be
read through the executor that owns the package, without converting
executor URIs into host paths.

This PR adds that authority-bound reader without activating connectors
or changing thread startup.

## What changed

- Add a small `codex-connectors-extension` crate for executor-owned
connector I/O.
- Read only the app configuration explicitly declared by the resolved
plugin manifest.
- Read through the `ExecutorFileSystem` retained by
`ResolvedExecutorPlugin`; there is no host-filesystem fallback or
default-file probe.
- Keep `PathUri` values intact so Windows, Unix, and remote executor
paths work from any orchestrator OS.
- Return full `AppDeclaration` values so the caller retains declaration
names and categories for routing.
- Preserve the selected plugin ID and exact executor URI in read and
parse errors.

The contract is intentionally narrow: selected packages are assumed to be
trusted and valid, and packages that provide connectors explicitly
declare their app configuration.

## Stack scope

This PR is stacked on #29851. It only provides the executor-backed
reader. #29856 resolves selected roots at thread start, freezes their
connector snapshot, and contains the remote-capable end-to-end authority
test for the complete path.

jif · 2026-06-24 23:56:50 +01:00

9ff8068880

Keep executor plugin MCP paths URI-native (#29628)

## Why

Executor-owned plugin roots are `PathUri`, but MCP config normalization
still converts them into a native `Path` using the app-server host's
rules. Relative `cwd` values can therefore resolve against the wrong
filesystem when host and executor path conventions differ.

This PR keeps executor MCP paths URI-native until the selected
environment launches the server, while retaining the existing host
parser behavior.

## What changed

- Keep one shared MCP normalization path with narrow host-`Path` and
executor-`PathUri` entrypoints.
- Preserve native host resolution for locally installed plugin MCP
configs.
- For executor configs, default `cwd` to the plugin root and resolve
relative working directories with the root URI's path convention.
- Accept explicit executor `file:` URIs only when they remain within the
selected plugin root.
- Preserve the selected environment id and existing remote
environment-variable ownership rules.
- Route the executor plugin provider through the URI-native entrypoint
without converting the root on the host.
- Ensure `codex doctor` does not probe executor-owned stdio commands or
foreign working directories on the host.
- Cover foreign Windows roots, relative and absolute executor working
directories, traversal rejection, runtime resolution, and doctor
behavior.

```text
plugin root:    file:///C:/plugins/demo
configured cwd: scripts
                  |
                  v
resolved cwd:  file:///C:/plugins/demo/scripts
                  |
                  v
launch through the selected executor
```

No new provider or filesystem abstraction is introduced.
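
The working-directory resolution shown in the diagram can be sketched lexically on URI strings. This is an assumption-heavy illustration rather than the real `PathUri` API: `resolve_executor_cwd` and `has_drive_prefix` are hypothetical helpers, percent-encoding is ignored, and native absolute paths are simply rejected to keep the sketch short.

```rust
/// True when the text starts with a Windows drive prefix such as `C:`.
fn has_drive_prefix(text: &str) -> bool {
    let bytes = text.as_bytes();
    bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':'
}

/// Resolve a configured `cwd` against a plugin root `file:` URI without
/// touching the host filesystem.
fn resolve_executor_cwd(root_uri: &str, configured_cwd: Option<&str>) -> Result<String, String> {
    let root = root_uri.trim_end_matches('/');
    let cwd = match configured_cwd {
        // The default working directory is the plugin root itself.
        None => return Ok(root.to_string()),
        Some(cwd) => cwd,
    };
    if cwd.starts_with("file:") {
        // Explicit URIs are accepted only inside the selected plugin root.
        let candidate = cwd.trim_end_matches('/');
        let inside = candidate == root
            || candidate.strip_prefix(root).map_or(false, |rest| rest.starts_with('/'));
        let traverses = candidate.split('/').any(|segment| segment == "..");
        return if inside && !traverses {
            Ok(candidate.to_string())
        } else {
            Err(format!("cwd escapes plugin root: {cwd}"))
        };
    }
    // The root URI decides the path convention: `file:///C:/...` is Windows.
    let windows = root.strip_prefix("file:///").map_or(false, has_drive_prefix);
    if cwd.starts_with('/') || (windows && (cwd.starts_with('\\') || has_drive_prefix(cwd))) {
        return Err(format!("native absolute cwd is not handled in this sketch: {cwd}"));
    }
    let mut resolved: Vec<&str> = Vec::new();
    for segment in cwd.split(|c: char| c == '/' || (windows && c == '\\')) {
        match segment {
            "" | "." => {}
            ".." => {
                // Popping past the root would leave the plugin directory.
                if resolved.pop().is_none() {
                    return Err(format!("cwd escapes plugin root: {cwd}"));
                }
            }
            other => resolved.push(other),
        }
    }
    if resolved.is_empty() {
        Ok(root.to_string())
    } else {
        Ok(format!("{root}/{}", resolved.join("/")))
    }
}
```

Backslash is treated as a separator only under a Windows root, since it is an ordinary filename character under POSIX.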

## Stack

1. #29614 — add lexical `PathUri` containment.
2. #29620 — share URI-native manifest path resolution.
3. #28918 — keep selected plugin roots and resources URI-native.
4. #29626 — load executor skills without host path conversion.
5. **This PR** — resolve executor MCP working directories without host
path conversion.

jif · 2026-06-24 09:46:07 +01:00

3e39e92f03

Let image generation extension hosts control output persistence (#29711)

## Why

Some extension hosts need generated images returned without writing them
to the local filesystem or giving the model a local path.

## What changed

**tl;dr**: all handling of extension-generated images, including
persistence, now happens in the image generation extension.

- Let hosts provide an optional image save root when installing the
extension.
- Save images and return path hints only when a save root is configured.
- Return image data without saving or adding a path hint when no save
root is configured.
- Preserve the extension-provided `saved_path` instead of persisting
extension images again in core.
- Leave built-in image generation unchanged.

## Validation

- `just test -p codex-image-generation-extension`
- `just test -p codex-app-server
standalone_image_generation_returns_saved_path_hint_to_model`
- `just test -p codex-core
extension_tool_uses_granted_turn_permissions_without_local_persistence`
- `just test -p codex-core tools::handlers::extension_tools::tests`
- Tested in the Codex CLI with `save_root` set to both `CODEX_HOME` and `None`.
- Tested in the Codex app with both settings as well.

Won Park · 2026-06-23 18:51:49 -07:00

61f5a84930

Load executor skills without host path conversion (#29626)

## Why

After #28918, selected skill roots are `PathUri`, but the executor skill
provider still converts them to the app-server host's `AbsolutePathBuf`.
A foreign Windows root therefore cannot be discovered by a Unix host,
and the inverse has the same problem.

This PR keeps executor skill discovery and reads on the filesystem that
owns the selected root while reusing the existing skill rules.

## What changed

- Generalize the existing skill traversal to operate on `PathUri`
through `ExecutorFileSystem`, preserving its depth, directory, symlink,
and sibling-metadata concurrency behavior.
- Add a small environment skill loader that reuses the shared discovery,
frontmatter validation, dependency parsing, product policy, and
prompt-visibility rules.
- Keep the environment id and entrypoint `PathUri` in the skill catalog,
then route `skills.read` back through the same environment filesystem.
- Preserve the executor's path convention when deriving catalog handles,
including literal backslashes in POSIX filenames.
- Resolve plugin namespaces from nearby manifests through URI-native
filesystem reads.
- Cover foreign Windows roots, executor-owned reads, namespaces,
metadata, policy, and path identity.

```text
selected root (PathUri)
        |
        v
shared discovery over ExecutorFileSystem
        |
        v
environment-bound catalog entry --skills.read--> same ExecutorFileSystem
```

No second filesystem abstraction or duplicate traversal implementation
is introduced.

## Stack

1. #29614 — add lexical `PathUri` containment.
2. #29620 — share URI-native manifest path resolution.
3. #28918 — keep selected plugin roots and resources URI-native.
4. **This PR** — load executor skills without host path conversion.
5. #29628 — resolve executor MCP working directories without host path
conversion.

jif · 2026-06-23 23:26:06 +01:00

220f5b76b2

Make selected plugin roots URI-native (#28918)

## Why

Selected capability roots belong to the executor filesystem, not the
app-server host. Converting their path strings into the host's native
`Path` breaks whenever the two machines use different path conventions,
such as a Windows executor behind a Unix app-server.

This PR establishes `PathUri` as the selected-plugin boundary so the
executor remains authoritative for its paths.

## What changed

- Require `selectedCapabilityRoots[].location.path` to be a canonical
`file:` URI and deserialize it directly as `PathUri`; native path
strings are rejected.
- Update the app-server schema, generated TypeScript, examples, and
request coverage for the URI contract.
- Keep selected roots, resolved plugin locations, manifest paths, and
manifest resources as `PathUri`.
- Inspect and read plugin roots and manifests only through the selected
environment's `ExecutorFileSystem`.
- Parse executor manifests with the shared URI-native parser from #29620
instead of projecting them onto the host filesystem.
- Enforce resource containment lexically and preserve the root URI's
POSIX or Windows path convention.
- Cover foreign Windows plugin roots and URI-native manifest resources.

```text
thread/start
  selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo"
                              | PathUri
                              v
                    ExecutorFileSystem
                              |
                              +--> plugin.json
                              +--> manifest resources
```

This PR stops at the shared selected-plugin representation. The next two
PRs remove the remaining host-path projections in the skill and MCP
consumers.

## Stack

1. #29614 — add lexical `PathUri` containment.
2. #29620 — share URI-native manifest path resolution.
3. **This PR** — keep selected plugin roots and resources URI-native.
4. #29626 — load executor skills without host path conversion.
5. #29628 — resolve executor MCP working directories without host path
conversion.

jif · 2026-06-23 22:51:19 +01:00

2e69966cd8

[codex] Use input items for Responses Lite tools (#27946)

When using Responses Lite, we should always use `additional_tools` and a
developer item instead of the top-level `tools` array and `instructions`
field. This keeps things 1-to-1.

Forced namespacing for _all_ tools will land in a follow-up PR after
some coordination and fixes in the Responses API (around collisions and
return items).

The goal is to eventually expand this to _all_ requests from Codex, but
that will require broader coordination across providers and a slower
rollout.

rka-oai · 2026-06-22 23:56:16 -07:00

33cc928d33

mcp: accept foreign absolute cwd for remote stdio (#29493)

## Why

Remote stdio MCP servers can run in an environment whose path convention
differs from the Codex host. A Windows cwd such as
`C:\Users\openai\share` is absolute for the executor but was rejected by
a POSIX orchestrator.

Built on #29501, now merged, which only clarifies the host-native
`PathUri` constructor name.

## What changed

- Deserialize MCP cwd values as `LegacyAppPathString` so config does not
apply host path rules.
- Interpret that spelling as host-native for local launches and convert
it to `PathUri` at executor launch.
- Skip host filesystem and command resolution checks for remote stdio in
`codex doctor`.
- Add host-independent config and executor-boundary coverage using the
foreign path convention for each test platform.
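
To illustrate why host path rules cannot be applied at config time, the sketch below checks absoluteness per convention instead of per host. Both helpers are hypothetical and are not part of `LegacyAppPathString` or `PathUri`; they only show that the same spelling is absolute under one convention and not the other.

```rust
/// True when the path is absolute under Windows rules: a drive prefix
/// followed by a separator, or a UNC prefix.
fn is_windows_absolute(path: &str) -> bool {
    let bytes = path.as_bytes();
    let drive_rooted = bytes.len() >= 3
        && bytes[0].is_ascii_alphabetic()
        && bytes[1] == b':'
        && (bytes[2] == b'\\' || bytes[2] == b'/');
    drive_rooted || path.starts_with("\\\\")
}

/// True when the path is absolute under POSIX rules.
fn is_posix_absolute(path: &str) -> bool {
    path.starts_with('/')
}
```

A POSIX orchestrator that only asked the second question would reject `C:\Users\openai\share`, which is the failure this PR addresses by deferring interpretation to the launch boundary.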

## Validation

- `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p
codex-rmcp-client` (408 passed)
- `just test -p codex-cli -p codex-rmcp-client` (372 passed)
- `cargo check --workspace --tests`
- `just test` (11,311 passed; 43 unrelated environment/timing failures)
- `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p
codex-mcp-extension -p codex-rmcp-client -p codex-tui`

Adam Perry @ OpenAI · 2026-06-23 01:33:51 +00:00

67009bc53f

core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)

## Description

This PR cuts Codex over from generic `ResponseItem.metadata` (introduced
here: https://github.com/openai/codex/pull/28355) to
`ResponseItem.internal_chat_message_metadata_passthrough`, which is the
blessed path and has strongly-typed keys.

For now we have to drop this MAv2 usage of `metadata`:
https://github.com/openai/codex/pull/28561 until we figure out where
that should live.

Owen Lin · 2026-06-22 11:11:25 -07:00

5b95745eae

[codex] Preserve skill descriptions outside model context (#29006)

## Why

Skill descriptions are used in model-visible lists: the default
available-skills catalog that supports implicit selection, and the
on-demand `skills.list` tool response used to discover orchestrator
skills. A single overlong description should not consume a
disproportionate share of either list.

Enforcing the 1024-character limit while loading or migrating skills is
the wrong boundary: it rejects otherwise-valid skills and discards
metadata that non-model consumers and full skill reads may need. Skill
metadata and `SKILL.md` content should remain intact; the cap belongs at
model-visible list rendering boundaries.

## What changed

- Preserve full `description` and `metadata.short-description` values
when loading skills.
- Preserve full external-agent command descriptions during
`source-command-*` migration instead of skipping commands solely because
their descriptions exceed 1024 characters.
- Preserve full normalized orchestrator descriptions in the underlying
skills catalog.
- Cap each description at 1024 Unicode characters when rendering the
default available-skills context in `codex-core-skills` and
`codex-skills-extension`.
- Apply the same cap when serializing descriptions in the model-visible
`skills.list` response.
- Render truncated descriptions as 1021 original characters plus `...`.
- Leave explicit `$skill` injection, `skills.read`, underlying metadata,
and on-disk `SKILL.md` files unchanged and full-fidelity.
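
The rendering-time cap can be sketched in a few lines. `cap_description` and the constant name are hypothetical, and the sketch assumes "Unicode characters" means Rust `char` values (Unicode scalar values) rather than grapheme clusters.

```rust
/// Maximum number of characters shown per description in model-visible lists.
const MAX_DESCRIPTION_CHARS: usize = 1024;

/// Return a capped copy for rendering; the stored description is untouched.
fn cap_description(description: &str) -> String {
    if description.chars().count() <= MAX_DESCRIPTION_CHARS {
        return description.to_string();
    }
    // Keep 1021 original characters and append `...` to reach exactly 1024.
    let keep = MAX_DESCRIPTION_CHARS - 3;
    let mut capped: String = description.chars().take(keep).collect();
    capped.push_str("...");
    capped
}
```

Counting characters instead of bytes keeps the cut on a character boundary, so multi-byte text is never split mid-character.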

## Implicit skill selection

Codex injects a bounded catalog containing each implicitly allowed
skill's name, description, and source locator, together with
instructions to use a skill when the task clearly matches its
description. The model makes that semantic choice; after selecting a
skill, it reads the full `SKILL.md` from its filesystem or provider
resource. Explicit `$skill` mentions remain a separate path that injects
the full skill instructions. For orchestrator skills, `skills.list`
provides bounded discovery metadata before `skills.read` returns the
full selected resource.

## Test plan

- `just test -p codex-core-skills`
- `just test -p codex-skills-extension`
- `just test -p codex-external-agent-migration`

The focused regressions verify that overlong metadata is preserved at
load and migration boundaries while default available-skills rendering
and `skills.list` output produce the 1021-character prefix plus `...`.

charlesgong-openai · 2026-06-19 12:47:53 -07:00

64bdeed9f7

[codex] trace pre-sampling skill and persistence latency (#29042)

rphilizaire-openai · 2026-06-19 10:13:27 -07:00

a57b268d61

Add config toggles for orchestrator skills and MCP (#28942)

## Why

Orchestrator-provided skills and Codex Apps MCP tools add model-visible
instructions, resources, and tools beyond the local workspace. Hosts
need config-level switches to disable those orchestrator-owned surfaces
independently, without disabling regular skills or regular MCP servers.

## What changed

- Adds `[orchestrator.skills].enabled` and `[orchestrator.mcp].enabled`
config entries, both defaulting to `true`.
- Includes the new settings in `config.schema.json` and in the config
lock so resolved thread configuration preserves the same orchestrator
exposure decisions.
- Threads `orchestrator.skills.enabled` through the app-server skills
extension so disabled orchestrator skills do not expose the `skills`
namespace or inject orchestrator skill context.
- Gates Codex Apps MCP exposure, app instructions, and app auth
eligibility on `orchestrator.mcp.enabled` while leaving non-Codex-Apps
MCP tools available.
- Updates the thread-manager sample config to disable both
orchestrator-owned surfaces.

## Verification

- Added config parsing, loading, defaulting, and schema coverage for the
new settings.
- Added MCP exposure coverage that `orchestrator.mcp.enabled = false`
removes Codex Apps tools while preserving regular MCP tools.
- Added app-server coverage that `orchestrator.skills.enabled = false`
prevents orchestrator skill tools, prompts, and resource reads from
reaching the model turn.

jif · 2026-06-19 14:42:26 +02:00

81b000421d

Add indexed web search mode (#28489)

## Summary

- Add `web_search = "indexed"` alongside `disabled`, `cached`, and
`live`.
- Use that same resolved mode for both hosted and standalone web search.
- For hosted search, send `index_gated_web_access: true` with external
web access enabled only when `indexed` is selected.
- For standalone search, preserve the existing boolean wire values for
existing modes (`cached` maps to `false` and `live` to `true`) and send
`"indexed"` only for `indexed`; `disabled` keeps the tool unavailable.
- Carry the mode through managed configuration requirements and
generated schemas.
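
The mode-to-wire mapping above can be sketched as follows. This is an
illustration only: `WebSearchMode`, `StandaloneWireValue`, and the
function names are hypothetical, and the real code serializes these
values rather than returning an enum.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WebSearchMode {
    Disabled,
    Cached,
    Live,
    Indexed,
}

/// Hypothetical stand-in for the standalone tool's wire value, which is
/// a boolean for the existing modes and a string for `indexed`.
#[derive(Debug, PartialEq, Eq)]
enum StandaloneWireValue {
    Bool(bool),
    Text(&'static str),
}

/// Parses the `web_search` config string; unknown values yield `None`.
fn parse_web_search_mode(value: &str) -> Option<WebSearchMode> {
    match value {
        "disabled" => Some(WebSearchMode::Disabled),
        "cached" => Some(WebSearchMode::Cached),
        "live" => Some(WebSearchMode::Live),
        "indexed" => Some(WebSearchMode::Indexed),
        _ => None,
    }
}

/// `None` means the standalone tool stays unavailable.
fn standalone_wire_value(mode: WebSearchMode) -> Option<StandaloneWireValue> {
    match mode {
        WebSearchMode::Disabled => None,
        WebSearchMode::Cached => Some(StandaloneWireValue::Bool(false)),
        WebSearchMode::Live => Some(StandaloneWireValue::Bool(true)),
        WebSearchMode::Indexed => Some(StandaloneWireValue::Text("indexed")),
    }
}

/// Hosted search sets `index_gated_web_access: true` only for `indexed`.
fn hosted_index_gated(mode: WebSearchMode) -> bool {
    mode == WebSearchMode::Indexed
}
```

Deriving both request shapes from one resolved mode is what keeps the
hosted and standalone executors on the same access level.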

## Why

Indexed search provides a middle ground between cached-only search and
unrestricted live page fetching. Search queries can remain live while
direct page fetches are limited to URLs admitted by the server.

The existing `web_search` setting remains the single source of truth, so
hosted and standalone executors cannot drift into different access
modes. Without an explicit `indexed` selection, the existing
model-visible tool and request shapes are unchanged.

```toml
web_search = "indexed"

[features]
standalone_web_search = true
```

## Validation

- `just fmt`
- `just test -p codex-api` (`126 passed`)
- `just test -p codex-web-search-extension` (`7 passed`)
- `just test -p codex-core
code_mode_can_call_indexed_standalone_web_search` (`1 passed`)
- Focused configuration, hosted request, standalone request, and
managed-requirement coverage is included in the PR; remaining suites run
in CI.

The full workspace test suite was not run locally.

Winston Howes · 2026-06-19 05:35:57 -07:00

3a2712ea14

[codex] Assign response item IDs when recording history (#28814 )

## Why

Client-created response items enter history without IDs, so their
identity is lost across rollout persistence and resume. IDs should be
assigned once at the history-recording boundary, while IDs returned by
the server must remain unchanged.

The Responses API validates item IDs using type-specific prefixes.
Locally generated IDs therefore use the matching prefix plus a
hyphenated UUIDv7, keeping them valid while distinguishable from
server-generated IDs. Because this changes persisted history and
provider request shapes, the behavior is opt-in behind the
under-development `item_ids` feature. Compaction triggers remain request
controls whose API shape does not accept an ID.

## What changed

- Register the disabled-by-default `item_ids` feature and expose it in
`config.schema.json`.
- Make supported optional `ResponseItem` IDs serializable and expose
them in the generated app-server schemas.
- When `item_ids` is enabled, assign an ID during conversation-history
preparation if an item has no ID.
- Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API
item conventions.
- Preserve existing server IDs without rewriting them.
- Persist assigned IDs in rollouts and include them in subsequent
Responses requests.
- Remove the unsupported ID field from `CompactionTrigger` and document
why it has no ID.
- Add integration coverage for enabled ID persistence, preservation of
server IDs, and omission of generated IDs while the feature is disabled.

`prepare_conversation_items_for_history` is the single response-item ID
allocation boundary.
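
A rough sketch of that allocation rule, under stated assumptions: the
`Item` struct, the prefixes in `prefix_for`, and the `_` separator are
hypothetical, and the UUIDv7 source is injected as a closure so the
example needs no external crate.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct Item {
    kind: &'static str,
    id: Option<String>,
}

/// Hypothetical type-to-prefix table; the real prefixes follow the
/// Responses API item conventions.
fn prefix_for(kind: &str) -> &'static str {
    match kind {
        "message" => "msg",
        "function_call" => "fc",
        _ => "item",
    }
}

/// Assigns an ID to each item that lacks one, only when the feature is
/// enabled. Existing (for example server-provided) IDs are never rewritten.
fn assign_missing_ids(
    items: &mut [Item],
    item_ids_enabled: bool,
    mut new_uuid: impl FnMut() -> String,
) {
    if !item_ids_enabled {
        return;
    }
    for item in items.iter_mut() {
        if item.id.is_none() {
            item.id = Some(format!("{}_{}", prefix_for(item.kind), new_uuid()));
        }
    }
}
```

Keeping the assignment in one function is what makes it a single
allocation boundary: every other code path only reads or preserves IDs.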

## Test plan

- `just test -p codex-features`
- `just test -p codex-core
response_item_ids_persist_across_resume_and_preserve_server_ids`
- `just test -p codex-core
non_openai_responses_requests_omit_item_turn_metadata`
- `just test -p codex-core
resize_all_images_prepares_failures_before_history_insertion`
- `just test -p codex-protocol`
- `just test -p codex-app-server-protocol`
- `just test -p codex-api azure_default_store_attaches_ids_and_headers`

pakrym-oai · 2026-06-18 17:30:55 -07:00

f00f93d8c0

[codex] Reuse parsed plugin skills during session startup (#28844 )

## Summary

- Preserve raw plugin skill-root snapshots in the matching loaded-plugin
cache entry, keyed by the effective plugin root identity including
namespace.
- Pass those snapshots through `SkillsLoadInput` as an optional preload,
so session startup reuses plugin parsing while ordinary skill loads pass
`None`.
- Keep plugin skill loading cohesive: the existing loaders accept the
optional snapshots directly, and uncached or marketplace-detail paths do
not create a cache.

## Why

Plugin discovery already parses plugin skills to determine available
capabilities. Cold session startup then scanned and parsed the same
roots again while building the skills snapshot.

This solves the same duplicate-work problem as #28623 while keeping
ownership narrow: `PluginsManager` creates and owns
`PluginSkillSnapshots` only for its loaded-plugin cache entry;
`SkillsService` consumes an optional clone. Entry replacement or
clearing naturally drops the snapshots, with no separate generation,
capacity policy, or watcher coupling.

## Validation

- `cargo clippy -p codex-core-skills --all-targets -- -D warnings`
- `just test -p codex-core-plugins
skills_service_reuses_skills_parsed_during_plugin_load`
- `just test -p codex-core-skills
namespaces_plugin_skills_using_provided_namespace`
- `just fmt`

xl-openai · 2026-06-18 16:45:58 -07:00

e83b7841b0

Fix goal-first live threads missing from thread/list (#28808 )

Fixes #28263.

## Why

When a thread starts with `/goal`, the goal extension can update SQLite
goal state before the thread has any user-turn rollout items.
`thread/list` and `thread/search` rely on persisted listing metadata, so
a goal-first live thread could be absent from app-server listings after
restart even though the goal itself existed.

This regressed when goal handling moved out of core: the core path wrote
the goal update through the live thread rollout path, while the
extension-backed app-server path only updated goal state and emitted the
live notification.

## What

- Add `GoalSetOutcome::thread_goal_updated_item()` so the goal extension
owns the canonical `ThreadGoalUpdated` rollout item shape.
- Expose a narrow `CodexThread::append_rollout_items()` helper that
appends through the live thread and keeps derived SQLite metadata in
sync.
- When app-server sets a goal on an active live thread, persist the goal
update through that live-thread path.
- Add an app-server regression test that starts a live thread with
`thread/goal/set` and verifies it appears in state-DB-only
`thread/list`.

## Verification

- `env -u CODEX_SQLITE_HOME just test -p codex-app-server
goal_first_live_thread_appears_in_state_db_thread_list`

Eric Traut · 2026-06-18 10:50:15 -07:00

e8dd1b45cb

Add turn-scoped context contributions (#28911 )

## Summary
- keep context injection on a single `ContextContributor` trait
- split context injection into thread-scoped and turn-scoped
contribution methods
- wire turn-scoped fragments into initial context assembly so extensions
can contribute context from turn-local state

jif · 2026-06-18 19:40:28 +02:00

9684ec25be

[codex] Pass plugin namespace into skill loading (#28608 )

## What changed

- retain the parsed plugin manifest namespace on loaded plugins
- carry that namespace through `PluginSkillRoot` and `SkillRoot`
- use the provided namespace when qualifying plugin skill names
- include the namespace in the skills cache key

## Why

Plugin loading has already parsed `plugin.json`, but skill parsing
currently walks every `SKILL.md` ancestor and probes/reads the manifest
again to reconstruct the same namespace. Passing the parsed namespace
removes those repeated filesystem calls, which are particularly costly
on remote filesystems.

Context:
https://openai.slack.com/archives/C0ARA9GF5D4/p1781639496496439?thread_ts=1781202444.891669&cid=C0ARA9GF5D4

## Impact

Plugin skill names remain unchanged. A regression test uses a
deliberately different on-disk manifest name to verify that plugin roots
use the provided parsed namespace.

## Validation

- `just test -p codex-core-skills -p codex-core-plugins -p codex-plugin
-p codex-utils-plugins` (352 passed)
- `just fix -p codex-core-skills -p codex-core-plugins -p codex-plugin
-p codex-utils-plugins`
- `just fmt`

Matthew Zeng · 2026-06-18 00:16:46 -07:00

c73296a0f0

[codex] Support plugin manifest path lists (#28790 )

## Summary

Allow plugin manifests to declare `skills` as either a single path
string or an array of path strings in the core plugin loader.

## Why

Some plugin packages need to expose skills from more than one directory.
Before this change, `plugin.json` only accepted a single string for
`skills`, so manifests like this were ignored as an invalid `skills`
shape:

```json
{
  "skills": ["./skills/abc", "./skills/edk"]
}
```

This keeps the existing single-string form working while adding support
for the list form. The final scope is intentionally limited to the core
plugin manifest/load path for `skills`; `apps`, file-backed
`mcpServers`, and the bundled plugin-creator assets are unchanged in
this PR.

## What changed

- Parse `skills` as either a string or an array of strings in
`plugin.json`.
- Store resolved skill paths as a list in `PluginManifestPaths`.
- Load manifest-declared skill roots in addition to the default
`./skills` root.
- Deduplicate exact duplicate skill roots before loading.
- Rely on existing skill-loader dedupe by canonical `SKILL.md` path for
overlapping roots such as `./skills` plus `./skills/abc`.
- Update plugin manifest tests to cover:
  - single string `skills`
  - list of string `skills`
  - duplicate skill roots
  - `./skills` as a manifest path
  - explicit child roots like `./skills/abc` and `./skills/edk`
  - overlapping-root dedupe
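
The string-or-list handling can be sketched as below. `SkillsField` and
`skill_roots` are hypothetical names, the roots are plain strings rather
than resolved paths, and placing the default `./skills` root first is an
assumption about ordering.

```rust
use std::collections::HashSet;

/// The two accepted shapes of the manifest's `skills` value.
enum SkillsField {
    One(String),
    Many(Vec<String>),
}

/// Returns the default root plus any manifest-declared roots, with exact
/// duplicates removed while preserving first-seen order.
fn skill_roots(declared: Option<SkillsField>) -> Vec<String> {
    let mut roots = vec!["./skills".to_string()];
    match declared {
        None => {}
        Some(SkillsField::One(path)) => roots.push(path),
        Some(SkillsField::Many(paths)) => roots.extend(paths),
    }
    let mut seen = HashSet::new();
    // `insert` returns false for a value already seen, so `retain`
    // drops the repeated entries.
    roots.retain(|root| seen.insert(root.clone()));
    roots
}
```

This only removes exact duplicates. Overlapping roots such as `./skills`
and `./skills/abc` both survive here and are left to the skill loader's
canonical-path dedupe.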

## Validation

- `just test -p codex-plugin`
- `just test -p codex-core-plugins`
- `just test -p codex-mcp-extension`
- `git diff --check`

charlesgong-openai · 2026-06-17 21:33:53 -07:00

e12dd73b7d

[codex] Add optional IDs to response items (#28812 )

## Why

`ResponseItem` variants do not have a consistent internal ID shape: some
variants carry required IDs, some carry optional IDs, and some cannot
represent an ID at all. The existing fields also use inconsistent serde,
TypeScript, and JSON-schema annotations. A single enum-level access path
is needed before history recording can assign and retain IDs.

This PR establishes that internal model only. It intentionally does not
generate or serialize IDs; allocation and wire persistence are isolated
in the stacked follow-up.

## What changed

- Give every concrete `ResponseItem` variant an `Option<String>` ID
field.
- Apply the same internal-only annotations to every ID field:
`#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and
`#[schemars(skip)]`.
- Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared
accessors.
- Preserve IDs when history items are rewritten for truncation.
- Adapt consumers that previously assumed reasoning and image-generation
IDs were required.
- Regenerate app-server schemas so the hidden fields are represented
consistently.

The serde catch-all `ResponseItem::Other` remains ID-less because it
must remain a unit variant.
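
A reduced sketch of the enum-level access path. The two data-carrying
variants and the exact accessor signatures are assumptions for
illustration; the real enum has many more variants and carries the serde
annotations listed above.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ResponseItem {
    Message { id: Option<String>, text: String },
    Reasoning { id: Option<String> },
    /// Catch-all unit variant: it has no fields, so it cannot hold an ID.
    Other,
}

impl ResponseItem {
    /// Shared read accessor across all variants.
    fn id(&self) -> Option<&str> {
        match self {
            ResponseItem::Message { id, .. } | ResponseItem::Reasoning { id, .. } => {
                id.as_deref()
            }
            ResponseItem::Other => None,
        }
    }

    /// Shared write accessor; a no-op for the ID-less `Other` variant.
    fn set_id(&mut self, new_id: Option<String>) {
        match self {
            ResponseItem::Message { id, .. } | ResponseItem::Reasoning { id, .. } => {
                *id = new_id;
            }
            ResponseItem::Other => {}
        }
    }
}
```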

## Test plan

- `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace
-p codex-image-generation-extension`
- `just test -p codex-protocol`
- `just test -p codex-app-server-protocol`
- `just test -p codex-api -p codex-rollout-trace -p
codex-image-generation-extension`
- `just test -p codex-core event_mapping`

pakrym-oai · 2026-06-17 18:27:43 -07:00

dbd2857f4b

Replace SkillsManager with SkillsService (#28705 )

## Why

Host skill discovery was still exposed as a manager even though it is a
process-owned service shared by sessions, the app-server catalog, and
file-watcher invalidation. The skills extension also consumed an ad hoc
loaded-skills wrapper instead of a named immutable snapshot.

## What changed

- replace `SkillsManager` with concrete `SkillsService`
- make the service cache and return immutable `HostSkillsSnapshot`
values
- migrate the skills extension host provider to the snapshot boundary
- migrate app-server catalog, watcher, and invalidation paths to the
service

This keeps the service limited to host discovery, caching, roots, and
invalidation. Catalog rendering and invocation remain extension
responsibilities for the next stacked change.

jif · 2026-06-17 17:01:06 +02:00

0318381762

[codex] Support object-valued plugin MCP manifests (#28580 )

## Summary
This fixes plugin manifest parsing for MCP servers declared as an object
directly in `plugin.json`.

Before this change, Codex modeled `mcpServers` as only a string path,
for example:

```json
{
  "name": "counter-sample",
  "version": "1.1.1",
  "mcpServers": "./.mcp.json"
}
```

Some migrated plugins instead provide the server map directly in the
manifest:

```json
{
  "name": "counter-sample",
  "version": "1.1.1",
  "description": "Plugin that declares MCP servers in the manifest",
  "mcpServers": {
    "counter": {
      "type": "http",
      "url": "https://sample.example/counter/mcp"
    }
  }
}
```

That object form previously failed during install/load with an error
like:

```text
failed to parse plugin manifest: invalid type: map, expected a string
```

## What changed
- Add a manifest representation for `mcpServers` as either
`Path(Resource)` or `Object(map)`.
- Parse `plugin.json` `mcpServers` as either a string path or an object.
- Route object-valued MCP server maps through the existing plugin MCP
config parser instead of adding a second parser.
- Apply existing per-plugin MCP server policy to object-valued MCP
servers the same way as file-backed MCP servers.
- Include object-valued MCP server names in plugin telemetry/capability
metadata.
- Support object-valued MCP config for executor plugins without
requiring a `.mcp.json` filesystem read.
- Update the bundled plugin-creator validator and `plugin-json-spec.md`
so generated-plugin validation accepts the same object-valued shape.

## Compatibility
Existing plugin manifests that use `"mcpServers": "./.mcp.json"`
continue to work. Plugins can now also use the object shape shown above.

## Tests
Added coverage for the new manifest attribute shape at the install,
normal load, telemetry, and executor-provider layers:

- `install_accepts_manifest_mcp_server_objects`
- `load_plugins_loads_manifest_mcp_server_objects`
- `plugin_telemetry_metadata_uses_manifest_mcp_server_objects`
- `reads_manifest_object_config_without_executor_file_system_access`

Also smoke-tested the plugin-creator validator against both supported
forms:

- `mcpServers` as a direct object in `plugin.json`
- `mcpServers` as `"./.mcp.json"` with a companion `.mcp.json`

## Validation
- `just test -p codex-plugin`
- `just test -p codex-core-plugins`
- `just test -p codex-mcp-extension`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fmt`
- `git diff --check`
- Focused rename/object-form rerun: `just test -p codex-core-plugins
manager::tests::load_plugins_loads_manifest_mcp_server_objects
manager::tests::plugin_telemetry_metadata_uses_manifest_mcp_server_objects
store::tests::install_accepts_manifest_mcp_server_objects`
- Focused executor rerun: `just test -p codex-mcp-extension
executor_plugin::provider::tests::reads_manifest_object_config_without_executor_file_system_access`
- `python3
codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py
/private/tmp/codex-validator-object`
- `python3
codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py
/private/tmp/codex-validator-path`

charlesgong-openai · 2026-06-16 19:22:57 -07:00

1883dedc0e

[codex] exec-server: stream files in chunks (#28354 )

## Why

`fs/readFile` buffers the entire file in one response, which makes large
remote reads expensive and prevents callers from applying backpressure.
We need an opt-in streaming path with bounded block sizes while
preserving the existing single-call API for small and sandboxed reads.

## What changed

- Add `ExecServerClient::stream`, returning a named `FileReadStream`
that implements `futures::Stream` and yields immutable 1 MiB byte
blocks.
- Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs.
`fs/readBlock` accepts an explicit offset and length.
- Keep unsandboxed files open between block reads, cap open handles per
connection, and clean them up on EOF, error, stream drop, explicit
close, or connection shutdown.
- Reject platform-sandboxed streaming opens instead of turning the
one-shot sandbox helper into a persistent server. Existing `fs/readFile`
behavior is unchanged.
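
A simplified model of the server-side handle bookkeeping, written
against any `Read + Seek` source instead of real files. `HandleManager`
and its methods are hypothetical; capping a single read at 1 MiB is an
assumption, and the cleanup on EOF, stream drop, and connection shutdown
is omitted.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Seek, SeekFrom};

const MAX_OPEN_HANDLES: usize = 128;
const BLOCK_SIZE: usize = 1024 * 1024; // 1 MiB

struct HandleManager<R> {
    next_id: u64,
    open: HashMap<u64, R>,
}

impl<R: Read + Seek> HandleManager<R> {
    fn new() -> Self {
        Self { next_id: 0, open: HashMap::new() }
    }

    /// Models `fs/open`: registers a reader, refusing once the cap is hit.
    fn open(&mut self, reader: R) -> io::Result<u64> {
        if self.open.len() >= MAX_OPEN_HANDLES {
            return Err(io::Error::new(io::ErrorKind::Other, "too many open handles"));
        }
        let id = self.next_id;
        self.next_id += 1;
        self.open.insert(id, reader);
        Ok(id)
    }

    /// Models `fs/readBlock`: reads up to `len` bytes at `offset`.
    /// An empty block signals EOF.
    fn read_block(&mut self, id: u64, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        let reader = self
            .open
            .get_mut(&id)
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "unknown handle"))?;
        reader.seek(SeekFrom::Start(offset))?;
        let mut block = Vec::new();
        reader
            .by_ref()
            .take(len.min(BLOCK_SIZE) as u64)
            .read_to_end(&mut block)?;
        Ok(block)
    }

    /// Models `fs/close`: releases capacity; returns whether the handle existed.
    fn close(&mut self, id: u64) -> bool {
        self.open.remove(&id).is_some()
    }
}
```

Because every read carries an explicit offset, the server keeps no
cursor state per handle, which is what allows non-sequential reads.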

## Testing

- `just test -p codex-exec-server`
- Integration coverage for 1 MiB chunking, exact block-boundary EOF,
sandbox rejection, and continued reads from the opened file after path
replacement.
- Handle-manager coverage for non-sequential offsets, variable block
lengths, the 128-handle limit, and capacity release after close.

pakrym-oai · 2026-06-16 09:50:55 -07:00

a4711b88dd

feat: render typed envelopes for multi-agent v2 messages (#28368 )

## Why

Multi-agent v2 messages need a consistent, model-visible envelope that
identifies what kind of interaction occurred, who sent it, and which
agent it targets. Previously, encrypted deliveries exposed only
`encrypted_content`, while child completion used the legacy
`<subagent_notification>` shape. That meant the client could not
consistently present `NEW_TASK`, `MESSAGE`, and `FINAL_ANSWER` using the
same format.

This change adds the routing envelope as plaintext while keeping task
and message payloads encrypted. No new Responses API field is required:
an encrypted delivery is represented as an `input_text` header
immediately followed by its existing `encrypted_content` item.

Every envelope now follows this shape:

```text
Message Type: <NEW_TASK | MESSAGE | FINAL_ANSWER>
Task name: <recipient agent path>
Sender: <author agent path>
Payload:
<message payload>
```
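
As a sketch, the envelope could be rendered with a pair of helpers like
these. `MessageType`, `envelope_header`, and `plaintext_envelope` are
hypothetical names; the PR's actual rendering lives in
`InterAgentCommunication::to_model_input_item`.

```rust
#[derive(Debug, Clone, Copy)]
enum MessageType {
    NewTask,
    Message,
    FinalAnswer,
}

impl MessageType {
    fn as_str(self) -> &'static str {
        match self {
            MessageType::NewTask => "NEW_TASK",
            MessageType::Message => "MESSAGE",
            MessageType::FinalAnswer => "FINAL_ANSWER",
        }
    }
}

/// Plaintext routing header. For encrypted deliveries this is the whole
/// `input_text` item, and the payload follows as `encrypted_content`.
fn envelope_header(kind: MessageType, recipient: &str, sender: &str) -> String {
    format!(
        "Message Type: {}\nTask name: {}\nSender: {}\nPayload:\n",
        kind.as_str(),
        recipient,
        sender
    )
}

/// Full plaintext envelope, for payloads that are available locally.
fn plaintext_envelope(kind: MessageType, recipient: &str, sender: &str, payload: &str) -> String {
    let mut text = envelope_header(kind, recipient, sender);
    text.push_str(payload);
    text
}
```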

## Message types

### `NEW_TASK`

`NEW_TASK` is used when the recipient should begin a new turn, including
an initial `spawn_agent` task and a later `followup_task`.

For a root agent spawning `/root/worker`, the request contains a
plaintext envelope followed by the encrypted task:

```json
{
  "type": "agent_message",
  "author": "/root",
  "recipient": "/root/worker",
  "content": [
    {
      "type": "input_text",
      "text": "Message Type: NEW_TASK\nTask name: /root/worker\nSender: /root\nPayload:\n"
    },
    {
      "type": "encrypted_content",
      "encrypted_content": "<encrypted task payload>"
    }
  ]
}
```

Conceptually, the model receives:

```text
Message Type: NEW_TASK
Task name: /root/worker
Sender: /root
Payload:
Review the authentication changes and report any regressions.
```

### `MESSAGE`

`MESSAGE` is used for a queued `send_message` delivery. It communicates
with an existing agent without starting a new turn.

For `/root/worker` reporting progress to the root agent, the request
contains:

```json
{
  "type": "agent_message",
  "author": "/root/worker",
  "recipient": "/root",
  "content": [
    {
      "type": "input_text",
      "text": "Message Type: MESSAGE\nTask name: /root\nSender: /root/worker\nPayload:\n"
    },
    {
      "type": "encrypted_content",
      "encrypted_content": "<encrypted message payload>"
    }
  ]
}
```

Conceptually, the model receives:

```text
Message Type: MESSAGE
Task name: /root
Sender: /root/worker
Payload:
The protocol tests pass; I am checking the resume path now.
```

### `FINAL_ANSWER`

`FINAL_ANSWER` is emitted when a child agent reaches a terminal state
and reports its result to its parent. Completion payloads are already
available locally, so the complete envelope is represented as plaintext
rather than as a plaintext header plus encrypted content.

For `/root/worker` completing work for the root agent, the request
contains:

```json
{
  "type": "agent_message",
  "author": "/root/worker",
  "recipient": "/root",
  "content": [
    {
      "type": "input_text",
      "text": "Message Type: FINAL_ANSWER\nTask name: /root\nSender: /root/worker\nPayload:\nNo regressions found."
    }
  ]
}
```

The model-visible form is:

```text
Message Type: FINAL_ANSWER
Task name: /root
Sender: /root/worker
Payload:
No regressions found.
```

Errored, shut down, and missing agents also use `FINAL_ANSWER`, with a
terminal-status description in the payload.

## What changed

- Render `NEW_TASK` or `MESSAGE` in
`InterAgentCommunication::to_model_input_item`, based on whether the
encrypted delivery starts a turn.
- Replace the multi-agent v2 `<subagent_notification>` completion
payload with a model-visible `FINAL_ANSWER` envelope.
- Document `Task name`, `Sender`, and `Payload` consistently in the
multi-agent developer instructions.
- Prevent local-only history projections from treating an encrypted
message's plaintext header as the complete assistant message.
- Preserve rollout-trace interaction edges when an agent message
contains both plaintext and encrypted content.

Legacy multi-agent behavior remains unchanged.

## Verification

- `just test -p codex-protocol`
- `just test -p codex-rollout-trace`
- `just test -p codex-web-search-extension`
- `just test -p codex-core
encrypted_multi_agent_v2_spawn_sends_agent_message_to_child`
- `just test -p codex-core
plaintext_multi_agent_v2_completion_sends_agent_message`
- `just test -p codex-core
multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn`
- `just test -p codex-core
multi_agent_v2_completion_queues_message_for_direct_parent`

jif · 2026-06-16 11:46:59 +02:00

5b22a8e5b1

[codex] Use expect in integration tests (#28441 )

The workspace denies `clippy::expect_used` in production. Although
`clippy.toml` allows `expect` in tests, Bazel Clippy compiles
integration-test helper code in a way that does not receive that
exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
and equivalent `match`/`let else` forms.

This allows `clippy::expect_used` once at each integration-test crate
root (including aggregated suites and test-support libraries), then
replaces manual panic-based Result and Option unwraps with
`expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
crate roots. Intentional assertion and unexpected-variant panics remain
unchanged, and the production `expect_used = "deny"` lint remains in
place.

The cleanup is mechanical and net-negative in line count.

pakrym-oai · 2026-06-15 21:53:47 -07:00

e752f7b4ae

Use PathUri in filesystem permission paths for exec-server (#28165 )

## Why

This is progress toward letting app-server and exec-server run on
different platforms, specifically for sandbox configuration.

## What

- Make the filesystem path containment hierarchy generic, defaulting to
`AbsolutePathBuf` for now.
- Have clients specify `AbsolutePathBuf` or `PathUri` directly where
needed.
- Use `PathUri` throughout exec-server filesystem protocol and trait
boundaries.
- Implement `From` for conversion to path URIs and `TryFrom` for
fallible conversion to absolute paths through the generic type
hierarchy.

Adam Perry @ OpenAI · 2026-06-15 23:55:23 +00:00

46f17930b6

feat(core): add metadata field to ResponseItem (#28355 )

## Description

This PR adds an optional `metadata` field to `ResponseItem` for
Responses API calls. It is mechanical plumbing only: no values are
populated or sent yet. Even so, adding a new field to `ResponseItem`
has a large blast radius.

This change is backwards compatible because `metadata` is optional and
omitted when absent, so existing response items and rollout history
without it still deserialize and requests that do not set it keep the
same wire shape. For provider compatibility, we strip out `metadata`
before non-OpenAI Responses requests so Azure and AWS Bedrock never see
this field.

The follow-up PR uses the field to start storing and passing along
`turn_id`: https://github.com/openai/codex/pull/28360

## What changed

- Added `ResponseItemMetadata` with optional `turn_id`, plus optional
`metadata` on Responses API item variants and inter-agent communication.
- Preserved item metadata through response-item rewrites such as
truncation, missing tool-output synthesis, compaction history
rebuilding, visible-history conversion, rollout/resume, and generated
app-server schemas/types.
- Strip item metadata from non-OpenAI Responses requests while
preserving it for OpenAI-shaped requests.
- Updated the mechanical fixture/test construction churn required by the
new optional field.

Owen Lin · 2026-06-15 15:05:28 -07:00

040dafa32d

skills: cache orchestrator resources per thread (#28336 )

## Why

Hosted orchestrator skills are read through the remote MCP resource
server. Within one thread, the same catalog or skill resource can be
requested multiple times by prompt injection and the `skills.list` /
`skills.read` tools. Re-fetching adds latency and can make those
surfaces observe different remote contents during the same thread.

This is a follow-up to #28333: orchestrator skills remain limited to
threads without a local executor, and those threads now get a stable
per-thread view of the remote skill data they use.

## What changed

- Reuse the existing per-thread orchestrator catalog snapshot for
`skills.list` and `skills.read` availability checks.
- Cache successful orchestrator resource reads by authority, package,
and resource so prompt injection and tool calls share the same contents.
- Keep the cache memory-only and bounded to 100 resources and 8 MiB per
thread.
- Leave host and executor skill reads unchanged, and do not cache failed
remote reads.
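
A minimal sketch of such a bounded per-thread cache. `ResourceCache` and
`get_or_fetch` are hypothetical names, and the behavior at the limits is
an assumption: this version simply stops caching new entries once either
bound would be exceeded, since the PR text does not state an eviction
policy.

```rust
use std::collections::HashMap;

const MAX_RESOURCES: usize = 100;
const MAX_TOTAL_BYTES: usize = 8 * 1024 * 1024; // 8 MiB

/// (authority, package, resource)
type ResourceKey = (String, String, String);

#[derive(Default)]
struct ResourceCache {
    entries: HashMap<ResourceKey, Vec<u8>>,
    total_bytes: usize,
}

impl ResourceCache {
    /// Returns the cached contents, or runs `fetch` and caches a
    /// successful result if it fits. Failed fetches are never cached.
    fn get_or_fetch<E>(
        &mut self,
        key: ResourceKey,
        fetch: impl FnOnce() -> Result<Vec<u8>, E>,
    ) -> Result<Vec<u8>, E> {
        if let Some(hit) = self.entries.get(&key) {
            return Ok(hit.clone());
        }
        let contents = fetch()?;
        let fits = self.entries.len() < MAX_RESOURCES
            && self.total_bytes + contents.len() <= MAX_TOTAL_BYTES;
        if fits {
            self.total_bytes += contents.len();
            self.entries.insert(key, contents.clone());
        }
        Ok(contents)
    }
}
```

Routing both prompt injection and the `skills.*` tools through one
lookup like this is what gives a thread a stable view of remote contents.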

## Verification

- Extended the app-server MCP resource integration test to read the same
hosted skill resource twice and verify that the remote server receives
one read.
- The same test verifies that catalog discovery and the selected skill's
main prompt are each fetched only once per thread.

jif · 2026-06-15 20:20:19 +02:00

0afe559318

skills: hide orchestrator skills with a local executor (#28333 )

## Why

App-server threads without a local executor need orchestrator-owned
skills from the hosted `codex_apps` MCP server. Threads with the local
executor already discover installed skills from the local filesystem.

After the orchestrator skill provider was enabled for every app-server
thread, local-executor threads also received the hosted skill catalog
and the `skills.list` and `skills.read` tools. This changed the existing
local behavior and could expose a second hosted copy of a skill that was
already installed locally.

## What changed

- Expose the thread's selected execution environments to extensions at
thread startup.
- Enable orchestrator skills only when the reserved local environment is
not selected.
- Apply that decision consistently to hosted skill catalog discovery,
explicit skill injection, and the `skills.list` and `skills.read` tools.

## Verification

- The existing no-executor app-server test continues to verify hosted
skill discovery, invocation, and child-resource reads.
- A new app-server test verifies that local-executor threads do not
receive hosted skill context or `skills.*` tools.

jif · 2026-06-15 17:15:45 +02:00

6d9df687a5

Discover stdio MCP servers from selected executor plugins (#27870 )

## Why

**In short:** this PR discovers MCP registrations by reading a selected
plugin's `.mcp.json` on its executor. #27884 then resolves those
registrations in the shared catalog.

`thread/start.selectedCapabilityRoots` can select a plugin root owned by
an executor, and Codex can resolve that package through the executor
filesystem. MCP declarations inside the selected plugin are still
ignored.

This PR adds the source-specific discovery layer on top of the
selected-plugin catalog boundary in #27884:

```text
selected capability root
        |
        v
resolve the plugin through its executor filesystem
        |
        v
read and normalize its MCP config through the same filesystem
        |
        v
contribute stdio registrations bound to that environment ID
```

The existing MCP launcher and connection manager remain unchanged. MCP
config parsing is shared with local plugins through #27863.

## What changed

- Added an executor plugin MCP provider in the MCP extension.
- Retained only the exact filesystem capability used for package
resolution and reused it for the selected plugin's MCP config, with no
host-filesystem fallback or unrelated process/HTTP authority.
- Read either the manifest-declared MCP config or the default
`.mcp.json`; a missing default file means the plugin has no MCP servers.
- Accepted stdio servers only for this first vertical. Executor-owned
HTTP declarations are skipped with a warning until their placement
semantics are defined.
- Normalized stdio registrations with the owning environment's stable
logical ID and plugin-root working directory.
- Resolved environment-variable names on the owning executor and
rejected explicit local forwarding for non-local plugins.
- Froze discovered declarations once per active thread runtime, then
applied current managed plugin and MCP requirements when contributing
them.
- Carried the selected root ID, display name, and selection order into
the catalog contribution defined by #27884.

## Behavior and scope

There is intentionally no production behavior change yet. This PR
provides the executor provider and contribution boundary, but app-server
does not install it in this change. Existing local plugin MCP loading is
unchanged, and no MCP process is launched by this PR alone.

## Assumptions

- The selected root ID is the plugin policy identity; the manifest
display name is presentation metadata.
- An environment ID is a stable logical authority. Reconnection or
replacement under the same ID does not change ownership.
- Selected plugin packages and their manifests are trusted inputs.
- The selected package and MCP discovery snapshot remain frozen for the
active thread runtime.

## Follow-up

The next PR installs this contributor in app-server and adds an
end-to-end test proving that a selected plugin MCP tool launches on its
owning executor, can be called by the model, survives an explicit MCP
refresh, and is invisible when its root was not selected.

Resume, fork, environment removal or ID changes, dynamic catalog reload,
and executor-owned HTTP MCP placement remain separate lifecycle
decisions.

## Verification

Focused tests cover executor-only filesystem reads, missing and
malformed config, stdio filtering and normalization, managed
requirements, package attribution, and selection order. CI owns
execution of the test suite.

jif · 2026-06-15 11:52:05 +02:00

b3c423e475

Add selected-plugin precedence and attribution to the MCP catalog (#27884 )

## Why

**In short:** this PR resolves already-discovered MCP registrations. It
does not read selected plugins or discover their MCP servers.

The resolved MCP catalog currently builds config and auto-discovered
plugin registrations before runtime contributors are applied. A
thread-selected plugin needs a distinct precedence tier in that same
initial resolution pass: otherwise a disabled lower-precedence winner
can leave stale name-level state behind, and the winning MCP tools
cannot be attributed to the selected package reliably.

This PR adds that catalog boundary before executor discovery is
connected.

## What changed

- Added an explicit selected-plugin registration tier between
auto-discovered plugins and explicit config.
- Collected selected-plugin contributions before the initial catalog
build, while leaving compatibility and generic extension overlays in
their existing runtime phase.
- Retained the winning plugin ID and display name directly on
plugin-owned catalog registrations.
- Derived MCP tool provenance from the winning catalog entry instead of
joining against local-only plugin summaries.
- Retained the winning selected server's tool approval policy in the
running connection manager, so a selected registration cannot inherit
approval behavior from a losing local plugin.
- Kept remembered approval session-scoped for selected plugins until
there is an authority-aware persistence contract; Codex will not write
approval back to an unrelated local plugin.
- Preserved existing name-level disabled vetoes for discovered plugins
and config, while keeping a selected package's own disabled registration
scoped to that registration.
- Preserved deterministic selection order and existing config,
compatibility, and extension precedence.

The resulting order is:

```text
auto-discovered plugin
  < selected plugin
  < explicit config
  < compatibility registration
  < extension overlay
```
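The tier order above can be sketched with a derived ordering. This is an illustration only: `Tier`, `Registration`, and `resolve` are hypothetical names, not the types in `codex-mcp`, and letting the later contribution win a same-tier tie is an assumption made for the example.

```rust
/// Hypothetical tiers, declared lowest to highest so the derived `Ord`
/// matches the order listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    AutoDiscoveredPlugin,
    SelectedPlugin,
    ExplicitConfig,
    CompatibilityRegistration,
    ExtensionOverlay,
}

/// Hypothetical registration record that keeps plugin attribution inline.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Registration {
    server_name: String,
    tier: Tier,
    /// Plugin ID retained on plugin-owned registrations; `None` otherwise.
    plugin_id: Option<String>,
}

/// Returns the winning registration for `server_name`.
/// The highest tier wins. `max_by_key` returns the last maximum, so a
/// same-tier tie goes to the later contribution (an assumption here).
fn resolve<'a>(regs: &'a [Registration], server_name: &str) -> Option<&'a Registration> {
    regs.iter()
        .filter(|reg| reg.server_name == server_name)
        .max_by_key(|reg| reg.tier)
}
```

Because attribution is read from the winning entry itself, a caller would take `plugin_id` from the value returned by `resolve` rather than joining against a separate plugin summary.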

## Behavior and scope

This is a catalog and provenance change only. No production host
contributes selected-plugin MCP registrations yet, so existing local MCP
behavior remains unchanged.

The stacked follow-up, #27870, installs the executor plugin provider
that produces these registrations. App-server activation remains a
separate final step.

## Verification

Focused tests cover precedence, deterministic selected-plugin conflicts,
disabled-veto behavior across catalog phases, managed requirements
before selected-plugin resolution, winning-server approval policy, and
attribution when local and selected packages share an ID or server name.
CI owns execution of the test suite.

jif · 2026-06-15 11:10:51 +02:00

c3a479620f

build: run buildifier from just fmt (#28125)

## Intent

Keep Bazel and Starlark files consistently formatted without requiring
contributors to install or version buildifier themselves.

## Implementation

- Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
v8.5.1.
- Run buildifier from the shared `just fmt` and `just fmt-check` driver,
with Windows-safe explicit DotSlash invocation.
- Provision DotSlash in formatting CI and contributor devcontainers, and
document the source-build prerequisite.
- Apply the initial mechanical buildifier formatting baseline.

Adam Perry @ OpenAI · 2026-06-13 21:43:39 -07:00

740c4f269d

[codex] make PathUri::from_abs_path infallible (#27976)

## Why

`PathUri::from_abs_path` can fail for absolute paths that do not have a
normal `file:` URI representation, forcing filesystem call sites to
handle a conversion error even though the original path can be preserved
losslessly.

## What

Make `from_abs_path` infallible and migrate its callers. Unrepresentable
paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or
Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The
leading encoded null reserves a namespace that cannot collide with a
real Unix or Windows path, and fallback URIs remain opaque to lexical
path operations.
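A minimal sketch of the fallback shape for the Unix-bytes case follows. It is not the `PathUri` implementation: the function names are hypothetical, the unpadded URL-safe base64 alphabet is an assumption (the description only says base64), and validation here is limited to the prefix and the alphabet.

```rust
/// Reserved prefix from the description; `%00` is an encoded null byte.
const PREFIX: &str = "file:///%00/bad/path/";

/// URL-safe base64 alphabet (assumed; the source does not name the variant).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encodes bytes as unpadded base64.
fn b64_encode(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(3) {
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = (u32::from(b[0]) << 16) | (u32::from(b[1]) << 8) | u32::from(b[2]);
        // A chunk of k bytes yields k + 1 output characters.
        for i in 0..=chunk.len() {
            let idx = (n >> (18 - 6 * i)) & 0x3f;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}

/// Decodes unpadded base64, rejecting characters outside the alphabet.
fn b64_decode(input: &str) -> Option<Vec<u8>> {
    // A length of 1 mod 4 cannot come from whole bytes.
    if input.len() % 4 == 1 {
        return None;
    }
    let mut out = Vec::new();
    let mut acc: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.bytes() {
        let value = ALPHABET.iter().position(|&a| a == c)? as u32;
        acc = (acc << 6) | value;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
        }
    }
    Some(out)
}

/// Builds the fallback URI for a path that has no normal `file:` form.
fn encode_fallback(raw_path: &[u8]) -> String {
    format!("{PREFIX}{}", b64_encode(raw_path))
}

/// Recovers the original bytes, or `None` if `uri` is not a valid fallback.
fn decode_fallback(uri: &str) -> Option<Vec<u8>> {
    b64_decode(uri.strip_prefix(PREFIX)?)
}
```

Under these assumptions, a path containing a null byte or invalid UTF-8 round-trips through `encode_fallback` and `decode_fallback` unchanged, while an ordinary `file:` URI is rejected by the prefix check.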

## Validation

Added path-URI coverage for Unix null and non-UTF-8 paths, Windows
device/verbatim and non-Unicode paths, serialization, malformed
fallbacks, opaque lexical operations, invalid native payloads, and
literal `/bad/path` collision resistance.

Adam Perry @ OpenAI · 2026-06-12 16:58:42 -07:00

968a3ac9c1

Support plaintext agent messages (#27830)

## Why

Multi-agent v2 `send_message` deliveries already reach the receiving
model as typed `agent_message` items with encrypted content.
Child-completion notifications are generated by Codex itself, so their
content is plaintext and previously fell back to a serialized JSON
envelope inside an assistant message.

With plaintext `input_text` supported for `agent_message`, both delivery
paths can use the same model-visible type while preserving explicit
author and recipient metadata.

## What changed

- add plaintext `input_text` support to `AgentMessageInputContent` and
regenerate the affected app-server schemas
- preserve `InterAgentCommunication` as structured mailbox input instead
of converting it to assistant text
- record delivered communications as typed `agent_message` history items
- persist a dedicated rollout item so local delivery metadata such as
`trigger_turn` remains available without leaking into the Responses
request
- reconstruct typed agent messages on resume and preserve fork-turn
truncation behavior
- remove request-time assistant-content parsing
- preserve plaintext and encrypted inter-agent deliveries in stage-one
memory inputs
- normalize and link plaintext and encrypted agent messages in rollout
traces without treating inbound messages as child results
- cover the real multi-agent v2 child-completion path end to end with
deterministic mailbox synchronization

## Verification

- `just test -p codex-core
plaintext_multi_agent_v2_completion_sends_agent_message`
- `just test -p codex-core input_queue_drains_mailbox_in_delivery_order
record_initial_history_reconstructs_typed_inter_agent_message
fork_turn_positions_use_inter_agent_delivery_metadata`
- `just test -p codex-memories-write
serializes_inter_agent_communications_for_memory`
- `just test -p codex-rollout-trace
agent_messages_preserve_routing_and_content
sub_agent_started_activity_creates_spawn_edge`
- `just test -p codex-rollout-trace
agent_result_edge_falls_back_to_child_thread_without_result_message`
- `just test -p codex-protocol -p codex-rollout -p
codex-app-server-protocol`

jif · 2026-06-12 13:50:04 -07:00

8f2d6416ce

[codex] Add size to internal filesystem metadata (#27927)

## Why

`ExecutorFileSystem::get_metadata` reports file kind and timestamps but
not size. Internal callers that need to enforce a size limit therefore
have to read the complete file first, which is especially wasteful for
remote filesystems.

This adds the missing internal metadata so consumers can reject
oversized files before transferring or buffering them. The field is
named `size`, matching VS Code's `FileStat.size` filesystem convention.

## What changed

- add `size: u64` to internal `FileMetadata`
- populate it from the underlying filesystem metadata
- carry it through sandbox-helper and remote exec-server responses
- cover files, directories, symlink targets, and sandboxed reads across
local and remote filesystem implementations

The new field is intentionally not exposed through the app-server API.
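As a rough sketch of why the field helps, a caller can compare `size` against a limit before reading anything. The code below uses `std::fs` as a stand-in for `ExecutorFileSystem`; the `FileMetadata` struct shown, `get_metadata`, and `read_within_limit` are hypothetical and only mirror the `size: u64` field named above.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the internal metadata type; only the
/// `size` field is taken from the description above.
struct FileMetadata {
    size: u64,
}

/// Fetches metadata without reading file contents.
fn get_metadata(path: &Path) -> io::Result<FileMetadata> {
    let meta = fs::metadata(path)?;
    Ok(FileMetadata { size: meta.len() })
}

/// Rejects oversized files before any bytes are transferred or buffered.
fn read_within_limit(path: &Path, max_bytes: u64) -> io::Result<Vec<u8>> {
    let meta = get_metadata(path)?;
    if meta.size > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("file is {} bytes, limit is {max_bytes}", meta.size),
        ));
    }
    fs::read(path)
}
```

On a remote filesystem the saving is the transfer itself: the metadata call is small, and the read is skipped entirely when the limit is exceeded.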

## Testing

- `just test -p codex-exec-server get_metadata`
- `just test -p codex-exec-server
file_system_sandboxed_metadata_and_read_allow_readable_root`
- `just test -p codex-core-plugins`
- `just test -p codex-skills-extension`

pakrym-oai · 2026-06-12 12:12:08 -07:00

76d8f20241

Handle standalone image generation failures as terminal items (#27920)

## Why

Standalone image generation emitted a started item but no terminal item
when the backend failed. Clients could leave the operation unresolved or
render it as successful.

## What changed

- Emit a terminal image-generation item with `status: "failed"` when
generation or editing fails.
- Skip image persistence for failed terminal items.
- Render failed image generation distinctly in TUI history.
- Preserve the status when handling live and replayed terminal items.
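The lifecycle change can be illustrated with a small sketch in which every started operation ends in exactly one terminal item, whether it succeeded or not. The type and function names are hypothetical; only the `failed` status and the rule of skipping persistence for failures come from the list above.

```rust
/// Hypothetical terminal status for an image-generation item.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ImageGenStatus {
    Completed,
    Failed,
}

/// Hypothetical terminal item emitted after the started item.
#[derive(Debug, PartialEq, Eq)]
struct TerminalItem {
    id: String,
    status: ImageGenStatus,
    /// Image bytes are present only when generation succeeded.
    image: Option<Vec<u8>>,
}

/// Always produces a terminal item, so clients never see an
/// operation that started but did not finish.
fn terminal_item(id: &str, result: Result<Vec<u8>, String>) -> TerminalItem {
    match result {
        Ok(bytes) => TerminalItem {
            id: id.to_owned(),
            status: ImageGenStatus::Completed,
            image: Some(bytes),
        },
        Err(_) => TerminalItem {
            id: id.to_owned(),
            status: ImageGenStatus::Failed,
            image: None,
        },
    }
}

/// Failed terminal items are never written to disk.
fn should_persist(item: &TerminalItem) -> bool {
    item.status == ImageGenStatus::Completed && item.image.is_some()
}
```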

## TUI appearance (app-side change needed)

<img width="867" height="89" alt="image"
src="https://github.com/user-attachments/assets/9e32342f-a982-411e-8498-456639fc468a"
/>

## Validation

- `just test -p codex-image-generation-extension`
- App-server image-generation tests
- Core stream-event tests
- TUI image-generation lifecycle and snapshot tests
- Scoped Clippy and formatting

Won Park · 2026-06-12 11:57:22 -07:00

b6baa77eec

Make MCP server contributions thread-scoped (#27670)

## Why

`selectedCapabilityRoots` belongs to one thread, but MCP contributors
previously received only the global Codex config. That left no clean way
for a selected executor capability to contribute MCP servers to its own
thread.

## What this PR does

- Gives MCP contributors a small context containing the config and, for
a running thread, its frozen host-seeded inputs.
- Uses the same thread inputs during startup, status queries, refreshes,
and skill dependency checks.
- Keeps threadless MCP operations and the existing hosted Apps behavior
unchanged.
- Adds coverage showing that two threads resolve independent
registrations and that later lifecycle mutations do not change the
frozen MCP inputs.
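The boundary described above might look roughly like the sketch below. Every name is hypothetical, and the contributor is invented purely to show how a thread-scoped context separates two threads; it does not reflect how real contributors derive registrations.

```rust
/// Hypothetical global configuration shared by every thread.
struct Config {
    global_servers: Vec<String>,
}

/// Hypothetical per-thread inputs, copied once when the thread starts
/// so later lifecycle mutations cannot change them.
struct ThreadInputs {
    selected_capability_roots: Vec<String>,
}

/// Context handed to a contributor: the config plus, for a running
/// thread, that thread's frozen inputs. Threadless calls pass `None`.
struct McpContributorContext<'a> {
    config: &'a Config,
    thread: Option<&'a ThreadInputs>,
}

/// Made-up contributor: global servers always, plus one illustrative
/// entry per selected root when a thread is present.
fn contribute(ctx: &McpContributorContext<'_>) -> Vec<String> {
    let mut names = ctx.config.global_servers.clone();
    if let Some(thread) = ctx.thread {
        names.extend(
            thread
                .selected_capability_roots
                .iter()
                .map(|root| format!("{root}/mcp")),
        );
    }
    names
}
```

Two threads built from different `ThreadInputs` values resolve different lists, and a threadless call with `thread: None` sees only the global configuration.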

This PR does not discover plugin manifests, add MCP servers, or launch
anything new. It only establishes the thread-scoped registration
boundary.

## Follow-ups

- Resolve selected executor plugin roots through their owning
environment filesystem.
- Convert their stdio MCP declarations into environment-bound
registrations and add an executor MCP end-to-end test.

## Verification

- `just fmt`
- `cargo check --tests -p codex-protocol -p codex-extension-api -p
codex-mcp-extension -p codex-core -p codex-app-server`

Tests and Clippy were not run.

jif · 2026-06-12 11:20:34 +02:00

693082f3c4

[codex] Remove async_trait from first-party code (#27475)

## Why

First-party async traits should expose their `Send` contracts explicitly
without requiring `async_trait`. This completes the migration pattern
established in #27303 and #27304.

## What changed

- Replaced the remaining first-party `async_trait` traits with native
return-position `impl Future + Send` where statically dispatched and
explicit boxed `Send` futures where object safety is required.
- Kept implementations behavior-preserving, outlining existing async
bodies into inherent methods where that keeps the diff reviewable.
- Removed all direct first-party `async-trait` dependencies and the
workspace dependency declaration.
- Added a cargo-deny policy that permits `async-trait` only through the
remaining transitive wrapper crates.
- Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
keep the full cargo-deny check passing.
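The two replacement shapes named in the first bullet can be sketched as follows. The traits and the `Echo` type are hypothetical examples rather than code from the repository, and the tiny `block_on` exists only so the sketch can be exercised without an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Statically dispatched: the `Send` bound is visible in the signature.
trait Fetch {
    fn fetch(&self, key: &str) -> impl Future<Output = String> + Send;
}

/// Boxed future alias for traits that must remain object safe.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Object-safe variant: returns an explicit boxed `Send` future.
trait DynFetch {
    fn fetch_dyn<'a>(&'a self, key: &'a str) -> BoxFuture<'a, String>;
}

struct Echo;

impl Echo {
    /// The async body is outlined into an inherent method so both
    /// trait implementations can share it unchanged.
    async fn fetch_inner(&self, key: &str) -> String {
        format!("value:{key}")
    }
}

impl Fetch for Echo {
    fn fetch(&self, key: &str) -> impl Future<Output = String> + Send {
        self.fetch_inner(key)
    }
}

impl DynFetch for Echo {
    fn fetch_dyn<'a>(&'a self, key: &'a str) -> BoxFuture<'a, String> {
        Box::pin(self.fetch_inner(key))
    }
}

/// Waker that does nothing; sufficient for futures that never suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Demo-only executor that polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = std::pin::pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

The first form avoids an allocation per call but cannot be used behind `dyn`; the boxed form keeps `Box<dyn DynFetch>` usable at the cost of one heap allocation per call.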

## Validation

- `just test -p codex-exec-server`: 216 passed, 2 skipped.
- `just test -p codex-model-provider`: 39 passed.
- `just test -p codex-core` and `just test`: changed tests passed;
remaining failures are environment-sensitive suites unrelated to this
migration.
- `cargo deny check`
- `just fix`
- `just fmt`
- `cargo shear`
- `just bazel-lock-check`

Adam Perry @ OpenAI · 2026-06-11 18:16:39 -07:00

5a56caf18c

Fix image extension PathUri conversion (#27711)

## Why

`main` stopped compiling after #27498 passed an `AbsolutePathBuf` to an
`ExecutorFileSystem` API that #27653 had migrated to `PathUri`.

## What

Convert referenced image paths to `PathUri` before filesystem reads,
declare the internal path-URI dependency, and refresh `Cargo.lock`.

Adam Perry @ OpenAI · 2026-06-12 00:15:19 +00:00

1829ed1122

Route image extension reads through turn environments v2 (#27498)

## Why

Image generation used `std::fs::read` for referenced image paths, which
did not support environment-backed filesystems or their sandbox context.

## What changed

- Expose optional turn environments to extension tool calls.
- Include each environment’s ID, working directory, filesystem, and
sandbox context.
- Read referenced images through the selected environment filesystem.
- Keep sandbox usage at the extension call site so extensions can choose
the appropriate access mode.
- Consolidate image request construction into one async function.
- Add coverage for successful environment reads and read failures.

## Validation

- `cargo check -p codex-image-generation-extension --tests`
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`

`just test -p codex-image-generation-extension` could not complete
because the build exhausted available disk space.

Won Park · 2026-06-11 16:32:52 -07:00

19ce6394af

Resolve MCP server registrations through a catalog (#27634)

## Why

MCP servers currently come from user config, local plugins,
compatibility Apps synthesis, and host extensions. Those sources were
composed by mutating a shared map, leaving registration identity,
precedence, removal, and provenance implicit in assembly order.

Before adding executor-owned MCPs, Codex needs one durable resolution
boundary above `McpConnectionManager`. This PR introduces that boundary
while preserving current server configuration, policy, and runtime
behavior. Executor-scoped registrations and explicit policy layers
remain follow-ups.

## What changed

- Add typed `McpServerRegistration` inputs and an immutable
`ResolvedMcpCatalog` in `codex-mcp`.
- Retain each registration's complete `McpServerConfig`, including its
environment binding, while recording its source and provenance.
- Preserve the existing structural precedence between plugin, config,
compatibility, and ordered extension sources.
- Resolve equal-precedence actions by contribution order; provenance IDs
are used only for diagnostics and cannot affect the winner.
- Preserve extension removals and the existing name-scoped `enabled =
false` veto.
- Report same-tier conflicts with every contender and the final catalog
outcome, including whether the winning action registers or removes the
server.
- Require MCP contributors to provide a stable diagnostic identity.
- Derive materialized server maps and plugin ownership from the resolved
catalog.
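The resolution rules in this list can be sketched as one pure function. All names below are hypothetical, and two behaviors are assumptions made for the example rather than statements about `ResolvedMcpCatalog`: the later contribution wins a same-tier tie, and any `enabled = false` registration for a name vetoes that name.

```rust
/// Hypothetical source tiers, lowest to highest precedence.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Source {
    Plugin,
    Config,
    Compatibility,
    Extension,
}

/// A contribution either registers a server or removes it.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Action {
    Register { enabled: bool },
    Remove,
}

struct Contribution {
    name: &'static str,
    source: Source,
    /// Stable diagnostic identity; never used to pick the winner.
    provenance: &'static str,
    action: Action,
}

#[derive(Debug, PartialEq, Eq)]
struct Outcome {
    /// Whether the server ends up in the materialized map.
    registered: bool,
    /// Every same-tier contender, listed only when there was a conflict.
    conflict: Vec<&'static str>,
}

/// Resolves one server name from contributions given in contribution order.
fn resolve(name: &str, contributions: &[Contribution]) -> Option<Outcome> {
    let mine: Vec<&Contribution> = contributions.iter().filter(|c| c.name == name).collect();
    let top = mine.iter().map(|c| c.source).max()?;
    let contenders: Vec<&Contribution> =
        mine.iter().copied().filter(|c| c.source == top).collect();
    // Assumption: within the top tier, the later contribution wins.
    let winner = *contenders.last()?;
    // Assumption: a disabled registration vetoes the whole name.
    let vetoed = mine.iter().any(|c| c.action == Action::Register { enabled: false });
    let registered = matches!(winner.action, Action::Register { .. }) && !vetoed;
    let conflict = if contenders.len() > 1 {
        contenders.iter().map(|c| c.provenance).collect()
    } else {
        Vec::new()
    };
    Some(Outcome { registered, conflict })
}
```

Keeping resolution a pure function of an ordered input list is what makes precedence, removal, and provenance explicit instead of a side effect of assembly order.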

`McpConnectionManager`, transport startup, tool calls, and resource
routing continue to consume the same effective `McpServerConfig` values.

## Scope

This PR does not add new MCP capabilities or change user-visible
behavior. It does not add executor plugin discovery, thread-scoped
registrations, dynamic refresh generations, or new user/managed policy
semantics.

## Verification

- Added focused catalog coverage for source precedence, complete
configuration preservation, disabled vetoes, plugin ownership,
contribution-order tie breaking, removal outcomes, and conflict
diagnostics.
- Extended hosted Apps coverage for ordered extension removal and
Apps-disabled hosts with and without the hosted extension installed.
- `cargo check -p codex-mcp --tests -p codex-extension-api -p
codex-core`

jif · 2026-06-11 21:54:52 +02:00

4a5a676499

156 Commits