mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
156 Commits
-
[codex] Use model metadata for skills usage instructions (#29740)
## Summary

- add a false-by-default `include_skills_usage_instructions` model metadata field
- enable the field for the bundled `gpt-5.5` model metadata
- consume the metadata in both core and extension skill rendering
- remove hardcoded legacy-model matching and its marker plumbing
ani-oai ·
2026-06-29 09:44:36 +09:00 -
Preserve namespaces on custom tool calls (#30302)
## Summary

- Preserve the optional namespace on custom tool calls during response deserialization and app-server replay.
- Use the namespaced tool identifier for streaming argument handling and tool dispatch.
- Regenerate app-server protocol schemas.
- Add regression tests covering namespace serialization and routing.

## Testing

- Ran affected protocol and app-server test suites.
- Ran the full core test suite; two load-sensitive timing tests passed when rerun individually.
- Ran Clippy and formatting checks.
- Verified with a local end-to-end app-server replay that the namespace is preserved through the complete request/response flow.
nhamidi-oai ·
2026-06-27 09:54:56 -07:00 -
[codex] allow CCA image generation and web search extensions (#29909)
## Summary

- allow the standalone image-generation and web-search extensions for the actor-authorized provider shape used by CCA
- preserve builtin `image_generation` and `web_search` for older models and existing flows
- keep ordinary non-OpenAI providers excluded from both extensions
- remove only the image extension local managed-AuthManager requirement that CCA cannot satisfy
- share actor-authorization detection through `ModelProviderInfo`
- keep Core tests focused on routing behavior and cover header-shape edge cases in `model-provider-info`
- add a Responses Lite regression that verifies both `image_gen.imagegen` and `web.run`

## Why

CCA uses a provider named `local` with `requires_openai_auth: false` and a non-empty `x-openai-actor-authorization` header. Core accepts that provider shape, but both extension provider-name gates rejected it; image generation additionally required a Codex-managed login.

The standalone paths must coexist with existing builtin tools. New Responses Lite models can receive `image_gen.imagegen` and `web.run`, while older models continue using builtin tools.

## Impact

This enables both standalone extensions for CCA once installed downstream, without removing or changing builtin-tool compatibility for older models.

## Validation

- `just test -p codex-core responses_lite_exposes_standalone_tools_for_actor_authorized_provider`
- `just test -p codex-core responses_lite_uses_standalone_web_search_and_image_generation`
- `just test -p codex-core hosted_tools_follow_provider_auth_model_and_config_gates`
- `just test -p codex-image-generation-extension`
- `just test -p codex-web-search-extension`
- `just test -p codex-model-provider-info`
- `just fmt`
- `git diff --check`
Won Park ·
2026-06-25 18:34:35 -07:00 -
Reinject missing World State fragments on resume (#30152)
## Why

World State restores its structured snapshot on resume so unchanged sections do not have to be rendered again. That is safe only when the model-visible fragment represented by the snapshot is still present in retained history.

For selected executor skills, the failing selected-capability scenario exposed this state:

```text
persisted World State:  selected skill catalog is known
retained model history: selected skill catalog message is missing
next diff:              unchanged, so emit nothing
```

The model resumes without being told about the selected skill catalog.

## What changed

World State contributions may now optionally describe the concrete model-visible fragment that must remain in retained history. When a persisted snapshot is present:

```text
matching retained fragment exists  -> trust snapshot, emit nothing
matching retained fragment missing -> treat section as absent, render current state once
```

The skills extension uses this for non-empty selected-environment catalogs by matching its exact rendered catalog body. Empty or hidden catalogs do not require a fragment.

## Scope

This does not clear or rebuild the whole World State baseline. It does not change skill discovery, cache invalidation, environment availability, or MCP runtime behavior. It only keeps a persisted section snapshot and its retained model context consistent across resume/history reconstruction.

## Coverage

A focused World State regression test verifies both sides:

- a missing retained fragment is rendered again
- a matching retained fragment avoids duplicate injection
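
The resume rule above can be sketched as a small decision function. This is a minimal illustration, not the repository's code: `ResumeAction`, `resume_action`, and the use of whole-message string equality to detect a retained fragment are all assumptions made for the example.

```rust
/// What to do on resume for one section whose persisted snapshot is unchanged.
#[derive(Debug, PartialEq, Eq)]
enum ResumeAction {
    /// The snapshot can be trusted; nothing is re-rendered.
    EmitNothing,
    /// The section is treated as absent and the current state is rendered once.
    RenderCurrentState,
}

/// Hypothetical helper. `required_fragment` is `None` for sections that do not
/// need a retained fragment (for example empty or hidden catalogs).
fn resume_action(required_fragment: Option<&str>, retained_history: &[&str]) -> ResumeAction {
    match required_fragment {
        // No fragment requirement: the persisted snapshot is trusted as-is.
        None => ResumeAction::EmitNothing,
        // The exact rendered body is still in retained history.
        Some(fragment) if retained_history.iter().any(|message| *message == fragment) => {
            ResumeAction::EmitNothing
        }
        // The fragment was dropped from history, so the model must be told again.
        Some(_) => ResumeAction::RenderCurrentState,
    }
}
```

Keeping the check per section, rather than rebuilding the whole baseline, matches the stated scope: only the section whose fragment went missing is rendered again.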
jif ·
2026-06-26 02:18:00 +01:00 -
Project selected plugin runtime by environment availability (#30093)
## Why

Selected plugin metadata is stable, but MCP processes are live runtime state. They need different lifetimes:

- the MCP extension caches manifest, MCP, and connector declarations for each stable selected root;
- each model step projects that cached metadata through the roots that resolved as ready for that exact step;
- the MCP manager is rebuilt only when that availability projection changes.

This matches executor skills: both features consume the same resolved step roots instead of inferring readiness from the turn's selected environments.

## Behavior

```text
E1 not ready for this step
  -> no E1 MCP servers or connectors
  -> cached plugin metadata stays in ext/mcp

E1 becomes ready
  -> reuse cached metadata
  -> publish one MCP runtime containing E1 capabilities

same ready roots on the next step
  -> reuse the exact runtime; no rediscovery and no MCP restart

resume
  -> create new extension thread state and a new MCP runtime
```

All model-facing consumers use the same step snapshot:

```text
resolved selected roots
          |
          v
extension MCP/connector projection
          |
          v
{ MCP config, connector snapshot, MCP manager }
          |
          +-> advertise model tools
          +-> build app/connector tools
          +-> execute MCP calls
```

## Cache contract

The existing MCP extension owns a cache keyed by the full `SelectedCapabilityRoot`:

```rust
let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
```

The cache lives with extension thread state. Environment availability filters projection but does not invalidate metadata. Resume creates new thread state. There is no file watcher or executor generation because contents behind a stable environment/root are assumed stable.

## What changes

- Keeps executor plugin discovery and cached metadata in `ext/mcp`.
- Caches MCP and connector declarations together per selected root.
- Uses the step's already-resolved capability roots, including lazy environments that are not turn environments.
- Reuses the current MCP runtime when the ready-root projection is unchanged.
- Uses the same step MCP manager and connector snapshot for model-visible tools and execution.
- Resolves direct thread-scoped MCP requests from the current selected-root projection.

## Deliberately out of scope

- `app/list` remains based on the latest global host-plugin state; this PR does not make its response or notifications thread-specific.
- `required = true` startup semantics do not apply to delayed executor MCP activation.
- No filesystem/content invalidation.
- No transport-disconnect watcher.
- No executor generations or environment replacement semantics.
- No client sharing across complete manager replacements.

## Stack

1. Extension-owned World State sections.
2. Project executor skills through World State.
3. Pin one MCP runtime to each model step.
4. **This PR:** project selected MCP and connector state from extension-owned metadata.
5. Integration coverage for selected capability availability and resume.

## Verification

- `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
- The stacked integration PR covers unavailable to ready activation, unchanged-runtime reuse, skills, MCP tools, connector attribution, and cold resume.
jif ·
2026-06-26 01:36:44 +01:00 -
Project executor skills through World State (#30088)
## Why

A selected executor environment can be unavailable in one model step and ready in the next. The model should see its skills only while that environment is ready, without rescanning stable files on every sample.

The product assumption is simple:

- an environment ID names one stable logical environment;
- the selected root contents do not change during the thread.

## Behavior

```text
E1 unavailable -> do not show E1 skills
E1 ready       -> discover once, cache, show through World State
E1 unavailable -> hide skills, keep cache
E1 ready again -> reuse cache, show skills again
resume         -> create a new thread cache and discover again
```

The cache key is the full `SelectedCapabilityRoot`. Availability does not invalidate it; dropping the extension's thread state does.

The step supplies the ready selected roots directly. They do not have to be turn environments:

```text
turn environment: laptop
selected root:    worker:/plugins/lint-fix

worker ready -> lint-fix skills are visible
```

## What changes

- Keeps executor skill catalogs in the existing skills extension.
- Passes the roots resolved as ready for the step into World State contributors.
- Loads each ready selected root at most once per thread.
- Contributes the executor catalog as the `skills` World State section.
- Uses the exact step catalog for explicit skill selection and body reads.
- Leaves host and orchestrator skill behavior where it already lives.

Taking a step snapshot itself does not add an RPC. Executor filesystem calls happen only on the first discovery of a stable root for that thread.

## What does not change

- No filesystem watcher or content-based invalidation.
- No retry/generation framework.
- No skill runtime migration into core.
- No general rewrite of the skills extension.

## Stack

1. Extension-owned World State sections.
2. **This PR:** project cached executor skills through World State.
3. Pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment availability.
5. One end-to-end integration scenario.
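
The discover-once behavior described in this commit can be sketched with a per-thread map keyed by the selected root. This is an illustration under stated assumptions: `SelectedRoot`, `ThreadSkillCache`, and the `discover` callback are hypothetical stand-ins, and the real `SelectedCapabilityRoot` type and skills extension are not reproduced here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the full selected-root key (environment id plus root path).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct SelectedRoot {
    environment_id: String,
    path: String,
}

/// Hypothetical per-thread cache; creating a new value models a resume.
#[derive(Default)]
struct ThreadSkillCache {
    catalogs: HashMap<SelectedRoot, Vec<String>>,
    /// Counts how often the (expensive) discovery callback actually ran.
    discoveries: usize,
}

impl ThreadSkillCache {
    /// Returns the skills visible for this step: only roots that are ready
    /// contribute, and each root is discovered at most once per cache.
    fn visible_skills(
        &mut self,
        ready_roots: &[SelectedRoot],
        discover: impl Fn(&SelectedRoot) -> Vec<String>,
    ) -> Vec<String> {
        let mut visible = Vec::new();
        for root in ready_roots {
            if !self.catalogs.contains_key(root) {
                self.discoveries += 1;
                self.catalogs.insert(root.clone(), discover(root));
            }
            visible.extend(self.catalogs[root].iter().cloned());
        }
        visible
    }
}
```

Availability only filters what is returned; an unavailable root keeps its cached catalog, so becoming ready again does not trigger a second discovery.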
jif ·
2026-06-26 00:13:43 +01:00 -
Let extensions contribute World State sections (#30100)
## Why

#29856 already owns the durable thread intent and exact environment binding. This PR adds only the small missing extension boundary: an extension can contribute one named World State section, while core still owns persistence, diffing, and model-visible fragment types.

This lets skills stay in the skills extension instead of moving their runtime into core.

## Shape

```text
extension-owned state
        |
        | contribute section id + JSON snapshot + renderer
        v
core World State
        |
        | compare with the previous snapshot
        v
no message, or one incremental model-visible update
```

The extension API is deliberately small:

```rust
fn contribute_world_state(...) -> Vec<WorldStateSectionContribution>
```

Core adapts the rendered result to `ContextualUserFragment`, records the snapshot, and keeps the existing compaction/resume behavior.

## What changes

- Adds extension-owned World State section contributions.
- Calls those contributors from the existing per-step World State builder.
- Restores durable selected capability roots into extension thread state on resume.
- Keeps the actual model-context fragment and rollout machinery in core.

## What does not change

- No skill or MCP implementation moves out of its extension.
- No new file watcher, generation, or RPC.
- No generic migration of existing World State sections.
- No change to the stable environment-ID assumption from #29856.

## Example

```text
step 1 snapshot: skills = []
step 2 snapshot: skills = [executor-demo:deploy]

core asks the skills extension to render only that change.
```

## Stack

1. **This PR:** let extensions contribute World State sections.
2. Project executor skills through the skills extension.
3. Pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment availability.
5. One end-to-end integration scenario.
jif ·
2026-06-25 22:23:51 +01:00 -
Support HTTP MCP servers from selected executor plugins (#28522)
## Why

Selected executor plugins can declare both stdio and Streamable HTTP MCP servers, but only stdio registrations were retained. That silently drops part of the plugin's tool surface and prevents HTTP traffic from using the owning executor's network.

## What changed

- retain selected-plugin Streamable HTTP MCP declarations alongside stdio declarations
- route their HTTP clients through the owning executor environment
- preserve local auth-header environment references while rejecting them for executor-hosted declarations
- cover thread isolation, refresh, and an executor-only HTTP route end to end
jif ·
2026-06-25 10:10:36 +01:00 -
Represent MCP authentication with an enum (#29924)
## Why

MCP authentication has distinct OAuth and ChatGPT-session flows. Representing that choice as `use_chatgpt_auth` makes one flow implicit and allows the configuration model to express the distinction only through a boolean.

ChatGPT credential forwarding also needs a first-party trust boundary. A configurable `chatgpt_base_url` controls routing, but must not grant an MCP server permission to receive session credentials.

This change builds on #29733, where the boolean was introduced.

## What changed

- Replace `use_chatgpt_auth` with an `auth` field backed by the exhaustive `McpServerAuth` enum.
- Support `auth = "oauth"` and `auth = "chatgpt"`, with OAuth remaining the default.
- Trust only the origin derived from the existing hardcoded `CHATGPT_CODEX_BASE_URL` when granting ChatGPT auth to an MCP server.
- Keep configured bearer tokens and authorization headers ahead of the selected authentication flow.
- Update config writers, schema output, fixtures, and integration-test setup to use the enum.

## Verification

Integration coverage exercises the complete streamable HTTP startup path in two independent configurations:

- A directly constructed MCP configuration verifies that matching an overridden `chatgpt_base_url` does not grant ChatGPT auth.
- A persisted `config.toml` containing an attacker-controlled `chatgpt_base_url` and `auth = "chatgpt"` verifies the same boundary through normal config parsing.

Both tests complete MCP initialization and tool listing and assert that the full captured request sequence contains no authorization headers. Separate integration coverage verifies that configured authorization takes precedence over ChatGPT auth.
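
The precedence rules listed in this commit can be sketched as follows. Only the enum name `McpServerAuth` and the two `auth` values come from the commit message; the variant names, `parse_auth`, `CredentialDecision`, `decide_credentials`, and the plain string comparison of origins are assumptions for illustration. How the real code behaves when ChatGPT auth is requested for an untrusted origin is not spelled out beyond "no authorization headers", so the sketch models that case as a separate outcome.

```rust
/// Mirrors the `auth` values named above; OAuth is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum McpServerAuth {
    #[default]
    OAuth,
    ChatGpt,
}

/// Hypothetical parser for the optional `auth` config value.
fn parse_auth(value: Option<&str>) -> Result<McpServerAuth, String> {
    match value {
        None => Ok(McpServerAuth::default()),
        Some("oauth") => Ok(McpServerAuth::OAuth),
        Some("chatgpt") => Ok(McpServerAuth::ChatGpt),
        Some(other) => Err(format!("unknown auth value: {other}")),
    }
}

/// Hypothetical outcome type for the credential decision.
#[derive(Debug, PartialEq, Eq)]
enum CredentialDecision {
    ConfiguredAuthorization,
    ChatGptSession,
    OAuth,
    NoSessionCredentials,
}

/// Configured authorization wins; ChatGPT session credentials are granted only
/// when the server origin equals the hardcoded trusted origin.
fn decide_credentials(
    auth: McpServerAuth,
    has_configured_authorization: bool,
    server_origin: &str,
    trusted_origin: &str,
) -> CredentialDecision {
    if has_configured_authorization {
        return CredentialDecision::ConfiguredAuthorization;
    }
    match auth {
        McpServerAuth::OAuth => CredentialDecision::OAuth,
        McpServerAuth::ChatGpt if server_origin == trusted_origin => {
            CredentialDecision::ChatGptSession
        }
        McpServerAuth::ChatGpt => CredentialDecision::NoSessionCredentials,
    }
}
```

Note that `trusted_origin` is passed in by the caller here only to keep the sketch testable; the point of the commit is that this value is not taken from user-editable configuration.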
Ahmed Ibrahim ·
2026-06-24 19:51:51 -07:00 -
Allow ChatGPT-hosted MCP servers to use session auth (#29733)
## Why

ChatGPT session authentication was inferred from the reserved Codex Apps server name. That couples credential routing to Codex Apps-specific behavior and prevents other MCP endpoints hosted by ChatGPT from explicitly using the current session.

The opt-in also needs a clear security boundary: an arbitrary MCP configuration must not be able to redirect ChatGPT credentials to another origin.

## What changed

- Add `use_chatgpt_auth` to HTTP MCP server configuration, defaulting to `false`.
- Honor the setting only when the parsed server URL has the same HTTP(S) origin as the configured `chatgpt_base_url`; otherwise remove the capability before startup.
- Resolve bearer tokens and static or environment-backed authorization headers before selecting authentication, with configured authorization taking precedence over ChatGPT session auth.
- Enable the setting for the built-in Codex Apps and hosted plugin runtime endpoints while keeping Codex Apps caching and tool normalization scoped to the reserved server.
- Persist the setting through MCP config rewrite paths and expose it in the generated config schema.
- Load the current login state for `codex mcp list` so reported auth status matches runtime behavior.

## Verification

Core integration coverage exercises the complete streamable HTTP MCP startup path and verifies that:

- a same-origin opted-in server receives the current ChatGPT access token;
- an explicitly configured authorization header takes precedence;
- a different-origin server completes MCP initialization and tool listing without receiving any ChatGPT authorization header.
Ahmed Ibrahim ·
2026-06-24 19:21:28 -07:00 -
Read connector declarations from executor plugins (#29852)
## Why

Selected capability roots can live on a different executor and operating system from app-server. Their connector declarations must therefore be read through the executor that owns the package, without converting executor URIs into host paths.

This PR adds that authority-bound reader without activating connectors or changing thread startup.

## What changed

- Add a small `codex-connectors-extension` crate for executor-owned connector I/O.
- Read only the app configuration explicitly declared by the resolved plugin manifest.
- Read through the `ExecutorFileSystem` retained by `ResolvedExecutorPlugin`; there is no host-filesystem fallback or default-file probe.
- Keep `PathUri` values intact so Windows, Unix, and remote executor paths work from any orchestrator OS.
- Return full `AppDeclaration` values so the caller retains declaration names and categories for routing.
- Preserve the selected plugin ID and exact executor URI in read and parse errors.

The contract is intentionally narrow: selected packages are assumed to be trusted and valid, and packages that provide connectors explicitly declare their app configuration.

## Stack scope

This PR is stacked on #29851. It only provides the executor-backed reader. #29856 resolves selected roots at thread start, freezes their connector snapshot, and contains the remote-capable end-to-end authority test for the complete path.
jif ·
2026-06-24 23:56:50 +01:00 -
Keep executor plugin MCP paths URI-native (#29628)
## Why

Executor-owned plugin roots are `PathUri`, but MCP config normalization still converts them into a native `Path` using the app-server host's rules. Relative `cwd` values can therefore resolve against the wrong filesystem when host and executor path conventions differ.

This PR keeps executor MCP paths URI-native until the selected environment launches the server, while retaining the existing host parser behavior.

## What changed

- Keep one shared MCP normalization path with narrow host-`Path` and executor-`PathUri` entrypoints.
- Preserve native host resolution for locally installed plugin MCP configs.
- For executor configs, default `cwd` to the plugin root and resolve relative working directories with the root URI's path convention.
- Accept explicit executor `file:` URIs only when they remain within the selected plugin root.
- Preserve the selected environment id and existing remote environment-variable ownership rules.
- Route the executor plugin provider through the URI-native entrypoint without converting the root on the host.
- Ensure `codex doctor` does not probe executor-owned stdio commands or foreign working directories on the host.
- Cover foreign Windows roots, relative and absolute executor working directories, traversal rejection, runtime resolution, and doctor behavior.

```text
plugin root:    file:///C:/plugins/demo
configured cwd: scripts
        |
        v
resolved cwd:   file:///C:/plugins/demo/scripts
        |
        v
launch through the selected executor
```

No new provider or filesystem abstraction is introduced.

## Stack

1. #29614: add lexical `PathUri` containment.
2. #29620: share URI-native manifest path resolution.
3. #28918: keep selected plugin roots and resources URI-native.
4. #29626: load executor skills without host path conversion.
5. **This PR:** resolve executor MCP working directories without host path conversion.
jif ·
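
The working-directory rules above can be sketched with plain string handling. This is a simplified illustration, not the `PathUri` implementation: `resolve_executor_cwd` is a hypothetical helper, percent-encoding and native absolute paths are deliberately not handled, and containment is checked lexically on `/`-separated URI segments.

```rust
/// Hypothetical helper: resolves a configured `cwd` against an executor plugin
/// root given as a `file:` URI, without touching the host filesystem.
fn resolve_executor_cwd(root_uri: &str, cwd: Option<&str>) -> Result<String, String> {
    let root = root_uri.trim_end_matches('/');

    // No cwd configured: default to the plugin root.
    let Some(cwd) = cwd else {
        return Ok(root.to_string());
    };

    if cwd.starts_with("file:") {
        // Explicit URI: accept only when it stays lexically inside the root.
        let candidate = cwd.trim_end_matches('/');
        if candidate.split('/').any(|segment| segment == "..") {
            return Err(format!("cwd contains a parent segment: {cwd}"));
        }
        if candidate == root || candidate.starts_with(&format!("{root}/")) {
            return Ok(candidate.to_string());
        }
        return Err(format!("cwd is outside the plugin root: {cwd}"));
    }

    // Relative cwd: normalize segments and reject traversal above the root.
    let mut segments: Vec<&str> = Vec::new();
    for segment in cwd.split('/') {
        match segment {
            "" | "." => {}
            ".." => {
                if segments.pop().is_none() {
                    return Err(format!("cwd escapes the plugin root: {cwd}"));
                }
            }
            other => segments.push(other),
        }
    }
    if segments.is_empty() {
        Ok(root.to_string())
    } else {
        Ok(format!("{root}/{}", segments.join("/")))
    }
}
```

Because the function never converts the URI to a host path, a Windows root such as `file:///C:/plugins/demo` resolves the same way on a Unix orchestrator.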
2026-06-24 09:46:07 +01:00 -
Let image generation extension hosts control output persistence (#29711)
## Why

Some extension hosts need generated images returned without writing them to the local filesystem or giving the model a local path.

## What changed

**tl;dr**: we now conduct all extension operations in the image gen extension

- Let hosts provide an optional image save root when installing the extension.
- Save images and return path hints only when a save root is configured.
- Return image data without saving or adding a path hint when no save root is configured.
- Preserve the extension-provided `saved_path` instead of persisting extension images again in core.
- Leave built-in image generation unchanged.

## Validation

- `just test -p codex-image-generation-extension`
- `just test -p codex-app-server standalone_image_generation_returns_saved_path_hint_to_model`
- `just test -p codex-core extension_tool_uses_granted_turn_permissions_without_local_persistence`
- `just test -p codex-core tools::handlers::extension_tools::tests`
- Tested on the Codex CLI with `save_root` set to `CODEX_HOME` and to `None`.
- Tested on the Codex app with both settings as well.
Won Park ·
2026-06-23 18:51:49 -07:00 -
Load executor skills without host path conversion (#29626)
## Why

After #28918, selected skill roots are `PathUri`, but the executor skill provider still converts them to the app-server host's `AbsolutePathBuf`. A foreign Windows root therefore cannot be discovered by a Unix host, and the inverse has the same problem.

This PR keeps executor skill discovery and reads on the filesystem that owns the selected root while reusing the existing skill rules.

## What changed

- Generalize the existing skill traversal to operate on `PathUri` through `ExecutorFileSystem`, preserving its depth, directory, symlink, and sibling-metadata concurrency behavior.
- Add a small environment skill loader that reuses the shared discovery, frontmatter validation, dependency parsing, product policy, and prompt-visibility rules.
- Keep the environment id and entrypoint `PathUri` in the skill catalog, then route `skills.read` back through the same environment filesystem.
- Preserve the executor's path convention when deriving catalog handles, including literal backslashes in POSIX filenames.
- Resolve plugin namespaces from nearby manifests through URI-native filesystem reads.
- Cover foreign Windows roots, executor-owned reads, namespaces, metadata, policy, and path identity.

```text
selected root (PathUri)
        |
        v
shared discovery over ExecutorFileSystem
        |
        v
environment-bound catalog entry --skills.read--> same ExecutorFileSystem
```

No second filesystem abstraction or duplicate traversal implementation is introduced.

## Stack

1. #29614: add lexical `PathUri` containment.
2. #29620: share URI-native manifest path resolution.
3. #28918: keep selected plugin roots and resources URI-native.
4. **This PR:** load executor skills without host path conversion.
5. #29628: resolve executor MCP working directories without host path conversion.
jif ·
2026-06-23 23:26:06 +01:00 -
Make selected plugin roots URI-native (#28918)
## Why

Selected capability roots belong to the executor filesystem, not the app-server host. Converting their path strings into the host's native `Path` breaks whenever the two machines use different path conventions, such as a Windows executor behind a Unix app-server.

This PR establishes `PathUri` as the selected-plugin boundary so the executor remains authoritative for its paths.

## What changed

- Require `selectedCapabilityRoots[].location.path` to be a canonical `file:` URI and deserialize it directly as `PathUri`; native path strings are rejected.
- Update the app-server schema, generated TypeScript, examples, and request coverage for the URI contract.
- Keep selected roots, resolved plugin locations, manifest paths, and manifest resources as `PathUri`.
- Inspect and read plugin roots and manifests only through the selected environment's `ExecutorFileSystem`.
- Parse executor manifests with the shared URI-native parser from #29620 instead of projecting them onto the host filesystem.
- Enforce resource containment lexically and preserve the root URI's POSIX or Windows path convention.
- Cover foreign Windows plugin roots and URI-native manifest resources.

```text
thread/start
  selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo"
        |
        | PathUri
        v
ExecutorFileSystem
        |
        +--> plugin.json
        +--> manifest resources
```

This PR stops at the shared selected-plugin representation. The next two PRs remove the remaining host-path projections in the skill and MCP consumers.

## Stack

1. #29614: add lexical `PathUri` containment.
2. #29620: share URI-native manifest path resolution.
3. **This PR:** keep selected plugin roots and resources URI-native.
4. #29626: load executor skills without host path conversion.
5. #29628: resolve executor MCP working directories without host path conversion.
jif ·
2026-06-23 22:51:19 +01:00 -
[codex] Use input items for Responses Lite tools (#27946)
When using Responses Lite, we should always use `additional_tools` and a developer item instead of the top-level tools array and instructions field. This keeps things 1-to-1.

Forced namespacing for _all_ tools will land in a following PR after some coordination and fixes in the Responses API (around collisions and return items).

The goal is to eventually expand the scope of this to _all_ requests from Codex, but that will require larger coordination across providers and a slower rollout.
rka-oai ·
2026-06-22 23:56:16 -07:00 -
mcp: accept foreign absolute cwd for remote stdio (#29493)
## Why

Remote stdio MCP servers can run in an environment whose path convention differs from the Codex host. A Windows cwd such as `C:\Users\openai\share` is absolute for the executor but was rejected by a POSIX orchestrator.

Built on #29501, now merged, which only clarifies the host-native `PathUri` constructor name.

## What changed

- Deserialize MCP cwd values as `LegacyAppPathString` so config does not apply host path rules.
- Interpret that spelling as host-native for local launches and convert it to `PathUri` at executor launch.
- Skip host filesystem and command resolution checks for remote stdio in `codex doctor`.
- Add host-independent config and executor-boundary coverage using the foreign path convention for each test platform.

## Validation

- `just test -p codex-utils-path-uri -p codex-config -p codex-mcp -p codex-rmcp-client` (408 passed)
- `just test -p codex-cli -p codex-rmcp-client` (372 passed)
- `cargo check --workspace --tests`
- `just test` (11,311 passed; 43 unrelated environment/timing failures)
- `just fix -p codex-cli -p codex-config -p codex-core -p codex-mcp -p codex-mcp-extension -p codex-rmcp-client -p codex-tui`
Adam Perry @ OpenAI ·
2026-06-23 01:33:51 +00:00 -
core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
## Description

This PR cuts Codex over from generic `ResponseItem.metadata` (introduced here: https://github.com/openai/codex/pull/28355) to `ResponseItem.internal_chat_message_metadata_passthrough`, which is the blessed path and has strongly-typed keys.

For now we have to drop this MAv2 usage of `metadata`: https://github.com/openai/codex/pull/28561 until we figure out where that should live.
Owen Lin ·
2026-06-22 11:11:25 -07:00 -
[codex] Preserve skill descriptions outside model context (#29006)
## Why

Skill descriptions are used in model-visible lists: the default available-skills catalog that supports implicit selection, and the on-demand `skills.list` tool response used to discover orchestrator skills. A single overlong description should not consume a disproportionate share of either list.

Enforcing the 1024-character limit while loading or migrating skills is the wrong boundary: it rejects otherwise-valid skills and discards metadata that non-model consumers and full skill reads may need. Skill metadata and `SKILL.md` content should remain intact; the cap belongs at model-visible list rendering boundaries.

## What changed

- Preserve full `description` and `metadata.short-description` values when loading skills.
- Preserve full external-agent command descriptions during `source-command-*` migration instead of skipping commands solely because their descriptions exceed 1024 characters.
- Preserve full normalized orchestrator descriptions in the underlying skills catalog.
- Cap each description at 1024 Unicode characters when rendering the default available-skills context in `codex-core-skills` and `codex-skills-extension`.
- Apply the same cap when serializing descriptions in the model-visible `skills.list` response.
- Render truncated descriptions as 1021 original characters plus `...`.
- Leave explicit `$skill` injection, `skills.read`, underlying metadata, and on-disk `SKILL.md` files unchanged and full-fidelity.

## Implicit skill selection

Codex injects a bounded catalog containing each implicitly allowed skill's name, description, and source locator, together with instructions to use a skill when the task clearly matches its description. The model makes that semantic choice; after selecting a skill, it reads the full `SKILL.md` from its filesystem or provider resource. Explicit `$skill` mentions remain a separate path that injects the full skill instructions.

For orchestrator skills, `skills.list` provides bounded discovery metadata before `skills.read` returns the full selected resource.

## Test plan

- `just test -p codex-core-skills`
- `just test -p codex-skills-extension`
- `just test -p codex-external-agent-migration`

The focused regressions verify that overlong metadata is preserved at load and migration boundaries while default available-skills rendering and `skills.list` output produce the 1021-character prefix plus `...`.
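
The rendering cap described in this commit can be sketched in a few lines. This is an illustration only: `render_description` and `MAX_DESCRIPTION_CHARS` are hypothetical names, and counting Rust `char` values (Unicode scalar values) is an assumption about what "Unicode characters" means here.

```rust
/// Maximum number of characters shown for a description in a model-visible list.
const MAX_DESCRIPTION_CHARS: usize = 1024;

/// Hypothetical helper: caps a description at render time. The stored
/// description is never modified; only the rendered copy is shortened.
fn render_description(description: &str) -> String {
    if description.chars().count() <= MAX_DESCRIPTION_CHARS {
        return description.to_string();
    }
    // Keep 1021 original characters and append "..." for a total of 1024.
    let prefix: String = description
        .chars()
        .take(MAX_DESCRIPTION_CHARS - 3)
        .collect();
    format!("{prefix}...")
}
```

Iterating over `chars()` instead of slicing bytes keeps the cut on a character boundary, so multi-byte text cannot be split mid-character.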
charlesgong-openai ·
2026-06-19 12:47:53 -07:00 -
rphilizaire-openai ·
2026-06-19 10:13:27 -07:00 -
Add config toggles for orchestrator skills and MCP (#28942)
## Why

Orchestrator-provided skills and Codex Apps MCP tools add model-visible instructions, resources, and tools beyond the local workspace. Hosts need config-level switches to disable those orchestrator-owned surfaces independently, without disabling regular skills or regular MCP servers.

## What changed

- Adds `[orchestrator.skills].enabled` and `[orchestrator.mcp].enabled` config entries, both defaulting to `true`.
- Includes the new settings in `config.schema.json` and in the config lock so resolved thread configuration preserves the same orchestrator exposure decisions.
- Threads `orchestrator.skills.enabled` through the app-server skills extension so disabled orchestrator skills do not expose the `skills` namespace or inject orchestrator skill context.
- Gates Codex Apps MCP exposure, app instructions, and app auth eligibility on `orchestrator.mcp.enabled` while leaving non-Codex-Apps MCP tools available.
- Updates the thread-manager sample config to disable both orchestrator-owned surfaces.

## Verification

- Added config parsing, loading, defaulting, and schema coverage for the new settings.
- Added MCP exposure coverage that `orchestrator.mcp.enabled = false` removes Codex Apps tools while preserving regular MCP tools.
- Added app-server coverage that `orchestrator.skills.enabled = false` prevents orchestrator skill tools, prompts, and resource reads from reaching the model turn.
jif ·
2026-06-19 14:42:26 +02:00 -
Add indexed web search mode (#28489)
## Summary

- Add `web_search = "indexed"` alongside `disabled`, `cached`, and `live`.
- Use that same resolved mode for both hosted and standalone web search.
- For hosted search, send `index_gated_web_access: true` with external web access enabled only when `indexed` is selected.
- For standalone search, preserve the existing boolean wire values for existing modes (`cached` maps to `false` and `live` to `true`) and send `"indexed"` only for `indexed`; `disabled` keeps the tool unavailable.
- Carry the mode through managed configuration requirements and generated schemas.

## Why

Indexed search provides a middle ground between cached-only search and unrestricted live page fetching. Search queries can remain live while direct page fetches are limited to URLs admitted by the server.

The existing `web_search` setting remains the single source of truth, so hosted and standalone executors cannot drift into different access modes. Without an explicit `indexed` selection, the existing model-visible tool and request shapes are unchanged.

```toml
web_search = "indexed"

[features]
standalone_web_search = true
```

## Validation

- `just fmt`
- `just test -p codex-api` (`126 passed`)
- `just test -p codex-web-search-extension` (`7 passed`)
- `just test -p codex-core code_mode_can_call_indexed_standalone_web_search` (`1 passed`)
- Focused configuration, hosted request, standalone request, and managed-requirement coverage is included in the PR; remaining suites run in CI. The full workspace test suite was not run locally.
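
The standalone wire mapping in the summary can be sketched as a single match. The type and function names below (`WebSearchMode`, `StandaloneWireValue`, `parse_mode`, `standalone_wire_value`) are hypothetical; only the four mode strings and their wire values come from the commit message.

```rust
/// The four `web_search` modes named in the summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WebSearchMode {
    Disabled,
    Cached,
    Live,
    Indexed,
}

/// Hypothetical model of the value sent for standalone search: existing modes
/// keep their boolean shape, and only `indexed` uses a string.
#[derive(Debug, PartialEq, Eq)]
enum StandaloneWireValue {
    Bool(bool),
    Text(&'static str),
}

/// Hypothetical parser for the config string.
fn parse_mode(value: &str) -> Option<WebSearchMode> {
    match value {
        "disabled" => Some(WebSearchMode::Disabled),
        "cached" => Some(WebSearchMode::Cached),
        "live" => Some(WebSearchMode::Live),
        "indexed" => Some(WebSearchMode::Indexed),
        _ => None,
    }
}

/// Returns `None` when the tool should stay unavailable.
fn standalone_wire_value(mode: WebSearchMode) -> Option<StandaloneWireValue> {
    match mode {
        WebSearchMode::Disabled => None,
        WebSearchMode::Cached => Some(StandaloneWireValue::Bool(false)),
        WebSearchMode::Live => Some(StandaloneWireValue::Bool(true)),
        WebSearchMode::Indexed => Some(StandaloneWireValue::Text("indexed")),
    }
}
```

Deriving both the hosted and standalone request shapes from one resolved `WebSearchMode` value is what keeps the two paths from drifting apart.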
Winston Howes ·
2026-06-19 05:35:57 -07:00 -
[codex] Assign response item IDs when recording history (#28814)
## Why Client-created response items enter history without IDs, so their identity is lost across rollout persistence and resume. IDs should be assigned once at the history-recording boundary, while IDs returned by the server must remain unchanged. The Responses API validates item IDs using type-specific prefixes. Locally generated IDs therefore use the matching prefix plus a hyphenated UUIDv7, keeping them valid while distinguishable from server-generated IDs. Because this changes persisted history and provider request shapes, the behavior is opt-in behind the under-development `item_ids` feature. Compaction triggers remain request controls whose API shape does not accept an ID. ## What changed - Register the disabled-by-default `item_ids` feature and expose it in `config.schema.json`. - Make supported optional `ResponseItem` IDs serializable and expose them in the generated app-server schemas. - When `item_ids` is enabled, assign an ID during conversation-history preparation if an item has no ID. - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API item conventions. - Preserve existing server IDs without rewriting them. - Persist assigned IDs in rollouts and include them in subsequent Responses requests. - Remove the unsupported ID field from `CompactionTrigger` and document why it has no ID. - Add integration coverage for enabled ID persistence, preservation of server IDs, and omission of generated IDs while the feature is disabled. `prepare_conversation_items_for_history` is the single response-item ID allocation boundary. 
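The allocation rule above (assign only when the feature is on and the item has no ID, never rewrite a server ID) can be sketched as follows. `Item`, `assign_missing_ids`, the `_` separator between prefix and UUID, and the injected `next_uuid` generator are all assumptions for illustration; the PR itself uses hyphenated UUIDv7 values and the Responses API prefixes.

```rust
/// Hypothetical stand-in for a response item: a type prefix plus an optional ID.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Item {
    /// Type-specific prefix such as "msg"; illustrative only.
    pub prefix: &'static str,
    pub id: Option<String>,
}

/// Assigns IDs at a single boundary. `next_uuid` is injected so the sketch
/// needs no UUID crate; a real implementation would produce UUIDv7 values.
pub fn assign_missing_ids(
    items: &mut [Item],
    feature_enabled: bool,
    mut next_uuid: impl FnMut() -> String,
) {
    // With `item_ids` disabled, history is left exactly as it was.
    if !feature_enabled {
        return;
    }
    for item in items.iter_mut() {
        // Server-provided IDs are preserved; only missing IDs are filled in.
        if item.id.is_none() {
            // Assumption: prefix and UUID are joined with an underscore.
            item.id = Some(format!("{}_{}", item.prefix, next_uuid()));
        }
    }
}
```

Keeping the generator as a parameter also makes the behavior easy to test deterministically, which matters because the assigned IDs end up in persisted rollouts.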
## Test plan - `just test -p codex-features` - `just test -p codex-core response_item_ids_persist_across_resume_and_preserve_server_ids` - `just test -p codex-core non_openai_responses_requests_omit_item_turn_metadata` - `just test -p codex-core resize_all_images_prepares_failures_before_history_insertion` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
pakrym-oai ·
2026-06-18 17:30:55 -07:00 -
[codex] Reuse parsed plugin skills during session startup (#28844)
## Summary - Preserve raw plugin skill-root snapshots in the matching loaded-plugin cache entry, keyed by the effective plugin root identity including namespace. - Pass those snapshots through `SkillsLoadInput` as an optional preload, so session startup reuses plugin parsing while ordinary skill loads pass `None`. - Keep plugin skill loading cohesive: the existing loaders accept the optional snapshots directly, and uncached or marketplace-detail paths do not create a cache. ## Why Plugin discovery already parses plugin skills to determine available capabilities. Cold session startup then scanned and parsed the same roots again while building the skills snapshot. This solves the same duplicate-work problem as #28623 while keeping ownership narrow: `PluginsManager` creates and owns `PluginSkillSnapshots` only for its loaded-plugin cache entry; `SkillsService` consumes an optional clone. Entry replacement or clearing naturally drops the snapshots, with no separate generation, capacity policy, or watcher coupling. ## Validation - `cargo clippy -p codex-core-skills --all-targets -- -D warnings` - `just test -p codex-core-plugins skills_service_reuses_skills_parsed_during_plugin_load` - `just test -p codex-core-skills namespaces_plugin_skills_using_provided_namespace` - `just fmt`
xl-openai ·
2026-06-18 16:45:58 -07:00 -
Fix goal-first live threads missing from thread/list (#28808)
Fixes #28263. ## Why When a thread starts with `/goal`, the goal extension can update SQLite goal state before the thread has any user-turn rollout items. `thread/list` and `thread/search` rely on persisted listing metadata, so a goal-first live thread could be absent from app-server listings after restart even though the goal itself existed. This regressed when goal handling moved out of core: the core path wrote the goal update through the live thread rollout path, while the extension-backed app-server path only updated goal state and emitted the live notification. ## What - Add `GoalSetOutcome::thread_goal_updated_item()` so the goal extension owns the canonical `ThreadGoalUpdated` rollout item shape. - Expose a narrow `CodexThread::append_rollout_items()` helper that appends through the live thread and keeps derived SQLite metadata in sync. - When app-server sets a goal on an active live thread, persist the goal update through that live-thread path. - Add an app-server regression test that starts a live thread with `thread/goal/set` and verifies it appears in state-DB-only `thread/list`. ## Verification - `env -u CODEX_SQLITE_HOME just test -p codex-app-server goal_first_live_thread_appears_in_state_db_thread_list`
Eric Traut ·
2026-06-18 10:50:15 -07:00 -
Add turn-scoped context contributions (#28911)
## Summary - keep context injection on a single ContextContributor trait - split context injection into thread-scoped and turn-scoped contribution methods - wire turn-scoped fragments into initial context assembly so extensions can contribute context from turn-local state
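One way to picture the split is a single trait with two default methods. This is only a sketch: the method names, `TurnState`, `ExampleContributor`, and the choice to place thread-scoped fragments before turn-scoped ones are assumptions, not the signatures used in the PR.

```rust
/// Hypothetical turn-local state handed to contributors.
pub struct TurnState {
    pub turn_index: u32,
}

/// One trait, two scopes. Both methods default to contributing nothing.
pub trait ContextContributor {
    /// Fragments that hold for the whole thread.
    fn thread_fragments(&self) -> Vec<String> {
        Vec::new()
    }

    /// Fragments derived from turn-local state.
    fn turn_fragments(&self, _turn: &TurnState) -> Vec<String> {
        Vec::new()
    }
}

/// Assembles initial context. Assumption: thread-scoped fragments come first.
pub fn initial_context(
    contributors: &[&dyn ContextContributor],
    turn: &TurnState,
) -> Vec<String> {
    let mut fragments = Vec::new();
    for contributor in contributors {
        fragments.extend(contributor.thread_fragments());
    }
    for contributor in contributors {
        fragments.extend(contributor.turn_fragments(turn));
    }
    fragments
}

/// Example extension that contributes in both scopes.
pub struct ExampleContributor;

impl ContextContributor for ExampleContributor {
    fn thread_fragments(&self) -> Vec<String> {
        vec!["thread: workspace rules".to_string()]
    }

    fn turn_fragments(&self, turn: &TurnState) -> Vec<String> {
        vec![format!("turn {}: local state", turn.turn_index)]
    }
}
```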
jif ·
2026-06-18 19:40:28 +02:00 -
[codex] Pass plugin namespace into skill loading (#28608)
## What changed - retain the parsed plugin manifest namespace on loaded plugins - carry that namespace through `PluginSkillRoot` and `SkillRoot` - use the provided namespace when qualifying plugin skill names - include the namespace in the skills cache key ## Why Plugin loading has already parsed `plugin.json`, but skill parsing currently walks every `SKILL.md` ancestor and probes/reads the manifest again to reconstruct the same namespace. Passing the parsed namespace removes those repeated filesystem calls, which are particularly costly on remote filesystems. Context: https://openai.slack.com/archives/C0ARA9GF5D4/p1781639496496439?thread_ts=1781202444.891669&cid=C0ARA9GF5D4 ## Impact Plugin skill names remain unchanged. A regression test uses a deliberately different on-disk manifest name to verify that plugin roots use the provided parsed namespace. ## Validation - `just test -p codex-core-skills -p codex-core-plugins -p codex-plugin -p codex-utils-plugins` (352 passed) - `just fix -p codex-core-skills -p codex-core-plugins -p codex-plugin -p codex-utils-plugins` - `just fmt`
Matthew Zeng ·
2026-06-18 00:16:46 -07:00 -
[codex] Support plugin manifest path lists (#28790)
## Summary Allow plugin manifests to declare `skills` as either a single path string or an array of path strings in the core plugin loader. ## Why Some plugin packages need to expose skills from more than one directory. Before this change, `plugin.json` only accepted a single string for `skills`, so manifests like this were ignored as an invalid `skills` shape: ```json { "skills": ["./skills/abc", "./skills/edk"] } ``` This keeps the existing single-string form working while adding support for the list form. The final scope is intentionally limited to the core plugin manifest/load path for `skills`; `apps`, file-backed `mcpServers`, and the bundled plugin-creator assets are unchanged in this PR. ## What changed - Parse `skills` as either a string or an array of strings in `plugin.json`. - Store resolved skill paths as a list in `PluginManifestPaths`. - Load manifest-declared skill roots in addition to the default `./skills` root. - Deduplicate exact duplicate skill roots before loading. - Rely on existing skill-loader dedupe by canonical `SKILL.md` path for overlapping roots such as `./skills` plus `./skills/abc`. - Update plugin manifest tests to cover: - single string `skills` - list of string `skills` - duplicate skill roots - `./skills` as a manifest path - explicit child roots like `./skills/abc` and `./skills/edk` - overlapping-root dedupe ## Validation - `just test -p codex-plugin` - `just test -p codex-core-plugins` - `just test -p codex-mcp-extension` - `git diff --check`
charlesgong-openai ·
2026-06-17 21:33:53 -07:00 -
[codex] Add optional IDs to response items (#28812)
## Why `ResponseItem` variants do not have a consistent internal ID shape: some variants carry required IDs, some carry optional IDs, and some cannot represent an ID at all. The existing fields also use inconsistent serde, TypeScript, and JSON-schema annotations. A single enum-level access path is needed before history recording can assign and retain IDs. This PR establishes that internal model only. It intentionally does not generate or serialize IDs; allocation and wire persistence are isolated in the stacked follow-up. ## What changed - Give every concrete `ResponseItem` variant an `Option<String>` ID field. - Apply the same internal-only annotations to every ID field: `#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and `#[schemars(skip)]`. - Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared accessors. - Preserve IDs when history items are rewritten for truncation. - Adapt consumers that previously assumed reasoning and image-generation IDs were required. - Regenerate app-server schemas so the hidden fields are represented consistently. The serde catch-all `ResponseItem::Other` remains ID-less because it must remain a unit variant. ## Test plan - `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace -p codex-image-generation-extension` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-api -p codex-rollout-trace -p codex-image-generation-extension` - `just test -p codex-core event_mapping`
pakrym-oai ·
2026-06-17 18:27:43 -07:00 -
Replace SkillsManager with SkillsService (#28705)
## Why Host skill discovery was still exposed as a manager even though it is a process-owned service shared by sessions, the app-server catalog, and file-watcher invalidation. The skills extension also consumed an ad hoc loaded-skills wrapper instead of a named immutable snapshot. ## What changed - replace `SkillsManager` with concrete `SkillsService` - make the service cache and return immutable `HostSkillsSnapshot` values - migrate the skills extension host provider to the snapshot boundary - migrate app-server catalog, watcher, and invalidation paths to the service This keeps the service limited to host discovery, caching, roots, and invalidation. Catalog rendering and invocation remain extension responsibilities for the next stacked change.
jif ·
2026-06-17 17:01:06 +02:00 -
[codex] Support object-valued plugin MCP manifests (#28580)
## Summary This fixes plugin manifest parsing for MCP servers declared as an object directly in `plugin.json`. Before this change, Codex modeled `mcpServers` as only a string path, for example: ```json { "name": "counter-sample", "version": "1.1.1", "mcpServers": "./.mcp.json" } ``` Some migrated plugins instead provide the server map directly in the manifest: ```json { "name": "counter-sample", "version": "1.1.1", "description": "Plugin that declares MCP servers in the manifest", "mcpServers": { "counter": { "type": "http", "url": "https://sample.example/counter/mcp" } } } ``` That object form previously failed during install/load with an error like: ```text failed to parse plugin manifest: invalid type: map, expected a string ``` ## What changed - Add a manifest representation for `mcpServers` as either `Path(Resource)` or `Object(map)`. - Parse `plugin.json` `mcpServers` as either a string path or an object. - Route object-valued MCP server maps through the existing plugin MCP config parser instead of adding a second parser. - Apply existing per-plugin MCP server policy to object-valued MCP servers the same way as file-backed MCP servers. - Include object-valued MCP server names in plugin telemetry/capability metadata. - Support object-valued MCP config for executor plugins without requiring a `.mcp.json` filesystem read. - Update the bundled plugin-creator validator and `plugin-json-spec.md` so generated-plugin validation accepts the same object-valued shape. ## Compatibility Existing plugin manifests that use `"mcpServers": "./.mcp.json"` continue to work. Plugins can now also use the object shape shown above. 
## Tests Added coverage for the new manifest attribute shape at the install, normal load, telemetry, and executor-provider layers: - `install_accepts_manifest_mcp_server_objects` - `load_plugins_loads_manifest_mcp_server_objects` - `plugin_telemetry_metadata_uses_manifest_mcp_server_objects` - `reads_manifest_object_config_without_executor_file_system_access` Also smoke-tested the plugin-creator validator against both supported forms: - `mcpServers` as a direct object in `plugin.json` - `mcpServers` as `"./.mcp.json"` with a companion `.mcp.json` ## Validation - `just test -p codex-plugin` - `just test -p codex-core-plugins` - `just test -p codex-mcp-extension` - `just bazel-lock-update` - `just bazel-lock-check` - `just fmt` - `git diff --check` - Focused rename/object-form rerun: `just test -p codex-core-plugins manager::tests::load_plugins_loads_manifest_mcp_server_objects manager::tests::plugin_telemetry_metadata_uses_manifest_mcp_server_objects store::tests::install_accepts_manifest_mcp_server_objects` - Focused executor rerun: `just test -p codex-mcp-extension executor_plugin::provider::tests::reads_manifest_object_config_without_executor_file_system_access` - `python3 codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py /private/tmp/codex-validator-object` - `python3 codex-rs/skills/src/assets/samples/plugin-creator/scripts/validate_plugin.py /private/tmp/codex-validator-path`
charlesgong-openai ·
2026-06-16 19:22:57 -07:00 -
[codex] exec-server: stream files in chunks (#28354)
## Why `fs/readFile` buffers the entire file in one response, which makes large remote reads expensive and prevents callers from applying backpressure. We need an opt-in streaming path with bounded block sizes while preserving the existing single-call API for small and sandboxed reads. ## What changed - Add `ExecServerClient::stream`, returning a named `FileReadStream` that implements `futures::Stream` and yields immutable 1 MiB byte blocks. - Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs. `fs/readBlock` accepts an explicit offset and length. - Keep unsandboxed files open between block reads, cap open handles per connection, and clean them up on EOF, error, stream drop, explicit close, or connection shutdown. - Reject platform-sandboxed streaming opens instead of turning the one-shot sandbox helper into a persistent server. Existing `fs/readFile` behavior is unchanged. ## Testing - `just test -p codex-exec-server` - Integration coverage for 1 MiB chunking, exact block-boundary EOF, sandbox rejection, and continued reads from the opened file after path replacement. - Handle-manager coverage for non-sequential offsets, variable block lengths, the 128-handle limit, and capacity release after close.
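The block-read loop implied by `fs/readBlock` (explicit offset and length, stop at EOF) can be sketched against an in-memory buffer. `read_block` and `stream_blocks` are hypothetical helpers that stand in for the RPC and for `FileReadStream`; the real stream is asynchronous and yields 1 MiB blocks.

```rust
/// Block size used by the streaming path described in the PR (1 MiB).
pub const BLOCK_SIZE: usize = 1024 * 1024;

/// Stand-in for `fs/readBlock`: returns at most `length` bytes starting at
/// `offset`, and an empty slice once `offset` is at or past the end.
pub fn read_block(data: &[u8], offset: usize, length: usize) -> &[u8] {
    let start = offset.min(data.len());
    let end = start.saturating_add(length).min(data.len());
    &data[start..end]
}

/// Drives sequential block reads until an empty block signals EOF.
pub fn stream_blocks(data: &[u8], block_size: usize) -> Vec<Vec<u8>> {
    let mut blocks = Vec::new();
    let mut offset = 0;
    loop {
        let block = read_block(data, offset, block_size);
        if block.is_empty() {
            // EOF, including the case where the file ends exactly on a
            // block boundary and the final read returns nothing.
            break;
        }
        offset += block.len();
        blocks.push(block.to_vec());
    }
    blocks
}
```

A caller would pass `BLOCK_SIZE` in practice; the sketch takes the size as a parameter so the boundary cases can be exercised with small inputs.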
pakrym-oai ·
2026-06-16 09:50:55 -07:00 -
feat: render typed envelopes for multi-agent v2 messages (#28368)
## Why Multi-agent v2 messages need a consistent, model-visible envelope that identifies what kind of interaction occurred, who sent it, and which agent it targets. Previously, encrypted deliveries exposed only `encrypted_content`, while child completion used the legacy `<subagent_notification>` shape. That meant the client could not consistently present `NEW_TASK`, `MESSAGE`, and `FINAL_ANSWER` using the same format. This change adds the routing envelope as plaintext while keeping task and message payloads encrypted. No new Responses API field is required: an encrypted delivery is represented as an `input_text` header immediately followed by its existing `encrypted_content` item. Every envelope now follows this shape: ```text Message Type: <NEW_TASK | MESSAGE | FINAL_ANSWER> Task name: <recipient agent path> Sender: <author agent path> Payload: <message payload> ``` ## Message types ### `NEW_TASK` `NEW_TASK` is used when the recipient should begin a new turn, including an initial `spawn_agent` task and a later `followup_task`. For a root agent spawning `/root/worker`, the request contains a plaintext envelope followed by the encrypted task: ```json { "type": "agent_message", "author": "/root", "recipient": "/root/worker", "content": [ { "type": "input_text", "text": "Message Type: NEW_TASK\nTask name: /root/worker\nSender: /root\nPayload:\n" }, { "type": "encrypted_content", "encrypted_content": "<encrypted task payload>" } ] } ``` Conceptually, the model receives: ```text Message Type: NEW_TASK Task name: /root/worker Sender: /root Payload: Review the authentication changes and report any regressions. ``` ### `MESSAGE` `MESSAGE` is used for a queued `send_message` delivery. It communicates with an existing agent without starting a new turn. 
For `/root/worker` reporting progress to the root agent, the request contains: ```json { "type": "agent_message", "author": "/root/worker", "recipient": "/root", "content": [ { "type": "input_text", "text": "Message Type: MESSAGE\nTask name: /root\nSender: /root/worker\nPayload:\n" }, { "type": "encrypted_content", "encrypted_content": "<encrypted message payload>" } ] } ``` Conceptually, the model receives: ```text Message Type: MESSAGE Task name: /root Sender: /root/worker Payload: The protocol tests pass; I am checking the resume path now. ``` ### `FINAL_ANSWER` `FINAL_ANSWER` is emitted when a child agent reaches a terminal state and reports its result to its parent. Completion payloads are already available locally, so the complete envelope is represented as plaintext rather than as a plaintext header plus encrypted content. For `/root/worker` completing work for the root agent, the request contains: ```json { "type": "agent_message", "author": "/root/worker", "recipient": "/root", "content": [ { "type": "input_text", "text": "Message Type: FINAL_ANSWER\nTask name: /root\nSender: /root/worker\nPayload:\nNo regressions found." } ] } ``` The model-visible form is: ```text Message Type: FINAL_ANSWER Task name: /root Sender: /root/worker Payload: No regressions found. ``` Errored, shut down, and missing agents also use `FINAL_ANSWER`, with a terminal-status description in the payload. ## What changed - Render `NEW_TASK` or `MESSAGE` in `InterAgentCommunication::to_model_input_item`, based on whether the encrypted delivery starts a turn. - Replace the multi-agent v2 `<subagent_notification>` completion payload with a model-visible `FINAL_ANSWER` envelope. - Document `Task name`, `Sender`, and `Payload` consistently in the multi-agent developer instructions. - Prevent local-only history projections from treating an encrypted message's plaintext header as the complete assistant message. 
- Preserve rollout-trace interaction edges when an agent message contains both plaintext and encrypted content. Legacy multi-agent behavior remains unchanged. ## Verification - `just test -p codex-protocol` - `just test -p codex-rollout-trace` - `just test -p codex-web-search-extension` - `just test -p codex-core encrypted_multi_agent_v2_spawn_sends_agent_message_to_child` - `just test -p codex-core plaintext_multi_agent_v2_completion_sends_agent_message` - `just test -p codex-core multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn` - `just test -p codex-core multi_agent_v2_completion_queues_message_for_direct_parent`
jif ·
2026-06-16 11:46:59 +02:00 -
[codex] Use expect in integration tests (#28441)
The workspace denies `clippy::expect_used` in production. Although `clippy.toml` allows `expect` in tests, Bazel Clippy compiles integration-test helper code in a way that does not receive that exemption, which encouraged verbose `unwrap_or_else(... panic!(...))` and equivalent `match`/`let else` forms. This allows `clippy::expect_used` once at each integration-test crate root (including aggregated suites and test-support libraries), then replaces manual panic-based Result and Option unwraps with `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own crate roots. Intentional assertion and unexpected-variant panics remain unchanged, and the production `expect_used = "deny"` lint remains in place. The cleanup is mechanical and net-negative in line count.
pakrym-oai ·
2026-06-15 21:53:47 -07:00 -
Use PathUri in filesystem permission paths for exec-server (#28165)
## Why Progress towards letting app-server and exec-server run on different platforms, specifically for sandbox configuration. ## What - Make the filesystem path containment hierarchy generic, defaulting to `AbsolutePathBuf` for now. - Have clients specify `AbsolutePathBuf` or `PathUri` directly where needed. - Use `PathUri` throughout exec-server filesystem protocol and trait boundaries. - Implement `From` for conversion to path URIs and `TryFrom` for fallible conversion to absolute paths through the generic type hierarchy.
Adam Perry @ OpenAI ·
2026-06-15 23:55:23 +00:00 -
feat(core): add metadata field to ResponseItem (#28355)
## Description This PR adds an optional `metadata` field to `ResponseItem` for Responses API calls. Only mechanical plumbing, no actual values populated and sent yet. Turns out just adding a new field to `ResponseItem` has quite a large blast radius already. This change is backwards compatible because `metadata` is optional and omitted when absent, so existing response items and rollout history without it still deserialize and requests that do not set it keep the same wire shape. For provider compatibility, we strip out `metadata` before non-OpenAI Responses requests so Azure and AWS Bedrock never see this field. My followup PR here will actually make use of it to start storing and passing along `turn_id`: https://github.com/openai/codex/pull/28360 ## What changed - Added `ResponseItemMetadata` with optional `turn_id`, plus optional `metadata` on Responses API item variants and inter-agent communication. - Preserved item metadata through response-item rewrites such as truncation, missing tool-output synthesis, compaction history rebuilding, visible-history conversion, rollout/resume, and generated app-server schemas/types. - Strip item metadata from non-OpenAI Responses requests while preserving it for OpenAI-shaped requests. - Updated the mechanical fixture/test construction churn required by the new optional field.
Owen Lin ·
2026-06-15 15:05:28 -07:00 -
skills: cache orchestrator resources per thread (#28336)
## Why Hosted orchestrator skills are read through the remote MCP resource server. Within one thread, the same catalog or skill resource can be requested multiple times by prompt injection and the `skills.list` / `skills.read` tools. Re-fetching adds latency and can make those surfaces observe different remote contents during the same thread. This is a follow-up to #28333: orchestrator skills remain limited to threads without a local executor, and those threads now get a stable per-thread view of the remote skill data they use. ## What changed - Reuse the existing per-thread orchestrator catalog snapshot for `skills.list` and `skills.read` availability checks. - Cache successful orchestrator resource reads by authority, package, and resource so prompt injection and tool calls share the same contents. - Keep the cache memory-only and bounded to 100 resources and 8 MiB per thread. - Leave host and executor skill reads unchanged, and do not cache failed remote reads. ## Verification - Extended the app-server MCP resource integration test to read the same hosted skill resource twice and verify that the remote server receives one read. - The same test verifies that catalog discovery and the selected skill's main prompt are each fetched only once per thread.
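A per-thread cache with the stated bounds might look like the sketch below. The names are hypothetical, and one behavior is an assumption the PR text does not settle: when an entry would exceed either bound, this sketch returns it without caching instead of evicting older entries.

```rust
use std::collections::HashMap;

/// Bounds stated in the PR: 100 resources and 8 MiB per thread.
pub const MAX_RESOURCES: usize = 100;
pub const MAX_BYTES: usize = 8 * 1024 * 1024;

/// Cache key: (authority, package, resource).
pub type ResourceKey = (String, String, String);

/// Memory-only cache of successful remote reads for one thread.
#[derive(Default)]
pub struct ResourceCache {
    entries: HashMap<ResourceKey, String>,
    total_bytes: usize,
}

impl ResourceCache {
    /// Returns the cached contents, or fetches and caches them on success.
    /// Failed fetches are never stored, so a later call retries the read.
    pub fn get_or_fetch<E>(
        &mut self,
        key: ResourceKey,
        fetch: impl FnOnce() -> Result<String, E>,
    ) -> Result<String, E> {
        if let Some(hit) = self.entries.get(&key) {
            return Ok(hit.clone());
        }
        let contents = fetch()?;
        // Assumption: entries that would exceed a bound are returned uncached.
        let fits = self.entries.len() < MAX_RESOURCES
            && self.total_bytes + contents.len() <= MAX_BYTES;
        if fits {
            self.total_bytes += contents.len();
            self.entries.insert(key, contents.clone());
        }
        Ok(contents)
    }
}
```

Because both prompt injection and the `skills.read` tool would go through the same `get_or_fetch` call, they observe identical contents for the lifetime of the thread.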
jif ·
2026-06-15 20:20:19 +02:00 -
skills: hide orchestrator skills with a local executor (#28333)
## Why App-server threads without a local executor need orchestrator-owned skills from the hosted `codex_apps` MCP server. Threads with the local executor already discover installed skills from the local filesystem. After the orchestrator skill provider was enabled for every app-server thread, local-executor threads also received the hosted skill catalog and the `skills.list` and `skills.read` tools. This changed the existing local behavior and could expose a second hosted copy of a skill that was already installed locally. ## What changed - Expose the thread's selected execution environments to extensions at thread startup. - Enable orchestrator skills only when the reserved local environment is not selected. - Apply that decision consistently to hosted skill catalog discovery, explicit skill injection, and the `skills.list` and `skills.read` tools. ## Verification - The existing no-executor app-server test continues to verify hosted skill discovery, invocation, and child-resource reads. - A new app-server test verifies that local-executor threads do not receive hosted skill context or `skills.*` tools.
jif ·
2026-06-15 17:15:45 +02:00 -
Discover stdio MCP servers from selected executor plugins (#27870)
## Why **In short:** this PR discovers MCP registrations by reading a selected plugin's `.mcp.json` on its executor. #27884 then resolves those registrations in the shared catalog. `thread/start.selectedCapabilityRoots` can select a plugin root owned by an executor, and Codex can resolve that package through the executor filesystem. MCP declarations inside the selected plugin are still ignored. This PR adds the source-specific discovery layer on top of the selected-plugin catalog boundary in #27884: ```text selected capability root | v resolve the plugin through its executor filesystem | v read and normalize its MCP config through the same filesystem | v contribute stdio registrations bound to that environment ID ``` The existing MCP launcher and connection manager remain unchanged. MCP config parsing is shared with local plugins through #27863. ## What changed - Added an executor plugin MCP provider in the MCP extension. - Retained only the exact filesystem capability used for package resolution and reused it for the selected plugin's MCP config, with no host-filesystem fallback or unrelated process/HTTP authority. - Read either the manifest-declared MCP config or the default `.mcp.json`; a missing default file means the plugin has no MCP servers. - Accepted stdio servers only for this first vertical. Executor-owned HTTP declarations are skipped with a warning until their placement semantics are defined. - Normalized stdio registrations with the owning environment's stable logical ID and plugin-root working directory. - Resolved environment-variable names on the owning executor and rejected explicit local forwarding for non-local plugins. - Froze discovered declarations once per active thread runtime, then applied current managed plugin and MCP requirements when contributing them. - Carried the selected root ID, display name, and selection order into the catalog contribution defined by #27884. 
## Behavior and scope There is intentionally no production behavior change yet. This PR provides the executor provider and contribution boundary, but app-server does not install it in this change. Existing local plugin MCP loading is unchanged, and no MCP process is launched by this PR alone. ## Assumptions - The selected root ID is the plugin policy identity; the manifest display name is presentation metadata. - An environment ID is a stable logical authority. Reconnection or replacement under the same ID does not change ownership. - Selected plugin packages and their manifests are trusted inputs. - The selected package and MCP discovery snapshot remain frozen for the active thread runtime. ## Follow-up The next PR installs this contributor in app-server and adds an end-to-end test proving that a selected plugin MCP tool launches on its owning executor, can be called by the model, survives an explicit MCP refresh, and is invisible when its root was not selected. Resume, fork, environment removal or ID changes, dynamic catalog reload, and executor-owned HTTP MCP placement remain separate lifecycle decisions. ## Verification Focused tests cover executor-only filesystem reads, missing and malformed config, stdio filtering and normalization, managed requirements, package attribution, and selection order. CI owns execution of the test suite.
jif ·
2026-06-15 11:52:05 +02:00 -
Add selected-plugin precedence and attribution to the MCP catalog (#27884)
## Why **In short:** this PR resolves already-discovered MCP registrations. It does not read selected plugins or discover their MCP servers. The resolved MCP catalog currently builds config and auto-discovered plugin registrations before runtime contributors are applied. A thread-selected plugin needs a distinct precedence tier in that same initial resolution pass: otherwise a disabled lower-precedence winner can leave stale name-level state behind, and the winning MCP tools cannot be attributed to the selected package reliably. This PR adds that catalog boundary before executor discovery is connected. ## What changed - Added an explicit selected-plugin registration tier between auto-discovered plugins and explicit config. - Collected selected-plugin contributions before the initial catalog build, while leaving compatibility and generic extension overlays in their existing runtime phase. - Retained the winning plugin ID and display name directly on plugin-owned catalog registrations. - Derived MCP tool provenance from the winning catalog entry instead of joining against local-only plugin summaries. - Retained the winning selected server's tool approval policy in the running connection manager, so a selected registration cannot inherit approval behavior from a losing local plugin. - Kept remembered approval session-scoped for selected plugins until there is an authority-aware persistence contract; Codex will not write approval back to an unrelated local plugin. - Preserved existing name-level disabled vetoes for discovered plugins and config, while keeping a selected package's own disabled registration scoped to that registration. - Preserved deterministic selection order and existing config, compatibility, and extension precedence. The resulting order is: ```text auto-discovered plugin < selected plugin < explicit config < compatibility registration < extension overlay ``` ## Behavior and scope This is a catalog and provenance change only. 
No production host contributes selected-plugin MCP registrations yet, so existing local MCP behavior remains unchanged. The stacked follow-up, #27870, installs the executor plugin provider that produces these registrations. App-server activation remains a separate final step. ## Verification Focused tests cover precedence, deterministic selected-plugin conflicts, disabled-veto behavior across catalog phases, managed requirements before selected-plugin resolution, winning-server approval policy, and attribution when local and selected packages share an ID or server name. CI owns execution of the test suite.
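The precedence order listed in this PR can be modelled with an ordered enum, as in the following sketch. `Tier`, `Registration`, and `resolve` are hypothetical names; the sketch ignores disabled vetoes and approval policy, and it assumes that within one tier the earlier registration wins.

```rust
use std::collections::BTreeMap;

/// Tiers in ascending precedence; the derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    AutoDiscoveredPlugin,
    SelectedPlugin,
    ExplicitConfig,
    CompatibilityRegistration,
    ExtensionOverlay,
}

/// Hypothetical catalog registration that carries attribution with it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Registration {
    pub server_name: String,
    pub tier: Tier,
    pub plugin_id: Option<String>,
}

/// Picks one winner per server name: the highest tier wins, and within a
/// tier the earlier registration is kept (assumed tie-break).
pub fn resolve(registrations: Vec<Registration>) -> BTreeMap<String, Registration> {
    let mut winners: BTreeMap<String, Registration> = BTreeMap::new();
    for registration in registrations {
        let replace = match winners.get(&registration.server_name) {
            Some(current) => registration.tier > current.tier,
            None => true,
        };
        if replace {
            winners.insert(registration.server_name.clone(), registration);
        }
    }
    winners
}
```

Reading `plugin_id` from the winning entry mirrors the PR's point that provenance comes from the catalog winner and not from a join against local plugin summaries.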
jif ·
2026-06-15 11:10:51 +02:00 -
build: run buildifier from just fmt (#28125)
## Intent Keep Bazel and Starlark files consistently formatted without requiring contributors to install or version buildifier themselves. ## Implementation - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier v8.5.1. - Run buildifier from the shared `just fmt` and `just fmt-check` driver, with Windows-safe explicit DotSlash invocation. - Provision DotSlash in formatting CI and contributor devcontainers, and document the source-build prerequisite. - Apply the initial mechanical buildifier formatting baseline.
Adam Perry @ OpenAI ·
2026-06-13 21:43:39 -07:00 -
[codex] make PathUri::from_abs_path infallible (#27976)
## Why `PathUri::from_abs_path` can fail for absolute paths that do not have a normal `file:` URI representation, forcing filesystem call sites to handle a conversion error even though the original path can be preserved losslessly. ## What Make `from_abs_path` infallible and migrate its callers. Unrepresentable paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The leading encoded null reserves a namespace that cannot collide with a real Unix or Windows path, and fallback URIs remain opaque to lexical path operations. ## Validation Added path-URI coverage for Unix null and non-UTF-8 paths, Windows device/verbatim and non-Unicode paths, serialization, malformed fallbacks, opaque lexical operations, invalid native payloads, and literal `/bad/path` collision resistance.
Adam Perry @ OpenAI ·
2026-06-12 16:58:42 -07:00 -
Support plaintext agent messages (#27830)
## Why

Multi-agent v2 `send_message` deliveries already reach the receiving model as typed `agent_message` items with encrypted content. Child-completion notifications are generated by Codex itself, so their content is plaintext and previously fell back to a serialized JSON envelope inside an assistant message.

With plaintext `input_text` supported for `agent_message`, both delivery paths can use the same model-visible type while preserving explicit author and recipient metadata.

## What changed

- add plaintext `input_text` support to `AgentMessageInputContent` and regenerate the affected app-server schemas
- preserve `InterAgentCommunication` as structured mailbox input instead of converting it to assistant text
- record delivered communications as typed `agent_message` history items
- persist a dedicated rollout item so local delivery metadata such as `trigger_turn` remains available without leaking into the Responses request
- reconstruct typed agent messages on resume and preserve fork-turn truncation behavior
- remove request-time assistant-content parsing
- preserve plaintext and encrypted inter-agent deliveries in stage-one memory inputs
- normalize and link plaintext and encrypted agent messages in rollout traces without treating inbound messages as child results
- cover the real MultiAgent V2 child-completion path end to end with deterministic mailbox synchronization

## Verification

- `just test -p codex-core plaintext_multi_agent_v2_completion_sends_agent_message`
- `just test -p codex-core input_queue_drains_mailbox_in_delivery_order record_initial_history_reconstructs_typed_inter_agent_message fork_turn_positions_use_inter_agent_delivery_metadata`
- `just test -p codex-memories-write serializes_inter_agent_communications_for_memory`
- `just test -p codex-rollout-trace agent_messages_preserve_routing_and_content sub_agent_started_activity_creates_spawn_edge`
- `just test -p codex-rollout-trace agent_result_edge_falls_back_to_child_thread_without_result_message`
- `just test -p codex-protocol -p codex-rollout -p codex-app-server-protocol`
jif ·
2026-06-12 13:50:04 -07:00 -
[codex] Add size to internal filesystem metadata (#27927)
## Why

`ExecutorFileSystem::get_metadata` reports file kind and timestamps but not size. Internal callers that need to enforce a size limit therefore have to read the complete file first, which is especially wasteful for remote filesystems.

This adds the missing internal metadata so consumers can reject oversized files before transferring or buffering them. The field is named `size`, matching VS Code's `FileStat.size` filesystem convention.

## What changed

- add `size: u64` to internal `FileMetadata`
- populate it from the underlying filesystem metadata
- carry it through sandbox-helper and remote exec-server responses
- cover files, directories, symlink targets, and sandboxed reads across local and remote filesystem implementations

The new field is intentionally not exposed through the app-server API.

## Testing

- `just test -p codex-exec-server get_metadata`
- `just test -p codex-exec-server file_system_sandboxed_metadata_and_read_allow_readable_root`
- `just test -p codex-core-plugins`
- `just test -p codex-skills-extension`
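The pattern this field enables, checking the size before reading, can be sketched with the standard library standing in for the internal filesystem API. This is only an illustration: `read_if_within_limit` is a hypothetical name, and `std::fs::metadata` is used here in place of `ExecutorFileSystem::get_metadata`, whose exact signature is not shown in the description.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: return the file contents only when the reported size
/// is within `max_bytes`, so oversized files are rejected before any read.
fn read_if_within_limit(path: &Path, max_bytes: u64) -> io::Result<Option<Vec<u8>>> {
    // Metadata lookup is cheap compared to transferring the whole file.
    let size = fs::metadata(path)?.len();
    if size > max_bytes {
        return Ok(None);
    }
    fs::read(path).map(Some)
}
```

On a remote filesystem the saving is larger than on local disk, since the rejected file never crosses the wire.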
pakrym-oai ·
2026-06-12 12:12:08 -07:00 -
Handle standalone image generation failures as terminal items (#27920)
## Why

Standalone image generation emitted a started item but no terminal item when the backend failed. Clients could leave the operation unresolved or render it as successful.

## What changed

- Emit a terminal image-generation item with `status: "failed"` when generation or editing fails.
- Skip image persistence for failed terminal items.
- Render failed image generation distinctly in TUI history.
- Preserve the status when handling live and replayed terminal items.

## Appearance in TUI (app-side change needed)

<img width="867" height="89" alt="image" src="https://github.com/user-attachments/assets/9e32342f-a982-411e-8498-456639fc468a" />

## Validation

- `just test -p codex-image-generation-extension`
- App-server image-generation tests
- Core stream-event tests
- TUI image-generation lifecycle and snapshot tests
- Scoped Clippy and formatting
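The lifecycle rule above (every started item gets a terminal item, and failed items are not persisted) might look roughly like the following. The type and function names are hypothetical and chosen only for this sketch; the real item types in the repository are not shown in the description.

```rust
/// Hypothetical terminal status; the description only names `"failed"` explicitly.
#[derive(Debug, Clone, PartialEq)]
enum ImageStatus {
    Completed,
    Failed,
}

/// Hypothetical terminal item emitted once generation or editing finishes.
#[derive(Debug, Clone, PartialEq)]
struct TerminalImageItem {
    id: String,
    status: ImageStatus,
    image: Option<Vec<u8>>,
    error: Option<String>,
}

/// Always produce a terminal item, whether the backend succeeded or failed.
fn terminal_item(id: &str, outcome: Result<Vec<u8>, String>) -> TerminalImageItem {
    match outcome {
        Ok(bytes) => TerminalImageItem {
            id: id.to_string(),
            status: ImageStatus::Completed,
            image: Some(bytes),
            error: None,
        },
        Err(message) => TerminalImageItem {
            id: id.to_string(),
            status: ImageStatus::Failed,
            image: None,
            error: Some(message),
        },
    }
}

/// Persistence is skipped for failed terminal items.
fn should_persist(item: &TerminalImageItem) -> bool {
    item.status == ImageStatus::Completed && item.image.is_some()
}
```

Carrying the status on the item itself is what lets a client treat live and replayed events the same way.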
Won Park ·
2026-06-12 11:57:22 -07:00 -
Make MCP server contributions thread-scoped (#27670)
## Why

`selectedCapabilityRoots` belongs to one thread, but MCP contributors previously received only the global Codex config. That left no clean way for a selected executor capability to contribute MCP servers to its own thread.

## What this PR does

- Gives MCP contributors a small context containing the config and, for a running thread, its frozen host-seeded inputs.
- Uses the same thread inputs during startup, status queries, refreshes, and skill dependency checks.
- Keeps threadless MCP operations and the existing hosted Apps behavior unchanged.
- Adds coverage showing that two threads resolve independent registrations and that later lifecycle mutations do not change the frozen MCP inputs.

This PR does not discover plugin manifests, add MCP servers, or launch anything new. It only establishes the thread-scoped registration boundary.

## Follow-ups

- Resolve selected executor plugin roots through their owning environment filesystem.
- Convert their stdio MCP declarations into environment-bound registrations and add an executor MCP end-to-end test.

## Verification

- `just fmt`
- `cargo check --tests -p codex-protocol -p codex-extension-api -p codex-mcp-extension -p codex-core -p codex-app-server`

Tests and Clippy were not run.
jif ·
2026-06-12 11:20:34 +02:00 -
[codex] Remove async_trait from first-party code (#27475)
## Why

First-party async traits should expose their `Send` contracts explicitly without requiring `async_trait`. This completes the migration pattern established in #27303 and #27304.

## What changed

- Replaced the remaining first-party `async_trait` traits with native return-position `impl Future + Send` where statically dispatched and explicit boxed `Send` futures where object safety is required.
- Kept implementations behavior-preserving, outlining existing async bodies into inherent methods where that keeps the diff reviewable.
- Removed all direct first-party `async-trait` dependencies and the workspace dependency declaration.
- Added a cargo-deny policy that permits `async-trait` only through the remaining transitive wrapper crates.
- Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and keep the full cargo-deny check passing.

## Validation

- `just test -p codex-exec-server`: 216 passed, 2 skipped.
- `just test -p codex-model-provider`: 39 passed.
- `just test -p codex-core` and `just test`: changed tests passed; remaining failures are environment-sensitive suites unrelated to this migration.
- `cargo deny check`
- `just fix`
- `just fmt`
- `cargo shear`
- `just bazel-lock-check`
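The two replacement shapes named above can be shown side by side in a small standalone sketch. The trait and type names (`Store`, `DynStore`, `UpperStore`) are invented for illustration and do not come from the repository; the tiny `block_on` is only there so the example runs without an async runtime and is suitable only for futures that complete without waiting.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Statically dispatched: the `Send` bound is written on the returned future.
trait Store {
    fn get(&self, key: &str) -> impl Future<Output = String> + Send;
}

/// Object-safe variant: an explicitly boxed `Send` future.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

trait DynStore {
    fn get_boxed<'a>(&'a self, key: &'a str) -> BoxFuture<'a, String>;
}

struct UpperStore;

impl UpperStore {
    // The async body is outlined into an inherent method, so both trait
    // impls stay one-line wrappers.
    async fn get_inner(&self, key: &str) -> String {
        key.to_uppercase()
    }
}

impl Store for UpperStore {
    fn get(&self, key: &str) -> impl Future<Output = String> + Send {
        self.get_inner(key)
    }
}

impl DynStore for UpperStore {
    fn get_boxed<'a>(&'a self, key: &'a str) -> BoxFuture<'a, String> {
        Box::pin(self.get_inner(key))
    }
}

/// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal polling loop used only to drive the example futures.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = std::pin::pin!(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

With the bound in the signature, a non-`Send` future in an implementation becomes a compile error at the impl rather than at a distant spawn site, which is presumably the "explicit contract" the change is after.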
Adam Perry @ OpenAI ·
2026-06-11 18:16:39 -07:00 -
Fix image extension PathUri conversion (#27711)
## Why

`main` stopped compiling when #27498 passed an `AbsolutePathBuf` to the `ExecutorFileSystem` API migrated to `PathUri` by #27653.

## What

Convert referenced image paths to `PathUri` before filesystem reads, declare the internal path-URI dependency, and refresh `Cargo.lock`.
Adam Perry @ OpenAI ·
2026-06-12 00:15:19 +00:00 -
Route image extension reads through turn environments v2 (#27498)
## Why

Image generation used `std::fs::read` for referenced image paths, which did not support environment-backed filesystems or their sandbox context.

## What changed

- Expose optional turn environments to extension tool calls.
- Include each environment’s ID, working directory, filesystem, and sandbox context.
- Read referenced images through the selected environment filesystem.
- Keep sandbox usage at the extension call site so extensions can choose the appropriate access mode.
- Consolidate image request construction into one async function.
- Add coverage for successful environment reads and read failures.

## Validation

- `cargo check -p codex-image-generation-extension --tests`
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`

`just test -p codex-image-generation-extension` could not complete because the build exhausted available disk space.
Won Park ·
2026-06-11 16:32:52 -07:00 -
Resolve MCP server registrations through a catalog (#27634)
## Why

MCP servers currently come from user config, local plugins, compatibility Apps synthesis, and host extensions. Those sources were composed by mutating a shared map, leaving registration identity, precedence, removal, and provenance implicit in assembly order.

Before adding executor-owned MCPs, Codex needs one durable resolution boundary above `McpConnectionManager`. This PR introduces that boundary while preserving current server configuration, policy, and runtime behavior. Executor-scoped registrations and explicit policy layers remain follow-ups.

## What changed

- Add typed `McpServerRegistration` inputs and an immutable `ResolvedMcpCatalog` in `codex-mcp`.
- Retain each registration's complete `McpServerConfig`, including its environment binding, while recording its source and provenance.
- Preserve the existing structural precedence between plugin, config, compatibility, and ordered extension sources.
- Resolve equal-precedence actions by contribution order; provenance IDs are used only for diagnostics and cannot affect the winner.
- Preserve extension removals and the existing name-scoped `enabled = false` veto.
- Report same-tier conflicts with every contender and the final catalog outcome, including whether the winning action registers or removes the server.
- Require MCP contributors to provide a stable diagnostic identity.
- Derive materialized server maps and plugin ownership from the resolved catalog.

`McpConnectionManager`, transport startup, tool calls, and resource routing continue to consume the same effective `McpServerConfig` values.

## Scope

This PR does not add new MCP capabilities or change user-visible behavior. It does not add executor plugin discovery, thread-scoped registrations, dynamic refresh generations, or new user/managed policy semantics.

## Verification

- Added focused catalog coverage for source precedence, complete configuration preservation, disabled vetoes, plugin ownership, contribution-order tie breaking, removal outcomes, and conflict diagnostics.
- Extended hosted Apps coverage for ordered extension removal and Apps-disabled hosts with and without the hosted extension installed.
- `cargo check -p codex-mcp --tests -p codex-extension-api -p codex-core`
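The resolution rules listed in this change can be made concrete with a toy resolver. This is a sketch under several assumptions that the description does not confirm: precedence is modeled as a numeric `tier` where a higher value wins, a later contribution wins within the same tier, and any `enabled = false` registration vetoes that server name regardless of tier. All names (`Registration`, `Action`, `resolve`) are hypothetical and simpler than the real `McpServerRegistration` and `ResolvedMcpCatalog`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical action a source can contribute for a server name.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    Register { command: String, enabled: bool },
    Remove,
}

/// Hypothetical registration input; `provenance` is carried for diagnostics only.
#[derive(Debug, Clone)]
struct Registration {
    name: String,
    tier: u8, // assumption: higher tier takes precedence
    provenance: String,
    action: Action,
}

/// Resolve registrations (given in contribution order) into name -> command.
fn resolve(registrations: &[Registration]) -> BTreeMap<String, String> {
    let mut winners: BTreeMap<&str, &Registration> = BTreeMap::new();
    let mut vetoed: BTreeSet<&str> = BTreeSet::new();

    for registration in registrations {
        // Name-scoped veto: a disabled entry suppresses the server by name.
        if let Action::Register { enabled: false, .. } = &registration.action {
            vetoed.insert(registration.name.as_str());
        }
        match winners.get(registration.name.as_str()) {
            // A strictly higher tier already holds the name: keep it.
            Some(current) if current.tier > registration.tier => {}
            // Otherwise the newer contribution wins (assumed tie-break);
            // provenance is deliberately never compared.
            _ => {
                winners.insert(registration.name.as_str(), registration);
            }
        }
    }

    winners
        .into_iter()
        .filter(|(name, _)| !vetoed.contains(name))
        .filter_map(|(name, winner)| match &winner.action {
            Action::Register { command, .. } => Some((name.to_string(), command.clone())),
            Action::Remove => None, // a winning removal drops the server
        })
        .collect()
}

/// Convenience constructor used in the usage example.
fn registration(name: &str, tier: u8, provenance: &str, action: Action) -> Registration {
    Registration {
        name: name.to_string(),
        tier,
        provenance: provenance.to_string(),
        action,
    }
}
```

Resolving everything in one pass over typed inputs, instead of mutating a shared map, is what makes the winner for each name explainable after the fact.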
jif ·
2026-06-11 21:54:52 +02:00