mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
32 Commits
-
[codex] Use model metadata for skills usage instructions (#29740)
## Summary - add a false-by-default `include_skills_usage_instructions` model metadata field - enable the field for the bundled `gpt-5.5` model metadata - consume the metadata in both core and extension skill rendering - remove hardcoded legacy-model matching and its marker plumbing
ani-oai ·
2026-06-29 09:44:36 +09:00 -
Reinject missing World State fragments on resume (#30152)
## Why

World State restores its structured snapshot on resume so unchanged sections do not have to be rendered again. That is safe only when the model-visible fragment represented by the snapshot is still present in retained history.

For selected executor skills, the failing selected-capability scenario exposed this state:

```text
persisted World State: selected skill catalog is known
retained model history: selected skill catalog message is missing
next diff: unchanged, so emit nothing
```

The model resumes without being told about the selected skill catalog.

## What changed

World State contributions may now optionally describe the concrete model-visible fragment that must remain in retained history. When a persisted snapshot is present:

```text
matching retained fragment exists  -> trust snapshot, emit nothing
matching retained fragment missing -> treat section as absent, render current state once
```

The skills extension uses this for non-empty selected-environment catalogs by matching its exact rendered catalog body. Empty or hidden catalogs do not require a fragment.

## Scope

This does not clear or rebuild the whole World State baseline. It does not change skill discovery, cache invalidation, environment availability, or MCP runtime behavior. It only keeps a persisted section snapshot and its retained model context consistent across resume/history reconstruction.

## Coverage

A focused World State regression test verifies both sides:

- a missing retained fragment is rendered again
- a matching retained fragment avoids duplicate injection
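The resume rule described above can be sketched in a few lines of Python. This is an illustration only, not the actual Rust implementation: `SectionSnapshot`, `render_on_resume`, and the idea of modelling retained history as a list of strings are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SectionSnapshot:
    """Persisted state for one World State section (hypothetical shape)."""

    rendered_state: str
    # Fragment that must still be in retained history; None means none is required.
    required_fragment: Optional[str]


def render_on_resume(
    snapshot: Optional[SectionSnapshot],
    current_state: str,
    retained_history: list[str],
) -> Optional[str]:
    """Return the text to emit for the section, or None to emit nothing."""
    if snapshot is None:
        return current_state
    fragment = snapshot.required_fragment
    if fragment is not None and fragment not in retained_history:
        # The snapshot is known but the model no longer sees it:
        # treat the section as absent and render the current state once.
        return current_state
    # Fragment retained (or not required): trust the snapshot and only emit on change.
    return None if snapshot.rendered_state == current_state else current_state
```

With a snapshot whose fragment has been dropped from history, the function returns the current state even though the diff against the snapshot is empty, which is the case the regression test targets.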
jif ·
2026-06-26 02:18:00 +01:00 -
Project executor skills through World State (#30088)
## Why

A selected executor environment can be unavailable in one model step and ready in the next. The model should see its skills only while that environment is ready, without rescanning stable files on every sample.

The product assumption is simple:

- an environment ID names one stable logical environment;
- the selected root contents do not change during the thread.

## Behavior

```text
E1 unavailable -> do not show E1 skills
E1 ready -> discover once, cache, show through World State
E1 unavailable -> hide skills, keep cache
E1 ready again -> reuse cache, show skills again
resume -> create a new thread cache and discover again
```

The cache key is the full `SelectedCapabilityRoot`. Availability does not invalidate it; dropping the extension's thread state does.

The step supplies the ready selected roots directly. They do not have to be turn environments:

```text
turn environment: laptop
selected root: worker:/plugins/lint-fix
worker ready -> lint-fix skills are visible
```

## What changes

- Keeps executor skill catalogs in the existing skills extension.
- Passes the roots resolved as ready for the step into World State contributors.
- Loads each ready selected root at most once per thread.
- Contributes the executor catalog as the `skills` World State section.
- Uses the exact step catalog for explicit skill selection and body reads.
- Leaves host and orchestrator skill behavior where it already lives.

Taking a step snapshot itself does not add an RPC. Executor filesystem calls happen only on the first discovery of a stable root for that thread.

## What does not change

- No filesystem watcher or content-based invalidation.
- No retry/generation framework.
- No skill runtime migration into core.
- No general rewrite of the skills extension.

## Stack

1. Extension-owned World State sections.
2. **This PR:** project cached executor skills through World State.
3. Pin one MCP runtime to each model step.
4. Project selected MCP/app/connector metadata by environment availability.
5. One end-to-end integration scenario.
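The availability and caching behavior listed above can be sketched as follows. This is an illustrative Python model, not the Rust code from the PR: `ThreadSkillCache`, the `discover` callback, and the use of plain strings as root keys are all assumptions for the example.

```python
from typing import Callable


class ThreadSkillCache:
    """Per-thread skill catalogs keyed by the full selected root (hypothetical)."""

    def __init__(self, discover: Callable[[str], list[str]]) -> None:
        self._discover = discover
        self._catalogs: dict[str, list[str]] = {}

    def visible_skills(self, selected_roots: list[str], ready_roots: set[str]) -> list[str]:
        """Return the skills to show for one model step."""
        visible: list[str] = []
        for root in selected_roots:
            if root not in ready_roots:
                # Hidden while unavailable; any cached catalog is kept.
                continue
            if root not in self._catalogs:
                # First time this root is ready in this thread: discover once.
                self._catalogs[root] = self._discover(root)
            visible.extend(self._catalogs[root])
        return visible
```

In this model, a resume corresponds to constructing a fresh `ThreadSkillCache`, so discovery runs again for the new thread state.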
jif ·
2026-06-26 00:13:43 +01:00 -
Load executor skills without host path conversion (#29626)
## Why

After #28918, selected skill roots are `PathUri`, but the executor skill provider still converts them to the app-server host's `AbsolutePathBuf`. A foreign Windows root therefore cannot be discovered by a Unix host, and the inverse has the same problem.

This PR keeps executor skill discovery and reads on the filesystem that owns the selected root while reusing the existing skill rules.

## What changed

- Generalize the existing skill traversal to operate on `PathUri` through `ExecutorFileSystem`, preserving its depth, directory, symlink, and sibling-metadata concurrency behavior.
- Add a small environment skill loader that reuses the shared discovery, frontmatter validation, dependency parsing, product policy, and prompt-visibility rules.
- Keep the environment id and entrypoint `PathUri` in the skill catalog, then route `skills.read` back through the same environment filesystem.
- Preserve the executor's path convention when deriving catalog handles, including literal backslashes in POSIX filenames.
- Resolve plugin namespaces from nearby manifests through URI-native filesystem reads.
- Cover foreign Windows roots, executor-owned reads, namespaces, metadata, policy, and path identity.

```text
selected root (PathUri)
        |
        v
shared discovery over ExecutorFileSystem
        |
        v
environment-bound catalog entry --skills.read--> same ExecutorFileSystem
```

No second filesystem abstraction or duplicate traversal implementation is introduced.

## Stack

1. #29614: add lexical `PathUri` containment.
2. #29620: share URI-native manifest path resolution.
3. #28918: keep selected plugin roots and resources URI-native.
4. **This PR**: load executor skills without host path conversion.
5. #29628: resolve executor MCP working directories without host path conversion.
jif ·
2026-06-23 23:26:06 +01:00 -
Make selected plugin roots URI-native (#28918)
## Why

Selected capability roots belong to the executor filesystem, not the app-server host. Converting their path strings into the host's native `Path` breaks whenever the two machines use different path conventions, such as a Windows executor behind a Unix app-server.

This PR establishes `PathUri` as the selected-plugin boundary so the executor remains authoritative for its paths.

## What changed

- Require `selectedCapabilityRoots[].location.path` to be a canonical `file:` URI and deserialize it directly as `PathUri`; native path strings are rejected.
- Update the app-server schema, generated TypeScript, examples, and request coverage for the URI contract.
- Keep selected roots, resolved plugin locations, manifest paths, and manifest resources as `PathUri`.
- Inspect and read plugin roots and manifests only through the selected environment's `ExecutorFileSystem`.
- Parse executor manifests with the shared URI-native parser from #29620 instead of projecting them onto the host filesystem.
- Enforce resource containment lexically and preserve the root URI's POSIX or Windows path convention.
- Cover foreign Windows plugin roots and URI-native manifest resources.

```text
thread/start
  selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo"
        |
        | PathUri
        v
  ExecutorFileSystem
        |
        +--> plugin.json
        +--> manifest resources
```

This PR stops at the shared selected-plugin representation. The next two PRs remove the remaining host-path projections in the skill and MCP consumers.

## Stack

1. #29614: add lexical `PathUri` containment.
2. #29620: share URI-native manifest path resolution.
3. **This PR**: keep selected plugin roots and resources URI-native.
4. #29626: load executor skills without host path conversion.
5. #29628: resolve executor MCP working directories without host path conversion.
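As a rough illustration of the "`file:` URI only" contract, the Python sketch below rejects native path strings by checking the URI scheme. It is deliberately simplified: `require_file_uri` is a hypothetical helper, and the real `PathUri` deserializer presumably enforces further canonical-form rules that are not modelled here.

```python
from urllib.parse import urlsplit


def require_file_uri(path: str) -> str:
    """Accept only `file:` URIs; reject native path strings (simplified check)."""
    parts = urlsplit(path)
    if parts.scheme != "file":
        # "/opt/plugins" has no scheme; "C:\\plugins" parses with scheme "c".
        raise ValueError(f"expected a file: URI, got {path!r}")
    return path
```

Under this check `file:///C:/plugins/demo` is accepted, while `/opt/codex/plugins/deploy` and `C:\plugins\demo` are rejected without ever being interpreted as host paths.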
jif ·
2026-06-23 22:51:19 +01:00 -
[codex] Preserve skill descriptions outside model context (#29006)
## Why

Skill descriptions are used in model-visible lists: the default available-skills catalog that supports implicit selection, and the on-demand `skills.list` tool response used to discover orchestrator skills. A single overlong description should not consume a disproportionate share of either list.

Enforcing the 1024-character limit while loading or migrating skills is the wrong boundary: it rejects otherwise-valid skills and discards metadata that non-model consumers and full skill reads may need. Skill metadata and `SKILL.md` content should remain intact; the cap belongs at model-visible list rendering boundaries.

## What changed

- Preserve full `description` and `metadata.short-description` values when loading skills.
- Preserve full external-agent command descriptions during `source-command-*` migration instead of skipping commands solely because their descriptions exceed 1024 characters.
- Preserve full normalized orchestrator descriptions in the underlying skills catalog.
- Cap each description at 1024 Unicode characters when rendering the default available-skills context in `codex-core-skills` and `codex-skills-extension`.
- Apply the same cap when serializing descriptions in the model-visible `skills.list` response.
- Render truncated descriptions as 1021 original characters plus `...`.
- Leave explicit `$skill` injection, `skills.read`, underlying metadata, and on-disk `SKILL.md` files unchanged and full-fidelity.

## Implicit skill selection

Codex injects a bounded catalog containing each implicitly allowed skill's name, description, and source locator, together with instructions to use a skill when the task clearly matches its description. The model makes that semantic choice; after selecting a skill, it reads the full `SKILL.md` from its filesystem or provider resource.

Explicit `$skill` mentions remain a separate path that injects the full skill instructions. For orchestrator skills, `skills.list` provides bounded discovery metadata before `skills.read` returns the full selected resource.

## Test plan

- `just test -p codex-core-skills`
- `just test -p codex-skills-extension`
- `just test -p codex-external-agent-migration`

The focused regressions verify that overlong metadata is preserved at load and migration boundaries while default available-skills rendering and `skills.list` output produce the 1021-character prefix plus `...`.
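The rendering cap described in this change amounts to the following. This is a minimal Python sketch; `cap_description` and `MAX_DESCRIPTION_CHARS` are illustrative names, and Python's `len` counts code points, which is assumed here to match the PR's "Unicode characters".

```python
MAX_DESCRIPTION_CHARS = 1024  # limit named in the PR


def cap_description(description: str, limit: int = MAX_DESCRIPTION_CHARS) -> str:
    """Cap a description for model-visible lists; stored metadata is untouched."""
    if len(description) <= limit:
        return description
    # Keep the first limit - 3 characters (1021 by default) and append "...".
    return description[: limit - 3] + "..."
```

The function is applied only at the rendering boundary, so the caller keeps the full string for `skills.read`, explicit `$skill` injection, and any non-model consumer.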
charlesgong-openai ·
2026-06-19 12:47:53 -07:00 -
rphilizaire-openai ·
2026-06-19 10:13:27 -07:00 -
Add config toggles for orchestrator skills and MCP (#28942)
## Why Orchestrator-provided skills and Codex Apps MCP tools add model-visible instructions, resources, and tools beyond the local workspace. Hosts need config-level switches to disable those orchestrator-owned surfaces independently, without disabling regular skills or regular MCP servers. ## What changed - Adds `[orchestrator.skills].enabled` and `[orchestrator.mcp].enabled` config entries, both defaulting to `true`. - Includes the new settings in `config.schema.json` and in the config lock so resolved thread configuration preserves the same orchestrator exposure decisions. - Threads `orchestrator.skills.enabled` through the app-server skills extension so disabled orchestrator skills do not expose the `skills` namespace or inject orchestrator skill context. - Gates Codex Apps MCP exposure, app instructions, and app auth eligibility on `orchestrator.mcp.enabled` while leaving non-Codex-Apps MCP tools available. - Updates the thread-manager sample config to disable both orchestrator-owned surfaces. ## Verification - Added config parsing, loading, defaulting, and schema coverage for the new settings. - Added MCP exposure coverage that `orchestrator.mcp.enabled = false` removes Codex Apps tools while preserving regular MCP tools. - Added app-server coverage that `orchestrator.skills.enabled = false` prevents orchestrator skill tools, prompts, and resource reads from reaching the model turn.
jif ·
2026-06-19 14:42:26 +02:00 -
[codex] Reuse parsed plugin skills during session startup (#28844)
## Summary - Preserve raw plugin skill-root snapshots in the matching loaded-plugin cache entry, keyed by the effective plugin root identity including namespace. - Pass those snapshots through `SkillsLoadInput` as an optional preload, so session startup reuses plugin parsing while ordinary skill loads pass `None`. - Keep plugin skill loading cohesive: the existing loaders accept the optional snapshots directly, and uncached or marketplace-detail paths do not create a cache. ## Why Plugin discovery already parses plugin skills to determine available capabilities. Cold session startup then scanned and parsed the same roots again while building the skills snapshot. This solves the same duplicate-work problem as #28623 while keeping ownership narrow: `PluginsManager` creates and owns `PluginSkillSnapshots` only for its loaded-plugin cache entry; `SkillsService` consumes an optional clone. Entry replacement or clearing naturally drops the snapshots, with no separate generation, capacity policy, or watcher coupling. ## Validation - `cargo clippy -p codex-core-skills --all-targets -- -D warnings` - `just test -p codex-core-plugins skills_service_reuses_skills_parsed_during_plugin_load` - `just test -p codex-core-skills namespaces_plugin_skills_using_provided_namespace` - `just fmt`
xl-openai ·
2026-06-18 16:45:58 -07:00 -
Add turn-scoped context contributions (#28911)
## Summary - keep context injection on a single ContextContributor trait - split context injection into thread-scoped and turn-scoped contribution methods - wire turn-scoped fragments into initial context assembly so extensions can contribute context from turn-local state
jif ·
2026-06-18 19:40:28 +02:00 -
[codex] Pass plugin namespace into skill loading (#28608)
## What changed - retain the parsed plugin manifest namespace on loaded plugins - carry that namespace through `PluginSkillRoot` and `SkillRoot` - use the provided namespace when qualifying plugin skill names - include the namespace in the skills cache key ## Why Plugin loading has already parsed `plugin.json`, but skill parsing currently walks every `SKILL.md` ancestor and probes/reads the manifest again to reconstruct the same namespace. Passing the parsed namespace removes those repeated filesystem calls, which are particularly costly on remote filesystems. Context: https://openai.slack.com/archives/C0ARA9GF5D4/p1781639496496439?thread_ts=1781202444.891669&cid=C0ARA9GF5D4 ## Impact Plugin skill names remain unchanged. A regression test uses a deliberately different on-disk manifest name to verify that plugin roots use the provided parsed namespace. ## Validation - `just test -p codex-core-skills -p codex-core-plugins -p codex-plugin -p codex-utils-plugins` (352 passed) - `just fix -p codex-core-skills -p codex-core-plugins -p codex-plugin -p codex-utils-plugins` - `just fmt`
Matthew Zeng ·
2026-06-18 00:16:46 -07:00 -
Replace SkillsManager with SkillsService (#28705)
## Why Host skill discovery was still exposed as a manager even though it is a process-owned service shared by sessions, the app-server catalog, and file-watcher invalidation. The skills extension also consumed an ad hoc loaded-skills wrapper instead of a named immutable snapshot. ## What changed - replace `SkillsManager` with concrete `SkillsService` - make the service cache and return immutable `HostSkillsSnapshot` values - migrate the skills extension host provider to the snapshot boundary - migrate app-server catalog, watcher, and invalidation paths to the service This keeps the service limited to host discovery, caching, roots, and invalidation. Catalog rendering and invocation remain extension responsibilities for the next stacked change.
jif ·
2026-06-17 17:01:06 +02:00 -
[codex] exec-server: stream files in chunks (#28354)
## Why `fs/readFile` buffers the entire file in one response, which makes large remote reads expensive and prevents callers from applying backpressure. We need an opt-in streaming path with bounded block sizes while preserving the existing single-call API for small and sandboxed reads. ## What changed - Add `ExecServerClient::stream`, returning a named `FileReadStream` that implements `futures::Stream` and yields immutable 1 MiB byte blocks. - Add internal `fs/open`, `fs/readBlock`, and `fs/close` RPCs. `fs/readBlock` accepts an explicit offset and length. - Keep unsandboxed files open between block reads, cap open handles per connection, and clean them up on EOF, error, stream drop, explicit close, or connection shutdown. - Reject platform-sandboxed streaming opens instead of turning the one-shot sandbox helper into a persistent server. Existing `fs/readFile` behavior is unchanged. ## Testing - `just test -p codex-exec-server` - Integration coverage for 1 MiB chunking, exact block-boundary EOF, sandbox rejection, and continued reads from the opened file after path replacement. - Handle-manager coverage for non-sequential offsets, variable block lengths, the 128-handle limit, and capacity release after close.
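The open / read-block / close pattern can be illustrated with a local file handle standing in for the remote RPCs. This Python sketch is not the `ExecServerClient` API: `read_block` and `stream_blocks` are hypothetical stand-ins, and only the 1 MiB block size and the explicit offset and length come from the description above.

```python
from typing import BinaryIO, Iterator

BLOCK_SIZE = 1024 * 1024  # 1 MiB, the block size named in the PR


def read_block(handle: BinaryIO, offset: int, length: int) -> bytes:
    """Local stand-in for one `fs/readBlock` call with explicit offset and length."""
    handle.seek(offset)
    return handle.read(length)


def stream_blocks(handle: BinaryIO, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield immutable byte blocks until an empty read signals EOF."""
    offset = 0
    while True:
        block = read_block(handle, offset, block_size)
        if not block:
            return  # EOF, including the exact block-boundary case
        yield block
        offset += len(block)
```

Because the generator only reads the next block when the consumer asks for it, the caller controls the pace, which is the backpressure property the one-shot `fs/readFile` call cannot offer.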
pakrym-oai ·
2026-06-16 09:50:55 -07:00 -
skills: cache orchestrator resources per thread (#28336)
## Why Hosted orchestrator skills are read through the remote MCP resource server. Within one thread, the same catalog or skill resource can be requested multiple times by prompt injection and the `skills.list` / `skills.read` tools. Re-fetching adds latency and can make those surfaces observe different remote contents during the same thread. This is a follow-up to #28333: orchestrator skills remain limited to threads without a local executor, and those threads now get a stable per-thread view of the remote skill data they use. ## What changed - Reuse the existing per-thread orchestrator catalog snapshot for `skills.list` and `skills.read` availability checks. - Cache successful orchestrator resource reads by authority, package, and resource so prompt injection and tool calls share the same contents. - Keep the cache memory-only and bounded to 100 resources and 8 MiB per thread. - Leave host and executor skill reads unchanged, and do not cache failed remote reads. ## Verification - Extended the app-server MCP resource integration test to read the same hosted skill resource twice and verify that the remote server receives one read. - The same test verifies that catalog discovery and the selected skill's main prompt are each fetched only once per thread.
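A bounded, memory-only cache of this kind might look like the Python sketch below. The key shape and the two limits follow the description; everything else is an assumption, in particular the class name `ThreadResourceCache` and the choice to serve but not store a read that would exceed the budget (the PR text does not say what happens at the limit).

```python
from typing import Callable

MAX_RESOURCES = 100
MAX_TOTAL_BYTES = 8 * 1024 * 1024  # 8 MiB per thread

ResourceKey = tuple[str, str, str]  # (authority, package, resource)


class ThreadResourceCache:
    """Memory-only cache of successful remote reads for one thread (hypothetical)."""

    def __init__(self) -> None:
        self._entries: dict[ResourceKey, bytes] = {}
        self._total_bytes = 0

    def read(self, key: ResourceKey, fetch: Callable[[ResourceKey], bytes]) -> bytes:
        cached = self._entries.get(key)
        if cached is not None:
            return cached
        # An exception from fetch propagates, so failed reads are never cached.
        contents = fetch(key)
        fits = (
            len(self._entries) < MAX_RESOURCES
            and self._total_bytes + len(contents) <= MAX_TOTAL_BYTES
        )
        if fits:  # assumption: over-budget reads are returned but not stored
            self._entries[key] = contents
            self._total_bytes += len(contents)
        return contents
```

Sharing one instance between prompt injection and the `skills.*` tools is what gives both surfaces the same contents within a thread.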
jif ·
2026-06-15 20:20:19 +02:00 -
skills: hide orchestrator skills with a local executor (#28333)
## Why App-server threads without a local executor need orchestrator-owned skills from the hosted `codex_apps` MCP server. Threads with the local executor already discover installed skills from the local filesystem. After the orchestrator skill provider was enabled for every app-server thread, local-executor threads also received the hosted skill catalog and the `skills.list` and `skills.read` tools. This changed the existing local behavior and could expose a second hosted copy of a skill that was already installed locally. ## What changed - Expose the thread's selected execution environments to extensions at thread startup. - Enable orchestrator skills only when the reserved local environment is not selected. - Apply that decision consistently to hosted skill catalog discovery, explicit skill injection, and the `skills.list` and `skills.read` tools. ## Verification - The existing no-executor app-server test continues to verify hosted skill discovery, invocation, and child-resource reads. - A new app-server test verifies that local-executor threads do not receive hosted skill context or `skills.*` tools.
jif ·
2026-06-15 17:15:45 +02:00 -
[codex] make PathUri::from_abs_path infallible (#27976)
## Why `PathUri::from_abs_path` can fail for absolute paths that do not have a normal `file:` URI representation, forcing filesystem call sites to handle a conversion error even though the original path can be preserved losslessly. ## What Make `from_abs_path` infallible and migrate its callers. Unrepresentable paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The leading encoded null reserves a namespace that cannot collide with a real Unix or Windows path, and fallback URIs remain opaque to lexical path operations. ## Validation Added path-URI coverage for Unix null and non-UTF-8 paths, Windows device/verbatim and non-Unicode paths, serialization, malformed fallbacks, opaque lexical operations, invalid native payloads, and literal `/bad/path` collision resistance.
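The fallback form can be approximated in Python for the Unix-bytes case. Only the `file:///%00/bad/path/` prefix comes from the description; the helper names are hypothetical, and the URL-safe base64 alphabet is an assumption, since the PR text does not say which variant is used. Windows paths, which the PR encodes as UTF-16LE, are not modelled.

```python
import base64

FALLBACK_PREFIX = "file:///%00/bad/path/"  # prefix quoted from the PR description


def encode_fallback(raw_path: bytes) -> str:
    """Wrap raw Unix path bytes in the reserved fallback namespace."""
    # Assumption: URL-safe base64; the actual alphabet is not stated in the PR.
    payload = base64.urlsafe_b64encode(raw_path).decode("ascii")
    return FALLBACK_PREFIX + payload


def decode_fallback(uri: str) -> bytes:
    """Validate the prefix and payload, then recover the original bytes."""
    if not uri.startswith(FALLBACK_PREFIX):
        raise ValueError("not a fallback URI")
    payload = uri[len(FALLBACK_PREFIX):]
    # binascii.Error is a ValueError subclass, so malformed payloads raise ValueError.
    return base64.b64decode(payload, altchars=b"-_", validate=True)
```

A literal `/bad/path/...` directory does not start with the encoded null segment, so it fails the prefix check and cannot be mistaken for a fallback URI.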
Adam Perry @ OpenAI ·
2026-06-12 16:58:42 -07:00 -
[codex] Add size to internal filesystem metadata (#27927)
## Why `ExecutorFileSystem::get_metadata` reports file kind and timestamps but not size. Internal callers that need to enforce a size limit therefore have to read the complete file first, which is especially wasteful for remote filesystems. This adds the missing internal metadata so consumers can reject oversized files before transferring or buffering them. The field is named `size`, matching VS Code's `FileStat.size` filesystem convention. ## What changed - add `size: u64` to internal `FileMetadata` - populate it from the underlying filesystem metadata - carry it through sandbox-helper and remote exec-server responses - cover files, directories, symlink targets, and sandboxed reads across local and remote filesystem implementations The new field is intentionally not exposed through the app-server API. ## Testing - `just test -p codex-exec-server get_metadata` - `just test -p codex-exec-server file_system_sandboxed_metadata_and_read_allow_readable_root` - `just test -p codex-core-plugins` - `just test -p codex-skills-extension`
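The benefit of having `size` in metadata is that a limit can be enforced before any bytes are transferred. The Python sketch below shows the same idea against the local filesystem using `os.stat`; `read_if_within_limit` is a hypothetical helper, not part of `ExecutorFileSystem`.

```python
import os


def read_if_within_limit(path: str, max_bytes: int) -> bytes:
    """Check the reported size first and read the file only if it fits."""
    size = os.stat(path).st_size  # metadata lookup, no file contents read
    if size > max_bytes:
        raise ValueError(f"{path} is {size} bytes, limit is {max_bytes}")
    with open(path, "rb") as handle:
        return handle.read()
```

On a remote filesystem the metadata call is a small request, whereas reading the whole file just to measure it would move the full payload over the wire.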
pakrym-oai ·
2026-06-12 12:12:08 -07:00 -
[codex] Remove async_trait from first-party code (#27475)
## Why First-party async traits should expose their `Send` contracts explicitly without requiring `async_trait`. This completes the migration pattern established in #27303 and #27304. ## What changed - Replaced the remaining first-party `async_trait` traits with native return-position `impl Future + Send` where statically dispatched and explicit boxed `Send` futures where object safety is required. - Kept implementations behavior-preserving, outlining existing async bodies into inherent methods where that keeps the diff reviewable. - Removed all direct first-party `async-trait` dependencies and the workspace dependency declaration. - Added a cargo-deny policy that permits `async-trait` only through the remaining transitive wrapper crates. - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and keep the full cargo-deny check passing. ## Validation - `just test -p codex-exec-server`: 216 passed, 2 skipped. - `just test -p codex-model-provider`: 39 passed. - `just test -p codex-core` and `just test`: changed tests passed; remaining failures are environment-sensitive suites unrelated to this migration. - `cargo deny check` - `just fix` - `just fmt` - `cargo shear` - `just bazel-lock-check`
Adam Perry @ OpenAI ·
2026-06-11 18:16:39 -07:00 -
[codex] migrate ExecutorFileSystem paths to PathUri (#27424)
## Why We're moving exec-server to use PathUri for its internal path representations. ## What Move `ExecutorFileSystem` APIs to use `PathUri` instead of `AbsolutePathBuf`. Future changes will convert higher-level parts of exec-server.
Adam Perry @ OpenAI ·
2026-06-11 18:44:18 +00:00 -
[codex] remove EnvironmentPathRef (#27433)
We're switching to using a static encoding of the host path in `PathUri`. We may need a type like this again but we can add it when it's more compelling. Stacked on #27454.
Adam Perry @ OpenAI ·
2026-06-11 18:26:12 +00:00 -
skills: decouple the skills extension from core (#27413)
## Why `ext/skills` currently depends on `codex-core` for two host concerns: reading the concrete `Config` type and borrowing core-owned model-context fragment types. That coupling prevents the extension from being assembled independently above core and leaves context that belongs to the skills feature owned by core. This stacked PR introduces the host boundary needed for the broader extension migration while intentionally preserving existing skills behavior. It is stacked on #27404. ## What changed - Adds a small public `SkillsExtensionConfig` view and makes skills installation generic over the host config type. - Requires the host to map its config into that view; app-server supplies the current `Config` values. - Moves the available-skills and selected-skill context fragment implementations into `ext/skills`, preserving their roles, markers, and rendered bytes. - Removes the direct `codex-core` dependency from `codex-skills-extension`. - Keeps local discovery, invocation, side effects, and the `codex-core-skills` compatibility types unchanged for later staged PRs. ## Behavior This adds no capability and is intended to have no user-visible or model-visible behavior change. The install API and ownership boundary change internally; emitted skills context remains byte-for-byte compatible. ## Validation - Updates the skills extension integration coverage to use a host-owned test config. - Asserts the complete rendered catalog and selected-skill fragments, including their roles and markers. - `just bazel-lock-check` - Rust tests and Clippy were not run locally per request; CI will run them.
jif ·
2026-06-11 14:03:53 +02:00 -
skills: render catalog locators by authority (#27591)
## Why Hosted skills introduced by #27388 use opaque `skill://` resource identifiers, but the skills catalog rendered every locator as a `file` and told the model that every skill body lived on disk. That can send the model toward filesystem tools for a resource that must instead be read through its owning authority. The catalog should describe how each source is accessed without changing the underlying discovery or invocation behavior. ## What changed - Render host skills as `file`, executor-owned skills as `environment resource`, orchestrator-owned skills as `orchestrator resource`, and custom-provider skills as `custom resource`. - Update the shared no-alias guidance to describe source locators rather than assuming every skill is stored on the host filesystem. - Direct orchestrator resources through `skills.list` and `skills.read`, and explicitly tell the model not to treat `skill://` identifiers as filesystem paths. - Preserve the existing filesystem and alias behavior for local skills. ## Scope This PR changes only model-visible catalog rendering and guidance. It does not change skill discovery, selection, prompt injection, provider routing, catalog caching or refresh behavior, resource validation, or the `skills.*` tool contract. ## Verification - Extended skills-extension coverage for host-file and executor-resource labels. - Extended the no-executor app-server flow to assert orchestrator-resource wording and non-filesystem guidance.
jif ·
2026-06-11 13:51:04 +02:00 -
nit: cap error (#27585)
Just cap an error that could end up in the model context
jif ·
2026-06-11 12:52:46 +02:00 -
skills: expose remote skill resource tools (#27388)
## Why

PR #27387 makes backend plugin skills discoverable and invocable without an executor, but resources referenced by those skills still sit behind the generic MCP resource surface. The model needs a skills-owned API that preserves the provider authority and package boundary instead of treating remote resources like local files.

This is stacked on #27387.

## What

- Adds one `skills` namespace with bounded `list` and `read` tools for remote skill providers.
- Revalidates `authority + package` against the live remote catalog on every read, then routes the opaque resource ID back through that provider.
- Allows the backend provider to read canonical child `skill://` resources while rejecting cross-package, non-canonical, query, fragment, and traversal-shaped URIs.
- Caps each serialized tool result at 8 KB. Lists are paginated; reads return an opaque continuation cursor.
- Marks the JSON output as external context so memory generation can apply its normal suppression policy.
- Deliberately does not add `skills.search`; that waits for a bounded plugin-service search contract.

## Tool contract

Pseudo-Python matching the wire shape:

```python
from typing import Literal, NotRequired, TypedDict


class RemoteSkillAuthority(TypedDict):
    kind: Literal["remote"]
    id: str  # e.g. "codex_apps"


class RemoteSkill(TypedDict):
    authority: RemoteSkillAuthority
    package: str  # opaque provider-owned package ID
    name: str
    description: str
    main_resource: str  # opaque provider-owned SKILL.md ID


class SkillsListParams(TypedDict):
    cursor: NotRequired[str]


class SkillsListResult(TypedDict):
    skills: list[RemoteSkill]
    next_cursor: str | None
    warnings: list[str]
    truncated: bool


class SkillsReadParams(TypedDict):
    authority: RemoteSkillAuthority  # copied from skills.list
    package: str  # copied from skills.list
    resource: str  # provider-owned child resource ID
    cursor: NotRequired[str]  # copy next_cursor to continue


class SkillsReadResult(TypedDict):
    resource: str
    contents: str
    next_cursor: str | None
    truncated: bool


class Skills:
    def list(self, params: SkillsListParams) -> SkillsListResult: ...
    def read(self, params: SkillsReadParams) -> SkillsReadResult: ...
```

There is one namespace for all remote skills, not one tool or MCP server per skill. No resource ID is converted into a filesystem path.

## Backend dependency

`/ps/mcp` must support direct reads of child resources such as `skill://plugin_demo/deploy/references/deploy.md`. This PR implements and tests the Codex side of that contract; production child reads remain dependent on the corresponding plugin-service support. Search remains out of scope until that service exposes a bounded search/resource API.

## Validation

- Added an app-server integration test covering `skills.list` followed by `skills.read` with no executor.
- Ran `just fmt`.
- Ran `just bazel-lock-update` and `just bazel-lock-check`.
- Did not run Rust tests or Clippy locally, per request; CI will run them.
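A caller that wants the full catalog has to follow `next_cursor` across pages. The loop below is a sketch of that usage in Python; `list_all_skills` and the `list_page` callback are hypothetical, and only the `cursor`, `skills`, and `next_cursor` field names come from the contract above.

```python
from typing import Callable, Optional


def list_all_skills(list_page: Callable[[dict], dict]) -> list[dict]:
    """Call a skills.list-shaped function until it reports no further page."""
    skills: list[dict] = []
    cursor: Optional[str] = None
    while True:
        # The first request carries no cursor; later requests copy next_cursor.
        params = {} if cursor is None else {"cursor": cursor}
        page = list_page(params)
        skills.extend(page["skills"])
        cursor = page["next_cursor"]
        if cursor is None:
            return skills
```

The cursor is treated as opaque: it is copied back verbatim and never parsed, which leaves the provider free to change its paging scheme.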
jif ·
2026-06-11 12:38:04 +02:00 -
skills: cache remote catalog failures per thread (#27403)
## Summary - cache the first remote skill catalog outcome per thread, including failures - preserve discovery errors as catalog warnings - update the existing cache regression test to verify failed discovery is attempted once ## Why A failed or hanging `codex_apps` `resources/list` call could run once while building initial context and immediately again while contributing first-turn input. With the discovery timeout, an ordinary Apps turn could wait up to 20 seconds before inference and retry again on later turns even when no remote skill was mentioned. Caching a warning-only empty catalog preserves graceful degradation while preventing repeated synchronous discovery attempts. ## Testing - `just fmt` - Tests and Clippy not run per request; CI will validate the change.
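Caching the first outcome, success or failure, can be sketched as below. This Python model is illustrative only: `CatalogOutcome`, `ThreadCatalog`, and the warning text are assumptions, not names from the skills extension.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class CatalogOutcome:
    """Result of one discovery attempt: skills plus any warnings (hypothetical)."""

    skills: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)


class ThreadCatalog:
    """Runs remote discovery at most once per thread, even when it fails."""

    def __init__(self, discover: Callable[[], list[str]]) -> None:
        self._discover = discover
        self._outcome: Optional[CatalogOutcome] = None

    def get(self) -> CatalogOutcome:
        if self._outcome is None:
            try:
                self._outcome = CatalogOutcome(skills=self._discover())
            except Exception as error:
                # Degrade to an empty catalog and keep the error as a warning.
                self._outcome = CatalogOutcome(
                    warnings=[f"remote skill discovery failed: {error}"]
                )
        return self._outcome
```

Because the failed outcome is stored like any other, initial context assembly and first-turn input see the same warning-only catalog instead of each paying the discovery timeout.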
jif ·
2026-06-11 11:46:47 +02:00 -
skills: make backend plugin skills invocable without an executor (#27387)
## Why #27198 made the extension-owned `codex_apps` MCP connection the hosted plugin runtime, but its `mcp/skill` resources still bypassed the skills extension. App-server could list and read those resources through generic MCP APIs, but a thread with no selected environment did not expose them in the model's skills catalog or load their `SKILL.md` through `$skill`. Hosted skills should stay remote while using the same typed catalog, source authority, deduplication, bounded contextual catalog, and selected-skill prompt injection as host and executor skills. They should not be downloaded or exposed as ambient filesystem paths. ## What changed - Add a session-scoped `McpResourceClient` over the replaceable MCP connection manager so resource list/read calls follow startup and refresh replacements. - Add a `BackendSkillProvider` that pages `codex_apps` resources, accepts bounded and validated `mcp/skill` entries, and reads a selected skill's `SKILL.md` through the same MCP connection. - Register the remote provider in app-server and include it in the skills catalog even when a thread has no selected capability roots or executor. - Contribute hosted skill metadata through the bounded `AvailableSkillsInstructions` developer-context path, exclude remote entries from per-turn catalog injection, and classify `<skills>` messages as contextual developer content so rollback can trim and rebuild them correctly. ## Testing - Extend the app-server MCP resource integration test with `environments: []` to exercise two-page discovery, filter a non-`mcp/skill` resource, verify the escaped developer catalog entry and user-role `<skill>` fragment containing the fetched `SKILL.md`, and preserve generic MCP resource reads. - Add core event-mapping coverage that classifies `<skills>` developer messages as contextual history.
jif ·
2026-06-11 11:28:16 +02:00 -
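The paged discovery described in the backend-skills change above can be pictured roughly as follows. This is a sketch under assumptions: the `Resource` and `ResourcePage` shapes, the `kind` field used to recognize `mcp/skill` entries, and the `MAX_PAGES`/`MAX_SKILLS` bounds are all hypothetical, and the real provider talks to an MCP connection rather than a closure.

```rust
/// Hypothetical shape of one listed resource.
#[derive(Debug, Clone, PartialEq)]
pub struct Resource {
    pub uri: String,
    pub kind: String,
}

/// Hypothetical shape of one page of a `resources/list` response.
pub struct ResourcePage {
    pub resources: Vec<Resource>,
    pub next_cursor: Option<String>,
}

/// Illustrative bounds so a misbehaving server cannot grow the catalog
/// or the number of list calls without limit.
pub const MAX_PAGES: usize = 16;
pub const MAX_SKILLS: usize = 64;

/// Follows cursors page by page and keeps only `mcp/skill` entries.
/// `list_page` stands in for the MCP list call and receives the current cursor.
pub fn collect_skill_resources<F>(mut list_page: F) -> Result<Vec<Resource>, String>
where
    F: FnMut(Option<&str>) -> Result<ResourcePage, String>,
{
    let mut skills = Vec::new();
    let mut cursor: Option<String> = None;
    for _ in 0..MAX_PAGES {
        let page = list_page(cursor.as_deref())?;
        for resource in page.resources {
            // Non-skill resources are ignored; skill entries beyond the bound are dropped.
            if resource.kind == "mcp/skill" && skills.len() < MAX_SKILLS {
                skills.push(resource);
            }
        }
        match page.next_cursor {
            Some(next) => cursor = Some(next),
            None => break,
        }
    }
    Ok(skills)
}
```

A two-page listing with one non-skill resource, as in the integration test described above, would yield only the two skill entries after exactly two list calls.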
Remove async-trait from extension contributors (#27383)
## Why

Extension contributors are registered behind `dyn Trait` objects, so native `async fn`/RPITIT methods would make these traits non-object-safe. Spell out the boxed, `Send` future contract directly so `extension-api` no longer needs `async-trait` while retaining the existing runtime model.

## What changed

- add a shared `ExtensionFuture` alias and use it for asynchronous contributor methods
- migrate production and test implementations to return `Box::pin(async move { ... })`
- remove `async-trait` dependencies where they are no longer used, keeping it dev-only where unrelated test executors still require it

## Behavior

No behavior change is intended. Contributor futures remain boxed, `Send`, dynamically dispatched, and lazily executed; cancellation and callback ordering stay unchanged.

## Testing

- `just test -p codex-extension-api` (11 passed)
- affected extension crates (64 passed)
- targeted `codex-core` contributor tests (14 passed)
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`

A broad local `codex-core` run compiled successfully but encountered unrelated sandbox and missing test-binary fixture failures; CI will run the full checks.
jif ·
2026-06-10 14:31:09 +02:00 -
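To make the boxed-future contract from the `async-trait` removal above concrete, here is a minimal sketch. The alias name `ExtensionFuture` comes from the PR description, but its exact definition, the `Contributor` trait, `Greeter`, and the tiny `block_on` helper are assumptions made for illustration; the real crates run on an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Boxed, `Send` future. The name is from the PR; this definition is assumed.
pub type ExtensionFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical contributor trait. Returning a boxed future instead of using
/// `async fn` keeps the trait usable as `dyn Contributor`.
pub trait Contributor: Send + Sync {
    fn contribute<'a>(&'a self, input: &'a str) -> ExtensionFuture<'a, String>;
}

pub struct Greeter;

impl Contributor for Greeter {
    fn contribute<'a>(&'a self, input: &'a str) -> ExtensionFuture<'a, String> {
        // The body only runs when the returned future is polled.
        Box::pin(async move { format!("hello, {input}") })
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal polling executor, present only so the sketch runs without a runtime.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut context = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut context) {
            return value;
        }
        std::thread::yield_now();
    }
}

/// Dispatches dynamically over registered contributors.
pub fn run_all(contributors: &[Box<dyn Contributor>], input: &str) -> Vec<String> {
    contributors
        .iter()
        .map(|contributor| block_on(contributor.contribute(input)))
        .collect()
}
```

Because `contribute` returns a concrete boxed type, `Vec<Box<dyn Contributor>>` compiles without any attribute macro, which is the property the change relies on.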
Load selected executor skills through extensions (#27184)
## Why

CCA is moving toward a split runtime where the orchestrator may not have a filesystem, while executors can expose preinstalled plugins and skills. A thread therefore needs to select capabilities without asking app-server or core to interpret executor-owned paths through the orchestrator's filesystem.

The longer-term model is broader than executor skills:

- A plugin is a bundle of skills, MCP servers, connectors/apps, and hooks.
- A plugin root can be local, executor-owned, or hosted by a backend.
- Components inside one plugin can use different access and execution mechanisms. A skill may be read from a filesystem or through backend tools; an HTTP MCP server can run without an executor; a stdio MCP server or hook needs an execution environment.
- Core should carry generic extension initialization data. The extension that owns a component should discover it, expose it to the model, and invoke it through the appropriate runtime.

This PR establishes that architecture through one complete vertical: selecting a root on an executor, discovering the skills beneath it, exposing those skills to the model, and reading an explicitly invoked `SKILL.md` through the same executor.

## Contract

`thread/start` gains an experimental `selectedCapabilityRoots` field:

```json
{
  "selectedCapabilityRoots": [
    {
      "id": "deploy-plugin@1",
      "location": {
        "type": "environment",
        "environmentId": "workspace",
        "path": "/opt/codex/plugins/deploy"
      }
    }
  ]
}
```

The root is intentionally not classified as a "plugin" or "skill" in the API. It can point at a standalone skill, a directory containing several skills, or a plugin containing skills and other components. This PR only teaches the skills extension how to consume it; later extensions can resolve MCP, connector, and hook components from the same selection.

The platform-supplied `id` is stable selection identity. The location says which runtime owns the root and gives that runtime an opaque path. App-server does not inspect or canonicalize the path.

## What changed

### Generic thread extension initialization

App-server converts selected roots into `ExtensionDataInit`. Core carries that generic initialization value until the final thread ID is known, then creates thread-scoped `ExtensionData` before lifecycle contributors run.

This keeps `Session` and core independent of the capability-selection contract. The initialization value is consumed during construction; it is not retained as another long-lived `Session` field.

### Executor-backed skills

The skills extension now owns an `ExecutorSkillProvider` that:

- resolves the selected environment through `EnvironmentManager`
- discovers, canonicalizes, and reads skills through that environment's `ExecutorFileSystem`
- contributes the bounded selected-skill catalog as stable developer context
- reads an explicitly invoked skill body through the authority that listed it
- warns when an environment or root is unavailable
- never falls back to the orchestrator filesystem for an executor-owned root

Skill catalog and instruction fragments have hard byte bounds, which also bound them below the 10K-token per-item context limit.

If a selected executor skill has the same name as a legacy local skill, the executor selection owns that invocation and the local body is not injected a second time.

Existing local and bundled skill loading remains in place. Omitting `selectedCapabilityRoots` therefore preserves current local-only behavior.

## Current semantics

- Only environment-owned locations are represented in this first contract.
- Roots are resolved by the destination extension, not by app-server or core.
- An unavailable executor or invalid root produces a warning and no capabilities from that root; it does not trigger a local-filesystem fallback.
- Selection applies to a newly started active thread.
- MCP servers, connectors, and hooks beneath a selected plugin root are not activated yet.
- Selection is not yet persisted or inherited across resume, fork, or subagent creation. Existing local capabilities continue to behave as they do today in those flows.

## Planned vertical follow-ups

1. **Hosted HTTP MCP:** add an extension-backed HTTP MCP source that works without an executor, then replace the special-purpose MCP plugins loader with that implementation.
2. **Executor MCP:** register and execute stdio MCP servers through the environment that owns the selected plugin root.
3. **Backend skills:** add a hosted skill source whose catalog and bodies are accessed through extension tools rather than a filesystem.
4. **Connectors and hooks:** activate those components through their owning extensions, using the same selected-root boundary and component-specific runtime.
5. **Durable selection:** define the desired-selection lifecycle, persist it, and make resume, fork, and subagent inheritance explicit rather than accidental.
6. **Local convergence:** incrementally route existing local plugin, skill, and MCP loading through the same extension model while preserving current local behavior.

Each follow-up remains reviewable as an end-to-end capability. The platform selects roots, generic thread extension data carries the selection, and the owning extension resolves and operates its component.

## Verification

Coverage added for:

- app-server end-to-end discovery and explicit invocation of a skill inside an executor-selected plugin root
- exclusive invocation when a selected executor skill collides with a local skill name
- executor filesystem authority for discovery, canonicalization, and reads
- thread extension initialization before lifecycle contributors run
- stable executor catalog context, explicit invocation, context rebuilding, hidden skills, and preserved host/remote catalog behavior

Targeted protocol, core-skills, skills-extension, core lifecycle, and app-server executor-skill tests were run during development.
jif ·
2026-06-09 19:51:54 +02:00 -
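The "warn, never fall back" rule from the executor-skills change above can be sketched as follows. The types here are hypothetical stand-ins: `SelectedRoot` loosely mirrors one `selectedCapabilityRoots` entry, and `ExecutorFs` is a plain map standing in for an executor-owned filesystem, not the real `ExecutorFileSystem`.

```rust
use std::collections::HashMap;

/// Loosely mirrors one `selectedCapabilityRoots` entry; field names are assumed.
#[derive(Debug, Clone)]
pub struct SelectedRoot {
    pub id: String,
    pub environment_id: String,
    pub path: String,
}

/// Hypothetical stand-in for an executor filesystem:
/// maps a root path to the skill names found beneath it.
pub type ExecutorFs = HashMap<String, Vec<String>>;

#[derive(Debug, Default, PartialEq)]
pub struct Resolution {
    pub skills: Vec<String>,
    pub warnings: Vec<String>,
}

/// Resolves each root only through the environment that owns it.
/// A missing environment or root yields a warning and no skills;
/// there is deliberately no local-filesystem fallback branch.
pub fn resolve_roots(
    roots: &[SelectedRoot],
    environments: &HashMap<String, ExecutorFs>,
) -> Resolution {
    let mut resolution = Resolution::default();
    for root in roots {
        match environments.get(&root.environment_id) {
            None => resolution.warnings.push(format!(
                "root {}: environment {} is unavailable",
                root.id, root.environment_id
            )),
            Some(fs) => match fs.get(&root.path) {
                Some(skills) => resolution.skills.extend(skills.iter().cloned()),
                None => resolution
                    .warnings
                    .push(format!("root {}: path is not a valid root", root.id)),
            },
        }
    }
    resolution
}
```

Keeping the path opaque to everything except the owning environment is what lets the orchestrator stay filesystem-free in this model.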
Bridge host-loaded skills into the skills extension (#26172)
## Why

The skills extension needs to become the path that exposes local host skills without losing the behavior already owned by core skill loading. Host skill discovery is not just `$CODEX_HOME/skills`: it also includes config layers, bundled-skill settings, plugin roots, runtime extra roots, and the filesystem for the selected primary environment.

Rather than making the extension reload host skills and risk drifting from that authoritative load, this PR bridges the already-loaded per-turn skills outcome into the extension. That lets the extension advertise host skills and inject explicit `$skill` prompts while preserving the same roots, disabled/hidden state, rendered paths, and environment-backed file reads that the legacy path uses.

## What Changed

- Adds `HostLoadedSkills` in `core-skills` to wrap the turn's `SkillLoadOutcome` and read `SKILL.md` through the filesystem that loaded that skill.
- Stores `HostLoadedSkills` in turn extension data for normal turns and review turns, so the skills extension can consume the loaded host catalog without reloading it.
- Adds `HostSkillProvider` under `ext/skills/src/provider/host.rs`, mapping host-loaded skill metadata into the skills-extension catalog/read contract.
- Registers the host provider by default from `codex_skills_extension::install()`.
- Preserves host skill metadata such as dependencies, disabled state, hidden-from-prompt policy, and slash-normalized display paths.
- Passes host-loaded skills through `SkillListQuery` and `SkillReadRequest` so explicit skill invocation reads only resources from the loaded host catalog.
- Adds integration coverage for a real legacy `$CODEX_HOME/skills/.../SKILL.md` skill being listed and injected through the installed extension.

## Testing

- Added `installed_extension_loads_host_skills_from_legacy_roots` in `ext/skills/tests/skills_extension.rs`.
- `just test -p codex-skills-extension`
jif ·
2026-06-04 15:28:06 +02:00 -
Implement v1 skills extension prompt injection (#26167)
## Why

The skills extension needs a real turn-time path before host, executor, or remote skills can be routed through it. The previous code was mostly a placeholder catalog/provider sketch, so there was no bounded available-skills fragment, no source-owned `SKILL.md` read, and no place for warnings or per-turn selection state to live.

This PR makes `ext/skills` the authority-preserving flow for listing candidate skills and injecting only explicitly selected main prompts, without adding more of that logic to `codex-core`.

## What changed

- Expands catalog entries with `main_prompt`, display path, short description, dependency metadata, enabled/prompt visibility flags, and authority/package-aware read requests.
- Replaces the placeholder `providers/*` modules with `SkillProviderSource` and `SkillProviders`, routing list/read/search calls by source kind and surfacing provider failures as warnings.
- Adds bounded available-skills rendering and `SKILL.md` main-prompt truncation before the fragments enter model context.
- Resolves explicit skill selections from structured `UserInput::Skill`, skill-file mentions, `skill://...` paths, and plain `$skill` text mentions, then reads selected prompts through their owning provider.
- Stores mutable per-thread skills config and per-turn catalog/selection/warning state.
- Adds `install_with_providers` so tests and future host wiring can supply concrete providers.

## Testing

- Not run locally.
- Added `codex-rs/ext/skills/tests/skills_extension.rs` coverage for available-catalog injection, selected prompt injection through the owning provider, and prompt-hidden skills that remain invokable.
jif ·
2026-06-03 16:24:16 +02:00 -
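The prompt-injection change above mentions truncating `SKILL.md` main prompts before they enter model context, without saying how. One plausible way to enforce a hard byte bound on UTF-8 text is sketched below; the function name and the choice to cut at the nearest earlier character boundary are assumptions, not the extension's actual behavior.

```rust
/// Returns the longest prefix of `text` that fits in `max_bytes`
/// without splitting a multi-byte UTF-8 character.
/// Hypothetical helper; the real truncation strategy is not described in the PR.
pub fn truncate_to_bytes(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    // Index 0 is always a character boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

Cutting on a character boundary keeps the fragment valid UTF-8 while guaranteeing the byte bound is never exceeded, at the cost of occasionally returning a few bytes fewer than the limit.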
skills: resolve per-turn catalogs from turn input context (#26106)
## Why

The skills extension needs the resolved turn environments to build a real per-turn `SkillListQuery`. The previous `TurnLifecycleContributor` hook only had a turn id, so it could only seed a placeholder query and never carry the executor authorities that executor-scoped skill routing will need.

Moving catalog resolution onto `TurnInputContributor` puts the skills extension on the same turn-preparation path that already has the environment ids and working directories for the submitted turn, while keeping the actual prompt injection work for follow-up changes.

## What changed

- switch `ext/skills` from `TurnLifecycleContributor` to `TurnInputContributor`
- build `executor_authorities` from `TurnInputContext.environments` and pass them through `SkillListQuery`
- keep storing the resolved catalog in `SkillsTurnState`, but drop the placeholder query helper that no longer matches the real data flow
- update the extension TODOs to reflect that per-turn catalog resolution now happens in the turn-input contributor, and that prompt/context injection still needs to move later

## Testing

- Not run locally.
jif ·
2026-06-03 13:32:55 +02:00 -
feat: add skills extension scaffold (#25953)
## Disclaimer

This is only here for iteration purposes! Do not make any code rely on this.

## Why

Skills still live behind `codex-core` discovery and injection paths, but the extension system needs an authority-aware home before that logic can move. This adds that boundary without changing current skills behavior, and keeps host, executor, and remote skills distinct so future list/read/search flows do not collapse back to ambient local paths.

## What changed

- Add the `codex-skills-extension` workspace/Bazel crate under `ext/skills`.
- Define the initial catalog, authority, provider, and turn-state types for authority-bound skill packages and resources.
- Register placeholder thread/config/prompt/turn lifecycle contributors plus host, executor, and remote provider aggregation points.
- Capture the remaining extraction work as TODOs, including the missing extension API hooks needed for per-turn catalog construction and typed skill injection.
- Keep plugins outside the runtime skills model: plugin-installed skills are treated as materialized host-owned skill sources once available.

## Verification

- Not run locally.
jif ·
2026-06-03 01:10:26 +02:00