mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
363 Commits
-
fix(remote-control): avoid server token refresh retry storms (#30201)
## Why Remote-control websocket reconnects and pairing requests proactively refresh their server token. When `/server/refresh` returns a transient error such as `502`, the still-valid token was discarded as a usable connection path, causing reconnect failures and repeated refresh attempts that could amplify an upstream incident. ## What Changed - Start proactive refresh five minutes before token expiry and distinguish it from a required refresh for missing or expired tokens. - Continue websocket and pairing operations with the existing valid token after `429`, `5xx`, or timeout failures. - Share an in-memory `next_refresh_at` throttle across websocket and pairing callers, honoring both `Retry-After` formats and otherwise using a jittered 24–36 second delay. - Keep required refreshes strict, preserve `404` enrollment replacement, and clear token/throttle state for `401` and `403` auth recovery. - Preserve refresh response metadata internally and add focused wire-level and integration coverage. ## Verification Added behavioral coverage proving that: - a valid near-expiry token still completes websocket and pairing requests after transient refresh failures; - `Retry-After` suppresses a subsequent refresh across websocket and pairing callers; - request and response-body timeouts are classified as transient; - an expired token, including one that expires during refresh, cannot proceed to websocket connection; - auth failures clear the attempted token without overwriting a concurrently rotated token.
Anton Panasenko ·
2026-06-26 17:34:52 -07:00 -
Project selected plugin runtime by environment availability (#30093)
## Why Selected plugin metadata is stable, but MCP processes are live runtime state. They need different lifetimes: - the MCP extension caches manifest, MCP, and connector declarations for each stable selected root; - each model step projects that cached metadata through the roots that resolved as ready for that exact step; - the MCP manager is rebuilt only when that availability projection changes. This matches executor skills: both features consume the same resolved step roots instead of inferring readiness from the turn's selected environments. ## Behavior ```text E1 not ready for this step -> no E1 MCP servers or connectors -> cached plugin metadata stays in ext/mcp E1 becomes ready -> reuse cached metadata -> publish one MCP runtime containing E1 capabilities same ready roots on the next step -> reuse the exact runtime; no rediscovery and no MCP restart resume -> create new extension thread state and a new MCP runtime ``` All model-facing consumers use the same step snapshot: ```text resolved selected roots | v extension MCP/connector projection | v { MCP config, connector snapshot, MCP manager } | +-> advertise model tools +-> build app/connector tools +-> execute MCP calls ``` ## Cache contract The existing MCP extension owns a cache keyed by the full `SelectedCapabilityRoot`: ```rust let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default); ``` The cache lives with extension thread state. Environment availability filters projection but does not invalidate metadata. Resume creates new thread state. There is no file watcher or executor generation because contents behind a stable environment/root are assumed stable. ## What changes - Keeps executor plugin discovery and cached metadata in `ext/mcp`. - Caches MCP and connector declarations together per selected root. - Uses the step's already-resolved capability roots, including lazy environments that are not turn environments. - Reuses the current MCP runtime when the ready-root projection is unchanged. - Uses the same step MCP manager and connector snapshot for model-visible tools and execution. - Resolves direct thread-scoped MCP requests from the current selected-root projection. ## Deliberately out of scope - `app/list` remains based on the latest global host-plugin state; this PR does not make its response or notifications thread-specific. - `required = true` startup semantics do not apply to delayed executor MCP activation. - No filesystem/content invalidation. - No transport-disconnect watcher. - No executor generations or environment replacement semantics. - No client sharing across complete manager replacements. ## Stack 1. Extension-owned World State sections. 2. Project executor skills through World State. 3. Pin one MCP runtime to each model step. 4. **This PR:** project selected MCP and connector state from extension-owned metadata. 5. Integration coverage for selected capability availability and resume. ## Verification - `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id` - The stacked integration PR covers unavailable to ready activation, unchanged-runtime reuse, skills, MCP tools, connector attribution, and cold resume.jif ·
2026-06-26 01:36:44 +01:00 -
Read connector declarations from executor plugins (#29852)
## Why Selected capability roots can live on a different executor and operating system from app-server. Their connector declarations must therefore be read through the executor that owns the package, without converting executor URIs into host paths. This PR adds that authority-bound reader without activating connectors or changing thread startup. ## What changed - Add a small `codex-connectors-extension` crate for executor-owned connector I/O. - Read only the app configuration explicitly declared by the resolved plugin manifest. - Read through the `ExecutorFileSystem` retained by `ResolvedExecutorPlugin`; there is no host-filesystem fallback or default-file probe. - Keep `PathUri` values intact so Windows, Unix, and remote executor paths work from any orchestrator OS. - Return full `AppDeclaration` values so the caller retains declaration names and categories for routing. - Preserve the selected plugin ID and exact executor URI in read and parse errors. The contract is intentionally narrow: selected packages are trusted, valid packages and packages that provide connectors explicitly declare their app configuration. ## Stack scope This PR is stacked on #29851. It only provides the executor-backed reader. #29856 resolves selected roots at thread start, freezes their connector snapshot, and contains the remote-capable end-to-end authority test for the complete path.
jif ·
2026-06-24 23:56:50 +01:00 -
[codex] Inject agent graph store into ThreadManager (#29736)
Pick up the AgentGraphStore migration. - Inject an explicit optional agent graph store into `ThreadManager` - Move all calls to spawn, close, recursive resume, and subtree/archive/delete/feedback traversal through it - Keep using `LocalAgentGraphStore` when SQLite is available This required some changes to the interface to deal with futures: - The interface now matches `ThreadStore`'s object-safe pattern by returning a boxed `AgentGraphStoreFuture` directly, allowing `ThreadManager` to hold `Arc<dyn AgentGraphStore>` *Slight behavior change!* Unfiltered subtree enumeration now performs a single all-status breadth-first traversal, so a closed grandchild beneath an open edge is included; the previous Open-then-Closed traversals could not cross mixed-status paths and silently omitted it.
Tom ·
2026-06-24 13:24:10 -07:00 -
protocol: separate app and exec RPC ownership (#29714)
## Why The app-server and exec-server expose separate JSON-RPC APIs, but exec-server currently sources its serialized protocol and envelope types through app-server-oriented code. Giving each API an explicit owner makes the crate boundary legible without introducing shared generic envelopes. ## What changed - Added `codex-exec-server-protocol` to own exec DTOs, process IDs, and JSON-RPC envelopes. - Updated exec-server clients, transports, handlers, and tests to use the new crate. - Exposed app-server's existing JSON-RPC types through a public `rpc` module while retaining root re-exports. - Preserved existing wire shapes, including exec `PathUri` behavior. ## Stack This is PR 1 of 6. Next: [PR #29721](https://github.com/openai/codex/pull/29721), which moves auth mode below the app wire boundary. ## Validation - Exec-server protocol and server coverage passed in the focused protocol test runs. - App-server protocol schema fixtures passed.
Adam Perry @ OpenAI ·
2026-06-23 22:37:31 +00:00 -
Update rmcp to 1.8.0 (#29634)
## Summary - Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0. - Adapt to the new shared `peer_info` return type. - Box OAuth status discovery at the MCP boundary to keep the expanded future type from overflowing Rust's trait recursion limit. This brings in custom OAuth HTTP client support from [modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).
jif ·
2026-06-23 15:25:28 +01:00 -
Share resumed rollout history (#28426)
## Summary Resuming a persisted thread currently deep-clones its complete rollout history several times. `InitialHistory` is retained for the app-server response, copied into thread persistence, and copied again by read-only accessors. These copies scale with the complete rollout rather than the bounded model context and add measurable latency for large sessions. This change stores resumed rollout history in `Arc<Vec<RolloutItem>>`. Rollout loading wraps the parsed vector once, while app-server response construction, session initialization, and thread persistence share it through inexpensive `Arc` clones. Read-only history access now returns a borrowed slice, and fork paths use `Arc::unwrap_or_clone` where they genuinely need mutable ownership. Rollout reconstruction also consumes its temporary context instead of cloning the reconstructed model history. The serialized representation remains unchanged. In an artificial 123 MB rollout benchmark, sharing resumed history reduced cold resume latency by roughly 9–10%. The affected crates compile with their test targets, all 80 thread-store tests pass, and the Bazel dependency lock remains valid.
Charlie Marsh ·
2026-06-23 10:23:25 -04:00 -
PAC 4 - Add macOS system proxy resolver (#26709)
## Summary Stacked on #26708. Adds the macOS implementation of the shared system-proxy contract. This allows Codex-owned auth clients to use the route macOS selects for each auth URL through SystemConfiguration and CFNetwork, including PAC and WPAD results. The `respect_system_proxy` feature is disabled by default, so existing client behavior remains unchanged unless explicitly enabled. ## Implementation - Adds the macOS-only `system-configuration` dependency to `codex-client`. - Dispatches system-proxy resolution to `outbound_proxy/macos.rs` on macOS. - Reads system proxy settings from `SCDynamicStore` and resolves the target URL with `CFNetworkCopyProxiesForURL`. - Executes PAC URLs and inline PAC JavaScript through a bounded run loop with a five-second timeout. - Handles `DIRECT`, HTTP proxies, and CFNetwork HTTPS entries using HTTP CONNECT; unsupported SOCKS entries map to `UnsupportedProxyScheme`. - Builds concrete proxy URLs from host and port entries, including IPv6 host bracketing. - Maps results into the shared `SystemProxyDecision::{Direct, Proxy, Unavailable}` contract. - Hashes URL-specific cache keys so PAC decisions remain distinct without retaining raw request URLs or query strings. ## End-user behavior - Disabled/default: existing client behavior is unchanged. - Enabled with `[features.respect_system_proxy]`: - macOS auth clients honor system proxy configuration, PAC, and WPAD; - valid OS/PAC `DIRECT` decisions use a direct connection; - unavailable system resolution falls back to explicit environment proxy variables, then `DIRECT`, through the shared contract from #26707. - Unsupported proxy schemes are not silently translated into another route. - Custom CA handling remains separate from proxy selection. - Known limitation: only the first supported system/PAC candidate is used. Subsequent proxy or `DIRECT` candidates are not attempted after a connection failure. This matches the current Windows behavior and leaves room for future ordered-fallback support. ## Tests - `just test -p codex-client` — 34 tests passed. - `just clippy -p codex-client` - `just fmt` - `just bazel-lock-check`
canvrno-oai ·
2026-06-22 17:56:04 -07:00 -
chore: advance tungstenite fork pins (#29480)
## Why `openai-oss-forks/tokio-tungstenite` now includes the updated `tungstenite` fork revision from [openai-oss-forks/tokio-tungstenite#3](https://github.com/openai-oss-forks/tokio-tungstenite/pull/3). Codex should consume the merged fork commit and resolve its direct and transitive `tungstenite` dependencies to the same revision instead of retaining the older pins. ## What Changed - Advanced the `tokio-tungstenite` git pin to `0e5b2d73aa18dd9f0a50ee9ff199d5aef7594186`. - Advanced the `tungstenite` fork pin to `4fffad30fe373adbdcffab9545e9e9bf4f2fc19f` and adjusted the patch source so the transitive dependency resolves to that revision. - Updated `Cargo.lock` and `MODULE.bazel.lock` to match the dependency graph.
Anton Panasenko ·
2026-06-22 14:24:02 -07:00 -
chore(deps): advance tokio-tungstenite (#29132)
## Why Responses websocket connections use `tokio-tungstenite`. When DNS returns an unusable native IPv6 address before a working IPv4 address, sequential dialing can consume Codex's outer websocket timeout before reaching IPv4. The merged fork change adds Happy Eyeballs-style alternate-family racing so websocket dialing matches the recovery behavior already present in the HTTP path. ## What Changed Advance the workspace `tokio-tungstenite` patch from `132f5b39` to merged commit `e5e64b86`, and update the matching lockfile source. The new revision comes from [openai-oss-forks/tokio-tungstenite#1](https://github.com/openai-oss-forks/tokio-tungstenite/pull/1).
Anton Panasenko ·
2026-06-19 12:02:12 -07:00 -
exec-server: add Noise relay transport (#26242)
## Why Rendezvous forwards traffic between the orchestrator and exec-server. The endpoints need to authenticate each other and encrypt that traffic without trusting Rendezvous with plaintext or endpoint keys. ## Changes - Adds a hybrid Noise IK channel through Clatter using X25519, ML-KEM-768, AES-256-GCM, and SHA-256. - Binds each handshake to `environment_id`, `executor_registration_id`, and `stream_id`. - Pins the registry-provided executor key and carries the harness authorization inside the encrypted handshake. - Orders relay frames before consuming Noise nonces and fragments large JSON-RPC messages into bounded records. - Bounds handshake payloads, frames, streams, and message reassembly. Runtime activation is in [openai/codex#26245](https://github.com/openai/codex/pull/26245). ## Stack 1. **[openai/codex#26242](https://github.com/openai/codex/pull/26242)**: Noise channel and relay transport 2. [openai/codex#26245](https://github.com/openai/codex/pull/26245): remote registration and runtime activation ## Verification - `just test -p codex-exec-server` - Oversized initiator payload regression coverage - `just fix -p codex-exec-server` - `just bazel-lock-check` - `cargo shear` --------- Co-authored-by: Codex <noreply@openai.com>
viyatb-oai ·
2026-06-15 16:39:41 -07:00 -
Use aws-lc-rs for rustls crypto provider (#27706)
## Why Some enterprise TLS proxies issue certificate chains signed with `ecdsa_secp521r1_sha512` / `ECDSA_NISTP521_SHA512`. Custom CA configuration such as `SSL_CERT_FILE` can add the right trust root, but it cannot make `rustls`'s `ring` verifier support a certificate signature algorithm it does not advertise. That can still break TLS after the CA bundle is configured, including on Rust websocket paths that call the shared `ensure_rustls_crypto_provider()` helper, such as the Responses websocket connector and remote app-server client: - [`codex-api/src/endpoint/responses_websocket.rs`](https://github.com/openai/codex/blob/eddc5c75ed527a8348bfcaa85692e53189600833/codex-rs/codex-api/src/endpoint/responses_websocket.rs#L441) - [`app-server-client/src/remote.rs`](https://github.com/openai/codex/blob/eddc5c75ed527a8348bfcaa85692e53189600833/codex-rs/app-server-client/src/remote.rs#L718) The `aws-lc-rs` `rustls` provider supports this P-521/SHA-512 certificate signature scheme, so use it as Codex's process-wide `rustls` provider. ## What Changed - Switch the workspace `rustls` feature from `ring` to `aws_lc_rs`. - Update `codex-utils-rustls-provider` to install `rustls::crypto::aws_lc_rs::default_provider()`. - Add an assertion and integration test that the installed provider supports `ECDSA_NISTP521_SHA512`. ## Verification ```shell just fmt just test -p codex-utils-rustls-provider just bazel-lock-update just bazel-lock-check ```
malsamiri-oai ·
2026-06-15 11:32:13 -07:00 -
[codex] Pin bundled SQLite to fixed WAL-reset version (#27992)
## Summary Prevent dependency refreshes from silently downgrading Codex's bundled SQLite to a release affected by the WAL-reset corruption bug. SQLx 0.9 accepts a broad `libsqlite3-sys` range. An unrelated lock refresh therefore moved Codex from `libsqlite3-sys 0.37.0` back to `0.35.0`, changing the bundled SQLite runtime from 3.51.3 to 3.50.2. SQLite documents the affected versions and fix in [The WAL Reset Bug](https://www.sqlite.org/wal.html#the_wal_reset_bug) and the [SQLite 3.51.3 changelog](https://www.sqlite.org/changes.html#version_3_51_3).
Gabriel Peal ·
2026-06-13 21:28:31 -07:00 -
Remove TUI realtime voice support (#27801)
## Why Removes the realtime audio support from TUI. ## What Changed - Removed the TUI `/realtime` and realtime `/settings` command paths. - Deleted TUI voice capture/playback, WebRTC session handling, audio-device selection UI, and recording-meter code. - Removed TUI realtime tests and snapshots that covered the deleted surfaces. - Dropped the TUI-only `cpal` and `codex-realtime-webrtc` dependencies and refreshed the Rust/Bazel locks.
Eric Traut ·
2026-06-12 14:20:55 -07:00 -
code-mode standalone: extract protocol and add host crate (#27724)
This is phase 1 of a 4 phase stack: 1. **Add protocol and host crates for new IPC code mode implementation** 2. Create the new standalone binary 3. Create a new IPC `CodeModeSessionProvider` to use new binary 4. Remove v8 from core and only use IPC provider ## Add protocol and host crates for new IPC code mode implementation Establish a clean process boundary without changing the existing in-process behavior. - Add the codex-code-mode-protocol crate for shared session, runtime, response, and tool-definition types. - Move protocol-facing code out of the V8-backed implementation. - Add a buildable codex-code-mode-host crate as the foundation for the standalone process. - Keep the existing in-process runtime as the active implementation.
Channing Conger ·
2026-06-11 22:37:26 -07:00 -
[codex] parallelize release code generation (#27702)
The release profile still uses one codegen unit, which serializes LLVM code generation within each crate. That setting was selected alongside fat LTO for optimization quality and binary size, but releases now use ThinLTO and code generation dominates the critical-path build. Use four codegen units. On an Apple M4 Max with 16 cores and 128 GiB RAM, using rustc 1.96.0, four and eight units took 507.486 and 505.325 seconds respectively. Four therefore keeps the build-time gain while limiting the stripped `codex` increase to 14.7%, compared with 21.5% at eight units. The gzip-compressed binary grows 7.8% at four units. The one-unit build from an empty target directory took 981.150 seconds. That comparison also populated dependency and native build caches, so it is directional rather than controlled. It agrees with the earlier clean matrix where eight units reduced 671 seconds to 303 seconds: https://gist.github.com/anp/4b88393a0acd35783d9f42156f3243d5 At the local 48% reduction, the current release's 55m22s critical-path macOS Cargo step would save about 26 minutes from the 71m28s workflow: https://github.com/openai/codex/actions/runs/27367405663 The prompt-image medians ranged from 3.9% faster to 0.9% slower. CLI startup shifted by 1-2 ms while user and system CPU time were unchanged. This is a draft because the release-latency improvement may not justify the binary-size increase.
Tamir Duberstein ·
2026-06-11 19:44:36 -07:00 -
[codex] Remove async_trait from first-party code (#27475)
## Why First-party async traits should expose their `Send` contracts explicitly without requiring `async_trait`. This completes the migration pattern established in #27303 and #27304. ## What changed - Replaced the remaining first-party `async_trait` traits with native return-position `impl Future + Send` where statically dispatched and explicit boxed `Send` futures where object safety is required. - Kept implementations behavior-preserving, outlining existing async bodies into inherent methods where that keeps the diff reviewable. - Removed all direct first-party `async-trait` dependencies and the workspace dependency declaration. - Added a cargo-deny policy that permits `async-trait` only through the remaining transitive wrapper crates. - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and keep the full cargo-deny check passing. ## Validation - `just test -p codex-exec-server`: 216 passed, 2 skipped. - `just test -p codex-model-provider`: 39 passed. - `just test -p codex-core` and `just test`: changed tests passed; remaining failures are environment-sensitive suites unrelated to this migration. - `cargo deny check` - `just fix` - `just fmt` - `cargo shear` - `just bazel-lock-check`
Adam Perry @ OpenAI ·
2026-06-11 18:16:39 -07:00 -
[codex] Load user instructions through an injected provider (#27101)
## Why We want to remove implicit use of `$CODEX_HOME` from `codex-core` and make embedders responsible for supplying user-level instructions. This also ensures user instructions load when no primary environment is selected. ## What changed Stacked on #27415, which makes `codex exec` surface thread-scoped runtime warnings. - Added `UserInstructionsProvider` to `codex-extension-api`, with absolute source attribution and recoverable loading warnings. - Added `codex-home` with the filesystem-backed provider for `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback, trimming, lossy UTF-8 handling, and the existing uncapped global instruction size. - Removed global instruction loading from `Config` and require `ThreadManager` callers to inject a provider. - Load provider instructions once for each fresh root runtime, including runtimes without a primary environment. Running sessions retain their snapshot, while child agents inherit the parent snapshot without invoking the provider. - Keep provider instructions separate while loading project `AGENTS.md`, then assemble the model-visible instructions with the existing ordering, source attribution, warning, and turn-context behavior. - Wired the Codex home provider through the CLI, app server, MCP server, core facade, and thread-manager sample. ## Validation - `just test -p codex-home -p codex-extension-api` - `just test -p codex-core agents_md` - `just test -p codex-core guardian` - `just test -p codex-app-server thread_start_without_selected_environment_includes_only_global_instruction_source` - `just test -p codex-exec warning` - `just bazel-lock-check`
Adam Perry @ OpenAI ·
2026-06-11 19:28:47 +00:00 -
[codex] migrate ExecutorFileSystem paths to PathUri (#27424)
## Why We're moving exec-server to use PathUri for its internal path representations. ## What Move `ExecutorFileSystem` APIs to use `PathUri` instead of `AbsolutePathBuf`. Future changes will convert higher-level parts of exec-server.
Adam Perry @ OpenAI ·
2026-06-11 18:44:18 +00:00 -
Route hosted Apps MCP through extensions (#27191)
## Stack - Base: #27184 - This PR is the second vertical and should be reviewed against `jif/external-plugins-1`, not `main`. ## Why CCA is moving toward a split runtime where the orchestrator may have no filesystem or executor, but it still needs to activate remotely hosted plugin components. HTTP MCP servers are the simplest complete example: they need configuration and host authentication, but they do not need an executor process. The Apps MCP endpoint is currently synthesized by a special-purpose loader inside the MCP runtime. That works locally, but it leaves hosted MCP activation outside the extension model being established in #27184. It also makes the Apps path a poor foundation for plugins whose skills, MCP servers, connectors, and hooks may come from different sources or execute in different places. This PR moves that one behavior behind an extension-owned contribution while preserving the existing local fallback. It deliberately does not introduce a generic plugin activation framework. ## What changed ### MCP extension contribution `codex-extension-api` gains an ordered `McpServerContributor` contract. A contributor returns typed `Set` or `Remove` overlays for MCP server configuration; later contributors win for the names they own. The contract stays at the existing MCP configuration boundary. Extensions do not create a second connection manager or transport abstraction. ### Hosted Apps MCP extension A new `codex-mcp-extension` contributes the reserved `codex_apps` server from the existing Apps feature, ChatGPT base URL, path override, and product SKU configuration. When `apps_mcp_path_override` is enabled for `https://chatgpt.com`, the resulting streamable HTTP endpoint is `https://chatgpt.com/backend-api/ps/mcp`. The existing ChatGPT-auth gate remains authoritative, so this server can run in an orchestrator-only process without being exposed for API-key sessions. ### One resolved runtime view `McpManager` now distinguishes three views: - **configured:** config- and plugin-backed servers before extension overlays; - **runtime:** configured servers plus host-installed extension contributions; - **effective:** runtime servers after auth gating and compatibility built-ins. App-server installs the hosted MCP extension and uses the runtime view for thread startup, refresh, status, threadless resource reads, connector discovery, and MCP OAuth lookup. This keeps `mcpServer/oauth/login` consistent with the servers exposed by the other MCP APIs. The hosted Apps server itself continues to use existing ChatGPT host authentication rather than MCP OAuth. ## Compatibility Hosts that do not install the MCP extension retain the existing Apps MCP synthesis path. This preserves current local-only, CLI, and standalone-host behavior while app-server exercises the extension path. Disabling Apps removes the reserved `codex_apps` entry, and losing ChatGPT auth removes it from the effective runtime view. Executor availability is not consulted for this HTTP transport. ## Follow-ups The next vertical will resolve a manifest-declared stdio MCP server from an executor-selected plugin root and execute it in the environment that owns that root. Later verticals can add backend-owned skills, connector metadata, hooks, durable selection semantics, and incremental local convergence without changing the component-specific runtime boundaries introduced here. ## Verification Focused coverage was added for: - contributing the hosted Apps MCP at `/backend-api/ps/mcp` without an executor; - requiring ChatGPT auth in the effective runtime view; - removing a reserved configured Apps server when the Apps feature is disabled. `cargo check -p codex-app-server -p codex-mcp-extension -p codex-extension-api -p codex-mcp` passed. Tests and Clippy were not run locally under the current development instruction; CI provides the full validation pass.
jif ·
2026-06-09 22:44:16 +02:00 -
Load selected executor skills through extensions (#27184)
## Why CCA is moving toward a split runtime where the orchestrator may not have a filesystem, while executors can expose preinstalled plugins and skills. A thread therefore needs to select capabilities without asking app-server or core to interpret executor-owned paths through the orchestrator's filesystem. The longer-term model is broader than executor skills: - A plugin is a bundle of skills, MCP servers, connectors/apps, and hooks. - A plugin root can be local, executor-owned, or hosted by a backend. - Components inside one plugin can use different access and execution mechanisms. A skill may be read from a filesystem or through backend tools; an HTTP MCP server can run without an executor; a stdio MCP server or hook needs an execution environment. - Core should carry generic extension initialization data. The extension that owns a component should discover it, expose it to the model, and invoke it through the appropriate runtime. This PR establishes that architecture through one complete vertical: selecting a root on an executor, discovering the skills beneath it, exposing those skills to the model, and reading an explicitly invoked `SKILL.md` through the same executor. ## Contract `thread/start` gains an experimental `selectedCapabilityRoots` field: ```json { "selectedCapabilityRoots": [ { "id": "deploy-plugin@1", "location": { "type": "environment", "environmentId": "workspace", "path": "/opt/codex/plugins/deploy" } } ] } ``` The root is intentionally not classified as a "plugin" or "skill" in the API. It can point at a standalone skill, a directory containing several skills, or a plugin containing skills and other components. This PR only teaches the skills extension how to consume it; later extensions can resolve MCP, connector, and hook components from the same selection. The platform-supplied `id` is stable selection identity. The location says which runtime owns the root and gives that runtime an opaque path. App-server does not inspect or canonicalize the path. ## What changed ### Generic thread extension initialization App-server converts selected roots into `ExtensionDataInit`. Core carries that generic initialization value until the final thread ID is known, then creates thread-scoped `ExtensionData` before lifecycle contributors run. This keeps `Session` and core independent of the capability-selection contract. The initialization value is consumed during construction; it is not retained as another long-lived `Session` field. ### Executor-backed skills The skills extension now owns an `ExecutorSkillProvider` that: - resolves the selected environment through `EnvironmentManager` - discovers, canonicalizes, and reads skills through that environment's `ExecutorFileSystem` - contributes the bounded selected-skill catalog as stable developer context - reads an explicitly invoked skill body through the authority that listed it - warns when an environment or root is unavailable - never falls back to the orchestrator filesystem for an executor-owned root Skill catalog and instruction fragments have hard byte bounds, which also bound them below the 10K-token per-item context limit. If a selected executor skill has the same name as a legacy local skill, the executor selection owns that invocation and the local body is not injected a second time. Existing local and bundled skill loading remains in place. Omitting `selectedCapabilityRoots` therefore preserves current local-only behavior. ## Current semantics - Only environment-owned locations are represented in this first contract. - Roots are resolved by the destination extension, not by app-server or core. - An unavailable executor or invalid root produces a warning and no capabilities from that root; it does not trigger a local-filesystem fallback. - Selection applies to a newly started active thread. - MCP servers, connectors, and hooks beneath a selected plugin root are not activated yet. - Selection is not yet persisted or inherited across resume, fork, or subagent creation. Existing local capabilities continue to behave as they do today in those flows. ## Planned vertical follow-ups 1. **Hosted HTTP MCP:** add an extension-backed HTTP MCP source that works without an executor, then replace the special-purpose MCP plugins loader with that implementation. 2. **Executor MCP:** register and execute stdio MCP servers through the environment that owns the selected plugin root. 3. **Backend skills:** add a hosted skill source whose catalog and bodies are accessed through extension tools rather than a filesystem. 4. **Connectors and hooks:** activate those components through their owning extensions, using the same selected-root boundary and component-specific runtime. 5. **Durable selection:** define the desired-selection lifecycle, persist it, and make resume, fork, and subagent inheritance explicit rather than accidental. 6. **Local convergence:** incrementally route existing local plugin, skill, and MCP loading through the same extension model while preserving current local behavior. Each follow-up remains reviewable as an end-to-end capability. The platform selects roots, generic thread extension data carries the selection, and the owning extension resolves and operates its component. ## Verification Coverage added for: - app-server end-to-end discovery and explicit invocation of a skill inside an executor-selected plugin root - exclusive invocation when a selected executor skill collides with a local skill name - executor filesystem authority for discovery, canonicalization, and reads - thread extension initialization before lifecycle contributors run - stable executor catalog context, explicit invocation, context rebuilding, hidden skills, and preserved host/remote catalog behavior Targeted protocol, core-skills, skills-extension, core lifecycle, and app-server executor-skill tests were run during development.jif ·
2026-06-09 19:51:54 +02:00 -
Add typed file URIs (#26840)
## Why Codex needs stable `file:` URI identifiers that can cross process and operating-system boundaries without eagerly interpreting them as native paths. Existing fields also need to keep accepting absolute path strings during migration. ## What changed - Add `codex-utils-path-uri` with a validated, immutable `PathUri` wrapper that currently accepts only `file:` URLs. - Expose URI-level `basename`, `parent`, and `join` operations that preserve authorities and percent encoding without guessing the source operating system. - Keep native conversion explicit through `AbsolutePathBuf` and the current host rules. - Serialize as canonical URI text while accepting both URI text and legacy absolute native paths during deserialization. - Add adversarial coverage for Windows-looking and POSIX paths, UNC authorities, encoded metadata characters, non-UTF-8 POSIX paths, URI hierarchy operations, and legacy serde round trips.
Adam Perry @ OpenAI ·
2026-06-08 16:33:41 -07:00 -
[codex] Restore release symbol artifacts with line tables (#26202)
## Summary - Restore separate release symbol archives for macOS, Linux, and Windows binaries. - Build release binaries with `line-tables-only` debuginfo instead of full debuginfo. - Strip Unix distribution binaries after extracting symbols, preserve Windows PDBs, and keep symbol archives available to the release job. - Strip the packaged Linux `bwrap` binary before hashing it so the embedded digest matches the distributed bytes. ## Root cause The first symbol-artifact implementation enabled `CARGO_PROFILE_RELEASE_DEBUG=full`. In the June 2 release runs, macOS ARM primary builds reached the 90-minute timeout while still inside `Cargo build`. After the symbol changes were reverted, the same primary build completed in about 22 minutes. The archive step itself completed in tens of seconds when reached. Rust's `line-tables-only` debuginfo level preserves function names and source locations for symbolication without emitting the heavier variable and type information from full debuginfo. ## Validation - Ran `just fmt` from `codex-rs`. - Ran `just test-github-scripts` from the repository root: 23 tests passed. - Ran `bash -n` and `shellcheck` on `.github/scripts/archive-release-symbols-and-strip-binaries.sh`. - Parsed both modified workflows as YAML and ran `git diff --check`. - Built a macOS release smoke binary with `line-tables-only`, archived its dSYM through the restored script, stripped the production binary, and verified that `atos` resolves `symbol_smoke_function` to `main.rs:2`. - Ran Linux archive-script control-flow coverage with stubbed `objcopy` and `strip` commands. - Ran Windows PDB archive staging coverage and verified underscore-emitted Rust PDB names are staged under shipped hyphenated binary names. ## Follow-up The release workflow only runs for tags or manual dispatches, so CI cannot dry-run the full release matrix on this PR. The next release run will verify runner time and memory behavior under `line-tables-only`.
Jeremy Rose ·
2026-06-08 17:16:36 +00:00 -
Michael Bolin ·
2026-06-07 17:35:33 -07:00 -
Channing Conger ·
2026-06-06 14:27:23 -07:00 -
[2 of 2] Finish moving goal runtime to extension (#26548)
## Stack 1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align goal extension with core behavior 2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move goal runtime to extension ## Why This PR completes the switch of the goal behavior to the extension-backed runtime and removes the old core goal implementation. ## What Changed - Installs the goal extension for app-server `ThreadManager` sessions. - Routes app-server thread goal `get`, `set`, and `clear` through `GoalService`. - Uses thread-idle lifecycle emission after goal resume and snapshot ordering so the extension can decide whether to continue the goal. - Forwards extension goal updates through a FIFO async app-server notification path so backpressure does not drop them or reorder updates. - Keeps review turns from enabling goal runtime behavior. - Plans extension tools before dynamic tools so built-in goal tool names keep their old precedence when goals are enabled. - Removes the old core goal runtime, core goal tool handlers, and core goal tool specs. - Updates tests that were coupled to the core-owned goal runtime while leaving the legacy `<goal_context>` compatibility path in core for old threads. - Removes the stale cargo-shear ignore now that `codex-goal-extension` is used by the workspace. - Keeps realtime event matching exhaustive after removing the old goal-specific realtime text path. ## Validation - Ran manual `/goal` runs in TUI. Validated time accounting matched wall-clock time and goal lifecycle state transitions.
Eric Traut ·
2026-06-05 14:17:30 -07:00 -
build: use ThinLTO for release binaries (#23710)
## Why Fat LTO makes release builds substantially slower without providing enough measured runtime benefit to justify the release CI long pole. The build-profile investigation found that keeping Cargo's default release `opt-level=3` and switching from fat LTO to ThinLTO (`3/thin/1`) reduced a clean `codex-cli` release build from 2073.893 seconds to 1243.172 seconds, a 40.06% improvement. The resulting binary increased from 196.7 MiB to 211.8 MiB (+7.63%). Measured runtime changes were small: the worst image workload median was +0.86% and app-server startup was +0.31% relative to fat LTO. ThinLTO retains cross-crate optimization while avoiding most of the fat-LTO build cost. This deliberately avoids global size optimization: final-executable testing showed a substantial regression on the image request path, which is expected to become more important as image usage grows. ## What changed - Set the workspace release profile to `lto = "thin"`, retaining Cargo's default release `opt-level=3`. - Remove release and CI workflow-specific LTO overrides so release-profile builds consistently use the workspace setting. - Remove the now-unused Windows release workflow input and related diagnostic output. ## Validation - Confirmed the release profile parses with `cargo metadata --no-deps --format-version 1`. - CI validates release builds across the supported target matrix.
Adam Perry @ OpenAI ·
2026-06-04 20:07:53 +00:00 -
Optimize unbounded byte scans with memchr (#26265)
## Summary This PR adds `memchr` for some low-hanging performance improvements (namely, in MCP stdio, Ollama streaming, and full message-history newline counts). Codex produced the following release benchmarks: | Operation | Before | After | Speedup | | --- | ---: | ---: | ---: | | MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x | | Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x | | Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x | With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python MCP server, completed `initialize`, requested `tools/list`, and deserialized a 1 MiB tool description over newline-delimited stdio), it's about 16x faster end-to-end: | Branch | 50 calls | Per call | | --- | ---: | ---: | | `main` | 862.53 ms | 17.25 ms | | this branch | 53.89 ms | 1.08 ms | `memchr` is already in our dependency tree and extremely widely used for this kind of optimized scanning.
Charlie Marsh ·
2026-06-04 09:53:08 -04:00 -
chore: extract context fragments into dedicated crate (#26122)
## Why `codex-core` currently owns the generic contextual-fragment trait and several reusable fragment implementations. That makes it harder for other crates to share the same host-owned model-input abstraction without depending on all of `codex-core`. This change extracts the reusable fragment machinery into a small `codex-context-fragments` crate so future extension and skills work can depend on the fragment abstraction directly. ## What Changed - Added the `codex-context-fragments` crate with: - `ContextualUserFragment` - `FragmentRegistration` / `FragmentRegistrationProxy` - additional-context fragment types - Moved `SkillInstructions` into `codex-core-skills`, since skill-specific rendering belongs with skills rather than generic core context machinery. - Kept `codex-core` re-exporting the fragment types it still uses internally, so existing call sites keep the same shape. - Updated Cargo and Bazel workspace metadata for the new crate. ## Verification - `cargo metadata --locked --format-version 1 --no-deps` - `just bazel-lock-update` - `just bazel-lock-check`
jif ·
2026-06-03 12:25:21 +02:00 -
feat: add skills extension scaffold (#25953)
## Disclaimer This is only here for iteration purpose! Do not make any code rely on this ## Why Skills still live behind `codex-core` discovery and injection paths, but the extension system needs an authority-aware home before that logic can move. This adds that boundary without changing current skills behavior, and keeps host, executor, and remote skills distinct so future list/read/search flows do not collapse back to ambient local paths. ## What changed - Add the `codex-skills-extension` workspace/Bazel crate under `ext/skills`. - Define the initial catalog, authority, provider, and turn-state types for authority-bound skill packages and resources. - Register placeholder thread/config/prompt/turn lifecycle contributors plus host, executor, and remote provider aggregation points. - Capture the remaining extraction work as TODOs, including the missing extension API hooks needed for per-turn catalog construction and typed skill injection. - Keep plugins outside the runtime skills model: plugin-installed skills are treated as materialized host-owned skill sources once available. ## Verification - Not run locally.
jif ·
2026-06-03 01:10:26 +02:00 -
Move cloud requirements crate to cloud config (#24621)
## Summary - Moves the existing `codex-cloud-requirements` crate to `codex-cloud-config`. - Updates workspace dependencies and imports to the new crate name. - Intentionally keeps runtime behavior unchanged: this still fetches the legacy cloud requirements endpoint. ## Details This PR exists to make the lineage obvious before the bundle migration. GitHub should show the old `codex-rs/cloud-requirements/src/lib.rs` implementation as moved to `codex-rs/cloud-config/src/lib.rs`, rather than as unrelated new code. The follow-up PR adapts this moved crate to the new config bundle API and switches runtime consumers over.
joeflorencio-openai ·
2026-06-01 16:43:52 -07:00 -
[codex] Consolidate shared prompts in codex-prompts (#25151)
## Why `codex_core` is consistently a bottleneck for incremental builds during iteration. The simplest fix is to make the crate smaller. ## Summary `codex-core` owns several reusable prompt renderers and static prompt assets, which makes the crate harder to split apart. Rename `codex-review-prompts` to `codex-prompts` and move shared review, goal, permissions, compaction, realtime, hierarchical AGENTS.md, and `apply_patch` prompts into it. Move prompt-only tests and update consumers and `CODEOWNERS`. ## Validation - `just test -p codex-prompts -p codex-apply-patch` - `just test -p codex-core prompt_caching` - Bazel builds for the affected crates
Adam Perry @ OpenAI ·
2026-06-01 18:45:07 +00:00 -
jif-oai ·
2026-05-29 12:53:31 +02:00 -
Add feature-gated standalone image generation extension (#24723)
## Why Add a standalone image generation path that can be exercised independently of hosted Responses image generation, while retaining the hosted tool as fallback unless the extension is actually available to the model. ## What changed - Added the `codex-image-generation-extension` crate with standalone generate/edit execution, prior-image selection for edits, model-visible image output, and local generated-image persistence. - Installed the extension in app-server behind the disabled-by-default `imagegenext` feature and backend eligibility checks. - Updated core tool planning so eligible `image_gen.imagegen` exposure replaces hosted `image_generation`, while unavailable configurations retain hosted fallback. - Added coverage for extension behavior, edit history reuse, feature gating, auth eligibility, and hosted-tool replacement. - The extension is installed through app-server only in this PR; other execution paths retain hosted image generation because hosted replacement occurs only when the standalone executor is actually registered and model-visible. - The initial extension contract intentionally fixes the image model to `gpt-image-2` and uses automatic image parameters. - Native generated-image history/card parity and rollout persistence cleanup are intentionally deferred follow-up work. ## Validation - `just test -p codex-image-generation-extension` - `just test -p codex-features` - `just test -p codex-core hosted_tools_follow_provider_auth_model_and_config_gates` - `just test -p codex-app-server` - `just fix -p codex-image-generation-extension -p codex-features -p codex-core -p codex-app-server` - `just fmt` - `just bazel-lock-update` - `just bazel-lock-check` --------- Co-authored-by: jif-oai <jif@openai.com>
Won Park ·
2026-05-28 11:44:55 -07:00 -
Revert "Add app-server startup benchmark crate" (#24937)
Reverts openai/codex#24651, broke musl job https://github.com/openai/codex/actions/runs/26585495205/job/78330166927
Adam Perry @ OpenAI ·
2026-05-28 17:49:41 +00:00 -
Add app-server startup benchmark crate (#24651)
## Summary - Add a new `app-server-start-bench` crate to measure app-server startup performance - Wire the benchmark into the workspace and Bazel build so it can be run consistently - Update lockfiles and repo automation to account for the new package
Adam Perry @ OpenAI ·
2026-05-28 08:46:30 -07:00 -
Update rmcp to 1.7.0 (#24763)
WIll make it easier to uprev when the new draft spec is supported. Also updates reqwest where needed for compatibility but doesn't update it everywhere since this is already a large diff. The new version of rmcp handles certain kinds of authentication failures differently, this patch includes support for identifying the failing scope in a WWW-Authenticate header.
Adam Perry @ OpenAI ·
2026-05-27 14:52:06 -07:00 -
Bump SQLx to pick up newer bundled SQLite (#24728)
## Why Codex stores thread, log, goal, and memory state in bundled SQLite databases through SQLx. We have a suspected SQLite WAL-reset corruption issue under heavy concurrent writer load, especially when multiple subagents are active. The existing `sqlx 0.8.6` dependency kept us on an older `libsqlite3-sys` / bundled SQLite, so this PR moves the SQLx stack far enough forward to pick up the newer bundled SQLite library. ## What changed - Bump the workspace `sqlx` dependency to `0.9.0`. - Use the SQLx 0.9 feature names explicitly: `runtime-tokio`, `tls-rustls`, and `sqlite-bundled`. - Update `Cargo.lock` so `sqlx-sqlite` resolves through `libsqlite3-sys 0.37.0`. - Refresh `MODULE.bazel.lock` for the dependency changes. - Adapt `codex-state` to SQLx 0.9: - build dynamic state queries with `QueryBuilder<Sqlite>` instead of passing dynamic `String`s to `sqlx::query`; - remove the old `QueryBuilder` lifetime parameter from helper signatures; - preserve SQLx's new `Migrator` fields when constructing runtime migrators. ## Verification - `just test -p codex-state` - `just bazel-lock-check` - `cargo check -p codex-state --tests`
jif-oai ·
2026-05-27 18:44:07 +02:00 -
standalone websearch extension (#23823)
## Summary Add the extension-backed standalone `web.run` tool so Codex can call the standalone search endpoint through the `codex-api` search client and return its encrypted output to Responses. - gate the new tool behind `standalone_web_search` - install the extension in the app-server thread registry and hide hosted `web_search` when standalone search is enabled for OpenAI providers so the two paths stay mutually exclusive - build search context from persisted history using a small tail heuristic: previous user message, assistant text between the last two user turns capped at about 1k tokens, and current user message ## Test Plan - `cargo test -p codex-web-search-extension` - `cargo test -p codex-api` - `cargo test -p codex-core hosted_tools_follow_provider_auth_model_and_config_gates`
sayan-oai ·
2026-05-26 11:12:24 -07:00 -
chore: drop orphaned codex memories MCP crate (#24555)
## Why The memory read-tool surface had two implementations: the app-server extension path under `ext/memories`, and an unused `codex-memories-mcp` workspace crate under `memories/mcp`. The MCP crate no longer has reverse dependents, so keeping it around preserves duplicate backend, schema, and tool code that is not part of the live app-server memory path. Dropping the orphaned crate makes the remaining memory crate split clearer: `memories/read` owns read-path prompt/citation helpers, `memories/write` owns the write pipeline, and `ext/memories` owns the app-server extension integration. ## What changed - Removed the `memories/mcp` crate and its Bazel/Cargo metadata. - Removed `memories/mcp` from the Rust workspace and lockfile. - Updated `memories/README.md` so it only lists the remaining reusable memory crates. ## Verification - `cargo metadata --format-version 1 --no-deps` succeeds.
jif-oai ·
2026-05-26 11:29:37 +02:00 -
[codex] Add image re-encoding benchmarks (#23935)
## Summary - add Divan benchmarks for prompt image re-encoding paths - wire the image benchmark smoke test into Rust CI workflows ## Why Image prompt handling includes re-encoding work that benefits from repeatable benchmark coverage so changes can be measured in CI and locally. This already helped identify a potential regression from changing compiler flags. ## Impact Developers can run and compare the new image re-encoding benchmarks, and CI exercises the benchmark target via the Rust benchmark smoke test.
Adam Perry @ OpenAI ·
2026-05-22 22:38:40 +00:00 -
feat: support local refs and defs in tool input schemas (#23357)
# Why Some connector tool input schemas use local JSON Schema references and definition tables to avoid duplicating large nested shapes. Codex previously lowered these schemas into the supported subset in a way that could discard `$ref`-only schema objects and lose the corresponding definitions, which made non-strict tool registration less faithful than the original connector schema. This keeps the existing minimal-lowering policy: Codex still does not raw-pass through arbitrary JSON Schema, but it now preserves local reference structure that fits the Responses-compatible subset and prunes definition entries that cannot be reached by following `$ref`s from the root schema after sanitization, including refs found transitively inside other reachable definitions. The pruning matters because Responses parses definition tables even when entries are unused, so keeping dead definitions wastes prompt tokens. # What changed - Added `$ref`, `$defs`, and legacy `definitions` fields to the tool `JsonSchema` representation. - Updated `parse_tool_input_schema` lowering so `$ref`-only schema objects survive sanitization instead of becoming `{}`. - Sanitized definition tables recursively and dropped malformed definition tables so non-strict registration degrades gracefully. - Added reachability pruning for root definition tables by starting from refs outside definition tables, then following refs inside reachable definitions. - Added JSON Pointer decoding for local definition refs such as `#/$defs/Foo~1Bar`. # Verification ran local golden-schema probes against representative connector schemas to validate behavior on real generated schemas: | Golden schema | Before bytes | After bytes | `$defs` before -> after | `$ref` before -> after | Result | |---|---:|---:|---:|---:|---| | `google_calendar/create_space` | 7111 | 4526 | 7 -> 7 | 7 -> 7 | all definitions preserved because all are reachable | | `figma/apply_file_variable_changes` | 4609 | 999 | 8 -> 5 | 8 -> 5 | unused defs pruned after unsupported `oneOf` shapes lower away | | `snowflake/list_catalog_integrations` | 1380 | 404 | 3 -> 0 | 0 -> 0 | all defs pruned because none are referenced | | `dropbox/create_shared_link` | 8894 | 1836 | 14 -> 4 | 9 -> 4 | only defs reachable from the root schema after sanitization are retained, including transitively through other retained defs | Token increase across golden schema due to this change: <img width="817" height="366" alt="Screenshot 2026-05-19 at 1 47 04 PM" src="https://github.com/user-attachments/assets/d5c80fe9-da85-41e6-8ac7-a01d1e0b0f71" />Celia Chen ·
2026-05-22 00:32:14 +00:00 -
CI: Customize v8 building (#22086)
## Summary Move the rusty_v8 artifact production into hermetic Bazel path and bump the `v8` crate to `147.4.0` The new flow builds V8 release artifacts from source for Darwin and Linux targets, publishes both the current release-compatible artifacts and sandbox-enabled variants, and keeps Cargo consumers on prebuilt binaries by continuing to feed the `v8` crate the archive and generated binding files it already expects. ## Why We need control over V8 build-time features without giving up prebuilt artifacts for downstream Cargo builds. Upstream `rusty_v8` already supports source-only features such as `v8_enable_sandbox`, but its normal prebuilt release assets do not cover every feature combination we need. Building the artifacts ourselves lets us enable settings such as the V8 sandbox and pointer compression at artifact build time, then publish those outputs so ordinary Cargo builds can still consume prebuilts instead of compiling V8 locally. This keeps the fast consumer experience of prebuilt `rusty_v8` archives while giving us a reproducible path to ship featureful variants that upstream does not currently publish for us. ## Implementation Notes The Bazel graph in this PR is not copied wholesale from `rusty_v8`; `rusty_v8`'s normal source build is still GN/Ninja-based. Instead, this change starts from upstream V8's Bazel rules and adapts them to Codex's hermetic toolchains and dependency layout. Where we intentionally follow `rusty_v8`, we mirror its existing artifact contract: - the same `v8` crate version and generated binding expectations - the same sandbox feature relationship, where sandboxing requires pointer compression - the same custom libc++ model expected by Cargo's default `use_custom_libcxx` feature - the same release-style archive plus `src_binding` outputs consumed by the `v8` crate To preserve that contract, the Bazel release path pins the libc++, libc++abi, and llvm-libc revisions used by `rusty_v8 v147.4.0`, builds release artifacts with `--config=rusty-v8-upstream-libcxx`, and folds the matching runtime objects into the final static archive. ## Windows Windows is annoyingly handled differently. Codex's current hermetic Bazel Windows C++ platform is `windows-gnullvm` / `x86_64-w64-windows-gnu`, while upstream `rusty_v8` publishes Windows prebuilts for `*-pc-windows-msvc`. Those are different ABIs, so the Bazel graph cannot truthfully reproduce the upstream MSVC artifacts until we add a real MSVC-targeting C++ toolchain. For now: - Windows MSVC consumers continue to use upstream `rusty_v8` release archives. - Windows GNU targets are built in-tree so they link against a matching GNU ABI. - The canary workflow separately exercises upstream `rusty_v8` source builds for MSVC sandbox artifacts, but MSVC is not yet part of the Bazel-produced release matrix. ## Validation This PR is technically self validating through CI. I have already published it as a release tag so the artifacts from this branch are published to https://github.com/openai/codex/releases/tag/rusty-v8-v147.4.0 CI for this PR should therefore consume our own release targets. I have also locally tested for linux and darwin. --------- Co-authored-by: Codex <noreply@openai.com>
Channing Conger ·
2026-05-18 21:33:05 -07:00 -
chore: goal ext skeleton (#23288)
Skeleton of `/goal` in extension Lot's of follow-ups coming
jif-oai ·
2026-05-18 13:32:21 +02:00 -
Move memory prompt injection to app-server extension (#22841)
## Why Memory prompt injection should be owned by the extension path that app-server composes at runtime, not by an inlined special case inside `codex-core`. This keeps `codex-core` focused on session orchestration while allowing the memories extension to own its app-server prompt behavior. ## What Changed - Registers `codex-memories-extension` in the app-server extension registry. - Moves the memory developer-instruction injection out of `core/src/session/mod.rs` and into the memories extension prompt contributor. - Adds config-change handling so the extension keeps its per-thread memory settings in sync after startup. - Leaves memories read/retrieval tools unregistered for now so this PR only changes prompt injection. - Removes the stale `cargo-shear` ignore now that app-server depends on the extension crate. ## Validation Not run locally; validation is left to CI.
jif-oai ·
2026-05-15 16:19:34 +02:00 -
chore(config) rm Feature::CodexGitCommit (#22412)
## Summary Removes the unused Feature::CodexGitCommit ## Testing - [x] tests pass
Dylan Hurd ·
2026-05-13 12:33:36 -07:00 -
config: add strict config parsing (#20559)
## Why Codex intentionally ignores unknown `config.toml` fields by default so older and newer config files keep working across versions. That leniency also makes typo detection hard because misspelled or misplaced keys disappear silently. This change adds an opt-in strict config mode so users and tooling can fail fast on unrecognized config fields without changing the default permissive behavior. This feature is possible because `serde_ignored` exposes the exact signal Codex needs: it lets Codex run ordinary Serde deserialization while recording fields Serde would otherwise ignore. That avoids requiring `#[serde(deny_unknown_fields)]` across every config type and keeps strict validation opt-in around the existing config model. ## What Changed ### Added strict config validation - Added `serde_ignored`-based validation for `ConfigToml` in `codex-rs/config/src/strict_config.rs`. - Combined `serde_ignored` with `serde_path_to_error` so strict mode preserves typed config error paths while also collecting fields Serde would otherwise ignore. - Added strict-mode validation for unknown `[features]` keys, including keys that would otherwise be accepted by `FeaturesToml`'s flattened boolean map. - Kept typed config errors ahead of ignored-field reporting, so malformed known fields are reported before unknown-field diagnostics. - Added source-range diagnostics for top-level and nested unknown config fields, including non-file managed preference source names. ### Kept parsing single-pass per source - Reworked file and managed-config loading so strict validation reuses the already parsed `TomlValue` for that source. - For actual config files and managed config strings, the loader now reads once, parses once, and validates that same parsed value instead of deserializing multiple times. - Validated `-c` / `--config` override layers with the same base-directory context used for normal relative-path resolution, so unknown override keys are still reported when another override contains a relative path. ### Scoped `--strict-config` to config-heavy entry points - Added support for `--strict-config` on the main config-loading entry points where it is most useful: - `codex` - `codex resume` - `codex fork` - `codex exec` - `codex review` - `codex mcp-server` - `codex app-server` when running the server itself - the standalone `codex-app-server` binary - the standalone `codex-exec` binary - Commands outside that set now reject `--strict-config` early with targeted errors instead of accepting it everywhere through shared CLI plumbing. - `codex app-server` subcommands such as `proxy`, `daemon`, and `generate-*` are intentionally excluded from the first rollout. - When app-server strict mode sees invalid config, app-server exits with the config error instead of logging a warning and continuing with defaults. - Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli` instead of extending shared `ReviewArgs`, so `--strict-config` stays on the outer config-loading command surface and does not become part of the reusable review payload used by `codex exec review`. ### Coverage - Added tests for top-level and nested unknown config fields, unknown `[features]` keys, typed-error precedence, source-location reporting, and non-file managed preference source names. - Added CLI coverage showing invalid `--enable`, invalid `--disable`, and unknown `-c` overrides still error when `--strict-config` is present, including compound-looking feature names such as `multi_agent_v2.subagent_usage_hint_text`. - Added integration coverage showing both `codex app-server --strict-config` and standalone `codex-app-server --strict-config` exit with an error for unknown config fields instead of starting with fallback defaults. - Added coverage showing unsupported command surfaces reject `--strict-config` with explicit errors. ## Example Usage Run Codex with strict config validation enabled: ```shell codex --strict-config ``` Strict config mode is also available on the supported config-heavy subcommands: ```shell codex --strict-config exec "explain this repository" codex review --strict-config --uncommitted codex mcp-server --strict-config codex app-server --strict-config --listen off codex-app-server --strict-config --listen off ``` For example, if `~/.codex/config.toml` contains a typo in a key name: ```toml model = "gpt-5" approval_polic = "on-request" ``` then `codex --strict-config` reports the misspelled key instead of silently ignoring it. The path is shortened to `~` here for readability: ```text $ codex --strict-config Error loading config.toml: ~/.codex/config.toml:2:1: unknown configuration field `approval_polic` | 2 | approval_polic = "on-request" | ^^^^^^^^^^^^^^ ``` Without `--strict-config`, Codex keeps the existing permissive behavior and ignores the unknown key. Strict config mode also validates ad-hoc `-c` / `--config` overrides: ```text $ codex --strict-config -c foo=bar Error: unknown configuration field `foo` in -c/--config override $ codex --strict-config -c features.foo=true Error: unknown configuration field `features.foo` in -c/--config override ``` Invalid feature toggles are rejected too, including values that look like nested config paths: ```text $ codex --strict-config --enable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --disable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text ``` Unsupported commands reject the flag explicitly: ```text $ codex --strict-config cloud list Error: `--strict-config` is not supported for `codex cloud` ``` ## Verification The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid `--disable`, the compound `multi_agent_v2.subagent_usage_hint_text` case, unknown `-c` overrides, app-server strict startup failure through `codex app-server`, and rejection for unsupported commands such as `codex cloud`, `codex mcp`, `codex remote-control`, and `codex app-server proxy`. The config and config-loader tests cover unknown top-level fields, unknown nested fields, unknown `[features]` keys, source-location reporting, non-file managed config sources, and `-c` validation for keys such as `features.foo`. The app-server test suite covers standalone `codex-app-server --strict-config` startup failure for an unknown config field. ## Documentation The Codex CLI docs on developers.openai.com/codex should mention `--strict-config` as an opt-in validation mode for supported config-heavy entry points once this ships.
Michael Bolin ·
2026-05-13 16:08:05 +00:00 -
feat: memories ext (#22498)
First memories extension implementation Based on memories-mcp tools
jif-oai ·
2026-05-13 17:14:31 +02:00 -
Refactor extension tools onto shared ToolExecutor (#22369)
## Why Extension tools were split across two public runtime contracts: `codex-tool-api` exposed `ToolBundle` plus its own call/spec/error types, while core native tools used `codex_tools::ToolExecutor`. That made contributed tool specs and execution behavior easy to drift apart and added another crate boundary for what should be one executable-tool seam. This PR makes `ToolExecutor` the single runtime contract and keeps extension-specific pinning in `codex-extension-api`. ## Remaining todo https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5 Either generic `Invocation` or sub-extract the `ToolCall` and clean `ToolInvocation` ## What changed - Removed the `codex-tool-api` workspace crate and its dependencies from core and `codex-extension-api`. - Made `codex_tools::ToolExecutor` object-safe with `async_trait` so extension contributors can return a dyn executor. - Added the extension-facing aliases under `ext/extension-api/src/contributors/tools.rs`, including `ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output = ExtensionToolOutput>`. - Changed `ToolContributor::tools` to return extension executors directly instead of `ToolBundle`s. - Updated core’s extension tool handler/registry/router path to adapt those extension executors into the existing native `ToolInvocation` runtime path. - Added focused coverage for extension tools being registered, model-visible, dispatchable, and not replacing built-in tools. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-extension-api`
jif-oai ·
2026-05-13 12:12:06 +02:00 -
Remove CODEX_RS_SSE_FIXTURE test hook (#22413)
## Why `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests bypass the normal Responses transport by reading SSE from local files. That kept test-only behavior wired through production client code. The affected tests can stay hermetic by using the existing `core_test_support::responses` mock server and passing `openai_base_url` instead. ## What Changed - Removed the `CODEX_RS_SSE_FIXTURE` flag, `codex_api::stream_from_fixture`, the `env-flags` dependency, and the checked-in SSE fixture files. - Repointed the affected core, exec, and TUI tests at `MockServer` with the existing SSE event constructors. - Removed the Bazel test data plumbing for the deleted fixtures and refreshed cargo/Bazel lock state. ## Verification - `cargo build -p codex-cli` - `cargo test -p codex-api` - `cargo test -p codex-core --test all responses_api_stream_cli` - `cargo test -p codex-core --test all integration_creates_and_checks_session_file` - `cargo test -p codex-exec --test all ephemeral` - `cargo test -p codex-exec --test all resume` - `cargo test -p codex-tui --test all resume_startup_does_not_consume_model_availability_nux_count` - `just bazel-lock-update` - `just bazel-lock-check` - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui` - `git diff --check`
pakrym-oai ·
2026-05-13 03:08:01 +00:00