mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
1446 Commits
-
[codex] Use model metadata for skills usage instructions (#29740)
## Summary - add a false-by-default `include_skills_usage_instructions` model metadata field - enable the field for the bundled `gpt-5.5` model metadata - consume the metadata in both core and extension skill rendering - remove hardcoded legacy-model matching and its marker plumbing
ani-oai ·
2026-06-29 09:44:36 +09:00 -
[codex] Enable remote plugins by default (#30297)
## Summary - enable the remote plugin feature by default - promote the remote plugin feature from under development to stable - preserve the existing `features.remote_plugin` override for explicitly disabling it - keep legacy disabled-path coverage explicit in TUI and app-server tests ## Impact Remote plugin functionality is enabled by default for configurations that do not set the feature flag. The existing Codex backend authentication gate still applies. ## Validation - `just fmt` - `just test -p codex-features` - `just test -p codex-tui plugins_popup_remote_section_fallback_states_snapshot` - targeted `codex-app-server` plugin-list and skills-list tests - `git diff --check` The full TUI and app-server suites were also exercised locally. All remote-plugin-related coverage passed; unrelated local sandbox/test-binary failures remain outside this change.
xl-openai ·
2026-06-28 11:46:25 -07:00 -
[app-server] increase currentTime/read timeout (#30384)
## Summary Increase the external currentTime/read request timeout from 5 seconds to 10 seconds. ## Validation - just fmt - Focused app-server test build was stopped to defer validation to CI.
rka-oai ·
2026-06-27 16:42:03 -07:00 -
[app-server] expose environment info RPC (#30291)
## Why App-server clients that configure named execution environments need to discover an environment's shell and working directory before selecting it for a thread or turn. Because the environment can run on a different operating system than app-server, its working directory is represented as a canonical `file:` URI rather than a host-local path string. The probe also needs a bounded response time: an exec-server that completes initialization but never answers `environment/info` must not hold the environment serialization queue indefinitely. ## What changed - Add an experimental `environment/info` app-server RPC for named environments. - Route the probe through the managed environment connection and return target-native shell metadata plus the default working directory as a `PathUri`. - Return connection and protocol failures as JSON-RPC errors. - Bound the exec-server probe response to 30 seconds and remove timed-out calls from the pending-request table so later environment mutations can proceed. - Cover successful responses, omitted working directories, unknown environments, connection failures, and pending-call cleanup. ## Protocol examples Request: ```json { "id": 42, "method": "environment/info", "params": { "environmentId": "remote-a" } } ``` Successful response: ```json { "id": 42, "result": { "shell": { "name": "zsh", "path": "/bin/zsh" }, "cwd": "file:///workspace" } } ``` If the exec-server initializes but does not answer the probe within 30 seconds: ```json { "id": 42, "error": { "code": -32603, "message": "failed to get info for environment `remote-a`: exec-server protocol error: timed out waiting for exec-server `environment/info` response after 30s" } } ``` ## Testing - App-server integration coverage for successful info (including omitted `cwd`), unknown environments, and connection failures. - Exec-server RPC coverage verifying a timed-out call is removed from the pending-request table. --------- Co-authored-by: Michael Bolin <mbolin@openai.com>Max Johnson ·
2026-06-27 19:34:10 +00:00 -
app-server: structure and test JSON shutdown logs (#30314)
## Why `LOG_FORMAT=json` and `RUST_LOG` are supported by app-server, but the behavior was only covered indirectly. We should verify the actual JSONL written by both user-facing entry points: `codex app-server` and the standalone `codex-app-server` binary. The existing processor shutdown message also always said the channel closed, even though the processor can exit for several different reasons. Structured fields make that event more accurate and useful to log consumers. ## What changed - Record the processor `exit_reason`, remaining connection count, and forced-shutdown state as structured tracing fields. - Add a shared process-test helper that enables JSON logging, validates every stderr line as JSON, and verifies the top-level timestamp is RFC 3339. - Cover both `codex app-server` and `codex-app-server`, asserting the stable `level`, `fields`, and `target` payload. ## Test plan - `just test -p codex-app-server standalone_app_server_emits_json_info_events` - `just test -p codex-cli app_server_emits_json_info_events`
Michael Bolin ·
2026-06-26 18:19:56 -07:00 -
[codex] Support npm marketplace plugin sources (#29375)
## Why Marketplace source deserialization treated `{"source":"npm", ...}` as unsupported. The loader logged and skipped the entry, so npm-backed plugins never appeared in `plugin list --available` and `plugin add` returned "plugin not found". Codex plugins are installed from a plugin root, not from an npm dependency tree. For npm-backed marketplace entries, Codex should fetch the published package contents without running package scripts or installing unrelated dependencies. ## What changed - Add `npm` marketplace plugin sources with `package`, optional semver `version` or version range, and optional HTTPS `registry`. - Reject unsafe npm source fields before materialization, including invalid package names, non-semver version selectors, plaintext or credential-bearing registry URLs, and registry query/fragment data. - Materialize npm plugins with `npm pack --ignore-scripts`, then unpack the resulting tarball through the existing hardened plugin bundle extractor. - Enforce npm archive and extracted-size limits, require the standard npm `package/` archive root, and verify the extracted `package.json` name matches the requested package before installing. - Keep plugin listings, install-source descriptions, CLI JSON/human output, app-server v2 `PluginSource`, TUI source summaries, regenerated schema fixtures, and app-server documentation in sync. ## Impact Marketplaces can distribute Codex plugins from public or configured private HTTPS npm registries using the same install flow as existing materialized plugin sources. `npm` must be available on `PATH` when an npm-backed plugin is installed. Fixes #27831 ## Validation - `just write-app-server-schema` - `just test -p codex-core-plugins -p codex-app-server-protocol -p codex-app-server -p codex-cli` - npm/schema/core-plugin coverage passed in the run. - The full focused command finished with `1739 passed`, `11 failed`, and `6 timed out`; the failures were unrelated local app-server environment failures from `sandbox-exec: sandbox_apply: Operation not permitted` plus one missing `test_stdio_server` helper binary. - Installed an npm-published Codex plugin package through a throwaway local marketplace and throwaway `CODEX_HOME` to exercise the real npm materialization path end to end.charlesgong-openai ·
2026-06-26 17:24:46 -04:00 -
feat(app-server): add optional turn_id to thread/fork (#30277)
## Description This adds stable optional `turnId` support to `thread/fork`. When supplied, the fork copies persisted history through that terminal turn, inclusive, and drops later turns from the new thread. Omitting or passing `null` preserves the existing full-history fork behavior, including the interruption marker when the stored source history ends mid-turn. ## Why We're deprecating `thread/rollback` and this will help certain UX use cases work around it by using `thread/fork` + `turn_id` instead.
Owen Lin ·
2026-06-26 19:35:54 +00:00 -
[codex] allow AGENTS.md and skills to authorize delegation (#30274)
Prompt update of MAv2 to include agents.md and skills more explicitly should mimic: https://github.com/openai/codex/pull/27919
Charles Du ·
2026-06-26 12:17:26 -07:00 -
[codex] Add managed new-thread model settings (#29683)
## Why Admins need persistent defaults for the model, reasoning effort, and service tier shown when the Desktop App creates a new thread. These are initialization defaults rather than runtime constraints: the App should use them to initialize its draft while still allowing a user to make an explicit selection. The app-server therefore needs to expose the managed values before thread creation without changing `thread/start` behavior for other clients. ## What changed - Parse `model`, `model_reasoning_effort`, and `service_tier` from `[models.new_thread]` in `requirements.toml`. - Compose the `models` requirements through the existing requirements-layer precedence rules. - Expose the resolved values through `configRequirements/read` as `requirements.models.newThread`. - Add the corresponding app-server protocol types and regenerate the JSON and TypeScript schema fixtures. - Document the new `configRequirements/read` fields in the app-server README. ## Scope This PR is data plumbing only. It does not apply these values during `thread/start` and does not change thread creation for existing app-server clients, resumed or forked sessions, internal or subagent sessions, `codex exec`, or the TUI. A companion Desktop App change owns draft initialization, sends the effective settings for ordinary and prewarmed starts, and preserves explicit user changes. ## Validation - Requirements deserialization coverage for `[models.new_thread]` - Requirements-layer precedence coverage - App-server API mapping coverage - `configRequirements/read` integration coverage - Regenerated app-server JSON and TypeScript schema fixtures
hefuc-oai ·
2026-06-26 18:37:40 +00:00 -
feat(app-server): add history_mode to thread (#29927)
## Description This PR adds a new `historyMode = "legacy" | "paginated"` to `Thread`. This will be stored in `SessionMeta` in the JSONL rollout file and as a new column in the SQLite thread_metadata table, and exposed on `thread/start` and on the `Thread` object in app-server. ## What changed - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`, defaulting old and new SessionMeta to `legacy`. - Carried `history_mode` through core session config, ThreadStore stored metadata, local/in-memory stores, rollout metadata extraction, and the existing SQLite `threads` table. - Added experimental `historyMode` to app-server v2 `Thread` and `thread/start`. - Made paginated stored threads metadata-discoverable but unsupported for legacy full-history reads, `load_history`, live resume, and create paths. - Regenerated app-server schema fixtures and added protocol/state/thread-store/app-server coverage for persistence and fail-closed behavior. ## Compatibility floor Because users may be running various versions of Codex binaries on the same machine (TUI, Codex App, etc.), we will need to establish a compatibility floor for upcoming paginated threads, which will change how thread storage reads and writes work. The overall plan here: ``` Release N: - Add historyMode to SessionMeta / Thread / SQLite metadata. - Teach binaries to understand paginated threads. - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread. - Default remains `"legacy"`. Release N+1: - First-party clients start opting into paginated threads where appropriate. - Internal dogfood / staged rollout. - Measure old-client usage and paginated-thread unsupported errors. Release N+2: - Only after Release N+ is overwhelmingly deployed, make paginated the default. - Accept that a small tail of N-1-or-older binaries may not understand paginated threads. ``` The important behavior change is fail-closed handling for a binary that encounters a persisted `paginated` thread before it knows how to fully support paginated history. In app-server, if a thread is `paginated`, we will: - allow metadata-only discovery paths like `thread/list` and `thread/read(includeTurns=false)`, so clients can still see the thread and inspect its `historyMode` - reject legacy full-history/live-thread paths like `thread/read(includeTurns=true)` and `thread/resume` with an unsupported JSON-RPC error - avoid silently treating an unknown or future `historyMode` as `legacy` Under the hood, the ThreadStore layer also rejects legacy operations that would need to load or replay the full thread history for a paginated thread. That gives us the behavior we want for Release N: future paginated threads are visible, but this binary fails closed instead of trying to operate on them as if they were legacy threads.
Owen Lin ·
2026-06-26 09:12:42 -07:00 -
Test selected capabilities across unavailable resume (#30215)
## Why The selected-capability integration test already covers initial attachment and cold resume, but it resumes while the selected executor is still reachable. That leaves an important World State transition untested: a thread remembers its selected capability root, resumes while that environment is unavailable, and later sees the same stable environment return. ## What this tests This extends the existing end-to-end scenario: ```text selected executor available ↓ app-server stops and the executor goes away ↓ thread resumes with the executor unavailable ↓ skills, selected MCP tools, and connector attribution are absent ↓ the same environment ID is attached again ↓ skills, MCP tools, and connector attribution return ``` The test also checks that the unavailable snapshot explicitly tells the model that no selected-environment skills are currently available. After reattachment, it invokes the selected skill again and verifies that a new executor-owned MCP process starts. ## Scope This is test-only. It keeps the existing assumption that an environment ID refers to stable capability contents. It does not add package-file invalidation or live transport reconnect behavior.jif ·
2026-06-26 11:02:27 +01:00 -
Test selected capabilities across availability and resume (#30157)
## Why This stack crosses World State, executor skills, selected plugin metadata, MCP processes, connectors, dynamic environments, and resume. This PR adds two end-to-end scenarios that validate those pieces together. Both tests enable `deferred_executor`, so they exercise the real delayed-environment path. ## Scenario 1: availability across turns and resume ```text 1. Start a thread with one selected plugin root bound to E1. 2. E1 is unavailable. - executor skill is absent - selected MCP is absent - connector has no selected-plugin attribution 3. Start E1 and register the same stable environment ID. 4. Start a new turn. - the executor skill appears through World State - its body beats a colliding host skill - the selected MCP tool is advertised and executes inside E1 - the connector is attributed to the selected plugin 5. Start another turn without changing E1. - the MCP PID stays the same, proving runtime reuse 6. Restart app-server and resume the thread. - durable selected-root intent is restored - skills, MCP, and connector attribution are restored - a new MCP PID proves ephemeral process state was rebuilt ``` ## Scenario 2: availability changes inside one turn ```text 1. Start a turn while E1 is unavailable. 2. The first model sample sees no executor skill, MCP, or selected connector. 3. The turn pauses on request_user_input. 4. Start E1 and register it while that same turn is still active. 5. Continue the turn. 6. The very next model sample sees: - the executor skill catalog - the selected MCP tool - selected-plugin connector attribution 7. The model calls the MCP, and its output proves execution happened inside E1. ``` This second scenario specifically protects the aeon-style behavior: capability state is captured again for every sampling step, not only at the next user turn. ## Scope These are integration tests only. They do not add a combinatorial matrix for unsupported plugin-file mutation, environment generations, transport disconnects, or delayed `required = true` executor MCPs.
jif ·
2026-06-26 03:11:55 +01:00 -
Expose MCP app identity in app context (#29934)
## Why MCP tool-call events need to expose trusted app identity and action metadata directly so v2 clients do not have to infer it from tool names or resource URIs. ## What changed - Add optional `appName`, `templateId`, and `actionName` fields to MCP tool-call `appContext`. - Populate `appName` and `templateId` from trusted Codex Apps metadata, and derive `actionName` from the trusted app resource metadata. - Preserve all three fields through core events, legacy protocol events, persisted thread history, resume redaction, and app-server v2 responses. - Document the public `appContext` fields in `codex-rs/app-server/README.md`. - Regenerate app-server JSON and TypeScript schemas and add coverage for serialization, persistence, redaction, and metadata propagation. ## Validation - `just test -p codex-app-server-protocol mcp_tool_call` - `just test -p codex-core mcp_tool_call_item_metadata_only_trusts_codex_apps_identity mcp_tool_call_item_includes_app_identity` - `just write-app-server-schema` --------- Co-authored-by: Martin Au-Yeung <280153141+martinauyeung-oai@users.noreply.github.com>
Martin Au-Yeung ·
2026-06-25 18:31:10 -07:00 -
Keep MCP elicitation routable across runtime refreshes (#30127)
## Why An MCP tool call can still be waiting for an elicitation response when an environment update replaces the thread's MCP runtime. Before this change: ```text runtime A starts a tool call and asks the user environment becomes ready, so runtime B is published client answers the prompt through runtime B runtime B cannot find runtime A's pending responder ``` The response is lost and the original tool call stays blocked. ## What changed All MCP runtimes for one thread now share a small elicitation router: ```text runtime A ---\ shared router: response token -> exact pending responder runtime B ---/ ``` When Codex surfaces an MCP elicitation, it assigns a unique opaque response token. The router records which pending request owns that token. A replacement runtime reuses the same router, so the latest runtime can deliver a response to a request started by the previous runtime. The Codex-owned token also prevents two runtime connections that reuse the same MCP server request ID from receiving each other's responses. This does not retain or search old MCP managers. Only the pending responder map is shared. ## Covered scenario The integration test exercises the complete failure mode: 1. A thread starts while its selected environment is still unavailable. 2. A configured MCP server starts a tool call and asks the client for input. 3. The environment becomes ready, causing Codex to publish a replacement MCP runtime. 4. The client answers the original prompt after the replacement. 5. The original tool call receives that answer and completes. A focused routing test also creates two runtimes with the same server request ID and verifies that each response reaches the exact request that emitted its token. ## Scope This PR changes only elicitation response routing across MCP runtime replacement. It does not change when runtimes are rebuilt, which environments contribute MCP configuration, or how environment availability is detected.jif ·
2026-06-26 01:28:14 +00:00 -
[codex] Attribute app-server analytics by thread originator (#29935)
## Why Desktop Work threads and regular Codex threads can share the same app-server connection. App-server analytics currently copy `product_client_id` from connection metadata for every thread-scoped event, so Work thread activity is attributed to the Desktop connection instead of the thread's resolved originator. This prevents analytics from distinguishing the two products on a shared connection. ## What changed - Publish the resolved originator after a thread is materialized, covering new, resumed, forked, and subagent threads. - Store that originator in the analytics reducer's existing per-thread state. - Override only `app_server_client.product_client_id` for thread, turn, tool, review, goal, guardian, and compaction events while preserving the connection's client name, version, and transport metadata. - Fall back to the connection-wide product client ID when a thread has no originator override. - Preserve persisted originators in thread initialization analytics for resume and fork flows. ## Validation - `just test -p codex-analytics thread_originator_overrides_shared_connection_across_thread_events subagent_events_keep_thread_originator_with_explicit_turn_connection` - `just test -p codex-app-server turn_start_tracks_thread_originator_in_analytics thread_start_tracks_thread_initialized_analytics thread_fork_tracks_thread_initialized_analytics thread_resume_tracks_thread_initialized_analytics` - `just test -p codex-core thread_manager`
alexsong-oai ·
2026-06-25 18:15:48 -07:00 -
Project selected plugin runtime by environment availability (#30093)
## Why Selected plugin metadata is stable, but MCP processes are live runtime state. They need different lifetimes: - the MCP extension caches manifest, MCP, and connector declarations for each stable selected root; - each model step projects that cached metadata through the roots that resolved as ready for that exact step; - the MCP manager is rebuilt only when that availability projection changes. This matches executor skills: both features consume the same resolved step roots instead of inferring readiness from the turn's selected environments. ## Behavior ```text E1 not ready for this step -> no E1 MCP servers or connectors -> cached plugin metadata stays in ext/mcp E1 becomes ready -> reuse cached metadata -> publish one MCP runtime containing E1 capabilities same ready roots on the next step -> reuse the exact runtime; no rediscovery and no MCP restart resume -> create new extension thread state and a new MCP runtime ``` All model-facing consumers use the same step snapshot: ```text resolved selected roots | v extension MCP/connector projection | v { MCP config, connector snapshot, MCP manager } | +-> advertise model tools +-> build app/connector tools +-> execute MCP calls ``` ## Cache contract The existing MCP extension owns a cache keyed by the full `SelectedCapabilityRoot`: ```rust let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default); ``` The cache lives with extension thread state. Environment availability filters projection but does not invalidate metadata. Resume creates new thread state. There is no file watcher or executor generation because contents behind a stable environment/root are assumed stable. ## What changes - Keeps executor plugin discovery and cached metadata in `ext/mcp`. - Caches MCP and connector declarations together per selected root. - Uses the step's already-resolved capability roots, including lazy environments that are not turn environments. - Reuses the current MCP runtime when the ready-root projection is unchanged. - Uses the same step MCP manager and connector snapshot for model-visible tools and execution. - Resolves direct thread-scoped MCP requests from the current selected-root projection. ## Deliberately out of scope - `app/list` remains based on the latest global host-plugin state; this PR does not make its response or notifications thread-specific. - `required = true` startup semantics do not apply to delayed executor MCP activation. - No filesystem/content invalidation. - No transport-disconnect watcher. - No executor generations or environment replacement semantics. - No client sharing across complete manager replacements. ## Stack 1. Extension-owned World State sections. 2. Project executor skills through World State. 3. Pin one MCP runtime to each model step. 4. **This PR:** project selected MCP and connector state from extension-owned metadata. 5. Integration coverage for selected capability availability and resume. ## Verification - `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id` - The stacked integration PR covers unavailable to ready activation, unchanged-runtime reuse, skills, MCP tools, connector attribution, and cold resume.jif ·
2026-06-26 01:36:44 +01:00 -
Pin MCP runtimes to model steps (#30101)
## Why An MCP refresh can replace the session's current manager while a model step is still running. The step must execute calls through the same manager whose tools it advertised. ## Boundary ```text current session MCP runtime | | capture once for this model step v StepContext.mcp - exact MCP config - exact connection manager - exact runtime environment context ``` ```rust pub struct McpRuntimeSnapshot { config: Arc<McpConfig>, manager: Arc<McpConnectionManager>, runtime_context: McpRuntimeContext, } ``` ## Example ```text step A captures runtime A and advertises A's tools refresh publishes runtime B step A tool call -> runtime A next step -> runtime B ``` Capturing the snapshot is only an `Arc` clone. It does not restart MCPs or make an RPC. ## What changes - Captures one MCP runtime in `StepContext`. - Uses it for tool planning, tool calls, resources, approvals, connector attribution, and elicitation. - Publishes replacement runtimes atomically. - Lets an old runtime live only while an in-flight step or request still holds its `Arc`. Most of this diff is mechanical routing from the session-global manager to `step_context.mcp`; it does not introduce selected-plugin discovery yet. ## What does not change - No plugin or extension migration. - No new MCP cache policy. - No environment file watching. - No client sharing between separate managers. ## Stack 1. Extension-owned World State sections. 2. Project executor skills through World State. 3. **This PR:** pin one MCP runtime to each model step. 4. Project selected MCP/app/connector metadata by environment availability. 5. One end-to-end integration scenario.jif ·
2026-06-26 00:53:07 +01:00 -
[codex] Surface MCP reauthentication-required startup failures (#29877)
## Summary - distinguish expired, non-refreshable stored MCP OAuth credentials from first-time missing credentials - carry a typed `failureReason: "reauthenticationRequired"` on the existing `mcpServer/startupStatus/updated` notification only when user action is required - keep the public MCP auth-status API unchanged and regenerate the app-server protocol schemas and documentation ## Why An MCP server with an expired access token and no usable refresh token currently fails startup without giving clients a reliable, typed recovery signal. The existing startup-status notification is the natural place to carry this state. Its nullable `failureReason` keeps the recovery reason attached to the failed startup transition without adding a one-off notification. Internally, Codex distinguishes first-time login from reauthentication and emits the reason only when the startup error itself requires authentication. ## User impact App clients can prompt an existing user to reconnect an MCP server when automatic recovery is impossible by handling a failed `mcpServer/startupStatus/updated` notification whose `failureReason` is `reauthenticationRequired`. Starting, ready, cancelled, unrelated failures, and first-time setup carry no reauthentication reason. ## Companion app PR - openai/openai#1069582 ## Validation - `just test -p codex-app-server-protocol` — 248 passed; schema fixture tests passed - `cargo check -p codex-app-server -p codex-tui` - `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped - `just test -p codex-protocol -p codex-app-server-protocol -p codex-mcp` — 579 passed - `just write-app-server-schema` - `just fmt`
felixxia-oai ·
2026-06-25 21:50:36 +00:00 -
fix(app-server): suppress TUI rollback warning (#30124)
## Why The TUI uses `thread/rollback` internally for user-facing flows such as prompt cancellation/backtracking. After `thread/rollback` was marked deprecated, those internal calls started surfacing `deprecationNotice` messages in the TUI, even though the user did not explicitly call the deprecated app-server API. The endpoint should remain deprecated for external app-server clients, but the built-in `codex-tui` client should not show this implementation-detail warning during normal interaction. ## What changed - Pass the initialized app-server client name into the `thread/rollback` request processor. - Suppress the `thread/rollback` deprecation notice only for `codex-tui`. - Preserve the existing `deprecationNotice` behavior for non-TUI clients. - Add regression coverage for the `codex-tui` suppression path. ## How to Test 1. Start Codex TUI from this branch. 2. Type text into the composer and press `Esc` to cancel/backtrack. 3. Confirm the TUI restores/cancels the prompt without showing `thread/rollback is deprecated and will be removed soon`. 4. Also verify an external app-server client that calls `thread/rollback` still receives `deprecationNotice`. Targeted tests: - `just test -p codex-app-server thread_rollback` - `just argument-comment-lint`
Felipe Coury ·
2026-06-25 18:44:35 -03:00 -
feat(core, mcp): cache codex_apps tools in memory (#29003)
## Description This makes Codex Apps tool reads use a shared in-memory snapshot instead of rereading the disk cache every time `list_all_tools()` runs. Disk still seeds the cache on startup and gets updated after successful fetches, but it is no longer the live read path. The core change is that `McpManager` now owns a process-scoped `CodexAppsToolsCache`. Codex threads in the same app-server process now share this Codex Apps in-memory tools snapshot. The snapshot is keyed by the Codex home plus the Codex Apps identity: the active Codex auth user/workspace and the effective Codex Apps MCP source config. There's already code to hard-refresh the cache, so we respect it in this PR. ## Local benchmark I ran a local steady-state microbenchmark of the exact repeated Codex Apps cached-tools read this PR removes, using the same real local cache payload in both trees: `3,678,138` bytes and `381` tools. The cache file was already warm in the OS page cache, so this measures same-process reread/deserialization work rather than cold-disk latency or full turn latency. Each run is 25 iterations (mimicking a turn that makes 25 inference calls). | Version | Run 1 | Run 2 | Avg | |---|---:|---:|---:| | `origin/main` disk read + JSON deserialize + `filter_tools` | `50.755 ms` | `52.894 ms` | `51.825 ms` | | This branch in-memory `current_tools` + `filter_tools` | `0.740 ms` | `0.778 ms` | `0.759 ms` | That removes about `51 ms` from each repeated Codex Apps cached-tools read on this machine, roughly `68x` faster for that subpath. It is useful evidence for the hot path this PR changes, but not a claim that every production turn gets `51 ms` faster; end-to-end impact also depends on the rest of `list_all_tools()` and tool-payload construction. This is on my M2 Max macbook, so with a slower disk this would be much worse (and indeed we did see this really blew up turn runtime with a slow disk).
Owen Lin ·
2026-06-25 20:54:48 +00:00 -
[codex] poll external clock during sleep (#30113)
## Summary - make the external app-server time provider establish sleep deadlines using `currentTime/read` - poll the external clock once per second and complete `clock.sleep` when the deadline is reached - keep the system-clock timer and existing steer/agent-message interruption behavior unchanged ## Why This lets training control `clock.sleep` through its existing external simulated clock without adding separate sleep/wake protocol methods. ## Testing - `just fmt` - `just test -p codex-app-server external_sleep_polls_current_time_and_emits_items`
rka-oai ·
2026-06-25 13:46:42 -07:00 -
feat: add provider-aware model fallback to thread start (#29942)
## Why Helper threads such as task title generation can request a model ID that is valid for the default OpenAI provider but unavailable from the active provider. With Amazon Bedrock, `gpt-5.4-mini` is rejected while the provider static catalog exposes Bedrock model IDs such as `openai.gpt-5.5` and `openai.gpt-5.4`. This causes repeated background 404s and can surface a misleading turn error even when the main turn succeeds. Clients need an explicit way to ask app-server to resolve an unavailable helper model to the active provider default. That fallback must remain limited to providers with an authoritative static catalog so custom or dynamically discovered model IDs are not rewritten based on an incomplete catalog. Fixes #28741. ## What changed - Add the experimental `allowProviderModelFallback` option to `thread/start`, defaulting to `false` to preserve existing behavior. - Thread the option through thread creation and model selection. - When enabled for a static model manager, preserve requested models present in the catalog and replace unavailable models with the provider default. - Continue preserving explicit model IDs for dynamic model managers without fetching a catalog solely to validate them. - Document the new `thread/start` behavior in the app-server API overview. ## Test Temporary test-client harness: ``` ThreadStartParams { model: Some("gpt-5.4-mini".to_string()), allow_provider_model_fallback: true, ..Default::default() } ``` Command: ``` CODEX_HOME=/tmp/codex-bedrock-thread-start-home \ CODEX_E2E_BEDROCK_THREAD_START_ONLY=1 \ ./target/debug/codex-app-server-test-client \ --codex-bin ./target/debug/codex \ -c 'model_provider="amazon-bedrock"' \ send-message-v2 --experimental-api ignored ``` Relevant output: ``` > "method": "thread/start", > "params": { > "model": "gpt-5.4-mini", > "modelProvider": null, > "allowProviderModelFallback": true, > ... > } < "result": { < "model": "openai.gpt-5.5", < "modelProvider": "amazon-bedrock", < ... < } ```
Celia Chen ·
2026-06-25 18:24:34 +00:00 -
Persist selected capability roots and resolve availability per model step (#29856)
## Why `selectedCapabilityRoots` is durable thread intent: “use this capability root from environment `worker`.” The important product assumption is: > One environment ID always names the same logical executor and stable contents. `worker` does not silently change from executor A to an unrelated executor B. The process-local connection handle for `worker` can still be replaced while Codex is running, though, for example when `environment/add` registers a fresh handle for the same logical environment. The thread should persist only the stable selection. Each model step should pair that selection with the exact ready handle captured for that step. ## The boundary ```text persisted thread intent plugin@1 -> environment "worker" | | capture the current step v model-step view unavailable, or plugin@1 + worker's exact captured ready handle ``` The environment ID is the stable identity and cache key. The `Arc<Environment>` is only a process-local handle retained so consumers of one model step use the same captured environment. It is never persisted and it does not imply different environment contents. ## What changes ### Persist the stable selection Selected roots are written into `SessionMeta` and restored with the thread. Forked subagents inherit the same selections, including bounded-history forks. Only stable data is persisted: root ID, environment ID, and root path. ### Capture readiness together with the exact handle The environment snapshot records: ```rust environment_id -> Some(Arc<Environment>) // ready in this step environment_id -> None // still starting in this step ``` This prevents readiness and execution from coming from different registry snapshots. For example: ```text step snapshot: worker -> handle A, ready environment/add: worker -> fresh handle B for the same logical environment current step: plugin@1 still uses captured handle A ``` Without carrying handle A in the snapshot, the resolver could combine “A was ready” with handle B and treat B as ready before it had finished starting. This does not change cache invalidation. Stable capability metadata remains identified by environment ID and capability root. Replacing a process-local handle under the same stable environment ID does not invalidate or rediscover that metadata. ### Resolve availability per model step - A ready captured environment produces resolved roots using its captured handle. - A starting, missing, or failed environment is omitted from that step. - A selected lazy environment that is outside the turn's captured environment set is asked to start, and a later step can observe it as ready. - No capability files are scanned here. Transient transport disconnects remain the remote client's reconnect concern. This PR models initial attachment/readiness; it does not add live socket-connectivity state. ## Example ```text thread selection: plugin@1 -> environment "worker" step 1: worker is starting -> plugin@1 unavailable step 2: worker is ready -> plugin@1 resolves through worker's captured handle step 3: fresh local handle -> current step remains pinned; a later step captures its own view ``` Temporary unavailability does not discard the durable selection. Later PRs can retain stable metadata caches while projecting only currently available capabilities into model-visible World State. ## Compatibility The app-server request shape does not change. Older rollouts without `selected_capability_roots` deserialize to an empty list. ## Stack 1. **This PR:** persist stable selected roots and resolve them through an exact model-step handle. 2. #29960: cache stable skill metadata and project available skills into World State. 3. #29946: cache stable plugin declarations and manage the separate live MCP runtime.jif ·
2026-06-25 17:49:43 +00:00 -
chore(app-server): mark thread/rollback as deprecated (#29928)
We will drop support for this in the near future due to the complexity it introduces.
Owen Lin ·
2026-06-25 17:15:46 +00:00 -
Test executor-routed MCP OAuth token exchange (#29656)
## Why #28529 proves OAuth discovery uses the selected executor, but its end-to-end test stops before the callback and token exchange. ## What changed - add an executor-only mock token endpoint - complete the OAuth callback using the authorization URL's `state` and `redirect_uri` - assert the PKCE token exchange reaches the executor-only endpoint - assert the completion notification reports the selected thread and succeeds Depends on #28529.
jif ·
2026-06-25 09:45:20 +00:00 -
Support OAuth for HTTP MCP servers from selected executor plugins (#28529)
## Why #28522 routes selected-plugin HTTP MCP traffic through the owning executor, but OAuth bootstrap and refresh still used host-local clients. Executor-only servers therefore cannot complete discovery or login through the same network boundary as the MCP connection. ## What changed - adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient` contract - let RMCP own discovery, dynamic registration, PKCE, token exchange, and refresh - route auth status, persisted-token startup, and app-server login through the server runtime while preserving the existing local discovery path - add optional `threadId` to `mcpServer/oauth/login` and echo it in the completion notification - implement RMCP's redirect policy and 1 MiB OAuth response limit over executor HTTP - cover selected-thread OAuth discovery and login through an executor-only route Depends on #28522.
jif ·
2026-06-25 10:31:17 +01:00 -
Support HTTP MCP servers from selected executor plugins (#28522)
## Why Selected executor plugins can declare both stdio and Streamable HTTP MCP servers, but only stdio registrations were retained. That silently drops part of the plugin's tool surface and prevents HTTP traffic from using the owning executor's network. ## What changed - retain selected-plugin Streamable HTTP MCP declarations alongside stdio declarations - route their HTTP clients through the owning executor environment - preserve local auth-header environment references while rejecting them for executor-hosted declarations - cover thread isolation, refresh, and an executor-only HTTP route end to end
jif ·
2026-06-25 10:10:36 +01:00 -
[codex] route sleep through time providers (#29973)
## Summary - add a cancellable sleep operation to `TimeProvider` - route `clock.sleep` through the configured provider - extend the supported sleep duration to 12 hours - complete the sleep turn item before propagating provider failures ## Why This isolates the core clock abstraction needed by external clock integrations. Existing system and app-server behavior remains wall-clock based in this PR; the stacked follow-up supplies app-server sleeps from an external clock.
rka-oai ·
2026-06-24 22:17:43 -07:00 -
[codex] Add Ultra reasoning effort (#29899)
## Why Ultra should be one user-facing reasoning selection for work that benefits from both maximum reasoning and proactive multi-agent delegation. Without it, clients must coordinate maximum reasoning with the experimental `multiAgentMode` setting, even though the inference backend still expects its existing `max` effort value. This change makes reasoning effort the source of truth: clients select `ultra`, core derives proactive multi-agent behavior when the turn is eligible for multi-agent V2, and inference requests continue to use the backend-compatible `max` value. ## What changed - Add `ultra` as a first-class reasoning effort and preserve model-catalog ordering when exposing it to clients. - Convert `ultra` to `max` at the inference request boundary, including Responses HTTP/WebSocket requests, startup prewarm, compaction, and memory summarization. - Derive effective multi-agent mode per turn from effective reasoning effort: - eligible multi-agent V2 + `ultra` → `proactive` - eligible multi-agent V2 + any other effort → `explicitRequestOnly` - V1 or otherwise ineligible sessions → no multi-agent mode instruction - Keep the derived effective mode in turn context history so successive turns can emit a developer-message update only when the effective mode changes. - Remove selected multi-agent mode from core session configuration, turn construction, thread settings, resume/fork restoration, and subagent spawn plumbing. Subagents inherit reasoning effort and derive their own effective mode. - Retain the experimental app-server `multiAgentMode` fields for wire compatibility while marking them deprecated. Request values are accepted but ignored; compatibility response fields report `explicitRequestOnly`. - Display Ultra in the TUI using the order supplied by `model/list`. ## Validation - `just test -p codex-core ultra_reasoning_uses_max_for_requests` - `just test -p codex-tui model_reasoning_selection_popup`
Shijie Rao ·
2026-06-24 20:13:52 -07:00 -
[codex] Populate remote plugin local versions (#29956)
# What - Carry installed remote release versions through remote plugin summaries as `localVersion`. - Keep the app-server mapping a pure adapter by populating that value in the remote catalog layer. # Why Remote plugin summaries always returned `localVersion: null` even after their versioned bundles had been installed locally. Consumers such as scheduled-task template discovery use `localVersion` to resolve a plugin's materialized root, so templates from remote curated plugins were silently skipped.
Abhinav ·
2026-06-25 03:13:03 +00:00 -
[codex] nest sleep config under current time reminder (#29910)
## Summary - move sleep tool enablement from top-level `[features].sleep_tool` to `[features.current_time_reminder].sleep_tool` - remove the standalone `Feature::SleepTool` flag and gate `clock.sleep` from resolved current-time configuration - update config schema, config-lock materialization, and existing sleep coverage Stacked on #29907.
rka-oai ·
2026-06-24 17:49:00 -07:00 -
[codex] namespace sleep under clock (#29907)
## Summary - expose the interruptible sleep tool as `clock.sleep` instead of top-level `sleep` - keep `clock.curr_time` and `clock.sleep` in the same model-visible namespace when both features are enabled - update existing core and app-server integration coverage to issue namespaced sleep calls ## Why Sleep is a clock operation. Grouping it with `clock.curr_time` gives the model a more coherent tool surface without changing the sleep feature gate or runtime behavior. ## Validation - `just test -p codex-core sleep_tool_follows_feature_gate` - `just test -p codex-core any_new_input_interrupts_sleep` - `just test -p codex-app-server sleep_emits_started_and_completed_items`
rka-oai ·
2026-06-24 17:17:28 -07:00 -
Add a connector declaration snapshot (#29851)
## Why Connector declarations currently enter Codex through broad plugin capability summaries, then MCP setup, turn tooling, and `app/list` each reconstruct the same information. That makes executor-selected connectors difficult to add without coupling connector behavior to the host plugin loader. This PR introduces a small connector-owned value that later stack layers can populate before thread startup. ## What changed - Move the pure app-declaration parser into `codex-connectors`, preserving declaration order and category cleanup while leaving host-side validation and deduplication unchanged. - Add an immutable `ConnectorSnapshot` with ordered connector IDs and plugin display-name provenance. - Adapt the existing local-plugin capability summaries into that snapshot at current consumer boundaries. - Use the snapshot for MCP tool provenance, turn connector inventory, and `app/list`. - Keep the crate API narrow: no test-only snapshot accessors are exposed. The externally visible behavior is unchanged. Connector tools still come from the orchestrator-owned `/ps/mcp` server, and local plugin enablement remains owned by the existing plugin loader. ## Stack scope This is the foundation only. It does not read selected executor packages or change thread startup. #29852 adds the executor-backed declaration reader, and #29856 composes selected declarations into a thread snapshot.
jif ·
2026-06-24 23:24:01 +01:00 -
[apps] Thread structured icon assets through app list (#29889)
## Summary - Add `iconAssets` and `iconDarkAssets` to the app-list protocol. - Preserve structured icons through directory merging and the connector, app- server, and TUI boundaries. - Keep legacy logo URLs unchanged as compatibility fallbacks. - Update generated protocol schemas and TypeScript types.
Drew ·
2026-06-24 13:25:44 -07:00 -
[codex] Inject agent graph store into ThreadManager (#29736)
Pick up the AgentGraphStore migration. - Inject an explicit optional agent graph store into `ThreadManager` - Move all calls to spawn, close, recursive resume, and subtree/archive/delete/feedback traversal through it - Keep using `LocalAgentGraphStore` when SQLite is available This required some changes to the interface to deal with futures: - The interface now matches `ThreadStore`'s object-safe pattern by returning a boxed `AgentGraphStoreFuture` directly, allowing `ThreadManager` to hold `Arc<dyn AgentGraphStore>` *Slight behavior change!* Unfiltered subtree enumeration now performs a single all-status breadth-first traversal, so a closed grandchild beneath an open edge is included; the previous Open-then-Closed traversals could not cross mixed-status paths and silently omitted it.
Tom ·
2026-06-24 13:24:10 -07:00 -
feat(app-server): list descendant threads by ancestor (#29591)
## Why `thread/list` can filter direct children with `parentThreadId`, but clients cannot request an entire spawned subtree. Discovering every descendant requires repeated client-side requests and gives up the database's existing filtering and pagination path. ## What changed Experimental clients can use `ancestorThreadId` to return strict descendants at any depth while `parentThreadId` retains its direct-child meaning. The filters are mutually exclusive, the ancestor is excluded, and every result preserves its immediate `parentThreadId` so callers can reconstruct the tree. ## How it works - **Explicit relationship:** Internal list parameters distinguish direct children from transitive descendants without changing the meaning of `parentThreadId`. - **Existing graph:** Persisted parent-child spawn edges remain the source of truth, so descendant lookup needs no schema migration or ancestry cache. - **Indexed traversal:** A recursive SQLite query starts from the parent-edge index, walks each generation, and applies thread filters, sorting, and cursor pagination in the same database request. - **Reconstructable results:** The response stays flat and normally ordered while carrying each descendant's immediate parent. ## Verification Ran 550 tests across the protocol, state, rollout, and thread-store crates, then reran the four focused state, store, and app-server descendant-listing tests after the final diff reduction. Scoped Clippy and formatting checks passed. Stable and experimental schema generation was checked; the stable fixtures remain unchanged while the experimental schema includes the new field.
Brent Traut ·
2026-06-24 13:08:14 -07:00 -
[codex] show external import result counts (#29567)
## What changed - Show per-type import counts in the `/import` review UI and started message. - Render completion results as a multi-line summary with total imported/failed counts and one row per import type. - Add snapshot coverage for the updated review and completion output. <img width="537" height="322" alt="Screenshot 2026-06-23 at 9 41 20 PM" src="https://github.com/user-attachments/assets/166542eb-2097-4b2b-8130-8f6fd8c680ce" /> ## Why The TUI previously only reported that Claude Code import started or finished. Users could not see how many items of each type were selected or how many actually imported versus failed.
charlesgong-openai ·
2026-06-24 08:56:57 -07:00 -
test: use automatic environments in app-server integration tests (#29789)
## Why Topology-neutral app-server integration tests should exercise automatic environment selection so the same setup covers local and remote executors. ## What Migrate eligible tests to `TestAppServer::new_with_auto_env()` and `send_thread_start_request_with_auto_env()`. Leave explicit-topology tests unchanged, and skip the request-permissions case on Windows with a TODO for cross-platform tool routing. ## Validation - `just test -p codex-app-server` - `bazel test //codex-rs/app-server:app-server-all-wine-exec-test --test_output=errors` Stacked on #29788.
Adam Perry @ OpenAI ·
2026-06-23 22:48:06 -07:00 -
test: run app-server integration tests under Wine (#29788)
## Why Made a mistake when carving #29746 out of my local changes and the test was missing from the build graph. Oops! ## What Enable the app-server Wine exec test target. Remove the `manual` tag from generated Wine-exec test variants so wildcard Bazel test invocations select them. Refactor the smoke test to ensure it passes with current Windows support.
Adam Perry @ OpenAI ·
2026-06-24 05:23:29 +00:00 -
connectors: own app metadata types (#29723)
## Why Connector metadata is consumed by connector discovery, ChatGPT integration, core, and TUI code. Treating app-server's wire DTO as the shared domain model reverses the intended dependency direction. ## What changed - Added connector-owned app branding, review, screenshot, metadata, and info types. - Added explicit conversions in app-server and TUI while preserving app-server's wire payloads. - Removed production app-server-protocol dependencies from connectors and ChatGPT connector code. ## Stack This is PR 4 of 6, stacked on [PR #29722](https://github.com/openai/codex/pull/29722). Review only the delta from `codex/split-config-layer-types`. Next: [PR #29724](https://github.com/openai/codex/pull/29724). ## Validation - Connector and tools coverage passed. - App-server app-list coverage passed: 13 tests.
Adam Perry @ OpenAI ·
2026-06-23 22:08:23 -07:00 -
config: own layer provenance types (#29722)
## Why Config layer provenance describes how effective configuration was assembled, so it belongs with the config loader rather than in app-server's serialized API types. ## What changed - Moved `ConfigLayerSource`, `ConfigLayerMetadata`, and `ConfigLayer` ownership into `codex-config`. - Kept app-server's wire payloads unchanged and added explicit conversions at the app boundary. - Removed lower-level app-server-protocol dependencies from config consumers. ## Stack This is PR 3 of 6, stacked on [PR #29721](https://github.com/openai/codex/pull/29721). Review only the delta from `codex/split-auth-domain-types`. Next: [PR #29723](https://github.com/openai/codex/pull/29723). ## Validation - `codex-config` coverage passed. - App-server config-manager and config RPC coverage passed.
Adam Perry @ OpenAI ·
2026-06-24 04:03:04 +00:00 -
[plugins] Enforce marketplace source admission requirements (#29753)
## Why Managed marketplace source requirements only become effective when every local marketplace mutation path applies the same admission decision. This change centralizes that decision so CLI, app-server, and external-agent migration flows cannot add, install from, or refresh a disallowed source. ## What changed - Match exact normalized Git repository URLs with an optional exact `ref`. - Match Git hosts with managed regular expressions. - Match local marketplaces by exact absolute path. - Preserve the expected path/name boundary for managed OpenAI marketplaces. - Enforce source admission during marketplace add, plugin install, and configured Git marketplace upgrade. - Continue upgrading independent marketplaces when one source is rejected and return a per-marketplace error. - Load the effective requirements stack at CLI, app-server, and external-agent migration entry points. This PR does not filter already configured marketplaces at runtime; that remains in draft follow-up #29691. ## Stack This is PR 2 of 3 and is based on #29690, which introduces the requirements data shape and merge behavior. ## Test plan - Source matcher coverage for Git URL/ref, host-pattern, local-path, and managed marketplace cases. - Marketplace add and plugin install coverage for allowed and rejected sources. - Marketplace upgrade coverage for rejection and per-marketplace continuation.
xl-openai ·
2026-06-23 20:13:11 -07:00 -
auth: move domain mode below app wire types (#29721)
## Why Authentication mode is a domain concept used by login, model selection, telemetry, and transports. Keeping the canonical type in app-server protocol forces those lower-level crates to depend on an unrelated wire API. ## What changed - Added canonical `codex_protocol::auth::AuthMode` domain values. - Kept the app-server wire DTO unchanged and added an explicit app-side conversion. - Removed production app-server-protocol dependencies from login, model-provider-info, models-manager, and otel call paths. ## Stack This is PR 2 of 6, stacked on [PR #29714](https://github.com/openai/codex/pull/29714). Review only the delta from `codex/split-json-rpc-protocols`. Next: [PR #29722](https://github.com/openai/codex/pull/29722). ## Validation - Auth and login coverage passed in the focused protocol/domain test run. - App-server account and auth conversion coverage passed.
Adam Perry @ OpenAI ·
2026-06-24 03:10:20 +00:00 -
[codex] Ignore local curated plugins when remote catalog is active (#29765)
## Summary - suppress configured `openai-curated` plugins when the remote plugin feature is enabled and auth uses the Codex backend - preserve `openai-api-curated` and non-Codex-backend behavior while including remote catalog activation in the plugin load cache key - add core plugin coverage and an app-server integration test for runtime feature enablement ## Why The Codex app enables remote plugins through process-local runtime feature enablement, which can happen after app-server startup tasks have already observed legacy local plugin state. The existing conflict logic only preferred a remote plugin when the same plugin was already installed remotely, so a configured legacy-only plugin could continue exposing skills and other capabilities from `openai-curated`. ## Impact When the remote catalog is active, legacy `openai-curated` plugins no longer contribute skills, MCP servers, apps, or hooks. Remote installed plugins continue to load normally, and `openai-api-curated` remains unaffected. This does not change remote fetch, bundle sync, or uninstall behavior. ## Validation - `just test -p codex-core-plugins remote_global_catalog_ignores_local_curated_plugins remote_plugin_feature_keeps_local_curated_without_codex_backend` - `just test -p codex-app-server runtime_remote_plugin_enablement_excludes_local_curated_plugin_skills` - `just fmt` - `git diff --check`
xl-openai ·
2026-06-23 19:51:31 -07:00 -
Let image generation extension hosts control output persistence (#29711)
## Why Some extension hosts need generated images returned without writing them to the local filesystem or giving the model a local path. ## What changed **tl;dr**: we now conduct all extension operations in the image gen extension - Let hosts provide an optional image save root when installing the extension. - Save images and return path hints only when a save root is configured. - Return image data without saving or adding a path hint when no save root is configured. - Preserve the extension-provided `saved_path` instead of persisting extension images again in core. - Leave built-in image generation unchanged. ## Validation - `just test -p codex-image-generation-extension` - `just test -p codex-app-server standalone_image_generation_returns_saved_path_hint_to_model` - `just test -p codex-core extension_tool_uses_granted_turn_permissions_without_local_persistence` - `just test -p codex-core tools::handlers::extension_tools::tests` - tested on CODEX CLI on both save_root: CODEX_HOME and None - tested on CODEX APP on both as well
Won Park ·
2026-06-23 18:51:49 -07:00 -
test: add app-server auto environment helper (#29746)
## Why Start moving towards app-server tests defaulting to running against remote & foreign OS executors. To do so we need a point of indirection similar to core integration tests' `build_with_auto_env`, but with the flexibility of letting tests control environment registration if they need to. ## What This adds: - `TestAppServer::new_with_auto_env()` for constructing an app server with a default environment defined by the test runner (e.g. bazel) - `TestAppServer::auto_env_params()` for tests to easily acquire turn env params tailored to the automatic environment - `TestAppServer::send_thread_start_request_with_auto_env()` to make it easy for tests to start a thread using the automatic environment The above methods all fail if the test calling them has set up an environment where the automatic environment configuration conflicts with test-created state. ## Validation Adds a couple of basic smoke tests to the app-server test suite. Follow-ups will migrate more tests to use it.
Adam Perry @ OpenAI ·
2026-06-24 01:06:29 +00:00 -
Support thread-level originator overrides (#29477)
## Why Work(TPP) threads can be launched from the Desktop app, but if they all keep the Desktop app's default originator then downstream attribution cannot distinguish local Work launches from cloud-backed Work launches. `thread/start.serviceName` already carries that launch signal, while `SessionMeta.originator` is the durable thread-level value that survives resume and fork. This change converts the Desktop Work service names into an effective originator at thread creation time, persists that originator with the thread, and keeps using it for later model requests and memory writes. ## What changed - Map `CODEX_WORK_LOCAL` and `CODEX_WORK_CLOUD` service names to per-thread originators, while preserving `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` as the highest-precedence override. - Persist the effective originator in `SessionMeta.originator`, read it back on resume/fork, and inherit the parent originator for subagent spawns when there is no persisted session metadata. - Handle truncated `SpawnAgentForkMode::LastNTurns` forks by falling back to the live parent originator when the forked history no longer includes `SessionMeta`. - Thread the per-thread originator through Responses headers, websocket/compaction request paths, thread-store creation, rollout metadata, and memory stage-one telemetry. ## Verification - `just test -p codex-core agent::control::tests::spawn_thread_subagent_inherits_parent_originator_without_fork agent::control::tests::spawn_thread_subagent_fork_last_n_turns_inherits_parent_originator_without_session_meta thread_manager::tests::originator_override_precedes_service_name_remapping` - `just test -p codex-core agent::control::tests::resume_thread_subagent_restores_stored_metadata_and_effective_multi_agent_mode` - `just test -p codex-memories-write` - `just fix -p codex-core -p codex-memories-write` - `git diff --check`
alexsong-oai ·
2026-06-23 17:23:38 -07:00 -
[codex] rename rollout budget error to session budget error (#29744)
## Summary - rename the rollout-budget exhaustion error from `RolloutBudgetExceeded` to `SessionBudgetExceeded` - expose the matching app-server v2 wire value as `sessionBudgetExceeded` - regenerate JSON/TypeScript schema fixtures and update the app-server docs and focused tests This is a naming-only follow-up to #29715 based on [Pavel's review suggestion](https://github.com/openai/codex/pull/29715#discussion_r3463183480). Runtime behavior is unchanged. ## Tests - `just test -p codex-core rollout_budget` - `just test -p codex-app-server-protocol` - `just fmt` - `just write-app-server-schema`
rka-oai ·
2026-06-23 16:49:13 -07:00 -
[codex] surface rollout budget exhaustion (#29715)
## Summary - surface shared rollout-budget exhaustion as `CodexErr::RolloutBudgetExceeded` instead of a generic interrupted turn - map it through the existing `CodexErrorInfo` and app-server v2 `codexErrorInfo` path - keep local compaction from retrying after the shared rollout budget is exhausted This gives app-server clients a stable `rolloutBudgetExceeded` error they can classify without guessing from `status="interrupted"`. ## Tests - `just test -p codex-core rollout_budget`
rka-oai ·
2026-06-23 15:01:28 -07:00 -
Make selected plugin roots URI-native (#28918)
## Why Selected capability roots belong to the executor filesystem, not the app-server host. Converting their path strings into the host's native `Path` breaks whenever the two machines use different path conventions, such as a Windows executor behind a Unix app-server. This PR establishes `PathUri` as the selected-plugin boundary so the executor remains authoritative for its paths. ## What changed - Require `selectedCapabilityRoots[].location.path` to be a canonical `file:` URI and deserialize it directly as `PathUri`; native path strings are rejected. - Update the app-server schema, generated TypeScript, examples, and request coverage for the URI contract. - Keep selected roots, resolved plugin locations, manifest paths, and manifest resources as `PathUri`. - Inspect and read plugin roots and manifests only through the selected environment's `ExecutorFileSystem`. - Parse executor manifests with the shared URI-native parser from #29620 instead of projecting them onto the host filesystem. - Enforce resource containment lexically and preserve the root URI's POSIX or Windows path convention. - Cover foreign Windows plugin roots and URI-native manifest resources. ```text thread/start selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo" | PathUri v ExecutorFileSystem | +--> plugin.json +--> manifest resources ``` This PR stops at the shared selected-plugin representation. The next two PRs remove the remaining host-path projections in the skill and MCP consumers. ## Stack 1. #29614 — add lexical `PathUri` containment. 2. #29620 — share URI-native manifest path resolution. 3. **This PR** — keep selected plugin roots and resources URI-native. 4. #29626 — load executor skills without host path conversion. 5. #29628 — resolve executor MCP working directories without host path conversion.
jif ·
2026-06-23 22:51:19 +01:00