mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
2e69966cd8fbb3f0265152af778dd5512c66d151
190 Commits
-
Make selected plugin roots URI-native (#28918)
## Why Selected capability roots belong to the executor filesystem, not the app-server host. Converting their path strings into the host's native `Path` breaks whenever the two machines use different path conventions, such as a Windows executor behind a Unix app-server. This PR establishes `PathUri` as the selected-plugin boundary so the executor remains authoritative for its paths. ## What changed - Require `selectedCapabilityRoots[].location.path` to be a canonical `file:` URI and deserialize it directly as `PathUri`; native path strings are rejected. - Update the app-server schema, generated TypeScript, examples, and request coverage for the URI contract. - Keep selected roots, resolved plugin locations, manifest paths, and manifest resources as `PathUri`. - Inspect and read plugin roots and manifests only through the selected environment's `ExecutorFileSystem`. - Parse executor manifests with the shared URI-native parser from #29620 instead of projecting them onto the host filesystem. - Enforce resource containment lexically and preserve the root URI's POSIX or Windows path convention. - Cover foreign Windows plugin roots and URI-native manifest resources. ```text thread/start selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo" | PathUri v ExecutorFileSystem | +--> plugin.json +--> manifest resources ``` This PR stops at the shared selected-plugin representation. The next two PRs remove the remaining host-path projections in the skill and MCP consumers. ## Stack 1. #29614 — add lexical `PathUri` containment. 2. #29620 — share URI-native manifest path resolution. 3. **This PR** — keep selected plugin roots and resources URI-native. 4. #29626 — load executor skills without host path conversion. 5. #29628 — resolve executor MCP working directories without host path conversion.
jif ·
2026-06-23 22:51:19 +01:00 -
chore(core) rm AskForApproval::OnFailure (#28418)
## Summary Deletes the OnFailure variant of the `AskForApproval` enum. This option has been deprecated since #11631. ## Testing - [x] Tests pass
Dylan Hurd ·
2026-06-23 12:13:54 -07:00 -
feat(core): store turn_id on ResponseItem metadata (#28360)
## Description This PR is a followup to https://github.com/openai/codex/pull/28355 and starts assigning `internal_chat_message_metadata_passthrough.turn_id` to durable Responses API items created during a turn. The goal is that those items keep the `turn_id` that introduced them when Codex resends stateless HTTP context, reconstructs history for resume/fork paths, or reuses websocket response state. ## What changed - Set `internal_chat_message_metadata_passthrough.turn_id` when missing as response items enter durable history, initial/replacement history, inter-agent communication history, and local compaction summaries. - Preserve existing item turn IDs instead of overwriting them during persistence, resume reconstruction, compaction, forked history, and websocket incremental reuse. - Keep `compaction_trigger` fieldless because it is a request control, not a durable response item. - Update focused history/request assertions and fixtures for stateless requests, websocket incrementals, compaction, thread injection, prompt debug, and related CI coverage.
Owen Lin ·
2026-06-22 16:45:14 -07:00 -
core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
## Description This PR cuts Codex over from generic `ResponseItem.metadata` (introduced here: https://github.com/openai/codex/pull/28355) to `ResponseItem.internal_chat_message_metadata_passthrough`, which is the blessed path and has strongly-typed keys. For now we have to drop this MAv2 usage of `metadata`: https://github.com/openai/codex/pull/28561 until we figure out where that should live.
Owen Lin ·
2026-06-22 11:11:25 -07:00 -
Add workspace messages app-server API (#29001)
## Summary - Add backend-client types and fetch support for active workspace messages. - Add the app-server v2 `account/workspaceMessages/read` method, generated schemas, and README documentation. - Delegate workspace-message eligibility to the Codex backend feature gate; map a backend 404 to `featureEnabled: false`. ## Testing - `just write-app-server-schema` - `just test -p codex-backend-client` - `just test -p codex-app-server-protocol` - `just test -p codex-app-server workspace_messages` - `just fix -p codex-backend-client -p codex-app-server-protocol -p codex-app-server` - `just fmt` ## Stack - Base PR for #28232, which adds the TUI status-line integration.
xli-oai ·
2026-06-22 04:25:07 -07:00 -
Simplify multi-agent mode controls (#29324)
## Why Multi-agent delegation policy was split across `multiAgentMode`, `features.multi_agent_mode`, and `usage_hint_enabled`. These controls could disagree: a requested mode could be downgraded by the feature flag, and disabling usage hints also disabled mode instructions. Some clients also need multi-agent tools without adding delegation-policy text to model context. The previous two-mode API could not express that directly. ## What changed `multiAgentMode` is now the only live delegation-policy control: | Mode | Behavior | | --- | --- | | `none` | Keep multi-agent tools available without adding mode instructions. | | `explicitRequestOnly` | Only delegate after an explicit user request. | | `proactive` | Delegate when parallel work materially improves speed or quality. | - new threads default to `explicitRequestOnly`; omitting the mode on later turns keeps the current value - thread start, resume, fork, and settings responses always report the concrete current mode instead of `null` - mode selection remains sticky across turns and resume - usage-hint text no longer controls whether mode instructions apply - `features.multi_agent_mode` and `usage_hint_enabled` remain accepted as ignored compatibility settings so existing configs continue to load - app-server documentation and generated schemas describe the three-mode API ## Tests - `just test -p codex-core multi_agent_mode` - `just test -p codex-core multi_agent_v2_config_from_feature_table` - `just test -p codex-core spawn_agent_description` - `just test -p codex-features` - `just test -p codex-app-server-protocol` - `just test -p codex-app-server multi_agent_mode`
jif ·
2026-06-22 10:05:36 +02:00 -
Add per-turn multi-agent mode (#28685)
## Why Multi-agent v2 currently carries an explicit-request-only delegation rule in its static usage hint. That provides a safe default, but it prevents clients from selecting proactive delegation per turn without changing static guidance or rewriting prior model context. This change makes delegation mode a session selection that can be updated through `turn/start`, while deriving the effective model-visible mode separately for each turn. Eligible multi-agent v2 turns remain explicit-request-only unless proactive mode is both selected and enabled. ## What changed - Add the experimental `turn/start.multiAgentMode` parameter with `explicitRequestOnly` and `proactive` values. Omission retains the loaded session's current optional selection. - Add the default-off `features.multi_agent_mode` feature gate. Eligible multi-agent v2 turns use the selected mode when enabled; an unset selection or disabled gate resolves to `explicitRequestOnly`. - Treat mode prompting as inapplicable for multi-agent v1 and other unsupported session configurations, producing no multi-agent mode developer message rather than rejecting the turn. - Move the explicit-request-only rule out of the static v2 usage hint and into a bounded, tagged developer context fragment. - Emit the effective mode in initial context and only when that effective mode changes on later turns. - Persist the effective mode in `TurnContextItem` as the durable baseline for resume and context-update comparisons. Historical rollout items are not rewritten. Later mode developer messages establish the current rule incrementally. ## Not covered - Initial selection through `thread/start` and selected-mode reporting from thread lifecycle/settings APIs; those are isolated in the stacked #28792. - A TUI control or slash command for selecting the mode. - Persisting a preferred mode to `config.toml`; selection remains session/turn scoped. - Changes to multi-agent concurrency limits, tool availability, or model catalog capability declarations. - Rewriting historical rollout prompt items. Cold resume restores the latest persisted effective mode when available while leaving historical developer messages intact. ## Verification - `CARGO_INCREMENTAL=0 just test -p codex-core multi_agent_mode` - Focused app-server coverage verifies that `turn/start.multiAgentMode` produces proactive developer instructions for an eligible v2 turn. ## Stack Followed by #28792, which adds `thread/start` initialization and lifecycle/settings observability.
Shijie Rao ·
2026-06-18 22:47:51 -07:00 -
[codex] Assign response item IDs when recording history (#28814)
## Why Client-created response items enter history without IDs, so their identity is lost across rollout persistence and resume. IDs should be assigned once at the history-recording boundary, while IDs returned by the server must remain unchanged. The Responses API validates item IDs using type-specific prefixes. Locally generated IDs therefore use the matching prefix plus a hyphenated UUIDv7, keeping them valid while distinguishable from server-generated IDs. Because this changes persisted history and provider request shapes, the behavior is opt-in behind the under-development `item_ids` feature. Compaction triggers remain request controls whose API shape does not accept an ID. ## What changed - Register the disabled-by-default `item_ids` feature and expose it in `config.schema.json`. - Make supported optional `ResponseItem` IDs serializable and expose them in the generated app-server schemas. - When `item_ids` is enabled, assign an ID during conversation-history preparation if an item has no ID. - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API item conventions. - Preserve existing server IDs without rewriting them. - Persist assigned IDs in rollouts and include them in subsequent Responses requests. - Remove the unsupported ID field from `CompactionTrigger` and document why it has no ID. - Add integration coverage for enabled ID persistence, preservation of server IDs, and omission of generated IDs while the feature is disabled. `prepare_conversation_items_for_history` is the single response-item ID allocation boundary. ## Test plan - `just test -p codex-features` - `just test -p codex-core response_item_ids_persist_across_resume_and_preserve_server_ids` - `just test -p codex-core non_openai_responses_requests_omit_item_turn_metadata` - `just test -p codex-core resize_all_images_prepares_failures_before_history_insertion` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
pakrym-oai ·
2026-06-18 17:30:55 -07:00 -
Always use AVAS for realtime WebRTC calls (#28856)
## Summary - Remove the realtime `architecture` selector from core protocol, app-server protocol, config parsing, generated schemas, and callers. - Always create WebRTC realtime calls with the AVAS query params: `intent=quicksilver&architecture=avas`. - Keep direct websocket realtime behavior on the existing config/default path, while WebRTC starts without an explicit version now default to realtime v1 because AVAS requires v1. ## Notes - WebRTC realtime now means AVAS. If a caller explicitly asks to start WebRTC with realtime v2, Codex rejects that request because the AVAS WebRTC path only supports realtime v1. Websocket realtime is separate and can still use realtime v2. - The old `[realtime] architecture = "realtimeapi" | "avas"` config knob is removed. Local configs that still set it will need to delete that line. - Some app-server tests that were only trying to exercise realtime v2 protocol behavior now use websocket transport, because WebRTC is intentionally locked to AVAS/v1. Separate WebRTC tests cover the AVAS query params, v1 startup, SDP flow, and sideband join. ## Validation - Merged fresh `origin/main` at `83e6a786a2`. - `just fmt` - `just write-config-schema` - `just write-app-server-schema` - `git diff --check` - `just test -p codex-api -p codex-core -p codex-app-server-protocol -p codex-app-server realtime` (176 passed) - `just test -p codex-protocol -p codex-config` (413 passed)
Peter Bakkum ·
2026-06-18 19:11:21 -05:00 -
Support
openai/formextended form elicitations (#27500)# Summary Allow App Server clients to opt into `openai/form` MCP elicitations.
Gabriel Peal ·
2026-06-18 11:54:49 -07:00 -
[codex] Support assistant realtime append text (#28836)
## Why Frontend realtime voice continuity needs to replay a tiny previous-session overlap as actual conversation items, including assistant text. The app-server `thread/realtime/appendText` API already carries a role through to the Rust realtime websocket layer, but the shared role enum only accepted `user` and `developer`. ## What Changed - Added `assistant` to `ConversationTextRole` and regenerated the app-server schema/type fixtures. - Added `output_text` as a realtime conversation content type. - Updated realtime websocket item creation so assistant appendText emits `content: [{ type: "output_text", text }]`, while user and developer continue to emit `input_text`. - Updated app-server docs and tests to cover assistant appendText alongside the existing developer role behavior. ## Validation - `just write-app-server-schema` - `just fmt` (first sandboxed attempt failed because `uv` could not access `~/.cache/uv`; reran with filesystem access and passed) - `just test -p codex-api` passed: 126/126 - `just test -p codex-app-server-protocol` passed: 239/239, including generated JSON/TypeScript fixture checks - `just test -p codex-app-server` was started locally but stopped per request after unrelated local sandbox/Seatbelt failures (`sandbox-exec: sandbox_apply: Operation not permitted`) and one missing local `codex` binary failure; CI should be faster and more authoritative for the full suite.guinness-oai ·
2026-06-17 20:57:13 -07:00 -
[codex] Add optional IDs to response items (#28812)
## Why `ResponseItem` variants do not have a consistent internal ID shape: some variants carry required IDs, some carry optional IDs, and some cannot represent an ID at all. The existing fields also use inconsistent serde, TypeScript, and JSON-schema annotations. A single enum-level access path is needed before history recording can assign and retain IDs. This PR establishes that internal model only. It intentionally does not generate or serialize IDs; allocation and wire persistence are isolated in the stacked follow-up. ## What changed - Give every concrete `ResponseItem` variant an `Option<String>` ID field. - Apply the same internal-only annotations to every ID field: `#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and `#[schemars(skip)]`. - Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared accessors. - Preserve IDs when history items are rewritten for truncation. - Adapt consumers that previously assumed reasoning and image-generation IDs were required. - Regenerate app-server schemas so the hidden fields are represented consistently. The serde catch-all `ResponseItem::Other` remains ID-less because it must remain a unit variant. ## Test plan - `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace -p codex-image-generation-extension` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-api -p codex-rollout-trace -p codex-image-generation-extension` - `just test -p codex-core event_mapping`
pakrym-oai ·
2026-06-17 18:27:43 -07:00 -
[codex] Track plugin install and import telemetry failures (#28731)
## Summary - Track plugin install failures through the unified `codex_plugin_install_failed` event for local installs, remote install preflight failures, bundle failures, and remote catalog/backend failures. - Send classified `error_type` values in plugin install failure analytics instead of raw error strings. - Stop sending raw external-agent import errors in analytics while preserving raw failure details in app-facing import notifications/history. - Keep raw plugin/migration diagnostics in `tracing::warn!` logs. - Keep remote failure plugin names as the existing local placeholder (`unknown`) and remove the extra telemetry plugin-name override. - Change `ExternalAgentConfigImportParams.source` from a generated enum to `string | null`, with legacy `claudeCode` / `claudeCowork` inputs normalized to existing analytics values. ## Testing
charlesgong-openai ·
2026-06-17 13:16:34 -07:00 -
[codex] Restore thread recency with compatible migration history (#28671)
## Summary - Revert #28655, restoring the thread `recencyAt` behavior introduced by #27910. - Move `threads_recency_at` to migration 0039 so it no longer collides with `external_agent_config_imports` at version 0038. - Repair databases that already applied the recency migration as version 38 by moving the matching migration-history row to version 39 before SQLx validation. The current version-38 migration can then apply normally. ## Validation - `just test -p codex-state migrations::tests::repairs_recency_migration_that_was_applied_as_version_38` - `just test -p codex-state -p codex-rollout -p codex-thread-store -p codex-app-server-protocol -p codex-tui`: 3,439 passed; six TUI tests could not open the machine's existing read-only incident database at `~/.codex/sqlite/state_5.sqlite`. - `just fix -p codex-state` - `just fmt` - Verified that state migration versions are unique.
Jeremy Rose ·
2026-06-17 18:52:18 +00:00 -
Add join key for MAv2 inter-agent messages (#28561)
## Summary This keeps inter-agent communication on the existing raw response item path and adds a join key for MAv2 tool calls. MAv2 `spawn_agent`, `send_message`, and `followup_task` now stamp the originating tool call id into `ResponseItemMetadata.source_call_id` on the raw `ResponseItem::AgentMessage`. App-server clients can join that raw item back to the existing tool/activity event by call id, while using the raw agent message's existing sender, receiver, and content fields. No new app-server `ThreadItem` or notification type is added. ## Tests - `just fmt` - `just write-app-server-schema` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-core multi_agent_v2_spawn_returns_path_and_send_message_accepts_relative_path` - `just test -p codex-core multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn` - `just fix -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core`
jif ·
2026-06-17 14:48:56 +02:00 -
Revert thread recencyAt for sidebar ordering (#28655)
## Why Revert #27910 to remove the newly introduced thread `recencyAt` persistence and API behavior from `main`. ## What changed This reverts commit `fac3158c2a783095768076489815f361fa9b0db4`, including the state migration, thread-store propagation, app-server API surface, generated schemas, and related tests. ## Validation Not run before opening; relying on CI for the initial fast signal.
pakrym-oai ·
2026-06-16 21:39:30 -07:00 -
Add thread recencyAt for sidebar ordering (#27910)
## Summary Add a server-owned `recencyAt` timestamp and `recency_at` thread-list sort key for product recency ordering while preserving the existing meaning of `updatedAt` as the latest persisted thread mutation. This is the server-side alternative to #27697. Rather than narrowing `updatedAt`, clients can sort the sidebar by `recency_at` and continue treating `updatedAt` as mutation time. Paired Codex Apps PR: [openai/openai#1024599](https://github.com/openai/openai/pull/1024599) ## Contract - `recencyAt` initializes when a thread is created. - A turn start advances `recencyAt` monotonically. - Commentary, agent output, tool results, token/accounting updates, turn completion, archive, unarchive, resume, and generic metadata writes do not advance it. - `updatedAt` retains its existing behavior and continues to advance for persisted thread mutations. - Current servers populate `recencyAt`; the response field is optional in generated TypeScript so clients connected to older servers can fall back to `updatedAt`. - Filesystem-only fallback uses existing updated/mtime ordering when SQLite is unavailable. ## Persistence and compatibility Migration 0038 adds second- and millisecond-precision recency columns, backfills them from the existing updated timestamp, creates list indexes, and includes an insert trigger so older binaries writing to a migrated database seed recency without causing later mutations to advance it. Generic metadata upserts preserve existing recency values. Turn-start updates use a dedicated monotonic touch, and process-local allocation keeps millisecond cursor values unique. State DB list, search, read, filtered-list repair, rollout fallback propagation, and app-server conversions all carry the new field. ## API `Thread` responses include: ```ts recencyAt?: number ``` `thread/list` and `thread/search` accept: ```json { "sortKey": "recency_at" } ``` Generated TypeScript and JSON schemas are included. ## Validation - `just test -p codex-state` — 146 passed - `just test -p codex-rollout` — 69 passed - `just test -p codex-thread-store` — 81 passed - `just test -p codex-app-server-protocol` — 231 passed - Focused app-server list ordering, response mapping, archive/unarchive, and resume lifecycle tests passed - Scoped `just fix` for state, rollout, thread-store, app-server-protocol, and app-server - `just fmt` - `git diff --check` - Independent correctness, simplicity, elegance, security, and test-quality reviews; actionable ordering, lifecycle, query-projection, and timestamp-uniqueness findings were addressed
Jeremy Rose ·
2026-06-16 17:06:22 -07:00 -
app-server: preserve target-native environment cwd (#28146)
## Why app-server may run on a different OS from the selected exec-server environment. Parsing that environment’s cwd with the Codex host’s path rules prevents thread startup. ## What Carry environment cwd values as `LegacyAppPathString` at the app-server boundary and `PathUri` internally. Existing tool-call schemas and relative-path behavior stay host-native; remaining local-only consumers convert explicitly and leave follow-up TODOs. The Wine integration test verifies app-server can start a thread and complete an ordinary turn with a Windows environment cwd from Linux. ## Validation - `bazel test //codex-rs/core/tests/remote_env_windows:smoke-test --test_output=errors` - focused app-server environment-selection and protocol schema tests - scoped Clippy for `codex-core` and `codex-app-server-protocol`
Adam Perry @ OpenAI ·
2026-06-16 21:42:28 +00:00 -
[codex] Record external agent import results (#28396)
## Summary - restore `externalAgentConfig/import/progress` notifications while keeping `externalAgentConfig/import/completed` as the must-deliver event - persist completed external-agent config imports in state DB by `importId`, including concrete success/failure details for config, AGENTS.md, skills, plugins, MCP servers, subagents, hooks, commands, and sessions - add `externalAgentConfig/import/readHistories` so clients can recover persisted import results after missing the live completion notification - include `errorType` on import failures in protocol responses/notifications and persisted DB JSON so future code can classify failures without another wire/storage shape change ## Validation - `git diff --check` - `just test -p codex-state external_agent_config_imports` - `just test -p codex-app-server-protocol` - `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-sqlite-read-details just test -p codex-app-server external_agent_config_import_sends_completion_notification_for_sync_only_import` Also ran earlier broader checks before publishing: - `just test -p codex-state` - `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-external-agent-test-sqlite just test -p codex-app-server external_agent_config` - `just test -p codex-external-agent-migration`
charlesgong-openai ·
2026-06-15 23:17:24 -07:00 -
[codex] Add created-by-me remote plugin marketplace (#28203)
## Summary - add the `created-by-me-remote` marketplace backed by paginated `scope=USER` plugin directory and installed-plugin requests - include USER plugins in installed-plugin caching, bundle sync, and stale-cache cleanup without client-side discoverability filtering - expose the marketplace through app-server v2 and regenerate the protocol schemas ## Testing - `cargo build -p codex-app-server --bin codex-app-server` - production-auth `plugin/list` smoke test for `created-by-me-remote` (returned the expected USER plugin as installed and enabled) - `just test -p codex-core-plugins` (221 passed) - `just test -p codex-app-server-protocol` (231 passed) - `just test -p codex-app-server suite::v2::plugin_list::` (37 passed) - `just fix -p codex-core-plugins -p codex-app-server-protocol -p codex-app-server` - `just fmt`
Eric Ning ·
2026-06-15 22:07:07 +00:00 -
feat(core): add metadata field to ResponseItem (#28355)
## Description This PR adds an optional `metadata` field to `ResponseItem` for Responses API calls. Only mechanical plumbing, no actual values populated and sent yet. Turns out just adding a new field to `ResponseItem` has quite a large blast radius already. This change is backwards compatible because `metadata` is optional and omitted when absent, so existing response items and rollout history without it still deserialize and requests that do not set it keep the same wire shape. For provider compatibility, we strip out `metadata` before non-OpenAI Responses requests so Azure and AWS Bedrock never see this field. My followup PR here will actually make use of it to start storing and passing along `turn_id`: https://github.com/openai/codex/pull/28360 ## What changed - Added `ResponseItemMetadata` with optional `turn_id`, plus optional `metadata` on Responses API item variants and inter-agent communication. - Preserved item metadata through response-item rewrites such as truncation, missing tool-output synthesis, compaction history rebuilding, visible-history conversion, rollout/resume, and generated app-server schemas/types. - Strip item metadata from non-OpenAI Responses requests while preserving it for OpenAI-shaped requests. - Updated the mechanical fixture/test construction churn required by the new optional field.
Owen Lin ·
2026-06-15 15:05:28 -07:00 -
feat(app-server): expose rate-limit reset credits (#28143)
## Why Codex users can earn personal rate-limit reset credits, but app-server clients do not currently have an API for reading or redeeming them. This adds the backend and protocol foundation used by the `/usage` TUI flow in #28154. ## What changed - Extend `account/rateLimits/read` with a nullable `rateLimitResetCredits` summary sourced from the existing usage response. - Add backend-client and app-server support for consuming a reset with a caller-generated idempotency key. A UUID is recommended, and clients reuse the same key when retrying the same logical reset. - Return only the consume `outcome`; clients refetch `account/rateLimits/read` for updated window state. - Document the response field and each consume outcome, and regenerate the JSON and TypeScript schema fixtures. - Clarify in `AGENTS.md` that new app-server string enum values use camelCase on the wire. - Update the existing TUI response fixture for the expanded protocol shape. - Add coverage for authentication, response mapping, backend failures, consume outcomes, and request timeout behavior. ## Validation - `just test -p codex-app-server-protocol` — 231 passed. - `just test -p codex-backend-client` — 14 passed. - Focused `codex-app-server` reset-credit tests — 5 passed. - Focused `codex-tui` protocol response fixture test — passed. - `just fix -p codex-backend-client -p codex-app-server-protocol -p codex-app-server` — passed. - `just fmt` — passed.
jay ·
2026-06-15 21:54:01 +00:00 -
Expose explicit dynamic tool namespaces in thread start (#27371)
Stacked on #27365. ## Stack note [#27365](https://github.com/openai/codex/pull/27365) kept `thread/start` unchanged and converted its input in `thread_processor`. This PR updates `thread/start` to accept explicit functions and namespaces directly. Legacy per-tool arrays are still accepted and converted while reading the request. As a result, `thread_processor` can validate and pass the tools through directly, which is why some code added in #27365 is removed here. ## Why `thread/start.dynamicTools` still repeats namespace data on each function even though core now stores explicit namespace groups. The request API should use the same shape so each namespace has one description and one member list. ## What changed - Accept top-level functions and explicit namespace objects in `dynamicTools`. - Continue accepting fully legacy flat arrays, including `exposeToContext`. - Reject arrays that mix legacy and canonical entries. - Reuse the protocol types directly and remove the temporary app-server adapter. - Update validation, docs, the test client, and generated schemas. ## Test plan - `just test -p codex-app-server-protocol` - `just test -p codex-app-server dynamic_tool_call_round_trip_sends_text_content_items_to_model` - `just test -p codex-app-server thread_start_normalizes_legacy_dynamic_tools_into_model_request` - `just test -p codex-app-server thread_start_rejects_mixed_dynamic_tool_formats` - `just test -p codex-app-server thread_start_rejects_hidden_dynamic_tools_without_namespace`
sayan-oai ·
2026-06-15 15:35:57 +00:00 -
[codex] add roles to realtime append text (#27936)
## Summary Add an explicit `user` or `developer` role to `thread/realtime/appendText` and propagate it through the realtime input queue into `conversation.item.create`. Older JSON clients that omit the field continue to default to `user`. This lets app-provided context such as memory retain developer authority without bypassing app-server through a renderer-owned data channel. The app-server schemas, API documentation, and focused protocol and websocket coverage are updated with the new contract. The Codex Apps consumer is tracked in [openai/openai#1025261](https://github.com/openai/openai/pull/1025261).
Alex Gamble ·
2026-06-12 15:05:37 -07:00 -
Support plaintext agent messages (#27830)
## Why Multi-agent v2 `send_message` deliveries already reach the receiving model as typed `agent_message` items with encrypted content. Child-completion notifications are generated by Codex itself, so their content is plaintext and previously fell back to a serialized JSON envelope inside an assistant message. With plaintext `input_text` supported for `agent_message`, both delivery paths can use the same model-visible type while preserving explicit author and recipient metadata. ## What changed - add plaintext `input_text` support to `AgentMessageInputContent` and regenerate the affected app-server schemas - preserve `InterAgentCommunication` as structured mailbox input instead of converting it to assistant text - record delivered communications as typed `agent_message` history items - persist a dedicated rollout item so local delivery metadata such as `trigger_turn` remains available without leaking into the Responses request - reconstruct typed agent messages on resume and preserve fork-turn truncation behavior - remove request-time assistant-content parsing - preserve plaintext and encrypted inter-agent deliveries in stage-one memory inputs - normalize and link plaintext and encrypted agent messages in rollout traces without treating inbound messages as child results - cover the real MultiAgent V2 child-completion path end to end with deterministic mailbox synchronization ## Verification - `just test -p codex-core plaintext_multi_agent_v2_completion_sends_agent_message` - `just test -p codex-core input_queue_drains_mailbox_in_delivery_order record_initial_history_reconstructs_typed_inter_agent_message fork_turn_positions_use_inter_agent_delivery_metadata` - `just test -p codex-memories-write serializes_inter_agent_communications_for_memory` - `just test -p codex-rollout-trace agent_messages_preserve_routing_and_content sub_agent_started_activity_creates_spawn_edge` - `just test -p codex-rollout-trace agent_result_edge_falls_back_to_child_thread_without_result_message` - `just test -p codex-protocol -p codex-rollout -p codex-app-server-protocol`
jif ·
2026-06-12 13:50:04 -07:00 -
realtime: add AVAS architecture override (#27720)
## Summary Adds a `RealtimeConversationArchitecture` option for realtime conversation startup, with `realtimeapi` as the default and `avas` as an opt-in architecture. The AVAS path is limited to realtime v1 conversational WebRTC starts, and WebRTC call creation appends `intent=quicksilver&architecture=avas` to `/v1/realtime/calls`. The existing sideband websocket still joins by `call_id`. This also exposes the per-session architecture override through app-server v2 `thread/realtime/start` params and updates the config schema for `[realtime].architecture`. ## Validation - `just fmt` - `just write-config-schema` - `just test -p codex-api sends_avas_session_call_query_params` - `just test -p codex-core -E 'test(~conversation_webrtc_start_uses_avas_architecture_query)'` - `just test -p codex-core -E 'test(realtime_loads_from_config_toml)'` - `just test -p codex-app-server-protocol -E 'test(~serialize_thread_realtime_start) | test(generated_ts_optional_nullable_fields_only_in_params)'` - `just test -p codex-app-server -E 'test(realtime_webrtc_start_emits_sdp_notification)'`
Peter Bakkum ·
2026-06-12 18:11:13 +00:00 -
feat(app-server): persist remote-control desired state (#27445)
## Why Remote-control runtime enablement and persisted enrollment preference were represented by separate flags. That made startup rehydration, RPC persistence, and new-enrollment seeding race with one another, and it did not cleanly distinguish runtime-only CLI or daemon starts from durable app-server RPC changes. ## What Changed - Replace the parallel enablement, seed, and rehydration flags with one transport-owned `RemoteControlDesiredState`. - Add nullable enrollment-scoped persistence and preserve existing preferences during enrollment upserts. - Rehydrate plain startup only after auth and client scope resolve, without overwriting a concurrent RPC transition. - Make ordinary `remoteControl/enable` and `remoteControl/disable` durable while retaining `ephemeral: true` for runtime-only callers. - Have the daemon explicitly request ephemeral enablement and regenerate the app-server schemas. ## Verification - Covered migration and `NULL`/`0`/`1` persistence round trips. - Covered plain-start rehydration and runtime-only versus durable enrollment seeding. - Covered durable enable, durable disable, and ephemeral enable through app-server RPC. - Covered the daemon's exact `{ "ephemeral": true }` request payload. Related issue: N/A (internal remote-control persistence architecture change).Anton Panasenko ·
2026-06-11 21:28:52 -07:00 -
Add app-server
thread/deleteAPI (#25018)## Why Clients can archive and unarchive threads today, but there is no app-server API for permanently removing a thread. Deletion also needs to cover the full session tree: deleting a main thread should remove spawned subagent threads and the related local metadata instead of leaving orphaned rollout files, goals, or subagent state behind. ## What - Adds the v2 `thread/delete` request and `thread/deleted` notification, with the response shape kept consistent with `thread/archive`. - Implements local hard delete for active and archived rollout files. - Deletes the requested thread's state DB row as the commit point, then best-effort cleans associated state including spawned descendants, goals, spawn edges, logs, dynamic tools, and agent job assignments. - Updates app-server API docs and generated protocol schema/TypeScript fixtures.
Eric Traut ·
2026-06-10 11:22:12 -07:00 -
Add per-session realtime model and version overrides (#24999)
## Why Clients need to select a realtime session configuration for an individual start without rewriting persisted configuration or restarting the app-server process. ## What Changed - Add optional `model` and `version` fields to `thread/realtime/start` - Forward those optional values through the realtime start operation and apply them only for that session - Preserve existing configured/default behavior when the new fields are omitted - Update generated protocol schema and app-server documentation ## Validation - Added/updated protocol serialization coverage for the new optional request fields - Added focused core coverage for a session override taking precedence over configured realtime selection - Added focused app-server coverage that a request override reaches the realtime WebSocket handshake
guinness-oai ·
2026-06-09 17:54:32 -07:00 -
[codex-analytics] add extensible feature thread sources (#27063)
## Why - `ThreadSource` currently defines a closed set of core-owned values - Product features also create threads for background or scheduled work - Adding every product-specific value to the core enum would require repeated `codex-rs` protocol changes - Feature-backed values let product callers provide precise attribution while preserving the existing core classifications ## What Changed - Adds `ThreadSource::Feature(String)` for app-owned thread source values - Represents all app-server v2 thread sources as scalar strings, so a feature source is supplied as `"automation"` - Persists and emits the feature's plain string label, so `"automation"` produces `thread_source="automation"` in analytics - Keeps `user`, `subagent`, and `memory_consolidation` as explicit core-owned values and regenerates the app-server schemas and TypeScript bindings ## Verification - `just write-app-server-schema` - `cargo check --workspace` - `just test -p codex-protocol feature_thread_source_serializes_as_its_app_owned_label` - `just test -p codex-app-server-protocol thread_sources_round_trip_as_scalar_labels` - `cargo test -p codex-analytics thread_initialized_event_serializes_expected_shape` - `just fmt`
marksteinbrick-oai ·
2026-06-09 12:27:10 -07:00 -
Load selected executor skills through extensions (#27184)
## Why CCA is moving toward a split runtime where the orchestrator may not have a filesystem, while executors can expose preinstalled plugins and skills. A thread therefore needs to select capabilities without asking app-server or core to interpret executor-owned paths through the orchestrator's filesystem. The longer-term model is broader than executor skills: - A plugin is a bundle of skills, MCP servers, connectors/apps, and hooks. - A plugin root can be local, executor-owned, or hosted by a backend. - Components inside one plugin can use different access and execution mechanisms. A skill may be read from a filesystem or through backend tools; an HTTP MCP server can run without an executor; a stdio MCP server or hook needs an execution environment. - Core should carry generic extension initialization data. The extension that owns a component should discover it, expose it to the model, and invoke it through the appropriate runtime. This PR establishes that architecture through one complete vertical: selecting a root on an executor, discovering the skills beneath it, exposing those skills to the model, and reading an explicitly invoked `SKILL.md` through the same executor. ## Contract `thread/start` gains an experimental `selectedCapabilityRoots` field: ```json { "selectedCapabilityRoots": [ { "id": "deploy-plugin@1", "location": { "type": "environment", "environmentId": "workspace", "path": "/opt/codex/plugins/deploy" } } ] } ``` The root is intentionally not classified as a "plugin" or "skill" in the API. It can point at a standalone skill, a directory containing several skills, or a plugin containing skills and other components. This PR only teaches the skills extension how to consume it; later extensions can resolve MCP, connector, and hook components from the same selection. The platform-supplied `id` is stable selection identity. The location says which runtime owns the root and gives that runtime an opaque path. App-server does not inspect or canonicalize the path. ## What changed ### Generic thread extension initialization App-server converts selected roots into `ExtensionDataInit`. Core carries that generic initialization value until the final thread ID is known, then creates thread-scoped `ExtensionData` before lifecycle contributors run. This keeps `Session` and core independent of the capability-selection contract. The initialization value is consumed during construction; it is not retained as another long-lived `Session` field. ### Executor-backed skills The skills extension now owns an `ExecutorSkillProvider` that: - resolves the selected environment through `EnvironmentManager` - discovers, canonicalizes, and reads skills through that environment's `ExecutorFileSystem` - contributes the bounded selected-skill catalog as stable developer context - reads an explicitly invoked skill body through the authority that listed it - warns when an environment or root is unavailable - never falls back to the orchestrator filesystem for an executor-owned root Skill catalog and instruction fragments have hard byte bounds, which also bound them below the 10K-token per-item context limit. If a selected executor skill has the same name as a legacy local skill, the executor selection owns that invocation and the local body is not injected a second time. Existing local and bundled skill loading remains in place. Omitting `selectedCapabilityRoots` therefore preserves current local-only behavior. ## Current semantics - Only environment-owned locations are represented in this first contract. - Roots are resolved by the destination extension, not by app-server or core. - An unavailable executor or invalid root produces a warning and no capabilities from that root; it does not trigger a local-filesystem fallback. - Selection applies to a newly started active thread. - MCP servers, connectors, and hooks beneath a selected plugin root are not activated yet. - Selection is not yet persisted or inherited across resume, fork, or subagent creation. Existing local capabilities continue to behave as they do today in those flows. ## Planned vertical follow-ups 1. **Hosted HTTP MCP:** add an extension-backed HTTP MCP source that works without an executor, then replace the special-purpose MCP plugins loader with that implementation. 2. **Executor MCP:** register and execute stdio MCP servers through the environment that owns the selected plugin root. 3. **Backend skills:** add a hosted skill source whose catalog and bodies are accessed through extension tools rather than a filesystem. 4. **Connectors and hooks:** activate those components through their owning extensions, using the same selected-root boundary and component-specific runtime. 5. **Durable selection:** define the desired-selection lifecycle, persist it, and make resume, fork, and subagent inheritance explicit rather than accidental. 6. **Local convergence:** incrementally route existing local plugin, skill, and MCP loading through the same extension model while preserving current local behavior. Each follow-up remains reviewable as an end-to-end capability. The platform selects roots, generic thread extension data carries the selection, and the owning extension resolves and operates its component. ## Verification Coverage added for: - app-server end-to-end discovery and explicit invocation of a skill inside an executor-selected plugin root - exclusive invocation when a selected executor skill collides with a local skill name - executor filesystem authority for discovery, canonicalization, and reads - thread extension initialization before lifecycle contributors run - stable executor catalog context, explicit invocation, context rebuilding, hidden skills, and preserved host/remote catalog behavior Targeted protocol, core-skills, skills-extension, core lifecycle, and app-server executor-skill tests were run during development.jif ·
2026-06-09 19:51:54 +02:00 -
feat(app-server): expose account token usage [1 of 2] (#25344)
## Why Token activity is useful account-level context, but terminal clients need a supported app-server path to fetch it without reaching into ChatGPT backend details directly. The API should also live under the broader account usage umbrella so future usage surfaces can be added without proliferating user-facing concepts. ## What Changed - Add `codex-backend-client` support for the ChatGPT profile token-usage payload. - Add the v2 `account/usage/read` app-server RPC. - Map lifetime usage, peak daily usage, streak, longest task duration, and daily buckets into app-server protocol types. - Gate the request on Codex-backend auth, which supports ChatGPT auth tokens and AgentIdentity. - Regenerate the app-server JSON and TypeScript schema fixtures. ## Token Count Source `account/usage/read` returns the token-usage aggregate supplied by the ChatGPT profile backend. App-server maps that backend-owned aggregate into protocol fields; it does not recompute cached-token treatment, usage multipliers, or raw input/output totals locally. ## Stack 1. feat(app-server): expose account token usage [1 of 2] (this PR) 2. [#25345](https://github.com/openai/codex/pull/25345) feat(tui): add token activity command [2 of 2] ## How to Test 1. Start an app-server client from this branch while authenticated with ChatGPT or AgentIdentity. 2. Call `account/usage/read`. 3. Confirm the response includes `summary` and `dailyUsageBuckets`. 4. Also verify a session without Codex-backend auth receives the existing auth error path. Targeted tests: - `just test -p codex-backend-client -p codex-app-server-protocol -p codex-app-server` - `just write-app-server-schema`
Felipe Coury ·
2026-06-05 14:43:44 +00:00 -
Encrypt multi-agent v2 message payloads (#26210)
## Why Multi-agent v2 currently routes agent instructions through normal tool arguments and inter-agent context. That means the parent model can emit plaintext task text, Codex can persist it in history/rollouts, and the recipient can receive it as ordinary assistant-message JSON. This changes the v2 path so agent instructions stay encrypted between model calls: Responses encrypts the `message` argument returned by the model, Codex forwards only that ciphertext, and Responses decrypts it internally for the recipient model. ## What changed - Mark the v2 `message` parameter as encrypted for `spawn_agent`, `send_message`, and `followup_task`. - Treat multi-agent v2 tool `message` values as ciphertext unconditionally. - Store v2 inter-agent task text in `InterAgentCommunication.encrypted_content` with empty plaintext `content`. - Convert encrypted inter-agent communications into the Responses `agent_message` input item before sending the child request. - Preserve `agent_message` items across history, rollout, compaction, telemetry, and app-server schema paths. - Leave multi-agent v1 unchanged. ## Message shape The model still calls the v2 tools with a `message` argument, but that value is now ciphertext: ```json { "name": "spawn_agent", "arguments": { "task_name": "worker", "message": "<ciphertext>" } } ``` Codex stores the task as encrypted inter-agent communication: ```json { "author": "/root", "recipient": "/root/worker", "content": "", "encrypted_content": "<ciphertext>", "trigger_turn": true } ``` When Codex builds the recipient request, it forwards the ciphertext using the new Responses input item: ```json { "type": "agent_message", "author": "/root", "recipient": "/root/worker", "content": [ { "type": "encrypted_content", "encrypted_content": "<ciphertext>" } ] } ``` Responses decrypts that item internally for the recipient model. ## Context impact - Parent context no longer carries plaintext v2 agent task instructions from these tool arguments. - Codex rollout/history stores ciphertext for v2 agent instructions. - Recipient requests receive an `agent_message` item instead of assistant commentary JSON for encrypted task delivery. - Plaintext completion/status notifications are still plaintext because they are Codex-generated status messages, not encrypted model tool arguments. ## Validation - `just test -p codex-tools` - `just test -p codex-protocol` - `just test -p codex-rollout` - `just test -p codex-rollout-trace` - `just test -p codex-otel` - `just write-app-server-schema`jif ·
2026-06-05 10:25:57 +02:00 -
[codex] Support model-defined reasoning efforts (#26444)
## Summary - accept non-empty model-defined reasoning effort values while preserving built-in effort behavior - propagate the non-Copy effort type through core, app-server, TUI, telemetry, and persistence call sites - preserve string wire encoding and expose an open-string schema for clients - update model selection and shortcut behavior for model-advertised effort values ## Root cause `ReasoningEffort` gained a string-backed custom variant, so it could no longer implement `Copy` or rely on derived closed-enum serialization. Existing consumers still moved effort values from shared references and assumed a fixed built-in value set. ## Validation - `just fmt` - Local tests and compilation were not run per request; relying on CI.
Ahmed Ibrahim ·
2026-06-04 13:36:24 -07:00 -
Add runtime extra skill roots API (#24977)
## Summary - Add v2 `skills/extraRoots/set` to replace app-server process-local standalone skill roots. The setting is not persisted, accepts missing roots, and `extraRoots: []` clears the runtime set. - Wire runtime roots into core skill discovery for `skills/list` and turn loads, clear skill caches on set, and register the roots with the skills watcher so later filesystem changes emit `skills/changed`. - Update app-server docs, generated JSON/TypeScript schemas, and coverage for serialization, missing roots, empty clears, and restart behavior. ## Testing - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core-skills` - `cargo test -p codex-app-server skills_extra_roots_set_updates_process_runtime_roots` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core-skills` - `just fix -p codex-app-server`
xl-openai ·
2026-05-28 21:14:34 -07:00 -
[codex] Add user input client ids (#24653)
## Summary Adds an optional `clientId` field to app-server v2 `UserInput` and carries it through the core `UserInput` model so clients can correlate echoed user input items without relying on payload equality. ## Details - Adds `client_id: Option<String>` to core `UserInput` variants. - Exposes the v2 app-server field as `clientId` on the wire and in generated TypeScript. - Preserves the id when converting between app-server v2 and core protocol types. - Regenerates app-server schema fixtures. ## Validation - `just fmt` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-protocol` - `git diff --check`
Alexi Christakis ·
2026-05-28 14:54:39 -07:00 -
feat(app-server): include turns page on thread resume (#23534)
## Summary The client currently calls `thread/resume` to establish live updates and immediately follows it with `thread/turns/list` to hydrate recent turns. This lets `thread/resume` return that page directly, eliminating a round trip and the ordering/deduplication gap between the two calls. Experimental clients opt in with `initialTurnsPage: { limit, sortDirection, itemsView }`. The response returns `initialTurnsPage` as a `TurnsPage`, including cursors for paging further back in history. Keeping the controls in a nested opt-in object provides the useful `thread/turns/list` knobs without spreading page-specific parameters across `thread/resume`. ## Verification - `just fmt` - `just write-app-server-schema --experimental` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server thread_resume_initial_turns_page_matches_requested_turns_list_page --tests` - `cargo test -p codex-app-server thread_resume_rejoins_running_thread_even_with_override_mismatch --tests` - `just fix -p codex-app-server-protocol -p codex-app-server`Brent Traut ·
2026-05-28 09:18:13 -07:00 -
Restore legacy image detail values (#24644)
## Why Older persisted rollouts can contain `input_image.detail` values of `auto` or `low` from before `ImageDetail` was narrowed to `high`/`original`. Current deserialization rejects those values, which can make resume skip later compacted checkpoints and reconstruct an oversized raw suffix before the next compaction attempt. Confirmed Sentry reports fixed by this compatibility path: - [CODEX-1H3F](https://openai.sentry.io/issues/7500642496/) - [CODEX-1H6N](https://openai.sentry.io/issues/7501025347/) - [CODEX-1JDP](https://openai.sentry.io/issues/7504549065/) - [CODEX-1HW6](https://openai.sentry.io/issues/7503407986/) ## Background [openai/codex#20693](https://github.com/openai/codex/pull/20693) added image-detail plumbing for app-server `UserInput` so input images could explicitly request `detail: original`. The Slack discussion behind that PR was about ScreenSpot / bridge evals where user input images were resized, while tool output images already had MCP/code-mode ways to request image detail. In review, the intended new API surface was narrowed to `high` and `original`: default to `high`, allow `original` when callers need unchanged image handling, and avoid encouraging new `auto` or `low` usage. That policy still makes sense for newly emitted values. The missing compatibility piece is persisted history. Older rollouts can already contain `auto` and `low`, and resume reconstructs typed history by deserializing those rollout records. Rejecting old values at that boundary causes valid compacted checkpoints to be skipped. This PR restores `auto` and `low` as real variants so old records deserialize and round-trip without being rewritten as `high`, while product paths can continue to default to `high` and avoid emitting `auto` for new behavior. ## What changed - Restored `ImageDetail::Auto` and `ImageDetail::Low` as first-class protocol values. - Preserved `auto`/`low` through rollout deserialization, MCP image metadata, code-mode image output, and schema/type generation. - Kept local image byte handling conservative: only `original` switches to original-resolution loading; `auto`/`low`/`high` continue through the resize-to-fit path while retaining their detail value. - Added regression coverage for enum round-tripping and code-mode `low` detail handling. ## Testing - `just write-app-server-schema` - `just test -p codex-protocol` - `just test -p codex-tools` - `just test -p codex-code-mode` - `just test -p codex-app-server-protocol` - `just test -p codex-core suite::rmcp_client::stdio_image_responses_preserve_original_detail_metadata` - `just test -p codex-core suite::code_mode::code_mode_can_use_mcp_image_result_with_image_helper` - Loaded broken rollouts on local fixed builds, and started/completed new turns. I also attempted `just test -p codex-core`; the local broad run did not finish green: 2559 tests run, 2467 passed, 55 flaky, 91 failed, 1 timed out. The failures were broad timeout/deadline failures across unrelated areas; targeted changed-path core tests above passed.
rhan-oai ·
2026-05-26 16:24:33 -07:00 -
Add experimental turn additional context (#24154)
## Summary Adds experimental `additionalContext` support to `turn/start` and `turn/steer` so clients can provide ephemeral external context, such as browser or automation state, without turning that plumbing into a visible user prompt or triggering user-prompt lifecycle behavior. ## API Shape The parameter shape is: ```ts additionalContext?: Record<string, { value: string kind: "untrusted" | "application" }> | null ``` Example: ```json { "additionalContext": { "browser_info": { "value": "Active tab is CI failures.", "kind": "untrusted" }, "automation_info": { "value": "CI rerun is in progress.", "kind": "application" } } } ``` The keys are opaque and caller-defined. ## Context Injection When provided, accepted entries are inserted into model context as hidden contextual message items, not as visible thread user-message items. `kind: "untrusted"` entries are inserted with role `user`: ```text <external_${key}>${value}</external_${key}> ``` `kind: "application"` entries are inserted with role `developer`: ```text <${key}>${value}</${key}> ``` Values are not escaped. Each value is truncated to 1k approximate tokens before wrapping. For `turn/start`, accepted additional context is inserted before normal user input. For `turn/steer`, additional context is merged only when the steer includes non-empty user input; context-only steers still reject as empty input. ## Dedupe Strategy `AdditionalContextStore` lives on session state and stores the latest complete additional-context map. Each `turn/start` or non-empty `turn/steer` treats its `additionalContext` as the current complete set of values. Entries are injected only when the key is new or the exact entry for that key changed, including `value` or `kind`. After merging, the store is replaced with the provided map, so omitted keys are removed from the retained set and can be injected again later if reintroduced. Omitting `additionalContext`, passing `null`, or passing an empty object resets the store to empty and injects nothing. ## What Changed - Threads experimental v2 `additionalContext` through app-server into core turn start and steer handling. - Adds separate contextual fragment types for untrusted user-role context and application developer-role context. - Uses pending response input items so additional context can be combined with normal user input without treating it as prompt text. - Adds integration coverage for start/steer flow, role routing, dedupe/reset behavior, deletion/re-add behavior, hook-blocked input behavior, empty context-only steer rejection, external-fragment marker matching, and truncation.pakrym-oai ·
2026-05-26 13:02:34 -07:00 -
Use thread config for TUI MCP inventory (#24532)
## Summary `/mcp` in the TUI should reflect the current loaded thread, including project-local MCP servers from that thread config. Before this change, `mcpServerStatus/list` only read the latest global MCP config, so the active chat could miss project-local servers. This adds optional `threadId` to `mcpServerStatus/list`. When present, app-server resolves the loaded thread and lists MCP status from the refreshed effective config for that thread; when omitted, existing global config behavior stays unchanged. The TUI now sends the active chat thread id for `/mcp` and `/mcp verbose`, carries that origin through the async inventory result, and ignores stale completions if the user has switched threads before the fetch returns. The app-server schemas were regenerated. ## Follow-up Once this app-server API change lands, the desktop app should make the same `threadId` plumbing so its MCP inventory also uses the current thread config. Fixes #23874
Eric Traut ·
2026-05-26 07:44:04 -07:00 -
fix(app-server): fix optional bool annotations (#24099)
`#[serde(default)]` wasn't sufficient for our generated TS types to reflect that clients didn't have to set them. We also need `skip_serializing_if = "std::ops::Not::not"`. This is already a rule in our agents.md file.
Owen Lin ·
2026-05-22 16:52:53 +00:00 -
Make goals feature on by default and no longer experimental (#23732)
## Why The `goals` feature is ready to be available without requiring users to opt into experimental features. Keeping it behind the beta flag leaves persisted thread goals and automatic goal continuation disabled by default. This PR also marks the goal-related app server APIs and events as no longer experimental. ## What changed - Mark `goals` as `Stage::Stable`. - Enable `goals` by default in `codex-rs/features/src/lib.rs`.
Eric Traut ·
2026-05-20 15:07:35 -07:00 -
add encryptedcontent to functioncalloutput (#23500)
add new `EncryptedContent` variant to `FunctionCallOutputContentItem` ahead of standalone websearch. we need to be able to receive and pass encrypted function call output from the new web search endpoint back to responsesapi, as we cannot expose direct search results.
sayan-oai ·
2026-05-19 23:47:48 -07:00 -
feat: Add vertical remote plugin collection support (#23584)
- Adds an explicit vertical marketplace kind for plugin/list that fail-open fetches collection=vertical only when full remote plugins are disabled. - Renames the global remote marketplace/cache identity to openai-curated-remote and materializes remote installs with backend release versions and app manifests.
xl-openai ·
2026-05-19 22:03:08 -07:00 -
feat: add permission profile list api (#23412)
## Why Clients need a typed permission-profile catalog instead of reconstructing that state from config internals. ## What changed - Added `permissionProfile/list` to the app-server v2 protocol with cursor pagination and optional `cwd`. - The list response includes built-in permission profiles plus config-defined `[permissions.<id>]` profiles from the effective config for the request context. - Permission profiles keep optional `description` metadata for display purposes. - App-server docs and schema fixtures are updated for the new RPC.
viyatb-oai ·
2026-05-20 02:42:56 +00:00 -
Fix empty rollout path app-server handling (#23400)
## Summary - Coerce `path: ""` to `None` at the v2 protocol params deserialization boundary for `thread/resume` and `thread/fork`. - Restore the pre-ThreadStore running-thread resume behavior: if `threadId` is already running, rejoin it by id and treat a non-empty `path` only as a consistency check; otherwise cold resume keeps `history > path > threadId` precedence. - Add protocol, resume, and fork regression coverage for empty path payloads; refresh app-server schema fixtures for the clarified params docs. ## Tests - `just fmt` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol thread_path_params_deserialize_empty_path_as_none` - `cargo test -p codex-app-server-protocol --test schema_fixtures` - `cargo test -p codex-app-server empty_path` - `RUST_MIN_STACK=8388608 cargo test -p codex-app-server --test all thread_resume_rejects_mismatched_path_for_running_thread_id` - `RUST_MIN_STACK=8388608 cargo test -p codex-app-server --test all thread_resume_uses_path_over_non_running_thread_id`
Tom ·
2026-05-19 21:19:38 +00:00 -
app-server: use profile ids in v2 permission params (#23360)
## Why The v2 app-server permission profile fields are experimental, but the previous migration kept a legacy object payload for profile selection. That made clients aware of server-owned `activePermissionProfile` metadata such as `extends`, and it kept a `legacy_additional_writable_roots` path even though `runtimeWorkspaceRoots` now owns runtime workspace-root selection. This PR makes the client contract match the intended model: clients select a permission profile by id, and the server resolves and reports active profile provenance in response payloads. Follow-up to #22611. ## What Changed - Changed `thread/start`, `thread/resume`, `thread/fork`, and `turn/start` permission profile selection to plain profile id strings. - Changed `command/exec.permissionProfile` to a plain profile id string for the same client/server ownership split. - Removed `PermissionProfileSelectionParams` and the legacy `{ type: "profile", modifications: [...] }` compatibility deserializer. - Updated app-server, TUI, and `codex exec` call sites to send only ids, while keeping `activePermissionProfile` as server response metadata. - Updated app-server docs and schema fixtures for the revised `command/exec.permissionProfile` shape. ## Verification - `cargo test -p codex-app-server-protocol` - `RUST_MIN_STACK=8388608 cargo test -p codex-app-server` - `cargo test -p codex-exec` - `RUST_MIN_STACK=8388608 cargo test -p codex-tui` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23360). * #23368 * __->__ #23360
Michael Bolin ·
2026-05-18 17:28:50 -07:00 -
feat(app-server): add optional thread_id to experimentalFeature/list (#23335)
## Why `experimentalFeature/list` reports effective feature enablement, but currently does not resolve it against a working directory where project-local config.toml files can exist and toggle on/off features when merged into the effective config after resolving the various config layers. That means we effectively (and incorrectly) ignore features set in project-local config. To address that, this PR exposes an optional `thread_id` param which allows us to load the thread's `cwd. ## Testing - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server experimental_feature_list`
Owen Lin ·
2026-05-18 12:12:14 -07:00 -
goal: pause continuation loops on usage limits and blockers (#23094)
Addresses #22833, #22245, #23067 ## Why `/goal` can keep synthesizing turns even when the next turn cannot make meaningful progress. Hard usage exhaustion can replay failing turns, and repeated permission or external-resource blockers can keep burning tokens while waiting for user or system intervention. ## What changed - Add resumable `blocked` and `usageLimited` goal states. As with `paused`, goal continuation stops with these states. - Move to `usageLimited` after usage-limit failures. - Allow the built-in `update_goal` tool to set `blocked` only under explicit repeated-impasse guidance. Updated goal continuation prompt to specify that agent should use `blocked` only when it has made at least three attempts to get past an impasse. Most of the files touched by this PR are because of the small app server protocol update. ## Validation I manually reproduced a number of situations where an agent can run into a true impasse and verified that it properly enters `blocked` state. I then resumed and verified that it once again entered `blocked` state several turns later if the impasse still exists. I also manually reproduced the usage-limit condition by creating a simulated responses API endpoint that returns 429 errors with the appropriate error message. Verified that the goal runtime properly moves the goal into `usageLimited` state and TUI UI updates appropriately. Verified that `/goal resume` resumes (and immediately goes back into `ussageLImited` state if appropriate). ## Follow-up PRs Small changes will be needed to the GUI clients to properly handle the two new states.
Eric Traut ·
2026-05-18 11:28:53 -07:00 -
[codex] Add installed-plugin mention API (#22448)
## Summary - add app-server `plugin/installed` for mention-oriented plugin loading - return installed plugins plus explicitly requested install-suggestion rows - keep remote handling on installed-state data instead of the broad catalog listing path ## Why The `@` mention surface only needs plugins that are usable now, plus a small product-approved set of install suggestions. It does not need the full catalog-shaped `plugin/list` payload that the Plugins page uses. ## Validation - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core-plugins` - `cargo test -p codex-app-server --test all plugin_installed_` ## Notes - The package-wide `cargo test -p codex-app-server` run still hits an existing unrelated stack overflow in `in_process::tests::in_process_start_clamps_zero_channel_capacity`. - Companion webview PR: https://github.com/openai/openai/pull/915672
xli-oai ·
2026-05-18 03:11:54 -07:00