mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
206 Commits
-
[codex] Use model metadata for skills usage instructions (#29740)
## Summary - add a false-by-default `include_skills_usage_instructions` model metadata field - enable the field for the bundled `gpt-5.5` model metadata - consume the metadata in both core and extension skill rendering - remove hardcoded legacy-model matching and its marker plumbing
ani-oai ·
2026-06-29 09:44:36 +09:00 -
app-server: structure and test JSON shutdown logs (#30314)
## Why `LOG_FORMAT=json` and `RUST_LOG` are supported by app-server, but the behavior was only covered indirectly. We should verify the actual JSONL written by both user-facing entry points: `codex app-server` and the standalone `codex-app-server` binary. The existing processor shutdown message also always said the channel closed, even though the processor can exit for several different reasons. Structured fields make that event more accurate and useful to log consumers. ## What changed - Record the processor `exit_reason`, remaining connection count, and forced-shutdown state as structured tracing fields. - Add a shared process-test helper that enables JSON logging, validates every stderr line as JSON, and verifies the top-level timestamp is RFC 3339. - Cover both `codex app-server` and `codex-app-server`, asserting the stable `level`, `fields`, and `target` payload. ## Test plan - `just test -p codex-app-server standalone_app_server_emits_json_info_events` - `just test -p codex-cli app_server_emits_json_info_events`
Michael Bolin ·
2026-06-26 18:19:56 -07:00 -
feat(app-server): add history_mode to thread (#29927)
## Description This PR adds a new `historyMode = "legacy" | "paginated"` to `Thread`. This will be stored in `SessionMeta` in the JSONL rollout file and as a new column in the SQLite thread_metadata table, and exposed on `thread/start` and on the `Thread` object in app-server. ## What changed - Added canonical `ThreadHistoryMode` with `legacy` and `paginated`, defaulting old and new SessionMeta to `legacy`. - Carried `history_mode` through core session config, ThreadStore stored metadata, local/in-memory stores, rollout metadata extraction, and the existing SQLite `threads` table. - Added experimental `historyMode` to app-server v2 `Thread` and `thread/start`. - Made paginated stored threads metadata-discoverable but unsupported for legacy full-history reads, `load_history`, live resume, and create paths. - Regenerated app-server schema fixtures and added protocol/state/thread-store/app-server coverage for persistence and fail-closed behavior. ## Compatibility floor Because users may be running various versions of Codex binaries on the same machine (TUI, Codex App, etc.), we will need to establish a compatibility floor for upcoming paginated threads, which will change how thread storage reads and writes work. The overall plan here: ``` Release N: - Add historyMode to SessionMeta / Thread / SQLite metadata. - Teach binaries to understand paginated threads. - If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread. - Default remains `"legacy"`. Release N+1: - First-party clients start opting into paginated threads where appropriate. - Internal dogfood / staged rollout. - Measure old-client usage and paginated-thread unsupported errors. Release N+2: - Only after Release N+ is overwhelmingly deployed, make paginated the default. - Accept that a small tail of N-1-or-older binaries may not understand paginated threads. ``` The important behavior change is fail-closed handling for a binary that encounters a persisted `paginated` thread before it knows how to fully support paginated history. In app-server, if a thread is `paginated`, we will: - allow metadata-only discovery paths like `thread/list` and `thread/read(includeTurns=false)`, so clients can still see the thread and inspect its `historyMode` - reject legacy full-history/live-thread paths like `thread/read(includeTurns=true)` and `thread/resume` with an unsupported JSON-RPC error - avoid silently treating an unknown or future `historyMode` as `legacy` Under the hood, the ThreadStore layer also rejects legacy operations that would need to load or replay the full thread history for a paginated thread. That gives us the behavior we want for Release N: future paginated threads are visible, but this binary fails closed instead of trying to operate on them as if they were legacy threads.
Owen Lin ·
2026-06-26 09:12:42 -07:00 -
Persist selected capability roots and resolve availability per model step (#29856)
## Why `selectedCapabilityRoots` is durable thread intent: “use this capability root from environment `worker`.” The important product assumption is: > One environment ID always names the same logical executor and stable contents. `worker` does not silently change from executor A to an unrelated executor B. The process-local connection handle for `worker` can still be replaced while Codex is running, though, for example when `environment/add` registers a fresh handle for the same logical environment. The thread should persist only the stable selection. Each model step should pair that selection with the exact ready handle captured for that step. ## The boundary ```text persisted thread intent plugin@1 -> environment "worker" | | capture the current step v model-step view unavailable, or plugin@1 + worker's exact captured ready handle ``` The environment ID is the stable identity and cache key. The `Arc<Environment>` is only a process-local handle retained so consumers of one model step use the same captured environment. It is never persisted and it does not imply different environment contents. ## What changes ### Persist the stable selection Selected roots are written into `SessionMeta` and restored with the thread. Forked subagents inherit the same selections, including bounded-history forks. Only stable data is persisted: root ID, environment ID, and root path. ### Capture readiness together with the exact handle The environment snapshot records: ```rust environment_id -> Some(Arc<Environment>) // ready in this step environment_id -> None // still starting in this step ``` This prevents readiness and execution from coming from different registry snapshots. For example: ```text step snapshot: worker -> handle A, ready environment/add: worker -> fresh handle B for the same logical environment current step: plugin@1 still uses captured handle A ``` Without carrying handle A in the snapshot, the resolver could combine “A was ready” with handle B and treat B as ready before it had finished starting. This does not change cache invalidation. Stable capability metadata remains identified by environment ID and capability root. Replacing a process-local handle under the same stable environment ID does not invalidate or rediscover that metadata. ### Resolve availability per model step - A ready captured environment produces resolved roots using its captured handle. - A starting, missing, or failed environment is omitted from that step. - A selected lazy environment that is outside the turn's captured environment set is asked to start, and a later step can observe it as ready. - No capability files are scanned here. Transient transport disconnects remain the remote client's reconnect concern. This PR models initial attachment/readiness; it does not add live socket-connectivity state. ## Example ```text thread selection: plugin@1 -> environment "worker" step 1: worker is starting -> plugin@1 unavailable step 2: worker is ready -> plugin@1 resolves through worker's captured handle step 3: fresh local handle -> current step remains pinned; a later step captures its own view ``` Temporary unavailability does not discard the durable selection. Later PRs can retain stable metadata caches while projecting only currently available capabilities into model-visible World State. ## Compatibility The app-server request shape does not change. Older rollouts without `selected_capability_roots` deserialize to an empty list. ## Stack 1. **This PR:** persist stable selected roots and resolve them through an exact model-step handle. 2. #29960: cache stable skill metadata and project available skills into World State. 3. #29946: cache stable plugin declarations and manage the separate live MCP runtime.jif ·
2026-06-25 17:49:43 +00:00 -
auth: move domain mode below app wire types (#29721)
## Why Authentication mode is a domain concept used by login, model selection, telemetry, and transports. Keeping the canonical type in app-server protocol forces those lower-level crates to depend on an unrelated wire API. ## What changed - Added canonical `codex_protocol::auth::AuthMode` domain values. - Kept the app-server wire DTO unchanged and added an explicit app-side conversion. - Removed production app-server-protocol dependencies from login, model-provider-info, models-manager, and otel call paths. ## Stack This is PR 2 of 6, stacked on [PR #29714](https://github.com/openai/codex/pull/29714). Review only the delta from `codex/split-json-rpc-protocols`. Next: [PR #29722](https://github.com/openai/codex/pull/29722). ## Validation - Auth and login coverage passed in the focused protocol/domain test run. - App-server account and auth conversion coverage passed.
Adam Perry @ OpenAI ·
2026-06-24 03:10:20 +00:00 -
test: add app-server auto environment helper (#29746)
## Why Start moving towards app-server tests defaulting to running against remote & foreign OS executors. To do so we need a point of indirection similar to core integration tests' `build_with_auto_env`, but with the flexibility of letting tests control environment registration if they need to. ## What This adds: - `TestAppServer::new_with_auto_env()` for constructing an app server with a default environment defined by the test runner (e.g. bazel) - `TestAppServer::auto_env_params()` for tests to easily acquire turn env params tailored to the automatic environment - `TestAppServer::send_thread_start_request_with_auto_env()` to make it easy for tests to start a thread using the automatic environment The above methods all fail if the test calling them has set up an environment where the automatic environment configuration conflicts with test-created state. ## Validation Adds a couple of basic smoke tests to the app-server test suite. Follow-ups will migrate more tests to use it.
Adam Perry @ OpenAI ·
2026-06-24 01:06:29 +00:00 -
core: persist initial context window metadata (#29519)
## Why PR #29494 made context-window IDs visible to the model by wrapping the token-budget window payload in `<context_window>`, but rollout JSONL consumers still could not see the initial window identity by tailing the session file. Compacted rollout items carry window IDs only after compaction has happened, so a session with no compaction had no durable JSONL record for window 0. This change gives tailing consumers a stable initial-window record at session creation time. ## What Changed - Added `session_meta.context_window.window_id` for the initial context-window identity. - `CreateThreadParams` now requires `initial_window_id: String`, so thread-store callers cannot accidentally create new threads without window-0 metadata. - Live thread creation derives the persisted initial window ID from the same `AutoCompactWindowIds` used to initialize `SessionState`, keeping runtime state and JSONL metadata aligned. - Rollout reconstruction uses `session_meta.context_window.window_id` as the initial-window fallback and derives `window_number = 0`, `first_window_id = window_id`, and `previous_window_id = None` internally. - Fork reconstruction intentionally uses the same rollout reconstruction path; consumers that need to distinguish copied initial-window metadata can use the rollout `thread_id`. - Legacy compactions without `window_number` still use compaction-count fallback accounting instead of being reset to window 0 by the initial-window fallback. - Compacted rollout metadata still takes precedence once compaction records exist, preserving the richer chain fields there. ## JSONL Shape Real rollout JSONL is one object per line. This example is expanded for readability, but shows the new initial `session_meta.context_window` record followed by the existing compacted rollout item shape that also carries window IDs: ```jsonl { "timestamp": "2026-06-22T12:00:00.000Z", "type": "session_meta", "payload": { "session_id": "<THREAD_ID>", "id": "<THREAD_ID>", "timestamp": "2026-06-22T12:00:00.000Z", "cwd": "/repo", "originator": "codex", "cli_version": "0.0.0", "source": "cli", "model_provider": "<MODEL_PROVIDER>", "context_window": { "window_id": "<INITIAL_WINDOW_ID>" } } } ... { "timestamp": "2026-06-22T12:34:56.000Z", "type": "compacted", "payload": { "message": "<COMPACTION_SUMMARY>", "replacement_history": [ "..." ], "window_number": 1, "first_window_id": "<INITIAL_WINDOW_ID>", "previous_window_id": "<INITIAL_WINDOW_ID>", "window_id": "<NEXT_WINDOW_ID>" } } ``` The nested `context_window` object is intentional: it gives rollout consumers a stable namespace for context-window metadata while only writing the non-derivable initial `window_id`. For the initial window, `window_number`, `first_window_id`, and `previous_window_id` are derived internally instead of being written to the rollout. ## Verification - `just test -p codex-protocol` - `just test -p codex-rollout recorder_materializes_on_flush_with_pending_items` - `just test -p codex-core reconstruct_history` - `just test -p codex-core record_initial_history_reconstructs_forked_transcript` - `just test -p codex-thread-store` - `just test -p codex-state` - `just test -p codex-app-server thread_read_returns_summary_without_turns` - `just test -p codex-rollout persistence_metrics`
Michael Bolin ·
2026-06-23 21:50:50 +00:00 -
feat(app-server): thread/turns/items/list -> thread/items/list (#29705)
## Description Rename the experimental app-server item pagination API from `thread/turns/items/list` to `thread/items/list` and make `turnId` optional. Clients can now page persisted items across a thread, or still filter to one turn when needed. ## What changed - Rename the request/response protocol types and JSON-RPC method to `ThreadItemsList*` / `thread/items/list`. - Pass optional `turnId` through to `ThreadStore::list_items`. - Update app-server docs and focused protocol/app-server tests. ## Validation - `just test -p codex-app-server-protocol thread_items_list_round_trips` - `just test -p codex-app-server thread_items_list_returns_unsupported`
Owen Lin ·
2026-06-23 13:57:08 -07:00 -
Persist session IDs across thread resume (#29327)
## Summary A cold-resumed subagent kept its durable thread ID but could receive a new session ID, splitting one agent tree across multiple sessions after a restart. Persist the root session ID in every rollout `SessionMeta`, carry it through thread creation, and restore it before initializing the resumed `Session` and `AgentControl`. ## Behavior For a nested agent tree: ```text root session R parent thread P child thread C ``` The child rollout stores: ```text session_id: R parent_thread_id: P id: C ``` After a cold resume, the child still belongs to root session `R` while its immediate parent remains `P`. The integration coverage uses distinct values for all three IDs so it catches restoring the session from `parent_thread_id`. ## Legacy rollouts Previous rollouts have `id` but no `session_id`. `SessionMetaLine` deserialization treats a missing `session_id` as `id`, keeping those files readable, listable, and resumable. When a legacy subagent is resumed through its root, that synthesized child ID no longer overrides the inherited root-scoped `AgentControl`. New rollouts always persist the explicit root session ID.jif ·
2026-06-22 09:36:08 +02:00 -
[codex] Use expect in integration tests (#28441)
The workspace denies `clippy::expect_used` in production. Although `clippy.toml` allows `expect` in tests, Bazel Clippy compiles integration-test helper code in a way that does not receive that exemption, which encouraged verbose `unwrap_or_else(... panic!(...))` and equivalent `match`/`let else` forms. This allows `clippy::expect_used` once at each integration-test crate root (including aggregated suites and test-support libraries), then replaces manual panic-based Result and Option unwraps with `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own crate roots. Intentional assertion and unexpected-variant panics remain unchanged, and the production `expect_used = "deny"` lint remains in place. The cleanup is mechanical and net-negative in line count.
pakrym-oai ·
2026-06-15 21:53:47 -07:00 -
Add realtime speech append control (#27917)
## Why Realtime voice harness tuning needs app-side control over what backend Codex text is spoken. Backend orchestrator text is written for a reading UI, so automatically speaking every preamble, progress update, or final assistant message can make the realtime voice model too chatty. For experimentation, clients need two simple controls: keep app/client text-item injection on the existing item-create path, and add an explicit speakable path that app code can call only when it wants realtime to speak. Automatic Codex output also needs an opt-in way to switch from the protocol's default speakable path to regular realtime items, with a caller-provided prefix so prompt wording can be tuned outside core. The default remains unchanged: if a client omits the new start fields and never calls `appendSpeech`, automatic backend output continues down the existing speakable path for the selected realtime protocol. ## What Changed - Adds experimental `thread/realtime/appendSpeech` for app-provided speakable text. - Keeps existing `thread/realtime/appendText` as the item-create API for app-provided realtime text items. - Adds `codexResponsesAsItems` / `codex_responses_as_items` on `thread/realtime/start` to send automatic Codex responses with `conversation.item.create` instead of the protocol's default speakable output path. - Adds `codexResponseItemPrefix` / `codex_response_item_prefix` so clients can prepend experiment instructions to those automatic Codex response items. - Keeps literal `conversation.handoff.append` routing scoped to the v1 speakable path; v2 default speech uses its item/function-output plus `response.create` behavior. - Removes the earlier public silent-context API and hardcoded silent-context prefix. - Updates realtime tests to cover default automatic speakable behavior, opt-in automatic item-create behavior, and explicit `appendSpeech` behavior. ## Validation - `cargo check -p codex-core -p codex-app-server -p codex-api` - `just test -p codex-app-server realtime_conversation` - `just test -p codex-core realtime_conversation` (50/51 passed in the filtered parallel run; the lone failure passed when rerun in isolation) - `just test -p codex-core conversation_mirrors_assistant_message_text_to_realtime_handoff` - `just test -p codex-api e2e_connect_and_exchange_events_against_mock_ws_server` - `just fix -p codex-core` - `just fix -p codex-app-server` - `cargo build -p codex-cli`
guinness-oai ·
2026-06-15 16:15:58 -07:00 -
feat(app-server): expose rate-limit reset credits (#28143)
## Why Codex users can earn personal rate-limit reset credits, but app-server clients do not currently have an API for reading or redeeming them. This adds the backend and protocol foundation used by the `/usage` TUI flow in #28154. ## What changed - Extend `account/rateLimits/read` with a nullable `rateLimitResetCredits` summary sourced from the existing usage response. - Add backend-client and app-server support for consuming a reset with a caller-generated idempotency key. A UUID is recommended, and clients reuse the same key when retrying the same logical reset. - Return only the consume `outcome`; clients refetch `account/rateLimits/read` for updated window state. - Document the response field and each consume outcome, and regenerate the JSON and TypeScript schema fixtures. - Clarify in `AGENTS.md` that new app-server string enum values use camelCase on the wire. - Update the existing TUI response fixture for the expanded protocol shape. - Add coverage for authentication, response mapping, backend failures, consume outcomes, and request timeout behavior. ## Validation - `just test -p codex-app-server-protocol` — 231 passed. - `just test -p codex-backend-client` — 14 passed. - Focused `codex-app-server` reset-credit tests — 5 passed. - Focused `codex-tui` protocol response fixture test — passed. - `just fix -p codex-backend-client -p codex-app-server-protocol -p codex-app-server` — passed. - `just fmt` — passed.
jay ·
2026-06-15 21:54:01 +00:00 -
build: run buildifier from just fmt (#28125)
## Intent Keep Bazel and Starlark files consistently formatted without requiring contributors to install or version buildifier themselves. ## Implementation - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier v8.5.1. - Run buildifier from the shared `just fmt` and `just fmt-check` driver, with Windows-safe explicit DotSlash invocation. - Provision DotSlash in formatting CI and contributor devcontainers, and document the source-build prerequisite. - Apply the initial mechanical buildifier formatting baseline.
Adam Perry @ OpenAI ·
2026-06-13 21:43:39 -07:00 -
feat(app-server): enforce managed remote control disable (#27961)
## Why Managed deployments need a reliable deny gate for remote control. Persisted enablement and explicit startup requests currently remain able to start the transport, while the removed `features.remote_control` key is intentionally only a compatibility no-op. This adds a dedicated requirement that administrators can use to force remote control off without deleting the user's persisted preference. Removing the requirement and restarting restores the prior choice. ## What Changed - Added top-level `allow_remote_control` requirements parsing, sourced layer precedence, debug output, and `configRequirements/read` exposure as `allowRemoteControl`. - Added a typed transport policy captured from the startup requirements snapshot. Managed disable forces the initial state to disabled and prevents enrollment, refresh, connection, and persisted-preference mutation. - Rejected every `remoteControl/*` RPC before parameter deserialization with JSON-RPC `-32600` and `remote control is disabled by managed requirements`. - Preserved the existing disabled status notification and the previous behavior when the requirement is `true` or omitted. - Regenerated app-server protocol schemas and documented the new requirement. ## Verification - Confirmed all remote-control RPCs, including a malformed request, return the managed-policy error while the initial status notification remains `disabled`. - Confirmed explicit ephemeral startup and persisted enablement make no backend connection and leave the SQLite preference unchanged. - Confirmed `allow_remote_control = true` does not enable or block remote control and `configRequirements/read` returns `allowRemoteControl: false` for the deny policy. Related issue: N/A (managed-policy hardening).
Anton Panasenko ·
2026-06-12 20:10:12 -07:00 -
feat: use encrypted local secrets for CLI auth (#27539)
## Why Windows Credential Manager limits generic credential blobs to 2,560 bytes. Large serialized ChatGPT auth payloads can exceed that limit, so keyring-mode CLI auth needs a backend that keeps only the encryption key in the OS keyring and stores the payload in Codex's encrypted local-secrets file. This is the third PR in the encrypted-auth stack: 1. #27504 — feature and config selection 2. #27535 — auth-specific local-secrets namespaces 3. This PR — CLI auth implementation and activation 4. MCP OAuth implementation and activation ## What Changed - Added encrypted CLI-auth storage using the `CliAuth` secrets namespace. - Preserved direct keyring storage for platforms/configurations where it remains selected. - Selected the backend consistently for login, logout, refresh, device-code login, auth loading, and login restrictions. - Threaded resolved bootstrap/full config through CLI, exec, TUI, app-server account handling, cloud config, and cloud tasks. - Removed stale `auth.json` fallback data after successful encrypted saves and removed encrypted, direct-keyring, and fallback data during logout. - Added storage and integration coverage for both direct and encrypted keyring modes. MCP OAuth persistence is intentionally left to the next PR. ## Validation - `just test -p codex-login` — 131 passed - `just test -p codex-cli` — 280 passed - `just test -p codex-app-server v2::account` — 25 passed - `just test -p codex-cloud-config service` — 21 passed, 7 skipped - `just fix -p codex-login` - `just fix -p codex-cli` - `just fmt`
Celia Chen ·
2026-06-12 21:23:50 +00:00 -
feat(app-server): persist remote-control desired state (#27445)
## Why Remote-control runtime enablement and persisted enrollment preference were represented by separate flags. That made startup rehydration, RPC persistence, and new-enrollment seeding race with one another, and it did not cleanly distinguish runtime-only CLI or daemon starts from durable app-server RPC changes. ## What Changed - Replace the parallel enablement, seed, and rehydration flags with one transport-owned `RemoteControlDesiredState`. - Add nullable enrollment-scoped persistence and preserve existing preferences during enrollment upserts. - Rehydrate plain startup only after auth and client scope resolve, without overwriting a concurrent RPC transition. - Make ordinary `remoteControl/enable` and `remoteControl/disable` durable while retaining `ephemeral: true` for runtime-only callers. - Have the daemon explicitly request ephemeral enablement and regenerate the app-server schemas. ## Verification - Covered migration and `NULL`/`0`/`1` persistence round trips. - Covered plain-start rehydration and runtime-only versus durable enrollment seeding. - Covered durable enable, durable disable, and ephemeral enable through app-server RPC. - Covered the daemon's exact `{ "ephemeral": true }` request payload. Related issue: N/A (internal remote-control persistence architecture change).Anton Panasenko ·
2026-06-11 21:28:52 -07:00 -
[codex] Add comp_hash to model metadata (#27532)
## Summary - add optional `comp_hash` metadata to `ModelInfo` - update `ModelInfo` fixtures for the shared schema change - keep older model responses compatible by defaulting the field to `None` ## Why The models endpoint needs an opaque identifier for compaction-compatible model configurations. This PR only exposes that value in model metadata; it does not add it to turn context or change runtime behavior. Follow-up #27520 carries the value through turn context and rollouts, then uses it to trigger compaction. ## Stack - based directly on `main` - replaces #27519, which was accidentally merged into the wrong base branch - functionality follow-up: #27520 ## Testing - `just test -p codex-protocol model_info_defaults_availability_nux_to_none_when_omitted` - `just fix -p codex-core -p codex-protocol -p codex-analytics -p codex-models-manager`
Ahmed Ibrahim ·
2026-06-10 20:42:55 -07:00 -
feat: add Bedrock API key as a managed auth mode (#27443)
## Why Codex needs to manage Amazon Bedrock API key credentials through the existing auth lifecycle instead of introducing a separate auth manager or provider-specific credential file. Treating Bedrock API key login as a primary auth mode gives it the same persistence, keyring, reload, and logout behavior as the existing OpenAI API key and ChatGPT modes. The credential is valid only for the `amazon-bedrock` model provider. OpenAI-compatible providers must reject this auth mode rather than treating the Bedrock key as an OpenAI bearer token. ## What changed - Added `bedrockApiKey` as an app-server `AuthMode` and `CodexAuth::BedrockApiKey` as a primary `AuthManager` mode. - Added `BedrockApiKeyAuth`, containing the API key and AWS region, to the existing `AuthDotJson` payload stored in `$CODEX_HOME/auth.json` or the configured keyring backend. - Added `login_with_bedrock_api_key(...)`, parallel to `login_with_api_key(...)`, which replaces the current stored login with Bedrock credentials. - Reused generic auth reload and logout behavior instead of adding a Bedrock-specific auth manager or logout path. - Updated login restrictions, status reporting, diagnostics, telemetry classification, generated app-server schemas, and auth fixtures for the new mode. - Added explicit errors when Bedrock API key auth is selected with an OpenAI-compatible model provider. This PR establishes managed storage and auth-mode behavior. Routing the managed key and region into Amazon Bedrock requests will be in follow-up PRs.
Celia Chen ·
2026-06-10 20:42:38 -07:00 -
Add app-server
thread/deleteAPI (#25018)## Why Clients can archive and unarchive threads today, but there is no app-server API for permanently removing a thread. Deletion also needs to cover the full session tree: deleting a main thread should remove spawned subagent threads and the related local metadata instead of leaving orphaned rollout files, goals, or subagent state behind. ## What - Adds the v2 `thread/delete` request and `thread/deleted` notification, with the response shape kept consistent with `thread/archive`. - Implements local hard delete for active and archived rollout files. - Deletes the requested thread's state DB row as the commit point, then best-effort cleans associated state including spawned descendants, goals, spawn edges, logs, dynamic tools, and agent job assignments. - Updates app-server API docs and generated protocol schema/TypeScript fixtures.
Eric Traut ·
2026-06-10 11:22:12 -07:00 -
[codex-rs] support v2 personal access tokens (#25731)
## Summary - add v2 personal access token support for `codex login --with-access-token` and `CODEX_ACCESS_TOKEN` - classify opaque `at-` tokens separately from legacy Agent Identity JWTs - hydrate required ChatGPT account metadata through AuthAPI `/v1/user-auth-credential/whoami` - use PATs directly as bearer tokens while preserving existing ChatGPT account surfaces - expose PAT-backed auth as the explicit `personalAccessToken` app-server auth mode ## Implementation PAT auth is intentionally small and stateless. Loading a PAT performs one AuthAPI metadata request, stores the hydrated metadata in the in-memory auth object, and redacts the secret from debug output. Legacy Agent Identity JWT handling remains unchanged. The shared access-token classifier lives in a private neutral module because it dispatches between both credential types. PAT hydration fails closed when AuthAPI omits any required metadata, including email. Hydrated metadata is intentionally not persisted: startup performs a live `whoami` preflight so revoked tokens or changed account metadata are not accepted from a stale cache. ## Workspace restriction scope This change intentionally does **not** apply `forced_chatgpt_workspace_id` to PAT authentication. The setting is a client-side config guardrail, not an authorization boundary, and PAT does not currently require workspace-ID parity. The PAT login and `CODEX_ACCESS_TOKEN` paths therefore validate through AuthAPI without threading workspace-restriction state through access-token loading. Existing workspace checks for non-PAT auth remain on their established paths. ## App-server compatibility The public app-server `AuthMode` is shared across v1 and v2, and PAT-backed auth reports `personalAccessToken` through both APIs. Following human review, this intentionally removes the temporary v1 compatibility mapping that reported PATs as `chatgpt`; the deprecated v1 API is kept in parity with v2 rather than maintaining a separate closed enum. Clients with exhaustive auth-mode handling in either API version must add the new case and should generally treat it as ChatGPT-backed unless they need PAT-specific behavior. The v1 auth-status response still omits the raw PAT when `includeToken` is requested because that response cannot carry the account metadata needed to reuse the credential safely. Persisted PAT auth also omits the new enum value so older Codex builds can deserialize `auth.json` and infer PAT auth from the credential field after a rollback. ## Validation Latest review-fix validation: - `CARGO_INCREMENTAL=0 just test -p codex-login` (126 passed) - `CARGO_INCREMENTAL=0 just test -p codex-cli` (263 passed) - `CARGO_INCREMENTAL=0 just test -p codex-cli stored_auth_validation_handles_personal_access_token` - `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol` (226 passed) - `CARGO_INCREMENTAL=0 just test -p codex-models-manager refresh_available_models_uses_remote_only_catalog_for_chatgpt_auth` - `CARGO_INCREMENTAL=0 just test -p codex-tui existing_non_oauth_chatgpt_login_counts_as_signed_in` - `CARGO_INCREMENTAL=0 just fix -p codex-login -p codex-app-server-protocol -p codex-models-manager -p codex-tui -p codex-cli` - `just fmt` - `git diff --check` The broader `codex-tui` suite previously compiled and ran 2,834 tests. Three unrelated environment-sensitive guardian/IDE-socket tests failed after retries; the PAT-relevant TUI coverage passed.
cooper-oai ·
2026-06-05 17:36:18 -07:00 -
feat(app-server): add remote control pairing status RPC (#26450)
## What Exposes the pairing status transport as experimental app-server v2 RPC `remoteControl/pairing/status`. - Adds request/response protocol types for exactly one lookup key: `pairingCode` or `manualPairingCode`, returning `{ claimed }`. - Registers the RPC with `global_shared_read("remote-control-pairing")`. - Wires the method through `MessageProcessor` and `RemoteControlRequestProcessor`. - Validates missing/conflicting pairing-code params as invalid requests. - Documents the RPC in `app-server/README.md`. - Adds processor, protocol export, and JSON-RPC integration coverage for both code paths. ## Why This is the app-server surface the desktop app can poll while the QR/manual pairing modal is active. Depends on https://github.com/openai/codex/pull/26449 Related backend change: https://github.com/openai/openai/pull/990244 ## Verification - `cargo test --manifest-path app-server-protocol/Cargo.toml remote_control` - `cargo test --manifest-path app-server/Cargo.toml remote_control` - `cargo fmt --all --check` - `git diff --check`hefuc-oai ·
2026-06-05 10:33:56 -07:00 -
[codex] Add use_responses_lite 'override' logic (#26487)
## Summary - add a defaulted `ModelInfo.use_responses_lite` catalog field - support serializing `reasoning.context` while preserving the existing effort and summary path - has not been turned on for any models yet I've added an override to parallel tools if responses_lite is on. I've also forced persistent reasoning when using responses_lite. It would be ideal if we could centralize all the responses_lite plumbing, but I think this is best for now to keep the plumbing & diffs small. ## Testing - `cargo test -p codex-protocol model_info_defaults_availability_nux_to_none_when_omitted` - `RUST_MIN_STACK=8388608 cargo test -p codex-core responses_lite_sets_all_turns_context_and_disables_parallel_tool_calls` - `RUST_MIN_STACK=8388608 cargo test -p codex-core configured_reasoning_summary_is_sent` - `cargo check -p codex-core --tests` - `RUST_MIN_STACK=8388608 cargo clippy -p codex-core --tests` (passes with pre-existing warnings in `codex-code-mode` and `codex-core-plugins`)
rka-oai ·
2026-06-04 18:49:51 -07:00 -
[codex] Support model-defined reasoning efforts (#26444)
## Summary - accept non-empty model-defined reasoning effort values while preserving built-in effort behavior - propagate the non-Copy effort type through core, app-server, TUI, telemetry, and persistence call sites - preserve string wire encoding and expose an open-string schema for clients - update model selection and shortcut behavior for model-advertised effort values ## Root cause `ReasoningEffort` gained a string-backed custom variant, so it could no longer implement `Copy` or rely on derived closed-enum serialization. Existing consumers still moved effort values from shared references and assumed a fixed built-in value set. ## Validation - `just fmt` - Local tests and compilation were not run per request; relying on CI.
Ahmed Ibrahim ·
2026-06-04 13:36:24 -07:00 -
feat(app-server): add remote control client management RPCs (#25785)
## Why Remote-control clients need to list and revoke controller-device grants without enabling or enrolling the local relay. These are signed-in account-management operations, so coupling them to websocket, pairing, enrollment, or persisted relay state would prevent clients from managing stale grants from the picker. Related enhancement request: N/A. This adds the Codex app-server surface for the planned upstream environment-scoped revoke endpoint. ## What Changed - Added experimental app-server v2 RPCs: - `remoteControl/client/list` - `remoteControl/client/revoke` - Added picker-oriented protocol types and standard generated schema fixtures. The list response intentionally omits backend account id, enrollment status, and location fields. - Added `app-server-transport/src/transport/remote_control/clients.rs` for environment-scoped GET and DELETE requests. It builds escaped URL path segments, forwards optional pagination query fields, sends ChatGPT auth plus `chatgpt-account-id`, converts RFC3339 `last_seen_at` values to Unix seconds, accepts `204 No Content` revoke responses, and retries once after a `401`. - Extracted shared ChatGPT auth loading and recovery into `app-server-transport/src/transport/remote_control/auth.rs` so websocket, pairing, and client management use the same account-auth boundary. - Retained the configured remote-control base URL on `RemoteControlHandle` and resolve management URLs lazily, preserving deferred validation while relay startup is disabled. - Registered list as `global_shared_read("remote-control-clients")` and revoke as `global("remote-control-clients")`. ## Verification - Added transport coverage proving list and revoke work while relay state is disabled, IDs are escaped, picker-only fields are returned, timestamps are converted, revoke accepts `204`, auth headers are forwarded, `401` retries exactly once, `403` is not retried, and malformed list payloads retain decode context. - Added an app-server integration test proving both JSON-RPC methods work before relay enablement and successful revoke returns `{}`. - Regenerated and validated experimental and standard app-server schema fixtures.Anton Panasenko ·
2026-06-02 17:01:02 -07:00 -
Add multi-agent runtime metadata types (#25720)
Stack split from #25708. Original PR intentionally left open. This first PR adds the multi-agent runtime metadata types and catalog plumbing used by the rest of the stack.
jif-oai ·
2026-06-02 12:10:14 +02:00 -
feat(remote-control): add pairing start (#25675)
## Why Remote control enrollment authorizes a desktop server, but app-server v2 did not expose the follow-up pairing operation needed to mint a short-lived controller pairing artifact from that enrolled server. Clients need a narrow RPC that starts pairing without exposing the backend `serverId` or conflating pairing with websocket connection state. Issue: N/A; internal remote-control pairing API change. ## What Changed Added experimental app-server v2 `remoteControl/pairing/start` with `manualCode` input and `pairingCode`, nullable `manualPairingCode`, `environmentId`, and Unix-seconds `expiresAt` output. The method serializes under its own `global("remote-control-pairing")` scope and is documented in `app-server/README.md`. Extended the remote-control transport with private `/server/pair` request/response types and normalized `pair_url` handling. Pairing uses the current enrolled server bearer, refreshes that bearer when needed, keeps backend `server_id` private, validates returned `server_id` and `environment_id` against the current enrollment, and preserves backend status/header/body context for failures and malformed responses. Wired the request through `RemoteControlRequestProcessor` and `MessageProcessor`, mapping unavailable/disabled pairing to `invalid_request` and backend failures to internal errors. ## Verification - `just test -p codex-app-server-transport` - `just test -p codex-app-server remote_control_pairing_start_returns_pairing_artifacts`Anton Panasenko ·
2026-06-02 01:05:50 +00:00 -
fix: rename McpServer to TestAppServer (#25701)
This PR brought to you via VS Code rather than Codex... - opened `codex-rs/app-server/tests/common/mcp_process.rs` - put the cursor on `McpServer` - hit `F2` and renamed the symbol to `TestAppServer` - went to the file tree - hit enter and renamed `mcp_process.rs` to `test_app_server.rs` - ran **Save All Files** from the Command Palette - ran `just fmt` The End (Admittedly, most of the local variables for `TestAppServer` are still named `mcp`, though.)
Michael Bolin ·
2026-06-01 21:49:38 +00:00 -
[codex-rs] auto-review model override (#23767)
## Why Guardian auto-review normally uses the provider-preferred review model when one is available. Some parent models need model-catalog metadata to select a different review model while keeping older `/models` payloads compatible when that metadata is absent. ## What changed - Added optional `ModelInfo::auto_review_model_override` metadata to the public model payload as a review-model slug. - Updated Guardian review model selection to prefer the catalog override when present, while preserving the existing provider preferred-model path and parent-model fallback when it is omitted. - Added focused Guardian coverage for override and no-override model selection. - Added an `auto_review` core integration suite test that loads override metadata from a remote model catalog path and asserts the strict auto-review `/responses` request uses the catalog-selected review model. - Updated existing `ModelInfo` fixtures and local catalog constructors for the new optional field. ## Validation - `cargo test -p codex-protocol model_info_defaults_availability_nux_to_none_when_omitted` - `cargo test -p codex-core guardian_review_uses_` - `cargo test -p codex-core remote_model_override_uses_catalog_model_for_strict_auto_review --test all` - `just fix -p codex-protocol` - `just fix -p codex-core` - `just fmt` - `git diff --check`
Won Park ·
2026-06-01 11:51:15 -07:00 -
store and expose parent_thread_id on Threads (#25113)
## Why This PR https://github.com/openai/codex/pull/24161#discussion_r3325692763 revealed a subagent data modeling issue, where we overloaded `forked_from_id` to also mean `parent_thread_id`. That's incorrect since guardian and review subagents can be a subagent and NOT fork the main thread's history. The solution here is to explicitly store a new `parent_thread_id` on `SessionMeta`, alongside `forked_from_id` which already exists. While we're at it, also expose it in the app-server protocol on the `Thread` object. A thread->subagent relationship and a fork of thread history are orthogonal concepts. ## What Changed - Added top-level `parent_thread_id` persistence on `SessionMeta` and runtime/session plumbing through `SessionConfiguredEvent`, `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`, `TurnContext`, and `ModelClient`. - Made turn metadata, request headers, analytics, and subagent-start events read the separate runtime/top-level parent field instead of deriving general parent lineage from `SessionSource` or `forked_from_thread_id`. - Passed parent lineage separately at delegated subagent, review, guardian, agent-job, and multi-agent spawn construction sites; copied-history fork lineage remains derived only from `InitialHistory`. - Persisted and exposed parent lineage through rollout/thread-store projections and app-server v2 `Thread.parentThreadId`. - Updated app-server README text and regenerated app-server schema fixtures for the additive `parentThreadId` response field.
Owen Lin ·
2026-06-01 04:33:20 +00:00 -
[codex] Add model tool mode selector (#25031)
## Why Some models need to select their code-execution behavior through model catalog metadata. Models without that metadata must continue to follow the existing `CodeMode` and `CodeModeOnly` feature flags, including when a newer server sends an enum value this client does not recognize. ## What changed - add optional `ModelInfo.tool_mode` metadata with `direct`, `code_mode`, and `code_mode_only` - treat omitted and unknown wire values as `None` - resolve `None` from the existing feature flags - carry the resolved `ToolMode` directly on `TurnContext`, outside `Config` - use the resolved value for turn creation, model switches, review turns, tool planning, and code execution ## Coverage - add protocol coverage for omitted, known, and unknown enum values - add focused coverage for flag fallback and explicit metadata overriding feature flags - add core integration coverage that fetches remote model metadata through `/v1/models` and verifies the outbound `/responses` tools for explicit `direct` and `code_mode_only` selectors ## Stack - followed by #25032
Ahmed Ibrahim ·
2026-05-29 09:05:05 -07:00 -
Add runtime extra skill roots API (#24977)
## Summary - Add v2 `skills/extraRoots/set` to replace app-server process-local standalone skill roots. The setting is not persisted, accepts missing roots, and `extraRoots: []` clears the runtime set. - Wire runtime roots into core skill discovery for `skills/list` and turn loads, clear skill caches on set, and register the roots with the skills watcher so later filesystem changes emit `skills/changed`. - Update app-server docs, generated JSON/TypeScript schemas, and coverage for serialization, missing roots, empty clears, and restart behavior. ## Testing - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core-skills` - `cargo test -p codex-app-server skills_extra_roots_set_updates_process_runtime_roots` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core-skills` - `just fix -p codex-app-server`
xl-openai ·
2026-05-28 21:14:34 -07:00 -
package: include zsh fork in Codex package (#23756)
## Why The package layout gives Codex a stable place for runtime helpers that should travel with the entrypoint. `shell_zsh_fork` still required users to configure `zsh_path` manually, even though we already publish prebuilt zsh fork artifacts. This PR builds on #24129 and uses the shared DotSlash artifact fetcher to include the zsh fork in Codex packages when a matching target artifact exists. Packaged Codex builds can then discover the bundled fork automatically; the user/profile `zsh_path` override is removed so the feature uses the package-managed artifact instead of a legacy path knob. ## What Changed - Added `scripts/codex_package/codex-zsh`, a checked-in DotSlash manifest for the current macOS arm64 and Linux zsh fork artifacts. - Taught `scripts/build_codex_package.py` to fetch the matching zsh fork artifact and install it at `codex-resources/zsh/bin/zsh` when available for the selected target. - Added package layout validation for the optional bundled zsh resource. - Added `InstallContext::bundled_zsh_path()` and `InstallContext::bundled_zsh_bin_dir()` for package-layout resource discovery. - Threaded the packaged zsh path through config loading as the runtime `zsh_path` for packaged installs, and removed the config/profile/CLI override path. - Kept the packaged default zsh override typed as `AbsolutePathBuf` until the existing runtime `Config::zsh_path` boundary. - Updated app-server zsh-fork integration tests to spawn `codex-app-server` from a temporary package layout with `codex-resources/zsh/bin/zsh`, matching the new packaged discovery path instead of setting `zsh_path` in config. - Switched package executable copying from metadata-preserving `copy2()` to `copyfile()` plus explicit executable bits, which avoids macOS file-flag failures when local smoke tests use system binaries as inputs. ## Testing To verify that the `zsh` executable from the Codex package is picked up correctly, first I ran: ```shell ./scripts/build_codex_package.py ``` which created: ``` /private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/ ``` so then I ran: ``` /private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/bin/codex exec --enable shell_zsh_fork 'run `echo $0`' ``` which reported the following, as expected: ``` /private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/codex-resources/zsh/bin/zsh ``` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23756). * #23768 * __->__ #23756
Michael Bolin ·
2026-05-22 17:54:07 -07:00 -
[codex] Add rollout-backed thread content search (#23519)
## Summary - add experimental `thread/search` for local rollout-backed thread search using `rg` over JSONL rollouts - return search-specific result rows with optional previews instead of storing preview data on `StoredThread` or ordinary `Thread` responses - keep `thread/list` separate from full-content search and document the new app-server surface ## Testing - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server thread_search_returns_content_and_title_matches -- --nocapture`
Francis Chalissery ·
2026-05-21 11:52:24 -07:00 -
Honor client-resolved service tier defaults (#23537)
## Why Model catalog responses can now advertise a nullable `default_service_tier` for each model. Codex needs to preserve three distinct states all the way from config/app-server inputs to inference: - no explicit service tier, so the client may apply the current model catalog default when FastMode is enabled - explicit `default`, meaning the user intentionally wants standard routing - explicit catalog tier ids such as `priority`, `flex`, or future tiers Keeping those states distinct prevents the UI from showing one tier while core sends another, especially after model switches or app-server `thread/start` / `turn/start` updates. ## What Changed - Plumbed `default_service_tier` through model catalog protocol types, app-server model responses, generated schemas, model cache fixtures, and provider/model-manager conversions. - Added the request-only `default` service tier sentinel and normalized legacy config spelling so `fast` in `config.toml` still materializes as the runtime/request id `priority`. - Moved catalog default resolution to the TUI/client side, including recomputing the effective service tier when model/FastMode-dependent surfaces change. - Updated app-server thread lifecycle config construction so `serviceTier: null` preserves explicit standard-routing intent by mapping to `default` instead of internal `None`. - Kept core responsible for validating explicit tiers against the current model and stripping `default` before `/v1/responses`, without applying catalog defaults itself. ## Validation - `CARGO_INCREMENTAL=0 cargo build -p codex-cli` - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list` - `cargo test -p codex-tui service_tier` - `cargo test -p codex-protocol service_tier_for_request` - `cargo test -p codex-core get_service_tier` - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core service_tier`
Shijie Rao ·
2026-05-20 15:57:50 -07:00 -
Add thread/settings/update app-server API (#23502)
## Why App-server clients need a way to update a thread's next-turn settings without starting a turn, adding transcript content, or waiting for turn lifecycle events. This gives settings UI a direct path for durable thread settings while clients observe the eventual effective state through a notification. This is a simplified rework of PR https://github.com/openai/codex/pull/22509. In particular, it changes the `thread/settings/update` api to return immediately rather than waiting and returning the effective (updated) thread settings. This makes the new api consistent with `turn/start` and greatly reduces the complexity of the implementation relative to the earlier attempt. ## What Changed - Adds experimental `thread/settings/update` with partial-update request fields and an empty acknowledgment response. - Adds experimental `thread/settings/updated`, carrying full effective `ThreadSettings` and scoped by `threadId` to subscribed clients for the affected thread. - Shares durable settings validation with `turn/start`, including `sandboxPolicy` plus `permissions` rejection and `serviceTier: null` clearing. - Emits the same settings notification when `turn/start` overrides change the stored effective thread settings. - Regenerates app-server protocol schema fixtures and updates `app-server/README.md`.
Eric Traut ·
2026-05-20 11:03:20 -07:00 -
feat: add permission profile list api (#23412)
## Why Clients need a typed permission-profile catalog instead of reconstructing that state from config internals. ## What changed - Added `permissionProfile/list` to the app-server v2 protocol with cursor pagination and optional `cwd`. - The list response includes built-in permission profiles plus config-defined `[permissions.<id>]` profiles from the effective config for the request context. - Permission profiles keep optional `description` metadata for display purposes. - App-server docs and schema fixtures are updated for the new RPC.
viyatb-oai ·
2026-05-20 02:42:56 +00:00 -
[codex] Add installed-plugin mention API (#22448)
## Summary - add app-server `plugin/installed` for mention-oriented plugin loading - return installed plugins plus explicitly requested install-suggestion rows - keep remote handling on installed-state data instead of the broad catalog listing path ## Why The `@` mention surface only needs plugins that are usable now, plus a small product-approved set of install suggestions. It does not need the full catalog-shaped `plugin/list` payload that the Plugins page uses. ## Validation - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core-plugins` - `cargo test -p codex-app-server --test all plugin_installed_` ## Notes - The package-wide `cargo test -p codex-app-server` run still hits an existing unrelated stack overflow in `in_process::tests::in_process_start_clamps_zero_channel_capacity`. - Companion webview PR: https://github.com/openai/openai/pull/915672
xli-oai ·
2026-05-18 03:11:54 -07:00 -
feat(app-server): update remote control APIs for better UX (#22877)
## Why To help improve `codex remote-control` CLI UX which I plan to do in a followup, this PR adds `server-name` to the various remote control APIs: - `remoteControl/enable` - `remoteControl/disable` - `remoteControl/status/changed` Also, add a `remoteControl/status/read` API. This will be helpful in the Codex App.
Owen Lin ·
2026-05-15 14:33:24 -07:00 -
enable/disable remote control at runtime, not via features (#22578)
## Why reapplies https://github.com/openai/codex/pull/22386 which was previously reverted Also, introduce `remoteControl/enable` and `remoteControl/disable` app-server APIs to toggle on/off remote control at runtime for a given running app-server instance. ## What Changed - Adds experimental v2 RPCs: - `remoteControl/enable` - `remoteControl/disable` - Adds `RemoteControlRequestProcessor` and routes the new RPCs through it instead of `ConfigRequestProcessor`. - Adds named `RemoteControlHandle::enable`, `disable`, and `status` methods. - Makes `remoteControl/enable` return an error when sqlite state DB is unavailable, while keeping enrollment/websocket failures as async status updates. - Adds `AppServerRuntimeOptions.remote_control_enabled` and hidden `--remote-control` flags for `codex app-server` and `codex-app-server`. - Updates managed daemon startup to use `codex app-server --remote-control --listen unix://`. - Marks `Feature::RemoteControl` as removed and ignores `[features].remote_control`. - Updates app-server README entries for the new remote-control methods.
Owen Lin ·
2026-05-14 01:07:46 +00:00 -
feat(app-server, threadstore): Thread pagination APIs and ThreadStore contract (#21566)
## Why The goal of this PR is to align on app-server and `ThreadStore` API updates for paginating through large threads. #### app-server ##### `thread/turns/list` - Updates `thread/turns/list` to support `itemsView?: "notLoaded" | "summary" | "full" | null`, defaulting to `summary`. - Implements the current `thread/turns/list` behavior over the existing persisted rollout-history fallback: - `notLoaded` returns turn envelopes with empty `items`. - `summary` returns the first user message and final assistant message when available. - `full` preserves the existing full item behavior. Note that this method still uses the naive approach of loading the entire rollout file, and returns just the filtered slice of the data. Real pagination will come later by leveraging SQLite. ##### `thread/turns/items/list` - Adds the experimental `thread/turns/items/list` protocol, schema, dispatcher, and processor stub. The app-server currently returns JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`. #### ThreadStore - Adds the experimental `thread/turns/items/list` protocol, schema, dispatcher, and processor stub. The app-server currently returns JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`. - Adds `ThreadStore` contract types and stubbed methods for listing thread turns and listing items within a turn. - Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking app-server API enums or lossy string status values into the store-facing turn contract. - Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking app-server API enums or lossy string status values into the store-facing turn contract. This also sketches the storage abstraction we expect to need once turns are indexed/stored. In particular, `notLoaded` is useful only if ThreadStore can eventually list turn metadata without loading every persisted item for each turn. ## Validation - Added/updated protocol serialization coverage for the new request and response shapes. - Added app-server integration coverage for `thread/turns/list` default summary behavior and all three `itemsView` modes. - Added app-server integration coverage that `thread/turns/items/list` returns the expected unsupported JSON-RPC error when experimental APIs are enabled. - Added thread-store coverage that the default trait methods return `ThreadStoreError::Unsupported`. No developers.openai.com documentation update is needed for this internal experimental app-server API surface.
Owen Lin ·
2026-05-07 15:44:43 -07:00 -
Disable empty Cargo test targets (#21584)
## Summary `cargo test` has entails both running standard Rust tests and doctests. It turns out that the doctest discovery is fairly slow, and it's a cost you pay even for crates that don't include any doctests. This PR disables doctests with `doctest = false` for crates that lack any doctests. For the collection of crates below, this speeds up test execution by >4x. E.g., before this PR: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 1.849 s ± 4.455 s [User: 0.752 s, System: 1.367 s] Range (min … max): 0.418 s … 14.529 s 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml Time (mean ± σ): 428.6 ms ± 6.9 ms [User: 187.7 ms, System: 219.7 ms] Range (min … max): 418.0 ms … 436.8 ms 10 runs ``` For a single crate, with >2x speedup, before: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 491.1 ms ± 9.0 ms [User: 229.8 ms, System: 234.9 ms] Range (min … max): 480.9 ms … 512.0 ms 10 runs ``` And after: ``` Benchmark 1: cargo test -p codex-utils-string Time (mean ± σ): 213.9 ms ± 4.3 ms [User: 112.8 ms, System: 84.0 ms] Range (min … max): 206.8 ms … 221.0 ms 13 runs ``` Co-authored-by: Codex <noreply@openai.com>
Charlie Marsh ·
2026-05-07 15:44:17 -07:00 -
[codex-analytics] rework thread_source for thread analytics (#20949)
## Summary - make `thread_source` an explicit optional thread-level field on `thread/start`, `thread/fork`, and returned thread payloads - persist `thread_source` in rollout/session metadata so resumed live threads retain the original value - replace the old best-effort `session_source` -> `thread_source` mapping with an explicit caller-supplied analytics classification ## Why Before this change, analytics `thread_source` was populated by a best-effort mapping from `session_source`. `session_source` describes the runtime/client surface, not the actual thread-level origin, so that projection was not accurate enough to distinguish cases such as `user`, `subagent`, `memory_consolidation`, and future thread origins reliably. Making `thread_source` explicit keeps one thread-level analytics field while letting callers provide the real classification directly instead of recovering it indirectly from `session_source`. ## Impact For new analytics events, `thread_source` now reflects the explicit thread-level classification supplied by the caller rather than an inferred value derived from `session_source`. Existing protocol fields remain optional; callers that omit `threadSource` now produce `null` instead of a best-effort inferred value. ## Validation - `just write-app-server-schema` - `cargo test -p codex-analytics -p codex-core -p codex-app-server-protocol --no-run` - `cargo test -p codex-app-server-protocol generated_ts_optional_nullable_fields_only_in_params` - `cargo test -p codex-analytics thread_initialized_event_serializes_expected_shape` - `cargo test -p codex-core resume_stopped_thread_from_rollout_preserves_thread_source`
rhan-oai ·
2026-05-06 02:12:31 +00:00 -
1- Add model service tiers metadata (#20969)
## Why The model list needs to carry display-ready service tier metadata so clients can render tier choices with stable IDs, names, and descriptions. A raw speed-tier string list is not enough for richer UI copy or future tier labels. ## What changed - Added `ModelServiceTier` to shared model metadata with string `id`, `name`, and `description` fields. - Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving empty defaults for older cached model payloads. - Exposed `serviceTiers` on app-server v2 `Model` responses and threaded it through TUI app-server model conversion. - Marked legacy `additional_speed_tiers` / `additionalSpeedTiers` metadata as deprecated in source and generated schema output. - Regenerated app-server protocol JSON schema and TypeScript fixtures, including `ModelServiceTier.ts`. ## Verification - Ran `just write-app-server-schema`. - Did not run local tests per repo instruction; relying on PR CI. --------- Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-05-05 09:51:18 +03:00 -
[codex] Add unsandboxed process exec API (#19040)
## Why App-server clients sometimes need argv-based local process execution while sandbox policy is controlled outside Codex. Those environments can reject sandbox-disabling paths before a command ever starts, even when the caller intentionally wants unsandboxed execution. This PR adds a distinct `process/*` API for that use case instead of extending `command/exec` with another sandbox-disabling shape. Keeping the new surface separate also makes the future removal of `command/exec` simpler: clients that need explicit process lifecycle control can move to the newer handle-based API without depending on `command/exec` business logic. ## What changed - Added v2 process lifecycle methods: `process/spawn`, `process/writeStdin`, `process/resizePty`, and `process/kill`. - Added process notifications: `process/outputDelta` for streamed stdout/stderr chunks and `process/exited` for final exit status and buffered output. - Made `process/spawn` intentionally unsandboxed and omitted sandbox-selection fields such as `sandboxPolicy` and `permissionProfile`. - Added client-supplied, connection-scoped `processHandle` values for follow-up control requests and notification routing. - Supported cwd, environment overrides, PTY mode and size, stdin streaming, stdout/stderr streaming, per-stream output caps, and timeout controls. - Killed active process sessions when the originating app-server connection closes. - Wired the implementation through the modular `request_processors/` app-server layout, with process-handle request serialization for follow-up control calls. - Updated generated JSON/TypeScript schema fixtures and documented the new API in `codex-rs/app-server/README.md`. - Added v2 app-server integration coverage in `codex-rs/app-server/tests/suite/v2/process_exec.rs` for spawn acknowledgement before exit, buffered output caps, and process termination. ## Verification - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server` --------- Co-authored-by: Owen Lin <owen@openai.com>
Ruslan Nigmatullin ·
2026-05-04 16:43:58 -07:00 -
Add remote plugin skill read API (#20150)
## Summary Adds an app-server `plugin/skill/read` method for remote plugin skill markdown. The new method calls the plugin-service skill detail endpoint and returns `skill_md_contents`, so clients can preview skills for remote plugins before the bundle is installed locally. ## Why Uninstalled remote plugin skills do not have local `SKILL.md` files. Without an on-demand remote read, the desktop plugin details UI cannot render the skill details modal for those skills. ## Validation - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server --test all -- suite::v2::plugin_read::plugin_skill_read_reads_remote_skill_contents_when_remote_plugin_enabled --exact` - `just fix -p codex-app-server-protocol -p codex-core-plugins -p codex-app-server`
xli-oai ·
2026-05-01 00:16:25 -07:00 -
Sync remote installed plugin bundles (#20268)
## Summary - Download missing remote installed plugin bundles during app-server startup and plugin/list refresh. - Upgrade cached remote installed bundles when the backend installed version changes. - Remove stale remote installed bundle caches without writing remote plugin state into config.toml. ## Review note This is a clean PR branch cut from the current diff on top of latest `origin/main`. The diff intentionally has no `codex-rs/core/**` files, so CODEOWNERS should not request the core-directory owner review from stale PR history. ## Validation Already run on the source branch before creating this clean PR: - `just fmt` - `cargo test -p codex-core-plugins` - `cargo test -p codex-app-server --test all app_server_startup_sync_downloads_remote_installed_plugin_bundles -- --nocapture` - `cargo test -p codex-app-server --test all plugin_list_sync_upgrades_and_removes_remote_installed_plugin_bundles -- --nocapture` - `cargo test -p codex-app-server --test all app_server_startup_remote_plugin_sync_runs_once -- --nocapture` - `just fix -p codex-core-plugins` - `just fix -p codex-app-server` - `git diff --check`
xli-oai ·
2026-04-30 16:05:14 -07:00 -
Add hooks/list app-server RPC (#19778)
## Why We need a way to list the available hooks to expose via the TUI and App so users can view and manage their hooks ## What - Adds `hooks/list` for one or more `cwd` values that returns discovered hook metadata ## Stack 1. openai/codex#19705 2. This PR - openai/codex#19778 3. openai/codex#19840 4. openai/codex#19882 ## Review Notes The generated schema files account for most of the raw diff, these files have the core change: - `hooks/src/engine/discovery.rs` builds the inventory entries during hook discovery while leaving runtime handlers focused on execution. - `app-server/src/codex_message_processor.rs` wires `hooks/list` into the app-server flow for each requested `cwd`. - `app-server-protocol/src/protocol/v2.rs` defines the new v2 request/response payloads exposed on the wire. ### Core Changes `core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so `skills/list` and `hooks/list`can resolve plugin state for each requested `cwd` --------- Co-authored-by: Codex <noreply@openai.com>
Abhinav ·
2026-04-29 23:39:57 +00:00 -
feat: expose provider capability bounds to app server clients (#20049)
follow up of #19442. The app server now exposes provider-derived bounds through a new v2 `modelProvider/read` method. The response reports the configured provider map key as `modelProvider` and returns the effective capability booleans so clients can align their UI with the same provider-owned limits used by core.
Celia Chen ·
2026-04-29 01:36:19 +00:00 -
Fix plugin list workspace settings test isolation (#20086)
Fixes test that often fails locally when running `cargo test` - Add an app-server test helper that combines managed-config isolation with custom env overrides. - Isolate `HOME` / `USERPROFILE` in plugin-list workspace settings tests so host home marketplaces do not affect results.
canvrno-oai ·
2026-04-28 18:34:38 -07:00 -
test: harden app-server integration tests (#19683)
## Why Windows Bazel runs in the permissions stack exposed that app-server integration tests were launching normal plugin startup warmups in every subprocess. Those warmups can call `https://chatgpt.com/backend-api/plugins/featured` when a test is not specifically exercising plugin startup, which adds slow background work, noisy stderr, and dependence on external network state. The relevant startup/featured-plugin behavior was introduced across #15042 and #15264. A few app-server tests also had long optional waits or unbounded cleanup paths, making failures expensive to diagnose and contributing to slow Windows shards. One external-agent config test from #18246 used a GitHub-style marketplace source, which was enough to exercise the pending remote-import path but also meant the background completion task could attempt a real clone. ## What Changed - Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks` plumbing and a hidden debug-only `--disable-plugin-startup-tasks-for-tests` app-server flag, so integration tests can suppress startup plugin warmups without adding a production env-var gate. - Has the app-server test harness pass that hidden flag by default, while opting plugin-startup coverage back in for tests that intentionally exercise startup sync and featured-plugin warmup behavior. - Lowers normal app-server subprocess logging from `info`/`debug` to `warn` to avoid multi-megabyte stderr output in Bazel logs. - Prevents the external-agent config test from attempting a real marketplace clone by using an invalid non-local source while still exercising the pending-import completion path. - Bounds optional filesystem/realtime waits and fake WebSocket test-server shutdown so failures produce targeted timeouts instead of hanging a shard. - Fixes the Unix script-resolution test in `rmcp-client` to exercise PATH resolution directly and include the actual spawn error in failures. ## Verification - `cargo check -p codex-app-server` - `cargo clippy -p codex-app-server --tests -- -D warnings` - `cargo test -p codex-rmcp-client program_resolver::tests::test_unix_executes_script_without_extension` - `cargo test -p codex-app-server --test all external_agent_config_import_sends_completion_notification_after_pending_plugins_finish -- --nocapture` - `cargo test -p codex-app-server --test all plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request -- --nocapture` - Windows Local Bazel passed with this test-hardening bundle before it was extracted from #19606. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683). * #19395 * #19394 * #19393 * #19392 * #19606 * __->__ #19683
Michael Bolin ·
2026-04-26 12:43:16 -07:00