codex

[codex] Use model metadata for skills usage instructions (#29740)

## Summary

- add a false-by-default `include_skills_usage_instructions` model
metadata field
- enable the field for the bundled `gpt-5.5` model metadata
- consume the metadata in both core and extension skill rendering
- remove hardcoded legacy-model matching and its marker plumbing
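
As a rough illustration of the metadata-driven switch described above, the sketch below gates a rendered instructions block on a boolean field that defaults to `false`. `ModelMetadata`, `render_skills_section`, and the rendered text are hypothetical stand-ins for this example, not the real Codex types.

```rust
/// Hypothetical, trimmed-down stand-in for the model metadata record.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ModelMetadata {
    pub slug: String,
    /// `bool::default()` is `false`, so models opt in explicitly.
    pub include_skills_usage_instructions: bool,
}

/// Renders a skills section, appending usage instructions only when the
/// metadata flag is set. The rendered text is a placeholder.
pub fn render_skills_section(meta: &ModelMetadata, skills: &[&str]) -> String {
    let mut out = String::from("## Skills\n");
    for skill in skills {
        out.push_str("- ");
        out.push_str(skill);
        out.push('\n');
    }
    if meta.include_skills_usage_instructions {
        out.push_str("\nRead a skill's instructions before using it.\n");
    }
    out
}
```

Keeping the decision in metadata means a renderer like this never has to match on model names.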

ani-oai · 2026-06-29 09:44:36 +09:00

6b5f5743b3

app-server: structure and test JSON shutdown logs (#30314)

## Why

`LOG_FORMAT=json` and `RUST_LOG` are supported by app-server, but the
behavior was only covered indirectly. We should verify the actual JSONL
written by both user-facing entry points: `codex app-server` and the
standalone `codex-app-server` binary.

The existing processor shutdown message also always said the channel
closed, even though the processor can exit for several different
reasons. Structured fields make that event more accurate and useful to
log consumers.

## What changed

- Record the processor `exit_reason`, remaining connection count, and
forced-shutdown state as structured tracing fields.
- Add a shared process-test helper that enables JSON logging, validates
every stderr line as JSON, and verifies the top-level timestamp is RFC
3339.
- Cover both `codex app-server` and `codex-app-server`, asserting the
stable `level`, `fields`, and `target` payload.

## Test plan

- `just test -p codex-app-server
standalone_app_server_emits_json_info_events`
- `just test -p codex-cli app_server_emits_json_info_events`

Michael Bolin · 2026-06-26 18:19:56 -07:00

4f1b5a4b73

feat(app-server): add history_mode to thread (#29927)

## Description

This PR adds a `historyMode` field (`"legacy" | "paginated"`) to `Thread`.
The value is stored in `SessionMeta` in the JSONL rollout file and as a
new column in the SQLite thread_metadata table, and is exposed on
`thread/start` and on the `Thread` object in app-server.

## What changed

- Added canonical `ThreadHistoryMode` with `legacy` and `paginated`,
defaulting old and new SessionMeta to `legacy`.
- Carried `history_mode` through core session config, ThreadStore stored
metadata, local/in-memory stores, rollout metadata extraction, and the
existing SQLite `threads` table.
- Added experimental `historyMode` to app-server v2 `Thread` and
`thread/start`.
- Made paginated stored threads metadata-discoverable but unsupported
for legacy full-history reads, `load_history`, live resume, and create
paths.
- Regenerated app-server schema fixtures and added
protocol/state/thread-store/app-server coverage for persistence and
fail-closed behavior.

## Compatibility floor

Because users may be running various versions of Codex binaries on the
same machine (TUI, Codex App, etc.), we will need to establish a
compatibility floor for upcoming paginated threads, which will change
how thread storage reads and writes work.

The overall plan here:
```
Release N:
- Add historyMode to SessionMeta / Thread / SQLite metadata.
- Teach binaries to understand paginated threads.
- If a binary sees `historyMode="paginated"` but does not support the paginated contract, it refuses to resume/mutate the thread.
- Default remains `"legacy"`.

Release N+1:
- First-party clients start opting into paginated threads where appropriate.
- Internal dogfood / staged rollout.
- Measure old-client usage and paginated-thread unsupported errors.

Release N+2:
- Only after Release N+ is overwhelmingly deployed, make paginated the default.
- Accept that a small tail of N-1-or-older binaries may not understand paginated threads.
```

The important behavior change is fail-closed handling for a binary that
encounters a persisted `paginated` thread before it knows how to fully
support paginated history. In app-server, if a thread is `paginated`, we
will:

- allow metadata-only discovery paths like `thread/list` and
`thread/read(includeTurns=false)`, so clients can still see the thread
and inspect its `historyMode`
- reject legacy full-history/live-thread paths like
`thread/read(includeTurns=true)` and `thread/resume` with an unsupported
JSON-RPC error
- avoid silently treating an unknown or future `historyMode` as `legacy`

Under the hood, the ThreadStore layer also rejects legacy operations
that would need to load or replay the full thread history for a
paginated thread. That gives us the behavior we want for Release N:
future paginated threads are visible, but this binary fails closed
instead of trying to operate on them as if they were legacy threads.
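
The fail-closed rules above can be sketched roughly as follows. `ThreadOp`, `parse_history_mode`, and `check_op` are hypothetical names chosen for this example; the real app-server returns a JSON-RPC error rather than a `String`.

```rust
/// Persisted history mode; mirrors the two values named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HistoryMode {
    Legacy,
    Paginated,
}

/// Hypothetical operation classes used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThreadOp {
    List,
    ReadMetadata,
    ReadWithTurns,
    Resume,
}

/// Parses the stored value. Unknown strings are an error rather than a
/// silent fallback to `Legacy`.
pub fn parse_history_mode(raw: &str) -> Result<HistoryMode, String> {
    match raw {
        "legacy" => Ok(HistoryMode::Legacy),
        "paginated" => Ok(HistoryMode::Paginated),
        other => Err(format!("unsupported historyMode: {other}")),
    }
}

/// Metadata-only operations are always allowed; full-history operations
/// are rejected for paginated threads.
pub fn check_op(mode: HistoryMode, op: ThreadOp) -> Result<(), String> {
    match (mode, op) {
        (_, ThreadOp::List) | (_, ThreadOp::ReadMetadata) => Ok(()),
        (HistoryMode::Legacy, _) => Ok(()),
        (HistoryMode::Paginated, op) => {
            Err(format!("{op:?} is unsupported for paginated threads"))
        }
    }
}
```

The key design point is that the parser has no catch-all arm mapping to `Legacy`, so a future mode surfaces as an error instead of being misread.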

Owen Lin · 2026-06-26 09:12:42 -07:00

5267e805fb

Persist selected capability roots and resolve availability per model step (#29856)

## Why

`selectedCapabilityRoots` is durable thread intent: “use this capability
root from environment `worker`.”

The important product assumption is:

> One environment ID always names the same logical executor and stable
contents.

`worker` does not silently change from executor A to an unrelated
executor B. The process-local connection handle for `worker` can still
be replaced while Codex is running, though, for example when
`environment/add` registers a fresh handle for the same logical
environment.

The thread should persist only the stable selection. Each model step
should pair that selection with the exact ready handle captured for that
step.

## The boundary

```text
persisted thread intent
plugin@1 -> environment "worker"
|
| capture the current step
v
model-step view
unavailable, or
plugin@1 + worker's exact captured ready handle
```

The environment ID is the stable identity and cache key. The
`Arc<Environment>` is only a process-local handle retained so consumers
of one model step use the same captured environment. It is never
persisted and it does not imply different environment contents.

## What changes

### Persist the stable selection

Selected roots are written into `SessionMeta` and restored with the
thread. Forked subagents inherit the same selections, including
bounded-history forks.

Only stable data is persisted: root ID, environment ID, and root path.

### Capture readiness together with the exact handle

The environment snapshot records:

```rust
environment_id -> Some(Arc<Environment>) // ready in this step
environment_id -> None // still starting in this step
```

This prevents readiness and execution from coming from different
registry snapshots.

For example:

```text
step snapshot: worker -> handle A, ready
environment/add: worker -> fresh handle B for the same logical environment
current step: plugin@1 still uses captured handle A
```

Without carrying handle A in the snapshot, the resolver could combine “A
was ready” with handle B and treat B as ready before it had finished
starting.

This does not change cache invalidation. Stable capability metadata
remains identified by environment ID and capability root. Replacing a
process-local handle under the same stable environment ID does not
invalidate or rediscover that metadata.

### Resolve availability per model step

- A ready captured environment produces resolved roots using its
captured handle.
- A starting, missing, or failed environment is omitted from that step.
- A selected lazy environment that is outside the turn's captured
environment set is asked to start, and a later step can observe it as
ready.
- No capability files are scanned here.
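
A minimal sketch of the per-step resolution rules above, assuming a snapshot keyed by environment ID. `Environment`, `SelectedRoot`, `ResolvedRoot`, and `resolve_for_step` are illustrative names, and the lazy-start request for environments outside the captured set is not modeled.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a process-local environment handle.
#[derive(Debug)]
pub struct Environment {
    pub label: &'static str,
}

/// Durable selection: only stable identifiers, no handle.
#[derive(Debug, Clone, PartialEq)]
pub struct SelectedRoot {
    pub root_id: String,
    pub environment_id: String,
}

/// A selection paired with the exact handle captured for this step.
#[derive(Debug)]
pub struct ResolvedRoot {
    pub root_id: String,
    pub handle: Arc<Environment>,
}

/// Per-step snapshot: `Some(handle)` means ready, `None` means still starting.
pub type StepSnapshot = HashMap<String, Option<Arc<Environment>>>;

/// Resolves selections against one snapshot. Starting or missing
/// environments are omitted; the durable selection list is left untouched.
pub fn resolve_for_step(selected: &[SelectedRoot], snapshot: &StepSnapshot) -> Vec<ResolvedRoot> {
    selected
        .iter()
        .filter_map(|root| {
            let handle = snapshot.get(&root.environment_id)?.as_ref()?;
            Some(ResolvedRoot {
                root_id: root.root_id.clone(),
                handle: Arc::clone(handle),
            })
        })
        .collect()
}
```

Because readiness and the handle come from the same map entry, a resolver shaped like this cannot pair "A was ready" with a different handle B.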

Transient transport disconnects remain the remote client's reconnect
concern. This PR models initial attachment/readiness; it does not add
live socket-connectivity state.

## Example

```text
thread selection: plugin@1 -> environment "worker"

step 1: worker is starting -> plugin@1 unavailable
step 2: worker is ready -> plugin@1 resolves through worker's captured handle
step 3: fresh local handle -> current step remains pinned; a later step captures its own view
```

Temporary unavailability does not discard the durable selection. Later
PRs can retain stable metadata caches while projecting only currently
available capabilities into model-visible World State.

## Compatibility

The app-server request shape does not change. Older rollouts without
`selected_capability_roots` deserialize to an empty list.

## Stack

1. **This PR:** persist stable selected roots and resolve them through
an exact model-step handle.
2. #29960: cache stable skill metadata and project available skills into
World State.
3. #29946: cache stable plugin declarations and manage the separate live
MCP runtime.

jif · 2026-06-25 17:49:43 +00:00

8f02973d25

auth: move domain mode below app wire types (#29721)

## Why

Authentication mode is a domain concept used by login, model selection,
telemetry, and transports. Keeping the canonical type in app-server
protocol forces those lower-level crates to depend on an unrelated wire
API.

## What changed

- Added canonical `codex_protocol::auth::AuthMode` domain values.
- Kept the app-server wire DTO unchanged and added an explicit app-side
conversion.
- Removed production app-server-protocol dependencies from login,
model-provider-info, models-manager, and otel call paths.
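
The split between a domain type and a wire DTO can be sketched as below. Both enums and their variant sets are illustrative assumptions; only the pattern (canonical domain enum plus an explicit conversion on the app side) is taken from the description above.

```rust
/// Sketch of a domain-level auth mode (lower-level crates depend on this).
/// The variant set here is illustrative, not the real enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DomainAuthMode {
    ApiKey,
    Chatgpt,
}

/// Sketch of the app-server wire DTO, kept separate so its shape can stay
/// stable independently of the domain type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireAuthMode {
    ApiKey,
    Chatgpt,
}

// Explicit app-side conversion; the exhaustive match forces an update
// here whenever a domain variant is added.
impl From<DomainAuthMode> for WireAuthMode {
    fn from(mode: DomainAuthMode) -> Self {
        match mode {
            DomainAuthMode::ApiKey => WireAuthMode::ApiKey,
            DomainAuthMode::Chatgpt => WireAuthMode::Chatgpt,
        }
    }
}
```

With this layout, the dependency points from the wire crate to the domain crate and never the other way around.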

## Stack

This is PR 2 of 6, stacked on [PR
#29714](https://github.com/openai/codex/pull/29714). Review only the
delta from `codex/split-json-rpc-protocols`. Next: [PR
#29722](https://github.com/openai/codex/pull/29722).

## Validation

- Auth and login coverage passed in the focused protocol/domain test
run.
- App-server account and auth conversion coverage passed.

Adam Perry @ OpenAI · 2026-06-24 03:10:20 +00:00

31372078d1

test: add app-server auto environment helper (#29746)

## Why

This starts moving app-server tests toward running against remote and
foreign-OS executors by default. That requires a point of indirection
similar to the core integration tests' `build_with_auto_env`, with the
added flexibility of letting tests control environment registration when
they need to.

## What

This adds:

- `TestAppServer::new_with_auto_env()` for constructing an app server
with a default environment defined by the test runner (e.g. bazel)
- `TestAppServer::auto_env_params()` for tests to easily acquire turn
env params tailored to the automatic environment
- `TestAppServer::send_thread_start_request_with_auto_env()` to make it
easy for tests to start a thread using the automatic environment

The above methods all fail if the test calling them has set up an
environment where the automatic environment configuration conflicts with
test-created state.

## Validation

Adds a couple of basic smoke tests to the app-server test suite.
Follow-ups will migrate more tests to use it.

Adam Perry @ OpenAI · 2026-06-24 01:06:29 +00:00

283bc4cf01

core: persist initial context window metadata (#29519)

## Why

PR #29494 made context-window IDs visible to the model by wrapping the
token-budget window payload in `<context_window>`, but rollout JSONL
consumers still could not see the initial window identity by tailing the
session file. Compacted rollout items carry window IDs only after
compaction has happened, so a session with no compaction had no durable
JSONL record for window 0.

This change gives tailing consumers a stable initial-window record at
session creation time.

## What Changed

- Added `session_meta.context_window.window_id` for the initial
context-window identity.
- `CreateThreadParams` now requires `initial_window_id: String`, so
thread-store callers cannot accidentally create new threads without
window-0 metadata.
- Live thread creation derives the persisted initial window ID from the
same `AutoCompactWindowIds` used to initialize `SessionState`, keeping
runtime state and JSONL metadata aligned.
- Rollout reconstruction uses `session_meta.context_window.window_id` as
the initial-window fallback and derives `window_number = 0`,
`first_window_id = window_id`, and `previous_window_id = None`
internally.
- Fork reconstruction intentionally uses the same rollout reconstruction
path; consumers that need to distinguish copied initial-window metadata
can use the rollout `thread_id`.
- Legacy compactions without `window_number` still use compaction-count
fallback accounting instead of being reset to window 0 by the
initial-window fallback.
- Compacted rollout metadata still takes precedence once compaction
records exist, preserving the richer chain fields there.
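
The reconstruction precedence above can be sketched roughly as follows. `WindowIds`, `initial_window`, and `reconstruct` are hypothetical names for this example, and the legacy compaction-count fallback for compactions without `window_number` is deliberately left out.

```rust
/// Context-window identity as reconstructed from a rollout.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WindowIds {
    pub window_number: u64,
    pub first_window_id: String,
    pub previous_window_id: Option<String>,
    pub window_id: String,
}

/// Derives the window-0 fields from the single persisted `window_id`.
pub fn initial_window(window_id: &str) -> WindowIds {
    WindowIds {
        window_number: 0,
        first_window_id: window_id.to_string(),
        previous_window_id: None,
        window_id: window_id.to_string(),
    }
}

/// Compacted metadata wins when present; otherwise fall back to the
/// initial window from `session_meta`, if any.
pub fn reconstruct(
    session_meta_window_id: Option<&str>,
    latest_compacted: Option<WindowIds>,
) -> Option<WindowIds> {
    latest_compacted.or_else(|| session_meta_window_id.map(initial_window))
}
```

Only `window_id` needs to be written for window 0 because the other three fields follow from it.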

## JSONL Shape

Real rollout JSONL is one object per line. This example is expanded for
readability, but shows the new initial `session_meta.context_window`
record followed by the existing compacted rollout item shape that also
carries window IDs:

```jsonl
{
  "timestamp": "2026-06-22T12:00:00.000Z",
  "type": "session_meta",
  "payload": {
    "session_id": "<THREAD_ID>",
    "id": "<THREAD_ID>",
    "timestamp": "2026-06-22T12:00:00.000Z",
    "cwd": "/repo",
    "originator": "codex",
    "cli_version": "0.0.0",
    "source": "cli",
    "model_provider": "<MODEL_PROVIDER>",
    "context_window": {
      "window_id": "<INITIAL_WINDOW_ID>"
    }
  }
}
...
{
  "timestamp": "2026-06-22T12:34:56.000Z",
  "type": "compacted",
  "payload": {
    "message": "<COMPACTION_SUMMARY>",
    "replacement_history": [
      "..."
    ],
    "window_number": 1,
    "first_window_id": "<INITIAL_WINDOW_ID>",
    "previous_window_id": "<INITIAL_WINDOW_ID>",
    "window_id": "<NEXT_WINDOW_ID>"
  }
}
```

The nested `context_window` object is intentional: it gives rollout
consumers a stable namespace for context-window metadata while only
writing the non-derivable initial `window_id`. For the initial window,
`window_number`, `first_window_id`, and `previous_window_id` are derived
internally instead of being written to the rollout.

## Verification

- `just test -p codex-protocol`
- `just test -p codex-rollout
recorder_materializes_on_flush_with_pending_items`
- `just test -p codex-core reconstruct_history`
- `just test -p codex-core
record_initial_history_reconstructs_forked_transcript`
- `just test -p codex-thread-store`
- `just test -p codex-state`
- `just test -p codex-app-server
thread_read_returns_summary_without_turns`
- `just test -p codex-rollout persistence_metrics`

Michael Bolin · 2026-06-23 21:50:50 +00:00

01f89c8c59

feat(app-server): thread/turns/items/list -> thread/items/list (#29705)

## Description

Rename the experimental app-server item pagination API from
`thread/turns/items/list` to `thread/items/list` and make `turnId`
optional. Clients can now page persisted items across a thread, or still
filter to one turn when needed.

## What changed

- Rename the request/response protocol types and JSON-RPC method to
`ThreadItemsList*` / `thread/items/list`.
- Pass optional `turnId` through to `ThreadStore::list_items`.
- Update app-server docs and focused protocol/app-server tests.
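
To illustrate what an optional `turnId` means for a listing call, here is a small in-memory sketch. `StoredItem` and this `list_items` function are hypothetical and do not reflect the actual `ThreadStore::list_items` signature; pagination cursors are omitted.

```rust
/// Hypothetical persisted item; real items carry much more data.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredItem {
    pub turn_id: String,
    pub item_id: String,
}

/// Returns items across the whole thread, or only those from one turn
/// when `turn_id` is provided.
pub fn list_items<'a>(items: &'a [StoredItem], turn_id: Option<&str>) -> Vec<&'a StoredItem> {
    items
        .iter()
        .filter(|item| turn_id.map_or(true, |id| item.turn_id == id))
        .collect()
}
```

Passing `None` pages across the thread, while `Some("...")` restores the previous per-turn behavior.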

## Validation

- `just test -p codex-app-server-protocol thread_items_list_round_trips`
- `just test -p codex-app-server thread_items_list_returns_unsupported`

Owen Lin · 2026-06-23 13:57:08 -07:00

1882719b30

Persist session IDs across thread resume (#29327)

## Summary

A cold-resumed subagent kept its durable thread ID but could receive a
new session ID, splitting one agent tree across multiple sessions after
a restart.

Persist the root session ID in every rollout `SessionMeta`, carry it
through thread creation, and restore it before initializing the resumed
`Session` and `AgentControl`.

## Behavior

For a nested agent tree:

```text
root session R
  parent thread P
    child thread C
```

The child rollout stores:

```text
session_id:       R
parent_thread_id: P
id:               C
```

After a cold resume, the child still belongs to root session `R` while
its immediate parent remains `P`. The integration coverage uses distinct
values for all three IDs so it catches restoring the session from
`parent_thread_id`.

## Legacy rollouts

Previous rollouts have `id` but no `session_id`. `SessionMetaLine`
deserialization treats a missing `session_id` as `id`, keeping those
files readable, listable, and resumable. When a legacy subagent is
resumed through its root, that synthesized child ID no longer overrides
the inherited root-scoped `AgentControl`. New rollouts always persist
the explicit root session ID.
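
The legacy fallback can be sketched as below, assuming the metadata has already been parsed into an optional field. `RawSessionMeta` and `effective_session_id` are hypothetical names; the real logic lives in `SessionMetaLine` deserialization.

```rust
/// Hypothetical, trimmed-down view of the persisted session metadata.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RawSessionMeta {
    pub id: String,
    /// Absent in rollouts written before this change.
    pub session_id: Option<String>,
    pub parent_thread_id: Option<String>,
}

/// Returns the session ID and whether it was explicitly persisted.
/// A synthesized ID (copied from `id`) is flagged so a caller can avoid
/// letting it override an inherited root session.
pub fn effective_session_id(meta: &RawSessionMeta) -> (String, bool) {
    match &meta.session_id {
        Some(session_id) => (session_id.clone(), true),
        None => (meta.id.clone(), false),
    }
}
```

Note that the fallback never reads `parent_thread_id`, which is the mistake the integration coverage is designed to catch.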

jif · 2026-06-22 09:36:08 +02:00

6d15bb3d17

[codex] Use expect in integration tests (#28441)

The workspace denies `clippy::expect_used` in production. Although
`clippy.toml` allows `expect` in tests, Bazel Clippy compiles
integration-test helper code in a way that does not receive that
exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
and equivalent `match`/`let else` forms.

This allows `clippy::expect_used` once at each integration-test crate
root (including aggregated suites and test-support libraries), then
replaces manual panic-based Result and Option unwraps with
`expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
crate roots. Intentional assertion and unexpected-variant panics remain
unchanged, and the production `expect_used = "deny"` lint remains in
place.
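
For illustration, the before and after shapes look roughly like this. `parse_port` is a made-up helper, and the allow is written at function level here only so the snippet stands alone; the change described above places it once at each integration-test crate root.

```rust
use std::num::ParseIntError;

/// Hypothetical helper standing in for fallible test setup.
pub fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse::<u16>()
}

/// Before: the verbose form used to avoid `expect`.
pub fn port_verbose(raw: &str) -> u16 {
    parse_port(raw).unwrap_or_else(|err| panic!("port should parse: {err:?}"))
}

/// After: the same behavior with `expect`, permitted here by a
/// function-level allow.
#[allow(clippy::expect_used)]
pub fn port_concise(raw: &str) -> u16 {
    parse_port(raw).expect("port should parse")
}
```

Both forms panic with the message and the error's `Debug` output on failure, so the rewrite does not change test diagnostics in any meaningful way.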

The cleanup is mechanical and net-negative in line count.

pakrym-oai · 2026-06-15 21:53:47 -07:00

e752f7b4ae

Add realtime speech append control (#27917)

## Why

Realtime voice harness tuning needs app-side control over what backend
Codex text is spoken. Backend orchestrator text is written for a reading
UI, so automatically speaking every preamble, progress update, or final
assistant message can make the realtime voice model too chatty.

For experimentation, clients need two simple controls: keep app/client
text-item injection on the existing item-create path, and add an
explicit speakable path that app code can call only when it wants
realtime to speak. Automatic Codex output also needs an opt-in way to
switch from the protocol's default speakable path to regular realtime
items, with a caller-provided prefix so prompt wording can be tuned
outside core.

The default remains unchanged: if a client omits the new start fields
and never calls `appendSpeech`, automatic backend output continues down
the existing speakable path for the selected realtime protocol.

## What Changed

- Adds experimental `thread/realtime/appendSpeech` for app-provided
speakable text.
- Keeps existing `thread/realtime/appendText` as the item-create API for
app-provided realtime text items.
- Adds `codexResponsesAsItems` / `codex_responses_as_items` on
`thread/realtime/start` to send automatic Codex responses with
`conversation.item.create` instead of the protocol's default speakable
output path.
- Adds `codexResponseItemPrefix` / `codex_response_item_prefix` so
clients can prepend experiment instructions to those automatic Codex
response items.
- Keeps literal `conversation.handoff.append` routing scoped to the v1
speakable path; v2 default speech uses its item/function-output plus
`response.create` behavior.
- Removes the earlier public silent-context API and hardcoded
silent-context prefix.
- Updates realtime tests to cover default automatic speakable behavior,
opt-in automatic item-create behavior, and explicit `appendSpeech`
behavior.

## Validation

- `cargo check -p codex-core -p codex-app-server -p codex-api`
- `just test -p codex-app-server realtime_conversation`
- `just test -p codex-core realtime_conversation` (50/51 passed in the
filtered parallel run; the lone failure passed when rerun in isolation)
- `just test -p codex-core
conversation_mirrors_assistant_message_text_to_realtime_handoff`
- `just test -p codex-api
e2e_connect_and_exchange_events_against_mock_ws_server`
- `just fix -p codex-core`
- `just fix -p codex-app-server`
- `cargo build -p codex-cli`

guinness-oai · 2026-06-15 16:15:58 -07:00

1d8ff89aa3

feat(app-server): expose rate-limit reset credits (#28143)

## Why

Codex users can earn personal rate-limit reset credits, but app-server
clients do not currently have an API for reading or redeeming them. This
adds the backend and protocol foundation used by the `/usage` TUI flow
in #28154.

## What changed

- Extend `account/rateLimits/read` with a nullable
`rateLimitResetCredits` summary sourced from the existing usage
response.
- Add backend-client and app-server support for consuming a reset with a
caller-generated idempotency key. A UUID is recommended, and clients
reuse the same key when retrying the same logical reset.
- Return only the consume `outcome`; clients refetch
`account/rateLimits/read` for updated window state.
- Document the response field and each consume outcome, and regenerate
the JSON and TypeScript schema fixtures.
- Clarify in `AGENTS.md` that new app-server string enum values use
camelCase on the wire.
- Update the existing TUI response fixture for the expanded protocol
shape.
- Add coverage for authentication, response mapping, backend failures,
consume outcomes, and request timeout behavior.
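
The idempotency-key contract can be illustrated with a toy in-memory backend. Everything here is an assumption made for the sketch: `ResetCredits`, the `ConsumeOutcome` variants, and the storage strategy are not the real backend's API or outcome names.

```rust
use std::collections::HashMap;

/// Hypothetical outcomes; the real enum and its wire names may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConsumeOutcome {
    Consumed,
    NoCreditsAvailable,
}

/// Toy in-memory backend that remembers the outcome for each key.
#[derive(Debug, Default)]
pub struct ResetCredits {
    pub remaining: u32,
    seen: HashMap<String, ConsumeOutcome>,
}

impl ResetCredits {
    pub fn with_credits(remaining: u32) -> Self {
        Self { remaining, seen: HashMap::new() }
    }

    /// Retrying with the same key returns the recorded outcome and does
    /// not spend a second credit.
    pub fn consume(&mut self, idempotency_key: &str) -> ConsumeOutcome {
        if let Some(outcome) = self.seen.get(idempotency_key) {
            return *outcome;
        }
        let outcome = if self.remaining > 0 {
            self.remaining -= 1;
            ConsumeOutcome::Consumed
        } else {
            ConsumeOutcome::NoCreditsAvailable
        };
        self.seen.insert(idempotency_key.to_string(), outcome);
        outcome
    }
}
```

This is why a client that times out should retry with the same key: a fresh key would be treated as a new logical reset.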

## Validation

- `just test -p codex-app-server-protocol` — 231 passed.
- `just test -p codex-backend-client` — 14 passed.
- Focused `codex-app-server` reset-credit tests — 5 passed.
- Focused `codex-tui` protocol response fixture test — passed.
- `just fix -p codex-backend-client -p codex-app-server-protocol -p
codex-app-server` — passed.
- `just fmt` — passed.

jay · 2026-06-15 21:54:01 +00:00

bef99f861b

build: run buildifier from just fmt (#28125)

## Intent

Keep Bazel and Starlark files consistently formatted without requiring
contributors to install or version buildifier themselves.

## Implementation

- Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
v8.5.1.
- Run buildifier from the shared `just fmt` and `just fmt-check` driver,
with Windows-safe explicit DotSlash invocation.
- Provision DotSlash in formatting CI and contributor devcontainers, and
document the source-build prerequisite.
- Apply the initial mechanical buildifier formatting baseline.

Adam Perry @ OpenAI · 2026-06-13 21:43:39 -07:00

740c4f269d

feat(app-server): enforce managed remote control disable (#27961)

## Why

Managed deployments need a reliable deny gate for remote control.
Persisted enablement and explicit startup requests currently remain able
to start the transport, while the removed `features.remote_control` key
is intentionally only a compatibility no-op.

This adds a dedicated requirement that administrators can use to force
remote control off without deleting the user's persisted preference.
Removing the requirement and restarting restores the prior choice.

## What Changed

- Added top-level `allow_remote_control` requirements parsing, sourced
layer precedence, debug output, and `configRequirements/read` exposure
as `allowRemoteControl`.
- Added a typed transport policy captured from the startup requirements
snapshot. Managed disable forces the initial state to disabled and
prevents enrollment, refresh, connection, and persisted-preference
mutation.
- Rejected every `remoteControl/*` RPC before parameter deserialization
with JSON-RPC `-32600` and `remote control is disabled by managed
requirements`.
- Preserved the existing disabled status notification and the previous
behavior when the requirement is `true` or omitted.
- Regenerated app-server protocol schemas and documented the new
requirement.
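
A rough sketch of a gate that runs on the method name alone, before any parameter deserialization. `RpcError` and `check_remote_control_gate` are hypothetical names; the error code and message are the ones quoted in the list above.

```rust
/// JSON-RPC "Invalid Request" code, as cited in the list above.
pub const INVALID_REQUEST: i64 = -32600;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RpcError {
    pub code: i64,
    pub message: String,
}

/// Gate evaluated on the method name alone, before any params are parsed.
/// `allow_remote_control` is `None` when the requirement is omitted.
pub fn check_remote_control_gate(
    method: &str,
    allow_remote_control: Option<bool>,
) -> Result<(), RpcError> {
    let managed_disable = allow_remote_control == Some(false);
    if managed_disable && method.starts_with("remoteControl/") {
        return Err(RpcError {
            code: INVALID_REQUEST,
            message: "remote control is disabled by managed requirements".to_string(),
        });
    }
    Ok(())
}
```

Because the check never looks at parameters, a malformed request gets the same managed-policy error as a well-formed one.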

## Verification

- Confirmed all remote-control RPCs, including a malformed request,
return the managed-policy error while the initial status notification
remains `disabled`.
- Confirmed explicit ephemeral startup and persisted enablement make no
backend connection and leave the SQLite preference unchanged.
- Confirmed `allow_remote_control = true` does not enable or block
remote control and `configRequirements/read` returns
`allowRemoteControl: false` for the deny policy.

Related issue: N/A (managed-policy hardening).

Anton Panasenko · 2026-06-12 20:10:12 -07:00

b9dc3b7a8b

feat: use encrypted local secrets for CLI auth (#27539)

## Why

Windows Credential Manager limits generic credential blobs to 2,560
bytes. Large serialized ChatGPT auth payloads can exceed that limit, so
keyring-mode CLI auth needs a backend that keeps only the encryption key
in the OS keyring and stores the payload in Codex's encrypted
local-secrets file.

This is the third PR in the encrypted-auth stack:

1. #27504 — feature and config selection
2. #27535 — auth-specific local-secrets namespaces
3. This PR — CLI auth implementation and activation
4. MCP OAuth implementation and activation

## What Changed

- Added encrypted CLI-auth storage using the `CliAuth` secrets
namespace.
- Preserved direct keyring storage for platforms/configurations where it
remains selected.
- Selected the backend consistently for login, logout, refresh,
device-code login, auth loading, and login restrictions.
- Threaded resolved bootstrap/full config through CLI, exec, TUI,
app-server account handling, cloud config, and cloud tasks.
- Removed stale `auth.json` fallback data after successful encrypted
saves and removed encrypted, direct-keyring, and fallback data during
logout.
- Added storage and integration coverage for both direct and encrypted
keyring modes.

MCP OAuth persistence is intentionally left to the next PR.

## Validation

- `just test -p codex-login` — 131 passed
- `just test -p codex-cli` — 280 passed
- `just test -p codex-app-server v2::account` — 25 passed
- `just test -p codex-cloud-config service` — 21 passed, 7 skipped
- `just fix -p codex-login`
- `just fix -p codex-cli`
- `just fmt`

Celia Chen · 2026-06-12 21:23:50 +00:00

56c97e3b5c

feat(app-server): persist remote-control desired state (#27445)

## Why

Remote-control runtime enablement and persisted enrollment preference
were represented by separate flags. That made startup rehydration, RPC
persistence, and new-enrollment seeding race with one another, and it
did not cleanly distinguish runtime-only CLI or daemon starts from
durable app-server RPC changes.

## What Changed

- Replace the parallel enablement, seed, and rehydration flags with one
transport-owned `RemoteControlDesiredState`.
- Add nullable enrollment-scoped persistence and preserve existing
preferences during enrollment upserts.
- Rehydrate plain startup only after auth and client scope resolve,
without overwriting a concurrent RPC transition.
- Make ordinary `remoteControl/enable` and `remoteControl/disable`
durable while retaining `ephemeral: true` for runtime-only callers.
- Have the daemon explicitly request ephemeral enablement and regenerate
the app-server schemas.

## Verification

- Covered migration and `NULL`/`0`/`1` persistence round trips.
- Covered plain-start rehydration and runtime-only versus durable
enrollment seeding.
- Covered durable enable, durable disable, and ephemeral enable through
app-server RPC.
- Covered the daemon's exact `{ "ephemeral": true }` request payload.
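
The `NULL`/`0`/`1` round trip mentioned above can be sketched as a pair of mapping functions. The function names are hypothetical, and reading `NULL` as "no recorded preference" is an assumption of this sketch rather than something the description states.

```rust
/// Desired state as persisted per enrollment: `None` maps to SQL `NULL`
/// (assumed to mean no recorded preference), `Some(false)` to `0`,
/// and `Some(true)` to `1`.
pub fn to_column(desired: Option<bool>) -> Option<i64> {
    desired.map(i64::from)
}

/// Reads the column back. Values other than `NULL`, `0`, or `1` are
/// rejected instead of being coerced.
pub fn from_column(raw: Option<i64>) -> Result<Option<bool>, String> {
    match raw {
        None => Ok(None),
        Some(0) => Ok(Some(false)),
        Some(1) => Ok(Some(true)),
        Some(other) => Err(format!("unexpected desired-state value: {other}")),
    }
}
```

A nullable column keeps "never set" distinct from "explicitly disabled", which a plain boolean cannot express.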

Related issue: N/A (internal remote-control persistence architecture
change).

Anton Panasenko · 2026-06-11 21:28:52 -07:00

d61dfeb23a

[codex] Add comp_hash to model metadata (#27532)

## Summary
- add optional `comp_hash` metadata to `ModelInfo`
- update `ModelInfo` fixtures for the shared schema change
- keep older model responses compatible by defaulting the field to
`None`

## Why
The models endpoint needs an opaque identifier for compaction-compatible
model configurations. This PR only exposes that value in model metadata;
it does not add it to turn context or change runtime behavior.

Follow-up #27520 carries the value through turn context and rollouts,
then uses it to trigger compaction.

## Stack
- based directly on `main`
- replaces #27519, which was accidentally merged into the wrong base
branch
- functionality follow-up: #27520

## Testing
- `just test -p codex-protocol
model_info_defaults_availability_nux_to_none_when_omitted`
- `just fix -p codex-core -p codex-protocol -p codex-analytics -p
codex-models-manager`

Ahmed Ibrahim · 2026-06-10 20:42:55 -07:00

e614fad02e

feat: add Bedrock API key as a managed auth mode (#27443)

## Why

Codex needs to manage Amazon Bedrock API key credentials through the
existing auth lifecycle instead of introducing a separate auth manager
or provider-specific credential file. Treating Bedrock API key login as
a primary auth mode gives it the same persistence, keyring, reload, and
logout behavior as the existing OpenAI API key and ChatGPT modes.

The credential is valid only for the `amazon-bedrock` model provider.
OpenAI-compatible providers must reject this auth mode rather than
treating the Bedrock key as an OpenAI bearer token.

## What changed

- Added `bedrockApiKey` as an app-server `AuthMode` and
`CodexAuth::BedrockApiKey` as a primary `AuthManager` mode.
- Added `BedrockApiKeyAuth`, containing the API key and AWS region, to
the existing `AuthDotJson` payload stored in `$CODEX_HOME/auth.json` or
the configured keyring backend.
- Added `login_with_bedrock_api_key(...)`, parallel to
`login_with_api_key(...)`, which replaces the current stored login with
Bedrock credentials.
- Reused generic auth reload and logout behavior instead of adding a
Bedrock-specific auth manager or logout path.
- Updated login restrictions, status reporting, diagnostics, telemetry
classification, generated app-server schemas, and auth fixtures for the
new mode.
- Added explicit errors when Bedrock API key auth is selected with an
OpenAI-compatible model provider.
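
The provider restriction can be sketched as a single check. `AuthKind` and `check_auth_for_provider` are illustrative names and the error text is invented; only the `amazon-bedrock` provider ID and the rejection rule come from the description above.

```rust
/// Illustrative subset of primary auth modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthKind {
    ApiKey,
    Chatgpt,
    BedrockApiKey,
}

/// Provider ID named in the text above.
pub const BEDROCK_PROVIDER_ID: &str = "amazon-bedrock";

/// Rejects a Bedrock key paired with any non-Bedrock provider, so the key
/// is never sent as an OpenAI bearer token. The reverse pairing
/// (Bedrock provider, non-Bedrock auth) is out of scope for this sketch.
pub fn check_auth_for_provider(auth: AuthKind, provider_id: &str) -> Result<(), String> {
    if auth == AuthKind::BedrockApiKey && provider_id != BEDROCK_PROVIDER_ID {
        return Err(format!(
            "Bedrock API key auth cannot be used with model provider `{provider_id}`"
        ));
    }
    Ok(())
}
```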

This PR establishes managed storage and auth-mode behavior. Routing the
managed key and region into Amazon Bedrock requests is left to follow-up
PRs.

Celia Chen · 2026-06-10 20:42:38 -07:00

06afd63f4a

Add app-server thread/delete API (#25018)

## Why

Clients can archive and unarchive threads today, but there is no
app-server API for permanently removing a thread. Deletion also needs to
cover the full session tree: deleting a main thread should remove
spawned subagent threads and the related local metadata instead of
leaving orphaned rollout files, goals, or subagent state behind.

## What

- Adds the v2 `thread/delete` request and `thread/deleted` notification,
with the response shape kept consistent with `thread/archive`.
- Implements local hard delete for active and archived rollout files.
- Deletes the requested thread's state DB row as the commit point, then
best-effort cleans associated state including spawned descendants,
goals, spawn edges, logs, dynamic tools, and agent job assignments.
- Updates app-server API docs and generated protocol schema/TypeScript
fixtures.

Eric Traut · 2026-06-10 11:22:12 -07:00

a19d43a40a

[codex-rs] support v2 personal access tokens (#25731)

## Summary

- add v2 personal access token support for `codex login
--with-access-token` and `CODEX_ACCESS_TOKEN`
- classify opaque `at-` tokens separately from legacy Agent Identity
JWTs
- hydrate required ChatGPT account metadata through AuthAPI
`/v1/user-auth-credential/whoami`
- use PATs directly as bearer tokens while preserving existing ChatGPT
account surfaces
- expose PAT-backed auth as the explicit `personalAccessToken`
app-server auth mode

## Implementation

PAT auth is intentionally small and stateless. Loading a PAT performs
one AuthAPI metadata request, stores the hydrated metadata in the
in-memory auth object, and redacts the secret from debug output. Legacy
Agent Identity JWT handling remains unchanged. The shared access-token
classifier lives in a private neutral module because it dispatches
between both credential types.
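
A minimal sketch of a shape-based classifier and a redacting wrapper, under stated assumptions. The `at-` prefix comes from the summary above; treating any three non-empty dot-separated segments as a JWT is an assumption of this sketch, and `AccessTokenKind`, `classify_access_token`, and `Secret` are hypothetical names.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AccessTokenKind {
    PersonalAccessToken,
    AgentIdentityJwt,
}

/// Classifies a token by shape only; no signature or claims are checked.
pub fn classify_access_token(token: &str) -> Option<AccessTokenKind> {
    if token.starts_with("at-") {
        return Some(AccessTokenKind::PersonalAccessToken);
    }
    let segments: Vec<&str> = token.split('.').collect();
    if segments.len() == 3 && segments.iter().all(|s| !s.is_empty()) {
        return Some(AccessTokenKind::AgentIdentityJwt);
    }
    None
}

/// Wrapper whose `Debug` output never includes the secret.
pub struct Secret(pub String);

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret(<redacted>)")
    }
}
```

Classification only decides which loading path to take; validity is still established by the live `whoami` request described above.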

PAT hydration fails closed when AuthAPI omits any required metadata,
including email. Hydrated metadata is intentionally not persisted:
startup performs a live `whoami` preflight so revoked tokens or changed
account metadata are not accepted from a stale cache.

## Workspace restriction scope

This change intentionally does **not** apply
`forced_chatgpt_workspace_id` to PAT authentication. The setting is a
client-side config guardrail, not an authorization boundary, and PAT
does not currently require workspace-ID parity. The PAT login and
`CODEX_ACCESS_TOKEN` paths therefore validate through AuthAPI without
threading workspace-restriction state through access-token loading.
Existing workspace checks for non-PAT auth remain on their established
paths.

## App-server compatibility

The public app-server `AuthMode` is shared across v1 and v2, and
PAT-backed auth reports `personalAccessToken` through both APIs.
Following human review, this intentionally removes the temporary v1
compatibility mapping that reported PATs as `chatgpt`; the deprecated v1
API is kept in parity with v2 rather than maintaining a separate closed
enum. Clients with exhaustive auth-mode handling in either API version
must add the new case and should generally treat it as ChatGPT-backed
unless they need PAT-specific behavior.

The v1 auth-status response still omits the raw PAT when `includeToken`
is requested because that response cannot carry the account metadata
needed to reuse the credential safely. Persisted PAT auth also omits the
new enum value so older Codex builds can deserialize `auth.json` and
infer PAT auth from the credential field after a rollback.

## Validation

Latest review-fix validation:

- `CARGO_INCREMENTAL=0 just test -p codex-login` (126 passed)
- `CARGO_INCREMENTAL=0 just test -p codex-cli` (263 passed)
- `CARGO_INCREMENTAL=0 just test -p codex-cli
stored_auth_validation_handles_personal_access_token`
- `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol` (226
passed)
- `CARGO_INCREMENTAL=0 just test -p codex-models-manager
refresh_available_models_uses_remote_only_catalog_for_chatgpt_auth`
- `CARGO_INCREMENTAL=0 just test -p codex-tui
existing_non_oauth_chatgpt_login_counts_as_signed_in`
- `CARGO_INCREMENTAL=0 just fix -p codex-login -p
codex-app-server-protocol -p codex-models-manager -p codex-tui -p
codex-cli`
- `just fmt`
- `git diff --check`

The broader `codex-tui` suite previously compiled and ran 2,834 tests.
Three unrelated environment-sensitive guardian/IDE-socket tests failed
after retries; the PAT-relevant TUI coverage passed.

cooper-oai · 2026-06-05 17:36:18 -07:00

df7818c7d1

feat(app-server): add remote control pairing status RPC (#26450 )

## What

Exposes the pairing status transport as experimental app-server v2 RPC
`remoteControl/pairing/status`.

- Adds request/response protocol types for exactly one lookup key:
`pairingCode` or `manualPairingCode`, returning `{ claimed }`.
- Registers the RPC with `global_shared_read("remote-control-pairing")`.
- Wires the method through `MessageProcessor` and
`RemoteControlRequestProcessor`.
- Validates missing/conflicting pairing-code params as invalid requests.
- Documents the RPC in `app-server/README.md`.
- Adds processor, protocol export, and JSON-RPC integration coverage for
both code paths.
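
A minimal sketch of the "exactly one lookup key" validation described above. The enum, function name, and error strings are hypothetical; only the `pairingCode` / `manualPairingCode` parameter names come from the PR description.

```rust
/// Hypothetical lookup key: exactly one of the two codes must be supplied.
#[derive(Debug, PartialEq, Eq)]
enum PairingLookup {
    PairingCode(String),
    ManualPairingCode(String),
}

/// Reject missing and conflicting params; both map to an invalid request.
fn parse_pairing_lookup(
    pairing_code: Option<String>,
    manual_pairing_code: Option<String>,
) -> Result<PairingLookup, String> {
    match (pairing_code, manual_pairing_code) {
        (Some(code), None) => Ok(PairingLookup::PairingCode(code)),
        (None, Some(code)) => Ok(PairingLookup::ManualPairingCode(code)),
        (Some(_), Some(_)) => {
            Err("provide only one of pairingCode or manualPairingCode".to_string())
        }
        (None, None) => {
            Err("one of pairingCode or manualPairingCode is required".to_string())
        }
    }
}
```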

## Why

This is the app-server surface the desktop app can poll while the
QR/manual pairing modal is active.

Depends on https://github.com/openai/codex/pull/26449
Related backend change: https://github.com/openai/openai/pull/990244

## Verification

- `cargo test --manifest-path app-server-protocol/Cargo.toml
remote_control`
- `cargo test --manifest-path app-server/Cargo.toml remote_control`
- `cargo fmt --all --check`
- `git diff --check`

hefuc-oai · 2026-06-05 10:33:56 -07:00

0177231ca0

[codex] Add use_responses_lite 'override' logic (#26487 )

## Summary

- add a defaulted `ModelInfo.use_responses_lite` catalog field
- support serializing `reasoning.context` while preserving the existing
effort and summary path
- has not been turned on for any models yet

I've added an override that disables parallel tool calls when
`responses_lite` is on, and I've forced persistent reasoning when using
`responses_lite`. It would be ideal to centralize all the
`responses_lite` plumbing, but I think this is best for now to keep the
plumbing and diffs small.

## Testing

- `cargo test -p codex-protocol
model_info_defaults_availability_nux_to_none_when_omitted`
- `RUST_MIN_STACK=8388608 cargo test -p codex-core
responses_lite_sets_all_turns_context_and_disables_parallel_tool_calls`
- `RUST_MIN_STACK=8388608 cargo test -p codex-core
configured_reasoning_summary_is_sent`
- `cargo check -p codex-core --tests`
- `RUST_MIN_STACK=8388608 cargo clippy -p codex-core --tests` (passes
with pre-existing warnings in `codex-code-mode` and
`codex-core-plugins`)

rka-oai · 2026-06-04 18:49:51 -07:00

e0096db6dc

[codex] Support model-defined reasoning efforts (#26444 )

## Summary
- accept non-empty model-defined reasoning effort values while
preserving built-in effort behavior
- propagate the non-Copy effort type through core, app-server, TUI,
telemetry, and persistence call sites
- preserve string wire encoding and expose an open-string schema for
clients
- update model selection and shortcut behavior for model-advertised
effort values

## Root cause
`ReasoningEffort` gained a string-backed custom variant, so it could no
longer implement `Copy` or rely on derived closed-enum serialization.
Existing consumers still moved effort values from shared references and
assumed a fixed built-in value set.
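
To make the root cause concrete, here is a minimal sketch of an effort type with a string-backed custom variant. The built-in variant names and the `parse` / `as_str` helpers are assumptions for illustration, not the actual `ReasoningEffort` definition.

```rust
/// Sketch only: a `String` payload means the type can be `Clone` but not `Copy`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
    /// Model-defined effort value carried as an open string.
    Custom(String),
}

impl ReasoningEffort {
    /// Parse a wire string; empty values are rejected.
    fn parse(value: &str) -> Option<Self> {
        match value {
            "" => None,
            "low" => Some(Self::Low),
            "medium" => Some(Self::Medium),
            "high" => Some(Self::High),
            other => Some(Self::Custom(other.to_string())),
        }
    }

    /// String wire encoding, identical for built-in and custom values.
    fn as_str(&self) -> &str {
        match self {
            Self::Low => "low",
            Self::Medium => "medium",
            Self::High => "high",
            Self::Custom(value) => value.as_str(),
        }
    }
}

/// Call sites holding a shared reference must clone instead of copying.
fn effort_from_shared(shared: &ReasoningEffort) -> ReasoningEffort {
    shared.clone()
}
```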

## Validation
- `just fmt`
- Local tests and compilation were not run per request; relying on CI.

Ahmed Ibrahim · 2026-06-04 13:36:24 -07:00

8ac304c299

feat(app-server): add remote control client management RPCs (#25785 )

## Why

Remote-control clients need to list and revoke controller-device grants
without enabling or enrolling the local relay. These are signed-in
account-management operations, so coupling them to websocket, pairing,
enrollment, or persisted relay state would prevent clients from managing
stale grants from the picker.

Related enhancement request: N/A. This adds the Codex app-server surface
for the planned upstream environment-scoped revoke endpoint.

## What Changed

- Added experimental app-server v2 RPCs:
  - `remoteControl/client/list`
  - `remoteControl/client/revoke`
- Added picker-oriented protocol types and standard generated schema
fixtures. The list response intentionally omits backend account id,
enrollment status, and location fields.
- Added `app-server-transport/src/transport/remote_control/clients.rs`
for environment-scoped GET and DELETE requests. It builds escaped URL
path segments, forwards optional pagination query fields, sends ChatGPT
auth plus `chatgpt-account-id`, converts RFC3339 `last_seen_at` values
to Unix seconds, accepts `204 No Content` revoke responses, and retries
once after a `401`.
- Extracted shared ChatGPT auth loading and recovery into
`app-server-transport/src/transport/remote_control/auth.rs` so
websocket, pairing, and client management use the same account-auth
boundary.
- Retained the configured remote-control base URL on
`RemoteControlHandle` and resolve management URLs lazily, preserving
deferred validation while relay startup is disabled.
- Registered list as `global_shared_read("remote-control-clients")` and
revoke as `global("remote-control-clients")`.
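
The retry behavior can be sketched independently of any HTTP client. In this hypothetical helper, `send` stands in for issuing the request and returns a status code, and `refresh_auth` stands in for the shared auth-recovery step; treating any 2xx as a successful revoke is an assumption (the PR only states that `204 No Content` is accepted).

```rust
/// Send a request; on a 401, refresh auth and retry exactly once.
/// Any other status, including 403, is returned without retrying.
fn send_with_single_auth_retry<S, R>(mut send: S, mut refresh_auth: R) -> u16
where
    S: FnMut() -> u16,
    R: FnMut(),
{
    let status = send();
    if status != 401 {
        return status;
    }
    refresh_auth();
    // The second result is final, even if it is another 401.
    send()
}

/// Assumption: any 2xx counts as a successful revoke, which covers 204.
fn revoke_succeeded(status: u16) -> bool {
    (200..300).contains(&status)
}
```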

## Verification

- Added transport coverage proving list and revoke work while relay
state is disabled, IDs are escaped, picker-only fields are returned,
timestamps are converted, revoke accepts `204`, auth headers are
forwarded, `401` retries exactly once, `403` is not retried, and
malformed list payloads retain decode context.
- Added an app-server integration test proving both JSON-RPC methods
work before relay enablement and successful revoke returns `{}`.
- Regenerated and validated experimental and standard app-server schema
fixtures.

Anton Panasenko · 2026-06-02 17:01:02 -07:00

98a62a62ce

Add multi-agent runtime metadata types (#25720 )

Stack split from #25708. Original PR intentionally left open. This first
PR adds the multi-agent runtime metadata types and catalog plumbing used
by the rest of the stack.

jif-oai · 2026-06-02 12:10:14 +02:00

3f1fb7ed8b

feat(remote-control): add pairing start (#25675 )

## Why

Remote control enrollment authorizes a desktop server, but app-server v2
did not expose the follow-up pairing operation needed to mint a
short-lived controller pairing artifact from that enrolled server.
Clients need a narrow RPC that starts pairing without exposing the
backend `serverId` or conflating pairing with websocket connection
state.

Issue: N/A; internal remote-control pairing API change.

## What Changed

Added experimental app-server v2 `remoteControl/pairing/start` with
`manualCode` input and `pairingCode`, nullable `manualPairingCode`,
`environmentId`, and Unix-seconds `expiresAt` output. The method
serializes under its own `global("remote-control-pairing")` scope and is
documented in `app-server/README.md`.

Extended the remote-control transport with private `/server/pair`
request/response types and normalized `pair_url` handling. Pairing uses
the current enrolled server bearer, refreshes that bearer when needed,
keeps backend `server_id` private, validates returned `server_id` and
`environment_id` against the current enrollment, and preserves backend
status/header/body context for failures and malformed responses.

Wired the request through `RemoteControlRequestProcessor` and
`MessageProcessor`, mapping unavailable/disabled pairing to
`invalid_request` and backend failures to internal errors.

## Verification

- `just test -p codex-app-server-transport`
- `just test -p codex-app-server
remote_control_pairing_start_returns_pairing_artifacts`

Anton Panasenko · 2026-06-02 01:05:50 +00:00

0002316687

fix: rename McpServer to TestAppServer (#25701 )

This PR brought to you via VS Code rather than Codex...

- opened `codex-rs/app-server/tests/common/mcp_process.rs`
- put the cursor on `McpServer`
- hit `F2` and renamed the symbol to `TestAppServer`
- went to the file tree
- hit enter and renamed `mcp_process.rs` to `test_app_server.rs`
- ran **Save All Files** from the Command Palette
- ran `just fmt`

The End

(Admittedly, most of the local variables for `TestAppServer` are still
named `mcp`, though.)

Michael Bolin · 2026-06-01 21:49:38 +00:00

6536841d89

[codex-rs] auto-review model override (#23767 )

## Why

Guardian auto-review normally uses the provider-preferred review model
when one is available. Some parent models need model-catalog metadata to
select a different review model while keeping older `/models` payloads
compatible when that metadata is absent.

## What changed

- Added optional `ModelInfo::auto_review_model_override` metadata to the
public model payload as a review-model slug.
- Updated Guardian review model selection to prefer the catalog override
when present, while preserving the existing provider preferred-model
path and parent-model fallback when it is omitted.
- Added focused Guardian coverage for override and no-override model
selection.
- Added an `auto_review` core integration suite test that loads override
metadata from a remote model catalog path and asserts the strict
auto-review `/responses` request uses the catalog-selected review model.
- Updated existing `ModelInfo` fixtures and local catalog constructors
for the new optional field.
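
The selection order described above can be sketched as a single fallback chain. The function name and signature are hypothetical; only the precedence (catalog override, then provider-preferred model, then parent model) comes from the PR description.

```rust
/// Pick the review model: catalog override first, then the provider's
/// preferred review model, then the parent model itself.
fn select_review_model<'a>(
    catalog_override: Option<&'a str>,
    provider_preferred: Option<&'a str>,
    parent_model: &'a str,
) -> &'a str {
    catalog_override
        .or(provider_preferred)
        .unwrap_or(parent_model)
}
```

Older `/models` payloads that omit the override simply pass `None` and fall through to the existing behavior.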

## Validation

- `cargo test -p codex-protocol
model_info_defaults_availability_nux_to_none_when_omitted`
- `cargo test -p codex-core guardian_review_uses_`
- `cargo test -p codex-core
remote_model_override_uses_catalog_model_for_strict_auto_review --test
all`
- `just fix -p codex-protocol`
- `just fix -p codex-core`
- `just fmt`
- `git diff --check`

Won Park · 2026-06-01 11:51:15 -07:00

f1609d9fb6

store and expose parent_thread_id on Threads (#25113 )

## Why

This PR
https://github.com/openai/codex/pull/24161#discussion_r3325692763
revealed a subagent data modeling issue, where we overloaded
`forked_from_id` to also mean `parent_thread_id`. That's incorrect since
guardian and review subagents can be a subagent and NOT fork the main
thread's history.

The solution here is to explicitly store a new `parent_thread_id` on
`SessionMeta`, alongside `forked_from_id` which already exists. While
we're at it, also expose it in the app-server protocol on the `Thread`
object.

A thread->subagent relationship and a fork of thread history are
orthogonal concepts.

## What Changed

- Added top-level `parent_thread_id` persistence on `SessionMeta` and
runtime/session plumbing through `SessionConfiguredEvent`,
`CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`,
`TurnContext`, and `ModelClient`.
- Made turn metadata, request headers, analytics, and subagent-start
events read the separate runtime/top-level parent field instead of
deriving general parent lineage from `SessionSource` or
`forked_from_thread_id`.
- Passed parent lineage separately at delegated subagent, review,
guardian, agent-job, and multi-agent spawn construction sites;
copied-history fork lineage remains derived only from `InitialHistory`.
- Persisted and exposed parent lineage through rollout/thread-store
projections and app-server v2 `Thread.parentThreadId`.
- Updated app-server README text and regenerated app-server schema
fixtures for the additive `parentThreadId` response field.

Owen Lin · 2026-06-01 04:33:20 +00:00

cf0911076f

[codex] Add model tool mode selector (#25031 )

## Why
Some models need to select their code-execution behavior through model
catalog metadata. Models without that metadata must continue to follow
the existing `CodeMode` and `CodeModeOnly` feature flags, including when
a newer server sends an enum value this client does not recognize.

## What changed
- add optional `ModelInfo.tool_mode` metadata with `direct`,
`code_mode`, and `code_mode_only`
- treat omitted and unknown wire values as `None`
- resolve `None` from the existing feature flags
- carry the resolved `ToolMode` directly on `TurnContext`, outside
`Config`
- use the resolved value for turn creation, model switches, review
turns, tool planning, and code execution
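
A minimal sketch of the parsing and fallback rules above. The function names are hypothetical, and the precedence between the two feature flags (`CodeModeOnly` winning over `CodeMode`) is an assumption; the PR only states that `None` is resolved from the existing flags.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToolMode {
    Direct,
    CodeMode,
    CodeModeOnly,
}

/// Omitted and unknown wire values both become `None`.
fn parse_tool_mode(wire: Option<&str>) -> Option<ToolMode> {
    match wire? {
        "direct" => Some(ToolMode::Direct),
        "code_mode" => Some(ToolMode::CodeMode),
        "code_mode_only" => Some(ToolMode::CodeModeOnly),
        _ => None,
    }
}

/// Explicit metadata wins; otherwise fall back to the feature flags.
/// Assumption: the `CodeModeOnly` flag takes precedence over `CodeMode`.
fn resolve_tool_mode(
    metadata: Option<ToolMode>,
    code_mode_flag: bool,
    code_mode_only_flag: bool,
) -> ToolMode {
    metadata.unwrap_or(if code_mode_only_flag {
        ToolMode::CodeModeOnly
    } else if code_mode_flag {
        ToolMode::CodeMode
    } else {
        ToolMode::Direct
    })
}
```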

## Coverage
- add protocol coverage for omitted, known, and unknown enum values
- add focused coverage for flag fallback and explicit metadata
overriding feature flags
- add core integration coverage that fetches remote model metadata
through `/v1/models` and verifies the outbound `/responses` tools for
explicit `direct` and `code_mode_only` selectors

## Stack
- followed by #25032

Ahmed Ibrahim · 2026-05-29 09:05:05 -07:00

5577a9e148

Add runtime extra skill roots API (#24977 )

## Summary
- Add v2 `skills/extraRoots/set` to replace app-server process-local
standalone skill roots. The setting is not persisted, accepts missing
roots, and `extraRoots: []` clears the runtime set.
- Wire runtime roots into core skill discovery for `skills/list` and
turn loads, clear skill caches on set, and register the roots with the
skills watcher so later filesystem changes emit `skills/changed`.
- Update app-server docs, generated JSON/TypeScript schemas, and
coverage for serialization, missing roots, empty clears, and restart
behavior.

## Testing
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-core-skills`
- `cargo test -p codex-app-server
skills_extra_roots_set_updates_process_runtime_roots`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-core-skills`
- `just fix -p codex-app-server`

xl-openai · 2026-05-28 21:14:34 -07:00

f0a839ea0c

package: include zsh fork in Codex package (#23756 )

## Why

The package layout gives Codex a stable place for runtime helpers that
should travel with the entrypoint. `shell_zsh_fork` still required users
to configure `zsh_path` manually, even though we already publish
prebuilt zsh fork artifacts.

This PR builds on #24129 and uses the shared DotSlash artifact fetcher
to include the zsh fork in Codex packages when a matching target
artifact exists. Packaged Codex builds can then discover the bundled
fork automatically; the user/profile `zsh_path` override is removed so
the feature uses the package-managed artifact instead of a legacy path
knob.

## What Changed

- Added `scripts/codex_package/codex-zsh`, a checked-in DotSlash
manifest for the current macOS arm64 and Linux zsh fork artifacts.
- Taught `scripts/build_codex_package.py` to fetch the matching zsh fork
artifact and install it at `codex-resources/zsh/bin/zsh` when available
for the selected target.
- Added package layout validation for the optional bundled zsh resource.
- Added `InstallContext::bundled_zsh_path()` and
`InstallContext::bundled_zsh_bin_dir()` for package-layout resource
discovery.
- Threaded the packaged zsh path through config loading as the runtime
`zsh_path` for packaged installs, and removed the config/profile/CLI
override path.
- Kept the packaged default zsh override typed as `AbsolutePathBuf`
until the existing runtime `Config::zsh_path` boundary.
- Updated app-server zsh-fork integration tests to spawn
`codex-app-server` from a temporary package layout with
`codex-resources/zsh/bin/zsh`, matching the new packaged discovery path
instead of setting `zsh_path` in config.
- Switched package executable copying from metadata-preserving `copy2()`
to `copyfile()` plus explicit executable bits, which avoids macOS
file-flag failures when local smoke tests use system binaries as inputs.

## Testing

To verify that the `zsh` executable from the Codex package is picked up
correctly, first I ran:

```shell
./scripts/build_codex_package.py
```

which created:

```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/
```

so then I ran:

```shell
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/bin/codex exec --enable shell_zsh_fork 'run `echo $0`'
```

which reported the following, as expected:

```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/codex-resources/zsh/bin/zsh
```



---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23756).
* #23768
* __->__ #23756

Michael Bolin · 2026-05-22 17:54:07 -07:00

c7bcb90f9b

[codex] Add rollout-backed thread content search (#23519 )

## Summary
- add experimental `thread/search` for local rollout-backed thread
search using `rg` over JSONL rollouts
- return search-specific result rows with optional previews instead of
storing preview data on `StoredThread` or ordinary `Thread` responses
- keep `thread/list` separate from full-content search and document the
new app-server surface

## Testing
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server
thread_search_returns_content_and_title_matches -- --nocapture`

Francis Chalissery · 2026-05-21 11:52:24 -07:00

ac0bff27e7

Honor client-resolved service tier defaults (#23537 )

## Why

Model catalog responses can now advertise a nullable
`default_service_tier` for each model. Codex needs to preserve three
distinct states all the way from config/app-server inputs to inference:

- no explicit service tier, so the client may apply the current model
catalog default when FastMode is enabled
- explicit `default`, meaning the user intentionally wants standard
routing
- explicit catalog tier ids such as `priority`, `flex`, or future tiers

Keeping those states distinct prevents the UI from showing one tier
while core sends another, especially after model switches or app-server
`thread/start` / `turn/start` updates.

## What Changed

- Plumbed `default_service_tier` through model catalog protocol types,
app-server model responses, generated schemas, model cache fixtures, and
provider/model-manager conversions.
- Added the request-only `default` service tier sentinel and normalized
legacy config spelling so `fast` in `config.toml` still materializes as
the runtime/request id `priority`.
- Moved catalog default resolution to the TUI/client side, including
recomputing the effective service tier when model/FastMode-dependent
surfaces change.
- Updated app-server thread lifecycle config construction so
`serviceTier: null` preserves explicit standard-routing intent by
mapping to `default` instead of internal `None`.
- Kept core responsible for validating explicit tiers against the
current model and stripping `default` before `/v1/responses`, without
applying catalog defaults itself.
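
The three states and where each step happens can be sketched as follows. The type and function names are hypothetical, and the split into a client-side resolver and a request-side stripper is a simplified reading of the bullets above, not the actual implementation.

```rust
/// The three states that must stay distinct end to end.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ServiceTierChoice {
    /// No explicit tier; the client may apply the catalog default.
    Unset,
    /// Explicit `default`: the user wants standard routing.
    StandardRouting,
    /// Explicit catalog tier id such as `priority` or `flex`.
    Tier(String),
}

/// Legacy config spelling: `fast` materializes as the request id `priority`.
fn normalize_config_tier(value: &str) -> &str {
    if value == "fast" { "priority" } else { value }
}

/// Client side: apply the catalog default only when nothing was chosen
/// explicitly and FastMode is enabled.
fn resolve_on_client(
    choice: ServiceTierChoice,
    fast_mode: bool,
    catalog_default: Option<&str>,
) -> ServiceTierChoice {
    match (choice, fast_mode, catalog_default) {
        (ServiceTierChoice::Unset, true, Some(id)) => ServiceTierChoice::Tier(id.to_string()),
        (choice, _, _) => choice,
    }
}

/// Core side: strip `default` before the request; never apply catalog defaults.
fn tier_for_request(choice: &ServiceTierChoice) -> Option<&str> {
    match choice {
        ServiceTierChoice::Unset | ServiceTierChoice::StandardRouting => None,
        ServiceTierChoice::Tier(id) => Some(id.as_str()),
    }
}
```

Keeping `StandardRouting` separate from `Unset` is what stops a model switch from silently re-applying a catalog default the user opted out of.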

## Validation

- `CARGO_INCREMENTAL=0 cargo build -p codex-cli`
- `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list`
- `cargo test -p codex-tui service_tier`
- `cargo test -p codex-protocol service_tier_for_request`
- `cargo test -p codex-core get_service_tier`
- `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core
service_tier`

Shijie Rao · 2026-05-20 15:57:50 -07:00

370b13afc9

Add thread/settings/update app-server API (#23502 )

## Why

App-server clients need a way to update a thread's next-turn settings
without starting a turn, adding transcript content, or waiting for turn
lifecycle events. This gives settings UI a direct path for durable
thread settings while clients observe the eventual effective state
through a notification.

This is a simplified rework of PR
https://github.com/openai/codex/pull/22509. In particular, it changes
the `thread/settings/update` api to return immediately rather than
waiting and returning the effective (updated) thread settings. This
makes the new api consistent with `turn/start` and greatly reduces the
complexity of the implementation relative to the earlier attempt.

## What Changed

- Adds experimental `thread/settings/update` with partial-update request
fields and an empty acknowledgment response.
- Adds experimental `thread/settings/updated`, carrying full effective
`ThreadSettings` and scoped by `threadId` to subscribed clients for the
affected thread.
- Shares durable settings validation with `turn/start`, including
`sandboxPolicy` plus `permissions` rejection and `serviceTier: null`
clearing.
- Emits the same settings notification when `turn/start` overrides
change the stored effective thread settings.
- Regenerates app-server protocol schema fixtures and updates
`app-server/README.md`.

Eric Traut · 2026-05-20 11:03:20 -07:00

771a4e74ac

feat: add permission profile list api (#23412 )

## Why

Clients need a typed permission-profile catalog instead of
reconstructing that state from config internals.

## What changed

- Added `permissionProfile/list` to the app-server v2 protocol with
cursor pagination and optional `cwd`.
- The list response includes built-in permission profiles plus
config-defined `[permissions.<id>]` profiles from the effective config
for the request context.
- Permission profiles keep optional `description` metadata for display
purposes.
- App-server docs and schema fixtures are updated for the new RPC.

viyatb-oai · 2026-05-20 02:42:56 +00:00

c3faea0b09

[codex] Add installed-plugin mention API (#22448 )

## Summary
- add app-server `plugin/installed` for mention-oriented plugin loading
- return installed plugins plus explicitly requested install-suggestion
rows
- keep remote handling on installed-state data instead of the broad
catalog listing path

## Why
The `@` mention surface only needs plugins that are usable now, plus a
small product-approved set of install suggestions. It does not need the
full catalog-shaped `plugin/list` payload that the Plugins page uses.

## Validation
- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-app-server --test all plugin_installed_`

## Notes
- The package-wide `cargo test -p codex-app-server` run still hits an
existing unrelated stack overflow in
`in_process::tests::in_process_start_clamps_zero_channel_capacity`.
- Companion webview PR: https://github.com/openai/openai/pull/915672

xli-oai · 2026-05-18 03:11:54 -07:00

da14dd2add

feat(app-server): update remote control APIs for better UX (#22877 )

## Why
To help improve the `codex remote-control` CLI UX, which I plan to do in
a follow-up, this PR adds `server-name` to the various remote control APIs:
- `remoteControl/enable`
- `remoteControl/disable`
- `remoteControl/status/changed`

This PR also adds a `remoteControl/status/read` API, which will be
helpful in the Codex App.

Owen Lin · 2026-05-15 14:33:24 -07:00

6a331a66eb

enable/disable remote control at runtime, not via features (#22578 )

## Why
Reapplies https://github.com/openai/codex/pull/22386, which was
previously reverted.

This PR also introduces `remoteControl/enable` and `remoteControl/disable`
app-server APIs to toggle remote control on and off at runtime for a
given running app-server instance.

## What Changed

- Adds experimental v2 RPCs:
  - `remoteControl/enable`
  - `remoteControl/disable`
- Adds `RemoteControlRequestProcessor` and routes the new RPCs through
it instead of `ConfigRequestProcessor`.
- Adds named `RemoteControlHandle::enable`, `disable`, and `status`
methods.
- Makes `remoteControl/enable` return an error when sqlite state DB is
unavailable, while keeping enrollment/websocket failures as async status
updates.
- Adds `AppServerRuntimeOptions.remote_control_enabled` and hidden
`--remote-control` flags for `codex app-server` and `codex-app-server`.
- Updates managed daemon startup to use `codex app-server
--remote-control --listen unix://`.
- Marks `Feature::RemoteControl` as removed and ignores
`[features].remote_control`.
- Updates app-server README entries for the new remote-control methods.

Owen Lin · 2026-05-14 01:07:46 +00:00

4e368aa2e9

feat(app-server, threadstore): Thread pagination APIs and ThreadStore contract (#21566 )

## Why
The goal of this PR is to align on app-server and `ThreadStore` API
updates for paginating through large threads.


#### app-server
##### `thread/turns/list`
- Updates `thread/turns/list` to support `itemsView?: "notLoaded" |
"summary" | "full" | null`, defaulting to `summary`.
- Implements the current `thread/turns/list` behavior over the existing
persisted rollout-history fallback:
  - `notLoaded` returns turn envelopes with empty `items`.
  - `summary` returns the first user message and final assistant message
    when available.
  - `full` preserves the existing full item behavior.

Note that this method still uses the naive approach of loading the
entire rollout file, and returns just the filtered slice of the data.
Real pagination will come later by leveraging SQLite.
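
A minimal sketch of the three `itemsView` modes applied to one turn's items. The `Item` enum, the function name, and the ordering of the two summary items are assumptions for illustration; only the mode names and the `summary` default come from the description above.

```rust
/// Hypothetical, simplified turn item.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Item {
    UserMessage(String),
    AssistantMessage(String),
    Other(String),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ItemsView {
    NotLoaded,
    Summary,
    Full,
}

/// Filter a fully loaded turn down to the requested view.
/// A missing or null view defaults to `Summary`.
fn project_items(items: &[Item], view: Option<ItemsView>) -> Vec<Item> {
    match view.unwrap_or(ItemsView::Summary) {
        ItemsView::NotLoaded => Vec::new(),
        ItemsView::Full => items.to_vec(),
        ItemsView::Summary => {
            let first_user = items
                .iter()
                .find(|item| matches!(item, Item::UserMessage(_)));
            let last_assistant = items
                .iter()
                .rev()
                .find(|item| matches!(item, Item::AssistantMessage(_)));
            // Either side may be absent; keep whatever is available.
            first_user.into_iter().chain(last_assistant).cloned().collect()
        }
    }
}
```

Because this filters an already loaded slice, it mirrors the naive approach noted above rather than real pagination.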

##### `thread/turns/items/list`
- Adds the experimental `thread/turns/items/list` protocol, schema,
dispatcher, and processor stub. The app-server currently returns
JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`.

#### ThreadStore
- Adds `ThreadStore` contract types and stubbed methods for listing
thread turns and listing items within a turn.
- Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking
app-server API enums or lossy string status values into the store-facing
turn contract.

This also sketches the storage abstraction we expect to need once turns
are indexed/stored. In particular, `notLoaded` is useful only if
ThreadStore can eventually list turn metadata without loading every
persisted item for each turn.

## Validation

- Added/updated protocol serialization coverage for the new request and
response shapes.
- Added app-server integration coverage for `thread/turns/list` default
summary behavior and all three `itemsView` modes.
- Added app-server integration coverage that `thread/turns/items/list`
returns the expected unsupported JSON-RPC error when experimental APIs
are enabled.
- Added thread-store coverage that the default trait methods return
`ThreadStoreError::Unsupported`.
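
The "stubbed default methods" pattern can be sketched like this. The method names, signatures, and `StoredTurn` shape are hypothetical; only `ThreadStoreError::Unsupported` and the idea of default trait methods come from the description above.

```rust
#[derive(Debug, PartialEq, Eq)]
enum ThreadStoreError {
    Unsupported,
}

/// Hypothetical, heavily simplified stored turn.
#[derive(Debug, PartialEq, Eq)]
struct StoredTurn {
    id: String,
}

trait ThreadStore {
    /// Default: stores without turn indexing report `Unsupported`.
    fn list_turns(&self, _thread_id: &str) -> Result<Vec<StoredTurn>, ThreadStoreError> {
        Err(ThreadStoreError::Unsupported)
    }

    /// Default: listing items within a turn is likewise unsupported.
    fn list_turn_items(
        &self,
        _thread_id: &str,
        _turn_id: &str,
    ) -> Result<Vec<String>, ThreadStoreError> {
        Err(ThreadStoreError::Unsupported)
    }
}

/// An existing store compiles unchanged and inherits the stubs.
struct ExistingStore;

impl ThreadStore for ExistingStore {}
```

Default methods let the contract land before any backend implements it, so existing `ThreadStore` implementations need no changes.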

No developers.openai.com documentation update is needed for this
internal experimental app-server API surface.

Owen Lin · 2026-05-07 15:44:43 -07:00

0d0835dd53

Disable empty Cargo test targets (#21584 )

## Summary

`cargo test` entails running both standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by more
than 4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>

Charlie Marsh · 2026-05-07 15:44:17 -07:00

54ef99a365

[codex-analytics] rework thread_source for thread analytics (#20949 )

## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification

## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.

Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.

## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.

## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`

rhan-oai · 2026-05-06 02:12:31 +00:00

b3d4f1a9f0

1- Add model service tiers metadata (#20969 )

## Why

The model list needs to carry display-ready service tier metadata so
clients can render tier choices with stable IDs, names, and
descriptions. A raw speed-tier string list is not enough for richer UI
copy or future tier labels.

## What changed

- Added `ModelServiceTier` to shared model metadata with string `id`,
`name`, and `description` fields.
- Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving
empty defaults for older cached model payloads.
- Exposed `serviceTiers` on app-server v2 `Model` responses and threaded
it through TUI app-server model conversion.
- Marked legacy `additional_speed_tiers` / `additionalSpeedTiers`
metadata as deprecated in source and generated schema output.
- Regenerated app-server protocol JSON schema and TypeScript fixtures,
including `ModelServiceTier.ts`.
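
As a rough illustration of the shape described above, the following sketch models a tier entry and the empty default for older payloads. The trimmed-down `ModelInfo`, its `slug` field, and `model_info_from_payload` are hypothetical; the real crates presumably handle the default at deserialization time, and the `Option` parameter here only stands in for a field missing from the payload.

```rust
/// Display-ready tier metadata with the three string fields named above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModelServiceTier {
    pub id: String,
    pub name: String,
    pub description: String,
}

/// Hypothetical, trimmed-down stand-in for the real `ModelInfo`.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct ModelInfo {
    pub slug: String, // illustrative field, not taken from the source
    pub service_tiers: Vec<ModelServiceTier>,
}

/// Builds a `ModelInfo` from a payload in which `service_tiers` may be
/// absent (for example an older cached payload). A missing list becomes
/// an empty one instead of an error.
pub fn model_info_from_payload(
    slug: &str,
    service_tiers: Option<Vec<ModelServiceTier>>,
) -> ModelInfo {
    ModelInfo {
        slug: slug.to_string(),
        service_tiers: service_tiers.unwrap_or_default(),
    }
}
```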

## Verification

- Ran `just write-app-server-schema`.
- Did not run local tests per repo instruction; relying on PR CI.

---------

Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-05-05 09:51:18 +03:00

9d579813bb

[codex] Add unsandboxed process exec API (#19040 )

## Why

App-server clients sometimes need argv-based local process execution
while sandbox policy is controlled outside Codex. Those environments can
reject sandbox-disabling paths before a command ever starts, even when
the caller intentionally wants unsandboxed execution.

This PR adds a distinct `process/*` API for that use case instead of
extending `command/exec` with another sandbox-disabling shape. Keeping
the new surface separate also makes the future removal of `command/exec`
simpler: clients that need explicit process lifecycle control can move
to the newer handle-based API without depending on `command/exec`
business logic.

## What changed

- Added v2 process lifecycle methods: `process/spawn`,
`process/writeStdin`, `process/resizePty`, and `process/kill`.
- Added process notifications: `process/outputDelta` for streamed
stdout/stderr chunks and `process/exited` for final exit status and
buffered output.
- Made `process/spawn` intentionally unsandboxed and omitted
sandbox-selection fields such as `sandboxPolicy` and
`permissionProfile`.
- Added client-supplied, connection-scoped `processHandle` values for
follow-up control requests and notification routing.
- Supported cwd, environment overrides, PTY mode and size, stdin
streaming, stdout/stderr streaming, per-stream output caps, and timeout
controls.
- Killed active process sessions when the originating app-server
connection closes.
- Wired the implementation through the modular `request_processors/`
app-server layout, with process-handle request serialization for
follow-up control calls.
- Updated generated JSON/TypeScript schema fixtures and documented the
new API in `codex-rs/app-server/README.md`.
- Added v2 app-server integration coverage in
`codex-rs/app-server/tests/suite/v2/process_exec.rs` for spawn
acknowledgement before exit, buffered output caps, and process
termination.
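
The connection-scoped handle behavior listed above can be sketched as a small registry. This is an illustration under assumptions, not the code in `request_processors/`: `ProcessRegistry`, `ConnectionId`, the `u32` placeholder for session state, and the choice to reject a duplicate handle on the same connection are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for an app-server connection.
pub type ConnectionId = u64;

/// Tracks live process sessions keyed by (connection, client-supplied handle).
#[derive(Default)]
pub struct ProcessRegistry {
    // The u32 is a placeholder for whatever state a real session holds.
    sessions: HashMap<(ConnectionId, String), u32>,
}

impl ProcessRegistry {
    /// Registers a handle. Rejecting a duplicate on the same connection is
    /// an assumption made for this sketch.
    pub fn spawn(&mut self, conn: ConnectionId, handle: &str, pid: u32) -> Result<(), String> {
        let key = (conn, handle.to_string());
        if self.sessions.contains_key(&key) {
            return Err(format!("processHandle {:?} already in use", handle));
        }
        self.sessions.insert(key, pid);
        Ok(())
    }

    /// Looks up a session for a follow-up request from the same connection.
    pub fn lookup(&self, conn: ConnectionId, handle: &str) -> Option<u32> {
        self.sessions.get(&(conn, handle.to_string())).copied()
    }

    /// Removes every session owned by a closing connection and returns the
    /// handles a real server would now kill, sorted for determinism.
    pub fn close_connection(&mut self, conn: ConnectionId) -> Vec<String> {
        let mut closed: Vec<String> = self
            .sessions
            .keys()
            .filter(|(owner, _)| *owner == conn)
            .map(|(_, handle)| handle.clone())
            .collect();
        self.sessions.retain(|(owner, _), _| *owner != conn);
        closed.sort();
        closed
    }
}
```

Keying on the pair rather than on the handle alone is what lets two connections reuse the same client-chosen handle without seeing each other's processes.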

## Verification

- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`

---------

Co-authored-by: Owen Lin <owen@openai.com>

Ruslan Nigmatullin · 2026-05-04 16:43:58 -07:00

4950e7d8a6

Add remote plugin skill read API (#20150 )

## Summary

Adds an app-server `plugin/skill/read` method for remote plugin skill
markdown. The new method calls the plugin-service skill detail endpoint
and returns `skill_md_contents`, so clients can preview skills for
remote plugins before the bundle is installed locally.

## Why

Uninstalled remote plugin skills do not have local `SKILL.md` files.
Without an on-demand remote read, the desktop plugin details UI cannot
render the skill details modal for those skills.

## Validation

- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server --test all --
suite::v2::plugin_read::plugin_skill_read_reads_remote_skill_contents_when_remote_plugin_enabled
--exact`
- `just fix -p codex-app-server-protocol -p codex-core-plugins -p
codex-app-server`

xli-oai · 2026-05-01 00:16:25 -07:00

96d2ea9058

Sync remote installed plugin bundles (#20268 )

## Summary
- Download missing remote installed plugin bundles during app-server
startup and plugin/list refresh.
- Upgrade cached remote installed bundles when the backend installed
version changes.
- Remove stale remote installed bundle caches without writing remote
plugin state into config.toml.
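
The three sync cases above (download missing, upgrade on version change, remove stale) amount to a reconciliation between what the backend reports as installed and what is cached locally. A minimal sketch, assuming both sides can be reduced to a map from plugin id to version string; `SyncAction` and `plan_sync` are hypothetical names, not the actual API.

```rust
use std::collections::BTreeMap;

/// One step needed to bring the local bundle cache in line with the backend.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SyncAction {
    Download { plugin: String, version: String },
    Upgrade { plugin: String, from: String, to: String },
    RemoveStale { plugin: String },
}

/// Compares backend-installed versions with locally cached versions.
/// Both maps go from plugin id to version string (assumed shapes).
pub fn plan_sync(
    installed: &BTreeMap<String, String>,
    cached: &BTreeMap<String, String>,
) -> Vec<SyncAction> {
    let mut actions = Vec::new();
    for (plugin, version) in installed {
        match cached.get(plugin) {
            // Installed remotely but no local bundle yet.
            None => actions.push(SyncAction::Download {
                plugin: plugin.clone(),
                version: version.clone(),
            }),
            // Local bundle exists but the installed version changed.
            Some(old) if old != version => actions.push(SyncAction::Upgrade {
                plugin: plugin.clone(),
                from: old.clone(),
                to: version.clone(),
            }),
            // Cached bundle already matches the installed version.
            Some(_) => {}
        }
    }
    // Anything cached that is no longer installed is stale.
    for plugin in cached.keys() {
        if !installed.contains_key(plugin) {
            actions.push(SyncAction::RemoveStale { plugin: plugin.clone() });
        }
    }
    actions
}
```

The plan is computed purely from the two maps, which matches the stated goal of not writing remote plugin state into config.toml.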

## Review note
This is a clean PR branch cut from the current diff on top of latest
`origin/main`. The diff intentionally has no `codex-rs/core/**` files,
so CODEOWNERS should not request the core-directory owner review from
stale PR history.

## Validation
Already run on the source branch before creating this clean PR:
- `just fmt`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-app-server --test all
app_server_startup_sync_downloads_remote_installed_plugin_bundles --
--nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_sync_upgrades_and_removes_remote_installed_plugin_bundles --
--nocapture`
- `cargo test -p codex-app-server --test all
app_server_startup_remote_plugin_sync_runs_once -- --nocapture`
- `just fix -p codex-core-plugins`
- `just fix -p codex-app-server`
- `git diff --check`

xli-oai · 2026-04-30 16:05:14 -07:00

2686873e77

Add hooks/list app-server RPC (#19778 )

## Why

We need a way to list the available hooks so they can be exposed via the
TUI and App, letting users view and manage their hooks.

## What

- Adds `hooks/list` for one or more `cwd` values that returns discovered
hook metadata

## Stack

1. openai/codex#19705
2. This PR - openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Review Notes

The generated schema files account for most of the raw diff; these files
contain the core change:

- `hooks/src/engine/discovery.rs` builds the inventory entries during
hook discovery while leaving runtime handlers focused on execution.
- `app-server/src/codex_message_processor.rs` wires `hooks/list` into
the app-server flow for each requested `cwd`.
- `app-server-protocol/src/protocol/v2.rs` defines the new v2
request/response payloads exposed on the wire.

### Core Changes

`core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so
`skills/list` and `hooks/list` can resolve plugin state for each
requested `cwd`.

---------

Co-authored-by: Codex <noreply@openai.com>

Abhinav · 2026-04-29 23:39:57 +00:00

8774229a89

feat: expose provider capability bounds to app server clients (#20049 )

Follow-up to #19442. The app server now exposes provider-derived bounds
through a new v2 `modelProvider/read` method. The response reports the
configured provider map key as `modelProvider` and returns the effective
capability booleans so clients can align their UI with the same
provider-owned limits used by core.

Celia Chen · 2026-04-29 01:36:19 +00:00

8c47e36504

Fix plugin list workspace settings test isolation (#20086 )

Fixes a test that often fails locally when running `cargo test`:
- Add an app-server test helper that combines managed-config isolation
with custom env overrides.
- Isolate `HOME` / `USERPROFILE` in plugin-list workspace settings tests
so host home marketplaces do not affect results.

canvrno-oai · 2026-04-28 18:34:38 -07:00

4c39ad33cb

test: harden app-server integration tests (#19683 )

## Why

Windows Bazel runs in the permissions stack exposed that app-server
integration tests were launching normal plugin startup warmups in every
subprocess. Those warmups can call
`https://chatgpt.com/backend-api/plugins/featured` when a test is not
specifically exercising plugin startup, which adds slow background work,
noisy stderr, and dependence on external network state. The relevant
startup/featured-plugin behavior was introduced across #15042 and
#15264.

A few app-server tests also had long optional waits or unbounded cleanup
paths, making failures expensive to diagnose and contributing to slow
Windows shards. One external-agent config test from #18246 used a
GitHub-style marketplace source, which was enough to exercise the
pending remote-import path but also meant the background completion task
could attempt a real clone.

## What Changed

- Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks`
plumbing and a hidden debug-only
`--disable-plugin-startup-tasks-for-tests` app-server flag, so
integration tests can suppress startup plugin warmups without adding a
production env-var gate.
- Has the app-server test harness pass that hidden flag by default,
while opting plugin-startup coverage back in for tests that
intentionally exercise startup sync and featured-plugin warmup behavior.
- Lowers normal app-server subprocess logging from `info`/`debug` to
`warn` to avoid multi-megabyte stderr output in Bazel logs.
- Prevents the external-agent config test from attempting a real
marketplace clone by using an invalid non-local source while still
exercising the pending-import completion path.
- Bounds optional filesystem/realtime waits and fake WebSocket
test-server shutdown so failures produce targeted timeouts instead of
hanging a shard.
- Fixes the Unix script-resolution test in `rmcp-client` to exercise
PATH resolution directly and include the actual spawn error in failures.
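
The idea of bounding optional waits so that a failure yields a targeted timeout can be sketched as follows. This is an illustration only: `wait_bounded` is a hypothetical helper, and it uses `std::sync::mpsc` channels as a stand-in for whatever mechanism the real tests use.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Waits for one message, but never longer than `limit`.
/// The error names what was being awaited, so a failure points at its
/// cause instead of hanging the whole test shard.
pub fn wait_bounded<T>(rx: &Receiver<T>, what: &str, limit: Duration) -> Result<T, String> {
    match rx.recv_timeout(limit) {
        Ok(value) => Ok(value),
        Err(RecvTimeoutError::Timeout) => {
            Err(format!("timed out after {:?} waiting for {}", limit, what))
        }
        Err(RecvTimeoutError::Disconnected) => {
            Err(format!("sender dropped while waiting for {}", what))
        }
    }
}
```

Distinguishing a timeout from a dropped sender keeps the two failure modes separate in the test output.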

## Verification

- `cargo check -p codex-app-server`
- `cargo clippy -p codex-app-server --tests -- -D warnings`
- `cargo test -p codex-rmcp-client
program_resolver::tests::test_unix_executes_script_without_extension`
- `cargo test -p codex-app-server --test all
external_agent_config_import_sends_completion_notification_after_pending_plugins_finish
-- --nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request --
--nocapture`
- Windows Local Bazel passed with this test-hardening bundle before it
was extracted from #19606.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19683

Michael Bolin · 2026-04-26 12:43:16 -07:00

ac2bffa443

206 Commits