codex

[app-server] expose environment info RPC (#30291)

## Why

App-server clients that configure named execution environments need to
discover an environment's shell and working directory before selecting
it for a thread or turn. Because the environment can run on a different
operating system than app-server, its working directory is represented
as a canonical `file:` URI rather than a host-local path string. The
probe also needs a bounded response time: an exec-server that completes
initialization but never answers `environment/info` must not hold the
environment serialization queue indefinitely.

## What changed

- Add an experimental `environment/info` app-server RPC for named
environments.
- Route the probe through the managed environment connection and return
target-native shell metadata plus the default working directory as a
`PathUri`.
- Return connection and protocol failures as JSON-RPC errors.
- Bound the exec-server probe response to 30 seconds and remove
timed-out calls from the pending-request table so later environment
mutations can proceed.
- Cover successful responses, omitted working directories, unknown
environments, connection failures, and pending-call cleanup.

## Protocol examples

Request:

```json
{
  "id": 42,
  "method": "environment/info",
  "params": {
    "environmentId": "remote-a"
  }
}
```

Successful response:

```json
{
  "id": 42,
  "result": {
    "shell": {
      "name": "zsh",
      "path": "/bin/zsh"
    },
    "cwd": "file:///workspace"
  }
}
```

If the exec-server initializes but does not answer the probe within 30
seconds:

```json
{
  "id": 42,
  "error": {
    "code": -32603,
    "message": "failed to get info for environment `remote-a`: exec-server protocol error: timed out waiting for exec-server `environment/info` response after 30s"
  }
}
```
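
The bounded wait and pending-call cleanup described under "What changed" can be sketched as follows. This is a synchronous, std-only illustration rather than the exec-server code: the `Pending` table, `register`, and `wait_for_reply` are hypothetical names, and the real client is presumably asynchronous.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError, Sender};
use std::sync::{Arc, Mutex};
use std::time::Duration;

/// Hypothetical pending-request table: request id -> sender that delivers the reply.
type Pending = Arc<Mutex<HashMap<u64, Sender<String>>>>;

/// Registers an in-flight call and returns the receiving half for its reply.
fn register(pending: &Pending, id: u64) -> Receiver<String> {
    let (tx, rx) = mpsc::channel();
    pending.lock().unwrap().insert(id, tx);
    rx
}

/// Waits at most `timeout` for the reply to request `id`.
/// The table entry is removed on every exit path, so a peer that never
/// answers cannot leave a stale entry that blocks later work.
fn wait_for_reply(
    pending: &Pending,
    id: u64,
    rx: Receiver<String>,
    timeout: Duration,
) -> Result<String, String> {
    let outcome = rx.recv_timeout(timeout);
    pending.lock().unwrap().remove(&id);
    match outcome {
        Ok(reply) => Ok(reply),
        Err(RecvTimeoutError::Timeout) => {
            Err(format!("timed out waiting for response after {timeout:?}"))
        }
        Err(RecvTimeoutError::Disconnected) => Err("connection closed".to_string()),
    }
}
```

A caller would pass `Duration::from_secs(30)` to match the bound in this change and convert the `Err` into a JSON-RPC error like the one above.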

## Testing

- App-server integration coverage for successful info (including omitted
`cwd`), unknown environments, and connection failures.
- Exec-server RPC coverage verifying a timed-out call is removed from
the pending-request table.

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>

Max Johnson · 2026-06-27 19:34:10 +00:00

e2398d0b16

feat(app-server): add optional turn_id to thread/fork (#30277)

## Description

This adds stable optional `turnId` support to `thread/fork`. When
supplied, the fork copies persisted history through that terminal turn,
inclusive, and drops later turns from the new thread.

Omitting `turnId` or passing `null` preserves the existing full-history
fork behavior, including the interruption marker when the stored source
history ends mid-turn.
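
A minimal sketch of the inclusive cutoff, assuming history is an ordered list of turns. The `Turn` type and `forked_history` function are hypothetical, returning `None` for an unknown `turnId` is an assumption about error handling, and the sketch does not check that the chosen turn is terminal or model the interruption marker.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Turn {
    id: String,
}

/// Returns the turns a fork would copy.
/// `turn_id = None` keeps the full history; `Some(id)` keeps everything
/// up to and including that turn and drops later turns.
fn forked_history(source: &[Turn], turn_id: Option<&str>) -> Option<Vec<Turn>> {
    match turn_id {
        None => Some(source.to_vec()),
        Some(id) => {
            // Index of the requested turn; `?` yields None when it is absent.
            let end = source.iter().position(|turn| turn.id == id)?;
            Some(source[..=end].to_vec())
        }
    }
}
```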

## Why

We're deprecating `thread/rollback`. Forking with `thread/fork` and
`turnId` gives clients a replacement for the UX use cases that relied on
it.

Owen Lin · 2026-06-26 19:35:54 +00:00

f72976a5f1

[codex] Add managed new-thread model settings (#29683)

## Why

Admins need persistent defaults for the model, reasoning effort, and
service tier shown when the Desktop App creates a new thread. These are
initialization defaults rather than runtime constraints: the App should
use them to initialize its draft while still allowing a user to make an
explicit selection.

The app-server therefore needs to expose the managed values before
thread creation without changing `thread/start` behavior for other
clients.

## What changed

- Parse `model`, `model_reasoning_effort`, and `service_tier` from
`[models.new_thread]` in `requirements.toml`.
- Compose the `models` requirements through the existing
requirements-layer precedence rules.
- Expose the resolved values through `configRequirements/read` as
`requirements.models.newThread`.
- Add the corresponding app-server protocol types and regenerate the
JSON and TypeScript schema fixtures.
- Document the new `configRequirements/read` fields in the app-server
README.

## Scope

This PR is data plumbing only. It does not apply these values during
`thread/start` and does not change thread creation for existing
app-server clients, resumed or forked sessions, internal or subagent
sessions, `codex exec`, or the TUI. A companion Desktop App change owns
draft initialization, sends the effective settings for ordinary and
prewarmed starts, and preserves explicit user changes.

## Validation

- Requirements deserialization coverage for `[models.new_thread]`
- Requirements-layer precedence coverage
- App-server API mapping coverage
- `configRequirements/read` integration coverage
- Regenerated app-server JSON and TypeScript schema fixtures

hefuc-oai · 2026-06-26 18:37:40 +00:00

d9cf931d0e

Expose MCP app identity in app context (#29934)

## Why

MCP tool-call events need to expose trusted app identity and action
metadata directly so v2 clients do not have to infer it from tool names
or resource URIs.

## What changed

- Add optional `appName`, `templateId`, and `actionName` fields to MCP
tool-call `appContext`.
- Populate `appName` and `templateId` from trusted Codex Apps metadata,
and derive `actionName` from the trusted app resource metadata.
- Preserve all three fields through core events, legacy protocol events,
persisted thread history, resume redaction, and app-server v2 responses.
- Document the public `appContext` fields in
`codex-rs/app-server/README.md`.
- Regenerate app-server JSON and TypeScript schemas and add coverage for
serialization, persistence, redaction, and metadata propagation.

## Validation

- `just test -p codex-app-server-protocol mcp_tool_call`
- `just test -p codex-core
mcp_tool_call_item_metadata_only_trusts_codex_apps_identity
mcp_tool_call_item_includes_app_identity`
- `just write-app-server-schema`

---------

Co-authored-by: Martin Au-Yeung <280153141+martinauyeung-oai@users.noreply.github.com>

Martin Au-Yeung · 2026-06-25 18:31:10 -07:00

ec300bc7bd

[codex] Surface MCP reauthentication-required startup failures (#29877)

## Summary

- distinguish expired, non-refreshable stored MCP OAuth credentials from
first-time missing credentials
- carry a typed `failureReason: "reauthenticationRequired"` on the
existing `mcpServer/startupStatus/updated` notification only when user
action is required
- keep the public MCP auth-status API unchanged and regenerate the
app-server protocol schemas and documentation

## Why

An MCP server with an expired access token and no usable refresh token
currently fails startup without giving clients a reliable, typed
recovery signal.

The existing startup-status notification is the natural place to carry
this state. Its nullable `failureReason` keeps the recovery reason
attached to the failed startup transition without adding a one-off
notification. Internally, Codex distinguishes first-time login from
reauthentication and emits the reason only when the startup error itself
requires authentication.

## User impact

App clients can prompt an existing user to reconnect an MCP server when
automatic recovery is impossible by handling a failed
`mcpServer/startupStatus/updated` notification whose `failureReason` is
`reauthenticationRequired`. Starting, ready, cancelled, unrelated
failures, and first-time setup carry no reauthentication reason.
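
On the client side, the check described above might look like the sketch below. The Rust types and field names are hypothetical stand-ins for the generated protocol types; only the status names and the `reauthenticationRequired` reason come from this description.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum StartupStatus {
    Starting,
    Ready,
    Failed,
    Cancelled,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum FailureReason {
    ReauthenticationRequired,
}

/// Hypothetical shape of an `mcpServer/startupStatus/updated` notification.
struct StartupStatusUpdated {
    name: String,
    status: StartupStatus,
    failure_reason: Option<FailureReason>,
}

/// Prompt for reconnect only when startup failed and the typed reason says
/// user action is required. Every other transition carries no reason.
fn should_prompt_reconnect(update: &StartupStatusUpdated) -> bool {
    update.status == StartupStatus::Failed
        && update.failure_reason == Some(FailureReason::ReauthenticationRequired)
}
```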

## Companion app PR

- openai/openai#1069582

## Validation

- `just test -p codex-app-server-protocol` — 248 passed; schema fixture
tests passed
- `cargo check -p codex-app-server -p codex-tui`
- `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped
- `just test -p codex-protocol -p codex-app-server-protocol -p
codex-mcp` — 579 passed
- `just write-app-server-schema`
- `just fmt`

felixxia-oai · 2026-06-25 21:50:36 +00:00

a6d20ed297

feat: add provider-aware model fallback to thread start (#29942)

## Why

Helper threads such as task title generation can request a model ID that
is valid for the default OpenAI provider but unavailable from the active
provider. With Amazon Bedrock, `gpt-5.4-mini` is rejected while the
provider static catalog exposes Bedrock model IDs such as
`openai.gpt-5.5` and `openai.gpt-5.4`. This causes repeated background
404s and can surface a misleading turn error even when the main turn
succeeds.

Clients need an explicit way to ask app-server to resolve an unavailable
helper model to the active provider default. That fallback must remain
limited to providers with an authoritative static catalog so custom or
dynamically discovered model IDs are not rewritten based on an
incomplete catalog.

Fixes #28741.

## What changed

- Add the experimental `allowProviderModelFallback` option to
`thread/start`, defaulting to `false` to preserve existing behavior.
- Thread the option through thread creation and model selection.
- When enabled for a static model manager, preserve requested models
present in the catalog and replace unavailable models with the provider
default.
- Continue preserving explicit model IDs for dynamic model managers
without fetching a catalog solely to validate them.
- Document the new `thread/start` behavior in the app-server API
overview.
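
The selection rule in the list above can be sketched as a small pure function. `ModelCatalog` and `resolve_model` are hypothetical names for illustration, not the model manager's actual types.

```rust
/// Hypothetical view of a provider's model catalog.
enum ModelCatalog<'a> {
    /// Authoritative static catalog with a provider default.
    Static {
        models: &'a [&'a str],
        default_model: &'a str,
    },
    /// Dynamically discovered models: never rewritten.
    Dynamic,
}

/// Resolves the model for a new thread.
/// Only a static catalog with fallback enabled may replace an unavailable
/// model; every other combination preserves the requested ID.
fn resolve_model(requested: &str, catalog: &ModelCatalog, allow_fallback: bool) -> String {
    match catalog {
        ModelCatalog::Static { models, default_model }
            if allow_fallback && !models.contains(&requested) =>
        {
            default_model.to_string()
        }
        _ => requested.to_string(),
    }
}
```

With fallback disabled, which is the default, the function is the identity on `requested`, so existing behavior is unchanged.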

## Test

Temporary test-client harness:

```rust
ThreadStartParams {
    model: Some("gpt-5.4-mini".to_string()),
    allow_provider_model_fallback: true,
    ..Default::default()
}
```
Command:

```shell
CODEX_HOME=/tmp/codex-bedrock-thread-start-home \
CODEX_E2E_BEDROCK_THREAD_START_ONLY=1 \
./target/debug/codex-app-server-test-client \
  --codex-bin ./target/debug/codex \
  -c 'model_provider="amazon-bedrock"' \
  send-message-v2 --experimental-api ignored
```
Relevant output:

```text
> "method": "thread/start",
> "params": {
>   "model": "gpt-5.4-mini",
>   "modelProvider": null,
>   "allowProviderModelFallback": true,
>   ...
> }

< "result": {
<   "model": "openai.gpt-5.5",
<   "modelProvider": "amazon-bedrock",
<   ...
< }
```

Celia Chen · 2026-06-25 18:24:34 +00:00

6d9dbacf1a

chore(app-server): mark thread/rollback as deprecated (#29928)

We will drop support for this in the near future due to the complexity
it introduces.

Owen Lin · 2026-06-25 17:15:46 +00:00

268328001f

Support OAuth for HTTP MCP servers from selected executor plugins (#28529)

## Why

#28522 routes selected-plugin HTTP MCP traffic through the owning
executor, but OAuth bootstrap and refresh still used host-local clients.
Executor-only servers therefore cannot complete discovery or login
through the same network boundary as the MCP connection.

## What changed

- adapt `codex_exec_server::HttpClient` to RMCP 1.8's `OAuthHttpClient`
contract
- let RMCP own discovery, dynamic registration, PKCE, token exchange,
and refresh
- route auth status, persisted-token startup, and app-server login
through the server runtime while preserving the existing local discovery
path
- add optional `threadId` to `mcpServer/oauth/login` and echo it in the
completion notification
- implement RMCP's redirect policy and 1 MiB OAuth response limit over
executor HTTP
- cover selected-thread OAuth discovery and login through an
executor-only route

Depends on #28522.

jif · 2026-06-25 10:31:17 +01:00

b215961a56

Support HTTP MCP servers from selected executor plugins (#28522)

## Why

Selected executor plugins can declare both stdio and Streamable HTTP MCP
servers, but only stdio registrations were retained. That silently drops
part of the plugin's tool surface and prevents HTTP traffic from using
the owning executor's network.

## What changed

- retain selected-plugin Streamable HTTP MCP declarations alongside
stdio declarations
- route their HTTP clients through the owning executor environment
- preserve local auth-header environment references while rejecting them
for executor-hosted declarations
- cover thread isolation, refresh, and an executor-only HTTP route end
to end

jif · 2026-06-25 10:10:36 +01:00

6368937939

[codex] Add Ultra reasoning effort (#29899)

## Why

Ultra should be one user-facing reasoning selection for work that
benefits from both maximum reasoning and proactive multi-agent
delegation. Without it, clients must coordinate maximum reasoning with
the experimental `multiAgentMode` setting, even though the inference
backend still expects its existing `max` effort value.

This change makes reasoning effort the source of truth: clients select
`ultra`, core derives proactive multi-agent behavior when the turn is
eligible for multi-agent V2, and inference requests continue to use the
backend-compatible `max` value.

## What changed

- Add `ultra` as a first-class reasoning effort and preserve
model-catalog ordering when exposing it to clients.
- Convert `ultra` to `max` at the inference request boundary, including
Responses HTTP/WebSocket requests, startup prewarm, compaction, and
memory summarization.
- Derive effective multi-agent mode per turn from effective reasoning
effort:
  - eligible multi-agent V2 + `ultra` → `proactive`
  - eligible multi-agent V2 + any other effort → `explicitRequestOnly`
  - V1 or otherwise ineligible sessions → no multi-agent mode instruction
- Keep the derived effective mode in turn context history so successive
turns can emit a developer-message update only when the effective mode
changes.
- Remove selected multi-agent mode from core session configuration, turn
construction, thread settings, resume/fork restoration, and subagent
spawn plumbing. Subagents inherit reasoning effort and derive their own
effective mode.
- Retain the experimental app-server `multiAgentMode` fields for wire
compatibility while marking them deprecated. Request values are accepted
but ignored; compatibility response fields report `explicitRequestOnly`.
- Display Ultra in the TUI using the order supplied by `model/list`.
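
The two derivations above (wire effort and effective multi-agent mode) can be sketched as pure functions. The enum definitions are simplified assumptions: only `ultra`, `max`, `proactive`, and `explicitRequestOnly` come from this description, and the `High` variant stands in for every other effort level.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ReasoningEffort {
    High, // placeholder for the other effort levels
    Max,
    Ultra,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Effort value sent at the inference request boundary.
/// `ultra` is client-facing only; the backend still receives `max`.
fn wire_effort(effort: ReasoningEffort) -> &'static str {
    match effort {
        ReasoningEffort::High => "high",
        ReasoningEffort::Max | ReasoningEffort::Ultra => "max",
    }
}

/// Effective multi-agent mode for one turn, derived from reasoning effort.
/// Ineligible sessions (V1 or otherwise) get no mode instruction at all.
fn effective_multi_agent_mode(
    effort: ReasoningEffort,
    multi_agent_v2_eligible: bool,
) -> Option<MultiAgentMode> {
    if !multi_agent_v2_eligible {
        return None;
    }
    Some(match effort {
        ReasoningEffort::Ultra => MultiAgentMode::Proactive,
        _ => MultiAgentMode::ExplicitRequestOnly,
    })
}
```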

## Validation

- `just test -p codex-core ultra_reasoning_uses_max_for_requests`
- `just test -p codex-tui model_reasoning_selection_popup`

Shijie Rao · 2026-06-24 20:13:52 -07:00

df1199fddb

[apps] Thread structured icon assets through app list (#29889)

## Summary

- Add `iconAssets` and `iconDarkAssets` to the app-list protocol.
- Preserve structured icons through directory merging and the connector,
app-server, and TUI boundaries.
- Keep legacy logo URLs unchanged as compatibility fallbacks.
- Update generated protocol schemas and TypeScript types.

Drew · 2026-06-24 13:25:44 -07:00

a33ad93996

feat(app-server): list descendant threads by ancestor (#29591)

## Why

`thread/list` can filter direct children with `parentThreadId`, but
clients cannot request an entire spawned subtree. Discovering every
descendant requires repeated client-side requests and gives up the
database's existing filtering and pagination path.

## What changed

Experimental clients can use `ancestorThreadId` to return strict
descendants at any depth while `parentThreadId` retains its direct-child
meaning. The filters are mutually exclusive, the ancestor is excluded,
and every result preserves its immediate `parentThreadId` so callers can
reconstruct the tree.

## How it works

- **Explicit relationship:** Internal list parameters distinguish direct
children from transitive descendants without changing the meaning of
`parentThreadId`.
- **Existing graph:** Persisted parent-child spawn edges remain the
source of truth, so descendant lookup needs no schema migration or
ancestry cache.
- **Indexed traversal:** A recursive SQLite query starts from the
parent-edge index, walks each generation, and applies thread filters,
sorting, and cursor pagination in the same database request.
- **Reconstructable results:** The response stays flat and normally
ordered while carrying each descendant's immediate parent.
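
To illustrate the traversal semantics, here is an in-memory breadth-first walk over parent-child edges. It is a stand-in for the recursive SQLite query, not the query itself, and it omits filtering, sorting, and cursor pagination. The edge representation and the `strict_descendants` name are assumptions.

```rust
use std::collections::{HashMap, VecDeque};

/// Returns `(thread_id, parent_id)` for every strict descendant of `ancestor`,
/// walking one generation at a time. The ancestor itself is excluded, and each
/// result keeps its immediate parent so callers can rebuild the tree.
/// `edges` holds `(child, parent)` pairs, mirroring persisted spawn edges.
fn strict_descendants<'a>(
    edges: &[(&'a str, &'a str)],
    ancestor: &str,
) -> Vec<(&'a str, &'a str)> {
    // Index edges by parent so each generation is a single lookup.
    let mut children: HashMap<&'a str, Vec<(&'a str, &'a str)>> = HashMap::new();
    for &(child, parent) in edges {
        children.entry(parent).or_default().push((child, parent));
    }

    let mut out = Vec::new();
    let mut queue = VecDeque::from([ancestor]);
    while let Some(current) = queue.pop_front() {
        if let Some(kids) = children.get(current) {
            for &(child, parent) in kids {
                out.push((child, parent));
                queue.push_back(child);
            }
        }
    }
    out
}
```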

## Verification

Ran 550 tests across the protocol, state, rollout, and thread-store
crates, then reran the four focused state, store, and app-server
descendant-listing tests after the final diff reduction. Scoped Clippy
and formatting checks passed. Stable and experimental schema generation
was checked; the stable fixtures remain unchanged while the experimental
schema includes the new field.

Brent Traut · 2026-06-24 13:08:14 -07:00

8057603d0c

[codex] show external import result counts (#29567)

## What changed

- Show per-type import counts in the `/import` review UI and started
message.
- Render completion results as a multi-line summary with total
imported/failed counts and one row per import type.
- Add snapshot coverage for the updated review and completion output.

<img width="537" height="322" alt="Screenshot 2026-06-23 at 9 41 20 PM"
src="https://github.com/user-attachments/assets/166542eb-2097-4b2b-8130-8f6fd8c680ce"
/>


## Why

The TUI previously only reported that Claude Code import started or
finished. Users could not see how many items of each type were selected
or how many actually imported versus failed.

charlesgong-openai · 2026-06-24 08:56:57 -07:00

3694b48a82

[codex] rename rollout budget error to session budget error (#29744)

## Summary

- rename the rollout-budget exhaustion error from
`RolloutBudgetExceeded` to `SessionBudgetExceeded`
- expose the matching app-server v2 wire value as
`sessionBudgetExceeded`
- regenerate JSON/TypeScript schema fixtures and update the app-server
docs and focused tests

This is a naming-only follow-up to #29715 based on [Pavel's review
suggestion](https://github.com/openai/codex/pull/29715#discussion_r3463183480).
Runtime behavior is unchanged.

## Tests

- `just test -p codex-core rollout_budget`
- `just test -p codex-app-server-protocol`
- `just fmt`
- `just write-app-server-schema`

rka-oai · 2026-06-23 16:49:13 -07:00

1ec3def0b5

[codex] surface rollout budget exhaustion (#29715)

## Summary
- surface shared rollout-budget exhaustion as
`CodexErr::RolloutBudgetExceeded` instead of a generic interrupted turn
- map it through the existing `CodexErrorInfo` and app-server v2
`codexErrorInfo` path
- keep local compaction from retrying after the shared rollout budget is
exhausted

This gives app-server clients a stable `rolloutBudgetExceeded` error
they can classify without guessing from `status="interrupted"`.

## Tests
- `just test -p codex-core rollout_budget`

rka-oai · 2026-06-23 15:01:28 -07:00

bbbea91960

Make selected plugin roots URI-native (#28918)

## Why

Selected capability roots belong to the executor filesystem, not the
app-server host. Converting their path strings into the host's native
`Path` breaks whenever the two machines use different path conventions,
such as a Windows executor behind a Unix app-server.

This PR establishes `PathUri` as the selected-plugin boundary so the
executor remains authoritative for its paths.

## What changed

- Require `selectedCapabilityRoots[].location.path` to be a canonical
`file:` URI and deserialize it directly as `PathUri`; native path
strings are rejected.
- Update the app-server schema, generated TypeScript, examples, and
request coverage for the URI contract.
- Keep selected roots, resolved plugin locations, manifest paths, and
manifest resources as `PathUri`.
- Inspect and read plugin roots and manifests only through the selected
environment's `ExecutorFileSystem`.
- Parse executor manifests with the shared URI-native parser from #29620
instead of projecting them onto the host filesystem.
- Enforce resource containment lexically and preserve the root URI's
POSIX or Windows path convention.
- Cover foreign Windows plugin roots and URI-native manifest resources.

```text
thread/start
  selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo"
                              | PathUri
                              v
                    ExecutorFileSystem
                              |
                              +--> plugin.json
                              +--> manifest resources
```

This PR stops at the shared selected-plugin representation. The next two
PRs remove the remaining host-path projections in the skill and MCP
consumers.
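
As a rough illustration of lexical containment over `file:` URIs, the sketch below compares normalized path segments without touching any filesystem. It is not the `PathUri` implementation from #29614: the function names are hypothetical, and it ignores percent-encoding, URI authorities, and Windows case-insensitivity.

```rust
/// Lexically normalizes a URI path: drops empty and `.` segments and resolves
/// `..` against earlier segments. Returns None if `..` would climb past the
/// top of the path.
fn normalize(path: &str) -> Option<Vec<&str>> {
    let mut out = Vec::new();
    for segment in path.split('/') {
        match segment {
            "" | "." => {}
            ".." => {
                out.pop()?;
            }
            other => out.push(other),
        }
    }
    Some(out)
}

/// True when `candidate_uri` stays inside `root_uri`, decided purely from the
/// URI text. Comparing whole segments keeps `demo-2` from matching `demo`.
fn is_lexically_contained(root_uri: &str, candidate_uri: &str) -> bool {
    let (Some(root), Some(candidate)) = (
        root_uri.strip_prefix("file://"),
        candidate_uri.strip_prefix("file://"),
    ) else {
        return false;
    };
    match (normalize(root), normalize(candidate)) {
        (Some(root), Some(candidate)) => candidate.starts_with(&root),
        _ => false,
    }
}
```

Because the check never converts the URI into a host `Path`, a Windows-style root such as `file:///C:/plugins/demo` behaves the same on a Unix host.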

## Stack

1. #29614 — add lexical `PathUri` containment.
2. #29620 — share URI-native manifest path resolution.
3. **This PR** — keep selected plugin roots and resources URI-native.
4. #29626 — load executor skills without host path conversion.
5. #29628 — resolve executor MCP working directories without host path
conversion.

jif · 2026-06-23 22:51:19 +01:00

2e69966cd8

feat(app-server): thread/turns/items/list -> thread/items/list (#29705)

## Description

Rename the experimental app-server item pagination API from
`thread/turns/items/list` to `thread/items/list` and make `turnId`
optional. Clients can now page persisted items across a thread, or still
filter to one turn when needed.

## What changed

- Rename the request/response protocol types and JSON-RPC method to
`ThreadItemsList*` / `thread/items/list`.
- Pass optional `turnId` through to `ThreadStore::list_items`.
- Update app-server docs and focused protocol/app-server tests.

## Validation

- `just test -p codex-app-server-protocol thread_items_list_round_trips`
- `just test -p codex-app-server thread_items_list_returns_unsupported`

Owen Lin · 2026-06-23 13:57:08 -07:00

1882719b30

Propagate safety buffering treatment metadata (#29473)

## Summary

- read the request-scoped safety-buffering treatment from HTTP response
headers and per-turn WebSocket metadata through one shared header parser
- combine that treatment with Responses API safety-buffering signals
- propagate `showBufferingUi` and nullable `fasterModel` through the
existing `model/safetyBuffering/updated` app-server notification
- update the app-server documentation and generated JSON and TypeScript
schemas

The public implementation contains no model mapping or real model
identifier. Tests and protocol examples use generic `current-model` and
`faster-model` placeholders only.

## Dependencies

- server-side treatment evaluation:
https://github.com/openai/openai/pull/1060247
- initial Responses API safety-buffering propagation:
https://github.com/openai/codex/pull/29371
- Codex App UI: https://github.com/openai/openai/pull/1057789

## Validation

- Codex API tests: 129 passed
- focused Codex core safety-buffering integration test passed
- app-server protocol tests passed after regenerating schema fixtures
- Clippy fix and repository formatting completed successfully

The broader app-server run compiled all changed crates and completed
with 1,269 passing tests. Its remaining failures were unrelated
environment limitations: macOS sandbox application was denied, one
expected test binary was unavailable, and several existing subprocess
tests timed out as a result.

Francis Chalissery · 2026-06-22 19:51:03 -07:00

7c22d376e5

[codex] reject remote images at app-server ingress (#29419)

## Stack

Stacked on #29417. Review and land that PR first.

## Summary

- reject HTTP(S) image URLs in the handlers for `turn/start` and
`turn/steer`
- validate `thread/inject_items` after its existing
JSON-to-`ResponseItem` conversion, so each item is deserialized once
- turn invalid dynamic-tool image responses into the existing
unsuccessful text fallback; the model receives the validation message as
the function output
- leave `thread/resume.history` compatible with legacy history; #29417
replaces remote images before model input
- continue accepting inline data URLs and `localImage` inputs
- keep this policy in app-server; this PR does not add a shared protocol
API or change core image preparation
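
The ingress policy above amounts to a scheme check on image URLs. The sketch below is an assumption about how such a check could look: the function names and error text are hypothetical, and `localImage` inputs are not modeled because they carry no URL.

```rust
/// True for HTTP(S) URLs, which the ingress policy rejects.
/// Scheme matching is ASCII case-insensitive.
fn is_remote_image_url(url: &str) -> bool {
    let lower = url.trim_start().to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}

/// Accepts inline `data:` URLs and anything else that is not HTTP(S);
/// returns a validation message for remote URLs.
fn validate_image_url(url: &str) -> Result<(), String> {
    if is_remote_image_url(url) {
        Err("remote image URLs are not supported; use a data URL or a local image".to_string())
    } else {
        Ok(())
    }
}
```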

## Test plan

- `just test -p codex-app-server -E
'test(/request_handlers_reject_remote_image_urls|dynamic_tool_remote_image_response_becomes_model_visible_error|dynamic_tool_call_round_trip_sends_content_items_to_model|turn_start_tracks_turn_event_analytics|standalone_image_edit_uses_recent_pathless_image/)'`
(5 passed)
- `just fix -p codex-app-server`
- `just fmt`

rka-oai · 2026-06-23 00:43:56 +00:00

b294638bb5

permission profiles: expose availability to clients (#26678)

## Why

`permissionProfile/list` currently advertises every built-in and
configured profile even when effective enterprise requirements prevent
selecting it. That forces each client to reconstruct policy from
lower-level requirement fields, which is easy to miss and difficult to
keep consistent.

The catalog should remain complete so clients can explain that an option
was disabled by an administrator, while also reporting whether each
profile is selectable.

## What

- Add an `allowed` field to each permission profile summary.
- Build a shared catalog from the effective config and current
requirements, including `allowed_sandbox_modes`, `allowed_permissions`,
and filesystem restrictions.
- Use the shared catalog in app-server and the TUI so disallowed
profiles remain visible but cannot be selected.
- Use the canonical `:danger-full-access` profile ID in the TUI.
- Update the app-server schemas, API documentation, behavioral tests,
and TUI snapshots.

## Scope

This PR targets `main` directly and is independent of #24852. It
preserves the current behavior where built-in profiles are constrained
by sandbox-mode requirements and `allowed_permissions` applies to
configured profiles.

## Testing

- `just test -p codex-core
permission_profile_catalog_marks_profiles_disallowed_by_requirements`
- `just test -p codex-app-server permission_profile_list`
- `just test -p codex-app-server-protocol`
- `just test -p codex-tui profile_permissions`
- `just fix -p codex-core`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`
- `just fmt`

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Joey Trasatti <joey.trasatti@openai.com>

viyatb-oai · 2026-06-22 13:48:09 -07:00

ced3e4b9a7

Allow ChatGPT accounts without email (#28991)

## Summary

Codex required every ChatGPT account to have an email address. A
service-account personal access token can return valid account metadata
without one, so PAT login failed while decoding the metadata response.

This change makes email optional in the account metadata type that owns
it and preserves that absence through authentication, provider account
state, the app-server API, generated clients, and TUI bootstrap.
Existing accounts with email addresses keep the same behavior.

## Behavior-changing call sites

| Call site | Behavior after this change |
| --- | --- |
| `login/src/auth/personal_access_token.rs` | PAT metadata accepts a missing or null email and retains `None`. |
| `agent-identity/src/lib.rs` | Agent Identity JWT claims accept an omitted email. |
| `login/src/auth/storage.rs` and `login/src/auth/agent_identity.rs` | Stored and managed Agent Identity records carry `Option<String>`. Deserialization maps the legacy empty-string sentinel to `None`. |
| `login/src/auth/manager.rs` | `get_account_email` returns the stored option, and managed identity bootstrap no longer converts `None` to an empty string. |
| `model-provider/src/provider.rs` and `protocol/src/account.rs` | A ChatGPT provider account requires a plan type but may carry no email. |
| `app-server-protocol/src/protocol/v2/account.rs` | `account/read` keeps the `email` field on the wire and returns `null` when the account has no email. Generated TypeScript and JSON schemas describe a required, nullable field. |
| `sdk/python/src/openai_codex/generated/v2_all.py` | The generated Python `ChatgptAccount` model accepts `None` for email. |
| `tui/src/app_server_session.rs` | Email-less ChatGPT accounts bootstrap normally, keep external feedback routing, omit account-email telemetry, and display the plan in account status. |

## Design decisions

- Missing email remains `None` at every layer. The code never uses an
empty string as a substitute.
- The app-server response includes `"email": null` instead of omitting
the field. Clients retain a stable response shape.
- Plan type remains required for provider account state. This change
relaxes only the email assumption.
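
A small sketch of the two rules above: absence stays `None`, and the wire shape keeps the field with an explicit `null`. Both helper names are hypothetical, and the hand-rolled JSON rendering performs no string escaping, so it only shows the shape.

```rust
/// Maps the legacy empty-string sentinel to `None` and keeps real values.
fn normalize_email(raw: Option<String>) -> Option<String> {
    raw.filter(|email| !email.is_empty())
}

/// Renders the `email` member of a response object. The field is always
/// present, so clients see `"email":null` rather than a missing key.
fn email_member(email: Option<&str>) -> String {
    match email {
        Some(value) => format!("\"email\":\"{value}\""),
        None => "\"email\":null".to_string(),
    }
}
```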

## Testing

- Affected test targets compile; scoped Clippy and formatting pass.
- A focused TUI snapshot covers plan-only account status.
- A real before/after PAT login smoke test covers metadata without
email.
- An app-server smoke test covers `account/read` with `email: null`.
- A regression smoke test covers an existing email-bearing PAT.
- Unit tests run in CI.

## Evidence

Visual smoke evidence will be attached here.

efrazer-oai · 2026-06-22 13:19:40 -07:00

5a67d898a5

Add workspace messages app-server API (#29001)

## Summary

- Add backend-client types and fetch support for active workspace
messages.
- Add the app-server v2 `account/workspaceMessages/read` method,
generated schemas, and README documentation.
- Delegate workspace-message eligibility to the Codex backend feature
gate; map a backend 404 to `featureEnabled: false`.
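
The 404 mapping in the last bullet can be sketched as below. The response struct, its `messages` field, and the treatment of other status codes as errors are assumptions for illustration; only the `featureEnabled: false` result for a 404 comes from this description.

```rust
/// Hypothetical shape of the `account/workspaceMessages/read` result.
#[derive(Debug, PartialEq)]
struct WorkspaceMessagesResponse {
    feature_enabled: bool,
    messages: Vec<String>,
}

/// Translates a backend HTTP status into the app-server result.
/// A 404 means the backend feature gate is off, not that the call failed.
fn map_backend_result(
    status: u16,
    messages: Vec<String>,
) -> Result<WorkspaceMessagesResponse, String> {
    match status {
        200 => Ok(WorkspaceMessagesResponse { feature_enabled: true, messages }),
        404 => Ok(WorkspaceMessagesResponse { feature_enabled: false, messages: Vec::new() }),
        other => Err(format!("backend returned HTTP {other}")),
    }
}
```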

## Testing

- `just write-app-server-schema`
- `just test -p codex-backend-client`
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server workspace_messages`
- `just fix -p codex-backend-client -p codex-app-server-protocol -p
codex-app-server`
- `just fmt`

## Stack

- Base PR for #28232, which adds the TUI status-line integration.

xli-oai · 2026-06-22 04:25:07 -07:00

21d36296f1

Simplify multi-agent mode controls (#29324)

## Why

Multi-agent delegation policy was split across `multiAgentMode`,
`features.multi_agent_mode`, and `usage_hint_enabled`. These controls
could disagree: a requested mode could be downgraded by the feature
flag, and disabling usage hints also disabled mode instructions.

Some clients also need multi-agent tools without adding
delegation-policy text to model context. The previous two-mode API could
not express that directly.

## What changed

`multiAgentMode` is now the only live delegation-policy control:

| Mode | Behavior |
| --- | --- |
| `none` | Keep multi-agent tools available without adding mode instructions. |
| `explicitRequestOnly` | Only delegate after an explicit user request. |
| `proactive` | Delegate when parallel work materially improves speed or quality. |

- new threads default to `explicitRequestOnly`; omitting the mode on
later turns keeps the current value
- thread start, resume, fork, and settings responses always report the
concrete current mode instead of `null`
- mode selection remains sticky across turns and resume
- usage-hint text no longer controls whether mode instructions apply
- `features.multi_agent_mode` and `usage_hint_enabled` remain accepted
as ignored compatibility settings so existing configs continue to load
- app-server documentation and generated schemas describe the three-mode
API
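
The default and sticky behavior listed above can be sketched with a small state holder. `ThreadSettings` and its methods are hypothetical names; only the three mode values, the `explicitRequestOnly` default, and the keep-on-omit rule come from this description.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Default)]
enum MultiAgentMode {
    /// Wire value `none`: tools stay available, no mode instructions.
    None,
    #[default]
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical per-thread state holding the current concrete mode.
struct ThreadSettings {
    multi_agent_mode: MultiAgentMode,
}

impl ThreadSettings {
    /// New threads fall back to `explicitRequestOnly` when no mode is given.
    fn new(requested: Option<MultiAgentMode>) -> Self {
        Self { multi_agent_mode: requested.unwrap_or_default() }
    }

    /// An omitted mode keeps the current value; an explicit one replaces it.
    /// The return value is always concrete, never null.
    fn start_turn(&mut self, requested: Option<MultiAgentMode>) -> MultiAgentMode {
        if let Some(mode) = requested {
            self.multi_agent_mode = mode;
        }
        self.multi_agent_mode
    }
}
```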

## Tests

- `just test -p codex-core multi_agent_mode`
- `just test -p codex-core multi_agent_v2_config_from_feature_table`
- `just test -p codex-core spawn_agent_description`
- `just test -p codex-features`
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server multi_agent_mode`

jif · 2026-06-22 10:05:36 +02:00

c03742ca0a

Propagate safety buffering events to app-server clients (#29371)

Responses API safety buffering metadata currently stops at the transport
boundary, so app-server clients cannot render the in-progress safety
review state.

This change:
- decodes and deduplicates `safety_buffering` metadata from Responses
API SSE and WebSocket events without suppressing the original response
event
- emits a typed core event containing the requested model plus backend
use cases and reasons
- forwards that event as `turn/safetyBuffering/updated` through
app-server v2 and updates generated protocol schemas
- keeps the side-channel event out of persisted rollouts and turn timing

This supports the Codex Apps buffering UX and depends on the Responses
API backend work in https://github.com/openai/openai/pull/1044569 and
https://github.com/openai/openai/pull/1044571.

Validation:
- focused `codex-core` safety-buffering integration test passes
- `cargo check -p codex-core -p codex-app-server -p
codex-app-server-protocol`
- `just fix -p codex-api -p codex-protocol -p codex-core -p
codex-app-server-protocol -p codex-app-server -p codex-rollout -p
codex-rollout-trace -p codex-otel`
- `just fmt`
- broad package test run: 4,430/4,492 passed; 62 unrelated
local-environment/concurrency failures involved unavailable test
binaries, MCP subprocess setup, and app-server timeouts

Francis Chalissery · 2026-06-22 03:39:14 +00:00

566f7bf631

Expose thread-level multi-agent mode (#28792 )

## Why

Once multi-agent mode can be selected per turn, clients also need to
choose the initial selection when creating a thread and observe that
selection through lifecycle and settings APIs.

The selected value is intentionally distinct from the effective
model-visible value: no client selection is represented as `null`, even
though an eligible multi-agent v2 turn derives `explicitRequestOnly` as
its effective default.

## What changed

- Add the optional experimental `thread/start.multiAgentMode` parameter
and pass it through thread creation.
- Preserve an omitted initial value as an unset selection rather than
eagerly storing `explicitRequestOnly`.
- Apply an explicit `thread/start` selection to the first turn through
the session configuration established at thread creation.
- Restore the latest persisted effective mode as the selected baseline
on cold resume when rollout history contains one.
- Inherit the optional selected mode from a loaded parent when creating
related runtime threads.
- Return the current selected `multiAgentMode` from `thread/start`,
`thread/resume`, `thread/fork`, and thread settings, using `null` when
no mode is selected.
- Keep lifecycle reporting independent from model capability and feature
eligibility; core turn construction remains responsible for calculating
and persisting the effective mode.
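
A minimal sketch of the selection semantics described above, assuming a hypothetical `SessionSelection` type (the real session configuration is not shown in this PR). It illustrates that an omitted `thread/start` value stays unset and is reported as `null` (`None` here), and that an omitted `turn/start` value retains the current selection.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical holder for the client-selected mode; `None` means unset.
#[derive(Debug, Default)]
pub struct SessionSelection {
    selected: Option<MultiAgentMode>,
}

impl SessionSelection {
    /// An omitted `thread/start.multiAgentMode` is stored as unset, not as
    /// `ExplicitRequestOnly`.
    pub fn from_thread_start(multi_agent_mode: Option<MultiAgentMode>) -> Self {
        Self { selected: multi_agent_mode }
    }

    /// An omitted or `null` `turn/start.multiAgentMode` retains the selection.
    pub fn apply_turn_start(&mut self, multi_agent_mode: Option<MultiAgentMode>) {
        if let Some(mode) = multi_agent_mode {
            self.selected = Some(mode);
        }
    }

    /// Value reported by lifecycle and settings responses.
    pub fn reported(&self) -> Option<MultiAgentMode> {
        self.selected
    }
}
```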

## Not covered

- Clearing an existing loaded-session selection back to unset through
`turn/start`; omitted or `null` currently retains the session's
selection.
- A TUI control, slash command, or `config.toml` preference.

## Verification

- `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol`
- `CARGO_INCREMENTAL=0 just test -p codex-app-server multi_agent_mode`

The focused app-server coverage verifies explicit `thread/start`
initialization, first-turn prompting, nullable reporting for an omitted
selection, and retention of selections that are not currently
runtime-eligible.

## Stack

Stacked on #28685. This PR contains only the thread initialization and
lifecycle/settings API layer.

Shijie Rao · 2026-06-19 10:50:44 +02:00

7abfcf220b

Add per-turn multi-agent mode (#28685 )

## Why

Multi-agent v2 currently carries an explicit-request-only delegation
rule in its static usage hint. That provides a safe default, but it
prevents clients from selecting proactive delegation per turn without
changing static guidance or rewriting prior model context.

This change makes delegation mode a session selection that can be
updated through `turn/start`, while deriving the effective model-visible
mode separately for each turn. Eligible multi-agent v2 turns remain
explicit-request-only unless proactive mode is both selected and
enabled.

## What changed

- Add the experimental `turn/start.multiAgentMode` parameter with
`explicitRequestOnly` and `proactive` values. Omission retains the
loaded session's current optional selection.
- Add the default-off `features.multi_agent_mode` feature gate. Eligible
multi-agent v2 turns use the selected mode when enabled; an unset
selection or disabled gate resolves to `explicitRequestOnly`.
- Treat mode prompting as inapplicable for multi-agent v1 and other
unsupported session configurations, producing no multi-agent mode
developer message rather than rejecting the turn.
- Move the explicit-request-only rule out of the static v2 usage hint
and into a bounded, tagged developer context fragment.
- Emit the effective mode in initial context and only when that
effective mode changes on later turns.
- Persist the effective mode in `TurnContextItem` as the durable
baseline for resume and context-update comparisons.

Historical rollout items are not rewritten. Later mode developer
messages establish the current rule incrementally.
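
The resolution rules above can be summarized in a short sketch. The function names and the boolean inputs are hypothetical; the sketch only assumes what the list states: ineligible turns get no mode message, a disabled gate or unset selection resolves to `explicitRequestOnly`, and a message is emitted only when the effective mode changes.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Derives the effective, model-visible mode for one turn.
/// `None` means mode prompting does not apply (for example multi-agent v1).
pub fn effective_mode(
    eligible_v2_turn: bool,
    feature_gate_enabled: bool,
    selected: Option<MultiAgentMode>,
) -> Option<MultiAgentMode> {
    if !eligible_v2_turn {
        return None;
    }
    if feature_gate_enabled {
        Some(selected.unwrap_or(MultiAgentMode::ExplicitRequestOnly))
    } else {
        Some(MultiAgentMode::ExplicitRequestOnly)
    }
}

/// Emit a developer message in initial context (no previous baseline) and
/// afterwards only when the effective mode differs from the persisted one.
pub fn should_emit_mode_message(
    previous: Option<MultiAgentMode>,
    current: Option<MultiAgentMode>,
) -> bool {
    current.is_some() && previous != current
}
```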

## Not covered

- Initial selection through `thread/start` and selected-mode reporting
from thread lifecycle/settings APIs; those are isolated in the stacked
#28792.
- A TUI control or slash command for selecting the mode.
- Persisting a preferred mode to `config.toml`; selection remains
session/turn scoped.
- Changes to multi-agent concurrency limits, tool availability, or model
catalog capability declarations.
- Rewriting historical rollout prompt items. Cold resume restores the
latest persisted effective mode when available while leaving historical
developer messages intact.

## Verification

- `CARGO_INCREMENTAL=0 just test -p codex-core multi_agent_mode`
- Focused app-server coverage verifies that `turn/start.multiAgentMode`
produces proactive developer instructions for an eligible v2 turn.

## Stack

Followed by #28792, which adds `thread/start` initialization and
lifecycle/settings observability.

Shijie Rao · 2026-06-18 22:47:51 -07:00

fc8c6b7384

[3/3] app-server: configure environment connection timeout (#29025 )

## Why

Remote environments registered through `environment/add` currently use
the fixed 10-second WebSocket connection timeout. Slow-starting
executors need a caller-selected connection window, but this should not
add retry policy or couple exec-server behavior to Core’s
`deferred_executor` feature.

Make the timeout an optional part of the existing experimental request.
Existing clients continue using the current default, while callers that
know an executor may take longer can request a larger window explicitly.

Depends on #28683.

## What changed

- Add optional `connectTimeoutMs` to `EnvironmentAddParams` and document
it in the app-server README.
- Pass the optional timeout through `EnvironmentRequestProcessor` into
one `EnvironmentManager::upsert_environment()` path; the manager applies
the existing default when it is omitted.
- Preserve the existing single-attempt lifecycle. The configured value
controls WebSocket connection and handshake time for both initial
connection and later reconnects; initialization retains its separate
timeout.
- Add an app-server integration test that sends the real JSON-RPC
request and verifies a stalled handshake observes the requested timeout.
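
As a rough illustration of the fallback described above, the manager-side default could look like the sketch below. The function and constant names are hypothetical; only the optional `connectTimeoutMs` value and the 10-second default come from this PR.

```rust
use std::time::Duration;

/// The existing fixed WebSocket connection timeout named in this PR.
pub const DEFAULT_CONNECT_TIMEOUT: Duration = Duration::from_secs(10);

/// Maps an optional `connectTimeoutMs` request value to the timeout used for
/// connection and handshake. Omitted or `null` keeps the default.
pub fn resolve_connect_timeout(connect_timeout_ms: Option<u64>) -> Duration {
    connect_timeout_ms
        .map(Duration::from_millis)
        .unwrap_or(DEFAULT_CONNECT_TIMEOUT)
}
```

Keeping the default inside one resolution function matches the intent that existing clients see no behavior change when the field is absent.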

## Test plan

- `just test -p codex-app-server-protocol`
- `just test -p codex-exec-server`
- `just test -p codex-app-server
environment_add_applies_connect_timeout`

## Rollout

This is additive and does not enable `deferred_executor`. Callers should
send a non-default timeout only after a compatible app-server is
deployed; omitted or `null` values retain the existing 10-second
default.

sayan-oai · 2026-06-19 05:27:45 +00:00

f886e33e5a

Always use AVAS for realtime WebRTC calls (#28856 )

## Summary

- Remove the realtime `architecture` selector from core protocol,
app-server protocol, config parsing, generated schemas, and callers.
- Always create WebRTC realtime calls with the AVAS query params:
`intent=quicksilver&architecture=avas`.
- Keep direct websocket realtime behavior on the existing config/default
path, while WebRTC sessions started without an explicit version now
default to realtime v1 because AVAS requires v1.

## Notes

- WebRTC realtime now means AVAS. If a caller explicitly asks to start
WebRTC with realtime v2, Codex rejects that request because the AVAS
WebRTC path only supports realtime v1. Websocket realtime is separate
and can still use realtime v2.
- The old `[realtime] architecture = "realtimeapi" | "avas"` config knob
is removed. Local configs that still set it will need to delete that
line.
- Some app-server tests that were only trying to exercise realtime v2
protocol behavior now use websocket transport, because WebRTC is
intentionally locked to AVAS/v1. Separate WebRTC tests cover the AVAS
query params, v1 startup, SDP flow, and sideband join.
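
The version rules in these notes can be sketched as a single resolution function. The enum and function names are hypothetical, and `configured_default` stands in for whatever the existing config/default path produces for websocket sessions.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RealtimeVersion {
    V1,
    V2,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Transport {
    WebRtc,
    WebSocket,
}

/// Query params that every WebRTC realtime call now carries.
pub const AVAS_QUERY: &str = "intent=quicksilver&architecture=avas";

/// Resolves the realtime version for a start request.
pub fn resolve_version(
    transport: Transport,
    requested: Option<RealtimeVersion>,
    configured_default: RealtimeVersion,
) -> Result<RealtimeVersion, String> {
    match (transport, requested) {
        // AVAS only supports v1, so an explicit v2 request is rejected.
        (Transport::WebRtc, Some(RealtimeVersion::V2)) => {
            Err("WebRTC realtime uses AVAS, which only supports realtime v1".to_string())
        }
        // WebRTC without an explicit version defaults to v1.
        (Transport::WebRtc, _) => Ok(RealtimeVersion::V1),
        // Websocket realtime keeps the existing config/default behavior.
        (Transport::WebSocket, requested) => Ok(requested.unwrap_or(configured_default)),
    }
}
```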

## Validation

- Merged fresh `origin/main` at `83e6a786a2`.
- `just fmt`
- `just write-config-schema`
- `just write-app-server-schema`
- `git diff --check`
- `just test -p codex-api -p codex-core -p codex-app-server-protocol -p
codex-app-server realtime` (176 passed)
- `just test -p codex-protocol -p codex-config` (413 passed)

Peter Bakkum · 2026-06-18 19:11:21 -05:00

8e7c213f8f

core: load AGENTS.md from foreign environments (#28958 )

## Why

Make it possible to load `AGENTS.md` from remote exec-servers whose OS
differs from app-server's.

## What

- keep `AGENTS.md` discovery and provenance as `PathUri`, with
root-aware parent and ancestor traversal
- expose lifecycle instruction sources as legacy app-server path strings
in events while retaining `PathUri` internally
- preserve and test mixed POSIX and Windows paths in model context and
TUI status output
- cover remote Windows loading end to end by seeding the Wine prefix
through host filesystem APIs
- fix a bug in `PathUri`'s `parent()` implementation that would erase
Windows drive letters
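
To illustrate what root-aware parent traversal means for mixed POSIX and Windows paths, here is a simplified sketch over plain `file:` URI strings. It is not the `PathUri` implementation: `parent_uri` is a hypothetical helper, it ignores percent-encoding, and it does not handle URIs with a host authority.

```rust
/// Returns the parent of a `file:` URI, or `None` at the filesystem root.
/// A Windows drive such as `/C:/` is treated as part of the root and is
/// never stripped.
pub fn parent_uri(uri: &str) -> Option<String> {
    let path = uri.strip_prefix("file://")?;
    if !path.starts_with('/') {
        // URIs with an authority (file://host/...) are out of scope here.
        return None;
    }
    let bytes = path.as_bytes();
    let has_drive = bytes.len() >= 3
        && bytes[1].is_ascii_alphabetic()
        && bytes[2] == b':'
        && (bytes.len() == 3 || bytes[3] == b'/');
    // Split into the root ("/" or "/C:/") and the remaining segments.
    let (root, body) = if has_drive {
        (format!("{}/", &path[..3]), &path[3..])
    } else {
        ("/".to_string(), path)
    };
    let mut segments: Vec<&str> = body.split('/').filter(|s| !s.is_empty()).collect();
    // No segments left means we are already at the root.
    segments.pop()?;
    Some(format!("file://{}{}", root, segments.join("/")))
}
```

Walking ancestors is then a loop that calls `parent_uri` until it returns `None`, which terminates at `file:///C:/` on Windows targets and at `file:///` on POSIX targets.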

Adam Perry @ OpenAI · 2026-06-18 15:06:23 -07:00

dce673905a

Emit Trusted MCP App Identity on Tool-Call Items (#27132 )

## Summary

- Add optional `appContext` to app-server MCP tool-call items with
trusted `connectorId`, `linkId`, and `mcpAppResourceUri` metadata.
- Preserve that context across tool-call events, persisted history,
reconnects, and thread resume.
- Keep the deprecated top-level `mcpAppResourceUri` temporarily for
client migration.

The consumer contract is `{ appContext: { connectorId, linkId,
mcpAppResourceUri }, tool }`.

## Validation

- Full GitHub Actions suite passes, including CLA, Bazel tests, clippy,
release builds, and argument-comment lint.

---------

Co-authored-by: martinauyeung-oai <280153141+martinauyeung-oai@users.noreply.github.com>

martinauyeung-oai · 2026-06-18 14:02:54 -07:00

765309d5a6

Add app-server current-time impl (varlatency 3/n) (#28835 )

## What

Server should request:

```json
{
  "id": 42,
  "method": "currentTime/read",
  "params": {
    "threadId": "11111111-1111-1111-1111-aaaaafdc2c11"
  }
}
```

Client should respond with something like:

```json
{
  "id": 42,
  "result": {
    "currentTimeAt": 1781717655
  }
}
```

## Why

Sessions configured with `clock_source = "external"` need a
thread-specific external time source before inference. The system clock
remains the default production provider.

## Validation

- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server --test all
current_time_read_round_trip_adds_reminder_to_model_input`
- `cargo test -p codex-app-server
first_attestation_capable_connection_for_thread_only_uses_thread_subscribers`
- `cargo test -p codex-analytics`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`

Stacked on #28824.

rka-oai · 2026-06-18 13:12:11 -07:00

f4602b7516

Support openai/form extended form elicitations (#27500 )

## Summary
Allow App Server clients to opt into `openai/form` MCP elicitations.

Gabriel Peal · 2026-06-18 11:54:49 -07:00

21a599fa56

[codex] Support assistant realtime append text (#28836 )

## Why

Frontend realtime voice continuity needs to replay a tiny
previous-session overlap as actual conversation items, including
assistant text. The app-server `thread/realtime/appendText` API already
carries a role through to the Rust realtime websocket layer, but the
shared role enum only accepted `user` and `developer`.

## What Changed

- Added `assistant` to `ConversationTextRole` and regenerated the
app-server schema/type fixtures.
- Added `output_text` as a realtime conversation content type.
- Updated realtime websocket item creation so assistant appendText emits
`content: [{ type: "output_text", text }]`, while user and developer
continue to emit `input_text`.
- Updated app-server docs and tests to cover assistant appendText
alongside the existing developer role behavior.
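
The role-to-content-type mapping described above amounts to the following sketch. `ConversationTextRole` is named in this PR, while `content_type_for` is a hypothetical helper used only for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ConversationTextRole {
    User,
    Developer,
    Assistant,
}

/// Content type used when creating a realtime conversation item for a role.
pub fn content_type_for(role: ConversationTextRole) -> &'static str {
    match role {
        // Assistant text is replayed as model output.
        ConversationTextRole::Assistant => "output_text",
        // User and developer text keep the existing input type.
        ConversationTextRole::User | ConversationTextRole::Developer => "input_text",
    }
}
```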

## Validation

- `just write-app-server-schema`
- `just fmt` (first sandboxed attempt failed because `uv` could not
access `~/.cache/uv`; reran with filesystem access and passed)
- `just test -p codex-api` passed: 126/126
- `just test -p codex-app-server-protocol` passed: 239/239, including
generated JSON/TypeScript fixture checks
- `just test -p codex-app-server` was started locally but stopped per
request after unrelated local sandbox/Seatbelt failures (`sandbox-exec:
sandbox_apply: Operation not permitted`) and one missing local `codex`
binary failure; CI should be faster and more authoritative for the full
suite.

guinness-oai · 2026-06-17 20:57:13 -07:00

e922f46a0f

[codex] control automatic realtime handoff delivery (#27986 )

## What

Built on the realtime speech-control plumbing merged in #27917.

- Add optional `codexResponseHandoffPrefix` to `thread/realtime/start`.
- Apply that prefix only to automatic V1 commentary sent through
`conversation.handoff.append`; final answers remain unprefixed.
- Add opt-in `clientManagedHandoffs`. When true, core suppresses
automatic response handoffs and completion output so delivery is
controlled by explicit client append APIs.
- Preserve existing automatic behavior by default.
`codexResponsesAsItems: true` continues to select item routing when
client-managed mode is disabled.
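
One way to picture the resulting delivery policy is the sketch below. All type and function names are hypothetical, the prefix is assumed to be prepended by plain concatenation, and any item prefix applied on the item-routing path is left out.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Phase {
    Commentary,
    FinalAnswer,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Delivery {
    /// Client-managed mode: nothing is sent automatically.
    Suppressed,
    /// Routed as a regular realtime item.
    ItemCreate(String),
    /// Routed through `conversation.handoff.append`.
    HandoffAppend(String),
}

#[derive(Debug, Default)]
pub struct RealtimeStartOptions {
    pub client_managed_handoffs: bool,
    pub codex_responses_as_items: bool,
    pub codex_response_handoff_prefix: Option<String>,
}

/// Decides how one piece of automatic Codex output is delivered.
pub fn route_automatic_output(
    options: &RealtimeStartOptions,
    phase: Phase,
    text: &str,
) -> Delivery {
    if options.client_managed_handoffs {
        return Delivery::Suppressed;
    }
    if options.codex_responses_as_items {
        return Delivery::ItemCreate(text.to_string());
    }
    match (&options.codex_response_handoff_prefix, phase) {
        // Only commentary receives the prefix; final answers stay unprefixed.
        (Some(prefix), Phase::Commentary) => {
            Delivery::HandoffAppend(format!("{}{}", prefix, text))
        }
        _ => Delivery::HandoffAppend(text.to_string()),
    }
}
```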

## Why

Voice clients need two delivery policies: automatic background context
with silent commentary instructions and fully client-owned handoffs.
Phase-aware prefixing keeps routine commentary silent without
suppressing the final answer, while client-managed mode lets an app
decide exactly which updates to append.

## Validation

- `just fmt`
- `cargo test -p codex-app-server-protocol
serialize_thread_realtime_start`
- `RUST_MIN_STACK=16777216 cargo test -p codex-core --test all
conversation_handoff_persists_across_item_done_until_turn_complete`
- `RUST_MIN_STACK=16777216 cargo test -p codex-app-server --test all
webrtc_v1_client_managed_handoffs_disable_automatic_output`
- `RUST_MIN_STACK=16777216 cargo test -p codex-app-server --test all
webrtc_v1_final_automatic_handoff_omits_silent_prefix`
- `cargo build -p codex-cli --bin codex`
- Local Codex Apps compatibility check: 43 focused webview tests passed,
and a live voice session routed through the source-built app-server.

The explicit `RUST_MIN_STACK` avoids a macOS Tokio test-worker stack
overflow seen with the default test environment.

jiayuhuang-openai · 2026-06-18 02:22:29 +00:00

683bd170dc

[codex] Track plugin install and import telemetry failures (#28731 )

## Summary
- Track plugin install failures through the unified
`codex_plugin_install_failed` event for local installs, remote install
preflight failures, bundle failures, and remote catalog/backend
failures.
- Send classified `error_type` values in plugin install failure
analytics instead of raw error strings.
- Stop sending raw external-agent import errors in analytics while
preserving raw failure details in app-facing import
notifications/history.
- Keep raw plugin/migration diagnostics in `tracing::warn!` logs.
- Keep remote failure plugin names as the existing local placeholder
(`unknown`) and remove the extra telemetry plugin-name override.
- Change `ExternalAgentConfigImportParams.source` from a generated enum
to `string | null`, with legacy `claudeCode` / `claudeCowork` inputs
normalized to existing analytics values.

## Testing

charlesgong-openai · 2026-06-17 13:16:34 -07:00

3959ab0ffc

[codex] Restore thread recency with compatible migration history (#28671 )

## Summary

- Revert #28655, restoring the thread `recencyAt` behavior introduced by
#27910.
- Move `threads_recency_at` to migration 0039 so it no longer collides
with `external_agent_config_imports` at version 0038.
- Repair databases that already applied the recency migration as version
38 by moving the matching migration-history row to version 39 before
SQLx validation. The current version-38 migration can then apply
normally.
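
The repair step can be pictured with an in-memory sketch of the migration-history table. This is illustrative only: `AppliedMigration` is a hypothetical row type, and matching the row by its description is an assumption, since the PR does not say how the matching row is identified.

```rust
/// Hypothetical in-memory view of one migration-history row.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct AppliedMigration {
    pub version: i64,
    pub description: String,
}

/// Migration name taken from this PR.
pub const RECENCY_DESCRIPTION: &str = "threads_recency_at";

/// Moves a recency migration recorded as version 38 to version 39.
/// Returns true when a row was rewritten.
pub fn repair_recency_version(history: &mut [AppliedMigration]) -> bool {
    let mut repaired = false;
    for row in history.iter_mut() {
        if row.version == 38 && row.description == RECENCY_DESCRIPTION {
            row.version = 39;
            repaired = true;
        }
    }
    repaired
}
```

After the rewrite, version 38 is free again, so the `external_agent_config_imports` migration can be applied under its own number.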

## Validation

- `just test -p codex-state
migrations::tests::repairs_recency_migration_that_was_applied_as_version_38`
- `just test -p codex-state -p codex-rollout -p codex-thread-store -p
codex-app-server-protocol -p codex-tui`: 3,439 passed; six TUI tests
could not open the machine's existing read-only incident database at
`~/.codex/sqlite/state_5.sqlite`.
- `just fix -p codex-state`
- `just fmt`
- Verified that state migration versions are unique.

Jeremy Rose · 2026-06-17 18:52:18 +00:00

7dc7096ae1

Scope command approvals by execution environment (#28738 )

## Why

Command approval cache keys included the command and working directory,
but not the execution environment. An approval for `/workspace` locally
could therefore be reused for the same command and path on an executor.

## What changed

- Include the selected environment ID in shell and unified-exec approval
cache keys.
- Carry that ID through the normal command approval request so clients
can show which environment is being approved.
- Expose the environment through app-server as a required nullable
`environmentId` and show it in the inline TUI approval prompt.
- Keep older recorded approval events compatible when the environment is
absent.

For example, `echo ok` in local `/workspace` and `echo ok` in executor
`/workspace` now produce different approval keys and separate prompts.
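
A minimal sketch of an environment-aware approval key, assuming hypothetical `ApprovalKey` and `ApprovalCache` types and an optional environment ID (the real key layout is not shown in this PR):

```rust
use std::collections::HashSet;

/// Hypothetical cache key: command, working directory, and environment.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ApprovalKey {
    pub command: Vec<String>,
    pub cwd: String,
    /// `None` models an absent environment, as in older recorded events.
    pub environment_id: Option<String>,
}

#[derive(Default)]
pub struct ApprovalCache {
    approved: HashSet<ApprovalKey>,
}

impl ApprovalCache {
    pub fn approve(&mut self, key: ApprovalKey) {
        self.approved.insert(key);
    }

    /// An approval only matches when the environment matches as well.
    pub fn is_approved(&self, key: &ApprovalKey) -> bool {
        self.approved.contains(key)
    }
}
```

With the environment in the key, approving `echo ok` in one environment leaves the same command and path in another environment unapproved.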

## Scope

This PR does not change network approvals, Guardian review actions, MCP
elicitation, full-screen TUI rendering, or environment-ID validation.
Remote `shell_command` execution itself remains in #28722; this PR only
makes its approval key environment-aware.

jif · 2026-06-17 19:52:43 +02:00

1391d786bc

[ez][codex-rs] Support apps._default.default_tools_approval_mode (#27965 )

[from codex]

## Summary

- add `default_tools_approval_mode` to `[apps._default]` and expose it
through app-server v2 `config/read`
- apply it after managed, per-tool, and per-app approval settings,
before the built-in `auto` fallback
- document the precedence, regenerate config/app-server schemas, and add
unit plus end-to-end approval coverage

## Configuration

```toml
[apps._default]
default_tools_approval_mode = "prompt"
```

The effective precedence is managed requirements, tool-specific
`approval_mode`, app-specific `default_tools_approval_mode`,
`apps._default.default_tools_approval_mode`, then `auto`.
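
The precedence chain reads naturally as a sequence of optional layers. The sketch below assumes each layer is an `Option` and uses hypothetical names; only the `auto` and `prompt` values are taken from this PR, and other modes are omitted.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ApprovalMode {
    Auto,
    Prompt,
}

/// Resolves the effective approval mode for one tool call.
/// Earlier arguments take precedence over later ones.
pub fn effective_approval_mode(
    managed_requirement: Option<ApprovalMode>,
    tool_approval_mode: Option<ApprovalMode>,
    app_default: Option<ApprovalMode>,
    apps_default: Option<ApprovalMode>,
) -> ApprovalMode {
    managed_requirement
        .or(tool_approval_mode)
        .or(app_default)
        .or(apps_default)
        // Built-in fallback when no layer sets a mode.
        .unwrap_or(ApprovalMode::Auto)
}
```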

## Test plan

- `just write-config-schema`
- `just write-app-server-schema`
- `just write-app-server-schema --experimental`
- `just test -p codex-core app_tool_policy`
- `just test -p codex-core mcp_turn_metadata`
- `just test -p codex-config`
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server config_read_includes_apps`
- `just fix -p codex-config -p codex-core -p codex-app-server-protocol
-p codex-app-server`
- `just fmt`

Alex Zamoshchin · 2026-06-17 08:50:39 -07:00

c78911e37f

Revert thread recencyAt for sidebar ordering (#28655 )

## Why

Revert #27910 to remove the newly introduced thread `recencyAt`
persistence and API behavior from `main`.

## What changed

This reverts commit `fac3158c2a783095768076489815f361fa9b0db4`,
including the state migration, thread-store propagation, app-server API
surface, generated schemas, and related tests.

## Validation

Not run before opening; relying on CI for the initial fast signal.

pakrym-oai · 2026-06-16 21:39:30 -07:00

cb15c64760

Add thread recencyAt for sidebar ordering (#27910 )

## Summary

Add a server-owned `recencyAt` timestamp and `recency_at` thread-list
sort key for product recency ordering while preserving the existing
meaning of `updatedAt` as the latest persisted thread mutation.

This is the server-side alternative to #27697. Rather than narrowing
`updatedAt`, clients can sort the sidebar by `recency_at` and continue
treating `updatedAt` as mutation time.

Paired Codex Apps PR:
[openai/openai#1024599](https://github.com/openai/openai/pull/1024599)

## Contract

- `recencyAt` initializes when a thread is created.
- A turn start advances `recencyAt` monotonically.
- Commentary, agent output, tool results, token/accounting updates, turn
completion, archive, unarchive, resume, and generic metadata writes do
not advance it.
- `updatedAt` retains its existing behavior and continues to advance for
persisted thread mutations.
- Current servers populate `recencyAt`; the response field is optional
in generated TypeScript so clients connected to older servers can fall
back to `updatedAt`.
- Filesystem-only fallback uses existing updated/mtime ordering when
SQLite is unavailable.

## Persistence and compatibility

Migration 0038 adds second- and millisecond-precision recency columns,
backfills them from the existing updated timestamp, creates list
indexes, and includes an insert trigger so older binaries writing to a
migrated database seed recency without causing later mutations to
advance it.

Generic metadata upserts preserve existing recency values. Turn-start
updates use a dedicated monotonic touch, and process-local allocation
keeps millisecond cursor values unique. State DB list, search, read,
filtered-list repair, rollout fallback propagation, and app-server
conversions all carry the new field.
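
The monotonic touch and unique cursor allocation can be sketched as follows. The names are hypothetical and the real implementation lives in the state database layer; the sketch only shows the ordering guarantees described here, plus the client-side fallback to `updatedAt` for older servers.

```rust
/// Process-local allocator that never hands out the same millisecond twice.
#[derive(Default)]
pub struct RecencyClock {
    last_ms: i64,
}

impl RecencyClock {
    /// Returns `now_ms`, or the next free millisecond if `now_ms` was
    /// already used or lies in the past.
    pub fn allocate(&mut self, now_ms: i64) -> i64 {
        let next = now_ms.max(self.last_ms + 1);
        self.last_ms = next;
        next
    }
}

/// Monotonic touch: a turn start can only move recency forward.
pub fn touch_recency(current_ms: i64, candidate_ms: i64) -> i64 {
    current_ms.max(candidate_ms)
}

/// Client-side sort key: fall back to `updatedAt` when an older server
/// omits `recencyAt`.
pub fn sort_key(recency_at: Option<i64>, updated_at: i64) -> i64 {
    recency_at.unwrap_or(updated_at)
}
```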

## API

`Thread` responses include:

```ts
recencyAt?: number
```

`thread/list` and `thread/search` accept:

```json
{ "sortKey": "recency_at" }
```

Generated TypeScript and JSON schemas are included.

## Validation

- `just test -p codex-state` — 146 passed
- `just test -p codex-rollout` — 69 passed
- `just test -p codex-thread-store` — 81 passed
- `just test -p codex-app-server-protocol` — 231 passed
- Focused app-server list ordering, response mapping, archive/unarchive,
and resume lifecycle tests passed
- Scoped `just fix` for state, rollout, thread-store,
app-server-protocol, and app-server
- `just fmt`
- `git diff --check`
- Independent correctness, simplicity, elegance, security, and
test-quality reviews; actionable ordering, lifecycle, query-projection,
and timestamp-uniqueness findings were addressed

Jeremy Rose · 2026-06-16 17:06:22 -07:00

fac3158c2a

[codex] expose Bedrock credential source in account/read (#27751 )

## Why

`account/read` currently reports only `type: "amazonBedrock"`, so
clients cannot distinguish a Codex-managed Bedrock API key from
credentials supplied by AWS. The app UI needs that distinction to render
the appropriate account state without duplicating provider-auth logic.

Credential-source selection belongs to the Bedrock model provider
because it already owns the precedence between managed Bedrock auth and
the external AWS credential path. This builds on #27443 and #27689.

## What changed

- Added `AmazonBedrockCredentialSource` with `codexManaged` and
`awsManaged` values.
- Included the selected credential source in
`ProviderAccount::AmazonBedrock` and the app-server `Account` response.
- Made `AmazonBedrockModelProvider::account_state()` classify the source
from its managed-auth state.
- Regenerated the app-server JSON and TypeScript schemas.
- Updated app-server account documentation and downstream TUI matches.

`codexManaged` means the provider found a managed Bedrock API key.
`awsManaged` identifies the provider's external AWS credential path; it
does not assert that the AWS credential chain has been validated.
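
As a small illustration of the classification, assuming the provider only needs to know whether a managed Bedrock API key was found (the helper names below are hypothetical):

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AmazonBedrockCredentialSource {
    CodexManaged,
    AwsManaged,
}

impl AmazonBedrockCredentialSource {
    /// A managed Bedrock API key takes precedence; otherwise the provider
    /// uses its external AWS credential path, which is not validated here.
    pub fn from_managed_auth(has_managed_api_key: bool) -> Self {
        if has_managed_api_key {
            Self::CodexManaged
        } else {
            Self::AwsManaged
        }
    }

    /// Wire value used in the `account/read` response.
    pub fn wire_value(self) -> &'static str {
        match self {
            Self::CodexManaged => "codexManaged",
            Self::AwsManaged => "awsManaged",
        }
    }
}
```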

## Testing

- Added model-provider coverage for Codex-managed precedence and
AWS-managed fallback.
- Added app-server protocol serialization coverage for both wire values.
- Added app-server integration coverage for both `account/read`
responses.
- `just test -p codex-protocol -p codex-model-provider -p
codex-app-server-protocol` (497 tests passed).

After rebasing onto #27711, the `codex-app-server` test target compiled
past the image-generation `PathUri` migration. Local linking was then
interrupted by disk exhaustion (`No space left on device`).

Celia Chen · 2026-06-16 07:14:53 +00:00

12aaeb7bf8

[codex] Add interruptible sleep tool (#28429 )

## Why

Models sometimes need to pause briefly while waiting for external work,
but using a shell command for that delay ties the wait to a process and
does not naturally resume when new turn input arrives.

## What changed

- add a built-in `sleep` tool behind the under-development `sleep_tool`
feature
- accept a bounded `duration_ms` argument, matching the millisecond
convention used by unified exec
- end the sleep early when either steered user input or mailbox input
arrives
- include elapsed wall-clock time in completed and interrupted outputs
- emit a dedicated core `SleepItem` through `item/started` and
`item/completed`
- expose the sleep item as app-server v2 `ThreadItem::Sleep` and retain
it in reconstructed thread history
- regenerate the configuration schema for the new feature flag
- regenerate app-server JSON and TypeScript schema fixtures
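
The core idea of an interruptible, bounded sleep can be sketched with a standard-library channel. This is not the tool's implementation, which is not shown here: `MAX_SLEEP_MS` is a hypothetical bound, and the single `Receiver<()>` stands in for both steered user input and mailbox input.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical upper bound on `duration_ms`.
pub const MAX_SLEEP_MS: u64 = 60_000;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SleepOutcome {
    Completed,
    Interrupted,
}

/// Sleeps for up to `duration_ms`, ending early when new input arrives.
/// Returns the outcome together with the elapsed wall-clock time.
pub fn interruptible_sleep(duration_ms: u64, input: &Receiver<()>) -> (SleepOutcome, Duration) {
    let started = Instant::now();
    let wait = Duration::from_millis(duration_ms.min(MAX_SLEEP_MS));
    let outcome = match input.recv_timeout(wait) {
        Ok(()) => SleepOutcome::Interrupted,
        Err(RecvTimeoutError::Timeout) => SleepOutcome::Completed,
        Err(RecvTimeoutError::Disconnected) => {
            // No sender can interrupt any more; finish the remaining wait.
            thread::sleep(wait.saturating_sub(started.elapsed()));
            SleepOutcome::Completed
        }
    };
    (outcome, started.elapsed())
}
```

Unlike a shell `sleep`, the wait is not tied to a child process, and the elapsed time is available for both completed and interrupted outputs.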

## Test plan

- `just test -p codex-core sleep_tool_follows_feature_gate`
- `just test -p codex-core any_new_input_interrupts_sleep`
- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server
sleep_emits_started_and_completed_items`

pakrym-oai · 2026-06-15 21:39:21 -07:00

08901fc8e1

Add a toggle for realtime startup context (#28405 )

## Summary
- Add `includeStartupContext` to realtime start requests so callers can
explicitly skip Codex startup context while keeping the backend prompt
- Thread the new flag through protocol types, request processing, and
realtime session config
- Update app-server docs and coverage for the new default and opt-out
behavior

## Testing
- Added protocol serialization coverage for `includeStartupContext`
- Added realtime integration coverage for starting a session with
startup context disabled

guinness-oai · 2026-06-15 17:14:22 -07:00

d5b4b98370

Add realtime speech append control (#27917 )

## Why

Realtime voice harness tuning needs app-side control over what backend
Codex text is spoken. Backend orchestrator text is written for a reading
UI, so automatically speaking every preamble, progress update, or final
assistant message can make the realtime voice model too chatty.

For experimentation, clients need two simple controls: keep app/client
text-item injection on the existing item-create path, and add an
explicit speakable path that app code can call only when it wants
realtime to speak. Automatic Codex output also needs an opt-in way to
switch from the protocol's default speakable path to regular realtime
items, with a caller-provided prefix so prompt wording can be tuned
outside core.

The default remains unchanged: if a client omits the new start fields
and never calls `appendSpeech`, automatic backend output continues down
the existing speakable path for the selected realtime protocol.

## What Changed

- Adds experimental `thread/realtime/appendSpeech` for app-provided
speakable text.
- Keeps existing `thread/realtime/appendText` as the item-create API for
app-provided realtime text items.
- Adds `codexResponsesAsItems` / `codex_responses_as_items` on
`thread/realtime/start` to send automatic Codex responses with
`conversation.item.create` instead of the protocol's default speakable
output path.
- Adds `codexResponseItemPrefix` / `codex_response_item_prefix` so
clients can prepend experiment instructions to those automatic Codex
response items.
- Keeps literal `conversation.handoff.append` routing scoped to the v1
speakable path; v2 default speech uses its item/function-output plus
`response.create` behavior.
- Removes the earlier public silent-context API and hardcoded
silent-context prefix.
- Updates realtime tests to cover default automatic speakable behavior,
opt-in automatic item-create behavior, and explicit `appendSpeech`
behavior.

## Validation

- `cargo check -p codex-core -p codex-app-server -p codex-api`
- `just test -p codex-app-server realtime_conversation`
- `just test -p codex-core realtime_conversation` (50/51 passed in the
filtered parallel run; the lone failure passed when rerun in isolation)
- `just test -p codex-core
conversation_mirrors_assistant_message_text_to_realtime_handoff`
- `just test -p codex-api
e2e_connect_and_exchange_events_against_mock_ws_server`
- `just fix -p codex-core`
- `just fix -p codex-app-server`
- `cargo build -p codex-cli`

guinness-oai · 2026-06-15 16:15:58 -07:00

1d8ff89aa3

[codex] Add created-by-me remote plugin marketplace (#28203 )

## Summary
- add the `created-by-me-remote` marketplace backed by paginated
`scope=USER` plugin directory and installed-plugin requests
- include USER plugins in installed-plugin caching, bundle sync, and
stale-cache cleanup without client-side discoverability filtering
- expose the marketplace through app-server v2 and regenerate the
protocol schemas

## Testing
- `cargo build -p codex-app-server --bin codex-app-server`
- production-auth `plugin/list` smoke test for `created-by-me-remote`
(returned the expected USER plugin as installed and enabled)
- `just test -p codex-core-plugins` (221 passed)
- `just test -p codex-app-server-protocol` (231 passed)
- `just test -p codex-app-server suite::v2::plugin_list::` (37 passed)
- `just fix -p codex-core-plugins -p codex-app-server-protocol -p
codex-app-server`
- `just fmt`

Eric Ning · 2026-06-15 22:07:07 +00:00

709f19e111

feat(app-server): expose rate-limit reset credits (#28143 )

## Why

Codex users can earn personal rate-limit reset credits, but app-server
clients do not currently have an API for reading or redeeming them. This
adds the backend and protocol foundation used by the `/usage` TUI flow
in #28154.

## What changed

- Extend `account/rateLimits/read` with a nullable
`rateLimitResetCredits` summary sourced from the existing usage
response.
- Add backend-client and app-server support for consuming a reset with a
caller-generated idempotency key. A UUID is recommended, and clients
reuse the same key when retrying the same logical reset.
- Return only the consume `outcome`; clients refetch
`account/rateLimits/read` for updated window state.
- Document the response field and each consume outcome, and regenerate
the JSON and TypeScript schema fixtures.
- Clarify in `AGENTS.md` that new app-server string enum values use
camelCase on the wire.
- Update the existing TUI response fixture for the expanded protocol
shape.
- Add coverage for authentication, response mapping, backend failures,
consume outcomes, and request timeout behavior.
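
To show why clients should reuse the same idempotency key when retrying one logical reset, here is a sketch against a mock backend. Everything in it is hypothetical: the outcome names, the `MockResetBackend` type, and the assumption that the backend replays the stored outcome for a known key.

```rust
use std::collections::HashMap;

/// Hypothetical outcome names; the real wire values are documented elsewhere.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ConsumeOutcome {
    Consumed,
    NoCreditsAvailable,
}

/// Mock backend that remembers the outcome for each idempotency key.
pub struct MockResetBackend {
    credits: u32,
    outcomes_by_key: HashMap<String, ConsumeOutcome>,
}

impl MockResetBackend {
    pub fn new(credits: u32) -> Self {
        Self { credits, outcomes_by_key: HashMap::new() }
    }

    pub fn credits(&self) -> u32 {
        self.credits
    }

    /// A retry with a known key replays the stored outcome and spends nothing.
    pub fn consume(&mut self, idempotency_key: &str) -> ConsumeOutcome {
        if let Some(outcome) = self.outcomes_by_key.get(idempotency_key) {
            return *outcome;
        }
        let outcome = if self.credits > 0 {
            self.credits -= 1;
            ConsumeOutcome::Consumed
        } else {
            ConsumeOutcome::NoCreditsAvailable
        };
        self.outcomes_by_key.insert(idempotency_key.to_string(), outcome);
        outcome
    }
}
```

A client that generated a fresh key on every retry would risk spending a second credit after a timed-out but successful first attempt.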

## Validation

- `just test -p codex-app-server-protocol` — 231 passed.
- `just test -p codex-backend-client` — 14 passed.
- Focused `codex-app-server` reset-credit tests — 5 passed.
- Focused `codex-tui` protocol response fixture test — passed.
- `just fix -p codex-backend-client -p codex-app-server-protocol -p
codex-app-server` — passed.
- `just fmt` — passed.

jay · 2026-06-15 21:54:01 +00:00

bef99f861b

[codex] Add external agent import result accounting (#28008 )

## Why

External-agent imports can complete synchronously or continue in the
background for plugins/sessions. Clients need a stable import id to
correlate the immediate response with the eventual completion
notification, and the completion payload needs enough accounting to show
which artifact types succeeded or failed without hiding partial
failures.

## What Changed

- `externalAgentConfig/import` now returns an `importId`;
`externalAgentConfig/import/completed` includes the same `importId` plus
type-level `itemResults`.
- Completed `itemResults` report `successCount`, `errorCount`,
`successes`, and `rawErrors` for each migrated item type.
- Added protocol/schema/TypeScript types for import successes, raw
errors, and type-level results. No progress notification is included in
the final PR.
- `ExternalAgentConfigService::import` now returns an outcome object
with synchronous item results and pending plugin imports.
- Plugin import outcomes track succeeded/failed marketplaces, plugin
ids, and raw errors. Plugin failures can be reported in completed
accounting while later migration items continue.
- Non-plugin synchronous import failures still fail the request, so
invalid config/skills-style failures are not reported as a successful
import response.
- Session imports now return item results. Successful imports include
the source session path and imported thread id; prepare, persist,
ledger, and source-validation failures become raw errors in completion
accounting where the import can continue.
- The request processor generates the `importId`, aggregates synchronous
results with background plugin/session results, and sends a single
completed notification when all selected work is done.
- App-server docs and generated schema fixtures were updated for the new
response/completed payload shapes.
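
A minimal client-side sketch of how the accounting above might be consumed: it matches the completion notification to the earlier response by `importId` and flags partial failures per item type. The names `importId`, `itemResults`, `successCount`, `errorCount`, `successes`, and `rawErrors` come from the description; the `itemType` key, the sample payloads, and the helper name are assumptions for illustration, not the actual schema.

```python
from typing import Any, Optional


def summarize_completion(
    pending_import_ids: set, notification: dict
) -> Optional[dict]:
    """Return a per-type summary if the notification matches a pending import."""
    import_id = notification.get("importId")
    if import_id not in pending_import_ids:
        return None  # Not an import this client started; ignore it.
    pending_import_ids.discard(import_id)

    summary: dict = {}
    for result in notification.get("itemResults", []):
        # "itemType" is a hypothetical key naming the migrated item type.
        summary[result["itemType"]] = {
            "ok": result["successCount"],
            "failed": result["errorCount"],
            "errors": list(result.get("rawErrors", [])),
        }
    partial_failure = any(entry["failed"] > 0 for entry in summary.values())
    return {"importId": import_id, "types": summary, "partialFailure": partial_failure}


# Hypothetical payloads shaped after the description above.
response: dict = {"importId": "import-1"}
pending = {response["importId"]}
completed: dict = {
    "importId": "import-1",
    "itemResults": [
        {"itemType": "plugins", "successCount": 2, "errorCount": 1,
         "successes": [], "rawErrors": ["marketplace unreachable"]},
        {"itemType": "sessions", "successCount": 3, "errorCount": 0,
         "successes": [], "rawErrors": []},
    ],
}
report = summarize_completion(pending, completed)
print(report)
```

Keying the lookup on `importId` lets a client ignore completions for imports it did not start, and keeping `rawErrors` alongside the counts avoids hiding partial failures behind an overall success.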

## Validation

- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server-client event_requires_delivery`
- `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-review-sync-error
just test -p codex-app-server
external_agent_config_import_returns_error_for_failed_sync_import`
- `CODEX_SQLITE_HOME=/private/tmp/codex-app-server-review-external-agent
just test -p codex-app-server external_agent_config`

Note: local sandbox validation used `CODEX_SQLITE_HOME` because the
default sqlite state path is read-only in this environment.

charlesgong-openai · 2026-06-15 13:25:42 -07:00

fc1fb682a7

Expose explicit dynamic tool namespaces in thread start (#27371)

Stacked on #27365.

## Stack note

[#27365](https://github.com/openai/codex/pull/27365) kept `thread/start`
unchanged and converted its input in `thread_processor`. This PR updates
`thread/start` to accept explicit functions and namespaces directly.

Legacy per-tool arrays are still accepted and converted while reading
the request. As a result, `thread_processor` can validate and pass the
tools through directly, which is why some code added in #27365 is
removed here.

## Why

`thread/start.dynamicTools` still repeats namespace data on each
function even though core now stores explicit namespace groups. The
request API should use the same shape so each namespace has one
description and one member list.

## What changed

- Accept top-level functions and explicit namespace objects in
`dynamicTools`.
- Continue accepting fully legacy flat arrays, including
`exposeToContext`.
- Reject arrays that mix legacy and canonical entries.
- Reuse the protocol types directly and remove the temporary app-server
adapter.
- Update validation, docs, the test client, and generated schemas.
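
The normalization and mixed-format rejection described above can be sketched roughly as follows. This is not the app-server implementation: the entry shapes are assumptions made for illustration, namely that canonical entries carry a `type` discriminator and that legacy entries repeat `namespace` and `namespaceDescription` on each function.

```python
from typing import Any


def normalize_dynamic_tools(entries: list) -> list:
    """Group a fully legacy flat array by namespace; reject mixed arrays."""
    # Assumption: canonical entries have a "type" key, legacy entries do not.
    canonical = [entry for entry in entries if "type" in entry]
    legacy = [entry for entry in entries if "type" not in entry]
    if canonical and legacy:
        raise ValueError("dynamicTools must not mix legacy and canonical entries")
    if not legacy:
        return entries  # Already canonical (or empty); pass through unchanged.

    functions: list = []
    namespaces: dict = {}
    for tool in legacy:
        function: dict[str, Any] = {"type": "function", "name": tool["name"]}
        namespace = tool.get("namespace")
        if namespace is None:
            functions.append(function)  # Top-level function, no namespace.
            continue
        # The first entry seen for a namespace supplies its single description.
        group = namespaces.setdefault(
            namespace,
            {
                "type": "namespace",
                "name": namespace,
                "description": tool.get("namespaceDescription", ""),
                "functions": [],
            },
        )
        group["functions"].append(function)
    # Top-level functions first, then one object per namespace.
    return functions + list(namespaces.values())


legacy_tools = [
    {"name": "status", "namespace": "git", "namespaceDescription": "Git tools"},
    {"name": "diff", "namespace": "git", "namespaceDescription": "Git tools"},
    {"name": "ping"},
]
normalized = normalize_dynamic_tools(legacy_tools)
print(normalized)
```

After normalization each namespace appears once, with one description and one member list, which is the shape the description says core already stores.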

## Test plan

- `just test -p codex-app-server-protocol`
- `just test -p codex-app-server
dynamic_tool_call_round_trip_sends_text_content_items_to_model`
- `just test -p codex-app-server
thread_start_normalizes_legacy_dynamic_tools_into_model_request`
- `just test -p codex-app-server
thread_start_rejects_mixed_dynamic_tool_formats`
- `just test -p codex-app-server
thread_start_rejects_hidden_dynamic_tools_without_namespace`

sayan-oai · 2026-06-15 15:35:57 +00:00

11faf9af94

Activate selected executor plugin MCPs in app-server (#27893)

## Why

#27870 teaches the MCP extension how to discover stdio MCP servers
declared by a selected executor plugin, but app-server does not yet
install that contributor or initialize its per-thread state. As a
result, `thread/start.selectedCapabilityRoots` can select the plugin
while its MCP servers remain inactive.

This PR closes that app-server wiring gap:

```text
thread/start(selectedCapabilityRoots)
    -> initialize the thread's selected-plugin MCP snapshot
    -> read the selected plugin's .mcp.json through its environment
    -> start declared stdio servers in that environment
    -> expose their tools only on the selected thread
```

## What changed

- Install the selected-executor-plugin MCP contributor in app-server
using the existing shared `EnvironmentManager`.
- Initialize its frozen thread snapshot when `thread/start` includes
selected capability roots.
- Document that selected plugin stdio MCPs are activated in their owning
environment.
- Add an app-server E2E covering the complete selection-to-tool-call
path.

The E2E verifies that:

- the selected MCP process receives an executor-only environment value,
proving the tool runs through the selected environment;
- the MCP tool is advertised to the model and can be called;
- a normal MCP config reload does not discard the thread's frozen
selected-plugin registration;
- another thread without the selected root does not see the MCP server.
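
The isolation and reload behavior the E2E checks can be modeled with a small sketch. All class and method names here are hypothetical and stand in for the real contributor and `EnvironmentManager` wiring; the model only illustrates a per-thread frozen snapshot that a normal config reload does not touch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SelectedPluginSnapshot:
    """Hypothetical frozen view of the MCP servers a thread selected at start."""

    servers: frozenset


@dataclass
class McpRegistry:
    """Hypothetical registry combining global MCP config with thread snapshots."""

    global_servers: set = field(default_factory=set)
    thread_snapshots: dict = field(default_factory=dict)

    def start_thread(self, thread_id: str, selected_servers: list) -> None:
        # Only threads that select capability roots get a snapshot.
        if selected_servers:
            self.thread_snapshots[thread_id] = SelectedPluginSnapshot(
                frozenset(selected_servers)
            )

    def reload_config(self, servers: set) -> None:
        # A normal reload replaces the global servers and leaves snapshots alone.
        self.global_servers = set(servers)

    def visible_servers(self, thread_id: str) -> set:
        snapshot = self.thread_snapshots.get(thread_id)
        selected = set(snapshot.servers) if snapshot else set()
        return self.global_servers | selected


registry = McpRegistry(global_servers={"docs"})
registry.start_thread("thread-1", ["plugin-mcp"])  # selects a plugin root
registry.start_thread("thread-2", [])              # no selected roots
registry.reload_config({"docs", "search"})
print(sorted(registry.visible_servers("thread-1")))
print(sorted(registry.visible_servers("thread-2")))
```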

## Scope

- Existing sessions without `selectedCapabilityRoots` are unchanged.
- Only stdio MCP declarations are activated. HTTP declarations remain
inactive.
- This does not change selected-root persistence across resume/fork or
add hosted-plugin behavior.

## Verification

- Focused app-server E2E:
`selected_executor_plugin_exposes_its_stdio_mcp_only_to_that_thread`

## Stack

Stacked on #27870.

jif · 2026-06-15 16:23:37 +02:00

c8c78b63a7

feat(app-server): filter threads by parent (#26662)

## Why

Clients that display or coordinate spawned subagents need an
authoritative snapshot of a thread's immediate spawned children when
they connect to app-server or recover after missing live events.
`thread/list` cannot query by parent, so clients must otherwise scan
unrelated threads or reconstruct relationships from rollout history and
transient events.

The direct spawn relationship already exists in persisted
`thread_spawn_edges` state. Review and Guardian threads do not
participate in that lifecycle and are intentionally outside this
filter's scope.

## What changed

This adds an experimental `parentThreadId` filter to `thread/list`.
Parent-filtered requests return direct spawned children from persisted
state while preserving the existing response shape, explicit filters,
sorting, and timestamp-only cursor behavior. The lookup does not read
rollout transcripts or recursively return descendants.

Supersedes #25112 with the narrower `thread/list` filter approach.

## How it works

1. An experimental client passes a valid thread ID as `parentThreadId`.
2. App-server routes the list through the existing thread-store and
state-database boundaries.
3. SQLite selects threads whose IDs have a direct persisted spawn edge
from that parent.
4. Omitted provider and source filters include all values; explicit
filters keep ordinary `thread/list` semantics.
5. Grandchildren, Review threads, and Guardian threads are excluded.
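
The direct-edge lookup in step 3 can be illustrated with an in-memory SQLite sketch. Only the `thread_spawn_edges` table name comes from the description; the column names, the `threads` table layout, and the sample rows are assumptions, and the real query also applies the provider, source, and cursor handling that this sketch omits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: only the thread_spawn_edges name is from the source.
    CREATE TABLE threads (id TEXT PRIMARY KEY, updated_at INTEGER NOT NULL);
    CREATE TABLE thread_spawn_edges (
        parent_thread_id TEXT NOT NULL,
        child_thread_id TEXT NOT NULL
    );
    INSERT INTO threads VALUES
        ('root', 1), ('child-a', 2), ('child-b', 3), ('grandchild', 4), ('other', 5);
    INSERT INTO thread_spawn_edges VALUES
        ('root', 'child-a'), ('root', 'child-b'), ('child-a', 'grandchild');
    """
)


def list_direct_children(connection: sqlite3.Connection, parent_thread_id: str) -> list:
    """Return IDs of threads with a direct spawn edge from the given parent."""
    rows = connection.execute(
        """
        SELECT t.id
        FROM threads AS t
        JOIN thread_spawn_edges AS e ON e.child_thread_id = t.id
        WHERE e.parent_thread_id = ?
        ORDER BY t.updated_at DESC
        """,
        (parent_thread_id,),
    ).fetchall()
    return [row[0] for row in rows]


children = list_direct_children(conn, "root")
print(children)
```

Because the join follows a single edge rather than walking the graph, `grandchild` is not returned for `root`, and unrelated threads such as `other` are never scanned.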

## Verification

State (144 tests), rollout (69 tests), and focused app-server
thread-list (31 tests) suites passed. Scoped Clippy checks and
repository formatting also passed. Coverage includes direct spawned
children, omitted grandchildren, pagination, malformed IDs, mixed source
kinds, explicit filters, and operation without rollout files.

Brent Traut · 2026-06-14 00:14:26 -07:00

dfd03ea01b
