mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
53 Commits
-
Preserve namespaces on custom tool calls (#30302)
## Summary - Preserve the optional namespace on custom tool calls during response deserialization and app-server replay. - Use the namespaced tool identifier for streaming argument handling and tool dispatch. - Regenerate app-server protocol schemas. - Add regression tests covering namespace serialization and routing. ## Testing - Ran affected protocol and app-server test suites. - Ran the full core test suite; two load-sensitive timing tests passed when rerun individually. - Ran Clippy and formatting checks. - Verified with a local end-to-end app-server replay that the namespace is preserved through the complete request/response flow.
nhamidi-oai ·
2026-06-27 09:54:56 -07:00 -
chore(core) rm AskForApproval::OnFailure (#28418)
## Summary Deletes the OnFailure variant of the `AskForApproval` enum. This option has been deprecated since #11631. ## Testing - [x] Tests pass
Dylan Hurd ·
2026-06-23 12:13:54 -07:00 -
feat(core): store turn_id on ResponseItem metadata (#28360)
## Description This PR is a followup to https://github.com/openai/codex/pull/28355 and starts assigning `internal_chat_message_metadata_passthrough.turn_id` to durable Responses API items created during a turn. The goal is that those items keep the `turn_id` that introduced them when Codex resends stateless HTTP context, reconstructs history for resume/fork paths, or reuses websocket response state. ## What changed - Set `internal_chat_message_metadata_passthrough.turn_id` when missing as response items enter durable history, initial/replacement history, inter-agent communication history, and local compaction summaries. - Preserve existing item turn IDs instead of overwriting them during persistence, resume reconstruction, compaction, forked history, and websocket incremental reuse. - Keep `compaction_trigger` fieldless because it is a request control, not a durable response item. - Update focused history/request assertions and fixtures for stateless requests, websocket incrementals, compaction, thread injection, prompt debug, and related CI coverage.
Owen Lin ·
2026-06-22 16:45:14 -07:00 -
core: rename metadata -> internal_chat_message_metadata_passthrough (#28968)
## Description This PR cuts Codex over from generic `ResponseItem.metadata` (introduced here: https://github.com/openai/codex/pull/28355) to `ResponseItem.internal_chat_message_metadata_passthrough`, which is the blessed path and has strongly-typed keys. For now we have to drop this MAv2 usage of `metadata`: https://github.com/openai/codex/pull/28561 until we figure out where that should live.
Owen Lin ·
2026-06-22 11:11:25 -07:00 -
[codex] Assign response item IDs when recording history (#28814)
## Why Client-created response items enter history without IDs, so their identity is lost across rollout persistence and resume. IDs should be assigned once at the history-recording boundary, while IDs returned by the server must remain unchanged. The Responses API validates item IDs using type-specific prefixes. Locally generated IDs therefore use the matching prefix plus a hyphenated UUIDv7, keeping them valid while distinguishable from server-generated IDs. Because this changes persisted history and provider request shapes, the behavior is opt-in behind the under-development `item_ids` feature. Compaction triggers remain request controls whose API shape does not accept an ID. ## What changed - Register the disabled-by-default `item_ids` feature and expose it in `config.schema.json`. - Make supported optional `ResponseItem` IDs serializable and expose them in the generated app-server schemas. - When `item_ids` is enabled, assign an ID during conversation-history preparation if an item has no ID. - Generate type-prefixed, hyphenated UUIDv7 IDs using the Responses API item conventions. - Preserve existing server IDs without rewriting them. - Persist assigned IDs in rollouts and include them in subsequent Responses requests. - Remove the unsupported ID field from `CompactionTrigger` and document why it has no ID. - Add integration coverage for enabled ID persistence, preservation of server IDs, and omission of generated IDs while the feature is disabled. `prepare_conversation_items_for_history` is the single response-item ID allocation boundary. ## Test plan - `just test -p codex-features` - `just test -p codex-core response_item_ids_persist_across_resume_and_preserve_server_ids` - `just test -p codex-core non_openai_responses_requests_omit_item_turn_metadata` - `just test -p codex-core resize_all_images_prepares_failures_before_history_insertion` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-api azure_default_store_attaches_ids_and_headers`
pakrym-oai ·
2026-06-18 17:30:55 -07:00 -
[codex] Add optional IDs to response items (#28812)
## Why `ResponseItem` variants do not have a consistent internal ID shape: some variants carry required IDs, some carry optional IDs, and some cannot represent an ID at all. The existing fields also use inconsistent serde, TypeScript, and JSON-schema annotations. A single enum-level access path is needed before history recording can assign and retain IDs. This PR establishes that internal model only. It intentionally does not generate or serialize IDs; allocation and wire persistence are isolated in the stacked follow-up. ## What changed - Give every concrete `ResponseItem` variant an `Option<String>` ID field. - Apply the same internal-only annotations to every ID field: `#[serde(default, skip_serializing)]`, `#[ts(skip)]`, and `#[schemars(skip)]`. - Add `ResponseItem::id()` and `ResponseItem::set_id()` as the shared accessors. - Preserve IDs when history items are rewritten for truncation. - Adapt consumers that previously assumed reasoning and image-generation IDs were required. - Regenerate app-server schemas so the hidden fields are represented consistently. The serde catch-all `ResponseItem::Other` remains ID-less because it must remain a unit variant. ## Test plan - `cargo check --tests -p codex-core -p codex-api -p codex-rollout-trace -p codex-image-generation-extension` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-api -p codex-rollout-trace -p codex-image-generation-extension` - `just test -p codex-core event_mapping`
pakrym-oai ·
2026-06-17 18:27:43 -07:00 -
Add join key for MAv2 inter-agent messages (#28561)
## Summary This keeps inter-agent communication on the existing raw response item path and adds a join key for MAv2 tool calls. MAv2 `spawn_agent`, `send_message`, and `followup_task` now stamp the originating tool call id into `ResponseItemMetadata.source_call_id` on the raw `ResponseItem::AgentMessage`. App-server clients can join that raw item back to the existing tool/activity event by call id, while using the raw agent message's existing sender, receiver, and content fields. No new app-server `ThreadItem` or notification type is added. ## Tests - `just fmt` - `just write-app-server-schema` - `just test -p codex-protocol` - `just test -p codex-app-server-protocol` - `just test -p codex-core multi_agent_v2_spawn_returns_path_and_send_message_accepts_relative_path` - `just test -p codex-core multi_agent_v2_followup_task_completion_notifies_parent_on_every_turn` - `just fix -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core`
jif ·
2026-06-17 14:48:56 +02:00 -
feat(core): add metadata field to ResponseItem (#28355)
## Description This PR adds an optional `metadata` field to `ResponseItem` for Responses API calls. Only mechanical plumbing, no actual values populated and sent yet. Turns out just adding a new field to `ResponseItem` has quite a large blast radius already. This change is backwards compatible because `metadata` is optional and omitted when absent, so existing response items and rollout history without it still deserialize and requests that do not set it keep the same wire shape. For provider compatibility, we strip out `metadata` before non-OpenAI Responses requests so Azure and AWS Bedrock never see this field. My followup PR here will actually make use of it to start storing and passing along `turn_id`: https://github.com/openai/codex/pull/28360 ## What changed - Added `ResponseItemMetadata` with optional `turn_id`, plus optional `metadata` on Responses API item variants and inter-agent communication. - Preserved item metadata through response-item rewrites such as truncation, missing tool-output synthesis, compaction history rebuilding, visible-history conversion, rollout/resume, and generated app-server schemas/types. - Strip item metadata from non-OpenAI Responses requests while preserving it for OpenAI-shaped requests. - Updated the mechanical fixture/test construction churn required by the new optional field.
Owen Lin ·
2026-06-15 15:05:28 -07:00 -
Support plaintext agent messages (#27830)
## Why Multi-agent v2 `send_message` deliveries already reach the receiving model as typed `agent_message` items with encrypted content. Child-completion notifications are generated by Codex itself, so their content is plaintext and previously fell back to a serialized JSON envelope inside an assistant message. With plaintext `input_text` supported for `agent_message`, both delivery paths can use the same model-visible type while preserving explicit author and recipient metadata. ## What changed - add plaintext `input_text` support to `AgentMessageInputContent` and regenerate the affected app-server schemas - preserve `InterAgentCommunication` as structured mailbox input instead of converting it to assistant text - record delivered communications as typed `agent_message` history items - persist a dedicated rollout item so local delivery metadata such as `trigger_turn` remains available without leaking into the Responses request - reconstruct typed agent messages on resume and preserve fork-turn truncation behavior - remove request-time assistant-content parsing - preserve plaintext and encrypted inter-agent deliveries in stage-one memory inputs - normalize and link plaintext and encrypted agent messages in rollout traces without treating inbound messages as child results - cover the real MultiAgent V2 child-completion path end to end with deterministic mailbox synchronization ## Verification - `just test -p codex-core plaintext_multi_agent_v2_completion_sends_agent_message` - `just test -p codex-core input_queue_drains_mailbox_in_delivery_order record_initial_history_reconstructs_typed_inter_agent_message fork_turn_positions_use_inter_agent_delivery_metadata` - `just test -p codex-memories-write serializes_inter_agent_communications_for_memory` - `just test -p codex-rollout-trace agent_messages_preserve_routing_and_content sub_agent_started_activity_creates_spawn_edge` - `just test -p codex-rollout-trace agent_result_edge_falls_back_to_child_thread_without_result_message` - `just test -p codex-protocol -p codex-rollout -p codex-app-server-protocol`
jif ·
2026-06-12 13:50:04 -07:00 -
Make runtime workspace roots absolute in app-server API (#26552)
Stacked on #26532. ## Why #26532 moves cwd normalization to the app-server/core boundary. `runtimeWorkspaceRoots` still accepted raw paths in v2 requests and in `ConfigOverrides`, which left core responsible for interpreting those roots later. This makes runtime workspace roots follow the same absolute-path boundary as cwd. ## What - Change v2 `runtimeWorkspaceRoots` request fields for `thread/start`, `thread/resume`, `thread/fork`, and `turn/start` to `AbsolutePathBuf`. - Deduplicate already-absolute runtime roots in app-server handlers and pass them through `ConfigOverrides.workspace_roots` as `AbsolutePathBuf`. - Update TUI and exec client request builders to pass absolute runtime roots directly. - Update app-server docs, schema fixtures, and focused tests for absolute runtime roots. ## Testing - `just test -p codex-app-server-protocol` - `just test -p codex-app-server runtime_workspace_roots` - `just test -p codex-core session_permission_profile_rebinds_runtime_workspace_roots` - `just test -p codex-tui app_server_session` - `just test -p codex-exec`
pakrym-oai ·
2026-06-05 11:36:53 -07:00 -
Encrypt multi-agent v2 message payloads (#26210)
## Why Multi-agent v2 currently routes agent instructions through normal tool arguments and inter-agent context. That means the parent model can emit plaintext task text, Codex can persist it in history/rollouts, and the recipient can receive it as ordinary assistant-message JSON. This changes the v2 path so agent instructions stay encrypted between model calls: Responses encrypts the `message` argument returned by the model, Codex forwards only that ciphertext, and Responses decrypts it internally for the recipient model. ## What changed - Mark the v2 `message` parameter as encrypted for `spawn_agent`, `send_message`, and `followup_task`. - Treat multi-agent v2 tool `message` values as ciphertext unconditionally. - Store v2 inter-agent task text in `InterAgentCommunication.encrypted_content` with empty plaintext `content`. - Convert encrypted inter-agent communications into the Responses `agent_message` input item before sending the child request. - Preserve `agent_message` items across history, rollout, compaction, telemetry, and app-server schema paths. - Leave multi-agent v1 unchanged. ## Message shape The model still calls the v2 tools with a `message` argument, but that value is now ciphertext: ```json { "name": "spawn_agent", "arguments": { "task_name": "worker", "message": "<ciphertext>" } } ``` Codex stores the task as encrypted inter-agent communication: ```json { "author": "/root", "recipient": "/root/worker", "content": "", "encrypted_content": "<ciphertext>", "trigger_turn": true } ``` When Codex builds the recipient request, it forwards the ciphertext using the new Responses input item: ```json { "type": "agent_message", "author": "/root", "recipient": "/root/worker", "content": [ { "type": "encrypted_content", "encrypted_content": "<ciphertext>" } ] } ``` Responses decrypts that item internally for the recipient model. ## Context impact - Parent context no longer carries plaintext v2 agent task instructions from these tool arguments. - Codex rollout/history stores ciphertext for v2 agent instructions. - Recipient requests receive an `agent_message` item instead of assistant commentary JSON for encrypted task delivery. - Plaintext completion/status notifications are still plaintext because they are Codex-generated status messages, not encrypted model tool arguments. ## Validation - `just test -p codex-tools` - `just test -p codex-protocol` - `just test -p codex-rollout` - `just test -p codex-rollout-trace` - `just test -p codex-otel` - `just write-app-server-schema`jif ·
2026-06-05 10:25:57 +02:00 -
feat(app-server): include turns page on thread resume (#23534)
## Summary The client currently calls `thread/resume` to establish live updates and immediately follows it with `thread/turns/list` to hydrate recent turns. This lets `thread/resume` return that page directly, eliminating a round trip and the ordering/deduplication gap between the two calls. Experimental clients opt in with `initialTurnsPage: { limit, sortDirection, itemsView }`. The response returns `initialTurnsPage` as a `TurnsPage`, including cursors for paging further back in history. Keeping the controls in a nested opt-in object provides the useful `thread/turns/list` knobs without spreading page-specific parameters across `thread/resume`. ## Verification - `just fmt` - `just write-app-server-schema --experimental` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server thread_resume_initial_turns_page_matches_requested_turns_list_page --tests` - `cargo test -p codex-app-server thread_resume_rejoins_running_thread_even_with_override_mismatch --tests` - `just fix -p codex-app-server-protocol -p codex-app-server`Brent Traut ·
2026-05-28 09:18:13 -07:00 -
Restore legacy image detail values (#24644)
## Why Older persisted rollouts can contain `input_image.detail` values of `auto` or `low` from before `ImageDetail` was narrowed to `high`/`original`. Current deserialization rejects those values, which can make resume skip later compacted checkpoints and reconstruct an oversized raw suffix before the next compaction attempt. Confirmed Sentry reports fixed by this compatibility path: - [CODEX-1H3F](https://openai.sentry.io/issues/7500642496/) - [CODEX-1H6N](https://openai.sentry.io/issues/7501025347/) - [CODEX-1JDP](https://openai.sentry.io/issues/7504549065/) - [CODEX-1HW6](https://openai.sentry.io/issues/7503407986/) ## Background [openai/codex#20693](https://github.com/openai/codex/pull/20693) added image-detail plumbing for app-server `UserInput` so input images could explicitly request `detail: original`. The Slack discussion behind that PR was about ScreenSpot / bridge evals where user input images were resized, while tool output images already had MCP/code-mode ways to request image detail. In review, the intended new API surface was narrowed to `high` and `original`: default to `high`, allow `original` when callers need unchanged image handling, and avoid encouraging new `auto` or `low` usage. That policy still makes sense for newly emitted values. The missing compatibility piece is persisted history. Older rollouts can already contain `auto` and `low`, and resume reconstructs typed history by deserializing those rollout records. Rejecting old values at that boundary causes valid compacted checkpoints to be skipped. This PR restores `auto` and `low` as real variants so old records deserialize and round-trip without being rewritten as `high`, while product paths can continue to default to `high` and avoid emitting `auto` for new behavior. ## What changed - Restored `ImageDetail::Auto` and `ImageDetail::Low` as first-class protocol values. - Preserved `auto`/`low` through rollout deserialization, MCP image metadata, code-mode image output, and schema/type generation. - Kept local image byte handling conservative: only `original` switches to original-resolution loading; `auto`/`low`/`high` continue through the resize-to-fit path while retaining their detail value. - Added regression coverage for enum round-tripping and code-mode `low` detail handling. ## Testing - `just write-app-server-schema` - `just test -p codex-protocol` - `just test -p codex-tools` - `just test -p codex-code-mode` - `just test -p codex-app-server-protocol` - `just test -p codex-core suite::rmcp_client::stdio_image_responses_preserve_original_detail_metadata` - `just test -p codex-core suite::code_mode::code_mode_can_use_mcp_image_result_with_image_helper` - Loaded broken rollouts on local fixed builds, and started/completed new turns. I also attempted `just test -p codex-core`; the local broad run did not finish green: 2559 tests run, 2467 passed, 55 flaky, 91 failed, 1 timed out. The failures were broad timeout/deadline failures across unrelated areas; targeted changed-path core tests above passed.
rhan-oai ·
2026-05-26 16:24:33 -07:00 -
add encryptedcontent to functioncalloutput (#23500)
add new `EncryptedContent` variant to `FunctionCallOutputContentItem` ahead of standalone websearch. we need to be able to receive and pass encrypted function call output from the new web search endpoint back to responsesapi, as we cannot expose direct search results.
sayan-oai ·
2026-05-19 23:47:48 -07:00 -
Fix empty rollout path app-server handling (#23400)
## Summary - Coerce `path: ""` to `None` at the v2 protocol params deserialization boundary for `thread/resume` and `thread/fork`. - Restore the pre-ThreadStore running-thread resume behavior: if `threadId` is already running, rejoin it by id and treat a non-empty `path` only as a consistency check; otherwise cold resume keeps `history > path > threadId` precedence. - Add protocol, resume, and fork regression coverage for empty path payloads; refresh app-server schema fixtures for the clarified params docs. ## Tests - `just fmt` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol thread_path_params_deserialize_empty_path_as_none` - `cargo test -p codex-app-server-protocol --test schema_fixtures` - `cargo test -p codex-app-server empty_path` - `RUST_MIN_STACK=8388608 cargo test -p codex-app-server --test all thread_resume_rejects_mismatched_path_for_running_thread_id` - `RUST_MIN_STACK=8388608 cargo test -p codex-app-server --test all thread_resume_uses_path_over_non_running_thread_id`
Tom ·
2026-05-19 21:19:38 +00:00 -
Preserve image detail in app-server inputs (#20693)
## Summary - Add optional image detail to user image inputs across core, app-server v2, thread history/event mapping, and the generated app-server schemas/types. - Preserve requested detail when serializing Responses image inputs: omitted detail stays on the existing `high` default, while explicit `original` keeps local images on the original-resolution path. - Support `high`/`original` consistently for tool image outputs, including MCP `codex/imageDetail`, code-mode image helpers, and `view_image`.
Curtis 'Fjord' Hawthorne ·
2026-05-15 15:04:04 -07:00 -
[codex] Use compaction_trigger item for remote compaction v2 (#22809)
## Why Remote compaction v2 was still using `context_compaction` as both the request trigger and the compacted output shape. The Responses API now has the landed contract for this flow: Codex sends a dedicated `{ "type": "compaction_trigger" }` input item, and the backend returns the standard `compaction` output item with encrypted content. This aligns the v2 path with that wire contract while preserving the existing local compacted-history post-processing behavior. ## What changed - Add `ResponseItem::CompactionTrigger` and regenerate the app-server protocol schema fixtures. - Send `compaction_trigger` from `remote_compaction_v2` instead of a payload-less `context_compaction`. - Collect exactly one backend `compaction` output item, then reuse the existing compacted-history rebuilding path. - Treat the trigger item as a transient request marker rather than model output or persisted rollout/memory content. ## Verification - `cargo test -p codex-protocol compaction_trigger` - `cargo test -p codex-core remote_compact_v2` - `cargo test -p codex-core compact_remote_v2` - `cargo test -p codex-core responses_websocket_sends_response_processed_after_remote_compaction_v2` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol schema_fixtures`jif-oai ·
2026-05-15 11:40:35 +02:00 -
app-server: use permission ids and runtime workspace roots (#22611)
## Why This PR builds on [#22610](https://github.com/openai/codex/pull/22610) and is the app-server side of the migration from mutable per-turn `SandboxPolicy` replacement toward selecting immutable permission profiles by id plus mutable runtime workspace roots. Once permission profiles can carry their own immutable `workspace_roots`, app-server no longer needs to mutate the selected `PermissionProfile` just to represent thread-specific filesystem context. The mutable part now lives on the thread as explicit `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until the sandbox is realized for a turn. ## What Changed - Replaced the v2 permission-selection wrapper surface with plain profile ids for `thread/start`, `thread/resume`, `thread/fork`, and `turn/start`. - Removed the API surface for profile modifications (`PermissionProfileSelectionParams`, `PermissionProfileModificationParams`, `ActivePermissionProfileModification`). - Added experimental `runtimeWorkspaceRoots` fields to the thread lifecycle and turn-start APIs. - Threaded runtime workspace roots through core session/thread snapshots, turn overrides, app-server request handling, and command execution permission resolution. - Kept session permission state symbolic so later runtime root updates and cwd-only implicit-root retargeting rebind `:workspace_roots` correctly. - Updated the embedded clients just enough to send and restore the new thread state. - Refreshed the generated schema/TypeScript artifacts and the app-server README to match the new contract. ## Verification Targeted coverage for this layer lives in: - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs` - `codex-rs/app-server/tests/suite/v2/thread_start.rs` - `codex-rs/app-server/tests/suite/v2/thread_resume.rs` - `codex-rs/app-server/tests/suite/v2/turn_start.rs` - `codex-rs/core/src/session/tests.rs` The key regression checks exercise that: - `runtimeWorkspaceRoots` resolve against the effective cwd on thread start. - Profile-declared workspace roots are excluded from the runtime workspace roots returned by app-server. - A turn-level runtime workspace-root update persists onto the thread and is returned by `thread/resume`. - A named permission profile selected on one turn remains symbolic so a later runtime-root-only turn update changes the actual sandbox writes. - A cwd-only turn update retargets the implicit runtime cwd root while preserving additional runtime roots. - The protocol fixtures and generated client artifacts stay in sync with the string-based permission selection contract. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611). * #22612 * __->__ #22611
Michael Bolin ·
2026-05-14 23:00:05 -07:00 -
2- Use string service tiers in session protocol (#20971)
## Summary - break service tier session/op/app-server protocol fields from the closed enum to string tier ids - send the service tier string directly through model requests, prewarm, compaction, memories, and TUI/app-server turn starts - regenerate app-server protocol JSON/TypeScript schemas, removing the standalone ServiceTier TS enum ## Verification - just fmt - cargo check -p codex-core -p codex-app-server -p codex-tui - just write-app-server-schema --------- Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-05-06 18:00:21 +03:00 -
feat: add remote compaction v2 Responses client path (#20773)
## Why This adds the `remote_compaction_v2` client path so remote compaction can run through the normal Responses stream and install a `context_compaction` item that trigger a compaction. The goal is to migrate some of the compaction logic on the client side We keeps the v2 transport behind a feature flag while letting follow-up requests reuse the compacted context instead of falling back to the legacy compaction item shape. ## What changed - add `ResponseItem::ContextCompaction` and refresh the generated app-server / schema / TypeScript fixtures that expose response items on the wire - add `core/src/compact_remote_v2.rs` to send compaction through the standard streamed Responses client, require exactly one `context_compaction` output item, and install that item into compacted history - route manual compact and auto-compaction through the v2 path when `remote_compaction_v2` is enabled, while keeping the existing remote compaction path as the fallback - preserve the new item type across history retention, follow-up request construction, telemetry, rollout persistence, and rollout-trace normalization - add targeted coverage for the feature flag, `context_compaction` serialization, rollout-trace normalization, and remote-compaction follow-up behavior ## Verification - added protocol tests for `context_compaction` serialization/deserialization in `protocol/src/models.rs` - added rollout-trace coverage for `context_compaction` normalization in `rollout-trace/src/reducer/conversation_tests.rs` - added remote compaction integration coverage for v2 follow-up reuse and mixed compaction output streams in `core/tests/suite/compact_remote.rs` --------- Co-authored-by: Codex <noreply@openai.com>
jif-oai ·
2026-05-04 14:15:01 +02:00 -
fix(app-server): mark thread/turns/list and exclude_turns as experime… (#20499)
…ntal We have some bugs to work out and it is not quite ready to consume as a public API.
Owen Lin ·
2026-04-30 17:39:08 -07:00 -
Michael Bolin ·
2026-04-29 20:54:59 -07:00 -
app-server-protocol: mark permission profiles experimental (#19899)
## Why `PermissionProfile` is now the canonical internal permissions representation, but the app-server wire shape is still intentionally unstable while the migration continues. Stable app-server clients should not see or generate code for these fields until the wire format settles. ## What changed - Marks every app-server v2 field that sends `PermissionProfile` as experimental, including `command/exec`, `thread/start`, `thread/resume`, `thread/fork`, and `turn/start` request/response payloads. - Enables per-field experimental inspection for `command/exec`, so `permissionProfile` is gated without making the entire method experimental. - Fixes the generated TypeScript schema filter to be comment-aware. The previous scanner treated apostrophes inside doc comments as string delimiters, so some experimental fields leaked into stable TypeScript even though stable JSON was filtered correctly. ## Verification - `cargo test -p codex-app-server-protocol` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19899). * #19900 * __->__ #19899
Michael Bolin ·
2026-04-28 06:08:34 +00:00 -
Remove ghost snapshots (#19481)
## Summary - Remove `ghost_snapshot` / `GhostCommit` from the Responses API surface and generated SDK/schema artifacts. - Keep legacy config loading compatible, but make undo a no-op that reports the feature is unavailable. - Clean up core history, compaction, telemetry, rollout, and tests to stop carrying ghost snapshot items. ## Testing - Unit tests passed for `codex-protocol`, `codex-core` targeted undo and compaction flows, `codex-rollout`, and `codex-app-server-protocol`. - Regenerated config and app-server schemas plus Python SDK artifacts and verified they match the checked-in outputs.
pakrym-oai ·
2026-04-27 18:48:57 -07:00 -
permissions: remove cwd special path (#19841)
## Why The experimental `PermissionProfile` API had both `:cwd` and `:project_roots` special filesystem paths, which made the permission root ambiguous. This PR removes the unstable `current_working_directory` special path before the permissions API is stabilized, so callers use `:project_roots` for symbolic project-root access. ## What changed - Removes `FileSystemSpecialPath::CurrentWorkingDirectory` from protocol and app-server protocol models, plus regenerated app-server JSON/TypeScript schemas. - Replaces internal `:cwd` permission entries with `:project_roots` entries. - Keeps the existing cwd-update behavior for legacy-shaped workspace-write profiles, while removing the deleted `CurrentWorkingDirectory` case from that compatibility path. - Keeps `PermissionProfile::workspace_write()` as the reusable symbolic workspace-write helper, with docs noting that `:project_roots` entries resolve at enforcement time. - Updates app-server docs/examples and approval UI labeling to stop advertising `:cwd` as a permission token. ## Compatibility Persisted rollout items may contain the old `{"kind":"current_working_directory"}` tag from earlier experimental `permissionProfile` snapshots. This PR keeps that tag as a deserialize-only alias for `ProjectRoots { subpath: None }`, while continuing to serialize only the new `project_roots` tag. ## Follow-up This PR intentionally does not introduce an explicit project-root set on `SessionConfiguration` or runtime sandbox resolution. Today, the resolver still uses the active cwd as the single implicit project root. A follow-up should model project roots separately from tool cwd so `:project_roots` entries can resolve against the configured project roots, and resolve to no entries when there are no project roots. ## Verification - `cargo test -p codex-protocol permissions:: --lib` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-sandboxing -p codex-exec-server --lib` - `cargo test -p codex-core session_configuration_apply_ --lib` - `cargo test -p codex-app-server command_exec_permission_profile_project_roots_use_command_cwd --test all` - `cargo test -p codex-tui thread_read_session_state_does_not_reuse_primary_permission_profile --lib` - `cargo test -p codex-tui preset_matching_accepts_workspace_write_with_extra_roots --lib` - `cargo test -p codex-config --lib`Michael Bolin ·
2026-04-27 13:41:27 -07:00 -
Delete unused ResponseItem::Message.end_turn (#19605)
This field is unused. Delete it.
Andrey Mishchenko ·
2026-04-26 17:18:09 -07:00 -
permissions: make profiles represent enforcement (#19231)
## Why `PermissionProfile` is becoming the canonical permissions abstraction, but the old shape only carried optional filesystem and network fields. It could describe allowed access, but not who is responsible for enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy when profiles were exported, cached, or round-tripped through app-server APIs. The important model change is that active permissions are now a disjoint union over the enforcement mode. Conceptually: ```rust pub enum PermissionProfile { Managed { file_system: FileSystemSandboxPolicy, network: NetworkSandboxPolicy, }, Disabled, External { network: NetworkSandboxPolicy, }, } ``` This distinction matters because `Disabled` means Codex should apply no outer sandbox at all, while `External` means filesystem isolation is owned by an outside caller. Those are not equivalent to a broad managed sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an inner sandbox may require the outer Codex layer to use no sandbox rather than a permissive one. ## How Existing Modeling Maps Legacy `SandboxPolicy` remains a boundary projection, but it now maps into the higher-fidelity profile model: - `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed` with restricted filesystem entries plus the corresponding network policy. - `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving the “no outer sandbox” intent instead of treating it as a lax managed sandbox. - `ExternalSandbox { network_access }` maps to `PermissionProfile::External { network }`, preserving external filesystem enforcement while still carrying the active network policy. - Split runtime policies that legacy `SandboxPolicy` cannot faithfully express, such as managed unrestricted filesystem plus restricted network, stay `Managed` instead of being collapsed into `ExternalSandbox`. - Per-command/session/turn grants remain partial overlays via `AdditionalPermissionProfile`; full `PermissionProfile` is reserved for complete active runtime permissions. ## What Changed - Change active `PermissionProfile` into a tagged union: `managed`, `disabled`, and `external`. - Keep partial permission grants separate with `AdditionalPermissionProfile` for command/session/turn overlays. - Represent managed filesystem permissions as either `restricted` entries or `unrestricted`; `glob_scan_max_depth` is non-zero when present. - Preserve old rollout compatibility by accepting the pre-tagged `{ network, file_system }` profile shape during deserialization. - Preserve fidelity for important edge cases: `DangerFullAccess` round-trips as `disabled`, `ExternalSandbox` round-trips as `external`, and managed unrestricted filesystem + restricted network stays managed instead of being mistaken for external enforcement. - Preserve configured deny-read entries and bounded glob scan depth when full profiles are projected back into runtime policies, including unrestricted replacements that now become `:root = write` plus deny entries. - Regenerate the experimental app-server v2 JSON/TypeScript schema and update the `command/exec` README example for the tagged `permissionProfile` shape. ## Compatibility Legacy `SandboxPolicy` remains available at config/API boundaries as the compatibility projection. Existing rollout lines with the old `PermissionProfile` shape continue to load. The app-server `permissionProfile` field is experimental, so its v2 wire shape is intentionally updated to match the higher-fidelity model. ## Verification - `just write-app-server-schema` - `cargo check --tests` - `cargo test -p codex-protocol permission_profile` - `cargo test -p codex-protocol preserving_deny_entries_keeps_unrestricted_policy_enforceable` - `cargo test -p codex-app-server-protocol permission_profile_file_system_permissions` - `cargo test -p codex-app-server-protocol serialize_client_response` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox` - `just fix` - `just fix -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core` - `just fix -p codex-app-server`Michael Bolin ·
2026-04-23 23:02:18 -07:00 -
Add excludeTurns parameter to thread/resume and thread/fork (#19014)
For callers who expect to be paginating the results for the UI, they can now call thread/resume or thread/fork with excludeturns:true so it will not fetch any pages of turns, and instead only set up the subscription. That call can be immediately followed by pagination requests to thread/turns/list to fetch pages of turns according to the UI's current interactions.
David de Regt ·
2026-04-23 10:07:59 -07:00 -
Rebrand approvals reviewer config to auto-review (#18504)
### Why Auto-review is the user-facing name for the approvals reviewer, but the config/API value still exposed the old `guardian_subagent` name. That made new configs and generated schemas point users at Guardian terminology even though the intended product surface is Auto-review. This PR updates the external `approvals_reviewer` value while preserving compatibility for existing configs and clients. ### What changed - Makes `auto_review` the canonical serialized value for `approvals_reviewer`. - Keeps `guardian_subagent` accepted as a legacy alias. - Keeps `user` accepted and serialized as `user`. - Updates generated config and app-server schemas so `approvals_reviewer` includes: - `user` - `auto_review` - `guardian_subagent` - Updates app-server README docs for the reviewer value. - Updates analytics and config requirements tests for the canonical auto_review value. ### Compatibility Existing configs and API payloads using: ```toml approvals_reviewer = "guardian_subagent" ``` continue to load and map to the Auto-review reviewer behavior. New serialization emits: ```toml approvals_reviewer = "auto_review" ``` This PR intentionally does not rename the [features].guardian_approval key or broad internal Guardian symbols. Those are split out for a follow-up PR to keep this migration small and avoid touching large TUI/internal surfaces. **Verification** cargo test -p codex-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent cargo test -p codex-app-server-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
Won Park ·
2026-04-22 15:45:35 -07:00 -
app-server: accept permission profile overrides (#18279)
## Why `PermissionProfile` is becoming the canonical permissions shape shared by core and app-server. After app-server responses expose the active profile, clients need to be able to send that same shape back when starting, resuming, forking, or overriding a turn instead of translating through the legacy `sandbox`/`sandboxPolicy` shorthands. This still needs to preserve the existing requirements/platform enforcement model. A profile-shaped request can be downgraded or rejected by constraints, but the server should keep the user's elevated-access intent for project trust decisions. Turn-level profile overrides also need to retain existing read protections, including deny-read entries and bounded glob-scan metadata, so a permission override cannot accidentally drop configured protections such as `**/*.env = deny`. ## What changed - Adds optional `permissionProfile` request fields to `thread/start`, `thread/resume`, `thread/fork`, and `turn/start`. - Rejects ambiguous requests that specify both `permissionProfile` and the legacy `sandbox`/`sandboxPolicy` fields, including running-thread resume requests. - Converts profile-shaped overrides into core runtime filesystem/network permissions while continuing to derive the constrained legacy sandbox projection used by existing execution paths. - Preserves project-trust intent for profile overrides that are equivalent to workspace-write or full-access sandbox requests. - Preserves existing deny-read entries and `globScanMaxDepth` when applying turn-level `permissionProfile` overrides. - Updates app-server docs plus generated JSON/TypeScript schema fixtures and regression coverage. ## Verification - `cargo test -p codex-app-server-protocol schema_fixtures` - `cargo test -p codex-core session_configuration_apply_permission_profile_preserves_existing_deny_read_entries` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18279). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * __->__ #18279
Michael Bolin ·
2026-04-22 13:34:33 -07:00 -
Update image outputs to default to high detail (#18386)
Do not assume the default `detail`.
pakrym-oai ·
2026-04-18 11:01:12 -07:00 -
Add notify to code-mode (#14842)
Allows model to send an out-of-band notification. The notification is injected as another tool call output for the same call_id.
pakrym-oai ·
2026-03-18 09:37:13 -07:00 -
generate an internal json schema for
RolloutLine(#14434)### Why i'm working on something that parses and analyzes codex rollout logs, and i'd like to have a schema for generating a parser/validator. `codex app-server generate-internal-json-schema` writes an `RolloutLine.json` file while doing this, i noticed we have a writer <> reader mismatch issue on `FunctionCallOutputPayload` and reasoning item ID -- added some schemars annotations to fix those ### Test ``` $ just codex app-server generate-internal-json-schema --out ./foo ``` generates an `RolloutLine.json` file, which i validated against jsonl files on disk `just codex app-server --help` doesn't expose the `generate-internal-json-schema` option by default, but you can do `just codex app-server generate-internal-json-schema --help` if you know the command everything else still works --------- Co-authored-by: Codex <noreply@openai.com>
Keyan Zhang ·
2026-03-17 11:19:42 -07:00 -
Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
## Summary - add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime control for who reviews approval requests - route Smart Approvals guardian review through core for command execution, file changes, managed-network approvals, MCP approvals, and delegated/subagent approval flows - expose guardian review in app-server with temporary unstable `item/autoApprovalReview/{started,completed}` notifications carrying `targetItemId`, `review`, and `action` - update the TUI so Smart Approvals can be enabled from `/experimental`, aligned with the matching `/approvals` mode, and surfaced clearly while reviews are pending or resolved ## Runtime model This PR does not introduce a new `approval_policy`. Instead: - `approval_policy` still controls when approval is needed - `approvals_reviewer` controls who reviewable approval requests are routed to: - `user` - `guardian_subagent` `guardian_subagent` is a carefully prompted reviewer subagent that gathers relevant context and applies a risk-based decision framework before approving or denying the request. The `smart_approvals` feature flag is a rollout/UI gate. Core runtime behavior keys off `approvals_reviewer`. When Smart Approvals is enabled from the TUI, it also switches the current `/approvals` settings to the matching Smart Approvals mode so users immediately see guardian review in the active thread: - `approval_policy = on-request` - `approvals_reviewer = guardian_subagent` - `sandbox_mode = workspace-write` Users can still change `/approvals` afterward. Config-load behavior stays intentionally narrow: - plain `smart_approvals = true` in `config.toml` remains just the rollout/UI gate and does not auto-set `approvals_reviewer` - the deprecated `guardian_approval = true` alias migration does backfill `approvals_reviewer = "guardian_subagent"` in the same scope when that reviewer is not already configured there, so old configs preserve their original guardian-enabled behavior ARC remains a separate safety check. For MCP tool approvals, ARC escalations now flow into the configured reviewer instead of always bypassing guardian and forcing manual review. ## Config stability The runtime reviewer override is stable, but the config-backed app-server protocol shape is still settling. - `thread/start`, `thread/resume`, and `turn/start` keep stable `approvalsReviewer` overrides - the config-backed `approvals_reviewer` exposure returned via `config/read` (including profile-level config) is now marked `[UNSTABLE]` / experimental in the app-server protocol until we are more confident in that config surface ## App-server surface This PR intentionally keeps the guardian app-server shape narrow and temporary. It adds generic unstable lifecycle notifications: - `item/autoApprovalReview/started` - `item/autoApprovalReview/completed` with payloads of the form: - `{ threadId, turnId, targetItemId, review, action? }` `review` is currently: - `{ status, riskScore?, riskLevel?, rationale? }` - where `status` is one of `inProgress`, `approved`, `denied`, or `aborted` `action` carries the guardian action summary payload from core when available. This lets clients render temporary standalone pending-review UI, including parallel reviews, even when the underlying tool item has not been emitted yet. These notifications are explicitly documented as `[UNSTABLE]` and expected to change soon. This PR does **not** persist guardian review state onto `thread/read` tool items. The intended follow-up is to attach guardian review state to the reviewed tool item lifecycle instead, which would improve consistency with manual approvals and allow thread history / reconnect flows to replay guardian review state directly. ## TUI behavior - `/experimental` exposes the rollout gate as `Smart Approvals` - enabling it in the TUI enables the feature and switches the current session to the matching Smart Approvals `/approvals` mode - disabling it in the TUI clears the persisted `approvals_reviewer` override when appropriate and returns the session to default manual review when the effective reviewer changes - `/approvals` still exposes the reviewer choice directly - the TUI renders: - pending guardian review state in the live status footer, including parallel review aggregation - resolved approval/denial state in history ## Scope notes This PR includes the supporting core/runtime work needed to make Smart Approvals usable end-to-end: - shell / unified-exec / apply_patch / managed-network / MCP guardian review - delegated/subagent approval routing into guardian review - guardian review risk metadata and action summaries for app-server/TUI - config/profile/TUI handling for `smart_approvals`, `guardian_approval` alias migration, and `approvals_reviewer` - a small internal cleanup of delegated approval forwarding to dedupe fallback paths and simplify guardian-vs-parent approval waiting (no intended behavior change) Out of scope for this PR: - redesigning the existing manual approval protocol shapes - persisting guardian review state onto app-server `ThreadItem`s - delegated MCP elicitation auto-review (the current delegated MCP guardian shim only covers the legacy `RequestUserInput` path) --------- Co-authored-by: Codex <noreply@openai.com>Charley Cunningham ·
2026-03-13 15:27:00 -07:00 -
Jack Mousseau ·
2026-03-12 16:38:04 -07:00 -
feat: search_tool migrate to bring you own tool of Responses API (#14274)
## Why to support a new bring your own search tool in Responses API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search) we migrating our bm25 search tool to use official way to execute search on client and communicate additional tools to the model. ## What - replace the legacy `search_tool_bm25` flow with client-executed `tool_search` - add protocol, SSE, history, and normalization support for `tool_search_call` and `tool_search_output` - return namespaced Codex Apps search results and wire namespaced follow-up tool calls back into MCP dispatch
Anton Panasenko ·
2026-03-11 17:51:51 -07:00 -
chore: add a separate reject-policy flag for skill approvals (#14271)
## Summary - add `skill_approval` to `RejectConfig` and the app-server v2 `AskForApproval::Reject` payload so skill-script prompts can be configured independently from sandbox and rule-based prompts - update Unix shell escalation to reject prompts based on the actual decision source, keeping prefix rules tied to `rules`, unmatched command fallbacks tied to `sandbox_approval`, and skill scripts tied to `skill_approval` - regenerate the affected protocol/config schemas and expand unit/integration coverage for the new flag and skill approval behavior
Celia Chen ·
2026-03-11 12:33:09 -07:00 -
fix(core) default RejectConfig.request_permissions (#14165)
## Summary Adds a default here so existing config deserializes ## Testing - [x] Added a unit test
Dylan Hurd ·
2026-03-10 04:56:23 +00:00 -
feat(approvals) RejectConfig for request_permissions (#14118)
## Summary We need to support allowing request_permissions calls when using `Reject` policy <img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06 40 PM" src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5" /> Note that this is a backwards-incompatible change for Reject policy. I'm not sure if we need to add a default based on our current use/setup ## Testing - [x] Added tests - [x] Tested locally
Dylan Hurd ·
2026-03-09 18:16:54 -07:00 -
feat(app-server-protocol): address naming conflicts in json schema exporter (#13819)
This fixes a schema export bug where two different `WebSearchAction` types were getting merged under the same name in the app-server v2 JSON schema bundle. The problem was that v2 thread items use the app-server API's `WebSearchAction` with camelCase variants like `openPage`, while `ThreadResumeParams.history` and `RawResponseItemCompletedNotification.item` pull in the upstream `ResponseItem` graph, which uses the Responses API snake_case shape like `open_page`. During bundle generation we were flattening nested definitions into the v2 namespace by plain name, so the later definition could silently overwrite the earlier one. That meant clients generating code from the bundled schema could end up with the wrong `WebSearchAction` definition for v2 thread history. In practice this shows up on web search items reconstructed from rollout files with persisted extended history. This change does two things: - Gives the upstream Responses API schema a distinct JSON schema name: `ResponsesApiWebSearchAction` - Makes namespace-level schema definition collisions fail loudly instead of silently overwriting
Owen Lin ·
2026-03-07 01:33:46 +00:00 -
image-gen-core (#13290)
Core tool-calling for image-gen, handles requesting and receiving logic for images using response API
Won Park ·
2026-03-03 23:11:28 -08:00 -
Val Kharitonov ·
2026-03-03 22:46:05 -08:00 -
Add under-development original-resolution view_image support (#13050)
## Summary Add original-resolution support for `view_image` behind the under-development `view_image_original_resolution` feature flag. When the flag is enabled and the target model is `gpt-5.3-codex` or newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends `detail: "original"` to the Responses API instead of using the legacy resize/compress path. ## What changed - Added `view_image_original_resolution` as an under-development feature flag. - Added `ImageDetail` to the protocol models and support for serializing `detail: "original"` on tool-returned images. - Added `PromptImageMode::Original` to `codex-utils-image`. - Preserves original PNG/JPEG/WebP bytes. - Keeps legacy behavior for the resize path. - Updated `view_image` to: - use the shared `local_image_content_items_with_label_number(...)` helper in both code paths - select original-resolution mode only when: - the feature flag is enabled, and - the model slug parses as `gpt-5.3-codex` or newer - Kept local user image attachments on the existing resize path; this change is specific to `view_image`. - Updated history/image accounting so only `detail: "original"` images use the docs-based GPT-5 image cost calculation; legacy images still use the old fixed estimate. - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG at 85% quality unless lossless is required, while still allowing other formats when explicitly requested. - Updated tests and helper code that construct `FunctionCallOutputContentItem::InputImage` to carry the new `detail` field. ## Behavior ### Feature off - `view_image` keeps the existing resize/re-encode behavior. - History estimation keeps the existing fixed-cost heuristic. ### Feature on + `gpt-5.3-codex+` - `view_image` sends original-resolution images with `detail: "original"`. - PNG/JPEG/WebP source bytes are preserved when possible. - History estimation uses the GPT-5 docs-based image-cost calculation for those `detail: "original"` images. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/13050 - ⏳ `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049Curtis 'Fjord' Hawthorne ·
2026-03-03 15:56:54 -08:00 -
app-server service tier plumbing (plus some cleanup) (#13334)
followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None | "fast", so "standard" is not needed.
pash-openai ·
2026-03-03 02:35:09 -08:00 -
Support multimodal custom tool outputs (#12948)
## Summary This changes `custom_tool_call_output` to use the same output payload shape as `function_call_output`, so freeform tools can return either plain text or structured content items. The main goal is to let `js_repl` return image content from nested `view_image` calls in its own `custom_tool_call_output`, instead of relying on a separate injected message. ## What changed - Changed `custom_tool_call_output.output` from `string` to `FunctionCallOutputPayload` - Updated freeform tool plumbing to preserve structured output bodies - Updated `js_repl` to aggregate nested tool content items and attach them to the outer `js_repl` result - Removed the old `js_repl` special case that injected `view_image` results as a separate pending user image message - Updated normalization/history/truncation paths to handle multimodal `custom_tool_call_output` - Regenerated app-server protocol schema artifacts ## Behavior Direct `view_image` calls still return a `function_call_output` with image content. When `view_image` is called inside `js_repl`, the outer `js_repl` `custom_tool_call_output` now carries: - an `input_text` item if the JS produced text output - one or more `input_image` items from nested tool results So the nested image result now stays inside the `js_repl` tool output instead of being injected as a separate message. ## Compatibility This is intended to be backward-compatible for resumed conversations. Older histories that stored `custom_tool_call_output.output` as a plain string still deserialize correctly, and older histories that used the previous injected-image-message flow also continue to resume. Added regression coverage for resuming a pre-change rollout containing: - string-valued `custom_tool_call_output` - legacy injected image message history #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12948
Curtis 'Fjord' Hawthorne ·
2026-02-26 18:17:46 -08:00 -
feat: add Reject approval policy with granular prompt rejection controls (#12087)
## Why We need a way to auto-reject specific approval prompt categories without switching all approvals off. The goal is to let users independently control: - sandbox escalation approvals, - execpolicy `prompt` rule approvals, - MCP elicitation prompts. ## What changed - Added a new primary approval mode in `protocol/src/protocol.rs`: ```rust pub enum AskForApproval { // ... Reject(RejectConfig), // ... } pub struct RejectConfig { pub sandbox_approval: bool, pub rules: bool, pub mcp_elicitations: bool, } ``` - Wired `RejectConfig` semantics through approval paths in `core`: - `core/src/exec_policy.rs` - rejects rule-driven prompts when `rules = true` - rejects sandbox/escalation prompts when `sandbox_approval = true` - preserves rule priority when both rule and sandbox prompt conditions are present - `core/src/tools/sandboxing.rs` - applies `sandbox_approval` to default exec approval decisions and sandbox-failure retry gating - `core/src/safety.rs` - keeps `Reject { all false }` behavior aligned with `OnRequest` for patch safety - rejects out-of-root patch approvals when `sandbox_approval = true` - `core/src/mcp_connection_manager.rs` - auto-declines MCP elicitations when `mcp_elicitations = true` - Ensured approval policy used by MCP elicitation flow stays in sync with constrained session policy updates. - Updated app-server v2 conversions and generated schema/TypeScript artifacts for the new `Reject` shape. ## Verification Added focused unit coverage for the new behavior in: - `core/src/exec_policy.rs` - `core/src/tools/sandboxing.rs` - `core/src/mcp_connection_manager.rs` - `core/src/safety.rs` - `core/src/tools/runtimes/apply_patch.rs` Key cases covered include rule-vs-sandbox prompt precedence, MCP auto-decline behavior, and patch/sandbox retry behavior under `RejectConfig`.Michael Bolin ·
2026-02-19 11:41:49 -08:00 -
fix(tui): conditionally restore status indicator using message phase (#10947)
TLDR: use new message phase field emitted by preamble-supported models to determine whether an AgentMessage is mid-turn commentary. if so, restore the status indicator afterwards to indicate the turn has not completed. ### Problem `commit_tick` hides the status indicator while streaming assistant text. For preamble-capable models, that text can be commentary mid-turn, so hiding was correct during streaming but restore timing mattered: - restoring too aggressively caused jitter/flashing - not restoring caused indicator to stay hidden before subsequent work (tool calls, web search, etc.) ### Fix - Add optional `phase` to `AgentMessageItem` and propagate it from `ResponseItem::Message` - Keep indicator hidden during streamed commit ticks, restore only when: - assistant item completes as `phase=commentary`, and - stream queues are idle + task is still running. - Treat `phase=None` as final-answer behavior (no restore) to keep existing behavior for non-preamble models ### Tests Add/update tests for: - no idle-tick restore without commentary completion - commentary completion restoring status before tool begin - snapshot coverage for preamble/status behavior --------- Co-authored-by: Josh McKinney <joshka@openai.com>
sayan-oai ·
2026-02-07 02:39:52 +00:00 -
chore(app-server): add experimental annotation to relevant fields (#10928)
These fields had always been documented as experimental/unstable with docstrings, but now let's actually use the `experimental` annotation to be more explicit. - thread/start.experimentalRawEvents - thread/resume.history - thread/resume.path - thread/fork.path - turn/start.collaborationMode - account/login/start.chatgptAuthTokens
Owen Lin ·
2026-02-06 20:48:04 +00:00 -
feat(app-server, core): allow text + image content items for dynamic tool outputs (#10567)
Took over the work that @aaronl-openai started here: https://github.com/openai/codex/pull/10397 Now that app-server clients are able to set up custom tools (called `dynamic_tools` in app-server), we should expose a way for clients to pass in not just text, but also image outputs. This is something the Responses API already supports for function call outputs, where you can pass in either a string or an array of content outputs (text, image, file): https://platform.openai.com/docs/api-reference/responses/create#responses_create-input-input_item_list-item-function_tool_call_output-output-array-input_image So let's just plumb it through in Codex (with the caveat that we only support text and image for now). This is implemented end-to-end across app-server v2 protocol types and core tool handling. ## Breaking API change NOTE: This introduces a breaking change with dynamic tools, but I think it's ok since this concept was only recently introduced (https://github.com/openai/codex/pull/9539) and it's better to get the API contract correct. I don't think there are any real consumers of this yet (not even the Codex App). Old shape: `{ "output": "dynamic-ok", "success": true }` New shape: ``` { "contentItems": [ { "type": "inputText", "text": "dynamic-ok" }, { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" } ] "success": true } ```
Owen Lin ·
2026-02-04 16:12:47 -08:00 -
add none personality option (#10688)
- add none personality enum value and empty placeholder behavior\n- add docs/schema updates and e2e coverage
Ahmed Ibrahim ·
2026-02-04 15:40:33 -08:00