mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
e29bbb53688ea44f5a59b15eb4fa34bc75fa3fcc
596 Commits
-
[codex] Add model tool mode selector (#25031)
## Why Some models need to select their code-execution behavior through model catalog metadata. Models without that metadata must continue to follow the existing `CodeMode` and `CodeModeOnly` feature flags, including when a newer server sends an enum value this client does not recognize. ## What changed - add optional `ModelInfo.tool_mode` metadata with `direct`, `code_mode`, and `code_mode_only` - treat omitted and unknown wire values as `None` - resolve `None` from the existing feature flags - carry the resolved `ToolMode` directly on `TurnContext`, outside `Config` - use the resolved value for turn creation, model switches, review turns, tool planning, and code execution ## Coverage - add protocol coverage for omitted, known, and unknown enum values - add focused coverage for flag fallback and explicit metadata overriding feature flags - add core integration coverage that fetches remote model metadata through `/v1/models` and verifies the outbound `/responses` tools for explicit `direct` and `code_mode_only` selectors ## Stack - followed by #25032
Ahmed Ibrahim ·
2026-05-29 09:05:05 -07:00 -
Surface filesystem permission profiles in prompt context (#23924)
## Summary Some permission profiles can encode filesystem reads that should remain unavailable to the agent. Before this change, the model-visible context and automatic approval review prompt summarized the effective permissions as a legacy sandbox mode, which can omit permission-profile filesystem entries from escalation decisions. For example, a profile can grant workspace access while denying a private subtree across every workspace root: ```toml default_permissions = "restricted-workspace" [permissions.restricted-workspace.workspace_roots] "/Users/alice/project" = true "/Users/alice/other-project" = true [permissions.restricted-workspace.filesystem] ":minimal" = "read" [permissions.restricted-workspace.filesystem.":workspace_roots"] "." = "write" "private" = "deny" "private/**" = "deny" ``` The context window now describes the workspace roots and effective filesystem side of the `PermissionProfile` directly, with deny entries marked as non-escalatable: ```xml <environment_context> <cwd>/Users/alice/project</cwd> <shell>zsh</shell> <filesystem><workspace_roots><root>/Users/alice/project</root><root>/Users/alice/other-project</root></workspace_roots><permission_profile type="managed"><file_system type="restricted"><entry access="read"><special>:minimal</special></entry><entry access="write"><path>/Users/alice/project</path></entry><entry access="write"><path>/Users/alice/other-project</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/project/private</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/other-project/private</path></entry><entry access="deny" escalatable="false"><glob>/Users/alice/project/private/**</glob></entry><entry access="deny" escalatable="false"><glob>/Users/alice/other-project/private/**</glob></entry></file_system></permission_profile></filesystem> </environment_context> ``` Managed requirements can impose the same kind of deny-read restriction: ```toml [permissions.filesystem] deny_read = [ "/Users/alice/project/private", "/Users/alice/project/private/**", ] ``` The automatic approval review prompt also receives the parent turn's denied-read context, so review decisions can account for the active permission profile. ## What Changed - Render the effective filesystem profile in `<environment_context>`, including profile type, filesystem entries, workspace roots, and non-escalatable deny entries. - Persist effective `workspace_roots` in `TurnContextItem` so resumed/replayed context does not have to bind `:workspace_roots` through legacy `cwd` fallback. - Add explicit permission instructions that denied reads are policy restrictions, not escalation targets. - Pass the parent turn's denied-read context into automatic approval reviews. - Add targeted coverage for prompt rendering, workspace-root materialization, replay context, and review prompt context. - Keep the prompt-context test expectations platform-aware so the same filesystem rendering assertions pass on Unix and Windows paths. ## Testing - `just test -p codex-core context::environment_context::tests::serialize_environment_context_with_full_filesystem_profile` - `just test -p codex-core context::environment_context::tests::turn_context_item_filesystem_uses_workspace_roots_instead_of_cwd` - `just test -p codex-core context::permissions_instructions::permissions_instructions_tests::builds_permissions_from_profile_with_denied_reads` - `just fix -p codex-core` I also attempted `just test -p codex-core`; the changed prompt-context tests passed, but the full local run did not complete cleanly in this sandboxed macOS environment due unrelated user-shell `CODEX_SANDBOX*` expectations and integration-test timeouts.
Michael Bolin ·
2026-05-28 14:56:53 -07:00 -
[codex] Add user input client ids (#24653)
## Summary Adds an optional `clientId` field to app-server v2 `UserInput` and carries it through the core `UserInput` model so clients can correlate echoed user input items without relying on payload equality. ## Details - Adds `client_id: Option<String>` to core `UserInput` variants. - Exposes the v2 app-server field as `clientId` on the wire and in generated TypeScript. - Preserves the id when converting between app-server v2 and core protocol types. - Regenerates app-server schema fixtures. ## Validation - `just fmt` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-protocol` - `git diff --check`
Alexi Christakis ·
2026-05-28 14:54:39 -07:00 -
Expose MCP server info as part of server status (#24698)
# Summary Expose MCP server info via App Server (when available) so apps can render a richer MCP experience
Gabriel Peal ·
2026-05-28 09:38:34 -07:00 -
Restore legacy image detail values (#24644)
## Why Older persisted rollouts can contain `input_image.detail` values of `auto` or `low` from before `ImageDetail` was narrowed to `high`/`original`. Current deserialization rejects those values, which can make resume skip later compacted checkpoints and reconstruct an oversized raw suffix before the next compaction attempt. Confirmed Sentry reports fixed by this compatibility path: - [CODEX-1H3F](https://openai.sentry.io/issues/7500642496/) - [CODEX-1H6N](https://openai.sentry.io/issues/7501025347/) - [CODEX-1JDP](https://openai.sentry.io/issues/7504549065/) - [CODEX-1HW6](https://openai.sentry.io/issues/7503407986/) ## Background [openai/codex#20693](https://github.com/openai/codex/pull/20693) added image-detail plumbing for app-server `UserInput` so input images could explicitly request `detail: original`. The Slack discussion behind that PR was about ScreenSpot / bridge evals where user input images were resized, while tool output images already had MCP/code-mode ways to request image detail. In review, the intended new API surface was narrowed to `high` and `original`: default to `high`, allow `original` when callers need unchanged image handling, and avoid encouraging new `auto` or `low` usage. That policy still makes sense for newly emitted values. The missing compatibility piece is persisted history. Older rollouts can already contain `auto` and `low`, and resume reconstructs typed history by deserializing those rollout records. Rejecting old values at that boundary causes valid compacted checkpoints to be skipped. This PR restores `auto` and `low` as real variants so old records deserialize and round-trip without being rewritten as `high`, while product paths can continue to default to `high` and avoid emitting `auto` for new behavior. ## What changed - Restored `ImageDetail::Auto` and `ImageDetail::Low` as first-class protocol values. - Preserved `auto`/`low` through rollout deserialization, MCP image metadata, code-mode image output, and schema/type generation. - Kept local image byte handling conservative: only `original` switches to original-resolution loading; `auto`/`low`/`high` continue through the resize-to-fit path while retaining their detail value. - Added regression coverage for enum round-tripping and code-mode `low` detail handling. ## Testing - `just write-app-server-schema` - `just test -p codex-protocol` - `just test -p codex-tools` - `just test -p codex-code-mode` - `just test -p codex-app-server-protocol` - `just test -p codex-core suite::rmcp_client::stdio_image_responses_preserve_original_detail_metadata` - `just test -p codex-core suite::code_mode::code_mode_can_use_mcp_image_result_with_image_helper` - Loaded broken rollouts on local fixed builds, and started/completed new turns. I also attempted `just test -p codex-core`; the local broad run did not finish green: 2559 tests run, 2467 passed, 55 flaky, 91 failed, 1 timed out. The failures were broad timeout/deadline failures across unrelated areas; targeted changed-path core tests above passed.
rhan-oai ·
2026-05-26 16:24:33 -07:00 -
[codex] remove plain image wrapper spans (#24652)
## Why Remote image submissions currently wrap native `input_image` spans in literal `<image>` and `</image>` text spans. Those extra prompt tokens add structure without providing label or routing information. ## What Changed - Serialize `UserInput::Image` directly as an `input_image` content span. - Preserve named local-image framing and legacy wrapper parsing for labeled attachments and existing histories. - Update existing request-shape expectations for drag-and-drop images, model switching, and compaction. ## Validation - `just test -p codex-protocol` - Focused `codex-core` run covering `drag_drop_image_persists_rollout_request_shape`, `model_change_from_image_to_text_strips_prior_image_content`, and `snapshot_request_shape_pre_turn_compaction_including_incoming_user_message` ## Notes - A broader `just test -p codex-core` run was attempted; the affected tests passed, while the overall run failed in unrelated CLI, MCP, and tooling tests plus a `thread_manager` timeout.
pakrym-oai ·
2026-05-26 15:49:37 -07:00 -
Add experimental turn additional context (#24154)
## Summary Adds experimental `additionalContext` support to `turn/start` and `turn/steer` so clients can provide ephemeral external context, such as browser or automation state, without turning that plumbing into a visible user prompt or triggering user-prompt lifecycle behavior. ## API Shape The parameter shape is: ```ts additionalContext?: Record<string, { value: string kind: "untrusted" | "application" }> | null ``` Example: ```json { "additionalContext": { "browser_info": { "value": "Active tab is CI failures.", "kind": "untrusted" }, "automation_info": { "value": "CI rerun is in progress.", "kind": "application" } } } ``` The keys are opaque and caller-defined. ## Context Injection When provided, accepted entries are inserted into model context as hidden contextual message items, not as visible thread user-message items. `kind: "untrusted"` entries are inserted with role `user`: ```text <external_${key}>${value}</external_${key}> ``` `kind: "application"` entries are inserted with role `developer`: ```text <${key}>${value}</${key}> ``` Values are not escaped. Each value is truncated to 1k approximate tokens before wrapping. For `turn/start`, accepted additional context is inserted before normal user input. For `turn/steer`, additional context is merged only when the steer includes non-empty user input; context-only steers still reject as empty input. ## Dedupe Strategy `AdditionalContextStore` lives on session state and stores the latest complete additional-context map. Each `turn/start` or non-empty `turn/steer` treats its `additionalContext` as the current complete set of values. Entries are injected only when the key is new or the exact entry for that key changed, including `value` or `kind`. After merging, the store is replaced with the provided map, so omitted keys are removed from the retained set and can be injected again later if reintroduced. Omitting `additionalContext`, passing `null`, or passing an empty object resets the store to empty and injects nothing. ## What Changed - Threads experimental v2 `additionalContext` through app-server into core turn start and steer handling. - Adds separate contextual fragment types for untrusted user-role context and application developer-role context. - Uses pending response input items so additional context can be combined with normal user input without treating it as prompt text. - Adds integration coverage for start/steer flow, role routing, dedupe/reset behavior, deletion/re-add behavior, hook-blocked input behavior, empty context-only steer rejection, external-fragment marker matching, and truncation.pakrym-oai ·
2026-05-26 13:02:34 -07:00 -
standalone websearch extension (#23823)
## Summary Add the extension-backed standalone `web.run` tool so Codex can call the standalone search endpoint through the `codex-api` search client and return its encrypted output to Responses. - gate the new tool behind `standalone_web_search` - install the extension in the app-server thread registry and hide hosted `web_search` when standalone search is enabled for OpenAI providers so the two paths stay mutually exclusive - build search context from persisted history using a small tail heuristic: previous user message, assistant text between the last two user turns capped at about 1k tokens, and current user message ## Test Plan - `cargo test -p codex-web-search-extension` - `cargo test -p codex-api` - `cargo test -p codex-core hosted_tools_follow_provider_auth_model_and_config_gates`
sayan-oai ·
2026-05-26 11:12:24 -07:00 -
Display workspace usage limit error copy from response header (#24114)
## Why `openai/openai#947613` adds `X-Codex-Rate-Limit-Reached-Type` for Codex workspace credit-depletion and spend-cap responses. The CLI currently reads the adjacent promo header but otherwise renders generic usage-limit copy, so those responses do not explain the workspace-specific action the user needs to take. Backend dependency: https://github.com/openai/openai/pull/947613 ## What Changed - Parse `X-Codex-Rate-Limit-Reached-Type` in the usage-limit error handling path alongside `x-codex-promo-message`. - Keep the header value parsing with the shared `RateLimitReachedType` enum. - Carry the parsed type on `UsageLimitReachedError` and render client-owned copy for the four workspace owner/member credit and spend-cap values. - Preserve existing promo and plan-based text for absent, generic, or unknown header values. - Keep the existing TUI workspace-owner nudge state path unchanged; the response header only selects the displayed error string. - Add focused display coverage for all specific type values and the generic fallback case. ## Test Plan - Added `usage_limit_reached_error_formats_rate_limit_reached_types` coverage. - Not run manually, per request; CI runs validation on the pushed commit.
dhruvgupta-oai ·
2026-05-22 23:58:49 +00:00 -
Add trace_id to TurnStartedEvent (#23980)
## Why [Recent PR](https://github.com/openai/codex/pull/22709) removed `trace_id` from `TurnContextItem`. ## What changed - Add to `TurnStartedEvent` so rollout consumers can correlate turns with telemetry traces. - Note that the branch name is out of date because I originally re-added to `TurnContextItem`, but we decided to move it to `TurnStartedEvent`. ## Verification - `cargo test -p codex-protocol` - `cargo test -p codex-core --lib regular_turn_emits_turn_started_without_waiting_for_startup_prewarm` - `cargo test -p codex-core --test all emits_warning_when_resumed_model_differs` - `cargo test -p codex-rollout` - `cargo test -p codex-state`
mchen-oai ·
2026-05-22 13:10:56 -07:00 -
cli: rename profile v2 flag to --profile (#23883)
## Why Profile v2 is taking over the user-facing profile selection path, so the CLI no longer needs to expose the transitional `--profile-v2` spelling. This switches the public args surface to `--profile` before the remaining legacy profile plumbing is removed separately. ## What - Rebind `--profile` and `-p` to the v2 profile name argument that selects `$CODEX_HOME/<name>.config.toml`. - Stop parsing the legacy shared CLI profile argument while keeping its implementation path in place for follow-up cleanup. - Update CLI validation, profile-name parse errors, and the legacy-profile collision message/tests to refer to `--profile`. ## Testing - `cargo test -p codex-cli -p codex-config -p codex-protocol -p codex-utils-cli`
jif-oai ·
2026-05-21 16:45:27 +02:00 -
[codex] Add plugin id to MCP tool call items (#23737)
Add owning plugin id to MCP tool call items so we can better filter them at plugin level. ## Summary - add optional `plugin_id` to MCP tool-call items and legacy begin/end events - propagate plugin metadata into emitted core items and app-server v2 `ThreadItem::McpToolCall` - preserve plugin ids through app-server replay/redaction paths and regenerate v2 schema fixtures ## Testing - `just write-app-server-schema` - `just fmt` - `just fix -p codex-core` - `cargo test -p codex-protocol -p codex-app-server-protocol` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core mcp_tool_call_item_includes_plugin_id --lib` - `cargo check -p codex-tui --tests` - `cargo check -p codex-app-server --tests` - `git diff --check` ## Notes - `just fix -p codex-core` completed with two non-fatal `too_many_arguments` warnings on the touched MCP notification helpers. - A broader `cargo test -p codex-core` run passed core unit tests, then hit shell/sandbox/snapshot failures in the integration target. - A broader app-server downstream run hit the existing `in_process::tests::in_process_start_clamps_zero_channel_capacity` stack overflow; `cargo test -p codex-exec` also hit the existing sandbox expectation mismatch in `thread_lifecycle_params_include_legacy_sandbox_when_no_active_profile`.
Matthew Zeng ·
2026-05-20 17:02:10 -07:00 -
Honor client-resolved service tier defaults (#23537)
## Why Model catalog responses can now advertise a nullable `default_service_tier` for each model. Codex needs to preserve three distinct states all the way from config/app-server inputs to inference: - no explicit service tier, so the client may apply the current model catalog default when FastMode is enabled - explicit `default`, meaning the user intentionally wants standard routing - explicit catalog tier ids such as `priority`, `flex`, or future tiers Keeping those states distinct prevents the UI from showing one tier while core sends another, especially after model switches or app-server `thread/start` / `turn/start` updates. ## What Changed - Plumbed `default_service_tier` through model catalog protocol types, app-server model responses, generated schemas, model cache fixtures, and provider/model-manager conversions. - Added the request-only `default` service tier sentinel and normalized legacy config spelling so `fast` in `config.toml` still materializes as the runtime/request id `priority`. - Moved catalog default resolution to the TUI/client side, including recomputing the effective service tier when model/FastMode-dependent surfaces change. - Updated app-server thread lifecycle config construction so `serviceTier: null` preserves explicit standard-routing intent by mapping to `default` instead of internal `None`. - Kept core responsible for validating explicit tiers against the current model and stripping `default` before `/v1/responses`, without applying catalog defaults itself. ## Validation - `CARGO_INCREMENTAL=0 cargo build -p codex-cli` - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list` - `cargo test -p codex-tui service_tier` - `cargo test -p codex-protocol service_tier_for_request` - `cargo test -p codex-core get_service_tier` - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core service_tier`
Shijie Rao ·
2026-05-20 15:57:50 -07:00 -
Add SubagentStop hook (#22873)
# What <img width="1792" height="1024" alt="image" src="https://github.com/user-attachments/assets/8f81d232-5813-4994-a61d-e42a05a93a3e" /> `SubagentStop` runs when a thread-spawned subagent turn is about to finish. Thread-spawned subagents use `SubagentStop` instead of the normal root-agent `Stop` hook. Configured handlers match on `agent_type`. Hook input includes the normal stop fields plus: - `agent_id`: the child thread id. - `agent_type`: the resolved subagent type. - `agent_transcript_path`: the child subagent transcript path. - `transcript_path`: the parent thread transcript path. - `last_assistant_message`: the final assistant message from the child turn, when available. - `stop_hook_active`: `true` when the child is already continuing because an earlier stop-like hook blocked completion. `SubagentStop` shares the same completion-control semantics as `Stop`, scoped to the child turn: - No decision allows the child turn to finish. - `decision: "block"` with a non-empty `reason` records that reason as hook feedback and continues the child with that prompt. - `continue: false` stops the child turn. If `stopReason` is present, Codex surfaces it as the stop reason. # Lifecycle Scope Only thread-spawned subagents run `SubagentStop`. Internal/system subagents such as Review, Compact, MemoryConsolidation, and Other do not run normal `Stop` hooks and do not run `SubagentStop`. This avoids exposing synthetic matcher labels for internal implementation paths. # Stack 1. #22782: add `SubagentStart`. 2. This PR: add `SubagentStop`. 3. #22882: add subagent identity to normal hook inputs.
Abhinav ·
2026-05-20 14:59:41 -07:00 -
feat(permissions): resolve permission profile inheritance (#22270)
## Stack This is the foundation PR for the permission-profile inheritance stack. - This PR adds config-level `extends` resolution and merge semantics. - Follow-up: #23705 applies resolved profiles at runtime and updates the active-profile protocol surfaces. ## Why Permission profiles are starting to carry enough policy that copy-pasting near-identical definitions becomes hard to review and easy to drift. Before the runtime can consume inherited profiles, the config layer needs one explicit resolver that can merge parent chains and reject unsafe or invalid inheritance shapes. ## What changed - Add `extends` to permission-profile TOML and resolve parent chains in inheritance order. - Merge inherited profile TOML with the existing config merge behavior while preserving the permission-specific normalization needed for network domain keys. - Keep parent descriptions out of resolved child profiles and record inherited profile names separately for downstream consumers. - Reject undefined parents, unsupported built-in parents, and inheritance cycles with targeted errors. - Cover resolver behavior with TOML fixture tests and refresh the generated config schema. ## Validation - `cargo test -p codex-config` - `cargo test -p codex-core permissions_profiles_`
viyatb-oai ·
2026-05-20 20:12:07 +00:00 -
Add timeout for remote compaction requests (#23451)
## Why Remote compaction currently sends a unary `POST /responses/compact` and waits for the full response before replacing history or emitting the completed `ContextCompaction` item. Unlike normal `/responses` streaming requests, this unary compact request had no timeout boundary. If the backend accepts the request and then stalls before returning a body, the existing request retry policy never sees a transport error, so the compact turn can remain stuck after the started item with no completion or actionable error. That matches the reported hang shape in issues such as #18363, where logs show `responses/compact` was posted but no corresponding compact completion followed. A bounded request timeout gives the existing retry policy a concrete timeout error to retry instead of letting the user sit indefinitely on automatic context compaction. ## What - Add a request timeout to legacy `/responses/compact` calls. - Size that timeout from the provider stream idle timeout with a conservative multiplier, so the default compact attempt gets 20 minutes rather than the 5 minute stream idle window. - Map API transport timeouts to a request timeout error instead of the child-process timeout message. ## Testing - Not run (per request; CI will cover).
jif-oai ·
2026-05-20 11:56:00 +02:00 -
add encryptedcontent to functioncalloutput (#23500)
add new `EncryptedContent` variant to `FunctionCallOutputContentItem` ahead of standalone websearch. we need to be able to receive and pass encrypted function call output from the new web search endpoint back to responsesapi, as we cannot expose direct search results.
sayan-oai ·
2026-05-19 23:47:48 -07:00 -
Add SubagentStart hook (#22782)
# What `SubagentStart` runs once when Codex creates a thread-spawned subagent, before that child sends its first model request. Thread-spawned subagents use `SubagentStart` instead of the normal root-agent `SessionStart` hook. Configured handlers match on the subagent `agent_type`, using the same value passed to `spawn_agent`. When no agent type is specified, Codex uses the default agent type. Hook input includes the normal session-start fields plus: - `agent_id`: the child thread id. - `agent_type`: the resolved subagent type. `SubagentStart` may return `hookSpecificOutput.additionalContext`. That context is added to the child conversation before the first model request. # Lifecycle Scope Only thread-spawned subagents run `SubagentStart`. Internal/system subagents such as Review, Compact, MemoryConsolidation, and Other do not run normal `SessionStart` hooks and do not run `SubagentStart`. This avoids exposing synthetic matcher labels for internal implementation paths. Also the `SessionStart` hook no longer fires for subagents, this matches behavior with other coding agents' implementation # Stack 1. This PR: add `SubagentStart`. 2. #22873: add `SubagentStop`. 3. #22882: add subagent identity to normal hook inputs.
Abhinav ·
2026-05-19 12:45:08 -07:00 -
Make
denycanonical for filesystem permission entries (#23493)## Why Filesystem permission profiles used `none` for deny-read entries, which is less direct than the action the entry actually represents. This change makes `deny` the canonical filesystem permission spelling while preserving compatibility for older configs that still send `none`. ## What changed - rename `FileSystemAccessMode::None` to `Deny` - serialize and generate schemas with `deny` as the canonical value - retain `none` only as a legacy input alias for temporary config compatibility - update filesystem glob diagnostics and regression coverage to use the canonical spelling - refresh config and app-server schema fixtures to match the new wire shape ## Validation - `cargo test -p codex-protocol` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core config_toml_deserializes_permission_profiles --lib` - `cargo test -p codex-core read_write_glob_patterns_still_reject_non_subpath_globs --lib` Earlier in the session, a broad `cargo test -p codex-core` run reached unrelated pre-existing failures in timing/snapshot/git-info tests under this environment; the targeted surfaces touched by this PR passed cleanly.
viyatb-oai ·
2026-05-19 11:03:47 -07:00 -
Add
body_after_prefixauto-compact token limit scope (#22870)## Why `model_auto_compact_token_limit` has only been able to budget the full active context. That makes it hard to set a small "growth since compaction" budget for sessions that preserve a large carried window prefix: the preserved prefix can consume the whole budget and force immediate repeated compaction. This PR adds an opt-in `body_after_prefix` scope so callers can apply `model_auto_compact_token_limit` to sampled output and later growth after the current carried prefix, while still forcing compaction before the full model context window is exhausted. ## What changed - Adds `AutoCompactTokenLimitScope` with the existing `total` behavior as the default and a new `body_after_prefix` mode: [`config_types.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/protocol/src/config_types.rs#L24-L37). - Threads `model_auto_compact_token_limit_scope` through config loading, `Config`, `core-api`, and app-server v2 schema/TypeScript generation. - Records the first observed input-token count for a `body_after_prefix` compaction window and uses it as the baseline when deciding whether the scoped auto-compaction budget is exhausted: [`turn.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/core/src/session/turn.rs#L743-L781). - Keeps a hard context-window cap in `body_after_prefix`, so scoped budgeting cannot let the active context overrun the usable window. ## Verification Added compact-suite coverage for the two key behaviors: `body_after_prefix` does not re-compact just because the carried prefix is larger than the scoped budget, and it still compacts when the total active context reaches the configured context window: [`compact.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/core/tests/suite/compact.rs#L3003-L3128).
jif-oai ·
2026-05-19 10:19:46 +00:00 -
[5 of 7] Replace OverrideTurnContext with ThreadSettings (#22508)
**Stack position:** [5 of 7] ## Summary This PR adds `Op::ThreadSettings`, a queued settings-only update mechanism for changing stored thread settings without starting a new turn. It also removes the legacy `Op::OverrideTurnContext` in the same layer, so reviewers can see the replacement and deletion together. ## Changes - Add `Op::ThreadSettings` for settings-only queued updates. - Emit `ThreadSettingsApplied` with the effective thread settings snapshot after core applies an update. - Route settings-only updates through the same submission queue as user input. - Migrate remaining `OverrideTurnContext` tests and callers to the queued `Op::ThreadSettings` path. - Delete `Op::OverrideTurnContext` from the core protocol and submission loop. This stack addresses #20656 and #22090. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) (this PR) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 21:03:51 -07:00 -
[3 of 7] Remove UserTurn (#23075)
**Stack position:** [3 of 7] ## Summary This PR finishes the input-op consolidation by moving the remaining `Op::UserTurn` callers onto `Op::UserInput` and deleting `Op::UserTurn`. This touches a lot of files, but it is a low-risk mechanical migration. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) (this PR) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 19:56:00 -07:00 -
[2 of 7] Remove UserInputWithTurnContext (#23081)
**Stack position:** [2 of 7] ## Summary This PR removes the overlapping `Op::UserInputWithTurnContext` variant now that `Op::UserInput` can carry thread settings overrides directly. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) (this PR) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 19:41:33 -07:00 -
[1 of 7] Add thread settings to UserInput (#23080)
**Stack position:** [1 of 7] ## Summary The first three PRs in this stack are a cleanup pass before the actual thread settings API work. Today, core has several overlapping "user input" ops: `UserInput`, `UserInputWithTurnContext`, and `UserTurn`. They differ mostly in how much next-turn state they carry, which makes the later queued thread settings update harder to reason about and review. This PR starts that cleanup by adding the shared `ThreadSettingsOverrides` payload and allowing `Op::UserInput` to carry it. Existing variants remain in place here, so this layer is mostly a behavior-preserving API shape change plus mechanical constructor updates. ## End State After PR3 By the end of PR3, `Op::UserInput` is the only "user input" core op. It can carry optional thread settings overrides for callers that need to update stored defaults with a turn, while callers without updates use empty settings. `Op::UserInputWithTurnContext` and `Op::UserTurn` are deleted. ## End State After PR5 By the end of PR5, core will have only two ops for this area: - `Op::UserInput` for user-input-bearing submissions. - `Op::ThreadSettings` for settings-only updates. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) (this PR) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)
Eric Traut ·
2026-05-18 18:48:35 -07:00 -
[codex] Trim unused TurnContextItem fields (#22709)
## Why `TurnContextItem` is the durable baseline used to reconstruct context diffs across resume/fork. Most of the old persisted-only fields on it are no longer read, so keeping them in rollout snapshots adds schema surface and state that can drift without affecting reconstruction. `summary` is the exception: older Codex versions require it to deserialize `turn_context` records, so keep writing a default compatibility value until that schema surface can be removed safely. ## What changed - Removed the unused persisted fields from `TurnContextItem`: trace ids, user/developer instructions, output schema, and truncation policy. - Kept `summary` with a compatibility comment and made `TurnContext::to_turn_context_item` write `ReasoningSummary::Auto` instead of live turn state. - Updated rollout/context reconstruction fixtures for the retained summary field. ## Verification - `cargo test -p codex-protocol --lib turn_context_item` - `cargo test -p codex-rollout resume_candidate_matches_cwd_reads_latest_turn_context` - `cargo test -p codex-state turn_context` - `cargo test -p codex-core --lib new_default_turn_captures_current_span_trace_id` - `cargo test -p codex-core --lib record_initial_history_resumed_turn_context_after_compaction_reestablishes_reference_context_item` - `cargo test -p codex-core --test all emits_warning_when_resumed_model_differs` - `git diff --check`
pakrym-oai ·
2026-05-18 21:54:36 +00:00 -
goal: pause continuation loops on usage limits and blockers (#23094)
Addresses #22833, #22245, #23067 ## Why `/goal` can keep synthesizing turns even when the next turn cannot make meaningful progress. Hard usage exhaustion can replay failing turns, and repeated permission or external-resource blockers can keep burning tokens while waiting for user or system intervention. ## What changed - Add resumable `blocked` and `usageLimited` goal states. As with `paused`, goal continuation stops with these states. - Move to `usageLimited` after usage-limit failures. - Allow the built-in `update_goal` tool to set `blocked` only under explicit repeated-impasse guidance. Updated goal continuation prompt to specify that agent should use `blocked` only when it has made at least three attempts to get past an impasse. Most of the files touched by this PR are because of the small app server protocol update. ## Validation I manually reproduced a number of situations where an agent can run into a true impasse and verified that it properly enters `blocked` state. I then resumed and verified that it once again entered `blocked` state several turns later if the impasse still exists. I also manually reproduced the usage-limit condition by creating a simulated responses API endpoint that returns 429 errors with the appropriate error message. Verified that the goal runtime properly moves the goal into `usageLimited` state and TUI UI updates appropriately. Verified that `/goal resume` resumes (and immediately goes back into `ussageLImited` state if appropriate). ## Follow-up PRs Small changes will be needed to the GUI clients to properly handle the two new states.
Eric Traut ·
2026-05-18 11:28:53 -07:00 -
app-server-protocol: remove PermissionProfile from API (#22924)
## Why The app server API should expose permission profile identity, not the lower-level runtime permission model. `PermissionProfile` is the compiled sandbox/network representation that the server uses internally; exposing it through app-server-protocol forces clients to understand details that should remain implementation-level. The API boundary should prefer `ActivePermissionProfile`: a stable profile id, plus future parent-profile metadata, that clients can pass back when they want to select the same active permissions. This also avoids schema generation collisions between the app-server v2 API type space and the core protocol model. Incidentally, while PR makes a number of changes to `command/exec`, note that we are hoping to deprecate this API in favor of `process/spawn`, so we don't need to be too finicky about these changes. ## What Changed - Removed `PermissionProfile` from the app-server-protocol API surface, including generated schema and TypeScript exports. - Changed `CommandExecParams.permissionProfile` to `ActivePermissionProfile`. - Resolve command exec profile ids through `ConfigManager` for the command cwd, matching turn override selection semantics. - Updated downstream TUI tests/helpers to use core permission types directly instead of app-server-protocol `PermissionProfile` shims.
Michael Bolin ·
2026-05-15 17:10:15 -07:00 -
Preserve image detail in app-server inputs (#20693)
## Summary - Add optional image detail to user image inputs across core, app-server v2, thread history/event mapping, and the generated app-server schemas/types. - Preserve requested detail when serializing Responses image inputs: omitted detail stays on the existing `high` default, while explicit `original` keeps local images on the original-resolution path. - Support `high`/`original` consistently for tool image outputs, including MCP `codex/imageDetail`, code-mode image helpers, and `view_image`.
Curtis 'Fjord' Hawthorne ·
2026-05-15 15:04:04 -07:00 -
[codex] Use compaction_trigger item for remote compaction v2 (#22809)
## Why Remote compaction v2 was still using `context_compaction` as both the request trigger and the compacted output shape. The Responses API now has the landed contract for this flow: Codex sends a dedicated `{ "type": "compaction_trigger" }` input item, and the backend returns the standard `compaction` output item with encrypted content. This aligns the v2 path with that wire contract while preserving the existing local compacted-history post-processing behavior. ## What changed - Add `ResponseItem::CompactionTrigger` and regenerate the app-server protocol schema fixtures. - Send `compaction_trigger` from `remote_compaction_v2` instead of a payload-less `context_compaction`. - Collect exactly one backend `compaction` output item, then reuse the existing compacted-history rebuilding path. - Treat the trigger item as a transient request marker rather than model output or persisted rollout/memory content. ## Verification - `cargo test -p codex-protocol compaction_trigger` - `cargo test -p codex-core remote_compact_v2` - `cargo test -p codex-core compact_remote_v2` - `cargo test -p codex-core responses_websocket_sends_response_processed_after_remote_compaction_v2` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol schema_fixtures`jif-oai ·
2026-05-15 11:40:35 +02:00 -
app-server: use permission ids and runtime workspace roots (#22611)
## Why This PR builds on [#22610](https://github.com/openai/codex/pull/22610) and is the app-server side of the migration from mutable per-turn `SandboxPolicy` replacement toward selecting immutable permission profiles by id plus mutable runtime workspace roots. Once permission profiles can carry their own immutable `workspace_roots`, app-server no longer needs to mutate the selected `PermissionProfile` just to represent thread-specific filesystem context. The mutable part now lives on the thread as explicit `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until the sandbox is realized for a turn. ## What Changed - Replaced the v2 permission-selection wrapper surface with plain profile ids for `thread/start`, `thread/resume`, `thread/fork`, and `turn/start`. - Removed the API surface for profile modifications (`PermissionProfileSelectionParams`, `PermissionProfileModificationParams`, `ActivePermissionProfileModification`). - Added experimental `runtimeWorkspaceRoots` fields to the thread lifecycle and turn-start APIs. - Threaded runtime workspace roots through core session/thread snapshots, turn overrides, app-server request handling, and command execution permission resolution. - Kept session permission state symbolic so later runtime root updates and cwd-only implicit-root retargeting rebind `:workspace_roots` correctly. - Updated the embedded clients just enough to send and restore the new thread state. - Refreshed the generated schema/TypeScript artifacts and the app-server README to match the new contract. ## Verification Targeted coverage for this layer lives in: - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs` - `codex-rs/app-server/tests/suite/v2/thread_start.rs` - `codex-rs/app-server/tests/suite/v2/thread_resume.rs` - `codex-rs/app-server/tests/suite/v2/turn_start.rs` - `codex-rs/core/src/session/tests.rs` The key regression checks exercise that: - `runtimeWorkspaceRoots` resolve against the effective cwd on thread start. - Profile-declared workspace roots are excluded from the runtime workspace roots returned by app-server. - A turn-level runtime workspace-root update persists onto the thread and is returned by `thread/resume`. - A named permission profile selected on one turn remains symbolic so a later runtime-root-only turn update changes the actual sandbox writes. - A cwd-only turn update retargets the implicit runtime cwd root while preserving additional runtime roots. - The protocol fixtures and generated client artifacts stay in sync with the string-based permission selection contract. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611). * #22612 * __->__ #22611
Michael Bolin ·
2026-05-14 23:00:05 -07:00 -
permissions: support workspace roots in profiles (#22610)
## Why This is the configuration/model half of the alternative permissions migration we discussed as a comparison point for [#22401](https://github.com/openai/codex/pull/22401) and [#22402](https://github.com/openai/codex/pull/22402). The old `workspace-write` model mixes three concerns that we want to keep separate: - reusable profile rules that should stay immutable once selected - user/runtime workspace roots from `cwd`, `--add-dir`, and legacy workspace-write config - internal Codex writable roots such as memories, which should not be shown as user workspace roots This PR gives permission profiles first-class `workspace_roots` so users can opt multiple repositories into the same `:workspace_roots` rules without using broad absolute-path write grants. It also starts separating the raw selected profile from the effective runtime profile by making `Permissions` expose explicit accessors instead of public mutable fields. A representative `config.toml` looks like this: ```toml default_permissions = "dev" [permissions.dev.workspace_roots] "~/code/openai" = true "~/code/developers-website" = true [permissions.dev.filesystem.":workspace_roots"] "." = "write" ".codex" = "read" ".git" = "read" ".vscode" = "read" ``` If Codex starts in `~/code/codex` with that profile selected, the effective workspace-root set becomes: - `~/code/codex` from the runtime `cwd` - `~/code/openai` from the profile - `~/code/developers-website` from the profile The `:workspace_roots` rules are materialized across each root, so `.git`, `.codex`, and `.vscode` stay scoped the same way everywhere. Runtime additions such as `--add-dir` can still layer on later stack entries without mutating the selected profile. ## Stack Shape This PR intentionally stops before the profile-identity cleanup in [#22683](https://github.com/openai/codex/pull/22683) so the base review stays focused on config loading, workspace-root materialization, and compatibility with legacy `workspace-write`. The representation in this PR is therefore transitional: `Permissions` carries enough state to distinguish the raw constrained profile from the effective runtime profile, and there are still call sites that must keep the active profile identity and constrained profile value in sync. The follow-up PR replaces that with a single resolved profile state (`ResolvedPermissionProfile` / `PermissionProfileState`) that keeps the profile id, immutable `PermissionProfile`, and profile-declared workspace roots together. That follow-up removes APIs such as `set_constrained_permission_profile_with_active_profile()` where separate arguments could drift out of sync. Downstream PRs then build on this base to switch app-server turn updates to profile ids plus runtime workspace roots and to finish the user-visible summary behavior. Reviewers should judge this PR as the workspace-roots foundation, not as the final in-memory shape of selected permission profiles. ## Review Guide Suggested review order: 1. Start with `codex-rs/core/src/config/mod.rs`. This is the main shape change in the base slice. `Permissions` now stores a private raw `Constrained<PermissionProfile>` plus runtime `workspace_roots`. Callers use `permission_profile()` when they need the raw constrained value and `effective_permission_profile()` when they need a materialized runtime profile. As noted above, [#22683](https://github.com/openai/codex/pull/22683) replaces this transitional shape with a resolved profile state that keeps identity and profile data together. 2. Review `codex-rs/config/src/permissions_toml.rs` and `codex-rs/core/src/config/permissions.rs`. These add `[permissions.<id>.workspace_roots]`, resolve enabled entries relative to the policy cwd, and keep `:workspace_roots` deny-read glob patterns symbolic until the actual roots are known. 3. Review `codex-rs/protocol/src/permissions.rs` and `codex-rs/protocol/src/models.rs`. These add the policy/profile materialization helpers that expand exact `:workspace_roots` entries and scoped deny-read globs over every workspace root. This is also where `ActivePermissionProfileModification` is removed from the core model. 4. Review the legacy bridge in `Config::load_from_base_config_with_overrides` and `Config::set_legacy_sandbox_policy`. This is where legacy `workspace-write` roots become runtime workspace roots, while Codex internal writable roots stay internal and do not appear as user-facing workspace roots. 5. Then skim downstream call sites. The interesting pattern is raw-vs-effective access: state/proxy/bwrap paths keep the raw constrained profile, while execution, summaries, and user-visible status use the effective profile and workspace-root list. ## What Changed - added `[permissions.<id>.workspace_roots]` to the config model and schema - added runtime `workspace_roots` state to `Config`/`Permissions` and `ConfigOverrides` - made `Permissions` profile fields private and replaced direct mutation with accessors/setters - added `PermissionProfile` and `FileSystemSandboxPolicy` helpers for materializing `:workspace_roots` exact paths and deny-read globs across all roots - moved legacy additional writable roots into runtime workspace-root state instead of active profile modifications - removed `ActivePermissionProfileModification` and its app-server protocol/schema export - updated sandbox/status summary paths so internal writable roots are not reported as user workspace roots ## Verification Strategy The targeted tests cover the behavior at the layers where regressions are most likely: - `codex-rs/core/src/config/config_tests.rs` verifies config loading, legacy workspace-root seeding, effective profile materialization, and memory-root handling. - `codex-rs/core/src/config/permissions_tests.rs` verifies profile `workspace_roots` parsing and `:workspace_roots` scoped/glob compilation. - `codex-rs/protocol/src/permissions.rs` unit tests verify exact and glob materialization over multiple workspace roots. - `codex-rs/tui/src/status/tests.rs` and `codex-rs/utils/sandbox-summary/src/sandbox_summary.rs` verify the user-facing summaries show effective workspace roots and hide internal writes. I also ran `cargo check --tests` locally after the latest stack refresh to catch cross-crate API breakage from the private-field/accessor changes. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22610). * #22612 * #22611 * #22683 * __->__ #22610
Michael Bolin ·
2026-05-14 18:25:23 -07:00 -
permissions: canonicalize workspace_roots and danger-full-access names (#22624)
## Why This is a small precursor to the larger permissions-migration work. Both the comparison stack in [#22401](https://github.com/openai/codex/pull/22401) / [#22402](https://github.com/openai/codex/pull/22402) and the alternate stack in [#22610](https://github.com/openai/codex/pull/22610) / [#22611](https://github.com/openai/codex/pull/22611) / [#22612](https://github.com/openai/codex/pull/22612) are easier to review if the terminology is already settled underneath them. Because `:project_roots` and `:danger-no-sandbox` have not shipped as stable user-facing surface area, carrying them forward as aliases would just add more migration logic to the later stacks. This PR removes that ambiguity now so the follow-on work can rely on one spelling for each built-in concept. ## What Changed - renamed the config-facing special filesystem key from `:project_roots` to `:workspace_roots` - dropped unpublished `:project_roots` parsing support in `core/src/config/permissions.rs`, so new config only recognizes `:workspace_roots` - renamed the built-in full-access permission profile id from `:danger-no-sandbox` to `:danger-full-access` - dropped unpublished `:danger-no-sandbox` support entirely, including the old active-profile canonicalization path, and added explicit rejection coverage for the legacy id - introduced shared built-in permission-profile id constants in `codex-rs/protocol/src/models.rs` - updated `core`, `app-server`, and `tui` call sites that special-case built-in profiles to use the shared constants and canonical ids - updated tests and the Linux sandbox README to use `:workspace_roots` / `:danger-full-access` ## Verification I focused verification on the three places this rename can regress: config parsing, active-profile identity surfaced back out of `core`, and user/server call sites that special-case built-in profiles. Targeted checks: - `config::tests::default_permissions_can_select_builtin_profile_without_permissions_table` - `config::tests::default_permissions_read_only_applies_additional_writable_roots_as_modifications` - `config::tests::default_permissions_can_select_builtin_full_access_profile` - `config::tests::legacy_danger_no_sandbox_is_rejected` - `workspace_root` filtered `codex-core` tests - `request_processors::thread_processor::thread_processor_tests::thread_processor_behavior_tests::requested_permissions_trust_project_uses_permission_profile_intent` - `suite::v2::turn_start::turn_start_rejects_invalid_permission_selection_before_starting_turn` - `status::tests::status_snapshot_shows_auto_review_permissions` - `status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access` - `app_server_session::tests::embedded_turn_permissions_use_active_profile_selection`
Michael Bolin ·
2026-05-14 08:45:54 -07:00 -
feat: add layered --profile-v2 config files (#17141)
## Why `--profile-v2 <name>` gives launchers and runtime entry points a named profile config without making each profile duplicate the base user config. The base `$CODEX_HOME/config.toml` still loads first, then `$CODEX_HOME/<name>.config.toml` layers above it and becomes the active writable user config for that session. That keeps shared defaults, plugin/MCP setup, and managed/user constraints in one place while letting a named profile override only the pieces that need to differ. ## What Changed - Added the shared `--profile-v2 <name>` runtime option with validated plain names, now represented by `ProfileV2Name`. - Extended config layer state so the base user config and selected profile config are both `User` layers; APIs expose the active user layer and merged effective user config. - Threaded profile selection through runtime entry points: `codex`, `codex exec`, `codex review`, `codex resume`, `codex fork`, and `codex debug prompt-input`. - Made user-facing config writes go to the selected profile file when active, including TUI/settings persistence, app-server config writes, and MCP/app tool approval persistence. - Made plugin, marketplace, MCP, hooks, and config reload paths read from the merged user config so base and profile layers both participate. - Updated app-server config layer schemas to mark profile-backed user layers. ## Limits `--profile-v2` is still rejected for config-management subcommands such as feature, MCP, and marketplace edits. Those paths remain tied to the base `config.toml` until they have explicit profile-selection semantics. Some adjacent background writes may still update base or global state rather than the selected profile: - marketplace auto-upgrade metadata - automatic MCP dependency installs from skills - remote plugin sync or uninstall config edits - personality migration marker/default writes ## Verification Added targeted coverage for profile name validation, layer ordering/merging, selected-profile writes, app-server config writes, session hot reload, plugin config merging, hooks/config fixture updates, and MCP/app approval persistence. --------- Co-authored-by: Codex <noreply@openai.com>
jif-oai ·
2026-05-14 15:16:15 +02:00 -
[codex] Remove unused legacy shell tools (#22246)
## Why Recent session history showed no active use of the raw `shell`, `local_shell`, or `container.exec` execution surfaces. Keeping those handlers/specs wired into core leaves duplicate shell execution paths alongside the supported `shell_command` and unified exec tools. ## What changed - Removed the raw `shell` handler/spec and its `ShellToolCallParams` protocol helper. - Removed the legacy `local_shell` and `container.exec` handler/spec plumbing while preserving persisted-history compatibility for old response items. - Normalized model/config `default` and `local` shell selections to `shell_command`. - Pruned tests that exercised removed raw-shell/local-shell/apply-patch variants and kept coverage on `shell_command`, unified exec, and freeform `apply_patch`. ## Verification - `git diff --check` - `cargo test -p codex-protocol` - `cargo test -p codex-tools` - `cargo test -p codex-core tools::handlers::shell` - `cargo test -p codex-core tools::spec` - `cargo test -p codex-core tools::router` - `cargo test -p codex-core active_call_preserves_triggering_command_context` - `cargo test -p codex-core guardian_tests` - `cargo test -p codex-core --test all shell_serialization` - `cargo test -p codex-core --test all apply_patch_cli` - `cargo test -p codex-core --test all shell_command_` - `cargo test -p codex-core --test all local_shell` - `cargo test -p codex-core --test all otel::` - `cargo test -p codex-core --test all hooks::` - `just fix -p codex-core` - `just fix -p codex-tools`
pakrym-oai ·
2026-05-13 16:43:25 +00:00 -
feat(tui): remove Zellij TUI workarounds (#22214)
## Why We added Zellij-specific TUI workarounds because older Zellij behavior did not work with Codex's normal terminal model: - #8555 made `tui.alternate_screen = "auto"` disable alternate screen in Zellij so transcript history stayed available. - #16578 avoided scroll-region operations in Zellij by emitting raw newlines and using a separate composer styling path. This PR removes both workarounds because the latest Zellij release tested locally (`zellij 0.44.1`) works correctly with Codex's standard TUI behavior: normal alternate-screen handling, redraw, and history insertion. ## What Changed - Removed the `InsertHistoryMode::Zellij` path and the Zellij-only newline scrollback insertion behavior. - Removed cached `is_zellij` state from the TUI and composer. - Removed Zellij-specific composer styling, the helper snapshot, and the `TerminalInfo::is_zellij()` convenience method that only served this workaround. - Changed `tui.alternate_screen = "auto"` to use alternate screen for Zellij too; `--no-alt-screen` and `tui.alternate_screen = "never"` still preserve the inline mode escape hatch. - Updated the generated config schema description for `tui.alternate_screen`. ## How to Test Manual smoke path used with `zellij 0.44.1`: 1. Build and run this branch inside a Zellij `0.44.1` session with default config. 2. Start Codex normally and produce enough assistant/tool output to create scrollback. 3. Confirm the transcript remains readable, the composer renders normally, and scrolling through terminal history works. 4. Resize the Zellij pane while output exists and confirm the TUI redraws without duplicated, missing, or stale rows. 5. Compare with `--no-alt-screen` or `-c tui.alternate_screen=never` if you want to verify the inline fallback still works. Targeted tests: - `just write-config-schema` - `just fmt` - `just fix -p codex-tui` - `cargo test -p codex-terminal-detection` - `cargo test -p codex-tui alternate_screen_auto_uses_alt_screen` Attempted but did not complete locally: - `cargo test -p codex-tui` built and ran the new test successfully, then failed later on unrelated local failures in `status_permissions_full_disk_managed_*` and a stack overflow in `tests::fork_last_filters_latest_session_by_cwd_unless_show_all`. ## Documentation No developers.openai.com Codex documentation update is needed for this revert.
Felipe Coury ·
2026-05-13 12:11:15 -03:00 -
feat(sandbox): add Windows deny-read parity (#18202)
## Why The split filesystem policy stack already supports exact and glob `access = none` read restrictions on macOS and Linux. Windows still needed subprocess handling for those deny-read policies without claiming enforcement from a backend that cannot provide it. ## Key finding The unelevated restricted-token backend cannot safely enforce deny-read overlays. Its `WRITE_RESTRICTED` token model is authoritative for write checks, not read denials, so this PR intentionally fails that backend closed when deny-read overrides are present instead of claiming unsupported enforcement. ## What changed This PR adds the Windows deny-read enforcement layer and makes the backend split explicit: - Resolves Windows deny-read filesystem policy entries into concrete ACL targets. - Preserves exact missing paths so they can be materialized and denied before an enforceable sandboxed process starts. - Snapshot-expands existing glob matches into ACL targets for Windows subprocess enforcement. - Honors `glob_scan_max_depth` when expanding Windows deny-read globs. - Plans both the configured lexical path and the canonical target for existing paths so reparse-point aliases are covered. - Threads deny-read overrides through the elevated/logon-user Windows sandbox backend and unified exec. - Applies elevated deny-read ACLs synchronously before command launch rather than delegating them to the background read-grant helper. - Reconciles persistent deny-read ACEs per sandbox principal so policy changes do not leave stale deny-read ACLs behind. - Fails closed on the unelevated restricted-token backend when deny-read overrides are present, because its `WRITE_RESTRICTED` token model is not authoritative for read denials. ## Landed prerequisites These prerequisite PRs are already on `main`: 1. #15979 `feat(permissions): add glob deny-read policy support` 2. #18096 `feat(sandbox): add glob deny-read platform enforcement` 3. #17740 `feat(config): support managed deny-read requirements` This PR targets `main` directly and contains only the Windows deny-read enforcement layer. ## Implementation notes - Exact deny-read paths remain enforceable on the elevated path even when they do not exist yet: Windows materializes the missing path before applying the deny ACE, so the sandboxed command cannot create and read it during the same run. - Existing exact deny paths are preserved lexically until the ACL planner, which then adds the canonical target as a second ACL target when needed. That keeps both the configured alias and the resolved object covered. - Windows ACLs do not consume Codex glob syntax directly, so glob deny-read entries are expanded to the concrete matches that exist before process launch. - Glob traversal deduplicates directory visits within each pattern walk to avoid cycles, without collapsing distinct lexical roots that happen to resolve to the same target. - Persistent deny-read ACL state is keyed by sandbox principal SID, so cleanup only removes ACEs owned by the same backend principal. - Deny-read ACEs are fail-closed on the elevated path: setup aborts if mandatory deny-read ACL application fails. - Unelevated restricted-token sessions reject deny-read overrides early instead of running with a silently unenforceable read policy. ## Verification - `cargo test -p codex-core windows_restricted_token_rejects_unreadable_split_carveouts` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-windows-sandbox` - GitHub Actions rerun is in progress on the pushed head. --------- Co-authored-by: Codex <noreply@openai.com>
viyatb-oai ·
2026-05-11 23:04:28 -07:00 -
Reapply "Move skills watcher to app-server" (#21652)
## Why PR #21460 reverted the earlier move of skills change watching from `codex-core` into app-server. This reapplies that boundary change so app-server owns client-facing `skills/changed` notifications and core no longer carries the watcher. ## What - Restore the app-server `SkillsWatcher` and register it from thread listener setup. - Remove the core-owned skills watcher and its core live-reload integration surface. - Restore app-server coverage for `skills/changed` notifications after a watched skill file changes. ## Validation - `cargo test -p codex-app-server --test all suite::v2::skills_list::skills_changed_notification_is_emitted_after_skill_change -- --exact --nocapture` - `cargo test -p codex-core --lib --no-run`
pakrym-oai ·
2026-05-08 17:41:15 -07:00 -
[codex] Delete function-style apply_patch (#21651)
## Why `apply_patch` is now a freeform/custom tool. Keeping the old JSON/function-style registration and parsing path left another way for models and tests to invoke `apply_patch`, which made the tool surface harder to reason about. ## What changed - Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch` spec, and handler support for function payloads. - Kept `apply_patch_tool_type = freeform` as the supported model metadata path, including Bedrock catalog metadata. - Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool calls. ## Verification - `cargo test -p codex-tools -p codex-protocol -p codex-model-provider` - `cargo test -p codex-core tools::handlers::apply_patch --lib` - `cargo test -p codex-core --test all apply_patch_tool_executes_and_emits_patch_events` - `cargo test -p codex-core --test all apply_patch_reports_parse_diagnostics` - `cargo test -p codex-exec test_apply_patch_tool` - `just fix -p codex-core` - `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p codex-exec`
pakrym-oai ·
2026-05-08 13:00:57 -07:00 -
Remove ToolName display helper (#21465)
## Why `ToolName::display()` made it too easy to flatten tool identity and accidentally compare rendered strings. Tool identity should stay structural until a legacy string boundary actually requires the flattened spelling. ## What - Removes `ToolName::display()` and relies on the existing `Display` impl for messages and errors. - Adds structural ordering for `ToolName` and uses it for sorting/deduping deferred tools. - Carries `ToolName` through tool/sandbox plumbing, flattening only at legacy boundaries such as hook payloads, telemetry tags, and Responses tool names. - Updates MCP normalization tests to assert `ToolName` structure instead of rendered strings. ## Testing - `cargo test -p codex-mcp test_normalize_tools` - `cargo test -p codex-core unavailable_tool` - `just fix -p codex-protocol` - `just fix -p codex-mcp` - `just fix -p codex-core`
pakrym-oai ·
2026-05-08 12:17:48 -07:00 -
[codex] Generalize service tier slash commands (#21745)
## Why `/fast` was wired as a one-off slash command even though model metadata now exposes service tiers as catalog data. That meant adding another tier, such as a slower/cheaper tier, would require more hardcoded TUI plumbing instead of letting the model catalog drive the available commands. This change makes service-tier commands data-driven: each advertised `service_tiers` entry becomes a `/name` command using the catalog description, while the request path sends the tier `id` only when the selected model supports it. ## What Changed - Removed the hardcoded `/fast` slash-command variant and introduced dynamic service-tier command items in the composer and command popup. - Added toggle behavior for service-tier commands: invoking `/name` selects that tier, and invoking it again clears the selection. - Preserved the existing Fast-mode keybinding/status affordances by resolving the current model tier whose name is `fast`, while still sending the tier request value such as `priority`. - Persisted service-tier selections as raw request strings so non-fast tiers can round-trip through config. - Updated the Bedrock catalog entry to advertise fast support through `service_tiers` with `id: "priority"` and `name: "fast"`. - Added defensive filtering in core so unsupported selected service tiers are omitted from `/responses` requests. ## Validation - Added/updated coverage for dynamic service-tier slash command lookup, popup descriptions, composer dispatch, TUI fast toggling, and unsupported-tier omission in core request construction. - Local tests were not run per request. --------- Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-05-08 20:09:51 +03:00 -
[codex-analytics] plumb protocol-native review timing (#21434)
## Why We want terminal tool review analytics, but the reducer should not stamp review timing from its own wall clock. This PR plumbs review timing through the real protocol and app-server seams so downstream analytics can consume the emitter's timestamps directly. Guardian reviews keep their enriched `started_at` / `completed_at` analytics fields by deriving those legacy second-based values from the same protocol-native millisecond lifecycle timestamps, rather than sampling a separate analytics clock. ## What changed - add `started_at_ms` to user approval request payloads - add `started_at_ms` / `completed_at_ms` to guardian review notifications - preserve Guardian review `started_at` / `completed_at` enrichment from the protocol-native timing source - stamp typed `ServerResponse` analytics facts with app-server-observed `completed_at_ms` - thread the new timing fields through core, protocol, app-server, TUI, and analytics fixtures ## Verification - `cargo test -p codex-app-server outgoing_message --manifest-path codex-rs/Cargo.toml` - `cargo test -p codex-app-server-protocol guardian --manifest-path codex-rs/Cargo.toml` - `cargo test -p codex-tui guardian --manifest-path codex-rs/Cargo.toml` - `cargo test -p codex-analytics analytics_client_tests --manifest-path codex-rs/Cargo.toml` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21434). * #18748 * __->__ #21434 * #18747 * #17090 * #17089 * #20514
rhan-oai ·
2026-05-07 20:31:41 -07:00 -
pakrym-oai ·
2026-05-07 02:24:20 +00:00 -
Add compact lifecycle hooks (started by vincentkoc - external contrib) (#19905)
Based on work from Vincent K - https://github.com/openai/codex/pull/19060 <img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x" src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634" /> ## Why Compaction rewrites the conversation context that future model turns receive, but hooks currently have no deterministic lifecycle point around that rewrite. This adds compact lifecycle hooks so users can audit manual and automatic compaction, surface hook messages in the UI, and run post-compaction follow-up without overloading tool or prompt hooks. ## What Changed - Added `PreCompact` and `PostCompact` hook events across hook config, discovery, dispatch, generated schemas, app-server notifications, analytics, and TUI hook rendering. - Added trigger matching for compact hooks with the documented `manual` and `auto` matcher values. - Wired `PreCompact` before both local and remote compaction, and `PostCompact` after successful local or remote compaction. - Kept compact hook command input to lifecycle metadata: session id, Codex turn id, transcript path, cwd, hook event name, model, and trigger. - Made compact stdout handling consistent with other hooks: plain stdout is ignored as debug output, while malformed JSON-looking stdout is reported as failed hook output. - Added integration coverage for compact hook dispatch, trigger matching, post-compact execution, and the audited behavior that `decision:"block"` does not block compaction. ## Out of Scope - Hook-specific compaction blocking is not implemented; `decision:"block"` and exit-code-2 blocking semantics are intentionally unsupported for `PreCompact`. - Custom compaction instructions are not exposed to compact hooks in this PR. - Compact summaries, summary character counts, and summary previews are not exposed to compact hooks in this PR. ## Verification - `cargo test -p codex-hooks` - `cargo test -p codex-core manual_pre_compact_block_decision_does_not_block_compaction` - `cargo test -p codex-app-server hooks_list` - `cargo test -p codex-core config_schema_matches_fixture` - `cargo test -p codex-tui hooks_browser` ## Docs The developer documentation for Codex hooks should be updated alongside this feature to document `PreCompact` and `PostCompact`, the `manual`/`auto` matcher values, and the compact hook payload fields. --------- Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Andrei Eternal ·
2026-05-06 18:08:31 -07:00 -
Move skills watcher to app-server (#21287)
## Why Skills update notifications are app-server API behavior, but the watcher lived in `codex-core` and surfaced through `EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core focused on thread execution and lets app-server own both cache invalidation and the `skills/changed` notification. ## What changed - Added an app-server-owned skills watcher that watches local skill roots, clears the shared skills cache, and emits `skills/changed` directly. - Registers skill watches from the common app-server thread listener attach path, including direct starts, resumes, and app-server-observed child or forked threads. - Stores the `WatchRegistration` on `ThreadState`, so listener replacement, thread teardown, idle unload, and app-server shutdown deregister by dropping the RAII guard. - Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the old core live-reload test. - Extended the app-server skills change test to verify a cached skills list is refreshed after a filesystem change without forcing reload. ## Validation - `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p codex-rollout -p codex-rollout-trace` - `cargo test -p codex-app-server skills_changed_notification_is_emitted_after_skill_change`
pakrym-oai ·
2026-05-06 15:38:11 -07:00 -
Route opted-in MCP elicitations through Guardian (#19431)
# Motivation Browser Use origin-access prompts are MCP elicitations, not direct tool-call approval prompts, so they were bypassing the Guardian approval path. We need a generic opt-in that lets eligible MCP elicitations use Guardian when the current turn already routes approvals there. # Description Add a generic elicitation reviewer hook in codex-mcp and wire codex-core to pass a Guardian reviewer callback when creating the MCP connection manager. The reviewer validates explicit mcp_tool_call opt-in metadata, builds a Guardian MCP tool-call review request from server/tool/connector metadata and tool params, and maps Guardian approval, denial, timeout, and cancellation decisions back to MCP elicitation responses. The new option to trigger this in the `_meta` object is: ``` "codex_request_type": "approval_request", ``` # Testing - RUST_MIN_STACK=8388608 NEXTEST_STATUS_LEVEL=leak cargo nextest run --no-fail-fast --cargo-profile ci-test --test-threads 2 - cargo clippy --tests -- -D warnings - cargo fmt -- --config imports_granularity=Item --check - cargo shear - pnpm run format - python3 .github/scripts/verify_cargo_workspace_manifests.py - python3 .github/scripts/verify_tui_core_boundary.py - python3 .github/scripts/verify_bazel_clippy_lints.py - git diff --check
Clark DuVall ·
2026-05-06 19:42:45 +00:00 -
Remove core MCP list tools op (#21281)
## Why The core `Op::ListMcpTools` request path is no longer needed. Keeping it around left a dead request/response surface alongside the app-server MCP inventory APIs that own current server status listing. ## What Changed - Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the core handler that built the MCP snapshot response. - Removed the now-unused `codex-mcp` snapshot wrapper/export and passive event handling arms in rollout and MCP-server consumers. - Updated tests that used the old op as a synchronization hook to wait on existing startup/skills events, and deleted the plugin test that only exercised the removed listing op. ## Validation - `cargo test -p codex-protocol` - `cargo test -p codex-mcp` - `cargo test -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-core --test all pending_input::queued_inter_agent_mail` - `cargo test -p codex-core --test all rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta` - `cargo test -p codex-core --test all rmcp_client::stdio_image_responses` - `just fix -p codex-core -p codex-protocol -p codex-mcp -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
pakrym-oai ·
2026-05-06 11:20:34 -07:00 -
Move message history out of core (#21278)
## Why Message history was implemented inside `codex-core` and surfaced through core protocol ops and `SessionConfiguredEvent` fields even though the current consumer is TUI-local prompt recall. That made core own UI history persistence and exposed `history_log_id` / `history_entry_count` through surfaces that app-server and other clients do not need. This change moves message history persistence out of core and keeps the recall plumbing local to the TUI. ## What changed - Added a new `codex-message-history` crate for appending, looking up, trimming, and reading metadata from `history.jsonl`. - Removed core protocol history ops/events: `AddToHistory`, `GetHistoryEntryRequest`, and `GetHistoryEntryResponse`. - Removed `history_log_id` and `history_entry_count` from `SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly. - Updated the TUI to dispatch local app events for message-history append/lookup and keep its persistent-history metadata in TUI session state. ## Validation - `cargo test -p codex-message-history -p codex-protocol` - `cargo test -p codex-exec event_processor_with_json_output` - `cargo test -p codex-mcp-server outgoing_message` - `cargo test -p codex-tui` - `just fix -p codex-message-history -p codex-protocol -p codex-core -p codex-tui -p codex-exec -p codex-mcp-server`
pakrym-oai ·
2026-05-06 08:35:42 -07:00 -
2- Use string service tiers in session protocol (#20971)
## Summary - break service tier session/op/app-server protocol fields from the closed enum to string tier ids - send the service tier string directly through model requests, prewarm, compaction, memories, and TUI/app-server turn starts - regenerate app-server protocol JSON/TypeScript schemas, removing the standalone ServiceTier TS enum ## Verification - just fmt - cargo check -p codex-core -p codex-app-server -p codex-tui - just write-app-server-schema --------- Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-05-06 18:00:21 +03:00 -
feat: add
session_id(#20437)## Summary Related to https://openai.slack.com/archives/C095U48JNL9/p1777537279707449 TLDR: We update the meaning of session ids and thread ids: * thread_id stays as now * session_id become a shared id between every thread under a /root thread (i.e. every sub-agent share the same session id) This PR introduces an explicit `SessionId` and threads it through the protocol/client boundary so `session_id` and `thread_id` can diverge when they need to, while preserving compatibility for older serialized `session_configured` events. --------- Co-authored-by: Codex <noreply@openai.com>
jif-oai ·
2026-05-06 10:48:37 +02:00 -
[codex-analytics] rework thread_source for thread analytics (#20949)
## Summary - make `thread_source` an explicit optional thread-level field on `thread/start`, `thread/fork`, and returned thread payloads - persist `thread_source` in rollout/session metadata so resumed live threads retain the original value - replace the old best-effort `session_source` -> `thread_source` mapping with an explicit caller-supplied analytics classification ## Why Before this change, analytics `thread_source` was populated by a best-effort mapping from `session_source`. `session_source` describes the runtime/client surface, not the actual thread-level origin, so that projection was not accurate enough to distinguish cases such as `user`, `subagent`, `memory_consolidation`, and future thread origins reliably. Making `thread_source` explicit keeps one thread-level analytics field while letting callers provide the real classification directly instead of recovering it indirectly from `session_source`. ## Impact For new analytics events, `thread_source` now reflects the explicit thread-level classification supplied by the caller rather than an inferred value derived from `session_source`. Existing protocol fields remain optional; callers that omit `threadSource` now produce `null` instead of a best-effort inferred value. ## Validation - `just write-app-server-schema` - `cargo test -p codex-analytics -p codex-core -p codex-app-server-protocol --no-run` - `cargo test -p codex-app-server-protocol generated_ts_optional_nullable_fields_only_in_params` - `cargo test -p codex-analytics thread_initialized_event_serializes_expected_shape` - `cargo test -p codex-core resume_stopped_thread_from_rollout_preserves_thread_source`
rhan-oai ·
2026-05-06 02:12:31 +00:00