mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
dev
2094 Commits
-
Remove TUI legacy core test_support dependencies (#27484)
## Why The TUI now sits on the app-server layer, but `app-server-client::legacy_core` still exposed core test helpers solely for TUI tests. We've been whittling away the remaining dependencies. This is the next step on that journey. There is no functional change — just a refactor, and this affects only test code, so it should be low risk. ## What changed - remove the `legacy_core::test_support` re-export and call model-manager test helpers directly - keep the bundled model-preset cache local to TUI test support - import constraint types directly from `codex-config`
Eric Traut ·
2026-06-10 17:55:49 -07:00 -
[codex] Remove redundant plugin app auth state (#27465)
## Summary - remove the redundant `needsAuth` field from `AppSummary` and generated app-server schemas - stop `plugin/read` from querying Apps MCP solely to hydrate unused connector auth state - preserve `plugin/install.appsNeedingAuth` membership and `app/list.isAccessible` as the authentication signals ## Why Codex App and TUI do not consume `plugin/read.plugin.apps[].needsAuth`. Hydrating it could establish an Apps MCP connection and discover tools on a cold `plugin/read` request, adding avoidable latency. The plugin APIs are still marked under development, so removing this wire field is preferable to retaining a misleading default. ## Verification - `just write-app-server-schema` - `just fmt` - `just test -p codex-app-server-protocol` - `just test -p codex-app-server plugin_install_uses_remote_apps_needing_auth_response` - `just test -p codex-app-server plugin_install_returns_apps_needing_auth` - `just test -p codex-app-server plugin_read_returns_plugin_details_with_bundle_contents` - `just test -p codex-tui plugin_detail_popup_snapshot_shows_install_actions_and_capability_summaries` - `$xin-build` simplify and debug reviews
xl-openai ·
2026-06-10 17:33:56 -07:00 -
[codex] add /import for external agents (#27071)
## Why External-agent import should be discoverable and deliberate without blocking startup or claiming the public `codex [PROMPT]` CLI namespace. The slash command keeps the flow local to the interactive TUI and reuses the existing app-server import API. ## What changed - add the user-facing `/import` slash command - detect external-agent importable items only when the command is invoked - run imports through the embedded local app-server - show start and completion messages, refresh configuration, and block duplicate imports while one is pending - reject the flow for unsupported remote and local-daemon sessions ## Validation - `just test -p codex-tui external_agent_config_migration` (10 passed) - manually exercised an isolated TUI fixture with existing external-agent setup and session data using a fresh `CODEX_HOME` - verified picker customization, plugin and session detection, import completion, repeated invocation, and imported-session resume context - the broader `just test -p codex-tui` run passed 2,805 tests, with 2 unrelated guardian feature-flag failures and 4 skipped tests ## Draft follow-ups - review whether completion messaging should remain attached to the initiating chat if the user switches chats during an import - review shutdown semantics for an in-progress background import ## Stack 1. [#27064](https://github.com/openai/codex/pull/27064): remove the startup migration flow 2. [#27065](https://github.com/openai/codex/pull/27065): extract the picker renderer 3. [#27070](https://github.com/openai/codex/pull/27070): add the external-agent import picker UX 4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow through `/import` **This PR is stack item 4.** Draft while the lower stack dependencies are reviewed.
stefanstokic-oai ·
2026-06-10 15:53:15 -04:00 -
[codex] add external agent import picker UX (#27070)
## Why Users need to understand what external-agent data Codex detected, what is selected, and how to proceed before an import begins. The updated picker makes focus, selection state, and the submission path explicit while preserving the existing import backend. ## What changed - replace the old migration prompt with a two-step external-agent import picker - add a customize view with explicit item focus, selection state, counts, and a review action - separate detected import data into a view model - add Unix and Windows snapshots for prompt, item-focus, and action-focus states ## Validation - `just test -p codex-tui external_agent_config_migration` (10 passed) - manually exercised an isolated TUI fixture covering customization, selection toggles, review, import, repeated invocation, and session resume - the broader `just test -p codex-tui` run passed 2,805 tests, with 2 unrelated guardian feature-flag failures and 4 skipped tests ## Review note This is the largest layer in the stack because the interaction state, rendering changes, and required snapshots move together. It remains a draft in case reviewers prefer a further presentation/state split. ## Stack 1. [#27064](https://github.com/openai/codex/pull/27064): remove the startup migration flow 2. [#27065](https://github.com/openai/codex/pull/27065): extract the picker renderer 3. [#27070](https://github.com/openai/codex/pull/27070): add the external-agent import picker UX 4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow through `/import` **This PR is stack item 3.** Draft while the lower stack dependencies are reviewed.
stefanstokic-oai ·
2026-06-10 15:19:37 -04:00 -
[codex] extract external agent import picker renderer (#27065)
## Why The external-agent import picker is easier to review when its rendering refactor lands separately from new state and interaction behavior. This layer is intended to be behavior-neutral. ## What changed - extract external-agent migration rendering into a dedicated `render` module - preserve existing behavior while separating presentation from interaction logic - establish a smaller foundation for the import picker UX in the next PR ## Validation - `just test -p codex-tui external_agent_config_migration` (10 passed) ## Stack 1. [#27064](https://github.com/openai/codex/pull/27064): remove the startup migration flow 2. [#27065](https://github.com/openai/codex/pull/27065): extract the picker renderer 3. [#27070](https://github.com/openai/codex/pull/27070): add the external-agent import picker UX 4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow through `/import` **This PR is stack item 2.** Draft while the lower stack dependency is reviewed.
stefanstokic-oai ·
2026-06-10 14:48:30 -04:00 -
[codex] remove blocking external agent migration flow (#27064)
## Why External-agent import should be initiated deliberately instead of interrupting eligible TUI startups. This cleanup removes the blocking startup flow before the replacement import experience is introduced later in the stack. ## What changed - remove the startup-blocking external-agent migration prompt - remove the now-unused external migration feature gate - remove the obsolete TUI app-server migration wrappers - retain the dormant picker behind a module-scoped dead-code allowance until the next stack item wires it back in - keep normal TUI startup focused on entering Codex immediately ## Validation - `bazel build --config=clippy //codex-rs/tui:tui //codex-rs/tui:tui-unit-tests-bin` - `just test -p codex-tui external_agent_config_migration` (8 passed) - `just test -p codex-tui` (2,786 passed, 12 unrelated local environment-sensitive failures, 4 skipped) - `just fix -p codex-tui` - `just fmt` ## Stack 1. [#27064](https://github.com/openai/codex/pull/27064): remove the startup migration flow 2. [#27065](https://github.com/openai/codex/pull/27065): extract the picker renderer 3. [#27070](https://github.com/openai/codex/pull/27070): add the external-agent import picker UX 4. [#27071](https://github.com/openai/codex/pull/27071): expose the flow through `/import` **This PR is stack item 1.**
stefanstokic-oai ·
2026-06-10 14:25:04 -04:00 -
fix: Auto-recover from corrupted sqlite databases (#26859)
Further investigation of the sqlite incidents showed that the problems are due to corruption from the older version of SQLite that we recently upgraded, and that the data is truly corrupted in the root database -- recovery of all data is not possible. Given that the data is reconstructable from the rollouts on disk, we should just auto-backup the database and let codex rebuild the rollout info from the disk rollouts. The new behavior is that appserver auto-backs-up and rebuilds (with logs reflecting that behavior). The CLI now pops a message letting you know this happened and the paths of the backed-up corrupt db and the new database. There is also context added so that the desktop app can read the rebuild info from it and inform the user with it.
David de Regt ·
2026-06-10 11:24:29 -07:00 -
Add app-server `thread/delete` API (#25018)
## Why Clients can archive and unarchive threads today, but there is no app-server API for permanently removing a thread. Deletion also needs to cover the full session tree: deleting a main thread should remove spawned subagent threads and the related local metadata instead of leaving orphaned rollout files, goals, or subagent state behind. ## What - Adds the v2 `thread/delete` request and `thread/deleted` notification, with the response shape kept consistent with `thread/archive`. - Implements local hard delete for active and archived rollout files. - Deletes the requested thread's state DB row as the commit point, then best-effort cleans associated state including spawned descendants, goals, spawn edges, logs, dynamic tools, and agent job assignments. - Updates app-server API docs and generated protocol schema/TypeScript fixtures.
Eric Traut ·
2026-06-10 11:22:12 -07:00 -
feat: keep child MCP warnings out of parent transcript (#27174)
## Why MCP startup status notifications are thread-owned, but `ChatWidget` trusted upstream routing. If routing state delivered a tagged child notification to the active parent widget, the child MCP failure could still mutate the parent's startup state and transcript. Rejecting it only inside the MCP handler was also too late because shared notification handling could already restore and consume the parent's retry status. ## What changed - Validate a tagged MCP status notification against the visible `ChatWidget` thread before shared notification handling mutates any parent state. - Cover child `Starting` and `Failed` notifications delivered to a retrying parent widget, asserting that they preserve its visible retry error and saved status header while producing no history or MCP status mutation. ## User impact Subagent MCP startup failures remain scoped to the child transcript instead of appearing as duplicate warnings in the parent transcript. ## Testing - `just test -p codex-tui mcp_startup_ignores_status_for_other_thread` - `just test -p codex-tui primary_thread_ignores_child_mcp_startup_notifications` - `just fmt`
jif ·
2026-06-10 11:45:49 +02:00 -
canvrno-oai ·
2026-06-09 16:34:38 -07:00 -
Reduce TUI legacy core dependencies (#26711)
## Why The TUI still reached through `app-server-client::legacy_core` for thread-name normalization and project-instruction filename details. In particular, checking the TUI's local filesystem for `/init` is incorrect for remote app-server sessions, where the server owns the working directory and instruction discovery. ## What changed - use the instruction source paths supplied by the app server to decide whether `/init` should avoid overwriting project instructions - keep the small thread-name normalization helper local to the TUI - remove the now-unused instruction filename constants, utility module, and other unused `legacy_core` re-exports - make status helper tests independent of concrete instruction filenames ## Verification - `just test -p codex-app-server-client` - `just test -p codex-tui slash_init_skips_when_project_instructions_are_loaded` - `just test -p codex-tui` ran 2,799 tests; 2,797 passed and two unrelated guardian feature-flag tests failed reproducibly in untouched code ### Manual test Started an app server over WebSocket with a remote workspace containing `AGENTS.md`, then connected the TUI using `--remote`. After confirming `thread/start` returned the file in `instructionSources`, deleted `AGENTS.md` and ran `/init` in the existing session. The TUI still reported that project instructions already existed and skipped `/init`. The trace contained no `turn/start` request, confirming the decision came from app-server session state rather than a new client-local filesystem check.
Eric Traut ·
2026-06-09 13:26:00 -07:00 -
fix: Prevent /review crash when entering Esc on steer message (#22879)
This changes the `/review` escape path so `Esc` no longer behaves like the normal queued-follow-up interrupt flow while a review is running. Steering is not currently supported in `/review` mode; without this change, users can attempt a steer, but doing so leads to a crash (see #22815). If the user has already tried to send additional guidance during `/review`, the TUI now keeps the review running and shows a warning that steer messages are not supported in that mode, while still pointing users to `Ctrl+C` if they actually want to cancel. It also adds regression coverage for the review-specific warning behavior. When users do cancel with `Ctrl+C` during `/review`, the TUI now tolerates the active-turn race that can happen during review handoff, and any queued steer messages are restored to the composer instead of being discarded. - Special-case `Esc` during an active `/review` when follow-up steer input is pending or has already been deferred. - Show a clear warning instead of interrupting the running review. - Make the `Ctrl+C` cancel path during `/review` resilient to active-turn races, while preserving any queued steer text by restoring it to the composer. - Add review-mode test coverage for the warning path. ## Testing 1. Start a `/review` with a diff large enough that the review stays active for more than a few seconds. 2. While the review is still running, type a follow-up / steer message, submit it, and then press `Esc`. Before: `Esc` causes the TUI to close abruptly. After: the review keeps running and the transcript shows a warning that steer messages are not supported during `/review`, with guidance to use `Ctrl+C` if you want to cancel. 3. Press `Ctrl+C` if you actually want to stop the review. Before: (after restarting the test, since step 2 crashed) this is the intentional cancellation path. After: this remains the intentional cancellation path, and any queued follow-up steer text is restored to the composer instead of being lost.
## Note: `/review` mode explicitly does not support steering at this time (as noted in `turn_processer.rs`; if we want to explore that in the future, this code will need to be modified). This change keeps unsupported steer attempts from crashing the TUI and preserves queued follow-up text if the user cancels with `Ctrl+C`.
canvrno-oai ·
2026-06-09 09:41:58 -07:00 -
multi-agent: add path-based v2 activity tracking (#27007)
## Why Multi-agent v2 identifies agents by canonical paths, but its tool handlers still emitted the larger legacy collaboration begin/end events built around nickname and role metadata. App-server, rollout-trace, analytics, and TUI consumers therefore lacked one compact path-based completion signal that behaved consistently across live events and replay. The TUI also needs a bounded `/agent` status surface for v2 agents. It should use recent local activity for previews, refresh liveness without loading full histories, and keep the legacy picker available when no path-backed v2 agent is known. ## What changed - Replace the v2 `spawn_agent`, `send_message`, `followup_task`, and `interrupt_agent` legacy lifecycle emissions with a success-only `SubAgentActivity` event. The event records the tool call ID, occurrence time, affected thread, canonical agent path, and `started`, `interacted`, or `interrupted` kind. - Expose the activity as a completion-only app-server v2 `subAgentActivity` thread item in live notifications and reconstructed history, regenerate the protocol schemas, and count it in sub-agent tool analytics. - Track canonical paths from live activity and loaded-thread metadata in the TUI, and render the activity in live and replayed transcripts. - Make `/agent` list running path-backed agents with summaries from bounded local event buffers. Each summary is capped at 240 graphemes, the scan is capped at six recent items, only the last three wrapped lines are shown, and command output is omitted. Liveness falls back to metadata-only `thread/read` when local turn state is unavailable. - Persist the activity as a terminal rollout-trace runtime payload and reduce it to the corresponding spawn, send, follow-up, or close interaction edge. `interrupt_agent` is classified as a close-edge operation. - Preserve the legacy picker when no path-backed v2 agent is known. 
## Compatibility App-server v2 clients that consumed `collabAgentToolCall` begin/end pairs for these tools must handle the new completion-only `subAgentActivity` item. Legacy v1 collaboration behavior is unchanged. ## Screenshot <img width="684" height="288" alt="Screenshot 2026-06-08 at 15 40 47" src="https://github.com/user-attachments/assets/194b3cd0-619d-45fb-b587-cf3e2b1b8a1d" /> ## Testing - `just test -p codex-app-server-protocol` - `just test -p codex-rollout-trace` - Added focused coverage for activity analytics, terminal trace serialization, spawn-edge reduction, `interrupt_agent` classification, TUI status rendering without aggregated command output, and clearing stale running state after a completed turn.
jif ·
2026-06-09 12:14:48 +02:00 -
[codex] preserve fsmonitor for worktree Git reads (#26880)
Codex forces `core.fsmonitor=false` on internal Git commands so a repository cannot select an executable fsmonitor helper. This also disables Git's built-in daemon for `status`, `diff`, and `ls-files`, turning those worktree reads into full scans in large repositories. Read the raw effective `core.fsmonitor` value and preserve it only when Git interprets it as true and advertises built-in daemon support through `git version --build-options`. Query uncommon boolean spellings back through Git using the exact effective value. Unset, false, helper paths, malformed values, probe failures, and unsupported Git builds continue to force `core.fsmonitor=false`. Centralize this policy in `git-utils` while keeping process execution in the existing local and workspace-command adapters. Probe once per worktree workflow and reuse the result for its Git commands, including the TUI `/diff` path. Metadata-only commands and repository discovery remain disabled without probing. Each probe and requested Git process keeps its own existing timeout, and the decision is not cached because layered and conditional Git configuration can change while Codex runs. --------- Co-authored-by: Chris Bookholt <bookholt@openai.com>
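The preserve-or-disable policy described above can be read as a small decision function. The following is a minimal sketch, not the code in `git-utils`: `FsmonitorDecision`, `decide_fsmonitor`, and `is_common_git_true` are hypothetical names, and the sketch assumes that only Git's common true spellings are recognized locally, whereas the commit queries uncommon spellings back through Git.

```rust
/// Hypothetical outcome of the policy: leave the repository's fsmonitor
/// setting alone, or force `core.fsmonitor=false` on the Git command.
#[derive(Debug, PartialEq, Eq)]
pub enum FsmonitorDecision {
    Preserve,
    ForceFalse,
}

/// Assumption: only these common boolean spellings are checked locally.
/// Anything else (helper paths, malformed values) is treated as not true.
fn is_common_git_true(value: &str) -> bool {
    matches!(
        value.trim().to_ascii_lowercase().as_str(),
        "true" | "yes" | "on" | "1"
    )
}

/// `raw_value` is the raw effective `core.fsmonitor` value, if any.
/// `builtin_daemon_supported` stands in for the result of probing
/// `git version --build-options`.
pub fn decide_fsmonitor(
    raw_value: Option<&str>,
    builtin_daemon_supported: bool,
) -> FsmonitorDecision {
    match raw_value {
        Some(value) if builtin_daemon_supported && is_common_git_true(value) => {
            FsmonitorDecision::Preserve
        }
        // Unset, false, helper paths, malformed values, and unsupported
        // Git builds all fall through to the safe default.
        _ => FsmonitorDecision::ForceFalse,
    }
}
```

In this sketch a probe failure would be modeled by passing `builtin_daemon_supported = false`, which keeps the failure path on the forced-`false` side.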
Tamir Duberstein ·
2026-06-08 21:32:46 -07:00 -
Preserve cloud requirements across TUI thread resets (#25177)
Fixes a TUI regression where thread transitions such as `/new` and `/clear` could rebuild config without the cloud requirements loader, allowing users to fall back to non-cloud-managed settings. The config refresh path now preserves cloud requirements during thread reinitialization, and config loading is moved off the deep TUI event stack to avoid stack-overflow crashes during those reloads. - Passes the cloud requirements loader through TUI config rebuild paths. - Keeps cloud requirements applied for `/new`, `/clear`, `/fork`, side conversations, and session picker transitions. - Runs config building on a Tokio task so reloads do not occur on the deep TUI caller stack. - Adds regression coverage that cloud requirements survive thread-transition config refreshes. ## Test/Repro: - Start Codex with a cloud requirement applied. - Use `/new` or `/clear`. - The refreshed/fresh-session config should still include the cloud requirements. This can be tested with any config item; at the moment, the easiest item for OAI staff to test is the `mentions_v2` feature. It is currently enabled in cloud requirements but not by default. As a result, prior to these changes, that feature was disabled after `/new` or `/clear`. Repeating the same steps with a binary from this branch should keep the feature enabled.
canvrno-oai ·
2026-06-08 18:08:48 -07:00 -
Show effective sandbox modes in /debug-config (#27068)
## Summary - Render `/debug-config`'s `allowed_sandbox_modes` from the finalized permission constraints instead of the raw requirements list. - Add regression coverage for configured full-access and external sandbox modes being omitted when effective permissions reject them. ## Details `allowed_sandbox_modes` comes from managed requirements, but the final permissions can be further constrained by derived validation rules. For example, `permissions.filesystem.deny_read` requires sandbox enforcement, so modes that disable or externalize Codex's sandbox are not actually usable even if they were present in the raw requirements TOML. The debug renderer now enumerates the configured sandbox-mode labels and keeps only those accepted by `Config.permissions`. That makes `/debug-config` reflect the same effective permission-profile constraint path used by runtime config validation, while preserving the existing source/provenance display. ## Validation - Added a regression test for effective sandbox-mode filtering in `/debug-config`.
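The filtering described in the details above amounts to keeping only the configured labels that the finalized permissions accept. A minimal sketch follows, assuming a simplified stand-in for `Config.permissions`: `SandboxMode`, `EffectivePermissions`, and `effective_allowed_modes` are hypothetical names, and the single `requires_sandbox_enforcement` flag is an illustrative reduction of the derived validation rules (such as the one triggered by `permissions.filesystem.deny_read`).

```rust
/// Hypothetical sandbox-mode labels used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxMode {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
    ExternalSandbox,
}

/// Stand-in for finalized permission constraints (an assumption, not the
/// real `Config.permissions` type).
pub struct EffectivePermissions {
    pub requires_sandbox_enforcement: bool,
}

impl EffectivePermissions {
    /// Modes that disable or externalize the sandbox are rejected whenever
    /// sandbox enforcement is required.
    pub fn accepts(&self, mode: SandboxMode) -> bool {
        match mode {
            SandboxMode::DangerFullAccess | SandboxMode::ExternalSandbox => {
                !self.requires_sandbox_enforcement
            }
            SandboxMode::ReadOnly | SandboxMode::WorkspaceWrite => true,
        }
    }
}

/// Keep only the configured modes that the effective permissions accept,
/// preserving the configured order.
pub fn effective_allowed_modes(
    configured: &[SandboxMode],
    permissions: &EffectivePermissions,
) -> Vec<SandboxMode> {
    configured
        .iter()
        .copied()
        .filter(|mode| permissions.accepts(*mode))
        .collect()
}
```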
canvrno-oai ·
2026-06-08 17:03:52 -07:00 -
fix(tui): linkify complete bare URLs with tildes (#27088)
## Background Bare URLs containing `~` in their path are currently only clickable up to the tilde in the interactive TUI. For example, Codex renders the visible text for: `https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf` but the OSC 8 destination stops at `https://www.cs.tufts.edu/`. This makes Cmd-click open the wrong location even though the terminal recognizes the complete URL outside Codex. Fixes #26774. ## Root Cause The URL scanner already accepts `~`. The truncation happens earlier: with strikethrough parsing enabled, `pulldown-cmark` splits this URL into adjacent decoded `Event::Text` values around the tilde. The Markdown renderer annotated each text event independently, so only the first event still looked like a complete URL with a supported scheme. The renderer now merges adjacent decoded text events before URL annotation. It preserves the combined source range while retaining parser-decoded contents, which avoids regressing entities such as `&`. ## Changes - Add a small iterator that merges adjacent decoded Markdown text events and their source ranges. - Apply it at the Markdown renderer boundary before hyperlink detection. - Add regression coverage for the reported URL in prose, wrapped table output, and entity-decoded URLs. ## How to Test 1. Run Codex with `just c`. 2. Ask the assistant to output this exact bare URL with no Markdown link syntax: `https://www.cs.tufts.edu/~nr/cs257/archive/olin-shivers/dissertation.pdf` 3. Hold Cmd and hover or click the URL. 4. Confirm the complete URL, including the suffix after `~`, is one destination. 5. Repeat with the URL inside a Markdown table and confirm wrapped portions retain the same complete destination. Targeted tests: - `just test -p codex-tui url_with_tilde` - `just test -p codex-tui merged_text_events_preserve_entity_decoding` The full `codex-tui` test run was also executed. 
Its only failures were the two existing Guardian feature-flag tests: - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default` - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
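The merge step described under Root Cause can be sketched without depending on `pulldown-cmark`. This is an illustration under stated assumptions: the `Event` enum below is a hypothetical two-variant stand-in for the parser's event type, and `merge_adjacent_text` is an eager function rather than the small iterator the change adds.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the parser's event type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    Text(String),
    SoftBreak,
}

/// Merge runs of adjacent `Text` events. The decoded contents are
/// concatenated and the source range is widened to cover the whole run,
/// so later URL detection sees one complete string.
pub fn merge_adjacent_text(
    events: Vec<(Event, Range<usize>)>,
) -> Vec<(Event, Range<usize>)> {
    let mut merged: Vec<(Event, Range<usize>)> = Vec::with_capacity(events.len());
    for (event, range) in events {
        if let Event::Text(next) = &event {
            if let Some((Event::Text(previous), previous_range)) = merged.last_mut() {
                // Extend the previous text event instead of starting a new one.
                previous.push_str(next);
                previous_range.end = range.end;
                continue;
            }
        }
        merged.push((event, range));
    }
    merged
}
```

Because the merge keeps the already-decoded strings and only widens the range, entity decoding done by the parser is not repeated or lost.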
Felipe Coury ·
2026-06-08 17:02:36 -07:00 -
fix: preserve auto review across config and delegation (#26230)
## Why Auto Review should remain the effective approval reviewer when settings cross runtime boundaries. A config or app-server round trip must not change the reviewer identity, and delegated work must not silently fall back to user review. This requires both a stable canonical serialized value and propagation of the effective setting. `auto_review` is the canonical value across protocol and app-server output, while `guardian_subagent` remains accepted as backward-compatible input. ## What changed - serialize `ApprovalsReviewer::AutoReview` consistently as `auto_review` across core protocol and app-server v2 - continue accepting `guardian_subagent` when reading existing config or client requests - carry the active turn's approval reviewer into spawned agents - update config/debug expectations and add delegated-task regression coverage ## Scope This does not change Guardian policy or remove compatibility with existing `guardian_subagent` inputs. It preserves the selected reviewer across serialization, config reloads, app-server settings, and delegated task setup. Related Guardian changes are split independently: - #26231 adds denials and soft denials - #26334 retries transient reviewer failures - #26333 reuses narrowly scoped low-risk approvals - #26232 adds TUI denial recovery ## Validation - `just test -p codex-app-server-protocol` (224 passed) - regression coverage for delegated task reviewer propagation - serialization coverage for canonical `auto_review` output and legacy `guardian_subagent` input --------- Co-authored-by: saud-oai <saud@openai.com>
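The canonical-output, legacy-input rule can be illustrated with plain string conversion. This sketch deliberately uses `Display` and `FromStr` from the standard library instead of the serialization framework the real code uses, and the `User` variant with its `"user"` spelling is an assumption added only to give the enum a second case.

```rust
use std::fmt;
use std::str::FromStr;

/// Simplified stand-in for the reviewer setting. Only `AutoReview` and its
/// two spellings come from the change description; `User` is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalsReviewer {
    User,
    AutoReview,
}

impl fmt::Display for ApprovalsReviewer {
    /// Always write the canonical value, never the legacy alias.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            ApprovalsReviewer::User => "user",
            ApprovalsReviewer::AutoReview => "auto_review",
        })
    }
}

impl FromStr for ApprovalsReviewer {
    type Err = String;

    /// Accept the canonical value and the backward-compatible alias.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "user" => Ok(ApprovalsReviewer::User),
            "auto_review" | "guardian_subagent" => Ok(ApprovalsReviewer::AutoReview),
            other => Err(format!("unknown approvals reviewer: {other}")),
        }
    }
}
```

A round trip through this pair turns a stored `guardian_subagent` into `auto_review` on the next write, which is the stability property the change aims for.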
viyatb-oai ·
2026-06-08 18:59:50 +00:00 -
fix(tui): scope MCP startup status by thread (#26639)
## Why MCP startup failures from spawned subagents were rendered as global notifications, so a child thread's failure could pollute the visible parent transcript. Routing the notification to the child exposed two related replay problems: session refresh could discard the buffered event, and a newly created child `ChatWidget` did not know the expected MCP server set, which could leave its startup spinner running after every server had settled. MCP startup diagnostics should remain visible in the thread that owns the startup without affecting other transcripts. The protocol also needs to support a future app-scoped MCP lifecycle where startup is not owned by any thread. ## Reported Behavior The [originating Slack report](https://openai.slack.com/archives/C08JZTV654K/p1780604538859939) called out that using subagents could turn MCP startup failures into a wall of yellow CLI warnings because repeated failures were not deduplicated. The intended behavior is for those diagnostics to remain visible once in the thread that owns the startup, without polluting the parent transcript. ## What Changed - add nullable `threadId` ownership to `mcpServer/startupStatus/updated` - populate it from the app-server conversation ID for the current thread-scoped lifecycle and regenerate the protocol schema and TypeScript artifacts - treat a missing or null `threadId` as app-scoped without injecting it into the active chat transcript - route and buffer thread-owned MCP startup notifications by thread in the TUI - preserve buffered MCP startup events across child session refresh - seed expected MCP servers before replaying a thread snapshot so startup reaches its terminal state - suppress an identical repeated failure warning for the same server within one startup round The owning thread still renders the detailed failure and final `MCP startup incomplete (...)` summary. ## How to Test 1. Configure an optional MCP server named `smoke` that exits during initialization. 2. 
Launch the TUI with multi-agent support enabled. 3. Confirm the main thread's own startup failure renders one detailed `smoke` warning and one incomplete-startup summary. 4. Spawn exactly one subagent. 5. Confirm the parent transcript does not receive the subagent's MCP startup failure. 6. Switch to the subagent thread and confirm it contains exactly one detailed `smoke` failure and one incomplete-startup summary. 7. Confirm the subagent's MCP startup spinner disappears and the thread remains usable. 8. Switch between the parent and subagent and confirm the warnings neither move nor duplicate. Targeted tests: - `just test -p codex-app-server-protocol` - `just test -p codex-app-server thread_start_emits_mcp_server_status_updated_notifications` - `just test -p codex-tui mcp_startup` The parent/child behavior and spinner completion were also exercised manually in tmux. `just argument-comment-lint` was attempted but blocked by an unrelated local Bazel LLVM empty-glob failure; touched Rust callsites were inspected manually.
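The routing rules listed under What Changed can be sketched as a small per-thread buffer. Everything named here is hypothetical (`StartupStatus`, `StartupRouter`, `route`, `buffered_for`), and the sketch assumes one router instance lives for a single startup round, so it does not model resetting the duplicate-failure set between rounds.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified notification payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StartupStatus {
    /// `None` models a missing or null `threadId` (app-scoped startup).
    pub thread_id: Option<String>,
    pub server: String,
    /// `Some(message)` when the server failed to start.
    pub failure: Option<String>,
}

/// Buffers thread-owned notifications by thread and drops identical
/// repeated failures for the same server.
#[derive(Default)]
pub struct StartupRouter {
    buffered: HashMap<String, Vec<StartupStatus>>,
    seen_failures: HashSet<(String, String, String)>,
}

impl StartupRouter {
    /// Returns `true` when the notification was buffered for its owner.
    pub fn route(&mut self, status: StartupStatus) -> bool {
        let Some(thread_id) = status.thread_id.clone() else {
            // App-scoped: never injected into a chat transcript.
            return false;
        };
        if let Some(message) = &status.failure {
            let key = (thread_id.clone(), status.server.clone(), message.clone());
            if !self.seen_failures.insert(key) {
                // Identical failure already recorded for this server.
                return false;
            }
        }
        self.buffered.entry(thread_id).or_default().push(status);
        true
    }

    /// Notifications waiting to be replayed into the given thread.
    pub fn buffered_for(&self, thread_id: &str) -> &[StartupStatus] {
        self.buffered
            .get(thread_id)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

Keying the buffer by thread is what keeps a child's failure out of the parent: the parent's slice simply never contains it.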
Felipe Coury ·
2026-06-07 20:12:05 -07:00 -
[codex] Deduplicate skill load warnings (#26698)
Skill reloads can get noisy when the watcher keeps triggering `skills/list` and the same invalid `SKILL.md` error comes back each time. This keeps the first warning visible, then suppresses repeats while the same `(path, message)` is still active. If the error clears and later comes back, or if the message changes, it will show again. Validation: - `just fmt` - `just test -p codex-tui skill_load_warning_state`
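The suppression rule above can be sketched as a set of currently active `(path, message)` pairs that is replaced on every reload. The type and method names are hypothetical, and the sketch assumes each reload reports the full list of current errors.

```rust
use std::collections::HashSet;

/// Hypothetical state tracking which skill load errors are still active.
#[derive(Default)]
pub struct SkillLoadWarningState {
    active: HashSet<(String, String)>,
}

impl SkillLoadWarningState {
    /// Takes the `(path, message)` errors from the latest reload and
    /// returns only those that should be shown as new warnings.
    pub fn update(&mut self, current: &[(String, String)]) -> Vec<(String, String)> {
        let mut latest = HashSet::new();
        let mut to_show = Vec::new();
        for entry in current {
            // Show an entry once per reload, and only if it was not
            // already active after the previous reload.
            if latest.insert(entry.clone()) && !self.active.contains(entry) {
                to_show.push(entry.clone());
            }
        }
        // Errors absent from this reload are forgotten, so they warn
        // again if they come back later.
        self.active = latest;
        to_show
    }
}
```

Replacing the active set wholesale, rather than only adding to it, is what lets a cleared error resurface: once a reload omits the pair, the next occurrence counts as new.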
xl-openai ·
2026-06-05 18:37:47 -07:00 -
permissions: enforce managed permission profile allowlists (#24852)
## Why

Permission profile allowlists are an enterprise security boundary, but they also need to compose across the managed requirements layers added in #24620. A map representation lets each requirements layer add, allow, or revoke individual profiles without replacing an entire array.

## Managed Contract

Administrators configure the mergeable allow map with `allowed_permission_profiles`. A recommended enterprise configuration explicitly lists every built-in and custom profile users should be able to select:

```toml
default_permissions = "review_only"

[allowed_permission_profiles]
":read-only" = true
":workspace" = true
review_only = true
# ":danger-full-access" is intentionally omitted, so it is denied.

[permissions.review_only]
extends = ":read-only"
```

- Profiles whose effective merged value is `true` are allowed.
- Missing profiles and profiles set to `false` are denied.
- This is a closed allowlist: built-in profiles and profiles introduced in future versions are denied unless explicitly allowed.
- Explicitly list each built-in profile the enterprise wants to make available. Omit built-ins such as `:danger-full-access` when they should remain unavailable.
- Set `default_permissions` explicitly to the allowed profile users should receive when they have no local selection.
- Higher-precedence layers override only the profile keys they define.
- `false` is only needed when a higher-precedence layer must revoke a `true` inherited from a lower layer.
- Explicit keys must refer to known built-in or managed profiles.

A custom or narrowed allowlist requires an allowed `default_permissions`. For compatibility, if both `:workspace` and `:read-only` are explicitly allowed, an omitted default resolves to `:workspace`; customer configurations should still set the intended default explicitly.

When `allowed_permission_profiles` is absent, existing implicit permission and legacy `sandbox_mode` behavior is unchanged.

## What Changed

- Add `allowed_permission_profiles` as a `BTreeMap<String, bool>` that merges per profile across requirements layers.
- Enforce managed defaults, strict denial of omitted profiles, and the explicitly allowed standard-pair fallback.
- Expose `allowedPermissionProfiles` through `configRequirements/read` and regenerate its schemas.
- Add regression coverage for map composition and revocation, managed defaults, strict denial of omitted built-ins, and API output.

## Verification

- Focused `codex-config` coverage for layered map composition and revocation
- Focused `codex-core` coverage for managed defaults, invalid defaults, strict denial of omitted built-ins, and the standard built-in pair
- Focused `codex-app-server` coverage for requirements API output
- Scoped Clippy for `codex-config`, `codex-core`, `codex-app-server-protocol`, and `codex-app-server`

## Documentation

The managed `requirements.toml` documentation should introduce `allowed_permission_profiles` as a closed permission-profile allowlist before this setting is published on developers.openai.com.

---------

Co-authored-by: Codex <noreply@openai.com>
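The per-profile merge and closed-allowlist lookup can be sketched with the same `BTreeMap<String, bool>` shape the change names. The helper names `merge_layers` and `is_allowed` and the `AllowMap` alias are hypothetical, and the sketch assumes layers are supplied from lowest to highest precedence; it does not cover default resolution or validation of unknown keys.

```rust
use std::collections::BTreeMap;

/// Same shape as the setting: profile name mapped to allowed or denied.
pub type AllowMap = BTreeMap<String, bool>;

/// Merge requirements layers, ordered from lowest to highest precedence
/// (an assumption of this sketch). A higher layer overrides only the
/// profile keys it defines.
pub fn merge_layers(layers: &[AllowMap]) -> AllowMap {
    let mut merged = AllowMap::new();
    for layer in layers {
        for (profile, allowed) in layer {
            merged.insert(profile.clone(), *allowed);
        }
    }
    merged
}

/// Closed allowlist: a profile is allowed only when its merged value is
/// `true`; missing profiles and `false` entries are denied.
pub fn is_allowed(merged: &AllowMap, profile: &str) -> bool {
    merged.get(profile).copied().unwrap_or(false)
}
```

Defaulting a missing key to `false` is what makes the list closed: a profile introduced in a later version has no entry and is therefore denied until an administrator adds it.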
viyatb-oai ·
2026-06-05 18:06:29 -07:00 -
[codex-rs] support v2 personal access tokens (#25731)
## Summary - add v2 personal access token support for `codex login --with-access-token` and `CODEX_ACCESS_TOKEN` - classify opaque `at-` tokens separately from legacy Agent Identity JWTs - hydrate required ChatGPT account metadata through AuthAPI `/v1/user-auth-credential/whoami` - use PATs directly as bearer tokens while preserving existing ChatGPT account surfaces - expose PAT-backed auth as the explicit `personalAccessToken` app-server auth mode ## Implementation PAT auth is intentionally small and stateless. Loading a PAT performs one AuthAPI metadata request, stores the hydrated metadata in the in-memory auth object, and redacts the secret from debug output. Legacy Agent Identity JWT handling remains unchanged. The shared access-token classifier lives in a private neutral module because it dispatches between both credential types. PAT hydration fails closed when AuthAPI omits any required metadata, including email. Hydrated metadata is intentionally not persisted: startup performs a live `whoami` preflight so revoked tokens or changed account metadata are not accepted from a stale cache. ## Workspace restriction scope This change intentionally does **not** apply `forced_chatgpt_workspace_id` to PAT authentication. The setting is a client-side config guardrail, not an authorization boundary, and PAT does not currently require workspace-ID parity. The PAT login and `CODEX_ACCESS_TOKEN` paths therefore validate through AuthAPI without threading workspace-restriction state through access-token loading. Existing workspace checks for non-PAT auth remain on their established paths. ## App-server compatibility The public app-server `AuthMode` is shared across v1 and v2, and PAT-backed auth reports `personalAccessToken` through both APIs. Following human review, this intentionally removes the temporary v1 compatibility mapping that reported PATs as `chatgpt`; the deprecated v1 API is kept in parity with v2 rather than maintaining a separate closed enum. 
Clients with exhaustive auth-mode handling in either API version must add the new case and should generally treat it as ChatGPT-backed unless they need PAT-specific behavior. The v1 auth-status response still omits the raw PAT when `includeToken` is requested because that response cannot carry the account metadata needed to reuse the credential safely. Persisted PAT auth also omits the new enum value so older Codex builds can deserialize `auth.json` and infer PAT auth from the credential field after a rollback. ## Validation Latest review-fix validation: - `CARGO_INCREMENTAL=0 just test -p codex-login` (126 passed) - `CARGO_INCREMENTAL=0 just test -p codex-cli` (263 passed) - `CARGO_INCREMENTAL=0 just test -p codex-cli stored_auth_validation_handles_personal_access_token` - `CARGO_INCREMENTAL=0 just test -p codex-app-server-protocol` (226 passed) - `CARGO_INCREMENTAL=0 just test -p codex-models-manager refresh_available_models_uses_remote_only_catalog_for_chatgpt_auth` - `CARGO_INCREMENTAL=0 just test -p codex-tui existing_non_oauth_chatgpt_login_counts_as_signed_in` - `CARGO_INCREMENTAL=0 just fix -p codex-login -p codex-app-server-protocol -p codex-models-manager -p codex-tui -p codex-cli` - `just fmt` - `git diff --check` The broader `codex-tui` suite previously compiled and ran 2,834 tests. Three unrelated environment-sensitive guardian/IDE-socket tests failed after retries; the PAT-relevant TUI coverage passed.
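As a rough illustration of the classification and redaction ideas in this PR, the sketch below dispatches on the `at-` prefix and hides the secret in `Debug` output. The classifier in the PR is private, so everything here is an assumption: `AccessTokenKind`, `classify_access_token`, and `PatSecret` are hypothetical names, and the three-segment check is only the generic compact-JWT shape, not the project's actual validation.

```rust
use std::fmt;

/// Hypothetical credential kinds; the real classifier is a private module.
#[derive(Debug, PartialEq, Eq)]
pub enum AccessTokenKind {
    PersonalAccessToken,
    AgentIdentityJwt,
}

/// Opaque `at-` tokens are treated as PATs; anything shaped like a compact
/// JWT (three non-empty dot-separated segments) is treated as a legacy JWT.
pub fn classify_access_token(token: &str) -> Option<AccessTokenKind> {
    let token = token.trim();
    if token.starts_with("at-") && token.len() > "at-".len() {
        return Some(AccessTokenKind::PersonalAccessToken);
    }
    let segments: Vec<&str> = token.split('.').collect();
    if segments.len() == 3 && segments.iter().all(|s| !s.is_empty()) {
        return Some(AccessTokenKind::AgentIdentityJwt);
    }
    None
}

/// Wrapper that keeps the secret out of debug output.
pub struct PatSecret(String);

impl PatSecret {
    pub fn new(secret: String) -> Self {
        Self(secret)
    }

    /// The PAT is used directly as a bearer token.
    pub fn bearer_header_value(&self) -> String {
        format!("Bearer {}", self.0)
    }
}

impl fmt::Debug for PatSecret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("PatSecret(<redacted>)")
    }
}
```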
cooper-oai ·
2026-06-05 17:36:18 -07:00 -
[codex] Gate terminal visualization instructions in TUI (#26013)
## Summary - add `Feature::TerminalVisualizationInstructions` as `UnderDevelopment`, disabled by default - keep terminal visualization instructions inside the TUI package - append them to existing developer instructions for TUI start, resume, and fork flows only when enabled - intentionally do not apply them to `codex exec` ## Rollout Control behavior is unchanged. TUI dogfooders can enable `terminal_visualization_instructions`; no default user receives the new terminal-specific instructions. The shared visualization-selection rule is supplied separately through the `codex_proxy_model_3` Statsig layer for every target Codex model slug in the gated cohort. This TUI feature determines how to render an appropriate visualization on the terminal surface; the model-layer treatment determines when to use one. ## Validation - `cargo test -p codex-tui terminal_visualization_instructions_are_gated_for_all_tui_thread_flows --lib` - `cargo test -p codex-features --lib` - `cargo fmt --all -- --check` - `git diff --check` - GPT-5.4 and GPT-5.5 real prompt-pipeline smoke tests: both visualized the positive mapping case, abstained on the negative route case, and passed exact prompt-stack verification on CLI and App - refreshed onto current `main` with a clean merge and reran the focused validation The full 53-probe all-model treatment comparison and requested production coding evals remain rollout gates before broadening beyond the initial employee cohort. This PR remains open for normal human review.
vie-oai ·
2026-06-05 17:23:45 -07:00 -
Speed up TUI startup by reusing plugin discovery (#26469)
## Summary TUI startup loads related plugin data from `hooks/list`, session MCP initialization, and plugin skill warmup. These paths repeated filesystem discovery and emitted the same plugin warnings, while `hooks/list` and account/model bootstrap ran serially. This change: - Reuses one immutable plugin load outcome across startup consumers. - Keys the cache only on plugin-relevant configuration. - Single-flights concurrent plugin loads and prevents invalidated loads from repopulating the cache. - Runs hook discovery and account/model bootstrap concurrently. - Preserves configuration-migration ordering, hook review behavior, and accurate startup telemetry. In 10 alternating release-build launches in the Ruff repository with the existing `~/.codex` configuration, median time to the first editable composer decreased from 833ms to 504ms. The branch was faster in 9 of 10 pairs, with a paired median improvement of 312ms.
Charlie Marsh ·
2026-06-05 15:32:43 -04:00 -
Use state DB first for `resume --last` (#26462)
## Summary `codex resume --last` currently lists sessions by updated time using scan-and-repair. Updated-time filesystem listing must stat every rollout before applying the cwd, provider, and source filters, so startup scales with the entire local session history... This change queries the state DB first for the latest matching session. For local workspaces, we only accept the indexed result when its rollout path still exists; otherwise we retry with scan-and-repair. The same lookup path is shared by `fork --last`. I benchmarked the same `thread/list` request used by `resume --last` in my local `ruff` checkout against a Codex home with 2,599 active rollouts totaling 3.7 GiB, including 90 Ruff threads. Across five fresh release app-server processes with warm filesystem caches, the state-DB-only lookup had median latency of 0.37-0.44 ms, while scan-and-repair had median latency of 139-162 ms. First-request latency was 0.7-1.7 ms versus 142-185 ms. So this **removes roughly 140-160 ms from the `resume --last` lookup** on this machine, and makes that lookup over 300x faster. The tradeoff is that this does leave two correctness gaps: - If a newer matching rollout is missing from SQLite but an older matching row exists, the fast path resumes the older thread and never falls back to the filesystem scan. - If an existing row has stale filter or ordering metadata, the fast path can select a different thread from scan-and-repair. The rollout tests already demonstrate this for stale cwd metadata: state-DB-only returns the stale match, while scan-and-repair removes and repairs it. So you could end up seeing the "wrong" result in cases like... 1. A crash or SQLite error occurs between Codex writing the conversation file and updating SQLite, leaving the newer file unindexed. 2. An older Codex version, restore, or manual copy adds a conversation file after SQLite’s one-time backfill completed. 
These seem pretty rare though (and sessions can always be recovered via other mechanisms -- `--last` is just a convenience feature), and I think the tradeoffs are good here?
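The fast path with fallback described in this PR can be sketched as below. This is a simplified model under stated assumptions: `SessionRef` and `find_latest_session` are hypothetical names, the state-DB result and the filesystem scan are passed in as plain values and closures, and falling back when the index returns nothing at all is an assumption of this sketch.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for an indexed session row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SessionRef {
    pub id: String,
    pub rollout_path: PathBuf,
}

/// Accept the indexed result only when its rollout file still exists;
/// otherwise retry with the slower scan-and-repair listing.
pub fn find_latest_session(
    indexed: Option<SessionRef>,
    rollout_exists: impl Fn(&Path) -> bool,
    scan_and_repair: impl FnOnce() -> Option<SessionRef>,
) -> Option<SessionRef> {
    match indexed {
        // Fast path: trust the state DB row when the file is still there.
        Some(session) if rollout_exists(&session.rollout_path) => Some(session),
        // Slow path: missing row (assumed) or dangling rollout path.
        _ => scan_and_repair(),
    }
}
```

Note that the sketch reproduces the first correctness gap listed above: an older row whose rollout file still exists is returned without ever consulting the filesystem scan.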
Charlie Marsh ·
2026-06-05 14:58:09 -04:00 -
Make runtime workspace roots absolute in app-server API (#26552)
Stacked on #26532. ## Why #26532 moves cwd normalization to the app-server/core boundary. `runtimeWorkspaceRoots` still accepted raw paths in v2 requests and in `ConfigOverrides`, which left core responsible for interpreting those roots later. This makes runtime workspace roots follow the same absolute-path boundary as cwd. ## What - Change v2 `runtimeWorkspaceRoots` request fields for `thread/start`, `thread/resume`, `thread/fork`, and `turn/start` to `AbsolutePathBuf`. - Deduplicate already-absolute runtime roots in app-server handlers and pass them through `ConfigOverrides.workspace_roots` as `AbsolutePathBuf`. - Update TUI and exec client request builders to pass absolute runtime roots directly. - Update app-server docs, schema fixtures, and focused tests for absolute runtime roots. ## Testing - `just test -p codex-app-server-protocol` - `just test -p codex-app-server runtime_workspace_roots` - `just test -p codex-core session_permission_profile_rebinds_runtime_workspace_roots` - `just test -p codex-tui app_server_session` - `just test -p codex-exec`
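A minimal sketch of order-preserving deduplication at an absolute-path boundary follows. It uses `std::path::PathBuf` with an explicit `is_absolute` check as a stand-in for the project's `AbsolutePathBuf` type, and `dedupe_absolute_roots` is a hypothetical name rather than the handler code from the PR.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Deduplicate runtime workspace roots while preserving first-seen order.
/// Returns the offending path as an error if any root is not absolute.
/// (Stand-in for a typed `AbsolutePathBuf` boundary.)
pub fn dedupe_absolute_roots(roots: Vec<PathBuf>) -> Result<Vec<PathBuf>, PathBuf> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for root in roots {
        if !root.is_absolute() {
            return Err(root);
        }
        // `insert` returns false when the path was already present.
        if seen.insert(root.clone()) {
            unique.push(root);
        }
    }
    Ok(unique)
}
```

Rejecting relative paths at the boundary means later layers never have to guess which directory a root was relative to.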
pakrym-oai ·
2026-06-05 11:36:53 -07:00 -
fix(tui): restore cancelled prompt cursor at end (#26457)
## Why Pressing `Esc` on a turn that produced no visible output restores the submitted prompt so the user can keep editing it. That restore path preserved the prompt content, images, and mention bindings, but left the composer cursor at the start of the restored text. The next edit therefore inserted at the beginning instead of continuing from the end of the prompt. ## What Changed - Move the cursor to the end after `BottomPane::set_composer_text_with_mention_bindings` rehydrates a restored draft. - Add test-only cursor accessors so restore tests can assert the composer state directly. - Extend the queued restore regression to assert the restored composer cursor is positioned at `text.len()`. ## How to Test Manual reviewer flow: 1. Start Codex in the TUI. 2. Submit a prompt that will take long enough to interrupt. 3. Press `Esc` before any visible assistant output appears. 4. Confirm the prompt is restored into the composer and the cursor is at the end, so typing appends to the prompt. 5. Repeat with a prompt that includes an attached image or resolved mention and confirm the restored content remains intact. Targeted tests: - `just test -p codex-tui chatwidget::tests::composer_submission::queued_restore_with_remote_images_keeps_local_placeholder_mapping` Lint note: - `just argument-comment-lint` is blocked locally by the existing Bazel `compiler-rt` empty glob failure before analyzing touched code. The touched Rust diff was manually inspected and adds no new opaque positional literal callsites.
Felipe Coury ·
2026-06-05 15:10:13 -03:00 -
fix(tui): Windows composer background (#26181)
## Why On Windows, the TUI could not shade the composer against the terminal background because `terminal_palette::default_colors()` always fell back to `None`. That preserved safety, but it also meant terminals that do support OSC 10/11 default color replies had no path to report their real background color. This keeps the existing fallback behavior for unsupported terminals while allowing capable Windows terminals to report their default foreground/background colors during startup. | Before | After | |---|---| | <img width="1235" height="658" alt="win-before" src="https://github.com/user-attachments/assets/ff756589-fcb3-43de-8f2a-ebc0369b30dd" /> | <img width="1235" height="658" alt="win-after" src="https://github.com/user-attachments/assets/9563ff20-4be5-4608-9414-a2afb647e745" /> | ## What Changed - Moved the OSC 10/11 default color parser in `tui/src/terminal_probe.rs` out of the Unix-only implementation so it can be reused by Windows. - Added a Windows-only bounded OSC 10/11 probe using raw console handles and the existing `windows-sys` dependency. - Added Windows palette caching in `tui/src/terminal_palette.rs` so startup probe results, including `None`, are reused instead of probing again later. - Wired the Windows color probe into TUI startup after the existing non-Unix crossterm cursor and keyboard checks. - Added parser coverage for malformed, partial, and noisy OSC color replies. If the probe fails, times out, receives only one color, or receives malformed data, the cache stores `None` and the composer keeps the current behavior. ## How to Test 1. On Windows, start Codex in a terminal that supports OSC 10/11 default color replies. 2. Open the TUI composer. 3. Confirm the composer/status area is painted using the terminal's reported default background, instead of leaving the background unshaded. 4. Start Codex in a terminal that does not answer OSC 10/11, or otherwise blocks terminal color replies. 5. 
Confirm startup still succeeds and the composer uses the existing fallback behavior. Targeted tests: - `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target just test -p codex-tui terminal_probe` Additional local verification: - `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target just test -p codex-tui` was run; 2774 tests passed, and two unrelated Guardian feature-flag tests failed reproducibly when isolated. - `just argument-comment-lint` was attempted but blocked by the local Bazel/LLVM `include/sanitizer/*.h` empty glob issue. Touched Rust literal callsites were inspected manually. - `cargo check -p codex-tui --target x86_64-pc-windows-msvc` was attempted after installing the target, but local macOS cross-checking is blocked by missing Windows C SDK headers in native dependencies (`ring`/`aws-lc-sys`). --------- Co-authored-by: Kevin Bond <kbond@openai.com>
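For readers unfamiliar with OSC 10/11 replies, here is a small independent sketch of parsing one. It is not the parser from `tui/src/terminal_probe.rs`: `parse_osc_color` and `scale_component` are hypothetical names, and the sketch assumes the common `rgb:RRRR/GGGG/BBBB` reply form terminated by BEL or ST, returning `None` for anything malformed or partial.

```rust
/// Parse a reply such as `ESC ] 11 ; rgb:ffff/0000/8080 BEL` into 8-bit RGB.
/// `code` is 10 (default foreground) or 11 (default background).
pub fn parse_osc_color(reply: &str, code: u8) -> Option<(u8, u8, u8)> {
    let prefix = format!("\x1b]{code};rgb:");
    // Tolerate noise before the reply by searching for the prefix.
    let start = reply.find(&prefix)? + prefix.len();
    let rest = &reply[start..];
    // The payload ends at BEL or at the ESC that begins ST (`ESC \`).
    let end = rest.find(|c: char| c == '\x07' || c == '\x1b')?;
    let mut parts = rest[..end].split('/');
    let r = scale_component(parts.next()?)?;
    let g = scale_component(parts.next()?)?;
    let b = scale_component(parts.next()?)?;
    if parts.next().is_some() {
        return None; // more than three components is malformed
    }
    Some((r, g, b))
}

/// Scale a 1 to 4 digit hex component to the 0..=255 range.
fn scale_component(hex: &str) -> Option<u8> {
    if hex.is_empty() || hex.len() > 4 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let value = u32::from_str_radix(hex, 16).ok()?;
    let max = (1u32 << (4 * hex.len() as u32)) - 1;
    Some((value * 255 / max) as u8)
}
```

A caller that gets `None` for either color would keep the existing unshaded fallback, matching the behavior described above.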
Felipe Coury ·
2026-06-05 11:05:46 -07:00 -
fix(tui): avoid doubled blank rows while streaming (#26636)
## Summary During assistant-message streaming, blank markdown lines in the transient active tail were prefixed with two spaces. Ratatui measured those whitespace-only lines as two viewport rows, so list- and table-heavy answers showed doubled vertical gaps while streaming and then visibly compacted when finalized into scrollback. - keep whitespace-only `StreamingAgentTailCell` lines structurally empty while preserving nonblank message prefixes - clear impossible hyperlink metadata when normalizing a blank tail line - add an inline snapshot and height regression proving one blank markdown line occupies one viewport row Related to #26618, but fixes a separate live-tail row-height issue rather than stale committed markdown content. ## How to Test Recommended before/after reproduction: 1. Start the latest Codex build without this change. 2. Submit this exact prompt: > Send 20 different lists: bullets vs numbered, simple vs complex with paragraphs in between items, etc. Intertwine them with some tables and some paragraphs. 3. While the answer streams, observe duplicated vertical gaps around list items and paragraphs. When the answer finishes, observe the spacing compact. 4. Start this branch with `just c` and submit the same prompt. 5. Confirm each intended blank markdown line occupies one terminal row throughout streaming and that the spacing does not compact or jump when the answer finishes. 6. As a focused regression, verify the sections after the first table, especially loose lists with paragraphs between items; those blank rows should remain stable throughout streaming. Targeted tests: - `just test -p codex-tui streaming_agent_tail_blank_line_uses_one_viewport_row` - `just test -p codex-tui history_cell::tests` ## Test Notes - Verified the exact prompt above in a real tmux TUI using latest Codex and this branch as the before/after comparison. - The full `just test -p codex-tui` run completed 2,782 of 2,784 tests successfully. 
Two unrelated guardian feature-flag tests fail reproducibly in isolation because the expected `OverrideTurnContext` message is absent. - `just argument-comment-lint` is blocked locally by the existing Bazel `compiler-rt` missing-header glob error; the touched Rust diff was inspected manually for opaque positional literals.
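The core of the fix, keeping whitespace-only lines structurally empty while prefixing everything else, can be illustrated with a plain-string sketch. The real code operates on styled `StreamingAgentTailCell` lines; `prefix_tail_line` here is a hypothetical helper that only models the text.

```rust
/// Apply the message prefix to a streamed line, but keep whitespace-only
/// lines empty so they are measured as a single viewport row.
/// (Hypothetical plain-string model of the styled-line logic.)
pub fn prefix_tail_line(line: &str, prefix: &str) -> String {
    if line.trim().is_empty() {
        String::new()
    } else {
        format!("{prefix}{line}")
    }
}
```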
Felipe Coury ·
2026-06-05 14:33:31 -03:00 -
Render code comment directives in TUI replay (#26554)
## Summary Resumed Codex App or VS Code review sessions can contain `::code-comment` directives that the TUI previously displayed verbatim because only rich clients interpret them. This change rewrites valid line-start directives into readable Markdown during assistant-message parsing, using the session working directory for relative file paths. The fallback is applied consistently to live messages, replayed transcripts, and resume previews while preserving malformed directives and existing `::git-*` parsing. ## Before The TUI exposed the raw client directive: ```text ::code-comment{title="Fix body= parsing" body="Keep role=\"tab\", ::git-stage{cwd=/tmp}, file=, and \n literal." file="/repo/src/app.ts" start=10 end=12 priority="P2"} ``` ## After The same directive is rendered as readable review feedback: ```text - [P2] Fix body= parsing — src/app.ts:10-12 Keep role="tab", ::git-stage{cwd=/tmp}, file=, and \n literal. ``` Fixes #25658
Eric Traut ·
2026-06-05 08:34:34 -07:00 -
Fix `/goal` usage text for control commands (#26551)
## Why The TUI's `/goal` usage text only advertised the objective form even though `/goal clear`, `/goal edit`, `/goal pause`, and `/goal resume` are implemented. This made the lifecycle controls difficult to discover and allowed the duplicated help text to drift from actual behavior. Fixes #25530. ## What changed - Show the complete `/goal [<objective>|clear|edit|pause|resume]` syntax in usage messages. - Share one usage string across slash-command dispatch and goal-related app messages. - Add inline snapshot coverage for the control-command usage path.
Eric Traut ·
2026-06-05 08:32:53 -07:00 -
Surface TUI config write error causes (#26537)
## Summary TUI config writes currently wrap app-server failures with local context like `config/batchWrite failed in TUI`, but several user-visible paths only render the outer error. That hides the actionable app-server message, such as validation constraints or read-only `CODEX_HOME` failures, leaving users with a dead-end diagnostic. This change adds a small formatter next to the TUI config write helpers that renders the error source chain, then uses it for model persistence, feature persistence, project trust, status line writes, hook trust, and hook enablement. Fixes #26077
Eric Traut ·
2026-06-05 08:32:07 -07:00 -
[codex] Forward turn moderation metadata through app-server (#25710)
## Why First-party backends can supply turn-scoped moderation metadata that app-server clients need for client-side presentation. Exposing this as an experimental typed notification lets opted-in clients consume it without interpreting raw Responses API events. ## What changed - forward `response.metadata.openai_chatgpt_moderation_metadata` from Responses API SSE and WebSocket streams as turn-scoped moderation metadata - emit the experimental app-server v2 `turn/moderationMetadata` notification with `{ threadId, turnId, metadata }` - add app-server integration coverage for the typed moderation metadata notification ## Testing - `just test -p codex-core build_ws_client_metadata_includes_window_lineage_and_turn_metadata` - `just test -p codex-core` (fails locally: 46 failures and 1 timeout, primarily missing `test_stdio_server` and shell snapshot timeouts) - `just test -p codex-app-server-protocol` - `just test -p codex-app-server turn_moderation_metadata_emits_typed_notification_v2` - `just test -p codex-app-server` (fails locally: 792 passed, 10 failed, and 5 timed out; failures are in existing environment-sensitive tests, primarily because nested macOS `sandbox-exec` is not permitted) - `just write-app-server-schema --experimental --schema-root /tmp/codex-app-server-schema-experimental`
carlc-oai ·
2026-06-05 02:41:06 -07:00 -
[codex] Expose unavailable app templates in plugin detail (#26317)
## Summary - Adds `unavailable_app_templates` to the app-server protocol and generated schemas/types. - Parses plugin-service `release.unavailable_app_templates` in the remote plugin client. - Maps remote unavailable templates into app-server `PluginDetail`. - Defaults local plugins to an empty unavailable app template list. ## Validation - `just write-app-server-schema` - `cargo +1.95.0 fmt --manifest-path codex-rs/Cargo.toml --all --check` - `cargo +1.95.0 test --manifest-path codex-rs/Cargo.toml -p codex-app-server-protocol schema_fixtures` - `cargo +1.95.0 check --manifest-path codex-rs/Cargo.toml -p codex-app-server-protocol -p codex-core-plugins -p codex-app-server` - `git diff --check` Note: default `cargo check` uses rustc 1.89 locally and failed because dependencies require newer Rust, so validation was rerun with installed Rust 1.95.
charlesgong-openai ·
2026-06-04 23:42:27 +00:00 -
[codex] Use model-advertised reasoning effort order (#26446)
## Summary - preserve the model catalog order for app-server `supportedReasoningEfforts` and document that client contract - render TUI reasoning choices in the advertised order - step reasoning shortcuts by adjacent list position instead of deriving order from known effort names - anchor unsupported configured values to the advertised default, or the first option when needed - remove canonical effort ordering helpers and the unused upgrade effort mapping ## Validation - `just fmt` - Local tests and compilation were not run per request; relying on CI. Stacked on #26444.
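Stepping by adjacent list position, rather than by a canonical effort ordering, might look roughly like this. The names `anchor_index` and `step_effort` are hypothetical, and two behaviors are assumptions of this sketch rather than facts from the PR: stepping clamps at the ends of the list, and an unsupported value is first anchored and then stepped from that anchor.

```rust
/// Find where to anchor the current value in the advertised list: its own
/// position, else the advertised default, else the first option.
pub fn anchor_index(advertised: &[&str], current: &str, default: Option<&str>) -> Option<usize> {
    if advertised.is_empty() {
        return None;
    }
    let position = |name: &str| advertised.iter().position(|e| *e == name);
    position(current)
        .or_else(|| default.and_then(|d| position(d)))
        .or(Some(0))
}

/// Step `delta` positions through the list exactly as the model advertised
/// it, without deriving any order from known effort names.
pub fn step_effort<'a>(
    advertised: &[&'a str],
    current: &str,
    default: Option<&str>,
    delta: isize,
) -> Option<&'a str> {
    let index = anchor_index(advertised, current, default)?;
    let last = advertised.len() - 1;
    // Clamp at both ends (assumed; wrapping would be another valid choice).
    let next = index.saturating_add_signed(delta).min(last);
    Some(advertised[next])
}
```

Because the function never inspects the effort names, a model-defined value such as a hypothetical `"turbo"` steps like any built-in one.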
Ahmed Ibrahim ·
2026-06-04 14:01:14 -07:00 -
[codex] Support model-defined reasoning efforts (#26444)
## Summary - accept non-empty model-defined reasoning effort values while preserving built-in effort behavior - propagate the non-Copy effort type through core, app-server, TUI, telemetry, and persistence call sites - preserve string wire encoding and expose an open-string schema for clients - update model selection and shortcut behavior for model-advertised effort values ## Root cause `ReasoningEffort` gained a string-backed custom variant, so it could no longer implement `Copy` or rely on derived closed-enum serialization. Existing consumers still moved effort values from shared references and assumed a fixed built-in value set. ## Validation - `just fmt` - Local tests and compilation were not run per request; relying on CI.
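To illustrate the root cause, the sketch below shows why a string-backed variant rules out `Copy` and how string wire encoding can be preserved by hand. It is a simplified stand-in, not the project's `ReasoningEffort` definition: the variant set is reduced and the `FromStr`/`Display` implementations are assumptions about one reasonable encoding.

```rust
use std::fmt;
use std::str::FromStr;

/// Simplified stand-in: `Custom(String)` owns heap data, so the enum can
/// derive `Clone` but can no longer be `Copy`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ReasoningEffort {
    Low,
    Medium,
    High,
    Custom(String),
}

impl FromStr for ReasoningEffort {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "" => Err("reasoning effort must be non-empty".to_string()),
            "low" => Ok(Self::Low),
            "medium" => Ok(Self::Medium),
            "high" => Ok(Self::High),
            // Any other non-empty value is accepted as model-defined.
            other => Ok(Self::Custom(other.to_string())),
        }
    }
}

impl fmt::Display for ReasoningEffort {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Self::Low => "low",
            Self::Medium => "medium",
            Self::High => "high",
            Self::Custom(value) => value,
        })
    }
}
```

Call sites that previously moved an effort out of a shared reference now need an explicit `.clone()`, which is the propagation work the summary describes.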
Ahmed Ibrahim ·
2026-06-04 13:36:24 -07:00 -
Felipe Coury ·
2026-06-03 20:30:15 -03:00 -
Felipe Coury ·
2026-06-03 20:28:31 -03:00 -
Fix multiline paste in /goal edit (#26047)
Fixes #26025. ## Why `/goal edit` opens `CustomPromptView`, which did not use the paste-burst handling that protects the main composer when terminals deliver paste as rapid key events. On Windows terminals, the first pasted newline could be treated as Enter-to-submit, truncating the goal edit and leaving the rest of the paste behind. ## What This reuses `PasteBurst` in `CustomPromptView` as a lightweight Enter-suppression detector for paste-like key streams. Characters still insert directly, explicit paste still goes through the view paste path, and ordinary text entry still submits on Enter.
Eric Traut ·
2026-06-03 09:36:50 -07:00 -
Switch runtime to cloud config bundle (#24622)
## Summary - Adapts the moved `codex-cloud-config` crate from the legacy cloud requirements endpoint to the new config bundle endpoint. - Switches runtime consumers from `CloudRequirementsLoader` to `CloudConfigBundleLoader` so one shared bundle supplies cloud-delivered config and requirements. - Removes the legacy cloud requirements domain loader path. ## Details This intentionally keeps `codex-cloud-config` monolithic for review lineage: the previous PR establishes the crate move, and this PR shows the behavior change against that moved implementation. A follow-up PR splits the module back into focused files. The new bundle path preserves the important cloud requirements loader semantics where intended: account-scoped signed cache, 30 minute TTL, 5 minute refresh cadence, retry/backoff, auth recovery, and fail-closed startup loading. The cached payload changes from a single requirements TOML string to the backend-delivered bundle, and validation rejects malformed config or requirements fragments before cache write/use.
joeflorencio-openai ·
2026-06-02 13:18:59 -07:00 -
Propagate permission approval environment id (#25862)
## Stack 1. #25850 - Key request-permission grants by environment: stores and applies sticky permission grants per environment id. 2. #25858 - Add `environmentId` to `request_permissions`: lets the model target a selected environment and resolves relative permission paths against it. 3. This PR (#25862) - Propagate permission approval environment id: carries the selected environment id through approval events, app-server requests, TUI prompts, and delegate forwarding. 4. #25867 - Add remote request permissions integration coverage: verifies the selected remote environment across request, approval, grant reuse, and exec. This PR is stacked on #25858, and #25867 is stacked on this PR. ## Why PR2 lets the model bind a `request_permissions` call to a selected environment, but the approval event and client-facing request still needed to carry that binding. For CCA, the user-facing prompt and delegated approval path should know which environment the grant applies to instead of relying on cwd alone. ## What Changed - Added optional `environmentId` to `RequestPermissionsEvent`. - Emit the selected environment id from core permission approval events. - Preserve the environment id through delegate forwarding, including cwd-based delegated requests. - Added `environmentId` to app-server permission approval params, generated schema/TypeScript artifacts, and README examples. - Preserve and display the environment id in TUI permission approval prompts. - Updated focused core, app-server protocol, and TUI conversion coverage. ## Testing Not run locally per instruction. Performed read-only `git diff --check`.
jif ·
2026-06-02 21:09:34 +02:00 -
Reduce stack pressure in session startup and config rebuilds (#25844)
## Why `/clear` starts a fresh thread with `InitialHistory::Cleared`, which re-enters the thread/session startup path. That path now builds large async futures through `ThreadManagerState::spawn_thread_with_source`, `Codex::spawn`, and `Session::new`. Separately, TUI config rebuilds for cwd and permission-profile changes build a similarly heavy `ConfigBuilder::build()` future inside the app task. In debug and Bazel runs, those call chains can put enough state on the caller stack to abort before startup or config refresh completes. This change keeps the behavior the same while moving the heaviest future frames off the caller stack. ## What changed - Box `Codex::spawn(...)` in `codex-rs/core/src/thread_manager.rs` before awaiting it from `spawn_thread_with_source`. - Box `Session::new(...)` in `codex-rs/core/src/session/mod.rs` before awaiting it from `Codex::spawn_internal`. - Route `ConfigBuilder::build()` through a small `tokio::spawn` helper in `codex-rs/tui/src/app/config_persistence.rs` so cwd and permission-profile config rebuilds run on a runtime worker stack while preserving error context. ## Verification CI is running on the PR. No new targeted tests were added. This is a mechanical stack-pressure reduction that keeps the existing behavior and error propagation intact.
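The boxing technique can be illustrated with a standard-library-only sketch. `heavy_startup` and `spawn_thread` are hypothetical stand-ins for the large startup futures named above, and the 16 KiB buffer is an arbitrary size chosen to make the effect visible; no executor is included, so the futures are only constructed here, not run.

```rust
/// Hypothetical stand-in for a startup future with a large state machine:
/// `scratch` is live across the await, so it is stored inside the future.
pub async fn heavy_startup() -> usize {
    let scratch = [0u8; 16 * 1024];
    std::future::ready(()).await;
    scratch.len()
}

/// Boxing the child future moves its state to the heap, so the awaiting
/// parent holds a pointer rather than embedding the child's state inline.
pub async fn spawn_thread() -> usize {
    Box::pin(heavy_startup()).await
}
```

Comparing `std::mem::size_of_val(&heavy_startup())` with the size of `Box::pin(heavy_startup())` shows the difference: the unboxed future is at least as large as its buffer, while the boxed handle is pointer-sized.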
jif-oai ·
2026-06-02 15:42:47 +02:00 -
feat: show enterprise monthly credit limits in status (#24812)
## Summary Enterprise users can have an effective monthly credit limit, but Codex `/status` currently drops that metadata from the account-usage response. This change adds the optional `spend_control.individual_limit` projection to the existing rate-limit snapshot flow. The backend client reads the monthly limit, app-server exposes it as `individualLimit`, and the TUI renders a `Monthly credit limit` row through the existing progress-bar renderer. When the backend does not return an effective monthly limit, existing rate-limit behavior is unchanged. ## Existing backend state The account-usage backend already returns the effective monthly limit and current usage together: ```json { "spend_control": { "reached": false, "individual_limit": { "limit": "25000", "used": "8000", "remaining": "17000", "used_percent": 32, "remaining_percent": 68, "reset_after_seconds": 86400, "reset_at": 1778137680 } } } ``` Before this change, Codex projected rolling `primary` and `secondary` windows plus `credits`. It ignored `spend_control.individual_limit`, so app-server clients and `/status` could not render the monthly cap. The updated flow is: ```text account usage backend -> backend-client reads spend_control.individual_limit -> existing rate-limit snapshot carries optional individual_limit -> app-server exposes optional individualLimit -> TUI renders Monthly credit limit ``` ## App-server contract `account/rateLimits/read` and sparse `account/rateLimits/updated` notifications now include an additive nullable `rateLimits.individualLimit` field: ```json { "individualLimit": { "limit": "25000", "used": "8000", "remainingPercent": 68, "resetsAt": 1778137680 } } ``` In an `account/rateLimits/read` response, `null` means no monthly limit is available. `account/rateLimits/updated` remains a sparse rolling notification: clients merge available values into their most recent `account/rateLimits/read` snapshot or refetch. 
Nullable account metadata in a rolling notification does not clear a previously observed value. ## Design decisions - Extend the existing rate-limit snapshot instead of introducing a separate request or wire-level update protocol. - Keep the Codex projection narrow: `/status` needs the effective limit, current usage, remaining percentage, and reset timestamp. - Render the monthly row through the existing progress-bar renderer, with one optional detail line for `8,000 of 25,000 credits used`. - Keep the backend response optional so existing accounts and older usage states preserve their current behavior. - Preserve cached monthly metadata when sparse rolling notifications omit it. Live account-usage reads remain authoritative and can clear a removed limit. ## Visual evidence ```text Monthly credit limit: [██████████████░░░░░░] 68% left (resets 07:08 on 7 May) 8,000 of 25,000 credits used ``` Snapshot: `codex-rs/tui/src/status/snapshots/codex_tui__status__tests__status_snapshot_includes_enterprise_monthly_credit_limit.snap` ## Testing Tests: generated app-server schema verification, protocol tests, backend-client tests, app-server integration coverage, TUI snapshot coverage, formatting, and workspace lint cleanup.
efrazer-oai ·
2026-06-01 21:25:42 -07:00 -
Move cloud requirements crate to cloud config (#24621)
## Summary - Moves the existing `codex-cloud-requirements` crate to `codex-cloud-config`. - Updates workspace dependencies and imports to the new crate name. - Intentionally keeps runtime behavior unchanged: this still fetches the legacy cloud requirements endpoint. ## Details This PR exists to make the lineage obvious before the bundle migration. GitHub should show the old `codex-rs/cloud-requirements/src/lib.rs` implementation as moved to `codex-rs/cloud-config/src/lib.rs`, rather than as unrelated new code. The follow-up PR adapts this moved crate to the new config bundle API and switches runtime consumers over.
joeflorencio-openai ·
2026-06-01 16:43:52 -07:00 -
app-server: remove experimental persist_extended_history bool flag (#25712)
## Summary Remove the dead experimental `persistExtendedHistory` app-server flag and collapse rollout persistence to the single policy app-server already used. ## What Changed - Removed `persistExtendedHistory` from v2 thread start/resume/fork params and deleted its deprecation notice path. - Removed the persistence-mode enums and plumbing through core, rollout, and thread-store. - Made rollout filtering mode-free, keeping the existing limited persisted-history behavior. ## Test Plan - `just write-app-server-schema` - `cargo nextest run --no-fail-fast -p codex-app-server-protocol schema_fixtures` - `cargo nextest run --no-fail-fast -p codex-app-server thread_shell_command_history_responses_exclude_persisted_command_executions` - `cargo nextest run --no-fail-fast -p codex-rollout -p codex-thread-store` - final `rg` for removed flag/type names
Owen Lin ·
2026-06-01 23:33:42 +00:00 -
fix(tui): clarify footer shortcut overlay hints (#25625)
## Why

The TUI shortcut overlay used static labels for `Tab` and `Ctrl+C`, even though both keys change behavior while a task is running. That made the visible help misleading: idle `Tab` submits rather than queues, and active-turn `Ctrl+C` interrupts rather than exits.

Closes #25531. Closes #25564.

## What Changed

- Pass task-running state into the shortcut overlay renderer.
- Render `Tab` as `submit message` while idle and `queue message` while work is running.
- Render `Ctrl+C` as `exit` while idle and `interrupt` while work is running.
- Add snapshot coverage for the active-work shortcut overlay and update idle overlay snapshots.

## How to Test

1. Start Codex and open the shortcut overlay with `?` while no task is running.
2. Confirm the overlay shows `tab to submit message` and `ctrl + c to exit`.
3. Start a task, then open or keep the shortcut overlay visible while work is running.
4. Confirm the overlay shows `tab to queue message` and `ctrl + c to interrupt`.
5. Type a follow-up prompt during active work and press `Tab`; confirm it queues rather than submitting immediately.

Targeted tests:

- `just test -p codex-tui footer_snapshots`
- `just test -p codex-tui footer_mode_snapshots`

## Validation Notes

`just test -p codex-tui` currently has two unrelated guardian feature-flag test failures on this base:

- `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
- `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`

`just argument-comment-lint codex-rs/tui/src/bottom_pane/footer.rs` could not run locally because the prebuilt wrapper requires `dotslash`; the touched Rust diff was manually inspected for opaque positional literals.
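The state-dependent labels described above amount to a small lookup on the key and the task-running flag. The sketch below is illustrative only: `ShortcutKey` and `shortcut_hint` are hypothetical names, not the renderer's real API. The hint strings are the ones listed in the test steps.

```rust
/// Keys whose overlay hint depends on whether a task is running.
#[derive(Clone, Copy)]
enum ShortcutKey {
    Tab,
    CtrlC,
}

/// Hypothetical helper: choose the overlay hint for `key`.
fn shortcut_hint(key: ShortcutKey, task_running: bool) -> &'static str {
    match (key, task_running) {
        // Idle: Tab submits and Ctrl+C exits.
        (ShortcutKey::Tab, false) => "tab to submit message",
        (ShortcutKey::CtrlC, false) => "ctrl + c to exit",
        // Active turn: Tab queues and Ctrl+C interrupts.
        (ShortcutKey::Tab, true) => "tab to queue message",
        (ShortcutKey::CtrlC, true) => "ctrl + c to interrupt",
    }
}
```

Matching on the `(key, task_running)` pair keeps all four combinations in one place, so the compiler flags a missing case if another state-dependent key is added.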
Felipe Coury ·
2026-06-01 19:41:22 -03:00 -
Add reasoning-only status surface item (#25504)
Closes #24886.

## Why

Users can configure the TUI status line and terminal title with `model-with-reasoning`, but issue #24886 asks for a compact reasoning-only item. That lets a setup show just `default`, `low`, `medium`, `high`, or `xhigh` without repeating the model name.

## What changed

- Added a `reasoning` item for `/statusline` and `/title` setup flows.
- Rendered the item from the effective reasoning effort, including collaboration-mode overrides.
- Registered `reasoning` with `codex doctor` so Codex-generated terminal-title config is not reported as invalid.
- Updated TUI setup snapshots so the picker previews include the new item.
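One way to read "effective reasoning effort, including collaboration-mode overrides" is sketched below. The names `ReasoningEffort` and `reasoning_label` are hypothetical. The sketch also makes two assumptions the description does not confirm: that a collaboration-mode override takes precedence over the configured effort, and that `default` is shown when neither is set.

```rust
/// Hypothetical effort levels, matching the labels listed in the description.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
    XHigh,
}

/// Hypothetical helper: label for the reasoning-only status item.
/// Assumes an override wins over the configured value.
fn reasoning_label(
    configured: Option<ReasoningEffort>,
    mode_override: Option<ReasoningEffort>,
) -> &'static str {
    match mode_override.or(configured) {
        // Assumption: no explicit effort anywhere renders as "default".
        None => "default",
        Some(ReasoningEffort::Low) => "low",
        Some(ReasoningEffort::Medium) => "medium",
        Some(ReasoningEffort::High) => "high",
        Some(ReasoningEffort::XHigh) => "xhigh",
    }
}
```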
Eric Traut ·
2026-06-01 09:30:20 -07:00 -
Reset slash popup selection when filter changes (#25492)
## Summary

Fixes #25295.

The slash-command popup reused its previous `ScrollState` when the composer filter token changed. After scrolling the full `/` command list, typing a narrower filter such as `/st` could clamp the stale selection into the filtered results and highlight the wrong command.

This resets the popup selection and viewport only when the parsed filter token changes, so normal arrow navigation is preserved while new filters start at the first match.
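The fix described above can be sketched as a comparison between the previous and the newly parsed filter token. This is a simplified illustration, not the TUI's code: only the `ScrollState` name comes from the description, while its fields, `CommandPopup`, `parse_filter_token`, and `on_composer_text_changed` are assumed for the example.

```rust
/// Simplified stand-in for the popup's scroll state (fields are assumed).
#[derive(Debug, Default, PartialEq)]
struct ScrollState {
    selected_idx: Option<usize>,
    scroll_top: usize,
}

/// Hypothetical popup that remembers the last filter token it rendered.
struct CommandPopup {
    filter: String,
    state: ScrollState,
}

/// Extract the token after the leading `/`, e.g. "/st foo" yields "st".
fn parse_filter_token(line: &str) -> &str {
    line.strip_prefix('/')
        .unwrap_or("")
        .split_whitespace()
        .next()
        .unwrap_or("")
}

impl CommandPopup {
    /// Reset selection and viewport only when the parsed token changed.
    fn on_composer_text_changed(&mut self, first_line: &str, match_count: usize) {
        let token = parse_filter_token(first_line);
        if token != self.filter.as_str() {
            self.filter = token.to_string();
            self.state = ScrollState {
                // Start at the first match, or select nothing if there are none.
                selected_idx: if match_count > 0 { Some(0) } else { None },
                scroll_top: 0,
            };
        }
        // Same token: keep the state so arrow navigation is preserved.
    }
}
```

Comparing tokens rather than the raw composer text means edits after the command name do not disturb the current selection.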
Eric Traut ·
2026-06-01 09:17:19 -07:00 -
Allow paste in searchable selection menus (#25400)
## Summary

I frequently want to paste into the searchable menu. The most common use case is specifying an upstream for a `/review`, where I copy the upstream from an open terminal.
Charlie Marsh ·
2026-06-01 18:01:52 +02:00 -
feat(tui): restore output-free cancelled prompts (#25316)
## TL;DR

When you press `Esc` or `Ctrl+C` after sending a prompt but before any output has rendered, the TUI restores the composer along with the submitted message.

## Summary

Cancelling a prompt immediately after submission should behave like returning to edit that prompt, not like discarding the user's draft. Today, pressing `Esc` or `Ctrl+C` before Codex responds leaves the submitted prompt in the transcript and returns an empty composer, forcing the user to recall or retype it.

When an interrupted turn has not produced substantive visible output, restore its submitted prompt directly into the composer and roll back that latest turn. This also covers the first prompt in a fresh thread, before the TUI has retained a local user-history cell. The restored draft keeps its text, image attachments, and active collaboration mode so it can be edited and resubmitted in place.

Restoration is intentionally suppressed once the turn has produced user-visible activity such as assistant output, tool work, hooks, or patches. A transient thinking status does not make the prompt ineligible. Rollback also rebuilds terminal scrollback from the retained transcript cells so repeated cancellations and terminal resizes do not duplicate history.

## How to Test

1. Start the TUI with `cargo run -p codex-cli --bin codex`.
2. In a fresh thread, submit the first prompt and press `Esc` before Codex emits substantive output. Confirm that the prompt returns to the composer for editing and its submitted transcript row is removed.
3. Repeat with `Ctrl+C`, then repeat after at least one completed turn. Confirm the same behavior.
4. Submit a prompt, wait for assistant output or tool activity, then cancel. Confirm that the transcript remains intact and the prompt is not restored into the composer.
5. Cancel several output-free prompts and resize the terminal between attempts. Confirm that the startup banner, tip, and transcript history do not duplicate in scrollback.

Targeted tests:

- `just test -p codex-tui cancelled_turn_edit_restores_prompt`
- `just test -p codex-tui output_free_interrupted_turn_requests_prompt_restore`
- `just test -p codex-tui visible_output_prevents_cancelled_turn_prompt_restore`
- `just test -p codex-tui thinking_status_keeps_cancelled_turn_prompt_restore_eligible`
- `just test -p codex-tui patch_activity_prevents_cancelled_turn_prompt_restore`

The full `just test -p codex-tui` run completed with `2746` passing tests and two unrelated existing guardian feature-flag failures. `just argument-comment-lint` remains blocked locally by the existing Bazel LLVM `compiler-rt` sanitizer-header glob failure; the touched Rust diff was manually audited for positional literal comments.
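The eligibility rule described in the summary can be sketched as a predicate over what an interrupted turn produced. `TurnActivity` and `should_restore_prompt` are hypothetical names used only for illustration. The variants mirror the activity kinds named in the description and are not meant to reflect the TUI's actual event types.

```rust
/// Hypothetical classification of what a turn produced before cancellation.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TurnActivity {
    /// Transient "thinking" status; does not count as visible output.
    ThinkingStatus,
    AssistantOutput,
    ToolWork,
    Hook,
    Patch,
}

impl TurnActivity {
    /// Everything except the transient thinking status is user-visible.
    fn is_user_visible(self) -> bool {
        !matches!(self, TurnActivity::ThinkingStatus)
    }
}

/// The prompt is restored only if the turn produced no user-visible activity.
fn should_restore_prompt(activity: &[TurnActivity]) -> bool {
    !activity.iter().any(|a| a.is_user_visible())
}
```

Under this sketch, an empty activity list or one containing only `ThinkingStatus` remains eligible for restoration, while a single patch or tool event suppresses it.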
Felipe Coury ·
2026-06-01 11:49:14 -03:00