mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
470c20bf98ce35c8bcaa021e722f0a7074c4530b
7193 Commits
-
protocol: remove submission-side serde from Op (#26674)
## Why Submission-side `Op` payloads are now an internal handoff inside the Rust codebase, so keeping a stable serde contract there adds complexity without a real wire consumer. ## What changed - remove serde/schema annotations from `Submission`, `Op`, and submission-only payload types like thread settings overrides, additional context, realtime conversation params, `TurnEnvironmentSelection`, and `RequestUserInputResponse` - delete the `Op` serialization tests and the now-unused double-option prompt serde helper - keep event/API-facing serialization where it is still required, and serialize the `request_user_input` tool output from its wire payload instead of the core response struct - update `protocol_v1.md` to call out that events remain the serialized transport surface while submission payloads are implementation details ## Testing - `just test -p codex-protocol` - `cargo check -p codex-core -p codex-app-server -p codex-thread-store` - `just test -p codex-core request_user_input`
pakrym-oai ·
2026-06-05 15:41:13 -07:00 -
[2 of 2] Finish moving goal runtime to extension (#26548)
## Stack 1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align goal extension with core behavior 2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move goal runtime to extension ## Why This PR completes the switch of the goal behavior to the extension-backed runtime and removes the old core goal implementation. ## What Changed - Installs the goal extension for app-server `ThreadManager` sessions. - Routes app-server thread goal `get`, `set`, and `clear` through `GoalService`. - Uses thread-idle lifecycle emission after goal resume and snapshot ordering so the extension can decide whether to continue the goal. - Forwards extension goal updates through a FIFO async app-server notification path so backpressure does not drop them or reorder updates. - Keeps review turns from enabling goal runtime behavior. - Plans extension tools before dynamic tools so built-in goal tool names keep their old precedence when goals are enabled. - Removes the old core goal runtime, core goal tool handlers, and core goal tool specs. - Updates tests that were coupled to the core-owned goal runtime while leaving the legacy `<goal_context>` compatibility path in core for old threads. - Removes the stale cargo-shear ignore now that `codex-goal-extension` is used by the workspace. - Keeps realtime event matching exhaustive after removing the old goal-specific realtime text path. ## Validation - Ran manual `/goal` runs in TUI. Validated time accounting matched wall-clock time and goal lifecycle state transitions.
Eric Traut ·
2026-06-05 14:17:30 -07:00 -
[codex] Bound WSL local curated discovery (#26669)
## Context The installed-app suggestion expansion added in #24996 reads plugin details for trusted file-backed marketplace candidates because the list response does not include app ids. On Windows-backed WSL mounts, the local `openai-curated` checkout lives under `$CODEX_HOME/.tmp/plugins`, and those per-plugin detail reads can be very slow. Remote curated already has cached app ids, so it does not need the same local filesystem traversal. ## Summary - Keep only the WSL Windows-backed local `openai-curated` checkout on the legacy fallback/configured discovery path. - Preserve installed-app expansion for non-WSL file-backed marketplaces and remote curated. - Add focused tests for the WSL local curated path predicate. ## Test - `just test -p codex-core-plugins discoverable` - `just test -p codex-core plugins::discoverable::tests`
xl-openai ·
2026-06-05 14:09:40 -07:00 -
Add JSON output for plugin subcommands (#26631)
## Summary - Follow-up to #25330 and #26417 - Add `--json` output for `codex plugin add` and `codex plugin remove` - Add `--json` output for `codex plugin marketplace add/list/upgrade/remove` - Keep existing human-readable output unchanged - Keep existing error handling/stderr behavior unchanged; `--json` changes successful stdout output only - Align marketplace add/remove JSON field names with the existing app-server protocol shape - Add CLI coverage for plugin and marketplace JSON outputs ## Validation - `just fmt` - `just fix -p codex-cli` - `just test -p codex-cli`
mpc-oai ·
2026-06-05 14:40:31 -05:00 -
Speed up TUI startup by reusing plugin discovery (#26469)
## Summary TUI startup loads related plugin data from `hooks/list`, session MCP initialization, and plugin skill warmup. These paths repeated filesystem discovery and emitted the same plugin warnings, while `hooks/list` and account/model bootstrap ran serially. This change: - Reuses one immutable plugin load outcome across startup consumers. - Keys the cache only on plugin-relevant configuration. - Single-flights concurrent plugin loads and prevents invalidated loads from repopulating the cache. - Runs hook discovery and account/model bootstrap concurrently. - Preserves configuration-migration ordering, hook review behavior, and accurate startup telemetry. In 10 alternating release-build launches in the Ruff repository with the existing `~/.codex` configuration, median time to the first editable composer decreased from 833ms to 504ms. The branch was faster in 9 of 10 pairs, with a paired median improvement of 312ms.
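Illustration for #26469 (not the PR's code): a minimal single-flight cache sketch, assuming the plugin-relevant configuration can be reduced to a string key. `PluginLoadOutcome`, `PluginCache`, `get_or_load`, and `invalidate` are hypothetical names. It relies on `std::sync::OnceLock`, which runs only one initializer per slot; dropping a slot on invalidation means a load still in flight finishes into the detached slot and cannot repopulate the cache.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for an immutable plugin load outcome.
#[derive(Debug)]
pub struct PluginLoadOutcome {
    pub plugin_names: Vec<String>,
}

/// Hypothetical cache keyed only on plugin-relevant configuration.
#[derive(Default)]
pub struct PluginCache {
    slots: Mutex<HashMap<String, Arc<OnceLock<Arc<PluginLoadOutcome>>>>>,
}

impl PluginCache {
    /// Returns the shared outcome for `key`, running `load` at most once per slot.
    pub fn get_or_load(
        &self,
        key: &str,
        load: impl FnOnce() -> PluginLoadOutcome,
    ) -> Arc<PluginLoadOutcome> {
        // Hold the map lock only long enough to find or create the slot.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            Arc::clone(slots.entry(key.to_string()).or_default())
        };
        // Concurrent callers block here until the first initializer finishes.
        Arc::clone(slot.get_or_init(|| Arc::new(load())))
    }

    /// Drops the slot; an in-flight load completes into the detached slot only.
    pub fn invalidate(&self, key: &str) {
        self.slots.lock().unwrap().remove(key);
    }
}
```

Startup consumers that share one `PluginCache` would then see the same `Arc` and a single filesystem discovery pass per key.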
Charlie Marsh ·
2026-06-05 15:32:43 -04:00 -
Use state DB first for `resume --last` (#26462)
## Summary `codex resume --last` currently lists sessions by updated time using scan-and-repair. Updated-time filesystem listing must stat every rollout before applying the cwd, provider, and source filters, so startup scales with the entire local session history... This change queries the state DB first for the latest matching session. For local workspaces, we only accept the indexed result when its rollout path still exists; otherwise we retry with scan-and-repair. The same lookup path is shared by `fork --last`. I benchmarked the same `thread/list` request used by `resume --last` in my local `ruff` checkout against a Codex home with 2,599 active rollouts totaling 3.7 GiB, including 90 Ruff threads. Across five fresh release app-server processes with warm filesystem caches, the state-DB-only lookup had median latency of 0.37-0.44 ms, while scan-and-repair had median latency of 139-162 ms. First-request latency was 0.7-1.7 ms versus 142-185 ms. So this **removes roughly 140-160 ms from the `resume --last` lookup** on this machine, and makes that lookup over 300x faster. The tradeoff is that this does leave two correctness gaps: - If a newer matching rollout is missing from SQLite but an older matching row exists, the fast path resumes the older thread and never falls back to the filesystem scan. - If an existing row has stale filter or ordering metadata, the fast path can select a different thread from scan-and-repair. The rollout tests already demonstrate this for stale cwd metadata: state-DB-only returns the stale match, while scan-and-repair removes and repairs it. So you could end up seeing the "wrong" result in cases like... 1. A crash or SQLite error occurs between Codex writing the conversation file and updating SQLite, leaving the newer file unindexed. 2. An older Codex version, restore, or manual copy adds a conversation file after SQLite’s one-time backfill completed.
These seem pretty rare though (and sessions can always be recovered via other mechanisms -- `--last` is just a convenience feature), and I think the tradeoffs are good here?
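Illustration for #26462 (not the PR's code): a minimal sketch of the state-DB-first lookup, assuming the indexed row and the scan are supplied by the caller. `SessionRef`, `latest_session`, and both closures are hypothetical; falling back to the scan when the DB has no row at all is also an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical reference to a stored session.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SessionRef {
    pub thread_id: String,
    pub rollout_path: PathBuf,
}

/// Accepts the indexed row only when its rollout file still exists;
/// otherwise retries with the slower scan-and-repair listing.
pub fn latest_session(
    indexed: Option<SessionRef>,
    rollout_exists: impl Fn(&Path) -> bool,
    scan_and_repair: impl FnOnce() -> Option<SessionRef>,
) -> Option<SessionRef> {
    match indexed {
        // Fast path: trust the state DB without checking for newer unindexed files.
        Some(row) if rollout_exists(&row.rollout_path) => Some(row),
        // Missing row (assumed) or missing rollout file: fall back to the filesystem.
        _ => scan_and_repair(),
    }
}
```

The first correctness gap described above is visible here: an older indexed row whose file exists wins, and the scan never runs.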
Charlie Marsh ·
2026-06-05 14:58:09 -04:00 -
Make runtime workspace roots absolute in app-server API (#26552)
Stacked on #26532. ## Why #26532 moves cwd normalization to the app-server/core boundary. `runtimeWorkspaceRoots` still accepted raw paths in v2 requests and in `ConfigOverrides`, which left core responsible for interpreting those roots later. This makes runtime workspace roots follow the same absolute-path boundary as cwd. ## What - Change v2 `runtimeWorkspaceRoots` request fields for `thread/start`, `thread/resume`, `thread/fork`, and `turn/start` to `AbsolutePathBuf`. - Deduplicate already-absolute runtime roots in app-server handlers and pass them through `ConfigOverrides.workspace_roots` as `AbsolutePathBuf`. - Update TUI and exec client request builders to pass absolute runtime roots directly. - Update app-server docs, schema fixtures, and focused tests for absolute runtime roots. ## Testing - `just test -p codex-app-server-protocol` - `just test -p codex-app-server runtime_workspace_roots` - `just test -p codex-core session_permission_profile_rebinds_runtime_workspace_roots` - `just test -p codex-tui app_server_session` - `just test -p codex-exec`
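Illustration for #26552 (not the PR's code): a minimal sketch of deduplicating runtime roots at the boundary, using plain `PathBuf` in place of `AbsolutePathBuf`. `dedupe_absolute_roots` is a hypothetical helper, and rejecting relative input with `Err` is an assumption about how a handler might respond.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Keeps the first occurrence of each root, preserving order.
/// Returns the offending path if any root is not absolute.
pub fn dedupe_absolute_roots(roots: Vec<PathBuf>) -> Result<Vec<PathBuf>, PathBuf> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for root in roots {
        // `is_absolute` is platform-specific: "/a" is absolute on Unix only.
        if !root.is_absolute() {
            return Err(root);
        }
        if seen.insert(root.clone()) {
            unique.push(root);
        }
    }
    Ok(unique)
}
```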
pakrym-oai ·
2026-06-05 11:36:53 -07:00 -
[codex] Add turn profiling analytics (#26484)
## Summary Add flat profiling fields to `codex_turn_event` so analytics can explain where turn wall-clock time is spent without changing tool execution behavior. The profile reports: - time before the first sampling request - sampling time across all attempts and follow-ups - overhead between sampling requests - time blocked in the post-sampling tool drain - time after the final sampling request - sampling request and retry counts ## Implementation - Extend the existing turn timing state with constant-memory phase accounting and one RAII phase guard. - Observe sampling and the existing post-sampling drain only at turn orchestration boundaries. - Keep tool runtime, tool futures, response item handling, and turn lifecycle values unchanged. - Add the profiling fields directly to the existing analytics turn event without changing app-server protocol or rollout persistence. - Use the existing turn `status` to distinguish completed, failed, and interrupted profiles. Exact sampling/tool overlap is intentionally omitted because measuring tool completion accurately would require hooks in the tool execution path. ## Validation - Add app-server end-to-end coverage for a single-sampling turn with no blocking tool work. - Add app-server end-to-end coverage for `request_user_input` blocking followed by a second sampling request. - CI is running on the PR; tests were not executed locally per repository guidance.
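Illustration for #26484 (not the PR's code): a minimal sketch of constant-memory phase accounting with an RAII guard, assuming a single-threaded turn loop. `TurnProfile`, `SamplingGuard`, and the two fields are hypothetical and cover only the sampling phase.

```rust
use std::cell::RefCell;
use std::time::{Duration, Instant};

/// Hypothetical flat profile: fixed-size counters, no per-request allocation.
#[derive(Debug, Default)]
pub struct TurnProfile {
    pub sampling: Duration,
    pub sampling_requests: u32,
}

/// Adds elapsed time to the profile when dropped, even on early return.
pub struct SamplingGuard<'a> {
    profile: &'a RefCell<TurnProfile>,
    started: Instant,
}

impl<'a> SamplingGuard<'a> {
    pub fn start(profile: &'a RefCell<TurnProfile>) -> Self {
        profile.borrow_mut().sampling_requests += 1;
        Self { profile, started: Instant::now() }
    }
}

impl Drop for SamplingGuard<'_> {
    fn drop(&mut self) {
        self.profile.borrow_mut().sampling += self.started.elapsed();
    }
}
```

Because the guard only wraps the orchestration boundary, the code being measured does not need to know it is being timed.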
Ahmed Ibrahim ·
2026-06-05 11:27:10 -07:00 -
[codex] Respect Windows sandbox backend in exec policy (#26307)
## Why Windows managed filesystem permissions can now be backed by a real Windows sandbox. `exec-policy` was still treating the managed read-only policy shape as if there were never a sandbox backend, so benign unmatched commands such as PowerShell directory listings could be rejected with `blocked by policy` even when `windows.sandbox` was enabled. The inverse case still needs to stay conservative: when the Windows sandbox backend is disabled, managed filesystem restrictions are only configuration intent, not an enforced filesystem boundary. That applies to writable-root restricted profiles too, not just read-only profiles. ## What Changed - Thread the effective `WindowsSandboxLevel` into exec-policy approval decisions for shell, unified exec, and intercepted shell exec paths. - Treat managed restricted filesystem profiles as lacking sandbox protection only on Windows when `WindowsSandboxLevel::Disabled`. - Exclude full-disk-write profiles from that no-backend path because they do not rely on filesystem sandbox enforcement. - Remove the cwd-sensitive read-only heuristic and the now-stale cwd plumbing from exec-policy approval contexts. - Add Windows coverage for both enabled-sandbox and disabled-backend behavior, including a writable-root managed profile. ## Validation - Added/updated `exec_policy` coverage for managed filesystem restrictions, full-disk-write exclusion, enabled Windows sandbox behavior, and disabled-backend read-only/writable-root behavior. - `just test -p codex-core exec_policy` — 100 passed, 10 leaky - Empirical local `codex exec` probe with `--sandbox read-only -c 'windows.sandbox="unelevated"'`: PowerShell directory listing completed successfully. - Disabled-backend control with Windows sandbox cleared: the same command was rejected with `blocked by policy`.
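Illustration for #26307 (not the PR's code): a minimal sketch of the predicate described above. `WindowsSandboxLevel::Disabled` comes from the PR text; the `Enabled` variant, `FsProfile`, and `lacks_sandbox_protection` are hypothetical simplifications.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WindowsSandboxLevel {
    Disabled,
    /// Hypothetical catch-all for any enabled backend.
    Enabled,
}

/// Hypothetical shapes of a managed filesystem profile.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FsProfile {
    ReadOnly,
    WritableRoots,
    FullDiskWrite,
}

/// True when managed restrictions are only configuration intent,
/// not an enforced filesystem boundary.
pub fn lacks_sandbox_protection(
    is_windows: bool,
    level: WindowsSandboxLevel,
    profile: FsProfile,
) -> bool {
    is_windows
        && level == WindowsSandboxLevel::Disabled
        // Full-disk-write profiles do not rely on filesystem enforcement.
        && profile != FsProfile::FullDiskWrite
}
```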
iceweasel-oai ·
2026-06-05 11:20:52 -07:00 -
fix(tui): restore cancelled prompt cursor at end (#26457)
## Why Pressing `Esc` on a turn that produced no visible output restores the submitted prompt so the user can keep editing it. That restore path preserved the prompt content, images, and mention bindings, but left the composer cursor at the start of the restored text. The next edit therefore inserted at the beginning instead of continuing from the end of the prompt. ## What Changed - Move the cursor to the end after `BottomPane::set_composer_text_with_mention_bindings` rehydrates a restored draft. - Add test-only cursor accessors so restore tests can assert the composer state directly. - Extend the queued restore regression to assert the restored composer cursor is positioned at `text.len()`. ## How to Test Manual reviewer flow: 1. Start Codex in the TUI. 2. Submit a prompt that will take long enough to interrupt. 3. Press `Esc` before any visible assistant output appears. 4. Confirm the prompt is restored into the composer and the cursor is at the end, so typing appends to the prompt. 5. Repeat with a prompt that includes an attached image or resolved mention and confirm the restored content remains intact. Targeted tests: - `just test -p codex-tui chatwidget::tests::composer_submission::queued_restore_with_remote_images_keeps_local_placeholder_mapping` Lint note: - `just argument-comment-lint` is blocked locally by the existing Bazel `compiler-rt` empty glob failure before analyzing touched code. The touched Rust diff was manually inspected and adds no new opaque positional literal callsites.
Felipe Coury ·
2026-06-05 15:10:13 -03:00 -
fix(tui): Windows composer background (#26181)
## Why On Windows, the TUI could not shade the composer against the terminal background because `terminal_palette::default_colors()` always fell back to `None`. That preserved safety, but it also meant terminals that do support OSC 10/11 default color replies had no path to report their real background color. This keeps the existing fallback behavior for unsupported terminals while allowing capable Windows terminals to report their default foreground/background colors during startup. | Before | After | |---|---| | <img width="1235" height="658" alt="win-before" src="https://github.com/user-attachments/assets/ff756589-fcb3-43de-8f2a-ebc0369b30dd" /> | <img width="1235" height="658" alt="win-after" src="https://github.com/user-attachments/assets/9563ff20-4be5-4608-9414-a2afb647e745" /> | ## What Changed - Moved the OSC 10/11 default color parser in `tui/src/terminal_probe.rs` out of the Unix-only implementation so it can be reused by Windows. - Added a Windows-only bounded OSC 10/11 probe using raw console handles and the existing `windows-sys` dependency. - Added Windows palette caching in `tui/src/terminal_palette.rs` so startup probe results, including `None`, are reused instead of probing again later. - Wired the Windows color probe into TUI startup after the existing non-Unix crossterm cursor and keyboard checks. - Added parser coverage for malformed, partial, and noisy OSC color replies. If the probe fails, times out, receives only one color, or receives malformed data, the cache stores `None` and the composer keeps the current behavior. ## How to Test 1. On Windows, start Codex in a terminal that supports OSC 10/11 default color replies. 2. Open the TUI composer. 3. Confirm the composer/status area is painted using the terminal's reported default background, instead of leaving the background unshaded. 4. Start Codex in a terminal that does not answer OSC 10/11, or otherwise blocks terminal color replies. 5. 
Confirm startup still succeeds and the composer uses the existing fallback behavior. Targeted tests: - `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target just test -p codex-tui terminal_probe` Additional local verification: - `CARGO_TARGET_DIR=/private/tmp/codex-windows-osc-default-colors-target just test -p codex-tui` was run; 2774 tests passed, and two unrelated Guardian feature-flag tests failed reproducibly when isolated. - `just argument-comment-lint` was attempted but blocked by the local Bazel/LLVM `include/sanitizer/*.h` empty glob issue. Touched Rust literal callsites were inspected manually. - `cargo check -p codex-tui --target x86_64-pc-windows-msvc` was attempted after installing the target, but local macOS cross-checking is blocked by missing Windows C SDK headers in native dependencies (`ring`/`aws-lc-sys`). --------- Co-authored-by: Kevin Bond <kbond@openai.com>
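Illustration for #26181 (not the PR's parser): a minimal sketch of decoding an OSC 10/11 color reply in the common xterm form `ESC ] 11 ; rgb:RRRR/GGGG/BBBB` terminated by BEL or ST. `parse_osc_color` and `scale` are hypothetical names; other reply formats are assumed out of scope.

```rust
/// Parses a reply for `slot` (10 = foreground, 11 = background) into 8-bit RGB.
/// Returns `None` for partial, malformed, or mismatched replies.
pub fn parse_osc_color(reply: &str, slot: u8) -> Option<(u8, u8, u8)> {
    let prefix = format!("\x1b]{slot};rgb:");
    // Tolerate leading noise by searching for the prefix anywhere in the input.
    let start = reply.find(&prefix)? + prefix.len();
    let rest = &reply[start..];
    // A reply without a BEL or ESC (start of ST) terminator is treated as partial.
    let end = rest.find(|c: char| c == '\x07' || c == '\x1b')?;
    let mut parts = rest[..end].split('/');
    let r = scale(parts.next()?)?;
    let g = scale(parts.next()?)?;
    let b = scale(parts.next()?)?;
    if parts.next().is_some() {
        return None;
    }
    Some((r, g, b))
}

/// Scales a 1 to 4 digit hex component to the 0..=255 range.
fn scale(hex: &str) -> Option<u8> {
    if hex.is_empty() || hex.len() > 4 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let value = u32::from_str_radix(hex, 16).ok()?;
    let max = (1u32 << (4 * hex.len())) - 1;
    Some((value * 255 / max) as u8)
}
```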
Felipe Coury ·
2026-06-05 11:05:46 -07:00 -
[1 of 2] Align goal extension with core behavior (#26547)
## Stack 1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align goal extension with core behavior 2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move goal runtime to extension ## Why The goal runtime is moving out of `codex-core` and into `codex-goal-extension`. This first PR brings the extension back in line with the current core behavior before the follow-up PR switches app-server sessions over to the extension, so that review can focus on ownership and wiring rather than hidden behavior drift. ## What Changed - Updates the extension `create_goal` and `update_goal` tool schemas/descriptions to match the current core wording for explicit token budgets, blocked-goal audits, resumed blocked goals, and system-owned budget/usage-limit transitions. - Marks `codex-goal-extension` as the live `/goal` extension crate rather than an unwired sketch. - Looks up the live thread before reading goal state for idle continuation, so continuation setup exits early when no live thread can accept the automatic turn.
Eric Traut ·
2026-06-05 10:37:38 -07:00 -
Clean up Rust release workflow (#26335)
## Why PR #26252 moved macOS release signing into the tag-triggered `rust-release` workflow through the protected `codesigning` environment and Azure Key Vault. That leaves the old manual unsigned-build / signed-promotion handoff as dead compatibility scaffolding: it makes the release DAG harder to reason about and keeps paths around that the current release process no longer intends to operate. ## What changed - Remove the manual `workflow_dispatch` inputs and validation for `build_unsigned`, `promote_signed`, and the deprecated `sign_macos` flag. - Drop the `stage-signed-macos` job and the promotion-specific artifact download, re-upload, pruning, and cleanup logic. - Make tag-pushed releases always follow the signed release path: build, sign, package, finalize, publish, and then run downstream release jobs from `release` success. - Remove stale `SIGN_MACOS` / `sign_macos` conditions and outputs, including downstream gates for npm, DotSlash, WinGet, dev website deploy, and `latest-alpha-cli` branch updates. ## Verification - `ruby -e 'require "yaml"; YAML.load_file(ARGV.fetch(0)); puts "yaml ok"' .github/workflows/rust-release.yml` - `git diff --check` - `rg -n "workflow_dispatch|inputs\\.|release_mode|build_unsigned|SIGN_MACOS|outputs\\.sign_macos|sign_macos\\b" .github/workflows/rust-release.yml` returned no matches
Shijie Rao ·
2026-06-05 10:36:14 -07:00 -
feat(app-server): add remote control pairing status RPC (#26450)
## What Exposes the pairing status transport as experimental app-server v2 RPC `remoteControl/pairing/status`. - Adds request/response protocol types for exactly one lookup key: `pairingCode` or `manualPairingCode`, returning `{ claimed }`. - Registers the RPC with `global_shared_read("remote-control-pairing")`. - Wires the method through `MessageProcessor` and `RemoteControlRequestProcessor`. - Validates missing/conflicting pairing-code params as invalid requests. - Documents the RPC in `app-server/README.md`. - Adds processor, protocol export, and JSON-RPC integration coverage for both code paths. ## Why This is the app-server surface the desktop app can poll while the QR/manual pairing modal is active. Depends on https://github.com/openai/codex/pull/26449 Related backend change: https://github.com/openai/openai/pull/990244 ## Verification - `cargo test --manifest-path app-server-protocol/Cargo.toml remote_control` - `cargo test --manifest-path app-server/Cargo.toml remote_control` - `cargo fmt --all --check` - `git diff --check`
hefuc-oai ·
2026-06-05 10:33:56 -07:00 -
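Illustration for #26450 above (not the PR's code): a minimal sketch of the "exactly one lookup key" validation, assuming both params arrive as optional strings. `PairingLookup`, `validate_lookup`, and the error strings are hypothetical.

```rust
/// Hypothetical normalized lookup key for a pairing status request.
#[derive(Debug, PartialEq, Eq)]
pub enum PairingLookup {
    PairingCode(String),
    ManualPairingCode(String),
}

/// Accepts exactly one of the two codes; anything else is an invalid request.
pub fn validate_lookup(
    pairing_code: Option<String>,
    manual_pairing_code: Option<String>,
) -> Result<PairingLookup, &'static str> {
    match (pairing_code, manual_pairing_code) {
        (Some(code), None) => Ok(PairingLookup::PairingCode(code)),
        (None, Some(code)) => Ok(PairingLookup::ManualPairingCode(code)),
        (Some(_), Some(_)) => Err("conflicting pairing-code params"),
        (None, None) => Err("missing pairing-code param"),
    }
}
```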
fix(tui): avoid doubled blank rows while streaming (#26636)
## Summary During assistant-message streaming, blank markdown lines in the transient active tail were prefixed with two spaces. Ratatui measured those whitespace-only lines as two viewport rows, so list- and table-heavy answers showed doubled vertical gaps while streaming and then visibly compacted when finalized into scrollback. - keep whitespace-only `StreamingAgentTailCell` lines structurally empty while preserving nonblank message prefixes - clear impossible hyperlink metadata when normalizing a blank tail line - add an inline snapshot and height regression proving one blank markdown line occupies one viewport row Related to #26618, but fixes a separate live-tail row-height issue rather than stale committed markdown content. ## How to Test Recommended before/after reproduction: 1. Start the latest Codex build without this change. 2. Submit this exact prompt: > Send 20 different lists: bullets vs numbered, simple vs complex with paragraphs in between items, etc. Intertwine them with some tables and some paragraphs. 3. While the answer streams, observe duplicated vertical gaps around list items and paragraphs. When the answer finishes, observe the spacing compact. 4. Start this branch with `just c` and submit the same prompt. 5. Confirm each intended blank markdown line occupies one terminal row throughout streaming and that the spacing does not compact or jump when the answer finishes. 6. As a focused regression, verify the sections after the first table, especially loose lists with paragraphs between items; those blank rows should remain stable throughout streaming. Targeted tests: - `just test -p codex-tui streaming_agent_tail_blank_line_uses_one_viewport_row` - `just test -p codex-tui history_cell::tests` ## Test Notes - Verified the exact prompt above in a real tmux TUI using latest Codex and this branch as the before/after comparison. - The full `just test -p codex-tui` run completed 2,782 of 2,784 tests successfully. 
Two unrelated guardian feature-flag tests fail reproducibly in isolation because the expected `OverrideTurnContext` message is absent. - `just argument-comment-lint` is blocked locally by the existing Bazel `compiler-rt` missing-header glob error; the touched Rust diff was inspected manually for opaque positional literals.
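Illustration for #26636 (not the PR's code): a minimal sketch of the prefixing rule, assuming the two-space message prefix described in the summary. `prefix_tail_line` is a hypothetical helper operating on plain strings rather than styled lines.

```rust
/// Prefixes nonblank lines with two spaces and keeps whitespace-only
/// lines structurally empty so they measure as a single viewport row.
pub fn prefix_tail_line(line: &str) -> String {
    if line.trim().is_empty() {
        String::new()
    } else {
        format!("  {line}")
    }
}
```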
Felipe Coury ·
2026-06-05 14:33:31 -03:00 -
Make turn diff tracker multi-env aware (#26433)
## Why Turn diffs were tracked as one flat set of absolute paths. In multi-environment turns, local and remote environments can report the same path while representing different filesystems, so a single path key can collapse distinct changes or attribute them to the wrong environment. The environment name is **NOT** included in the generated unified diff. This can come later.
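Illustration for #26433 (not the PR's code): a minimal sketch of keying tracked changes by environment and path, so the same absolute path in two environments stays distinct. `TurnDiffTracker` here is a hypothetical simplification that stores one diff string per key.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Hypothetical tracker keyed by (environment name, absolute path).
#[derive(Default)]
pub struct TurnDiffTracker {
    changes: BTreeMap<(String, PathBuf), String>,
}

impl TurnDiffTracker {
    /// Records or replaces the diff for one path in one environment.
    pub fn record(&mut self, env: &str, path: impl Into<PathBuf>, diff: impl Into<String>) {
        self.changes.insert((env.to_string(), path.into()), diff.into());
    }

    /// Looks up the diff for a path within a specific environment.
    pub fn get(&self, env: &str, path: impl Into<PathBuf>) -> Option<&str> {
        self.changes
            .get(&(env.to_string(), path.into()))
            .map(String::as_str)
    }

    pub fn len(&self) -> usize {
        self.changes.len()
    }
}
```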
pakrym-oai ·
2026-06-05 17:31:22 +00:00 -
feat(remote-control): add pairing status transport (#26449)
## What Adds transport support for checking remote-control pairing status against the backend. - Adds the normalized `server/pair/status` backend URL. - Adds backend request/response structs for exactly one lookup key: `pairing_code` or `manual_pairing_code`, returning `{ claimed }`. - Adds `RemoteControlEnrollment::pairing_status` and `RemoteControlHandle::pairing_status`. - Preserves auth refresh/retry behavior and backend error mapping. - Adds transport coverage for pending, claimed, manual-code payloads, token refresh, mapped backend errors, malformed responses, and URL normalization. ## Why Desktop needs a host-authenticated way to poll whether a QR or manual pairing code has been claimed. Related backend change: https://github.com/openai/openai/pull/990244 ## Verification - `cargo test --manifest-path app-server-transport/Cargo.toml remote_control::tests::pairing_tests` - `cargo fmt --all --check` - `git diff --check`
hefuc-oai ·
2026-06-05 10:07:25 -07:00 -
[codex] Add /usr/bin/bash shell fallback (#26538)
## Why Some Linux environments expose `bash` at `/usr/bin/bash` instead of `/bin/bash`. The shell detection fallback list should cover both standard locations once PATH/user-shell probing fails. Stacked on #26480. ## What changed - Add `/usr/bin/bash` to the bash fallback path list in `codex-shell-command`. - Extend shell type detection coverage for `/usr/bin/bash`. - Add AGENTS.md testing guidance to avoid tests for statically defined values and negative tests for removed logic. ## Verification - `just test -p codex-shell-command`
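Illustration for #26538 (not the PR's code): a minimal sketch of probing a fallback list once PATH and user-shell detection have failed. The order of the two paths and the `fallback_bash` helper are assumptions; the existence check is injected so the logic can be exercised without touching the filesystem.

```rust
/// Fallback locations named in the PR text; the ordering is assumed.
const BASH_FALLBACKS: [&str; 2] = ["/bin/bash", "/usr/bin/bash"];

/// Returns the first fallback path for which `exists` reports true.
pub fn fallback_bash(exists: impl Fn(&str) -> bool) -> Option<&'static str> {
    BASH_FALLBACKS.iter().copied().find(|&path| exists(path))
}
```

A real caller might pass `|p| std::path::Path::new(p).is_file()` as the check.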
pakrym-oai ·
2026-06-05 09:38:26 -07:00 -
[codex] Allow socketpair in proxy-routed Linux sandbox (#26625)
## Summary - allow `socketpair(AF_UNIX, ...)` in the proxy-routed Linux seccomp mode - continue denying `socket(AF_UNIX, ...)` so user commands cannot create pathname or abstract Unix sockets - extend the managed-proxy integration test to verify both behaviors ## Root cause `NetworkSeccompMode::ProxyRouted` treated anonymous Unix socket pairs like externally addressable Unix sockets and returned `EPERM`. This breaks tools that use socket pairs for local child-process IPC even though a socket pair cannot connect outside the sandbox or bypass the routed proxy. `dangerously_allow_all_unix_sockets` controls Unix-socket requests forwarded by the managed network proxy; it does not currently configure the Linux seccomp filter. Socket pairs should not require that dangerous setting because they are unnamed, process-local IPC. Related but independent: #26553 fixes host proxy bridge socket path length handling. --------- Co-authored-by: Codex <noreply@openai.com>
viyatb-oai ·
2026-06-05 09:34:36 -07:00 -
Require absolute cwd in thread settings (#26532)
## Why Thread settings cwd overrides are expected to be resolved before they enter core. Keeping this boundary as a plain `PathBuf` made it easy for core/session code to keep fallback normalization and relative-path resolution logic in places that should only receive an already-resolved cwd. This is intentionally the absolute-cwd-only slice: it does not change environment selection stickiness or cwd-to-default-environment fallback behavior. ## What changed - Changes `ThreadSettingsOverrides.cwd`, `CodexThreadSettingsOverrides.cwd`, and `SessionSettingsUpdate.cwd` to use `AbsolutePathBuf`. - Removes core-side cwd normalization/resolution from session settings updates. - Updates affected core/app-server test helpers and callsites to pass existing absolute cwd values or use `abs()` helpers. ## Validation Opening as draft so CI can start while local validation continues.
pakrym-oai ·
2026-06-05 09:29:15 -07:00 -
feat: reload v2 agents on delivery (#26623)
## Summary This is the first small step toward making multi-agent v2 agents durable logical agents whose `ThreadManager` residency is only an implementation detail. This PR adds a narrow v2 reload-on-delivery hook: - If a known v2 agent target is already loaded, delivery is unchanged. - If the target is still registered but missing from `ThreadManager`, delivery reloads that exact v2 thread from durable rollout history before submitting the message. - If the target is unknown, closed, missing from storage, or not a v2 thread, delivery still fails as not found. The reload is wired only into existing-agent delivery paths: v2 `send_message` / `followup_task`, and legacy `send_input` when its target is a known v2 agent. ## Stack 1. **Reload on delivery**: load known unloaded v2 agents before `followup_task`, `send_message`, or `send_input` delivery. This PR. 2. **Residency LRU**: unload idle resident v2 agents from `ThreadManager` without making them closed or unreachable. 3. **Execution concurrency**: count active non-root turns, not logical agents or resident idle threads. 4. **Close semantics**: make v2 close interrupt-only and leave durable agent identity intact. 5. **Resume cleanup**: remove user-facing v2 resume semantics; addressing an unloaded durable agent reloads it implicitly. ## Validation - Ran `just fmt`. - Left broader tests and clippy to CI.
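Illustration for #26623 (not the PR's code): a minimal sketch of the three delivery outcomes listed above, expressed as a pure decision function. `TargetState`, `Delivery`, and `plan_delivery` are hypothetical names.

```rust
/// Hypothetical view of a delivery target at send time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TargetState {
    /// Known v2 agent that is resident in the thread manager.
    Loaded,
    /// Still registered, but not resident.
    Unloaded { is_v2: bool, in_storage: bool },
    Closed,
    Unknown,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Delivery {
    SubmitDirectly,
    ReloadThenSubmit,
    NotFound,
}

pub fn plan_delivery(state: TargetState) -> Delivery {
    match state {
        TargetState::Loaded => Delivery::SubmitDirectly,
        // Only a registered v2 thread with durable history is reloaded.
        TargetState::Unloaded { is_v2: true, in_storage: true } => Delivery::ReloadThenSubmit,
        _ => Delivery::NotFound,
    }
}
```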
jif ·
2026-06-05 18:18:29 +02:00 -
Render code comment directives in TUI replay (#26554)
## Summary Resumed Codex App or VS Code review sessions can contain `::code-comment` directives that the TUI previously displayed verbatim because only rich clients interpret them. This change rewrites valid line-start directives into readable Markdown during assistant-message parsing, using the session working directory for relative file paths. The fallback is applied consistently to live messages, replayed transcripts, and resume previews while preserving malformed directives and existing `::git-*` parsing. ## Before The TUI exposed the raw client directive: ```text ::code-comment{title="Fix body= parsing" body="Keep role=\"tab\", ::git-stage{cwd=/tmp}, file=, and \n literal." file="/repo/src/app.ts" start=10 end=12 priority="P2"} ``` ## After The same directive is rendered as readable review feedback: ```text - [P2] Fix body= parsing — src/app.ts:10-12 Keep role="tab", ::git-stage{cwd=/tmp}, file=, and \n literal. ``` Fixes #25658
Eric Traut ·
2026-06-05 08:34:34 -07:00 -
Fix `/goal` usage text for control commands (#26551)
## Why The TUI's `/goal` usage text only advertised the objective form even though `/goal clear`, `/goal edit`, `/goal pause`, and `/goal resume` are implemented. This made the lifecycle controls difficult to discover and allowed the duplicated help text to drift from actual behavior. Fixes #25530. ## What changed - Show the complete `/goal [<objective>|clear|edit|pause|resume]` syntax in usage messages. - Share one usage string across slash-command dispatch and goal-related app messages. - Add inline snapshot coverage for the control-command usage path.
Eric Traut ·
2026-06-05 08:32:53 -07:00 -
Open Windows app workspaces via deep link (#26500)
## Summary Fixes #26423. On Windows, `codex app PATH` detected Codex Desktop and launched the app shell target, then only printed a manual instruction to open the workspace. The Desktop app already supports `codex://threads/new?path=...`, so the CLI can open the requested workspace directly. This updates the Windows launcher to normalize the workspace path, encode it into a `codex://threads/new` deep link, and open that URL when Codex Desktop is installed. The installer fallback still opens the Windows installer and prints the workspace path for after installation.
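Illustration for #26500 (not the PR's code): a minimal sketch of encoding a workspace path into the `codex://threads/new?path=...` deep link, assuming the path is already normalized and that every byte outside the URL unreserved set is percent-encoded. `workspace_deep_link` is a hypothetical helper.

```rust
/// Builds the deep link, percent-encoding every byte outside the
/// unreserved set (letters, digits, '-', '.', '_', '~').
pub fn workspace_deep_link(normalized_path: &str) -> String {
    let mut url = String::from("codex://threads/new?path=");
    for byte in normalized_path.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                url.push(byte as char)
            }
            other => url.push_str(&format!("%{other:02X}")),
        }
    }
    url
}
```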
Eric Traut ·
2026-06-05 08:32:42 -07:00 -
Surface TUI config write error causes (#26537)
## Summary TUI config writes currently wrap app-server failures with local context like `config/batchWrite failed in TUI`, but several user-visible paths only render the outer error. That hides the actionable app-server message, such as validation constraints or read-only `CODEX_HOME` failures, leaving users with a dead-end diagnostic. This change adds a small formatter next to the TUI config write helpers that renders the error source chain, then uses it for model persistence, feature persistence, project trust, status line writes, hook trust, and hook enablement. Fixes #26077
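Illustration for #26537 (not the PR's formatter): a minimal sketch of rendering an error together with its source chain via `std::error::Error::source`. `format_error_chain` and the `Wrapped` type are hypothetical, and the `": "` separator is an assumption.

```rust
use std::error::Error;
use std::fmt;

/// Joins an error and all of its sources into one line.
pub fn format_error_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut source = err.source();
    while let Some(cause) = source {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        source = cause.source();
    }
    out
}

/// Hypothetical wrapper that adds local context around an inner error.
#[derive(Debug)]
pub struct Wrapped {
    pub context: &'static str,
    pub cause: Box<dyn Error + 'static>,
}

impl fmt::Display for Wrapped {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Display only the outer context; the cause is reached through `source`.
        write!(f, "{}", self.context)
    }
}

impl Error for Wrapped {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.cause.as_ref())
    }
}
```

Rendering only `to_string()` of the outer error would show the context alone, which is the dead-end diagnostic the PR describes.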
Eric Traut ·
2026-06-05 08:32:07 -07:00 -
[codex] Fix long proxy socket paths (#26553)
## Summary - avoid generating host proxy bridge Unix socket paths that exceed Linux's `sockaddr_un.sun_path` limit - fall back from a long `$CODEX_HOME/tmp` path to the system temp directory, then `/tmp` - add focused unit coverage for short and overlong parent paths ## Root cause With a sufficiently long `CODEX_HOME`, the generated `proxy-route-*.sock` path exceeds Linux's 107-byte pathname limit. The host bridge child exits before writing its readiness byte, so the parent reports the indirect error `failed to prepare host proxy routing bridge: failed to fill whole buffer`. ## Validation - reproduced the original error with a long `CODEX_HOME` using `codex-cli 0.138.0-alpha.4` - `cargo clippy -p codex-linux-sandbox --all-targets` - `just fix -p codex-linux-sandbox` - `just fmt` The Linux-only unit test could not execute locally: the arm64 Docker build was repeatedly OOM-killed by `rustc` while compiling an unrelated `codex-app-server-protocol` dependency, before reaching the test. --------- Co-authored-by: Codex <noreply@openai.com>
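Illustration for #26553 (not the PR's code): a minimal sketch of choosing the first parent directory whose socket path fits the 107-byte limit quoted above. `pick_socket_path` is a hypothetical helper; the caller is assumed to pass candidates in the order `$CODEX_HOME/tmp`, system temp directory, `/tmp`.

```rust
use std::path::PathBuf;

/// Longest usable pathname in bytes, per the limit quoted in the PR text.
const SUN_PATH_MAX: usize = 107;

/// Returns the first `dir/file_name` that fits within the limit.
pub fn pick_socket_path(candidate_dirs: &[PathBuf], file_name: &str) -> Option<PathBuf> {
    candidate_dirs
        .iter()
        .map(|dir| dir.join(file_name))
        // On Unix, `OsStr::len` is the byte length of the path.
        .find(|path| path.as_os_str().len() <= SUN_PATH_MAX)
}
```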
viyatb-oai ·
2026-06-05 08:00:46 -07:00 -
feat(app-server): expose account token usage [1 of 2] (#25344)
## Why Token activity is useful account-level context, but terminal clients need a supported app-server path to fetch it without reaching into ChatGPT backend details directly. The API should also live under the broader account usage umbrella so future usage surfaces can be added without proliferating user-facing concepts. ## What Changed - Add `codex-backend-client` support for the ChatGPT profile token-usage payload. - Add the v2 `account/usage/read` app-server RPC. - Map lifetime usage, peak daily usage, streak, longest task duration, and daily buckets into app-server protocol types. - Gate the request on Codex-backend auth, which supports ChatGPT auth tokens and AgentIdentity. - Regenerate the app-server JSON and TypeScript schema fixtures. ## Token Count Source `account/usage/read` returns the token-usage aggregate supplied by the ChatGPT profile backend. App-server maps that backend-owned aggregate into protocol fields; it does not recompute cached-token treatment, usage multipliers, or raw input/output totals locally. ## Stack 1. feat(app-server): expose account token usage [1 of 2] (this PR) 2. [#25345](https://github.com/openai/codex/pull/25345) feat(tui): add token activity command [2 of 2] ## How to Test 1. Start an app-server client from this branch while authenticated with ChatGPT or AgentIdentity. 2. Call `account/usage/read`. 3. Confirm the response includes `summary` and `dailyUsageBuckets`. 4. Also verify a session without Codex-backend auth receives the existing auth error path. Targeted tests: - `just test -p codex-backend-client -p codex-app-server-protocol -p codex-app-server` - `just write-app-server-schema`
Felipe Coury ·
2026-06-05 14:43:44 +00:00 -
refactor: split agent control modules (#26610)
## Summary Mechanically splits `AgentControl` into focused modules so later agent runtime changes are easier to review. The shared lookup, messaging, and completion logic remains in `control.rs`, while spawn-specific code and V1 legacy close/resume behavior move into dedicated files. ## Changes - Extract spawn-agent code into `agent/control/spawn.rs`. - Extract V1-only legacy close/resume behavior into `agent/control/legacy.rs`. - Keep shared control-plane behavior in `agent/control.rs`. - Preserve existing behavior; this PR is intended to be mechanical. ## Stack 1. This PR - Mechanical `AgentControl` split: extracts spawn and V1 legacy code without behavior changes. 2. #26614 - Execution slot accounting: separates logical agents from active execution slots. 3. #26611 - Residency and reload runtime: adds resident-agent LRU, eviction/reload, durable lookup, and V2 delivery through reload. 4. #26612 - V2 tool semantics: narrows `close_agent` to interrupt-only and updates V2 tool coverage.
jif ·
2026-06-05 16:24:22 +02:00 -
[codex] Keep v1 spawn metadata visible (#26599)
## Summary - keep the legacy v1 `spawn_agent` role and model selectors visible - add regression coverage for the default v1 tool plan ## Why `hide_spawn_agent_metadata` is a multi-agent v2 setting, but the v1 planning branch also consumed it. After the default changed to `true`, v1 stopped advertising `agent_type`, `model`, `reasoning_effort`, and `service_tier`, preventing configured agents from being selected. This keeps the hidden-metadata default for v2 while opting v1 out of that behavior. Fixes #26363. ## Validation Not run locally, per request; CI will validate the change.
jif ·
2026-06-05 14:52:51 +02:00 -
[codex] Forward turn moderation metadata through app-server (#25710)
## Why First-party backends can supply turn-scoped moderation metadata that app-server clients need for client-side presentation. Exposing this as an experimental typed notification lets opted-in clients consume it without interpreting raw Responses API events. ## What changed - forward `response.metadata.openai_chatgpt_moderation_metadata` from Responses API SSE and WebSocket streams as turn-scoped moderation metadata - emit the experimental app-server v2 `turn/moderationMetadata` notification with `{ threadId, turnId, metadata }` - add app-server integration coverage for the typed moderation metadata notification ## Testing - `just test -p codex-core build_ws_client_metadata_includes_window_lineage_and_turn_metadata` - `just test -p codex-core` (fails locally: 46 failures and 1 timeout, primarily missing `test_stdio_server` and shell snapshot timeouts) - `just test -p codex-app-server-protocol` - `just test -p codex-app-server turn_moderation_metadata_emits_typed_notification_v2` - `just test -p codex-app-server` (fails locally: 792 passed, 10 failed, and 5 timed out; failures are in existing environment-sensitive tests, primarily because nested macOS `sandbox-exec` is not permitted) - `just write-app-server-schema --experimental --schema-root /tmp/codex-app-server-schema-experimental`
carlc-oai ·
2026-06-05 02:41:06 -07:00 -
nit: doc (#26566)
Matching CBv9
jif ·
2026-06-05 11:10:32 +02:00 -
Encrypt multi-agent v2 message payloads (#26210)
## Why Multi-agent v2 currently routes agent instructions through normal tool arguments and inter-agent context. That means the parent model can emit plaintext task text, Codex can persist it in history/rollouts, and the recipient can receive it as ordinary assistant-message JSON. This changes the v2 path so agent instructions stay encrypted between model calls: Responses encrypts the `message` argument returned by the model, Codex forwards only that ciphertext, and Responses decrypts it internally for the recipient model. ## What changed - Mark the v2 `message` parameter as encrypted for `spawn_agent`, `send_message`, and `followup_task`. - Treat multi-agent v2 tool `message` values as ciphertext unconditionally. - Store v2 inter-agent task text in `InterAgentCommunication.encrypted_content` with empty plaintext `content`. - Convert encrypted inter-agent communications into the Responses `agent_message` input item before sending the child request. - Preserve `agent_message` items across history, rollout, compaction, telemetry, and app-server schema paths. - Leave multi-agent v1 unchanged. ## Message shape The model still calls the v2 tools with a `message` argument, but that value is now ciphertext: ```json { "name": "spawn_agent", "arguments": { "task_name": "worker", "message": "<ciphertext>" } } ``` Codex stores the task as encrypted inter-agent communication: ```json { "author": "/root", "recipient": "/root/worker", "content": "", "encrypted_content": "<ciphertext>", "trigger_turn": true } ``` When Codex builds the recipient request, it forwards the ciphertext using the new Responses input item: ```json { "type": "agent_message", "author": "/root", "recipient": "/root/worker", "content": [ { "type": "encrypted_content", "encrypted_content": "<ciphertext>" } ] } ``` Responses decrypts that item internally for the recipient model. ## Context impact - Parent context no longer carries plaintext v2 agent task instructions from these tool arguments. 
- Codex rollout/history stores ciphertext for v2 agent instructions. - Recipient requests receive an `agent_message` item instead of assistant commentary JSON for encrypted task delivery. - Plaintext completion/status notifications are still plaintext because they are Codex-generated status messages, not encrypted model tool arguments. ## Validation - `just test -p codex-tools` - `just test -p codex-protocol` - `just test -p codex-rollout` - `just test -p codex-rollout-trace` - `just test -p codex-otel` - `just write-app-server-schema`
jif ·
2026-06-05 10:25:57 +02:00 -
[codex] Add environment shell info (#26480)
## Why Shell detection needs to be available through the `Environment` abstraction so callers can ask the selected local or remote environment for shell metadata without adding a separate HTTP endpoint or parallel info-source path. This keeps shell metadata shaped like the existing environment-owned filesystem capability and lets remote environments answer through exec-server JSON-RPC. ## What changed - Added `environment/info` to the exec-server protocol/client/server and exposed `Environment::info()`. - Added local and remote environment info providers on `Environment`, following the existing capability-provider pattern used for filesystem access. - Moved the shared shell detection logic into `codex-shell-command` and kept core shell APIs as wrappers around that implementation. - Returned shell metadata as `EnvironmentInfo { shell: ShellInfo }` using the existing shell detection path. - Added a remote environment test that calls `Environment::info()` through an exec-server-backed environment. ## Validation - `git diff --check` - `just test -p codex-shell-command` - `just test -p codex-core -E 'test(/shell::tests::/)'` - `just test -p codex-exec-server environment`
pakrym-oai ·
2026-06-04 22:36:25 -07:00 -
feat(remote-control): allow pairing while disabled (#26215)
## Why `remoteControl/pairing/start` creates authorization for future remote-control connections, so it should not require the live websocket to already be enabled. Requiring enable first made pairing depend on presence instead of the persisted server enrollment that pairing actually uses. Pairing also needs to recover when that persisted server row is stale. If `/server/pair` returns `404`, making the first pairing attempt fail forces a manual retry even though the client can clear the stale row and create a replacement enrollment immediately. ## What Changed - Allow `remoteControl/pairing/start` to reuse or create the persisted remote-control server enrollment while remote control is disabled. - Keep the selected in-memory enrollment across disable and share it with websocket connect so a later enable uses the same selected server. - Thread the app-server client name through pairing so stdio persistence keeps using the websocket-owned enrollment key. - Recover pairing server-token auth failures through the existing refresh/auth-recovery path. - Recover stale pairing enrollment on `/server/pair` `404` by clearing the stale selected enrollment, re-enrolling once, and retrying pairing once. - Add focused disabled-pairing and stale-pairing recovery coverage. ## Verification - `remote_control_pairing_start_returns_pairing_artifacts_while_disabled` exercises pairing before enable. - `remote_control_handle_reenrolls_after_stale_pairing_enrollment` exercises stale `/server/pair` `404` recovery without a manual retry. Related: N/A
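The "re-enroll once, retry once" recovery for a stale enrollment can be sketched with two closures. This is a synchronous simplification under assumptions: the real path is presumably async and HTTP-backed, and `PairError`, `pair_with_recovery`, and the `String` result are hypothetical stand-ins.

```rust
/// Hypothetical error type: `StaleEnrollment` models a `404` from `/server/pair`.
#[derive(Debug, PartialEq, Eq)]
pub enum PairError {
    StaleEnrollment,
    Other(String),
}

/// Attempt pairing; on a stale enrollment, re-enroll once and retry once.
/// A second stale response is returned to the caller rather than looping.
pub fn pair_with_recovery<P, R>(mut pair: P, mut reenroll: R) -> Result<String, PairError>
where
    P: FnMut() -> Result<String, PairError>,
    R: FnMut() -> Result<(), PairError>,
{
    match pair() {
        Err(PairError::StaleEnrollment) => {
            // Clear the stale row and create a replacement enrollment.
            reenroll()?;
            pair()
        }
        other => other,
    }
}
```

Bounding the recovery to a single retry avoids an unbounded loop if the replacement enrollment is also rejected.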
Anton Panasenko ·
2026-06-05 05:12:23 +00:00 -
core: derive exec policy filesystem policy from profile (#26499)
## Why `PermissionProfile` already owns the runtime filesystem sandbox policy through `file_system_sandbox_policy()`. Keeping a separate `FileSystemSandboxPolicy` on exec-policy fallback contexts made it possible for callers and tests to construct split states that the production permission model should not rely on. ## What changed - Removed `file_system_sandbox_policy` from `UnmatchedCommandContext`, `ExecApprovalRequest`, and the intercepted Unix exec-policy context. - Derived filesystem sandbox policy inside unmatched-command decision logic from `PermissionProfile::file_system_sandbox_policy()`. - Simplified shell/unified-exec callers and tests that were only plumbing the duplicate policy through. ## Testing Local tests not run per request; relying on remote CI.
Michael Bolin ·
2026-06-04 21:48:45 -07:00 -
[codex] Keep Bazel startup options stable across commands (#26256)
## Why `just bazel-clippy` ran target discovery with `--noexperimental_remote_repo_contents_cache`, then ran the build with the workspace default `--experimental_remote_repo_contents_cache`. Bazel therefore killed and restarted its server on each transition, slowing repeated commands and discarding the in-memory analysis cache. An audit found the same class of startup-option variation in several CI command sequences. ## What changed - Keep local lint target-discovery queries on the workspace-default Bazel server, while making CI target discovery explicitly use the CI startup options. - Normalize GitHub Actions launches through the BuildBuddy wrapper to share `BAZEL_OUTPUT_USER_ROOT` and `--noexperimental_remote_repo_contents_cache`. - Route the CI lockfile check and Windows test-shard query through the same startup configuration. - Document the startup-option invariant and add wrapper regression coverage. ## Validation - Confirmed consecutive local clippy target-discovery runs retained the same Bazel server PID.
Adam Perry @ OpenAI ·
2026-06-04 20:23:37 -07:00 -
fix(rmcp): refresh expired OAuth tokens before startup (#26482)
## Why Codex persists OAuth expiry as an absolute `expires_at`, then reconstructs RMCP’s relative `expires_in` when credentials are loaded. For an already-expired token, Codex reconstructed `expires_in` as missing. [RMCP 0.15 treated a missing `expires_in` as zero when a refresh token was present](https://github.com/modelcontextprotocol/rust-sdk/blob/9cfc905a9ef17c8bba6748dc0a9bdd2452681733/crates/rmcp/src/transport/auth.rs#L704-L723), so this still triggered a refresh. [RMCP 1.7 treats missing expiry information as unknown and uses the access token as-is](https://github.com/modelcontextprotocol/rust-sdk/blob/3529c3675ff64db805bd947ca6ece6090809e43d/crates/rmcp/src/transport/auth.rs#L1233-L1265), causing the stale token to be sent during `initialize`. ## What changed - Represent a known-expired persisted token as `expires_in = 0`, preserving `None` for genuinely unknown expiry. - Add Streamable HTTP coverage requiring the token to refresh before the startup handshake. ## Validation - The new regression test fails on RMCP 1.7 before the fix and passes afterward. - The same scenario passes on the commit immediately before the RMCP 1.7 update, using RMCP 0.15. - `just test -p codex-rmcp-client` (63 passed).
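The distinction between a known-expired token and an unknown expiry can be sketched as below. The function name and the use of Unix seconds are assumptions for illustration; the actual types in `codex-rmcp-client` are not shown in the description.

```rust
/// Reconstruct a relative `expires_in` (seconds) from a persisted absolute expiry.
/// `None` means the expiry is genuinely unknown; `Some(0)` means known-expired,
/// which signals that a refresh is needed before the token is used.
pub fn expires_in_secs(expires_at_unix: Option<u64>, now_unix: u64) -> Option<u64> {
    // `saturating_sub` clamps an already-passed expiry to 0 instead of underflowing.
    expires_at_unix.map(|expires_at| expires_at.saturating_sub(now_unix))
}
```

Collapsing an expired token to `None` is what let the stale access token through; keeping it as `Some(0)` preserves the information.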
Adam Perry @ OpenAI ·
2026-06-05 02:31:06 +00:00 -
[codex] Add use_responses_lite 'override' logic (#26487)
## Summary - add a defaulted `ModelInfo.use_responses_lite` catalog field - support serializing `reasoning.context` while preserving the existing effort and summary path - has not been turned on for any models yet I've added an override to parallel tools if responses_lite is on. I've also forced persistent reasoning when using responses_lite. It would be ideal if we could centralize all the responses_lite plumbing, but I think this is best for now to keep the plumbing & diffs small. ## Testing - `cargo test -p codex-protocol model_info_defaults_availability_nux_to_none_when_omitted` - `RUST_MIN_STACK=8388608 cargo test -p codex-core responses_lite_sets_all_turns_context_and_disables_parallel_tool_calls` - `RUST_MIN_STACK=8388608 cargo test -p codex-core configured_reasoning_summary_is_sent` - `cargo check -p codex-core --tests` - `RUST_MIN_STACK=8388608 cargo clippy -p codex-core --tests` (passes with pre-existing warnings in `codex-code-mode` and `codex-core-plugins`)
rka-oai ·
2026-06-04 18:49:51 -07:00 -
[codex] Emit sandbox outcome telemetry event (#25955)
## Summary Adds a dedicated `codex.sandbox_outcome` telemetry event so we can query sandbox edge outcomes without threading sandbox metadata through tool-result output types. This is meant to make sandbox failures and approved escalation retries visible in OTEL while keeping the existing `codex.tool_result` event shape focused on tool completion data. ## What changed - Adds `SessionTelemetry::sandbox_outcome(...)`, which emits `codex.sandbox_outcome` as both a log and trace event. - Records the tool name, call id, sandbox outcome, initial attempt duration, and escalated attempt duration when a retry runs. - Emits `denied` when the sandbox blocks execution and no retry is run. - Emits `timed_out` and `signal` when those sandbox errors surface from tool execution. - Emits `escalated` when the initial sandboxed attempt fails and the approved unsandboxed retry succeeds. - Adds OTEL coverage for the new event payload, including timing fields. ## Validation - `RUST_MIN_STACK=8388608 just test -p codex-core sandbox_outcome_event_records_outcome handle_sandbox_error_user_approves_retry_records_tool_decision` - `just test -p codex-otel otel_export_routing_policy_routes_tool_result_log_and_trace_events runtime_metrics_summary_collects_tool_api_and_streaming_metrics` - `just fix -p codex-core` - `just fix -p codex-otel`
rreichel3-oai ·
2026-06-04 20:58:14 -04:00 -
ci: test windows cross build (#25000)
We cross-build for Windows when using Bazel. This causes a couple of hiccups: V8's `mksnapshot` step expects to snapshot on the host architecture, which did not match during the cross build. This caused segfault failures when starting code mode from a cross-built artifact. This change cross-builds the library and then runs and links a snapshot on the host machine/architecture, which is Windows. This gives us a functional snapshot and library that can start code mode on Windows. This fixes the build and also fixes two test regressions we had.
Channing Conger ·
2026-06-04 17:51:13 -07:00 -
Pull plugin service less frequently (#26431)
# Summary Reduce download traffic to `github.com/openai/plugins` while continuing to check for updates on every Codex startup. # Root cause The startup sync replaced the local repository with a fresh shallow clone whenever the remote revision changed. At Codex's global scale, repeatedly downloading the repository created excessive GitHub traffic. # Changes - Run `git ls-remote` on each startup to read the remote HEAD SHA. - Skip all repository downloads when the local and remote SHAs match. - Update existing checkouts with an exact-SHA shallow `git fetch`, followed by reset and clean. - Bootstrap new installations with `git init` plus the same shallow fetch, rather than cloning. - Keep the existing file lock so concurrent Codex processes serialize updates and do not duplicate fetches. - Preserve the existing GitHub HTTP and export archive fallback behavior. # Impact Each startup makes one lightweight remote HEAD check. Repository objects are downloaded only when the revision changes, and existing Git objects are reused during updates. # Validation - `just test -p codex-core-plugins startup_sync` (15 tests passed) - `just test -p codex-core-plugins` (201 tests passed) - `just clippy -p codex-core-plugins` (passes with one pre-existing `large_enum_variant` warning) - Production app-server smoke test against GitHub: - Fresh home: `ls-remote`, `git init`, one exact-SHA shallow fetch - Unchanged restart: `ls-remote` and local `rev-parse` only; no fetch or clone - Bench smoke passed
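The startup decision described above reduces to comparing the local SHA with the one from `git ls-remote`. The sketch below models only that decision; `SyncAction` and `plan_sync` are hypothetical names, and the git invocations, file lock, and HTTP fallback are out of scope.

```rust
/// What the startup sync should do after reading the remote HEAD SHA.
#[derive(Debug, PartialEq, Eq)]
pub enum SyncAction {
    /// Local and remote SHAs match: download nothing.
    UpToDate,
    /// Existing checkout: exact-SHA shallow fetch, then reset and clean.
    FetchExisting { sha: String },
    /// No checkout yet: `git init`, then the same shallow fetch.
    InitAndFetch { sha: String },
}

/// `local_sha` is `None` when there is no local checkout.
pub fn plan_sync(local_sha: Option<&str>, remote_sha: &str) -> SyncAction {
    match local_sha {
        Some(local) if local == remote_sha => SyncAction::UpToDate,
        Some(_) => SyncAction::FetchExisting { sha: remote_sha.to_string() },
        None => SyncAction::InitAndFetch { sha: remote_sha.to_string() },
    }
}
```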
beggers-openai ·
2026-06-04 17:47:58 -07:00 -
Improve Windows sandbox setup refresh diagnostics (#26471)
## Why Users have been seeing opaque Windows sandbox setup refresh failures such as `windows sandbox: spawn setup refresh`, including reports in #24391 and #21208. The setup refresh path already runs the Windows sandbox setup helper, but it was not using the same structured `setup_error.json` reporting path that elevated setup uses. As a result, when the helper exited non-zero, Codex only surfaced a generic refresh status instead of the helper's `SetupFailure` code and message. ## What changed - Clear stale `setup_error.json` before non-elevated setup refresh launches the helper. - When the refresh helper exits non-zero, read the helper-written report through the existing `report_helper_failure` path. - Keep a parent-side launch diagnostic for cases where the helper never starts, including the helper path, cwd, sandbox log path, and spawn error. - Clear the setup error report after a successful refresh. - Add regression coverage for report consumption and stale-report avoidance. ## Verification - `cargo test -p codex-windows-sandbox setup::tests::`
iceweasel-oai ·
2026-06-04 16:52:10 -07:00 -
[codex] Expose unavailable app templates in plugin detail (#26317)
## Summary - Adds `unavailable_app_templates` to the app-server protocol and generated schemas/types. - Parses plugin-service `release.unavailable_app_templates` in the remote plugin client. - Maps remote unavailable templates into app-server `PluginDetail`. - Defaults local plugins to an empty unavailable app template list. ## Validation - `just write-app-server-schema` - `cargo +1.95.0 fmt --manifest-path codex-rs/Cargo.toml --all --check` - `cargo +1.95.0 test --manifest-path codex-rs/Cargo.toml -p codex-app-server-protocol schema_fixtures` - `cargo +1.95.0 check --manifest-path codex-rs/Cargo.toml -p codex-app-server-protocol -p codex-core-plugins -p codex-app-server` - `git diff --check` Note: default `cargo check` uses rustc 1.89 locally and failed because dependencies require newer Rust, so validation was rerun with installed Rust 1.95.
charlesgong-openai ·
2026-06-04 23:42:27 +00:00 -
Add skill for pushing CI configuration changes (#26473)
## Why Codex agents that modify GitHub Actions configuration need clear guidance when repository push protections require temporary approval. Without it, an agent may pursue an unavailable exemption or stop before checking whether the user already has access. ## What Add a `pushing-ci-changes` skill that explains the restriction, directs agents to attempt the push first, and tells them how to involve the user when approval is required. ## Validation Not run; this change only adds skill documentation.
Adam Perry @ OpenAI ·
2026-06-04 15:40:16 -07:00 -
fix(app-server): expose remote MCP servers in plugin read (#26453)
## Why Remote plugin detail responses include MCP server metadata under `release.mcp_servers`, but Codex did not deserialize or propagate that field. As a result, `plugin/read` always returned an empty `mcpServers` list for remote plugins, so the plugin details pane omitted the MCP Servers section even when the remote plugin declares one. This affects uninstalled plugins as well: the remote detail API is the source of truth and returns MCP server keys without requiring a local plugin bundle. ## What changed - Deserialize MCP server entries from remote plugin detail responses. - Normalize their keys into a sorted, deduplicated list on `RemotePluginDetail`. - Return those keys from app-server `plugin/read` instead of hardcoding an empty list. - Add regression coverage proving an uninstalled remote plugin returns its MCP server names. ## Test plan - `just test -p codex-core-plugins` - `just test -p codex-app-server plugin_read`
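Normalizing keys into a sorted, deduplicated list can be sketched with a `BTreeSet`. The function name is hypothetical, and the real code may use a different container.

```rust
use std::collections::BTreeSet;

/// Collect MCP server keys into a sorted list with duplicates removed.
pub fn normalize_server_keys<I, S>(keys: I) -> Vec<String>
where
    I: IntoIterator<Item = S>,
    S: Into<String>,
{
    keys.into_iter()
        .map(Into::into)
        .collect::<BTreeSet<String>>()
        .into_iter()
        .collect()
}
```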
Eric Ning ·
2026-06-04 22:10:24 +00:00 -
[codex] Preserve logical paths during AGENTS.md discovery (#26465)
## Intent Follow up on #26205 by avoiding unnecessary filesystem canonicalization during `AGENTS.md` discovery. The configured working directory is already absolute, and canonicalization incorrectly switches symlinked workspaces from their logical parent hierarchy to the target's hierarchy. ## User-facing behavior For a symlinked working directory such as: ```text test-root/ |-- logical-repo/ | |-- AGENTS.md ("logical parent doc") | `-- workspace ------------> physical-repo/workspace/ `-- physical-repo/ |-- AGENTS.md ("physical parent doc") `-- workspace/ `-- AGENTS.md ("workspace doc") ``` Before this change, Codex canonicalized `logical-repo/workspace` to `physical-repo/workspace` before discovery. It therefore loaded `physical-repo/AGENTS.md` and `physical-repo/workspace/AGENTS.md`, ignoring the instructions from the repository through which the user entered the workspace. After this change, ancestor discovery walks the configured logical path, so Codex loads `logical-repo/AGENTS.md`. Opening `logical-repo/workspace/AGENTS.md` still follows the symlink through the host filesystem, so the workspace document is also loaded. `physical-repo/AGENTS.md` is not loaded. ## Implementation Use the logical absolute working directory when discovering project instructions and reporting instruction sources. Filesystem reads still follow the working-directory symlink, so an `AGENTS.md` in the target workspace continues to load while ancestor discovery uses the symlink's parents. ## Validation Added integration coverage proving that discovery loads the logical parent's instructions and the target workspace's instructions, but not the target parent's instructions.
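The lexical ancestor walk can be sketched with `Path::ancestors`, which never touches the filesystem and therefore never resolves symlinks. This is a simplification under assumptions: `candidate_agents_files` is a hypothetical name, and the sketch walks all the way to the filesystem root, whereas the real discovery presumably stops at a project boundary.

```rust
use std::path::{Path, PathBuf};

/// List candidate `AGENTS.md` locations from the outermost ancestor down to
/// the working directory, walking the logical path without canonicalization.
pub fn candidate_agents_files(logical_cwd: &Path) -> Vec<PathBuf> {
    let mut candidates: Vec<PathBuf> = logical_cwd
        .ancestors()
        .map(|dir| dir.join("AGENTS.md"))
        .collect();
    // `ancestors()` yields innermost first; reverse so parents come first.
    candidates.reverse();
    candidates
}
```

Because only the directory list is lexical, opening `logical-repo/workspace/AGENTS.md` still follows the symlink at read time, matching the behavior described above.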
Adam Perry @ OpenAI ·
2026-06-04 15:08:52 -07:00 -
Use Winget release environment secret (#26466)
## Why `WINGET_PUBLISH_PAT` now lives as a GitHub environment secret under `mainline-release-winget`. The WinGet release job needs to enter that environment so `secrets.WINGET_PUBLISH_PAT` resolves during stable/mainline Rust releases. ## What Changed - Attach the `winget` job in `.github/workflows/rust-release.yml` to the `mainline-release-winget` environment. - Set `deployment: false` so the job can read environment secrets without creating GitHub deployment records. ## Operational Note The `mainline-release-winget` environment must allow `rust-v*.*.*` tag refs before this can run on release tags. The live environment currently has a custom policy named `rust-v*.*.*` with type `branch`; add the corresponding `tag` policy before relying on this path for a release. ## Validation - `git diff --check origin/main...HEAD -- .github/workflows/rust-release.yml` - `ruby -e 'require "yaml"; ARGV.each { |f| YAML.load_file(f); puts "yaml ok: #{f}" }' .github/workflows/rust-release.yml`
Shijie Rao ·
2026-06-04 14:38:11 -07:00 -
[codex] Use model-advertised reasoning effort order (#26446)
## Summary - preserve the model catalog order for app-server `supportedReasoningEfforts` and document that client contract - render TUI reasoning choices in the advertised order - step reasoning shortcuts by adjacent list position instead of deriving order from known effort names - anchor unsupported configured values to the advertised default, or the first option when needed - remove canonical effort ordering helpers and the unused upgrade effort mapping ## Validation - `just fmt` - Local tests and compilation were not run per request; relying on CI. Stacked on #26444.
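Stepping by adjacent list position can be sketched as below. Several details are assumptions rather than the PR's behavior: `step_effort` is a hypothetical name, stepping clamps at the ends instead of wrapping, and an unsupported current value is treated as sitting at the anchor (the advertised default, else the first option) before the step is applied.

```rust
/// Step to the neighboring reasoning effort in the model-advertised order.
/// Returns `None` only when the model advertises no efforts.
pub fn step_effort<'a>(
    advertised: &[&'a str],
    default: Option<&str>,
    current: &str,
    forward: bool,
) -> Option<&'a str> {
    if advertised.is_empty() {
        return None;
    }
    let position_of = |value: &str| advertised.iter().position(|effort| *effort == value);
    // Anchor: the current value if advertised, else the default, else index 0.
    let anchor = position_of(current)
        .or_else(|| default.and_then(|value| position_of(value)))
        .unwrap_or(0);
    let next = if forward {
        (anchor + 1).min(advertised.len() - 1)
    } else {
        anchor.saturating_sub(1)
    };
    Some(advertised[next])
}
```

Nothing here depends on known effort names, so a model-defined value placed anywhere in the list is stepped through like any other.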
Ahmed Ibrahim ·
2026-06-04 14:01:14 -07:00 -
[codex] Support model-defined reasoning efforts (#26444)
## Summary - accept non-empty model-defined reasoning effort values while preserving built-in effort behavior - propagate the non-Copy effort type through core, app-server, TUI, telemetry, and persistence call sites - preserve string wire encoding and expose an open-string schema for clients - update model selection and shortcut behavior for model-advertised effort values ## Root cause `ReasoningEffort` gained a string-backed custom variant, so it could no longer implement `Copy` or rely on derived closed-enum serialization. Existing consumers still moved effort values from shared references and assumed a fixed built-in value set. ## Validation - `just fmt` - Local tests and compilation were not run per request; relying on CI.
Ahmed Ibrahim ·
2026-06-04 13:36:24 -07:00 -
Cleanup experimentalFeature/enablement/set (#26312)
## Why `experimentalFeature/enablement/set` still allowed several keys that no longer need to be managed through this API. Keeping those keys also preserved corresponding special-case logic, including refreshing the apps list when the `apps` key was enabled. The endpoint also rejected an entire request when any key was invalid or unsupported. That makes clients brittle when they send a mix of current and stale keys, even when the valid entries can still be applied safely. ## What changed - remove the feature keys that no longer need to be supported by `experimentalFeature/enablement/set` - remove the corresponding apps-list refresh path and its auth/config plumbing - ignore and warn on invalid or unsupported keys while still applying valid keys from the same request - update the app-server documentation and integration coverage for the reduced key set and partial-acceptance behavior ## Test plan - `just test -p codex-app-server experimental_feature_enablement_set` (6 passed) - `just test -p codex-app-server` exercised the changed tests successfully; unrelated sandbox-dependent and watcher/timing tests failed locally
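The partial-acceptance behavior can be sketched as a filter that applies supported keys and records the rest. The type and function names are hypothetical, and writing the warning to stderr is an assumption; the description only says unsupported keys are ignored with a warning.

```rust
use std::collections::BTreeMap;

/// Result of applying a mixed request: what took effect and what was skipped.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct EnablementOutcome {
    pub applied: BTreeMap<String, bool>,
    pub ignored: Vec<String>,
}

/// Apply every supported key and warn about the rest instead of
/// rejecting the whole request.
pub fn apply_enablement(
    requested: &BTreeMap<String, bool>,
    supported: &[&str],
) -> EnablementOutcome {
    let mut outcome = EnablementOutcome::default();
    for (key, enabled) in requested {
        if supported.contains(&key.as_str()) {
            outcome.applied.insert(key.clone(), *enabled);
        } else {
            eprintln!("warning: ignoring unsupported feature key `{key}`");
            outcome.ignored.push(key.clone());
        }
    }
    outcome
}
```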
Matthew Zeng ·
2026-06-04 13:35:31 -07:00