mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
d5fef5c19008d34a360758e9ad72077a693e511f
3925 Commits
-
Add C# syntax option to highlight selections (#12511)
Summary - map csharp/c-sharp aliases to the existing C# syntax in the highlight matcher - ensure the extension list and tests include .cs and the new aliases so coverage stays accurate Testing <img width="543" height="266" alt="image" src="https://github.com/user-attachments/assets/e6c8a42f-649c-4c30-b574-421b4287534c" />
Eric Traut ·
2026-02-22 12:15:20 -08:00 -
Sort themes case-insensitively in picker (#12509)
## Summary - order bundled and custom themes together by name while keeping entries stable across platforms - update the theme fixture names and tests to assert case-insensitive ordering
Eric Traut ·
2026-02-22 12:12:36 -08:00 -
Revert "Revert "Route inbound realtime text into turn start or steer"" (#12480)
With working tests this time --------- Co-authored-by: Codex <noreply@openai.com>
Ahmed Ibrahim ·
2026-02-22 11:54:16 -08:00 -
feat(tui): support Alt-d delete-forward-word (#12455)
Alt-d should delete the next word. It didn’t. Now it does. Added a small test so it stays that way. Details: File updated: [codex-rs/tui/src/bottom_pane/textarea.rs](./codex-rs/tui/src/bottom_pane/textarea.rs) Test added: delete_forward_word_alt_d — verifies Alt-d deletes the next word and keeps the cursor position correct. Solves Issue #12453
Douglas Chimento ·
2026-02-22 11:22:17 -08:00 -
Handle orphan exec ends without clobbering active exploring cell (#12313)
Summary - distinguish exec end handling targets (active tracking, active orphan history, new cell) so unified exec responses don’t clobber unrelated exploring cells - ensure orphan ends flush existing exploring history when complete, insert standalone history entries, and keep active cells correct - add regression tests plus a snapshot covering the new behavior and expose the ExecCell completion result for verification Fix for https://github.com/openai/codex/issues/12278 --------- Co-authored-by: Josh McKinney <joshka@openai.com>
jif-oai ·
2026-02-22 14:26:58 +00:00 -
jif-oai ·
2026-02-22 14:13:56 +00:00 -
Send events to realtime api (#12423)
- Send assistant messages, ExecCommandBegin, and PatchApplyBegin/PatchApplyEnd
Ahmed Ibrahim ·
2026-02-21 23:24:51 -08:00 -
fix(core) exec policy parsing 3 (#12485)
## Summary Quick fix
Dylan Hurd ·
2026-02-22 06:26:13 +00:00 -
feat(tui) /clear (#12444)
# /clear feature! /clear will clear your terminal while preserving the context/state of the thread.
Won Park ·
2026-02-21 22:06:56 -08:00 -
app-server: retain thread listener across disconnects (#12373)
- keep the per-thread app-server listener alive when the last client unsubscribes or disconnects - preserve listener-side active turn history so running `thread/resume` can merge an in-progress turn snapshot after reconnect - add `ThreadStateManager` regressions for disconnect/unsubscribe retention and explicit thread teardown cleanup Added unit tests, and I manually tested to confirm the fix --------- Co-authored-by: Codex <noreply@openai.com>
Max Johnson ·
2026-02-22 05:33:33 +00:00 -
feat(tui): syntax highlighting via syntect with theme picker (#11447)
## Summary Adds syntax highlighting to the TUI for fenced code blocks in markdown responses and file diffs, plus a `/theme` command with live preview and persistent theme selection. Uses syntect (~250 grammars, 32 bundled themes, ~1 MB binary cost) — the same engine behind `bat`, `delta`, and `xi-editor`. Includes guardrails for large inputs, graceful fallback to plain text, and SSH-aware clipboard integration for the `/copy` command. <img width="1554" height="1014" alt="image" src="https://github.com/user-attachments/assets/38737a79-8717-4715-b857-94cf1ba59b85" /> <img width="2354" height="1374" alt="image" src="https://github.com/user-attachments/assets/25d30a00-c487-4af8-9cb6-63b0695a4be7" /> ## Problem Code blocks in the TUI (markdown responses and file diffs) render without syntax highlighting, making it hard to scan code at a glance. Users also have no way to pick a color theme that matches their terminal aesthetic. ## Mental model The highlighting system has three layers: 1. **Syntax engine** (`render::highlight`) -- a thin wrapper around syntect + two-face. It owns a process-global `SyntaxSet` (~250 grammars) and a `RwLock<Theme>` that can be swapped at runtime. All public entry points accept `(code, lang)` and return ratatui `Span`/`Line` vectors or `None` when the language is unrecognized or the input exceeds safety guardrails. 2. **Rendering consumers** -- `markdown_render` feeds fenced code blocks through the engine; `diff_render` highlights Add/Delete content as a whole file and Update hunks per-hunk (preserving parser state across hunk lines). Both callers fall back to plain unstyled text when the engine returns `None`. 3. **Theme lifecycle** -- at startup the config's `tui.theme` is resolved to a syntect `Theme` via `set_theme_override`. At runtime the `/theme` picker calls `set_syntax_theme` to swap themes live; on cancel it restores the snapshot taken at open. On confirm it persists `[tui] theme = "..."` to config.toml. ## Non-goals - Inline diff highlighting (word-level change detection within a line). - Semantic / LSP-backed highlighting. - Theme authoring tooling; users supply standard `.tmTheme` files. ## Tradeoffs | Decision | Upside | Downside | | ------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | syntect over tree-sitter / arborium | ~1 MB binary increase for ~250 grammars + 32 themes; battle-tested crate powering widely-used tools (`bat`, `delta`, `xi-editor`). tree-sitter would add ~12 MB for 20-30 languages or ~35 MB for full coverage. | Regex-based; less structurally accurate than tree-sitter for some languages (e.g. language injections like JS-in-HTML). | | Global `RwLock<Theme>` | Enables live `/theme` preview without threading Theme through every call site | Lock contention risk (mitigated: reads vastly outnumber writes, single UI thread) | | Skip background / italic / underline from themes | Terminal BG preserved, avoids ugly rendering on some themes | Themes that rely on these properties lose fidelity | | Guardrails: 512 KB / 10k lines | Prevents pathological stalls on huge diffs or pastes | Very large files render without color | ## Architecture ``` config.toml ─[tui.theme]─> set_theme_override() ─> THEME (RwLock) │ ┌───────────────────────────────────────────┘ │ markdown_render ─── highlight_code_to_lines(code, lang) ─> Vec<Line> diff_render ─── highlight_code_to_styled_spans(code, lang) ─> Option<Vec<Vec<Span>>> │ │ (None ⇒ plain text fallback) │ /theme picker ─── set_syntax_theme(theme) // live preview swap ─── current_syntax_theme() // snapshot for cancel ─── resolve_theme_by_name(name) // lookup by kebab-case ``` Key files: - `tui/src/render/highlight.rs` -- engine, theme management, guardrails - `tui/src/diff_render.rs` -- syntax-aware diff line wrapping - `tui/src/theme_picker.rs` -- `/theme` command builder - `tui/src/bottom_pane/list_selection_view.rs` -- side content panel, callbacks - `core/src/config/types.rs` -- `Tui::theme` field - `core/src/config/edit.rs` -- `syntax_theme_edit()` helper ## Observability - `tracing::warn` when a configured theme name cannot be resolved. - `Config::startup_warnings` surfaces the same message as a TUI banner. - `tracing::error` when persisting theme selection fails. ## Tests - Unit tests in `highlight.rs`: language coverage, fallback behavior, CRLF stripping, style conversion, guardrail enforcement, theme name mapping exhaustiveness. - Unit tests in `diff_render.rs`: snapshot gallery at multiple terminal sizes (80x24, 94x35, 120x40), syntax-highlighted wrapping, large-diff guardrail, rename-to-different-extension highlighting, parser state preservation across hunk lines. - Unit tests in `theme_picker.rs`: preview rendering (wide + narrow), dim overlay on deletions, subtitle truncation, cancel-restore, fallback for unavailable configured theme. - Unit tests in `list_selection_view.rs`: side layout geometry, stacked fallback, buffer clearing, cancel/selection-changed callbacks. - Integration test in `lib.rs`: theme warning uses the final (post-resume) config. ## Cargo Deny: Unmaintained Dependency Exceptions This PR adds two `cargo deny` advisory exceptions for transitive dependencies pulled in by `syntect v5.3.0`: | Advisory | Crate | Status | |----------|-------|--------| | RUSTSEC-2024-0320 | `yaml-rust` | Unmaintained (maintainer unreachable) | | RUSTSEC-2025-0141 | `bincode` | Unmaintained (development ceased; v1.3.3 considered complete) | **Why this is safe in our usage:** - Neither advisory describes a known security vulnerability. Both are "unmaintained" notices only. - `bincode` is used by syntect to deserialize pre-compiled syntax sets. Again, these are **static vendored artifacts** baked into the binary at build time. No user-supplied bincode data is ever deserialized. - Attack surface is zero for both crates; exploitation would require a supply-chain compromise of our own build artifacts. - These exceptions can be removed when syntect migrates to `yaml-rust2` and drops `bincode`, or when alternative crates are available upstream.
Felipe Coury ·
2026-02-21 20:26:58 -08:00 -
Make shell detection tests
robust to Nix shell paths (#12476) ## Summary - Updated `codex-rs/core/src/shell.rs` tests for shell detection to stop asserting hardcoded shell paths. - `detects_bash` and `detects_sh` now assert executable basenames (`bash`, `sh`) rather than `/bin/*`/`/usr/bin/*` absolute paths. - This keeps behavior the same while avoiding failures in Nix environments where shells are resolved from `/nix/store/.../bin`. ## Testing - `nix develop .#default --command sh -lc 'export PKG_CONFIG_PATH=/nix/store/6az1q591wwlgazzskngr6rl7gmhpyvnc-libcap-2.77-dev/lib/pkgconfig:/nix/store/fgm3pz8486ksh3f94629lpb7xjr2wjp7-openssl-3.6.0-dev/lib/pkgconfig:$PKG_CONFIG_PATH; export PKG_CONFIG_PATH_FOR_TARGET=$PKG_CONFIG_PATH; cd /home/alex/workspace/openai/codex/codex-rs && cargo test -p codex-core --lib detects_bash && cargo test -p codex-core --lib detects_sh'` ## Why The two failing tests previously hardcoded fixed paths and failed under the Nix shell due to Nix-provided shell binary locations. ## Links - Bug report / enhancement request: not publicly filed yet; this was reproduced in the local Nix environment.
Alex Kwiatkowski ·
2026-02-21 20:08:02 -08:00 -
fix: make realtime conversation flake test order-insensitive (#12475)
## Why `codex-core::all` has a flaky test, `suite::realtime_conversation::conversation_start_audio_text_close_round_trip`, that assumes a fixed ordering between `conversation.item.create` and `response.input_audio.delta` requests. That ordering is not guaranteed: realtime text and audio input are forwarded through separate queues and a background task, so either request can be observed first while still being correct behavior. ## What Changed - Updated the assertion in `codex-rs/core/tests/suite/realtime_conversation.rs` to compare the two observed request types order-independently. - Kept the existing checks that `session.create` is sent first and that exactly two follow-up requests are recorded. ## Verification - Re-ran `cargo test -p codex-core --test all conversation_start_audio_text_close_round_trip` 10 times locally.
Michael Bolin ·
2026-02-21 17:06:35 -08:00 -
Ahmed Ibrahim ·
2026-02-21 15:46:03 -08:00 -
Route inbound realtime text into turn start or steer (#12469)
- Route inbound realtime websocket text into normal user input handling so it steers an active turn or starts a new one
Ahmed Ibrahim ·
2026-02-21 15:45:27 -08:00 -
fix(tui): preserve URL clickability across all TUI views (#12067)
## Problem Long URLs containing `/` and `-` characters are split across multiple terminal lines by `textwrap`'s default hyphenation rules. This breaks terminal link detection: emulators can no longer identify the URL as clickable, and copy-paste yields a truncated fragment. The issue affects every view that renders user or agent text — exec output, history cells, markdown, the app-link setup screen, and the VT100 scrollback path. A secondary bug compounds the first: `desired_height()` calculations count logical lines rather than viewport rows. When a URL overflows its line and wraps visually, the height budget is too small, causing content to clip or leave gaps. Here is how the complete URL is interpreted by the terminal before (first line only) and after (complete URL): | Before | After | |---|---| | <img width="777" height="1002" alt="Screenshot 2026-02-17 at 7 59 11 PM" src="https://github.com/user-attachments/assets/193a89a0-7e56-49c5-8b76-53499a76e7e3" /> | <img width="777" height="1002" alt="Screenshot 2026-02-17 at 7 58 40 PM" src="https://github.com/user-attachments/assets/0b9b4c14-aafb-439f-9ffe-f6bba556f95e" /> | ## Mental model The TUI now treats URL-like tokens as atomic units that must never be split by the wrapping engine. Every call site that previously used `word_wrap_*` has been migrated to `adaptive_wrap_*`, which inspects each line for URL-like tokens and switches wrapping strategy accordingly: - **Non-URL lines** follow the existing `textwrap` path unchanged (word boundaries, optional indentation, hyphenation). - **URL-only lines** (with at most decorative markers like `│`, `-`, `1.`) are emitted unwrapped so terminal link detection works; ratatui's `Wrap { trim: false }` handles the final character wrap at render time. - **Mixed lines** (URL + substantive non-URL prose) flow through `adaptive_wrap_line` so prose wraps naturally at word boundaries while URL tokens remain unsplit. Height measurement everywhere now delegates to `Paragraph::line_count(width)`, which accounts for the visual row cost of overflowed lines. This single source of truth replaces ad-hoc line counting in individual cells. For terminal scrollback (the VT100 path that prints history when the TUI exits), URL-only lines are emitted unwrapped so the terminal's own link detector can find them. Mixed URL+prose lines use adaptive wrapping so surrounding text wraps naturally. Continuation rows are pre-cleared to avoid stale content artifacts. ## Non-goals - Full RFC 3986 URL parsing. The detector is a conservative heuristic that covers `scheme://host`, bare domains (`example.com/path`), `localhost:port`, and IPv4 hosts. IPv6 (`[::1]:8080`) and exotic schemes are intentionally excluded from v1. - Changing wrapping behavior for non-URL content. - Reflowing or reformatting existing terminal scrollback on resize. ## Tradeoffs | Decision | Upside | Downside | |----------|--------|----------| | Heuristic URL detection vs. full parser | Fast, zero-alloc on the hot path; conservative enough to reject file paths like `src/main.rs` | False negatives on obscure URL formats (they get split as before) | | Adaptive (three-path) wrapping | Non-URL lines are untouched — no behavior change, no perf cost; mixed lines wrap prose naturally while preserving URLs | Three wrapping strategies to reason about when debugging layout | | Row-based truncation with line-unit ellipsis | Accurate viewport budget; stable "N lines omitted" count across terminal widths | `truncate_lines_middle` is more complex (must compute per-line row cost) | | Unwrapped URL-only lines in scrollback | Terminal emulators detect clickable links; copy-paste gets the full URL | TUI and scrollback formatting diverge for URL-only lines | | Default `desired_height` via `Paragraph::line_count` | DRY — most cells inherit correct measurement | Cells with custom layout must remember to override | ## Architecture ```mermaid flowchart TD A["adaptive_wrap_*()"] --> B{"line_contains_url_like?"} B -- No URL tokens --> C["word_wrap_line<br/>(textwrap default)"] B -- Has URL tokens --> D{"mixed URL + prose?"} D -- "URL-only<br/>(+ decorative markers)" --> E["emit unwrapped<br/>(terminal char-wraps)"] D -- "Mixed<br/>(URL + substantive text)" --> F["adaptive_wrap_line<br/>(AsciiSpace + custom WordSplitter)"] C --> G["Paragraph::line_count(w)<br/>(single height truth)"] E --> G F --> G ``` **Changed files:** | File | Role | |------|------| | `wrapping.rs` | URL detection heuristics, mixed-line detection, `adaptive_wrap_*` functions, custom `WordSplitter` | | `exec_cell/render.rs` | Row-aware `truncate_lines_middle`, adaptive wrapping for command/output display | | `history_cell.rs` | Migrate all cell types to `adaptive_wrap_*`; default `desired_height` via `Paragraph::line_count` | | `insert_history.rs` | Three-path scrollback wrapping (unwrapped URL-only, adaptive mixed, word-wrapped text); continuation row clearing | | `app_link_view.rs` | Adaptive wrapping for setup URL; `desired_height` via `Paragraph::line_count` | | `markdown_render.rs` | Adaptive wrapping in `finish_paragraph` | | `model_migration.rs` | Viewport-aware wrapping for narrow-pane markdown | | `pager_overlay.rs` | `Wrap { trim: false }` for transcript and streaming chunks | | `queued_user_messages.rs` | Migrate to `adaptive_wrap_lines` | | `status/card.rs` | Migrate to `adaptive_wrap_lines` | ## Observability - **Ellipsis message** in truncated exec output reports omitted count in logical lines (stable across resize) rather than viewport rows (fluctuates). - URL detection is deterministic and stateless — no hidden caching or memoization to go stale. - Height mismatch bugs surface immediately as visual clipping or gaps; the `Paragraph::line_count` path is the same code ratatui uses at render time, so measurement and rendering cannot diverge. ## Tests 26 new unit tests across 7 files, covering: - **URL integrity**: assert a URL-like token appears on exactly one rendered line (not split across two). - **Height accuracy**: compare `desired_height()` against `Paragraph::line_count()` for URL-containing content. - **Row-aware truncation**: verify ellipsis counts logical lines and output fits within the row budget. - **Scrollback rendering**: VT100 backend tests confirm prefix and URL land on the same row; continuation rows are cleared; mixed URL+prose lines wrap prose while preserving URL tokens. - **Mixed URL+prose detection**: `line_has_mixed_url_and_non_url_tokens` correctly distinguishes lines with substantive non-URL text from lines with only decorative markers alongside a URL. - **Heuristic correctness**: positive matches (`https://...`, `example.com/path`, `localhost:3000/api`, `192.168.1.1:8080/health`) and negative matches (`src/main.rs`, `foo/bar`, `hello-world`). ## Risks and open items 1. **URL-like tokens in code output** (e.g. `example.com/api` inside a JSON blob) will trigger URL-preserving wrap on that line. This is acceptable — the worst case is a slightly wider line, not broken output. 2. **Very long non-URL tokens on a URL line** can only break at character boundaries (the custom splitter emits all char indices for non-URL words). On extremely narrow terminals this could overflow, but narrow terminals already degrade gracefully. 3. **No IPv6 support** — `[::1]:8080/path` will be treated as a non-URL and may get split. Can be added later without API changes. Fixes #5457
Felipe Coury ·
2026-02-21 15:31:41 -08:00 -
Michael Bolin ·
2026-02-21 14:40:24 -08:00 -
Michael Bolin ·
2026-02-21 14:40:13 -08:00 -
Improve token usage estimate for images (#12419)
Fixes #11845. Adjust context/token estimation for inline image `data:*;base64,...` URLs so we do not count the raw base64 payload as model-visible text. What changed: - keep the existing JSON-length estimator as the baseline - detect only inline base64 `data:` image URLs in message and function-call output content items - subtract only the base64 payload bytes (preserving data URL prefix + JSON overhead) - add a fixed per-image estimate of 340 bytes (~85 tokens at the repo’s 4-bytes/token heuristic) This avoids large overestimates from MCP image tool outputs while leaving normal image URLs (`https://`, `file://`, non-base64 `data:` URLs) unchanged. Tests: - message image data URL estimate regression - function-call output image data URL estimate regression - non-base64 image URLs unchanged - non-base64 `data:` URLs unchanged - `data:application/octet-stream;base64,...` adjusted - multiple inline images apply multiple fixed costs - text-only items unchanged
Eric Traut ·
2026-02-21 14:25:36 -08:00 -
Prefer v2 websockets if available (#12428)
And also cleanup settings flow to avoid reading many separate flags. --------- Co-authored-by: Codex <noreply@openai.com>
pakrym-oai ·
2026-02-21 20:08:04 +00:00 -
Prevent replayed runtime events from forcing active status (#12420)
Fixes #11852 Resume replay was applying transient runtime events (`TurnStarted`, `StreamError`) as if they were live, which could leave the TUI stuck in a stale `Working` / `Reconnecting...` state after resuming an interrupted reconnect. This change makes replay transcript-oriented for these events by: - skipping retry-status restoration for replayed non-stream events - ignoring replayed `TurnStarted` for task-running state - ignoring replayed `StreamError` for reconnect/status UI Also adds TUI regression tests and snapshot coverage for the interrupted reconnect replay case.
Eric Traut ·
2026-02-21 11:55:03 -08:00 -
profile-level model_catalog_json overrie (#12410)
enable `model-catalog_json` config value on `ConfigProfile` as well
sayan-oai ·
2026-02-21 19:39:02 +00:00 -
feat(linux-sandbox): implement proxy-only egress via TCP-UDS-TCP bridge (#11293)
## Summary - Implement Linux proxy-only routing in `codex-rs/linux-sandbox` with a two-stage bridge: host namespace `loopback TCP proxy endpoint -> UDS`, then bwrap netns `loopback TCP listener -> host UDS`. - Add hidden `--proxy-route-spec` plumbing for outer-to-inner stage handoff. - Fail closed in proxy mode when no valid loopback proxy endpoints can be routed. - Introduce explicit network seccomp modes: `Restricted` (legacy restricted networking) and `ProxyRouted` (allow INET/INET6 for routed proxy access, deny `AF_UNIX` and `socketpair`). - Enforce that proxy bridge/routing is bwrap-only by validating `--apply-seccomp-then-exec` requires `--use-bwrap-sandbox`. - Keep landlock-only flows unchanged (no proxy bridge behavior outside bwrap). --------- Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
viyatb-oai ·
2026-02-21 18:16:34 +00:00 -
Delete AggregatedStream (#12441)
Used only in test
pakrym-oai ·
2026-02-21 08:50:27 +00:00 -
chore: delete empty codex-rs/code file (#12440)
This file was added in https://github.com/openai/codex/pull/4195, but I think it may have been a mistake?
Michael Bolin ·
2026-02-21 08:44:55 +00:00 -
refactor(core): move embedded system skills into codex-skills crate (#12435)
## Why `codex-core` was carrying the embedded system-skill sample assets (and a `build.rs` that walks those files to register rerun triggers). Those assets change infrequently, but any change under `codex-core` still ties them to `codex-core`'s build/cache lifecycle. This change moves the embedded system-skills packaging into a dedicated `codex-skills` crate so it can be cached independently. That reduces unnecessary invalidation/rebuild pressure on `codex-core` when the skills bundle is the only thing that changes. ## What Changed - Added a new `codex-rs/skills` crate (`codex-skills`) with: - `Cargo.toml` - `BUILD.bazel` - `build.rs` to track skill asset file changes for Cargo rebuilds - `src/lib.rs` containing the embedded system-skills install/cache logic previously in `codex-core` - Moved the embedded sample skill assets from `codex-rs/core/src/skills/assets/samples` to `codex-rs/skills/src/assets/samples`. - Updated `codex-rs/core/Cargo.toml` to depend on `codex-skills` and removed `codex-core`'s direct `include_dir` dependency. - Removed `codex-core`'s `build.rs`. - Replaced `codex-rs/core/src/skills/system.rs` implementation with a thin re-export wrapper to keep existing `codex-core` call sites unchanged. - Updated workspace manifests/lockfile (`codex-rs/Cargo.toml`, `codex-rs/Cargo.lock`) for the new crate.
Michael Bolin ·
2026-02-21 08:34:08 +00:00 -
fix: codex-arg0 no longer depends on codex-core (#12434)
## Why `codex-rs/arg0` only needed two things from `codex-core`: - the `find_codex_home()` wrapper - the special argv flag used for the internal `apply_patch` self-invocation path That made `codex-arg0` depend on `codex-core` for a very small surface area. This change removes that dependency edge and moves the shared `apply_patch` invocation flag to a more natural boundary (`codex-apply-patch`) while keeping the contract explicitly documented. ## What Changed - Moved the internal `apply_patch` argv[1] flag constant out of `codex-core` and into `codex-apply-patch`. - Renamed the constant to `CODEX_CORE_APPLY_PATCH_ARG1` and documented that it is part of the Codex core process-invocation contract (even though it now lives in `codex-apply-patch`). - Updated `arg0`, the core apply-patch runtime, and the `codex-exec` apply-patch test to import the constant from `codex-apply-patch`. - Updated `codex-rs/arg0` to call `codex_utils_home_dir::find_codex_home()` directly instead of `codex_core::config::find_codex_home()`. - Removed the `codex-core` dependency from `codex-rs/arg0` and added the needed direct dependency on `codex-utils-home-dir`. - Added `codex-apply-patch` as a dev-dependency for `codex-rs/exec` tests (the apply-patch test now imports the moved constant directly). ## Verification - `cargo test -p codex-apply-patch` - `cargo test -p codex-arg0` - `cargo test -p codex-core --lib apply_patch` - `cargo test -p codex-exec test_standalone_exec_cli_can_use_apply_patch` - `cargo shear`
Michael Bolin ·
2026-02-21 00:20:42 -08:00 -
chore: remove codex-core public protocol/shell re-exports (#12432)
## Why `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules from `codex-protocol` and `codex-shell-command`. That made it easy for workspace crates to import those APIs through `codex-core`, which in turn hides dependency edges and makes it harder to reduce compile-time coupling over time. This change removes those public re-exports so call sites must import from the source crates directly. Even when a crate still depends on `codex-core` today, this makes dependency boundaries explicit and unblocks future work to drop `codex-core` dependencies where possible. ## What Changed - Removed public re-exports from `codex-rs/core/src/lib.rs` for: - `codex_protocol::protocol` and related protocol/model types (including `InitialHistory`) - `codex_protocol::config_types` (`protocol_config_types`) - `codex_shell_command::{bash, is_dangerous_command, is_safe_command, parse_command, powershell}` - Migrated workspace Rust call sites to import directly from: - `codex_protocol::protocol` - `codex_protocol::config_types` - `codex_protocol::models` - `codex_shell_command` - Added explicit `Cargo.toml` dependencies (`codex-protocol` / `codex-shell-command`) in crates that now import those crates directly. - Kept `codex-core` internal modules compiling by using `pub(crate)` aliases in `core/src/lib.rs` (internal-only, not part of the public API). - Updated the two utility crates that can already drop a `codex-core` dependency edge entirely: - `codex-utils-approval-presets` - `codex-utils-cli` ## Verification - `cargo test -p codex-utils-approval-presets` - `cargo test -p codex-utils-cli` - `cargo check --workspace --all-targets` - `just clippy`Michael Bolin ·
2026-02-20 23:45:35 -08:00 -
Collapse waited message (#12430)
<img width="1349" height="148" alt="image" src="https://github.com/user-attachments/assets/98c96523-4cec-4bb1-9998-59d38e0bebb8" />
pakrym-oai ·
2026-02-20 23:32:59 -08:00 -
chore: move config diagnostics out of codex-core (#12427)
## Why Compiling `codex-rs/core` is a bottleneck for local iteration, so this change continues the ongoing extraction of config-related functionality out of `codex-core` and into `codex-config`. The goal is not just to move code, but to reduce `codex-core` ownership and indirection so more code depends on `codex-config` directly. ## What Changed - Moved config diagnostics logic from `core/src/config_loader/diagnostics.rs` into `config/src/diagnostics.rs`. - Updated `codex-core` to use `codex-config` diagnostics types/functions directly where possible. - Removed the `core/src/config_loader/diagnostics.rs` shim module entirely; the remaining `ConfigToml`-specific calls are in `core/src/config_loader/mod.rs`. - Moved `CONFIG_TOML_FILE` into `codex-config` and updated existing references to use `codex_config::CONFIG_TOML_FILE` directly. - Added a direct `codex-config` dependency to `codex-cli` for its `CONFIG_TOML_FILE` use.
Michael Bolin ·
2026-02-20 23:19:29 -08:00 -
Fix compaction context reinjection and model baselines (#12252)
## Summary - move regular-turn context diff/full-context persistence into `run_turn` so pre-turn compaction runs before incoming context updates are recorded - after successful pre-turn compaction, rely on a cleared `reference_context_item` to trigger full context reinjection on the follow-up regular turn (manual `/compact` keeps replacement history summary-only and also clears the baseline) - preserve `<model_switch>` when full context is reinjected, and inject it *before* the rest of the full-context items - scope `reference_context_item` and `previous_model` to regular user turns only so standalone tasks (`/compact`, shell, review, undo) cannot suppress future reinjection or `<model_switch>` behavior - make context-diff persistence + `reference_context_item` updates explicit in the regular-turn path, with clearer docs/comments around the invariant - stop persisting local `/compact` `RolloutItem::TurnContext` snapshots (only regular turns persist `TurnContextItem` now) - simplify resume/fork previous-model/reference-baseline hydration by looking up the last surviving turn context from rollout lifecycle events, including rollback and compaction-crossing handling - remove the legacy fallback that guessed from bare `TurnContext` rollouts without lifecycle events - update compaction/remote-compaction/model-visible snapshots and compact test assertions (including remote compaction mock response shape) ## Why We were persisting incoming context items before spawning the regular turn task, which let pre-turn compaction requests accidentally include incoming context diffs without the new user message. Fixing that exposed follow-on baseline issues around `/compact`, resume/fork, and standalone tasks that could cause duplicate context injection or suppress `<model_switch>` instructions. This PR re-centers the invariants around regular turns: - regular turns persist model-visible context diffs/full reinjection and update the `reference_context_item` - standalone tasks do not advance those regular-turn baselines - compaction clears the baseline when replacement history may have stripped the referenced context diffs ## Follow-ups (TODOs left in code) - `TODO(ccunningham)`: fix rollback/backtracking baseline handling more comprehensively - `TODO(ccunningham)`: include pending incoming context items in pre-turn compaction threshold estimation - `TODO(ccunningham)`: inject updated personality spec alongside `<model_switch>` so some model-switch paths can avoid forced full reinjection - `TODO(ccunningham)`: review task turn lifecycle (`TurnStarted`/`TurnComplete`) behavior and emit task-start context diffs for task types that should have them (excluding `/compact`) ## Validation - `just fmt` - CI should cover the updated compaction/resume/model-visible snapshot expectations and rollout-hydration behavior - I did **not** rerun the full local test suite after the latest resume-lookup / rollout-persistence simplifications
Charley Cunningham ·
2026-02-20 23:13:08 -08:00 -
feat: discourage the use of the --all-features flag (#12429)
## Why Developers are frequently running low on disk space, and routine use of `--all-features` contributes to larger Cargo build caches in `target/` by compiling additional feature combinations. This change updates local workflow guidance to avoid `--all-features` by default and reserve it for cases where full feature coverage is specifically needed. ## What Changed - Updated `AGENTS.md` guidance for `codex-rs` to recommend `cargo test` / `just test` for full-suite local runs, and to call out the disk-usage cost of routine `--all-features` usage. - Updated the root `justfile` so `just fix` and `just clippy` no longer pass `--all-features` by default. - Updated `docs/install.md` to explicitly describe `cargo test --all-features` as an optional heavier-weight run (more build time and `target/` disk usage). ## Verification - Confirmed the `justfile` parses and the recipes list successfully with `just --list`.
Michael Bolin ·
2026-02-20 23:02:24 -08:00 -
fix(core) Filter non-matching prefix rules (#12314)
## Summary `gpt-5.3-codex` really likes to write complicated shell scripts, and suggest a partial prefix_rule that wouldn't actually approve the command. We should only show the `prefix_rule` suggestion from the model if it would actually fully approve the command the user is seeing. This will technically cause more instances of overly-specific suggestions when we fallback, but I think the UX is clearer, particularly when the model doesn't necessarily understand the current limitations of execpolicy parsing. ## Testing - [x] Add unit tests - [x] Add integration tests
Dylan Hurd ·
2026-02-20 22:02:35 -08:00 -
ignore v1 in JSON schema codegen (#12408)
## Why The generated unnamespaced JSON envelope schemas (`ClientRequest` and `ServerNotification`) still contained both v1 and v2 variants, which pulled legacy v1/core types and v2 types into the same `definitions` graph. That caused `schemars` to produce numeric suffix names (for example `AskForApproval2`, `ByteRange2`, `MessagePhase2`). This PR moves JSON codegen toward v2-only output while preserving the unnamespaced envelope artifacts, and avoids reintroducing numeric-suffix tolerance by removing the v1/internal-only variants that caused the collisions in those envelope schemas. ## What Changed - In `codex-rs/app-server-protocol/src/export.rs`, JSON generation now excludes v1 schema artifacts (`v1/*`) while continuing to emit unnamespaced/root JSON schemas and the JSON bundle. - Added a narrow JSON v1 allowlist (`JSON_V1_ALLOWLIST`) so `InitializeParams` and `InitializeResponse` are still emitted. - Added JSON-only post-processing for the mixed envelope schemas before collision checks run: - `ClientRequest`: strips v1 request variants from the generated `oneOf` using the temporary `V1_CLIENT_REQUEST_METHODS` list - `ServerNotification`: strips v1 notifications plus the internal-only `rawResponseItem/completed` notification using the temporary `EXCLUDED_SERVER_NOTIFICATION_METHODS_FOR_JSON` list - Added a temporary local-definition pruning pass for those envelope schemas so now-unreferenced v1/core definitions are removed from `definitions` after method filtering. - Updated the variant-title naming heuristic for single-property literal object variants to use the literal value (when available), avoiding collisions like multiple `state`-only variants all deriving the same title. - Collision handling remains fail-fast (no numeric suffix fallback map in this PR path). ## Verification - `just write-app-server-schema` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12408). * __->__ #12408 * #12406
Michael Bolin ·
2026-02-20 21:36:12 -08:00 -
test(app-server): wait for turn/completed in turn_start tests (#12376)
## Summary - switch a few app-server `turn_start` tests from `codex/event/task_complete` waits to `turn/completed` waits - avoid matching unrelated/background `task_complete` events - keep this flaky test fix separate from the /title feature PR ## Why On Windows ARM CI, these tests can return early after observing a generic `codex/event/task_complete` notification from another task. That can leave the mock Responses server with fewer calls than expected and fail the test with a wiremock verification mismatch. Using `turn/completed` matches the app-server turn lifecycle notification the tests actually care about. ## Validation - `cargo test -p codex-app-server turn_start_updates_sandbox_and_cwd_between_turns_v2 -- --nocapture` - `cargo test -p codex-app-server turn_start_exec_approval_ -- --nocapture` - `just fmt`
Yaroslav Volovich ·
2026-02-20 21:15:21 -08:00 -
feat: use OAI Responses API MessagePhase type directly in App Server v2 (#12422)
https://github.com/openai/codex/pull/10455 introduced the `phase` field, and then https://github.com/openai/codex/pull/12072 introduced a `MessagePhase` type in `v2.rs` that paralleled the `MessagePhase` type in `codex-rs/protocol/src/models.rs`. The app server protocol prefers `camelCase` while the Responses API uses `snake_case`, so this meant we had two versions of `MessagePhase` with different serialization rules. When the app server protocol refers to types from the Responses API, we use the wire format of the the Responses API even though it is inconsistent with the app server API. This PR deletes `MessagePhase` from `v2.rs` and consolidates on the Responses API version to eliminate confusion.
Michael Bolin ·
2026-02-20 20:43:36 -08:00 -
fix: address flakiness in thread_resume_rejoins_running_thread_even_with_override_mismatch (#12381)
## Why `thread/resume` responses for already-running threads can be reported as `Idle` even while a turn is still in progress. This is caused by a timing window where the runtime watch state has not yet observed the running-thread transition, so API clients can receive stale status information at resume time. Possibly related: https://github.com/openai/codex/pull/11786 ## What - Add a shared status normalization helper, `resolve_thread_status`, in `codex-rs/app-server/src/thread_status.rs` that resolves `Idle`/`NotLoaded` to `Active { active_flags: [] }` when an in-progress turn is known. - Reuse this helper across thread response paths in `codex-rs/app-server/src/codex_message_processor.rs` (including `thread/start`, `thread/unarchive`, `thread/read`, `thread/resume`, `thread/fork`, and review/thread-started notification responses). - In `handle_pending_thread_resume_request`, use both the in-memory `active_turn_snapshot` and the resumed rollout turns to decide whether a turn is in progress before resolving thread status for the response. - Extend `thread_status` tests to validate the new status-resolution behavior directly. ## Verification - `cargo test -p codex-app-server suite::v2::thread_resume::thread_resume_rejoins_running_thread_even_with_override_mismatch`
Michael Bolin ·
2026-02-20 20:36:04 -08:00 -
Add experimental realtime websocket backend prompt override (#12418)
- add top-level `experimental_realtime_ws_backend_prompt` config key (experimental / do not use) and include it in config schema - apply the override only to `Op::RealtimeConversation` websocket `backend_prompt`, with config + realtime tests
Ahmed Ibrahim ·
2026-02-20 20:10:51 -08:00 -
Improve Plan mode reasoning selection flow (#12303)
Addresses https://github.com/openai/codex/issues/11013 ## Summary - add a Plan implementation path in the TUI that lets users choose reasoning before switching to Default mode and implementing - add Plan-mode reasoning scope handling (Plan-only override vs all-modes default), including config/schema/docs plumbing for `plan_mode_reasoning_effort` - remove the hardcoded Plan preset medium default and make the reasoning popup reflect the active Plan override as `(current)` - split the collaboration-mode switch notification UI hint into #12307 to keep this diff focused If I have `plan_mode_reasoning_effort = "medium"` set in my `config.toml`: <img width="699" height="127" alt="Screenshot 2026-02-20 at 6 59 37 PM" src="https://github.com/user-attachments/assets/b33abf04-6b7a-49ed-b2e9-d24b99795369" /> If I don't have `plan_mode_reasoning_effort` set in my `config.toml`: <img width="704" height="129" alt="Screenshot 2026-02-20 at 7 01 51 PM" src="https://github.com/user-attachments/assets/88a086d4-d2f1-49c7-8be4-f6f0c0fa1b8d" /> ## Codex author `codex resume 019c78a2-726b-7fe3-adac-3fa4523dcc2a`
Charley Cunningham ·
2026-02-20 20:08:56 -08:00 -
Add experimental realtime websocket URL override (#12416)
- add top-level `experimental_realtime_ws_base_url` config key (experimental / do not use) and include it in config schema - apply the override only to `Op::RealtimeConversation` websocket transport, with config + realtime tests
Ahmed Ibrahim ·
2026-02-20 19:51:20 -08:00 -
fix(nix): include libcap dependency on linux builds (#12415)
commit
923f931121introduced a dependency on `libcap`. This PR fixes the nix build by including `libcap` in nix's build inputs issue number: #12102. @etraut-openai gave me permission to open pr Testing: running `nix run .#codex-rs` works on both macos (aarch64) and nixos (x86-64)Rohan Godha ·
2026-02-20 19:32:15 -08:00 -
Wire realtime api to core (#12268)
- Introduce `RealtimeConversationManager` for realtime API management - Add `op::conversation` to start conversation, insert audio, insert text, and close conversation. - emit conversation lifecycle and realtime events. - Move shared realtime payload types into codex-protocol and add core e2e websocket tests for start/replace/transport-close paths. Things to consider: - Should we use the same `op::` and `Events` channel to carry audio? I think we should try this simple approach and later we can create separate one if the channels got congested. - Sending text updates to the client: we can start simple and later restrict that. - Provider auth isn't wired for now intentionally
Ahmed Ibrahim ·
2026-02-20 19:06:35 -08:00 -
Add field to Thread object for the latest rename set for a given thread (#12301)
Exposes through the app server updated names set for a thread. This enables other surfaces to use the core as the source of truth for thread naming. `threadName` is gathered using the helper functions used to interact with `session_index.jsonl`, and is hydrated in: - `thread/list` - `thread/read` - `thread/resume` - `thread/unarchive` - `thread/rollback` We don't do this for `thread/start` and `thread/fork`.
natea-oai ·
2026-02-20 18:26:57 -08:00 -
fix: explicitly list name collisions in JSON schema generation (#12406)
## Why JSON schema codegen was silently resolving naming collisions by appending numeric suffixes (for example `...2`, `...3`). That makes the generated schema names unstable: removing an earlier colliding type can cause a later type to be renumbered, which is a breaking change for consumers that referenced the old generated name. This PR makes those collisions explicit and reviewable. Though note that once we remove `v1` from the codegen, we will no longer support naming collisions. Or rather, naming collisions will have to be handled explicitly rather than the numeric suffix approach. ## What Changed - In `codex-rs/app-server-protocol/src/export.rs`, replaced implicit numeric suffix collision handling for generated variant titles with explicit special-case maps. - Added a panic when a collision occurs without an entry in the map, so new collisions fail loudly instead of silently renaming generated schema types. - Added the currently required special cases so existing generated names remain stable. - Extended the same approach to numbered `definitions` / `$defs` collisions (for example `MessagePhase2`-style names) so those are also explicitly tracked. ## Verification - Ran targeted generator-path test: - `cargo test -p codex-app-server-protocol generate_json_filters_experimental_fields_and_methods -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12406). * #12408 * __->__ #12406
Michael Bolin ·
2026-02-20 17:51:53 -08:00 -
[apps] Bump MCP tool call timeout. (#12405)
- [x] Bump MCP tool call timeout.
Matthew Zeng ·
2026-02-20 17:35:07 -08:00 -
fix(tui): queued-message edit shortcut unreachable in some terminals (#12240)
## Problem The TUI's "edit queued message" shortcut (Alt+Up) is either silently swallowed or recognized as another key combination by Apple Terminal, Warp, and VSCode's integrated terminal on macOS. Users in those environments see the hint but pressing the keys does nothing. ## Mental model When a model turn is in progress the user can still type follow-up messages. These are queued and displayed below the composer with a hint line showing how to pop the most recent one back into the editor. The hint text and the actual key handler must agree on which shortcut is used, and that shortcut must actually reach the TUI—i.e. it must not be intercepted by the host terminal. Three terminals are known to intercept Alt+Up: Apple Terminal (remaps it to cursor movement), Warp (consumes it for its own command palette), and VSCode (maps it to "move line up"). For these we use Shift+Left instead. <p align="center"> <img width="283" height="182" alt="image" src="https://github.com/user-attachments/assets/4a9c5d13-6e47-4157-bb41-28b4ce96a914" /> </p> | macOS Native Terminal | Warp | VSCode Terminal | |---|---|---| | <img width="1557" height="1010" alt="SCR-20260219-kigi" src="https://github.com/user-attachments/assets/f4ff52f8-119e-407b-a3f3-52f564c36d70" /> | <img width="1479" height="1261" alt="SCR-20260219-krrf" src="https://github.com/user-attachments/assets/5807d7c4-17ae-4a2b-aa27-238fd49d90fd" /> | <img width="1612" height="1312" alt="SCR-20260219-ksbz" src="https://github.com/user-attachments/assets/1cedb895-6966-4d63-ac5f-0eea0f7057e8" /> | ## Non-goals - Making the binding user-configurable at runtime (deferred to a broader keybinding-config effort). - Remapping any other shortcuts that might be terminal-specific. ## Tradeoffs - **Exhaustive match instead of a wildcard default.** The `queued_message_edit_binding_for_terminal` function explicitly lists every `TerminalName` variant. This is intentional: adding a new terminal to the enum will produce a compile error, forcing the author to decide which binding that terminal should use. - **Binding lives on `ChatWidget`, hint lives on `QueuedUserMessages`.** The key event handler that actually acts on the press is in `ChatWidget`, but the rendered hint text is inside `QueuedUserMessages`. These are kept in sync by `ChatWidget` calling `bottom_pane.set_queued_message_edit_binding(self.queued_message_edit_binding)` during construction. A mismatch would show the wrong hint but would not lose data. ## Architecture ```mermaid graph TD TI["terminal_info().name"] --> FN["queued_message_edit_binding_for_terminal(name)"] FN --> KB["KeyBinding"] KB --> CW["ChatWidget.queued_message_edit_binding<br/><i>key event matching</i>"] KB --> BP["BottomPane.set_queued_message_edit_binding()"] BP --> QUM["QueuedUserMessages.edit_binding<br/><i>rendered in hint line</i>"] subgraph "Special terminals (Shift+Left)" AT["Apple Terminal"] WT["Warp"] VS["VSCode"] end subgraph "Default (Alt+Up)" GH["Ghostty"] IT["iTerm2"] OT["Others…"] end AT --> FN WT --> FN VS --> FN GH --> FN IT --> FN OT --> FN ``` No new crates or public API surface. The only cross-crate dependency added is `codex_core::terminal::{TerminalName, terminal_info}`, which already existed for telemetry. ## Observability No new logging. Terminal detection already emits a `tracing::debug!` log line at startup with the detected terminal name, which is sufficient to diagnose binding mismatches. ## Tests - Existing `alt_up_edits_most_recent_queued_message` test is preserved and explicitly sets the Alt+Up binding to isolate from the host terminal. - New parameterized async tests verify Shift+Left works for Apple Terminal, Warp, and VSCode. - A sync unit test asserts the mapping table covers the three special terminals (Shift+Left) and that iTerm2 still gets Alt+Up. Fixes #4490
Felipe Coury ·
2026-02-20 16:56:41 -08:00 -
[apps] Fix gateway url. (#12403)
- [x] Fix connectors gateway url.
Matthew Zeng ·
2026-02-21 00:47:15 +00:00 -
Show model/reasoning hint when switching modes (#12307)
## Summary - show an info message when switching collaboration modes changes the effective model or reasoning - include the target mode in the message (for example `... for Plan mode.`) - add TUI tests for model-change and reasoning-only change notifications on mode switch <img width="715" height="184" alt="Screenshot 2026-02-20 at 2 01 40 PM" src="https://github.com/user-attachments/assets/18d1beb3-ab87-4e1c-9ada-a10218520420" />
Charley Cunningham ·
2026-02-20 15:22:10 -08:00 -
clarify model_catalog_json only applied on startup (#12379)
# External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.
sayan-oai ·
2026-02-20 15:04:36 -08:00 -
Move sanitizer into codex-secrets (#12306)
## Summary - move the sanitizer implementation into `codex-secrets` (`secrets/src/sanitizer.rs`) and re-export `redact_secrets` - switch `codex-core` to depend on/import `codex-secrets` for sanitizer usage - remove the old `utils/sanitizer` crate wiring and refresh lockfiles ## Testing - `just fmt` - `cargo test -p codex-secrets` - `cargo test -p codex-core --no-run` - `cargo clippy -p codex-secrets -p codex-core --all-targets --all-features -- -D warnings` - `just bazel-lock-update` - `just bazel-lock-check` ## Notes - not run: `cargo test --all-features` (full workspace suite)
viyatb-oai ·
2026-02-20 22:47:54 +00:00