Commit Graph

26 Commits

  • tui: double-press Ctrl+C/Ctrl+D to quit (#8936)
    ## Problem
    
    Codex’s TUI quit behavior has historically been easy to trigger
    accidentally and hard to reason
    about.
    
    - `Ctrl+C`/`Ctrl+D` could terminate the UI immediately, even though
      these are keys users commonly press while trying to dismiss a modal,
      cancel a command, or recover from a stuck state.
    - “Quit” and “shutdown” were not consistently separated, so some exit
    paths could bypass the
      shutdown/cleanup work that should run before the process terminates.
    
    This PR makes quitting both safer (harder to do by accident) and more
    uniform across quit
    gestures, while keeping the shutdown-first semantics explicit.
    
    ## Mental model
    
    After this change, the system treats quitting as a UI request that is
    coordinated by the app
    layer.
    
    - The UI requests exit via `AppEvent::Exit(ExitMode)`.
    - `ExitMode::ShutdownFirst` is the normal user path: the app triggers
    `Op::Shutdown`, continues
    rendering while shutdown runs, and only ends the UI loop once shutdown
    has completed.
    - `ExitMode::Immediate` exists as an escape hatch (and as the
    post-shutdown “now actually exit”
    signal); it bypasses cleanup and should not be the default for
    user-triggered quits.
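
    The exit flow above can be sketched roughly as follows. Only
    `AppEvent::Exit`, `ExitMode::ShutdownFirst`, `ExitMode::Immediate`, and
    `Op::Shutdown` are named in this description; the `App` fields and the
    `on_shutdown_complete` hook are hypothetical and exist only to make the
    sketch self-contained.

    ```rust
    /// How the UI asks the app layer to exit.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ExitMode {
        /// Normal user path: run shutdown first, then exit.
        ShutdownFirst,
        /// Escape hatch and post-shutdown signal: exit without cleanup.
        Immediate,
    }

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum AppEvent {
        Exit(ExitMode),
    }

    /// Operations sent to core (only the one needed for this sketch).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Op {
        Shutdown,
    }

    /// Hypothetical stand-in for the app layer.
    #[derive(Default)]
    pub struct App {
        /// Ops submitted so far (assumption: stands in for a real channel).
        pub submitted: Vec<Op>,
        /// Set once the UI loop should stop.
        pub stopped: bool,
    }

    impl App {
        pub fn handle_event(&mut self, event: AppEvent) {
            match event {
                AppEvent::Exit(ExitMode::ShutdownFirst) => {
                    // Request shutdown but keep the UI loop running meanwhile.
                    self.submitted.push(Op::Shutdown);
                }
                AppEvent::Exit(ExitMode::Immediate) => {
                    // No cleanup here: just end the UI loop.
                    self.stopped = true;
                }
            }
        }

        /// Hypothetical hook invoked when core reports that shutdown finished.
        pub fn on_shutdown_complete(&mut self) {
            self.handle_event(AppEvent::Exit(ExitMode::Immediate));
        }
    }
    ```

    In this sketch the UI loop only stops after `on_shutdown_complete` runs,
    which mirrors the "shutdown-first" ordering.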
    
    User-facing quit gestures are intentionally “two-step” for safety:
    
    - `Ctrl+C` and `Ctrl+D` no longer exit immediately.
    - The first press arms a 1-second window and shows a footer hint (“ctrl
    + <key> again to quit”).
    - Pressing the same key again within the window requests a
    shutdown-first quit; otherwise the
      hint expires and the next press starts a fresh window.
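
    A minimal sketch of the two-step gesture as a time-bounded state
    machine. The type and method names (`QuitShortcut`, `QuitAction`,
    `on_press`) are hypothetical, and treating a different key as the start
    of a fresh window is an assumption of this sketch. The caller passes in
    the current time so the logic can be tested without sleeping.

    ```rust
    use std::time::{Duration, Instant};

    /// Window in which a second press confirms the quit (1 second per the text).
    const QUIT_WINDOW: Duration = Duration::from_secs(1);

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum QuitKey {
        CtrlC,
        CtrlD,
    }

    #[derive(Debug, PartialEq, Eq)]
    pub enum QuitAction {
        /// First press: arm the shortcut and show the footer hint.
        ShowHint,
        /// Same key pressed again inside the window: request a shutdown-first quit.
        RequestQuit,
    }

    /// Hypothetical holder for the armed key and its deadline.
    #[derive(Default)]
    pub struct QuitShortcut {
        armed: Option<(QuitKey, Instant)>,
    }

    impl QuitShortcut {
        pub fn on_press(&mut self, key: QuitKey, now: Instant) -> QuitAction {
            match self.armed {
                Some((armed_key, deadline)) if armed_key == key && now < deadline => {
                    self.armed = None;
                    QuitAction::RequestQuit
                }
                // Not armed, expired, or a different key: start a fresh window.
                _ => {
                    self.armed = Some((key, now + QUIT_WINDOW));
                    QuitAction::ShowHint
                }
            }
        }

        /// Disarms the shortcut, e.g. after a modal consumed the key.
        pub fn clear(&mut self) {
            self.armed = None;
        }
    }
    ```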
    
    Key routing remains modal-first:
    
    - A modal/popup gets first chance to consume `Ctrl+C`.
    - If a modal handles `Ctrl+C`, any armed quit shortcut is cleared so
    dismissing a modal cannot
      prime a subsequent `Ctrl+C` to quit.
    - `Ctrl+D` only participates in quitting when the composer is empty and
    no modal/popup is active.
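
    The routing rules above might look roughly like this. Everything here
    (`Routing`, `Routed`, `ctrl_d_may_quit`) is a hypothetical illustration
    of the invariants, not the actual `BottomPane`/`ChatWidget` code.

    ```rust
    #[derive(Debug, PartialEq, Eq)]
    pub enum Routed {
        /// The active modal/popup consumed the key.
        HandledByModal,
        /// The key fell through to the quit-shortcut logic.
        PassedToQuitLogic,
    }

    /// Hypothetical routing state; `quit_armed` stands in for the armed shortcut.
    pub struct Routing {
        pub modal_active: bool,
        pub modal_consumes_ctrl_c: bool,
        pub quit_armed: bool,
    }

    impl Routing {
        /// Modal-first: the modal gets the first chance at `Ctrl+C`.
        pub fn on_ctrl_c(&mut self) -> Routed {
            if self.modal_active && self.modal_consumes_ctrl_c {
                // Dismissing a modal must not prime a later `Ctrl+C` to quit.
                self.quit_armed = false;
                return Routed::HandledByModal;
            }
            Routed::PassedToQuitLogic
        }
    }

    /// `Ctrl+D` only takes part in quitting with an empty composer and no modal.
    pub fn ctrl_d_may_quit(composer_empty: bool, modal_active: bool) -> bool {
        composer_empty && !modal_active
    }
    ```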
    
    The design doc `docs/exit-confirmation-prompt-design.md` captures the
    intended routing and the
    invariants the UI should maintain.
    
    ## Non-goals
    
    - This does not attempt to redesign modal UX or make modals uniformly
    dismissible via `Ctrl+C`.
    It only ensures modals get priority and that quit arming does not leak
    across modal handling.
    - This does not introduce a persistent confirmation prompt/menu for
    quitting; the goal is to keep
      the exit gesture lightweight and consistent.
    - This does not change the semantics of core shutdown itself; it changes
    how the UI requests and
      sequences it.
    
    ## Tradeoffs
    
    - Quitting via `Ctrl+C`/`Ctrl+D` now requires a deliberate second
    keypress, which adds friction for
      users who relied on the old “instant quit” behavior.
    - The UI now maintains a small time-bounded state machine for the armed
    shortcut, which increases
      complexity and introduces timing-dependent behavior.
    
    This design was chosen over alternatives (a modal confirmation prompt or
    a long-lived “are you
    sure” state) because it provides an explicit safety barrier while
    keeping the flow fast and
    keyboard-native.
    
    ## Architecture
    
    - `ChatWidget` owns the quit-shortcut state machine and decides when a
    quit gesture is allowed
      (idle vs cancellable work, composer state, etc.).
    - `BottomPane` owns rendering and local input routing for modals/popups.
    It is responsible for
    consuming cancellation keys when a view is active and for
    showing/expiring the footer hint.
    - `App` owns shutdown sequencing: translating
    `AppEvent::Exit(ShutdownFirst)` into `Op::Shutdown`
      and only terminating the UI loop when exit is safe.
    
    This keeps “what should happen” decisions (quit vs interrupt vs ignore)
    in the chat/widget layer,
    while keeping “how it looks and which view gets the key” in the
    bottom-pane layer.
    
    ## Observability
    
    You can tell this is working by running the TUIs and exercising the quit
    gestures:
    
    - While idle: pressing `Ctrl+C` (or `Ctrl+D` with an empty composer and
    no modal) shows a footer
    hint for ~1 second; pressing again within that window exits via
    shutdown-first.
    - While streaming/tools/review are active: `Ctrl+C` interrupts work
    rather than quitting.
    - With a modal/popup open: `Ctrl+C` dismisses/handles the modal (if it
    chooses to) and does not
    arm a quit shortcut; a subsequent quick `Ctrl+C` should not quit unless
    the user re-arms it.
    
    Failure modes are visible as:
    
    - Quits that happen immediately (no hint window) from `Ctrl+C`/`Ctrl+D`.
    - Quits that occur while a modal is open and consuming `Ctrl+C`.
    - UI termination before shutdown completes (cleanup skipped).
    
    ## Tests
    
    - Updated/added unit and snapshot coverage in `codex-tui` and
      `codex-tui2` to validate:
      - The quit hint appears and expires on the expected key.
      - Double-press within the window triggers a shutdown-first quit request.
      - Modal-first routing prevents quit bypass and clears any armed shortcut
        when a modal consumes `Ctrl+C`.
    
    These tests focus on the UI-level invariants and rendered output; they
    do not attempt to validate
    real terminal key-repeat timing or end-to-end process shutdown behavior.
    
    ---
    Screenshot:
    <img width="912" height="740" alt="Screenshot 2026-01-13 at 1 05 28 PM"
    src="https://github.com/user-attachments/assets/18f3d22e-2557-47f2-a369-ae7a9531f29f"
    />
  • Improve handling of config and rules errors for app server clients (#9182)
    When an invalid config.toml key or value is detected, the CLI currently
    just quits. This leaves the VSCE in a dead state.
    
    This PR changes the behavior to not quit and bubble up the config error
    to users to make it actionable. It also surfaces errors related to
    "rules" parsing.
    
    This allows us to surface these errors to users in the VSCE, like this:
    
    <img width="342" height="129" alt="Screenshot 2026-01-13 at 4 29 22 PM"
    src="https://github.com/user-attachments/assets/a79ffbe7-7604-400c-a304-c5165b6eebc4"
    />
    
    <img width="346" height="244" alt="Screenshot 2026-01-13 at 4 45 06 PM"
    src="https://github.com/user-attachments/assets/de874f7c-16a2-4a95-8c6d-15f10482e67b"
    />
  • fix: integration test for #9011 (#9166)
    Adds an integration test for the new behavior introduced in
    https://github.com/openai/codex/pull/9011. The work to create the test
    setup was substantial enough that I thought it merited a separate PR.
    
    This integration test spawns `codex` in TUI mode, which requires
    spawning a PTY to run successfully, so I had to introduce quite a bit of
    scaffolding in `run_codex_cli()`. I was surprised to discover that we
    have not done this in our codebase before, so perhaps this should get
    moved to a common location so it can be reused.
    
    The test itself verifies that a malformed `rules` in `$CODEX_HOME`
    prints a human-readable error message and exits nonzero.
  • chore: delete chatwidget::tests::binary_size_transcript_snapshot tui test (#6759)
    We're running into quite a bit of drag maintaining this test, since
    every time we add fields to an EventMsg that happened to be dumped into
    the `binary-size-log.jsonl` fixture, this test starts to fail. The fix
    is usually to either manually update the `binary-size-log.jsonl` fixture
    file, or update the `upgrade_event_payload_for_tests` function to map
    the data in that file into something workable.
    
    Eason says it's fine to delete this test, so let's just delete it.
  • fix(tui): propagate errors in insert_history_lines_to_writer (#4266)
    ## What?
    Fixed error handling in `insert_history_lines_to_writer` where all
    terminal operations were silently ignoring errors via `.ok()`.
    
    ## Why?
    Silent I/O failures could leave the terminal in an inconsistent state
    (e.g., scroll region not reset) with no way to debug. This violates Rust
    error handling best practices.
    
    ## How?
    - Changed the function signature to return `io::Result<()>`
    - Replaced all `.ok()` calls with the `?` operator to propagate errors
    - Added `tracing::warn!` in the wrapper function for backward compatibility
    - Updated 15 test call sites to handle the `Result` with `.expect()`
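
    A simplified sketch of the pattern, not the real function: the names
    `insert_lines_to_writer` and `insert_lines` and their signatures are
    hypothetical, and `eprintln!` stands in for `tracing::warn!` so the
    example has no external dependencies.

    ```rust
    use std::io::{self, Write};

    /// Writes each line and propagates the first I/O failure to the caller.
    pub fn insert_lines_to_writer<W: Write>(writer: &mut W, lines: &[&str]) -> io::Result<()> {
        for line in lines {
            // Previously something like `writeln!(...).ok()` would have dropped the error.
            writeln!(writer, "{line}")?;
        }
        writer.flush()?;
        Ok(())
    }

    /// Backward-compatible wrapper: keeps the old `()` return type but logs failures.
    pub fn insert_lines<W: Write>(writer: &mut W, lines: &[&str]) {
        if let Err(err) = insert_lines_to_writer(writer, lines) {
            // Assumption: the real code logs through `tracing::warn!` instead.
            eprintln!("failed to insert history lines: {err}");
        }
    }
    ```

    Test call sites can then unwrap explicitly, for example
    `insert_lines_to_writer(&mut buf, &["a"]).expect("write failed")`.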
    
    ## Testing
    - All tests pass
    
    ## Type of Change
    - [x] Bug fix (non-breaking change)
    
    ---------
    
    Signed-off-by: Huaiwu Li <lhwzds@gmail.com>
    Co-authored-by: Eric Traut <etraut@openai.com>
  • feat: add Vec<ParsedCommand> to ExecApprovalRequestEvent (#5222)
    This adds `parsed_cmd: Vec<ParsedCommand>` to `ExecApprovalRequestEvent`
    in the core protocol (`protocol/src/protocol.rs`), which is also what
    this field is named on `ExecCommandBeginEvent`. Honestly, I don't love
    the name (it sounds like a single command, but it is actually a list of
    them), but I don't want to get distracted by a naming discussion right
    now.
    
    This also adds `parsed_cmd` to `ExecCommandApprovalParams` in
    `codex-rs/app-server-protocol/src/protocol.rs`, so it will be available
    via `codex app-server`, as well.
    
    For consistency, I also updated `ExecApprovalElicitRequestParams` in
    `codex-rs/mcp-server/src/exec_approval.rs` to include this field under
    the name `codex_parsed_cmd`, as that struct already has a number of
    special `codex_*` fields. Note this is the code for when Codex is used
    as an MCP _server_ and therefore has to conform to the official spec for
    an MCP elicitation type.
  • update composer + user message styling (#4240)
    Changes:
    
    - the composer and user messages now have a colored background that
    stretches the entire width of the terminal.
    - the prompt character was changed from a cyan `▌` to a bold `›`.
    - the "working" shimmer now follows the "dark gray" color of the
    terminal, better matching the terminal's color scheme
    
    | Terminal + Background        | Screenshot |
    |------------------------------|------------|
    | iTerm with dark bg | <img width="810" height="641" alt="Screenshot
    2025-09-25 at 11 44 52 AM"
    src="https://github.com/user-attachments/assets/1317e579-64a9-4785-93e6-98b0258f5d92"
    /> |
    | iTerm with light bg | <img width="845" height="540" alt="Screenshot
    2025-09-25 at 11 46 29 AM"
    src="https://github.com/user-attachments/assets/e671d490-c747-4460-af0b-3f8d7f7a6b8e"
    /> |
    | iTerm with color bg | <img width="825" height="564" alt="Screenshot
    2025-09-25 at 11 47 12 AM"
    src="https://github.com/user-attachments/assets/141cda1b-1164-41d5-87da-3be11e6a3063"
    /> |
    | Terminal.app with dark bg | <img width="577" height="367"
    alt="Screenshot 2025-09-25 at 11 45 22 AM"
    src="https://github.com/user-attachments/assets/93fc4781-99f7-4ee7-9c8e-3db3cd854fe5"
    /> |
    | Terminal.app with light bg | <img width="577" height="367"
    alt="Screenshot 2025-09-25 at 11 46 04 AM"
    src="https://github.com/user-attachments/assets/19bf6a3c-91e0-447b-9667-b8033f512219"
    /> |
    | Terminal.app with color bg | <img width="577" height="367"
    alt="Screenshot 2025-09-25 at 11 45 50 AM"
    src="https://github.com/user-attachments/assets/dd7c4b5b-342e-4028-8140-f4e65752bd0b"
    /> |
  • feat: update default (#4076)
    Changes:
    - Default model and docs now use gpt-5-codex. 
    - Disables the GPT-5 Codex NUX by default.
    - Keeps presets available for API key users.
  • feat: include reasoning_effort in NewConversationResponse (#3506)
    `ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.
  • fix: include rollout_path in NewConversationResponse (#3352)
    Adding the `rollout_path` to the `NewConversationResponse` makes it so a
    client can perform subsequent operations on a `(ConversationId,
    PathBuf)` pair. #3353 will introduce support for `ArchiveConversation`.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/3352).
    * #3353
    * __->__ #3352
  • fix: fix serde_as annotation and verify with test (#3170)
    I didn't do https://github.com/openai/codex/pull/3163 correctly the
    first time: now verified with a test.
  • prefer ratatui Stylized for constructing lines/spans (#3068)
    no functional change, just simplifying ratatui styling and adding
    guidance in AGENTS.md for future.
  • do not show timeouts as "sandbox error"s (#2587)
    🙅🫸
    ```
    ✗ Failed (exit -1)
      └ 🧪 cargo test --all-features -q
        sandbox error: command timed out
    ```
    
    😌👉
    ```
    ✗ Failed (exit -1)
      └ 🧪 cargo test --all-features -q
        error: command timed out
    ```
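
    One way to get the output shown above is to pick the message prefix per
    error kind. This is only an illustration: the `ExecFailure` type and its
    variants are hypothetical and are not taken from the codebase.

    ```rust
    use std::fmt;

    /// Hypothetical failure type for a command run under the sandbox.
    #[derive(Debug)]
    pub enum ExecFailure {
        /// The command exceeded its time limit.
        Timeout,
        /// The sandbox itself rejected or broke the command.
        Sandbox(String),
    }

    impl fmt::Display for ExecFailure {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                // A timeout is not a sandbox failure, so it gets the plain prefix.
                ExecFailure::Timeout => write!(f, "error: command timed out"),
                ExecFailure::Sandbox(msg) => write!(f, "sandbox error: {msg}"),
            }
        }
    }
    ```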
  • test: faster test execution in codex-core (#2633)
    this dramatically improves time to run `cargo test -p codex-core` (~25x
    speedup).
    
    before:
    ```
    cargo test -p codex-core  35.96s user 68.63s system 19% cpu 8:49.80 total
    ```
    
    after:
    ```
    cargo test -p codex-core  5.51s user 8.16s system 63% cpu 21.407 total
    ```
    
    both tests measured "hot", i.e. on a 2nd run with no filesystem changes,
    to exclude compile times.
    
    The approach is inspired by [Delete Cargo Integration
    Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html):
    we move all test cases in tests/ into a single suite in order to have a
    single binary, as there is significant overhead for each test binary
    executed, and because test execution is only parallelized within a single
    binary.
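
    A sketch of the resulting layout. Cargo compiles each file directly
    under tests/ into its own binary, so keeping a single top-level file
    yields a single binary. The file and module names below are
    hypothetical, and the modules are inlined only to keep the example
    self-contained.

    ```rust
    // tests/all.rs
    // The only file directly under tests/, so Cargo builds one test binary.
    // In a real crate the modules below would live in files such as
    // tests/suite/exec.rs and be pulled in with `mod suite;`.
    mod suite {
        /// Helpers shared by every test module in the single binary.
        pub mod common {
            pub fn fixture_name(test: &str) -> String {
                format!("fixture-{test}")
            }
        }

        mod exec {
            #[test]
            fn uses_shared_fixture() {
                assert_eq!(super::common::fixture_name("exec"), "fixture-exec");
            }
        }

        mod config {
            #[test]
            fn uses_shared_fixture() {
                assert_eq!(super::common::fixture_name("config"), "fixture-config");
            }
        }
    }
    ```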
  • send-aggregated output (#2364)
    We want to send a single aggregated output of stdout and stderr so that
    consumers don't have to merge the two streams themselves, which
    sometimes loses the original ordering.
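
    A rough illustration of the idea: append every chunk to a combined
    buffer in arrival order, in addition to the per-stream buffers. The
    `ExecOutput` type, its fields, and `on_chunk` are hypothetical names for
    this sketch.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Stream {
        Stdout,
        Stderr,
    }

    /// Hypothetical collector for the output of one command.
    #[derive(Debug, Default)]
    pub struct ExecOutput {
        pub stdout: Vec<u8>,
        pub stderr: Vec<u8>,
        /// Both streams interleaved in the order the chunks arrived.
        pub aggregated: Vec<u8>,
    }

    impl ExecOutput {
        pub fn on_chunk(&mut self, stream: Stream, chunk: &[u8]) {
            match stream {
                Stream::Stdout => self.stdout.extend_from_slice(chunk),
                Stream::Stderr => self.stderr.extend_from_slice(chunk),
            }
            // Recording arrival order here is what the separate buffers cannot recover.
            self.aggregated.extend_from_slice(chunk);
        }
    }
    ```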
    
    ---------
    
    Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
  • hide CoT by default; show headers in status indicator (#2316)
    Plan is for full CoT summaries to be visible in a "transcript view" when
    we implement that, but for now they're hidden.
    
    
    https://github.com/user-attachments/assets/e8a1b0ef-8f2a-48ff-9625-9c3c67d92cdb
  • chore: upgrade to Rust 1.89 (#2465)
    Codex created this PR from the following prompt:
    
    > upgrade this entire repo to Rust 1.89. Note that this requires
    updating codex-rs/rust-toolchain.toml as well as the workflows in
    .github/. Make sure that things are "clippy clean" as this change will
    likely uncover new Clippy errors. `just fmt` and `cargo clippy --tests`
    are sufficient to check for correctness
    
    Note this modifies a lot of lines because it folds nested `if`
    statements using `&&`.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2465).
    * #2467
    * __->__ #2465
  • tui: standardize tree prefix glyphs to └ (#2274)
    Replace mixed `⎿` and `L` prefixes with `└` in TUI rendering.
    
    <img width="454" height="659" alt="Screenshot 2025-08-13 at 4 02 03 PM"
    src="https://github.com/user-attachments/assets/61c9c7da-830b-4040-bb79-a91be90870ca"
    />
  • Re-add markdown streaming (#2029)
    Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.
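
    The newline gating can be sketched as a small buffer that only releases
    complete lines. `LineCollector` and its methods are hypothetical names;
    markdown rendering and word wrapping are left out of this sketch.

    ```rust
    /// Buffers streamed deltas and releases only complete lines.
    #[derive(Default)]
    pub struct LineCollector {
        pending: String,
    }

    impl LineCollector {
        /// Appends a delta and returns every newly completed line (without the '\n').
        pub fn push(&mut self, delta: &str) -> Vec<String> {
            self.pending.push_str(delta);
            let mut complete = Vec::new();
            while let Some(idx) = self.pending.find('\n') {
                // Remove the line, including its newline, from the buffer.
                let line: String = self.pending.drain(..=idx).collect();
                complete.push(line.trim_end_matches('\n').to_string());
            }
            complete
        }

        /// Returns the trailing partial line, if any, at the end of the stream.
        pub fn finish(&mut self) -> Option<String> {
            if self.pending.is_empty() {
                None
            } else {
                Some(std::mem::take(&mut self.pending))
            }
        }
    }
    ```

    Each returned line would then be rendered as markdown and wrapped to the
    current terminal width before being appended to the UI.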
  • Streaming markdown (#1920)
    We wait until we have an entire line, then format it with markdown and stream it into the UI. This increases time to first token but is the right thing to do with our current rendering model IMO. Also lets us add word wrapping!
  • Stream model responses (#1810)
    Stream the model's thoughts and responses instead of waiting for the whole
    thing to come through. Very rough right now, but I'm making the risk call to push through.
  • feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
    As stated in `codex-rs/README.md`:
    
    Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
    run it. For a number of users, this runtime requirement inhibits
    adoption: they would be better served by a standalone executable. As
    maintainers, we want Codex to run efficiently in a wide range of
    environments with minimal overhead. We also want to take advantage of
    operating system-specific APIs to provide better sandboxing, where
    possible.
    
    To that end, we are moving forward with a Rust implementation of Codex
    CLI contained in this folder, which has the following benefits:
    
    - The CLI compiles to small, standalone, platform-specific binaries.
    - Can make direct, native calls to
    [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
    [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
    order to support sandboxing on Linux.
    - No runtime garbage collection, resulting in lower memory consumption
    and better, more predictable performance.
    
    Currently, the Rust implementation is materially behind the TypeScript
    implementation in functionality, so continue to use the TypeScript
    implementation for the time being. We will publish native executables via
    GitHub Releases as soon as we feel the Rust version is usable.