Commit Graph

3260 Commits

  • fix(rules) Limit rules listed in conversation (#10351)
    ## Summary
    We should probably warn users that they have a million rules, and help
    clean them up. But for now, we should handle this unbounded case.
    
    Limit rules listed in conversations, with shortest / broadest rules
    first.
    
    ## Testing
    - [x] Updated unit tests
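
    The "shortest / broadest first" ordering with a cap can be sketched as
    follows. This is a minimal illustration, not the code from the PR:
    `limit_rules` and `MAX_LISTED_RULES` are hypothetical names, and string
    length is used as a stand-in for how broad a rule is.

```rust
/// Hypothetical cap on how many rules are listed in a conversation.
const MAX_LISTED_RULES: usize = 3;

/// Returns at most `max` rules, shortest first.
/// Ties are broken lexicographically so the output is deterministic.
fn limit_rules(mut rules: Vec<String>, max: usize) -> Vec<String> {
    rules.sort_by(|a, b| a.len().cmp(&b.len()).then_with(|| a.cmp(b)));
    rules.truncate(max);
    rules
}
```

    Sorting before truncating means the rules that get dropped are always
    the longest (most specific) ones.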
  • fix: System skills marker includes nested folders recursively (#10350)
    Updated system skills bundled with Codex were not correctly replacing
    the user's skills in their .system folder.
    
    - Fix `.codex-system-skills.marker` not updating by hashing embedded
    system skills recursively (nested dirs + file contents), so updates
    trigger a reinstall.
    - Add a Cargo build hook that reruns the build when anything under
    `src/skills/assets/samples/*` changes, ensuring embedded skill updates
    rebuild correctly under caching.
    - Add a small unit test to ensure nested entries are included in the
    fingerprint.
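
    A recursive fingerprint over nested directories and file contents might
    look like the sketch below. The names `hash_dir` and `fingerprint` are
    hypothetical, and `DefaultHasher` is an assumption made to keep the
    example dependency-free; its output is not guaranteed to be stable
    across Rust releases, so a persisted marker would likely want a fixed
    hash function instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::Path;

/// Feeds every entry under `dir` into `hasher`, descending into nested directories.
/// Entries are sorted by name so the result does not depend on iteration order.
fn hash_dir(dir: &Path, hasher: &mut DefaultHasher) -> io::Result<()> {
    let mut entries = fs::read_dir(dir)?.collect::<Result<Vec<_>, _>>()?;
    entries.sort_by_key(|entry| entry.file_name());
    for entry in entries {
        let path = entry.path();
        // Hash the entry name so renames change the fingerprint too.
        entry.file_name().hash(hasher);
        if path.is_dir() {
            hash_dir(&path, hasher)?;
        } else {
            fs::read(&path)?.hash(hasher);
        }
    }
    Ok(())
}

/// Returns a fingerprint covering all nested directories and file contents.
fn fingerprint(dir: &Path) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    hash_dir(dir, &mut hasher)?;
    Ok(hasher.finish())
}
```

    Comparing the stored fingerprint with a freshly computed one is then
    enough to decide whether a reinstall is needed.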
  • Bump thread updated_at on unarchive to refresh sidebar ordering (#10280)
    ## Summary
    - Touch restored rollout files on `thread/unarchive` so `updatedAt`
    reflects the unarchive time.
    - Add a regression test to ensure unarchiving bumps `updated_at` from an
    old mtime.
    
    ## Notes
    This fixes the UX issue where unarchived old threads don’t reappear near
    the top of recent threads.
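
    Touching a file so that its mtime reflects the current time can be
    sketched with the standard library alone. `touch` is a hypothetical
    helper, not the function used in the PR, and `File::set_modified`
    requires Rust 1.75 or newer.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Sets the file's modification time to "now", similar to the `touch` command.
/// The file must already exist; this sketch does not create it.
fn touch(path: &Path) -> io::Result<()> {
    let file = OpenOptions::new().write(true).open(path)?;
    file.set_modified(SystemTime::now())
}
```

    Anything that orders threads by mtime will then see the restored file
    as recently updated.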
  • Improve plan mode interaction rules (#10329)
    ## Summary
    - Replace the “Hard interaction rule” with a clearer “Response
    constraints” section that enumerates the allowed exceptions for Plan
    Mode replies.
    - Remove the stray Phase 1 exception line about simple questions.
    - Update plan content requirements to ask for a brief summary section
    and generalize API/type wording.
  • fix(config) config schema newline (#10323)
    ## Summary
    Looks like we may have introduced a formatting issue in recent PRs.
    
    ## Testing
    - [x] ran `just write-config-schema`
  • chore(features) Personality => Stable (#10310)
    ## Summary
    Bump `/personality` to stable
    
    ## Testing
     - [x] unit tests pass
  • chore(config) Rename config setting to personality (#10314)
    ## Summary
    Let's make the setting name consistent with the SlashCommand!
    
    ## Testing
    - [x] Updated tests
  • Add websocket telemetry metrics and labels (#10316)
    Summary
    - expose websocket telemetry hooks through the responses client so
    request durations and event processing can be reported
    - record websocket request/event metrics and emit runtime telemetry
    events that the history UI now surfaces
    - improve tests to cover websocket telemetry reporting and guard runtime
    summary updates
    
    
  • Make skills prompt explicit about relative-path lookup (#10282)
    Fix cases where the model tries to locate skill scripts from the cwd and
    fails.
  • feat: Support loading skills from .agents/skills (#10317)
    This PR adds support for loading
    [skills](https://developers.openai.com/codex/skills) from
    `.agents/skills/`.
    - Issue: https://github.com/agentskills/agentskills/issues/15
    - Motivation: When skills live on the filesystem, sharing them across
    agents is awkward and often ends up requiring symlinks/duplication. A
    single location under `.agents/` makes it easier to share skills.
    - Loading from `.codex/skills/` will remain but will be deprecated soon.
    The change only applies to the [REPO
    scope](https://developers.openai.com/codex/skills#where-to-save-skills).
    - Documentation will be updated before this change is live.
    
    Tested with skills in two locations of this repo (screenshot omitted).
    
    When starting Codex with CWD in `$repo_root`, only the skills at the
    root should be picked up (screenshot omitted).
    
    When starting Codex with CWD in `$repo_root/codex-rs`, skills should be
    picked up at the cwd and at each level up to the root (screenshot
    omitted).
  • enable plan mode (#10313)
    # External (non-OpenAI) Pull Request Requirements
    
    Before opening this Pull Request, please read the dedicated
    "Contributing" markdown file or your PR may be closed:
    https://github.com/openai/codex/blob/main/docs/contributing.md
    
    If your PR conforms to our contribution guidelines, replace this text
    with a detailed and high quality description of your changes.
    
    Include a link to a bug report or enhancement request.
  • feat(core,tui,app-server) personality migration (#10307)
    ## Summary
    Keep existing users on Pragmatic, to preserve behavior while new users
    default to Friendly
    
    ## Testing
    - [x] Tested locally
    - [x] add integration tests
  • chore(core) Default to friendly personality (#10305)
    ## Summary
    Update default personality to friendly
    
    ## Testing
    - [x] Unit tests pass
  • plan mode prompt (#10308)
  • chore(app-server) add personality update test (#10306)
    ## Summary
    Add some additional validation to ensure app-server handles Personality
    changes
    
    ## Testing
    - [x] These are tests
  • Fix npm README image link (#10303)
    ### Motivation
    - The image referenced in the package README was 404ing on the npm
    package page because it used a relative file path that doesn't resolve
    on npm, so the splash image needs a GitHub-hosted URL to render
    correctly.
    
    ### Description
    - Update `README.md` to replace the relative image path
    `./.github/codex-cli-splash.png` with the GitHub-hosted URL
    `https://github.com/openai/codex/blob/main/.github/codex-cli-splash.png`.
    
    ### Testing
    - No automated tests were run because this is a docs-only change and
    does not affect code or test behavior.
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_697e58dbce34832d87c7847779e8f4a5)
  • chore(features) remove Experimental tag from UTF8 (#10296)
    ## Summary
    This has been on by default for some time, so it no longer needs the
    Experimental tag.
    
    ## Testing
    - [x] Existing tests pass
  • fix(nix): update flake for newer Rust toolchain requirements (#10302)
    ## Summary
    
    - Add rust-overlay input to provide newer Rust versions (rama crates
    require rustc 1.91.0+)
    - Add devShells output with complete development environment
    - Add missing git dependency hashes to codex-rs/default.nix
    
    ## Changes
    
    **flake.nix:**
    - Added `rust-overlay` input to get newer Rust toolchains
    - Updated `packages` output to use `rust-bin.stable.latest.minimal` for
    builds
    - Added `devShells` output with:
      - Rust with `rust-src` and `rust-analyzer` extensions for IDE support
      - Required build dependencies: `pkg-config`, `openssl`, `cmake`,
        `libclang`
      - Environment variables: `PKG_CONFIG_PATH`, `LIBCLANG_PATH`
    
    **codex-rs/default.nix:**
    - Added missing `outputHashes` for git dependencies:
      - `nucleo-0.5.0`, `nucleo-matcher-0.3.1`
      - `runfiles-0.1.0`
      - `tokio-tungstenite-0.28.0`, `tungstenite-0.28.0`
    
    ## Test Plan
    
    - [x] `nix develop` enters shell successfully
    - [x] `nix develop -c rustc --version` shows 1.93.0
    - [x] `nix develop -c cargo build` completes successfully
  • display promo message in usage error (#10285)
    If a promo message is attached to a rate limit response, then display it
    in the error message.
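
    The conditional display can be sketched as below. The function name
    `usage_error_message` and the choice to skip blank promo strings are
    assumptions made for illustration, not details taken from the PR.

```rust
/// Builds the user-facing rate-limit error, appending the promo text when one is present.
/// Blank or whitespace-only promo messages are treated as absent.
fn usage_error_message(base: &str, promo: Option<&str>) -> String {
    match promo.map(str::trim).filter(|p| !p.is_empty()) {
        Some(p) => format!("{base} {p}"),
        None => base.to_string(),
    }
}
```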
  • feat: show runtime metrics in console (#10278)
    Summary of changes:
    
    - Adds a new feature flag: runtime_metrics
      - Declared in core/src/features.rs
      - Added to core/config.schema.json
      - Wired into OTEL init in core/src/otel_init.rs
    
    - Enables on-demand runtime metric snapshots in OTEL
      - Adds runtime_metrics: bool to otel/src/config.rs
      - Enables experimental custom reader features in otel/Cargo.toml
      - Adds snapshot/reset/summary APIs in:
        - otel/src/lib.rs
        - otel/src/metrics/client.rs
        - otel/src/metrics/config.rs
        - otel/src/metrics/error.rs
    
    - Defines metric names and a runtime summary builder
      - New files:
        - otel/src/metrics/names.rs
        - otel/src/metrics/runtime_metrics.rs
      - Summarizes totals for:
        - Tool calls
        - API requests
        - SSE/streaming events
    
    - Instruments metrics collection in OTEL manager
      - otel/src/traces/otel_manager.rs now records:
        - API call counts + durations
        - SSE event counts + durations (success/failure)
        - Tool call metrics now use shared constants
    
    - Surfaces runtime metrics in the TUI
      - Resets runtime metrics at turn start in tui/src/chatwidget.rs
      - Displays metrics in the final separator line in
        tui/src/history_cell.rs
    
    - Adds tests
      - New OTEL tests:
        - otel/tests/suite/snapshot.rs
        - otel/tests/suite/runtime_summary.rs
      - New TUI test:
        - final_message_separator_includes_runtime_metrics in
          tui/src/history_cell.rs
    
    Scope:
    - 19 files changed
    - ~652 insertions, 38 deletions
    
    
  • feat(core) Smart approvals on (#10286)
    ## Summary
    Turn on Smart Approvals by default
    
    ## Testing
     - [x] Updated unit tests
  • Fix minor typos in comments and documentation (#10287)
    ## Summary
    
    I have read the contribution guidelines.  
    All changes in this PR are limited to text corrections and do not modify
    any business logic, runtime behavior, or user-facing functionality.
    
    ## Details
    
    This PR fixes several minor typos, including:
    
    - `create` -> `crate`
    - `analagous` -> `analogous`
    - `apply-patch` -> `apply_patch`
    - `codecs` -> `codex`
    - ` '/" ` -> ` '/' `
    - `Respesent` -> `Represent`
  • Turn on cloud requirements for business too (#10283)
    Need to check "enterprise" and "business"
  • add missing fields to WebSearchAction and update app-server types (#10276)
    - add `WebSearchAction` to app-server v2 types
    - add `queries` to `WebSearchAction::Search` type
    
    Updated tests.
  • Add enforce_residency to requirements (#10263)
    Add `enforce_residency` to requirements.toml and thread it through to a
    header on `default_client`.
  • Wire up cloud reqs in exec, app-server (#10241)
    We're fetching cloud requirements in TUI in
    https://github.com/openai/codex/pull/10167.
    
    This adds the same fetching in exec and app-server binaries also.
  • Validate CODEX_HOME before resolving (#10249)
    Summary
    - require `CODEX_HOME` to point to an existing directory before
    canonicalizing and surface clear errors otherwise
    - share the same helper logic in both `core` and `rmcp-client` and add
    unit tests that cover missing, non-directory, valid, and default paths
    
    This addresses #9222
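
    The "validate, then canonicalize" order described above might be
    sketched like this. `resolve_codex_home` and its `default_home`
    parameter are hypothetical, and returning the default without checking
    that it exists is an assumption, not confirmed behavior.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolves the home directory from an optional override (e.g. the value of `CODEX_HOME`).
/// `default_home` stands in for whatever default the caller would otherwise use.
fn resolve_codex_home(override_value: Option<&str>, default_home: &Path) -> io::Result<PathBuf> {
    let Some(raw) = override_value.filter(|v| !v.is_empty()) else {
        // Assumption: with no override, the default is returned without an existence check.
        return Ok(default_home.to_path_buf());
    };
    let path = PathBuf::from(raw);
    // Check existence first so the error names the variable, not just the path.
    let meta = std::fs::metadata(&path).map_err(|e| {
        io::Error::new(e.kind(), format!("CODEX_HOME points to {raw:?}, which could not be read: {e}"))
    })?;
    if !meta.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            format!("CODEX_HOME points to {raw:?}, which is not a directory"),
        ));
    }
    path.canonicalize()
}
```

    Checking before canonicalizing lets the missing and non-directory cases
    produce distinct, readable errors.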
  • fix: update file search directory when session CWD changes (#9279)
    ## Summary
    
    Fixes #9041
    
    - Adds update_search_dir() method to FileSearchManager to allow updating
    the search directory after initialization
    - Calls this method when the session CWD changes: new session, resume,
    or fork
    
    ## Problem
    
    The FileSearchManager was created once with the initial search_dir and
    never updated. The failure looked like this:
    
    1. The user starts Codex in a non-git directory (e.g., /tmp/random).
    2. The user resumes or forks a session from a different workspace.
    3. The @filename lookup still searches the original directory.
    
    This caused no matches to be returned even when files existed in the
    current workspace.
    
    ## Solution
    
    Update FileSearchManager.search_dir whenever the session working
    directory changes:
    - AppEvent::NewSession: Use current config CWD
    - SessionSelection::Resume: Use resumed session CWD
    - SessionSelection::Fork: Use forked session CWD
    
    ## Test plan
    
    - [ ] Start Codex in /tmp/test-dir (non-git)
    - [ ] Resume a session from a project with actual files
    - [ ] Verify @filename returns matches from the resumed session
    directory
    
    ---------
    
    Co-authored-by: Eric Traut <etraut@openai.com>
  • fix: don't auto-enable web_search for azure (#10266)
    Seeing issues with Azure after default-enabling web search: #10071,
    #10257.
    
    We need to work with Azure on an API-side fix; for now, turn off
    default-enabling of web_search for Azure.
    
    The diff is large because I moved logic so it could be reused.
  • file-search: multi-root walk (#10240)
    Instead of a separate walker for each root in a multi-root walk, use a
    single walker.
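
    The idea of one walker covering several roots can be sketched with the
    standard library: a single work stack is seeded with every root instead
    of running one traversal per root. This is an illustration only;
    `walk_roots` is a hypothetical name, the real change presumably uses
    the project's existing walker, and this sketch does not guard against
    symlink cycles.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Walks all `roots` in a single traversal and returns every file found, sorted.
fn walk_roots(roots: &[PathBuf]) -> io::Result<Vec<PathBuf>> {
    // One shared stack: every root is just an initial entry.
    let mut stack: Vec<PathBuf> = roots.to_vec();
    let mut files = Vec::new();
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                stack.push(path);
            } else {
                files.push(path);
            }
        }
    }
    files.sort();
    Ok(files)
}
```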
  • Update announcement_tip.toml (#10267)
    Extend the test for dev version
  • Hide /approvals from the slash-command list (#10265)
    `/permissions` is the replacement. `/approvals` is still available when
    typed.
  • core: prevent shell_snapshot from inheriting stdin (#9735)
    Fixes #9559.
    
    When `shell_snapshot` runs, it may execute user startup files (e.g.
    `.bashrc`). If those files read from stdin (or if stdin is an
    interactive TTY under job control), the snapshot subprocess can block or
    receive `SIGTTIN` (as reported over SSH).
    
    This change explicitly sets `stdin` to `Stdio::null()` for the snapshot
    subprocess, so it can't read from the terminal.
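
    A minimal sketch of the technique, assuming a Unix-like system with
    `sh` on the PATH. `run_without_stdin` is a hypothetical helper, not the
    snapshot code itself.

```rust
use std::io;
use std::process::{Command, Output, Stdio};

/// Runs `script` under `sh -c` with stdin attached to the null device,
/// so anything that reads from stdin sees EOF at once instead of blocking.
fn run_without_stdin(script: &str) -> io::Result<Output> {
    Command::new("sh")
        .arg("-c")
        .arg(script)
        // `spawn` inherits the parent's stdin by default; override it explicitly.
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?
        .wait_with_output()
}
```

    With this in place, a startup file that calls `read` or `cat` returns
    immediately rather than waiting on the terminal.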
    
    Regression test added that would hang/timeout without this change.
    Tests: `ulimit -n 4096 && cargo test -p codex-core`.
    
    cc @dongdongbh @etraut-openai
    
    ---------
    
    Co-authored-by: Skylar Graika <sgraika127@gmail.com>
  • Skip loading codex home as project layer (#10207)
    Summary:
    - Fixes issue #9932: https://github.com/openai/codex/issues/9932
    - Prevents `$CODEX_HOME` (typically `~/.codex`) from being discovered as
    a project `.codex` layer by skipping it during project layer traversal.
    We compare both normalized absolute paths and best-effort canonicalized
    paths to handle symlinks.
    - Adds regression tests for home-directory invocation and for the case
    where `CODEX_HOME` points to a project `.codex` directory (e.g.,
    worktrees/editor integrations).
    
    Testing:
    - `cargo build -p codex-cli --bin codex`
    - `cargo build -p codex-rmcp-client --bin test_stdio_server`
    - `cargo test -p codex-core`
    - `cargo test --all-features`
    - Manual: ran `target/debug/codex` from `~` and confirmed the
    disabled-folder warning and trust prompt no longer appear.
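
    The two-step comparison can be sketched as follows. `is_codex_home` is
    a hypothetical name, and this simplified version compares the paths as
    given before falling back to canonicalization, whereas the PR describes
    normalizing to absolute paths first.

```rust
use std::path::Path;

/// Returns true when `candidate` and `codex_home` name the same directory.
fn is_codex_home(candidate: &Path, codex_home: &Path) -> bool {
    // Cheap check first: `Path` equality compares components.
    if candidate == codex_home {
        return true;
    }
    // Best effort: canonicalize resolves symlinks and `..`, but fails for
    // paths that do not exist, in which case the paths are treated as different.
    match (candidate.canonicalize(), codex_home.canonicalize()) {
        (Ok(a), Ok(b)) => a == b,
        _ => false,
    }
}
```

    A traversal would call this on each candidate `.codex` directory and
    skip the ones for which it returns true.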
  • Make plan highlight use popup grey background (#10253)
    ## Summary
    - align proposed plan background with popup surface color by reusing
    `user_message_bg`
    - remove the custom blue-tinted plan background
    
  • plan prompt (#10255)
  • Plan mode prompt (#10238)
  • chore: rename ChatGpt -> Chatgpt in type names (#10244)
    When using ChatGPT in names of types, we should be consistent, so this
    renames some types with `ChatGpt` in the name to `Chatgpt`. From
    https://rust-lang.github.io/api-guidelines/naming.html:
    
    > In `UpperCamelCase`, acronyms and contractions of compound words count
    as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize`
    or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and
    contractions are lower-cased: `is_xid_start`.
    
    This PR updates existing uses of `ChatGpt` and changes them to
    `Chatgpt`. In every case where the rename could affect the wire format,
    I visually confirmed that nothing changes there. That said, this
    _will_ change the codegen because it affects the spelling of type
    names.
    
    For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in
    `app-server-protocol`, but the wire format is still `"chatgpt"`.
    
    This PR also updates a number of types in `codex-rs/core/src/auth.rs`.
  • Tui: hide Code mode footer label (#10063)
    Title
    Hide Code mode footer label/cycle hint; add Plan footer-collapse
    snapshots
    
    Summary
    - Keep Code mode internal naming but suppress the footer mode label +
    cycle hint when Code is active.
    - Only show the cycle hint when a non‑Code mode indicator is present.
    - Add Plan-mode footer collapse snapshot coverage (empty + queued,
    across widths) and update existing footer collapse snapshots for the new
    Code behavior.
    
    Notes
    - The test run currently fails in codex-cloud-requirements on
    origin/main due to a stale auth.mode field; no fix is included in this
    PR to keep the diff minimal.
    
    Codex author
    `codex resume 019c0296-cfd4-7193-9b0a-6949048e4546`
  • Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)
    ## Summary
    - Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
    in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
    tags from normal assistant output.
    - Persist plan items and rebuild them on resume so proposed plans show
    in thread history.
    - Wire plan items/deltas through app-server protocol v2 and render a
    dedicated proposed-plan view in the TUI, including the “Implement this
    plan?” prompt only when a plan item is present.
    
    ## Changes
    
    ### Core (`codex-rs/core`)
    - Added a generic, line-based tag parser that buffers each line until it
    can disprove a tag prefix; implements auto-close on `finish()` for
    unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
    - Refactored proposed plan parsing to wrap the generic parser.
    `codex-rs/core/src/proposed_plan_parser.rs`
    - In plan mode, stream assistant deltas as:
      - **Normal text** → `AgentMessageContentDelta`
      - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
      (`codex-rs/core/src/codex.rs`)
    - Final plan item content is derived from the completed assistant
    message (authoritative), not necessarily the concatenated deltas.
    - Strips `<proposed_plan>` blocks from assistant text in plan mode so
    tags don’t appear in normal messages.
    (`codex-rs/core/src/stream_events_utils.rs`)
    - Persist `ItemCompleted` events only for plan items for rollout replay.
    (`codex-rs/core/src/rollout/policy.rs`)
    - Guard `update_plan` tool in Plan Mode with a clear error message.
    (`codex-rs/core/src/tools/handlers/plan.rs`)
    - Updated Plan Mode prompt to:  
      - keep `<proposed_plan>` out of non-final reasoning/preambles  
      - require exact tag formatting  
      - allow only one `<proposed_plan>` block per turn  
      (`codex-rs/core/templates/collaboration_mode/plan.md`)
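
    The tag handling described in this list can be illustrated with a much
    simpler, non-streaming sketch. `split_proposed_plan` is a hypothetical
    function that works on a complete message rather than on deltas, and
    it assumes each tag sits on its own line; the parser in the PR buffers
    partial lines and is more general.

```rust
const OPEN_TAG: &str = "<proposed_plan>";
const CLOSE_TAG: &str = "</proposed_plan>";

/// Splits `text` into (normal text, plan text), line by line.
/// A line holding only the open or close tag switches mode and is dropped.
/// An unterminated block is closed implicitly at the end of the input.
fn split_proposed_plan(text: &str) -> (String, String) {
    let mut normal = String::new();
    let mut plan = String::new();
    let mut in_plan = false;
    for line in text.lines() {
        match line.trim() {
            OPEN_TAG if !in_plan => in_plan = true,
            CLOSE_TAG if in_plan => in_plan = false,
            _ => {
                let target = if in_plan { &mut plan } else { &mut normal };
                target.push_str(line);
                target.push('\n');
            }
        }
    }
    (normal, plan)
}
```

    The first element corresponds to what would appear as normal assistant
    text, and the second to the content of the plan item.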
    
    ### Protocol / App-server protocol
    - Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
    (`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
    - Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
    EXPERIMENTAL markers and note that deltas may not match the final plan
    item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
    - Added plan delta route in app-server protocol common mapping.
    (`codex-rs/app-server-protocol/src/protocol/common.rs`)
    - Rebuild plan items from persisted `ItemCompleted` events on resume.
    (`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)
    
    ### App-server
    - Forward plan deltas to v2 clients and map core plan items to v2 plan
    items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
    `codex-rs/app-server/src/codex_message_processor.rs`)
    - Added v2 plan item tests.
    (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
    
    ### TUI
    - Added a dedicated proposed plan history cell with special background
    and padding, and moved “• Proposed Plan” outside the highlighted block.
    (`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
    - Only show “Implement this plan?” when a plan item exists.
    (`codex-rs/tui/src/chatwidget.rs`,
    `codex-rs/tui/src/chatwidget/tests.rs`)
    
    
    ### Docs / Misc
    - Updated protocol docs to mention plan deltas.
    (`codex-rs/docs/protocol_v1.md`)
    - Minor plumbing updates in exec/debug clients to tolerate plan deltas.
    (`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)
    
    ## Tests
    - Added core integration tests:
      - Plan mode strips plan from agent messages.
      - Missing `</proposed_plan>` closes at end-of-message.  
      (`codex-rs/core/tests/suite/items.rs`)
    - Added unit tests for generic tag parser (prefix buffering, non-tag
    lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
    - Existing app-server plan item tests in v2.
    (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
    
    ## Notes / Behavior
    - Plan output no longer appears in standard assistant text in Plan Mode;
    it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
    - The final plan item content is authoritative and may diverge from
    streamed deltas (documented as experimental).
    - Reasoning summaries are not filtered; prompt instructs the model not
    to include `<proposed_plan>` outside the final plan message.
    
    ## Codex Author
    `codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
  • plan mode: add TL;DR checkpoint and client behavior note (#10195)
    ## Summary
    - Tightens Plan Mode to encourage exploration-first behavior and more
    back-and-forth alignment.
    - Adds a required TL;DR checkpoint before drafting the full plan.
    - Clarifies client behavior that can cause premature “Implement this
    plan?” prompts.
    
    ## What changed
    - Require at least one targeted non-mutating exploration pass before the
    first user question.
    - Insert a TL;DR checkpoint between Phase 2 (intent) and Phase 3
    (implementation).
    - TL;DR checkpoint guidance:
      - Label: “Proposed Plan (TL;DR)”
      - Format: 3–5 bullets using `- `
      - Options: exactly one option, “Approve”
      - `isOther: true`, with explicit guidance that “None of the above” is
        the edit path in the current UI.
    - Require the final plan to include a TL;DR consistent with the approved
    checkpoint.
    
    ## Why
    - In Plan Mode, any normal assistant message at turn completion is
    treated as plan content by the client. This can trigger premature
    “Implement this plan?” prompts.
    - The TL;DR checkpoint aligns on direction before Codex drafts a long,
    decision-complete plan.
    
    ## Testing
    - Manual: built the local CLI and verified the flow now explores first,
    presents a TL;DR checkpoint, and only drafts the full plan after
    approval.
    
    ---------
    
    Co-authored-by: Nick Baumann <@openai.com>