Commit Graph

3759 Commits

  • feat: add --compact mode to just log (#11994)
    Summary:
    - add a `--compact` flag to the logs client to suppress thread/target
    info
    - format rows and timestamps differently when compact mode is enabled so
    only hour time, level, and message remain
  • feat: add --search to just log (#11995)
    Summary
    - extend the log client to accept an optional `--search` substring
    filter when querying codex-state logs
    - propagate the filter through `LogQuery` and apply it in
    `push_log_filters` via `INSTR(message, ...)`
    - add an integration test that exercises the new search filtering
    behavior
    
    Testing
    - Not run (not requested)
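
A minimal sketch of how a substring filter of this kind could be pushed into a SQL query with `INSTR`. The `LogQuery` fields and the `log_filters_sketch` signature are assumptions for illustration; the commit names `LogQuery` and `push_log_filters` but does not show their definitions.

```rust
/// Hypothetical stand-in for the query type named in the commit.
#[derive(Default)]
pub struct LogQuery {
    pub level: Option<String>,
    pub search: Option<String>,
}

/// Builds a WHERE clause plus its bind values. Values are never
/// interpolated into the SQL text; each one gets a `?` placeholder.
pub fn log_filters_sketch(query: &LogQuery) -> (String, Vec<String>) {
    let mut clauses: Vec<&str> = Vec::new();
    let mut binds: Vec<String> = Vec::new();

    if let Some(level) = &query.level {
        clauses.push("level = ?");
        binds.push(level.clone());
    }
    // SQLite's INSTR returns the 1-based position of the substring, or 0
    // when it is absent, so `> 0` means "message contains the needle".
    if let Some(needle) = query.search.as_deref().filter(|s| !s.is_empty()) {
        clauses.push("INSTR(message, ?) > 0");
        binds.push(needle.to_string());
    }

    if clauses.is_empty() {
        (String::new(), binds)
    } else {
        (format!(" WHERE {}", clauses.join(" AND ")), binds)
    }
}
```

Binding the needle as a parameter rather than formatting it into the SQL string keeps user-supplied search text from being parsed as SQL.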
  • Exit early when session initialization fails (#11908)
    Summary
    - wait for the initial session startup loop to finish and handle exit
    before waiting for the first message in fresh sessions
    - propagate AppRunControl::Exit to return immediately when
    initialization fails
  • fix(ci) Fix shell-tool-mcp.yml (#11969)
    ## Summary
    We're seeing failures for shell-tool-mcp.yml during git checkouts. This
    is a quick attempt to unblock releases - we should revisit this build
    pipeline since we've hit a number of errors.
  • fix: race in js repl (#11922)
    js_repl_reset previously raced with in-flight/new js_repl executions
    because reset() could clear exec_tool_calls without synchronizing with
    execute(). In that window, a running exec could lose its per-exec
    tool-call context, and subsequent kernel RunTool messages would fail
    with js_repl exec context not found. The fix serializes reset and
    execute on the same exec_lock, so reset cannot run concurrently with
    exec setup/teardown. We also keep the timeout path safe by performing
    reset steps inline while execute() already holds the lock, avoiding
    re-entrant lock acquisition. A regression test now verifies that reset
    waits for the exec lock and does not clear tool-call state early.
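
The locking scheme described above can be sketched with a plain `std::sync::Mutex`. This is an illustration under assumptions, not the actual `js_repl` code: the `Repl` and `ExecState` types, the `timed_out` flag, and the `reset_count` helper are hypothetical, and the real implementation is async.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// State guarded by the single lock; field names are illustrative.
#[derive(Default)]
struct ExecState {
    exec_tool_calls: HashMap<String, usize>,
    resets: usize,
}

#[derive(Default)]
pub struct Repl {
    exec_lock: Mutex<ExecState>,
}

impl Repl {
    /// Reset steps that assume the caller already holds the lock.
    fn reset_locked(state: &mut ExecState) {
        state.exec_tool_calls.clear();
        state.resets += 1;
    }

    /// Public reset: takes the lock, so it cannot interleave with `execute`.
    pub fn reset(&self) {
        let mut state = self.exec_lock.lock().unwrap();
        Self::reset_locked(&mut state);
    }

    /// Holds the lock across setup, run, and teardown. On timeout it resets
    /// inline through `reset_locked` instead of calling `reset`, which would
    /// try to take the same non-reentrant lock a second time.
    pub fn execute(&self, exec_id: &str, timed_out: bool) -> bool {
        let mut state = self.exec_lock.lock().unwrap();
        state.exec_tool_calls.insert(exec_id.to_string(), 0);
        if timed_out {
            Self::reset_locked(&mut state);
            return false;
        }
        let context_present = state.exec_tool_calls.contains_key(exec_id);
        state.exec_tool_calls.remove(exec_id);
        context_present
    }

    pub fn reset_count(&self) -> usize {
        self.exec_lock.lock().unwrap().resets
    }
}
```

Splitting the reset into a lock-taking wrapper and a lock-assuming helper is one common way to avoid re-entrant acquisition on the timeout path.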
  • Hide /debug slash commands from popup menu (#11974)
    Summary
    - filter command popup builtins to remove any `/debug*` entries so they
    stay usable but are not listed
    - added regression tests to ensure the popup hides debug commands while
    dispatch still resolves them
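
A small sketch of the idea: the popup filters its list, while dispatch keeps looking at the full set. The `BUILTINS` contents and both function names are made up for illustration; the real command table is not shown in the commit.

```rust
/// Illustrative builtin list; the actual command set is an assumption.
pub const BUILTINS: &[&str] = &["compact", "debug-config", "diff", "debug-logs", "status"];

/// Names shown in the popup: everything except `debug*` entries.
pub fn popup_builtins() -> Vec<&'static str> {
    BUILTINS
        .iter()
        .copied()
        .filter(|name| !name.starts_with("debug"))
        .collect()
}

/// Dispatch consults the full list, so hidden commands still resolve.
pub fn resolve(input: &str) -> Option<&'static str> {
    let name = input.strip_prefix('/')?;
    BUILTINS.iter().copied().find(|candidate| *candidate == name)
}
```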
  • fix: js_repl reset hang by clearing exec tool calls without waiting (#11932)
    Remove the waiting loop in `reset` so it no longer blocks on potentially
    hanging exec tool calls + add `clear_all_exec_tool_calls_map` to drain
    the map and notify waiters so `reset` completes immediately
  • fix(core) exec_policy parsing fixes (#11951)
    ## Summary
    Fixes a few things in our exec_policy handling of prefix_rules:
    1. Correctly match redirects specifically for exec_policy parsing. i.e.
    if you have `prefix_rule(["echo"], decision="allow")` then `echo hello >
    output.txt` should match - this should fix #10321
    2. If there already exists any rule that would match our prefix rule
    (not just a prompt), then drop it, since it won't do anything.
    
    
    ## Testing
    - [x] Updated unit tests, added approvals ScenarioSpecs
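
To illustrate the first point, here is a rough sketch of prefix matching that ignores redirections. It works on pre-split tokens and only knows a few space-separated redirect operators; the function names are hypothetical and the real exec_policy parser presumably handles shell syntax more thoroughly.

```rust
/// Drops redirection operators and their targets from a tokenized command.
/// Only the space-separated forms `>`, `>>`, `<`, and `2>` are handled.
pub fn strip_redirects<'a>(tokens: &[&'a str]) -> Vec<&'a str> {
    let mut out = Vec::new();
    let mut iter = tokens.iter().copied();
    while let Some(token) = iter.next() {
        if matches!(token, ">" | ">>" | "<" | "2>") {
            iter.next(); // skip the redirect target as well
        } else {
            out.push(token);
        }
    }
    out
}

/// True when `prefix` matches the leading tokens of the command once
/// redirects are removed.
pub fn prefix_rule_matches(prefix: &[&str], command: &[&str]) -> bool {
    strip_redirects(command).starts_with(prefix)
}
```

With this sketch, a rule with prefix `["echo"]` matches the tokens of `echo hello > output.txt`, which is the case the commit calls out.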
  • add(core): safety check downgrade warning (#11964)
    Add per-turn notice when a request is downgraded to a fallback model due
    to cyber safety checks.
    
    **Changes**
    
    - codex-api: Emit a ServerModel event based on the openai-model response
    header and/or response payload (SSE + WebSocket), including when the
    model changes mid-stream.
    - core: When the server-reported model differs from the requested model,
    emit a single per-turn warning explaining the reroute to gpt-5.2 and
    directing users to Trusted Access verification and the cyber safety
    explainer.
    - app-server (v2): Surface these cyber model-routing warnings as
    synthetic userMessage items with text prefixed by Warning: (and document
    this behavior).
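
The "single per-turn warning" behavior could look roughly like the sketch below. `TurnWarnings`, `on_server_model`, and the message text are placeholders chosen for illustration; the real warning wording and event types are not shown in the commit.

```rust
/// Per-turn bookkeeping; a fresh value would be created for each turn.
#[derive(Default)]
pub struct TurnWarnings {
    reroute_warned: bool,
}

impl TurnWarnings {
    /// Returns warning text the first time the server-reported model differs
    /// from the requested one in this turn, and `None` on every later call,
    /// even if the model changes again mid-stream.
    pub fn on_server_model(&mut self, requested: &str, reported: &str) -> Option<String> {
        if requested == reported || self.reroute_warned {
            return None;
        }
        self.reroute_warned = true;
        // Placeholder wording, not the actual user-facing text.
        Some(format!("requested {requested}, but the server used {reported}"))
    }
}
```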
  • Fixed screen reader regression in CLI (#11860)
    The `tui.animations` switch should gate all animations in the TUI, but a
    recent change introduced a regression that didn't include the gate. This
    makes it difficult to use the TUI with a screen reader.
    
    This fix addresses #11856
  • add(feedback): over-refusal / safety check (#11948)
    Add new feedback option for "Over-refusal / safety check"
  • chore(core) rm Feature::RequestRule (#11866)
    ## Summary
    This feature is now reasonably stable, so let's remove the feature flag
    to simplify our upcoming iterations here.
    
    ## Testing 
    - [x] Existing tests pass
  • [apps] Fix app mention syntax. (#11894)
    - [x] Fix app mention syntax.
  • Rename collab modules to multi agents (#11939)
    Summary
    - rename the `collab` handlers and UI files to `multi_agents` to match
    the new naming
    - update module references and specs so the handlers and TUI widgets
    consistently use the renamed files
    - keep the existing functionality while aligning file and module names
    with the multi-agent terminology
  • docs: mention Codex app in README intro (#11926)
    Add mention of the app in the README.
  • feat: add customizable roles for multi-agents (#11917)
    The idea is to have two families of agents:
    
    1. Built-in agents that are packaged directly with Codex.
    2. User-defined agents that are declared in the `agents_config.toml`
    file. It can reference config files that override the agent config. It
    looks like this:
    ```
    version = 1
    
    [agents.explorer]
    description = """Use `explorer` for all codebase questions.
    Explorers are fast and authoritative.
    Always prefer them over manual search or file reading.
    Rules:
    - Ask explorers first and precisely.
    - Do not re-read or re-search code they cover.
    - Trust explorer results without verification.
    - Run explorers in parallel when useful.
    - Reuse existing explorers for related questions."""
    config_file = "explorer.toml"
    ```
  • chore: rename collab feature flag key to multi_agent (#11918)
    Summary
    - rename the collab feature key to multi_agent while keeping the Feature
    enum unchanged
    - add legacy alias support so both "multi_agent" and "collab" map to the
    same feature
    - cover the alias behavior with a new unit test
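
A minimal sketch of the alias lookup, assuming the enum variant is still called `Collab` (the commit only says the enum is unchanged) and using hypothetical function names:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Feature {
    /// Variant name is an assumption; only the config key was renamed.
    Collab,
}

/// Maps a config key to a feature. The legacy key is accepted as an alias.
pub fn feature_for_key(key: &str) -> Option<Feature> {
    match key {
        "multi_agent" | "collab" => Some(Feature::Collab),
        _ => None,
    }
}

/// The key written back out is always the new name.
pub fn canonical_key(feature: Feature) -> &'static str {
    match feature {
        Feature::Collab => "multi_agent",
    }
}
```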
  • Allow hooks to error (#11615)
    Allow hooks to return errors. 
    
    We should do this before introducing more hook types, or we'll have to
    migrate them all.
  • feat: use shell policy in shell snapshot (#11759)
    Honor `shell_environment_policy.set` even after a shell snapshot
  • bazel: fix snapshot parity for tests/*.rs rust_test targets (#11893)
    ## Summary
    - make `rust_test` targets generated from `tests/*.rs` use Cargo-style
    crate names (file stem) so snapshot names match Cargo (`all__...`
    instead of Bazel-derived names)
    - split lib vs `tests/*.rs` test env wiring in `codex_rust_crate` to
    keep existing lib snapshot behavior while applying Bazel
    runfiles-compatible workspace root for `tests/*.rs`
    - compute the `tests/*.rs` snapshot workspace root from package depth so
    `insta` resolves committed snapshots under Bazel `--noenable_runfiles`
    
    ## Validation
    - `bazelisk test //codex-rs/core:core-all-test
    --test_arg=suite::compact:: --cache_test_results=no`
    - `bazelisk test //codex-rs/core:core-all-test
    --test_arg=suite::compact_remote:: --cache_test_results=no`
  • fix: only emit unknown model warning on user turns (#11884)
    ###### Context
    The unknown model warning added in #11690 has
    [issues](https://github.com/openai/codex/actions/runs/22047424710/job/63700733887)
    on Ubuntu runners because we potentially emit it on all new turns,
    including ones with intentionally fake models (e.g., `mock-model` in a
    test).
    
    ###### Fix
    Change the warning to emit only on user turns and review turns.
    
    ###### Tests
    CI now passes on ubuntu, still passes locally
  • feat: persist and restore codex app's tools after search (#11780)
    ### What changed
    1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`.
    2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in
    `core/src/state/session.rs` for authoritative restore behavior (deduped,
    order-preserving, empty clears).
    3. Added rollout parsing in `core/src/codex.rs` to recover
    `active_selected_tools` from prior `search_tool_bm25` outputs:
       - tracks matching `call_id`s
       - parses function output text JSON
       - extracts `active_selected_tools`
       - latest valid payload wins
       - malformed/non-matching payloads are ignored
    4. Applied restore logic to resumed and forked startup paths in
    `core/src/codex.rs`.
    5. Updated instruction text to session/thread scope in
    `core/templates/search_tool/tool_description.md`.
    6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit
    coverage in:
       - `core/src/codex.rs`
       - `core/src/state/session.rs`
    
    ### Behavior after change
    1. Search activates matched tools.
    2. Additional searches union into active selection.
    3. Selection survives new turns in the same thread.
    4. Resume/fork restores selection from rollout history.
    5. Separate threads do not inherit selection unless forked.
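
The selection semantics listed above (deduped, order-preserving, empty clears, searches union) can be sketched as follows. `ToolSelection` and its method names are hypothetical; the commit's `SessionState::set_mcp_tool_selection` is not reproduced here, and the rollout JSON parsing is left out.

```rust
#[derive(Default)]
pub struct ToolSelection {
    active: Vec<String>,
}

impl ToolSelection {
    /// Authoritative restore: replaces the selection, dropping duplicates
    /// while keeping first-seen order. An empty input clears it.
    pub fn set(&mut self, tools: Vec<String>) {
        self.active.clear();
        self.merge(tools);
    }

    /// Search handling: unions newly matched tools into the selection.
    pub fn merge(&mut self, tools: Vec<String>) {
        for tool in tools {
            if !self.active.contains(&tool) {
                self.active.push(tool);
            }
        }
    }

    pub fn active(&self) -> &[String] {
        &self.active
    }
}
```

A linear `contains` check keeps insertion order without a second data structure, which is reasonable when the selection is small.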
  • fix: show user warning when using default fallback metadata (#11690)
    ### What
    It's currently unclear when the harness falls back to the default,
    generic `ModelInfo`. This happens when the `remote_models` feature is
    disabled or the model is truly unknown, and can lead to bad performance
    and issues in the harness.
    
    Add a user-facing warning when this happens so they are aware when their
    setup is broken.
    
    ### Tests
    Added tests, tested locally.
  • core: snapshot tests for compaction requests, post-compaction layout, some additional compaction tests (#11487)
    This PR keeps compaction context-layout test coverage separate from
    runtime compaction behavior changes, so runtime logic review can stay
    focused.
    
    ## Included
    - Adds reusable context snapshot helpers in
    `core/tests/common/context_snapshot.rs` for rendering model-visible
    request/history shapes.
    - Standardizes helper naming for readability:
      - `format_request_input_snapshot`
      - `format_response_items_snapshot`
      - `format_labeled_requests_snapshot`
      - `format_labeled_items_snapshot`
    - Expands snapshot coverage for both local and remote compaction flows:
      - pre-turn auto-compaction
      - pre-turn failure/context-window-exceeded paths
      - mid-turn continuation compaction
      - manual `/compact` with and without prior user turns
    - Captures both sides where relevant:
      - compaction request shape
      - post-compaction history layout shape
    - Adds/uses shared request-inspection helpers so assertions target
    structured request content instead of ad-hoc JSON string parsing.
    - Aligns snapshots/assertions to current behavior and leaves explicit
    `TODO(ccunningham)` notes where behavior is known and intentionally
    deferred.
    
    ## Not Included
    - No runtime compaction logic changes.
    - No model-visible context/state behavior changes.
  • Add process_uuid to sqlite logs (#11534)
    ## Summary
    This PR is the first slice of the per-session `/feedback` logging work:
    it adds a process-unique identifier to SQLite log rows.
    
    It does **not** change `/feedback` sourcing behavior yet.
    
    ## Changes
    - Add migration `0009_logs_process_id.sql` to extend `logs` with:
      - `process_uuid TEXT`
      - `idx_logs_process_uuid` index
    - Extend state log models:
      - `LogEntry.process_uuid: Option<String>`
      - `LogRow.process_uuid: Option<String>`
    - Stamp each log row with a stable per-process UUID in the sqlite log
    layer:
      - generated once per process as `pid:<pid>:<uuid>`
    - Update sqlite log insert/query paths to persist and read
    `process_uuid`:
      - `INSERT INTO logs (..., process_uuid, ...)`
      - `SELECT ..., process_uuid, ... FROM logs`
    
    ## Why
    App-server runs many sessions in one process. This change provides a
    process-scoping primitive we need for follow-up `/feedback` work, so
    threadless/process-level logs can be associated with the emitting
    process without mixing across processes.
    
    ## Non-goals in this PR
    - No `/feedback` transport/source changes
    - No attachment size changes
    - No sqlite retention/trim policy changes
    
    ## Testing
    - `just fmt`
    - CI will run the full checks
  • fix(core): add linux bubblewrap sandbox tag (#11767)
    ## Summary
    - add a distinct `linux_bubblewrap` sandbox tag when the Linux
    bubblewrap pipeline feature is enabled
    - thread the bubblewrap feature flag into sandbox tag generation for:
      - turn metadata header emission
      - tool telemetry metric tags and after-tool-use hooks
    - add focused unit tests for `sandbox_tag` precedence and Linux
    bubblewrap behavior
    
    ## Validation
    - `just fmt`
    - `cargo clippy -p codex-core --all-targets`
    - `cargo test -p codex-core sandbox_tags::tests`
    - started `cargo test -p codex-core` and stopped it per request
    
    Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
  • feat(tui) Permissions update history item (#11550)
    ## Summary
    We should document in the tui when you switch permissions!
    
    ## Testing
    - [x] Added unit tests
    - [x] Tested locally
  • feat(tui): render structured network approval prompts in approval overlay (#11674)
    ### Description
    #### Summary
    Adds the TUI UX layer for structured network approvals
    
    #### What changed
    - Updated approval overlay to display network-specific approval context
    (host/protocol).
    - Added/updated TUI wiring so approval prompts show correct network
    messaging.
    - Added tests covering the new approval overlay behavior.
    
    #### Why
    Core orchestration can now request structured network approvals; this
    ensures users see clear, contextual prompts in the TUI.
    
    #### Notes
    - UX behavior activates only when network approval context is present.
    
    ---------
    
    Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
  • feat(core): add structured network approval plumbing and policy decision model (#11672)
    ### Description
    #### Summary
    Introduces the core plumbing required for structured network approvals
    
    #### What changed
    - Added structured network policy decision modeling in core.
    - Added approval payload/context types needed for network approval
    semantics.
    - Wired shell/unified-exec runtime plumbing to consume structured
    decisions.
    - Updated related core error/event surfaces for structured handling.
    - Updated protocol plumbing used by core approval flow.
    - Included small CLI debug sandbox compatibility updates needed by this
    layer.
    
    #### Why
    Establishes the minimal backend foundation for network approvals without
    yet changing high-level orchestration or TUI behavior.
    
    #### Notes
    - Behavior remains constrained by existing requirements/config gating.
    - Follow-up PRs in the stack handle orchestration, UX, and app-server
    integration.
    
    ---------
    
    Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
  • Fixed help text for mcp and mcp-server CLI commands (#11813)
    Also removed the "[experimental]" tag since these have been stable for
    many months
    
    This addresses #11812
  • Handle model-switch base instructions after compaction (#11659)
    Strip trailing <model_switch> during model-switch compaction request,
    and append <model_switch> after model switch compaction
  • bazel: enforce MODULE.bazel.lock sync with Cargo.lock (#11790)
    ## Why this change
    
    When Cargo dependencies change, it is easy to end up with an unexpected
    local diff in `MODULE.bazel.lock` after running Bazel. That creates noisy
    working copies and pushes lockfile fixes later in the cycle. This change
    addresses that pain point directly.
    
    ## What this change enforces
    
    The expected invariant is: after dependency updates, `MODULE.bazel.lock`
    is already in sync with Cargo resolution. In practice, running
    `bazel mod deps` should not mutate the lockfile in a clean state. If it
    does, the dependency update is incomplete.
    
    ## How this is enforced
    
    This change adds a single lockfile check script that snapshots
    `MODULE.bazel.lock`, runs `bazel mod deps`, and fails if the file changes.
    The same check is wired into local workflow commands
    (`just bazel-lock-update` and `just bazel-lock-check`) and into Bazel CI
    (Linux x86_64 job) so drift is caught early and consistently. The
    developer documentation is updated in `codex-rs/docs/bazel.md` and
    `AGENTS.md` to make the expected flow explicit.
    
    `MODULE.bazel.lock` is also refreshed in this PR to match the current
    Cargo dependency resolution.
    
    ## Expected developer workflow
    
    After changing `Cargo.toml` or `Cargo.lock`, run
    `just bazel-lock-update`, then run `just bazel-lock-check`, and include
    any resulting `MODULE.bazel.lock` update in the same change.
    
    ## Testing
    
    Ran `just bazel-lock-check` locally.
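
The snapshot, run, compare shape of the check can be sketched in a few lines. This is not the script added by the PR; `lockfile_drifted` is a hypothetical helper, and the `update` closure stands in for invoking `bazel mod deps`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Snapshots `lockfile`, runs `update`, and reports whether the file's bytes
/// changed. A `true` result means the lockfile was out of sync.
pub fn lockfile_drifted<F>(lockfile: &Path, update: F) -> io::Result<bool>
where
    F: FnOnce() -> io::Result<()>,
{
    let before = fs::read(lockfile)?;
    update()?;
    let after = fs::read(lockfile)?;
    Ok(before != after)
}
```

In a real check the closure would spawn the Bazel command (for example through `std::process::Command`), and the caller would exit non-zero when the function returns `true`.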
  • feat(skills): add permission profiles from openai.yaml metadata (#11658)
    ## Summary
    
    This PR adds support for skill-level permissions in .codex/openai.yaml
    and wires that through the skill loading pipeline.
    
    ## What’s included
    
    1. Added a new permissions section for skills (network, filesystem, and
    macOS-related access).
    2. Implemented permission parsing/normalization and translation into
    runtime permission profiles.
    3. Threaded the new permission profile through SkillMetadata and loader
    flow.
    
    ## Follow-up
    
    A follow-up PR will connect these permission profiles to actual sandbox
    enforcement and add user approval prompts for executing binaries/scripts
    from skill directories.
    
    
    ## Example
    `openai.yaml` snippet:
    ```
      permissions:
        network: true
        fs_read:
          - "./data"
          - "./data"
        fs_write:
          - "./output"
        macos_preferences: "readwrite"
        macos_automation:
          - "com.apple.Notes"
        macos_accessibility: true
        macos_calendar: true
    ```
    
    compiled skill permission profile metadata (macOS): 
    ```
    SkillPermissionProfile {
          sandbox_policy: SandboxPolicy::WorkspaceWrite {
              writable_roots: vec![
                  AbsolutePathBuf::try_from("/ABS/PATH/TO/SKILL/output").unwrap(),
              ],
              read_only_access: ReadOnlyAccess::Restricted {
                  include_platform_defaults: true,
                  readable_roots: vec![
                      AbsolutePathBuf::try_from("/ABS/PATH/TO/SKILL/data").unwrap(),
                  ],
              },
              network_access: true,
              exclude_tmpdir_env_var: false,
              exclude_slash_tmp: false,
          },
          // Truncated for readability; actual generated profile is longer.
          macos_seatbelt_permission_file: r#"
      (allow user-preference-write)
      (allow appleevent-send
          (appleevent-destination "com.apple.Notes"))
      (allow mach-lookup (global-name "com.apple.axserver"))
      (allow mach-lookup (global-name "com.apple.CalendarAgent"))
      ...
      "#.to_string(),
    ```
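
The example lists `./data` twice under `fs_read`, while the compiled profile shows a single readable root. One way such normalization might work is sketched below; `resolve_roots` is a hypothetical helper, and it does not handle `..` segments or symlinks, which the real parser may treat differently.

```rust
use std::path::{Path, PathBuf};

/// Resolves relative entries against the skill directory and removes
/// duplicates while keeping first-seen order.
pub fn resolve_roots(skill_dir: &Path, entries: &[&str]) -> Vec<PathBuf> {
    let mut roots: Vec<PathBuf> = Vec::new();
    for entry in entries {
        // Re-collecting components drops interior `.` segments such as the
        // one left behind by joining "./data".
        let resolved: PathBuf = skill_dir.join(entry).components().collect();
        if !roots.contains(&resolved) {
            roots.push(resolved);
        }
    }
    roots
}
```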
  • Fix js_repl in-flight tool-call waiter race (#11800)
    ## Summary
    
    This PR fixes a race in `js_repl` tool-call draining that could leave an
    exec waiting indefinitely for in-flight tool calls to finish.
    
    The fix is in:
    
    - `codex-rs/core/src/tools/js_repl/mod.rs`
    
    ## Problem
    
    `js_repl` tracks in-flight tool calls per exec and waits for them to
    drain on completion/timeout/cancel paths.
    The previous wait logic used a check-then-wait pattern with `Notify`
    that could miss a wakeup:
    
    1. Observe `in_flight > 0`
    2. Drop lock
    3. Register wait (`notified().await`)
    
    If `notify_waiters()` happened between (2) and (3), the waiter could
    sleep until another notification that never comes.
    
    ## What changed
    
    - Updated all exec-tool-call wait loops to create an owned notification
    future while holding the lock:
      - use `Arc<Notify>::notified_owned()` instead of cloning notify and
      awaiting later.
    - Applied this consistently to:
      - `wait_for_exec_tool_calls`
      - `wait_for_all_exec_tool_calls`
      - `wait_for_exec_tool_calls_map`
    
    This preserves existing behavior while eliminating the lost-wakeup
    window.
    
    ## Test coverage
    
    Added a regression test:
    
    - `wait_for_exec_tool_calls_map_drains_inflight_calls_without_hanging`
    
    The test repeatedly races waiter/finisher tasks and asserts bounded
    completion to catch hangs.
    
    ## Impact
    
    - No API changes.
    - No user-facing behavior changes intended.
    - Improves reliability of exec lifecycle boundaries when tool calls are
    still in flight.
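
The lost-wakeup window described in the Problem section has a well-known analogue with standard-library primitives. The sketch below uses `std::sync::Condvar` rather than tokio's `Notify`, so it is not the code from this PR; it only shows the same principle, that the check and the start of the wait must happen without releasing the lock in between. `InFlight` and its methods are hypothetical names.

```rust
use std::sync::{Condvar, Mutex};

#[derive(Default)]
pub struct InFlight {
    count: Mutex<usize>,
    drained: Condvar,
}

impl InFlight {
    pub fn start(&self) {
        *self.count.lock().unwrap() += 1;
    }

    pub fn finish(&self) {
        let mut count = self.count.lock().unwrap();
        *count -= 1;
        if *count == 0 {
            self.drained.notify_all();
        }
    }

    /// Checks the counter and begins waiting while still holding the lock.
    /// `Condvar::wait` releases the lock and parks atomically, so `finish`
    /// cannot slip its notification into a gap between check and wait.
    pub fn wait_drained(&self) {
        let mut count = self.count.lock().unwrap();
        while *count > 0 {
            count = self.drained.wait(count).unwrap();
        }
    }

    pub fn pending(&self) -> usize {
        *self.count.lock().unwrap()
    }
}
```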
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/11796
    - 👉 `2` https://github.com/openai/codex/pull/11800
    -  `3` https://github.com/openai/codex/pull/10673
    -  `4` https://github.com/openai/codex/pull/10670
  • Fix js_repl view_image test runtime panic (#11796)
    ## Summary
    Fixes a flaky/panicking `js_repl` image-path test by running it on a
    multi-thread Tokio runtime and tightening assertions to focus on real
    behavior.
    
    ## Problem
    `js_repl_can_attach_image_via_view_image_tool` in
    `codex-rs/core/src/tools/js_repl/mod.rs` can panic under a single-thread
    test runtime with:
    
    `can call blocking only when running on the multi-threaded runtime`
    
    It also asserted a brittle user-facing text string.
    
    ## Changes
    1. Updated the test runtime to:
       `#[tokio::test(flavor = "multi_thread", worker_threads = 2)]`
    2. Removed the brittle `"attached local image path"` string assertion.
    3. Kept the concrete side-effect assertions:
       - tool call succeeds
       - image is actually injected into pending input (`InputImage` with
       `data:image/png;base64,...`)
    
    ## Why this is safe
    This is test-only behavior. No production runtime code paths are
    changed.
    
    ## Validation
    - Ran:
    `cargo test -p codex-core
    tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool --
    --nocapture`
    - Result: pass
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/11796
    -  `2` https://github.com/openai/codex/pull/11800
    -  `3` https://github.com/openai/codex/pull/10673
    -  `4` https://github.com/openai/codex/pull/10670
  • fix(protocol): make local image test Bazel-friendly (#11799)
    Fixes Bazel build failure in //codex-rs/protocol:protocol-unit-tests.
    
    The test used include_bytes! to read a PNG from codex-core assets; Cargo
    can read it, but Bazel sandboxing can't, so the crate fails to compile.
    
    This change inlines a tiny valid PNG in the test to keep it hermetic.
    
    Related regression: #10590 (cc: @charley-oai)
  • fix: send unfiltered models over model/list (#11793)
    ### What
    To unblock filtering models in VSCE, change the `model/list` app-server
    endpoint to send all models plus a visibility field, `showInPicker`, so
    filtering can be done in VSCE if desired.
    
    ### Tests
    Updated tests.
  • codex-rs: fix thread resume rejoin semantics (#11756)
    ## Summary
    - always rejoin an in-memory running thread on `thread/resume`, even
    when overrides are present
    - reject `thread/resume` when `history` is provided for a running thread
    - reject `thread/resume` when `path` mismatches the running thread
    rollout path
    - warn (but do not fail) on override mismatches for running threads
    - add more `thread_resume` integration tests and fixes; including
    restart-based resume-with-overrides coverage
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-app-server --test all thread_resume`
    - manual test with app-server-test-client
    https://github.com/openai/codex/pull/11755
    - manual test both stdio and websocket in app
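
The rules in the Summary can be read as a small decision function. The sketch below is one interpretation under assumptions: all type and field names are hypothetical, only a model override is modeled, and the error and warning strings are placeholders.

```rust
use std::path::PathBuf;

pub struct RunningThread {
    pub rollout_path: PathBuf,
    pub model: String,
}

#[derive(Default)]
pub struct ResumeRequest {
    pub history_provided: bool,
    pub path: Option<PathBuf>,
    pub model_override: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum ResumeOutcome {
    Rejoin { warnings: Vec<String> },
    Reject(String),
}

/// Decides how `thread/resume` treats a thread that is already running.
pub fn resume_running(thread: &RunningThread, req: &ResumeRequest) -> ResumeOutcome {
    if req.history_provided {
        return ResumeOutcome::Reject("history is not accepted for a running thread".into());
    }
    if let Some(path) = &req.path {
        if path != &thread.rollout_path {
            return ResumeOutcome::Reject("path does not match the running thread".into());
        }
    }
    // Override mismatches produce warnings but never block the rejoin.
    let mut warnings = Vec::new();
    if let Some(model) = &req.model_override {
        if model != &thread.model {
            warnings.push(format!("model override {model} differs from running thread"));
        }
    }
    ResumeOutcome::Rejoin { warnings }
}
```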
  • [app-server] add fuzzyFileSearch/sessionCompleted (#11773)
    This allows the client to know when to stop showing a spinner.
  • turn metadata followups (#11782)
    Some trivial simplifications from #11677.
  • tui: preserve remote image attachments across resume/backtrack (#10590)
    ## Summary
    This PR makes app-server-provided image URLs first-class attachments in
    TUI, so they survive resume/backtrack/history recall and are resubmitted
    correctly.
    
    <img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM"
    src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927"
    />
    
    Can delete the attached image upon backtracking:
    <img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM"
    src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27"
    />
    
    In both history and composer, remote images are rendered as normal
    `[Image #N]` placeholders, with numbering unified with local images.
    
    ## What changed
    - Plumb remote image URLs through TUI message state:
      - `UserHistoryCell`
      - `BacktrackSelection`
      - `ChatComposerHistory::HistoryEntry`
      - `ChatWidget::UserMessage`
    - Show remote images as placeholder rows inside the composer box (above
    textarea), and in history cells.
    - Support keyboard selection/deletion for remote image rows in composer
    (`Up`/`Down`, `Delete`/`Backspace`).
    - Preserve remote-image-only turns in local composer history (Up/Down
    recall), including restore after backtrack.
    - Ensure submit/queue/backtrack resubmit include remote images in model
    input (`UserInput::Image`), and keep request shape stable for
    remote-image-only turns.
    - Keep image numbering contiguous across remote + local images:
      - remote images occupy `[Image #1]..[Image #M]`
      - local images start at `[Image #M+1]`
      - deletion renumbers consistently.
    - In protocol conversion, increment shared image index for remote images
    too, so mixed remote/local image tags stay in a single sequence.
    - Simplify restore logic to trust in-memory attachment order (no
    placeholder-number parsing path).
    - Backtrack/replay rollback handling now queues trims through
    `AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred
    lines after trims, so overlay/transcript state stays consistent.
    - Trim trailing blank rendered lines from user history rendering to
    avoid oversized blank padding.
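
The contiguous numbering rule (remote images first, local images after) is easy to state in code. `image_placeholders` is a hypothetical helper written for illustration, not a function from the TUI.

```rust
/// Returns placeholder labels for remote and local images. Remote images
/// take `[Image #1]..[Image #M]`; local images continue from `M + 1`.
pub fn image_placeholders(remote: usize, local: usize) -> (Vec<String>, Vec<String>) {
    let label = |n: usize| format!("[Image #{n}]");
    let remote_labels = (1..=remote).map(label).collect();
    let local_labels = (remote + 1..=remote + local).map(label).collect();
    (remote_labels, local_labels)
}
```

Because labels are derived from counts rather than stored, deleting a remote image and recomputing renumbers everything consistently: with two remote images and one local image, the local one is `[Image #3]`, and after one remote image is deleted it becomes `[Image #2]`.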
    
    ## Docs + tests
    - Updated: `docs/tui-chat-composer.md` (remote image flow,
    selection/deletion, numbering offsets)
    - Added/updated tests across `tui/src/chatwidget/tests.rs`,
    `tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`,
    and `tui/src/bottom_pane/chat_composer.rs`
    - Added snapshot coverage for remote image composer states, including
    deleting the first of two remote images.
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-tui`
    
    ## Codex author
    `codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`
  • rmcp-client: fix auth crash (#11692)
    Don't load auth tokens if bearer token is present. This fixes a crash I
    was getting on Linux:
    
    ```
    2026-02-12T23:26:24.999408Z DEBUG session_init: codex_core::codex: Configuring session: model=gpt-5.3-codex-spark; provider=ModelProviderInfo { name: "OpenAI", base_url: None, env_key: None, env_key_instructions: None, experimental_bearer_token: None, wire_api: Responses, query_params: None, http_headers: Some({"version": "0.0.0"}), env_http_headers: Some({"OpenAI-Project": "OPENAI_PROJECT", "OpenAI-Organization": "OPENAI_ORGANIZATION"}), request_max_retries: None, stream_max_retries: None, stream_idle_timeout_ms: None, requires_openai_auth: true, supports_websockets: true }
    2026-02-12T23:26:24.999799Z TRACE session_init: codex_keyring_store: keyring.load start, service=Codex MCP Credentials, account=codex_apps|20398391ad12d90b
    
    thread 'tokio-runtime-worker' (96190) has overflowed its stack
    fatal runtime error: stack overflow, aborting
        Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.35s
    ```
  • fix(nix): use correct version from Cargo.toml in flake build (#11770)
    ## Summary
    
    - When building via `nix build`, the binary reports `codex-cli 0.0.0`
    because the workspace `Cargo.toml` uses `0.0.0` as a placeholder on
    `main`. This causes the update checker to always prompt users to upgrade
    even when running the latest code.
    - Reads the version from `codex-rs/Cargo.toml` at flake evaluation time
    using `builtins.fromTOML` and patches it into the workspace `Cargo.toml`
    before cargo builds via `postPatch`.
    - On release commits (e.g. tag `rust-v0.101.0`), the real version is
    used as-is. On `main` branch builds, falls back to
    `0.0.0-dev+<shortRev>` (or `0.0.0-dev+dirty`), which the update
    checker's `parse_version` ignores — suppressing the spurious upgrade
    prompt.
    
    | Scenario | Cargo.toml version | Nix `version` | Binary reports | Upgrade nag? |
    |---|---|---|---|---|
    | Release commit (e.g. `rust-v0.101.0`) | `0.101.0` | `0.101.0` | `codex-cli 0.101.0` | Only if newer exists |
    | Main branch (committed) | `0.0.0` | `0.0.0-dev+b934ffc` | `codex-cli 0.0.0-dev+b934ffc` | No |
    | Main branch (uncommitted) | `0.0.0` | `0.0.0-dev+dirty` | `codex-cli 0.0.0-dev+dirty` | No |
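
The "Upgrade nag?" column follows if the update checker only understands plain `MAJOR.MINOR.PATCH` strings. The sketch below is a guess at that behavior, consistent with the table; the actual `parse_version` is not shown in the commit, and both function names here are hypothetical.

```rust
/// Parses a plain `MAJOR.MINOR.PATCH` string. Anything carrying pre-release
/// or build metadata (for example `0.0.0-dev+dirty`) yields `None`.
pub fn parse_plain_version(text: &str) -> Option<(u64, u64, u64)> {
    let mut parts = text.trim().split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Prompts only when both versions parse and the latest is strictly newer.
pub fn should_prompt_upgrade(current: &str, latest: &str) -> bool {
    match (parse_plain_version(current), parse_plain_version(latest)) {
        (Some(cur), Some(new)) => new > cur,
        _ => false,
    }
}
```

Under this reading, a bare `0.0.0` always compares older than any release, which is the spurious prompt the commit removes, while the `-dev+...` suffix makes the current version unparseable and suppresses the prompt.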
    
    ## Test plan
    
    - [ ] `nix build` from `main` branch and verify `codex --version`
    reports `0.0.0-dev+<shortRev>` instead of `0.0.0`
    - [ ] Verify the update checker does not show a spurious upgrade prompt
    for dev builds
    - [ ] Confirm that on a release commit where `Cargo.toml` has a real
    version, the binary reports that version correctly
  • Improve GitHub issue deduplication reliability by introducing a staged two-pass Codex search strategy (#11769)
    The strategy has deterministic fallback behavior. This change also
    removes an obsolete prompt file that was no longer used.
    
    ### Changes
    - Updated `workflows/issue-deduplicator.yml`:
      - Added richer issue input fields (`state`, `updatedAt`, `labels`) for
      model context.
      - Added two candidate pools:
        - `codex-existing-issues-all.json` (`--state all`)
        - `codex-existing-issues-open.json` (`--state open`)
      - Added body truncation during JSON preparation to reduce prompt noise.
      - Added **Pass 1** Codex run over all issues.
      - Added normalization/validation step for Pass 1 output:
        - tolerant JSON parsing
        - self-issue filtering
        - deduplication
        - cap to 5 results
      - Added **Pass 2 fallback** Codex run over open issues only, triggered
      only when Pass 1 has no usable matches.
      - Added normalization/validation step for Pass 2 output (same
      filtering/dedup/cap behavior).
      - Added final deterministic selector:
        - prefer pass 2 if it finds matches
        - otherwise use pass 1
        - otherwise return no matches
      - Added observability logs:
        - pool sizes
        - per-pass parse/match status
        - final pass selected and final duplicate count
      - Kept public issue-comment format unchanged.
      - Added comment documenting that prompt text now lives inline in
      workflow.
    
    - Deleted obsolete file:
      - `/prompts/issue-deduplicator.txt`
    
    ### Behavior Impact
    - Better duplicate recall when broad search fails by retrying against
    active issues only.
    - More deterministic/noise-resistant output handling.
    - No change to workflow trigger conditions, permissions, or issue
    comment structure.
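
The normalization step and the final selector can be sketched as two small functions. The workflow itself is YAML with inline scripts, so this Rust version is only an illustration of the logic; the function names and the use of issue numbers as `u64` are assumptions.

```rust
/// Normalizes one pass's candidates: drops the issue being triaged, removes
/// duplicates in first-seen order, and caps the result at five entries.
pub fn normalize_matches(candidates: &[u64], self_issue: u64) -> Vec<u64> {
    let mut out: Vec<u64> = Vec::new();
    for &issue in candidates {
        if issue != self_issue && !out.contains(&issue) {
            out.push(issue);
        }
        if out.len() == 5 {
            break;
        }
    }
    out
}

/// Final selector in the order the commit lists: pass 2 if it found
/// matches, otherwise pass 1, otherwise nothing.
pub fn select_final(pass1: Vec<u64>, pass2: Vec<u64>) -> (&'static str, Vec<u64>) {
    if !pass2.is_empty() {
        ("pass2", pass2)
    } else if !pass1.is_empty() {
        ("pass1", pass1)
    } else {
        ("none", Vec::new())
    }
}
```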
  • support app usage analytics (#11687)
    Emit app-mentioned and app-used events, deduplicated by (turn_id, connector_id).
    
    Example event params:
    {
        "event_type": "codex_app_used",
        "connector_id": "asdk_app_xxx",
        "thread_id": "019c5527-36d4-xxx",
        "turn_id": "019c552c-cd17-xxx",
        "app_name": "Slack (OpenAI Internal)",
        "product_client_id": "codex_cli_rs",
        "invoke_type": "explicit",
        "model_slug": "gpt-5.3-codex"
    }
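
Deduplicating by `(turn_id, connector_id)` amounts to a set of seen pairs. The sketch below assumes one tracker per event type, which the commit does not state; `AppUsageDedup` and `should_emit` are hypothetical names.

```rust
use std::collections::HashSet;

#[derive(Default)]
pub struct AppUsageDedup {
    seen: HashSet<(String, String)>,
}

impl AppUsageDedup {
    /// Returns true the first time a (turn_id, connector_id) pair is seen
    /// and false for every repeat, so each pair emits at most one event.
    pub fn should_emit(&mut self, turn_id: &str, connector_id: &str) -> bool {
        self.seen
            .insert((turn_id.to_string(), connector_id.to_string()))
    }
}
```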