Commit Graph

3718 Commits

  • Update context window after model switch (#11520)
    - Update token usage aggregation to refresh model context window after a
    model change.
    - Add protocol/core tests, including an e2e model-switch test that
    validates switching to a smaller model updates telemetry.
  • Clamp auto-compact limit to context window (#11516)
    - Clamp auto-compaction to the minimum of configured limit and 90% of
    context window
    - Add an e2e compact test for clamped behavior
    - Update remote compact tests to account for earlier auto-compaction in
    setup turns
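    
    The clamping rule above can be sketched as follows. This is a minimal
    illustration, not the code from the PR: the function name and the
    handling of an unset limit (`None`) are assumptions.
    
    ```rust
    /// Effective auto-compaction threshold, in tokens.
    /// Hypothetical helper: the name and the `None` case are assumptions.
    fn effective_auto_compact_limit(configured_limit: Option<u64>, context_window: u64) -> u64 {
        // 90% of the context window, using integer arithmetic.
        let window_cap = context_window.saturating_mul(9) / 10;
        match configured_limit {
            // Never compact later than 90% of the window, even if configured higher.
            Some(limit) => limit.min(window_cap),
            // Assumption: with no configured limit, fall back to the window cap.
            None => window_cap,
        }
    }
    ```
    
    With a 200,000-token window, a configured limit of 250,000 would be
    clamped to 180,000, while a configured limit of 100,000 is kept as-is.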
  • Pre-sampling compact with previous model context (#11504)
    - Run pre-sampling compact through a single helper that builds
    previous-model turn context and compacts before the follow-up request
    when switching to a smaller context window.
    - Keep compaction events on the parent turn id and add compact suite
    coverage for switch-in-session and resume+switch flows.
  • change model cap to server overload (#11388)
  • Consolidate search_tool feature into apps (#11509)
    ## Summary
    - Remove `Feature::SearchTool` and the `search_tool` config key from the
    feature registry/schema.
    - Gate `search_tool_bm25` exposure via `Feature::Apps` in
    `core/src/tools/spec.rs`.
    - Update MCP selection logic in `core/src/codex.rs` to use
    `Feature::Apps` for search-tool behavior.
    - Update `core/tests/suite/search_tool.rs` to enable `Feature::Apps`.
    - Regenerate `core/config.schema.json` via `just write-config-schema`.
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-core --test all suite::search_tool::`
    
    ## Tickets
    - None
  • feat: try to fix bugs I saw in the wild in the resource parsing logic (#11513)
    I gave Codex the following bug report about the logic to report the
    host's resources introduced in
    https://github.com/openai/codex/pull/11488 and this PR is its proposed
    fix.
    
    The fix seems like an escaping issue, mostly.
    
    ---
    
    The logic to print out the runner specs has an awk error on Mac:
    
    ```
    Runner: GitHub Actions 1014936475
    OS: macOS 15.7.3
    Hardware model: VirtualMac2,1
    CPU architecture: arm64
    Logical CPUs: 5
    Physical CPUs: 5
    awk: syntax error at source line 1
     context is
    	{printf >>>  \ <<< "%.1f GiB\\n\", $1 / 1024 / 1024 / 1024}
    awk: illegal statement at source line 1
    Total RAM: 
    Disk usage:
    Filesystem      Size    Used   Avail Capacity iused ifree %iused  Mounted on
    /dev/disk3s5   320Gi   237Gi    64Gi    79%    2.0M  671M    0%   /System/Volumes/Data
    ```
    
    as well as Linux:
    
    ```
    Runner: GitHub Actions 1014936469
    OS: Linux runnervmwffz4 6.11.0-1018-azure #18~24.04.1-Ubuntu SMP Sat Jun 28 04:46:03 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
    awk: cmd. line:1: /Model name/ {gsub(/^[ \t]+/,\"\",$2); print $2; exit}
    awk: cmd. line:1:                              ^ backslash not last character on line
    CPU model: 
    Logical CPUs: 4
    awk: cmd. line:1: /MemTotal/ {printf \"%.1f GiB\\n\", $2 / 1024 / 1024}
    awk: cmd. line:1:                    ^ backslash not last character on line
    Total RAM: 
    Disk usage:
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/root        72G   50G   22G  70% /
    ```
  • Hydrate previous model across resume/fork/rollback/task start (#11497)
    - Replace pending resume model state with persistent previous_model and
    hydrate it on resume, fork, rollback, and task end in spawn_task
  • Added seatbelt policy rule to allow os.cpus (#11277)
    I don't think this policy change increases the risk, other than
    potentially exposing the caller to bugs in these kernel calls, which are
    unlikely.
    
    Without this change, some tools are silently failing or making incorrect
    decisions about the processor type (e.g. installing x86 binaries rather
    than Apple silicon binaries).
    
    This addresses #11210
    
    ---------
    
    Co-authored-by: viyatb-oai <viyatb@openai.com>
  • app-server: thread resume subscriptions (#11474)
    This stack layer makes app-server thread event delivery connection-aware
    so resumed/attached threads only emit notifications and approval prompts
    to subscribed connections.
    
    - Added per-thread subscription tracking in `ThreadState`
    (`subscribed_connections`) and mapped subscription ids to `(thread_id,
    connection_id)`.
    - Updated listener lifecycle so removing a subscription or closing a
    connection only removes that connection from the thread’s subscriber
    set; listener shutdown now happens when the last subscriber is gone.
    - Added `connection_closed(connection_id)` plumbing (`lib.rs` ->
    `message_processor.rs` -> `codex_message_processor.rs`) so disconnect
    cleanup happens immediately.
    - Scoped bespoke event handling outputs through `TargetedOutgoing` to
    send requests/notifications only to subscribed connections.
    - Kept existing `thread/resume` behavior while aligning with the latest
    split-loop transport structure.
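    
    A rough sketch of the subscription bookkeeping described above, assuming
    plain integer ids and a simplified model in which each connection holds
    at most one subscription per thread. Only `ThreadState` and
    `subscribed_connections` come from the description; every other name is
    hypothetical.
    
    ```rust
    use std::collections::{HashMap, HashSet};
    
    // Hypothetical id types; the real ones are not shown in the description.
    type ThreadId = u64;
    type ConnectionId = u64;
    type SubscriptionId = u64;
    
    #[derive(Default)]
    struct ThreadState {
        subscribed_connections: HashSet<ConnectionId>,
    }
    
    #[derive(Default)]
    struct Subscriptions {
        threads: HashMap<ThreadId, ThreadState>,
        by_subscription: HashMap<SubscriptionId, (ThreadId, ConnectionId)>,
    }
    
    impl Subscriptions {
        fn subscribe(&mut self, sub: SubscriptionId, thread: ThreadId, conn: ConnectionId) {
            self.threads.entry(thread).or_default().subscribed_connections.insert(conn);
            self.by_subscription.insert(sub, (thread, conn));
        }
    
        /// Removes one subscription. Returns true when the thread lost its last
        /// subscriber, i.e. when its listener should shut down.
        fn unsubscribe(&mut self, sub: SubscriptionId) -> bool {
            let Some((thread, conn)) = self.by_subscription.remove(&sub) else {
                return false;
            };
            let Some(state) = self.threads.get_mut(&thread) else {
                return false;
            };
            state.subscribed_connections.remove(&conn);
            if state.subscribed_connections.is_empty() {
                self.threads.remove(&thread);
                true
            } else {
                false
            }
        }
    
        /// Drops a closed connection everywhere. Returns the threads whose
        /// listeners should shut down because no subscribers remain.
        fn connection_closed(&mut self, conn: ConnectionId) -> Vec<ThreadId> {
            self.by_subscription.retain(|_, (_, c)| *c != conn);
            let mut stopped = Vec::new();
            self.threads.retain(|thread, state| {
                let removed = state.subscribed_connections.remove(&conn);
                if removed && state.subscribed_connections.is_empty() {
                    stopped.push(*thread);
                    false
                } else {
                    true
                }
            });
            stopped.sort();
            stopped
        }
    }
    ```
    
    The point of the design is that removing one subscriber never tears down
    a listener that other connections still depend on.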
  • Make codex-sdk depend on openai/codex (#11503)
    Do not bundle all binaries inside the SDK, as it makes the package huge.
    Instead, depend on openai/codex.
  • chore(tui) Simplify /status Permissions (#11290)
    ## Summary
    Consolidate `/status` Permissions lines into a simpler view. It should
    only show "Default," "Full Access," or "Custom" (with specifics)
    
    ## Testing
    - [x] many snapshots updated
  • feat: build windows support binaries in parallel (#11500)
    Windows release builds were compiling and linking four release binaries
    on a single runner, which slowed the release pipeline. The
    Windows-specific logic also made `rust-release.yml` harder to read and
    maintain.
    
    ## What Changed
    
    - Extracted Windows release logic into a reusable workflow at
    `.github/workflows/rust-release-windows.yml`.
    - Updated `.github/workflows/rust-release.yml` to call the reusable
    Windows workflow via `workflow_call`.
    - Parallelized Windows binary builds with one 4-entry matrix over two
    targets (`x86_64-pc-windows-msvc`, `aarch64-pc-windows-msvc`) and two
    bundles (`primary`, `helpers`).
    - Kept signing centralized per target by downloading both prebuilt
    bundles and signing all four executables together.
    - Preserved final release artifact behavior and filtered intermediate
    `windows-binaries*` artifacts out of the published release asset set.
  • Add AfterToolUse hook (#11335)
    Not wired up to config yet. (So we can change the name if we want)
    
    An example payload:
    
    ```
    {
      "session_id": "019c48b7-7098-7b61-bc48-32e82585d451",
      "cwd": "/Users/gt/code/codex/codex-rs",
      "triggered_at": "2026-02-10T18:02:31Z",
      "hook_event": {
        "event_type": "after_tool_use",
        "turn_id": "4",
        "call_id": "call_iuo4DqWgjE7OxQywnL2UzJUE",
        "tool_name": "apply_patch",
        "tool_kind": "custom",
        "tool_input": {
          "input_type": "custom",
          "input": "*** Begin Patch\n*** Update File: README.md\n@@\n-# Codex CLI hello (Rust Implementation)\n+# Codex CLI (Rust Implementation)\n*** End Patch\n"
        },
        "executed": true,
        "success": true,
        "duration_ms": 37,
        "mutating": true,
        "sandbox": "none",
        "sandbox_policy": "danger-full-access",
        "output_preview": "{\"output\":\"Success. Updated the following files:\\nM README.md\\n\",\"metadata\":{\"exit_code\":0,\"duration_seconds\":0.0}}"
      }
    }
    ```
  • Increased file watcher debounce duration from 1s to 10s (#11494)
    Users reported that while they were actively editing a skill file, they
    saw frequent errors (one per second) across all of their active
    sessions until they fixed all frontmatter parse errors. This change
    reduces the chatter at the expense of a slightly longer delay before
    skills are updated in the UI.
    
    This addresses #11385
  • nit: memory truncation (#11479)
    Use existing truncation for memories
  • feat: use more powerful machines for building Windows releases (#11488)
    Windows release builds in `.github/workflows/rust-release.yml` were
    still using GitHub-hosted `windows-latest` and `windows-11-arm` runners.
    This change aligns release builds with the faster dedicated Codex runner
    pool already used in CI, and adds machine-spec logging at startup so
    runner capacity (CPU/RAM/disk) is visible in build logs.
    
    ## What Changed
    
    - Updated the `build` job to support matrix entries that provide a full
    `runs_on` object:
      - `runs-on: ${{ matrix.runs_on || matrix.runner }}`
    - Switched Windows release matrix entries to Codex runners:
      - `windows-latest` -> `windows-x64` with:
        - `group: codex-runners`
        - `labels: codex-windows-x64`
      - `windows-11-arm` -> `windows-arm64` with:
        - `group: codex-runners`
        - `labels: codex-windows-arm64`
    - Updated the ARM-specific zstd install condition to match the new
    runner id:
      - `matrix.runner == 'windows-arm64'`
    - Added early platform-specific runner diagnostics steps
    (Linux/macOS/Windows) that print OS, CPU, logical CPU count, total RAM,
    and disk usage.
  • Pump pings (#11413)
    Keep processing pings even when the agent isn't actively running.
    
    Otherwise the connection will drop.
  • refactor: codex app-server ThreadState (#11419)
    This is a no-op functionality-wise. It consolidates thread-specific
    message processor / event handling state in ThreadState.
  • Add feature-gated freeform js_repl core runtime (#10674)
    ## Summary
    
    This PR adds an **experimental, feature-gated `js_repl` core runtime**
    so models can execute JavaScript in a persistent REPL context across
    tool calls.
    
    The implementation integrates with existing feature gating, tool
    registration, prompt composition, config/schema docs, and tests.
    
    ## What changed
    
    - Added new experimental feature flag: `features.js_repl`.
    - Added freeform `js_repl` tool and companion `js_repl_reset` tool.
    - Gated tool availability behind `Feature::JsRepl`.
    - Added conditional prompt-section injection for JS REPL instructions
    via marker-based prompt processing.
    - Implemented JS REPL handlers, including freeform parsing and pragma
    support (timeout/reset controls).
    - Added runtime resolution order for Node:
      1. `CODEX_JS_REPL_NODE_PATH`
      2. `js_repl_node_path` in config
      3. `PATH`
    - Added JS runtime assets/version files and updated docs/schema.
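    
    The Node resolution order listed above might look roughly like the
    sketch below. The enum, the function name, and the choice to treat empty
    strings as unset are assumptions for illustration; only the precedence
    (environment variable, then config, then `PATH`) comes from the
    description.
    
    ```rust
    use std::path::PathBuf;
    
    /// Where the Node binary was resolved from (hypothetical type).
    #[derive(Debug, PartialEq)]
    enum NodeSource {
        EnvVar(PathBuf),
        Config(PathBuf),
        SearchPath,
    }
    
    /// `env_override` stands for the value of `CODEX_JS_REPL_NODE_PATH` and
    /// `config_path` for `js_repl_node_path`; both are passed in explicitly
    /// so the precedence logic stays easy to test.
    fn resolve_node(env_override: Option<&str>, config_path: Option<&str>) -> NodeSource {
        // Assumption: an empty value counts as "not set".
        if let Some(p) = env_override.filter(|p| !p.is_empty()) {
            return NodeSource::EnvVar(PathBuf::from(p));
        }
        if let Some(p) = config_path.filter(|p| !p.is_empty()) {
            return NodeSource::Config(PathBuf::from(p));
        }
        // Last resort: let the caller look up `node` on PATH.
        NodeSource::SearchPath
    }
    ```
    
    A caller could pass
    `std::env::var("CODEX_JS_REPL_NODE_PATH").ok().as_deref()` as the first
    argument.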
    
    ## Why
    
    This enables richer agent workflows that require incremental JavaScript
    execution with preserved state, while keeping rollout safe behind an
    explicit feature flag.
    
    ## Testing
    
    Coverage includes:
    
    - Feature-flag gating behavior for tool exposure.
    - Freeform parser/pragma handling edge cases.
    - Runtime behavior (state persistence across calls and top-level `await`
    support).
    
    ## Usage
    
    ```toml
    [features]
    js_repl = true
    ```
    
    Optional runtime override:
    
    - `CODEX_JS_REPL_NODE_PATH`, or
    - `js_repl_node_path` in config.
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/10674
    -  `2` https://github.com/openai/codex/pull/10672
    -  `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • Promote Windows Sandbox (#11341)
    1. Move Windows Sandbox NUX to right after trust directory screen
    2. Don't offer read-only as an option in Sandbox NUX; the options are
    Elevated/Legacy/Quit
    3. Don't allow new untrusted directories. It's trust or quit
    4. Move experimental sandbox features to
    `[windows] sandbox="elevated|unelevated"`
    5. Copy tweaks: elevated -> default, non-elevated -> non-admin
  • Linkify feedback link (#11414)
    Make it clickable
  • fix(tui): increase paste burst char interval on Windows to 30ms (#9348)
    ## Summary
    
    - Increases `PASTE_BURST_CHAR_INTERVAL` from 8ms to 30ms on Windows to
    fix multi-line paste issues in VS Code integrated terminal
    - Follows existing pattern of platform-specific timing (like
    `PASTE_BURST_ACTIVE_IDLE_TIMEOUT`)
    
    ## Problem
    
    When pasting multi-line text in Codex CLI on Windows (especially VS Code
    integrated terminal), only the first portion is captured before
    auto-submit. The rest arrives as a separate message.
    
    **Root cause**: VS Code's terminal emulation adds latency (~10-15ms per
    character) between key events. The 8ms `PASTE_BURST_CHAR_INTERVAL`
    threshold is too tight - characters arrive slower than expected, so
    burst detection fails and Enter submits instead of inserting a newline.
    
    ## Solution
    
    Use Windows-specific timing (30ms) for `PASTE_BURST_CHAR_INTERVAL`,
    following the same pattern already used for
    `PASTE_BURST_ACTIVE_IDLE_TIMEOUT` (60ms on Windows vs 8ms on Unix).
    
    30ms is still fast enough to distinguish paste from typing (humans type
    ~200ms between keystrokes).
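    
    As an illustration of the platform-specific timing, here is a minimal
    sketch. The constant name and the 30ms/8ms values come from the
    description; the `is_paste_burst` helper and its `<=` comparison are
    assumptions, not the actual detection code.
    
    ```rust
    use std::time::Duration;
    
    // Windows terminals (notably VS Code) deliver pasted characters more slowly.
    #[cfg(windows)]
    const PASTE_BURST_CHAR_INTERVAL: Duration = Duration::from_millis(30);
    #[cfg(not(windows))]
    const PASTE_BURST_CHAR_INTERVAL: Duration = Duration::from_millis(8);
    
    /// Hypothetical helper: a key event counts as part of a paste burst when
    /// it arrives within `threshold` of the previous one.
    fn is_paste_burst(gap: Duration, threshold: Duration) -> bool {
        gap <= threshold
    }
    ```
    
    A 12ms gap, in the range reported for VS Code, falls outside the 8ms
    threshold but inside the 30ms one, while a ~200ms typing gap is rejected
    by both.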
    
    ## Test plan
    
    - [x] All existing paste_burst tests pass
    - [ ] Test multi-line paste in VS Code integrated PowerShell on Windows
    - [ ] Test multi-line paste in standalone Windows PowerShell
    - [ ] Verify no regression on macOS/Linux
    
    Fixes #2137
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • feat: remove "cargo check individual crates" from CI (#11475)
    I think this check has outlived its usefulness. It is often one of the
    last CI jobs to finish when we put up a PR, so this should save us some
    time.
  • feat: panic if Constrained<WebSearchMode> does not support Disabled (#11470)
    If this happens, this is a logical error on our part and we should fix
    it.
  • Reapply "Add app-server transport layer with websocket support" (#11370)
    Reapply "Add app-server transport layer with websocket support" with
    additional fixes from https://github.com/openai/codex/pull/11313/changes
    to avoid deadlocking.
    
    This reverts commit 47356ff83c.
    
    ## Summary
    
    To avoid deadlocking when queues are full, we maintain separate tokio
    tasks dedicated to incoming vs outgoing event handling
    - split the app-server main loop into two tasks in
    `run_main_with_transport`
       - inbound handling (`transport_event_rx`)
       - outbound handling (`outgoing_rx` + `thread_created_rx`)
    - separate incoming and outgoing websocket tasks
    
    ## Validation
    
    Integration tests, testing thoroughly e2e in codex app w/ >10 concurrent
    requests
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • Extract codex-config from codex-core (#11389)
    `codex-core` had accumulated config loading, requirements parsing,
    constraint logic, and config-layer state handling in a single crate.
    This change extracts that subsystem into `codex-config` to reduce
    `codex-core` rebuild/test surface area and isolate future config work.
    
    ## What Changed
    
    ### Added `codex-config`
    
    - Added new workspace crate `codex-rs/config` (`codex-config`).
    - Added workspace/build wiring in:
      - `codex-rs/Cargo.toml`
      - `codex-rs/config/Cargo.toml`
      - `codex-rs/config/BUILD.bazel`
    - Updated lockfiles (`codex-rs/Cargo.lock`, `MODULE.bazel.lock`).
    - Added `codex-core` -> `codex-config` dependency in
    `codex-rs/core/Cargo.toml`.
    
    ### Moved config internals from `core` into `config`
    
    Moved modules to `codex-rs/config/src/`:
    
    - `core/src/config/constraint.rs` -> `config/src/constraint.rs`
    - `core/src/config_loader/cloud_requirements.rs` ->
    `config/src/cloud_requirements.rs`
    - `core/src/config_loader/config_requirements.rs` ->
    `config/src/config_requirements.rs`
    - `core/src/config_loader/fingerprint.rs` -> `config/src/fingerprint.rs`
    - `core/src/config_loader/merge.rs` -> `config/src/merge.rs`
    - `core/src/config_loader/overrides.rs` -> `config/src/overrides.rs`
    - `core/src/config_loader/requirements_exec_policy.rs` ->
    `config/src/requirements_exec_policy.rs`
    - `core/src/config_loader/state.rs` -> `config/src/state.rs`
    
    `codex-config` now re-exports this surface from `config/src/lib.rs` at
    the crate top level.
    
    ### Updated `core` to consume/re-export `codex-config`
    
    - `core/src/config_loader/mod.rs` now imports/re-exports config-loader
    types/functions from top-level `codex_config::*`.
    - Local moved modules were removed from `core/src/config_loader/`.
    - `core/src/config/mod.rs` now re-exports constraint types from
    `codex_config`.
  • feat(core): promote Linux bubblewrap sandbox to Experimental (#11381)
    ## Summary
    - Promote `use_linux_sandbox_bwrap` to `Stage::Experimental` on Linux so
    users see it in `/experimental` and get a startup nudge.
  • Do not attempt to append after response.completed (#11402)
    Completed responses are fully done, and a new response must be created.
  • chore: rename disable_websockets -> websockets_disabled (#11420)
    `disable_websockets()` is confusing because it's a getter. Rename for
    clarity.
  • feat: set policy for phase 2 memory (#11449)
    Set the policy of the memory phase 2 worker such that it never asks for
    approval
  • feat: close mem agent after consolidation (#11455)
    Close the phase-2 agent of memory when it's done
    
    Fire and forget (i.e. best effort)
  • Cache cloud requirements (#11305)
    We're loading these from the web on every startup. This puts them in a
    local file with a 1hr TTL.
    
    We sign the downloaded requirements with a key compiled into the Codex
    CLI to prevent unsophisticated tampering (determined circumvention is
    outside of our threat model: after all, one could just compile Codex
    without any of these checks).
    
    If any of the following are true, we ignore the local cache and re-fetch
    from Cloud:
    * The signature is invalid for the payload (== requirements, sign time,
    ttl, user identity)
    * The identity does not match the auth'd user's identity
    * The TTL has expired
    * We cannot parse requirements.toml from the payload
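    
    The cache-invalidation conditions above can be sketched as a single
    check. Everything here is a simplified assumption: the struct layout,
    the names, the boolean standing in for real signature verification, and
    the closure standing in for a TOML parser. It is not the actual
    implementation.
    
    ```rust
    use std::time::{Duration, SystemTime};
    
    /// Hypothetical shape of the locally cached payload.
    struct CachedRequirements {
        requirements_toml: String,
        signed_at: SystemTime,
        ttl: Duration,
        user_identity: String,
        /// Stand-in for verifying the signature over the payload.
        signature_valid: bool,
    }
    
    #[derive(Debug, PartialEq)]
    enum CacheDecision {
        UseCache,
        Refetch(&'static str),
    }
    
    /// `parses` stands in for "requirements.toml can be parsed from the payload".
    fn check_cache(
        cache: &CachedRequirements,
        current_user: &str,
        now: SystemTime,
        parses: impl Fn(&str) -> bool,
    ) -> CacheDecision {
        if !cache.signature_valid {
            return CacheDecision::Refetch("invalid signature");
        }
        if cache.user_identity != current_user {
            return CacheDecision::Refetch("identity mismatch");
        }
        match now.duration_since(cache.signed_at) {
            Ok(age) if age <= cache.ttl => {}
            // Assumption: a sign time in the future is treated like an expired TTL.
            _ => return CacheDecision::Refetch("ttl expired"),
        }
        if !parses(&cache.requirements_toml) {
            return CacheDecision::Refetch("unparseable requirements");
        }
        CacheDecision::UseCache
    }
    ```
    
    Any single failing condition leads to a re-fetch, so the cache is only
    used when all four checks pass.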
  • feat: new memory prompts (#11439)
    * Update prompt
    * Wire CWD in the prompt
    * Handle the no-output case
  • feat: split codex-common into smaller utils crates (#11422)
    We are removing feature-gated shared crates from the `codex-rs`
    workspace. `codex-common` grouped several unrelated utilities behind
    `[features]`, which made dependency boundaries harder to reason about
    and worked against the ongoing effort to eliminate feature flags from
    workspace crates.
    
    Splitting these utilities into dedicated crates under `utils/` aligns
    this area with existing workspace structure and keeps each dependency
    explicit at the crate boundary.
    
    ## What changed
    
    - Removed `codex-rs/common` (`codex-common`) from workspace members and
    workspace dependencies.
    - Added six new utility crates under `codex-rs/utils/`:
      - `codex-utils-cli`
      - `codex-utils-elapsed`
      - `codex-utils-sandbox-summary`
      - `codex-utils-approval-presets`
      - `codex-utils-oss`
      - `codex-utils-fuzzy-match`
    - Migrated the corresponding modules out of `codex-common` into these
    crates (with tests), and added matching `BUILD.bazel` targets.
    - Updated direct consumers to use the new crates instead of
    `codex-common`:
      - `codex-rs/cli`
      - `codex-rs/tui`
      - `codex-rs/exec`
      - `codex-rs/app-server`
      - `codex-rs/mcp-server`
      - `codex-rs/chatgpt`
      - `codex-rs/cloud-tasks`
    - Updated workspace lockfile entries to reflect the new dependency graph
    and removal of `codex-common`.
  • feat: improve thread listing (#11429)
    Improve listing by doing:
    1. List using the rollout file system
    2. Upsert the result in the DB (if present)
    3. Return the result of a DB listing
    4. Fall back on the result of 1
    
    + some metrics on top of this
  • fix: flaky test (#11428)
    stage1_concurrent_claims_respect_running_cap was flaky due to SQLite
    lock contention, not cap logic correctness. The claim flow used deferred
    transactions (BEGIN) with read-then-write behavior, which can fail under
    concurrency with SQLITE_BUSY_SNAPSHOT/database is locked when upgrading
    a read transaction to a write transaction. We fixed this by using BEGIN
    IMMEDIATE for stage1 and phase2 claim paths, so lock acquisition happens
    up front and contenders serialize cleanly instead of failing during
    upgrade. After the change, codex-state tests pass and stress reruns of
    the flaky path no longer reproduced the failure.
  • Remove test-support feature from codex-core and replace it with explicit test toggles (#11405)
    ## Why
    
    `codex-core` was being built in multiple feature-resolved permutations
    because test-only behavior was modeled as crate features. For a large
    crate, those permutations increase compile cost and reduce cache reuse.
    
    ## Net Change
    
    - Removed the `test-support` crate feature and related feature wiring so
    `codex-core` no longer needs separate feature shapes for test consumers.
    - Standardized cross-crate test-only access behind
    `codex_core::test_support`.
    - External test code now imports helpers from
    `codex_core::test_support`.
    - Underlying implementation hooks are kept internal (`pub(crate)`)
    instead of broadly public.
    
    ## Outcome
    
    - Fewer `codex-core` build permutations.
    - Better incremental cache reuse across test targets.
    - No intended production behavior change.
  • tui: show non-file layer content in /debug-config (#11412)
    The debug output listed non-file-backed layers such as session flags and
    MDM managed config, but it did not show their values. That made it
    difficult to explain unexpected effective settings because users could
    not inspect those layers on disk.
    
    Now `/debug-config` might include output like this:
    
    ```
    Config layer stack (lowest precedence first):
      1. system (/etc/codex/config.toml) (enabled)
      2. user (/Users/mbolin/.codex/config.toml) (enabled)
      3. legacy managed_config.toml (mdm) (enabled)
         MDM value:
           # Production Codex configuration file.
    
           [otel]
           log_user_prompt = true
           environment = "prod"
           exporter = { otlp-http = {
             endpoint = "https://example.com/otel",
             protocol = "binary"
           }}
    ```
  • feat: support multiple rate limits (#11260)
    - Added multi-limit support end-to-end by carrying limit_name in
    rate-limit snapshots and handling multiple buckets instead of only
    codex.
    - Extended /usage client parsing to consume additional_rate_limits.
    - Updated TUI /status and in-memory state to store/render per-limit
    snapshots.
    - Extended app-server rate-limit read response: kept rate_limits and
    added rate_limits_by_name.
    - Adjusted usage-limit error messaging for non-default codex limit
    buckets.
  • chore: persist turn_id in rollout session and make turn_id uuid based (#11246)
    Problem:
    1. turn id is constructed in-memory;
    2. on resuming threads, turn_id might not be unique;
    3. clients cannot easily know the boundary of a turn from rollout files.
    
    This PR does three things:
    1. persist `task_started` and `task_complete` events;
    2. persist `turn_id` in rollout turn events;
    3. generate turn_id as unique uuids instead of incrementing it in
    memory.
    
    This helps us resolve the issue of clients wanting to have unique turn
    ids for resuming a thread, and knowing the boundary of each turn in
    rollout files.
    
    example debug logs
    ```
    2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None }
    2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None }
    2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None }
    ```
    
    backward compatibility:
    if you try to resume an old session without task_started and
    task_complete events populated, the following happens:
    - If you resume and do nothing: those reconstructed historical IDs can
    differ next time you resume.
    - If you resume and send a new turn: the new turn gets a fresh UUID from
    live submission flow and is persisted, so that new turn’s ID is stable
    on later resumes.
    I think this behavior is fine, because we only care about deterministic
    turn id once a turn is triggered.
  • Do not resend output items in incremental websockets connections (#11383)
    In incremental websocket connections, output items are already part of
    the context, so there is no need to send them again and duplicate them.