211 Commits

  • Remove ApiPrompt (#11265)
    Keep things simple and build a full Responses API request request right
    in the model client
  • memories: add extraction and prompt module foundation (#11200)
    ## Summary
    - add the new `core/src/memories` module (phase-one parsing, rollout
    filtering, storage, selection, prompts)
    - add Askama-backed memory templates for stage-one input/system and
    consolidation prompts
    - add module tests for parsing, filtering, path bucketing, and summary
    maintenance
    
    ## Testing
    - just fmt
    - cargo test -p codex-core --lib memories::
  • chore: put crypto provider logic in a shared crate (#11294)
    Ensures a process-wide rustls crypto provider is installed.
    
    Both the `codex-network-proxy` and `codex-api` crates need this.
  • Translate websocket errors (#10937)
    When getting errors over a websocket connection, translate the error
    into our regular API error format
  • feat: enable premessage-deflate for websockets (#10966)
    note:
    unfortunately, tokio-tungstenite / tungstenite upgrade triggers some
    problems with linker of rama-tls-boring with openssl:
    ```
    error: linking with `/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh` failed: exit status: 1
      |
      = note:  "/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh" "-m64" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/rcrt1.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crti.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtbeginS.o" "<1 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/{liblzma_sys-662a82316f96ec30,libbzip2_sys-bf78a2d58d5cbce6,liblibsqlite3_sys-6c004987fd67a36a,libtree_sitter_bash-220b99a97d331ab7,libtree_sitter-858f0a1dbfea58bd,libzstd_sys-6eb237deec748c5b,libring-2a87376483bf916f,libopenssl_sys-7c189e68b37fe2bb,liblibz_sys-4344eef4345520b1,librama_boring_sys-0414e98115015ee0}.rlib" "-lc++" "-lc++abi" "-lunwind" "-lc" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/libcompiler_builtins-*.rlib" "-L" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/raw-dylibs" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-nostartfiles" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libz-sys-ff5ea50d88c28ffb/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/ring-bdec3dddc19f5a5e/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/openssl-sys-96e0870de3ca22bc/out/openssl-build/install/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/zstd-sys-0cc37a5da1481740/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-72d2418073317c0f/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-bash-bfd293a9f333ce6a/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libsqlite3-sys-b78b2cfb81a330fc/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/bzip2-sys-69a145cc859ef275/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/lzma-sys-07e92d0b6baa6fd4/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/ssl/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib" "-o" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/deps/codex_network_proxy-d08268b863517761" "-Wl,--gc-sections" "-static-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-Wl,--strip-all" "-nodefaultlibs" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtendS.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtn.o"
      = note: some arguments are omitted. use `--verbose` to show all linker arguments
      = note: warning: ignoring deprecated linker optimization setting '1'
              warning: unable to open library directory '/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/': FileNotFound
              ld.lld: error: duplicate symbol: SSL_export_keying_material
              >>> defined at ssl_lib.c:3816 (ssl/ssl_lib.c:3816)
              >>>            libssl-lib-ssl_lib.o:(SSL_export_keying_material) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
              >>> defined at t1_enc.cc:205 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/ssl/t1_enc.cc:205)
              >>>            t1_enc.cc.o:(.text.SSL_export_keying_material+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib
    
              ld.lld: error: duplicate symbol: d2i_ASN1_TIME
              >>> defined at a_time.c:27 (crypto/asn1/a_time.c:27)
              >>>            libcrypto-lib-a_time.o:(d2i_ASN1_TIME) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
              >>> defined at a_time.cc:34 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/crypto/asn1/a_time.cc:34)
              >>>            a_time.cc.o:(.text.d2i_ASN1_TIME+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib
    ``` 
    
    that force me to migrate away from rama-tls-boring to rama-tls-rustls
    and pin `ring` for rustls.
  • Support alternative websocket API (#10861)
    **Test plan**
    
    ```
    cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \
      ./target/debug/codex \
        --enable responses_websockets_v2 \
        --profile byok \
        --full-auto
    ```
  • Add a codex.rate_limits event for websockets (#10324)
    When communicating over websockets, we can't rely on headers to deliver
    rate limit information. This PR adds a `codex.rate_limits` event that
    the server can pass to the client to inform them about rate limit usage.
    The client parses this data the same way we parse rate limit headers in
    HTTP mode.
    
    This PR also wires up the etag and reasoning headers for websockets
  • [Codex][CLI] Gate image inputs by model modalities (#10271)
    ###### Summary
    
    - Add input_modalities to model metadata so clients can determine
    supported input types.
    - Gate image paste/attach in TUI when the selected model does not
    support images.
    - Block submits that include images for unsupported models and show a
    clear warning.
    - Propagate modality metadata through app-server protocol/model-list
    responses.
      - Update related tests/fixtures.
    
      ###### Rationale
    
      - Models support different input modalities.
    - Clients need an explicit capability signal to prevent unsupported
    requests.
    - Backward-compatible defaults preserve existing behavior when modality
    metadata is absent.
    
      ###### Scope
    
      - codex-rs/protocol, codex-rs/core, codex-rs/tui
      - codex-rs/app-server-protocol, codex-rs/app-server
      - Generated app-server types / schema fixtures
    
      ###### Trade-offs
    
    - Default behavior assumes text + image when field is absent for
    compatibility.
      - Server-side validation remains the source of truth.
    
      ###### Follow-up
    
    - Non-TUI clients should consume input_modalities to disable unsupported
    attachments.
    - Model catalogs should explicitly set input_modalities for text-only
    models.
    
      ###### Testing
    
      - cargo fmt --all
      - cargo test -p codex-tui
      - env -u GITHUB_APP_KEY cargo test -p codex-core --lib
      - just write-app-server-schema
    - cargo run -p codex-cli --bin codex -- app-server generate-ts --out
    app-server-types
      - test against local backend
      
    <img width="695" height="199" alt="image"
    src="https://github.com/user-attachments/assets/d22dd04f-5eba-4db9-a7c5-a2506f60ec44"
    />
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • chore: add phase to message responseitem (#10455)
    ### What
    
    add wiring for `phase` field on `ResponseItem::Message` to lay
    groundwork for differentiating model preambles and final messages.
    currently optional.
    
    follows pattern in #9698.
    
    updated schemas with `just write-app-server-schema` so we can see type
    changes.
    
    ### Tests
    Updated existing tests for SSE parsing and hydrating from history
  • Add websocket telemetry metrics and labels (#10316)
    Summary
    - expose websocket telemetry hooks through the responses client so
    request durations and event processing can be reported
    - record websocket request/event metrics and emit runtime telemetry
    events that the history UI now surfaces
    - improve tests to cover websocket telemetry reporting and guard runtime
    summary updates
    
    
    <img width="824" height="79" alt="Screenshot 2026-01-31 at 5 28 12 PM"
    src="https://github.com/user-attachments/assets/ea9a7965-d8b4-4e3c-a984-ef4fdc44c81d"
    />
  • display promo message in usage error (#10285)
    If a promo message is attached to a rate limit response, then display it
    in the error message.
  • fix: dont auto-enable web_search for azure (#10266)
    seeing issues with azure after default-enabling web search: #10071,
    #10257.
    
    need to work with azure to fix api-side, for now turning off
    default-enable of web_search for azure.
    
    diff is big because i moved logic to reuse
  • chore(personality) new schema with fallbacks (#10147)
    ## Summary
    Let's dial in this api contract in a bit more with more robust fallback
    behavior when model_instructions_template is false.
    
    Switches to a more explicit template / variables structure, with more
    fallbacks.
    
    ## Testing
    - [x] Adding unit tests
    - [x] Tested locally
  • fix: handle all web_search actions and in progress invocations (#9960)
    ### Summary
    - Parse all `web_search` tool actions (`search`, `find_in_page`,
    `open_page`).
    - Previously we only parsed + displayed `search`, which made the TUI
    appear to pause when the other actions were being used.
    - Show in progress `web_search` calls as `Searching the web`
      - Previously we only showed completed tool calls
    
    <img width="308" height="149" alt="image"
    src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
    />
    
    ### Tests
    Added + updated tests, tested locally
    
    ### Follow ups
    Update VSCode extension to display these as well
  • feat: support proxy for ws connection (#9719)
    reapply websocket changes without changing tls lib.
  • Support end_turn flag (#9698)
    Experimental flag that signals the end of the turn.
  • feat(core) ModelInfo.model_instructions_template (#9597)
    ## Summary
    #9555 is the start of a rename, so I'm starting to standardize here.
    Sets up `model_instructions` templating with a strongly-typed object for
    injecting a personality block into the model instructions.
    
    ## Testing
    - [x] Added tests
    - [x] Ran locally
  • Add websockets logging (#9633)
    To help with debugging.
  • feat: support proxy for ws connection (#9409)
    unfortunately tokio-tungstenite doesn't support proxy configuration
    outbox, while https://github.com/snapview/tokio-tungstenite/pull/370 is
    in review, we can depend on source code for now.
  • Act on reasoning-included per turn (#9402)
    - Reset reasoning-included flag each turn and update compaction test
  • fix(codex-api): treat invalid_prompt as non-retryable (#9400)
    **Goal**: Prevent response.failed events with `invalid_prompt` from
    being treated as retryable errors so the UI shows the actual error
    message instead of continually retrying.
    
    **Before**: Codex would continue to retry despite the prompt being
    marked as disallowed
    **After**: Codex will stop retrying once prompt is marked disallowed
  • Turn-state sticky routing per turn (#9332)
    - capture the header from SSE/WS handshakes, store it per
    ModelClientSession using `Oncelock`, echo it on turn-scoped requests,
    and add SSE+WS integration tests for within-turn persistence +
    cross-turn reset.
    
    - keep `x-codex-turn-state` sticky within a user turn to maintain
    routing continuity for retries/tool follow-ups.
  • fix: eliminate unnecessary clone() for each SSE event (#9238)
    Given how many SSE events we get, seems worth fixing.
  • fix: Emit response.completed immediately for Responses SSE (#9170)
    we see windows test failures like this:
    https://github.com/openai/codex/actions/runs/20930055601/job/60138344260.
    
    The issue is that SSE connections sometimes remain open after the
    completion event esp. for windows. We should emit the completion event
    and return immediately. this is consistent with the protocol:
    
    > The Model streams responses back in an SSE, which are collected until
    "completed" message and the SSE terminates
    
    from
    https://github.com/openai/codex/blob/dev/cc/fix-windows-test/codex-rs/docs/protocol_v1.md#L37.
    
    this helps us achieve parity with responses websocket logic here:
    https://github.com/openai/codex/blob/dev/cc/fix-windows-test/codex-rs/codex-api/src/endpoint/responses_websocket.rs#L220-L227.
  • Support response.done and add integration tests (#9129)
    The agent loop using a persistent incremental web socket connection.
  • Websocket append support (#9128)
    Support an incremental append request in websocket transport.
  • Reuse websocket connection (#9127)
    Reuses the connection but still sends full requests.
  • Add model client sessions (#9102)
    Maintain a long-running session.
  • Extract single responses SSE event parsing (#9114)
    To be reused in WebSockets parsing.
  • Remove unused conversation_id header (#9107)
    It's an exact copy of session_id
  • feat: add support for building with Bazel (#8875)
    This PR configures Codex CLI so it can be built with
    [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
    includes configuration so that remote builds can be done using
    [BuildBuddy](https://www.buildbuddy.io).
    
    If you are familiar with Bazel, things should work as you expect, e.g.,
    run `bazel test //... --keep-going` to run all the tests in the repo,
    but we have also added some new aliases in the `justfile` for
    convenience:
    
    - `just bazel-test` to run tests locally
    - `just bazel-remote-test` to run tests remotely (currently, the remote
    build is for x86_64 Linux regardless of your host platform). Note we are
    currently seeing the following test failures in the remote build, so we
    still need to figure out what is happening here:
    
    ```
    failures:
        suite::compact::manual_compact_twice_preserves_latest_user_messages
        suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
        suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
    ```
    
    - `just build-for-release` to build release binaries for all
    platforms/architectures remotely
    
    To setup remote execution:
    - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
    employees should also request org access at
    https://openai.buildbuddy.io/join/ with their `@openai.com` email
    address.)
    - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
    `~/.bazelrc` (add the line `build
    --remote_header=x-buildbuddy-api-key=YOUR_KEY`)
    - Use `--config=remote` in your `bazel` invocations (or add `common
    --config=remote` to your `~/.bazelrc`, or use the `just` commands)
    
    ## CI
    
    In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
    uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
    (we are working on supporting Windows, but that is not ready yet). Note
    that the failures we are seeing in `just bazel-remote-test` do not occur
    on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
    is green right now.
    
    The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
    that macOS CI jobs build _remotely_ on Linux hosts (using the
    `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
    root `BUILD.bazel`) using cross-compilation to build the macOS
    artifacts. Then these artifacts are downloaded locally to GitHub's macOS
    runner so the tests can be executed natively. This is the relevant
    config that enables this:
    
    ```
    common:macos --config=remote
    common:macos --strategy=remote
    common:macos --strategy=TestRunner=darwin-sandbox,local
    ```
    
    Because of the remote caching benefits we get from BuildBuddy, these new
    CI jobs can be extremely fast! For example, consider these two jobs that
    ran all the tests on Linux x86_64:
    
    - Bazel 1m37s
    https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
    - Cargo 9m20s
    https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875
    
    For now, we will continue to run both the Bazel and Cargo jobs for PRs,
    but once we add support for Windows and running Clippy, we should be
    able to cutover to using Bazel exclusively for PRs, which should still
    speed things up considerably. We will probably continue to run the Cargo
    jobs post-merge for commits that land on `main` as a sanity check.
    
    Release builds will also continue to be done by Cargo for now.
    
    Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
    Earlier attempt to add support for Buck2, now abandoned:
    https://github.com/openai/codex/pull/8504
    
    ---------
    
    Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • Add feature for optional request compression (#8767)
    Adds a new feature
    `enable_request_compression` that will compress using zstd requests to
    the codex-backend. Currently only enabled for codex-backend so only enabled for openai providers when using chatgpt::auth even when the feature is enabled
    
    Added a new info log line too for evaluating the compression ratio and
    overhead off compressing before requesting. You can enable with
    `RUST_LOG=$RUST_LOG,codex_client::transport=info`
    
    ```
    2026-01-06T00:09:48.272113Z  INFO codex_client::transport: Compressed request body with zstd pre_compression_bytes=28914 post_compression_bytes=11485 compression_duration_ms=0
    ```
  • Merge Modelfamily into modelinfo (#8763)
    - Merge ModelFamily into ModelInfo
    - Remove logic for adding instructions to apply patch
    - Add compaction limit and visible context window to `ModelInfo`
  • fix(codex-api): handle Chat Completions DONE sentinel (#8708)
    Context
    - This code parses Server-Sent Events (SSE) from the legacy Chat
    Completions streaming API (wire_api = "chat").
    - The upstream protocol terminates a stream with a final sentinel event:
    data: [DONE].
    - Some of our test stubs/helpers historically end the stream with data:
    DONE (no brackets).
    
    How this was found
    - GitHub Actions on Windows failed in codex-app-server integration tests
    with wiremock verification errors (expected multiple POSTs, got 1).
    
    Diagnosis
    - The job logs included: codex_api::sse::chat: Failed to parse
    ChatCompletions SSE event ... data: DONE.
    - eventsource_stream surfaces the sentinel as a normal SSE event; it
    does not automatically close the stream.
    - The parser previously attempted to JSON-decode every data: payload.
    The sentinel is not JSON, so we logged and skipped it, then continued
    polling.
    - On servers that keep the HTTP connection open after emitting the
    sentinel (notably wiremock on Windows), skipping the sentinel meant we
    never emitted ResponseEvent::Completed.
    - Higher layers wait for completion before progressing (emitting
    approval requests and issuing follow-up model calls), so the test never
    reached the subsequent requests and wiremock panicked when its
    expected-call count was not met.
    
    Fix
    - Treat both data: [DONE] and data: DONE as explicit end-of-stream
    sentinels.
    - When a sentinel is seen, flush any pending assistant/reasoning items
    and emit ResponseEvent::Completed once.
    
    Tests
    - Add a regression unit test asserting we complete on the sentinel even
    if the underlying connection is not closed.
  • fix: chat multiple tool calls (#8556)
    Fix this: https://github.com/openai/codex/issues/8479
    
    The issue is that chat completion API expect all the tool calls in a
    single assistant message and then all the tool call output in a single
    response message
  • Refresh on models etag mismatch (#8491)
    - Send models etag
    - Refresh models on 412
    - This wires `ModelsManager` to `ModelFamily` so we don't mutate it
    mid-turn
  • Remove reasoning format (#8484)
    This isn't very useful parameter. 
    
    logic:
    ```
    if model puts `**` in their reasoning, trim it and visualize the header.
    if couldn't trim: don't render
    if model doesn't support: don't render
    ```
    
    We can simplify to:
    ```
    if could trim, visualize header.
    if not, don't render
    ```
  • remove minimal client version (#8447)
    This isn't needed value by client
  • feat: experimental menu (#8071)
    This will automatically render any `Stage::Beta` features.
    
    The change only gets applied to the *next session*. This started as a
    bug but actually this is a good thing to prevent out of distribution
    push
    
    <img width="986" height="288" alt="Screenshot 2025-12-15 at 15 38 35"
    src="https://github.com/user-attachments/assets/78b7a71d-0e43-4828-a118-91c5237909c7"
    />
    
    
    <img width="509" height="109" alt="Screenshot 2025-12-15 at 17 35 44"
    src="https://github.com/user-attachments/assets/6933de52-9b66-4abf-b58b-a5f26d5747e2"
    />
  • nit: trace span for regular task (#8053)
    Logs are too spammy
    
    ---------
    
    Co-authored-by: Anton Panasenko <apanasenko@openai.com>