Commit Graph

62 Commits

  • Handle response.incomplete (#11558)
    Treat it same as error.
  • change model cap to server overload (#11388)
    # External (non-OpenAI) Pull Request Requirements
    
    Before opening this Pull Request, please read the dedicated
    "Contributing" markdown file or your PR may be closed:
    https://github.com/openai/codex/blob/main/docs/contributing.md
    
    If your PR conforms to our contribution guidelines, replace this text
    with a detailed and high quality description of your changes.
    
    Include a link to a bug report or enhancement request.
  • Pump pings (#11413)
    Keep processing ping even when the agent isn't actively running.
    
    Otherwise the connection will drop.
  • Do not attempt to append after response.completed (#11402)
    Completed responses are fully done, and new response must be created.
  • feat: support multiple rate limits (#11260)
    Added multi-limit support end-to-end by carrying limit_name in
    rate-limit snapshots and handling multiple buckets instead of only
    codex.
    Extended /usage client parsing to consume additional_rate_limits
    Updated TUI /status and in-memory state to store/render per-limit
    snapshots
    Extended app-server rate-limit read response: kept rate_limits and added
    rate_limits_by_name.
    Adjusted usage-limit error messaging for non-default codex limit buckets
  • Compare full request for websockets incrementality (#11343)
    Tools can dynamically change mid-turn now. We need to be more thorough
    about reusing incremental connections.
  • Remove ApiPrompt (#11265)
    Keep things simple and build a full Responses API request request right
    in the model client
  • memories: add extraction and prompt module foundation (#11200)
    ## Summary
    - add the new `core/src/memories` module (phase-one parsing, rollout
    filtering, storage, selection, prompts)
    - add Askama-backed memory templates for stage-one input/system and
    consolidation prompts
    - add module tests for parsing, filtering, path bucketing, and summary
    maintenance
    
    ## Testing
    - just fmt
    - cargo test -p codex-core --lib memories::
  • chore: put crypto provider logic in a shared crate (#11294)
    Ensures a process-wide rustls crypto provider is installed.
    
    Both the `codex-network-proxy` and `codex-api` crates need this.
  • Translate websocket errors (#10937)
    When getting errors over a websocket connection, translate the error
    into our regular API error format
  • feat: enable premessage-deflate for websockets (#10966)
    note:
    unfortunately, tokio-tungstenite / tungstenite upgrade triggers some
    problems with linker of rama-tls-boring with openssl:
    ```
    error: linking with `/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh` failed: exit status: 1
      |
      = note:  "/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh" "-m64" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/rcrt1.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crti.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtbeginS.o" "<1 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/{liblzma_sys-662a82316f96ec30,libbzip2_sys-bf78a2d58d5cbce6,liblibsqlite3_sys-6c004987fd67a36a,libtree_sitter_bash-220b99a97d331ab7,libtree_sitter-858f0a1dbfea58bd,libzstd_sys-6eb237deec748c5b,libring-2a87376483bf916f,libopenssl_sys-7c189e68b37fe2bb,liblibz_sys-4344eef4345520b1,librama_boring_sys-0414e98115015ee0}.rlib" "-lc++" "-lc++abi" "-lunwind" "-lc" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/libcompiler_builtins-*.rlib" "-L" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/raw-dylibs" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-nostartfiles" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libz-sys-ff5ea50d88c28ffb/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/ring-bdec3dddc19f5a5e/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/openssl-sys-96e0870de3ca22bc/out/openssl-build/install/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/zstd-sys-0cc37a5da1481740/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-72d2418073317c0f/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-bash-bfd293a9f333ce6a/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libsqlite3-sys-b78b2cfb81a330fc/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/bzip2-sys-69a145cc859ef275/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/lzma-sys-07e92d0b6baa6fd4/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/ssl/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib" "-o" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/deps/codex_network_proxy-d08268b863517761" "-Wl,--gc-sections" "-static-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-Wl,--strip-all" "-nodefaultlibs" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtendS.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtn.o"
      = note: some arguments are omitted. use `--verbose` to show all linker arguments
      = note: warning: ignoring deprecated linker optimization setting '1'
              warning: unable to open library directory '/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/': FileNotFound
              ld.lld: error: duplicate symbol: SSL_export_keying_material
              >>> defined at ssl_lib.c:3816 (ssl/ssl_lib.c:3816)
              >>>            libssl-lib-ssl_lib.o:(SSL_export_keying_material) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
              >>> defined at t1_enc.cc:205 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/ssl/t1_enc.cc:205)
              >>>            t1_enc.cc.o:(.text.SSL_export_keying_material+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib
    
              ld.lld: error: duplicate symbol: d2i_ASN1_TIME
              >>> defined at a_time.c:27 (crypto/asn1/a_time.c:27)
              >>>            libcrypto-lib-a_time.o:(d2i_ASN1_TIME) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
              >>> defined at a_time.cc:34 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/crypto/asn1/a_time.cc:34)
              >>>            a_time.cc.o:(.text.d2i_ASN1_TIME+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib
    ``` 
    
    that force me to migrate away from rama-tls-boring to rama-tls-rustls
    and pin `ring` for rustls.
  • Support alternative websocket API (#10861)
    **Test plan**
    
    ```
    cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \
      ./target/debug/codex \
        --enable responses_websockets_v2 \
        --profile byok \
        --full-auto
    ```
  • Add a codex.rate_limits event for websockets (#10324)
    When communicating over websockets, we can't rely on headers to deliver
    rate limit information. This PR adds a `codex.rate_limits` event that
    the server can pass to the client to inform them about rate limit usage.
    The client parses this data the same way we parse rate limit headers in
    HTTP mode.
    
    This PR also wires up the etag and reasoning headers for websockets
  • chore: add phase to message responseitem (#10455)
    ### What
    
    add wiring for `phase` field on `ResponseItem::Message` to lay
    groundwork for differentiating model preambles and final messages.
    currently optional.
    
    follows pattern in #9698.
    
    updated schemas with `just write-app-server-schema` so we can see type
    changes.
    
    ### Tests
    Updated existing tests for SSE parsing and hydrating from history
  • Add websocket telemetry metrics and labels (#10316)
    Summary
    - expose websocket telemetry hooks through the responses client so
    request durations and event processing can be reported
    - record websocket request/event metrics and emit runtime telemetry
    events that the history UI now surfaces
    - improve tests to cover websocket telemetry reporting and guard runtime
    summary updates
    
    
    <img width="824" height="79" alt="Screenshot 2026-01-31 at 5 28 12 PM"
    src="https://github.com/user-attachments/assets/ea9a7965-d8b4-4e3c-a984-ef4fdc44c81d"
    />
  • display promo message in usage error (#10285)
    If a promo message is attached to a rate limit response, then display it
    in the error message.
  • fix: dont auto-enable web_search for azure (#10266)
    seeing issues with azure after default-enabling web search: #10071,
    #10257.
    
    need to work with azure to fix api-side, for now turning off
    default-enable of web_search for azure.
    
    diff is big because i moved logic to reuse
  • fix: handle all web_search actions and in progress invocations (#9960)
    ### Summary
    - Parse all `web_search` tool actions (`search`, `find_in_page`,
    `open_page`).
    - Previously we only parsed + displayed `search`, which made the TUI
    appear to pause when the other actions were being used.
    - Show in progress `web_search` calls as `Searching the web`
      - Previously we only showed completed tool calls
    
    <img width="308" height="149" alt="image"
    src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
    />
    
    ### Tests
    Added + updated tests, tested locally
    
    ### Follow ups
    Update VSCode extension to display these as well
  • feat: support proxy for ws connection (#9719)
    reapply websocket changes without changing tls lib.
  • Support end_turn flag (#9698)
    Experimental flag that signals the end of the turn.
  • Add websockets logging (#9633)
    To help with debugging.
  • feat: support proxy for ws connection (#9409)
    unfortunately tokio-tungstenite doesn't support proxy configuration
    outbox, while https://github.com/snapview/tokio-tungstenite/pull/370 is
    in review, we can depend on source code for now.
  • Act on reasoning-included per turn (#9402)
    - Reset reasoning-included flag each turn and update compaction test
  • fix(codex-api): treat invalid_prompt as non-retryable (#9400)
    **Goal**: Prevent response.failed events with `invalid_prompt` from
    being treated as retryable errors so the UI shows the actual error
    message instead of continually retrying.
    
    **Before**: Codex would continue to retry despite the prompt being
    marked as disallowed
    **After**: Codex will stop retrying once prompt is marked disallowed
  • Turn-state sticky routing per turn (#9332)
    - capture the header from SSE/WS handshakes, store it per
    ModelClientSession using `Oncelock`, echo it on turn-scoped requests,
    and add SSE+WS integration tests for within-turn persistence +
    cross-turn reset.
    
    - keep `x-codex-turn-state` sticky within a user turn to maintain
    routing continuity for retries/tool follow-ups.
  • fix: eliminate unnecessary clone() for each SSE event (#9238)
    Given how many SSE events we get, seems worth fixing.
  • fix: Emit response.completed immediately for Responses SSE (#9170)
    we see windows test failures like this:
    https://github.com/openai/codex/actions/runs/20930055601/job/60138344260.
    
    The issue is that SSE connections sometimes remain open after the
    completion event esp. for windows. We should emit the completion event
    and return immediately. this is consistent with the protocol:
    
    > The Model streams responses back in an SSE, which are collected until
    "completed" message and the SSE terminates
    
    from
    https://github.com/openai/codex/blob/dev/cc/fix-windows-test/codex-rs/docs/protocol_v1.md#L37.
    
    this helps us achieve parity with responses websocket logic here:
    https://github.com/openai/codex/blob/dev/cc/fix-windows-test/codex-rs/codex-api/src/endpoint/responses_websocket.rs#L220-L227.
  • Support response.done and add integration tests (#9129)
    The agent loop using a persistent incremental web socket connection.
  • Websocket append support (#9128)
    Support an incremental append request in websocket transport.
  • Reuse websocket connection (#9127)
    Reuses the connection but still sends full requests.
  • Add model client sessions (#9102)
    Maintain a long-running session.
  • Extract single responses SSE event parsing (#9114)
    To be reused in WebSockets parsing.
  • Remove unused conversation_id header (#9107)
    It's an exact copy of session_id
  • Add feature for optional request compression (#8767)
    Adds a new feature
    `enable_request_compression` that will compress using zstd requests to
    the codex-backend. Currently only enabled for codex-backend so only enabled for openai providers when using chatgpt::auth even when the feature is enabled
    
    Added a new info log line too for evaluating the compression ratio and
    overhead off compressing before requesting. You can enable with
    `RUST_LOG=$RUST_LOG,codex_client::transport=info`
    
    ```
    2026-01-06T00:09:48.272113Z  INFO codex_client::transport: Compressed request body with zstd pre_compression_bytes=28914 post_compression_bytes=11485 compression_duration_ms=0
    ```
  • Merge Modelfamily into modelinfo (#8763)
    - Merge ModelFamily into ModelInfo
    - Remove logic for adding instructions to apply patch
    - Add compaction limit and visible context window to `ModelInfo`
  • fix(codex-api): handle Chat Completions DONE sentinel (#8708)
    Context
    - This code parses Server-Sent Events (SSE) from the legacy Chat
    Completions streaming API (wire_api = "chat").
    - The upstream protocol terminates a stream with a final sentinel event:
    data: [DONE].
    - Some of our test stubs/helpers historically end the stream with data:
    DONE (no brackets).
    
    How this was found
    - GitHub Actions on Windows failed in codex-app-server integration tests
    with wiremock verification errors (expected multiple POSTs, got 1).
    
    Diagnosis
    - The job logs included: codex_api::sse::chat: Failed to parse
    ChatCompletions SSE event ... data: DONE.
    - eventsource_stream surfaces the sentinel as a normal SSE event; it
    does not automatically close the stream.
    - The parser previously attempted to JSON-decode every data: payload.
    The sentinel is not JSON, so we logged and skipped it, then continued
    polling.
    - On servers that keep the HTTP connection open after emitting the
    sentinel (notably wiremock on Windows), skipping the sentinel meant we
    never emitted ResponseEvent::Completed.
    - Higher layers wait for completion before progressing (emitting
    approval requests and issuing follow-up model calls), so the test never
    reached the subsequent requests and wiremock panicked when its
    expected-call count was not met.
    
    Fix
    - Treat both data: [DONE] and data: DONE as explicit end-of-stream
    sentinels.
    - When a sentinel is seen, flush any pending assistant/reasoning items
    and emit ResponseEvent::Completed once.
    
    Tests
    - Add a regression unit test asserting we complete on the sentinel even
    if the underlying connection is not closed.
  • fix: chat multiple tool calls (#8556)
    Fix this: https://github.com/openai/codex/issues/8479
    
    The issue is that chat completion API expect all the tool calls in a
    single assistant message and then all the tool call output in a single
    response message
  • Refresh on models etag mismatch (#8491)
    - Send models etag
    - Refresh models on 412
    - This wires `ModelsManager` to `ModelFamily` so we don't mutate it
    mid-turn
  • Remove reasoning format (#8484)
    This isn't very useful parameter. 
    
    logic:
    ```
    if model puts `**` in their reasoning, trim it and visualize the header.
    if couldn't trim: don't render
    if model doesn't support: don't render
    ```
    
    We can simplify to:
    ```
    if could trim, visualize header.
    if not, don't render
    ```
  • feat: experimental menu (#8071)
    This will automatically render any `Stage::Beta` features.
    
    The change only gets applied to the *next session*. This started as a
    bug but actually this is a good thing to prevent out of distribution
    push
    
    <img width="986" height="288" alt="Screenshot 2025-12-15 at 15 38 35"
    src="https://github.com/user-attachments/assets/78b7a71d-0e43-4828-a118-91c5237909c7"
    />
    
    
    <img width="509" height="109" alt="Screenshot 2025-12-15 at 17 35 44"
    src="https://github.com/user-attachments/assets/6933de52-9b66-4abf-b58b-a5f26d5747e2"
    />