273 Commits

  • fix: read version from package.json instead of modifying session.ts (#753)
    I am working to simplify the build process. As a first step, update
    `session.ts` so it reads the `version` from `package.json` at runtime so
    we no longer have to modify it during the build process. I want to get
    to a place where the build looks like:
    
    ```
    cd codex-cli
    pnpm i
    pnpm build
    RELEASE_DIR=$(mktemp -d)
    cp -r bin "$RELEASE_DIR/bin"
    cp -r dist "$RELEASE_DIR/dist"
    cp -r src "$RELEASE_DIR/src" # important if we want sourcemaps to continue to work
    cp ../README.md "$RELEASE_DIR"
    VERSION=$(printf '0.1.%d' $(date +%y%m%d%H%M))
    jq --arg version "$VERSION" '.version = $version' package.json > "$RELEASE_DIR/package.json"
    ```
    
    Then the contents of `$RELEASE_DIR` should be good to `npm publish`, no?
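
    A minimal sketch of the two pieces above, under stated assumptions: `makeReleaseVersion` mirrors the `printf '0.1.%d' $(date +%y%m%d%H%M)` line, and `readVersion` shows one way to pull `version` out of the text of `package.json` at runtime. Both function names are hypothetical and are not taken from `session.ts`.

    ```typescript
    // Hypothetical helpers; the commit only states that session.ts reads
    // `version` from package.json at runtime.

    // Mirrors `printf '0.1.%d' $(date +%y%m%d%H%M)`: two-digit year, month,
    // day, hour, and minute, all in local time.
    function makeReleaseVersion(now: Date): string {
      const two = (n: number): string => String(n).padStart(2, "0");
      const stamp =
        two(now.getFullYear() % 100) +
        two(now.getMonth() + 1) +
        two(now.getDate()) +
        two(now.getHours()) +
        two(now.getMinutes());
      return "0.1." + stamp;
    }

    // Extracts the `version` field from the text of a package.json file.
    function readVersion(packageJsonText: string): string {
      const parsed: { version?: unknown } = JSON.parse(packageJsonText);
      if (typeof parsed.version !== "string") {
        throw new Error("package.json has no string `version` field");
      }
      return parsed.version;
    }
    ```

    Keeping the version in `package.json` only means the release script can stamp it with `jq` without touching any TypeScript source.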
  • feat: add common package registries domains to allowed-domains list (#414)
  • Fixes issue #726 by adding config to configToSave object (#728)
    The saveConfig() function only includes a hardcoded subset of properties
    when writing the config file. Any property not explicitly listed (like
    disableResponseStorage) will be dropped.
    I have added `disableResponseStorage` to the `configToSave` object as
    the immediate fix.
    
    [Linking Issue this fixes.](https://github.com/openai/codex/issues/726)
  • feat: add --reasoning CLI flag (#314)
    This PR adds a new CLI flag: `--reasoning`, which allows users to
    customize the reasoning effort level (`low`, `medium`, or `high`) used
    by OpenAI's `o` models.
    By introducing the `--reasoning` flag, users gain more flexibility when
    working with the models. It enables optimization for either speed or
    depth of reasoning, depending on specific use cases.
    This PR resolves #107
    
    - **Flag**: `--reasoning`
    - **Accepted Values**: `low`, `medium`, `high`
    - **Default Behavior**: If not specified, the model uses the default
    reasoning level.
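
    A minimal sketch of how the flag value could be validated. The helper name `parseReasoningFlag` is hypothetical, and rejecting unknown values with an error is an assumption of this sketch, not something the PR description states.

    ```typescript
    type ReasoningEffort = "low" | "medium" | "high";

    const REASONING_EFFORTS: readonly ReasoningEffort[] = ["low", "medium", "high"];

    // Returns undefined when the flag is absent so the model's default applies.
    // Throwing on an unknown value is an assumption of this sketch.
    function parseReasoningFlag(value: string | undefined): ReasoningEffort | undefined {
      if (value === undefined) {
        return undefined;
      }
      const match = REASONING_EFFORTS.find((effort) => effort === value);
      if (match === undefined) {
        throw new Error(
          "Invalid --reasoning value '" + value + "'; expected low, medium, or high",
        );
      }
      return match;
    }
    ```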
    
    ## Example Usage
    
    ```bash
    codex --reasoning=low "Write a simple function to calculate factorial"
    ```
    
    ---------
    
    Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
    Co-authored-by: yashrwealthy <yash.rastogi@wealthy.in>
    Co-authored-by: Thibault Sottiaux <tibo@openai.com>
  • feat: lower default retry wait time and increase number of tries (#720)
    In total we now guarantee that we will wait for at least 60s before
    giving up.
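
    To make the 60s figure concrete, here is a sketch of the arithmetic, assuming the waits grow exponentially between attempts. The helper name and the numbers passed to it (500 ms base, factor 2, 8 tries) are illustrative assumptions, not values quoted from this commit.

    ```typescript
    // Sum of the waits between attempts: with `tries` attempts there are
    // `tries - 1` waits, each `factor` times longer than the previous one.
    function totalBackoffMs(baseMs: number, factor: number, tries: number): number {
      let total = 0;
      let wait = baseMs;
      for (let i = 0; i < tries - 1; i++) {
        total += wait;
        wait *= factor;
      }
      return total;
    }

    // Illustrative values only: 500 + 1000 + 2000 + ... + 32000 = 63500 ms.
    const totalMs = totalBackoffMs(500, 2, 8);
    console.log(totalMs >= 60000); // true for these illustrative values
    ```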
    
    ---------
    
    Signed-off-by: Thibault Sottiaux <tibo@openai.com>
  • fix: tighten up check for /usr/bin/sandbox-exec (#710)
    * In both TypeScript and Rust, we now invoke `/usr/bin/sandbox-exec`
    explicitly rather than whatever `sandbox-exec` happens to be on the
    `PATH`.
    * Changed `isSandboxExecAvailable` to use `access()` rather than
    `command -v` so that:
      *  We only do the check once over the lifetime of the Codex process.
      * The check is specific to `/usr/bin/sandbox-exec`.
    * We now do a syscall rather than incur the overhead of spawning a
    process, dealing with timeouts, etc.
    
    I think there is still room for improvement here where we should move
    the `isSandboxExecAvailable` check earlier in the CLI, ideally right
    after we do arg parsing to verify that we can provide the Seatbelt
    sandbox if that is what the user has requested.
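
    A minimal sketch of the check described above, assuming Node's `fs/promises` `access()` with the `X_OK` mode and a module-level cache. The name `isSandboxExecAvailable` comes from the commit; the body here is an illustration, not the actual implementation.

    ```typescript
    import { constants } from "node:fs";
    import { access } from "node:fs/promises";

    const SANDBOX_EXEC_PATH = "/usr/bin/sandbox-exec";

    // Cache the promise so the filesystem is probed at most once per process.
    let cachedCheck: Promise<boolean> | undefined;

    function isSandboxExecAvailable(): Promise<boolean> {
      if (cachedCheck === undefined) {
        // access() rejects when the path is missing or not executable; map
        // that to `false` instead of letting the rejection propagate.
        cachedCheck = access(SANDBOX_EXEC_PATH, constants.X_OK).then(
          () => true,
          () => false,
        );
      }
      return cachedCheck;
    }
    ```

    Because the absolute path is hard-coded, a different `sandbox-exec` earlier on the `PATH` is never consulted.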
  • fix: check if sandbox-exec is available (#696)
    - Introduce `isSandboxExecAvailable()` helper and tidy import ordering
    in `handle-exec-command.ts`.
    - Add runtime check for the `sandbox-exec` binary on macOS; fall back to
    `SandboxType.NONE` with a warning if it’s missing, preventing crashes.
    
    ---------
    
    Signed-off-by: Thibault Sottiaux <tibo@openai.com>
    Co-authored-by: Fouad Matin <fouad@openai.com>
  • feat: user config api key (#569)
    Adds support for reading OPENAI_API_KEY (and other variables) from a
    user‑wide dotenv file (~/.codex.config). Precedence order is now:
      1. explicit environment variable
      2. project‑local .env (loaded earlier)
      3. ~/.codex.config
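
    The precedence order can be sketched as a merge in which later sources only fill in variables that earlier ones left unset. `resolveEnv` and `EnvMap` are hypothetical names used for illustration; the real loader may be structured differently.

    ```typescript
    type EnvMap = Record<string, string | undefined>;

    // Sources are listed from highest to lowest precedence; a lower-precedence
    // source never overrides a value that is already set.
    function resolveEnv(explicitEnv: EnvMap, projectDotenv: EnvMap, userConfig: EnvMap): EnvMap {
      const resolved: EnvMap = {};
      for (const source of [explicitEnv, projectDotenv, userConfig]) {
        for (const [key, value] of Object.entries(source)) {
          if (resolved[key] === undefined && value !== undefined) {
            resolved[key] = value;
          }
        }
      }
      return resolved;
    }
    ```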
    
    Also adds a regression test that ensures the multiline editor correctly
    handles cases where printable text and the CSI‑u Shift+Enter sequence
    arrive in the same input chunk.
    
    House‑kept with Prettier; removed stray temp.json artifact.
  • fix: duplicate messages in quiet mode (#680)
    Addressing #600 and #664 (partially)
    
    ## Bug
    Codex was staging duplicate items in its output when the same response
    item appeared both as a streaming event and in the final response.
    Specifically:
    
    1. Items would be staged once when received as a
    `response.output_item.done` event
    2. The same items would be staged again when included in the final
    `response.completed` payload
    
    This duplication would result in each message being sent several times
    in the quiet mode output.
    
    ## Changes
    - Added a Set (`alreadyStagedItemIds`) to track items that have already
    been staged
    - Modified the `stageItem` function to check if an item's ID is already
    in this set before staging it
    - Added a regression test (`agent-dedupe-items.test.ts`) that verifies
    items with the same ID are only staged once
    
    ## Testing
    Like other tests, the included test creates a mock OpenAI stream that
    emits the same message twice (once as an incremental event and once in
    the final response) and verifies the item is only passed to `onItem`
    once.
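
    A minimal sketch of the de-duplication described under "Changes". The `alreadyStagedItemIds` set and the `stageItem` and `onItem` names come from the description above; the `ResponseItem` shape and the `createStager` wrapper are assumptions made to keep the example self-contained.

    ```typescript
    // Hypothetical item shape; only `id` matters for de-duplication.
    interface ResponseItem {
      id: string;
      text: string;
    }

    function createStager(onItem: (item: ResponseItem) => void): (item: ResponseItem) => void {
      const alreadyStagedItemIds = new Set<string>();
      return function stageItem(item: ResponseItem): void {
        // Skip items that were already staged from an earlier event.
        if (alreadyStagedItemIds.has(item.id)) {
          return;
        }
        alreadyStagedItemIds.add(item.id);
        onItem(item);
      };
    }
    ```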
  • bump(version): 0.1.2504251709 (#660)
    ## `0.1.2504251709`
    
    ### 🚀 Features
    
    - Add openai model info configuration (#551)
    - Added provider to run quiet mode function (#571)
    - Create parent directories when creating new files (#552)
    - Print bug report URL in terminal instead of opening browser (#510)
    (#528)
    - Add support for custom provider configuration in the user config
    (#537)
    - Add support for OpenAI-Organization and OpenAI-Project headers (#626)
    - Add specific instructions for creating API keys in error msg (#581)
    - Enhance toCodePoints to prevent potential unicode 14 errors (#615)
    - More native keyboard navigation in multiline editor (#655)
    - Display error on selection of invalid model (#594)
    
    ### 🪲 Bug Fixes
    
    - Model selection (#643)
    - Nits in apply patch (#640)
    - Input keyboard shortcuts (#676)
    - `apply_patch` unicode characters (#625)
    - Don't clear turn input before retries (#611)
    - More loosely match context for apply_patch (#610)
    - Update bug report template - there is no --revision flag (#614)
    - Remove outdated copy of text input and external editor feature (#670)
    - Remove unreachable "disableResponseStorage" logic flow introduced in
    #543 (#573)
    - Non-openai mode - fix for gemini content: null, fix 429 to throw
    before stream (#563)
    - Only allow going up in history when not already in history if input is
    empty (#654)
    - Do not grant "node" user sudo access when using run_in_container.sh
    (#627)
    - Update scripts/build_container.sh to use pnpm instead of npm (#631)
    - Update lint-staged config to use pnpm --filter (#582)
    - Non-openai mode - don't default temp and top_p (#572)
    - Fix error catching when checking for updates (#597)
    - Close stdin when running an exec tool call (#636)
  • fix: input keyboard shortcuts (#676)
    Fixes keyboard shortcuts:
    - ctrl+a/e
    - opt+arrow keys
  • perf: optimize token streaming with balanced approach (#635)
    - Replace setTimeout(10ms) with queueMicrotask for immediate processing
    - Add minimal 3ms setTimeout for rendering to maintain readable UX
    - Reduces per-token delay while preserving streaming experience
    - Add performance test to verify optimization works correctly
    
    ---------
    
    Co-authored-by: Claude <noreply@anthropic.com>
    Co-authored-by: Thibault Sottiaux <tibo@openai.com>
  • feat: Add support for OpenAI-Organization and OpenAI-Project headers (#626)
    Added support for OpenAI-Organization and OpenAI-Project headers for
    OpenAI API calls.
    
    This is for #74
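
    A sketch of how the two headers might be assembled before being handed to the API client. The header names are the ones named above; `buildOrgHeaders` is a hypothetical helper, and where the values come from (config, environment) is left to the caller.

    ```typescript
    // Only include a header when a non-empty value was supplied.
    function buildOrgHeaders(organization?: string, project?: string): Record<string, string> {
      const headers: Record<string, string> = {};
      if (organization) {
        headers["OpenAI-Organization"] = organization;
      }
      if (project) {
        headers["OpenAI-Project"] = project;
      }
      return headers;
    }
    ```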
  • fix: only allow going up in history when not already in history if input is empty (#654)
    - Clean up the help text below the input to read `ctrl+c to exit | "/" to
    see commands | enter to send` now that we have command autocompletion.
    - Other minor drive-by code cleanups.
    
    ---------
    
    Signed-off-by: Thibault Sottiaux <tibo@openai.com>
  • fix: model selection (#643)
    fix: pass correct selected model in ModelOverlay
    
    The ModelOverlay component was incorrectly passing the current model
    instead of the newly selected model to its onSelect callback. This
    prevented model changes from being applied properly.
    
    The fix ensures that when a user selects a new model, the parent
    component receives the correct newly selected model value, allowing
    model changes to work as intended.
  • fix: nits in apply patch (#640)
    ## Description
    
    Fix a nit in `apply patch`, potentially improving performance slightly.
  • chore: upgrade prettier to v3 (#644)
    ## Description
    
    This PR addresses the following improvements:
    
    **Unify Prettier Version**: Currently, the Prettier version used in
    `/package.json` and `/codex-cli/package.json` are different. In this PR,
    we're updating both to use Prettier v3.
    
    - Prettier v3 introduces improved support for JavaScript and TypeScript.
    (e.g. the formatting scenario shown in the image below. This is more
    aligned with the TypeScript indentation standard).
    
    <img width="1126" alt="image"
    src="https://github.com/user-attachments/assets/6e237eb8-4553-4574-b336-ed9561c55370"
    />
    
    **Add Prettier Auto-Formatting in lint-staged**: We've added a step to
    automatically run prettier --write on JavaScript and TypeScript files as
    part of the lint-staged process, before the ESLint checks.
    
    - This will help ensure that all committed code is properly formatted
    according to the project's Prettier configuration.
  • fix(utils): save config (#578)
    ## Description
    
    When `saveConfig` is called, the project doc is incorrectly saved into
    user instructions. This change ensures that only user instructions are
    saved to `instructions.md` during saveConfig, preventing data
    corruption.
    
    close: #576
    
    ---------
    
    Co-authored-by: Thibault Sottiaux <tibo@openai.com>
  • feat(bug-report): print bug report URL in terminal instead of opening browser (#510) (#528)
    Solves #510 
    This PR changes the `/bug` command to print the URL into the terminal
    (so it works in headless sessions) instead of trying to open a browser.
    
    ---------
    
    Co-authored-by: Thibault Sottiaux <tibo@openai.com>
  • feat: display error on selection of invalid model (#594)
    Up-to-date version of #78
    
    Fixes #32
    
    Addressed the changes requested by @tibo-openai :) They made sense to me.

    That said, the previous rationale for passing the state up was that a
    future feature might need the list of available models to be shared
    with the parent.
  • fix: update scripts/build_container.sh to use pnpm instead of npm (#631)
    I suspect this is why some contributors kept accidentally including a
    new `codex-cli/package-lock.json` in their PRs.
    
    Note the `Dockerfile` still uses `npm` instead of `pnpm`, but that
    appears to be fine. (Probably nicer to globally install as few things as
    possible in the image.)
  • fix: do not grant "node" user sudo access when using run_in_container.sh (#627)
    This exploration came out of my review of
    https://github.com/openai/codex/pull/414.
    
    `run_in_container.sh` runs Codex in a Docker container like so:
    
    
    https://github.com/openai/codex/blob/bd1c3deed9f4f103e755baa3f3a45e7a1c1a134b/codex-cli/scripts/run_in_container.sh#L51-L58
    
    But then runs `init_firewall.sh` to set up the firewall to restrict
    network access.
    
    Previously, we did this by adding `/usr/local/bin/init_firewall.sh` to
    the container and adding a special rule in `/etc/sudoers.d` so the
    unprivileged user (`node`) could run the privileged `init_firewall.sh`
    script to open up the firewall for `api.openai.com`:
    
    
    https://github.com/openai/codex/blob/31d0d7a305305ad557035a2edcab60b6be5018d8/codex-cli/Dockerfile#L51-L56
    
    Though I believe this is unnecessary, as we can use `docker exec --user
    root` from _outside_ the container to run
    `/usr/local/bin/init_firewall.sh` as `root` without adding a special
    case in `/etc/sudoers.d`.
    
    This appears to work as expected, as I tested it by doing the following:
    
    ```
    ./codex-cli/scripts/build_container.sh
    ./codex-cli/scripts/run_in_container.sh 'what is the output of `curl https://www.openai.com`'
    ```
    
    This was a bit funny because in some of my runs, Codex wasn't convinced
    it had network access, so I had to convince it to try the `curl`
    request:
    
    
    ![image](https://github.com/user-attachments/assets/80bd487c-74e2-4cd3-aa0f-26a6edd8d3f7)
    
    As you can see, when it ran `curl -s https://www.openai.com`, it hit a
    connection failure, so the network policy appears to be working as
    intended.
    
    Note this PR also removes `sudo` from the `apt-get install` list in the
    `Dockerfile`.
  • fix: apply_patch unicode characters (#625)
    Fuzzier matching for `apply_patch` to handle U+00A0 (NO-BREAK SPACE) and
    U+202F (NARROW NO-BREAK SPACE).
  • fix(agent-loop): notify type (#608)
    ## Description
    
    The `as AppConfig` type assertion in the constructor may introduce
    potential type safety risks. Removing the assertion and making `notify`
    an optional parameter could enhance type robustness and prevent
    unexpected runtime errors.
    
    close: #605
  • feat: update README and config to support custom providers with API k… (#577)
    When using a non-built-in provider with the `--provider` option, users
    are prompted:
    
    ```
    Set the environment variable <provider>_API_KEY and re-run this command.
    You can create a <provider>_API_KEY in the <provider> dashboard.
    ```
    
    However, many users are confused because, even after correctly setting
    `<provider>_API_KEY`, authentication may still fail unless
    `OPENAI_API_KEY` is _also_ present in the environment. This is not
    intuitive and leads to ambiguity about which API key is actually
    required and used as a fallback, especially when using custom or
    third-party (non-listed) providers.
    
    Furthermore, the original README/documentation did not mention the
    requirement to set `<provider>_BASE_URL` for non-built-in providers,
    which is necessary for proper client behavior. This omission made the
    configuration process more difficult for users trying to integrate with
    custom endpoints.
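
    A sketch of the naming convention described above, assuming the variable names are formed by upper-casing the provider name. `providerEnvNames` and `missingProviderEnv` are hypothetical helpers for illustration.

    ```typescript
    // Derive the two environment variable names for a provider.
    // Upper-casing the provider name is an assumption of this sketch.
    function providerEnvNames(provider: string): { apiKey: string; baseUrl: string } {
      const prefix = provider.toUpperCase();
      return { apiKey: prefix + "_API_KEY", baseUrl: prefix + "_BASE_URL" };
    }

    // List which of the two variables are absent from the given environment.
    function missingProviderEnv(
      provider: string,
      env: Record<string, string | undefined>,
    ): string[] {
      const names = providerEnvNames(provider);
      return [names.apiKey, names.baseUrl].filter((name) => !env[name]);
    }
    ```

    Reporting both names at once would tell a user up front that a custom endpoint needs a base URL as well as a key.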
  • feat: enhance toCodePoints to prevent potential unicode 14 errors (#615)
    ## Description
    
    `Array.from` may fail when handling certain characters newly added in
    Unicode 14. Where possible, it seems better to use `Intl.Segmenter` for
    more reliable processing.
    
    
    ![image](https://github.com/user-attachments/assets/2cbd779d-69d3-448e-b76a-d793cb639d96)
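
    For comparison, a small sketch of the difference between the two approaches: `Array.from` splits a string into code points, while `Intl.Segmenter` with `granularity: "grapheme"` keeps user-perceived characters together. `toGraphemes` is a hypothetical helper, not the project's `toCodePoints`.

    ```typescript
    // Split text into grapheme clusters (user-perceived characters).
    function toGraphemes(text: string): string[] {
      const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
      return Array.from(segmenter.segment(text), (part) => part.segment);
    }

    // "e" followed by U+0301 COMBINING ACUTE ACCENT is two code points
    // but a single grapheme cluster.
    const accented = "e\u0301";
    console.log(Array.from(accented).length); // 2
    console.log(toGraphemes(accented).length); // 1
    ```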
  • feat: more loosely match context for apply_patch (#610)
    More of a proposal than anything, but models seem to struggle to compose
    valid `apply_patch` patches when the context lines contain Unicode
    look-alikes. This change would normalize them before matching.
    
    ```
    top-level          # ASCII
    top-level          # U+2011 NON-BREAKING HYPHEN
    top–level          # U+2013 EN DASH
    top—level          # U+2014 EM DASH
    top‒level          # U+2012 FIGURE DASH
    ```
    
    thanks unicode.
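
    A minimal sketch of the normalization idea, covering only the four dash look-alikes listed above. `normalizeDashes` and `contextMatches` are hypothetical names; the actual patch may normalize more characters.

    ```typescript
    // U+2011 NON-BREAKING HYPHEN, U+2012 FIGURE DASH, U+2013 EN DASH,
    // U+2014 EM DASH.
    const DASH_LOOKALIKES = /[\u2011\u2012\u2013\u2014]/g;

    // Map every dash look-alike to the ASCII hyphen-minus.
    function normalizeDashes(line: string): string {
      return line.replace(DASH_LOOKALIKES, "-");
    }

    // Compare a patch context line with a file line after normalization.
    function contextMatches(patchLine: string, fileLine: string): boolean {
      return normalizeDashes(patchLine) === normalizeDashes(fileLine);
    }
    ```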
  • feat: add specific instructions for creating API keys in error msg (#581)
    Updates the error message for missing Gemini API keys to reference
    "Google AI Studio" instead of the generic "GEMINI dashboard". This
    provides users with more accurate information about where to obtain
    their Gemini API keys.
    
    This could be extended to other providers as well.
  • fix: don't clear turn input before retries (#611)
    The current turn input in the agent loop is being discarded before
    consuming the stream events which causes the stream reconnect (after
    rate limit failure) to not include the inputs. Since the new stream
    includes the previous response ID, it triggers a bad request exception
    considering the input doesn't match what OpenAI has stored on the server
    side and subsequently a very confusing error message of: `No tool output
    found for function call call_xyz`.
    
    This should fix https://github.com/openai/codex/issues/586.
    
    ## Testing
    
    I have a personal project that I'm working on that runs multiple Codex
    CLIs in parallel and often runs into rate limit errors (as seen in the
    OpenAI logs). After making this change, I am no longer experiencing
    Codex crashing and it was able to retry and handle everything gracefully
    until completion (even though I still see rate limiting in the OpenAI
    logs).
  • bug: fix error catching when checking for updates (#597)
    This fixes https://github.com/openai/codex/issues/480 where the latest
    code was crashing when attempting to be run inside docker since the
    update checker attempts to reach out to `npm.antfu.dev` but that DNS is
    not allowed in the firewall rules.
    
    I believe the original code was attempting to catch and ignore any
    errors when checking for updates but was doing so incorrectly. Because
    the promise is awaited, the call needs to be wrapped in a standard
    try/catch rather than relying on `Promise.catch`, so this fixes that.
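
    A minimal sketch of the pattern: awaiting inside `try`/`catch` so that a rejected update check is swallowed instead of crashing the CLI. `checkForUpdatesSafely` is a hypothetical wrapper and the update check itself is passed in as a function, so this is not the project's actual code.

    ```typescript
    // Runs a best-effort check and reports whether it succeeded.
    async function checkForUpdatesSafely(check: () => Promise<void>): Promise<boolean> {
      try {
        await check();
        return true;
      } catch {
        // A failed update check (for example, a blocked network request)
        // should not stop the CLI from starting.
        return false;
      }
    }
    ```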
    
    ## Testing
    
    ### Before
    
    ```
    $ scripts/run_in_container.sh "explain this project to me"
    7d1aa845edf9a36fe4d5b331474b5cb8ba79537b682922b554ea677f14996c6b
    Resolving api.openai.com...
    Adding 162.159.140.245 for api.openai.com
    Adding 172.66.0.243 for api.openai.com
    Host network detected as: 172.17.0.0/24
    Firewall configuration complete
    Verifying firewall rules...
    Firewall verification passed - unable to reach https://example.com as expected
    Firewall verification passed - able to reach https://api.openai.com as expected
    TypeError: fetch failed
        at node:internal/deps/undici/undici:13510:13
        at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async getLatestVersionBatch (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132669:17)
        at async getLatestVersion (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132674:19)
        at async getUpdateCheckInfo (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132748:20)
        at async checkForUpdates (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132772:23)
        at async file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:142027:1 {
      [cause]: AggregateError [ECONNREFUSED]: 
          at internalConnectMultiple (node:net:1122:18)
          at afterConnectMultiple (node:net:1689:7) {
        code: 'ECONNREFUSED',
        [errors]: [ [Error], [Error] ]
      }
    }
    ```
    
    ### After
    
    ```
    $ scripts/run_in_container.sh "explain this project to me"
    91aa716e3d3f86c9cf6013dd567be31b2c44eb5d7ab184d55ef498731020bb8d
    Resolving api.openai.com...
    Adding 162.159.140.245 for api.openai.com
    Adding 172.66.0.243 for api.openai.com
    Host network detected as: 172.17.0.0/24
    Firewall configuration complete
    Verifying firewall rules...
    Firewall verification passed - unable to reach https://example.com as expected
    Firewall verification passed - able to reach https://api.openai.com as expected
    ╭──────────────────────────────────────────────────────────────╮
    │ ● OpenAI Codex (research preview) v0.1.2504221401            │
    ╰──────────────────────────────────────────────────────────────╯
    ╭──────────────────────────────────────────────────────────────╮
    │ localhost session: 7c782f196ae04503866e39f071e26a69          │
    │ ↳ model: o4-mini                                             │
    │ ↳ provider: openai                                           │
    │ ↳ approval: full-auto                                        │
    ╰──────────────────────────────────────────────────────────────╯
    user
    explain this project to me
    ╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
    │( ●    ) 2s  Thinking                                                                                                                                                                                                                                                  │
    ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
      send q or ctrl+c to exit | send "/clear" to reset | send "/help" for commands | press enter to send | shift+enter for new line — 100% context left
    ```
  • feat: add support for custom provider configuration in the user config (#537)
    ### What
    
    - Add support for loading and merging custom provider configurations
    from a local `providers.json` file.
    - Allow users to override or extend default providers with their own
    settings.
    
    ### Why
    
    This change enables users to flexibly customize and extend provider
    endpoints and API keys without modifying the codebase, making the CLI
    more adaptable for various LLM backends and enterprise use cases.
    
    ### How
    
    - Introduced `loadProvidersFromFile` and `getMergedProviders` in config
    logic.
    - Added/updated related tests in `tests/config.test.tsx`.
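
    One way the merge step could look, as a sketch: user-supplied entries override built-in ones with the same key, and new keys extend the set. The `ProviderConfig` shape and the `mergeProviders` name are assumptions for illustration and may differ from `getMergedProviders`.

    ```typescript
    // Hypothetical provider shape.
    interface ProviderConfig {
      name: string;
      baseURL: string;
      envKey: string;
    }

    type ProviderMap = Record<string, ProviderConfig>;

    // Entries in `custom` replace defaults with the same key; other keys are added.
    function mergeProviders(defaults: ProviderMap, custom: ProviderMap): ProviderMap {
      return { ...defaults, ...custom };
    }
    ```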
    
    
    ### Checklist
    
    - [x] Lint passes for changed files
    - [x] Tests pass for all files
    - [x] Documentation/comments updated as needed
    
    ---------
    
    Co-authored-by: Thibault Sottiaux <tibo@openai.com>
  • feat: added provider to run quiet mode function (#571)
    Adding support to be able to run other models in quiet mode
    
    e.g. `codex --approval-mode full-auto -q "explain the current directory"
    --provider xai --model grok-3-beta`
  • bug: non-openai mode - don't default temp and top_p (#572)
    I haven't seen any actual errors due to this, but it's been bothering me
    that I had it defaulted to 1. I think best to leave it undefined and
    have each provider do their thing
  • bug: non-openai mode - fix for gemini content: null, fix 429 to throw before stream (#563)
    Gemini's API is finicky: it returns a 400 without an error message when
    you pass `content: null`.
    Also fixed the rate limiting issues by throwing outside of the iterator.
    I think there's a separate issue with the second `isRateLimit` check in
    agent-loop: `turnInput` is cleared by that time, so it retries without
    the last message.
  • feat: create parent directories when creating new files. (#552)
    apply_patch doesn't create parent directories when creating a new file
    leading to confusion and flailing by the agent. This will create parent
    directories automatically when absent.
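
    A minimal sketch of the behaviour, assuming Node's `mkdirSync` with `recursive: true`. `writeFileCreatingParents` is a hypothetical helper, not the function used by `apply_patch`.

    ```typescript
    import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
    import { tmpdir } from "node:os";
    import { dirname, join } from "node:path";

    // Create any missing parent directories, then write the file.
    function writeFileCreatingParents(filePath: string, contents: string): void {
      // `recursive: true` creates intermediate directories and does not
      // fail when the directory already exists.
      mkdirSync(dirname(filePath), { recursive: true });
      writeFileSync(filePath, contents, "utf8");
    }

    // Usage: write into a nested path under a fresh temporary directory.
    const tempRoot = mkdtempSync(join(tmpdir(), "apply-patch-sketch-"));
    const targetPath = join(tempRoot, "a", "b", "new-file.txt");
    writeFileCreatingParents(targetPath, "hello\n");
    ```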
    
    ---------
    
    Co-authored-by: Thibault Sottiaux <tibo@openai.com>
  • bump(version): 0.1.2504221401 (#559)
    ## `0.1.2504221401`
    
    ### 🚀 Features
    
    - Show actionable errors when api keys are missing (#523)
    - Add CLI `--version` flag (#492)
    
    ### 🐛 Bug Fixes
    
    - Agent loop for ZDR (`disableResponseStorage`) (#543)
    - Fix relative `workdir` check for `apply_patch` (#556)
    - Minimal mid-stream #429 retry loop using existing back-off (#506)
    - Inconsistent usage of base URL and API key (#507)
    - Remove requirement for api key for ollama (#546)
    - Support `[provider]_BASE_URL` (#542)
  • when a shell tool call invokes apply_patch, resolve relative paths against workdir, if specified (#556)
    Previously, we were ignoring the `workdir` field in an `ExecInput` when
    running it through `canAutoApprove()`. For ordinary `exec()` calls, that
    was sufficient, but for `apply_patch`, we need the `workdir` to resolve
    relative paths in the `apply_patch` argument so that we can check them
    in `isPathConstrainedTowritablePaths()`.
    
    Likewise, we also need the workdir when running `execApplyPatch()`
    because the paths need to be resolved again.
    
    Ideally, the `ApplyPatchCommand` returned by `canAutoApprove()` would
    not be a simple `patch: string`, but the parsed patch with all of the
    paths resolved, in which case `execApplyPatch()` could expect absolute
    paths and would not need `workdir`.
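
    A sketch of the path resolution step, assuming relative paths fall back to the process working directory when no `workdir` is given. The helper name `resolveAgainstWorkdir` is hypothetical.

    ```typescript
    import { isAbsolute, resolve } from "node:path";

    // Resolve a path from an apply_patch argument against the optional workdir.
    function resolveAgainstWorkdir(filePath: string, workdir?: string): string {
      if (isAbsolute(filePath)) {
        return filePath;
      }
      // Falling back to process.cwd() is an assumption of this sketch.
      return resolve(workdir ?? process.cwd(), filePath);
    }
    ```

    Resolving before the writable-path check means the check and the later write see the same absolute path.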
  • fix: agent loop for disable response storage (#543)
    - Fixes post-merge of #506
    
    ---------
    
    Co-authored-by: Ilan Bigio <ilan@openai.com>
  • fix: support [provider]_BASE_URL (#542)
    Resolved issue where an OLLAMA_BASE_URL was not properly handled
    (openai/codex#516).
  • fix: remove requirement for api key for ollama (#546)
    Fixes #540 
    # Skip API key validation for Ollama provider
    
    ## Description
    This PR modifies the CLI to not require an API key when using Ollama as
    the provider
    
    ## Changes
    - Modified the validation logic to skip API key checks for these
    providers
    - Updated the README to clarify that Ollama doesn't require an API key
  • feat: show actionable errors when api keys are missing (#523)
    Change errors on missing api key of other providers from
    <img width="854" alt="image"
    src="https://github.com/user-attachments/assets/f488a247-5040-4b02-92d6-90a2204419ff"
    />
    (missing deepseek key but still throws error for openai)
    to
    <img width="854" alt="image"
    src="https://github.com/user-attachments/assets/8333d24a-91f8-4ba8-9a51-ed22a7e8a074"
    />
    This should help new users figure out the issue more easily and go to the
    right place to get API keys.
    
    A missing OpenAI key now shows an error with the right link:
    <img width="854" alt="image"
    src="https://github.com/user-attachments/assets/0ecc9320-380f-425c-972e-4312bf610955"
    />
  • feat: add CLI --version flag (#492)
    Adds a new CLI flag, `--version`, that prints the current version and
    exits.
    
    ---------
    
    Co-authored-by: Thibault Sottiaux <tibo@openai.com>
  • agent-loop: minimal mid-stream #429 retry loop using existing back-off (#506)
    As requested by @tibo-openai at
    https://github.com/openai/codex/pull/357#issuecomment-2816554203, this
    attempts a more minimal implementation of #357 that preserves as much as
    possible of the existing code's exponential backoff logic.
    
    Adds a small retry wrapper around the streaming for‑await loop so that
    HTTP 429s which occur *after* the stream has started no longer crash the
    CLI.
    
    Highlights
    • Re‑uses existing RATE_LIMIT_RETRY_WAIT_MS constant and 5‑attempt
    limit.
    • Exponential back‑off identical to initial request handling. 
    
    This comment is probably more useful here in the PR:
    // The OpenAI SDK may raise a 429 (rate-limit) *after* the stream has
    // started. Prior logic already retries the initial `responses.create`
    // call, but we need to add equivalent resilience for mid-stream
    // failures.  We keep the implementation minimal by wrapping the
    // existing `for-await` loop in a small retry-for-loop that re-creates
    // the stream with exponential back-off.
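
    A minimal sketch of such a retry wrapper, under stated assumptions: rate-limit errors are recognized by a `status` of 429, the base wait and attempt limit are parameters rather than the real `RATE_LIMIT_RETRY_WAIT_MS` value, and `sleep` is injected so the example can run without real delays. All names here are hypothetical. Note that events received before a mid-stream failure are delivered again on retry; this sketch makes no attempt to de-duplicate them.

    ```typescript
    // Treat any thrown object carrying `status: 429` as a rate-limit error.
    function isRateLimitError(err: unknown): boolean {
      return (
        typeof err === "object" &&
        err !== null &&
        (err as { status?: unknown }).status === 429
      );
    }

    async function consumeWithRetry<T>(
      createStream: () => AsyncIterable<T>,
      onEvent: (event: T) => void,
      baseWaitMs: number,
      maxAttempts: number,
      sleep: (ms: number) => Promise<void>,
    ): Promise<void> {
      for (let attempt = 1; ; attempt++) {
        try {
          // Re-create the stream on every attempt.
          for await (const event of createStream()) {
            onEvent(event);
          }
          return;
        } catch (err) {
          if (!isRateLimitError(err) || attempt >= maxAttempts) {
            throw err;
          }
          // Exponential back-off: base, 2x base, 4x base, ...
          await sleep(baseWaitMs * 2 ** (attempt - 1));
        }
      }
    }
    ```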