Commit Graph

6893 Commits

  • [codex] Stage Python SDK beta versions from release tags (#24872)
    ## Summary
    - Treat `sdk/python` as a development template with source version
    `0.0.0-dev`, matching the existing Python runtime packaging pattern.
    - Have `python-v*` tags supply the published SDK beta version through
    the existing `stage-sdk --sdk-version` path.
    - Remove the workflow check requiring a source version bump for each
    beta release and remove its now-unused host Python setup step.
    - Keep the reviewed runtime dependency pin at
    `openai-codex-cli-bin==0.132.0`.
    - Remove beta-number-specific documentation so it does not need editing
    for each publish.
    
    ## Why
    The package staging script already writes the release version into the
    artifact. Requiring the checked-in SDK template version to match every
    tag adds release-only source churn without changing the package users
    receive.
    
    ## Validation
    - Not run locally; relying on online CI for this workflow and metadata
    change.
    
    ## Release
    After this PR lands, publish the next beta by pushing tag
    `python-v0.1.0b2` from merged `main`.
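
A minimal sketch of how a `python-v*` tag could be turned into the SDK version handed to `stage-sdk --sdk-version`. The helper name `sdk_version_from_tag` and the accepted version pattern are assumptions for illustration; the actual workflow logic is not shown in this PR description.

```python
import re

# Hypothetical pattern: accepts X.Y.Z with an optional aN / bN / rcN suffix.
_TAG_PATTERN = re.compile(r"^python-v(?P<version>\d+\.\d+\.\d+(?:(?:a|b|rc)\d+)?)$")


def sdk_version_from_tag(tag: str) -> str:
    """Return the SDK version encoded in a release tag such as python-v0.1.0b2."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a Python SDK release tag: {tag!r}")
    return match.group("version")
```

With this shape, the checked-in template can stay at `0.0.0-dev` because the published version comes only from the tag.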
  • [codex] Remove Python SDK beta warning note (#24870)
    ## Summary
    - Remove the beta warning callout from the PyPI-facing Python SDK
    README.
    - Keep the existing Beta title and install/usage guidance unchanged.
    
    ## Validation
    - Not run locally; relying on online CI for this documentation-only
    change.
    
    ## Release
    - Land this change before publishing the next Python SDK beta.
  • [codex] Remove Python SDK language classifiers (#24868)
    ## Summary
    - Remove the Python language classifiers from the Python SDK package
    metadata.
    - Keep `requires-python = ">=3.10"` as the package's interpreter
    compatibility constraint.
    - Avoid presenting a curated version-support list in PyPI metadata.
    
    ## Validation
    - Not run locally; relying on online CI for this metadata-only change.
    
    ## Release
    - Land this change before publishing the next Python SDK beta.
  • [codex] Simplify Python SDK install guidance (#24866)
    ## Summary
    - Remove the exact-version install snippet from the PyPI-facing Python
    SDK README.
    - Remove the release-selection explanation so the install section
    presents the standard `pip install openai-codex` path directly.
    
    ## Validation
    - Not run locally; relying on online CI for this documentation-only
    change.
  • Treat refresh_token_reused 400s as relogin-required (#24830)
    ## Summary
    - classify known refresh-token terminal failures from `/oauth/token` as
    permanent even when the backend returns `400`
    - preserve the existing relogin-required message for
    `refresh_token_reused` instead of retrying and collapsing into a generic
    cloud requirements error
    - add regression coverage for `400 refresh_token_reused`
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-login`
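
The classification rule can be sketched as follows. This is an illustration in Python, not the Rust code in `codex-login`; the enum, the function name, and the treatment of `401` are assumptions, and only `refresh_token_reused` is named by this PR.

```python
from enum import Enum
from typing import Optional


class RefreshFailure(Enum):
    PERMANENT = "permanent"  # the user must log in again
    TRANSIENT = "transient"  # retrying may succeed


# Assumed set of terminal error codes; the PR names only this one explicitly.
TERMINAL_ERROR_CODES = frozenset({"refresh_token_reused"})


def classify_refresh_failure(status: int, error_code: Optional[str]) -> RefreshFailure:
    """Classify a failed /oauth/token refresh response."""
    # A known terminal error code wins regardless of the HTTP status (including 400).
    if error_code in TERMINAL_ERROR_CODES:
        return RefreshFailure.PERMANENT
    # Assumption: an unauthorized response is also treated as permanent.
    if status == 401:
        return RefreshFailure.PERMANENT
    return RefreshFailure.TRANSIENT
```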
  • [codex] Prepare Python SDK beta documentation and package metadata (#24836)
    ## Why
    
    The initial public `openai-codex` beta should read and install like a
    normal published Python package before a release tag is created. This
    follows merged PR #24828, which establishes the independent SDK beta
    release plumbing and exact runtime dependency.
    
    ## What changed
    
    - Rewrote `sdk/python/README.md` as a compact PyPI-facing beta package
    page: published installation, one quickstart, short login examples,
    built-in help, and links to deeper guides.
    - Updated the getting-started guide, API reference, FAQ, and examples
    index to present the published beta consistently without repeating
    onboarding in the package landing page or reference page.
    - Made `pip install openai-codex` the primary install path while beta
    releases are the only published SDK releases, with `--pre` documented
    for opting into prereleases after a stable release exists.
    - Added curated `help()` / `pydoc` docstrings across the public API and
    generated public convenience methods through
    `scripts/update_sdk_artifacts.py`.
    - Declared the repository `Apache-2.0` license expression and
    Documentation URL in package metadata, without introducing a duplicated
    SDK-local license file.
    - Kept the source distribution focused on installable package material
    (`src/openai_codex`, `README.md`, and `pyproject.toml`); the repository
    docs and runnable examples remain linked from the PyPI README.
    - Built release artifacts in an Alpine container on the Ubuntu runner,
    matching Python SDK CI and allowing type generation to install the
    published `musllinux` runtime wheel.
    - Added `twine check --strict` to the release workflow so malformed PyPI
    metadata or rendered README content fails before publishing.
    - Added focused SDK assertions for beta metadata, the exact runtime pin,
    source distribution contents, and the built-in Python documentation
    surface.
    
    ## Validation
    
    - Ran `uv run --frozen --extra dev ruff check
    scripts/update_sdk_artifacts.py src/openai_codex
    tests/test_public_api_signatures.py
    tests/test_artifact_workflow_and_binaries.py` before the final
    README-only reductions and review-fix follow-ups.
    - Built `openai_codex-0.1.0b1-py3-none-any.whl` and
    `openai_codex-0.1.0b1.tar.gz` before the final README-only reductions
    and review-fix follow-ups.
    - Ran `python -m twine check --strict` on both built artifacts before
    the final README-only reductions and review-fix follow-ups.
    - Verified artifact metadata reports `Apache-2.0` without a duplicated
    SDK-local license file.
    - Verified `inspect.getdoc(...)` resolves documentation for the package,
    `Codex`, `CodexConfig`, and key generated thread methods.
    - Rebased the documentation/readiness change onto merged PR #24828
    without changing the intended SDK or workflow file contents.
    - Final verification is delegated to online CI for this PR.
  • [codex] Add independent beta release for the Python SDK (#24828)
    ## Why
    
    `openai-codex` needs a beta release lifecycle without requiring beta
    releases of its pinned runtime package. Previously, SDK staging rewrote
    its runtime dependency to the SDK version, which made an SDK-only beta
    impossible.
    
    ## What changed
    
    - Set the initial SDK beta version to `0.1.0b1` and pin it to published
    stable `openai-codex-cli-bin==0.132.0`.
    - Decoupled SDK release staging from runtime versioning so it preserves
    the reviewed exact runtime pin.
    - Added a `python-v*` tag workflow that builds and publishes only
    `openai-codex` through PyPI trusted publishing.
    - Removed the Beta classifier from runtime package metadata for future
    runtime publications.
    - Regenerated protocol-derived SDK models from the selected stable
    runtime package.
    
    `0.132.0` is the newest stable runtime admitted by the checked-in
    dependency date fence and retains the Linux wheel family currently used
    by SDK CI.
    
    ## Release setup
    
    Before pushing `python-v0.1.0b1`, configure PyPI trusted publishing for
    the `openai-codex` project with workflow `python-sdk-release.yml`,
    environment `pypi`, and job `publish-python-sdk`.
    
    ## Validation
    
    - `uv run --frozen --extra dev ruff check src/openai_codex scripts
    examples tests`
    - Parsed `.github/workflows/python-sdk-release.yml` with PyYAML.
    - Built staged release artifacts locally:
    `openai_codex-0.1.0b1-py3-none-any.whl` and
    `openai_codex-0.1.0b1.tar.gz`.
    - Verified wheel metadata pins `openai-codex-cli-bin==0.132.0`.
    - Tests are deferred to online CI for this PR.
  • [codex] Remove redundant SQLite dynamic tool storage (#24819)
    ## Why
    
    Dynamic tools are defined at thread start and already stored in rollout
    `SessionMeta`, which restores resumed and forked sessions. Persisting
    the same tools through SQLite creates a second runtime persistence path
    that is unnecessary prework for the explicit namespace refactor.
    
    ## What changed
    
    - Restore missing thread-start dynamic tools directly from rollout
    history, including when SQLite is enabled.
    - Remove SQLite dynamic-tool reads, writes, backfill, and thread
    metadata patch plumbing.
    - Add SQLite-enabled resume integration coverage that verifies a
    rollout-defined dynamic tool is still sent after resume.
    
    ## Compatibility
    
    The existing `thread_dynamic_tools` table is intentionally not dropped
    even though it's now unused. Older Codex binaries are allowed to open
    databases migrated by newer binaries and still reference this table;
    dropping it would break that mixed-version path. See
    [here](https://github.com/openai/codex/blob/main/codex-rs/state/src/migrations.rs#L10-L11).
    
    ## Verification
    
    - `just test -p codex-state -p codex-rollout -p codex-thread-store`
    - `just test -p codex-core --test all
    resume_restores_dynamic_tools_from_rollout_with_sqlite_enabled`
  • [codex] Rename Python SDK AppServerConfig to CodexConfig (#24800)
    ## Why
    
    `AppServerConfig` is exported as part of the ergonomic Python SDK
    surface and passed to `Codex(...)` and `AsyncCodex(...)`. That name
    exposes the underlying app-server transport at the same layer where
    users are configuring the Codex client. `CodexConfig` makes the common
    callsite read naturally and names the object it configures.
    
    ## What changed
    
    - Renamed the public configuration dataclass from `AppServerConfig` to
    `CodexConfig`.
    - Updated `Codex`, `AsyncCodex`, and the transport clients to accept
    `CodexConfig`.
    - Updated binary-resolution messages, package exports, docs, examples,
    and related coverage to use the new public name.
    
    ## API impact
    
    ```python
    from openai_codex import Codex, CodexConfig
    
    with Codex(config=CodexConfig(codex_bin="/path/to/codex")) as codex:
        ...
    ```
    
    Callers should now import and construct `CodexConfig`; `AppServerConfig`
    is no longer part of the Python SDK surface.
    
    ## Validation
    
    - `uv run --frozen --extra dev ruff check src/openai_codex scripts
    examples tests`
    - Tests are deferred to online CI for this PR.
  • [codex] Fix hyperlink-aware key-value table rendering (#24825)
    ## Why
    
    The key/value markdown table renderer added in #24636 still operates on
    `Line` values, while table cells and rendered table output now carry
    `HyperlinkLine`. That mismatch breaks `codex-tui` compilation on `main`
    and would risk losing semantic web-link annotations if corrected by
    flattening the values.
    
    ## What changed
    
    - Make key/value record rendering wrap and emit `HyperlinkLine` values
    consistently with the existing grid renderer.
    - Remap wrapped hyperlink ranges and shift them when value content is
    prefixed by record-mode indentation or labels.
    - Add focused coverage verifying key/value fallback output preserves
    web-link destinations.
    
    ## Verification
    
    - `just test -p codex-tui -E
    'test(key_value_table_keeps_web_annotations) |
    test(/table_renders_(key_value_records_when_compact_fragmentation_is_systemic_snapshot|stacked_key_value_records_when_path_column_becomes_too_narrow_snapshot|records_when_multiple_prose_columns_are_starved_snapshot)/)'`
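
To illustrate the range shift described above, here is a minimal Python sketch. `LinkRange` and `prefix_line` are hypothetical names, not the `HyperlinkLine` type in `codex-tui`, and offsets are counted in characters rather than terminal display columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkRange:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    url: str


def prefix_line(prefix: str, text: str, links: list[LinkRange]) -> tuple[str, list[LinkRange]]:
    """Prepend a label or indentation and shift every link range by its length."""
    shift = len(prefix)
    shifted = [LinkRange(link.start + shift, link.end + shift, link.url) for link in links]
    return prefix + text, shifted
```

Without the shift, the annotation would cover the label instead of the URL once a record-mode prefix is added.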
  • Update rmcp to 1.7.0 (#24763)
    This will make it easier to uprev when the new draft spec is supported.
    
    Also updates reqwest where needed for compatibility but doesn't update
    it everywhere since this is already a large diff.
    
    The new version of rmcp handles certain kinds of authentication failures
    differently; this patch includes support for identifying the failing scope
    in a WWW-Authenticate header.
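
As a rough illustration of reading a scope from a `WWW-Authenticate` challenge, the sketch below extracts the quoted `scope` parameter that bearer-token challenges may carry. It is a Python stand-in with a hypothetical function name, not the rmcp API or the Rust code in this patch, and it does not implement full header parsing.

```python
import re

# Matches a quoted scope parameter, e.g. scope="files:read files:write".
_SCOPE_PARAM = re.compile(r'\bscope="([^"]*)"')


def failing_scopes(www_authenticate: str) -> list[str]:
    """Return the space-separated scopes named in the challenge, if any."""
    match = _SCOPE_PARAM.search(www_authenticate)
    return match.group(1).split() if match else []
```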
  • Allow API-key auth for remote exec-server registration (#24666)
    ## Overview
    Allow remote `codex exec-server` registration to use existing API-key
    auth while restricting where those credentials can be sent.
    
    - Accept `CodexAuth::ApiKey` for the normal `--remote` registration
    path.
    - Restrict API-key remote registration to HTTPS `openai.com` and
    `openai.org` hosts and subdomains, with explicit HTTP loopback support
    for local development.
    - Disable registry registration redirects so credentials cannot be
    forwarded to an unvalidated destination.
    - Retain `--use-agent-identity-auth` as the explicit Agent Identity
    path.
    - Document remote registration using `CODEX_API_KEY`.
    
    ## Big picture
    Callers can now provide an API key directly to `exec-server`
    registration without first establishing ChatGPT login state:
    
    ```sh
    CODEX_API_KEY="$OPENAI_API_KEY" \
    codex exec-server \
      --remote "https://<host>.openai.org/api" \
      --environment-id "$ENVIRONMENT_ID"
    ```
    
    ## Validation
    - `cargo fmt --all` (`just fmt` is not installed on this host)
    - `cargo test -p codex-cli -p codex-exec-server`
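
The destination restriction can be sketched like this. It is an illustrative Python version of the rule stated in the overview, not the Rust implementation; the function names are hypothetical, and treating the literal `localhost` as loopback is an assumption.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit

ALLOWED_DOMAINS = ("openai.com", "openai.org")


def _is_loopback(host: str) -> bool:
    # Assumption: "localhost" counts as loopback alongside 127.0.0.0/8 and ::1.
    if host == "localhost":
        return True
    try:
        return ip_address(host).is_loopback
    except ValueError:
        return False


def api_key_registration_allowed(url: str) -> bool:
    """Return True if an API key may be sent to this registration URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme == "https":
        # Exact domain or a true subdomain; "evil-openai.com" must not match.
        return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)
    if parts.scheme == "http":
        return _is_loopback(host)
    return False
```

Matching on `"." + domain` rather than a bare suffix is what keeps lookalike hosts out of the allowlist.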
  • feat(tui): render cramped markdown tables as key-value records [2 of 2] (#24636)
    ## Stack
    
    - **Base: #24489 [1 of 2]** - render markdown tables in app style.
    - **Current: #24636 [2 of 2]** - render cramped markdown tables as
    key/value records.
    
    Review this PR against `fcoury/app-style-markdown-tables`; it contains
    only the fallback behavior for cramped tables.
    
    ## Why
    
    The row-separated markdown table rendering in #24489 remains readable
    while columns have usable room. Once long links or multiple prose-heavy
    columns are compressed into narrow allocations, however, the grid can
    turn words and paths into tall vertical strips that are difficult to
    scan. In those cases the content matters more than preserving the grid
    shape.
    
    ## What Changed
    
    <table>
    <tr><td>
    <p align="center"><b>
    Normal
    </b></p>
    <img width="1722" height="619" alt="CleanShot 2026-05-27 at 14 32 57"
    src="https://github.com/user-attachments/assets/d04f5fbd-6064-4acd-91bd-072d19b983df"
    />
    </td></tr>
    <tr><td>
    <p align="center"><b>
    Narrow
    </b></p>
    <img width="863" height="1013" alt="CleanShot 2026-05-27 at 14 33 12"
    src="https://github.com/user-attachments/assets/6a7d2968-0a68-48fd-ab5d-209b3dbaf03e"
    />
    </td></tr>
    <tr><td>
    <p align="center"><b>
    Very narrow
    </b></p>
    <img width="435" height="746" alt="CleanShot 2026-05-27 at 14 33 47"
    src="https://github.com/user-attachments/assets/f6a59e30-b1d2-4063-9c05-43933abc77d6"
    />
    </td></tr>
    </table>
    
    - Detect tables whose grid allocation causes systemic token
    fragmentation or starves multiple prose-heavy columns.
    - Render those tables as repeated key/value records instead of retaining
    an unreadable grid.
    - Use aligned label/value records when there is useful horizontal room,
    and switch to a stacked narrow-record layout where each label is
    followed by a full-width value when width is especially constrained.
    - Preserve the themed label color, rich inline formatting, links, and
    the existing grid presentation for tables that remain readable.
    - Add snapshot coverage for path-heavy narrow tables, prose-heavy issue
    tables, systemic compact fragmentation, and a control case that should
    continue to render as a grid.
    
    ## How to Test
    
    1. Start Codex from this branch and render a normal multi-column
    markdown table at a comfortable terminal width. Confirm it still appears
    as the styled row-separated grid from #24489.
    2. Render a table containing a long linked record identifier or
    file-like value, then narrow the terminal until the grid would split the
    value into vertical fragments. Confirm it switches to key/value records,
    with labels above values at very narrow widths.
    3. Render a table with multiple prose-heavy columns, such as an issue
    summary table with `Issue`, `Activity`, `Complexity`, and `Why start`.
    Confirm a cramped width switches to records rather than wrapping several
    columns into hard-to-read strips.
    4. Render a compact table where only one value wraps mildly. Confirm it
    stays in grid form rather than switching prematurely.
    
    ## Validation
    
    - Ran `just test -p codex-tui` while developing the fallback and
    reviewed/accepted the intended new markdown-render snapshots. The
    command still reports two unrelated existing guardian feature-flag test
    failures outside this diff.
    - Ran `just fix -p codex-tui` and `just fmt` after the Rust changes were
    complete.
    - `just argument-comment-lint` cannot reach source linting locally
    because Bazel fails while resolving LLVM sanitizer headers; touched
    positional literal callsites were inspected manually and annotated where
    needed.
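
The two record layouts can be sketched as follows. This is a simplified Python illustration, not the `codex-tui` renderer: `render_records` and `MIN_VALUE_WIDTH` are hypothetical, the threshold is an assumed value, and styling, links, and the fragmentation detection itself are left out.

```python
import textwrap

MIN_VALUE_WIDTH = 20  # assumed threshold, not taken from the PR


def render_records(headers: list[str], rows: list[list[str]], width: int) -> list[str]:
    """Render table rows as key/value records, aligned or stacked by width."""
    label_width = max(len(h) for h in headers)
    value_width = width - label_width - 2
    stacked = value_width < MIN_VALUE_WIDTH
    lines: list[str] = []
    for index, row in enumerate(rows):
        if index > 0:
            lines.append("")  # blank line between records
        for header, value in zip(headers, row):
            if stacked:
                # Narrow layout: label on its own line, value at full width.
                lines.append(header)
                lines.extend(textwrap.wrap(value, width) or [""])
            else:
                # Aligned layout: label column, then the wrapped value.
                wrapped = textwrap.wrap(value, value_width) or [""]
                lines.append(f"{header.ljust(label_width)}  {wrapped[0]}")
                pad = " " * (label_width + 2)
                lines.extend(pad + part for part in wrapped[1:])
    return lines
```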
  • feat(tui): add OSC 8 web links to rich content (#24472)
    ## Why
    
    Wrapped URLs in rich TUI output, especially URLs rendered inside
    Markdown tables, are split across terminal rows. In terminals that
    support OSC 8 hyperlinks, treating each visible fragment as part of the
    complete destination enables reliable open-link and copy-link actions
    even after table layout wraps the URL.
    
    This addresses the semantic-link portion of #12200 and the behavior
    described in
    https://github.com/openai/codex/issues/12200#issuecomment-4535452980. It
    does not change ordinary drag-selection across bordered table rows.
    
    ## What Changed
    
    - Added shared TUI OSC 8 support that validates `http://` and `https://`
    destinations, sanitizes terminal payloads, and applies metadata
    separately from visible line width/layout.
    - Added semantic web-link annotations to assistant and proposed-plan
    Markdown, including explicit web links and bare web URLs in prose and
    table cells while excluding code and non-web Markdown destinations.
    - Preserved complete URL targets through table wrapping, narrow pipe
    fallback, streaming, transcript overlay rendering, history insertion,
    and resize replay.
    - Routed intentional Codex-owned links in notices,
    status/setup/app-link, feedback, onboarding, MCP/plugin help, memories,
    and update surfaces through the shared hyperlink handling.
    
    ## How to Test
    
    1. Run Codex in a terminal with OSC 8 link support, such as Ghostty, and
    request an assistant response containing a Markdown table whose last
    column contains a long `https://` URL.
    2. Make the terminal narrow enough for the URL to wrap across multiple
    bordered table rows.
    3. Use the terminal's open-link or copy-link action on more than one
    wrapped URL fragment and confirm each fragment resolves to the complete
    original URL.
    4. Resize the terminal after the table is rendered and repeat the link
    action to confirm the destination survives scrollback replay.
    5. Open the transcript overlay while rich output is present and confirm
    web links remain interactive there.
    6. As a regression check, render inline/fenced code containing URL text
    and a Markdown link such as
    `[https://example.com](mailto:support@example.com)`; confirm these do
    not acquire a web OSC 8 destination.
    
    Targeted automated coverage exercised Markdown links and exclusions,
    wrapped and pipe-fallback tables, streaming/transcript overlay
    propagation, status-link truncation, and rendered word-wrapping cell
    alignment. `just test -p codex-tui` was also run; it passed the
    hyperlink coverage and reproduced two unrelated existing guardian
    feature-flag test failures.
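
A minimal sketch of the OSC 8 idea: every visible fragment of a wrapped URL is wrapped in its own escape sequence that carries the complete destination. The function names are hypothetical, and this Python version only illustrates the escape format and the `http://`/`https://` check, not the shared TUI implementation.

```python
from urllib.parse import urlsplit

ST = "\x1b\\"  # string terminator that ends an OSC sequence


def is_web_url(url: str) -> bool:
    """Accept only http/https destinations without control characters."""
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in url):
        return False
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def link_fragments(fragments: list[str], url: str) -> list[str]:
    """Wrap each visible fragment so it opens the complete destination."""
    if not is_web_url(url):
        return list(fragments)
    return [f"\x1b]8;;{url}{ST}{fragment}\x1b]8;;{ST}" for fragment in fragments]
```

Because the escape sequences occupy no visible columns, layout width should be computed from the fragments alone, which matches the PR's note about applying metadata separately from line width.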
  • fix(linux-sandbox): preserve shell cleanup on interruption (#22729)
    ## Why
    Interrupted `shell_command` calls can race with the outer tool-dispatch
    cancellation path. When that happens, the runtime future may be dropped
    before the spawned process gets a chance to run `SIGTERM` cleanup. For
    bwrapd-backed Linux sandbox commands, that can leave behind synthetic
    protected-path mount bookkeeping, such as `.git/.codex` registrations
    under `/tmp`, after a TUI interruption.
    
    The relevant cancellation points are the outer dispatch race in
    [`core/src/tools/parallel.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/tools/parallel.rs#L91-L132)
    and the process shutdown logic in
    [`core/src/exec.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/exec.rs#L1367-L1393).
    
    ## What changed
    - Keep `shell_command` dispatch alive long enough for the runtime to
    finish cancellation cleanup instead of immediately returning the
    synthetic aborted response.
    - Fold shell-turn cancellation into the existing `ExecExpiration` path
    in
    [`core/src/tools/runtimes/shell.rs`](https://github.com/openai/codex/blob/bd184ba84703cc924921ed883f0cf17d3dba60ff/codex-rs/core/src/tools/runtimes/shell.rs#L267-L274),
    so cancellation and timeout behavior stay centralized.
    - On cancellation, send `SIGTERM` first, wait briefly for cleanup to
    run, then hard-kill any remaining descendants in the original process
    group.
    - Treat `ESRCH` as an already-gone process-group cleanup case in
    `codex-utils-pty`, which keeps best-effort teardown from surfacing a
    stale-process race as an error.
    
    ## Verification
    - `cargo test -p codex-core cancellation`
    - Added regression coverage for:
      - `shell_tool_cancellation_waits_for_runtime_cleanup`
      - `process_exec_tool_call_cancellation_allows_sigterm_cleanup`
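
The cancellation sequence above (SIGTERM, brief wait, then hard-kill the process group, tolerating `ESRCH`) can be sketched in Python for a POSIX host. The function names and the grace period are assumptions; the real logic lives in the Rust crates named in this PR.

```python
import os
import signal
import subprocess


def _signal_group(pgid: int, sig: int) -> None:
    """Signal a process group, treating ESRCH as 'already gone'."""
    try:
        os.killpg(pgid, sig)
    except ProcessLookupError:
        # ESRCH: the group exited between our check and the signal.
        pass


def terminate_process_group(proc: subprocess.Popen, grace_seconds: float = 0.5) -> int:
    """Send SIGTERM, wait briefly for cleanup, then SIGKILL what remains."""
    # Assumes the child was started with start_new_session=True,
    # so its pid is also its process-group id.
    pgid = proc.pid
    _signal_group(pgid, signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        pass
    # Hard-kill any descendants still alive in the original group.
    _signal_group(pgid, signal.SIGKILL)
    return proc.wait()
```

A caller would start the command with `subprocess.Popen(..., start_new_session=True)` so that descendants share one group that can be signalled together.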
  • chore: enable namespace tools for Bedrock (#24713)
    Client-side namespace tools are now supported by Bedrock. Enable
    `namespace_tools` for the Amazon Bedrock provider while continuing to
    disable unsupported hosted tools such as image generation and web
    search.
  • feat(tui): render markdown tables in app style [1 of 2] (#24489)
    ## Stack
    
    - **Current: #24489 [1 of 2]** - render markdown tables in app style.
    - **Stacked follow-up: #24636 [2 of 2]** - render cramped markdown
    tables as key/value records.
    
    ## Why
    
    Markdown tables currently render as boxed terminal grids, which gives
    ordinary assistant output a heavier visual treatment than surrounding
    rich text. This row-separated layout is the best match for how the App
    renders tables, while accented headers remain distinguishable even when
    a terminal font renders bold subtly.
    
    <table>
    <tr><td>
    <p align="center">Codex CLI - Before</p>
    <img width="1722" height="742" alt="CleanShot 2026-05-25 at 18 46 17"
    src="https://github.com/user-attachments/assets/f673d92a-ebd8-46e2-b414-3d985e41b6a4"
    />
    </td></tr>
    <tr><td>
    <p align="center">Codex CLI - After</p>
    <img width="1720" height="957" alt="image"
    src="https://github.com/user-attachments/assets/36a3d331-bea1-439b-b5be-e97b0731bd6f"
    />
    </td></tr>
    <tr><td>
    <p align="center">Codex App</p>
    <img width="979" height="1293" alt="CleanShot 2026-05-25 at 18 45 04"
    src="https://github.com/user-attachments/assets/7d97cae0-9256-4f6e-a4b3-8b8f22b0d901"
    />
    </td></tr>
    </table>
    
    ## What Changed
    
    - Render markdown tables as padded, aligned rows without an enclosing
    box.
    - Style table headers with the active syntax-theme accent plus bold
    text, while keeping separators low contrast and theme-aware.
    - Use a segmented heavy header rule and thin body-row rules, preserving
    wrapping, narrow-width fallback, streaming parity, and rich-history
    rendering.
    - Update focused assertions and snapshots for the final table layout.
    
    ## How to Test
    
    1. Render a markdown table in the TUI with several rows and columns.
    2. Confirm the header uses the active theme accent, rows use
    one-character interior padding, and the table has no enclosing box.
    3. Confirm the header is followed by segmented `━` rules and multiple
    body rows are separated by muted segmented `─` rules.
    4. Render the same table while streaming and in history/raw-mode
    toggles; the final rich layout should remain stable.
    5. Render a narrow table with long content and verify wrapping or pipe
    fallback does not overflow.
    
    ## Validation
    
    - `just test -p codex-tui table`
    - `just test -p codex-tui streaming::controller::tests`
    - `just argument-comment-lint-from-source -p codex-tui -- --all-targets`
    - `just fix -p codex-tui`
    - `just fmt`
    
    `just test -p codex-tui` was also run after accepting the snapshots; it
    fails only in the unrelated existing guardian app tests
    `update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    and
    `update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`.
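
For orientation, the row-separated layout can be approximated with the Python sketch below. It is a plain-text illustration with a hypothetical `render_table` helper; theme colors, wrapping, and the narrow-width fallback are omitted, and the single-space gap between rule segments is an assumption.

```python
def render_table(headers: list[str], rows: list[list[str]]) -> list[str]:
    """Render a table as padded rows with segmented rules and no outer box."""
    widths = [
        max([len(header)] + [len(row[i]) for row in rows])
        for i, header in enumerate(headers)
    ]

    def format_row(cells: list[str]) -> str:
        # One character of interior padding on each side of every cell.
        return " ".join(f" {cell.ljust(width)} " for cell, width in zip(cells, widths))

    def rule(char: str) -> str:
        # One rule segment per column, matching the padded cell width.
        return " ".join(char * (width + 2) for width in widths)

    lines = [format_row(headers), rule("━")]
    for index, row in enumerate(rows):
        if index > 0:
            lines.append(rule("─"))
        lines.append(format_row(row))
    return lines
```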
  • feat(tui): make turn interruption keybind configurable (#24766)
    ## Why
    
    Interrupting an active turn is currently fixed to `Esc`, which is easy
    to hit accidentally and cannot be customized through `/keymap`. This
    gives users a less accidental binding while preserving the existing
    default.
    
    ## What Changed
    
    - Adds `tui.keymap.chat.interrupt_turn` to `/keymap`, defaulting to
    `esc` and supporting remapping or unbinding.
    - Uses the configured interrupt binding for running-turn status, queued
    steer interruption, and `request_user_input`, including the visible
    hints.
    - Preserves local `Esc` behavior for popups, Vim insert mode, and
    `/agent` editing while validating conflicts with fixed/backtrack and
    request-input navigation bindings.
    - Adds behavior and snapshot coverage for remapped interruption paths.
    
    ## How to Test
    
    1. Run Codex and open `/keymap`, then set **Interrupt Turn** to `f12`.
    2. Start a turn and confirm `Esc` no longer interrupts it while `f12`
    does; the running hint should display `f12 to interrupt`.
    3. Queue a steer while a turn is running and confirm the preview
    displays `f12`; pressing it should interrupt and submit the steer
    immediately.
    4. Trigger a `request_user_input` prompt and confirm its footer uses
    `f12`; with notes open, `Esc` should still clear notes while `f12`
    interrupts the turn.
    5. Clear the Interrupt Turn binding and confirm the key-specific
    interrupt hint is removed while `Ctrl+C` remains available.
    
    Targeted validation:
    
    - `just write-config-schema`
    - `just fix -p codex-config`
    - `just fix -p codex-tui`
    - `just fmt`
    - `just argument-comment-lint-from-source -p codex-config -p codex-tui`
    - `just test -p codex-config`
    - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`
    - `just test -p codex-tui keymap_setup::tests`
    - `just test -p codex-tui` (fails in two pre-existing guardian
    feature-flag tests unrelated to this diff; the intentional picker
    snapshot updates were reviewed and accepted)
  • feat(tui): add vim text object bindings (#24382)
    ## Why
    
    Vim mode currently supports some normal-mode operators and motions, but
    common text-object combinations like `ciw`, `daw`, `di(`, and
    quote/bracket variants are still missing. That makes the composer feel
    incomplete for users who expect operator + text object editing to work
    inside prompts.
    
    Closes #21383.
    
    ## What Changed
    
    - Add Vim pending-state support for operator/text-object sequences.
    - Add `c` as a normal-mode operator for text objects, so combinations
    like `ciw` delete the object and enter insert mode.
    - Support word, WORD, delimiter, and quote text objects:
      - `iw`, `aw`, `iW`, `aW`
      - `i(`, `a(`, `i)`, `a)`, `ib`, `ab`
      - `i[`, `a[`, `i]`, `a]`
      - `i{`, `a{`, `i}`, `a}`, `iB`, `aB`
      - `i"`, `a"`, `i'`, `a'`, `` i` ``, `` a` ``
    - Add configurable keymap entries and keymap picker coverage for the new
    Vim text-object context.
    - Regenerate the config schema and update keymap picker snapshots.
    
    ## How to Test
    
    Manual smoke test:
    
    1. Start Codex with Vim composer mode enabled.
    2. Type a draft such as:
       ```text
       alpha beta gamma
       call(foo[bar], {"x": "hello world"})
       say "one \"two\" three" now
       ```
    3. Put the cursor on `beta`, press `ciw`, and confirm `beta` is removed
    and the composer enters insert mode.
    4. Escape back to normal mode, put the cursor on `gamma`, press `daw`,
    and confirm `gamma` plus surrounding whitespace is removed.
    5. Put the cursor inside `foo[bar]`, press `di[`, and confirm only `bar`
    is removed.
    6. Put the cursor inside `call(...)`, press `da(`, and confirm the whole
    parenthesized section is removed.
    7. Put the cursor inside the quoted text, press `ci"`, and confirm the
    quote contents are removed and insert mode starts.
    8. Verify cancellation does not edit text: press `d` then `Esc`, and
    press `d` then `i` then `Esc`.
    
    Targeted tests:
    
    - `cargo test -p codex-tui --lib vim_`
    - `cargo nextest run -p codex-tui keymap_setup::tests`
    
    Additional local checks:
    
    - `just write-config-schema`
    - `just fmt`
    - `just fix -p codex-tui`
    - `git diff --check`
    - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`
    
    Local full-suite note: `just test -p codex-tui` ran to completion. The
    keymap snapshot failures were expected and accepted. Two unrelated
    guardian feature-flag tests still fail locally:
    -
    `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    -
    `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`
    
    `just argument-comment-lint` is currently blocked locally by Bazel
    analysis before the lint runs because `compiler-rt` has an empty
    `include/sanitizer/*.h` glob in the local Bazel cache. The touched Rust
    diff was manually inspected for opaque positional literals.
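
To make the text-object ranges concrete, here is a simplified Python sketch of `iw`/`aw` and bracket objects. The function names are hypothetical and this is not the `codex-tui` implementation: it assumes the cursor sits on a word character for word objects and ignores delimiters that appear inside quotes.

```python
from typing import Optional


def word_object(text: str, cursor: int, around: bool) -> tuple[int, int]:
    """Return the [start, end) range for `iw` (around=False) or `aw` (around=True)."""

    def is_word(ch: str) -> bool:
        return ch.isalnum() or ch == "_"

    start = cursor
    while start > 0 and is_word(text[start - 1]):
        start -= 1
    end = cursor
    while end < len(text) and is_word(text[end]):
        end += 1
    if around:
        trailing = end
        while trailing < len(text) and text[trailing] == " ":
            trailing += 1
        if trailing > end:
            end = trailing  # `aw` prefers trailing whitespace
        else:
            while start > 0 and text[start - 1] == " ":
                start -= 1  # otherwise fall back to leading whitespace
    return start, end


def delimited_object(
    text: str, cursor: int, open_ch: str, close_ch: str, around: bool
) -> Optional[tuple[int, int]]:
    """Return the range for `i(`/`a(`-style objects, or None if unbalanced."""
    depth = 0
    start = None
    for i in range(cursor, -1, -1):  # scan left for the enclosing opener
        ch = text[i]
        if ch == close_ch and i != cursor:
            depth += 1
        elif ch == open_ch:
            if depth == 0:
                start = i
                break
            depth -= 1
    if start is None:
        return None
    depth = 0
    for j in range(start + 1, len(text)):  # scan right for the matching closer
        ch = text[j]
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            if depth == 0:
                return (start, j + 1) if around else (start + 1, j)
            depth -= 1
    return None
```

An operator such as `d` or `c` would then delete `text[start:end]`, with `c` additionally entering insert mode.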
  • [codex] Add friendly Python SDK sandbox presets (#24772)
    ## Why
    
    The Python SDK currently exposes sandbox selection differently depending
    on where it is used: thread lifecycle methods accept `SandboxMode`,
    while turns accept the lower-level `SandboxPolicy` shape. For the common
    case of choosing an access level, that leaks app-server wire details
    into otherwise straightforward SDK usage.
    
    This makes the common path explicit and discoverable: callers choose a
    named sandbox preset once, using the same keyword on threads and turns.
    The preset name `workspace_write` also makes the granted capability
    clear at the callsite.
    
    ## What changed
    
    - Added a root-level `Sandbox` enum with documented presets:
      - `Sandbox.read_only`: read files without allowing writes.
      - `Sandbox.workspace_write`: the normal default for projects with a
        recorded trust decision; read files and write inside the workspace and
        configured writable roots.
      - `Sandbox.full_access`: run without filesystem access restrictions.
    - Documented that omitting `sandbox=` delegates to app-server's
    configured default, while explicit turn overrides remain sticky for
    subsequent turns.
    - Updated sync and async thread lifecycle and turn APIs to consistently
    accept `sandbox=Sandbox...`, translating to the existing app-server
    thread and turn representations internally.
    - Updated the public API artifact generator so regenerated SDK wrappers
    retain the friendly enum shape.
    - Replaced low-level policy construction in Python docs, examples, and
    the walkthrough notebook with the preset API.
    - Added focused coverage for root exports, method signatures,
    preset-to-wire mapping, and rejection of raw string sandbox inputs.
    
    ## API impact
    
    High-level turn calls now use `sandbox=` instead of `sandbox_policy=`:
    
    ```python
    from openai_codex import Codex, Sandbox
    
    with Codex() as codex:
        thread = codex.thread_start(sandbox=Sandbox.workspace_write)
        result = thread.run("Review the diff only.", sandbox=Sandbox.read_only)
    ```
    
    `thread_start(...)` already defaults to `ApprovalMode.auto_review`, so
    normal writable usage is concise:
    
    ```python
    with Codex() as codex:
        thread = codex.thread_start(sandbox=Sandbox.workspace_write)
        thread.run("Update the files in this workspace.")
    ```
    
    With that combination, edits inside `cwd` and configured writable roots
    run within the workspace-write sandbox. Operations that require
    approval, such as edits outside those roots, are routed through auto
    review. When `sandbox=` is omitted, app-server resolves its configured
    default. A sandbox supplied to `run(...)` or `turn(...)` applies to that
    turn and subsequent turns.
    
    ## Test coverage
    
    - `sdk/python/tests/test_public_api_signatures.py` covers the public
    export and parameter names, including the default approval mode.
    - `sdk/python/tests/test_public_api_runtime_behavior.py` covers preset
    mappings to the existing wire types and raw string rejection.
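
    The preset-to-wire translation and raw-string rejection described above could look roughly like the sketch below. The `Sandbox` class here is a stand-in rather than the SDK's real enum, and `to_wire_mode` and the wire strings are assumptions made for illustration.

    ```python
    from enum import Enum


    class Sandbox(Enum):
        """Illustrative stand-in for the SDK preset enum (not the real class)."""

        read_only = "read_only"
        workspace_write = "workspace_write"
        full_access = "full_access"


    # Hypothetical wire names; the real app-server values may differ.
    _WIRE_MODE = {
        Sandbox.read_only: "read-only",
        Sandbox.workspace_write: "workspace-write",
        Sandbox.full_access: "danger-full-access",
    }


    def to_wire_mode(sandbox):
        """Translate a preset to a wire value; None defers to the server default."""
        if sandbox is None:
            return None
        if not isinstance(sandbox, Sandbox):
            # Raw strings such as "read_only" are rejected rather than coerced.
            raise TypeError(
                f"sandbox must be a Sandbox preset, got {type(sandbox).__name__}"
            )
        return _WIRE_MODE[sandbox]
    ```

    Rejecting raw strings keeps typos from silently selecting a different access level than the caller intended.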
  • [codex] add compaction metadata to turn headers (#24368)
    ## Summary
    - Add `request_kind` values for foreground turn, startup prewarm,
    compaction, and detached memory model requests.
    - Attach compaction dispatch metadata to local Responses, legacy
    `/v1/responses/compact`, and remote v2 compact requests.
    - Add the existing logical context-window identifier as `window_id` on
    turn-owned model request metadata.
    - Keep identity fields optional for detached memory requests, while
    still emitting `request_kind="memory"` in non-git/no-sandbox workspaces.
    
    ## Root Cause
    `x-codex-turn-metadata` has more than one producer. Foreground turns and
    compaction requests own a real turn and should carry that turn identity.
    Detached memory stage-one requests do not own a foreground turn, so
    absent identity fields are valid rather than missing data. Startup
    websocket prewarm is also a model request, but it has `generate=false`
    and must not be counted as a foreground turn.
    
    `thread_source` or session source identifies where a thread came from
    (for example review, guardian, or another subagent). `request_kind`
    identifies what the current outbound model request is doing (`turn`,
    `prewarm`, `compaction`, or `memory`). A review or guardian thread can
    issue either a normal turn request or a compaction request, so source
    cannot replace request kind.
    
    ## Behavior / Impact
    - Ordinary foreground requests send `request_kind="turn"`, their real
    identity fields, and `window_id="<thread_id>:<window_generation>"`.
    - Startup websocket warmup requests send `request_kind="prewarm"` so
    they are not counted as foreground turns.
    - Compaction requests send `request_kind="compaction"`, their real
    owning turn identity, the existing `window_id`, and
    `compaction.{trigger,reason,implementation,phase,strategy}`.
    - Detached memory stage-one requests send `request_kind="memory"`
    without `session_id`, `thread_id`, `turn_id`, or `window_id`; when no
    workspace metadata exists, the kind-only header is still emitted.
    - `session_id`, `thread_id`, `turn_id`, and `window_id` remain optional
    in the header schema because detached memory requests do not own a
    foreground turn or context window.
    - `window_id` is not a new ID system: it is copied from the already-sent
    `x-codex-window-id` / WS client metadata value at model-request dispatch
    time.
    - Existing `x-codex-window-id` HTTP/WS emission, value format,
    generation advancement, resume behavior, and fork reset behavior are
    unchanged.
    - `request_kind`, `window_id`, and upstream turn-owned identity fields
    remain schema-owned; input `responsesapi_client_metadata` cannot replace
    their canonical values.
    - No table, DAG, export, app-server API, or MCP `_meta` schema changes
    are included.
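
    A minimal sketch of how the header fields above might be assembled, assuming a plain dictionary payload. `build_turn_metadata` and its parameters are hypothetical names and not the client's actual builder; the field names and request kinds come from the description above.

    ```python
    REQUEST_KINDS = frozenset({"turn", "prewarm", "compaction", "memory"})
    IDENTITY_FIELDS = ("session_id", "thread_id", "turn_id", "window_id")
    # Schema-owned keys: client-supplied metadata must not replace them.
    SCHEMA_OWNED = frozenset({"request_kind", *IDENTITY_FIELDS})


    def build_turn_metadata(request_kind, identity=None, compaction=None,
                            client_metadata=None):
        """Build the metadata payload for one outbound model request."""
        if request_kind not in REQUEST_KINDS:
            raise ValueError(f"unknown request_kind: {request_kind!r}")
        # Start from client metadata, dropping anything that is schema-owned.
        metadata = {
            key: value
            for key, value in (client_metadata or {}).items()
            if key not in SCHEMA_OWNED
        }
        metadata["request_kind"] = request_kind
        # Identity is optional: detached memory requests own no turn or window.
        for key in IDENTITY_FIELDS:
            if identity and key in identity:
                metadata[key] = identity[key]
        if request_kind == "compaction" and compaction:
            metadata["compaction"] = dict(compaction)
        return metadata
    ```

    Under these assumptions, a detached memory request with no workspace metadata still yields a kind-only payload, `{"request_kind": "memory"}`.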
    
    A compaction attempt stopped by a pre-compact hook issues no model
    request and therefore has no request header; its outcome remains in
    analytics events. Status, error, duration, and token deltas also remain
    analytics fields rather than request-header fields.
    
    Future detached-memory attribution using a real initiating turn ID as
    `trigger_turn_id` is intentionally not part of this PR.
    
    ## Sync With Main
    - Final pushed head `716342e79` is rebased onto `origin/main@0d37db4b2`.
    - The metadata conflict came from upstream `#24160`, which added
    `forked_from_thread_id` on the same `turn_metadata` surface. Resolution
    preserves that field and its protection from client metadata override
    alongside this PR's request-kind, compaction, and window-id fields.
    - While resolving the overlapping commits, I removed an accidental
    recursive model-request overlay and a duplicate detached-memory header
    builder before completing the rebase.
    
    ## Latency / User Experience Boundary
    - Foreground turns perform no new filesystem, git, or network work. New
    fields are inserted into metadata already serialized for outgoing
    requests.
    - Compaction issues the same model/HTTP requests with the same prompt,
    model, service tier, and sampling settings; only metadata bytes change.
    - Startup prewarm already sent metadata; it is now correctly classified
    as `prewarm`.
    - Non-git detached memory now sends a small kind-only metadata header
    rather than no header.
    - This client diff adds no user-visible latency mechanism beyond
    negligible serialization and header bytes on already-existing requests.
    
    ## Validation
    On conflict-resolved head `1d35c2cfb` based on `origin/main@487521733`:
    - `just fmt` (passed)
    - `just fix -p codex-core` (passed)
    - `git diff --check origin/main...HEAD` (passed)
    - `just test -p codex-core -E 'test(turn_metadata) |
    test(websocket_first_turn_uses_startup_prewarm_and_create) |
    test(responses_stream_includes_turn_metadata_header_for_git_workspace_e2e)
    |
    test(responses_websocket_forwards_turn_metadata_on_initial_and_incremental_create)
    | test(remote_compact_v2_retries_failures_with_stream_retry_budget) |
    test(window_id_advances_after_compact_persists_on_resume_and_resets_on_fork)'`
    (`23 passed`; `bench-smoke` passed)
    - `just test -p codex-app-server -E
    'test(turn_start_forwards_client_metadata_to_responses_request_v2) |
    test(turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2)
    | test(auto_compaction_remote_emits_started_and_completed_items)'` (`3
    passed`; `bench-smoke` passed)
    - `just test -p codex-memories-write` (`29 passed`; `bench-smoke`
    passed)
  • [codex] Remove stale composer narrative doc references (#24641)
    ## Context
    
    `docs/tui-chat-composer.md` was removed by #20896 as part of removing
    local-only docs/specs from the repository. I checked the #20896 file
    list and the merge commit: the composer doc was deleted, not moved or
    copied, and current `main` does not contain a replacement composer
    narrative doc.
    
    Current guidance should keep contributors and agents focused on the docs
    that still exist: the module docs in `chat_composer.rs` and
    `paste_burst.rs`.
    
    ## Summary
    
    - Removes the scoped TUI bottom-pane AGENTS.md requirement to update
    `docs/tui-chat-composer.md`.
    - Removes stale module-doc references to that deleted narrative doc from
    `chat_composer.rs` and `paste_burst.rs`.
    
    ## Validation
    
    - Checked #20896 and the merge commit with rename/copy detection to
    confirm `docs/tui-chat-composer.md` was deleted rather than moved.
    - Searched current `main` for a replacement composer narrative doc.
    - Not run; documentation-only change.
  • fix: Preserve draft text when completing argument-taking slash commands (#23950)
    This adds slash command completion behavior for argument-taking
    commands, where text after the partially typed command becomes inline
    arguments instead of being discarded. This addresses the workflow of
    drafting text first, moving to the start, and completing a slash command
    around that existing draft. Before this change, completing the command
    discarded all user-input text other than the slash command itself,
    which is frustrating when the user has just typed out a long,
    well-thought-out goal.
    
    - Preserves the draft tail for inline-argument slash commands like
    `/goal` and `/review` when completing with `Tab` or `Enter`.
    - Keeps popup filtering focused on the command fragment under the cursor
    rather than the full draft text.
    - Leaves slash commands that do not support inline arguments unchanged,
    so completion still replaces the existing draft tail for those commands.
    - Adds focused TUI tests under slash input covering preserved arguments,
    cursor edge cases, and the negative case for a command without inline
    args.
      
    Follow-up simplification and test relocation from #24683 folded into
    this PR.
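
    The completion rule can be sketched as follows, assuming the command fragment is the text between the leading `/` and the cursor. The function names and the `supports_inline_args` flag are hypothetical and do not mirror the TUI's actual types.

    ```python
    def popup_filter(draft, cursor):
        """Return the command fragment under the cursor, used to filter the popup."""
        if not draft.startswith("/"):
            raise ValueError("draft must start with '/'")
        return draft[1:cursor]


    def complete_slash_command(draft, cursor, command, supports_inline_args):
        """Complete the fragment to `command`, keeping the draft tail when allowed."""
        if not draft.startswith("/"):
            raise ValueError("draft must start with '/'")
        # Everything after the cursor is the text the user drafted beforehand.
        tail = draft[cursor:].strip()
        if supports_inline_args and tail:
            return f"/{command} {tail}"
        # Commands without inline arguments still replace the whole draft.
        return f"/{command} "
    ```

    For example, with the draft `/go write the docs` and the cursor just after `/go`, completing to `goal` keeps `write the docs` as the inline argument.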
    
    ---------
    
    Co-authored-by: Eric Traut <etraut@openai.com>
  • make vercel webhook url an env secret (#24778)
    Move `DEV_WEBSITE_VERCEL_DEPLOY_HOOK_URL` to a repo environment secret.
    
    To keep the scope of that environment secret small, move the Vercel
    website redeploy to its own post-release job.
  • fix: run standalone updates noninteractively (#24637)
    # Summary
    
    The standalone update action currently downloads and runs the Codex
    installer as an interactive command. When an existing managed Codex
    install is present, accepting an update can therefore enter an installer
    prompt instead of completing the update.
    
    This change runs the standalone installer with `CODEX_NON_INTERACTIVE=1`
    on macOS/Linux and Windows. The installer environment-variable support
    is introduced by the parent PR; this PR wires that behavior into the
    Codex CLI update action. The rendered Windows command remains
    shell-safe, and long update commands wrap within the update-notice card.
    The standard test target snapshots the standalone notice for both
    platforms.
    
    # Stack
    
    1. [#21567](https://github.com/openai/codex/pull/21567) - Adds
    environment-controlled release selection and noninteractive installer
    behavior.
    2. [#24637](https://github.com/openai/codex/pull/24637) - Runs
    standalone updates with `CODEX_NON_INTERACTIVE=1`. (current)
    3. [#24639](https://github.com/openai/codex/pull/24639) - Removes
    explicit release argument inputs in favor of `CODEX_RELEASE`.
    
    # Evidence
    
    Standalone updater-shaped macOS install with an existing npm-managed
    Codex on `PATH`:
    
    
    https://github.com/user-attachments/assets/a27fe9e9-db3a-4c39-a514-24bd3d1f01e8
    
    # Testing
    
    Tests: targeted `codex-tui` update-action and update-notice snapshot
    tests, Rust formatting, benchmark smoke validation, macOS live-terminal
    standalone-update smoke testing, Windows ARM64 PowerShell
    standalone-update smoke testing through Parallels, and CI.
  • Bump SQLx to pick up newer bundled SQLite (#24728)
    ## Why
    
    Codex stores thread, log, goal, and memory state in bundled SQLite
    databases through SQLx. We have a suspected SQLite WAL-reset corruption
    issue under heavy concurrent writer load, especially when multiple
    subagents are active. The existing `sqlx 0.8.6` dependency kept us on an
    older `libsqlite3-sys` / bundled SQLite, so this PR moves the SQLx stack
    far enough forward to pick up the newer bundled SQLite library.
    
    ## What changed
    
    - Bump the workspace `sqlx` dependency to `0.9.0`.
    - Use the SQLx 0.9 feature names explicitly: `runtime-tokio`,
    `tls-rustls`, and `sqlite-bundled`.
    - Update `Cargo.lock` so `sqlx-sqlite` resolves through `libsqlite3-sys
    0.37.0`.
    - Refresh `MODULE.bazel.lock` for the dependency changes.
    - Adapt `codex-state` to SQLx 0.9:
      - build dynamic state queries with `QueryBuilder<Sqlite>` instead of
        passing dynamic `String`s to `sqlx::query`;
      - remove the old `QueryBuilder` lifetime parameter from helper
        signatures;
      - preserve SQLx's new `Migrator` fields when constructing runtime
        migrators.
    
    ## Verification
    
    - `just test -p codex-state`
    - `just bazel-lock-check`
    - `cargo check -p codex-state --tests`
  • fix(tui): complete vim word-end and line-end behavior (#24380)
    ## Why
    
    The TUI Vim composer currently diverges from normal Vim editing in two
    common workflows: pressing `e` repeatedly can remain stuck at an
    existing word end, and normal mode does not support `C` for changing
    through the end of the line. The existing `D` behavior also removes the
    newline when the cursor is already at the line boundary, which makes the
    new `C` action and existing deletion action surprising in multiline
    prompts.
    
    Closes #23926.
    Closes #24238.
    
    ## What Changed
    
    - Make normal-mode `e` advance from the current word end to the next
    word end, including for operator motions such as `de`.
    - Add configurable Vim normal-mode `change_to_line_end` behavior, bound
    to `C` by default, which deletes to the end of the current line and
    enters Insert mode.
    - Keep the newline intact when `D` or `C` is pressed at the end-of-line
    boundary.
    - Add regression coverage for repeated `e`, `de`, `C`, and the multiline
    `C`/`D` boundary behavior.
    - Regenerate the config schema and update the keymap picker snapshots
    for the new Vim action.
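
    The intended cursor and deletion behavior can be illustrated with a simplified sketch. It treats words as whitespace-delimited runs (closer to Vim's `E` than to the full `e` word classes) and assumes the cursor may sit on the newline boundary; both function names are hypothetical.

    ```python
    def next_word_end(text, pos):
        """Return the index of the next word end strictly after `pos`."""
        n = len(text)
        i = pos + 1
        # Skip whitespace between the current position and the next word.
        while i < n and text[i].isspace():
            i += 1
        if i >= n:
            return pos  # no further word: stay where we are
        # Walk to the last character of that word.
        while i + 1 < n and not text[i + 1].isspace():
            i += 1
        return i


    def delete_to_line_end(text, pos):
        """Delete from `pos` up to, but not including, the next newline."""
        end = text.find("\n", pos)
        if end == -1:
            end = len(text)
        return text[:pos] + text[end:]
    ```

    Starting from index 0 in `alpha beta gamma`, repeated calls to `next_word_end` land on the final characters of `alpha`, `beta`, and `gamma` in turn instead of staying on the first word end.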
    
    ## How to Test
    
    1. Run Codex with Vim composer mode enabled:
       ```bash
       cd codex-rs
       cargo run --bin codex -- -c tui.vim_mode_default=true
       ```
    2. Enter `alpha beta gamma`, press `Esc`, `0`, then press `e`
       repeatedly.
       Confirm the cursor advances through the ends of `alpha`, `beta`, and
       `gamma`.
    3. Enter `hello world`, press `Esc`, `0`, `w`, then `C`.
       Confirm `world` is deleted and the composer enters Insert mode.
    4. Enter a multiline prompt with `hello` above `world`, press `Esc`,
       `k`, `$`, and then `D`.
       Confirm the newline is preserved and the two lines do not join.
    5. At the same boundary, press `C` and type `!`.
       Confirm the composer enters Insert mode and yields `hello!` above
       `world`, preserving the newline.
    
    Targeted automated verification:
    - `just fix -p codex-tui`
    - `just argument-comment-lint-from-source -p codex-tui -p codex-config`
    - `cargo insta pending-snapshots` reports no pending snapshots.
    - `just test -p codex-tui` validates the new Vim and keymap snapshot
    coverage, but the command remains red due to two reproducible unrelated
    failures in `app::tests::update_feature_flags_disabling_guardian_*`.
    
    ## Validation Note
    
    The workspace-wide `just argument-comment-lint` form is currently
    blocked during Bazel analysis by the existing LLVM `compiler-rt` missing
    `include/sanitizer/*.h` failure; package-scoped source linting for the
    changed Rust crates passed.
  • TUI config cleanup: plugin marketplace (#24257)
    ## Why
    
    Plugin and marketplace mutations are applied by the app server, but
    several TUI follow-up paths still refreshed state from the TUI host
    config. In remote workspace mode, that can leave plugin UI state tied to
    stale client-local `config.toml` after the server has already applied
    the mutation.
    
    ## What
    
    - Stop reloading the TUI host config after app-server-owned plugin,
    marketplace, skill, and app mutations.
    - Use the same app-server-owned refresh path for local and remote
    sessions: ask the app server to reload user config where the running
    session needs it, then refetch plugin list/detail state from the app
    server.
    - Build plugin mention candidates from existing app-server `plugin/list`
    and `plugin/read` data in both local and remote sessions instead of
    TUI-host plugin config.
    - Avoid the duplicate local config reload after `ReloadUserConfig` asks
    the app server to reload config.
    
    ## Verification
    
    Manually launched a local WebSocket app-server with a temp server
    `CODEX_HOME`, launched the TUI with a separate temp host `CODEX_HOME`
    and `--remote`, installed a sample plugin from a temp local marketplace
    through `/plugins`, and confirmed the TUI refreshed to installed state
    while only the server config gained `[plugins."sample@debug"]`. Trace
    logs showed the TUI using app-server `plugin/list` and `plugin/read` for
    the refresh path.
  • Drop startup context when truncating forked rollouts (#24751)
    ## Summary
    - Change last-`n` fork truncation to start at the first fork-turn
    boundary instead of returning the full rollout when the fork history is
    shorter than the requested window.
    - Add coverage for the startup-prefix case in both rollout truncation
    tests and agent control spawn behavior.
    - Ensure bounded forked children still rebuild context after the cached
    prefix is truncated.
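
    One way to picture the new truncation rule is the sketch below, which models a rollout as a list of items and takes a caller-supplied predicate that marks fork-turn boundaries. The function name, the item representation, and the empty result when no boundary exists are all assumptions, not the actual rollout code.

    ```python
    def truncate_to_last_n_fork_turns(items, n, is_turn_start):
        """Keep the last `n` fork turns, never the startup prefix before them."""
        if n < 1:
            raise ValueError("n must be at least 1")
        boundaries = [i for i, item in enumerate(items) if is_turn_start(item)]
        if not boundaries:
            # Assumption: with no fork turn at all, only startup context exists.
            return []
        if len(boundaries) >= n:
            start = boundaries[-n]
        else:
            # Fewer turns than requested: start at the first boundary instead
            # of returning the full rollout.
            start = boundaries[0]
        return items[start:]
    ```

    With a history shorter than the requested window, the result begins at the first turn boundary, so startup context that precedes it is dropped.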
    
    ## Testing
    - Added unit coverage for truncation behavior when the parent history is
    under the requested fork-turn limit.
    - Added an agent control test covering bounded fork spawn behavior with
    startup context present.
    - Not run (not requested).
  • feat: add thread idle lifecycle hook (#24744)
    ## Why
    
    Extensions can currently observe thread start, resume, and stop, but
    they do not have a lifecycle point for the host to say that immediately
    pending thread work has drained. That makes idle follow-up behavior
    harder to express as extension-owned logic instead of host-specific
    plumbing.
    
    This adds an explicit idle lifecycle hook so an extension can react when
    a thread becomes idle while the host keeps ownership of whether any
    submitted follow-up input starts a turn, is queued, or is ignored.
    
    ## What changed
    
    - Added `ThreadIdleInput` with access to the session-scoped and
    thread-scoped extension stores.
    - Added a default `on_thread_idle` method to
    `ThreadLifecycleContributor`.
    - Re-exported `ThreadIdleInput` from the extension API surface.
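
    The shape of a default lifecycle hook can be sketched in Python for illustration, although the real API is a Rust trait. The class and field names below mirror the names in this description, but their structure here is an assumption.

    ```python
    from dataclasses import dataclass, field


    @dataclass
    class ThreadIdleInput:
        """Illustrative input: session-scoped and thread-scoped extension stores."""

        session_store: dict = field(default_factory=dict)
        thread_store: dict = field(default_factory=dict)


    class ThreadLifecycleContributor:
        """Base contributor; the idle hook defaults to doing nothing."""

        def on_thread_idle(self, idle_input):
            return None


    class IdleCounter(ThreadLifecycleContributor):
        """Hypothetical extension that counts how often its thread went idle."""

        def on_thread_idle(self, idle_input):
            count = idle_input.thread_store.get("idle_count", 0) + 1
            idle_input.thread_store["idle_count"] = count
            return count
    ```

    Because the base method is a no-op, existing contributors keep working unchanged and only extensions that care about idleness need to override it.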
    
    ## Testing
    
    Not run; this only extends the extension API trait surface with a
    default hook and exported input type.
  • Fix guardian review test user input (#24746)
    ## Summary
    - Add the missing `additional_context` field to the guardian review
    `Op::UserInput` test initializer.
    
    ## Test plan
    - `just fmt`
    - `just test -p codex-core guardian_review`
    - `just test -p codex-core` (compiles, then fails on local environment
    issues: `sandbox-exec` "Operation not permitted", a missing
    `test_stdio_server` helper binary, and unrelated timeouts)
  • feat: handle goal usage limits in goal extension (#24628)
    ## Why
    
    The extracted goal runtime needs a host-callable path for turns that
    stop because the workspace usage limit is reached. In that case, any
    in-turn goal progress should be accounted before the goal becomes
    terminal, and active goal accounting must be cleared so later
    tool-finish or turn-stop handling does not keep charging usage to a
    stopped goal.
    
    ## What changed
    
    - Adds `GoalRuntimeHandle::usage_limit_active_goal_for_turn`, which
    accounts current active-goal progress, marks the active or
    budget-limited thread goal as `UsageLimited`, records terminal metrics
    when the status changes, clears active goal accounting, and emits the
    updated goal event.
    - Covers both active and budget-limited goals in
    `ext/goal/tests/goal_extension_backend.rs`, including the invariant that
    later token/tool events do not add usage after the goal has been
    usage-limited.
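
    The accounting invariant can be sketched with a small state holder. This is an illustrative model only: the class, its fields, and the status strings are assumptions, and the real handle also records terminal metrics and emits a goal event.

    ```python
    class GoalAccounting:
        """Toy model of one thread goal and its in-turn usage accounting."""

        def __init__(self, status="active"):
            self.status = status          # "active" or "budget_limited"
            self.tokens_used = 0          # usage already charged to the goal
            self._pending = 0             # in-turn progress not yet accounted
            self._accounting_active = True

        def record_tokens(self, count):
            # Later token/tool events are ignored once accounting is cleared.
            if self._accounting_active:
                self._pending += count

        def usage_limit_active_goal_for_turn(self):
            if self.status not in ("active", "budget_limited"):
                return None
            # Account current progress before the goal becomes terminal.
            self.tokens_used += self._pending
            self._pending = 0
            self.status = "usage_limited"
            self._accounting_active = False
            return {"status": self.status, "tokens_used": self.tokens_used}
    ```

    Clearing the accounting flag in the same step that marks the goal terminal is what keeps later tool-finish or turn-stop handling from charging a stopped goal.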
    
    ## Testing
    
    - Added
    `usage_limit_active_goal_accounts_progress_and_clears_accounting`.
    - Added `usage_limit_budget_limited_goal_accounts_remaining_progress`.
  • Revert "Add Bedrock Mantle GovCloud region (#23860)" (#24690)
    This reverts commit 5381240f57. GovCloud should not be supported.
    
  • fix(auto-review) skip legacy notify for auto review threads (#24714)
    ## Summary
    Clear inherited legacy `notify` from Guardian review session config,
    since we should not be passing auto review threads into `notify`
    targets. Keeps legacy notify payload and hook runtime behavior unchanged
    for normal user turns.
    
    ## Testing
    - [x] add a Guardian config regression and dedicated Guardian
    integration test so review sessions cannot inherit parent notify hooks
  • Allow runtime enablement for remote plugins (#24707)
    `experimentalFeature/enablement/set` now accepts `remote_plugin` as a
    supported runtime feature key.
  • fix: add noninteractive install script mode (#21567)
    # Summary
    
    The Codex standalone installers can pause after installation to ask
    about an older managed install or launching Codex. That makes unattended
    bootstrap and update flows hard to complete reliably.
    
    This PR adds noninteractive installer control on macOS/Linux and Windows
    through `CODEX_NON_INTERACTIVE=1`. Noninteractive operation is
    environment-only, which gives automated callers one stable way to
    suppress prompts. When a noninteractive install leaves an older npm,
    bun, or brew-managed Codex installed, the standalone bin is configured
    ahead of that command on `PATH` so the newly installed Codex is the one
    future launches select. It also supports `CODEX_RELEASE` for callers
    that select a release through environment variables while retaining the
    existing explicit release inputs. Release selection accepts `latest`,
    stable `x.y.z` versions, and Codex prereleases written as
    `rust-v0.134.0-alpha.3`, `v0.134.0-alpha.3`, or `0.134.0-alpha.3`; it
    validates that shape before constructing release requests.
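
    The release-selector shapes listed above can be checked with a short validator. This is a sketch in Python rather than the installers' shell or PowerShell code, `normalize_release` is a hypothetical name, and it accepts only the shapes named in this description.

    ```python
    import re

    # Stable releases are written as bare x.y.z.
    _STABLE = re.compile(r"\d+\.\d+\.\d+")
    # Prereleases may carry an optional "rust-v" or "v" prefix.
    _PRERELEASE = re.compile(
        r"(?:rust-v|v)?(\d+\.\d+\.\d+-[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*)"
    )


    def normalize_release(selector):
        """Validate a release selector and return it without any prefix."""
        if selector == "latest":
            return "latest"
        if _STABLE.fullmatch(selector):
            return selector
        match = _PRERELEASE.fullmatch(selector)
        if match:
            return match.group(1)
        # Reject anything else before it can reach a release request.
        raise ValueError(f"unsupported release selector: {selector!r}")
    ```

    Validating the shape first means a malformed value fails fast instead of being interpolated into a download URL.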
    
    # Stack
    
    1. [#21567](https://github.com/openai/codex/pull/21567) - Adds release
    and noninteractive environment controls to the installers. (current)
    2. [#24637](https://github.com/openai/codex/pull/24637) - Runs
    standalone updater installs with `CODEX_NON_INTERACTIVE=1`.
    3. [#24639](https://github.com/openai/codex/pull/24639) - Removes
    explicit release argument inputs in favor of `CODEX_RELEASE`.
    
    # Evidence
    
    | Before | After |
    | --- | --- |
    | ![Interactive install
    prompts](https://github.com/user-attachments/assets/feecb45a-7087-4681-8775-ba57b07e97fa)
    | ![Noninteractive install completes without
    prompts](https://github.com/user-attachments/assets/53dcc791-383a-46e2-9a95-3b37b80ae053)
    |
    
    Environment-controlled macOS install with an existing npm-managed Codex
    on `PATH`:
    
    
    https://github.com/user-attachments/assets/442e0b5b-4a32-4bf5-996b-68784777380d
    
    # Design decisions
    
    Windows installs using the older standalone bin layout still require an
    interactive migration confirmation. Noninteractive mode does not
    auto-migrate that existing directory because replacing it is a
    destructive transition for an early, limited-use layout; unattended
    installs on that layout fail with an instruction to rerun interactively.
    
    # Testing
    
    Tests: installer syntax validation, release-selector acceptance and
    rejection coverage including PowerShell `Latest` compatibility, macOS
    live-terminal installer smoke testing with environment-controlled stable
    and prerelease installation and competing PATH precedence, shell
    rejection of the omitted noninteractive flag, and Windows ARM64
    PowerShell smoke testing with environment-only noninteractive behavior,
    retained release input, and competing PATH precedence through Parallels.
  • Uprev Rust toolchain pins to 1.95.0 (#24684)
    ## Summary
    - Bump the workspace Rust toolchain from `1.93.0` to `1.95.0` across
    Cargo, Bazel, CI, release workflows, devcontainers, and the Codex
    environment config.
    - Refresh `MODULE.bazel.lock` so the Bazel Rust toolchain artifacts
    match the new version.
    - Leave purpose-specific toolchains unchanged, including the
    `argument-comment-lint` nightly and the upstream `rusty_v8` `1.91.0`
    build pin.
    - Includes fixes for new lints from `just fix` and a few codex-authored
    fixes for lints without a suggestion.
  • fix(core): instrument stalled tool-listing handoff (#24667)
    ## Why
    
    When a turn needs a follow-up request after tool output is recorded,
    Codex can still appear stuck in `Thinking` before the next `/responses`
    request is opened. The existing local trace showed the last completed
    response and the absence of a new backend request, but it did not show
    whether the stall was in tool-router preparation or later request setup.
    
    Issue: N/A (internal incident investigation)
    
    ## What Changed
    
    Added trace spans around the pre-stream tool-router handoff in
    `core/src/session/turn.rs`, including the `built_tools` phase and the
    MCP manager read lock.
    
    Added per-server MCP tool-listing spans and trace breadcrumbs in
    `codex-mcp/src/connection_manager.rs` with startup snapshot /
    startup-complete state so a pending MCP client is visible in feedback
    logs instead of looking like a silent hang.
    
    ## Verification
    
    - `just fmt`
    - `just test -p codex-mcp`
    - `just test -p codex-core` (prior full rerun fails in this workspace on
    unrelated integration tests: code-mode output length expectations, one
    shell timeout formatting assertion, and shell snapshot timeouts; latest
    review-fix rerun compiled and passed 1160 tests before I stopped the
    abnormally slow unrelated suite)
  • fix: dont compact standalone websearch schema (#24660)
    Add `parse_tool_input_schema_without_compaction` to bypass the
    existing compaction/trimming of client-provided tool schemas that are
    over 4k bytes.
    
    We want this for standalone web search to keep guidance and metadata
    on certain fields; this keeps us closer to parity with the existing hosted
    tool schema (which did not go through this 4k-byte filter).
  • [codex] Remove obsolete goal continuation turn marker (#24658)
    ## Why
    
    `continuation_turn_id` was introduced to distinguish synthetic goal
    continuation turns for the no-tool continuation suppression heuristic.
    #20523 removed that heuristic, but left the marker behind. It is still
    written and cleared without affecting any runtime decision.
    
    ## What Changed
    
    - Remove `GoalRuntimeState::continuation_turn_id`.
    - Remove the marker setter/clearer and their now-no-op start, finish,
    and abort call sites.
    
    ## Testing
    
    - Not run yet (deferred at request).
  • [codex-analytics] add grouped session id to runtime events (#24655)
    ## Why
    - Runtime analytics events report `thread_id`, which identifies the
    individual thread emitting an event
    - They don't report `session_id`, which identifies the shared session
    for a root thread and its subagent threads
    - Emitting both identifiers allows analytics to group related activity
    
    ## What Changed
    - Adds `session_id` to relevant analytics events (`thread_initialized`,
    `turn`, `turn_steer`, `compaction`, `guardian_review`)
    - Tracks each thread's session ID in the analytics reducer so subsequent
    thread scoped events emit the same value
    - Carries the shared session ID through subagent initialization
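
    The reducer behavior can be sketched as a mapping from thread ID to session ID that is filled in at thread initialization and consulted for later events. The class and method names are hypothetical and the event payloads are reduced to plain dictionaries.

    ```python
    class SessionTrackingReducer:
        """Remembers each thread's session ID so later events can report it."""

        def __init__(self):
            self._session_by_thread = {}

        def on_thread_initialized(self, thread_id, session_id):
            # Subagent threads are initialized with the root thread's session ID.
            self._session_by_thread[thread_id] = session_id
            return self.event("thread_initialized", thread_id)

        def event(self, name, thread_id, **fields):
            """Build a thread-scoped event carrying both identifiers."""
            return {
                "event": name,
                "thread_id": thread_id,
                "session_id": self._session_by_thread.get(thread_id),
                **fields,
            }
    ```

    Grouping by `session_id` then collects a root thread and its subagents, while `thread_id` still distinguishes which thread emitted each event.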
    
    ## Verification
    - `just test -p codex-analytics` validates event payloads and subagent
    session grouping.
    - Focused `codex-app-server` tests validate session IDs for thread,
    turn, and steer events.
    - Focused `codex-core` tests validate root and subagent session ID
    propagation.
  • Restore legacy image detail values (#24644)
    ## Why
    
    Older persisted rollouts can contain `input_image.detail` values of
    `auto` or `low` from before `ImageDetail` was narrowed to
    `high`/`original`. Current deserialization rejects those values, which
    can make resume skip later compacted checkpoints and reconstruct an
    oversized raw suffix before the next compaction attempt.
    
    Confirmed Sentry reports fixed by this compatibility path:
    
    - [CODEX-1H3F](https://openai.sentry.io/issues/7500642496/)
    - [CODEX-1H6N](https://openai.sentry.io/issues/7501025347/)
    - [CODEX-1JDP](https://openai.sentry.io/issues/7504549065/)
    - [CODEX-1HW6](https://openai.sentry.io/issues/7503407986/)
    
    ## Background
    
    [openai/codex#20693](https://github.com/openai/codex/pull/20693) added
    image-detail plumbing for app-server `UserInput` so input images could
    explicitly request `detail: original`. The Slack discussion behind that
    PR was about ScreenSpot / bridge evals where user input images were
    resized, while tool output images already had MCP/code-mode ways to
    request image detail.
    
    In review, the intended new API surface was narrowed to `high` and
    `original`: default to `high`, allow `original` when callers need
    unchanged image handling, and avoid encouraging new `auto` or `low`
    usage. That policy still makes sense for newly emitted values.
    
    The missing compatibility piece is persisted history. Older rollouts can
    already contain `auto` and `low`, and resume reconstructs typed history
    by deserializing those rollout records. Rejecting old values at that
    boundary causes valid compacted checkpoints to be skipped. This PR
    restores `auto` and `low` as real variants so old records deserialize
    and round-trip without being rewritten as `high`, while product paths
    can continue to default to `high` and avoid emitting `auto` for new
    behavior.
    
    ## What changed
    
    - Restored `ImageDetail::Auto` and `ImageDetail::Low` as first-class
    protocol values.
    - Preserved `auto`/`low` through rollout deserialization, MCP image
    metadata, code-mode image output, and schema/type generation.
    - Kept local image byte handling conservative: only `original` switches
    to original-resolution loading; `auto`/`low`/`high` continue through the
    resize-to-fit path while retaining their detail value.
    - Added regression coverage for enum round-tripping and code-mode `low`
    detail handling.
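
    The compatibility rule can be illustrated with a Python enum standing in for the Rust `ImageDetail` type. The helper names are hypothetical; the four values and the rule that only `original` switches loading behavior come from the list above.

    ```python
    from enum import Enum


    class ImageDetail(Enum):
        AUTO = "auto"          # legacy value, restored for old rollouts
        LOW = "low"            # legacy value, restored for old rollouts
        HIGH = "high"
        ORIGINAL = "original"


    def parse_detail(raw):
        """Deserialize a persisted detail value; unknown values raise ValueError."""
        return ImageDetail(raw)


    def loads_original_resolution(detail):
        """Only `original` bypasses the resize-to-fit path."""
        return detail is ImageDetail.ORIGINAL
    ```

    Each value round-trips unchanged, so an old record with `low` is written back as `low` rather than being rewritten as `high`.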
    
    ## Testing
    
    - `just write-app-server-schema`
    - `just test -p codex-protocol`
    - `just test -p codex-tools`
    - `just test -p codex-code-mode`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-core
    suite::rmcp_client::stdio_image_responses_preserve_original_detail_metadata`
    - `just test -p codex-core
    suite::code_mode::code_mode_can_use_mcp_image_result_with_image_helper`
    - Loaded broken rollouts on local fixed builds, and started/completed
    new turns.
    
    I also attempted `just test -p codex-core`; the local broad run did not
    finish green: 2559 tests run, 2467 passed, 55 flaky, 91 failed, 1 timed
    out. The failures were broad timeout/deadline failures across unrelated
    areas; targeted changed-path core tests above passed.
  • Attach Windows sandbox log to feedback reports (#24623)
    ## Why
    
    Windows sandbox diagnostics are currently hard to recover from
    `/feedback` even though they are often the most useful artifact when
    debugging sandbox behavior. Now that sandbox logging uses daily rolling
    files, feedback can safely include the current day's sandbox log without
    uploading the old ever-growing legacy `sandbox.log`.
    
    ## What changed
    
    - Add a `codex-windows-sandbox` helper that resolves the current daily
    sandbox log from `codex_home`.
    - When feedback is submitted with logs enabled on Windows, app-server
    attaches today's sandbox log if it exists.
    - Upload the attachment under the stable filename `windows-sandbox.log`,
    independent of the dated on-disk filename.
    - Keep existing raw `extra_log_files` behavior unchanged for rollout and
    desktop log attachments.
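    
    A rough sketch of the attachment flow described above. The on-disk layout
    (`.sandbox/sandbox.<date>.log`), the function names, and the `Attachment`
    struct are assumptions for illustration; only the stable upload name
    `windows-sandbox.log` comes from this PR. The Windows-only gating is
    omitted so the sketch runs on any platform.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Stable upload filename, independent of the dated on-disk name.
    pub const UPLOAD_NAME: &str = "windows-sandbox.log";
    
    /// Hypothetical daily-log layout: `<codex_home>/.sandbox/sandbox.<date>.log`.
    pub fn current_log_path(codex_home: &Path, today: &str) -> PathBuf {
        codex_home
            .join(".sandbox")
            .join(format!("sandbox.{today}.log"))
    }
    
    /// Hypothetical attachment record: where to read from, what name to upload as.
    #[derive(Debug)]
    pub struct Attachment {
        pub upload_name: String,
        pub source: PathBuf,
    }
    
    /// Attach today's log only when logs are enabled and the file exists.
    pub fn sandbox_attachment(
        codex_home: &Path,
        today: &str,
        include_logs: bool,
    ) -> Option<Attachment> {
        if !include_logs {
            return None;
        }
        let source = current_log_path(codex_home, today);
        if source.is_file() {
            Some(Attachment {
                upload_name: UPLOAD_NAME.to_string(),
                source,
            })
        } else {
            None
        }
    }
    ```
    
    Decoupling `upload_name` from `source` keeps the attachment name constant
    for whoever reads the report, regardless of which day's file was picked up.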
    
    ## Verification
    
    - `cargo fmt -p codex-app-server -p codex-windows-sandbox`
    - `cargo test -p codex-windows-sandbox
    current_log_file_path_for_codex_home_uses_sandbox_dir`
    - `cargo test -p codex-app-server
    windows_sandbox_log_attachment_uses_current_log`
    - Manual CLI/TUI `/feedback` test confirmed Sentry received
    `windows-sandbox.log`.
  • [codex] remove plain image wrapper spans (#24652)
    ## Why
    
    Remote image submissions currently wrap native `input_image` spans in
    literal `<image>` and `</image>` text spans. Those extra prompt tokens
    add structure without providing label or routing information.
    
    ## What Changed
    
    - Serialize `UserInput::Image` directly as an `input_image` content
    span.
    - Preserve named local-image framing and legacy wrapper parsing for
    labeled attachments and existing histories.
    - Update existing request-shape expectations for drag-and-drop images,
    model switching, and compaction.
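    
    The serialization change can be pictured with the simplified sketch below.
    The types are stand-ins, not the real `codex-protocol` definitions, and the
    exact text of the named local-image framing is an assumption made up for
    the example.
    
    ```rust
    /// Simplified stand-in for user input items.
    #[derive(Debug, Clone, PartialEq)]
    pub enum UserInput {
        /// Remote image with no label.
        Image { image_url: String },
        /// Labeled local attachment (field names are hypothetical).
        LocalImage { label: String, image_url: String },
    }
    
    /// Simplified stand-in for request content spans.
    #[derive(Debug, Clone, PartialEq)]
    pub enum ContentSpan {
        InputText(String),
        InputImage(String),
    }
    
    pub fn to_spans(input: &UserInput) -> Vec<ContentSpan> {
        match input {
            // Plain images become a single native span with no wrapper text.
            UserInput::Image { image_url } => {
                vec![ContentSpan::InputImage(image_url.clone())]
            }
            // Labeled attachments keep their framing because the label carries
            // information. The framing text here is illustrative only.
            UserInput::LocalImage { label, image_url } => vec![
                ContentSpan::InputText(format!("<image name={label}>")),
                ContentSpan::InputImage(image_url.clone()),
                ContentSpan::InputText("</image>".to_string()),
            ],
        }
    }
    ```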
    
    ## Validation
    
    - `just test -p codex-protocol`
    - Focused `codex-core` run covering
    `drag_drop_image_persists_rollout_request_shape`,
    `model_change_from_image_to_text_strips_prior_image_content`, and
    `snapshot_request_shape_pre_turn_compaction_including_incoming_user_message`
    
    ## Notes
    
    - A broader `just test -p codex-core` run was attempted; the affected
    tests passed, while the overall run failed in unrelated CLI, MCP, and
    tooling tests plus a `thread_manager` timeout.
  • windows-sandbox: remove SandboxPolicy runner plumbing (#23813)
    ## Why
    
    The Windows sandbox runner still carried the old `SandboxPolicy`
    compatibility path even though core now computes `PermissionProfile`.
    That meant Windows command-runner execution could only see the legacy
    projection, so profile-only filesystem rules such as deny globs were not
    part of the runner input.
    
    ## What Changed
    
    - Removed the Windows-local `SandboxPolicy` parser/export and deleted
    `windows-sandbox-rs/src/policy.rs`.
    - Changed restricted-token capture/session setup, elevated setup,
    world-writable audit, read-root grant, and command-runner session APIs
    to accept `PermissionProfile` plus the profile cwd.
    - Bumped the elevated command-runner IPC protocol to version 2 because
    `SpawnRequest` now carries `permission_profile` /
    `permission_profile_cwd` instead of the legacy `policy_json_or_preset` /
    `sandbox_policy_cwd` fields.
    - Updated core exec, unified exec, debug-sandbox, TUI setup/grant flows,
    and app-server setup to pass the actual effective `PermissionProfile`.
    - Left regression coverage asserting the old IPC policy fields are
    absent and the runner serializes tagged `PermissionProfile` JSON.
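    
    To illustrate why the field change warrants a version bump, here is a
    minimal sketch of a version-gated request. The struct layout, the `version`
    field, and the `accept` function are assumptions; only the field names
    `permission_profile` and `permission_profile_cwd` and the version number 2
    come from this PR. The profile is held as a plain JSON string here purely
    to keep the example dependency-free.
    
    ```rust
    use std::path::PathBuf;
    
    /// Protocol version after this change.
    pub const IPC_PROTOCOL_VERSION: u32 = 2;
    
    /// Hypothetical shape of the spawn request sent to the elevated runner.
    #[derive(Debug, Clone)]
    pub struct SpawnRequest {
        pub version: u32,
        pub argv: Vec<String>,
        /// Tagged `PermissionProfile` JSON (replaces `policy_json_or_preset`).
        pub permission_profile: String,
        /// Replaces `sandbox_policy_cwd`.
        pub permission_profile_cwd: PathBuf,
    }
    
    /// Reject peers that speak a different protocol version, so an old client
    /// never has its legacy fields silently ignored.
    pub fn accept(req: &SpawnRequest) -> Result<(), String> {
        if req.version != IPC_PROTOCOL_VERSION {
            return Err(format!(
                "unsupported IPC protocol version {} (expected {})",
                req.version, IPC_PROTOCOL_VERSION
            ));
        }
        Ok(())
    }
    ```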
    
    ## Verification
    
    - `cargo test -p codex-windows-sandbox`
    - `cargo test -p codex-core windows_sandbox`
    - `cargo test -p codex-app-server
    request_processors::windows_sandbox_processor`
    - `just fix -p codex-windows-sandbox -p codex-core -p codex-app-server
    -p codex-cli -p codex-tui`
    - `just fix -p codex-cli -p codex-tui`
    - `just fix -p codex-windows-sandbox -p codex-tui`
    - `rg "\\bSandboxPolicy\\b" codex-rs/windows-sandbox-rs` returned no
    matches.
    
    Note: `cargo test -p codex-cli` was attempted but did not reach crate
    tests because local disk filled while compiling dependencies (`No space
    left on device`). The targeted clippy pass compiled the affected CLI/TUI
    surfaces afterward.
    
    
    
    
  • Avoid repeated marketplace upgrades for alternate layouts (#24320)
    Fixes #24249.
    
    ## Why
    
    Codex already supports discovering marketplaces under both
    `.agents/plugins/marketplace.json` and
    `.claude-plugin/marketplace.json`. The Git marketplace auto-upgrade
    no-op check only looked for the `.agents` layout. That meant an
    installed `.claude-plugin` marketplace with matching revision metadata
    still looked absent, so plugin list/startup upgrade work could stage and
    re-activate the same marketplace again.
    
    That matches the failure shape in #24249: the report called out repeated
    marketplace sync/cache refresh logs and a large recently-touched
    `.tmp/marketplaces/.staging` directory. This change makes the
    auto-upgrade path recognize the installed `.claude-plugin` marketplace
    as already current, which should remove that staging/activation feedback
    loop.
    
    ## What changed
    
    `codex-rs/core-plugins/src/marketplace_upgrade.rs` now uses the existing
    supported marketplace manifest discovery helper when deciding whether an
    installed Git marketplace is already current. Existing local plugin
    source validation is unchanged; `source: "./"` still remains invalid.
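    
    A minimal sketch of the corrected no-op check, assuming the revision
    comparison is a simple string match. The function names and the way
    revision metadata is passed in are hypothetical; the two manifest paths
    are the ones listed above.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Both supported manifest layouts, relative to the marketplace root.
    const MANIFEST_LAYOUTS: [&str; 2] = [
        ".agents/plugins/marketplace.json",
        ".claude-plugin/marketplace.json",
    ];
    
    /// Return the first manifest found under any supported layout.
    pub fn find_manifest(root: &Path) -> Option<PathBuf> {
        MANIFEST_LAYOUTS
            .iter()
            .map(|rel| root.join(rel))
            .find(|path| path.is_file())
    }
    
    /// The upgrade is a no-op only if a manifest exists under some supported
    /// layout and the installed revision matches the remote one.
    pub fn is_already_current(
        root: &Path,
        installed_rev: Option<&str>,
        remote_rev: &str,
    ) -> bool {
        find_manifest(root).is_some() && installed_rev == Some(remote_rev)
    }
    ```
    
    Checking only the first entry of `MANIFEST_LAYOUTS` reproduces the old
    behavior: a `.claude-plugin` install looks absent and gets re-staged on
    every pass.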
    
    ## Confidence
    
    Confidence is high that this fixes the repeated marketplace upgrade
    path: the old hardcoded layout check was definitely wrong for installed
    `.claude-plugin` marketplaces, and the reported staging churn points
    directly at that path.
    
    Confidence is not 100% because we do not have a CPU profile or a fully
    re-run reporter repro. A malformed marketplace entry can still be logged
    as invalid if another caller repeatedly lists plugins; this PR fixes the
    staging/upgrade feedback loop that likely made the failure pathological,
    not every possible source of repeated marketplace resolution.
  • TUI config cleanup: plugin mentions (#24266)
    ## Summary
    
    TUI plugin mention refresh still joined app-server plugin inventory with
    client-local plugin config, which can diverge once plugin state is owned
    by the app server.
    
    This changes the TUI to mirror the GUI client: `plugin/list` is the
    autocomplete source, and mention candidates are plugin-level entries
    filtered to installed, enabled, and not disabled by admin. The TUI no
    longer reads local plugin config or calls `plugin/read` while refreshing
    plugin mention candidates.
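    
    The filtering rule can be sketched as below. `PluginSummary` and its field
    names are hypothetical stand-ins for whatever `plugin/list` actually
    returns; the three conditions are the ones stated above.
    
    ```rust
    /// Hypothetical subset of a `plugin/list` entry.
    #[derive(Debug, Clone)]
    pub struct PluginSummary {
        pub name: String,
        pub installed: bool,
        pub enabled: bool,
        pub disabled_by_admin: bool,
    }
    
    /// Plugin-level mention candidates: installed, enabled, and not disabled
    /// by admin. No local config reads and no per-plugin `plugin/read` calls.
    pub fn mention_candidates(plugins: &[PluginSummary]) -> Vec<&str> {
        plugins
            .iter()
            .filter(|p| p.installed && p.enabled && !p.disabled_by_admin)
            .map(|p| p.name.as_str())
            .collect()
    }
    ```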
    
    ## API shape and limitations
    
    The current app-server API does not expose effective per-session plugin
    capability summaries for mention autocomplete. As in the GUI,
    autocomplete now trusts `plugin/list` metadata rather than proving which
    plugin capabilities are loaded in the active session.
    
    That avoids stale client-local reads and the cwd/remote detail gaps in
    `plugin/read`, but intentionally accepts the same list-level tradeoff as
    the app: if `plugin/list` reports a remote plugin before its local
    bundle is materialized, the plugin can still appear as a mention
    candidate.
  • make direct only allowed caller for standalone websearch (#24646)
    Only allow `Direct` callers of the standalone websearch tool, because it
    is not supported in code mode.
  • Add forked_from_thread_id turn metadata (#24160)
    ## Why
    
    When Codex calls the Responses API, it currently sends `session_id`,
    `thread_id`, and `turn_id`, among other fields, in
    `client_metadata["x-codex-turn-metadata"]`. This PR adds
    `forked_from_thread_id`, which helps explain the lineage of a forked
    thread.
    
    ## What's changed
    
    - Track the immediate history source copied into a forked thread through
    thread/session creation, including subagent and review turn metadata
    paths.
    - Include `forked_from_thread_id` in Codex turn metadata while
    preventing turn-scoped Responses API client metadata from overwriting
    Codex-owned lineage fields.
    - Add coverage for fork lineage in turn metadata and the app-server
    Responses API request path.
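    
    One way to picture the overwrite protection is the merge sketch below. It
    assumes the metadata is a flat string map and that the protected set is
    exactly the four identifiers named in this PR; both are simplifications,
    and `merge_turn_metadata` is a hypothetical name.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Keys owned by Codex (assumed set, for illustration only).
    const CODEX_OWNED_KEYS: [&str; 4] = [
        "session_id",
        "thread_id",
        "turn_id",
        "forked_from_thread_id",
    ];
    
    /// Merge turn-scoped client metadata into Codex turn metadata without
    /// letting the client replace Codex-owned fields.
    pub fn merge_turn_metadata(
        codex: &BTreeMap<String, String>,
        client: &BTreeMap<String, String>,
    ) -> BTreeMap<String, String> {
        let mut merged = codex.clone();
        for (key, value) in client {
            // Skip owned keys even when Codex did not set them, so a client
            // cannot inject a lineage value for a thread that was not forked.
            if CODEX_OWNED_KEYS.contains(&key.as_str()) {
                continue;
            }
            merged.insert(key.clone(), value.clone());
        }
        merged
    }
    ```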