221 Commits

  • [codex] Remove child AGENTS.md prompt experiment (#28993)
    ## Why
    
    `child_agents_md` is a disabled, under-development experiment that adds
    a second model-visible explanation of hierarchical `AGENTS.md` behavior.
    Keeping it leaves unused prompt, configuration, documentation, and test
    surface.
    
    ## What changed
    
    - remove the `ChildAgentsMd` feature and `child_agents_md` config schema
    entry
    - remove the hierarchical prompt asset, export, and instruction
    injection
    - remove feature-specific tests and documentation
    - keep the generic unstable-feature warning coverage using
    `apply_patch_streaming_events`
    
    Normal project `AGENTS.md` discovery and composition are unchanged.
    
    ## Testing
    
    - `just test -p codex-features`
    - `just test -p codex-prompts`
    - `just test -p codex-core agents_md`
    - `just test -p codex-core unstable_features_warning`
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • tui: make codex-tui.log opt-in (#24081)
    ## Why
    
    The TUI currently creates a shared plaintext `codex-tui.log` under the
    default log directory. That append-only file can keep growing across
    runs even though the TUI already records diagnostics in bounded local
    stores.
    
    Make the plaintext file log an explicit troubleshooting choice instead
    of a default side effect.
    
    This is possible because logs are also stored in the DB with proper
    rotation.
    
    ## What changed
    
    - Only install the TUI file logging layer when `log_dir` is explicitly
    set.
    - Remove the prior `codex-tui.log` at startup before an opt-in file
    layer is created.
    - Clarify the `log_dir` config/schema text and `docs/install.md` example
    so users opt in with `codex -c log_dir=...` when they need a plaintext
    log.
  • Prefer just test over cargo test in docs (#23910)
    `cargo test` for the core and other crates fails on a fresh macOS
    checkout without the right stack size variable. This change encourages
    using the `just test` command, which sets the environment up correctly.
    
    As a bonus, this should encourage agents to get more benefit out of
    nextest's parallel execution.
  • Add allow_managed_hooks_only hook requirement (#20319)
    ## Why
    
    Enterprise-managed hook policy needs a narrow way to require Codex to
    ignore user-controlled lifecycle hooks without adopting the broader
    trust-precedence model from earlier hook work. This keeps the policy
    anchored in `requirements.toml`, so admins can opt into managed hooks
    only while normal `config.toml` files cannot enable the restriction
    themselves.
    
    ## What changed
    
    - Added `allow_managed_hooks_only` to the requirements data flow and
    preserved explicit `false` values.
    - Added the setting to `/debug-config`.
    - Marked MDM, system, and legacy managed config layers as managed for
    hook discovery.
    - Updated hook discovery so `allow_managed_hooks_only = true`:
      - keeps managed requirements hooks and managed config-layer hooks,
      - skips user/project/session `hooks.json` and `[hooks]` entries with
        concise startup warnings,
      - skips current unmanaged plugin hooks,
      - ignores any `allow_managed_hooks_only` key placed in ordinary
        `config.toml` layers.
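
    The filtering rule above can be sketched roughly as follows. This is
    a minimal illustration only: `HookSource`, `Hook`, and `filter_hooks`
    are hypothetical names invented for this example and are not taken
    from the Codex codebase.

    ```rust
    /// Where a hook definition was discovered (hypothetical type).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum HookSource {
        ManagedRequirements,
        ManagedConfigLayer, // MDM, system, or legacy managed config
        User,
        Project,
        Session,
        UnmanagedPlugin,
    }

    impl HookSource {
        fn is_managed(self) -> bool {
            matches!(
                self,
                HookSource::ManagedRequirements | HookSource::ManagedConfigLayer
            )
        }
    }

    /// A discovered hook (hypothetical type).
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Hook {
        pub name: String,
        pub source: HookSource,
    }

    /// Returns the hooks to keep plus one warning per skipped hook.
    /// When the restriction is off, every hook is kept and no warning is emitted.
    pub fn filter_hooks(
        hooks: Vec<Hook>,
        allow_managed_hooks_only: bool,
    ) -> (Vec<Hook>, Vec<String>) {
        let mut kept = Vec::new();
        let mut warnings = Vec::new();
        for hook in hooks {
            if allow_managed_hooks_only && !hook.source.is_managed() {
                warnings.push(format!(
                    "skipping unmanaged hook `{}` ({:?})",
                    hook.name, hook.source
                ));
            } else {
                kept.push(hook);
            }
        }
        (kept, warnings)
    }
    ```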
  • Clarify docs folder guidance in AGENTS.md (#21772)
    ## Summary
    
    Codex keeps trying to add documentation to the `docs/` directory. With
    the exception of app server API documentation, the docs for Codex should
    not live in this repo. We don't want the local `docs/` folder to become
    a stale shadow of the official docs.
    
    This PR updates `AGENTS.md` to make that boundary explicit and scopes
    the existing API documentation guidance to app-server docs/examples. It
    also removes the extra `docs/config.md` sections that were recently
    added.
  • codex-otel: add configurable trace metadata (#21556)
    Add Codex config for static trace span attributes and structured W3C
    tracestate field upserts. The config flows through OtelSettings so
    callers can attach trace metadata without touching every span call site.
    
    Apply span attributes with an SDK span processor so every exported
    trace span carries the configured metadata. Model tracestate as nested
    member fields so configured keys can be upserted while unrelated
    propagated state in the same member is preserved.
    
    Validate configured tracestate before installing provider-global state,
    including header-unsafe values the SDK does not reject by itself. This
    keeps Codex from propagating malformed trace context from config.
    
    Update the config schema, public docs, and OTLP loopback coverage for
    config parsing, span export, propagation, and invalid-header rejection.
  • Ensure all mentions of cargo-install are --locked (#21592)
    There's already a preference for this in the codebase, but a few of them
    have drifted away. Generally `--locked` is preferred to reduce exposure
    to supply-chain attacks (and just generally improve reproducibility).
    
    In an ideal world these dependencies would maybe even be pinned to
    versions but Cargo is kinda bad at that for devtools. Still better to
    use --locked than not.
  • Document Codex git commit attribution config (#21379)
    ## Summary
    - document that commit attribution for generated git commit messages is
    gated by the `codex_git_commit` feature flag
    - add an example `config.toml` snippet showing `commit_attribution` with
    `[features].codex_git_commit = true`
    - update the config schema description so the reference docs explain
    that `commit_attribution` only takes effect when the feature is enabled
    
    Fixes #19799.
    
    ## Validation
    - `cargo run -p codex-core --bin codex-write-config-schema`
    - `cargo test -p codex-config`
    - `cargo test -p codex-features`
    - `cargo fmt --check`
    - `git diff --check`
    
    ## Notes
    - `cargo test -p codex-core config_schema_matches_fixture` currently
    fails before reaching the schema test because `core_test_support`
    imports `similar` without a linked crate in this checkout. The narrower
    package checks above avoid that unrelated test-support build failure.
  • Remove local docs and specs (#20896)
    ## Summary
    
    We should not check local-only docs or planning specs into this
    repository. Keeping those files here duplicates the canonical Codex
    documentation surface and makes transient implementation notes look like
    supported docs.
    
    This PR removes the local-only docs/spec files from `docs/` and trims
    `docs/config.md` back to links for the maintained configuration
    documentation on developers.openai.com.
  • deprecate legacy notify (#20524)
    # Why
    
    `notify` is the remaining compatibility surface from the legacy hook
    implementation. The newer lifecycle hook engine now owns the active hook
    system, so we should start steering users away from adding new `notify`
    configs before removing the old path entirely. This also adds a
    lightweight watchpoint for the deprecation so we can see how much legacy
    usage remains before the clean drop.
    
    # What
    
    - emit a startup deprecation notice when a non-empty `notify` command is
    configured
    - emit `codex.notify.configured` when a session starts with legacy
    `notify` configured
    - emit `codex.notify.run` when the legacy notify path fires after a
    completed turn
    - mark `notify` as deprecated in the config schema and repo docs
    - remove the orphaned `codex-rs/hooks/src/user_notification.rs` file
    that is no longer compiled
    - add regression coverage for the new deprecation notice
    
    # Next steps
    
    A follow-up PR can remove the legacy notify path entirely once we are
    ready for the clean drop. Before then, we can watch
    `codex.notify.configured` and `codex.notify.run` to understand the
    deprecation impact and remaining active usage. The cleanup PR should
    then delete the `notify` config field, the `legacy_notify`
    implementation, the old compatibility dispatch types and callsites that
    only exist for the legacy path, and the remaining compatibility
    docs/tests.
    
    # Testing
    
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-config`
    - `cargo test -p codex-core emits_deprecation_notice_for_notify`
  • Support disabling tool suggest for specific tools. (#20072)
    ## Summary
    - Add `disable_tool_suggest` to app and plugin config, schema, and
    TypeScript output
    - Exclude disabled connectors and plugins from tool suggestion discovery
    - Persist "never show again" tool-suggestion choices back into
    `config.toml`
    - Update config docs and add coverage for connector and plugin
    suppression
    
    ## Testing
    - Added and updated unit tests for config persistence and tool-suggest
    filtering
    - Not run (not requested)
  • Add server-level approval defaults for custom MCP servers (#17843)
    ## Summary
    - Add `default_tools_approval_mode` support for custom MCP server
    configs, matching the existing `codex_apps` behavior
    - Apply approval precedence as per-tool override, then server default,
    then `auto`
    - Update config serialization, CLI display, schema generation, docs, and
    tests
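
    A minimal sketch of the precedence rule described above. Only `auto`
    is named in this summary; the `Prompt` and `Approve` variants and the
    `resolve_approval_mode` function name are assumptions made for
    illustration.

    ```rust
    /// Approval modes. Only `Auto` is confirmed by the summary above;
    /// the other variants are placeholders for illustration.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ApprovalMode {
        Auto,
        Prompt,
        Approve,
    }

    /// Per-tool override first, then the server-level default, then `auto`.
    pub fn resolve_approval_mode(
        per_tool: Option<ApprovalMode>,
        server_default: Option<ApprovalMode>,
    ) -> ApprovalMode {
        per_tool.or(server_default).unwrap_or(ApprovalMode::Auto)
    }
    ```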
    
    ## Testing
    - `cargo check -p codex-config`
    - `cargo check -p codex-core`
    - `just write-config-schema`
    - `just fmt`
    - `cargo test -p codex-config`
    - Targeted `codex-core` tests for config parsing, config writes, and MCP
    approval precedence
    - `just fix -p codex-config -p codex-core`
  • Support original-detail metadata on MCP image outputs (#17714)
    ## Summary
    - honor `_meta["codex/imageDetail"] == "original"` on MCP image content
    and map it to `detail: "original"` where supported
    - strip that detail back out when the active model does not support
    original-detail image inputs
    - update code-mode `image(...)` to accept individual MCP image blocks
    - teach `js_repl` / `codex.emitImage(...)` to preserve the same hint
    from raw MCP image outputs
    - document the new `_meta` contract and add generic RMCP-backed coverage
    across protocol, core, code-mode, and js_repl paths
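
    The honor-then-strip behavior above might look roughly like this.
    It is a sketch under assumptions: `_meta` is modeled as a plain
    string map (the real value is likely JSON), and `resolve_image_detail`
    is a hypothetical helper name.

    ```rust
    use std::collections::HashMap;

    /// The `_meta` key named in the summary above.
    pub const IMAGE_DETAIL_META_KEY: &str = "codex/imageDetail";

    /// Returns `Some("original")` only when the MCP image block asked for
    /// original detail and the active model supports it. Otherwise returns
    /// `None`, meaning the `detail` field is omitted.
    pub fn resolve_image_detail(
        meta: &HashMap<String, String>,
        model_supports_original: bool,
    ) -> Option<&'static str> {
        let requested =
            meta.get(IMAGE_DETAIL_META_KEY).map(String::as_str) == Some("original");
        if requested && model_supports_original {
            Some("original")
        } else {
            None
        }
    }
    ```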
  • Add supports_parallel_tool_calls flag to included mcps (#17667)
    ## Why
    
    For more advanced MCP usage, we want the model to be able to emit
    parallel MCP tool calls and have Codex execute eligible ones
    concurrently, instead of forcing all MCP calls through the serial block.
    
    The main design choice was where to thread the config. I made this
    server-level because parallel safety depends on the MCP server
    implementation. Codex reads the flag from `mcp_servers`, threads the
    opted-in server names into `ToolRouter`, and checks the parsed
    `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
    on model-visible tool names, which can be incomplete in
    deferred/search-tool paths or ambiguous for similarly named
    servers/tools.
    
    ## What was added
    
    Added `supports_parallel_tool_calls` for MCP servers.
    
    Before:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    ```
    
    After:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    supports_parallel_tool_calls = true
    ```
    
    MCP calls remain serial by default. Only tools from opted-in servers are
    eligible to run in parallel. Docs also now warn to enable this only when
    the server’s tools are safe to run concurrently, especially around
    shared state or read/write races.
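
    The execution-time check described in the "Why" section can be
    sketched as below. This is an illustration, not the actual
    `ToolRouter` code: the `ToolPayload` enum here is a simplified
    stand-in, `can_run_in_parallel` is a hypothetical name, and non-MCP
    tools are treated as serial purely to keep the example small.

    ```rust
    use std::collections::HashSet;

    /// Simplified stand-in for the parsed tool payload.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum ToolPayload {
        Mcp { server: String, tool: String },
        Other,
    }

    /// Decides eligibility from the parsed server name rather than from
    /// the model-visible tool name. `parallel_servers` holds the names of
    /// servers configured with `supports_parallel_tool_calls = true`.
    pub fn can_run_in_parallel(
        payload: &ToolPayload,
        parallel_servers: &HashSet<String>,
    ) -> bool {
        match payload {
            ToolPayload::Mcp { server, .. } => parallel_servers.contains(server),
            ToolPayload::Other => false,
        }
    }
    ```

    Matching on the exact server name means a similarly named server,
    such as `docs2`, does not inherit the opt-in from `docs`.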
    
    ## Testing
    
    Tested with a local stdio MCP server exposing real delay tools. The
    model/Responses side was mocked only to deterministically emit two MCP
    calls in the same turn.
    
    Each test called `query_with_delay` and `query_with_delay_2` with `{
    "seconds": 25 }`.
    
    | Build/config | Observed | Wall time |
    | --- | --- | --- |
    | main with flag enabled | serial | `58.79s` |
    | PR with flag enabled | parallel | `31.73s` |
    | PR without flag | serial | `56.70s` |
    
    PR with flag enabled showed both tools start before either completed;
    main and PR-without-flag completed the first delay before starting the
    second.
    
    Also added an integration test.
    
    Additional checks:
    
    - `cargo test -p codex-tools` passed
    - `cargo test -p codex-core
    mcp_parallel_support_uses_exact_payload_server` passed
    - `git diff --check` passed
  • feat(tui): add reverse history search to composer (#17550)
    ## Problem
    
    The TUI had shell-style Up/Down history recall, but `Ctrl+R` did not
    provide the reverse incremental search workflow users expect from
    shells. Users needed a way to search older prompts without immediately
    replacing the current draft, and the interaction needed to handle async
    persistent history, repeated navigation keys, duplicate prompt text,
    footer hints, and preview highlighting without making the main composer
    file even harder to review.
    
    
    https://github.com/user-attachments/assets/5165affd-4c9a-46e9-adbd-89088f5f7b6b
    
    <img width="1227" height="722" alt="image"
    src="https://github.com/user-attachments/assets/8bc83289-eeca-47c7-b0c3-8975101901af"
    />
    
    ## Mental model
    
    `Ctrl+R` opens a temporary search session owned by the composer. The
    footer line becomes the search input, the composer body previews the
    current match only after the query has text, and `Enter` accepts that
    preview as an editable draft while `Esc` restores the draft that existed
    before search started. The history layer provides a combined offset
    space over persistent and local history, but search navigation exposes
    unique prompt text rather than every physical history row.
    
    ## Non-goals
    
    This change does not rewrite stored history, change normal Up/Down
    browsing semantics, add fuzzy matching, or add persistent metadata for
    attachments in cross-session history. Search deduplication is
    deliberately scoped to the active Ctrl+R search session and uses exact
    prompt text, so case, whitespace, punctuation, and attachment-only
    differences are not normalized.
    
    ## Tradeoffs
    
    The implementation keeps search state in the existing composer and
    history state machines instead of adding a new cross-module controller.
    That keeps ownership local and testable, but it means the composer still
    coordinates visible search status, draft restoration, footer rendering,
    cursor placement, and match highlighting while `ChatComposerHistory`
    owns traversal, async fetch continuation, boundary clamping, and
    unique-result caching. Unique-result caching stores cloned
    `HistoryEntry` values so known matches can be revisited without cache
    lookups; this is simple and robust for interactive search sizes, but it
    is not a global history index.
    
    ## Architecture
    
    `ChatComposer` detects `Ctrl+R`, snapshots the current draft, switches
    the footer to `FooterMode::HistorySearch`, and routes search-mode keys
    before normal editing. Query edits call `ChatComposerHistory::search`
    with `restart = true`, which starts from the newest combined-history
    offset. Repeated `Ctrl+R` or Up searches older; Down searches newer
    through already discovered unique matches or continues the scan.
    Persistent history entries still arrive asynchronously through
    `on_entry_response`, where a pending search either accepts the response,
    skips a duplicate, or requests the next offset.
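
    As a rough synchronous illustration of the newest-first scan with
    exact-text deduplication, consider the sketch below. It ignores the
    async persistent-history path entirely, assumes plain case-sensitive
    substring matching (the matching rule is not spelled out here), and
    uses a hypothetical `search_unique_older` function over an
    oldest-first slice.

    ```rust
    use std::collections::HashSet;

    /// Scans `history` (ordered oldest first) from newest to oldest and
    /// returns each distinct prompt containing `query`, newest first.
    /// An empty query yields no matches, so nothing is previewed.
    pub fn search_unique_older<'a>(history: &'a [String], query: &str) -> Vec<&'a str> {
        let mut matches = Vec::new();
        if query.is_empty() {
            return matches;
        }
        // Deduplicate on exact prompt text: case and whitespace are not normalized.
        let mut seen: HashSet<&str> = HashSet::new();
        for entry in history.iter().rev() {
            let text = entry.as_str();
            if text.contains(query) && seen.insert(text) {
                matches.push(text);
            }
        }
        matches
    }
    ```

    With the history `["git status", "cargo test", "git status", "Git
    Status "]`, the query `git` yields a single `git status` result,
    while `Git Status ` stays a distinct entry because its text differs.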
    
    The composer-facing pieces now live in
    `codex-rs/tui/src/bottom_pane/chat_composer/history_search.rs`, leaving
    `chat_composer.rs` responsible for routing and rendering integration
    instead of owning every search helper inline.
    `codex-rs/tui/src/bottom_pane/chat_composer_history.rs` remains the
    owner of stored history, combined offsets, async fetch state, boundary
    semantics, and duplicate suppression. Match highlighting is computed
    from the current composer text while search is active and disappears
    when the match is accepted.
    
    ## Observability
    
    There are no new logs or telemetry. The practical debug path is state
    inspection: `ChatComposer.history_search` tells whether the footer query
    is idle, searching, matched, or unmatched; `ChatComposerHistory.search`
    tracks selected raw offsets, pending persistent fetches, exhausted
    directions, and unique match cache state. If a user reports skipped or
    repeated results, first inspect the exact stored prompt text, the
    selected offset, whether an async persistent response is still pending,
    and whether a query edit restarted the search session.
    
    ## Tests
    
    The change is covered by focused `codex-tui` unit tests for opening
    search without previewing the latest entry, accepting and canceling
    search, no-match restoration, boundary clamping, footer hints,
    case-insensitive highlighting, local duplicate skipping, and persistent
    duplicate skipping through async responses. Snapshot coverage captures
    the footer-mode visual changes. Local verification used `just fmt`,
    `cargo test -p codex-tui history_search`, `cargo test -p codex-tui`, and
    `just fix -p codex-tui`.
  • Remove remaining custom prompt support (#16115)
    ## Summary
    - remove protocol and core support for discovering and listing custom
    prompts
    - simplify the TUI slash-command flow and command popup to built-in
    commands only
    - delete obsolete custom prompt tests, helpers, and docs references
    - clean up downstream event handling for the removed protocol events
  • Rename tui_app_server to tui (#16104)
    This is a follow-up to https://github.com/openai/codex/pull/15922. That
    previous PR deleted the old `tui` directory and left the new
    `tui_app_server` directory in place. This PR renames `tui_app_server` to
    `tui` and fixes up all references.
  • Remove the legacy TUI split (#15922)
    This is part 1 of 2 PRs that will delete the `tui` /
    `tui_app_server` split. This part simply deletes the existing `tui`
    directory and marks the `tui_app_server` feature flag as removed. I left
    the `tui_app_server` feature flag in place for now so its presence
    doesn't result in an error. It is simply ignored.
    
    Part 2 will rename the `tui_app_server` directory to `tui`. I did this as
    two parts to reduce visible code churn.
  • [mcp] Improve custom MCP elicitation (#15800)
    - [x] Support don't ask again for custom MCP tool calls.
    - [x] Don't run arc in yolo mode.
    - [x] Run arc for custom MCP tools in always allow mode.
  • client: extend custom CA handling across HTTPS and websocket clients (#14239)
    ## Stacked PRs
    
    This work is now effectively split across two steps:
    
    - #14178: add custom CA support for browser and device-code login flows,
    docs, and hermetic subprocess tests
    - #14239: extend that shared custom CA handling across Codex HTTPS
    clients and secure websocket TLS
    
    Note: #14240 was merged into this branch while it was stacked on top of
    this PR. This PR now subsumes that websocket follow-up and should be
    treated as the combined change.
    
    Builds on top of #14178.
    
    ## Problem
    
    Custom CA support landed first in the login path, but the real
    requirement is broader. Codex constructs outbound TLS clients in
    multiple places, and both HTTPS and secure websocket paths can fail
    behind enterprise TLS interception if they do not honor
    `CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently.
    
    This PR broadens the shared custom-CA logic beyond login and applies the
    same policy to websocket TLS, so the enterprise-proxy story is no longer
    split between “HTTPS works” and “websockets still fail”.
    
    ## What This Delivers
    
    Custom CA support is no longer limited to login. Codex outbound HTTPS
    clients and secure websocket connections can now honor the same
    `CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise
    proxy/intercept setups work more consistently end-to-end.
    
    For users and operators, nothing new needs to be configured beyond the
    same CA env vars introduced in #14178. The change is that more of Codex
    now respects them, including websocket-backed flows that were previously
    still using default trust roots.
    
    I also manually validated the proxy path locally with mitmproxy using:
    `CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem
    HTTPS_PROXY=http://127.0.0.1:8080 just codex`
    with mitmproxy installed via `brew install mitmproxy` and configured as
    the macOS system proxy.
    
    ## Mental model
    
    `codex-client` is now the owner of shared custom-CA policy for outbound
    TLS client construction. Reqwest callers start from the builder
    configuration they already need, then pass that builder through
    `build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the
    same module for a rustls client config when a custom CA bundle is
    configured.
    
    The env precedence is the same everywhere:
    - `CODEX_CA_CERTIFICATE` wins
    - otherwise fall back to `SSL_CERT_FILE`
    - otherwise use system roots
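
    The precedence above can be sketched as a small pure function. The
    `CaSource` enum and `select_ca_source` name are hypothetical, and
    treating an empty value as unset is an assumption of this sketch
    rather than something stated here.

    ```rust
    /// Which trust source was selected (hypothetical type).
    #[derive(Debug, PartialEq, Eq)]
    pub enum CaSource {
        CodexCaCertificate(String),
        SslCertFile(String),
        SystemRoots,
    }

    /// Applies the precedence: `CODEX_CA_CERTIFICATE`, then `SSL_CERT_FILE`,
    /// then system roots. `lookup` abstracts the environment so the rule
    /// can be tested without mutating process-wide state.
    pub fn select_ca_source(lookup: impl Fn(&str) -> Option<String>) -> CaSource {
        // Assumption: empty or whitespace-only values count as unset.
        let non_empty = |name: &str| lookup(name).filter(|value| !value.trim().is_empty());
        if let Some(path) = non_empty("CODEX_CA_CERTIFICATE") {
            CaSource::CodexCaCertificate(path)
        } else if let Some(path) = non_empty("SSL_CERT_FILE") {
            CaSource::SslCertFile(path)
        } else {
            CaSource::SystemRoots
        }
    }
    ```

    A caller reading the real environment could pass
    `|name| std::env::var(name).ok()` as the lookup.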
    
    The helper is intentionally narrow. It loads every usable certificate
    from the configured PEM bundle into the appropriate root store and
    returns either a configured transport or a typed error that explains
    what went wrong.
    
    ## Non-goals
    
    This does not add handshake-level integration tests against a live TLS
    endpoint. It does not validate that the configured bundle forms a
    meaningful certificate chain. It also does not try to force every
    transport in the repo through one abstraction; it extends the shared CA
    policy across the reqwest and websocket paths that actually needed it.
    
    ## Tradeoffs
    
    The main tradeoff is centralizing CA behavior in `codex-client` while
    still leaving adoption up to call sites. That keeps the implementation
    additive and reviewable, but it means the rule "outbound Codex TLS that
    should honor enterprise roots must use the shared helper" is still
    partly enforced socially rather than by types.
    
    For websockets, the shared helper only builds an explicit rustls config
    when a custom CA bundle is configured. When no override env var is set,
    websocket callers still use their ordinary default connector path.
    
    ## Architecture
    
    `codex-client::custom_ca` now owns CA bundle selection, PEM
    normalization, mixed-section parsing, certificate extraction, typed
    CA-loading errors, and optional rustls client-config construction for
    websocket TLS.
    
    The affected consumers now call into that shared helper directly rather
    than carrying login-local CA behavior:
    - backend-client
    - cloud-tasks
    - RMCP client paths that use `reqwest`
    - TUI voice HTTP paths
    - `codex-core` default reqwest client construction
    - `codex-api` websocket clients for both responses and realtime
    websocket connections
    
    The subprocess CA probe, env-sensitive integration tests, and shared PEM
    fixtures also live in `codex-client`, which is now the actual owner of
    the behavior they exercise.
    
    ## Observability
    
    The shared CA path logs:
    - which environment variable selected the bundle
    - which path was loaded
    - how many certificates were accepted
    - when `TRUSTED CERTIFICATE` labels were normalized
    - when CRLs were ignored
    - where client construction failed
    
    Returned errors remain user-facing and include the relevant env var,
    path, and remediation hint. That same error model now applies whether
    the failure surfaced while building a reqwest client or websocket TLS
    configuration.
    
    ## Tests
    
    Pure unit tests in `codex-client` cover env precedence and PEM
    normalization behavior. Real client construction remains in subprocess
    tests so the suite can control process env and avoid the macOS seatbelt
    panic path that motivated the hermetic test split.
    
    The subprocess coverage verifies:
    - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
    - fallback to `SSL_CERT_FILE`
    - single-cert and multi-cert bundles
    - malformed and empty-file errors
    - OpenSSL `TRUSTED CERTIFICATE` handling
    - CRL tolerance for well-formed CRL sections
    
    The websocket side is covered by the existing `codex-api` / `codex-core`
    websocket test suites plus the manual mitmproxy validation above.
    
    ---------
    
    Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
    Co-authored-by: Codex <noreply@openai.com>
  • login: add custom CA support for login flows (#14178)
    ## Stacked PRs
    
    This work is split across three stacked PRs:
    
    - #14178: add custom CA support for browser and device-code login flows,
    docs, and hermetic subprocess tests
    - #14239: broaden the shared custom CA path from login to other outbound
    `reqwest` clients across Codex
    - #14240: extend that shared custom CA handling to secure websocket TLS
    so websocket connections honor the same CA env vars
    
    Review order: #14178, then #14239, then #14240.
    
    Supersedes #6864.
    
    Thanks to @3axap4eHko for the original implementation and investigation
    here. Although this version rearranges the code and history
    significantly, the majority of the credit for this work belongs to them.
    
    ## Problem
    
    Login flows need to work in enterprise environments where outbound TLS
    is intercepted by an internal proxy or gateway. In those setups, system
    root certificates alone are often insufficient to validate the OAuth and
    device-code endpoints used during login. The change adds a
    login-specific custom CA loading path, but the important contracts
    around env precedence, PEM compatibility, test boundaries, and
    probe-only workarounds need to be explicit so reviewers can understand
    what behavior is intentional.
    
    For users and operators, the behavior is simple: if login needs to trust
    a custom root CA, set `CODEX_CA_CERTIFICATE` to a PEM file containing
    one or more certificates. If that variable is unset, login falls back to
    `SSL_CERT_FILE`. If neither is set, login uses system roots. Invalid or
    empty PEM files now fail with an error that points back to those
    environment variables and explains how to recover.
    
    ## What This Delivers
    
    Users can now make Codex login work behind enterprise TLS interception
    by pointing `CODEX_CA_CERTIFICATE` at a PEM bundle containing the
    relevant root certificates. If that variable is unset, login falls back
    to `SSL_CERT_FILE`, then to system roots.
    
    This PR applies that behavior to both browser-based and device-code
    login flows. It also makes login tolerant of the PEM shapes operators
    actually have in hand: multi-certificate bundles, OpenSSL `TRUSTED
    CERTIFICATE` labels, and bundles that include well-formed CRLs.
    
    ## Mental model
    
    `codex-login` is the place where the login flows construct ad hoc
    outbound HTTP clients. That makes it the right boundary for a narrow CA
    policy: look for `CODEX_CA_CERTIFICATE`, fall back to `SSL_CERT_FILE`,
    load every parseable certificate block in that bundle into a
    `reqwest::Client`, and fail early with a clear user-facing error if the
    bundle is unreadable or malformed.
    
    The implementation is intentionally pragmatic about PEM input shape. It
    accepts ordinary certificate bundles, multi-certificate bundles, OpenSSL
    `TRUSTED CERTIFICATE` labels, and bundles that also contain CRLs. It
    does not validate a certificate chain or prove a handshake; it only
    constructs the root store used by login.
    
    ## Non-goals
    
    This change does not introduce a general-purpose transport abstraction
    for the rest of the product. It does not validate whether the provided
    bundle forms a real chain, and it does not add handshake-level
    integration tests against a live TLS server. It also does not change
    login state management or OAuth semantics beyond ensuring the existing
    flows share the same CA-loading rules.
    
    ## Tradeoffs
    
    The main tradeoff is keeping this logic scoped to login-specific client
    construction rather than lifting it into a broader shared HTTP layer.
    That keeps the review surface smaller, but it also means future
    login-adjacent code must continue to use `build_login_http_client()` or
    it can silently bypass enterprise CA overrides.
    
    The `TRUSTED CERTIFICATE` handling is also intentionally a local
    compatibility shim. The rustls ecosystem does not currently accept that
    PEM label upstream, so the code normalizes it locally and trims the
    OpenSSL `X509_AUX` trailer bytes down to the certificate DER that
    `reqwest` can consume.
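
    One plausible shape for such a shim is sketched below. It is not
    the code from this PR: both function names are hypothetical, and the
    trailer trimming assumes the certificate is the first DER `SEQUENCE`
    in the decoded block, with any OpenSSL trust metadata following it.

    ```rust
    /// Rewrites OpenSSL `TRUSTED CERTIFICATE` PEM labels to `CERTIFICATE`.
    /// Returns the normalized text and whether anything changed.
    pub fn normalize_trusted_labels(pem: &str) -> (String, bool) {
        let normalized = pem
            .replace("-----BEGIN TRUSTED CERTIFICATE-----", "-----BEGIN CERTIFICATE-----")
            .replace("-----END TRUSTED CERTIFICATE-----", "-----END CERTIFICATE-----");
        let changed = normalized != pem;
        (normalized, changed)
    }

    /// Returns the leading DER `SEQUENCE` (tag, length, and contents),
    /// dropping any trailing bytes. Returns `None` if the header is
    /// malformed or the declared length exceeds the input.
    pub fn trim_to_first_der_element(der: &[u8]) -> Option<&[u8]> {
        if *der.first()? != 0x30 {
            return None; // a certificate starts with a SEQUENCE tag
        }
        let first_len_byte = *der.get(1)?;
        let (header_len, content_len) = if first_len_byte & 0x80 == 0 {
            // Short form: the byte itself is the content length.
            (2usize, first_len_byte as usize)
        } else {
            // Long form: the low 7 bits give the number of length bytes.
            let n = (first_len_byte & 0x7f) as usize;
            if n == 0 || n > std::mem::size_of::<usize>() {
                return None;
            }
            let len = der
                .get(2..2 + n)?
                .iter()
                .fold(0usize, |acc, byte| (acc << 8) | *byte as usize);
            (2 + n, len)
        };
        let total = header_len.checked_add(content_len)?;
        der.get(..total)
    }
    ```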
    
    ## Architecture
    
    `custom_ca.rs` is now the single place that owns login CA behavior. It
    selects the CA file from the environment, reads it, normalizes PEM label
    shape where needed, iterates mixed PEM sections with `rustls-pki-types`,
    ignores CRLs, trims OpenSSL trust metadata when necessary, and returns
    either a configured `reqwest::Client` or a typed error.
    
    The browser login server and the device-code flow both call
    `build_login_http_client()`, so they share the same trust-store policy.
    Environment-sensitive tests run through the `login_ca_probe` helper
    binary because those tests must control process-wide env vars and cannot
    reliably build a real reqwest client in-process on macOS seatbelt runs.
    
    ## Observability
    
    The custom CA path logs which environment variable selected the bundle,
    which file path was loaded, how many certificates were accepted, when
    `TRUSTED CERTIFICATE` labels were normalized, when CRLs were ignored,
    and where client construction failed. Returned errors remain user-facing
    and include the relevant path, env var, and remediation hint.
    
    This gives enough signal for three audiences:
    - users can see why login failed and which env/file caused it
    - sysadmins can confirm which override actually won
    - developers can tell whether the failure happened during file read, PEM
    parsing, certificate registration, or final reqwest client construction
    
    ## Tests
    
    Pure unit tests stay limited to env precedence and empty-value handling.
    Real client construction lives in subprocess tests so the suite remains
    hermetic with respect to process env and macOS sandbox behavior.
    
    The subprocess tests verify:
    - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
    - fallback to `SSL_CERT_FILE`
    - single-certificate and multi-certificate bundles
    - malformed and empty-bundle errors
    - OpenSSL `TRUSTED CERTIFICATE` handling
    - CRL tolerance for well-formed CRL sections
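    
    The precedence and fallback cases in the list above can be sketched as a
    pure function. This is a minimal illustration, not the code in
    `custom_ca.rs`: the function name `select_ca_bundle` and the injected
    `lookup` closure are hypothetical, and treating an empty value as unset
    is an assumption about what "empty-value handling" means.
    
    ```rust
    use std::path::PathBuf;
    
    /// Environment variables in precedence order, highest first.
    const CA_ENV_VARS: [&str; 2] = ["CODEX_CA_CERTIFICATE", "SSL_CERT_FILE"];
    
    /// Returns the winning variable name and its path, or `None` when no
    /// override is set. `lookup` stands in for `std::env::var(..).ok()` so
    /// the logic can be exercised without touching process-wide state.
    fn select_ca_bundle(
        lookup: impl Fn(&str) -> Option<String>,
    ) -> Option<(&'static str, PathBuf)> {
        CA_ENV_VARS.iter().find_map(|&name| {
            let value = lookup(name)?;
            // Assumption: an empty value counts as unset and falls through
            // to the next variable.
            if value.is_empty() {
                None
            } else {
                Some((name, PathBuf::from(value)))
            }
        })
    }
    ```
    
    Returning the variable name alongside the path mirrors the
    observability goal above: the caller can log which override won.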
    
    The named PEM fixtures under `login/tests/fixtures/` are shared by the
    tests so their purpose stays reviewable.
    
    ---------
    
    Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
    Co-authored-by: Codex <noreply@openai.com>
  • Persist js_repl codex helpers across cells (#14503)
    ## Summary
    
    This changes `js_repl` so saved references to `codex.tool(...)` and
    `codex.emitImage(...)` keep working across cells.
    
    Previously, those helpers were recreated per exec and captured that
    exec's `message.id`. If a persisted object or saved closure reused an
    old helper in a later cell, the nested tool/image call could fail with
    `js_repl exec context not found`.
    
    This patch:
    - keeps stable `codex.tool` and `codex.emitImage` helper identities in
    the kernel
    - resolves the current exec dynamically at call time using
    `AsyncLocalStorage`
    - adds regression coverage for persisted helper references across cells
    - updates the js_repl docs and project-doc instructions to describe the
    new behavior and its limits
    
    ## Why
    
    We already support persistent top-level bindings across `js_repl` cells,
    so persisted objects should be able to reuse `codex` helpers in later
    active cells. The bug was that helper identity was exec-scoped, not
    kernel-scoped.
    
    Using `AsyncLocalStorage` fixes the cross-cell reuse case without
    falling back to a single global active exec that could accidentally
    attribute stale background callbacks to the wrong cell.
  • Let models opt into original image detail (#14175)
    ## Summary
    
    This PR narrows original image detail handling to a single opt-in
    feature:
    
    - `image_detail_original` lets the model request `detail: "original"` on
    supported models
    - Omitting `detail` preserves the default resized behavior
    
    The model only sees `detail: "original"` guidance when the active model
    supports it:
    
    - JS REPL instructions include the guidance and examples only on
    supported models
    - `view_image` only exposes a `detail` parameter when the feature and
    model can use it
    
    The image detail API is intentionally narrow and consistent across both
    paths:
    
    - `view_image.detail` supports only `"original"`; otherwise omit the
    field
    - `codex.emitImage(..., detail)` supports only `"original"`; otherwise
    omit the field
    - Unsupported explicit values fail clearly at the API boundary instead
    of being silently reinterpreted
    - Explicit `detail: "original"` requests fall back to normal behavior
    when the feature is disabled or the model does not support original
    detail
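    
    The rules in the list above amount to a small decision table. The
    sketch below is illustrative only: `ImageDetail` and
    `resolve_image_detail` are hypothetical names, and the single
    `original_supported` flag is an assumed stand-in for "feature enabled
    and model supports original detail".
    
    ```rust
    #[derive(Debug, PartialEq)]
    enum ImageDetail {
        /// Default resized behavior.
        Default,
        /// Original-resolution image.
        Original,
    }
    
    /// Maps an optional `detail` argument to the behavior described above.
    fn resolve_image_detail(
        requested: Option<&str>,
        original_supported: bool,
    ) -> Result<ImageDetail, String> {
        match requested {
            // Omitting `detail` keeps the default resized behavior.
            None => Ok(ImageDetail::Default),
            // `"original"` is honored only when it can actually be used...
            Some("original") if original_supported => Ok(ImageDetail::Original),
            // ...and otherwise falls back to normal behavior.
            Some("original") => Ok(ImageDetail::Default),
            // Any other explicit value fails at the API boundary.
            Some(other) => Err(format!(
                "unsupported image detail {other:?}; omit the field or use \"original\""
            )),
        }
    }
    ```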
  • Add js_repl cwd and homeDir helpers (#14385)
    ## Summary
    
    This PR adds two read-only path helpers to `js_repl`:
    
    - `codex.cwd`
    - `codex.homeDir`
    
    They are exposed alongside the existing `codex.tmpDir` helper so the
    REPL can reference basic host path context without reopening direct
    `process` access.
    
    ## Implementation
    
    - expose `codex.cwd` and `codex.homeDir` from the js_repl kernel
    - make `codex.homeDir` come from the kernel process environment
    - pass session dependency env through js_repl kernel startup so
    `codex.homeDir` matches the env a shell-launched process would see
    - keep existing shell `HOME` population behavior unchanged
    - update js_repl prompt/docs and add runtime/integration coverage for
    the new helpers
  • Add realtime start instructions config override (#14270)
    - add `realtime_start_instructions` config support
    - thread it into realtime context updates, schema, docs, and tests
  • docs: remove auth login logging plan (#13810)
    ## Summary
    
    Remove `docs/auth-login-logging-plan.md`.
    
    ## Why
    
    The document was a temporary planning artifact. The durable rationale
    for the auth-login diagnostics work now lives in the code comments,
    tests, PR context, and existing implementation notes, so keeping the
    standalone plan doc adds duplicate maintenance surface.
    
    ## Testing
    
    - not run (docs-only deletion)
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add auth login diagnostics (#13797)
    ## Problem
    
    Browser login failures have historically left support with an incomplete
    picture. HARs can show that the browser completed OAuth and reached the
    localhost callback, but they do not explain why the native client failed
    on the final `/oauth/token` exchange. Direct `codex login` also relied
    mostly on terminal stderr and the browser error page, so even when the
    login crate emitted better sign-in diagnostics through TUI or app-server
    flows, the one-shot CLI path still did not leave behind an easy artifact
    to collect.
    
    ## Mental model
    
    This implementation treats the browser page, the returned `io::Error`,
    and the normal structured log as separate surfaces with different safety
    requirements. The browser page and returned error preserve the detail
    that operators need to diagnose failures. The structured log stays
    narrower: it records reviewed lifecycle events, parsed safe fields, and
    redacted transport errors without becoming a sink for secrets or
    arbitrary backend bodies.
    
    Direct `codex login` now adds a fourth support surface: a small
    file-backed log at `codex-login.log` under the configured `log_dir`.
    That artifact carries the same login-target events as the other
    entrypoints without changing the existing stderr/browser UX.
    
    ## Non-goals
    
    This does not add auth logging to normal runtime requests, and it does
    not try to infer precise transport root causes from brittle string
    matching. The scope remains the browser-login callback flow in the
    `login` crate plus a direct-CLI wrapper that persists those events to
    disk.
    
    This also does not try to reuse the TUI logging stack wholesale. The TUI
    path initializes feedback, OpenTelemetry, and other session-oriented
    layers that are useful for an interactive app but unnecessary for a
    one-shot login command.
    
    ## Tradeoffs
    
    The implementation favors fidelity for caller-visible errors and
    restraint for persistent logs. Parsed JSON token-endpoint errors are
    logged safely by field. Non-JSON token-endpoint bodies remain available
    to the returned error so CLI and browser surfaces still show backend
    detail. Transport errors keep their real `reqwest` message, but attached
    URLs are surgically redacted. Custom issuer URLs are sanitized before
    logging.
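    
    As a rough illustration of sanitizing a URL before it reaches the log,
    the sketch below keeps only scheme, host, and path. What the real code
    strips is not spelled out here, so dropping userinfo, query string, and
    fragment is an assumption, and `sanitize_url_for_log` is a hypothetical
    name. It uses only the standard library rather than a URL parser.
    
    ```rust
    /// Keeps `scheme://host/path` and drops userinfo, query, and fragment.
    /// Inputs without a scheme are not echoed at all.
    fn sanitize_url_for_log(raw: &str) -> String {
        let Some((scheme, rest)) = raw.split_once("://") else {
            return "<redacted>".to_string();
        };
        // Drop the fragment first, then the query string.
        let rest = rest.split('#').next().unwrap_or("");
        let rest = rest.split('?').next().unwrap_or("");
        // Separate authority from path at the first '/'.
        let (authority, path) = match rest.find('/') {
            Some(index) => rest.split_at(index),
            None => (rest, ""),
        };
        // Anything before the last '@' in the authority is userinfo.
        let host = authority.rsplit('@').next().unwrap_or(authority);
        format!("{scheme}://{host}{path}")
    }
    ```
    
    With this sketch,
    `sanitize_url_for_log("https://user:pw@auth.example.com/oauth/token?code=abc")`
    yields `https://auth.example.com/oauth/token`.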
    
    On the CLI side, the code intentionally duplicates a narrow slice of the
    TUI file-logging setup instead of sharing the full initializer. That
    keeps `codex login` easy to reason about and avoids coupling it to
    interactive-session layers that the command does not need.
    
    ## Architecture
    
    The core auth behavior lives in `codex-rs/login/src/server.rs`. The
    callback path now logs callback receipt, callback validation,
    token-exchange start, token-exchange success, token-endpoint non-2xx
    responses, and transport failures. App-server consumers still use this
    same login-server path via `run_login_server(...)`, so the same
    instrumentation benefits TUI, Electron, and VS Code extension flows.
    
    The direct CLI path in `codex-rs/cli/src/login.rs` now installs a small
    file-backed tracing layer for login commands only. That writes
    `codex-login.log` under `log_dir` with login-specific targets such as
    `codex_cli::login` and `codex_login::server`.
    
    ## Observability
    
    The main signals come from the `login` crate target and are
    intentionally scoped to sign-in. Structured logs include redacted issuer
    URLs, redacted transport errors, HTTP status, and parsed token-endpoint
    fields when available. The callback-layer log intentionally avoids
    `%err` on token-endpoint failures so arbitrary backend bodies do not get
    copied into the normal log file.
    
    Direct `codex login` now leaves a durable artifact for both failure and
    success cases. Example output from the new file-backed CLI path:
    
    Failing callback:
    
    ```text
    2026-03-06T22:08:54.143612Z  INFO codex_cli::login: starting browser login flow
    2026-03-06T22:09:03.431699Z  INFO codex_login::server: received login callback path=/auth/callback has_code=false has_state=true has_error=true state_valid=true
    2026-03-06T22:09:03.431745Z  WARN codex_login::server: oauth callback returned error error_code="access_denied" has_error_description=true
    ```
    
    Successful callback and token exchange:
    
    ```text
    2026-03-06T22:09:14.065559Z  INFO codex_cli::login: starting browser login flow
    2026-03-06T22:09:36.431678Z  INFO codex_login::server: received login callback path=/auth/callback has_code=true has_state=true has_error=false state_valid=true
    2026-03-06T22:09:36.436977Z  INFO codex_login::server: starting oauth token exchange issuer=https://auth.openai.com/ redirect_uri=http://localhost:1455/auth/callback
    2026-03-06T22:09:36.685438Z  INFO codex_login::server: oauth token exchange succeeded status=200 OK
    ```
    
    ## Tests
    
    - `cargo test -p codex-login`
    - `cargo clippy -p codex-login --tests -- -D warnings`
    - `cargo test -p codex-cli`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - manual direct `codex login` smoke tests for both a failing callback
    and a successful browser login
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Clarify js_repl image emission and encoding guidance (#13639)
    ## Summary
    
    This updates the `js_repl` prompt and docs to make the image guidance
    less confusing.
    
    ## What changed
    
    - Clarified that `codex.emitImage(...)` adds one image per call and can
    be called multiple times to emit multiple images.
    - Reworded the image-encoding guidance to be general `js_repl` advice
    instead of `ImageDetailOriginal`-specific behavior.
    - Updated the guidance to recommend JPEG at about quality 85 when lossy
    compression is acceptable, and PNG when transparency or lossless detail
    matters.
    - Mirrored the same wording in the public `js_repl` docs.
  • Harden js_repl emitImage to accept only data: URLs (#13507)
    ### Motivation
    
    - Prevent untrusted js_repl code from supplying arbitrary external URLs
    that the host would forward into model input, which could trigger
    external fetches or data exfiltration. This change narrows the
    emitImage contract to safe, self-contained data URLs.
    
    ### Description
    
    - Kernel: added `normalizeEmitImageUrl` and enforce that string-valued
    `codex.emitImage(...)` inputs and `input_image`/content-item paths only
    accept non-empty `data:` URLs; byte-based paths still produce data URLs
    as before (`kernel.js`).
    - Host: added `validate_emitted_image_url` and check `EmitImage`
    requests before creating `FunctionCallOutputContentItem::InputImage`,
    returning an error to the kernel if the URL is not a `data:` URL
    (`mod.rs`).
    - Tests/docs: added a runtime test
    `js_repl_emit_image_rejects_non_data_url` to assert rejection of
    non-data URLs and updated user-facing docs/instruction text to state
    `data URL` support instead of generic direct image URLs (`mod.rs`,
    `docs/js_repl.md`, `project_doc.rs`).
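    
    A host-side check of this kind can be sketched as follows. This is not
    the actual `validate_emitted_image_url`; the name `require_data_url`,
    the return type, and the error wording are hypothetical, and matching
    the `data:` scheme case-insensitively is an assumption.
    
    ```rust
    /// Accepts only non-empty `data:` URLs and returns the trimmed URL.
    fn require_data_url(url: &str) -> Result<&str, String> {
        let trimmed = url.trim();
        if trimmed.is_empty() {
            return Err("image URL must not be empty".to_string());
        }
        // `get(..5)` returns `None` for short or non-boundary prefixes
        // instead of panicking.
        let has_data_scheme = trimmed
            .get(..5)
            .map_or(false, |prefix| prefix.eq_ignore_ascii_case("data:"));
        if has_data_scheme {
            Ok(trimmed)
        } else {
            Err("image URL must be a data: URL".to_string())
        }
    }
    ```
    
    Rejecting at this boundary means an `https://` or `file://` string never
    becomes an input image item, regardless of what the kernel forwarded.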
    
    ### Testing
    
    - Ran `just fmt` in `codex-rs`; it completed successfully.
    - Added a runtime test (`cargo test -p codex-core
    js_repl_emit_image_rejects_non_data_url`) but executing the test in this
    environment failed due to a missing system dependency required by
    `codex-linux-sandbox` (the vendored `bubblewrap` build requires
    `libcap.pc` via `pkg-config`), so the test could not be run here.
    - Attempted a focused `cargo test` invocation with and without default
    features; both compile/test attempts were blocked by the same missing
    system `libcap` dependency in this environment.
    
    ------
    [Codex Task](https://chatgpt.com/codex/tasks/task_i_69a7837bce98832d91db92d5f76d6cbe)
  • Persist initialized js_repl bindings after failed cells (#13482)
    ## Summary
    
    - Change `js_repl` failed-cell persistence so later cells keep prior
    bindings plus only the current-cell bindings whose initialization
    definitely completed before the throw.
    - Preserve initialized lexical bindings across failed cells via
    module-namespace readability, including top-level destructuring that
    partially succeeds before a later throw.
    - Preserve hoisted `var` and `function` bindings only when execution
    clearly reached their declaration site, and preserve direct top-level
    pre-declaration `var` writes and updates through explicit write-site
    markers.
    - Preserve top-level `for...in` / `for...of` `var` bindings when the
    loop body executes at least once, using a first-iteration guard to avoid
    per-iteration bookkeeping overhead.
    - Keep prior module state intact across link-time failures and
    evaluation failures before the prelude runs, while still allowing failed
    cells that already recreated prior bindings to persist updates to those
    existing bindings.
    - Hide internal commit hooks from user `js_repl` code after the prelude
    aliases them, so snippets cannot spoof committed bindings by calling the
    raw `import.meta` hooks directly.
    - Add focused regression coverage for the supported failed-cell
    behaviors and the intentionally unsupported boundaries.
    - Update `js_repl` docs and generated instructions to describe the new,
    narrower failed-cell persistence model.
    
    ## Motivation
    
    We saw `js_repl` drop bindings that had already been initialized
    successfully when a later statement in the same cell threw, for example:
    
        const { context: liveContext, session } =
          await initializeGoogleSheetsLiveForTab(tab);
        // later statement throws
    
    That was surprising in practice because successful earlier work
    disappeared from the next cell.
    
    This change makes failed-cell persistence more useful without trying to
    model every possible partially executed JavaScript edge case. The
    resulting behavior is narrower and easier to reason about:
    
    - prior bindings are always preserved
    - lexical bindings persist when their initialization completed before
    the throw
    - hoisted `var` / `function` bindings persist only when execution
    clearly reached their declaration or a supported top-level `var` write
    site
    - failed cells that already recreated prior bindings can persist writes
    to those existing bindings even if they introduce no new bindings
    
    The detailed edge-case matrix stays in `docs/js_repl.md`. The
    model-facing `project_doc` guidance is intentionally shorter and focused
    on generation-relevant behavior.
    
    ## Supported Failed-Cell Behavior
    
    - Prior bindings remain available after a failed cell.
    - Initialized lexical bindings remain available after a failed cell.
    - Top-level destructuring like `const { a, b } = ...` preserves names
    whose initialization completed before a later throw.
    - Hoisted `function` bindings persist when execution reached the
    declaration statement before the throw.
    - Direct top-level pre-declaration `var` writes and updates persist, for
    example:
      - `x = 1`
      - `x += 1`
      - `x++`
    - Short-circuiting logical assignments persist only when the write
    branch actually runs.
    - Non-empty top-level `for...in` / `for...of` `var` loops persist their
    loop bindings.
    - Failed cells can persist updates to existing carried bindings after
    the prelude has run, even when the cell commits no new bindings.
    - Link failures and eval failures before the prelude do not poison
    `@prev`.
    
    ## Intentionally Unsupported Failed-Cell Cases
    
    - Hoisted function reads before the declaration, such as `foo(); ...;
    function foo() {}`
    - Aliasing or inference-based recovery from reads before declaration
    - Nested writes inside already-instrumented assignment RHS expressions
    - Destructuring-assignment recovery for hoisted `var`
    - Partial `var` destructuring recovery
    - Pre-declaration `undefined` reads for hoisted `var`
    - Empty top-level `for...in` / `for...of` loop vars
    - Nested or scope-sensitive pre-declaration `var` writes outside direct
    top-level expression statements
  • [js_repl] Support local ESM file imports (#13437)
    ## Summary
    - add `js_repl` support for dynamic imports of relative and absolute
    local ESM `.js` / `.mjs` files
    - keep bare package imports on the native Node path and resolved from
    REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then `cwd`),
    even when they originate from imported local files
    - restrict static imports inside imported local files to other local
    relative/absolute `.js` / `.mjs` files, and surface a clear error for
    unsupported top-level static imports in the REPL cell
    - run imported local files inside the REPL VM context so they can access
    `codex.tmpDir`, `codex.tool`, captured `console`, and Node-like
    `import.meta` helpers
    - reload local files between execs so later `await import("./file.js")`
    calls pick up edits and fixed failures, while preserving package/builtin
    caching and persistent top-level REPL bindings
    - make `import.meta.resolve()` self-consistent by allowing the returned
    `file://...` URLs to round-trip through `await import(...)`
    - update both public and injected `js_repl` docs to clarify the narrowed
    contract, including global bare-import resolution behavior for local
    absolute files
    
    ## Testing
    - `cargo test -p codex-core js_repl_`
    - built codex binary and verified behavior
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Make js_repl image output controllable (#13331)
    ## Summary
    
    Instead of always adding inner function call outputs to the model
    context, let JS code decide which ones to return.
    
    - Stop auto-hoisting nested tool outputs from `codex.tool(...)` into the
    outer `js_repl` function output.
    - Keep `codex.tool(...)` return values unchanged as structured JS
    objects.
    - Add `codex.emitImage(...)` as the explicit path for attaching an image
    to the outer `js_repl` function output.
    - Support emitting from a direct image URL, a single `input_image` item,
    an explicit `{ bytes, mimeType }` object, or a raw tool response object
    containing exactly one image.
    - Preserve existing `view_image` original-resolution behavior when JS
    emits the raw `view_image` tool result.
    - Suppress the special `ViewImageToolCall` event for `js_repl`-sourced
    `view_image` calls so nested inspection stays side-effect free until JS
    explicitly emits.
    - Update the `js_repl` docs and generated project instructions with both
    recommended patterns:
      - `await codex.emitImage(codex.tool("view_image", { path }))`
      - `await codex.emitImage({ bytes: await page.screenshot({ type: "jpeg", quality: 85 }), mimeType: "image/jpeg" })`
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/13050
    - 👉 `2` https://github.com/openai/codex/pull/13331
    -  `3` https://github.com/openai/codex/pull/13049
  • tui: preserve kill buffer across submit and slash-command clears (#12006)
    ## Problem
    
    Before this change, composer paths that cleared the textarea after
    submit or slash-command dispatch also cleared the textarea kill
    buffer. That meant a user could `Ctrl+K` part of a draft, trigger a
    composer action that cleared the visible draft, and then lose the
    ability to `Ctrl+Y` the killed text back.
    
    This was especially awkward for workflows where the user wants to
    temporarily remove text, run a composer action such as changing
    reasoning level or dispatching a slash command, and then restore the
    killed text into the now-empty draft.
    
    ## Mental model
    
    This change separates visible draft state from editing-history state.
    
    The visible draft includes the current textarea contents and text
    elements that should be cleared when the composer submits or
    dispatches a command. The kill buffer is different: it represents
    the most recent killed text and should survive those composer-driven
    clears so the user can still yank it back afterward.
    
    After this change, submit and slash-command dispatch still clear the
    visible textarea contents, but they no longer erase the most recent
    kill.
    
    ## Non-goals
    
    This does not implement a multi-entry kill ring or change the
    semantics of `Ctrl+K` and `Ctrl+Y` beyond preserving the existing
    yank target across these clears.
    
    It also does not change how submit, slash-command parsing, prompt
    expansion, or attachment handling work, except that those flows no
    longer discard the textarea kill buffer as a side effect of clearing
    the draft.
    
    ## Tradeoffs
    
    The main tradeoff is that clearing the visible textarea is no longer
    equivalent to fully resetting all editing state. That is intentional
    here, because submit and slash-command dispatch are composer
    actions, not requests to forget the user's most recent kill.
    
    The benefit is better editing continuity. The cost is that callers
    must understand that full-buffer replacement resets visible draft
    state but not the kill buffer.
    
    ## Architecture
    
    The behavioral change is in `TextArea`: full-buffer replacement now
    rebuilds text and elements without clearing `kill_buffer`.
    
    `ChatComposer` already clears the textarea after successful submit
    and slash-command dispatch by calling into those textarea
    replacement paths. With this change, those existing composer flows
    inherit the new behavior automatically: the visible draft is
    cleared, but the last killed text remains available for `Ctrl+Y`.
    
    The tests cover both layers:
    
    - `TextArea` verifies that the kill buffer survives full-buffer
    replacement.
    - `ChatComposer` verifies that it survives submit.
    - `ChatComposer` also verifies that it survives slash-command dispatch.
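    
    The contract these tests pin down can be shown with a toy model. This
    is a deliberately simplified sketch, not the real `TextArea`:
    `MiniTextArea` and its methods are hypothetical, text elements are
    omitted, and the cursor is a plain byte offset that must fall on a
    character boundary.
    
    ```rust
    #[derive(Default)]
    struct MiniTextArea {
        text: String,
        kill_buffer: String,
    }
    
    impl MiniTextArea {
        /// Full-buffer replacement: resets the visible draft only.
        fn set_text(&mut self, text: &str) {
            self.text.clear();
            self.text.push_str(text);
            // `kill_buffer` is deliberately left untouched.
        }
    
        /// `Ctrl+K` analogue: removes everything from byte offset `cursor`
        /// to the end of the draft and remembers it.
        fn kill_to_end(&mut self, cursor: usize) {
            self.kill_buffer = self.text.split_off(cursor);
        }
    
        /// `Ctrl+Y` analogue: appends the most recent kill to the draft.
        fn yank(&mut self) {
            self.text.push_str(&self.kill_buffer);
        }
    }
    ```
    
    Killing part of a draft, replacing the buffer with an empty string, and
    then yanking restores the killed text, which is the behavior the
    submit and slash-command tests rely on.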
    
    ## Observability
    
    There is no dedicated logging for kill-buffer preservation. The most
    direct way to reason about the behavior is to inspect textarea-wide
    replacement paths and confirm whether they treat the kill buffer as
    visible-buffer state or as editing-history state.
    
    If this regresses in the future, the likely failure mode is simple
    and user-visible: `Ctrl+Y` stops restoring text after submit or
    slash-command clears even though ordinary kill/yank still works
    within a single uninterrupted draft.
    
    ## Tests
    
    Added focused regression coverage for the new contract:
    
    - `kill_buffer_persists_across_set_text`
    - `kill_buffer_persists_after_submit`
    - `kill_buffer_persists_after_slash_command_dispatch`
    
    Local verification:
    - `just fmt`
    - `cargo test -p codex-tui`
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • notify: include client in legacy hook payload (#12968)
    ## Why
    
    The `notify` hook payload did not identify which Codex client started
    the turn. That meant downstream notification hooks could not distinguish
    between completions coming from the TUI and completions coming from
    app-server clients such as VS Code or Xcode. Now that the Codex App
    provides its own desktop notifications, it would be nice to be able to
    filter those out.
    
    This change adds that context without changing the existing payload
    shape for callers that do not know the client name, and keeps the new
    end-to-end test cross-platform.
    
    ## What changed
    
    - added an optional top-level `client` field to the legacy `notify` JSON
    payload
    - threaded that value through `core` and `hooks`; the internal session
    and turn state now carries it as `app_server_client_name`
    - set the field to `codex-tui` for TUI turns
    - captured `initialize.clientInfo.name` in the app server and applied it
    to subsequent turns before dispatching hooks
    - replaced the notify integration test hook with a `python3` script so
    the test does not rely on Unix shell permissions or `bash`
    - documented the new field in `docs/config.md`
    
    ## Testing
    
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-tui`
    - `cargo test -p codex-app-server
    suite::v2::initialize::turn_start_notify_payload_includes_initialize_client_name
    -- --exact --nocapture`
    - `cargo test -p codex-core` (`src/lib.rs` passed; `core/tests/all.rs`
    still has unrelated existing failures in this environment)
    
    ## Docs
    
    The public config reference on `developers.openai.com/codex` should
    mention that the legacy `notify` payload may include a top-level
    `client` field. The TUI reports `codex-tui`, and the app server reports
    `initialize.clientInfo.name` when it is available.
  • Log js_repl nested tool responses in rollout history (#12837)
    ## Summary
    
    - add tracing-based diagnostics for nested `codex.tool(...)` calls made
    from `js_repl`
    - emit a bounded, sanitized summary at `info!`
    - emit the exact raw serialized response object or error string seen by
    JavaScript at `trace!`
    - document how to enable these logs and where to find them, especially
    for `codex app-server`
    
    ## Why
    
    Nested `codex.tool(...)` calls inside `js_repl` are a debugging
    boundary: JavaScript sees the tool result, but that result is otherwise
    hard to inspect from outside the kernel.
    
    This change adds explicit tracing for that path using the repo’s normal
    observability pattern:
    - `info` for compact summaries
    - `trace` for exact raw payloads when deep debugging is needed
    
    ## What changed
    
    - `js_repl` now summarizes nested tool-call results across the response
    shapes it can receive:
      - message content
      - function-call outputs
      - custom tool outputs
      - MCP tool results and MCP error results
      - direct error strings
    - each nested `codex.tool(...)` completion logs:
      - `exec_id`
      - `tool_call_id`
      - `tool_name`
      - `ok`
      - a bounded summary struct describing the payload shape
    - at `trace`, the same path also logs the exact serialized response
    object or error string that JavaScript received
    - docs now include concrete logging examples for `codex app-server`
    - unit coverage was added for multimodal function output summaries and
    error summaries
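    
    To make "bounded summary" concrete, here is a minimal sketch for the
    plain-text case. The actual summary struct is not shown in this
    description, so `TextSummary`, its fields, and `summarize_text` are all
    hypothetical. The point is only that the `info` line carries a capped
    preview plus size metadata, while the full payload is left to `trace`.
    
    ```rust
    /// Hypothetical summary shape for a text payload.
    #[derive(Debug, PartialEq)]
    struct TextSummary {
        total_chars: usize,
        preview: String,
        truncated: bool,
    }
    
    /// Builds a preview of at most `max_preview_chars` characters.
    /// Counting `char`s rather than bytes avoids slicing inside a
    /// multi-byte UTF-8 sequence.
    fn summarize_text(text: &str, max_preview_chars: usize) -> TextSummary {
        let total_chars = text.chars().count();
        let preview: String = text.chars().take(max_preview_chars).collect();
        TextSummary {
            total_chars,
            preview,
            truncated: total_chars > max_preview_chars,
        }
    }
    ```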
    
    ## How to use it
    
    ### Summary-only logging
    
    Set:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=info
    ```
    
    For `codex app-server`, tracing output is written to the server process
    `stderr`.
    
    Example:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=info \
    LOG_FORMAT=json \
    codex app-server \
    2> /tmp/codex-app-server.log
    ```
    
    This emits bounded summary lines for nested `codex.tool(...)` calls.
    
    ### Full raw debugging
    
    Set:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=trace
    ```
    
    Example:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=trace \
    LOG_FORMAT=json \
    codex app-server \
    2> /tmp/codex-app-server.log
    ```
    
    At `trace`, you get:
    - the same `info` summary line
    - a `trace` line with the exact serialized response object seen by
    JavaScript
    - or the exact error string if the nested tool call failed
    
    ### Where the logs go
    
    For `codex app-server`, these logs go to process `stderr`, so redirect
    or capture `stderr` to inspect them.
    
    Example:
    
    ```sh
    RUST_LOG=codex_core::tools::js_repl=trace \
    LOG_FORMAT=json \
    /Users/fjord/code/codex/codex-rs/target/debug/codex app-server \
    2> /tmp/codex-app-server.log
    ```
    
    Then inspect:
    
    ```sh
    rg "js_repl nested tool call" /tmp/codex-app-server.log
    ```
    
    Without an explicit `RUST_LOG` override, these `js_repl` nested
    tool-call logs are typically not visible.
  • Agent jobs (spawn_agents_on_csv) + progress UI (#10935)
    ## Summary
    - Add agent job support: spawn a batch of sub-agents from CSV, auto-run,
    auto-export, and store results in SQLite.
    - Simplify workflow: remove run/resume/get-status/export tools; spawn is
    deterministic and completes in one call.
    - Improve exec UX: stable, single-line progress bar with ETA; suppress
    sub-agent chatter in exec.
    
    ## Why
    Enables map-reduce style workflows over arbitrarily large repos using
    the existing Codex orchestrator. This addresses review feedback about
    overly complex job controls and non-deterministic monitoring.
    
    ## Demo (progress bar)
    ```sh
    ./codex-rs/target/debug/codex exec \
      --enable collab \
      --enable sqlite \
      --full-auto \
      --progress-cursor \
      -c agents.max_threads=16 \
      -C /Users/daveaitel/code/codex \
      - <<'PROMPT'
    Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows:
    path = item-01..item-30, area = test.
    
    Then call spawn_agents_on_csv with:
    - csv_path: /tmp/agent_job_progress_demo.csv
    - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1."
    - output_csv_path: /tmp/agent_job_progress_demo_out.csv
    PROMPT
    ```
    
    ## Review feedback addressed
    - Auto-start jobs on spawn; removed run/resume/status/export tools.
    - Auto-export on success.
    - More descriptive tool spec + clearer prompts.
    - Avoid deadlocks on spawn failure; pending/running handled safely.
    - Progress bar no longer scrolls; stable single-line redraw.
    
    ## Tests
    - `cd codex-rs && cargo test -p codex-exec`
    - `cd codex-rs && cargo build -p codex-cli`
  • feat: discourage the use of the --all-features flag (#12429)
    ## Why
    
    Developers are frequently running low on disk space, and routine use of
    `--all-features` contributes to larger Cargo build caches in `target/`
    by compiling additional feature combinations.
    
    This change updates local workflow guidance to avoid `--all-features` by
    default and reserve it for cases where full feature coverage is
    specifically needed.
    
    ## What Changed
    
    - Updated `AGENTS.md` guidance for `codex-rs` to recommend `cargo test`
    / `just test` for full-suite local runs, and to call out the disk-usage
    cost of routine `--all-features` usage.
    - Updated the root `justfile` so `just fix` and `just clippy` no longer
    pass `--all-features` by default.
    - Updated `docs/install.md` to explicitly describe `cargo test
    --all-features` as an optional heavier-weight run (more build time and
    `target/` disk usage).
    
    ## Verification
    
    - Confirmed the `justfile` parses and the recipes list successfully with
    `just --list`.
  • Improve Plan mode reasoning selection flow (#12303)
    Addresses https://github.com/openai/codex/issues/11013
    
    ## Summary
    - add a Plan implementation path in the TUI that lets users choose
    reasoning before switching to Default mode and implementing
    - add Plan-mode reasoning scope handling (Plan-only override vs
    all-modes default), including config/schema/docs plumbing for
    `plan_mode_reasoning_effort`
    - remove the hardcoded Plan preset medium default and make the reasoning
    popup reflect the active Plan override as `(current)`
    - split the collaboration-mode switch notification UI hint into #12307
    to keep this diff focused
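
    The scope handling described above can be pictured with a small Python
    sketch. It is illustrative only: `effective_reasoning_effort` and the
    mode strings are hypothetical names, and the fallback to the all-modes
    default when no Plan override is set is an assumption drawn from the
    summary, not from the TUI code.

    ```python
    from typing import Optional


    def effective_reasoning_effort(
        mode: str, default_effort: str, plan_override: Optional[str]
    ) -> str:
        """Pick the reasoning effort for the active collaboration mode.

        Hypothetical sketch: `plan_override` stands in for the value of
        `plan_mode_reasoning_effort`, or None when it is not configured.
        """
        # A Plan-only override applies in Plan mode and nowhere else.
        if mode == "plan" and plan_override is not None:
            return plan_override
        # Assumption: with no override, Plan mode uses the all-modes default.
        return default_effort
    ```

    With `plan_mode_reasoning_effort = "medium"` and a default of `"high"`,
    this sketch yields `"medium"` in Plan mode and `"high"` elsewhere.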
    
    If I have `plan_mode_reasoning_effort = "medium"` set in my
    `config.toml`:
    <img width="699" height="127" alt="Screenshot 2026-02-20 at 6 59 37 PM"
    src="https://github.com/user-attachments/assets/b33abf04-6b7a-49ed-b2e9-d24b99795369"
    />
    
    If I don't have `plan_mode_reasoning_effort` set in my `config.toml`:
    <img width="704" height="129" alt="Screenshot 2026-02-20 at 7 01 51 PM"
    src="https://github.com/user-attachments/assets/88a086d4-d2f1-49c7-8be4-f6f0c0fa1b8d"
    />
    
    ## Codex author
    `codex resume 019c78a2-726b-7fe3-adac-3fa4523dcc2a`
  • docs: use --locked when installing cargo-nextest (#12377)
    ## What
    
    Updates the optional `cargo-nextest` install command in
    `docs/install.md`:
    
    - `cargo install cargo-nextest` -> `cargo install --locked
    cargo-nextest`
    
    ## Why
    
    The current docs command can fail during source install because recent
    `cargo-nextest` releases intentionally require `--locked`.
    
    Repro (macOS, but likely not platform-specific):
    - `cargo install cargo-nextest`
    - Fails with a compile error from `locked-tripwire` indicating:
      - `Nextest does not support being installed without --locked`
      - suggests `cargo install --locked cargo-nextest`
    
    Using the locked command succeeds:
    - `cargo install --locked cargo-nextest`
    
    ## How
    
    Single-line docs change in `docs/install.md` to match current
    `cargo-nextest` install requirements.
    
    ## Validation
    
    - Reproduced failure locally using a temporary `CARGO_HOME` directory
    (clean Cargo home)
    - Example command used: `CARGO_HOME=/tmp/cargo-home-test cargo install
    cargo-nextest`
    - Confirmed success with `cargo install --locked cargo-nextest`
  • js_repl: remove codex.state helper references (#12275)
    ## Summary
    
    This PR removes `codex.state` from the `js_repl` helper surface and
    removes all corresponding documentation/instruction references.
    
    ## Motivation
    
    Top-level bindings in `js_repl` now persist across cells, so the extra
    `codex.state` helper is redundant and adds unnecessary API/docs surface.
    
    ## Changes
    
    - Removed the long-lived `state` object from the Node kernel helper
    wiring.
    - Stopped exposing `codex.state` (and `context.state`) during `js_repl`
    execution.
    - Updated user-facing `js_repl` docs to remove `codex.state`.
    - Updated generated instruction text and related test expectations to
    list only:
      - `codex.tmpDir`
      - `codex.tool(name, args?)`
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/12300
    - 👉 `2` https://github.com/openai/codex/pull/12275
    -  `3` https://github.com/openai/codex/pull/12205
    -  `4` https://github.com/openai/codex/pull/12185
    -  `5` https://github.com/openai/codex/pull/10673
  • [js_repl] paths for node module resolution can be specified for js_repl (#11944)
    In `js_repl` mode, module resolution currently starts from
    `js_repl_kernel.js`, which is written to a per-kernel temp dir. This
    effectively means that bare imports will not resolve.
    
    This PR adds a new config option, `js_repl_node_module_dirs`, a list of
    directories that are tried (in order) to resolve a bare import. If none
    of them resolves the import, the current working directory of the thread
    is used.
    
    For example:
    ```toml
    js_repl_node_module_dirs = [
        "/path/to/node_modules/",
        "/other/path/to/node_modules/",
    ]
    ```
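
    The lookup order can be sketched roughly as follows. This is a minimal
    Python illustration, not the kernel's resolver: `resolve_bare_import` is
    a hypothetical helper that only checks whether a package directory
    exists instead of applying Node's full resolution rules, and treating
    the working-directory fallback as `<cwd>/node_modules` is an assumption.

    ```python
    from pathlib import Path
    from typing import Optional, Sequence


    def resolve_bare_import(
        name: str, module_dirs: Sequence[str], cwd: str
    ) -> Optional[Path]:
        """Return the first directory that contains package `name`, or None.

        Hypothetical sketch: configured dirs are tried in order, then the
        thread's working directory is used as a fallback.
        """
        # Assumption: the cwd fallback looks in <cwd>/node_modules.
        candidates = [Path(d) for d in module_dirs]
        candidates.append(Path(cwd) / "node_modules")
        for base in candidates:
            package_dir = base / name
            if package_dir.is_dir():
                return package_dir
        return None
    ```

    Because the list is walked in order, an earlier entry shadows a later
    one that provides the same package.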
  • tui: preserve remote image attachments across resume/backtrack (#10590)
    ## Summary
    This PR makes app-server-provided image URLs first-class attachments in
    TUI, so they survive resume/backtrack/history recall and are resubmitted
    correctly.
    
    <img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM"
    src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927"
    />
    
    The attached image can be deleted when backtracking:
    <img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM"
    src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27"
    />
    
    In both history and composer, remote images are rendered as normal
    `[Image #N]` placeholders, with numbering unified with local images.
    
    ## What changed
    - Plumb remote image URLs through TUI message state:
      - `UserHistoryCell`
      - `BacktrackSelection`
      - `ChatComposerHistory::HistoryEntry`
      - `ChatWidget::UserMessage`
    - Show remote images as placeholder rows inside the composer box (above
    textarea), and in history cells.
    - Support keyboard selection/deletion for remote image rows in composer
    (`Up`/`Down`, `Delete`/`Backspace`).
    - Preserve remote-image-only turns in local composer history (Up/Down
    recall), including restore after backtrack.
    - Ensure submit/queue/backtrack resubmit include remote images in model
    input (`UserInput::Image`), and keep request shape stable for
    remote-image-only turns.
    - Keep image numbering contiguous across remote + local images:
      - remote images occupy `[Image #1]..[Image #M]`
      - local images start at `[Image #M+1]`
      - deletion renumbers consistently.
    - In protocol conversion, increment shared image index for remote images
    too, so mixed remote/local image tags stay in a single sequence.
    - Simplify restore logic to trust in-memory attachment order (no
    placeholder-number parsing path).
    - Backtrack/replay rollback handling now queues trims through
    `AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred
    lines after trims, so overlay/transcript state stays consistent.
    - Trim trailing blank rendered lines from user history rendering to
    avoid oversized blank padding.
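
    The numbering rule above (remote images first, local images continuing
    the same sequence) can be sketched in Python. `label_images` is a
    hypothetical helper for illustration and does not mirror the TUI's
    actual types.

    ```python
    from typing import List, Tuple


    def label_images(
        remote_urls: List[str], local_paths: List[str]
    ) -> List[Tuple[str, str]]:
        """Pair each attachment with its placeholder, remote images first."""
        ordered = list(remote_urls) + list(local_paths)
        # One shared 1-based sequence: with M remote images, the first
        # local image is labelled [Image #M+1].
        return [
            (f"[Image #{index}]", source)
            for index, source in enumerate(ordered, start=1)
        ]
    ```

    Recomputing the labels after removing an entry gives the consistent
    renumbering described above: deleting the first of two remote images
    turns the remaining remote image into `[Image #1]` and shifts every
    local image down by one.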
    
    ## Docs + tests
    - Updated: `docs/tui-chat-composer.md` (remote image flow,
    selection/deletion, numbering offsets)
    - Added/updated tests across `tui/src/chatwidget/tests.rs`,
    `tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`,
    and `tui/src/bottom_pane/chat_composer.rs`
    - Added snapshot coverage for remote image composer states, including
    deleting the first of two remote images.
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-tui`
    
    ## Codex author
    `codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`
  • Add js_repl_tools_only model and routing restrictions (#10671)
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/10674
    -  `2` https://github.com/openai/codex/pull/10672
    - 👉 `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • Add js_repl host helpers and exec end events (#10672)
    ## Summary
    
    This PR adds host-integrated helper APIs for `js_repl` and updates model
    guidance so the agent can use them reliably.
    
    ### What’s included
    
    - Add `codex.tool(name, args?)` in the JS kernel so `js_repl` can call
    normal Codex tools.
    - Keep persistent JS state and scratch-path helpers available:
      - `codex.state`
      - `codex.tmpDir`
    - Wire `js_repl` tool calls through the standard tool router path.
    - Add/align `js_repl` execution completion/end event behavior with
    existing tool logging patterns.
    - Update dynamic prompt injection (`project_doc`) to document:
      - how to call `codex.tool(...)`
      - raw output behavior
      - image flow via `view_image` (`codex.tmpDir` +
      `codex.tool("view_image", ...)`)
      - stdio safety guidance (`console.log` / `codex.tool`, avoid direct
      `process.std*`)
    
    ## Why
    
    - Standardize JS-side tool usage on `codex.tool(...)`
    - Make `js_repl` behavior more consistent with existing tool execution
    and event/logging patterns.
    - Give the model enough runtime guidance to use `js_repl` safely and
    effectively.
    
    ## Testing
    
    - Added/updated unit and runtime tests for:
      - `codex.tool` calls from `js_repl` (including shell/MCP paths)
      - image handoff flow via `view_image`
      - prompt-injection text for `js_repl` guidance
      - execution/end event behavior and related regression coverage
    
    
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    -  `1` https://github.com/openai/codex/pull/10674
    - 👉 `2` https://github.com/openai/codex/pull/10672
    -  `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • Add feature-gated freeform js_repl core runtime (#10674)
    ## Summary
    
    This PR adds an **experimental, feature-gated `js_repl` core runtime**
    so models can execute JavaScript in a persistent REPL context across
    tool calls.
    
    The implementation integrates with existing feature gating, tool
    registration, prompt composition, config/schema docs, and tests.
    
    ## What changed
    
    - Added new experimental feature flag: `features.js_repl`.
    - Added freeform `js_repl` tool and companion `js_repl_reset` tool.
    - Gated tool availability behind `Feature::JsRepl`.
    - Added conditional prompt-section injection for JS REPL instructions
    via marker-based prompt processing.
    - Implemented JS REPL handlers, including freeform parsing and pragma
    support (timeout/reset controls).
    - Added runtime resolution order for Node:
      1. `CODEX_JS_REPL_NODE_PATH`
      2. `js_repl_node_path` in config
      3. `PATH`
    - Added JS runtime assets/version files and updated docs/schema.
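
    The Node resolution order listed above can be sketched as a simple
    precedence chain. This Python sketch is illustrative rather than the
    actual Rust implementation: `resolve_node_path` and its parameters are
    hypothetical, and it does not check that an overridden path exists or
    points at a usable Node binary.

    ```python
    import os
    import shutil
    from typing import Mapping, Optional


    def resolve_node_path(
        config_node_path: Optional[str] = None,
        env: Optional[Mapping[str, str]] = None,
    ) -> Optional[str]:
        """Return the Node binary to use, or None if nothing is found."""
        env = os.environ if env is None else env
        # 1. The environment variable wins over everything else.
        from_env = env.get("CODEX_JS_REPL_NODE_PATH")
        if from_env:
            return from_env
        # 2. Next, `js_repl_node_path` from config (passed in by the caller).
        if config_node_path:
            return config_node_path
        # 3. Finally, search PATH for a `node` executable.
        return shutil.which("node", path=env.get("PATH"))
    ```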
    
    ## Why
    
    This enables richer agent workflows that require incremental JavaScript
    execution with preserved state, while keeping rollout safe behind an
    explicit feature flag.
    
    ## Testing
    
    Coverage includes:
    
    - Feature-flag gating behavior for tool exposure.
    - Freeform parser/pragma handling edge cases.
    - Runtime behavior (state persistence across calls and top-level `await`
    support).
    
    ## Usage
    
    ```toml
    [features]
    js_repl = true
    ```
    
    Optional runtime override:
    
    - `CODEX_JS_REPL_NODE_PATH`, or
    - `js_repl_node_path` in config.
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/10674
    -  `2` https://github.com/openai/codex/pull/10672
    -  `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • tui: keep history recall cursor at line end (#11295)
    ## Summary
    - keep cursor at end-of-line after Up/Down history recall
    - allow continued history navigation when the cursor in recalled text is
    at the start or end boundary
    - add regression tests and document the history cursor contract in
    composer docs
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-tui --lib
    history_navigation_leaves_cursor_at_end_of_line`
    - `cargo test -p codex-tui --lib
    should_handle_navigation_when_cursor_is_at_line_boundaries`
    - `cargo test -p codex-tui` *(fails in existing integration test
    `suite::no_panic_on_startup::malformed_rules_should_not_panic` because
    `target/debug/codex` is not present in this environment)*
  • fix(tui): tab submits when no task running in steer mode (#10035)
    When steer mode is enabled, Tab used to queue input only while a task was
    running and otherwise did nothing. Treat Tab as an immediate submit when
    no task is running so input isn't dropped when the in-flight turn ends
    mid-typing.
    
    Adds a regression test and updates docs/tooltips.
  • fix(tui): rehydrate drafts and restore image placeholders (#9040)
    Fixes #9050
    
    When a draft is stashed with Ctrl+C, we now persist the full draft state
    (text elements, local image paths, and pending paste payloads) in local
    history. Up/Down recall rehydrates placeholder elements and attachments
    so styling remains correct and large pastes still expand on submit.
    Persistent (cross‑session) history remains text‑only.
    
    Backtrack prefills now reuse the selected user message’s text elements
    and local image paths, so image placeholders/attachments rehydrate when
    rolling back.
    
    External editor replacements keep only attachments whose placeholders
    remain and then normalize image placeholders to `[Image #1]..[Image #N]`
    to keep the attachment mapping consistent.
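
    The external-editor normalization can be sketched as below. This is a
    Python illustration under stated assumptions, not the TUI code:
    `normalize_after_edit` is a hypothetical helper, attachments are modeled
    as a map from the original placeholder number to a path, and ordering
    the survivors by first appearance in the edited text is an assumption.

    ```python
    import re
    from typing import Dict, List, Tuple

    PLACEHOLDER = re.compile(r"\[Image #(\d+)\]")


    def normalize_after_edit(
        text: str, attachments: Dict[int, str]
    ) -> Tuple[str, List[str]]:
        """Drop attachments whose placeholder vanished, then renumber from 1."""
        kept: List[str] = []
        renumbered: Dict[int, int] = {}

        def replace(match: "re.Match[str]") -> str:
            old = int(match.group(1))
            if old not in attachments:
                # No attachment behind this number: leave the text as typed.
                return match.group(0)
            if old not in renumbered:
                kept.append(attachments[old])
                renumbered[old] = len(kept)
            return f"[Image #{renumbered[old]}]"

        return PLACEHOLDER.sub(replace, text), kept
    ```

    For example, if the user deletes `[Image #2]` in the editor, the former
    `[Image #3]` becomes `[Image #2]` and the second attachment is dropped,
    so placeholder `N` still maps to the `N`-th kept attachment.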
    
    Docs:
    - docs/tui-chat-composer.md
    
    Testing:
    - just fix -p codex-tui
    - cargo test -p codex-tui
    
    Co-authored-by: Eric Traut <etraut@openai.com>