99 Commits

  • Add allow_managed_hooks_only hook requirement (#20319)
    ## Why
    
    Enterprise-managed hook policy needs a narrow way to require Codex to
    ignore user-controlled lifecycle hooks without adopting the broader
    trust-precedence model from earlier hook work. This keeps the policy
    anchored in `requirements.toml`, so admins can opt into managed-hooks-only
    mode, while ordinary `config.toml` files cannot enable the restriction
    themselves.
    
    ## What changed
    
    - Added `allow_managed_hooks_only` to the requirements data flow and
    preserved explicit `false` values.
    - Exposed the setting in `/debug-config`.
    - Marked MDM, system, and legacy managed config layers as managed for
    hook discovery.
    - Updated hook discovery so `allow_managed_hooks_only = true`:
      - keeps managed requirements hooks and managed config-layer hooks,
      - skips user/project/session `hooks.json` and `[hooks]` entries with
        concise startup warnings,
      - skips current unmanaged plugin hooks,
      - ignores any `allow_managed_hooks_only` key placed in ordinary
        `config.toml` layers.
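    
    As a rough illustration of the discovery rule above (not the actual Codex
    implementation), the filter can be sketched in Python. The `Hook` type,
    the source labels, and the warning text are hypothetical names chosen for
    this example.
    
    ```python
    from dataclasses import dataclass
    
    # Hypothetical labels for managed layers; the real layer names may differ.
    MANAGED_SOURCES = {"requirements", "mdm", "system", "legacy_managed"}
    
    
    @dataclass(frozen=True)
    class Hook:
        name: str
        source: str  # which config layer or file declared the hook
    
    
    def discover_hooks(hooks, allow_managed_hooks_only):
        """Return (kept_hooks, warnings) under the managed-only policy."""
        if not allow_managed_hooks_only:
            return list(hooks), []
        kept, warnings = [], []
        for hook in hooks:
            if hook.source in MANAGED_SOURCES:
                kept.append(hook)
            else:
                # Unmanaged hooks are skipped with a short warning, not an error.
                warnings.append(
                    f"skipping hook '{hook.name}' from unmanaged source '{hook.source}'"
                )
        return kept, warnings
    
    
    hooks = [Hook("audit", "mdm"), Hook("lint", "project"), Hook("notify", "user")]
    kept, warnings = discover_hooks(hooks, allow_managed_hooks_only=True)
    ```
    
    With the policy off, every hook is returned and no warnings are produced.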
  • Clarify docs folder guidance in AGENTS.md (#21772)
    ## Summary
    
    Codex keeps trying to add documentation to the `docs/` directory. With
    the exception of app server API documentation, the docs for Codex should
    not live in this repo. We don't want the local `docs/` folder to become
    a stale shadow of the official docs.
    
    This PR updates `AGENTS.md` to make that boundary explicit and scopes
    the existing API documentation guidance to app-server docs/examples. It
    also removes the extra `docs/config.md` sections that were recently
    added.
  • codex-otel: add configurable trace metadata (#21556)
    Add Codex config for static trace span attributes and structured W3C
    tracestate field upserts. The config flows through OtelSettings so
    callers can attach trace metadata without touching every span call site.
    
    Apply span attributes with an SDK span processor so every exported
    trace span carries the configured metadata. Model tracestate as nested
    member fields so configured keys can be upserted while unrelated
    propagated state in the same member is preserved.
    
    Validate configured tracestate before installing provider-global state,
    including header-unsafe values the SDK does not reject by itself. This
    keeps Codex from propagating malformed trace context from config.
    
    Update the config schema, public docs, and OTLP loopback coverage for
    config parsing, span export, propagation, and invalid-header rejection.
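    
    A minimal sketch of the upsert-and-validate idea, written in Python rather
    than the Rust used by Codex. The member name `codex` and the nested
    `key:value;key:value` encoding are assumptions made for illustration; the
    commit does not spell out the exact wire format.
    
    ```python
    def _check_token(text):
        """Reject empty tokens and characters unsafe in a tracestate value."""
        if not text:
            raise ValueError("empty tracestate token")
        for ch in text:
            if not (0x20 <= ord(ch) <= 0x7E) or ch in ",=;:":
                raise ValueError(f"header-unsafe character {ch!r} in {text!r}")
    
    
    def parse_member(value):
        """Split 'a:1;b:2' into an ordered dict of nested fields."""
        fields = {}
        for part in value.split(";"):
            if part:
                key, _, val = part.partition(":")
                fields[key] = val
        return fields
    
    
    def upsert_tracestate(header, member, updates):
        """Upsert nested fields in one member, keeping all other state."""
        for key, val in updates.items():
            _check_token(key)
            _check_token(val)
        members = []
        for item in header.split(","):
            item = item.strip()
            if item:
                key, _, value = item.partition("=")
                members.append((key, value))
        fields = parse_member(dict(members).get(member, ""))
        fields.update(updates)  # configured keys win, unrelated fields survive
        new_value = ";".join(f"{k}:{v}" for k, v in fields.items())
        # W3C Trace Context moves a modified member to the front of the list.
        rest = [(k, v) for k, v in members if k != member]
        return ",".join(f"{k}={v}" for k, v in [(member, new_value)] + rest)
    ```
    
    For example, upserting `{"b": "3", "c": "4"}` into
    `codex=a:1;b:2,other=xyz` keeps field `a` and the `other` member intact.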
  • Document Codex git commit attribution config (#21379)
    ## Summary
    - document that commit attribution for generated git commit messages is
    gated by the `codex_git_commit` feature flag
    - add an example `config.toml` snippet showing `commit_attribution` with
    `[features].codex_git_commit = true`
    - update the config schema description so the reference docs explain
    that `commit_attribution` only takes effect when the feature is enabled
    
    Fixes #19799.
    
    ## Validation
    - `cargo run -p codex-core --bin codex-write-config-schema`
    - `cargo test -p codex-config`
    - `cargo test -p codex-features`
    - `cargo fmt --check`
    - `git diff --check`
    
    ## Notes
    - `cargo test -p codex-core config_schema_matches_fixture` currently
    fails before reaching the schema test because `core_test_support`
    imports `similar` without a linked crate in this checkout. The narrower
    package checks above avoid that unrelated test-support build failure.
  • Remove local docs and specs (#20896)
    ## Summary
    
    We should not check local-only docs or planning specs into this
    repository. Keeping those files here duplicates the canonical Codex
    documentation surface and makes transient implementation notes look like
    supported docs.
    
    This PR removes the local-only docs/spec files from `docs/` and trims
    `docs/config.md` back to links for the maintained configuration
    documentation on developers.openai.com.
  • deprecate legacy notify (#20524)
    # Why
    
    `notify` is the remaining compatibility surface from the legacy hook
    implementation. The newer lifecycle hook engine now owns the active hook
    system, so we should start steering users away from adding new `notify`
    configs before removing the old path entirely. This also adds a
    lightweight watchpoint for the deprecation so we can see how much legacy
    usage remains before the clean drop.
    
    # What
    
    - emit a startup deprecation notice when a non-empty `notify` command is
    configured
    - emit `codex.notify.configured` when a session starts with legacy
    `notify` configured
    - emit `codex.notify.run` when the legacy notify path fires after a
    completed turn
    - mark `notify` as deprecated in the config schema and repo docs
    - remove the orphaned `codex-rs/hooks/src/user_notification.rs` file
    that is no longer compiled
    - add regression coverage for the new deprecation notice
    
    # Next steps
    
    A follow-up PR can remove the legacy notify path entirely once we are
    ready for the clean drop. Before then, we can watch
    `codex.notify.configured` and `codex.notify.run` to understand the
    deprecation impact and remaining active usage. The cleanup PR should
    then delete the `notify` config field, the `legacy_notify`
    implementation, the old compatibility dispatch types and callsites that
    only exist for the legacy path, and the remaining compatibility
    docs/tests.
    
    # Testing
    
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-config`
    - `cargo test -p codex-core emits_deprecation_notice_for_notify`
  • Support disabling tool suggest for specific tools. (#20072)
    ## Summary
    - Add `disable_tool_suggest` to app and plugin config, schema, and
    TypeScript output
    - Exclude disabled connectors and plugins from tool suggestion discovery
    - Persist "never show again" tool-suggestion choices back into
    `config.toml`
    - Update config docs and add coverage for connector and plugin
    suppression
    
    ## Testing
    - Added and updated unit tests for config persistence and tool-suggest
    filtering
    - Not run (not requested)
  • Add server-level approval defaults for custom MCP servers (#17843)
    ## Summary
    - Add `default_tools_approval_mode` support for custom MCP server
    configs, matching the existing `codex_apps` behavior
    - Resolve the approval mode by precedence: per-tool override first, then
    the server default, then `auto`
    - Update config serialization, CLI display, schema generation, docs, and
    tests
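    
    The precedence rule can be sketched as a small lookup. This is an
    illustrative Python model, not the Codex code; the function name, the
    dictionary shape, and every mode value other than `auto` are
    placeholders.
    
    ```python
    def resolve_approval_mode(tool_overrides, server_default, tool_name):
        """Pick the approval mode for one tool on one MCP server."""
        if tool_name in tool_overrides:
            return tool_overrides[tool_name]  # 1. per-tool override
        if server_default is not None:
            return server_default  # 2. server-level default
        return "auto"  # 3. global fallback named in the summary
    
    
    # "prompt" and "approve" are placeholder mode names for this example.
    overrides = {"delete_page": "prompt"}
    mode_for_override = resolve_approval_mode(overrides, "approve", "delete_page")
    mode_for_default = resolve_approval_mode(overrides, "approve", "search")
    mode_for_fallback = resolve_approval_mode({}, None, "search")
    ```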
    
    ## Testing
    - `cargo check -p codex-config`
    - `cargo check -p codex-core`
    - `just write-config-schema`
    - `just fmt`
    - `cargo test -p codex-config`
    - Targeted `codex-core` tests for config parsing, config writes, and MCP
    approval precedence
    - `just fix -p codex-config -p codex-core`
  • Add supports_parallel_tool_calls flag to included mcps (#17667)
    ## Why
    
    For more advanced MCP usage, we want the model to be able to emit
    parallel MCP tool calls and have Codex execute eligible ones
    concurrently, instead of forcing all MCP calls through the serial block.
    
    The main design choice was where to thread the config. I made this
    server-level because parallel safety depends on the MCP server
    implementation. Codex reads the flag from `mcp_servers`, threads the
    opted-in server names into `ToolRouter`, and checks the parsed
    `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
    on model-visible tool names, which can be incomplete in
    deferred/search-tool paths or ambiguous for similarly named
    servers/tools.
    
    ## What was added
    
    Added `supports_parallel_tool_calls` for MCP servers.
    
    Before:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    ```
    
    After:
    
    ```toml
    [mcp_servers.docs]
    command = "docs-server"
    supports_parallel_tool_calls = true
    ```
    
    MCP calls remain serial by default. Only tools from opted-in servers are
    eligible to run in parallel. Docs also now warn to enable this only when
    the server’s tools are safe to run concurrently, especially around
    shared state or read/write races.
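    
    A toy model of the opt-in rule, using Python's `asyncio` instead of the
    Rust `ToolRouter`. The `(server, tool)` tuples and the short sleep stand
    in for real MCP calls and are assumptions for this sketch. Running all
    eligible calls before the serial ones is also a simplification.
    
    ```python
    import asyncio
    
    
    async def call_tool(server, tool, log):
        """Stand-in for an MCP tool call; records start and end events."""
        log.append(("start", server, tool))
        await asyncio.sleep(0.01)
        log.append(("end", server, tool))
    
    
    async def run_calls(calls, parallel_servers):
        """Run calls from opted-in servers concurrently, the rest one by one."""
        log = []
        # Eligibility is decided by the exact server name on each call.
        eligible = [c for c in calls if c[0] in parallel_servers]
        serial = [c for c in calls if c[0] not in parallel_servers]
        await asyncio.gather(*(call_tool(s, t, log) for s, t in eligible))
        for server, tool in serial:
            await call_tool(server, tool, log)
        return log
    
    
    calls = [("docs", "query_with_delay"), ("docs", "query_with_delay_2")]
    parallel_log = asyncio.run(run_calls(calls, {"docs"}))
    serial_log = asyncio.run(run_calls(calls, set()))
    ```
    
    In the opted-in run both calls start before either finishes; without the
    flag the first call completes before the second starts.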
    
    ## Testing
    
    Tested with a local stdio MCP server exposing real delay tools. The
    model/Responses side was mocked only to deterministically emit two MCP
    calls in the same turn.
    
    Each test called `query_with_delay` and `query_with_delay_2` with `{
    "seconds": 25 }`.
    
    | Build/config | Observed | Wall time |
    | --- | --- | --- |
    | main with flag enabled | serial | `58.79s` |
    | PR with flag enabled | parallel | `31.73s` |
    | PR without flag | serial | `56.70s` |
    
    PR with flag enabled showed both tools start before either completed;
    main and PR-without-flag completed the first delay before starting the
    second.
    
    Also added an integration test.
    
    Additional checks:
    
    - `cargo test -p codex-tools` passed
    - `cargo test -p codex-core
    mcp_parallel_support_uses_exact_payload_server` passed
    - `git diff --check` passed
  • [mcp] Improve custom MCP elicitation (#15800)
    - [x] Support don't ask again for custom MCP tool calls.
    - [x] Don't run arc in yolo mode.
    - [x] Run arc for custom MCP tools in always allow mode.
  • client: extend custom CA handling across HTTPS and websocket clients (#14239)
    ## Stacked PRs
    
    This work is now effectively split across two steps:
    
    - #14178: add custom CA support for browser and device-code login flows,
    docs, and hermetic subprocess tests
    - #14239: extend that shared custom CA handling across Codex HTTPS
    clients and secure websocket TLS
    
    Note: #14240 was merged into this branch while it was stacked on top of
    this PR. This PR now subsumes that websocket follow-up and should be
    treated as the combined change.
    
    Builds on top of #14178.
    
    ## Problem
    
    Custom CA support landed first in the login path, but the real
    requirement is broader. Codex constructs outbound TLS clients in
    multiple places, and both HTTPS and secure websocket paths can fail
    behind enterprise TLS interception if they do not honor
    `CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently.
    
    This PR broadens the shared custom-CA logic beyond login and applies the
    same policy to websocket TLS, so the enterprise-proxy story is no longer
    split between “HTTPS works” and “websockets still fail”.
    
    ## What This Delivers
    
    Custom CA support is no longer limited to login. Codex outbound HTTPS
    clients and secure websocket connections can now honor the same
    `CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise
    proxy/intercept setups work more consistently end-to-end.
    
    For users and operators, nothing new needs to be configured beyond the
    same CA env vars introduced in #14178. The change is that more of Codex
    now respects them, including websocket-backed flows that were previously
    still using default trust roots.
    
    I also manually validated the proxy path locally with mitmproxy using:
    `CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem
    HTTPS_PROXY=http://127.0.0.1:8080 just codex`
    with mitmproxy installed via `brew install mitmproxy` and configured as
    the macOS system proxy.
    
    ## Mental model
    
    `codex-client` is now the owner of shared custom-CA policy for outbound
    TLS client construction. Reqwest callers start from the builder
    configuration they already need, then pass that builder through
    `build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the
    same module for a rustls client config when a custom CA bundle is
    configured.
    
    The env precedence is the same everywhere:
    - `CODEX_CA_CERTIFICATE` wins
    - otherwise fall back to `SSL_CERT_FILE`
    - otherwise use system roots
    
    The helper is intentionally narrow. It loads every usable certificate
    from the configured PEM bundle into the appropriate root store and
    returns either a configured transport or a typed error that explains
    what went wrong.
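    
    The precedence rule can be sketched as follows. This is a Python
    illustration, not the `codex-client` code; `select_ca_bundle` is a
    hypothetical name, and treating an empty value as unset is an assumption.
    
    ```python
    def select_ca_bundle(env):
        """Return (variable, path) for the winning CA override, or None."""
        for var in ("CODEX_CA_CERTIFICATE", "SSL_CERT_FILE"):
            value = env.get(var, "")
            if value:  # assumption: an empty value counts as unset
                return var, value
        return None  # caller falls back to system roots
    
    
    both = {"CODEX_CA_CERTIFICATE": "/etc/corp.pem", "SSL_CERT_FILE": "/etc/ssl.pem"}
    winner = select_ca_bundle(both)
    fallback = select_ca_bundle({"SSL_CERT_FILE": "/etc/ssl.pem"})
    system_roots = select_ca_bundle({})
    ```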
    
    ## Non-goals
    
    This does not add handshake-level integration tests against a live TLS
    endpoint. It does not validate that the configured bundle forms a
    meaningful certificate chain. It also does not try to force every
    transport in the repo through one abstraction; it extends the shared CA
    policy across the reqwest and websocket paths that actually needed it.
    
    ## Tradeoffs
    
    The main tradeoff is centralizing CA behavior in `codex-client` while
    still leaving adoption up to call sites. That keeps the implementation
    additive and reviewable, but it means the rule "outbound Codex TLS that
    should honor enterprise roots must use the shared helper" is still
    partly enforced socially rather than by types.
    
    For websockets, the shared helper only builds an explicit rustls config
    when a custom CA bundle is configured. When no override env var is set,
    websocket callers still use their ordinary default connector path.
    
    ## Architecture
    
    `codex-client::custom_ca` now owns CA bundle selection, PEM
    normalization, mixed-section parsing, certificate extraction, typed
    CA-loading errors, and optional rustls client-config construction for
    websocket TLS.
    
    The affected consumers now call into that shared helper directly rather
    than carrying login-local CA behavior:
    - backend-client
    - cloud-tasks
    - RMCP client paths that use `reqwest`
    - TUI voice HTTP paths
    - `codex-core` default reqwest client construction
    - `codex-api` websocket clients for both responses and realtime
    websocket connections
    
    The subprocess CA probe, env-sensitive integration tests, and shared PEM
    fixtures also live in `codex-client`, which is now the actual owner of
    the behavior they exercise.
    
    ## Observability
    
    The shared CA path logs:
    - which environment variable selected the bundle
    - which path was loaded
    - how many certificates were accepted
    - when `TRUSTED CERTIFICATE` labels were normalized
    - when CRLs were ignored
    - where client construction failed
    
    Returned errors remain user-facing and include the relevant env var,
    path, and remediation hint. That same error model now applies whether
    the failure surfaced while building a reqwest client or websocket TLS
    configuration.
    
    ## Tests
    
    Pure unit tests in `codex-client` cover env precedence and PEM
    normalization behavior. Real client construction remains in subprocess
    tests so the suite can control process env and avoid the macOS seatbelt
    panic path that motivated the hermetic test split.
    
    The subprocess coverage verifies:
    - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
    - fallback to `SSL_CERT_FILE`
    - single-cert and multi-cert bundles
    - malformed and empty-file errors
    - OpenSSL `TRUSTED CERTIFICATE` handling
    - CRL tolerance for well-formed CRL sections
    
    The websocket side is covered by the existing `codex-api` / `codex-core`
    websocket test suites plus the manual mitmproxy validation above.
    
    ---------
    
    Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
    Co-authored-by: Codex <noreply@openai.com>
  • login: add custom CA support for login flows (#14178)
    ## Stacked PRs
    
    This work is split across three stacked PRs:
    
    - #14178: add custom CA support for browser and device-code login flows,
    docs, and hermetic subprocess tests
    - #14239: broaden the shared custom CA path from login to other outbound
    `reqwest` clients across Codex
    - #14240: extend that shared custom CA handling to secure websocket TLS
    so websocket connections honor the same CA env vars
    
    Review order: #14178, then #14239, then #14240.
    
    Supersedes #6864.
    
    Thanks to @3axap4eHko for the original implementation and investigation
    here. Although this version rearranges the code and history
    significantly, the majority of the credit for this work belongs to them.
    
    ## Problem
    
    Login flows need to work in enterprise environments where outbound TLS
    is intercepted by an internal proxy or gateway. In those setups, system
    root certificates alone are often insufficient to validate the OAuth and
    device-code endpoints used during login. The change adds a
    login-specific custom CA loading path, but the important contracts
    around env precedence, PEM compatibility, test boundaries, and
    probe-only workarounds need to be explicit so reviewers can understand
    what behavior is intentional.
    
    For users and operators, the behavior is simple: if login needs to trust
    a custom root CA, set `CODEX_CA_CERTIFICATE` to a PEM file containing
    one or more certificates. If that variable is unset, login falls back to
    `SSL_CERT_FILE`. If neither is set, login uses system roots. Invalid or
    empty PEM files now fail with an error that points back to those
    environment variables and explains how to recover.
    
    ## What This Delivers
    
    Users can now make Codex login work behind enterprise TLS interception
    by pointing `CODEX_CA_CERTIFICATE` at a PEM bundle containing the
    relevant root certificates. If that variable is unset, login falls back
    to `SSL_CERT_FILE`, then to system roots.
    
    This PR applies that behavior to both browser-based and device-code
    login flows. It also makes login tolerant of the PEM shapes operators
    actually have in hand: multi-certificate bundles, OpenSSL `TRUSTED
    CERTIFICATE` labels, and bundles that include well-formed CRLs.
    
    ## Mental model
    
    `codex-login` is the place where the login flows construct ad hoc
    outbound HTTP clients. That makes it the right boundary for a narrow CA
    policy: look for `CODEX_CA_CERTIFICATE`, fall back to `SSL_CERT_FILE`,
    load every parseable certificate block in that bundle into a
    `reqwest::Client`, and fail early with a clear user-facing error if the
    bundle is unreadable or malformed.
    
    The implementation is intentionally pragmatic about PEM input shape. It
    accepts ordinary certificate bundles, multi-certificate bundles, OpenSSL
    `TRUSTED CERTIFICATE` labels, and bundles that also contain CRLs. It
    does not validate a certificate chain or prove a handshake; it only
    constructs the root store used by login.
    
    ## Non-goals
    
    This change does not introduce a general-purpose transport abstraction
    for the rest of the product. It does not validate whether the provided
    bundle forms a real chain, and it does not add handshake-level
    integration tests against a live TLS server. It also does not change
    login state management or OAuth semantics beyond ensuring the existing
    flows share the same CA-loading rules.
    
    ## Tradeoffs
    
    The main tradeoff is keeping this logic scoped to login-specific client
    construction rather than lifting it into a broader shared HTTP layer.
    That keeps the review surface smaller, but it also means future
    login-adjacent code must continue to use `build_login_http_client()` or
    it can silently bypass enterprise CA overrides.
    
    The `TRUSTED CERTIFICATE` handling is also intentionally a local
    compatibility shim. The rustls ecosystem does not currently accept that
    PEM label upstream, so the code normalizes it locally and trims the
    OpenSSL `X509_AUX` trailer bytes down to the certificate DER that
    `reqwest` can consume.
    
    ## Architecture
    
    `custom_ca.rs` is now the single place that owns login CA behavior. It
    selects the CA file from the environment, reads it, normalizes PEM label
    shape where needed, iterates mixed PEM sections with `rustls-pki-types`,
    ignores CRLs, trims OpenSSL trust metadata when necessary, and returns
    either a configured `reqwest::Client` or a typed error.
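    
    A simplified Python sketch of the PEM handling described above. The real
    code uses `rustls-pki-types`; the regex-based section scan, the function
    names, and the choice to silently skip unknown labels are assumptions made
    for illustration.
    
    ```python
    import base64
    import re
    
    PEM_RE = re.compile(
        r"-----BEGIN ([A-Z0-9 ]+)-----\s*(.*?)\s*-----END \1-----", re.S
    )
    
    
    def der_sequence_len(der):
        """Total length (header + body) of the leading DER SEQUENCE."""
        if len(der) < 2 or der[0] != 0x30:
            raise ValueError("not a DER SEQUENCE")
        first = der[1]
        if first < 0x80:  # short form: length fits in one byte
            return 2 + first
        n = first & 0x7F  # long form: next n bytes hold the length
        if n == 0 or len(der) < 2 + n:
            raise ValueError("unsupported or truncated DER length")
        return 2 + n + int.from_bytes(der[2 : 2 + n], "big")
    
    
    def load_certificates(pem_text):
        """Return certificate DER blobs from a mixed PEM bundle."""
        certs = []
        for label, body in PEM_RE.findall(pem_text):
            if label not in ("CERTIFICATE", "TRUSTED CERTIFICATE"):
                continue  # CRLs and other sections are ignored
            der = base64.b64decode("".join(body.split()))
            if label == "TRUSTED CERTIFICATE":
                # Drop the OpenSSL X509_AUX trailer after the certificate.
                der = der[: der_sequence_len(der)]
            certs.append(der)
        if not certs:
            raise ValueError("no certificates found in PEM bundle")
        return certs
    ```
    
    An empty bundle, or one that contains only CRLs, raises an error instead
    of returning an empty trust store.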
    
    The browser login server and the device-code flow both call
    `build_login_http_client()`, so they share the same trust-store policy.
    Environment-sensitive tests run through the `login_ca_probe` helper
    binary because those tests must control process-wide env vars and cannot
    reliably build a real reqwest client in-process on macOS seatbelt runs.
    
    ## Observability
    
    The custom CA path logs which environment variable selected the bundle,
    which file path was loaded, how many certificates were accepted, when
    `TRUSTED CERTIFICATE` labels were normalized, when CRLs were ignored,
    and where client construction failed. Returned errors remain user-facing
    and include the relevant path, env var, and remediation hint.
    
    This gives enough signal for three audiences:
    - users can see why login failed and which env/file caused it
    - sysadmins can confirm which override actually won
    - developers can tell whether the failure happened during file read, PEM
    parsing, certificate registration, or final reqwest client construction
    
    ## Tests
    
    Pure unit tests stay limited to env precedence and empty-value handling.
    Real client construction lives in subprocess tests so the suite remains
    hermetic with respect to process env and macOS sandbox behavior.
    
    The subprocess tests verify:
    - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
    - fallback to `SSL_CERT_FILE`
    - single-certificate and multi-certificate bundles
    - malformed and empty-bundle errors
    - OpenSSL `TRUSTED CERTIFICATE` handling
    - CRL tolerance for well-formed CRL sections
    
    The named PEM fixtures under `login/tests/fixtures/` are shared by the
    tests so their purpose stays reviewable.
    
    ---------
    
    Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
    Co-authored-by: Codex <noreply@openai.com>
  • Add realtime start instructions config override (#14270)
    - add `realtime_start_instructions` config support
    - thread it into realtime context updates, schema, docs, and tests
  • notify: include client in legacy hook payload (#12968)
    ## Why
    
    The `notify` hook payload did not identify which Codex client started
    the turn. That meant downstream notification hooks could not distinguish
    between completions coming from the TUI and completions coming from
    app-server clients such as VS Code or Xcode. Now that the Codex App
    provides its own desktop notifications, it would be nice to be able to
    filter those out.
    
    This change adds that context without changing the existing payload
    shape for callers that do not know the client name, and keeps the new
    end-to-end test cross-platform.
    
    ## What changed
    
    - added an optional top-level `client` field to the legacy `notify` JSON
    payload
    - threaded that value through `core` and `hooks`; the internal session
    and turn state now carries it as `app_server_client_name`
    - set the field to `codex-tui` for TUI turns
    - captured `initialize.clientInfo.name` in the app server and applied it
    to subsequent turns before dispatching hooks
    - replaced the notify integration test hook with a `python3` script so
    the test does not rely on Unix shell permissions or `bash`
    - documented the new field in `docs/config.md`
    
    ## Testing
    
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-tui`
    - `cargo test -p codex-app-server
    suite::v2::initialize::turn_start_notify_payload_includes_initialize_client_name
    -- --exact --nocapture`
    - `cargo test -p codex-core` (`src/lib.rs` passed; `core/tests/all.rs`
    still has unrelated existing failures in this environment)
    
    ## Docs
    
    The public config reference on `developers.openai.com/codex` should
    mention that the legacy `notify` payload may include a top-level
    `client` field. The TUI reports `codex-tui`, and the app server reports
    `initialize.clientInfo.name` when it is available.
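    
    As a hedged example of how a downstream hook might use the new field, the
    Python sketch below decides whether to show a notification. Only the
    `client` field and the `codex-tui` value come from this change;
    `should_notify` and the muted client name are hypothetical.
    
    ```python
    import json
    
    
    def should_notify(payload_json, muted_clients):
        """Return False when the payload comes from a muted client."""
        payload = json.loads(payload_json)
        # `client` is optional, so older payloads without it still notify.
        client = payload.get("client")
        return client not in muted_clients
    
    
    muted = {"example-desktop-app"}  # placeholder client name
    from_tui = should_notify(json.dumps({"client": "codex-tui"}), muted)
    from_muted = should_notify(json.dumps({"client": "example-desktop-app"}), muted)
    without_client = should_notify(json.dumps({}), muted)
    ```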
  • Agent jobs (spawn_agents_on_csv) + progress UI (#10935)
    ## Summary
    - Add agent job support: spawn a batch of sub-agents from CSV, auto-run,
    auto-export, and store results in SQLite.
    - Simplify workflow: remove run/resume/get-status/export tools; spawn is
    deterministic and completes in one call.
    - Improve exec UX: stable, single-line progress bar with ETA; suppress
    sub-agent chatter in exec.
    
    ## Why
    Enables map-reduce style workflows over arbitrarily large repos using
    the existing Codex orchestrator. This addresses review feedback about
    overly complex job controls and non-deterministic monitoring.
    
    ## Demo (progress bar)
    ```
    ./codex-rs/target/debug/codex exec \
      --enable collab \
      --enable sqlite \
      --full-auto \
      --progress-cursor \
      -c agents.max_threads=16 \
      -C /Users/daveaitel/code/codex \
      - <<'PROMPT'
    Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows:
    path = item-01..item-30, area = test.
    
    Then call spawn_agents_on_csv with:
    - csv_path: /tmp/agent_job_progress_demo.csv
    - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1."
    - output_csv_path: /tmp/agent_job_progress_demo_out.csv
    PROMPT
    ```
    
    ## Review feedback addressed
    - Auto-start jobs on spawn; removed run/resume/status/export tools.
    - Auto-export on success.
    - More descriptive tool spec + clearer prompts.
    - Avoid deadlocks on spawn failure; pending/running handled safely.
    - Progress bar no longer scrolls; stable single-line redraw.
    
    ## Tests
    - `cd codex-rs && cargo test -p codex-exec`
    - `cd codex-rs && cargo build -p codex-cli`
  • Improve Plan mode reasoning selection flow (#12303)
    Addresses https://github.com/openai/codex/issues/11013
    
    ## Summary
    - add a Plan implementation path in the TUI that lets users choose
    reasoning before switching to Default mode and implementing
    - add Plan-mode reasoning scope handling (Plan-only override vs
    all-modes default), including config/schema/docs plumbing for
    `plan_mode_reasoning_effort`
    - remove the hardcoded Plan preset medium default and make the reasoning
    popup reflect the active Plan override as `(current)`
    - split the collaboration-mode switch notification UI hint into #12307
    to keep this diff focused
    
    If I have `plan_mode_reasoning_effort = "medium"` set in my
    `config.toml`:
    <img width="699" height="127" alt="Screenshot 2026-02-20 at 6 59 37 PM"
    src="https://github.com/user-attachments/assets/b33abf04-6b7a-49ed-b2e9-d24b99795369"
    />
    
    If I don't have `plan_mode_reasoning_effort` set in my `config.toml`:
    <img width="704" height="129" alt="Screenshot 2026-02-20 at 7 01 51 PM"
    src="https://github.com/user-attachments/assets/88a086d4-d2f1-49c7-8be4-f6f0c0fa1b8d"
    />
    
    ## Codex author
    `codex resume 019c78a2-726b-7fe3-adac-3fa4523dcc2a`
  • [connectors] Support connectors part 1 - App server & MCP (#9667)
    To make Codex work with connectors, we add a built-in gateway MCP that
    acts as a transparent proxy between the client and the connectors. The
    gateway MCP collects the actions that are accessible to the user and
    sends them down to the user. When a connector action is chosen, the
    client invokes it through the gateway MCP as well.
    
     - [x] Add the system built-in gateway MCP to list and run connectors.
     - [x] Add the app server methods and protocol
  • tui: double-press Ctrl+C/Ctrl+D to quit (#8936)
    ## Problem
    
    Codex’s TUI quit behavior has historically been easy to trigger
    accidentally and hard to reason about.
    
    - `Ctrl+C`/`Ctrl+D` could terminate the UI immediately, which is a
    common key to press while trying to dismiss a modal, cancel a command,
    or recover from a stuck state.
    - “Quit” and “shutdown” were not consistently separated, so some exit
    paths could bypass the shutdown/cleanup work that should run before the
    process terminates.
    
    This PR makes quitting both safer (harder to do by accident) and more
    uniform across quit gestures, while keeping the shutdown-first
    semantics explicit.
    
    ## Mental model
    
    After this change, the system treats quitting as a UI request that is
    coordinated by the app layer.
    
    - The UI requests exit via `AppEvent::Exit(ExitMode)`.
    - `ExitMode::ShutdownFirst` is the normal user path: the app triggers
    `Op::Shutdown`, continues rendering while shutdown runs, and only ends
    the UI loop once shutdown has completed.
    - `ExitMode::Immediate` exists as an escape hatch (and as the
    post-shutdown “now actually exit” signal); it bypasses cleanup and
    should not be the default for user-triggered quits.
    
    User-facing quit gestures are intentionally “two-step” for safety:
    
    - `Ctrl+C` and `Ctrl+D` no longer exit immediately.
    - The first press arms a 1-second window and shows a footer hint
    (“ctrl + <key> again to quit”).
    - Pressing the same key again within the window requests a
    shutdown-first quit; otherwise the hint expires and the next press
    starts a fresh window.
    
    Key routing remains modal-first:
    
    - A modal/popup gets first chance to consume `Ctrl+C`.
    - If a modal handles `Ctrl+C`, any armed quit shortcut is cleared so
    dismissing a modal cannot prime a subsequent `Ctrl+C` to quit.
    - `Ctrl+D` only participates in quitting when the composer is empty and
    no modal/popup is active.
    
    The design doc `docs/exit-confirmation-prompt-design.md` captures the
    intended routing and the invariants the UI should maintain.
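    
    The two-step gesture can be modeled as a small time-bounded state
    machine. The Python sketch below is only an illustration of the rules
    above; the real logic lives in the Rust TUI, and `QuitShortcut`, its
    method names, and the string results are hypothetical.
    
    ```python
    class QuitShortcut:
        """Arms on the first press; quits on a matching second press in time."""
    
        WINDOW_SECS = 1.0
    
        def __init__(self):
            self.armed_key = None
            self.expires_at = 0.0
    
        def clear(self):
            """Disarm, e.g. after a modal consumed the key press."""
            self.armed_key = None
            self.expires_at = 0.0
    
        def press(self, key, now):
            """Return 'quit' for a confirmed double press, else 'hint'."""
            if self.armed_key == key and now < self.expires_at:
                self.clear()
                return "quit"  # caller would request a shutdown-first exit
            # First press, different key, or expired window: start fresh.
            self.armed_key = key
            self.expires_at = now + self.WINDOW_SECS
            return "hint"
    
    
    shortcut = QuitShortcut()
    first = shortcut.press("ctrl+c", now=0.0)
    second = shortcut.press("ctrl+c", now=0.5)
    ```
    
    Passing `now` explicitly keeps the sketch deterministic; a real UI would
    read a monotonic clock and also expire the footer hint.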
    
    ## Non-goals
    
    - This does not attempt to redesign modal UX or make modals uniformly
    dismissible via `Ctrl+C`. It only ensures modals get priority and that
    quit arming does not leak across modal handling.
    - This does not introduce a persistent confirmation prompt/menu for
    quitting; the goal is to keep the exit gesture lightweight and
    consistent.
    - This does not change the semantics of core shutdown itself; it changes
    how the UI requests and sequences it.
    
    ## Tradeoffs
    
    - Quitting via `Ctrl+C`/`Ctrl+D` now requires a deliberate second
    keypress, which adds friction for users who relied on the old “instant
    quit” behavior.
    - The UI now maintains a small time-bounded state machine for the armed
    shortcut, which increases complexity and introduces timing-dependent
    behavior.
    
    This design was chosen over alternatives (a modal confirmation prompt or
    a long-lived “are you sure” state) because it provides an explicit
    safety barrier while keeping the flow fast and keyboard-native.
    
    ## Architecture
    
    - `ChatWidget` owns the quit-shortcut state machine and decides when a
    quit gesture is allowed (idle vs cancellable work, composer state, etc.).
    - `BottomPane` owns rendering and local input routing for modals/popups.
    It is responsible for consuming cancellation keys when a view is active
    and for showing/expiring the footer hint.
    - `App` owns shutdown sequencing: translating
    `AppEvent::Exit(ShutdownFirst)` into `Op::Shutdown` and only terminating
    the UI loop when exit is safe.
    
    This keeps “what should happen” decisions (quit vs interrupt vs ignore)
    in the chat/widget layer, while keeping “how it looks and which view
    gets the key” in the bottom-pane layer.
    
    ## Observability
    
    You can tell this is working by running the TUIs and exercising the quit
    gestures:
    
    - While idle: pressing `Ctrl+C` (or `Ctrl+D` with an empty composer and
      no modal) shows a footer hint for ~1 second; pressing again within that
      window exits via shutdown-first.
    - While streaming/tools/review are active: `Ctrl+C` interrupts work
      rather than quitting.
    - With a modal/popup open: `Ctrl+C` dismisses/handles the modal (if it
      chooses to) and does not arm a quit shortcut; a subsequent quick `Ctrl+C`
      should not quit unless the user re-arms it.
    
    Failure modes are visible as:
    
    - Quits that happen immediately (no hint window) from `Ctrl+C`/`Ctrl+D`.
    - Quits that occur while a modal is open and consuming `Ctrl+C`.
    - UI termination before shutdown completes (cleanup skipped).
    
    ## Tests
    
    - Updated/added unit and snapshot coverage in `codex-tui` and
      `codex-tui2` to validate:
      - The quit hint appears and expires on the expected key.
      - Double-press within the window triggers a shutdown-first quit request.
      - Modal-first routing prevents quit bypass and clears any armed shortcut
        when a modal consumes `Ctrl+C`.
    
    These tests focus on the UI-level invariants and rendered output; they
    do not attempt to validate
    real terminal key-repeat timing or end-to-end process shutdown behavior.
    
    ---
    Screenshot:
    <img width="912" height="740" alt="Screenshot 2026-01-13 at 1 05 28 PM"
    src="https://github.com/user-attachments/assets/18f3d22e-2557-47f2-a369-ae7a9531f29f"
    />
  • add generated jsonschema for config.toml (#8956)
    ### What
    Add JSON Schema generation for `config.toml`, with checked‑in
    `docs/config.schema.json`. We can move the schema elsewhere if preferred
    (and host it if there's demand).
    
    Add fixture test to prevent drift and `just write-config-schema` to
    regenerate on schema changes.
    
    Generate MCP config schema from `RawMcpServerConfig` instead of
    `McpServerConfig` because that is the runtime type used for
    deserialization.
    
    Populate feature flag values into generated schema so they can be
    autocompleted.
    
    ### Tests
    Added tests + regenerate script to prevent drift. Tested autocompletions
    using generated jsonschema locally with Even Better TOML.
    
    
    
    https://github.com/user-attachments/assets/5aa7cd39-520c-4a63-96fb-63798183d0bc
  • Replaced user documentation with links to developers docs site (#8662)
    This eliminates redundant user documentation and allows us to focus our
    documentation investments.
    
    I left tombstone files for most of the existing ".md" docs files to
    avoid broken links. These now contain brief links to the developers docs
    site.
  • Remove reasoning format (#8484)
    This isn't a very useful parameter.
    
    logic:
    ```
    if model puts `**` in their reasoning, trim it and visualize the header.
    if couldn't trim: don't render
    if model doesn't support: don't render
    ```
    
    We can simplify to:
    ```
    if could trim, visualize header.
    if not, don't render
    ```
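
    The simplified rule could look something like the sketch below. It is
    only an illustration: `reasoning_header` is a hypothetical helper, and it
    assumes the header is a leading `**...**` span at the start of the
    reasoning text, which the PR description does not spell out.

    ```rust
    /// Return the bold header at the start of `text`, if one can be trimmed out.
    /// `None` means "don't render a header".
    pub fn reasoning_header(text: &str) -> Option<&str> {
        // Assumption: the header opens the reasoning text as `**Header**`.
        let rest = text.trim_start().strip_prefix("**")?;
        let end = rest.find("**")?;
        let header = rest[..end].trim();
        if header.is_empty() {
            None
        } else {
            Some(header)
        }
    }
    ```

    With this shape there is no separate "model doesn't support it" branch:
    text without a trimmable header simply yields `None`.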
  • feat: add support for project_root_markers in config.toml (#8359)
    - allow configuring `project_root_markers` in `config.toml`
    (user/system/MDM) to control project discovery beyond `.git`
    - honor the markers after merging pre-project layers; default to
    `[".git"]` when unset and skip ancestor walk when set to an empty array
    - document the option and add coverage for alternate markers in config
    loader tests
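
    The marker lookup described above can be sketched as an ancestor walk.
    This is an illustrative sketch, not the config loader's code:
    `find_project_root` and the injected `exists` predicate are hypothetical,
    and returning `None` for an empty marker list is an assumption about what
    "skip ancestor walk" means for the caller.

    ```rust
    use std::path::{Path, PathBuf};

    /// Walk from `cwd` up through its ancestors and return the first directory
    /// that contains any of `markers`. `exists` is injected so the sketch can
    /// be exercised without touching the real filesystem.
    pub fn find_project_root(
        cwd: &Path,
        markers: Option<&[&str]>,
        exists: impl Fn(&Path) -> bool,
    ) -> Option<PathBuf> {
        // Unset falls back to [".git"]; an explicit empty list skips the walk.
        let default_markers: &[&str] = &[".git"];
        let markers = markers.unwrap_or(default_markers);
        if markers.is_empty() {
            return None;
        }
        cwd.ancestors()
            .find(|dir| markers.iter().any(|marker| exists(&dir.join(marker))))
            .map(Path::to_path_buf)
    }
    ```

    In real use the predicate would be something like `|p| p.exists()`;
    keeping it as a parameter makes the default, alternate-marker, and
    empty-array cases easy to cover in tests.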
  • docs: add developer_instructions config option and update descriptions (#8376)
    Updates the configuration documentation to clarify and improve the
    description of the `developer_instructions` and `instructions` fields.
    
    Documentation updates:
    
    * Added a description for the `developer_instructions` field in
    `docs/config.md`, clarifying that it provides additional developer
    instructions.
    * Updated the comments in `docs/example-config.md` to specify that
    `developer_instructions` is injected before `AGENTS.md`, and clarified
    that the `instructions` field is ignored and that `AGENTS.md` is
    preferred.
    
    ___
    
    ref #7973 
    
    Thanks to @miraclebakelaser for the message. I have double-confirmed
    that developer instructions are always injected before user
    instructions. According to the source code
    [codex_core::codex::Session::build_initial_context](https://github.com/openai/codex/blob/rust-v0.77.0-alpha.2/codex-rs/core/src/codex.rs#L1279),
    we can see the specific order of these instructions.
  • Update ghost_commit flag reference to undo (#8091)
    Minor documentation update to fix #7966 (documentation of undo flag).
  • Chore: remove rmcp feature and exp flag usages (#8087)
    ### Summary
    With codesigning on Mac, Windows and Linux, we should be able to safely
    remove `features.rmcp_client` and `use_experimental_use_rmcp_client`
    check from the codebase now.
  • feat(tui2): tune scrolling input based on (#8357)
    ## TUI2: Normalize Mouse Scroll Input Across Terminals (Wheel +
    Trackpad)
    
    This changes TUI2 scrolling to a stream-based model that normalizes
    terminal scroll event density into consistent wheel behavior (default:
    ~3 transcript lines per physical wheel notch) while keeping trackpad
    input higher fidelity via fractional accumulation.
    
    Primary code: `codex-rs/tui2/src/tui/scrolling/mouse.rs`
    
    Doc of record (model + probe-derived data):
    `codex-rs/tui2/docs/scroll_input_model.md`
    
    ### Why
    
    Terminals encode both mouse wheels and trackpads as discrete scroll
    up/down events with direction but no magnitude, and they vary widely in
    how many raw events they emit per physical wheel notch (commonly 1, 3,
    or 9+). Timing alone doesn’t reliably distinguish wheel vs trackpad, so
    cadence-based heuristics are unstable across terminals/hardware.
    
    This PR treats scroll input as short *streams* separated by silence or
    direction flips, normalizes raw event density into tick-equivalents,
    coalesces redraws for dense streams, and exposes explicit config
    overrides.
    
    ### What Changed
    
    #### Scroll Model (TUI2)
    
    - Stream detection
      - Start a stream on the first scroll event.
      - End a stream on an idle gap (`STREAM_GAP_MS`) or a direction flip.
    - Normalization
      - Convert raw events into tick-equivalents using per-terminal
        `tui.scroll_events_per_tick`.
    - Wheel-like vs trackpad-like behavior
      - Wheel-like: fixed “classic” lines per wheel notch; flush immediately
        for responsiveness.
      - Trackpad-like: fractional accumulation + carry across stream
        boundaries; coalesce flushes to ~60Hz to avoid floods and reduce “stop
        lag / overshoot”.
      - Trackpad divisor is intentionally capped: `min(scroll_events_per_tick,
        3)` so terminals with dense wheel ticks (e.g. 9 events per notch) don’t
        make trackpads feel artificially slow.
    - Auto mode (default)
      - Start conservatively as trackpad-like (avoid overshoot).
      - Promote to wheel-like if the first tick-worth of events arrives
        quickly.
      - Fallback for 1-event-per-tick terminals (no tick-completion timing
        signal).
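
    The stream-detection rule (a stream ends on an idle gap or a direction
    flip) might be sketched like this. It is not the code in
    `codex-rs/tui2/src/tui/scrolling/mouse.rs`: `Direction` and
    `StreamTracker` are hypothetical names, and the gap is passed in as a
    parameter because the value of `STREAM_GAP_MS` is not given here.

    ```rust
    use std::time::{Duration, Instant};

    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum Direction {
        Up,
        Down,
    }

    /// Groups raw scroll events into streams separated by silence or a flip.
    pub struct StreamTracker {
        gap: Duration,
        /// Direction, time of the last event, and raw event count of the open stream.
        current: Option<(Direction, Instant, u32)>,
    }

    impl StreamTracker {
        pub fn new(gap: Duration) -> Self {
            Self { gap, current: None }
        }

        /// Record one raw event. Returns the finished stream's direction and
        /// event count if this event closed a previous stream.
        pub fn on_event(&mut self, dir: Direction, now: Instant) -> Option<(Direction, u32)> {
            let previous = self.current.take();
            match previous {
                // Same direction and within the idle gap: extend the open stream.
                Some((cur_dir, last, count))
                    if cur_dir == dir && now.duration_since(last) <= self.gap =>
                {
                    self.current = Some((cur_dir, now, count + 1));
                    None
                }
                // First event, idle gap exceeded, or direction flip: start a new stream.
                finished => {
                    self.current = Some((dir, now, 1));
                    finished.map(|(d, _, n)| (d, n))
                }
            }
        }
    }
    ```

    This sketch only closes a stream when the next event arrives; the PR
    additionally schedules follow-up ticks so an idle stream is finalized
    even when no further input comes in.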
    
    #### Trackpad Acceleration
    
    Some terminals produce relatively low vertical event density for
    trackpad gestures, which makes large/faster swipes feel sluggish even
    when small motions feel correct. To address that, trackpad-like streams
    apply a bounded multiplier based on event count:
    
    - `multiplier = clamp(1 + abs(events) / scroll_trackpad_accel_events,
    1..scroll_trackpad_accel_max)`
    
    The multiplier is applied to the trackpad stream’s computed line delta
    (including carried fractional remainder). Defaults are conservative and
    bounded.
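
    As a worked version of the formula above, here is a small sketch. The
    function name `trackpad_multiplier` is hypothetical, and it assumes the
    division is done in floating point and that zero-valued settings are
    clamped to 1; the PR text does not state either detail.

    ```rust
    /// multiplier = clamp(1 + |events| / accel_events, 1..accel_max)
    pub fn trackpad_multiplier(events: i32, accel_events: u32, accel_max: u32) -> f64 {
        // Assumption: guard against zero so the division and clamp stay well-defined.
        let accel_events = f64::from(accel_events.max(1));
        let accel_max = f64::from(accel_max.max(1));
        let raw = 1.0 + f64::from(events.unsigned_abs()) / accel_events;
        raw.clamp(1.0, accel_max)
    }
    ```

    With the stated defaults (30 / 3), a 15-event stream yields 1.5, a
    30-event stream yields 2.0, and anything from 60 events upward is capped
    at 3.0.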
    
    #### Config Knobs (TUI2)
    
    All keys live under `[tui]`:
    
    - `scroll_wheel_lines`: lines per physical wheel notch (default: 3).
    - `scroll_events_per_tick`: raw vertical scroll events per physical
      wheel notch (terminal-specific default; fallback: 3).
      - Wheel-like per-event contribution: `scroll_wheel_lines /
        scroll_events_per_tick`.
    - `scroll_trackpad_lines`: baseline trackpad sensitivity (default: 1).
      - Trackpad-like per-event contribution: `scroll_trackpad_lines /
        min(scroll_events_per_tick, 3)`.
    - `scroll_trackpad_accel_events` / `scroll_trackpad_accel_max`: bounded
      trackpad acceleration (defaults: 30 / 3).
    - `scroll_mode = auto|wheel|trackpad`: force behavior or use the
      heuristic (default: `auto`).
    - `scroll_wheel_tick_detect_max_ms`: auto-mode promotion threshold (ms).
    - `scroll_wheel_like_max_duration_ms`: auto-mode fallback for
      1-event-per-tick terminals (ms).
    - `scroll_invert`: invert scroll direction (applies to wheel +
      trackpad).
    
    Config docs: `docs/config.md` and field docs in
    `codex-rs/core/src/config/types.rs`.
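
    The two per-event contribution formulas above can be checked with a small
    sketch. `ScrollKnobs` is a hypothetical struct that mirrors three of the
    `[tui]` keys; it is not the real config type, and treating a zero
    `scroll_events_per_tick` as 1 is an assumption.

    ```rust
    /// Hypothetical mirror of three `[tui]` scroll settings.
    pub struct ScrollKnobs {
        pub scroll_wheel_lines: u32,
        pub scroll_events_per_tick: u32,
        pub scroll_trackpad_lines: u32,
    }

    impl ScrollKnobs {
        /// Wheel-like: scroll_wheel_lines / scroll_events_per_tick.
        pub fn wheel_lines_per_event(&self) -> f64 {
            let per_tick = self.scroll_events_per_tick.max(1);
            f64::from(self.scroll_wheel_lines) / f64::from(per_tick)
        }

        /// Trackpad-like: scroll_trackpad_lines / min(scroll_events_per_tick, 3).
        pub fn trackpad_lines_per_event(&self) -> f64 {
            let divisor = self.scroll_events_per_tick.clamp(1, 3);
            f64::from(self.scroll_trackpad_lines) / f64::from(divisor)
        }
    }
    ```

    On a terminal that emits 9 events per notch, each wheel event contributes
    a third of a line (3 / 9), while the trackpad divisor stays capped at 3
    rather than growing to 9.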
    
    #### App Integration
    
    - The app schedules follow-up ticks to close idle streams (via
    `ScrollUpdate::next_tick_in` and `schedule_frame_in`) and finalizes
    streams on draw ticks.
      - `codex-rs/tui2/src/app.rs`
    
    #### Docs
    
    - Single doc of record describing the model + preserved probe
    findings/spec:
      - `codex-rs/tui2/docs/scroll_input_model.md`
    
    #### Other (jj-only friendliness)
    
    - `codex-rs/tui2/src/diff_render.rs`: prefer stable cwd-relative paths
    when the file is under the cwd even if there’s no `.git`.
    
    ### Terminal Defaults
    
    Per-terminal defaults are derived from scroll-probe logs (see doc).
    Notable:
    
    - Ghostty currently defaults to `scroll_events_per_tick = 3` even though
    logs measured ~9 in one setup. This is a deliberate stopgap; if your
    Ghostty build emits ~9 events per wheel notch, set:
    
      ```toml
      [tui]
      scroll_events_per_tick = 9
      ```
    
    ### Testing
    
    - `just fmt`
    - `just fix -p codex-core --allow-no-vcs`
    - `cargo test -p codex-core --lib` (pass)
    - `cargo test -p codex-tui2` (scroll tests pass; remaining failures are
    known flaky VT100 color tests in `insert_history`)
    
    ### Review Focus
    
    - Stream finalization + frame scheduling in `codex-rs/tui2/src/app.rs`.
    - Auto-mode promotion thresholds and the 1-event-per-tick fallback
    behavior.
    - Trackpad divisor cap (`min(events_per_tick, 3)`) and acceleration
    defaults.
    - Ghostty default tradeoff (3 vs ~9) and whether we should change it.
  • feat: if .codex is a sub-folder of a writable root, then make it read-only to the sandbox (#8088)
    In preparation for in-repo configuration support, this updates
    `WritableRoot::get_writable_roots_with_cwd()` to include the `.codex`
    subfolder in `WritableRoot.read_only_subpaths`, if it exists, as we
    already do for `.git`.
    
    As noted, `.codex` (like `.git`) is currently read-only only under
    macOS Seatbelt, but we plan to bring support to other OSes as well.
    
    Updated the integration test in `seatbelt.rs` so that it actually
    attempts to run the generated Seatbelt commands, verifying that:
    
    - trying to write to `.codex/config.toml` in a writable root fails
    - trying to write to `.git/hooks/pre-commit` in a writable root fails
    - trying to write to the writable root containing the `.codex` and
    `.git` subfolders succeeds
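
    The three checks above amount to a prefix test with carve-outs. The
    sketch below illustrates that logic only: `WritableRootSketch` is a
    hypothetical stand-in rather than the real `WritableRoot` type, it
    protects the named subfolders unconditionally instead of checking that
    they exist, and actual enforcement happens in the Seatbelt policy, not in
    Rust path checks.

    ```rust
    use std::path::{Path, PathBuf};

    /// Hypothetical stand-in for a writable root with read-only carve-outs.
    pub struct WritableRootSketch {
        pub root: PathBuf,
        pub read_only_subpaths: Vec<PathBuf>,
    }

    impl WritableRootSketch {
        /// Build a root whose named children (e.g. `.git`, `.codex`) are read-only.
        pub fn with_protected(root: impl Into<PathBuf>, names: &[&str]) -> Self {
            let root = root.into();
            let read_only_subpaths = names.iter().map(|name| root.join(name)).collect();
            Self { root, read_only_subpaths }
        }

        /// A path is writable if it is under the root and not under any
        /// read-only subpath. `Path::starts_with` compares whole components.
        pub fn is_path_writable(&self, path: &Path) -> bool {
            path.starts_with(&self.root)
                && !self
                    .read_only_subpaths
                    .iter()
                    .any(|read_only| path.starts_with(read_only))
        }
    }
    ```

    Because the comparison is component-wise, a sibling such as
    `.codex-backup` is not mistaken for a path inside `.codex`.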
  • docs: fix gpt-5.2 typo in config.md (#8079)
    Fix small typo in docs/config.md: `gpt5-2` -> `gpt-5.2`
  • Update config.md (#8066)
    Update supporting docs with the actual options
  • docs: document enabling experimental skills (#8024)
    ## Notes
    
    Skills are behind the experimental `skills` feature flag (disabled by
    default), but the skills guide didn't explain how to turn them on.
    
    - Add an explicit enable section to `docs/skills.md` (config +
    `--enable`)
    - Add the skills flag to `docs/config.md` and `docs/example-config.md`
    - Document the `/skills` slash command
  • docs: clarify xhigh reasoning effort on gpt-5.2 (#7911)
    ## Changes
    - Update config docs and example config comments to state that "xhigh"
    is supported on gpt-5.2 as well as gpt-5.1-codex-max
    - Adjust the FAQ model-support section to reflect broader xhigh
    availability
  • Fix toasts on Windows under WSL 2 (#7137)
    Before this: no notifications or toasts when using Codex CLI in WSL 2.
    
    After this: I get toasts from Codex
  • Removed experimental "command risk assessment" feature (#7799)
    This experimental feature received lukewarm reception during internal
    testing. Removing from the code base.
  • feat(tui2): add feature-flagged tui2 frontend (#7793)
    Introduce a new codex-tui2 crate that re-exports the existing
    interactive TUI surface and delegates run_main directly to codex-tui.
    This keeps behavior identical while giving tui2 its own crate for future
    viewport work.
    
    Wire the codex CLI to select the frontend via the tui2 feature flag.
    When the merged CLI overrides include features.tui2=true (e.g. via
    --enable tui2), interactive runs are routed through
    codex_tui2::run_main; otherwise they continue to use the original
    codex_tui::run_main.
    
    Register Feature::Tui2 in the core feature registry and add the tui2
    crate and dependency entries so the new frontend builds alongside the
    existing TUI.
    
    This is a stub that only wires up the feature flag for this.
    
    <img width="619" height="364" alt="image"
    src="https://github.com/user-attachments/assets/4893f030-932f-471e-a443-63fe6b5d8ed9"
    />
  • fix: refine the warning message and docs for deprecated tools config (#7685)
    Issue #7661 revealed that users are confused by deprecation warnings
    like:
    > `tools.web_search` is deprecated. Use `web_search_request` instead.
    
    This message misleadingly suggests renaming the config key from
    `web_search` to `web_search_request`, when the actual required change is
    to **move and rename the configuration from the `[tools]` section to the
    `[features]` section**.
    
    This PR clarifies the warning messages and documentation to make it
    clear that deprecated `[tools]` configurations should be moved to
    `[features]`. Changes made:
    - Updated deprecation warning format in `codex-rs/core/src/codex.rs:520`
    to include `[features].` prefix
    - Updated corresponding test expectations in
    `codex-rs/core/tests/suite/deprecation_notice.rs:39`
    - Improved documentation in `docs/config.md` to clarify upfront that
    `[tools]` options are deprecated in favor of `[features]`
  • fix(doc): TOML otel exporter example — multi-line inline table is inv… (#7669)
    …alid (#7668)
    
    The `otel` exporter example in `docs/config.md` is misleading and will
    cause
    the configuration parser to fail if copied verbatim.
    
    Summary
    -------
    The example uses a TOML inline table but spreads the inline-table braces
    across multiple lines. TOML inline tables must be contained on a single
    line
    (`key = { a = 1, b = 2 }`); placing newlines inside the braces triggers
    a
    parse error in most TOML parsers and prevents Codex from starting.
    
    Reproduction
    ------------
    1. Paste the snippet below into `~/.codex/config.toml` (or your project
    config).
    2. Run `codex` (or the command that loads the config).
    3. The process will fail to start with a TOML parse error similar to:
    
    ```text
    Error loading config.toml: TOML parse error at line 55, column 27
       |
    55 | exporter = { otlp-http = {
       |                           ^
    newlines are unsupported in inline tables, expected nothing
    ```
    
    Problematic snippet (as currently shown in the docs)
    ---------------------------------------------------
    ```toml
    [otel]
    exporter = { otlp-http = {
      endpoint = "https://otel.example.com/v1/logs",
      protocol = "binary",
      headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
    }}
    ```
    
    Recommended fixes
    ------------------
    ```toml
    [otel.exporter."otlp-http"]
    endpoint = "https://otel.example.com/v1/logs"
    protocol = "binary"
    
    [otel.exporter."otlp-http".headers]
    "x-otlp-api-key" = "${OTLP_TOKEN}"
    ```
    
    Or, keep an inline table but write it on one line (valid but less
    readable):
    
    ```toml
    [otel]
    exporter = { "otlp-http" = { endpoint = "https://otel.example.com/v1/logs", protocol = "binary", headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" } } }
    ```
  • docs: fix documentation of rmcp client flag (#7665)
    ## Summary
    - Updated the rmcp client flag's documentation in the `config.md` file
    - Changed it from `experimental_use_rmcp_client` to `rmcp_client`
  • Refactor execpolicy fallback evaluation (#7544)
    ## Refactor of the `execpolicy` crate
    
    To illustrate why we need this refactor, consider an agent attempting to
    run `apple | rm -rf ./`. Suppose `apple` is allowed by `execpolicy`.
    Before this PR, `execpolicy` would consider `apple` and `rm -rf ./` and only
    render one rule match: `Allow`. We would skip any heuristics checks on
    `rm -rf ./` and immediately approve `apple | rm -rf ./` to run.
    
    To fix this, we now thread a `fallback` evaluation function into
    `execpolicy` that runs when no `execpolicy` rules match a given command.
    In our example, we would run `fallback` on `rm -rf ./` and prevent
    `apple | rm -rf ./` from being run without approval.
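
    A rough sketch of the fallback idea follows. None of these names come
    from the `execpolicy` crate: `Decision`, `evaluate_pipeline`, and both
    closures are hypothetical, and the sketch assumes the strictest
    per-command decision wins for the whole pipeline.

    ```rust
    /// Possible outcomes, ordered from least to most strict (hypothetical).
    #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
    pub enum Decision {
        Allow,
        Prompt,
        Forbidden,
    }

    /// Evaluate every command in a pipeline. A command with no matching rule
    /// is handed to `fallback` instead of being silently skipped.
    pub fn evaluate_pipeline(
        commands: &[&str],
        rule_for: impl Fn(&str) -> Option<Decision>,
        fallback: impl Fn(&str) -> Decision,
    ) -> Decision {
        commands
            .iter()
            .map(|&cmd| rule_for(cmd).unwrap_or_else(|| fallback(cmd)))
            // Assumption: the strictest decision applies to the whole pipeline.
            .max()
            // Assumption: an empty pipeline is not auto-approved.
            .unwrap_or(Decision::Prompt)
    }
    ```

    For `apple | rm -rf ./`, a rule that allows `apple` no longer decides the
    outcome on its own: `rm -rf ./` has no rule, so the fallback heuristic
    runs on it and can require approval.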
  • whitelist command prefix integration in core and tui (#7033)
    This PR enables the TUI to approve commands and add their prefixes to an
    allowlist:
    <img width="708" height="605" alt="Screenshot 2025-11-21 at 4 18 07 PM"
    src="https://github.com/user-attachments/assets/56a19893-4553-4770-a881-becf79eeda32"
    />
    
    Note: we only show the option to whitelist the command when:
    1) the command is not multi-part (e.g. `git add -A && git commit -m 'hello
    world'`)
    2) the command is not already matched by an existing rule
  • Trim history.jsonl when history.max_bytes is set (#6242)
    This PR honors the `history.max_bytes` configuration parameter by
    trimming `history.jsonl` whenever it grows past the configured limit.
    While appending new entries we retain the newest record, drop the oldest
    lines to stay within the byte budget, and serialize the compacted file
    back to disk under the same lock to keep writers safe.
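
    The trimming rule can be sketched on an in-memory buffer. This is an
    illustration under assumptions, not the PR's implementation:
    `trim_history` is a hypothetical helper, it counts one byte per newline,
    and it leaves out the file locking and disk I/O entirely.

    ```rust
    /// Drop the oldest lines of `history` until it fits in `max_bytes`,
    /// always keeping the newest line even if it alone exceeds the budget.
    pub fn trim_history(history: &str, max_bytes: usize) -> String {
        let lines: Vec<&str> = history.lines().collect();
        let mut total = 0usize;
        let mut first_kept = lines.len();
        // Walk from newest to oldest, keeping lines while they fit.
        for (idx, line) in lines.iter().enumerate().rev() {
            let size = line.len() + 1; // +1 for the trailing newline
            let newest = first_kept == lines.len();
            if !newest && total + size > max_bytes {
                break;
            }
            total += size;
            first_kept = idx;
        }
        let mut out = String::with_capacity(total);
        for line in &lines[first_kept..] {
            out.push_str(line);
            out.push('\n');
        }
        out
    }
    ```

    Scanning from the newest record backward means the most recent entry is
    always retained and only the oldest lines are dropped.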
  • docs: clarify codex max defaults and xhigh availability (#7449)
    ## Summary
    Adds the missing `xhigh` reasoning level everywhere it should have been
    documented, and makes clear it only works with `gpt-5.1-codex-max`.
    
    ## Changes
    
    * `docs/config.md`

      * Add `xhigh` to the official list of reasoning levels with a note that
        `xhigh` is exclusive to Codex Max.

    * `docs/example-config.md`

      * Update the example comment adding `xhigh` as a valid option but only
        for Codex Max.

    * `docs/faq.md`

      * Update the model recommendation to `GPT-5.1 Codex Max`.
      * Mention that users can choose `high` or the newly documented `xhigh`
        level when using Codex Max.
  • Allow enterprises to skip upgrade checks and messages (#7213)
    This is a feature primarily for enterprises who centrally manage Codex
    updates.
  • Removed streamable_shell from docs (#7235)
    This config option no longer exists
    
    Addresses #7207