Commit Graph

568 Commits

  • Code mode on v8 (#15276)
    Moves Code Mode to a new crate with no dependencies on codex. This
    crate encodes the code mode semantics that we want for lifetime,
    mounting, and tool calling.
    
    The model-facing surface is mostly unchanged. `exec` still runs raw
    JavaScript, `wait` still resumes or terminates a `cell_id`, nested tools
    are still available through `tools.*`, and helpers like `text`, `image`,
    `store`, `load`, `notify`, `yield_control`, and `exit` still exist.
    
    The major change is underneath that surface:
    
    - Old code mode was an external Node runtime.
    - New code mode is an in-process V8 runtime embedded directly in Rust.
    - Old code mode managed cells inside a long-lived Node runner process.
    - New code mode manages cells in Rust, with one V8 runtime thread per
    active `exec`.
    - Old code mode used JSON protocol messages over child stdin/stdout plus
    Node worker-thread messages.
    - New code mode uses Rust channels and direct V8 callbacks/events.
    
    This PR also fixes the two migration regressions that fell out of that
    substrate change:
    
    - `wait { terminate: true }` now waits for the V8 runtime to actually
    stop before reporting termination.
    - synchronous top-level `exit()` now succeeds again instead of surfacing
    as a script error.
    
    ---
    
    - `core/src/tools/code_mode/*` is now mostly an adapter layer for the
    public `exec` / `wait` tools.
    - `code-mode/src/service.rs` owns cell sessions and async control flow
    in Rust.
    - `code-mode/src/runtime/*.rs` owns the embedded V8 isolate and
    JavaScript execution.
    - each `exec` spawns a dedicated runtime thread plus a Rust
    session-control task.
    - helper globals are installed directly into the V8 context instead of
    being injected through a source prelude.
    - helper modules like `tools.js` and `@openai/code_mode` are synthesized
    through V8 module resolution callbacks in Rust.
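    
    The thread-per-`exec` and channel-based control flow described above can
    be sketched roughly as follows. This is a minimal illustration using
    only the standard library, not the actual `code-mode` implementation:
    `Command`, `Event`, `Cell`, `spawn_cell`, and `terminate` are
    hypothetical names, and echoing the source string stands in for
    evaluating JavaScript in a V8 isolate.
    
    ```rust
    use std::sync::mpsc::{channel, Receiver, Sender};
    use std::thread::{self, JoinHandle};

    /// Hypothetical control messages sent to a runtime thread.
    enum Command {
        Run(String),
        Terminate,
    }

    /// Hypothetical events reported back by a runtime thread.
    #[derive(Debug, PartialEq)]
    enum Event {
        Output(String),
        Terminated,
    }

    /// Handle to one running cell: its channels plus the thread's join handle.
    struct Cell {
        commands: Sender<Command>,
        events: Receiver<Event>,
        handle: JoinHandle<()>,
    }

    /// Spawns one dedicated thread for a cell. A real runtime would own a V8
    /// isolate on this thread; here the thread just echoes what it is given.
    fn spawn_cell() -> Cell {
        let (cmd_tx, cmd_rx) = channel::<Command>();
        let (evt_tx, evt_rx) = channel::<Event>();
        let handle = thread::spawn(move || {
            for cmd in cmd_rx {
                match cmd {
                    Command::Run(src) => {
                        let _ = evt_tx.send(Event::Output(format!("ran: {src}")));
                    }
                    Command::Terminate => break,
                }
            }
            let _ = evt_tx.send(Event::Terminated);
        });
        Cell { commands: cmd_tx, events: evt_rx, handle }
    }

    /// Requests termination and returns only after the thread has stopped,
    /// then drains whatever events the thread emitted.
    fn terminate(cell: Cell) -> Vec<Event> {
        let _ = cell.commands.send(Command::Terminate);
        cell.handle.join().expect("runtime thread panicked");
        cell.events.try_iter().collect()
    }
    ```
    
    Joining the thread before reporting termination is the property the
    `wait { terminate: true }` fix is about: the caller never sees
    "terminated" while the runtime is still running.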
    
    ---
    
    Also added a benchmark for showing the speed of init and use of a code
    mode env:
    ```
    $ cargo bench -p codex-code-mode --bench exec_overhead -- --samples 30 --warm-iterations 25 --tool-counts 0,32,128
    Finished `bench` profile [optimized] target(s) in 0.18s
         Running benches/exec_overhead.rs (target/release/deps/exec_overhead-008c440d800545ae)
    exec_overhead: samples=30, warm_iterations=25, tool_counts=[0, 32, 128]
    scenario       tools samples    warmups      iters      mean/exec       p95/exec       rssΔ p50       rssΔ max
    cold_exec          0      30          0          1         1.13ms         1.20ms        8.05MiB        8.06MiB
    warm_exec          0      30          1         25       473.43us       512.49us      912.00KiB        1.33MiB
    cold_exec         32      30          0          1         1.03ms         1.15ms        8.08MiB        8.11MiB
    warm_exec         32      30          1         25       509.73us       545.76us      960.00KiB        1.30MiB
    cold_exec        128      30          0          1         1.14ms         1.19ms        8.30MiB        8.34MiB
    warm_exec        128      30          1         25       575.08us       591.03us      736.00KiB      864.00KiB
    memory uses a fresh-process max RSS delta for each scenario
    ```
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add realtime transcript notification in v2 (#15344)
    - emit a typed `thread/realtime/transcriptUpdated` notification from
    live realtime transcript deltas
    - expose that notification as flat `threadId`, `role`, and `text` fields
    instead of a nested transcript array
    - continue forwarding raw `handoff_request` items on
    `thread/realtime/itemAdded`, including the accumulated
    `active_transcript`
    - update app-server docs, tests, and generated protocol schema artifacts
    to match the delta-based payloads
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add v8-poc consumer of our new built v8 (#15203)
    This adds a dummy v8-poc project that in Cargo links against our
    prebuilt binaries and the ones provided by rusty_v8 for non musl
    platforms. This demonstrates that we can successfully link and use v8 on
    all platforms that we want to target.
    
    In bazel things are slightly more complicated. Since the libraries as
    published have libc++ linked in already we end up with a lot of double
    linked symbols if we try to use them in bazel land. Instead we fall back
    to building rusty_v8 and v8 from source (cached of course) on the
    platforms we ship to.
    
    There is likely some compatibility drift in the windows bazel builder
    that we'll need to reconcile before we can re-enable them. I'm happy to
    be on the hook to unwind that.
  • Add remote env CI matrix and integration test (#14869)
    `CODEX_TEST_REMOTE_ENV` will make `test_codex` start the executor
    "remotely" (inside a Docker container), turning any integration test into
    a remote test.
  • Initial plugins TUI menu - list and read only. tui + tui_app_server (#15215)
    ### Preliminary /plugins TUI menu
    - Adds a preliminary /plugins menu flow in both tui and tui_app_server.
    - Fetches plugin list data asynchronously and shows loading/error/cached
    states.
      - Limits this first pass to the curated ChatGPT marketplace.
      - Shows available plugins with installed/status metadata.
    - Supports in-menu search over plugin display name, plugin id, plugin
    name, and marketplace label.
    - Opens a plugin detail view on selection, including summaries for
    Skills, Apps, and MCP Servers, with back navigation.
    
    ### Testing
      - Launch codex-cli with plugins enabled (`--enable plugins`).
      - Run /plugins and verify:
          - loading state appears first
          - plugin list is shown
          - search filters results
          - selecting a plugin opens detail view, with a list of
            skills/connectors/MCP servers for the plugin
          - back action returns to the list.
      - Verify disabled behavior by running /plugins without plugins enabled
        (shows “Plugins are disabled” message).
      - Launch with `--enable tui_app_server` (and plugins enabled) and repeat
        the same /plugins flow; behavior should match.
  • Split features into codex-features crate (#15253)
    - Split the feature system into a new `codex-features` crate.
    - Cut `codex-core` and workspace consumers over to the new config and
    warning APIs.
    
    Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
    Co-authored-by: Codex <noreply@openai.com>
  • Move auth code into login crate (#15150)
    - Move the auth implementation and token data into codex-login.
    - Keep codex-core re-exporting that surface from codex-login for
    existing callers.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Refactor ExecServer filesystem split between local and remote (#15232)
    For each feature we have:
    1. Trait exposed on environment
    2. **Local Implementation** of the trait
    3. Remote implementation that uses the client to proxy via network
    4. Handler implementation that handles RPC requests and calls into
    **Local Implementation**
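    
    The four pieces listed above might fit together like this. It is a
    sketch under stated assumptions rather than the real exec-server code:
    `FileSystem`, `LocalFs`, `Handler`, and `RemoteFs` are hypothetical
    names, an in-memory map replaces the disk, and a direct reference to
    the handler replaces the network client.
    
    ```rust
    use std::collections::HashMap;

    /// 1. Hypothetical trait exposed on the environment.
    trait FileSystem {
        fn read(&self, path: &str) -> Result<Vec<u8>, String>;
    }

    /// 2. Local implementation, backed here by an in-memory map.
    struct LocalFs {
        files: HashMap<String, Vec<u8>>,
    }

    impl FileSystem for LocalFs {
        fn read(&self, path: &str) -> Result<Vec<u8>, String> {
            self.files
                .get(path)
                .cloned()
                .ok_or_else(|| format!("not found: {path}"))
        }
    }

    /// 4. Handler: receives a request and calls into the local implementation.
    struct Handler {
        local: LocalFs,
    }

    impl Handler {
        fn handle_read(&self, path: &str) -> Result<Vec<u8>, String> {
            self.local.read(path)
        }
    }

    /// 3. Remote implementation: proxies every call to the handler.
    /// The borrowed `Handler` stands in for a network client.
    struct RemoteFs<'a> {
        server: &'a Handler,
    }

    impl FileSystem for RemoteFs<'_> {
        fn read(&self, path: &str) -> Result<Vec<u8>, String> {
            self.server.handle_read(path)
        }
    }
    ```
    
    Callers that hold a `&dyn FileSystem` cannot tell which of the two
    implementations they were given.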
  • Move terminal module to terminal-detection crate (#15216)
    - Move core/src/terminal.rs and its tests into a standalone
    terminal-detection workspace crate.
    - Update direct consumers to depend on codex-terminal-detection and
    import terminal APIs directly.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add exec-server exec RPC implementation (#15090)
    Stacked PR 2/3, based on the stub PR.
    
    Adds the exec RPC implementation and process/event flow in exec-server
    only.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [hooks] use a user message > developer message for prompt continuation (#14867)
    ## Summary
    
    Persist Stop-hook continuation prompts as `user` messages instead of
    hidden `developer` messages + some requested integration tests
    
    This is a followup to @pakrym 's comment in
    https://github.com/openai/codex/pull/14532 to make sure stop-block
    continuation prompts match training for turn loops
    
    - Stop continuation now writes `<hook_prompt hook_run_id="...">stop
    hook's user prompt</hook_prompt>`
    - Introduces quick-xml dependency, though we already indirectly depended
    on it anyway via syntect
    - This PR only has about 500 lines of actual logic changes, the rest is
    tests/schema
    
    ## Testing
    
    Example run (with a sessionstart hook and 3 stop hooks) - this shows
    context added by session start, then two stop hooks sending their own
    additional prompts in a new turn. The model responds with a single
    message addressing both. When that turn ends, the hooks detect via
    `stop_hook_active` that they just ran and decide not to loop again
    
    test files for this (unzip, move codex -> .codex):
    [codex.zip](https://github.com/user-attachments/files/26075806/codex.zip)
    
    ```
    › cats
    
    
    • Running SessionStart hook: lighting the observatory
    
    SessionStart hook (completed)
      warning: Hi, I'm a session start hook for wizard-tower (startup).
      hook context: A wimboltine stonpet is an exotic cuisine from hyperspace
    
    • Cats are tiny zen wizards, my friend: equal parts nap, mystery, and chaos. If you want, we can talk
      cat facts, cat breeds, cat names, or build something cat-themed in this repo.
    
    • Running Stop hook: checking the tower wards
    
    • Running Stop hook: sacking the guards
    
    • Running Stop hook: hiring the guards
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook reviewed the completed reply (177 chars).
    
    Stop hook (blocked)
      warning: Wizard Tower Stop hook continuing conversation
      feedback: cook the stonpet
    
    Stop hook (blocked)
      warning: Wizard Tower Stop hook continuing conversation
      feedback: eat the cooked stonpet
    
    • Stonpet’s cooked, aloha style: flash-seared over a blue quasiflame, glazed with nebula salt, and
      rested until the hyperspace juices settle.
    
      Now we eat with gratitude, my friend. One mindful bite in, and the flavor is pure cosmic surf:
      smoky, bright, and totally out of this dimension.
    
    • Running Stop hook: checking the tower wards
    
    • Running Stop hook: sacking the guards
    
    • Running Stop hook: hiring the guards
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook reviewed the completed reply (285 chars).
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
    ```
  • Move environment abstraction into exec server (#15125)
    The idea is that codex-exec exposes an Environment struct with services
    on it. Each of those is a trait.
    
    Depending on the construction parameters passed to Environment, they are
    backed by either a local or a remote server, but core doesn't see these
    differences.
  • feat: add graph representation of agent network (#15056)
    Add a representation of the agent graph. This is now used for:
    * Cascade close agents (when I close a parent, it closes the kids)
    * Cascade resume (opposite)
    
    Later, this will also be used for post-compaction stuffing of the
    context
    
    Direct fix for: https://github.com/openai/codex/issues/14458
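    
    A rough sketch of cascade close and cascade resume over a parent/child
    graph. This is not the representation added by the PR: `AgentGraph` and
    its methods are hypothetical, agents are plain `u32` ids, and the graph
    is assumed to be a tree (no cycles).
    
    ```rust
    use std::collections::{HashMap, HashSet};

    /// Hypothetical agent graph: parent id -> child ids, plus the closed set.
    #[derive(Default)]
    struct AgentGraph {
        children: HashMap<u32, Vec<u32>>,
        closed: HashSet<u32>,
    }

    impl AgentGraph {
        fn add_child(&mut self, parent: u32, child: u32) {
            self.children.entry(parent).or_default().push(child);
        }

        /// Collects an agent and all of its descendants (iterative depth-first walk).
        fn subtree(&self, root: u32) -> Vec<u32> {
            let mut out = Vec::new();
            let mut stack = vec![root];
            while let Some(id) = stack.pop() {
                out.push(id);
                if let Some(kids) = self.children.get(&id) {
                    stack.extend(kids.iter().copied());
                }
            }
            out
        }

        /// Closing a parent closes every descendant as well.
        fn close(&mut self, root: u32) {
            for id in self.subtree(root) {
                self.closed.insert(id);
            }
        }

        /// Resuming a parent resumes every descendant as well.
        fn resume(&mut self, root: u32) {
            for id in self.subtree(root) {
                self.closed.remove(&id);
            }
        }

        fn is_closed(&self, id: u32) -> bool {
            self.closed.contains(&id)
        }
    }
    ```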
  • feat(core, tracing): create turn spans over websockets (#14632)
    ## Description
    
    Dependent on:
    - [responsesapi] https://github.com/openai/openai/pull/760991 
    - [codex-backend] https://github.com/openai/openai/pull/760985
    
    `codex app-server -> codex-backend -> responsesapi` now reuses a
    persistent websocket connection across many turns. This PR updates
    tracing when using websockets so that each `response.create` websocket
    request propagates the current tracing context, so we can get a holistic
    end-to-end trace for each turn.
    
    Tracing is propagated via special keys (`ws_request_header_traceparent`,
    `ws_request_header_tracestate`) set in the `client_metadata` param in
    Responses API.
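    
    As an illustration of the propagation described above, the metadata
    entries could be assembled like this. The two key names come from this
    description; the `trace_metadata` function and the use of a plain string
    map for `client_metadata` are assumptions of the sketch.
    
    ```rust
    use std::collections::BTreeMap;

    /// Builds the `client_metadata` entries that carry the current trace context.
    /// `tracestate` is optional, so its key is only set when a value is present.
    fn trace_metadata(traceparent: &str, tracestate: Option<&str>) -> BTreeMap<String, String> {
        let mut meta = BTreeMap::new();
        meta.insert(
            "ws_request_header_traceparent".to_string(),
            traceparent.to_string(),
        );
        if let Some(state) = tracestate {
            meta.insert(
                "ws_request_header_tracestate".to_string(),
                state.to_string(),
            );
        }
        meta
    }
    ```
    
    Building this map per `response.create` request, rather than once per
    connection, is what ties each trace to its own turn.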
    
    Currently tracing on websockets is a bit broken because we only set
    tracing context on ws connection time, so it's detached from a
    `turn/start` request.
  • Remove stdio transport from exec server (#15119)
    Summary
    - delete the deprecated stdio transport plumbing from the exec server
    stack
    - add a basic `exec_server()` harness plus test utilities to start a
    server, send requests, and await events
    - refresh exec-server dependencies, configs, and documentation to
    reflect the new flow
    
    Testing
    - Not run (not requested)
    
    ---------
    
    Co-authored-by: starr-openai <starr@openai.com>
    Co-authored-by: Codex <noreply@openai.com>
  • Add exec-server stub server and protocol docs (#15089)
    Stacked PR 1/3.
    
    This is the initialize-only exec-server stub slice: binary/client
    scaffolding and protocol docs, without exec/filesystem implementation.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Return image URL from view_image tool (#15072)
    Clean up image semantics in code mode.
    
    `view_image` now returns `{image_url:string, details?: string}` 
    
    `image()` now allows both string parameter and `{image_url:string,
    details?: string}`
  • Add FS abstraction and use in view_image (#14960)
    Adds an environment crate and environment + file system abstraction.
    
    Environment is a combination of attributes and services specific to
    the environment the agent is connected to:
    file system, process management, OS, and default shell.
    
    The goal is to move most of the agent logic that makes assumptions about
    the environment to work through the environment abstraction.
  • Revert tui code so it does not rely on in-process app server (#14899)
    PR https://github.com/openai/codex/pull/14512 added an in-process app
    server and started to wire up the tui to use it. We were originally
    planning to modify the `tui` code in place, converting it to use the app
    server a bit at a time using a hybrid adapter. We've since decided to
    create an entirely new parallel `tui_app_server` implementation and do
    the conversion all at once but retain the existing `tui` while we work
    the bugs out of the new implementation.
    
    This PR undoes the changes to the `tui` made in the PR #14512 and
    restores the old initialization to its previous state. This allows us to
    modify the `tui_app_server` without the risk of regressing the old `tui`
    code. For example, we can start to remove support for all legacy core
    events, like the ones that PR https://github.com/openai/codex/pull/14892
    needed to ignore.
    
    Testing:
    * I manually verified that the old `tui` starts and shuts down without a
    problem.
  • windows-sandbox: add runner IPC foundation for future unified_exec (#14139)
    # Summary
    
    This PR introduces the Windows sandbox runner IPC foundation that later
    unified_exec work will build on.
    
    The key point is that this is intentionally infrastructure-only. The new
    IPC transport, runner plumbing, and ConPTY helpers are added here, but
    the active elevated Windows sandbox path still uses the existing
    request-file bootstrap. In other words, this change prepares the
    transport and module layout we need for unified_exec without switching
    production behavior over yet.
    
    Part of this PR is also a source-layout cleanup: some Windows sandbox
    files are moved into more explicit `elevated/`, `conpty/`, and shared
    locations so it is clearer which code is for the elevated sandbox flow,
    which code is legacy/direct-spawn behavior, and which helpers are shared
    between them. That reorganization is intentional in this first PR so
    later behavioral changes do not also have to carry a large amount of
    file-move churn.
    
    # Why This Is Needed For unified_exec
    
    Windows elevated sandboxed unified_exec needs a long-lived,
    bidirectional control channel between the CLI and a helper process
    running under the sandbox user. That channel has to support:
    
    - starting a process and reporting structured spawn success/failure
    - streaming stdout/stderr back incrementally
    - forwarding stdin over time
    - terminating or polling a long-lived process
    - supporting both pipe-backed and PTY-backed sessions
    
    The existing elevated one-shot path is built around a request-file
    bootstrap and does not provide those primitives cleanly. Before we can
    turn on Windows sandbox unified_exec, we need the underlying runner
    protocol and transport layer that can carry those lifecycle events and
    streams.
    
    # Why Windows Needs More Machinery Than Linux Or macOS
    
    Linux and macOS can generally build unified_exec on top of the existing
    sandbox/process model: the parent can spawn the child directly, retain
    normal ownership of stdio or PTY handles, and manage the lifetime of the
    sandboxed process without introducing a second control process.
    
    Windows elevated sandboxing is different. To run inside the sandbox
    boundary, we cross into a different user/security context and then need
    to manage a long-lived process from outside that boundary. That means we
    need an explicit helper process plus an IPC transport to carry spawn,
    stdin, output, and exit events back and forth. The extra code here is
    mostly that missing Windows sandbox infrastructure, not a conceptual
    difference in unified_exec itself.
    
    # What This PR Adds
    
    - the framed IPC message types and transport helpers for parent <->
    runner communication
    - the renamed Windows command runner with both the existing request-file
    bootstrap and the dormant IPC bootstrap
    - named-pipe helpers for the elevated runner path
    - ConPTY helpers and process-thread attribute plumbing needed for
    PTY-backed sessions
    - shared sandbox/process helpers that later PRs will reuse when
    switching live execution paths over
    - early file/module moves so later PRs can focus on behavior rather than
    layout churn
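    
    To make "framed IPC" concrete, here is one common way to frame messages
    over a byte stream such as a named pipe. The wire format used by this PR
    is not described here, so the layout below (a 4-byte little-endian
    length prefix) and the names `write_frame`, `read_frame`, and
    `MAX_FRAME_LEN` are assumptions for illustration only.
    
    ```rust
    use std::io::{self, Read, Write};

    /// Hypothetical upper bound so a corrupt length prefix cannot force a huge allocation.
    const MAX_FRAME_LEN: usize = 16 * 1024 * 1024;

    fn too_large() -> io::Error {
        io::Error::new(io::ErrorKind::InvalidData, "frame exceeds MAX_FRAME_LEN")
    }

    /// Writes one frame: a 4-byte little-endian length followed by the payload bytes.
    fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
        if payload.len() > MAX_FRAME_LEN {
            return Err(too_large());
        }
        writer.write_all(&(payload.len() as u32).to_le_bytes())?;
        writer.write_all(payload)
    }

    /// Reads one frame written by `write_frame`; fails on EOF or an oversized length.
    fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
        let mut len_buf = [0u8; 4];
        reader.read_exact(&mut len_buf)?;
        let len = u32::from_le_bytes(len_buf) as usize;
        if len > MAX_FRAME_LEN {
            return Err(too_large());
        }
        let mut payload = vec![0u8; len];
        reader.read_exact(&mut payload)?;
        Ok(payload)
    }
    ```
    
    Framing like this lets spawn, stdin, output, and exit messages share a
    single bidirectional channel without the reader guessing where one
    message ends.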
    
    # What This PR Does Not Yet Do
    
    - it does not switch the active elevated one-shot path over to IPC yet
    - it does not enable Windows sandbox unified_exec yet
    - it does not remove the existing request-file bootstrap yet
    
    So while this code compiles and the new path has basic validation, it is
    not yet the exercised production path. That is intentional for this
    first PR: the goal here is to land the transport and runner foundation
    cleanly before later PRs start routing real command execution through
    it.
    
    # Follow-Ups
    
    Planned follow-up PRs will:
    
    1. switch elevated one-shot Windows sandbox execution to the new runner
    IPC path
    2. layer Windows sandbox unified_exec sessions on top of the same
    transport
    3. remove the legacy request-file path once the IPC-based path is live
    
    # Validation
    
    - `cargo build -p codex-windows-sandbox`
  • Move TUI on top of app server (parallel code) (#14717)
    This PR replicates the `tui` code directory and creates a temporary
    parallel `tui_app_server` directory. It also implements a new feature
    flag `tui_app_server` to select between the two tui implementations.
    
    Once the new app-server-based TUI is stabilized, we'll delete the old
    `tui` directory and feature flag.
  • app-server: add v2 filesystem APIs (#14245)
    Add a protocol-level filesystem surface to the v2 app-server so Codex
    clients can read and write files, inspect directories, and subscribe to
    path changes without relying on host-specific helpers.
    
    High-level changes:
    - define the new v2 fs/readFile, fs/writeFile, fs/createDirectory,
    fs/getMetadata, fs/readDirectory, fs/remove, fs/copy RPCs
    - implement the app-server handlers, including absolute-path validation,
    base64 file payloads, recursive copy/remove semantics
    - document the API, regenerate protocol schemas/types, and add
    end-to-end tests for filesystem operations, copy edge cases
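    
    The absolute-path validation mentioned above could look roughly like
    the following. `require_absolute` is a hypothetical helper, not the
    app-server's actual function, and the error is a plain string rather
    than a protocol error type.
    
    ```rust
    use std::path::{Path, PathBuf};

    /// Rejects relative paths so a request never depends on the server's
    /// current working directory.
    fn require_absolute(raw: &str) -> Result<PathBuf, String> {
        let path = Path::new(raw);
        if path.is_absolute() {
            Ok(path.to_path_buf())
        } else {
            Err(format!("path must be absolute: {raw}"))
        }
    }
    ```
    
    What counts as absolute is platform-specific: `Path::is_absolute`
    requires a drive or UNC prefix on Windows, while a leading `/` is
    enough on Unix.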
    
    Testing plan:
    - validate protocol serialization and generated schema output for the
    new fs request, response, and notification types
    - run app-server integration coverage for file and directory CRUD paths,
    metadata/readDirectory responses, copy failure modes, and absolute-path
    validation
  • Start TUI on embedded app server (#14512)
    This PR is part of the effort to move the TUI on top of the app server.
    In a previous PR, we introduced an in-process app server and moved
    `exec` on top of it.
    
    For the TUI, we want to do the migration in stages. The app server
    doesn't currently expose all of the functionality required by the TUI,
    so we're going to need to support a hybrid approach as we make the
    transition.
    
    This PR changes the TUI initialization to instantiate an in-process app
    server and access its `AuthManager` and `ThreadManager` rather than
    constructing its own copies. It also adds a placeholder TUI event
    handler that will eventually translate app server events into TUI
    events. App server notifications are accepted but ignored for now. It
    also adds proper shutdown of the app server when the TUI terminates.
  • client: extend custom CA handling across HTTPS and websocket clients (#14239)
    ## Stacked PRs
    
    This work is now effectively split across two steps:
    
    - #14178: add custom CA support for browser and device-code login flows,
    docs, and hermetic subprocess tests
    - #14239: extend that shared custom CA handling across Codex HTTPS
    clients and secure websocket TLS
    
    Note: #14240 was merged into this branch while it was stacked on top of
    this PR. This PR now subsumes that websocket follow-up and should be
    treated as the combined change.
    
    Builds on top of #14178.
    
    ## Problem
    
    Custom CA support landed first in the login path, but the real
    requirement is broader. Codex constructs outbound TLS clients in
    multiple places, and both HTTPS and secure websocket paths can fail
    behind enterprise TLS interception if they do not honor
    `CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently.
    
    This PR broadens the shared custom-CA logic beyond login and applies the
    same policy to websocket TLS, so the enterprise-proxy story is no longer
    split between “HTTPS works” and “websockets still fail”.
    
    ## What This Delivers
    
    Custom CA support is no longer limited to login. Codex outbound HTTPS
    clients and secure websocket connections can now honor the same
    `CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise
    proxy/intercept setups work more consistently end-to-end.
    
    For users and operators, nothing new needs to be configured beyond the
    same CA env vars introduced in #14178. The change is that more of Codex
    now respects them, including websocket-backed flows that were previously
    still using default trust roots.
    
    I also manually validated the proxy path locally with mitmproxy using:
    `CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem
    HTTPS_PROXY=http://127.0.0.1:8080 just codex`
    with mitmproxy installed via `brew install mitmproxy` and configured as
    the macOS system proxy.
    
    ## Mental model
    
    `codex-client` is now the owner of shared custom-CA policy for outbound
    TLS client construction. Reqwest callers start from the builder
    configuration they already need, then pass that builder through
    `build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the
    same module for a rustls client config when a custom CA bundle is
    configured.
    
    The env precedence is the same everywhere:
    - `CODEX_CA_CERTIFICATE` wins
    - otherwise fall back to `SSL_CERT_FILE`
    - otherwise use system roots
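    
    The precedence above can be written as a small pure function. This is
    a sketch, not the `codex-client` code: `CaSource` and
    `select_ca_source` are hypothetical names, and treating an empty value
    as unset is an assumption.
    
    ```rust
    /// Which source supplies the CA bundle (hypothetical type).
    #[derive(Debug, PartialEq)]
    enum CaSource {
        CodexCaCertificate(String),
        SslCertFile(String),
        SystemRoots,
    }

    /// Applies the precedence: `CODEX_CA_CERTIFICATE`, then `SSL_CERT_FILE`,
    /// then system roots. Takes a lookup function so it can be exercised
    /// without mutating process-wide environment variables.
    fn select_ca_source(lookup: impl Fn(&str) -> Option<String>) -> CaSource {
        // Assumption of this sketch: an empty value counts as unset.
        let get = |key: &str| lookup(key).filter(|value| !value.is_empty());
        if let Some(path) = get("CODEX_CA_CERTIFICATE") {
            CaSource::CodexCaCertificate(path)
        } else if let Some(path) = get("SSL_CERT_FILE") {
            CaSource::SslCertFile(path)
        } else {
            CaSource::SystemRoots
        }
    }
    ```
    
    In a real process the lookup would be
    `select_ca_source(|key| std::env::var(key).ok())`.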
    
    The helper is intentionally narrow. It loads every usable certificate
    from the configured PEM bundle into the appropriate root store and
    returns either a configured transport or a typed error that explains
    what went wrong.
    
    ## Non-goals
    
    This does not add handshake-level integration tests against a live TLS
    endpoint. It does not validate that the configured bundle forms a
    meaningful certificate chain. It also does not try to force every
    transport in the repo through one abstraction; it extends the shared CA
    policy across the reqwest and websocket paths that actually needed it.
    
    ## Tradeoffs
    
    The main tradeoff is centralizing CA behavior in `codex-client` while
    still leaving adoption up to call sites. That keeps the implementation
    additive and reviewable, but it means the rule "outbound Codex TLS that
    should honor enterprise roots must use the shared helper" is still
    partly enforced socially rather than by types.
    
    For websockets, the shared helper only builds an explicit rustls config
    when a custom CA bundle is configured. When no override env var is set,
    websocket callers still use their ordinary default connector path.
    
    ## Architecture
    
    `codex-client::custom_ca` now owns CA bundle selection, PEM
    normalization, mixed-section parsing, certificate extraction, typed
    CA-loading errors, and optional rustls client-config construction for
    websocket TLS.
    
    The affected consumers now call into that shared helper directly rather
    than carrying login-local CA behavior:
    - backend-client
    - cloud-tasks
    - RMCP client paths that use `reqwest`
    - TUI voice HTTP paths
    - `codex-core` default reqwest client construction
    - `codex-api` websocket clients for both responses and realtime
    websocket connections
    
    The subprocess CA probe, env-sensitive integration tests, and shared PEM
    fixtures also live in `codex-client`, which is now the actual owner of
    the behavior they exercise.
    
    ## Observability
    
    The shared CA path logs:
    - which environment variable selected the bundle
    - which path was loaded
    - how many certificates were accepted
    - when `TRUSTED CERTIFICATE` labels were normalized
    - when CRLs were ignored
    - where client construction failed
    
    Returned errors remain user-facing and include the relevant env var,
    path, and remediation hint. That same error model now applies whether
    the failure surfaced while building a reqwest client or websocket TLS
    configuration.
    
    ## Tests
    
    Pure unit tests in `codex-client` cover env precedence and PEM
    normalization behavior. Real client construction remains in subprocess
    tests so the suite can control process env and avoid the macOS seatbelt
    panic path that motivated the hermetic test split.
    
    The subprocess coverage verifies:
    - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
    - fallback to `SSL_CERT_FILE`
    - single-cert and multi-cert bundles
    - malformed and empty-file errors
    - OpenSSL `TRUSTED CERTIFICATE` handling
    - CRL tolerance for well-formed CRL sections
    
    The websocket side is covered by the existing `codex-api` / `codex-core`
    websocket test suites plus the manual mitmproxy validation above.
    
    ---------
    
    Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
    Co-authored-by: Codex <noreply@openai.com>
  • login: add custom CA support for login flows (#14178)
    ## Stacked PRs
    
    This work is split across three stacked PRs:
    
    - #14178: add custom CA support for browser and device-code login flows,
    docs, and hermetic subprocess tests
    - #14239: broaden the shared custom CA path from login to other outbound
    `reqwest` clients across Codex
    - #14240: extend that shared custom CA handling to secure websocket TLS
    so websocket connections honor the same CA env vars
    
    Review order: #14178, then #14239, then #14240.
    
    Supersedes #6864.
    
    Thanks to @3axap4eHko for the original implementation and investigation
    here. Although this version rearranges the code and history
    significantly, the majority of the credit for this work belongs to them.
    
    ## Problem
    
    Login flows need to work in enterprise environments where outbound TLS
    is intercepted by an internal proxy or gateway. In those setups, system
    root certificates alone are often insufficient to validate the OAuth and
    device-code endpoints used during login. The change adds a
    login-specific custom CA loading path, but the important contracts
    around env precedence, PEM compatibility, test boundaries, and
    probe-only workarounds need to be explicit so reviewers can understand
    what behavior is intentional.
    
    For users and operators, the behavior is simple: if login needs to trust
    a custom root CA, set `CODEX_CA_CERTIFICATE` to a PEM file containing
    one or more certificates. If that variable is unset, login falls back to
    `SSL_CERT_FILE`. If neither is set, login uses system roots. Invalid or
    empty PEM files now fail with an error that points back to those
    environment variables and explains how to recover.
    
    ## What This Delivers
    
    Users can now make Codex login work behind enterprise TLS interception
    by pointing `CODEX_CA_CERTIFICATE` at a PEM bundle containing the
    relevant root certificates. If that variable is unset, login falls back
    to `SSL_CERT_FILE`, then to system roots.
    
    This PR applies that behavior to both browser-based and device-code
    login flows. It also makes login tolerant of the PEM shapes operators
    actually have in hand: multi-certificate bundles, OpenSSL `TRUSTED
    CERTIFICATE` labels, and bundles that include well-formed CRLs.
    
    ## Mental model
    
    `codex-login` is the place where the login flows construct ad hoc
    outbound HTTP clients. That makes it the right boundary for a narrow CA
    policy: look for `CODEX_CA_CERTIFICATE`, fall back to `SSL_CERT_FILE`,
    load every parseable certificate block in that bundle into a
    `reqwest::Client`, and fail early with a clear user-facing error if the
    bundle is unreadable or malformed.
    
    The implementation is intentionally pragmatic about PEM input shape. It
    accepts ordinary certificate bundles, multi-certificate bundles, OpenSSL
    `TRUSTED CERTIFICATE` labels, and bundles that also contain CRLs. It
    does not validate a certificate chain or prove a handshake; it only
    constructs the root store used by login.
    
    ## Non-goals
    
    This change does not introduce a general-purpose transport abstraction
    for the rest of the product. It does not validate whether the provided
    bundle forms a real chain, and it does not add handshake-level
    integration tests against a live TLS server. It also does not change
    login state management or OAuth semantics beyond ensuring the existing
    flows share the same CA-loading rules.
    
    ## Tradeoffs
    
    The main tradeoff is keeping this logic scoped to login-specific client
    construction rather than lifting it into a broader shared HTTP layer.
    That keeps the review surface smaller, but it also means future
    login-adjacent code must continue to use `build_login_http_client()` or
    it can silently bypass enterprise CA overrides.
    
    The `TRUSTED CERTIFICATE` handling is also intentionally a local
    compatibility shim. The rustls ecosystem does not currently accept that
    PEM label upstream, so the code normalizes it locally and trims the
    OpenSSL `X509_AUX` trailer bytes down to the certificate DER that
    `reqwest` can consume.
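
    A minimal sketch of the two steps described above, under stated
    assumptions: the function names are hypothetical, the label rewrite is a
    plain line match, and the trim assumes the certificate is the first DER
    element in the decoded payload with the `X509_AUX` data following it. It
    does not reproduce the actual shim.

    ```rust
    /// Rewrites OpenSSL `TRUSTED CERTIFICATE` PEM boundary lines to plain
    /// `CERTIFICATE` and reports how many boundary lines were changed.
    fn normalize_trusted_labels(pem: &str) -> (String, usize) {
        let mut rewritten = 0;
        let mut out = String::with_capacity(pem.len());
        for line in pem.lines() {
            match line.trim_end() {
                "-----BEGIN TRUSTED CERTIFICATE-----" => {
                    out.push_str("-----BEGIN CERTIFICATE-----");
                    rewritten += 1;
                }
                "-----END TRUSTED CERTIFICATE-----" => {
                    out.push_str("-----END CERTIFICATE-----");
                    rewritten += 1;
                }
                _ => out.push_str(line),
            }
            out.push('\n');
        }
        (out, rewritten)
    }

    /// Returns the length of the first DER element (tag, length, contents).
    /// Bytes past that offset are treated as the trailer to drop.
    fn first_der_element_len(der: &[u8]) -> Option<usize> {
        // A certificate is a constructed SEQUENCE, tag 0x30.
        if *der.first()? != 0x30 {
            return None;
        }
        let first_len_byte = *der.get(1)?;
        let (header_len, content_len) = if first_len_byte & 0x80 == 0 {
            // Short form: the byte itself is the content length.
            (2usize, first_len_byte as usize)
        } else {
            // Long form: the low 7 bits count the big-endian length bytes.
            let n = (first_len_byte & 0x7f) as usize;
            if n == 0 || n > 4 {
                return None; // indefinite or implausibly large length
            }
            let len_bytes = der.get(2..2 + n)?;
            let len = len_bytes
                .iter()
                .fold(0usize, |acc, b| (acc << 8) | (*b as usize));
            (2 + n, len)
        };
        let total = header_len.checked_add(content_len)?;
        if total <= der.len() {
            Some(total)
        } else {
            None
        }
    }

    /// Keeps only the leading certificate DER, dropping any trailing bytes.
    fn trim_aux_trailer(der: &[u8]) -> Option<&[u8]> {
        first_der_element_len(der).map(|len| &der[..len])
    }
    ```

    Reading only the outer tag and length keeps the shim narrow: it never
    interprets the certificate contents, which matches the stated non-goal of
    not validating chains.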
    
    ## Architecture
    
    `custom_ca.rs` is now the single place that owns login CA behavior. It
    selects the CA file from the environment, reads it, normalizes PEM label
    shape where needed, iterates mixed PEM sections with `rustls-pki-types`,
    ignores CRLs, trims OpenSSL trust metadata when necessary, and returns
    either a configured `reqwest::Client` or a typed error.
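
    To make the "mixed PEM sections" step concrete, here is a standard-library
    sketch that classifies sections by label. The real code iterates with
    `rustls-pki-types`; the enum, the function names, and the `Other` bucket
    for labels this description does not mention are assumptions made for
    illustration.

    ```rust
    /// What a loader does with a PEM section, keyed by its label.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum SectionAction {
        /// Register the section as a root certificate.
        LoadCertificate,
        /// Well-formed CRLs are tolerated but not used.
        IgnoreCrl,
        /// Any other label; this sketch takes no position on these.
        Other,
    }

    fn classify_label(label: &str) -> SectionAction {
        match label {
            "CERTIFICATE" | "TRUSTED CERTIFICATE" => SectionAction::LoadCertificate,
            "X509 CRL" => SectionAction::IgnoreCrl,
            _ => SectionAction::Other,
        }
    }

    /// Extracts the label from each `-----BEGIN <label>-----` line, in order.
    fn section_labels(pem: &str) -> Vec<&str> {
        pem.lines()
            .filter_map(|line| {
                line.trim()
                    .strip_prefix("-----BEGIN ")?
                    .strip_suffix("-----")
            })
            .collect()
    }
    ```

    Counting the `LoadCertificate` sections gives the "how many certificates
    were accepted" figure that the observability notes mention.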
    
    The browser login server and the device-code flow both call
    `build_login_http_client()`, so they share the same trust-store policy.
    Environment-sensitive tests run through the `login_ca_probe` helper
    binary because those tests must control process-wide env vars and cannot
    reliably build a real reqwest client in-process on macOS seatbelt runs.
    
    ## Observability
    
    The custom CA path logs which environment variable selected the bundle,
    which file path was loaded, how many certificates were accepted, when
    `TRUSTED CERTIFICATE` labels were normalized, when CRLs were ignored,
    and where client construction failed. Returned errors remain user-facing
    and include the relevant path, env var, and remediation hint.
    
    This gives enough signal for three audiences:
    - users can see why login failed and which env/file caused it
    - sysadmins can confirm which override actually won
    - developers can tell whether the failure happened during file read, PEM
    parsing, certificate registration, or final reqwest client construction
    
    ## Tests
    
    Pure unit tests stay limited to env precedence and empty-value handling.
    Real client construction lives in subprocess tests so the suite remains
    hermetic with respect to process env and macOS sandbox behavior.
    
    The subprocess tests verify:
    - `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
    - fallback to `SSL_CERT_FILE`
    - single-certificate and multi-certificate bundles
    - malformed and empty-bundle errors
    - OpenSSL `TRUSTED CERTIFICATE` handling
    - CRL tolerance for well-formed CRL sections
    
    The named PEM fixtures under `login/tests/fixtures/` are shared by the
    tests so their purpose stays reviewable.
    
    ---------
    
    Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
    Co-authored-by: Codex <noreply@openai.com>
  • Cleanup code_mode tool descriptions (#14480)
    Move to separate files and clarify a bit.
  • Fix stdio-to-uds peer-close flake (#13882)
    ## What changed
    - `codex-stdio-to-uds` now tolerates `NotConnected` when
    `shutdown(Write)` happens after the peer has already closed.
    - The socket test was rewritten to send stdin from a fixture file and to
    read an exact request payload length instead of waiting on EOF timing.
    
    ## Why this fixes the flake
    - This one exposed a real cross-platform runtime edge case: on macOS,
    the peer can close first after a successful exchange, and
    `shutdown(Write)` can report `NotConnected` even though the interaction
    already succeeded.
    - Treating that specific ordering as a harmless shutdown condition
    removes the production-level false failure.
    - The old test compounded the problem by depending on EOF timing, which
    varies by platform and scheduler. Exact-length IO makes the test
    deterministic and focused on the actual data exchange.
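
    Both halves of the fix can be sketched with the standard library alone.
    The helper names below are hypothetical and the real
    `codex-stdio-to-uds` code may be structured differently; the sketch only
    shows the error-kind match and the exact-length read.

    ```rust
    use std::io::{self, ErrorKind, Read};

    /// Maps the result of `shutdown(Write)` so that a peer that already
    /// closed (`NotConnected`) counts as a clean shutdown. Every other error
    /// still propagates.
    fn tolerate_peer_close(result: io::Result<()>) -> io::Result<()> {
        match result {
            Err(err) if err.kind() == ErrorKind::NotConnected => Ok(()),
            other => other,
        }
    }

    /// Reads exactly `len` bytes instead of reading until EOF, so the caller
    /// does not depend on when the peer closes its side.
    fn read_exact_payload<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
        let mut buf = vec![0u8; len];
        reader.read_exact(&mut buf)?;
        Ok(buf)
    }
    ```

    A call site would look like
    `tolerate_peer_close(stream.shutdown(std::net::Shutdown::Write))?`, which
    keeps the tolerance limited to that one ordering.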
    
    ## Scope
    - Production logic change with matching test rewrite.
  • [apps] Add tool_suggest tool. (#14287)
    - [x] Add tool_suggest tool.
    - [x] Move chatgpt/src/connectors.rs and core/src/connectors.rs into a
    dedicated mod so that we have all the logic and global cache in one
    place.
    - [x] Update TUI app link view to support rendering the installation
    view for mcp elicitation.
    
    ---------
    
    Co-authored-by: Shaqayeq <shaqayeq@openai.com>
    Co-authored-by: Eric Traut <etraut@openai.com>
    Co-authored-by: pakrym-oai <pakrym@openai.com>
    Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
    Co-authored-by: guinness-oai <guinness@openai.com>
    Co-authored-by: Eugene Brevdo <ebrevdo@users.noreply.github.com>
    Co-authored-by: Charlie Guo <cguo@openai.com>
    Co-authored-by: Fouad Matin <fouad@openai.com>
    Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
    Co-authored-by: xl-openai <xl@openai.com>
    Co-authored-by: alexsong-oai <alexsong@openai.com>
    Co-authored-by: Owen Lin <owenlin0@gmail.com>
    Co-authored-by: sdcoffey <stevendcoffey@gmail.com>
    Co-authored-by: Codex <noreply@openai.com>
    Co-authored-by: Won Park <won@openai.com>
    Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
    Co-authored-by: celia-oai <celia@openai.com>
    Co-authored-by: gabec-openai <gabec@openai.com>
    Co-authored-by: joeytrasatti-openai <joey.trasatti@openai.com>
    Co-authored-by: Leo Shimonaka <leoshimo@openai.com>
    Co-authored-by: Rasmus Rygaard <rasmus@openai.com>
    Co-authored-by: maja-openai <163171781+maja-openai@users.noreply.github.com>
    Co-authored-by: pash-openai <pash@openai.com>
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • feat(app-server): propagate traces across tasks and core ops (#14387)
    ## Summary
    
    This PR keeps app-server RPC request trace context alive for the full
    lifetime of the work that request kicks off (e.g. for `thread/start`,
    this is `app-server rpc handler -> tokio background task -> core op
    submissions`). Previously, trace lineage was lost once the request handler
    returned or handed work off to background tasks.
    
    This approach is especially relevant for `thread/start` and other RPC
    handlers that run in a non-blocking way. In the near future we'll most
    likely want to make all app-server handlers run in a non-blocking way by
    default, and only queue operations that must operate in order (e.g.
    thread RPCs per thread?), so we want to make sure tracing in app-server
    just generally works.
    
    Depends on https://github.com/openai/codex/pull/14300
    
    **Before**
    <img width="155" height="207" alt="image"
    src="https://github.com/user-attachments/assets/c9487459-36f1-436c-beb7-fafeb40737af"
    />
    
    
    **After**
    <img width="299" height="337" alt="image"
    src="https://github.com/user-attachments/assets/727392b2-d072-4427-9dc4-0502d8652dea"
    />
    
    ## What changed
    
    - Keep request-scoped trace context around until we send the final
    response or error, or the connection closes.
    - Thread that trace context through detached `thread/start` work so
    background startup stays attached to the originating request.
    - Pass request trace context through to downstream core operations,
    including:
      - thread creation
      - resume/fork flows
      - turn submission
      - review
      - interrupt
      - realtime conversation operations
    - Add tracing tests that verify:
      - remote W3C trace context is preserved for `thread/start`
      - remote W3C trace context is preserved for `turn/start`
      - downstream core spans stay under the originating request span
      - request-scoped tracing state is cleaned up correctly
    - Clean up shutdown behavior so detached background tasks and spawned
    threads are drained before process exit.
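
    For readers unfamiliar with the "remote W3C trace context" the tests
    refer to, the sketch below parses a version-00 `traceparent` header using
    only the standard library. It illustrates the header format, not the
    app-server implementation; `TraceParent` and `parse_traceparent` are
    hypothetical names.

    ```rust
    /// Fields of a W3C `traceparent` header (version 00).
    #[derive(Debug, PartialEq, Eq)]
    struct TraceParent {
        trace_id: String,
        parent_id: String,
        sampled: bool,
    }

    fn is_lower_hex(s: &str, len: usize) -> bool {
        s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
    }

    /// Parses `00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>`.
    fn parse_traceparent(header: &str) -> Option<TraceParent> {
        let mut parts = header.trim().split('-');
        let version = parts.next()?;
        let trace_id = parts.next()?;
        let parent_id = parts.next()?;
        let flags = parts.next()?;
        // Version 00 has exactly four fields.
        if version != "00" || parts.next().is_some() {
            return None;
        }
        if !is_lower_hex(trace_id, 32) || !is_lower_hex(parent_id, 16) || !is_lower_hex(flags, 2) {
            return None;
        }
        // All-zero identifiers are invalid.
        if trace_id.bytes().all(|b| b == b'0') || parent_id.bytes().all(|b| b == b'0') {
            return None;
        }
        let flag_bits = u8::from_str_radix(flags, 16).ok()?;
        Some(TraceParent {
            trace_id: trace_id.to_string(),
            parent_id: parent_id.to_string(),
            // Bit 0 of the flags byte is the "sampled" flag.
            sampled: (flag_bits & 0x01) == 1,
        })
    }
    ```

    Keeping the parsed trace ID with the request state until the final
    response is sent is what lets later spans, including those created in
    background tasks, attach to the same trace.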
  • start of hooks engine (#13276)
    (Experimental)
    
    This PR adds a first MVP for hooks, with SessionStart and Stop.
    
    The core design is:
    
    - hooks live in a dedicated engine under codex-rs/hooks
    - each hook type has its own event-specific file
    - hook execution is synchronous and blocks normal turn progression while
    running
    - matching hooks run in parallel, then their results are aggregated into
    a normalized HookRunSummary
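
    The "run matching hooks in parallel, block, then aggregate" shape can be
    sketched with scoped threads. Everything here is hypothetical: the `Hook`
    alias, the `RunSummary` stand-in (the fields of the real `HookRunSummary`
    are not described above), and the sample hooks exist only to make the
    sketch self-contained.

    ```rust
    use std::thread;

    /// A hook is modeled as a plain function pointer in this sketch.
    type Hook = fn() -> Result<(), String>;

    /// Stand-in for a normalized summary; the fields are assumptions.
    #[derive(Debug, Default, PartialEq, Eq)]
    struct RunSummary {
        completed: usize,
        failed: Vec<String>,
    }

    /// Runs every hook on its own thread and blocks until all have finished,
    /// then folds the individual outcomes into one summary.
    fn run_matching_hooks(hooks: &[(&str, Hook)]) -> RunSummary {
        let outcomes: Vec<(String, Result<(), String>)> = thread::scope(|scope| {
            let handles: Vec<_> = hooks
                .iter()
                .map(|&(name, hook)| scope.spawn(move || (name.to_string(), hook())))
                .collect();
            // Joining in spawn order keeps the aggregated output deterministic.
            handles
                .into_iter()
                .map(|handle| handle.join().expect("hook thread panicked"))
                .collect()
        });

        let mut summary = RunSummary::default();
        for (name, result) in outcomes {
            match result {
                Ok(()) => summary.completed += 1,
                Err(reason) => summary.failed.push(format!("{}: {}", name, reason)),
            }
        }
        summary
    }

    // Sample hooks used only to exercise the sketch.
    fn ok_hook() -> Result<(), String> {
        Ok(())
    }

    fn failing_hook() -> Result<(), String> {
        Err("exit status 1".to_string())
    }
    ```

    Because `thread::scope` does not return until every spawned thread has
    been joined, the caller is blocked for the duration of the slowest hook,
    which mirrors the synchronous behavior described above.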
    
    On the AppServer side, hooks are exposed as operational metadata rather
    than transcript-native items:
    
    - new live notifications: hook/started, hook/completed
    - persisted/replayed hook results live on Turn.hookRuns
    - we intentionally did not add hook-specific ThreadItem variants
    
    Hook messages are not persisted; they remain ephemeral. The context
    changes they add are persisted, because they get appended to the user's
    prompt.
  • codex-rs/app-server: add health endpoints for --listen websocket server (#13782)
    Healthcheck endpoints for the websocket server
    
    - serve `GET /readyz` and `GET /healthz` from the same listener used for
    `--listen ws://...`
    - switch the websocket listener over to `axum` upgrade handling instead
    of manual socket parsing
    - add websocket transport coverage for the health endpoints and document
    the new behavior
    
    Testing
    - integration tests
    - built and tested e2e
    
    ```
    > curl -i http://127.0.0.1:9234/readyz
    HTTP/1.1 200 OK
    content-length: 0
    date: Fri, 06 Mar 2026 19:20:23 GMT
    
    >  curl -i http://127.0.0.1:9234/healthz
    HTTP/1.1 200 OK
    content-length: 0
    date: Fri, 06 Mar 2026 19:20:24 GMT
    ```
  • Add in-process app server and wire up exec to use it (#14005)
    This is a subset of PR #13636. See that PR for a full overview of the
    architectural change.
    
    This PR implements the in-process app server and modifies the
    non-interactive "exec" entry point to use the app server.
    
    ---------
    
    Co-authored-by: Felipe Coury <felipe.coury@gmail.com>
  • app-server: Add streaming and tty/pty capabilities to command/exec (#13640)
    * Add the ability to stream stdin, stdout, and stderr
    * Cap the total number of bytes streamed from stdout and stderr, with a
    configurable limit that can be disabled
    * Add support for overriding environment variables
    * Add the ability to terminate running applications (using
    `command/exec/terminate`)
    * Add TTY/PTY support, including resizing the terminal (using
    `command/exec/resize`)
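
    One way to picture the output cap is a small byte budget that each
    streamed chunk is checked against. This is a sketch under assumptions:
    `OutputCap` is a hypothetical type, `None` models the disabled cap, and
    whether the real budget is shared between stdout and stderr is not stated
    above.

    ```rust
    /// Tracks how many streamed bytes may still be forwarded.
    /// `None` means the cap is disabled.
    struct OutputCap {
        remaining: Option<usize>,
    }

    impl OutputCap {
        fn new(limit: Option<usize>) -> Self {
            OutputCap { remaining: limit }
        }

        /// Returns the prefix of `chunk` that still fits under the cap and
        /// deducts it from the remaining budget.
        fn admit<'a>(&mut self, chunk: &'a [u8]) -> &'a [u8] {
            match self.remaining.as_mut() {
                None => chunk,
                Some(remaining) => {
                    let take = chunk.len().min(*remaining);
                    *remaining -= take;
                    &chunk[..take]
                }
            }
        }

        /// True once a finite budget has been fully spent.
        fn exhausted(&self) -> bool {
            self.remaining == Some(0)
        }
    }
    ```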
  • feat: add auth login diagnostics (#13797)
    ## Problem
    
    Browser login failures historically leave support with an incomplete
    picture. HARs can show that the browser completed OAuth and reached the
    localhost callback, but they do not explain why the native client failed
    on the final `/oauth/token` exchange. Direct `codex login` also relied
    mostly on terminal stderr and the browser error page, so even when the
    login crate emitted better sign-in diagnostics through TUI or app-server
    flows, the one-shot CLI path still did not leave behind an easy artifact
    to collect.
    
    ## Mental model
    
    This implementation treats the browser page, the returned `io::Error`,
    and the normal structured log as separate surfaces with different safety
    requirements. The browser page and returned error preserve the detail
    that operators need to diagnose failures. The structured log stays
    narrower: it records reviewed lifecycle events, parsed safe fields, and
    redacted transport errors without becoming a sink for secrets or
    arbitrary backend bodies.
    
    Direct `codex login` now adds a fourth support surface: a small
    file-backed log at `codex-login.log` under the configured `log_dir`.
    That artifact carries the same login-target events as the other
    entrypoints without changing the existing stderr/browser UX.
    
    ## Non-goals
    
    This does not add auth logging to normal runtime requests, and it does
    not try to infer precise transport root causes from brittle string
    matching. The scope remains the browser-login callback flow in the
    `login` crate plus a direct-CLI wrapper that persists those events to
    disk.
    
    This also does not try to reuse the TUI logging stack wholesale. The TUI
    path initializes feedback, OpenTelemetry, and other session-oriented
    layers that are useful for an interactive app but unnecessary for a
    one-shot login command.
    
    ## Tradeoffs
    
    The implementation favors fidelity for caller-visible errors and
    restraint for persistent logs. Parsed JSON token-endpoint errors are
    logged safely by field. Non-JSON token-endpoint bodies remain available
    to the returned error so CLI and browser surfaces still show backend
    detail. Transport errors keep their real `reqwest` message, but attached
    URLs are surgically redacted. Custom issuer URLs are sanitized before
    logging.
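
    As a rough illustration of URL redaction before logging, the sketch below
    strips credentials, the query string, and the fragment from a URL-like
    string using plain string operations. The function name and the
    `<redacted>` placeholder are assumptions; the actual rules in the `login`
    crate are not spelled out above and may differ.

    ```rust
    /// Removes userinfo, query, and fragment from a URL-like string, keeping
    /// scheme, host, and path so the log line stays useful.
    fn redact_url(url: &str) -> String {
        // Drop the fragment, then the query; both may carry codes or tokens.
        let without_fragment = url.split('#').next().unwrap_or(url);
        let (base, had_query) = match without_fragment.split_once('?') {
            Some((base, _)) => (base, true),
            None => (without_fragment, false),
        };
        // Separate the scheme so userinfo can be located in the authority.
        let (scheme, rest) = match base.split_once("://") {
            Some((scheme, rest)) => (Some(scheme), rest),
            None => (None, base),
        };
        let (authority, path) = match rest.find('/') {
            Some(idx) => rest.split_at(idx),
            None => (rest, ""),
        };
        let host = match authority.rsplit_once('@') {
            Some((_, host)) => host,
            None => authority,
        };

        let mut out = String::new();
        if let Some(scheme) = scheme {
            out.push_str(scheme);
            out.push_str("://");
        }
        if host.len() != authority.len() {
            out.push_str("<redacted>@");
        }
        out.push_str(host);
        out.push_str(path);
        if had_query {
            out.push_str("?<redacted>");
        }
        out
    }
    ```

    Leaving a visible placeholder rather than silently dropping the query
    tells the reader of the log that something was present and removed.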
    
    On the CLI side, the code intentionally duplicates a narrow slice of the
    TUI file-logging setup instead of sharing the full initializer. That
    keeps `codex login` easy to reason about and avoids coupling it to
    interactive-session layers that the command does not need.
    
    ## Architecture
    
    The core auth behavior lives in `codex-rs/login/src/server.rs`. The
    callback path now logs callback receipt, callback validation,
    token-exchange start, token-exchange success, token-endpoint non-2xx
    responses, and transport failures. App-server consumers still use this
    same login-server path via `run_login_server(...)`, so the same
    instrumentation benefits TUI, Electron, and VS Code extension flows.
    
    The direct CLI path in `codex-rs/cli/src/login.rs` now installs a small
    file-backed tracing layer for login commands only. That writes
    `codex-login.log` under `log_dir` with login-specific targets such as
    `codex_cli::login` and `codex_login::server`.
    
    ## Observability
    
    The main signals come from the `login` crate target and are
    intentionally scoped to sign-in. Structured logs include redacted issuer
    URLs, redacted transport errors, HTTP status, and parsed token-endpoint
    fields when available. The callback-layer log intentionally avoids
    `%err` on token-endpoint failures so arbitrary backend bodies do not get
    copied into the normal log file.
    
    Direct `codex login` now leaves a durable artifact for both failure and
    success cases. Example output from the new file-backed CLI path:
    
    Failing callback:
    
    ```text
    2026-03-06T22:08:54.143612Z  INFO codex_cli::login: starting browser login flow
    2026-03-06T22:09:03.431699Z  INFO codex_login::server: received login callback path=/auth/callback has_code=false has_state=true has_error=true state_valid=true
    2026-03-06T22:09:03.431745Z  WARN codex_login::server: oauth callback returned error error_code="access_denied" has_error_description=true
    ```
    
    Successful callback and token exchange:
    
    ```text
    2026-03-06T22:09:14.065559Z  INFO codex_cli::login: starting browser login flow
    2026-03-06T22:09:36.431678Z  INFO codex_login::server: received login callback path=/auth/callback has_code=true has_state=true has_error=false state_valid=true
    2026-03-06T22:09:36.436977Z  INFO codex_login::server: starting oauth token exchange issuer=https://auth.openai.com/ redirect_uri=http://localhost:1455/auth/callback
    2026-03-06T22:09:36.685438Z  INFO codex_login::server: oauth token exchange succeeded status=200 OK
    ```
    
    ## Tests
    
    - `cargo test -p codex-login`
    - `cargo clippy -p codex-login --tests -- -D warnings`
    - `cargo test -p codex-cli`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - manual direct `codex login` smoke tests for both a failing callback
    and a successful browser login
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [rmcp-client] Recover from streamable HTTP 404 sessions (#13514)
    ## Summary
    - add one-time session recovery in `RmcpClient` for streamable HTTP MCP
    `404` session expiry
    - rebuild the transport and retry the failed operation once after
    reinitializing the client state
    - extend the test server and integration coverage for `404`, `401`,
    single-retry, and non-session failure scenarios
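
    The retry-once policy can be sketched independently of any transport. The
    names below (`CallError`, `call_with_recovery`) are hypothetical, and the
    mapping from an HTTP `404` to `SessionExpired` is assumed to happen before
    this function is reached; this is not the `RmcpClient` code itself.

    ```rust
    /// Simplified error model for the sketch.
    #[derive(Debug, PartialEq, Eq)]
    enum CallError {
        /// The server no longer knows the session (the `404` case).
        SessionExpired,
        /// Any failure that is unrelated to session expiry.
        Other(String),
    }

    /// Runs `op`; if it fails with `SessionExpired`, reinitializes once via
    /// `reinit` and retries `op` exactly one more time. A second expiry, a
    /// failed reinit, or any other error is returned to the caller.
    fn call_with_recovery<T, Op, Reinit>(mut op: Op, mut reinit: Reinit) -> Result<T, CallError>
    where
        Op: FnMut() -> Result<T, CallError>,
        Reinit: FnMut() -> Result<(), CallError>,
    {
        match op() {
            Err(CallError::SessionExpired) => {
                reinit()?;
                op()
            }
            other => other,
        }
    }
    ```

    Bounding recovery to a single retry avoids a loop when the server keeps
    rejecting freshly created sessions.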
    
    ## Testing
    - just fmt
    - cargo test -p codex-rmcp-client (the post-rebase run lost its final
    summary in the terminal; the suite had passed earlier before the rebase)
    - just fix -p codex-rmcp-client
  • [diagnostics] show diagnostics earlier in workflow (#13604)
    <img width="591" height="243" alt="Screenshot 2026-03-05 at 10 17 06 AM"
    src="https://github.com/user-attachments/assets/84a6658b-6017-4602-b1f8-2098b9b5eff9"
    />
    
    - show feedback earlier
    - preserve raw literal env vars (no trimming, sanitizing, etc.)
  • feat(app-server): support mcp elicitations in v2 api (#13425)
    This adds a first-class server request for MCP server elicitations:
    `mcpServer/elicitation/request`.
    
    Until now, MCP elicitation requests only showed up as a raw
    `codex/event/elicitation_request` event from core. That made it hard for
    v2 clients to handle elicitations using the same request/response flow
    as other server-driven interactions (like shell and `apply_patch`
    tools).
    
    This also updates the underlying MCP elicitation request handling in
    core to pass through the full MCP request (including URL and form data)
    so we can expose it properly in app-server.
    
    ### Why not `item/mcpToolCall/elicitationRequest`?
    This is because MCP elicitations are related to MCP servers first, and
    only optionally to a specific MCP tool call.
    
    In the MCP protocol, elicitation is a server-to-client capability: the
    server sends `elicitation/create`, and the client replies with an
    elicitation result. RMCP models it that way as well.
    
    In practice an elicitation is often triggered by an MCP tool call, but
    not always.
    
    ### What changed
    - add `mcpServer/elicitation/request` to the v2 app-server API
    - translate core `codex/event/elicitation_request` events into the new
    v2 server request
    - map client responses back into `Op::ResolveElicitation` so the MCP
    server can continue
    - update app-server docs and generated protocol schema
    - add an end-to-end app-server test that covers the full round trip
    through a real RMCP elicitation flow
    - the new test exercises a realistic case where an MCP tool call
    triggers an elicitation, the app-server emits
    `mcpServer/elicitation/request`, the client accepts it, and the tool call
    resumes and completes successfully
    
    ### app-server API flow
    - Client starts a thread with `thread/start`.
    - Client starts a turn with `turn/start`.
    - App-server sends `item/started` for the `mcpToolCall`.
    - While that tool call is in progress, app-server sends
    `mcpServer/elicitation/request`.
    - Client responds to that request with `{ action: "accept" | "decline" |
    "cancel" }`.
    - App-server sends `serverRequest/resolved`.
    - App-server sends `item/completed` for the `mcpToolCall`.
    - App-server sends `turn/completed`.
    - If the turn is interrupted while the elicitation is pending,
    app-server still sends `serverRequest/resolved` before the turn
    finishes.
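
    A client-side view of this flow can be sketched with two small helpers:
    one for the three response actions and one that checks the ordering
    guarantee in the last step. The type and function names are hypothetical;
    only the method names and action strings come from the description above.

    ```rust
    /// The three actions a client may send in response to
    /// `mcpServer/elicitation/request`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ElicitationAction {
        Accept,
        Decline,
        Cancel,
    }

    impl ElicitationAction {
        /// String form used in the `{ action: ... }` response.
        fn as_wire(self) -> &'static str {
            match self {
                ElicitationAction::Accept => "accept",
                ElicitationAction::Decline => "decline",
                ElicitationAction::Cancel => "cancel",
            }
        }

        fn from_wire(value: &str) -> Option<Self> {
            match value {
                "accept" => Some(ElicitationAction::Accept),
                "decline" => Some(ElicitationAction::Decline),
                "cancel" => Some(ElicitationAction::Cancel),
                _ => None,
            }
        }
    }

    /// Checks that `serverRequest/resolved` appears before `turn/completed`
    /// in a recorded sequence of method names.
    fn resolved_before_turn_completed(events: &[&str]) -> bool {
        let resolved = events.iter().position(|e| *e == "serverRequest/resolved");
        let completed = events.iter().position(|e| *e == "turn/completed");
        match (resolved, completed) {
            (Some(r), Some(c)) => r < c,
            _ => false,
        }
    }
    ```

    A test harness could record the method names it receives during a turn
    and apply the ordering check to both the happy path and the interrupted
    case.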
  • chore(deps): bump serde_with from 3.16.1 to 3.17.0 in /codex-rs (#13209)
    Bumps [serde_with](https://github.com/jonasbb/serde_with) from 3.16.1 to
    3.17.0.
    <details>
    <summary>Release notes</summary>
    <p><em>Sourced from <a
    href="https://github.com/jonasbb/serde_with/releases">serde_with's
    releases</a>.</em></p>
    <blockquote>
    <h2>serde_with v3.17.0</h2>
    <h3>Added</h3>
    <ul>
    <li>Support <code>OneOrMany</code> with <code>smallvec</code> v1 (<a
    href="https://redirect.github.com/jonasbb/serde_with/issues/920">#920</a>,
    <a
    href="https://redirect.github.com/jonasbb/serde_with/issues/922">#922</a>)</li>
    </ul>
    <h3>Changed</h3>
    <ul>
    <li>Switch to <code>yaml_serde</code> for a maintained yaml dependency
    by <a href="https://github.com/kazan417"><code>@​kazan417</code></a> (<a
    href="https://redirect.github.com/jonasbb/serde_with/issues/921">#921</a>)</li>
    <li>Bump MSRV to 1.82, since that is required for
    <code>yaml_serde</code> dev-dependency.</li>
    </ul>
    </blockquote>
    </details>
    <details>
    <summary>Commits</summary>
    <ul>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/4031878a4cfced7261105447d8683c296147864b"><code>4031878</code></a>
    Bump version to v3.17.0 (<a
    href="https://redirect.github.com/jonasbb/serde_with/issues/924">#924</a>)</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/204ae56f8ba08bd911ad0f122719bf07f3dcdbbb"><code>204ae56</code></a>
    Bump version to v3.17.0</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/7812b5a006e23e0204c687868e68a8b9dae75cd1"><code>7812b5a</code></a>
    serde_yaml 0.9 to yaml_serde 0.10 (<a
    href="https://redirect.github.com/jonasbb/serde_with/issues/921">#921</a>)</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/614bd8950bc179f4f23c1d9f26866ac216257fed"><code>614bd89</code></a>
    Bump MSRV to 1.82 as required by yaml_serde</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/518d0ed7873616a81c987d7961d78f5f26210694"><code>518d0ed</code></a>
    Suppress RUSTSEC-2026-0009 since we don't have untrusted time input in
    tests ...</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/a6579a89841f269c7f63912e8e808e82212c672e"><code>a6579a8</code></a>
    Suppress RUSTSEC-2026-0009 since we don't have untrusted time input in
    tests</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/9d4d0696e6794da4babf8204d17d11dadb79dd60"><code>9d4d069</code></a>
    Implement OneOrMany for smallvec_1::SmallVec (<a
    href="https://redirect.github.com/jonasbb/serde_with/issues/922">#922</a>)</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/fc78243e8c60c4fcc11a99f2c6ccc0d449a57fd9"><code>fc78243</code></a>
    Add changelog</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/2b8c30bf679309c27143f13070dbeef068310ab5"><code>2b8c30b</code></a>
    Implement OneOrMany for smallvec_1::SmallVec</li>
    <li><a
    href="https://github.com/jonasbb/serde_with/commit/2d9b9a1815cb6d58b17ab6403e57e7c2f62b84cc"><code>2d9b9a1</code></a>
    Carg.lock update</li>
    <li>Additional commits viewable in <a
    href="https://github.com/jonasbb/serde_with/compare/v3.16.1...v3.17.0">compare
    view</a></li>
    </ul>
    </details>
    <br />
    
    
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Eric Traut <etraut@openai.com>
  • chore(deps): bump strum_macros from 0.27.2 to 0.28.0 in /codex-rs (#13210)
    Bumps [strum_macros](https://github.com/Peternator7/strum) from 0.27.2
    to 0.28.0.
    <details>
    <summary>Changelog</summary>
    <p><em>Sourced from <a
    href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum_macros's
    changelog</a>.</em></p>
    <blockquote>
    <h2>0.28.0</h2>
    <ul>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/461">#461</a>:
    Allow any kind of passthrough attributes on
    <code>EnumDiscriminants</code>.</p>
    <ul>
    <li>Previously only list-style attributes (e.g.
    <code>#[strum_discriminants(derive(...))]</code>) were supported. Now
    path-only
    (e.g. <code>#[strum_discriminants(non_exhaustive)]</code>) and
    name/value (e.g. <code>#[strum_discriminants(doc =
    &quot;foo&quot;)]</code>)
    attributes are also supported.</li>
    </ul>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/462">#462</a>:
    Add missing <code>#[automatically_derived]</code> to generated impls not
    covered by <a
    href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>.</p>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/466">#466</a>:
    Bump MSRV to 1.71, required to keep up with updated <code>syn</code> and
    <code>windows-sys</code> dependencies. This is a breaking change if
    you're on an old version of rust.</p>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/469">#469</a>:
    Use absolute paths in generated proc macro code to avoid
    potential name conflicts.</p>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/465">#465</a>:
    Upgrade <code>phf</code> dependency to v0.13.</p>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/473">#473</a>:
    Fix <code>cargo fmt</code> / <code>clippy</code> issues and add GitHub
    Actions CI.</p>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/477">#477</a>:
    <code>strum::ParseError</code> now implements
    <code>core::fmt::Display</code> instead
    <code>std::fmt::Display</code> to make it <code>#[no_std]</code>
    compatible. Note the <code>Error</code> trait wasn't available in core
    until <code>1.81</code>
    so <code>strum::ParseError</code> still only implements that in std.</p>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/476">#476</a>:
    <strong>Breaking Change</strong> - <code>EnumString</code> now
    implements <code>From&lt;&amp;str&gt;</code>
    (infallible) instead of <code>TryFrom&lt;&amp;str&gt;</code> when the
    enum has a <code>#[strum(default)]</code> variant. This more accurately
    reflects that parsing cannot fail in that case. If you need the old
    <code>TryFrom</code> behavior, you can opt back in using
    <code>parse_error_ty</code> and <code>parse_error_fn</code>:</p>
    <pre lang="rust"><code>#[derive(EnumString)]
    #[strum(parse_error_ty = strum::ParseError, parse_error_fn =
    make_error)]
    pub enum Color {
        Red,
        #[strum(default)]
        Other(String),
    }

    fn make_error(x: &amp;str) -&gt; strum::ParseError {
        strum::ParseError::VariantNotFound
    }
    </code></pre>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/431">#431</a>:
    Fix bug where <code>EnumString</code> ignored the
    <code>parse_err_ty</code>
    attribute when the enum had a <code>#[strum(default)]</code>
    variant.</p>
    </li>
    <li>
    <p><a
    href="https://redirect.github.com/Peternator7/strum/pull/474">#474</a>:
    EnumDiscriminants will now copy <code>default</code> over from the
    original enum to the Discriminant enum.</p>
    <pre lang="rust"><code>#[derive(Debug, Default, EnumDiscriminants)]
    #[strum_discriminants(derive(Default))] // &lt;- Remove this in 0.28.
    enum MyEnum {
        #[default] // &lt;- Will be the #[default] on the MyEnumDiscriminant
        #[strum_discriminants(default)] // &lt;- Remove this in 0.28
        Variant0,
        Variant1 { a: NonDefault },
    }
    </code></pre>
    </li>
    </ul>
    <!-- raw HTML omitted -->
    </blockquote>
    <p>... (truncated)</p>
    </details>
    <details>
    <summary>Commits</summary>
    <ul>
    <li><a
    href="https://github.com/Peternator7/strum/commit/7376771128834d28bb9beba5c39846cba62e71ec"><code>7376771</code></a>
    Peternator7/0.28 (<a
    href="https://redirect.github.com/Peternator7/strum/issues/475">#475</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/26e63cd964a2e364331a5dd977d589bb9f649d8c"><code>26e63cd</code></a>
    Display exists in core (<a
    href="https://redirect.github.com/Peternator7/strum/issues/477">#477</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/9334c728eedaa8a992d1388a8f4564bbccad1934"><code>9334c72</code></a>
    Make TryFrom and FromStr infallible if there's a default (<a
    href="https://redirect.github.com/Peternator7/strum/issues/476">#476</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/0ccbbf823c16e827afc263182cd55e99e3b2a52e"><code>0ccbbf8</code></a>
    Honor parse_err_ty attribute when the enum has a default variant (<a
    href="https://redirect.github.com/Peternator7/strum/issues/431">#431</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/2c9e5a9259189ce8397f2f4967060240c6bafd74"><code>2c9e5a9</code></a>
    Automatically add Default implementation to EnumDiscriminant if it
    exists on ...</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/e241243e48359b8b811b8eaccdcfa1ae87138e0d"><code>e241243</code></a>
    Fix existing cargo fmt + clippy issues and add GH actions (<a
    href="https://redirect.github.com/Peternator7/strum/issues/473">#473</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/639b67fefd20eaead1c5d2ea794e9afe70a00312"><code>639b67f</code></a>
    feat: allow any kind of passthrough attributes on
    <code>EnumDiscriminants</code> (<a
    href="https://redirect.github.com/Peternator7/strum/issues/461">#461</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/0ea1e2d0fd1460e7492ea32e6b460394d9199ff8"><code>0ea1e2d</code></a>
    docs: Fix typo (<a
    href="https://redirect.github.com/Peternator7/strum/issues/463">#463</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/36c051b91086b37d531c63ccf5a49266832a846d"><code>36c051b</code></a>
    Upgrade <code>phf</code> to v0.13 (<a
    href="https://redirect.github.com/Peternator7/strum/issues/465">#465</a>)</li>
    <li><a
    href="https://github.com/Peternator7/strum/commit/9328b38617dc6f4a3bc5fdac03883d3fc766cf34"><code>9328b38</code></a>
    Use absolute paths in proc macro (<a
    href="https://redirect.github.com/Peternator7/strum/issues/469">#469</a>)</li>
    <li>Additional commits viewable in <a
    href="https://github.com/Peternator7/strum/compare/v0.27.2...v0.28.0">compare
    view</a></li>
    </ul>
    </details>
    <br />
    
    
    [![Dependabot compatibility
    score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=strum_macros&package-manager=cargo&previous-version=0.27.2&new-version=0.28.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
    
    Dependabot will resolve any conflicts with this PR as long as you don't
    alter it yourself. You can also trigger a rebase manually by commenting
    `@dependabot rebase`.
    
    [//]: # (dependabot-automerge-start)
    [//]: # (dependabot-automerge-end)
    
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Eric Traut <etraut@openai.com>
  • feat(app-server-test-client): OTEL setup for tracing (#13493)
    ### Overview
    This PR:
    - Updates `app-server-test-client` to load OTEL settings from
    `$CODEX_HOME/config.toml` and initialize its own OTEL provider.
    - Adds real client root spans to app-server test client traces.
    
    This updates `codex-app-server-test-client` so its Datadog traces
    reflect the full client-driven flow instead of a set of server spans
    stitched together under a synthetic parent.
    
    Before this change, the test client generated a fake `traceparent` once
    and reused it for every JSON-RPC request. That kept the requests in one
    trace, but there was no real client span at the top, so Datadog ended up
    showing the sequence in a slightly misleading way, where all RPCs were
    anchored under `initialize`.
    
    Now the test client:
    - loads OTEL settings from the normal Codex config path, including
    `$CODEX_HOME/config.toml` and existing `--config` overrides
    - initializes tracing the same way other Codex binaries do when trace
    export is enabled
    - creates a real client root span for each scripted command
    - creates per-request client spans for JSON-RPC methods like
    `initialize`, `thread/start`, and `turn/start`
    - injects W3C trace context from the current client span into
    `request.trace` instead of reusing a fabricated carrier
    
    This gives us a cleaner trace shape in Datadog:
    - one trace URL for the whole scripted flow
    - a visible client root span
    - proper client/server parent-child relationships for each app-server
    request
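
    As a rough illustration of the trace shape described above, the sketch
    below formats a W3C `traceparent` value from a client span and derives a
    per-request child that stays in the same trace. `SpanIds`, `traceparent`,
    and `child_of` are hypothetical names used only for this example; the
    real test client presumably reads these values from its OTEL span
    context rather than building them by hand.

    ```rust
    /// Hypothetical stand-in for a client span's identifiers (an assumption
    /// for this sketch, not a type from the codebase).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct SpanIds {
        pub trace_id: u128,
        pub span_id: u64,
        pub sampled: bool,
    }

    /// Formats a W3C `traceparent` value for version 00:
    /// `version-traceid-parentid-flags`, all lowercase hex.
    pub fn traceparent(ids: SpanIds) -> String {
        let flags: u8 = if ids.sampled { 0x01 } else { 0x00 };
        format!("00-{:032x}-{:016x}-{:02x}", ids.trace_id, ids.span_id, flags)
    }

    /// Derives a per-request child span: same trace id, new span id.
    /// The caller supplies the span id; a real tracer would generate it.
    pub fn child_of(root: SpanIds, span_id: u64) -> SpanIds {
        SpanIds { span_id, ..root }
    }
    ```

    Because every request span keeps the root's trace id, the whole scripted
    flow lands under one trace, while the distinct span ids give each
    request its own parent for the matching server span.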
  • feat: external artifacts builder (#13485)
    This PR reverts the built-in artifact render while a decision is being
    reached. No impact is expected on any features.
  • feat(app-server): propagate app-server trace context into core (#13368)
    ### Summary
    Propagate trace context from app-server RPC method handlers into the
    codex core submission loop, so core spans such as `run_turn` are
    included. This implements PR 2 of the app-server tracing rollout.
    
    This also removes the old lower-level env-based reparenting in core so
    explicit request/submission ancestry wins instead of being overridden by
    ambient `TRACEPARENT` state.
    
    ### What changed
    - Added `trace: Option<W3cTraceContext>` to `codex_protocol::Submission`
    - Taught `Codex::submit()` / `submit_with_id()` to automatically capture
    the current span context when constructing or forwarding a submission
    - Wrapped the core submission loop in a `submission_dispatch` span
    parented from `Submission.trace`
    - Invalid submission trace carriers now produce a warning and are
    otherwise ignored
    - Removed the old env-based downstream reparenting path in core task
    execution
    - Stopped OTEL provider init from implicitly attaching env trace context
    process-wide
    - Updated mcp-server Submission call sites for the new field
    
    Added focused unit tests for:
    - capturing trace context into Submission
    - preferring `Submission.trace` when building the core dispatch span
    
    ### Why
    PR 1 gave us consistent inbound request spans in app-server, but that
    only covered the transport boundary. For long-running work like turns
    and reviews, the important missing piece was preserving ancestry after
    the request handler returns and core continues work on a different async
    path.
    
    This change makes that handoff explicit and keeps the parentage rules
    simple:
    - app-server request span sets the current context
    - `Submission.trace` snapshots that context
    - core restores it once, at the submission boundary
    - deeper core spans inherit naturally
    
    That also lets us stop relying on env-based reparenting for this path,
    which was too ambient and could override explicit ancestry.
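
    The handoff rules above can be sketched roughly as follows. This is a
    minimal, std-only illustration, not the code from this PR: the field
    names on `W3cTraceContext`, the `ParentIds` type, and the functions
    `parse_traceparent` and `dispatch_parent` are assumptions made for the
    example, and the parser only accepts version `00` carriers.

    ```rust
    /// Assumed carrier shape; the real field names may differ.
    #[derive(Debug, Clone, Default, PartialEq, Eq)]
    pub struct W3cTraceContext {
        pub traceparent: Option<String>,
        pub tracestate: Option<String>,
    }

    /// Simplified submission: only the fields this sketch needs.
    #[derive(Debug, Clone)]
    pub struct Submission {
        pub id: String,
        pub trace: Option<W3cTraceContext>,
    }

    /// Hypothetical parsed parent identifiers for the dispatch span.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct ParentIds {
        pub trace_id: u128,
        pub span_id: u64,
    }

    fn is_lower_hex(s: &str) -> bool {
        !s.is_empty() && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
    }

    /// Parses a version-00 `traceparent`; returns `None` for anything else.
    pub fn parse_traceparent(value: &str) -> Option<ParentIds> {
        let parts: Vec<&str> = value.split('-').collect();
        if parts.len() != 4 || parts[0] != "00" {
            return None;
        }
        let (trace_hex, span_hex, flags_hex) = (parts[1], parts[2], parts[3]);
        if trace_hex.len() != 32 || span_hex.len() != 16 || flags_hex.len() != 2 {
            return None;
        }
        if !(is_lower_hex(trace_hex) && is_lower_hex(span_hex) && is_lower_hex(flags_hex)) {
            return None;
        }
        let trace_id = u128::from_str_radix(trace_hex, 16).ok()?;
        let span_id = u64::from_str_radix(span_hex, 16).ok()?;
        // All-zero ids are invalid per the W3C Trace Context spec.
        if trace_id == 0 || span_id == 0 {
            return None;
        }
        Some(ParentIds { trace_id, span_id })
    }

    /// Restores the parent once, at the submission boundary. An invalid
    /// carrier is reported and then ignored, so dispatch still proceeds.
    pub fn dispatch_parent(sub: &Submission) -> Option<ParentIds> {
        let raw = sub.trace.as_ref()?.traceparent.as_deref()?;
        let parsed = parse_traceparent(raw);
        if parsed.is_none() {
            eprintln!("warning: ignoring invalid trace carrier on submission {}", sub.id);
        }
        parsed
    }
    ```

    Resolving the parent in one place keeps explicit ancestry authoritative:
    deeper spans simply inherit from whatever `dispatch_parent` returned, and
    no ambient environment state is consulted.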
  • [feedback] diagnostics (#13292)
    - Added header logic to display diagnostics in the CLI
    - Added logic for collecting env vars
    
    <img width="606" height="327" alt="Screenshot 2026-03-03 at 3 49 31 PM"
    src="https://github.com/user-attachments/assets/05e78c56-8cb3-47fa-abaf-3e57f1fdd8e2"
    />
    
    <img width="690" height="353" alt="Screenshot 2026-03-02 at 6 47 54 PM"
    src="https://github.com/user-attachments/assets/e470b559-13f4-44d9-897f-bc398943c6d1"
    />
  • Add under-development original-resolution view_image support (#13050)
    ## Summary
    
    Add original-resolution support for `view_image` behind the
    under-development `view_image_original_resolution` feature flag.
    
    When the flag is enabled and the target model is `gpt-5.3-codex` or
    newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends
    `detail: "original"` to the Responses API instead of using the legacy
    resize/compress path.
    
    ## What changed
    
    - Added `view_image_original_resolution` as an under-development feature
    flag.
    - Added `ImageDetail` to the protocol models and support for serializing
    `detail: "original"` on tool-returned images.
    - Added `PromptImageMode::Original` to `codex-utils-image`.
      - Preserves original PNG/JPEG/WebP bytes.
      - Keeps legacy behavior for the resize path.
    - Updated `view_image` to:
      - use the shared `local_image_content_items_with_label_number(...)`
        helper in both code paths
      - select original-resolution mode only when:
        - the feature flag is enabled, and
        - the model slug parses as `gpt-5.3-codex` or newer
    - Kept local user image attachments on the existing resize path; this
    change is specific to `view_image`.
    - Updated history/image accounting so only `detail: "original"` images
    use the docs-based GPT-5 image cost calculation; legacy images still use
    the old fixed estimate.
    - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG
    at 85% quality unless lossless is required, while still allowing other
    formats when explicitly requested.
    - Updated tests and helper code that construct
    `FunctionCallOutputContentItem::InputImage` to carry the new `detail`
    field.
    
    ## Behavior
    
    ### Feature off
    - `view_image` keeps the existing resize/re-encode behavior.
    - History estimation keeps the existing fixed-cost heuristic.
    
    ### Feature on + `gpt-5.3-codex+`
    - `view_image` sends original-resolution images with `detail:
    "original"`.
    - PNG/JPEG/WebP source bytes are preserved when possible.
    - History estimation uses the GPT-5 docs-based image-cost calculation
    for those `detail: "original"` images.
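
    The gating described above might look roughly like the sketch below.
    It is illustrative only: the slug format `gpt-<major>.<minor>-codex`,
    the `Resize` variant, and the functions `codex_version` and
    `select_mode` are assumptions for this example; only
    `PromptImageMode::Original` is named in this PR.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum PromptImageMode {
        /// Hypothetical name for the legacy resize/re-encode path.
        Resize,
        /// Preserve the original PNG/JPEG/WebP bytes.
        Original,
    }

    /// Parses slugs of the assumed form `gpt-<major>[.<minor>]-codex...`
    /// into a `(major, minor)` pair; a missing minor is treated as 0.
    fn codex_version(slug: &str) -> Option<(u32, u32)> {
        let rest = slug.strip_prefix("gpt-")?;
        let (version, tail) = rest.split_once('-')?;
        if !tail.starts_with("codex") {
            return None;
        }
        let (major, minor) = match version.split_once('.') {
            Some((major, minor)) => (major, minor),
            None => (version, "0"),
        };
        Some((major.parse().ok()?, minor.parse().ok()?))
    }

    /// Original-resolution mode requires both the feature flag and a model
    /// at or above 5.3; everything else stays on the resize path.
    pub fn select_mode(flag_enabled: bool, model_slug: &str) -> PromptImageMode {
        let new_enough = codex_version(model_slug).map_or(false, |v| v >= (5, 3));
        if flag_enabled && new_enough {
            PromptImageMode::Original
        } else {
            PromptImageMode::Resize
        }
    }
    ```

    Comparing `(major, minor)` tuples keeps the "or newer" check ordered by
    version rather than by string, and an unparseable slug falls back to the
    resize path.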
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/13050
    -  `2` https://github.com/openai/codex/pull/13331
    -  `3` https://github.com/openai/codex/pull/13049
  • chore(app-server): delete v1 RPC methods and notifications (#13375)
    ## Summary
    This removes the old app-server v1 methods and notifications we no
    longer need, while keeping the small set the main codex app client still
    depends on for now.
    
    The remaining legacy surface is:
    - `initialize`
    - `getConversationSummary`
    - `getAuthStatus`
    - `gitDiffToRemote`
    - `fuzzyFileSearch`
    - `fuzzyFileSearch/sessionStart`
    - `fuzzyFileSearch/sessionUpdate`
    - `fuzzyFileSearch/sessionStop`
    
    The raw `codex/event/*` notifications emitted from core also remain for
    now; they will be removed in a follow-up PR.
    
    ## What changed
    - removed deprecated v1 request variants from the protocol and
    app-server dispatcher
    - removed deprecated typed notifications: `authStatusChange`,
    `loginChatGptComplete`, and `sessionConfigured`
    - updated the app-server test client to use v2 flows instead of deleted
    v1 flows
    - deleted legacy-only app-server test suites and added focused coverage
    for `getConversationSummary`
    - regenerated app-server schema fixtures and updated the MCP interface
    docs to match the remaining compatibility surface
    
    ## Testing
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server`