363 Commits

  • fix(remote-control): avoid server token refresh retry storms (#30201)
    ## Why
    
    Remote-control websocket reconnects and pairing requests proactively
    refresh their server token. When `/server/refresh` returns a transient
    error such as `502`, the still-valid token was discarded as a usable
    connection path, causing reconnect failures and repeated refresh
    attempts that could amplify an upstream incident.
    
    ## What Changed
    
    - Start proactive refresh five minutes before token expiry and
    distinguish it from a required refresh for missing or expired tokens.
    - Continue websocket and pairing operations with the existing valid
    token after `429`, `5xx`, or timeout failures.
    - Share an in-memory `next_refresh_at` throttle across websocket and
    pairing callers, honoring both `Retry-After` formats and otherwise using
    a jittered 24–36 second delay.
    - Keep required refreshes strict, preserve `404` enrollment replacement,
    and clear token/throttle state for `401` and `403` auth recovery.
    - Preserve refresh response metadata internally and add focused
    wire-level and integration coverage.
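
    The decision rules above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `RefreshNeed`, `refresh_need`, `is_transient`, `may_continue`, and `next_refresh_delay` are hypothetical names, a timeout is modeled as a missing status code, only the delta-seconds form of `Retry-After` is handled, and the jitter fraction is supplied by the caller.

    ```rust
    use std::time::Duration;

    /// How urgently a token needs refreshing (illustrative names).
    #[derive(Debug, PartialEq)]
    enum RefreshNeed {
        /// Token is valid and not close to expiry.
        None,
        /// Token is valid but expires within the proactive window.
        Proactive,
        /// Token is missing or already expired.
        Required,
    }

    /// Proactive refresh starts five minutes before expiry.
    const PROACTIVE_WINDOW: Duration = Duration::from_secs(5 * 60);

    /// Classify the refresh need from the time left until expiry.
    /// `None` means there is no token or it has already expired.
    fn refresh_need(remaining: Option<Duration>) -> RefreshNeed {
        match remaining {
            None => RefreshNeed::Required,
            Some(d) if d.is_zero() => RefreshNeed::Required,
            Some(d) if d <= PROACTIVE_WINDOW => RefreshNeed::Proactive,
            Some(_) => RefreshNeed::None,
        }
    }

    /// Transient failures: 429, any 5xx, or a timeout (modeled here as `None`).
    fn is_transient(status: Option<u16>) -> bool {
        match status {
            None => true,
            Some(429) => true,
            Some(s) => (500..=599).contains(&s),
        }
    }

    /// Only a proactive refresh may fall back to the existing token,
    /// and only when the failure is transient.
    fn may_continue(need: &RefreshNeed, status: Option<u16>) -> bool {
        *need == RefreshNeed::Proactive && is_transient(status)
    }

    /// Delay before the next refresh attempt: the server's `Retry-After`
    /// seconds if present, otherwise 24 to 36 seconds scaled by `jitter` in [0, 1].
    fn next_refresh_delay(retry_after_secs: Option<u64>, jitter: f64) -> Duration {
        match retry_after_secs {
            Some(secs) => Duration::from_secs(secs),
            None => Duration::from_secs_f64(24.0 + 12.0 * jitter.clamp(0.0, 1.0)),
        }
    }
    ```

    For example, `may_continue(&RefreshNeed::Proactive, Some(502))` is true, while the same failure during a required refresh is not allowed to proceed.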
    
    ## Verification
    
    Added behavioral coverage proving that:
    
    - a valid near-expiry token still completes websocket and pairing
    requests after transient refresh failures;
    - `Retry-After` suppresses a subsequent refresh across websocket and
    pairing callers;
    - request and response-body timeouts are classified as transient;
    - an expired token, including one that expires during refresh, cannot
    proceed to websocket connection;
    - auth failures clear the attempted token without overwriting a
    concurrently rotated token.
  • Project selected plugin runtime by environment availability (#30093)
    ## Why
    
    Selected plugin metadata is stable, but MCP processes are live runtime
    state. They need different lifetimes:
    
    - the MCP extension caches manifest, MCP, and connector declarations for
    each stable selected root;
    - each model step projects that cached metadata through the roots that
    resolved as ready for that exact step;
    - the MCP manager is rebuilt only when that availability projection
    changes.
    
    This matches executor skills: both features consume the same resolved
    step roots instead of inferring readiness from the turn's selected
    environments.
    
    ## Behavior
    
    ```text
    E1 not ready for this step
      -> no E1 MCP servers or connectors
      -> cached plugin metadata stays in ext/mcp
    
    E1 becomes ready
      -> reuse cached metadata
      -> publish one MCP runtime containing E1 capabilities
    
    same ready roots on the next step
      -> reuse the exact runtime; no rediscovery and no MCP restart
    
    resume
      -> create new extension thread state and a new MCP runtime
    ```
    
    All model-facing consumers use the same step snapshot:
    
    ```text
    resolved selected roots
            |
            v
    extension MCP/connector projection
            |
            v
    { MCP config, connector snapshot, MCP manager }
            |
            +-> advertise model tools
            +-> build app/connector tools
            +-> execute MCP calls
    ```
    
    ## Cache contract
    
    The existing MCP extension owns a cache keyed by the full
    `SelectedCapabilityRoot`:
    
    ```rust
    let state = thread_store.get_or_init(SelectedExecutorPluginMcpState::default);
    ```
    
    The cache lives with extension thread state. Environment availability
    filters projection but does not invalidate metadata. Resume creates new
    thread state. There is no file watcher or executor generation because
    contents behind a stable environment/root are assumed stable.
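
    A rough sketch of that contract, under stated assumptions: `PluginMcpState`, `RootId`, and `project` are hypothetical stand-ins (the real cache key is the full `SelectedCapabilityRoot`), declarations are reduced to server names, and discovery is shown as happening lazily the first time a root is ready, which the description does not specify.

    ```rust
    use std::collections::{BTreeSet, HashMap};

    /// Hypothetical stand-in for the selected root key.
    type RootId = String;

    #[derive(Default)]
    struct PluginMcpState {
        /// Cached declarations per selected root; availability never invalidates them.
        metadata: HashMap<RootId, Vec<String>>,
        /// Ready-root projection the current runtime was built from.
        projection: Option<BTreeSet<RootId>>,
        /// Number of times the runtime was rebuilt.
        runtime_builds: u32,
        /// Number of discovery calls made.
        discoveries: u32,
    }

    impl PluginMcpState {
        /// Project cached metadata through the roots that are ready for this
        /// step and return the server names visible to the model.
        fn project(&mut self, ready: &[&str], discover: impl Fn(&str) -> Vec<String>) -> Vec<String> {
            let ready: BTreeSet<RootId> = ready.iter().map(|r| r.to_string()).collect();

            // Discover each root at most once; later steps reuse the cache.
            for root in &ready {
                if !self.metadata.contains_key(root) {
                    self.discoveries += 1;
                    self.metadata.insert(root.clone(), discover(root));
                }
            }

            // Rebuild the runtime only when the availability projection changes.
            if self.projection.as_ref() != Some(&ready) {
                self.projection = Some(ready.clone());
                self.runtime_builds += 1;
            }

            let mut visible = Vec::new();
            for root in &ready {
                visible.extend(self.metadata[root].iter().cloned());
            }
            visible
        }
    }
    ```

    In this sketch, projecting the same ready roots twice leaves `runtime_builds` unchanged, and a root that drops out and returns is not rediscovered.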
    
    ## What changes
    
    - Keeps executor plugin discovery and cached metadata in `ext/mcp`.
    - Caches MCP and connector declarations together per selected root.
    - Uses the step's already-resolved capability roots, including lazy
    environments that are not turn environments.
    - Reuses the current MCP runtime when the ready-root projection is
    unchanged.
    - Uses the same step MCP manager and connector snapshot for
    model-visible tools and execution.
    - Resolves direct thread-scoped MCP requests from the current
    selected-root projection.
    
    ## Deliberately out of scope
    
    - `app/list` remains based on the latest global host-plugin state; this
    PR does not make its response or notifications thread-specific.
    - `required = true` startup semantics do not apply to delayed executor
    MCP activation.
    - No filesystem/content invalidation.
    - No transport-disconnect watcher.
    - No executor generations or environment replacement semantics.
    - No client sharing across complete manager replacements.
    
    ## Stack
    
    1. Extension-owned World State sections.
    2. Project executor skills through World State.
    3. Pin one MCP runtime to each model step.
    4. **This PR:** project selected MCP and connector state from
    extension-owned metadata.
    5. Integration coverage for selected capability availability and resume.
    
    ## Verification
    
    - `selected_plugin_servers_use_managed_requirements_for_the_selected_root_id`
    - The stacked integration PR covers unavailable-to-ready activation,
    unchanged-runtime reuse, skills, MCP tools, connector attribution, and
    cold resume.
  • Read connector declarations from executor plugins (#29852)
    ## Why
    
    Selected capability roots can live on a different executor and operating
    system from app-server. Their connector declarations must therefore be
    read through the executor that owns the package, without converting
    executor URIs into host paths.
    
    This PR adds that authority-bound reader without activating connectors
    or changing thread startup.
    
    ## What changed
    
    - Add a small `codex-connectors-extension` crate for executor-owned
    connector I/O.
    - Read only the app configuration explicitly declared by the resolved
    plugin manifest.
    - Read through the `ExecutorFileSystem` retained by
    `ResolvedExecutorPlugin`; there is no host-filesystem fallback or
    default-file probe.
    - Keep `PathUri` values intact so Windows, Unix, and remote executor
    paths work from any orchestrator OS.
    - Return full `AppDeclaration` values so the caller retains declaration
    names and categories for routing.
    - Preserve the selected plugin ID and exact executor URI in read and
    parse errors.
    
    The contract is intentionally narrow: selected packages are trusted and
    valid, and packages that provide connectors explicitly declare their app
    configuration.
    
    ## Stack scope
    
    This PR is stacked on #29851. It only provides the executor-backed
    reader. #29856 resolves selected roots at thread start, freezes their
    connector snapshot, and contains the remote-capable end-to-end authority
    test for the complete path.
  • [codex] Inject agent graph store into ThreadManager (#29736)
    Pick up the AgentGraphStore migration.
    
    - Inject an explicit optional agent graph store into `ThreadManager`
    - Move all calls to spawn, close, recursive resume, and
    subtree/archive/delete/feedback traversal through it
    - Keep using `LocalAgentGraphStore` when SQLite is available
    
    This required some changes to the interface to deal with futures:
    
    - The interface now matches `ThreadStore`'s object-safe pattern by
    returning a boxed `AgentGraphStoreFuture` directly, allowing
    `ThreadManager` to hold `Arc<dyn AgentGraphStore>`
    
    *Slight behavior change!* Unfiltered subtree enumeration now performs a
    single all-status breadth-first traversal, so a closed grandchild
    beneath an open edge is included; the previous Open-then-Closed
    traversals could not cross mixed-status paths and silently omitted it.
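
    The difference can be illustrated with a small sketch. The graph shape, `EdgeStatus`, and `descendants` are hypothetical and only model the traversal order described above, not the real `AgentGraphStore` interface.

    ```rust
    use std::collections::{HashMap, VecDeque};

    #[derive(Clone, Copy, PartialEq)]
    enum EdgeStatus {
        Open,
        Closed,
    }

    /// parent -> [(child, edge status)]
    type Graph = HashMap<&'static str, Vec<(&'static str, EdgeStatus)>>;

    /// Breadth-first traversal that follows only edges accepted by `follow`.
    fn descendants(
        graph: &Graph,
        root: &'static str,
        follow: impl Fn(EdgeStatus) -> bool,
    ) -> Vec<&'static str> {
        let mut out = Vec::new();
        let mut queue = VecDeque::from([root]);
        while let Some(node) = queue.pop_front() {
            for &(child, status) in graph.get(node).into_iter().flatten() {
                if follow(status) && !out.contains(&child) {
                    out.push(child);
                    queue.push_back(child);
                }
            }
        }
        out
    }
    ```

    With `root -Open-> child -Closed-> grandchild`, a single pass with `|_| true` reaches both descendants, whereas separate Open-only and Closed-only passes from the root find only `child` between them.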
  • protocol: separate app and exec RPC ownership (#29714)
    ## Why
    
    The app-server and exec-server expose separate JSON-RPC APIs, but
    exec-server currently sources its serialized protocol and envelope types
    through app-server-oriented code. Giving each API an explicit owner
    makes the crate boundary legible without introducing shared generic
    envelopes.
    
    ## What changed
    
    - Added `codex-exec-server-protocol` to own exec DTOs, process IDs, and
    JSON-RPC envelopes.
    - Updated exec-server clients, transports, handlers, and tests to use
    the new crate.
    - Exposed app-server's existing JSON-RPC types through a public `rpc`
    module while retaining root re-exports.
    - Preserved existing wire shapes, including exec `PathUri` behavior.
    
    ## Stack
    
    This is PR 1 of 6. Next: [PR
    #29721](https://github.com/openai/codex/pull/29721), which moves auth
    mode below the app wire boundary.
    
    ## Validation
    
    - Exec-server protocol and server coverage passed in the focused
    protocol test runs.
    - App-server protocol schema fixtures passed.
  • Update rmcp to 1.8.0 (#29634)
    ## Summary
    
    - Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0.
    - Adapt to the new shared `peer_info` return type.
    - Box OAuth status discovery at the MCP boundary to keep the expanded
    future type from overflowing Rust's trait recursion limit.
    
    This brings in custom OAuth HTTP client support from
    [modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).
  • Share resumed rollout history (#28426)
    ## Summary
    
    Resuming a persisted thread currently deep-clones its complete rollout
    history several times. `InitialHistory` is retained for the app-server
    response, copied into thread persistence, and copied again by read-only
    accessors. These copies scale with the complete rollout rather than the
    bounded model context and add measurable latency for large sessions.
    
    This change stores resumed rollout history in `Arc<Vec<RolloutItem>>`.
    Rollout loading wraps the parsed vector once, while app-server response
    construction, session initialization, and thread persistence share it
    through inexpensive `Arc` clones. Read-only history access now returns a
    borrowed slice, and fork paths use `Arc::unwrap_or_clone` where they
    genuinely need mutable ownership. Rollout reconstruction also consumes
    its temporary context instead of cloning the reconstructed model
    history.
    
    The serialized representation remains unchanged. In an artificial 123 MB
    rollout benchmark, sharing resumed history reduced cold resume latency
    by roughly 9–10%. The affected crates compile with their test targets,
    all 80 thread-store tests pass, and the Bazel dependency lock remains
    valid.
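
    A minimal sketch of the sharing pattern, assuming a simplified `RolloutItem` and a hypothetical `ResumedHistory` wrapper; the real types carry more data and different method names.

    ```rust
    use std::sync::Arc;

    /// Simplified stand-in for the real rollout item type.
    #[derive(Clone, Debug, PartialEq)]
    struct RolloutItem(String);

    /// Hypothetical wrapper around the shared history.
    struct ResumedHistory {
        items: Arc<Vec<RolloutItem>>,
    }

    impl ResumedHistory {
        /// Wrap the parsed vector once at load time.
        fn load(parsed: Vec<RolloutItem>) -> Self {
            Self { items: Arc::new(parsed) }
        }

        /// Hand out another handle; this bumps a reference count only.
        fn share(&self) -> Arc<Vec<RolloutItem>> {
            Arc::clone(&self.items)
        }

        /// Read-only access returns a borrowed slice.
        fn items(&self) -> &[RolloutItem] {
            &self.items
        }

        /// Fork paths take mutable ownership; the vector is cloned only if
        /// other handles are still alive.
        fn into_owned(self) -> Vec<RolloutItem> {
            Arc::unwrap_or_clone(self.items)
        }
    }
    ```

    `Arc::unwrap_or_clone` moves the vector out when the handle is unique and falls back to a deep clone otherwise, so the copy cost is paid only where mutation is actually needed.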
  • PAC 4 - Add macOS system proxy resolver (#26709)
    ## Summary
    
    Stacked on #26708.
    
    Adds the macOS implementation of the shared system-proxy contract. This
    allows Codex-owned auth clients to use the route macOS selects for each
    auth URL through SystemConfiguration and CFNetwork, including PAC and
    WPAD results.
    
    The `respect_system_proxy` feature is disabled by default, so existing
    client behavior remains unchanged unless explicitly enabled.
    
    ## Implementation
    
    - Adds the macOS-only `system-configuration` dependency to
    `codex-client`.
    - Dispatches system-proxy resolution to `outbound_proxy/macos.rs` on
    macOS.
    - Reads system proxy settings from `SCDynamicStore` and resolves the
    target URL with `CFNetworkCopyProxiesForURL`.
    - Executes PAC URLs and inline PAC JavaScript through a bounded run loop
    with a five-second timeout.
    - Handles `DIRECT`, HTTP proxies, and CFNetwork HTTPS entries using HTTP
    CONNECT; unsupported SOCKS entries map to `UnsupportedProxyScheme`.
    - Builds concrete proxy URLs from host and port entries, including IPv6
    host bracketing.
    - Maps results into the shared `SystemProxyDecision::{Direct, Proxy,
    Unavailable}` contract.
    - Hashes URL-specific cache keys so PAC decisions remain distinct
    without retaining raw request URLs or query strings.
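
    Two of the steps above, IPv6 host bracketing and hashed cache keys, might look roughly like this. `proxy_url` and `cache_key` are hypothetical helpers, the `http://` scheme is an assumption, and `DefaultHasher` is used purely for illustration; the actual hash function is not stated in this description.

    ```rust
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    /// Build a concrete proxy URL from host and port, bracketing IPv6 literals.
    fn proxy_url(host: &str, port: u16) -> String {
        if host.contains(':') && !host.starts_with('[') {
            format!("http://[{host}]:{port}")
        } else {
            format!("http://{host}:{port}")
        }
    }

    /// Derive a cache key from the request URL without storing the URL itself.
    fn cache_key(request_url: &str) -> u64 {
        let mut hasher = DefaultHasher::new();
        request_url.hash(&mut hasher);
        hasher.finish()
    }
    ```

    Keying on the hash keeps decisions for different URLs distinct while the cache holds no raw URL or query string.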
    
    ## End-user behavior
    
    - Disabled/default: existing client behavior is unchanged.
    - Enabled with `[features.respect_system_proxy]`:
      - macOS auth clients honor system proxy configuration, PAC, and WPAD;
      - valid OS/PAC `DIRECT` decisions use a direct connection;
      - unavailable system resolution falls back to explicit environment proxy
        variables, then `DIRECT`, through the shared contract from #26707.
    - Unsupported proxy schemes are not silently translated into another
    route.
    - Custom CA handling remains separate from proxy selection.
    - Known limitation: only the first supported system/PAC candidate is
    used. Subsequent proxy or `DIRECT` candidates are not attempted after a
    connection failure. This matches the current Windows behavior and leaves
    room for future ordered-fallback support.
    
    ## Tests
    
    - `just test -p codex-client` — 34 tests passed.
    - `just clippy -p codex-client`
    - `just fmt`
    - `just bazel-lock-check`
  • chore: advance tungstenite fork pins (#29480)
    ## Why
    
    `openai-oss-forks/tokio-tungstenite` now includes the updated
    `tungstenite` fork revision from
    [openai-oss-forks/tokio-tungstenite#3](https://github.com/openai-oss-forks/tokio-tungstenite/pull/3).
    Codex should consume the merged fork commit and resolve its direct and
    transitive `tungstenite` dependencies to the same revision instead of
    retaining the older pins.
    
    ## What Changed
    
    - Advanced the `tokio-tungstenite` git pin to
    `0e5b2d73aa18dd9f0a50ee9ff199d5aef7594186`.
    - Advanced the `tungstenite` fork pin to
    `4fffad30fe373adbdcffab9545e9e9bf4f2fc19f` and adjusted the patch source
    so the transitive dependency resolves to that revision.
    - Updated `Cargo.lock` and `MODULE.bazel.lock` to match the dependency
    graph.
  • chore(deps): advance tokio-tungstenite (#29132)
    ## Why
    
    Responses websocket connections use `tokio-tungstenite`. When DNS
    returns an unusable native IPv6 address before a working IPv4 address,
    sequential dialing can consume Codex's outer websocket timeout before
    reaching IPv4. The merged fork change adds Happy Eyeballs-style
    alternate-family racing so websocket dialing matches the recovery
    behavior already present in the HTTP path.
    
    ## What Changed
    
    Advance the workspace `tokio-tungstenite` patch from `132f5b39` to
    merged commit `e5e64b86`, and update the matching lockfile source. The
    new revision comes from
    [openai-oss-forks/tokio-tungstenite#1](https://github.com/openai-oss-forks/tokio-tungstenite/pull/1).
  • exec-server: add Noise relay transport (#26242)
    ## Why
    
    Rendezvous forwards traffic between the orchestrator and exec-server.
    The endpoints need to authenticate each other and encrypt that traffic
    without trusting Rendezvous with plaintext or endpoint keys.
    
    ## Changes
    
    - Adds a hybrid Noise IK channel through Clatter using X25519,
    ML-KEM-768, AES-256-GCM, and SHA-256.
    - Binds each handshake to `environment_id`, `executor_registration_id`,
    and `stream_id`.
    - Pins the registry-provided executor key and carries the harness
    authorization inside the encrypted handshake.
    - Orders relay frames before consuming Noise nonces and fragments large
    JSON-RPC messages into bounded records.
    - Bounds handshake payloads, frames, streams, and message reassembly.
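
    The fragmentation and bounded reassembly can be sketched as below. This is an illustration only: the record layout (a final-record flag plus payload), the tiny `MAX_RECORD` and `MAX_MESSAGE` limits, and the function names are assumptions, and encryption is omitted entirely.

    ```rust
    /// Hypothetical bounds, kept tiny for demonstration.
    const MAX_RECORD: usize = 4;
    const MAX_MESSAGE: usize = 16;

    /// Split a message into records of at most `MAX_RECORD` bytes.
    /// Each record is `(is_last, payload)`.
    fn fragment(message: &[u8]) -> Vec<(bool, Vec<u8>)> {
        if message.is_empty() {
            return vec![(true, Vec::new())];
        }
        let chunks: Vec<&[u8]> = message.chunks(MAX_RECORD).collect();
        let last = chunks.len() - 1;
        chunks
            .into_iter()
            .enumerate()
            .map(|(i, chunk)| (i == last, chunk.to_vec()))
            .collect()
    }

    /// Reassemble records, rejecting oversized records or messages and
    /// sequences that do not end with exactly one final record.
    fn reassemble(records: &[(bool, Vec<u8>)]) -> Result<Vec<u8>, &'static str> {
        let mut out = Vec::new();
        for (i, (is_last, payload)) in records.iter().enumerate() {
            if payload.len() > MAX_RECORD {
                return Err("record too large");
            }
            if out.len() + payload.len() > MAX_MESSAGE {
                return Err("message too large");
            }
            out.extend_from_slice(payload);
            if *is_last {
                return if i + 1 == records.len() {
                    Ok(out)
                } else {
                    Err("data after final record")
                };
            }
        }
        Err("missing final record")
    }
    ```

    Checking the running total before appending means a peer cannot force unbounded buffering by never sending a final record.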
    
    Runtime activation is in
    [openai/codex#26245](https://github.com/openai/codex/pull/26245).
    
    ## Stack
    
    1. **[openai/codex#26242](https://github.com/openai/codex/pull/26242)**:
    Noise channel and relay transport
    2. [openai/codex#26245](https://github.com/openai/codex/pull/26245):
    remote registration and runtime activation
    
    ## Verification
    
    - `just test -p codex-exec-server`
    - Oversized initiator payload regression coverage
    - `just fix -p codex-exec-server`
    - `just bazel-lock-check`
    - `cargo shear`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Use aws-lc-rs for rustls crypto provider (#27706)
    ## Why
    
    Some enterprise TLS proxies issue certificate chains signed with
    `ecdsa_secp521r1_sha512` / `ECDSA_NISTP521_SHA512`. Custom CA
    configuration such as `SSL_CERT_FILE` can add the right trust root, but
    it cannot make `rustls`'s `ring` verifier support a certificate
    signature algorithm it does not advertise.
    
    That can still break TLS after the CA bundle is configured, including on
    Rust websocket paths that call the shared
    `ensure_rustls_crypto_provider()` helper, such as the Responses
    websocket connector and remote app-server client:
    
    - [`codex-api/src/endpoint/responses_websocket.rs`](https://github.com/openai/codex/blob/eddc5c75ed527a8348bfcaa85692e53189600833/codex-rs/codex-api/src/endpoint/responses_websocket.rs#L441)
    - [`app-server-client/src/remote.rs`](https://github.com/openai/codex/blob/eddc5c75ed527a8348bfcaa85692e53189600833/codex-rs/app-server-client/src/remote.rs#L718)
    
    The `aws-lc-rs` `rustls` provider supports this P-521/SHA-512
    certificate signature scheme, so use it as Codex's process-wide `rustls`
    provider.
    
    ## What Changed
    
    - Switch the workspace `rustls` feature from `ring` to `aws_lc_rs`.
    - Update `codex-utils-rustls-provider` to install
    `rustls::crypto::aws_lc_rs::default_provider()`.
    - Add an assertion and integration test that the installed provider
    supports `ECDSA_NISTP521_SHA512`.
    
    ## Verification
    
    ```shell
    just fmt
    just test -p codex-utils-rustls-provider
    just bazel-lock-update
    just bazel-lock-check
    ```
  • [codex] Pin bundled SQLite to fixed WAL-reset version (#27992)
    ## Summary
    
    Prevent dependency refreshes from silently downgrading Codex's bundled
    SQLite to a release affected by the WAL-reset corruption bug.
    
    SQLx 0.9 accepts a broad `libsqlite3-sys` range. An unrelated lock
    refresh therefore moved Codex from `libsqlite3-sys 0.37.0` back to
    `0.35.0`, changing the bundled SQLite runtime from 3.51.3 to 3.50.2.
    SQLite documents the affected versions and fix in [The WAL Reset
    Bug](https://www.sqlite.org/wal.html#the_wal_reset_bug) and the [SQLite
    3.51.3 changelog](https://www.sqlite.org/changes.html#version_3_51_3).
  • Remove TUI realtime voice support (#27801)
    ## Why
    
    Removes realtime audio support from the TUI.
    
    ## What Changed
    
    - Removed the TUI `/realtime` and realtime `/settings` command paths.
    - Deleted TUI voice capture/playback, WebRTC session handling,
    audio-device selection UI, and recording-meter code.
    - Removed TUI realtime tests and snapshots that covered the deleted
    surfaces.
    - Dropped the TUI-only `cpal` and `codex-realtime-webrtc` dependencies
    and refreshed the Rust/Bazel locks.
  • code-mode standalone: extract protocol and add host crate (#27724)
    This is phase 1 of a 4-phase stack:
    1. **Add protocol and host crates for new IPC code mode implementation**
    2. Create the new standalone binary
    3. Create a new IPC `CodeModeSessionProvider` to use new binary
    4. Remove v8 from core and only use IPC provider
    
    
    ## Add protocol and host crates for new IPC code mode implementation
    Establish a clean process boundary without changing the existing
    in-process behavior.
    
    - Add the codex-code-mode-protocol crate for shared session, runtime,
    response, and tool-definition types.
    - Move protocol-facing code out of the V8-backed implementation.
    - Add a buildable codex-code-mode-host crate as the foundation for the
    standalone process.
    - Keep the existing in-process runtime as the active implementation.
  • [codex] parallelize release code generation (#27702)
    The release profile still uses one codegen unit, which serializes LLVM
    code generation within each crate. That setting was selected alongside
    fat LTO for optimization quality and binary size, but releases now use
    ThinLTO and code generation dominates the critical-path build.
    
    Use four codegen units. On an Apple M4 Max with 16 cores and 128 GiB
    RAM, using rustc 1.96.0, four and eight units took 507.486 and 505.325
    seconds respectively. Four therefore keeps the build-time gain while
    limiting the stripped `codex` increase to 14.7%, compared with 21.5% at
    eight units. The gzip-compressed binary grows 7.8% at four units.
    
    The one-unit build from an empty target directory took 981.150 seconds.
    That comparison also populated dependency and native build caches, so it
    is directional rather than controlled. It agrees with the earlier clean
    matrix where eight units reduced 671 seconds to 303 seconds:
    https://gist.github.com/anp/4b88393a0acd35783d9f42156f3243d5
    
    At the local 48% reduction, the current release's 55m22s critical-path
    macOS Cargo step would save about 26 minutes from the 71m28s workflow:
    https://github.com/openai/codex/actions/runs/27367405663
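
    The arithmetic behind that estimate can be checked with a short sketch; the helper names are illustrative and the inputs are the figures quoted above.

    ```rust
    /// Fractional build-time reduction relative to a baseline.
    fn reduction(baseline_secs: f64, new_secs: f64) -> f64 {
        (baseline_secs - new_secs) / baseline_secs
    }

    /// Minutes saved on a step of `step_secs` at the given fractional reduction.
    fn estimated_saving_minutes(step_secs: f64, reduction: f64) -> f64 {
        step_secs * reduction / 60.0
    }
    ```

    `reduction(981.150, 507.486)` is roughly 0.48, and applying 48% to the 55m22s (3322-second) step gives between 26 and 27 minutes.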
    
    The prompt-image medians ranged from 3.9% faster to 0.9% slower. CLI
    startup shifted by 1-2 ms while user and system CPU time were unchanged.
    
    This is a draft because the release-latency improvement may not justify
    the binary-size increase.
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
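
    The two replacement shapes can be sketched as follows. The traits `Loader` and `DynLoader`, the `Upper` type, and the tiny `block_on` poller are all hypothetical and exist only to make the example self-contained; they are not taken from the codebase.

    ```rust
    use std::future::Future;
    use std::pin::{pin, Pin};
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};

    /// Statically dispatched: native return-position `impl Future + Send`.
    trait Loader {
        fn load(&self, key: &str) -> impl Future<Output = String> + Send;
    }

    /// Object-safe: an explicit boxed `Send` future, usable as `dyn DynLoader`.
    trait DynLoader: Send + Sync {
        fn load_boxed<'a>(&'a self, key: &'a str)
            -> Pin<Box<dyn Future<Output = String> + Send + 'a>>;
    }

    struct Upper;

    impl Upper {
        /// The async body, outlined into an inherent method.
        async fn load_inner(&self, key: &str) -> String {
            key.to_uppercase()
        }
    }

    impl Loader for Upper {
        fn load(&self, key: &str) -> impl Future<Output = String> + Send {
            self.load_inner(key)
        }
    }

    impl DynLoader for Upper {
        fn load_boxed<'a>(&'a self, key: &'a str)
            -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
            Box::pin(self.load_inner(key))
        }
    }

    /// Waker that does nothing; enough for futures that never park.
    struct NoopWake;

    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }

    /// Minimal std-only poll loop, for demonstration only.
    fn block_on<F: Future>(fut: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut fut = pin!(fut);
        loop {
            if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
                return out;
            }
            std::thread::yield_now();
        }
    }
    ```

    In both shapes the `Send` bound appears in the trait signature itself rather than being implied by a macro expansion.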
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • [codex] Load user instructions through an injected provider (#27101)
    ## Why
    
    We want to remove implicit use of `$CODEX_HOME` from `codex-core` and
    make embedders responsible for supplying user-level instructions. This
    also ensures user instructions load when no primary environment is
    selected.
    
    ## What changed
    
    Stacked on #27415, which makes `codex exec` surface thread-scoped
    runtime warnings.
    
    - Added `UserInstructionsProvider` to `codex-extension-api`, with
    absolute source attribution and recoverable loading warnings.
    - Added `codex-home` with the filesystem-backed provider for
    `AGENTS.override.md` and `AGENTS.md`, preserving precedence, fallback,
    trimming, lossy UTF-8 handling, and the existing uncapped global
    instruction size.
    - Removed global instruction loading from `Config` and required
    `ThreadManager` callers to inject a provider.
    - Load provider instructions once for each fresh root runtime, including
    runtimes without a primary environment. Running sessions retain their
    snapshot, while child agents inherit the parent snapshot without
    invoking the provider.
    - Keep provider instructions separate while loading project `AGENTS.md`,
    then assemble the model-visible instructions with the existing ordering,
    source attribution, warning, and turn-context behavior.
    - Wired the Codex home provider through the CLI, app server, MCP server,
    core facade, and thread-manager sample.
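
    The file selection performed by the filesystem-backed provider might be sketched like this. It works on in-memory bytes rather than real files, `select_instructions` is a hypothetical name, and treating an empty-after-trim override as a reason to fall back to `AGENTS.md` is an assumption about the preserved behavior.

    ```rust
    /// Pick the global instructions from the two candidate files.
    /// `None` means the file does not exist. Returns the source name and text.
    fn select_instructions(
        override_md: Option<&[u8]>,
        agents_md: Option<&[u8]>,
    ) -> Option<(&'static str, String)> {
        // Precedence order: the override file first, then the regular file.
        let candidates = vec![
            ("AGENTS.override.md", override_md),
            ("AGENTS.md", agents_md),
        ];
        for (name, bytes) in candidates {
            if let Some(bytes) = bytes {
                // Lossy decoding replaces invalid UTF-8 instead of failing.
                let text = String::from_utf8_lossy(bytes).trim().to_string();
                if !text.is_empty() {
                    return Some((name, text));
                }
            }
        }
        None
    }
    ```

    Returning the source name alongside the text mirrors the source attribution the provider is described as carrying.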
    
    ## Validation
    
    - `just test -p codex-home -p codex-extension-api`
    - `just test -p codex-core agents_md`
    - `just test -p codex-core guardian`
    - `just test -p codex-app-server
    thread_start_without_selected_environment_includes_only_global_instruction_source`
    - `just test -p codex-exec warning`
    - `just bazel-lock-check`
  • [codex] migrate ExecutorFileSystem paths to PathUri (#27424)
    ## Why
    
    We're moving exec-server to use PathUri for its internal path
    representations.
    
    ## What
    
    Move `ExecutorFileSystem` APIs to use `PathUri` instead of
    `AbsolutePathBuf`. Future changes will convert higher-level parts of
    exec-server.
  • Route hosted Apps MCP through extensions (#27191)
    ## Stack
    
    - Base: #27184
    - This PR is the second vertical and should be reviewed against
    `jif/external-plugins-1`, not `main`.
    
    ## Why
    
    CCA is moving toward a split runtime where the orchestrator may have no
    filesystem or executor, but it still needs to activate remotely hosted
    plugin components. HTTP MCP servers are the simplest complete example:
    they need configuration and host authentication, but they do not need an
    executor process.
    
    The Apps MCP endpoint is currently synthesized by a special-purpose
    loader inside the MCP runtime. That works locally, but it leaves hosted
    MCP activation outside the extension model being established in #27184.
    It also makes the Apps path a poor foundation for plugins whose skills,
    MCP servers, connectors, and hooks may come from different sources or
    execute in different places.
    
    This PR moves that one behavior behind an extension-owned contribution
    while preserving the existing local fallback. It deliberately does not
    introduce a generic plugin activation framework.
    
    ## What changed
    
    ### MCP extension contribution
    
    `codex-extension-api` gains an ordered `McpServerContributor` contract.
    A contributor returns typed `Set` or `Remove` overlays for MCP server
    configuration; later contributors win for the names they own.
    
    The contract stays at the existing MCP configuration boundary.
    Extensions do not create a second connection manager or transport
    abstraction.
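
    A minimal sketch of how ordered overlays could resolve, assuming a simplified `ServerConfig` and hypothetical `Overlay` and `apply_overlays` names; the real `McpServerContributor` contract is not reproduced here.

    ```rust
    use std::collections::BTreeMap;

    /// Simplified stand-in for an MCP server configuration.
    #[derive(Clone, Debug, PartialEq)]
    struct ServerConfig {
        url: String,
    }

    /// Typed overlay returned by a contributor.
    enum Overlay {
        Set(String, ServerConfig),
        Remove(String),
    }

    /// Apply each contributor's overlays in order; later contributors win
    /// for the names they touch.
    fn apply_overlays(
        configured: BTreeMap<String, ServerConfig>,
        contributions: Vec<Vec<Overlay>>,
    ) -> BTreeMap<String, ServerConfig> {
        let mut runtime = configured;
        for overlays in contributions {
            for overlay in overlays {
                match overlay {
                    Overlay::Set(name, config) => {
                        runtime.insert(name, config);
                    }
                    Overlay::Remove(name) => {
                        runtime.remove(&name);
                    }
                }
            }
        }
        runtime
    }
    ```

    Because the result is still a plain name-to-configuration map, the existing connection manager can consume it unchanged.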
    
    ### Hosted Apps MCP extension
    
    A new `codex-mcp-extension` contributes the reserved `codex_apps` server
    from the existing Apps feature, ChatGPT base URL, path override, and
    product SKU configuration.
    
    When `apps_mcp_path_override` is enabled for `https://chatgpt.com`, the
    resulting streamable HTTP endpoint is
    `https://chatgpt.com/backend-api/ps/mcp`. The existing ChatGPT-auth gate
    remains authoritative, so this server can run in an orchestrator-only
    process without being exposed for API-key sessions.
    
    ### One resolved runtime view
    
    `McpManager` now distinguishes three views:
    
    - **configured:** config- and plugin-backed servers before extension
    overlays;
    - **runtime:** configured servers plus host-installed extension
    contributions;
    - **effective:** runtime servers after auth gating and compatibility
    built-ins.
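
    The step from the runtime view to the effective view can be illustrated for the auth gate alone. `effective_view` is a hypothetical helper, servers are reduced to names, and compatibility built-ins are left out.

    ```rust
    /// Reserved name of the hosted Apps server.
    const RESERVED_APPS_SERVER: &str = "codex_apps";

    /// Derive the effective view from the runtime view: the reserved Apps
    /// server is visible only when ChatGPT auth is present.
    fn effective_view(runtime: &[&str], has_chatgpt_auth: bool) -> Vec<String> {
        runtime
            .iter()
            .copied()
            .filter(|name| has_chatgpt_auth || *name != RESERVED_APPS_SERVER)
            .map(|name| name.to_string())
            .collect()
    }
    ```

    With `["codex_apps", "local"]` as the runtime view, an API-key session sees only `local`.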
    
    App-server installs the hosted MCP extension and uses the runtime view
    for thread startup, refresh, status, threadless resource reads,
    connector discovery, and MCP OAuth lookup. This keeps
    `mcpServer/oauth/login` consistent with the servers exposed by the other
    MCP APIs. The hosted Apps server itself continues to use existing
    ChatGPT host authentication rather than MCP OAuth.
    
    ## Compatibility
    
    Hosts that do not install the MCP extension retain the existing Apps MCP
    synthesis path. This preserves current local-only, CLI, and
    standalone-host behavior while app-server exercises the extension path.
    
    Disabling Apps removes the reserved `codex_apps` entry, and losing
    ChatGPT auth removes it from the effective runtime view. Executor
    availability is not consulted for this HTTP transport.
    
    ## Follow-ups
    
    The next vertical will resolve a manifest-declared stdio MCP server from
    an executor-selected plugin root and execute it in the environment that
    owns that root. Later verticals can add backend-owned skills, connector
    metadata, hooks, durable selection semantics, and incremental local
    convergence without changing the component-specific runtime boundaries
    introduced here.
    
    ## Verification
    
    Focused coverage was added for:
    
    - contributing the hosted Apps MCP at `/backend-api/ps/mcp` without an
    executor;
    - requiring ChatGPT auth in the effective runtime view;
    - removing a reserved configured Apps server when the Apps feature is
    disabled.
    
    `cargo check -p codex-app-server -p codex-mcp-extension -p
    codex-extension-api -p codex-mcp` passed. Tests and Clippy were not run
    locally under the current development instruction; CI provides the full
    validation pass.
  • Load selected executor skills through extensions (#27184)
    ## Why
    
    CCA is moving toward a split runtime where the orchestrator may not have
    a filesystem, while executors can expose preinstalled plugins and
    skills. A thread therefore needs to select capabilities without asking
    app-server or core to interpret executor-owned paths through the
    orchestrator's filesystem.
    
    The longer-term model is broader than executor skills:
    
    - A plugin is a bundle of skills, MCP servers, connectors/apps, and
    hooks.
    - A plugin root can be local, executor-owned, or hosted by a backend.
    - Components inside one plugin can use different access and execution
    mechanisms. A skill may be read from a filesystem or through backend
    tools; an HTTP MCP server can run without an executor; a stdio MCP
    server or hook needs an execution environment.
    - Core should carry generic extension initialization data. The extension
    that owns a component should discover it, expose it to the model, and
    invoke it through the appropriate runtime.
    
    This PR establishes that architecture through one complete vertical:
    selecting a root on an executor, discovering the skills beneath it,
    exposing those skills to the model, and reading an explicitly invoked
    `SKILL.md` through the same executor.
    
    ## Contract
    
    `thread/start` gains an experimental `selectedCapabilityRoots` field:
    
    ```json
    {
      "selectedCapabilityRoots": [
        {
          "id": "deploy-plugin@1",
          "location": {
            "type": "environment",
            "environmentId": "workspace",
            "path": "/opt/codex/plugins/deploy"
          }
        }
      ]
    }
    ```
    
    The root is intentionally not classified as a "plugin" or "skill" in the
    API. It can point at a standalone skill, a directory containing several
    skills, or a plugin containing skills and other components. This PR only
    teaches the skills extension how to consume it; later extensions can
    resolve MCP, connector, and hook components from the same selection.
    
    The platform-supplied `id` is stable selection identity. The location
    says which runtime owns the root and gives that runtime an opaque path.
    App-server does not inspect or canonicalize the path.
    
    ## What changed
    
    ### Generic thread extension initialization
    
    App-server converts selected roots into `ExtensionDataInit`. Core
    carries that generic initialization value until the final thread ID is
    known, then creates thread-scoped `ExtensionData` before lifecycle
    contributors run.
    
    This keeps `Session` and core independent of the capability-selection
    contract. The initialization value is consumed during construction; it
    is not retained as another long-lived `Session` field.
    
    ### Executor-backed skills
    
    The skills extension now owns an `ExecutorSkillProvider` that:
    
    - resolves the selected environment through `EnvironmentManager`
    - discovers, canonicalizes, and reads skills through that environment's
    `ExecutorFileSystem`
    - contributes the bounded selected-skill catalog as stable developer
    context
    - reads an explicitly invoked skill body through the authority that
    listed it
    - warns when an environment or root is unavailable
    - never falls back to the orchestrator filesystem for an executor-owned
    root
    
    Skill catalog and instruction fragments have hard byte bounds, which
    also bound them below the 10K-token per-item context limit. If a
    selected executor skill has the same name as a legacy local skill, the
    executor selection owns that invocation and the local body is not
    injected a second time.
    
    Existing local and bundled skill loading remains in place. Omitting
    `selectedCapabilityRoots` therefore preserves current local-only
    behavior.
    
    ## Current semantics
    
    - Only environment-owned locations are represented in this first
    contract.
    - Roots are resolved by the destination extension, not by app-server or
    core.
    - An unavailable executor or invalid root produces a warning and no
    capabilities from that root; it does not trigger a local-filesystem
    fallback.
    - Selection applies to a newly started active thread.
    - MCP servers, connectors, and hooks beneath a selected plugin root are
    not activated yet.
    - Selection is not yet persisted or inherited across resume, fork, or
    subagent creation. Existing local capabilities continue to behave as
    they do today in those flows.
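
    The warning-without-fallback rule above can be sketched as follows. This
    is a minimal illustration, not the PR's `ExecutorSkillProvider`: the
    `SelectedRoot`, `Environment`, `Resolution`, and `resolve_roots` names are
    hypothetical, and an environment is modelled as a plain map from root path
    to skill names instead of a real `ExecutorFileSystem`.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical mirror of one `selectedCapabilityRoots` entry.
    #[derive(Debug, Clone, PartialEq)]
    pub struct SelectedRoot {
        pub id: String,
        pub environment_id: String,
        /// Opaque to everything except the owning environment.
        pub path: String,
    }

    /// Hypothetical stand-in for an executor environment: maps a root path to
    /// the skill names discoverable beneath it.
    pub type Environment = HashMap<String, Vec<String>>;

    #[derive(Debug, Default, PartialEq)]
    pub struct Resolution {
        /// `(root id, skill name)` pairs contributed by resolvable roots.
        pub skills: Vec<(String, String)>,
        /// One message per root that could not be resolved.
        pub warnings: Vec<String>,
    }

    /// Resolves each root through the environment that owns it. A missing
    /// environment or unknown root yields a warning and nothing else; there
    /// is deliberately no local-filesystem fallback.
    pub fn resolve_roots(
        roots: &[SelectedRoot],
        environments: &HashMap<String, Environment>,
    ) -> Resolution {
        let mut out = Resolution::default();
        for root in roots {
            let Some(env) = environments.get(&root.environment_id) else {
                out.warnings.push(format!(
                    "root {}: environment {} is unavailable",
                    root.id, root.environment_id
                ));
                continue;
            };
            match env.get(&root.path) {
                Some(names) => {
                    for name in names {
                        out.skills.push((root.id.clone(), name.clone()));
                    }
                }
                None => out
                    .warnings
                    .push(format!("root {}: path is not a valid root", root.id)),
            }
        }
        out
    }
    ```

    Returning warnings alongside the resolved skills, rather than an error,
    matches the stated behavior that one bad root does not prevent the thread
    from starting with the remaining capabilities.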
    
    ## Planned vertical follow-ups
    
    1. **Hosted HTTP MCP:** add an extension-backed HTTP MCP source that
    works without an executor, then replace the special-purpose MCP plugins
    loader with that implementation.
    2. **Executor MCP:** register and execute stdio MCP servers through the
    environment that owns the selected plugin root.
    3. **Backend skills:** add a hosted skill source whose catalog and
    bodies are accessed through extension tools rather than a filesystem.
    4. **Connectors and hooks:** activate those components through their
    owning extensions, using the same selected-root boundary and
    component-specific runtime.
    5. **Durable selection:** define the desired-selection lifecycle,
    persist it, and make resume, fork, and subagent inheritance explicit
    rather than accidental.
    6. **Local convergence:** incrementally route existing local plugin,
    skill, and MCP loading through the same extension model while preserving
    current local behavior.
    
    Each follow-up remains reviewable as an end-to-end capability. The
    platform selects roots, generic thread extension data carries the
    selection, and the owning extension resolves and operates its component.
    
    ## Verification
    
    Coverage added for:
    
    - app-server end-to-end discovery and explicit invocation of a skill
    inside an executor-selected plugin root
    - exclusive invocation when a selected executor skill collides with a
    local skill name
    - executor filesystem authority for discovery, canonicalization, and
    reads
    - thread extension initialization before lifecycle contributors run
    - stable executor catalog context, explicit invocation, context
    rebuilding, hidden skills, and preserved host/remote catalog behavior
    
    Targeted protocol, core-skills, skills-extension, core lifecycle, and
    app-server executor-skill tests were run during development.
  • Add typed file URIs (#26840)
    ## Why
    
    Codex needs stable `file:` URI identifiers that can cross process and
    operating-system boundaries without eagerly interpreting them as native
    paths. Existing fields also need to keep accepting absolute path strings
    during migration.
    
    ## What changed
    
    - Add `codex-utils-path-uri` with a validated, immutable `PathUri`
    wrapper that currently accepts only `file:` URLs.
    - Expose URI-level `basename`, `parent`, and `join` operations that
    preserve authorities and percent encoding without guessing the source
    operating system.
    - Keep native conversion explicit through `AbsolutePathBuf` and the
    current host rules.
    - Serialize as canonical URI text while accepting both URI text and
    legacy absolute native paths during deserialization.
    - Add adversarial coverage for Windows-looking and POSIX paths, UNC
    authorities, encoded metadata characters, non-UTF-8 POSIX paths, URI
    hierarchy operations, and legacy serde round trips.
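
    To illustrate what "URI-level" hierarchy operations mean, here is a
    minimal sketch that works purely on the URI text. It is not the
    `PathUri` implementation: `split_file_uri`, `uri_basename`, and
    `uri_parent` are hypothetical helpers, validation is omitted, and only
    the `file://authority/path` form is handled.

    ```rust
    /// Splits `file://authority/path` into `(authority, path)`. Returns `None`
    /// for anything that is not a `file://` URI with an absolute path.
    fn split_file_uri(uri: &str) -> Option<(&str, &str)> {
        let rest = uri.strip_prefix("file://")?;
        let slash = rest.find('/')?;
        Some((&rest[..slash], &rest[slash..]))
    }

    /// Last path segment, returned still percent-encoded; `None` for the root.
    pub fn uri_basename(uri: &str) -> Option<&str> {
        let (_, path) = split_file_uri(uri)?;
        let trimmed = path.trim_end_matches('/');
        let name = trimmed.rsplit('/').next()?;
        if name.is_empty() {
            None
        } else {
            Some(name)
        }
    }

    /// Parent URI with the authority and percent encoding preserved; `None`
    /// for the root.
    pub fn uri_parent(uri: &str) -> Option<String> {
        let (authority, path) = split_file_uri(uri)?;
        let trimmed = path.trim_end_matches('/');
        let cut = trimmed.rfind('/')?;
        // Keep the leading slash when the parent is the root itself.
        let parent = if cut == 0 { "/" } else { &trimmed[..cut] };
        Some(format!("file://{authority}{parent}"))
    }
    ```

    Because nothing is decoded or converted to a native path, a
    Windows-looking segment such as `C:` or an encoded space such as `%20`
    passes through unchanged regardless of the host operating system.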
  • [codex] Restore release symbol artifacts with line tables (#26202)
    ## Summary
    
    - Restore separate release symbol archives for macOS, Linux, and Windows
    binaries.
    - Build release binaries with `line-tables-only` debuginfo instead of
    full debuginfo.
    - Strip Unix distribution binaries after extracting symbols, preserve
    Windows PDBs, and keep symbol archives available to the release job.
    - Strip the packaged Linux `bwrap` binary before hashing it so the
    embedded digest matches the distributed bytes.
    
    ## Root cause
    
    The first symbol-artifact implementation enabled
    `CARGO_PROFILE_RELEASE_DEBUG=full`. In the June 2 release runs, macOS
    ARM primary builds reached the 90-minute timeout while still inside
    `Cargo build`. After the symbol changes were reverted, the same primary
    build completed in about 22 minutes. The archive step itself completed
    in tens of seconds when reached.
    
    Rust's `line-tables-only` debuginfo level preserves function names and
    source locations for symbolication without emitting the heavier variable
    and type information from full debuginfo.
    
    ## Validation
    
    - Ran `just fmt` from `codex-rs`.
    - Ran `just test-github-scripts` from the repository root: 23 tests
    passed.
    - Ran `bash -n` and `shellcheck` on
    `.github/scripts/archive-release-symbols-and-strip-binaries.sh`.
    - Parsed both modified workflows as YAML and ran `git diff --check`.
    - Built a macOS release smoke binary with `line-tables-only`, archived
    its dSYM through the restored script, stripped the production binary,
    and verified that `atos` resolves `symbol_smoke_function` to
    `main.rs:2`.
    - Ran Linux archive-script control-flow coverage with stubbed `objcopy`
    and `strip` commands.
    - Ran Windows PDB archive staging coverage and verified
    underscore-emitted Rust PDB names are staged under shipped hyphenated
    binary names.
    
    ## Follow-up
    
    The release workflow only runs for tags or manual dispatches, so CI
    cannot dry-run the full release matrix on this PR. The next release run
    will verify runner time and memory behavior under `line-tables-only`.
  • [2 of 2] Finish moving goal runtime to extension (#26548)
    ## Stack
    
    1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align
    goal extension with core behavior
    2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move
    goal runtime to extension
    
    ## Why
    
    This PR completes the switch of the goal behavior to the
    extension-backed runtime and removes the old core goal implementation.
    
    ## What Changed
    
    - Installs the goal extension for app-server `ThreadManager` sessions.
    - Routes app-server thread goal `get`, `set`, and `clear` through
    `GoalService`.
    - Uses thread-idle lifecycle emission after goal resume and snapshot
    ordering so the extension can decide whether to continue the goal.
    - Forwards extension goal updates through a FIFO async app-server
    notification path so backpressure does not drop them or reorder updates.
    - Keeps review turns from enabling goal runtime behavior.
    - Plans extension tools before dynamic tools so built-in goal tool names
    keep their old precedence when goals are enabled.
    - Removes the old core goal runtime, core goal tool handlers, and core
    goal tool specs.
    - Updates tests that were coupled to the core-owned goal runtime while
    leaving the legacy `<goal_context>` compatibility path in core for old
    threads.
    - Removes the stale cargo-shear ignore now that `codex-goal-extension`
    is used by the workspace.
    - Keeps realtime event matching exhaustive after removing the old
    goal-specific realtime text path.
    
    
    ## Validation
    
    - Ran `/goal` manually in the TUI. Validated that time accounting matched
    wall-clock time, and checked goal lifecycle state transitions.
  • build: use ThinLTO for release binaries (#23710)
    ## Why
    
    Fat LTO makes release builds substantially slower without providing
    enough measured runtime benefit to justify the release CI long pole. The
    build-profile investigation found that keeping Cargo's default release
    `opt-level=3` and switching from fat LTO to ThinLTO (`3/thin/1`) reduced
    a clean `codex-cli` release build from 2073.893 seconds to 1243.172
    seconds, a 40.06% improvement.
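
    The 40.06% figure follows directly from the two build times. As a quick
    check (the `percent_reduction` helper is only illustrative):

    ```rust
    /// Percentage reduction going from `before` to `after`; positive means
    /// `after` is smaller.
    pub fn percent_reduction(before: f64, after: f64) -> f64 {
        (before - after) / before * 100.0
    }
    ```

    With `before = 2073.893` and `after = 1243.172`, the difference is
    830.721 seconds, which is roughly 40.06% of the fat-LTO build time.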
    
    The resulting binary increased from 196.7 MiB to 211.8 MiB (+7.63%).
    Measured runtime changes were small: the worst image workload median was
    +0.86% and app-server startup was +0.31% relative to fat LTO. ThinLTO
    retains cross-crate optimization while avoiding most of the fat-LTO
    build cost.
    
    This deliberately avoids global size optimization: final-executable
    testing showed a substantial regression on the image request path, which
    is expected to become more important as image usage grows.
    
    ## What changed
    
    - Set the workspace release profile to `lto = "thin"`, retaining Cargo's
    default release `opt-level=3`.
    - Remove release and CI workflow-specific LTO overrides so
    release-profile builds consistently use the workspace setting.
    - Remove the now-unused Windows release workflow input and related
    diagnostic output.
    
    ## Validation
    
    - Confirmed the release profile parses with `cargo metadata --no-deps
    --format-version 1`.
    - CI validates release builds across the supported target matrix.
  • Optimize unbounded byte scans with memchr (#26265)
    ## Summary
    
    This PR adds `memchr` for some low-hanging performance improvements
    (namely, in MCP stdio, Ollama streaming, and full message-history
    newline counts).
    
    Codex produced the following release benchmarks:
    
    | Operation | Before | After | Speedup |
    | --- | ---: | ---: | ---: |
    | MCP 1 MiB chunked line | 2.172 s | 3.984 ms | 545x |
    | Ollama 1 MiB chunked line | 1.673 s | 2.790 ms | 600x |
    | Count newlines in 10 MiB history | 132.83 ms | 20.05 ms | 6.6x |
    
    With a "real" MCP setup (`ExecutorStdioServerLauncher` started a Python
    MCP server, completed `initialize`, requested `tools/list`, and
    deserialized a 1 MiB tool description over newline-delimited stdio),
    it's about 16x faster end-to-end:
    
    | Branch | 50 calls | Per call |
    | --- | ---: | ---: |
    | `main` | 862.53 ms | 17.25 ms |
    | this branch | 53.89 ms | 1.08 ms |
    
    `memchr` is already in our dependency tree and extremely widely used for
    this kind of optimized scanning.
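
    The PR text does not show the call sites, so the following is only a
    sketch of the two kinds of scan named above. It is written against the
    standard library so it runs without dependencies; the comments note the
    `memchr` crate calls that would replace each body. The function names
    are hypothetical.

    ```rust
    /// Index of the first `\n` in `haystack`, if any.
    /// With the `memchr` crate: `memchr::memchr(b'\n', haystack)`.
    pub fn find_newline(haystack: &[u8]) -> Option<usize> {
        haystack.iter().position(|&b| b == b'\n')
    }

    /// Number of `\n` bytes in `haystack`.
    /// With the `memchr` crate: `memchr::memchr_iter(b'\n', haystack).count()`.
    pub fn count_newlines(haystack: &[u8]) -> usize {
        haystack.iter().filter(|&&b| b == b'\n').count()
    }
    ```

    Keeping the scan behind a small function like this makes it easy to swap
    the byte-at-a-time loop for the vectorized search without touching
    callers.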
  • chore: extract context fragments into dedicated crate (#26122)
    ## Why
    
    `codex-core` currently owns the generic contextual-fragment trait and
    several reusable fragment implementations. That makes it harder for
    other crates to share the same host-owned model-input abstraction
    without depending on all of `codex-core`.
    
    This change extracts the reusable fragment machinery into a small
    `codex-context-fragments` crate so future extension and skills work can
    depend on the fragment abstraction directly.
    
    ## What Changed
    
    - Added the `codex-context-fragments` crate with:
      - `ContextualUserFragment`
      - `FragmentRegistration` / `FragmentRegistrationProxy`
      - additional-context fragment types
    - Moved `SkillInstructions` into `codex-core-skills`, since
    skill-specific rendering belongs with skills rather than generic core
    context machinery.
    - Kept `codex-core` re-exporting the fragment types it still uses
    internally, so existing call sites keep the same shape.
    - Updated Cargo and Bazel workspace metadata for the new crate.
    
    ## Verification
    
    - `cargo metadata --locked --format-version 1 --no-deps`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
  • feat: add skills extension scaffold (#25953)
    ## Disclaimer
    This is only here for iteration purposes! Do not make any code rely on
    this.
    
    ## Why
    
    Skills still live behind `codex-core` discovery and injection paths, but
    the extension system needs an authority-aware home before that logic can
    move. This adds that boundary without changing current skills behavior,
    and keeps host, executor, and remote skills distinct so future
    list/read/search flows do not collapse back to ambient local paths.
    
    ## What changed
    
    - Add the `codex-skills-extension` workspace/Bazel crate under
    `ext/skills`.
    - Define the initial catalog, authority, provider, and turn-state types
    for authority-bound skill packages and resources.
    - Register placeholder thread/config/prompt/turn lifecycle contributors
    plus host, executor, and remote provider aggregation points.
    - Capture the remaining extraction work as TODOs, including the missing
    extension API hooks needed for per-turn catalog construction and typed
    skill injection.
    - Keep plugins outside the runtime skills model: plugin-installed skills
    are treated as materialized host-owned skill sources once available.
    
    ## Verification
    
    - Not run locally.
  • Move cloud requirements crate to cloud config (#24621)
    ## Summary
    
    - Moves the existing `codex-cloud-requirements` crate to
    `codex-cloud-config`.
    - Updates workspace dependencies and imports to the new crate name.
    - Intentionally keeps runtime behavior unchanged: this still fetches the
    legacy cloud requirements endpoint.
    
    ## Details
    
    This PR exists to make the lineage obvious before the bundle migration.
    GitHub should show the old `codex-rs/cloud-requirements/src/lib.rs`
    implementation as moved to `codex-rs/cloud-config/src/lib.rs`, rather
    than as unrelated new code.
    
    The follow-up PR adapts this moved crate to the new config bundle API
    and switches runtime consumers over.
  • [codex] Consolidate shared prompts in codex-prompts (#25151)
    ## Why
    
    `codex_core` is consistently a bottleneck for incremental builds during
    iteration. The simplest fix is to make the crate smaller.
    
    ## Summary
    
    `codex-core` owns several reusable prompt renderers and static prompt
    assets, which makes the crate harder to split apart.
    
    Rename `codex-review-prompts` to `codex-prompts` and move shared review,
    goal, permissions, compaction, realtime, hierarchical AGENTS.md, and
    `apply_patch` prompts into it. Move prompt-only tests and update
    consumers and `CODEOWNERS`.
    
    ## Validation
    
    - `just test -p codex-prompts -p codex-apply-patch`
    - `just test -p codex-core prompt_caching`
    - Bazel builds for the affected crates
  • Add feature-gated standalone image generation extension (#24723)
    ## Why
    
    Add a standalone image generation path that can be exercised
    independently of hosted Responses image generation, while retaining the
    hosted tool as fallback unless the extension is actually available to
    the model.
    
    ## What changed
    
    - Added the `codex-image-generation-extension` crate with standalone
    generate/edit execution, prior-image selection for edits, model-visible
    image output, and local generated-image persistence.
    - Installed the extension in app-server behind the disabled-by-default
    `imagegenext` feature and backend eligibility checks.
    - Updated core tool planning so eligible `image_gen.imagegen` exposure
    replaces hosted `image_generation`, while unavailable configurations
    retain hosted fallback.
    - Added coverage for extension behavior, edit history reuse, feature
    gating, auth eligibility, and hosted-tool replacement.
    - The extension is installed through app-server only in this PR; other
    execution paths retain hosted image generation because hosted
    replacement occurs only when the standalone executor is actually
    registered and model-visible.
    - The initial extension contract intentionally fixes the image model to
    `gpt-image-2` and uses automatic image parameters.
    - Native generated-image history/card parity and rollout persistence
    cleanup are intentionally deferred follow-up work.
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - `just test -p codex-features`
    - `just test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
    - `just test -p codex-app-server`
    - `just fix -p codex-image-generation-extension -p codex-features -p
    codex-core -p codex-app-server`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • Add app-server startup benchmark crate (#24651)
    ## Summary
    - Add a new `app-server-start-bench` crate to measure app-server startup
    performance
    - Wire the benchmark into the workspace and Bazel build so it can be run
    consistently
    - Update lockfiles and repo automation to account for the new package
  • Update rmcp to 1.7.0 (#24763)
    Will make it easier to uprev when the new draft spec is supported.
    
    Also updates reqwest where needed for compatibility but doesn't update
    it everywhere since this is already a large diff.
    
    The new version of rmcp handles certain kinds of authentication failures
    differently; this patch includes support for identifying the failing scope
    in a WWW-Authenticate header.
  • Bump SQLx to pick up newer bundled SQLite (#24728)
    ## Why
    
    Codex stores thread, log, goal, and memory state in bundled SQLite
    databases through SQLx. We have a suspected SQLite WAL-reset corruption
    issue under heavy concurrent writer load, especially when multiple
    subagents are active. The existing `sqlx 0.8.6` dependency kept us on an
    older `libsqlite3-sys` / bundled SQLite, so this PR moves the SQLx stack
    far enough forward to pick up the newer bundled SQLite library.
    
    ## What changed
    
    - Bump the workspace `sqlx` dependency to `0.9.0`.
    - Use the SQLx 0.9 feature names explicitly: `runtime-tokio`,
    `tls-rustls`, and `sqlite-bundled`.
    - Update `Cargo.lock` so `sqlx-sqlite` resolves through `libsqlite3-sys
    0.37.0`.
    - Refresh `MODULE.bazel.lock` for the dependency changes.
    - Adapt `codex-state` to SQLx 0.9:
      - build dynamic state queries with `QueryBuilder<Sqlite>` instead of
        passing dynamic `String`s to `sqlx::query`;
      - remove the old `QueryBuilder` lifetime parameter from helper
        signatures;
      - preserve SQLx's new `Migrator` fields when constructing runtime
        migrators.
    
    ## Verification
    
    - `just test -p codex-state`
    - `just bazel-lock-check`
    - `cargo check -p codex-state --tests`
  • standalone websearch extension (#23823)
    ## Summary
    
    Add the extension-backed standalone `web.run` tool so Codex can call the
    standalone search endpoint through the `codex-api` search client and
    return its encrypted output to Responses.
    
    - gate the new tool behind `standalone_web_search`
    - install the extension in the app-server thread registry and hide
    hosted `web_search` when standalone search is enabled for OpenAI
    providers so the two paths stay mutually exclusive
    - build search context from persisted history using a small tail
    heuristic: previous user message, assistant text between the last two
    user turns capped at about 1k tokens, and current user message
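
    The tail heuristic in the last bullet can be sketched as below. This is
    an illustration under stated assumptions, not the extension's code: the
    `Role`, `SearchContext`, and `build_search_context` names are
    hypothetical, the last user entry in history is assumed to be the
    current message, whitespace-separated words stand in for tokens, and
    truncation is assumed to keep the most recent assistant text.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Role {
        User,
        Assistant,
    }

    #[derive(Debug, PartialEq)]
    pub struct SearchContext {
        pub previous_user: Option<String>,
        pub assistant_between: String,
        pub current_user: String,
    }

    /// Assumed cap: the source only says "about 1k tokens", and words are
    /// used here as a rough proxy for tokens.
    const ASSISTANT_WORD_CAP: usize = 1000;

    /// Builds the context from the tail of `history`, or returns `None` when
    /// there is no user message at all.
    pub fn build_search_context(history: &[(Role, &str)]) -> Option<SearchContext> {
        // Index of the current (last) user message.
        let current = history.iter().rposition(|(role, _)| *role == Role::User)?;
        // Index of the user message before it, if any.
        let previous = history[..current]
            .iter()
            .rposition(|(role, _)| *role == Role::User);
        let start = previous.map_or(0, |i| i + 1);
        // Assistant text between the last two user turns, split into words.
        let words: Vec<&str> = history[start..current]
            .iter()
            .filter(|(role, _)| *role == Role::Assistant)
            .flat_map(|(_, text)| text.split_whitespace())
            .collect();
        // Keep only the most recent words when over the cap (an assumption).
        let kept = &words[words.len().saturating_sub(ASSISTANT_WORD_CAP)..];
        Some(SearchContext {
            previous_user: previous.map(|i| history[i].1.to_string()),
            assistant_between: kept.join(" "),
            current_user: history[current].1.to_string(),
        })
    }
    ```

    Bounding only the assistant text keeps both user messages intact while
    limiting how much of a long answer is sent to the search endpoint.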
    
    ## Test Plan
    
    - `cargo test -p codex-web-search-extension`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
  • chore: drop orphaned codex memories MCP crate (#24555)
    ## Why
    
    The memory read-tool surface had two implementations: the app-server
    extension path under `ext/memories`, and an unused `codex-memories-mcp`
    workspace crate under `memories/mcp`. The MCP crate no longer has
    reverse dependents, so keeping it around preserves duplicate backend,
    schema, and tool code that is not part of the live app-server memory
    path.
    
    Dropping the orphaned crate makes the remaining memory crate split
    clearer: `memories/read` owns read-path prompt/citation helpers,
    `memories/write` owns the write pipeline, and `ext/memories` owns the
    app-server extension integration.
    
    ## What changed
    
    - Removed the `memories/mcp` crate and its Bazel/Cargo metadata.
    - Removed `memories/mcp` from the Rust workspace and lockfile.
    - Updated `memories/README.md` so it only lists the remaining reusable
    memory crates.
    
    ## Verification
    
    - `cargo metadata --format-version 1 --no-deps` succeeds.
  • [codex] Add image re-encoding benchmarks (#23935)
    ## Summary
    - add Divan benchmarks for prompt image re-encoding paths
    - wire the image benchmark smoke test into Rust CI workflows
    
    ## Why
    Image prompt handling includes re-encoding work that benefits from
    repeatable benchmark coverage so changes can be measured in CI and
    locally.
    
    This already helped identify a potential regression from changing compiler flags.
    
    ## Impact
    Developers can run and compare the new image re-encoding benchmarks, and
    CI exercises the benchmark target via the Rust benchmark smoke test.
  • feat: support local refs and defs in tool input schemas (#23357)
    # Why
    
    Some connector tool input schemas use local JSON Schema references and
    definition tables to avoid duplicating large nested shapes. Codex
    previously lowered these schemas into the supported subset in a way that
    could discard `$ref`-only schema objects and lose the corresponding
    definitions, which made non-strict tool registration less faithful than
    the original connector schema.
    
    This keeps the existing minimal-lowering policy: Codex still does not
    raw-pass through arbitrary JSON Schema, but it now preserves local
    reference structure that fits the Responses-compatible subset and prunes
    definition entries that cannot be reached by following `$ref`s from the
    root schema after sanitization, including refs found transitively inside
    other reachable definitions. The pruning matters because Responses
    parses definition tables even when entries are unused, so keeping dead
    definitions wastes prompt tokens.
    
    # What changed
    
    - Added `$ref`, `$defs`, and legacy `definitions` fields to the tool
    `JsonSchema` representation.
    - Updated `parse_tool_input_schema` lowering so `$ref`-only schema
    objects survive sanitization instead of becoming `{}`.
    - Sanitized definition tables recursively and dropped malformed
    definition tables so non-strict registration degrades gracefully.
    - Added reachability pruning for root definition tables by starting from
    refs outside definition tables, then following refs inside reachable
    definitions.
    - Added JSON Pointer decoding for local definition refs such as
    `#/$defs/Foo~1Bar`.
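
    The last two bullets can be sketched together. This is a simplified
    model, not `parse_tool_input_schema`: definitions are represented as a
    map from name to the refs found inside them, and the
    `decode_pointer_token`, `local_def_name`, and `reachable_defs` names are
    hypothetical. Token decoding follows RFC 6901 (`~1` is `/`, `~0` is
    `~`); percent-decoding of the URI fragment is left out.

    ```rust
    use std::collections::{BTreeMap, BTreeSet};

    /// Decodes one JSON Pointer reference token. `~1` must be replaced first
    /// so that `~01` decodes to `~1` rather than `/`.
    pub fn decode_pointer_token(token: &str) -> String {
        token.replace("~1", "/").replace("~0", "~")
    }

    /// Extracts the definition name from a local ref such as
    /// `#/$defs/Foo~1Bar` or `#/definitions/Foo`.
    pub fn local_def_name(reference: &str) -> Option<String> {
        let token = reference
            .strip_prefix("#/$defs/")
            .or_else(|| reference.strip_prefix("#/definitions/"))?;
        // A further `/` would point inside a definition; this sketch skips it.
        if token.contains('/') {
            return None;
        }
        Some(decode_pointer_token(token))
    }

    /// Names of definitions reachable from `root_refs`, following refs found
    /// inside each reachable definition. Everything else can be pruned.
    pub fn reachable_defs(
        root_refs: &[&str],
        defs: &BTreeMap<String, Vec<String>>,
    ) -> BTreeSet<String> {
        let mut seen = BTreeSet::new();
        let mut stack: Vec<String> = root_refs
            .iter()
            .filter_map(|r| local_def_name(r))
            .collect();
        while let Some(name) = stack.pop() {
            // Skip dangling refs and already-visited names (handles cycles).
            if !defs.contains_key(&name) || !seen.insert(name.clone()) {
                continue;
            }
            for r in &defs[&name] {
                if let Some(next) = local_def_name(r) {
                    stack.push(next);
                }
            }
        }
        seen
    }
    ```

    Starting the walk only from refs outside the definition tables is what
    lets mutually referencing but otherwise unused definitions be dropped.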
    
    # Verification
    Ran local golden-schema probes against representative connector schemas
    to validate behavior on real generated schemas:
    
    | Golden schema | Before bytes | After bytes | `$defs` before -> after | `$ref` before -> after | Result |
    |---|---:|---:|---:|---:|---|
    | `google_calendar/create_space` | 7111 | 4526 | 7 -> 7 | 7 -> 7 | all definitions preserved because all are reachable |
    | `figma/apply_file_variable_changes` | 4609 | 999 | 8 -> 5 | 8 -> 5 | unused defs pruned after unsupported `oneOf` shapes lower away |
    | `snowflake/list_catalog_integrations` | 1380 | 404 | 3 -> 0 | 0 -> 0 | all defs pruned because none are referenced |
    | `dropbox/create_shared_link` | 8894 | 1836 | 14 -> 4 | 9 -> 4 | only defs reachable from the root schema after sanitization are retained, including transitively through other retained defs |
    
    Token increase across golden schema due to this change:
    <img width="817" height="366" alt="Screenshot 2026-05-19 at 1 47 04 PM"
    src="https://github.com/user-attachments/assets/d5c80fe9-da85-41e6-8ac7-a01d1e0b0f71"
    />
  • CI: Customize v8 building (#22086)
    ## Summary
    
    Move the rusty_v8 artifact production into a hermetic Bazel path and bump
    the `v8` crate to `147.4.0`.
    
    The new flow builds V8 release artifacts from source for Darwin and
    Linux targets, publishes both the current release-compatible artifacts
    and sandbox-enabled variants, and keeps Cargo consumers on prebuilt
    binaries by continuing to feed the `v8` crate the archive and generated
    binding files it already expects.
    
    ## Why
    
    We need control over V8 build-time features without giving up prebuilt
    artifacts for downstream Cargo builds.
    
    Upstream `rusty_v8` already supports source-only features such as
    `v8_enable_sandbox`, but its normal prebuilt release assets do not cover
    every feature combination we need. Building the artifacts ourselves lets
    us enable settings such as the V8 sandbox and pointer compression at
    artifact build time, then publish those outputs so ordinary Cargo builds
    can still consume prebuilts instead of compiling V8 locally.
    
    This keeps the fast consumer experience of prebuilt `rusty_v8` archives
    while giving us a reproducible path to ship featureful variants that
    upstream does not currently publish for us.
    
    ## Implementation Notes
    
    The Bazel graph in this PR is not copied wholesale from `rusty_v8`;
    `rusty_v8`'s normal source build is still GN/Ninja-based.
    
    Instead, this change starts from upstream V8's Bazel rules and adapts
    them to Codex's hermetic toolchains and dependency layout. Where we
    intentionally follow `rusty_v8`, we mirror its existing artifact
    contract:
    
    - the same `v8` crate version and generated binding expectations
    - the same sandbox feature relationship, where sandboxing requires
    pointer compression
    - the same custom libc++ model expected by Cargo's default
    `use_custom_libcxx` feature
    - the same release-style archive plus `src_binding` outputs consumed by
    the `v8` crate
    
    To preserve that contract, the Bazel release path pins the libc++,
    libc++abi, and llvm-libc revisions used by `rusty_v8 v147.4.0`, builds
    release artifacts with `--config=rusty-v8-upstream-libcxx`, and folds
    the matching runtime objects into the final static archive.
    
    ## Windows
    
    Windows is annoyingly handled differently.
    
    Codex's current hermetic Bazel Windows C++ platform is `windows-gnullvm`
    / `x86_64-w64-windows-gnu`, while upstream `rusty_v8` publishes Windows
    prebuilts for `*-pc-windows-msvc`. Those are different ABIs, so the
    Bazel graph cannot truthfully reproduce the upstream MSVC artifacts
    until we add a real MSVC-targeting C++ toolchain.
    
    For now:
    
    - Windows MSVC consumers continue to use upstream `rusty_v8` release
    archives.
    - Windows GNU targets are built in-tree so they link against a matching
    GNU ABI.
    - The canary workflow separately exercises upstream `rusty_v8` source
    builds for MSVC sandbox artifacts, but MSVC is not yet part of the
    Bazel-produced release matrix.
    
    ## Validation
    This PR is technically self-validating through CI. I have already
    published it as a release tag, so the artifacts from this branch are
    available at
    https://github.com/openai/codex/releases/tag/rusty-v8-v147.4.0. CI for
    this PR should therefore consume our own release targets. I have also
    tested locally on Linux and Darwin.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: goal ext skeleton (#23288)
    Skeleton of `/goal` in extension
    Lots of follow-ups coming
  • Move memory prompt injection to app-server extension (#22841)
    ## Why
    
    Memory prompt injection should be owned by the extension path that
    app-server composes at runtime, not by an inlined special case inside
    `codex-core`. This keeps `codex-core` focused on session orchestration
    while allowing the memories extension to own its app-server prompt
    behavior.
    
    ## What Changed
    
    - Registers `codex-memories-extension` in the app-server extension
    registry.
    - Moves the memory developer-instruction injection out of
    `core/src/session/mod.rs` and into the memories extension prompt
    contributor.
    - Adds config-change handling so the extension keeps its per-thread
    memory settings in sync after startup.
    - Leaves memories read/retrieval tools unregistered for now so this PR
    only changes prompt injection.
    - Removes the stale `cargo-shear` ignore now that app-server depends on
    the extension crate.
    
    ## Validation
    
    Not run locally; validation is left to CI.
  • chore(config) rm Feature::CodexGitCommit (#22412)
    ## Summary
    Removes the unused `Feature::CodexGitCommit`.
    
    ## Testing
    - [x] tests pass
  • config: add strict config parsing (#20559)
    ## Why
    
    Codex intentionally ignores unknown `config.toml` fields by default so
    older and newer config files keep working across versions. That leniency
    also makes typo detection hard because misspelled or misplaced keys
    disappear silently.
    
    This change adds an opt-in strict config mode so users and tooling can
    fail fast on unrecognized config fields without changing the default
    permissive behavior.
    
    This feature is possible because `serde_ignored` exposes the exact
    signal Codex needs: it lets Codex run ordinary Serde deserialization
    while recording fields Serde would otherwise ignore. That avoids
    requiring `#[serde(deny_unknown_fields)]` across every config type and
    keeps strict validation opt-in around the existing config model.
    
    ## What Changed
    
    ### Added strict config validation
    
    - Added `serde_ignored`-based validation for `ConfigToml` in
    `codex-rs/config/src/strict_config.rs`.
    - Combined `serde_ignored` with `serde_path_to_error` so strict mode
    preserves typed config error paths while also collecting fields Serde
    would otherwise ignore.
    - Added strict-mode validation for unknown `[features]` keys, including
    keys that would otherwise be accepted by `FeaturesToml`'s flattened
    boolean map.
    - Kept typed config errors ahead of ignored-field reporting, so
    malformed known fields are reported before unknown-field diagnostics.
    - Added source-range diagnostics for top-level and nested unknown config
    fields, including non-file managed preference source names.
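
    The ordering described above (typed errors first, then ignored-field
    reporting) can be illustrated with a std-only sketch. This is not the
    code in `strict_config.rs` and it does not use `serde_ignored`; it only
    mimics the callback style with a hand-written parser. `Value`,
    `MiniConfig`, `ConfigError`, `parse_permissive`, and `parse_strict` are
    hypothetical names introduced for illustration.

    ```rust
    /// Hypothetical value type standing in for a parsed TOML value.
    #[derive(Debug, Clone, PartialEq)]
    enum Value {
        Str(String),
        Int(i64),
    }

    /// Hypothetical, much-reduced config model with two known fields.
    #[derive(Debug, Default, PartialEq)]
    struct MiniConfig {
        model: Option<String>,
        approval_policy: Option<String>,
    }

    #[derive(Debug, PartialEq)]
    enum ConfigError {
        /// A known field held a value of the wrong type.
        Typed { field: String, message: String },
        /// Keys that the permissive pass would have dropped silently.
        UnknownFields(Vec<String>),
    }

    fn expect_str(field: &str, value: &Value) -> Result<String, ConfigError> {
        match value {
            Value::Str(s) => Ok(s.clone()),
            Value::Int(_) => Err(ConfigError::Typed {
                field: field.to_string(),
                message: "expected a string".to_string(),
            }),
        }
    }

    /// Permissive pass: fills known fields and hands every other key to
    /// `on_ignored` instead of failing.
    fn parse_permissive<F: FnMut(&str)>(
        raw: &[(&str, Value)],
        mut on_ignored: F,
    ) -> Result<MiniConfig, ConfigError> {
        let mut config = MiniConfig::default();
        for (key, value) in raw {
            match *key {
                "model" => config.model = Some(expect_str(key, value)?),
                "approval_policy" => config.approval_policy = Some(expect_str(key, value)?),
                other => on_ignored(other),
            }
        }
        Ok(config)
    }

    /// Strict pass: same parser, but the ignored keys are collected and
    /// turned into an error. A typed error returns first through `?`.
    fn parse_strict(raw: &[(&str, Value)]) -> Result<MiniConfig, ConfigError> {
        let mut ignored = Vec::new();
        let config = parse_permissive(raw, |key| ignored.push(key.to_string()))?;
        if ignored.is_empty() {
            Ok(config)
        } else {
            Err(ConfigError::UnknownFields(ignored))
        }
    }
    ```

    In this sketch the permissive and strict paths share one parser, so
    strict mode stays an opt-in wrapper rather than a second config model.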
    
    ### Kept parsing single-pass per source
    
    - Reworked file and managed-config loading so strict validation reuses
    the already parsed `TomlValue` for that source.
    - For actual config files and managed config strings, the loader now
    reads once, parses once, and validates that same parsed value instead of
    deserializing multiple times.
    - Validated `-c` / `--config` override layers with the same
    base-directory context used for normal relative-path resolution, so
    unknown override keys are still reported when another override contains
    a relative path.
    
    ### Scoped `--strict-config` to config-heavy entry points
    
    - Added support for `--strict-config` on the main config-loading entry
    points where it is most useful:
      - `codex`
      - `codex resume`
      - `codex fork`
      - `codex exec`
      - `codex review`
      - `codex mcp-server`
      - `codex app-server` when running the server itself
      - the standalone `codex-app-server` binary
      - the standalone `codex-exec` binary
    - Commands outside that set now reject `--strict-config` early with
    targeted errors instead of accepting it everywhere through shared CLI
    plumbing.
    - `codex app-server` subcommands such as `proxy`, `daemon`, and
    `generate-*` are intentionally excluded from the first rollout.
    - When app-server strict mode sees invalid config, app-server exits with
    the config error instead of logging a warning and continuing with
    defaults.
    - Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli`
    instead of extending shared `ReviewArgs`, so `--strict-config` stays on
    the outer config-loading command surface and does not become part of the
    reusable review payload used by `codex exec review`.
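
    The early-rejection behavior can be sketched as an allowlist check that
    runs before any config is loaded. This is an illustrative sketch only:
    the `Surface` enum, its variants, and `check_strict_config` are
    hypothetical and cover just a subset of commands, and the error wording
    is modeled on the `codex cloud` example in this description.

    ```rust
    /// Hypothetical subset of CLI command surfaces.
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum Surface {
        Interactive,    // plain `codex`
        Exec,           // `codex exec`
        Review,         // `codex review`
        McpServer,      // `codex mcp-server`
        AppServer,      // `codex app-server` running the server itself
        AppServerProxy, // `codex app-server proxy`
        Cloud,          // `codex cloud`
    }

    impl Surface {
        fn display_name(self) -> &'static str {
            match self {
                Surface::Interactive => "codex",
                Surface::Exec => "codex exec",
                Surface::Review => "codex review",
                Surface::McpServer => "codex mcp-server",
                Surface::AppServer => "codex app-server",
                Surface::AppServerProxy => "codex app-server proxy",
                Surface::Cloud => "codex cloud",
            }
        }

        /// Only config-heavy entry points opt in to strict validation.
        fn supports_strict_config(self) -> bool {
            match self {
                Surface::Interactive
                | Surface::Exec
                | Surface::Review
                | Surface::McpServer
                | Surface::AppServer => true,
                Surface::AppServerProxy | Surface::Cloud => false,
            }
        }
    }

    /// Rejects the flag up front on unsupported surfaces.
    fn check_strict_config(surface: Surface, strict: bool) -> Result<(), String> {
        if strict && !surface.supports_strict_config() {
            return Err(format!(
                "`--strict-config` is not supported for `{}`",
                surface.display_name()
            ));
        }
        Ok(())
    }
    ```

    Matching exhaustively on the enum means a newly added surface has to
    make an explicit choice about strict-mode support.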
    
    ### Coverage
    
    - Added tests for top-level and nested unknown config fields, unknown
    `[features]` keys, typed-error precedence, source-location reporting,
    and non-file managed preference source names.
    - Added CLI coverage showing invalid `--enable`, invalid `--disable`,
    and unknown `-c` overrides still error when `--strict-config` is
    present, including compound-looking feature names such as
    `multi_agent_v2.subagent_usage_hint_text`.
    - Added integration coverage showing both `codex app-server
    --strict-config` and standalone `codex-app-server --strict-config` exit
    with an error for unknown config fields instead of starting with
    fallback defaults.
    - Added coverage showing unsupported command surfaces reject
    `--strict-config` with explicit errors.
    
    ## Example Usage
    
    Run Codex with strict config validation enabled:
    
    ```shell
    codex --strict-config
    ```
    
    Strict config mode is also available on the supported config-heavy
    subcommands:
    
    ```shell
    codex --strict-config exec "explain this repository"
    codex review --strict-config --uncommitted
    codex mcp-server --strict-config
    codex app-server --strict-config --listen off
    codex-app-server --strict-config --listen off
    ```
    
    For example, if `~/.codex/config.toml` contains a typo in a key name:
    
    ```toml
    model = "gpt-5"
    approval_polic = "on-request"
    ```
    
    then `codex --strict-config` reports the misspelled key instead of
    silently ignoring it. The path is shortened to `~` here for readability:
    
    ```text
    $ codex --strict-config
    Error loading config.toml:
    ~/.codex/config.toml:2:1: unknown configuration field `approval_polic`
      |
    2 | approval_polic = "on-request"
      | ^^^^^^^^^^^^^^
    ```
    
    Without `--strict-config`, Codex keeps the existing permissive behavior
    and ignores the unknown key.
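
    The diagnostic layout shown above can be reproduced with a small
    renderer. This is a minimal sketch, not the code in this PR:
    `locate_key` and `render_diagnostic` are hypothetical helpers, and the
    key lookup is a naive line scan (ASCII, top-level keys only) rather
    than a TOML-aware span lookup.

    ```rust
    /// Finds the 1-based (line, column) of `key` when a line starts with
    /// `key` followed by `=`. Columns are counted in bytes.
    fn locate_key(source: &str, key: &str) -> Option<(usize, usize)> {
        for (idx, text) in source.lines().enumerate() {
            let trimmed = text.trim_start();
            let rest = match trimmed.strip_prefix(key) {
                Some(rest) => rest,
                None => continue,
            };
            // Require `=` next so `approval` does not match `approval_polic`.
            if rest.trim_start().starts_with('=') {
                let column = text.len() - trimmed.len() + 1;
                return Some((idx + 1, column));
            }
        }
        None
    }

    /// Renders a one-line diagnostic with a caret underline.
    /// `line` and `column` are 1-based; `len` is the underline width.
    fn render_diagnostic(
        source_name: &str,
        source: &str,
        line: usize,
        column: usize,
        len: usize,
        message: &str,
    ) -> String {
        let text = source.lines().nth(line.saturating_sub(1)).unwrap_or("");
        // Pad the gutter to the width of the line number.
        let gutter = " ".repeat(line.to_string().len());
        let underline = format!(
            "{}{}",
            " ".repeat(column.saturating_sub(1)),
            "^".repeat(len.max(1))
        );
        format!(
            "{}:{}:{}: {}\n{} |\n{} | {}\n{} | {}",
            source_name, line, column, message, gutter, line, text, gutter, underline
        )
    }
    ```

    Calling `locate_key` on the example config with `approval_polic` and
    passing the result to `render_diagnostic` yields the same four-line
    layout as the output above.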
    
    Strict config mode also validates ad-hoc `-c` / `--config` overrides:
    
    ```text
    $ codex --strict-config -c foo=bar
    Error: unknown configuration field `foo` in -c/--config override
    
    $ codex --strict-config -c features.foo=true
    Error: unknown configuration field `features.foo` in -c/--config override
    ```
    
    Invalid feature toggles are rejected too, including values that look
    like nested config paths:
    
    ```text
    $ codex --strict-config --enable does_not_exist
    Error: Unknown feature flag: does_not_exist
    
    $ codex --strict-config --disable does_not_exist
    Error: Unknown feature flag: does_not_exist
    
    $ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text
    Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text
    ```
    
    Unsupported commands reject the flag explicitly:
    
    ```text
    $ codex --strict-config cloud list
    Error: `--strict-config` is not supported for `codex cloud`
    ```
    
    ## Verification
    
    The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid
    `--disable`, the compound `multi_agent_v2.subagent_usage_hint_text`
    case, unknown `-c` overrides, app-server strict startup failure through
    `codex app-server`, and rejection for unsupported commands such as
    `codex cloud`, `codex mcp`, `codex remote-control`, and `codex
    app-server proxy`.
    
    The config and config-loader tests cover unknown top-level fields,
    unknown nested fields, unknown `[features]` keys, source-location
    reporting, non-file managed config sources, and `-c` validation for keys
    such as `features.foo`.
    
    The app-server test suite covers standalone `codex-app-server
    --strict-config` startup failure for an unknown config field.
    
    ## Documentation
    
    The Codex CLI docs on developers.openai.com/codex should mention
    `--strict-config` as an opt-in validation mode for supported
    config-heavy entry points once this ships.
  • feat: memories ext (#22498)
    First implementation of the memories extension, based on the
    memories-mcp tools.
  • Refactor extension tools onto shared ToolExecutor (#22369)
    ## Why
    
    Extension tools were split across two public runtime contracts:
    `codex-tool-api` exposed `ToolBundle` plus its own call/spec/error
    types, while core native tools used `codex_tools::ToolExecutor`. That
    made contributed tool specs and execution behavior easy to drift apart
    and added another crate boundary for what should be one executable-tool
    seam.
    
    This PR makes `ToolExecutor` the single runtime contract and keeps
    extension-specific pinning in `codex-extension-api`.
    
    ## Remaining todo
    
    https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5
    Either introduce a generic `Invocation`, or extract the `ToolCall` and
    clean up `ToolInvocation`.
    
    ## What changed
    
    - Removed the `codex-tool-api` workspace crate and its dependencies from
    core and `codex-extension-api`.
    - Made `codex_tools::ToolExecutor` object-safe with `async_trait` so
    extension contributors can return a dyn executor.
    - Added the extension-facing aliases under
    `ext/extension-api/src/contributors/tools.rs`, including
    `ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output =
    ExtensionToolOutput>`.
    - Changed `ToolContributor::tools` to return extension executors
    directly instead of `ToolBundle`s.
    - Updated core’s extension tool handler/registry/router path to adapt
    those extension executors into the existing native `ToolInvocation`
    runtime path.
    - Added focused coverage for extension tools being registered,
    model-visible, dispatchable, and not replacing built-in tools.
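
    The shape of a single object-safe executor contract shared by built-in
    and extension tools can be sketched as follows. This is a simplified,
    synchronous illustration: the real `codex_tools::ToolExecutor` is async
    (via `async_trait`) and generic over the call type, and the `Registry`,
    `Echo`, and skip-on-name-collision policy here are assumptions made for
    the example, not the PR's implementation.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical synchronous stand-in for an executable-tool contract.
    /// It has no generic methods, so it can be used as `dyn ToolExecutor`.
    trait ToolExecutor {
        fn name(&self) -> &str;
        fn execute(&self, input: &str) -> Result<String, String>;
    }

    /// Toy tool that tags its input with a label.
    struct Echo {
        name: String,
        label: String,
    }

    impl Echo {
        fn new(name: &str, label: &str) -> Echo {
            Echo { name: name.to_string(), label: label.to_string() }
        }
    }

    impl ToolExecutor for Echo {
        fn name(&self) -> &str {
            &self.name
        }
        fn execute(&self, input: &str) -> Result<String, String> {
            Ok(format!("{}:{}", self.label, input))
        }
    }

    #[derive(Default)]
    struct Registry {
        tools: HashMap<String, Box<dyn ToolExecutor>>,
    }

    impl Registry {
        fn register_builtin(&mut self, tool: Box<dyn ToolExecutor>) {
            self.tools.insert(tool.name().to_string(), tool);
        }

        /// Returns false and keeps the existing tool when the name is taken,
        /// so an extension cannot replace a built-in (assumed policy).
        fn register_extension(&mut self, tool: Box<dyn ToolExecutor>) -> bool {
            let name = tool.name().to_string();
            if self.tools.contains_key(&name) {
                return false;
            }
            self.tools.insert(name, tool);
            true
        }

        /// Built-in and extension tools go through the same dispatch path.
        fn dispatch(&self, name: &str, input: &str) -> Result<String, String> {
            match self.tools.get(name) {
                Some(tool) => tool.execute(input),
                None => Err(format!("unknown tool: {}", name)),
            }
        }
    }
    ```

    Because both kinds of tool are stored as `Box<dyn ToolExecutor>`, the
    spec and the execution behavior come from the same object, which is
    the drift this refactor is meant to prevent.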
    
    ## Verification
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-extension-api`
  • Remove CODEX_RS_SSE_FIXTURE test hook (#22413)
    ## Why
    
    `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
    bypass the normal Responses transport by reading SSE from local files.
    That kept test-only behavior wired through production client code. The
    affected tests can stay hermetic by using the existing
    `core_test_support::responses` mock server and passing `openai_base_url`
    instead.
    
    ## What Changed
    
    - Removed the `CODEX_RS_SSE_FIXTURE` flag,
    `codex_api::stream_from_fixture`, the `env-flags` dependency, and the
    checked-in SSE fixture files.
    - Repointed the affected core, exec, and TUI tests at `MockServer` with
    the existing SSE event constructors.
    - Removed the Bazel test data plumbing for the deleted fixtures and
    refreshed cargo/Bazel lock state.
    
    ## Verification
    
    - `cargo build -p codex-cli`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core --test all responses_api_stream_cli`
    - `cargo test -p codex-core --test all
    integration_creates_and_checks_session_file`
    - `cargo test -p codex-exec --test all ephemeral`
    - `cargo test -p codex-exec --test all resume`
    - `cargo test -p codex-tui --test all
    resume_startup_does_not_consume_model_availability_nux_count`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
    - `git diff --check`