7919 Commits

  • [codex] rename rollout budget error to session budget error (#29744)
    ## Summary
    
    - rename the rollout-budget exhaustion error from
    `RolloutBudgetExceeded` to `SessionBudgetExceeded`
    - expose the matching app-server v2 wire value as
    `sessionBudgetExceeded`
    - regenerate JSON/TypeScript schema fixtures and update the app-server
    docs and focused tests
    
    This is a naming-only follow-up to #29715 based on [Pavel's review
    suggestion](https://github.com/openai/codex/pull/29715#discussion_r3463183480).
    Runtime behavior is unchanged.
    
    ## Tests
    
    - `just test -p codex-core rollout_budget`
    - `just test -p codex-app-server-protocol`
    - `just fmt`
    - `just write-app-server-schema`
  • fix: scope context remaining to body window (#29665)
    ## Why
    
    With `model_auto_compact_token_limit_scope = "body_after_prefix"`, the
    persistent prefix should not count against the active body window.
    `get_context_remaining` and the token-budget reminder should report the
    same usable body-after-prefix window that auto-compaction uses, rather
    than the total token count since the session began.
    
    This is stacked on #29664 so the mechanical move from `turn.rs` is
    isolated from the behavior fix.
    
    ## What
    
    - Extends `ContextWindowTokenStatus` with `context_remaining_tokens`.
    - Updates `get_context_remaining` to use the shared context-window
    accounting.
    - Adds integration coverage for body-after-prefix reminder timing and
    `get_context_remaining` output.
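
    The accounting described above can be sketched as follows. This is a
    minimal illustration, not the actual `ContextWindowTokenStatus`: the
    field names, the `LimitScope` enum, and `counted_tokens` are assumptions
    made for the example; only `context_remaining_tokens` and the
    `body_after_prefix` scope are named in this PR.

    ```rust
    /// Hypothetical mirror of `model_auto_compact_token_limit_scope`.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum LimitScope {
        /// Count every token in the context, persistent prefix included.
        Total,
        /// Count only the tokens that follow the persistent prefix.
        BodyAfterPrefix,
    }

    /// Illustrative stand-in for `ContextWindowTokenStatus` (fields assumed).
    #[derive(Clone, Copy, Debug)]
    pub struct ContextWindowTokenStatus {
        pub total_tokens: u64,
        pub prefix_tokens: u64,
        pub auto_compact_limit: u64,
        pub scope: LimitScope,
    }

    impl ContextWindowTokenStatus {
        /// Tokens that count against the limit under the configured scope.
        pub fn counted_tokens(&self) -> u64 {
            match self.scope {
                LimitScope::Total => self.total_tokens,
                LimitScope::BodyAfterPrefix => {
                    self.total_tokens.saturating_sub(self.prefix_tokens)
                }
            }
        }

        /// Budget left before auto-compaction would trigger.
        pub fn context_remaining_tokens(&self) -> u64 {
            self.auto_compact_limit.saturating_sub(self.counted_tokens())
        }
    }
    ```

    With a 4,000-token prefix, 9,000 total tokens, and a 10,000-token limit,
    the body-after-prefix scope reports 5,000 tokens remaining, while the
    total scope would report only 1,000.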
    
    ## Testing
    
    - `just test -p codex-core body_after_prefix_window`
    - `just test -p codex-core auto_compact_body_after_prefix`
    - `just fix -p codex-core`
  • refactor: extract context window token status (#29664)
    ## Why
    
    This PR keeps the mechanical helper extraction separate from the
    behavior change in #29665. The follow-up needs the token-window
    accounting from `turn.rs` in another call path, but reviewing that is
    much easier when the helper extraction is separate from the semantic
    change.
    
    ## What
    
    - Adds `session/context_window.rs` with `ContextWindowTokenStatus`.
    - Moves the existing auto-compaction token-status calculation out of
    `session/turn.rs`.
    - Replaces the duplicated inline remaining-token calculation in
    `turn.rs` with `tokens_until_compaction()`.
    
    This PR is intended to be behavior-preserving. The
    `get_context_remaining` behavior change is stacked separately in #29665.
    
    ## Testing
    
    - `just test -p codex-core auto_compact_body_after_prefix`
  • protocol: separate app and exec RPC ownership (#29714)
    ## Why
    
    The app-server and exec-server expose separate JSON-RPC APIs, but
    exec-server currently sources its serialized protocol and envelope types
    through app-server-oriented code. Giving each API an explicit owner
    makes the crate boundary legible without introducing shared generic
    envelopes.
    
    ## What changed
    
    - Added `codex-exec-server-protocol` to own exec DTOs, process IDs, and
    JSON-RPC envelopes.
    - Updated exec-server clients, transports, handlers, and tests to use
    the new crate.
    - Exposed app-server's existing JSON-RPC types through a public `rpc`
    module while retaining root re-exports.
    - Preserved existing wire shapes, including exec `PathUri` behavior.
    
    ## Stack
    
    This is PR 1 of 6. Next: [PR
    #29721](https://github.com/openai/codex/pull/29721), which moves auth
    mode below the app wire boundary.
    
    ## Validation
    
    - Exec-server protocol and server coverage passed in the focused
    protocol test runs.
    - App-server protocol schema fixtures passed.
  • Load executor skills without host path conversion (#29626)
    ## Why
    
    After #28918, selected skill roots are `PathUri`, but the executor skill
    provider still converts them to the app-server host's `AbsolutePathBuf`.
    A foreign Windows root therefore cannot be discovered by a Unix host,
    and the inverse has the same problem.
    
    This PR keeps executor skill discovery and reads on the filesystem that
    owns the selected root while reusing the existing skill rules.
    
    ## What changed
    
    - Generalize the existing skill traversal to operate on `PathUri`
    through `ExecutorFileSystem`, preserving its depth, directory, symlink,
    and sibling-metadata concurrency behavior.
    - Add a small environment skill loader that reuses the shared discovery,
    frontmatter validation, dependency parsing, product policy, and
    prompt-visibility rules.
    - Keep the environment id and entrypoint `PathUri` in the skill catalog,
    then route `skills.read` back through the same environment filesystem.
    - Preserve the executor's path convention when deriving catalog handles,
    including literal backslashes in POSIX filenames.
    - Resolve plugin namespaces from nearby manifests through URI-native
    filesystem reads.
    - Cover foreign Windows roots, executor-owned reads, namespaces,
    metadata, policy, and path identity.
    
    ```text
    selected root (PathUri)
            |
            v
    shared discovery over ExecutorFileSystem
            |
            v
    environment-bound catalog entry --skills.read--> same ExecutorFileSystem
    ```
    
    No second filesystem abstraction or duplicate traversal implementation
    is introduced.
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. #28918 — keep selected plugin roots and resources URI-native.
    4. **This PR** — load executor skills without host path conversion.
    5. #29628 — resolve executor MCP working directories without host path
    conversion.
  • code-mode: Remove Session::is_alive() (#29732)
    Remove this unused API. It is insidious: it implies that callers can
    determine whether a session is alive, and that a preflight check should
    drive routing. Let's drop it and handle errors from a failed session
    correctly in the future.
  • [codex] surface rollout budget exhaustion (#29715)
    ## Summary
    - surface shared rollout-budget exhaustion as
    `CodexErr::RolloutBudgetExceeded` instead of a generic interrupted turn
    - map it through the existing `CodexErrorInfo` and app-server v2
    `codexErrorInfo` path
    - keep local compaction from retrying after the shared rollout budget is
    exhausted
    
    This gives app-server clients a stable `rolloutBudgetExceeded` error
    they can classify without guessing from `status="interrupted"`.
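
    As a rough sketch of the no-retry behavior, assuming a generic retry
    helper: `TurnError`, `Transient`, and `run_with_retry` are hypothetical
    names for illustration, and only the `RolloutBudgetExceeded` variant
    name comes from this PR.

    ```rust
    /// Hypothetical error type; only `RolloutBudgetExceeded` is from the PR.
    #[derive(Debug, PartialEq, Eq)]
    pub enum TurnError {
        RolloutBudgetExceeded,
        Transient(String),
    }

    /// Retries transient failures up to `max_attempts`, but returns at once
    /// when the shared budget is exhausted, because a retry cannot succeed.
    /// On success, returns the 1-based attempt number that succeeded.
    pub fn run_with_retry<F>(max_attempts: u32, mut attempt: F) -> Result<u32, TurnError>
    where
        F: FnMut(u32) -> Result<(), TurnError>,
    {
        let mut last = TurnError::Transient("no attempts made".to_string());
        for n in 1..=max_attempts {
            match attempt(n) {
                Ok(()) => return Ok(n),
                Err(TurnError::RolloutBudgetExceeded) => {
                    return Err(TurnError::RolloutBudgetExceeded);
                }
                Err(err) => last = err,
            }
        }
        Err(last)
    }
    ```

    Keeping the budget error as its own variant is what lets both the retry
    loop and a client tell it apart from an ordinary interruption.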
    
    ## Tests
    - `just test -p codex-core rollout_budget`
  • [codex] define code mode host handshake protocol (#29515)
    ## Summary
    
    - add validated protocol-version, capability, and session identifier
    types
    - define explicit `ClientToHost` and `HostToClient` JSON envelopes for
    connection negotiation and session open/close acknowledgements
    - reject invalid states and unknown fields during decoding, with
    explicit wire-format and round-trip coverage
    
    ## Why
    
    This establishes the transport-neutral encoding shape needed to build
    and test the new code-mode host incrementally. Cell, tool callback, and
    failure-domain messages are intentionally deferred until their actors
    and behavior tests establish the required semantics.
    
    This is additive protocol scaffolding and does not change the current
    production code-mode implementation.
    
    ## Validation
  • Make selected plugin roots URI-native (#28918)
    ## Why
    
    Selected capability roots belong to the executor filesystem, not the
    app-server host. Converting their path strings into the host's native
    `Path` breaks whenever the two machines use different path conventions,
    such as a Windows executor behind a Unix app-server.
    
    This PR establishes `PathUri` as the selected-plugin boundary so the
    executor remains authoritative for its paths.
    
    ## What changed
    
    - Require `selectedCapabilityRoots[].location.path` to be a canonical
    `file:` URI and deserialize it directly as `PathUri`; native path
    strings are rejected.
    - Update the app-server schema, generated TypeScript, examples, and
    request coverage for the URI contract.
    - Keep selected roots, resolved plugin locations, manifest paths, and
    manifest resources as `PathUri`.
    - Inspect and read plugin roots and manifests only through the selected
    environment's `ExecutorFileSystem`.
    - Parse executor manifests with the shared URI-native parser from #29620
    instead of projecting them onto the host filesystem.
    - Enforce resource containment lexically and preserve the root URI's
    POSIX or Windows path convention.
    - Cover foreign Windows plugin roots and URI-native manifest resources.
    
    ```text
    thread/start
      selectedCapabilityRoots[].location.path = "file:///C:/plugins/demo"
                                  | PathUri
                                  v
                        ExecutorFileSystem
                                  |
                                  +--> plugin.json
                                  +--> manifest resources
    ```
    
    This PR stops at the shared selected-plugin representation. The next two
    PRs remove the remaining host-path projections in the skill and MCP
    consumers.
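
    A simplified sketch of the URI contract, for illustration only. The
    real `PathUri` canonicalization rules are not spelled out here, so this
    example assumes the narrow `file:///...` form, ignores authorities such
    as UNC hosts, and uses hypothetical names (`classify_file_uri`,
    `PathConvention`, `PathUriError`).

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum PathConvention {
        Posix,
        Windows,
    }

    #[derive(Debug, PartialEq, Eq)]
    pub enum PathUriError {
        /// The input is a native path or uses another scheme.
        NotFileUri,
        /// The URI has no absolute path (this sketch also rejects hosts).
        NotAbsolute,
    }

    /// Accepts `file:///...` and reports which path convention it carries.
    /// A drive prefix such as `/C:/` marks a Windows path; all else is POSIX.
    pub fn classify_file_uri(input: &str) -> Result<PathConvention, PathUriError> {
        let path = input
            .strip_prefix("file://")
            .ok_or(PathUriError::NotFileUri)?;
        if !path.starts_with('/') {
            return Err(PathUriError::NotAbsolute);
        }
        let bytes = path.as_bytes();
        let has_drive = bytes.len() >= 3
            && bytes[1].is_ascii_alphabetic()
            && bytes[2] == b':'
            && (bytes.len() == 3 || bytes[3] == b'/');
        Ok(if has_drive {
            PathConvention::Windows
        } else {
            PathConvention::Posix
        })
    }
    ```

    Under these assumptions, `file:///C:/plugins/demo` is classified as a
    Windows path regardless of the host OS, and a native string such as
    `C:\plugins\demo` is rejected rather than reinterpreted.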
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. **This PR** — keep selected plugin roots and resources URI-native.
    4. #29626 — load executor skills without host path conversion.
    5. #29628 — resolve executor MCP working directories without host path
    conversion.
  • core: persist initial context window metadata (#29519)
    ## Why
    
    PR #29494 made context-window IDs visible to the model by wrapping the
    token-budget window payload in `<context_window>`, but rollout JSONL
    consumers still could not see the initial window identity by tailing the
    session file. Compacted rollout items carry window IDs only after
    compaction has happened, so a session with no compaction had no durable
    JSONL record for window 0.
    
    This change gives tailing consumers a stable initial-window record at
    session creation time.
    
    ## What Changed
    
    - Added `session_meta.context_window.window_id` for the initial
    context-window identity.
    - `CreateThreadParams` now requires `initial_window_id: String`, so
    thread-store callers cannot accidentally create new threads without
    window-0 metadata.
    - Live thread creation derives the persisted initial window ID from the
    same `AutoCompactWindowIds` used to initialize `SessionState`, keeping
    runtime state and JSONL metadata aligned.
    - Rollout reconstruction uses `session_meta.context_window.window_id` as
    the initial-window fallback and derives `window_number = 0`,
    `first_window_id = window_id`, and `previous_window_id = None`
    internally.
    - Fork reconstruction intentionally uses the same rollout reconstruction
    path; consumers that need to distinguish copied initial-window metadata
    can use the rollout `thread_id`.
    - Legacy compactions without `window_number` still use compaction-count
    fallback accounting instead of being reset to window 0 by the
    initial-window fallback.
    - Compacted rollout metadata still takes precedence once compaction
    records exist, preserving the richer chain fields there.
    
    ## JSONL Shape
    
    Real rollout JSONL is one object per line. This example is expanded for
    readability, but shows the new initial `session_meta.context_window`
    record followed by the existing compacted rollout item shape that also
    carries window IDs:
    
    ```jsonl
    {
      "timestamp": "2026-06-22T12:00:00.000Z",
      "type": "session_meta",
      "payload": {
        "session_id": "<THREAD_ID>",
        "id": "<THREAD_ID>",
        "timestamp": "2026-06-22T12:00:00.000Z",
        "cwd": "/repo",
        "originator": "codex",
        "cli_version": "0.0.0",
        "source": "cli",
        "model_provider": "<MODEL_PROVIDER>",
        "context_window": {
          "window_id": "<INITIAL_WINDOW_ID>"
        }
      }
    }
    ...
    {
      "timestamp": "2026-06-22T12:34:56.000Z",
      "type": "compacted",
      "payload": {
        "message": "<COMPACTION_SUMMARY>",
        "replacement_history": [
          "..."
        ],
        "window_number": 1,
        "first_window_id": "<INITIAL_WINDOW_ID>",
        "previous_window_id": "<INITIAL_WINDOW_ID>",
        "window_id": "<NEXT_WINDOW_ID>"
      }
    }
    ```
    
    The nested `context_window` object is intentional: it gives rollout
    consumers a stable namespace for context-window metadata while only
    writing the non-derivable initial `window_id`. For the initial window,
    `window_number`, `first_window_id`, and `previous_window_id` are derived
    internally instead of being written to the rollout.
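
    The derivation and precedence rules can be sketched like this. The
    `WindowChain` struct and both function names are hypothetical, and the
    legacy compaction-count fallback is deliberately left out of the
    example.

    ```rust
    /// Chain fields as they appear on compacted rollout items.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct WindowChain {
        pub window_number: u64,
        pub window_id: String,
        pub first_window_id: String,
        pub previous_window_id: Option<String>,
    }

    /// Derives the window-0 chain from the persisted initial `window_id`.
    pub fn initial_window(window_id: &str) -> WindowChain {
        WindowChain {
            window_number: 0,
            window_id: window_id.to_string(),
            first_window_id: window_id.to_string(),
            previous_window_id: None,
        }
    }

    /// The latest compacted record wins when one exists; otherwise fall
    /// back to the `session_meta.context_window.window_id` value.
    pub fn reconstruct(
        initial_window_id: Option<&str>,
        compacted: &[WindowChain],
    ) -> Option<WindowChain> {
        compacted
            .last()
            .cloned()
            .or_else(|| initial_window_id.map(initial_window))
    }
    ```

    A session that never compacted therefore reconstructs to window 0 with
    `first_window_id` equal to its own `window_id`, which is the case that
    previously had no durable record.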
    
    ## Verification
    
    - `just test -p codex-protocol`
    - `just test -p codex-rollout
    recorder_materializes_on_flush_with_pending_items`
    - `just test -p codex-core reconstruct_history`
    - `just test -p codex-core
    record_initial_history_reconstructs_forked_transcript`
    - `just test -p codex-thread-store`
    - `just test -p codex-state`
    - `just test -p codex-app-server
    thread_read_returns_summary_without_turns`
    - `just test -p codex-rollout persistence_metrics`
  • path-uri: remove legacy path deserialization (#29158)
    ## Why
    
    I'd originally added `PathUri` legacy path deserialization thinking we'd
    want it for having `PathUri` in public app-server APIs. Since then we've
    added `LegacyAppPathString` to handle the messy conversions that we need
    for backcompat. It's confusing for `PathUri` to support deserializing
    legacy paths when we don't yet want to actually expose app-server
    callers or rollout storage to the new URI format.
    
    Stacked on top of #29472 to avoid breaking compatibility in case those
    types ended up stored somewhere for someone.
    
    ## What changed
    
    - Parse deserialized `PathUri` values exclusively as valid `file:` URIs.
    - Replace legacy acceptance coverage with rejection coverage for
    top-level filesystem paths and sandbox working directories.
    - Serialize CWDs in hand-built exec-server process requests as `PathUri`
    values.
  • core tests: rename automatic environment builder (#29728)
    ## Why
    
    Use a clearer name for what happens when this helper sets up a test
    environment.
    
    ## What
    
    - Rename the builder and its harness wrapper to use `auto_env` instead
    of `remote_env` because the helper will set up a local environment if
    configured by the build system.
  • test: branch on target OS instead of runner flavor (#29712)
    ## Why
    
    Core tests should branch on the executor's operating system, not on
    runner details such as Docker or Wine. This keeps platform behavior
    stable as new test backends are added and reserves Wine-specific skips
    for actual runner debt.
    
    ## What
    
    - Add `TestTargetOs` and target/host-aware skip helpers while keeping
    `TestEnvironment` internal.
    - Replace topology enum access with remote predicates and a narrow
    Docker accessor.
    - Migrate OS-semantic Wine skips, preserve runner-specific gaps, and
    document the skip taxonomy.
    
    ## Validation
    
    - `just test -p core_test_support`
    - `just test -p codex-core
    remote_test_env_can_connect_and_use_filesystem`
    - `bazel test //codex-rs/core:core-all-wine-exec-test
    --test_output=errors` reached test execution; unrelated existing
    view-image, path, and timing failures remain.
    - `just test -p codex-core` and `just test` reached broad test
    execution; this checkout has unrelated helper, sandbox, and timing
    failures.
  • code-mode: Rename codex_code_mode::CodeModeService (#29716)
    Mechanical rename of `CodeModeService` to `InProcessCodeModeSession`.
    
    The type already implements `CodeModeSession` as its primary interface to
    core. The old name was vestigial and confusing when embedded inside
    `core::tools::code_mode::CodeModeService`.
  • feat(app-server): thread/turns/items/list -> thread/items/list (#29705)
    ## Description
    
    Rename the experimental app-server item pagination API from
    `thread/turns/items/list` to `thread/items/list` and make `turnId`
    optional. Clients can now page persisted items across a thread, or still
    filter to one turn when needed.
    
    ## What changed
    
    - Rename the request/response protocol types and JSON-RPC method to
    `ThreadItemsList*` / `thread/items/list`.
    - Pass optional `turnId` through to `ThreadStore::list_items`.
    - Update app-server docs and focused protocol/app-server tests.
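
    A minimal sketch of the optional-filter semantics, assuming an
    in-memory slice in place of `ThreadStore`. `ThreadItem`, its fields,
    and the `limit` parameter are hypothetical; the real pagination uses
    whatever cursor shape the protocol defines.

    ```rust
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct ThreadItem {
        pub id: String,
        pub turn_id: String,
    }

    /// Returns up to `limit` items in stored order. With `turn_id == None`
    /// the whole thread is paged; with `Some(id)` only that turn's items.
    pub fn list_items<'a>(
        items: &'a [ThreadItem],
        turn_id: Option<&str>,
        limit: usize,
    ) -> Vec<&'a ThreadItem> {
        items
            .iter()
            .filter(|item| turn_id.map_or(true, |t| item.turn_id == t))
            .take(limit)
            .collect()
    }
    ```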
    
    ## Validation
    
    - `just test -p codex-app-server-protocol thread_items_list_round_trips`
    - `just test -p codex-app-server thread_items_list_returns_unsupported`
  • [codex] Report the exec-server working directory (#29666)
    ## Summary
    
    - add the exec-server working directory to `environment/info` as an
    optional `PathUri`
    - populate it from the executor process's current directory
    - preserve compatibility with older responses that omit `cwd`
    
    ## Why
    
    Remote clients currently have no executor-native default working
    directory. This forces callers such as app-server-backend to assume
    `/workspace`, which fails for laptop environments. Reporting the cwd
    alongside the detected shell lets clients use the path convention and
    location of the actual executor.
    
    ## Impact
    
    This is backward-compatible: the new response field is optional, and
    clients can continue handling responses from older exec servers. A
    follow-up app-server-backend change will consume the value for cwd-less
    `command/exec` requests.
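
    One way a client might consume the optional field, sketched under
    assumptions: `EnvironmentInfo` here is a hypothetical struct with only
    the two fields mentioned above, and `effective_cwd` is not an existing
    function.

    ```rust
    /// Hypothetical subset of the `environment/info` response.
    pub struct EnvironmentInfo {
        pub shell: String,
        /// Absent when talking to an older exec-server.
        pub cwd: Option<String>,
    }

    /// Prefers an explicitly requested cwd, then the executor-reported one.
    /// Returns `None` when neither is known, so the caller decides what to
    /// do instead of silently assuming a fixed path.
    pub fn effective_cwd<'a>(
        requested: Option<&'a str>,
        info: &'a EnvironmentInfo,
    ) -> Option<&'a str> {
        requested.or(info.cwd.as_deref())
    }
    ```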
    
    ## Validation
    
    - `just test -p codex-exec-server` (275 passed, 2 skipped)
  • Decouple plugin manifest path resolution (#29620)
    ## Why
    
    Plugin manifests use the same schema whether the package lives on the
    host or in an executor. Only the path representation differs: host
    callers need native `Path` inputs and `AbsolutePathBuf` outputs, while
    executor callers need `PathUri` throughout.
    
    Maintaining separate parsing or resolver implementations would duplicate
    the manifest rules and allow them to drift. This PR instead makes
    URI-native resolution the single parsing path and keeps host conversion
    at the boundary.
    
    ## What changed
    
    - Make `parse_plugin_manifest_uri` the shared manifest parser and
    resolve every path-bearing field as `PathUri`.
    - Keep the existing host entrypoint as a thin adapter: convert its
    native root and manifest path to `PathUri`, run the shared parser, then
    map resources back to `AbsolutePathBuf`.
    - Expose `PluginManifest::try_map_resources` so callers can convert the
    generic resource type without duplicating manifest construction.
    - Resolve relative manifest paths using the root URI's convention:
    backslashes are separators for Windows roots and ordinary filename
    characters for POSIX roots.
    - Apply lexical containment after URI resolution, rejecting absolute
    paths and parent traversal outside the plugin root.
    - Make encoded backslashes fail containment only for Windows URIs;
    encoded `/` remains unsafe for every convention.
    - Use a host-native synthetic root for marketplace fallback manifests so
    the host adapter also works on Windows.
    
    ```text
    host Path --------> PathUri --\
                                  +--> one manifest parser --> PluginManifest<PathUri>
    executor PathUri -------------/
    
    host result: PluginManifest<PathUri> --> PluginManifest<AbsolutePathBuf>
    ```
    
    Existing host manifest behavior is preserved; #28918 is the first
    executor consumer.
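
    The separator and containment rules above can be sketched as a purely
    lexical resolver. This is an illustration under assumptions, not the
    shared parser: `resolve_contained`, `PathConvention`, and
    `ContainmentError` are hypothetical names, and the input is the
    relative manifest path as written rather than a full `PathUri`.

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum PathConvention {
        Posix,
        Windows,
    }

    #[derive(Debug, PartialEq, Eq)]
    pub enum ContainmentError {
        Absolute,
        EncodedSeparator,
        EscapesRoot,
    }

    /// Resolves `relative` lexically and returns its segments below the
    /// plugin root, or an error if it is absolute or would leave the root.
    pub fn resolve_contained(
        convention: PathConvention,
        relative: &str,
    ) -> Result<Vec<String>, ContainmentError> {
        let windows = convention == PathConvention::Windows;
        // `\` separates segments only under the Windows convention.
        let is_sep = move |c: char| c == '/' || (windows && c == '\\');

        let has_drive = windows
            && matches!(relative.as_bytes(), [d, b':', ..] if d.is_ascii_alphabetic());
        if relative.starts_with(is_sep) || has_drive {
            return Err(ContainmentError::Absolute);
        }

        // Encoded `/` is rejected everywhere; encoded `\` only for Windows.
        let lower = relative.to_ascii_lowercase();
        if lower.contains("%2f") || (windows && lower.contains("%5c")) {
            return Err(ContainmentError::EncodedSeparator);
        }

        let mut segments: Vec<String> = Vec::new();
        for part in relative.split(is_sep) {
            match part {
                "" | "." => {}
                ".." => {
                    // Popping past the root means the path escapes it.
                    segments.pop().ok_or(ContainmentError::EscapesRoot)?;
                }
                name => segments.push(name.to_string()),
            }
        }
        Ok(segments)
    }
    ```

    Because the check never touches a filesystem, a Unix host can apply it
    to a Windows root: `a\b.md` is two segments under the Windows
    convention and a single filename under POSIX.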
    
    ## Verification
    
    - `just test -p codex-utils-path-uri`
    - `just test -p codex-plugin`
    - `just test -p codex-core-plugins`
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. **This PR** — share URI-native manifest path resolution.
    3. #28918 — keep selected plugin roots and resources URI-native.
    4. #29626 — load executor skills without host path conversion.
    5. #29628 — resolve executor MCP working directories without host path
    conversion.
  • feat(guardian): include connected account email in app reviews (#27045)
    ## Why
    
    Auto review evaluates Codex App tool calls using connector metadata such
    as the app ID, name, and description. That metadata does not identify
    the account behind the OAuth connection.
    
    For Google Drive, this means auto review cannot distinguish a Drive
    connection authenticated as `user@email.com` from a personal Drive
    account. Uploading work data can therefore look like a transfer to a
    personal destination even though the connector service already knows the
    authenticated account email.
    
    ## What changed
    
    - Read `_meta._codex_apps.connected_account_email` while resolving
    approval metadata for built-in Codex App tools.
    - Include the connected account email in the structured MCP tool action
    sent to auto review.
    - Trim empty values and omit the field when the connector link has no
    account email.
    - Update existing auto review request constructors and add coverage for
    request construction and JSON serialization.
    
    ## Security
    
    Only metadata from the trusted built-in `codex_apps` MCP server is
    accepted. Custom MCP servers cannot inject a connected account email
    into auto review requests; the new regression test verifies that spoofed
    metadata is ignored.
    
    The email is used only in auto review's private review request. This
    change does not add it to model-visible tool descriptions, app-server
    approval events, or auto review assessment/review analytics.
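
    The trust and trimming rules can be sketched as a single filter. The
    function name and signature are assumptions for illustration; only the
    `codex_apps` server name and the trim/omit behavior come from this PR.

    ```rust
    /// Name of the trusted built-in MCP server, as stated in this PR.
    const TRUSTED_SERVER: &str = "codex_apps";

    /// Returns the email to attach to the review request, if any.
    /// Metadata from any other server is ignored, and blank values are
    /// treated the same as a missing field.
    pub fn connected_account_email(
        server_name: &str,
        raw_email: Option<&str>,
    ) -> Option<String> {
        if server_name != TRUSTED_SERVER {
            return None;
        }
        let trimmed = raw_email?.trim();
        if trimmed.is_empty() {
            None
        } else {
            Some(trimmed.to_string())
        }
    }
    ```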
  • Add MCP tool call error metrics (#28976)
    [Codex Thread
    019edc37-5345-7272-92c9-bf5494cf3819](https://codex-thread-link.openai.chatgpt-team.site/thread/019edc37-5345-7272-92c9-bf5494cf3819)
    
    ## Summary
    
    - count MCP `CallToolResult.isError` responses as failed calls instead
    of successful transport-level calls
    - add `codex.mcp.call.error` with bounded `error_type` and trusted
    plugin-service `error_code` dimensions
    - record the same error classification on MCP tool-call spans while
    keeping untrusted server error text out of metric labels
    
    ## Scope
    
    - no changes to MCP routing, retries, tool behavior, configuration, or
    public APIs
    - request failures remain grouped as `mcp_request`; separating
    connection, timeout, protocol, and JSON-RPC failures requires preserving
    typed errors through the existing flattened error boundary
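
    A sketch of the bounded classification, under assumptions: the
    `CallOutcome` enum, both function names, and the `tool_error` label are
    hypothetical; `mcp_request` is the only label value named in this PR.

    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum CallOutcome {
        Success,
        /// The call returned, but `CallToolResult.isError` was set.
        ToolReportedError,
        /// The request itself failed before a result was produced.
        RequestFailed,
    }

    /// `Ok(is_error)` models a returned result; `Err(text)` models a failed
    /// request. The error text is dropped so it can never reach a label.
    pub fn classify(transport: Result<bool, String>) -> CallOutcome {
        match transport {
            Ok(false) => CallOutcome::Success,
            Ok(true) => CallOutcome::ToolReportedError,
            Err(_untrusted_text) => CallOutcome::RequestFailed,
        }
    }

    /// Maps an outcome to a value from a fixed, bounded label set.
    pub fn error_type(outcome: CallOutcome) -> Option<&'static str> {
        match outcome {
            CallOutcome::Success => None,
            CallOutcome::ToolReportedError => Some("tool_error"), // assumed label
            CallOutcome::RequestFailed => Some("mcp_request"),
        }
    }
    ```

    Returning `&'static str` makes the bound structural: a label can only
    be one of the literals in this function, never server-supplied text.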
    
    ## Testing
    
    - `just test -p codex-core 'mcp_tool_call::tests::'` (75 passed)
    - `just fix -p codex-core`
    - `just fmt`
    - `just test -p codex-core` (2,676 passed; 80 unrelated environment
    failures from missing test binaries, sandbox signals, and read-only
    paths)
  • core: use current step environments for tools (#29547)
    ## Why
    
    With deferred executors, an environment can become ready between two
    sampling requests in the same turn. The model-visible environment
    update, advertised tools, and eventual tool execution must all describe
    the same request-time view.
    
    Otherwise, a request built while only environment B is ready can
    advertise a tool without an `environment_id`; if higher-priority
    environment A becomes ready before execution, that call could silently
    run in A instead.
    
    This PR is stacked on #29527.
    
    ## Design
    
    `run_turn` captures one `Arc<StepContext>` at each sampling-request
    boundary. That step owns the request's `TurnContext` and environment
    snapshot.
    
    - World-state environment updates and tool planning borrow that same
    step.
    - `ToolCallRuntime` retains the `Arc` while asynchronous tool calls
    execute.
    - `ToolInvocation` carries the step to handlers; its temporary `turn`
    compatibility field is derived from the same object.
    - `ToolRouter` does not retain `StepContext`; it only uses it while
    constructing the request's tool set.
    - With `DeferredExecutor` disabled, step capture keeps using the
    environments frozen at turn start.
    
    Simply: every sampling request gets one consistent picture of its
    environments, from what the model sees through where its tool calls run.
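
    The snapshot idea can be illustrated with a small model. Everything
    here except the `StepContext` name and the use of `Arc` is an
    assumption: the registry, `ToolCall`, and the priority handling are
    simplified stand-ins for the real types.

    ```rust
    use std::sync::{Arc, Mutex};

    /// Immutable view of the environments that were ready at request time,
    /// ordered from highest to lowest priority.
    #[derive(Debug)]
    pub struct StepContext {
        pub ready_environments: Vec<String>,
    }

    impl StepContext {
        /// Environment used by a tool call that names no `environment_id`.
        pub fn default_environment(&self) -> Option<&str> {
            self.ready_environments.first().map(String::as_str)
        }
    }

    /// Hypothetical live registry that changes as executors become ready.
    #[derive(Default)]
    pub struct EnvironmentRegistry {
        ready: Mutex<Vec<String>>,
    }

    impl EnvironmentRegistry {
        pub fn mark_ready(&self, id: &str, highest_priority: bool) {
            let mut ready = self.ready.lock().unwrap();
            if highest_priority {
                ready.insert(0, id.to_string());
            } else {
                ready.push(id.to_string());
            }
        }

        /// Captures one snapshot at a sampling-request boundary.
        pub fn capture_step(&self) -> Arc<StepContext> {
            let ready_environments = self.ready.lock().unwrap().clone();
            Arc::new(StepContext { ready_environments })
        }
    }

    /// An in-flight tool call keeps its step alive by holding the `Arc`.
    pub struct ToolCall {
        step: Arc<StepContext>,
    }

    impl ToolCall {
        pub fn new(step: &Arc<StepContext>) -> Self {
            Self { step: Arc::clone(step) }
        }

        pub fn target_environment(&self) -> Option<&str> {
            self.step.default_environment()
        }
    }
    ```

    If only B is ready when the step is captured and A becomes ready
    afterwards, a `ToolCall` built from that step still targets B, while
    the next captured step sees A first.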
    
    ## What changed
    
    - Build environment-dependent tool specs from the current request's
    `StepContext`.
    - Use that same step for unified exec, legacy shell, `apply_patch`,
    `view_image`, and `request_permissions` execution.
    - Hide environment-backed tools, including `request_permissions`, while
    no environment is attached.
    - Resolve legacy shell paths and metadata from the selected step
    environment instead of the stale turn-start environment.
    - Capture explicit steps at non-turn-loop boundaries such as compaction,
    prompt debug, and startup prewarm.
    - Reconcile prompt-debug history from the same step used to build its
    tools.
    
    ## Follow-up
    
    - Bind yielded code-mode cells to the tool runtime that created them, so
    nested calls made after yielding continue to use the originating
    request's `StepContext`.
    
    ## Test plan
    
    - `just test -p codex-core
    deferred_executor_updates_context_and_tools_after_startup`
    - `just test -p codex-core
    environment_count_controls_environment_backed_tools`
    - `just test -p codex-core
    build_prompt_input_includes_context_and_user_message`
  • [codex] Fix stale approval policy in MCP test (#29704)
    ## Summary
    
    - replace the removed `AskForApproval::OnFailure` variant in the MCP
    shutdown test with `OnRequest`
    
    ## Why
    
    `OnFailure` was removed from `AskForApproval`, but this test fixture
    still referenced it, causing Rust and Clippy compilation failures.
    
    ## Validation
    
    - `just test -p codex-mcp shutdown_continues_after_caller_is_aborted`
    - `just fmt`
  • [codex] Fix stale approval policy in MCP test (#29696)
    ## Summary
    
    - replace the stale `AskForApproval::OnFailure` reference in the MCP
    connection manager test with `AskForApproval::OnRequest`
    - restore `codex-mcp` test compilation after `OnFailure` was removed in
    #28418
    
    ## Root cause
    
    The test was added on main after the approval-policy removal branch had
    already updated the other references, so the newly added call site was
    missed when #28418 merged.
    
    ## Validation
    
    - `just test -p codex-mcp` (90 passed)
    - `just fmt`
  • core: resolve view_image paths in selected environment (#29526)
    ## Why
    
    `view_image` needs to support remote executors that run a foreign OS.
    
    ## What
    
    - resolve image paths against the selected environment as `PathUri` and
    read them through that environment's filesystem
    - keep app-server's public path field wire-compatible as
    `LegacyAppPathString`, with purpose-specific UI rendering
    - cover relative and absolute target-native paths in the core
    integration test and run the full `view_image` suite under wine-exec
    without skips
  • [codex] allow image generation with provider auth (#29513)
    ## Summary
    
    - allow the native Responses API `image_generation` tool when the active
    provider carries CCA's non-empty `x-openai-actor-authorization` header
    - preserve the Codex-managed ChatGPT auth path, scoped to providers that
    actually require OpenAI auth
    - keep generic custom providers excluded, including when unrelated
    ChatGPT credentials are cached
    - retain the existing feature, provider-capability, and
    image-input-modality gates
    
    ## Why
    
    CCA authenticates its inference requests through the active provider's
    `x-openai-actor-authorization` and `ChatGPT-Account-ID` headers, so it
    does not have a Codex-managed login session. The previous gate therefore
    hid the native hosted image-generation tool despite an authenticated
    codex-backend path.
    
    This change is intentionally limited to the native hosted tool. It adds
    no extension, MCP, plugin-service, session-source, token plumbing, or
    new provider configuration surface.
    
    ## Tests
    
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
    - `cargo fmt --all -- --check`
    - `git diff --check origin/main`
  • [codex] Preserve proxy state for filesystem sandbox helpers (#29671)
    ## Why
    
    Filesystem helpers intentionally run with a minimal environment that
    excludes proxy variables. After filesystem operations started using the
    Windows sandbox wrapper, the wrapper derived an empty proxy
    configuration from that helper environment and compared it with the
    persistent sandbox setup marker. When the marker contained proxy ports,
    every filesystem operation appeared to require a firewall update, which
    could launch elevated setup, show a UAC or loader dialog, and fail
    operations such as `apply_patch` with error 1223.
    
    Filesystem helpers do not use network access, so they should preserve
    the proxy/firewall state established by normal sandboxed process
    launches.
    
    ## What changed
    
    - Add an explicit Windows sandbox proxy-settings mode for reconciling or
    preserving persistent proxy state.
    - Use preserve mode for filesystem helpers while normal process launches
    continue to reconcile proxy settings from their environment.
    - Carry the selected proxy state consistently through setup validation,
    elevated setup, and non-elevated ACL refreshes.
    - Cover wrapper argument propagation and marker-derived proxy
    preservation.
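
    The difference between the two modes can be sketched as follows. The
    enum, its variants, and both functions are hypothetical, and proxy
    state is reduced to a list of ports for the example.

    ```rust
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub enum ProxySettingsMode {
        /// Derive proxy ports from the launching process's environment.
        Reconcile { env_ports: Vec<u16> },
        /// Keep whatever the persistent setup marker already records.
        Preserve,
    }

    /// Proxy ports the sandbox setup should end up with.
    pub fn desired_ports(mode: &ProxySettingsMode, marker_ports: &[u16]) -> Vec<u16> {
        match mode {
            ProxySettingsMode::Reconcile { env_ports } => env_ports.clone(),
            ProxySettingsMode::Preserve => marker_ports.to_vec(),
        }
    }

    /// A firewall update (and possibly elevation) is needed only when the
    /// desired state differs from the persistent marker.
    pub fn needs_firewall_update(mode: &ProxySettingsMode, marker_ports: &[u16]) -> bool {
        desired_ports(mode, marker_ports) != marker_ports
    }
    ```

    A filesystem helper with a stripped environment corresponds to
    `Reconcile { env_ports: vec![] }` before this change, which mismatches
    any marker that records ports; `Preserve` never mismatches.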
    
    ## Validation
    
    - `cargo build -p codex-cli --bin codex`
    - `just test -p codex-windows-sandbox
    preserving_proxy_settings_uses_the_existing_marker`
    - `just test -p codex-windows-sandbox windows_wrapper_args_round_trip`
    - `just test -p codex-windows-sandbox
    setup_request_prefers_explicit_proxy_settings`
    - `just test -p codex-sandboxing transform_for_direct_spawn_windows`
    - `just test -p codex-exec-server fs_sandbox::tests`
    - Ran the same sandboxed `fs/writeFile` reproduction against published
    `0.142.0-alpha.6` and the new CLI. The published CLI launched elevated
    setup and failed with `ShellExecuteExW ... 1223`; the new CLI completed
    without elevation.
    
    Related to #28359.
  • Separate local and remote plugin analytics IDs (#29495)
    ## Why
    
    Plugin analytics overloaded `plugin_id`: most events used the Codex
    `<plugin>@<marketplace>` identity, while remote install events used the
    backend plugin ID. That makes the same field change meaning across event
    types and complicates downstream identity resolution.
    
    This change makes the contract unambiguous:
    
    - `plugin_id`: the local Codex `<plugin>@<marketplace>` identity, when
    resolved
    - `remote_plugin_id`: the backend plugin identity, when available
    
    For a remote install failure that happens before plugin details resolve,
    `plugin_id` is `null` and `remote_plugin_id` remains populated.
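
    A minimal sketch of that contract, assuming a hypothetical
    `PluginIdentity` type (the real telemetry metadata type and its
    constructors are not shown in this description and may differ):

    ```rust
    /// Hypothetical identity payload attached to a plugin analytics event.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct PluginIdentity {
        /// Local `<plugin>@<marketplace>` identity, when resolved.
        pub plugin_id: Option<String>,
        /// Backend plugin identity, when available.
        pub remote_plugin_id: Option<String>,
    }

    impl PluginIdentity {
        /// Identity for a plugin that is known locally.
        pub fn local(plugin: &str, marketplace: &str) -> Self {
            Self {
                plugin_id: Some(format!("{plugin}@{marketplace}")),
                remote_plugin_id: None,
            }
        }

        /// Identity for a remote install that failed before plugin details
        /// resolved: only the backend ID is known.
        pub fn remote_unresolved(remote_id: &str) -> Self {
            Self {
                plugin_id: None,
                remote_plugin_id: Some(remote_id.to_string()),
            }
        }

        /// Attaches the backend ID without touching the local identity.
        pub fn with_remote(mut self, remote_id: &str) -> Self {
            self.remote_plugin_id = Some(remote_id.to_string());
            self
        }
    }
    ```

    With this shape, an `Option::None` in `plugin_id` would serialize as
    `null` while `remote_plugin_id` stays populated, so neither field
    changes meaning across event types.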
    
    ## What changed
    
    All six plugin analytics events use the same identity contract:
    
    - `codex_plugin_installed`
    - `codex_plugin_install_failed`
    - `codex_plugin_uninstalled`
    - `codex_plugin_enabled`
    - `codex_plugin_disabled`
    - `codex_plugin_used`
    
    Remote identity is resolved from the current installed-plugin snapshot
    first, with persisted install metadata as fallback. The telemetry
    metadata type keeps local identity optional for failures that occur
    before remote details are available.
    
    The app-server test client's manual analytics smokes now find remote
    mutation events through `remote_plugin_id` and validate that `plugin_id`
    remains local.
    
    ## Remote uninstall
    
    Resolve and capture telemetry metadata before removing the local plugin
    cache, then emit `codex_plugin_uninstalled` after the backend confirms
    success. The event is also emitted when backend uninstall succeeds but
    local cache cleanup reports `CacheRemove`.
    
    If a concurrent remote-cache refresh removes the local bundle before
    telemetry capture, the already-fetched remote plugin detail supplies
    fallback capability metadata.
    
    ## Validation
    
    - `just test -p codex-analytics` — 82 passed
    - `just test -p codex-core-plugins` — 271 passed
    - `just test -p codex-app-server-test-client` — 5 passed
    - `just test -p codex-plugin` — 3 passed
    - `just test -p codex-app-server plugin_install` — 37 passed
    - `just test -p codex-app-server plugin_uninstall` — 10 passed
    
    The production app-server install/uninstall flow was also exercised
    against `plugins~Plugin_f1b845ac33888191ac156169c58733c2`
    (`build-ios-apps@openai-curated-remote`), and the plugin's original
    uninstalled state was restored.
  • Keep managed MITM CA private keys in proxy memory (#29013)
    ## Why
    
    The managed MITM trust bundle must be readable by sandboxed commands.
    Persisting its sibling CA private key under `$CODEX_HOME/proxy`
    therefore requires a deny-read sandbox rule, but the Windows unelevated
    backend rejects deny-read paths and WSL1's legacy Landlock path cannot
    enforce that rule.
    
    A persistent OS credential store also does not provide the same
    cross-platform boundary from other processes running as the same user.
    Keeping the signer inside the network proxy process avoids both
    problems: ordinary sandbox setup stays independent of CA-key state, and
    no private signing key is exposed through the filesystem or a persistent
    credential record.
    
    ## What
    
    - generate one managed CA per proxy process and retain its private
    signer only in proxy memory
    - emit only content-addressed public CA certificates and trust bundles
    under `$CODEX_HOME/proxy`
    - hold a cross-process lease for each active public certificate and
    prune artifacts from inactive proxy processes
    - keep all CA ownership in `codex-network-proxy`; no `codex-core` or
    sandbox-policy changes
    - validate generated trust-bundle paths by their content hash
    - keep the public bundle readable by sandboxed commands on Windows,
    WSL1, macOS, and Linux
    
    The independent startup custom-CA follow-up is #29014.
    
    ## Validation
    
    - `CODEX_HOME=/private/tmp/codex-test-home-network-proxy just test -p
    codex-network-proxy` (179 tests)
    - `just bazel-lock-check`
    - `just fix -p codex-network-proxy`
    - `just fmt`
    
    ---------
    
    Co-authored-by: viyatb-oai <viyatb@openai.com>
  • core: add extra metadata field to Thread struct (#29675)
    # Summary
    
    Adds a `Thread.extras` field that can hold arbitrary metadata specific
    to a given thread.
  • chore(core) rm AskForApproval::OnFailure (#28418)
    ## Summary
    Deletes the `OnFailure` variant of the `AskForApproval` enum. This option
    has been deprecated since #11631.
    
    ## Testing
    - [x] Tests pass
  • Prepare managed network sandbox context (#29456)
    ## Why
    
    Managed network configures commands to use local HTTP and SOCKS proxies.
    For commands delegated to the exec server, the proxy environment and the
    sandbox policy were prepared separately. On macOS, that meant a command
    could receive `HTTPS_PROXY=http://127.0.0.1:43123` while Seatbelt still
    denied access to port `43123`.
    
    ## What changed
    
    `NetworkProxy` now prepares the command environment and sandbox context
    together from the same runtime snapshot:
    
    ```text
    Prepared managed network
    ├── command environment: HTTPS_PROXY=http://127.0.0.1:43123
    └── sandbox context: allow outbound to 127.0.0.1:43123
    ```
    
    That context travels with remote exec requests. The exec server
    preserves the managed proxy and CA environment, and macOS Seatbelt
    allows only the prepared loopback proxy ports without enabling broad
    network access or local binding.
    
    The protocol field is optional and the existing enforcement flag remains
    in place, preserving compatibility with callers that do not send the new
    context.
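
    A rough sketch of deriving both halves from one snapshot. The type and
    function names here (`ProxySnapshot`, `PreparedManagedNetwork`,
    `prepare`) are hypothetical and only illustrate the idea; the actual
    `NetworkProxy` API is not shown in this description:

    ```rust
    /// Hypothetical runtime snapshot of the local managed proxy.
    pub struct ProxySnapshot {
        pub http_port: u16,
    }

    /// Hypothetical result: environment and sandbox context built together.
    pub struct PreparedManagedNetwork {
        /// Environment variables handed to the command.
        pub env: Vec<(String, String)>,
        /// Loopback ports the sandbox should allow outbound traffic to.
        pub allowed_loopback_ports: Vec<u16>,
    }

    /// Builds both values from the same snapshot so they cannot disagree
    /// about which port the command is expected to reach.
    pub fn prepare(snapshot: &ProxySnapshot) -> PreparedManagedNetwork {
        let proxy_url = format!("http://127.0.0.1:{}", snapshot.http_port);
        PreparedManagedNetwork {
            env: vec![("HTTPS_PROXY".to_string(), proxy_url)],
            allowed_loopback_ports: vec![snapshot.http_port],
        }
    }
    ```

    Because the port is read once, the mismatch described above (proxy
    variable set, port still denied) cannot arise from two separate reads.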
  • app-server: document thread and turn IDs are UUID7 (#27714)
    Thread and turn IDs being UUID7s is a useful property. Document it so
    that we think twice before moving away from UUID7s in the future.
  • Handle additional tools in rollout persistence metrics (#29669)
    ## Why
    
    The rollout persistence metrics added on current `main` exhaustively
    match `ResponseItem`, but omit `ResponseItem::AdditionalTools`. That
    prevents `codex-rollout` and downstream targets from compiling across
    Cargo and Bazel builds.
    
    ## What
    
    Map `ResponseItem::AdditionalTools` to the `response.additional_tools`
    metric label, consistent with the existing exact-variant labels.
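
    For illustration, a simplified sketch of why the missing arm broke the
    build. The enum below is a stand-in with made-up variants and labels;
    only `AdditionalTools` and `response.additional_tools` come from this
    change:

    ```rust
    /// Simplified stand-in for `ResponseItem`; the real enum has more
    /// variants and carries payloads.
    pub enum ResponseItem {
        Message,
        FunctionCall,
        AdditionalTools,
    }

    /// Maps each variant to a metric label. There is deliberately no
    /// wildcard arm, so adding a variant without a label fails to compile.
    pub fn metric_label(item: &ResponseItem) -> &'static str {
        match item {
            // Hypothetical labels for the stand-in variants.
            ResponseItem::Message => "response.message",
            ResponseItem::FunctionCall => "response.function_call",
            // The label added by this change.
            ResponseItem::AdditionalTools => "response.additional_tools",
        }
    }
    ```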
    
    ## Validation
    
    - `just test -p codex-rollout` (76 passed)
    - `just fix -p codex-rollout`
  • [codex] Handle additional tools in rollout persistence metrics (#29672)
    ## Summary
    
    Handle `ResponseItem::AdditionalTools` in rollout persistence metrics.
    
    The persistence metrics match was added after the `AdditionalTools`
    variant and omitted it, causing release builds to fail with a
    non-exhaustive pattern error. This assigns the item the
    `response.additional_tools` metrics label.
    
    Release failure:
    https://github.com/openai/codex/actions/runs/28043786727/job/83016608475
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-rollout` (76 passed)
  • core: use turn-owned world state for inline compaction (#29527)
    ## Why
    
    Follow-up to #29249 and its [compaction review
    thread](https://github.com/openai/codex/pull/29249#discussion_r3455055101).
    
    During a turn, environment readiness can change between sampling
    requests. Inline compaction must render the same model-visible
    `WorldState` used by the request it follows. Rebuilding that state
    during compaction can observe a newer environment, make replacement
    history disagree with what the model saw, and suppress the next
    environment update.
    
    ## What changed
    
    - Make `run_turn` own the current `Arc<WorldState>` and replace it only
    between sampling requests.
    - Build each state from an explicitly chosen environment snapshot, diff
    deferred-executor steps against the turn-owned state, and retain the
    latest state in `ContextManager` only for cross-turn and resume
    tracking.
    - Pass the exact turn-owned state into inline compaction and explicit
    new-context-window replacement.
    - Carry that state with
    `InitialContextInjection::BeforeLastUserMessage`, so replacement context
    and its stored baseline cannot come from different snapshots.
    - Remove obsolete state-recapture helpers and ambiguous TurnContext-only
    WorldState builders.
    - Add an integration test that moves an environment from starting to
    ready during a paused turn, triggers compaction, and verifies the next
    request receives the readiness update exactly once.
    
    ## Test plan
    
    - `just test -p codex-core
    deferred_executor_compaction_preserves_then_updates_environment_once`
    - `just test -p codex-core process_compacted_history`
    - `just test -p codex-core mid_turn_continuation_compaction`
    - `just test -p codex-core build_initial_context`
    - `just test -p codex-core
    ignores_session_prefix_messages_when_truncating`
  • Shut down superseded MCP managers on refresh (#29608)
    ## Summary
    
    MCP refresh replaced the published connection manager without shutting
    down the manager it superseded. If another task retained that old
    manager, its stdio MCP processes stayed alive and accumulated across
    refreshes.
    
    Atomically swap in the refreshed manager, then explicitly shut down the
    exact manager returned by the swap. Add a process-level regression test
    that retains the old manager during refresh and verifies its stdio
    process exits while the replacement remains available.
    
    ## Context
    
    Explicit cleanup was lost when manager publication moved to `ArcSwap`.
    Dropping the old manager is not a reliable shutdown boundary because
    active callers can retain its `Arc` and underlying client process
    handles.
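
    The swap-then-shutdown pattern can be sketched as follows. This is not
    the actual code: `Manager` and `Published` are hypothetical, and a
    `Mutex<Arc<_>>` stands in for `ArcSwap` so the example needs only the
    standard library:

    ```rust
    use std::sync::atomic::{AtomicBool, Ordering};
    use std::sync::{Arc, Mutex};

    /// Hypothetical stand-in for an MCP connection manager.
    pub struct Manager {
        shut_down: AtomicBool,
    }

    impl Manager {
        pub fn new() -> Self {
            Self { shut_down: AtomicBool::new(false) }
        }

        /// Explicit shutdown; a real manager would stop its stdio processes.
        pub fn shutdown(&self) {
            self.shut_down.store(true, Ordering::SeqCst);
        }

        pub fn is_shut_down(&self) -> bool {
            self.shut_down.load(Ordering::SeqCst)
        }
    }

    /// Slot holding the currently published manager.
    pub struct Published {
        current: Mutex<Arc<Manager>>,
    }

    impl Published {
        pub fn new(initial: Manager) -> Self {
            Self { current: Mutex::new(Arc::new(initial)) }
        }

        /// Returns a shared handle that callers may retain.
        pub fn load(&self) -> Arc<Manager> {
            let slot = self.current.lock().unwrap();
            Arc::clone(&*slot)
        }

        /// Swaps in the replacement, then shuts down exactly the manager the
        /// swap returned, even if other tasks still hold an `Arc` to it.
        pub fn refresh(&self, replacement: Manager) {
            let previous = {
                let mut slot = self.current.lock().unwrap();
                std::mem::replace(&mut *slot, Arc::new(replacement))
            };
            previous.shutdown();
        }
    }
    ```

    Relying on `Drop` instead would defer cleanup until the last retained
    `Arc` goes away, which is the leak described above.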
  • [core] debounce current-time reminders by elapsed time (#29659)
    ## Summary
    - rename `reminder_interval_model_requests` to
    `reminder_interval_seconds`
    - read the configured time provider before every model request and
    inject a reminder only after the configured number of seconds has
    elapsed
    - preserve immediate first delivery and forced delivery after compaction
    changes the context window
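
    A minimal sketch of elapsed-time debouncing under these rules. The
    `ReminderDebouncer` type is hypothetical, and `now` is assumed to be a
    monotonic offset read from the configured time provider:

    ```rust
    use std::time::Duration;

    /// Hypothetical debouncer for current-time reminders.
    pub struct ReminderDebouncer {
        interval: Duration,
        /// Time of the last injected reminder; `None` forces delivery.
        last_sent_at: Option<Duration>,
    }

    impl ReminderDebouncer {
        pub fn new(reminder_interval_seconds: u64) -> Self {
            Self {
                interval: Duration::from_secs(reminder_interval_seconds),
                last_sent_at: None,
            }
        }

        /// Called before every model request. Returns `true` when a reminder
        /// should be injected and records the delivery time.
        pub fn should_inject(&mut self, now: Duration) -> bool {
            let due = match self.last_sent_at {
                // First delivery, or delivery forced after compaction.
                None => true,
                Some(last) => now.saturating_sub(last) >= self.interval,
            };
            if due {
                self.last_sent_at = Some(now);
            }
            due
        }

        /// Forces the next request to carry a reminder, for example after
        /// compaction changes the context window.
        pub fn force_next(&mut self) {
            self.last_sent_at = None;
        }
    }
    ```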
    
    ## Tests
    - `just test -p codex-core current_time_reminder`
  • [codex] Instrument rollout persistence bytes (#29498)
    - Add 1%-sampled rollout persistence metrics that report per-item and
    per-thread JSON byte totals before and after filtering when metrics
    export is enabled.
    - Tag each item with its exact response or event variant, including
    nested turn-item kinds for conditionally persisted completion events, so
    aggregate cloud-storage impact can be estimated by policy choice.
  • Update vulnerable Hono and fast-uri dependencies (#29650)
    ## Summary
    
    - Pin `hono` to 4.12.25, the first patched release for the recent Hono
    security advisories.
    - Pin `fast-uri` to 3.1.1 to fix the percent-encoded path traversal
    vulnerability.
    - Refresh `pnpm-lock.yaml` with only those dependency updates.
    
    `hono` 4.12.25 is used instead of the newer 4.12.27 because the
    repository requires dependencies to be at least seven days old.
  • Update rmcp to 1.8.0 (#29634)
    ## Summary
    
    - Update `rmcp` and `rmcp-macros` from 1.7.0 to 1.8.0.
    - Adapt to the new shared `peer_info` return type.
    - Box OAuth status discovery at the MCP boundary to keep the expanded
    future type from overflowing Rust's trait recursion limit.
    
    This brings in custom OAuth HTTP client support from
    [modelcontextprotocol/rust-sdk#908](https://github.com/modelcontextprotocol/rust-sdk/pull/908).
  • Share resumed rollout history (#28426)
    ## Summary
    
    Resuming a persisted thread currently deep-clones its complete rollout
    history several times. `InitialHistory` is retained for the app-server
    response, copied into thread persistence, and copied again by read-only
    accessors. These copies scale with the complete rollout rather than the
    bounded model context and add measurable latency for large sessions.
    
    This change stores resumed rollout history in `Arc<Vec<RolloutItem>>`.
    Rollout loading wraps the parsed vector once, while app-server response
    construction, session initialization, and thread persistence share it
    through inexpensive `Arc` clones. Read-only history access now returns a
    borrowed slice, and fork paths use `Arc::unwrap_or_clone` where they
    genuinely need mutable ownership. Rollout reconstruction also consumes
    its temporary context instead of cloning the reconstructed model
    history.
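
    The ownership pattern looks roughly like this. `RolloutItem` is reduced
    to a stand-in, and `ResumedHistory` with its methods is a hypothetical
    wrapper used only for illustration:

    ```rust
    use std::sync::Arc;

    /// Simplified stand-in for the real `RolloutItem`.
    #[derive(Debug, Clone, PartialEq)]
    pub struct RolloutItem(pub String);

    /// Hypothetical holder for resumed history. Cloning it copies a pointer,
    /// not the underlying vector.
    #[derive(Clone)]
    pub struct ResumedHistory {
        items: Arc<Vec<RolloutItem>>,
    }

    impl ResumedHistory {
        /// Wraps the parsed vector once at load time.
        pub fn from_loaded(items: Vec<RolloutItem>) -> Self {
            Self { items: Arc::new(items) }
        }

        /// Read-only access returns a borrowed slice.
        pub fn items(&self) -> &[RolloutItem] {
            &self.items
        }

        /// Fork path: takes the vector if this is the last handle, otherwise
        /// clones it.
        pub fn into_owned(self) -> Vec<RolloutItem> {
            Arc::unwrap_or_clone(self.items)
        }

        /// True when both handles point at the same allocation.
        pub fn shares_storage_with(&self, other: &Self) -> bool {
            Arc::ptr_eq(&self.items, &other.items)
        }
    }
    ```

    Only the fork path pays for a deep copy, and only when another handle
    is still alive.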
    
    The serialized representation remains unchanged. In an artificial 123 MB
    rollout benchmark, sharing resumed history reduced cold resume latency
    by roughly 9–10%. The affected crates compile with their test targets,
    all 80 thread-store tests pass, and the Bazel dependency lock remains
    valid.
  • path-uri: add lexical containment (#29614)
    ## Why
    
    Executor-owned paths must stay portable while the orchestrator reasons
    about them. Converting a Windows or remote path to the orchestrator
    host's native path just to check containment breaks that boundary.
    
    ## What changed
    
    - Add lexical containment to `PathUri`.
    - Compare URI authorities and complete path segments, so `plugin-other`
    is not treated as a child of `plugin`.
    - Fail closed for encoded path separators and opaque fallback URIs.
    
    For example:
    
    ```text
    file:///C:/plugins/foo/assets/icon.svg
      is below file:///C:/plugins/foo
    
    file:///C:/plugins/foo2/icon.svg
      is not below file:///C:/plugins/foo
    ```
    
    This is the shared foundation for keeping executor-owned plugin
    resources URI-native without consulting the orchestrator filesystem.
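
    A simplified sketch of the segment-wise comparison, working on plain
    strings rather than the real `PathUri` type. The helper names are
    hypothetical, and two choices are assumptions of this sketch rather
    than documented behavior: a path counts as contained in itself, and
    dot segments are rejected:

    ```rust
    /// Splits `scheme://authority/path` into scheme, authority, and path
    /// segments. Returns `None` for anything this sketch cannot reason about
    /// lexically, so callers fail closed.
    fn split_uri(uri: &str) -> Option<(&str, &str, Vec<&str>)> {
        // Opaque URIs without `://` are rejected.
        let (scheme, rest) = uri.split_once("://")?;
        let slash = rest.find('/')?;
        let (authority, path) = (&rest[..slash], &rest[slash..]);

        // Encoded separators could hide extra segments.
        let lower = path.to_ascii_lowercase();
        if lower.contains("%2f") || lower.contains("%5c") {
            return None;
        }

        let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
        // Assumption of this sketch: dot segments are rejected, not resolved.
        if segments.iter().any(|s| *s == "." || *s == "..") {
            return None;
        }
        Some((scheme, authority, segments))
    }

    /// True when `child` is at or below `parent`, comparing whole segments
    /// without touching any filesystem.
    pub fn is_lexically_below(child: &str, parent: &str) -> bool {
        let (Some((c_scheme, c_auth, c_segs)), Some((p_scheme, p_auth, p_segs))) =
            (split_uri(child), split_uri(parent))
        else {
            return false;
        };
        c_scheme == p_scheme && c_auth == p_auth && c_segs.starts_with(&p_segs)
    }
    ```

    Comparing segments rather than string prefixes is what keeps `foo2`
    from matching `foo`.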
  • Namespace multi-agent v2 tools under collaboration (#29067)
    ## Summary
    
    Multi-agent v2 tools now use the fixed `collaboration` namespace when
    namespace tools are available. This keeps the model-visible hint and the
    actual tool surface aligned around `functions.collaboration.*`, without
    exposing an unshipped namespace knob to users.
    
    The PR also removes the old `features.multi_agent_v2.tool_namespace`
    config/schema surface, updates the MAv2 test fixtures for namespaced
    calls, and fixes stale `TurnContext.features` references that were
    breaking `codex-core` builds.
    
    ## Changes
    
    - Expose MAv2 tools under `collaboration` instead of relying on a
    configurable namespace.
    - Remove `tool_namespace` from MAv2 TOML config, resolved config,
    validation, schema, and tests.
    - Update tool-planning and integration fixtures to assert or emit
    namespaced MAv2 tool calls.
    - Read feature state through `TurnContext.config.features` in the
    multi-agent mode context paths.
    
    ## Testing
    
    - `just write-config-schema`
    - `just test -p codex-features`
  • Fix Codex Apps auth elicitation hang (#29615)
    ## Summary
    - Require the reserved Codex Apps MCP server name to be present in the
    connection manager before treating it as host-owned.
    - Update auth elicitation tests to model an installed host-owned Codex
    Apps server without sending startup events to the test session.
    
    ## Why
    PR #29518 replaced the old host-owned flag with a name-only check. As a
    result, tests that used the reserved `codex_apps` name for a
    non-host-owned server entered auth elicitation and waited forever for a
    response.
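
    The tightened check amounts to something like the sketch below. The
    function name and the `HashSet` of installed servers are hypothetical;
    the real code consults the connection manager:

    ```rust
    use std::collections::HashSet;

    /// Reserved server name mentioned above.
    pub const CODEX_APPS_SERVER_NAME: &str = "codex_apps";

    /// Treats a server as host-owned only when it uses the reserved name and
    /// is actually present among the installed servers.
    pub fn is_host_owned_codex_apps(
        server_name: &str,
        installed_servers: &HashSet<String>,
    ) -> bool {
        server_name == CODEX_APPS_SERVER_NAME && installed_servers.contains(server_name)
    }
    ```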
  • Stop persisting bridged log events (#29599)
    ## Why
    
    The 0.142.0 persistent-log filter disables `target=log`, but bridged log
    records are filtered using their original dependency target before
    `tracing-log` emits them as `target=log`. This allowed high-volume
    dependency TRACE events to keep reaching SQLite.
    
    This is a follow-up to #28224.
    
    ## What changed
    
    - Reject bridged `target=log` events inside the SQLite sink before
    formatting or queueing them.
  • Allow codex sandbox to consume MCP sandbox state (#29358)
    ## Summary
    
    - let `codex sandbox` accept the JSON value from
    `codex/sandbox-state-meta`
    - require the payload `permissionProfile` instead of falling back to
    ambient permissions
    - reuse the existing macOS, Linux, and Windows launch paths, treating
    external sandbox state conservatively as read-only
    - let opaque forwarders add runtime read roots and disable direct
    network access without decoding the payload
    
    Builds on #29113, which is now on `main`.
    
    ## Tests
    
    - `just test -p codex-cli debug_sandbox::tests`
    - `cargo build -p codex-rmcp-client --bin test_stdio_server`
    - `just test -p codex-core
    stdio_mcp_tool_call_includes_sandbox_state_meta`
    - `just test -p codex-mcp`
    - `just fmt`
  • Group Codex Apps client setup (#29583)
    ## Why
    
    `McpConnectionManager::new` classified the Codex Apps server twice: once
    to create its tools cache context and again to select its runtime
    authentication provider. Keeping those decisions separate makes it
    harder to see that they belong to the same server-specific setup path.
    
    ## What changed
    
    - Group Codex Apps cache and authentication setup under one explicit
    branch.
    - Keep regular MCP server setup in the corresponding `else` branch.
    - Limit environment bearer-token inspection to the Codex Apps path where
    it affects runtime authentication.
  • Remove redundant Codex Apps cache guard (#29575)
    ## Why
    
    Codex Apps cache writes are already restricted to Codex Apps call paths:
    startup invokes the helper only from the Codex Apps branch, and hard
    refresh operates on the reserved Codex Apps server directly. Rechecking
    the server name inside the cache helper duplicates that classification
    and leaves the helper with an argument that cannot change valid
    behavior.
    
    ## What changed
    
    - Remove the redundant server-name check and parameter from the cache
    writer.
    - Rename the helper to `write_codex_apps_tools_cache` to reflect its
    narrower contract.
    - Update production and test callsites to use the simplified API.
  • Handle additional tools in image URL validation (#29577)
    ## Why
    
    `ResponseItem::AdditionalTools` was added without updating app-server
    image URL validation. The exhaustive match therefore prevents app-server
    and downstream targets from compiling on `main`.
    
    ## What changed
    
    Treat `AdditionalTools` like the other response items that cannot
    contain input-image URLs.