Commit Graph

5734 Commits

  • Use Auto-review wording for fallback rationale (#19168)
    ## Why
    
    PR #18797 currently surfaces fallback rationale text that names Guardian
    directly.
    
    ## What changed
    
    - Updated the bare allow and bare deny fallback rationales in
    `codex-rs/core/src/guardian/prompt.rs` from Guardian to Auto-review.
    - Updated the existing bare allow parser test and added explicit bare
    deny parser coverage.
    
    ## Verification
    
    - `cargo test -p codex-core parse_guardian_assessment_treats_bare`
  • Move marketplace add/remove and startup sync out of core. (#19099)
    Move more things to core-plugins.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • ci: add macOS keychain entitlements (#19167)
    ## Summary
    
    - add macOS application and team identifiers to the release signing
    entitlements
    - add a Codex keychain access group for release-signed macOS binaries
    - keep the existing JIT entitlement unchanged
    
    ## Why
    
    Codex release binaries are signed with the OpenAI Developer ID team, but
    the current entitlements plist only grants JIT. macOS Keychain and
    Secure Enclave operations that create persistent keys can require the
    process to carry an application identifier and keychain access group.
    Adding these entitlements gives release-signed binaries a stable
    Keychain namespace for Codex-owned device keys.
    
    ## Validation
    
    - `plutil -lint
    .github/actions/macos-code-sign/codex.entitlements.plist`
  • app-server: add Unix socket transport (#18255)
    ## Summary
    - add unix:// app-server transport backed by the shared codex-uds crate
    - reuse the websocket connection loop for axum and tungstenite-backed
    streams
    - add codex app-server proxy to bridge stdio clients to the control
    socket
    - tolerate Windows UDS backends that report a missing rendezvous path as
    connection refused before binding
    
    ## Tests
    - cargo test -p codex-app-server
    control_socket_acceptor_forwards_websocket_text_messages_and_pings
    - cargo test -p codex-app-server
    - just fmt
    - just fix -p codex-app-server
    - git -c core.fsmonitor=false diff --check
  • Respect explicit untrusted project config (#18626)
    ## Why
    
    Fixes #18475. A `-c` override such as `projects.<cwd>.trust_level =
    "untrusted"` is meant to be a runtime config override, but app-server
    thread startup treated any non-trusted project as eligible for automatic
    trust persistence when a permissive sandbox/cwd was requested. That
    meant an explicit `untrusted` session override could still cause
    `config.toml` to be updated with `trusted`.
    
    ## What changed
    
    The app-server auto-trust path now runs only when the active project
    trust level is unknown. Explicit `trusted` and explicit `untrusted`
    values are both respected, regardless of whether they came from
    persisted config or session flags.
    
    A focused `thread/start` test now covers the explicit `untrusted` case
    with a permissive sandbox request.
    
    ## Verification
    
    - `cargo test -p codex-app-server`
    - `just fix -p codex-app-server`
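    
    A minimal sketch of the rule described above, assuming the active trust
    level can be modeled as an `Option`. `TrustLevel` and
    `should_auto_trust` are hypothetical names for illustration, not the
    app-server's actual types:
    
    ```rust
    // Hypothetical stand-in for the project trust level; `None` means unknown.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum TrustLevel {
        Trusted,
        Untrusted,
    }
    
    /// Auto-trust persistence is eligible only when no explicit trust level
    /// is active (from persisted config or session flags) and the request
    /// asks for a permissive sandbox/cwd.
    fn should_auto_trust(active: Option<TrustLevel>, permissive_request: bool) -> bool {
        active.is_none() && permissive_request
    }
    ```
    
    Under this sketch, an explicit `untrusted` override never reaches the
    persistence step, even with a permissive sandbox request.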
  • [codex] Route live thread writes through ThreadStore (#18882)
    Begin migrating the thread write codepaths to ThreadStore.
    
    This starts using ThreadStore inside of core session code, not only in
    the app server code.
    
    Rework the interfaces around thread recording/persistence. We're left
    with the following:
    
    * `ThreadManager`: owns the process-level registry of loaded threads and
    handles cross-thread orchestration: start, resume, fork, lookup, remove,
    and route ops to running CodexThreads.
    * `CodexThread`: represents one loaded/running thread from the outside.
    It is the handle app-server and callers use to submit ops, inspect
    session metadata, and shut the thread down.
    * `LiveThread`: session-owned persistence lifecycle handle for one
    active thread. Core session code uses it to append rollout items,
    materialize lazy persistence, flush, shutdown, discard init-failed
    writers, and load that thread’s persisted history.
    * `ThreadStore`: storage backend abstraction. It answers “how are
    threads persisted, read, listed, updated, archived?” Local and remote
    implementations live behind this trait.
    * `LocalThreadStore`: local ThreadStore implementation. It owns the
    file/sqlite-specific details and keeps RolloutRecorder as a local
    implementation detail.
    
    This is a few too many Thread abstractions for my liking, but they do
    all represent different concepts / needs / layers.
    
    Migration note: in places where the core code explicitly requires a
    path, rather than a thread ID, throw an error if we're running with a
    remote store.
    
    Cover the new local live-writer lifecycle with focused tests and
    preserve app-server thread-start behavior, including ephemeral pathless
    sessions.
  • Add excludeTurns parameter to thread/resume and thread/fork (#19014)
    Callers that paginate results for the UI can now call thread/resume or
    thread/fork with `excludeTurns: true`. The call then fetches no pages of
    turns and only sets up the subscription. It can be followed immediately
    by pagination requests to thread/turns/list to fetch pages of turns
    according to the UI's current interactions.
  • Add remote thread config loader protos (#18892)
    ## Why
    
    Thread-scoped config needs a stable boundary between the app/session
    owner and the config stack. Instead of having call sites manually copy
    thread config fields into individual overrides, this adds the proto and
    Rust plumbing needed for a `ThreadConfigLoader` implementation to return
    typed sources that can be translated into ordinary config layer entries.
    
    Keeping the remote payload typed also makes precedence easier to reason
    about: session-owned thread config maps back to the existing session
    config source, while user-owned thread config is represented separately
    without introducing a new config-layer source until it has TOML-backed
    fields.
    
    ## What changed
    
    - Added the `codex.thread_config.v1` protobuf service and generated Rust
    module for loading thread config sources.
    - Added `RemoteThreadConfigLoader`, which calls the gRPC service, parses
    `SessionThreadConfig` / `UserThreadConfig`, and validates provider
    fields such as `wire_api`, auth timeout, and absolute auth cwd.
    - Added proto generation tooling under
    `config/scripts/generate-proto.sh` and
    `config/examples/generate-proto.rs`.
    - Added `ThreadConfigLoader::load_config_layers`, plus static/no-op
    loader helpers, so tests and callers can use the same typed loader
    interface while config-layer translation stays centralized.
    
    ## Verification
    
    - `cargo test -p codex-config thread_config`
  • feat: drop spawned-agent context instructions (#19127)
    ## Why
    
    MultiAgentV2 children should not receive an extra model-visible
    developer fragment just because they were spawned. The parent/configured
    developer instructions should carry through normally, but the dedicated
    `<spawned_agent_context>` block is no longer desired.
    
    ## What changed
    
    - Removed the `SpawnAgentInstructions` context fragment and its
    `<spawned_agent_context>` wrapper.
    - Stopped appending spawned-agent instructions in
    `codex-rs/core/src/tools/handlers/multi_agents_v2/spawn.rs`.
    - Updated subagent notification coverage to assert inherited parent
    developer instructions without expecting the spawned-agent wrapper.
    
    ## Verification
    
    - `cargo test -p codex-core --test all
    spawned_multi_agent_v2_child_inherits_parent_developer_context --
    --nocapture`
    - `cargo test -p codex-core --test all
    skills_toggle_skips_instructions_for_parent_and_spawned_child --
    --nocapture`
    - `cargo test -p codex-core --test all subagent_notifications --
    --nocapture`
  • [codex] Fix plugin marketplace help usage (#18710)
    ## Summary
    - Updates generated CLI help for plugin marketplace commands to show the
    full `codex plugin marketplace ...` namespace.
    - Adds a regression test covering the marketplace command and its `add`,
    `upgrade`, and `remove` help pages.
    
    ## Root Cause
    The marketplace parser already lived under `codex plugin marketplace`,
    but Clap generated usage text from the child parser's standalone command
    name. That made help output show stale `codex marketplace ...`
    instructions even though the top-level `codex marketplace` command no
    longer parses.
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-cli`
    - `./target/debug/codex plugin marketplace --help`
  • tui: sync session permission profiles (#18284)
    ## Why
    
    Once `SessionConfigured` carries the active `PermissionProfile`, the TUI
    must treat that as authoritative session state. Otherwise the widget can
    keep stale local permission details after a session is configured or
    resumed.
    
    The TUI also keeps a local `Config` copy used for later operations, so
    session-sourced profiles and subsequent local sandbox changes need to
    keep the derived split runtime permissions in sync. Because this PR may
    land before the follow-up user-turn profile plumbing, embedded
    app-server turns also need a standalone path for carrying local runtime
    sandbox overrides.
    
    ## What changed
    
    - Sync the chat widget runtime filesystem/network permissions from
    `SessionConfigured.permission_profile`, with the legacy `sandbox_policy`
    as the fallback.
    - Recompute split runtime permissions whenever the TUI applies or
    carries forward a local sandbox-policy override.
    - Mark feature-driven Auto-review sandbox changes as runtime sandbox
    overrides so the standalone embedded turn-start profile path is used
    even without the follow-up user-turn profile PR.
    - Send a turn-start `permissionProfile` for embedded,
    non-ExternalSandbox turns when the TUI has a runtime sandbox override;
    remote and ExternalSandbox turns keep using the legacy sandbox field.
    - Extend coverage for profile sync, local sandbox changes,
    ExternalSandbox fallback, feature-driven sandbox overrides, and
    turn-start permission override selection.
    
    ## Verification
    
    - `cargo test -p codex-tui
    update_feature_flags_enabling_guardian_selects_auto_review`
    - `cargo test -p codex-tui
    turn_start_permission_overrides_send_profiles_only_for_embedded_runtime_overrides`
    - `cargo test -p codex-tui permission_settings_sync`
    - `cargo test -p codex-tui
    session_configured_external_sandbox_keeps_external_runtime_policy`
    - `cargo test -p codex-tui
    session_configured_syncs_widget_config_permissions_and_cwd`
    - `just fix -p codex-tui`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18284).
    * #18288
    * #18287
    * #18286
    * #18285
    * __->__ #18284
  • Update safety check wording (#19149)
    Updates wording of cyber safety check.
  • exec-server: wait for close after observed exit (#19130)
    ## Why
    
    Windows CI can flake in
    `server::handler::tests::output_and_exit_are_retained_after_notification_receiver_closes`
    after a process has exited but before both output streams have closed.
    `exec/read` returned immediately whenever `exited` was true, so callers
    that had already observed the exit event could spin instead of
    long-polling for the later `closed` state.
    
    ## What Changed
    
    - Keep returning immediately when a terminal exit event is newly
    observable.
    - Allow later reads, after the caller has advanced past that event, to
    wait for `closed` or new output until `wait_ms` expires.
    
    ## Verification
    
    - CI pending.
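    
    A rough sketch of the read decision described above. It assumes the
    caller's position and the exit event can be compared by sequence
    number; `ReadState`, `ReadAction`, `decide_read`, and the sequence
    fields are hypothetical and may not match the exec-server's actual
    bookkeeping:
    
    ```rust
    #[derive(Debug, PartialEq, Eq)]
    enum ReadAction {
        ReturnNow,
        WaitUpTo(u64), // milliseconds
    }
    
    // Hypothetical snapshot of a process as seen by one read call.
    struct ReadState {
        has_new_output: bool,
        exit_event_seq: Option<u64>, // set once the process has exited
        closed: bool,                // both output streams have closed
    }
    
    /// Return immediately for new output, a newly observable exit event, or
    /// the closed state. Otherwise long-poll until `wait_ms` expires.
    fn decide_read(state: &ReadState, after_seq: u64, wait_ms: u64) -> ReadAction {
        let exit_is_new = matches!(state.exit_event_seq, Some(seq) if seq > after_seq);
        if state.has_new_output || exit_is_new || state.closed {
            ReadAction::ReturnNow
        } else {
            ReadAction::WaitUpTo(wait_ms)
        }
    }
    ```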
  • Reject agents.max_threads with multi_agent_v2 (#19129)
    ## Why
    
    `multi_agent_v2` uses the v2 agent lifecycle, so accepting the legacy
    `agents.max_threads` limit alongside it creates conflicting
    configuration semantics. Config load should fail early with a clear
    error instead of allowing both knobs to be set.
    
    ## What Changed
    
    - During config load, detect when the effective `multi_agent_v2` feature
    is enabled and `agents.max_threads` is explicitly set.
    - Return an `InvalidInput` error: `agents.max_threads cannot be set when
    multi_agent_v2 is enabled`.
    
    ## Verification
    
    - `cargo test -p codex-core multi_agent_v2_rejects_agents_max_threads`
    passed locally with a temporary focused test for this behavior.
    - `cargo test -p codex-core` was also run; the new focused path passed,
    but the crate suite has unrelated pre-existing failures in managed
    config/proxy/request-permissions tests.
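    
    A minimal sketch of the check, assuming the `InvalidInput` error is a
    `std::io::Error` with `ErrorKind::InvalidInput`. The function name and
    parameters are illustrative, not the actual config loader API:
    
    ```rust
    use std::io;
    
    /// Fails config load when the legacy limit is set alongside the v2
    /// agent lifecycle. `max_threads` is `Some` only when explicitly set.
    fn validate_agents_config(
        multi_agent_v2_enabled: bool,
        max_threads: Option<usize>,
    ) -> io::Result<()> {
        if multi_agent_v2_enabled && max_threads.is_some() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "agents.max_threads cannot be set when multi_agent_v2 is enabled",
            ));
        }
        Ok(())
    }
    ```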
  • Fix auto-review config compatibility across protocol and SDK (#19113)
    ## Why
    
    This keeps the partial Guardian subagent -> Auto-review rename
    forward-compatible across mixed Codex installations. Newer binaries need
    to understand the new `auto_review` spelling, but they cannot write it
    to shared `~/.codex/config.toml` yet because older CLI/app-server
    bundles only know `user` and `guardian_subagent` and can fail during
    config load before recovering.
    
    The Python SDK had the opposite compatibility gap: app-server responses
    can contain `approvalsReviewer: "auto_review"`, but the checked-in
    generated SDK enum did not accept that value.
    
    ## What Changed
    
    - Keep `ApprovalsReviewer::AutoReview` readable from both
    `guardian_subagent` and `auto_review`, while serializing it as
    `guardian_subagent` in both protocol crates.
    - Update TUI Auto-review persistence tests so enabling Auto-review
    writes `approvals_reviewer = "guardian_subagent"` while UI copy still
    says Auto-review.
    - Map managed/cloud `feature_requirements.auto_review` to the existing
    `Feature::GuardianApproval` gate without adding a broad local
    `[features].auto_review` key or changing config writes.
    - Add `auto_review` to the Python SDK `ApprovalsReviewer` enum and cover
    `ThreadResumeResponse` validation.
    
    ## Testing
    
    - `cargo test -p codex-protocol approvals_reviewer`
    - `cargo test -p codex-app-server-protocol approvals_reviewer`
    - `cargo test -p codex-tui
    update_feature_flags_enabling_guardian_selects_auto_review`
    - `cargo test -p codex-tui
    update_feature_flags_enabling_guardian_in_profile_sets_profile_auto_review_policy`
    - `cargo test -p codex-core
    feature_requirements_auto_review_disables_guardian_approval`
    - `pytest
    sdk/python/tests/test_client_rpc_methods.py::test_thread_resume_response_accepts_auto_review_reviewer`
    - `git diff --check`
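    
    The read-both, write-old behavior can be sketched with the standard
    library alone. The protocol crates presumably use serde for this; the
    hand-written `FromStr` impl and `as_wire_str` helper below are
    assumptions made for illustration:
    
    ```rust
    use std::str::FromStr;
    
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum ApprovalsReviewer {
        User,
        AutoReview,
    }
    
    impl ApprovalsReviewer {
        /// Always writes the legacy spelling so older bundles can load it.
        fn as_wire_str(self) -> &'static str {
            match self {
                ApprovalsReviewer::User => "user",
                ApprovalsReviewer::AutoReview => "guardian_subagent",
            }
        }
    }
    
    impl FromStr for ApprovalsReviewer {
        type Err = String;
    
        /// Accepts both the legacy and the new spelling.
        fn from_str(s: &str) -> Result<Self, Self::Err> {
            match s {
                "user" => Ok(ApprovalsReviewer::User),
                "guardian_subagent" | "auto_review" => Ok(ApprovalsReviewer::AutoReview),
                other => Err(format!("unknown approvals reviewer: {other}")),
            }
        }
    }
    ```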
  • Support MCP tools in hooks (#18385)
    ## Summary
    
    Lifecycle hooks currently treat `PreToolUse`, `PostToolUse`, and
    `PermissionRequest` as Bash-only flows:
    - hook schema constrains `tool_name` to `Bash`
    - hook input assumes a command-shaped `tool_input`
    - core hook dispatch path passes only shell command strings
    
    That means hooks cannot target MCP tools even though MCP tool names are
    model-visible and stable.
    
    This change generalizes those hook paths so they can match and receive
    payloads for MCP tools while preserving the existing Bash behavior.
    
    ## Reviewer Notes
    
    I think these are the key files:
    - `codex-rs/core/src/tools/handlers/mcp.rs`
    - `codex-rs/core/src/mcp_tool_call.rs`
    
    Otherwise the changes across apply_patch, shell, and unified_exec are
    mainly to rewire everything to be `tool_input` based instead of just
    `command` so that it'll make sense for MCP tools.
    
    ## Changes
    
    - Allow `PreToolUse`, `PostToolUse`, and `PermissionRequest` hook inputs
    to carry arbitrary `tool_name` and `tool_input` values instead of
    hard-coding `Bash` and command-only payloads.
    - Add MCP hook payload support through `McpHandler`, using the
    model-visible tool name from `ToolInvocation` and the raw MCP arguments
    as `tool_input`.
    - Include MCP tool responses in `PostToolUse` by serializing
    `McpToolOutput` into the hook response payload.
    - Run `PermissionRequest` hooks for MCP approval requests after
    remembered approval checks and before falling back to user-facing MCP
    elicitation.
    - Preserve exact matching for literal hook matchers like `Bash` and
    `mcp__memory__create_entities`, while keeping regex matcher support for
    patterns like `mcp__memory__.*` and `mcp__.*__write.*`.
    
    ---------
    
    Co-authored-by: Andrei Eternal <eternal@openai.com>
    Co-authored-by: Codex <noreply@openai.com>
  • app-server: include filesystem entries in permission requests (#19086)
    ## Why
    
    `item/permissions/requestApproval` sends a requested permission profile
    to app-server clients. The core profile already stores filesystem
    permissions as `entries`, but the v2 compatibility conversion used the
    legacy `read`/`write` projection whenever possible and left `entries`
    unset.
    
    That made the request ambiguous for clients that consume the canonical
    v2 shape: `permissions.fileSystem.entries` was missing even though
    filesystem access was being requested. A client that rendered or echoed
    grants from `entries` could treat the request as having no filesystem
    permission entries, then return an empty or incomplete grant. The
    app-server intersects responses with the original request, so omitted
    filesystem permissions are denied.
    
    ## What Changed
    
    - Populate `AdditionalFileSystemPermissions.entries` when converting
    legacy read/write roots for request permission payloads, while
    preserving `read` and `write` for compatibility.
    - Mark `read` and `write` as transitional schema fields in the generated
    app-server schema.
    - Add regression coverage for the v2 conversion, the app-server
    `item/permissions/requestApproval` round trip, and TUI app-server
    approval conversion expectations.
    - Refresh generated JSON and TypeScript schema fixtures.
    
    ## Verification
    
    - `just fmt`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server request_permissions_round_trip`
    - `cargo test -p codex-tui
    converts_request_permissions_into_granted_permissions`
    - `cargo test -p codex-tui
    resolves_permissions_and_user_input_through_app_server_request_id`
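    
    A simplified sketch of the conversion and of why missing `entries`
    leads to a denial. The struct shapes and the helpers
    `from_legacy_roots` and `intersect` are hypothetical stand-ins, not
    the generated app-server schema types:
    
    ```rust
    #[derive(Clone, Debug, PartialEq, Eq)]
    enum Access {
        Read,
        Write,
    }
    
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct Entry {
        path: String,
        access: Access,
    }
    
    #[derive(Debug)]
    struct FileSystemPermissions {
        read: Vec<String>,   // transitional field, kept for compatibility
        write: Vec<String>,  // transitional field, kept for compatibility
        entries: Vec<Entry>, // canonical shape
    }
    
    /// Populates `entries` from legacy roots while preserving `read`/`write`.
    fn from_legacy_roots(read: Vec<String>, write: Vec<String>) -> FileSystemPermissions {
        let entries: Vec<Entry> = read
            .iter()
            .map(|p| Entry { path: p.clone(), access: Access::Read })
            .chain(write.iter().map(|p| Entry { path: p.clone(), access: Access::Write }))
            .collect();
        FileSystemPermissions { read, write, entries }
    }
    
    /// Keeps only requested entries that the client granted back.
    fn intersect(requested: &[Entry], granted: &[Entry]) -> Vec<Entry> {
        requested.iter().filter(|e| granted.contains(e)).cloned().collect()
    }
    ```
    
    A client that echoes an empty `entries` list ends up with an empty
    intersection, which is the denial described above.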
  • Persist target default reasoning on model upgrade (#19085)
    ## Why
    
    When the TUI upgrade flow moves a user to a newer model, the accepted
    migration should also persist the target model's default reasoning
    effort. That keeps the upgraded model and reasoning setting aligned
    instead of carrying forward a stale previously saved effort from the old
    model.
    
    ## What changed
    
    - The accepted model migration path now updates in-memory config, TUI
    state, and persisted model selection with the target preset's
    `default_reasoning_effort`.
    - The upgrade destructuring keeps `reasoning_effort_mapping` explicitly
    unused because mappings are no longer consulted on accepted migrations.
    - Added a catalog test that starts with a pre-existing saved reasoning
    effort and verifies the accepted upgrade overwrites it with the target
    model default and emits the expected persistence events.
    - Rebasing onto current `main` also updates a TUI thread-session test
    helper for the latest `permission_profile` field and
    `ApprovalsReviewer::AutoReview` rename so CI compiles on the new base.
    
    ## Verification
    
    - `cargo test -p codex-tui model_catalog`
    - `cargo test -p codex-tui
    permission_settings_sync_updates_active_snapshot_without_rewriting_side_thread`
  • Clarify cloud requirements error messages (#19078)
    ## Why
    The current cloud-requirements failures say `workspace-managed config`,
    which is ambiguous and can read like it refers to local managed config
    such as `managed_config.toml`.
    
    This code path only applies to cloud requirements, so the user-facing
    message should name that source directly.
    
    ## What changed
    - Updated the load failure in
    [`codex-rs/cloud-requirements/src/lib.rs`](https://github.com/openai/codex/blob/46e704d1f93054daa9a3b5a9100333c540c81d50/codex-rs/cloud-requirements/src/lib.rs)
    to say `failed to load cloud requirements (workspace-managed policies)`.
    - Updated the parse failure in the same file to use the same `cloud
    requirements (workspace-managed policies)` terminology.
    - Kept `workspace-managed` hyphenated because it is used as a compound
    modifier.
    - Updated the matching assertion in
    [`codex-rs/app-server/src/codex_message_processor.rs`](https://github.com/openai/codex/blob/46e704d1f93054daa9a3b5a9100333c540c81d50/codex-rs/app-server/src/codex_message_processor.rs).
    - Reused `CLOUD_REQUIREMENTS_LOAD_FAILED_MESSAGE` in the
    `codex-cloud-requirements` test where the test is asserting that
    crate-local contract directly.
    
    ## Testing
    `cargo test -p codex-cloud-requirements`
  • feat: Warn and continue on unknown feature requirements (#19038)
    Requirements feature flags now fail open like config feature flags, but
    with a startup warning.
    
    <img width="443" height="68" alt="image"
    src="https://github.com/user-attachments/assets/76767fa7-8ce8-4fc7-8a09-902fcdda6298"
    />
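    
    A minimal sketch of fail-open handling, assuming unknown keys are
    skipped and reported as warnings. `check_feature_requirements` and the
    warning text are illustrative only:
    
    ```rust
    /// Splits requested requirement keys into accepted keys and warnings.
    /// Unknown keys never produce an error; they are only reported.
    fn check_feature_requirements<'a>(
        requested: &[&'a str],
        known: &[&str],
    ) -> (Vec<&'a str>, Vec<String>) {
        let mut accepted = Vec::new();
        let mut warnings = Vec::new();
        for &key in requested {
            if known.iter().any(|k| *k == key) {
                accepted.push(key);
            } else {
                warnings.push(format!("unknown feature requirement `{key}`, ignoring"));
            }
        }
        (accepted, warnings)
    }
    ```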
  • Use remote plugin IDs for detail reads and enlarge list pages (#19079)
    1. For remote plugins, use the plugin ID (plugin name) directly when
    reading plugin details.
    2. Request up to 200 remote plugins per directory list page.
  • Add computer_use feature requirement key (#19071)
    ## Summary
    - add the `computer_use` requirements-only feature key
    - include it in generated config schema output
    - cover the new key in feature metadata tests
    
    ## Testing
    - `cargo test -p codex-features`
    - `just write-config-schema`
    - `just fmt`
    - `just fix -p codex-features`
    
    cc @xl-openai
    
    ---------
    
    Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
  • TUI: preserve permission state after side conversations (#18924)
    Addresses #18854
    
    ## Why
    
    The `/permissions` selector updates the active TUI session state, but
    the cached session snapshot used when replaying a thread could still
    contain the old approval or sandbox settings. After opening and leaving
    `/side`, the main thread replay could restore those stale settings into
    the `ChatWidget`, so the UI and the next submitted turn could fall back
    to the old permission mode.
    
    ## What
    
    - Sync the active thread's cached `ThreadSessionState` whenever approval
    policy, sandbox policy, or approval reviewer changes.
    
    ## Verification
    
    Confirmed bug prior to fix and correct behavior after fix.
  • Mark codex_hooks stable (#19012)
    ## Why
    
    Hooks are ready to graduate to GA in the next release!
    
    ## What
    
    - Moves `Feature::CodexHooks` into the stable feature group.
    - Marks the `codex_hooks` feature spec as `Stage::Stable` and
    default-enabled.
  • app-server: accept command permission profiles (#18283)
    ## Why
    
    `command/exec` is another app-server entry point that can run under
    caller-provided permissions. It needs to accept `PermissionProfile`
    directly so command execution is not left behind on `SandboxPolicy`
    while thread APIs move forward.
    
    Command-level profiles also need to preserve the semantics clients
    expect from profile-relative paths. `:cwd` and cwd-relative deny globs
    should be anchored to the resolved command cwd for a command-specific
    profile, while configured deny-read restrictions such as `**/*.env =
    none` still need to be enforced because they can come from config or
    requirements rather than the command override itself.
    
    ## What Changed
    
    This adds `permissionProfile` to `CommandExecParams`, rejects requests
    that combine it with `sandboxPolicy`, and converts accepted profiles
    into the runtime filesystem/network permissions used for command
    execution.
    
    When a command supplies a profile, the app-server resolves that profile
    against the command cwd instead of the thread/server cwd. It also
    preserves configured deny-read entries and `globScanMaxDepth` on the
    effective filesystem policy so one-off command overrides cannot drop
    those read protections. The PR also updates app-server docs/schema
    fixtures and adds command-exec coverage for accepted, rejected,
    cwd-scoped, and deny-read-preserving profile paths.
    
    ## Verification
    
    - `cargo test -p codex-app-server
    command_exec_permission_profile_cwd_uses_command_cwd`
    - `cargo test -p codex-app-server
    command_profile_preserves_configured_deny_read_restrictions`
    - `cargo test -p codex-app-server
    command_exec_accepts_permission_profile`
    - `cargo test -p codex-app-server
    command_exec_rejects_sandbox_policy_with_permission_profile`
    - `just fix -p codex-app-server`
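    
    A small sketch of the mutual-exclusion check on the request. Plain
    strings stand in for the real `PermissionProfile` and `SandboxPolicy`
    types, and `select_permissions` is a hypothetical helper:
    
    ```rust
    // Simplified request shape; the real fields carry structured types.
    struct CommandExecParams {
        permission_profile: Option<String>,
        sandbox_policy: Option<String>,
    }
    
    #[derive(Debug, PartialEq, Eq)]
    enum PermissionSource {
        Profile(String),
        LegacySandbox(String),
        ServerDefault,
    }
    
    /// Rejects requests that set both fields, otherwise picks the one given.
    fn select_permissions(params: CommandExecParams) -> Result<PermissionSource, String> {
        match (params.permission_profile, params.sandbox_policy) {
            (Some(_), Some(_)) => {
                Err("permissionProfile cannot be combined with sandboxPolicy".to_string())
            }
            (Some(profile), None) => Ok(PermissionSource::Profile(profile)),
            (None, Some(policy)) => Ok(PermissionSource::LegacySandbox(policy)),
            (None, None) => Ok(PermissionSource::ServerDefault),
        }
    }
    ```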
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18283).
    * #18288
    * #18287
    * #18286
    * #18285
    * #18284
    * __->__ #18283
  • Add safety check notification and error handling (#19055)
    Adds a new app-server notification that fires when a user account has
    been flagged for potential safety reasons.
  • Default Fast service tier for eligible ChatGPT plans (#19053)
    ## Why
    
    Enterprise and business-like ChatGPT plans should get Codex's Fast
    service tier by default when the user or caller has not made an explicit
    service-tier choice. At the same time, callers need a durable way to
    choose standard routing without adding a new persisted `standard`
    service tier value. This keeps existing config compatibility while
    letting core own the managed default policy.
    
    ## What changed
    
    - Resolve the effective service tier in core at session creation:
    explicit `fast` or `flex` wins, explicit null/clear or
    `[notice].fast_default_opt_out = true` resolves to standard routing, and
    otherwise eligible ChatGPT plans resolve to Fast when FastMode is
    enabled.
    - Add `[notice].fast_default_opt_out` as the persisted opt-out marker
    for managed Fast defaults.
    - Treat app-server/TUI `service_tier: null` as an explicit
    standard/clear choice by preserving that intent through config loading.
    - Update TUI rendering to use core's effective service tier for startup
    and status surfaces while still keeping `config.service_tier` as the
    explicit configured choice.
    - Update `/fast off` to clear `service_tier`, persist the opt-out
    marker, and send explicit standard for subsequent turns.
    
    ## Verification
    
    - Added unit coverage for config override/notice handling, service-tier
    resolution, runtime null clearing, and `/fast off` turn propagation.
    - `cargo build -p codex-cli`
    
    Full test suite was not run locally per author request.
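    
    The resolution order above can be sketched as a single match. All
    names here are hypothetical, `None` stands for standard routing, and
    the sketch assumes ineligible plans with no explicit choice fall back
    to standard routing:
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum ServiceTier {
        Fast,
        Flex,
    }
    
    // What the user or caller said about the tier, if anything.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    enum TierChoice {
        Unset,
        ExplicitNull, // `service_tier: null`, an explicit clear
        Explicit(ServiceTier),
    }
    
    struct TierInputs {
        choice: TierChoice,
        fast_default_opt_out: bool, // `[notice].fast_default_opt_out`
        plan_is_eligible: bool,
        fast_mode_enabled: bool,
    }
    
    /// Returns the effective tier, or `None` for standard routing.
    fn resolve_service_tier(inputs: &TierInputs) -> Option<ServiceTier> {
        match inputs.choice {
            TierChoice::Explicit(tier) => Some(tier),
            TierChoice::ExplicitNull => None,
            TierChoice::Unset if inputs.fast_default_opt_out => None,
            TierChoice::Unset if inputs.plan_is_eligible && inputs.fast_mode_enabled => {
                Some(ServiceTier::Fast)
            }
            TierChoice::Unset => None,
        }
    }
    ```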
  • protocol: report session permission profiles (#18282)
    ## Why
    
    Clients that observe `SessionConfigured` need the same canonical
    permission view that app-server thread responses provide. Reporting the
    profile in protocol events lets clients keep their local state
    synchronized without reinterpreting legacy sandbox fields.
    
    ## What changed
    
    This adds `permission_profile` to `SessionConfigured` and propagates it
    through core, exec JSON output, MCP server messages, and TUI
    history/widget handling.
    
    ## Verification
    
    - `cargo test -p codex-tui permissions -- --nocapture`
    - `cargo test -p codex-core --test all permissions_messages --
    --nocapture`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18282).
    * #18288
    * #18287
    * #18286
    * #18285
    * #18284
    * #18283
    * __->__ #18282
  • codex: support hooks in config.toml and requirements.toml (#18893)
    ## Summary
    
    Support the existing hooks schema in inline TOML so hooks can be
    configured from both `config.toml` and enterprise-managed
    `requirements.toml` without requiring a separate `hooks.json` payload.
    
    This gives enterprise admins a way to ship managed hook policy through
    the existing requirements channel while still leaving script delivery to
    MDM or other device-management tooling, and it keeps `hooks.json`
    working unchanged for existing users.
    
    This also lays the groundwork for follow-on managed filtering work such
    as #15937, while continuing to respect project trust gating from #14718.
    It does **not** implement `allow_managed_hooks_only` itself.
    
    NOTE: yes, it's a bit unfortunate that the TOML isn't formatted as
    closely to our default styling as usual. This is because we're trying
    to stay compatible with the spec for plugins/hooks that we'll need to
    support, and the main use case here is embedding into `requirements.toml`.
    
    ## What changed
    
    - moved the shared hook serde model out of `codex-rs/hooks` into
    `codex-rs/config` so the same schema can power `hooks.json`, inline
    `config.toml` hooks, and managed `requirements.toml` hooks
    - added `hooks` support to both `ConfigToml` and
    `ConfigRequirementsToml`, including requirements-side `managed_dir` /
    `windows_managed_dir`
    - treated requirements-managed hooks as one constrained value via
    `Constrained`, so managed hook policy is merged atomically and cannot
    drift across requirement sources
    - updated hook discovery to load requirements-managed hooks first, then
    per-layer `hooks.json`, then per-layer inline TOML hooks, with a warning
    when a single layer defines both representations
    - threaded managed hook metadata through discovered handlers and exposed
    requirements hooks in app-server responses, generated schemas, and
    `/debug-config`
    - added hook/config coverage in `codex-rs/config`, `codex-rs/hooks`,
    `codex-rs/core/src/config_loader/tests.rs`, and
    `codex-rs/core/tests/suite/hooks.rs`
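    
    A rough sketch of the discovery order. It assumes that within each
    layer `hooks.json` is read before inline TOML hooks, which is one
    reading of the description above. `HookSource`, `Layer`, and
    `discovery_order` are hypothetical names:
    
    ```rust
    #[derive(Debug, PartialEq, Eq)]
    enum HookSource {
        Requirements,
        HooksJson(String),  // layer name
        InlineToml(String), // layer name
    }
    
    struct Layer {
        name: String,
        has_hooks_json: bool,
        has_inline_toml: bool,
    }
    
    /// Returns the sources in load order plus any warnings.
    fn discovery_order(
        has_requirements_hooks: bool,
        layers: &[Layer],
    ) -> (Vec<HookSource>, Vec<String>) {
        let mut order = Vec::new();
        let mut warnings = Vec::new();
        // Requirements-managed hooks always come first.
        if has_requirements_hooks {
            order.push(HookSource::Requirements);
        }
        for layer in layers {
            if layer.has_hooks_json {
                order.push(HookSource::HooksJson(layer.name.clone()));
            }
            if layer.has_inline_toml {
                order.push(HookSource::InlineToml(layer.name.clone()));
            }
            // A single layer defining both representations only warns.
            if layer.has_hooks_json && layer.has_inline_toml {
                warnings.push(format!(
                    "layer `{}` defines both hooks.json and inline TOML hooks",
                    layer.name
                ));
            }
        }
        (order, warnings)
    }
    ```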
    
    ## Testing
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-app-server config_api`
    
    ## Documentation
    
    Companion updates are needed in the developers website repo for:
    
    - the hooks guide
    - the config reference, sample, basic, and advanced pages
    - the enterprise managed configuration guide
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • tui: fix approvals popup disabled shortcut test (#19072)
    ## Why
    
    This regressed in #19063, which made `GuardianApproval` stable and
    enabled by default. That adds an enabled `Auto-review` row to the
    permissions popup, but `approvals_popup_navigation_skips_disabled` still
    assumed the disabled `Full Access` row lived behind a hard-coded numeric
    shortcut, so the test started selecting a different row and closing the
    popup instead of verifying disabled-row behavior.
    
    ## What
    
    - disable `GuardianApproval` in
    `approvals_popup_navigation_skips_disabled` so the popup layout matches
    the scenario the test is exercising
    - choose the hidden numeric shortcut for the disabled `Full Access` row
    by platform (`2` on non-Windows, `3` on Windows where `Read Only` is
    shown) before asserting that selecting the disabled row leaves the popup
    open
    
    ## Testing
    
    - `cargo test -p codex-tui --lib
    chatwidget::tests::permissions::approvals_popup_navigation_skips_disabled
    -- --exact --nocapture`
    - `cargo test -p codex-tui --lib chatwidget::tests::permissions --
    --nocapture`
    - `cargo test -p codex-tui`
  • test: set Rust test thread stack size (#19067)
    ## Summary
    
    Set `RUST_MIN_STACK=8388608` for Rust test entry points so
    libtest-spawned test threads get an 8 MiB stack.
    
    The Windows BuildBuddy failure on #18893 showed
    `//codex-rs/tui:tui-unit-tests` exiting with a stack overflow in a
    `#[tokio::test]` even though later test binaries in the shard printed
    successful summaries. Default `#[tokio::test]` uses a current-thread
    Tokio runtime, which means the async test body is driven on libtest's
    std-spawned test thread. Increasing the test thread stack addresses that
    failure mode directly.
    
    To date, we have been fixing these stack-pressure problems with
    localized future-size reductions, such as #13429, and by adding
    `Box::pin()` in specific async wrapper chains. This gives us a baseline
    test-runner stack size instead of continuing to patch individual tests
    only after CI finds another large async future.
    
    ## What changed
    
    - Added `common --test_env=RUST_MIN_STACK=8388608` in `.bazelrc` so
    Bazel test actions receive the env var through Bazel's cache-keyed test
    environment path.
    - Set the same `RUST_MIN_STACK` value for Cargo/nextest CI entry points
    and `just test`.
    - Annotated the existing Windows Bazel linker stack reserve as 8 MiB so
    it stays aligned with the libtest thread stack size.
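
    For reference, `8388608` is 8 × 1024 × 1024 bytes. `RUST_MIN_STACK`
    only affects threads spawned through `std::thread` without an explicit
    size. The sketch below requests the same size explicitly with
    `std::thread::Builder::stack_size`; the `run_with_8_mib_stack` helper is
    illustrative and not part of this change.

    ```rust
    use std::thread;

    /// 8 MiB, the same value this change passes through `RUST_MIN_STACK`.
    const STACK_BYTES: usize = 8 * 1024 * 1024;

    /// Runs `f` on a fresh thread with an explicit 8 MiB stack and returns its result.
    fn run_with_8_mib_stack<T, F>(f: F) -> T
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        thread::Builder::new()
            .stack_size(STACK_BYTES)
            .spawn(f)
            .expect("failed to spawn thread")
            .join()
            .expect("thread panicked")
    }
    ```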
    
    ## Testing
    
    - `just --list`
    - parsed `.github/workflows/rust-ci.yml` and
    `.github/workflows/rust-ci-full.yml` with Ruby's YAML loader
    - compared `bazel aquery` `TestRunner` action keys before/after explicit
    `--test_env=RUST_MIN_STACK=...` and after moving the Bazel env to
    `.bazelrc`
    - `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors` (this
    failed locally on the existing sandbox-specific status snapshot
    permission mismatch, but it loaded the Starlark changes and ran the TUI
    test shards)
  • feat(request-permissions) approve with strict review (#19050)
    ## Summary
    Allow the user to approve a `request_permissions_tool` request on the
    condition that all commands in the rest of the turn are reviewed by
    Guardian, regardless of sandbox status.
    
    ## Testing
    - [x] Added unit tests
    - [x] Ran locally
  • chore(auto-review) feature => stable (#19063)
    ## Summary
    Turn on Auto Review
    
    ## Testing
    - [x] Update unit tests
  • core: box multi-agent wrapper futures (#19059)
    ## Why
    
    While debugging the Windows stack overflows we saw in
    [#13429](https://github.com/openai/codex/pull/13429) and then again in
    [#18893](https://github.com/openai/codex/pull/18893), I hit another
    overflow in
    `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.
    
    That test drives the legacy multi-agent spawn / close / resume path. The
    behavior was fine, but several thin async wrappers were still inlining
    much larger `AgentControl` futures into their callers, which was enough
    to overflow the default Windows stack.
    
    ## What
    
    - Box the thin `AgentControl` wrappers around `spawn_agent_internal`,
    `resume_single_agent_from_rollout`, and `shutdown_agent_tree`.
    - Box the corresponding legacy `multi_agents` handler calls in `spawn`,
    `resume_agent`, and `close_agent`.
    - Keep behavior unchanged while reducing future size on this call path
    so the Windows test no longer overflows its stack.
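
    The effect of boxing a thin wrapper can be seen in a small, self-contained
    sketch. The functions below are stand-ins, not the `AgentControl` code:
    `big_inner` simulates a large future by keeping a 4 KiB buffer alive
    across an `.await`, and the two wrappers differ only in whether they
    `Box::pin` it.

    ```rust
    use std::future::ready;
    use std::mem::size_of_val;

    /// Stand-in for a large inner future: `buf` lives across the `.await`,
    /// so it is stored inside the future's state.
    async fn big_inner(seed: u8) -> u8 {
        let buf = [seed; 4096];
        ready(()).await;
        buf[4095]
    }

    /// Thin wrapper that awaits the inner future inline, so the wrapper's
    /// own future embeds the whole inner future.
    async fn thin_inline(seed: u8) -> u8 {
        big_inner(seed).await
    }

    /// Same wrapper, but the inner future is moved to the heap before awaiting,
    /// so the wrapper only has to hold a pointer across the `.await`.
    async fn thin_boxed(seed: u8) -> u8 {
        Box::pin(big_inner(seed)).await
    }

    /// Returns `(inline_size, boxed_size)` in bytes without polling either future.
    fn wrapper_sizes() -> (usize, usize) {
        (size_of_val(&thin_inline(0)), size_of_val(&thin_boxed(0)))
    }
    ```

    Comparing the two values from `wrapper_sizes()` shows how much state a
    caller no longer has to carry when the wrapper is boxed, at the cost of
    one heap allocation per call.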
    
    ## Testing
    
    - `cargo test -p codex-core --lib
    tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed
    -- --exact --nocapture`
    - `cargo test -p codex-core` (this still hit unrelated local
    integration-test failures because `codex.exe` / `test_stdio_server.exe`
    were not present in this shell; the relevant unit tests passed)
  • [3/4] Add executor-backed RMCP HTTP client (#18583)
    ### Why
    The RMCP layer needs a Streamable HTTP client that can talk either
    directly over `reqwest` or through the executor HTTP runner without
    duplicating MCP session logic higher in the stack. This PR adds that
    client-side transport boundary so remote Streamable HTTP MCP can reuse
    the same RMCP flow as the local path.
    
    ### What
    - Add a shared `rmcp-client/src/streamable_http/` module with:
      - `transport_client.rs` for the local-or-remote transport enum
      - `local_client.rs` for the direct `reqwest` implementation
      - `remote_client.rs` for the executor-backed implementation
      - `common.rs` for the small shared Streamable HTTP helpers
    - Teach `RmcpClient` to build Streamable HTTP transports in either local
    or remote mode while keeping the existing OAuth ownership in RMCP.
    - Translate remote POST, GET, and DELETE session operations into
    executor `http/request` calls.
    - Preserve RMCP session expiry handling and reconnect behavior for the
    remote transport.
    - Add remote transport coverage in
    `rmcp-client/tests/streamable_http_remote.rs` and keep the shared test
    support in `rmcp-client/tests/streamable_http_test_support.rs`.
    
    ### Verification
    - `cargo check -p codex-rmcp-client`
    - online CI
    
    ### Stack
    1. #18581 protocol
    2. #18582 runner
    3. #18583 RMCP client
    4. #18584 manager wiring and local/remote coverage
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Rename approvals reviewer variant to auto-review (#19056)
    ## Why
    
    `approvals_reviewer` now uses `auto_review` as the canonical config/API
    value after #18504, but the Rust enum variant and nearby helper/test
    names still used `GuardianSubagent` / guardian approval wording. That
    made follow-up code and reviews confusing even though the external value
    had already moved to Auto-review.
    
    ## What changed
    
    - Renamed `ApprovalsReviewer::GuardianSubagent` to
    `ApprovalsReviewer::AutoReview`.
    - Updated protocol, app-server, config, core, TUI, exec, and analytics
    test callsites.
    - Renamed nearby helper/test names from guardian approval wording to
    Auto-review wording where they refer to the approvals reviewer mode.
    - Preserved wire compatibility:
      - `auto_review` remains the canonical serialized value.
      - `guardian_subagent` remains accepted as a legacy alias.
    
    This intentionally does not rename the `[features].guardian_approval`
    key, `Feature::GuardianApproval`, `core/src/guardian`, analytics event
    names, or app-server Guardian review event types.
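
    The wire-compatibility rule can be illustrated with a std-only sketch.
    The `as_str` and `parse` helpers here are hypothetical and stand in for
    whatever serialization the real protocol crate uses; only the variant
    names and the three string values come from this description.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ApprovalsReviewer {
        User,
        AutoReview,
    }

    impl ApprovalsReviewer {
        /// Canonical serialized value.
        fn as_str(self) -> &'static str {
            match self {
                ApprovalsReviewer::User => "user",
                ApprovalsReviewer::AutoReview => "auto_review",
            }
        }

        /// Accepts the canonical values plus the legacy `guardian_subagent` alias.
        fn parse(value: &str) -> Option<Self> {
            match value {
                "user" => Some(ApprovalsReviewer::User),
                "auto_review" | "guardian_subagent" => Some(ApprovalsReviewer::AutoReview),
                _ => None,
            }
        }
    }
    ```

    Under this sketch, a legacy `guardian_subagent` value parses to
    `AutoReview` and is written back out as `auto_review`.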
    
    ## Verification
    
    - `cargo test -p codex-protocol
    approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
    - `cargo test -p codex-app-server-protocol
    approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
    - `cargo test -p codex-config approvals_reviewer`
    - `cargo test -p codex-tui update_feature_flags`
    - `cargo test -p codex-core permissions_instructions`
    - `cargo test -p codex-tui permissions_selection`
  • hooks: emit Bash PostToolUse when exec_command completes via write_stdin (#18888)
    Fixes #16246.
    
    ## Why
    
    `exec_command` already emits `PreToolUse`, but long-running unified exec
    commands that finish on a later `write_stdin` poll could miss the
    matching `PostToolUse`. That left the Bash hook lifecycle inconsistent,
    broke expectations around `tool_use_id` and `tool_input.command`, and
    meant `PostToolUse` block/replacement feedback could fail to replace the
    final session output before it reached model context.
    
    This keeps the fix scoped to the `exec_command` / `write_stdin`
    lifecycle. Broader non-Bash hook expansion is still out of scope here
    and remains tracked separately in #16732.
    
    ## What changed
    
    - Compute and store `PostToolUsePayload` while handlers still have
    access to their concrete output type, and carry `tool_use_id` through
    that payload.
    - Preserve the original hook-facing `exec_command` string through
    unified exec state (`ExecCommandRequest`, `ProcessEntry`,
    `PreparedProcessHandles`, and `ExecCommandToolOutput`) via
    `hook_command`, and remove the now-unused `session_command` output
    metadata.
    - Emit exactly one Bash `PostToolUse` for long-running `exec_command`
    sessions when a later `write_stdin` poll observes final completion,
    using the original `exec_command` call id and hook-facing command.
    - Keep one-shot `exec_command` behavior aligned with the same payload
    construction, including interactive completions that return a final
    result directly.
    - Apply `PostToolUse` block/replacement feedback before the final
    `write_stdin` completion output is sent back to the model.
    - Keep `write_stdin` itself out of `PreToolUse` matching so it continues
    to act as transport/polling for the original Bash tool call.
    - Restore plain matcher behavior for tool-name matchers such as `Bash`
    and `Edit|Write`, while still treating patterns with regex characters
    (for example `mcp__.*`) as regexes.
    - Add unit coverage for unified exec payload construction and parallel
    session separation, plus a core integration regression that verifies a
    blocked `PostToolUse` replaces the final `write_stdin` output in model
    context.
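
    One way to picture the plain-versus-regex matcher split is the sketch
    below. It is an assumption-laden illustration rather than the
    `codex-hooks` implementation: the `Matcher` type, the `classify`
    function, and the exact set of characters treated as "plain" are all
    hypothetical, and the regex branch only records the pattern instead of
    compiling it.

    ```rust
    #[derive(Debug, PartialEq)]
    enum Matcher {
        /// Exact tool names; a tool matches if it equals any one of them.
        Names(Vec<String>),
        /// Pattern that would be handed to a regex engine (not compiled here).
        Regex(String),
    }

    /// Treats `|`-separated runs of name characters as plain tool names and
    /// everything else as a regex. The character set is an assumption.
    fn classify(pattern: &str) -> Matcher {
        let is_name_char = |c: char| c.is_ascii_alphanumeric() || c == '_' || c == '-';
        let parts: Vec<&str> = pattern.split('|').collect();
        let plain = parts
            .iter()
            .all(|part| !part.is_empty() && part.chars().all(is_name_char));
        if plain {
            Matcher::Names(parts.into_iter().map(str::to_owned).collect())
        } else {
            Matcher::Regex(pattern.to_owned())
        }
    }
    ```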
    
    ## Testing
    
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-core post_tool_use_payload`
    - `cargo test -p codex-core
    post_tool_use_blocks_when_exec_session_completes_via_write_stdin`
  • rollout: persist turn permission profiles (#18281)
    ## Why
    
    Resume and reconstruction need to preserve the permissions that were
    active for each user turn. If rollouts only keep legacy sandbox fields,
    replay cannot faithfully represent profile-shaped overrides introduced
    earlier in the stack.
    
    ## What changed
    
    This records `permission_profile` on user-turn rollout events,
    reconstructs it through history/state extraction, and updates rollout
    reconstruction and related fixtures to keep the field explicit.
    
    ## Verification
    
    - `cargo test -p codex-core --test all permissions_messages --
    --nocapture`
    - `cargo test -p codex-core --test all request_permissions --
    --nocapture`

    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18281).
    * #18288
    * #18287
    * #18286
    * #18285
    * #18284
    * #18283
    * #18282
    * __->__ #18281
  • clients: send permission profiles to app-server (#18280)
    ## Why
    
    After app-server can accept `PermissionProfile`, first-party clients
    should stop preferring legacy sandbox fields when canonical permission
    information is available. This keeps the migration moving without
    removing legacy compatibility yet.
    
    The client side still has mixed surfaces during the stack: embedded
    thread start/resume/fork and exec initial turns can derive a profile
    directly from local config, while TUI remote sessions and some
    turn-start paths only have a legacy/server-context-safe sandbox
    projection. Those paths keep sending legacy sandbox fields rather than
    synthesizing or sending lossy/local-only profiles.
    
    ## What changed
    
    - Sends `permissionProfile` from exec and embedded TUI thread
    start/resume/fork requests when config has a representable profile.
    - Keeps legacy sandbox fallback for external sandbox policies, TUI
    remote thread lifecycle requests, and TUI turn-start requests that do
    not yet carry the active profile.
    - Sends the actual config-derived `permissionProfile` for exec initial
    turns instead of rebuilding one from the legacy sandbox projection.
    - Stores response `permissionProfile` as optional in TUI session state
    so external sandbox responses and compatibility payloads preserve
    `null`.
    - Updates tests for request construction and response mapping.
    
    ## Verification
    
    - `cargo check --tests -p codex-tui -p codex-exec`
    - `cargo test -p codex-tui app_server_session -- --nocapture`
    - `cargo test -p codex-exec thread_start_params -- --nocapture`
    - `cargo test -p codex-tui
    app_server_session::tests::thread_lifecycle_params -- --nocapture`
    - `just fix -p codex-tui -p codex-exec`
    - `just fix -p codex-tui`

    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18280).
    * #18288
    * #18287
    * #18286
    * #18285
    * #18284
    * #18283
    * #18282
    * #18281
    * __->__ #18280
  • exec-server: require explicit filesystem sandbox cwd (#19046)
    ## Why
    
    This is a cleanup PR for the `PermissionProfile` migration stack. #19016
    fixed remote exec-server sandbox contexts so Docker-backed filesystem
    requests use a request/container `cwd` instead of leaking the local test
    runner `cwd`. That exposed the broader API problem:
    `FileSystemSandboxContext::new(SandboxPolicy)` could still reconstruct
    filesystem permissions by reading the exec-server process cwd with
    `AbsolutePathBuf::current_dir()`.
    
    That made `cwd`-dependent legacy entries, such as `:cwd`,
    `:project_roots`, and relative deny globs, depend on ambient process
    state instead of the request sandbox `cwd`. As later PRs make
    `PermissionProfile` the primary permissions abstraction, sandbox
    contexts should be explicit about whether they carry a request `cwd` or
    are profile-only. Removing the implicit constructor prevents new call
    sites from accidentally rebuilding permissions against the wrong `cwd`.
    
    ## What changed
    
    - Removed `FileSystemSandboxContext::new(SandboxPolicy)`.
    - Kept production callers on explicit constructors:
    `from_legacy_sandbox_policy(..., cwd)`, `from_permission_profile(...)`,
    and `from_permission_profile_with_cwd(...)`.
    - Updated exec-server test helpers to construct `PermissionProfile`
    values directly instead of routing through legacy `SandboxPolicy`
    projections.
    - Updated the environment regression test to use an explicit restricted
    profile with no synthetic `cwd`.
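
    The design point, that a sandbox context should either carry an explicit
    request `cwd` or be profile-only, can be sketched as follows. The
    `SandboxContext` type and its constructors are simplified, hypothetical
    stand-ins for `FileSystemSandboxContext`; only the `:cwd` entry name
    comes from this description.

    ```rust
    use std::path::{Path, PathBuf};

    /// Hypothetical stand-in for a sandbox context that never reads the process cwd.
    #[derive(Debug, PartialEq)]
    struct SandboxContext {
        cwd: Option<PathBuf>,
        readable_roots: Vec<PathBuf>,
    }

    impl SandboxContext {
        /// Resolves `:cwd` and relative entries against the cwd supplied by the request.
        fn with_cwd(entries: &[&str], cwd: &Path) -> Self {
            let readable_roots = entries
                .iter()
                .map(|entry| match *entry {
                    ":cwd" => cwd.to_path_buf(),
                    other => cwd.join(other),
                })
                .collect();
            SandboxContext { cwd: Some(cwd.to_path_buf()), readable_roots }
        }

        /// Profile-only context: rejects entries that would need a cwd to resolve.
        fn profile_only(entries: &[&str]) -> Result<Self, String> {
            let mut readable_roots = Vec::new();
            for entry in entries {
                if *entry == ":cwd" || Path::new(entry).is_relative() {
                    return Err(format!("entry `{entry}` needs an explicit cwd"));
                }
                readable_roots.push(PathBuf::from(entry));
            }
            Ok(SandboxContext { cwd: None, readable_roots })
        }
    }
    ```

    Because there is no constructor that falls back to the ambient process
    directory, a caller that forgets to pass a `cwd` gets an error instead of
    permissions resolved against the wrong path.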
    
    ## Verification
    
    - `cargo test -p codex-exec-server`
    - `just fix -p codex-exec-server`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19046).
    * #18288
    * #18287
    * #18286
    * #18285
    * #18284
    * #18283
    * #18282
    * #18281
    * #18280
    * __->__ #19046
  • Rebrand approvals reviewer config to auto-review (#18504)
    ### Why
    
    Auto-review is the user-facing name for the approvals reviewer, but the
    config/API value still exposed the old `guardian_subagent` name. That
    made new configs and generated schemas point users at Guardian
    terminology even though the intended product surface is Auto-review.
    
    This PR updates the external `approvals_reviewer` value while preserving
    compatibility for existing configs and clients.
    
    ### What changed
    
    - Makes `auto_review` the canonical serialized value for
    `approvals_reviewer`.
    - Keeps `guardian_subagent` accepted as a legacy alias.
    - Keeps `user` accepted and serialized as `user`.
    - Updates generated config and app-server schemas so
    `approvals_reviewer` includes:
      - `user`
      - `auto_review`
      - `guardian_subagent`
    - Updates app-server README docs for the reviewer value.
    - Updates analytics and config requirements tests for the canonical
    `auto_review` value.
    
    
    ### Compatibility
    
    Existing configs and API payloads using:
    
    ```toml
    approvals_reviewer = "guardian_subagent"
    ```
    
    continue to load and map to the Auto-review reviewer behavior. 
    
    New serialization emits: 
    ```toml
    approvals_reviewer = "auto_review" 
    ```
    
    This PR intentionally does not rename the `[features].guardian_approval`
    key or broad internal Guardian symbols. Those are split out for a
    follow-up PR to keep this migration small and avoid touching large
    TUI/internal surfaces.
    
    ### Verification

    - `cargo test -p codex-protocol
    approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
    - `cargo test -p codex-app-server-protocol
    approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
  • Update bundled OpenAI Docs skill freshness check (#19043)
    ## Summary
    
    Sync the bundled `openai-docs` system skill with the already-merged
    `openai/skills` update from https://github.com/openai/skills/pull/360.
    
    Codex bundles system skills from `codex-rs/skills/src/assets/samples`,
    so this PR copies the same GPT-5.4 OpenAI Docs skill update into the
    Codex app/CLI bundle path.
    
    ## Changes
    
    - Add the latest-model resolver script to the bundled `openai-docs`
    skill.
    - Route model upgrade and prompt-upgrade requests through remote
    latest-model metadata when current guidance is needed.
    - Rename bundled fallback references to `upgrade-guide.md` and
    `prompting-guide.md`.
    - Keep the bundled fallback guidance GPT-5.4-only.
    
    ## Validation
    
    - Verified this bundled skill is byte-for-byte identical to
    `openai/skills@origin/main` `skills/.system/openai-docs`.
    - Ran the resolver locally and confirmed it returns `gpt-5.4` /
    `gpt-5p4`.
  • [Codex] Register browser requirements feature keys (#18956)
    ## Summary
    - register `in_app_browser` and `browser_use` as stable feature keys
    - allow requirements/MDM feature requirements to pin those desktop
    browser controls
    - add coverage for browser requirements being accepted by config loading
    
    ## Testing
    - `cargo fmt --all` (`just fmt` unavailable locally; rustfmt warned
    about nightly-only `imports_granularity` config)
    - `cargo test -p codex-features`
    - `cargo test -p codex-core browser_feature_requirements_are_valid`
    - Tested manually by setting the keys in `requirements.toml` and
    confirming after an app restart that the state reflected the setting (at
    the time, this hid the `Browser Use` setting when the enterprise setting
    was set to `false`)
  • Overlay state DB git metadata for filtered thread lists (#19036)
    ## Summary
    - Factor the state DB `ThreadMetadata` to rollout `ThreadItem` mapping
    into a shared helper used by both DB pages and filesystem overlays
    - Generalize filtered filesystem list overlays to fill missing thread
    list metadata from the state-derived `ThreadItem`, while preserving
    filesystem `path` and `thread_id`
    - Add coverage for the merge behavior so existing filesystem values are
    not overwritten and future `ThreadItem` fields require an explicit
    decision
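
    The merge rule, and the "future fields require an explicit decision"
    guard, can be sketched with exhaustive struct destructuring. The
    `ThreadItem` below is a reduced, hypothetical stand-in: only `path` and
    `thread_id` are taken from this description, and the two git fields are
    placeholders for whatever metadata the real type carries.

    ```rust
    /// Reduced stand-in for the rollout `ThreadItem` (field set is illustrative).
    #[derive(Debug, Clone, PartialEq)]
    struct ThreadItem {
        path: String,
        thread_id: String,
        git_branch: Option<String>,
        git_sha: Option<String>,
    }

    /// Fills gaps in the filesystem item from the state-derived item.
    fn overlay(fs: ThreadItem, state: ThreadItem) -> ThreadItem {
        // Exhaustive destructuring (no `..`): adding a field to `ThreadItem`
        // stops this from compiling until someone decides how to merge it.
        let ThreadItem { path, thread_id, git_branch, git_sha } = fs;
        let ThreadItem {
            path: _,
            thread_id: _,
            git_branch: state_branch,
            git_sha: state_sha,
        } = state;

        ThreadItem {
            // Filesystem identity always wins.
            path,
            thread_id,
            // Existing filesystem values are kept; only `None` is filled in.
            git_branch: git_branch.or(state_branch),
            git_sha: git_sha.or(state_sha),
        }
    }
    ```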
    
    ## Testing
    - `just fmt` from `codex-rs`
    - `git diff --check -- codex-rs/rollout/src/recorder.rs
    codex-rs/rollout/src/recorder_tests.rs`
    - Attempted `cargo test -p codex-rollout thread_item_metadata` from
    `codex-rs`; blocked in dependency fetch/setup after updating crates.io
    and git submodules `https://github.com/livekit/protocol` and
    `https://chromium.googlesource.com/libyuv/libyuv`, so the focused tests
    did not run
  • exec-server: expose arg0 alias root to fs sandbox (#19016)
    ## Why
    
    The post-merge `rust-ci-full` run for #18999 still failed the Ubuntu
    remote `suite::remote_env` sandboxed filesystem tests. That run checked
    out merge commit `ddde50c611e4800cb805f243ed3c50bbafe7d011`, so the arg0
    guard lifetime fix was present.
    
    The Docker-backed failure had two remaining pieces:
    
    - The sandboxed filesystem helper needs to execute Codex through the
    `codex-linux-sandbox` arg0 alias path. The helper sandbox was only
    granting read access to the real Codex executable parent, so the alias
    parent also has to be visible inside the helper sandbox.
    - The remote-env tests were building sandbox contexts with
    `FileSystemSandboxContext::new()`, which captures the local test runner
    cwd. In the Docker remote exec-server, that host checkout path does not
    exist, so spawning the filesystem helper failed with `No such file or
    directory` before the helper could process the request.
    
    ## What Changed
    
    - Track all helper runtime read roots instead of a single root.
    - Add both the real Codex executable parent and the
    `codex-linux-sandbox` alias parent to sandbox readable roots.
    - Avoid sending an unused local cwd in remote filesystem sandbox
    contexts when the permission profile has no cwd-dependent entries.
    - Build the Docker remote-env test sandbox contexts with a cwd path that
    exists inside the container.
    - Add unit coverage for the alias-parent root and remote sandbox cwd
    handling.
    
    ## Verification
    
    - `cargo test -p codex-exec-server`
    - `cargo test -p codex-core
    remote_test_env_sandboxed_read_allows_readable_root`
    - `just fix -p codex-exec-server`
    - `just fix -p codex-core`
  • Fix MCP permission policy sync (#19033)
    ###### Why/Context/Summary
    
    Repro: start a session outside Full Access, switch permissions to Full
    Access, then submit a new turn that triggers MCP/CUA permission
    handling.
    
    The turn used the live Full Access `SessionConfiguration`, but the MCP
    coordinator was still synced from the stale `original_config_do_not_use`
    / per-turn config copy. That left the coordinator with an old sandbox
    policy, so empty MCP permission elicitations could be denied instead of
    auto-accepted.
    
    Fix: update/rebuild the MCP connection manager from the live
    turn/session approval and sandbox policy fields.
    
    ###### Test plan
    
    ```sh
    just fmt
    cargo test -p codex-core --lib
    cargo test -p codex-core --lib mcp_tool_call::tests
    ```
  • feat: add guardian network approval trigger context (#18197)
    ## Summary
    
    Give guardian network-access reviews the command context that triggered
    a managed-network approval. The prompt JSON now includes the originating
    tool call id, tool name, command argv, cwd, sandbox permissions,
    additional permissions, justification, and tty state when a single
    active tool call can be attributed.
    
    The implementation keeps the trigger shape canonical by serializing
    `GuardianNetworkAccessTrigger` directly and lets each runtime build that
    trigger from its `ToolCtx`. Non-guardian approval prompts avoid cloning
    the full trigger payload.
    
    ## UX changes
    
    Guardian network-access reviews now include a `trigger` object that
    explains what command caused the network approval. Instead of seeing
    only the requested host, the guardian reviewer can also see the
    originating tool call, argv, working directory, sandbox mode,
    justification, and tty state.
    
    Example payload the guardian reviewer can see:
    
    ```json
    {
      "tool": "network_access",
      "target": "https://api.github.com:443",
      "host": "api.github.com",
      "protocol": "https",
      "port": 443,
      "trigger": {
        "callId": "call_abc123",
        "toolName": "shell",
        "command": ["gh", "api", "/repos/openai/codex/pulls/18197"],
        "cwd": "/workspace/codex",
        "sandboxPermissions": "require_escalated",
        "justification": "Fetch PR metadata from GitHub.",
        "tty": false
      }
    }
    ```
    
    The network review itself remains scoped to the network decision:
    `target_item_id` stays `null`. `trigger.callId` is attribution context
    only, so clients can still distinguish network reviews from
    item-targeted command reviews.
    
    ## Verification
    
    - Added coverage for serializing network trigger context in guardian
    approval JSON.
    - Added regression coverage that network guardian reviews do not reuse
    `trigger.callId` as `target_item_id`.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [2/4] Implement executor HTTP request runner (#18582)
    ### Why
    Remote streamable HTTP MCP needs the executor to perform ordinary HTTP
    requests on the executor side. This keeps network placement aligned with
    `experimental_environment = "remote"` without adding MCP-specific
    executor APIs.
    
    ### What
    - Add an executor-side `http/request` runner backed by `reqwest`.
    - Validate request method and URL scheme, preserving the transport
    boundary at plain HTTP.
    - Return buffered responses for ordinary calls and emit ordered
    `http/request/bodyDelta` notifications for streaming responses.
    - Register the request handler in the exec-server router.
    - Document the runner entrypoint, conversion helpers, body-stream
    bridge, notification sender, timeout behavior, and new integration-test
    helpers.
    - Add exec-server integration tests with the existing websocket harness
    and a local TCP HTTP peer for buffered and streamed responses, with
    comments spelling out what each test proves and its
    setup/exercise/assert phases.
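
    As a rough illustration of the method and scheme validation step, the
    sketch below rejects anything outside a small allow-list before a
    request would be sent. Everything in it is an assumption: the
    `validate_request` name, the `RequestError` type, and the specific
    allowed methods and schemes are hypothetical and not taken from the
    exec-server code.

    ```rust
    #[derive(Debug, PartialEq)]
    enum RequestError {
        UnsupportedMethod(String),
        UnsupportedScheme(String),
        MissingScheme,
    }

    /// Checks the method and URL scheme against assumed allow-lists.
    fn validate_request(method: &str, url: &str) -> Result<(), RequestError> {
        // Assumed allow-lists; the real runner may accept a different set.
        const METHODS: [&str; 3] = ["GET", "POST", "DELETE"];
        const SCHEMES: [&str; 2] = ["http", "https"];

        let method_upper = method.to_ascii_uppercase();
        if !METHODS.contains(&method_upper.as_str()) {
            return Err(RequestError::UnsupportedMethod(method.to_owned()));
        }

        let (scheme, _rest) = url.split_once("://").ok_or(RequestError::MissingScheme)?;
        let scheme = scheme.to_ascii_lowercase();
        if !SCHEMES.contains(&scheme.as_str()) {
            return Err(RequestError::UnsupportedScheme(scheme));
        }

        Ok(())
    }
    ```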
    
    ### Stack
    1. #18581 protocol
    2. #18582 runner
    3. #18583 RMCP client
    4. #18584 manager wiring and local/remote coverage
    
    ### Verification
    - `just fmt`
    - `cargo check -p codex-exec-server -p codex-rmcp-client --tests`
    - `cargo check -p codex-core --test all` compile-only
    - `git diff --check`
    - Online full CI is running from the `full-ci` branch, including the
    remote Rust test job.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>