Commit Graph

616 Commits

  • [codex] Use model-advertised reasoning effort order (#26446)
    ## Summary
    - preserve the model catalog order for app-server
    `supportedReasoningEfforts` and document that client contract
    - render TUI reasoning choices in the advertised order
    - step reasoning shortcuts by adjacent list position instead of deriving
    order from known effort names
    - anchor unsupported configured values to the advertised default, or the
    first option when needed
    - remove canonical effort ordering helpers and the unused upgrade effort
    mapping
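
    The stepping and anchoring rules above can be sketched as follows. This
    is a minimal illustration, not the code from this PR: `position_of`,
    `anchor_index`, and `step_effort` are hypothetical names, and clamping
    at the ends of the list (rather than wrapping) and stepping from the
    anchor position are assumptions.

    ```rust
    /// Position of `wanted` in the model-advertised list, if present.
    fn position_of(advertised: &[&str], wanted: &str) -> Option<usize> {
        advertised.iter().position(|e| *e == wanted)
    }

    /// Hypothetical helper: index to step from. A value the model does not
    /// advertise anchors to the advertised default, else to the first option.
    fn anchor_index(advertised: &[&str], current: &str, default: Option<&str>) -> Option<usize> {
        position_of(advertised, current)
            .or_else(|| default.and_then(|d| position_of(advertised, d)))
            .or(if advertised.is_empty() { None } else { Some(0) })
    }

    /// Step to the adjacent entry in advertised order. No knowledge of
    /// effort names is used; clamping at the ends is an assumption.
    fn step_effort<'a>(
        advertised: &[&'a str],
        current: &str,
        default: Option<&str>,
        up: bool,
    ) -> Option<&'a str> {
        let idx = anchor_index(advertised, current, default)?;
        let next = if up {
            // `idx` exists, so the list is non-empty and `len() - 1` is safe.
            (idx + 1).min(advertised.len() - 1)
        } else {
            idx.saturating_sub(1)
        };
        Some(advertised[next])
    }
    ```

    Because only list positions are consulted, a catalog that advertises
    `["low", "custom-x", "high"]` steps from `low` to `custom-x` even
    though `custom-x` is not a known effort name.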
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
    
    Stacked on #26444.
  • [codex] Support model-defined reasoning efforts (#26444)
    ## Summary
    - accept non-empty model-defined reasoning effort values while
    preserving built-in effort behavior
    - propagate the non-Copy effort type through core, app-server, TUI,
    telemetry, and persistence call sites
    - preserve string wire encoding and expose an open-string schema for
    clients
    - update model selection and shortcut behavior for model-advertised
    effort values
    
    ## Root cause
    `ReasoningEffort` gained a string-backed custom variant, so it could no
    longer implement `Copy` or rely on derived closed-enum serialization.
    Existing consumers still moved effort values from shared references and
    assumed a fixed built-in value set.
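
    A reduced sketch of why the custom variant forces these changes. The
    built-in variant list, `parse`, `as_str`, `Settings`, and `effort_of`
    below are assumptions for illustration and do not mirror the real
    type.

    ```rust
    // `Copy` cannot be derived once a variant owns a `String`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum ReasoningEffort {
        Low,
        Medium,
        High,
        /// Non-empty, model-defined value kept as its wire string.
        Custom(String),
    }

    impl ReasoningEffort {
        /// Accept built-in names and any other non-empty string.
        fn parse(wire: &str) -> Option<Self> {
            match wire {
                "" => None,
                "low" => Some(Self::Low),
                "medium" => Some(Self::Medium),
                "high" => Some(Self::High),
                other => Some(Self::Custom(other.to_owned())),
            }
        }

        /// String wire encoding is preserved for every variant.
        fn as_str(&self) -> &str {
            match self {
                Self::Low => "low",
                Self::Medium => "medium",
                Self::High => "high",
                Self::Custom(value) => value.as_str(),
            }
        }
    }

    struct Settings {
        effort: ReasoningEffort,
    }

    /// Call sites that used to copy the value out of a shared reference
    /// now have to clone it explicitly.
    fn effort_of(settings: &Settings) -> ReasoningEffort {
        settings.effort.clone()
    }
    ```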
    
    ## Validation
    - `just fmt`
    - Local tests and compilation were not run per request; relying on CI.
  • Expose local image paths to models (#25944)
    ## Why
    
    Local image attachments include image bytes, but the adjacent
    model-visible label omits the source path. Exposing the path lets
    model-selected workflows refer back to the intended local image
    explicitly.
    
    ## What changed
    
    - Include an escaped `path` attribute in model-visible local image
    opening tags.
    - Reuse the path-aware marker generator in rollout coverage.
    - Update protocol, replay, and rollout coverage for the new request
    shape.
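
    For illustration only, an escaped `path` attribute could be produced
    along these lines. The tag name, the `name` attribute, and both
    function names are hypothetical; the exact marker format emitted by
    the real generator is not shown in this description.

    ```rust
    /// Escape the characters that would break a double-quoted attribute.
    fn escape_attr(value: &str) -> String {
        let mut out = String::with_capacity(value.len());
        for ch in value.chars() {
            match ch {
                '&' => out.push_str("&amp;"),
                '<' => out.push_str("&lt;"),
                '>' => out.push_str("&gt;"),
                '"' => out.push_str("&quot;"),
                _ => out.push(ch),
            }
        }
        out
    }

    /// Hypothetical opening tag for a labeled local image.
    fn local_image_open_tag(name: &str, path: &str) -> String {
        format!(
            "<image name=\"{}\" path=\"{}\">",
            escape_attr(name),
            escape_attr(path)
        )
    }
    ```

    Escaping matters here because local paths may legally contain quotes
    or ampersands, which would otherwise terminate or corrupt the
    attribute.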
    
    ## Validation
    
    - `just fmt`
    - `just test -p codex-protocol`
    - `just test -p codex-core skips_local_image_label_text`
    - `just test -p codex-core copy_paste_local_image_persists_rollout_request_shape`
    - `git diff --check`
  • Propagate permission approval environment id (#25862)
    ## Stack
    
    1. #25850 - Key request-permission grants by environment: stores and
    applies sticky permission grants per environment id.
    2. #25858 - Add `environmentId` to `request_permissions`: lets the model
    target a selected environment and resolves relative permission paths
    against it.
    3. This PR (#25862) - Propagate permission approval environment id:
    carries the selected environment id through approval events, app-server
    requests, TUI prompts, and delegate forwarding.
    4. #25867 - Add remote request permissions integration coverage:
    verifies the selected remote environment across request, approval, grant
    reuse, and exec.
    
    This PR is stacked on #25858, and #25867 is stacked on this PR.
    
    ## Why
    
    #25858 lets the model bind a `request_permissions` call to a selected
    environment, but the approval event and client-facing request still
    needed to carry that binding. For CCA, the user-facing prompt and
    delegated approval path should know which environment the grant applies
    to instead of relying on cwd alone.
    
    ## What Changed
    
    - Added optional `environmentId` to `RequestPermissionsEvent`.
    - Emit the selected environment id from core permission approval events.
    - Preserve the environment id through delegate forwarding, including
    cwd-based delegated requests.
    - Added `environmentId` to app-server permission approval params,
    generated schema/TypeScript artifacts, and README examples.
    - Preserve and display the environment id in TUI permission approval
    prompts.
    - Updated focused core, app-server protocol, and TUI conversion
    coverage.
    
    ## Testing
    
    Not run locally per instruction. Performed read-only `git diff --check`.
  • Add environmentId to request_permissions (#25858)
    ## Stack
    
    1. #25850 - Key request-permission grants by environment: stores and
    applies sticky permission grants per environment id.
    2. This PR (#25858) - Add `environmentId` to `request_permissions`: lets
    the model target a selected environment and resolves relative permission
    paths against it.
    3. #25862 - Propagate permission approval environment id: carries the
    selected environment id through approval events, app-server requests,
    TUI prompts, and delegate forwarding.
    4. #25867 - Add remote request permissions integration coverage:
    verifies the selected remote environment across request, approval, grant
    reuse, and exec.
    
    This PR is stacked on #25850; #25862 and #25867 are stacked on this PR.
    
    ## Why
    
    #25850 made request-permission grants internally environment-keyed, but
    the model-facing `request_permissions` tool could still only target the
    primary environment. For CCA and multi-environment turns, the tool needs
    an explicit way to bind a permission request to a selected attached
    environment before resolving relative paths.
    
    ## What Changed
    
    - Added optional `environmentId` to `RequestPermissionsArgs`, with
    `environment_id` accepted as an alias.
    - Exposed `environmentId` in the `request_permissions` tool schema and
    description.
    - Resolve the selected environment before parsing filesystem permission
    paths, so relative paths bind to the selected environment cwd.
    - Route validated tool calls through
    `request_permissions_for_environment` directly instead of duplicating
    environment lookup in `Session::request_permissions`.
    - Reject unknown environment ids with a model-facing error.
    - Updated focused request-permissions and Guardian call sites for the
    new optional field.
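
    The ordering described above (select the environment first, then
    resolve relative paths against its cwd, and reject unknown ids) can be
    sketched as follows. `Environment` and `resolve_permission_path` are
    hypothetical names, and falling back to the primary environment when
    `environmentId` is omitted is an assumption based on the previous
    behavior.

    ```rust
    use std::collections::HashMap;
    use std::path::{Path, PathBuf};

    /// Hypothetical stand-in for an attached environment.
    struct Environment {
        cwd: PathBuf,
    }

    /// Resolve one filesystem permission path for a `request_permissions`
    /// call. The environment is chosen before the path is interpreted.
    fn resolve_permission_path(
        environments: &HashMap<String, Environment>,
        primary_id: &str,
        environment_id: Option<&str>,
        raw_path: &str,
    ) -> Result<PathBuf, String> {
        let id = environment_id.unwrap_or(primary_id);
        let env = environments
            .get(id)
            // Model-facing error for ids that are not attached.
            .ok_or_else(|| format!("unknown environment id: {id}"))?;
        let path = Path::new(raw_path);
        Ok(if path.is_absolute() {
            path.to_path_buf()
        } else {
            // Relative paths bind to the selected environment's cwd.
            env.cwd.join(path)
        })
    }
    ```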
    
    ## Testing
    
    Not run locally per instruction.
  • core: derive built-in permission profiles from raw policies (#25739)
    ## Why
    
    Permission profiles that extend a built-in profile should behave like
    other TOML inheritance: parent entries provide defaults, and child keys
    override matching fields before the profile is compiled.
    
    That was not true for `:workspace`. Previously, a profile with `extends
    = ":workspace"` seeded the compiled runtime
    `PermissionProfile::workspace_write()` policy and then appended child
    filesystem entries. A child override such as `":tmpdir" = "read"`
    therefore left the inherited `":tmpdir" = "write"` entry in the final
    policy. Since same-target `write` wins over `read` during runtime
    resolution, the child override was ineffective.
    
    This also needs a clear source of truth for the built-in profiles. The
    protocol-level sandbox policy constructors now define the raw built-in
    filesystem entries, and both `PermissionProfile` presets and
    config-profile inheritance derive from those same values.
    
    ## What Changed
    
    - Add a canonical `FileSystemSandboxPolicy::read_only()` constructor
    while keeping the read-only and workspace-write raw filesystem entries
    explicit and independent.
    - Derive `PermissionProfile::read_only()` from
    `FileSystemSandboxPolicy::read_only()`;
    `PermissionProfile::workspace_write()` continues to derive from
    `FileSystemSandboxPolicy::workspace_write()`.
    - Build extensible `:read-only` and `:workspace` parent profiles by
    projecting those canonical sandbox policies into
    `PermissionProfileToml`, then merge user overrides at the TOML layer
    before compilation.
    - Add config parsing support for `:slash_tmp` so the built-in
    `:workspace` parent can be expressed in the same TOML-shaped filesystem
    table as user profiles.
    - Document that `PermissionsToml::resolve_profile()` returns an
    already-merged `PermissionProfileToml`, and return that profile directly
    after removing the resolved-profile wrapper.
    - Extend the config test for `extends = ":workspace"` to assert that
    inherited `":slash_tmp" = "write"` is preserved and that a child
    `":tmpdir" = "read"` entry replaces the inherited `write` entry.
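
    The difference between the old and new behavior can be reduced to a
    small model. This sketch only covers `read` and `write` on string
    targets; `Access`, `append_then_resolve`, and `merge_then_compile` are
    hypothetical names and not the real config types.

    ```rust
    use std::collections::BTreeMap;

    /// Declaration order makes `Write` compare greater than `Read`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    enum Access {
        Read,
        Write,
    }

    /// Old shape: child entries are appended to the compiled parent, and
    /// same-target `write` wins over `read` at resolution time.
    fn append_then_resolve(
        parent: &[(&str, Access)],
        child: &[(&str, Access)],
    ) -> BTreeMap<String, Access> {
        let mut out = BTreeMap::new();
        for (target, access) in parent.iter().chain(child.iter()) {
            let slot = out.entry(target.to_string()).or_insert(*access);
            *slot = (*slot).max(*access);
        }
        out
    }

    /// New shape: merge at the table layer first, so a child key replaces
    /// the matching parent key before anything is compiled.
    fn merge_then_compile(
        parent: &[(&str, Access)],
        child: &[(&str, Access)],
    ) -> BTreeMap<String, Access> {
        let mut merged: BTreeMap<String, Access> = parent
            .iter()
            .map(|(target, access)| (target.to_string(), *access))
            .collect();
        for (target, access) in child {
            merged.insert(target.to_string(), *access);
        }
        merged
    }
    ```

    With a parent of `":tmpdir" = "write"` and `":slash_tmp" = "write"`
    and a child of `":tmpdir" = "read"`, the first function keeps
    `:tmpdir` writable while the second downgrades it to `read` and leaves
    `:slash_tmp` untouched.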
    
    ## Verification
    
    - `just test -p codex-config`
    - `just test -p codex-protocol`
    - `just test -p codex-core permissions_profiles_resolve_extends_parent_first_with_child_overrides`
    - `just test -p codex-core default_permissions_profile_can_extend_builtin_workspace`
    - `just test -p codex-core`
      - Result: 2596 passed, 4 failed, 1 timed out.
      - The failures were existing sandbox/environment-sensitive tests
        unrelated to this permissions change:
        - `suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`
        - `suite::user_shell_cmd::user_shell_command_history_is_persisted_and_shared_with_model`
        - `suite::abort_tasks::interrupt_persists_turn_aborted_marker_in_next_request`
        - `suite::abort_tasks::interrupt_tool_records_history_entries`
        - `thread_manager::tests::start_thread_uses_all_default_environments_from_codex_home`
  • Add multi-agent runtime metadata types (#25720)
    Stack split from #25708. Original PR intentionally left open. This first
    PR adds the multi-agent runtime metadata types and catalog plumbing used
    by the rest of the stack.
  • feat: show enterprise monthly credit limits in status (#24812)
    ## Summary
    
    Enterprise users can have an effective monthly credit limit, but Codex
    `/status` currently drops that metadata from the account-usage response.
    
    This change adds the optional `spend_control.individual_limit`
    projection to the existing rate-limit snapshot flow. The backend client
    reads the monthly limit, app-server exposes it as `individualLimit`, and
    the TUI renders a `Monthly credit limit` row through the existing
    progress-bar renderer.
    
    When the backend does not return an effective monthly limit, existing
    rate-limit behavior is unchanged.
    
    ## Existing backend state
    
    The account-usage backend already returns the effective monthly limit
    and current usage together:
    
    ```json
    {
      "spend_control": {
        "reached": false,
        "individual_limit": {
          "limit": "25000",
          "used": "8000",
          "remaining": "17000",
          "used_percent": 32,
          "remaining_percent": 68,
          "reset_after_seconds": 86400,
          "reset_at": 1778137680
        }
      }
    }
    ```
    
    Before this change, Codex projected rolling `primary` and `secondary`
    windows plus `credits`. It ignored `spend_control.individual_limit`, so
    app-server clients and `/status` could not render the monthly cap.
    
    The updated flow is:
    
    ```text
    account usage backend
      -> backend-client reads spend_control.individual_limit
      -> existing rate-limit snapshot carries optional individual_limit
      -> app-server exposes optional individualLimit
      -> TUI renders Monthly credit limit
    ```
    
    ## App-server contract
    
    `account/rateLimits/read` and sparse `account/rateLimits/updated`
    notifications now include an additive nullable
    `rateLimits.individualLimit` field:
    
    ```json
    {
      "individualLimit": {
        "limit": "25000",
        "used": "8000",
        "remainingPercent": 68,
        "resetsAt": 1778137680
      }
    }
    ```
    
    In an `account/rateLimits/read` response, `null` means no monthly limit
    is available. `account/rateLimits/updated` remains a sparse rolling
    notification: clients merge available values into their most recent
    `account/rateLimits/read` snapshot or refetch. Nullable account metadata
    in a rolling notification does not clear a previously observed value.
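
    On the client side, the read-versus-update semantics above might look
    like this. It is a sketch under stated assumptions: the struct and
    function names are hypothetical, only the monthly limit is modeled,
    and the field types are guesses based on the JSON example.

    ```rust
    #[derive(Debug, Clone, PartialEq)]
    struct IndividualLimit {
        limit: String,
        used: String,
        remaining_percent: u8,
        resets_at: i64,
    }

    /// Cached snapshot; other rate-limit fields are omitted for brevity.
    #[derive(Debug, Clone, PartialEq, Default)]
    struct RateLimits {
        individual_limit: Option<IndividualLimit>,
    }

    /// `account/rateLimits/read` is authoritative: `None` clears the limit.
    fn apply_read(cache: &mut RateLimits, read: RateLimits) {
        *cache = read;
    }

    /// `account/rateLimits/updated` is sparse: `None` means "not included",
    /// so a previously observed value is kept.
    fn apply_update(cache: &mut RateLimits, update: RateLimits) {
        if update.individual_limit.is_some() {
            cache.individual_limit = update.individual_limit;
        }
    }
    ```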
    
    ## Design decisions
    
    - Extend the existing rate-limit snapshot instead of introducing a
    separate request or wire-level update protocol.
    - Keep the Codex projection narrow: `/status` needs the effective limit,
    current usage, remaining percentage, and reset timestamp.
    - Render the monthly row through the existing progress-bar renderer,
    with one optional detail line for `8,000 of 25,000 credits used`.
    - Keep the backend response optional so existing accounts and older
    usage states preserve their current behavior.
    - Preserve cached monthly metadata when sparse rolling notifications
    omit it. Live account-usage reads remain authoritative and can clear a
    removed limit.
    
    ## Visual evidence
    
    ```text
     Monthly credit limit:   [██████████████░░░░░░] 68% left (resets 07:08 on 7 May)
                             8,000 of 25,000 credits used
    ```
    
    Snapshot:
    `codex-rs/tui/src/status/snapshots/codex_tui__status__tests__status_snapshot_includes_enterprise_monthly_credit_limit.snap`
    
    ## Testing
    
    Tests: generated app-server schema verification, protocol tests,
    backend-client tests, app-server integration coverage, TUI snapshot
    coverage, formatting, and workspace lint cleanup.
  • [codex-rs] auto-review model override (#23767)
    ## Why
    
    Guardian auto-review normally uses the provider-preferred review model
    when one is available. Some parent models need model-catalog metadata to
    select a different review model while keeping older `/models` payloads
    compatible when that metadata is absent.
    
    ## What changed
    
    - Added optional `ModelInfo::auto_review_model_override` metadata to the
    public model payload as a review-model slug.
    - Updated Guardian review model selection to prefer the catalog override
    when present, while preserving the existing provider preferred-model
    path and parent-model fallback when it is omitted.
    - Added focused Guardian coverage for override and no-override model
    selection.
    - Added an `auto_review` core integration suite test that loads override
    metadata from a remote model catalog path and asserts the strict
    auto-review `/responses` request uses the catalog-selected review model.
    - Updated existing `ModelInfo` fixtures and local catalog constructors
    for the new optional field.
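
    The selection order described above reduces to a short preference
    chain. `select_review_model` is a hypothetical name, and plain string
    slugs stand in for the real model types.

    ```rust
    /// Prefer the catalog override, then the provider-preferred review
    /// model, then fall back to the parent model.
    fn select_review_model<'a>(
        catalog_override: Option<&'a str>,
        provider_preferred: Option<&'a str>,
        parent_model: &'a str,
    ) -> &'a str {
        catalog_override
            .or(provider_preferred)
            .unwrap_or(parent_model)
    }
    ```

    When the override is absent, the result matches the behavior before
    this change, which is what keeps older `/models` payloads compatible.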
    
    ## Validation
    
    - `cargo test -p codex-protocol model_info_defaults_availability_nux_to_none_when_omitted`
    - `cargo test -p codex-core guardian_review_uses_`
    - `cargo test -p codex-core remote_model_override_uses_catalog_model_for_strict_auto_review --test all`
    - `just fix -p codex-protocol`
    - `just fix -p codex-core`
    - `just fmt`
    - `git diff --check`
  • store and expose parent_thread_id on Threads (#25113)
    ## Why
    
    Review discussion on
    https://github.com/openai/codex/pull/24161#discussion_r3325692763
    revealed a subagent data-modeling issue: we overloaded
    `forked_from_id` to also mean `parent_thread_id`. That is incorrect,
    because guardian and review subagents can be subagents without forking
    the main thread's history.
    
    The solution here is to explicitly store a new `parent_thread_id` on
    `SessionMeta`, alongside `forked_from_id` which already exists. While
    we're at it, also expose it in the app-server protocol on the `Thread`
    object.
    
    A thread->subagent relationship and a fork of thread history are
    orthogonal concepts.
    
    ## What Changed
    
    - Added top-level `parent_thread_id` persistence on `SessionMeta` and
    runtime/session plumbing through `SessionConfiguredEvent`,
    `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`,
    `TurnContext`, and `ModelClient`.
    - Made turn metadata, request headers, analytics, and subagent-start
    events read the separate runtime/top-level parent field instead of
    deriving general parent lineage from `SessionSource` or
    `forked_from_thread_id`.
    - Passed parent lineage separately at delegated subagent, review,
    guardian, agent-job, and multi-agent spawn construction sites;
    copied-history fork lineage remains derived only from `InitialHistory`.
    - Persisted and exposed parent lineage through rollout/thread-store
    projections and app-server v2 `Thread.parentThreadId`.
    - Updated app-server README text and regenerated app-server schema
    fixtures for the additive `parentThreadId` response field.
  • Add cloud-managed config layer support (#24620)
    ## Summary
    
    PR 3 of 5 in the cloud-managed config client stack.
    
    Adds enterprise-managed cloud config as a first-class config layer
    source. The layer metadata is preserved through config loading,
    diagnostics, debug output, hook attribution, and app-server protocol
    surfaces.
    
    ## Details
    
    - Enterprise-managed config becomes a normal config layer source with
    backend-supplied `id` and display `name` attached for provenance.
    - These layers are designed to behave like non-file managed config: they
    can surface syntax/type diagnostics by layer name even though there is
    no physical config file.
    - Relative path settings are resolved from a stored config base so
    cloud-delivered config remains consistent with existing MDM-delivered
    config semantics.
    - Hook attribution distinguishes config-delivered hooks from
    requirements-delivered hooks via `HookSource::CloudManagedConfig`.
    - This remains pull-based and snapshot-oriented; the PR adds layer
    identity/diagnostics, not dynamic reload behavior.
    
    ## Validation
    
    Validated through the targeted stack checks after rebasing onto current
    `main`:
    
    - Rust crate tests for
    config/hooks/cloud-config/backend-client/app-server-protocol
    - Filtered `codex-core` and `codex-app-server` `cloud_config_bundle`
    tests
    - Python generated-file contract test
    - `cargo shear --deny-warnings`
    - Targeted `argument-comment-lint` for config/hooks
  • Add subagent lineage metadata for responsesapi (#24161)
    ## Why
    
    We recently added `forked_from_thread_id` which lets us trace where a
    thread's _context_ comes from, but we also want to understand subagent
    lineage (e.g. which parent thread spawned this subagent? what kind of
    subagent is it?) which is orthogonal.
    
    This PR adds `parent_thread_id` and `subagent_kind` to the
    `x-codex-turn-metadata` header sent to the Responses API.
    
    ## What changed
    
    - Adds `parent_thread_id` and `subagent_kind` to core-owned
    `x-codex-turn-metadata`.
    - Restores persisted `SessionSource` and `ThreadSource` from resumed
    session metadata so cold-resumed subagent threads keep their lineage on
    later Responses API requests.
    - Centralizes parent-thread extraction on `SessionSource` /
    `SubAgentSource` and reuses it in the Responses client, analytics, agent
    control, and state parsing paths.
    - Extends reserved-key, git-enrichment, thread-spawn, and app-server v2
    metadata coverage for the new lineage fields.
    
    ## Verification
    
    - Not run locally per request.
    - Added focused coverage in `core/src/turn_metadata_tests.rs` and
    `app-server/tests/suite/v2/client_metadata.rs`.
  • [codex] Add model tool mode selector (#25031)
    ## Why
    Some models need to select their code-execution behavior through model
    catalog metadata. Models without that metadata must continue to follow
    the existing `CodeMode` and `CodeModeOnly` feature flags, including when
    a newer server sends an enum value this client does not recognize.
    
    ## What changed
    - add optional `ModelInfo.tool_mode` metadata with `direct`,
    `code_mode`, and `code_mode_only`
    - treat omitted and unknown wire values as `None`
    - resolve `None` from the existing feature flags
    - carry the resolved `ToolMode` directly on `TurnContext`, outside
    `Config`
    - use the resolved value for turn creation, model switches, review
    turns, tool planning, and code execution
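
    A minimal sketch of the two-step resolution described above. The wire
    strings come from this description; the `Features` struct and the
    exact flag-to-mode mapping (with `CodeModeOnly` taking precedence over
    `CodeMode`) are assumptions.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ToolMode {
        Direct,
        CodeMode,
        CodeModeOnly,
    }

    /// Omitted and unrecognized wire values both become `None`, so a newer
    /// server cannot break an older client.
    fn parse_tool_mode(wire: Option<&str>) -> Option<ToolMode> {
        match wire? {
            "direct" => Some(ToolMode::Direct),
            "code_mode" => Some(ToolMode::CodeMode),
            "code_mode_only" => Some(ToolMode::CodeModeOnly),
            _ => None,
        }
    }

    /// Hypothetical view of the two existing feature flags.
    struct Features {
        code_mode: bool,
        code_mode_only: bool,
    }

    /// Explicit metadata wins; `None` falls back to the feature flags.
    fn resolve_tool_mode(metadata: Option<ToolMode>, features: &Features) -> ToolMode {
        metadata.unwrap_or(if features.code_mode_only {
            ToolMode::CodeModeOnly
        } else if features.code_mode {
            ToolMode::CodeMode
        } else {
            ToolMode::Direct
        })
    }
    ```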
    
    ## Coverage
    - add protocol coverage for omitted, known, and unknown enum values
    - add focused coverage for flag fallback and explicit metadata
    overriding feature flags
    - add core integration coverage that fetches remote model metadata
    through `/v1/models` and verifies the outbound `/responses` tools for
    explicit `direct` and `code_mode_only` selectors
    
    ## Stack
    - followed by #25032
  • Surface filesystem permission profiles in prompt context (#23924)
    ## Summary
    Some permission profiles can encode filesystem reads that should remain
    unavailable to the agent. Before this change, the model-visible context
    and automatic approval review prompt summarized the effective
    permissions as a legacy sandbox mode, which can omit permission-profile
    filesystem entries from escalation decisions.
    
    For example, a profile can grant workspace access while denying a
    private subtree across every workspace root:
    
    ```toml
    default_permissions = "restricted-workspace"
    
    [permissions.restricted-workspace.workspace_roots]
    "/Users/alice/project" = true
    "/Users/alice/other-project" = true
    
    [permissions.restricted-workspace.filesystem]
    ":minimal" = "read"
    
    [permissions.restricted-workspace.filesystem.":workspace_roots"]
    "." = "write"
    "private" = "deny"
    "private/**" = "deny"
    ```
    
    The context window now describes the workspace roots and effective
    filesystem side of the `PermissionProfile` directly, with deny entries
    marked as non-escalatable:
    
    ```xml
    <environment_context>
      <cwd>/Users/alice/project</cwd>
      <shell>zsh</shell>
      <filesystem><workspace_roots><root>/Users/alice/project</root><root>/Users/alice/other-project</root></workspace_roots><permission_profile type="managed"><file_system type="restricted"><entry access="read"><special>:minimal</special></entry><entry access="write"><path>/Users/alice/project</path></entry><entry access="write"><path>/Users/alice/other-project</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/project/private</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/other-project/private</path></entry><entry access="deny" escalatable="false"><glob>/Users/alice/project/private/**</glob></entry><entry access="deny" escalatable="false"><glob>/Users/alice/other-project/private/**</glob></entry></file_system></permission_profile></filesystem>
    </environment_context>
    ```
    
    Managed requirements can impose the same kind of deny-read restriction:
    
    ```toml
    [permissions.filesystem]
    deny_read = [
      "/Users/alice/project/private",
      "/Users/alice/project/private/**",
    ]
    ```
    
    The automatic approval review prompt also receives the parent turn's
    denied-read context, so review decisions can account for the active
    permission profile.
    
    ## What Changed
    - Render the effective filesystem profile in `<environment_context>`,
    including profile type, filesystem entries, workspace roots, and
    non-escalatable deny entries.
    - Persist effective `workspace_roots` in `TurnContextItem` so
    resumed/replayed context does not have to bind `:workspace_roots`
    through legacy `cwd` fallback.
    - Add explicit permission instructions that denied reads are policy
    restrictions, not escalation targets.
    - Pass the parent turn's denied-read context into automatic approval
    reviews.
    - Add targeted coverage for prompt rendering, workspace-root
    materialization, replay context, and review prompt context.
    - Keep the prompt-context test expectations platform-aware so the same
    filesystem rendering assertions pass on Unix and Windows paths.
    
    ## Testing
    - `just test -p codex-core context::environment_context::tests::serialize_environment_context_with_full_filesystem_profile`
    - `just test -p codex-core context::environment_context::tests::turn_context_item_filesystem_uses_workspace_roots_instead_of_cwd`
    - `just test -p codex-core context::permissions_instructions::permissions_instructions_tests::builds_permissions_from_profile_with_denied_reads`
    - `just fix -p codex-core`
    
    I also attempted `just test -p codex-core`; the changed prompt-context
    tests passed, but the full local run did not complete cleanly in this
    sandboxed macOS environment due to unrelated user-shell `CODEX_SANDBOX*`
    expectations and integration-test timeouts.
  • [codex] Add user input client ids (#24653)
    ## Summary
    
    Adds an optional `clientId` field to app-server v2 `UserInput` and
    carries it through the core `UserInput` model so clients can correlate
    echoed user input items without relying on payload equality.
    
    ## Details
    
    - Adds `client_id: Option<String>` to core `UserInput` variants.
    - Exposes the v2 app-server field as `clientId` on the wire and in
    generated TypeScript.
    - Preserves the id when converting between app-server v2 and core
    protocol types.
    - Regenerates app-server schema fixtures.
    
    ## Validation
    
    - `just fmt`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-protocol`
    - `git diff --check`
  • Expose MCP server info as part of server status (#24698)
    ## Summary
    
    Expose MCP server info via App Server (when available) so apps can
    render a richer MCP experience.
  • Restore legacy image detail values (#24644)
    ## Why
    
    Older persisted rollouts can contain `input_image.detail` values of
    `auto` or `low` from before `ImageDetail` was narrowed to
    `high`/`original`. Current deserialization rejects those values, which
    can make resume skip later compacted checkpoints and reconstruct an
    oversized raw suffix before the next compaction attempt.
    
    Confirmed Sentry reports fixed by this compatibility path:
    
    - [CODEX-1H3F](https://openai.sentry.io/issues/7500642496/)
    - [CODEX-1H6N](https://openai.sentry.io/issues/7501025347/)
    - [CODEX-1JDP](https://openai.sentry.io/issues/7504549065/)
    - [CODEX-1HW6](https://openai.sentry.io/issues/7503407986/)
    
    ## Background
    
    [openai/codex#20693](https://github.com/openai/codex/pull/20693) added
    image-detail plumbing for app-server `UserInput` so input images could
    explicitly request `detail: original`. The Slack discussion behind that
    PR was about ScreenSpot / bridge evals where user input images were
    resized, while tool output images already had MCP/code-mode ways to
    request image detail.
    
    In review, the intended new API surface was narrowed to `high` and
    `original`: default to `high`, allow `original` when callers need
    unchanged image handling, and avoid encouraging new `auto` or `low`
    usage. That policy still makes sense for newly emitted values.
    
    The missing compatibility piece is persisted history. Older rollouts can
    already contain `auto` and `low`, and resume reconstructs typed history
    by deserializing those rollout records. Rejecting old values at that
    boundary causes valid compacted checkpoints to be skipped. This PR
    restores `auto` and `low` as real variants so old records deserialize
    and round-trip without being rewritten as `high`, while product paths
    can continue to default to `high` and avoid emitting `auto` for new
    behavior.
    
    ## What changed
    
    - Restored `ImageDetail::Auto` and `ImageDetail::Low` as first-class
    protocol values.
    - Preserved `auto`/`low` through rollout deserialization, MCP image
    metadata, code-mode image output, and schema/type generation.
    - Kept local image byte handling conservative: only `original` switches
    to original-resolution loading; `auto`/`low`/`high` continue through the
    resize-to-fit path while retaining their detail value.
    - Added regression coverage for enum round-tripping and code-mode `low`
    detail handling.
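
    The compatibility behavior above can be pictured with a reduced enum.
    The variant names follow this description; `parse`, `as_str`, and
    `loads_original_resolution` are hypothetical helpers, not the real
    serialization code.

    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ImageDetail {
        Auto,
        Low,
        High,
        Original,
    }

    impl ImageDetail {
        /// Legacy `auto` and `low` parse as real variants instead of errors.
        fn parse(wire: &str) -> Option<Self> {
            match wire {
                "auto" => Some(Self::Auto),
                "low" => Some(Self::Low),
                "high" => Some(Self::High),
                "original" => Some(Self::Original),
                _ => None,
            }
        }

        /// Round-trips the stored value without rewriting it as `high`.
        fn as_str(self) -> &'static str {
            match self {
                Self::Auto => "auto",
                Self::Low => "low",
                Self::High => "high",
                Self::Original => "original",
            }
        }

        /// Only `original` switches to original-resolution loading; the
        /// other values keep going through the resize-to-fit path.
        fn loads_original_resolution(self) -> bool {
            matches!(self, Self::Original)
        }
    }
    ```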
    
    ## Testing
    
    - `just write-app-server-schema`
    - `just test -p codex-protocol`
    - `just test -p codex-tools`
    - `just test -p codex-code-mode`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-core suite::rmcp_client::stdio_image_responses_preserve_original_detail_metadata`
    - `just test -p codex-core suite::code_mode::code_mode_can_use_mcp_image_result_with_image_helper`
    - Loaded broken rollouts on local fixed builds, and started/completed
    new turns.
    
    I also attempted `just test -p codex-core`; the local broad run did not
    finish green: 2559 tests run, 2467 passed, 55 flaky, 91 failed, 1 timed
    out. The failures were broad timeout/deadline failures across unrelated
    areas; targeted changed-path core tests above passed.
  • [codex] remove plain image wrapper spans (#24652)
    ## Why
    
    Remote image submissions currently wrap native `input_image` spans in
    literal `<image>` and `</image>` text spans. Those extra prompt tokens
    add structure without providing label or routing information.
    
    ## What Changed
    
    - Serialize `UserInput::Image` directly as an `input_image` content
    span.
    - Preserve named local-image framing and legacy wrapper parsing for
    labeled attachments and existing histories.
    - Update existing request-shape expectations for drag-and-drop images,
    model switching, and compaction.
    
    ## Validation
    
    - `just test -p codex-protocol`
    - Focused `codex-core` run covering
    `drag_drop_image_persists_rollout_request_shape`,
    `model_change_from_image_to_text_strips_prior_image_content`, and
    `snapshot_request_shape_pre_turn_compaction_including_incoming_user_message`
    
    ## Notes
    
    - A broader `just test -p codex-core` run was attempted; the affected
    tests passed, while the overall run failed in unrelated CLI, MCP, and
    tooling tests plus a `thread_manager` timeout.
  • Add experimental turn additional context (#24154)
    ## Summary
    
    Adds experimental `additionalContext` support to `turn/start` and
    `turn/steer` so clients can provide ephemeral external context, such as
    browser or automation state, without turning that plumbing into a
    visible user prompt or triggering user-prompt lifecycle behavior.
    
    ## API Shape
    
    The parameter shape is:
    
    ```ts
    additionalContext?: Record<string, {
      value: string
      kind: "untrusted" | "application"
    }> | null
    ```
    
    Example:
    
    ```json
    {
      "additionalContext": {
        "browser_info": {
          "value": "Active tab is CI failures.",
          "kind": "untrusted"
        },
        "automation_info": {
          "value": "CI rerun is in progress.",
          "kind": "application"
        }
      }
    }
    ```
    
    The keys are opaque and caller-defined.
    
    ## Context Injection
    
    When provided, accepted entries are inserted into model context as
    hidden contextual message items, not as visible thread user-message
    items.
    
    `kind: "untrusted"` entries are inserted with role `user`:
    
    ```text
    <external_${key}>${value}</external_${key}>
    ```
    
    `kind: "application"` entries are inserted with role `developer`:
    
    ```text
    <${key}>${value}</${key}>
    ```
    
    Values are not escaped. Each value is truncated to approximately 1k
    tokens before wrapping.
    
    For `turn/start`, accepted additional context is inserted before normal
    user input. For `turn/steer`, additional context is merged only when the
    steer includes non-empty user input; context-only steers still reject as
    empty input.
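    
    The wrapping rules above can be sketched roughly as follows. This is a
    minimal illustration, not the code from this PR: `ContextKind`,
    `Fragment`, and `render_fragment` are hypothetical names, and the
    truncation step is left out because its exact token accounting is not
    described here.
    
    ```rust
    /// Trust level of an additional-context entry (mirrors the `kind` field).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ContextKind {
        Untrusted,
        Application,
    }
    
    /// Hypothetical rendered fragment: the message role plus the wrapped text.
    #[derive(Debug, PartialEq, Eq)]
    pub struct Fragment {
        pub role: &'static str,
        pub text: String,
    }
    
    /// Wrap one entry. Values are inserted verbatim (no escaping), matching
    /// the description above; truncation is omitted from this sketch.
    pub fn render_fragment(key: &str, value: &str, kind: ContextKind) -> Fragment {
        match kind {
            // Untrusted entries get the `external_` prefix and the `user` role.
            ContextKind::Untrusted => Fragment {
                role: "user",
                text: format!("<external_{key}>{value}</external_{key}>"),
            },
            // Application entries use the bare key and the `developer` role.
            ContextKind::Application => Fragment {
                role: "developer",
                text: format!("<{key}>{value}</{key}>"),
            },
        }
    }
    ```
    
    With the example payload above, `browser_info` would render as a `user`
    fragment and `automation_info` as a `developer` fragment.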
    
    ## Dedupe Strategy
    
    `AdditionalContextStore` lives on session state and stores the latest
    complete additional-context map.
    
    Each `turn/start` or non-empty `turn/steer` treats its
    `additionalContext` as the current complete set of values. Entries are
    injected only when the key is new or the exact entry for that key
    changed, including `value` or `kind`. After merging, the store is
    replaced with the provided map, so omitted keys are removed from the
    retained set and can be injected again later if reintroduced.
    
    Omitting `additionalContext`, passing `null`, or passing an empty object
    resets the store to empty and injects nothing.
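    
    As a rough sketch of the dedupe behavior described above (not the
    actual implementation): the `Entry` type, the `merge` method name, and
    the use of `BTreeMap` are assumptions made for illustration. Only the
    store name `AdditionalContextStore` comes from this PR.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// One additional-context entry; `kind` is "untrusted" or "application".
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Entry {
        pub value: String,
        pub kind: String,
    }
    
    /// Holds the latest complete additional-context map for the session.
    #[derive(Debug, Default)]
    pub struct AdditionalContextStore {
        latest: BTreeMap<String, Entry>,
    }
    
    impl AdditionalContextStore {
        /// Returns the keys that should be injected, then replaces the
        /// retained map with `provided`. `None` and an empty map both reset
        /// the store and inject nothing.
        pub fn merge(&mut self, provided: Option<BTreeMap<String, Entry>>) -> Vec<String> {
            let provided = provided.unwrap_or_default();
            // Inject only keys that are new or whose entry changed.
            let to_inject: Vec<String> = provided
                .iter()
                .filter(|(key, entry)| self.latest.get(*key) != Some(*entry))
                .map(|(key, _)| key.clone())
                .collect();
            // Omitted keys drop out of the retained set here.
            self.latest = provided;
            to_inject
        }
    }
    ```
    
    Because the store is replaced rather than updated, a key that is omitted
    in one call and reintroduced in a later one is treated as new again.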
    
    ## What Changed
    
    - Threads experimental v2 `additionalContext` through app-server into
    core turn start and steer handling.
    - Adds separate contextual fragment types for untrusted user-role
    context and application developer-role context.
    - Uses pending response input items so additional context can be
    combined with normal user input without treating it as prompt text.
    - Adds integration coverage for start/steer flow, role routing,
    dedupe/reset behavior, deletion/re-add behavior, hook-blocked input
    behavior, empty context-only steer rejection, external-fragment marker
    matching, and truncation.
  • standalone websearch extension (#23823)
    ## Summary
    
    Add the extension-backed standalone `web.run` tool so Codex can call the
    standalone search endpoint through the `codex-api` search client and
    return its encrypted output to Responses.
    
    - gate the new tool behind `standalone_web_search`
    - install the extension in the app-server thread registry and hide
    hosted `web_search` when standalone search is enabled for OpenAI
    providers so the two paths stay mutually exclusive
    - build search context from persisted history using a small tail
    heuristic: previous user message, assistant text between the last two
    user turns capped at about 1k tokens, and current user message
    
    ## Test Plan
    
    - `cargo test -p codex-web-search-extension`
    - `cargo test -p codex-api`
    - `cargo test -p codex-core
    hosted_tools_follow_provider_auth_model_and_config_gates`
  • Display workspace usage limit error copy from response header (#24114)
    ## Why
    
    `openai/openai#947613` adds `X-Codex-Rate-Limit-Reached-Type` for Codex
    workspace credit-depletion and spend-cap responses. The CLI currently
    reads the adjacent promo header but otherwise renders generic
    usage-limit copy, so those responses do not explain the
    workspace-specific action the user needs to take.
    
    Backend dependency: https://github.com/openai/openai/pull/947613
    
    ## What Changed
    
    - Parse `X-Codex-Rate-Limit-Reached-Type` in the usage-limit error
    handling path alongside `x-codex-promo-message`.
    - Keep the header value parsing with the shared `RateLimitReachedType`
    enum.
    - Carry the parsed type on `UsageLimitReachedError` and render
    client-owned copy for the four workspace owner/member credit and
    spend-cap values.
    - Preserve existing promo and plan-based text for absent, generic, or
    unknown header values.
    - Keep the existing TUI workspace-owner nudge state path unchanged; the
    response header only selects the displayed error string.
    - Add focused display coverage for all specific type values and the
    generic fallback case.
    
    ## Test Plan
    
    - Added `usage_limit_reached_error_formats_rate_limit_reached_types`
    coverage.
    - Not run manually, per request; CI runs validation on the pushed
    commit.
  • Add trace_id to TurnStartedEvent (#23980)
    ## Why
    [Recent PR](https://github.com/openai/codex/pull/22709) removed
    `trace_id` from `TurnContextItem`.
    
    ## What changed
    - Add `trace_id` to `TurnStartedEvent` so rollout consumers can correlate
    turns with telemetry traces.
    - Note that the branch name is out of date because I originally re-added
    `trace_id` to `TurnContextItem`, but we decided to move it to
    `TurnStartedEvent`.
    
    ## Verification
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-core --lib
    regular_turn_emits_turn_started_without_waiting_for_startup_prewarm`
    - `cargo test -p codex-core --test all
    emits_warning_when_resumed_model_differs`
    - `cargo test -p codex-rollout`
    - `cargo test -p codex-state`
  • cli: rename profile v2 flag to --profile (#23883)
    ## Why
    
    Profile v2 is taking over the user-facing profile selection path, so the
    CLI no longer needs to expose the transitional `--profile-v2` spelling.
    This switches the public args surface to `--profile` before the
    remaining legacy profile plumbing is removed separately.
    
    ## What
    
    - Rebind `--profile` and `-p` to the v2 profile name argument that
    selects `$CODEX_HOME/<name>.config.toml`.
    - Stop parsing the legacy shared CLI profile argument while keeping its
    implementation path in place for follow-up cleanup.
    - Update CLI validation, profile-name parse errors, and the
    legacy-profile collision message/tests to refer to `--profile`.
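    
    For illustration only, the path selection named above could look like
    this. `profile_config_path` is a hypothetical helper, and profile-name
    validation is left out.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Build `$CODEX_HOME/<name>.config.toml` for a profile name.
    /// The caller is assumed to have validated `name` already.
    pub fn profile_config_path(codex_home: &Path, name: &str) -> PathBuf {
        codex_home.join(format!("{name}.config.toml"))
    }
    ```
    
    Under this sketch, `--profile work` with a Codex home of `~/.codex`
    would select `~/.codex/work.config.toml`.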
    
    ## Testing
    
    - `cargo test -p codex-cli -p codex-config -p codex-protocol -p
    codex-utils-cli`
  • [codex] Add plugin id to MCP tool call items (#23737)
    Add owning plugin id to MCP tool call items so we can better filter them
    at plugin level.
    
    ## Summary
    - add optional `plugin_id` to MCP tool-call items and legacy begin/end
    events
    - propagate plugin metadata into emitted core items and app-server v2
    `ThreadItem::McpToolCall`
    - preserve plugin ids through app-server replay/redaction paths and
    regenerate v2 schema fixtures
    
    ## Testing
    - `just write-app-server-schema`
    - `just fmt`
    - `just fix -p codex-core`
    - `cargo test -p codex-protocol -p codex-app-server-protocol`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core mcp_tool_call_item_includes_plugin_id --lib`
    - `cargo check -p codex-tui --tests`
    - `cargo check -p codex-app-server --tests`
    - `git diff --check`
    
    ## Notes
    - `just fix -p codex-core` completed with two non-fatal
    `too_many_arguments` warnings on the touched MCP notification helpers.
    - A broader `cargo test -p codex-core` run passed core unit tests, then
    hit shell/sandbox/snapshot failures in the integration target.
    - A broader app-server downstream run hit the existing
    `in_process::tests::in_process_start_clamps_zero_channel_capacity` stack
    overflow; `cargo test -p codex-exec` also hit the existing sandbox
    expectation mismatch in
    `thread_lifecycle_params_include_legacy_sandbox_when_no_active_profile`.
  • Honor client-resolved service tier defaults (#23537)
    ## Why
    
    Model catalog responses can now advertise a nullable
    `default_service_tier` for each model. Codex needs to preserve three
    distinct states all the way from config/app-server inputs to inference:
    
    - no explicit service tier, so the client may apply the current model
    catalog default when FastMode is enabled
    - explicit `default`, meaning the user intentionally wants standard
    routing
    - explicit catalog tier ids such as `priority`, `flex`, or future tiers
    
    Keeping those states distinct prevents the UI from showing one tier
    while core sends another, especially after model switches or app-server
    `thread/start` / `turn/start` updates.
    
    ## What Changed
    
    - Plumbed `default_service_tier` through model catalog protocol types,
    app-server model responses, generated schemas, model cache fixtures, and
    provider/model-manager conversions.
    - Added the request-only `default` service tier sentinel and normalized
    legacy config spelling so `fast` in `config.toml` still materializes as
    the runtime/request id `priority`.
    - Moved catalog default resolution to the TUI/client side, including
    recomputing the effective service tier when model/FastMode-dependent
    surfaces change.
    - Updated app-server thread lifecycle config construction so
    `serviceTier: null` preserves explicit standard-routing intent by
    mapping to `default` instead of internal `None`.
    - Kept core responsible for validating explicit tiers against the
    current model and stripping `default` before `/v1/responses`, without
    applying catalog defaults itself.
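    
    The three states and where each step happens can be sketched as below.
    This is an illustrative model, not the types from this PR:
    `ServiceTierSetting`, `from_config`, `resolve_on_client`, and
    `tier_for_request` are hypothetical names, and accepting `default` as a
    config spelling is an assumption.
    
    ```rust
    /// The three states that must stay distinct end to end.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum ServiceTierSetting {
        /// No explicit tier; the client may apply the catalog default.
        Unset,
        /// Explicit `default`: the user wants standard routing.
        Default,
        /// Explicit catalog tier id such as `priority` or `flex`.
        Tier(String),
    }
    
    /// Normalize a configured spelling; legacy `fast` becomes `priority`.
    pub fn from_config(raw: Option<&str>) -> ServiceTierSetting {
        match raw {
            None => ServiceTierSetting::Unset,
            Some("default") => ServiceTierSetting::Default,
            Some("fast") => ServiceTierSetting::Tier("priority".to_string()),
            Some(other) => ServiceTierSetting::Tier(other.to_string()),
        }
    }
    
    /// Client-side step: apply the catalog default only when nothing was
    /// chosen explicitly and FastMode is enabled.
    pub fn resolve_on_client(
        setting: ServiceTierSetting,
        fast_mode: bool,
        catalog_default: Option<&str>,
    ) -> ServiceTierSetting {
        match (setting, fast_mode, catalog_default) {
            (ServiceTierSetting::Unset, true, Some(tier)) => {
                ServiceTierSetting::Tier(tier.to_string())
            }
            (other, _, _) => other,
        }
    }
    
    /// Core-side step: `default` is stripped, so it sends no tier at all.
    pub fn tier_for_request(setting: &ServiceTierSetting) -> Option<&str> {
        match setting {
            ServiceTierSetting::Tier(id) => Some(id.as_str()),
            ServiceTierSetting::Unset | ServiceTierSetting::Default => None,
        }
    }
    ```
    
    The point of keeping `Unset` and `Default` separate is that both send no
    tier on the wire, but only `Unset` may be replaced by a catalog default
    on the client.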
    
    ## Validation
    
    - `CARGO_INCREMENTAL=0 cargo build -p codex-cli`
    - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list`
    - `cargo test -p codex-tui service_tier`
    - `cargo test -p codex-protocol service_tier_for_request`
    - `cargo test -p codex-core get_service_tier`
    - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core
    service_tier`
  • Add SubagentStop hook (#22873)
    # What
    
    `SubagentStop` runs when a thread-spawned subagent turn is about to
    finish. Thread-spawned subagents use `SubagentStop` instead of the
    normal root-agent `Stop` hook.
    
    Configured handlers match on `agent_type`. Hook input includes the
    normal stop fields plus:
    
    - `agent_id`: the child thread id.
    - `agent_type`: the resolved subagent type.
    - `agent_transcript_path`: the child subagent transcript path.
    - `transcript_path`: the parent thread transcript path.
    - `last_assistant_message`: the final assistant message from the child
    turn, when available.
    - `stop_hook_active`: `true` when the child is already continuing
    because an earlier stop-like hook blocked completion.
    
    `SubagentStop` shares the same completion-control semantics as `Stop`,
    scoped to the child turn:
    
    - No decision allows the child turn to finish.
    - `decision: "block"` with a non-empty `reason` records that reason as
    hook feedback and continues the child with that prompt.
    - `continue: false` stops the child turn. If `stopReason` is present,
    Codex surfaces it as the stop reason.
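    
    A minimal sketch of how those three cases might be interpreted. The
    struct and enum names are hypothetical. Two details are assumptions
    not stated above: `continue: false` is checked before `decision`, and a
    `block` decision without a non-empty `reason` lets the turn finish.
    
    ```rust
    /// Hypothetical parsed hook output.
    #[derive(Debug, Default)]
    pub struct StopHookOutput {
        pub decision: Option<String>,
        pub reason: Option<String>,
        /// `continue` on the wire (a Rust keyword, so renamed here).
        pub continue_turn: Option<bool>,
        /// `stopReason` on the wire.
        pub stop_reason: Option<String>,
    }
    
    #[derive(Debug, PartialEq, Eq)]
    pub enum ChildTurnOutcome {
        /// No decision: the child turn finishes.
        Finish,
        /// Blocked: continue the child with the reason as the prompt.
        ContinueWithPrompt(String),
        /// Stopped, with an optional surfaced stop reason.
        Stop(Option<String>),
    }
    
    pub fn interpret(output: &StopHookOutput) -> ChildTurnOutcome {
        // Assumed precedence: an explicit stop wins over a block decision.
        if output.continue_turn == Some(false) {
            return ChildTurnOutcome::Stop(output.stop_reason.clone());
        }
        match (output.decision.as_deref(), output.reason.as_deref()) {
            (Some("block"), Some(reason)) if !reason.is_empty() => {
                ChildTurnOutcome::ContinueWithPrompt(reason.to_string())
            }
            _ => ChildTurnOutcome::Finish,
        }
    }
    ```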
    
    # Lifecycle Scope
    
    Only thread-spawned subagents run `SubagentStop`.
    
    Internal/system subagents such as Review, Compact, MemoryConsolidation,
    and Other do not run normal `Stop` hooks and do not run `SubagentStop`.
    This avoids exposing synthetic matcher labels for internal
    implementation paths.
    
    # Stack
    
    1. #22782: add `SubagentStart`.
    2. This PR: add `SubagentStop`.
    3. #22882: add subagent identity to normal hook inputs.
  • feat(permissions): resolve permission profile inheritance (#22270)
    ## Stack
    
    This is the foundation PR for the permission-profile inheritance stack.
    
    - This PR adds config-level `extends` resolution and merge semantics.
    - Follow-up: #23705 applies resolved profiles at runtime and updates the
    active-profile protocol surfaces.
    
    ## Why
    
    Permission profiles are starting to carry enough policy that
    copy-pasting near-identical definitions becomes hard to review and easy
    to drift. Before the runtime can consume inherited profiles, the config
    layer needs one explicit resolver that can merge parent chains and
    reject unsafe or invalid inheritance shapes.
    
    ## What changed
    
    - Add `extends` to permission-profile TOML and resolve parent chains in
    inheritance order.
    - Merge inherited profile TOML with the existing config merge behavior
    while preserving the permission-specific normalization needed for
    network domain keys.
    - Keep parent descriptions out of resolved child profiles and record
    inherited profile names separately for downstream consumers.
    - Reject undefined parents, unsupported built-in parents, and
    inheritance cycles with targeted errors.
    - Cover resolver behavior with TOML fixture tests and refresh the
    generated config schema.
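    
    The parent-chain walk with its rejection cases can be sketched as
    follows. This is a simplified illustration, not the resolver from this
    PR: `inheritance_chain` and `ResolveError` are hypothetical names,
    profiles are reduced to a name-to-parent map, and the TOML merge and
    built-in-parent checks are omitted. Returning the chain root-first is
    an assumption about what "inheritance order" means.
    
    ```rust
    use std::collections::BTreeMap;
    
    #[derive(Debug, PartialEq, Eq)]
    pub enum ResolveError {
        /// A profile named in the chain is not defined.
        UndefinedProfile(String),
        /// The named profile was reached a second time.
        Cycle(String),
    }
    
    /// `extends` maps each defined profile to its optional parent.
    /// Returns the chain from the root ancestor down to `name`.
    pub fn inheritance_chain(
        extends: &BTreeMap<String, Option<String>>,
        name: &str,
    ) -> Result<Vec<String>, ResolveError> {
        let mut chain: Vec<String> = Vec::new();
        let mut current = name.to_string();
        loop {
            // Seeing a profile twice means the `extends` links form a cycle.
            if chain.contains(&current) {
                return Err(ResolveError::Cycle(current));
            }
            let parent = extends
                .get(&current)
                .ok_or_else(|| ResolveError::UndefinedProfile(current.clone()))?;
            chain.push(current);
            match parent {
                Some(next) => current = next.clone(),
                None => break,
            }
        }
        // Collected child-first; reverse so parents come before children.
        chain.reverse();
        Ok(chain)
    }
    ```
    
    A merge step would then fold the profiles in chain order so that child
    values override inherited ones.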
    
    ## Validation
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-core permissions_profiles_`
  • Add timeout for remote compaction requests (#23451)
    ## Why
    
    Remote compaction currently sends a unary `POST /responses/compact` and
    waits for the full response before replacing history or emitting the
    completed `ContextCompaction` item. Unlike normal `/responses` streaming
    requests, this unary compact request had no timeout boundary. If the
    backend accepts the request and then stalls before returning a body, the
    existing request retry policy never sees a transport error, so the
    compact turn can remain stuck after the started item with no completion
    or actionable error.
    
    That matches the reported hang shape in issues such as #18363, where
    logs show `responses/compact` was posted but no corresponding compact
    completion followed. A bounded request timeout gives the existing retry
    policy a concrete timeout error to retry instead of letting the user sit
    indefinitely on automatic context compaction.
    
    ## What
    
    - Add a request timeout to legacy `/responses/compact` calls.
    - Size that timeout from the provider stream idle timeout with a
    conservative multiplier, so the default compact attempt gets 20 minutes
    rather than the 5 minute stream idle window.
    - Map API transport timeouts to a request timeout error instead of the
    child-process timeout message.
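    
    The sizing rule can be sketched as below. The multiplier of 4 is
    inferred from the 20 minute and 5 minute figures above rather than
    taken from the code, and the constant and function names are
    hypothetical.
    
    ```rust
    use std::time::Duration;
    
    /// Assumed multiplier: 20 minutes / 5 minutes = 4.
    const COMPACT_TIMEOUT_MULTIPLIER: u32 = 4;
    
    /// Derive the unary compact request timeout from the provider's
    /// stream idle timeout. Saturates instead of overflowing.
    pub fn compact_request_timeout(stream_idle_timeout: Duration) -> Duration {
        stream_idle_timeout.saturating_mul(COMPACT_TIMEOUT_MULTIPLIER)
    }
    ```
    
    Deriving the value from the idle timeout means a provider configured
    with a longer idle window gets a proportionally longer compact budget.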
    
    ## Testing
    
    - Not run (per request; CI will cover).
  • add encryptedcontent to functioncalloutput (#23500)
    Add a new `EncryptedContent` variant to `FunctionCallOutputContentItem`
    ahead of standalone web search.
    
    We need to be able to receive encrypted function call output from the
    new web search endpoint and pass it back to the Responses API, as we
    cannot expose direct search results.
  • Add SubagentStart hook (#22782)
    # What
    
    `SubagentStart` runs once when Codex creates a thread-spawned subagent,
    before that child sends its first model request. Thread-spawned
    subagents use `SubagentStart` instead of the normal root-agent
    `SessionStart` hook.
    
    Configured handlers match on the subagent `agent_type`, using the same
    value passed to `spawn_agent`. When no agent type is specified, Codex
    uses the default agent type.
    
    Hook input includes the normal session-start fields plus:
    
    - `agent_id`: the child thread id.
    - `agent_type`: the resolved subagent type.
    
    `SubagentStart` may return `hookSpecificOutput.additionalContext`. That
    context is added to the child conversation before the first model
    request.
    
    # Lifecycle Scope
    
    Only thread-spawned subagents run `SubagentStart`.
    
    Internal/system subagents such as Review, Compact, MemoryConsolidation,
    and Other do not run normal `SessionStart` hooks and do not run
    `SubagentStart`. This avoids exposing synthetic matcher labels for
    internal implementation paths.
    
    Also, the `SessionStart` hook no longer fires for subagents. This matches
    the behavior of other coding agents' implementations.
    
    # Stack
    
    1. This PR: add `SubagentStart`.
    2. #22873: add `SubagentStop`.
    3. #22882: add subagent identity to normal hook inputs.
  • Make deny canonical for filesystem permission entries (#23493)
    ## Why
    Filesystem permission profiles used `none` for deny-read entries, which
    is less direct than the action the entry actually represents. This
    change makes `deny` the canonical filesystem permission spelling while
    preserving compatibility for older configs that still send `none`.
    
    ## What changed
    - rename `FileSystemAccessMode::None` to `Deny`
    - serialize and generate schemas with `deny` as the canonical value
    - retain `none` only as a legacy input alias for temporary config
    compatibility
    - update filesystem glob diagnostics and regression coverage to use the
    canonical spelling
    - refresh config and app-server schema fixtures to match the new wire
    shape
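    
    A hand-written sketch of the alias behavior, shown without serde to
    keep it self-contained. The `read` and `write` spellings and the
    `parse`/`as_str` method names are assumptions; only `Deny`, `deny`, and
    the legacy `none` input come from the description above.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum FileSystemAccessMode {
        Read,
        Write,
        Deny,
    }
    
    impl FileSystemAccessMode {
        /// Accept the canonical spellings plus the legacy `none` alias.
        pub fn parse(raw: &str) -> Option<Self> {
            match raw {
                "read" => Some(Self::Read),
                "write" => Some(Self::Write),
                // `none` is input-only; it never appears in output.
                "deny" | "none" => Some(Self::Deny),
                _ => None,
            }
        }
    
        /// Always emit the canonical spelling.
        pub fn as_str(self) -> &'static str {
            match self {
                Self::Read => "read",
                Self::Write => "write",
                Self::Deny => "deny",
            }
        }
    }
    ```
    
    An older config that sends `none` therefore round-trips to `deny`.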
    
    ## Validation
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core config_toml_deserializes_permission_profiles
    --lib`
    - `cargo test -p codex-core
    read_write_glob_patterns_still_reject_non_subpath_globs --lib`
    
    Earlier in the session, a broad `cargo test -p codex-core` run reached
    unrelated pre-existing failures in timing/snapshot/git-info tests under
    this environment; the targeted surfaces touched by this PR passed
    cleanly.
  • Add body_after_prefix auto-compact token limit scope (#22870)
    ## Why
    
    `model_auto_compact_token_limit` has only been able to budget the full
    active context. That makes it hard to set a small "growth since
    compaction" budget for sessions that preserve a large carried window
    prefix: the preserved prefix can consume the whole budget and force
    immediate repeated compaction.
    
    This PR adds an opt-in `body_after_prefix` scope so callers can apply
    `model_auto_compact_token_limit` to sampled output and later growth
    after the current carried prefix, while still forcing compaction before
    the full model context window is exhausted.
    
    ## What changed
    
    - Adds `AutoCompactTokenLimitScope` with the existing `total` behavior
    as the default and a new `body_after_prefix` mode:
    [`config_types.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/protocol/src/config_types.rs#L24-L37).
    - Threads `model_auto_compact_token_limit_scope` through config loading,
    `Config`, `core-api`, and app-server v2 schema/TypeScript generation.
    - Records the first observed input-token count for a `body_after_prefix`
    compaction window and uses it as the baseline when deciding whether the
    scoped auto-compaction budget is exhausted:
    [`turn.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/core/src/session/turn.rs#L743-L781).
    - Keeps a hard context-window cap in `body_after_prefix`, so scoped
    budgeting cannot let the active context overrun the usable window.
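    
    The decision can be sketched roughly as follows. This is a simplified
    model rather than the logic in `turn.rs`: the function name, the
    parameter names, and the use of `>=` at the boundaries are assumptions.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum AutoCompactTokenLimitScope {
        Total,
        BodyAfterPrefix,
    }
    
    /// Decide whether to auto-compact.
    /// `baseline` is the first input-token count observed in the current
    /// compaction window (only meaningful for `BodyAfterPrefix`).
    pub fn should_auto_compact(
        scope: AutoCompactTokenLimitScope,
        limit: u64,
        context_window: u64,
        baseline: Option<u64>,
        current_tokens: u64,
    ) -> bool {
        // Hard cap: never let the active context overrun the window.
        if current_tokens >= context_window {
            return true;
        }
        match scope {
            AutoCompactTokenLimitScope::Total => current_tokens >= limit,
            AutoCompactTokenLimitScope::BodyAfterPrefix => {
                // Budget only the growth since the carried prefix.
                let baseline = baseline.unwrap_or(current_tokens);
                current_tokens.saturating_sub(baseline) >= limit
            }
        }
    }
    ```
    
    With a 90k-token carried prefix and a 10k limit, `total` would compact
    immediately, while `body_after_prefix` waits until the context has grown
    by 10k tokens or reached the window.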
    
    ## Verification
    
    Added compact-suite coverage for the two key behaviors:
    `body_after_prefix` does not re-compact just because the carried prefix
    is larger than the scoped budget, and it still compacts when the total
    active context reaches the configured context window:
    [`compact.rs`](https://github.com/openai/codex/blob/973806b1cb35792555bead994cb3ed94656eb171/codex-rs/core/tests/suite/compact.rs#L3003-L3128).
  • [5 of 7] Replace OverrideTurnContext with ThreadSettings (#22508)
    **Stack position:** [5 of 7]
    
    ## Summary
    
    This PR adds `Op::ThreadSettings`, a queued settings-only update
    mechanism for changing stored thread settings without starting a new
    turn. It also removes the legacy `Op::OverrideTurnContext` in the same
    layer, so reviewers can see the replacement and deletion together.
    
    ## Changes
    
    - Add `Op::ThreadSettings` for settings-only queued updates.
    - Emit `ThreadSettingsApplied` with the effective thread settings
    snapshot after core applies an update.
    - Route settings-only updates through the same submission queue as user
    input.
    - Migrate remaining `OverrideTurnContext` tests and callers to the
    queued `Op::ThreadSettings` path.
    - Delete `Op::OverrideTurnContext` from the core protocol and submission
    loop.
    
    This stack addresses #20656 and #22090.
    
    ## Stack
    
    1. [1 of 7] [Add thread settings to
    UserInput](https://github.com/openai/codex/pull/23080)
    2. [2 of 7] [Remove
    UserInputWithTurnContext](https://github.com/openai/codex/pull/23081)
    3. [3 of 7] [Remove
    UserTurn](https://github.com/openai/codex/pull/23075)
    4. [4 of 7] [Placeholder for OverrideTurnContext
    cleanup](https://github.com/openai/codex/pull/23087)
    5. [5 of 7] [Replace OverrideTurnContext with
    ThreadSettings](https://github.com/openai/codex/pull/22508) (this PR)
    6. [6 of 7] [Add app-server thread settings
    API](https://github.com/openai/codex/pull/22509)
    7. [7 of 7] [Sync TUI thread
    settings](https://github.com/openai/codex/pull/22510)
  • [3 of 7] Remove UserTurn (#23075)
    **Stack position:** [3 of 7]
    
    ## Summary
    
    This PR finishes the input-op consolidation by moving the remaining
    `Op::UserTurn` callers onto `Op::UserInput` and deleting `Op::UserTurn`.
    This touches a lot of files, but it is a low-risk mechanical migration.
    
    ## Stack
    
    1. [1 of 7] [Add thread settings to
    UserInput](https://github.com/openai/codex/pull/23080)
    2. [2 of 7] [Remove
    UserInputWithTurnContext](https://github.com/openai/codex/pull/23081)
    3. [3 of 7] [Remove
    UserTurn](https://github.com/openai/codex/pull/23075) (this PR)
    4. [4 of 7] [Placeholder for OverrideTurnContext
    cleanup](https://github.com/openai/codex/pull/23087)
    5. [5 of 7] [Replace OverrideTurnContext with
    ThreadSettings](https://github.com/openai/codex/pull/22508)
    6. [6 of 7] [Add app-server thread settings
    API](https://github.com/openai/codex/pull/22509)
    7. [7 of 7] [Sync TUI thread
    settings](https://github.com/openai/codex/pull/22510)
  • [2 of 7] Remove UserInputWithTurnContext (#23081)
    **Stack position:** [2 of 7]
    
    ## Summary
    
    This PR removes the overlapping `Op::UserInputWithTurnContext` variant
    now that `Op::UserInput` can carry thread settings overrides directly.
    
    ## Stack
    
    1. [1 of 7] [Add thread settings to
    UserInput](https://github.com/openai/codex/pull/23080)
    2. [2 of 7] [Remove
    UserInputWithTurnContext](https://github.com/openai/codex/pull/23081)
    (this PR)
    3. [3 of 7] [Remove
    UserTurn](https://github.com/openai/codex/pull/23075)
    4. [4 of 7] [Placeholder for OverrideTurnContext
    cleanup](https://github.com/openai/codex/pull/23087)
    5. [5 of 7] [Replace OverrideTurnContext with
    ThreadSettings](https://github.com/openai/codex/pull/22508)
    6. [6 of 7] [Add app-server thread settings
    API](https://github.com/openai/codex/pull/22509)
    7. [7 of 7] [Sync TUI thread
    settings](https://github.com/openai/codex/pull/22510)
  • [1 of 7] Add thread settings to UserInput (#23080)
    **Stack position:** [1 of 7]
    
    ## Summary
    
    The first three PRs in this stack are a cleanup pass before the actual
    thread settings API work.
    
    Today, core has several overlapping "user input" ops: `UserInput`,
    `UserInputWithTurnContext`, and `UserTurn`. They differ mostly in how
    much next-turn state they carry, which makes the later queued thread
    settings update harder to reason about and review.
    
    This PR starts that cleanup by adding the shared
    `ThreadSettingsOverrides` payload and allowing `Op::UserInput` to carry
    it. Existing variants remain in place here, so this layer is mostly a
    behavior-preserving API shape change plus mechanical constructor
    updates.
    
    ## End State After PR3
    
    By the end of PR3, `Op::UserInput` is the only "user input" core op. It
    can carry optional thread settings overrides for callers that need to
    update stored defaults with a turn, while callers without updates use
    empty settings. `Op::UserInputWithTurnContext` and `Op::UserTurn` are
    deleted.
    
    ## End State After PR5
    
    By the end of PR5, core will have only two ops for this area:
    
    - `Op::UserInput` for user-input-bearing submissions.
    - `Op::ThreadSettings` for settings-only updates.
    
    ## Stack
    
    1. [1 of 7] [Add thread settings to
    UserInput](https://github.com/openai/codex/pull/23080) (this PR)
    2. [2 of 7] [Remove
    UserInputWithTurnContext](https://github.com/openai/codex/pull/23081)
    3. [3 of 7] [Remove
    UserTurn](https://github.com/openai/codex/pull/23075)
    4. [4 of 7] [Placeholder for OverrideTurnContext
    cleanup](https://github.com/openai/codex/pull/23087)
    5. [5 of 7] [Replace OverrideTurnContext with
    ThreadSettings](https://github.com/openai/codex/pull/22508)
    6. [6 of 7] [Add app-server thread settings
    API](https://github.com/openai/codex/pull/22509)
    7. [7 of 7] [Sync TUI thread
    settings](https://github.com/openai/codex/pull/22510)
  • [codex] Trim unused TurnContextItem fields (#22709)
    ## Why
    
    `TurnContextItem` is the durable baseline used to reconstruct context
    diffs across resume/fork. Most of the old persisted-only fields on it
    are no longer read, so keeping them in rollout snapshots adds schema
    surface and state that can drift without affecting reconstruction.
    
    `summary` is the exception: older Codex versions require it to
    deserialize `turn_context` records, so keep writing a default
    compatibility value until that schema surface can be removed safely.
    
    ## What changed
    
    - Removed the unused persisted fields from `TurnContextItem`: trace ids,
    user/developer instructions, output schema, and truncation policy.
    - Kept `summary` with a compatibility comment and made
    `TurnContext::to_turn_context_item` write `ReasoningSummary::Auto`
    instead of live turn state.
    - Updated rollout/context reconstruction fixtures for the retained
    summary field.
    
    ## Verification
    
    - `cargo test -p codex-protocol --lib turn_context_item`
    - `cargo test -p codex-rollout
    resume_candidate_matches_cwd_reads_latest_turn_context`
    - `cargo test -p codex-state turn_context`
    - `cargo test -p codex-core --lib
    new_default_turn_captures_current_span_trace_id`
    - `cargo test -p codex-core --lib
    record_initial_history_resumed_turn_context_after_compaction_reestablishes_reference_context_item`
    - `cargo test -p codex-core --test all
    emits_warning_when_resumed_model_differs`
    - `git diff --check`
  • goal: pause continuation loops on usage limits and blockers (#23094)
    Addresses #22833, #22245, #23067
    
    ## Why
    `/goal` can keep synthesizing turns even when the next turn cannot make
    meaningful progress. Hard usage exhaustion can replay failing turns, and
    repeated permission or external-resource blockers can keep burning
    tokens while waiting for user or system intervention.
    
    ## What changed
    - Add resumable `blocked` and `usageLimited` goal states. As with
    `paused`, goal continuation stops with these states.
    - Move to `usageLimited` after usage-limit failures.
    - Allow the built-in `update_goal` tool to set `blocked` only under
    explicit repeated-impasse guidance. The goal continuation prompt now
    specifies that the agent should use `blocked` only when it has made at
    least three attempts to get past an impasse.
    
    Most of the files touched by this PR changed because of the small
    app-server protocol update.
    
    ## Validation
    
    I manually reproduced a number of situations where an agent can run into
    a true impasse and verified that it properly enters the `blocked` state.
    I then resumed and verified that it entered the `blocked` state again
    several turns later if the impasse still existed.
    
    I also manually reproduced the usage-limit condition by creating a
    simulated Responses API endpoint that returns 429 errors with the
    appropriate error message. Verified that the goal runtime properly moves
    the goal into the `usageLimited` state and that the TUI updates
    appropriately. Verified that `/goal resume` resumes (and immediately goes
    back into the `usageLimited` state if appropriate).
    
    
    ## Follow-up PRs
    
    Small changes will be needed to the GUI clients to properly handle the
    two new states.
  • app-server-protocol: remove PermissionProfile from API (#22924)
    ## Why
    
    The app server API should expose permission profile identity, not the
    lower-level runtime permission model. `PermissionProfile` is the
    compiled sandbox/network representation that the server uses internally;
    exposing it through app-server-protocol forces clients to understand
    details that should remain implementation-level.
    
    The API boundary should prefer `ActivePermissionProfile`: a stable
    profile id, plus future parent-profile metadata, that clients can pass
    back when they want to select the same active permissions. This also
    avoids schema generation collisions between the app-server v2 API type
    space and the core protocol model.
    
    Incidentally, while this PR makes a number of changes to `command/exec`,
    note that we are hoping to deprecate this API in favor of
    `process/spawn`, so we don't need to be too finicky about these changes.
    
    ## What Changed
    
    - Removed `PermissionProfile` from the app-server-protocol API surface,
    including generated schema and TypeScript exports.
    - Changed `CommandExecParams.permissionProfile` to
    `ActivePermissionProfile`.
    - Resolve command exec profile ids through `ConfigManager` for the
    command cwd, matching turn override selection semantics.
    - Updated downstream TUI tests/helpers to use core permission types
    directly instead of app-server-protocol `PermissionProfile` shims.
  • Preserve image detail in app-server inputs (#20693)
    ## Summary
    
    - Add optional image detail to user image inputs across core, app-server
    v2, thread history/event mapping, and the generated app-server
    schemas/types.
    - Preserve requested detail when serializing Responses image inputs:
    omitted detail stays on the existing `high` default, while explicit
    `original` keeps local images on the original-resolution path.
    - Support `high`/`original` consistently for tool image outputs,
    including MCP `codex/imageDetail`, code-mode image helpers, and
    `view_image`.
  • [codex] Use compaction_trigger item for remote compaction v2 (#22809)
    ## Why
    
    Remote compaction v2 was still using `context_compaction` as both the
    request trigger and the compacted output shape. The Responses API now
    has the landed contract for this flow: Codex sends a dedicated `{
    "type": "compaction_trigger" }` input item, and the backend returns the
    standard `compaction` output item with encrypted content.
    
    This aligns the v2 path with that wire contract while preserving the
    existing local compacted-history post-processing behavior.
    
    ## What changed
    
    - Add `ResponseItem::CompactionTrigger` and regenerate the app-server
    protocol schema fixtures.
    - Send `compaction_trigger` from `remote_compaction_v2` instead of a
    payload-less `context_compaction`.
    - Collect exactly one backend `compaction` output item, then reuse the
    existing compacted-history rebuilding path.
    - Treat the trigger item as a transient request marker rather than model
    output or persisted rollout/memory content.
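    
    The "exactly one `compaction` item" rule might be enforced along these
    lines. `OutputItem` and `single_compaction` are hypothetical stand-ins
    for the real response types, and treating zero or multiple items as
    errors is an assumption about how the rule is applied.
    
    ```rust
    /// Simplified stand-in for backend output items.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum OutputItem {
        Compaction { encrypted_content: String },
        Other,
    }
    
    /// Return the encrypted content of the single `compaction` item,
    /// or an error if there are zero or several.
    pub fn single_compaction(items: &[OutputItem]) -> Result<&str, String> {
        let mut found = items.iter().filter_map(|item| match item {
            OutputItem::Compaction { encrypted_content } => Some(encrypted_content.as_str()),
            OutputItem::Other => None,
        });
        // Pull at most two matches to tell "one" apart from "many".
        match (found.next(), found.next()) {
            (Some(content), None) => Ok(content),
            (None, _) => Err("no compaction item in response".to_string()),
            (Some(_), Some(_)) => Err("more than one compaction item in response".to_string()),
        }
    }
    ```
    
    The returned content would then feed the existing compacted-history
    rebuilding path.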
    
    ## Verification
    
    - `cargo test -p codex-protocol compaction_trigger`
    - `cargo test -p codex-core remote_compact_v2`
    - `cargo test -p codex-core compact_remote_v2`
    - `cargo test -p codex-core
    responses_websocket_sends_response_processed_after_remote_compaction_v2`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol schema_fixtures`
  • app-server: use permission ids and runtime workspace roots (#22611)
    ## Why
    
    This PR builds on [#22610](https://github.com/openai/codex/pull/22610)
    and is the app-server side of the migration from mutable per-turn
    `SandboxPolicy` replacement toward selecting immutable permission
    profiles by id plus mutable runtime workspace roots.
    
    Once permission profiles can carry their own immutable
    `workspace_roots`, app-server no longer needs to mutate the selected
    `PermissionProfile` just to represent thread-specific filesystem
    context. The mutable part now lives on the thread as explicit
    `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until
    the sandbox is realized for a turn.
    
    ## What Changed
    
    - Replaced the v2 permission-selection wrapper surface with plain
    profile ids for `thread/start`, `thread/resume`, `thread/fork`, and
    `turn/start`.
    - Removed the API surface for profile modifications
    (`PermissionProfileSelectionParams`,
    `PermissionProfileModificationParams`,
    `ActivePermissionProfileModification`).
    - Added experimental `runtimeWorkspaceRoots` fields to the thread
    lifecycle and turn-start APIs.
    - Threaded runtime workspace roots through core session/thread
    snapshots, turn overrides, app-server request handling, and command
    execution permission resolution.
    - Kept session permission state symbolic so later runtime root updates
    and cwd-only implicit-root retargeting rebind `:workspace_roots`
    correctly.
    - Updated the embedded clients just enough to send and restore the new
    thread state.
    - Refreshed the generated schema/TypeScript artifacts and the app-server
    README to match the new contract.
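    
    As a rough illustration of why the session state stays symbolic, the
    sketch below realizes writable paths from the current runtime roots each
    time it is called, so a cwd-only update changes the result without
    touching the rule itself. `ThreadState`, `extra_roots`, and
    `realize_writable` are hypothetical names, and the "cwd first" ordering
    is an assumption.
    
    ```rust
    use std::path::PathBuf;
    
    /// Hypothetical per-thread state: an implicit cwd root plus explicit extras.
    struct ThreadState {
        cwd: PathBuf,
        extra_roots: Vec<PathBuf>,
    }
    
    impl ThreadState {
        /// Runtime workspace roots: the implicit cwd root, then the extras.
        fn runtime_workspace_roots(&self) -> Vec<PathBuf> {
            let mut roots = vec![self.cwd.clone()];
            roots.extend(self.extra_roots.iter().cloned());
            roots
        }
    }
    
    /// Realizes a symbolic `:workspace_roots` rule (subpaths relative to each
    /// root) against whatever roots the thread has at turn time.
    fn realize_writable(state: &ThreadState, symbolic_subpaths: &[&str]) -> Vec<PathBuf> {
        let mut out = Vec::new();
        for root in state.runtime_workspace_roots() {
            for sub in symbolic_subpaths {
                // "." stands for the root itself.
                out.push(if *sub == "." { root.clone() } else { root.join(sub) });
            }
        }
        out
    }
    ```
    
    Because nothing concrete is cached, replacing `cwd` and calling
    `realize_writable` again yields paths under the new cwd while the extra
    roots are preserved.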
    
    ## Verification
    
    Targeted coverage for this layer lives in:
    
    - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_start.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_resume.rs`
    - `codex-rs/app-server/tests/suite/v2/turn_start.rs`
    - `codex-rs/core/src/session/tests.rs`
    
    The key regression checks exercise that:
    
    - `runtimeWorkspaceRoots` resolve against the effective cwd on thread
    start.
    - Profile-declared workspace roots are excluded from the runtime
    workspace roots returned by app-server.
    - A turn-level runtime workspace-root update persists onto the thread
    and is returned by `thread/resume`.
    - A named permission profile selected on one turn remains symbolic so a
    later runtime-root-only turn update changes the actual sandbox writes.
    - A cwd-only turn update retargets the implicit runtime cwd root while
    preserving additional runtime roots.
    - The protocol fixtures and generated client artifacts stay in sync with
    the string-based permission selection contract.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611).
    * #22612
    * __->__ #22611
  • permissions: support workspace roots in profiles (#22610)
    ## Why
    
    This is the configuration/model half of the alternative permissions
    migration we discussed as a comparison point for
    [#22401](https://github.com/openai/codex/pull/22401) and
    [#22402](https://github.com/openai/codex/pull/22402).
    
    The old `workspace-write` model mixes three concerns that we want to
    keep separate:
    - reusable profile rules that should stay immutable once selected
    - user/runtime workspace roots from `cwd`, `--add-dir`, and legacy
    workspace-write config
    - internal Codex writable roots such as memories, which should not be
    shown as user workspace roots
    
    This PR gives permission profiles first-class `workspace_roots` so users
    can opt multiple repositories into the same `:workspace_roots` rules
    without using broad absolute-path write grants. It also starts
    separating the raw selected profile from the effective runtime profile
    by making `Permissions` expose explicit accessors instead of public
    mutable fields.
    
    A representative `config.toml` looks like this:
    
    ```toml
    default_permissions = "dev"
    
    [permissions.dev.workspace_roots]
    "~/code/openai" = true
    "~/code/developers-website" = true
    
    [permissions.dev.filesystem.":workspace_roots"]
    "." = "write"
    ".codex" = "read"
    ".git" = "read"
    ".vscode" = "read"
    ```
    
    If Codex starts in `~/code/codex` with that profile selected, the
    effective workspace-root set becomes:
    - `~/code/codex` from the runtime `cwd`
    - `~/code/openai` from the profile
    - `~/code/developers-website` from the profile
    
    The `:workspace_roots` rules are materialized across each root, so
    `.git`, `.codex`, and `.vscode` stay scoped the same way everywhere.
    Runtime additions such as `--add-dir` can still layer on later stack
    entries without mutating the selected profile.
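    
    A minimal sketch of that materialization, under stated assumptions: the
    function names and the `Access` enum are hypothetical, paths are taken as
    already absolute (no `~` expansion), and the "cwd first, then profile
    roots, deduplicated" ordering is a guess rather than the documented
    behavior.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum Access {
        Read,
        Write,
    }
    
    /// Effective roots: the runtime cwd, then profile-declared roots, deduplicated.
    fn effective_workspace_roots(cwd: &Path, profile_roots: &[PathBuf]) -> Vec<PathBuf> {
        let mut roots = vec![cwd.to_path_buf()];
        for root in profile_roots {
            if !roots.contains(root) {
                roots.push(root.clone());
            }
        }
        roots
    }
    
    /// Expands symbolic `:workspace_roots` rules into concrete (path, access)
    /// pairs, applying every rule under every root.
    fn materialize(roots: &[PathBuf], rules: &[(&str, Access)]) -> Vec<(PathBuf, Access)> {
        let mut out = Vec::new();
        for root in roots {
            for (rel, access) in rules {
                // "." stands for the root itself.
                let path = if *rel == "." { root.clone() } else { root.join(rel) };
                out.push((path, *access));
            }
        }
        out
    }
    ```
    
    With three roots and the four rules from the config above, this would
    yield twelve entries, with `.git`, `.codex`, and `.vscode` read-only
    under each root.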
    
    ## Stack Shape
    
    This PR intentionally stops before the profile-identity cleanup in
    [#22683](https://github.com/openai/codex/pull/22683) so the base review
    stays focused on config loading, workspace-root materialization, and
    compatibility with legacy `workspace-write`.
    
    The representation in this PR is therefore transitional: `Permissions`
    carries enough state to distinguish the raw constrained profile from the
    effective runtime profile, and there are still call sites that must keep
    the active profile identity and constrained profile value in sync. The
    follow-up PR replaces that with a single resolved profile state
    (`ResolvedPermissionProfile` / `PermissionProfileState`) that keeps the
    profile id, immutable `PermissionProfile`, and profile-declared
    workspace roots together. That follow-up removes APIs such as
    `set_constrained_permission_profile_with_active_profile()` where
    separate arguments could drift out of sync.
    
    Downstream PRs then build on this base to switch app-server turn updates
    to profile ids plus runtime workspace roots and to finish the
    user-visible summary behavior. Reviewers should judge this PR as the
    workspace-roots foundation, not as the final in-memory shape of selected
    permission profiles.
    
    ## Review Guide
    
    Suggested review order:
    
    1. Start with `codex-rs/core/src/config/mod.rs`.
    This is the main shape change in the base slice. `Permissions` now
    stores a private raw `Constrained<PermissionProfile>` plus runtime
    `workspace_roots`. Callers use `permission_profile()` when they need the
    raw constrained value and `effective_permission_profile()` when they
    need a materialized runtime profile. As noted above,
    [#22683](https://github.com/openai/codex/pull/22683) replaces this
    transitional shape with a resolved profile state that keeps identity and
    profile data together.
    
    2. Review `codex-rs/config/src/permissions_toml.rs` and
    `codex-rs/core/src/config/permissions.rs`.
    These add `[permissions.<id>.workspace_roots]`, resolve enabled entries
    relative to the policy cwd, and keep `:workspace_roots` deny-read glob
    patterns symbolic until the actual roots are known.
    
    3. Review `codex-rs/protocol/src/permissions.rs` and
    `codex-rs/protocol/src/models.rs`.
    These add the policy/profile materialization helpers that expand exact
    `:workspace_roots` entries and scoped deny-read globs over every
    workspace root. This is also where `ActivePermissionProfileModification`
    is removed from the core model.
    
    4. Review the legacy bridge in
    `Config::load_from_base_config_with_overrides` and
    `Config::set_legacy_sandbox_policy`.
    This is where legacy `workspace-write` roots become runtime workspace
    roots, while Codex internal writable roots stay internal and do not
    appear as user-facing workspace roots.
    
    5. Then skim downstream call sites.
    The interesting pattern is raw-vs-effective access: state/proxy/bwrap
    paths keep the raw constrained profile, while execution, summaries, and
    user-visible status use the effective profile and workspace-root list.
    
    ## What Changed
    
    - added `[permissions.<id>.workspace_roots]` to the config model and
    schema
    - added runtime `workspace_roots` state to `Config`/`Permissions` and
    `ConfigOverrides`
    - made `Permissions` profile fields private and replaced direct mutation
    with accessors/setters
    - added `PermissionProfile` and `FileSystemSandboxPolicy` helpers for
    materializing `:workspace_roots` exact paths and deny-read globs across
    all roots
    - moved legacy additional writable roots into runtime workspace-root
    state instead of active profile modifications
    - removed `ActivePermissionProfileModification` and its app-server
    protocol/schema export
    - updated sandbox/status summary paths so internal writable roots are
    not reported as user workspace roots
    
    ## Verification Strategy
    
    The targeted tests cover the behavior at the layers where regressions
    are most likely:
    - `codex-rs/core/src/config/config_tests.rs` verifies config loading,
    legacy workspace-root seeding, effective profile materialization, and
    memory-root handling.
    - `codex-rs/core/src/config/permissions_tests.rs` verifies profile
    `workspace_roots` parsing and `:workspace_roots` scoped/glob
    compilation.
    - `codex-rs/protocol/src/permissions.rs` unit tests verify exact and
    glob materialization over multiple workspace roots.
    - `codex-rs/tui/src/status/tests.rs` and
    `codex-rs/utils/sandbox-summary/src/sandbox_summary.rs` verify the
    user-facing summaries show effective workspace roots and hide internal
    writes.
    
    I also ran `cargo check --tests` locally after the latest stack refresh
    to catch cross-crate API breakage from the private-field/accessor
    changes.
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22610).
    * #22612
    * #22611
    * #22683
    * __->__ #22610
  • permissions: canonicalize workspace_roots and danger-full-access names (#22624)
    ## Why
    
    This is a small precursor to the larger permissions-migration work. Both
    the comparison stack in
    [#22401](https://github.com/openai/codex/pull/22401) /
    [#22402](https://github.com/openai/codex/pull/22402) and the alternate
    stack in [#22610](https://github.com/openai/codex/pull/22610) /
    [#22611](https://github.com/openai/codex/pull/22611) /
    [#22612](https://github.com/openai/codex/pull/22612) are easier to
    review if the terminology is already settled underneath them.
    
    Because `:project_roots` and `:danger-no-sandbox` have not shipped as
    stable user-facing surface area, carrying them forward as aliases would
    just add more migration logic to the later stacks. This PR removes that
    ambiguity now so the follow-on work can rely on one spelling for each
    built-in concept.
    
    ## What Changed
    
    - renamed the config-facing special filesystem key from `:project_roots`
    to `:workspace_roots`
    - dropped unpublished `:project_roots` parsing support in
    `core/src/config/permissions.rs`, so new config only recognizes
    `:workspace_roots`
    - renamed the built-in full-access permission profile id from
    `:danger-no-sandbox` to `:danger-full-access`
    - dropped unpublished `:danger-no-sandbox` support entirely, including
    the old active-profile canonicalization path, and added explicit
    rejection coverage for the legacy id
    - introduced shared built-in permission-profile id constants in
    `codex-rs/protocol/src/models.rs`
    - updated `core`, `app-server`, and `tui` call sites that special-case
    built-in profiles to use the shared constants and canonical ids
    - updated tests and the Linux sandbox README to use `:workspace_roots` /
    `:danger-full-access`
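    
    The "one spelling, no aliases" behavior might look something like the
    sketch below. The constant and function names are hypothetical (the real
    shared constants live in `codex-rs/protocol/src/models.rs` and are not
    reproduced here), and only the full-access built-in is modeled.
    
    ```rust
    /// Canonical special filesystem key (hypothetical constant name).
    const WORKSPACE_ROOTS_KEY: &str = ":workspace_roots";
    /// Canonical built-in full-access profile id (hypothetical constant name).
    const DANGER_FULL_ACCESS_ID: &str = ":danger-full-access";
    /// Unpublished spelling that is rejected rather than aliased.
    const LEGACY_NO_SANDBOX_ID: &str = ":danger-no-sandbox";
    
    /// True only for the canonical key; `:project_roots` is not recognized.
    fn is_workspace_roots_key(key: &str) -> bool {
        key == WORKSPACE_ROOTS_KEY
    }
    
    /// Resolves a built-in profile id, rejecting the legacy spelling explicitly.
    fn resolve_builtin_profile_id(id: &str) -> Result<&'static str, String> {
        match id {
            DANGER_FULL_ACCESS_ID => Ok(DANGER_FULL_ACCESS_ID),
            LEGACY_NO_SANDBOX_ID => Err(format!(
                "`{LEGACY_NO_SANDBOX_ID}` is not supported; use `{DANGER_FULL_ACCESS_ID}`"
            )),
            other => Err(format!("unknown built-in profile id `{other}`")),
        }
    }
    ```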
    
    ## Verification
    
    I focused verification on the three places this rename can regress:
    config parsing, active-profile identity surfaced back out of `core`, and
    user/server call sites that special-case built-in profiles.
    
    Targeted checks:
    
    - `config::tests::default_permissions_can_select_builtin_profile_without_permissions_table`
    - `config::tests::default_permissions_read_only_applies_additional_writable_roots_as_modifications`
    - `config::tests::default_permissions_can_select_builtin_full_access_profile`
    - `config::tests::legacy_danger_no_sandbox_is_rejected`
    - `workspace_root` filtered `codex-core` tests
    - `request_processors::thread_processor::thread_processor_tests::thread_processor_behavior_tests::requested_permissions_trust_project_uses_permission_profile_intent`
    - `suite::v2::turn_start::turn_start_rejects_invalid_permission_selection_before_starting_turn`
    - `status::tests::status_snapshot_shows_auto_review_permissions`
    - `status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access`
    - `app_server_session::tests::embedded_turn_permissions_use_active_profile_selection`
  • feat: add layered --profile-v2 config files (#17141)
    ## Why
    
    `--profile-v2 <name>` gives launchers and runtime entry points a named
    profile config without making each profile duplicate the base user
    config. The base `$CODEX_HOME/config.toml` still loads first, then
    `$CODEX_HOME/<name>.config.toml` layers above it and becomes the active
    writable user config for that session.
    
    That keeps shared defaults, plugin/MCP setup, and managed/user
    constraints in one place while letting a named profile override only the
    pieces that need to differ.
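    
    The layering can be pictured with a small sketch. It assumes a flattened
    "dotted key to string" view of each config file, which is a
    simplification; the real loader presumably merges structured TOML.
    `Layer`, `merge_layers`, and `profile_config_path` are hypothetical
    names.
    
    ```rust
    use std::collections::BTreeMap;
    use std::path::{Path, PathBuf};
    
    /// Hypothetical flattened config layer: dotted key -> value.
    type Layer = BTreeMap<String, String>;
    
    /// Merges layers from lowest to highest precedence; later layers win per key.
    fn merge_layers(layers: &[&Layer]) -> Layer {
        let mut merged = Layer::new();
        for layer in layers {
            for (key, value) in layer.iter() {
                merged.insert(key.clone(), value.clone());
            }
        }
        merged
    }
    
    /// Path of the selected profile file: `$CODEX_HOME/<name>.config.toml`.
    fn profile_config_path(codex_home: &Path, name: &str) -> PathBuf {
        codex_home.join(format!("{name}.config.toml"))
    }
    ```
    
    Calling `merge_layers(&[&base, &profile])` keeps every base key the
    profile does not mention and overrides the rest, which is the "override
    only the pieces that need to differ" property.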
    
    ## What Changed
    
    - Added the shared `--profile-v2 <name>` runtime option with validated
    plain names, now represented by `ProfileV2Name`.
    - Extended config layer state so the base user config and selected
    profile config are both `User` layers; APIs expose the active user layer
    and merged effective user config.
    - Threaded profile selection through runtime entry points: `codex`,
    `codex exec`, `codex review`, `codex resume`, `codex fork`, and `codex
    debug prompt-input`.
    - Made user-facing config writes go to the selected profile file when
    active, including TUI/settings persistence, app-server config writes,
    and MCP/app tool approval persistence.
    - Made plugin, marketplace, MCP, hooks, and config reload paths read
    from the merged user config so base and profile layers both participate.
    - Updated app-server config layer schemas to mark profile-backed user
    layers.
    
    ## Limits
    
    `--profile-v2` is still rejected for config-management subcommands such
    as feature, MCP, and marketplace edits. Those paths remain tied to the
    base `config.toml` until they have explicit profile-selection semantics.
    
    Some adjacent background writes may still update base or global state
    rather than the selected profile:
    
    - marketplace auto-upgrade metadata
    - automatic MCP dependency installs from skills
    - remote plugin sync or uninstall config edits
    - personality migration marker/default writes
    
    ## Verification
    
    Added targeted coverage for profile name validation, layer
    ordering/merging, selected-profile writes, app-server config writes,
    session hot reload, plugin config merging, hooks/config fixture updates,
    and MCP/app approval persistence.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Remove unused legacy shell tools (#22246)
    ## Why
    
    Recent session history showed no active use of the raw `shell`,
    `local_shell`, or `container.exec` execution surfaces. Keeping those
    handlers/specs wired into core leaves duplicate shell execution paths
    alongside the supported `shell_command` and unified exec tools.
    
    ## What changed
    
    - Removed the raw `shell` handler/spec and its `ShellToolCallParams`
    protocol helper.
    - Removed the legacy `local_shell` and `container.exec` handler/spec
    plumbing while preserving persisted-history compatibility for old
    response items.
    - Normalized model/config `default` and `local` shell selections to
    `shell_command`.
    - Pruned tests that exercised removed raw-shell/local-shell/apply-patch
    variants and kept coverage on `shell_command`, unified exec, and
    freeform `apply_patch`.
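    
    The normalization step amounts to a small mapping. The sketch below uses
    a hypothetical function name and plain strings in place of whatever enum
    the real config uses, and it assumes every other selection passes through
    unchanged.
    
    ```rust
    /// Maps the removed `default` and `local` selections onto `shell_command`.
    /// Any other selection is returned as-is (an assumption of this sketch).
    fn normalize_shell_selection(selection: &str) -> &str {
        match selection {
            "default" | "local" => "shell_command",
            other => other,
        }
    }
    ```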
    
    ## Verification
    
    - `git diff --check`
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core tools::handlers::shell`
    - `cargo test -p codex-core tools::spec`
    - `cargo test -p codex-core tools::router`
    - `cargo test -p codex-core
    active_call_preserves_triggering_command_context`
    - `cargo test -p codex-core guardian_tests`
    - `cargo test -p codex-core --test all shell_serialization`
    - `cargo test -p codex-core --test all apply_patch_cli`
    - `cargo test -p codex-core --test all shell_command_`
    - `cargo test -p codex-core --test all local_shell`
    - `cargo test -p codex-core --test all otel::`
    - `cargo test -p codex-core --test all hooks::`
    - `just fix -p codex-core`
    - `just fix -p codex-tools`
  • feat(tui): remove Zellij TUI workarounds (#22214)
    ## Why
    
    We added Zellij-specific TUI workarounds because older Zellij behavior
    did not work with Codex's normal terminal model:
    
    - #8555 made `tui.alternate_screen = "auto"` disable alternate screen in
    Zellij so transcript history stayed available.
    - #16578 avoided scroll-region operations in Zellij by emitting raw
    newlines and using a separate composer styling path.
    
    This PR removes both workarounds because the latest Zellij release
    tested locally (`zellij 0.44.1`) works correctly with Codex's standard
    TUI behavior: normal alternate-screen handling, redraw, and history
    insertion.
    
    ## What Changed
    
    - Removed the `InsertHistoryMode::Zellij` path and the Zellij-only
    newline scrollback insertion behavior.
    - Removed cached `is_zellij` state from the TUI and composer.
    - Removed Zellij-specific composer styling, the helper snapshot, and the
    `TerminalInfo::is_zellij()` convenience method that only served this
    workaround.
    - Changed `tui.alternate_screen = "auto"` to use alternate screen for
    Zellij too; `--no-alt-screen` and `tui.alternate_screen = "never"` still
    preserve the inline mode escape hatch.
    - Updated the generated config schema description for
    `tui.alternate_screen`.
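    
    After this change the decision no longer depends on Zellij detection. A
    hedged sketch of the resulting logic, modeling only the two config values
    named above (`auto` and `never`) and assuming `auto` has no other
    terminal-specific exception; the enum and function names are
    hypothetical:
    
    ```rust
    /// The two `tui.alternate_screen` values mentioned in this change.
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum AlternateScreen {
        Auto,
        Never,
    }
    
    /// Decides whether to enter the alternate screen.
    /// `no_alt_screen_flag` stands for the `--no-alt-screen` CLI flag.
    fn use_alternate_screen(config: AlternateScreen, no_alt_screen_flag: bool) -> bool {
        if no_alt_screen_flag {
            // The CLI flag always forces inline mode.
            return false;
        }
        match config {
            AlternateScreen::Auto => true,
            AlternateScreen::Never => false,
        }
    }
    ```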
    
    ## How to Test
    
    Manual smoke path used with `zellij 0.44.1`:
    
    1. Build and run this branch inside a Zellij `0.44.1` session with
    default config.
    2. Start Codex normally and produce enough assistant/tool output to
    create scrollback.
    3. Confirm the transcript remains readable, the composer renders
    normally, and scrolling through terminal history works.
    4. Resize the Zellij pane while output exists and confirm the TUI
    redraws without duplicated, missing, or stale rows.
    5. Compare with `--no-alt-screen` or `-c tui.alternate_screen=never` if
    you want to verify the inline fallback still works.
    
    Targeted tests:
    - `just write-config-schema`
    - `just fmt`
    - `just fix -p codex-tui`
    - `cargo test -p codex-terminal-detection`
    - `cargo test -p codex-tui alternate_screen_auto_uses_alt_screen`
    
    Attempted but did not complete locally:
    - `cargo test -p codex-tui` built and ran the new test successfully,
    then failed later on unrelated local failures in
    `status_permissions_full_disk_managed_*` and a stack overflow in
    `tests::fork_last_filters_latest_session_by_cwd_unless_show_all`.
    
    ## Documentation
    
    No developers.openai.com Codex documentation update is needed for this
    revert.
  • feat(sandbox): add Windows deny-read parity (#18202)
    ## Why
    
    The split filesystem policy stack already supports exact and glob
    `access = none` read restrictions on macOS and Linux. Windows still
    needed subprocess handling for those deny-read policies without claiming
    enforcement from a backend that cannot provide it.
    
    ## Key finding
    
    The unelevated restricted-token backend cannot safely enforce deny-read
    overlays. Its `WRITE_RESTRICTED` token model is authoritative for write
    checks, not read denials, so this PR intentionally fails that backend
    closed when deny-read overrides are present instead of claiming
    unsupported enforcement.
    
    ## What changed
    
    This PR adds the Windows deny-read enforcement layer and makes the
    backend split explicit:
    
    - Resolves Windows deny-read filesystem policy entries into concrete ACL
    targets.
    - Preserves exact missing paths so they can be materialized and denied
    before an enforceable sandboxed process starts.
    - Snapshot-expands existing glob matches into ACL targets for Windows
    subprocess enforcement.
    - Honors `glob_scan_max_depth` when expanding Windows deny-read globs.
    - Plans both the configured lexical path and the canonical target for
    existing paths so reparse-point aliases are covered.
    - Threads deny-read overrides through the elevated/logon-user Windows
    sandbox backend and unified exec.
    - Applies elevated deny-read ACLs synchronously before command launch
    rather than delegating them to the background read-grant helper.
    - Reconciles persistent deny-read ACEs per sandbox principal so policy
    changes do not leave stale deny-read ACLs behind.
    - Fails closed on the unelevated restricted-token backend when deny-read
    overrides are present, because its `WRITE_RESTRICTED` token model is not
    authoritative for read denials.
    
    ## Landed prerequisites
    
    These prerequisite PRs are already on `main`:
    
    1. #15979 `feat(permissions): add glob deny-read policy support`
    2. #18096 `feat(sandbox): add glob deny-read platform enforcement`
    3. #17740 `feat(config): support managed deny-read requirements`
    
    This PR targets `main` directly and contains only the Windows deny-read
    enforcement layer.
    
    ## Implementation notes
    
    - Exact deny-read paths remain enforceable on the elevated path even
    when they do not exist yet: Windows materializes the missing path before
    applying the deny ACE, so the sandboxed command cannot create and read
    it during the same run.
    - Existing exact deny paths are preserved lexically until the ACL
    planner, which then adds the canonical target as a second ACL target
    when needed. That keeps both the configured alias and the resolved
    object covered.
    - Windows ACLs do not consume Codex glob syntax directly, so glob
    deny-read entries are expanded to the concrete matches that exist before
    process launch.
    - Glob traversal deduplicates directory visits within each pattern walk
    to avoid cycles, without collapsing distinct lexical roots that happen
    to resolve to the same target.
    - Persistent deny-read ACL state is keyed by sandbox principal SID, so
    cleanup only removes ACEs owned by the same backend principal.
    - Deny-read ACEs are fail-closed on the elevated path: setup aborts if
    mandatory deny-read ACL application fails.
    - Unelevated restricted-token sessions reject deny-read overrides early
    instead of running with a silently unenforceable read policy.
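    
    Two of the points above, failing closed on the unelevated backend and
    planning both the lexical and canonical ACL targets, can be sketched as
    follows. The enum, function names, and error text are hypothetical, and
    the canonical path is passed in rather than resolved so the example does
    not touch the filesystem.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum WindowsBackend {
        /// Unelevated restricted-token backend (cannot enforce read denials).
        RestrictedToken,
        /// Elevated/logon-user backend (applies deny-read ACEs).
        Elevated,
    }
    
    /// Rejects deny-read overrides early on a backend that cannot enforce them.
    fn check_deny_read_support(
        backend: WindowsBackend,
        deny_read: &[PathBuf],
    ) -> Result<(), String> {
        if backend == WindowsBackend::RestrictedToken && !deny_read.is_empty() {
            return Err("deny-read overrides require the elevated sandbox backend".to_string());
        }
        Ok(())
    }
    
    /// Plans ACL targets for one existing deny path: the configured lexical
    /// path, plus the canonical target when it differs (e.g. via a reparse point).
    fn plan_acl_targets(lexical: &Path, canonical: Option<&Path>) -> Vec<PathBuf> {
        let mut targets = vec![lexical.to_path_buf()];
        if let Some(resolved) = canonical {
            if resolved != lexical {
                targets.push(resolved.to_path_buf());
            }
        }
        targets
    }
    ```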
    
    ## Verification
    
    - `cargo test -p codex-core
    windows_restricted_token_rejects_unreadable_split_carveouts`
    - `just fmt`
    - `just fix -p codex-core`
    - `just fix -p codex-windows-sandbox`
    - GitHub Actions rerun is in progress on the pushed head.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Reapply "Move skills watcher to app-server" (#21652)
    ## Why
    
    PR #21460 reverted the earlier move of skills change watching from
    `codex-core` into app-server. This reapplies that boundary change so
    app-server owns client-facing `skills/changed` notifications and core no
    longer carries the watcher.
    
    ## What
    
    - Restore the app-server `SkillsWatcher` and register it from thread
    listener setup.
    - Remove the core-owned skills watcher and its core live-reload
    integration surface.
    - Restore app-server coverage for `skills/changed` notifications after a
    watched skill file changes.
    
    ## Validation
    
    - `cargo test -p codex-app-server --test all
    suite::v2::skills_list::skills_changed_notification_is_emitted_after_skill_change
    -- --exact --nocapture`
    - `cargo test -p codex-core --lib --no-run`
  • Enable --deny-warnings for cargo shear (#21616)
    ## Summary
    
    In https://github.com/openai/codex/pull/21584, we disabled doctests for
    crates that lack any doctests. We can enforce that property via `cargo
    shear --deny-warnings`: crates that lack doctests will be flagged if
    doctests are enabled, and crates with doctests will be flagged if
    doctests are disabled.
    
    A few additional notes:
    
    - By adding `--deny-warnings`, `cargo shear` also flagged a number of
    modules that were not reachable at all. Some of those have been removed.
    - This PR removes a usage of `windows_modules!` (since `cargo shear` and
    `rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os =
    "windows")]` attributes. As a consequence, many of these files exhibit churn
    in this PR, since they weren't being formatted by `rustfmt` at all on
    main.
    - Again, to make the code more analyzable, this PR also removes some
    usages of `#[path = "cwd_junction.rs"]` in favor of a more standard
    module structure. The bin sidecar structure is still retained, but,
    e.g., `windows-sandbox-rs/src/bin/command_runner.rs` was moved to
    `windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on.
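    
    For readers unfamiliar with the pattern, the sketch below shows plain
    `cfg`-gated module declarations of the kind that tools such as `rustfmt`
    can follow. The `windows_modules!` macro itself is not shown in this
    description, so this only illustrates the replacement style; the module
    and function names are hypothetical, and the modules are written inline
    to keep the example self-contained.
    
    ```rust
    // Compiled only on Windows; the module is a normal item that tooling can see.
    #[cfg(target_os = "windows")]
    mod platform {
        pub fn name() -> &'static str {
            "windows"
        }
    }
    
    // Fallback for every other target, so callers can use one path.
    #[cfg(not(target_os = "windows"))]
    mod platform {
        pub fn name() -> &'static str {
            "other"
        }
    }
    
    /// Callers reference `platform::name()` without any macro indirection.
    fn describe_platform() -> String {
        format!("running on: {}", platform::name())
    }
    ```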
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>