Commit Graph

6591 Commits

  • tui: keep cleared Fast tier from reappearing after side-thread resume (#23121)
    ## Why
    
    After turning Fast mode off in the TUI, returning from a side thread
    could make `Fast` appear again in the main chat widget. The opt-out
    itself was still persisted; the display was being rebuilt from stale
    cached `ThreadSessionState` data, which made it look like Fast had been
    re-enabled.
    
    Fixes #23104.
    
    ## What changed
    
    - Keep the active thread's cached `service_tier` in sync whenever the
    user persists a service-tier selection.
    - Update both the primary-thread snapshot and the thread event store so
    restored TUI state reflects the current tier.
    - Add a focused regression test for clearing a cached Fast tier.
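
To illustrate the sync described above, here is a minimal Python sketch (the actual change lives in the Rust TUI). Apart from `service_tier`, every class and method name is hypothetical. It only shows the idea that persisting a tier selection also updates both cached copies, so a later restore cannot bring back a stale `Fast` value.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CachedSession:
    """Hypothetical stand-in for cached per-thread session state."""
    service_tier: Optional[str] = None


@dataclass
class ThreadCaches:
    """Holds the two cached copies that restored state is rebuilt from (names assumed)."""
    primary_snapshot: CachedSession = field(default_factory=CachedSession)
    event_store: Dict[str, CachedSession] = field(default_factory=dict)

    def persist_service_tier(self, thread_id: str, tier: Optional[str]) -> None:
        # Writing the selection to config is out of scope here; the point is
        # that both cached copies are updated in the same step.
        self.primary_snapshot.service_tier = tier
        self.event_store.setdefault(thread_id, CachedSession()).service_tier = tier

    def restore(self, thread_id: str) -> Optional[str]:
        # Rebuild the displayed tier from cached state, as on side-thread return.
        return self.event_store.get(thread_id, self.primary_snapshot).service_tier
```

With this shape, clearing the tier (`None`) and then restoring the thread yields `None` instead of the previously cached value.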
    
    ## Manual repro
    
    1. Start a TUI session where `Fast` is enabled by default.
    2. Run `/fast` and turn Fast mode off. Confirm `Fast` disappears from
    the chat widget display.
    3. Re-enter thread navigation via either path:
       - Run `/side test`, then return to the main thread.
       - Run `/agent`, enter a child thread, then return to the main thread.
    4. Before this fix, `Fast` reappears in the main chat widget display
    even though the opt-out was already persisted.
    5. After this fix, `Fast` stays cleared.
    
    ## Verification
    
    - `cargo test -p codex-tui
    app::thread_session_state::tests::service_tier_sync_updates_active_cached_session
    -- --exact`
  • Emit goal update events from goal extension tools (#23306)
    ## Why
    
    Goal creation and completion are moving through the goal extension, but
    the rest of Codex still observes goal state through `ThreadGoalUpdated`
    events. Without an event from the extension-owned tool path, a
    model-initiated `create_goal` or `update_goal` can mutate the backend
    and return a tool result while app-server and TUI listeners miss the
    goal state transition.
    
    ## What changed
    
    - Added `GoalEventEmitter` as a small wrapper around the host
    `ExtensionEventSink` to build `EventMsg::ThreadGoalUpdated` events for
    goal updates.
    - Threaded the registry event sink into `GoalExtension` and the
    `GoalToolExecutor`s created by the extension. The public
    `GoalExtension::new` constructor keeps a `NoopExtensionEventSink`
    fallback for standalone use.
    - Emitted a goal update after successful `create_goal` and `update_goal`
    tool calls. Until `ToolCall` exposes the current turn submission id,
    these events use the tool call id as the event id and leave `turn_id`
    unset.
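
The emission rule above can be sketched in Python as follows. This is an illustration only, not the Rust implementation: the event dictionary layout, the `RecordingSink` helper, and the method signature are assumptions. What it mirrors from the description is that the tool call id is reused as the event id and `turn_id` stays unset.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Protocol


class EventSink(Protocol):
    """Fire-and-forget sink provided by the host (shape assumed)."""
    def emit(self, event: Dict[str, Any]) -> None: ...


class NoopSink:
    """Fallback for standalone use: accepts events and drops them."""
    def emit(self, event: Dict[str, Any]) -> None:
        pass


class RecordingSink:
    """Test helper that keeps every emitted event."""
    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    def emit(self, event: Dict[str, Any]) -> None:
        self.events.append(event)


@dataclass
class GoalEventEmitter:
    sink: EventSink

    def thread_goal_updated(self, tool_call_id: str, thread_id: str,
                            goal: Dict[str, Any]) -> None:
        self.sink.emit({
            "id": tool_call_id,  # tool call id stands in for the event id
            "msg": {
                "type": "thread_goal_updated",
                "thread_id": thread_id,
                "turn_id": None,  # unset until the turn submission id is available
                "goal": goal,
            },
        })
```

Calling `thread_goal_updated` only after a successful tool call keeps listeners from observing goal transitions that never reached the backend.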
    
    Relevant code:
    
    -
    [`GoalEventEmitter::thread_goal_updated`](https://github.com/openai/codex/blob/1fe2d73890df9a50996f67f705d4da4cc3d4b866/codex-rs/ext/goal/src/events.rs#L19-L32)
    - [`GoalToolExecutor` emission
    points](https://github.com/openai/codex/blob/1fe2d73890df9a50996f67f705d4da4cc3d4b866/codex-rs/ext/goal/src/tool.rs#L161-L190)
    
    ## Testing
    
    - `cargo test -p codex-goal-extension`
  • chore: make token usage async (#23305)
    Make the `TokenUsageContributor` async. This will be required for future
    extensions, and it is basically free.
  • chore: goal resumed metrics (#23301)
    Add metrics for goal resume
  • chore: isolate thread goal storage behind GoalStore (#23295)
    ## Why
    
    Thread goal persistence is being prepared for a dedicated storage
    boundary. Before that split, goal-specific reads, writes, accounting,
    and cleanup were exposed directly on `StateRuntime`, so core and
    app-server callsites stayed coupled to the full runtime instead of a
    goal-specific store.
    
    This PR introduces that boundary without changing the goal wire API or
    current persistence behavior. Callers now go through
    `StateRuntime::thread_goals()` and the new `GoalStore`, while
    `GoalStore` still uses the existing state DB pool underneath.
    
    ## What changed
    
    - Added `GoalStore` in `state/src/runtime/goals.rs` and exposed it from
    `StateRuntime` via `thread_goals()`.
    - Moved thread-goal reads, writes, status updates, pause, delete, and
    usage accounting onto `GoalStore`.
    - Updated core session goal handling, app-server goal RPCs, resume
    snapshots, and goal tests to use the store boundary.
    - Kept thread deletion responsible for cascading goal cleanup by
    deleting the goal through the store only after a thread row is removed.
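
A rough Python analogue of this boundary, using an in-memory SQLite database, might look like the sketch below. The table schema and every method other than `thread_goals()` are assumptions made for illustration. It shows the two properties named above: the store shares the runtime's existing connection, and goal cleanup happens only after a thread row was actually removed.

```python
import sqlite3
from typing import Optional


class GoalStore:
    """Goal-specific reads and writes over the runtime's existing connection."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def set_goal(self, thread_id: str, objective: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO thread_goals (thread_id, objective) VALUES (?, ?)",
            (thread_id, objective),
        )

    def get_goal(self, thread_id: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT objective FROM thread_goals WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return row[0] if row else None

    def delete_goal(self, thread_id: str) -> None:
        self._conn.execute("DELETE FROM thread_goals WHERE thread_id = ?", (thread_id,))


class StateRuntime:
    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE threads (id TEXT PRIMARY KEY)")
        self._conn.execute(
            "CREATE TABLE thread_goals (thread_id TEXT PRIMARY KEY, objective TEXT)"
        )
        self._goals = GoalStore(self._conn)

    def thread_goals(self) -> GoalStore:
        return self._goals

    def create_thread(self, thread_id: str) -> None:
        self._conn.execute("INSERT INTO threads (id) VALUES (?)", (thread_id,))

    def delete_thread(self, thread_id: str) -> bool:
        removed = self._conn.execute(
            "DELETE FROM threads WHERE id = ?", (thread_id,)
        ).rowcount
        if removed:
            # Cascade goal cleanup only when a thread row was really deleted.
            self._goals.delete_goal(thread_id)
        return bool(removed)
```

Callers reach goal state through `runtime.thread_goals()` rather than through goal methods on the runtime itself.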
    
    ## Testing
    
    - Existing goal persistence, resume, and accounting tests were updated
    to exercise the new `GoalStore` access path.
  • feat: add extension event sink capability (#23293)
    ## Why
    
    Extensions can already expose typed contributions and receive host
    capabilities such as `AgentSpawner`, but they do not have a typed way to
    send protocol events back through the host. Extensions that need to
    surface progress or status should not have to own persistence, ordering,
    transport fanout, or logging decisions themselves.
    
    ## What
    
    - Add `ExtensionEventSink`, a host-provided fire-and-forget sink for
    `codex_protocol::protocol::Event`.
    - Add `NoopExtensionEventSink` so hosts that do not expose extension
    event emission keep the existing empty-registry behavior.
    - Store the sink on `ExtensionRegistryBuilder` / `ExtensionRegistry`,
    with `with_event_sink(...)` and `event_sink()` accessors, and re-export
    the new capability from `codex-extension-api`.
    
    ## Testing
    
    - Not run locally; PR metadata/body update only.
  • Make extension lifecycle hooks async (#23291)
    ## Why
    
    Extension lifecycle hooks sit on the host/extension boundary, but the
    current trait surface only allows synchronous callbacks. That forces
    extensions that need to seed, rehydrate, observe, or flush
    extension-owned state during thread and turn transitions to either block
    inside the callback or move async work into separate host plumbing.
    
    This PR makes those lifecycle callbacks awaitable so extension
    implementations can perform async work directly at the lifecycle point
    where the host already has the relevant session, thread, or turn stores
    available.
    
    ## What changed
    
    - Makes `ThreadLifecycleContributor` and `TurnLifecycleContributor`
    async in `codex-extension-api`.
    - Awaits thread start/resume/stop and turn start/stop/abort lifecycle
    callbacks from `codex-core`.
    - Updates the guardian and memories extensions to implement the async
    lifecycle trait surface.
    - Updates the existing lifecycle tests to use async contributor
    implementations.
    - Adds `async-trait` to the crates that now expose or implement these
    async object-safe lifecycle traits.
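
For readers less familiar with the pattern, the following Python `asyncio` sketch shows what awaitable lifecycle callbacks buy. It is an analogue, not the Rust trait surface: the class names, method names, and the `run_turn` host function are all hypothetical.

```python
import asyncio
from typing import List


class TurnLifecycleContributor:
    """Hypothetical analogue of an async lifecycle contributor."""

    async def on_turn_start(self, turn_id: str) -> None:
        pass

    async def on_turn_stop(self, turn_id: str) -> None:
        pass


class FlushingContributor(TurnLifecycleContributor):
    def __init__(self, log: List[str]) -> None:
        self.log = log

    async def on_turn_start(self, turn_id: str) -> None:
        await asyncio.sleep(0)  # stands in for async seeding or rehydration
        self.log.append(f"start:{turn_id}")

    async def on_turn_stop(self, turn_id: str) -> None:
        await asyncio.sleep(0)  # stands in for an async flush
        self.log.append(f"stop:{turn_id}")


async def run_turn(contributors: List[TurnLifecycleContributor],
                   turn_id: str, log: List[str]) -> None:
    # The host awaits each callback, so extension work finishes at the
    # lifecycle point instead of being pushed into separate plumbing.
    for contributor in contributors:
        await contributor.on_turn_start(turn_id)
    log.append(f"work:{turn_id}")
    for contributor in contributors:
        await contributor.on_turn_stop(turn_id)
```

Because each callback is awaited in sequence, ordering between start, turn work, and stop stays deterministic even when a contributor does async work.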
    
    ## Testing
    
    - Existing `codex-core` lifecycle tests were updated to cover async
    implementations for thread stop and turn abort ordering.
  • chore: goal ext skeleton (#23288)
    Skeleton of `/goal` as an extension.
    Lots of follow-ups coming.
  • [codex] Add installed-plugin mention API (#22448)
    ## Summary
    - add app-server `plugin/installed` for mention-oriented plugin loading
    - return installed plugins plus explicitly requested install-suggestion
    rows
    - keep remote handling on installed-state data instead of the broad
    catalog listing path
    
    ## Why
    The `@` mention surface only needs plugins that are usable now, plus a
    small product-approved set of install suggestions. It does not need the
    full catalog-shaped `plugin/list` payload that the Plugins page uses.
    
    ## Validation
    - `just write-app-server-schema`
    - `just fmt`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core-plugins`
    - `cargo test -p codex-app-server --test all plugin_installed_`
    
    ## Notes
    - The package-wide `cargo test -p codex-app-server` run still hits an
    existing unrelated stack overflow in
    `in_process::tests::in_process_start_clamps_zero_channel_capacity`.
    - Companion webview PR: https://github.com/openai/openai/pull/915672
  • Densify and version memory summaries (#23148)
    ## Why
    
    `memory_summary.md` is injected into every session, so its value depends
    on staying compact, navigational, and easy to regenerate when the
    expected shape changes. The previous consolidation prompt encouraged a
    broad actionable inventory and allowed older summary structures to be
    patched in place, which makes it easier for stale or overly verbose
    summaries to keep accumulating.
    
    This change makes the summary format explicitly versioned and biases
    Phase 2 memory consolidation toward denser prompt-loaded context.
    
    ## What changed
    
    - Require `memory_summary.md` to begin with an exact `v1` header.
    - Teach consolidation to regenerate `memory_summary.md` from scratch
    when the header is missing or incompatible, while still allowing
    incremental updates to `MEMORY.md`.
    - Tighten the `memory_summary.md` instructions so it acts as a compact
    routing/index layer instead of a second handbook.
    - Lower `MEMORY_TOOL_DEVELOPER_INSTRUCTIONS_SUMMARY_TOKEN_LIMIT` from
    `5_000` to `2_500` so the runtime prompt budget matches the denser
    summary target.
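
The header check that drives regeneration could be sketched as below. The exact header text is not given in this description, so `EXPECTED_HEADER` is a placeholder and the function name is hypothetical. Only the rule itself comes from the list above: anything other than an exact match on the first line triggers a rebuild from scratch.

```python
# Placeholder value: the real v1 header text is not stated in this description.
EXPECTED_HEADER = "memory_summary: v1"


def needs_full_regeneration(summary_text: str) -> bool:
    """Return True when the summary must be rebuilt instead of patched in place."""
    lines = summary_text.splitlines()
    first_line = lines[0] if lines else ""
    # Exact comparison: a missing, older, or slightly different header all count
    # as incompatible.
    return first_line != EXPECTED_HEADER
```

An exact comparison is deliberately strict; it trades a few unnecessary regenerations for never patching a summary whose structure is unknown.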
    
    ## Verification
    
    Not run; this is a prompt/template update plus a prompt budget constant
    change.
  • Add exec-server websocket keepalive (#23226)
    ## Summary
    - send periodic websocket Ping frames from outbound exec-server
    websocket clients
    - cover direct exec-server websocket clients plus rendezvous
    harness/executor websocket connections
    - keep inbound axum-accepted exec-server websocket connections passive
    - add focused keepalive coverage for direct and relay websocket paths
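
As a language-neutral illustration of the outbound keepalive, here is a small Python `asyncio` sketch. The `FakeConnection` class, the `keepalive` function, and the interval are assumptions; the real clients send websocket Ping frames from Rust. The sketch only shows a periodic ping loop that stops promptly when asked.

```python
import asyncio


class FakeConnection:
    """Hypothetical connection that counts pings instead of sending frames."""

    def __init__(self) -> None:
        self.pings = 0

    async def ping(self) -> None:
        self.pings += 1


async def keepalive(conn: FakeConnection, interval_s: float,
                    stop: asyncio.Event) -> None:
    while not stop.is_set():
        try:
            # Wake early if stop is set; otherwise time out after one interval.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            await conn.ping()
```

Only the side that opened the connection would run such a loop; an accepting side stays passive and simply answers pings.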
    
    ## Validation
    - /Users/starr/code/openai/project/dotslash-gen/bin/bazel test
    //codex-rs/exec-server:exec-server-unit-tests
    --test_filter='websocket_connection_sends_keepalive_ping|harness_connection_sends_keepalive_ping|multiplexed_executor_sends_keepalive_ping'
    - /Users/starr/code/openai/project/dotslash-gen/bin/bazel test
    //codex-rs/exec-server:exec-server-relay-test
    --test_filter=multiplexed_remote_executor_routes_independent_virtual_streams
  • [codex] Accept string input for Python turns (#23162)
    ## Summary
    - Allow thread.turn and turn.steer, including async variants, to accept
    RunInput so plain strings work alongside typed input objects.
    - Export RunInput and update the SDK artifact generator so regenerated
    turn methods keep the same signature and normalization.
    - Update docs, examples, notebook cells, and tests to use string
    shorthand for text-only turns while keeping typed inputs for multimodal
    input.
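
A minimal sketch of the string shorthand, assuming a normalization step inside the SDK. `RunInput` is the exported name mentioned above, but its definition here, the `TextInput` and `ImageInput` classes, and `normalize_run_input` are hypothetical stand-ins for the SDK's real typed inputs.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextInput:
    text: str


@dataclass
class ImageInput:
    url: str


InputItem = Union[TextInput, ImageInput]
# Assumed definition: a plain string, one typed item, or a list of typed items.
RunInput = Union[str, InputItem, List[InputItem]]


def normalize_run_input(value: RunInput) -> List[InputItem]:
    """Turn any accepted input form into a list of typed items."""
    if isinstance(value, str):
        return [TextInput(text=value)]
    if isinstance(value, list):
        return list(value)
    return [value]
```

Text-only callers can then pass `"Summarize this repo"` directly, while multimodal callers keep building typed items.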
    
    ## Validation
    - uv run --extra dev ruff format .
    - uv run --extra dev ruff check --output-format=github .
    - python3 -m py_compile sdk/python/src/openai_codex/__init__.py
    sdk/python/src/openai_codex/api.py
    sdk/python/src/openai_codex/_inputs.py
    sdk/python/scripts/update_sdk_artifacts.py
    sdk/python/tests/test_public_api_signatures.py
    sdk/python/tests/test_app_server_streaming.py
    sdk/python/tests/test_app_server_turn_controls.py
    sdk/python/tests/test_real_app_server_integration.py
    - python3 -c "import json;
    json.load(open('sdk/python/notebooks/sdk_walkthrough.ipynb'))"
    - sdk/python/.venv/bin/python -c "import inspect, openai_codex; from
    openai_codex import Thread, AsyncThread, TurnHandle, AsyncTurnHandle,
    RunInput; funcs=[Thread.run, Thread.turn, AsyncThread.run,
    AsyncThread.turn, TurnHandle.steer, AsyncTurnHandle.steer]; assert
    all(inspect.signature(fn).parameters['input'].annotation == 'RunInput'
    for fn in funcs); assert RunInput is openai_codex.RunInput"
  • test: reduce core sandbox policy test setup (#23036)
    ## Why
    
    `SandboxPolicy` is a legacy compatibility shape, but several core tests
    still used it for ordinary turn setup even when the runtime path now
    carries `PermissionProfile`. With the first cleanup PR merged, this
    follow-up trims more core test scaffolding so remaining `SandboxPolicy`
    matches are easier to classify as production compatibility,
    legacy-boundary coverage, or explicit conversion tests.
    
    ## What Changed
    
    - Updated apply-patch handler and runtime tests to pass
    `PermissionProfile` directly.
    - Changed sandboxing test helpers to build permission profiles without
    first creating `SandboxPolicy` values.
    - Converted request-permissions integration turns to pass
    `PermissionProfile` through the test helper, leaving legacy sandbox
    projection at the `Op::UserTurn` boundary.
    - Converted unified exec integration helpers and direct turn submissions
    to use `PermissionProfile` values instead of `SandboxPolicy` setup.
    - Removed now-unused `SandboxPolicy` imports from the touched core
    tests.
    
    ## Test Plan
    
    - `just fmt`
    - `cargo test -p codex-core --lib tools::sandboxing::tests`
    - `cargo test -p codex-core --lib tools::runtimes::apply_patch::tests`
    - `cargo test -p codex-core --lib tools::handlers::apply_patch::tests`
    - `cargo test -p codex-core --lib unified_exec::process_manager::tests`
    - `cargo test -p codex-core --test all request_permissions::`
    - `cargo test -p codex-core --test all unified_exec::`
    - `just fix -p codex-core`
  • Make multi-agent v2 tool namespace configurable (#23147)
    ## Summary
    - Add `features.multi_agent_v2.tool_namespace` with config/schema
    validation for Responses-compatible namespace values.
    - Thread the resolved namespace into `ToolsConfig` for normal turns and
    review turns.
    - Wrap MultiAgentV2 tool specs and registry names in the configured
    namespace when namespace tools are supported, while falling back to the
    plain tool names when they are not.
    
    ## Validation
    - `just fmt`
    - `just write-config-schema`
    - `cargo test -p codex-features multi_agent_v2_feature_config --
    --nocapture`
    - `cargo test -p codex-core test_build_specs_multi_agent_v2 --
    --nocapture`
    - `cargo test -p codex-core multi_agent_v2_config -- --nocapture`
    - `cargo test -p codex-core
    multi_agent_v2_rejects_invalid_tool_namespace -- --nocapture`
    - `cargo test -p codex-tools`
    - `git diff --check`
  • [codex] Return TurnResult from Python turn handles (#23151)
    ## Why
    
    `TurnHandle.run()` returned the raw app-server `Turn`, whose live
    start/completed payloads do not include loaded `items`, so users saw
    empty `items` after starting a turn. That made the handle-based path
    behave differently from `Thread.run(...)`, and pushed examples toward
    persisted-thread reads plus helper extraction.
    
    This PR makes the run APIs standalone: starting a turn and running it
    returns collected turn data directly, or fails visibly when required
    stream events are missing.
    
    ## What Changed
    
    - Replaces the public `RunResult` export with `TurnResult`.
    - Adds turn metadata to `TurnResult`: `id`, `status`, `error`,
    `started_at`, `completed_at`, and `duration_ms`, alongside
    `final_response`, `items`, and `usage`.
    - Changes `TurnHandle.run()` and `AsyncTurnHandle.run()` to consume
    stream events with the same collector used by `Thread.run(...)`.
    - Exports `TurnError` from `openai_codex.types` for the new result
    shape.
    - Updates tests, examples, docs, and the walkthrough notebook to use
    `result.final_response` and `result.items` directly.
    - Removes persisted-thread helper paths and placeholder/skipped control
    flows from the public examples and notebook.
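
The result shape and the "fail visibly" behavior can be sketched as follows. The field names come from the list above; the field types, the event dictionaries, and the `collect_turn` function are assumptions for illustration and do not reflect the SDK's actual stream events.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class TurnResult:
    id: str
    status: str
    error: Optional[str] = None
    started_at: Optional[int] = None
    completed_at: Optional[int] = None
    duration_ms: Optional[int] = None
    final_response: Optional[str] = None
    items: List[Dict[str, Any]] = field(default_factory=list)
    usage: Optional[Dict[str, int]] = None


def collect_turn(events: Iterable[Dict[str, Any]]) -> TurnResult:
    """Consume stream events and build a standalone result."""
    items: List[Dict[str, Any]] = []
    usage: Optional[Dict[str, int]] = None
    for event in events:
        kind = event["type"]  # hypothetical event names below
        if kind == "item_completed":
            items.append(event["item"])
        elif kind == "usage":
            usage = event["usage"]
        elif kind == "turn_completed":
            turn = event["turn"]
            messages = [i["text"] for i in items if i.get("type") == "agent_message"]
            return TurnResult(
                id=turn["id"],
                status=turn["status"],
                final_response=messages[-1] if messages else None,
                items=items,
                usage=usage,
            )
    # No completion event arrived: fail instead of returning empty data.
    raise RuntimeError("stream ended before a turn_completed event")
```

Sharing one collector between the thread-level and handle-level run paths is what keeps their results identical.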
    
    ## Verification
    
    - `python3 -m py_compile ...` over changed SDK, example, and test Python
    files.
    - `python3 -c "import json;
    json.load(open('sdk/python/notebooks/sdk_walkthrough.ipynb'))"`
    - `git diff --check`
    - `PYTHONPATH=sdk/python/src python3 -c ...` import/signature smoke for
    `TurnResult`, `TurnHandle.run`, and `AsyncTurnHandle.run`.
  • sdk/python: add first-class login support (#23093)
    ## Why
    
    The Python SDK can already create threads and run turns, but
    authentication still has to be arranged outside the SDK. App-server
    already exposes account login, account inspection, logout, and
    `account/login/completed` notifications, so SDK users currently have to
    work around a missing public client layer for a core setup step.
    
    This change makes authentication a normal SDK workflow while preserving
    the backend flow shape: API-key login completes immediately, and
    interactive ChatGPT flows return live handles that complete later
    through app-server notifications.
    
    ## What changed
    
    - Added public sync and async auth methods on `Codex` / `AsyncCodex`:
      - `login_api_key(...)`
      - `login_chatgpt()`
      - `login_chatgpt_device_code()`
      - `account(...)`
      - `logout()`
    - Added public browser-login and device-code handle types with
    attempt-local `wait()` and `cancel()` helpers. Cancellation stays on the
    handle instead of a root-level SDK method.
    - Extended the Python app-server client and notification router so login
    completion events are routed by `login_id` without consuming unrelated
    global notifications.
    - Kept login request/handle logic in a focused internal `_login.py`
    module so `api.py` remains the public facade instead of absorbing more
    auth plumbing.
    - Exported the new handle types plus curated account/login response
    types from the SDK surfaces.
    - Updated SDK docs, added sync/async login walkthrough examples, and
    added a notebook login walkthrough cell.
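
The routing behavior described above (route by `login_id`, leave unrelated notifications alone, replay early completions) might be sketched like this. The `LoginCompletionRouter` class and the notification dictionary layout are hypothetical; only the `account/login/completed` method name and the `login_id` key come from the description.

```python
from typing import Any, Callable, Dict

Notification = Dict[str, Any]


class LoginCompletionRouter:
    def __init__(self) -> None:
        self._waiters: Dict[str, Callable[[Notification], None]] = {}
        self._early: Dict[str, Notification] = {}

    def register(self, login_id: str,
                 on_complete: Callable[[Notification], None]) -> None:
        # Replay a completion that arrived before the waiter was registered.
        early = self._early.pop(login_id, None)
        if early is not None:
            on_complete(early)
            return
        self._waiters[login_id] = on_complete

    def dispatch(self, notification: Notification) -> bool:
        """Return True if consumed, False to leave it for other handlers."""
        if notification.get("method") != "account/login/completed":
            return False
        login_id = notification["params"]["login_id"]
        waiter = self._waiters.pop(login_id, None)
        if waiter is None:
            self._early[login_id] = notification
        else:
            waiter(notification)
        return True
```

Keying waiters by `login_id` is also what keeps a replaced login attempt from being completed by a notification meant for its successor.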
    
    ## Verification
    
    Added SDK coverage for:
    
    - API-key login, account readback, and logout through the app-server
    harness in both sync and async clients.
    - Browser login cancellation plus `handle.wait()` completion through the
    real app-server boundary used by the Python SDK harness.
    - Waiter routing that stays scoped across replaced interactive login
    attempts, plus async handle cancellation coverage.
    - Login notification demuxing, replay of early completion events, and
    async client delegation.
    - Public export/signature assertions.
    - Real integration-suite smoke coverage for the new examples and
    notebook login cell.
  • [1 of 4] tui: route primary settings writes through app server (#22913)
    ## Why
    The TUI can run against a remote app server, but several high-traffic
    settings still persisted by editing the local config file. That sends
    remote sessions' preference writes to the wrong machine and lets local
    disk state drift from the app-server-owned config.
    
    This is **[1 of 4]** in a stacked series that moves TUI-owned config
    mutations onto app-server APIs.
    
    ## What changed
    - Added a small TUI helper for typed app-server config writes.
    - Routed primary interactive preference writes through
    `config/batchWrite`.
    - Preserved existing profile scoping for settings that already support
    `profiles.<profile>.*` overrides.
    
    ## Config keys affected
    - `model`
    - `model_reasoning_effort`
    - `personality`
    - `service_tier`
    - `plan_mode_reasoning_effort`
    - `approvals_reviewer`
    - `notice.fast_default_opt_out`
    - Profile-scoped equivalents under `profiles.<profile>.*`
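
The profile scoping rule for these keys can be shown with a tiny helper. The function name is hypothetical, and the sketch deliberately stops at building key paths because the request shape of `config/batchWrite` is not described here.

```python
from typing import Optional


def scoped_config_key(key: str, profile: Optional[str]) -> str:
    """Return the key path a write should target for the active profile."""
    if profile is None:
        return key
    return f"profiles.{profile}.{key}"
```

For example, with an active profile named `work` (an illustrative name), a `service_tier` write would target `profiles.work.service_tier`; without a profile it targets `service_tier` directly.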
    
    ## Suggested manual validation
    - Connect the TUI to a remote app server, change `model` and
    `model_reasoning_effort`, reconnect, and confirm the remote config
    retained both values while the local `config.toml` did not change.
    - Change `personality`, `plan_mode_reasoning_effort`, and the explicit
    auto-review selection, then reconnect and confirm those choices persist
    through the app server.
    - Clear the service tier back to default and confirm `service_tier` is
    cleared while `notice.fast_default_opt_out = true` is persisted
    remotely.
    - Repeat one setting change with an active profile and confirm the write
    lands under `profiles.<profile>.*`.
    
    ## Stack
    1. [#22913](https://github.com/openai/codex/pull/22913) `[1 of 4]`
    primary settings writes
    2. [#22914](https://github.com/openai/codex/pull/22914) `[2 of 4]` app
    and skill enablement
    3. [#22915](https://github.com/openai/codex/pull/22915) `[3 of 4]`
    feature and memory toggles
    4. [#22916](https://github.com/openai/codex/pull/22916) `[4 of 4]`
    startup and onboarding bookkeeping
  • multiagent: trim model-visible description, cap to 5 models (#23069)
    ## Why
    
    The `spawn_agent` model override guidance is uncapped and bloating
    context. We need to trim down each entry and cap total entries.
    
    We picked 5 as the cap; this can be changed later.
    
    ## What changed
    
    - Cap the model override summaries shown in `spawn_agent` to the first 5
    picker-visible models, preserving the existing priority ordering from
    the models manager.
    - Condense each rendered entry to the actionable pieces the model needs:
      - use the model slug as the label
      - render compact reasoning effort lists with the default marked inline
      - render only service tier IDs, and omit the clause when no tiers are
      available
    - Update coverage so the compact formatter shape and the top-5 cap are
    exercised, and keep the end-to-end request assertion aligned with real
    model metadata.
    
    ## Example
    
    Before:
    
    `- gpt-5.4 ('gpt-5.4'): Strong model for everyday coding. Default
    reasoning effort: medium. Supported reasoning efforts: low (Fast
    responses with lighter reasoning), medium (Balances speed and reasoning
    depth for everyday tasks), high (Greater reasoning depth for complex
    problems), xhigh (Extra high reasoning depth for complex problems).
    Supported service tiers: priority (Fast: 1.5x speed, increased usage).`
    
    After:
    
    `- 'gpt-5.4': Strong model for everyday coding. Reasoning efforts: low,
    medium (default), high, xhigh. Service tiers: priority.`
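
A compact formatter that reproduces the "After" line could look like the Python sketch below. The `ModelInfo` fields and function names are assumptions for illustration; the output format and the cap of 5 follow the description and example above.

```python
from dataclasses import dataclass, field
from typing import List

MAX_MODELS = 5  # cap on rendered entries


@dataclass
class ModelInfo:
    slug: str
    description: str
    efforts: List[str]
    default_effort: str
    service_tiers: List[str] = field(default_factory=list)


def format_entry(model: ModelInfo) -> str:
    # Mark the default effort inline instead of describing each effort.
    efforts = ", ".join(
        f"{effort} (default)" if effort == model.default_effort else effort
        for effort in model.efforts
    )
    line = f"- '{model.slug}': {model.description} Reasoning efforts: {efforts}."
    if model.service_tiers:  # omit the clause when no tiers are available
        line += f" Service tiers: {', '.join(model.service_tiers)}."
    return line


def format_model_overrides(models: List[ModelInfo]) -> str:
    # Input order is assumed to already reflect the models manager's priority.
    return "\n".join(format_entry(model) for model in models[:MAX_MODELS])
```

Slicing before formatting means entries past the cap cost nothing, and the existing priority order decides which five survive.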
  • [codex] preserve MCP result meta in McpToolCallItemResult (#22946)
    ## Summary
    
    https://openai.slack.com/archives/C0ARA9UAQEA/p1778890981647319?thread_ts=1778888537.934319&cid=C0ARA9UAQEA
    
    
    - Add `_meta` to exec JSONL MCP tool call result events.
    - Copy MCP result metadata through the JSONL event conversion.
    - Add a focused test that verifies `_meta` is serialized as `_meta` and
    not `meta`.
    
    
    ## Verification
    
    https://www.notion.so/openai/Miaolin-0516-_meta-population-debug-3628e50b62b08074b365e0ce1ffb8f74
  • exec-server: support auth-backed remote executor registration (#22769)
    This updates remote `exec-server` registration to use normal Codex auth
    instead of a registry-issued credential. The registry request is built
    from the existing auth-provider path, which preserves the biscuit-only
    registry contract introduced in
    [openai/openai#924101](https://github.com/openai/openai/pull/924101)
    while removing the old remote registry bearer env var and its direct
    transport assumptions.
    
    The default remote flow uses persisted ChatGPT auth from the normal
    Codex config/storage path. This PR also includes the containerized Agent
    Identity path needed by
    [openai/openai#924260](https://github.com/openai/openai/pull/924260):
    remote `exec-server` accepts `--allow-agent-identity-auth`, permits
    Agent Identity auth loaded from `CODEX_ACCESS_TOKEN` only when that flag
    is present, and reuses the existing Agent task registration plus derived
    `AgentAssertion` header generation. API-key auth remains unsupported,
    and Agent Identity stays opt-in.
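
The gating rules in the paragraph above reduce to a small decision function. The sketch below is illustrative only: the function name and the mode strings are hypothetical, and the boolean corresponds to whether `--allow-agent-identity-auth` was passed.

```python
def remote_auth_allowed(auth_mode: str, allow_agent_identity: bool) -> bool:
    """Decide whether an auth mode may be used for remote registration."""
    if auth_mode == "chatgpt":
        # Default flow: persisted ChatGPT auth.
        return True
    if auth_mode == "agent_identity":
        # Opt-in: only permitted when the flag is present.
        return allow_agent_identity
    # API-key auth (and anything unrecognized) remains unsupported.
    return False
```

Defaulting to rejection keeps any future auth mode disallowed for remote registration until it is explicitly added.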
    
    Validation performed beyond normal presubmit coverage:
    - `cargo fmt --all --check`
    - `cargo check -p codex-cli`
    - `cargo test -p codex-exec-server`
    - `cargo test -p codex-cli exec_server_agent_identity_auth_flag_`
    - `cargo test -p codex-cli remote_exec_server_auth_mode_`
    
    I also attempted `cargo test -p codex-cli`. The new CLI tests passed
    inside that run, but the suite ended on an unrelated local
    marketplace-state failure in
    `plugin_list_excludes_unconfigured_repo_local_marketplaces`.
  • test: construct permission profiles directly (#23030)
    ## Why
    
    `SandboxPolicy` is now a legacy compatibility shape, but several tests
    still built a `SandboxPolicy` only to immediately convert it into
    `PermissionProfile` for APIs that already accept canonical runtime
    permissions. Those detours make it harder to audit where legacy sandbox
    policy is still required, because boundary-only usages are mixed
    together with ordinary test setup.
    
    ## What Changed
    
    - Updated tests in `codex-core`, `codex-exec`, `codex-analytics`, and
    `codex-config` to construct `PermissionProfile` values directly when the
    code under test takes a permission profile.
    - Changed exec-policy, request-permissions, session, and sandbox test
    helpers to pass `PermissionProfile` through instead of converting from
    `SandboxPolicy` internally.
    - Left `SandboxPolicy` in place where tests are explicitly exercising
    legacy compatibility or request/response boundaries.
    
    ## Test Plan
    
    - `cargo test -p codex-analytics -p codex-config`
    - `cargo test -p codex-core --lib safety::tests`
    - `cargo test -p codex-core --lib exec_policy::tests::`
    - `cargo test -p codex-core --lib exec::tests`
    - `cargo test -p codex-core --lib guardian_review_session_config`
    - `cargo test -p codex-core --lib tools::network_approval::tests`
    - `cargo test -p codex-core --lib
    tools::runtimes::shell::unix_escalation::tests`
    - `cargo test -p codex-core --lib managed_network`
    - `cargo test -p codex-core --test all request_permissions::`
    - `cargo test -p codex-exec sandbox`
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23030).
    * #23036
    * __->__ #23030
  • Improve goal completion usage reporting (#22907)
    ## Why
    
    Goal completion follow-up turns currently receive a preformatted English
    usage sentence such as `time used: 2586 seconds`. That nudges the model
    to echo an awkward raw seconds count in the final reply, even though the
    tool result already exposes structured usage fields like
    `goal.timeUsedSeconds`, `goal.tokensUsed`, and `goal.tokenBudget`.
    
    ## What changed
    
    - Replace the preformatted completion usage sentence with guidance to
    read the structured goal fields from the tool result.
    - Preserve token-budget reporting while allowing the model to phrase
    elapsed time in a concise, human-friendly way that fits the response
    language.
    - Update core coverage for both the generated completion guidance and
    the session flow that forwards it back to the model.
    
    ## Verification
    
    Previously, it would have output a final message indicating that it
    "worked for 303 seconds". Now it shows the following:
    
    <img width="286" height="35" alt="image"
    src="https://github.com/user-attachments/assets/d7011880-9449-46a7-856f-4e50ae00eb45"
    />
  • [codex] Split Python SDK helper logic (#22939)
    ## Summary
    - Move approval-mode mapping into
    `sdk/python/src/openai_codex/_approval_mode.py`.
    - Move initialize metadata parsing and normalization into
    `sdk/python/src/openai_codex/_initialize_metadata.py`.
    - Keep the public `ApprovalMode` export stable and retarget direct
    metadata helper coverage.
    
    ## Integration coverage
    - Add an app-server harness smoke that exercises sync and async SDK
    initialization plus thread creation.
    
    ## Validation
    - Local tests were not run per repo guidance. CI should validate this
    branch once the PR is online.
  • core: set permission profiles from snapshots (#22920)
    ## Why
    
    #22891 moved the TUI turn-command path to pass `ActivePermissionProfile`
    instead of the full `PermissionProfile`, but the remaining
    config/session bridge still accepted the concrete `PermissionProfile`
    and active profile id as separate arguments. That shape made it too easy
    for future callers to update the concrete profile and active profile id
    out of sync.
    
    This PR makes the trusted session snapshot path pass one coherent value
    into `Permissions`, while keeping `requirements.toml` enforcement owned
    by the existing constrained permission state.
    
    ## What Changed
    
    - Added `PermissionProfileSnapshot` as the public snapshot value for
    trusted session/config synchronization.
    - Changed `Permissions::set_permission_profile_from_session_snapshot()`
    and `replace_permission_profile_from_session_snapshot()` to take a
    `PermissionProfileSnapshot`.
    - Updated the replacement path to derive its constrained
    `PermissionProfile` from the snapshot, so callers cannot pass a separate
    profile that disagrees with the snapshot.
    - Removed the internal tuple-style
    `PermissionProfileState::set_active_permission_profile()` mutation path.
    - Updated core session projection and TUI call sites to construct
    explicit legacy or active snapshots.
    - Documented the snapshot constructors so legacy use and id/profile
    mismatch hazards are called out at the API boundary.
    - Added a focused config test that verifies snapshot updates still
    respect existing permission constraints.
    
    ## How To Review
    
    1. Start with `codex-rs/core/src/config/resolved_permission_profile.rs`;
    `PermissionProfileSnapshot` is the public wrapper, while
    `ResolvedPermissionProfile` stays internal.
    2. Check `codex-rs/core/src/config/mod.rs` to confirm both
    session-snapshot setters validate through `PermissionProfileState` and
    no longer accept loose profile/id pairs.
    3. Skim `codex-rs/core/src/session/session.rs` for the session
    projection path; it now builds the snapshot before installing it.
    4. Skim the TUI changes as call-site migration from loose argument pairs
    to explicit snapshot construction.
    
    ## Verification
    
    - `cargo test -p codex-core
    permission_snapshot_setter_preserves_permission_constraints`
    - `cargo test -p codex-tui status_permissions_`
    - `cargo test -p codex-tui
    session_configured_preserves_profile_workspace_roots`
    - `just fix -p codex-core -p codex-tui`
  • Fix Windows doctor npm root probe (#22967)
    ## Why
    On Windows, npm-managed installs expose the working shim as `npm.cmd`.
    `codex doctor` probed bare `npm`, which could incorrectly report that
    npm global-root inspection was unavailable even when the install was
    healthy.
    
    Fixes #22964.
    
    ## What changed
    - Use `npm.cmd` for the doctor npm-root probe on Windows.
    - Keep the existing `npm` probe on non-Windows platforms.
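
The platform switch is small enough to show in full. This Python sketch is an analogue of the Rust change, with a hypothetical function name; it relies on `sys.platform` being `"win32"` on Windows.

```python
import sys


def npm_program(platform: str = sys.platform) -> str:
    """Pick the npm executable name to probe on the given platform."""
    # On Windows the working shim is npm.cmd; elsewhere bare npm is used.
    return "npm.cmd" if platform.startswith("win") else "npm"
```

Taking the platform as a parameter makes both branches testable from any host.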
  • [codex] Refine Python SDK user-facing docs (#22941)
    ## Summary
    - Remove maintainer and release-process wording from the Python SDK
    README and docs.
    - Rewrite SDK-facing comments/docstrings so they read as standalone
    product documentation.
    - Add a real app-server integration smoke that follows the public
    quickstart-style `Codex() -> thread_start() -> run()` path.
    
    ## Integration coverage
    - Add `test_real_quickstart_style_flow_smoke` in the real app-server
    integration suite.
    
    ## Validation
    - Local tests were not run per repo guidance. CI should validate this
    branch once the PR is online.
  • app-server-protocol: remove PermissionProfile from API (#22924)
    ## Why
    
    The app server API should expose permission profile identity, not the
    lower-level runtime permission model. `PermissionProfile` is the
    compiled sandbox/network representation that the server uses internally;
    exposing it through app-server-protocol forces clients to understand
    details that should remain implementation-level.
    
    The API boundary should prefer `ActivePermissionProfile`: a stable
    profile id, plus future parent-profile metadata, that clients can pass
    back when they want to select the same active permissions. This also
    avoids schema generation collisions between the app-server v2 API type
    space and the core protocol model.
    
    Incidentally, while this PR makes a number of changes to `command/exec`, note
    that we are hoping to deprecate this API in favor of `process/spawn`, so
    we don't need to be too finicky about these changes.
    
    ## What Changed
    
    - Removed `PermissionProfile` from the app-server-protocol API surface,
    including generated schema and TypeScript exports.
    - Changed `CommandExecParams.permissionProfile` to
    `ActivePermissionProfile`.
    - Resolve command exec profile ids through `ConfigManager` for the
    command cwd, matching turn override selection semantics.
    - Updated downstream TUI tests/helpers to use core permission types
    directly instead of app-server-protocol `PermissionProfile` shims.
  • tui: pass active permission profiles through app commands (#22891)
    ## Why
    
    This continues the permissions migration by keeping the TUI command
    boundary aligned with the app-server protocol direction from #22795:
    callers should select a permission profile by id instead of passing a
    concrete `PermissionProfile` value around as the turn configuration.
    
    `AppCommand` is internal to the TUI, but it is the path that eventually
    becomes `thread/turn/start`, so carrying concrete profile details there
    made it too easy for UI code to keep relying on the old whole-profile
    replacement model.
    
    ## What changed
    
    - `AppCommand::UserTurn` and `AppCommand::OverrideTurnContext` now carry
    `Option<ActivePermissionProfile>` instead of `PermissionProfile`.
    - Composer submissions copy the active permission profile id from the
    current session snapshot; legacy snapshots intentionally submit no
    active profile id.
    - Permission preset UI events now carry only the active built-in profile
    id. The app derives the concrete built-in `PermissionProfile` internally
    only when updating its local config/status snapshot.
    - Permission presets expose their built-in active profile id, and preset
    selection preserves that id in both the immediate turn override and the
    local TUI config snapshot.
    - Turn routing sends `TurnPermissionsOverride::ActiveProfile` when an
    active id is present, and only falls back to the legacy sandbox
    projection for the remaining runtime override path.
    
    ## How to review
    
    Start with `codex-rs/tui/src/app_command.rs` to verify the command shape
    no longer exposes `PermissionProfile`.
    
    Then read `codex-rs/tui/src/app/thread_routing.rs` to verify the
    app-server turn-start conversion: active ids go through as ids, while
    the legacy sandbox fallback is still constrained to the existing runtime
    override case.
    
    Finally, check `codex-rs/tui/src/chatwidget/permission_popups.rs`,
    `codex-rs/tui/src/app/event_dispatch.rs`,
    `codex-rs/tui/src/app/config_persistence.rs`, and
    `codex-rs/utils/approval-presets/src/lib.rs` to see how preset
    selections stay id-only across TUI events while the local display/config
    mirror still gets a concrete built-in profile.
    
    ## Verification
    
    Latest local verification after the id-only `AppEvent` cleanup:
    
    - `cargo check -p codex-tui --tests`
    - `cargo test -p codex-tui
    permissions_selection_sends_approvals_reviewer_in_override_turn_context`
    - `cargo test -p codex-tui update_feature_flags_enabling_guardian`
    - `cargo test -p codex-utils-approval-presets`
    - `just fmt`
    - `just fix -p codex-tui -p codex-utils-approval-presets`
    
    Earlier in the same PR, before the final event-shape cleanup:
    
    - `cargo test -p codex-tui turn_permissions_`
    - `cargo test -p codex-tui submission_`
    - `cargo test -p codex-tui
    session_configured_syncs_widget_config_permissions_and_cwd`
    - `RUST_MIN_STACK=16777216 cargo test -p codex-tui`
  • Preserve image detail in app-server inputs (#20693)
    ## Summary
    
    - Add optional image detail to user image inputs across core, app-server
    v2, thread history/event mapping, and the generated app-server
    schemas/types.
    - Preserve requested detail when serializing Responses image inputs:
    omitted detail stays on the existing `high` default, while explicit
    `original` keeps local images on the original-resolution path.
    - Support `high`/`original` consistently for tool image outputs,
    including MCP `codex/imageDetail`, code-mode image helpers, and
    `view_image`.
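    
    The defaulting rule above can be sketched as follows. `ImageDetail`,
    `effective_detail`, and `wire_value` are hypothetical stand-ins for
    illustration, not the actual protocol types.
    
    ```rust
    /// Stand-in for the two detail levels named in this change.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ImageDetail {
        High,
        Original,
    }
    
    /// An omitted detail keeps the existing `high` default; an explicit
    /// value is preserved as requested.
    fn effective_detail(requested: Option<ImageDetail>) -> ImageDetail {
        requested.unwrap_or(ImageDetail::High)
    }
    
    /// String form used when serializing the image input.
    fn wire_value(detail: ImageDetail) -> &'static str {
        match detail {
            ImageDetail::High => "high",
            ImageDetail::Original => "original",
        }
    }
    ```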
  • [codex] Soften SQLite metadata sync failures (#22899)
    ## Summary
    - keep transcript-derived local thread metadata SQLite failures
    best-effort
    - preserve hard failures for explicit git-only metadata updates that
    still require SQLite state
    - add regression coverage for the soft-vs-hard metadata update policy
    
    ## Root cause
    The live thread metadata sync introduced after v0.131.0-alpha.8 moved
    append-derived metadata writes above the rollout writer. Those SQLite
    writes now propagated through the live thread flush path, so a corrupted
    optional state DB could surface as a transcript persistence warning even
    when JSONL writes still succeeded.
    
    The hard failures were introduced in #22236.
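    
    A rough sketch of the soft-vs-hard policy, assuming a simple
    two-variant classification. `MetadataUpdate` and `apply_sqlite_result`
    are hypothetical names, and errors are plain strings for brevity.
    
    ```rust
    /// Which kind of metadata write produced the SQLite result.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum MetadataUpdate {
        /// Derived from transcript appends; SQLite is best-effort here.
        TranscriptDerived,
        /// Explicit git-only update; still requires SQLite state.
        GitOnly,
    }
    
    /// Swallow failures for transcript-derived writes, propagate them for
    /// git-only updates.
    fn apply_sqlite_result(
        kind: MetadataUpdate,
        result: Result<(), String>,
    ) -> Result<(), String> {
        match (kind, result) {
            (_, Ok(())) => Ok(()),
            (MetadataUpdate::TranscriptDerived, Err(err)) => {
                eprintln!("ignoring best-effort metadata sync failure: {err}");
                Ok(())
            }
            (MetadataUpdate::GitOnly, Err(err)) => Err(err),
        }
    }
    ```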
  • feat(app-server): update remote control APIs for better UX (#22877)
    ## Why
    To help improve the `codex remote-control` CLI UX, which I plan to do in a
    follow-up, this PR adds `server-name` to the various remote control APIs:
    - `remoteControl/enable`
    - `remoteControl/disable`
    - `remoteControl/status/changed`
    
    It also adds a `remoteControl/status/read` API, which will be helpful in
    the Codex App.
  • Disable DMG staging for signed macOS promotion (#22900)
    ## Why
    `promote_signed` is now used to finish a release from an externally
    signed macOS handoff, but this release path (temporarily) no longer
    distributes DMGs. Keeping DMG staging enabled made the handoff
    unnecessarily require DMG assets and notarization/stapling validation
    even though the promoted release only needs the signed macOS binaries.
    
    ## What changed
    - Set every `stage-signed-macos` matrix entry to `build_dmg: "false"`,
    including the primary macOS bundles.
    - Kept the existing DMG staging branch in place behind
    `matrix.build_dmg` so it can be re-enabled deliberately later.
    - Updated the workflow header comment so the signed handoff contract
    asks for signed binaries, not signed DMGs.
    
    The regular signed build path that creates, signs, notarizes, and stages
    DMGs is unchanged; this only affects the `promote_signed` handoff path.
  • core: construct test permission profiles directly (#22795)
    ## Why
    
    The core migration is trying to make `PermissionProfile` the shape tests
    and runtime code reason about, leaving `SandboxPolicy` only where legacy
    behavior is explicitly under test. The local
    `permission_profile_for_sandbox_policy()` test helpers kept new
    permission-profile tests mentally tied to the old sandbox model even
    when the equivalent profile is straightforward.
    
    ## What Changed
    
    - Removed the `permission_profile_for_sandbox_policy()` helper from the
    network proxy spec tests and session tests.
    - Replaced legacy conversions for read-only, workspace-write, and
    full-access cases with `PermissionProfile::read_only()`,
    `PermissionProfile::workspace_write()`, and
    `PermissionProfile::Disabled`.
    - Constructed the external-sandbox session test's
    `PermissionProfile::External` directly, while preserving the legacy
    `SandboxPolicy` only where the test still exercises legacy config update
    behavior.
    
    ## How To Review
    
    This PR is intentionally test-only. Review the two touched files and
    check that each replacement preserves the old legacy mapping:
    
    - `SandboxPolicy::new_read_only_policy()` ->
    `PermissionProfile::read_only()`
    - `SandboxPolicy::new_workspace_write_policy()` ->
    `PermissionProfile::workspace_write()`
    - `SandboxPolicy::DangerFullAccess` -> `PermissionProfile::Disabled`
    - `SandboxPolicy::ExternalSandbox { network_access: Restricted }` ->
    `PermissionProfile::External { network: Restricted }`
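    
    For reference, the mapping above written out against simplified
    stand-in enums. These are not the real `codex-core` types (the real
    `PermissionProfile` uses constructors such as `read_only()`); the
    variant shapes and the `legacy_mapping` helper here are assumptions
    for illustration only.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum NetworkAccess {
        Restricted,
        Enabled,
    }
    
    /// Simplified stand-in for the legacy policy.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum SandboxPolicy {
        ReadOnly,
        WorkspaceWrite,
        DangerFullAccess,
        ExternalSandbox { network_access: NetworkAccess },
    }
    
    /// Simplified stand-in for the permission profile.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum PermissionProfile {
        ReadOnly,
        WorkspaceWrite,
        Disabled,
        External { network: NetworkAccess },
    }
    
    /// The legacy mapping each replacement is expected to preserve.
    fn legacy_mapping(policy: SandboxPolicy) -> PermissionProfile {
        match policy {
            SandboxPolicy::ReadOnly => PermissionProfile::ReadOnly,
            SandboxPolicy::WorkspaceWrite => PermissionProfile::WorkspaceWrite,
            SandboxPolicy::DangerFullAccess => PermissionProfile::Disabled,
            SandboxPolicy::ExternalSandbox { network_access } => {
                PermissionProfile::External { network: network_access }
            }
        }
    }
    ```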
    
    ## Verification
    
    - `cargo test -p codex-core
    requirements_allowed_domains_are_a_baseline_for_user_allowlist`
    - `cargo test -p codex-core
    start_managed_network_proxy_applies_execpolicy_network_rules`
    - `cargo test -p codex-core
    session_configured_reports_permission_profile_for_external_sandbox`
    - `cargo test -p codex-core
    managed_network_proxy_decider_survives_full_access_start`
    - `just fix -p codex-core`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22795).
    * #22891
    * __->__ #22795
  • app-server: stop returning thread permission profiles (#22792)
    ## Why
    
    The app-server thread lifecycle API should no longer expose the full
    `PermissionProfile` value. After the permissions-profile migration,
    clients should round-trip only the active profile identity through
    `activePermissionProfile` and `permissions` when that identity is known.
    
    The full profile is server-side config. Treating a response-derived
    legacy sandbox projection as a new local profile can lose named-profile
    restrictions and accidentally widen permissions on the next turn. The
    legacy `sandbox` response field remains only as the
    compatibility/display fallback.
    
    ## What Changed
    
    - Removed `permissionProfile` from `ThreadStartResponse`,
    `ThreadResumeResponse`, and `ThreadForkResponse`.
    - Stopped populating that field in app-server thread start/resume/fork
    responses.
    - Updated embedded exec/TUI response mapping to derive display
    permission state from local config or the legacy sandbox fallback
    instead of a response profile value.
    - Added a TUI turn override shape that distinguishes preserving server
    permissions, selecting an active profile id, and sending a legacy
    sandbox for an explicit local override.
    - Preserved remote app-server permissions across turns by sending
    `permissions` only when an `activePermissionProfile` id is known, and
    otherwise sending no sandbox override unless the user selected a local
    override.
    - Kept embedded `thread/resume` hydration server-authored when
    `activePermissionProfile` is absent, which matches the live-thread
    attach path where the server ignores requested overrides.
    - Updated the app-server README to remove the obsolete lifecycle
    response `permissionProfile` reference. The remaining
    `permissionProfile` README references are request-side permission
    overrides.
    - Regenerated app-server JSON schema and TypeScript fixtures.
    - Kept the generated typed response enum exempt from
    `large_enum_variant`, matching the existing payload enum exemption after
    the lifecycle response variants shrank.
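    
    The three-way turn override described above might look roughly like
    this. Only the `ActiveProfile` variant name appears elsewhere in this
    stack; `PreserveServer`, `LegacySandbox`, the string payloads, and
    `choose_override` are hypothetical simplifications.
    
    ```rust
    /// What the next turn should send for permissions.
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum TurnPermissionsOverride {
        /// Send nothing; the server keeps its current permissions.
        PreserveServer,
        /// Select a profile by id.
        ActiveProfile(String),
        /// Explicit local override, projected to a legacy sandbox value.
        LegacySandbox(String),
    }
    
    /// A known active profile id wins; a legacy sandbox is sent only for an
    /// explicit local override; otherwise server permissions are preserved.
    fn choose_override(
        active_profile_id: Option<&str>,
        local_sandbox_override: Option<&str>,
    ) -> TurnPermissionsOverride {
        match (active_profile_id, local_sandbox_override) {
            (Some(id), _) => TurnPermissionsOverride::ActiveProfile(id.to_string()),
            (None, Some(sandbox)) => {
                TurnPermissionsOverride::LegacySandbox(sandbox.to_string())
            }
            (None, None) => TurnPermissionsOverride::PreserveServer,
        }
    }
    ```
    
    Keeping `PreserveServer` as its own case is what avoids turning a
    response-derived sandbox projection into an accidental override.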
    
    ## How To Review
    
    Start with `codex-rs/app-server-protocol/src/protocol/v2/thread.rs` to
    confirm the response shape, then check the response construction in
    `codex-rs/app-server/src/request_processors`. The generated schema and
    TypeScript fixture changes are mechanical follow-through from the
    protocol removal.
    
    The TUI behavior is the delicate part: review
    `codex-rs/tui/src/app_server_session.rs` for response hydration and
    turn-start override projection, then
    `codex-rs/tui/src/app/thread_routing.rs` for the decision about whether
    the next turn should preserve the server snapshot, send an active
    profile id, or send a legacy sandbox for an explicit local override.
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol
    thread_lifecycle_responses_default_missing_optional_fields`
    - `cargo test -p codex-exec
    session_configured_from_thread_response_uses_permission_profile_from_config`
    - `cargo test -p codex-tui --lib thread_response`
    - `cargo test -p codex-tui turn_permissions_`
    - `cargo test -p codex-tui
    resume_response_restores_turns_from_thread_items`
    - `cargo test -p codex-analytics
    track_response_only_enqueues_analytics_relevant_responses`
    - `just fix -p codex-analytics`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-tui`
    - `just argument-comment-lint`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22792).
    * #22795
    * __->__ #22792
  • Forward apps MCP product SKU from Codex config (#22872)
    This adds `apps_mcp_product_sku` as a top-level config.toml key. We pass
    the given value as a header when listing MCPs for the client, allowing
    connectors to be filtered per product entry point.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • telemetry: tag sandboxes from permission profiles (#22791)
    ## Why
    
    Sandbox telemetry tags should be derived from the active permission
    profile, not from a legacy `SandboxPolicy`, so the tagging code stays
    aligned with the permissions migration and does not preserve a
    policy-shaped production helper only for tests.
    
    ## What Changed
    
    - Removed the production `sandbox_tag(&SandboxPolicy, ...)` helper.
    - Updated sandbox tag tests to construct the relevant
    `PermissionProfile` values directly.
    - Kept the platform-specific sandbox tag behavior under the existing
    `permission_profile_sandbox_tag` path.
    
    ## How To Review
    
    The production change is in `codex-rs/core/src/sandbox_tags.rs`. Most of
    the diff is test cleanup that replaces legacy policy setup with
    permission profiles, so review the expected tag assertions rather than
    the old helper mechanics.
    
    ## Verification
    
    - `cargo test -p codex-core sandbox_tag`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22791).
    * #22795
    * #22792
    * __->__ #22791
  • context: remove legacy permissions instructions helper (#22790)
    ## Why
    
    The permissions instruction builder should consume the new permissions
    model directly. Keeping a `SandboxPolicy` conversion helper in this path
    encourages new code to route through legacy sandbox policy values even
    when the caller already has a `PermissionProfile`.
    
    ## What Changed
    
    - Removed `PermissionsInstructions::from_policy`.
    - Removed the test that exercised that legacy helper.
    - Left the existing profile-based instruction coverage in place.
    
    ## How To Review
    
    Review `codex-rs/core/src/context/permissions_instructions.rs` first.
    This PR is intentionally narrow: the production behavior should be
    unchanged for profile callers, and the deleted surface was only a
    convenience adapter from `SandboxPolicy`.
    
    ## Verification
    
    - `cargo test -p codex-core builds_permissions_from_profile`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22790).
    * #22795
    * #22792
    * #22791
    * __->__ #22790
  • Ignore configured hooks in git helpers (#22843)
    ## What
    - Internal Git helper commands now ignore configured hook directories
    during repository bookkeeping.
    
    ## Why
    - These helper flows should stay consistent even when a repository has
    hook-directory configuration of its own.
    
    ## How
    - Pass a command-local `core.hooksPath` override in the shared helper
    path and the Git-info helper path.
    - Add regressions for the baseline index rewrite flow and the metadata
    status flow.
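    
    A sketch of a command-local override, assuming the helper shells out
    to `git` with a `-c` flag. The `git_without_hooks` name and the
    `/dev/null` value are assumptions for illustration; the value the
    helpers actually pass is not stated here.
    
    ```rust
    use std::path::Path;
    use std::process::Command;
    
    /// Build a git invocation whose hook directory is overridden for this
    /// command only, leaving the repository's own config untouched.
    fn git_without_hooks(repo: &Path, args: &[&str]) -> Command {
        let mut cmd = Command::new("git");
        cmd.arg("-C")
            .arg(repo)
            .arg("-c")
            .arg("core.hooksPath=/dev/null") // assumed override value
            .args(args);
        cmd
    }
    ```
    
    Because `-c` applies only to the single invocation, nothing is
    written to the repository or user config.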
    
    ## Validation
    - `cargo fmt --manifest-path
    /Users/bookholt/code/codex/codex-rs/Cargo.toml --all --check`
    - `cargo test --manifest-path
    /Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-git-utils`
    - `cargo test --manifest-path
    /Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-core
    test_get_has_changes_`
  • tui: split remaining composer draft and footer state (#22656)
    ## Why
    
    [#22581](https://github.com/openai/codex/pull/22581) started separating
    the chat composer’s responsibilities, but `ChatComposer` still owned the
    remaining editable draft state alongside footer/status presentation
    state. This follow-up makes those ownership lines explicit so future
    composer changes have a smaller blast radius and `BottomPane` does not
    need to keep exposing scattered draft getters.
    
    This is just a refactor. No functional or behavioral changes are
    intended.
    
    ## What changed
    
    - Move the remaining editable composer state into
    `bottom_pane/chat_composer/draft_state.rs`.
    - Move footer and status-row presentation state into
    `bottom_pane/chat_composer/footer_state.rs`.
    - Add an internal `ComposerDraftSnapshot` for restore flows, replacing
    several ad hoc `BottomPane` pass-through reads.
    - Rewire the related history-search and thread-input restore paths to
    use the extracted state.
    
    ## Verification
    
    - `RUST_MIN_STACK=8388608 cargo test -p codex-tui`
    - `cargo insta pending-snapshots`
  • guardian: use permission profile for review sandbox (#22789)
    ## Why
    
    `SandboxPolicy` is being pushed back toward legacy config loading and
    compatibility boundaries. Guardian review sessions already want the
    built-in read-only permission behavior; carrying that as an active
    `PermissionProfile` makes the review sandbox follow the new permissions
    path instead of configuring the child session through the legacy policy
    API.
    
    ## What Changed
    
    - Configure the guardian review session with
    `PermissionProfile::read_only()`.
    - Send the read-only profile through the guardian child `Op::UserTurn`.
    - Keep the legacy `sandbox_policy` field populated with
    `SandboxPolicy::new_read_only_policy()` declared next to the profile so
    the two remain visibly in sync until the compatibility field goes away.
    
    ## How To Review
    
    Start in `codex-rs/core/src/guardian/review_session.rs`. The important
    check is that both the guardian config and the child turn now use the
    read-only permission profile, while the remaining
    `SandboxPolicy::ReadOnly` assignment is only the compatibility field
    required by the current turn protocol.
    
    ## Verification
    
    - `cargo test -p codex-core
    guardian_review_session_config_clears_parent_developer_instructions`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22789).
    * #22795
    * #22792
    * #22791
    * #22790
    * __->__ #22789
  • Move memory prompt injection to app-server extension (#22841)
    ## Why
    
    Memory prompt injection should be owned by the extension path that
    app-server composes at runtime, not by an inlined special case inside
    `codex-core`. This keeps `codex-core` focused on session orchestration
    while allowing the memories extension to own its app-server prompt
    behavior.
    
    ## What Changed
    
    - Registers `codex-memories-extension` in the app-server extension
    registry.
    - Moves the memory developer-instruction injection out of
    `core/src/session/mod.rs` and into the memories extension prompt
    contributor.
    - Adds config-change handling so the extension keeps its per-thread
    memory settings in sync after startup.
    - Leaves memories read/retrieval tools unregistered for now so this PR
    only changes prompt injection.
    - Removes the stale `cargo-shear` ignore now that app-server depends on
    the extension crate.
    
    ## Validation
    
    Not run locally; validation is left to CI.
  • Run compact hooks for remote compaction v2 (#22828)
    ## Why
    
    Remote compaction v2 is the `/responses` implementation of
    session-history compaction, but it still needs to preserve the
    observable contract of the legacy `/responses/compact` path. In
    particular, users and integrations that rely on `PreCompact` and
    `PostCompact` hooks should not see different behavior when
    `remote_compaction_v2` is enabled.
    
    ## What Changed
    
    - Runs `PreCompact` before issuing the remote compaction v2 request,
    including `Interrupted` analytics when a pre-hook stops execution.
    - Runs `PostCompact` after a successful v2 compaction and aborts the
    turn if the post-hook stops execution.
    - Adds `compact_remote_parity` coverage that compares legacy and v2
    compaction across manual transcript shapes, automatic pre-turn
    compaction, automatic mid-turn compaction, hook payloads, replacement
    history, follow-up request payloads, and API-key `service_tier=fast`
    behavior.
    - Registers the new parity suite under `core/tests/suite`.
    
    Relevant code:
    
    -
    [`compact_remote_v2.rs`](https://github.com/openai/codex/blob/af63745cb502183a6fc447d0240f8150934d70b7/codex-rs/core/src/compact_remote_v2.rs)
    -
    [`compact_remote_parity.rs`](https://github.com/openai/codex/blob/af63745cb502183a6fc447d0240f8150934d70b7/codex-rs/core/tests/suite/compact_remote_parity.rs)
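    
    The hook ordering described under "What Changed" can be sketched as
    below. This is a synchronous simplification with hypothetical names
    (`HookDecision`, `CompactOutcome`, `compact_with_hooks`); the linked
    files are the source of truth.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum HookDecision {
        Continue,
        Stop,
    }
    
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum CompactOutcome {
        Completed,
        /// `PreCompact` stopped execution before any request was sent.
        InterruptedByPreHook,
        /// `PostCompact` stopped execution after a successful compaction.
        AbortedByPostHook,
        Failed(String),
    }
    
    /// Run the pre-hook, then the compaction request, then the post-hook.
    /// The request is never issued if the pre-hook stops, and the post-hook
    /// only runs after a successful compaction.
    fn compact_with_hooks(
        pre_compact: impl FnOnce() -> HookDecision,
        compact: impl FnOnce() -> Result<(), String>,
        post_compact: impl FnOnce() -> HookDecision,
    ) -> CompactOutcome {
        if pre_compact() == HookDecision::Stop {
            return CompactOutcome::InterruptedByPreHook;
        }
        if let Err(err) = compact() {
            return CompactOutcome::Failed(err);
        }
        if post_compact() == HookDecision::Stop {
            return CompactOutcome::AbortedByPostHook;
        }
        CompactOutcome::Completed
    }
    ```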
    
    ## Verification
    
    - Added `core/tests/suite/compact_remote_parity.rs` to assert parity
    between legacy remote compaction and remote compaction v2 for the
    affected request, hook, rollout-history, and follow-up paths.
    - Existing `compact_remote_v2` unit coverage still exercises v2
    replacement-history retention and compaction-output collection.
  • Remove zombie tools spec module (#22820)
    ## Summary
    
    - move tool_user_shell_type out of the old tools::spec module and call
    it from tools directly
    - attach the remaining spec planning model tests under spec_plan
    - delete core/src/tools/spec.rs
    
    ## Tests
    
    - just fmt
    - cargo test -p codex-core tools::spec_plan
    
    Note: a broader cargo test -p codex-core run on the earlier PR-head
    worktree still hit the pre-existing stack overflow in
    agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns.
  • Simplify tool executor and registry plumbing (#22636)
    ## Why
    
    The tool runtime path still had a typed output associated type on
    `ToolExecutor`, plus a core-only `RegisteredTool` adapter and
    extension-only executor aliases. That made every new shared tool runtime
    carry extra adapter plumbing before it could participate in core
    dispatch, extension tools, hook payloads, telemetry, and model-visible
    spec generation.
    
    This PR moves output erasure to the shared executor boundary so core and
    extension tools can use the same execution contract directly.
    
    ## What Changed
    
    - Changed `codex_tools::ToolExecutor` to return `Box<dyn ToolOutput>`
    instead of an associated `Output` type.
    - Removed the extension-specific `ExtensionToolExecutor` /
    `ExtensionToolOutput` aliases and exposed `ToolExecutor<ToolCall>` plus
    `ToolOutput` through `codex-extension-api`.
    - Reworked core tool registration around `CoreToolRuntime` and
    `ToolRegistry::from_tools`, removing the extra `RegisteredTool` /
    `ToolRegistryBuilder` layer.
    - Consolidated model-visible spec planning and registry construction in
    `core/src/tools/spec_plan.rs`, including deferred tool search and
    code-mode-only filtering.
    - Added `ToolOutput` helpers for post-tool-use hook ids and inputs so
    MCP, unified exec, extension, and other boxed outputs preserve the same
    hook payload behavior.
    - Updated core handlers, memories tools, and the related
    registry/spec/router tests to use the simplified contract.
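    
    To illustrate the output-erasure idea, here is a small synchronous
    sketch. `ToolExecutor<ToolCall>` and `Box<dyn ToolOutput>` mirror the
    shapes named above, but the method names, the `ToolCall` fields, the
    example tools, and the `HashMap` registry are assumptions; the real
    traits are presumably richer (and likely async).
    
    ```rust
    use std::collections::HashMap;
    
    /// Type-erased tool result; the single method is a hypothetical stand-in.
    trait ToolOutput {
        fn to_model_text(&self) -> String;
    }
    
    /// Simplified call payload (hypothetical fields).
    struct ToolCall {
        name: String,
        arguments: String,
    }
    
    /// Executors return a boxed output instead of an associated `Output` type,
    /// so differently typed tools can share one registry.
    trait ToolExecutor<C> {
        fn execute(&self, call: &C) -> Box<dyn ToolOutput>;
    }
    
    struct EchoOutput(String);
    impl ToolOutput for EchoOutput {
        fn to_model_text(&self) -> String {
            self.0.clone()
        }
    }
    
    struct LengthOutput(usize);
    impl ToolOutput for LengthOutput {
        fn to_model_text(&self) -> String {
            format!("{} bytes", self.0)
        }
    }
    
    struct EchoTool;
    impl ToolExecutor<ToolCall> for EchoTool {
        fn execute(&self, call: &ToolCall) -> Box<dyn ToolOutput> {
            Box::new(EchoOutput(call.arguments.clone()))
        }
    }
    
    struct LengthTool;
    impl ToolExecutor<ToolCall> for LengthTool {
        fn execute(&self, call: &ToolCall) -> Box<dyn ToolOutput> {
            Box::new(LengthOutput(call.arguments.len()))
        }
    }
    
    type Registry = HashMap<String, Box<dyn ToolExecutor<ToolCall>>>;
    
    /// Register both tools directly, with no per-tool adapter layer.
    fn build_registry() -> Registry {
        let mut registry: Registry = HashMap::new();
        registry.insert("echo".to_string(), Box::new(EchoTool));
        registry.insert("length".to_string(), Box::new(LengthTool));
        registry
    }
    
    /// Look up the tool by name and render its erased output.
    fn dispatch(registry: &Registry, call: &ToolCall) -> Option<String> {
        registry
            .get(&call.name)
            .map(|tool| tool.execute(call).to_model_text())
    }
    ```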
    
    ## Test Coverage
    
    - Updated coverage for tool spec planning, registry lookup, deferred
    tool search registration, extension tool routing, post-tool-use hook
    payloads, dispatch tracing, guardian output extraction, and memories
    extension tool execution.
  • [codex] Use compaction_trigger item for remote compaction v2 (#22809)
    ## Why
    
    Remote compaction v2 was still using `context_compaction` as both the
    request trigger and the compacted output shape. The Responses API now
    has the landed contract for this flow: Codex sends a dedicated `{
    "type": "compaction_trigger" }` input item, and the backend returns the
    standard `compaction` output item with encrypted content.
    
    This aligns the v2 path with that wire contract while preserving the
    existing local compacted-history post-processing behavior.
    
    ## What changed
    
    - Add `ResponseItem::CompactionTrigger` and regenerate the app-server
    protocol schema fixtures.
    - Send `compaction_trigger` from `remote_compaction_v2` instead of a
    payload-less `context_compaction`.
    - Collect exactly one backend `compaction` output item, then reuse the
    existing compacted-history rebuilding path.
    - Treat the trigger item as a transient request marker rather than model
    output or persisted rollout/memory content.
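    
    A sketch of the "exactly one `compaction` item" rule, using a
    hypothetical `OutputItem` stand-in rather than the real
    `ResponseItem` type. The error strings are illustrative only.
    
    ```rust
    /// Simplified stand-in for backend output items.
    #[derive(Debug, Clone, PartialEq, Eq)]
    enum OutputItem {
        Compaction { encrypted_content: String },
        Other(String),
    }
    
    /// Return the encrypted content of the single `compaction` item, or an
    /// error if the backend returned none or more than one.
    fn collect_single_compaction(items: &[OutputItem]) -> Result<&str, String> {
        let mut found = items.iter().filter_map(|item| match item {
            OutputItem::Compaction { encrypted_content } => {
                Some(encrypted_content.as_str())
            }
            OutputItem::Other(_) => None,
        });
        match (found.next(), found.next()) {
            (Some(content), None) => Ok(content),
            (None, _) => Err("expected one compaction item, found none".to_string()),
            (Some(_), Some(_)) => {
                Err("expected one compaction item, found several".to_string())
            }
        }
    }
    ```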
    
    ## Verification
    
    - `cargo test -p codex-protocol compaction_trigger`
    - `cargo test -p codex-core remote_compact_v2`
    - `cargo test -p codex-core compact_remote_v2`
    - `cargo test -p codex-core
    responses_websocket_sends_response_processed_after_remote_compaction_v2`
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol schema_fixtures`
  • Reject legacy [profiles] when using profile-v2 (#22647)
    ## Why
    
    `profile-v2` layers the selected profile file on top of the base user
    `config.toml`, but the legacy `[profiles]` table also stores named
    profile overrides in that same base file. Allowing both paths during one
    load makes it too easy to get a mixed profile where stale legacy
    settings still influence a profile-v2 run.
    
    ## What Changed
    
    - Detect a legacy `[profiles]` table in the base user config whenever
    `--profile-v2` selects a profile file.
    - Fail config loading with an `InvalidData` error that tells the user to
    move those settings into the selected profile-v2 file or remove
    `[profiles]`.
    - Add a loader regression covering `--profile-v2` with legacy
    `[profiles]` in `config.toml`.
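    
    The rejection rule can be sketched as below. The function name, its
    boolean input (standing in for "the base config contains a
    `[profiles]` table"), and the message text are hypothetical; only the
    `InvalidData` error kind comes from this change.
    
    ```rust
    use std::io;
    
    /// Fail config loading when a profile-v2 file is selected while the base
    /// user config still carries a legacy `[profiles]` table.
    fn check_profile_v2_conflict(
        base_has_legacy_profiles: bool,
        profile_v2_file: Option<&str>,
    ) -> io::Result<()> {
        match profile_v2_file {
            Some(file) if base_has_legacy_profiles => Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!(
                    "legacy [profiles] cannot be combined with --profile-v2; \
                     move those settings into {file} or remove [profiles]"
                ),
            )),
            _ => Ok(()),
        }
    }
    ```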
    
    ## Testing
    
    - `cargo test -p codex-config
    profile_v2_rejects_legacy_profiles_in_base_user_config`
  • Fix signed macOS release promotion follow-up jobs (#22788)
    ## Why
    
    The `release_mode=promote_signed` path intentionally skips the build
    jobs after signed macOS artifacts are staged, then runs the `release`
    job from the signed handoff. In the `rust-v0.131.0-alpha.19` promotion
    run, `release` succeeded but the npm, PyPI, and `latest-alpha-cli`
    follow-up jobs were skipped because their custom job `if:` expressions
    let GitHub Actions apply the implicit `success()` status check before
    reading `needs.release.outputs.*`.
    
    The unsigned build handoff does not need DotSlash manifests. Publishing
    unsigned DotSlash manifests creates release assets that can conflict
    with the later signed promotion, especially shared outputs such as
    `bwrap`, `codex-command-runner`, and `codex-windows-sandbox-setup`.
    
    ## What Changed
    
    - Stop publishing DotSlash manifests when `SIGN_MACOS == 'false'`.
    - Delete `.github/dotslash-unsigned-config.json`.
    - Gate post-release jobs with the `!cancelled()` status function plus an
    explicit `needs.release.result == 'success'` check before consulting
    release outputs.
    - Keep the existing publish eligibility rules for npm, PyPI, WinGet, and
    `latest-alpha-cli`.
    
    ## Verification
    
    - `rg -n "dotslash-unsigned-config|SIGN_MACOS ==
    'false'.*dotslash|unsigned-config" .github/workflows/rust-release.yml
    .github || true`
    - `git diff --check -- .github/workflows/rust-release.yml
    .github/dotslash-unsigned-config.json`
  • tui/exec: show effective workspace roots in summaries (#22612)
    ## Why
    
    This PR builds on [#22611](https://github.com/openai/codex/pull/22611).
    
    After `runtimeWorkspaceRoots` moved onto thread state, the user-facing
    summaries were still inconsistent about which roots they showed. In
    particular, `/status` and the exec startup summary could under-report
    extra workspace roots from `--add-dir` or from profile-defined
    `workspace_roots`, which made the new model look incorrect even when the
    permissions themselves were right.
    
    ## What Changed
    
    - switched the TUI status surfaces to summarize against
    `Config::effective_workspace_roots()`
    - updated the exec human-output summary to render from the effective
    permission profile instead of the raw constrained profile
    - added focused regressions for both the TUI and exec code paths so
    extra workspace roots stay visible in user-facing summaries
    
    ## Verification
    
    Targeted coverage for this follow-up lives in:
    - `codex-rs/tui/src/status/tests.rs`
    - `codex-rs/exec/src/event_processor_with_human_output_tests.rs`
    
    The added regressions verify that:
    - status output includes profile-defined workspace roots in the
    effective permissions summary
    - exec startup output includes runtime workspace roots instead of
    collapsing back to `cwd` only
  • app-server: use permission ids and runtime workspace roots (#22611)
    ## Why
    
    This PR builds on [#22610](https://github.com/openai/codex/pull/22610)
    and is the app-server side of the migration from mutable per-turn
    `SandboxPolicy` replacement toward selecting immutable permission
    profiles by id plus mutable runtime workspace roots.
    
    Once permission profiles can carry their own immutable
    `workspace_roots`, app-server no longer needs to mutate the selected
    `PermissionProfile` just to represent thread-specific filesystem
    context. The mutable part now lives on the thread as explicit
    `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until
    the sandbox is realized for a turn.
    
    ## What Changed
    
    - Replaced the v2 permission-selection wrapper surface with plain
    profile ids for `thread/start`, `thread/resume`, `thread/fork`, and
    `turn/start`.
    - Removed the API surface for profile modifications
    (`PermissionProfileSelectionParams`,
    `PermissionProfileModificationParams`,
    `ActivePermissionProfileModification`).
    - Added experimental `runtimeWorkspaceRoots` fields to the thread
    lifecycle and turn-start APIs.
    - Threaded runtime workspace roots through core session/thread
    snapshots, turn overrides, app-server request handling, and command
    execution permission resolution.
    - Kept session permission state symbolic so later runtime root updates
    and cwd-only implicit-root retargeting rebind `:workspace_roots`
    correctly.
    - Updated the embedded clients just enough to send and restore the new
    thread state.
    - Refreshed the generated schema/TypeScript artifacts and the app-server
    README to match the new contract.
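    
    The thread-state changes above can be sketched as follows, assuming the
    thread stores a profile id, a cwd, and additional runtime roots, with
    the cwd acting as an implicit first root. `ThreadState`,
    `TurnOverrides`, and their field names are hypothetical and do not
    mirror the real protocol types.
    
    ```rust
    use std::path::PathBuf;
    
    /// Hypothetical turn-level overrides; `None` means "leave unchanged".
    #[derive(Default)]
    struct TurnOverrides {
        permission_profile_id: Option<String>,
        cwd: Option<PathBuf>,
        additional_runtime_roots: Option<Vec<PathBuf>>,
    }
    
    /// Hypothetical per-thread state that survives across turns.
    struct ThreadState {
        permission_profile_id: String,
        cwd: PathBuf,
        additional_runtime_roots: Vec<PathBuf>,
    }
    
    impl ThreadState {
        /// Applies overrides in place so that a later resume sees them.
        fn apply(&mut self, overrides: TurnOverrides) {
            if let Some(id) = overrides.permission_profile_id {
                self.permission_profile_id = id;
            }
            if let Some(cwd) = overrides.cwd {
                self.cwd = cwd;
            }
            if let Some(roots) = overrides.additional_runtime_roots {
                self.additional_runtime_roots = roots;
            }
        }
    
        /// Runtime roots: the implicit cwd root first, then additional roots.
        fn runtime_workspace_roots(&self) -> Vec<PathBuf> {
            let mut roots = vec![self.cwd.clone()];
            for root in &self.additional_runtime_roots {
                if !roots.contains(root) {
                    roots.push(root.clone());
                }
            }
            roots
        }
    }
    ```
    
    In this model a cwd-only override moves the implicit root and leaves
    the additional roots and the selected profile id untouched.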
    
    ## Verification
    
    Targeted coverage for this layer lives in:
    
    - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_start.rs`
    - `codex-rs/app-server/tests/suite/v2/thread_resume.rs`
    - `codex-rs/app-server/tests/suite/v2/turn_start.rs`
    - `codex-rs/core/src/session/tests.rs`
    
    The key regression checks exercise that:
    
    - `runtimeWorkspaceRoots` resolve against the effective cwd on thread
    start.
    - Profile-declared workspace roots are excluded from the runtime
    workspace roots returned by app-server.
    - A turn-level runtime workspace-root update persists onto the thread
    and is returned by `thread/resume`.
    - A named permission profile selected on one turn remains symbolic, so a
    later turn that updates only the runtime roots still changes the paths
    the sandbox allows writes to.
    - A cwd-only turn update retargets the implicit runtime cwd root while
    preserving additional runtime roots.
    - The protocol fixtures and generated client artifacts stay in sync with
    the string-based permission selection contract.
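    
    The first two checks above can be illustrated with a small sketch,
    assuming relative roots are joined onto the effective cwd and that
    roots already declared by the profile are filtered out of the reported
    runtime roots. `resolve_root` and `reported_runtime_roots` are
    hypothetical helpers, not functions from the app-server.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Resolves a possibly relative root against `cwd`.
    fn resolve_root(cwd: &Path, root: &Path) -> PathBuf {
        // `Path::join` returns `root` unchanged when `root` is absolute.
        cwd.join(root)
    }
    
    /// Hypothetical: the runtime roots to report back to the client,
    /// excluding any root that the selected profile already declares.
    fn reported_runtime_roots(
        cwd: &Path,
        requested: &[PathBuf],
        profile_roots: &[PathBuf],
    ) -> Vec<PathBuf> {
        let mut out: Vec<PathBuf> = Vec::new();
        for root in requested {
            let resolved = resolve_root(cwd, root);
            if !profile_roots.contains(&resolved) && !out.contains(&resolved) {
                out.push(resolved);
            }
        }
        out
    }
    ```
    
    This sketch does not normalize `..` components or symlinks; it only
    shows where cwd-relative resolution and profile-root filtering would
    sit.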
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611).
    * #22612
    * __->__ #22611