Commit Graph

467 Commits

  • register all mcp tools with namespace (#17404)
    stacked on #17402.
    
    MCP tools returned by `tool_search` (deferred tools) get registered in
    our `ToolRegistry` with a different format than directly available
    tools. this leads to two different ways of accessing MCP tools from our
    tool catalog, only one of which works for each. fix this by registering
    all MCP tools with the namespace format, since this info is already
    available.
    
    also, direct MCP tools are registered to responsesapi without a
    namespace, while deferred MCP tools have a namespace. this means we can
    receive MCP `FunctionCall`s in both formats from namespaces. fix this by
    always registering MCP tools with namespace, regardless of deferral
    status.
    
    make code mode track `ToolName` provenance of tools so it can map the
    literal JS function name string to the correct `ToolName` for
    invocation, rather than supporting both in core.
    
    this lets us unify to a single canonical `ToolName` representation for
    each MCP tool and force everywhere to use that one, without supporting
    fallbacks.
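
    A minimal sketch of the single-canonical-name idea described above. The
    field layout of `ToolName`, the shape of `ToolRegistry`, and the
    `namespace__name` JS function naming are assumptions made for
    illustration; they are not taken from the codex source.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical canonical name: every MCP tool carries its namespace.
    #[derive(Clone, Debug, PartialEq, Eq, Hash)]
    pub struct ToolName {
        pub namespace: String,
        pub name: String,
    }

    /// Hypothetical registry keyed only by the canonical name.
    #[derive(Default)]
    pub struct ToolRegistry {
        tools: HashMap<ToolName, String>,
        // Code-mode provenance: literal JS function name -> canonical ToolName.
        js_names: HashMap<String, ToolName>,
    }

    impl ToolRegistry {
        /// Registers a tool the same way whether it was direct or deferred.
        pub fn register_mcp_tool(&mut self, namespace: &str, name: &str, description: &str) {
            let tool = ToolName {
                namespace: namespace.to_string(),
                name: name.to_string(),
            };
            // Assumed JS naming scheme, for illustration only.
            let js_name = format!("{}__{}", namespace, name);
            self.js_names.insert(js_name, tool.clone());
            self.tools.insert(tool, description.to_string());
        }

        /// Maps a literal JS function name back to its canonical ToolName.
        pub fn resolve_js_name(&self, js_name: &str) -> Option<&ToolName> {
            self.js_names.get(js_name)
        }

        /// Reports whether a canonical name is registered.
        pub fn contains(&self, tool: &ToolName) -> bool {
            self.tools.contains_key(tool)
        }
    }
    ```

    With one key type there is no un-namespaced fallback: a lookup either
    resolves to the canonical `ToolName` or fails.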
  • Spread AbsolutePathBuf (#17792)
    Mechanical change to promote absolute paths through code.
  • [codex-analytics] add session source to client metadata (#17374)
    ## Summary
    
    Adds a `thread_source` field to the existing Codex turn metadata sent to
    the Responses API:
    - Sends `thread_source: "user"` for user-initiated sessions: CLI, VS
    Code, and Exec
    - Sends `thread_source: "subagent"` for subagent sessions
    - Omits `thread_source` for MCP, custom, and unknown session sources
    - Uses the existing turn metadata transport:
      - HTTP requests send through the `x-codex-turn-metadata` header
      - WebSocket `response.create` requests send through
        `client_metadata["x-codex-turn-metadata"]`
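
    The classification rules above could be sketched as follows. The
    `SessionSource` variants and the function name are hypothetical
    stand-ins chosen for this example; only the `"user"` / `"subagent"` /
    omitted mapping comes from the summary.

    ```rust
    /// Hypothetical session sources, mirroring the categories listed above.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum SessionSource {
        Cli,
        VsCode,
        Exec,
        SubAgent,
        Mcp,
        Custom,
        Unknown,
    }

    /// Returns the `thread_source` value to send, or `None` to omit the field.
    pub fn thread_source_name(source: SessionSource) -> Option<&'static str> {
        match source {
            // User-initiated sessions.
            SessionSource::Cli | SessionSource::VsCode | SessionSource::Exec => Some("user"),
            SessionSource::SubAgent => Some("subagent"),
            // MCP, custom, and unknown sources omit the field entirely.
            SessionSource::Mcp | SessionSource::Custom | SessionSource::Unknown => None,
        }
    }
    ```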
    
    ## Testing
    - `cargo test -p codex-protocol
    session_source_thread_source_name_classifies_user_and_subagent_sources`
    - `cargo test -p codex-core turn_metadata_state`
    - `cargo test -p codex-core --test responses_headers
    responses_stream_includes_turn_metadata_header_for_git_workspace_e2e --
    --nocapture`
  • Add realtime output modality and transcript events (#17701)
    - Add outputModality to thread/realtime/start and wire text/audio output
    selection through app-server, core, API, and TUI.
    - Rename the realtime transcript delta notification and add a separate
    transcript done notification that forwards final text from item done
    without correlating it with deltas.
  • [codex-analytics] feature plumbing and emittance (#16640)
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16640).
    * #16870
    * #16706
    * #16641
    * __->__ #16640
  • Support prolite plan type (#17419)
    Addresses #17353
    
    Problem: Codex rate-limit fetching failed when the backend returned the
    new `prolite` subscription plan type.
    
    Solution: Add `prolite` to the backend/account/auth plan mappings, keep
    unknown WHAM plan values decodable, and regenerate app-server plan
    schemas.
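
    A minimal sketch of keeping unknown plan values decodable. The enum
    shape, the function name, and the `free` / `plus` / `pro` wire names
    are assumptions for illustration; only `prolite` comes from the
    description above.

    ```rust
    /// Hypothetical plan enum with a catch-all for values the client does not know.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum PlanType {
        Free,
        Plus,
        Pro,
        ProLite,
        /// Preserves the raw wire value instead of failing to decode.
        Unknown(String),
    }

    /// Maps a backend wire name to a plan type without ever returning an error.
    pub fn parse_plan_type(raw: &str) -> PlanType {
        match raw {
            "free" => PlanType::Free,
            "plus" => PlanType::Plus,
            "pro" => PlanType::Pro,
            "prolite" => PlanType::ProLite,
            other => PlanType::Unknown(other.to_string()),
        }
    }
    ```

    With a catch-all variant, a plan name introduced by the backend later
    degrades to `Unknown` instead of breaking rate-limit fetching.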
  • representing guardian review timeouts in protocol types (#17381)
    ## Summary
    
    - Add `TimedOut` to Guardian/review carrier types:
      - `ReviewDecision::TimedOut`
      - `GuardianAssessmentStatus::TimedOut`
      - app-server v2 `GuardianApprovalReviewStatus::TimedOut`
    - Regenerate app-server JSON/TypeScript schemas for the new wire shape.
    - Wire the new status through core/app-server/TUI mappings with
    conservative fail-closed handling.
    - Keep `TimedOut` non-user-selectable in the approval UI.
    
    **Does not change runtime behavior yet; emitting `TimedOut` and
    parent-model timeout messaging will come in follow-up PRs**
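
    One way to picture the fail-closed handling is the sketch below. The
    `Approved` and `Denied` variants and both helper functions are
    assumptions for illustration; only `TimedOut` and the rule that it is
    not user-selectable come from the summary.

    ```rust
    /// Hypothetical review decision carrier with the new timeout variant.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ReviewDecision {
        Approved,
        Denied,
        TimedOut,
    }

    /// Fail closed: only an explicit approval lets the action run.
    pub fn allows_execution(decision: ReviewDecision) -> bool {
        matches!(decision, ReviewDecision::Approved)
    }

    /// Decisions offered in an approval UI; `TimedOut` is deliberately absent.
    pub fn user_selectable_decisions() -> Vec<ReviewDecision> {
        vec![ReviewDecision::Approved, ReviewDecision::Denied]
    }
    ```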
  • fix(permissions): fix symlinked writable roots in sandbox permissions (#15981)
    ## Summary
    - preserve logical symlink paths during permission normalization and
    config cwd handling
    - bind real targets for symlinked readable/writable roots in bwrap and
    remap carveouts and unreadable roots there
    - add regressions for symlinked carveouts and nested symlink escape
    masking
    
    ## Root cause
    Permission normalization canonicalized symlinked writable roots and cwd
    to their real targets too early. That drifted policy checks away from
    the logical paths the sandboxed process can actually address, while
    bwrap still needed the real targets for mounts. The mismatch caused
    shell and apply_patch failures on symlinked writable roots.
    
    ## Impact
    Fixes #15781.
    
    Also fixes #17079:
    - #17079 is the protected symlinked carveout side: bwrap now binds the
    real symlinked writable-root target and remaps carveouts before masking.
    
    Related to #15157:
    - #15157 is the broader permission-check side of this path-identity
    problem. This PR addresses the shared logical-vs-canonical normalization
    issue, but the reported Darwin prompt behavior should be validated
    separately before auto-closing it.
    
    This should also fix #14672, #14694, #14715, and #15725:
    - #14672, #14694, and #14715 are the same Linux
    symlinked-writable-root/bwrap family as #15781.
    - #15725 is the protected symlinked workspace path variant; the PR
    preserves the protected logical path in policy space while bwrap applies
    read-only or unreadable treatment to the resolved target so
    file-vs-directory bind mismatches do not abort sandbox setup.
    
    ## Notes
    - Added Linux-only regressions for symlinked writable ancestors and
    protected symlinked directory targets, including nested symlink escape
    masking without rebinding the escape target writable.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Revert "Option to Notify Workspace Owner When Usage Limit is Reached" (#17391)
    Reverts openai/codex#16969
    
    #sev3-2026-04-10-accountscheckversion-500s-for-openai-workspace-7300
  • fix(guardian, app-server): introduce guardian review ids (#17298)
    ## Description
    
    This PR introduces `review_id` as the stable identifier for guardian
    reviews and exposes it in app-server `item/autoApprovalReview/started`
    and `item/autoApprovalReview/completed` events.
    
    Internally, guardian rejection state is now keyed by `review_id` instead
    of the reviewed tool item ID. `target_item_id` is still included when a
    review maps to a concrete thread item, but it is no longer overloaded as
    the review lifecycle identifier.
    
    ## Motivation
    
    We'd like to give users the ability to preempt a guardian review while
    it's running (approve or decline).
    
    However, we can't implement the API that allows the user to override a
    running guardian review because we didn't have a unique `review_id` per
    guardian review. Using `target_item_id` is not correct since:
    - with execve reviews, there can be multiple execve calls (and therefore
    guardian reviews) per shell command
    - with network policy reviews, there is no target item ID
    
    The PR that actually implements user overrides will use `review_id` as
    the stable identifier.
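
    A minimal sketch of why keying by `review_id` matters. The struct and
    method names are hypothetical; the two cases exercised (several execve
    reviews sharing one shell command, and a network policy review with no
    target item) are the ones named in the motivation above.

    ```rust
    use std::collections::HashMap;

    /// Hypothetical per-review record.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct GuardianReview {
        /// Present only when the review maps to a concrete thread item.
        pub target_item_id: Option<String>,
        pub rejected: bool,
    }

    /// Hypothetical state store keyed by the review lifecycle identifier.
    #[derive(Default)]
    pub struct GuardianReviews {
        by_review_id: HashMap<String, GuardianReview>,
    }

    impl GuardianReviews {
        /// Records the start of a review under its own `review_id`.
        pub fn start(&mut self, review_id: &str, target_item_id: Option<&str>) {
            self.by_review_id.insert(
                review_id.to_string(),
                GuardianReview {
                    target_item_id: target_item_id.map(str::to_string),
                    rejected: false,
                },
            );
        }

        /// Marks one review as rejected; returns false if the id is unknown.
        pub fn reject(&mut self, review_id: &str) -> bool {
            match self.by_review_id.get_mut(review_id) {
                Some(review) => {
                    review.rejected = true;
                    true
                }
                None => false,
            }
        }

        /// Reports whether the given review was rejected.
        pub fn is_rejected(&self, review_id: &str) -> bool {
            self.by_review_id
                .get(review_id)
                .map(|review| review.rejected)
                .unwrap_or(false)
        }
    }
    ```

    Rejecting one execve review leaves a sibling review on the same shell
    command untouched, which a map keyed by `target_item_id` could not
    express.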
  • Support clear SessionStart source (#17073)
    ## Motivation
    
    The `SessionStart` hook already receives `startup` and `resume` sources,
    but sessions created from `/clear` previously looked like normal startup
    sessions. This makes it impossible for hook authors to distinguish
    between these with the matcher.
    
    ## Summary
    
    - Add `InitialHistory::Cleared` so `/clear`-created sessions can be
    distinguished from ordinary startup sessions.
    - Add `SessionStartSource::Clear` and wire it through core, app-server
    thread start params, and TUI clear-session flow.
    - Update app-server protocol schemas, generated TypeScript, docs, and
    related tests.
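
    A sketch of how a hook matcher could tell the three sources apart. The
    `"clear"` wire string, the `|`-separated matcher syntax, and the `*`
    wildcard are assumptions made for this example, not documented
    behavior.

    ```rust
    /// Session start sources, including the new `/clear` case.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum SessionStartSource {
        Startup,
        Resume,
        Clear,
    }

    impl SessionStartSource {
        /// Assumed wire names used for matching.
        pub fn as_str(self) -> &'static str {
            match self {
                SessionStartSource::Startup => "startup",
                SessionStartSource::Resume => "resume",
                SessionStartSource::Clear => "clear",
            }
        }
    }

    /// Hypothetical matcher: `*` matches everything, otherwise any `|`-separated name.
    pub fn hook_matches(matcher: &str, source: SessionStartSource) -> bool {
        matcher == "*" || matcher.split('|').any(|name| name == source.as_str())
    }
    ```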
    
    
    https://github.com/user-attachments/assets/9cae3cb4-41c7-4d06-b34f-966252442e5c
  • Queue Realtime V2 response.create while active (#17306)
    Builds on #17264.
    
    - queues Realtime V2 `response.create` while an active response is open,
    then flushes it after `response.done` or `response.cancelled`
    - requests `response.create` after background agent final output and
    steering acknowledgements
    - adds app-server integration coverage for all `response.create` paths
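
    The queue-then-flush behavior can be sketched as a small state machine.
    The type and method names are hypothetical, and coalescing several
    queued requests into a single `response.create` is an assumption of
    this sketch rather than something the description states.

    ```rust
    /// Hypothetical tracker for whether a response is open and one is queued.
    #[derive(Debug, Default)]
    pub struct ResponseCreateQueue {
        active: bool,
        pending: bool,
    }

    impl ResponseCreateQueue {
        /// Returns true when `response.create` should be sent right now;
        /// otherwise the request is queued behind the active response.
        pub fn request_response_create(&mut self) -> bool {
            if self.active {
                self.pending = true;
                false
            } else {
                self.active = true;
                true
            }
        }

        /// Call on `response.done` or `response.cancelled`. Returns true when
        /// a queued `response.create` should be flushed now.
        pub fn on_response_finished(&mut self) -> bool {
            self.active = false;
            if self.pending {
                self.pending = false;
                self.active = true;
                true
            } else {
                false
            }
        }
    }
    ```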
    
    Validation:
    - `just fmt`
    - `cargo check -p codex-app-server --tests`
    - `git diff --check`
    - CI green
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Option to Notify Workspace Owner When Usage Limit is Reached (#16969)
    ## Summary
    - Replace the manual `/notify-owner` flow with an inline confirmation
    prompt when a usage-based workspace member hits a credits-depleted
    limit.
    - Fetch the current workspace role from the live ChatGPT
    `accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches
    the desktop and web clients.
    - Keep owner, member, and spend-cap messaging distinct so we only offer
    the owner nudge when the workspace is actually out of credits.
    
    ## What Changed
    - `backend-client`
      - Added a typed fetch for the current account role from
        `accounts/check`.
      - Mapped backend role values into a Rust workspace-role enum.
    - `app-server` and protocol
      - Added `workspaceRole` to `account/read` and `account/updated`.
      - Derived `isWorkspaceOwner` from the live role, with a fallback to the
        cached token claim when the role fetch is unavailable.
    - `tui`
      - Removed the explicit `/notify-owner` slash command.
      - When a member is blocked because the workspace is out of credits, the
        error now prompts:
        - `Your workspace is out of credits. Request more from your workspace
          owner? [y/N]`
      - Choosing `y` sends the existing owner-notification request.
      - Choosing `n`, pressing `Esc`, or accepting the default selection
        dismisses the prompt without sending anything.
      - Selection popups now honor explicit item shortcuts, which is how the
        `y` / `n` interaction is wired.
    
    ## Reviewer Notes
    - The main behavior change is scoped to usage-based workspace members
    whose workspace credits are depleted.
    - Spend-cap reached should not show the owner-notification prompt.
    - Owners and admins should continue to see `/usage` guidance instead of
    the member prompt.
    - The live role fetch is best-effort; if it fails, we fall back to the
    existing token-derived ownership signal.
    
    ## Testing
    - Manual verification
      - Workspace owner does not see the member prompt.
      - Workspace member with depleted credits sees the confirmation prompt
        and can send the nudge with `y`.
      - Workspace member with spend cap reached does not see the
        owner-notification prompt.
    
    ### Workspace member out of usage
    
    https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1
    
    ### Workspace owner
    Screenshot 2026-04-09 at 11 48 22 AM:
    https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6
  • Forward app-server turn clientMetadata to Responses (#16009)
    ## Summary
    App-server v2 already receives turn-scoped `clientMetadata`, but the
    Rust app-server was dropping it before the outbound Responses request.
    This change keeps the fix lightweight by threading that metadata through
    the existing turn-metadata path rather than inventing a new transport.
    
    ## What we're trying to do and why
    We want turn-scoped metadata from the app-server protocol layer,
    especially fields like Hermes/GAAS run IDs, to survive all the way to
    the actual Responses API request so it is visible in downstream
    websocket request logging and analytics.
    
    The specific bug was:
    - app-server protocol uses camelCase `clientMetadata`
    - Responses transport already has an existing turn metadata carrier:
    `x-codex-turn-metadata`
    - websocket transport already rewrites that header into
    `request.request_body.client_metadata["x-codex-turn-metadata"]`
    - but the Rust app-server never parsed or stored `clientMetadata`, so
    nothing from the app-server request was making it into that existing
    path
    
    This PR fixes that without adding a new header or a second metadata
    channel.
    
    ## How we did it
    ### Protocol surface
    - Add optional `clientMetadata` to v2 `TurnStartParams` and
    `TurnSteerParams`
    - Regenerate the JSON schema / TypeScript fixtures
    - Update app-server docs to describe the field and its behavior
    
    ### Runtime plumbing
    - Add a dedicated core op for app-server user input carrying turn-scoped
    metadata: `Op::UserInputWithClientMetadata`
    - Wire `turn/start` and `turn/steer` through that op / signature path
    instead of dropping the metadata at the message-processor boundary
    - Store the metadata in `TurnMetadataState`
    
    ### Transport behavior
    - Reuse the existing serialized `x-codex-turn-metadata` payload
    - Merge the new app-server `clientMetadata` into that JSON additively
    - Do **not** replace built-in reserved fields already present in the
    turn metadata payload
    - Keep websocket behavior unchanged at the outer shape level: it still
    sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON
    string now contains the merged fields
    - Keep HTTP fallback behavior unchanged except that the existing
    `x-codex-turn-metadata` header now includes the merged fields too
    
    ### Request shape before / after
    Before, a websocket `response.create` looked like:
    ```json
    {
      "type": "response.create",
      "client_metadata": {
        "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}"
      }
    }
    ```
    Even if the app-server caller supplied `clientMetadata`, it was not
    represented there.
    
    After, the same request shape is preserved, but the serialized payload
    now includes the new turn-scoped fields:
    ```json
    {
      "type": "response.create",
      "client_metadata": {
        "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}"
      }
    }
    ```
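
    The additive merge described under "Transport behavior" can be
    sketched as follows. This is an illustration under stated assumptions:
    the function name is hypothetical, and a flat string-to-string map
    stands in for the real JSON payload.

    ```rust
    use std::collections::BTreeMap;

    /// Merges caller-supplied metadata into the built-in turn metadata.
    /// Keys already present in `built_in` are treated as reserved and kept.
    pub fn merge_client_metadata(
        built_in: &BTreeMap<String, String>,
        client: &BTreeMap<String, String>,
    ) -> BTreeMap<String, String> {
        let mut merged = built_in.clone();
        for (key, value) in client {
            // Insert only when the key is absent, so reserved fields win.
            merged.entry(key.clone()).or_insert_with(|| value.clone());
        }
        merged
    }
    ```

    A caller that tries to supply its own `session_id` therefore leaves
    the built-in value intact, while new keys such as `fiber_run_id` are
    added alongside it.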
    
    ## Validation
    ### Targeted tests added / updated
    - protocol round-trip coverage for `clientMetadata` on `turn/start` and
    `turn/steer`
    - protocol round-trip coverage for `Op::UserInputWithClientMetadata`
    - `TurnMetadataState` merge test proving client metadata is added
    without overwriting reserved built-in fields
    - websocket request-shape test proving outbound `response.create`
    contains merged metadata inside
    `client_metadata["x-codex-turn-metadata"]`
    - app-server integration tests proving:
      - `turn/start` forwards `clientMetadata` into the outbound Responses
        request path
      - websocket warmup + real turn request both behave correctly
      - `turn/steer` updates the follow-up request metadata
    
    ### Commands run
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-core
    turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields
    --lib`
    - `cargo test -p codex-core --test all
    responses_websocket_preserves_custom_turn_metadata_fields`
    - `cargo test -p codex-app-server --test all client_metadata`
    - `cargo test -p codex-app-server --test all
    turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2
    -- --nocapture`
    - `just fmt`
    - `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol
    -p codex-app-server`
    - `just fix -p codex-exec -p codex-tui-app-server`
    - `just argument-comment-lint`
    
    ### Full suite note
    `cargo test` in `codex-rs` still fails in:
    - `suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request`
    
    I verified that same failure on a clean detached `HEAD` worktree with an
    isolated `CARGO_TARGET_DIR`, so it is not caused by this patch.
  • Default realtime startup to v2 model (#17183)
    - Default realtime sessions to v2 and gpt-realtime-1.5 when no override
    is configured.
    - Add Op::RealtimeConversationStart integration coverage and keep
    v1-specific tests explicit.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add realtime voice selection (#17176)
    - Add realtime voice selection for realtime/start.
    - Expose the supported v1/v2 voice lists and cover explicit, configured,
    default, and invalid voice paths.
  • Move default realtime prompt into core (#17165)
    - Adds a core-owned realtime backend prompt template and preparation
    path.
    - Makes omitted realtime start prompts use the core default, while null
    or empty prompts intentionally send empty instructions.
    - Covers the core realtime path and app-server v2 path with integration
    coverage.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Update guardian output schema (#17061)
    ## Summary
    - Update guardian output schema to separate risk, authorization,
    outcome, and rationale.
    - Feed guardian rationale into rejection messages.
    - Split the guardian policy into template and tenant-config sections.
    
    ## Validation
    - `cargo test -p codex-core mcp_tool_call`
    - `env -u CODEX_SANDBOX_NETWORK_DISABLED INSTA_UPDATE=always cargo test
    -p codex-core guardian::`
    
    ---------
    
    Co-authored-by: Owen Lin <owen@openai.com>
  • Use model metadata for Fast Mode status (#16949)
    Fast Mode status was still tied to one model name in the TUI and
    model-list plumbing. This changes the model metadata shape so a model
    can advertise additional speed tiers, carries that field through the
    app-server model list, and uses it to decide when to show Fast Mode
    status.
    
    For people using Codex, the behavior is intended to stay the same for
    existing models. Fast Mode still requires the existing signed-in /
    feature-gated path; the difference is that the UI can now recognize any
    model the model list marks as Fast-capable, instead of requiring a new
    client-side slug check.
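
    A minimal sketch of deciding Fast Mode from metadata rather than a
    slug comparison. The `additional_speed_tiers` field name, the `"fast"`
    tier value, and the model slugs used to exercise it are all
    hypothetical.

    ```rust
    /// Hypothetical model metadata as carried through the model list.
    #[derive(Debug, Clone)]
    pub struct ModelInfo {
        pub slug: String,
        /// Extra speed tiers the model advertises (assumed field name).
        pub additional_speed_tiers: Vec<String>,
    }

    /// Fast-capable means the metadata says so; the slug is never inspected.
    pub fn supports_fast_mode(model: &ModelInfo) -> bool {
        model
            .additional_speed_tiers
            .iter()
            .any(|tier| tier == "fast")
    }
    ```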
  • Add WebRTC transport to realtime start (#16960)
    Adds WebRTC startup to the experimental app-server
    `thread/realtime/start` method with an optional transport enum. The
    websocket path remains the default; WebRTC offers create the realtime
    session through the shared start flow and emit the answer SDP via
    `thread/realtime/sdp`.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Make AbsolutePathBuf joins infallible (#16981)
    Having to check for errors every time join is called is painful and
    unnecessary.
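
    A sketch of the idea, not the actual `AbsolutePathBuf` API: once the
    base is known to be absolute, `join` can return the wrapper directly
    instead of a `Result`. The constructor and accessor shown here are
    assumptions, and the example assumes Unix-style paths.

    ```rust
    use std::path::{Path, PathBuf};

    /// Hypothetical wrapper whose invariant is "the inner path is absolute".
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct AbsolutePathBuf(PathBuf);

    impl AbsolutePathBuf {
        /// The absoluteness check happens once, at construction.
        pub fn new(path: PathBuf) -> Option<Self> {
            if path.is_absolute() {
                Some(Self(path))
            } else {
                None
            }
        }

        /// Joining onto an absolute base needs no error path on Unix:
        /// a relative argument is appended, an absolute one replaces the base.
        pub fn join<P: AsRef<Path>>(&self, path: P) -> Self {
            Self(self.0.join(path))
        }

        /// Borrows the inner path.
        pub fn as_path(&self) -> &Path {
            &self.0
        }
    }
    ```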
  • Preserve null developer instructions (#16976)
    Preserve explicit null developer-instruction overrides across app-server
    resume and fork flows.
  • [codex] reduce module visibility (#16978)
    ## Summary
    - reduce public module visibility across Rust crates, preferring private
    or crate-private modules with explicit crate-root public exports
    - update external call sites and tests to use the intended public crate
    APIs instead of reaching through module trees
    - add the module visibility guideline to AGENTS.md
    
    ## Validation
    - `cargo check --workspace --all-targets --message-format=short` passed
    before the final fix/format pass
    - `just fix` completed successfully
    - `just fmt` completed successfully
    - `git diff --check` passed
  • Honor null thread instructions (#16964)
    - Treat explicit null thread instructions as a blank-slate override
    while preserving omitted-field fallback behavior.
    - Preserve null through rollout resume/fork and keep explicit empty
    strings distinct.
    - Add app-server v2 start/fork coverage for the tri-state instruction
    params.
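
    The tri-state parameter can be modeled with a nested `Option`. This is
    a sketch under assumptions: the function name and the use of
    `Option<Option<String>>` are illustrative, not the app-server's actual
    types.

    ```rust
    /// Resolves thread instructions from a tri-state parameter:
    /// - `None`: field omitted, so fall back to the default
    /// - `Some(None)`: explicit null, so start from a blank slate
    /// - `Some(Some(s))`: explicit string, kept as-is even when empty
    pub fn resolve_instructions(
        param: Option<Option<String>>,
        fallback: &str,
    ) -> Option<String> {
        match param {
            None => Some(fallback.to_string()),
            Some(None) => None,
            Some(Some(text)) => Some(text),
        }
    }
    ```

    Keeping `Some(None)` and `Some(Some(""))` as separate cases is what
    lets an explicit empty string stay distinct from an explicit null.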
  • [mcp] Support MCP Apps part 1. (#16082)
    - [x] Add `mcpResource/read` method to read mcp resource.
  • [codex-analytics] add protocol-native turn timestamps (#16638)
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
    * #16870
    * #16706
    * #16659
    * #16641
    * #16640
    * __->__ #16638
  • extract models manager and related ownership from core (#16508)
    ## Summary
    - split `models-manager` out of `core` and add `ModelsManagerConfig`
    plus `Config::to_models_manager_config()` so model metadata paths stop
    depending on `core::Config`
    - move login-owned/auth-owned code out of `core` into `codex-login`,
    move model provider config into `codex-model-provider-info`, move API
    bridge mapping into `codex-api`, move protocol-owned types/impls into
    `codex-protocol`, and move response debug helpers into a dedicated
    `response-debug-context` crate
    - move feedback tag emission into `codex-feedback`, relocate tests to
    the crates that now own the code, and keep broad temporary re-exports so
    this PR avoids a giant import-only rewrite
    
    ## Major moves and decisions
    - created `codex-models-manager` as the owner for model
    cache/catalog/config/model info logic, including the new
    `ModelsManagerConfig` struct
    - created `codex-model-provider-info` as the owner for provider config
    parsing/defaults and kept temporary `codex-login`/`codex-core`
    re-exports for old import paths
    - moved `api_bridge` error mapping + `CoreAuthProvider` into
    `codex-api`, while `codex-login::api_bridge` temporarily re-exports
    those symbols and keeps the `auth_provider_from_auth` wrapper
    - moved `auth_env_telemetry` and `provider_auth` ownership to
    `codex-login`
    - moved `CodexErr` ownership to `codex-protocol::error`, plus
    `StreamOutput`, `bytes_to_string_smart`, and network policy helpers to
    protocol-owned modules
    - created `codex-response-debug-context` for
    `extract_response_debug_context`, `telemetry_transport_error_message`,
    and related response-debug plumbing instead of leaving that behavior in
    `core`
    - moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and
    `emit_feedback_request_tags_with_auth_env` to `codex-feedback`
    - deferred removal of temporary re-exports and the mechanical import
    rewrites to a stacked follow-up PR so this PR stays reviewable
    
    ## Test moves
    - moved auth refresh coverage from `core/tests/suite/auth_refresh.rs` to
    `login/tests/suite/auth_refresh.rs`
    - moved text encoding coverage from
    `core/tests/suite/text_encoding_fix.rs` to
    `protocol/src/exec_output_tests.rs`
    - moved model info override coverage from
    `core/tests/suite/model_info_overrides.rs` to
    `models-manager/src/model_info_overrides_tests.rs`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • fix(guardian): make GuardianAssessmentEvent.action strongly typed (#16448)
    ## Description
    
    Previously the `action` field on `EventMsg::GuardianAssessment`, which
    describes what Guardian is reviewing, was typed as an arbitrary JSON
    blob. This PR cleans it up and defines a sum type representing all the
    various actions that Guardian can review.
    
    This is a breaking change (on purpose), which is fine because:
    - the Codex app / VSCE does not actually use `action` at the moment
    - the TUI code that consumes `action` is updated in this PR as well
    - rollout files that serialized old `EventMsg::GuardianAssessment` will
    just silently drop these guardian events
    - the contract is defined as unstable, so other clients have a fair
    warning :)
    
    This will make things much easier for followup Guardian work.
    
    ## Why
    
    The old guardian review payloads worked, but they pushed too much shape
    knowledge into downstream consumers. The TUI had custom JSON parsing
    logic for commands, patches, network requests, and MCP calls, and the
    app-server protocol was effectively just passing through an opaque blob.
    
    Typing this at the protocol boundary makes the contract clearer.
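
    A sketch of what a sum type over reviewable actions might look like.
    The variant names and fields are guesses based on the four kinds of
    payload mentioned above (commands, patches, network requests, and MCP
    calls); the real protocol type may differ.

    ```rust
    /// Hypothetical typed replacement for the old opaque JSON `action` blob.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum GuardianAction {
        Command { argv: Vec<String> },
        ApplyPatch { files: Vec<String> },
        NetworkAccess { host: String },
        McpToolCall { server: String, tool: String },
    }

    /// A consumer matches exhaustively instead of parsing JSON by hand.
    pub fn summarize(action: &GuardianAction) -> String {
        match action {
            GuardianAction::Command { argv } => format!("command: {}", argv.join(" ")),
            GuardianAction::ApplyPatch { files } => {
                format!("patch touching {} file(s)", files.len())
            }
            GuardianAction::NetworkAccess { host } => format!("network request to {}", host),
            GuardianAction::McpToolCall { server, tool } => {
                format!("MCP call {}/{}", server, tool)
            }
        }
    }
    ```

    Adding a new variant then becomes a compile error in every consumer
    that has not handled it.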
  • login: treat provider auth refresh_interval_ms=0 as no auto-refresh (#16480)
    ## Why
    
    Follow-up to #16288: the new dynamic provider auth token flow currently
    defaults `refresh_interval_ms` to a non-zero value and rejects `0`
    entirely.
    
    For command-backed bearer auth, `0` should mean "never auto-refresh".
    That lets callers keep using the cached token until the backend actually
    returns `401 Unauthorized`, at which point Codex can rerun the auth
    command as part of the existing retry path.
    
    ## What changed
    
    - changed `ModelProviderAuthInfo.refresh_interval_ms` to accept `0` and
    documented that value as disabling proactive refresh
    - updated the external bearer token refresher to treat
    `refresh_interval_ms = 0` as an indefinitely reusable cached token,
    while still rerunning the auth command during unauthorized recovery
    - regenerated `core/config.schema.json` so the schema minimum is `0` and
    the new behavior is described in the field docs
    - added coverage for both config deserialization and the no-auto-refresh
    plus `401` recovery behavior
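
    The refresh rule above can be sketched in a few lines. The function
    names are hypothetical; the behavior shown (zero disables proactive
    refresh, while a `401` still reruns the auth command) follows the
    description.

    ```rust
    use std::time::Duration;

    /// Proactive refresh: never when the interval is 0, otherwise once the
    /// cached token is at least `refresh_interval_ms` old.
    pub fn needs_proactive_refresh(token_age: Duration, refresh_interval_ms: u64) -> bool {
        if refresh_interval_ms == 0 {
            return false;
        }
        token_age >= Duration::from_millis(refresh_interval_ms)
    }

    /// The auth command is rerun on an unauthorized response regardless of
    /// the interval, or when a proactive refresh is due.
    pub fn should_rerun_auth_command(
        token_age: Duration,
        refresh_interval_ms: u64,
        got_unauthorized: bool,
    ) -> bool {
        got_unauthorized || needs_proactive_refresh(token_age, refresh_interval_ms)
    }
    ```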
    
    ## How tested
    
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-login`
    - `cargo test -p codex-core test_deserialize_provider_auth_config_`
  • auth: let AuthManager own external bearer auth (#16287)
    ## Summary
    
    `AuthManager` and `UnauthorizedRecovery` already own token resolution
    and staged `401` recovery. The missing piece for provider auth was a
    bearer-only mode that still fit that design, instead of pushing a second
    auth abstraction into `codex-core`.
    
    This PR keeps the design centered on `AuthManager`: it teaches
    `codex-login` how to own external bearer auth directly so later provider
    work can keep calling `AuthManager.auth()` and `UnauthorizedRecovery`.
    
    ## Motivation
    
    This is the middle layer for #15189.
    
    The intended design is still:
    
    - `AuthManager` encapsulates token storage and refresh
    - `UnauthorizedRecovery` powers staged `401` recovery
    - all request tokens go through `AuthManager.auth()`
    
    This PR makes that possible for provider-backed bearer tokens by adding
    a bearer-only auth mode inside `AuthManager` instead of building
    parallel request-auth plumbing in `core`.
    
    ## What Changed
    
    - move `ModelProviderAuthInfo` into `codex-protocol` so `core` and
    `login` share one config shape
    - add `login/src/auth/external_bearer.rs`, which runs the configured
    command, caches the bearer token in memory, and refreshes it after `401`
    - add `AuthManager::external_bearer_only(...)` for provider-scoped
    request paths that should use command-backed bearer auth without
    mutating the shared OpenAI auth manager
    - add `AuthManager::shared_with_external_chatgpt_auth_refresher(...)`
    and rename the other `AuthManager` helpers that only apply to external
    ChatGPT auth so the ChatGPT-only path is explicit at the call site
    - keep external ChatGPT refresh behavior unchanged while ensuring
    bearer-only external auth never persists to `auth.json`
    
    ## Testing
    
    - `cargo test -p codex-login`
    - `cargo test -p codex-protocol`
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16287).
    * #16288
    * __->__ #16287
  • Remove remaining custom prompt support (#16115)
    ## Summary
    - remove protocol and core support for discovering and listing custom
    prompts
    - simplify the TUI slash-command flow and command popup to built-in
    commands only
    - delete obsolete custom prompt tests, helpers, and docs references
    - clean up downstream event handling for the removed protocol events
  • chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
    ## Why
    
    `argument-comment-lint` was green in CI even though the repo still had
    many uncommented literal arguments. The main gap was target coverage:
    the repo wrapper did not force Cargo to inspect test-only call sites, so
    examples like the `latest_session_lookup_params(true, ...)` tests in
    `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.
    
    This change cleans up the existing backlog, makes the default repo lint
    path cover all Cargo targets, and starts rolling that stricter CI
    enforcement out on the platform where it is currently validated.
    
    ## What changed
    
    - mechanically fixed existing `argument-comment-lint` violations across
    the `codex-rs` workspace, including tests, examples, and benches
    - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
    `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
    `--all-targets` unless the caller explicitly narrows the target set
    - fixed both wrappers so forwarded cargo arguments after `--` are
    preserved with a single separator
    - documented the new default behavior in
    `tools/argument-comment-lint/README.md`
    - updated `rust-ci` so the macOS lint lane keeps the plain wrapper
    invocation and therefore enforces `--all-targets`, while Linux and
    Windows temporarily pass `-- --lib --bins`
    
    That temporary CI split keeps the stricter all-targets check where it is
    already cleaned up, while leaving room to finish the remaining Linux-
    and Windows-specific target-gated cleanup before enabling
    `--all-targets` on those runners. The Linux and Windows failures on the
    intermediate revision were caused by the wrapper forwarding bug, not by
    additional lint findings in those lanes.
    
    ## Validation
    
    - `bash -n tools/argument-comment-lint/run.sh`
    - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
    - shell-level wrapper forwarding check for `-- --lib --bins`
    - shell-level wrapper forwarding check for `-- --tests`
    - `just argument-comment-lint`
    - `cargo test` in `tools/argument-comment-lint`
    - `cargo test -p codex-terminal-detection`
    
    ## Follow-up
    
    - Clean up remaining Linux-only target-gated callsites, then switch the
    Linux lint lane back to the plain wrapper invocation.
    - Clean up remaining Windows-only target-gated callsites, then switch
    the Windows lint lane back to the plain wrapper invocation.
  • Add usage-based business plan types (#15934)
    ## Summary
    - add `self_serve_business_usage_based` and `enterprise_cbp_usage_based`
    to the public/internal plan enums and regenerate the app-server + Python
    SDK artifacts
    - map both plans through JWT login and backend rate-limit payloads, then
    bucket them with the existing Team/Business entitlement behavior in
    cloud requirements, usage-limit copy, tooltips, and status display
    - keep the earlier display-label remap commit on this branch so the new
    Team-like and Business-like plans render consistently in the UI
    
    ## Testing
    - `just write-app-server-schema`
    - `uv run --project sdk/python python
    sdk/python/scripts/update_sdk_artifacts.py generate-types`
    - `just fix -p codex-protocol -p codex-login -p codex-core -p
    codex-backend-client -p codex-cloud-requirements -p codex-tui -p
    codex-tui-app-server -p codex-backend-openapi-models`
    - `just fmt`
    - `just argument-comment-lint`
    - `cargo test -p codex-protocol
    usage_based_plan_types_use_expected_wire_names`
    - `cargo test -p codex-login usage_based`
    - `cargo test -p codex-backend-client usage_based`
    - `cargo test -p codex-cloud-requirements usage_based`
    - `cargo test -p codex-core usage_limit_reached_error_formats_`
    - `cargo test -p codex-tui plan_type_display_name_remaps_display_labels`
    - `cargo test -p codex-tui remapped`
    - `cargo test -p codex-tui-app-server
    plan_type_display_name_remaps_display_labels`
    - `cargo test -p codex-tui-app-server remapped`
    - `cargo test -p codex-tui-app-server
    preserves_usage_based_plan_type_wire_name`
    
    ## Notes
    - a broader multi-crate `cargo test` run still hits unrelated existing
    guardian-approval config failures in
    `codex-rs/core/src/config/config_tests.rs`
  • permissions: remove macOS seatbelt extension profiles (#15918)
    ## Why
    
    `PermissionProfile` should only describe the per-command permissions we
    still want to grant dynamically. Keeping
    `MacOsSeatbeltProfileExtensions` in that surface forced extra macOS-only
    approval, protocol, schema, and TUI branches for a capability we no
    longer want to expose.
    
    ## What changed
    
    - Removed the macOS-specific permission-profile types from
    `codex-protocol`, the app-server v2 API, and the generated
    schema/TypeScript artifacts.
    - Deleted the core and sandboxing plumbing that threaded
    `MacOsSeatbeltProfileExtensions` through execution requests and seatbelt
    construction.
    - Simplified macOS seatbelt generation so it always includes the fixed
    read-only preferences allowlist instead of carrying a configurable
    profile extension.
    - Removed the macOS additional-permissions UI/docs/test coverage and
    deleted the obsolete macOS permission modules.
    - Tightened `request_permissions` intersection handling so explicitly
    empty requested read lists are preserved only when that field was
    actually granted, avoiding zero-grant responses being stored as active
    permissions.
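
    A minimal sketch of the intersection rule in the last bullet, under
    stated assumptions: `intersect_read_paths` and its `Option`-of-slice
    shape are hypothetical and are not the actual `request_permissions`
    types. `None` stands for "field absent" and `Some` of an empty list
    stands for "explicitly empty".

    ```rust
    /// Hypothetical model of one permission field.
    /// `None` means the field was not requested (or not granted);
    /// `Some(&[])` means it was present but explicitly empty.
    fn intersect_read_paths(
        requested: Option<&[String]>,
        granted: Option<&[String]>,
    ) -> Option<Vec<String>> {
        match (requested, granted) {
            // Both sides carry the field: keep only the requested paths that
            // were granted. An explicitly empty request stays `Some(vec![])`.
            (Some(req), Some(grant)) => Some(
                req.iter()
                    .filter(|p| grant.contains(*p))
                    .cloned()
                    .collect(),
            ),
            // The field was not granted (or not requested): report nothing,
            // so a zero-grant response is never recorded as an active
            // permission.
            _ => None,
        }
    }
    ```

    In this sketch an empty requested list survives only when the grant
    side also carries the field; otherwise the result is `None`.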
  • chore: remove skill metadata from command approval payloads (#15906)
    ## Why
    
    This is effectively a follow-up to
    [#15812](https://github.com/openai/codex/pull/15812). That change
    removed the special skill-script exec path, but `skill_metadata` was
    still being threaded through command-approval payloads even though the
    approval flow no longer uses it to render prompts or resolve decisions.
    
    Keeping it around added extra protocol, schema, and client surface area
    without changing behavior.
    
    Removing it keeps the command-approval contract smaller and avoids
    carrying a dead field through app-server, TUI, and MCP boundaries.
    
    ## What changed
    
    - removed `ExecApprovalRequestSkillMetadata` and the corresponding
    `skillMetadata` field from core approval events and the v2 app-server
    protocol
    - removed the generated JSON and TypeScript schema output for that field
    - updated app-server, MCP server, TUI, and TUI app-server approval
    plumbing to stop forwarding the field
    - cleaned up tests that previously constructed or asserted
    `skillMetadata`
    
    ## Testing
    
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-app-server-test-client`
    - `cargo test -p codex-mcp-server`
    - `just argument-comment-lint`
  • Protect first-time project .codex creation across Linux and macOS sandboxes (#15067)
    ## Problem
    
    Codex already treated an existing top-level project `./.codex` directory
    as protected, but there was a gap on first creation.
    
    If `./.codex` did not exist yet, a turn could create files under it,
    such as `./.codex/config.toml`, without going through the same approval
    path as later modifications. That meant the initial write could bypass
    the intended protection for project-local Codex state.
    
    ## What this changes
    
    This PR closes that first-creation gap in the Unix enforcement layers:
    
    - `codex-protocol`
      - treat the top-level project `./.codex` path as a protected carveout
        even when it does not exist yet
      - avoid injecting the default carveout when the user already has an
        explicit rule for that exact path
    - macOS Seatbelt
      - deny writes to both the exact protected path and anything beneath it,
        so creating `./.codex` itself is blocked in addition to writes inside it
    - Linux bubblewrap
      - preserve the same protected-path behavior for first-time creation
        under `./.codex`
    - tests
      - add protocol regressions for missing `./.codex` and explicit-rule
        collisions
      - add Unix sandbox coverage for blocking first-time `./.codex` creation
      - tighten Seatbelt policy assertions around excluded subpaths
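
    The "exact path and anything beneath it" rule can be illustrated with
    a purely lexical check. This is a sketch, not the code in this PR:
    `is_protected_codex_path` is a hypothetical name, and the real
    enforcement lives in the Seatbelt and bubblewrap policies.

    ```rust
    use std::path::Path;

    /// Hypothetical helper: true when `target` is the project's top-level
    /// `.codex` directory or anything beneath it.
    ///
    /// The check is lexical and never touches the filesystem, so it gives
    /// the same answer whether or not `.codex` exists yet.
    fn is_protected_codex_path(project_root: &Path, target: &Path) -> bool {
        let protected = project_root.join(".codex");
        // `Path::starts_with` compares whole components, so `.codexrc`
        // does not match `.codex`.
        target.starts_with(&protected)
    }
    ```

    With a project root of `/repo`, both `/repo/.codex` and
    `/repo/.codex/config.toml` are protected, while `/repo/.codexrc` and
    `/repo/sub/.codex` are not.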
    
    ## Scope
    
    This change is intentionally scoped to protecting the top-level project
    `.codex` subtree from agent writes.
    
    It does not make `.codex` unreadable, and it does not change the product
    behavior around loading project skills from `.codex` when project config
    is untrusted.
    
    ## Why this shape
    
    The fix is pointed rather than broad:
    - it preserves the current model of “project `.codex` is protected from
    writes”
    - it closes the security-relevant first-write hole
    - it avoids folding a larger permissions-model redesign into this PR
    
    ## Validation
    
    - `cargo test -p codex-protocol`
    - `cargo test -p codex-sandboxing seatbelt`
    - `cargo test -p codex-exec --test all
    sandbox_blocks_first_time_dot_codex_creation -- --nocapture`
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • [hooks] add non-streaming (non-stdin style) shell-only PostToolUse support (#15531)
    Chained PR: the base is `eternal/hooks-pretooluse-bash`, not `main`,
    so the PR linked below should go in first.
    
    This adds the post-tool hook that matches the pre-tool functionality in
    https://github.com/openai/codex/pull/15211
    
    `PreToolUse` fires for plain shell calls and can block them before
    execution. `PostToolUse` fires after the command has executed.
    
    example run:
    
    ```
    › as a test, run in parallel the following commands:
      - echo 'one'
      - echo '[block-pre-tool-use]'
      - echo '[block-post-tool-use]'
    
    
    ⚠ MCP startup incomplete (failed: notion, linear)
    
    • Cruising through those three commands in parallel now, and I’ll share the exact outputs right after
      they land.
    
    • Running PreToolUse hook: checking the observatory runes
    
    • Running PreToolUse hook: checking the observatory runes
    
    • Running PreToolUse hook: checking the observatory runes
    
    PreToolUse hook (blocked)
      warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
      feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo 'one'
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo '[block-post-tool-use]'
    
    • Ran echo '[block-post-tool-use]'
      └ [block-post-tool-use]
    
    • Ran echo 'one'
      └ one
    
    • Running PostToolUse hook: reading the observatory aftermath
    
    • Running PostToolUse hook: reading the observatory aftermath
    
    PostToolUse hook (blocked)
      warning: wizard-tower PostToolUse demo blocked the Bash follow-through on purpose.
      feedback: PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to
    continue.
    
    PostToolUse hook (completed)
      warning: wizard-tower PostToolUse demo inspected Bash: echo 'one'
      hook context: PostToolUse demo saw the Bash result. Response preview: one
    
    ──────────────────────────────────────────────────────────────────────────────────────────────────────
    
    • Ran them in parallel. Little wave report:
    
      - echo 'one' completed and returned one.
      - echo '[block-pre-tool-use]' was blocked before execution by the PreToolUse hook.
      - echo '[block-post-tool-use]' ran, but its result was blocked afterward by the PostToolUse hook.
    
      The hook messages were:
    
      - PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
      - PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to continue.
    
    
    › for the blocked messages (correctly blocked in this test) did the tool output land back into
      context?
    
    
    • Yeah, from this run:
    
      - PreToolUse block: the command itself did not execute, so no command output landed in context. Only
        the block message did.
      - PostToolUse block: the command did execute, but the actual stdout did not come back into context.
        Only the post-tool block message did.
    
      So the blocked notifications landed in context, but the blocked command payloads themselves did not.
      Nice clean guardrail, aloha.
    ```
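
    The behavior described at the end of the transcript (the command
    runs, but a blocked result is replaced by the hook's feedback) can be
    sketched as below. `ContextEntry` and `post_tool_use` are hypothetical
    names for illustration, not the hook API added by this PR; the marker
    string is the one used in the demo run.

    ```rust
    /// Hypothetical model of what reaches the model's context after a shell
    /// command has already executed.
    #[derive(Debug, PartialEq)]
    enum ContextEntry {
        /// The command's real stdout.
        Output(String),
        /// The hook's feedback, delivered instead of the stdout.
        HookFeedback(String),
    }

    /// Demo post-tool hook: block any result whose command contains the
    /// marker string from the example run.
    fn post_tool_use(command: &str, stdout: String) -> ContextEntry {
        const MARKER: &str = "[block-post-tool-use]";
        if command.contains(MARKER) {
            ContextEntry::HookFeedback(format!(
                "PostToolUse demo blocked the result after execution. Remove {} to continue.",
                MARKER
            ))
        } else {
            ContextEntry::Output(stdout)
        }
    }
    ```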
  • Move string truncation helpers into codex-utils-string (#15572)
    - move the shared byte-based middle truncation logic from `core` into
    `codex-utils-string`
    - keep token-specific truncation in `codex-core` so rollout can reuse
    the shared helper in the next stacked PR
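
    For readers unfamiliar with byte-based middle truncation, here is a
    minimal sketch of the technique. It is not the helper moved in this
    PR: the function name, the even head/tail split, and the marker text
    are all assumptions.

    ```rust
    /// Hypothetical helper: keep at most `max_bytes` bytes of the original
    /// text, split between the head and the tail, and replace the middle
    /// with a marker that reports how many bytes were dropped.
    fn truncate_middle(s: &str, max_bytes: usize) -> String {
        if s.len() <= max_bytes {
            return s.to_string();
        }
        let head_budget = max_bytes / 2;
        let tail_budget = max_bytes - head_budget;

        // Walk the head end backwards until it sits on a UTF-8 boundary.
        let mut head_end = head_budget;
        while !s.is_char_boundary(head_end) {
            head_end -= 1;
        }
        // Walk the tail start forwards until it sits on a UTF-8 boundary.
        let mut tail_start = s.len() - tail_budget;
        while !s.is_char_boundary(tail_start) {
            tail_start += 1;
        }

        let removed = tail_start - head_end;
        format!(
            "{}…{} bytes truncated…{}",
            &s[..head_end],
            removed,
            &s[tail_start..]
        )
    }
    ```

    Snapping both cut points to character boundaries means the budget is
    counted in bytes while the slices never split a multi-byte character.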
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Move git utilities into a dedicated crate (#15564)
    - create `codex-git-utils` and move the shared git helpers into it with
    file moves preserved for diff readability
    - move the `GitInfo` helpers out of `core` so stacked rollout work can
    depend on the shared crate without carrying its own git info module
    
    ---------
    
    Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
    Co-authored-by: Codex <noreply@openai.com>
  • feat: communication pattern v2 (#15647)
    See internal communication
  • feat: disable notifier v2 and start turn on agent interaction (#15624)
    Make inter-agent communication start a turn.
    
    As part of this, we disable the v2 notifier to prevent odd behaviour,
    for example the agent restarting work while you're talking to it.
  • chore(core) Add approvals reviewer to UserTurn (#15426)
    ## Summary
    Adds support for `approvals_reviewer` to `Op::UserTurn` so we can migrate
    `CodexMessageProcessor::turn_start` to use `Op::UserTurn`
    
    ## Testing
    - [x] Adds quick test for the new field
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: use serde to differentiate inter-agent communication (#15560)
    Use `serde` to encode the inter-agent communication into an assistant
    message, and use the decode step to check whether a message is one
    
    Note: this assumes `serde` on a small pattern is fast enough
  • [hooks] add non-streaming (non-stdin style) shell-only PreToolUse support (#15211)
    - add `PreToolUse` hook for bash-like tool execution only at first
    - block shell execution before dispatch with deny-only hook behavior
    - introduce a `common.rs` matcher framework for deciding when hooks run
    
    example run:
    
    ```
    › run three parallel echo commands, and the second one should echo "[block-pre-tool-use]" as a test
    
    
    • Running the three echo commands in parallel now and I’ll report the output directly.
    
    • Running PreToolUse hook: name for demo pre tool use hook
    
    • Running PreToolUse hook: name for demo pre tool use hook
    
    • Running PreToolUse hook: name for demo pre tool use hook
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo "first parallel echo"
      
    PreToolUse hook (blocked)
      warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
      feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo "third parallel echo"
    
    • Ran echo "first parallel echo"
      └ first parallel echo
    
    • Ran echo "third parallel echo"
      └ third parallel echo
    
    • Three little waves went out in parallel.
    
      1. printed first parallel echo
      2. was blocked before execution because it contained the exact test string [block-pre-tool-use]
      3. printed third parallel echo
    
      There was also an unrelated macOS defaults warning around the successful commands, but the echoes
      themselves worked fine. If you want, I can rerun the second one with a slightly modified string so
      it passes cleanly.
    ```
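
    The deny-only behavior in the run above can be sketched as a hook
    that either lets a command through or blocks it before dispatch.
    `PreToolDecision`, `pre_tool_use`, and `dispatch` are hypothetical
    names, not the hook API or the `common.rs` matcher from this PR; the
    marker string is the one used in the demo.

    ```rust
    /// Hypothetical outcome of a deny-only pre-tool hook: it can block a
    /// command, but it cannot rewrite or approve one.
    #[derive(Debug, PartialEq)]
    enum PreToolDecision {
        Continue,
        Block { feedback: String },
    }

    /// Demo hook: block any shell command containing the marker string.
    fn pre_tool_use(command: &str) -> PreToolDecision {
        const MARKER: &str = "[block-pre-tool-use]";
        if command.contains(MARKER) {
            PreToolDecision::Block {
                feedback: format!(
                    "PreToolUse demo blocked the command. Remove {} to continue.",
                    MARKER
                ),
            }
        } else {
            PreToolDecision::Continue
        }
    }

    /// Return only the commands the hook lets through; blocked commands
    /// are never dispatched.
    fn dispatch<'a>(commands: &[&'a str]) -> Vec<&'a str> {
        commands
            .iter()
            .copied()
            .filter(|c| pre_tool_use(c) == PreToolDecision::Continue)
            .collect()
    }
    ```

    Applied to the three echo commands from the transcript, `dispatch`
    returns the first and third and drops the second.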
  • feat: new op type for sub-agents communication (#15556)
    Add `InterAgentCommunication` for v2 agent communication