Commit Graph

66 Commits

  • Add realtime transcript notification in v2 (#15344)
    - emit a typed `thread/realtime/transcriptUpdated` notification from
    live realtime transcript deltas
    - expose that notification as flat `threadId`, `role`, and `text` fields
    instead of a nested transcript array
    - continue forwarding raw `handoff_request` items on
    `thread/realtime/itemAdded`, including the accumulated
    `active_transcript`
    - update app-server docs, tests, and generated protocol schema artifacts
    to match the delta-based payloads
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: change multi-agent to use path-like system instead of uuids (#15313)
    This PR add an URI-based system to reference agents within a tree. This
    comes from a sync between research and engineering.
    
    The main agent (the one manually spawned by a user) is always called
    `/root`. Any sub-agent spawned by it will be `/root/agent_1` for example
    where `agent_1` is chosen by the model.
    
    Any agent can contact any agents using the path.
    
    Paths can be used either in absolute or relative to the calling agents
    
    Resume is not supported for now on this new path
  • Feat/restore image generation history (#15223)
    Restore image generation items in resumed thread history
  • feat(app-server): add mcpServer/startupStatus/updated notification (#15220)
    Exposes the legacy `codex/event/mcp_startup_update` event as an API v2
    notification.
    
    The legacy event has this shape:
    ```
    #[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS)]
    pub struct McpStartupUpdateEvent {
        /// Server name being started.
        pub server: String,
        /// Current startup status.
        pub status: McpStartupStatus,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS)]
    #[serde(rename_all = "snake_case", tag = "state")]
    #[ts(rename_all = "snake_case", tag = "state")]
    pub enum McpStartupStatus {
        Starting,
        Ready,
        Failed { error: String },
        Cancelled,
    }
    ```
  • [hooks] use a user message > developer message for prompt continuation (#14867)
    ## Summary
    
    Persist Stop-hook continuation prompts as `user` messages instead of
    hidden `developer` messages + some requested integration tests
    
    This is a followup to @pakrym 's comment in
    https://github.com/openai/codex/pull/14532 to make sure stop-block
    continuation prompts match training for turn loops
    
    - Stop continuation now writes `<hook_prompt hook_run_id="...">stop
    hook's user prompt<hook_prompt>`
    - Introduces quick-xml dependency, though we already indirectly depended
    on it anyway via syntect
    - This PR only has about 500 lines of actual logic changes, the rest is
    tests/schema
    
    ## Testing
    
    Example run (with a sessionstart hook and 3 stop hooks) - this shows
    context added by session start, then two stop hooks sending their own
    additional prompts in a new turn. The model responds with a single
    message addressing both. Then when that turn ends, the hooks detect that
    they just ran using `stop_hook_active` and decide not to infinite loop
    
    test files for this (unzip, move codex -> .codex):
    [codex.zip](https://github.com/user-attachments/files/26075806/codex.zip)
    
    ```
    › cats
    
    
    • Running SessionStart hook: lighting the observatory
    
    SessionStart hook (completed)
      warning: Hi, I'm a session start hook for wizard-tower (startup).
      hook context: A wimboltine stonpet is an exotic cuisine from hyperspace
    
    • Cats are tiny zen wizards, my friend: equal parts nap, mystery, and chaos. If you want, we can talk
      cat facts, cat breeds, cat names, or build something cat-themed in this repo.
    
    • Running Stop hook: checking the tower wards
    
    • Running Stop hook: sacking the guards
    
    • Running Stop hook: hiring the guards
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook reviewed the completed reply (177 chars).
    
    Stop hook (blocked)
      warning: Wizard Tower Stop hook continuing conversation
      feedback: cook the stonpet
    
    Stop hook (blocked)
      warning: Wizard Tower Stop hook continuing conversation
      feedback: eat the cooked stonpet
    
    • Stonpet’s cooked, aloha style: flash-seared over a blue quasiflame, glazed with nebula salt, and
      rested until the hyperspace juices settle.
    
      Now we eat with gratitude, my friend. One mindful bite in, and the flavor is pure cosmic surf:
      smoky, bright, and totally out of this dimension.
    
    • Running Stop hook: checking the tower wards
    
    • Running Stop hook: sacking the guards
    
    • Running Stop hook: hiring the guards
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook reviewed the completed reply (285 chars).
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
    
    Stop hook (completed)
      warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
    ```
  • feat: support product-scoped plugins. (#15041)
    1. Added SessionSource::Custom(String) and --session-source.
      2. Enforced plugin and skill products by session_source.
      3. Applied the same filtering to curated background refresh.
  • Add thread/shellCommand to app server API surface (#14988)
    This PR adds a new `thread/shellCommand` app server API so clients can
    implement `!` shell commands. These commands are executed within the
    sandbox, and the command text and output are visible to the model.
    
    The internal implementation mirrors the current TUI `!` behavior.
    - persist shell command execution as `CommandExecution` thread items,
    including source and formatted output metadata
    - bridge live and replayed app-server command execution events back into
    the existing `tui_app_server` exec rendering path
    
    This PR also wires `tui_app_server` to submit `!` commands through the
    new API.
  • Simple directory mentions (#14970)
    - Adds simple support for directory mentions in the TUI.
    - Codex App/VS Code will require minor change to recognize a directory
    mention as such and change the link behavior.
    - Directory mentions have a trailing slash to differentiate from
    extensionless files
    
    
    <img width="972" height="382" alt="image"
    src="https://github.com/user-attachments/assets/8035b1eb-0978-465b-8d7a-4db2e5feca39"
    />
    <img width="978" height="228" alt="image"
    src="https://github.com/user-attachments/assets/af22cf0b-dd10-4440-9bee-a09915f6ba52"
    />
  • Revert "fix: harden plugin feature gating" (#15102)
    Reverts openai/codex#15020
    
    I messed up the commit in my PR and accidentally merged changes that
    were still under review.
  • fix: harden plugin feature gating (#15020)
    1. Use requirement-resolved config.features as the plugin gate.
    2. Guard plugin/list, plugin/read, and related flows behind that gate.
    3. Skip bad marketplace.json files instead of failing the whole list.
    4. Simplify plugin state and caching.
  • [hooks] userpromptsubmit - hook before user's prompt is executed (#14626)
    - this allows blocking the user's prompts from executing, and also
    prevents them from entering history
    - handles the edge case where you can both prevent the user's prompt AND
    add n amount of additionalContexts
    - refactors some old code into common.rs where hooks overlap
    functionality
    - refactors additionalContext being previously added to user messages,
    instead we use developer messages for them
    - handles queued messages correctly
    
    Sample hook for testing - if you write "[block-user-submit]" this hook
    will stop the thread:
    
    example run
    ```
    › sup
    
    
    • Running UserPromptSubmit hook: reading the observatory notes
    
    UserPromptSubmit hook (completed)
      warning: wizard-tower UserPromptSubmit demo inspected: sup
      hook context: Wizard Tower UserPromptSubmit demo fired. For this reply only, include the exact
    phrase 'observatory lanterns lit' exactly once near the end.
    
    • Just riding the cosmic wave and ready to help, my friend. What are we building today? observatory
      lanterns lit
    
    
    › and [block-user-submit]
    
    
    • Running UserPromptSubmit hook: reading the observatory notes
    
    UserPromptSubmit hook (stopped)
      warning: wizard-tower UserPromptSubmit demo blocked the prompt on purpose.
      stop: Wizard Tower demo block: remove [block-user-submit] to continue.
    ```
    
    .codex/config.toml
    ```
    [features]
    codex_hooks = true
    ```
    
    .codex/hooks.json
    ```
    {
      "hooks": {
        "UserPromptSubmit": [
          {
            "hooks": [
              {
                "type": "command",
                "command": "/usr/bin/python3 .codex/hooks/user_prompt_submit_demo.py",
                "timeoutSec": 10,
                "statusMessage": "reading the observatory notes"
              }
            ]
          }
        ]
      }
    }
    ```
    
    .codex/hooks/user_prompt_submit_demo.py
    ```
    #!/usr/bin/env python3
    
    import json
    import sys
    from pathlib import Path
    
    
    def prompt_from_payload(payload: dict) -> str:
        prompt = payload.get("prompt")
        if isinstance(prompt, str) and prompt.strip():
            return prompt.strip()
    
        event = payload.get("event")
        if isinstance(event, dict):
            user_prompt = event.get("user_prompt")
            if isinstance(user_prompt, str):
                return user_prompt.strip()
    
        return ""
    
    
    def main() -> int:
        payload = json.load(sys.stdin)
        prompt = prompt_from_payload(payload)
        cwd = Path(payload.get("cwd", ".")).name or "wizard-tower"
    
        if "[block-user-submit]" in prompt:
            print(
                json.dumps(
                    {
                        "systemMessage": (
                            f"{cwd} UserPromptSubmit demo blocked the prompt on purpose."
                        ),
                        "decision": "block",
                        "reason": (
                            "Wizard Tower demo block: remove [block-user-submit] to continue."
                        ),
                    }
                )
            )
            return 0
    
        prompt_preview = prompt or "(empty prompt)"
        if len(prompt_preview) > 80:
            prompt_preview = f"{prompt_preview[:77]}..."
    
        print(
            json.dumps(
                {
                    "systemMessage": (
                        f"{cwd} UserPromptSubmit demo inspected: {prompt_preview}"
                    ),
                    "hookSpecificOutput": {
                        "hookEventName": "UserPromptSubmit",
                        "additionalContext": (
                            "Wizard Tower UserPromptSubmit demo fired. "
                            "For this reply only, include the exact phrase "
                            "'observatory lanterns lit' exactly once near the end."
                        ),
                    },
                }
            )
        )
        return 0
    
    
    if __name__ == "__main__":
        raise SystemExit(main())
    ```
  • Gate realtime audio interruption logic to v2 (#14984)
    - thread the realtime version into conversation start and app-server
    notifications
    - keep playback-aware mic gating and playback interruption behavior on
    v2 only, leaving v1 on the legacy path
  • [stack 2/4] Align main realtime v2 wire and runtime flow (#14830)
    ## Stack Position
    2/4. Built on top of #14828.
    
    ## Base
    - #14828
    
    ## Unblocks
    - #14829
    - #14827
    
    ## Scope
    - Port the realtime v2 wire parsing, session, app-server, and
    conversation runtime behavior onto the split websocket-method base.
    - Branch runtime behavior directly on the current realtime session kind
    instead of parser-derived flow flags.
    - Keep regression coverage in the existing e2e suites.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: make interrupt state not final for multi-agents (#13850)
    Make `interrupted` an agent state and make it not final. As a result, a
    `wait` won't return on an interrupted agent and no notification will be
    send to the parent agent.
    
    The rationals are:
    * If a user interrupt a sub-agent for any reason, you don't want the
    parent agent to instantaneously ask the sub-agent to restart
    * If a parent agent interrupt a sub-agent, no need to add a noisy
    notification in the parent agen
  • Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
    ## Summary
    - add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
    control for who reviews approval requests
    - route Smart Approvals guardian review through core for command
    execution, file changes, managed-network approvals, MCP approvals, and
    delegated/subagent approval flows
    - expose guardian review in app-server with temporary unstable
    `item/autoApprovalReview/{started,completed}` notifications carrying
    `targetItemId`, `review`, and `action`
    - update the TUI so Smart Approvals can be enabled from `/experimental`,
    aligned with the matching `/approvals` mode, and surfaced clearly while
    reviews are pending or resolved
    
    ## Runtime model
    This PR does not introduce a new `approval_policy`.
    
    Instead:
    - `approval_policy` still controls when approval is needed
    - `approvals_reviewer` controls who reviewable approval requests are
    routed to:
      - `user`
      - `guardian_subagent`
    
    `guardian_subagent` is a carefully prompted reviewer subagent that
    gathers relevant context and applies a risk-based decision framework
    before approving or denying the request.
    
    The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
    behavior keys off `approvals_reviewer`.
    
    When Smart Approvals is enabled from the TUI, it also switches the
    current `/approvals` settings to the matching Smart Approvals mode so
    users immediately see guardian review in the active thread:
    - `approval_policy = on-request`
    - `approvals_reviewer = guardian_subagent`
    - `sandbox_mode = workspace-write`
    
    Users can still change `/approvals` afterward.
    
    Config-load behavior stays intentionally narrow:
    - plain `smart_approvals = true` in `config.toml` remains just the
    rollout/UI gate and does not auto-set `approvals_reviewer`
    - the deprecated `guardian_approval = true` alias migration does
    backfill `approvals_reviewer = "guardian_subagent"` in the same scope
    when that reviewer is not already configured there, so old configs
    preserve their original guardian-enabled behavior
    
    ARC remains a separate safety check. For MCP tool approvals, ARC
    escalations now flow into the configured reviewer instead of always
    bypassing guardian and forcing manual review.
    
    ## Config stability
    The runtime reviewer override is stable, but the config-backed
    app-server protocol shape is still settling.
    
    - `thread/start`, `thread/resume`, and `turn/start` keep stable
    `approvalsReviewer` overrides
    - the config-backed `approvals_reviewer` exposure returned via
    `config/read` (including profile-level config) is now marked
    `[UNSTABLE]` / experimental in the app-server protocol until we are more
    confident in that config surface
    
    ## App-server surface
    This PR intentionally keeps the guardian app-server shape narrow and
    temporary.
    
    It adds generic unstable lifecycle notifications:
    - `item/autoApprovalReview/started`
    - `item/autoApprovalReview/completed`
    
    with payloads of the form:
    - `{ threadId, turnId, targetItemId, review, action? }`
    
    `review` is currently:
    - `{ status, riskScore?, riskLevel?, rationale? }`
    - where `status` is one of `inProgress`, `approved`, `denied`, or
    `aborted`
    
    `action` carries the guardian action summary payload from core when
    available. This lets clients render temporary standalone pending-review
    UI, including parallel reviews, even when the underlying tool item has
    not been emitted yet.
    
    These notifications are explicitly documented as `[UNSTABLE]` and
    expected to change soon.
    
    This PR does **not** persist guardian review state onto `thread/read`
    tool items. The intended follow-up is to attach guardian review state to
    the reviewed tool item lifecycle instead, which would improve
    consistency with manual approvals and allow thread history / reconnect
    flows to replay guardian review state directly.
    
    ## TUI behavior
    - `/experimental` exposes the rollout gate as `Smart Approvals`
    - enabling it in the TUI enables the feature and switches the current
    session to the matching Smart Approvals `/approvals` mode
    - disabling it in the TUI clears the persisted `approvals_reviewer`
    override when appropriate and returns the session to default manual
    review when the effective reviewer changes
    - `/approvals` still exposes the reviewer choice directly
    - the TUI renders:
    - pending guardian review state in the live status footer, including
    parallel review aggregation
      - resolved approval/denial state in history
    
    ## Scope notes
    This PR includes the supporting core/runtime work needed to make Smart
    Approvals usable end-to-end:
    - shell / unified-exec / apply_patch / managed-network / MCP guardian
    review
    - delegated/subagent approval routing into guardian review
    - guardian review risk metadata and action summaries for app-server/TUI
    - config/profile/TUI handling for `smart_approvals`, `guardian_approval`
    alias migration, and `approvals_reviewer`
    - a small internal cleanup of delegated approval forwarding to dedupe
    fallback paths and simplify guardian-vs-parent approval waiting (no
    intended behavior change)
    
    Out of scope for this PR:
    - redesigning the existing manual approval protocol shapes
    - persisting guardian review state onto app-server `ThreadItem`s
    - delegated MCP elicitation auto-review (the current delegated MCP
    guardian shim only covers the legacy `RequestUserInput` path)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Include spawn agent model metadata in app-server items (#14410)
    - add model and reasoning effort to app-server collab spawn items and
    notifications
    - regenerate app-server protocol schemas for the new fields
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • start of hooks engine (#13276)
    (Experimental)
    
    This PR adds a first MVP for hooks, with SessionStart and Stop
    
    The core design is:
    
    - hooks live in a dedicated engine under codex-rs/hooks
    - each hook type has its own event-specific file
    - hook execution is synchronous and blocks normal turn progression while
    running
    - matching hooks run in parallel, then their results are aggregated into
    a normalized HookRunSummary
    
    On the AppServer side, hooks are exposed as operational metadata rather
    than transcript-native items:
    
    - new live notifications: hook/started, hook/completed
    - persisted/replayed hook results live on Turn.hookRuns
    - we intentionally did not add hook-specific ThreadItem variants
    
    Hooks messages are not persisted, they remain ephemeral. The context
    changes they add are (they get appended to the user's prompt)
  • app-server: Add streaming and tty/pty capabilities to command/exec (#13640)
    * Add an ability to stream stdin, stdout, and stderr
    * Streaming of stdout and stderr has a configurable cap for total amount
    of transmitted bytes (with an ability to disable it)
    * Add support for overriding environment variables
    * Add an ability to terminate running applications (using
    `command/exec/terminate`)
    * Add TTY/PTY support, with an ability to resize the terminal (using
    `command/exec/resize`)
  • add @plugin mentions (#13510)
    ## Note-- added plugin mentions via @, but that conflicts with file
    mentions
    
    depends and builds upon #13433.
    
    - introduces explicit `@plugin` mentions. this injects the plugin's mcp
    servers, app names, and skill name format into turn context as a dev
    message.
    - we do not yet have UI for these mentions, so we currently parse raw
    text (as opposed to skills and apps which have UI chips, autocomplete,
    etc.) this depends on a `plugins/list` app-server endpoint we can feed
    the UI with, which is upcoming
    - also annotate mcp and app tool descriptions with the plugin(s) they
    come from. this gives the model a first class way of understanding what
    tools come from which plugins, which will help implicit invocation.
    
    ### Tests
    Added and updated tests, unit and integration. Also confirmed locally a
    raw `@plugin` injects the dev message, and the model knows about its
    apps, mcps, and skills.
  • image-gen-event/client_processing (#13512)
    enabling client-side to process with image-generation capabilities
    (setting app-server)
  • feat(app-server): add a skills/changed v2 notification (#13414)
    This adds a first-class app-server v2 `skills/changed` notification for
    the existing skills live-reload signal.
    
    Before this change, clients only had the legacy raw
    `codex/event/skills_update_available` event. With this PR, v2 clients
    can listen for a typed JSON-RPC notification instead of depending on the
    legacy `codex/event/*` stream, which we want to remove soon.
  • [codex] include plan type in account updates (#13181)
    This change fixes a Codex app account-state sync bug where clients could
    know the user was signed in but still miss the ChatGPT subscription
    tier, which could lead to incorrect upgrade messaging for paid users.
    
    The root cause was that `account/updated` only carried `authMode` while
    plan information was available separately via `account/read` and
    rate-limit snapshots, so this update adds `planType` to
    `account/updated`, populates it consistently across login and refresh
    paths.
  • app-server: Add ephemeral field to Thread object (#13084)
    Currently there is no alternative way to know that thread is ephemeral,
    only client which did create it has the knowledge.
  • app-server: Replay pending item requests on thread/resume (#12560)
    Replay pending client requests after `thread/resume` and emit resolved
    notifications when those requests clear so approval/input UI state stays
    in sync after reconnects and across subscribed clients.
    
    Affected RPCs:
    - `item/commandExecution/requestApproval`
    - `item/fileChange/requestApproval`
    - `item/tool/requestUserInput`
    
    Motivation:
    - Resumed clients need to see pending approval/input requests that were
    already outstanding before the reconnect.
    - Clients also need an explicit signal when a pending request resolves
    or is cleared so stale UI can be removed on turn start, completion, or
    interruption.
    
    Implementation notes:
    - Use pending client requests from `OutgoingMessageSender` in order to
    replay them after `thread/resume` attaches the connection, using
    original request ids.
    - Emit `serverRequest/resolved` when pending requests are answered
    or cleared by lifecycle cleanup.
    - Update the app-server protocol schema, generated TypeScript bindings,
    and README docs for the replay/resolution flow.
    
    High-level test plan:
    - Added automated coverage for replaying pending command execution and
    file change approval requests on `thread/resume`.
    - Added automated coverage for resolved notifications in command
    approval, file change approval, request_user_input, turn start, and turn
    interrupt flows.
    - Verified schema/docs updates in the relevant protocol and app-server
    tests.
    
    Manual testing:
    - Tested reconnect/resume with multiple connections.
    - Confirmed state stayed in sync between connections.
  • feat(app-server): thread/unsubscribe API (#10954)
    Adds a new v2 app-server API for a client to be able to unsubscribe to a
    thread:
    - New RPC method: `thread/unsubscribe`
    - New server notification: `thread/closed`
    
    Today clients can start/resume/archive threads, but there wasn’t a way
    to explicitly unload a live thread from memory without archiving it.
    With `thread/unsubscribe`, a client can indicate it is no longer
    actively working with a live Thread. If this is the only client
    subscribed to that given thread, the thread will be automatically closed
    by app-server, at which point the server will send `thread/closed` and
    `thread/status/changed` with `status: notLoaded` notifications.
    
    This gives clients a way to prevent long-running app-server processes
    from accumulating too many thread (and related) objects in memory.
    
    Closed threads will also be removed from `thread/loaded/list`.
  • feat(app-server): add ThreadItem::DynamicToolCall (#12732)
    Previously, clients would call `thread/start` with dynamic_tools set,
    and when a model invokes a dynamic tool, it would just make the
    server->client `item/tool/call` request and wait for the client's
    response to complete the tool call. This works, but it doesn't have an
    `item/started` or `item/completed` event.
    
    Now we are doing this:
    - [new] emit `item/started` with `DynamicToolCall` populated with the
    call arguments
    - send an `item/tool/call` server request
    - [new] once the client responds, emit `item/completed` with
    `DynamicToolCall` populated with the response.
    
    Also, with `persistExtendedHistory: true`, dynamic tool calls are now
    reconstructable in `thread/read` and `thread/resume` as
    `ThreadItem::DynamicToolCall`.
  • Add app-server v2 thread realtime API (#12715)
    Add experimental `thread/realtime/*` v2 requests and notifications, then
    route app-server realtime events through that thread-scoped surface with
    integration coverage.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • ignore v1 in JSON schema codegen (#12408)
    ## Why
    
    The generated unnamespaced JSON envelope schemas (`ClientRequest` and
    `ServerNotification`) still contained both v1 and v2 variants, which
    pulled legacy v1/core types and v2 types into the same `definitions`
    graph. That caused `schemars` to produce numeric suffix names (for
    example `AskForApproval2`, `ByteRange2`, `MessagePhase2`).
    
    This PR moves JSON codegen toward v2-only output while preserving the
    unnamespaced envelope artifacts, and avoids reintroducing numeric-suffix
    tolerance by removing the v1/internal-only variants that caused the
    collisions in those envelope schemas.
    
    ## What Changed
    
    - In `codex-rs/app-server-protocol/src/export.rs`, JSON generation now
    excludes v1 schema artifacts (`v1/*`) while continuing to emit
    unnamespaced/root JSON schemas and the JSON bundle.
    - Added a narrow JSON v1 allowlist (`JSON_V1_ALLOWLIST`) so
    `InitializeParams` and `InitializeResponse` are still emitted.
    - Added JSON-only post-processing for the mixed envelope schemas before
    collision checks run:
    - `ClientRequest`: strips v1 request variants from the generated `oneOf`
    using the temporary `V1_CLIENT_REQUEST_METHODS` list
    - `ServerNotification`: strips v1 notifications plus the internal-only
    `rawResponseItem/completed` notification using the temporary
    `EXCLUDED_SERVER_NOTIFICATION_METHODS_FOR_JSON` list
    - Added a temporary local-definition pruning pass for those envelope
    schemas so now-unreferenced v1/core definitions are removed from
    `definitions` after method filtering.
    - Updated the variant-title naming heuristic for single-property literal
    object variants to use the literal value (when available), avoiding
    collisions like multiple `state`-only variants all deriving the same
    title.
    - Collision handling remains fail-fast (no numeric suffix fallback map
    in this PR path).
    
    ## Verification
    
    - `just write-app-server-schema`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12408).
    * __->__ #12408
    * #12406
  • feat: use OAI Responses API MessagePhase type directly in App Server v2 (#12422)
    https://github.com/openai/codex/pull/10455 introduced the `phase` field,
    and then https://github.com/openai/codex/pull/12072 introduced a
    `MessagePhase` type in `v2.rs` that paralleled the `MessagePhase` type
    in `codex-rs/protocol/src/models.rs`.
    
    The app server protocol prefers `camelCase` while the Responses API uses
    `snake_case`, so this meant we had two versions of `MessagePhase` with
    different serialization rules. When the app server protocol refers to
    types from the Responses API, we use the wire format of the the
    Responses API even though it is inconsistent with the app server API.
    
    This PR deletes `MessagePhase` from `v2.rs` and consolidates on the
    Responses API version to eliminate confusion.
  • Wire realtime api to core (#12268)
    - Introduce `RealtimeConversationManager` for realtime API management 
    - Add `op::conversation` to start conversation, insert audio, insert
    text, and close conversation.
    - emit conversation lifecycle and realtime events.
    - Move shared realtime payload types into codex-protocol and add core
    e2e websocket tests for start/replace/transport-close paths.
    
    Things to consider:
    - Should we use the same `op::` and `Events` channel to carry audio? I
    think we should try this simple approach and later we can create
    separate one if the channels got congested.
    - Sending text updates to the client: we can start simple and later
    restrict that.
    - Provider auth isn't wired for now intentionally
  • Add field to Thread object for the latest rename set for a given thread (#12301)
    Exposes through the app server updated names set for a thread. This
    enables other surfaces to use the core as the source of truth for thread
    naming. `threadName` is gathered using the helper functions used to
    interact with `session_index.jsonl`, and is hydrated in:
    - `thread/list`
    - `thread/read`
    - `thread/resume`
    - `thread/unarchive`
    - `thread/rollback`
    
    We don't do this for `thread/start` and `thread/fork`.
  • feat: cleaner TUI for sub-agents (#12327)
    <img width="760" height="496" alt="Screenshot 2026-02-20 at 14 31 25"
    src="https://github.com/user-attachments/assets/1983b825-bb47-417e-9925-6f727af56765"
    />
  • feat: add nick name to sub-agents (#12320)
    Adding random nick name to sub-agents. Used for UX
    
    At the same time, also storing and wiring the role of the sub-agent
  • feat: add Reject approval policy with granular prompt rejection controls (#12087)
    ## Why
    
    We need a way to auto-reject specific approval prompt categories without
    switching all approvals off.
    
    The goal is to let users independently control:
    - sandbox escalation approvals,
    - execpolicy `prompt` rule approvals,
    - MCP elicitation prompts.
    
    ## What changed
    
    - Added a new primary approval mode in `protocol/src/protocol.rs`:
    
    ```rust
    pub enum AskForApproval {
        // ...
        Reject(RejectConfig),
        // ...
    }
    
    pub struct RejectConfig {
        pub sandbox_approval: bool,
        pub rules: bool,
        pub mcp_elicitations: bool,
    }
    ```
    
    - Wired `RejectConfig` semantics through approval paths in `core`:
      - `core/src/exec_policy.rs`
        - rejects rule-driven prompts when `rules = true`
        - rejects sandbox/escalation prompts when `sandbox_approval = true`
    - preserves rule priority when both rule and sandbox prompt conditions
    are present
      - `core/src/tools/sandboxing.rs`
    - applies `sandbox_approval` to default exec approval decisions and
    sandbox-failure retry gating
      - `core/src/safety.rs`
    - keeps `Reject { all false }` behavior aligned with `OnRequest` for
    patch safety
        - rejects out-of-root patch approvals when `sandbox_approval = true`
      - `core/src/mcp_connection_manager.rs`
        - auto-declines MCP elicitations when `mcp_elicitations = true`
    
    - Ensured approval policy used by MCP elicitation flow stays in sync
    with constrained session policy updates.
    
    - Updated app-server v2 conversions and generated schema/TypeScript
    artifacts for the new `Reject` shape.
    
    ## Verification
    
    Added focused unit coverage for the new behavior in:
    - `core/src/exec_policy.rs`
    - `core/src/tools/sandboxing.rs`
    - `core/src/mcp_connection_manager.rs`
    - `core/src/safety.rs`
    - `core/src/tools/runtimes/apply_patch.rs`
    
    Key cases covered include rule-vs-sandbox prompt precedence, MCP
    auto-decline behavior, and patch/sandbox retry behavior under
    `RejectConfig`.
  • app-server: expose loaded thread status via read/list and notifications (#11786)
    Motivation
    - Today, a newly connected client has no direct way to determine the
    current runtime status of threads from read/list responses alone.
    - This forces clients to infer state from transient events, which can
    lead to stale or inconsistent UI when reconnecting or attaching late.
    
    Changes
    - Add `status` to `thread/read` responses.
    - Add `statuses` to `thread/list` responses.
    - Emit `thread/status/changed` notifications with `threadId` and the new
    status.
    - Track runtime status for all loaded threads and default unknown
    threads to `idle`.
    - Update protocol/docs/tests/schema fixtures for the revised API.
    
    Testing
    - Validated protocol API changes with automated protocol tests and
    regenerated schema/type fixtures.
    - Validated app-server behavior with unit and integration test suites,
    including status transitions and notifications.
  • app-server support for Windows sandbox setup. (#12025)
    app-server support for initiating Windows sandbox setup.
    server responds quickly to setup request and makes a future RPC call
    back to client when the setup finishes.
    
    The TUI implementation is unaffected but in a future PR I'll update the
    TUI to use the shared setup helper
    (`windows_sandbox.run_windows_sandbox_setup`)
  • feat(core): plumb distinct approval ids for command approvals (#12051)
    zsh fork PR stack:
    - https://github.com/openai/codex/pull/12051 👈 
    - https://github.com/openai/codex/pull/12052
    
    With upcoming support for a fork of zsh that allows us to intercept
    `execve` and run execpolicy checks for each subcommand as part of a
    `CommandExecution`, it will be possible for there to be multiple
    approval requests for a shell command like `/path/to/zsh -lc 'git status
    && rg \"TODO\" src && make test'`.
    
    To support that, this PR introduces a new `approval_id` field across
    core, protocol, and app-server so that we can associate approvals
    properly for subcommands.
  • app-server: Emit thread archive/unarchive notifications (#12030)
    * Add v2 server notifications `thread/archived` and `thread/unarchived`
    with a `threadId` payload.
    * Wire new events into `thread/archive` and `thread/unarchive` success
    paths.
    * Update app-server protocol/schema/docs accordingly.
    
    Testing:
    - Updated archive/unarchive end-to-end tests to verify both
    notifications are emitted with the expected thread id payload.
  • [apps] Expose more fields from apps listing endpoints. (#11706)
    - [x] Expose app_metadata, branding, and labels in AppInfo.
  • Feat: add model reroute notification (#12001)
    ### Summary
    Builiding off
    https://github.com/openai/codex/pull/11964/files/5c75aa7b89a70bc2cc410a6fd238749306ec4c5e#diff-058ae8f109a8b84b4b79bbfa45f522c2233b9d9e139696044ae374d50b6196e0,
    we have created a `model/rerouted` notification that captures the event
    so that consumers can render as expected. Keep the `EventMsg::Warning`
    path in core so that this does not affect TUI rendering.
    
    `model/rerouted` is meant to be generic to account for future usage
    including capacity planning etc.
  • feat(core): add structured network approval plumbing and policy decision model (#11672)
    ### Description
    #### Summary
    Introduces the core plumbing required for structured network approvals
    
    #### What changed
    - Added structured network policy decision modeling in core.
    - Added approval payload/context types needed for network approval
    semantics.
    - Wired shell/unified-exec runtime plumbing to consume structured
    decisions.
    - Updated related core error/event surfaces for structured handling.
    - Updated protocol plumbing used by core approval flow.
    - Included small CLI debug sandbox compatibility updates needed by this
    layer.
    
    #### Why
    establishes the minimal backend foundation for network approvals without
    yet changing high-level orchestration or TUI behavior.
    
    #### Notes
    - Behavior remains constrained by existing requirements/config gating.
    - Follow-up PRs in the stack handle orchestration, UX, and app-server
    integration.
    
    ---------
    
    Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
  • [app-server] add fuzzyFileSearch/sessionCompleted (#11773)
    this is to allow the client to know when to stop showing a spinner.
  • [apps] Add is_enabled to app info. (#11417)
    - [x] Add is_enabled to app info and the response of `app/list`.
    - [x] Update TUI to have Enable/Disable button on the app detail page.
  • chore(core) Deprecate approval_policy: on-failure (#11631)
    ## Summary
    In an effort to start simplifying our sandbox setup, we're announcing
    this approval_policy as deprecated. In general, it performs worse than
    `on-request`, and we're focusing on making fewer sandbox configurations
    perform much better.
    
    ## Testing
    - [x] Tested locally
    - [x] Existing tests pass
  • feat(app-server): experimental flag to persist extended history (#11227)
    This PR adds an experimental `persist_extended_history` bool flag to
    app-server thread APIs so rollout logs can retain a richer set of
    EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e.
    on `thread/resume`).
    
    ### Motivation
    Today, our rollout recorder only persists a small subset (e.g. user
    message, reasoning, assistant message) of `EventMsg` types, dropping a
    good number (like command exec, file change, etc.) that are important
    for reconstructing full item history for `thread/resume`, `thread/read`,
    and `thread/fork`.
    
    Some clients want to be able to resume a thread without lossiness. This
    lossiness is primarily a UI thing, since what the model sees are
    `ResponseItem` and not `EventMsg`.
    
    ### Approach
    This change introduces an opt-in `persist_full_history` flag to preserve
    those events when you start/resume/fork a thread (defaults to `false`).
    
    This is done by adding an `EventPersistenceMode` to the rollout
    recorder:
    - `Limited` (existing behavior, default)
    - `Extended` (new opt-in behavior)
    
    In `Extended` mode, persist additional `EventMsg` variants needed for
    non-lossy app-server `ThreadItem` reconstruction. We now store the
    following ThreadItems that we didn't before:
    - web search
    - command execution
    - patch/file changes
    - MCP tool calls
    - image view calls
    - collab tool outcomes
    - context compaction
    - review mode enter/exit
    
    For **command executions** in particular, we truncate the output using
    the existing `truncate_text` from core to store an upper bound of 10,000
    bytes, which is also the default value for truncating tool outputs shown
    to the model. This keeps the size of the rollout file and command
    execution items returned over the wire reasonable.
    
    And we also persist `EventMsg::Error` which we can now map back to the
    Turn's status and populates the Turn's error metadata.
    
    #### Updates to EventMsgs
    To truly make `thread/resume` non-lossy, we also needed to persist the
    `status` on `EventMsg::CommandExecutionEndEvent` and
    `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a
    command failed or was declined (similar for apply_patch). These
    EventMsgs were never persisted before so I made it a required field.