3 Commits

  • [codex] Surface MCP reauthentication-required startup failures (#29877)
    ## Summary
    
    - distinguish expired, non-refreshable stored MCP OAuth credentials from
    first-time missing credentials
    - carry a typed `failureReason: "reauthenticationRequired"` on the
    existing `mcpServer/startupStatus/updated` notification only when user
    action is required
    - keep the public MCP auth-status API unchanged and regenerate the
    app-server protocol schemas and documentation
    
    ## Why
    
    An MCP server with an expired access token and no usable refresh token
    currently fails startup without giving clients a reliable, typed
    recovery signal.
    
    The existing startup-status notification is the natural place to carry
    this state. Its nullable `failureReason` keeps the recovery reason
    attached to the failed startup transition without adding a one-off
    notification. Internally, Codex distinguishes first-time login from
    reauthentication and emits the reason only when the startup error itself
    requires authentication.
    
    ## User impact
    
    App clients can prompt an existing user to reconnect an MCP server when
    automatic recovery is impossible by handling a failed
    `mcpServer/startupStatus/updated` notification whose `failureReason` is
    `reauthenticationRequired`. Starting, ready, cancelled, unrelated
    failures, and first-time setup carry no reauthentication reason.
    
    ## Companion app PR
    
    - openai/openai#1069582
    
    ## Validation
    
    - `just test -p codex-app-server-protocol` — 248 passed; schema fixture
    tests passed
    - `cargo check -p codex-app-server -p codex-tui`
    - `just test -p codex-rmcp-client -p codex-mcp` — 184 passed, 2 skipped
    - `just test -p codex-protocol -p codex-app-server-protocol -p
    codex-mcp` — 579 passed
    - `just write-app-server-schema`
    - `just fmt`
  • fix(tui): scope MCP startup status by thread (#26639)
    ## Why
    
    MCP startup failures from spawned subagents were rendered as global
    notifications, so a child thread's failure could pollute the visible
    parent transcript. Routing the notification to the child exposed two
    related replay problems: session refresh could discard the buffered
    event, and a newly created child `ChatWidget` did not know the expected
    MCP server set, which could leave its startup spinner running after
    every server had settled.
    
    MCP startup diagnostics should remain visible in the thread that owns
    the startup without affecting other transcripts. The protocol also needs
    to support a future app-scoped MCP lifecycle where startup is not owned
    by any thread.
    
    ## Reported Behavior
    
    The [originating Slack
    report](https://openai.slack.com/archives/C08JZTV654K/p1780604538859939)
    called out that using subagents could turn MCP startup failures into a
    wall of yellow CLI warnings because repeated failures were not
    deduplicated. The intended behavior is for those diagnostics to remain
    visible once in the thread that owns the startup, without polluting the
    parent transcript.
    
    ## What Changed
    
    - add nullable `threadId` ownership to `mcpServer/startupStatus/updated`
    - populate it from the app-server conversation ID for the current
    thread-scoped lifecycle and regenerate the protocol schema and
    TypeScript artifacts
    - treat a missing or null `threadId` as app-scoped without injecting it
    into the active chat transcript
    - route and buffer thread-owned MCP startup notifications by thread in
    the TUI
    - preserve buffered MCP startup events across child session refresh
    - seed expected MCP servers before replaying a thread snapshot so
    startup reaches its terminal state
    - suppress an identical repeated failure warning for the same server
    within one startup round
    
    The owning thread still renders the detailed failure and final `MCP
    startup incomplete (...)` summary.
    
    ## How to Test
    
    1. Configure an optional MCP server named `smoke` that exits during
    initialization.
    2. Launch the TUI with multi-agent support enabled.
    3. Confirm the main thread's own startup failure renders one detailed
    `smoke` warning and one incomplete-startup summary.
    4. Spawn exactly one subagent.
    5. Confirm the parent transcript does not receive the subagent's MCP
    startup failure.
    6. Switch to the subagent thread and confirm it contains exactly one
    detailed `smoke` failure and one incomplete-startup summary.
    7. Confirm the subagent's MCP startup spinner disappears and the thread
    remains usable.
    8. Switch between the parent and subagent and confirm the warnings
    neither move nor duplicate.
    
    Targeted tests:
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server
    thread_start_emits_mcp_server_status_updated_notifications`
    - `just test -p codex-tui mcp_startup`
    
    The parent/child behavior and spinner completion were also exercised
    manually in tmux. `just argument-comment-lint` was attempted but blocked
    by an unrelated local Bazel LLVM empty-glob failure; touched Rust
    callsites were inspected manually.
  • feat(app-server): add mcpServer/startupStatus/updated notification (#15220)
    Exposes the legacy `codex/event/mcp_startup_update` event as an API v2
    notification.
    
    The legacy event has this shape:
    ```
    #[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS)]
    pub struct McpStartupUpdateEvent {
        /// Server name being started.
        pub server: String,
        /// Current startup status.
        pub status: McpStartupStatus,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS)]
    #[serde(rename_all = "snake_case", tag = "state")]
    #[ts(rename_all = "snake_case", tag = "state")]
    pub enum McpStartupStatus {
        Starting,
        Ready,
        Failed { error: String },
        Cancelled,
    }
    ```