Commit Graph

150 Commits

  • fix(tui): conditionally restore status indicator using message phase (#10947)
    TLDR: use new message phase field emitted by preamble-supported models
    to determine whether an AgentMessage is mid-turn commentary. If so,
    restore the status indicator afterwards to indicate the turn has not
    completed.
    
    ### Problem
    `commit_tick` hides the status indicator while streaming assistant text.
    For preamble-capable models, that text can be commentary mid-turn, so
    hiding was correct during streaming but restore timing mattered:
    - restoring too aggressively caused jitter/flashing
    - not restoring caused indicator to stay hidden before subsequent work
    (tool calls, web search, etc.)
    
    ### Fix
    - Add optional `phase` to `AgentMessageItem` and propagate it from
    `ResponseItem::Message`
    - Keep indicator hidden during streamed commit ticks, restore only when:
      - assistant item completes as `phase=commentary`, and
      - stream queues are idle + task is still running.
    - Treat `phase=None` as final-answer behavior (no restore) to keep
    existing behavior for non-preamble models
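    
    A minimal sketch of the restore rule above. The names `MessagePhase`,
    `FinalAnswer`, and `should_restore_status` are illustrative assumptions,
    not identifiers taken from this PR:
    
    ```rust
    /// Hypothetical mirror of the optional `phase` carried on an agent message.
    /// `FinalAnswer` is an assumed name for the non-commentary phase.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum MessagePhase {
        Commentary,
        FinalAnswer,
    }
    
    /// Decide whether to restore the status indicator when an assistant item completes.
    pub fn should_restore_status(
        phase: Option<MessagePhase>,
        stream_queues_idle: bool,
        task_running: bool,
    ) -> bool {
        // `None` behaves like a final answer, so non-preamble models never restore here.
        matches!(phase, Some(MessagePhase::Commentary)) && stream_queues_idle && task_running
    }
    ```
    
    Under these assumptions, only a completed commentary item with idle
    queues and a running task brings the indicator back.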
    
    ### Tests
    Add/update tests for:
    - no idle-tick restore without commentary completion
    - commentary completion restoring status before tool begin
    - snapshot coverage for preamble/status behavior
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • Mark Config.apps as experimental, correct schema generation issue (#10938)
    This PR makes `Config.apps` experimental-only and fixes a TS schema
    post-processing bug that removed needed imports. The bug happened
    because import pruning only checked the inner type body after filtering,
    not the full alias, so `JsonValue` got dropped from `Config.ts`. We now
    prune against the full alias body and added a regression test for this
    scenario.
  • chore(app-server): add experimental annotation to relevant fields (#10928)
    These fields had always been documented as experimental/unstable with
    docstrings, but now let's actually use the `experimental` annotation to
    be more explicit.
    
    - thread/start.experimentalRawEvents
    - thread/resume.history
    - thread/resume.path
    - thread/fork.path
    - turn/start.collaborationMode
    - account/login/start.chatgptAuthTokens
  • Add app configs to config.toml (#10822)
    Adds app configs to config.toml + tests
  • feat(app-server): turn/steer API (#10821)
    This PR adds a dedicated `turn/steer` API for appending user input to an
    in-flight turn.
    
    ## Motivation
    Currently, steering in the app is implemented by just calling
    `turn/start` while a turn is running. This has some really weird quirks:
    - Client gets back a new `turn.id`, even though streamed
    events/approvals remained tied to the original active turn ID.
    - All the various turn-level override params on `turn/start` do not
    apply to the "steer", and would only apply to the next real turn.
    - There can also be a race condition where the client thinks the turn is
    active but the server has already completed it, so there might be bugs
    if the client has baked in some client-specific behavior thinking it's a
    steer when in fact the server kicked off a new turn. This is
    particularly possible when running a client against a remote app-server.
    
    Having a dedicated `turn/steer` API eliminates all those quirks.
    
    `turn/steer` behavior:
    - Requires an active turn on threadId. Returns a JSON-RPC error if there
    is no active turn.
    - If expectedTurnId is provided, it must match the active turn (more
    useful when connecting to a remote app-server).
    - Does not emit `turn/started`.
    - Does not accept turn overrides (`cwd`, `model`, `sandbox`, etc.) or
    `outputSchema` to accurately reflect that these are not applied when
    steering.
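    
    The precondition checks listed above could be sketched roughly as
    follows. `SteerError` and `check_steer` are hypothetical names; the real
    server reports failures as JSON-RPC errors rather than a Rust enum:
    
    ```rust
    /// Hypothetical error type standing in for the JSON-RPC error the server returns.
    #[derive(Debug, PartialEq, Eq)]
    pub enum SteerError {
        NoActiveTurn,
        TurnIdMismatch { expected: String, active: String },
    }
    
    /// Returns the id of the active turn that will receive the steered input.
    pub fn check_steer(
        active_turn_id: Option<&str>,
        expected_turn_id: Option<&str>,
    ) -> Result<String, SteerError> {
        // Steering requires an in-flight turn on the thread.
        let active = active_turn_id.ok_or(SteerError::NoActiveTurn)?;
        // When the client states which turn it believes is active, it must match.
        if let Some(expected) = expected_turn_id {
            if expected != active {
                return Err(SteerError::TurnIdMismatch {
                    expected: expected.to_string(),
                    active: active.to_string(),
                });
            }
        }
        Ok(active.to_string())
    }
    ```
    
    Passing the expected turn id is what closes the race described in the
    motivation: a stale client gets an error instead of silently starting
    a new turn.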
  • Add stage field for experimental flags. (#10793)
    - [x] Add stage field for experimental flags.
  • [app-server] Add a method to list experimental features. (#10721)
    - [x] Add a method to list experimental features.
  • feat(app-server, core): allow text + image content items for dynamic tool outputs (#10567)
    Took over the work that @aaronl-openai started here:
    https://github.com/openai/codex/pull/10397
    
    Now that app-server clients are able to set up custom tools (called
    `dynamic_tools` in app-server), we should expose a way for clients to
    pass in not just text, but also image outputs. This is something the
    Responses API already supports for function call outputs, where you can
    pass in either a string or an array of content outputs (text, image,
    file):
    https://platform.openai.com/docs/api-reference/responses/create#responses_create-input-input_item_list-item-function_tool_call_output-output-array-input_image
    
    So let's just plumb it through in Codex (with the caveat that we only
    support text and image for now). This is implemented end-to-end across
    app-server v2 protocol types and core tool handling.
    
    ## Breaking API change
    NOTE: This introduces a breaking change with dynamic tools, but I think
    it's ok since this concept was only recently introduced
    (https://github.com/openai/codex/pull/9539) and it's better to get the
    API contract correct. I don't think there are any real consumers of this
    yet (not even the Codex App).
    
    Old shape:
    `{ "output": "dynamic-ok", "success": true }`
    
    New shape:
    ```
    {
      "contentItems": [
        { "type": "inputText", "text": "dynamic-ok" },
        { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" }
      ],
      "success": true
    }
    ```
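    
    For illustration, the new shape might be modeled in Rust roughly like
    this. The type, field, and helper names here are assumptions chosen
    to mirror the JSON above, not the actual protocol definitions:
    
    ```rust
    /// Content items a dynamic tool may return (text and image only, per this PR).
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum ContentItem {
        InputText { text: String },
        InputImage { image_url: String },
    }
    
    /// Illustrative stand-in for the new response shape.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct ToolCallReply {
        pub content_items: Vec<ContentItem>,
        pub success: bool,
    }
    
    /// Hypothetical helper: lift an old-shape `{ output, success }` reply into the new shape.
    pub fn from_legacy(output: &str, success: bool) -> ToolCallReply {
        ToolCallReply {
            content_items: vec![ContentItem::InputText { text: output.to_string() }],
            success,
        }
    }
    ```
    
    A client migrating from the old shape would wrap its single string
    output in one text item, then append image items as needed.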
  • Add thread/compact v2 (#10445)
    - add `thread/compact` as a trigger-only v2 RPC that submits
    `Op::Compact` and returns `{}` immediately.
    - add v2 compaction e2e coverage for success and invalid/unknown thread
    ids, and update protocol schemas/docs.
  • Feat: add upgrade to app server modelList (#10556)
    ### Summary
    * Add model upgrade to the listModel app-server endpoint to support
    dynamically showing a model upgrade banner.
  • feat: add APIs to list and download public remote skills (#10448)
    Add API to list / download from remote public skills
  • fix(app-server): fix TS annotations for optional fields on requests (#10412)
    This updates our generated TypeScript types to be more correct with how
    the server actually behaves, **specifically for JSON-RPC requests**.
    
    Before this PR, we'd generate `field: T | null`. After this PR, we will
    have `field?: T | null`. The latter matches how the server actually
    works, in that if an optional field is omitted, the server will treat it
    as null. This also makes it less annoying in theory for clients to
    upgrade to newer versions of Codex, since adding a new optional field to
    a JSON-RPC request should not require a client change.
    
    NOTE: This only applies to JSON-RPC requests. All other payloads (i.e.
    responses, notifications) will return `field: T | null` as usual.
  • fix WebSearchAction type clash between v1 and v2 (#10408)
    Fixes a type clash: app-server generated types were still using the v1
    snake_case `WebSearchAction`, so there was a mismatch between the
    camelCase emitted types and the snake_case types we were trying to
    parse.
    
    Updated v2 `WebSearchAction` to export into the `v2/` type set and
    updated `ThreadItem` to use that.
    
    ### Tests
    Ran the new `just write-app-server-schema` to surface changes to the
    schema; the import looks correct now.
  • [Codex][CLI] Gate image inputs by model modalities (#10271)
    ###### Summary
    
    - Add input_modalities to model metadata so clients can determine
    supported input types.
    - Gate image paste/attach in TUI when the selected model does not
    support images.
    - Block submits that include images for unsupported models and show a
    clear warning.
    - Propagate modality metadata through app-server protocol/model-list
    responses.
    - Update related tests/fixtures.
    
    ###### Rationale
    
    - Models support different input modalities.
    - Clients need an explicit capability signal to prevent unsupported
    requests.
    - Backward-compatible defaults preserve existing behavior when modality
    metadata is absent.
    
    ###### Scope
    
    - codex-rs/protocol, codex-rs/core, codex-rs/tui
    - codex-rs/app-server-protocol, codex-rs/app-server
    - Generated app-server types / schema fixtures
    
    ###### Trade-offs
    
    - Default behavior assumes text + image when field is absent for
    compatibility.
    - Server-side validation remains the source of truth.
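    
    A small sketch of the backward-compatible default described above. The
    enum and function names are illustrative assumptions; only the
    "absent means text + image" rule comes from this PR:
    
    ```rust
    /// Hypothetical representation of one entry in `input_modalities`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum InputModality {
        Text,
        Image,
    }
    
    /// Resolve the modalities to use, falling back to text + image when metadata is absent.
    pub fn effective_modalities(declared: Option<&[InputModality]>) -> Vec<InputModality> {
        match declared {
            Some(modalities) => modalities.to_vec(),
            // Absent metadata keeps the pre-existing behavior.
            None => vec![InputModality::Text, InputModality::Image],
        }
    }
    
    /// Client-side gate for paste/attach; the server still validates independently.
    pub fn can_attach_image(declared: Option<&[InputModality]>) -> bool {
        effective_modalities(declared).contains(&InputModality::Image)
    }
    ```
    
    A text-only model has to declare its modalities explicitly, otherwise
    the fallback would still allow images.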
    
    ###### Follow-up
    
    - Non-TUI clients should consume input_modalities to disable unsupported
    attachments.
    - Model catalogs should explicitly set input_modalities for text-only
    models.
    
    ###### Testing
    
    - cargo fmt --all
    - cargo test -p codex-tui
    - env -u GITHUB_APP_KEY cargo test -p codex-core --lib
    - just write-app-server-schema
    - cargo run -p codex-cli --bin codex -- app-server generate-ts --out
    app-server-types
    - test against local backend
      
    <img width="695" height="199" alt="image"
    src="https://github.com/user-attachments/assets/d22dd04f-5eba-4db9-a7c5-a2506f60ec44"
    />
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
  • feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
    We started working with MCP in Codex before
    https://crates.io/crates/rmcp was mature, so we had our own crate for
    MCP types that was generated from the MCP schema:
    
    
    https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md
    
    Now that `rmcp` is more mature, it makes more sense to use their MCP
    types in Rust, as they handle details (like the `_meta` field) that our
    custom version ignored. Though one advantage that our custom types had
    is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
    whereas the types in `rmcp` do not. As such, part of the work of this PR
    is leveraging the adapters between `rmcp` types and the serializable
    types that are API for us (app server and MCP) introduced in #10356.
    
    Note this PR results in a number of changes to
    `codex-rs/app-server-protocol/schema`, which merit special attention
    during review. We must ensure that these changes are still
    backwards-compatible, which is possible because we have:
    
    ```diff
    - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
    + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
    ```
    
    so `ContentBlock` has been replaced with the more general `JsonValue`.
    Note that `ContentBlock` was defined as:
    
    ```typescript
    export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
    ```
    
    so the deletion of those individual variants should not be a cause of
    great concern.
    
    Similarly, we have the following change in
    `codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
    
    ```
    - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
    + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
    ```
    
    so:
    
    - `annotations?: ToolAnnotations` ➡️ `JsonValue`
    - `inputSchema: ToolInputSchema` ➡️ `JsonValue`
    - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
    
    and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
    * #10357
    * __->__ #10349
    * #10356
  • feat: experimental flags (#10231)
    ## Problem being solved
    - We need a single, reliable way to mark app-server API surface as
    experimental so that:
      1. the runtime can reject experimental usage unless the client opts in
      2. generated TS/JSON schemas can exclude experimental methods/fields for
      stable clients.
    
    Right now that’s easy to drift or miss when done ad-hoc.
    
    ## How to declare experimental methods and fields
    - **Experimental method**: add `#[experimental("method/name")]` to the
    `ClientRequest` variant in `client_request_definitions!`.
    - **Experimental field**: on the params struct, derive `ExperimentalApi`
    and annotate the field with `#[experimental("method/name.field")]` + set
    `inspect_params: true` for the method variant so
    `ClientRequest::experimental_reason()` inspects params for experimental
    fields.
    
    ## How the macro solves it
    - The new derive macro lives in
    `codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via
    `#[derive(ExperimentalApi)]` plus `#[experimental("reason")]`
    attributes.
    - **Structs**:
      - Generates `ExperimentalApi::experimental_reason(&self)` that checks
      only annotated fields.
      - The “presence” check is type-aware:
        - `Option<T>`: `is_some_and(...)` recursively checks inner.
        - `Vec`/`HashMap`/`BTreeMap`: must be non-empty.
        - `bool`: must be `true`.
        - Other types: considered present (returns `true`).
      - Registers each experimental field in an `inventory` with `(type_name,
      serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for
      that type. Field names are converted from `snake_case` to `camelCase`
      for schema/TS filtering.
    - **Enums**:
      - Generates an exhaustive `match` returning `Some(reason)` for annotated
      variants and `None` otherwise (no wildcard arm).
    - **Wiring**:
      - Runtime gating uses `ExperimentalApi::experimental_reason()` in
      `codex-rs/app-server/src/message_processor.rs` to reject requests unless
      `InitializeParams.capabilities.experimental_api == true`.
      - Schema/TS export filters use the inventory list and
      `EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to
      strip experimental methods/fields when `experimental_api` is false.
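    
    To make the type-aware presence check concrete, here is a hand-written
    approximation of what the derive might expand to. The `Presence` trait
    and the `ThreadStartParams` struct below are assumptions for
    illustration; the real macro output may differ:
    
    ```rust
    use std::collections::HashMap;
    
    /// Hypothetical trait standing in for the generated per-field presence check.
    pub trait Presence {
        fn is_present(&self) -> bool;
    }
    
    impl Presence for bool {
        fn is_present(&self) -> bool {
            *self // only `true` counts as using the field
        }
    }
    
    impl Presence for String {
        fn is_present(&self) -> bool {
            true // "other types" are always considered present
        }
    }
    
    impl<T: Presence> Presence for Option<T> {
        fn is_present(&self) -> bool {
            self.as_ref().is_some_and(|inner| inner.is_present())
        }
    }
    
    impl<T> Presence for Vec<T> {
        fn is_present(&self) -> bool {
            !self.is_empty()
        }
    }
    
    impl<K, V> Presence for HashMap<K, V> {
        fn is_present(&self) -> bool {
            !self.is_empty()
        }
    }
    
    /// Illustrative params struct with one experimental field.
    pub struct ThreadStartParams {
        pub experimental_raw_events: bool, // would carry #[experimental(...)]
        pub model: Option<String>,         // stable, never inspected
    }
    
    impl ThreadStartParams {
        /// Roughly what the derive would generate: check only annotated fields.
        pub fn experimental_reason(&self) -> Option<&'static str> {
            if self.experimental_raw_events.is_present() {
                Some("thread/start.experimentalRawEvents")
            } else {
                None
            }
        }
    }
    ```
    
    The point of the presence rules is that a default value (`false`,
    `None`, an empty collection) does not count as opting into an
    experimental field.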
  • add missing fields to WebSearchAction and update app-server types (#10276)
    - add `WebSearchAction` to app-server v2 types
    - add `queries` to `WebSearchAction::Search` type
    
    Updated tests.
  • Add enforce_residency to requirements (#10263)
    Add `enforce_residency` to requirements.toml and thread it through to a
    header on `default_client`.
  • chore: rename ChatGpt -> Chatgpt in type names (#10244)
    When using ChatGPT in names of types, we should be consistent, so this
    renames some types with `ChatGpt` in the name to `Chatgpt`. From
    https://rust-lang.github.io/api-guidelines/naming.html:
    
    > In `UpperCamelCase`, acronyms and contractions of compound words count
    as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize`
    or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and
    contractions are lower-cased: `is_xid_start`.
    
    This PR updates existing uses of `ChatGpt` and changes them to
    `Chatgpt`. Though in all cases where it could affect the wire format, I
    visually inspected that we don't change anything there. That said, this
    _will_ change the codegen because it will affect the spelling of type
    names.
    
    For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in
    `app-server-protocol`, but the wire format is still `"chatgpt"`.
    
    This PR also updates a number of types in `codex-rs/core/src/auth.rs`.
  • Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)
    ## Summary
    - Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
    in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
    tags from normal assistant output.
    - Persist plan items and rebuild them on resume so proposed plans show
    in thread history.
    - Wire plan items/deltas through app-server protocol v2 and render a
    dedicated proposed-plan view in the TUI, including the “Implement this
    plan?” prompt only when a plan item is present.
    
    ## Changes
    
    ### Core (`codex-rs/core`)
    - Added a generic, line-based tag parser that buffers each line until it
    can disprove a tag prefix; implements auto-close on `finish()` for
    unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
    - Refactored proposed plan parsing to wrap the generic parser.
    `codex-rs/core/src/proposed_plan_parser.rs`
    - In plan mode, stream assistant deltas as:
      - **Normal text** → `AgentMessageContentDelta`
      - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
      (`codex-rs/core/src/codex.rs`)
    - Final plan item content is derived from the completed assistant
    message (authoritative), not necessarily the concatenated deltas.
    - Strips `<proposed_plan>` blocks from assistant text in plan mode so
    tags don’t appear in normal messages.
    (`codex-rs/core/src/stream_events_utils.rs`)
    - Persist `ItemCompleted` events only for plan items for rollout replay.
    (`codex-rs/core/src/rollout/policy.rs`)
    - Guard `update_plan` tool in Plan Mode with a clear error message.
    (`codex-rs/core/src/tools/handlers/plan.rs`)
    - Updated Plan Mode prompt to:  
      - keep `<proposed_plan>` out of non-final reasoning/preambles  
      - require exact tag formatting  
      - allow only one `<proposed_plan>` block per turn  
      (`codex-rs/core/templates/collaboration_mode/plan.md`)
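    
    As a rough illustration of the tag handling (not the streaming parser
    in `tagged_block_parser.rs`), the sketch below splits one complete
    message into normal text and plan text, line by line. It assumes tags
    sit on their own lines and does no prefix buffering; `SplitMessage` and
    `split_proposed_plan` are hypothetical names:
    
    ```rust
    const OPEN: &str = "<proposed_plan>";
    const CLOSE: &str = "</proposed_plan>";
    
    /// Result of splitting one complete assistant message.
    #[derive(Debug, Default, PartialEq, Eq)]
    pub struct SplitMessage {
        pub normal: String,
        pub plan: Option<String>,
    }
    
    /// Non-streaming simplification: route whole lines to `normal` or `plan`.
    pub fn split_proposed_plan(message: &str) -> SplitMessage {
        let mut out = SplitMessage::default();
        let mut in_plan = false;
        for line in message.lines() {
            match line.trim() {
                OPEN if !in_plan => {
                    in_plan = true;
                    // An opened block yields a plan even if it turns out empty.
                    out.plan.get_or_insert_with(String::new);
                }
                CLOSE if in_plan => in_plan = false,
                _ if in_plan => {
                    let plan = out.plan.get_or_insert_with(String::new);
                    plan.push_str(line);
                    plan.push('\n');
                }
                _ => {
                    out.normal.push_str(line);
                    out.normal.push('\n');
                }
            }
        }
        // Reaching the end while still inside a block acts as the auto-close.
        out
    }
    ```
    
    The real parser works on deltas, so it must hold back a partial line
    until it can rule out a tag prefix; this sketch sidesteps that by
    taking the whole message at once.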
    
    ### Protocol / App-server protocol
    - Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
    (`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
    - Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
    EXPERIMENTAL markers and note that deltas may not match the final plan
    item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
    - Added plan delta route in app-server protocol common mapping.
    (`codex-rs/app-server-protocol/src/protocol/common.rs`)
    - Rebuild plan items from persisted `ItemCompleted` events on resume.
    (`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)
    
    ### App-server
    - Forward plan deltas to v2 clients and map core plan items to v2 plan
    items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
    `codex-rs/app-server/src/codex_message_processor.rs`)
    - Added v2 plan item tests.
    (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
    
    ### TUI
    - Added a dedicated proposed plan history cell with special background
    and padding, and moved “• Proposed Plan” outside the highlighted block.
    (`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
    - Only show “Implement this plan?” when a plan item exists.
    (`codex-rs/tui/src/chatwidget.rs`,
    `codex-rs/tui/src/chatwidget/tests.rs`)
    
    <img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
    src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
    />
    
    ### Docs / Misc
    - Updated protocol docs to mention plan deltas.
    (`codex-rs/docs/protocol_v1.md`)
    - Minor plumbing updates in exec/debug clients to tolerate plan deltas.
    (`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)
    
    ## Tests
    - Added core integration tests:
      - Plan mode strips plan from agent messages.
      - Missing `</proposed_plan>` closes at end-of-message.  
      (`codex-rs/core/tests/suite/items.rs`)
    - Added unit tests for generic tag parser (prefix buffering, non-tag
    lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
    - Existing app-server plan item tests in v2.
    (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
    
    ## Notes / Behavior
    - Plan output no longer appears in standard assistant text in Plan Mode;
    it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
    - The final plan item content is authoritative and may diverge from
    streamed deltas (documented as experimental).
    - Reasoning summaries are not filtered; prompt instructs the model not
    to include `<proposed_plan>` outside the final plan message.
    
    ## Codex Author
    `codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
  • feat: refactor CodexAuth so invalid state cannot be represented (#10208)
    Previously, `CodexAuth` was defined as follows:
    
    
    https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L39-L46
    
    But if you looked at its constructors, we had creation for
    `AuthMode::ApiKey` where `storage` was built using a nonsensical path
    (`PathBuf::new()`) and `auth_dot_json` was `None`:
    
    
    https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L212-L220
    
    By comparison, when `AuthMode::ChatGPT` was used, `api_key` was always
    `None`:
    
    
    https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L665-L671
    
    https://github.com/openai/codex/pull/10012 took things further because
    it introduced a new `ChatgptAuthTokens` variant to `AuthMode`, which is
    important when invoking `account/login/start` via the app server, but
    most logic _internal_ to the app server should just reason about two
    `AuthMode` variants: `ApiKey` and `ChatGPT`.
    
    This PR tries to clean things up as follows:
    
    - `LoginAccountParams` and `AuthMode` in `codex-rs/app-server-protocol/`
    both continue to have the `ChatgptAuthTokens` variant, though it is used
    exclusively for the on-the-wire messaging.
    - `codex-rs/core/src/auth.rs` now has its own `AuthMode` enum, which
    only has two variants: `ApiKey` and `ChatGPT`.
    - `CodexAuth` has been changed from a struct to an enum. It is a
    disjoint union where each variant (`ApiKey`, `ChatGpt`, and
    `ChatGptAuthTokens`) has only the associated fields that make sense for
    that variant.
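    
    A sketch of the resulting shape, to show why invalid combinations
    become unrepresentable. The variant names follow the description above,
    but every field name and both helper methods are assumptions, as is the
    mapping of `ChatGptAuthTokens` onto the internal `ChatGPT` mode:
    
    ```rust
    use std::path::PathBuf;
    
    /// Internal auth mode with only the two variants core reasons about.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum AuthMode {
        ApiKey,
        ChatGPT,
    }
    
    /// Each variant carries only the data that makes sense for it (fields are hypothetical).
    #[derive(Debug, Clone)]
    pub enum CodexAuth {
        ApiKey { api_key: String },
        ChatGpt { storage_path: PathBuf },
        ChatGptAuthTokens { id_token: String, access_token: String },
    }
    
    impl CodexAuth {
        /// Assumed mapping: externally supplied tokens behave like ChatGPT auth internally.
        pub fn internal_mode(&self) -> AuthMode {
            match self {
                CodexAuth::ApiKey { .. } => AuthMode::ApiKey,
                CodexAuth::ChatGpt { .. } | CodexAuth::ChatGptAuthTokens { .. } => AuthMode::ChatGPT,
            }
        }
    
        /// There is no way to build an `ApiKey` auth with storage, or a ChatGPT auth with a key.
        pub fn api_key(&self) -> Option<&str> {
            match self {
                CodexAuth::ApiKey { api_key } => Some(api_key.as_str()),
                _ => None,
            }
        }
    }
    ```
    
    Compared with the old struct, there is no nonsensical `PathBuf::new()`
    placeholder for API-key auth because that variant has no storage field
    at all.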
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10208).
    * #10224
    * __->__ #10208
  • Conversation naming (#8991)
    Session renaming:
    - `/rename my_session`
    - `/rename` without arg and passing an argument in `customViewPrompt`
    - AppExitInfo shows resume hint using the session name if set instead of
    uuid, defaults to uuid if not set
    - Names are stored in `CODEX_HOME/sessions.jsonl`
    
    Session resuming:
    - `codex resume <name>` looks up the first entry in
    `CODEX_HOME/sessions.jsonl` matching the name and resumes that session
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • Chore: plan mode do not include free form question and always include isOther (#10210)
    We should never ask a freeform question when planning and we should
    always include isOther as an escape hatch.
  • chore(app-server): document AuthMode (#10191)
    Explain what this is and what it's used for.
  • feat(app-server): support external auth mode (#10012)
    This enables a new use case where `codex app-server` is embedded into a
    parent application that will directly own the user's ChatGPT auth
    lifecycle, which means it owns the user’s auth tokens and refreshes them
    when necessary. The parent application would just want a way to pass in
    the auth tokens for codex to use directly.
    
    The idea is that we are introducing a new "auth mode" currently only
    exposed via app server: **`chatgptAuthTokens`** which consists of the
    `id_token` (stores account metadata) and `access_token` (the bearer
    token used directly for backend API calls). These auth tokens are only
    stored in-memory. This new mode is in addition to the existing `apiKey`
    and `chatgpt` auth modes.
    
    This PR reuses the shape of our existing app-server account APIs as much
    as possible:
    - Update `account/login/start` with a new `chatgptAuthTokens` variant,
    which will allow the client to pass in the tokens and have codex
    app-server use them directly. Upon success, the server emits
    `account/login/completed` and `account/updated` notifications.
    - A new server->client request called
    `account/chatgptAuthTokens/refresh` which the server can use whenever
    the access token previously passed in has expired and it needs a new one
    from the parent application.
    
    I leveraged the core 401 retry loop which typically triggers auth token
    refreshes automatically, but made it pluggable:
    - **chatgpt** mode refreshes internally, as usual.
    - **chatgptAuthTokens** mode calls the client via
    `account/chatgptAuthTokens/refresh`, the client responds with updated
    tokens, codex updates its in-memory auth, then retries. This RPC has a
    10s timeout and handles JSON-RPC errors from the client.
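    
    The pluggable refresh could look something like the sketch below. It is
    synchronous and omits the 10s timeout; `TokenRefresher`,
    `ExternalRefresher`, and `send_with_refresh` are hypothetical names,
    and the closure stands in for a backend call that returns an HTTP
    status:
    
    ```rust
    /// Hypothetical abstraction over "how do we get a fresh access token".
    pub trait TokenRefresher {
        fn refresh(&mut self) -> Result<String, String>;
    }
    
    /// Stand-in for the external mode: tokens come from the parent application.
    pub struct ExternalRefresher {
        pub next_tokens: Vec<String>,
    }
    
    impl TokenRefresher for ExternalRefresher {
        fn refresh(&mut self) -> Result<String, String> {
            // In the real flow this would be the `account/chatgptAuthTokens/refresh` request.
            self.next_tokens
                .pop()
                .ok_or_else(|| "client returned no token".to_string())
        }
    }
    
    /// Retry a request after a 401, refreshing the token at most `max_refreshes` times.
    pub fn send_with_refresh<R: TokenRefresher>(
        refresher: &mut R,
        mut access_token: String,
        max_refreshes: usize,
        mut send: impl FnMut(&str) -> u16,
    ) -> Result<u16, String> {
        let mut refreshes = 0;
        loop {
            let status = send(&access_token);
            if status != 401 {
                return Ok(status);
            }
            if refreshes == max_refreshes {
                return Err("still unauthorized after refresh".to_string());
            }
            // A refresh failure propagates to the caller, which fails the turn.
            access_token = refresher.refresh()?;
            refreshes += 1;
        }
    }
    ```
    
    The internal `chatgpt` mode would plug in a different `TokenRefresher`
    that refreshes on its own, leaving the retry loop unchanged.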
    
    Also some additional things:
    - chatgpt logins are blocked while external auth is active (you have to
    log out first; typically clients will pick one or the other, not support
    both)
    - `account/logout` clears external auth in memory
    - Ensures that if `forced_chatgpt_workspace_id` is set via the user's
    config, we respect it in both:
      - `account/login/start` with `chatgptAuthTokens` (returns a JSON-RPC
      error back to the client)
      - `account/chatgptAuthTokens/refresh` (fails the turn, and on next
      request app-server will send another `account/chatgptAuthTokens/refresh`
      request to the client).
  • [Codex][CLI] Show model-capacity guidance on 429 (#10118)
    ###### Problem
    Users get generic 429s with no guidance when a model is at capacity.
    ###### Solution
    Detect model-cap headers, surface a clear “try a different model”
    message, and keep behavior non‑intrusive (no auto‑switch).
    ###### Scope
    CLI/TUI only; protocol + error mapping updated to carry model‑cap info.
    ###### Tests
    - just fmt
    - cargo test -p codex-tui
    - cargo test -p codex-core --lib
    shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file --
    --nocapture (ran in isolated env)
    - validate local build with backend
         
    <img width="719" height="845" alt="image"
    src="https://github.com/user-attachments/assets/1470b33d-0974-4b1f-b8e6-d11f892f4b54"
    />
  • Better handling skill dependencies on ENV VAR. (#9017)
    An experimental flow for env var skill dependencies. Skills can now
    declare required env vars in SKILL.md; if missing, the CLI prompts the
    user to get the value, and Core will store it in memory (eventually to a
    local persistent store)
    <img width="790" height="169" alt="image"
    src="https://github.com/user-attachments/assets/cd928918-9403-43cb-a7e7-b8d59bcccd9a"
    />
  • [connectors] Support connectors part 2 - slash command and tui (#9728)
    - [x] Support `/apps` slash command to browse the apps in tui.
    - [x] Support inserting apps to prompt using `$`.
    - [x] Lots of simplification/renaming from connectors to apps.
  • compaction (#10034)
  • [skills] Auto install MCP dependencies when running skills with dependency specs. (#9982)
    Auto install MCP dependencies when running skills with dependency specs.
  • fix: handle all web_search actions and in progress invocations (#9960)
    ### Summary
    - Parse all `web_search` tool actions (`search`, `find_in_page`,
    `open_page`).
    - Previously we only parsed + displayed `search`, which made the TUI
    appear to pause when the other actions were being used.
    - Show in progress `web_search` calls as `Searching the web`
      - Previously we only showed completed tool calls
    
    <img width="308" height="149" alt="image"
    src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
    />
    
    ### Tests
    Added + updated tests, tested locally
    
    ### Follow ups
    Update VSCode extension to display these as well
  • Add thread/unarchive to restore archived rollouts (#9843)
    ## Summary
    - Adds a new `thread/unarchive` RPC to move archived thread rollouts
    back into the active `sessions/` tree.
    
    ## What changed
    - **Protocol**
      - Adds `thread/unarchive` request/response types and wiring.
    - **Server**
      - Implements `thread_unarchive` in the app server.
      - Validates the archived rollout path and thread ID.
      - Restores the rollout to `sessions/YYYY/MM/DD/...` based on the rollout
      filename timestamp.
    - **Core**
      - Adds `find_archived_thread_path_by_id_str` helper for archived
      rollouts.
    - **Docs**
      - Documents the new RPC and usage example.
    - **Tests**
      - Adds an end-to-end server test that:
        1) starts a thread,
        2) archives it,
        3) unarchives it,
        4) asserts the file is restored to `sessions/`.
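    
    A sketch of deriving the restore location from the filename timestamp.
    The filename layout `rollout-YYYY-MM-DD...` is an assumption made for
    this example, and `restored_path` is a hypothetical helper, not the
    server's implementation:
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Map an archived rollout filename to `sessions/YYYY/MM/DD/<file>`.
    /// Assumes names start with `rollout-YYYY-MM-DD`; returns `None` otherwise.
    pub fn restored_path(sessions_root: &Path, file_name: &str) -> Option<PathBuf> {
        let rest = file_name.strip_prefix("rollout-")?;
        let date = rest.get(..10)?; // the "YYYY-MM-DD" portion
        let mut parts = date.split('-');
        let (year, month, day) = (parts.next()?, parts.next()?, parts.next()?);
        let all_digits =
            |s: &str, len: usize| s.len() == len && s.bytes().all(|b| b.is_ascii_digit());
        if !(all_digits(year, 4) && all_digits(month, 2) && all_digits(day, 2)) {
            return None;
        }
        Some(sessions_root.join(year).join(month).join(day).join(file_name))
    }
    ```
    
    Rejecting names that do not parse keeps an unexpected file from being
    moved into an arbitrary directory.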
    
    ## How to use
    ```json
    { "method": "thread/unarchive", "id": 24, "params": { "threadId": "<thread-id>" } }
    ```
    
    ## Author Codex Session
    
    `codex resume 019bf158-54b6-7960-a696-9d85df7e1bc1` (soon I'll make this
    kind of session UUID forkable by anyone with the right
    `session_object_storage_url` line in their config, but for now just
    pasting it here for my reference)
  • Feat: add isOther to question returned by request user input tool (#9890)
    ### Summary
    Add `isOther` to question object from request_user_input tool input and
    remove `other` option from the tool prompt to better handle tool input.
  • feat: dynamic tools injection (#9539)
    ## Summary
    Add dynamic tool injection to thread startup in API v2, wire dynamic
    tool calls through the app server to clients, and plumb responses back
    into the model tool pipeline.
    
    ### Flow (high level)
    - Thread start injects `dynamic_tools` into the model tool list for that
    thread (validation is done here).
    - When the model emits a tool call for one of those names, core raises a
    `DynamicToolCallRequest` event.
    - The app server forwards it to the client as `item/tool/call`, waits
    for the client’s response, then submits a `DynamicToolResponse` back to
    core.
    - Core turns that into a `function_call_output` in the next model
    request so the model can continue.
    
    ### What changed
    - Added dynamic tool specs to v2 thread start params and protocol types;
    introduced `item/tool/call` (request/response) for dynamic tool
    execution.
    - Core now registers dynamic tool specs at request time and routes those
    calls via a new dynamic tool handler.
    - App server validates tool names/schemas, forwards dynamic tool call
    requests to clients, and publishes tool outputs back into the session.
    - Integration tests
  • feat(tui) /personality (#9718)
    ## Summary
    Adds /personality selector in the TUI, which leverages the new core
    interface in #9644
    
    Notes:
    - We are doing some of our own state management for model_info loading
    here, but we're not sure that's ideal. Open to opinions on a simpler
    approach, but we would like to avoid blocking on a larger refactor.
    - Right now, the `/personality` selector just hides when the model
    doesn't support it. We can update this behavior down the line.
    
    ## Testing
    - [x] Tested locally
    - [x] Added snapshot tests
  • Use collaboration mode masks without mutating base settings (#9806)
    Keep an unmasked base collaboration mode and apply the active mask on
    demand. Simplify the TUI mask helpers and update tests/docs to match the
    mask contract.
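
One way to picture "apply the mask on demand" is the sketch below. The field names (`model`, `reasoning_effort`) and the types `CollaborationMode` and `Mask` are hypothetical stand-ins; the real settings and mask contract live in the codebase and may differ.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CollaborationMode:
    model: str
    reasoning_effort: str


@dataclass(frozen=True)
class Mask:
    # None means "leave the base value untouched".
    model: Optional[str] = None
    reasoning_effort: Optional[str] = None


def effective_mode(base: CollaborationMode, mask: Optional[Mask]) -> CollaborationMode:
    """Compute the masked view on demand; the base is never mutated."""
    if mask is None:
        return base
    overrides = {k: v for k, v in vars(mask).items() if v is not None}
    return replace(base, **overrides)


base = CollaborationMode(model="base-model", reasoning_effort="medium")
masked = effective_mode(base, Mask(reasoning_effort="high"))
print(base, masked)
```

Because the base is kept unmasked, clearing the mask restores the original settings without needing to remember what they were.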
  • feat: ephemeral threads (#9765)
    Add ephemeral thread capabilities. They are exposed only through the
    `app-server` v2.
    
    The idea is to disable the rollout recorder for those threads.
  • Another round of improvements for config error messages (#9746)
    In a [recent PR](https://github.com/openai/codex/pull/9182), I made some
    improvements to config error messages so errors didn't leave app server
    clients in a dead state. This is a follow-on PR to make these error
    messages more readable and actionable for both TUI and GUI users. For
    example, see #9668 where the user was understandably confused about the
    source of the problem and how to fix it.
    
    The improved error message:
    1. Clearly identifies the config file where the error was found (which
    is more important now that we support layered configs)
    2. Provides a line and column number of the error
    3. Displays the line where the error occurred and underlines it
    
    For example, if my `config.toml` includes the following:
    ```toml
    [features]
    collaboration_modes = "true"
    ```
    
    Here's the current CLI error message:
    ```
    Error loading config.toml: invalid type: string "true", expected a boolean in `features`
    ```
    
    And here's the improved message:
    ```
    Error loading config.toml:
    /Users/etraut/.codex/config.toml:43:23: invalid type: string "true", expected a boolean
       |
    43 | collaboration_modes = "true"
       |                       ^^^^^^
    ```
    
    The bulk of the new logic is contained within a new module
    `config_loader/diagnostics.rs` that is responsible for calculating the
    text range for a given toml path (which is more involved than I would
    have expected).
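
Once the line, column, and span width are known, rendering the message is the easy part. The sketch below shows only that rendering step in Python; it is not the code in `config_loader/diagnostics.rs`, it does not compute the text range for a TOML path, and `render_diagnostic` is a hypothetical name.

```python
def render_diagnostic(path: str, source: str, line: int, col: int,
                      length: int, message: str) -> str:
    """Render a diagnostic with the offending span underlined.

    `line` and `col` are 1-based; `length` is the span width in characters.
    """
    text = source.splitlines()[line - 1]
    gutter = " " * len(str(line))  # pad to the width of the line number
    return "\n".join([
        f"{path}:{line}:{col}: {message}",
        f"{gutter} |",
        f"{line} | {text}",
        f"{gutter} | {' ' * (col - 1)}{'^' * length}",
    ])


source = '[features]\ncollaboration_modes = "true"\n'
rendered = render_diagnostic(
    "config.toml", source, line=2, col=23, length=6,
    message='invalid type: string "true", expected a boolean',
)
print(rendered)
```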
    
    In addition, this PR adds the file name and text range to the
    `ConfigWarningNotification` app server struct. This allows GUI clients
    to present the user with a better error message and an optional link to
    open the errant config file. This was a suggestion from @bolinfest when
    he reviewed my previous PR.
  • Print warning if we skip config loading (#9611)
    https://github.com/openai/codex/pull/9533 silently ignored config if
    untrusted. Instead, we still load it but disable it. Maybe we shouldn't
    try to parse it either...
    
    <img width="939" height="515" alt="Screenshot 2026-01-21 at 14 56 38"
    src="https://github.com/user-attachments/assets/e753cc22-dd99-4242-8ffe-7589e85bef66"
    />
  • feat(app-server) Expose personality (#9674)
    ### Motivation
    Exposes a per-thread / per-turn `personality` override in the v2
    app-server API so clients can influence model communication style at
    thread/turn start. Ensures the override is passed into the session
    configuration resolution so it becomes effective for subsequent turns
    and headless runners.
    
    ### Testing
    - [x] Add an integration-style test
    `turn_start_accepts_personality_override_v2` in
    `codex-rs/app-server/tests/suite/v2/turn_start.rs` that verifies a
    `/personality` override results in a developer update message containing
    `<personality_spec>` in the outbound model request.
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_6971d646b1c08322a689a54d2649f3fe)
  • [connectors] Support connectors part 1 - App server & MCP (#9667)
    In order to make Codex work with connectors, we add a built-in gateway
    MCP that acts as a transparent proxy between the client and the
    connectors. The gateway MCP collects the actions that are accessible to
    the user and sends them down to the user. When a connector action is
    chosen to be called, the client invokes the action through the gateway
    MCP as well.
    
     - [x] Add the system built-in gateway MCP to list and run connectors.
     - [x] Add the app server methods and protocol
  • Chore: add cmd related info to exec approval request (#9659)
    ### Summary
    We now rely purely on the `item/commandExecution/requestApproval` item to
    render pending approvals in VSCE and the app. In the v2 approach, that
    item does not include the actual command being attempted, so we can only
    render from `proposedExecpolicyAmendment`, which can be incomplete.
    
    ### Reproduce
    * Add `prefix_rule(pattern=["echo"], decision="prompt")` to your
    `~/.codex/rules.default.rules`.
    * Ask to `Run echo "approval-test" please` in VSCE or app.
    * The pending approval portal does show up but with no content
    
    #### Example screenshot
    <img width="3434" height="3648" alt="Screenshot 2026-01-21 at 8 23
    25 PM"
    src="https://github.com/user-attachments/assets/75644837-21f1-40f8-8b02-858d361ff817"
    />
    
    #### Sample output
    ```
      {"method":"item/commandExecution/requestApproval","id":0,"params":{
        "threadId":"019be439-5a90-7600-a7ea-2d2dcc50302a",
        "turnId":"0",
        "itemId":"call_usgnQ4qEX5U9roNdjT7fPzhb",
        "reason":"`/bin/zsh -lc 'echo \"testing\"'` requires approval by policy",
        "proposedExecpolicyAmendment":null
      }}
    
    ```
    
    ### Fix
    Include `command` string, `cwd` and `command_actions` in
    `CommandExecutionRequestApprovalParams` so that consumers can display
    the correct command instead of relying on exec policy output.
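
A consumer could then prefer the command and fall back to the older fields when it is absent. The Python sketch below assumes the new values arrive under the JSON keys `command` and `cwd`; the actual wire names are not shown here, and `approval_label` is a hypothetical helper.

```python
from typing import Any, Dict


def approval_label(params: Dict[str, Any]) -> str:
    """Pick the text a client shows for a pending command approval."""
    command = params.get("command")  # assumed key name
    if command:
        cwd = params.get("cwd")  # assumed key name
        return f"{command} (in {cwd})" if cwd else command
    # Payloads without a command: fall back to the human-readable reason.
    return params.get("reason") or "Command requires approval"


print(approval_label({"command": 'echo "testing"', "cwd": "/repo"}))
print(approval_label({"reason": "requires approval by policy",
                      "proposedExecpolicyAmendment": None}))
```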
  • Add layered config.toml support to app server (#9510)
    This PR adds support for chained (layered) config.toml file merging for
    clients that use the app server interface. This feature already exists
    for the TUI, but it does not work for GUI clients.
    
    It does the following:
    * Changes code paths for new thread, resume thread, and fork thread to
    use the effective config based on the cwd.
    * Updates the `config/read` API to accept an optional `cwd` parameter.
    If specified, the API returns the effective config based on that cwd
    path. Also optionally includes all layers including project config
    files. If cwd is not specified, the API falls back on its older behavior
    where it considers only the global (non-project) config files when
    computing the effective config.
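
The cwd-dependent behavior can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the merge rule (nested tables merge recursively, nearer directories override farther ones), the in-memory `project_cfgs` mapping, and the function names. The real loader reads config.toml files and may resolve layers differently.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, Optional


def merge_layers(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `overlay` onto `base`; overlay values win."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_layers(merged[key], value)
        else:
            merged[key] = value
    return merged


def effective_config(global_cfg: Dict[str, Any],
                     project_cfgs: Dict[str, Dict[str, Any]],
                     cwd: Optional[str]) -> Dict[str, Any]:
    """`project_cfgs` maps a directory to its already-parsed project config."""
    effective = dict(global_cfg)
    if cwd is None:
        # No cwd: only the global (non-project) config is considered.
        return effective
    path = PurePosixPath(cwd)
    # Walk from the root down to cwd so that nearer layers override.
    for directory in [*reversed(path.parents), path]:
        layer = project_cfgs.get(str(directory))
        if layer is not None:
            effective = merge_layers(effective, layer)
    return effective


global_cfg = {"model": "m1", "features": {"a": True}}
project_cfgs = {"/repo": {"features": {"b": True}}, "/repo/app": {"model": "m2"}}
print(effective_config(global_cfg, project_cfgs, "/repo/app"))
print(effective_config(global_cfg, project_cfgs, None))
```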
    
    The changes in codex_message_processor.rs look deceptively large. They
    mostly just involve moving existing blocks of code to a later point in
    some functions so they can use the cwd to calculate the config.
    
    This PR builds upon #9509 and should be reviewed and merged after that
    PR.
    
    Tested:
    * Verified change with (dependent, as-yet-uncommitted) changes to IDE
    Extension and confirmed correct behavior
    
    The full fix requires additional changes in the IDE Extension code base,
    but they depend on this PR.
  • Chore: update plan mode output in prompt (#9592)
    ### Summary
    * Update plan prompt output
    * Update requestUserInput response to be a single key value pair
    `answer: String`.
  • Add total (non-partial) TextElement placeholder accessors (#9545)
    ## Summary
    - Make `TextElement` placeholders private and add a text-backed accessor
    to avoid assuming `Some`.
    - Since they are optional in the protocol, we want to make sure any
    accessors properly handle the None case (getting the placeholder using
    the byte range in the text)
    - Preserve placeholders during protocol/app-server conversions using the
    accessor fallback.
    - Update TUI composer/remap logic and tests to use the new
    constructor/accessor.
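
The idea of a total accessor with a text-backed fallback can be sketched in Python as follows. The field names and the half-open `[start, end)` byte range are assumptions for illustration; the real `TextElement` is a Rust type and may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TextElement:
    byte_range: Tuple[int, int]         # assumed [start, end) offsets into the UTF-8 text
    _placeholder: Optional[str] = None  # optional in the protocol, so kept "private"

    def placeholder(self, text: str) -> str:
        """Total accessor: never assumes the optional field is set."""
        if self._placeholder is not None:
            return self._placeholder
        start, end = self.byte_range
        # Fall back to the bytes the element covers in the surrounding text.
        return text.encode("utf-8")[start:end].decode("utf-8")


text = "see [Image #1] here"
print(TextElement((4, 14)).placeholder(text))
print(TextElement((4, 14), "[img]").placeholder(text))
```

Slicing the encoded bytes rather than the string keeps the offsets correct when the text contains multi-byte characters.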
  • Feat: request user input tool (#9472)
    ### Summary
    * Add a `requestUserInput` tool that the model can use to gather
    feedback or ask questions mid-turn.
    
    
    ### Tool input schema
    ```json
    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "requestUserInput input",
      "type": "object",
      "additionalProperties": false,
      "required": ["questions"],
      "properties": {
        "questions": {
          "type": "array",
          "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.",
          "minItems": 1,
          "maxItems": 3,
          "items": {
            "type": "object",
            "additionalProperties": false,
            "required": ["id", "header", "question"],
            "properties": {
              "id": {
                "type": "string",
                "description": "Stable identifier for mapping answers (snake_case)."
              },
              "header": {
                "type": "string",
                "description": "Short header label shown in the UI (12 or fewer chars)."
              },
              "question": {
                "type": "string",
                "description": "Single-sentence prompt shown to the user."
              },
              "options": {
                "type": "array",
                "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.",
                "minItems": 2,
                "maxItems": 3,
                "items": {
                  "type": "object",
                  "additionalProperties": false,
                  "required": ["value", "label", "description"],
                  "properties": {
                    "value": {
                      "type": "string",
                      "description": "Machine-readable value (snake_case)."
                    },
                    "label": {
                      "type": "string",
                      "description": "User-facing label (1-5 words)."
                    },
                    "description": {
                      "type": "string",
                      "description": "One short sentence explaining impact/tradeoff if selected."
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    ```
    
    ### Tool output schema
    ```json
    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "requestUserInput output",
      "type": "object",
      "additionalProperties": false,
      "required": ["answers"],
      "properties": {
        "answers": {
          "type": "object",
          "description": "Map of question id to user answer.",
          "additionalProperties": {
            "type": "object",
            "additionalProperties": false,
            "required": ["selected"],
            "properties": {
              "selected": {
                "type": "array",
                "items": { "type": "string" }
              },
              "other": {
                "type": ["string", "null"]
              }
            }
          }
        }
      }
    }
    ```
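
For clients that want a quick sanity check without a JSON Schema library, the output schema above can be approximated by hand. The Python sketch below covers only the constraints printed in that schema, and `validate_output` is a hypothetical helper, not something Codex provides.

```python
from typing import Any, List


def validate_output(payload: Any) -> List[str]:
    """Return a list of problems; an empty list means the payload conforms."""
    if not isinstance(payload, dict) or set(payload) != {"answers"}:
        return ["payload must be an object with exactly one key: 'answers'"]
    answers = payload["answers"]
    if not isinstance(answers, dict):
        return ["'answers' must be an object"]
    errors: List[str] = []
    for qid, answer in answers.items():
        if not isinstance(answer, dict):
            errors.append(f"{qid}: answer must be an object")
            continue
        extra = set(answer) - {"selected", "other"}
        if extra:
            errors.append(f"{qid}: unexpected keys {sorted(extra)}")
        selected = answer.get("selected")
        if not isinstance(selected, list) or not all(isinstance(s, str) for s in selected):
            errors.append(f"{qid}: 'selected' must be an array of strings")
        if "other" in answer and not isinstance(answer["other"], (str, type(None))):
            errors.append(f"{qid}: 'other' must be a string or null")
    return errors


good = {"answers": {"db_choice": {"selected": ["postgres"], "other": None}}}
bad = {"answers": {"db_choice": {"other": "sqlite"}}}
print(validate_output(good))
print(validate_output(bad))
```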