Commit Graph

281 Commits

  • feat: do not close unified exec processes across turns (#10799)
    With this PR we do not close the unified exec processes (i.e. background
    terminals) at the end of a turn unless:
    * The user interrupts the turn
    * The user decides to clean up the processes through `app-server` or
    `/clean`
    
    I made sure that `codex exec` correctly kills all the processes
  • Add resume_agent collab tool (#10903)
    Summary
    - add the new resume_agent collab tool path through core, protocol, and
    the app server API, including the resume events
    - update the schema/TypeScript definitions plus docs so resume_agent
    appears in generated artifacts and README
    - note that resumed agents rehydrate rollout history without overwriting
    their base instructions
    
    Testing
    - Not run (not requested)
  • fix(tui): conditionally restore status indicator using message phase (#10947)
    TLDR: Use the new message `phase` field emitted by preamble-supported
    models to determine whether an AgentMessage is mid-turn commentary. If
    so, restore the status indicator afterwards to indicate that the turn
    has not completed.
    
    ### Problem
    `commit_tick` hides the status indicator while streaming assistant text.
    For preamble-capable models, that text can be commentary mid-turn, so
    hiding was correct during streaming but restore timing mattered:
    - restoring too aggressively caused jitter/flashing
    - not restoring caused indicator to stay hidden before subsequent work
    (tool calls, web search, etc.)
    
    ### Fix
    - Add optional `phase` to `AgentMessageItem` and propagate it from
    `ResponseItem::Message`
    - Keep indicator hidden during streamed commit ticks, restore only when:
      - assistant item completes as `phase=commentary`, and
      - stream queues are idle + task is still running.
    - Treat `phase=None` as final-answer behavior (no restore) to keep
    existing behavior for non-preamble models
    
    ### Tests
    Add/update tests for:
    - no idle-tick restore without commentary completion
    - commentary completion restoring status before tool begin
    - snapshot coverage for preamble/status behavior
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
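
Editorial sketch (not code from the PR): the restore rule described under "Fix" in #10947 can be read as a single predicate. The enum `MessagePhase`, its `FinalAnswer` variant, and the function name `should_restore_status` are assumptions made for illustration; the source only confirms `phase=commentary` and `phase=None`.

```rust
/// Hypothetical mirror of the optional `phase` field on an assistant message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MessagePhase {
    Commentary,
    /// Assumed variant; the source only names `commentary` and `None`.
    FinalAnswer,
}

/// Returns true when the status indicator should be restored after an
/// assistant item completes. `None` is treated like a final answer, which
/// keeps the existing behavior for non-preamble models.
pub fn should_restore_status(
    phase: Option<MessagePhase>,
    stream_queues_idle: bool,
    task_running: bool,
) -> bool {
    matches!(phase, Some(MessagePhase::Commentary)) && stream_queues_idle && task_running
}
```

Keeping the decision in one pure function makes the "no idle-tick restore without commentary completion" case easy to cover in a unit test.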
  • Sync collaboration mode naming across Default prompt, tools, and TUI (#10666)
    ## Summary
    - add shared `ModeKind` helpers for display names, TUI visibility, and
    `request_user_input` availability
    - derive TUI mode filtering/labels from shared `ModeKind` metadata
    instead of local hardcoded matches
    - derive `request_user_input` availability text and unavailable error
    mode names from shared mode metadata
    - replace hardcoded known mode names in the Default collaboration-mode
    template with `{{KNOWN_MODE_NAMES}}` and fill it from
    `TUI_VISIBLE_COLLABORATION_MODES`
    - add regression tests for mode metadata sync and placeholder
    replacement
    
    ## Notes
    - `cargo test -p codex-core` integration target (`tests/all`) still
    shows pre-existing env-specific failures in this environment due to
    missing `test_stdio_server` binary resolution; core unit tests are green.
    
    ## Codex author
    `codex resume 019c26ff-dfe7-7173-bc04-c9e1fff1e447`
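
Editorial sketch (not code from the PR): one way the `{{KNOWN_MODE_NAMES}}` placeholder from #10666 could be filled from shared mode metadata. The `display_name` helper, the ordering of the constant, the comma-separated format, and the function `fill_known_mode_names` are assumptions; only `ModeKind`, `TUI_VISIBLE_COLLABORATION_MODES`, and the placeholder name come from the description.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModeKind {
    Plan,
    Default,
}

impl ModeKind {
    /// Shared display name, so the prompt, tools, and TUI cannot drift apart.
    pub fn display_name(self) -> &'static str {
        match self {
            ModeKind::Plan => "Plan",
            ModeKind::Default => "Default",
        }
    }
}

/// Modes shown in the TUI (the order here is an assumption).
pub const TUI_VISIBLE_COLLABORATION_MODES: [ModeKind; 2] = [ModeKind::Default, ModeKind::Plan];

/// Replaces the placeholder with a comma-separated list of visible mode names.
pub fn fill_known_mode_names(template: &str) -> String {
    let names: Vec<&str> = TUI_VISIBLE_COLLABORATION_MODES
        .iter()
        .map(|mode| mode.display_name())
        .collect();
    template.replace("{{KNOWN_MODE_NAMES}}", &names.join(", "))
}
```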
  • fix(core) switching model appends model instructions (#10651)
    ## Summary
    When switching models, we should append the instructions of the new
    model to the conversation as a developer message.
    
    ## Test
    - [x] Adds a unit test
  • feat(app-server, core): allow text + image content items for dynamic tool outputs (#10567)
    Took over the work that @aaronl-openai started here:
    https://github.com/openai/codex/pull/10397
    
    Now that app-server clients are able to set up custom tools (called
    `dynamic_tools` in app-server), we should expose a way for clients to
    pass in not just text, but also image outputs. This is something the
    Responses API already supports for function call outputs, where you can
    pass in either a string or an array of content outputs (text, image,
    file):
    https://platform.openai.com/docs/api-reference/responses/create#responses_create-input-input_item_list-item-function_tool_call_output-output-array-input_image
    
    So let's just plumb it through in Codex (with the caveat that we only
    support text and image for now). This is implemented end-to-end across
    app-server v2 protocol types and core tool handling.
    
    ## Breaking API change
    NOTE: This introduces a breaking change with dynamic tools, but I think
    it's ok since this concept was only recently introduced
    (https://github.com/openai/codex/pull/9539) and it's better to get the
    API contract correct. I don't think there are any real consumers of this
    yet (not even the Codex App).
    
    Old shape:
    `{ "output": "dynamic-ok", "success": true }`
    
    New shape:
    ```json
    {
      "contentItems": [
        { "type": "inputText", "text": "dynamic-ok" },
        { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" }
      ],
      "success": true
    }
    ```
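
Editorial sketch (not code from the PR): the new shape in #10567 modeled as plain Rust types, plus a helper that lifts an old text-only output into it. The type names, field names, and `from_legacy` are hypothetical; the real protocol types and their serialization are not shown in the description.

```rust
/// One content item in a dynamic tool output (text and image only, per the PR).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ContentItem {
    InputText { text: String },
    InputImage { image_url: String },
}

/// Hypothetical in-memory form of the new response shape.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DynamicToolResponse {
    pub content_items: Vec<ContentItem>,
    pub success: bool,
}

impl DynamicToolResponse {
    /// Wraps an old-style `{ "output": "...", "success": ... }` payload
    /// as a single text content item.
    pub fn from_legacy(output: &str, success: bool) -> Self {
        Self {
            content_items: vec![ContentItem::InputText { text: output.to_string() }],
            success,
        }
    }

    /// True if any item carries an image.
    pub fn has_image(&self) -> bool {
        self.content_items
            .iter()
            .any(|item| matches!(item, ContentItem::InputImage { .. }))
    }
}
```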
  • add none personality option (#10688)
    - add none personality enum value and empty placeholder behavior
    - add docs/schema updates and e2e coverage
  • fix(core) Request Rule guidance tweak (#10598)
    ## Summary
    Forgot to include this tweak.
    
    ## Testing
    - [x] Unit tests pass
  • fix(core) updated request_rule guidance (#10379)
    ## Summary
    Update guidance for request_rule
    
    ## Testing
    - [x] Unit tests pass
  • feat: add APIs to list and download public remote skills (#10448)
    Add API to list / download from remote public skills
  • fix: make $PWD/.agents read-only like $PWD/.codex (#10524)
    In light of https://github.com/openai/codex/pull/10317, because
    `.agents` can include resources that Codex can run in a privileged way,
    it should be read-only by default just as `.codex` is.
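
Editorial sketch (not code from the PR): a path check that treats `$PWD/.agents` the same as `$PWD/.codex`, as #10524 describes. The function names are hypothetical, and the real sandbox policy is enforced elsewhere; this only illustrates the component-wise prefix test.

```rust
use std::path::{Path, PathBuf};

/// Directories under the working directory that are read-only by default.
pub fn read_only_roots(cwd: &Path) -> Vec<PathBuf> {
    vec![cwd.join(".codex"), cwd.join(".agents")]
}

/// True if writing to `target` should be refused because it lies under a
/// read-only root. `Path::starts_with` compares whole path components, so
/// a sibling such as `.agents-old` is not matched.
pub fn is_write_blocked(cwd: &Path, target: &Path) -> bool {
    read_only_roots(cwd)
        .iter()
        .any(|root| target.starts_with(root))
}
```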
  • Cleanup collaboration mode variants (#10404)
    ## Summary
    
    This PR simplifies collaboration modes to the visible set `default |
    plan`, while preserving backward compatibility for older partners that
    may still send legacy mode names.
    
    Specifically:
    - Renames the old Code behavior to **Default**.
    - Keeps **Plan** as-is.
    - Removes **Custom** mode behavior (fallbacks now resolve to Default).
    - Keeps `PairProgramming` and `Execute` internally for compatibility
    plumbing, while removing them from schema/API and UI visibility.
    - Adds legacy input aliasing so older clients can still send old mode
    names.
    
    ## What Changed
    
    1. Mode enum and compatibility
    - `ModeKind` now uses `Plan` + `Default` as active/public modes.
    - `ModeKind::Default` deserialization accepts legacy values:
      - `code`
      - `pair_programming`
      - `execute`
      - `custom`
    - `PairProgramming` and `Execute` variants remain in code but are hidden
    from protocol/schema generation.
    - `Custom` variant is removed; previous custom fallbacks now map to
    `Default`.
    
    2. Collaboration presets and templates
    - Built-in presets now return only:
      - `Plan`
      - `Default`
    - Template rename:
      - `core/templates/collaboration_mode/code.md` -> `default.md`
    - `execute.md` and `pair_programming.md` remain on disk but are not
    surfaced in visible preset lists.
    
    3. TUI updates
    - Updated user-facing naming and prompts from “Code” to “Default”.
    - Updated mode-cycle and indicator behavior to reflect only visible
    `Plan` and `Default`.
    - Updated corresponding tests and snapshots.
    
    4. request_user_input behavior
    - `request_user_input` remains allowed only in `Plan` mode.
    - Rejection messaging now consistently treats non-plan modes as
    `Default`.
    
    5. Schemas
    - Regenerated config and app-server schemas.
    - Public schema types now advertise mode values as:
      - `plan`
      - `default`
    
    ## Backward Compatibility Notes
    
    - Incoming legacy mode names (`code`, `pair_programming`, `execute`,
    `custom`) are accepted and coerced to `default`.
    - Outgoing/public schema surfaces intentionally expose only `plan |
    default`.
    - This allows tolerant ingestion of older partner payloads while
    standardizing new integrations on the reduced mode set.
    
    ## Codex author
    `codex fork 019c1fae-693b-7840-b16e-9ad38ea0bd00`
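
Editorial sketch (not code from the PR): the legacy aliasing from #10404 expressed as a plain string match. The real change works through deserialization; `parse_mode` and `wire_name` are hypothetical names, and returning `None` for unknown strings is an assumption (the description does not say how unrecognized values are handled).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModeKind {
    Plan,
    Default,
}

/// Tolerant ingestion: legacy names are accepted and coerced to `Default`.
pub fn parse_mode(raw: &str) -> Option<ModeKind> {
    match raw {
        "plan" => Some(ModeKind::Plan),
        "default" | "code" | "pair_programming" | "execute" | "custom" => Some(ModeKind::Default),
        // Assumption: anything else is rejected rather than defaulted.
        _ => None,
    }
}

/// Strict output: only `plan` and `default` are ever emitted.
pub fn wire_name(mode: ModeKind) -> &'static str {
    match mode {
        ModeKind::Plan => "plan",
        ModeKind::Default => "default",
    }
}
```

Accepting more on input than is emitted on output is what lets older partner payloads keep working while new integrations see only the reduced set.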
  • [Codex][CLI] Gate image inputs by model modalities (#10271)
    ###### Summary
    
    - Add input_modalities to model metadata so clients can determine
    supported input types.
    - Gate image paste/attach in TUI when the selected model does not
    support images.
    - Block submits that include images for unsupported models and show a
    clear warning.
    - Propagate modality metadata through app-server protocol/model-list
    responses.
    - Update related tests/fixtures.
    
    ###### Rationale
    
    - Models support different input modalities.
    - Clients need an explicit capability signal to prevent unsupported
    requests.
    - Backward-compatible defaults preserve existing behavior when modality
    metadata is absent.
    
    ###### Scope
    
    - codex-rs/protocol, codex-rs/core, codex-rs/tui
    - codex-rs/app-server-protocol, codex-rs/app-server
    - Generated app-server types / schema fixtures
    
    ###### Trade-offs
    
    - Default behavior assumes text + image when field is absent for
    compatibility.
    - Server-side validation remains the source of truth.
    
    ###### Follow-up
    
    - Non-TUI clients should consume input_modalities to disable unsupported
    attachments.
    - Model catalogs should explicitly set input_modalities for text-only
    models.
    
    ###### Testing
    
    - cargo fmt --all
    - cargo test -p codex-tui
    - env -u GITHUB_APP_KEY cargo test -p codex-core --lib
    - just write-app-server-schema
    - cargo run -p codex-cli --bin codex -- app-server generate-ts --out
    app-server-types
    - test against local backend
      
    <img width="695" height="199" alt="image"
    src="https://github.com/user-attachments/assets/d22dd04f-5eba-4db9-a7c5-a2506f60ec44"
    />
    
    ---------
    
    Co-authored-by: Josh McKinney <joshka@openai.com>
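
Editorial sketch (not code from the PR): the backward-compatible default described in #10271, where a missing `input_modalities` field is treated as text + image. The enum and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputModality {
    Text,
    Image,
}

/// Resolves the declared modalities, falling back to text + image when the
/// model metadata does not carry the field at all.
pub fn effective_modalities(declared: Option<&[InputModality]>) -> Vec<InputModality> {
    match declared {
        Some(list) => list.to_vec(),
        None => vec![InputModality::Text, InputModality::Image],
    }
}

/// Client-side gate for image paste/attach. Server-side validation remains
/// the source of truth; this only avoids sending requests that will fail.
pub fn can_attach_image(declared: Option<&[InputModality]>) -> bool {
    effective_modalities(declared).contains(&InputModality::Image)
}
```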
  • chore: add phase to message responseitem (#10455)
    ### What
    
    Add wiring for the `phase` field on `ResponseItem::Message` to lay
    groundwork for differentiating model preambles and final messages.
    The field is currently optional.
    
    Follows the pattern in #9698.
    
    Updated schemas with `just write-app-server-schema` so we can see type
    changes.
    
    ### Tests
    Updated existing tests for SSE parsing and hydrating from history
  • feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
    We started working with MCP in Codex before
    https://crates.io/crates/rmcp was mature, so we had our own crate for
    MCP types that was generated from the MCP schema:
    
    
    https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md
    
    Now that `rmcp` is more mature, it makes more sense to use their MCP
    types in Rust, as they handle details (like the `_meta` field) that our
    custom version ignored. Though one advantage that our custom types had
    is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
    whereas the types in `rmcp` do not. As such, part of the work of this PR
    is leveraging the adapters between `rmcp` types and the serializable
    types that are API for us (app server and MCP) introduced in #10356.
    
    Note this PR results in a number of changes to
    `codex-rs/app-server-protocol/schema`, which merit special attention
    during review. We must ensure that these changes are still
    backwards-compatible, which is possible because we have:
    
    ```diff
    - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
    + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
    ```
    
    so `ContentBlock` has been replaced with the more general `JsonValue`.
    Note that `ContentBlock` was defined as:
    
    ```typescript
    export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
    ```
    
    so the deletion of those individual variants should not be a cause of
    great concern.
    
    Similarly, we have the following change in
    `codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
    
    ```diff
    - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
    + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
    ```
    
    so:
    
    - `annotations?: ToolAnnotations` ➡️ `JsonValue`
    - `inputSchema: ToolInputSchema` ➡️ `JsonValue`
    - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
    
    and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
    * #10357
    * __->__ #10349
    * #10356
  • feat: add MCP protocol types and rmcp adapters (#10356)
    Currently, types from our custom `mcp-types` crate are part of some of
    our APIs:
    
    
    https://github.com/openai/codex/blob/03fcd12e77fedf4fa327af27e2e476e1ebc5f651/codex-rs/app-server-protocol/src/protocol/v2.rs#L43-L46
    
    To eliminate this crate in #10349 by switching to `rmcp`, we need our
    own wrappers for the `rmcp` types that we can use in our API, which is
    what this PR does.
    
    Note this PR introduces the new API types, but we do not make use of
    them until #10349.
    
    
    
    
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10356).
    * #10357
    * #10349
    * __->__ #10356
  • fix(rules) Limit rules listed in conversation (#10351)
    ## Summary
    We should probably warn users that they have a million rules, and help
    clean them up. But for now, we should handle this unbounded case.
    
    Limit rules listed in conversations, with shortest / broadest rules
    first.
    
    ## Testing
    - [x] Updated unit tests
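
Editorial sketch (not code from the PR): bounding the rules listed in a conversation as #10351 describes, shortest first. Representing a rule as a list of command tokens, measuring length in tokens, and the name `rules_to_list` are all assumptions; the description does not specify the rule representation or the limit.

```rust
/// Returns at most `limit` rules, shortest (and therefore broadest) first.
/// The sort is stable, so rules of equal length keep their original order.
pub fn rules_to_list(rules: &[Vec<String>], limit: usize) -> Vec<Vec<String>> {
    let mut sorted = rules.to_vec();
    sorted.sort_by_key(|rule| rule.len());
    sorted.truncate(limit);
    sorted
}
```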
  • add missing fields to WebSearchAction and update app-server types (#10276)
    - add `WebSearchAction` to app-server v2 types
    - add `queries` to `WebSearchAction::Search` type
    
    Updated tests.
  • Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)
    ## Summary
    - Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
    in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
    tags from normal assistant output.
    - Persist plan items and rebuild them on resume so proposed plans show
    in thread history.
    - Wire plan items/deltas through app-server protocol v2 and render a
    dedicated proposed-plan view in the TUI, including the “Implement this
    plan?” prompt only when a plan item is present.
    
    ## Changes
    
    ### Core (`codex-rs/core`)
    - Added a generic, line-based tag parser that buffers each line until it
    can disprove a tag prefix; implements auto-close on `finish()` for
    unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
    - Refactored proposed plan parsing to wrap the generic parser.
    `codex-rs/core/src/proposed_plan_parser.rs`
    - In plan mode, stream assistant deltas as:
      - **Normal text** → `AgentMessageContentDelta`
      - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
      (`codex-rs/core/src/codex.rs`)
    - Final plan item content is derived from the completed assistant
    message (authoritative), not necessarily the concatenated deltas.
    - Strips `<proposed_plan>` blocks from assistant text in plan mode so
    tags don’t appear in normal messages.
    (`codex-rs/core/src/stream_events_utils.rs`)
    - Persist `ItemCompleted` events only for plan items for rollout replay.
    (`codex-rs/core/src/rollout/policy.rs`)
    - Guard `update_plan` tool in Plan Mode with a clear error message.
    (`codex-rs/core/src/tools/handlers/plan.rs`)
    - Updated Plan Mode prompt to:  
      - keep `<proposed_plan>` out of non-final reasoning/preambles  
      - require exact tag formatting  
      - allow only one `<proposed_plan>` block per turn  
      (`codex-rs/core/templates/collaboration_mode/plan.md`)
    
    ### Protocol / App-server protocol
    - Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
    (`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
    - Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
    EXPERIMENTAL markers and note that deltas may not match the final plan
    item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
    - Added plan delta route in app-server protocol common mapping.
    (`codex-rs/app-server-protocol/src/protocol/common.rs`)
    - Rebuild plan items from persisted `ItemCompleted` events on resume.
    (`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)
    
    ### App-server
    - Forward plan deltas to v2 clients and map core plan items to v2 plan
    items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
    `codex-rs/app-server/src/codex_message_processor.rs`)
    - Added v2 plan item tests.
    (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
    
    ### TUI
    - Added a dedicated proposed plan history cell with special background
    and padding, and moved “• Proposed Plan” outside the highlighted block.
    (`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
    - Only show “Implement this plan?” when a plan item exists.
    (`codex-rs/tui/src/chatwidget.rs`,
    `codex-rs/tui/src/chatwidget/tests.rs`)
    
    <img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
    src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
    />
    
    ### Docs / Misc
    - Updated protocol docs to mention plan deltas.
    (`codex-rs/docs/protocol_v1.md`)
    - Minor plumbing updates in exec/debug clients to tolerate plan deltas.
    (`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)
    
    ## Tests
    - Added core integration tests:
      - Plan mode strips plan from agent messages.
      - Missing `</proposed_plan>` closes at end-of-message.  
      (`codex-rs/core/tests/suite/items.rs`)
    - Added unit tests for generic tag parser (prefix buffering, non-tag
    lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
    - Existing app-server plan item tests in v2.
    (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
    
    ## Notes / Behavior
    - Plan output no longer appears in standard assistant text in Plan Mode;
    it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
    - The final plan item content is authoritative and may diverge from
    streamed deltas (documented as experimental).
    - Reasoning summaries are not filtered; prompt instructs the model not
    to include `<proposed_plan>` outside the final plan message.
    
    ## Codex Author
    `codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
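
Editorial sketch (not code from the PR): a whole-message simplification of the `<proposed_plan>` handling in #9786. The real parser is streaming and buffers partial tag prefixes; this version only shows the two behaviors called out in the tests, stripping the plan from normal text and auto-closing an unterminated block at end of message. `SplitMessage` and `split_proposed_plan` are hypothetical names, and requiring each tag to sit alone on its line is an assumption.

```rust
const OPEN_TAG: &str = "<proposed_plan>";
const CLOSE_TAG: &str = "</proposed_plan>";

#[derive(Debug, Default, PartialEq, Eq)]
pub struct SplitMessage {
    /// Text shown as the normal assistant message (tags and plan removed).
    pub assistant_text: String,
    /// Plan body, if a `<proposed_plan>` block was opened.
    pub plan_text: Option<String>,
}

/// Splits a completed assistant message line by line. A block that is never
/// closed simply runs to the end of the message (auto-close).
pub fn split_proposed_plan(message: &str) -> SplitMessage {
    let mut out = SplitMessage::default();
    let mut in_plan = false;
    for line in message.lines() {
        match line.trim() {
            OPEN_TAG if !in_plan => {
                in_plan = true;
                out.plan_text.get_or_insert_with(String::new);
            }
            CLOSE_TAG if in_plan => in_plan = false,
            _ if in_plan => {
                let plan = out.plan_text.get_or_insert_with(String::new);
                plan.push_str(line);
                plan.push('\n');
            }
            _ => {
                out.assistant_text.push_str(line);
                out.assistant_text.push('\n');
            }
        }
    }
    out
}
```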
  • Conversation naming (#8991)
    Session renaming:
    - `/rename my_session`
    - `/rename` without an argument, passing the name in `customViewPrompt`
    - AppExitInfo shows a resume hint using the session name if set,
    falling back to the UUID otherwise
    - Names are stored in `CODEX_HOME/sessions.jsonl`
    
    Session resuming:
    - `codex resume <name>` looks up the first entry in
    `CODEX_HOME/sessions.jsonl` matching the name and resumes that session
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • chore(personality) new schema with fallbacks (#10147)
    ## Summary
    Let's dial in this API contract a bit more, with more robust fallback
    behavior when model_instructions_template is false.
    
    Switches to a more explicit template / variables structure, with more
    fallbacks.
    
    ## Testing
    - [x] Adding unit tests
    - [x] Tested locally
  • add error messages for the go plan type (#10181)
    Adds support for the Go plan type
    Updates rate limit error messages to point to the usage page
  • [feat] persist dynamic tools in session rollout file (#10130)
    Add dynamic tools to rollout file for persistence & read from rollout on
    resume. Ran a real example and spotted the following in the rollout
    file:
    ```
    {"timestamp":"2026-01-29T01:27:57.468Z","type":"session_meta","payload":{"id":"019c075d-3f0b-77e3-894e-c1c159b04b1e","timestamp":"2026-01-29T01:27:57.451Z","...."dynamic_tools":[{"name":"demo_tool","description":"Demo dynamic tool","inputSchema":{"additionalProperties":false,"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}}],"git":{"commit_hash":"ebc573f15c01b8af158e060cfedd401f043e9dfa","branch":"dev/cc/dynamic-tools","repository_url":"https://github.com/openai/codex.git"}}}
    ```
  • [Codex][CLI] Show model-capacity guidance on 429 (#10118)
    ###### Problem
    Users get generic 429s with no guidance when a model is at capacity.
    ###### Solution
    Detect model-cap headers, surface a clear “try a different model”
    message, and keep behavior non‑intrusive (no auto‑switch).
    ###### Scope
    CLI/TUI only; protocol + error mapping updated to carry model‑cap info.
    ###### Tests
    - just fmt
    - cargo test -p codex-tui
    - cargo test -p codex-core --lib
    shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file --
    --nocapture (ran in isolated env)
    - validate local build with backend
    
    <img width="719" height="845" alt="image"
    src="https://github.com/user-attachments/assets/1470b33d-0974-4b1f-b8e6-d11f892f4b54"
    />
  • Better handling skill dependencies on ENV VAR. (#9017)
    An experimental flow for env var skill dependencies. Skills can now
    declare required env vars in SKILL.md; if missing, the CLI prompts the
    user to get the value, and Core will store it in memory (eventually to a
    local persistent store)
    <img width="790" height="169" alt="image"
    src="https://github.com/user-attachments/assets/cd928918-9403-43cb-a7e7-b8d59bcccd9a"
    />
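
Editorial sketch (not code from the PR): finding which declared env vars a skill still needs, as a first step before prompting the user per #9017. The function name, the injected `lookup` closure, and treating an empty value as missing are assumptions; how SKILL.md declares the variables is not shown in the description.

```rust
/// Returns the required variable names that have no non-empty value.
/// `lookup` is injected so callers can combine the process environment
/// with values already collected in memory.
pub fn missing_env_vars<F>(required: &[&str], lookup: F) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut missing = Vec::new();
    for &name in required {
        let present = matches!(lookup(name), Some(value) if !value.is_empty());
        if !present {
            missing.push(name.to_string());
        }
    }
    missing
}
```

With the process environment as the only source, a caller could pass `|key| std::env::var(key).ok()` as the lookup.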
  • [connectors] Support connectors part 2 - slash command and tui (#9728)
    - [x] Support `/apps` slash command to browse the apps in tui.
    - [x] Support inserting apps to prompt using `$`.
    - [x] Lots of simplification/renaming from connectors to apps.
  • compaction (#10034)
  • feat: sqlite 1 (#10004)
    Add a `.sqlite` database to be used to store rollout metadata (and
    later logs)
    This PR is phase 1:
    * Add the database and the required infrastructure
    * Add a backfill of the database
    * Persist the newly created rollout both in files and in the DB
    * When we need to get metadata or a rollout, consider the `JSONL` as the
    source of truth but compare the results with the DB and show any errors
  • feat(core) RequestRule (#9489)
    ## Summary
    Instead of trying to derive the prefix_rule for a command mechanically,
    let's let the model decide for us.
    
    ## Testing
    - [x] tested locally
  • [skills] Auto install MCP dependencies when running skills with dependency specs. (#9982)
    Auto install MCP dependencies when running skills with dependency specs.
  • remove sandbox globals. (#9797)
    Threads sandbox updates through OverrideTurnContext for active turn
    Passes computed sandbox type into safety/exec
  • make cached web_search client-side default (#9974)
    [Experiment](https://console.statsig.com/50aWbk2p4R76rNX9lN5VUw/experiments/codex_web_search_rollout/summary)
    for default cached `web_search` completed; cached chosen as default.
    
    Update client to reflect that.
  • fix: handle all web_search actions and in progress invocations (#9960)
    ### Summary
    - Parse all `web_search` tool actions (`search`, `find_in_page`,
    `open_page`).
    - Previously we only parsed + displayed `search`, which made the TUI
    appear to pause when the other actions were being used.
    - Show in progress `web_search` calls as `Searching the web`
      - Previously we only showed completed tool calls
    
    <img width="308" height="149" alt="image"
    src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
    />
    
    ### Tests
    Added + updated tests, tested locally
    
    ### Follow ups
    Update VSCode extension to display these as well
  • Feat: add isOther to question returned by request user input tool (#9890)
    ### Summary
    Add `isOther` to question object from request_user_input tool input and
    remove `other` option from the tool prompt to better handle tool input.
  • feat: dynamic tools injection (#9539)
    ## Summary
    Add dynamic tool injection to thread startup in API v2, wire dynamic
    tool calls through the app server to clients, and plumb responses back
    into the model tool pipeline.
    
    ### Flow (high level)
    - Thread start injects `dynamic_tools` into the model tool list for that
    thread (validation is done here).
    - When the model emits a tool call for one of those names, core raises a
    `DynamicToolCallRequest` event.
    - The app server forwards it to the client as `item/tool/call`, waits
    for the client’s response, then submits a `DynamicToolResponse` back to
    core.
    - Core turns that into a `function_call_output` in the next model
    request so the model can continue.
    
    ### What changed
    - Added dynamic tool specs to v2 thread start params and protocol types;
    introduced `item/tool/call` (request/response) for dynamic tool
    execution.
    - Core now registers dynamic tool specs at request time and routes those
    calls via a new dynamic tool handler.
    - App server validates tool names/schemas, forwards dynamic tool call
    requests to clients, and publishes tool outputs back into the session.
    - Integration tests
  • feat(tui) /personality (#9718)
    ## Summary
    Adds /personality selector in the TUI, which leverages the new core
    interface in #9644
    
    Notes:
    - We are doing some of our own state management for model_info loading
    here, but not sure if that's ideal. Open to opinions on a simpler
    approach, but would like to avoid blocking on a larger refactor
    - Right now, the `/personality` selector just hides when the model
    doesn't support it. We can update this behavior down the line
    
    ## Testing
    - [x] Tested locally
    - [x] Added snapshot tests
  • feat: cap number of agents (#9855)
    Adding more guards to agent:
    * Max depth of 1 (i.e. a sub-agent can't spawn another one)
    * Max 12 sub-agents in total
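
Editorial sketch (not code from the PR): the two guards from #9855 as a single check. The constants use the limits stated above; the function name, the error type, and counting the root agent as depth 0 are assumptions.

```rust
pub const MAX_AGENT_DEPTH: usize = 1;
pub const MAX_SUB_AGENTS: usize = 12;

#[derive(Debug, PartialEq, Eq)]
pub enum SpawnError {
    DepthExceeded,
    TooManyAgents,
}

/// Checks whether an agent at `parent_depth` (root = 0) may spawn a child,
/// given how many sub-agents already exist in total.
pub fn check_spawn(parent_depth: usize, existing_sub_agents: usize) -> Result<(), SpawnError> {
    if parent_depth + 1 > MAX_AGENT_DEPTH {
        return Err(SpawnError::DepthExceeded);
    }
    if existing_sub_agents >= MAX_SUB_AGENTS {
        return Err(SpawnError::TooManyAgents);
    }
    Ok(())
}
```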
  • Use collaboration mode masks without mutating base settings (#9806)
    Keep an unmasked base collaboration mode and apply the active mask on
    demand. Simplify the TUI mask helpers and update tests/docs to match the
    mask contract.
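
Editorial sketch (not code from the PR): the "keep the base, apply the mask on demand" contract from #9806. All type and field names (`ModeSettings`, `ModeMask`, `model`, `reasoning_effort`) are hypothetical; the point is only that `effective()` computes a fresh value and never writes into `base`.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModeSettings {
    pub model: String,
    pub reasoning_effort: String,
}

/// A mask overrides only the fields it sets.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ModeMask {
    pub model: Option<String>,
    pub reasoning_effort: Option<String>,
}

pub struct CollaborationState {
    base: ModeSettings,
    active_mask: Option<ModeMask>,
}

impl CollaborationState {
    pub fn new(base: ModeSettings) -> Self {
        Self { base, active_mask: None }
    }

    pub fn set_mask(&mut self, mask: Option<ModeMask>) {
        self.active_mask = mask;
    }

    /// The unmasked settings, untouched by any mask.
    pub fn base(&self) -> &ModeSettings {
        &self.base
    }

    /// Base settings with the active mask applied, computed on demand.
    pub fn effective(&self) -> ModeSettings {
        let mut out = self.base.clone();
        if let Some(mask) = &self.active_mask {
            if let Some(model) = &mask.model {
                out.model = model.clone();
            }
            if let Some(effort) = &mask.reasoning_effort {
                out.reasoning_effort = effort.clone();
            }
        }
        out
    }
}
```

Because the base is never mutated, clearing the mask returns exactly the original settings.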
  • feat: ephemeral threads (#9765)
    Add ephemeral threads capabilities. Only exposed through the
    `app-server` v2
    
    The idea is to disable the rollout recorder for those threads.
  • change collaboration mode to struct (#9793)
    Shouldn't cause behavioral change
  • Fix execpolicy parsing for multiline quoted args (#9565)
    ## What
    Fix bash command parsing to accept double-quoted strings that contain
    literal newlines so execpolicy can match allow rules.
    
    ## Why
    Allow rules like [git, commit] should still match when commit messages
    include a newline in a quoted argument; the parser currently rejects
    these strings and falls back to the outer shell invocation.
    
    ## How
    - Validate double-quoted strings by ensuring all named children are
    string_content and then stripping the outer quotes from the raw node
    text so embedded newlines are preserved.
    - Reuse the helper for concatenated arguments.
    - Ensure large SI suffix formatting uses the caller-provided locale
    formatter for grouping.
    - Add coverage for newline-containing quoted arguments.
    
    Fixes #9541.
    
    ## Tests
    - cargo test -p codex-core
    - just fix -p codex-core
    - cargo test -p codex-protocol
    - just fix -p codex-protocol
    - cargo test --all-features
  • feat(core) update Personality on turn (#9644)
    ## Summary
    Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
    is explicitly unused by the clients so far, to simplify PRs - app-server
    and tui implementations will be follow-ups.
    
    ## Testing
    - [x] added integration tests
  • Support end_turn flag (#9698)
    Experimental flag that signals the end of the turn.
  • Add UI for skill enable/disable. (#9627)
    "/skill" will now allow you to enable/disable skills:
    <img width="658" height="199" alt="image"
    src="https://github.com/user-attachments/assets/bf8994c8-d6c1-462f-8bbb-f1ee9241caa4"
    />
  • feat(core) ModelInfo.model_instructions_template (#9597)
    ## Summary
    #9555 is the start of a rename, so I'm starting to standardize here.
    Sets up `model_instructions` templating with a strongly-typed object for
    injecting a personality block into the model instructions.
    
    ## Testing
    - [x] Added tests
    - [x] Ran locally
  • Add layered config.toml support to app server (#9510)
    This PR adds support for chained (layered) config.toml file merging for
    clients that use the app server interface. This feature already exists
    for the TUI, but it does not work for GUI clients.
    
    It does the following:
    * Changes code paths for new thread, resume thread, and fork thread to
    use the effective config based on the cwd.
    * Updates the `config/read` API to accept an optional `cwd` parameter.
    If specified, the API returns the effective config based on that cwd
    path. Also optionally includes all layers including project config
    files. If cwd is not specified, the API falls back on its older behavior
    where it considers only the global (non-project) config files when
    computing the effective config.
    
    The changes in codex_message_processor.rs look deceptively large. They
    mostly just involve moving existing blocks of code to a later point in
    some functions so it can use the cwd to calculate the config.
    
    This PR builds upon #9509 and should be reviewed and merged after that
    PR.
    
    Tested:
    * Verified change with (dependent, as-yet-uncommitted) changes to IDE
    Extension and confirmed correct behavior
    
    The full fix requires additional changes in the IDE Extension code base,
    but they depend on this PR.
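
Editorial sketch (not code from the PR): the layering behavior described in #9510, reduced to flat string keys. Real config.toml merging works on nested TOML values; the `Layer` alias, the function name, and modeling "cwd specified" as `Some(project_layers)` are assumptions for illustration.

```rust
use std::collections::BTreeMap;

/// One config layer, flattened to key/value strings for this sketch.
pub type Layer = BTreeMap<String, String>;

/// Computes the effective config. With no cwd (`project_layers` is `None`)
/// only the global layer is considered, matching the older behavior.
/// Otherwise project layers are applied in order, later ones winning.
pub fn effective_config(global: &Layer, project_layers: Option<&[Layer]>) -> Layer {
    let mut merged = global.clone();
    if let Some(layers) = project_layers {
        for layer in layers {
            for (key, value) in layer {
                merged.insert(key.clone(), value.clone());
            }
        }
    }
    merged
}
```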