Commit Graph

253 Commits

  • feat: sqlite 1 (#10004)
    Add a `.sqlite` database to be used to store rollout metatdata (and
    later logs)
    This PR is phase 1:
    * Add the database and the required infrastructure
    * Add a backfill of the database
    * Persist the newly created rollout both in files and in the DB
    * When we need to get metadata or a rollout, consider the `JSONL` as the
    source of truth but compare the results with the DB and show any errors
  • feat(core) RequestRule (#9489)
    ## Summary
    Instead of trying to derive the prefix_rule for a command mechanically,
    let's let the model decide for us.
    
    ## Testing
    - [x] tested locally
  • [skills] Auto install MCP dependencies when running skils with dependency specs. (#9982)
    Auto install MCP dependencies when running skils with dependency specs.
  • remove sandbox globals. (#9797)
    Threads sandbox updates through OverrideTurnContext for active turn
    Passes computed sandbox type into safety/exec
  • make cached web_search client-side default (#9974)
    [Experiment](https://console.statsig.com/50aWbk2p4R76rNX9lN5VUw/experiments/codex_web_search_rollout/summary)
    for default cached `web_search` completed; cached chosen as default.
    
    Update client to reflect that.
  • fix: handle all web_search actions and in progress invocations (#9960)
    ### Summary
    - Parse all `web_search` tool actions (`search`, `find_in_page`,
    `open_page`).
    - Previously we only parsed + displayed `search`, which made the TUI
    appear to pause when the other actions were being used.
    - Show in progress `web_search` calls as `Searching the web`
      - Previously we only showed completed tool calls
    
    <img width="308" height="149" alt="image"
    src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
    />
    
    ### Tests
    Added + updated tests, tested locally
    
    ### Follow ups
    Update VSCode extension to display these as well
  • Feat: add isOther to question returned by request user input tool (#9890)
    ### Summary
    Add `isOther` to question object from request_user_input tool input and
    remove `other` option from the tool prompt to better handle tool input.
  • feat: dynamic tools injection (#9539)
    ## Summary
    Add dynamic tool injection to thread startup in API v2, wire dynamic
    tool calls through the app server to clients, and plumb responses back
    into the model tool pipeline.
    
    ### Flow (high level)
    - Thread start injects `dynamic_tools` into the model tool list for that
    thread (validation is done here).
    - When the model emits a tool call for one of those names, core raises a
    `DynamicToolCallRequest` event.
    - The app server forwards it to the client as `item/tool/call`, waits
    for the client’s response, then submits a `DynamicToolResponse` back to
    core.
    - Core turns that into a `function_call_output` in the next model
    request so the model can continue.
    
    ### What changed
    - Added dynamic tool specs to v2 thread start params and protocol types;
    introduced `item/tool/call` (request/response) for dynamic tool
    execution.
    - Core now registers dynamic tool specs at request time and routes those
    calls via a new dynamic tool handler.
    - App server validates tool names/schemas, forwards dynamic tool call
    requests to clients, and publishes tool outputs back into the session.
    - Integration tests
  • feat(tui) /personality (#9718)
    ## Summary
    Adds /personality selector in the TUI, which leverages the new core
    interface in #9644
    
    Notes:
    - We are doing some of our own state management for model_info loading
    here, but not sure if that's ideal. open to opinions on simpler
    approach, but would like to avoid blocking on a larger refactor
    - Right now, the `/personality` selector just hides when the model
    doesn't support it. we can update this behavior down the line
    
    ## Testing
    - [x] Tested locally
    - [x] Added snapshot tests
  • feat: cap number of agents (#9855)
    Adding more guards to agent:
    * Max depth or 1 (i.e. a sub-agent can't spawn another one)
    * Max 12 sub-agents in total
  • Use collaboration mode masks without mutating base settings (#9806)
    Keep an unmasked base collaboration mode and apply the active mask on
    demand. Simplify the TUI mask helpers and update tests/docs to match the
    mask contract.
  • feat: ephemeral threads (#9765)
    Add ephemeral threads capabilities. Only exposed through the
    `app-server` v2
    
    The idea is to disable the rollout recorder for those threads.
  • change collaboration mode to struct (#9793)
    Shouldn't cause behavioral change
  • Fix execpolicy parsing for multiline quoted args (#9565)
    ## What
    Fix bash command parsing to accept double-quoted strings that contain
    literal newlines so execpolicy can match allow rules.
    
    ## Why
    Allow rules like [git, commit] should still match when commit messages
    include a newline in a quoted argument; the parser currently rejects
    these strings and falls back to the outer shell invocation.
    
    ## How
    - Validate double-quoted strings by ensuring all named children are
    string_content and then stripping the outer quotes from the raw node
    text so embedded newlines are preserved.
    - Reuse the helper for concatenated arguments.
    - Ensure large SI suffix formatting uses the caller-provided locale
    formatter for grouping.
    - Add coverage for newline-containing quoted arguments.
    
    Fixes #9541.
    
    ## Tests
    - cargo test -p codex-core
    - just fix -p codex-core
    - cargo test -p codex-protocol
    - just fix -p codex-protocol
    - cargo test --all-features
  • feat(core) update Personality on turn (#9644)
    ## Summary
    Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
    is explicitly unused by the clients so far, to simplify PRs - app-server
    and tui implementations will be follow-ups.
    
    ## Testing
    - [x] added integration tests
  • Support end_turn flag (#9698)
    Experimental flag that signals the end of the turn.
  • Add UI for skill enable/disable. (#9627)
    "/skill" will now allow you to enable/disable skills:
    <img width="658" height="199" alt="image"
    src="https://github.com/user-attachments/assets/bf8994c8-d6c1-462f-8bbb-f1ee9241caa4"
    />
  • feat(core) ModelInfo.model_instructions_template (#9597)
    ## Summary
    #9555 is the start of a rename, so I'm starting to standardize here.
    Sets up `model_instructions` templating with a strongly-typed object for
    injecting a personality block into the model instructions.
    
    ## Testing
    - [x] Added tests
    - [x] Ran locally
  • Add layered config.toml support to app server (#9510)
    This PR adds support for chained (layered) config.toml file merging for
    clients that use the app server interface. This feature already exists
    for the TUI, but it does not work for GUI clients.
    
    It does the following:
    * Changes code paths for new thread, resume thread, and fork thread to
    use the effective config based on the cwd.
    * Updates the `config/read` API to accept an optional `cwd` parameter.
    If specified, the API returns the effective config based on that cwd
    path. Also optionally includes all layers including project config
    files. If cwd is not specified, the API falls back on its older behavior
    where it considers only the global (non-project) config files when
    computing the effective config.
    
    The changes in codex_message_processor.rs look deceptively large. They
    mostly just involve moving existing blocks of code to a later point in
    some functions so it can use the cwd to calculate the config.
    
    This PR builds upon #9509 and should be reviewed and merged after that
    PR.
    
    Tested:
    * Verified change with (dependent, as-yet-uncommitted) changes to IDE
    Extension and confirmed correct behavior
    
    The full fix requires additional changes in the IDE Extension code base,
    but they depend on this PR.
  • Add collaboration_mode to TurnContextItem (#9583)
    ## Summary
    - add optional `collaboration_mode` to `TurnContextItem` in rollouts
    - persist the current collaboration mode when recording turn context
    (sampling + compaction)
    
    ## Rationale
    We already persist turn context data for resume logic. Capturing
    collaboration mode in the rollout gives us the mode context for each
    turn, enabling follow‑up work to diff mode instructions correctly on
    resume.
    
    ## Changes
    - protocol: add optional `collaboration_mode` field to `TurnContextItem`
    - core: persist collaboration mode alongside other turn context settings
    in rollouts
  • Chore: update plan mode output in prompt (#9592)
    ### Summary
    * Update plan prompt output
    * Update requestUserInput response to be a single key value pair
    `answer: String`.
  • Prompt Expansion: Preserve Text Elements (#9518)
    Summary
    - Preserve `text_elements` through custom prompt argument parsing and
    expansion (named and numeric placeholders).
    - Translate text element ranges through Shlex parsing using sentinel
    substitution, and rehydrate text + element ranges per arg.
    - Drop image attachments when their placeholder does not survive prompt
    expansion, keeping attachments consistent with rendered elements.
    - Mirror changes in TUI2 and expand tests for prompt parsing/expansion
    edge cases.
    
    Tests
    - placeholders with spaces as single tokens (positional + key=value,
    quoted + unquoted),
      - prompt expansion with image placeholders,
      - large paste + image arg combinations,
      - unused image arg dropped after expansion.
  • Add total (non-partial) TextElement placeholder accessors (#9545)
    ## Summary
    - Make `TextElement` placeholders private and add a text-backed accessor
    to avoid assuming `Some`.
    - Since they are optional in the protocol, we want to make sure any
    accessors properly handle the None case (getting the placeholder using
    the byte range in the text)
    - Preserve placeholders during protocol/app-server conversions using the
    accessor fallback.
    - Update TUI composer/remap logic and tests to use the new
    constructor/accessor.
  • fix(core) Preserve base_instructions in SessionMeta (#9427)
    ## Summary
    This PR consolidates base_instructions onto SessionMeta /
    SessionConfiguration, so we ensure `base_instructions` is set once per
    session and should be (mostly) immutable, unless:
    - overridden by config on resume / fork
    - sub-agent tasks, like review or collab
    
    
    In a future PR, we should convert all references to `base_instructions`
    to consistently used the typed struct, so it's less likely that we put
    other strings there. See #9423. However, this PR is already quite
    complex, so I'm deferring that to a follow-up.
    
    ## Testing
    - [x] Added a resume test to assert that instructions are preserved. In
    particular, `resume_switches_models_preserves_base_instructions` fails
    against main.
    
    Existing test coverage thats assert base instructions are preserved
    across multiple requests in a session:
    - Manual compact keeps baseline instructions:
    core/tests/suite/compact.rs:199
    - Auto-compact keeps baseline instructions:
    core/tests/suite/compact.rs:1142
    - Prompt caching reuses the same instructions across two requests:
    core/tests/suite/prompt_caching.rs:150 and
    core/tests/suite/prompt_caching.rs:157
    - Prompt caching with explicit expected string across two requests:
    core/tests/suite/prompt_caching.rs:213 and
    core/tests/suite/prompt_caching.rs:222
    - Resume with model switch keeps original instructions:
    core/tests/suite/resume.rs:136
    - Compact/resume/fork uses request 0 instructions for later expected
    payloads: core/tests/suite/compact_resume_fork.rs:215
  • Migrate tui to use UserTurn (#9497)
    - `tui/` and `tui2/` submit `Op::UserTurn` and own full turn context
    (cwd/approval/sandbox/model/etc.).
    - `Op::UserInput` is documented as legacy in `codex-protocol` (doc-only;
    no `#[deprecated]` to avoid `-D warnings` fallout).
    - Remove obsolete `#[allow(deprecated)]` and the unused `ConversationId`
    alias/re-export.
  • Feat: request user input tool (#9472)
    ### Summary
    * Add `requestUserInput` tool that the model can use for gather
    feedback/asking question mid turn.
    
    
    ### Tool input schema
    ```
    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "requestUserInput input",
      "type": "object",
      "additionalProperties": false,
      "required": ["questions"],
      "properties": {
        "questions": {
          "type": "array",
          "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.",
          "minItems": 1,
          "maxItems": 3,
          "items": {
            "type": "object",
            "additionalProperties": false,
            "required": ["id", "header", "question"],
            "properties": {
              "id": {
                "type": "string",
                "description": "Stable identifier for mapping answers (snake_case)."
              },
              "header": {
                "type": "string",
                "description": "Short header label shown in the UI (12 or fewer chars)."
              },
              "question": {
                "type": "string",
                "description": "Single-sentence prompt shown to the user."
              },
              "options": {
                "type": "array",
                "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.",
                "minItems": 2,
                "maxItems": 3,
                "items": {
                  "type": "object",
                  "additionalProperties": false,
                  "required": ["value", "label", "description"],
                  "properties": {
                    "value": {
                      "type": "string",
                      "description": "Machine-readable value (snake_case)."
                    },
                    "label": {
                      "type": "string",
                      "description": "User-facing label (1-5 words)."
                    },
                    "description": {
                      "type": "string",
                      "description": "One short sentence explaining impact/tradeoff if selected."
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    ```
    
    ### Tool output schema
    ```
    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "requestUserInput output",
      "type": "object",
      "additionalProperties": false,
      "required": ["answers"],
      "properties": {
        "answers": {
          "type": "object",
          "description": "Map of question id to user answer.",
          "additionalProperties": {
            "type": "object",
            "additionalProperties": false,
            "required": ["selected"],
            "properties": {
              "selected": {
                "type": "array",
                "items": { "type": "string" }
              },
              "other": {
                "type": ["string", "null"]
              }
            }
          }
        }
      }
    }
    ```
  • Remove unused protocol collaboration mode prompts (#9463)
    Delete duplicate collaboration mode markdown under protocol prompts;
    core templates remain the single source of truth.
  • Add collaboration modes test prompts (#9443)
    # External (non-OpenAI) Pull Request Requirements
    
    Before opening this Pull Request, please read the dedicated
    "Contributing" markdown file or your PR may be closed:
    https://github.com/openai/codex/blob/main/docs/contributing.md
    
    If your PR conforms to our contribution guidelines, replace this text
    with a detailed and high quality description of your changes.
    
    Include a link to a bug report or enhancement request.
  • Add collaboration developer instructions (#9424)
    - Add additional instructions when they are available
    - Make sure to update them on change either UserInput or UserTurn
  • chore(instructions) Remove unread SessionMeta.instructions field (#9423)
    ### Description
    - Remove the now-unused `instructions` field from the session metadata
    to simplify SessionMeta and stop propagating transient instruction text
    through the rollout recorder API. This was only saving
    user_instructions, and was never being read.
    - Stop passing user instructions into the rollout writer at session
    creation so the rollout header only contains canonical session metadata.
    
    ### Testing
    
    - Ran `just fmt` which completed successfully.
    - Ran `just fix -p codex-protocol`, `just fix -p codex-core`, `just fix
    -p codex-app-server`, `just fix -p codex-tui`, and `just fix -p
    codex-tui2` which completed (Clippy fixes applied) as part of
    verification.
    - Ran `cargo test -p codex-protocol` which passed (28 tests).
    - Ran `cargo test -p codex-core` which showed failures in a small set of
    tests (not caused by the protocol type change directly):
    `default_client::tests::test_create_client_sets_default_headers`,
    several `models_manager::manager::tests::refresh_available_models_*`,
    and `shell_snapshot::tests::linux_sh_snapshot_includes_sections` (these
    tests failed in this CI run).
    - Ran `cargo test -p codex-app-server` which reported several failing
    integration tests (including
    `suite::codex_message_processor_flow::test_codex_jsonrpc_conversation_flow`,
    `suite::output_schema::send_user_turn_*`, and
    `suite::user_agent::get_user_agent_returns_current_codex_user_agent`).
    - `cargo test -p codex-tui` and `cargo test -p codex-tui2` were
    attempted but aborted due to disk space exhaustion (`No space left on
    device`).
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_696bd8ce632483228d298cf07c7eb41c)
  • Introduce collaboration modes (#9340)
    - Merge `model` and `reasoning_effort` under collaboration modes.
    - Add additional instructions for custom collaboration mode
    - Default to Custom to not change behavior
  • feat: show forked from session id in /status (#9330)
    Summary:
    - Add forked_from to SessionMeta/SessionConfiguredEvent and persist it
    for forked sessions.
    - Surface forked_from in /status for tui + tui2 and add snapshots.
  • chore: upgrade to Rust 1.92.0 (#8860)
    **Summary**
    - Upgrade Rust toolchain used by CI to 1.92.0.
    - Address new clippy `derivable_impls` warnings by deriving `Default`
    for enums across protocol, core, backend openapi models, and
    windows-sandbox setup.
    - Tidy up related test/config behavior (originator header handling, env
    override cleanup) and remove a now-unused assignment in TUI/TUI2 render
    layout.
    
    **Testing**
    - `just fmt`
    - `just fix -p codex-tui`
    - `just fix -p codex-tui2`
    - `just fix -p codex-windows-sandbox`
    - `cargo test -p codex-tui`
    - `cargo test -p codex-tui2`
    - `cargo test -p codex-windows-sandbox`
    - `cargo test -p codex-core --test all`
    - `cargo test -p codex-app-server --test all`
    - `cargo test -p codex-mcp-server --test all`
    - `cargo test --all-features`
  • Add text element metadata to protocol, app server, and core (#9331)
    The second part of breaking up PR
    https://github.com/openai/codex/pull/9116
    
    Summary:
    
    - Add `TextElement` / `ByteRange` to protocol user inputs and user
    message events with defaults.
    - Thread `text_elements` through app-server v1/v2 request handling and
    history rebuild.
    - Preserve UI metadata only in user input/events (not `ContentItem`)
    while keeping local image attachments in user events for rehydration.
    
    Details:
    
    - Protocol: `UserInput::Text` carries `text_elements`;
    `UserMessageEvent` carries `text_elements` + `local_images`.
    Serialization includes empty vectors for backward compatibility.
    - app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in
    camelCase with conversions; v2 uses its own camelCase wrapper.
    - app-server: v1/v2 input mapping includes `text_elements`; thread
    history rebuilds include them.
    - Core: user event emission preserves UI metadata while model history
    stays clean; history replay round-trips the metadata.
  • Support SKILL.toml file. (#9125)
    We’re introducing a new SKILL.toml to hold skill metadata so Codex can
    deliver a richer Skills experience.
    
    Initial focus is the interface block:
    ```
    [interface]
    display_name = "Optional user-facing name"
    short_description = "Optional user-facing description"
    icon_small = "./assets/small-400px.png"
    icon_large = "./assets/large-logo.svg"
    brand_color = "#3B82F6"
    default_prompt = "Optional surrounding prompt to use the skill with"
    ```
    
    All fields are exposed via the app server API.
    display_name and short_description are consumed by the TUI.
  • Add migration_markdown in model_info (#9219)
    Next step would be to clean Model Upgrade in model presets
    
    ---------
    
    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
  • Add text element metadata to types (#9235)
    Initial type tweaking PR to make the diff of
    https://github.com/openai/codex/pull/9116 smaller
    
    This should not change any behavior, just adds some fields to types
  • add WebSearchMode enum (#9216)
    ### What
    Add `WebSearchMode` enum (disabled, cached live, defaults to cached) to
    config + V2 protocol. This enum takes precedence over legacy flags:
    `web_search_cached`, `web_search_request`, and `tools.web_search`.
    
    Keep `--search` as live.
    
    ### Tests
    Added tests
  • feat: emit events around collab tools (#9095)
    Emit the following events around the collab tools. On the `app-server`
    this will be under `item/started` and `item/completed`
    ```
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabAgentSpawnBeginEvent {
        /// Identifier for the collab tool call.
        pub call_id: String,
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
        /// beginning.
        pub prompt: String,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabAgentSpawnEndEvent {
        /// Identifier for the collab tool call.
        pub call_id: String,
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Thread ID of the newly spawned agent, if it was created.
        pub new_thread_id: Option<ThreadId>,
        /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
        /// beginning.
        pub prompt: String,
        /// Last known status of the new agent reported to the sender agent.
        pub status: AgentStatus,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabAgentInteractionBeginEvent {
        /// Identifier for the collab tool call.
        pub call_id: String,
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Thread ID of the receiver.
        pub receiver_thread_id: ThreadId,
        /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
        /// leaking at the beginning.
        pub prompt: String,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabAgentInteractionEndEvent {
        /// Identifier for the collab tool call.
        pub call_id: String,
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Thread ID of the receiver.
        pub receiver_thread_id: ThreadId,
        /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
        /// leaking at the beginning.
        pub prompt: String,
        /// Last known status of the receiver agent reported to the sender agent.
        pub status: AgentStatus,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabWaitingBeginEvent {
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Thread ID of the receiver.
        pub receiver_thread_id: ThreadId,
        /// ID of the waiting call.
        pub call_id: String,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabWaitingEndEvent {
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Thread ID of the receiver.
        pub receiver_thread_id: ThreadId,
        /// ID of the waiting call.
        pub call_id: String,
        /// Last known status of the receiver agent reported to the sender agent.
        pub status: AgentStatus,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabCloseBeginEvent {
        /// Identifier for the collab tool call.
        pub call_id: String,
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Thread ID of the receiver.
        pub receiver_thread_id: ThreadId,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
    pub struct CollabCloseEndEvent {
        /// Identifier for the collab tool call.
        pub call_id: String,
        /// Thread ID of the sender.
        pub sender_thread_id: ThreadId,
        /// Thread ID of the receiver.
        pub receiver_thread_id: ThreadId,
        /// Last known status of the receiver agent reported to the sender agent before
        /// the close.
        pub status: AgentStatus,
    }
    ```
  • feat: add support for read-only bind mounts in the linux sandbox (#9112)
    ### Motivation
    
    - Landlock alone cannot prevent writes to sensitive in-repo files like
    `.git/` when the repo root is writable, so explicit mount restrictions
    are required for those paths.
    - The sandbox must set up any mounts before calling Landlock so Landlock
    can still be applied afterwards and the two mechanisms compose
    correctly.
    
    ### Description
    
    - Add a new `linux-sandbox` helper `apply_read_only_mounts` in
    `linux-sandbox/src/mounts.rs` that: unshares namespaces, maps uids/gids
    when required, makes mounts private, bind-mounts targets, and remounts
    them read-only.
    - Wire the mount step into the sandbox flow by calling
    `apply_read_only_mounts(...)` before network/seccomp and before applying
    Landlock rules in `linux-sandbox/src/landlock.rs`.
  • clean models manager (#9168)
    Have only the following Methods:
    - `list_models`: getting current available models
    - `try_list_models`: sync version no refresh for tui use
    - `get_default_model`: get the default model (should be tightened to
    core and received on session configuration)
    - `get_model_info`: get `ModelInfo` for a specific model (should be
    tightened to core but used in tests)
    - `refresh_if_new_etag`: trigger refresh on different etags
    
    Also move the cache to its own struct
  • Use markdown for migration screen (#8952)
    Next steps will be routing this to model info
  • Add model client sessions (#9102)
    Maintain a long-running session.
  • Assemble sandbox/approval/network prompts dynamically (#8961)
    - Add a single builder for developer permissions messaging that accepts
    SandboxPolicy and approval policy. This builder now drives the developer
    “permissions” message that’s injected at session start and any time
    sandbox/approval settings change.
    - Trim EnvironmentContext to only include cwd, writable roots, and
    shell; removed sandbox/approval/network duplication and adjusted XML
    serialization and tests accordingly.
    
    Follow-up: adding a config value to replace the developer permissions
    message for custom sandboxes.
  • feat: hot reload mcp servers (#8957)
    ### Summary
    * Added `mcpServer/refresh` command to inform app servers and active
    threads to refresh mcpServer on next turn event.
    * Added `pending_mcp_server_refresh_config` to codex core so that if the
    value is populated, we reinitialize the mcp server manager on the thread
    level.
    * The config is updated on `mcpServer/refresh` command which we iterate
    through threads and provide with the latest config value after last
    write.