Commit Graph

162 Commits

  • Add realtime transcript notification in v2 (#15344)
    - emit a typed `thread/realtime/transcriptUpdated` notification from
    live realtime transcript deltas
    - expose that notification as flat `threadId`, `role`, and `text` fields
    instead of a nested transcript array
    - continue forwarding raw `handoff_request` items on
    `thread/realtime/itemAdded`, including the accumulated
    `active_transcript`
    - update app-server docs, tests, and generated protocol schema artifacts
    to match the delta-based payloads
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [plugins] Install MCPs when calling plugin/install (#15195)
    - [x] Auth MCPs when installing plugins.
  • feat(app-server): add mcpServer/startupStatus/updated notification (#15220)
    Exposes the legacy `codex/event/mcp_startup_update` event as an API v2
    notification.
    
    The legacy event has this shape:
    ```
    #[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS)]
    pub struct McpStartupUpdateEvent {
        /// Server name being started.
        pub server: String,
        /// Current startup status.
        pub status: McpStartupStatus,
    }
    
    #[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS)]
    #[serde(rename_all = "snake_case", tag = "state")]
    #[ts(rename_all = "snake_case", tag = "state")]
    pub enum McpStartupStatus {
        Starting,
        Ready,
        Failed { error: String },
        Cancelled,
    }
    ```
  • Add thread/shellCommand to app server API surface (#14988)
    This PR adds a new `thread/shellCommand` app server API so clients can
    implement `!` shell commands. These commands are executed within the
    sandbox, and the command text and output are visible to the model.
    
    The internal implementation mirrors the current TUI `!` behavior.
    - persist shell command execution as `CommandExecution` thread items,
    including source and formatted output metadata
    - bridge live and replayed app-server command execution events back into
    the existing `tui_app_server` exec rendering path
    
    This PR also wires `tui_app_server` to submit `!` commands through the
    new API.
  • Feat: reuse persisted model and reasoning effort on thread resume (#14888)
    ## Summary
    
    This PR makes `thread/resume` reuse persisted thread model metadata when
    the caller does not explicitly override it.
    
    Changes:
    - read persisted thread metadata from SQLite during `thread/resume`
    - reuse persisted `model` and `model_reasoning_effort` as resume-time
    defaults
    - fetch persisted metadata once and reuse it later in the resume
    response path
    - keep thread summary loading on the existing rollout path, while
    reusing persisted metadata when available
    - document the resume fallback behavior in the app-server README
    
    ## Why
    
    Before this change, resuming a thread without explicit overrides derived
    `model` and `model_reasoning_effort` from current config, which could
    drift from the thread’s last persisted values. That meant a resumed
    thread could report and run with different model settings than the ones
    it previously used.
    
    ## Behavior
    
    Precedence on `thread/resume` is now:
    1. explicit resume overrides
    2. persisted SQLite metadata for the thread
    3. normal config resolution for the resumed cwd
  • Revert "fix: harden plugin feature gating" (#15102)
    Reverts openai/codex#15020
    
    I messed up the commit in my PR and accidentally merged changes that
    were still under review.
  • fix: harden plugin feature gating (#15020)
    1. Use requirement-resolved config.features as the plugin gate.
    2. Guard plugin/list, plugin/read, and related flows behind that gate.
    3. Skip bad marketplace.json files instead of failing the whole list.
    4. Simplify plugin state and caching.
  • app-server: reject websocket requests with Origin headers (#14995)
    Reject websocket requests that carry an `Origin` header
  • Cleanup skills/remote/xxx endpoints. (#14977)
    Remote skills/remote/xxx as they are not in used for now.
  • Preserve background terminals on interrupt and rename cleanup command to /stop (#14602)
    ### Motivation
    - Interrupting a running turn (Ctrl+C / Esc) currently also terminates
    long‑running background shells, which is surprising for workflows like
    local dev servers or file watchers.
    - The existing cleanup command name was confusing; callers expect an
    explicit command to stop background terminals rather than a UI clear
    action.
    - Make background‑shell termination explicit and surface a clearer
    command name while preserving backward compatibility.
    
    ### Description
    - Renamed the background‑terminal cleanup slash command from `Clean`
    (`/clean`) to `Stop` (`/stop`) and kept `clean` as an alias in the
    command parsing/visibility layer, updated the user descriptions and
    command popup wiring accordingly.
    - Updated the unified‑exec footer text and snapshots to point to `/stop`
    (and trimmed corresponding snapshot output to match the new label).
    - Changed interrupt behavior so `Op::Interrupt` (Ctrl+C / Esc interrupt)
    no longer closes or clears tracked unified exec / background terminal
    processes in the TUI or core cleanup path; background shells are now
    preserved after an interrupt.
    - Updated protocol/docs to clarify that `turn/interrupt` (or
    `Op::Interrupt`) interrupts the active turn but does not terminate
    background terminals, and that `thread/backgroundTerminals/clean` is the
    explicit API to stop those shells.
    - Updated unit/integration tests and insta snapshots in the TUI and core
    unified‑exec suites to reflect the new semantics and command name.
    
    ### Testing
    - Ran formatting with `just fmt` in `codex-rs` (succeeded). 
    - Ran `cargo test -p codex-protocol` (succeeded). 
    - Attempted `cargo test -p codex-tui` but the build could not complete
    in this environment due to a native build dependency that requires
    `libcap` development headers (the `codex-linux-sandbox` vendored build
    step); install `libcap-dev` / make `libcap.pc` available in
    `PKG_CONFIG_PATH` to run the TUI test suite locally.
    - Updated and accepted the affected `insta` snapshots for the TUI
    changes so visual diffs reflect the new `/stop` wording and preserved
    interrupt behavior.
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_69b39c44b6dc8323bd133ae206310fae)
  • dynamic tool calls: add param exposeToContext to optionally hide tool (#14501)
    This extends dynamic_tool_calls to allow us to hide a tool from the
    model context but still use it as part of the general tool calling
    runtime (for ex from js_repl/code_mode)
  • Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
    ## Summary
    - add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
    control for who reviews approval requests
    - route Smart Approvals guardian review through core for command
    execution, file changes, managed-network approvals, MCP approvals, and
    delegated/subagent approval flows
    - expose guardian review in app-server with temporary unstable
    `item/autoApprovalReview/{started,completed}` notifications carrying
    `targetItemId`, `review`, and `action`
    - update the TUI so Smart Approvals can be enabled from `/experimental`,
    aligned with the matching `/approvals` mode, and surfaced clearly while
    reviews are pending or resolved
    
    ## Runtime model
    This PR does not introduce a new `approval_policy`.
    
    Instead:
    - `approval_policy` still controls when approval is needed
    - `approvals_reviewer` controls who reviewable approval requests are
    routed to:
      - `user`
      - `guardian_subagent`
    
    `guardian_subagent` is a carefully prompted reviewer subagent that
    gathers relevant context and applies a risk-based decision framework
    before approving or denying the request.
    
    The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
    behavior keys off `approvals_reviewer`.
    
    When Smart Approvals is enabled from the TUI, it also switches the
    current `/approvals` settings to the matching Smart Approvals mode so
    users immediately see guardian review in the active thread:
    - `approval_policy = on-request`
    - `approvals_reviewer = guardian_subagent`
    - `sandbox_mode = workspace-write`
    
    Users can still change `/approvals` afterward.
    
    Config-load behavior stays intentionally narrow:
    - plain `smart_approvals = true` in `config.toml` remains just the
    rollout/UI gate and does not auto-set `approvals_reviewer`
    - the deprecated `guardian_approval = true` alias migration does
    backfill `approvals_reviewer = "guardian_subagent"` in the same scope
    when that reviewer is not already configured there, so old configs
    preserve their original guardian-enabled behavior
    
    ARC remains a separate safety check. For MCP tool approvals, ARC
    escalations now flow into the configured reviewer instead of always
    bypassing guardian and forcing manual review.
    
    ## Config stability
    The runtime reviewer override is stable, but the config-backed
    app-server protocol shape is still settling.
    
    - `thread/start`, `thread/resume`, and `turn/start` keep stable
    `approvalsReviewer` overrides
    - the config-backed `approvals_reviewer` exposure returned via
    `config/read` (including profile-level config) is now marked
    `[UNSTABLE]` / experimental in the app-server protocol until we are more
    confident in that config surface
    
    ## App-server surface
    This PR intentionally keeps the guardian app-server shape narrow and
    temporary.
    
    It adds generic unstable lifecycle notifications:
    - `item/autoApprovalReview/started`
    - `item/autoApprovalReview/completed`
    
    with payloads of the form:
    - `{ threadId, turnId, targetItemId, review, action? }`
    
    `review` is currently:
    - `{ status, riskScore?, riskLevel?, rationale? }`
    - where `status` is one of `inProgress`, `approved`, `denied`, or
    `aborted`
    
    `action` carries the guardian action summary payload from core when
    available. This lets clients render temporary standalone pending-review
    UI, including parallel reviews, even when the underlying tool item has
    not been emitted yet.
    
    These notifications are explicitly documented as `[UNSTABLE]` and
    expected to change soon.
    
    This PR does **not** persist guardian review state onto `thread/read`
    tool items. The intended follow-up is to attach guardian review state to
    the reviewed tool item lifecycle instead, which would improve
    consistency with manual approvals and allow thread history / reconnect
    flows to replay guardian review state directly.
    
    ## TUI behavior
    - `/experimental` exposes the rollout gate as `Smart Approvals`
    - enabling it in the TUI enables the feature and switches the current
    session to the matching Smart Approvals `/approvals` mode
    - disabling it in the TUI clears the persisted `approvals_reviewer`
    override when appropriate and returns the session to default manual
    review when the effective reviewer changes
    - `/approvals` still exposes the reviewer choice directly
    - the TUI renders:
    - pending guardian review state in the live status footer, including
    parallel review aggregation
      - resolved approval/denial state in history
    
    ## Scope notes
    This PR includes the supporting core/runtime work needed to make Smart
    Approvals usable end-to-end:
    - shell / unified-exec / apply_patch / managed-network / MCP guardian
    review
    - delegated/subagent approval routing into guardian review
    - guardian review risk metadata and action summaries for app-server/TUI
    - config/profile/TUI handling for `smart_approvals`, `guardian_approval`
    alias migration, and `approvals_reviewer`
    - a small internal cleanup of delegated approval forwarding to dedupe
    fallback paths and simplify guardian-vs-parent approval waiting (no
    intended behavior change)
    
    Out of scope for this PR:
    - redesigning the existing manual approval protocol shapes
    - persisting guardian review state onto app-server `ThreadItem`s
    - delegated MCP elicitation auto-review (the current delegated MCP
    guardian shim only covers the legacy `RequestUserInput` path)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • app-server: add v2 filesystem APIs (#14245)
    Add a protocol-level filesystem surface to the v2 app-server so Codex
    clients can read and write files, inspect directories, and subscribe to
    path changes without relying on host-specific helpers.
    
    High-level changes:
    - define the new v2 fs/readFile, fs/writeFile, fs/createDirectory,
    fs/getMetadata, fs/readDirectory, fs/remove, fs/copy RPCs
    - implement the app-server handlers, including absolute-path validation,
    base64 file payloads, recursive copy/remove semantics
    - document the API, regenerate protocol schemas/types, and add
    end-to-end tests for filesystem operations, copy edge cases
    
    Testing plan:
    - validate protocol serialization and generated schema output for the
    new fs request, response, and notification types
    - run app-server integration coverage for file and directory CRUD paths,
    metadata/readDirectory responses, copy failure modes, and absolute-path
    validation
  • app-server: Add platform os and family to init response (#14527)
    This allows the client to pick os-specific behavior while interacting
    with the app server, e.g. to use proper path separators.
  • feat: add plugin/read. (#14445)
    return more information for a specific plugin.
  • chore: use AVAILABLE and ON_INSTALL as default plugin install and auth policies (#14407)
    make `AVAILABLE` the default plugin installPolicy when unset in
    `marketplace.json`. similarly, make `ON_INSTALL` the default authPolicy.
    
    this means, when unset, plugins are available to be installed (but not
    auto-installed), and the contained connectors will be authed at
    install-time.
    
    updated tests.
  • chore(app-server): stop emitting codex/event/ notifications (#14392)
    ## Description
    
    This PR stops emitting legacy `codex/event/*` notifications from the
    public app-server transports.
    
    It's been a long time coming! app-server was still producing a raw
    notification stream from core, alongside the typed app-server
    notifications and server requests, for compatibility reasons. Now,
    external clients should no longer be depending on those legacy
    notifications, so this change removes them from the stdio and websocket
    contract and updates the surrounding docs, examples, and tests to match.
    
    ### Caveat
    I left the "in-process" version of app-server alone for now, since
    `codex exec` was recently based on top of app-server via this in-process
    form here: https://github.com/openai/codex/pull/14005
    
    Seems like `codex exec` still consumes some legacy notifications
    internally, so this branch only removes `codex/event/*` from app-server
    over stdio and websockets.
    
    ## Follow-up
    
    Once `codex exec` is fully migrated off `codex/event/*` notifications,
    we'll be able to stop emitting them entirely entirely instead of just
    filtering it at the external transport boundary.
  • chore: wire through plugin policies + category from marketplace.json (#14305)
    wire plugin marketplace metadata through app-server endpoints:
    - `plugin/list` has `installPolicy` and `authPolicy`
    - `plugin/install` has plugin-level `authPolicy`
    
    `plugin/install` also now enforces `NOT_AVAILABLE` `installPolicy` when
    installing.
    
    
    added tests.
  • Add ephemeral flag support to thread fork (#14248)
    ### Summary
    This PR adds first-class ephemeral support to thread/fork, bringing it
    in line with thread/start. The goal is to support one-off completions on
    full forked threads without persisting them as normal user-visible
    threads.
    
    ### Testing
  • app-server: propagate nested experimental gating for AskForApproval::Reject (#14191)
    ## Summary
    This change makes `AskForApproval::Reject` gate correctly anywhere it
    appears inside otherwise-stable app-server protocol types.
    
    Previously, experimental gating for `approval_policy: Reject` was
    handled with request-specific logic in `ClientRequest` detection. That
    covered a few request params types, but it did not generalize to other
    nested uses such as `ProfileV2`, `Config`, `ConfigReadResponse`, or
    `ConfigRequirements`.
    
    This PR replaces that ad hoc handling with a generic nested experimental
    propagation mechanism.
    
    ## Testing
    
    seeing this when run app-server-test-client without experimental api
    enabled:
    ```
     initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.3.1; arm64) vscode/2.4.36 (codex-toy-app-server; 0.0.0)" }
    > {
    >   "id": "50244f6a-270a-425d-ace0-e9e98205bde7",
    >   "method": "thread/start",
    >   "params": {
    >     "approvalPolicy": {
    >       "reject": {
    >         "mcp_elicitations": false,
    >         "request_permissions": true,
    >         "rules": false,
    >         "sandbox_approval": true
    >       }
    >     },
    >     "baseInstructions": null,
    >     "config": null,
    >     "cwd": null,
    >     "developerInstructions": null,
    >     "dynamicTools": null,
    >     "ephemeral": null,
    >     "experimentalRawEvents": false,
    >     "mockExperimentalField": null,
    >     "model": null,
    >     "modelProvider": null,
    >     "persistExtendedHistory": false,
    >     "personality": null,
    >     "sandbox": null,
    >     "serviceName": null
    >   }
    > }
    < {
    <   "error": {
    <     "code": -32600,
    <     "message": "askForApproval.reject requires experimentalApi capability"
    <   },
    <   "id": "50244f6a-270a-425d-ace0-e9e98205bde7"
    < }
    [verified] thread/start rejected approvalPolicy=Reject without experimentalApi
    ```
    
    ---------
    
    Co-authored-by: celia-oai <celia@openai.com>
  • feat: Allow sync with remote plugin status. (#14176)
    Add forceRemoteSync to plugin/list.
    When it is set to True, we will sync the local plugin status with the
    remote one (backend-api/plugins/list).
  • feat(approvals) RejectConfig for request_permissions (#14118)
    ## Summary
    We need to support allowing request_permissions calls when using
    `Reject` policy
    
    <img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06
    40 PM"
    src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5"
    />
    
    Note that this is a backwards-incompatible change for Reject policy. I'm
    not sure if we need to add a default based on our current use/setup
    
    ## Testing
    - [x] Added tests
    - [x] Tested locally
  • codex-rs/app-server: add health endpoints for --listen websocket server (#13782)
    Healthcheck endpoints for the websocket server
    
    - serve `GET /readyz` and `GET /healthz` from the same listener used for
    `--listen ws://...`
    - switch the websocket listener over to `axum` upgrade handling instead
    of manual socket parsing
    - add websocket transport coverage for the health endpoints and document
    the new behavior
    
    Testing
    - integration tests
    - built and tested e2e
    
    ```
    > curl -i http://127.0.0.1:9234/readyz
    HTTP/1.1 200 OK
    content-length: 0
    date: Fri, 06 Mar 2026 19:20:23 GMT
    
    >  curl -i http://127.0.0.1:9234/healthz
    HTTP/1.1 200 OK
    content-length: 0
    date: Fri, 06 Mar 2026 19:20:24 GMT
    ```
  • feat(core) Persist request_permission data across turns (#14009)
    ## Summary
    request_permissions flows should support persisting results for the
    session.
    
    Open Question: Still deciding if we need within-turn approvals - this
    adds complexity but I could see it being useful
    
    ## Testing
    - [x] Updated unit tests
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • chore: plugin/uninstall endpoint (#14111)
    add `plugin/uninstall` app-server endpoint to fully rm plugin from
    plugins cache dir and rm entry from user config file.
    
    plugin-enablement is session-scoped, so uninstalls are only picked up in
    new sessions (like installs).
    
    added tests.
  • Add request permissions tool (#13092)
    Adds a built-in `request_permissions` tool and wires it through the
    Codex core, protocol, and app-server layers so a running turn can ask
    the client for additional permissions instead of relying on a static
    session policy.
    
    The new flow emits a `RequestPermissions` event from core, tracks the
    pending request by call ID, forwards it through app-server v2 as an
    `item/permissions/requestApproval` request, and resumes the tool call
    once the client returns an approved subset of the requested permission
    profile.
  • [app-server] Support hot-reload user config when batch writing config. (#13839)
    - [x] Support hot-reload user config when batch writing config.
  • app-server: require absolute cwd for windowsSandbox/setupStart (#13833)
    ## Summary
    - require windowsSandbox/setupStart.cwd to be an AbsolutePathBuf
    - reject relative cwd values at request parsing instead of normalizing
    them later in the setup flow
    - add RPC-layer coverage for relative cwd rejection and update the
    checked-in protocol schemas/docs
    
    ## Why
    windowsSandbox/setupStart was carrying the client-provided cwd as a raw
    PathBuf for command_cwd while config derivation normalized the same
    value into an absolute policy_cwd.
    
    That left room for relative-path ambiguity in the setup path, especially
    for inputs like cwd: "repo". Making the RPC accept only absolute paths
    removes that split entirely: the handler now receives one
    already-validated absolute path and uses it for both config derivation
    and setup.
    
    This keeps the trust model unchanged. Trusted clients could already
    choose the session cwd; this change is only about making the setup RPC
    reject relative paths so command_cwd and policy_cwd cannot diverge.
    
    ## Testing
    - cargo test -p codex-app-server windows_sandbox_setup (run locally by
    user)
    - cargo test -p codex-app-server-protocol windows_sandbox (run locally
    by user)
  • app-server: Add streaming and tty/pty capabilities to command/exec (#13640)
    * Add an ability to stream stdin, stdout, and stderr
    * Streaming of stdout and stderr has a configurable cap for total amount
    of transmitted bytes (with an ability to disable it)
    * Add support for overriding environment variables
    * Add an ability to terminate running applications (using
    `command/exec/terminate`)
    * Add TTY/PTY support, with an ability to resize the terminal (using
    `command/exec/resize`)
  • feat: Add curated plugin marketplace + Metadata Cleanup. (#13712)
    1. Add a synced curated plugin marketplace and include it in marketplace
    discovery.
    2. Expose optional plugin.json interface metadata in plugin/list
    3. Tighten plugin and marketplace path handling using validated absolute
    paths.
    4. Let manifests override skill, MCP, and app config paths.
    5. Restrict plugin enablement/config loading to the user config layer so
    plugin enablement is at global level
  • feat: structured plugin parsing (#13711)
    #### What
    
    Add structured `@plugin` parsing and TUI support for plugin mentions.
    
    - Core: switch from plain-text `@display_name` parsing to structured
    `plugin://...` mentions via `UserInput::Mention` and
    `[$...](plugin://...)` links in text, same pattern as apps/skills.
    - TUI: add plugin mention popup, autocomplete, and chips when typing
    `$`. Load plugin capability summaries and feed them into the composer;
    plugin mentions appear alongside skills and apps.
    - Generalize mention parsing to a sigil parameter, still defaults to `$`
    
    <img width="797" height="119" alt="image"
    src="https://github.com/user-attachments/assets/f0fe2658-d908-4927-9139-73f850805ceb"
    />
    
    Builds on #13510. Currently clients have to build their own `id` via
    `plugin@marketplace` and filter plugins to show by `enabled`, but we
    will add `id` and `available` as fields returned from `plugin/list`
    soon.
    
    ####Tests
    
    Added tests, verified locally.
  • check app auth in plugin/install (#13685)
    #### What
    on `plugin/install`, check if installed apps are already authed on
    chatgpt, and return list of all apps that are not. clients can use this
    list to trigger auth workflows as needed.
    
    checks are best effort based on `codex_apps` loading, much like
    `app/list`.
    
    #### Tests
    Added integration tests, tested locally.
  • support plugin/list. (#13540)
    Introduce a plugin/list which reads from local marketplace.json.
    Also update the signature for plugin/install.
  • feat(app-server): support mcp elicitations in v2 api (#13425)
    This adds a first-class server request for MCP server elicitations:
    `mcpServer/elicitation/request`.
    
    Until now, MCP elicitation requests only showed up as a raw
    `codex/event/elicitation_request` event from core. That made it hard for
    v2 clients to handle elicitations using the same request/response flow
    as other server-driven interactions (like shell and `apply_patch`
    tools).
    
    This also updates the underlying MCP elicitation request handling in
    core to pass through the full MCP request (including URL and form data)
    so we can expose it properly in app-server.
    
    ### Why not `item/mcpToolCall/elicitationRequest`?
    This is because MCP elicitations are related to MCP servers first, and
    only optionally to a specific MCP tool call.
    
    In the MCP protocol, elicitation is a server-to-client capability: the
    server sends `elicitation/create`, and the client replies with an
    elicitation result. RMCP models it that way as well.
    
    In practice an elicitation is often triggered by an MCP tool call, but
    not always.
    
    ### What changed
    - add `mcpServer/elicitation/request` to the v2 app-server API
    - translate core `codex/event/elicitation_request` events into the new
    v2 server request
    - map client responses back into `Op::ResolveElicitation` so the MCP
    server can continue
    - update app-server docs and generated protocol schema
    - add an end-to-end app-server test that covers the full round trip
    through a real RMCP elicitation flow
    - The new test exercises a realistic case where an MCP tool call
    triggers an elicitation, the app-server emits
    mcpServer/elicitation/request, the client accepts it, and the tool call
    resumes and completes successfully.
    
    ### app-server API flow
    - Client starts a thread with `thread/start`.
    - Client starts a turn with `turn/start`.
    - App-server sends `item/started` for the `mcpToolCall`.
    - While that tool call is in progress, app-server sends
    `mcpServer/elicitation/request`.
    - Client responds to that request with `{ action: "accept" | "decline" |
    "cancel" }`.
    - App-server sends `serverRequest/resolved`.
    - App-server sends `item/completed` for the mcpToolCall.
    - App-server sends `turn/completed`.
    - If the turn is interrupted while the elicitation is pending,
    app-server still sends `serverRequest/resolved` before the turn
    finishes.
  • plugin: support local-based marketplace.json + install endpoint. (#13422)
    Support marketplace.json that points to a local file, with
    ```
        "source":
        {
            "source": "local",
            "path": "./plugin-1"
        },
     ```
     
     Add a new plugin/install endpoint which add the plugin to the cache folder and enable it in config.toml.
  • allow apps to specify cwd for sandbox setup. (#13484)
    The electron app doesn't start up the app-server in a particular
    workspace directory.
    So sandbox setup happens in the app-installed directory instead of the
    project workspace.
    
    This allows the app do specify the workspace cwd so that the sandbox
    setup actually sets up the ACLs instead of exiting fast and then having
    the first shell command be slow.
  • chore: Nest skill and protocol network permissions under network.enabled (#13427)
    ## Summary
    
    Changes the permission profile shape from a bare network boolean to a
    nested object.
    
    Before:
    
    ```yaml
    permissions:
      network: true
    ```
    
    After:
    
    ```yaml
    permissions:
      network:
        enabled: true
    ```
    
    This also updates the shared Rust and app-server protocol types so
    `PermissionProfile.network` is no longer `Option<bool>`, but
    `Option<NetworkPermissions>` with `enabled: Option<bool>`.
    
    ## What Changed
    
    - Updated `PermissionProfile` in `codex-rs/protocol/src/models.rs`:
    - `pub network: Option<bool>` -> `pub network:
    Option<NetworkPermissions>`
    - Added `NetworkPermissions` with:
      - `pub enabled: Option<bool>`
    - Changed emptiness semantics so `network` is only considered empty when
    `enabled` is `None`
    - Updated skill metadata parsing to accept `permissions.network.enabled`
    - Updated core permission consumers to read
    `network.enabled.unwrap_or(false)` where a concrete boolean is needed
    - Updated app-server v2 protocol types and regenerated schema/TypeScript
    outputs
    - Updated docs to mention `additionalPermissions.network.enabled`
  • config: enforce enterprise feature requirements (#13388)
    ## Why
    
    Enterprises can already constrain approvals, sandboxing, and web search
    through `requirements.toml` and MDM, but feature flags were still only
    configurable as managed defaults. That meant an enterprise could suggest
    feature values, but it could not actually pin them.
    
    This change closes that gap and makes enterprise feature requirements
    behave like the other constrained settings. The effective feature set
    now stays consistent with enterprise requirements during config load,
    when config writes are validated, and when runtime code mutates feature
    flags later in the session.
    
    It also tightens the runtime API for managed features. `ManagedFeatures`
    now follows the same constraint-oriented shape as `Constrained<T>`
    instead of exposing panic-prone mutation helpers, and production code
    can no longer construct it through an unconstrained `From<Features>`
    path.
    
    The PR also hardens the `compact_resume_fork` integration coverage on
    Windows. After the feature-management changes,
    `compact_resume_after_second_compaction_preserves_history` was
    overflowing the libtest/Tokio thread stacks on Windows, so the test now
    uses an explicit larger-stack harness as a pragmatic mitigation. That
    may not be the ideal root-cause fix, and it merits a parallel
    investigation into whether part of the async future chain should be
    boxed to reduce stack pressure instead.
    
    ## What Changed
    
    Enterprises can now pin feature values in `requirements.toml` with the
    requirements-side `features` table:
    
    ```toml
    [features]
    personality = true
    unified_exec = false
    ```
    
    Only canonical feature keys are allowed in the requirements `features`
    table; omitted keys remain unconstrained.
    
    - Added a requirements-side pinned feature map to
    `ConfigRequirementsToml`, threaded it through source-preserving
    requirements merge and normalization in `codex-config`, and made the
    TOML surface use `[features]` (while still accepting legacy
    `[feature_requirements]` for compatibility).
    - Exposed `featureRequirements` from `configRequirements/read`,
    regenerated the JSON/TypeScript schema artifacts, and updated the
    app-server README.
    - Wrapped the effective feature set in `ManagedFeatures`, backed by
    `ConstrainedWithSource<Features>`, and changed its API to mirror
    `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`,
    and result-returning `enable` / `disable` / `set_enabled` helpers.
    - Removed the legacy-usage and bulk-map passthroughs from
    `ManagedFeatures`; callers that need those behaviors now mutate a plain
    `Features` value and reapply it through `set(...)`, so the constrained
    wrapper remains the enforcement boundary.
    - Removed the production loophole for constructing unconstrained
    `ManagedFeatures`. Non-test code now creates it through the configured
    feature-loading path, and `impl From<Features> for ManagedFeatures` is
    restricted to `#[cfg(test)]`.
    - Rejected legacy feature aliases in enterprise feature requirements,
    and return a load error when a pinned combination cannot survive
    dependency normalization.
    - Validated config writes against enterprise feature requirements before
    persisting changes, including explicit conflicting writes and
    profile-specific feature states that normalize into invalid
    combinations.
    - Updated runtime and TUI feature-toggle paths to use the constrained
    setter API and to persist or apply the effective post-constraint value
    rather than the requested value.
    - Updated the `core_test_support` Bazel target to include the bundled
    core model-catalog fixtures in its runtime data, so helper code that
    resolves `core/models.json` through runfiles works in remote Bazel test
    environments.
    - Renamed the core config test coverage to emphasize that effective
    feature values are normalized at runtime, while conflicting persisted
    config writes are rejected.
    - Ran `compact_resume_after_second_compaction_preserves_history` inside
    an explicit 8 MiB test thread and Tokio runtime worker stack, following
    the existing larger-stack integration-test pattern, to keep the Windows
    `compact_resume_fork` test slice from aborting while a parallel
    investigation continues into whether some of the underlying async
    futures should be boxed.
    
    ## Verification
    
    - `cargo test -p codex-config`
    - `cargo test -p codex-core feature_requirements_ -- --nocapture`
    - `cargo test -p codex-core
    load_requirements_toml_produces_expected_constraints -- --nocapture`
    - `cargo test -p codex-core
    compact_resume_after_second_compaction_preserves_history -- --nocapture`
    - `cargo test -p codex-core compact_resume_fork -- --nocapture`
    - Re-ran the built `codex-core` `tests/all` binary with
    `RUST_MIN_STACK=262144` for
    `compact_resume_after_second_compaction_preserves_history` to confirm
    the explicit-stack harness fixes the deterministic low-stack repro.
    - `cargo test -p codex-core`
    - This still fails locally in unrelated integration areas that expect
    the `codex` / `test_stdio_server` binaries or hit existing `search_tool`
    wiremock mismatches.
    
    ## Docs
    
    `developers.openai.com/codex` should document the requirements-side
    `[features]` table for enterprise and MDM-managed configuration,
    including that it only accepts canonical feature keys and that
    conflicting config writes are rejected.
  • feat(app-server): add a skills/changed v2 notification (#13414)
    This adds a first-class app-server v2 `skills/changed` notification for
    the existing skills live-reload signal.
    
    Before this change, clients only had the legacy raw
    `codex/event/skills_update_available` event. With this PR, v2 clients
    can listen for a typed JSON-RPC notification instead of depending on the
    legacy `codex/event/*` stream, which we want to remove soon.
  • Add thread metadata update endpoint to app server (#13280)
    ## Summary
    - add the v2 `thread/metadata/update` API, including
    protocol/schema/TypeScript exports and app-server docs
    - patch stored thread `gitInfo` in sqlite without resuming the thread,
    with validation plus support for explicit `null` clears
    - repair missing sqlite thread rows from rollout data before patching,
    and make those repairs safe by inserting only when absent and updating
    only git columns so newer metadata is not clobbered
    - keep sqlite authoritative for mutable thread git metadata by
    preserving existing sqlite git fields during reconcile/backfill and only
    using rollout `SessionMeta` git fields to fill gaps
    - add regression coverage for the endpoint, repair paths, concurrent
    sqlite writes, clearing git fields, and rollout/backfill reconciliation
    - fix the login server shutdown race so cancelling before the waiter
    starts still terminates `block_until_done()` correctly
    
    ## Testing
    - `cargo test -p codex-state
    apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields`
    - `cargo test -p codex-state
    update_thread_git_info_preserves_newer_non_git_metadata`
    - `cargo test -p codex-core
    backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields`
    - `cargo test -p codex-app-server thread_metadata_update`
    - `cargo test`
    - currently fails in existing `codex-core` grep-files tests with
    `unsupported call: grep_files`:
        - `suite::grep_files::grep_files_tool_collects_matches`
        - `suite::grep_files::grep_files_tool_reports_empty_results`
  • app-server: Silence thread status changes caused by thread being created (#13079)
    Currently we emit `thread/status/changed` with `Idle` status right
    before sending `thread/started` event (which also has `Idle` status in
    it).
    It feels that there is no point in that as client has no way to know
    prior state of the thread as it didn't exist yet, so silence these kinds
    of notifications.
  • fix(app-server): emit turn/started only when turn actually starts (#13261)
    This is a follow-up for https://github.com/openai/codex/pull/13047
    
    ## Why
    We had a race where `turn/started` could be observed before the thread
    had actually transitioned to `Active`. This was because we eagerly
    emitted `turn/started` in the request handler for `turn/start` (and
    `review/start`).
    
    That was showing up as flaky `thread/resume` tests, but the real issue
    was broader: a client could see `turn/started` and still get back an
    idle thread immediately afterward.
    
    The first idea was to eagerly call
    `thread_watch_manager.note_turn_started(...)` from the `turn/start`
    request path. That turns out to be unsafe, because
    `submit(Op::UserInput)` only queues work. If a turn starts and completes
    quickly, request-path bookkeeping can race with the real lifecycle
    events and leave stale running state behind.
    
    **The real fix** is to move `turn/started` to emit only after the turn
    _actually_ starts, so we do that by waiting for the
    `EventMsg::TurnStarted` notification emitted by codex core. We do this
    for both `turn/start` and `review/start`.
    
    I also verified this change is safe for our first-party codex apps -
    they don't have any assumptions that `turn/started` is emitted before
    the RPC response to `turn/start` (which is correct anyway).
    
    I also removed `single_client_mode` since it isn't really necessary now.
    
    ## Testing
    - `cargo test -p codex-app-server thread_resume -- --nocapture`
    - `cargo test -p codex-app-server
    'suite::v2::turn_start::turn_start_emits_notifications_and_accepts_model_override'
    -- --exact --nocapture`
    - `cargo test -p codex-app-server`