Commit Graph

71 Commits

  • Add thread/shellCommand to app server API surface (#14988)
    This PR adds a new `thread/shellCommand` app server API so clients can
    implement `!` shell commands. These commands are executed within the
    sandbox, and the command text and output are visible to the model.
    
    The internal implementation mirrors the current TUI `!` behavior.
    - persist shell command execution as `CommandExecution` thread items,
    including source and formatted output metadata
    - bridge live and replayed app-server command execution events back into
    the existing `tui_app_server` exec rendering path
    
    This PR also wires `tui_app_server` to submit `!` commands through the
    new API.
  • Revert "fix: harden plugin feature gating" (#15102)
    Reverts openai/codex#15020
    
    I messed up the commit in my PR and accidentally merged changes that
    were still under review.
  • fix: harden plugin feature gating (#15020)
    1. Use requirement-resolved config.features as the plugin gate.
    2. Guard plugin/list, plugin/read, and related flows behind that gate.
    3. Skip bad marketplace.json files instead of failing the whole list.
    4. Simplify plugin state and caching.
  • Add notify to code-mode (#14842)
    Allows model to send an out-of-band notification.
    
    The notification is injected as another tool call output for the same
    call_id.
  • Cleanup skills/remote/xxx endpoints. (#14977)
    Remote skills/remote/xxx as they are not in used for now.
  • generate an internal json schema for RolloutLine (#14434)
    ### Why
    i'm working on something that parses and analyzes codex rollout logs,
    and i'd like to have a schema for generating a parser/validator.
    
    `codex app-server generate-internal-json-schema` writes an
    `RolloutLine.json` file
    
    while doing this, i noticed we have a writer <> reader mismatch issue on
    `FunctionCallOutputPayload` and reasoning item ID -- added some schemars
    annotations to fix those
    
    ### Test
    
    ```
    $ just codex app-server generate-internal-json-schema --out ./foo
    ```
    
    generates an `RolloutLine.json` file, which i validated against jsonl
    files on disk
    
    `just codex app-server --help` doesn't expose the
    `generate-internal-json-schema` option by default, but you can do `just
    codex app-server generate-internal-json-schema --help` if you know the
    command
    
    everything else still works
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [stack 2/4] Align main realtime v2 wire and runtime flow (#14830)
    ## Stack Position
    2/4. Built on top of #14828.
    
    ## Base
    - #14828
    
    ## Unblocks
    - #14829
    - #14827
    
    ## Scope
    - Port the realtime v2 wire parsing, session, app-server, and
    conversation runtime behavior onto the split websocket-method base.
    - Branch runtime behavior directly on the current realtime session kind
    instead of parser-derived flow flags.
    - Keep regression coverage in the existing e2e suites.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: support remote_sync for plugin install/uninstall. (#14878)
    - Added forceRemoteSync to plugin/install and plugin/uninstall.
    - With forceRemoteSync=true, we update the remote plugin status first,
    then apply the local change only if the backend call succeeds.
    - Kept plugin/list(forceRemoteSync=true) as the main recon path, and for
    now it treats remote enabled=false as uninstall. We
    will eventually migrate to plugin/installed for more precise state
    handling.
  • dynamic tool calls: add param exposeToContext to optionally hide tool (#14501)
    This extends dynamic_tool_calls to allow us to hide a tool from the
    model context but still use it as part of the general tool calling
    runtime (for ex from js_repl/code_mode)
  • Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
    ## Summary
    - add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
    control for who reviews approval requests
    - route Smart Approvals guardian review through core for command
    execution, file changes, managed-network approvals, MCP approvals, and
    delegated/subagent approval flows
    - expose guardian review in app-server with temporary unstable
    `item/autoApprovalReview/{started,completed}` notifications carrying
    `targetItemId`, `review`, and `action`
    - update the TUI so Smart Approvals can be enabled from `/experimental`,
    aligned with the matching `/approvals` mode, and surfaced clearly while
    reviews are pending or resolved
    
    ## Runtime model
    This PR does not introduce a new `approval_policy`.
    
    Instead:
    - `approval_policy` still controls when approval is needed
    - `approvals_reviewer` controls who reviewable approval requests are
    routed to:
      - `user`
      - `guardian_subagent`
    
    `guardian_subagent` is a carefully prompted reviewer subagent that
    gathers relevant context and applies a risk-based decision framework
    before approving or denying the request.
    
    The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
    behavior keys off `approvals_reviewer`.
    
    When Smart Approvals is enabled from the TUI, it also switches the
    current `/approvals` settings to the matching Smart Approvals mode so
    users immediately see guardian review in the active thread:
    - `approval_policy = on-request`
    - `approvals_reviewer = guardian_subagent`
    - `sandbox_mode = workspace-write`
    
    Users can still change `/approvals` afterward.
    
    Config-load behavior stays intentionally narrow:
    - plain `smart_approvals = true` in `config.toml` remains just the
    rollout/UI gate and does not auto-set `approvals_reviewer`
    - the deprecated `guardian_approval = true` alias migration does
    backfill `approvals_reviewer = "guardian_subagent"` in the same scope
    when that reviewer is not already configured there, so old configs
    preserve their original guardian-enabled behavior
    
    ARC remains a separate safety check. For MCP tool approvals, ARC
    escalations now flow into the configured reviewer instead of always
    bypassing guardian and forcing manual review.
    
    ## Config stability
    The runtime reviewer override is stable, but the config-backed
    app-server protocol shape is still settling.
    
    - `thread/start`, `thread/resume`, and `turn/start` keep stable
    `approvalsReviewer` overrides
    - the config-backed `approvals_reviewer` exposure returned via
    `config/read` (including profile-level config) is now marked
    `[UNSTABLE]` / experimental in the app-server protocol until we are more
    confident in that config surface
    
    ## App-server surface
    This PR intentionally keeps the guardian app-server shape narrow and
    temporary.
    
    It adds generic unstable lifecycle notifications:
    - `item/autoApprovalReview/started`
    - `item/autoApprovalReview/completed`
    
    with payloads of the form:
    - `{ threadId, turnId, targetItemId, review, action? }`
    
    `review` is currently:
    - `{ status, riskScore?, riskLevel?, rationale? }`
    - where `status` is one of `inProgress`, `approved`, `denied`, or
    `aborted`
    
    `action` carries the guardian action summary payload from core when
    available. This lets clients render temporary standalone pending-review
    UI, including parallel reviews, even when the underlying tool item has
    not been emitted yet.
    
    These notifications are explicitly documented as `[UNSTABLE]` and
    expected to change soon.
    
    This PR does **not** persist guardian review state onto `thread/read`
    tool items. The intended follow-up is to attach guardian review state to
    the reviewed tool item lifecycle instead, which would improve
    consistency with manual approvals and allow thread history / reconnect
    flows to replay guardian review state directly.
    
    ## TUI behavior
    - `/experimental` exposes the rollout gate as `Smart Approvals`
    - enabling it in the TUI enables the feature and switches the current
    session to the matching Smart Approvals `/approvals` mode
    - disabling it in the TUI clears the persisted `approvals_reviewer`
    override when appropriate and returns the session to default manual
    review when the effective reviewer changes
    - `/approvals` still exposes the reviewer choice directly
    - the TUI renders:
    - pending guardian review state in the live status footer, including
    parallel review aggregation
      - resolved approval/denial state in history
    
    ## Scope notes
    This PR includes the supporting core/runtime work needed to make Smart
    Approvals usable end-to-end:
    - shell / unified-exec / apply_patch / managed-network / MCP guardian
    review
    - delegated/subagent approval routing into guardian review
    - guardian review risk metadata and action summaries for app-server/TUI
    - config/profile/TUI handling for `smart_approvals`, `guardian_approval`
    alias migration, and `approvals_reviewer`
    - a small internal cleanup of delegated approval forwarding to dedupe
    fallback paths and simplify guardian-vs-parent approval waiting (no
    intended behavior change)
    
    Out of scope for this PR:
    - redesigning the existing manual approval protocol shapes
    - persisting guardian review state onto app-server `ThreadItem`s
    - delegated MCP elicitation auto-review (the current delegated MCP
    guardian shim only covers the legacy `RequestUserInput` path)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • app-server: add v2 filesystem APIs (#14245)
    Add a protocol-level filesystem surface to the v2 app-server so Codex
    clients can read and write files, inspect directories, and subscribe to
    path changes without relying on host-specific helpers.
    
    High-level changes:
    - define the new v2 fs/readFile, fs/writeFile, fs/createDirectory,
    fs/getMetadata, fs/readDirectory, fs/remove, fs/copy RPCs
    - implement the app-server handlers, including absolute-path validation,
    base64 file payloads, recursive copy/remove semantics
    - document the API, regenerate protocol schemas/types, and add
    end-to-end tests for filesystem operations, copy edge cases
    
    Testing plan:
    - validate protocol serialization and generated schema output for the
    new fs request, response, and notification types
    - run app-server integration coverage for file and directory CRUD paths,
    metadata/readDirectory responses, copy failure modes, and absolute-path
    validation
  • feat: add plugin/read. (#14445)
    return more information for a specific plugin.
  • feat: search_tool migrate to bring you own tool of Responses API (#14274)
    ## Why
    
    to support a new bring your own search tool in Responses
    API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search)
    we migrating our bm25 search tool to use official way to execute search
    on client and communicate additional tools to the model.
    
    ## What
    - replace the legacy `search_tool_bm25` flow with client-executed
    `tool_search`
    - add protocol, SSE, history, and normalization support for
    `tool_search_call` and `tool_search_output`
    - return namespaced Codex Apps search results and wire namespaced
    follow-up tool calls back into MCP dispatch
  • chore(app-server): stop emitting codex/event/ notifications (#14392)
    ## Description
    
    This PR stops emitting legacy `codex/event/*` notifications from the
    public app-server transports.
    
    It's been a long time coming! app-server was still producing a raw
    notification stream from core, alongside the typed app-server
    notifications and server requests, for compatibility reasons. Now,
    external clients should no longer be depending on those legacy
    notifications, so this change removes them from the stdio and websocket
    contract and updates the surrounding docs, examples, and tests to match.
    
    ### Caveat
    I left the "in-process" version of app-server alone for now, since
    `codex exec` was recently based on top of app-server via this in-process
    form here: https://github.com/openai/codex/pull/14005
    
    Seems like `codex exec` still consumes some legacy notifications
    internally, so this branch only removes `codex/event/*` from app-server
    over stdio and websockets.
    
    ## Follow-up
    
    Once `codex exec` is fully migrated off `codex/event/*` notifications,
    we'll be able to stop emitting them entirely entirely instead of just
    filtering it at the external transport boundary.
  • chore: add a separate reject-policy flag for skill approvals (#14271)
    ## Summary
    - add `skill_approval` to `RejectConfig` and the app-server v2
    `AskForApproval::Reject` payload so skill-script prompts can be
    configured independently from sandbox and rule-based prompts
    - update Unix shell escalation to reject prompts based on the actual
    decision source, keeping prefix rules tied to `rules`, unmatched command
    fallbacks tied to `sandbox_approval`, and skill scripts tied to
    `skill_approval`
    - regenerate the affected protocol/config schemas and expand
    unit/integration coverage for the new flag and skill approval behavior
  • Add ephemeral flag support to thread fork (#14248)
    ### Summary
    This PR adds first-class ephemeral support to thread/fork, bringing it
    in line with thread/start. The goal is to support one-off completions on
    full forked threads without persisting them as normal user-visible
    threads.
    
    ### Testing
  • feat: Allow sync with remote plugin status. (#14176)
    Add forceRemoteSync to plugin/list.
    When it is set to True, we will sync the local plugin status with the
    remote one (backend-api/plugins/list).
  • fix(core) default RejectConfig.request_permissions (#14165)
    ## Summary
    Adds a default here so existing config deserializes
    
    ## Testing
    - [x] Added a unit test
  • feat(approvals) RejectConfig for request_permissions (#14118)
    ## Summary
    We need to support allowing request_permissions calls when using
    `Reject` policy
    
    <img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06
    40 PM"
    src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5"
    />
    
    Note that this is a backwards-incompatible change for Reject policy. I'm
    not sure if we need to add a default based on our current use/setup
    
    ## Testing
    - [x] Added tests
    - [x] Tested locally
  • chore: plugin/uninstall endpoint (#14111)
    add `plugin/uninstall` app-server endpoint to fully rm plugin from
    plugins cache dir and rm entry from user config file.
    
    plugin-enablement is session-scoped, so uninstalls are only picked up in
    new sessions (like installs).
    
    added tests.
  • app-server: include experimental skill metadata in exec approval requests (#13929)
    ## Summary
    
    This change surfaces skill metadata on command approval requests so
    app-server clients can tell when an approval came from a skill script
    and identify the originating `SKILL.md`.
    
    - add `skill_metadata` to exec approval events in the shared protocol
    - thread skill metadata through core shell escalation and delegated
    approval handling for skill-triggered approvals
    - expose the field in app-server v2 as experimental `skillMetadata`
    - regenerate the JSON/TypeScript schemas and cover the new field in
    protocol, transport, core, and TUI tests
    
    ## Why
    
    Skill-triggered approvals already carry skill context inside core, but
    app-server clients could not see which skill caused the prompt. Sending
    the skill metadata with the approval request makes it possible for
    clients to present better approval UX and connect the prompt back to the
    relevant skill definition.
    
    
    ## example event in app-server-v2
    verified that we see this event when experimental api is on:
    ```
    < {
    <   "id": 11,
    <   "method": "item/commandExecution/requestApproval",
    <   "params": {
    <     "additionalPermissions": {
    <       "fileSystem": null,
    <       "macos": {
    <         "accessibility": false,
    <         "automations": {
    <           "bundle_ids": [
    <             "com.apple.Notes"
    <           ]
    <         },
    <         "calendar": false,
    <         "preferences": "read_only"
    <       },
    <       "network": null
    <     },
    <     "approvalId": "25d600ee-5a3c-4746-8d17-e2e61fb4c563",
    <     "availableDecisions": [
    <       "accept",
    <       "acceptForSession",
    <       "cancel"
    <     ],
    <     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <     "commandActions": [
    <       {
    <         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <         "type": "unknown"
    <       }
    <     ],
    <     "cwd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes",
    <     "itemId": "call_jZp3xFpNg4D8iKAD49cvEvZy",
    <     "skillMetadata": {
    <       "pathToSkillsMd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/SKILL.md"
    <     },
    <     "threadId": "019ccc10-b7d3-7ff2-84fe-3a75e7681e69",
    <     "turnId": "019ccc10-b848-76f1-81b3-4a1fa225493f"
    <   }
    < }`
    ```
    
    & verified that this is the event when experimental api is off:
    ```
    < {
    <   "id": 13,
    <   "method": "item/commandExecution/requestApproval",
    <   "params": {
    <     "approvalId": "5fbbf776-261b-4cf8-899b-c125b547f2c0",
    <     "availableDecisions": [
    <       "accept",
    <       "acceptForSession",
    <       "cancel"
    <     ],
    <     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <     "commandActions": [
    <       {
    <         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <         "type": "unknown"
    <       }
    <     ],
    <     "cwd": "/Users/celia/code/codex/codex-rs",
    <     "itemId": "call_OV2DHzTgYcbYtWaTTBWlocOt",
    <     "threadId": "019ccc16-2a2b-7be1-8500-e00d45b892d4",
    <     "turnId": "019ccc16-2a8e-7961-98ec-649600e7d06a"
    <   }
    < }
    ```
  • [app-server] Support hot-reload user config when batch writing config. (#13839)
    - [x] Support hot-reload user config when batch writing config.
  • app-server: require absolute cwd for windowsSandbox/setupStart (#13833)
    ## Summary
    - require windowsSandbox/setupStart.cwd to be an AbsolutePathBuf
    - reject relative cwd values at request parsing instead of normalizing
    them later in the setup flow
    - add RPC-layer coverage for relative cwd rejection and update the
    checked-in protocol schemas/docs
    
    ## Why
    windowsSandbox/setupStart was carrying the client-provided cwd as a raw
    PathBuf for command_cwd while config derivation normalized the same
    value into an absolute policy_cwd.
    
    That left room for relative-path ambiguity in the setup path, especially
    for inputs like cwd: "repo". Making the RPC accept only absolute paths
    removes that split entirely: the handler now receives one
    already-validated absolute path and uses it for both config derivation
    and setup.
    
    This keeps the trust model unchanged. Trusted clients could already
    choose the session cwd; this change is only about making the setup RPC
    reject relative paths so command_cwd and policy_cwd cannot diverge.
    
    ## Testing
    - cargo test -p codex-app-server windows_sandbox_setup (run locally by
    user)
    - cargo test -p codex-app-server-protocol windows_sandbox (run locally
    by user)
  • feat(app-server-protocol): address naming conflicts in json schema exporter (#13819)
    This fixes a schema export bug where two different `WebSearchAction`
    types were getting merged under the same name in the app-server v2 JSON
    schema bundle.
    
    The problem was that v2 thread items use the app-server API's
    `WebSearchAction` with camelCase variants like `openPage`, while
    `ThreadResumeParams.history` and
    `RawResponseItemCompletedNotification.item` pull in the upstream
    `ResponseItem` graph, which uses the Responses API snake_case shape like
    `open_page`. During bundle generation we were flattening nested
    definitions into the v2 namespace by plain name, so the later definition
    could silently overwrite the earlier one.
    
    That meant clients generating code from the bundled schema could end up
    with the wrong `WebSearchAction` definition for v2 thread history. In
    practice this shows up on web search items reconstructed from rollout
    files with persisted extended history.
    
    This change does two things:
    - Gives the upstream Responses API schema a distinct JSON schema name:
    `ResponsesApiWebSearchAction`
    - Makes namespace-level schema definition collisions fail loudly instead
    of silently overwriting
  • app-server: Add streaming and tty/pty capabilities to command/exec (#13640)
    * Add an ability to stream stdin, stdout, and stderr
    * Streaming of stdout and stderr has a configurable cap for total amount
    of transmitted bytes (with an ability to disable it)
    * Add support for overriding environment variables
    * Add an ability to terminate running applications (using
    `command/exec/terminate`)
    * Add TTY/PTY support, with an ability to resize the terminal (using
    `command/exec/resize`)
  • feat: Add curated plugin marketplace + Metadata Cleanup. (#13712)
    1. Add a synced curated plugin marketplace and include it in marketplace
    discovery.
    2. Expose optional plugin.json interface metadata in plugin/list
    3. Tighten plugin and marketplace path handling using validated absolute
    paths.
    4. Let manifests override skill, MCP, and app config paths.
    5. Restrict plugin enablement/config loading to the user config layer so
    plugin enablement is at global level
  • support plugin/list. (#13540)
    Introduce a plugin/list which reads from local marketplace.json.
    Also update the signature for plugin/install.
  • plugin: support local-based marketplace.json + install endpoint. (#13422)
    Support marketplace.json that points to a local file, with
    ```
        "source":
        {
            "source": "local",
            "path": "./plugin-1"
        },
     ```
     
     Add a new plugin/install endpoint which add the plugin to the cache folder and enable it in config.toml.
  • feat(app-server-test-client): OTEL setup for tracing (#13493)
    ### Overview
    This PR:
    - Updates `app-server-test-client` to load OTEL settings from
    `$CODEX_HOME/config.toml` and initializes its own OTEL provider.
    - Add real client root spans to app-server test client traces.
    
    This updates `codex-app-server-test-client` so its Datadog traces
    reflect the full client-driven flow instead of a set of server spans
    stitched together under a synthetic parent.
    
    Before this change, the test client generated a fake `traceparent` once
    and reused it for every JSON-RPC request. That kept the requests in one
    trace, but there was no real client span at the top, so Datadog ended up
    showing the sequence in a slightly misleading way, where all RPCs were
    anchored under `initialize`.
    
    Now the test client:
    - loads OTEL settings from the normal Codex config path, including
    `$CODEX_HOME/config.toml` and existing --config overrides
    - initializes tracing the same way other Codex binaries do when trace
    export is enabled
    - creates a real client root span for each scripted command
    - creates per-request client spans for JSON-RPC methods like
    `initialize`, `thread/start`, and `turn/start`
    - injects W3C trace context from the current client span into
    request.trace instead of reusing a fabricated carrier
    
    This gives us a cleaner trace shape in Datadog:
    - one trace URL for the whole scripted flow
    - a visible client root span
    - proper client/server parent-child relationships for each app-server
    request
  • allow apps to specify cwd for sandbox setup. (#13484)
    The electron app doesn't start up the app-server in a particular
    workspace directory.
    So sandbox setup happens in the app-installed directory instead of the
    project workspace.
    
    This allows the app do specify the workspace cwd so that the sandbox
    setup actually sets up the ACLs instead of exiting fast and then having
    the first shell command be slow.
  • image-gen-core (#13290)
    Core tool-calling for image-gen, handles requesting and receiving logic
    for images using response API
  • Feat: Preserve network access on read-only sandbox policies (#13409)
    ## Summary
    
    `PermissionProfile.network` could not be preserved when additional or
    compiled permissions resolved to
    `SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access
    field. This change makes read-only + network
    enabled representable directly and threads that through the protocol,
    app-server v2 mirror, and permission-
      merging logic.
    
    ## What changed
    
    - Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core
    protocol and app-server v2 protocol.
    - Kept backward compatibility by defaulting the new field to false, so
    legacy read-only payloads still
        deserialize unchanged.
    - Updated `has_full_network_access()` and sandbox summaries to respect
    read-only network access.
      - Preserved PermissionProfile.network when:
          - compiling skill permission profiles into sandbox policies
          - normalizing additional permissions
          - merging additional permissions into existing sandbox policies
    - Updated the approval overlay to show network in the rendered
    permission rule when requested.
      - Regenerated app-server schema fixtures for the new v2 wire shape.
  • Add under-development original-resolution view_image support (#13050)
    ## Summary
    
    Add original-resolution support for `view_image` behind the
    under-development `view_image_original_resolution` feature flag.
    
    When the flag is enabled and the target model is `gpt-5.3-codex` or
    newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends
    `detail: "original"` to the Responses API instead of using the legacy
    resize/compress path.
    
    ## What changed
    
    - Added `view_image_original_resolution` as an under-development feature
    flag.
    - Added `ImageDetail` to the protocol models and support for serializing
    `detail: "original"` on tool-returned images.
    - Added `PromptImageMode::Original` to `codex-utils-image`.
      - Preserves original PNG/JPEG/WebP bytes.
      - Keeps legacy behavior for the resize path.
    - Updated `view_image` to:
    - use the shared `local_image_content_items_with_label_number(...)`
    helper in both code paths
      - select original-resolution mode only when:
        - the feature flag is enabled, and
        - the model slug parses as `gpt-5.3-codex` or newer
    - Kept local user image attachments on the existing resize path; this
    change is specific to `view_image`.
    - Updated history/image accounting so only `detail: "original"` images
    use the docs-based GPT-5 image cost calculation; legacy images still use
    the old fixed estimate.
    - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG
    at 85% quality unless lossless is required, while still allowing other
    formats when explicitly requested.
    - Updated tests and helper code that construct
    `FunctionCallOutputContentItem::InputImage` to carry the new `detail`
    field.
    
    ## Behavior
    
    ### Feature off
    - `view_image` keeps the existing resize/re-encode behavior.
    - History estimation keeps the existing fixed-cost heuristic.
    
    ### Feature on + `gpt-5.3-codex+`
    - `view_image` sends original-resolution images with `detail:
    "original"`.
    - PNG/JPEG/WebP source bytes are preserved when possible.
    - History estimation uses the GPT-5 docs-based image-cost calculation
    for those `detail: "original"` images.
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/13050
    -  `2` https://github.com/openai/codex/pull/13331
    -  `3` https://github.com/openai/codex/pull/13049
  • Add thread metadata update endpoint to app server (#13280)
    ## Summary
    - add the v2 `thread/metadata/update` API, including
    protocol/schema/TypeScript exports and app-server docs
    - patch stored thread `gitInfo` in sqlite without resuming the thread,
    with validation plus support for explicit `null` clears
    - repair missing sqlite thread rows from rollout data before patching,
    and make those repairs safe by inserting only when absent and updating
    only git columns so newer metadata is not clobbered
    - keep sqlite authoritative for mutable thread git metadata by
    preserving existing sqlite git fields during reconcile/backfill and only
    using rollout `SessionMeta` git fields to fill gaps
    - add regression coverage for the endpoint, repair paths, concurrent
    sqlite writes, clearing git fields, and rollout/backfill reconciliation
    - fix the login server shutdown race so cancelling before the waiter
    starts still terminates `block_until_done()` correctly
    
    ## Testing
    - `cargo test -p codex-state
    apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields`
    - `cargo test -p codex-state
    update_thread_git_info_preserves_newer_non_git_metadata`
    - `cargo test -p codex-core
    backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields`
    - `cargo test -p codex-app-server thread_metadata_update`
    - `cargo test`
    - currently fails in existing `codex-core` grep-files tests with
    `unsupported call: grep_files`:
        - `suite::grep_files::grep_files_tool_collects_matches`
        - `suite::grep_files::grep_files_tool_reports_empty_results`
  • app-server service tier plumbing (plus some cleanup) (#13334)
    followup to https://github.com/openai/codex/pull/13212 to expose fast
    tier controls to app server
    (majority of this PR is generated schema jsons - actual code is +69 /
    -35 and +24 tests )
    
    - add service tier fields to the app-server protocol surfaces used by
    thread lifecycle, turn start, config, and session configured events
    - thread service tier through the app-server message processor and core
    thread config snapshots
    - allow runtime config overrides to carry service tier for app-server
    callers
    
    cleanup:
    - Removing useless "legacy" code supporting "standard" - we moved to
    None | "fast", so "standard" is not needed.
  • Support multimodal custom tool outputs (#12948)
    ## Summary
    
    This changes `custom_tool_call_output` to use the same output payload
    shape as `function_call_output`, so freeform tools can return either
    plain text or structured content items.
    
    The main goal is to let `js_repl` return image content from nested
    `view_image` calls in its own `custom_tool_call_output`, instead of
    relying on a separate injected message.
    
    ## What changed
    
    - Changed `custom_tool_call_output.output` from `string` to
    `FunctionCallOutputPayload`
    - Updated freeform tool plumbing to preserve structured output bodies
    - Updated `js_repl` to aggregate nested tool content items and attach
    them to the outer `js_repl` result
    - Removed the old `js_repl` special case that injected `view_image`
    results as a separate pending user image message
    - Updated normalization/history/truncation paths to handle multimodal
    `custom_tool_call_output`
    - Regenerated app-server protocol schema artifacts
    
    ## Behavior
    
    Direct `view_image` calls still return a `function_call_output` with
    image content.
    
    When `view_image` is called inside `js_repl`, the outer `js_repl`
    `custom_tool_call_output` now carries:
    - an `input_text` item if the JS produced text output
    - one or more `input_image` items from nested tool results
    
    So the nested image result now stays inside the `js_repl` tool output
    instead of being injected as a separate message.
    
    ## Compatibility
    
    This is intended to be backward-compatible for resumed conversations.
    
    Older histories that stored `custom_tool_call_output.output` as a plain
    string still deserialize correctly, and older histories that used the
    previous injected-image-message flow also continue to resume.
    
    Added regression coverage for resuming a pre-change rollout containing:
    - string-valued `custom_tool_call_output`
    - legacy injected image message history
    
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/12948
  • feat(app-server): thread/unsubscribe API (#10954)
    Adds a new v2 app-server API for a client to be able to unsubscribe to a
    thread:
    - New RPC method: `thread/unsubscribe`
    - New server notification: `thread/closed`
    
    Today clients can start/resume/archive threads, but there wasn’t a way
    to explicitly unload a live thread from memory without archiving it.
    With `thread/unsubscribe`, a client can indicate it is no longer
    actively working with a live Thread. If this is the only client
    subscribed to that given thread, the thread will be automatically closed
    by app-server, at which point the server will send `thread/closed` and
    `thread/status/changed` with `status: notLoaded` notifications.
    
    This gives clients a way to prevent long-running app-server processes
    from accumulating too many thread (and related) objects in memory.
    
    Closed threads will also be removed from `thread/loaded/list`.
  • Add app-server v2 thread realtime API (#12715)
    Add experimental `thread/realtime/*` v2 requests and notifications, then
    route app-server realtime events through that thread-scoped surface with
    integration coverage.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Support external agent config detect and import (#12660)
    Migration Behavior
    
    * Config
      *  Migrates settings.json into config.toml
    * Only adds fields when config.toml is missing, or when those fields are
    missing from the existing file
      *  Supported mappings:
        env -> shell_environment_policy
         sandbox.enabled = true -> sandbox_mode = "workspace-write"
    
    * Skills
      *  Copies home and repo .claude/skills into .agents/skills
      *  Existing skill directories are not overwritten
      *  SKILL.md content is rewritten from Claude-related terms to Codex
    
    * AgentsMd
      *  Repo only
      *  Migrates CLAUDE.md into AGENTS.md
    * Detect/import only proceed when AGENTS.md is missing or present but
    empty
      *  Content is rewritten from Claude-related terms to Codex
  • feat: add search term to thread list (#12578)
    Add `searchTerm` to `thread/list` that will search for a match in the
    titles (the condition being `searchTerm` $$\in$$ `title`)
  • feat: add service name to app-server (#12319)
    Add service name to the app-server so that the app can use it's own
    service name
    
    This is on thread level because later we might plan the app-server to
    become a singleton on the computer
  • ignore v1 in JSON schema codegen (#12408)
    ## Why
    
    The generated unnamespaced JSON envelope schemas (`ClientRequest` and
    `ServerNotification`) still contained both v1 and v2 variants, which
    pulled legacy v1/core types and v2 types into the same `definitions`
    graph. That caused `schemars` to produce numeric suffix names (for
    example `AskForApproval2`, `ByteRange2`, `MessagePhase2`).
    
    This PR moves JSON codegen toward v2-only output while preserving the
    unnamespaced envelope artifacts, and avoids reintroducing numeric-suffix
    tolerance by removing the v1/internal-only variants that caused the
    collisions in those envelope schemas.
    
    ## What Changed
    
    - In `codex-rs/app-server-protocol/src/export.rs`, JSON generation now
    excludes v1 schema artifacts (`v1/*`) while continuing to emit
    unnamespaced/root JSON schemas and the JSON bundle.
    - Added a narrow JSON v1 allowlist (`JSON_V1_ALLOWLIST`) so
    `InitializeParams` and `InitializeResponse` are still emitted.
    - Added JSON-only post-processing for the mixed envelope schemas before
    collision checks run:
    - `ClientRequest`: strips v1 request variants from the generated `oneOf`
    using the temporary `V1_CLIENT_REQUEST_METHODS` list
    - `ServerNotification`: strips v1 notifications plus the internal-only
    `rawResponseItem/completed` notification using the temporary
    `EXCLUDED_SERVER_NOTIFICATION_METHODS_FOR_JSON` list
    - Added a temporary local-definition pruning pass for those envelope
    schemas so now-unreferenced v1/core definitions are removed from
    `definitions` after method filtering.
    - Updated the variant-title naming heuristic for single-property literal
    object variants to use the literal value (when available), avoiding
    collisions like multiple `state`-only variants all deriving the same
    title.
    - Collision handling remains fail-fast (no numeric suffix fallback map
    in this PR path).
    
    ## Verification
    
    - `just write-app-server-schema`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12408).
    * __->__ #12408
    * #12406
  • Add ability to attach extra files to feedback (#12370)
    Allow clients to provide extra files.
  • feat: add Reject approval policy with granular prompt rejection controls (#12087)
    ## Why
    
    We need a way to auto-reject specific approval prompt categories without
    switching all approvals off.
    
    The goal is to let users independently control:
    - sandbox escalation approvals,
    - execpolicy `prompt` rule approvals,
    - MCP elicitation prompts.
    
    ## What changed
    
    - Added a new primary approval mode in `protocol/src/protocol.rs`:
    
    ```rust
    pub enum AskForApproval {
        // ...
        Reject(RejectConfig),
        // ...
    }
    
    pub struct RejectConfig {
        pub sandbox_approval: bool,
        pub rules: bool,
        pub mcp_elicitations: bool,
    }
    ```
    
    - Wired `RejectConfig` semantics through approval paths in `core`:
      - `core/src/exec_policy.rs`
        - rejects rule-driven prompts when `rules = true`
        - rejects sandbox/escalation prompts when `sandbox_approval = true`
    - preserves rule priority when both rule and sandbox prompt conditions
    are present
      - `core/src/tools/sandboxing.rs`
    - applies `sandbox_approval` to default exec approval decisions and
    sandbox-failure retry gating
      - `core/src/safety.rs`
    - keeps `Reject { all false }` behavior aligned with `OnRequest` for
    patch safety
        - rejects out-of-root patch approvals when `sandbox_approval = true`
      - `core/src/mcp_connection_manager.rs`
        - auto-declines MCP elicitations when `mcp_elicitations = true`
    
    - Ensured approval policy used by MCP elicitation flow stays in sync
    with constrained session policy updates.
    
    - Updated app-server v2 conversions and generated schema/TypeScript
    artifacts for the new `Reject` shape.
    
    ## Verification
    
    Added focused unit coverage for the new behavior in:
    - `core/src/exec_policy.rs`
    - `core/src/tools/sandboxing.rs`
    - `core/src/mcp_connection_manager.rs`
    - `core/src/safety.rs`
    - `core/src/tools/runtimes/apply_patch.rs`
    
    Key cases covered include rule-vs-sandbox prompt precedence, MCP
    auto-decline behavior, and patch/sandbox retry behavior under
    `RejectConfig`.
  • app-server support for Windows sandbox setup. (#12025)
    app-server support for initiating Windows sandbox setup.
    server responds quickly to setup request and makes a future RPC call
    back to client when the setup finishes.
    
    The TUI implementation is unaffected but in a future PR I'll update the
    TUI to use the shared setup helper
    (`windows_sandbox.run_windows_sandbox_setup`)
  • app-server: Emit thread archive/unarchive notifications (#12030)
    * Add v2 server notifications `thread/archived` and `thread/unarchived`
    with a `threadId` payload.
    * Wire new events into `thread/archive` and `thread/unarchive` success
    paths.
    * Update app-server protocol/schema/docs accordingly.
    
    Testing:
    - Updated archive/unarchive end-to-end tests to verify both
    notifications are emitted with the expected thread id payload.
  • Add remote skill scope/product_surface/enabled params and cleanup (#11801)
    skills/remote/list: params=hazelnutScope, productSurface, enabled;
    returns=data: { id, name, description }[]
    skills/remote/export: params=hazelnutId; returns={ id, path }
  • fix: send unfiltered models over model/list (#11793)
    ### What
    to unblock filtering models in VSCE, change `model/list` app-server
    endpoint to send all models + visibility field `showInPicker` so
    filtering can be done in VSCE if desired.
    
    ### Tests
    Updated tests.