Commit Graph

50 Commits

  • Apply argument comment lint across codex-rs (#14652)
    ## Why
    
    Once the repo-local lint exists, `codex-rs` needs to follow the
    checked-in convention and CI needs to keep it from drifting. This commit
    applies the fallback `/*param*/` style consistently across existing
    positional literal call sites without changing those APIs.
    
    The longer-term preference is still to avoid APIs that require comments
    by choosing clearer parameter types and call shapes. This PR is
    intentionally the mechanical follow-through for the places where the
    existing signatures stay in place.
    
    After rebasing onto newer `main`, the rollout also had to cover newly
    introduced `tui_app_server` call sites. That made it clear the first cut
    of the CI job was too expensive for the common path: it was spending
    almost as much time installing `cargo-dylint` and re-testing the lint
    crate as a representative test job spends running product tests. The CI
    update keeps the full workspace enforcement but trims that extra
    overhead from ordinary `codex-rs` PRs.
    
    ## What changed
    
    - keep a dedicated `argument_comment_lint` job in `rust-ci`
    - mechanically annotate remaining opaque positional literals across
    `codex-rs` with exact `/*param*/` comments, including the rebased
    `tui_app_server` call sites that now fall under the lint
    - keep the checked-in style aligned with the lint policy by using
    `/*param*/` and leaving string and char literals uncommented
    - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
    registry/git metadata in the lint job
    - split changed-path detection so the lint crate's own `cargo test` step
    runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
    - continue to run the repo wrapper over the `codex-rs` workspace, so
    product-code enforcement is unchanged
    
    Most of the code changes in this commit are intentionally mechanical
    comment rewrites or insertions driven by the lint itself.
    
    ## Verification
    
    - `./tools/argument-comment-lint/run.sh --workspace`
    - `cargo test -p codex-tui-app-server -p codex-tui`
    - parsed `.github/workflows/rust-ci.yml` locally with PyYAML
    
    ---
    
    * -> #14652
    * #14651
  • chore(app-server): stop emitting codex/event/ notifications (#14392)
    ## Description
    
    This PR stops emitting legacy `codex/event/*` notifications from the
    public app-server transports.
    
    It's been a long time coming! app-server was still producing a raw
    notification stream from core, alongside the typed app-server
    notifications and server requests, for compatibility reasons. Now,
    external clients should no longer be depending on those legacy
    notifications, so this change removes them from the stdio and websocket
    contract and updates the surrounding docs, examples, and tests to match.
    
    ### Caveat
    I left the "in-process" version of app-server alone for now, since
    `codex exec` was recently based on top of app-server via this in-process
    form here: https://github.com/openai/codex/pull/14005
    
    Seems like `codex exec` still consumes some legacy notifications
    internally, so this branch only removes `codex/event/*` from app-server
    over stdio and websockets.
    
    ## Follow-up
    
    Once `codex exec` is fully migrated off `codex/event/*` notifications,
    we'll be able to stop emitting them entirely entirely instead of just
    filtering it at the external transport boundary.
  • fix(otel): make HTTP trace export survive app-server runtimes (#14300)
    ## Summary
    
    This PR fixes OTLP HTTP trace export in runtimes where the previous
    exporter setup was unreliable, especially around app-server usage. It
    also removes the old `codex_otel::otel_provider` compatibility shim and
    switches remaining call sites over to the crate-root
    `codex_otel::OtelProvider` export.
    
    ## What changed
    
    - Use a runtime-safe OTLP HTTP trace exporter path for Tokio runtimes.
    - Add an async HTTP client path for trace export when we are already
    inside a multi-thread Tokio runtime.
    - Make provider shutdown flush traces before tearing down the tracer
    provider.
    - Add loopback coverage that verifies traces are actually sent to
    `/v1/traces`:
      - outside Tokio
      - inside a multi-thread Tokio runtime
      - inside a current-thread Tokio runtime
    - Remove the `codex_otel::otel_provider` shim and update remaining
    imports.
    
    ## Why
    
    I hit cases where spans were being created correctly but never made it
    to the collector. The issue turned out to be in exporter/runtime
    behavior rather than the span plumbing itself. This PR narrows that gap
    and gives us regression coverage for the actual export path.
  • Implemented thread-level atomic elicitation counter for stopwatch pausing (#12296)
    ### Purpose
    While trying to build out CLI-Tools for the agent to use under skills we
    have found that those tools sometimes need to invoke a user elicitation.
    These elicitations are handled out of band of the codex app-server but
    need to indicate to the exec manager that the command running is not
    going to progress on the usual timeout horizon.
    
    ### Example
    Model calls universal exec:
    `$ download-credit-card-history --start-date 2026-01-19 --end-date
    2026-02-19 > credit_history.jsonl`
    
    download-cred-card-history might hit a hosted/preauthenticated service
    to fetch data. That service might decide that the request requires an
    end user approval the access to the personal data. It should be able to
    signal to the running thread that the command in question is blocked on
    user elicitation. In that case we want the exec to continue, but the
    timeout to not expire on the tool call, essentially freezing time until
    the user approves or rejects the command at which point the tool would
    signal the app-server to decrement the outstanding elicitation count.
    Now timeouts would proceed as normal.
    
    ### What's Added
    
    - New v2 RPC methods:
        - thread/increment_elicitation
        - thread/decrement_elicitation
    - Protocol updates in:
        - codex-rs/app-server-protocol/src/protocol/common.rs
        - codex-rs/app-server-protocol/src/protocol/v2.rs
    - App-server handlers wired in:
        - codex-rs/app-server/src/codex_message_processor.rs
    
    ### Behavior
    
    - Counter starts at 0 per thread.
    - increment atomically increases the counter.
    - decrement atomically decreases the counter; decrement at 0 returns
    invalid request.
    - Transition rules:
    - 0 -> 1: broadcast pause state, pausing all active stopwatches
    immediately.
        - \>0 -> >0: remain paused.
        - 1 -> 0: broadcast unpause state, resuming stopwatches.
    - Core thread/session logic:
        - codex-rs/core/src/codex_thread.rs
        - codex-rs/core/src/codex.rs
        - codex-rs/core/src/mcp_connection_manager.rs
    
    ### Exec-server stopwatch integration
    
    - Added centralized stopwatch tracking/controller:
        - codex-rs/exec-server/src/posix/stopwatch_controller.rs
    - Hooked pause/unpause broadcast handling + stopwatch registration:
        - codex-rs/exec-server/src/posix/mcp.rs
        - codex-rs/exec-server/src/posix/stopwatch.rs
        - codex-rs/exec-server/src/posix.rs
  • app-server: include experimental skill metadata in exec approval requests (#13929)
    ## Summary
    
    This change surfaces skill metadata on command approval requests so
    app-server clients can tell when an approval came from a skill script
    and identify the originating `SKILL.md`.
    
    - add `skill_metadata` to exec approval events in the shared protocol
    - thread skill metadata through core shell escalation and delegated
    approval handling for skill-triggered approvals
    - expose the field in app-server v2 as experimental `skillMetadata`
    - regenerate the JSON/TypeScript schemas and cover the new field in
    protocol, transport, core, and TUI tests
    
    ## Why
    
    Skill-triggered approvals already carry skill context inside core, but
    app-server clients could not see which skill caused the prompt. Sending
    the skill metadata with the approval request makes it possible for
    clients to present better approval UX and connect the prompt back to the
    relevant skill definition.
    
    
    ## example event in app-server-v2
    verified that we see this event when experimental api is on:
    ```
    < {
    <   "id": 11,
    <   "method": "item/commandExecution/requestApproval",
    <   "params": {
    <     "additionalPermissions": {
    <       "fileSystem": null,
    <       "macos": {
    <         "accessibility": false,
    <         "automations": {
    <           "bundle_ids": [
    <             "com.apple.Notes"
    <           ]
    <         },
    <         "calendar": false,
    <         "preferences": "read_only"
    <       },
    <       "network": null
    <     },
    <     "approvalId": "25d600ee-5a3c-4746-8d17-e2e61fb4c563",
    <     "availableDecisions": [
    <       "accept",
    <       "acceptForSession",
    <       "cancel"
    <     ],
    <     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <     "commandActions": [
    <       {
    <         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <         "type": "unknown"
    <       }
    <     ],
    <     "cwd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes",
    <     "itemId": "call_jZp3xFpNg4D8iKAD49cvEvZy",
    <     "skillMetadata": {
    <       "pathToSkillsMd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/SKILL.md"
    <     },
    <     "threadId": "019ccc10-b7d3-7ff2-84fe-3a75e7681e69",
    <     "turnId": "019ccc10-b848-76f1-81b3-4a1fa225493f"
    <   }
    < }`
    ```
    
    & verified that this is the event when experimental api is off:
    ```
    < {
    <   "id": 13,
    <   "method": "item/commandExecution/requestApproval",
    <   "params": {
    <     "approvalId": "5fbbf776-261b-4cf8-899b-c125b547f2c0",
    <     "availableDecisions": [
    <       "accept",
    <       "acceptForSession",
    <       "cancel"
    <     ],
    <     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <     "commandActions": [
    <       {
    <         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
    <         "type": "unknown"
    <       }
    <     ],
    <     "cwd": "/Users/celia/code/codex/codex-rs",
    <     "itemId": "call_OV2DHzTgYcbYtWaTTBWlocOt",
    <     "threadId": "019ccc16-2a2b-7be1-8500-e00d45b892d4",
    <     "turnId": "019ccc16-2a8e-7961-98ec-649600e7d06a"
    <   }
    < }
    ```
  • app-server: Add streaming and tty/pty capabilities to command/exec (#13640)
    * Add an ability to stream stdin, stdout, and stderr
    * Streaming of stdout and stderr has a configurable cap for total amount
    of transmitted bytes (with an ability to disable it)
    * Add support for overriding environment variables
    * Add an ability to terminate running applications (using
    `command/exec/terminate`)
    * Add TTY/PTY support, with an ability to resize the terminal (using
    `command/exec/resize`)
  • feat(app-server-test-client): OTEL setup for tracing (#13493)
    ### Overview
    This PR:
    - Updates `app-server-test-client` to load OTEL settings from
    `$CODEX_HOME/config.toml` and initializes its own OTEL provider.
    - Add real client root spans to app-server test client traces.
    
    This updates `codex-app-server-test-client` so its Datadog traces
    reflect the full client-driven flow instead of a set of server spans
    stitched together under a synthetic parent.
    
    Before this change, the test client generated a fake `traceparent` once
    and reused it for every JSON-RPC request. That kept the requests in one
    trace, but there was no real client span at the top, so Datadog ended up
    showing the sequence in a slightly misleading way, where all RPCs were
    anchored under `initialize`.
    
    Now the test client:
    - loads OTEL settings from the normal Codex config path, including
    `$CODEX_HOME/config.toml` and existing --config overrides
    - initializes tracing the same way other Codex binaries do when trace
    export is enabled
    - creates a real client root span for each scripted command
    - creates per-request client spans for JSON-RPC methods like
    `initialize`, `thread/start`, and `turn/start`
    - injects W3C trace context from the current client span into
    request.trace instead of reusing a fabricated carrier
    
    This gives us a cleaner trace shape in Datadog:
    - one trace URL for the whole scripted flow
    - a visible client root span
    - proper client/server parent-child relationships for each app-server
    request
  • Feat: Preserve network access on read-only sandbox policies (#13409)
    ## Summary
    
    `PermissionProfile.network` could not be preserved when additional or
    compiled permissions resolved to
    `SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access
    field. This change makes read-only + network
    enabled representable directly and threads that through the protocol,
    app-server v2 mirror, and permission-
      merging logic.
    
    ## What changed
    
    - Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core
    protocol and app-server v2 protocol.
    - Kept backward compatibility by defaulting the new field to false, so
    legacy read-only payloads still
        deserialize unchanged.
    - Updated `has_full_network_access()` and sandbox summaries to respect
    read-only network access.
      - Preserved PermissionProfile.network when:
          - compiling skill permission profiles into sandbox policies
          - normalizing additional permissions
          - merging additional permissions into existing sandbox policies
    - Updated the approval overlay to show network in the rendered
    permission rule when requested.
      - Regenerated app-server schema fixtures for the new v2 wire shape.
  • chore(app-server): delete v1 RPC methods and notifications (#13375)
    ## Summary
    This removes the old app-server v1 methods and notifications we no
    longer need, while keeping the small set the main codex app client still
    depends on for now.
    
    The remaining legacy surface is:
    - `initialize`
    - `getConversationSummary`
    - `getAuthStatus`
    - `gitDiffToRemote`
    - `fuzzyFileSearch`
    - `fuzzyFileSearch/sessionStart`
    - `fuzzyFileSearch/sessionUpdate`
    - `fuzzyFileSearch/sessionStop`
    
    And the raw `codex/event/*` notifications emitted from core. These
    notifications will be removed in a followup PR.
    
    ## What changed
    - removed deprecated v1 request variants from the protocol and
    app-server dispatcher
    - removed deprecated typed notifications: `authStatusChange`,
    `loginChatGptComplete`, and `sessionConfigured`
    - updated the app-server test client to use v2 flows instead of deleted
    v1 flows
    - deleted legacy-only app-server test suites and added focused coverage
    for `getConversationSummary`
    - regenerated app-server schema fixtures and updated the MCP interface
    docs to match the remaining compatibility surface
    
    ## Testing
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server`
  • app-server: Add an ability to watch events in the test client (#13080)
    Add a `watch` subcommand to `codex-app-server-test-client` binary to
    help in manual testing of events flow.
  • feat: include available decisions in command approval requests (#12758)
    Command-approval clients currently infer which choices to show from
    side-channel fields like `networkApprovalContext`,
    `proposedExecpolicyAmendment`, and `additionalPermissions`. That makes
    the request shape harder to evolve, and it forces each client to
    replicate the server's heuristics instead of receiving the exact
    decision list for the prompt.
    
    This PR introduces a mapping between `CommandExecutionApprovalDecision`
    and `codex_protocol::protocol::ReviewDecision`:
    
    ```rust
    impl From<CoreReviewDecision> for CommandExecutionApprovalDecision {
        fn from(value: CoreReviewDecision) -> Self {
            match value {
                CoreReviewDecision::Approved => Self::Accept,
                CoreReviewDecision::ApprovedExecpolicyAmendment {
                    proposed_execpolicy_amendment,
                } => Self::AcceptWithExecpolicyAmendment {
                    execpolicy_amendment: proposed_execpolicy_amendment.into(),
                },
                CoreReviewDecision::ApprovedForSession => Self::AcceptForSession,
                CoreReviewDecision::NetworkPolicyAmendment {
                    network_policy_amendment,
                } => Self::ApplyNetworkPolicyAmendment {
                    network_policy_amendment: network_policy_amendment.into(),
                },
                CoreReviewDecision::Abort => Self::Cancel,
                CoreReviewDecision::Denied => Self::Decline,
            }
        }
    }
    ```
    
    And updates `CommandExecutionRequestApprovalParams` to have a new field:
    
    ```rust
    available_decisions: Option<Vec<CommandExecutionApprovalDecision>>
    ```
    
    when, if specified, should make it easier for clients to display an
    appropriate list of options in the UI.
    
    This makes it possible for `CoreShellActionProvider::prompt()` in
    `unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly,
    adding support for `ApprovedForSession` when approving a skill script,
    which was previously missing in the TUI.
    
    Note this results in a significant change to `exec_options()` in
    `approval_overlay.rs`, as the displayed options are now derived from
    `available_decisions: &[ReviewDecision]`.
    
    ## What Changed
    
    - Add `available_decisions` to
    [`ExecApprovalRequestEvent`](https://github.com/openai/codex/blob/de00e932dd9801de0a4faac0519162099753f331/codex-rs/protocol/src/approvals.rs#L111-L175),
    including helpers to derive the legacy default choices when older
    senders omit the field.
    - Map `codex_protocol::protocol::ReviewDecision` to app-server
    `CommandExecutionApprovalDecision` and expose the ordered list as
    experimental `availableDecisions` in
    [`CommandExecutionRequestApprovalParams`](https://github.com/openai/codex/blob/de00e932dd9801de0a4faac0519162099753f331/codex-rs/app-server-protocol/src/protocol/v2.rs#L3798-L3807).
    - Thread optional `available_decisions` through the core approval path
    so Unix shell escalation can explicitly request `ApprovedForSession` for
    session-scoped approvals instead of relying on client heuristics.
    [`unix_escalation.rs`](https://github.com/openai/codex/blob/de00e932dd9801de0a4faac0519162099753f331/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs#L194-L214)
    - Update the TUI approval overlay to build its buttons from the ordered
    decision list, while preserving the legacy fallback when
    `available_decisions` is missing.
    - Update the app-server README, test client output, and generated schema
    artifacts to document and surface the new field.
    
    ## Testing
    
    - Add `approval_overlay.rs` coverage for explicit decision lists,
    including the generic `ApprovedForSession` path and network approval
    options.
    - Update `chatwidget/tests.rs` and app-server protocol tests to populate
    the new optional field and keep older event shapes working.
    
    ## Developers Docs
    
    - If we document `item/commandExecution/requestApproval` on
    [developers.openai.com/codex](https://developers.openai.com/codex), add
    experimental `availableDecisions` as the preferred source of approval
    choices and note that older servers may omit it.
  • Revert "Add skill approval event/response (#12633)" (#12811)
    This reverts commit https://github.com/openai/codex/pull/12633. We no
    longer need this PR, because we favor sending normal exec command
    approval server request with `additional_permissions` of skill
    permissions instead
  • feat: add search term to thread list (#12578)
    Add `searchTerm` to `thread/list` that will search for a match in the
    titles (the condition being `searchTerm` $$\in$$ `title`)
  • feat(ui): add network approval persistence plumbing (#12358)
    ## Summary
    - add TUI approval options for persistent network host rules
    - add app-server v2 approval payload plumbing for network approval
    context + proposed network policy amendments
    - add app-server handling to translate `applyNetworkPolicyAmendment`
    decisions back into core review decisions
    - update docs/test client output and generated app-server schemas/types
  • feat: add experimental additionalPermissions to v2 command execution approval requests (#12737)
    This adds additionalPermissions to the app-server v2
    item/commandExecution/requestApproval payload as an experimental field.
    
    The field is now exposed on CommandExecutionRequestApprovalParams and is
    populated from the existing core approval event when a command requests
    additional sandbox permissions.
    
    This PR also contains changes to make server requests to support
    experiment API.
    
    A real app server test client test:
    
    sample payload with experimental flag off:
    ```
     {
    <   "id": 0,
    <   "method": "item/commandExecution/requestApproval",
    <   "params": {
    <     "command": "/bin/zsh -lc 'mkdir -p ~/some/test && touch ~/some/test/file'",
    <     "commandActions": [
    <       {
    <         "command": "mkdir -p '~/some/test'",
    <         "type": "unknown"
    <       },
    <       {
    <         "command": "touch '~/some/test/file'",
    <         "type": "unknown"
    <       }
    <     ],
    <     "cwd": "/Users/celia/code/codex/codex-rs",
    <     "itemId": "call_QLp0LWkQ1XkU6VW9T2vUZFWB",
    <     "proposedExecpolicyAmendment": [
    <       "mkdir",
    <       "-p",
    <       "~/some/test"
    <     ],
    <     "reason": "Do you want to allow creating ~/some/test/file outside the workspace?",
    <     "threadId": "019c9309-e209-7d82-a01b-dcf9556a354d",
    <     "turnId": "019c9309-e27a-7f33-834f-6011e795c2d6"
    <   }
    < }
    ```
    with experimental flag on: 
    ```
    < {
    <   "id": 0,
    <   "method": "item/commandExecution/requestApproval",
    <   "params": {
    <     "additionalPermissions": {
    <       "fileSystem": null,
    <       "macos": null,
    <       "network": true
    <     },
    <     "command": "/bin/zsh -lc 'install -D /dev/null ~/some/test/file'",
    <     "commandActions": [
    <       {
    <         "command": "install -D /dev/null '~/some/test/file'",
    <         "type": "unknown"
    <       }
    <     ],
    <     "cwd": "/Users/celia/code/codex/codex-rs",
    <     "itemId": "call_K3U4b3dRbj3eMCqslmncbGsq",
    <     "proposedExecpolicyAmendment": [
    <       "install",
    <       "-D"
    <     ],
    <     "reason": "Do you want to allow creating the file at ~/some/test/file outside the workspace sandbox?",
    <     "threadId": "019c9303-3a8e-76e1-81bf-d67ac446d892",
    <     "turnId": "019c9303-3af1-7143-88a1-73132f771234"
    <   }
    < }
    ```
  • Add skill approval event/response (#12633)
    Set the stage for skill-level permission approval in addition to
    command-level.
    
    Behind a feature flag.
  • Refactor network approvals to host/protocol/port scope (#12140)
    ## Summary
    Simplify network approvals by removing per-attempt proxy correlation and
    moving to session-level approval dedupe keyed by (host, protocol, port).
    Instead of encoding attempt IDs into proxy credentials/URLs, we now
    treat approvals as a destination policy decision.
    
    - Concurrent calls to the same destination share one approval prompt.
    - Different destinations (or same host on different ports) get separate
    prompts.
    - Allow once approves the current queued request group only.
    - Allow for session caches that (host, protocol, port) and auto-allows
    future matching requests.
    - Never policy continues to deny without prompting.
    
    Example:
    - 3 calls: 
      - a.com (line 443)
      - b.com (line 443)
      - a.com (line 443)
    => 2 prompts total (a, b), second a waits on the first decision.
    - a.com:80 is treated separately from a.com line 443
    
    ## Testing
    - `just fmt` (in `codex-rs`)
    - `cargo test -p codex-core tools::network_approval::tests`
    - `cargo test -p codex-core` (unit tests pass; existing
    integration-suite failures remain in this environment)
  • feat(core): zsh exec bridge (#12052)
    zsh fork PR stack:
    - https://github.com/openai/codex/pull/12051 
    - https://github.com/openai/codex/pull/12052 👈 
    
    ### Summary
    This PR introduces a feature-gated native shell runtime path that routes
    shell execution through a patched zsh exec bridge, removing MCP-specific
    behavior from the shell hot path while preserving existing
    CommandExecution lifecycle semantics.
    
    When shell_zsh_fork is enabled, shell commands run via patched zsh with
    per-`execve` interception through EXEC_WRAPPER. Core receives wrapper
    IPC requests over a Unix socket, applies existing approval policy, and
    returns allow/deny before the subcommand executes.
    
    ### What’s included
    **1) New zsh exec bridge runtime in core**
    - Wrapper-mode entrypoint (maybe_run_zsh_exec_wrapper_mode) for
    EXEC_WRAPPER invocations.
    - Per-execution Unix-socket IPC handling for wrapper requests/responses.
    - Approval callback integration using existing core approval
    orchestration.
    - Streaming stdout/stderr deltas to existing command output event
    pipeline.
    - Error handling for malformed IPC, denial/abort, and execution
    failures.
    
    **2) Session lifecycle integration**
    SessionServices now owns a `ZshExecBridge`.
    Session startup initializes bridge state; shutdown tears it down
    cleanly.
    
    **3) Shell runtime routing (feature-gated)**
    When `shell_zsh_fork` is enabled:
    - Build execution env/spec as usual.
    - Add wrapper socket env wiring.
    - Execute via `zsh_exec_bridge.execute_shell_request(...)` instead of
    the regular shell path.
    - Non-zsh-fork behavior remains unchanged.
    
    **4) Config + feature wiring**
    - Added `Feature::ShellZshFork` (under development).
    - Added config support for `zsh_path` (optional absolute path to patched
    zsh):
    - `Config`, `ConfigToml`, `ConfigProfile`, overrides, and schema.
    - Session startup validates that `zsh_path` exists/usable when zsh-fork
    is enabled.
    - Added startup test for missing `zsh_path` failure mode.
    
    **5) Seatbelt/sandbox updates for wrapper IPC**
    - Extended seatbelt policy generation to optionally allow outbound
    connection to explicitly permitted Unix sockets.
    - Wired sandboxing path to pass wrapper socket path through to seatbelt
    policy generation.
    - Added/updated seatbelt tests for explicit socket allow rule and
    argument emission.
    
    **6) Runtime entrypoint hooks**
    - This allows the same binary to act as the zsh wrapper subprocess when
    invoked via `EXEC_WRAPPER`.
    
    **7) Tool selection behavior**
    - ToolsConfig now prefers ShellCommand type when shell_zsh_fork is
    enabled.
    - Added test coverage for precedence with unified-exec enabled.
  • feat(core): plumb distinct approval ids for command approvals (#12051)
    zsh fork PR stack:
    - https://github.com/openai/codex/pull/12051 👈 
    - https://github.com/openai/codex/pull/12052
    
    With upcoming support for a fork of zsh that allows us to intercept
    `execve` and run execpolicy checks for each subcommand as part of a
    `CommandExecution`, it will be possible for there to be multiple
    approval requests for a shell command like `/path/to/zsh -lc 'git status
    && rg \"TODO\" src && make test'`.
    
    To support that, this PR introduces a new `approval_id` field across
    core, protocol, and app-server so that we can associate approvals
    properly for subcommands.
  • codex-rs: fix thread resume rejoin semantics (#11756)
    ## Summary
    - always rejoin an in-memory running thread on `thread/resume`, even
    when overrides are present
    - reject `thread/resume` when `history` is provided for a running thread
    - reject `thread/resume` when `path` mismatches the running thread
    rollout path
    - warn (but do not fail) on override mismatches for running threads
    - add more `thread_resume` integration tests and fixes; including
    restart-based resume-with-overrides coverage
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-app-server --test all thread_resume`
    - manual test with app-server-test-client
    https://github.com/openai/codex/pull/11755
    - manual test both stdio and websocket in app
  • app-server-test-client websocket client and thread tools (#11755)
    - add websocket endpoint mode with default ws://127.0.0.1:4222 while
    keeping stdio codex-bin path compatibility
    - add thread-resume (follow stream) and thread-list commands for manual
    thread lifecycle testing
    - quickstart docs
  • feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
    `SandboxPolicy::ReadOnly` previously implied broad read access and could
    not express a narrower read surface.
    This change introduces an explicit read-access model so we can support
    user-configurable read restrictions in follow-up work, while preserving
    current behavior today.
    
    It also ensures unsupported backends fail closed for restricted-read
    policies instead of silently granting broader access than intended.
    
    ## What
    
    - Added `ReadOnlyAccess` in protocol with:
      - `Restricted { include_platform_defaults, readable_roots }`
      - `FullAccess`
    - Updated `SandboxPolicy` to carry read-access configuration:
      - `ReadOnly { access: ReadOnlyAccess }`
      - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
    - Preserved existing behavior by defaulting current construction paths
    to `ReadOnlyAccess::FullAccess`.
    - Threaded the new fields through sandbox policy consumers and call
    sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
    related tests.
    - Updated Seatbelt policy generation to honor restricted read roots by
    emitting scoped read rules when full read access is not granted.
    - Added fail-closed behavior on Linux and Windows backends when
    restricted read access is requested but not yet implemented there
    (`UnsupportedOperation`).
    - Regenerated app-server protocol schema and TypeScript artifacts,
    including `ReadOnlyAccess`.
    
    ## Compatibility / rollout
    
    - Runtime behavior remains unchanged by default (`FullAccess`).
    - API/schema changes are in place so future config wiring can enable
    restricted read access without another policy-shape migration.
  • fix: remove errant Cargo.lock files (#11526)
    These leaked into the repo:
    
    - #4905 `codex-rs/windows-sandbox-rs/Cargo.lock`
    - #5391 `codex-rs/app-server-test-client/Cargo.lock`
    
    Note that these affect cache keys such as:
    
    
    https://github.com/openai/codex/blob/9722567a80b4bac81b74f679828747031fd95fa0/.github/workflows/rust-release.yml#L154
    
    so it seems best to remove them.
  • chore: persist turn_id in rollout session and make turn_id uuid based (#11246)
    Problem:
    1. turn id is constructed in-memory;
    2. on resuming threads, turn_id might not be unique;
    3. client cannot no the boundary of a turn from rollout files easily.
    
    This PR does three things:
    1. persist `task_started` and `task_complete` events;
    1. persist `turn_id` in rollout turn events;
    5. generate turn_id as unique uuids instead of incrementing it in
    memory.
    
    This helps us resolve the issue of clients wanting to have unique turn
    ids for resuming a thread, and knowing the boundry of each turn in
    rollout files.
    
    example debug logs
    ```
    2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None }
    2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None }
    2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None }
    ```
    
    backward compatibility:
    if you try to resume an old session without task_started and
    task_complete event populated, the following happens:
    - If you resume and do nothing: those reconstructed historical IDs can
    differ next time you resume.
    - If you resume and send a new turn: the new turn gets a fresh UUID from
    live submission flow and is persisted, so that new turn’s ID is stable
    on later resumes.
    I think this behavior is fine, because we only care about deterministic
    turn id once a turn is triggered.
  • feat: opt-out of events in the app-server (#11319)
    Add `optOutNotificationMethods` in the app-server to opt-out events
    based on exact method matching
  • chore: add codex debug app-server tooling (#10367)
    codex debug app-server <user message> forwards the message through
    codex-app-server-test-client’s send_message_v2 library entry point,
    using std::env::current_exe() to resolve the codex binary.
    
    for how it looks like, see:
    
    ```
    celia@com-92114 codex-rs % cargo build -p codex-cli && target/debug/codex debug app-server --help                       
        Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.34s
    Tooling: helps debug the app server
    
    Usage: codex debug app-server [OPTIONS] <COMMAND>
    
    Commands:
      send-message-v2  
      help             Print this message or the help of the given subcommand(s)
    ````
    and
    ```
    celia@com-92114 codex-rs % cargo build -p codex-cli && target/debug/codex debug app-server send-message-v2 "hello world"
       Compiling codex-cli v0.0.0 (/Users/celia/code/codex/codex-rs/cli)
        Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.38s
    > {
    >   "method": "initialize",
    >   "id": "f8ba9f60-3a49-4ea9-81d6-4ab6853e3954",
    >   "params": {
    >     "clientInfo": {
    >       "name": "codex-toy-app-server",
    >       "title": "Codex Toy App Server",
    >       "version": "0.0.0"
    >     },
    >     "capabilities": {
    >       "experimentalApi": true
    >     }
    >   }
    > }
    < {
    <   "id": "f8ba9f60-3a49-4ea9-81d6-4ab6853e3954",
    <   "result": {
    <     "userAgent": "codex-toy-app-server/0.0.0 (Mac OS 26.2.0; arm64) vscode/2.4.27 (codex-toy-app-server; 0.0.0)"
    <   }
    < }
    < initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.2.0; arm64) vscode/2.4.27 (codex-toy-app-server; 0.0.0)" }
    > {
    >   "method": "thread/start",
    >   "id": "203f1630-beee-4e60-b17b-9eff16b1638b",
    >   "params": {
    >     "model": null,
    >     "modelProvider": null,
    >     "cwd": null,
    >     "approvalPolicy": null,
    >     "sandbox": null,
    >     "config": null,
    >     "baseInstructions": null,
    >     "developerInstructions": null,
    >     "personality": null,
    >     "ephemeral": null,
    >     "dynamicTools": null,
    >     "mockExperimentalField": null,
    >     "experimentalRawEvents": false
    >   }
    > }
    ...
    ```
  • feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
    We started working with MCP in Codex before
    https://crates.io/crates/rmcp was mature, so we had our own crate for
    MCP types that was generated from the MCP schema:
    
    
    https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md
    
    Now that `rmcp` is more mature, it makes more sense to use their MCP
    types in Rust, as they handle details (like the `_meta` field) that our
    custom version ignored. Though one advantage that our custom types had
    is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
    whereas the types in `rmcp` do not. As such, part of the work of this PR
    is leveraging the adapters between `rmcp` types and the serializable
    types that are API for us (app server and MCP) introduced in #10356.
    
    Note this PR results in a number of changes to
    `codex-rs/app-server-protocol/schema`, which merit special attention
    during review. We must ensure that these changes are still
    backwards-compatible, which is possible because we have:
    
    ```diff
    - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
    + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
    ```
    
    so `ContentBlock` has been replaced with the more general `JsonValue`.
    Note that `ContentBlock` was defined as:
    
    ```typescript
    export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
    ```
    
    so the deletion of those individual variants should not be a cause of
    great concern.
    
    Similarly, we have the following change in
    `codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
    
    ```
    - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
    + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
    ```
    
    so:
    
    - `annotations?: ToolAnnotations` ➡️ `JsonValue`
    - `inputSchema: ToolInputSchema` ➡️ `JsonValue`
    - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
    
    and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
    * #10357
    * __->__ #10349
    * #10356
  • [feat] persist thread_dynamic_tools in db (#10252)
    Persist thread_dynamic_tools in sqlite and read first from it. Fall back
    to rollout files if it's not found. Persist dynamic tools to both sqlite
    and rollout files.
    
    Saw that new sessions get populated to db correctly & old sessions get
    backfilled correctly at startup:
    ```
    celia@com-92114 codex-rs % sqlite3 ~/.codex/state.sqlite \      "select thread_id, position,name,description,input_schema from thread_dynamic_tools;"
    019c0cad-ec0d-74b2-a787-e8b33a349117|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
    ....
    019c10ca-aa4b-7620-ae40-c0919fbd7ea7|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
    ```
  • feat: experimental flags (#10231)
    ## Problem being solved
    - We need a single, reliable way to mark app-server API surface as
    experimental so that:
      1. the runtime can reject experimental usage unless the client opts in
    2. generated TS/JSON schemas can exclude experimental methods/fields for
    stable clients.
    
    Right now that’s easy to drift or miss when done ad-hoc.
    
    ## How to declare experimental methods and fields
    - **Experimental method**: add `#[experimental("method/name")]` to the
    `ClientRequest` variant in `client_request_definitions!`.
    - **Experimental field**: on the params struct, derive `ExperimentalApi`
    and annotate the field with `#[experimental("method/name.field")]` + set
    `inspect_params: true` for the method variant so
    `ClientRequest::experimental_reason()` inspects params for experimental
    fields.
    
    ## How the macro solves it
    - The new derive macro lives in
    `codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via
    `#[derive(ExperimentalApi)]` plus `#[experimental("reason")]`
    attributes.
    - **Structs**:
    - Generates `ExperimentalApi::experimental_reason(&self)` that checks
    only annotated fields.
      - The “presence” check is type-aware:
        - `Option<T>`: `is_some_and(...)` recursively checks inner.
        - `Vec`/`HashMap`/`BTreeMap`: must be non-empty.
        - `bool`: must be `true`.
        - Other types: considered present (returns `true`).
    - Registers each experimental field in an `inventory` with `(type_name,
    serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for
    that type. Field names are converted from `snake_case` to `camelCase`
    for schema/TS filtering.
    - **Enums**:
    - Generates an exhaustive `match` returning `Some(reason)` for annotated
    variants and `None` otherwise (no wildcard arm).
    - **Wiring**:
    - Runtime gating uses `ExperimentalApi::experimental_reason()` in
    `codex-rs/app-server/src/message_processor.rs` to reject requests unless
    `InitializeParams.capabilities.experimental_api == true`.
    - Schema/TS export filters use the inventory list and
    `EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to
    strip experimental methods/fields when `experimental_api` is false.
  • Chore: add cmd related info to exec approval request (#9659)
    ### Summary
    We now rely purely on `item/commandExecution/requestApproval` item to
    render pending approval in VSCE and app. With v2 approach, it does not
    include the actual cmd that it is attempting and therefore we can only
    use `proposedExecpolicyAmendment` to render which can be incomplete.
    
    ### Reproduce
    * Add `prefix_rule(pattern=["echo"], decision="prompt")` to your
    `~/.codex/rules.default.rules`.
    * Ask to `Run  echo "approval-test" please` in VSCE or app. 
    * The pending approval protal does show up but with no content
    
    #### Example screenshot
    <img width="3434" height="3648" alt="Screenshot 2026-01-21 at 8 23
    25 PM"
    src="https://github.com/user-attachments/assets/75644837-21f1-40f8-8b02-858d361ff817"
    />
    
    #### Sample output
    ```
      {"method":"item/commandExecution/requestApproval","id":0,"params":{
        "threadId":"019be439-5a90-7600-a7ea-2d2dcc50302a",
        "turnId":"0",
        "itemId":"call_usgnQ4qEX5U9roNdjT7fPzhb",
        "reason":"`/bin/zsh -lc 'echo \"testing\"'` requires approval by policy",
        "proposedExecpolicyAmendment":null
      }}
    
    ```
    
    ### Fix
    Inlude `command` string, `cwd` and `command_actions` in
    `CommandExecutionRequestApprovalParams` so that consumers can display
    the correct command instead of relying on exec policy output.
  • Persist text elements through TUI input and history (#9393)
    Continuation of breaking up this PR
    https://github.com/openai/codex/pull/9116
    
    ## Summary
    - Thread user text element ranges through TUI/TUI2 input, submission,
    queueing, and history so placeholders survive resume/edit flows.
    - Preserve local image attachments alongside text elements and rehydrate
    placeholders when restoring drafts.
    - Keep model-facing content shapes clean by attaching UI metadata only
    to user input/events (no API content changes).
    
    ## Key Changes
    - TUI/TUI2 composer now captures text element ranges, trims them with
    text edits, and restores them when submission is suppressed.
    - User history cells render styled spans for text elements and keep
    local image paths for future rehydration.
    - Initial chat widget bootstraps accept empty `initial_text_elements` to
    keep initialization uniform.
    - Protocol/core helpers updated to tolerate the new InputText field
    shape without changing payloads sent to the API.
  • Add text element metadata to protocol, app server, and core (#9331)
    The second part of breaking up PR
    https://github.com/openai/codex/pull/9116
    
    Summary:
    
    - Add `TextElement` / `ByteRange` to protocol user inputs and user
    message events with defaults.
    - Thread `text_elements` through app-server v1/v2 request handling and
    history rebuild.
    - Preserve UI metadata only in user input/events (not `ContentItem`)
    while keeping local image attachments in user events for rehydration.
    
    Details:
    
    - Protocol: `UserInput::Text` carries `text_elements`;
    `UserMessageEvent` carries `text_elements` + `local_images`.
    Serialization includes empty vectors for backward compatibility.
    - app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in
    camelCase with conversions; v2 uses its own camelCase wrapper.
    - app-server: v1/v2 input mapping includes `text_elements`; thread
    history rebuilds include them.
    - Core: user event emission preserves UI metadata while model history
    stays clean; history replay round-trips the metadata.
  • Add text element metadata to types (#9235)
    Initial type tweaking PR to make the diff of
    https://github.com/openai/codex/pull/9116 smaller
    
    This should not change any behavior, just adds some fields to types
  • feat: add support for building with Bazel (#8875)
    This PR configures Codex CLI so it can be built with
    [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
    includes configuration so that remote builds can be done using
    [BuildBuddy](https://www.buildbuddy.io).
    
    If you are familiar with Bazel, things should work as you expect, e.g.,
    run `bazel test //... --keep-going` to run all the tests in the repo,
    but we have also added some new aliases in the `justfile` for
    convenience:
    
    - `just bazel-test` to run tests locally
    - `just bazel-remote-test` to run tests remotely (currently, the remote
    build is for x86_64 Linux regardless of your host platform). Note we are
    currently seeing the following test failures in the remote build, so we
    still need to figure out what is happening here:
    
    ```
    failures:
        suite::compact::manual_compact_twice_preserves_latest_user_messages
        suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
        suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
    ```
    
    - `just build-for-release` to build release binaries for all
    platforms/architectures remotely
    
    To setup remote execution:
    - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
    employees should also request org access at
    https://openai.buildbuddy.io/join/ with their `@openai.com` email
    address.)
    - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
    `~/.bazelrc` (add the line `build
    --remote_header=x-buildbuddy-api-key=YOUR_KEY`)
    - Use `--config=remote` in your `bazel` invocations (or add `common
    --config=remote` to your `~/.bazelrc`, or use the `just` commands)
    
    ## CI
    
    In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
    uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
    (we are working on supporting Windows, but that is not ready yet). Note
    that the failures we are seeing in `just bazel-remote-test` do not occur
    on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
    is green right now.
    
    The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
    that macOS CI jobs build _remotely_ on Linux hosts (using the
    `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
    root `BUILD.bazel`) using cross-compilation to build the macOS
    artifacts. Then these artifacts are downloaded locally to GitHub's macOS
    runner so the tests can be executed natively. This is the relevant
    config that enables this:
    
    ```
    common:macos --config=remote
    common:macos --strategy=remote
    common:macos --strategy=TestRunner=darwin-sandbox,local
    ```
    
    Because of the remote caching benefits we get from BuildBuddy, these new
    CI jobs can be extremely fast! For example, consider these two jobs that
    ran all the tests on Linux x86_64:
    
    - Bazel 1m37s
    https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
    - Cargo 9m20s
    https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875
    
    For now, we will continue to run both the Bazel and Cargo jobs for PRs,
    but once we add support for Windows and running Clippy, we should be
    able to cutover to using Bazel exclusively for PRs, which should still
    speed things up considerably. We will probably continue to run the Cargo
    jobs post-merge for commits that land on `main` as a sanity check.
    
    Release builds will also continue to be done by Cargo for now.
    
    Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
    Earlier attempt to add support for Buck2, now abandoned:
    https://github.com/openai/codex/pull/8504
    
    ---------
    
    Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • fix: implement 'Allow this session' for apply_patch approvals (#8451)
    **Summary**
    This PR makes “ApprovalDecision::AcceptForSession / don’t ask again this
    session” actually work for `apply_patch` approvals by caching approvals
    based on absolute file paths in codex-core, properly wiring it through
    app-server v2, and exposing the choice in both TUI and TUI2.
    - This brings `apply_patch` calls to be at feature-parity with general
    shell commands, which also have a "Yes, and don't ask again" option.
    - This also fixes VSCE's "Allow this session" button to actually work.
    
    While we're at it, also split the app-server v2 protocol's
    `ApprovalDecision` enum so execpolicy amendments are only available for
    command execution approvals.
    
    **Key changes**
    - Core: per-session patch approval allowlist keyed by absolute file
    paths
    - Handles multi-file patches and renames/moves by recording both source
    and destination paths for `Update { move_path: Some(...) }`.
    - Extend the `Approvable` trait and `ApplyPatchRuntime` to work with
    multiple keys, because an `apply_patch` tool call can modify multiple
    files. For a request to be auto-approved, we will need to check that all
    file paths have been approved previously.
    - App-server v2: honor AcceptForSession for file changes
    - File-change approval responses now map AcceptForSession to
    ReviewDecision::ApprovedForSession (no longer downgraded to plain
    Approved).
    - Replace `ApprovalDecision` with two enums:
    `CommandExecutionApprovalDecision` and `FileChangeApprovalDecision`
    - TUI / TUI2: expose “don’t ask again for these files this session”
    - Patch approval overlays now include a third option (“Yes, and don’t
    ask again for these files this session (s)”).
        - Snapshot updates for the approval modal.
    
    **Tests added/updated**
    - Core:
    - Integration test that proves ApprovedForSession on a patch skips the
    next patch prompt for the same file
    - App-server:
    - v2 integration test verifying
    FileChangeApprovalDecision::AcceptForSession works properly
    
    **User-visible behavior**
    - When the user approves a patch “for session”, future patches touching
    only those previously approved file(s) will no longer prompt gain during
    that session (both via app-server v2 and TUI/TUI2).
    
    **Manual testing**
    Tested both TUI and TUI2 - see screenshots below.
    
    TUI:
    <img width="1082" height="355" alt="image"
    src="https://github.com/user-attachments/assets/adcf45ad-d428-498d-92fc-1a0a420878d9"
    />
    
    
    TUI2:
    <img width="1089" height="438" alt="image"
    src="https://github.com/user-attachments/assets/dd768b1a-2f5f-4bd6-98fd-e52c1d3abd9e"
    />
  • chore: unify conversation with thread name (#8830)
    Done and verified by Codex + refactor feature of RustRover
  • [app-server] fix config loading for conversations (#8765)
    Currently we don't load config properly for app server conversations.
    see:
    https://linear.app/openai/issue/CODEX-3956/config-flags-not-respected-in-codex-app-server.
    This PR fixes that by respecting the config passed in.
    
    Tested by running `cargo build -p codex-cli &&
    RUST_LOG=codex_app_server=debug CODEX_BIN=target/debug/codex cargo run
    -p codex-app-server-test-client -- \
    --config
    model_providers.mock_provider.base_url=\"http://localhost:4010/v2\" \
        --config model_provider=\"mock_provider\" \
        --config model_providers.mock_provider.name="hello" \
        send-message-v2 "hello"`
    and verified that the mock_provider is called instead of default
    provider.
    
    #closes
    https://linear.app/openai/issue/CODEX-3956/config-flags-not-respected-in-codex-app-server
    
    ---------
    
    Co-authored-by: Michael Bolin <mbolin@openai.com>
  • chore: add model/list call to app-server-test-client (#8331)
    Allows us to run `cargo run -p codex-app-server-test-client --
    model-list` to return the list of models over app-server.
  • Removed experimental "command risk assessment" feature (#7799)
    This experimental feature received lukewarm reception during internal
    testing. Removing from the code base.
  • updating app server types to support execpoilcy amendment (#7747)
    also includes minor refactor merging `ApprovalDecision` with
    `CommandExecutionRequestAcceptSettings`
  • fix: remove serde(flatten) annotation for TurnError (#7499)
    The problem with using `serde(flatten)` on Turn status is that it
    conditionally serializes the `error` field, which is not the pattern we
    want in API v2 where all fields on an object should always be returned.
    
    ```
    #[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
    #[serde(rename_all = "camelCase")]
    #[ts(export_to = "v2/")]
    pub struct Turn {
        pub id: String,
        /// Only populated on a `thread/resume` response.
        /// For all other responses and notifications returning a Turn,
        /// the items field will be an empty list.
        pub items: Vec<ThreadItem>,
        #[serde(flatten)]
        pub status: TurnStatus,
    }
    
    #[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
    #[serde(tag = "status", rename_all = "camelCase")]
    #[ts(tag = "status", export_to = "v2/")]
    pub enum TurnStatus {
        Completed,
        Interrupted,
        Failed { error: TurnError },
        InProgress,
    }
    ```
    
    serializes to:
    ```
    {
      "id": "turn-123",
      "items": [],
      "status": "completed"
    }
    
    {
      "id": "turn-123",
      "items": [],
      "status": "failed",
      "error": {
        "message": "Tool timeout",
        "codexErrorInfo": null
      }
    }
    ```
    
    Instead we want:
    ```
    {
      "id": "turn-123",
      "items": [],
      "status": "completed",
      "error": null
    }
    
    {
      "id": "turn-123",
      "items": [],
      "status": "failed",
      "error": {
        "message": "Tool timeout",
        "codexErrorInfo": null
      }
    }
    ```
  • [app-server-test-client] add send-followup-v2 (#7271)
    Add a new endpoint that allows us to test multi-turn behavior.
    
    Tested with running:
    ```
    RUST_LOG=codex_app_server=debug CODEX_BIN=target/debug/codex \
          cargo run -p codex-app-server-test-client -- \
          send-follow-up-v2 "hello" "and now a follow-up question"
    ```
  • chore: add cargo-deny configuration (#7119)
    - add GitHub workflow running cargo-deny on push/PR
    - document cargo-deny allowlist with workspace-dep notes and advisory
    ignores
    - align workspace crates to inherit version/edition/license for
    consistent checks
  • [app-server] feat: v2 apply_patch approval flow (#6760)
    This PR adds the API V2 version of the apply_patch approval flow, which
    centers around `ThreadItem::FileChange`.
    
    This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only)
    and related events (`item/started`, `item/completed` for
    `ThreadItem::FileChange`, which are emitted in both V1 and V2) through
    the app-server
    protocol. The new approval RPC is only sent when the user initiates a
    turn with the new `turn/start` API so we don't break backwards
    compatibility with VSCE.
    
    Similar to https://github.com/openai/codex/pull/6758, the approach I
    took was to make as few changes to the Codex core as possible,
    leveraging existing `EventMsg` core events, and translating those in
    app-server. I did have to add a few additional fields to
    `EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those
    were fairly lightweight.
    
    However, the `EventMsg`s emitted by core are the following:
    ```
    1) Auto-approved (no request for approval)

    - EventMsg::PatchApplyBegin
    - EventMsg::PatchApplyEnd
    
    2) Approved by user
    - EventMsg::ApplyPatchApprovalRequest
    - EventMsg::PatchApplyBegin
    - EventMsg::PatchApplyEnd
    
    3) Declined by user
    - EventMsg::ApplyPatchApprovalRequest
    - EventMsg::PatchApplyBegin
    - EventMsg::PatchApplyEnd
    ```
    
    For a request triggering an approval, this would result in:
    ```
    item/fileChange/requestApproval
    item/started
    item/completed
    ```
    
    which is different from the `ThreadItem::CommandExecution` flow
    introduced in https://github.com/openai/codex/pull/6758, which does the
    below and is preferable:
    ```
    item/started
    item/commandExecution/requestApproval
    item/completed
    ```
    
    To fix this, we leverage `TurnSummaryStore` on codex_message_processor
    to store a little bit of state, allowing us to fire `item/started` and
    `item/fileChange/requestApproval` whenever we receive the underlying
    `EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the
    `EventMsg::PatchApplyBegin` later.
    
    This is much less invasive than modifying the order of EventMsg within
    core (I tried).
    
    The resulting payloads:
    ```
    {
      "method": "item/started",
      "params": {
        "item": {
          "changes": [
            {
              "diff": "Hello from Codex!\n",
              "kind": "add",
              "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
            }
          ],
          "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
          "status": "inProgress",
          "type": "fileChange"
        }
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "method": "item/fileChange/requestApproval",
      "params": {
        "grantRoot": null,
        "itemId": "call_Nxnwj7B3YXigfV6Mwh03d686",
        "reason": null,
        "threadId": "019a9e11-8295-7883-a283-779e06502c6f",
        "turnId": "1"
      }
    }
    ```
    
    ```
    {
      "id": 0,
      "result": {
        "decision": "accept"
      }
    }
    ```
    
    ```
    {
      "method": "item/completed",
      "params": {
        "item": {
          "changes": [
            {
              "diff": "Hello from Codex!\n",
              "kind": "add",
              "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
            }
          ],
          "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
          "status": "completed",
          "type": "fileChange"
        }
      }
    }
    ```
  • [app-server] introduce turn/completed v2 event (#6800)
    similar to logic in
    `codex/codex-rs/exec/src/event_processor_with_jsonl_output.rs`.
    translation of v1 -> v2 events:
    `codex/event/task_complete` -> `turn/completed`
    `codex/event/turn_aborted` -> `turn/completed` with `interrupted` status
    `codex/event/error` -> `turn/completed` with `error` status
    
    this PR also makes `items` field in `Turn` optional. For now, we only
    populate it when we resume a thread, and leave it as None for all other
    places until we properly rewrite core to keep track of items.
    
    tested using the codex app server client. example new event:
    ```
    < {
    <   "method": "turn/completed",
    <   "params": {
    <     "turn": {
    <       "id": "0",
    <       "items": [],
    <       "status": "interrupted"
    <     }
    <   }
    < }
    ```
  • feat: add app-server-test-client crate for internal use (#5391)
    For app-server development it's been helpful to be able to trigger some
    test flows end-to-end and print the JSON-RPC messages sent between
    client and server.