Commit Graph

277 Commits

  • [hooks] userpromptsubmit - hook before user's prompt is executed (#14626)
    - this allows blocking the user's prompts from executing, and also
    prevents them from entering history
    - handles the edge case where you can both prevent the user's prompt AND
    add n amount of additionalContexts
    - refactors some old code into common.rs where hooks overlap
    functionality
    - refactors additionalContext being previously added to user messages,
    instead we use developer messages for them
    - handles queued messages correctly
    
    Sample hook for testing - if you write "[block-user-submit]" this hook
    will stop the thread:
    
    example run
    ```
    › sup
    
    
    • Running UserPromptSubmit hook: reading the observatory notes
    
    UserPromptSubmit hook (completed)
      warning: wizard-tower UserPromptSubmit demo inspected: sup
      hook context: Wizard Tower UserPromptSubmit demo fired. For this reply only, include the exact
    phrase 'observatory lanterns lit' exactly once near the end.
    
    • Just riding the cosmic wave and ready to help, my friend. What are we building today? observatory
      lanterns lit
    
    
    › and [block-user-submit]
    
    
    • Running UserPromptSubmit hook: reading the observatory notes
    
    UserPromptSubmit hook (stopped)
      warning: wizard-tower UserPromptSubmit demo blocked the prompt on purpose.
      stop: Wizard Tower demo block: remove [block-user-submit] to continue.
    ```
    
    .codex/config.toml
    ```
    [features]
    codex_hooks = true
    ```
    
    .codex/hooks.json
    ```
    {
      "hooks": {
        "UserPromptSubmit": [
          {
            "hooks": [
              {
                "type": "command",
                "command": "/usr/bin/python3 .codex/hooks/user_prompt_submit_demo.py",
                "timeoutSec": 10,
                "statusMessage": "reading the observatory notes"
              }
            ]
          }
        ]
      }
    }
    ```
    
    .codex/hooks/user_prompt_submit_demo.py
    ```
    #!/usr/bin/env python3
    
    import json
    import sys
    from pathlib import Path
    
    
    def prompt_from_payload(payload: dict) -> str:
        prompt = payload.get("prompt")
        if isinstance(prompt, str) and prompt.strip():
            return prompt.strip()
    
        event = payload.get("event")
        if isinstance(event, dict):
            user_prompt = event.get("user_prompt")
            if isinstance(user_prompt, str):
                return user_prompt.strip()
    
        return ""
    
    
    def main() -> int:
        payload = json.load(sys.stdin)
        prompt = prompt_from_payload(payload)
        cwd = Path(payload.get("cwd", ".")).name or "wizard-tower"
    
        if "[block-user-submit]" in prompt:
            print(
                json.dumps(
                    {
                        "systemMessage": (
                            f"{cwd} UserPromptSubmit demo blocked the prompt on purpose."
                        ),
                        "decision": "block",
                        "reason": (
                            "Wizard Tower demo block: remove [block-user-submit] to continue."
                        ),
                    }
                )
            )
            return 0
    
        prompt_preview = prompt or "(empty prompt)"
        if len(prompt_preview) > 80:
            prompt_preview = f"{prompt_preview[:77]}..."
    
        print(
            json.dumps(
                {
                    "systemMessage": (
                        f"{cwd} UserPromptSubmit demo inspected: {prompt_preview}"
                    ),
                    "hookSpecificOutput": {
                        "hookEventName": "UserPromptSubmit",
                        "additionalContext": (
                            "Wizard Tower UserPromptSubmit demo fired. "
                            "For this reply only, include the exact phrase "
                            "'observatory lanterns lit' exactly once near the end."
                        ),
                    },
                }
            )
        )
        return 0
    
    
    if __name__ == "__main__":
        raise SystemExit(main())
    ```
  • fix(linux-sandbox): prefer system /usr/bin/bwrap when available (#14963)
    ## Problem
    Ubuntu/AppArmor hosts started failing in the default Linux sandbox path
    after the switch to vendored/default bubblewrap in `0.115.0`.
    
    The clearest report is in
    [#14919](https://github.com/openai/codex/issues/14919), especially [this
    investigation
    comment](https://github.com/openai/codex/issues/14919#issuecomment-4076504751):
    on affected Ubuntu systems, `/usr/bin/bwrap` works, but a copied or
    vendored `bwrap` binary fails with errors like `bwrap: setting up uid
    map: Permission denied` or `bwrap: loopback: Failed RTM_NEWADDR:
    Operation not permitted`.
    
    The root cause is Ubuntu's `/etc/apparmor.d/bwrap-userns-restrict`
    profile, which grants `userns` access specifically to `/usr/bin/bwrap`.
    Once Codex started using a vendored/internal bubblewrap path, that path
    was no longer covered by the distro AppArmor exception, so sandbox
    namespace setup could fail even when user namespaces were otherwise
    enabled and `uidmap` was installed.
    
    ## What this PR changes
    - prefer system `/usr/bin/bwrap` whenever it is available
    - keep vendored bubblewrap as the fallback when `/usr/bin/bwrap` is
    missing
    - when `/usr/bin/bwrap` is missing, surface a Codex startup warning
    through the app-server/TUI warning path instead of printing directly
    from the sandbox helper with `eprintln!`
    - use the same launcher decision for both the main sandbox execution
    path and the `/proc` preflight path
    - document the updated Linux bubblewrap behavior in the Linux sandbox
    and core READMEs
    
    ## Why this fix
    This still fixes the Ubuntu/AppArmor regression from
    [#14919](https://github.com/openai/codex/issues/14919), but it keeps the
    runtime rule simple and platform-agnostic: if the standard system
    bubblewrap is installed, use it; otherwise fall back to the vendored
    helper.
    
    The warning now follows that same simple rule. If Codex cannot find
    `/usr/bin/bwrap`, it tells the user that it is falling back to the
    vendored helper, and it does so through the existing startup warning
    plumbing that reaches the TUI and app-server instead of low-level
    sandbox stderr.
    
    ## Testing
    - `cargo test -p codex-linux-sandbox`
    - `cargo test -p codex-app-server --lib`
    - `cargo test -p codex-tui-app-server
    tests::embedded_app_server_start_failure_is_returned`
    - `cargo clippy -p codex-linux-sandbox --all-targets`
    - `cargo clippy -p codex-app-server --all-targets`
    - `cargo clippy -p codex-tui-app-server --all-targets`
  • Cleanup skills/remote/xxx endpoints. (#14977)
    Remote skills/remote/xxx as they are not in used for now.
  • Apply argument comment lint across codex-rs (#14652)
    ## Why
    
    Once the repo-local lint exists, `codex-rs` needs to follow the
    checked-in convention and CI needs to keep it from drifting. This commit
    applies the fallback `/*param*/` style consistently across existing
    positional literal call sites without changing those APIs.
    
    The longer-term preference is still to avoid APIs that require comments
    by choosing clearer parameter types and call shapes. This PR is
    intentionally the mechanical follow-through for the places where the
    existing signatures stay in place.
    
    After rebasing onto newer `main`, the rollout also had to cover newly
    introduced `tui_app_server` call sites. That made it clear the first cut
    of the CI job was too expensive for the common path: it was spending
    almost as much time installing `cargo-dylint` and re-testing the lint
    crate as a representative test job spends running product tests. The CI
    update keeps the full workspace enforcement but trims that extra
    overhead from ordinary `codex-rs` PRs.
    
    ## What changed
    
    - keep a dedicated `argument_comment_lint` job in `rust-ci`
    - mechanically annotate remaining opaque positional literals across
    `codex-rs` with exact `/*param*/` comments, including the rebased
    `tui_app_server` call sites that now fall under the lint
    - keep the checked-in style aligned with the lint policy by using
    `/*param*/` and leaving string and char literals uncommented
    - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
    registry/git metadata in the lint job
    - split changed-path detection so the lint crate's own `cargo test` step
    runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
    - continue to run the repo wrapper over the `codex-rs` workspace, so
    product-code enforcement is unchanged
    
    Most of the code changes in this commit are intentionally mechanical
    comment rewrites or insertions driven by the lint itself.
    
    ## Verification
    
    - `./tools/argument-comment-lint/run.sh --workspace`
    - `cargo test -p codex-tui-app-server -p codex-tui`
    - parsed `.github/workflows/rust-ci.yml` locally with PyYAML
    
    ---
    
    * -> #14652
    * #14651
  • feat: make interrupt state not final for multi-agents (#13850)
    Make `interrupted` an agent state and make it not final. As a result, a
    `wait` won't return on an interrupted agent and no notification will be
    send to the parent agent.
    
    The rationals are:
    * If a user interrupt a sub-agent for any reason, you don't want the
    parent agent to instantaneously ask the sub-agent to restart
    * If a parent agent interrupt a sub-agent, no need to add a noisy
    notification in the parent agen
  • Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
    ## Summary
    - add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
    control for who reviews approval requests
    - route Smart Approvals guardian review through core for command
    execution, file changes, managed-network approvals, MCP approvals, and
    delegated/subagent approval flows
    - expose guardian review in app-server with temporary unstable
    `item/autoApprovalReview/{started,completed}` notifications carrying
    `targetItemId`, `review`, and `action`
    - update the TUI so Smart Approvals can be enabled from `/experimental`,
    aligned with the matching `/approvals` mode, and surfaced clearly while
    reviews are pending or resolved
    
    ## Runtime model
    This PR does not introduce a new `approval_policy`.
    
    Instead:
    - `approval_policy` still controls when approval is needed
    - `approvals_reviewer` controls who reviewable approval requests are
    routed to:
      - `user`
      - `guardian_subagent`
    
    `guardian_subagent` is a carefully prompted reviewer subagent that
    gathers relevant context and applies a risk-based decision framework
    before approving or denying the request.
    
    The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
    behavior keys off `approvals_reviewer`.
    
    When Smart Approvals is enabled from the TUI, it also switches the
    current `/approvals` settings to the matching Smart Approvals mode so
    users immediately see guardian review in the active thread:
    - `approval_policy = on-request`
    - `approvals_reviewer = guardian_subagent`
    - `sandbox_mode = workspace-write`
    
    Users can still change `/approvals` afterward.
    
    Config-load behavior stays intentionally narrow:
    - plain `smart_approvals = true` in `config.toml` remains just the
    rollout/UI gate and does not auto-set `approvals_reviewer`
    - the deprecated `guardian_approval = true` alias migration does
    backfill `approvals_reviewer = "guardian_subagent"` in the same scope
    when that reviewer is not already configured there, so old configs
    preserve their original guardian-enabled behavior
    
    ARC remains a separate safety check. For MCP tool approvals, ARC
    escalations now flow into the configured reviewer instead of always
    bypassing guardian and forcing manual review.
    
    ## Config stability
    The runtime reviewer override is stable, but the config-backed
    app-server protocol shape is still settling.
    
    - `thread/start`, `thread/resume`, and `turn/start` keep stable
    `approvalsReviewer` overrides
    - the config-backed `approvals_reviewer` exposure returned via
    `config/read` (including profile-level config) is now marked
    `[UNSTABLE]` / experimental in the app-server protocol until we are more
    confident in that config surface
    
    ## App-server surface
    This PR intentionally keeps the guardian app-server shape narrow and
    temporary.
    
    It adds generic unstable lifecycle notifications:
    - `item/autoApprovalReview/started`
    - `item/autoApprovalReview/completed`
    
    with payloads of the form:
    - `{ threadId, turnId, targetItemId, review, action? }`
    
    `review` is currently:
    - `{ status, riskScore?, riskLevel?, rationale? }`
    - where `status` is one of `inProgress`, `approved`, `denied`, or
    `aborted`
    
    `action` carries the guardian action summary payload from core when
    available. This lets clients render temporary standalone pending-review
    UI, including parallel reviews, even when the underlying tool item has
    not been emitted yet.
    
    These notifications are explicitly documented as `[UNSTABLE]` and
    expected to change soon.
    
    This PR does **not** persist guardian review state onto `thread/read`
    tool items. The intended follow-up is to attach guardian review state to
    the reviewed tool item lifecycle instead, which would improve
    consistency with manual approvals and allow thread history / reconnect
    flows to replay guardian review state directly.
    
    ## TUI behavior
    - `/experimental` exposes the rollout gate as `Smart Approvals`
    - enabling it in the TUI enables the feature and switches the current
    session to the matching Smart Approvals `/approvals` mode
    - disabling it in the TUI clears the persisted `approvals_reviewer`
    override when appropriate and returns the session to default manual
    review when the effective reviewer changes
    - `/approvals` still exposes the reviewer choice directly
    - the TUI renders:
    - pending guardian review state in the live status footer, including
    parallel review aggregation
      - resolved approval/denial state in history
    
    ## Scope notes
    This PR includes the supporting core/runtime work needed to make Smart
    Approvals usable end-to-end:
    - shell / unified-exec / apply_patch / managed-network / MCP guardian
    review
    - delegated/subagent approval routing into guardian review
    - guardian review risk metadata and action summaries for app-server/TUI
    - config/profile/TUI handling for `smart_approvals`, `guardian_approval`
    alias migration, and `approvals_reviewer`
    - a small internal cleanup of delegated approval forwarding to dedupe
    fallback paths and simplify guardian-vs-parent approval waiting (no
    intended behavior change)
    
    Out of scope for this PR:
    - redesigning the existing manual approval protocol shapes
    - persisting guardian review state onto app-server `ThreadItem`s
    - delegated MCP elicitation auto-review (the current delegated MCP
    guardian shim only covers the legacy `RequestUserInput` path)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Fix codex exec --profile handling (#14524)
    PR #14005 introduced a regression whereby `codex exec --profile`
    overrides were dropped when starting or resuming a thread. That causes
    the thread to miss profile-scoped settings like
    `model_instructions_file`.
    
    This PR preserve the active profile in the thread start/resume config
    overrides so the
    app-server rebuild sees the same profile that exec resolved. 
    
    Fixes #14515
  • Show spawned agent model and effort in TUI (#14273)
    - include the requested sub-agent model and reasoning effort in the
    spawn begin event\n- render that metadata next to the spawned agent name
    and role in the TUI transcript
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • start of hooks engine (#13276)
    (Experimental)
    
    This PR adds a first MVP for hooks, with SessionStart and Stop
    
    The core design is:
    
    - hooks live in a dedicated engine under codex-rs/hooks
    - each hook type has its own event-specific file
    - hook execution is synchronous and blocks normal turn progression while
    running
    - matching hooks run in parallel, then their results are aggregated into
    a normalized HookRunSummary
    
    On the AppServer side, hooks are exposed as operational metadata rather
    than transcript-native items:
    
    - new live notifications: hook/started, hook/completed
    - persisted/replayed hook results live on Turn.hookRuns
    - we intentionally did not add hook-specific ThreadItem variants
    
    Hooks messages are not persisted, they remain ephemeral. The context
    changes they add are (they get appended to the user's prompt)
  • Add request permissions tool (#13092)
    Adds a built-in `request_permissions` tool and wires it through the
    Codex core, protocol, and app-server layers so a running turn can ask
    the client for additional permissions instead of relying on a static
    session policy.
    
    The new flow emits a `RequestPermissions` event from core, tracks the
    pending request by call ID, forwards it through app-server v2 as an
    `item/permissions/requestApproval` request, and resumes the tool call
    once the client returns an approved subset of the requested permission
    profile.
  • Add in-process app server and wire up exec to use it (#14005)
    This is a subset of PR #13636. See that PR for a full overview of the
    architectural change.
    
    This PR implements the in-process app server and modifies the
    non-interactive "exec" entry point to use the app server.
    
    ---------
    
    Co-authored-by: Felipe Coury <felipe.coury@gmail.com>
  • [elicitations] Switch to use MCP style elicitation payload for mcp tool approvals. (#13621)
    - [x] Switch to use MCP style elicitation payload for mcp tool
    approvals.
    - [ ] TODO: Update the UI to support the full spec.
  • Enabling CWD Saving for Image-Gen (#13607)
    Codex now saves the generated image on to your current working
    directory.
  • feat(app-server): support mcp elicitations in v2 api (#13425)
    This adds a first-class server request for MCP server elicitations:
    `mcpServer/elicitation/request`.
    
    Until now, MCP elicitation requests only showed up as a raw
    `codex/event/elicitation_request` event from core. That made it hard for
    v2 clients to handle elicitations using the same request/response flow
    as other server-driven interactions (like shell and `apply_patch`
    tools).
    
    This also updates the underlying MCP elicitation request handling in
    core to pass through the full MCP request (including URL and form data)
    so we can expose it properly in app-server.
    
    ### Why not `item/mcpToolCall/elicitationRequest`?
    This is because MCP elicitations are related to MCP servers first, and
    only optionally to a specific MCP tool call.
    
    In the MCP protocol, elicitation is a server-to-client capability: the
    server sends `elicitation/create`, and the client replies with an
    elicitation result. RMCP models it that way as well.
    
    In practice an elicitation is often triggered by an MCP tool call, but
    not always.
    
    ### What changed
    - add `mcpServer/elicitation/request` to the v2 app-server API
    - translate core `codex/event/elicitation_request` events into the new
    v2 server request
    - map client responses back into `Op::ResolveElicitation` so the MCP
    server can continue
    - update app-server docs and generated protocol schema
    - add an end-to-end app-server test that covers the full round trip
    through a real RMCP elicitation flow
    - The new test exercises a realistic case where an MCP tool call
    triggers an elicitation, the app-server emits
    mcpServer/elicitation/request, the client accepts it, and the tool call
    resumes and completes successfully.
    
    ### app-server API flow
    - Client starts a thread with `thread/start`.
    - Client starts a turn with `turn/start`.
    - App-server sends `item/started` for the `mcpToolCall`.
    - While that tool call is in progress, app-server sends
    `mcpServer/elicitation/request`.
    - Client responds to that request with `{ action: "accept" | "decline" |
    "cancel" }`.
    - App-server sends `serverRequest/resolved`.
    - App-server sends `item/completed` for the mcpToolCall.
    - App-server sends `turn/completed`.
    - If the turn is interrupted while the elicitation is pending,
    app-server still sends `serverRequest/resolved` before the turn
    finishes.
  • image-gen-event/client_processing (#13512)
    enabling client-side to process with image-generation capabilities
    (setting app-server)
  • feat(app-server): propagate app-server trace context into core (#13368)
    ### Summary
    Propagate trace context originating at app-server RPC method handlers ->
    codex core submission loop (so this includes spans such as `run_turn`!).
    This implements PR 2 of the app-server tracing rollout.
    
    This also removes the old lower-level env-based reparenting in core so
    explicit request/submission ancestry wins instead of being overridden by
    ambient `TRACEPARENT` state.
    
    ### What changed
    - Added `trace: Option<W3cTraceContext>` to codex_protocol::Submission
    - Taught `Codex::submit()` / `submit_with_id()` to automatically capture
    the current span context when constructing or forwarding a submission
    - Wrapped the core submission loop in a submission_dispatch span
    parented from Submission.trace
    - Warn on invalid submission trace carriers and ignore them cleanly
    - Removed the old env-based downstream reparenting path in core task
    execution
    - Stopped OTEL provider init from implicitly attaching env trace context
    process-wide
    - Updated mcp-server Submission call sites for the new field
    
    Added focused unit tests for:
    - capturing trace context into Submission
    - preferring `Submission.trace` when building the core dispatch span
    
    ### Why
    PR 1 gave us consistent inbound request spans in app-server, but that
    only covered the transport boundary. For long-running work like turns
    and reviews, the important missing piece was preserving ancestry after
    the request handler returns and core continues work on a different async
    path.
    
    This change makes that handoff explicit and keeps the parentage rules
    simple:
    - app-server request span sets the current context
    - `Submission.trace` snapshots that context
    - core restores it once, at the submission boundary
    - deeper core spans inherit naturally
    
    That also lets us stop relying on env-based reparenting for this path,
    which was too ambient and could override explicit ancestry.
  • app-server service tier plumbing (plus some cleanup) (#13334)
    followup to https://github.com/openai/codex/pull/13212 to expose fast
    tier controls to app server
    (majority of this PR is generated schema jsons - actual code is +69 /
    -35 and +24 tests )
    
    - add service tier fields to the app-server protocol surfaces used by
    thread lifecycle, turn start, config, and session configured events
    - thread service tier through the app-server message processor and core
    thread config snapshots
    - allow runtime config overrides to carry service tier for app-server
    callers
    
    cleanup:
    - Removing useless "legacy" code supporting "standard" - we moved to
    None | "fast", so "standard" is not needed.
  • add fast mode toggle (#13212)
    - add a local Fast mode setting in codex-core (similar to how model id
    is currently stored on disk locally)
    - send `service_tier=priority` on requests when Fast is enabled
    - add `/fast` in the TUI and persist it locally
    - feature flag
  • Enable analytics in codex exec and codex mcp-server (#13083)
    Addresses #12913
    
    `codex exec` was not correctly defaulting to Otel metrics to enabled 
    `codex mcp-server` completely lacked an Otel collector
    
    Summary:
    - default to enabling analytics when `codex exec` initializes
    OpenTelemetry so the CLI actually reports metrics again
    - add a regression test that proves the flag remains enabled by default
    - added Otel collector to `codex mcp-server`
  • Suppress duplicate assistant output on stdout in interactive sessions (#13082)
    Addresses #12566
    
    Summary
    - stop printing the final assistant message on stdout when the process
    is running in a terminal so interactive users only see it once
    - add a helper that gates the stdout emission and cover it with unit
    tests
  • Allow clients not to send summary as an option (#12950)
    Summary is a required parameter on UserTurn. Ideally we'd like the core
    to decide the appropriate summary level.
    
    Make the summary optional and don't send it when not needed.
  • Use model catalog default for reasoning summary fallback (#12873)
    ## Summary
    - make `Config.model_reasoning_summary` optional so unset means use
    model default
    - resolve the optional config value to a concrete summary when building
    `TurnContext`
    - add protocol support for `default_reasoning_summary` in model metadata
    
    ## Validation
    - `cargo test -p codex-core --lib client::tests -- --nocapture`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Revert "Add skill approval event/response (#12633)" (#12811)
    This reverts commit https://github.com/openai/codex/pull/12633. We no
    longer need this PR, because we favor sending normal exec command
    approval server request with `additional_permissions` of skill
    permissions instead
  • Enable request_user_input in Default mode (#12735)
    ## Summary
    - allow `request_user_input` in Default collaboration mode as well as
    Plan
    - update the Default-mode instructions to prefer assumptions first and
    use `request_user_input` only when a question is unavoidable
    - update request_user_input and app-server tests to match the new
    Default-mode behavior
    - refactor collaboration-mode availability plumbing into
    `CollaborationModesConfig` for future mode-related flags
    
    ## Codex author
    `codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`
  • feat(app-server): add ThreadItem::DynamicToolCall (#12732)
    Previously, clients would call `thread/start` with dynamic_tools set,
    and when a model invokes a dynamic tool, it would just make the
    server->client `item/tool/call` request and wait for the client's
    response to complete the tool call. This works, but it doesn't have an
    `item/started` or `item/completed` event.
    
    Now we are doing this:
    - [new] emit `item/started` with `DynamicToolCall` populated with the
    call arguments
    - send an `item/tool/call` server request
    - [new] once the client responds, emit `item/completed` with
    `DynamicToolCall` populated with the response.
    
    Also, with `persistExtendedHistory: true`, dynamic tool calls are now
    reconstructable in `thread/read` and `thread/resume` as
    `ThreadItem::DynamicToolCall`.
  • feat: pass helper executable paths via Arg0DispatchPaths (#12719)
    ## Why
    
    `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` previously
    located `codex-execve-wrapper` by scanning `PATH` and sibling
    directories. That lookup is brittle and can select the wrong binary when
    the runtime environment differs from startup assumptions.
    
    We already pass `codex-linux-sandbox` from `codex-arg0`;
    `codex-execve-wrapper` should use the same startup-driven path plumbing.
    
    ## What changed
    
    - Introduced `Arg0DispatchPaths` in `codex-arg0` to carry both helper
    executable paths:
      - `codex_linux_sandbox_exe`
      - `main_execve_wrapper_exe`
    - Updated `arg0_dispatch_or_else()` to pass `Arg0DispatchPaths` to
    top-level binaries and preserve helper paths created in
    `prepend_path_entry_for_codex_aliases()`.
    - Threaded `Arg0DispatchPaths` through entrypoints in `cli`, `exec`,
    `tui`, `app-server`, and `mcp-server`.
    - Added `main_execve_wrapper_exe` to core configuration plumbing
    (`Config`, `ConfigOverrides`, and `SessionServices`).
    - Updated zsh-fork shell escalation to consume the configured
    `main_execve_wrapper_exe` and removed path-sniffing fallback logic.
    - Updated app-server config reload paths so reloaded configs keep the
    same startup-provided helper executable paths.
    
    ## References
    
    - [`Arg0DispatchPaths`
    definition](https://github.com/openai/codex/blob/e355b43d5c2a771f045296a6deae10d7c9c36ec6/codex-rs/arg0/src/lib.rs#L20-L24)
    - [`arg0_dispatch_or_else()` forwarding both
    paths](https://github.com/openai/codex/blob/e355b43d5c2a771f045296a6deae10d7c9c36ec6/codex-rs/arg0/src/lib.rs#L145-L176)
    - [zsh-fork escalation using configured wrapper
    path](https://github.com/openai/codex/blob/e355b43d5c2a771f045296a6deae10d7c9c36ec6/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs#L109-L150)
    
    ## Testing
    
    - `cargo check -p codex-arg0 -p codex-core -p codex-exec -p codex-tui -p
    codex-mcp-server -p codex-app-server`
    - `cargo test -p codex-arg0`
    - `cargo test -p codex-core tools::runtimes::shell::unix_escalation:: --
    --nocapture`
  • Agent jobs (spawn_agents_on_csv) + progress UI (#10935)
    ## Summary
    - Add agent job support: spawn a batch of sub-agents from CSV, auto-run,
    auto-export, and store results in SQLite.
    - Simplify workflow: remove run/resume/get-status/export tools; spawn is
    deterministic and completes in one call.
    - Improve exec UX: stable, single-line progress bar with ETA; suppress
    sub-agent chatter in exec.
    
    ## Why
    Enables map-reduce style workflows over arbitrarily large repos using
    the existing Codex orchestrator. This addresses review feedback about
    overly complex job controls and non-deterministic monitoring.
    
    ## Demo (progress bar)
    ```
    ./codex-rs/target/debug/codex exec \
      --enable collab \
      --enable sqlite \
      --full-auto \
      --progress-cursor \
      -c agents.max_threads=16 \
      -C /Users/daveaitel/code/codex \
      - <<'PROMPT'
    Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows:
    path = item-01..item-30, area = test.
    
    Then call spawn_agents_on_csv with:
    - csv_path: /tmp/agent_job_progress_demo.csv
    - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1."
    - output_csv_path: /tmp/agent_job_progress_demo_out.csv
    PROMPT
    ```
    
    ## Review feedback addressed
    - Auto-start jobs on spawn; removed run/resume/status/export tools.
    - Auto-export on success.
    - More descriptive tool spec + clearer prompts.
    - Avoid deadlocks on spawn failure; pending/running handled safely.
    - Progress bar no longer scrolls; stable single-line redraw.
    
    ## Tests
    - `cd codex-rs && cargo test -p codex-exec`
    - `cd codex-rs && cargo build -p codex-cli`
  • Add skill approval event/response (#12633)
    Set the stage for skill-level permission approval in addition to
    command-level.
    
    Behind a feature flag.
  • Allow exec resume to parse output-last-message flag after command (#12541)
    Summary
    - mark `output-last-message` as a global exec flag so it can follow
    subcommands like `resume`
    - add regression tests in both `cli` and `exec` crates verifying the
    flag order works when invoking `resume`
    
    Fixes #12538
  • chore: remove codex-core public protocol/shell re-exports (#12432)
    ## Why
    
    `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
    from `codex-protocol` and `codex-shell-command`. That made it easy for
    workspace crates to import those APIs through `codex-core`, which in
    turn hides dependency edges and makes it harder to reduce compile-time
    coupling over time.
    
    This change removes those public re-exports so call sites must import
    from the source crates directly. Even when a crate still depends on
    `codex-core` today, this makes dependency boundaries explicit and
    unblocks future work to drop `codex-core` dependencies where possible.
    
    ## What Changed
    
    - Removed public re-exports from `codex-rs/core/src/lib.rs` for:
    - `codex_protocol::protocol` and related protocol/model types (including
    `InitialHistory`)
      - `codex_protocol::config_types` (`protocol_config_types`)
    - `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
    parse_command, powershell}`
    - Migrated workspace Rust call sites to import directly from:
      - `codex_protocol::protocol`
      - `codex_protocol::config_types`
      - `codex_protocol::models`
      - `codex_shell_command`
    - Added explicit `Cargo.toml` dependencies (`codex-protocol` /
    `codex-shell-command`) in crates that now import those crates directly.
    - Kept `codex-core` internal modules compiling by using `pub(crate)`
    aliases in `core/src/lib.rs` (internal-only, not part of the public
    API).
    - Updated the two utility crates that can already drop a `codex-core`
    dependency edge entirely:
      - `codex-utils-approval-presets`
      - `codex-utils-cli`
    
    ## Verification
    
    - `cargo test -p codex-utils-approval-presets`
    - `cargo test -p codex-utils-cli`
    - `cargo check --workspace --all-targets`
    - `just clippy`
  • Wire realtime api to core (#12268)
    - Introduce `RealtimeConversationManager` for realtime API management 
    - Add `op::conversation` to start conversation, insert audio, insert
    text, and close conversation.
    - emit conversation lifecycle and realtime events.
    - Move shared realtime payload types into codex-protocol and add core
    e2e websocket tests for start/replace/transport-close paths.
    
    Things to consider:
    - Should we use the same `op::` and `Events` channel to carry audio? I
    think we should try this simple approach and later we can create
    separate one if the channels got congested.
    - Sending text updates to the client: we can start simple and later
    restrict that.
    - Provider auth isn't wired for now intentionally
  • feat: cleaner TUI for sub-agents (#12327)
    <img width="760" height="496" alt="Screenshot 2026-02-20 at 14 31 25"
    src="https://github.com/user-attachments/assets/1983b825-bb47-417e-9925-6f727af56765"
    />
  • client side modelinfo overrides (#12101)
    TL;DR
    Add top-level `model_catalog_json` config support so users can supply a
    local model catalog override from a JSON file path (including adding new
    models) without backend changes.
    
    ### Problem
    Codex previously had no clean client-side way to replace/overlay model
    catalog data for local testing of model metadata and new model entries.
    
    ### Fix
    - Add top-level `model_catalog_json` config field (JSON file path).
    - Apply catalog entries when resolving `ModelInfo`:
      1. Base resolved model metadata (remote/fallback)
      2. Catalog overlay from `model_catalog_json`
    3. Existing global top-level overrides (`model_context_window`,
    `model_supports_reasoning_summaries`, etc.)
    
    ### Note
    Will revisit per-field overrides in a follow-up
    
    ### Tests
    Added tests
  • [js_repl] paths for node module resolution can be specified for js_repl (#11944)
    # External (non-OpenAI) Pull Request Requirements
    
    In `js_repl` mode, module resolution currently starts from
    `js_repl_kernel.js`, which is written to a per-kernel temp dir. This
    effectively means that bare imports will not resolve.
    
    This PR adds a new config option, `js_repl_node_module_dirs`, which is a
    list of dirs that are used (in order) to resolve a bare import. If none
    of those work, the current working directory of the thread is used.
    
    For example:
    ```toml
    js_repl_node_module_dirs = [
        "/path/to/node_modules/",
        "/other/path/to/node_modules/",
    ]
    ```
  • feat(core): zsh exec bridge (#12052)
    zsh fork PR stack:
    - https://github.com/openai/codex/pull/12051 
    - https://github.com/openai/codex/pull/12052 👈 
    
    ### Summary
    This PR introduces a feature-gated native shell runtime path that routes
    shell execution through a patched zsh exec bridge, removing MCP-specific
    behavior from the shell hot path while preserving existing
    CommandExecution lifecycle semantics.
    
    When shell_zsh_fork is enabled, shell commands run via patched zsh with
    per-`execve` interception through EXEC_WRAPPER. Core receives wrapper
    IPC requests over a Unix socket, applies existing approval policy, and
    returns allow/deny before the subcommand executes.
    
    ### What’s included
    **1) New zsh exec bridge runtime in core**
    - Wrapper-mode entrypoint (maybe_run_zsh_exec_wrapper_mode) for
    EXEC_WRAPPER invocations.
    - Per-execution Unix-socket IPC handling for wrapper requests/responses.
    - Approval callback integration using existing core approval
    orchestration.
    - Streaming stdout/stderr deltas to existing command output event
    pipeline.
    - Error handling for malformed IPC, denial/abort, and execution
    failures.
    
    **2) Session lifecycle integration**
    SessionServices now owns a `ZshExecBridge`.
    Session startup initializes bridge state; shutdown tears it down
    cleanly.
    
    **3) Shell runtime routing (feature-gated)**
    When `shell_zsh_fork` is enabled:
    - Build execution env/spec as usual.
    - Add wrapper socket env wiring.
    - Execute via `zsh_exec_bridge.execute_shell_request(...)` instead of
    the regular shell path.
    - Non-zsh-fork behavior remains unchanged.
    
    **4) Config + feature wiring**
    - Added `Feature::ShellZshFork` (under development).
    - Added config support for `zsh_path` (optional absolute path to patched
    zsh):
    - `Config`, `ConfigToml`, `ConfigProfile`, overrides, and schema.
    - Session startup validates that `zsh_path` exists/usable when zsh-fork
    is enabled.
    - Added startup test for missing `zsh_path` failure mode.
    
    **5) Seatbelt/sandbox updates for wrapper IPC**
    - Extended seatbelt policy generation to optionally allow outbound
    connection to explicitly permitted Unix sockets.
    - Wired sandboxing path to pass wrapper socket path through to seatbelt
    policy generation.
    - Added/updated seatbelt tests for explicit socket allow rule and
    argument emission.
    
    **6) Runtime entrypoint hooks**
    - This allows the same binary to act as the zsh wrapper subprocess when
    invoked via `EXEC_WRAPPER`.
    
    **7) Tool selection behavior**
    - ToolsConfig now prefers ShellCommand type when shell_zsh_fork is
    enabled.
    - Added test coverage for precedence with unified-exec enabled.
  • chore: rm remote models fflag (#11699)
    rm `remote_models` feature flag.
    
    We see issues like #11527 when a user has `remote_models` disabled, as
    we always use the default fallback `ModelInfo`. This causes issues with
    model performance.
    
    Builds on #11690, which helps by warning the user when they are using
    the default fallback. This PR will make that happen much less frequently
    as an accidental consequence of disabling `remote_models`.
  • Feat: add model reroute notification (#12001)
    ### Summary
    Builiding off
    https://github.com/openai/codex/pull/11964/files/5c75aa7b89a70bc2cc410a6fd238749306ec4c5e#diff-058ae8f109a8b84b4b79bbfa45f522c2233b9d9e139696044ae374d50b6196e0,
    we have created a `model/rerouted` notification that captures the event
    so that consumers can render as expected. Keep the `EventMsg::Warning`
    path in core so that this does not affect TUI rendering.
    
    `model/rerouted` is meant to be generic to account for future usage
    including capacity planning etc.
  • Report syntax errors in rules file (#11686)
    Currently, if there are syntax errors detected in the starlark rules
    file, the entire policy is silently ignored by the CLI. The app server
    correctly emits a message that can be displayed in a GUI.
    
    This PR changes the CLI (both the TUI and non-interactive exec) to fail
    when the rules file can't be parsed. It then prints out an error message
    and exits with a non-zero exit code. This is consistent with the
    handling of errors in the config file.
    
    This addresses #11603
  • feat: introduce Permissions (#11633)
    ## Why
    We currently carry multiple permission-related concepts directly on
    `Config` for shell/unified-exec behavior (`approval_policy`,
    `sandbox_policy`, `network`, `shell_environment_policy`,
    `windows_sandbox_mode`).
    
    Consolidating these into one in-memory struct makes permission handling
    easier to reason about and sets up the next step: supporting named
    permission profiles (`[permissions.PROFILE_NAME]`) without changing
    behavior now.
    
    This change is mostly mechanical: it updates existing callsites to go
    through `config.permissions`, but it does not yet refactor those
    callsites to take a single `Permissions` value in places where multiple
    permission fields are still threaded separately.
    
    This PR intentionally **does not** change the on-disk `config.toml`
    format yet and keeps compatibility with legacy config keys.
    
    ## What Changed
    - Introduced `Permissions` in `core/src/config/mod.rs`.
    - Added `Config::permissions` and moved effective runtime permission
    fields under it:
      - `approval_policy`
      - `sandbox_policy`
      - `network`
      - `shell_environment_policy`
      - `windows_sandbox_mode`
    - Updated config loading/building so these effective values are still
    derived from the same existing config inputs and constraints.
    - Updated Windows sandbox helpers/resolution to read/write via
    `permissions`.
    - Threaded the new field through all permission consumers across core
    runtime, app-server, CLI/exec, TUI, and sandbox summary code.
    - Updated affected tests to reference `config.permissions.*`.
    - Renamed the struct/field from
    `EffectivePermissions`/`effective_permissions` to
    `Permissions`/`permissions` and aligned variable naming accordingly.
    
    ## Verification
    - `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server
    -p codex-exec -p codex-utils-sandbox-summary`
    - `cargo build -p codex-core -p codex-tui -p codex-cli -p
    codex-app-server -p codex-exec -p codex-utils-sandbox-summary`
  • Add feature-gated freeform js_repl core runtime (#10674)
    ## Summary
    
    This PR adds an **experimental, feature-gated `js_repl` core runtime**
    so models can execute JavaScript in a persistent REPL context across
    tool calls.
    
    The implementation integrates with existing feature gating, tool
    registration, prompt composition, config/schema docs, and tests.
    
    ## What changed
    
    - Added new experimental feature flag: `features.js_repl`.
    - Added freeform `js_repl` tool and companion `js_repl_reset` tool.
    - Gated tool availability behind `Feature::JsRepl`.
    - Added conditional prompt-section injection for JS REPL instructions
    via marker-based prompt processing.
    - Implemented JS REPL handlers, including freeform parsing and pragma
    support (timeout/reset controls).
    - Added runtime resolution order for Node:
      1. `CODEX_JS_REPL_NODE_PATH`
      2. `js_repl_node_path` in config
      3. `PATH`
    - Added JS runtime assets/version files and updated docs/schema.
    
    ## Why
    
    This enables richer agent workflows that require incremental JavaScript
    execution with preserved state, while keeping rollout safe behind an
    explicit feature flag.
    
    ## Testing
    
    Coverage includes:
    
    - Feature-flag gating behavior for tool exposure.
    - Freeform parser/pragma handling edge cases.
    - Runtime behavior (state persistence across calls and top-level `await`
    support).
    
    ## Usage
    
    ```toml
    [features]
    js_repl = true
    ```
    
    Optional runtime override:
    
    - `CODEX_JS_REPL_NODE_PATH`, or
    - `js_repl_node_path` in config.
    
    #### [git stack](https://github.com/magus/git-stack-cli)
    - 👉 `1` https://github.com/openai/codex/pull/10674
    -  `2` https://github.com/openai/codex/pull/10672
    -  `3` https://github.com/openai/codex/pull/10671
    -  `4` https://github.com/openai/codex/pull/10673
    -  `5` https://github.com/openai/codex/pull/10670
  • Cache cloud requirements (#11305)
    We're loading these from the web on every startup. This puts them in a
    local file with a 1hr TTL.
    
    We sign the downloaded requirements with a key compiled into the Codex
    CLI to prevent unsophisticated tampering (determined circumvention is
    outside of our threat model: after all, one could just compile Codex
    without any of these checks).
    
    If any of the following are true, we ignore the local cache and re-fetch
    from Cloud:
    * The signature is invalid for the payload (== requirements, sign time,
    ttl, user identity)
    * The identity does not match the auth'd user's identity
    * The TTL has expired
    * We cannot parse requirements.toml from the payload
  • feat: split codex-common into smaller utils crates (#11422)
    We are removing feature-gated shared crates from the `codex-rs`
    workspace. `codex-common` grouped several unrelated utilities behind
    `[features]`, which made dependency boundaries harder to reason about
    and worked against the ongoing effort to eliminate feature flags from
    workspace crates.
    
    Splitting these utilities into dedicated crates under `utils/` aligns
    this area with existing workspace structure and keeps each dependency
    explicit at the crate boundary.
    
    ## What changed
    
    - Removed `codex-rs/common` (`codex-common`) from workspace members and
    workspace dependencies.
    - Added six new utility crates under `codex-rs/utils/`:
      - `codex-utils-cli`
      - `codex-utils-elapsed`
      - `codex-utils-sandbox-summary`
      - `codex-utils-approval-presets`
      - `codex-utils-oss`
      - `codex-utils-fuzzy-match`
    - Migrated the corresponding modules out of `codex-common` into these
    crates (with tests), and added matching `BUILD.bazel` targets.
    - Updated direct consumers to use the new crates instead of
    `codex-common`:
      - `codex-rs/cli`
      - `codex-rs/tui`
      - `codex-rs/exec`
      - `codex-rs/app-server`
      - `codex-rs/mcp-server`
      - `codex-rs/chatgpt`
      - `codex-rs/cloud-tasks`
    - Updated workspace lockfile entries to reflect the new dependency graph
    and removal of `codex-common`.
  • chore: persist turn_id in rollout session and make turn_id uuid based (#11246)
    Problem:
    1. turn id is constructed in-memory;
    2. on resuming threads, turn_id might not be unique;
    3. client cannot no the boundary of a turn from rollout files easily.
    
    This PR does three things:
    1. persist `task_started` and `task_complete` events;
    1. persist `turn_id` in rollout turn events;
    5. generate turn_id as unique uuids instead of incrementing it in
    memory.
    
    This helps us resolve the issue of clients wanting to have unique turn
    ids for resuming a thread, and knowing the boundry of each turn in
    rollout files.
    
    example debug logs
    ```
    2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None }
    2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None }
    2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None }
    ```
    
    backward compatibility:
    if you try to resume an old session without task_started and
    task_complete event populated, the following happens:
    - If you resume and do nothing: those reconstructed historical IDs can
    differ next time you resume.
    - If you resume and send a new turn: the new turn gets a fresh UUID from
    live submission flow and is persisted, so that new turn’s ID is stable
    on later resumes.
    I think this behavior is fine, because we only care about deterministic
    turn id once a turn is triggered.
  • feat: do not close unified exec processes across turns (#10799)
    With this PR we do not close the unified exec processes (i.e. background
    terminals) at the end of a turn unless:
    * The user interrupt the turn
    * The user decide to clean the processes through `app-server` or
    `/clean`
    
    I made sure that `codex exec` correctly kill all the processes
  • Add resume_agent collab tool (#10903)
    Summary
    - add the new resume_agent collab tool path through core, protocol, and
    the app server API, including the resume events
    - update the schema/TypeScript definitions plus docs so resume_agent
    appears in generated artifacts and README
    - note that resumed agents rehydrate rollout history without overwriting
    their base instructions
    
    Testing
    - Not run (not requested)
  • Handle required MCP startup failures across components (#10902)
    Summary
    - add a `required` flag for MCP servers everywhere config/CLI data is
    touched so mandatory helpers can be round-tripped
    - have `codex exec` and `codex app-server` thread start/resume fail fast
    when required MCPs fail to initialize
  • Handle exec shutdown on Interrupt (fixes immortal codex exec with websockets) (#10519)
    ### Motivation
    - Ensure `codex exec` exits when a running turn is interrupted (e.g.,
    Ctrl-C) so the CLI is not "immortal" when websockets/streaming are used.
    
    ### Description
    - Return `CodexStatus::InitiateShutdown` when handling
    `EventMsg::TurnAborted` in
    `exec/src/event_processor_with_human_output.rs` so human-output exec
    mode shuts down after an interrupt.
    - Treat `protocol::EventMsg::TurnAborted` as
    `CodexStatus::InitiateShutdown` in
    `exec/src/event_processor_with_jsonl_output.rs` so JSONL output mode
    behaves the same.
    - Applied formatting with `just fmt`.
    
    ### Testing
    - Ran `just fmt` successfully.
    - Ran `cargo test -p codex-exec`; many unit tests ran and the test
    command completed, but the full test run in this environment produced
    `35 passed, 11 failed` where the failures are due to Landlock sandbox
    panics and 403 responses in the test harness (environmental/integration
    issues) and are not caused by the interrupt/shutdown changes.
    
    ------
    [Codex
    Task](https://chatgpt.com/codex/tasks/task_i_698165cec4e083258d17702bd29014c1)