Commit Graph

5210 Commits

  • Support Warp for OSC 9 notifications (#17174)
    Problem: Warp supports OSC 9 notifications, but the TUI's automatic
    notification backend selection did not recognize its
    `TERM_PROGRAM=WarpTerminal` environment value.
    
    Solution: Treat `TERM_PROGRAM=WarpTerminal` as OSC 9-capable when
    choosing the TUI desktop notification backend.
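    
    A minimal sketch of the check described above, not the actual TUI code.
    The function names are hypothetical, only the `WarpTerminal` value comes
    from this commit, and the other terminals the real selection logic
    recognizes are left out. The second helper shows the conventional OSC 9
    framing (ESC ] 9 ; message BEL).
    
    ```rust
    /// Hypothetical helper: true when the `TERM_PROGRAM` value names a terminal
    /// treated as OSC 9-capable. Only `WarpTerminal` is taken from the commit;
    /// other recognized terminals are intentionally omitted.
    pub fn term_program_supports_osc9(term_program: Option<&str>) -> bool {
        matches!(term_program, Some("WarpTerminal"))
    }
    
    /// Builds an OSC 9 notification sequence: ESC ] 9 ; <message> BEL.
    pub fn osc9_notification(message: &str) -> String {
        format!("\x1b]9;{message}\x07")
    }
    ```
    
    A caller could pass `std::env::var("TERM_PROGRAM").ok().as_deref()` as
    the argument to the first helper.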
  • Add WebRTC realtime app-server e2e tests (#17093)
    Summary:
    - add app-server WebRTC realtime e2e harness
    - cover v1 handoff and v2 codex tool delegation over sideband
    
    Validation:
    - just fmt
    - git diff --check
    - local tests not run; relying on PR CI
  • Auto-approve MCP server elicitations in Full Access mode (#17164)
    Currently, when an MCP server sends an elicitation to Codex running in
    Full Access (`sandbox_policy: DangerFullAccess` + `approval_policy:
    Never`), the elicitation is auto-cancelled.
    
    This PR updates the automatic handling of MCP elicitations to be
    consistent with other approvals in Full Access, which are
    auto-approved. Because MCP elicitations may actually require user input,
    this mechanism is limited to empty form elicitations.
    
    ## Changeset
    - Add a policy helper shared with the existing MCP tool call
    auto-approval
    - Update `ElicitationRequestManager` to auto-approve elicitations in
    Full Access when `can_auto_accept_elicitation` is true.
    - Add tests
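    
    The policy above could be sketched as follows. This is an illustration
    under stated assumptions, not the code in this PR: the enum variants
    other than `DangerFullAccess` and `Never`, the function names, and the
    use of a field count to model an "empty form" are all hypothetical.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum SandboxPolicy {
        WorkspaceWrite, // illustrative non-Full-Access variant
        DangerFullAccess,
    }
    
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ApprovalPolicy {
        OnRequest, // illustrative variant that still prompts the user
        Never,
    }
    
    /// Full Access means no sandbox and no approval prompts.
    pub fn is_full_access(sandbox: SandboxPolicy, approval: ApprovalPolicy) -> bool {
        sandbox == SandboxPolicy::DangerFullAccess && approval == ApprovalPolicy::Never
    }
    
    /// Auto-approve only when running in Full Access and the elicitation form
    /// has no fields, so no real user input can be skipped.
    pub fn should_auto_approve_elicitation(
        sandbox: SandboxPolicy,
        approval: ApprovalPolicy,
        form_field_count: usize, // assumption: 0 models an empty form
    ) -> bool {
        is_full_access(sandbox, approval) && form_field_count == 0
    }
    ```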
    
    Co-authored-by: Codex <noreply@openai.com>
  • Update guardian output schema (#17061)
    ## Summary
    - Update guardian output schema to separate risk, authorization,
    outcome, and rationale.
    - Feed guardian rationale into rejection messages.
    - Split the guardian policy into template and tenant-config sections.
    
    ## Validation
    - `cargo test -p codex-core mcp_tool_call`
    - `env -u CODEX_SANDBOX_NETWORK_DISABLED INSTA_UPDATE=always cargo test
    -p codex-core guardian::`
    
    ---------
    
    Co-authored-by: Owen Lin <owen@openai.com>
  • Add top-level exec-server subcommand (#17162)
    ## Summary
    - add a top-level `codex exec-server` subcommand, marked experimental in
    CLI help
    - launch an adjacent or PATH-provided `codex-exec-server`, with a
    source-tree `cargo run -p codex-exec-server --` fallback
    - cover the new subcommand parser path
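    
    The lookup order in the summary (adjacent binary, then PATH, then the
    `cargo run` fallback) might look roughly like the sketch below. The
    `Launch` type, the function name, and the injected `exists` predicate
    are hypothetical; the sketch ignores platform executable suffixes and
    treats the cargo fallback as an unconditional last resort, which may
    differ from the real condition.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    #[derive(Debug, PartialEq, Eq)]
    pub enum Launch {
        /// Run a binary found on disk.
        Binary(PathBuf),
        /// Fall back to a source-tree invocation (argv, program first).
        CargoRun(Vec<String>),
    }
    
    /// Resolves how to start the exec server, checking candidates in order.
    pub fn resolve_exec_server(
        exe_dir: &Path,
        path_dirs: &[PathBuf],
        exists: impl Fn(&Path) -> bool,
    ) -> Launch {
        const NAME: &str = "codex-exec-server";
    
        // 1. A binary sitting next to the current executable.
        let adjacent = exe_dir.join(NAME);
        if exists(&adjacent) {
            return Launch::Binary(adjacent);
        }
        // 2. The first match in the PATH directories.
        for dir in path_dirs {
            let candidate = dir.join(NAME);
            if exists(&candidate) {
                return Launch::Binary(candidate);
            }
        }
        // 3. Source-tree fallback taken from the commit text.
        Launch::CargoRun(
            ["cargo", "run", "-p", "codex-exec-server", "--"]
                .iter()
                .map(|s| s.to_string())
                .collect(),
        )
    }
    ```
    
    Passing `exists` as a closure keeps the ordering testable without
    touching the filesystem; a real caller could pass `|p| p.is_file()`.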
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - not run: Rust test suite
    
    Co-authored-by: Codex <noreply@openai.com>
  • Attach WebRTC realtime starts to sideband websocket (#17057)
    Summary:
    - parse the realtime call Location header and join that call over the
    direct realtime WebSocket
    - keep WebRTC starts alive on the existing realtime conversation path
    
    Validation:
    - just fmt
    - git diff --check
    - cargo check -p codex-api
    - cargo check -p codex-core --tests
    - local cargo tests not run; relying on PR CI
  • Wire realtime WebRTC native media into Bazel (#17145)
    - Builds codex-realtime-webrtc through the normal Bazel Rust macro so
    native macOS WebRTC sources are included.
    - Shares the macOS -ObjC link flag with Bazel targets that can link
    libwebrtc.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Fix ToolsConfigParams initializer in tool registry test (#17154)
    ## Summary
    - add the missing `image_generation_tool_auth_allowed` field to the new
    tool registry plan test initializer
    
    ## Validation
    - `just fmt`
    - `cargo test -p codex-tools image_generation`
    - `cargo test -p codex-tools --no-run`
  • Fix missing fields (#17149)
    Fix missing `image_generation_tool_auth_allowed` in two locations.
  • [codex] Support remote exec cwd in TUI startup (#17142)
    When running with a remote executor, the cwd is the remote path. Today
    we check for the existence of a local directory on startup and attempt
    to load config from it.
    
    For remote executors, skip both steps.
  • Add sandbox support to filesystem APIs (#16751)
    ## Summary
    - add optional `sandboxPolicy` support to the app-server filesystem
    request surface
    - thread sandbox-aware filesystem options through app-server and
    exec-server adapters
    - enforce sandboxed read/write access in the filesystem abstraction with
    focused local and remote coverage
    
    ## Validation
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-exec-server file_system`
    - `cargo test -p codex-app-server suite::v2::fs`
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • release ready, enabling only for siwc users (#17046)
    **Disabling Image-Gen for Non-SIWC Codex Users**
    
    We are enabling the image-gen feature only for SIWC Codex users until
    ResponsesAPI ships a fix that omits output from responses.completed.
    This prevents the following issues:
    
    1. The websocket blows up under a heavier load (images) than before
    (text).
    2. The HTTP parser streams through on the order of n^2 bytes for n
    base64 bytes (the sum of the base64 payloads of all images generated in
    the turn), which causes long delays in turn_completion.
  • fix(debug-config, guardian): fix /debug-config rendering and guardian… (#17138)
    ## Description
    
    This PR fixes `/debug-config` so it shows more of the active
    requirements state, including reviewer requirements and managed feature
    pins. This made it clear that legacy MDM config was setting
    `approvals_reviewer = "guardian_subagent"` and that we were translating
    that into a requirements constraint.
    
    Also, translate `approvals_reviewer = "guardian_subagent"` (from legacy
    managed_config.toml) to `allowed_approvals_reviewers: guardian_subagent,
    user` instead of `allowed_approvals_reviewers: guardian_subagent`.
    
    Example `/debug-config`:
    ```
    Config layer stack (lowest precedence first):
      1. system (/etc/codex/config.toml) (enabled)
      2. user (/Users/owen/.codex/config.toml) (enabled)
      3. project (/Users/owen/repos/codex/.codex/config.toml) (enabled)
      4. legacy managed_config.toml (MDM) (enabled)
         MDM value:
           ...
    
           # Enable Guardian Mode
           features.guardian_approval = true
           approvals_reviewer = "guardian_subagent"
    
    Requirements:
      - allowed_approvals_reviewers: guardian_subagent, user (source: MDM managed_config.toml (legacy))
      - features: apps=true, plugins=true (source: cloud requirements)
    ```
    
    Before this PR, the `Requirements` section showed None.
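    
    The translation described above can be pictured with a small sketch.
    The function names are hypothetical and only the `guardian_subagent`
    case is taken from this PR; other legacy values are deliberately left
    unhandled here.
    
    ```rust
    /// Maps a legacy `approvals_reviewer` value to the allowed reviewers
    /// requirement. Returns `None` for values this sketch does not cover.
    pub fn legacy_reviewer_to_allowed(approvals_reviewer: &str) -> Option<Vec<&'static str>> {
        match approvals_reviewer {
            // `user` stays allowed alongside the guardian subagent.
            "guardian_subagent" => Some(vec!["guardian_subagent", "user"]),
            _ => None,
        }
    }
    
    /// Formats the requirement in the style of the `/debug-config` output.
    pub fn render_requirement(allowed: &[&str]) -> String {
        format!("allowed_approvals_reviewers: {}", allowed.join(", "))
    }
    ```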
  • Use AbsolutePathBuf for exec cwd plumbing (#17063)
    ## Summary
    - Carry `AbsolutePathBuf` through tool cwd parsing/resolution instead of
    resolving workdirs to raw `PathBuf`s.
    - Type exec/sandbox request cwd fields as `AbsolutePathBuf` through
    `ExecParams`, `ExecRequest`, `SandboxCommand`, and unified exec runtime
    requests.
    - Keep `PathBuf` conversions at external/event boundaries and update
    existing tests/fixtures for the typed cwd.
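    
    For readers unfamiliar with the pattern, the sketch below shows the
    general idea of a typed absolute path. `AbsPath` is a hypothetical
    stand-in written for illustration; it is not the `AbsolutePathBuf` API
    from this repository.
    
    ```rust
    use std::path::{Path, PathBuf};
    
    /// Illustrative newtype: a path that is absolute by construction.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct AbsPath(PathBuf);
    
    impl AbsPath {
        /// Accepts only absolute paths.
        pub fn new(path: PathBuf) -> Option<Self> {
            if path.is_absolute() {
                Some(Self(path))
            } else {
                None
            }
        }
    
        /// Resolves a tool-supplied workdir against an absolute base.
        /// `Path::join` replaces the base when `workdir` is itself absolute.
        pub fn resolve(base: &AbsPath, workdir: &Path) -> AbsPath {
            AbsPath(base.0.join(workdir))
        }
    
        pub fn as_path(&self) -> &Path {
            &self.0
        }
    
        /// Conversion back to a raw `PathBuf` for external boundaries.
        pub fn into_path_buf(self) -> PathBuf {
            self.0
        }
    }
    ```
    
    Carrying such a type through request structs moves the "is this
    absolute?" check to the point of construction instead of every use.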
    
    ## Validation
    - `cargo check -p codex-core --tests`
    - `cargo check -p codex-sandboxing --tests`
    - `cargo test -p codex-sandboxing`
    - `cargo test -p codex-core --lib tools::handlers::`
    - `just fix -p codex-sandboxing`
    - `just fix -p codex-core`
    - `just fmt`
    
    Full `codex-core` test suite was not run locally; per repo guidance I
    kept local validation targeted.
  • Add WebRTC media transport to realtime TUI (#17058)
    Adds the `[realtime].transport = "webrtc"` TUI media path using a new
    `codex-realtime-webrtc` crate, while leaving app-server as the
    signaling/event source.
    
    Local checks: fmt, diff-check, dependency tree only; test signal should
    come from CI.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [mcp] Support server-driven elicitations (#17043)
    - [x] Enables MCP elicitation for custom servers, not just Codex Apps
    - [x] Adds an RMCP service wrapper to preserve elicitation _meta
    - [x] Round-trips response _meta for persist/approval choices
    - [x] Turns TUI empty-schema elicitations into message-only approval
    prompts
  • Add realtime transport config (#17097)
    Adds realtime.transport config with websocket as the default and webrtc
    wired through the effective config.
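    
    A minimal sketch of a config value with `websocket` as the default,
    assuming the setting is spelled as the strings `websocket` and
    `webrtc`. The type and function names are hypothetical and the real
    config plumbing is not shown.
    
    ```rust
    use std::str::FromStr;
    
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
    pub enum RealtimeTransport {
        #[default]
        Websocket,
        Webrtc,
    }
    
    impl FromStr for RealtimeTransport {
        type Err = String;
    
        fn from_str(s: &str) -> Result<Self, Self::Err> {
            match s {
                "websocket" => Ok(Self::Websocket),
                "webrtc" => Ok(Self::Webrtc),
                other => Err(format!("unknown realtime transport: {other}")),
            }
        }
    }
    
    /// Resolves the effective transport: an absent setting uses the default.
    pub fn effective_transport(configured: Option<&str>) -> Result<RealtimeTransport, String> {
        match configured {
            Some(value) => value.parse(),
            None => Ok(RealtimeTransport::default()),
        }
    }
    ```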
    
    Co-authored-by: Codex <noreply@openai.com>
  • Skip MCP auth probing for disabled servers (#17098)
    Addresses #16971
    
    Problem: Disabled MCP servers were still queried for streamable HTTP
    auth status during MCP inventory, so unreachable disabled entries could
    add startup latency.
    
    Solution: Return `Unsupported` immediately for disabled MCP server
    configs before bearer token/OAuth status discovery.
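    
    The early return could be sketched like this. Only the `Unsupported`
    result comes from the commit; the other variants, the config struct,
    and the injected `probe` closure are hypothetical stand-ins for the real
    status discovery.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum AuthStatus {
        Unsupported,
        LoggedIn,    // illustrative
        NotLoggedIn, // illustrative
    }
    
    pub struct McpServerConfig {
        pub enabled: bool,
        pub url: String,
    }
    
    /// Returns `Unsupported` for disabled servers without calling `probe`,
    /// so an unreachable disabled entry adds no network latency.
    pub fn auth_status(
        config: &McpServerConfig,
        probe: impl FnOnce(&str) -> AuthStatus,
    ) -> AuthStatus {
        if !config.enabled {
            return AuthStatus::Unsupported;
        }
        probe(&config.url)
    }
    ```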
  • Fix TUI crash when resuming the current thread (#17086)
    Problem: Resuming the live TUI thread through `/resume` could
    unsubscribe and reconnect the same app-server thread, leaving the UI
    crashed or disconnected.
    
    Solution: No-op `/resume` only when the selected thread is the currently
    attached active thread; keep the normal resume path for
    stale/displayed-only threads so recovery and reattach still work.
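    
    The guard amounts to a single comparison, sketched below with
    hypothetical names and string thread ids. A stale or displayed-only
    thread is modeled as "not the attached active thread".
    
    ```rust
    #[derive(Debug, PartialEq, Eq)]
    pub enum ResumeAction {
        /// The selected thread is already attached and active: do nothing.
        NoOp,
        /// Take the normal resume path (recovery and reattach still work).
        Resume,
    }
    
    pub fn resume_action(selected: &str, attached_active: Option<&str>) -> ResumeAction {
        if attached_active == Some(selected) {
            ResumeAction::NoOp
        } else {
            ResumeAction::Resume
        }
    }
    ```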
  • Show global AGENTS.md in /status (#17091)
    Addresses #3793
    
    Problem: /status only reported project-level AGENTS files, so sessions
    with a loaded global $CODEX_HOME/AGENTS.md still showed Agents.md as
    <none>.
    
    Solution: Track the global instructions file loaded during config
    initialization and prepend that path to the /status Agents.md summary,
    with coverage for AGENTS.md, AGENTS.override.md, and global-plus-project
    ordering.
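    
    A rough sketch of the ordering, with a hypothetical function name and
    an assumed comma separator; only the global-first ordering and the
    `<none>` placeholder come from the description above.
    
    ```rust
    /// Builds the Agents.md summary: the global file first, then project files.
    pub fn agents_summary(global: Option<&str>, project: &[&str]) -> String {
        let mut paths: Vec<&str> = Vec::new();
        if let Some(path) = global {
            paths.push(path);
        }
        paths.extend_from_slice(project);
    
        if paths.is_empty() {
            "<none>".to_string()
        } else {
            paths.join(", ") // separator is an assumption
        }
    }
    ```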
  • Configure multi_agent_v2 spawn agent hints (#17071)
    Allow the multi_agent_v2 feature to have its own temporary configuration
    under `[features.multi_agent_v2]`:
    
    ```
    [features.multi_agent_v2]
    enabled = true
    usage_hint_enabled = false
    usage_hint_text = "Custom delegation guidance."
    hide_spawn_agent_metadata = true
    ```
    
    When `usage_hint_text` is absent, the default hint is used.
    
    ```
    [features]
    multi_agent_v2 = true
    ```
    
    still works as the boolean shorthand.
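    
    One way to model a setting that accepts either the boolean shorthand or
    a table is an enum with two forms, sketched below. The type names, the
    placeholder default hint, and the behavior of the shorthand beyond
    `enabled` are assumptions; the field names mirror the TOML keys above.
    
    ```rust
    /// Placeholder only; the real default hint text is not shown in this commit.
    pub const DEFAULT_USAGE_HINT: &str = "<default hint>";
    
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct MultiAgentV2Config {
        pub enabled: bool,
        pub usage_hint_enabled: bool,
        pub usage_hint_text: Option<String>,
        pub hide_spawn_agent_metadata: bool,
    }
    
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum MultiAgentV2Toggle {
        /// `multi_agent_v2 = true` under `[features]`.
        Bool(bool),
        /// `[features.multi_agent_v2]` table form.
        Table(MultiAgentV2Config),
    }
    
    impl MultiAgentV2Toggle {
        pub fn enabled(&self) -> bool {
            match self {
                Self::Bool(on) => *on,
                Self::Table(cfg) => cfg.enabled,
            }
        }
    
        /// An absent `usage_hint_text` falls back to the default hint.
        pub fn hint_text(&self) -> &str {
            match self {
                Self::Table(MultiAgentV2Config { usage_hint_text: Some(text), .. }) => text.as_str(),
                _ => DEFAULT_USAGE_HINT,
            }
        }
    }
    ```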
  • codex debug 14 (guardian approved) (#17130)
    Removes lines 92-98 from core/templates/agents/orchestrator.md.
  • codex debug 12 (guardian approved) (#17128)
    Removes lines 78-84 from core/templates/agents/orchestrator.md.
  • codex debug 10 (guardian approved) (#17126)
    Removes lines 64-70 from core/templates/agents/orchestrator.md.
  • codex debug 8 (guardian approved) (#17124)
    Removes lines 50-56 from core/templates/agents/orchestrator.md.
  • codex debug 6 (guardian approved) (#17122)
    Removes lines 36-42 from core/templates/agents/orchestrator.md.
  • codex debug 4 (guardian approved) (#17120)
    Removes lines 22-28 from core/templates/agents/orchestrator.md.
  • codex debug 2 (guardian approved) (#17118)
    Removes lines 8-14 from core/templates/agents/orchestrator.md.
  • codex debug 15 (guardian approved) (#17131)
    Removes lines 99-106 from core/templates/agents/orchestrator.md.
  • codex debug 13 (guardian approved) (#17129)
    Removes lines 85-91 from core/templates/agents/orchestrator.md.
  • codex debug 11 (guardian approved) (#17127)
    Removes lines 71-77 from core/templates/agents/orchestrator.md.
  • codex debug 9 (guardian approved) (#17125)
    Removes lines 57-63 from core/templates/agents/orchestrator.md.
  • codex debug 7 (guardian approved) (#17123)
    Removes lines 43-49 from core/templates/agents/orchestrator.md.
  • codex debug 5 (guardian approved) (#17121)
    Removes lines 29-35 from core/templates/agents/orchestrator.md.
  • codex debug 3 (guardian approved) (#17119)
    Removes lines 15-21 from core/templates/agents/orchestrator.md.
  • codex debug 1 (guardian approved) (#17117)
    Removes lines 1-7 from core/templates/agents/orchestrator.md.
  • feat: single app-server bootstrap in TUI (#16582)
    Before this, the TUI started two app-servers: one to check the login
    status and one to actually start the session.
    
    This PR starts only one app-server and defers the login check to an
    async task, outside of the frame rendering path.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Support anyOf and enum in JsonSchema (#16875)
    This brings us into better alignment with the JSON schema subset that is
    supported in
    <https://developers.openai.com/api/docs/guides/structured-outputs#supported-schemas>,
    and also allows us to render richer function signatures in code mode
    (e.g., anyOf{null, OtherObjectType})
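    
    To illustrate what "richer function signatures" can mean, here is a
    small sketch of a schema tree with `anyOf` and `enum` and a renderer
    for it. The `Schema` type and `render` function are hypothetical and
    much smaller than the real JsonSchema; the enum rendering style is an
    assumption.
    
    ```rust
    #[derive(Debug, Clone, PartialEq)]
    pub enum Schema {
        Null,
        Str,
        Number,
        /// String enum values.
        Enum(Vec<String>),
        /// A named object type.
        Ref(String),
        AnyOf(Vec<Schema>),
    }
    
    /// Renders a schema as a compact type expression.
    pub fn render(schema: &Schema) -> String {
        match schema {
            Schema::Null => "null".to_string(),
            Schema::Str => "string".to_string(),
            Schema::Number => "number".to_string(),
            Schema::Enum(values) => values
                .iter()
                .map(|v| format!("{v:?}"))
                .collect::<Vec<_>>()
                .join(" | "),
            Schema::Ref(name) => name.clone(),
            Schema::AnyOf(variants) => {
                let inner: Vec<String> = variants.iter().map(render).collect();
                format!("anyOf{{{}}}", inner.join(", "))
            }
        }
    }
    ```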
  • Remove obsolete codex-cli README (#17096)
    Problem: codex-cli/README.md is obsolete and confusing to keep around.
    
    Solution: Delete codex-cli/README.md so the stale README is no longer
    present in the repository.
  • Remove expired April 2nd tooltip copy (#16698)
    Addresses #16677
    
    Problem: Paid-plan startup tooltips still advertised 2x rate limits
    until April 2nd after that promo had expired.
    
    Solution: Remove the stale expiry copy and use evergreen Codex App /
    Codex startup tips instead.
  • fix: refresh network proxy settings when sandbox mode changes (#17040)
    ## Summary
    
    Fix network proxy sessions so changing sandbox mode recomputes the
    effective managed network policy and applies it to the already-running
    per-session proxy.
    
    ## Root Cause
    
    `danger_full_access_denylist_only` injects `"*"` only while building the
    proxy spec for Full Access. Sessions built that spec once at startup, so
    a later permission switch to Full Access left the live proxy in its
    original restricted policy. Switching back needed the same recompute
    path to remove the synthetic wildcard again.
    
    ## What Changed
    
    - Preserve the original managed network proxy config/requirements so the
    effective spec can be recomputed for a new sandbox policy.
    - Refresh the current session proxy when sandbox settings change, then
    reapply exec-policy network overlays.
    - Add an in-place proxy state update path while rejecting
    listener/port/SOCKS changes that cannot be hot-reloaded.
    - Keep runtime proxy settings cheap to snapshot and update.
    - Add regression coverage for workspace-write -> Full Access ->
    workspace-write.
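    
    The root cause is easier to see with the spec computation written as a
    pure function of the preserved config and the current sandbox mode. The
    sketch below is illustrative: the type names and the
    `allowed_domains` field are assumptions, and only the `"*"` injection
    under `danger_full_access_denylist_only` comes from the text above.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum SandboxMode {
        WorkspaceWrite,
        FullAccess,
    }
    
    /// The original managed config, kept around so the spec can be recomputed.
    #[derive(Debug, Clone)]
    pub struct ManagedNetworkConfig {
        pub allowed_domains: Vec<String>,
        pub danger_full_access_denylist_only: bool,
    }
    
    /// Recomputes the effective allowlist for the given sandbox mode. The
    /// synthetic wildcard exists only in the Full Access result and never
    /// leaks back into the preserved config.
    pub fn effective_allowed_domains(
        base: &ManagedNetworkConfig,
        mode: SandboxMode,
    ) -> Vec<String> {
        let mut allowed = base.allowed_domains.clone();
        if mode == SandboxMode::FullAccess && base.danger_full_access_denylist_only {
            allowed.push("*".to_string());
        }
        allowed
    }
    ```
    
    Calling this on every sandbox change, rather than once at startup,
    covers both directions of the workspace-write -> Full Access ->
    workspace-write round trip.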
  • Add project-local codex bug triage skill (#17064)
    Add a `codex-bug` skill to help diagnose and fix bugs in codex.
  • Add remote exec start script (#17059)
    Just pass an SSH host:
    ```
    ./scripts/start-codex-exec.sh codex-remote
    ```
  • Add regression tests for JsonSchema (#17052)
    Tests added for existing JsonSchema in
    `codex-rs/tools/src/json_schema_tests.rs`:
    
    - `parse_tool_input_schema_coerces_boolean_schemas`
    - `parse_tool_input_schema_infers_object_shape_and_defaults_properties`
    - `parse_tool_input_schema_normalizes_integer_and_missing_array_items`
    - `parse_tool_input_schema_sanitizes_additional_properties_schema`
    - `parse_tool_input_schema_infers_object_shape_from_boolean_additional_properties_only`
    - `parse_tool_input_schema_infers_number_from_numeric_keywords`
    - `parse_tool_input_schema_infers_number_from_multiple_of`
    - `parse_tool_input_schema_infers_string_from_enum_const_and_format_keywords`
    - `parse_tool_input_schema_defaults_empty_schema_to_string`
    - `parse_tool_input_schema_infers_array_from_prefix_items`
    - `parse_tool_input_schema_preserves_boolean_additional_properties_on_inferred_object`
    - `parse_tool_input_schema_infers_object_shape_from_schema_additional_properties_only`
    
    Tests that we expect to fail on the baseline normalizer, but pass with
    the new JsonSchema:
    
    - `parse_tool_input_schema_preserves_nested_nullable_type_union`
    - `parse_tool_input_schema_preserves_nested_any_of_property`
  • fix(tui): reduce startup and new-session latency (#17039)
    ## TL;DR
    
    - Fetches account/rateLimits/read asynchronously so the TUI can continue
    starting without waiting for the rate-limit response.
    - Fixes the /status card so it no longer leaves a stale “refreshing
    cached limits...” notice in terminal history.
    
    ## Problem
    
    The TUI bootstrap path fetched account rate limits synchronously
    (`account/rateLimits/read`) before the event loop started for
    ChatGPT/OpenAI-authenticated startups. This added ~670 ms of blocking
    latency in the measured hot-start case, even though rate-limit data is
    not needed to render the initial UI or accept user input. The delay was
    especially noticeable on hot starts where every other RPC
    (`account/read`, `model/list`, `thread/start`) completed in under 70 ms
    total.
    
    Moving that fetch to the background also exposed a `/status` UI bug: the
    status card is flattened into terminal scrollback when it is inserted. A
    transient "refreshing limits in background..." line could not be cleared
    later, because the async completion updated the retained `HistoryCell`,
    not the already-written terminal history.
    
    ## Mental model
    
    Before this change, `AppServerSession::bootstrap()` performed three
    sequential RPCs: `account/read` → `model/list` →
    `account/rateLimits/read`. The result of the third call was baked into
    `AppServerBootstrap` and applied to the chat widget before the event
    loop began.
    
    After this change, `bootstrap()` only performs two RPCs (`account/read`
    + `model/list`), and rate-limit fetching is kicked off as an async
    background task immediately after the first frame is scheduled. A new
    enum, `RateLimitRefreshOrigin`, tags each fetch so the event handler
    knows whether the result came from the startup prefetch or from a
    user-initiated `/status` command; they have different completion
    side-effects.
    
    The `get_login_status()` helper (used outside the main app flow) was
    also decoupled: it previously called the full `bootstrap()` just to
    check auth mode, wasting model-list and rate-limit work. It now calls
    the narrower `read_account()` directly.
    
    For `/status`, this PR keeps the background refresh request but stops
    printing transient refresh notices into status history when cached
    limits are already available. If a refresh updates the cache, the next
    `/status` command will render the new values.
    
    ## Non-goals
    
    - This change does not alter the rate-limit data itself.
    - This change does not introduce caching, retries, or staleness
    management for rate limits.
    - This change does not affect the `model/list` or `thread/start` RPCs;
    they remain on the critical startup path.
    
    ## Tradeoffs
    
    - **Stale-on-first-render**: The status bar will briefly show no
    rate-limit info until the background fetch completes; observed
    background fetches landed roughly in the 400-900 ms range after the UI
    appeared. This is acceptable because the user cannot meaningfully act on
    rate-limit data in the first fraction of a second.
    - **Error silence on startup prefetch**: If the startup prefetch fails,
    the error is logged but the UI is not notified (unlike `/status` refresh
    failures, which go through the status-command completion path). This
    avoids surfacing transient network errors as a startup blocker.
    - **Static `/status` history**: `/status` output is terminal history,
    not a live widget. The card now avoids progress-style language that
    would appear stuck in scrollback; users can run `/status` again to see
    newly cached values.
    - **`account_auth_mode` field removed from `AppServerBootstrap`**: The
    only consumer was `get_login_status()`, which no longer goes through
    `bootstrap()`. The field was dead weight.
    
    ## Architecture
    
    ### New types
    
    - `RateLimitRefreshOrigin` (in `app_event.rs`): A `Copy` enum
    distinguishing `StartupPrefetch` from `StatusCommand { request_id }`.
    Carried through `RefreshRateLimits` and `RateLimitsLoaded` events so the
    handler applies the right completion behavior.
    
    ### Modified types
    
    - `AppServerBootstrap`: Lost `account_auth_mode` and
    `rate_limit_snapshots`; gained `requires_openai_auth: bool` (passed
    through from the account response so the caller can decide whether to
    fire the prefetch).
    
    ### Control flow
    
    1. `bootstrap()` returns with `requires_openai_auth` and
    `has_chatgpt_account`.
    2. After scheduling the first frame, `App::run_inner` fires
    `refresh_rate_limits(StartupPrefetch)` if both flags are true.
    3. When `RateLimitsLoaded { StartupPrefetch, Ok(..) }` arrives,
    snapshots are applied and a frame is scheduled to repaint the status
    bar.
    4. When `RateLimitsLoaded { StartupPrefetch, Err(..) }` arrives, the
    error is logged and no UI update occurs.
    5. `/status`-initiated refreshes continue to use `StatusCommand {
    request_id }` and call `finish_status_rate_limit_refresh` on completion
    (success or failure).
    6. `/status` history cells with cached rate-limit rows no longer render
    an additional "refreshing limits" notice; the async refresh updates the
    cache for future status output.
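    
    The control flow above can be condensed into a small dispatch sketch.
    `RateLimitRefreshOrigin` and its variants are named in this PR; the
    `u64` request id, the `Completion` enum, and both function names are
    hypothetical simplifications of the real event handling.
    
    ```rust
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum RateLimitRefreshOrigin {
        StartupPrefetch,
        StatusCommand { request_id: u64 }, // id type is an assumption
    }
    
    /// Hypothetical summary of what the handler does on completion.
    #[derive(Debug, PartialEq, Eq)]
    pub enum Completion {
        ApplySnapshotsAndRepaint,
        LogOnly,
        FinishStatusRefresh { request_id: u64 },
    }
    
    /// Step 2: fire the startup prefetch only when both flags are true.
    pub fn should_prefetch(requires_openai_auth: bool, has_chatgpt_account: bool) -> bool {
        requires_openai_auth && has_chatgpt_account
    }
    
    /// Steps 3 to 5: pick the completion behavior from the origin and result.
    pub fn on_rate_limits_loaded(origin: RateLimitRefreshOrigin, ok: bool) -> Completion {
        match (origin, ok) {
            (RateLimitRefreshOrigin::StartupPrefetch, true) => Completion::ApplySnapshotsAndRepaint,
            (RateLimitRefreshOrigin::StartupPrefetch, false) => Completion::LogOnly,
            (RateLimitRefreshOrigin::StatusCommand { request_id }, _) => {
                Completion::FinishStatusRefresh { request_id }
            }
        }
    }
    ```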
    
    ### Extracted method
    
    - `AppServerSession::read_account()`: Factored out of `bootstrap()` so
    that `get_login_status()` can call it independently without triggering
    model-list or rate-limit work.
    
    ## Observability
    
    - The existing `tracing::warn!` for rate-limit fetch failures is
    preserved for the startup path.
    - No new metrics or spans are introduced. The startup-time improvement
    is observable via the existing `ready` timestamp in TUI startup logs.
    
    ## Tests
    
    - Existing tests in `status_command_tests.rs` are updated to match on
    `RateLimitRefreshOrigin::StatusCommand { request_id }` instead of a bare
    `request_id`.
    - Focused `/status` tests now assert that status history avoids
    transient refresh text, continues to request an async refresh, and uses
    refreshed cached limits in future status output.
    - No new tests are added for the startup prefetch path because it is a
    fire-and-forget spawn with no observable side-effect other than the
    widget state update, which is already covered by the
    snapshot-application tests.
    
    ---------
    
    Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  • Use model metadata for Fast Mode status (#16949)
    Fast Mode status was still tied to one model name in the TUI and
    model-list plumbing. This changes the model metadata shape so a model
    can advertise additional speed tiers, carries that field through the
    app-server model list, and uses it to decide when to show Fast Mode
    status.
    
    For people using Codex, the behavior is intended to stay the same for
    existing models. Fast Mode still requires the existing signed-in /
    feature-gated path; the difference is that the UI can now recognize any
    model the model list marks as Fast-capable, instead of requiring a new
    client-side slug check.
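    
    As an illustration of metadata-driven detection, the sketch below
    replaces a slug comparison with a lookup in the model's advertised
    tiers. The struct, the `additional_speed_tiers` field name, the `"fast"`
    tier value, and the gating parameters are all assumptions made for the
    example.
    
    ```rust
    pub struct ModelInfo {
        pub slug: String,
        /// Hypothetical field: extra speed tiers the model advertises.
        pub additional_speed_tiers: Vec<String>,
    }
    
    /// True when the model list marks this model as Fast-capable.
    pub fn is_fast_capable(model: &ModelInfo) -> bool {
        model.additional_speed_tiers.iter().any(|tier| tier == "fast")
    }
    
    /// Fast Mode status still requires the signed-in, feature-gated path.
    pub fn show_fast_mode_status(model: &ModelInfo, signed_in: bool, feature_enabled: bool) -> bool {
        signed_in && feature_enabled && is_fast_capable(model)
    }
    ```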
  • [codex] Apply patches through executor filesystem (#17048)
    ## Summary
    - run apply_patch through the executor filesystem when a remote
    environment is present instead of shelling out to the local process
    - thread the executor FileSystem into apply_patch interception and keep
    existing local behavior for non-remote turns
    - make the apply_patch integration harness use the executor filesystem
    for setup/assertions
    - add remote-aware skips for turn-diff coverage that still reads the
    test-runner filesystem
    
    ## Why
    Remote apply_patch needed to mutate the remote workspace instead of the
    local checkout. The tests also needed to seed and assert workspace state
    through the same filesystem abstraction so local and remote runs
    exercise the same behavior.
    
    ## Validation
    - `just fmt`
    - `git diff --check`
    - `cargo check -p core_test_support --tests`
    - `cargo test -p codex-core --test all
    suite::shell_serialization::apply_patch_custom_tool_call -- --nocapture`
    - `cargo test -p codex-core --test all
    suite::apply_patch_cli::apply_patch_cli_updates_file_appends_trailing_newline
    -- --nocapture`
    - remote `cargo test -p codex-core --test all apply_patch_cli --
    --nocapture` (229 passed)
  • Fix remote address format to work with Windows Firewall rules. (#17053)
    Since March 27, most elevated sandbox setups are failing with:
    ```
    {
      "code": "helper_firewall_rule_create_or_add_failed",
      "message": "SetRemoteAddresses_failed__Error___code__HRESULT_0xD000000D___message___An_invalid_parameter_was_passed_to_a_service_or_function.",
      "originator": "Codex_Desktop",
      "__metric_type": "sum"
    }
    ```