Commit Graph

7421 Commits

  • Specify platform support in AGENTS.md (#27966)
    Codex sometimes does interesting things with `cfg` attributes, so it
    would be good to give it guidance on how broadly our Rust needs to
    work.
    
    This adds a very brief section to AGENTS.md explaining that we target
    the major desktop OSes and that we want the vast majority of our logic
    to be portable across them.
  • Add Guardian catalog diagnostics metadata (#27109)
    ## Why
    
    We need request-level evidence for Guardian cases where
    `codex-auto-review` is missing from the client-side model catalog and
    the review falls back to the parent model.
    
    ## What changed
    
    - Add `guardian_catalog_contains_auto_review` to Guardian Responses API
    client metadata.
    - Add `guardian_model_provider_id` to Guardian Responses API client
    metadata.
    - Keep review-session metadata optional so callers without metadata
    preserve the existing `None` path.
    - Add tests for override, normal preferred-model, and
    missing-auto-review-catalog behavior.
    
    ## Validation
    
    - `just test -p codex-core
    guardian_review_records_missing_auto_review_model_in_request_metadata`
    - `just test -p codex-core
    guardian_review_uses_model_catalog_override_when_preferred_review_model_exists`
    - `just test -p codex-core
    guardian_review_uses_preferred_review_model_without_model_catalog_override`
    - `git diff --check origin/main`
  • [2 of 3] Support long pasted text in TUI goals (#27509)
    ## Stack
    
    1. [1 of 3] Support long raw TUI goal objectives - #27508
    2. **[2 of 3] Support long pasted text in TUI goals** - this PR
    3. [3 of 3] Support images in TUI goals - #27510
    
    ## Why
    
    Large text pasted into the TUI composer is represented as a paste
    placeholder plus pending paste metadata. For `/goal`, preserving only
    the visible placeholder is not enough: the agent would see a short
    placeholder string instead of the actual pasted text, and the long-text
    support from the first PR would never see the payload.
    
    The TUI also needs to avoid writing stale sidecar files when a user
    pastes a large block and then deletes its placeholder before submitting
    the goal.
    
    ## What Changed
    
    - Introduces a TUI `GoalDraft` for goal submissions so `/goal`, `/goal
    edit`, and queued goal commands can carry objective text plus text
    elements and pending paste payloads.
    - Materializes active pasted-text placeholders to `pasted-text-N.txt`
    files through the app-server filesystem path introduced in #27508.
    - Rewrites active paste placeholders in the persisted objective to file
    references, while leaving literal placeholder-looking text alone.
    - Filters out deleted paste placeholders so otherwise-small goals do not
    require `$CODEX_HOME` or remote filesystem writes.
    - Preserves pending paste metadata when a `/goal` command is queued
    before a thread exists.
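    
    A minimal sketch of the placeholder filtering and rewriting described above. The `PendingPaste` type, the `[Pasted #N]` placeholder format, and the function name are hypothetical. This version detects active placeholders by substring match, which is a simplification: the actual change tracks text elements so that literal placeholder-looking text is left alone.
    
    ```rust
    /// Hypothetical pending paste: the visible placeholder plus the real payload.
    struct PendingPaste {
        placeholder: String,
        text: String,
    }
    
    /// Returns the rewritten objective and the `(filename, payload)` pairs to write.
    /// Pastes whose placeholder no longer appears in the objective are skipped,
    /// so a deleted placeholder produces no sidecar file.
    fn materialize_pastes(
        objective: &str,
        pending: &[PendingPaste],
    ) -> (String, Vec<(String, String)>) {
        let mut rewritten = objective.to_string();
        let mut files = Vec::new();
        for paste in pending {
            if !rewritten.contains(&paste.placeholder) {
                continue; // placeholder was deleted before submission
            }
            // Numbering only the surviving pastes is an assumption of this sketch.
            let filename = format!("pasted-text-{}.txt", files.len() + 1);
            rewritten = rewritten.replacen(&paste.placeholder, &filename, 1);
            files.push((filename, paste.text.clone()));
        }
        (rewritten, files)
    }
    ```
    
    When every placeholder has been deleted, the returned file list is empty and the caller has nothing to write.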
    
    ## Verification
    
    - Added goal materialization tests for active paste placeholders,
    deleted paste placeholders, and whitespace-only paste payloads.
    - Added/updated TUI slash-command tests for large pasted text, queued
    `/goal` commands before thread start, and queued oversized goal
    behavior.
    
    ## Manual Testing
    
    - Used real terminal bracketed-paste sequences through a remote TUI
    session. A 1,228-byte multiline paste became `pasted-text-1.txt`; its
    first/last lines and byte count matched exactly, and the persisted
    objective referenced the server-host path.
    - Pasted a large block, deleted its placeholder, and submitted a small
    replacement objective. No new directory or sidecar file was created.
    - Added two same-length large pastes to one goal. The composer
    disambiguated their visible placeholders, and materialization preserved
    order and contents in `pasted-text-1.txt` and `pasted-text-2.txt`.
    - Submitted a whitespace-only large paste and verified the goal was
    rejected as empty without writing a file.
    - Submitted a pasted-text replacement while another goal was active,
    verified no file was written before confirmation, then canceled and
    confirmed the original goal remained unchanged.
    - Combined a large paste with enough raw text to exceed 4,000 characters
    after placeholder rewriting. The paste sidecar and `goal-objective.md`
    were created in the same remote attachment directory, and `/goal edit`
    restored the rewritten objective with its sidecar reference.
  • [codex] add roles to realtime append text (#27936)
    ## Summary
    
    Add an explicit `user` or `developer` role to
    `thread/realtime/appendText` and propagate it through the realtime input
    queue into `conversation.item.create`. Older JSON clients that omit the
    field continue to default to `user`.
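    
    The defaulting rule can be sketched as follows. The type and function names here are hypothetical, and the real request type is deserialized from JSON rather than built by hand.
    
    ```rust
    /// Role attached to appended realtime text (hypothetical type name).
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
    enum AppendTextRole {
        /// Used when an older client omits the field.
        #[default]
        User,
        Developer,
    }
    
    /// Hypothetical request shape: `role` is optional on the wire.
    struct AppendTextParams {
        text: String,
        role: Option<AppendTextRole>,
    }
    
    /// Resolve the role that would be forwarded with the appended text.
    fn effective_role(params: &AppendTextParams) -> AppendTextRole {
        params.role.unwrap_or_default()
    }
    ```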
    
    This lets app-provided context such as memory retain developer authority
    without bypassing app-server through a renderer-owned data channel. The
    app-server schemas, API documentation, and focused protocol and
    websocket coverage are updated with the new contract.
    
    The Codex Apps consumer is tracked in
    [openai/openai#1025261](https://github.com/openai/openai/pull/1025261).
  • feat: use encrypted local secrets for MCP OAuth (#27541)
    ## Summary
    
    - store MCP OAuth credentials in the configured auth credential backend
    - support encrypted-local OAuth storage, including legacy keyring
    migration
    - propagate the credential backend through MCP refresh, session, CLI,
    and app-server paths
    
    ## Stack
    
    1. #27504 — config and feature flag
    2. #27535 — auth-specific secret namespaces
    3. #27539 — encrypted CLI auth storage
    4. this PR — encrypted MCP OAuth storage
    
    This is a parallel review stack; the original #17931 remains unchanged.
    
    ## Tests
    
    - `just test -p codex-rmcp-client` (the transport round-trip test passed
    after building the required `codex` binary and retrying)
    - `just test -p codex-mcp`
    - `just test -p codex-app-server
    refresh_config_uses_latest_auth_keyring_backend`
    - `just test -p codex-core
    refresh_mcp_servers_is_deferred_until_next_turn`
    - `just test -p codex-cli mcp`
    - `just fix -p codex-rmcp-client -p codex-mcp -p codex-core -p codex-cli
    -p codex-app-server -p codex-protocol`
    - `just bazel-lock-check`
  • Warn for structured feature toggles (#27076)
    ## Summary
    Startup warnings for under-development features only recognized bare
    boolean toggles like `features.foo = true`. An upcoming feature will use
    table-format config, so `features.foo = { enabled = true, ... }` needs
    to count as an explicit opt-in too.
    
    This updates the warning predicate to recognize structured tables with
    `enabled = true`, while leaving tables without that field unwarned.
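    
    A rough sketch of that predicate, using a simplified stand-in for a parsed TOML value. `ConfigValue` and `is_explicitly_enabled` are illustrative names, not the ones in `codex-features`.
    
    ```rust
    use std::collections::BTreeMap;
    
    /// Simplified stand-in for a parsed TOML value (hypothetical).
    #[derive(Debug, Clone, PartialEq)]
    enum ConfigValue {
        Bool(bool),
        Table(BTreeMap<String, ConfigValue>),
    }
    
    /// Returns true when the value is an explicit opt-in: either a bare `true`
    /// or a table containing `enabled = true`. Tables without that field, or
    /// with `enabled = false`, do not count.
    fn is_explicitly_enabled(value: &ConfigValue) -> bool {
        match value {
            ConfigValue::Bool(enabled) => *enabled,
            ConfigValue::Table(fields) => {
                matches!(fields.get("enabled"), Some(ConfigValue::Bool(true)))
            }
        }
    }
    ```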
    
    ## Testing
    - `just fmt`
    - `just test -p codex-features
    unstable_warning_event_mentions_enabled_structured_under_development_feature`
  • feat: use encrypted local secrets for CLI auth (#27539)
    ## Why
    
    Windows Credential Manager limits generic credential blobs to 2,560
    bytes. Large serialized ChatGPT auth payloads can exceed that limit, so
    keyring-mode CLI auth needs a backend that keeps only the encryption key
    in the OS keyring and stores the payload in Codex's encrypted
    local-secrets file.
    
    This is the third PR in the encrypted-auth stack:
    
    1. #27504 — feature and config selection
    2. #27535 — auth-specific local-secrets namespaces
    3. This PR — CLI auth implementation and activation
    4. MCP OAuth implementation and activation
    
    ## What Changed
    
    - Added encrypted CLI-auth storage using the `CliAuth` secrets
    namespace.
    - Preserved direct keyring storage for platforms/configurations where it
    remains selected.
    - Selected the backend consistently for login, logout, refresh,
    device-code login, auth loading, and login restrictions.
    - Threaded resolved bootstrap/full config through CLI, exec, TUI,
    app-server account handling, cloud config, and cloud tasks.
    - Removed stale `auth.json` fallback data after successful encrypted
    saves and removed encrypted, direct-keyring, and fallback data during
    logout.
    - Added storage and integration coverage for both direct and encrypted
    keyring modes.
    
    MCP OAuth persistence is intentionally left to the next PR.
    
    ## Validation
    
    - `just test -p codex-login` — 131 passed
    - `just test -p codex-cli` — 280 passed
    - `just test -p codex-app-server v2::account` — 25 passed
    - `just test -p codex-cloud-config service` — 21 passed, 7 skipped
    - `just fix -p codex-login`
    - `just fix -p codex-cli`
    - `just fmt`
  • Remove TUI realtime voice support (#27801)
    ## Why
    
    Removes realtime audio support from the TUI.
    
    ## What Changed
    
    - Removed the TUI `/realtime` and realtime `/settings` command paths.
    - Deleted TUI voice capture/playback, WebRTC session handling,
    audio-device selection UI, and recording-meter code.
    - Removed TUI realtime tests and snapshots that covered the deleted
    surfaces.
    - Dropped the TUI-only `cpal` and `codex-realtime-webrtc` dependencies
    and refreshed the Rust/Bazel locks.
  • Support plaintext agent messages (#27830)
    ## Why
    
    Multi-agent v2 `send_message` deliveries already reach the receiving
    model as typed `agent_message` items with encrypted content.
    Child-completion notifications are generated by Codex itself, so their
    content is plaintext and previously fell back to a serialized JSON
    envelope inside an assistant message.
    
    With plaintext `input_text` supported for `agent_message`, both delivery
    paths can use the same model-visible type while preserving explicit
    author and recipient metadata.
    
    ## What changed
    
    - add plaintext `input_text` support to `AgentMessageInputContent` and
    regenerate the affected app-server schemas
    - preserve `InterAgentCommunication` as structured mailbox input instead
    of converting it to assistant text
    - record delivered communications as typed `agent_message` history items
    - persist a dedicated rollout item so local delivery metadata such as
    `trigger_turn` remains available without leaking into the Responses
    request
    - reconstruct typed agent messages on resume and preserve fork-turn
    truncation behavior
    - remove request-time assistant-content parsing
    - preserve plaintext and encrypted inter-agent deliveries in stage-one
    memory inputs
    - normalize and link plaintext and encrypted agent messages in rollout
    traces without treating inbound messages as child results
    - cover the real MultiAgent V2 child-completion path end to end with
    deterministic mailbox synchronization
    
    ## Verification
    
    - `just test -p codex-core
    plaintext_multi_agent_v2_completion_sends_agent_message`
    - `just test -p codex-core input_queue_drains_mailbox_in_delivery_order
    record_initial_history_reconstructs_typed_inter_agent_message
    fork_turn_positions_use_inter_agent_delivery_metadata`
    - `just test -p codex-memories-write
    serializes_inter_agent_communications_for_memory`
    - `just test -p codex-rollout-trace
    agent_messages_preserve_routing_and_content
    sub_agent_started_activity_creates_spawn_edge`
    - `just test -p codex-rollout-trace
    agent_result_edge_falls_back_to_child_thread_without_result_message`
    - `just test -p codex-protocol -p codex-rollout -p
    codex-app-server-protocol`
  • fix(plugins) rm plugin descriptions (#23254)
    ## Summary
    Removes plugin descriptions from the developer message, since the
    descriptions of skills and MCPs already cover the capabilities offered
    by the plugin.
    
    ## Testing
    - [x] Updates unit tests
  • [codex] Align implicit skill reads with parser (#27926)
    ## Summary
    - reuse the shared shell read parser for implicit skill doc invocation
    detection
    - add regression coverage for `nl -ba .../SKILL.md`
    
    ## Why
    Desktop could render `Read User Context skill` for reads recognized by
    the shared command parser, while implicit `skill_invocation` analytics
    used a separate reader allowlist and missed cases such as `nl`.
    
    ## Validation
    - `HOME=/private/tmp/codex-core-skills-home-pr
    PATH=/Users/alexsong/.cache/cargo-home/bin:$PATH
    CARGO_HOME=/Users/alexsong/.cache/cargo-home just test -p
    codex-core-skills`
    - `git diff --cached --check`
    - `just fmt` attempted; Rust formatting completed, but the Python
    formatters could not download uncached Ruff wheels because
    `files.pythonhosted.org` is blocked in this sandbox.
    - `bazel mod deps --lockfile_mode=update/error
    --repo_env=ASPECT_TOOLS_TELEMETRY= --repo_env=DO_NOT_TRACK=1` evaluated
    the module graph and produced no `MODULE.bazel.lock` diff, but Bazel
    crashed on sandboxed `sysctl` during exit.
  • [codex] Add crate API surface review rule (#27939)
    ## Why
    
    Review guidance should explicitly discourage widening crate APIs for
    testing convenience. Keeping those boundaries narrow reduces accidental
    coupling and prevents one-off test utilities from becoming durable
    public surface area.
    
    ## What
    
    - Add a crate API surface rule to `AGENTS.md`.
    - Ask reviewers to keep crate APIs small and avoid proliferating
    test-only helpers.
    
    ## Test plan
    
    - Not run (documentation-only change).
  • feat: add auth-specific encrypted secret namespaces (#27535)
    ## Why
    
    CLI auth and MCP OAuth credentials should use separate encrypted files
    while sharing the existing local-secrets implementation and
    OS-keyring-backed encryption key mechanism.
    
    This is the second PR in the encrypted-auth stack:
    
    1. #27504 — feature and config selection
    2. This PR — auth-specific local-secrets namespaces
    3. CLI auth implementation and activation
    4. MCP OAuth implementation and activation
    
    ## What Changed
    
    - Added `LocalSecretsNamespace` variants for shared secrets, CLI auth,
    and MCP OAuth.
    - Selected `local.age`, `cli_auth.age`, or `mcp_oauth.age` from the
    namespace.
    - Made atomic temporary filenames derive from the selected secrets
    filename.
    - Added namespaced `SecretsManager` construction and coverage proving
    the auth namespaces write separate encrypted files.
    - Made the default keyring store clonable for downstream namespaced auth
    backends.
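    
    For illustration, the namespace-to-filename selection might look like the sketch below. The `Shared` and `McpOauth` variant names, the method names, and the `.tmp` suffix are assumptions; only the three `.age` filenames come from this change.
    
    ```rust
    /// Hypothetical variant names; this change only states that shared secrets,
    /// CLI auth, and MCP OAuth each get a namespace.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum LocalSecretsNamespace {
        Shared,
        CliAuth,
        McpOauth,
    }
    
    impl LocalSecretsNamespace {
        /// Encrypted file selected for this namespace.
        fn secrets_filename(self) -> &'static str {
            match self {
                Self::Shared => "local.age",
                Self::CliAuth => "cli_auth.age",
                Self::McpOauth => "mcp_oauth.age",
            }
        }
    
        /// Temporary name for an atomic write, derived from the selected
        /// filename so namespaces never share a temp file. The suffix is assumed.
        fn temp_filename(self) -> String {
            format!("{}.tmp", self.secrets_filename())
        }
    }
    ```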
    
    This PR does not activate either auth backend or change existing
    credential behavior.
    
    ## Validation
    
    - `just test -p codex-secrets` — 7 passed
    - `just test -p codex-keyring-store` — package has no test binaries
    - `just fmt`
  • [login] revoke existing auth before starting login (#27674)
    ## Why
    
    `codex login` previously persisted newly issued OAuth credentials and
    only then attempted to revoke the superseded refresh token. The old
    credential must be revoked before a replacement browser or device-code
    flow starts, and successful login must not perform any post-login
    revocation attempt.
    
    ## What changed
    
    - Revoke and clear existing stored auth before browser or device-code
    CLI login begins.
    - Remove superseded-token detection and revocation from the shared token
    persistence path; successful login now only saves the new credentials.
    - Read the raw configured auth store during CLI cleanup so
    environment-provided auth cannot mask the stored refresh token.
    - Preserve `auto` storage fallback semantics when keyring deletion fails
    by clearing the fallback auth file.
    - Add a process-level CLI regression test that requires the revoke
    request to precede every device-login request and occur exactly once.
    
    If replacement login is canceled or fails, the previous local
    credentials have already been cleared. Remote revocation remains best
    effort, matching explicit logout behavior.
    
    ## Validation
    
    ### Process-level before/after reproduction
    
    I compiled the real `codex` CLI from the pre-fix parent (`14df0e8833`)
    and from the PR implementation (`25c002f23b`; the login behavior is
    unchanged at the current head), then ran the same device-code flow
    against a local HTTP mock OAuth authority.
    
    Each run:
    
    1. Used a fresh temporary `CODEX_HOME` configured with
    `cli_auth_credentials_store = "file"`.
    2. Seeded that temporary home with managed ChatGPT auth containing
    `old-access` and `old-refresh` tokens.
    3. Pointed `CODEX_REVOKE_TOKEN_URL_OVERRIDE` at the mock `/oauth/revoke`
    endpoint.
    4. Ran the compiled CLI as:
    
       ```shell
       CODEX_HOME=<temporary-home> \
         CODEX_REVOKE_TOKEN_URL_OVERRIDE=<mock-issuer>/oauth/revoke \
         <compiled-codex> login --device-auth --experimental_issuer <mock-issuer>
       ```
    
    5. Recorded every request received by the mock authority. The mock
    marked `new-access` valid when `/oauth/token` issued it and invalidated
    it if `/oauth/revoke` arrived afterward, reproducing the observed
    session-invalidating failure mode. After login exited, the harness also
    verified the persisted refresh token and probed a protected endpoint
    with `new-access`.
    
    | Build | Observed request order | CLI/persistence result | `new-access` probe |
    | --- | --- | --- | --- |
    | Pre-fix | `usercode → device token → OAuth token → revoke(old-refresh)` | Exit `0`; `new-refresh` persisted | `401` |
    | PR | `revoke(old-refresh) → usercode → device token → OAuth token` | Exit `0`; `new-refresh` persisted | `200` |
    
    The PR run therefore issued exactly one revocation request, before any
    request that initiated the replacement login, and issued no revocation
    after token exchange.
    
    ### Regression coverage
    
    
    `codex-rs/cli/tests/login.rs::device_login_revokes_existing_auth_before_requesting_new_tokens`
    runs the real first-party `codex` binary against a `wiremock` OAuth
    server with an isolated temporary `CODEX_HOME`. It asserts:
    
    - the exact request sequence is `/oauth/revoke`,
    `/api/accounts/deviceauth/usercode`, `/api/accounts/deviceauth/token`,
    then `/oauth/token`;
    - there is exactly one revoke request and its body contains
    `old-refresh` with the `refresh_token` hint;
    - the completed login persists `new-refresh`.
    
    Local validation:
    
    - `just test -p codex-login` — 130 passed
    - `just test -p codex-cli` — 280 passed, including the new process-level
    regression test
    - `just bazel-lock-check`
  • feat: add secret auth storage configuration (#27504)
    ## Why
    
    Windows Credential Manager limits generic credential blobs to 2,560
    bytes. The encrypted local secrets backend avoids storing large
    serialized auth payloads directly in the OS keyring, but selecting that
    backend needs an independently reviewable feature/config layer before
    the auth and secrets implementation is wired in.
    
    ## What Changed
    
    - Added the stable `secret_auth_storage` feature, enabled by default on
    Windows and disabled by default elsewhere.
    - Added `AuthKeyringBackendKind` and config resolution for full and
    bootstrap config loading.
    - Applied managed feature requirements when resolving the bootstrap auth
    backend.
    - Updated the generated config schema and added focused tests.
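    
    As a rough illustration of the platform default and backend selection: the variant names and the `feature_override` parameter below are hypothetical, and managed feature requirements are not modeled.
    
    ```rust
    /// Default for `secret_auth_storage`: on for Windows, off elsewhere.
    fn secret_auth_storage_default() -> bool {
        cfg!(target_os = "windows")
    }
    
    /// Variant names are assumptions made for this sketch.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum AuthKeyringBackendKind {
        Direct,
        EncryptedLocalSecrets,
    }
    
    /// Pick the keyring backend from an explicit feature setting, falling back
    /// to the platform default when the user has not set one.
    fn resolve_backend_kind(feature_override: Option<bool>) -> AuthKeyringBackendKind {
        if feature_override.unwrap_or_else(secret_auth_storage_default) {
            AuthKeyringBackendKind::EncryptedLocalSecrets
        } else {
            AuthKeyringBackendKind::Direct
        }
    }
    ```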
    
    This is the base PR for #17931. The auth, secrets, MCP, CLI, TUI, and
    app-server implementation remains in that follow-up PR.
    
    ## Validation
    
    - `just test -p codex-features`
    - `just test -p codex-config`
    - `just test -p codex-core
    resolve_bootstrap_auth_keyring_backend_kind_uses_secret_auth_storage_feature`
    - `just write-config-schema`
    - `just fix -p codex-core`
    
    The full `just test -p codex-core` run compiled successfully and ran
    2,690 tests; 2,589 passed, one was flaky, and 101 environment-sensitive
    tests failed because this shell injects a `pyenv` rehash warning into
    command output or because sandboxed subprocesses timed out.
  • [codex] Add size to internal filesystem metadata (#27927)
    ## Why
    
    `ExecutorFileSystem::get_metadata` reports file kind and timestamps but
    not size. Internal callers that need to enforce a size limit therefore
    have to read the complete file first, which is especially wasteful for
    remote filesystems.
    
    This adds the missing internal metadata so consumers can reject
    oversized files before transferring or buffering them. The field is
    named `size`, matching VS Code's `FileStat.size` filesystem convention.
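    
    The pattern this enables, sketched against `std::fs` rather than `ExecutorFileSystem` (whose exact signature is not shown here). The function name and the `max_bytes` parameter are illustrative.
    
    ```rust
    use std::fs;
    use std::io;
    use std::path::Path;
    
    /// Read a file only if its reported size is within `max_bytes`.
    /// Returns `Ok(None)` for oversized files without reading their contents.
    fn read_if_within_limit(path: &Path, max_bytes: u64) -> io::Result<Option<Vec<u8>>> {
        // Check the size from metadata first; no file contents are transferred yet.
        let size = fs::metadata(path)?.len();
        if size > max_bytes {
            return Ok(None);
        }
        fs::read(path).map(Some)
    }
    ```
    
    On a remote filesystem the metadata call is a small round trip, while the read it avoids could be arbitrarily large.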
    
    ## What changed
    
    - add `size: u64` to internal `FileMetadata`
    - populate it from the underlying filesystem metadata
    - carry it through sandbox-helper and remote exec-server responses
    - cover files, directories, symlink targets, and sandboxed reads across
    local and remote filesystem implementations
    
    The new field is intentionally not exposed through the app-server API.
    
    ## Testing
    
    - `just test -p codex-exec-server get_metadata`
    - `just test -p codex-exec-server
    file_system_sandboxed_metadata_and_read_allow_readable_root`
    - `just test -p codex-core-plugins`
    - `just test -p codex-skills-extension`
  • Handle standalone image generation failures as terminal items (#27920)
    ## Why
    
    Standalone image generation emitted a started item but no terminal item
    when the backend failed. Clients could leave the operation unresolved or
    render it as successful.
    
    ## What changed
    
    - Emit a terminal image-generation item with `status: "failed"` when
    generation or editing fails.
    - Skip image persistence for failed terminal items.
    - Render failed image generation distinctly in TUI history.
    - Preserve the status when handling live and replayed terminal items.
    
    ## TUI appearance (app-side change still needed)
    
    <img width="867" height="89" alt="image"
    src="https://github.com/user-attachments/assets/9e32342f-a982-411e-8498-456639fc468a"
    />
    
    ## Validation
    
    - `just test -p codex-image-generation-extension`
    - App-server image-generation tests
    - Core stream-event tests
    - TUI image-generation lifecycle and snapshot tests
    - Scoped Clippy and formatting
  • [codex] unify apply patch parsing (#27913)
    ## Why
    
    `apply_patch` maintained separate batch and streaming parsers for the
    same patch grammar. That duplicated the parsing rules and allowed final
    execution to disagree with the live streamed preview.
    
    ## What changed
    
    - Make `StreamingPatchParser` the single owner of hunk and environment
    ID parsing.
    - Keep heredoc and outer patch-boundary normalization in the existing
    `parse_patch` wrapper, preserving its public API.
    - Reject non-whitespace content after `*** End Patch` and preserve
    separator handling after `*** End of File`.
    - Reject duplicate environment ID preambles explicitly.
    - Remove the duplicate batch hunk parser and its implementation-specific
    tests.
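    
    The trailing-content rule can be sketched in isolation as below. This is a line-based approximation with a hypothetical function name, not the `StreamingPatchParser` implementation.
    
    ```rust
    const END_PATCH: &str = "*** End Patch";
    
    /// Accept a patch only if nothing but whitespace follows the end marker.
    fn check_after_end_marker(patch: &str) -> Result<(), String> {
        let mut lines = patch.lines();
        // Advance past the end-marker line; fail if it never appears.
        if !lines.by_ref().any(|line| line.trim_end() == END_PATCH) {
            return Err("missing end marker".to_string());
        }
        // Every remaining line must be blank or whitespace-only.
        match lines.find(|line| !line.trim().is_empty()) {
            Some(extra) => Err(format!("unexpected content after end marker: {extra:?}")),
            None => Ok(()),
        }
    }
    ```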
    
    The change removes 201 net lines while retaining focused coverage for
    the unified parser's boundary behavior.
    
    ## Validation
    
    - `just test -p codex-apply-patch`
    - Compared a 24-hour corpus of 2,788,059 observed `apply_patch` payloads
    against the previous batch parser. All 2,779,502 accepted payloads
    produced identical hunks, canonical patch text, and environment IDs; the
    remaining 8,557 payloads were rejected by both parsers, with zero
    acceptance or payload mismatches.
  • [codex] expose remote plugin share URL (#27890)
    ## Summary
    
    - expose the remote plugin detail endpoint's `share_url` as nullable
    `PluginDetail.shareUrl`
    - preserve existing `PluginSummary.shareContext` behavior for local and
    workspace sharing flows
    - regenerate the app-server TypeScript and JSON schema fixtures
    
    ## Why
    
    The remote plugin detail response already includes a canonical
    `share_url`, but that value was not surfaced by `plugin/read` for global
    plugins. Global plugins intentionally have no `shareContext`, so using
    that model for the URL would change the semantics consumed by the
    existing share modal.
    
    ## User impact
    
    Codex clients can use `PluginDetail.shareUrl` for a remote plugin's
    copy-link action, including when the plugin is disabled by an
    administrator, without changing existing share-modal or ownership
    behavior.
    
    ## Validation
    
    - `cargo test -p codex-app-server
    plugin_read_includes_share_url_for_admin_disabled_remote_plugin`
    - `cargo test -p codex-app-server-protocol
    typescript_schema_fixtures_match_generated`
    - `cargo test -p codex-app-server-protocol
    json_schema_fixtures_match_generated`
    - `cargo fmt --all`
  • sandboxing: migrate cwd inputs to PathUri (#27816)
    ## Why
    
    Sandbox cwd values can cross app-server and exec-server host boundaries.
    They should retain URI semantics until the receiving host validates them
    instead of being interpreted early as native paths.
    
    ## What
    
    - Carry `PathUri` through filesystem sandbox contexts, sandbox commands,
    and transform inputs.
    - Convert command and policy cwd once in `SandboxManager::transform`,
    then keep launch requests native.
    - Preserve sandbox cwd over remote filesystem transport and reject
    non-native URIs without fallback.
    - Cache paired native/URI turn-environment cwd values during migration,
    with immutable access to keep them synchronized.
    - Extend existing protocol, forwarding, transform, and core runtime
    tests.
  • chore: prompt MAv2 (#27919)
    Updates the MAv2 (multi-agent v2) prompt.
  • realtime: add AVAS architecture override (#27720)
    ## Summary
    
    Adds a `RealtimeConversationArchitecture` option for realtime
    conversation startup, with `realtimeapi` as the default and `avas` as an
    opt-in architecture.
    
    The AVAS path is limited to realtime v1 conversational WebRTC starts,
    and WebRTC call creation appends `intent=quicksilver&architecture=avas`
    to `/v1/realtime/calls`. The existing sideband websocket still joins by
    `call_id`.
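    
    A small sketch of the call-creation path selection. The variant names and `realtime_calls_path` are hypothetical, and the real client may assemble the query differently.
    
    ```rust
    /// Variant names are assumptions; the config values are `realtimeapi` and `avas`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
    enum RealtimeConversationArchitecture {
        #[default]
        RealtimeApi,
        Avas,
    }
    
    /// Build the WebRTC call-creation path for the chosen architecture.
    fn realtime_calls_path(architecture: RealtimeConversationArchitecture) -> String {
        let base = "/v1/realtime/calls";
        match architecture {
            RealtimeConversationArchitecture::RealtimeApi => base.to_string(),
            RealtimeConversationArchitecture::Avas => {
                format!("{base}?intent=quicksilver&architecture=avas")
            }
        }
    }
    ```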
    
    This also exposes the per-session architecture override through
    app-server v2 `thread/realtime/start` params and updates the config
    schema for `[realtime].architecture`.
    
    ## Validation
    
    - `just fmt`
    - `just write-config-schema`
    - `just test -p codex-api sends_avas_session_call_query_params`
    - `just test -p codex-core -E
    'test(~conversation_webrtc_start_uses_avas_architecture_query)'`
    - `just test -p codex-core -E 'test(realtime_loads_from_config_toml)'`
    - `just test -p codex-app-server-protocol -E
    'test(~serialize_thread_realtime_start) |
    test(generated_ts_optional_nullable_fields_only_in_params)'`
    - `just test -p codex-app-server -E
    'test(realtime_webrtc_start_emits_sdp_notification)'`
  • Use uv as Python SDK build backend (#27901)
    ## Summary
    
    Replace Hatchling with uv's build backend for the Python SDK. The
    backend infers the `src/openai_codex` module from the normalized project
    name and standard source layout, so no uv-specific package configuration
    is required.
    
    This keeps Python packaging within the uv toolchain already used for
    dependency management and release builds. A controlled before-and-after
    PEP 517 comparison produced identical wheel package paths, bytes,
    permissions, and semantic metadata. The sdist retains the SDK package
    tree, root README, and project metadata while dropping the unrelated
    examples README that Hatch included through its broad include matching.
  • tui: Allow extra o's in /goal command (#27814)
    ## Why
    
    The TUI rejected playful `/goal` spellings such as `/goooooooooooal`,
    even though Codex Apps accepts them for the World Cup promotion. This
    keeps the TUI behavior consistent without changing how the canonical
    command is presented.
    
    ## How it works
    
    Built-in command lookup recognizes lowercase `go+al` as the existing
    `goal` command after normal exact-name parsing fails. The command
    catalog remains unchanged, so autocomplete continues to advertise
    `/goal` normally.
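    
    A minimal sketch of the fallback lookup, assuming a tiny hypothetical catalog. The function names are illustrative and the real lookup lives in the TUI's built-in command code.
    
    ```rust
    /// Matches lowercase `go+al`: a `g`, one or more `o`, then `al`.
    fn is_flexible_goal(name: &str) -> bool {
        let Some(rest) = name.strip_prefix('g') else {
            return false;
        };
        let Some(os) = rest.strip_suffix("al") else {
            return false;
        };
        !os.is_empty() && os.bytes().all(|b| b == b'o')
    }
    
    /// Exact-name lookup first; the flexible spelling is only a fallback.
    fn lookup_builtin(name: &str) -> Option<&'static str> {
        // Hypothetical catalog; it is not modified by the flexible match.
        const BUILTINS: &[&str] = &["goal", "help"];
        BUILTINS
            .iter()
            .copied()
            .find(|command| *command == name)
            .or_else(|| is_flexible_goal(name).then_some("goal"))
    }
    ```
    
    Because the catalog itself is untouched, anything that enumerates it (such as autocomplete) still sees only `goal`.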
    
    ## Verification
    
    Added lookup-level and end-to-end TUI coverage for the flexible
    spelling. The focused tests, scoped Clippy checks, and formatting pass.
    The full `codex-tui` suite passed 2,833 of 2,835 tests; the two failing
    guardian feature-flag tests reproduce unchanged on fresh `origin/main`.
  • Persist update dismissal without cache (#27783)
    ## Summary
    
    Choosing “Don’t remind me” can silently fail when `version.json`
    disappears before dismissal because `dismiss_version` returns success
    without writing anything. The same update can then reappear on the next
    launch.
    
    Initialize a minimal `VersionInfo` from the selected version when the
    cache cannot be read, then persist the dismissal through the existing
    write path.
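    
    The shape of the fix, sketched with in-memory values. The `VersionInfo` fields shown here are hypothetical, and the real code reads and writes `version.json`.
    
    ```rust
    /// Hypothetical subset of the cached version record.
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct VersionInfo {
        latest_version: String,
        dismissed_version: Option<String>,
    }
    
    /// Record a dismissal even when the cache could not be read (`cached` is `None`).
    /// The caller persists the returned value through the normal write path.
    fn dismiss_version(cached: Option<VersionInfo>, selected: &str) -> VersionInfo {
        let mut info = cached.unwrap_or_else(|| VersionInfo {
            // Minimal record built from the version the user dismissed.
            latest_version: selected.to_string(),
            dismissed_version: None,
        });
        info.dismissed_version = Some(selected.to_string());
        info
    }
    ```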
    
    Fixes #27147
  • Use dependency groups for Python SDK tooling (#27538)
    ## Summary
    
    `just fmt` previously used `uv run --with ruff` to make Ruff available.
    Because `--with` creates an ephemeral overlay outside the project
    lockfile, uv periodically re-resolved Ruff (by default every 10 minutes)
    instead of using the version recorded in `uv.lock`.
    
    Move the Python SDK tooling dependencies from the published `dev` extra
    into `format`, `test`, and composed `dev` dependency groups. The
    formatter now selects only the locked `format` group, contributor and CI
    setup explicitly sync the `dev` group, and CI and release commands reuse
    that environment with `--frozen --no-sync`. The scripts formatter also
    uses its project's locked Ruff dependency instead of an ephemeral
    overlay.
    
    Validated the Python 3.12 SDK suite (119 passed, 38 skipped) and the
    repository formatter.
  • [ez][codex-rs] Support approvals reviewer in app defaults (#27075)
    [from codex]
    
    ## Summary
    
    - add `approvals_reviewer` support to `[apps._default]`
    - resolve connected-app reviewers in per-app, app-default, then global
    order
    - expose the setting through the v2 config API and regenerate schema
    fixtures
    
    ## Context
    
    PR #25167 added `apps.<connector_id>.approvals_reviewer`, but the shared
    app defaults table could not specify the reviewer. This extends the same
    behavior to `[apps._default]` while preserving per-app overrides.
    
    Managed `allowed_approvals_reviewers` requirements still constrain both
    default and per-app values. A disallowed app value falls back to the
    global reviewer, and non-app MCP servers continue using the global
    reviewer.
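    
    One way to read the resolution order, as a sketch. The `Reviewer` variants and the function signature are hypothetical, and this version treats a disallowed per-app or app-default value the same way, by falling back to the global reviewer.
    
    ```rust
    /// Hypothetical reviewer kinds, for illustration only.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum Reviewer {
        User,
        AutoReview,
    }
    
    /// Resolve the reviewer for a connected app: per-app, then app-default,
    /// then global. `allowed` models managed `allowed_approvals_reviewers`;
    /// `None` means no restriction.
    fn resolve_app_reviewer(
        per_app: Option<Reviewer>,
        app_default: Option<Reviewer>,
        global: Reviewer,
        allowed: Option<&[Reviewer]>,
    ) -> Reviewer {
        match per_app.or(app_default) {
            Some(candidate) if allowed.map_or(true, |list| list.contains(&candidate)) => candidate,
            // No app-level value, or the value is disallowed: use the global reviewer.
            _ => global,
        }
    }
    ```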
    
    ## Testing
    
    - `just write-config-schema`
    - `just write-app-server-schema`
    - `just fmt`
    - `just test -p codex-config`
    - `just test -p codex-core app_approvals_reviewer`
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-app-server config_read_includes_apps`
  • Reject transcript backtrack in side conversations (#27791)
    ## Why
    
    Fixes #27735.
    
    Side conversations are ephemeral forks, and thread rollback currently
    requires persisted thread history. The normal backtrack path already
    rejected editing previous prompts in side conversations, but
    transcript-mode backtrack could still call the rollback path and surface
    the core `thread/rollback` failure as a TUI error.
    
    ## What changed
    
    - Moved the existing side-conversation edit rejection message into
    `app_backtrack.rs` so backtrack rollback code can reuse it.
    - Added a side-conversation guard in `apply_backtrack_rollback` so
    transcript-mode confirmation is rejected before submitting
    `thread/rollback`.
    
    ## Verification
    
    - `just test -p codex-tui
    app::tests::side_backtrack_rejection_reports_unavailable_message_snapshot`
  • fix: serialize auth environment tests (#27879)
    ## Summary
    - serialize the remaining login tests that mutate or read the
    process-global auth environment
    - include Bedrock auth-manager tests in the existing `codex_auth_env`
    serial group
    
    ## Root cause
    The login unit-test binary runs tests concurrently. One test removed
    `CODEX_ACCESS_TOKEN` without joining the existing serial group, while
    several Bedrock tests constructed `AuthManager` and read that same
    process-global environment outside the group. An interleaving could
    restore a stale personal access token while another test was loading
    file-backed auth, causing the observed mismatched `/whoami` request
    count and related auth-state flakes.
    
    ## Verification
    - `just fmt`
    - `git diff --check`
    - `bazel test //codex-rs/login:login-unit-tests --nocache_test_results
    --runs_per_test=20 --test_output=errors` (20/20 passed)
    - `just test -p codex-login` (135/135 passed)
  • [codex] restore source-specific import copy (#27703)
    ## Summary
    
    - restore source-specific wording across the `/import` picker, lifecycle
    messages, diagnostics, and help text
    - update the matching Unix and Windows snapshots
    - leave import behavior unchanged
    
    ## Why
    
    The import path currently supports one source, so the UI should identify
    that source directly instead of presenting the flow as
    provider-agnostic.
    
    ## Validation
    
    - `just test -p codex-tui external_agent_config_migration` (12 passed)
    - `just fix -p codex-tui`
    - `just fmt`
  • Extract shared plugin MCP config parsing (#27863)
    ## Why
    
    We want a thread-selected plugin to eventually expose stdio MCP servers
    that run on the executor owning that plugin.
    
    The existing plugin MCP parser lived inside `core-plugins` and was
    coupled to the host filesystem loader. Reusing it from an executor
    provider would either duplicate MCP normalization or make the plugin
    package layer own MCP runtime semantics. This PR creates the shared
    MCP-owned boundary first.
    
    In simple terms:
    
    ```text
    plugin .mcp.json
            |
            v
    shared parser in codex-mcp
            |
            +-- Declared placement: preserve current local-plugin behavior
            |
            +-- Environment placement: produce config bound to one executor
    ```
    
    This builds on the authority-bound plugin descriptors from #27692. It
    intentionally does not discover, register, or launch executor MCP
    servers yet.
    
    ## What changed
    
    - Moved plugin MCP file parsing and normalization from `core-plugins`
    into `codex-mcp`.
    - Kept support for both existing file shapes: a top-level server map and
    an object containing `mcpServers`.
    - Kept per-server failure isolation: one invalid server does not discard
    valid siblings, while malformed top-level JSON still fails the whole
    file.
    - Updated the existing local plugin loader to use `Declared` placement,
    preserving its current transport, OAuth, relative `cwd`, and error
    behavior.
    - Added `Environment` placement for the next stacked PR:
      - the selected environment ID overrides anything declared by the
        plugin;
      - missing stdio `cwd` defaults to the plugin root;
      - relative `cwd` is resolved beneath the plugin root and cannot
        traverse outside it;
      - bare or source-less environment-variable references resolve on a
        non-local executor;
      - explicit orchestrator environment-variable forwarding is rejected
        for executor-owned plugins.
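    
    The `cwd` rules for `Environment` placement can be sketched as follows.
    This is an illustration only: `resolve_stdio_cwd` and its `String` error
    are hypothetical, the normalization is purely lexical (symlinks are not
    considered), and it assumes executor paths can be modeled with
    `std::path`, which may not hold for the real parser.
    
    ```rust
    use std::path::{Component, Path, PathBuf};
    
    /// Resolve a stdio server's working directory (hypothetical helper).
    pub fn resolve_stdio_cwd(
        plugin_root: &Path,
        declared: Option<&Path>,
    ) -> Result<PathBuf, String> {
        let Some(cwd) = declared else {
            // A missing `cwd` defaults to the plugin root.
            return Ok(plugin_root.to_path_buf());
        };
        if cwd.is_absolute() {
            // Explicit absolute directories stay valid in the owning environment.
            return Ok(cwd.to_path_buf());
        }
        // Lexically normalize the relative path, tracking depth below the root.
        let mut resolved = plugin_root.to_path_buf();
        let mut depth = 0usize;
        for component in cwd.components() {
            match component {
                Component::CurDir => {}
                Component::Normal(part) => {
                    resolved.push(part);
                    depth += 1;
                }
                Component::ParentDir => {
                    if depth == 0 {
                        return Err(format!(
                            "cwd `{}` escapes the plugin root",
                            cwd.display()
                        ));
                    }
                    resolved.pop();
                    depth -= 1;
                }
                // Anything rooted or prefixed is not a plain relative path.
                Component::RootDir | Component::Prefix(_) => {
                    return Err(format!(
                        "cwd `{}` is not a plain relative path",
                        cwd.display()
                    ));
                }
            }
        }
        Ok(resolved)
    }
    ```
    
    With a plugin root of `/plugins/demo`, a declared `a/../b` resolves to
    `/plugins/demo/b`, while `../other` is rejected.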
    
    ## User impact
    
    None in this PR. Existing local plugin MCP loading follows the same
    behavior through the shared parser. The executor placement mode is not
    connected to thread startup until the follow-up registration PR.
    
    ## Assumptions
    
    - A selected capability root's environment is authoritative. A plugin
    cannot redirect its stdio process to the orchestrator or another
    executor.
    - Relative working directories belong under the plugin package root.
    Explicit absolute working directories remain valid within the owning
    environment.
    - For a non-local executor, unqualified environment-variable names refer
    to that executor. Reading an orchestrator variable requires an explicit
    contract and is rejected for now.
    - Parsing only produces normalized `McpServerConfig` values. Process
    startup remains owned by the existing MCP runtime and connection
    manager.
    
    ## Follow-ups
    
    1. Add the executor MCP provider and catalog registration: read the
    selected plugin's MCP config through the same executor filesystem,
    support stdio only, freeze the result per active thread, apply managed
    policy, and resolve name collisions as discovered plugin < selected
    plugin < explicit config.
    2. Install that provider in app-server and add an end-to-end test
    proving `thread/start.selectedCapabilityRoots` launches and calls the
    MCP tool on the selected executor, preserves the frozen registration
    across refresh, and does not expose it to an unselected thread.
    3. After the initial executor-stdio vertical, define
    resume/fork/environment-replacement semantics, executor HTTP placement,
    warning delivery, common MCP tool-context bounds, and move remaining MCP
    source composition above core.
    
    ## Verification
    
    - `cargo check -p codex-mcp -p codex-core-plugins --tests`
    - `just bazel-lock-check`
    - Added focused parser coverage for legacy local normalization, executor
    authority, working-directory handling, and environment-variable
    sourcing.
  • Add executor-owned plugin resolution (#27692)
    ## Why
    
    CCA can select a capability root that lives in an executor environment,
    but Codex only had a host-filesystem plugin loader. Before selected
    executor plugins can contribute MCP servers, we need a small package
    boundary that can answer:
    
    > Does this selected root contain a plugin, and if so, what does its
    > manifest declare?
    
    The answer must come from the selected environment's filesystem. A
    failed executor lookup must never fall back to the orchestrator
    filesystem.
    
    ## What this changes
    
    This PR introduces:
    
    ```rust
    PluginProvider::resolve(root)
        -> Result<Option<ResolvedPlugin>, Error>
    ```
    
    `ExecutorPluginProvider` resolves one `SelectedCapabilityRoot` through
    its exact `environment_id`. It checks the recognized manifest locations,
    reads the manifest through that environment's `ExecutorFileSystem`, and
    returns an inert `ResolvedPlugin` containing:
    
    - the opaque selected-root ID;
    - the environment-bound plugin root;
    - the authority-bound manifest resource;
    - parsed metadata and authority-bound component locators.
    
    Descriptor construction rejects manifest or component paths outside the
    selected package root, so consumers cannot accidentally lose the package
    boundary when they receive a resolved plugin.
    
    If the root has no plugin manifest, resolution returns `None`, allowing
    the caller to treat it as a standalone capability such as a skill.
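    
    A caller could consume the three outcomes roughly as sketched here. The
    `ResolvedPlugin` and `ResolveError` definitions are simplified stand-ins
    with hypothetical fields, and `RootKind` and `classify` are names
    invented for this example.
    
    ```rust
    /// Simplified stand-in for the PR's `ResolvedPlugin`; the field is illustrative.
    #[derive(Debug, PartialEq)]
    pub struct ResolvedPlugin {
        pub root_id: String,
    }
    
    /// Simplified stand-in for the provider's error type.
    #[derive(Debug, PartialEq)]
    pub struct ResolveError(pub String);
    
    /// Hypothetical classification of a selected root.
    #[derive(Debug, PartialEq)]
    pub enum RootKind {
        Plugin(ResolvedPlugin),
        Standalone,
    }
    
    /// Classify a selected root from a `resolve`-shaped result.
    pub fn classify(
        resolved: Result<Option<ResolvedPlugin>, ResolveError>,
    ) -> Result<RootKind, ResolveError> {
        match resolved {
            Ok(Some(plugin)) => Ok(RootKind::Plugin(plugin)),
            // No manifest: treat the root as a standalone capability.
            Ok(None) => Ok(RootKind::Standalone),
            // A failed lookup is propagated; it is never retried against
            // the orchestrator filesystem.
            Err(err) => Err(err),
        }
    }
    ```
    
    Keeping "no plugin" (`Ok(None)`) distinct from "lookup failed" (`Err`)
    is what lets a caller avoid silently falling back to the host.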
    
    ```text
    selected root: repo -> env-1:/workspace/repo
                             |
                             | env-1 filesystem only
                             v
                 .codex-plugin/plugin.json
                             |
                             v
            ResolvedPlugin { authority, root, manifest }
    ```
    
    The existing host loader and the new executor provider now share the
    same manifest parser. Existing `codex-core-plugins::manifest` type paths
    remain available through re-exports, so host behavior and callers are
    unchanged.
    
    ## Scope
    
    This is intentionally a non-user-visible package-resolution PR. It does
    not:
    
    - parse or register plugin MCP server configurations;
    - activate skills, connectors, hooks, or MCP servers;
    - change app-server wiring;
    - introduce host fallback, caching, or lifecycle behavior.
    
    #27670 has merged, and this PR is now based directly on `main`. Together
    with the resolved MCP catalog from #27634, it establishes the inputs
    needed for the executor stdio MCP vertical without changing the existing
    MCP runtime.
    
    ## Follow-up
    
    The next PR will consume `ResolvedPlugin`, read its declared/default MCP
    config through the same executor filesystem, bind supported stdio
    servers to that environment, and feed those registrations into the
    resolved MCP catalog. An app-server E2E will prove that selecting an
    executor plugin exposes and invokes its tool on the owning executor.
    
    Resume/fork semantics, dynamic environment replacement, and non-stdio
    placement remain separate lifecycle decisions.
    
    ## Validation
    
    - `just fmt`
    - `cargo check --tests -p codex-plugin -p codex-core-plugins`
    - `just bazel-lock-check`
    - `git diff --check`
    
    Test targets were compiled but not executed locally; CI will run the
    test and Clippy suites.
  • [code-mode] Reject remote image URLs from output helpers (#27732)
    ## Summary
    
    - reject HTTP(S) image URLs from the shared code-mode output-image
    normalization path
    - return a concise model-visible tool error so the model can recover on
    its next turn
    - apply the targeted rejection to both `image()` and `generatedImage()`
    - leave other non-empty image URL values to existing downstream handling
    
    The returned error is:
    
    > Tool call failed: remote image URLs are not supported in tool outputs.
    Pass a base64 data URI instead
    
    ## Why
    
    Responses Lite cannot lower a remote image URL emitted from a structured
    tool output. Rejecting HTTP(S) values in the Codex harness preserves the
    tool-call metadata and gives the model a recoverable next turn instead
    of invalidating the sample.
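    
    The rejection can be pictured with a small check like this one. It is a
    sketch, not the harness code: the function name is hypothetical, and
    matching the scheme case-insensitively is an assumption. Only the error
    text is taken from this description.
    
    ```rust
    /// Error text quoted from this PR description.
    pub const REMOTE_IMAGE_ERROR: &str = "Tool call failed: remote image URLs are not supported in tool outputs. Pass a base64 data URI instead";
    
    /// Hypothetical normalization step shared by `image()` and `generatedImage()`.
    ///
    /// Rejects HTTP(S) URLs and passes every other value through unchanged so
    /// existing downstream handling still applies.
    pub fn reject_remote_image_url(image_url: &str) -> Result<&str, &'static str> {
        // Assumption: the scheme comparison ignores ASCII case.
        let lower = image_url.to_ascii_lowercase();
        if lower.starts_with("http://") || lower.starts_with("https://") {
            return Err(REMOTE_IMAGE_ERROR);
        }
        Ok(image_url)
    }
    ```
    
    Returning an error value, rather than panicking or dropping the output,
    is what keeps the failure visible to the model on its next turn.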
    
    ## Test coverage
    
    The regression is covered primarily by a `test_codex()` agent
    integration test that simulates the Responses API exchange and asserts
    the failed model-visible exec output. A supplemental runtime test covers
    both `http://` and `https://` inputs across both image output helpers.
    
    ## Test plan
    
    - `cd codex-rs && just test -p codex-code-mode`
    - `cd codex-rs && just test -p codex-code-mode-protocol`
    - `cd codex-rs && just test -p codex-core
    code_mode_image_helper_rejects_remote_url`
    - `cd codex-rs && just fmt`
    - `git diff --check origin/main...HEAD`
    
    Related context: https://github.com/openai/openai/pull/1022346
  • Make MCP server contributions thread-scoped (#27670)
    ## Why
    
    `selectedCapabilityRoots` belongs to one thread, but MCP contributors
    previously received only the global Codex config. That left no clean way
    for a selected executor capability to contribute MCP servers to its own
    thread.
    
    ## What this PR does
    
    - Gives MCP contributors a small context containing the config and, for
    a running thread, its frozen host-seeded inputs.
    - Uses the same thread inputs during startup, status queries, refreshes,
    and skill dependency checks.
    - Keeps threadless MCP operations and the existing hosted Apps behavior
    unchanged.
    - Adds coverage showing that two threads resolve independent
    registrations and that later lifecycle mutations do not change the
    frozen MCP inputs.
    
    This PR does not discover plugin manifests, add MCP servers, or launch
    anything new. It only establishes the thread-scoped registration
    boundary.
    
    ## Follow-ups
    
    - Resolve selected executor plugin roots through their owning
    environment filesystem.
    - Convert their stdio MCP declarations into environment-bound
    registrations and add an executor MCP end-to-end test.
    
    ## Verification
    
    - `just fmt`
    - `cargo check --tests -p codex-protocol -p codex-extension-api -p
    codex-mcp-extension -p codex-core -p codex-app-server`
    
    Tests and Clippy were not run.
  • [codex] Load AGENTS.md from all bound environments (#27696)
    ## Why
    
    We already have the machinery to support multiple environments on a
    single thread, but we only show the model the contents of `AGENTS.md`
    files in the primary environment.
    
    We should show the model all of the relevant project instructions when
    we know there's more than one environment.
    
    ## Known Gaps
    
    As discussed in the RFC, this implementation:
    
    1. doesn't handle environments being added to or removed from the thread
    after its creation
    2. doesn't enforce an aggregate context budget across environments;
    instead, it applies the configured project maximum independently to each
    environment
    
    ## Implementation
    
    - Discover project instructions in environment order with an independent
    byte budget per environment and preserve source provenance/order.
    - Keep the legacy fragment byte-for-byte when exactly one environment
    contributes project instructions; use environment-labeled sections when
    two or more environments contribute.
    - Freeze the complete rendered fragment in `LoadedAgentsMd`, insert it
    directly into requests, and recognize both layouts in contextual and
    memory filtering.
    - Add exact rendering, independent-budget, source-order,
    creation-snapshot, and consumer coverage without changing app-server
    schemas.
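    
    A rough sketch of the per-environment budget and the two layouts. The
    types, the truncation strategy, and especially the
    `## Environment: <id>` label format are hypothetical; the real rendered
    fragment is not reproduced in this description.
    
    ```rust
    /// Hypothetical input: one environment's discovered project instructions.
    pub struct EnvInstructions<'a> {
        pub environment_id: &'a str,
        pub text: &'a str,
    }
    
    /// Truncate to at most `max_bytes` without splitting a UTF-8 character.
    fn truncate_to_budget(text: &str, max_bytes: usize) -> &str {
        if text.len() <= max_bytes {
            return text;
        }
        let mut end = max_bytes;
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        &text[..end]
    }
    
    /// Render instructions in environment order.
    pub fn render_project_instructions(
        envs: &[EnvInstructions<'_>],
        max_bytes: usize,
    ) -> Option<String> {
        // The budget applies to each environment independently.
        let contributing: Vec<(&str, &str)> = envs
            .iter()
            .map(|e| (e.environment_id, truncate_to_budget(e.text, max_bytes)))
            .filter(|(_, text)| !text.is_empty())
            .collect();
        match contributing.as_slice() {
            [] => None,
            // Exactly one contributor: keep the unlabeled legacy layout.
            [(_, text)] => Some((*text).to_string()),
            // Two or more: label each section (label format is made up here).
            many => Some(
                many.iter()
                    .map(|(id, text)| format!("## Environment: {id}\n{text}"))
                    .collect::<Vec<_>>()
                    .join("\n\n"),
            ),
        }
    }
    ```
    
    Because each environment gets its own budget, the combined fragment can
    exceed the configured project maximum, which is the second known gap
    listed above.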
  • Keep request_user_input direct-model only (#27316)
    ## Why
    
    `request_user_input` has direct blocking semantics when invoked by the
    model. When it is exposed as a nested code-mode tool, the call has to
    flow through code-mode waiting and continuation behavior instead, which
    is not the behavior we want for this user-input request surface.
    
    ## What changed
    
    - Mark `request_user_input` with `ToolExposure::DirectModelOnly` when
    registering the core utility tool.
    - Keep `request_user_input` direct-model visible, including in
    code-mode-only planning.
    - Add focused `spec_plan_tests` coverage that verifies
    `request_user_input` remains visible and registered as
    direct-model-only, while it is omitted from the nested code-mode tool
    description.
    
    No active goal suppression or runtime unavailability behavior is
    included in this PR.
    
    ## Validation
    
    - No new build/test run for this housekeeping pass, per maintainer
    request.
    - Earlier targeted run, confirmed from session context: `just test -p
    codex-core request_user_input` passed.
  • Translate non-English issues (#27778)
    Issues written in languages other than English, such as #26979, require
    manual translation before the development team can triage them.
    
    This adds an `Issue Translator` workflow that uses Codex when an issue
    is opened. For non-English reports, it replaces the title with an
    English translation, preserves the original body, and posts the
    translated body as an idempotent issue comment.
    
    The translation scripts were run manually against non-English issue
    content and produced the expected English title and comment output.
  • code-mode standalone: extract protocol and add host crate (#27724)
    This is phase 1 of a 4-phase stack:
    1. **Add protocol and host crates for new IPC code mode implementation**
    2. Create the new standalone binary
    3. Create a new IPC `CodeModeSessionProvider` to use new binary
    4. Remove v8 from core and only use IPC provider
    
    
    ## Add protocol and host crates for new IPC code mode implementation
    Establish a clean process boundary without changing the existing
    in-process behavior.
    
    - Add the codex-code-mode-protocol crate for shared session, runtime,
    response, and tool-definition types.
    - Move protocol-facing code out of the V8-backed implementation.
    - Add a buildable codex-code-mode-host crate as the foundation for the
    standalone process.
    - Keep the existing in-process runtime as the active implementation.
  • Add request_user_input auto-resolution window contract (#27256)
    ## Why
    
    `request_user_input` is moving beyond its original plan-mode-only
    workflow, and future default/goal-mode usage needs a way for the model
    to ask helpful but non-blocking questions without forcing the turn to
    wait forever. This PR adds an explicit `autoResolutionMs` contract so a
    later client/runtime change can auto-resolve unanswered prompts after a
    bounded window while leaving truly blocking questions unchanged.
    
    This is contract plumbing only; it does not implement the client-side
    timer or auto-selection behavior, and the model-facing description
    treats the field as reserved unless the current runtime explicitly
    supports auto-resolution.
    
    ## What Changed
    
    - Added optional `autoResolutionMs` to the model-facing
    `request_user_input` args and core `RequestUserInputEvent`.
    - Added model-facing schema text for `autoResolutionMs` while marking it
    reserved for runtimes that explicitly support auto-resolution.
    - Bounded `autoResolutionMs` to `60_000..=240_000` ms during argument
    normalization by clamping out-of-range model-provided values.
    - Propagated the field through app-server v2
    `ToolRequestUserInputParams`, app-server request forwarding, generated
    TypeScript, and JSON schema fixtures.
    - Updated app-server, core, protocol, and TUI call sites/tests so
    omitted values preserve existing `None`/`null` behavior and coverage
    verifies a `Some(60_000)` round trip.
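    
    The clamping step amounts to something like the following. The constant
    and function names are hypothetical, and `u64` is an assumed
    representation; only the `60_000..=240_000` range and the
    omitted-stays-`None` behavior come from this description.
    
    ```rust
    /// Lower bound of the auto-resolution window, in milliseconds.
    pub const AUTO_RESOLUTION_MIN_MS: u64 = 60_000;
    /// Upper bound of the auto-resolution window, in milliseconds.
    pub const AUTO_RESOLUTION_MAX_MS: u64 = 240_000;
    
    /// Normalize a model-provided `autoResolutionMs` value (hypothetical helper).
    ///
    /// Omitted values stay `None`; out-of-range values are clamped rather
    /// than rejected.
    pub fn normalize_auto_resolution_ms(requested: Option<u64>) -> Option<u64> {
        requested.map(|ms| ms.clamp(AUTO_RESOLUTION_MIN_MS, AUTO_RESOLUTION_MAX_MS))
    }
    ```
    
    Clamping instead of erroring means a model that asks for, say, a
    five-minute window still gets a usable request with the maximum window.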
    
    ## Verification
    
    - `just test -p codex-app-server-protocol`
    - `just test -p codex-core request_user_input`
    - `just test -p codex-app-server request_user_input_round_trip`
    - `just test -p codex-tui request_user_input`
    - `just test -p codex-protocol`
  • [1 of 3] Support long raw TUI goal objectives (#27508)
    ## Stack
    
    1. **[1 of 3] Support long raw TUI goal objectives** - this PR
    2. [2 of 3] Support long pasted text in TUI goals - #27509
    3. [3 of 3] Support images in TUI goals - #27510
    
    ## Why
    
    `thread/goal/set` limits persisted objective text to 4000 characters.
    The TUI used to reject raw `/goal` objectives above that limit, even
    though the client can make them usable by writing the long text to a
    file and storing a short objective that points at that file.
    
    This also needs to work for remote app-server sessions: filesystem API
    calls must create files on the app-server host, and the stored path must
    be meaningful to the agent on that host.
    
    ## What Changed
    
    - Adds an app-server-host path helper so TUI code can build paths that
    are resolved on the app-server host rather than the TUI host.
    - Adds TUI app-server session helpers for `fs/createDirectory`,
    `fs/writeFile`, `fs/readFile`, and `fs/remove` that work for embedded
    and remote app-server sessions without changing the app-server protocol.
    - Materializes oversized raw `/goal` objectives into
    `$CODEX_HOME/attachments/<uuid>/goal-objective.md` through the
    app-server filesystem APIs, then stores a short, readable objective that
    directs the agent to that file.
    - Reads managed objective files back for `/goal edit`. Other goal UI
    renders the readable stored objective normally, without
    managed-file-specific presentation logic.
    - Recognizes managed references only when they name the expected
    generated file under the app server's reported `$CODEX_HOME`, and cleans
    up newly materialized files when goal replacement or setting does not
    complete.
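    
    The inline-versus-file decision reduces to a length check against the
    4000-character limit. In this sketch the names are hypothetical, and
    counting Unicode scalar values with `chars()` is an assumption about how
    "characters" are measured.
    
    ```rust
    /// Persisted objective limit enforced by `thread/goal/set`.
    pub const MAX_INLINE_OBJECTIVE_CHARS: usize = 4000;
    
    /// Where a raw `/goal` objective ends up (hypothetical enum).
    #[derive(Debug, PartialEq, Eq)]
    pub enum ObjectiveStorage {
        /// Stored directly as the goal objective.
        Inline,
        /// Written to a managed file; a short objective points at it.
        ManagedFile,
    }
    
    /// Decide how to store an objective of the given text.
    pub fn choose_objective_storage(objective: &str) -> ObjectiveStorage {
        // Assumption: the limit counts Unicode scalar values, not bytes.
        if objective.chars().count() <= MAX_INLINE_OBJECTIVE_CHARS {
            ObjectiveStorage::Inline
        } else {
            ObjectiveStorage::ManagedFile
        }
    }
    ```
    
    An objective of exactly 4,000 characters stays inline; one more
    character switches it to a managed file.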
    
    ## Verification
    
    - Added/updated TUI tests for raw oversized `/goal` submission, large
    inline-paste expansion, queued oversized goals, app-facing
    materialization before `thread/goal/set`, managed-path validation,
    editing, and cleanup.
    - Added/updated app-server-client remote coverage for initialized remote
    Codex home handling.
    
    ## Manual Testing
    
    - Ran the real TUI against a Unix-socket app server with different local
    and server `$CODEX_HOME` directories. Oversized goals wrote only under
    the server home, and persisted references used the server-canonical path
    rather than the TUI path.
    - Exercised 3,999-, 4,000-, and 4,001-character raw objectives. The
    first two stayed inline without new files; the 4,001-character objective
    became a managed objective file.
    - Submitted a larger 8,275-character objective, verified its full
    contents on the app-server host, and observed the goal continuation open
    the referenced server-side file.
    - Opened `/goal edit` for a managed objective and verified the full text
    was restored through remote `fs/readFile`.
    - Submitted an oversized replacement while a goal was active, verified
    no file was written before confirmation, then canceled and confirmed
    that the existing goal and attachment count were unchanged.
  • feat(app-server): persist remote-control desired state (#27445)
    ## Why
    
    Remote-control runtime enablement and persisted enrollment preference
    were represented by separate flags. That made startup rehydration, RPC
    persistence, and new-enrollment seeding race with one another, and it
    did not cleanly distinguish runtime-only CLI or daemon starts from
    durable app-server RPC changes.
    
    ## What Changed
    
    - Replace the parallel enablement, seed, and rehydration flags with one
    transport-owned `RemoteControlDesiredState`.
    - Add nullable enrollment-scoped persistence and preserve existing
    preferences during enrollment upserts.
    - Rehydrate plain startup only after auth and client scope resolve,
    without overwriting a concurrent RPC transition.
    - Make ordinary `remoteControl/enable` and `remoteControl/disable`
    durable while retaining `ephemeral: true` for runtime-only callers.
    - Have the daemon explicitly request ephemeral enablement and regenerate
    the app-server schemas.
    
    ## Verification
    
    - Covered migration and `NULL`/`0`/`1` persistence round trips.
    - Covered plain-start rehydration and runtime-only versus durable
    enrollment seeding.
    - Covered durable enable, durable disable, and ephemeral enable through
    app-server RPC.
    - Covered the daemon's exact `{ "ephemeral": true }` request payload.
    
    Related issue: N/A (internal remote-control persistence architecture
    change).
  • [codex] resolve environment shell metadata eagerly (#27709)
    ## Why
    
    Turn construction passed resolved environments through several layers
    while leaving the environment shell unresolved. As a result,
    model-visible environment context could fall back to the session shell
    instead of reporting the selected remote environment's shell.
    
    Resolve environment metadata at the turn-context boundary so each turn
    carries the shell that belongs to its selected environment. Keep request
    validation in app-server, where invalid selections can be returned as
    straightforward JSON-RPC errors without coupling core turn construction
    to that policy.
    
    ## What changed
    
    - resolve environment selections eagerly in
    `new_turn_context_from_configuration`
    - store the full resolved `Shell` on each `TurnEnvironment`
    - simplify the now-redundant resolved-environment constructor plumbing
    - keep duplicate and unknown-environment validation as a small
    app-server preflight
    - add a remote-environment integration test that runs a full
    `test_codex` turn and verifies the model-visible environment message
    reports `bash`
    
    ## Testing
    
    - `cargo check -p codex-core --test all -p codex-app-server`
    - `remote_test_env_exposes_bash_shell_to_model` on the Linux
    remote-executor harness
  • [codex] parallelize release code generation (#27702)
    The release profile still uses one codegen unit, which serializes LLVM
    code generation within each crate. That setting was selected alongside
    fat LTO for optimization quality and binary size, but releases now use
    ThinLTO and code generation dominates the critical-path build.
    
    Use four codegen units. On an Apple M4 Max with 16 cores and 128 GiB
    RAM, using rustc 1.96.0, four and eight units took 507.486 and 505.325
    seconds respectively. Four therefore keeps the build-time gain while
    limiting the stripped `codex` increase to 14.7%, compared with 21.5% at
    eight units. The gzip-compressed binary grows 7.8% at four units.
    
    The one-unit build from an empty target directory took 981.150 seconds.
    That comparison also populated dependency and native build caches, so it
    is directional rather than controlled. It agrees with the earlier clean
    matrix where eight units reduced 671 seconds to 303 seconds:
    https://gist.github.com/anp/4b88393a0acd35783d9f42156f3243d5
    
    At the local 48% reduction, the current release's 55m22s critical-path
    macOS Cargo step would save about 26 minutes from the 71m28s workflow:
    https://github.com/openai/codex/actions/runs/27367405663
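    
    The arithmetic behind these estimates, as a small sketch. The helper
    names are made up for this example; the inputs are the timings quoted
    above.
    
    ```rust
    /// Fractional reduction when a duration drops from `before` to `after`.
    pub fn reduction(before_secs: f64, after_secs: f64) -> f64 {
        (before_secs - after_secs) / before_secs
    }
    
    /// Seconds saved if a step of `step_secs` shrinks by `fraction`.
    pub fn projected_savings_secs(step_secs: f64, fraction: f64) -> f64 {
        step_secs * fraction
    }
    ```
    
    `reduction(981.150, 507.486)` is roughly 0.48, the local one-unit to
    four-unit reduction. Applying 0.48 to the 55m22s (3,322 s) Cargo step
    gives about 1,595 s, or between 26 and 27 minutes.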
    
    The prompt-image medians ranged from 3.9% faster to 0.9% slower. CLI
    startup shifted by 1-2 ms while user and system CPU time were unchanged.
    
    This is a draft because the release-latency improvement may not justify
    the binary-size increase.
  • ci(v8): gate Windows source builds on relevant changes (#27715)
    Avoid rebuilding sandboxed Windows MSVC V8 artifacts for unrelated
    changes to `codex-rs/Cargo.toml`.
    
    The V8 canary now compares the resolved V8 version between the base and
    head commits and only runs the Windows source-build matrix when:
    
    - the resolved V8 crate version changes;
    - Windows artifact-production scripts or workflows change; or
    - the workflow is manually dispatched.
    
    The existing Bazel V8 matrix is unchanged.
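    
    The gate is a simple disjunction of the three conditions. The real
    check lives in the CI workflow, so this Rust sketch only illustrates
    the logic; every name in it is hypothetical.
    
    ```rust
    /// Inputs to the gating decision (hypothetical shape).
    pub struct CanaryInputs<'a> {
        /// Resolved V8 crate version at the base commit.
        pub base_v8_version: &'a str,
        /// Resolved V8 crate version at the head commit.
        pub head_v8_version: &'a str,
        /// Whether Windows artifact-production scripts or workflows changed.
        pub windows_artifact_files_changed: bool,
        /// Whether the workflow was manually dispatched.
        pub manually_dispatched: bool,
    }
    
    /// Return true when the Windows source-build matrix should run.
    pub fn should_run_windows_source_build(inputs: &CanaryInputs<'_>) -> bool {
        inputs.base_v8_version != inputs.head_v8_version
            || inputs.windows_artifact_files_changed
            || inputs.manually_dispatched
    }
    ```
    
    An unrelated edit to `codex-rs/Cargo.toml` leaves the resolved version
    unchanged, so all three conditions are false and the matrix is skipped.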
    
    ## Why
    
    The Windows MSVC source builds take roughly two to three hours and
    currently run whenever any entry in the broad `v8-canary` path filter
    changes.
  • fix: Recover from sqlite directory being a file (#27719)
    Missed this file in the last PR. This ensures that, in the rare edge
    case where your sqlite directory is actually a file, the code fixes it
    and recovers properly.
  • [codex] Remove async_trait from first-party code (#27475)
    ## Why
    
    First-party async traits should expose their `Send` contracts explicitly
    without requiring `async_trait`. This completes the migration pattern
    established in #27303 and #27304.
    
    ## What changed
    
    - Replaced the remaining first-party `async_trait` traits with native
    return-position `impl Future + Send` where statically dispatched and
    explicit boxed `Send` futures where object safety is required.
    - Kept implementations behavior-preserving, outlining existing async
    bodies into inherent methods where that keeps the diff reviewable.
    - Removed all direct first-party `async-trait` dependencies and the
    workspace dependency declaration.
    - Added a cargo-deny policy that permits `async-trait` only through the
    remaining transitive wrapper crates.
    - Updated `rand` from 0.8.5 to 0.8.6 to resolve RUSTSEC-2026-0097 and
    keep the full cargo-deny check passing.
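    
    The two replacement shapes look roughly like this. The traits, the
    `Echo` type, and the tiny `block_on` executor are invented for the
    example and use only the standard library; they are not code from this
    repository.
    
    ```rust
    use std::future::Future;
    use std::pin::{pin, Pin};
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Statically dispatched: return-position `impl Future + Send`.
    pub trait Lookup {
        fn lookup(&self, key: &str) -> impl Future<Output = String> + Send;
    }
    
    /// Usable as `dyn DynLookup`: an explicit boxed `Send` future.
    pub trait DynLookup {
        fn lookup_boxed<'a>(
            &'a self,
            key: &'a str,
        ) -> Pin<Box<dyn Future<Output = String> + Send + 'a>>;
    }
    
    pub struct Echo;
    
    impl Echo {
        // The former `async fn` body, outlined into an inherent method.
        async fn lookup_inner(&self, key: &str) -> String {
            format!("value-for-{key}")
        }
    }
    
    impl Lookup for Echo {
        fn lookup(&self, key: &str) -> impl Future<Output = String> + Send {
            self.lookup_inner(key)
        }
    }
    
    impl DynLookup for Echo {
        fn lookup_boxed<'a>(
            &'a self,
            key: &'a str,
        ) -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
            Box::pin(self.lookup_inner(key))
        }
    }
    
    struct NoopWake;
    
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Demo-only executor: busy-polls until the future is ready.
    pub fn block_on<F: Future>(future: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut future = pin!(future);
        loop {
            if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
                return value;
            }
        }
    }
    ```
    
    In both shapes the `Send` bound is written in the trait signature, so a
    non-`Send` implementation fails to compile at the `impl` rather than at
    a distant spawn site.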
    
    ## Validation
    
    - `just test -p codex-exec-server`: 216 passed, 2 skipped.
    - `just test -p codex-model-provider`: 39 passed.
    - `just test -p codex-core` and `just test`: changed tests passed;
    remaining failures are environment-sensitive suites unrelated to this
    migration.
    - `cargo deny check`
    - `just fix`
    - `just fmt`
    - `cargo shear`
    - `just bazel-lock-check`
  • Fix image extension PathUri conversion (#27711)
    ## Why
    
    `main` stopped compiling when #27498 passed an `AbsolutePathBuf` to the
    `ExecutorFileSystem` API migrated to `PathUri` by #27653.
    
    ## What
    
    Convert referenced image paths to `PathUri` before filesystem reads,
    declare the internal path-URI dependency, and refresh `Cargo.lock`.
  • tui: clear stale hook row after turn completion (#27619)
    Fixes #27210.
    
    ## Why
    When the app server reports a visible `HookStarted` event for a
    `PostToolUse` hook but the turn reaches `TurnCompleted` before a
    matching hook completion event arrives, the TUI can leave the transient
    `Running PostToolUse hook` row visible after the agent is done.
    Interrupted and failed turn cleanup already drops transient live hook
    rows; the normal completion path did not.
    
    ## What Changed
    - Added `ChatWidget::clear_active_hook_cell()` for dropping transient
    live hook status without writing it to history.
    - Call that cleanup from normal task completion, while reusing it for
    the existing start/finalize cleanup paths.
    - Added `completed_turn_clears_visible_running_hook` snapshot coverage
    for the reported `PostToolUse` case.
    
    ## Tests
    - `just test -p codex-tui completed_turn_clears_visible_running_hook`
    - `just test -p codex-tui` (fails on current `main` in unrelated
    guardian tests:
    `update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
    and
    `update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`)
  • Add spans to turn lifecycle gaps (#27623)
    ## Why
    Codex app-server latency traces do not granularly cover turn task
    startup and inter-request handoffs. These spans help attribute time
    across task execution, startup prewarm, in-flight tool completion, and
    rollout persistence.
    
    ## What changed
    - Add `session_task.run` spans around task execution and
    `session_task.flush_rollout` around flushing pending conversation
    transcript writes to durable storage
    - Add `regular_task.prepare_run_turn` around regular-turn startup (send
    the `TurnStarted` event, reset turn-specific reasoning state, and
    resolve any startup prewarm)
    - Add `startup_prewarm.resolve` around waiting for background session
    prewarming to finish, fail, time out, or be cancelled
    - Add a function-level trace span around draining in-flight tool calls
    (wait for tool calls to complete, record tool results in conversation
    history, and do other bookkeeping)
    
    ## Verification
    Triggered a Codex rollout and observed that the new spans are included.
  • Route image extension reads through turn environments v2 (#27498)
    ## Why
    
    Image generation used `std::fs::read` for referenced image paths, which
    did not support environment-backed filesystems or their sandbox context.
    
    ## What changed
    
    - Expose optional turn environments to extension tool calls.
    - Include each environment’s ID, working directory, filesystem, and
    sandbox context.
    - Read referenced images through the selected environment filesystem.
    - Keep sandbox usage at the extension call site so extensions can choose
    the appropriate access mode.
    - Consolidate image request construction into one async function.
    - Add coverage for successful environment reads and read failures.
    
    ## Validation
    
    - `cargo check -p codex-image-generation-extension --tests`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    `just test -p codex-image-generation-extension` could not complete
    because the build exhausted available disk space.