Commit Graph

117 Commits

  • Preserve image detail in app-server inputs (#20693)
    ## Summary
    
    - Add optional image detail to user image inputs across core, app-server
    v2, thread history/event mapping, and the generated app-server
    schemas/types.
    - Preserve requested detail when serializing Responses image inputs:
    omitted detail stays on the existing `high` default, while explicit
    `original` keeps local images on the original-resolution path.
    - Support `high`/`original` consistently for tool image outputs,
    including MCP `codex/imageDetail`, code-mode image helpers, and
    `view_image`.
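
    The defaulting rule above can be sketched as follows. This is a minimal
    illustration, not the PR's code: `ImageDetail` and `effective_detail`
    are hypothetical names, and only the two detail values named in the
    summary are modeled.

    ```rust
    /// Hypothetical mirror of the two detail levels named in the summary.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ImageDetail {
        High,
        Original,
    }

    /// Resolve the detail to serialize: an omitted value stays on the
    /// existing `high` default, an explicit value passes through unchanged.
    pub fn effective_detail(requested: Option<ImageDetail>) -> ImageDetail {
        requested.unwrap_or(ImageDetail::High)
    }
    ```

    Under this sketch, `effective_detail(None)` yields `ImageDetail::High`,
    while `effective_detail(Some(ImageDetail::Original))` keeps `Original`.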
  • feat(app-server): update remote control APIs for better UX (#22877)
    ## Why
    To improve the `codex remote-control` CLI UX (planned for a follow-up),
    this PR adds `server-name` to the following remote control APIs:
    - `remoteControl/enable`
    - `remoteControl/disable`
    - `remoteControl/status/changed`
    
    It also adds a `remoteControl/status/read` API, which will be helpful in
    the Codex App.
  • feat: Use installation ID in remote enrollments (#21662)
    * Pass the installation ID to the enrollments server for storage, so
    multiple app-servers per installation can be deduplicated and grouped
    * Pass the installation ID in `remoteControl/status/changed` events
  • [codex-analytics] plumb protocol-native review timing (#21434)
    ## Why
    
    We want terminal tool review analytics, but the reducer should not stamp
    review timing from its own wall clock.
    
    This PR plumbs review timing through the real protocol and app-server
    seams so downstream analytics can consume the emitter's timestamps
    directly. Guardian reviews keep their enriched `started_at` /
    `completed_at` analytics fields by deriving those legacy second-based
    values from the same protocol-native millisecond lifecycle timestamps,
    rather than sampling a separate analytics clock.
    
    ## What changed
    
    - add `started_at_ms` to user approval request payloads
    - add `started_at_ms` / `completed_at_ms` to guardian review
    notifications
    - preserve Guardian review `started_at` / `completed_at` enrichment from
    the protocol-native timing source
    - stamp typed `ServerResponse` analytics facts with app-server-observed
    `completed_at_ms`
    - thread the new timing fields through core, protocol, app-server, TUI,
    and analytics fixtures
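
    One way to picture the derivation of the legacy second-based fields from
    the millisecond timestamps is shown below. It is a sketch under stated
    assumptions: `ReviewTiming` and `legacy_seconds` are hypothetical names,
    and rounding down to whole seconds is assumed rather than confirmed by
    the PR.

    ```rust
    /// Hypothetical container for the protocol-native lifecycle timestamps.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct ReviewTiming {
        pub started_at_ms: i64,
        pub completed_at_ms: i64,
    }

    impl ReviewTiming {
        /// Legacy `(started_at, completed_at)` values in whole seconds.
        /// Assumption: milliseconds are rounded down to the second.
        pub fn legacy_seconds(&self) -> (i64, i64) {
            (
                self.started_at_ms.div_euclid(1000),
                self.completed_at_ms.div_euclid(1000),
            )
        }
    }
    ```

    Because both legacy values come from the same pair of millisecond
    timestamps, no second clock is sampled anywhere in the derivation.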
    
    ## Verification
    
    - `cargo test -p codex-app-server outgoing_message --manifest-path
    codex-rs/Cargo.toml`
    - `cargo test -p codex-app-server-protocol guardian --manifest-path
    codex-rs/Cargo.toml`
    - `cargo test -p codex-tui guardian --manifest-path codex-rs/Cargo.toml`
    - `cargo test -p codex-analytics analytics_client_tests --manifest-path
    codex-rs/Cargo.toml`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21434).
    * #18748
    * __->__ #21434
    * #18747
    * #17090
    * #17089
    * #20514
  • Add compact lifecycle hooks (started by vincentkoc - external contrib) (#19905)
    Based on work from Vincent K -
    https://github.com/openai/codex/pull/19060
    
    ## Why
    
    Compaction rewrites the conversation context that future model turns
    receive, but hooks currently have no deterministic lifecycle point
    around that rewrite. This adds compact lifecycle hooks so users can
    audit manual and automatic compaction, surface hook messages in the UI,
    and run post-compaction follow-up without overloading tool or prompt
    hooks.
    
    ## What Changed
    
    - Added `PreCompact` and `PostCompact` hook events across hook config,
    discovery, dispatch, generated schemas, app-server notifications,
    analytics, and TUI hook rendering.
    - Added trigger matching for compact hooks with the documented `manual`
    and `auto` matcher values.
    - Wired `PreCompact` before both local and remote compaction, and
    `PostCompact` after successful local or remote compaction.
    - Kept compact hook command input to lifecycle metadata: session id,
    Codex turn id, transcript path, cwd, hook event name, model, and
    trigger.
    - Made compact stdout handling consistent with other hooks: plain stdout
    is ignored as debug output, while malformed JSON-looking stdout is
    reported as failed hook output.
    - Added integration coverage for compact hook dispatch, trigger
    matching, post-compact execution, and the audited behavior that
    `decision:"block"` does not block compaction.
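
    The trigger matching described above might look roughly like this. The
    names `CompactTrigger` and `matcher_applies` are hypothetical, and the
    rule that a hook without a matcher runs for every trigger is an
    assumption, not something this PR states.

    ```rust
    /// The two documented compaction triggers.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum CompactTrigger {
        Manual,
        Auto,
    }

    impl CompactTrigger {
        /// Matcher value documented for this trigger.
        pub fn as_str(self) -> &'static str {
            match self {
                CompactTrigger::Manual => "manual",
                CompactTrigger::Auto => "auto",
            }
        }
    }

    /// Decide whether a compact hook should run for `trigger`.
    /// Assumption: an absent matcher matches every trigger.
    pub fn matcher_applies(matcher: Option<&str>, trigger: CompactTrigger) -> bool {
        match matcher {
            None => true,
            Some(value) => value == trigger.as_str(),
        }
    }
    ```

    Note that matching only selects which hooks run; per the audited
    behavior above, a `PreCompact` hook's output never prevents compaction.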
    
    ## Out of Scope
    
    - Hook-specific compaction blocking is not implemented;
    `decision:"block"` and exit-code-2 blocking semantics are intentionally
    unsupported for `PreCompact`.
    - Custom compaction instructions are not exposed to compact hooks in
    this PR.
    - Compact summaries, summary character counts, and summary previews are
    not exposed to compact hooks in this PR.
    
    ## Verification
    
    - `cargo test -p codex-hooks`
    - `cargo test -p codex-core
    manual_pre_compact_block_decision_does_not_block_compaction`
    - `cargo test -p codex-app-server hooks_list`
    - `cargo test -p codex-core config_schema_matches_fixture`
    - `cargo test -p codex-tui hooks_browser`
    
    ## Docs
    
    The developer documentation for Codex hooks should be updated alongside
    this feature to document `PreCompact` and `PostCompact`, the
    `manual`/`auto` matcher values, and the compact hook payload fields.
    
    ---------
    
    Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
  • feat(app-server): move v2 sessionId onto Thread (#21336)
    ## Why
    
    `session_id` and `thread_id` are separate identities after #20437, but
    app-server only surfaced `sessionId` on the `thread/start`,
    `thread/resume`, and `thread/fork` response envelopes. Other
    thread-bearing surfaces such as `thread/list`, `thread/read`,
    `thread/started`, `thread/rollback`, `thread/metadata/update`, and
    `thread/unarchive` either lacked the grouping key or forced clients to
    special-case those three responses.
    
    Making `sessionId` part of the reusable `Thread` payload gives every v2
    API surface one place to expose session-tree identity.
    
    ## Mental model
    1. `thread.sessionId` lives on `Thread`.
    2. It is a view/runtime identity for the current live session tree, not
    durable stored lineage metadata.
    3. When app-server has a live loaded thread, it copies the real value
    from core’s `session_configured.session_id`.
    4. When it only has stored/unloaded data, it falls back to
    `thread.sessionId = thread.id`.
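
    The fallback in the mental model can be expressed in a few lines. This
    is an illustrative sketch: `thread_session_id` is a hypothetical helper,
    and plain strings stand in for the real id types.

    ```rust
    /// Pick the `sessionId` to report for a thread.
    /// `live_session_id` is `Some` only when the thread is loaded into a
    /// live session tree; otherwise the thread's own id is used.
    pub fn thread_session_id(thread_id: &str, live_session_id: Option<&str>) -> String {
        live_session_id.unwrap_or(thread_id).to_string()
    }
    ```

    A stored, unloaded thread with id `t-1` therefore reports `t-1`, and the
    same thread reports the session tree root once it is live again.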
    
    ## What changed
    
    - Added `sessionId` to the v2
    [`Thread`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server-protocol/src/protocol/v2/thread_data.rs#L105-L109).
    - Removed the duplicate top-level `sessionId` fields from
    `thread/start`, `thread/resume`, and `thread/fork`; clients should now
    read `response.thread.sessionId`.
    - Populated `thread.sessionId` when building live thread responses,
    replaying loaded threads, and returning stored-thread summaries so the
    field is present across start, resume, fork, list, read, rollback,
    metadata-update, unarchive, and `thread/started` paths. See
    [`load_thread_from_resume_source_or_send_internal`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/src/request_processors/thread_processor.rs#L2824-L2918)
    and
    [`thread_from_stored_thread`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/src/request_processors/thread_processor.rs#L3671-L3719).
    - Preserved the stored-thread fallback: if a thread has not been loaded
    into a live session tree yet, `thread.sessionId` falls back to
    `thread.id`; once the thread is live again, the field reports the active
    session tree root.
    - Regenerated the JSON/TypeScript schemas and updated the app-server
    README examples to show
    [`thread.sessionId`](https://github.com/openai/codex/blob/8fc9e9b4cf81b6f61d432e71f1eb266f6f104b63/codex-rs/app-server/README.md#L306-L310)
    on the thread object.
  • [codex-analytics] rework thread_source for thread analytics (#20949)
    ## Summary
    - make `thread_source` an explicit optional thread-level field on
    `thread/start`, `thread/fork`, and returned thread payloads
    - persist `thread_source` in rollout/session metadata so resumed live
    threads retain the original value
    - replace the old best-effort `session_source` -> `thread_source`
    mapping with an explicit caller-supplied analytics classification
    
    ## Why
    Before this change, analytics `thread_source` was populated by a
    best-effort mapping from `session_source`. `session_source` describes
    the runtime/client surface, not the actual thread-level origin, so that
    projection was not accurate enough to distinguish cases such as `user`,
    `subagent`, `memory_consolidation`, and future thread origins reliably.
    
    Making `thread_source` explicit keeps one thread-level analytics field
    while letting callers provide the real classification directly instead
    of recovering it indirectly from `session_source`.
    
    ## Impact
    For new analytics events, `thread_source` now reflects the explicit
    thread-level classification supplied by the caller rather than an
    inferred value derived from `session_source`. Existing protocol fields
    remain optional; callers that omit `threadSource` now produce `null`
    instead of a best-effort inferred value.
    
    ## Validation
    - `just write-app-server-schema`
    - `cargo test -p codex-analytics -p codex-core -p
    codex-app-server-protocol --no-run`
    - `cargo test -p codex-app-server-protocol
    generated_ts_optional_nullable_fields_only_in_params`
    - `cargo test -p codex-analytics
    thread_initialized_event_serializes_expected_shape`
    - `cargo test -p codex-core
    resume_stopped_thread_from_rollout_preserves_thread_source`
  • chore(app-server-protocol): split v2 API definitions into modules (#21251)
    ## Why
    
    `codex-rs/app-server-protocol/src/protocol/v2.rs` had grown into a
    single ~12k-line definition file for the entire app-server v2 API.
    
    This is purely a mechanical refactor to break up the monolithic `v2.rs`
    file that contains all app-server API v2 types into more modular files,
    grouped by resource (e.g. account, thread, turn, etc.).
    
    `just write-app-server-schema` shows no real changes, so we can be sure
    that this is purely an internal organizational change.
    
    ## What changed
    
    - Replaced the monolithic `protocol/v2.rs` with a `protocol/v2/` module
    tree and a small `mod.rs` that only declares and reexports modules.
    - Grouped v2 API definitions by conceptual owner, including `account`,
    `apps`, `collaboration_mode`, `command_exec`, `config`, `device_key`,
    `experimental_feature`, `feedback`, `fs`, `hook`, `item`, `mcp`,
    `model`, `notification`, `permissions`, `plugin`, `process`, `realtime`,
    `review`, `thread`, `thread_data`, `turn`, and `windows_sandbox`.
    - Moved v2 tests into `protocol/v2/tests.rs` so `mod.rs` stays small.
    - Kept shared protocol helpers in `protocol/v2/shared.rs`, including the
    enum mirroring macro and common cross-resource types.
    - Co-located resource-specific notifications and server-request payloads
    with the modules that own those resources.
    - Regenerated app-server protocol schema fixtures. The schema diffs are
    non-semantic newline-only changes after the refactor.
    
    ## Verification
    
    - `cargo check -p codex-app-server-protocol`
    - `cargo test -p codex-app-server-protocol`
    - `just write-app-server-schema`
  • add turn items view to app-server turns (#21063)
    ## Why
    
    `Turn.items` currently overloads an empty array to mean either that no
    items exist or that the server intentionally did not load them for this
    response. That ambiguity blocks future lazy-loading work where clients
    need to distinguish unloaded, summary, and fully hydrated turn payloads.
    
    ## What changed
    
    - add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full`
    variants
    - add required `itemsView` metadata to app-server `Turn` payloads
    - mark reconstructed persisted history as `full` and live shell-style
    turn payloads as `notLoaded`
    - keep current `thread/turns/list` behavior unchanged and document that
    it still returns `full` turns today
    - regenerate the JSON and TypeScript protocol fixtures
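
    To show why the extra metadata removes the ambiguity, here is a small
    client-side sketch. `Turn`, `ItemsState`, and `items_state` are
    hypothetical, and items are reduced to strings for brevity.

    ```rust
    /// Mirrors the three variants named above.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum TurnItemsView {
        NotLoaded,
        Summary,
        Full,
    }

    /// Simplified stand-in for an app-server `Turn` payload.
    pub struct Turn {
        pub items: Vec<String>,
        pub items_view: TurnItemsView,
    }

    /// What a client can conclude about `items`.
    #[derive(Debug, PartialEq, Eq)]
    pub enum ItemsState {
        NeedsFetch,
        Partial(usize),
        Complete(usize),
    }

    /// Interpret `items` using `items_view` instead of guessing from emptiness.
    pub fn items_state(turn: &Turn) -> ItemsState {
        match turn.items_view {
            TurnItemsView::NotLoaded => ItemsState::NeedsFetch,
            TurnItemsView::Summary => ItemsState::Partial(turn.items.len()),
            TurnItemsView::Full => ItemsState::Complete(turn.items.len()),
        }
    }
    ```

    An empty `items` array with `full` now means the turn really has no
    items, while the same array with `notLoaded` means they were not sent.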
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server thread_read_can_include_turns`
    - `cargo test -p codex-app-server
    thread_turns_list_can_page_backward_and_forward`
    - `cargo test -p codex-app-server
    thread_resume_rejects_history_when_thread_is_running`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fmt`
  • [codex] Add unsandboxed process exec API (#19040)
    ## Why
    
    App-server clients sometimes need argv-based local process execution
    while sandbox policy is controlled outside Codex. Those environments can
    reject sandbox-disabling paths before a command ever starts, even when
    the caller intentionally wants unsandboxed execution.
    
    This PR adds a distinct `process/*` API for that use case instead of
    extending `command/exec` with another sandbox-disabling shape. Keeping
    the new surface separate also makes the future removal of `command/exec`
    simpler: clients that need explicit process lifecycle control can move
    to the newer handle-based API without depending on `command/exec`
    business logic.
    
    ## What changed
    
    - Added v2 process lifecycle methods: `process/spawn`,
    `process/writeStdin`, `process/resizePty`, and `process/kill`.
    - Added process notifications: `process/outputDelta` for streamed
    stdout/stderr chunks and `process/exited` for final exit status and
    buffered output.
    - Made `process/spawn` intentionally unsandboxed and omitted
    sandbox-selection fields such as `sandboxPolicy` and
    `permissionProfile`.
    - Added client-supplied, connection-scoped `processHandle` values for
    follow-up control requests and notification routing.
    - Supported cwd, environment overrides, PTY mode and size, stdin
    streaming, stdout/stderr streaming, per-stream output caps, and timeout
    controls.
    - Killed active process sessions when the originating app-server
    connection closes.
    - Wired the implementation through the modular `request_processors/`
    app-server layout, with process-handle request serialization for
    follow-up control calls.
    - Updated generated JSON/TypeScript schema fixtures and documented the
    new API in `codex-rs/app-server/README.md`.
    - Added v2 app-server integration coverage in
    `codex-rs/app-server/tests/suite/v2/process_exec.rs` for spawn
    acknowledgement before exit, buffered output caps, and process
    termination.
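
    The connection-scoped handle behavior can be sketched with a small
    registry. Everything here is hypothetical (`ProcessRegistry`, the `u32`
    pid placeholder, the numeric connection id), and rejecting a duplicate
    handle on the same connection is an assumption rather than documented
    behavior.

    ```rust
    use std::collections::HashMap;

    pub type ConnectionId = u64;

    /// Tracks process sessions by (connection, client-supplied handle).
    /// The `u32` value is a placeholder for a real process session.
    #[derive(Default)]
    pub struct ProcessRegistry {
        sessions: HashMap<(ConnectionId, String), u32>,
    }

    impl ProcessRegistry {
        /// Register a spawned process under a client-supplied handle.
        /// Assumption: a handle may not be reused on the same connection.
        pub fn spawn(&mut self, conn: ConnectionId, handle: &str, pid: u32) -> Result<(), String> {
            let key = (conn, handle.to_string());
            if self.sessions.contains_key(&key) {
                return Err(format!("processHandle {handle:?} already in use"));
            }
            self.sessions.insert(key, pid);
            Ok(())
        }

        /// Resolve a handle for a follow-up control request.
        pub fn lookup(&self, conn: ConnectionId, handle: &str) -> Option<u32> {
            self.sessions.get(&(conn, handle.to_string())).copied()
        }

        /// Drop every session owned by a closing connection and return the
        /// pids that should be killed, sorted for determinism.
        pub fn close_connection(&mut self, conn: ConnectionId) -> Vec<u32> {
            let mut killed = Vec::new();
            self.sessions.retain(|(owner, _), pid| {
                if *owner == conn {
                    killed.push(*pid);
                    false
                } else {
                    true
                }
            });
            killed.sort_unstable();
            killed
        }
    }
    ```

    Keying on the connection as well as the handle lets two clients pick the
    same handle string without colliding, and makes cleanup on disconnect a
    single pass over the map.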
    
    ## Verification
    
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server`
    
    ---------
    
    Co-authored-by: Owen Lin <owen@openai.com>
  • [codex-analytics] add item lifecycle timing (#20514)
    ## Why
    
    Tool families already disagree on what their existing `duration` fields
    mean, so lifecycle latency should live on the shared item envelope
    instead of being inferred from per-tool execution fields. Carrying that
    envelope through app-server notifications gives downstream consumers one
    reusable timing signal without pretending every tool has the same
    execution semantics.
    
    ## What changed
    
    - Adds `started_at_ms` to core `ItemStartedEvent` values and
    `completed_at_ms` to core `ItemCompletedEvent` values.
    - Populates those timestamps in the shared session lifecycle emitters,
    so protocol-native items get timing without each producer tracking its
    own clock state.
    - Exposes `startedAtMs` on app-server `item/started` notifications and
    `completedAtMs` on `item/completed` notifications.
    - Maps the lifecycle timestamps through the app-server boundary while
    leaving legacy-converted notifications nullable when no lifecycle
    timestamp exists.
    - Regenerates the app-server JSON schema and TypeScript fixtures for the
    notification-envelope change and updates downstream fixtures that
    construct those notifications directly.
    - Extends the existing web-search and image-generation integration flows
    to assert the new lifecycle timestamps on the native item events.
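
    A downstream consumer could combine the two nullable timestamps along
    these lines. `lifecycle_latency_ms` is a hypothetical helper, and
    treating a negative difference as "no latency available" is an
    assumption made for this sketch.

    ```rust
    /// Compute item lifecycle latency from the `startedAtMs` and
    /// `completedAtMs` values carried on the two notifications.
    /// Returns `None` when either timestamp is missing (for example on
    /// legacy-converted notifications) or when the pair is inconsistent.
    pub fn lifecycle_latency_ms(
        started_at_ms: Option<i64>,
        completed_at_ms: Option<i64>,
    ) -> Option<i64> {
        let started = started_at_ms?;
        let completed = completed_at_ms?;
        completed.checked_sub(started).filter(|delta| *delta >= 0)
    }
    ```

    This keeps latency on the shared envelope and independent of any
    per-tool `duration` field.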
    
    ## Verification
    
    - `cargo check -p codex-protocol -p codex-core -p
    codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec
    -p codex-app-server-client`
    - `cargo test -p codex-core --test all web_search_item_is_emitted`
    - `cargo test -p codex-core --test all
    image_generation_call_event_is_emitted`
    - `cargo test -p codex-app-server-protocol`
    
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514).
    * #18748
    * #18747
    * #17090
    * #17089
    * __->__ #20514
  • Stop emitting item/fileChange/outputDelta output delta notifications (#20471)
    ## Why
    
    The text output of `item/fileChange/outputDelta` was only the tool's
    summary or error text, and no client surface used it.
    
    We keep `item/fileChange/outputDelta` in the app-server protocol as a
    deprecated compatibility entry, but the server no longer emits it.
    
    ## What changed
    
    - stop the `apply_patch` runtime from emitting `ExecCommandOutputDelta`
    events
    - simplify `item_event_to_server_notification` so command output deltas
    always map to `item/commandExecution/outputDelta`
    - remove the app-server bookkeeping that tried to detect whether an
    output delta belonged to a file change
    - mark `item/fileChange/outputDelta` as a deprecated legacy protocol
    entry in the v2 types, schema, and README
    - simplify the file-change approval tests so they only wait for
    completion instead of expecting output-delta notifications
    
    ## Testing
    
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-thread-manager-sample`
    - `cargo test -p codex-app-server-protocol
    protocol::event_mapping::tests::exec_command_output_delta_maps_to_command_execution_output_delta
    -- --exact`
    - `cargo test -p codex-app-server
    turn_start_file_change_approval_accept_for_session_persists_v2 --
    --exact` *(failed before the test assertions because the wiremock
    `/responses` mock received 0 requests in setup)*
  • realtime: rename provider session ids (#20361)
    ## Summary
    
    Codex is repurposing `session` to mean a thread group, so the realtime
    provider session id should no longer use `session_id` / `sessionId` in
    Codex-facing protocol payloads. This PR renames that provider-specific
    field to `realtime_session_id` / `realtimeSessionId` and intentionally
    breaks clients that still send the old field names.
    
    ## What Changed
    
    - Renamed realtime provider session fields in `ConversationStartParams`,
    `RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`.
    - Renamed app-server v2 realtime request and notification fields to
    `realtimeSessionId`.
    - Removed legacy serde aliases for `session_id` / `sessionId`; clients
    must send the new names.
    - Propagated the rename through core realtime startup, app-server
    adapters, codex-api websocket handling, and TUI realtime state.
    - Regenerated app-server protocol schema/TypeScript outputs and updated
    app-server README examples.
    - Kept upstream Realtime API concepts unchanged: provider `session.id`
    parsing and `x-session-id` headers still use the upstream wire names.
    
    ## Testing
    
    - CI is running on the latest pushed commit.
    - Earlier local verification on this PR:
      - `cargo test -p codex-protocol`
      - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core
      realtime_conversation`
      - `cargo test -p codex-app-server-protocol`
      - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server
      realtime_conversation`
      - attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local
      linker bus error while linking the test binary)
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Add persisted hook enablement state (#19840)
    ## Why
    
    After `hooks/list` exposes the hook inventory, clients need a way to
    persist user hook preferences, make those changes effective in
    already-open sessions, and distinguish user-controllable hooks from
    managed requirements without adding another bespoke app-server write
    API.
    
    ## What
    
    - Extends `hooks/list` entries with effective `enabled` state.
    - Persists user-level hook state under `hooks.state.<hook-id>` so the
    model can grow beyond a single boolean over time.
    - Uses the existing `config/batchWrite` path for hook state updates
    instead of introducing a dedicated hook write RPC.
    - Refreshes live session hook engines after config writes so
    already-open threads observe updated enablement without a restart.
    
    ## Stack
    
    1. openai/codex#19705
    2. openai/codex#19778
    3. This PR - openai/codex#19840
    4. openai/codex#19882
    
    ## Reviewer Notes
    
    The generated schema files account for much of the raw diff. The core
    behavior is in:
    
    - `hooks/src/config_rules.rs`, which resolves per-hook user state from
    the config layer stack.
    - `hooks/src/engine/discovery.rs`, which projects effective enablement
    into `hooks/list` from source-derived managedness.
    - `config/src/hook_config.rs`, which defines the new `hooks.state`
    representation.
    - `core/src/session/mod.rs`, which rebuilds live hook state after user
    config reloads.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • app-server: notify clients of remote-control status changes (#19919)
    ## Why
    
    Remote-control app-server enrollments have both an internal server id
    and the environment id exposed to remote-control clients. App-server
    clients need one current status snapshot that says whether remote
    control is usable and which environment id, if any, is exposed.
    
    A temporary websocket disconnect is not itself an identity change.
    Account changes, stale enrollment invalidation, successful
    re-enrollment, and missing ChatGPT auth are meaningful status changes.
    Disabled remote control remains `disabled` regardless of auth or SQLite
    state. SQLite startup failure disablement and enrollment persistence
    failures are handled in #20068; this PR reports the resulting effective
    status to clients.
    
    ## What changed
    
    - Adds v2 `remoteControl/status/changed` carrying `state` and
    `environmentId`.
    - Adds `RemoteControlConnectionState` values: `disabled`, `connecting`,
    `connected`, and `errored`.
    - Exposes remote-control status updates through `RemoteControlHandle`
    using a Tokio watch channel.
    - Always sends the current remote-control status snapshot to newly
    initialized app-server clients.
    - Broadcasts status changes to initialized app-server clients when state
    or environment id changes.
    - Treats missing ChatGPT auth as an `errored` status while leaving it
    retryable because auth can change at runtime.
    - Clears `environmentId` when enrollment is cleared for account changes,
    auth loss, stale backend invalidation, or disabled remote control.
    - Updates app-server protocol schema fixtures, generated TypeScript,
    app-server README, remote-control tests, and TUI exhaustive notification
    matches.
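
    The broadcast and clearing rules above can be sketched as plain value
    comparisons. `RemoteControlStatus`, `should_broadcast`, and
    `clear_enrollment` are hypothetical names, and only the `state` and
    `environmentId` fields from this PR are modeled.

    ```rust
    /// The four connection states listed above.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum RemoteControlConnectionState {
        Disabled,
        Connecting,
        Connected,
        Errored,
    }

    /// Simplified status snapshot sent to app-server clients.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct RemoteControlStatus {
        pub state: RemoteControlConnectionState,
        pub environment_id: Option<String>,
    }

    /// Broadcast only when the state or the environment id changed.
    pub fn should_broadcast(previous: &RemoteControlStatus, next: &RemoteControlStatus) -> bool {
        previous != next
    }

    /// Build the status after enrollment is cleared: the environment id is
    /// dropped and the caller supplies the resulting state.
    pub fn clear_enrollment(new_state: RemoteControlConnectionState) -> RemoteControlStatus {
        RemoteControlStatus {
            state: new_state,
            environment_id: None,
        }
    }
    ```

    Comparing whole snapshots means a reconnect that ends in the same state
    and environment id produces no notification under this sketch.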
    
    ## Stack
    
    - Builds on #20068.
    
    ## Verification
    
    - `just write-app-server-schema`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server transport::remote_control --lib`
    - `cargo check -p codex-tui`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fix -p codex-tui`
  • Discover hooks bundled with plugins (#19705)
    ## Why
    
    Plugins can bundle lifecycle hooks, but Codex previously only discovered
    hooks from user, project, and managed config layers. This adds the
    plugin discovery and runtime plumbing needed for plugin-bundled hooks
    while keeping execution behind the `plugin_hooks` feature flag.
    
    ## What
    
    - Discovers plugin hook sources from each plugin's default
    `hooks/hooks.json`.
    - Supports `plugin.json` manifest `hooks` entries as either relative
    paths or inline hook objects.
    - Plumbs discovered plugin hook sources through plugin loading into the
    hook runtime when `plugin_hooks` is enabled.
    - Marks plugin-originated hook runs as `HookSource::Plugin`.
    - Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook
    command environments.
    - Updates generated schemas and hook source metadata for the plugin hook
    source.
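
    As an illustration of the environment injection, the sketch below builds
    the two variables for a plugin hook command. `plugin_hook_env` is a
    hypothetical helper, and setting both variables to the same plugin root
    path is an assumption based on their names.

    ```rust
    use std::path::Path;

    /// Environment variables to add to a plugin hook command.
    /// Assumption: both names point at the same plugin root directory.
    pub fn plugin_hook_env(plugin_root: &Path) -> Vec<(String, String)> {
        let root = plugin_root.to_string_lossy().into_owned();
        vec![
            ("PLUGIN_ROOT".to_string(), root.clone()),
            ("CLAUDE_PLUGIN_ROOT".to_string(), root),
        ]
    }
    ```

    The returned pairs can be passed to `std::process::Command::envs` when
    spawning the hook command.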
    
    ## Stack
    
    1. This PR - openai/codex#19705
    2. openai/codex#19778
    3. openai/codex#19840
    4. openai/codex#19882
    
    ## Reviewer Notes
    
    - Core logic is in `codex-rs/core-plugins/src/loader.rs` and
    `codex-rs/hooks/src/engine/discovery.rs`
    - Moved existing tests and added new ones to
    `codex-rs/core-plugins/src/loader_tests.rs`, hence the large diff there
    - Otherwise mostly plumbing and minor schema updates
    
    ### Core Changes
    
    The `codex-rs/core` changes are limited to wiring plugin hook support
    into existing core flows:
    
    - `core/src/session/session.rs` conditionally pulls effective plugin
    hook sources and plugin hook load warnings from `PluginsManager` when
    `plugin_hooks` is enabled, then passes them into `HooksConfig`.
    - `core/src/hook_runtime.rs` adds the `plugin` metric tag for
    `HookSource::Plugin`.
    - `core/config.schema.json` picks up the new `plugin_hooks` feature
    flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the
    added plugin hook fields.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • permissions: remove cwd special path (#19841)
    ## Why
    
    The experimental `PermissionProfile` API had both `:cwd` and
    `:project_roots` special filesystem paths, which made the permission
    root ambiguous. This PR removes the unstable `current_working_directory`
    special path before the permissions API is stabilized, so callers use
    `:project_roots` for symbolic project-root access.
    
    ## What changed
    
    - Removes `FileSystemSpecialPath::CurrentWorkingDirectory` from protocol
    and app-server protocol models, plus regenerated app-server
    JSON/TypeScript schemas.
    - Replaces internal `:cwd` permission entries with `:project_roots`
    entries.
    - Keeps the existing cwd-update behavior for legacy-shaped
    workspace-write profiles, while removing the deleted
    `CurrentWorkingDirectory` case from that compatibility path.
    - Keeps `PermissionProfile::workspace_write()` as the reusable symbolic
    workspace-write helper, with docs noting that `:project_roots` entries
    resolve at enforcement time.
    - Updates app-server docs/examples and approval UI labeling to stop
    advertising `:cwd` as a permission token.
    
    ## Compatibility
    
    Persisted rollout items may contain the old
    `{"kind":"current_working_directory"}` tag from earlier experimental
    `permissionProfile` snapshots. This PR keeps that tag as a
    deserialize-only alias for `ProjectRoots { subpath: None }`, while
    continuing to serialize only the new `project_roots` tag.
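
    The deserialize-only alias can be pictured without any serialization
    library. This is a hand-rolled sketch, not the PR's implementation:
    `SpecialPath`, `parse_kind`, and `kind_tag` are hypothetical, and only
    the `ProjectRoots` variant is modeled.

    ```rust
    /// Simplified stand-in for the special filesystem path type.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum SpecialPath {
        ProjectRoots { subpath: Option<String> },
    }

    /// Read a `kind` tag, accepting the removed tag as an alias.
    pub fn parse_kind(kind: &str) -> Option<SpecialPath> {
        match kind {
            "project_roots" | "current_working_directory" => {
                Some(SpecialPath::ProjectRoots { subpath: None })
            }
            _ => None,
        }
    }

    /// Write a `kind` tag; the old tag is never produced.
    pub fn kind_tag(path: &SpecialPath) -> &'static str {
        match path {
            SpecialPath::ProjectRoots { .. } => "project_roots",
        }
    }
    ```

    Old rollout items still load, and anything written afterwards carries
    only the `project_roots` tag.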
    
    ## Follow-up
    
    This PR intentionally does not introduce an explicit project-root set on
    `SessionConfiguration` or runtime sandbox resolution. Today, the
    resolver still uses the active cwd as the single implicit project root.
    A follow-up should model project roots separately from tool cwd so
    `:project_roots` entries can resolve against the configured project
    roots, and resolve to no entries when there are no project roots.
    
    ## Verification
    
    - `cargo test -p codex-protocol permissions:: --lib`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-sandboxing -p codex-exec-server --lib`
    - `cargo test -p codex-core session_configuration_apply_ --lib`
    - `cargo test -p codex-app-server
    command_exec_permission_profile_project_roots_use_command_cwd --test
    all`
    - `cargo test -p codex-tui
    thread_read_session_state_does_not_reuse_primary_permission_profile
    --lib`
    - `cargo test -p codex-tui
    preset_matching_accepts_workspace_write_with_extra_roots --lib`
    - `cargo test -p codex-config --lib`
  • Add goal app-server API (2 / 5) (#18074)
    Adds the app-server v2 goal API on top of the persisted goal state from
    PR 1.
    
    ## Why
    
    Clients need a stable app-server surface for reading and controlling
    materialized thread goals before the model tools and TUI can use them.
    Goal changes also need to be observable by app-server clients, including
    clients that resume an existing thread.
    
    ## What changed
    
    - Added v2 `thread/goal/get`, `thread/goal/set`, and `thread/goal/clear`
    RPCs for materialized threads.
    - Added `thread/goal/updated` and `thread/goal/cleared` notifications so
    clients can keep local goal state in sync.
    - Added resume/snapshot wiring so reconnecting clients see the current
    goal state for a thread.
    - Added app-server handlers that reconcile persisted rollout state
    before direct goal mutations.
    - Updated the app-server README plus generated JSON and TypeScript
    schema fixtures for the new API surface.
    
    ## Verification
    
    - Added app-server v2 coverage for goal get/set/clear behavior,
    notification emission, resume snapshots, and non-local thread-store
    interactions.
  • app-server: include filesystem entries in permission requests (#19086)
    ## Why
    
    `item/permissions/requestApproval` sends a requested permission profile
    to app-server clients. The core profile already stores filesystem
    permissions as `entries`, but the v2 compatibility conversion used the
    legacy `read`/`write` projection whenever possible and left `entries`
    unset.
    
    That made the request ambiguous for clients that consume the canonical
    v2 shape: `permissions.fileSystem.entries` was missing even though
    filesystem access was being requested. A client that rendered or echoed
    grants from `entries` could treat the request as having no filesystem
    permission entries, then return an empty or incomplete grant. The
    app-server intersects responses with the original request, so omitted
    filesystem permissions are denied.
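
    The failure mode can be illustrated with a simple set intersection. This
    is a sketch only: `effective_grant` is a hypothetical name, and plain
    strings stand in for the structured filesystem entries.

    ```rust
    use std::collections::BTreeSet;

    /// Permissions that take effect: only entries that were both requested
    /// and echoed back in the client's response.
    pub fn effective_grant(
        requested: &BTreeSet<String>,
        granted: &BTreeSet<String>,
    ) -> BTreeSet<String> {
        requested.intersection(granted).cloned().collect()
    }
    ```

    A client that saw no `entries` and echoed an empty grant ends up with an
    empty intersection, which is why the missing field led to denied
    filesystem access.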
    
    ## What Changed
    
    - Populate `AdditionalFileSystemPermissions.entries` when converting
    legacy read/write roots for request permission payloads, while
    preserving `read` and `write` for compatibility.
    - Mark `read` and `write` as transitional schema fields in the generated
    app-server schema.
    - Add regression coverage for the v2 conversion, the app-server
    `item/permissions/requestApproval` round trip, and TUI app-server
    approval conversion expectations.
    - Refresh generated JSON and TypeScript schema fixtures.
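
    The two behaviors described above (populating `entries` from legacy
    roots while keeping `read`/`write`, and intersecting a grant with the
    original request) can be sketched roughly as follows. The entry shape
    `{ path, access }` and the function names are hypothetical, not the
    generated app-server schema.

    ```typescript
    // Hypothetical shapes for illustration only.
    type FsEntrySketch = { path: string; access: "read" | "write" };
    type FsPermissionsSketch = { read?: string[]; write?: string[]; entries?: FsEntrySketch[] };

    // Fill `entries` from legacy roots while keeping `read`/`write` for older clients.
    function withEntries(legacy: FsPermissionsSketch): FsPermissionsSketch {
      const entries: FsEntrySketch[] = [
        ...(legacy.read ?? []).map((path): FsEntrySketch => ({ path, access: "read" })),
        ...(legacy.write ?? []).map((path): FsEntrySketch => ({ path, access: "write" })),
      ];
      return { ...legacy, entries };
    }

    // Keep only granted entries that were part of the original request,
    // so anything the client omits ends up denied.
    function intersectGrant(requested: FsEntrySketch[], granted: FsEntrySketch[]): FsEntrySketch[] {
      return granted.filter((g) =>
        requested.some((r) => r.path === g.path && r.access === g.access)
      );
    }
    ```

    With this model, a client that echoes an empty `entries` list back
    receives an empty grant, which is the failure mode the PR avoids.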
    
    ## Verification
    
    - `just fmt`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server request_permissions_round_trip`
    - `cargo test -p codex-tui
    converts_request_permissions_into_granted_permissions`
    - `cargo test -p codex-tui
    resolves_permissions_and_user_input_through_app_server_request_id`
  • Add safety check notification and error handling (#19055)
    Adds a new app-server notification that fires when a user account has
    been flagged for potential safety reasons.
  • feat(auto-review) short-circuit (#18890)
    ## Summary
    Short-circuit the conversation if auto-review hits too many denials.
    
    ## Testing
    - [x] Added unit tests
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • feat: add explicit AgentIdentity auth mode (#18785)
    ## Summary
    
    This PR adds `CodexAuth::AgentIdentity` as an explicit auth mode.
    
    An AgentIdentity auth record is a standalone `auth.json` mode. When
    `AuthManager::auth().await` loads that mode, it registers one
    process-scoped task and stores it in runtime-only state on the auth
    value. Header creation stays synchronous after that because the task is
    initialized before callers receive the auth object.
    
    This PR also removes the old feature flag path. AgentIdentity is
    selected by explicit auth mode, not by a hidden flag or lazy mutation of
    ChatGPT auth records.
    
    Reference old stack: https://github.com/openai/codex/pull/17387/changes
    
    ## Design Decisions
    
    - AgentIdentity is a real auth enum variant because it can be the only
    credential in `auth.json`.
    - The process task is ephemeral runtime state. It is not serialized and
    is not stored in rollout/session data.
    - Account/user metadata needed by existing Codex backend checks lives on
    the AgentIdentity record for now.
    - `is_chatgpt_auth()` remains token-specific.
    - `uses_codex_backend()` is the broader predicate for ChatGPT-token auth
    and AgentIdentity auth.
    
    ## Stack
    
    1. https://github.com/openai/codex/pull/18757: full revert
    2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
    crate
    3. This PR: explicit AgentIdentity auth mode and startup task allocation
    4. https://github.com/openai/codex/pull/18811: migrate Codex backend
    auth callsites through AuthProvider
    5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
    and load `CODEX_AGENT_IDENTITY`
    
    ## Testing
    
    Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
  • [tool search] support namespaced deferred dynamic tools (#18413)
    Deferred dynamic tools need to round-trip a namespace so a tool returned
    by `tool_search` can be called through the same registry key that core
    uses for dispatch.
    
    This change adds namespace support for dynamic tool specs/calls,
    persists it through app-server thread state, and routes dynamic tool
    calls by full `ToolName` while still sending the app the leaf tool name.
    Deferred dynamic tools must provide a namespace; non-deferred dynamic
    tools may remain top-level.
    
    It also introduces `LoadableToolSpec` as the shared
    function-or-namespace Responses shape used by both `tool_search` output
    and dynamic tool registration, so dynamic tools use the same wrapping
    logic in both paths.
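
    A minimal sketch of the routing rule, assuming a simplified model of
    `ToolName`: dispatch looks up the full name, while the app only sees the
    leaf. The `/` separator, the field names, and the helper names are
    assumptions for illustration.

    ```typescript
    // Hypothetical model of a namespaced tool name; the real `ToolName` may differ.
    type FullToolNameSketch = { namespace?: string; name: string };
    type DynamicToolSpecSketch = { tool: FullToolNameSketch; deferred: boolean };

    // Registry key combining namespace and leaf name ("/" is an assumed separator).
    function registryKey(tool: FullToolNameSketch): string {
      return tool.namespace === undefined ? tool.name : `${tool.namespace}/${tool.name}`;
    }

    // Deferred tools must carry a namespace; non-deferred tools may stay top-level.
    function isValidSpec(spec: DynamicToolSpecSketch): boolean {
      return !spec.deferred || spec.tool.namespace !== undefined;
    }

    // Dispatch by full key, but hand the app only the leaf tool name.
    function leafNameForCall(
      registry: Record<string, DynamicToolSpecSketch>,
      tool: FullToolNameSketch
    ): string | undefined {
      const spec: DynamicToolSpecSketch | undefined = registry[registryKey(tool)];
      return spec === undefined ? undefined : spec.tool.name;
    }
    ```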
    
    Validation:
    - `cargo test -p codex-tools`
    - `cargo test -p codex-core tool_search`
    
    ---------
    
    Co-authored-by: Sayan Sisodiya <sayan@openai.com>
  • feat(auto-review) Handle request_permissions calls (#18393)
    ## Summary
    When auto-review is enabled, it should handle the `request_permissions`
    tool. We'll need to clean up the UX, but I'm planning to do that in a
    separate pass.
    
    ## Testing
    - [x] Ran locally
    <img width="893" height="396" alt="Screenshot 2026-04-17 at 1 16 13 PM"
    src="https://github.com/user-attachments/assets/4c045c5f-1138-4c6c-ac6e-2cb6be4514d8"
    />
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Wire the PatchUpdated events through app_server (#18289)
    Wires `patch_updated` events through app_server. These events are parsed
    and streamed while `apply_patch` is being written by the model. Also adds
    500ms of buffering to the `patch_updated` events in the diff_consumer.
    
    The eventual goal is to use this to display better progress indicators in
    the codex app.
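
    One way to picture the 500ms buffering is a small coalescing buffer.
    The sketch below is not the diff_consumer implementation: the "latest
    update wins" policy, the class name, and the explicit `nowMs` clock
    parameter are all assumptions chosen to keep the example deterministic.

    ```typescript
    // Forwards at most one update per window and keeps the newest one pending.
    class PatchUpdateBuffer<T> {
      private pending: T | undefined = undefined;
      private lastEmitMs: number | undefined = undefined;
      private readonly windowMs: number;

      constructor(windowMs: number = 500) {
        this.windowMs = windowMs;
      }

      // Offer an update at time `nowMs`; returns the update to forward, if any.
      push(update: T, nowMs: number): T | undefined {
        if (this.lastEmitMs === undefined || nowMs - this.lastEmitMs >= this.windowMs) {
          this.lastEmitMs = nowMs;
          this.pending = undefined; // the forwarded update supersedes anything buffered
          return update;
        }
        this.pending = update;
        return undefined;
      }

      // Flush whatever is still buffered, e.g. once the patch is complete.
      flush(): T | undefined {
        const out = this.pending;
        this.pending = undefined;
        return out;
      }
    }
    ```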
  • feat: Budget skill metadata and surface trimming as a warning (#18298)
    Cap the model-visible skills section to a small share of the context
    window, with a fallback character budget, and keep only as many implicit
    skills as fit within that budget.
    
    Emit a non-fatal warning when enabled skills are omitted, and add a new
    app-server warning notification.
    
    Record thread-start skill metrics for total enabled skills, kept skills,
    and whether truncation happened.
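
    The budgeting rule can be sketched as follows. The actual share,
    fallback size, and cost measure are not stated in this PR, so they are
    caller-supplied placeholders here, and the type and function names are
    hypothetical.

    ```typescript
    type SkillMetaSketch = { name: string; description: string };

    // Budget in characters: a share of the context window when its size is
    // known, otherwise a fixed fallback. All numbers are placeholders.
    function skillBudgetChars(
      contextWindowChars: number | undefined,
      share: number,
      fallbackChars: number
    ): number {
      return contextWindowChars === undefined
        ? fallbackChars
        : Math.floor(contextWindowChars * share);
    }

    // Keep skills in order until the next one would exceed the budget.
    function trimSkills(
      skills: SkillMetaSketch[],
      budgetChars: number
    ): { kept: SkillMetaSketch[]; truncated: boolean } {
      const kept: SkillMetaSketch[] = [];
      let used = 0;
      for (const skill of skills) {
        const cost = skill.name.length + skill.description.length;
        if (used + cost > budgetChars) break;
        kept.push(skill);
        used += cost;
      }
      return { kept, truncated: kept.length < skills.length };
    }
    ```

    The `truncated` flag corresponds to the case where a warning would be
    surfaced and the truncation metric recorded.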
    
    ---------
    
    Co-authored-by: Matthew Zeng <mzeng@openai.com>
    Co-authored-by: Codex <noreply@openai.com>
  • [codex] Propagate rate limit reached type (#18227)
    ## Summary
    
    First PR in the split from #17956.
    
    - adds the core/app-server `RateLimitReachedType` shape
    - maps backend `rate_limit_reached_type` into Codex rate-limit snapshots
    - carries the field through app-server notifications/responses and
    generated schemas
    - updates existing constructors/tests for the new optional field
    
    ## Validation
    
    - `cargo test -p codex-backend-client`
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-app-server rate_limits`
    - `cargo test -p codex-tui workspace_`
    - `cargo test -p codex-tui status_`
    - `just fmt`
    - `just fix -p codex-backend-client`
    - `just fix -p codex-app-server-protocol`
    - `just fix -p codex-app-server`
    - `just fix -p codex-tui`
  • Guardian -> Auto-Review (#18021)
    This PR is a user-facing change for our rebranding of guardian to
    auto-review.
  • Add PermissionRequest hooks support (#17563)
    ## Why
    
    We need `PermissionRequest` hook support!
    
    Also addresses:
    - https://github.com/openai/codex/issues/16301
      - run a script on Hook to do things like play a sound to draw attention
        but actually no-op so the user can still approve
      - can omit the `decision` object from output or just have the script
        exit 0 and print nothing
    - https://github.com/openai/codex/issues/15311
      - let the script approve/deny on its own
      - external UI that will run on Hook and relay the decision back to codex
    
    
    ## Reviewer Note
    
    There's a lot of plumbing for the new hook. Key files to review are:
    - New hook added in `codex-rs/hooks/src/events/permission_request.rs`
    - Wiring for network approvals
    `codex-rs/core/src/tools/network_approval.rs`
    - Wiring for tool orchestrator `codex-rs/core/src/tools/orchestrator.rs`
    - Wiring for execve
    `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs`
    
    ## What
    
    - Wires shell, unified exec, and network approval prompts into the
    `PermissionRequest` hook flow.
    - Lets hooks allow or deny approval prompts; quiet or invalid hooks fall
    back to the normal approval path.
    - Uses `tool_input.description` for user-facing context when it helps:
      - shell / `exec_command`: the request justification, when present
      - network approvals: `network-access <domain>`
    - Uses `tool_name: Bash` for shell, unified exec, and network approval
    permission-request hooks.
    - For network approvals, passes the originating command in
    `tool_input.command` when there is a single owning call; otherwise falls
    back to the synthetic `network-access ...` command.
    
    <details>
    <summary>Example `PermissionRequest` hook input for a shell
    approval</summary>
    
    ```json
    {
      "session_id": "<session-id>",
      "turn_id": "<turn-id>",
      "transcript_path": "/path/to/transcript.jsonl",
      "cwd": "/path/to/cwd",
      "hook_event_name": "PermissionRequest",
      "model": "gpt-5",
      "permission_mode": "default",
      "tool_name": "Bash",
      "tool_input": {
        "command": "rm -f /tmp/example"
      }
    }
    ```
    
    </details>
    
    <details>
    <summary>Example `PermissionRequest` hook input for an escalated
    `exec_command` request</summary>
    
    ```json
    {
      "session_id": "<session-id>",
      "turn_id": "<turn-id>",
      "transcript_path": "/path/to/transcript.jsonl",
      "cwd": "/path/to/cwd",
      "hook_event_name": "PermissionRequest",
      "model": "gpt-5",
      "permission_mode": "default",
      "tool_name": "Bash",
      "tool_input": {
        "command": "cp /tmp/source.json /Users/alice/export/source.json",
        "description": "Need to copy a generated file outside the workspace"
      }
    }
    ```
    
    </details>
    
    <details>
    <summary>Example `PermissionRequest` hook input for a network
    approval</summary>
    
    ```json
    {
      "session_id": "<session-id>",
      "turn_id": "<turn-id>",
      "transcript_path": "/path/to/transcript.jsonl",
      "cwd": "/path/to/cwd",
      "hook_event_name": "PermissionRequest",
      "model": "gpt-5",
      "permission_mode": "default",
      "tool_name": "Bash",
      "tool_input": {
        "command": "curl http://codex-network-test.invalid",
        "description": "network-access http://codex-network-test.invalid"
      }
    }
    ```
    
    </details>
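
    To make the allow / deny / stay-quiet behavior concrete, here is a
    sketch of the decision logic a hook might apply to the example payloads
    above. The allowlist policy and the function name are hypothetical, and
    how the decision is serialized into hook output is deliberately left
    out because this PR description does not show it.

    ```typescript
    // Input fields mirror the example payloads above (only the ones used here).
    type PermissionRequestInputSketch = {
      hook_event_name: string;
      tool_name: string;
      tool_input: { command: string; description?: string };
    };

    // Returns "allow" or "deny" for network approvals, and undefined to stay
    // quiet so the normal approval path is used.
    function decidePermission(
      input: PermissionRequestInputSketch,
      allowedTargets: string[]
    ): "allow" | "deny" | undefined {
      if (input.hook_event_name !== "PermissionRequest") return undefined;
      const description = input.tool_input.description;
      const prefix = "network-access ";
      if (description === undefined || description.indexOf(prefix) !== 0) {
        return undefined; // not a network approval: no opinion
      }
      const target = description.slice(prefix.length);
      return allowedTargets.indexOf(target) >= 0 ? "allow" : "deny";
    }
    ```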
    
    ## Follow-ups
    
    - Implement the `PermissionRequest` semantics for `updatedInput`,
    `updatedPermissions`, `interrupt`, and suggestions /
    `permission_suggestions`
    - Add `PermissionRequest` support for the `request_permissions` tool
    path
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • Sync local plugin imports, async remote imports, refresh caches after… (#18246)
    … import
    
    ## Why
    
    `externalAgentConfig/import` used to spawn plugin imports in the
    background and return immediately. That meant local marketplace imports
    could still be in flight when the caller refreshed plugin state, so
    newly imported plugins would not show up right away.
    
    This change makes local marketplace imports complete before the RPC
    returns, while keeping remote marketplace imports asynchronous so we do
    not block on remote fetches.
    
    ## What changed
    
    - split plugin migration details into local and remote marketplace
    imports based on the external config source
    - import local marketplaces synchronously during
    `externalAgentConfig/import`
    - return pending remote plugin imports to the app-server so it can
    finish them in the background
    - clear the plugin and skills caches before responding to plugin
    imports, and again after background remote imports complete, so the next
    `plugin/list` reloads fresh state
    - keep marketplace source parsing encapsulated behind
    `is_local_marketplace_source(...)` instead of re-exporting the internal
    enum
    - add core and app-server coverage for the synchronous local import path
    and the pending remote import path
    
    ## Verification
    
    - `cargo test -p codex-app-server-protocol`
    - `cargo test -p codex-core` (currently fails an existing unrelated
    test:
    `config_loader::tests::cli_override_can_update_project_local_mcp_server_when_project_is_trusted`)
    - `cargo test` (currently fails existing `codex-app-server` integration
    tests in MCP/skills/thread-start areas, plus the unrelated `codex-core`
    failure above)
  • Add codex_hook_run analytics event (#17996)
    # Why
    Add product analytics for hook handler executions so we can understand
    which hooks are running, where they came from, and whether they
    completed, failed, stopped, or blocked work.
    
    # What
    - add the new `codex_hook_run` analytics event and payload plumbing in
    `codex-rs/analytics`
    - emit hook-run analytics from the shared hook completion path in
    `codex-rs/core`
    - classify hook source from the loaded hook path as `system`, `user`,
    `project`, or `unknown`
    
    ```
    {
      "event_type": "codex_hook_run",
      "event_params": {
        "thread_id": "string",
        "turn_id": "string",
        "model_slug": "string",
        "hook_name": "string", // any HookEventName
        "hook_source": "system | user | project | unknown",
        "status": "completed | failed | stopped | blocked"
      }
    }
    ```
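
    The source classification could look roughly like the sketch below.
    The root directories are passed in by the caller because the real
    system, user, and project hook locations are not given here; the
    function names are hypothetical.

    ```typescript
    type HookSourceSketch = "system" | "user" | "project" | "unknown";
    type HookRootsSketch = { system: string; user: string; project: string };

    // True when `path` lies inside the directory `root`.
    function isUnderRoot(path: string, root: string): boolean {
      const prefix = root.charAt(root.length - 1) === "/" ? root : root + "/";
      return path.indexOf(prefix) === 0;
    }

    // Checks project first, then user, then system; anything else is unknown.
    function classifyHookSource(hookPath: string, roots: HookRootsSketch): HookSourceSketch {
      if (isUnderRoot(hookPath, roots.project)) return "project";
      if (isUnderRoot(hookPath, roots.user)) return "user";
      if (isUnderRoot(hookPath, roots.system)) return "system";
      return "unknown";
    }
    ```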
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • [codex][mcp] Add resource uri meta to tool call item. (#17831)
    - [x] Add resource uri meta to tool call item so that the app-server
    client can start prefetching resources immediately without loading mcp
    server status.
  • Spread AbsolutePathBuf (#17792)
    Mechanical change to promote absolute paths through code.
  • Add realtime output modality and transcript events (#17701)
    - Add outputModality to thread/realtime/start and wire text/audio output
    selection through app-server, core, API, and TUI.
    - Rename the realtime transcript delta notification and add a separate
    transcript done notification that forwards final text from item done
    without correlating it with deltas.
  • Support prolite plan type (#17419)
    Addresses #17353
    
    Problem: Codex rate-limit fetching failed when the backend returned the
    new `prolite` subscription plan type.
    
    Solution: Add `prolite` to the backend/account/auth plan mappings, keep
    unknown WHAM plan values decodable, and regenerate app-server plan
    schemas.
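
    Keeping unknown plan values decodable can be sketched as a decoder
    that never fails on an unrecognized string. The list of known plans
    below is a placeholder apart from `prolite`, and the type names are
    hypothetical.

    ```typescript
    // Placeholder list; only `prolite` is taken from this PR.
    const KNOWN_PLANS_SKETCH = ["free", "plus", "pro", "prolite"] as const;
    type KnownPlanSketch = (typeof KNOWN_PLANS_SKETCH)[number];

    type DecodedPlanSketch =
      | { kind: "known"; plan: KnownPlanSketch }
      | { kind: "unknown"; raw: string };

    // Unrecognized values are preserved instead of causing a decode error.
    function decodePlan(raw: string): DecodedPlanSketch {
      for (const plan of KNOWN_PLANS_SKETCH) {
        if (plan === raw) return { kind: "known", plan };
      }
      return { kind: "unknown", raw };
    }
    ```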
  • representing guardian review timeouts in protocol types (#17381)
    ## Summary
    
    - Add `TimedOut` to Guardian/review carrier types:
      - `ReviewDecision::TimedOut`
      - `GuardianAssessmentStatus::TimedOut`
      - app-server v2 `GuardianApprovalReviewStatus::TimedOut`
    - Regenerate app-server JSON/TypeScript schemas for the new wire shape.
    - Wire the new status through core/app-server/TUI mappings with
    conservative fail-closed handling.
    - Keep `TimedOut` non-user-selectable in the approval UI.
    
    **Does not change runtime behavior yet; emitting `TimedOut` and
    parent-model timeout messaging will come in follow-up PRs**
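
    "Conservative fail-closed handling" can be illustrated with a small
    sketch in which a timed-out review never permits execution and is never
    offered as a user choice. The status names other than the timed-out
    variant are assumptions, as are the function names.

    ```typescript
    // Simplified status set for illustration; the real enums have their own variants.
    type ReviewStatusSketch = "approved" | "denied" | "timedOut";

    // Fail closed: only an explicit approval lets the action proceed.
    function allowsExecution(status: ReviewStatusSketch): boolean {
      switch (status) {
        case "approved":
          return true;
        case "denied":
        case "timedOut":
          return false;
      }
    }

    // `timedOut` is a system outcome, so it is filtered out of UI choices.
    function userSelectableStatuses(all: ReviewStatusSketch[]): ReviewStatusSketch[] {
      return all.filter((s) => s !== "timedOut");
    }
    ```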
  • Revert "Option to Notify Workspace Owner When Usage Limit is Reached" (#17391)
    Reverts openai/codex#16969
    
    #sev3-2026-04-10-accountscheckversion-500s-for-openai-workspace-7300
  • fix(guardian, app-server): introduce guardian review ids (#17298)
    ## Description
    
    This PR introduces `review_id` as the stable identifier for guardian
    reviews and exposes it in app-server `item/autoApprovalReview/started`
    and `item/autoApprovalReview/completed` events.
    
    Internally, guardian rejection state is now keyed by `review_id` instead
    of the reviewed tool item ID. `target_item_id` is still included when a
    review maps to a concrete thread item, but it is no longer overloaded as
    the review lifecycle identifier.
    
    ## Motivation
    
    We'd like to give users the ability to preempt a guardian review while
    it's running (approve or decline).
    
    However, we can't implement the API that allows the user to override a
    running guardian review because we didn't have a unique `review_id` per
    guardian review. Using `target_item_id` is not correct since:
    - with execve reviews, there can be multiple execve calls (and therefore
    guardian reviews) per shell command
    - with network policy reviews, there is no target item ID
    
    The PR that actually implements user overrides will use `review_id` as
    the stable identifier.
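
    The sketch below shows why keying by `review_id` works where
    `target_item_id` does not: several reviews can share one target item,
    and a review may have no target at all. The record shape and helper
    names are hypothetical.

    ```typescript
    // Hypothetical record; only the two id concepts come from the PR text.
    type GuardianReviewRecord = { reviewId: string; targetItemId?: string; rejected: boolean };
    type GuardianReviewState = Record<string, GuardianReviewRecord>;

    // State is keyed by review id, so reviews never overwrite each other.
    function recordReview(state: GuardianReviewState, rec: GuardianReviewRecord): GuardianReviewState {
      return { ...state, [rec.reviewId]: rec };
    }

    // Look up every review id attached to a given thread item.
    function reviewIdsForItem(state: GuardianReviewState, targetItemId: string): string[] {
      return Object.keys(state).filter((id) => {
        const rec: GuardianReviewRecord | undefined = state[id];
        return rec !== undefined && rec.targetItemId === targetItemId;
      });
    }
    ```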
  • Option to Notify Workspace Owner When Usage Limit is Reached (#16969)
    ## Summary
    - Replace the manual `/notify-owner` flow with an inline confirmation
    prompt when a usage-based workspace member hits a credits-depleted
    limit.
    - Fetch the current workspace role from the live ChatGPT
    `accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches
    the desktop and web clients.
    - Keep owner, member, and spend-cap messaging distinct so we only offer
    the owner nudge when the workspace is actually out of credits.
    
    ## What Changed
    - `backend-client`
      - Added a typed fetch for the current account role from
        `accounts/check`.
      - Mapped backend role values into a Rust workspace-role enum.
    - `app-server` and protocol
      - Added `workspaceRole` to `account/read` and `account/updated`.
      - Derived `isWorkspaceOwner` from the live role, with a fallback to the
        cached token claim when the role fetch is unavailable.
    - `tui`
      - Removed the explicit `/notify-owner` slash command.
      - When a member is blocked because the workspace is out of credits, the
        error now prompts:
        - `Your workspace is out of credits. Request more from your workspace owner? [y/N]`
      - Choosing `y` sends the existing owner-notification request.
      - Choosing `n`, pressing `Esc`, or accepting the default selection
        dismisses the prompt without sending anything.
      - Selection popups now honor explicit item shortcuts, which is how the
        `y` / `n` interaction is wired.
    
    ## Reviewer Notes
    - The main behavior change is scoped to usage-based workspace members
    whose workspace credits are depleted.
    - Spend-cap reached should not show the owner-notification prompt.
    - Owners and admins should continue to see `/usage` guidance instead of
    the member prompt.
    - The live role fetch is best-effort; if it fails, we fall back to the
    existing token-derived ownership signal.
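
    The fallback and prompt rules in these notes can be sketched as two
    small functions. The role values are simplified to owner and member,
    and the limit labels and function names are hypothetical.

    ```typescript
    // Simplified role set; the real enum maps more backend values.
    type WorkspaceRoleSketch = "owner" | "member";

    // `liveRole` is undefined when the best-effort role fetch failed, in
    // which case the cached token claim decides.
    function deriveIsWorkspaceOwner(
      liveRole: WorkspaceRoleSketch | undefined,
      cachedClaimIsOwner: boolean
    ): boolean {
      return liveRole === undefined ? cachedClaimIsOwner : liveRole === "owner";
    }

    // Only non-owners whose workspace is out of credits get the prompt;
    // a spend-cap limit never triggers it.
    function shouldOfferOwnerNudge(
      isOwner: boolean,
      limit: "credits_depleted" | "spend_cap"
    ): boolean {
      return !isOwner && limit === "credits_depleted";
    }
    ```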
    
    ## Testing
    - Manual verification
      - Workspace owner does not see the member prompt.
      - Workspace member with depleted credits sees the confirmation prompt
        and can send the nudge with `y`.
      - Workspace member with spend cap reached does not see the
        owner-notification prompt.
    
    ### Workspace member out of usage
    
    https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1
    
    ### Workspace owner
    <img width="1728" height="1086" alt="Screenshot 2026-04-09 at 11 48
    22 AM"
    src="https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6"
    />
  • Update guardian output schema (#17061)
    ## Summary
    - Update guardian output schema to separate risk, authorization,
    outcome, and rationale.
    - Feed guardian rationale into rejection messages.
    - Split the guardian policy into template and tenant-config sections.
    
    ## Validation
    - `cargo test -p codex-core mcp_tool_call`
    - `env -u CODEX_SANDBOX_NETWORK_DISABLED INSTA_UPDATE=always cargo test
    -p codex-core guardian::`
    
    ---------
    
    Co-authored-by: Owen Lin <owen@openai.com>
  • Add WebRTC transport to realtime start (#16960)
    Adds WebRTC startup to the experimental app-server
    `thread/realtime/start` method with an optional transport enum. The
    websocket path remains the default; WebRTC offers create the realtime
    session through the shared start flow and emit the answer SDP via
    `thread/realtime/sdp`.
    
    ---------
    
    Co-authored-by: Codex <noreply@openai.com>
  • app-server: Move watch_id to request of fs/watch (#17026)
    It's easier for clients to maintain watchers if they define the watch
    id, so move it into the request.
    It's not used yet, so this should be a safe change.
  • [codex-analytics] add protocol-native turn timestamps (#16638)
    ---
    [//]: # (BEGIN SAPLING FOOTER)
    Stack created with [Sapling](https://sapling-scm.com). Best reviewed
    with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
    * #16870
    * #16706
    * #16659
    * #16641
    * #16640
    * __->__ #16638
  • Fix fork source display in /status (expose forked_from_id in app server) (#16596)
    Addresses #16560
    
    Problem: `/status` stopped showing the source thread id in forked TUI
    sessions after the app-server migration.
    
    Solution: Carry fork source ids through app-server v2 thread data and
    the TUI session adapter, and update TUI fixtures so `/status` matches
    the old TUI behavior.
  • fix(guardian): make GuardianAssessmentEvent.action strongly typed (#16448)
    ## Description
    
    Previously the `action` field on `EventMsg::GuardianAssessment`, which
    describes what Guardian is reviewing, was typed as an arbitrary JSON
    blob. This PR cleans it up and defines a sum type representing all the
    various actions that Guardian can review.
    
    This is a breaking change (on purpose), which is fine because:
    - the Codex app / VSCE does not actually use `action` at the moment
    - the TUI code that consumes `action` is updated in this PR as well
    - rollout files that serialized old `EventMsg::GuardianAssessment` will
    just silently drop these guardian events
    - the contract is defined as unstable, so other clients have a fair
    warning :)
    
    This will make things much easier for followup Guardian work.
    
    ## Why
    
    The old guardian review payloads worked, but they pushed too much shape
    knowledge into downstream consumers. The TUI had custom JSON parsing
    logic for commands, patches, network requests, and MCP calls, and the
    app-server protocol was effectively just passing through an opaque blob.
    
    Typing this at the protocol boundary makes the contract clearer.
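
    For a sense of what a typed `action` buys consumers, here is a sketch
    of a sum type covering the four kinds of action mentioned above
    (commands, patches, network requests, MCP calls). The variant and field
    names are invented for illustration and are not the protocol's.

    ```typescript
    // Hypothetical variants; the real protocol type defines its own fields.
    type GuardianActionSketch =
      | { type: "command"; command: string }
      | { type: "patch"; files: string[] }
      | { type: "networkRequest"; host: string }
      | { type: "mcpToolCall"; server: string; tool: string };

    // The compiler checks that every variant is handled, replacing ad hoc JSON parsing.
    function describeGuardianAction(action: GuardianActionSketch): string {
      switch (action.type) {
        case "command":
          return `run ${action.command}`;
        case "patch":
          return `patch ${action.files.length} file(s)`;
        case "networkRequest":
          return `connect to ${action.host}`;
        case "mcpToolCall":
          return `call ${action.server}/${action.tool}`;
      }
    }
    ```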
  • Add usage-based business plan types (#15934)
    ## Summary
    - add `self_serve_business_usage_based` and `enterprise_cbp_usage_based`
    to the public/internal plan enums and regenerate the app-server + Python
    SDK artifacts
    - map both plans through JWT login and backend rate-limit payloads, then
    bucket them with the existing Team/Business entitlement behavior in
    cloud requirements, usage-limit copy, tooltips, and status display
    - keep the earlier display-label remap commit on this branch so the new
    Team-like and Business-like plans render consistently in the UI
    
    ## Testing
    - `just write-app-server-schema`
    - `uv run --project sdk/python python
    sdk/python/scripts/update_sdk_artifacts.py generate-types`
    - `just fix -p codex-protocol -p codex-login -p codex-core -p
    codex-backend-client -p codex-cloud-requirements -p codex-tui -p
    codex-tui-app-server -p codex-backend-openapi-models`
    - `just fmt`
    - `just argument-comment-lint`
    - `cargo test -p codex-protocol
    usage_based_plan_types_use_expected_wire_names`
    - `cargo test -p codex-login usage_based`
    - `cargo test -p codex-backend-client usage_based`
    - `cargo test -p codex-cloud-requirements usage_based`
    - `cargo test -p codex-core usage_limit_reached_error_formats_`
    - `cargo test -p codex-tui plan_type_display_name_remaps_display_labels`
    - `cargo test -p codex-tui remapped`
    - `cargo test -p codex-tui-app-server
    plan_type_display_name_remaps_display_labels`
    - `cargo test -p codex-tui-app-server remapped`
    - `cargo test -p codex-tui-app-server
    preserves_usage_based_plan_type_wire_name`
    
    ## Notes
    - a broader multi-crate `cargo test` run still hits unrelated existing
    guardian-approval config failures in
    `codex-rs/core/src/config/config_tests.rs`
  • [hooks] add non-streaming (non-stdin style) shell-only PostToolUse support (#15531)
    CHAINED PR - note that base is eternal/hooks-pretooluse-bash, not main
    -- so the following PR should be first
    
    Matching post-tool hook to the pre-tool functionality here:
    https://github.com/openai/codex/pull/15211
    
    PreToolUse runs for plain shell calls and allows blocking. This
    PostToolUse hook runs after the command has executed.
    
    example run:
    
    ```
    › as a test, run in parallel the following commands:
      - echo 'one'
      - echo '[block-pre-tool-use]'
      - echo '[block-post-tool-use]'
    
    
    ⚠ MCP startup incomplete (failed: notion, linear)
    
    • Cruising through those three commands in parallel now, and I’ll share the exact outputs right after
      they land.
    
    • Running PreToolUse hook: checking the observatory runes
    
    • Running PreToolUse hook: checking the observatory runes
    
    • Running PreToolUse hook: checking the observatory runes
    
    PreToolUse hook (blocked)
      warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
      feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo 'one'
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo '[block-post-tool-use]'
    
    • Ran echo '[block-post-tool-use]'
      └ [block-post-tool-use]
    
    • Ran echo 'one'
      └ one
    
    • Running PostToolUse hook: reading the observatory aftermath
    
    • Running PostToolUse hook: reading the observatory aftermath
    
    PostToolUse hook (blocked)
      warning: wizard-tower PostToolUse demo blocked the Bash follow-through on purpose.
      feedback: PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to
    continue.
    
    PostToolUse hook (completed)
      warning: wizard-tower PostToolUse demo inspected Bash: echo 'one'
      hook context: PostToolUse demo saw the Bash result. Response preview: one
    
    ──────────────────────────────────────────────────────────────────────────────────────────────────────
    
    • Ran them in parallel. Little wave report:
    
      - echo 'one' completed and returned one.
      - echo '[block-pre-tool-use]' was blocked before execution by the PreToolUse hook.
      - echo '[block-post-tool-use]' ran, but its result was blocked afterward by the PostToolUse hook.
    
      The hook messages were:
    
      - PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
      - PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to continue.
    
    
    › for the blocked messages (correctly blocked in this test) did the tool output land back into
      context?
    
    
    • Yeah, from this run:
    
      - PreToolUse block: the command itself did not execute, so no command output landed in context. Only
        the block message did.
      - PostToolUse block: the command did execute, but the actual stdout did not come back into context.
        Only the post-tool block message did.
    
      So the blocked notifications landed in context, but the blocked command payloads themselves did not.
      Nice clean guardrail, aloha.
    ```
  • app-server: add filesystem watch support (#14533)
    ### Summary
    Add the v2 app-server filesystem watch RPCs and notifications, wire them
    through the message processor, and implement connection-scoped watches
    with notify-backed change delivery. This also updates the schema
    fixtures, app-server documentation, and the v2 integration coverage for
    watch and unwatch behavior.
    
    This allows clients to efficiently watch for filesystem updates, e.g. to
    react to branch changes.
    
    ### Testing
    - exercise watch lifecycles for directory changes, atomic file
    replacement, missing-file targets, and unwatch cleanup
  • [hooks] add non-streaming (non-stdin style) shell-only PreToolUse support (#15211)
    - add `PreToolUse` hook for bash-like tool execution only at first
    - block shell execution before dispatch with deny-only hook behavior
    - introduce a `common.rs` matcher framework for deciding when hooks run
    
    example run:
    
    ```
    › run three parallel echo commands, and the second one should echo "[block-pre-tool-use]" as a test
    
    
    • Running the three echo commands in parallel now and I’ll report the output directly.
    
    • Running PreToolUse hook: name for demo pre tool use hook
    
    • Running PreToolUse hook: name for demo pre tool use hook
    
    • Running PreToolUse hook: name for demo pre tool use hook
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo "first parallel echo"
      
    PreToolUse hook (blocked)
      warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
      feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
    
    PreToolUse hook (completed)
      warning: wizard-tower PreToolUse demo inspected Bash: echo "third parallel echo"
    
    • Ran echo "first parallel echo"
      └ first parallel echo
    
    • Ran echo "third parallel echo"
      └ third parallel echo
    
    • Three little waves went out in parallel.
    
      1. printed first parallel echo
      2. was blocked before execution because it contained the exact test string [block-pre-tool-use]
      3. printed third parallel echo
    
      There was also an unrelated macOS defaults warning around the successful commands, but the echoes
      themselves worked fine. If you want, I can rerun the second one with a slightly modified string so
      it passes cleanly.
    ```