42 Commits

  • Load executor skills without host path conversion (#29626)
    ## Why
    
    After #28918, selected skill roots are `PathUri`, but the executor skill
    provider still converts them to the app-server host's `AbsolutePathBuf`.
    A foreign Windows root therefore cannot be discovered by a Unix host,
    and the inverse has the same problem.
    
    This PR keeps executor skill discovery and reads on the filesystem that
    owns the selected root while reusing the existing skill rules.
    
    ## What changed
    
    - Generalize the existing skill traversal to operate on `PathUri`
    through `ExecutorFileSystem`, preserving its depth, directory, symlink,
    and sibling-metadata concurrency behavior.
    - Add a small environment skill loader that reuses the shared discovery,
    frontmatter validation, dependency parsing, product policy, and
    prompt-visibility rules.
    - Keep the environment id and entrypoint `PathUri` in the skill catalog,
    then route `skills.read` back through the same environment filesystem.
    - Preserve the executor's path convention when deriving catalog handles,
    including literal backslashes in POSIX filenames.
    - Resolve plugin namespaces from nearby manifests through URI-native
    filesystem reads.
    - Cover foreign Windows roots, executor-owned reads, namespaces,
    metadata, policy, and path identity.
    
    ```text
    selected root (PathUri)
            |
            v
    shared discovery over ExecutorFileSystem
            |
            v
    environment-bound catalog entry --skills.read--> same ExecutorFileSystem
    ```
    
    No second filesystem abstraction or duplicate traversal implementation
    is introduced.
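    
    A minimal sketch of the path-convention point above, assuming a hypothetical `PathConvention` enum and `final_component` helper (neither name comes from this PR). It illustrates why a backslash has to stay inside a POSIX file name while acting as a separator under the Windows convention:
    
    ```rust
    /// Path convention of the filesystem that owns a path (hypothetical type).
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum PathConvention {
        Posix,
        Windows,
    }
    
    /// Returns the final path component under the owning convention.
    /// POSIX treats only `/` as a separator, so a backslash is part of the
    /// name; Windows treats both `/` and `\` as separators.
    pub fn final_component(path: &str, convention: PathConvention) -> &str {
        let is_separator = move |c: char| match convention {
            PathConvention::Posix => c == '/',
            PathConvention::Windows => c == '/' || c == '\\',
        };
        path.trim_end_matches(is_separator)
            .rsplit(is_separator)
            .next()
            .unwrap_or("")
    }
    ```
    
    Under these assumptions, `/skills/a\b` keeps `a\b` as one POSIX name, while `C:\skills\demo` yields `demo` under the Windows convention.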
    
    ## Stack
    
    1. #29614 — add lexical `PathUri` containment.
    2. #29620 — share URI-native manifest path resolution.
    3. #28918 — keep selected plugin roots and resources URI-native.
    4. **This PR** — load executor skills without host path conversion.
    5. #29628 — resolve executor MCP working directories without host path
    conversion.
  • Fix goal-first live threads missing from thread/list (#28808)
    Fixes #28263.
    
    ## Why
    
    When a thread starts with `/goal`, the goal extension can update SQLite
    goal state before the thread has any user-turn rollout items.
    `thread/list` and `thread/search` rely on persisted listing metadata, so
    a goal-first live thread could be absent from app-server listings after
    restart even though the goal itself existed.
    
    This regressed when goal handling moved out of core: the core path wrote
    the goal update through the live thread rollout path, while the
    extension-backed app-server path only updated goal state and emitted the
    live notification.
    
    ## What
    
    - Add `GoalSetOutcome::thread_goal_updated_item()` so the goal extension
    owns the canonical `ThreadGoalUpdated` rollout item shape.
    - Expose a narrow `CodexThread::append_rollout_items()` helper that
    appends through the live thread and keeps derived SQLite metadata in
    sync.
    - When app-server sets a goal on an active live thread, persist the goal
    update through that live-thread path.
    - Add an app-server regression test that starts a live thread with
    `thread/goal/set` and verifies it appears in state-DB-only
    `thread/list`.
    
    ## Verification
    
    - `env -u CODEX_SQLITE_HOME just test -p codex-app-server
    goal_first_live_thread_appears_in_state_db_thread_list`
  • [codex] Use expect in integration tests (#28441)
    The workspace denies `clippy::expect_used` in production. Although
    `clippy.toml` allows `expect` in tests, Bazel Clippy compiles
    integration-test helper code in a way that does not receive that
    exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
    and equivalent `match`/`let else` forms.
    
    This allows `clippy::expect_used` once at each integration-test crate
    root (including aggregated suites and test-support libraries), then
    replaces manual panic-based Result and Option unwraps with
    `expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
    crate roots. Intentional assertion and unexpected-variant panics remain
    unchanged, and the production `expect_used = "deny"` lint remains in
    place.
    
    The cleanup is mechanical and net-negative in line count.
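    
    As an illustration of the mechanical rewrite, here is a hedged before/after sketch. `parse_port`, `port_verbose`, and `port_concise` are hypothetical names, and the `allow` is attached to a function only to keep the sketch self-contained; the PR places it once at each integration-test crate root.
    
    ```rust
    use std::num::ParseIntError;
    
    /// Hypothetical helper standing in for any fallible test setup call.
    pub fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
        raw.trim().parse::<u16>()
    }
    
    /// Before: the manual panic form that the missing exemption encouraged.
    pub fn port_verbose(raw: &str) -> u16 {
        parse_port(raw).unwrap_or_else(|err| panic!("port should parse: {err}"))
    }
    
    /// After: the same unwrap written with `expect`.
    #[allow(clippy::expect_used)]
    pub fn port_concise(raw: &str) -> u16 {
        parse_port(raw).expect("port should parse")
    }
    ```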
  • skills: hide orchestrator skills with a local executor (#28333)
    ## Why
    
    App-server threads without a local executor need orchestrator-owned
    skills from the hosted `codex_apps` MCP server. Threads with the local
    executor already discover installed skills from the local filesystem.
    
    After the orchestrator skill provider was enabled for every app-server
    thread, local-executor threads also received the hosted skill catalog
    and the `skills.list` and `skills.read` tools. This changed the existing
    local behavior and could expose a second hosted copy of a skill that was
    already installed locally.
    
    ## What changed
    
    - Expose the thread's selected execution environments to extensions at
    thread startup.
    - Enable orchestrator skills only when the reserved local environment is
    not selected.
    - Apply that decision consistently to hosted skill catalog discovery,
    explicit skill injection, and the `skills.list` and `skills.read` tools.
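    
    The gating rule above can be sketched as a single predicate. The identifier `"local"` and the function name are assumptions made for illustration; the PR does not spell out how the reserved local environment is identified.
    
    ```rust
    /// Hypothetical identifier for the reserved local environment.
    pub const LOCAL_ENVIRONMENT_ID: &str = "local";
    
    /// Orchestrator skills are enabled only when the local environment is
    /// not among the thread's selected execution environments.
    pub fn orchestrator_skills_enabled(selected_environment_ids: &[&str]) -> bool {
        !selected_environment_ids.contains(&LOCAL_ENVIRONMENT_ID)
    }
    ```
    
    Computing the decision once and passing the resulting flag to catalog discovery, skill injection, and tool registration is one way to keep the three call sites consistent.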
    
    ## Verification
    
    - The existing no-executor app-server test continues to verify hosted
    skill discovery, invocation, and child-resource reads.
    - A new app-server test verifies that local-executor threads do not
    receive hosted skill context or `skills.*` tools.
  • build: run buildifier from just fmt (#28125)
    ## Intent
    
    Keep Bazel and Starlark files consistently formatted without requiring
    contributors to install or version buildifier themselves.
    
    ## Implementation
    
    - Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
    v8.5.1.
    - Run buildifier from the shared `just fmt` and `just fmt-check` driver,
    with Windows-safe explicit DotSlash invocation.
    - Provision DotSlash in formatting CI and contributor devcontainers, and
    document the source-build prerequisite.
    - Apply the initial mechanical buildifier formatting baseline.
  • Route image extension reads through turn environments v2 (#27498)
    ## Why
    
    Image generation used `std::fs::read` for referenced image paths, which
    did not support environment-backed filesystems or their sandbox context.
    
    ## What changed
    
    - Expose optional turn environments to extension tool calls.
    - Include each environment’s ID, working directory, filesystem, and
    sandbox context.
    - Read referenced images through the selected environment filesystem.
    - Keep sandbox usage at the extension call site so extensions can choose
    the appropriate access mode.
    - Consolidate image request construction into one async function.
    - Add coverage for successful environment reads and read failures.
    
    ## Validation
    
    - `cargo check -p codex-image-generation-extension --tests`
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    `just test -p codex-image-generation-extension` could not complete
    because the build exhausted available disk space.
  • [codex] Remove async_trait from ToolExecutor (#27304)
    ## Why
    
    We're now [discouraging use of
    `async_trait`](https://github.com/openai/codex/pull/20242).
    
    Removing use of `async_trait` from `ToolExecutor` cuts `codex_core`
    debug test build time by ~78% (from 227.5s to 50.3s) on my machine.
    
    Stacked on #27299, this PR applies the trait change after the handler
    bodies have been outlined.
    
    ## What
    
    Changed `ToolExecutor::handle` to return an explicit boxed
    `ToolExecutorFuture` instead of using `async_trait`.
    
    Updated `ToolExecutor` implementors to return `Box::pin(...)`,
    re-exported the future alias through `codex-tools` and
    `codex-extension-api`, and removed the direct `async-trait` dependency
    from `codex-tools`.
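    
    A simplified sketch of the shape described above. The alias definition, the trait signature, `EchoTool`, and the tiny `block_on` helper are all assumptions for illustration; the real `ToolExecutor` takes different arguments.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};
    
    /// Boxed, `Send` future returned by the trait method (assumed alias shape).
    pub type ToolExecutorFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;
    
    /// Stand-in trait: the return type is spelled out, so no macro is needed.
    pub trait ToolExecutor: Send + Sync {
        fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a, String>;
    }
    
    pub struct EchoTool;
    
    impl ToolExecutor for EchoTool {
        fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a, String> {
            // The async body is boxed explicitly by the implementor.
            Box::pin(async move { format!("echo: {input}") })
        }
    }
    
    struct NoopWake;
    
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    
    /// Tiny executor for the sketch. It busy-polls, so it only suits futures
    /// that never wait on external events.
    pub fn block_on<T>(mut future: ToolExecutorFuture<'_, T>) -> T {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        loop {
            if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
                return value;
            }
        }
    }
    ```
    
    Because `handle` returns a concrete boxed type, the trait stays usable as `Box<dyn ToolExecutor>`.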
  • Remove async-trait from extension contributors (#27383)
    ## Why
    
    Extension contributors are registered behind `dyn Trait` objects, so
    native `async fn`/RPITIT methods would make these traits
    non-object-safe. Spell out the boxed, `Send` future contract directly so
    `extension-api` no longer needs `async-trait` while retaining the
    existing runtime model.
    
    ## What changed
    
    - add a shared `ExtensionFuture` alias and use it for asynchronous
    contributor methods
    - migrate production and test implementations to return `Box::pin(async
    move { ... })`
    - remove `async-trait` dependencies where they are no longer used,
    keeping it dev-only where unrelated test executors still require it
    
    ## Behavior
    
    No behavior change is intended. Contributor futures remain boxed,
    `Send`, dynamically dispatched, and lazily executed; cancellation and
    callback ordering stay unchanged.
    
    ## Testing
    
    - `just test -p codex-extension-api` (11 passed)
    - affected extension crates (64 passed)
    - targeted `codex-core` contributor tests (14 passed)
    - `just fmt`
    - `just bazel-lock-update`
    - `just bazel-lock-check`
    
    A broad local `codex-core` run compiled successfully but encountered
    unrelated sandbox and missing test-binary fixture failures; CI will run
    the full checks.
  • [codex-analytics] emit goal lifecycle analytics (#27078)
    ## Why
    - Currently, there is no analytics event for `/goal` behavior
    - Existing events cannot identify goal execution or its resulting
    outcome
    - The original update in
    [#26182](https://github.com/openai/codex/pull/26182) was implemented
    before `/goal` moved into `codex-goal-extension`.
    
    ## What Changed
    - Adds `codex_goal_event` serialization and enrichment to
    `codex-analytics`
    - Emits goal events from the canonical `codex-goal-extension` mutation
    and accounting paths:
      - `created` when a new logical goal is persisted
      - `usage_accounted` when cumulative goal usage is persisted
      - `status_changed` when the stored goal status changes
      - `cleared` when the goal is deleted
    - Preserves causal `turn_id` for turn-driven events and uses null
    attribution for external or idle lifecycle events
    - Changes goal deletion to return the deleted row so `cleared` retains
    the stable goal ID
    
    ## Event Details
    
    Includes standard analytics metadata along with goal-specific fields:
    - `goal_id`: Stable ID stored in the local SQLite goal row and shared
    across the goal's events
    - `event_kind`: Observed operation (one of the four lifecycle events
    listed above)
    - `goal_status`: Resulting or last stored status: `active`, `paused`,
    `blocked`, `usage_limited`, etc.
    - `has_token_budget`: Indicates whether a token budget is configured
    - `turn_id`: Causal turn ID, or null when no causal turn exists
    - `cumulative_tokens_accounted`: Cumulative tokens on `usage_accounted`
    events; null otherwise
    - `cumulative_time_accounted_seconds`: Cumulative active time on
    `usage_accounted` events; null otherwise
    
    ## Validation
    - `just test -p codex-analytics -p codex-state -p codex-goal-extension`
    - `just test -p codex-core -E 'test(/goal/)'`
    - `just test -p codex-app-server`
    - `cargo build -p codex-analytics -p codex-core -p codex-state -p
    codex-app-server`
  • Allow creating a new goal after completion (#26681)
    ## Why
    
    Users have indicated that they want an agent to be able to create a new
    goal for itself after completing the previous goal. Currently, that's
    not possible because agents cannot overwrite an existing goal even if
    it's complete. This PR removes this limitation and allows `create_goal`
    to overwrite an existing goal if it is in the `complete` state.
    
    ## What changed
    
    `create_goal` now replaces the existing goal only when its status is
    `complete`. The replacement is performed atomically in the goal store,
    creates a fresh active goal with reset usage, and continues to reject
    creation while any unfinished goal exists. App server clients see a
    single `thread/goal/updated` event when the previous goal is replaced
    with the new one.
    
    The tool description and error message now reflect these semantics.
    
    ## What didn't change
    
    Agents still cannot create a new goal (that is, overwrite their existing
    goal) while the existing goal is active, blocked, paused, or in any
    state other than `complete`.
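    
    A minimal in-memory sketch of the replacement rule, assuming hypothetical `Goal` and `GoalStatus` types and a single optional goal slot. The real store performs the replacement atomically in SQLite, which this sketch does not model.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum GoalStatus {
        Active,
        Paused,
        Blocked,
        Complete,
    }
    
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct Goal {
        pub objective: String,
        pub status: GoalStatus,
        pub tokens_used: u64,
    }
    
    /// Creates a goal, replacing the stored one only if it is `Complete`.
    pub fn create_goal(slot: &mut Option<Goal>, objective: &str) -> Result<(), String> {
        if let Some(existing) = slot.as_ref() {
            if existing.status != GoalStatus::Complete {
                return Err(format!("existing goal is still {:?}", existing.status));
            }
        }
        // Fresh active goal with usage reset.
        *slot = Some(Goal {
            objective: objective.to_string(),
            status: GoalStatus::Active,
            tokens_used: 0,
        });
        Ok(())
    }
    ```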
  • Block active goals after terminal turn errors (#26690)
    ## Why
    
    Terminal turn errors can leave a goal active. Automatic goal
    continuation may then repeatedly hit a permanent failure, including
    compaction requests rejected with HTTP 400, and consume excessive
    tokens.
    
    This PR changes the goal extension to treat all turn-ending errors
    (including non-retryable errors and retryable errors that have exceeded
    their retry count) as "blocking" for the goal. The downside is that some
    of these failures are transient and a later attempt might have succeeded
    (e.g. a 429 due to a service outage); previously the goal runtime would
    have kept the agent going in these situations.
    
    ## What changed
    
    - Block the current active goal when a turn ends with an error other
    than a usage-limit error.
    - Preserve the existing `usage_limited` transition for usage-limit
    errors.
    - Share progress accounting, guarded state updates, metrics, and event
    emission in the goal runtime.
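    
    The transition rule can be sketched as a pure function. The enums are simplified stand-ins, and leaving non-active goals untouched is an assumption of this sketch rather than something the PR states.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum GoalStatus {
        Active,
        Blocked,
        UsageLimited,
        Complete,
    }
    
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum TurnError {
        UsageLimit,
        Other,
    }
    
    /// Status of the current goal after a turn ends with `error`.
    pub fn status_after_turn_error(current: GoalStatus, error: TurnError) -> GoalStatus {
        match (current, error) {
            // Existing behavior: usage-limit errors keep their own status.
            (GoalStatus::Active, TurnError::UsageLimit) => GoalStatus::UsageLimited,
            // New behavior: any other terminal error blocks the active goal.
            (GoalStatus::Active, TurnError::Other) => GoalStatus::Blocked,
            // Assumption: goals that are not active are left as they are.
            (other, _) => other,
        }
    }
    ```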
  • [1 of 2] Align goal extension with core behavior (#26547)
    ## Stack
    
    1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align
    goal extension with core behavior
    2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move
    goal runtime to extension
    
    ## Why
    
    The goal runtime is moving out of `codex-core` and into
    `codex-goal-extension`. This first PR brings the extension back in line
    with the current core behavior before the follow-up PR switches
    app-server sessions over to the extension, so that review can focus on
    ownership and wiring rather than hidden behavior drift.
    
    ## What Changed
    
    - Updates the extension `create_goal` and `update_goal` tool
    schemas/descriptions to match the current core wording for explicit
    token budgets, blocked-goal audits, resumed blocked goals, and
    system-owned budget/usage-limit transitions.
    - Marks `codex-goal-extension` as the live `/goal` extension crate
    rather than an unwired sketch.
    - Looks up the live thread before reading goal state for idle
    continuation, so continuation setup exits early when no live thread can
    accept the automatic turn.
  • Gate automatic idle turns in Plan mode (#26147)
    ## Why
    
    Goal idle continuation is extension-triggered model-visible work, so it
    should follow one core-owned rule for when automatic work may start. In
    particular, it should not jump ahead of queued user/client work, start
    while another task is active, or inject a continuation turn while the
    thread is in Plan mode.
    
    Keeping this policy in `try_start_turn_if_idle` avoids passing
    `collaboration_mode` or review-specific state through
    `ThreadLifecycleContributor::on_thread_idle`. Active `/review` is
    covered by the same active-task gate because Review turns are not
    steerable.
    
    ## What Changed
    
    - Teach `Session::try_start_turn_if_idle` to reject automatic idle turns
    in Plan mode, both before reserving an idle turn and after building the
    turn context.
    - Document `CodexThread::try_start_turn_if_idle` as the extension-facing
    gate for automatic idle work, including Plan-mode and active Review-task
    behavior.
    - Add focused coverage for Plan-mode rejection and active Review-task
    rejection without queuing synthetic input.
    
    ## Testing
    
    - `just test -p codex-core try_start_turn_if_idle`
  • fix: serialize goal progress accounting (#26155)
    ## Why
    
    Goal progress accounting can be reached from multiple completion paths
    for the same thread. Each path takes a progress snapshot, writes the
    usage delta, and then marks that snapshot as accounted. When two
    tool-completion hooks run at the same time, they can both observe the
    same unaccounted delta and charge it twice.
    
    ## What changed
    
    - Added a per-thread progress-accounting permit to
    `GoalAccountingState`.
    - Held that permit across the snapshot/write/mark-accounted critical
    section for active-turn, idle, and tool-finish accounting.
    - Added regression coverage for parallel tool-finish hooks so a shared
    token delta is charged once and only one progress event is emitted.
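    
    A sketch of the critical section, using a `std::sync::Mutex` as a stand-in for the per-thread permit (the real runtime is async and likely uses a different primitive). The type and field names are hypothetical.
    
    ```rust
    use std::sync::Mutex;
    
    #[derive(Default)]
    struct Progress {
        tokens_observed: u64,  // total usage reported so far
        tokens_accounted: u64, // portion already marked as accounted
        tokens_charged: u64,   // what has been written to the goal
    }
    
    #[derive(Default)]
    pub struct GoalAccounting {
        // The lock plays the role of the progress-accounting permit.
        progress: Mutex<Progress>,
    }
    
    impl GoalAccounting {
        pub fn observe(&self, tokens: u64) {
            self.progress.lock().expect("accounting lock poisoned").tokens_observed += tokens;
        }
    
        /// Snapshot, write, and mark-accounted under one guard.
        /// Returns the delta charged by this call.
        pub fn account(&self) -> u64 {
            let mut progress = self.progress.lock().expect("accounting lock poisoned");
            let delta = progress.tokens_observed - progress.tokens_accounted; // snapshot
            progress.tokens_charged += delta; // write
            progress.tokens_accounted = progress.tokens_observed; // mark accounted
            delta
        }
    
        pub fn charged(&self) -> u64 {
            self.progress.lock().expect("accounting lock poisoned").tokens_charged
        }
    }
    ```
    
    With the guard held across all three steps, concurrent callers cannot both observe the same unaccounted delta: one charges it and the others see a delta of zero.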
    
    ## Testing
    
    - Not run locally.
    - Added `parallel_tool_finish_accounts_active_goal_progress_once`.
  • Add goal extension GoalApi (#25096)
    ## Summary
    
    - add an extension-owned `GoalApi` for thread goal get/set/clear
    operations
    - register live goal runtimes with the API from the goal extension
    backend
    - cover the API and runtime-effect paths in goal extension tests
    
    ## Stack
    
    Follow-up app-server wiring PR: #25108
    
    ## Validation
    
    - `just fmt`
    - `just fix -p codex-goal-extension`
    - `just test -p codex-goal-extension`
  • Use templates for goal steering prompts (#25576)
    ## Why
    
    Goal steering prompts have grown into long inline Rust strings, which
    makes the authored prompt text hard to review and easy to damage while
    changing the surrounding plumbing. Moving those prompts into embedded
    Markdown templates keeps the policy text in the shape reviewers actually
    read, while preserving the existing runtime substitution and objective
    escaping behavior.
    
    ## What changed
    
    - Added `ext/goal/templates/goals/continuation.md`, `budget_limit.md`,
    and `objective_updated.md` for the three goal steering prompts.
    - Updated `ext/goal/src/steering.rs` to parse those embedded templates
    once with `codex-utils-template` and render the existing goal values
    into them.
    - Kept user objectives XML-escaped before rendering and converted budget
    counters into template variables.
    - Added the template directory to `ext/goal/BUILD.bazel` `compile_data`
    so Bazel has the same embedded prompt inputs as Cargo.
    
    ## Testing
    
    - Not run locally.
  • Add goal extension idle continuation (#25060)
    ## Why
    
    The goal extension needs a way to resume an active goal after the thread
    becomes idle, but the old core goal runtime should not be refactored as
    part of this step. The missing piece is a small core-owned turn-start
    primitive: let an extension ask for a normal model turn only when the
    thread is idle, and otherwise fail without injecting into whatever is
    currently active.
    
    ## What Changed
    
    - Adds `CodexThread::try_start_turn_if_idle(...)` as the narrow
    extension-facing primitive for synthetic idle work.
    - Implements the session side so it refuses to start when:
      - the provided input is empty,
      - the session is in Plan mode,
      - a turn is already active, or
      - trigger-turn mailbox work is pending.
    - Gives trigger-turn mailbox work priority if it appears while the idle
    turn is being prepared.
    - Wires `GoalExtension::on_thread_idle` to read the active persisted
    goal and submit the continuation prompt through this idle-only
    primitive.
    - Keeps the legacy core goal continuation implementation in place
    instead of folding it into this PR.
    
    ## Behavior
    
    This is intentionally best-effort. If `try_start_turn_if_idle` observes
    that the thread is not idle, or that higher-priority mailbox work should
    run first, it returns the input to the caller. The goal extension drops
    that continuation prompt and waits for a future idle opportunity instead
    of injecting stale synthetic goal text into an active turn.
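    
    The best-effort contract can be sketched with a plain state struct. `SessionState` and its fields are hypothetical, and the real primitive is async and re-checks mailbox work while preparing the turn, which is not modeled here.
    
    ```rust
    #[derive(Debug, Default)]
    pub struct SessionState {
        pub plan_mode: bool,
        pub turn_active: bool,
        pub mailbox_work_pending: bool,
        pub started_turns: Vec<Vec<String>>,
    }
    
    /// Starts a turn only when the session is idle.
    /// On refusal the input is handed back to the caller untouched.
    pub fn try_start_turn_if_idle(
        state: &mut SessionState,
        input: Vec<String>,
    ) -> Result<(), Vec<String>> {
        let refused = input.is_empty()
            || state.plan_mode
            || state.turn_active
            || state.mailbox_work_pending;
        if refused {
            return Err(input);
        }
        state.turn_active = true;
        state.started_turns.push(input);
        Ok(())
    }
    ```
    
    A caller such as the goal extension can simply drop the returned input on `Err` and wait for the next idle callback.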
    
    ## Validation
    
    - `just test -p codex-core
    try_start_turn_if_idle_rejects_active_turn_without_injecting`
    - `just test -p codex-goal-extension`
  • [codex] Require model for standalone web search (#25131)
    ## Why
    
    The standalone `/v1/alpha/search` request now requires a `model`, but
    the `web.run` extension currently omits it.
    
    This PR adds `model` to the extension `ToolCall` invocation.
    
    Follow-up to #23823.
    
    ## What changed
    
    - Make `SearchRequest.model` required.
    - Expose the effective per-turn model on extension tool calls and pass
    it in standalone web-search requests.
    - Assert the model is forwarded in the app-server round-trip test.
    
    ## Testing
    
    - `just test -p codex-api -p codex-tools -p codex-web-search-extension
    -p codex-memories-extension -p codex-goal-extension`
    - `just test -p codex-core -E
    'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
    - `just test -p codex-app-server -E
    'test(standalone_web_search_round_trips_encrypted_output)'`
  • Handle goal usage limits from turn errors (#25095)
    ## Summary
    - handle goal usage-limit turn errors in the goal extension
    - exercise the extension path in the goal backend test
    
    ## Tests
    - `just fmt`
    - `just test -p codex-goal-extension`
    - `just fix -p codex-goal-extension`
  • Use inject_if_running for active goal steering (#24924)
    ## Why
    
    This PR is stacked on #24918, which moves goal steering onto
    source-labeled internal model context fragments. Active-turn goal
    steering should use the same running-turn injection path as other
    runtime steering, so those fragments enter the pending input queue as
    `ResponseItem`s through the existing
    [`Session::inject_if_running`](https://github.com/openai/codex/blob/8d6f6cdf69b055c27682e7cdea9caf72a3e2ee7f/codex-rs/core/src/session/inject.rs#L12-L27)
    behavior instead of through a goal-specific conversion wrapper.
    
    ## What Changed
    
    - Exposes a narrow `CodexThread::inject_if_running` bridge for callers
    that only hold a thread handle.
    - Changes `ext/goal` active-turn steering to pass `ResponseItem`s
    directly.
    - Builds goal steering prompts as contextual internal model context
    `ResponseItem`s before injecting them into the running turn.
    
    ## Testing
    
    Not run locally; PR metadata update only.
  • Use internal model context fragments for goal steering (#24918)
    ## Why
    
    Goal steering is one form of runtime-owned model context, but the old
    `<goal_context>` wrapper made the contextual-fragment hiding path
    goal-specific. Using a source-labeled internal context fragment gives
    core and extensions a shared shape for hidden model steering while
    keeping those prompts out of visible turn history.
    
    The change also keeps legacy `<goal_context>` messages recognized as
    hidden contextual input so existing stored history does not start
    rendering old goal-steering prompts as user-visible turn items.
    
    ## What Changed
    
    - Replaces `GoalContext` with `InternalModelContextFragment` plus a
    validated `InternalContextSource`.
    - Renders goal steering as `<codex_internal_context
    source="goal">...</codex_internal_context>`.
    - Updates core goal steering and `ext/goal` steering to inject the new
    internal-context fragment.
    - Updates contextual-fragment, event-mapping, goal, and session tests
    for the new wrapper.
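    
    A hedged sketch of rendering and recognizing the wrapper. The type here is a simplified stand-in for the real one, the accepted character set for a source label is an assumption, and `render_fragment` and `is_hidden_context` are hypothetical helper names.
    
    ```rust
    /// Validated source label. The accepted character set is an assumption.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct InternalContextSource(String);
    
    impl InternalContextSource {
        pub fn new(source: &str) -> Option<Self> {
            let valid = !source.is_empty()
                && source
                    .chars()
                    .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
            valid.then(|| Self(source.to_string()))
        }
    }
    
    /// Wraps `body` in the source-labeled internal context tag.
    pub fn render_fragment(source: &InternalContextSource, body: &str) -> String {
        format!(
            "<codex_internal_context source=\"{}\">{}</codex_internal_context>",
            source.0, body
        )
    }
    
    /// True for the new wrapper with a valid source and for the legacy
    /// `<goal_context>` wrapper; arbitrary tags are not hidden.
    pub fn is_hidden_context(text: &str) -> bool {
        let text = text.trim();
        let legacy = text.starts_with("<goal_context>") && text.ends_with("</goal_context>");
        let current = text
            .strip_prefix("<codex_internal_context source=\"")
            .and_then(|rest| rest.split_once("\">"))
            .map(|(source, rest)| {
                InternalContextSource::new(source).is_some()
                    && rest.ends_with("</codex_internal_context>")
            })
            .unwrap_or(false);
        legacy || current
    }
    ```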
    
    ## Test Coverage
    
    - Adds coverage for detecting the new internal model context fragment.
    - Preserves coverage for hiding legacy `<goal_context>` fragments.
    - Verifies invalid internal context sources are rejected and arbitrary
    context tags are not hidden.
    - Updates goal steering/session assertions to expect the new
    `source="goal"` wrapper.
  • extension-api: add TurnItemEmitter to tool calls (#24813)
    ## Why
    Extension-contributed tools need to emit visible turn items through
    Codex's normal event and persistence pipeline.
    
    ## What
    - Add `TurnItemEmitter` to extension `ToolCall`s and route the core
    implementation through `Session::emit_turn_item_*`.
    - Hold weak session and turn references so retained tool calls cannot
    keep host state alive.
    - Provide a no-op emitter for extension test callers.
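    
    The weak-reference point can be sketched as follows. `Session` here is a hypothetical stand-in that just records emitted items; the real emitter also tracks the turn and routes through `Session::emit_turn_item_*`.
    
    ```rust
    use std::sync::{Arc, Mutex, Weak};
    
    /// Stand-in for host session state.
    #[derive(Default)]
    pub struct Session {
        pub emitted: Mutex<Vec<String>>,
    }
    
    /// Emitter that never extends the session's lifetime.
    #[derive(Clone)]
    pub struct TurnItemEmitter {
        session: Weak<Session>,
    }
    
    impl TurnItemEmitter {
        pub fn new(session: &Arc<Session>) -> Self {
            Self { session: Arc::downgrade(session) }
        }
    
        /// Emitter for tests that have no host session.
        pub fn noop() -> Self {
            Self { session: Weak::new() }
        }
    
        /// Returns `false` when the session has already been dropped.
        pub fn emit(&self, item: &str) -> bool {
            let Some(session) = self.session.upgrade() else {
                return false;
            };
            session
                .emitted
                .lock()
                .expect("emitted lock poisoned")
                .push(item.to_string());
            true
        }
    }
    ```
    
    A tool call that is retained after the turn ends holds only the `Weak` handle, so dropping the host's `Arc<Session>` still frees the session.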
    
    ## Test Plan
    - `just test -p codex-core -E
    'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
    
    ---------
    
    Co-authored-by: jif-oai <jif@openai.com>
  • Gate goal tools by thread eligibility (#24925)
    ## Why
    
    Goal tools create and update goal state for a persistent thread. The
    extension was only checking whether goals were enabled before
    advertising those tools, which meant they could be surfaced in contexts
    that should not receive thread goal controls: ephemeral threads without
    persistent thread state and review subagents.
    
    Those sessions can still run the goal extension lifecycle, but the
    thread tools should only be visible when the current thread can safely
    use them.
    
    ## What changed
    
    - Adds a `GoalRuntimeConfig` that separates goal enablement from whether
    goal tools are available for the current thread.
    - Computes tool eligibility on thread start from
    `persistent_thread_state_available` and `SessionSource`, hiding tools
    for review subagents.
    - Uses `GoalRuntimeHandle::tools_visible()` when contributing thread
    tools so enabled runtime state does not automatically imply tool
    exposure.
    - Adds backend coverage for hiding goal tools on ephemeral threads and
    review subagents.
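    
    A simplified sketch of the eligibility computation. The two-variant `SessionSource`, the field names, and placing `tools_visible` on the config (the PR puts it on `GoalRuntimeHandle`) are assumptions for illustration.
    
    ```rust
    /// Simplified stand-in; the real `SessionSource` has more variants.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum SessionSource {
        Interactive,
        ReviewSubagent,
    }
    
    /// Keeps goal enablement separate from tool availability.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct GoalRuntimeConfig {
        pub enabled: bool,
        pub tools_available: bool,
    }
    
    impl GoalRuntimeConfig {
        /// Computed once on thread start.
        pub fn for_thread(
            enabled: bool,
            persistent_thread_state_available: bool,
            source: SessionSource,
        ) -> Self {
            let tools_available = enabled
                && persistent_thread_state_available
                && source != SessionSource::ReviewSubagent;
            Self { enabled, tools_available }
        }
    
        pub fn tools_visible(&self) -> bool {
            self.tools_available
        }
    }
    ```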
    
    ## Testing
    
    - Added `goal_tools_hidden_for_ephemeral_threads`.
    - Added `goal_tools_hidden_for_review_subagents`.
  • Add thread start contributor facts (#24915)
    Summary: add session source and persistent-state availability to
    `ThreadStartInput`; populate them from session init; update existing goal
    test harness constructors. Tests: `just fmt`; `git diff --check`. No full
    tests or clippy run, per request.
  • feat: handle goal usage limits in goal extension (#24628)
    ## Why
    
    The extracted goal runtime needs a host-callable path for turns that
    stop because the workspace usage limit is reached. In that case, any
    in-turn goal progress should be accounted before the goal becomes
    terminal, and active goal accounting must be cleared so later
    tool-finish or turn-stop handling does not keep charging usage to a
    stopped goal.
    
    ## What changed
    
    - Adds `GoalRuntimeHandle::usage_limit_active_goal_for_turn`, which
    accounts current active-goal progress, marks the active or
    budget-limited thread goal as `UsageLimited`, records terminal metrics
    when the status changes, clears active goal accounting, and emits the
    updated goal event.
    - Covers both active and budget-limited goals in
    `ext/goal/tests/goal_extension_backend.rs`, including the invariant that
    later token/tool events do not add usage after the goal has been
    usage-limited.
    
    ## Testing
    
    - Added
    `usage_limit_active_goal_accounts_progress_and_clears_accounting`.
    - Added `usage_limit_budget_limited_goal_accounts_remaining_progress`.
  • Add experimental turn additional context (#24154)
    ## Summary
    
    Adds experimental `additionalContext` support to `turn/start` and
    `turn/steer` so clients can provide ephemeral external context, such as
    browser or automation state, without turning that plumbing into a
    visible user prompt or triggering user-prompt lifecycle behavior.
    
    ## API Shape
    
    The parameter shape is:
    
    ```ts
    additionalContext?: Record<string, {
      value: string
      kind: "untrusted" | "application"
    }> | null
    ```
    
    Example:
    
    ```json
    {
      "additionalContext": {
        "browser_info": {
          "value": "Active tab is CI failures.",
          "kind": "untrusted"
        },
        "automation_info": {
          "value": "CI rerun is in progress.",
          "kind": "application"
        }
      }
    }
    ```
    
    The keys are opaque and caller-defined.
    
    ## Context Injection
    
    When provided, accepted entries are inserted into model context as
    hidden contextual message items, not as visible thread user-message
    items.
    
    `kind: "untrusted"` entries are inserted with role `user`:
    
    ```text
    <external_${key}>${value}</external_${key}>
    ```
    
    `kind: "application"` entries are inserted with role `developer`:
    
    ```text
    <${key}>${value}</${key}>
    ```
    
    Values are not escaped. Each value is truncated to approximately 1k
    tokens before wrapping.
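    
    The two wrapping rules can be sketched as one function. `ContextKind` and `wrap_entry` are hypothetical names, and truncation is left out because the token estimate is not described here.
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum ContextKind {
        Untrusted,
        Application,
    }
    
    /// Returns the message role and the wrapped text for one entry.
    /// The value is inserted as-is, matching "values are not escaped".
    pub fn wrap_entry(key: &str, value: &str, kind: ContextKind) -> (&'static str, String) {
        match kind {
            ContextKind::Untrusted => (
                "user",
                format!("<external_{key}>{value}</external_{key}>"),
            ),
            ContextKind::Application => ("developer", format!("<{key}>{value}</{key}>")),
        }
    }
    ```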
    
    For `turn/start`, accepted additional context is inserted before normal
    user input. For `turn/steer`, additional context is merged only when the
    steer includes non-empty user input; context-only steers are still
    rejected as empty input.
    
    ## Dedupe Strategy
    
    `AdditionalContextStore` lives on session state and stores the latest
    complete additional-context map.
    
    Each `turn/start` or non-empty `turn/steer` treats its
    `additionalContext` as the current complete set of values. Entries are
    injected only when the key is new or the exact entry for that key
    changed, including `value` or `kind`. After merging, the store is
    replaced with the provided map, so omitted keys are removed from the
    retained set and can be injected again later if reintroduced.
    
    Omitting `additionalContext`, passing `null`, or passing an empty object
    resets the store to empty and injects nothing.
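    
    A minimal sketch of the dedupe strategy, assuming a `BTreeMap`-backed store and a hypothetical `merge` method that returns the keys to inject. The real store's internals may differ.
    
    ```rust
    use std::collections::BTreeMap;
    
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum ContextKind {
        Untrusted,
        Application,
    }
    
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct ContextEntry {
        pub value: String,
        pub kind: ContextKind,
    }
    
    #[derive(Default)]
    pub struct AdditionalContextStore {
        latest: BTreeMap<String, ContextEntry>,
    }
    
    impl AdditionalContextStore {
        /// Returns the keys whose entries are new or changed, then replaces
        /// the retained set with `provided` so omitted keys are forgotten.
        pub fn merge(&mut self, provided: BTreeMap<String, ContextEntry>) -> Vec<String> {
            let to_inject: Vec<String> = provided
                .iter()
                .filter(|(key, entry)| self.latest.get(*key) != Some(*entry))
                .map(|(key, _)| key.clone())
                .collect();
            self.latest = provided;
            to_inject
        }
    }
    ```
    
    Passing an empty map models an omitted, `null`, or empty `additionalContext`: nothing is injected and the store is reset, so a key sent again later is injected again.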
    
    ## What Changed
    
    - Threads experimental v2 `additionalContext` through app-server into
    core turn start and steer handling.
    - Adds separate contextual fragment types for untrusted user-role
    context and application developer-role context.
    - Uses pending response input items so additional context can be
    combined with normal user input without treating it as prompt text.
    - Adds integration coverage for start/steer flow, role routing,
    dedupe/reset behavior, deletion/re-add behavior, hook-blocked input
    behavior, empty context-only steer rejection, external-fragment marker
    matching, and truncation.
  • fix: restore goal accounting after thread resume (#24626)
    ## Why
    
    Goal idle accounting is supposed to survive a thread resume. Previously,
    the resume hook restored the active goal state inline from the extension
    lifecycle contributor, which left the runtime handle without a reusable
    restoration path and made the behavior hard to cover directly. When a
    thread with an active goal was resumed, goal accounting could lose track
    of the active idle goal instead of continuing to accrue elapsed time.
    
    ## What changed
    
    - Moved thread-resume restoration into
    `GoalRuntimeHandle::restore_after_resume()` so the runtime owns
    rehydrating active goal accounting from persisted thread goal state.
    - Kept disabled goal runtimes as a no-op and preserved the existing
    warning path when persisted goal state cannot be loaded.
    - Added a backend regression test that seeds an active goal, resumes the
    thread, waits briefly, and verifies elapsed idle time is reflected on
    the next external goal mutation.
    
    ## Testing
    
    - Not run locally; this metadata update only rewrote the PR title/body.
  • Add goal extension telemetry parity (#24615)
    ## Why
    
    `core/src/goals.rs` already emits OTEL metrics for goal creation,
    resume, terminal transitions, token counts, and duration. As `/goal`
    moves into `ext/goal`, the extension needs to preserve that telemetry
    contract instead of only emitting app-visible `ThreadGoalUpdated`
    events.
    
    This keeps the existing `codex.goal.*` metric surface intact while goal
    lifecycle ownership shifts toward the extension.
    
    ## What changed
    
    - Added an extension-local `GoalMetrics` helper that records the
    existing `codex.goal.*` counters and histograms through `codex-otel`.
    - Threaded an optional `MetricsClient` through `install_with_backend`,
    `GoalExtension`, `GoalRuntimeHandle`, and `GoalToolExecutor`.
    - Emitted created, resumed, and terminal goal metrics from the extension
    paths that create goals, restore active goals on thread resume, account
    for budget limits, complete or block goals, and handle external goal
    mutations.
    - Updated existing goal extension test setup callsites to pass `None`
    for metrics when instrumentation is not under test.
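    
    The optional-client pattern might look roughly like this. It is a sketch
    under assumptions: the `MetricsClient` trait shown here is a hypothetical
    stand-in, `RecordingClient` is a test double, and the metric name is
    illustrative rather than one of the real `codex.goal.*` names.
    
    ```rust
    use std::sync::{Arc, Mutex};

    /// Hypothetical stand-in for the metrics client interface.
    pub trait MetricsClient: Send + Sync {
        fn counter(&self, name: &str, increment: u64);
    }

    /// Test double that records every counter call.
    #[derive(Default)]
    pub struct RecordingClient {
        pub calls: Mutex<Vec<(String, u64)>>,
    }

    impl MetricsClient for RecordingClient {
        fn counter(&self, name: &str, increment: u64) {
            self.calls.lock().unwrap().push((name.to_string(), increment));
        }
    }

    /// Extension-local helper; with `None` every method is a no-op.
    pub struct GoalMetrics {
        client: Option<Arc<dyn MetricsClient>>,
    }

    impl GoalMetrics {
        pub fn new(client: Option<Arc<dyn MetricsClient>>) -> Self {
            Self { client }
        }

        /// The metric name below is illustrative, not the real one.
        pub fn goal_created(&self) {
            if let Some(client) = &self.client {
                client.counter("codex.goal.example_created", 1);
            }
        }
    }
    ```
    
    Keeping the `Option` inside the helper means call sites record metrics
    unconditionally and tests can pass `None` without branching.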
    
    ## Verification
    
    Not run locally.
  • Expose conversation history to extension tools (#23963)
    ## Why
    
    Extension tools that need conversation context should be able to read it
    from the live tool invocation instead of reaching into thread
    persistence themselves.
    
    ## What changed
    
    - Add a `ConversationHistory` snapshot to extension `ToolCall`s and
    populate it from the current raw in-memory response history.
    - Expose all history items at this boundary so each extension can filter
    and bound the subset it needs before consuming or forwarding it.
    - Cover the adapter and registry dispatch paths and update existing
    extension tests that construct `ToolCall` literals.
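    
    An extension-side filter over the snapshot could look like the sketch
    below. The `HistoryItem` variants, the `items` field, and
    `recent_messages` are hypothetical; the real history item type is richer.
    
    ```rust
    /// Hypothetical stand-in for a raw history item.
    #[derive(Clone, Debug, PartialEq)]
    pub enum HistoryItem {
        Message { role: String, text: String },
        ToolOutput { call_id: String, output: String },
    }

    /// Hypothetical snapshot handed to an extension tool call.
    pub struct ConversationHistory {
        pub items: Vec<HistoryItem>,
    }

    /// Keeps only messages and bounds the result to the `max_items` most
    /// recent ones, preserving chronological order.
    pub fn recent_messages(history: &ConversationHistory, max_items: usize) -> Vec<&HistoryItem> {
        let messages: Vec<&HistoryItem> = history
            .items
            .iter()
            .filter(|item| matches!(item, HistoryItem::Message { .. }))
            .collect();
        let start = messages.len().saturating_sub(max_items);
        messages[start..].to_vec()
    }
    ```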
    
    ## Test plan
    
    - `cargo test -p codex-tools`
    - `cargo test -p codex-extension-api`
    - `cargo test -p codex-goal-extension`
    - `cargo test -p codex-memories-extension`
    - `cargo test -p codex-core passes_turn_fields_to_extension_call`
    - `cargo test -p codex-core
    extension_tool_executors_are_model_visible_and_dispatchable`
  • Make tool executor specs mandatory (#23870)
    ## Why
    
    `ToolExecutor` is the runtime contract that keeps a callable tool and
    its model-visible spec together. Leaving `spec()` optional lets a
    registered runtime silently omit that half of the contract, and it also
    overloads a missing spec as an exposure decision for tools that should
    stay dispatchable without being shown to the model.
    
    ## What
    
    - Make `ToolExecutor::spec()` required and update core, extension, and
    test tool executors to return a concrete `ToolSpec`.
    - Add `ToolExposure::Hidden` for dispatch-only tools. The legacy
    `shell_command` runtime in unified-exec sessions now uses that explicit
    exposure instead of hiding itself by omitting a spec.
    - Build MCP tool specs when `McpHandler` is constructed so invalid MCP
    specs are skipped before the handler is registered.
    - Keep tool planning aligned with the new contract for direct, deferred,
    hidden, code-mode, dynamic, and namespaced tool paths.
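    
    The shape of the new contract can be sketched as follows. This is a
    simplified illustration, not the real trait: `ToolSpec` is reduced to a
    name, the `Direct` variant and the `exposure()` default method are
    assumptions, and `ReadFile` is a made-up executor.
    
    ```rust
    /// Hypothetical, simplified spec; the real `ToolSpec` carries a schema.
    #[derive(Clone, Debug, PartialEq)]
    pub struct ToolSpec {
        pub name: String,
    }

    /// Only `Hidden` is named in the description; `Direct` is an assumption.
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum ToolExposure {
        Direct,
        Hidden,
    }

    pub trait ToolExecutor {
        /// Required: every executor returns a concrete spec.
        fn spec(&self) -> ToolSpec;

        /// Defaults to model-visible; dispatch-only tools override this.
        fn exposure(&self) -> ToolExposure {
            ToolExposure::Direct
        }
    }

    pub struct ReadFile;

    impl ToolExecutor for ReadFile {
        fn spec(&self) -> ToolSpec {
            ToolSpec { name: "read_file".to_string() }
        }
    }

    pub struct LegacyShellCommand;

    impl ToolExecutor for LegacyShellCommand {
        fn spec(&self) -> ToolSpec {
            ToolSpec { name: "shell_command".to_string() }
        }

        fn exposure(&self) -> ToolExposure {
            ToolExposure::Hidden
        }
    }

    /// Specs shown to the model: hidden tools stay registered and
    /// dispatchable but are filtered out here.
    pub fn model_visible_specs(tools: &[Box<dyn ToolExecutor>]) -> Vec<ToolSpec> {
        tools
            .iter()
            .filter(|tool| tool.exposure() != ToolExposure::Hidden)
            .map(|tool| tool.spec())
            .collect()
    }
    ```
    
    With exposure as its own signal, a missing spec no longer has to double
    as a visibility decision.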
    
    ## Testing
    
    - Added tool-plan coverage that invalid MCP tool specs are not
    registered.
    - Updated shell-family coverage for the hidden legacy `shell_command`
    runtime and the affected tool executor test fixtures.
  • [codex] Steer budget-limited goal extension turns (#23718)
    ## What
    - Add a small extension capability for injecting model-visible response
    items into the active turn
    - Have the goal extension inject hidden goal-context steering when
    tool-finish accounting reaches `BudgetLimited`
    - Cover the extension backend path with an assertion on the injected
    steering item
    
    ## Why
    PR #23696 persists and emits the budget-limited goal update from
    tool-finish accounting, but it leaves the model unaware of that
    transition. The existing core runtime steers the model to wrap up in
    this case; the extension path should do the same through an explicit
    host capability.
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-goal-extension`
    - `cargo test -p codex-extension-api`
  • Fix thread settings clippy failure (#23724)
    ## Why
    
    `main` picked up two small Rust build failures after nearby merges:
    
    - #23507 added a real handler for
    `ServerNotification::ThreadSettingsUpdated`, but the same variant was
    still listed in the ignored-notification match arm. Full Clippy runs
    treat the resulting unreachable-pattern warning as an error.
    - #23666 added `turn_id` and `truncation_policy` to
    `codex_tools::ToolCall`, while the goal extension backend test fixtures
    from the goal-extension work still used the old shape. That left
    `codex-goal-extension` tests unable to compile once the branches met on
    `main`.
    
    ## What changed
    
    Removed the duplicate `ThreadSettingsUpdated` match pattern from
    `tui/src/chatwidget/protocol.rs`.
    
    Updated the goal extension test `tool_call` helper to populate the new
    `ToolCall` fields, and reused that helper for the one direct literal
    that still had the old field list.
    
    ## Verification
    
    - `just fix -p codex-tui`
    - `cargo test -p codex-goal-extension`
  • [codex] Preserve failed goal accounting flushes (#23717)
    ## What
    - Preserve database accounting failures from the goal extension instead
    of collapsing them into `None`
    - Warn with turn/tool context when a flush fails
    - Keep stop/abort accounting snapshots alive when the final flush did
    not persist
    
    ## Why
    PR #23696 can finish and discard a turn snapshot after
    `account_thread_goal_usage` fails. That loses the final accumulated
    accounting state silently. This follow-up keeps that failure explicit
    and avoids deleting the local snapshot in the failing path.
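    
    The failure-preserving flow can be sketched like this. All names here
    (`Accounting`, `Snapshot`, `FlushError`, `finish_turn`) are hypothetical,
    and the persistence call is abstracted as a closure.
    
    ```rust
    use std::collections::HashMap;

    #[derive(Clone, Debug, PartialEq)]
    pub struct Snapshot {
        pub tokens: u64,
    }

    #[derive(Debug, PartialEq)]
    pub struct FlushError(pub String);

    /// Hypothetical per-turn accounting state keyed by turn id.
    #[derive(Default)]
    pub struct Accounting {
        snapshots: HashMap<String, Snapshot>,
    }

    impl Accounting {
        pub fn record(&mut self, turn_id: &str, tokens: u64) {
            self.snapshots.insert(turn_id.to_string(), Snapshot { tokens });
        }

        pub fn has_snapshot(&self, turn_id: &str) -> bool {
            self.snapshots.contains_key(turn_id)
        }

        /// Flushes the turn's snapshot through `persist`. The snapshot is
        /// dropped only after a successful write; on failure it is kept and
        /// the error is returned instead of being collapsed into `None`.
        pub fn finish_turn<F>(&mut self, turn_id: &str, persist: F) -> Result<(), FlushError>
        where
            F: FnOnce(&Snapshot) -> Result<(), FlushError>,
        {
            let Some(snapshot) = self.snapshots.get(turn_id) else {
                return Ok(());
            };
            persist(snapshot)?;
            self.snapshots.remove(turn_id);
            Ok(())
        }
    }
    ```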
    
    ## Testing
    - `just fmt`
    - `cargo test -p codex-goal-extension`
  • feat: account active goal progress in the goal extension (#23696)
    ## Why
    
    The goal extension can create and surface goals, but the live
    turn-accounting path still stops short of persisting active-goal
    progress. That leaves token and wall-clock usage, plus
    `ThreadGoalUpdated` events, out of sync with the extension boundary once
    work actually advances or a goal transitions out of active state.
    
    ## What changed
    
    - Teach `GoalAccountingState` to track the current turn, active goal,
    token deltas, and wall-clock progress snapshots against the persisted
    goal id.
    - Flush active-goal accounting from tool-finish, turn-stop, and
    turn-abort lifecycle hooks, and emit `ThreadGoalUpdated` events when
    persisted progress changes.
    - Route `create_goal` and `update_goal` through the same accounting
    state so new goals start from the right baseline, final progress is
    flushed before status changes, and `update_goal` can mark a goal
    `blocked` as well as `complete`.
    - Keep budget-limited goals accruing through the end of the turn while
    clearing local active-goal state once a turn or explicit update is
    finished.
    - Expand backend and lifecycle coverage around store ids, baseline
    reset, tool-finish accounting, budget-limited carry-through, and
    blocked-goal updates.
    
    ## Testing
    
    - Added focused backend coverage in
    `codex-rs/ext/goal/tests/goal_extension_backend.rs` for baseline reset,
    tool-finish accounting, budget-limited turns, and blocked-goal updates.
    - Extended `codex-rs/core/src/session/tests.rs` to assert that lifecycle
    inputs expose the expected session, thread, and turn store ids.
  • feat: expose turn-start metadata to extensions (#23688)
    ## Why
    
    The goal extension needs more context when a turn starts than
    `turn_store` alone provides.
    
    In particular, goal accounting needs the stable turn id, the effective
    collaboration mode, and the cumulative token-usage baseline captured at
    turn start so it can:
    
    - suppress goal accounting for plan-mode turns
    - compute exact per-turn deltas from cumulative `total_token_usage`
    snapshots instead of relying on the most recent usage event alone
    - keep the extension-owned accounting path aligned with the host turn
    lifecycle
    
    ## What
    
    - extend `codex_extension_api::TurnStartInput` to expose `turn_id`,
    `collaboration_mode`, and `token_usage_at_turn_start`
    - pass the full `TurnContext` plus the captured token-usage baseline
    through the turn-start lifecycle emission path
    - initialize goal turn accounting from the turn-start baseline and
    collaboration mode
    - switch goal token accounting to compute deltas from cumulative
    `total_token_usage` snapshots
    - add coverage for the new turn-start lifecycle fields and for
    goal-accounting baseline behavior
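    
    Delta computation from cumulative snapshots can be sketched as below.
    `TurnAccounting` and `observe_total` are hypothetical names, token usage
    is reduced to a single `u64`, and the `Default` variant name is an
    assumption (only plan mode is named above).
    
    ```rust
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum CollaborationMode {
        Default,
        Plan,
    }

    /// Hypothetical per-turn accounting seeded from the turn-start baseline.
    pub struct TurnAccounting {
        mode: CollaborationMode,
        last_total: u64,
    }

    impl TurnAccounting {
        pub fn start(mode: CollaborationMode, token_usage_at_turn_start: u64) -> Self {
            Self { mode, last_total: token_usage_at_turn_start }
        }

        /// Takes a cumulative total and returns the tokens to attribute to
        /// the goal since the previous snapshot. Plan-mode turns attribute
        /// nothing.
        pub fn observe_total(&mut self, total_token_usage: u64) -> u64 {
            let delta = total_token_usage.saturating_sub(self.last_total);
            self.last_total = total_token_usage;
            if self.mode == CollaborationMode::Plan {
                0
            } else {
                delta
            }
        }
    }
    ```
    
    Working from cumulative totals makes a repeated or missed usage event
    harmless: the same total yields a zero delta, and a skipped event is
    absorbed by the next snapshot.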
    
    ## Testing
    
    - added `turn_start_lifecycle_exposes_turn_metadata_and_token_baseline`
    in `codex-rs/core/src/session/tests.rs`
    - added `ext/goal/tests/accounting.rs` coverage for baseline-aware goal
    accounting and plan-mode suppression
  • feat: wire goal extension tools to the dedicated goal store (#23685)
    ## Why
    
    `ext/goal` already had the tool specs and contributor wiring for
    `/goal`, but the installed tools still depended on a placeholder backend
    that always errored. That meant the extension could not actually own
    goal persistence even though the dedicated `thread_goals` store already
    exists.
    
    This change wires the extension tools directly to the dedicated goal
    store so the extension can create, read, and complete goals against real
    state instead of falling back to host-side placeholders.
    
    ## What changed
    
    - make `install_with_backend(...)` require
    `Arc<codex_state::StateRuntime>` so goal storage is always available
    when the extension is installed
    - remove the unused no-backend/public backend abstraction from
    `ext/goal` and have the tool executors talk directly to `StateRuntime`
    - map `thread_goals` rows into the existing protocol response shape for
    `get_goal`, `create_goal`, and `update_goal`
    - preserve current thread-list behavior by filling an empty thread
    preview from the goal objective when a goal is created through the
    extension path
    - add integration coverage for the installed tool surface, including
    successful goal creation and duplicate-create rejection
    
    ## Testing
    
    - `cargo test -p codex-goal-extension`
  • Add tool lifecycle extension contributor (#23309)
    ## Why
    
    Extensions that need to track runtime progress currently have no typed
    host signal for tool execution. The goal extension in particular needs
    to observe tool attempts without inspecting tool payloads, owning tool
    implementations, or staying coupled to core-only runtime plumbing.
    
    This adds a narrow lifecycle contributor API for host-owned tool
    execution: extensions can observe when an accepted tool call starts and
    how it finishes, while policy hooks and tool handlers continue to own
    payload rewriting, blocking, and execution.
    
    Relevant code:
    
    -
    [`ToolLifecycleContributor`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors.rs#L119)
    defines the extension-facing observer contract.
    -
    [`tool_lifecycle.rs`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors/tool_lifecycle.rs)
    defines the typed start/finish inputs, source, and outcome enums.
    - [`notify_tool_start` /
    `notify_tool_finish`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/core/src/tools/lifecycle.rs)
    bridge core tool dispatch into the extension registry.
    
    ## What Changed
    
    - Added `ToolLifecycleContributor` to `codex-extension-api`, including:
      - `ToolStartInput`
      - `ToolFinishInput`
      - `ToolCallSource`
      - `ToolCallOutcome`
    - Added registration and lookup support on `ExtensionRegistryBuilder` /
    `ExtensionRegistry`.
    - Wired core tool dispatch to notify lifecycle contributors for:
      - accepted tool starts
      - completed tool calls, including the tool output success marker
      - pre-tool-use blocks
      - failures before or after the handler runs
      - cancellation/abort in the parallel tool path
    - Registered the goal extension as a lifecycle contributor and added the
    outcome filter it will use for goal progress accounting.
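    
    A simplified observer might look like the sketch below. The type names
    come from the list above, but their fields, the `ToolCallOutcome`
    variants, the synchronous method signatures, and the `ProgressCounter`
    filter are all assumptions made for illustration.
    
    ```rust
    /// Hypothetical variants derived from the cases listed above.
    #[derive(Clone, Debug, PartialEq)]
    pub enum ToolCallOutcome {
        Completed { success: bool },
        Blocked,
        Failed,
        Aborted,
    }

    pub struct ToolStartInput {
        pub call_id: String,
        pub tool_name: String,
    }

    pub struct ToolFinishInput {
        pub call_id: String,
        pub outcome: ToolCallOutcome,
    }

    /// Observer contract: contributors see starts and finishes but never
    /// rewrite payloads or run the tool themselves.
    pub trait ToolLifecycleContributor {
        fn on_tool_start(&mut self, input: &ToolStartInput);
        fn on_tool_finish(&mut self, input: &ToolFinishInput);
    }

    /// Example contributor that counts attempts relevant to progress.
    #[derive(Default)]
    pub struct ProgressCounter {
        pub started: u32,
        pub counted: u32,
    }

    impl ToolLifecycleContributor for ProgressCounter {
        fn on_tool_start(&mut self, _input: &ToolStartInput) {
            self.started += 1;
        }

        fn on_tool_finish(&mut self, input: &ToolFinishInput) {
            // Hypothetical filter: blocked and aborted calls do not count.
            if matches!(
                input.outcome,
                ToolCallOutcome::Completed { .. } | ToolCallOutcome::Failed
            ) {
                self.counted += 1;
            }
        }
    }
    ```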
    
    ## Test Coverage
    
    - Added `dispatch_notifies_tool_lifecycle_contributors` to cover
    lifecycle notification ordering and outcomes for successful and
    handler-failed tool calls.
  • Emit goal update events from goal extension tools (#23306)
    ## Why
    
    Goal creation and completion are moving through the goal extension, but
    the rest of Codex still observes goal state through `ThreadGoalUpdated`
    events. Without an event from the extension-owned tool path, a
    model-initiated `create_goal` or `update_goal` can mutate the backend
    and return a tool result while app-server and TUI listeners miss the
    goal state transition.
    
    ## What changed
    
    - Added `GoalEventEmitter` as a small wrapper around the host
    `ExtensionEventSink` to build `EventMsg::ThreadGoalUpdated` events for
    goal updates.
    - Threaded the registry event sink into `GoalExtension` and the
    `GoalToolExecutor`s created by the extension. The public
    `GoalExtension::new` constructor keeps a `NoopExtensionEventSink`
    fallback for standalone use.
    - Emitted a goal update after successful `create_goal` and `update_goal`
    tool calls. Until `ToolCall` exposes the current turn submission id,
    these events use the tool call id as the event id and leave `turn_id`
    unset.
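    
    The emitter wrapper can be sketched roughly as follows. The `Event`
    struct, the `emit` method, the `goal_status` field, and `RecordingSink`
    are hypothetical simplifications; only the id and `turn_id` behavior is
    taken from the description above.
    
    ```rust
    use std::sync::{Arc, Mutex};

    /// Hypothetical, flattened event; the real event wraps an `EventMsg`.
    #[derive(Clone, Debug, PartialEq)]
    pub struct Event {
        pub id: String,
        pub turn_id: Option<String>,
        pub goal_status: String,
    }

    pub trait ExtensionEventSink: Send + Sync {
        fn emit(&self, event: Event);
    }

    /// Fallback for standalone use: drops every event.
    pub struct NoopExtensionEventSink;

    impl ExtensionEventSink for NoopExtensionEventSink {
        fn emit(&self, _event: Event) {}
    }

    /// Test double that records emitted events.
    #[derive(Default)]
    pub struct RecordingSink {
        pub events: Mutex<Vec<Event>>,
    }

    impl ExtensionEventSink for RecordingSink {
        fn emit(&self, event: Event) {
            self.events.lock().unwrap().push(event);
        }
    }

    pub struct GoalEventEmitter {
        sink: Arc<dyn ExtensionEventSink>,
    }

    impl GoalEventEmitter {
        pub fn new(sink: Arc<dyn ExtensionEventSink>) -> Self {
            Self { sink }
        }

        /// Uses the tool call id as the event id and leaves `turn_id` unset.
        pub fn thread_goal_updated(&self, tool_call_id: &str, goal_status: &str) {
            self.sink.emit(Event {
                id: tool_call_id.to_string(),
                turn_id: None,
                goal_status: goal_status.to_string(),
            });
        }
    }
    ```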
    
    Relevant code:
    
    -
    [`GoalEventEmitter::thread_goal_updated`](https://github.com/openai/codex/blob/1fe2d73890df9a50996f67f705d4da4cc3d4b866/codex-rs/ext/goal/src/events.rs#L19-L32)
    - [`GoalToolExecutor` emission
    points](https://github.com/openai/codex/blob/1fe2d73890df9a50996f67f705d4da4cc3d4b866/codex-rs/ext/goal/src/tool.rs#L161-L190)
    
    ## Testing
    
    - `cargo test -p codex-goal-extension`
  • chore: make token usage async (#23305)
    Make the `TokenUsageContributor` async. This will be required for future
    extensions, and the change is essentially free.
  • Make extension lifecycle hooks async (#23291)
    ## Why
    
    Extension lifecycle hooks sit on the host/extension boundary, but the
    current trait surface only allows synchronous callbacks. That forces
    extensions that need to seed, rehydrate, observe, or flush
    extension-owned state during thread and turn transitions to either block
    inside the callback or move async work into separate host plumbing.
    
    This PR makes those lifecycle callbacks awaitable so extension
    implementations can perform async work directly at the lifecycle point
    where the host already has the relevant session, thread, or turn stores
    available.
    
    ## What changed
    
    - Makes `ThreadLifecycleContributor` and `TurnLifecycleContributor`
    async in `codex-extension-api`.
    - Awaits thread start/resume/stop and turn start/stop/abort lifecycle
    callbacks from `codex-core`.
    - Updates the guardian and memories extensions to implement the async
    lifecycle trait surface.
    - Updates the existing lifecycle tests to use async contributor
    implementations.
    - Adds `async-trait` to the crates that now expose or implement these
    async object-safe lifecycle traits.
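    
    To illustrate what an awaitable, object-safe hook involves, here is a
    std-only sketch. It hand-writes the boxed-future signature that
    `async-trait` generates, so no external crate is needed; the single
    `on_turn_stop` hook, `FlushOnStop`, `notify_turn_stop`, and the tiny
    `block_on` executor are hypothetical and exist only for this example.
    
    ```rust
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::{Arc, Mutex};
    use std::task::{Context, Poll, Wake, Waker};

    /// Boxed future type, roughly what `async-trait` expands an async
    /// method's return type into.
    pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

    /// Hypothetical, simplified lifecycle trait with one awaitable hook.
    pub trait TurnLifecycleContributor: Send + Sync {
        fn on_turn_stop<'a>(&'a self, turn_id: &'a str) -> BoxFuture<'a, ()>;
    }

    /// Example contributor that records the turns it flushed.
    #[derive(Default)]
    pub struct FlushOnStop {
        pub flushed: Mutex<Vec<String>>,
    }

    impl TurnLifecycleContributor for FlushOnStop {
        fn on_turn_stop<'a>(&'a self, turn_id: &'a str) -> BoxFuture<'a, ()> {
            Box::pin(async move {
                // Async work (for example a database write) could be awaited here.
                self.flushed.lock().unwrap().push(turn_id.to_string());
            })
        }
    }

    /// Host side: await every contributor in registration order.
    pub async fn notify_turn_stop(
        contributors: &[Arc<dyn TurnLifecycleContributor>],
        turn_id: &str,
    ) {
        for contributor in contributors {
            contributor.on_turn_stop(turn_id).await;
        }
    }

    struct NoopWake;

    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }

    /// Busy-polling executor for this sketch only; a real host would use
    /// its async runtime.
    pub fn block_on<F: Future>(future: F) -> F::Output {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut future = Box::pin(future);
        loop {
            if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
                return output;
            }
        }
    }
    ```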
    
    ## Testing
    
    - Existing `codex-core` lifecycle tests were updated to cover async
    implementations for thread stop and turn abort ordering.
  • chore: goal ext skeleton (#23288)
    Skeleton of `/goal` as an extension.
    Lots of follow-ups coming.