## Why
After #28918, selected skill roots are `PathUri`, but the executor skill
provider still converts them to the app-server host's `AbsolutePathBuf`.
A foreign Windows root therefore cannot be discovered by a Unix host,
and the inverse has the same problem.
This PR keeps executor skill discovery and reads on the filesystem that
owns the selected root while reusing the existing skill rules.
## What changed
- Generalize the existing skill traversal to operate on `PathUri`
through `ExecutorFileSystem`, preserving its depth, directory, symlink,
and sibling-metadata concurrency behavior.
- Add a small environment skill loader that reuses the shared discovery,
frontmatter validation, dependency parsing, product policy, and
prompt-visibility rules.
- Keep the environment id and entrypoint `PathUri` in the skill catalog,
then route `skills.read` back through the same environment filesystem.
- Preserve the executor's path convention when deriving catalog handles,
including literal backslashes in POSIX filenames.
- Resolve plugin namespaces from nearby manifests through URI-native
filesystem reads.
- Cover foreign Windows roots, executor-owned reads, namespaces,
metadata, policy, and path identity.
```text
selected root (PathUri)
|
v
shared discovery over ExecutorFileSystem
|
v
environment-bound catalog entry --skills.read--> same ExecutorFileSystem
```
No second filesystem abstraction or duplicate traversal implementation
is introduced.
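As a rough illustration of the path-convention bullet, the sketch below derives the final path component under each convention. `PathConvention` and `last_segment` are hypothetical names, not this PR's API, and the separator rules are assumptions. The point is only that a backslash separates components under the Windows convention but is an ordinary filename character under POSIX.

```rust
/// Hypothetical stand-in for the executor's path convention.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PathConvention {
    Posix,
    Windows,
}

/// Returns the final path component under the given convention.
/// Assumption: Windows accepts both `\` and `/` as separators, POSIX only `/`.
fn last_segment(path: &str, convention: PathConvention) -> &str {
    let is_separator = |c: char| match convention {
        PathConvention::Posix => c == '/',
        PathConvention::Windows => c == '/' || c == '\\',
    };
    path.trim_end_matches(is_separator)
        .rsplit(is_separator)
        .next()
        .unwrap_or("")
}
```

Under this sketch, `/skills/a\b` keeps `a\b` as one POSIX filename, while `C:\skills\deploy` yields `deploy` under the Windows convention.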
## Stack
1. #29614 — add lexical `PathUri` containment.
2. #29620 — share URI-native manifest path resolution.
3. #28918 — keep selected plugin roots and resources URI-native.
4. **This PR** — load executor skills without host path conversion.
5. #29628 — resolve executor MCP working directories without host path
conversion.
Fixes #28263.
## Why
When a thread starts with `/goal`, the goal extension can update SQLite
goal state before the thread has any user-turn rollout items.
`thread/list` and `thread/search` rely on persisted listing metadata, so
a goal-first live thread could be absent from app-server listings after
restart even though the goal itself existed.
This regressed when goal handling moved out of core: the core path wrote
the goal update through the live thread rollout path, while the
extension-backed app-server path only updated goal state and emitted the
live notification.
## What
- Add `GoalSetOutcome::thread_goal_updated_item()` so the goal extension
owns the canonical `ThreadGoalUpdated` rollout item shape.
- Expose a narrow `CodexThread::append_rollout_items()` helper that
appends through the live thread and keeps derived SQLite metadata in
sync.
- When app-server sets a goal on an active live thread, persist the goal
update through that live-thread path.
- Add an app-server regression test that starts a live thread with
`thread/goal/set` and verifies it appears in state-DB-only
`thread/list`.
## Verification
- `env -u CODEX_SQLITE_HOME just test -p codex-app-server
goal_first_live_thread_appears_in_state_db_thread_list`
The workspace denies `clippy::expect_used` in production. Although
`clippy.toml` allows `expect` in tests, Bazel Clippy compiles
integration-test helper code in a way that does not receive that
exemption, which encouraged verbose `unwrap_or_else(... panic!(...))`
and equivalent `match`/`let else` forms.
This allows `clippy::expect_used` once at each integration-test crate
root (including aggregated suites and test-support libraries), then
replaces manual panic-based Result and Option unwraps with
`expect`/`expect_err`. Standalone `tests/*.rs` files remain their own
crate roots. Intentional assertion and unexpected-variant panics remain
unchanged, and the production `expect_used = "deny"` lint remains in
place.
The cleanup is mechanical and net-negative in line count.
## Why
App-server threads without a local executor need orchestrator-owned
skills from the hosted `codex_apps` MCP server. Threads with the local
executor already discover installed skills from the local filesystem.
After the orchestrator skill provider was enabled for every app-server
thread, local-executor threads also received the hosted skill catalog
and the `skills.list` and `skills.read` tools. This changed the existing
local behavior and could expose a second hosted copy of a skill that was
already installed locally.
## What changed
- Expose the thread's selected execution environments to extensions at
thread startup.
- Enable orchestrator skills only when the reserved local environment is
not selected.
- Apply that decision consistently to hosted skill catalog discovery,
explicit skill injection, and the `skills.list` and `skills.read` tools.
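
The gating rule above could be sketched roughly as follows. The environment id `"local"` and both function names are assumptions made for illustration; only the rule itself (hosted skills are off whenever the reserved local environment is selected) comes from this description.

```rust
/// Hypothetical id for the reserved local environment; the real value is
/// not stated in this description.
const LOCAL_ENVIRONMENT_ID: &str = "local";

/// Hosted (orchestrator) skills are enabled only when the reserved local
/// environment is not among the thread's selected environments.
fn orchestrator_skills_enabled(selected_environment_ids: &[&str]) -> bool {
    !selected_environment_ids
        .iter()
        .any(|id| *id == LOCAL_ENVIRONMENT_ID)
}

/// The same decision gates the hosted `skills.*` tools.
fn visible_skill_tools(selected_environment_ids: &[&str]) -> Vec<&'static str> {
    if orchestrator_skills_enabled(selected_environment_ids) {
        vec!["skills.list", "skills.read"]
    } else {
        Vec::new()
    }
}
```

Deriving every surface from one predicate is what keeps catalog discovery, explicit injection, and tool exposure from drifting apart.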
## Verification
- The existing no-executor app-server test continues to verify hosted
skill discovery, invocation, and child-resource reads.
- A new app-server test verifies that local-executor threads do not
receive hosted skill context or `skills.*` tools.
## Intent
Keep Bazel and Starlark files consistently formatted without requiring
contributors to install or version buildifier themselves.
## Implementation
- Add a SHA-256-pinned, cross-platform DotSlash manifest for buildifier
v8.5.1.
- Run buildifier from the shared `just fmt` and `just fmt-check` driver,
with Windows-safe explicit DotSlash invocation.
- Provision DotSlash in formatting CI and contributor devcontainers, and
document the source-build prerequisite.
- Apply the initial mechanical buildifier formatting baseline.
## Why
Image generation used `std::fs::read` for referenced image paths, which
did not support environment-backed filesystems or their sandbox context.
## What changed
- Expose optional turn environments to extension tool calls.
- Include each environment’s ID, working directory, filesystem, and
sandbox context.
- Read referenced images through the selected environment filesystem.
- Keep sandbox usage at the extension call site so extensions can choose
the appropriate access mode.
- Consolidate image request construction into one async function.
- Add coverage for successful environment reads and read failures.
## Validation
- `cargo check -p codex-image-generation-extension --tests`
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`
`just test -p codex-image-generation-extension` could not complete
because the build exhausted available disk space.
## Why
We're now [discouraging use of
`async_trait`](https://github.com/openai/codex/pull/20242).
Removing `async_trait` from `ToolExecutor` cuts the `codex_core` debug
test build time by ~78% (from 227.5s to 50.3s) on my machine.
Stacked on #27299, this PR applies the trait change after the handler
bodies have been outlined.
## What
Changed `ToolExecutor::handle` to return an explicit boxed
`ToolExecutorFuture` instead of using `async_trait`.
Updated `ToolExecutor` implementors to return `Box::pin(...)`, re-exported
the future alias through `codex-tools` and `codex-extension-api`, and
removed the direct `async-trait` dependency from `codex-tools`.
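
For readers unfamiliar with the pattern, here is a minimal sketch of what the hand-written boxed future looks like. The trait below is a simplified stand-in: the real `ToolExecutor::handle` arguments and output type are not shown in this description, so `input: &str` and `String` are assumptions, as is the exact definition of the alias. `block_on` is a tiny helper so the sketch runs without an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Assumed shape of the alias: a boxed, `Send` future borrowing for `'a`.
type ToolExecutorFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Simplified stand-in for the real trait (hypothetical signature).
trait ToolExecutor: Send + Sync {
    fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a, String>;
}

struct Echo;

impl ToolExecutor for Echo {
    fn handle<'a>(&'a self, input: &'a str) -> ToolExecutorFuture<'a, String> {
        // The boxing that `async_trait` used to generate is written out by hand.
        Box::pin(async move { format!("echo: {input}") })
    }
}

/// Waker that does nothing; sufficient because the sketch never suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the future in a loop until it completes.
fn block_on<T>(mut future: ToolExecutorFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Because `handle` returns a concrete type, the trait stays usable as `Box<dyn ToolExecutor>` without any macro expansion at each implementation site.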
## Why
Extension contributors are registered behind `dyn Trait` objects, so
native `async fn`/RPITIT methods would make these traits
non-object-safe. Spell out the boxed, `Send` future contract directly so
`extension-api` no longer needs `async-trait` while retaining the
existing runtime model.
## What changed
- add a shared `ExtensionFuture` alias and use it for asynchronous
contributor methods
- migrate production and test implementations to return `Box::pin(async
move { ... })`
- remove `async-trait` dependencies where they are no longer used,
keeping it dev-only where unrelated test executors still require it
## Behavior
No behavior change is intended. Contributor futures remain boxed,
`Send`, dynamically dispatched, and lazily executed; cancellation and
callback ordering stay unchanged.
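
The "dynamically dispatched and lazily executed" contract can be checked with a small sketch. `Contributor`, `Counting`, and `demo_lazy_dispatch` are hypothetical names, and the alias definition is an assumption; nothing here is the real `extension-api` surface.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Assumed shape of the shared alias.
type ExtensionFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical contributor trait. It can be used as `dyn Contributor`
/// because the method returns a concrete boxed type.
trait Contributor: Send + Sync {
    fn on_event<'a>(&'a self, calls: &'a AtomicUsize) -> ExtensionFuture<'a, ()>;
}

struct Counting;

impl Contributor for Counting {
    fn on_event<'a>(&'a self, calls: &'a AtomicUsize) -> ExtensionFuture<'a, ()> {
        Box::pin(async move {
            calls.fetch_add(1, Ordering::SeqCst);
        })
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once; enough here because the sketch never suspends.
fn poll_once(future: &mut ExtensionFuture<'_, ()>) -> bool {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    matches!(future.as_mut().poll(&mut cx), Poll::Ready(()))
}

/// Returns (calls observed before polling, calls observed after polling).
fn demo_lazy_dispatch() -> (usize, usize) {
    let calls = AtomicUsize::new(0);
    let contributors: Vec<Box<dyn Contributor>> =
        vec![Box::new(Counting), Box::new(Counting)];
    // Creating the futures runs none of the async bodies.
    let mut futures: Vec<_> = contributors.iter().map(|c| c.on_event(&calls)).collect();
    let before = calls.load(Ordering::SeqCst);
    for future in &mut futures {
        assert!(poll_once(future));
    }
    (before, calls.load(Ordering::SeqCst))
}
```

The counter only moves once the futures are polled, which is the same laziness callers had under `async-trait`.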
## Testing
- `just test -p codex-extension-api` (11 passed)
- affected extension crates (64 passed)
- targeted `codex-core` contributor tests (14 passed)
- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`
A broad local `codex-core` run compiled successfully but encountered
unrelated sandbox and missing test-binary fixture failures; CI will run
the full checks.
## Why
- Currently, there is no analytics event for `/goal` behavior
- Existing events cannot identify goal execution or its resulting
outcome
- The original update in
[#26182](https://github.com/openai/codex/pull/26182) was implemented
before `/goal` moved into `codex-goal-extension`.
## What Changed
- Adds `codex_goal_event` serialization and enrichment to
`codex-analytics`
- Emits goal events from the canonical `codex-goal-extension` mutation
and accounting paths:
- `created` when a new logical goal is persisted
- `usage_accounted` when cumulative goal usage is persisted
- `status_changed` when the stored goal status changes
- `cleared` when the goal is deleted
- Preserves causal `turn_id` for turn-driven events and uses null
attribution for external or idle lifecycle events
- Changes goal deletion to return the deleted row so `cleared` retains
the stable goal ID
## Event Details
Includes standard analytics metadata along with goal-specific fields:
- `goal_id`: Stable ID stored in the local SQLite goal row and shared
across the goal's events
- `event_kind`: Observed operation (one of the four lifecycle events
listed under "What Changed")
- `goal_status`: Resulting or last stored status: `active`, `paused`,
`blocked`, `usage_limited`, etc.
- `has_token_budget`: Indicates whether a token budget is configured
- `turn_id`: Causal turn ID, or null when no causal turn exists
- `cumulative_tokens_accounted`: Cumulative tokens on `usage_accounted`
events; null otherwise
- `cumulative_time_accounted_seconds`: Cumulative active time on
`usage_accounted` events; null otherwise
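
As a reading aid, the field list above could map onto a struct along these lines. This is a sketch, not the `codex-analytics` type: the Rust types (`u64`, `String`) and the constructor are assumptions, and standard analytics metadata is omitted.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum GoalEventKind {
    Created,
    UsageAccounted,
    StatusChanged,
    Cleared,
}

/// Goal-specific fields only; field types are assumed.
#[derive(Clone, Debug, PartialEq)]
struct GoalEvent {
    goal_id: String,
    event_kind: GoalEventKind,
    goal_status: String,
    has_token_budget: bool,
    turn_id: Option<String>,
    cumulative_tokens_accounted: Option<u64>,
    cumulative_time_accounted_seconds: Option<u64>,
}

impl GoalEvent {
    /// `usage` is (cumulative tokens, cumulative seconds).
    fn new(
        goal_id: &str,
        event_kind: GoalEventKind,
        goal_status: &str,
        has_token_budget: bool,
        turn_id: Option<&str>,
        usage: Option<(u64, u64)>,
    ) -> Self {
        // Cumulative usage is reported only on `usage_accounted` events.
        let usage = if event_kind == GoalEventKind::UsageAccounted {
            usage
        } else {
            None
        };
        Self {
            goal_id: goal_id.to_string(),
            event_kind,
            goal_status: goal_status.to_string(),
            has_token_budget,
            turn_id: turn_id.map(str::to_string),
            cumulative_tokens_accounted: usage.map(|(tokens, _)| tokens),
            cumulative_time_accounted_seconds: usage.map(|(_, seconds)| seconds),
        }
    }
}
```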
## Validation
- `just test -p codex-analytics -p codex-state -p codex-goal-extension`
- `just test -p codex-core -E 'test(/goal/)'`
- `just test -p codex-app-server`
- `cargo build -p codex-analytics -p codex-core -p codex-state -p
codex-app-server`
## Why
Users have indicated that they want an agent to be able to create a new
goal for itself after completing the previous goal. Currently, that's
not possible because agents cannot overwrite an existing goal even if
it's complete. This PR removes this limitation and allows `create_goal`
to overwrite an existing goal if it is in the `complete` state.
## What changed
`create_goal` now replaces the existing goal only when its status is
`complete`. The replacement is performed atomically in the goal store,
creates a fresh active goal with reset usage, and continues to reject
creation while any unfinished goal exists. App server clients see a
single `thread/goal/updated` event when the previous goal is replaced
with the new one.
The tool description and error message now reflect these semantics.
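
A minimal in-memory sketch of the replacement rule follows. `GoalStore`, `Goal`, and the error text are hypothetical; the real store performs the replacement atomically in SQLite, which this sketch does not model.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum GoalStatus {
    Active,
    Paused,
    Blocked,
    Complete,
}

#[derive(Clone, Debug, PartialEq)]
struct Goal {
    objective: String,
    status: GoalStatus,
    tokens_used: u64,
}

/// Hypothetical single-goal store standing in for the per-thread goal row.
#[derive(Default)]
struct GoalStore {
    current: Option<Goal>,
}

impl GoalStore {
    /// Creates a goal, replacing the existing one only if it is `Complete`.
    fn create_goal(&mut self, objective: &str) -> Result<Goal, String> {
        if let Some(existing) = &self.current {
            if existing.status != GoalStatus::Complete {
                return Err(format!(
                    "an unfinished goal already exists (status: {:?})",
                    existing.status
                ));
            }
        }
        // The replacement starts fresh: active status and reset usage.
        let goal = Goal {
            objective: objective.to_string(),
            status: GoalStatus::Active,
            tokens_used: 0,
        };
        self.current = Some(goal.clone());
        Ok(goal)
    }
}
```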
## What didn't change
Agents still cannot create a new goal (that is, overwrite their existing
goal) while the existing goal is active, blocked, paused, or in any
state other than `complete`.
## Why
Terminal turn errors can leave a goal active. Automatic goal
continuation may then repeatedly hit a permanent failure, including
compaction requests rejected with HTTP 400, and consume excessive
tokens.
This PR changes the goal extension to treat all turn-ending errors
(including non-retryable errors and retryable errors that have exceeded
their retry count) as "blocking" for the goal. The downside to this
change is that there are some errors that may eventually succeed (e.g. a
429 due to a service outage), and previously the goal runtime would have
kept the agent going in these situations.
## What changed
- Block the current active goal when a turn ends with an error other
than a usage-limit error.
- Preserve the existing `usage_limited` transition for usage-limit
errors.
- Share progress accounting, guarded state updates, metrics, and event
emission in the goal runtime.
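
The resulting transition can be summarized in a short sketch. The `TurnError` variants are hypothetical groupings of the error classes named above, and leaving non-active goals untouched is an assumption of this sketch rather than something this description states.

```rust
/// Hypothetical classification of turn-ending errors.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TurnError {
    UsageLimit,
    NonRetryable,
    RetriesExhausted,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum GoalStatus {
    Active,
    Blocked,
    UsageLimited,
    Complete,
}

/// Status of the goal after a turn ends with `error`.
fn status_after_turn_error(current: GoalStatus, error: TurnError) -> GoalStatus {
    // Assumption: only an active goal is affected by a turn-ending error.
    if current != GoalStatus::Active {
        return current;
    }
    match error {
        // The existing usage-limit transition is preserved.
        TurnError::UsageLimit => GoalStatus::UsageLimited,
        // Every other turn-ending error now blocks the goal.
        TurnError::NonRetryable | TurnError::RetriesExhausted => GoalStatus::Blocked,
    }
}
```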
## Stack
1. [#26547](https://github.com/openai/codex/pull/26547) - [1 of 2] Align
goal extension with core behavior
2. [#26548](https://github.com/openai/codex/pull/26548) - [2 of 2] Move
goal runtime to extension
## Why
The goal runtime is moving out of `codex-core` and into
`codex-goal-extension`. This first PR brings the extension back in line
with the current core behavior before the follow-up PR switches
app-server sessions over to the extension, so that review can focus on
ownership and wiring rather than hidden behavior drift.
## What Changed
- Updates the extension `create_goal` and `update_goal` tool
schemas/descriptions to match the current core wording for explicit
token budgets, blocked-goal audits, resumed blocked goals, and
system-owned budget/usage-limit transitions.
- Marks `codex-goal-extension` as the live `/goal` extension crate
rather than an unwired sketch.
- Looks up the live thread before reading goal state for idle
continuation, so continuation setup exits early when no live thread can
accept the automatic turn.
## Why
Goal idle continuation is extension-triggered model-visible work, so it
should follow one core-owned rule for when automatic work may start. In
particular, it should not jump ahead of queued user/client work, start
while another task is active, or inject a continuation turn while the
thread is in Plan mode.
Keeping this policy in `try_start_turn_if_idle` avoids passing
`collaboration_mode` or review-specific state through
`ThreadLifecycleContributor::on_thread_idle`. Active `/review` is
covered by the same active-task gate because Review turns are not
steerable.
## What Changed
- Teach `Session::try_start_turn_if_idle` to reject automatic idle turns
in Plan mode, both before reserving an idle turn and after building the
turn context.
- Document `CodexThread::try_start_turn_if_idle` as the extension-facing
gate for automatic idle work, including Plan-mode and active Review-task
behavior.
- Add focused coverage for Plan-mode rejection and active Review-task
rejection without queuing synthetic input.
## Testing
- `just test -p codex-core try_start_turn_if_idle`
## Why
Goal progress accounting can be reached from multiple completion paths
for the same thread. Each path takes a progress snapshot, writes the
usage delta, and then marks that snapshot as accounted. When two
tool-completion hooks run at the same time, they can both observe the
same unaccounted delta and charge it twice.
## What changed
- Added a per-thread progress-accounting permit to
`GoalAccountingState`.
- Held that permit across the snapshot/write/mark-accounted critical
section for active-turn, idle, and tool-finish accounting.
- Added regression coverage for parallel tool-finish hooks so a shared
token delta is charged once and only one progress event is emitted.
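
To make the race concrete, here is a self-contained sketch of the snapshot/write/mark-accounted critical section guarded by a permit. It uses `std::sync::Mutex` and OS threads as stand-ins; the struct layout is hypothetical and the real `GoalAccountingState` is likely async.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Default)]
struct Progress {
    observed: u64,
    accounted: u64,
}

/// Hypothetical layout: state locks are short-lived, the permit spans all steps.
struct GoalAccountingState {
    permit: Mutex<()>,
    progress: Mutex<Progress>,
    persisted_total: Mutex<u64>, // stand-in for the stored goal usage
}

impl GoalAccountingState {
    /// Returns true when a progress event would be emitted.
    fn account_progress(&self) -> bool {
        // Hold the permit across snapshot, write, and mark-accounted.
        let _permit = self.permit.lock().expect("permit poisoned");
        // 1. Snapshot the unaccounted delta.
        let delta = {
            let progress = self.progress.lock().expect("progress poisoned");
            progress.observed - progress.accounted
        };
        if delta == 0 {
            return false;
        }
        // 2. Write the usage delta.
        *self.persisted_total.lock().expect("total poisoned") += delta;
        // 3. Mark the snapshot as accounted.
        self.progress.lock().expect("progress poisoned").accounted += delta;
        true
    }
}

/// Runs `hooks` concurrent tool-finish hooks; returns (total charged, events emitted).
fn run_parallel_hooks(observed: u64, hooks: usize) -> (u64, usize) {
    let state = Arc::new(GoalAccountingState {
        permit: Mutex::new(()),
        progress: Mutex::new(Progress { observed, accounted: 0 }),
        persisted_total: Mutex::new(0),
    });
    let handles: Vec<_> = (0..hooks)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || state.account_progress())
        })
        .collect();
    let events = handles
        .into_iter()
        .map(|handle| handle.join().expect("hook panicked"))
        .filter(|emitted| *emitted)
        .count();
    let total = *state.persisted_total.lock().expect("total poisoned");
    (total, events)
}
```

Without the permit, two hooks could both read the same delta in step 1 before either reaches step 3, which is the double charge described above.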
## Testing
- Not run locally.
- Added `parallel_tool_finish_accounts_active_goal_progress_once`.
## Summary
- add an extension-owned `GoalApi` for thread goal get/set/clear
operations
- register live goal runtimes with the API from the goal extension
backend
- cover the API and runtime-effect paths in goal extension tests
## Stack
Follow-up app-server wiring PR: #25108
## Validation
- `just fmt`
- `just fix -p codex-goal-extension`
- `just test -p codex-goal-extension`
## Why
Goal steering prompts have grown into long inline Rust strings, which
makes the authored prompt text hard to review and easy to damage while
changing the surrounding plumbing. Moving those prompts into embedded
Markdown templates keeps the policy text in the shape reviewers actually
read, while preserving the existing runtime substitution and objective
escaping behavior.
## What changed
- Added `ext/goal/templates/goals/continuation.md`, `budget_limit.md`,
and `objective_updated.md` for the three goal steering prompts.
- Updated `ext/goal/src/steering.rs` to parse those embedded templates
once with `codex-utils-template` and render the existing goal values
into them.
- Kept user objectives XML-escaped before rendering and converted budget
counters into template variables.
- Added the template directory to `ext/goal/BUILD.bazel` `compile_data`
so Bazel has the same embedded prompt inputs as Cargo.
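
The escaping-then-rendering order can be sketched with the standard library alone. The template text, the `{{ name }}` placeholder syntax, and both function names are assumptions; the real code renders through `codex-utils-template`, whose API is not shown in this description.

```rust
/// Escapes the five XML special characters.
fn xml_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical template text; the real prompts live in the Markdown files
/// listed above.
const CONTINUATION_TEMPLATE: &str =
    "Continue working toward the goal.\n<objective>{{ objective }}</objective>\nTokens used: {{ tokens_used }}";

/// Renders the template with plain string substitution (a stand-in for the
/// real template engine).
fn render_continuation(objective: &str, tokens_used: u64) -> String {
    // Substitute trusted counters first so user text cannot smuggle in a
    // placeholder that is expanded afterwards.
    CONTINUATION_TEMPLATE
        .replace("{{ tokens_used }}", &tokens_used.to_string())
        .replace("{{ objective }}", &xml_escape(objective))
}
```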
## Testing
- Not run locally.
## Why
The goal extension needs a way to resume an active goal after the thread
becomes idle, but the old core goal runtime should not be refactored as
part of this step. The missing piece is a small core-owned turn-start
primitive: let an extension ask for a normal model turn only when the
thread is idle, and otherwise fail without injecting into whatever is
currently active.
## What Changed
- Adds `CodexThread::try_start_turn_if_idle(...)` as the narrow
extension-facing primitive for synthetic idle work.
- Implements the session side so it refuses to start when:
- the provided input is empty,
- the session is in plan mode,
- a turn is already active, or
- trigger-turn mailbox work is pending.
- Gives trigger-turn mailbox work priority if it appears while the idle
turn is being prepared.
- Wires `GoalExtension::on_thread_idle` to read the active persisted
goal and submit the continuation prompt through this idle-only
primitive.
- Keeps the legacy core goal continuation implementation in place
instead of folding it into this PR.
## Behavior
This is intentionally best-effort. If `try_start_turn_if_idle` observes
that the thread is not idle, or that higher-priority mailbox work should
run first, it returns the input to the caller. The goal extension drops
that continuation prompt and waits for a future idle opportunity instead
of injecting stale synthetic goal text into an active turn.
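
A simplified model of the best-effort contract is sketched below. The `Session` fields and the `Vec<String>` input type are hypothetical; only the refusal conditions and the "input is handed back" behavior come from this description.

```rust
/// Hypothetical, heavily simplified session state.
#[derive(Default)]
struct Session {
    plan_mode: bool,
    active_turn: bool,
    mailbox_pending: bool,
    started_turns: Vec<Vec<String>>,
}

impl Session {
    /// Starts a turn only if the thread is idle; otherwise returns the input.
    fn try_start_turn_if_idle(&mut self, input: Vec<String>) -> Result<(), Vec<String>> {
        let idle = !self.plan_mode && !self.active_turn && !self.mailbox_pending;
        if input.is_empty() || !idle {
            return Err(input);
        }
        self.active_turn = true;
        self.started_turns.push(input);
        Ok(())
    }
}

/// Goal-extension side: drop the prompt on rejection and wait for the next
/// idle opportunity. Returns whether a continuation turn was started.
fn on_thread_idle(session: &mut Session, continuation_prompt: String) -> bool {
    session
        .try_start_turn_if_idle(vec![continuation_prompt])
        .is_ok()
}
```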
## Validation
- `just test -p codex-core
try_start_turn_if_idle_rejects_active_turn_without_injecting`
- `just test -p codex-goal-extension`
## Why
The standalone `/v1/alpha/search` request now requires a `model`, but
the `web.run` extension currently omits it.
This PR adds `model` to the extension `ToolCall` invocation.
Follow-up to #23823.
## What changed
- Make `SearchRequest.model` required.
- Expose the effective per-turn model on extension tool calls and pass
it in standalone web-search requests.
- Assert the model is forwarded in the app-server round-trip test.
## Testing
- `just test -p codex-api -p codex-tools -p codex-web-search-extension
-p codex-memories-extension -p codex-goal-extension`
- `just test -p codex-core -E
'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
- `just test -p codex-app-server -E
'test(standalone_web_search_round_trips_encrypted_output)'`
## Summary
- handle goal usage-limit turn errors in the goal extension
- exercise the extension path in the goal backend test
## Tests
- `just fmt`
- `just test -p codex-goal-extension`
- `just fix -p codex-goal-extension`
## Why
This PR is stacked on #24918, which moves goal steering onto
source-labeled internal model context fragments. Active-turn goal
steering should use the same running-turn injection path as other
runtime steering, so those fragments enter the pending input queue as
`ResponseItem`s through the existing
[`Session::inject_if_running`](https://github.com/openai/codex/blob/8d6f6cdf69b055c27682e7cdea9caf72a3e2ee7f/codex-rs/core/src/session/inject.rs#L12-L27)
behavior instead of through a goal-specific conversion wrapper.
## What Changed
- Exposes a narrow `CodexThread::inject_if_running` bridge for callers
that only hold a thread handle.
- Changes `ext/goal` active-turn steering to pass `ResponseItem`s
directly.
- Builds goal steering prompts as contextual internal model context
`ResponseItem`s before injecting them into the running turn.
## Testing
Not run locally; PR metadata update only.
## Why
Goal steering is one form of runtime-owned model context, but the old
`<goal_context>` wrapper made the contextual-fragment hiding path
goal-specific. Using a source-labeled internal context fragment gives
core and extensions a shared shape for hidden model steering while
keeping those prompts out of visible turn history.
The change also keeps legacy `<goal_context>` messages recognized as
hidden contextual input so existing stored history does not start
rendering old goal-steering prompts as user-visible turn items.
## What Changed
- Replaces `GoalContext` with `InternalModelContextFragment` plus a
validated `InternalContextSource`.
- Renders goal steering as `<codex_internal_context
source="goal">...</codex_internal_context>`.
- Updates core goal steering and `ext/goal` steering to inject the new
internal-context fragment.
- Updates contextual-fragment, event-mapping, goal, and session tests
for the new wrapper.
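
The wrapper and the hiding check might look roughly like this. The validation rule (non-empty, lowercase ASCII letters and underscores) and the string-matching detection are assumptions for illustration; the real `InternalContextSource` rules are not spelled out here.

```rust
/// Validated source label. The accepted character set is an assumption.
struct InternalContextSource(String);

impl InternalContextSource {
    fn new(source: &str) -> Option<Self> {
        let valid = !source.is_empty()
            && source.chars().all(|c| c.is_ascii_lowercase() || c == '_');
        valid.then(|| Self(source.to_string()))
    }
}

/// Renders the source-labeled wrapper around `body`.
fn render_fragment(source: &InternalContextSource, body: &str) -> String {
    format!(
        "<codex_internal_context source=\"{}\">{}</codex_internal_context>",
        source.0, body
    )
}

/// True for the new wrapper (with a valid source) and for legacy
/// `<goal_context>` messages; arbitrary tags stay visible.
fn is_hidden_context(text: &str) -> bool {
    let text = text.trim();
    if text.starts_with("<goal_context>") && text.ends_with("</goal_context>") {
        return true;
    }
    let Some(rest) = text.strip_prefix("<codex_internal_context source=\"") else {
        return false;
    };
    let Some((source, tail)) = rest.split_once("\">") else {
        return false;
    };
    InternalContextSource::new(source).is_some() && tail.ends_with("</codex_internal_context>")
}
```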
## Test Coverage
- Adds coverage for detecting the new internal model context fragment.
- Preserves coverage for hiding legacy `<goal_context>` fragments.
- Verifies invalid internal context sources are rejected and arbitrary
context tags are not hidden.
- Updates goal steering/session assertions to expect the new
`source="goal"` wrapper.
## Why
Extension-contributed tools need to emit visible turn items through
Codex's normal event and persistence pipeline.
## What
- Add `TurnItemEmitter` to extension `ToolCall`s and route the core
implementation through `Session::emit_turn_item_*`.
- Hold weak session and turn references so retained tool calls cannot
keep host state alive.
- Provide a no-op emitter for extension test callers.
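
The weak-reference design can be illustrated with a small sketch. `Session` here is a hypothetical stand-in holding only a list of emitted items, and `emit` returning a `bool` is an assumption made so the behavior is observable.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical stand-in for host session state.
struct Session {
    items: Mutex<Vec<String>>,
}

/// Emitter that never keeps the session alive on its own.
struct TurnItemEmitter {
    session: Weak<Session>,
}

impl TurnItemEmitter {
    fn new(session: &Arc<Session>) -> Self {
        Self { session: Arc::downgrade(session) }
    }

    /// No-op emitter for test callers: the weak pointer never upgrades.
    fn noop() -> Self {
        Self { session: Weak::new() }
    }

    /// Returns false when the session is already gone.
    fn emit(&self, item: &str) -> bool {
        match self.session.upgrade() {
            Some(session) => {
                session
                    .items
                    .lock()
                    .expect("items poisoned")
                    .push(item.to_string());
                true
            }
            None => false,
        }
    }
}
```

A retained `ToolCall` holding this emitter adds no strong reference, so dropping the host's `Arc<Session>` still frees the session.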
## Test Plan
- `just test -p codex-core -E
'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'`
---------
Co-authored-by: jif-oai <jif@openai.com>
## Why
Goal tools create and update goal state for a persistent thread. The
extension was only checking whether goals were enabled before
advertising those tools, which meant they could be surfaced in contexts
that should not receive thread goal controls: ephemeral threads without
persistent thread state and review subagents.
Those sessions can still run the goal extension lifecycle, but the
thread tools should only be visible when the current thread can safely
use them.
## What changed
- Adds a `GoalRuntimeConfig` that separates goal enablement from whether
goal tools are available for the current thread.
- Computes tool eligibility on thread start from
`persistent_thread_state_available` and `SessionSource`, hiding tools
for review subagents.
- Uses `GoalRuntimeHandle::tools_visible()` when contributing thread
tools so enabled runtime state does not automatically imply tool
exposure.
- Adds backend coverage for hiding goal tools on ephemeral threads and
review subagents.
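
The eligibility rule reduces to a small predicate, sketched here. The `SessionSource` variants and the `for_thread` constructor are hypothetical simplifications of the real types.

```rust
/// Hypothetical subset of session sources.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SessionSource {
    Interactive,
    ReviewSubagent,
}

/// Keeps goal enablement separate from tool visibility.
#[derive(Clone, Copy, Debug, PartialEq)]
struct GoalRuntimeConfig {
    goals_enabled: bool,
    tools_visible: bool,
}

impl GoalRuntimeConfig {
    /// Computed once at thread start.
    fn for_thread(
        goals_enabled: bool,
        persistent_thread_state_available: bool,
        source: SessionSource,
    ) -> Self {
        let tools_visible = goals_enabled
            && persistent_thread_state_available
            && source != SessionSource::ReviewSubagent;
        Self { goals_enabled, tools_visible }
    }
}
```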
## Testing
- Added `goal_tools_hidden_for_ephemeral_threads`.
- Added `goal_tools_hidden_for_review_subagents`.
Summary: add session source and persistent-state availability to
ThreadStartInput; populate them from session init; update existing goal
test harness constructors. Tests: just fmt; git diff --check. No full
tests or clippy run per request.
## Why
The extracted goal runtime needs a host-callable path for turns that
stop because the workspace usage limit is reached. In that case, any
in-turn goal progress should be accounted before the goal becomes
terminal, and active goal accounting must be cleared so later
tool-finish or turn-stop handling does not keep charging usage to a
stopped goal.
## What changed
- Adds `GoalRuntimeHandle::usage_limit_active_goal_for_turn`, which
accounts current active-goal progress, marks the active or
budget-limited thread goal as `UsageLimited`, records terminal metrics
when the status changes, clears active goal accounting, and emits the
updated goal event.
- Covers both active and budget-limited goals in
`ext/goal/tests/goal_extension_backend.rs`, including the invariant that
later token/tool events do not add usage after the goal has been
usage-limited.
## Testing
- Added
`usage_limit_active_goal_accounts_progress_and_clears_accounting`.
- Added `usage_limit_budget_limited_goal_accounts_remaining_progress`.
## Summary
Adds experimental `additionalContext` support to `turn/start` and
`turn/steer` so clients can provide ephemeral external context, such as
browser or automation state, without turning that plumbing into a
visible user prompt or triggering user-prompt lifecycle behavior.
## API Shape
The parameter shape is:
```ts
additionalContext?: Record<string, {
value: string
kind: "untrusted" | "application"
}> | null
```
Example:
```json
{
"additionalContext": {
"browser_info": {
"value": "Active tab is CI failures.",
"kind": "untrusted"
},
"automation_info": {
"value": "CI rerun is in progress.",
"kind": "application"
}
}
}
```
The keys are opaque and caller-defined.
## Context Injection
When provided, accepted entries are inserted into model context as
hidden contextual message items, not as visible thread user-message
items.
`kind: "untrusted"` entries are inserted with role `user`:
```text
<external_${key}>${value}</external_${key}>
```
`kind: "application"` entries are inserted with role `developer`:
```text
<${key}>${value}</${key}>
```
Values are not escaped. Each value is truncated to approximately 1k
tokens before wrapping.
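
The two wrapping rules can be written as one function. This is a sketch with hypothetical names; truncation is deliberately left out, and values are inserted unescaped to match the behavior described above.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ContextKind {
    Untrusted,
    Application,
}

/// Returns (role, wrapped text) for one additional-context entry.
fn wrap_additional_context(key: &str, value: &str, kind: ContextKind) -> (&'static str, String) {
    match kind {
        ContextKind::Untrusted => (
            "user",
            format!("<external_{key}>{value}</external_{key}>"),
        ),
        ContextKind::Application => ("developer", format!("<{key}>{value}</{key}>")),
    }
}
```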
For `turn/start`, accepted additional context is inserted before normal
user input. For `turn/steer`, additional context is merged only when the
steer includes non-empty user input; context-only steers still reject as
empty input.
## Dedupe Strategy
`AdditionalContextStore` lives on session state and stores the latest
complete additional-context map.
Each `turn/start` or non-empty `turn/steer` treats its
`additionalContext` as the current complete set of values. Entries are
injected only when the key is new or the exact entry for that key
changed, including `value` or `kind`. After merging, the store is
replaced with the provided map, so omitted keys are removed from the
retained set and can be injected again later if reintroduced.
Omitting `additionalContext`, passing `null`, or passing an empty object
resets the store to empty and injects nothing.
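
The dedupe rule is compact enough to sketch directly. The `Entry` type and the `merge` method are hypothetical; the real `AdditionalContextStore` API is not shown in this description.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
struct Entry {
    value: String,
    kind: String, // "untrusted" or "application"
}

type ContextMap = BTreeMap<String, Entry>;

/// Retains the latest complete additional-context map.
#[derive(Default)]
struct AdditionalContextStore {
    latest: ContextMap,
}

impl AdditionalContextStore {
    /// Returns the keys to inject, then replaces the retained map.
    /// `None` models an omitted or `null` `additionalContext`.
    fn merge(&mut self, provided: Option<ContextMap>) -> Vec<String> {
        let provided = provided.unwrap_or_default();
        let to_inject: Vec<String> = provided
            .iter()
            // Inject only keys that are new or whose value/kind changed.
            .filter(|(key, entry)| self.latest.get(*key) != Some(*entry))
            .map(|(key, _)| key.clone())
            .collect();
        // Omitted keys drop out of the retained set here.
        self.latest = provided;
        to_inject
    }
}

/// Small helper for building entries.
fn entry(value: &str, kind: &str) -> Entry {
    Entry { value: value.to_string(), kind: kind.to_string() }
}
```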
## What Changed
- Threads experimental v2 `additionalContext` through app-server into
core turn start and steer handling.
- Adds separate contextual fragment types for untrusted user-role
context and application developer-role context.
- Uses pending response input items so additional context can be
combined with normal user input without treating it as prompt text.
- Adds integration coverage for start/steer flow, role routing,
dedupe/reset behavior, deletion/re-add behavior, hook-blocked input
behavior, empty context-only steer rejection, external-fragment marker
matching, and truncation.
## Why
Goal idle accounting is supposed to survive a thread resume. Previously,
the resume hook restored the active goal state inline from the extension
lifecycle contributor, which left the runtime handle without a reusable
restoration path and made the behavior hard to cover directly. When a
thread with an active goal was resumed, goal accounting could lose track
of the active idle goal instead of continuing to accrue elapsed time.
## What changed
- Moved thread-resume restoration into
`GoalRuntimeHandle::restore_after_resume()` so the runtime owns
rehydrating active goal accounting from persisted thread goal state.
- Kept disabled goal runtimes as a no-op and preserved the existing
warning path when persisted goal state cannot be loaded.
- Added a backend regression test that seeds an active goal, resumes the
thread, waits briefly, and verifies elapsed idle time is reflected on
the next external goal mutation.
## Testing
- Not run locally; this metadata update only rewrote the PR title/body.
## Why
`core/src/goals.rs` already emits OTEL metrics for goal creation,
resume, terminal transitions, token counts, and duration. As `/goal`
moves into `ext/goal`, the extension needs to preserve that telemetry
contract instead of only emitting app-visible `ThreadGoalUpdated`
events.
This keeps the existing `codex.goal.*` metric surface intact while goal
lifecycle ownership shifts toward the extension.
## What changed
- Added an extension-local `GoalMetrics` helper that records the
existing `codex.goal.*` counters and histograms through `codex-otel`.
- Threaded an optional `MetricsClient` through `install_with_backend`,
`GoalExtension`, `GoalRuntimeHandle`, and `GoalToolExecutor`.
- Emitted created, resumed, and terminal goal metrics from the extension
paths that create goals, restore active goals on thread resume, account
budget limits, complete or block goals, and handle external goal
mutations.
- Updated existing goal extension test setup callsites to pass `None`
for metrics when instrumentation is not under test.
## Verification
Not run locally.
## Why
Extension tools that need conversation context should be able to read it
from the live tool invocation instead of reaching into thread
persistence themselves.
## What changed
- Add a `ConversationHistory` snapshot to extension `ToolCall`s and
populate it from the current raw in-memory response history.
- Expose all history items at this boundary so each extension can filter
and bound the subset it needs before consuming or forwarding it.
- Cover the adapter and registry dispatch paths and update existing
extension tests that construct `ToolCall` literals.
## Test plan
- `cargo test -p codex-tools`
- `cargo test -p codex-extension-api`
- `cargo test -p codex-goal-extension`
- `cargo test -p codex-memories-extension`
- `cargo test -p codex-core passes_turn_fields_to_extension_call`
- `cargo test -p codex-core
extension_tool_executors_are_model_visible_and_dispatchable`
## Why
`ToolExecutor` is the runtime contract that keeps a callable tool and
its model-visible spec together. Leaving `spec()` optional lets a
registered runtime silently omit that half of the contract, and it also
overloads a missing spec as an exposure decision for tools that should
stay dispatchable without being shown to the model.
## What
- Make `ToolExecutor::spec()` required and update core, extension, and
test tool executors to return a concrete `ToolSpec`.
- Add `ToolExposure::Hidden` for dispatch-only tools. The legacy
`shell_command` runtime in unified-exec sessions now uses that explicit
exposure instead of hiding itself by omitting a spec.
- Build MCP tool specs when `McpHandler` is constructed so invalid MCP
specs are skipped before the handler is registered.
- Keep tool planning aligned with the new contract for direct, deferred,
hidden, code-mode, dynamic, and namespaced tool paths.
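
A reduced sketch of the new contract: `spec()` has no default, and visibility is an explicit, separate decision. The two-variant `ToolExposure`, the `ToolSpec` shape, and the `ReadFile` tool are hypothetical simplifications.

```rust
/// Simplified exposure enum; the real one may have more variants.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToolExposure {
    Direct,
    Hidden,
}

#[derive(Clone, Debug, PartialEq)]
struct ToolSpec {
    name: String,
}

trait ToolExecutor {
    /// Required: every registered runtime carries a concrete spec.
    fn spec(&self) -> ToolSpec;

    /// Exposure is decided separately from having a spec.
    fn exposure(&self) -> ToolExposure {
        ToolExposure::Direct
    }
}

/// Hypothetical model-visible tool.
struct ReadFile;

impl ToolExecutor for ReadFile {
    fn spec(&self) -> ToolSpec {
        ToolSpec { name: "read_file".to_string() }
    }
}

/// Dispatch-only legacy runtime.
struct LegacyShellCommand;

impl ToolExecutor for LegacyShellCommand {
    fn spec(&self) -> ToolSpec {
        ToolSpec { name: "shell_command".to_string() }
    }

    fn exposure(&self) -> ToolExposure {
        ToolExposure::Hidden
    }
}

/// Specs shown to the model: hidden tools are filtered out.
fn model_visible_specs(tools: &[Box<dyn ToolExecutor>]) -> Vec<ToolSpec> {
    tools
        .iter()
        .filter(|tool| tool.exposure() != ToolExposure::Hidden)
        .map(|tool| tool.spec())
        .collect()
}

/// Dispatch ignores exposure: hidden tools remain callable by name.
fn can_dispatch(tools: &[Box<dyn ToolExecutor>], name: &str) -> bool {
    tools.iter().any(|tool| tool.spec().name == name)
}
```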
## Testing
- Added tool-plan coverage that invalid MCP tool specs are not
registered.
- Updated shell-family coverage for the hidden legacy `shell_command`
runtime and the affected tool executor test fixtures.
## What
- Add a small extension capability for injecting model-visible response
items into the active turn
- Have the goal extension inject hidden goal-context steering when
tool-finish accounting reaches `BudgetLimited`
- Cover the extension backend path with an assertion on the injected
steering item
## Why
PR #23696 persists and emits the budget-limited goal update from
tool-finish accounting, but it leaves the model unaware of that
transition. The existing core runtime steers the model to wrap up in
this case; the extension path should do the same through an explicit
host capability.
## Testing
- `just fmt`
- `cargo test -p codex-goal-extension`
- `cargo test -p codex-extension-api`
## Why
`main` picked up two small Rust build failures after nearby merges:
- #23507 added a real handler for
`ServerNotification::ThreadSettingsUpdated`, but the same variant was
still listed in the ignored-notification match arm. Full Clippy runs
treat the resulting unreachable-pattern warning as an error.
- #23666 added `turn_id` and `truncation_policy` to
`codex_tools::ToolCall`, while the goal extension backend test fixtures
from the goal-extension work still used the old shape. That left
`codex-goal-extension` tests unable to compile once the branches met on
`main`.
## What changed
Removed the duplicate `ThreadSettingsUpdated` match pattern from
`tui/src/chatwidget/protocol.rs`.
Updated the goal extension test `tool_call` helper to populate the new
`ToolCall` fields, and reused that helper for the one direct literal
that still had the old field list.
## Verification
- `just fix -p codex-tui`
- `cargo test -p codex-goal-extension`
## What
- Preserve database accounting failures from the goal extension instead
of collapsing them into `None`
- Warn with turn/tool context when a flush fails
- Keep stop/abort accounting snapshots alive when the final flush did
not persist
## Why
PR #23696 can finish and discard a turn snapshot after
`account_thread_goal_usage` fails. That loses the final accumulated
accounting state silently. This follow-up keeps that failure explicit
and avoids deleting the local snapshot in the failing path.
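The intended behavior can be sketched like this, assuming a simplified in-memory snapshot map and a hypothetical `persist_usage` function standing in for `account_thread_goal_usage`:

```rust
use std::collections::HashMap;

/// Hypothetical accumulated usage for one turn.
#[derive(Debug, Clone, PartialEq)]
struct TurnSnapshot {
    tokens: u64,
}

/// Hypothetical error type for a failed accounting write.
#[derive(Debug, PartialEq)]
struct AccountingError(String);

/// Stand-in for the persistence call; `fail` simulates a database error.
fn persist_usage(snapshot: &TurnSnapshot, fail: bool) -> Result<u64, AccountingError> {
    if fail {
        Err(AccountingError("database is locked".to_string()))
    } else {
        Ok(snapshot.tokens)
    }
}

/// Flushes the final snapshot for a turn and drops it only after a successful write.
/// `Ok(None)` means there was nothing to flush; `Err` keeps the failure visible.
fn finish_turn(
    snapshots: &mut HashMap<String, TurnSnapshot>,
    turn_id: &str,
    fail: bool,
) -> Result<Option<u64>, AccountingError> {
    let Some(snapshot) = snapshots.get(turn_id) else {
        return Ok(None);
    };
    // On error, `?` returns early and the snapshot stays in the map.
    let persisted = persist_usage(snapshot, fail)?;
    snapshots.remove(turn_id);
    Ok(Some(persisted))
}
```

Returning `Result<Option<_>, _>` instead of a bare `Option` is what lets the caller warn with turn and tool context rather than treating a failed write like "no goal".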
## Testing
- `just fmt`
- `cargo test -p codex-goal-extension`
## Why
The goal extension can create and surface goals, but the live
turn-accounting path still stops short of persisting active-goal
progress. That leaves token and wall-clock usage, plus
`ThreadGoalUpdated` events, out of sync with the extension boundary once
work actually advances or a goal transitions out of active state.
## What changed
- Teach `GoalAccountingState` to track the current turn, active goal,
token deltas, and wall-clock progress snapshots against the persisted
goal id.
- Flush active-goal accounting from tool-finish, turn-stop, and
turn-abort lifecycle hooks, and emit `ThreadGoalUpdated` events when
persisted progress changes.
- Route `create_goal` and `update_goal` through the same accounting
state so new goals start from the right baseline, final progress is
flushed before status changes, and `update_goal` can mark a goal
`blocked` as well as `complete`.
- Keep budget-limited goals accruing through the end of the turn while
clearing local active-goal state once a turn or explicit update is
finished.
- Expand backend and lifecycle coverage around store ids, baseline
reset, tool-finish accounting, budget-limited carry-through, and
blocked-goal updates.
## Testing
- Added focused backend coverage in
`codex-rs/ext/goal/tests/goal_extension_backend.rs` for baseline reset,
tool-finish accounting, budget-limited turns, and blocked-goal updates.
- Extended `codex-rs/core/src/session/tests.rs` to assert that lifecycle
inputs expose the expected session, thread, and turn store ids.
## Why
The goal extension needs more context when a turn starts than
`turn_store` alone provides.
In particular, goal accounting needs the stable turn id, the effective
collaboration mode, and the cumulative token-usage baseline captured at
turn start so it can:
- suppress goal accounting for plan-mode turns
- compute exact per-turn deltas from cumulative `total_token_usage`
snapshots instead of relying on the most recent usage event alone
- keep the extension-owned accounting path aligned with the host turn
lifecycle
## What
- extend `codex_extension_api::TurnStartInput` to expose `turn_id`,
`collaboration_mode`, and `token_usage_at_turn_start`
- pass the full `TurnContext` plus the captured token-usage baseline
through the turn-start lifecycle emission path
- initialize goal turn accounting from the turn-start baseline and
collaboration mode
- switch goal token accounting to compute deltas from cumulative
`total_token_usage` snapshots
- add coverage for the new turn-start lifecycle fields and for
goal-accounting baseline behavior
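A minimal sketch of baseline-aware delta accounting follows. It assumes token usage can be reduced to a single `u64` total (the real usage snapshot likely carries more fields), and `TurnAccounting` and `take_delta` are hypothetical names:

```rust
/// Hypothetical collaboration mode; only plan mode matters for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CollaborationMode {
    Default,
    Plan,
}

/// Hypothetical per-turn accounting seeded from the turn-start baseline.
struct TurnAccounting {
    mode: CollaborationMode,
    /// Cumulative total already attributed; starts at the turn-start baseline.
    last_accounted_total: u64,
}

impl TurnAccounting {
    fn start(mode: CollaborationMode, token_usage_at_turn_start: u64) -> Self {
        Self {
            mode,
            last_accounted_total: token_usage_at_turn_start,
        }
    }

    /// Returns the tokens newly attributable to the goal, given the latest
    /// cumulative `total_token_usage`. Plan-mode turns contribute nothing.
    fn take_delta(&mut self, total_token_usage: u64) -> u64 {
        if self.mode == CollaborationMode::Plan {
            return 0;
        }
        // Saturating subtraction guards against a snapshot that lags the baseline.
        let delta = total_token_usage.saturating_sub(self.last_accounted_total);
        self.last_accounted_total = self.last_accounted_total.max(total_token_usage);
        delta
    }
}
```

Working from cumulative totals makes repeated flushes idempotent: replaying the same snapshot yields a delta of zero instead of double counting.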
## Testing
- added `turn_start_lifecycle_exposes_turn_metadata_and_token_baseline`
in `codex-rs/core/src/session/tests.rs`
- added `ext/goal/tests/accounting.rs` coverage for baseline-aware goal
accounting and plan-mode suppression
## Why
`ext/goal` already had the tool specs and contributor wiring for
`/goal`, but the installed tools still depended on a placeholder backend
that always errored. That meant the extension could not actually own
goal persistence even though the dedicated `thread_goals` store already
exists.
This change wires the extension tools directly to the dedicated goal
store so the extension can create, read, and complete goals against real
state instead of falling back to host-side placeholders.
## What changed
- make `install_with_backend(...)` require
`Arc<codex_state::StateRuntime>` so goal storage is always available
when the extension is installed
- remove the unused no-backend/public backend abstraction from
`ext/goal` and have the tool executors talk directly to `StateRuntime`
- map `thread_goals` rows into the existing protocol response shape for
`get_goal`, `create_goal`, and `update_goal`
- preserve current thread-list behavior by filling an empty thread
preview from the goal objective when a goal is created through the
extension path
- add integration coverage for the installed tool surface, including
successful goal creation and duplicate-create rejection
## Testing
- `cargo test -p codex-goal-extension`
## Why
Extensions that need to track runtime progress currently have no typed
host signal for tool execution. The goal extension in particular needs
to observe tool attempts without inspecting tool payloads, owning tool
implementations, or staying coupled to core-only runtime plumbing.
This adds a narrow lifecycle contributor API for host-owned tool
execution: extensions can observe when an accepted tool call starts and
how it finishes, while policy hooks and tool handlers continue to own
payload rewriting, blocking, and execution.
Relevant code:
- [`ToolLifecycleContributor`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors.rs#L119)
defines the extension-facing observer contract.
- [`tool_lifecycle.rs`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/ext/extension-api/src/contributors/tool_lifecycle.rs)
defines the typed start/finish inputs, source, and outcome enums.
- [`notify_tool_start` /
`notify_tool_finish`](https://github.com/openai/codex/blob/3ad2850ffc7d8a1da19c65a92425637a59098f1b/codex-rs/core/src/tools/lifecycle.rs)
bridge core tool dispatch into the extension registry.
## What Changed
- Added `ToolLifecycleContributor` to `codex-extension-api`, including:
  - `ToolStartInput`
  - `ToolFinishInput`
  - `ToolCallSource`
  - `ToolCallOutcome`
- Added registration and lookup support on `ExtensionRegistryBuilder` /
`ExtensionRegistry`.
- Wired core tool dispatch to notify lifecycle contributors for:
  - accepted tool starts
  - completed tool calls, including the tool output success marker
  - pre-tool-use blocks
  - failures before or after the handler runs
  - cancellation/abort in the parallel tool path
- Registered the goal extension as a lifecycle contributor and added the
outcome filter it will use for goal progress accounting.
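The observer contract can be pictured with the sketch below. The type names echo the list above, but every field, method name, and enum variant here is an assumption made for illustration; the linked source is the authority on the real shapes. `Recorder` and `dispatch` are hypothetical.

```rust
use std::sync::Mutex;

/// Assumed outcome variants, loosely following the cases listed above.
#[derive(Debug, Clone, PartialEq)]
enum ToolCallOutcome {
    Completed { success: bool },
    Blocked,
    Failed,
    Aborted,
}

/// Assumed start/finish inputs: identifiers only, no tool payloads.
struct ToolStartInput<'a> {
    call_id: &'a str,
    tool_name: &'a str,
}

struct ToolFinishInput<'a> {
    call_id: &'a str,
    outcome: ToolCallOutcome,
}

/// Observer-only contract: contributors see events but cannot rewrite or block calls.
trait ToolLifecycleContributor {
    fn on_tool_start(&self, _input: &ToolStartInput<'_>) {}
    fn on_tool_finish(&self, _input: &ToolFinishInput<'_>) {}
}

/// Hypothetical contributor that records notification order.
struct Recorder {
    log: Mutex<Vec<String>>,
}

impl ToolLifecycleContributor for Recorder {
    fn on_tool_start(&self, input: &ToolStartInput<'_>) {
        let entry = format!("start:{}:{}", input.call_id, input.tool_name);
        self.log.lock().unwrap().push(entry);
    }

    fn on_tool_finish(&self, input: &ToolFinishInput<'_>) {
        let entry = format!("finish:{}:{:?}", input.call_id, input.outcome);
        self.log.lock().unwrap().push(entry);
    }
}

/// Host-side dispatch: notify start, run the handler, then notify finish.
fn dispatch(
    contributors: &[&dyn ToolLifecycleContributor],
    call_id: &str,
    tool_name: &str,
    run: impl FnOnce() -> Result<bool, String>,
) -> ToolCallOutcome {
    let start = ToolStartInput { call_id, tool_name };
    for contributor in contributors {
        contributor.on_tool_start(&start);
    }
    let outcome = match run() {
        Ok(success) => ToolCallOutcome::Completed { success },
        Err(_) => ToolCallOutcome::Failed,
    };
    let finish = ToolFinishInput {
        call_id,
        outcome: outcome.clone(),
    };
    for contributor in contributors {
        contributor.on_tool_finish(&finish);
    }
    outcome
}
```

Keeping the inputs free of payloads is what lets an extension count tool attempts without coupling to any tool's implementation.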
## Test Coverage
- Added `dispatch_notifies_tool_lifecycle_contributors` to cover
lifecycle notification ordering and outcomes for successful and
handler-failed tool calls.
## Why
Goal creation and completion are moving through the goal extension, but
the rest of Codex still observes goal state through `ThreadGoalUpdated`
events. Without an event from the extension-owned tool path, a
model-initiated `create_goal` or `update_goal` can mutate the backend
and return a tool result while app-server and TUI listeners miss the
goal state transition.
## What changed
- Added `GoalEventEmitter` as a small wrapper around the host
`ExtensionEventSink` to build `EventMsg::ThreadGoalUpdated` events for
goal updates.
- Threaded the registry event sink into `GoalExtension` and the
`GoalToolExecutor`s created by the extension. The public
`GoalExtension::new` constructor keeps a `NoopExtensionEventSink`
fallback for standalone use.
- Emitted a goal update after successful `create_goal` and `update_goal`
tool calls. Until `ToolCall` exposes the current turn submission id,
these events use the tool call id as the event id and leave `turn_id`
unset.
Relevant code:
- [`GoalEventEmitter::thread_goal_updated`](https://github.com/openai/codex/blob/1fe2d73890df9a50996f67f705d4da4cc3d4b866/codex-rs/ext/goal/src/events.rs#L19-L32)
- [`GoalToolExecutor` emission
points](https://github.com/openai/codex/blob/1fe2d73890df9a50996f67f705d4da4cc3d4b866/codex-rs/ext/goal/src/tool.rs#L161-L190)
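As a rough picture of the wrapper described above, assuming a hypothetical `EventSink` trait and `GoalEvent` struct in place of the real `ExtensionEventSink` and `EventMsg::ThreadGoalUpdated` types:

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a goal-updated event.
#[derive(Debug, Clone, PartialEq)]
struct GoalEvent {
    id: String,
    turn_id: Option<String>,
    objective: String,
}

/// Assumed shape of the host event sink.
trait EventSink: Send + Sync {
    fn emit(&self, event: GoalEvent);
}

/// Fallback sink for standalone use: events are dropped.
struct NoopSink;

impl EventSink for NoopSink {
    fn emit(&self, _event: GoalEvent) {}
}

/// Test sink that records emitted events.
#[derive(Default)]
struct RecordingSink {
    events: Mutex<Vec<GoalEvent>>,
}

impl EventSink for RecordingSink {
    fn emit(&self, event: GoalEvent) {
        self.events.lock().unwrap().push(event);
    }
}

/// Sketch of the emitter: a thin wrapper that builds the event and forwards it.
struct GoalEventEmitterSketch {
    sink: Arc<dyn EventSink>,
}

impl GoalEventEmitterSketch {
    /// Standalone construction falls back to the no-op sink.
    fn standalone() -> Self {
        Self {
            sink: Arc::new(NoopSink),
        }
    }

    fn with_sink(sink: Arc<dyn EventSink>) -> Self {
        Self { sink }
    }

    /// Uses the tool call id as the event id and leaves `turn_id` unset.
    fn thread_goal_updated(&self, tool_call_id: &str, objective: &str) {
        self.sink.emit(GoalEvent {
            id: tool_call_id.to_string(),
            turn_id: None,
            objective: objective.to_string(),
        });
    }
}
```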
## Testing
- `cargo test -p codex-goal-extension`
## Why
Extension lifecycle hooks sit on the host/extension boundary, but the
current trait surface only allows synchronous callbacks. That forces
extensions that need to seed, rehydrate, observe, or flush
extension-owned state during thread and turn transitions to either block
inside the callback or move async work into separate host plumbing.
This PR makes those lifecycle callbacks awaitable so extension
implementations can perform async work directly at the lifecycle point
where the host already has the relevant session, thread, or turn stores
available.
## What changed
- Makes `ThreadLifecycleContributor` and `TurnLifecycleContributor`
async in `codex-extension-api`.
- Awaits thread start/resume/stop and turn start/stop/abort lifecycle
callbacks from `codex-core`.
- Updates the guardian and memories extensions to implement the async
lifecycle trait surface.
- Updates the existing lifecycle tests to use async contributor
implementations.
- Adds `async-trait` to the crates that now expose or implement these
async object-safe lifecycle traits.
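To show what an awaitable, object-safe lifecycle callback looks like, here is a std-only sketch. It hand-writes the boxed-future return type that `async-trait` generates in roughly this form; `TurnLifecycle`, `Recorder`, `notify_turn_stop`, and the toy `block_on` executor are hypothetical and not part of `codex-extension-api`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical async lifecycle trait, usable as `dyn TurnLifecycle`.
trait TurnLifecycle: Send + Sync {
    fn on_turn_stop<'a>(&'a self, turn_id: &'a str) -> BoxFuture<'a, ()>;
}

/// Hypothetical contributor that records the turns it saw stop.
struct Recorder {
    log: Mutex<Vec<String>>,
}

impl TurnLifecycle for Recorder {
    fn on_turn_stop<'a>(&'a self, turn_id: &'a str) -> BoxFuture<'a, ()> {
        Box::pin(async move {
            // A real extension could await a database flush here.
            self.log.lock().unwrap().push(format!("stop:{turn_id}"));
        })
    }
}

/// Host side: await each contributor in registration order.
async fn notify_turn_stop(contributors: &[Arc<dyn TurnLifecycle>], turn_id: &str) {
    for contributor in contributors {
        contributor.on_turn_stop(turn_id).await;
    }
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for this sketch only: it busy-polls until the future is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = std::pin::pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

Because the host awaits each callback in turn, ordering guarantees such as thread stop and turn abort ordering carry over from the synchronous version.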
## Testing
- Existing `codex-core` lifecycle tests were updated to cover async
implementations for thread stop and turn abort ordering.