mirror of https://github.com/pchuan98/codex.git synced 2026-07-01 00:31:56 +08:00


Owen Lin a107b84967 feat(protocol): define missing rollout turn items (#30282)

## Description

This PR adds canonical core `TurnItem` shapes for command execution,
dynamic tool calls, collab agent tool calls, and sub-agent activity, to
be stored in the rollout file soon.

It also teaches app-server protocol / `ThreadHistoryBuilder` how to
render those items, and adds the small legacy fanout helpers needed for
existing event-based consumers. No core producer or rollout persistence
behavior changes here; that will be done in a follow-up.

## Making ThreadHistoryBuilder stateless

This is the first PR in a stack to make `ThreadHistoryBuilder` stateless
enough that we can materialize app-server `ThreadItem`s from only a
given slice of `RolloutItem` history, without ever needing to replay the
whole thread from the beginning.

The persisted legacy `RolloutItem::EventMsg` records are mostly shaped
like live UI events, not like materialized `ThreadItem`s. They work if
we replay the full rollout in order, but they often do not contain
enough stable identity or complete item state to project an arbitrary
suffix on its own.

A few examples:

- `UserMessageEvent` and `AgentMessageEvent` have content, but
historically do not carry the persisted app-server item ID that should
become the SQLite primary key.
- `AgentReasoningEvent` and `AgentReasoningRawContentEvent` are
fragments. `ThreadHistoryBuilder` currently merges them into the last
reasoning item, which means a slice starting in the middle of reasoning
cannot know whether to append to an earlier item or create a new one.
- `WebSearchEndEvent`, `McpToolCallEndEvent`, collab end events, and
similar legacy events can often render a final-looking item, but they
usually rely on prior replay state to know which turn owns the item.
- Begin/end legacy events are partial views of one logical item. The
builder correlates them by `call_id` and mutates prior state to
synthesize the final `ThreadItem`.
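
To make the begin/end problem concrete, here is a minimal sketch of a replay that correlates events by `call_id`. The types `LegacyEvent` and `CommandItem` and the function `replay` are hypothetical simplifications for illustration, not the actual `ThreadHistoryBuilder` code.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified legacy events: each is a partial view of one item.
#[derive(Debug, Clone, PartialEq)]
pub enum LegacyEvent {
    Begin { call_id: String, turn_id: String, command: String },
    End { call_id: String, exit_code: i32 },
}

/// Hypothetical materialized item assembled from a Begin/End pair.
#[derive(Debug, Clone, PartialEq)]
pub struct CommandItem {
    pub turn_id: String,
    pub command: String,
    pub exit_code: Option<i32>,
}

/// Replay events in order, correlating Begin and End by `call_id`.
pub fn replay(events: &[LegacyEvent]) -> HashMap<String, CommandItem> {
    let mut items = HashMap::new();
    for event in events {
        match event {
            LegacyEvent::Begin { call_id, turn_id, command } => {
                items.insert(
                    call_id.clone(),
                    CommandItem {
                        turn_id: turn_id.clone(),
                        command: command.clone(),
                        exit_code: None,
                    },
                );
            }
            LegacyEvent::End { call_id, exit_code } => {
                // End carries no turn or command. If the matching Begin is
                // not in this slice, there is nothing to complete.
                if let Some(item) = items.get_mut(call_id) {
                    item.exit_code = Some(*exit_code);
                }
            }
        }
    }
    items
}
```

In this sketch, replaying the full slice yields a completed item, while replaying a slice that starts at the `End` event yields nothing, which is the suffix problem described above.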

That is the problem this direction fixes. A persisted canonical
lifecycle record looks much closer to the read model we actually want
later:

```rust
ItemCompletedEvent {
    turn_id,
    item: TurnItem { id, ...full snapshot... },
    completed_at_ms,
}
```

Once rollout has explicit `turn_id`, stable `item.id`, and a canonical
completed item snapshot, the future SQLite projector can reduce only the
new rollout suffix and upsert the affected `thread_items` rows. It no
longer needs to synthesize `item-N`, infer item ownership from the
active turn, or replay earlier events just to reconstruct the current
item snapshot.
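
As a rough illustration of that reduction, the sketch below applies only a suffix of completed-item records to an in-memory map standing in for the `thread_items` table. Everything here is an assumption for illustration: the `text` field, the `ThreadItemRow` shape, and `project_suffix` are hypothetical, and the real projector would write to SQLite.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a canonical item snapshot.
#[derive(Debug, Clone, PartialEq)]
pub struct TurnItem {
    pub id: String,
    pub text: String,
}

/// Hypothetical, simplified stand-in for the persisted lifecycle record.
#[derive(Debug, Clone, PartialEq)]
pub struct ItemCompletedEvent {
    pub turn_id: String,
    pub item: TurnItem,
    pub completed_at_ms: u64,
}

/// One row of the read model; the map key plays the role of the primary key.
#[derive(Debug, Clone, PartialEq)]
pub struct ThreadItemRow {
    pub turn_id: String,
    pub text: String,
    pub completed_at_ms: u64,
}

/// Upsert rows from a rollout suffix. Each record names its own turn and
/// carries a full snapshot, so no earlier records are consulted. A repeated
/// item ID overwrites the previous row.
pub fn project_suffix(
    rows: &mut HashMap<String, ThreadItemRow>,
    suffix: &[ItemCompletedEvent],
) {
    for event in suffix {
        rows.insert(
            event.item.id.clone(),
            ThreadItemRow {
                turn_id: event.turn_id.clone(),
                text: event.item.text.clone(),
                completed_at_ms: event.completed_at_ms,
            },
        );
    }
}
```

Because the key is the stable `item.id`, applying the same suffix twice leaves the map unchanged, which is the property an incremental projector would rely on.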

## What changed

- Added core `TurnItem` variants and item structs for command execution,
dynamic tool calls, collab agent tool calls, and sub-agent activity.
- Added conversions from those canonical items back into the legacy
event shapes where current consumers still need them.
- Added app-server v2 `ThreadItem` conversion for the new core item
variants.
- Taught `ThreadHistoryBuilder` and rollout persistence metrics to
recognize the new item variants.
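
The legacy fanout idea can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the `TurnItem` variant fields, the `LegacyEvent` shapes, the two helper functions, and the choice to reuse the item ID as the legacy `call_id` are all hypothetical.

```rust
/// Hypothetical, simplified canonical item; the real `TurnItem` has more
/// variants and fields.
#[derive(Debug, Clone, PartialEq)]
pub enum TurnItem {
    CommandExecution {
        id: String,
        command: String,
        exit_code: Option<i32>,
    },
}

/// Hypothetical legacy event shapes keyed by `call_id`.
#[derive(Debug, Clone, PartialEq)]
pub enum LegacyEvent {
    ExecBegin { call_id: String, command: String },
    ExecEnd { call_id: String, exit_code: i32 },
}

/// Legacy view of an item that has just started.
pub fn legacy_for_started(item: &TurnItem) -> Vec<LegacyEvent> {
    match item {
        TurnItem::CommandExecution { id, command, .. } => vec![LegacyEvent::ExecBegin {
            // Assumption: the canonical item ID doubles as the legacy call ID.
            call_id: id.clone(),
            command: command.clone(),
        }],
    }
}

/// Legacy view of a completed item. In this sketch an item without an exit
/// code produces no end event.
pub fn legacy_for_completed(item: &TurnItem) -> Vec<LegacyEvent> {
    match item {
        TurnItem::CommandExecution { id, exit_code: Some(code), .. } => {
            vec![LegacyEvent::ExecEnd { call_id: id.clone(), exit_code: *code }]
        }
        TurnItem::CommandExecution { exit_code: None, .. } => Vec::new(),
    }
}
```

Returning a `Vec` keeps room for items that fan out to zero or several legacy events, so event-based consumers can keep reading the old shapes while the canonical item is the stored form.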

## Follow-up

The next PR (https://github.com/openai/codex/pull/30283) switches the live
core producers for these item families onto canonical `ItemStarted` /
`ItemCompleted` events.

a107b84967 · 2026-06-26 16:44:34 -07:00


| Name | Last commit | Date |
| --- | --- | --- |
| `src` | feat(protocol): define missing rollout turn items (#30282) | 2026-06-26 16:44:34 -07:00 |
| `templates` | [codex] Consolidate shared prompts in codex-prompts (#25151) | 2026-06-01 18:45:07 +00:00 |
| `tests` | [codex] allow AGENTS.md and skills to authorize delegation (#30274) | 2026-06-26 12:17:26 -07:00 |
| `BUILD.bazel` | Run core integration tests against a Wine-backed Windows executor (#28401) | 2026-06-16 00:38:41 +00:00 |
| `Cargo.toml` | [codex] Inject agent graph store into ThreadManager (#29736) | 2026-06-24 13:24:10 -07:00 |
| `config.schema.json` | [codex] wire process-owned code mode host into core (#30142) | 2026-06-26 00:23:33 -07:00 |
| `gpt_5_1_prompt.md` | chore(core) rm AskForApproval::OnFailure (#28418) | 2026-06-23 12:13:54 -07:00 |
| `gpt_5_2_prompt.md` | chore(core) rm AskForApproval::OnFailure (#28418) | 2026-06-23 12:13:54 -07:00 |
| `gpt_5_codex_prompt.md` | Assemble sandbox/approval/network prompts dynamically (#8961) | 2026-01-12 23:12:59 +00:00 |
| `gpt-5.1-codex-max_prompt.md` | Assemble sandbox/approval/network prompts dynamically (#8961) | 2026-01-12 23:12:59 +00:00 |
| `gpt-5.2-codex_prompt.md` | Assemble sandbox/approval/network prompts dynamically (#8961) | 2026-01-12 23:12:59 +00:00 |
| `prompt_with_apply_patch_instructions.md` | chore(core) rm AskForApproval::OnFailure (#28418) | 2026-06-23 12:13:54 -07:00 |
| `README.md` | test: branch on target OS instead of runner flavor (#29712) | 2026-06-23 14:27:13 -07:00 |

README.md

# codex-core

This crate implements the business logic for Codex. It is designed to be used by the various Codex UIs written in Rust.

## Wine-exec integration tests

On x86-64 Linux, run the shared suite against the Windows exec server with `bazel test //codex-rs/core:core-all-wine-exec-test`.

Local execution targets the host OS, Docker targets Linux, and Wine exec targets Windows. Choose the skip macro by what the test depends on:

- `skip_if_target_windows!`: Windows target behavior.
- `skip_if_host_windows!`: Windows host constraints.
- `skip_if_remote!`: Local-only test behavior.
- `skip_if_no_remote_env!`: Remote-only test behavior.
- `skip_if_wine_exec!`: Wine-specific runner debt.

## Dependencies

Note that codex-core makes some assumptions about certain helper utilities being available in the environment. Currently, this support matrix is:

### macOS

Expects `/usr/bin/sandbox-exec` to be present.

When using the workspace-write sandbox policy, the Seatbelt profile allows writes under the configured writable roots while keeping `.git` (directory or pointer file), the resolved `gitdir:` target, and `.codex` read-only.

Network access and filesystem read/write roots are controlled by SandboxPolicy. Seatbelt consumes the resolved policy and enforces it.

Seatbelt also keeps the legacy default preferences read access (user-preference-read) needed for cfprefs-backed macOS behavior.

### Linux

Expects the binary containing `codex-core` to run the equivalent of `codex sandbox` when `arg0` is `codex-linux-sandbox`. See the `codex-arg0` crate for details.

Legacy SandboxPolicy / sandbox_mode configs are still supported on Linux. They can continue to use the legacy Landlock path when the split filesystem policy is sandbox-equivalent to the legacy model after cwd resolution. Split filesystem policies that need direct FileSystemSandboxPolicy enforcement, such as read-only or denied carveouts under a broader writable root, automatically route through bubblewrap. The legacy Landlock path is used only when the split filesystem policy round-trips through the legacy SandboxPolicy model without changing semantics. That includes overlapping cases like /repo = write, /repo/a = none, /repo/a/b = write, where the more specific writable child must reopen under a denied parent.
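
The overlapping example can be read as "the most specific matching entry wins". The sketch below illustrates that reading only; it is an assumption about the intended semantics, and `Access`, `resolve_access`, and the tuple-based policy are hypothetical names, not the `FileSystemSandboxPolicy` API.

```rust
use std::path::Path;

/// Hypothetical access levels; `Deny` corresponds to `none` in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    Deny,
    Read,
    Write,
}

/// Return the access of the deepest policy root that contains `path`,
/// or `None` when no entry covers it.
pub fn resolve_access(entries: &[(&str, Access)], path: &str) -> Option<Access> {
    let path = Path::new(path);
    let mut best: Option<(usize, Access)> = None;
    for (root, access) in entries {
        let root = Path::new(root);
        // `Path::starts_with` compares whole components, so `/repo/ab`
        // is not treated as being under `/repo/a`.
        if path.starts_with(root) {
            let depth = root.components().count();
            if best.map_or(true, |(d, _)| depth > d) {
                best = Some((depth, *access));
            }
        }
    }
    best.map(|(_, access)| access)
}
```

With the entries `/repo = write`, `/repo/a = none`, `/repo/a/b = write`, this resolves `/repo/a/secret` to `Deny` and `/repo/a/b/file` to `Write`, which is the "reopen under a denied parent" case a single flat list of writable roots cannot express.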

The Linux sandbox helper prefers the first `bwrap` found on `PATH` outside the current working directory whenever it is available. If `bwrap` is present but too old to support `--argv0`, the helper keeps using system bubblewrap and switches to a no-`--argv0` compatibility path for the inner re-exec. If `bwrap` is missing, it falls back to the bundled `codex-resources/bwrap` binary shipped with Codex, and Codex surfaces a startup warning through its normal notification path instead of printing directly from the sandbox helper. Codex also surfaces a startup warning when bubblewrap cannot create user namespaces. WSL2 uses the normal Linux bubblewrap path. WSL1 is not supported for bubblewrap sandboxing because it cannot create the required user namespaces, so Codex rejects sandboxed shell commands that would enter the bubblewrap path before invoking `bwrap`.
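
The selection order can be sketched as a pure function. This is a simplified illustration, not the helper's actual code: `BwrapChoice` and `choose_bwrap` are hypothetical names, "outside the current working directory" is interpreted here as "the `PATH` directory is not under `cwd`", and relative `PATH` entries and the `--argv0` version check are left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical result of the lookup.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BwrapChoice {
    /// A system `bwrap` found on PATH.
    System(PathBuf),
    /// No usable system `bwrap`; use the bundled binary.
    Bundled,
}

/// Pick the first `bwrap` on PATH whose directory is not under `cwd`.
/// `exists` is injected so the logic can be exercised without touching
/// the filesystem.
pub fn choose_bwrap(
    path_dirs: &[PathBuf],
    cwd: &Path,
    exists: impl Fn(&Path) -> bool,
) -> BwrapChoice {
    for dir in path_dirs {
        // Skip PATH entries inside the current working directory.
        if dir.starts_with(cwd) {
            continue;
        }
        let candidate = dir.join("bwrap");
        if exists(&candidate) {
            return BwrapChoice::System(candidate);
        }
    }
    BwrapChoice::Bundled
}
```

A caller could pass the directories from `std::env::split_paths` over `PATH` and `|p| p.is_file()` as the predicate. Skipping entries under the working directory avoids running a `bwrap` that a checked-out repository placed on `PATH`.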

### Windows

Legacy SandboxPolicy / sandbox_mode configs are still supported on Windows. Legacy read-only and workspace-write policies imply full filesystem read access; exact readable roots are represented by split filesystem policies instead.

The elevated Windows sandbox also supports:

- legacy `ReadOnly` and `WorkspaceWrite` behavior
- split filesystem policies that need exact readable roots, exact writable roots, or extra read-only carveouts under writable roots
- backend-managed system read roots required for basic execution, such as `C:\Windows`, `C:\Program Files`, `C:\Program Files (x86)`, and `C:\ProgramData`, when a split filesystem policy requests platform defaults

The unelevated restricted-token backend still supports the legacy full-read Windows model for legacy ReadOnly and WorkspaceWrite behavior. It also supports a narrow split-filesystem subset: full-read split policies whose writable roots still match the legacy WorkspaceWrite root set, but add extra read-only carveouts under those writable roots.

New [permissions] / split filesystem policies remain supported on Windows only when they can be enforced directly by the selected Windows backend or round-trip through the legacy SandboxPolicy model without changing semantics. Policies that would require direct explicit unreadable carveouts (none) or reopened writable descendants under read-only carveouts still fail closed instead of running with weaker enforcement.

### All Platforms

Expects the binary containing `codex-core` to simulate the virtual `apply_patch` CLI when `arg1` is `--codex-run-as-apply-patch`. See the `codex-arg0` crate for details.
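
The two argument-based expectations (the Linux `arg0` check and the `arg1` check above) amount to a small dispatch at startup. The sketch below is an illustration under assumptions, not the `codex-arg0` implementation: `Dispatch` and `dispatch` are hypothetical names, and comparing only the file name of `arg0` is an assumption.

```rust
use std::path::Path;

/// Hypothetical outcome of inspecting the process arguments.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Dispatch {
    /// Behave like `codex sandbox` on Linux.
    LinuxSandbox,
    /// Simulate the virtual `apply_patch` CLI.
    ApplyPatch,
    /// Run the normal entry point.
    Normal,
}

/// Decide how to run based on `arg0` and `arg1`.
pub fn dispatch(args: &[String]) -> Dispatch {
    // Assumption: only the final path component of arg0 matters, so a
    // symlink such as /usr/local/bin/codex-linux-sandbox also matches.
    let arg0_name = args
        .first()
        .and_then(|arg0| Path::new(arg0).file_name())
        .and_then(|name| name.to_str());
    if arg0_name == Some("codex-linux-sandbox") {
        return Dispatch::LinuxSandbox;
    }
    if args.get(1).map(String::as_str) == Some("--codex-run-as-apply-patch") {
        return Dispatch::ApplyPatch;
    }
    Dispatch::Normal
}
```

A binary embedding `codex-core` could call something like this with `std::env::args().collect::<Vec<_>>()` before parsing its regular command line.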