Files
codex/codex-rs/core
T
jif-oai 01cb97851b Compress cold local rollouts (#25089)
## Rollout compression stack

This stack splits #24941 into reviewable steps for local rollout
compression. The design is intentionally staged:

1. Teach readers, listing, search, and lookup to understand compressed
rollouts.
2. Make append and resume paths materialize compressed rollouts back to
plain JSONL before writing.
3. Add a disabled-by-default worker that can compress cold archived
rollouts behind `local_thread_store_compression`.

The key invariant is that writers append to plain `.jsonl`. A
`.jsonl.zst` file is a cold/read representation; if a write is needed,
the compressed file is materialized back to plain JSONL first. Readers
prefer plain `.jsonl` when both forms exist and can fall back to the
compressed sibling during transitions.

The worker is deliberately the last PR and remains behind an
under-development feature flag. It currently scans only
`archived_sessions`, not active `sessions`, because active sessions have
the highest resume/append race risk. That means this stack does not yet
compress most unarchived local history.

## Known race / follow-up

The remaining unresolved design question is writer/compressor
coordination. Even for archived rollouts, a resume or metadata update
can append while the worker is replacing the plain file with
`.jsonl.zst`; the current double-stat checks narrow but do not fully
eliminate the window where a writer has opened the plain file before
unlink. Do not treat the worker PR as production-ready until we either:

- prevent append/resume paths from racing archived compression, or
- introduce a shared representation/append lock or equivalent
coordination.

The first two PRs are useful independently: they make compressed
rollouts readable and make append paths safely recover back to plain
JSONL. The third PR isolates the worker behavior so that coordination
issue is reviewable separately.

## Validation

Focused local validation for the stack includes:

- `just test -p codex-rollout`
- `just test -p codex-thread-store` where thread-store paths were
touched
- `just test -p codex-features` for the feature flag slice
- `just bazel-lock-check` after dependency graph changes
- scoped `just fix -p ...` passes for changed crates

CI is still the source of truth for the full platform matrix.

## This PR in the stack

This is PR 3/3, based on #25088. It adds the under-development feature
flag and starts the best-effort background worker when enabled. The
worker currently compresses only cold archived rollouts, skips active
sessions, verifies compressed output, preserves mtime and permissions,
keeps a store-level lock heartbeat, and cleans stale temp files.

Stack order:

1. #25087: read compressed local rollouts.
2. #25088: materialize compressed rollouts before append.
3. This PR: add the disabled local compression worker.
01cb97851b · 2026-06-01 18:35:58 +02:00
History
..
2026-06-01 18:35:58 +02:00

codex-core

This crate implements the business logic for Codex. It is designed to be used by the various Codex UIs written in Rust.

Dependencies

Note that codex-core makes some assumptions about certain helper utilities being available in the environment. Currently, this support matrix is:

macOS

Expects /usr/bin/sandbox-exec to be present.

When using the workspace-write sandbox policy, the Seatbelt profile allows writes under the configured writable roots while keeping .git (directory or pointer file), the resolved gitdir: target, and .codex read-only.

Network access and filesystem read/write roots are controlled by SandboxPolicy. Seatbelt consumes the resolved policy and enforces it.

Seatbelt also keeps the legacy default preferences read access (user-preference-read) needed for cfprefs-backed macOS behavior.

Linux

Expects the binary containing codex-core to run the equivalent of codex sandbox when arg0 is codex-linux-sandbox. See the codex-arg0 crate for details.

Legacy SandboxPolicy / sandbox_mode configs are still supported on Linux. They can continue to use the legacy Landlock path when the split filesystem policy is sandbox-equivalent to the legacy model after cwd resolution. Split filesystem policies that need direct FileSystemSandboxPolicy enforcement, such as read-only or denied carveouts under a broader writable root, automatically route through bubblewrap. The legacy Landlock path is used only when the split filesystem policy round-trips through the legacy SandboxPolicy model without changing semantics. That includes overlapping cases like /repo = write, /repo/a = none, /repo/a/b = write, where the more specific writable child must reopen under a denied parent.

The Linux sandbox helper prefers the first bwrap found on PATH outside the current working directory whenever it is available. If bwrap is present but too old to support --argv0, the helper keeps using system bubblewrap and switches to a no---argv0 compatibility path for the inner re-exec. If bwrap is missing, it falls back to the bundled codex-resources/bwrap binary shipped with Codex and Codex surfaces a startup warning through its normal notification path instead of printing directly from the sandbox helper. Codex also surfaces a startup warning when bubblewrap cannot create user namespaces. WSL2 uses the normal Linux bubblewrap path. WSL1 is not supported for bubblewrap sandboxing because it cannot create the required user namespaces, so Codex rejects sandboxed shell commands that would enter the bubblewrap path before invoking bwrap.

Windows

Legacy SandboxPolicy / sandbox_mode configs are still supported on Windows. Legacy read-only and workspace-write policies imply full filesystem read access; exact readable roots are represented by split filesystem policies instead.

The elevated Windows sandbox also supports:

  • legacy ReadOnly and WorkspaceWrite behavior
  • split filesystem policies that need exact readable roots, exact writable roots, or extra read-only carveouts under writable roots
  • backend-managed system read roots required for basic execution, such as C:\Windows, C:\Program Files, C:\Program Files (x86), and C:\ProgramData, when a split filesystem policy requests platform defaults

The unelevated restricted-token backend still supports the legacy full-read Windows model for legacy ReadOnly and WorkspaceWrite behavior. It also supports a narrow split-filesystem subset: full-read split policies whose writable roots still match the legacy WorkspaceWrite root set, but add extra read-only carveouts under those writable roots.

New [permissions] / split filesystem policies remain supported on Windows only when they can be enforced directly by the selected Windows backend or round-trip through the legacy SandboxPolicy model without changing semantics. Policies that would require direct explicit unreadable carveouts (none) or reopened writable descendants under read-only carveouts still fail closed instead of running with weaker enforcement.

All Platforms

Expects the binary containing codex-core to simulate the virtual apply_patch CLI when arg1 is --codex-run-as-apply-patch. See the codex-arg0 crate for details.