codex

Pipeline bounded AGENTS.md and Git root probes (#29870)

## Why

When Codex uses a remote `ExecutorFileSystem`, every `get_metadata` call
is an exec-server round trip. Upward discovery currently pays those
round trips serially in two latency-sensitive places:

- session startup, while locating the configured project root before
loading `AGENTS.md`; and
- Git-root discovery, which runs before per-turn Git diff enrichment.

The goal is to remove the serial ancestor dependency without adding a
new filesystem RPC, JSON-RPC batch method, Git executable dependency, or
cache.

## Example

Assume this layout, with `.git` as the configured project-root marker:

```text
/workspace/repo/.git
/workspace/repo/AGENTS.md
/workspace/repo/crates/core/    <- cwd
```

The marker probes have this required precedence:

```text
1. /workspace/repo/crates/core/.git
2. /workspace/repo/crates/.git
3. /workspace/repo/.git
4. /workspace/.git
5. /.git
```

Previously, probe 2 was not sent until probe 1 returned, and probe 3 was
not sent until probe 2 returned. With this change, the client lazily
keeps up to eight ordinary `fs/getMetadata` requests in flight, but
consumes their results in the order above. Codex must still learn that
probes 1 and 2 are absent before accepting probe 3, so the nearest root
always wins. Once probe 3 succeeds, the client has its answer and stops
awaiting probes 4 and 5. Requests that were already sent may still
finish on the worker.
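
The ordered window described above can be sketched with plain threads. This is a minimal illustration, not the helper added by this PR: `find_up`, the `Probe` alias, and the use of `std::thread` in place of async `fs/getMetadata` requests are all assumptions made for the example.

```rust
use std::collections::VecDeque;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for one metadata round trip: true if `path` exists.
type Probe = Arc<dyn Fn(&Path) -> bool + Send + Sync>;

/// Walks `cwd` upward lexically and returns the nearest directory containing
/// any of `markers`, keeping at most `window` probes in flight.
fn find_up(cwd: &Path, markers: &[&str], window: usize, probe: Probe) -> Option<PathBuf> {
    // Lazy candidate stream: nearest directory first, then marker order.
    let mut candidates = cwd.ancestors().flat_map(|dir| {
        markers
            .iter()
            .map(move |marker| (dir.to_path_buf(), dir.join(marker)))
    });
    let mut in_flight: VecDeque<(PathBuf, JoinHandle<bool>)> = VecDeque::new();

    loop {
        // Top up the window before waiting on the oldest probe.
        while in_flight.len() < window.max(1) {
            let Some((dir, path)) = candidates.next() else { break };
            let probe = Arc::clone(&probe);
            in_flight.push_back((dir, thread::spawn(move || (*probe)(&path))));
        }
        // Consume strictly in issue order so the nearest root always wins.
        // An empty queue means every candidate was absent.
        let (dir, handle) = in_flight.pop_front()?;
        if handle.join().unwrap_or(false) {
            // Later probes are no longer awaited; they may still finish.
            return Some(dir);
        }
    }
}
```

With a probe that reports only `/workspace/repo/.git` as present, calling `find_up` on `/workspace/repo/crates/core` with `&[".git"]` and a window of 8 returns `/workspace/repo`, even though probes 4 and 5 were issued in the same window.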

For the marker phase alone, with a 50 ms client-to-worker round trip and
fast local metadata calls, finding the root at probe 3 changes from
roughly three serialized round trips (150 ms) to one round trip plus
worker processing. The later `AGENTS.md` candidate phase remains
separate and ordered.

Only after `/workspace/repo` is selected does `AGENTS.md` discovery
check instruction candidates, in root-to-cwd order:

```text
/workspace/repo/AGENTS.override.md
/workspace/repo/AGENTS.md
/workspace/repo/crates/AGENTS.override.md
/workspace/repo/crates/AGENTS.md
/workspace/repo/crates/core/AGENTS.override.md
/workspace/repo/crates/core/AGENTS.md
```

The first configured candidate found in each directory wins. These
checks remain ordered and no instruction candidate above
`/workspace/repo` is issued. Git-root discovery uses the same bounded
lookup with only `.git` as the marker.
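
As a rough sketch of the root-to-cwd candidate order listed above, the following generates the same six paths. The function name `instruction_candidates` and its signature are hypothetical, and the sketch assumes `cwd` is lexically inside `root`.

```rust
use std::path::{Path, PathBuf};

/// Lists instruction candidates from `root` down to `cwd`. Within each
/// directory, `names` is given in precedence order.
fn instruction_candidates(root: &Path, cwd: &Path, names: &[&str]) -> Vec<PathBuf> {
    // `ancestors` yields `cwd` first; keep directories at or below `root`,
    // then reverse so the root comes first.
    let mut dirs: Vec<&Path> = cwd
        .ancestors()
        .take_while(|dir| dir.starts_with(root))
        .collect();
    dirs.reverse();
    dirs.iter()
        .flat_map(|dir| names.iter().map(move |name| dir.join(name)))
        .collect()
}
```

Because the walk stops at `root`, no candidate above `/workspace/repo` is ever produced, which matches the constraint stated above.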

## What changed

- Added a client-side find-up helper that generates `ancestor x marker`
probes lazily, nearest directory first and configured marker order
within each directory.
- Uses an ordered concurrency window of eight scalar metadata requests.
This bounds executor load while preserving nearest-root and marker
precedence.
- Reuses the helper for both configured project-root discovery and
remote Git-root discovery.
- Keeps Git ancestor and marker construction in `AbsolutePathBuf`,
converting only each complete `.git` probe to `PathUri`. This preserves
native paths that require an opaque URI fallback, such as Windows
namespace paths.
- Preserves existing error behavior: `AGENTS.md` discovery propagates
non-`NotFound` metadata errors, while Git discovery treats a failed
marker probe as absent and continues upward.
- Reads each discovered `AGENTS.md` directly instead of statting it a
second time.

No filesystem trait or exec-server protocol method is added. An empty
`project_root_markers` list performs no ancestor-marker I/O and checks
instruction candidates only in `cwd`. This change also deliberately does
not cache roots across turns.

## Symlinks

Upward traversal remains **lexical**. The helper does not canonicalize
`cwd`; it appends marker names to the supplied path and walks that
path's textual parents. The filesystem performs the actual metadata/read
operation, and the current local and exec-server implementations follow
live symlink targets.

For example:

```text
/tmp/pkg -> /workspace/repo/packages/pkg
cwd = /tmp/pkg/src
actual Git marker = /workspace/repo/.git
```

The lexical probes are `/tmp/pkg/src/.git`, `/tmp/pkg/.git`,
`/tmp/.git`, and `/.git`. They do not jump from `/tmp/pkg` to the
target's parent `/workspace/repo`, so this spelling of `cwd` does not
discover `/workspace/repo/.git`. That is the existing behavior and is
unchanged by this PR.
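
The lexical walk can be illustrated with the standard library alone. This is a sketch on Unix-style paths, with `lexical_probes` as a hypothetical name; it never touches the filesystem, so it cannot see that `/tmp/pkg` is a symlink.

```rust
use std::path::{Path, PathBuf};

/// Builds marker probes by walking the textual parents of `cwd`.
/// No canonicalization happens, so symlinks in `cwd` are not resolved.
fn lexical_probes(cwd: &Path, marker: &str) -> Vec<PathBuf> {
    cwd.ancestors().map(|dir| dir.join(marker)).collect()
}
```

For `cwd = /tmp/pkg/src` and marker `.git`, this yields exactly the four probes listed above, none of which is under `/workspace/repo`.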

Conversely, if `/tmp/repo -> /workspace/repo`, then probing
`/tmp/repo/.git` follows the directory symlink and finds
`/workspace/repo/.git`; the reported root remains the lexical path
`/tmp/repo`. A live symlink used directly as `.git`, another configured
marker, or `AGENTS.md` is also followed. A symlinked `AGENTS.md` is
loaded when its target is a regular file, while a broken symlink behaves
as `NotFound`.

jif · 2026-06-24 22:58:34 +01:00

39aab9fc45

[codex] make PathUri::from_abs_path infallible (#27976)

## Why

`PathUri::from_abs_path` can fail for absolute paths that do not have a
normal `file:` URI representation, forcing filesystem call sites to
handle a conversion error even though the original path can be preserved
losslessly.

## What

Make `from_abs_path` infallible and migrate its callers. Unrepresentable
paths use `file:///%00/bad/path/<base64>`, encoding Unix bytes or
Windows UTF-16LE; `to_abs_path` validates and decodes that fallback. The
leading encoded null reserves a namespace that cannot collide with a
real Unix or Windows path, and fallback URIs remain opaque to lexical
path operations.

## Validation

Added path-URI coverage for Unix null and non-UTF-8 paths, Windows
device/verbatim and non-Unicode paths, serialization, malformed
fallbacks, opaque lexical operations, invalid native payloads, and
literal `/bad/path` collision resistance.

Adam Perry @ OpenAI · 2026-06-12 16:58:42 -07:00

968a3ac9c1

[codex] migrate ExecutorFileSystem paths to PathUri (#27424)

## Why

We're moving exec-server to use PathUri for its internal path
representations.

## What

Move `ExecutorFileSystem` APIs to use `PathUri` instead of
`AbsolutePathBuf`. Future changes will convert higher-level parts of
exec-server.

Adam Perry @ OpenAI · 2026-06-11 18:44:18 +00:00

b2a4e3be27

[codex] preserve fsmonitor for worktree Git reads (#26880)

Codex forces `core.fsmonitor=false` on internal Git commands so a
repository cannot select an executable fsmonitor helper. This also
disables Git's built-in daemon for `status`, `diff`, and `ls-files`,
turning those worktree reads into full scans in large repositories.

Read the raw effective `core.fsmonitor` value and preserve it only when
Git interprets it as true and advertises built-in daemon support through
`git version --build-options`. Query uncommon boolean spellings back
through Git using the exact effective value. Unset, false, helper paths,
malformed values, probe failures, and unsupported Git builds continue to
force `core.fsmonitor=false`.
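
The policy in the paragraph above reduces to a small decision function. The sketch below is illustrative only: `FsmonitorProbe`, `FsmonitorDecision`, and `decide_fsmonitor` are hypothetical names, and the probe fields are assumed to have been filled in by the Git queries described above rather than computed here.

```rust
/// How an internal worktree Git command should treat `core.fsmonitor`.
#[derive(Debug, PartialEq, Eq)]
enum FsmonitorDecision {
    /// Force `core.fsmonitor=false` for the command.
    ForceDisabled,
    /// Keep the effective true value so the built-in daemon can be used.
    Preserve,
}

/// Hypothetical probe results gathered once per worktree workflow.
struct FsmonitorProbe<'a> {
    /// Raw effective `core.fsmonitor` value; `None` if unset or the probe failed.
    raw_value: Option<&'a str>,
    /// Whether Git itself interprets the raw value as boolean true.
    git_reads_true: bool,
    /// Whether `git version --build-options` advertises the built-in daemon.
    builtin_daemon_supported: bool,
}

/// Preserves fsmonitor only when every condition holds; anything else,
/// including helper paths and malformed values, falls back to disabling it.
fn decide_fsmonitor(probe: &FsmonitorProbe) -> FsmonitorDecision {
    match probe.raw_value {
        Some(_) if probe.git_reads_true && probe.builtin_daemon_supported => {
            FsmonitorDecision::Preserve
        }
        _ => FsmonitorDecision::ForceDisabled,
    }
}
```

Defaulting to `ForceDisabled` in the catch-all arm keeps the safe behavior for every case the probe cannot positively confirm.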

Centralize this policy in `git-utils` while keeping process execution in
the existing local and workspace-command adapters. Probe once per
worktree workflow and reuse the result for its Git commands, including
the TUI `/diff` path. Metadata-only commands and repository discovery
keep fsmonitor disabled without probing. Each probe and requested Git process
keeps its own existing timeout, and the decision is not cached because
layered and conditional Git configuration can change while Codex runs.

---------

Co-authored-by: Chris Bookholt <bookholt@openai.com>

Tamir Duberstein · 2026-06-08 21:32:46 -07:00

dffc4bf75d

Fix remote turn diff display roots (#23261)

## Why

`TurnDiffTracker` computes a display root so turn diffs can be rendered
repo-relative. For remote exec-server turns, the selected turn `cwd` may
exist only inside the selected environment, but `run_turn` was
discovering the git root through the local host filesystem. When that
lookup failed, nested remote-session diffs fell back to the nested `cwd`
and showed `/tmp/...`-prefixed paths instead of repo-relative paths.

## What changed

- Resolve the diff display root from the primary selected turn
environment when one exists, using that environment's filesystem and
`cwd`.
- Add `codex_git_utils::get_git_repo_root_with_fs(...)` so git-root
discovery can run against an `ExecutorFileSystem`, including remote
environments.
- Reuse that helper from `resolve_root_git_project_for_trust(...)` and
add coverage for `.git` gitdir-pointer detection.

## Validation

- Devbox Bazel: `//codex-rs/core:core-unit-tests
--test_filter=get_git_repo_root_with_fs_detects_gitdir_pointer`
- Devbox Docker-backed remote-env repro: `//codex-rs/core:core-all-test
--test_filter=apply_patch_turn_diff_paths_stay_repo_relative_when_session_cwd_is_nested`

starr-openai · 2026-05-18 10:53:49 -07:00

9286ff2805

Ignore configured hooks in git helpers (#22843)

## What
- Internal Git helper commands now ignore configured hook directories
during repository bookkeeping.

## Why
- These helper flows should stay consistent even when a repository has
hook-directory configuration of its own.

## How
- Pass a command-local `core.hooksPath` override in the shared helper
path and the Git-info helper path.
- Add regressions for the baseline index rewrite flow and the metadata
status flow.

## Validation
- `cargo fmt --manifest-path
/Users/bookholt/code/codex/codex-rs/Cargo.toml --all --check`
- `cargo test --manifest-path
/Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-git-utils`
- `cargo test --manifest-path
/Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-core
test_get_has_changes_`

Chris Bookholt · 2026-05-15 10:07:54 -07:00

9facdccb37

[codex] Ignore fsmonitor config in Git metadata reads (#22652)

## Summary
- keep Git metadata/status subprocesses independent of repository
`core.fsmonitor` configuration
- preserve existing working-tree state reporting while making the helper
behavior more predictable
- add regression coverage for `get_has_changes` when a repository
defines an fsmonitor command

## Validation
- `cargo fmt --all`
- `cargo test -p codex-core test_get_has_changes_`
- `cargo test -p codex-git-utils`

Chris Bookholt · 2026-05-14 10:07:43 -07:00

6ec8c4a6ec

Emit accepted line fingerprint analytics (#21601)

## Why

Codex assisted-code attribution needs a client-side accepted-code source
that does not upload raw code. This adds a hash-only analytics event
derived from the turn diff so downstream attribution can compare
accepted Codex lines against commit or PR diffs.

## What Changed

- Parse accepted/effective added lines from the final turn diff and emit
`codex_accepted_line_fingerprints` analytics.
- Hash repo, path, and normalized line content before upload; raw code
and raw diffs are not included in the event.
- Chunk large fingerprint payloads and send accepted-line fingerprint
events in isolated requests while preserving normal batching for other
analytics events.
- Canonicalize Git remote URLs before repo hashing so SSH/HTTPS GitHub
remotes join to the same repo hash.
- Add parser coverage for unified diff hunk lines that look like `+++`
or `---` file headers.
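
To illustrate why canonicalizing remotes before hashing lets SSH and HTTPS spellings join, here is a minimal sketch. It is not the `codex-git-utils` implementation: `canonical_remote_key` and the `host/owner/repo` output form are assumptions for the example, and it only handles scheme-style and scp-style remotes.

```rust
/// Reduces a remote URL to a lowercase-host `host/path` key.
/// Returns `None` for inputs that match neither supported form.
fn canonical_remote_key(url: &str) -> Option<String> {
    let url = url.trim();
    // `scheme://authority/path` or scp-like `user@host:path`.
    let (authority, path) = match url.split_once("://") {
        Some((_scheme, rest)) => rest.split_once('/')?,
        None => url.split_once(':')?,
    };
    // Drop `user@` credentials and any `:port` suffix.
    let host = authority.rsplit_once('@').map_or(authority, |(_user, host)| host);
    let host = host.split(':').next().unwrap_or(host);
    // Drop surrounding slashes and a trailing `.git`.
    let path = path.trim_matches('/');
    let path = path.strip_suffix(".git").unwrap_or(path);
    if host.is_empty() || path.is_empty() {
        return None;
    }
    Some(format!("{}/{}", host.to_ascii_lowercase(), path))
}
```

Under these assumptions, `git@github.com:owner/repo.git` and `https://github.com/owner/repo` both map to `github.com/owner/repo`, so a hash of the key is identical for either remote.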

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-git-utils canonicalize_git_remote_url`
- `just fix -p codex-analytics`
- `just bazel-lock-check`
- `git diff --check`

alexsong-oai · 2026-05-08 12:16:24 -07:00

bbb6bf0a37

Disable empty Cargo test targets (#21584)

## Summary

`cargo test` entails running both standard Rust tests and doctests.
It turns out that doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.
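
For reference, the setting lives in the library target section of a crate's `Cargo.toml`. The fragment below is a generic sketch of that Cargo option, not a copy of any manifest changed in this PR.

```toml
# Cargo.toml of a crate that has no doctests
[lib]
doctest = false
```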

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>

Charlie Marsh · 2026-05-07 15:44:17 -07:00

54ef99a365

Remove ghost snapshots (#19481)

## Summary
- Remove `ghost_snapshot` / `GhostCommit` from the Responses API surface
and generated SDK/schema artifacts.
- Keep legacy config loading compatible, but make undo a no-op that
reports the feature is unavailable.
- Clean up core history, compaction, telemetry, rollout, and tests to
stop carrying ghost snapshot items.

## Testing
- Unit tests passed for `codex-protocol`, `codex-core` targeted undo and
compaction flows, `codex-rollout`, and `codex-app-server-protocol`.
- Regenerated config and app-server schemas plus Python SDK artifacts
and verified they match the checked-in outputs.

pakrym-oai · 2026-04-27 18:48:57 -07:00

4e05f3053c

Refactor exec-server filesystem API into codex-file-system (#19892)

## Summary
- Extracted the shared filesystem types and `ExecutorFileSystem` trait
into a new `codex-file-system` crate
- Switched `codex-config` and `codex-git-utils` to depend on that crate
instead of `codex-exec-server`
- Kept `codex-exec-server` re-exporting the same API for existing
callers

## Testing
- Ran `cargo test -p codex-file-system`
- Ran `cargo test -p codex-git-utils`
- Ran `cargo test -p codex-config`
- Ran `cargo test -p codex-exec-server`
- Ran `just fix -p codex-file-system`, `just fix -p codex-git-utils`,
`just fix -p codex-config`, `just fix -p codex-exec-server`
- Ran `just fmt`
- Updated and verified the Bazel module lockfile

Michael Zeng · 2026-04-27 17:43:15 -07:00

a3350de855

feat: use git-backed workspace diffs for memory consolidation (#18982)

## Why

This PR makes the `morpheus` agent (memory phase 2) use a Git diff to
start its consolidation. The workflow is as follows:
1. The agent acquires a lock.
2. If `.codex/memories` does not exist or is not a Git root, initialize
everything (and make a first empty commit).
3. Update `raw_memories.md` and `rollout_summaries/` as before: select
at most N phase 1 memories based on a given policy.
4. Use Git (`gix`) to get a diff between the current state of
`.codex/memories` and the last commit.
5. Dump the diff in `phase2_workspace_diff.md`.
6. Spawn `morpheus` and point it to `phase2_workspace_diff.md`.
7. Wait for `morpheus` to be done.
8. Re-create a new `.git` and make a single commit in it. We do this
because we don't want to preserve history through `.git`, and it is
cheap anyway.
9. Release the lock.

On top of this, we keep the existing retry policies and related handling.

The goals of this new workflow are:
* Better support for memory extensions such as `chronicle`
* Allow the user to manually edit memories and have those edits
considered by the phase 2 agent

As a follow-up, we will need to add support for user edits made while
`morpheus` is running.

## What Changed

- Added memory workspace helpers that prepare the git baseline, compute
the diff, write `phase2_workspace_diff.md`, and reset the baseline after
successful consolidation.
- Updated Phase 2 to sync current inputs into `raw_memories.md` and
`rollout_summaries/`, prune old extension resources, skip clean
workspaces, and run the consolidation subagent only when the workspace
has changes.
- Tightened Phase 2 job ownership around long-running consolidation with
heartbeats and an ownership check before resetting the baseline.
- Simplified the prompt and state APIs so DB watermarks are bookkeeping,
while workspace dirtiness decides whether consolidation work exists.
- Updated the memory pipeline README and tests for workspace diffs,
extension-resource cleanup, pollution-driven forgetting, selection
ranking, and baseline persistence.

## Verification

- Added/updated coverage in `core/src/memories/tests.rs`,
`core/src/memories/workspace_tests.rs`, `state/src/runtime/memories.rs`,
and `core/tests/suite/memories.rs`.

---------

Co-authored-by: Codex <noreply@openai.com>

jif-oai · 2026-04-27 14:32:44 +02:00

01ab25dbb5

nit: expose lib (#18962)

As a follow-up

jif-oai · 2026-04-22 10:06:53 +01:00

b04ffeee4c

feat: baseline lib (#18848)

This adds a library with two entry points:
* `reset_git_repository`, which takes a directory and sets it up as a
new Git root
* `diff_since_latest_init`, which returns the diff for a given directory
since the last `reset_git_repository`

jif-oai · 2026-04-21 17:24:30 +01:00

bf2a34b4b2

Refactor config loading to use filesystem abstraction (#18209)

Initial pass propagating FileSystem through config loading.

pakrym-oai · 2026-04-17 00:51:21 +00:00

9effa0509f

Async config loading (#18022)

Parts of the config will come from the executor. Prepare for that by
making config loading methods async.

pakrym-oai · 2026-04-15 19:18:38 -07:00

bd61737e8a

[codex] Make AbsolutePathBuf joins infallible (#16981)

Having to check for errors every time `join` is called is painful and
unnecessary.

pakrym-oai · 2026-04-07 10:52:08 -07:00

f1a2b920f9

chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)

## Why

`argument-comment-lint` was green in CI even though the repo still had
many uncommented literal arguments. The main gap was target coverage:
the repo wrapper did not force Cargo to inspect test-only call sites, so
examples like the `latest_session_lookup_params(true, ...)` tests in
`codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.

This change cleans up the existing backlog, makes the default repo lint
path cover all Cargo targets, and starts rolling that stricter CI
enforcement out on the platform where it is currently validated.

## What changed

- mechanically fixed existing `argument-comment-lint` violations across
the `codex-rs` workspace, including tests, examples, and benches
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
`tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
`--all-targets` unless the caller explicitly narrows the target set
- fixed both wrappers so forwarded cargo arguments after `--` are
preserved with a single separator
- documented the new default behavior in
`tools/argument-comment-lint/README.md`
- updated `rust-ci` so the macOS lint lane keeps the plain wrapper
invocation and therefore enforces `--all-targets`, while Linux and
Windows temporarily pass `-- --lib --bins`

That temporary CI split keeps the stricter all-targets check where it is
already cleaned up, while leaving room to finish the remaining Linux-
and Windows-specific target-gated cleanup before enabling
`--all-targets` on those runners. The Linux and Windows failures on the
intermediate revision were caused by the wrapper forwarding bug, not by
additional lint findings in those lanes.

## Validation

- `bash -n tools/argument-comment-lint/run.sh`
- `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
- shell-level wrapper forwarding check for `-- --lib --bins`
- shell-level wrapper forwarding check for `-- --tests`
- `just argument-comment-lint`
- `cargo test` in `tools/argument-comment-lint`
- `cargo test -p codex-terminal-detection`

## Follow-up

- Clean up remaining Linux-only target-gated callsites, then switch the
Linux lint lane back to the plain wrapper invocation.
- Clean up remaining Windows-only target-gated callsites, then switch
the Windows lint lane back to the plain wrapper invocation.

Michael Bolin · 2026-03-27 19:00:44 -07:00

61dfe0b86c

Move git utilities into a dedicated crate (#15564)

- create `codex-git-utils` and move the shared git helpers into it with
file moves preserved for diff readability
- move the `GitInfo` helpers out of `core` so stacked rollout work can
depend on the shared crate without carrying its own git info module

---------

Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>

Ahmed Ibrahim · 2026-03-24 13:26:23 -07:00

0f957a93cd

19 Commits