mirror of
https://github.com/pchuan98/codex.git
synced 2026-07-01 00:31:56 +08:00
39aab9fc45
## Why

When Codex uses a remote `ExecutorFileSystem`, every `get_metadata` call is an exec-server round trip. Upward discovery currently pays those round trips serially in two latency-sensitive places:

- session startup, while locating the configured project root before loading `AGENTS.md`; and
- Git-root discovery, which runs before per-turn Git diff enrichment.

The goal is to remove the serial ancestor dependency without adding a new filesystem RPC, JSON-RPC batch method, Git executable dependency, or cache.

## Example

Assume this layout, with `.git` as the configured project-root marker:

```text
/workspace/repo/.git
/workspace/repo/AGENTS.md
/workspace/repo/crates/core/  <- cwd
```

The marker probes have this required precedence:

```text
1. /workspace/repo/crates/core/.git
2. /workspace/repo/crates/.git
3. /workspace/repo/.git
4. /workspace/.git
5. /.git
```

Previously, probe 2 was not sent until probe 1 returned, and probe 3 was not sent until probe 2 returned. With this change, the client lazily keeps up to eight ordinary `fs/getMetadata` requests in flight, but consumes their results in the order above. Codex must still learn that probes 1 and 2 are absent before accepting probe 3, so the nearest root always wins.

Once probe 3 succeeds, the client has its answer and stops awaiting probes 4 and 5. Requests that were already sent may still finish on the worker.

For the marker phase alone, with a 50 ms client-to-worker round trip and fast local metadata calls, finding the root at probe 3 changes from roughly three serialized round trips (150 ms) to one round trip plus worker processing. The later `AGENTS.md` candidate phase remains separate and ordered.

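
The ordered window described in this example can be sketched with standard-library threads standing in for in-flight metadata requests. This is an illustration under stated assumptions, not the helper added by this change: `find_first_ordered` and the `exists` callback are hypothetical names, and a panicking probe is simply treated as absent.

```rust
use std::collections::VecDeque;
use std::path::{Path, PathBuf};
use std::thread::{self, JoinHandle};

/// Hypothetical sketch: keep up to `window` probes in flight, but consume
/// their results strictly in probe order so the nearest match wins.
fn find_first_ordered<I, F>(probes: I, window: usize, exists: F) -> Option<PathBuf>
where
    I: IntoIterator<Item = PathBuf>,
    F: Fn(&Path) -> bool + Clone + Send + 'static,
{
    let mut pending = probes.into_iter();
    let mut in_flight: VecDeque<(PathBuf, JoinHandle<bool>)> = VecDeque::new();
    loop {
        // Top up the window lazily; probes beyond it are not generated yet.
        while in_flight.len() < window.max(1) {
            let Some(probe) = pending.next() else { break };
            let check = exists.clone();
            let target = probe.clone();
            in_flight.push_back((probe, thread::spawn(move || check(&target))));
        }
        // Await the oldest probe first. `None` here means every probe missed.
        let (probe, handle) = in_flight.pop_front()?;
        if handle.join().unwrap_or(false) {
            // Later probes are no longer awaited; threads that were already
            // started are detached and may still run to completion.
            return Some(probe);
        }
    }
}
```

With the five probes listed above and a window of eight, all five checks start at once, yet the function still returns `/workspace/repo/.git` because it inspects results in precedence order.
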
Only after `/workspace/repo` is selected does `AGENTS.md` discovery check instruction candidates, in root-to-cwd order:

```text
/workspace/repo/AGENTS.override.md
/workspace/repo/AGENTS.md
/workspace/repo/crates/AGENTS.override.md
/workspace/repo/crates/AGENTS.md
/workspace/repo/crates/core/AGENTS.override.md
/workspace/repo/crates/core/AGENTS.md
```

The first configured candidate found in each directory wins. These checks remain ordered and no instruction candidate above `/workspace/repo` is issued.

Git-root discovery uses the same bounded lookup with only `.git` as the marker.

## What changed

- Added a client-side find-up helper that generates `ancestor x marker` probes lazily, nearest directory first and configured marker order within each directory.
- Uses an ordered concurrency window of eight scalar metadata requests. This bounds executor load while preserving nearest-root and marker precedence.
- Reuses the helper for both configured project-root discovery and remote Git-root discovery.
- Keeps Git ancestor and marker construction in `AbsolutePathBuf`, converting only each complete `.git` probe to `PathUri`. This preserves native paths that require an opaque URI fallback, such as Windows namespace paths.
- Preserves existing error behavior: `AGENTS.md` discovery propagates non-`NotFound` metadata errors, while Git discovery treats a failed marker probe as absent and continues upward.
- Reads each discovered `AGENTS.md` directly instead of statting it a second time.

No filesystem trait or exec-server protocol method is added. An empty `project_root_markers` list performs no ancestor-marker I/O and checks instruction candidates only in `cwd`. This change also deliberately does not cache roots across turns.

## Symlinks

Upward traversal remains **lexical**. The helper does not canonicalize `cwd`; it appends marker names to the supplied path and walks that path's textual parents.

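
A minimal sketch of that lexical walk, assuming plain `std::path` types instead of the `AbsolutePathBuf` and `PathUri` types the change actually uses. The function name `lexical_probes` is hypothetical. `Path::ancestors` only manipulates the path text and never touches the filesystem, which is why symlinks in `cwd` are not resolved at this stage.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical sketch: lazily yield `ancestor x marker` probes, nearest
/// directory first, with markers in configured order within each directory.
fn lexical_probes<'a>(
    cwd: &'a Path,
    markers: &'a [&'a str],
) -> impl Iterator<Item = PathBuf> + 'a {
    // `ancestors()` yields `cwd` itself, then each textual parent up to the
    // root, without canonicalizing or following symlinks.
    cwd.ancestors()
        .flat_map(move |dir| markers.iter().map(move |marker| dir.join(marker)))
}
```

For `cwd = /tmp/pkg/src` and the single marker `.git`, this yields `/tmp/pkg/src/.git`, `/tmp/pkg/.git`, `/tmp/.git`, and `/.git`. An empty marker list yields no probes at all.
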
The filesystem performs the actual metadata/read operation, and the current local and exec-server implementations follow live symlink targets.

For example:

```text
/tmp/pkg -> /workspace/repo/packages/pkg
cwd = /tmp/pkg/src
actual Git marker = /workspace/repo/.git
```

The lexical probes are `/tmp/pkg/src/.git`, `/tmp/pkg/.git`, `/tmp/.git`, and `/.git`. They do not jump from `/tmp/pkg` to the target's parent `/workspace/repo`, so this spelling of `cwd` does not discover `/workspace/repo/.git`. That is the existing behavior and is unchanged by this PR.

Conversely, if `/tmp/repo -> /workspace/repo`, then probing `/tmp/repo/.git` follows the directory symlink and finds `/workspace/repo/.git`; the reported root remains the lexical path `/tmp/repo`.

A live symlink used directly as `.git`, another configured marker, or `AGENTS.md` is also followed. A symlinked `AGENTS.md` is loaded when its target is a regular file, while a broken symlink behaves as `NotFound`.

39aab9fc45 · 2026-06-24 22:58:34 +01:00

# codex-git-utils

Helpers for interacting with git, including patch application. The crate also
exposes a lightweight baseline API for internal directories that use git only
as a resettable diff mechanism: `ensure_git_baseline_repository` preserves a
usable `root/.git` baseline or creates one when it is missing or unusable,
`reset_git_repository` replaces `root/.git` with a fresh one-commit baseline,
and `diff_since_latest_init` returns structured file changes plus a unified
diff from that baseline to the current directory contents.

```rust
use std::path::Path;

use codex_git_utils::{apply_git_patch, ApplyGitRequest};

let repo = Path::new("/path/to/repo");

// Apply a patch (omitted here) to the repository.
let request = ApplyGitRequest {
    cwd: repo.to_path_buf(),
    diff: String::from("...diff contents..."),
    revert: false,
    preflight: false,
};
let result = apply_git_patch(&request)?;
```