diff --git a/understand-anything-plugin/hooks/auto-update-prompt.md b/understand-anything-plugin/hooks/auto-update-prompt.md
index 5fe88cc..b20aa18 100644
--- a/understand-anything-plugin/hooks/auto-update-prompt.md
+++ b/understand-anything-plugin/hooks/auto-update-prompt.md
@@ -240,12 +240,54 @@ Perform lightweight validation (no graph-reviewer agent):
    }
    ```
 
-3. **Update fingerprints:** Write and execute a Node.js script that:
-   - Reads the existing `fingerprints.json`
-   - For each re-analyzed file: computes new content hash and extracts structural elements via regex
-   - For deleted files: removes their entries
-   - Merges with existing fingerprints (keep unchanged files as-is)
-   - Writes updated `fingerprints.json`
+3. **Update fingerprints (LOAD-PATCH-SAVE, not OVERWRITE).**
+
+   The most common failure mode here: writing only the freshly-computed batch entries to `fingerprints.json`, discarding every other file's fingerprint. The next auto-update then sees all those files as new (no stored fingerprint), classifies them as STRUCTURAL, and escalates to FULL_UPDATE permanently (issue #152). The script must LOAD ALL existing entries, PATCH only the re-analyzed ones, and SAVE the full dict back.
+
+   Write and execute a Node.js script in this exact ordering:
+
+   ```javascript
+   import { readFileSync, writeFileSync, existsSync } from 'node:fs';
+   import { createHash } from 'node:crypto';
+   import path from 'node:path';
+
+   const fpPath = path.join(PROJECT_ROOT, '.understand-anything', 'fingerprints.json');
+   const existedAndNonEmpty = existsSync(fpPath) && readFileSync(fpPath, 'utf-8').trim().length > 0;
+
+   // 1. LOAD ALL existing entries (NEVER skip — preserves un-analyzed files)
+   const all = existedAndNonEmpty
+     ? JSON.parse(readFileSync(fpPath, 'utf-8'))
+     : {};
+   const before = Object.keys(all).length;
+
+   // 2. PATCH (file still exists) or REMOVE (file deleted) for each re-analyzed path.
+   //    `filesToReanalyze` may include paths that were deleted in this commit —
+   //    handle both branches inline rather than expecting a separate deleted list.
+   for (const filePath of filesToReanalyze) {
+     const fullPath = path.join(PROJECT_ROOT, filePath);
+     if (!existsSync(fullPath)) {
+       delete all[filePath];
+       continue;
+     }
+     const content = readFileSync(fullPath, 'utf-8');
+     const contentHash = createHash('sha256').update(content).digest('hex');
+     // Extract functions, classes, imports, exports via the same regex as Phase 1.
+     all[filePath] = { contentHash, functions, classes, imports, exports };
+   }
+
+   // 3. GUARD against silent load failure: if fingerprints.json existed and was
+   //    non-empty but `before` came out as 0, refuse to overwrite — something
+   //    went wrong reading the file and writing now would clobber every entry.
+   if (existedAndNonEmpty && before === 0) {
+     throw new Error('fingerprints.json existed and was non-empty but loaded as {} — refusing to overwrite');
+   }
+
+   // 4. SAVE ALL entries back (full dict — not just the patched subset)
+   writeFileSync(fpPath, JSON.stringify(all, null, 2));
+   console.log(`Fingerprints: ${before} → ${Object.keys(all).length}`);
+   ```
+
+   The `existedAndNonEmpty && before === 0` guard catches the silent-load-failure case before it corrupts the store. If the count shrinks from N to a small number that matches the batch size, the LOAD step was skipped — abort the write rather than persist the wrong dict.
 
 4. Clean up intermediate files:
    ```bash