Addresses codex review feedback (P2). `resolve(__dirname, '../..')` breaks
when `skills/understand/` is copied to a runtime skills directory whose
parent is not the plugin checkout — exactly the case SKILL.md Phase 0 warns
about and resolves via its multi-candidate $PLUGIN_ROOT search.
This script now prefers $PLUGIN_ROOT from the env (validated via
package.json presence) and falls back to the existing relative resolution.
SKILL.md Phase 0.5 passes the env var in the invocation.
Same latent pattern exists in scan-project / compute-batches / extract-import-map
/ extract-structure / build-fingerprints; hardening those is a separate
concern (no behaviour change today for installs that already work).
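A minimal sketch of the env-first resolution described above. The function name, the injected `exists` callback, and the sample paths are illustrative assumptions, not the script's actual code; a real script would most likely use `fs.existsSync` and `path.join`.

```javascript
// Hypothetical helper: prefer $PLUGIN_ROOT when it looks like a plugin
// checkout (package.json present), otherwise keep the relative fallback.
// `exists` is injected so the sketch needs no imports.
function resolvePluginRoot(env, fallbackRoot, exists) {
  const fromEnv = env.PLUGIN_ROOT;
  if (fromEnv && exists(joinPath(fromEnv, "package.json"))) {
    return fromEnv;
  }
  return fallbackRoot;
}

// Minimal POSIX-style join for the sketch; real code would use path.join.
function joinPath(dir, name) {
  return dir.replace(/\/+$/, "") + "/" + name;
}

// Fake filesystem: only /opt/plugin contains a package.json.
const present = new Set(["/opt/plugin/package.json"]);
const exists = (p) => present.has(p);

console.log(resolvePluginRoot({ PLUGIN_ROOT: "/opt/plugin" }, "/fallback", exists)); // /opt/plugin
console.log(resolvePluginRoot({ PLUGIN_ROOT: "/tmp/copied" }, "/fallback", exists)); // /fallback
console.log(resolvePluginRoot({}, "/fallback", exists)); // /fallback
```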
Refs #76
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the duplicated Node.js block in Phase 0.5 with a call into
`generateStarterIgnoreFile` via a thin wrapper script, mirroring the
scan-project.mjs pattern. Removes ~40 lines of duplicated logic; single
source of truth in @understand-anything/core.
Also tightens code review nits:
- Add 3 tests: stable language-group ordering, all-commented invariant
on empty dirs, suffix-glob rejects non-directory entries
- Clarify comments on EXACT_DIR_NAMES (ecosystem mix, not Python) and
SUFFIX_DIR_GLOBS (unanchored String.endsWith match)
- Type detectDirectories' readdirSync result explicitly (Dirent[]) to
pin the utf-8 encoding overload
Refs #76
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Detect C# project-suffix dirs (Foo.Tests/, Foo.UnitTests/) and PascalCase
test dirs (Tests/, UnitTests/, IntegrationTests/) via case-insensitive
match; group test-file suggestions by language (JS, C#, Java, Go).
Keeps all suggestions commented-out — same opt-in model as today. Updates
SKILL.md Phase 0.5 inline generator to stay in sync with the TS module.
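The detection rule can be sketched roughly as follows. `EXACT_DIR_NAMES` and `SUFFIX_DIR_GLOBS` are named elsewhere in this history, but their contents here, the helper names, and the `# ` comment prefix are assumptions made for illustration.

```javascript
// Illustrative tables only; the module's real lists may differ.
const EXACT_DIR_NAMES = ["tests", "unittests", "integrationtests"];
const SUFFIX_DIR_GLOBS = [".tests", ".unittests"];

// Case-insensitive: exact name match, or unanchored endsWith on a suffix.
function isTestDirName(name) {
  const lower = name.toLowerCase();
  if (EXACT_DIR_NAMES.includes(lower)) return true;
  return SUFFIX_DIR_GLOBS.some((suffix) => lower.endsWith(suffix));
}

// `entries` mimics fs.Dirent objects: { name, isDirectory() }.
function detectTestDirs(entries) {
  return entries
    .filter((e) => e.isDirectory() && isTestDirName(e.name))
    .map((e) => e.name);
}

// Suggestions stay commented out, so the user opts in by uncommenting.
function toCommentedSuggestions(dirs) {
  return dirs.map((d) => `# ${d}/`);
}

const dirent = (name, isDir) => ({ name, isDirectory: () => isDir });
const found = detectTestDirs([
  dirent("Foo.Tests", true),
  dirent("IntegrationTests", true),
  dirent("Bar.Tests", false), // a file, so it is rejected
  dirent("Contests", true),   // no ".tests" suffix, not an exact name
]);
console.log(toCommentedSuggestions(found).join("\n"));
```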
Refs #76
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two gaps in the call-graph walker, both flagged by codex on #435:
1. `const Foo(...)` / `new Foo(...)` constructor calls were silently
dropped. The grammar emits these as `const_object_expression` /
`new_expression` containing `arguments` directly — they bypass the
`selector > argument_part` shape the walker relied on. Added a
dedicated branch that records the inner `type_identifier` as the
callee. Critical for Flutter widget trees where
`runApp(const MyApp())` would otherwise lose the MyApp construction
edge.
2. When a getter / setter / constructor / factory_constructor has a
body, its `method_signature` wraps `getter_signature` /
`setter_signature` / `constructor_signature` /
`factory_constructor_signature` instead of `function_signature`. The
walker only looked for `function_signature`, so `pendingName`
stayed null and the sibling `function_body` was walked with an
empty stack — calls inside ctor/factory/getter/setter bodies were
silently dropped even though those members were already extracted
as functions. Now dispatch across all five signature variants,
using `constructorName` for the (factory) constructor pair to
match what `collectClassBody` pushes.
Tests: 41 → 47 dart cases (+6); full core 733 → 739; no regressions.
Incorporates stronger pieces from the prior Dart attempts (#348, #415)
that @Lum1104 called out:
- `extractParams` now walks `optional_formal_parameters` (covers both
optional positional `[...]` AND named `{...}` parameters — the Dart
grammar uses one wrapper for both).
- New `extractParamName` helper extracts the user-visible field name
from `this.field` and `super.field` initializer parameters by
unwrapping `constructor_param` / `super_formal_parameter`.
- `collectClassBody` now routes `getter_signature` and `setter_signature`
in both shapes:
  - concrete: `method_signature > getter_signature` + sibling `function_body`
  - abstract: `declaration > getter_signature`
Setters use the same path. The previous limitation assertion
(`.not.toContain("value")` on `methods`) is flipped to a positive
`.toContain("value")`.
- Added import/export edge-case tests: `dart:` SDK URIs, multi-import
declaration-order preservation, and `export ... show` clauses.
- Added a comma-list field test (`int a, b, c;`).
Underscore-prefix visibility carries through naturally to all new code
paths via the existing `isExported` gate inside `pushMethod`; explicit
test added for an underscore-prefixed getter.
Test counts: 28 → 41 dart cases; full core suite 720 → 733; no
regressions.
Implements extractCallGraph with a sibling-aware walk that pairs each
function_signature with its subsequent function_body sibling (Dart's
AST differs from Kotlin's: signature and body are siblings, not
parent/child). Detects call sites via selector nodes containing
argument_part; uses startIndex for sibling lookup (web-tree-sitter
returns new wrapper objects per child() call, making === unreliable).
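The sibling-aware walk can be sketched on plain objects as below. This is a simplified illustration, not the extractor's code: the node shape (`type`, `text`, `name`, `startIndex`, `children`) and the rule "callee is the identifier just before the selector" are assumptions, and only `function_signature` is handled.

```javascript
// Find the next sibling by startIndex rather than by object identity,
// since identity comparison is unreliable with per-call wrapper objects.
function nextSiblingOf(parent, node) {
  const i = parent.children.findIndex((c) => c.startIndex === node.startIndex);
  return i >= 0 ? parent.children[i + 1] : undefined;
}

function hasArgumentPart(node) {
  return (node.children || []).some((c) => c.type === "argument_part");
}

function collectCalls(root) {
  const edges = [];
  function walk(node, caller) {
    const walkedBodies = new Set(); // startIndex of bodies already paired
    let prev = null;
    for (const child of node.children || []) {
      if (!walkedBodies.has(child.startIndex)) {
        if (child.type === "function_signature") {
          // Signature and body are siblings: walk the body under this name.
          const body = nextSiblingOf(node, child);
          if (body && body.type === "function_body") {
            walkedBodies.add(body.startIndex);
            walk(body, child.name);
          }
        } else {
          if (child.type === "selector" && hasArgumentPart(child) && caller && prev) {
            edges.push({ caller, callee: prev.text });
          }
          walk(child, caller);
        }
      }
      prev = child;
    }
  }
  walk(root, null);
  return edges;
}

const call = (name, at) => [
  { type: "identifier", text: name, startIndex: at },
  { type: "selector", startIndex: at + name.length, children: [{ type: "argument_part", startIndex: at + name.length }] },
];
const tree = {
  type: "program", startIndex: 0,
  children: [
    { type: "function_signature", name: "main", startIndex: 0 },
    { type: "function_body", startIndex: 12, children: call("runApp", 14) },
    { type: "function_signature", name: "helper", startIndex: 40 },
    { type: "function_body", startIndex: 54, children: call("print", 56) },
  ],
};
console.log(collectCalls(tree));
```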
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds enum_declaration handling to DartExtractor: enum constants are surfaced
as properties[] so the structural graph captures Color.red / Color.green etc.
Implements Task 9 of the Dart language support plan (TDD, 16/16 dart tests
pass, full suite 708/708).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add mixin_declaration handling to extractStructure, folding mixins into
classes[] (same convention as class_definition). The `on` constraint
sibling is intentionally ignored for graph purposes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add constructorName() helper and extend collectClassBody() to surface
unnamed constructors as "ClassName", named constructors as "Class.named",
and factory named constructors as "Class.named" in methods[]/functions[].
Probe confirmed plan's AST shapes match exactly; extractReturnType returns
undefined for all constructor forms (factory keyword is an unnamed node).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add TDD tests and implement extractTopLevelFunction with helpers for
extracting function name, params, and return type (including generics
where the grammar emits type_identifier + type_arguments as siblings).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Empty extractor that satisfies the LanguageExtractor interface so the
plugin pipeline can load it. Real extraction logic lands in subsequent
TDD commits.
Adds the Dart language config and wires it into builtinLanguageConfigs
so .dart files are recognized by the language registry. References the
vendored @understand-anything/tree-sitter-dart-wasm package for grammar
loading.
No extractor yet — structural extraction lands in the next commit.
The upstream tree-sitter-dart@1.0.0 ships a pre-`dylink.0` wasm that
fails to load in web-tree-sitter@0.26.x. The grammar source itself is
sound — rebuilding with the current tree-sitter-cli + wasi-sdk produces
a working dylink.0 wasm. Vendor that artifact as a workspace-internal
package so @understand-anything/core can depend on it via workspace:*.
BUILD.md documents the provenance and rebuild instructions.
Phase 7's `rm -rf` of the just-created `intermediate/` and `tmp/` dirs
trips destructive-action gates on hardened hosts (e.g. freshness-window
checks that flag deleting paths created moments earlier). Move them into
a timestamped `.trash-<epoch>/` instead; Phase 0 reclaims the space once
the trash is older than 7 days, well past any freshness window. Behavior
on normal hosts is unchanged — disk usage is identical after the next
run's purge.
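A rough sketch of the trash-naming and purge selection. It assumes `<epoch>` is in seconds and that the purge is driven by the directory name rather than file mtime; both are guesses, and the helper names are hypothetical.

```javascript
const SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60;

// Name for the trash directory created at the end of a run (Phase 7).
function trashDirName(nowSeconds) {
  return `.trash-${nowSeconds}`;
}

// Select the trash directories that are old enough to delete (Phase 0).
function expiredTrashDirs(dirNames, nowSeconds) {
  return dirNames.filter((name) => {
    const match = /^\.trash-(\d+)$/.exec(name);
    if (!match) return false; // not a trash dir, never touched
    return nowSeconds - Number(match[1]) > SEVEN_DAYS_SECONDS;
  });
}

const now = 1700000000;
console.log(trashDirName(now));
console.log(expiredTrashDirs(
  [".trash-1699000000", ".trash-1699900000", "intermediate"],
  now,
));
```

In a real script the first helper would feed something like `fs.renameSync`, and the second something like `fs.rmSync(dir, { recursive: true, force: true })`.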
Closes #301
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the regression flagged by ZebangCheng on #346: under the
parallelised `buildResolutionContext`, `loadTsConfigs` /
`loadGoModules` / `loadPhpAutoloads` ran concurrently but each wrote
warnings to stderr inline as it iterated read results, so a fixture
with both a malformed `tsconfig.json` and a malformed `composer.json`
could emit `composer, tsconfig` instead of the pre-PR `tsconfig,
composer` depending on I/O timing.
Each loader now buffers its warnings into a returned array and the
caller drains them in canonical order (tsconfig → go → php) after
`Promise.all`, restoring byte-identical stderr output. Added a
regression test that sets up both malformed configs and asserts the
tsconfig warning precedes the composer warning in stderr.
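The buffer-then-drain shape can be sketched as follows. The loader stand-ins, result shape, and warning text are invented for illustration; only the canonical order (tsconfig, go, php) comes from the description above.

```javascript
// Stand-in loader: resolves after a delay with buffered warnings instead of
// writing to stderr while it runs.
function fakeLoader(name, delayMs, warnings) {
  return new Promise((resolve) =>
    setTimeout(() => resolve({ name, warnings }), delayMs));
}

const CANONICAL_ORDER = ["tsconfig", "go", "php"];

// Flatten warnings in canonical order, regardless of completion order.
function drainWarnings(results) {
  const byName = new Map(results.map((r) => [r.name, r.warnings]));
  return CANONICAL_ORDER.flatMap((name) => byName.get(name) || []);
}

async function buildContext() {
  // php finishes first here, but its warning is still emitted last.
  const results = await Promise.all([
    fakeLoader("php", 1, ["composer.json: malformed"]),
    fakeLoader("tsconfig", 5, ["tsconfig.json: malformed"]),
    fakeLoader("go", 3, []),
  ]);
  return drainWarnings(results);
}

buildContext().then((lines) => lines.forEach((line) => console.error(line)));
```

The key point is that nothing is written until every loader has settled, so I/O timing can no longer reorder the output.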
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When /understand runs with no --language flag and no stored outputLanguage,
step 3.6 now infers the conversation language and — only when it is non-English
— confirms once before generating, then persists the choice to config.json.
English conversations keep the exact same silent `en` path; --language flag and
stored config still take priority. README documents the behavior; version
bumped 2.7.5 -> 2.7.6 across all five manifests (user-visible behavior change).
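The priority order reads roughly like this. The function and field names are hypothetical; only the precedence (flag, then stored config, then inferred language with a one-time confirmation, then `en`) is taken from the description.

```javascript
// Returns the language to use and whether the user must confirm it first.
function resolveOutputLanguage({ flag, stored, inferred }) {
  if (flag) return { language: flag, needsConfirmation: false };
  if (stored) return { language: stored, needsConfirmation: false };
  if (inferred && inferred !== "en") {
    // Non-English conversation: confirm once, then persist the choice.
    return { language: inferred, needsConfirmation: true };
  }
  return { language: "en", needsConfirmation: false }; // silent default
}

console.log(resolveOutputLanguage({ inferred: "ja" }));
console.log(resolveOutputLanguage({ stored: "de", inferred: "ja" }));
console.log(resolveOutputLanguage({ inferred: "en" }));
```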
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve conflict in tests/skill/understand/test_extract_import_map.test.mjs
by keeping both new test groups — they cover independent fixes that should
coexist:
- upstream #214: tsconfig path-alias targets with leading "./"
- this PR #294: NodeNext .js → .ts rewrite for ESM TypeScript imports
The extract-import-map.mjs script auto-merged cleanly; both fixes are
already present in the merged source.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes the silent near-edgeless-graph regression on any modern ESM
TypeScript project. Reported in #294 with full repro + root-cause
analysis.
### Why this matters
Under `moduleResolution: NodeNext` (or `Node16` / `Bundler` with
explicit extensions — the default for new TS-ESM projects since 2023),
TypeScript does NOT rewrite import specifiers during compilation:
    // src/index.ts: real, idiomatic NodeNext source
    import { x } from './config.js';   // on disk: config.ts
Before this fix, `probeWithExtensions` only tried APPENDING extensions
to the import specifier:
    './config.js'                                               → not in fileSet
    './config.js.ts', './config.js.tsx', './config.js.js', ...  → all miss
    → returns null → edge dropped at merge as dangling
Net result on the reporter's repro: a knowledge graph with hundreds of
file nodes and almost no `imports` edges between them — silently
removing exactly the dependency structure the graph is meant to show.
### Fix
New `NODENEXT_REWRITES` table maps each compiled-output extension to
the TypeScript source extensions that could have produced it:
    .js  → [.ts, .tsx, .js, .jsx]
    .jsx → [.tsx, .jsx]
    .mjs → [.mts, .mjs, .ts]
    .cjs → [.cts, .cjs, .ts]
`probeWithExtensions` now applies the rewrite when the import already
ends with one of these extensions and no such file exists on disk. The
rewrite runs BEFORE the legacy append-extensions loop — otherwise
`./foo.js` would generate the nonsense candidate `foo.js.ts` and the
append loop would never reach the actual `foo.ts`.
### Disambiguation
If both `config.ts` and `config.js` exist on disk (rare, but possible
during a partial migration), `import './config.js'` still resolves to
the .js — that's an exact-disk match and what NodeNext compilation
actually does. The rewrite only kicks in when the .js doesn't exist.
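A compact sketch of the probe order (exact match, then rewrite, then append). The rewrite table is the one listed above; the append list and the exact function signature are assumptions, and the real script may differ in detail.

```javascript
const NODENEXT_REWRITES = {
  ".js": [".ts", ".tsx", ".js", ".jsx"],
  ".jsx": [".tsx", ".jsx"],
  ".mjs": [".mts", ".mjs", ".ts"],
  ".cjs": [".cts", ".cjs", ".ts"],
};

// Assumed legacy append list; illustrative only.
const APPEND_EXTENSIONS = [".ts", ".tsx", ".js", ".jsx"];

function probeWithExtensions(base, fileSet) {
  // 1. Exact on-disk match wins (covers the config.ts + config.js case).
  if (fileSet.has(base)) return base;
  // 2. Rewrite a compiled-output extension back to a source extension.
  for (const [ext, sources] of Object.entries(NODENEXT_REWRITES)) {
    if (!base.endsWith(ext)) continue;
    const stem = base.slice(0, -ext.length);
    for (const source of sources) {
      if (fileSet.has(stem + source)) return stem + source;
    }
  }
  // 3. Legacy behaviour: append extensions to an extensionless specifier.
  for (const ext of APPEND_EXTENSIONS) {
    if (fileSet.has(base + ext)) return base + ext;
  }
  return null; // the rewrite never invents a target
}

const files = new Set(["src/config.ts", "src/util.ts", "src/both.ts", "src/both.js"]);
console.log(probeWithExtensions("src/config.js", files)); // src/config.ts
console.log(probeWithExtensions("src/both.js", files));   // src/both.js
console.log(probeWithExtensions("src/util", files));      // src/util.ts
console.log(probeWithExtensions("src/missing.js", files)); // null
```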
### Tests
6 new tests in `test_extract_import_map.test.mjs`:
- The main #294 case (`.js → .ts`)
- `.jsx → .tsx` and `.mjs → .mts` rewrites
- Disambiguation when both `.ts` and `.js` exist on disk
- Pure-JS projects still work (real `.js → .js` imports)
- Historical no-extension probes unaffected
- Missing files still return null (rewrite can't invent targets)
Total: 202 tests passing (was 196).
Closes #294
Wires Kotlin into the existing tree-sitter pipeline so .kt and .kts
files now produce functions, classes, data classes, sealed classes,
interfaces, objects, imports, exports, and call-graph edges — matching
the behavior of the other language extractors.
## Why @tree-sitter-grammars/tree-sitter-kotlin
The standard `tree-sitter-kotlin` (v0.3.8) ships only native bindings.
The new `@tree-sitter-grammars/tree-sitter-kotlin@1.1.0` ships a
prebuilt `.wasm` (loads cleanly with `web-tree-sitter@^0.26.6`,
nodeTypeCount=289, parses class_declaration / function_declaration as
expected). Same shape that PR1 used for Swift, just a different
publisher because the repomix WASM bundle does not include Kotlin.
`@tree-sitter-grammars` is the npm scope of the tree-sitter-grammars
GitHub organization, which hosts many community-maintained grammars, so
this is the closest thing to a canonical upstream WASM source for Kotlin.
## Notes for reviewers
- `kotlinConfig` already existed as a stub (no `treeSitter` field), so
Android / JVM / Gradle codebases currently produce no structural
edges between `.kt` files. This PR adds the `treeSitter` field; the
existing plugin loader picks it up unchanged.
- **Visibility rule differs from Swift**: Kotlin's default visibility
is `public`, so the extractor treats *every* declaration with no
modifier as exported. Only an explicit `private` opts out. `internal`
and `protected` remain exported in the project-graph sense because
they are still resolvable from other files (within the module / via
inheritance).
- `class_declaration` in tree-sitter-kotlin is overloaded for class,
data class, sealed class, and interface (distinguished by the keyword
child and `modifiers > class_modifier`). The extractor handles all
four uniformly.
- `object_declaration` is a separate node type (Kotlin singletons) —
treated as a class-like entry with its own `name` and members.
- Primary-constructor parameters marked `val` / `var` are surfaced as
class properties; plain `parameter`s without `val/var` are
constructor-only and are NOT counted as properties (matching Kotlin
semantics).
- Import handling distinguishes the three forms: plain dotted
(`import a.b.C`), wildcard (`import a.b.*` → specifier `"*"`), and
aliased (`import a.b.C as Foo` → specifier `"Foo"`).
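The three import forms can be illustrated with a regex-based sketch. The extractor itself works on tree-sitter nodes, so this is only an approximation; the returned field names and the choice of the last path segment as the specifier for plain imports are assumptions.

```javascript
// Approximate, text-based classification of a single Kotlin import line.
function parseKotlinImport(line) {
  const match = /^import\s+([\w.]+?)(\.\*)?(?:\s+as\s+(\w+))?\s*$/.exec(line.trim());
  if (!match) return null;
  const [, path, wildcard, alias] = match;
  if (wildcard) return { source: path, specifier: "*" };   // import a.b.*
  if (alias) return { source: path, specifier: alias };    // import a.b.C as Foo
  // Plain dotted import: assume the last segment is the specifier.
  return { source: path, specifier: path.split(".").pop() };
}

console.log(parseKotlinImport("import a.b.C"));
console.log(parseKotlinImport("import a.b.*"));
console.log(parseKotlinImport("import a.b.C as Foo"));
```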
## Verification
- `pnpm lint` clean
- `pnpm --filter @understand-anything/core build` clean
- `pnpm --filter @understand-anything/skill build` clean
- `pnpm --filter @understand-anything/core test`: **692/692** (+22 new
Kotlin tests, matching the bar set by go-extractor.test.ts /
swift-extractor.test.ts)
- `pnpm test`: 196/196 (no regressions)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /understand pipeline reads every code file twice during analysis:
once in compute-batches (`extractExports` for the cross-batch neighbour
map) and once again in extract-import-map (per-language config loaders).
Both sites used sequential `readFileSync` loops, so on the iOS repo in
issue #226 (~15k files) the disk reads were effectively serialised on
the main thread while libuv's worker pool sat idle.
## Changes
- `extractExports` now batches files into `IO_PARALLELISM = 64` slices
and issues all `readFile` calls in each slice through `Promise.all`,
letting libuv's worker-thread pool overlap disk reads. The
tree-sitter parse stays on the main thread because `web-tree-sitter`
is single-threaded WASM — pipelining the I/O while parses run is
where the wall-time savings come from.
- `loadTsConfigs`, `loadGoModules`, `loadPhpAutoloads` and
`buildResolutionContext` switch to async / `Promise.all` for the
same reason. `buildResolutionContext` also runs the three loader
passes concurrently (`Promise.all([...])`) since they're independent.
- A small `readFilesParallel(paths)` helper is added at the top of
`extract-import-map.mjs` so the three loaders share the same
error-preserving shape.
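A sketch of the slice-and-`Promise.all` pattern with an error-preserving result shape. The helper name and `IO_PARALLELISM = 64` come from the description; the signature, the injected `readOne` reader, and the `{ path, content | error }` shape are assumptions made to keep the example self-contained.

```javascript
const IO_PARALLELISM = 64;

// Reads `paths` in slices, keeping input order and capturing per-file errors
// instead of rejecting the whole batch. `readOne` is injected; in a real
// script it would likely wrap readFile from fs/promises.
async function readFilesParallel(paths, readOne, parallelism = IO_PARALLELISM) {
  const results = [];
  for (let i = 0; i < paths.length; i += parallelism) {
    const slice = paths.slice(i, i + parallelism);
    const settled = await Promise.all(
      slice.map((path) =>
        Promise.resolve()
          .then(() => readOne(path))
          .then(
            (content) => ({ path, content }),
            (error) => ({ path, error }),
          )),
    );
    results.push(...settled); // Promise.all keeps slice order
  }
  return results;
}

// Fake disk so the sketch runs without touching the filesystem.
const fakeDisk = { "a.ts": "export const a = 1;", "b.ts": "export const b = 2;" };
const readOne = async (path) => {
  if (!(path in fakeDisk)) throw new Error(`ENOENT: ${path}`);
  return fakeDisk[path];
};

readFilesParallel(["a.ts", "missing.ts", "b.ts"], readOne, 2).then((reads) => {
  for (const r of reads) {
    console.log(r.path, r.error ? `warning: ${r.error.message}` : "ok");
  }
});
```

Because results are iterated in input order after the reads settle, warnings come out in the same order a sequential loop would have produced.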
## Why behavior stays identical
- Each loader collects its candidate paths in `files[]` order *before*
issuing reads, then iterates `reads` in the same order to emit
warnings + populate output maps. So stderr order and the final map
contents are byte-identical to the previous sequential loops.
- `extractExports` collects per-file errors in-place in the
`Promise.all` callbacks and emits warnings during the post-read
serial loop, again in chunk order — so warning text and order match
the previous implementation.
- Tree-sitter parsing is unchanged: parses still run serially on the
main thread, just with reads pipelined alongside.
## What's NOT in this PR
- `buildFingerprintStore` and `analyzeChanges` in `core/fingerprint.ts`
have the same sequential pattern. They're left alone here because
they're part of the public `@understand-anything/core` API; making
them async would be a breaking change worth its own discussion.
Internal-only `.mjs` scripts are safe to refactor without API churn.
- No change to scan-project: most of its sync I/O is `statSync`
(metadata, not content) plus a handful of small `.gitignore` /
`.understandignore` reads. The parallelism win is marginal there.
## Verification
- `pnpm lint` clean
- `pnpm --filter @understand-anything/core build` clean
- `pnpm --filter @understand-anything/skill build` clean
- `pnpm test`: 196/196 — including
`test_compute_batches.test.mjs` (19 tests) and
`test_extract_import_map.test.mjs` (40 tests), which exercise both
changed pipelines end-to-end with fixture projects. No output
diff vs main.
Refs #76
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`enumerateViaGit` ran `git ls-files -co --exclude-standard` (newline-separated
output) and then `split('\n').map(trim)` on the result. Without `-z`,
`git ls-files` C-escapes any byte outside the locale's "safe" set and wraps
the path in double quotes — for example, a directory named `30. 🏗️ docs/`
comes back as `"30. \360\237\217\227\357\270\217 docs/"`. Downstream
consumers then can't round-trip those octal-quoted strings to real disk
paths, so every file under such directories is silently dropped from the
scan.
This is particularly biting on Windows (where the issue surfaces even with
UTF-8 locale settings) and for any project that uses emoji, accented
characters, or CJK codepoints in directory names — which is increasingly
common in design/spec/journal trees.
The fix is to use `-z` (NUL-terminated output), the same approach git
itself documents for downstream consumers (e.g. `xargs -0`). NUL-separated
chunks are raw bytes, so every codepoint round-trips back to its real disk
path on every platform. Split on `\0` instead of `\n`; drop the now-
unnecessary `.trim()`.
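The parsing side of the change is small enough to sketch directly. The function name and sample paths are illustrative; in practice the input would come from something like `execFileSync("git", ["ls-files", "-z", "-co", "--exclude-standard"], { encoding: "utf8" })`.

```javascript
// Split NUL-terminated `git ls-files -z` output into paths.
// No trim(): leading or trailing spaces are legitimate path characters.
function parseLsFilesZ(stdout) {
  return stdout.split("\0").filter((path) => path.length > 0);
}

// Build the sample with join() to avoid "\0" running into a following digit.
const sample = ["src/index.ts", "30. 🏗️ docs/plan.md", " leading space.txt"].join("\0") + "\0";

console.log(parseLsFilesZ(sample));
```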
Verified on a real project with emoji-prefixed directory names:
    bare `git ls-files`:
        "30. \360\237\217\227\357\270\217\360\237\247\231\342\200\215..."
    `git ls-files -z`:
        30. 🏗️🧙♂️🔮 BD-CCSP/01. Demo's/DEMO--...
Discovered during a multi-agent scan of an Atlas Intelligence spoke repo;
~33 design-intent files in `30. 🏗️ BD-{app}/` directories were silently
dropped per scan. Full report: atlas-intelligence-io/fleet-feedback#491.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`model: inherit` is a Claude Code-specific keyword that means "use the
parent session's model." Other tools that read the same agent frontmatter
(opencode, codex, etc.) don't understand it and instead try to use
`inherit` as a literal model id, which the configured provider rejects.
Reproduction (from #167): opencode + deepseek runs `/understand`, the
project-scanner subagent dispatches with `model: inherit`, deepseek
returns `ProviderModelNotFoundError`, and the pipeline halts on every
subagent dispatch.
With the field omitted, each platform falls back to its own configured
default:
- Claude Code: user's default subagent model
- opencode / codex / etc.: globally configured model
Note for Claude Code Opus users: subagents will no longer auto-inherit
the Opus session model. If you want the previous behavior, set your
default subagent model globally — that single setting now controls all
nine agents.
Closes #167
Adds phase status lines, batch progress with total count, and phase
completion confirmations to the skill definition. Users now see
[Phase N/7] headers and Batch X/N during analysis instead of
unnumbered batch lines with no context.
Fixes #182
Tailwind v4's default source detection walks the nearest .git and
collects tracked files via git ls-files. When the dashboard sources
sit inside a gitignored subtree of an ancestor repo (e.g. the default
marketplace install path ~/.claude/plugins/cache/, which is ignored by
~/.claude/.gitignore), detection returns 0 files and the Oxide engine
skips all utility generation — the dashboard renders unstyled.
Adding explicit @source directives is the supported Tailwind v4 escape
hatch and is a no-op for installs where automatic detection works.
Verified: built CSS bundle jumps from ~9 KB to ~55 KB and utility
classes (.flex, .grid, .absolute, .w-full, .h-full) are present.
Fixes #179
- typescript-eslint preset: strict -> recommended for a usable first-pass
baseline (per PR discussion); ratchet up in a follow-up.
- Drop the projectService/parserOptions block. Neither `recommended` nor
`strict` is type-aware, so it was unused; removing it also avoids the
pnpm-workspace tsconfig-resolution failure mode flagged in review.
- Add Node + browser globals via the `globals` package so .mjs scripts and
the dashboard stop hitting `no-undef`.
- Expand ignores: built bundles (**/public/**), Astro generated (.astro/),
and .private/ (eval scratch). Cuts 2400+ errors in vendored output.
- Allow `_`-prefixed unused vars/args/caught errors; skip irregular
whitespace inside comments (json-parser intentionally embeds ZWSP-escaped
block-comment examples in JSDoc).
- Fix the residual 13 genuine errors: drop dead imports/vars, replace
two `as any[]` in schema.ts with `Array<Record<string, unknown>>`,
drop unused destructure in change-classifier, drop unused catch binding
in extract-structure.mjs.
- Add EOF newline to eslint.config.mjs.
- Refresh pnpm-lock.yaml.
- Add `pnpm lint` step to .github/workflows/ci.yml so the tooling
actually enforces something.
pnpm lint now exits 0 locally; 33+13 test files / 1445 tests still pass.
The generated onboarding markdown linked to a nonexistent repository
(anthropics/understand-anything) instead of the actual project URL
(Lum1104/Understand-Anything).