435 Commits

  • fix(hooks): use shared renameWithRetry in writeWarnState (ecc-context-monitor)
    Mirror the previous commit's Windows-EPERM retry on the companion
    `writeWarnState` in `scripts/hooks/ecc-context-monitor.js`. Same
    race: two PostToolUse subprocesses writing concurrent debounce
    state racing on `MoveFileExW`, target-in-use throwing EPERM on
    Windows even though each writer's tmp path is now unique.
    
    Implementation: import `renameWithRetry` from `scripts/lib/session-bridge.js`
    (exported in the previous commit) instead of duplicating the helper.
    The retry policy, backoff schedule, and main-thread `Atomics.wait`
    strategy stay identical to `writeBridgeAtomic`.
    
    Three writers in the repo now share the same atomic-write contract:
    - `writeBridgeAtomic` (scripts/lib/session-bridge.js) — round 1 +
      this round's retry
    - `writeWarnState` (this file) — round 1 + this round's retry via shared helper
    - `writeCostWarningIfChanged` (scripts/hooks/ecc-metrics-bridge.js) —
      out of scope for this PR (already uses unique tmp suffix; a future
      consolidation could move it to the shared helper too).
    
    Local: `yarn test` green, `yarn lint` clean. The companion test
    suite for `ecc-context-monitor.js` does not currently exercise
    concurrent `writeWarnState` writes, but the helper it now uses is
    covered by the `tests/lib/session-bridge.test.js` concurrent-write
    regression added in round 1's last commit.
  • fix(lib): retry rename on Windows EPERM/EACCES/EBUSY in writeBridgeAtomic
    PR #1983 round 1 introduced unique-suffix tmp paths so two concurrent
    writers no longer share a single `.tmp` file. That fix is correct
    under POSIX semantics — `rename(2)` is atomic between source and
    destination, so each writer renames onto the same target without
    conflict.
    
    Windows `MoveFileExW` is not the same. It fails with
    EPERM / EACCES / EBUSY when the target is currently being renamed
    by *another* process — a short race window that fires reliably under
    this hook's PostToolUse + statusline concurrency. Round 1's CI run
    made this visible:
    
      Test (windows-latest, Node 18.x, npm) — FAILURE
      Error: EPERM: operation not permitted, rename
        'C:\…\ecc-metrics-test-bridge-race-….json.9504.4aef575a.tmp' ->
        'C:\…\ecc-metrics-test-bridge-race-….json'
          at writeBridgeAtomic (scripts/lib/session-bridge.js:79:8)
    
    All nine Windows matrix cells (Node 18 / 20 / 22 × npm / pnpm / yarn)
    hit the same path. POSIX matrices (Linux + macOS) passed unchanged.
    
    Fix: extract a `renameWithRetry(tmp, target)` helper that retries
    `fs.renameSync` up to 5 times on EPERM / EACCES / EBUSY with
    exponential backoff (20 ms → 320 ms total). Other error codes
    (ENOENT, ENOSPC, EROFS, …) re-throw on the first attempt — they are
    not transient. POSIX runs hit the first try and exit immediately.
    
    The backoff uses `Atomics.wait` on a throwaway `SharedArrayBuffer`
    so the retry path does not busy-spin the CPU; verified on Node ≥ 17
    that this works on the main thread. There is a `try/catch` fallback
    to a brief busy-wait for older runtimes where `Atomics.wait` is
    restricted to workers.
    
    `writeBridgeAtomic` calls the helper instead of `fs.renameSync` and
    keeps its existing best-effort tmp cleanup on terminal failure.
    
    `renameWithRetry` is added to `module.exports` so the companion
    `writeWarnState` in `scripts/hooks/ecc-context-monitor.js` can
    adopt the same retry policy without duplicating the helper. That
    adoption lands in the next commit.
    
    Local: `node tests/lib/session-bridge.test.js` 14/14, `yarn test`
    green, `yarn lint` clean. The round-1 test (two concurrent child
    writers, 200 iterations each) now passes on macOS without retrying
    at all (POSIX path) and is expected to pass on Windows via the new
    retry loop.
  • fix(hooks): use unique tmp suffix in writeWarnState (ecc-context-monitor)
    Mirror the previous commit's `writeBridgeAtomic` fix on the
    companion `writeWarnState` in `ecc-context-monitor.js`. Same shape:
    fixed `${target}.tmp` → `${target}.${process.pid}.${randomNonce}.tmp`,
    plus best-effort cleanup of the tmp file on `renameSync` failure
    (throws original error after cleanup).
    
    `writeWarnState` debounces the context-monitor's threshold alarms
    (`COST_NOTICE_USD`, `COST_WARNING_USD`, `COST_CRITICAL_USD`, plus the
    context-remaining and loop-detection ones). Without unique suffixes,
    two PostToolUse subprocesses racing on the warn-state file produce
    either a corrupted JSON debounce-state on disk or an ENOENT throw
    that the hook catches and swallows — either way the next warn-state
    read returns the default `{callsSinceWarn: 0, lastSeverity: null}`
    and the threshold alarms re-fire or stop firing erratically. Users
    see warning messages flicker or vanish; debounce no longer works.
    
    Three call sites in this repo now share the same atomic-write
    contract:
    - `writeBridgeAtomic` (scripts/lib/session-bridge.js) — primary
    - `writeCostWarningIfChanged` (scripts/hooks/ecc-metrics-bridge.js) — cost cache
    - `writeWarnState` (this file) — debounce state
    
    `yarn lint` clean. Regression test covering both `writeBridgeAtomic`
    and `writeWarnState` under concurrent load lands in the next commit.
  • fix(lib): use unique tmp suffix in writeBridgeAtomic to eliminate ENOENT race
    `writeBridgeAtomic` wrote to a fixed `${target}.tmp` path before
    calling `renameSync`. When two processes write to the same session
    bridge concurrently (e.g. PostToolUse `ecc-metrics-bridge` + the
    background `ecc-statusline`, both calling `writeBridgeAtomic(sessionId, ...)`),
    the canonical atomic-rename race fires:
    
      1. Process A: writeFileSync(target.tmp, JSON_A) — tmp file exists.
      2. Process B: writeFileSync(target.tmp, JSON_B) — tmp file overwritten.
      3. Process A: renameSync(target.tmp, target) — succeeds; target = JSON_B
         (A's payload silently corrupted en-route).
      4. Process B: renameSync(target.tmp, target) — throws ENOENT (the
         rename consumed the file).
    
    Every caller in the repo wraps `writeBridgeAtomic` in `try {} catch {}`,
    so the ENOENT exception is swallowed and the user-visible symptom is
    just "the bridge file occasionally contains the wrong process's
    payload" with no diagnostic.
    
    Reproduced before this commit:
    
      $ # two concurrent writers, each calling writeBridgeAtomic 500 times
      $ # against the same session ID
      [A] errors=244   # 244 ENOENT exceptions swallowed
      [B] errors=248   # ditto
    
    After this commit the same workload reports 0 errors in both
    subprocesses: tmp paths no longer collide.
    
    Fix: change `${target}.tmp` to
    `${target}.${process.pid}.${crypto.randomBytes(4).toString('hex')}.tmp`,
    matching the pattern already used by `writeCostWarningIfChanged` in
    `scripts/hooks/ecc-metrics-bridge.js` (commit 9b1d8918). The pid +
    4-byte nonce gives each writer process a distinct tmp path, so step 2
    above no longer overwrites step 1's payload and step 4 no longer
    races step 3.
    
    Also added: on `renameSync` failure, attempt `fs.unlinkSync(tmp)` so
    a writer that fails (disk full, permission, parent dir gone) does
    not leak its tmp file. The cleanup is best-effort and the original
    error is still re-thrown.
    
    **Scope clarification.** This commit closes the atomic-rename
    primitive's race only. The *read-modify-write* race in callers —
    two writers each read the same bridge state, increment, and write
    back, the second clobbering the first — is a separate concern that
    needs locking or per-writer logs, and is intentionally out of scope
    for this PR. The cost-tracker / metrics-bridge callers tolerate
    last-writer-wins on their cumulative aggregates today and this
    commit does not change that contract.
    
    The companion `writeWarnState` in `ecc-context-monitor.js` has the
    same fixed-suffix pattern and the same race; that fix lands in the
    next commit so each can be reviewed against its own diff.
  • fix(ci): cover other widely-cited invisible code points in check-unicode-safety
    Extend `isDangerousInvisibleCodePoint` with five additional code
    points / ranges that are routinely cited in invisible-character
    smuggling references but were not in the previous denylist:
    
    - **U+180E** MONGOLIAN VOWEL SEPARATOR. Formerly classified as a
      space separator (Zs) until Unicode 6.3 reclassified it as Cf
      (Format control). Renders as zero-width; widely abused for
      homograph attacks and prompt smuggling.
    
    - **U+115F** HANGUL CHOSEONG FILLER and **U+1160** HANGUL JUNGSEONG
      FILLER. Zero-width fillers used in Korean text shaping. Both are
      cited as common LLM-injection vectors in Korean / multilingual
      threat models.
    
    - **U+2061–U+2064** invisible math operators (FUNCTION APPLICATION,
      INVISIBLE TIMES, INVISIBLE SEPARATOR, INVISIBLE PLUS). Zero-width
      and only meaningful inside math typesetting. No legitimate
      Markdown or source code uses them.
    
    - **U+3164** HANGUL FILLER. Reported in real-world Discord and
      Twitter smuggling incidents; not used in legitimate Korean text.
    
    Reproduced before this commit: a file containing any one of these
    code points passed `check-unicode-safety.js` silently.
    
    After this commit each one is reported as
    `dangerous-invisible U+<HEX>` and `--write` mode strips it.
    
    Verified by writing 8 single-character probe files
    (`probe-0x180E.md`, `probe-0x115F.md`, …) and confirming exit=1 with
    each violation listed.
    
    ECC repo self-scan reports only the pre-existing `U+2605` BLACK
    STAR warnings (unchanged) and exits with the same status (no new
    in-repo violations introduced). Existing 5 unicode-safety tests
    still pass; `yarn lint` clean.
    
    Regression coverage for both the previous commit's Tag block fix
    and this commit's additions lands in the next commit.
  • fix(ci): cover Unicode Tag block (U+E0000–U+E007F) in check-unicode-safety
    `isDangerousInvisibleCodePoint` enumerated seven ranges of invisible/
    bidi/variation-selector code points but omitted the Unicode Tag block
    (U+E0000–U+E007F). Tag characters were proposed for language tagging
    in Unicode 3.1 and have been deprecated since Unicode 5.1, so no
    legitimate text uses them. They are the canonical vector for
    "ASCII Smuggling" / "Tag Smuggling" LLM prompt injection: an attacker
    hides instructions inside an ASCII-looking string, the model reads
    the tag bytes, the human reviewer sees nothing. Demonstrated against
    multiple LLM assistants during 2024–2025.
    
    `check-unicode-safety.js` is the repo's last line of defence before
    contributor content reaches agent context; the same script also runs
    in `--write` auto-sanitize mode on `.md` / `.mdx` / `.txt`. Today it
    silently passes tag-block characters through unchanged in both
    detection mode and `--write` mode.
    
    Reproduced before this commit:
    
      $ mkdir -p /tmp/uni-test && node -e "
          const fs = require('fs');
          const hidden = [...Array(5)].map((_,i) =>
            String.fromCodePoint(0xE0041 + i)).join('');
          fs.writeFileSync('/tmp/uni-test/innocent.md',
            '# Title\\n\\nBenign text' + hidden + ' more.\\n');"
    
      $ ECC_UNICODE_SCAN_ROOT=/tmp/uni-test \
          node scripts/ci/check-unicode-safety.js
      Unicode safety check passed.
      $ echo $?
      0
    
    Expected: tag-block characters reported as `dangerous-invisible`
    violations (exit 1) and stripped under `--write`.
    Actual: validator passes, `--write` leaves the bytes intact.
    
    Fix: extend the denylist with one new range
    `(codePoint >= 0xE0000 && codePoint <= 0xE007F)`. The change is
    purely additive; the existing seven ranges are untouched.
    
    After this commit the same reproduction returns:
    
      $ ECC_UNICODE_SCAN_ROOT=/tmp/uni-test \
          node scripts/ci/check-unicode-safety.js
      Unicode safety violations detected:
      innocent.md:3:12 dangerous-invisible U+E0041
      innocent.md:3:14 dangerous-invisible U+E0042
      innocent.md:3:16 dangerous-invisible U+E0043
      innocent.md:3:18 dangerous-invisible U+E0044
      innocent.md:3:20 dangerous-invisible U+E0045
      exit=1
    
    `--write` mode also strips the bytes (verified: file length 47 → 42
    after sanitize, regex `/[\u{E0000}-\u{E007F}]/u` no longer matches).
    
    Existing 5 unicode-safety tests still pass; `yarn lint` clean. The
    ECC repo's own self-scan (`node scripts/ci/check-unicode-safety.js`
    with no `ECC_UNICODE_SCAN_ROOT`) reports the same warnings as before
    this commit and exits with the same status (no regressions on
    in-repo content).
    
    A handful of other widely-cited invisible code points are missing
    from the denylist (`U+180E`, `U+115F`, `U+1160`, `U+2061–U+2064`,
    `U+3164`); those are addressed in the next commit so each fix
    remains independently reviewable. Regression coverage for both
    fixes lands two commits later.
  • docs(hooks): correct PreToolUse → PostToolUse in readSessionCost docblock
    greptile P2 nitpick: the previous commit's docblock said "on every
    PreToolUse hook" but the module header (and the actual hook wiring
    in `hooks/hooks.json`) identifies this script as a PostToolUse
    hook — it runs *after* each tool invocation to update the running
    session aggregate. One-word typo, no behavior change.
  • fix(hooks): log fail-open breadcrumb on parse/read errors in metrics bridge
    coderabbitai flagged: the two `catch` blocks in `readSessionCost`
    silently swallowed every failure mode. A malformed `costs.jsonl`
    row, a permission error opening the file, or any other unexpected
    I/O failure would silently return zero cost — masking real
    problems and feeding stale or zero numbers into
    `ecc-context-monitor.js` (which then injects them as
    `additionalContext` into the live model turn).
    
    Fix two things, both fail-open-preserving:
    
    1. **Inner JSON.parse catch** — count malformed lines and write
       one aggregated breadcrumb per call:
    
         [ecc-metrics-bridge] skipped N malformed line(s) in <path>
    
       Aggregating (rather than per-line) keeps a log-flooded
       `costs.jsonl` diagnosable without overwhelming stderr.
    
    2. **Outer fs.readFileSync catch** — write a breadcrumb on real
       errors, but stay silent on `ENOENT`. The "no costs.jsonl yet"
       case is genuinely normal (no Stop event has fired this session)
       and producing noise on every PreToolUse before the first Stop
       would be reviewer-visible spam. All other error codes
       (`EACCES`, `EISDIR`, `EMFILE`, …) get:
    
         [ecc-metrics-bridge] failing open after <name> reading <path>: <msg>
    
    In both cases the function still returns the zero-cost fallback
    so the bridge never breaks tool execution — only the
    diagnosability changes.
    
    Two new regression tests in
    `tests/hooks/ecc-metrics-bridge.test.js`:
    
      ✓ readSessionCost writes a stderr breadcrumb when malformed
        lines are skipped — feeds 4 rows (2 valid, 2 malformed),
        asserts the last valid row still wins AND captured stderr
        contains "skipped 2 malformed line(s)".
    
      ✓ readSessionCost stays silent when costs.jsonl does not exist
        (ENOENT) — uses a fresh tmp HOME with no metrics dir, asserts
        zero return AND empty stderr.
    
    Test count: 16 → 18; `npm test` green; `yarn lint` clean.
  • fix(hooks): scan full costs.jsonl when locating session row
    `readSessionCost` read only the trailing 8 KiB of
    `~/.claude/metrics/costs.jsonl` to "avoid scanning entire file".
    That ceiling is the opposite-sign sibling of the double-count bug
    fixed in the previous commit: once a session's most recent
    cumulative row gets pushed past the 8 KiB window by newer rows
    from other sessions, the bridge silently reports `totalCost: 0`,
    `totalIn: 0`, `totalOut: 0` for that session — same false signal
    to `ecc-context-monitor.js`, same wrong number injected into the
    live model turn as `additionalContext`.
    
    `cost-tracker.js` has no rotation policy, so on any non-trivial
    workstation costs.jsonl grows past 8 KiB within minutes of normal
    use. For users who keep multiple concurrent sessions, this means
    the second-and-later sessions silently report zero almost
    immediately.
    
    Reproduced before this commit:
    
      $ HOME=/tmp/eccc node -e '
          const fs = require("fs");
          const m = require("./scripts/hooks/ecc-metrics-bridge.js");
          // S1 row at file start, then 200 rows of OTHER-session noise (~16 KiB).
          // S1 is the row we want, but it sits past the 8 KiB tail.
          const s1 = `{"session_id":"S1","estimated_cost_usd":0.5,"input_tokens":500,"output_tokens":250}`;
          const other = `{"session_id":"OTHER","estimated_cost_usd":1,"input_tokens":100,"output_tokens":50}`;
          fs.mkdirSync("/tmp/eccc/.claude/metrics", { recursive: true });
          fs.writeFileSync("/tmp/eccc/.claude/metrics/costs.jsonl",
            [s1, ...Array(200).fill(other)].join("\\n") + "\\n");
          console.log(JSON.stringify(m.readSessionCost("S1")));'
      {"totalCost":0,"totalIn":0,"totalOut":0}
    
    Expected: `{"totalCost":0.5, "totalIn":500, "totalOut":250}` (the
    S1 row that exists in the file).
    Actual: zero — the row is past the 8 KiB tail.
    
    Fix: drop the `fs.openSync` + bounded `fs.readSync` + position
    arithmetic in favour of `fs.readFileSync(costsPath, 'utf8')` and
    iterate every line. Each row is ~150 bytes; even 100k rows is
    ~15 MB and a single sync read on PreToolUse is in the low ms.
    If file rotation lands in `cost-tracker.js` later, this scan
    becomes proportionally cheaper.
    
    After this commit the reproduction above returns
    `{"totalCost":0.5, "totalIn":500, "totalOut":250}`.
    
    Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
    `readSessionCost finds session row beyond the old 8 KiB tail
    boundary`. The test asserts the costs.jsonl fixture is > 8 KiB
    before reading so any reintroduction of a bounded tail would
    re-fail the test (i.e. the assertion is the contract, not the
    specific number 8192).
    
    Together with the previous commit, both directions of the
    metrics-bridge cost-reporting bug are closed.
  • fix(hooks): use last cumulative row for session cost in metrics bridge
    `ecc-metrics-bridge.js#readSessionCost` summed the
    `estimated_cost_usd`, `input_tokens`, and `output_tokens` of
    every matching row in `~/.claude/metrics/costs.jsonl`. That breaks
    the documented contract of `scripts/hooks/cost-tracker.js`, which
    explicitly states (in its module docblock):
    
      Cumulative behavior: Stop fires per assistant response, not
      per session. Each row therefore represents the cumulative
      session total up to that point. To get per-session cost, take
      the last row per session_id.
    
    Summing N cumulative rows over-counts by roughly (N+1)/2 ×. For a
    session with 3 rows at 0.01, 0.02, 0.03 USD (true running total
    0.03), the bridge today reports 0.06 USD. The over-counted value
    feeds `ecc-context-monitor.js`, which then trips its
    COST_NOTICE_USD / COST_WARNING_USD / COST_CRITICAL_USD thresholds
    on phantom spend AND injects the inflated number as
    `additionalContext` into the live model turn — so the agent
    itself is told a wrong cost.
    
    Reproduced on `main` before this commit:
    
      $ cat > /tmp/eccc/.claude/metrics/costs.jsonl <<EOF
      {"session_id":"S1","estimated_cost_usd":0.01,"input_tokens":333,"output_tokens":166}
      {"session_id":"S1","estimated_cost_usd":0.02,"input_tokens":666,"output_tokens":333}
      {"session_id":"S1","estimated_cost_usd":0.03,"input_tokens":1000,"output_tokens":500}
      EOF
    
      $ HOME=/tmp/eccc node -e 'const m = require("./scripts/hooks/ecc-metrics-bridge.js"); \
          console.log(JSON.stringify(m.readSessionCost("S1")))'
      {"totalCost":0.06,"totalIn":1999,"totalOut":999}
    
    Expected: `{"totalCost":0.03,"totalIn":1000,"totalOut":500}` (the
    last cumulative row).
    Actual: 2× over-count.
    
    Fix: replace `+=` with `=` in the matching branch so the assigned
    values reflect the most recent row encountered. The iteration
    order is file order, which is also event time order, so the last
    assignment wins — exactly the contract cost-tracker writes
    against.
    
    After this commit the reproduction above returns
    `{"totalCost":0.03,"totalIn":1000,"totalOut":500}`.
    
    Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
    `readSessionCost returns the LAST cumulative row, not the sum
    (cost-tracker contract)`. The existing
    `readSessionCost does not include unrelated default-session rows`
    test happened to pass even with the bug because it only had one
    target-session row — single-row sessions are coincidentally
    correct under both formulas. The new test uses three rows so the
    two formulas diverge.
    
    A second issue in the same function — the 8 KiB tail-only read
    silently drops older rows once a session's recent cumulative
    totals scroll past that window — is fixed in the next commit.
  • fix(ci): match quoted write-all + dedupe duplicate checkout violations
    Two round-1 review findings, fixed together because they touch the
    same regex/loop region of `findViolations`:
    
    1. **cubic P0 — quoted write-all bypass**.
       `WRITE_ALL_PATTERN` was `/^\s*permissions:\s*write-all\b/m`, which
       does not match the perfectly valid YAML forms
       `permissions: "write-all"` and `permissions: 'write-all'`. A
       workflow that quoted the shorthand slipped right through the
       persist-credentials gate the previous commit was supposed to close.
    
       Reproduced before this commit:
         $ cat /tmp/q.yml
         name: bad
         on: [push]
         permissions: "write-all"
         jobs:
           do:
             runs-on: ubuntu-latest
             steps:
               - uses: actions/checkout@v4
         $ ECC_WORKFLOWS_DIR=/tmp node scripts/ci/validate-workflow-security.js
         Validated workflow security for 1 workflow files
         exit=0
    
       Fix: tighten the regex to
         /^\s*permissions:\s*["']?write-all["']?\s*$/m
       which accepts the bare, double-quoted, and single-quoted YAML forms
       while still anchoring on the `permissions:` key. The trailing `\s*$`
       prevents accidentally matching keys whose value happens to start
       with `write-all` (e.g. some future literal `write-all-something`).
    
    2. **greptile P2 — duplicate violation when both patterns match**.
       A `ref: refs/pull/${{ github.event.pull_request.head.sha }}/merge`
       value matches both the `pull_request_target` rule's
       `expressionPattern` (the `head.sha` interpolation) and its
       `refPattern` (the `refs/pull/` literal). Each push generates an
       ERROR line with the same description and just a different
       `expression:` echo, so the reviewer sees the same violation twice.
    
       Fix: track `stepFlagged` inside the per-step loop and skip the
       `refPattern` fallback once any `expressionPattern` match has already
       produced a violation for this step. The `refPattern` is a fallback
       for ref-only forms (`refs/pull/123/head`, `${{ env.X }}` whose
       resolved value is a PR ref); when the more specific expression
       already fires, the fallback is redundant by definition.
    
    After both fixes, the round-1 reproductions resolve cleanly:
    
      $ # quoted form now blocks
      $ ECC_WORKFLOWS_DIR=/tmp/q1/.github/workflows node scripts/ci/validate-workflow-security.js
      ERROR: quoted.yml:8 - workflows with write permissions must disable checkout credential persistence
      exit=1
    
      $ # combined head.sha + refs/pull now prints one ERROR, not two
      $ ECC_WORKFLOWS_DIR=/tmp/q2/.github/workflows node scripts/ci/validate-workflow-security.js
      ERROR: dup.yml:10 - pull_request_target must not checkout an untrusted pull_request head ref/repository
        Unsafe expression: ${{ github.event.pull_request.head.sha }}
      exit=1
    
    Test additions land in the next commit.
  • fix(ci): flag refs/pull checkouts under pull_request_target
    The `pull_request_target` rule's `expressionPattern` matches only
    the canonical `github.event.pull_request.head.{ref,sha,repo.full_name}`
    interpolations. It does not match the second canonical form of
    the same exploit — fetching `refs/pull/<N>/{head,merge}` directly:
    
      - uses: actions/checkout@v4
        with:
          ref: refs/pull/${{ github.event.pull_request.number }}/merge
    
    The merge-ref variant is what GitHub's own security guidance calls
    out as the highest-severity privilege-escalation pattern under
    `pull_request_target`: it materialises the PR's merge commit
    (attacker code spliced with base), executes inside a workflow that
    has full repo-scoped tokens, and gives the attacker the chance to
    exfiltrate secrets or push to default branches. `refs/pull/N/head`
    is functionally equivalent — same source, same trust boundary.
    
    Reproduced on `main` before this commit:
    
      $ cat /tmp/bad.yml
      name: bad
      on: { pull_request_target: { types: [opened] } }
      permissions: { contents: read }
      jobs:
        do:
          runs-on: ubuntu-latest
          steps:
            - uses: actions/checkout@v4
              with:
                ref: refs/pull/${{ github.event.pull_request.number }}/merge
                persist-credentials: false
            - run: npm ci --ignore-scripts
    
      $ ECC_WORKFLOWS_DIR=/tmp node scripts/ci/validate-workflow-security.js
      Validated workflow security for 1 workflow files
      $ echo $?
      0
    
    Expected: violation flagging the refs/pull checkout under pull_request_target.
    Actual: passes silently.
    
    Fix: add a `refPattern` to the `pull_request_target` rule:
    
        /^\s*ref:\s*['"]?[^'"\n]*refs\/(?:remotes\/)?pull\/[^'"\n\s]+/m
    
    and apply it per checkout step inside the existing
    event-gated loop. The pattern matches the ref VALUE so it catches
    all interpolation shapes — `refs/pull/123/head`,
    `refs/pull/${{ github.event.pull_request.number }}/merge`,
    `${{ env.FOO }}/refs/pull/N/head` — without enumerating the
    possible interpolations themselves.
    
    Scoping: the rule is already gated on the workflow containing
    `pull_request_target:`, so non-privileged `pull_request` workflows
    that legitimately check out a PR ref are not affected.
    
    After this commit the reproduction above exits 1 with:
    
      ERROR: bad.yml:10 - pull_request_target must not checkout an untrusted pull_request head ref/repository
    
    Three new regression tests in `tests/ci/validate-workflow-security.test.js`:
      - rejects pull_request_target + refs/pull/<N>/merge
      - rejects pull_request_target + hardcoded refs/pull/<N>/head
      - allows pull_request_target with no `with.ref:` (base-ref checkout —
        the safe pattern from GitHub's own guidance)
    
    Test count: 17 → 20 in this file; full `yarn test` still green.
    
    Together with the previous commit, this closes the two
    independent `validate-workflow-security.js` bypasses I found.
  • fix(ci): treat 'permissions: write-all' as a write-permission gate
    `WRITE_PERMISSION_PATTERN` in `validate-workflow-security.js`
    enumerates named GitHub Actions scopes (`contents: write`,
    `issues: write`, etc.) to decide whether a workflow needs to:
      - disable `persist-credentials` on `actions/checkout`
      - pass `--ignore-scripts` to `npm ci`
    
    The pattern misses the top-level shorthand `permissions:
    write-all`, which is the strictly broader form — it grants every
    named scope write access in a single line. As a result, a
    workflow that opts into write-all currently slips both gates.
    
    Reproduced on `main` before this commit:
    
      $ cat /tmp/bad.yml
      name: bad
      on: [push]
      permissions: write-all
      jobs:
        do:
          runs-on: ubuntu-latest
          steps:
            - uses: actions/checkout@v4
            - run: npm ci
    
      $ ECC_WORKFLOWS_DIR=/tmp node scripts/ci/validate-workflow-security.js
      Validated workflow security for 1 workflow files
      $ echo $?
      0
    
    Expected: at least two violations (missing `persist-credentials:
    false`, missing `--ignore-scripts`).
    Actual: passes silently.
    
    Fix: add a sibling pattern `WRITE_ALL_PATTERN` that matches
    `^\s*permissions:\s*write-all\b` and OR it with
    `WRITE_PERMISSION_PATTERN` at the single gate. Both top-level
    and job-level `permissions:` blocks satisfy the `^\s*` prefix.
    
    After this commit the reproduction above exits 1 with:
    
      ERROR: bad.yml:8 - workflows with write permissions must disable checkout credential persistence
      ERROR: bad.yml:9 - workflows with write permissions must install npm dependencies with --ignore-scripts
    
    Three new regression tests in `tests/ci/validate-workflow-security.test.js`:
      - rejects write-all + credential-persisting checkout
      - rejects write-all + `npm ci` without `--ignore-scripts`
      - allows write-all when both gates are satisfied (no over-block)
    
    Test count: 14 → 17 in this file; full `yarn test` still green.
    
    A separate `refs/pull/N/merge` bypass under `pull_request_target`
    exists in the same validator and is fixed in the next commit.
  • feat(installer): add --locale flag for translated docs installation
    Adds `--locale <code>` support to the ECC installer so users can install
    localized reference docs (agents, commands, skills, rules) into
    `~/.claude/docs/<locale>/` alongside the existing English installation.
    
    Changes:
    - manifests/install-modules.json: add 8 locale doc modules (docs-ja-JP,
      docs-zh-CN, docs-ko-KR, docs-pt-BR, docs-ru, docs-tr, docs-vi-VN,
      docs-zh-TW), each with kind="docs" and defaultInstall=false
    - manifests/install-components.json: add 8 locale: components mapping to
      the new modules
    - scripts/lib/install-manifests.js: add locale: family prefix,
      SUPPORTED_LOCALES, LOCALE_ALIAS_TO_COMPONENT_ID (with aliases like
      ja=ja-JP, zh=zh-CN, ko=ko-KR), and listSupportedLocales()
    - scripts/lib/install/request.js: add --locale flag to parseInstallArgs(),
      resolve locale alias → component ID in normalizeInstallRequest(), throw
      on unsupported locale codes
    - scripts/lib/install-targets/claude-home.js: map docs/<locale>/ source
      paths to ~/.claude/docs/<locale>/ destination (side-by-side, no overwrite
      of English files)
    - scripts/install-apply.js: import listSupportedLocales, add --locale
      usage line and available locales list to --help output
    
    Usage examples:
      ./install.sh --locale ja                    # Japanese docs only
      ./install.sh --profile core --locale zh-CN  # core profile + zh-CN docs
      ./install.sh typescript --locale ja         # legacy + locale (errors)