everything-claude-code

fix(hooks): use shared renameWithRetry in writeWarnState (ecc-context-monitor)

Mirror the previous commit's Windows-EPERM retry on the companion
`writeWarnState` in `scripts/hooks/ecc-context-monitor.js`. Same
race: two PostToolUse subprocesses writing concurrent debounce
state racing on `MoveFileExW`, target-in-use throwing EPERM on
Windows even though each writer's tmp path is now unique.

Implementation: import `renameWithRetry` from `scripts/lib/session-bridge.js`
(exported in the previous commit) instead of duplicating the helper.
The retry policy, backoff schedule, and main-thread `Atomics.wait`
strategy stay identical to `writeBridgeAtomic`.

Three writers in the repo now share the same atomic-write contract:
- `writeBridgeAtomic` (scripts/lib/session-bridge.js) — round 1 +
  this round's retry
- `writeWarnState` (this file) — round 1 + this round's retry via shared helper
- `writeCostWarningIfChanged` (scripts/hooks/ecc-metrics-bridge.js) —
  out of scope for this PR (already uses unique tmp suffix; a future
  consolidation could move it to the shared helper too).

Local: `yarn test` green, `yarn lint` clean. The companion test
suite for `ecc-context-monitor.js` does not currently exercise
concurrent `writeWarnState` writes, but the helper it now uses is
covered by the `tests/lib/session-bridge.test.js` concurrent-write
regression added in round 1's last commit.

Jamkris · 2026-05-19 04:57:10 -04:00

f93e8f6869

fix(lib): retry rename on Windows EPERM/EACCES/EBUSY in writeBridgeAtomic

PR #1983 round 1 introduced unique-suffix tmp paths so two concurrent
writers no longer share a single `.tmp` file. That fix is correct
under POSIX semantics — `rename(2)` is atomic between source and
destination, so each writer renames onto the same target without
conflict.

Windows `MoveFileExW` is not the same. It fails with
EPERM / EACCES / EBUSY when the target is currently being renamed
by *another* process — a short race window that fires reliably under
this hook's PostToolUse + statusline concurrency. Round 1's CI run
made this visible:

  Test (windows-latest, Node 18.x, npm) — FAILURE
  Error: EPERM: operation not permitted, rename
    'C:\…\ecc-metrics-test-bridge-race-….json.9504.4aef575a.tmp' ->
    'C:\…\ecc-metrics-test-bridge-race-….json'
      at writeBridgeAtomic (scripts/lib/session-bridge.js:79:8)

All nine Windows matrix cells (Node 18 / 20 / 22 × npm / pnpm / yarn)
hit the same path. POSIX matrices (Linux + macOS) passed unchanged.

Fix: extract a `renameWithRetry(tmp, target)` helper that retries
`fs.renameSync` up to 5 times on EPERM / EACCES / EBUSY with
exponential backoff (20 ms → 320 ms total). Other error codes
(ENOENT, ENOSPC, EROFS, …) re-throw on the first attempt — they are
not transient. POSIX runs hit the first try and exit immediately.

The backoff uses `Atomics.wait` on a throwaway `SharedArrayBuffer`
so the retry path does not busy-spin the CPU; verified on Node ≥ 17
that this works on the main thread. There is a `try/catch` fallback
to a brief busy-wait for older runtimes where `Atomics.wait` is
restricted to workers.

`writeBridgeAtomic` calls the helper instead of `fs.renameSync` and
keeps its existing best-effort tmp cleanup on terminal failure.

`renameWithRetry` is added to `module.exports` so the companion
`writeWarnState` in `scripts/hooks/ecc-context-monitor.js` can
adopt the same retry policy without duplicating the helper. That
adoption lands in the next commit.

Local: `node tests/lib/session-bridge.test.js` 14/14, `yarn test`
green, `yarn lint` clean. The round-1 test (two concurrent child
writers, 200 iterations each) now passes on macOS without retrying
at all (POSIX path) and is expected to pass on Windows via the new
retry loop.

Jamkris · 2026-05-19 04:57:10 -04:00

116e61d8cb

fix(hooks): use unique tmp suffix in writeWarnState (ecc-context-monitor)

Mirror the previous commit's `writeBridgeAtomic` fix on the
companion `writeWarnState` in `ecc-context-monitor.js`. Same shape:
fixed `${target}.tmp` → `${target}.${process.pid}.${randomNonce}.tmp`,
plus best-effort cleanup of the tmp file on `renameSync` failure
(throws original error after cleanup).

`writeWarnState` debounces the context-monitor's threshold alarms
(`COST_NOTICE_USD`, `COST_WARNING_USD`, `COST_CRITICAL_USD`, plus the
context-remaining and loop-detection ones). Without unique suffixes,
two PostToolUse subprocesses racing on the warn-state file produce
either a corrupted JSON debounce-state on disk or an ENOENT throw
that the hook catches and swallows — either way the next warn-state
read returns the default `{callsSinceWarn: 0, lastSeverity: null}`
and the threshold alarms re-fire or stop firing erratically. Users
see warning messages flicker or vanish; debounce no longer works.

Three call sites in this repo now share the same atomic-write
contract:
- `writeBridgeAtomic` (scripts/lib/session-bridge.js) — primary
- `writeCostWarningIfChanged` (scripts/hooks/ecc-metrics-bridge.js) — cost cache
- `writeWarnState` (this file) — debounce state

`yarn lint` clean. Regression test covering both `writeBridgeAtomic`
and `writeWarnState` under concurrent load lands in the next commit.

Jamkris · 2026-05-19 04:57:10 -04:00

7c2f71315b

fix(lib): use unique tmp suffix in writeBridgeAtomic to eliminate ENOENT race

`writeBridgeAtomic` wrote to a fixed `${target}.tmp` path before
calling `renameSync`. When two processes write to the same session
bridge concurrently (e.g. PostToolUse `ecc-metrics-bridge` + the
background `ecc-statusline`, both calling `writeBridgeAtomic(sessionId, ...)`),
the canonical atomic-rename race fires:

  1. Process A: writeFileSync(target.tmp, JSON_A) — tmp file exists.
  2. Process B: writeFileSync(target.tmp, JSON_B) — tmp file overwritten.
  3. Process A: renameSync(target.tmp, target) — succeeds; target = JSON_B
     (A's payload silently corrupted en-route).
  4. Process B: renameSync(target.tmp, target) — throws ENOENT (the
     rename consumed the file).

Every caller in the repo wraps `writeBridgeAtomic` in `try {} catch {}`,
so the ENOENT exception is swallowed and the user-visible symptom is
just "the bridge file occasionally contains the wrong process's
payload" with no diagnostic.

Reproduced before this commit:

  $ # two concurrent writers, each calling writeBridgeAtomic 500 times
  $ # against the same session ID
  [A] errors=244   # 244 ENOENT exceptions swallowed
  [B] errors=248   # ditto

After this commit the same workload reports 0 errors in both
subprocesses: tmp paths no longer collide.

Fix: change `${target}.tmp` to
`${target}.${process.pid}.${crypto.randomBytes(4).toString('hex')}.tmp`,
matching the pattern already used by `writeCostWarningIfChanged` in
`scripts/hooks/ecc-metrics-bridge.js` (commit 9b1d8918). The pid +
4-byte nonce gives each writer process a distinct tmp path, so step 2
above no longer overwrites step 1's payload and step 4 no longer
races step 3.

Also added: on `renameSync` failure, attempt `fs.unlinkSync(tmp)` so
a writer that fails (disk full, permission, parent dir gone) does
not leak its tmp file. The cleanup is best-effort and the original
error is still re-thrown.

**Scope clarification.** This commit closes the atomic-rename
primitive's race only. The *read-modify-write* race in callers —
two writers each read the same bridge state, increment, and write
back, the second clobbering the first — is a separate concern that
needs locking or per-writer logs, and is intentionally out of scope
for this PR. The cost-tracker / metrics-bridge callers tolerate
last-writer-wins on their cumulative aggregates today and this
commit does not change that contract.

The companion `writeWarnState` in `ecc-context-monitor.js` has the
same fixed-suffix pattern and the same race; that fix lands in the
next commit so each can be reviewed against its own diff.

Jamkris · 2026-05-19 04:57:10 -04:00

28548f67ba

fix(ci): cover other widely-cited invisible code points in check-unicode-safety

Extend `isDangerousInvisibleCodePoint` with five additional code
points / ranges that are routinely cited in invisible-character
smuggling references but were not in the previous denylist:

- **U+180E** MONGOLIAN VOWEL SEPARATOR. Formerly classified as a
  space separator (Zs) until Unicode 6.3 reclassified it as Cf
  (Format control). Renders as zero-width; widely abused for
  homograph attacks and prompt smuggling.

- **U+115F** HANGUL CHOSEONG FILLER and **U+1160** HANGUL JUNGSEONG
  FILLER. Zero-width fillers used in Korean text shaping. Both are
  cited as common LLM-injection vectors in Korean / multilingual
  threat models.

- **U+2061–U+2064** invisible math operators (FUNCTION APPLICATION,
  INVISIBLE TIMES, INVISIBLE SEPARATOR, INVISIBLE PLUS). Zero-width
  and only meaningful inside math typesetting. No legitimate
  Markdown or source code uses them.

- **U+3164** HANGUL FILLER. Reported in real-world Discord and
  Twitter smuggling incidents; not used in legitimate Korean text.

Reproduced before this commit: a file containing any one of these
code points passed `check-unicode-safety.js` silently.

After this commit each one is reported as
`dangerous-invisible U+<HEX>` and `--write` mode strips it.

Verified by writing 8 single-character probe files
(`probe-0x180E.md`, `probe-0x115F.md`, …) and confirming exit=1 with
each violation listed.

ECC repo self-scan reports only the pre-existing `U+2605` BLACK
STAR warnings (unchanged) and exits with the same status (no new
in-repo violations introduced). Existing 5 unicode-safety tests
still pass; `yarn lint` clean.

Regression coverage for both the previous commit's Tag block fix
and this commit's additions lands in the next commit.

Jamkris · 2026-05-18 21:20:36 -04:00

b068069b9b

fix(ci): cover Unicode Tag block (U+E0000–U+E007F) in check-unicode-safety

`isDangerousInvisibleCodePoint` enumerated seven ranges of invisible/
bidi/variation-selector code points but omitted the Unicode Tag block
(U+E0000–U+E007F). Tag characters were proposed for language tagging
in Unicode 3.1 and have been deprecated since Unicode 5.1, so no
legitimate text uses them. They are the canonical vector for
"ASCII Smuggling" / "Tag Smuggling" LLM prompt injection: an attacker
hides instructions inside an ASCII-looking string, the model reads
the tag bytes, the human reviewer sees nothing. Demonstrated against
multiple LLM assistants during 2024–2025.

`check-unicode-safety.js` is the repo's last line of defence before
contributor content reaches agent context; the same script also runs
in `--write` auto-sanitize mode on `.md` / `.mdx` / `.txt`. Today it
silently passes tag-block characters through unchanged in both
detection mode and `--write` mode.

Reproduced before this commit:

  $ mkdir -p /tmp/uni-test && node -e "
      const fs = require('fs');
      const hidden = [...Array(5)].map((_,i) =>
        String.fromCodePoint(0xE0041 + i)).join('');
      fs.writeFileSync('/tmp/uni-test/innocent.md',
        '# Title\\n\\nBenign text' + hidden + ' more.\\n');"

  $ ECC_UNICODE_SCAN_ROOT=/tmp/uni-test \
      node scripts/ci/check-unicode-safety.js
  Unicode safety check passed.
  $ echo $?
  0

Expected: tag-block characters reported as `dangerous-invisible`
violations (exit 1) and stripped under `--write`.
Actual: validator passes, `--write` leaves the bytes intact.

Fix: extend the denylist with one new range
`(codePoint >= 0xE0000 && codePoint <= 0xE007F)`. The change is
purely additive; the existing seven ranges are untouched.

After this commit the same reproduction returns:

  $ ECC_UNICODE_SCAN_ROOT=/tmp/uni-test \
      node scripts/ci/check-unicode-safety.js
  Unicode safety violations detected:
  innocent.md:3:12 dangerous-invisible U+E0041
  innocent.md:3:14 dangerous-invisible U+E0042
  innocent.md:3:16 dangerous-invisible U+E0043
  innocent.md:3:18 dangerous-invisible U+E0044
  innocent.md:3:20 dangerous-invisible U+E0045
  exit=1

`--write` mode also strips the bytes (verified: file length 47 → 42
after sanitize, regex `/[\u{E0000}-\u{E007F}]/u` no longer matches).

Existing 5 unicode-safety tests still pass; `yarn lint` clean. The
ECC repo's own self-scan (`node scripts/ci/check-unicode-safety.js`
with no `ECC_UNICODE_SCAN_ROOT`) reports the same warnings as before
this commit and exits with the same status (no regressions on
in-repo content).

A handful of other widely-cited invisible code points are missing
from the denylist (`U+180E`, `U+115F`, `U+1160`, `U+2061–U+2064`,
`U+3164`); those are addressed in the next commit so each fix
remains independently reviewable. Regression coverage for both
fixes lands two commits later.

Jamkris · 2026-05-18 21:20:36 -04:00

e3483fda15

docs: narrow ecc tools billing blocker

Affaan Mustafa · 2026-05-18 16:45:31 -04:00

cb81f1b0fe

docs: refresh release blockers evidence

Affaan Mustafa · 2026-05-18 15:23:48 -04:00

0f1775e30b

security: scope release oidc publishing

Affaan Mustafa · 2026-05-18 13:41:10 -04:00

7911af4a39

fix: treat MCP HTTP 406 probes as reachable

Affaan Mustafa · 2026-05-18 12:48:52 -04:00

386326df8e

docs: refresh publication readiness gate

Affaan Mustafa · 2026-05-18 10:49:49 -04:00

b41e6fb3d0

docs: mirror agentshield fleet ticket evidence

Affaan Mustafa · 2026-05-18 10:24:21 -04:00

2ba0c62d8a

test: enforce release publication checklist in readiness gates

Affaan Mustafa · 2026-05-18 09:10:51 -04:00

680aeff0fb

docs: record owner-wide queue cleanup

Affaan Mustafa · 2026-05-18 06:16:45 -04:00

08807e7fd6

docs: refresh ecc tools billing blocker evidence

Affaan Mustafa · 2026-05-18 05:38:14 -04:00

5e8f412cb5

docs: refresh billing readback gate evidence

Affaan Mustafa · 2026-05-18 04:30:09 -04:00

ff3eaff137

docs: mirror target billing readback gate

Affaan Mustafa · 2026-05-18 03:27:42 -04:00

fb4b0c8dce

fix: ignore defensive ioc deny rules

Affaan Mustafa · 2026-05-18 02:29:59 -04:00

04d4d81938

docs: refine billing readback dashboard blocker

Affaan Mustafa · 2026-05-18 02:03:37 -04:00

e53933de1b

docs: mirror marketplace billing provenance gate

Affaan Mustafa · 2026-05-18 00:36:01 -04:00

c276639bc7

docs: mirror billing kv readback gate

Affaan Mustafa · 2026-05-18 00:00:37 -04:00

34cc0c1856

docs: add rc1 release URL ledger

Affaan Mustafa · 2026-05-17 23:08:53 -04:00

81fca2cea6

test: align dashboard fixture with May 18 evidence

Affaan Mustafa · 2026-05-17 22:29:06 -04:00

d14191bed8

test: point platform audit at May 18 evidence

Affaan Mustafa · 2026-05-17 22:22:02 -04:00

5475db4f97

docs: add May 18 readiness evidence

Affaan Mustafa · 2026-05-17 22:19:42 -04:00

ec171300c6

fix(hooks): persist metrics warning dedup

Affaan Mustafa · 2026-05-17 21:41:24 -04:00

9b1d891870

fix(hooks): suppress repeated metrics warning breadcrumbs

Affaan Mustafa · 2026-05-17 21:41:24 -04:00

4cafdb8304

docs(hooks): correct PreToolUse → PostToolUse in readSessionCost docblock

greptile P2 nitpick: the previous commit's docblock said "on every
PreToolUse hook" but the module header (and the actual hook wiring
in `hooks/hooks.json`) identifies this script as a PostToolUse
hook — it runs *after* each tool invocation to update the running
session aggregate. One-word typo, no behavior change.

Jamkris · 2026-05-17 21:41:24 -04:00

2de0ce45d4

fix(hooks): log fail-open breadcrumb on parse/read errors in metrics bridge

coderabbitai flagged: the two `catch` blocks in `readSessionCost`
silently swallowed every failure mode. A malformed `costs.jsonl`
row, a permission error opening the file, or any other unexpected
I/O failure would silently return zero cost — masking real
problems and feeding stale or zero numbers into
`ecc-context-monitor.js` (which then injects them as
`additionalContext` into the live model turn).

Fix two things, both fail-open-preserving:

1. **Inner JSON.parse catch** — count malformed lines and write
   one aggregated breadcrumb per call:

     [ecc-metrics-bridge] skipped N malformed line(s) in <path>

   Aggregating (rather than per-line) keeps a log-flooded
   `costs.jsonl` diagnosable without overwhelming stderr.

2. **Outer fs.readFileSync catch** — write a breadcrumb on real
   errors, but stay silent on `ENOENT`. The "no costs.jsonl yet"
   case is genuinely normal (no Stop event has fired this session)
   and producing noise on every PreToolUse before the first Stop
   would be reviewer-visible spam. All other error codes
   (`EACCES`, `EISDIR`, `EMFILE`, …) get:

     [ecc-metrics-bridge] failing open after <name> reading <path>: <msg>

In both cases the function still returns the zero-cost fallback
so the bridge never breaks tool execution — only the
diagnosability changes.

Two new regression tests in
`tests/hooks/ecc-metrics-bridge.test.js`:

  ✓ readSessionCost writes a stderr breadcrumb when malformed
    lines are skipped — feeds 4 rows (2 valid, 2 malformed),
    asserts the last valid row still wins AND captured stderr
    contains "skipped 2 malformed line(s)".

  ✓ readSessionCost stays silent when costs.jsonl does not exist
    (ENOENT) — uses a fresh tmp HOME with no metrics dir, asserts
    zero return AND empty stderr.

Test count: 16 → 18; `npm test` green; `yarn lint` clean.

Jamkris · 2026-05-17 21:41:24 -04:00

086e44c964

fix(hooks): scan full costs.jsonl when locating session row

`readSessionCost` read only the trailing 8 KiB of
`~/.claude/metrics/costs.jsonl` to "avoid scanning entire file".
That ceiling is the opposite-sign sibling of the double-count bug
fixed in the previous commit: once a session's most recent
cumulative row gets pushed past the 8 KiB window by newer rows
from other sessions, the bridge silently reports `totalCost: 0`,
`totalIn: 0`, `totalOut: 0` for that session — same false signal
to `ecc-context-monitor.js`, same wrong number injected into the
live model turn as `additionalContext`.

`cost-tracker.js` has no rotation policy, so on any non-trivial
workstation costs.jsonl grows past 8 KiB within minutes of normal
use. For users who keep multiple concurrent sessions, this means
the second-and-later sessions silently report zero almost
immediately.

Reproduced before this commit:

  $ HOME=/tmp/eccc node -e '
      const fs = require("fs");
      const m = require("./scripts/hooks/ecc-metrics-bridge.js");
      // S1 row at file start, then 200 rows of OTHER-session noise (~16 KiB).
      // S1 is the row we want, but it sits past the 8 KiB tail.
      const s1 = `{"session_id":"S1","estimated_cost_usd":0.5,"input_tokens":500,"output_tokens":250}`;
      const other = `{"session_id":"OTHER","estimated_cost_usd":1,"input_tokens":100,"output_tokens":50}`;
      fs.mkdirSync("/tmp/eccc/.claude/metrics", { recursive: true });
      fs.writeFileSync("/tmp/eccc/.claude/metrics/costs.jsonl",
        [s1, ...Array(200).fill(other)].join("\\n") + "\\n");
      console.log(JSON.stringify(m.readSessionCost("S1")));'
  {"totalCost":0,"totalIn":0,"totalOut":0}

Expected: `{"totalCost":0.5, "totalIn":500, "totalOut":250}` (the
S1 row that exists in the file).
Actual: zero — the row is past the 8 KiB tail.

Fix: drop the `fs.openSync` + bounded `fs.readSync` + position
arithmetic in favour of `fs.readFileSync(costsPath, 'utf8')` and
iterate every line. Each row is ~150 bytes; even 100k rows is
~15 MB and a single sync read on PreToolUse is in the low ms.
If file rotation lands in `cost-tracker.js` later, this scan
becomes proportionally cheaper.

After this commit the reproduction above returns
`{"totalCost":0.5, "totalIn":500, "totalOut":250}`.

Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
`readSessionCost finds session row beyond the old 8 KiB tail
boundary`. The test asserts the costs.jsonl fixture is > 8 KiB
before reading so any reintroduction of a bounded tail would
re-fail the test (i.e. the assertion is the contract, not the
specific number 8192).

Together with the previous commit, both directions of the
metrics-bridge cost-reporting bug are closed.

Jamkris · 2026-05-17 21:41:24 -04:00

63c9788f50

fix(hooks): use last cumulative row for session cost in metrics bridge

`ecc-metrics-bridge.js#readSessionCost` summed the
`estimated_cost_usd`, `input_tokens`, and `output_tokens` of
every matching row in `~/.claude/metrics/costs.jsonl`. That breaks
the documented contract of `scripts/hooks/cost-tracker.js`, which
explicitly states (in its module docblock):

  Cumulative behavior: Stop fires per assistant response, not
  per session. Each row therefore represents the cumulative
  session total up to that point. To get per-session cost, take
  the last row per session_id.

Summing N cumulative rows over-counts by roughly (N+1)/2 ×. For a
session with 3 rows at 0.01, 0.02, 0.03 USD (true running total
0.03), the bridge today reports 0.06 USD. The over-counted value
feeds `ecc-context-monitor.js`, which then trips its
COST_NOTICE_USD / COST_WARNING_USD / COST_CRITICAL_USD thresholds
on phantom spend AND injects the inflated number as
`additionalContext` into the live model turn — so the agent
itself is told a wrong cost.

Reproduced on `main` before this commit:

  $ cat > /tmp/eccc/.claude/metrics/costs.jsonl <<EOF
  {"session_id":"S1","estimated_cost_usd":0.01,"input_tokens":333,"output_tokens":166}
  {"session_id":"S1","estimated_cost_usd":0.02,"input_tokens":666,"output_tokens":333}
  {"session_id":"S1","estimated_cost_usd":0.03,"input_tokens":1000,"output_tokens":500}
  EOF

  $ HOME=/tmp/eccc node -e 'const m = require("./scripts/hooks/ecc-metrics-bridge.js"); \
      console.log(JSON.stringify(m.readSessionCost("S1")))'
  {"totalCost":0.06,"totalIn":1999,"totalOut":999}

Expected: `{"totalCost":0.03,"totalIn":1000,"totalOut":500}` (the
last cumulative row).
Actual: 2× over-count.

Fix: replace `+=` with `=` in the matching branch so the assigned
values reflect the most recent row encountered. The iteration
order is file order, which is also event time order, so the last
assignment wins — exactly the contract cost-tracker writes
against.

After this commit the reproduction above returns
`{"totalCost":0.03,"totalIn":1000,"totalOut":500}`.

Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
`readSessionCost returns the LAST cumulative row, not the sum
(cost-tracker contract)`. The existing
`readSessionCost does not include unrelated default-session rows`
test happened to pass even with the bug because it only had one
target-session row — single-row sessions are coincidentally
correct under both formulas. The new test uses three rows so the
two formulas diverge.

A second issue in the same function — the 8 KiB tail-only read
silently drops older rows once a session's recent cumulative
totals scroll past that window — is fixed in the next commit.

Jamkris · 2026-05-17 21:41:24 -04:00

4f21ed2acf

fix(ci): match quoted write-all + dedupe duplicate checkout violations

Two round-1 review findings, fixed together because they touch the
same regex/loop region of `findViolations`:

1. **cubic P0 — quoted write-all bypass**.
   `WRITE_ALL_PATTERN` was `/^\s*permissions:\s*write-all\b/m`, which
   does not match the perfectly valid YAML forms
   `permissions: "write-all"` and `permissions: 'write-all'`. A
   workflow that quoted the shorthand slipped right through the
   persist-credentials gate the previous commit was supposed to close.

   Reproduced before this commit:
     $ cat /tmp/q.yml
     name: bad
     on: [push]
     permissions: "write-all"
     jobs:
       do:
         runs-on: ubuntu-latest
         steps:
           - uses: actions/checkout@v4
     $ ECC_WORKFLOWS_DIR=/tmp node scripts/ci/validate-workflow-security.js
     Validated workflow security for 1 workflow files
     exit=0

   Fix: tighten the regex to
     /^\s*permissions:\s*["']?write-all["']?\s*$/m
   which accepts the bare, double-quoted, and single-quoted YAML forms
   while still anchoring on the `permissions:` key. The trailing `\s*$`
   prevents accidentally matching keys whose value happens to start
   with `write-all` (e.g. some future literal `write-all-something`).

2. **greptile P2 — duplicate violation when both patterns match**.
   A `ref: refs/pull/${{ github.event.pull_request.head.sha }}/merge`
   value matches both the `pull_request_target` rule's
   `expressionPattern` (the `head.sha` interpolation) and its
   `refPattern` (the `refs/pull/` literal). Each push generates an
   ERROR line with the same description and just a different
   `expression:` echo, so the reviewer sees the same violation twice.

   Fix: track `stepFlagged` inside the per-step loop and skip the
   `refPattern` fallback once any `expressionPattern` match has already
   produced a violation for this step. The `refPattern` is a fallback
   for ref-only forms (`refs/pull/123/head`, `${{ env.X }}` whose
   resolved value is a PR ref); when the more specific expression
   already fires, the fallback is redundant by definition.

After both fixes, the round-1 reproductions resolve cleanly:

  $ # quoted form now blocks
  $ ECC_WORKFLOWS_DIR=/tmp/q1/.github/workflows node scripts/ci/validate-workflow-security.js
  ERROR: quoted.yml:8 - workflows with write permissions must disable checkout credential persistence
  exit=1

  $ # combined head.sha + refs/pull now prints one ERROR, not two
  $ ECC_WORKFLOWS_DIR=/tmp/q2/.github/workflows node scripts/ci/validate-workflow-security.js
  ERROR: dup.yml:10 - pull_request_target must not checkout an untrusted pull_request head ref/repository
    Unsafe expression: ${{ github.event.pull_request.head.sha }}
  exit=1

Test additions land in the next commit.

Jamkris · 2026-05-17 21:19:29 -04:00

e06d038257

fix(ci): flag refs/pull checkouts under pull_request_target

The `pull_request_target` rule's `expressionPattern` matches only
the canonical `github.event.pull_request.head.{ref,sha,repo.full_name}`
interpolations. It does not match the second canonical form of
the same exploit — fetching `refs/pull/<N>/{head,merge}` directly:

  - uses: actions/checkout@v4
    with:
      ref: refs/pull/${{ github.event.pull_request.number }}/merge

The merge-ref variant is what GitHub's own security guidance calls
out as the highest-severity privilege-escalation pattern under
`pull_request_target`: it materialises the PR's merge commit
(attacker code spliced with base), executes inside a workflow that
has full repo-scoped tokens, and gives the attacker the chance to
exfiltrate secrets or push to default branches. `refs/pull/N/head`
is functionally equivalent — same source, same trust boundary.

Reproduced on `main` before this commit:

  $ cat /tmp/bad.yml
  name: bad
  on: { pull_request_target: { types: [opened] } }
  permissions: { contents: read }
  jobs:
    do:
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v4
          with:
            ref: refs/pull/${{ github.event.pull_request.number }}/merge
            persist-credentials: false
        - run: npm ci --ignore-scripts

  $ ECC_WORKFLOWS_DIR=/tmp node scripts/ci/validate-workflow-security.js
  Validated workflow security for 1 workflow files
  $ echo $?
  0

Expected: violation flagging the refs/pull checkout under pull_request_target.
Actual: passes silently.

Fix: add a `refPattern` to the `pull_request_target` rule:

    /^\s*ref:\s*['"]?[^'"\n]*refs\/(?:remotes\/)?pull\/[^'"\n\s]+/m

and apply it per checkout step inside the existing
event-gated loop. The pattern matches the ref VALUE so it catches
all interpolation shapes — `refs/pull/123/head`,
`refs/pull/${{ github.event.pull_request.number }}/merge`,
`${{ env.FOO }}/refs/pull/N/head` — without enumerating the
possible interpolations themselves.

Scoping: the rule is already gated on the workflow containing
`pull_request_target:`, so non-privileged `pull_request` workflows
that legitimately check out a PR ref are not affected.

After this commit the reproduction above exits 1 with:

  ERROR: bad.yml:10 - pull_request_target must not checkout an untrusted pull_request head ref/repository

Three new regression tests in `tests/ci/validate-workflow-security.test.js`:
  - rejects pull_request_target + refs/pull/<N>/merge
  - rejects pull_request_target + hardcoded refs/pull/<N>/head
  - allows pull_request_target with no `with.ref:` (base-ref checkout —
    the safe pattern from GitHub's own guidance)

Test count: 17 → 20 in this file; full `yarn test` still green.

Together with the previous commit, this closes the two
independent `validate-workflow-security.js` bypasses I found.

Jamkris · 2026-05-17 21:19:29 -04:00

cdbc925d89

fix(ci): treat 'permissions: write-all' as a write-permission gate

`WRITE_PERMISSION_PATTERN` in `validate-workflow-security.js`
enumerates named GitHub Actions scopes (`contents: write`,
`issues: write`, etc.) to decide whether a workflow needs to:
  - disable `persist-credentials` on `actions/checkout`
  - pass `--ignore-scripts` to `npm ci`

The pattern misses the top-level shorthand `permissions:
write-all`, which is the strictly broader form — it grants every
named scope write access in a single line. As a result, a
workflow that opts into write-all currently slips both gates.

Reproduced on `main` before this commit:

  $ cat /tmp/bad.yml
  name: bad
  on: [push]
  permissions: write-all
  jobs:
    do:
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v4
        - run: npm ci

  $ ECC_WORKFLOWS_DIR=/tmp node scripts/ci/validate-workflow-security.js
  Validated workflow security for 1 workflow files
  $ echo $?
  0

Expected: at least two violations (missing `persist-credentials:
false`, missing `--ignore-scripts`).
Actual: passes silently.

Fix: add a sibling pattern `WRITE_ALL_PATTERN` that matches
`^\s*permissions:\s*write-all\b` and OR it with
`WRITE_PERMISSION_PATTERN` at the single gate. Both top-level
and job-level `permissions:` blocks satisfy the `^\s*` prefix.

After this commit the reproduction above exits 1 with:

  ERROR: bad.yml:8 - workflows with write permissions must disable checkout credential persistence
  ERROR: bad.yml:9 - workflows with write permissions must install npm dependencies with --ignore-scripts

Three new regression tests in `tests/ci/validate-workflow-security.test.js`:
  - rejects write-all + credential-persisting checkout
  - rejects write-all + `npm ci` without `--ignore-scripts`
  - allows write-all when both gates are satisfied (no over-block)

Test count: 14 → 17 in this file; full `yarn test` still green.

A separate `refs/pull/N/merge` bypass under `pull_request_target`
exists in the same validator and is fixed in the next commit.

Jamkris · 2026-05-17 21:19:29 -04:00

7f971b7e6f

fix(installer): harden locale docs install

Affaan Mustafa · 2026-05-17 20:46:04 -04:00

666b4e2261

feat(installer): add --locale flag for translated docs installation

Adds `--locale <code>` support to the ECC installer so users can install
localized reference docs (agents, commands, skills, rules) into
`~/.claude/docs/<locale>/` alongside the existing English installation.

Changes:
- manifests/install-modules.json: add 8 locale doc modules (docs-ja-JP,
  docs-zh-CN, docs-ko-KR, docs-pt-BR, docs-ru, docs-tr, docs-vi-VN,
  docs-zh-TW), each with kind="docs" and defaultInstall=false
- manifests/install-components.json: add 8 locale: components mapping to
  the new modules
- scripts/lib/install-manifests.js: add locale: family prefix,
  SUPPORTED_LOCALES, LOCALE_ALIAS_TO_COMPONENT_ID (with aliases like
  ja=ja-JP, zh=zh-CN, ko=ko-KR), and listSupportedLocales()
- scripts/lib/install/request.js: add --locale flag to parseInstallArgs(),
  resolve locale alias → component ID in normalizeInstallRequest(), throw
  on unsupported locale codes
- scripts/lib/install-targets/claude-home.js: map docs/<locale>/ source
  paths to ~/.claude/docs/<locale>/ destination (side-by-side, no overwrite
  of English files)
- scripts/install-apply.js: import listSupportedLocales, add --locale
  usage line and available locales list to --help output

Usage examples:
  ./install.sh --locale ja                    # Japanese docs only
  ./install.sh --profile core --locale zh-CN  # core profile + zh-CN docs
  ./install.sh typescript --locale ja         # legacy + locale (errors)