6 Commits

  • fix: stability batch — hook stdin truncation, Codex exa TOML, Stop hook JSON, GateGuard repetition (#2227)
    * fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)
    
    run-with-flags.js capped stdin at 1MB but every fallthrough path still
    echoed the truncated string to stdout. The harness parses hook stdout as
    JSON, got a document cut mid-stream, and blocked the tool call — so any
    Edit/Write with a >1MB hook payload was permanently blocked by every
    registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS
    gating could run.
    
    - Exit 0 with empty stdout (no opinion) when the stdin cap trips, before
      any echo or gating logic.
    - Flush stdout via write callback before process.exit: exiting right
      after stdout.write() dropped everything past the ~64KB pipe buffer,
      cutting even sub-cap pass-through payloads mid-JSON.
    
    Regression tests cover the enabled, disabled, and missing-arg paths for
    oversized payloads plus full echo of sub-cap >64KB payloads.
    
    * fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)
    
    The Codex MCP merge declared exa with a url key, but Codex's
    [mcp_servers.*] TOML schema is stdio-only — the url key makes the
    entire config.toml fail to load, bricking both the codex CLI and the
    desktop app. Every install/update re-injected the line because the
    urlEntry branch treated the broken entry as present.
    
    - ECC_SERVERS now emits only the current default set per
      docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args).
      Retired servers (supabase, playwright, context7, exa, github, memory,
      sequential-thinking) are never re-emitted; existing user-managed
      entries are untouched.
    - The merge now repairs the exact ECC-emitted broken form (url-only
      exa entry) on every run so re-running the installer fixes broken
      configs instead of preserving them. User stdio exa entries
      (command + mcp-remote) are left alone.
    - check-codex-global-state.sh requires chrome-devtools instead of the
      retired set, and flags url-only exa entries with a repair hint.
    
    Tests cover repair, re-run idempotence, stdio-entry preservation, and
    no-retired-server emission in add, update, dry-run, and disabled modes.
    
    * fix(hooks): never echo truncated stdin from Stop hooks (#2090)
    
    Stop hooks follow the ECC pass-through convention (echo stdin on
    stdout), but every echoing Stop hook capped stdin and echoed the capped
    string. The Stop payload carries last_assistant_message, so a long
    final assistant message produced a JSON document cut mid-stream on
    stdout, which the harness reports as 'Stop hook error: JSON validation
    failed' across the whole Stop chain.
    
    Reproduced: a Stop payload with a >64KB last_assistant_message run
    through run-with-flags + cost-tracker emitted exactly 65536 bytes of
    invalid JSON (cost-tracker capped stdin at 64KB — far below realistic
    Stop payloads).
    
    - cost-tracker: raise the cap to 1MB (matching all other hooks) and
      suppress the pass-through echo when stdin was truncated.
    - check-console-log, stop-format-typecheck, desktop-notify: suppress
      the echo when stdin was truncated; flush stdout before process.exit
      so sub-cap payloads are not cut at the ~64KB pipe buffer.
    - All hooks keep exiting 0 (fail-open); diagnostics go to stderr.
    
    New stop-hooks-stdout test asserts the contract for every registered
    Stop hook: stdout is empty or valid JSON, exit code 0 — for realistic
    100KB payloads and oversized >1MB payloads, via the production runner
    and via direct invocation. Updated the old hooks.test.js case that
    codified the truncated-echo behavior.
    
    * fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)
    
    In long autonomous sessions the fact-force gate produced 10+
    near-identical 'state facts -> blocked -> restate -> retry' blocks in
    one context window, which measurably raises the odds of the model
    collapsing into a degenerate single-token repetition loop.
    
    - Track a per-session fact_force_denials counter in GateGuard state
      (merged max across concurrent writers, reset with the session, robust
      to malformed on-disk values).
    - The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep
      the full four-fact block; later denials emit a condensed single-line
      message that carries the denial ordinal, so consecutive denials are
      structurally different and never textually identical.
    - True retries of the same target remain allowed without re-prompting
      (unchanged). Destructive-Bash and routine-Bash gates are unchanged,
      as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches.
    
    Eight new tests cover budget counting, condensed format, ordinal
    advancement, retry pass-through, env tuning, malformed state, MultiEdit
    dampening, and destructive-gate exemption.
    
    * fix(hooks): keep security hooks able to block on oversized stdin (#2222)
    
    Refine the truncation fail-open: instead of skipping the hook entirely,
    the runner now suppresses only its own raw-echo when stdin was
    truncated. The hook still executes and receives the truncated flag
    (run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps
    blocking truncated protected-config payloads (its test requires exit 2)
    while pass-through hooks fail open with empty stdout as before.
    
    * style: apply repo formatter to touched hook files
  • fix(hooks): prefer fresh harness cost cache (#2054)
    Uses a fresh harness cost cache when available and keeps transcript pricing as the fallback. Focused cost-tracker tests passed locally before merge.
  • fix: integrate recent hook and docs PRs (#1905)
    Integrates useful changes from #1882, #1884, #1889, #1893, #1898, #1899, and #1903:
    - fix rule install docs to preserve language directories
    - correct Ruby security command examples
    - harden dev-server hook command-substitution parsing
    - add Prisma patterns skill and catalog/package surfaces
    - allow first-time protected config creation while blocking existing configs
    - read cost metrics from Stop hook transcripts
    - emit suggest-compact additionalContext on stdout
    
    Co-authored-by: Jamkris <dltmdgus1412@gmail.com>
    Co-authored-by: Levi-Evan <levishantz@gmail.com>
    Co-authored-by: gaurav0107 <gauravdubey0107@gmail.com>
    Co-authored-by: richm-spp <richard.millar@salarypackagingplus.com.au>
    Co-authored-by: zomia <zomians@outlook.jp>
    Co-authored-by: donghyeun02 <donghyeun02@gmail.com>
  • feat: add ECC statusline observability hooks
    Salvages the useful statusline/context monitor work from stale PR #1504 while preserving the current continuous-learning hook runner wiring.
    
    Adds the metrics bridge, context monitor, statusline script, shared cost/session bridge utilities, and tests. Fixes the reviewed false loop-detection hash collision for non-file tools, avoids default-session cost inflation, sanitizes statusline task lookup, and records hook payload session IDs in cost-tracker.