Commit Graph

10 Commits

  • Replace shuf with $RANDOM in quickstart for broader compatibility
    `shuf` is part of GNU coreutils, which on macOS requires
    `brew install coreutils`. Replace it with
    `echo $((RANDOM % <sides> + 1))`, which works in bash and zsh on both
    Linux and macOS.
    
    Also reword "true randomness" to "using a random number generator" to
    more clearly distinguish programmatic RNG from LLM non-determinism.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
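
For illustration, the `$RANDOM` replacement described in the commit above could be wrapped as follows. This is a minimal sketch, not the quickstart's actual text: the `roll` function name and the default of 6 sides are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (the name `roll` is an assumption for this sketch).
# Rolls one die with the given number of sides, defaulting to 6.
roll() {
  local sides="${1:-6}"
  # RANDOM yields an integer in 0..32767. The modulo maps it to
  # 0..sides-1, and adding 1 shifts the result to 1..sides.
  echo $((RANDOM % sides + 1))
}

roll 20   # prints an integer between 1 and 20
```

Because 32768 is not a multiple of most die sizes, the modulo introduces a very slight bias toward low values. That is usually acceptable for a tutorial, but it is worth knowing if uniformity matters.
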
  • Add quickstart guide for skill creators
    Adds a tutorial-style quickstart guide that walks the reader through
    creating their first Agent Skill — a `roll-dice` skill that teaches an
    agent to roll dice using true system randomness. The guide covers
    creating the `SKILL.md` file, verifying discovery via `/skills` in VS
    Code, testing with a "Roll a d20" prompt, and a brief explanation of the
    discovery/activation/execution lifecycle. Includes both bash and
    PowerShell command variants, and a note about model variation in
    tool-use reliability.
    
    Also adds the quickstart as the first page under "For skill creators" in
    `docs.json` navigation.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  • Merge pull request #240 from jonathanhefner/clarify-skill-eval-file-tree
    Add skill directory to structure diagram in evaluating skills doc
  • Add gotchas section to best practices guide
    Add a new "Gotchas sections" subsection to "Patterns for effective
    instructions" in the best practices guide. Covers what a gotcha is
    (environment-specific facts that defy reasonable assumptions), how to
    structure them (problem/correction pairs), and why they belong directly
    in `SKILL.md` rather than a separate reference file.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  • Add skill directory to structure diagram in evaluating skills doc
    Show `csv-analyzer/` (containing `SKILL.md` and `evals/evals.json`)
    alongside `csv-analyzer-workspace/` so readers can see the full layout
    at a glance.
    
    Closes #238. Closes #239.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  • Add best practices guide for skill creators
    Covers grounding skills in real expertise (extracting from tasks,
    synthesizing from project artifacts), iterating with execution feedback,
    managing context budget, calibrating instruction specificity, and
    reusable instruction patterns (templates, checklists, validation loops,
    plan-validate-execute, script bundling).
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  • Add "Optimizing descriptions" guide for skill creators
    How-to guide covering the full description optimization workflow:
    writing effective descriptions, designing trigger eval queries
    (should-trigger and should-not-trigger with near-miss examples), testing
    trigger rates with a bash eval script, train/validation splits to avoid
    overfitting, and the iterative optimization loop.
    
    The guide is client-agnostic by default but includes a working Claude
    Code example in the `check_triggered` function using
    `--output-format json` and `jq` to detect `Skill` tool calls.
    
    Adds the page to the "For skill creators" navigation group in
    `docs.json`.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  • Add "Evaluating skills" guide for skill creators
    A how-to guide for evaluating skill output quality using structured
    evals. Covers the full eval workflow: designing test cases, running
    with-skill vs. baseline comparisons, writing assertions, LLM-based
    grading, aggregating benchmarks, analyzing patterns, human review, and
    LLM-driven iterative improvement.
    
    Derived from the workflow implemented by the `skill-creator` Skill, but
    written as a standalone guide that readers can follow without using that
    tool.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  • Add "Using scripts" guide for skill creators (#196)
    * Add "Using scripts" guide for skill creators
    
    New guide at `docs/skill-creation/using-scripts.mdx` covering how to use
    commands and scripts in skills:
    
    - One-off commands with `uvx`, `pipx`, `npx`, `bunx`, `go run`,
      `deno run` (tabbed by ecosystem, with pinned version examples)
    - Referencing bundled scripts from `SKILL.md` using relative paths
    - Self-contained scripts with inline dependency declarations (PEP 723,
      Deno `npm:` imports, Bun auto-install, Ruby `bundler/inline` — tabbed
      with a common HTML-parsing example)
    - Designing scripts for agentic use: non-interactive execution, `--help`
      documentation, error messages, structured output, and a compressed
      checklist of further considerations
    
    Also updates `docs/docs.json` to organize navigation into groups ("For
    skill creators" and "For client implementors").
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * Address review feedback on "Using scripts" guide
    
    Relative paths note: Clarify that the convention applies to support
    files like `references/*.md`, and explain *why* (the agent runs commands
    from the skill root).
    
    Structured output: Reframe motivation around composability with both
    agents and standard tools (`jq`, `cut`, `awk`) rather than LLM parsing
    ambiguity. Shorten prose; let the code example's inline comments carry
    the contrast.
    
    Predictable output size: Add `--output` flag as an alternative strategy
    for scripts whose output is large and not amenable to pagination. The
    `--output` flag acts as a consent mechanism — the agent must explicitly
    choose a file destination or pass `-` to opt in to stdout, preventing
    accidental context-window flooding.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    ---------
    
    Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
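
The `--output` consent mechanism described in the review-feedback commit above might look roughly like the sketch below. It is written under stated assumptions rather than taken from the guide: `generate_report`, `main`, the error wording, and the exit code 2 are all hypothetical names and choices for this example.

```shell
#!/usr/bin/env bash
# Hypothetical script skeleton: it refuses to emit a large payload unless
# the caller explicitly chooses a destination with --output.

generate_report() {
  # Stand-in for output too large to dump into an agent's context by accident.
  seq 1 1000
}

main() {
  local output=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --output)
        if [ $# -lt 2 ]; then
          echo "error: --output needs a file path or '-'" >&2
          return 2
        fi
        output="$2"
        shift 2
        ;;
      *)
        echo "error: unknown argument: $1" >&2
        return 2
        ;;
    esac
  done

  # No destination given: fail with an actionable message instead of printing.
  if [ -z "$output" ]; then
    echo "error: output may be large; pass --output <file>, or --output - for stdout" >&2
    return 2
  fi

  if [ "$output" = "-" ]; then
    generate_report                 # caller opted in to stdout
  else
    generate_report > "$output"     # payload goes to the chosen file
    echo "wrote report to $output" >&2
  fi
}

# Example invocations:
#   main                      # fails and explains how to opt in
#   main --output report.txt  # writes the payload to a file
#   main --output -           # streams the payload to stdout
```

Making the destination mandatory means the default invocation fails fast with guidance, so large output only reaches stdout when the caller has explicitly asked for it.
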