Files
Eduard van Valkenburg 977c3adfb2 Python: replace pre-commit with prek, add PEP 723 script deps, clean up dev dependencies (#3748)
* python: replace pre-commit with prek, add PEP 723 script deps, clean up dev dependencies

- Replace pre-commit with prek (Rust-native, faster pre-commit alternative)
- Move supported hooks to repo: builtin for zero-clone speed
- Add new builtin hooks: trailing-whitespace, check-merge-conflict, detect-private-key, check-added-large-files
- Update all hook versions to latest (pre-commit-hooks v6, pyupgrade v3.21.2, bandit 1.9.3, uv-pre-commit 0.10.0)
- Add PEP 723 inline script metadata to 34 samples with external deps
- Remove autogen-agentchat/autogen-ext from dev deps (now declared per-sample)
- Remove unused dev deps: pytest-env, tomli-w
- Add agent-framework-core>=1.0.0b260130 lower bound to all 21 packages
- Update CI workflow to use j178/prek-action
- Update docs: DEV_SETUP.md, AGENTS.md, CODING_STANDARD.md, SAMPLE_GUIDELINES.md

* updated lock

* python: fix prek config paths for local execution and CI workflow

Remove global 'files: ^python/' filter and strip python/ prefix from all path patterns in .pre-commit-config.yaml so prek finds files when run from the python/ directory. Update CI workflow to use --cd python instead of --config path. Include trailing whitespace fixes and dev dependency cleanup.

* python: move helper scripts to scripts/ folder and exclude from checks

* python: exclude AGENTS.md from prek markdown code lint

* python: exclude AGENTS.md and azure_ai_search sample from markdown lint

* fix m365 sample

* python: ignore CPY rule for samples with PEP 723 headers

* fix in dev_setup

* python: replace aiofiles with regular open in samples

* python: suppress reportUnusedImport in markdown code block checker

* python: use samples pyright config for markdown code block checker

Write a temp pyrightconfig.json matching pyrightconfig.samples.json rules (typeCheckingMode=off, only reportMissingImports and reportAttributeAccessIssue). Filter output to only fail on these rules since syntax-level errors (top-level await, undefined vars) are expected in README documentation snippets.

* python: use markdown-code-lint with fixed globs instead of prek file list

The prek-markdown-code-lint task received all changed files including non-README markdown and files with pre-existing broken imports. Replace with the standard markdown-code-lint task which uses the correct glob patterns (README.md, packages/**/README.md, samples/**/*.md).

* python: exclude READMEs with pre-existing broken imports from markdown lint

* python: fix broken README code snippets instead of excluding them

- ag-ui: replace TextContent (removed) with content.type == 'text'
- durabletask: fix import path to durabletask.worker.TaskHubGrpcWorker
- orchestrations: use constructor params instead of .participants() method
- observability: mark deprecated code blocks as plain text, filter
  reportMissingImports to agent_framework modules only
- remove README excludes from markdown-code-lint task

* add revision to gaia download

* feat(python): parallelize checks across packages

Run (package × task) cross-product in parallel using ThreadPoolExecutor
and subprocesses. Key changes:

- Add scripts/task_runner.py with shared parallel execution engine
- Update run_tasks_in_packages_if_exists.py to accept multiple tasks
- Update run_tasks_in_changed_packages.py with --files flag and parallel support
- Add check-packages poe task (fmt+lint+pyright+mypy in parallel)
- Add prek-markdown-code-lint and prek-samples-check with change detection
- Split CI code quality workflow into parallel prek and mypy jobs
- Update DEV_SETUP.md to document new parallel behavior

Core package changes still trigger checks on all packages.

* feat(ci): split code quality into 4 parallel jobs

Split the single prek job into parallel jobs:
- pre-commit-hooks: lightweight hooks (SKIP=poe-check)
- package-checks: fmt/lint/pyright/mypy via check-packages
- samples-markdown: samples-lint, samples-syntax, markdown-code-lint
- mypy: change-detected mypy checks

All 4 jobs run concurrently (×2 Python versions = 8 runners).

* feat(ci): use only Python 3.10 for code quality checks

* refactor(python): add future annotations and remove quoted types

Add `from __future__ import annotations` to 93 package files that
used quoted string annotations, then run pyupgrade --py310-plus to
remove the now-unnecessary quotes.

Fixes https://github.com/microsoft/agent-framework/issues/3578
2026-02-09 17:51:01 +00:00

158 lines
7.1 KiB
Python

# Copyright (c) Microsoft. All rights reserved.
"""Check code blocks in Markdown files for syntax errors."""
import argparse
from enum import Enum
import glob
import logging
import os
import tempfile
import subprocess # nosec
from pygments import highlight # type: ignore
from pygments.formatters import TerminalFormatter
from pygments.lexers import PythonLexer
logger = logging.getLogger(__name__)
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.INFO)
class Colors(str, Enum):
CEND = "\33[0m"
CRED = "\33[31m"
CREDBG = "\33[41m"
CGREEN = "\33[32m"
CGREENBG = "\33[42m"
CVIOLET = "\33[35m"
CGREY = "\33[90m"
def with_color(text: str, color: Colors) -> str:
"""Prints a string with the specified color."""
return f"{color.value}{text}{Colors.CEND.value}"
def expand_file_patterns(patterns: list[str], skip_glob: bool = False) -> list[str]:
"""Expand glob patterns to actual file paths."""
all_files: list[str] = []
for pattern in patterns:
if skip_glob:
# When skip_glob is True, treat patterns as literal file paths
# Only include if it's a markdown file
if pattern.endswith('.md'):
matches = glob.glob(pattern, recursive=False)
all_files.extend(matches)
else:
# Handle both relative and absolute paths with glob expansion
matches = glob.glob(pattern, recursive=True)
all_files.extend(matches)
return sorted(set(all_files)) # Remove duplicates and sort
def extract_python_code_blocks(markdown_file_path: str) -> list[tuple[str, int]]:
"""Extract Python code blocks from a Markdown file."""
with open(markdown_file_path, encoding="utf-8") as file:
lines = file.readlines()
code_blocks: list[tuple[str, int]] = []
in_code_block = False
current_block: list[str] = []
for i, line in enumerate(lines):
if line.strip().startswith("```python"):
in_code_block = True
current_block = []
elif line.strip().startswith("```"):
in_code_block = False
code_blocks.append(("\n".join(current_block), i - len(current_block) + 1))
elif in_code_block:
current_block.append(line)
return code_blocks
def check_code_blocks(markdown_file_paths: list[str], exclude_patterns: list[str] | None = None) -> None:
"""Check Python code blocks in a Markdown file for syntax errors."""
files_with_errors: list[str] = []
exclude_patterns = exclude_patterns or []
for markdown_file_path in markdown_file_paths:
# Skip files that match any exclude pattern
if any(pattern in markdown_file_path for pattern in exclude_patterns):
logger.info(f"Skipping {markdown_file_path} (matches exclude pattern)")
continue
code_blocks = extract_python_code_blocks(markdown_file_path)
had_errors = False
for code_block, line_no in code_blocks:
markdown_file_path_with_line_no = f"{markdown_file_path}:{line_no}"
logger.info("Checking a code block in %s...", markdown_file_path_with_line_no)
# Skip blocks that don't import agent_framework modules or import lab modules
if (all(
all(import_code not in code_block for import_code in [f"import {module}", f"from {module}"])
for module in ["agent_framework"]
) or "agent_framework.lab" in code_block):
logger.info(f' {with_color("OK[ignored]", Colors.CGREENBG)}')
continue
with tempfile.TemporaryDirectory() as tmp_dir:
# Use the same rules as pyrightconfig.samples.json:
# typeCheckingMode=off, only reportMissingImports and reportAttributeAccessIssue enabled.
pyright_cfg = os.path.join(tmp_dir, "pyrightconfig.json")
with open(pyright_cfg, "w") as cfg:
cfg.write(
'{"include":["."],"typeCheckingMode":"off",'
'"reportMissingImports":"error","reportAttributeAccessIssue":"error"}'
)
tmp_file = os.path.join(tmp_dir, "snippet.py")
with open(tmp_file, "w", encoding="utf-8") as f:
f.write(code_block)
result = subprocess.run(["uv", "run", "pyright", "-p", tmp_dir], capture_output=True, text=True, cwd=".") # nosec
# Filter to only errors from our config rules; syntax-level errors
# (top-level await, etc.) are expected in README documentation snippets.
# Only flag reportMissingImports for agent_framework modules, not third-party packages.
relevant_errors = [
line for line in result.stdout.splitlines()
if ("reportMissingImports" in line and "agent_framework" in line)
or "reportAttributeAccessIssue" in line
]
if relevant_errors:
highlighted_code = highlight(code_block, PythonLexer(), TerminalFormatter()) # type: ignore
logger.info(
f" {with_color('FAIL', Colors.CREDBG)}\n"
f"{with_color('========================================================', Colors.CGREY)}\n"
f"{with_color('Error', Colors.CRED)}: Pyright found issues in {with_color(markdown_file_path_with_line_no, Colors.CVIOLET)}:\n"
f"{with_color('--------------------------------------------------------', Colors.CGREY)}\n"
f"{highlighted_code}\n"
f"{with_color('--------------------------------------------------------', Colors.CGREY)}\n"
"\n"
f"{with_color('pyright output:', Colors.CVIOLET)}\n"
f"{with_color(result.stdout, Colors.CRED)}"
f"{with_color('========================================================', Colors.CGREY)}\n"
)
had_errors = True
else:
logger.info(f" {with_color('OK', Colors.CGREENBG)}")
if had_errors:
files_with_errors.append(markdown_file_path)
if files_with_errors:
raise RuntimeError("Syntax errors found in the following files:\n" + "\n".join(files_with_errors))
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Check code blocks in Markdown files for syntax errors.")
# Argument is a list of markdown files containing glob patterns
parser.add_argument("markdown_files", nargs="+", help="Markdown files to check (supports glob patterns).")
parser.add_argument("--exclude", action="append", help="Exclude files containing this pattern.")
parser.add_argument("--no-glob", action="store_true", help="Treat file arguments as literal paths (no glob expansion).")
args = parser.parse_args()
# Expand glob patterns to actual file paths (or skip if --no-glob)
expanded_files = expand_file_patterns(args.markdown_files, skip_glob=args.no_glob)
check_code_blocks(expanded_files, args.exclude)