Commit Graph

5 Commits

  • Python: Adding AgentFileStore and FileAccessProvider to support file access operations. (#6099)
    * Adding AgentFileStore and FileAccessProvider to support file ased operations for agents.
    
    * Address PR review feedback on FileAccessProvider
    
    - Probe symlinks on the unresolved candidate path so in-root symlinks
      cannot silently pass and out-of-root symlinks surface the correct
      error message.
    - Validate matching_lines elements in FileSearchResult.from_dict and
      raise a clean ValueError for non-mapping entries.
    - Cap search regex pattern length (256 chars) via a new
      _compile_search_regex helper to mitigate ReDoS, and surface the cap
      in the file_access_search_files tool description.
    - Skip non-UTF-8 files during filesystem search instead of aborting
      the entire directory walk.
    - Replace the module-scope trailing string in the data-processing
      sample with comments to avoid Ruff B018.
    - Remove the checked-in working/region_totals.md sample artifact so
      the save flow works from a clean checkout.
    - Expand the Windows stdout reconfiguration comment in task_runner.py
      for clarity.
    - Add tests for invalid/oversize regex, non-UTF-8 file search, and
      in-root symlink rejection.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix mypy redundant-cast in FileSearchResult.from_dict
    
    Use cast(list[object], ...) instead of cast(list[Any], ...) so the
    cast represents a real type change (lists are invariant) and is no
    longer flagged by mypy as redundant, while still satisfying pyright's
    reportUnknownVariableType. Matches the existing pattern in _memory.py.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Tighten path normalization and directory resolution in FileAccess
    
    - _normalize_relative_path now strips surrounding whitespace up front
      so leading/trailing spaces never leak into file segments, and
      rejects trailing path separators for file paths so 'foo/' is no
      longer silently coerced to 'foo'.
    - FileSystemAgentFileStore._resolve_safe_directory_path normalizes
      with is_directory=True and maps an empty normalized result to the
      root. This matches InMemoryAgentFileStore so whitespace-only
      directory inputs resolve to the root instead of raising.
    - Added tests for whitespace stripping, trailing-separator rejection,
      and whitespace-only directory listing on the filesystem store.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Harden FileAccess search and atomic save in store API
    
    - Add wall-clock timeout (10s) around regex scans so a pathological pattern (e.g. `(a+)+`) below the length cap cannot stall the event loop.
    - Offload the InMemoryAgentFileStore regex scan to a worker thread, matching the filesystem store.
    - Fail closed when `Path.is_symlink` raises during the safe-path probe so a permission error cannot silently bypass the symlink/reparse-point rejection.
    - Add `overwrite: bool = True` to `AgentFileStore.write_file`; the in-memory store performs the check under the existing lock and the filesystem store uses `open(mode='x')` so concurrent callers cannot race past `overwrite=False`.
    - `file_access_save_file` now relies on the atomic store call instead of a separate `file_exists` round-trip.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix Python 3.10 timeout handling and add directory arg to list/search tools
    
    - Catch asyncio.TimeoutError in _run_search_with_timeout. In Python 3.10
      asyncio.wait_for raises asyncio.exceptions.TimeoutError, which is
      distinct from the builtin TimeoutError (the two were unified in 3.11).
      Catching the asyncio alias works on every supported version.
    - Add an optional directory parameter to file_access_list_files and
      file_access_search_files so agents can enumerate / scope searches to
      nested folders, not just the store root.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address FileAccess review feedback: case, errors, signal, TOCTOU
    
    - InMemoryAgentFileStore now stores (display_name, content) so list_files
      and search_files return the original-case names callers wrote, matching
      the behaviour of FileSystemAgentFileStore on case-preserving filesystems
      and removing the silent in-memory vs. on-disk contract divergence.
    - FileSystemAgentFileStore.read_file raises ValueError instead of letting
      UnicodeDecodeError bubble for binary / non-UTF-8 input, restoring
      symmetry with search_files (which still skips) and giving the tool
      layer a recoverable type to translate.
    - Tool wrappers now catch ValueError and OSError around every operation
      and surface them as readable strings, so 'you used ..' and 'the file
      already exists' are both reported to the model the same way instead of
      the former crashing out as an unhandled exception.
    - _search_files_sync logs per skipped non-UTF-8 file at WARNING and an
      aggregate INFO summary so operators can distinguish 'no matches' from
      'half the corpus was unreadable'.
    - FileSystemAgentFileStore softens its docstrings to acknowledge the
      inherent probe-then-open TOCTOU window. On POSIX both read and write
      now pass O_NOFOLLOW so the kernel refuses if the leaf segment becomes
      a symlink between the probe and the open. Windows has no equivalent
      flag; the limitation is documented.
    - Tests cover: case preservation on list/search, ValueError on non-UTF-8
      read at the store and tool layer, tool-layer string responses for
      path-traversal and oversized-regex inputs, search-skip log output,
      symlink rejection on delete/search/list, and symlinked intermediate
      directory rejection.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Address FileAccess nit comments: docstrings, enumerate, opt-in delete approval
    
    - Expand FileSearchMatch/FileSearchResult.to_dict docstrings to explain why
      the override is needed (__slots__ defeats the mixin's __dict__ iteration)
      and why exclude/exclude_none are accepted-but-ignored (mixin signature
      compatibility for callers like to_json).
    - Use enumerate(lines, start=1) in _search_file_content so the +1 below is
      no longer needed; rename loop variable to line_number for clarity.
    - Add opt-in require_delete_approval: bool = False on FileAccessProvider.
      When True, file_access_delete_file is registered with approval_mode
      'always_require' so the host must approve every delete. Default False
      preserves current behaviour and matches the .NET reference, but
      deployments that want a safer-by-default posture can enable it.
    - Add tests covering both delete approval modes.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * FileAccess: require delete approval by default
    
    Flip the default for FileAccessProvider(require_delete_approval=...) from
    False to True so destructive deletes are gated by host approval out of the
    box. Callers that want the previous autonomous behaviour (which matches the
    .NET reference) can pass require_delete_approval=False.
    
    Tests updated accordingly.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fixing linkinspector by installing Chrome for puppeteer first.
    
    ---------
    
    Co-authored-by: Ben Thomas <25218250+alliscode@users.noreply.github.com>
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • ci: pin third-party GitHub Actions to commit SHAs (#5972)
    Replaces every floating tag in our workflow and composite action files
    with an immutable 40-character commit SHA, keeping the original `# vX`
    comment so Dependabot can still propose version bumps. 186 occurrences
    across 25 workflows and 2 composite actions.
    
    Also widens the github-actions Dependabot entry to use the plural
    `directories` key with `/.github/actions/*` so composite actions under
    `.github/actions/<name>/action.yml` are kept up to date. Previously
    Dependabot only scanned `.github/workflows` and the repo-root
    `action.yml`, leaving our `python-setup` and `sample-validation-setup`
    composite actions unmaintained.
  • Bump actions/checkout from 5 to 6 (#2404)
    Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
    - [Release notes](https://github.com/actions/checkout/releases)
    - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/actions/checkout/compare/v5...v6)
    
    ---
    updated-dependencies:
    - dependency-name: actions/checkout
      dependency-version: '6'
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
  • Change MD linkchecker to only run when md files are changed in a PR and to run nightly (#1148)
    * Ignore null, timeout, status codes
    
    * Only run when a markdown file is changed and run nightly
    
    * Add markdown change
    
    * Add run on link checker change and arbitrary change
  • .NET: Add link inspector (#1062)
    * Add link inspector
    
    * Comment out excludedirs while it's empty
    
    * Fix broken links
    
    * More links fixes
    
    * Push further fixes
    
    * Fix more links