Files
Eduard van Valkenburg 66a09a76af Python: fix: hyperlight skips symlinks when staging sandbox input (#5919)
* Python: fix(hyperlight): skip symlinks when staging files into the sandbox

The helpers that populate the sandbox input tree (``_copy_path`` and the
``_path_tree_signature`` walker used for cache invalidation) relied on
``Path.is_file()``, ``Path.is_dir()`` and ``shutil.copy2`` - all of which
follow symlinks by default. When the source tree contains symlinks, that
let entries from outside the configured input source surface inside the
sandbox.

Harden both code paths to never follow symlinks:

- ``_copy_path`` now bails out via ``Path.is_symlink()`` before any
  ``is_dir()`` / ``is_file()`` check, skips non-regular files, and uses
  ``shutil.copy2(..., follow_symlinks=False)`` as defense in depth.
- New ``_iter_real_entries`` walker replaces the previous ``Path.rglob``
  call inside ``_path_tree_signature`` (rglob follows directory symlinks).
- ``_path_tree_signature`` switches to ``Path.lstat()`` so size/mtime are
  never read through a symlink target.

Added regression tests covering:

- A pre-placed file symlink in ``workspace_root`` (top level).
- A pre-placed directory symlink in ``workspace_root``.
- A nested file symlink inside a real subdirectory.
- ``_path_tree_signature`` ignoring symlinks so the cache key reflects only
  what is actually staged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(hyperlight): address PR #5919 review feedback

- _iter_real_entries now yields directories and regular files only,
  skipping non-regular entries (sockets/FIFOs/devices). Keeps the
  cache-key signature consistent with what _copy_path actually stages.
- The four new symlink regression tests skip when the platform does not
  support symlink creation (e.g. unprivileged Windows runners), via a
  local _symlinks_supported helper modelled on the one in
  packages/core/tests/core/test_skills.py. Prevents OSError /
  NotImplementedError from failing CI jobs that have nothing to do with
  the change under test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(hyperlight): address PR #5919 follow-up review feedback

- _copy_path docstring: narrow the scope to "symlink entries present in
  the source tree at rest" and explicitly call out that the copy is NOT
  atomic with respect to concurrent mutation of the source tree.
  Callers who need that stronger guarantee should snapshot their
  workspace before passing it in. Avoids overpromising on a TOCTOU
  window that pathlib cannot express; closing it properly would need
  fd-based traversal (O_NOFOLLOW | O_DIRECTORY + os.scandir(fd)) with
  a separate Windows story, which is out of scope for this targeted
  fix.

- _path_tree_signature: drop the `if path.is_symlink(): return ()`
  short-circuit. Resolve a symlink root to its real target before
  walking instead. The public construction flow already resolves
  workspace_root / file_mounts[].host_path up front so this never
  affected user-facing code, but the short-circuit was misleading and
  would have produced an empty, stable signature for any direct
  caller that builds a _RunConfig without going through the public
  constructor. Defense in depth: even if a future call site forgets
  to resolve the root, the cache key still reflects real contents.

- Added regression test
  test_path_tree_signature_walks_through_symlinked_root: a symlinked
  workspace root must produce a non-empty signature, AND the signature
  must change when the real target's contents change so the cache key
  actually invalidates.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
66a09a76af ยท 2026-05-19 11:41:53 +00:00
History
..

agent-framework-hyperlight

Hyperlight-backed CodeAct integrations for Microsoft Agent Framework.

Installation

pip install agent-framework-hyperlight --pre

This package depends on hyperlight-sandbox, the packaged Python guest, and the Wasm backend package on supported platforms. If the backend is not published for your current platform yet, execute_code will fail at runtime when it tries to create the sandbox.

Quick start

Use HyperlightCodeActProvider to automatically inject the execute_code tool and CodeAct instructions into every agent run. Tools registered on the provider are available inside the sandbox via call_tool(...) but are not exposed as direct agent tools.

from agent_framework import Agent, tool
from agent_framework_hyperlight import HyperlightCodeActProvider

@tool
def compute(operation: str, a: float, b: float) -> float:
    """Perform a math operation."""
    ops = {"add": a + b, "subtract": a - b, "multiply": a * b, "divide": a / b}
    return ops[operation]

codeact = HyperlightCodeActProvider(
    tools=[compute],
    approval_mode="never_require",
)

agent = Agent(
    client=client,
    name="CodeActAgent",
    instructions="You are a helpful assistant.",
    context_providers=[codeact],
)

result = await agent.run("Multiply 6 by 7 using execute_code.")

Standalone tool

Use HyperlightExecuteCodeTool directly when you want full control over how the tool is added to the agent. This is useful when mixing sandbox tools with direct-only tools on the same agent.

from agent_framework import Agent, tool
from agent_framework_hyperlight import HyperlightExecuteCodeTool

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email (direct-only, not available inside the sandbox)."""
    return f"Email sent to {to}"

execute_code = HyperlightExecuteCodeTool(
    tools=[compute],
    approval_mode="never_require",
)

agent = Agent(
    client=client,
    name="MixedToolsAgent",
    instructions="You are a helpful assistant.",
    tools=[send_email, execute_code],
)

Manual static wiring

For fixed configurations where provider lifecycle overhead is unnecessary, build the CodeAct instructions once and pass them to the agent at construction time:

execute_code = HyperlightExecuteCodeTool(
    tools=[compute],
    approval_mode="never_require",
)

codeact_instructions = execute_code.build_instructions(tools_visible_to_model=False)

agent = Agent(
    client=client,
    name="StaticWiringAgent",
    instructions=f"You are a helpful assistant.\n\n{codeact_instructions}",
    tools=[execute_code],
)

File mounts and network access

Mount host directories into the sandbox and allow outbound HTTP to specific domains:

from agent_framework_hyperlight import HyperlightCodeActProvider, FileMount

codeact = HyperlightCodeActProvider(
    tools=[compute],
    file_mounts=[
        "/host/data",                                 # shorthand โ€” same path in sandbox
        ("/host/models", "/sandbox/models"),           # explicit host โ†’ sandbox mapping
        FileMount("/host/config", "/sandbox/config"),  # named tuple
    ],
    allowed_domains=[
        "api.github.com",                             # all methods
        ("internal.api.example.com", "GET"),           # GET only
    ],
)

Notes

  • This package is intentionally separate from agent-framework-core so CodeAct usage and installation remain optional. With agent-framework-core[all] (or the meta agent-framework) installed it is also reachable through the lazy-loading namespace agent_framework.hyperlight.
  • file_mounts accepts a single string shorthand, an explicit (host_path, mount_path) pair, or a FileMount named tuple. The host-side path in the explicit forms may be a str or Path. Use the explicit two-value form when the host path differs from the sandbox path.
  • allowed_domains accepts a single string target such as "github.com" to allow all backend-supported methods, an explicit (target, method_or_methods) tuple such as ("github.com", "GET"), or an AllowedDomain named tuple.
  • Tools registered with the sandbox return their native Python value (dict, list, primitives, or custom objects) directly to the guest via the Hyperlight FFI. Any result_parser configured on a FunctionTool is intended for LLM-facing consumers and does not run on the sandbox path โ€” apply formatting inside the tool function itself if you need it for in-sandbox consumers.