Files
Eduard van Valkenburg 57c901a245 Python: Fix hyperlight WasmSandbox cross-thread Drop and harden hosted-agent sample (#5603)
* update hyperlight to beta and move samples, add hosted agent sample

* Python: Fix hyperlight WasmSandbox cross-thread Drop and harden sample

Root cause: when a worker-side closure raised, the exception's __traceback__
retained frame locals that included the partially constructed PyO3 sandbox.
Future.result() re-raised that exception on the caller thread, and when the
caller's exception was eventually GC'd the frame locals were released
off-thread, dec_ref'ing the unsendable sandbox from the wrong thread and
tripping the PyO3 panic
'_native_wasm::WasmSandbox is unsendable, but is being dropped on another thread'.

Fix:
* Add _SandboxWorker._run_on_worker which catches every exception on the
  worker, drops __traceback__ there, deletes the original exception, and
  re-raises a fresh instance on the caller thread. initialize and execute
  route through it; dispose keeps its bare-submit semantics.
* Add an opt-in diagnostic module _drop_diagnostic (no-op unless
  HYPERLIGHT_TRACE_DROPS=1) that installs a sys.unraisablehook and dumps
  owner-thread + per-thread stacks on any future cross-thread unsendable
  Drop. Useful for triaging similar PyO3 regressions.
* Tests: cross-thread invocation, traceback-leak isolation, _SandboxEntry
  attribute-shape check, and a stale-reference stress test driven through
  asyncio.to_thread.

Sample (samples/04-hosting/foundry-hosted-agents/responses/06_hyperlight_codeact):
* Dockerfile installs agent-framework-* from in-tree source with python/ as
  build context so unreleased fixes can be validated end-to-end.
* call_server.py pins the Responses API version.
* main.py enables include_detailed_errors=True so future tool failures
  surface the actual exception text instead of a bare 'Error: Function
  failed.' string.
* README.md documents the in-tree-package build and the Hyperlight
  hypervisor requirement (/dev/kvm on Linux, MSHV on Windows). Hosted
  environments without hypervisor passthrough surface 'No Hypervisor was
  found for Sandbox'; this is a hosting constraint, not a hyperlight bug.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: remove _drop_diagnostic from hyperlight package

The diagnostic module was useful while bisecting the cross-thread Drop bug,
but it is no longer needed now that _SandboxWorker._run_on_worker prevents
the panic at the source.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: address PR review feedback on hyperlight

- Use lazy agent_framework.hyperlight import in sample main.py.
- Env-driven endpoint (FOUNDRY_AGENT_ENDPOINT) in call_server.py; remove personal URLs.
- Align agent.yaml model deployment with manifest (gpt-4.1-mini).
- Tighten Dockerfile requirements guard; drop dangling deploy.ps1 reference.
- Preserve exception args when sanitizing tracebacks in _run_on_worker.
- Add public _SandboxWorker.is_alive(); update test to avoid private attr.
- Add namespace coverage tests for agent_framework.hyperlight lazy loader.
- Add prominent note: Foundry hosted-agent runtime does not yet support
  Hyperlight (no hypervisor exposed); container works locally with /dev/kvm.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: bump hyperlight-sandbox dependencies to 0.4.x

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: renumber hyperlight codeact sample to 08

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Coerce worker exception args to strings for cross-thread safety

Stringify exc.args on the worker thread before propagating, so any
PyO3 unsendable object captured in args (e.g. via a caller-supplied
callback or underlying SDK) cannot be Dropped on the calling thread.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* moved sample

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-05 10:06:16 +00:00

3.7 KiB

What this sample demonstrates

An Agent Framework agent that runs Python in a Hyperlight WebAssembly sandbox via the CodeAct pattern, hosted using the Responses protocol. The model is only given a single execute_code tool. Local Python tools (compute, fetch_data) are registered on HyperlightCodeActProvider and are reachable from inside the sandbox via call_tool(...), never as direct LLM tools. All of this can be run as a container, however not under all circumstances.

⚠️ Foundry hosted-agent runtime support is in progress. Hyperlight requires a hypervisor (/dev/kvm on Linux, MSHV on Windows). The default Foundry hosted-agent runtime does not currently expose a hypervisor to the workload container, so deploying this sample as a Foundry hosted agent will fail at runtime with Failed to create sandbox: ... No Hypervisor was found for Sandbox. The sample container itself works end-to-end when run locally with docker run --device=/dev/kvm ... (see Hypervisor requirement below). We are working with the platform team to enable a hypervisor-capable hosting target.

How It Works

Model integration

The agent uses FoundryChatClient to talk to a Foundry-hosted model deployment. A HyperlightCodeActProvider is attached as a context provider, which on every run injects the execute_code tool plus the CodeAct instructions that teach the model how to author Python that calls call_tool(...) for sandbox-only tools.

See main.py for the full implementation.

Agent hosting

The agent is hosted with ResponsesHostServer from agent-framework-foundry-hosting, which exposes a REST endpoint compatible with the OpenAI Responses protocol.

The Hyperlight Wasm backend is currently published only for linux/x86_64 and win32/AMD64 with Python <3.14. The hosted container runs python:3.12-slim on linux/x86_64, which is supported.

Hypervisor requirement

Hyperlight executes guest WebAssembly inside a micro-VM and requires a hypervisor on the host:

  • Linux: /dev/kvm must be present and the container must have access to it (docker run --device=/dev/kvm ...).
  • Windows: the Microsoft Hypervisor Platform (MSHV) must be enabled.

Without a hypervisor, sandbox creation fails with:

Failed to create sandbox: failed to build ProtoWasmSandbox: No Hypervisor was found for Sandbox

This affects hosted environments that don't expose /dev/kvm to the workload container (most managed PaaS, including the default Foundry hosted-agent runtime). To run this sample as a hosted agent you need a hosting target with nested virtualization and /dev/kvm device passthrough — for example an Azure VM, AKS nodes with KVM enabled, or Azure Container Instances configured for nested virt.

Running the Agent Host

Follow the instructions in the Running the Agent Host Locally section of the README in the Foundry Hosted Agent directory.

Interacting with the agent

Send a POST request to the server with a JSON body containing an "input" field. The model should respond by calling execute_code with Python that uses call_tool(...) to reach the sandbox-only tools:

curl -X POST http://localhost:8088/responses \
  -H "Content-Type: application/json" \
  -d '{"input": "Fetch all users, find the admins, multiply 7 by 6, and print the users, admins and multiplication result. Use execute_code with call_tool(...)."}'

Deploying the Agent to Foundry

Deploying this container to Foundry will not work yet, as soon as it does, we will update this sample.