mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
Python: Fix hyperlight WasmSandbox cross-thread Drop and harden hosted-agent sample (#5603)
* update hyperlight to beta and move samples, add hosted agent sample * Python: Fix hyperlight WasmSandbox cross-thread Drop and harden sample Root cause: when a worker-side closure raised, the exception's __traceback__ retained frame locals that included the partially constructed PyO3 sandbox. Future.result() re-raised that exception on the caller thread, and when the caller's exception was eventually GC'd the frame locals were released off-thread, dec_ref'ing the unsendable sandbox from the wrong thread and tripping the PyO3 panic '_native_wasm::WasmSandbox is unsendable, but is being dropped on another thread'. Fix: * Add _SandboxWorker._run_on_worker which catches every exception on the worker, drops __traceback__ there, deletes the original exception, and re-raises a fresh instance on the caller thread. initialize and execute route through it; dispose keeps its bare-submit semantics. * Add an opt-in diagnostic module _drop_diagnostic (no-op unless HYPERLIGHT_TRACE_DROPS=1) that installs a sys.unraisablehook and dumps owner-thread + per-thread stacks on any future cross-thread unsendable Drop. Useful for triaging similar PyO3 regressions. * Tests: cross-thread invocation, traceback-leak isolation, _SandboxEntry attribute-shape check, and a stale-reference stress test driven through asyncio.to_thread. Sample (samples/04-hosting/foundry-hosted-agents/responses/06_hyperlight_codeact): * Dockerfile installs agent-framework-* from in-tree source with python/ as build context so unreleased fixes can be validated end-to-end. * call_server.py pins the Responses API version. * main.py enables include_detailed_errors=True so future tool failures surface the actual exception text instead of a bare 'Error: Function failed.' string. * README.md documents the in-tree-package build and the Hyperlight hypervisor requirement (/dev/kvm on Linux, MSHV on Windows). Hosted environments without hypervisor passthrough surface 'No Hypervisor was found for Sandbox'; this is a hosting constraint, not a hyperlight bug. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: remove _drop_diagnostic from hyperlight package The diagnostic module was useful while bisecting the cross-thread Drop bug, but it is no longer needed now that _SandboxWorker._run_on_worker prevents the panic at the source. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: address PR review feedback on hyperlight - Use lazy agent_framework.hyperlight import in sample main.py. - Env-driven endpoint (FOUNDRY_AGENT_ENDPOINT) in call_server.py; remove personal URLs. - Align agent.yaml model deployment with manifest (gpt-4.1-mini). - Tighten Dockerfile requirements guard; drop dangling deploy.ps1 reference. - Preserve exception args when sanitizing tracebacks in _run_on_worker. - Add public _SandboxWorker.is_alive(); update test to avoid private attr. - Add namespace coverage tests for agent_framework.hyperlight lazy loader. - Add prominent note: Foundry hosted-agent runtime does not yet support Hyperlight (no hypervisor exposed); container works locally with /dev/kvm. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: bump hyperlight-sandbox dependencies to 0.4.x Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: renumber hyperlight codeact sample to 08 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Coerce worker exception args to strings for cross-thread safety Stringify exc.args on the worker thread before propagating, so any PyO3 unsendable object captured in args (e.g. via a caller-supplied callback or underlying SDK) cannot be Dropped on the calling thread. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * moved sample --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
committed by
GitHub
Unverified
parent
36b9b41e3b
commit
57c901a245
@@ -0,0 +1,85 @@
|
||||
# What this sample demonstrates
|
||||
|
||||
An [Agent Framework](https://github.com/microsoft/agent-framework) agent that
|
||||
runs Python in a [Hyperlight](https://github.com/hyperlight-dev/hyperlight)
|
||||
WebAssembly sandbox via the **CodeAct** pattern, hosted using the **Responses
|
||||
protocol**. The model is only given a single `execute_code` tool. Local Python
|
||||
tools (`compute`, `fetch_data`) are registered on `HyperlightCodeActProvider`
|
||||
and are reachable from inside the sandbox via `call_tool(...)`, never as
|
||||
direct LLM tools. All of this can be run as a container, however not under all circumstances.
|
||||
|
||||
> **⚠️ Foundry hosted-agent runtime support is in progress.**
|
||||
> Hyperlight requires a hypervisor (`/dev/kvm` on Linux, MSHV on Windows). The
|
||||
> default Foundry hosted-agent runtime does not currently expose a hypervisor
|
||||
> to the workload container, so deploying this sample as a Foundry hosted
|
||||
> agent will fail at runtime with
|
||||
> `Failed to create sandbox: ... No Hypervisor was found for Sandbox`.
|
||||
> The sample container itself works end-to-end when run locally with
|
||||
> `docker run --device=/dev/kvm ...` (see [Hypervisor requirement](#hypervisor-requirement)
|
||||
> below). We are working with the platform team to enable a hypervisor-capable
|
||||
> hosting target.
|
||||
|
||||
## How It Works
|
||||
|
||||
### Model integration
|
||||
|
||||
The agent uses `FoundryChatClient` to talk to a Foundry-hosted model deployment.
|
||||
A `HyperlightCodeActProvider` is attached as a context provider, which on every
|
||||
run injects the `execute_code` tool plus the CodeAct instructions that teach the
|
||||
model how to author Python that calls `call_tool(...)` for sandbox-only tools.
|
||||
|
||||
See [`main.py`](main.py) for the full implementation.
|
||||
|
||||
### Agent hosting
|
||||
|
||||
The agent is hosted with `ResponsesHostServer` from
|
||||
`agent-framework-foundry-hosting`, which exposes a REST endpoint compatible with
|
||||
the OpenAI Responses protocol.
|
||||
|
||||
> The Hyperlight Wasm backend is currently published only for `linux/x86_64` and
|
||||
> `win32/AMD64` with Python `<3.14`. The hosted container runs `python:3.12-slim`
|
||||
> on linux/x86_64, which is supported.
|
||||
|
||||
### Hypervisor requirement
|
||||
|
||||
Hyperlight executes guest WebAssembly inside a micro-VM and **requires a
|
||||
hypervisor on the host**:
|
||||
|
||||
- **Linux:** `/dev/kvm` must be present *and* the container must have access to
|
||||
it (`docker run --device=/dev/kvm ...`).
|
||||
- **Windows:** the Microsoft Hypervisor Platform (MSHV) must be enabled.
|
||||
|
||||
Without a hypervisor, sandbox creation fails with:
|
||||
|
||||
```
|
||||
Failed to create sandbox: failed to build ProtoWasmSandbox: No Hypervisor was found for Sandbox
|
||||
```
|
||||
|
||||
This affects hosted environments that don't expose `/dev/kvm` to the workload
|
||||
container (most managed PaaS, including the default Foundry hosted-agent
|
||||
runtime). To run this sample as a hosted agent you need a hosting target with
|
||||
nested virtualization and `/dev/kvm` device passthrough — for example an Azure
|
||||
VM, AKS nodes with KVM enabled, or Azure Container Instances configured for
|
||||
nested virt.
|
||||
|
||||
## Running the Agent Host
|
||||
|
||||
Follow the instructions in the
|
||||
[Running the Agent Host Locally](../../foundry-hosted-agents//README.md#running-the-agent-host-locally)
|
||||
section of the README in the Foundry Hosted Agent directory.
|
||||
|
||||
## Interacting with the agent
|
||||
|
||||
Send a POST request to the server with a JSON body containing an `"input"`
|
||||
field. The model should respond by calling `execute_code` with Python that uses
|
||||
`call_tool(...)` to reach the sandbox-only tools:
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:8088/responses \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"input": "Fetch all users, find the admins, multiply 7 by 6, and print the users, admins and multiplication result. Use execute_code with call_tool(...)."}'
|
||||
```
|
||||
|
||||
## Deploying the Agent to Foundry
|
||||
|
||||
Deploying this container to Foundry will not work yet, as soon as it does, we will update this sample.
|
||||
Reference in New Issue
Block a user