Python: Fix hyperlight WasmSandbox cross-thread Drop and harden hosted-agent sample (#5603)

* update hyperlight to beta and move samples, add hosted agent sample

* Python: Fix hyperlight WasmSandbox cross-thread Drop and harden sample

  Root cause: when a worker-side closure raised, the exception's __traceback__ retained frame locals that included the partially constructed PyO3 sandbox. Future.result() re-raised that exception on the caller thread, and when the caller's exception was eventually GC'd the frame locals were released off-thread, dec_ref'ing the unsendable sandbox from the wrong thread and tripping the PyO3 panic '_native_wasm::WasmSandbox is unsendable, but is being dropped on another thread'.

  Fix:
  * Add _SandboxWorker._run_on_worker which catches every exception on the worker, drops __traceback__ there, deletes the original exception, and re-raises a fresh instance on the caller thread. initialize and execute route through it; dispose keeps its bare-submit semantics.
  * Add an opt-in diagnostic module _drop_diagnostic (no-op unless HYPERLIGHT_TRACE_DROPS=1) that installs a sys.unraisablehook and dumps owner-thread + per-thread stacks on any future cross-thread unsendable Drop. Useful for triaging similar PyO3 regressions.
  * Tests: cross-thread invocation, traceback-leak isolation, _SandboxEntry attribute-shape check, and a stale-reference stress test driven through asyncio.to_thread.

  Sample (samples/04-hosting/foundry-hosted-agents/responses/06_hyperlight_codeact):
  * Dockerfile installs agent-framework-* from in-tree source with python/ as build context so unreleased fixes can be validated end-to-end.
  * call_server.py pins the Responses API version.
  * main.py enables include_detailed_errors=True so future tool failures surface the actual exception text instead of a bare 'Error: Function failed.' string.
  * README.md documents the in-tree-package build and the Hyperlight hypervisor requirement (/dev/kvm on Linux, MSHV on Windows).

  Hosted environments without hypervisor passthrough surface 'No Hypervisor was found for Sandbox'; this is a hosting constraint, not a hyperlight bug.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: remove _drop_diagnostic from hyperlight package

  The diagnostic module was useful while bisecting the cross-thread Drop bug, but it is no longer needed now that _SandboxWorker._run_on_worker prevents the panic at the source.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: address PR review feedback on hyperlight

  - Use lazy agent_framework.hyperlight import in sample main.py.
  - Env-driven endpoint (FOUNDRY_AGENT_ENDPOINT) in call_server.py; remove personal URLs.
  - Align agent.yaml model deployment with manifest (gpt-4.1-mini).
  - Tighten Dockerfile requirements guard; drop dangling deploy.ps1 reference.
  - Preserve exception args when sanitizing tracebacks in _run_on_worker.
  - Add public _SandboxWorker.is_alive(); update test to avoid private attr.
  - Add namespace coverage tests for agent_framework.hyperlight lazy loader.
  - Add prominent note: Foundry hosted-agent runtime does not yet support Hyperlight (no hypervisor exposed); container works locally with /dev/kvm.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: bump hyperlight-sandbox dependencies to 0.4.x

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: renumber hyperlight codeact sample to 08

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Coerce worker exception args to strings for cross-thread safety

  Stringify exc.args on the worker thread before propagating, so any PyO3 unsendable object captured in args (e.g. via a caller-supplied callback or underlying SDK) cannot be Dropped on the calling thread.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* moved sample

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
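The traceback-sanitizing approach described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: `WorkerSketch` and `run_on_worker` are hypothetical names, and a single-thread `ThreadPoolExecutor` stands in for the real worker. The sketch assumes it is enough to carry only the exception type and stringified args across the thread boundary, so that no worker-side frame locals can be released on the caller thread.

```python
from __future__ import annotations

import concurrent.futures
from typing import Any, Callable, TypeVar

T = TypeVar("T")


class WorkerSketch:
    """Hypothetical stand-in for a worker that owns a thread-bound object."""

    def __init__(self) -> None:
        # One dedicated thread, so thread-bound objects are created and
        # released on the same thread.
        self._executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)

    def run_on_worker(self, fn: Callable[[], T]) -> T:
        def guarded() -> tuple[bool, Any]:
            try:
                return True, fn()
            except Exception as exc:  # sketch only: BaseException is not handled
                # Keep only data that is safe to hand to another thread:
                # the exception type and its args coerced to strings.
                payload = (type(exc), tuple(str(arg) for arg in exc.args))
                # Drop the traceback here, on the worker, so the frames (and
                # their locals) are released on this thread.
                exc.__traceback__ = None
                return False, payload

        ok, value = self._executor.submit(guarded).result()
        if ok:
            return value

        exc_type, args = value
        try:
            fresh = exc_type(*args)
        except Exception:
            # Some exception types cannot be rebuilt from string args.
            fresh = RuntimeError(*args)
        # Raised on the caller thread; its traceback holds caller frames only.
        raise fresh
```

A caller that runs `WorkerSketch().run_on_worker(fn)` and catches the resulting exception sees the original type and message, but the traceback starts at `run_on_worker` rather than inside `fn`. The trade-off is that the worker-side stack is lost, which is why the sample also enables detailed error text.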
2026-06-16 21:04:09 +08:00 · 2026-05-05 12:06:16 +02:00
parent 36b9b41e3b
commit 57c901a245
26 changed files with 967 additions and 393 deletions
@@ -0,0 +1,6 @@
+.venv
+__pycache__
+*.pyc
+*.pyo
+*.pyd
+.Python
@@ -0,0 +1,2 @@
+FOUNDRY_PROJECT_ENDPOINT="..."
+AZURE_AI_MODEL_DEPLOYMENT_NAME="..."
@@ -0,0 +1,36 @@
+# Build this image with the repository's `python/` directory as the build context so
+# the in-tree agent-framework packages can be installed from source. From the repo root:
+#
+#   docker build \
+#     -f python/samples/04-hosting/foundry-hosted-agents/responses/08_hyperlight_codeact/Dockerfile \
+#     -t <acr>.azurecr.io/<image>:<tag> \
+#     python/
+FROM python:3.12-slim
+
+WORKDIR /app
+
+# Copy the in-tree agent-framework packages we need. Order matters for editable
+# installs because of inter-package dependencies; we install in dependency order
+# below. Hyperlight backends are platform gated, so we install them via pip
+# resolution rather than copying the wheels.
+COPY packages/core /opt/af/core
+COPY packages/openai /opt/af/openai
+COPY packages/foundry /opt/af/foundry
+COPY packages/foundry_hosting /opt/af/foundry_hosting
+COPY packages/hyperlight /opt/af/hyperlight
+
+# Copy just the sample we care about into the user agent location.
+COPY samples/04-hosting/foundry-hosted-agents/responses/08_hyperlight_codeact/ /app/user_agent/
+WORKDIR /app/user_agent
+
+RUN pip install --no-cache-dir --upgrade pip \
+    && pip install --no-cache-dir /opt/af/core \
+    && pip install --no-cache-dir /opt/af/openai \
+    && pip install --no-cache-dir /opt/af/foundry \
+    && pip install --no-cache-dir /opt/af/foundry_hosting \
+    && pip install --no-cache-dir /opt/af/hyperlight \
+    && if grep -Eq '^[[:space:]]*[^#[:space:]]' requirements.txt; then pip install --no-cache-dir -r requirements.txt; fi
+
+EXPOSE 8088
+
+CMD ["python", "main.py"]
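The `grep` guard at the end of the `RUN` step above runs `pip install -r requirements.txt` only when the file has at least one line that is neither blank nor a comment. As a rough illustration of what that pattern accepts, here is a Python sketch; `has_installable_lines` is a hypothetical helper and is not part of the commit.

```python
import re

# Approximates: grep -Eq '^[[:space:]]*[^#[:space:]]'
# Optional leading whitespace, then a character that is neither '#'
# nor whitespace.
_REQUIREMENT_LINE = re.compile(r"\s*[^#\s]")


def has_installable_lines(text: str) -> bool:
    """Return True if any line is something other than blank or a comment."""
    return any(_REQUIREMENT_LINE.match(line) for line in text.splitlines())


comments_only = "# installed from local source\n\n   # add deps here\n"
with_dependency = "# sample-only deps\nrequests>=2\n"

print(has_installable_lines(comments_only))
print(has_installable_lines(with_dependency))
```

Under this reading, a requirements file that contains only comments, like the one added in this commit, skips the extra `pip install` step.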
@@ -0,0 +1,85 @@
+# What this sample demonstrates
+
+An [Agent Framework](https://github.com/microsoft/agent-framework) agent that
+runs Python in a [Hyperlight](https://github.com/hyperlight-dev/hyperlight)
+WebAssembly sandbox via the **CodeAct** pattern, hosted using the **Responses
+protocol**. The model is only given a single `execute_code` tool. Local Python
+tools (`compute`, `fetch_data`) are registered on `HyperlightCodeActProvider`
+and are reachable from inside the sandbox via `call_tool(...)`, never as
+direct LLM tools. The sample can run as a container, but only on a host that exposes a hypervisor to it.
+
+> **⚠️ Foundry hosted-agent runtime support is in progress.**
+> Hyperlight requires a hypervisor (`/dev/kvm` on Linux, MSHV on Windows). The
+> default Foundry hosted-agent runtime does not currently expose a hypervisor
+> to the workload container, so deploying this sample as a Foundry hosted
+> agent will fail at runtime with
+> `Failed to create sandbox: ... No Hypervisor was found for Sandbox`.
+> The sample container itself works end-to-end when run locally with
+> `docker run --device=/dev/kvm ...` (see [Hypervisor requirement](#hypervisor-requirement)
+> below). We are working with the platform team to enable a hypervisor-capable
+> hosting target.
+
+## How It Works
+
+### Model integration
+
+The agent uses `FoundryChatClient` to talk to a Foundry-hosted model deployment.
+A `HyperlightCodeActProvider` is attached as a context provider, which on every
+run injects the `execute_code` tool plus the CodeAct instructions that teach the
+model how to author Python that calls `call_tool(...)` for sandbox-only tools.
+
+See [`main.py`](main.py) for the full implementation.
+
+### Agent hosting
+
+The agent is hosted with `ResponsesHostServer` from
+`agent-framework-foundry-hosting`, which exposes a REST endpoint compatible with
+the OpenAI Responses protocol.
+
+> The Hyperlight Wasm backend is currently published only for `linux/x86_64` and
+> `win32/AMD64` with Python `<3.14`. The hosted container runs `python:3.12-slim`
+> on linux/x86_64, which is supported.
+
+### Hypervisor requirement
+
+Hyperlight executes guest WebAssembly inside a micro-VM and **requires a
+hypervisor on the host**:
+
+- **Linux:** `/dev/kvm` must be present *and* the container must have access to
+  it (`docker run --device=/dev/kvm ...`).
+- **Windows:** the Microsoft Hypervisor Platform (MSHV) must be enabled.
+
+Without a hypervisor, sandbox creation fails with:
+
+```
+Failed to create sandbox: failed to build ProtoWasmSandbox: No Hypervisor was found for Sandbox
+```
+
+This affects hosted environments that don't expose `/dev/kvm` to the workload
+container (most managed PaaS, including the default Foundry hosted-agent
+runtime). To run this sample as a hosted agent you need a hosting target with
+nested virtualization and `/dev/kvm` device passthrough — for example an Azure
+VM, AKS nodes with KVM enabled, or Azure Container Instances configured for
+nested virtualization.
+
+## Running the Agent Host
+
+Follow the instructions in the
+[Running the Agent Host Locally](../../foundry-hosted-agents/README.md#running-the-agent-host-locally)
+section of the README in the Foundry Hosted Agent directory.
+
+## Interacting with the agent
+
+Send a POST request to the server with a JSON body containing an `"input"`
+field. The model should respond by calling `execute_code` with Python that uses
+`call_tool(...)` to reach the sandbox-only tools:
+
+```bash
+curl -X POST http://localhost:8088/responses \
+  -H "Content-Type: application/json" \
+  -d '{"input": "Fetch all users, find the admins, multiply 7 by 6, and print the users, admins and multiplication result. Use execute_code with call_tool(...)."}'
+```
+
+## Deploying the Agent to Foundry
+
+Deploying this container to Foundry does not work yet. We will update this sample as soon as it does.
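The Linux half of the hypervisor requirement documented in the README above can be checked before the host starts, so a missing device produces a clear message instead of a sandbox-creation failure at request time. The sketch below is an assumption-laden illustration: `kvm_available` and `hypervisor_hint` are hypothetical helpers, not part of the sample, and passing this check does not guarantee that Hyperlight will accept the device. The Windows (MSHV) case is not covered.

```python
from __future__ import annotations

import os
import sys


def kvm_available(device: str = "/dev/kvm") -> bool:
    """Return True if the KVM device node exists and is readable and writable."""
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


def hypervisor_hint() -> str | None:
    """Return a hint when the Linux KVM device looks unusable, else None."""
    if sys.platform.startswith("linux") and not kvm_available():
        return (
            "/dev/kvm is missing or not accessible; "
            "run the container with `docker run --device=/dev/kvm ...`."
        )
    return None


if __name__ == "__main__":
    hint = hypervisor_hint()
    if hint:
        print(hint, file=sys.stderr)
```

Calling `hypervisor_hint()` at the top of `main()` would be one place to surface the message, though the sample as committed does not do this.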
@@ -0,0 +1,24 @@
+name: agent-framework-agent-with-hyperlight-codeact-responses
+description: >
+  An Agent Framework agent with a Hyperlight CodeAct sandbox hosted by Foundry.
+metadata:
+  tags:
+    - Agent Framework
+    - AI Agent Hosting
+    - Azure AI AgentServer
+    - Responses Protocol
+    - Streaming
+    - Hyperlight CodeAct
+template:
+  name: agent-framework-agent-with-hyperlight-codeact-responses
+  kind: hosted
+  protocols:
+    - protocol: responses
+      version: 1.0.0
+  environment_variables:
+    - name: AZURE_AI_MODEL_DEPLOYMENT_NAME
+      value: "{{AZURE_AI_MODEL_DEPLOYMENT_NAME}}"
+resources:
+  - kind: model
+    id: gpt-4.1-mini
+    name: AZURE_AI_MODEL_DEPLOYMENT_NAME
@@ -0,0 +1,23 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/microsoft/AgentSchema/refs/heads/main/schemas/v1.0/ContainerAgent.yaml
+
+kind: hosted
+name: agent-framework-agent-with-hyperlight-codeact-responses
+description: |
+    An Agent Framework agent with a Hyperlight CodeAct sandbox hosted by Foundry.
+metadata:
+    tags:
+        - Agent Framework
+        - AI Agent Hosting
+        - Azure AI AgentServer
+        - Responses Protocol
+        - Streaming
+        - Hyperlight CodeAct
+protocols:
+    - protocol: responses
+      version: 1.0.0
+resources:
+    cpu: "1"
+    memory: 2Gi
+environment_variables:
+    - name: AZURE_AI_MODEL_DEPLOYMENT_NAME
+      value: gpt-4.1-mini
@@ -0,0 +1,41 @@
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+#     "openai>=1.50,<3",
+#     "azure-identity>=1.19,<2",
+# ]
+# ///
+# Run with: uv run call_server.py
+
+# Copyright (c) Microsoft. All rights reserved.
+
+"""Call the deployed Hyperlight CodeAct Foundry hosted agent via the OpenAI client."""
+
+import os
+
+from azure.identity import AzureCliCredential
+from openai import OpenAI
+
+# Set FOUNDRY_AGENT_ENDPOINT to your deployed agent endpoint, e.g.
+#   https://<your-foundry-resource>.services.ai.azure.com/api/projects/<project>/agents/<agent-name>
+ENDPOINT = os.environ.get(
+    "FOUNDRY_AGENT_ENDPOINT",
+    "https://<your-foundry-resource>.services.ai.azure.com"
+    "/api/projects/<project>/agents/<agent-name>",
+)
+SCOPE = "https://ai.azure.com/.default"
+PROMPT = (
+    "Fetch all users, find the admins, multiply 7 by 6, and print the users, "
+    "admins and multiplication result. Use execute_code with call_tool(...)."
+)
+
+
+def main() -> None:
+    token = AzureCliCredential().get_token(SCOPE).token
+    client = OpenAI(base_url=ENDPOINT, api_key=token, default_query={"api-version": "v1"})
+    response = client.responses.create(model="hosted-agent", input=PROMPT)
+    print(response.output_text)
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,89 @@
+# Copyright (c) Microsoft. All rights reserved.
+
+import asyncio
+import os
+from typing import Annotated, Any, Literal
+
+from agent_framework import Agent, tool
+from agent_framework.foundry import FoundryChatClient
+from agent_framework.hyperlight import HyperlightCodeActProvider
+from agent_framework_foundry_hosting import ResponsesHostServer
+from azure.identity import DefaultAzureCredential
+from dotenv import load_dotenv
+
+# Load environment variables from .env file
+load_dotenv()
+
+
+@tool(approval_mode="never_require")
+def compute(
+    operation: Annotated[
+        Literal["add", "subtract", "multiply", "divide"],
+        "Math operation: add, subtract, multiply, or divide.",
+    ],
+    a: Annotated[float, "First numeric operand."],
+    b: Annotated[float, "Second numeric operand."],
+) -> float:
+    """Perform a math operation for sandboxed code."""
+    operations = {
+        "add": a + b,
+        "subtract": a - b,
+        "multiply": a * b,
+        "divide": a / b if b else float("inf"),
+    }
+    return operations[operation]
+
+
+@tool(approval_mode="never_require")
+async def fetch_data(
+    table: Annotated[str, "Name of the simulated table to query."],
+) -> list[dict[str, Any]]:
+    """Fetch records from a named table."""
+    await asyncio.sleep(0.5)
+    data: dict[str, list[dict[str, Any]]] = {
+        "users": [
+            {"id": 1, "name": "Alice", "role": "admin"},
+            {"id": 2, "name": "Bob", "role": "user"},
+            {"id": 3, "name": "Charlie", "role": "admin"},
+        ],
+        "products": [
+            {"id": 101, "name": "Widget", "price": 9.99},
+            {"id": 102, "name": "Gadget", "price": 19.99},
+        ],
+    }
+    return data.get(table, [])
+
+
+def main():
+    # 1. Create the Foundry chat client.
+    client = FoundryChatClient(
+        project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
+        model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
+        credential=DefaultAzureCredential(),
+        function_invocation_configuration={"include_detailed_errors": True},
+    )
+
+    # 2. Register sandbox tools on a Hyperlight CodeAct provider. The model only
+    #    sees `execute_code`; `compute` and `fetch_data` are reachable from
+    #    inside the sandbox via `call_tool(...)`.
+    codeact = HyperlightCodeActProvider(
+        tools=[compute, fetch_data],
+        approval_mode="never_require",
+    )
+
+    # 3. Build the agent. History is managed by the hosting infrastructure, so
+    #    request the model not to persist server-side conversation state.
+    agent = Agent(
+        client=client,
+        instructions="You are a helpful assistant. Keep your answers brief.",
+        context_providers=[codeact],
+        default_options={"store": False},
+    )
+
+    # 4. Serve the agent over the Foundry Responses protocol.
+    server = ResponsesHostServer(agent)
+    server.run()
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,3 @@
+# agent-framework, agent-framework-foundry-hosting, and agent-framework-hyperlight
+# are installed from local source by the Dockerfile (build context = python/).
+# Add any sample-only third-party deps here.