mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00

Files

T

Eduard van Valkenburg 4609535e22 Python: feat: add agent-framework-monty (Monty-backed CodeAct provider) (#5915 )

* Python: feat: add agent-framework-monty (Monty-backed CodeAct)

New alpha package that wraps pydantic-monty (a Rust-based Python
interpreter) behind the same CodeAct API surface as
agent-framework-hyperlight, so users can swap providers with minimal
code change.

Public API (agent_framework_monty):
- MontyCodeActProvider — ContextProvider that injects a run-scoped
  execute_code tool plus dynamic CodeAct instructions.
- MontyExecuteCodeTool — standalone FunctionTool for mixed-tool agents
  or manual static wiring.
- FileMount / FileMountInput / MountMode — public types mirroring the
  Hyperlight names, with Monty's mode (read-only/read-write/overlay)
  and write_bytes_limit on FileMount.

Constructor kwargs (both classes) mirror Hyperlight where possible:
tools, approval_mode, workspace_root, file_mounts; plus a Monty-only
resource_limits forwarding ResourceLimits to Monty.start().

Filesystem flow:
- workspace_root auto-mounts at /input (read-write), matching Hyperlight.
- file_mounts accepts string shorthand, (host, mount) tuple, or
  FileMount with mode + write cap.
- Files written under read-write mounts are scanned post-execution and
  returned as Content.from_data items (mirrors Hyperlight /output).
- overlay mounts buffer writes in-memory; read-only mounts reject writes.

Internals:
- _monty_bridge.InlineCodeBridge ports the inline (non-durable) bridge
  from anthonychu/maf-codeact-monty-python; handles FunctionSnapshot /
  FutureSnapshot pause/resume, dispatches direct typed calls + the
  call_tool fallback, forwards mount/limits to Monty.start(...).
- generate_type_stubs emits per-tool stubs so Monty's `ty` type-checker
  rejects bad calls before any host tool runs.

Alpha-policy compliance (per python-package-management skill):
- Added agent-framework-monty = { workspace = true } to root
  pyproject.toml.
- Added row to python/PACKAGE_STATUS.md.
- Added monty entry under Experimental in python/AGENTS.md.
- NOT added to core[all]; NO agent_framework.monty lazy shim (deferred
  to beta promotion).

Samples (three sets, import from agent_framework_monty directly):
- samples/02-agents/context_providers/code_act/monty_code_act.py
  (provider pattern) + updated local README.
- samples/02-agents/tools/monty_code_interpreter/ (standalone +
  manual-wiring + README).
- samples/04-hosting/foundry-hosted-agents/responses/11_monty_codeact/
  (full hosted-agent layout with uv-based pyproject.toml + Dockerfile,
  Azure Monitor wiring via APPLICATIONINSIGHTS_CONNECTION_STRING +
  enable_instrumentation, ENABLE_INSTRUMENTATION and
  ENABLE_SENSITIVE_DATA env vars). The alpha wheel is vendored into
  ./wheels/ (gitignored) via vendor-wheel.sh; new row added to the
  parent Responses-API README.

Tests:
- 28 hermetic unit tests (stubbed pydantic_monty).
- 18 integration tests marked @pytest.mark.integration, auto-skipped
  when pydantic_monty is unimportable; exercise the real Monty
  runtime: print round-trip, last-expression value, direct typed
  tool dispatch, call_tool fallback, async tool, asyncio.gather
  parallelism, ty type-check rejection, OS blocked by default,
  workspace_root read+write capture, read-only / overlay mount
  semantics, resource_limits.max_duration_secs abort, approval
  gating end-to-end, full Agent run with a scripted chat client.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix: monty FileMount test compares against the normalized POSIX path

The shorthand string mount goes through _normalize_mount_path, which
rewrites Windows drive letters like 'C:\\Users\\...' into
'/C:/Users/...' (POSIX-style). The Windows CI runners surfaced this
because tmp_path resolves to a backslashed Windows path; the test was
comparing against the raw str(host_a) instead of the normalized form.

Compare against _normalize_mount_path(str(host_a)) so the assertion is
platform-independent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix: address PR #5915 review feedback

- _execute_code_tool docstring: clarify that the Monty backend supports
  scoped filesystem access via workspace_root / file_mounts (blocked by
  default).
- _to_monty_mount: import pydantic_monty lazily through load_monty so
  missing-dependency errors surface as the same actionable RuntimeError
  the rest of the package raises (not a bare ImportError at module load).
  Renamed _load_monty -> load_monty for the same reason.
- _python_type_repr: emit None for type(None) instead of Any, and
  normalize both typing.Union[...] and PEP-604 X | Y to PEP-604 syntax
  so Optional[X] / Union[..., None] / -> None signatures round-trip
  correctly through ty validation. Added a regression test.
- _PrintCollector: track a running character count instead of
  recomputing sum(len(c) for c in self.chunks) per callback. Eliminates
  the O(n^2) cost on print-heavy code.
- Instructions: mention that the value of the final expression is also
  returned alongside captured stdout (matches actual behavior).
- 11_monty_codeact Dockerfile: pin ghcr.io/astral-sh/uv to 0.11.6
  instead of :latest for reproducible builds.
- 11_monty_codeact README: replace the bare "see parent README" pointer
  with sample-specific steps (./vendor-wheel.sh + uv sync + uv run),
  since the sample uses pyproject.toml + a vendored wheel rather than
  requirements.txt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: sample: 11_monty_codeact installs agent-framework-monty from PyPI

Drop the vendored-wheel scaffolding now that agent-framework-monty is on
PyPI as an alpha (1.0.0a*) release:

- pyproject.toml: remove [tool.uv.sources] override; keep [tool.uv]
  prerelease = "allow" so uv pulls the alpha automatically.
- Dockerfile: drop the COPY wheels/ step.
- README: drop the ./vendor-wheel.sh setup step and the
  not-yet-on-PyPI warning.
- Delete vendor-wheel.sh and the gitignored wheels/ directory.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(monty): harden post-execution file capture against symlink escape

Same class of issue as the MSRC-reported Hyperlight finding: the
post-execution capture walked workspace_root with Path.rglob() +
is_file() + read_bytes() - all of which follow symlinks. An attacker
who controls the workspace (cloned repo, extracted archive, shared
workspace) could pre-place `workspace/leak.txt -> /etc/passwd` or
`workspace/outside_dir -> /etc/` and have host files surface as
captured Content items.

Monty's mount layer already rejects symlink reads from inside the
sandbox across all three modes (verified empirically), so the runtime
path was safe. This commit closes the post-execution scan path.

Changes:
- New `_iter_real_files(root)` walker that uses iterdir() +
  is_symlink() to skip symlinks at every directory level and yields
  only real files. Replaces the previous `host_root.rglob("*")` calls
  in both `_snapshot_writable_mounts` and `_capture_written_files`.
- Use `Path.lstat()` instead of `Path.stat()` so size/mtime can never
  be taken from a symlink target.
- Three new integration tests reproducing the MSRC attack shape
  against the workspace_root flow: symlink-to-file outside workspace,
  symlink-to-directory outside workspace, and a guard ensuring
  legitimate sandbox writes are still captured when symlinks are
  present.

Per user request, hyperlight is untouched in this commit (separate fix).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(monty): skip symlink regression tests when unsupported

Apply the same Windows-CI safety guard as the hyperlight fix in PR #5919:
the three symlink integration tests create symlinks via Path.symlink_to(),
which fails with OSError / NotImplementedError on unprivileged Windows
runners. Add a local _symlinks_supported helper (mirroring the one in
packages/core/tests/core/test_skills.py) and pytest.skip when symlinks
aren't available, so the tests no longer fail for environment reasons.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(monty): address PR #5915 follow-up review feedback

- _invoke_tool: drop the inspect.iscoroutinefunction(...) branch and
  always `await self.tool_map[name](**kwargs)`. Every entry in
  tool_map is `partial(FunctionTool.invoke, skip_parsing=True)` and
  FunctionTool.invoke is `async def`, so the branching was dead code -
  and on Python versions affected by cpython#98590,
  iscoroutinefunction(partial(bound_async_method, ...)) returns False,
  causing the bridge to take the asyncio.to_thread path, return an
  unawaited coroutine, and surface it as a JSON-serialization failure
  for every tool call. Added a regression test
  test_invoke_tool_awaits_partial_wrapped_async_method.

- generate_type_stubs: skip tools whose name is not a valid Python
  identifier or is a Python keyword. FunctionTool.name has no upstream
  validation, so a name like "weird-name" produced a syntax error in
  the stubs and a name like "broken\n    pass\nasync def injected"
  would inject arbitrary stub source. Non-identifier names stay
  reachable via `call_tool("weird-name", ...)` at runtime; they just
  don't get type-checked stubs. Added regression test
  test_generate_type_stubs_skips_non_identifier_tool_names.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2026-05-20 00:35:23 +00:00

6.2 KiB

Raw Blame History

agent-framework-monty

Monty-backed CodeAct integrations for Microsoft Agent Framework.

Warning

This package is in alpha. APIs may change without notice. It is not part of agent-framework[all] yet; install it explicitly with --pre.

Installation

pip install agent-framework-monty --pre

The package depends on pydantic-monty, a Rust-based Python interpreter, so it runs on Linux, macOS, and Windows wherever Monty wheels are published — no hypervisor or WASM backend required.

Quick start

Context provider (recommended)

Use MontyCodeActProvider to automatically inject the execute_code tool and CodeAct instructions into every agent run. Tools registered on the provider are available inside the Monty interpreter as typed async functions (e.g. await compute(operation="add", a=1, b=2)), and as a fallback through call_tool(...).

from agent_framework import Agent, tool
from agent_framework_monty import MontyCodeActProvider


@tool
def compute(operation: str, a: float, b: float) -> float:
    """Perform a math operation."""
    ops = {"add": a + b, "subtract": a - b, "multiply": a * b, "divide": a / b}
    return ops[operation]


codeact = MontyCodeActProvider(
    tools=[compute],
    approval_mode="never_require",
)

agent = Agent(
    client=client,
    name="CodeActAgent",
    instructions="You are a helpful assistant.",
    context_providers=[codeact],
)

result = await agent.run("Multiply 6 by 7 using execute_code.")

Standalone tool

Use MontyExecuteCodeTool directly when you want full control over how the tool is added to the agent (e.g. when mixing sandbox tools with direct-only tools on the same agent).

from agent_framework import Agent, tool
from agent_framework_monty import MontyExecuteCodeTool


@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email (direct-only, not available inside the sandbox)."""
    return f"Email sent to {to}"


execute_code = MontyExecuteCodeTool(
    tools=[compute],
    approval_mode="never_require",
)

agent = Agent(
    client=client,
    name="MixedToolsAgent",
    instructions="You are a helpful assistant.",
    tools=[send_email, execute_code],
)

Manual static wiring

For fixed configurations where provider lifecycle overhead is unnecessary, build the CodeAct instructions once and pass them to the agent at construction time:

execute_code = MontyExecuteCodeTool(
    tools=[compute],
    approval_mode="never_require",
)

codeact_instructions = execute_code.build_instructions(tools_visible_to_model=False)

agent = Agent(
    client=client,
    name="StaticWiringAgent",
    instructions=f"You are a helpful assistant.\n\n{codeact_instructions}",
    tools=[execute_code],
)

File mounts and resource limits

Mount host directories into the sandbox and cap execution resources:

from agent_framework_monty import FileMount, MontyCodeActProvider

codeact = MontyCodeActProvider(
    tools=[compute],
    workspace_root="/host/workspace",       # auto-mounted at /input (read-write)
    file_mounts=[
        "/host/data",                                                # shorthand: same path on both sides
        ("/host/models", "/sandbox/models"),                          # explicit (host, mount_path)
        FileMount(                                                    # full control
            host_path="/host/cache",
            mount_path="/sandbox/cache",
            mode="overlay",                # "read-only" | "read-write" | "overlay"
            write_bytes_limit=10 * 1024 * 1024,
        ),
    ],
    resource_limits={                       # Monty ResourceLimits TypedDict
        "max_duration_secs": 5.0,
        "max_memory": 64 * 1024 * 1024,
    },
)

workspace_root mirrors the Hyperlight default: the directory is mounted at /input in read-write mode.
file_mounts accepts a string shorthand, a (host_path, mount_path) tuple, or a FileMount named tuple (with optional mode and write_bytes_limit).
Files written by the sandbox to any read-write mount are scanned after each execute_code call and returned as Content.from_data(...) attachments (with a path annotation in additional_properties), mirroring Hyperlight's /output flow.
overlay mounts buffer writes in memory (nothing leaks to the host and nothing is captured). read-only mounts reject writes.
resource_limits is forwarded straight to Monty's ResourceLimits TypedDict (max_allocations, max_duration_secs, max_memory, gc_interval, max_recursion_depth).

DSL inside `execute_code`

The model generates Python code that runs inside Monty's Rust-based interpreter. Available primitives:

Primitive	Behavior
`await tool_name(**kwargs)`	Direct typed call to a registered host tool. Argument types are checked before execution.
`await call_tool("name", **kwargs)`	Generic fallback that dispatches by tool name. Not type-checked.
`asyncio.gather(...)`	Fans out concurrent tool calls.
`print(...)`	Captured and surfaced as text in the tool result.

Notes

MontyCodeActProvider and MontyExecuteCodeTool mirror the API surface of the agent-framework-hyperlight counterparts where the underlying runtime supports it.
Monty interprets a subset of Python (a Rust-based interpreter). Most control flow, common stdlib modules (sys, os, typing, asyncio, re, datetime, json), and async functions are supported, but exotic features may not be available. OS-level access (filesystem, network, subprocess) is rejected with PermissionError by default; mount host directories with workspace_root / file_mounts to grant scoped filesystem access.
Code is type-checked against tool signatures via ty before execution, so wrong argument types surface as a clear error before any host tool runs.
The alpha package is not part of agent-framework[all] yet, so it must be installed explicitly. Once promoted to beta it will be reachable via the lazy-loading namespace agent_framework.monty.

6.2 KiB Raw Blame History