agent-framework

mirror of https://github.com/microsoft/agent-framework.git synced 2026-06-16 21:04:09 +08:00

Simplify Python hosting core (#6492 )

Remove linking, multicast, durable delivery, and host push machinery from the v1 hosting core. Keep those scenarios in a proposed follow-up ADR and update channel packages, samples, docs, tests, and workspace metadata around the smaller host/channel contract.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-06-12 08:34:08 +02:00

36ce0950e4

Python: feat(python): cross-channel hosting improvements (endpoint paths, Activity push, Telegram/Teams fixes) (#6307 )

* Update hosting channel endpoint paths

Treat channel paths as concrete endpoint paths so built-in channels can be mounted at their defaults or at the app root without sample-specific subclasses. Update docs, tests, and the Foundry Telegram Invocations sample accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add push support to ActivityProtocolChannel

Implement the ChannelPush protocol so the Activity Protocol channel can
receive cross-channel fan-out (ResponseTarget.all_linked) and echo_input
replay as a non-originating destination:

- Add push() that reconstructs a proactive Bot Framework activity (bot/user
swap) from the stored conversation reference and POSTs it to
/v3/conversations/{id}/activities.
- Record a ChannelIdentity (service_url, conversation, bot, user, channel_id,
locale) on ChannelRequest.identity so the host registers the channel under
its isolation key for fan-out resolution.
- Route the streaming path through deliver_response so Activity-originated
turns broadcast like Telegram/Discord.
- Add tests for push delivery, service_url validation, ChannelPush instance
check, and inbound identity recording.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Don't delete Telegram webhook on shutdown by default

The TelegramChannel deleted its webhook on shutdown in webhook mode. During
a rolling redeploy the new revision registers the webhook on startup, then
the old revision's shutdown deletes it, silently breaking inbound delivery
until the next boot. setWebhook is overwriting/idempotent, so startup
re-asserts the webhook every boot and no teardown is needed.

Add a delete_webhook_on_shutdown flag (default False) so teardown is opt-in
for ephemeral deployments, and leave the webhook in place otherwise.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Activity channel streaming on non-Teams channels (405 on updateActivity)

The Activity Protocol channel streamed replies the Teams way: POST a
placeholder, then PUT-edit it as tokens arrive. Only Teams supports the
updateActivity REST op; Web Chat, Direct Line and the Emulator return
405 Method Not Allowed on the PUT, so the user saw only the placeholder.

Gate the placeholder+edit flow on edit-capable channels (msteams). Other
channels now buffer the stream and POST a single final message, mirroring
the non-streaming path's fan-out and response-hook semantics. Also add a
defensive 405 fallback inside the Teams edit loop so an unexpected 405
can never strand the user on the placeholder.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting-activity-protocol): don't parse Teams inline attachment content as a URI

Teams message activities include a text/html attachment whose inline
`content` is raw HTML (not a URL). _parse_activity fell back to
`attachment["content"]` and passed it to Content.from_uri, raising
ContentError ("URI must contain a scheme") and failing the whole turn,
so Teams users got no response.

Only treat `contentUrl` as a URI, require an absolute scheme, and skip
unparseable attachments defensively instead of failing the message.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(hosting-activity-protocol): native slash-command dispatch for Teams/Activity

Add a commands= parameter to ActivityProtocolChannel that intercepts a
leading /command (after stripping the bot's own @mention) and dispatches
to ChannelCommand handlers, mirroring the Telegram channel. Unknown
commands fall through to the agent. The channel run_hook is applied to
command requests so handlers observe the same resolved isolation key as
ordinary messages, and handler errors are swallowed (200, no Bot Service
retry of non-idempotent commands).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(hosting): silent attributed Telegram echoes + Teams markdown rendering

- hosting-telegram: send cross-channel input echoes with disable_notification
(silent) and detect echo payloads so they aren't re-broadcast.
- hosting-activity-protocol: render outbound + push activities as textFormat
'markdown' so Teams shows formatted replies (enables per-channel variants).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting-activity-protocol): address PR #6307 review feedback

Consult the host delivery pipeline even for empty streamed replies so
ResponseTarget.none is honoured and non-originating fan-out is consulted
instead of always emitting an originating "(no response)" message. Applies
to both the progressive-edit (Teams) and buffered (Web Chat/Direct Line)
streaming paths.

Re-validate service_url against the allow-list in push(): the identity is
read from a persisted store and push runs out-of-band, so the captured
service_url must be re-checked before a bearer token is sent.

Adds tests for empty-stream host consultation/suppression on both streaming
paths and for push rejecting a disallowed service_url.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-06-03 16:37:03 +02:00

e5a6e35843

Python: add hosting Channels sample apps (#5645 )

* samples(hosting): add hosting Channels sample apps under samples/04-hosting/af-hosting

Adds five end-to-end sample apps under
``python/samples/04-hosting/af-hosting/`` that exercise the
``agent-framework-hosting`` Channels stack from the simplest single-channel
case up to a multi-channel deployment with cross-channel identity linking.

Samples (ordered by complexity)
-------------------------------

* ``foundry_hosted_agent/`` — minimal Responses + Invocations host with a
  Foundry-backed agent and ``FoundryHostedAgentHistoryProvider``.
  ``agd``-deployable; bundles a ``Dockerfile`` and
  ``scripts/vendor-packages.sh`` that copies workspace packages into
  ``_vendor/`` for self-contained builds. ``_vendor/`` is gitignored.
* ``local_responses/`` — single-channel Responses host with a
  ``run_hook`` that strips caller-supplied options and forces a
  reasoning preset. Demonstrates the hook seam over the uniform
  ``ChannelRequest`` envelope.
* ``local_responses_workflow/`` — Responses + Invocations exposing a
  three-agent workflow with per-conversation checkpoint storage.
* ``local_telegram/`` — Responses + Telegram with a ``@tool``,
  ``FileHistoryProvider``, hooks, and a ``ResponseTarget`` multicast
  variant (``call_server_multicast.py``) that pushes a single Responses
  reply to a separate Telegram chat.
* ``local_identity_link/`` — full surface: Responses + Invocations +
  Telegram + Activity Protocol (Teams) + the ``EntraIdentityLinkChannel``
  sidecar. Resolves per-channel ids onto a single Entra object id so a
  user's history follows them across surfaces.

Notes
-----

* Samples that use Telegram/Teams via Activity Protocol depend on the
  renamed ``agent-framework-hosting-activity-protocol`` package (see the
  PR-5 series).
* All samples use ``[tool.uv.sources]`` editable workspace deps, except
  ``foundry_hosted_agent/`` which uses the ``./_vendor/`` self-contained
  layout for ``azd`` Docker builds.
* Each sample includes a ``README.md`` with run instructions and an
  ``app.py`` ASGI entrypoint plus a ``call_server.py`` client harness.

Depends on the prior hosting PRs (foundry-hosted-agent refactor +
hosting-core + the per-channel packages). After those merge, this
branch can be rebased onto ``main`` cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* samples(hosting): point sample deps at the feature/python-hosting GitHub branch

Switches every sample's ``[tool.uv.sources]`` from in-monorepo
editable path deps (which only resolve when running inside the
agent-framework workspace) to git refs targeting the
``feature/python-hosting`` branch on
``microsoft/agent-framework``. Samples now install standalone outside
the monorepo while the ``agent-framework-hosting*`` packages are still
pre-PyPI; once they publish, the ``[tool.uv.sources]`` block can be
dropped and the declared deps resolve from PyPI.

Cleanup
-------

* Drops ``foundry_hosted_agent/scripts/vendor-packages.sh``,
  ``_vendor/`` from ``.gitignore``, the ``hooks.prepackage`` block in
  ``azure.yaml`` and the ``COPY _vendor/`` step in the Dockerfile —
  vendoring is no longer needed because git refs make the deps
  network-resolvable from any context.
* Drops obsolete ``workspace.pyproject.toml`` reference and ``scripts/``
  / ``workspace.pyproject.toml`` entries from
  ``Dockerfile.dockerignore``.
* Updates the foundry sample's Dockerfile to ``uv sync --no-dev``
  (no ``--frozen``) so it locks fresh against the GitHub-hosted deps
  at build time.
* Drops every committed ``uv.lock`` because the resolver needs network
  access to ``feature/python-hosting`` to lock — they regenerate the
  first time a user runs ``uv sync`` after the branch lands.
* Refreshes the per-sample READMEs to mention the GitHub install path
  instead of "in-tree workspace packages".

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* samples(hosting): address PR #5645 review comments

- foundry_hosted_agent/call_server.py: replace hard-coded
  project_endpoint and service_session_id with FOUNDRY_PROJECT_ENDPOINT,
  FOUNDRY_HOSTED_AGENT_NAME, and optional FOUNDRY_HOSTED_SESSION_ID
  environment variables. Session-id is now optional so the sample
  exercises the new-conversation path by default.

- local_identity_link/app.py:
  * make_telegram_hook: apply the reasoning bump regardless of
    identity-link state (the previous early-return on linked chats
    silently dropped the high-effort preset for the very flow the
    sample exists to demonstrate).
  * make_responses_hook: add a prominent DEV-ONLY warning that the
    client-supplied entra_oid shortcut bypasses identity verification
    and must be replaced by a JWT validator in production.
  * /link command: early-return when chat_id is missing instead of
    minting an authorize URL keyed on "telegram:None" (which would
    poison the link store with a binding any future chat_id-less
    update would collapse onto).
  * Switch ENTRA_CERT_PATH / ENTRA_CERT_PASSWORD env vars to the
    longer ENTRA_CERTIFICATE_PATH / ENTRA_CERTIFICATE_PASSWORD names
    that the README already documents.
  * channels: Sequence[Channel] -> list[Channel] (the next line
    appends, which a Sequence type doesn't expose).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(hosting-samples): apply sample formatting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting-samples): guard command input text

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-05-28 14:57:46 +02:00

6b822853eb

Python: Shell tool with support for local and Docker (#5664 )

* feat(tools): add cross-OS LocalShellTool in new agent-framework-tools package

Introduces a safe, cross-OS local shell tool as the first citizen of a new

agent-framework-tools workspace package. Supports persistent (default) and

stateless modes across pwsh/powershell.exe/bash/sh, with policy denylist,

allowlist, approval gating, process-tree kill on timeout, output truncation,

and audit hooks. Integrates with existing provider get_shell_tool(func=...)

factories via FunctionTool kind='shell'.

See docs/decisions/0026-builtin-tools-local-shell.md for the full design.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(tools): security hardening for LocalShellTool

Codifies what LocalShellTool does and does not defend against, and

delegates the security-relevant lifecycle primitive to a battle-tested

library instead of hand-rolled per-OS code.

Changes:

- Adopt psutil for cross-OS process-tree termination (executor + session).

  Replaces hand-rolled taskkill/killpg with one canonical implementation.

- Resolve taskkill.exe to absolute %SystemRoot%\System32 path so PATH

  poisoning cannot redirect us to an attacker-supplied binary.

- Reframe ShellPolicy docstring + ADR + README: denylist is a guardrail,

  not a security boundary.

- Require acknowledge_unsafe=True to set approval_mode='never_require',

  making the unsafe path explicitly opt-in with a self-documenting name.

- Add tests/test_security.py codifying named CVE-style cases. Defenses

  we DO claim are asserted; non-defenses (denylist bypasses via

  backslash insertion, variable expansion, interpreter escape, base64,

  alternative tools, PowerShell-native verbs) are documented as

  expected-to-pass tests so residual risk stays visible.

- Add Threat Model + Confidence Strategy sections to ADR 0026.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(tools): add DockerShellTool sandboxed shell tier

Adds a container-backed shell executor as the recommended pattern for untrusted-input shell workflows. The container provides the security boundary (--network none, non-root user, --read-only, --cap-drop ALL, no-new-privileges, memory/pids limits, tmpfs /tmp), so approval gating is optional unlike LocalShellTool.

Also introduces a ShellExecutor Protocol so callers can plug in custom backends (Firecracker, SSH, WASI) without forking the framework.

Removes the planned HyperlightShellExecutor follow-up from ADR 0026: Hyperlight is a WASM code sandbox with no kernel/userland/shell binary, so a Hyperlight-backed shell is not viable. Docker is the realistic sandbox tier for shell.

Tests: 11 unit tests for argv builders + lifecycle (no Docker daemon required); 3 integration tests gated on is_docker_available().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(tools): backport shell-tool fixes from .NET parity review

Applies the applicable subset of bug fixes accumulated during the
.NET shell-tool PR review (microsoft/agent-framework#5604) to the
Python shell tool.

A1 - Quote workdir safely in _maybe_reanchor

  Previously _tool.py used double-quote interpolation when emitting
  the cd/Set-Location prefix, which expanded $VAR, $(), and backticks
  in the workdir path. A workdir containing shell metacharacters could
  trigger arbitrary command execution before the user command ran.

  Replaced with single-quote escaping helpers _quote_posix and
  _quote_powershell that emit literal-string forms safe for both
  hosts.

A5/A6 - Consolidate truncation to a single byte-aware helper

  Extracted a shared truncate_head_tail / truncate_text_head_tail
  helper in _truncate.py. The new implementation distributes odd
  caps so head receives floor(cap/2) and tail receives ceil(cap/2)
  bytes, matching the .NET round-9 fix and ensuring no input bytes
  are silently dropped on the boundary.

  _session.py previously truncated by Python str length while the
  caller passed _max_output_bytes - the unit mismatch is now gone:
  raw byte buffers go through truncate_head_tail and decoded text
  goes through truncate_text_head_tail.

Unit tests added for the truncate and quote helpers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(tools): tone down narrative and overconfident comments in shell tool

The shell tool's docstrings and comments contained two patterns that
the .NET review pushed back on:

- Narrative framing about implementation history ("hard-won",
  "we sidestep", "design inspiration: ...", competitor framework
  name-drops in module docstrings).
- Overstated security guarantees ("battle-tested",
  "reasonable for untrusted input", "recommended executor for any
  agent that runs commands from untrusted input",
  "destructive commands are blocked", "safe local shell tool",
  "blocks shell injection").

Rewrites the affected docstrings and comments to describe what the
code does in neutral terms. Behaviour is unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(tools): add ShellEnvironmentProvider for the Python shell tool

Ports the .NET ShellEnvironmentProvider as a Python ContextProvider
so agents using LocalShellTool or DockerShellTool can be primed with
an accurate description of the shell they're talking to (family,
version, OS, working directory, and which CLIs are available).

The provider runs probes through any ShellExecutor, caches the
resulting snapshot, and on every before_run extends the session
instructions with a markdown block describing the shell idiom to
use. A failed first probe leaves the cache empty so the next call
retries (no permanent poisoning).

Probe failures from a narrow set of expected error types
(ShellCommandError, ShellExecutionError, ShellTimeoutError, and
asyncio.TimeoutError from the per-probe timeout) are recorded as
None fields in the snapshot. Other exceptions propagate. Tool
names are validated against ^[A-Za-z0-9._-]+$ before being
interpolated into a probe command.

Includes 12 unit tests covering happy path, stderr fallback,
timeout handling, expected/unexpected exception paths, malicious
tool name rejection, case-insensitive deduplication, retry after
failure, concurrent first-callers sharing one probe, and the
default and custom formatter paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(tools): document ShellEnvironmentProvider and finish comment cleanup

Add a README section introducing ShellEnvironmentProvider, soften two remaining overconfident security-boundary comments in _executor_base.py and the DockerShellTool class docstring, and add a sample (shell_with_environment_provider.py) that demonstrates the provider in stateless and persistent modes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(tools): move shell samples to python/samples/02-agents/tools

The repository convention is to host samples under python/samples/ rather than inside the package directory. Move the two net-new shell samples (allow-list and environment-provider) to python/samples/02-agents/tools/ and drop the in-package samples/ directory; the existing top-level providers/openai/client_with_local_shell.py already covers the basic LocalShellTool walkthrough.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(tools): cover confine_workdir default and ShellResult.format_for_model

Two new tests in test_local_shell_tool.py exercise the default confine_workdir=True behaviour on POSIX and PowerShell, asserting that 'cd' inside one persistent-mode call does not leak into the next. A new test_shell_result.py module provides direct unit coverage for every conditional branch of ShellResult.format_for_model (stdout, truncated, stderr, timed_out, exit_code) so regressions in the LLM-facing format are caught immediately.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(tools): address PR #5664 review feedback

- _tool.py: detect PowerShell via is_powershell() helper instead of basename string match

- _environment.py: use public ContextProvider import (no private _ prefix)

- _session.py: trim _stdout_buf/_stderr_buf after copying to avoid unbounded retention across calls

- _docker.py: short-circuit start()/close() in stateless mode; add configurable shell kwarg (default bash, e.g. 'sh' for alpine)

- tests: parenthesized multi-line assert; alpine integration tests now pass shell='sh'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(tools): satisfy CI quality gates

- pyupgrade: drop quoted self-class refs in __aenter__/method annotations

- ruff format: reflow long lines per workspace style

- pyright: assert psutil non-None in optional-import branch; lowercase mutable module globals; annotate _approval_mode as Literal so tool() Literal-typed kwarg is accepted; add ... body to ShellExecutor.run protocol; remove unused deprecated _kill_tree wrapper

- tests: skip docker integration tests on win32 (Windows containers don't support --read-only / alpine images)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Remove DEFAULT_DENYLIST; document single-session ownership; fix bandit findings

Mirrors the .NET PR #5604 cleanup:

- Remove DEFAULT_DENYLIST from ShellPolicy. ShellPolicy() now ships with an empty deny-list; operators opt into site-specific patterns explicitly. No major agent framework uses regex matching as a primary security control; AutoGen v2 removed theirs. Approval gating + sandbox tier remain the real boundaries.

- Rewrite module / class docstrings to frame ShellPolicy as a UX pre-filter, not a security control.

- Add Single-session ownership paragraphs to ShellExecutor, ShellSession, LocalShellTool, and DockerShellTool: a persistent-mode tool is owned by exactly one conversation / agent session; do not share across users or concurrent conversations.

- Tests now supply explicit deny patterns instead of relying on a default.

- Address Pre-commit Hooks (bandit) CI failures: convert internal-invariant asserts to explicit RuntimeError, annotate intentional subprocess/shell usage with # nosec, document container-internal /tmp paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #5664 round-2 review feedback

Deny-list documentation drift:

- README and the OpenAI/local-shell sample no longer claim a built-in deny-list of destructive commands. ShellPolicy is described as an optional, operator-supplied UX pre-filter; the real boundaries remain approval gating and the sandbox tier.

Behavioural fixes called out in review:

- ShellPolicy.evaluate() now denies empty / whitespace-only commands explicitly instead of returning allow with no rationale.

- truncate_head_tail() raises ValueError for cap <= 0 instead of silently returning the full input with truncated=False, which previously could defeat output-capping in callers that mis-configured the budget.

- LocalShellTool.as_function() / DockerShellTool.as_function() return the ShellCommandError text directly so the model sees a single, non-redundant 'Command rejected by policy: …' message instead of the prior duplicated 'Command blocked by policy: Command rejected …' wrapping.

- ShellSession POSIX sentinel trailer now snapshots and restores the prior errexit (set -e) state around the trailer, so a user 'set -e' in the persistent shell is no longer permanently disabled by the next run().

Tests:

- New test_shell_parse_rc.py covers the full _parse_rc() edge-case surface (zero, positive, negative, CRLF, no newline, missing prefix, empty input, non-digits, trailing garbage, partial digits).

- test_policy.py asserts the new empty-command deny.

- test_shell_truncate_and_quote.py asserts ValueError for cap=0 and cap<0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review feedback for shell tool

- _resolve.py: reject empty/whitespace shell override string
- _tool.py / _docker.py: mode-aware default tool description (persistent vs stateless)
- _tool.py: fix misleading workdir docstring (re-anchor, not blocking)
- _types.py: emit stream-agnostic [output truncated] marker
- _policy.py: declare _denies/_allows as dataclass fields
- _environment.py: use $(pwd) instead of $PWD in POSIX probe

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review feedback: shell override flag + probe timeout safety

- _resolve.py: in stateless mode, ensure shell overrides end with -c/-Command so commands aren't misinterpreted as script-file paths.
- ShellExecutor.run / LocalShellTool.run / DockerShellTool.run now accept an optional 	imeout kwarg; ShellEnvironmentProvider drops the outer asyncio.wait_for and lets the executor enforce the probe timeout internally, so cancellation no longer risks leaving a hung subprocess or corrupted session.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback: docker isolation + lifecycle robustness

- pyproject.toml: bump agent-framework-core minimum from 1.2.0 to 1.2.2 to align with the rest of the workspace.
- _docker.py: validate extra_run_args at construction time and reject flags that would dismantle the isolation defaults (--privileged, --cap-add, --security-opt, --network/--net, -v/--volume/--mount, --device, --pid, --ipc, --userns, --user, --read-only, --tmpfs, --add-host, --gpus, --cgroupns, --device-cgroup-rule); also documented the warning on the docstring.
- _docker._stop_container: retry docker rm -f once and log a warning/error when it does not succeed, so operators can audit leaked containers instead of getting a silent success.
- _docker._run_stateless timeout path: fall back to docker rm -f when docker kill fails or times out (--rm only reaps on clean exit), and log instead of silently swallowing communicate() errors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: alliscode <bentho@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: alliscode <25218250+alliscode@users.noreply.github.com>

Ben Thomas · 2026-05-22 00:29:59 +00:00

8e54f0b0e7

Python: Show more authentication methods in Foundry Toolbox MCP (#5719 )

* Show more authentication methods in Foundry Toolbox MCP

* Remove hardcoded toolbox version num

* Add Foundry MCP OAuth consent handling

* Use message instead of the dedicated item type

* Go back to using OAuthConsentRequestOutputItem

* WIP: sample testing

* Update error code

* Address review on Foundry Toolbox MCP samples

Reviewed feedback addressed:

- Drop the branch-pinned `git+https://...@feature/...` entries from
  `04_foundry_toolbox/requirements.txt`; restore the simple comment + `mcp`
  runtime dep. The git pins were only useful while iterating on the PR and
  shouldn't ship. (eavanvalkenburg)

- Fix the `/toolsets/` typo in both `04_foundry_toolbox/README.md` and
  `06_files/README.md`. Verified empirically against the
  research_toolbox in the test workspace: the toolbox MCP gateway lives at
  `/toolboxes/{name}/mcp?api-version=v1` and requires the
  `Foundry-Features: Toolboxes=V1Preview` header. `/toolsets/{name}/mcp`
  returns 403 with `preview_feature_required: Toolsets=V1Preview` (a
  different opt-in feature).

- Wrap `httpx.AsyncClient(...)` in `async with ... as http_client:` in both
  samples so the connection pool is cleaned up. (Copilot reviewer)

- Make the `TOOLBOX_NAME` env var consistent in both samples. Previously the
  tool name silently fell back to `"toolbox"` when `TOOLBOX_NAME` was unset,
  but `resolve_toolbox_endpoint()` still required `TOOLBOX_NAME` and would
  raise `KeyError`. The samples now resolve the endpoint once and derive the
  tool name from the resolved URL when `TOOLBOX_NAME` isn't set, so the
  local tool name always matches the upstream toolbox identity regardless
  of which env var the user set. (Copilot reviewer)

- Rename `_responses.is_consent_error` to `consent_url_from_error`: the
  helper returns `str | None` (the consent URL), not a bool, so the new
  name matches behavior. Update the test class accordingly. (eavanvalkenburg)

- Tighten `_handle_inner_agent`'s lazy-entry catch from `Exception` to
  `AgentFrameworkException`, the type the MCP layer actually wraps consent
  errors in via `MCPStreamableHTTPTool.__aenter__` →
  `ToolExecutionException(inner_exception=mcp_error)`. Network failures,
  cancellations, and other non-framework exceptions now propagate normally
  instead of being briefly caught and re-raised. The test helper
  `_make_consent_error` is updated to use `ToolExecutionException` so it
  matches the real-world wrapping. (eavanvalkenburg)

- Clarify the `github_pat` description in `agent.manifest.yaml` to note
  it's only needed when the PAT-based connection (`github-mcp-pat-conn`)
  is chosen; users selecting the OAuth2 connection (`github-mcp-oauth-conn`)
  can leave it empty. (Copilot reviewer)

Validation: ran both samples end-to-end against a real Foundry toolbox
(`research_toolbox`) -- the samples connect successfully and the agent
lists the toolbox's MCP tools (`api_specs___fetch_azure_rest_api_docs`,
etc.). `uv run poe test -P foundry_hosting` passes (119 tests), pyright +
mypy clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: fix broken Foundry samples link in 04_foundry_toolbox README

The previous URL pointed to an old location of the toolbox supported-scenarios
doc; the doc moved to /samples/python/hosted-agents/SUPPORTED_TOOLBOX_SCENARIOS.md
and the old /samples/python/toolbox/azd path now 404s.

Caught by the markdown-link-check CI step.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Tao Chen · 2026-05-20 12:00:38 +00:00

d74d26c917

[BREAKING] Python: Enable instrumentation by default (#5865 )

* Enable instrumentation by default

* Update samples

* Optimization when span is not recording

* Address Copilot comments

* Revert uv.lock

* Add warning

* Formatting

* Fix mypy

* Add disable_instrumentation() with sticky user-intent semantics

Add a public disable_instrumentation() entry point so users can explicitly opt
out of Agent Framework telemetry, with a sticky-disable flag that makes the
user's intent "leading" — no framework code path (foundry's
configure_azure_monitor, configure_otel_providers, enable_instrumentation,
enable_sensitive_telemetry, or direct OBSERVABILITY_SETTINGS.enable_*
writes) can re-enable instrumentation until the user explicitly clears the
disable with enable_instrumentation(force=True) /
enable_sensitive_telemetry(force=True).

Also addresses the two remaining unresolved review threads on the PR:
1. test_observability_settings_defaults_instrumentation_true pins the new
   "ENABLE_INSTRUMENTATION defaults to True when env unset" behavior.
2. test_enable_instrumentation_reads_env_sensitive_data restores coverage
   for the post-import load_dotenv() fallback path.

Implementation:
- ObservabilitySettings.enable_instrumentation / enable_sensitive_data become
  properties backed by _enable_*. While _user_disabled is True, the getters
  return False and the setters drop True writes (defense in depth so third-
  party writes can't subvert the disable).
- Public is_user_disabled read-only property lets integrations (e.g. foundry's
  configure_azure_monitor) cheaply check the disable state without poking at
  privates.
- enable_instrumentation() and enable_sensitive_telemetry() short-circuit with
  an info log when disabled; gain a force=True kwarg that clears the disable.
- configure_otel_providers() still creates providers / exporters / views so a
  later force-enable can use them, but logs an info message when called while
  disabled.
- Foundry's FoundryChatClient.configure_azure_monitor and
  FoundryAgent.configure_azure_monitor early-return when the user has
  disabled, so Azure Monitor's global providers aren't installed unnecessarily.

Tests: 11 new tests covering default-on, env re-read at call time, sticky
behavior against each re-enable surface (enable_instrumentation,
enable_sensitive_telemetry, configure_otel_providers, direct attribute
writes), force=True override, re-arming the disable, and the __all__ export.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: document disable_instrumentation() and force=True paths

Add a "Disabling instrumentation" section to the observability sample README
that walks through:

- The distinction between the ENABLE_INSTRUMENTATION env var (initial,
  non-sticky) and disable_instrumentation() (process-wide, sticky).
- Why the sticky semantics matter: framework integrations like
  FoundryChatClient.configure_azure_monitor() can call
  enable_instrumentation() as part of their setup, and the user's opt-out
  needs to win.
- All five surfaces guarded by the sticky disable (property reads, public
  enable functions, configure_otel_providers, direct attribute writes,
  is_user_disabled-aware integrations).
- The force=True escape hatch on both enable_instrumentation() and
  enable_sensitive_telemetry().
- How third-party integrations should consult OBSERVABILITY_SETTINGS.is_user_disabled.
- The limits of the disable (does not tear down existing providers /
  in-flight spans / third-party instrumentation, does not persist across
  processes).

Cross-links the new section from the ENABLE_INSTRUMENTATION row in the env
vars table.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: soften disable_instrumentation() overclaim about telemetry guarantees

Replace 'no telemetry will be emitted no matter what' (which is too strong,
since callers can still pass force=True or mutate private attributes) with
language framing the disable as a user-intent contract that library and
framework code is expected to honor: the framework actively short-circuits
the public enable paths, force=True and private-attribute writes are
acknowledged as out-of-contract escape hatches that integrations should
not use on the user's behalf.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: correct observability Dependencies section

- opentelemetry-sdk is no longer a hard dependency; it is lazily imported by
  create_resource(), create_metric_views(), and configure_otel_providers()
  with a clear ImportError when missing. Day-to-day instrumentation works
  with opentelemetry-api alone provided some other component configures the
  global OpenTelemetry providers (Azure Monitor, an APM agent, application
  bootstrap, etc.).
- opentelemetry-semantic-conventions-ai is no longer used anywhere in the
  source; remove it from the listed dependencies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: replace stale observability migration guide with current PR's only relevant migration

The old guide documented the move away from setup_observability(otlp_endpoint=...)
which was an earlier-release API change unrelated to this PR and stale enough that
it's more confusing than helpful at this point. Replace it with a short note on the
single migration this PR introduces: callers of
enable_instrumentation(enable_sensitive_data=True) should switch to
enable_sensitive_telemetry(). Cross-link to the Disabling instrumentation section
for the rare 'force on without enabling sensitive data' use case where
enable_instrumentation() still applies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Tao Chen · 2026-05-20 11:52:08 +00:00

72a6157c6a

Python: feat: add agent-framework-monty (Monty-backed CodeAct provider) (#5915 )

* Python: feat: add agent-framework-monty (Monty-backed CodeAct)

New alpha package that wraps pydantic-monty (a Rust-based Python
interpreter) behind the same CodeAct API surface as
agent-framework-hyperlight, so users can swap providers with minimal
code change.

Public API (agent_framework_monty):
- MontyCodeActProvider — ContextProvider that injects a run-scoped
  execute_code tool plus dynamic CodeAct instructions.
- MontyExecuteCodeTool — standalone FunctionTool for mixed-tool agents
  or manual static wiring.
- FileMount / FileMountInput / MountMode — public types mirroring the
  Hyperlight names, with Monty's mode (read-only/read-write/overlay)
  and write_bytes_limit on FileMount.

Constructor kwargs (both classes) mirror Hyperlight where possible:
tools, approval_mode, workspace_root, file_mounts; plus a Monty-only
resource_limits forwarding ResourceLimits to Monty.start().

Filesystem flow:
- workspace_root auto-mounts at /input (read-write), matching Hyperlight.
- file_mounts accepts string shorthand, (host, mount) tuple, or
  FileMount with mode + write cap.
- Files written under read-write mounts are scanned post-execution and
  returned as Content.from_data items (mirrors Hyperlight /output).
- overlay mounts buffer writes in-memory; read-only mounts reject writes.

Internals:
- _monty_bridge.InlineCodeBridge ports the inline (non-durable) bridge
  from anthonychu/maf-codeact-monty-python; handles FunctionSnapshot /
  FutureSnapshot pause/resume, dispatches direct typed calls + the
  call_tool fallback, forwards mount/limits to Monty.start(...).
- generate_type_stubs emits per-tool stubs so Monty's `ty` type-checker
  rejects bad calls before any host tool runs.

Alpha-policy compliance (per python-package-management skill):
- Added agent-framework-monty = { workspace = true } to root
  pyproject.toml.
- Added row to python/PACKAGE_STATUS.md.
- Added monty entry under Experimental in python/AGENTS.md.
- NOT added to core[all]; NO agent_framework.monty lazy shim (deferred
  to beta promotion).

Samples (three sets, import from agent_framework_monty directly):
- samples/02-agents/context_providers/code_act/monty_code_act.py
  (provider pattern) + updated local README.
- samples/02-agents/tools/monty_code_interpreter/ (standalone +
  manual-wiring + README).
- samples/04-hosting/foundry-hosted-agents/responses/11_monty_codeact/
  (full hosted-agent layout with uv-based pyproject.toml + Dockerfile,
  Azure Monitor wiring via APPLICATIONINSIGHTS_CONNECTION_STRING +
  enable_instrumentation, ENABLE_INSTRUMENTATION and
  ENABLE_SENSITIVE_DATA env vars). The alpha wheel is vendored into
  ./wheels/ (gitignored) via vendor-wheel.sh; new row added to the
  parent Responses-API README.

Tests:
- 28 hermetic unit tests (stubbed pydantic_monty).
- 18 integration tests marked @pytest.mark.integration, auto-skipped
  when pydantic_monty is unimportable; exercise the real Monty
  runtime: print round-trip, last-expression value, direct typed
  tool dispatch, call_tool fallback, async tool, asyncio.gather
  parallelism, ty type-check rejection, OS blocked by default,
  workspace_root read+write capture, read-only / overlay mount
  semantics, resource_limits.max_duration_secs abort, approval
  gating end-to-end, full Agent run with a scripted chat client.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix: monty FileMount test compares against the normalized POSIX path

The shorthand string mount goes through _normalize_mount_path, which
rewrites Windows drive letters like 'C:\\Users\\...' into
'/C:/Users/...' (POSIX-style). The Windows CI runners surfaced this
because tmp_path resolves to a backslashed Windows path; the test was
comparing against the raw str(host_a) instead of the normalized form.

Compare against _normalize_mount_path(str(host_a)) so the assertion is
platform-independent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix: address PR #5915 review feedback

- _execute_code_tool docstring: clarify that the Monty backend supports
  scoped filesystem access via workspace_root / file_mounts (blocked by
  default).
- _to_monty_mount: import pydantic_monty lazily through load_monty so
  missing-dependency errors surface as the same actionable RuntimeError
  the rest of the package raises (not a bare ImportError at module load).
  Renamed _load_monty -> load_monty for the same reason.
- _python_type_repr: emit None for type(None) instead of Any, and
  normalize both typing.Union[...] and PEP-604 X | Y to PEP-604 syntax
  so Optional[X] / Union[..., None] / -> None signatures round-trip
  correctly through ty validation. Added a regression test.
- _PrintCollector: track a running character count instead of
  recomputing sum(len(c) for c in self.chunks) per callback. Eliminates
  the O(n^2) cost on print-heavy code.
- Instructions: mention that the value of the final expression is also
  returned alongside captured stdout (matches actual behavior).
- 11_monty_codeact Dockerfile: pin ghcr.io/astral-sh/uv to 0.11.6
  instead of :latest for reproducible builds.
- 11_monty_codeact README: replace the bare "see parent README" pointer
  with sample-specific steps (./vendor-wheel.sh + uv sync + uv run),
  since the sample uses pyproject.toml + a vendored wheel rather than
  requirements.txt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: sample: 11_monty_codeact installs agent-framework-monty from PyPI

Drop the vendored-wheel scaffolding now that agent-framework-monty is on
PyPI as an alpha (1.0.0a*) release:

- pyproject.toml: remove [tool.uv.sources] override; keep [tool.uv]
  prerelease = "allow" so uv pulls the alpha automatically.
- Dockerfile: drop the COPY wheels/ step.
- README: drop the ./vendor-wheel.sh setup step and the
  not-yet-on-PyPI warning.
- Delete vendor-wheel.sh and the gitignored wheels/ directory.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(monty): harden post-execution file capture against symlink escape

Same class of issue as the MSRC-reported Hyperlight finding: the
post-execution capture walked workspace_root with Path.rglob() +
is_file() + read_bytes() - all of which follow symlinks. An attacker
who controls the workspace (cloned repo, extracted archive, shared
workspace) could pre-place `workspace/leak.txt -> /etc/passwd` or
`workspace/outside_dir -> /etc/` and have host files surface as
captured Content items.

Monty's mount layer already rejects symlink reads from inside the
sandbox across all three modes (verified empirically), so the runtime
path was safe. This commit closes the post-execution scan path.

Changes:
- New `_iter_real_files(root)` walker that uses iterdir() +
  is_symlink() to skip symlinks at every directory level and yields
  only real files. Replaces the previous `host_root.rglob("*")` calls
  in both `_snapshot_writable_mounts` and `_capture_written_files`.
- Use `Path.lstat()` instead of `Path.stat()` so size/mtime can never
  be taken from a symlink target.
- Three new integration tests reproducing the MSRC attack shape
  against the workspace_root flow: symlink-to-file outside workspace,
  symlink-to-directory outside workspace, and a guard ensuring
  legitimate sandbox writes are still captured when symlinks are
  present.

Per user request, hyperlight is untouched in this commit (separate fix).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(monty): skip symlink regression tests when unsupported

Apply the same Windows-CI safety guard as the hyperlight fix in PR #5919:
the three symlink integration tests create symlinks via Path.symlink_to(),
which fails with OSError / NotImplementedError on unprivileged Windows
runners. Add a local _symlinks_supported helper (mirroring the one in
packages/core/tests/core/test_skills.py) and pytest.skip when symlinks
aren't available, so the tests no longer fail for environment reasons.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix(monty): address PR #5915 follow-up review feedback

- _invoke_tool: drop the inspect.iscoroutinefunction(...) branch and
  always `await self.tool_map[name](**kwargs)`. Every entry in
  tool_map is `partial(FunctionTool.invoke, skip_parsing=True)` and
  FunctionTool.invoke is `async def`, so the branching was dead code -
  and on Python versions affected by cpython#98590,
  iscoroutinefunction(partial(bound_async_method, ...)) returns False,
  causing the bridge to take the asyncio.to_thread path, return an
  unawaited coroutine, and surface it as a JSON-serialization failure
  for every tool call. Added a regression test
  test_invoke_tool_awaits_partial_wrapped_async_method.

- generate_type_stubs: skip tools whose name is not a valid Python
  identifier or is a Python keyword. FunctionTool.name has no upstream
  validation, so a name like "weird-name" produced a syntax error in
  the stubs and a name like "broken\n    pass\nasync def injected"
  would inject arbitrary stub source. Non-identifier names stay
  reachable via `call_tool("weird-name", ...)` at runtime; they just
  don't get type-checked stubs. Added regression test
  test_generate_type_stubs_skips_non_identifier_tool_names.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-05-20 00:35:23 +00:00

4609535e22

Python: Improve the handling of intermediate outputs for workflows and orchestrations (#5623 )

* Improve the handling of intermediate outputs for workflows and orchestrations

* Address PR review feedback on intermediate output forwarding

- Switch workflow.as_agent() forwarding to an explicit allowlist of {output,
  intermediate, data, request_info} so orchestration-internal events
  (group_chat, handoff_sent, magentic_orchestrator) stay inside the workflow
  instead of leaking into agent responses via str(data) coercion.
- Stop raising on intermediate AgentResponseUpdate in non-streaming run();
  surface the partial as a Message with text_reasoning content. The defensive
  raise still applies to terminal output events, where Update payloads would
  corrupt message ordering.
- Extend the DevUI workflow-event mapper so intermediate yields wrapping
  plain strings, Messages, and list[Message] render as visible output items
  instead of generic completed-trace events.
- Add orchestration coverage for GroupChat, Handoff, and Magentic builders
  (default vs intermediate_outputs=True; structural where end-to-end is heavy).

* Lift output-designation policy into a value type

Replace the ``Workflow._output_executors`` list and the
``RunnerContext.should_label_as_intermediate`` Protocol method with a single
immutable ``OutputDesignation`` value type owned by ``Workflow``. Thread the
designation as a parameter through the existing call chain (Runner ->
EdgeRunner -> Executor -> WorkflowContext) so ``yield_output`` consults the
threaded snapshot directly rather than calling back into the runner context.

Removes the ``InProcRunnerContext._workflow`` back-reference and the
``WorkflowBuilder.build()`` assignment that wired it up. Adds the public
predicate ``Workflow.is_terminal_executor(executor_id)`` for external
observers; ``OutputDesignation`` itself stays package-internal.

Key decisions
- ``OutputDesignation.designated`` is ``frozenset[str] | None`` -- ``None``
  preserves legacy "every yield is type='output'" behavior, any frozenset
  (including empty) opts into strict mode. The ``DeprecationWarning`` for
  legacy mode at build time is unchanged.
- ``output_designation`` is an optional parameter on ``Runner``,
  ``EdgeRunner.send_message``, ``EdgeRunner._execute_on_target``,
  ``Executor.execute``, ``Executor._create_context_for_handler``, and
  ``WorkflowContext.__init__``. Each defaults to legacy ``OutputDesignation()``
  so direct callers (Azure Functions ``CapturingRunnerContext``,
  ``test_runner`` recording fixtures) keep working without ceremony.
- The workflow-level filter in ``_run_core`` reads ``self._output_designation``
  live, preserving today's semantics where mutating the designation after
  build still affects subsequent runs (used by two existing tests).
- ``Workflow.to_dict()`` continues to emit ``"output_executors":
  list[str] | None`` (sorted from the frozenset). Checkpoint format unchanged.

Files changed
- _workflow.py: add ``OutputDesignation`` dataclass; replace
  ``_output_executors`` with ``_output_designation``; add
  ``is_terminal_executor``; delete ``_should_yield_output_event``.
- _runner_context.py: drop ``should_label_as_intermediate`` Protocol method
  and ``InProcRunnerContext`` impl; drop ``_workflow`` back-reference.
- _workflow_builder.py: remove ``context._workflow = workflow`` assignment.
- _runner.py, _edge_runner.py, _executor.py, _workflow_context.py: thread
  ``output_designation`` parameter through the call chain.
- tests/workflow/test_output_designation.py (new): three-state coverage of
  the value type plus the public predicate delegation.
- tests/workflow/test_workflow_builder.py, test_validation.py,
  test_workflow.py, test_runner.py and
  orchestrations/tests/test_orchestration_intermediate_vs_terminal.py:
  switch probes from ``_output_executors`` set checks to
  ``get_output_executors`` / ``is_terminal_executor``; update two
  post-build mutation tests to set ``_output_designation`` instead.

Verification
- core/tests/workflow/, orchestrations/tests/, azurefunctions/tests/:
  1119 passed, 42 skipped, 2 xfailed.
- ``uv run poe lint``: clean.
- ``uv run poe typing``: only the pre-existing
  ``_AGENT_FORWARDED_EVENT_TYPES`` pyright warning from 394bcd607 remains.

Notes for next iteration
- The builder's own ``_output_executors`` attribute (``list[Executor |
  SupportsAgentRun]``) is intentionally untouched; the issue scoped the
  rename to the workflow attribute.
- Adjacent review candidates (twin ``WorkflowAgent`` translators,
  ``_AGENT_FORWARDED_EVENT_TYPES`` kind classifier,
  ``_event_origin_context`` ContextVar removal, ``WorkflowEvent`` ADT
  split, legacy-mode removal) remain out of scope.

* Add explicit workflow output designation

Key decisions

- Extend the internal OutputDesignation value type from terminal-only membership to output/intermediate/hidden classification. Legacy mode remains outputs=None, so workflows built without output_executors or intermediate_executors still label every yield_output as type='output'.

- WorkflowBuilder now accepts intermediate_executors. Providing either designation enters explicit mode; output executors emit output, intermediate executors emit intermediate, and unlisted yield_output payloads are hidden from caller-facing events while remaining in executor_completed data.

- Empty explicit designation, duplicate entries, overlaps, unknown executors, and designated executors without workflow output annotations fail build validation. Existing orchestration builders pass intermediate-capable participants through intermediate_executors to preserve current intermediate_outputs behavior until participant-oriented designation lands.

Files changed

- packages/core/agent_framework/_workflows/_workflow.py, _workflow_builder.py, _workflow_context.py, _validation.py, _events.py

- packages/core/tests/workflow/test_output_designation.py, test_output_executors_contract.py, test_strict_mode_event_labeling.py, test_validation.py, test_workflow.py, test_workflow_agent_intermediate.py

- packages/orchestrations/agent_framework_orchestrations/_sequential.py, _concurrent.py, _group_chat.py, _magentic.py

- packages/core/AGENTS.md

Verification

- uv run pytest packages/core/tests/workflow packages/orchestrations/tests packages/devui/tests/devui/test_mapper.py -q

- uv run pytest packages/azurefunctions/tests -q

- uv run poe lint

- uv run poe typing fails only on pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.

Notes for next iteration

- issues/03-core-workflow-explicit-designation.md was moved to issues/done but issues/ remains untracked and intentionally excluded from this commit.

- Slice 4 should tighten workflow.as_agent() mapping for hidden emissions and streaming-only update payloads; Slice 5 should replace orchestration intermediate_outputs with participant-oriented designation.

* Tighten workflow-as-agent output mapping

Key decisions

- Treat AgentResponseUpdate as a streaming-only payload across the workflow.as_agent() adapter, so non-streaming agent runs now reject both terminal output and intermediate workflow events carrying updates.
- Keep streaming classification behavior explicit: terminal update payloads remain normal text content, while intermediate update payloads are rewritten to text_reasoning content.
- Add explicit-mode coverage proving hidden yield_output emissions do not appear in non-streaming AgentResponse messages or streaming AgentResponseUpdate chunks.

Files changed

- packages/core/agent_framework/_workflows/_agent.py
- packages/core/tests/workflow/test_workflow_agent_intermediate.py

Verification

- uv run pytest packages/core/tests/workflow/test_workflow_agent_intermediate.py -q
- uv run pytest packages/core/tests/workflow/test_workflow_agent.py packages/core/tests/workflow/test_workflow_agent_intermediate.py -q
- uv run pytest packages/core/tests/workflow packages/orchestrations/tests packages/devui/tests/devui/test_mapper.py -q
- uv run poe lint
- uv run poe typing fails only on the pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.

Blockers or notes for next iteration

- issues/04-workflow-as-agent-output-mapping.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
- Slice 5 should replace orchestration intermediate_outputs with participant-oriented designation.

* Add orchestration participant output designation

Key decisions

- Replace orchestration intermediate_outputs with participant-oriented output_participants and intermediate_participants across Sequential, Concurrent, GroupChat, Magentic, and Handoff builders.
- Keep synthetic final executors terminal by default for Concurrent, GroupChat, and Magentic; keep Sequential's final participant terminal by default; keep Handoff participants terminal by default.
- Centralize participant designation validation for empty explicit designation, duplicates, overlaps, and unknown participants, then map validated participants to workflow output/intermediate executors.

Files changed

- packages/orchestrations/agent_framework_orchestrations/_participant_designation.py
- packages/orchestrations/agent_framework_orchestrations/_sequential.py
- packages/orchestrations/agent_framework_orchestrations/_concurrent.py
- packages/orchestrations/agent_framework_orchestrations/_group_chat.py
- packages/orchestrations/agent_framework_orchestrations/_magentic.py
- packages/orchestrations/agent_framework_orchestrations/_handoff.py
- packages/orchestrations/tests/test_orchestration_intermediate_vs_terminal.py
- packages/orchestrations/tests/test_magentic.py

Blockers or notes for next iteration

- issues/05-orchestration-participant-designation.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
- Slice 7 should migrate samples and docs away from intermediate_outputs to the new participant designation API.
- uv run poe typing still fails only on the pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.

* Migrate samples to explicit output designation

Key decisions

- Replace sample usage of the removed orchestration intermediate_outputs boolean with participant-oriented intermediate_participants designation.
- Update raw workflow guidance to show output_executors together with intermediate_executors, and document that unlisted yields are hidden in explicit designation mode.
- Keep orchestration final outputs terminal while streaming designated participant responses as intermediate progress, including workflow.as_agent() samples where intermediates map to text_reasoning content.
- Refresh workflow and orchestration README guidance plus the changelog reference so public docs no longer point users at intermediate_outputs.

Files changed

- CHANGELOG.md
- packages/orchestrations/README.md
- samples/README.md
- samples/03-workflows/README.md
- samples/03-workflows/control-flow/intermediate_vs_terminal_outputs.py
- samples/03-workflows/orchestrations/README.md
- samples/03-workflows/orchestrations/group_chat_agent_manager.py
- samples/03-workflows/orchestrations/group_chat_philosophical_debate.py
- samples/03-workflows/orchestrations/group_chat_simple_selector.py
- samples/03-workflows/orchestrations/magentic.py
- samples/03-workflows/orchestrations/magentic_human_plan_review.py
- samples/03-workflows/orchestrations/sequential_chain_only_agent_responses.py
- samples/03-workflows/agents/group_chat_workflow_as_agent.py
- samples/03-workflows/agents/magentic_workflow_as_agent.py
- samples/03-workflows/agents/sequential_workflow_as_agent.py
- samples/semantic-kernel-migration/orchestrations/group_chat.py
- samples/semantic-kernel-migration/orchestrations/magentic.py

Blockers or notes for next iteration

- issues/07-samples-and-docs-explicit-output-designation.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
- issues/06-devui-intermediate-event-rendering.md remains present and appears already satisfied by existing DevUI mapper/tests from the prior implementation slice.
- PRD-explicit-workflow-output-designation.md remains untracked and intentionally excluded from this commit.

* Render DevUI intermediate workflow outputs

Key decisions

- Preserve workflow output designation metadata on visible DevUI output messages and text deltas so intermediate/data emissions remain distinguishable from terminal output.
- Render intermediate workflow message items in the execution timeline using executor metadata, while excluding them from the final workflow result aggregation.
- Keep terminal output message rendering unchanged and retain legacy data events on the intermediate compatibility path.

Files changed

- packages/devui/agent_framework_devui/_mapper.py
- packages/devui/frontend/src/components/features/workflow/execution-timeline.tsx
- packages/devui/frontend/src/components/features/workflow/workflow-view.tsx
- packages/devui/frontend/src/types/openai.ts
- packages/devui/tests/devui/test_mapper.py

Blockers or notes for next iteration

- issues/06-devui-intermediate-event-rendering.md was moved to issues/done/ but issues/ remains untracked and intentionally excluded from this commit.
- PRD-explicit-workflow-output-designation.md remains untracked and intentionally excluded from this commit.
- uv run poe typing still fails only on the pre-existing packages/core/agent_framework/_workflows/_agent.py _AGENT_FORWARDED_EVENT_TYPES private-use pyright error.

* Fix mypy

* Clarify orchestration participant output config

* Rename participant output kwargs for clarity

output_participants -> final_output_from, intermediate_participants ->
intermediate_output_from. The old names read like categories of
participant; the new names make it clear the kwarg designates which
participants' outputs surface as final vs. intermediate events.

* Rename core workflow output kwargs with deprecation shim

Adds final_output_from / intermediate_output_from as canonical kwargs on
Workflow and WorkflowBuilder. Old output_executors / intermediate_executors
kwargs continue to work but emit DeprecationWarning via a shared coalesce
helper that also rejects supplying both. Wire-format keys in to_dict()
stay as output_executors / intermediate_executors so checkpoint
compatibility is preserved.

Internal call sites in orchestrations and samples updated to the new
names so users following sample code learn the canonical vocabulary;
legacy callers still work with a one-shot warning.

* Suppress pyright reportPrivateUsage on cross-module sentinel import

* Update docstrings

* Propagate sub-workflow intermediate outputs, fix handoff/sequential intermediate-only designation, and shore up tests, sample, and docstrings around the intermediate output contract.

* Add canonical workflow output_from selection

Key decisions:\n- Make output_from the canonical workflow-output allow-list and keep output_executors/final_output_from as deprecated compatibility aliases.\n- Treat empty output_from/intermediate_output_from lists as explicit selections and keep validation responsible for empty, duplicate, overlap, and unknown selections.\n- Remove the branch-only public intermediate_executors WorkflowBuilder kwarg while preserving legacy wire keys in to_dict().\n\nFiles changed:\n- packages/core/agent_framework/_workflows/_workflow.py\n- packages/core/agent_framework/_workflows/_workflow_builder.py\n- packages/core/agent_framework/_workflows/_workflow_context.py\n- packages/core/agent_framework/_workflows/_agent.py\n- packages/core/agent_framework/_workflows/_agent_executor.py\n- packages/core/tests/workflow/* output-selection coverage updates\n- packages/core/AGENTS.md\n- issues/done/001-canonical-list-based-output-selection.md\n\nBlockers/notes:\n- Orchestration builders still pass final_output_from internally; follow-up issue 004 should migrate them to output_from.\n- Legacy omitted-selection behavior and explicit all/all_other literals are left for issues 002 and 003.

* Add explicit all workflow output selection

Key decisions:
- Treat output_from='all' as an explicit workflow-output selection sentinel and expand it at build time to executors with declared workflow output types.
- Keep omitted output selections in legacy all-output mode with a deprecation warning that names output_from and intermediate_output_from and points to output_from='all'.
- Reject intermediate_output_from='all' at construction because the all-output literal is output-only for this issue.

Files changed:
- packages/core/agent_framework/_workflows/_workflow_builder.py
- packages/core/tests/workflow/test_output_executors_contract.py
- issues/done/002-explicit-all-output-and-legacy-migration.md

Blockers/notes:
- all_other intermediate-output selection remains for issue 003.
- Workflow-as-agent/orchestration parity remains for issue 004.

* Add all-other intermediate output selection

Key decisions:
- Treat intermediate_output_from='all_other' as an explicit intermediate-output selection sentinel and expand it at build time after the workflow graph is complete.
- Expand all_other to output-capable executors not selected by output_from; omitted or empty output_from selects no workflow outputs, while output_from='all' leaves an empty intermediate selection.
- Keep output_from='all_other' invalid so all_other remains intermediate-output-only and runtime classification still receives concrete executor-id sets.

Files changed:
- packages/core/agent_framework/_workflows/_workflow_builder.py
- packages/core/tests/workflow/test_output_executors_contract.py
- issues/done/003-all-other-intermediate-output-selection.md

Blockers/notes:
- Workflow-as-agent and orchestration parity remains for issue 004.
- Full documentation updates remain for issue 005.

* Add orchestration output selection parity

Key decisions:
- Expose output_from on sequential, concurrent, group chat, handoff, and magentic builders while keeping final_output_from as a deprecated compatibility alias.
- Resolve orchestration participant selections through the same explicit rules as workflows: output_from='all', intermediate_output_from='all_other', hidden unselected participant payloads, and overlap/duplicate/unknown/invalid-literal validation.
- Continue preserving documented orchestration defaults by always designating each pattern's terminal internal executor where applicable.

Files changed:
- packages/orchestrations/agent_framework_orchestrations/_participant_output_config.py
- packages/orchestrations/agent_framework_orchestrations/_sequential.py
- packages/orchestrations/agent_framework_orchestrations/_concurrent.py
- packages/orchestrations/agent_framework_orchestrations/_group_chat.py
- packages/orchestrations/agent_framework_orchestrations/_handoff.py
- packages/orchestrations/agent_framework_orchestrations/_magentic.py
- packages/orchestrations/agent_framework_orchestrations/_orchestration_request_info.py
- packages/orchestrations/tests/test_orchestration_intermediate_vs_terminal.py
- issues/done/004-workflow-as-agent-and-orchestration-parity.md

Blockers/notes:
- Full documentation and sample migration wording remains for issue 005.
- Existing tests that intentionally use final_output_from now emit the new deprecation warning.

* Document workflow output selection contract

Key decisions:
- Use Workflow Output and Intermediate Output as the developer-facing terms for selected caller-facing emissions.
- Document output_from and intermediate_output_from as the canonical API, with output_from as an allow-list and unselected payloads hidden unless explicitly selected as intermediate.
- Add scenario and invalid-selection tables for workflow and orchestration docs, including legacy omission warnings, output_from='all', intermediate_output_from='all_other', list selections, invalid literals, overlap, duplicates, unknown selections, and empty explicit selections.
- Migrate samples away from final_output_from and output_executors except where compatibility aliases are explicitly documented.

Files changed:
- packages/core/AGENTS.md
- packages/orchestrations/README.md
- packages/orchestrations/agent_framework_orchestrations/_handoff.py
- packages/orchestrations/agent_framework_orchestrations/_sequential.py
- samples/03-workflows/README.md
- samples/03-workflows/control-flow/intermediate_vs_terminal_outputs.py
- samples/03-workflows/human-in-the-loop/agents_with_approval_requests.py
- samples/03-workflows/orchestrations/README.md
- samples/04-hosting/foundry-hosted-agents/responses/05_workflows/main.py
- scripts/sample_validation/create_dynamic_workflow_executor.py
- issues/done/005-document-output-selection-contract.md

Blockers/notes:
- Direct full Ruff on scripts/sample_validation/create_dynamic_workflow_executor.py still reports pre-existing docstring/print/line-length issues outside this docs migration; syntax-focused checks for changed files pass.
- No remaining AFK issue files are present under issues/.

* Latest updates

* Typing fixes

* Cleanup

Evan Mattson · 2026-05-19 00:15:25 +00:00

3bbc81554b

Python: New Foundry Hosted Agents samples: RAG, Skills, and Memory (#5822 )

* WIP: Add rag sample; need deployment testing

* Rag sample ready

* Add Foundry Skills sample

* WIP: Foundry memory

* Done: Foundry Memory

* Address Copilot comments

* Fix README

* Restore uv.loack

Tao Chen · 2026-05-15 17:31:57 +00:00

da308f5f1e

Python: Fix A2A v1.0 non-streaming response and sample runtime issues (#5849 )

- Fix non-streaming empty response by accumulating intermediate WORKING
  status updates and flushing them when an empty terminal event arrives
- Fix sample agent_executor.py to enqueue Task before status events
  (required by v1.0 ActiveTask validation)
- Fix create_jsonrpc_routes() calls to include required rpc_url param
- Fix TYPE_CHECKING imports in sample agent_definitions.py
- Add tests for non-streaming content accumulation behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-05-14 22:28:02 +00:00

68357b0250

Python: Support list[str] arguments for file-based skill scripts (#5850 )

Port of .NET PR #5475. Broadens the args type from dict[str, Any] | None
to dict[str, Any] | list[str] | None across the skill script API surface,
enabling CLI-style argv forwarding to subprocess scripts.

Changes:
- SkillScript.run(), InlineSkillScript.run(), FileSkillScript.run(): widen
  args type; InlineSkillScript rejects list with TypeError
- FileSkillScript.parameters_schema: returns array-of-strings schema
- FileSkill.content: appends <scripts> block with parameters_schema
- SkillScriptRunner protocol: widen args type
- SkillsProvider._run_skill_script: widen args type
- run_skill_script tool schema: accept object, array, or null
- subprocess_script_runner sample: accept list[str], reject dict
- class_based_skill sample: fix missing SkillFrontmatter wrapper
- Standardize 'folder' to 'directory' in docstrings (#5712)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SergeyMenshykh · 2026-05-14 17:58:10 +00:00

7432105ebe

[Python] [Breaking] Extract skill spec metadata into SkillFrontmatter (#5775 )

* Fix Skill docstring consistency and spelling

- Add ClassSkill to Skill class docstring concrete implementations list
- Normalize 'defence' to 'defense' for American English consistency
- Remove extra blank line in InlineSkill docstring example

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix E501 line-too-long lint error in test_skills.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix stale test section header to reflect SkillFrontmatter API

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix metadata children overriding top-level frontmatter fields

Scope YAML_KV_RE to column-0 keys only so indented children
under metadata: are not mistakenly parsed as top-level fields.
Add regression test and spec fields to sample SKILL.md files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SergeyMenshykh · 2026-05-13 20:35:52 +00:00

ab09246dc4

Python: [BREAKING] Migrate agent-framework-a2a to a2a-sdk v1.0 (#5752 )

* Python: Migrate agent-framework-a2a to a2a-sdk v1.0

Upgrade the a2a-sdk dependency from v0.3.x to v1.0.0 and migrate all
source, tests, samples, and documentation to the v1.0 API.

Key changes:
- Dependency: a2a-sdk>=1.0.0,<2 (was >=0.3.5,<0.3.24)
- Types are now protobuf-based: Part replaces TextPart/FilePart/DataPart
- Enums use SCREAMING_SNAKE_CASE (e.g. TaskState.TASK_STATE_COMPLETED)
- Roles: Role.ROLE_AGENT, Role.ROLE_USER
- Client: SendMessageRequest wrapper, subscribe() replaces resubscribe()
- Server: A2AStarletteApplication replaced by Starlette + route factories
- DefaultRequestHandler now requires agent_card parameter
- TaskUpdater: final parameter removed, add_artifact gains last_chunk
- AgentCard.url removed; use supported_interfaces with AgentInterface
- Stream yields StreamResponse with WhichOneof('payload')

Closes #5661

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: validate fallback URL, remove unused task_id vars

- Raise ValueError with clear message when transport negotiation fails
  and no fallback URL is available (neither url arg nor supported_interfaces)
- Remove unused task_id local in status_update branch
- Inline artifact_event.task_id directly in artifact_update branch

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-05-11 22:46:12 +00:00

4ad96b64e7

Python: Upgrade github-copilot-sdk to v1.0.0b2 with new features (#5665 )

* Upgrade github-copilot-sdk to v1.0.0b1 and implement new features

- Bump github-copilot-sdk dependency from 0.2.1 to 1.0.0b1
- Fix breaking type renames: ErrorClass -> ToolExecutionCompleteError,
  Result -> ToolExecutionCompleteResult
- Add instruction_directories support in GitHubCopilotOptions (session-level)
- Add copilot_home support in GitHubCopilotSettings (client-level)
- Add sample: github_copilot_with_instruction_directories.py
- Update README with new env var and sample entry
- Add 8 new unit tests covering the new features (103 total, 96% coverage)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* mypy fix

* small fix

* Address PR feedback: fix resume path, remove copilot_home from Options, bump to beta.2

- Forward runtime_options through _resume_session (fixes silent drop of
  instruction_directories/model/etc on resumed sessions)
- Remove copilot_home from GitHubCopilotOptions (client-level setting only
  consumed at startup, not per-call)
- Bump github-copilot-sdk from 1.0.0b1 to 1.0.0b2
- Add test for instruction_directories override on resumed sessions
- Update existing resume test to match new _resume_session signature

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-05-07 21:43:47 +00:00

57fb32efc8

Python: Add ClassSkill for class-based skill definitions (#5678 )

* Python: Add ClassSkill for class-based skill definitions

Add ClassSkill abstract base class with decorator-based resource and script
discovery, porting .NET's AgentClassSkill (PRs #5027 and #5183) to Python.

- Add ClassSkill(Skill, ABC) with instructions abstract property, cached
  content/resources/scripts properties
- Add @ClassSkill.resource and @ClassSkill.script static method decorators
  for auto-discovery of methods and properties
- Extract _build_skill_content() and _create_resource_element() shared
  helpers from InlineSkill for reuse
- Add _discover_marked_members() for scanning class hierarchies
- Add _make_method_name() for Python-to-skill name conversion
- Add class_based_skill sample (UnitConverterSkill)
- Update mixed_skills sample with TemperatureConverterSkill
- Add 58 new tests covering ClassSkill, decorator discovery, property
  resources, inheritance, kwargs forwarding, and duplicate detection
- Export ClassSkill from agent_framework public API

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: replace try/except/continue with assignment to satisfy bandit B112

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* address PR review feedback

- Walk cls.__mro__ in _discover_marked_members for inherited property resources
- Use inspect.getattr_static for MRO-aware is_property check
- Return defensive copies from resources/scripts properties
- Raise TypeError on wrong decorator stacking order (@resource above @property)
- Log warning instead of silently swallowing descriptor errors during discovery
- Validate explicit name= at decoration time via _validate_member_name
- Add tests for all of the above

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix temperature converter skill: make resource necessary for script

Refactor TemperatureConverterSkill so the agent must read the
formulas resource (factor/offset) before calling the script,
aligning with the volume-converter pattern.

- Resource: numeric factor/offset table instead of symbolic formulas
- Script: generic linear transform (value * factor + offset)
- Instructions: updated to reflect new workflow

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SergeyMenshykh · 2026-05-07 19:39:12 +00:00

1d94518f37

Python: Add support for function approval flow in Foundry hosted agent (#5666 )

* Add support for function approval flow in Foundry hosted agent

* Address comments

* Address comments

* Address comments

Tao Chen · 2026-05-07 14:55:26 +00:00

213491da66

Python: Remove bespoke Foundry toolbox helpers; standardize on MCP for toolbox consumption (#5671 )

* Remove Foundry toolbox helpers; standardize on MCP for toolbox consumption

- Remove RawFoundryChatClient.get_toolbox() and its fetch_toolbox import
- Remove fetch_toolbox, select_toolbox_tools, get_toolbox_tool_name,
  get_toolbox_tool_type, FoundryHostedToolType, ToolboxToolSelectionInput
  from agent_framework_foundry._tools
- Remove ExperimentalFeature.TOOLBOXES from _feature_stage.py (no consumers)
- Drop toolbox re-exports from agent_framework_foundry/__init__.py and
  agent_framework.foundry namespace
- Update _sanitize_foundry_response_tool docstring to remove toolbox framing;
  sanitization logic itself is unchanged
- Update _agent.py docstring: 'toolbox-fetched MCP' → 'hosted MCP'
- Delete tests/test_toolbox.py (all tests covered removed helpers)
- Update test_foundry_chat_client.py: rename/redoc tests that mentioned
  toolbox but test sanitization that remains
- Delete foundry_chat_client_with_toolbox.py (bespoke toolbox API sample)
- Delete foundry_toolbox_context_provider.py (relied on select_toolbox_tools)
- Rename foundry_chat_client_with_toolbox_mcp.py →
  foundry_chat_client_with_toolbox.py (canonical MCP pattern)
- Rewrite 04_foundry_toolbox/main.py to use MCPStreamableHTTPTool
- Update provider/README, context_providers/README, 04_foundry_toolbox/README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(samples): update 06_files sample to consume toolbox via MCP (#5670)

Replace removed get_toolbox/select_toolbox_tools APIs with
MCPStreamableHTTPTool, using allowed_tools=["code_interpreter"] to
select only the code interpreter from the toolbox endpoint.

Update .env.example and README to use FOUNDRY_TOOLBOX_ENDPOINT
instead of TOOLBOX_NAME.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(foundry): remove non-existent toolbox helper APIs from README (#5670)

Remove the 'fetch, optionally filter, and pass tools directly' pattern
from the FoundryChatClient toolbox documentation, as select_toolbox_tools
and get_toolbox were removed. Only the MCP endpoint pattern is documented.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(foundry): remove residual toolbox docstring references and reproduction report

Remove REPRODUCTION_REPORT.md (workflow artifact that should not be committed),
and update two remaining docstring references that still said 'toolbox reads'
/'toolbox definition' after the toolbox helpers were removed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Remove bespoke Foundry toolbox helpers; standardize on MCP for toolbox consumption

Fixes #5670

* fix(#5670): resolve toolbox endpoint from TOOLBOX_NAME fallback; add namespace regression tests

- Add _resolve_toolbox_endpoint() helper in 04_foundry_toolbox/main.py and
  06_files/main.py that prefers FOUNDRY_TOOLBOX_ENDPOINT but falls back to
  deriving the MCP URL from FOUNDRY_PROJECT_ENDPOINT + TOOLBOX_NAME — fixing
  the startup KeyError when agents are deployed via azd provision (which injects
  TOOLBOX_NAME, not FOUNDRY_TOOLBOX_ENDPOINT).
- Update 04_foundry_toolbox/.env.example to use FOUNDRY_TOOLBOX_ENDPOINT
  (consistent with 06_files).
- Add TOOLBOX_NAME env var to 06_files/agent.yaml so deployed agents have it
  available for the fallback derivation.
- Update both READMEs to document the two ways to supply the toolbox endpoint.
- Add test_foundry_namespace_no_longer_exposes_toolbox_helpers() with negative
  assertions for FoundryHostedToolType, get_toolbox_tool_name,
  get_toolbox_tool_type, and select_toolbox_tools — guarding against accidental
  re-introduction of removed symbols.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(samples): fail fast on empty FOUNDRY_TOOLBOX_ENDPOINT; add unit tests

Addresses review feedback for #5670:

- In _resolve_toolbox_endpoint() (04_foundry_toolbox/main.py and
  06_files/main.py) change the walrus-operator check from a truthy
  test to an explicit 'is not None' guard.  An explicitly set empty
  string now raises ValueError immediately with a clear message
  instead of silently falling through to the fallback URL
  construction.

- Add tests/samples/hosting/test_toolbox_endpoint.py covering both
  sample modules:
    (a) FOUNDRY_TOOLBOX_ENDPOINT set → returned as-is
    (b) FOUNDRY_TOOLBOX_ENDPOINT set to empty string → ValueError
    (c) fallback constructs URL from FOUNDRY_PROJECT_ENDPOINT + TOOLBOX_NAME,
        stripping trailing slashes
    (d) neither variable group set → KeyError

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback: remove extraneous test and docstring content

- Remove test_foundry_namespace_no_longer_exposes_toolbox_helpers (no longer warranted)
- Remove docstring from _agent.py _prepare_tools_for_openai (extraneous)
- Trim _chat_client.py _prepare_tools_for_openai docstring to one-liner (toolbox references no longer relevant)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: remove remaining extraneous docstring from RawFoundryChatClient._prepare_tools_for_openai

Address review comment on PR #5671: reviewer noted the description
isn't warranted now that toolbox helpers have been removed. Matches
the pattern in RawFoundryAgentChatClient which has no docstring.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-05-06 23:56:16 +00:00

e56e6dad4d

Python: [Breaking] Restructure agent skills to use multi-source architecture (#5584 )

* migrate skills to multi source architecture

* Fix ruff lint errors in skills module (ASYNC240, SIM108, E501)

- Use anyio.Path for async file I/O in _FileSkillResource.read()
- Use noqa: ASYNC240 for pure string os.path calls in async context
- Restore pre-commit if/else pattern in InlineSkillScript.run()
- Break long lines to fit 120-char limit in _skills.py and test_skills.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: collapse multi-line lambdas to single lines to fix pyright errors

The pyright ignore comments only suppress errors on the same line, so
multi-line lambdas left arguments on continuation lines uncovered.
Collapse both lambdas to single lines matching the existing load_skill
lambda pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: replace untyped lambdas with typed inner functions to fix pyright errors

Python lambdas cannot have type annotations, so pyright reports
reportUnknownLambdaType and reportUnknownArgumentType errors that
cannot be suppressed with inline ignore comments. Replace the
lambdas for read_skill_resource and run_skill_script with typed
inner async functions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address PR review feedback on docs and prompt template

- Update with_prompt_template() docstring to document the
  {resource_instructions} placeholder requirement
- Remove stray backslashes after {resource_instructions} and
  {runner_instructions} in DEFAULT_SKILLS_INSTRUCTION_PROMPT
- Update subprocess_script_runner docstring to reflect
  FileSkillScript.full_path usage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: replace dict[str, Skill] with Sequence[Skill] in SkillsProvider

Replace internal dict-based skills storage with Sequence[Skill] to
eliminate silent duplicate overwrites and simplify the code. Add
_find_skill helper for case-insensitive linear lookup.

Also fix pyright errors in tests by adding isinstance assertions
before accessing .function on SkillResource/SkillScript base types.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: add read-time resource path validation in _FileSkillsSource

Move security validation (path-traversal and symlink guards) for
file-based skill resources into _FileSkillsSource, restoring the
read-time checks that existed in main via _read_file_skill_resource.

- Add _get_validated_resource_path static method on _FileSkillsSource
  that validates containment, existence, and symlink safety
- _FileSkillsSource.get_skills() validates resource paths at discovery
  time via _get_validated_resource_path before passing to _FileSkillResource
- Move _normalize_resource_path, _is_path_within_directory, and
  _has_symlink_in_path from module-level into _FileSkillsSource as
  static methods (only used there)
- _FileSkillResource remains a simple path-to-content reader
- Add tests for _get_validated_resource_path security checks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: reject str/Path in SkillsProvider constructor to prevent str-as-Sequence ambiguity

Since str is a Sequence, passing a path string to the source parameter
would silently be treated as a sequence of characters instead of a
file source. Add an explicit TypeError with a helpful message pointing
callers to SkillsProvider.from_paths().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #5584 review feedback

- Remove .NET reference from _FileSkillResource docstring
- Fix inconsistent resource name example (references/FAQ.md -> references/FAQ)
- Simplify SkillsProvider usage in code_defined_skill sample (pass single skill directly)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* remove skillsproviderbuilder

* Update python/packages/core/agent_framework/_skills.py

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

* fix: remove dead code and fix sync function call in InlineSkillResource.read()

- Change await self.function() to self.function() for sync functions
  without **kwargs; async results are handled by inspect.isawaitable()
- Remove unreachable raise ValueError since __init__ already validates

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* remove full_path unnecessary property

* replace anyio with asyncio.to_thread for file I/O in _FileSkillResource

Replace anyio.Path usage with asyncio.to_thread + pathlib.Path since
anyio is not a direct dependency of core (transitive via mcp).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* simplify awaitable check to return directly

Use 'return await result' instead of assigning then returning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* address PR review feedback for skills refactoring

- Replace anyio with asyncio.to_thread + pathlib.Path for file I/O
- Simplify awaitable check to return directly
- Remove unnecessary function None guard in InlineSkillResource.read()
- Add assert for type narrowing on self.function

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* address PR review feedback for skills refactoring

- Replace anyio with asyncio.to_thread + pathlib.Path for file I/O
- Simplify awaitable checks to return directly
- Remove unnecessary function None guard in InlineSkillResource.read()
- Use typing.cast instead of assert for type narrowing
- Add caching behavior note to SkillsProvider docstring

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: move name/description from abstract properties to Skill.__init__

Replace abstract properties for name and description on the Skill ABC
with a base __init__ that validates and stores them as regular
attributes. This simplifies custom Skill subclasses (only content
remains abstract) and centralizes validation in the base class,
consistent with SkillResource and SkillScript base classes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

SergeyMenshykh · 2026-05-06 09:45:06 +00:00

be8d2619e4

Python: Add Python parity for InvokeMcpTool in declarative workflow (#5630 )

* Add Python parity for HttpRequestAction in declarative workflow

* Ran pyupgrade and pright to fix CI issues

* Fix conversation ID dot parsing for http executor

* Removed unnecessary export command

* Initial implementation of invoke mcp tool in python

* Update sample to support require approval to be toggled by environment variable.

* Fix cache and PR comments

* Update python/samples/03-workflows/declarative/invoke_mcp_tool/main.py

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

---------

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

Peter Ibekwe · 2026-05-05 20:16:03 +00:00

f25e81701d

Python: information-flow control prompt injection defense (#5331 )

* Python: Information-flow control based prompt injection defense (#5024)

* fides integration

* documentation

* documentation

* documentation

* human-approval on policy violation

* numenous hyena 'works'

* IFC based implementation

* minor edits in documentation

* rebasing the branch and running the email example

* Add security tests for IFC middleware

* Fix Role.TOOL NameError in approval handling

* tiered labelling scheme

* 3 tier labelling scheme in middleware

* Adapt security middleware to list[Content] tool results

* Refactor SecureAgentConfig as context provider and address Copilot review comments

* Update FIDES docs to reflect context provider pattern and update code for ContextProvider rename

* Fix security examples: use OpenAIChatClient instead of non-existent AzureOpenAIChatClient

* Address PR review: consolidate security modules, remove ContentLineage, update docs

* remove unrelated files

* remove comment from _tools.py and rename decision file

* Fix CI failures: Bandit B110, broken md links, hosted approval passthrough

* apply template to decision doc 0024

* minor fixes to decision doc 0024

---------

Co-authored-by: Aashish <t-akolluri@microsoft.com>

* Python: follow up FIDES security flow (#5330)

* Python: follow up FIDES security flow

Refine the secure approval path, mark the security classes with the FIDES experimental feature label, and clean up the related docs/tests. Also fix workspace-level validation regressions uncovered while running the full Python check suite.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: remove FIDES GitHub MCP sample

Drop the GitHub MCP security sample from the FIDES follow-up branch while keeping the remaining security docs and samples intact.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: fix paths and update FIDES implementation (#5352)

* Python: updated import naming and comment from review (#5421)

* updated import naming and comment from review

* Add approval replay None call-id test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Address PR 5331 comments and track sesssion while calling Agent in email_security_example (#5446)

* Address PR review: fix paths and update FIDES implementation

* Address PR comments and add session tracking in email example in samples

* Fix session creation and resolve merge conflict in docstring example

* Resolve merge conflict in docstring example

* Python: add test for empty-message pruning in approval result replacement (#5617)

Adds test coverage for the second-pass logic in
`_replace_approval_contents_with_results` that removes messages whose
`contents` list becomes empty after first-pass content removal.

Addresses review comment on PR #5331:
https://github.com/microsoft/agent-framework/pull/5331#discussion_r3129039445

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: shrutitople <shruti.tople@gmail.com>
Co-authored-by: Aashish <t-akolluri@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-05-05 18:08:08 +00:00

ddfbdf5c7a

Python: Fix hyperlight WasmSandbox cross-thread Drop and harden hosted-agent sample (#5603 )

* update hyperlight to beta and move samples, add hosted agent sample

* Python: Fix hyperlight WasmSandbox cross-thread Drop and harden sample

Root cause: when a worker-side closure raised, the exception's __traceback__
retained frame locals that included the partially constructed PyO3 sandbox.
Future.result() re-raised that exception on the caller thread, and when the
caller's exception was eventually GC'd the frame locals were released
off-thread, dec_ref'ing the unsendable sandbox from the wrong thread and
tripping the PyO3 panic
'_native_wasm::WasmSandbox is unsendable, but is being dropped on another thread'.

Fix:
* Add _SandboxWorker._run_on_worker which catches every exception on the
  worker, drops __traceback__ there, deletes the original exception, and
  re-raises a fresh instance on the caller thread. initialize and execute
  route through it; dispose keeps its bare-submit semantics.
* Add an opt-in diagnostic module _drop_diagnostic (no-op unless
  HYPERLIGHT_TRACE_DROPS=1) that installs a sys.unraisablehook and dumps
  owner-thread + per-thread stacks on any future cross-thread unsendable
  Drop. Useful for triaging similar PyO3 regressions.
* Tests: cross-thread invocation, traceback-leak isolation, _SandboxEntry
  attribute-shape check, and a stale-reference stress test driven through
  asyncio.to_thread.

Sample (samples/04-hosting/foundry-hosted-agents/responses/06_hyperlight_codeact):
* Dockerfile installs agent-framework-* from in-tree source with python/ as
  build context so unreleased fixes can be validated end-to-end.
* call_server.py pins the Responses API version.
* main.py enables include_detailed_errors=True so future tool failures
  surface the actual exception text instead of a bare 'Error: Function
  failed.' string.
* README.md documents the in-tree-package build and the Hyperlight
  hypervisor requirement (/dev/kvm on Linux, MSHV on Windows). Hosted
  environments without hypervisor passthrough surface 'No Hypervisor was
  found for Sandbox'; this is a hosting constraint, not a hyperlight bug.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: remove _drop_diagnostic from hyperlight package

The diagnostic module was useful while bisecting the cross-thread Drop bug,
but it is no longer needed now that _SandboxWorker._run_on_worker prevents
the panic at the source.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: address PR review feedback on hyperlight

- Use lazy agent_framework.hyperlight import in sample main.py.
- Env-driven endpoint (FOUNDRY_AGENT_ENDPOINT) in call_server.py; remove personal URLs.
- Align agent.yaml model deployment with manifest (gpt-4.1-mini).
- Tighten Dockerfile requirements guard; drop dangling deploy.ps1 reference.
- Preserve exception args when sanitizing tracebacks in _run_on_worker.
- Add public _SandboxWorker.is_alive(); update test to avoid private attr.
- Add namespace coverage tests for agent_framework.hyperlight lazy loader.
- Add prominent note: Foundry hosted-agent runtime does not yet support
  Hyperlight (no hypervisor exposed); container works locally with /dev/kvm.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: bump hyperlight-sandbox dependencies to 0.4.x

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: renumber hyperlight codeact sample to 08

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Coerce worker exception args to strings for cross-thread safety

Stringify exc.args on the worker thread before propagating, so any
PyO3 unsendable object captured in args (e.g. via a caller-supplied
callback or underlying SDK) cannot be Dropped on the calling thread.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* moved sample

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-05-05 10:06:16 +00:00

57c901a245

Python: Add hosted agent sample with observability (#5608 )

* Add hosted agent sample with observability

* Address comments

* Remove unneeded changes

* Update README

Tao Chen · 2026-05-04 22:31:47 +00:00

5a087885a2

Python: Support GPT-5 verbosity option and restore Foundry agent_reference (#5619 )

* Python: Support GPT-5 verbosity option and restore Foundry agent_reference

Adds verbosity as a typed Literal["low","medium","high"] field on
OpenAIChatOptions (Responses API) and OpenAIChatCompletionOptions (Chat
Completions API), set in the same way as the existing reasoning options.
For the Responses API, top-level verbosity is translated to the nested
text.verbosity shape the OpenAI service expects. The same field flows
through to FoundryChatClient via the existing FoundryChatOptions alias.

Also fixes #5582: PR #5447 removed the agent_reference injection from
RawFoundryAgentChatClient._prepare_options, so first-turn calls against
a Foundry Prompt Agent went out without model and without agent_reference
and were rejected by the Responses API with "Missing required parameter:
'model'". Restores the injection on the non-preview path
(allow_preview=False) and adds a guard test that asserts the preview
path does not inject agent_reference, since the preview SDK injects it
via project_client.get_openai_client(agent_name=...).

Closes #5516
Closes #5582

* Python: Address Copilot review on PR #5619

- Foundry verbosity sample docstring: replace the misleading "set deployment
  name on model=" instruction with the actual env-var pattern the sample relies
  on (FOUNDRY_PROJECT_ENDPOINT and FOUNDRY_MODEL).
- _build_agent_reference docstring: clarify the helper is used for both
  Prompt Agents and HostedAgents on the non-preview path.
- Add a Responses API test that locks in the documented precedence rule:
  when both top-level verbosity and text["verbosity"] are supplied, the
  top-level value wins.

* Python: Drop redundant Foundry verbosity sample and list OpenAI sample in README

- Remove samples/02-agents/providers/foundry/foundry_chat_client_verbosity.py
  per review feedback. The verbosity functionality is identical across the
  OpenAI and Foundry clients (FoundryChatOptions is an alias of
  OpenAIChatOptions), so a single sample on the OpenAI side is sufficient.
- Add the new client_verbosity.py entry to the OpenAI samples README.

Evan Mattson · 2026-05-04 21:21:40 +00:00

f3db60fa65

docs: fix outdated @ai_function reference to @tool in workflows README (#5622 )

The @ai_function decorator was renamed to @tool in release 
python-1.0.0b260128 (PR #3413) as a breaking change.

Line 58 of python/samples/03-workflows/README.md still referenced 
the old @ai_function name, causing users to hit:
ImportError: cannot import name 'AIFunction'

Changes made:
- Fixed @ai_function to @tool on line 58 only
- No formatting or whitespace changes

Aishwarya Sawant · 2026-05-04 10:59:09 +00:00

e558d36ff6

Python: docs(python/samples): recommend uv venv and document Windows ensurepip hang workaround (#5508 )

* docs(samples): recommend uv venv to avoid Windows ensurepip hang

Replace bare 'python -m venv .venv' with 'uv venv .venv' as the
recommended approach in azure_functions and foundry-hosted-agents
READMEs. Add a note explaining that python -m venv can hang
indefinitely on Windows with Microsoft Store Python due to a known
ensurepip issue.

This matches the pattern already used in a2a/README.md which uses
uv run exclusively.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: docs(python/samples): recommend `uv venv` and document Windows ensurepip hang workaround

Fixes #5401

* fix: correct Windows venv activation commands in foundry-hosted-agents README (#5401)

Split the Windows activation section into separate PowerShell (.venv\Scripts\Activate.ps1)
and Command Prompt (.venv\Scripts\activate.bat) instructions, replacing the incorrect
extensionless `Activate` path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5401: Python: [Samples][Python] `python -m venv` hangs on Windows — READMEs should recommend uv or document workaround

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-05-04 04:46:17 +00:00

6582926af5

Python: Add redis[asyncio] to requirements.txt for streaming samples (#5509 )

* fix: add redis[asyncio] to streaming sample requirements.txt

Both streaming samples import redis.asyncio in redis_stream_response_handler.py
but neither included redis in their requirements.txt, causing ModuleNotFoundError
on fresh installs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Add `redis[asyncio]` to requirements.txt for streaming samples

Fixes #5396

* Revert unrelated formatting and cleanup changes

Revert formatting-only edits in sample files and unrelated cleanup
(unused import removal, __all__ reordering) that were accidentally
included in the redis dependency fix (issue #5396).

The only intended changes for this PR are the Redis dependency
additions to requirements.txt files for the streaming samples.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5396: Python: [Samples][Python] redis package missing from requirements.txt in streaming samples

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-05-04 04:45:07 +00:00

0507179d3b

Python: Document that W3C trace context injection does not apply to Foundry hosted/toolbox MCP tools (#5580 )

* docs: clarify MCP trace-context propagation scope for hosted/toolbox tools (#5547)

Automatic W3C trace-context injection via params._meta applies only to
MCP sessions opened by the agent process (MCPStreamableHTTPTool,
MCPStdioTool, MCPWebsocketTool).  Hosted MCP tools
(FoundryChatClient.get_mcp_tool) and toolbox-fetched tools
(FoundryChatClient.get_toolbox) execute inside the Foundry agent service
runtime; the framework never issues the tools/call for those and
therefore cannot inject traceparent/tracestate.  The previous wording
("for all transports") implied coverage that does not exist.

The updated section:
- removes the inaccurate "for all transports" claim
- adds a Scope paragraph naming the three client-opened transports that
  are covered
- explicitly states that propagation across the agent-to-toolbox-to-MCP
  boundary is the responsibility of the Foundry service runtime
- documents the workaround (use MCPStreamableHTTPTool directly) for
  users who need end-to-end distributed tracing today

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: broaden MCP _meta scope note to cover all provider-managed transports (#5547)

- List OpenAIChatClient.get_mcp_tool() and AnthropicClient.get_mcp_tool()
  alongside FoundryChatClient.get_mcp_tool() as hosted/provider-managed
  exceptions; restricting the carve-out to Foundry was misleading for
  readers using other providers
- Fix get_toolbox() wording: use 'await client.get_toolbox(...)' and note
  that toolbox.tools is passed into Agent(tools=...) so it reads as an
  async instance method call, not a static/class method call
- Add parenthetical '(or any other client-opened MCPTool subclass)' to
  future-proof the list of covered transports

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add GeminiChatClient to MCP scope note and add learn-site observability doc (#5547)

- Add GeminiChatClient.get_mcp_tool(...) to the hosted/provider-managed
  list in the MCP trace propagation scope note; Gemini's get_mcp_tool()
  returns a types.Tool with an McpServer entry executed by the Gemini
  service runtime, so it belongs alongside FoundryChatClient,
  OpenAIChatClient, and AnthropicClient in that list.
- Create docs/features/observability/README.md as the learn-site
  documentation surface for observability, covering telemetry setup and
  MCP trace propagation with the same scope note (including
  GeminiChatClient) so that both doc surfaces are consistent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Remove unneeded observability docs README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-05-03 23:08:56 +00:00

b8e66a1144

Python: Add Python parity for HttpRequestAction in declarative workflow (#5599 )

* Add Python parity for HttpRequestAction in declarative workflow

* Ran pyupgrade and pright to fix CI issues

* Fix conversation ID dot parsing for http executor

* Removed unnecessary export command

Peter Ibekwe · 2026-05-01 23:04:07 +00:00

bc42874690

Python: Add sample for hosted agent with files (#5596 )

* Add sample for hosted agent with files

* Update python/samples/04-hosting/foundry-hosted-agents/responses/06_files/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/samples/04-hosting/foundry-hosted-agents/responses/06_files/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/samples/04-hosting/foundry-hosted-agents/responses/06_files/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/samples/04-hosting/foundry-hosted-agents/responses/04_foundry_toolbox/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/samples/04-hosting/foundry-hosted-agents/responses/06_files/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Improve README

* Address comments

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Tao Chen · 2026-05-01 18:40:42 +00:00

18293ffb31

Python: Enforce approval_mode in Claude and GitHub Copilot agents (#5562 )

* Python: Enforce approval_mode in Claude and GitHub Copilot agents

Tools declared with approval_mode="always_require" were bypassed by the
ClaudeAgent and GitHubCopilotAgent because their SDK-managed tool-calling
loops invoke FunctionTool.invoke() directly via package-supplied handlers,
skipping the standard _try_execute_function_calls approval gate.

Per discussion on #5494, the fix lives in the agents (not in FunctionTool):
any flag added to the tool itself can be spoofed by code with the same
level of access, so the security boundary is the agent that owns the
tool-calling loop.

- Add on_function_approval option to ClaudeAgentOptions and
  GitHubCopilotOptions. Callback receives a FunctionCallContent describing
  the pending call and returns bool (sync or async).
- Gate FunctionTool.invoke() inside each agent's existing tool-handler
  closure when approval_mode == "always_require". Default policy is deny;
  callbacks that raise also deny safely.
- Deny path returns a tool-error to the model (Claude: text content;
  Copilot: ToolResult(result_type="failure", error="approval_denied"))
  so the LLM can react gracefully instead of silently failing.
- Tests for both agents covering: deny by default, sync False, sync True,
  async True, callback-raises -> deny, no-op for never_require tools.
- Samples demonstrating sync, async, and deny-by-default flows for both
  agents.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: preserve empty arg dicts, reject runtime approval override

- _resolve_function_approval no longer collapses {} into None when building
  the FunctionCallContent passed to the callback (Claude + Copilot).
- Claude _apply_runtime_options and Copilot _run_impl/_stream_updates now
  raise ValueError if on_function_approval is supplied via per-run options,
  instead of silently ignoring it. Approval policy must be set at agent
  construction time.
- Drop unnecessary # type: ignore[attr-defined] on Content.name/.arguments
  in samples (Content is a unified class with both attributes defined).
- Add regression tests for the new runtime-options validation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* warning when non callback handler and approval needed

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-05-01 14:11:28 +00:00

c1cc6ee6df

Python: Reduce flaky integration tests and improve CI signal quality (#5454 )

* Enable Ollama integration tests in CI and rename report to Integration Test Report

- Install Ollama, cache models (qwen2.5:0.5b + nomic-embed-text), and start
  server in the Misc integration job for both workflow files
- Set OLLAMA_MODEL and OLLAMA_EMBEDDING_MODEL env vars so the 5 Ollama tests
  are no longer skipped
- Rename Flaky Test Report to Integration Test Report throughout (job names,
  artifact names, cache keys, file names, script titles/docstrings)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Bump Ollama model to qwen2.5:1.5b for better instruction following

The 0.5b model was too small to reliably follow simple prompts like
'Say Hello World', causing test assertion failures. The 1.5b model
follows instructions more reliably while still being small enough
for fast CI pulls (~1GB).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable reliable streaming integration tests

Remove the hard skip on test_03_reliable_streaming tests that was
temporarily disabled for instability investigation. CI infrastructure
(Azurite, DTS emulator, Redis, func CLI) is already in place.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable skipped Functions/DurableTask tests and bump timeout to 480s

- Remove hard skips from 4 tests in test_11_workflow_parallel.py
- Remove hard skip from test_conditional_branching in test_06_dt_multi_agent_orchestration_conditionals.py
- Increase pytest --timeout from 360 to 480 for Functions+DurableTask CI job
- Updated in both python-merge-tests.yml and python-integration-tests.yml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-skip failing Functions/DurableTask tests with specific root causes

- test_11_workflow_parallel (4 tests): xdist worker crashes during execution
- test_conditional_branching: orchestration fails with RuntimeError, not a timeout
- Keep 480s timeout bump for remaining Functions tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix auth routing in samples 06/11: api_key -> credential for Azure OpenAI

Both samples passed a bearer token provider via api_key= which caused the
client to route to api.openai.com instead of Azure OpenAI, resulting in
401 Unauthorized. Changed to credential= which correctly triggers Azure
routing and picks up AZURE_OPENAI_ENDPOINT from the environment.

- samples/azure_functions/11_workflow_parallel/function_app.py: 1 fix
- samples/durabletask/06_multi_agent_orchestration_conditionals/worker.py: 2 fixes
- Re-enable 4 parallel workflow tests and 1 conditional branching test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-skip parallel workflow tests: xdist worker distribution issue

The 4 parallel workflow tests crash because xdist worksteal distributes
them across separate workers, each spawning its own func process against
shared emulators. Auth fix (api_key->credential) was valid and stays.
test_conditional_branching now passes with the auth fix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix E501 line-too-long in azurefunctions parallel test skip reasons

Wrap skip reason strings to stay within 120 char line limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add retry logic and port-conflict fix for Ollama CI setup

- Kill any auto-started Ollama before launching serve (fixes port
  conflict: 'address already in use')
- Retry ollama pull up to 3 times with 15s backoff (fixes 429 rate
  limit failures)
- Applied to both python-merge-tests.yml and python-integration-tests.yml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix flaky integration tests and re-enable skipped tests

- Foundry agent: add allow_preview=True to custom client test
- Foundry hosting: raise max_output_tokens 50->200, add temperature,
  relax assertion in test_temperature_and_max_tokens
- Foundry embedding: update skip reason with root cause (endpoint mismatch)
- OpenAI file search: fix vector store indexing race condition by polling
  file_counts before querying; fix get_streaming_response -> get_response(stream=True)
- Azure OpenAI file search: remove skip (transient 500 resolved)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Remove temperature from foundry hosting test (unsupported by CI model)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Stabilize Ollama tool call integration tests with no-arg function

Use a no-argument greet() function instead of hello_world(arg1) for
integration tests. The 1.5B model in CI is unreliable at generating
correct tool call arguments, causing 'Argument parsing failed' errors.
A no-arg function eliminates this flakiness entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Increase reliable streaming test timeouts from 30s to 60s

The LLM call through Azure OpenAI + Redis streaming pipeline can exceed
30s in CI due to cold starts or throttling. Raise to 60s to reduce
flaky timeouts while still bounded by pytest's 120s per-test limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable workflow parallel tests with xdist_group marker

The tests were skipped because xdist distributes module tests across
workers, each spawning their own func process (port conflicts). Adding
xdist_group forces all tests in this module onto a single worker so
the module-scoped function_app_for_test fixture works correctly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "Re-enable workflow parallel tests with xdist_group marker"

This reverts commit 455c28da62.

* Rename flaky_report to integration_test_report and add try/finally cleanup

- Rename scripts/flaky_report/ to scripts/integration_test_report/ to
  reflect expanded scope beyond flaky-test detection
- Update workflow references in both CI files
- Wrap file search integration tests in try/finally to ensure vector
  store cleanup runs even on test failure or timeout

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Ollama pull failure propagation and Azure OpenAI vector store readiness

- Ollama CI: fail the step immediately if model pull fails after 3
  retries instead of silently proceeding to tests
- Azure OpenAI file search: add the same vector-store readiness polling
  that was applied to the non-Azure OpenAI tests, preventing eventual
  consistency race conditions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* remove load_dotenv from test file

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-05-01 00:41:39 +00:00

540193ccef

Python: Update package dependencies (#5555 )

* Update dependencies

* Preserve mcp[ws] and uvicorn[standard] extras in override-dependencies

Bare-package overrides on mcp and uvicorn dropped the [ws] and [standard]
extras (and their transitive deps like httptools, watchfiles) from the
generated lock. Re-add the extras to the overrides so the lock matches
what workspace packages actually request.

Evan Mattson · 2026-04-29 06:18:03 +00:00

094f9903b3

Python: [BREAKING] Standardize orchestration terminal outputs as AgentResponse (#5301 )

* Fix orchestration outputs so as_agent() returns the final answer only. Align other orchestration outputs

* Fix orchestration output issues from review comments

1. Sample cleanup: Remove commented-out FoundryChatClient block and update
prerequisites to reference OPENAI_CHAT_MODEL_ID instead of FOUNDRY_* vars.

2. Sequential approval output: Change _EndWithConversation.end_with_agent_executor_response
from a no-op sink to yield response.agent_response. When the last participant is
AgentApprovalExecutor (via with_request_info), _EndWithConversation is the output
executor so the yield produces the terminal answer. When the last participant is a
regular AgentExecutor, _EndWithConversation is not in output_executors so the yield
is silently filtered out.

3. Forward data events through WorkflowExecutor: _process_workflow_result now also
forwards 'data' events from sub-workflows so that emit_intermediate_data=True on
AgentExecutor works correctly when wrapped in AgentApprovalExecutor.

4. Concurrent docstring: Update _AggregateAgentConversations docstring to say
'deterministic participant order' instead of 'completion order'.

5. Add test_concurrent_intermediate_outputs_emits_data_events verifying that
ConcurrentBuilder(intermediate_outputs=True) emits per-participant data events
alongside the single aggregated output event.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add tests for sequential workflow with_request_info and intermediate_outputs (#5301)

Address PR review comments 2, 3, and 5:

- Add test_sequential_request_info_last_participant_emits_output:
Verifies that when the last participant is wrapped via with_request_info()
(AgentApprovalExecutor), the workflow still emits a terminal output after
approval, exercising the _EndWithConversation.end_with_agent_executor_response
fallback path.

- Add test_sequential_request_info_with_intermediate_outputs_emits_data_events:
Verifies that emit_intermediate_data=True works correctly through
AgentApprovalExecutor wrapping—WorkflowExecutor._process_result already
forwards data events from sub-workflows, so intermediate agent responses
surface as data events in the parent workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright type errors from AgentResponse output refactor (#5301)

Update cast() calls in _group_chat.py and _magentic.py to use
WorkflowContext[Never, AgentResponse] instead of the old
WorkflowContext[Never, list[Message]], matching the updated method
signatures in _base_group_chat_orchestrator.py.

Fix _sequential.py _EndWithConversation.end_with_agent_executor_response
to declare WorkflowContext[Any, AgentResponse] so yield_output accepts
AgentResponse[None].

Fix _workflow_executor.py data event forwarding to handle nullable
executor_id.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright reportUnknownVariableType in _agent.py (#5301)

Extract event.data into a typed local variable before the isinstance
check to avoid pyright narrowing it to AgentResponse[Unknown].

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright reportMissingImports for orjson in file history samples (#5301)

Add pyright: ignore[reportMissingImports] to orjson imports that are
already guarded by try/except ImportError, matching the existing pattern
used elsewhere in the samples.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5301: review comment fixes

* Revert sequential_workflow_as_agent sample to FoundryChatClient

Reverts the mistaken switch from FoundryChatClient to OpenAIChatClient
in the sequential workflow as agent sample.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address ultrareview feedback: emit_data_events rename + WorkflowAgent reasoning conversion

Layered on top of the prior review-feedback work in this branch.

Renames:
- AgentExecutor.emit_intermediate_data -> emit_data_events (mechanical
rename; orchestration semantics live at the orchestration layer, not
the general-purpose executor). Forwarded through MagenticAgentExecutor,
AgentApprovalExecutor, and all orchestration call sites.
- HandoffAgentExecutor._check_terminate_and_yield -> _should_terminate
(pure predicate; no longer yields anything). HandoffBuilder docstring
rewritten to describe the new per-agent AgentResponse output contract.

WorkflowAgent reasoning-content conversion:
- Add _rewrite_text_to_reasoning(contents) and _msg_as_reasoning(msg)
helpers; the as_agent() path now reframes text content from data events
as text_reasoning Content blocks before merging into the AgentResponse.
- Consumers iterate msg.contents and branch on content.type — same path
they already use for Claude thinking and OpenAI reasoning. No new
field on Message/AgentResponse/WorkflowEvent.
- Streaming branch constructs fresh AgentResponseUpdate instances instead
of mutating shared payloads (regression test added).
- Helper _msg_maybe_reasoning consolidates the conditional rewrite at
three call sites in the non-streaming conversion.

Tests:
- TestWorkflowAgentReasoningHelpers + TestWorkflowAgentDataEventReasoningConversion
add 9 new tests covering helpers, non-streaming, streaming, mixed content,
already-reasoning passthrough, and mutation-safety regression.
- Updated test_sequential_as_agent_with_intermediate_outputs_includes_chain
to assert text_reasoning content for intermediate agents.

* Fix pyright: widen event.data to Any to avoid partial-unknown narrowing

The streaming conversion path narrowed event.data via isinstance against
generic AgentResponse, producing AgentResponse[Unknown] and tripping
reportUnknownVariableType/reportUnknownMemberType. Binding data: Any
before the check keeps runtime behavior identical while restoring a fully
known type for downstream access.

* Clean up design

* Scope to agent output semantics only

* yield AgentResponseUpdate streaming, AgentResponse non-streaming

* Fix mypy/pyright: widen cast types at GroupChat callsites

Eight callsites in _group_chat.py still cast to WorkflowContext[Never,
AgentResponse] but the base orchestrator methods now accept the wider
WorkflowContext[Never, AgentResponse | AgentResponseUpdate] (mode-aware
yields). W_OutT is invariant, so the narrower cast is not assignable.
Magentic was widened in the same commit; this catches the GroupChat
callsites that were missed.

* Python: skip flaky Foundry / Foundry Hosting integration tests (#5553)

These two integration tests have been failing in the merge queue across
multiple unrelated PRs (5301, 5531). Both are marked `@pytest.mark.flaky`
with 3 retries, but all attempts fail back-to-back. Skipping both with a
reason pointing to #5553 so they can be fixed properly without continuing
to block unrelated merges.

- packages/foundry_hosting/tests/test_responses_int.py::TestOptions::test_temperature_and_max_tokens
- packages/foundry/tests/foundry/test_foundry_embedding_client.py::TestFoundryEmbeddingIntegration::test_text_embedding_live

Also includes a one-line uv.lock specifier-ordering normalization
auto-applied by the poe-check pre-commit hook.

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-04-29 00:35:36 +00:00

866a325b48

Bump vite in /python/samples/05-end-to-end/chatkit-integration/frontend (#5126 )

Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 7.1.12 to 7.3.2.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v7.3.2/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v7.3.2/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-version: 7.3.2
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

dependabot[bot] · 2026-04-28 08:08:36 +00:00

df6041bcc1

Bump picomatch (#4936 )

Bumps [picomatch](https://github.com/micromatch/picomatch) from 4.0.3 to 4.0.4.
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/picomatch/compare/4.0.3...4.0.4)

---
updated-dependencies:
- dependency-name: picomatch
  dependency-version: 4.0.4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

dependabot[bot] · 2026-04-28 07:27:15 +00:00

7c4837744b

Bump postcss (#5491 )

Bumps [postcss](https://github.com/postcss/postcss) from 8.5.6 to 8.5.10.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.5.6...8.5.10)

---
updated-dependencies:
- dependency-name: postcss
  dependency-version: 8.5.10
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

dependabot[bot] · 2026-04-28 07:23:50 +00:00

8f4efe5fb9

Bump postcss (#5527 )

Bumps [postcss](https://github.com/postcss/postcss) from 8.5.6 to 8.5.12.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.5.6...8.5.12)

---
updated-dependencies:
- dependency-name: postcss
  dependency-version: 8.5.12
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

dependabot[bot] · 2026-04-28 07:23:17 +00:00

362c4c5f84

Python: Update hosting agent samples + fixes (#5485 )

* Update foundry hosting samples

* Add file data type support

* Fix file content and add more tests

* Fix README

* Address comments

* Fix int tests

* remove temp

Tao Chen · 2026-04-28 04:24:05 +00:00

88347f6494

Python: Add requirements.txt and .env.example to the a2a/ sample for pip-based setup (#5510 )

* Add requirements.txt and .env.example to a2a sample

Beginners following the a2a/ sample had no pip-based install path:
the directory lacked requirements.txt and .env.example, unlike every
other 04-hosting/ sample.

- Add requirements.txt with editable local package paths matching the
  pattern used in azure_functions/ and similar hosting samples
- Add .env.example documenting FOUNDRY_PROJECT_ENDPOINT, FOUNDRY_MODEL,
  and A2A_AGENT_HOST
- Update README Quick Start to cover both pip (.venv) and uv workflows

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Add `requirements.txt` and `.env.example` to the `a2a/` sample for pip-based setup

Fixes #5395

* fix(a2a-sample): address PR review feedback for issue #5395

- Remove 'from repo root' wording from Option B uv heading in README
  to avoid contradicting the 'run from this directory' instruction
- Fix A2A_AGENT_HOST default in .env.example from 5001 to 5000 to match
  function-tools flow; add clarifying comments about port usage
- Add note for pip users explaining they can replace 'uv run python'
  with 'python' once the virtual environment is activated

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5395: Python: [Samples][Python] a2a/ sample missing requirements.txt — beginners cannot install dependencies

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-04-27 22:22:07 +00:00

9b22ecd119

Python: (core): Add functional workflow API (#4238 )

* Add functional workflow api

* cleanup

* More cleanup

* address copilot feedback

* Address PR feedbacK

* updates

* PR feedback

* Address review comments on functional workflow samples

- Swap 05/06 get-started samples: agent workflow first (motivates
  why workflows exist), simple text workflow second
- Rename text_pipeline → text_workflow, poem_pipeline → poem_workflow
- Add @step to agent workflow sample (05) to demonstrate caching
- Switch agent samples to AzureOpenAIResponsesClient with Foundry
- Remove .as_agent() from agent_integration.py to focus on the key
  difference between inline agent calls vs @step-cached calls
- Add commented-out Agent.run example in hitl_review.py
- Add clarifying comment in _functional.py that event streaming is
  buffered (not true per-token streaming)
- Add naive_group_chat.py functional sample: round-robin group chat
  as a plain Python loop
- Update READMEs to reflect new file names and group chat sample

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright type errors

* Address PR review comments on functional workflow API

1. Allow request_info inside @step: Auto-inject RunContext into step
   functions that declare a RunContext parameter (by type or name 'ctx'),
   and expose get_run_context() for programmatic access.

2. Handle None responses: Log a warning when a response value is None,
   and document the behavior in request_info docstring.

3. Add executor_bypassed event type: Replace executor_invoked +
   executor_completed with a single executor_bypassed event when a step
   replays from cache, making cached vs live execution explicit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add regression tests for PR review comments on functional workflow API

The three review comments (request_info in @step, None response handling,
executor_bypassed event type) were already addressed in 7da7db4e. This
commit adds cross-cutting regression tests that exercise the interactions
between these features:

- HITL in step with caching: preceding step bypassed on resume
- Full checkpoint lifecycle with HITL step (interrupt -> resume -> restore)
- None response inside step-level request_info logs warning
- WorkflowInterrupted from step does not emit executor_failed

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #4238 review comments on functional workflow API

Comment 1 (request_info in @step): Already supported. Added comment in
StepWrapper.__call__ explaining why WorkflowInterrupted (BaseException)
safely bypasses the except Exception handler.

Comment 2 (None response): Added docstring to _get_response clarifying
the (found, value) return tuple semantics and None handling.

Comment 3 (bypass event type): executor_bypassed is already a dedicated
event type in WorkflowEventType. Updated comment at the bypass site to
make the deliberate event type choice explicit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add experimental API warnings to functional workflow module

Mark all public classes and decorators (workflow, step, RunContext,
FunctionalWorkflow, StepWrapper, FunctionalWorkflowAgent) as
experimental and subject to change or removal.

* Address PR #4238 review comments from @eavanvalkenburg

- RunContext docstring leads with purpose (opt-in handle for HITL,
  custom events, state) so readers importing it from the public surface
  understand its role before the mechanics (#2993513452).
- Rename `06_first_functional_workflow.py` to
  `06_functional_workflow_basics.py`; the previous filename was
  confusing since it followed `05_functional_workflow_with_agents.py`
  (#2993531979).
- Simplify `05_functional_workflow_with_agents.py` to call agents
  directly without a @step wrapper; the step-vs-no-step contrast lives
  in `03-workflows/functional/agent_integration.py`, keeping the
  get-started sample minimal (#2993525532).
- Switch functional samples to `FoundryChatClient` for consistency with
  the rest of 01-get-started and 03-workflows (follow-up on #2876988570).
- Use walrus in `hitl_review.py` final-state assertion (#2993572182).
- Add expected-output block to `basic_streaming_pipeline.py` (#2993557609).
- Clarify in `parallel_pipeline.py` that `@step` composes with
  `asyncio.gather` (#2993597282).
- `naive_group_chat.py` threads `list[Message]` between turns instead
  of stringifying the transcript, preserving role/authorship (#2993583231).

Drive-by: pre-commit hook sorts an unrelated import block in
`samples/04-hosting/foundry-hosted-agents/responses/02_local_tools/main.py`.

* Fix 10 functional-workflow API bugs from /ultrareview pass

- bug_001: `ctx.request_info()` without an explicit `request_id` now derives
  a deterministic `auto::<index>` id from the call-counter, so HITL resume
  works correctly on the documented default path.  A uuid was regenerated on
  every replay, making resume impossible.

- bug_002: `StepWrapper.__call__` no longer deepcopies arguments on the
  cache-hit replay branch.  The copy is only performed on the live-execution
  path (for the event log) and falls back to the original mapping if deepcopy
  fails, so steps whose args aren't deepcopyable (locks, sockets, sessions)
  can still resume from checkpoint.

- bug_007: `_set_responses` now prunes each resolved `request_id` from
  `_pending_requests`, and the cache-hit branch in `request_info` does the
  same.  Previously, answered requests were re-serialized into every
  subsequent checkpoint and the final checkpoint falsely claimed pending
  requests even after the workflow completed.

- bug_008: `_compute_signature_hash` now mixes the function's `co_code` and
  `co_names` into the checkpoint signature, so changes to the workflow body
  invalidate older checkpoints even when steps are accessed via module /
  class attributes (which `_discover_step_names` can't see statically).
  `RunContext._record_observed_step` records observed step names for
  diagnostics.

- bug_010: `FunctionalWorkflow.run()` docstring corrected — says "at least
  one of message/responses/checkpoint_id" and explicitly notes `responses`
  may be combined with `checkpoint_id` (the validator already allowed this).

- bug_013: `FunctionalWorkflowAgent` now surfaces `request_info` events as
  `FunctionApprovalRequestContent` items (mirroring graph `WorkflowAgent`),
  threads `responses=` and `checkpoint_id=` through to the underlying
  workflow, and exposes `pending_requests`.  Previously `.as_agent()`
  returned empty `AgentResponse` for HITL workflows — effectively unusable.

- bug_014: `FunctionalWorkflow` now clears `_last_message`,
  `_last_step_cache`, and `_last_pending_request_ids` on clean completion.
  `run()` validates that `responses=` keys intersect the currently-pending
  request set (or raises with a clear error) instead of silently replaying
  against stale singleton state from a prior run.

- bug_015: `FunctionalWorkflow.as_agent` signature now matches graph
  `Workflow.as_agent`: accepts `name`, `description`, `context_providers`,
  and `**kwargs`.  `FunctionalWorkflowAgent` stores the overrides.

- bug_017: `RunContext.set_state` raises `ValueError` for underscore-
  prefixed keys (the framework's `_step_cache` / `_original_message` keys
  would silently clobber user state on checkpoint save and user
  underscore-prefixed state was dropped on restore).  Docstring documents
  the reserved prefix.

- merged_bug_003: Workflow function arity is validated at decoration time.
  Multiple non-ctx parameters raise `ValueError` immediately (previously
  every arg past the first was silently dropped at call time).  Passing a
  non-None `message` to a ctx-only workflow raises `ValueError` instead of
  silently discarding the message.

Test coverage: +18 regression tests covering every fix.  Full workflow
suite now 766 passed, 1 skipped, 2 xfailed; full core suite 2338 passed.

* Deslop functional.py fix commit

- Remove dead instrumentation added in the prior commit that was never
  consumed: `RunContext._observed_step_names`,
  `RunContext._record_observed_step`, `FunctionalWorkflow._runtime_step_names`,
  and `FunctionalWorkflowAgent._extra_kwargs`.  The signature hash relies on
  `co_code` alone, which covers the attribute-access case without the
  collection-scaffolding.
- Trim over-explanatory comments that restated what the code does or what
  it no longer does.  Keep only the comments that answer "why" for the
  non-obvious bits (deterministic id contract, defensive deepcopy, stale
  replay guard).
- Compress the `_compute_signature_hash` and FunctionalWorkflow `__init__`
  block docstrings without losing the user-facing reasoning.

Net -49 lines.  Regression lock preserved (766 passed, 1 skipped, 2 xfailed).

* Fix functional workflow review feedback

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>

Evan Mattson · 2026-04-24 09:41:20 +00:00

da32e8cf80

Python: update FoundryAgent for hosted agent sessions (#5447 )

* fixes to FoundryAgent to connect to new hosted agents

Co-authored-by: Copilot <copilot@github.com>

* fix mypy

Co-authored-by: Copilot <copilot@github.com>

* Python: remove Foundry service session helpers

Remove the public hosted-agent service session CRUD helpers from FoundryAgent and drop the related feature-stage inventory entry.

Update the hosted-agent sample to create and delete service sessions directly through the preview AIProjectClient APIs, and tighten a few test harnesses surfaced by full workspace validation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix from merge

* fix hosted env detection

Co-authored-by: Copilot <copilot@github.com>

* reverted sample update

* fix tests and code

Co-authored-by: Copilot <copilot@github.com>

* remove aenter

* skipping some tests

Co-authored-by: Copilot <copilot@github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-24 09:25:03 +00:00

62e02da698

Python: Add OpenTelemetry integration for GitHubCopilotAgent (#5142 )

* Python: Add OpenTelemetry integration for GitHubCopilotAgent

- Split GitHubCopilotAgent into RawGitHubCopilotAgent (core, no OTel) and
  GitHubCopilotAgent(AgentTelemetryLayer, RawGitHubCopilotAgent) with tracing
- Add default_options property to expose model for span attributes
- Export RawGitHubCopilotAgent from all public namespaces
- Add github_copilot_with_observability.py sample and update README

* Python: Fix OTEL_SERVICE_NAME default in GitHub Copilot README

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Python: Add unit tests for RawGitHubCopilotAgent.default_options property

* Python: Address review feedback on GitHubCopilotAgent OTel integration

- Add middleware param to GitHubCopilotAgent.run() overloads so per-call
  middleware is explicitly forwarded through AgentTelemetryLayer
- Remove github_copilot_with_observability.py sample per feedback; replace
  with inline snippet + link to observability samples in README

* Python: Address review feedback on log_level and session kwargs typing

- Add middleware param to RawGitHubCopilotAgent.run() overloads for interface
  compatibility with AgentTelemetryLayer
- Fix import in README observability snippet to use agent_framework.github

* Python: Add AgentMiddlewareLayer to GitHubCopilotAgent MRO

Follow FoundryAgent pattern: AgentMiddlewareLayer runs outside the telemetry
span so middleware execution time is not captured in traces. Overloads removed
as AgentMiddlewareLayer.run() handles dispatch via MRO.

* Python: Add explicit __init__ to GitHubCopilotAgent for auto-complete and docstrings

* Python: Address review feedback on middleware warning and test assertions

- Add assert "timeout" not in opts to test_default_options_includes_model_for_telemetry
  to document the intentional asymmetry where timeout is extracted into _settings
  and not returned in default_options.
- Replace silent del middleware with a logged warning when per-run middleware is
  passed to RawGitHubCopilotAgent, making it clear that the GitHub Copilot SDK
  handles tool execution internally and chat/function middleware cannot be injected.

* Python: Use Self for __aenter__ return type in RawGitHubCopilotAgent

Address review feedback: use typing.Self (3.11+) / typing_extensions.Self
(3.10) for __aenter__ so subclasses like GitHubCopilotAgent get the correct
return type from async context manager usage.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Dineshsuriya D · 2026-04-24 08:44:44 +00:00

63c0a51797

Python: feat: Add Agent Framework to A2A bridge support (#2403 )

* feat: Add Agent Framework to A2A bridge support

- Implement A2A event adapter for converting agent messages to A2A protocol
- Add A2A execution context for managing agent execution state
- Implement A2A executor for running agents in A2A environment
- Add comprehensive unit tests for event adapter, execution context, and executor
- Update agent framework core A2A module exports and type stubs
- Integrate thread management utilities for async execution
- Add getting started sample for A2A agent framework integration
- Update dependencies in uv.lock

This integration enables agent framework agents to communicate and execute within the A2A (Agent to Agent) infrastructure.

* fix: Update references from agent_thread_storage to _agent_thread_storage in A2A executor tests

* Refactor A2A agent framework and improve code structure

- Reordered imports in various files for consistency and clarity.
- Updated `__all__` definitions to maintain a consistent order across modules.
- Simplified method signatures by removing unnecessary line breaks.
- Enhanced readability by adjusting formatting in several sections.
- Removed redundant comments and example scenarios in the execution context.
- Improved handling of agent messages in the event adapter.
- Added type hints for better clarity and type checking.
- Cleaned up test cases for better organization and readability.

* fix: Lint fix new line added

* test: Add unit tests for AgentThreadStorage and InMemoryAgentThreadStorage

* refactor: Update type hints to use new syntax for Union and List

* fix: Validate RequestContext for context_id and message before execution

* Refactor tests and remove A2aExecutionContext references

- Deleted the test file for A2aExecutionContext as it is no longer needed.
- Updated A2aExecutor tests to remove dependencies on A2aExecutionContext and adjusted method calls accordingly.
- Modified event adapter tests to use ChatMessage instead of AgentRunResponseUpdate.
- Removed A2aExecutionContext from imports in agent_framework.a2a module and updated type hints accordingly.

* Refactor A2AExecutor tests and remove event adapter

- Updated test cases to use A2AExecutor instead of A2aExecutor for consistency.
- Removed mock_event_adapter fixture and related tests as A2aEventAdapter is deprecated.
- Consolidated event handling tests into TestA2AExecutorEventAdapter.
- Adjusted imports in various files to reflect the removal of deprecated components.
- Ensured all references to A2aExecutor are updated to A2AExecutor across the codebase.

* refactor: Remove AgentThreadStorage and InMemoryAgentThreadStorage classes from threads and tests

* feat: A2AExecutor to have its own override able save and get threads methods for persistent storage.

* fix: linter bugs

* removed unnecessary changes form core package

* new line added

* Refactor A2AExecutor tests and update imports

- Consolidated mock agent fixtures in test_a2a_executor.py to simplify agent mocking.
- Removed redundant tests related to thread storage and agent types, focusing on A2AExecutor's core functionality.
- Updated test assertions to reflect changes in message handling with new Message and Content classes.
- Enhanced integration tests to ensure compatibility with the new agent framework structure.
- Added A2AExecutor to the module exports in __init__.py and __init__.pyi for better accessibility.

* Update A2A documentation: enhance usage examples for A2AAgent and A2AExecutor

* Updated uv lock

* Fix metadata assertion in TestA2AExecutorHandleEvents and reorder load_dotenv call in agent_framework_to_a2a.py

* Update agent card configuration: add default input and output modes, and fix agent creation method

* Fix assertion for metadata in TestA2AExecutorHandleEvents

* Fix formatting issues in TestA2AExecutorExecute and TestA2AExecutorIntegration

* Enhance A2AExecutor documentation with examples and clarify agent execution process

* Revert uv lock to main

* Refactor A2AExecutor: Improve formatting and streamline constructor parameters

* Apply suggestions from code review

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

* Refactor A2AExecutor to use SupportsAgentRun and enhance logging; update agent framework sample for flight and hotel booking capabilities

* Enhance A2AExecutor with streaming support and custom run arguments; update tests for initialization and execution scenarios

* Enhance A2AExecutor event handling with streamed artifact tracking; update tests for new behavior

* Refactor A2AExecutor to enforce type hints for stream and run_kwargs attributes

* Refactor A2AExecutor and tests: replace AsyncMock with MagicMock for response stream handling; clean up imports in agent_framework_to_a2a.py

* refactor: streamline imports and improve code readability across multiple files

* feat: enhance A2AExecutor cancel method with context validation and fixed review comments

* feat: implement get_uri_data utility function for extracting base64 data from data URIs and update references

* fix: update import path for get_uri_data utility function in A2AExecutor and A2AAgent

* fix: correct error message handling in A2AExecutor and update test assertions

---------

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

Shubham Kumar · 2026-04-24 08:35:40 +00:00

b00465d7be

Python: fix(foundry): reconcile toolbox hosted-tool payloads with Responses API (#5414 )

* fix(foundry): reconcile toolbox hosted-tool payloads with Responses API

* docs(foundry): update create_sample_toolbox docstring to reflect all tools created

Evan Mattson · 2026-04-22 17:43:26 +00:00

fffd0acb3e

Python: Fix OpenAI Responses streaming to propagate created_at from final response.completed event (#5382 )

* Fix streaming response losing created_at from response.completed event (#5347)

The streaming path in _parse_chunk_from_openai did not extract created_at
from the response.completed event, unlike the non-streaming path in
_parse_responses_response. This caused durabletask persistence warnings
when created_at was None.

Extract created_at in the response.completed case and pass it to the
returned ChatResponseUpdate.

Also fix pre-existing pyright errors for optional orjson import in sample
files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix orjson import suppression to use pyright instead of mypy (#5347)

Replace `# type: ignore[import-not-found]` with
`# pyright: ignore[reportMissingImports]` on optional orjson imports
in conversation sample files, matching the repo's Pyright strict
configuration.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-04-22 06:19:31 +00:00

ea3320d39f

Python: feat(evals): add ground_truth support for similarity evaluator (#5234 )

* feat(evals): add ground_truth support for similarity evaluator

- Include expected_output as ground_truth in Foundry JSONL dataset rows
- Add ground_truth to item schema and data mapping for similarity evaluator
- Add expected_output parameter to evaluate_workflow
- Add similarity Pattern 3 to evaluate_agent and evaluate_workflow samples
- Add tests for ground_truth in dataset, schema, and evaluate_workflow

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: wrap long line to satisfy ruff E501

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

chetantoshniwal · 2026-04-21 19:40:53 +00:00

aa582d021d

Python: Add second approval-required tool (set_stop_loss) to concurrent_builder_tool_approval sample (#4875 )

* Add set_stop_loss tool to concurrent_builder_tool_approval sample

Add a second approval-gated tool (set_stop_loss) to the concurrent workflow
tool approval sample to demonstrate handling approval requests for different
tools in the same concurrent workflow.

Changes:
- Add set_stop_loss(symbol, stop_price) with approval_mode='always_require'
- Include new tool in both agents' tool lists
- Update agent instructions and prompt to encourage stop-loss usage
- Update docstring to reflect two approval-gated tools
- Update sample output to show mixed approval requests

Fixes #4874

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Print tool name and arguments in concurrent sample's process_event_stream (#4874)

Align process_event_stream in concurrent_builder_tool_approval.py to print
the tool name and arguments when collecting approval requests, matching the
sample output comment and the sequential_builder_tool_approval.py pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add None-guard for function_call access in tool approval sample (#4874)

Add explicit None-checks before accessing function_call.name and
function_call.arguments in concurrent_builder_tool_approval.py. The
function_call field is typed Content | None, so direct attribute access
without a guard could raise AttributeError and required type: ignore
comments. The None-guard is consistent with the pattern used in
_agent_run.py and removes the suppression comments.

Also add a regression test verifying that function_call defaults to None
and that the None-guard pattern is safe.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Apply same function_call None-guard to sibling tool-approval samples (#4874)

Apply the same fix to sequential_builder_tool_approval.py and
group_chat_builder_tool_approval.py, which had the identical pattern
of accessing function_call.name/arguments without a None-guard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-04-21 07:08:50 +00:00

b6b191ad9c

Python: Foundry hosted agent V2 (#5379 )

* Python: Wrapper + Samples 1st (#5177)

* Experiment

* Update dependency and add non streaming

* Add more samples

* Rename samples

* Add invocations

* Comments 1

* Comments 2

* Comments 3

* Improve README

* Add local shell sample

* WIP: Add eval and memory samples

* Update user agent prefix

* Update user agent prefix doc

* Update dependency (#5215)

* Add tests and more content types (#5235)

* Add tests

* fix tests and sample

* Fix formatting

* Remove function approval contents

* Python: Refine samples and upgrade packages (#5261)

* Refine samples and upgrade pacakges

* Upgrade to a new package that fixes a bug

* Update model env var

* Move samples (#5281)

* Python: Upgrade agentserver packages (#5284)

* Upgrade agentserver packages

* Fix new types

* Python: Add special handling for workflows (#5298)

* Add special handling for workflows

* Address comments

* Improve samples (#5372)

* Python: Add more types (#5378)

* Add more type supports

* Upgrade packages

* Remove TODOs in README

* Fix README

* Comments and mypy

* User agent scoped

* Fix README

* Fix pre commit

* Fix pre commit 2

* Fix pre commit 3

* Fix pre commit 4

* Fix pre commit 5

* Fix pre commit 6

* Add azure-monitor-opentelemetry to dev deps

Fixes Samples & Markdown CI failure. The PR's new transitive dep on
azure-monitor-opentelemetry-exporter (via azure-ai-agentserver-core) makes
pyright resolve the azure.monitor.opentelemetry namespace, flipping the
check_md_code_blocks diagnostic for `configure_azure_monitor` from
reportMissingImports (filtered) to reportAttributeAccessIssue (not filtered).
Installing the umbrella azure-monitor-opentelemetry package in dev makes
pyright resolve the symbol correctly, matching the install guidance the
observability README already gives users.

---------

Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>

Tao Chen · 2026-04-21 05:21:27 +00:00

ce8b6305d8

Python: Add support for Foundry Toolboxes (#5346 )

* Add support for the Foundry Toolbox in MAF

Introduces a Foundry Toolbox integration: FoundryChatClient gains a
get_toolbox() helper plus select_toolbox_tools(), normalize_tools in
the core package flattens tool-collection wrappers (ToolboxVersionObject
and generic iterables, while leaving Pydantic BaseModel instances
alone), and the new agent_framework.foundry namespace re-exports the
toolbox helpers. Ships with unit tests, a sample, and a design doc.

azure-ai-projects is pinned to the public >=2.0.0,<3.0 range and the
lockfile resolves from public PyPI. The toolbox test module skips when
Toolbox* types are unavailable so CI stays green until the public 2.1.0
SDK lands. OMC tooling directories (.omc/, .omx/) are gitignored.

* Update to latest azure ai projects package

* Improve sample

* Rename ADR to 0025

* Update ADR

* Apply suggestion from @alliscode

Co-authored-by: Ben Thomas <ben.thomas@microsoft.com>

* Improve samples

* Update test

---------

Co-authored-by: Ben Thomas <ben.thomas@microsoft.com>

Evan Mattson · 2026-04-20 23:56:01 +00:00

04aaf0c1fe

Python: Add Hyperlight CodeAct package and docs (#5185 )

* initial work on code_mode

* updated samples

* updates to codeact

* udpated codeact

* Draft CodeAct ADR and sample updates

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* initial implementation and adr and feature

* Python: Limit Hyperlight wasm backend to Python <3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix CI for Hyperlight CodeAct PR

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Run Hyperlight integration when available

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Address Hyperlight review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Simplify Hyperlight file mount inputs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Accept Path host paths in Hyperlight mounts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix Hyperlight mount typing for CI

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* temp run integration test

* Python: Strengthen Hyperlight real sandbox tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* added additional tests

* Python: Simplify Hyperlight CodeAct API

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* set tests as non-integration

* Retry Hyperlight allowed-domain registration

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Gate Hyperlight integration tests by runtime support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight skip test on Python 3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Delay Hyperlight runtime probe until test execution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Relax Hyperlight Windows integration stdout assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Scan Hyperlight output directory for artifacts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight output artifact collection

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Harden Hyperlight integration output assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight read-back check in integration test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify Hyperlight integration write assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Avoid pathlib in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use socket network check in Hyperlight sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Replace blocked Azure AI Search blog link

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Clarify Hyperlight guest stdlib limits

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use _socket in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Handle Hyperlight mounted file paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Broaden Hyperlight sandbox path fallbacks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Search Hyperlight guest mounts recursively

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight mount coverage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight live network tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight file-write test on Windows

Enable the sandbox filesystem by providing a workspace_root so
/output is mounted. Remove os.path.exists assertion (unsupported
in WASM guest) and fix Content data assertion to use .uri.
Skip the network integration test on Windows where the WASM
sandbox lacks the encodings.idna codec.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: ADR intro, manual wiring sample, doc clarifications

- Add CodeAct introduction section to ADR for unfamiliar readers
- Clarify 'less runtime efficient' con with specific overhead description
- Add note in Python impl doc clarifying ADR vs impl doc split
- Explain why before_run hooks must be per-run (CRUD, concurrency, approval)
- Rename code_interpreter variable to codeact in E2E sample
- Add manual static wiring sample (codeact_manual_wiring.py)
- Add 'when to use which pattern' guidance to samples README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #5185 review comments and add .NET CodeAct design doc

- Fix async callback: _make_sandbox_callback returns sync wrapper with
  thread + asyncio.run() bridge (was broken with real Wasm FFI)
- Fix stale output: clear output_dir before each sandbox.run() call
- Fix blocking event loop: _run_code now async with asyncio.to_thread()
- Revert _agents.py options['tools'] injection (unnecessary; provider
  uses context.extend_tools())
- Revert SessionContext.options docstring back to read-only
- Add real-sandbox test fixtures (shared/restored/fresh)
- Add 8 new real-sandbox tests for callback round-trip, stale output,
  event loop non-blocking, basic execution, stdout/stderr, errors,
  snapshot/restore, and tool registration
- Add comprehensive .NET HyperlightCodeActProvider design document

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update hyperlight README with code snippets and remove Public API section

Replace bare export list with Quick Start code examples covering the
context provider, standalone tool, manual static wiring, and file
mounts / network access patterns.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-17 00:49:44 +00:00

b03cb324d5

1 2 3 4 5 ...

475 Commits