mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
866a325b48
* Fix orchestration outputs so as_agent() returns the final answer only. Align other orchestration outputs

* Fix orchestration output issues from review comments

  1. Sample cleanup: Remove commented-out FoundryChatClient block and update prerequisites to reference OPENAI_CHAT_MODEL_ID instead of FOUNDRY_* vars.
  2. Sequential approval output: Change _EndWithConversation.end_with_agent_executor_response from a no-op sink to yield response.agent_response. When the last participant is AgentApprovalExecutor (via with_request_info), _EndWithConversation is the output executor so the yield produces the terminal answer. When the last participant is a regular AgentExecutor, _EndWithConversation is not in output_executors so the yield is silently filtered out.
  3. Forward data events through WorkflowExecutor: _process_workflow_result now also forwards 'data' events from sub-workflows so that emit_intermediate_data=True on AgentExecutor works correctly when wrapped in AgentApprovalExecutor.
  4. Concurrent docstring: Update _AggregateAgentConversations docstring to say 'deterministic participant order' instead of 'completion order'.
  5. Add test_concurrent_intermediate_outputs_emits_data_events verifying that ConcurrentBuilder(intermediate_outputs=True) emits per-participant data events alongside the single aggregated output event.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add tests for sequential workflow with_request_info and intermediate_outputs (#5301)

  Address PR review comments 2, 3, and 5:
  - Add test_sequential_request_info_last_participant_emits_output: Verifies that when the last participant is wrapped via with_request_info() (AgentApprovalExecutor), the workflow still emits a terminal output after approval, exercising the _EndWithConversation.end_with_agent_executor_response fallback path.
  - Add test_sequential_request_info_with_intermediate_outputs_emits_data_events: Verifies that emit_intermediate_data=True works correctly through AgentApprovalExecutor wrapping: WorkflowExecutor._process_result already forwards data events from sub-workflows, so intermediate agent responses surface as data events in the parent workflow.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright type errors from AgentResponse output refactor (#5301)

  Update cast() calls in _group_chat.py and _magentic.py to use WorkflowContext[Never, AgentResponse] instead of the old WorkflowContext[Never, list[Message]], matching the updated method signatures in _base_group_chat_orchestrator.py. Fix _sequential.py _EndWithConversation.end_with_agent_executor_response to declare WorkflowContext[Any, AgentResponse] so yield_output accepts AgentResponse[None]. Fix _workflow_executor.py data event forwarding to handle nullable executor_id.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright reportUnknownVariableType in _agent.py (#5301)

  Extract event.data into a typed local variable before the isinstance check to avoid pyright narrowing it to AgentResponse[Unknown].

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix pyright reportMissingImports for orjson in file history samples (#5301)

  Add pyright: ignore[reportMissingImports] to orjson imports that are already guarded by try/except ImportError, matching the existing pattern used elsewhere in the samples.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5301: review comment fixes

* Address review feedback for #5301: review comment fixes

* Revert sequential_workflow_as_agent sample to FoundryChatClient

  Reverts the mistaken switch from FoundryChatClient to OpenAIChatClient in the sequential workflow as agent sample.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address ultrareview feedback: emit_data_events rename + WorkflowAgent reasoning conversion

  Layered on top of the prior review-feedback work in this branch.

  Renames:
  - AgentExecutor.emit_intermediate_data -> emit_data_events (mechanical rename; orchestration semantics live at the orchestration layer, not the general-purpose executor). Forwarded through MagenticAgentExecutor, AgentApprovalExecutor, and all orchestration call sites.
  - HandoffAgentExecutor._check_terminate_and_yield -> _should_terminate (pure predicate; no longer yields anything). HandoffBuilder docstring rewritten to describe the new per-agent AgentResponse output contract.

  WorkflowAgent reasoning-content conversion:
  - Add _rewrite_text_to_reasoning(contents) and _msg_as_reasoning(msg) helpers; the as_agent() path now reframes text content from data events as text_reasoning Content blocks before merging into the AgentResponse.
  - Consumers iterate msg.contents and branch on content.type, the same path they already use for Claude thinking and OpenAI reasoning. No new field on Message/AgentResponse/WorkflowEvent.
  - Streaming branch constructs fresh AgentResponseUpdate instances instead of mutating shared payloads (regression test added).
  - Helper _msg_maybe_reasoning consolidates the conditional rewrite at three call sites in the non-streaming conversion.

  Tests:
  - TestWorkflowAgentReasoningHelpers + TestWorkflowAgentDataEventReasoningConversion add 9 new tests covering helpers, non-streaming, streaming, mixed content, already-reasoning passthrough, and mutation-safety regression.
  - Updated test_sequential_as_agent_with_intermediate_outputs_includes_chain to assert text_reasoning content for intermediate agents.

* Fix pyright: widen event.data to Any to avoid partial-unknown narrowing

  The streaming conversion path narrowed event.data via isinstance against generic AgentResponse, producing AgentResponse[Unknown] and tripping reportUnknownVariableType/reportUnknownMemberType. Binding data: Any before the check keeps runtime behavior identical while restoring a fully known type for downstream access.

* Clean up design

* Scope to agent output semantics only

* yield AgentResponseUpdate streaming, AgentResponse non-streaming

* Fix mypy/pyright: widen cast types at GroupChat callsites

  Eight callsites in _group_chat.py still cast to WorkflowContext[Never, AgentResponse] but the base orchestrator methods now accept the wider WorkflowContext[Never, AgentResponse | AgentResponseUpdate] (mode-aware yields). W_OutT is invariant, so the narrower cast is not assignable. Magentic was widened in the same commit; this catches the GroupChat callsites that were missed.

* Python: skip flaky Foundry / Foundry Hosting integration tests (#5553)

  These two integration tests have been failing in the merge queue across multiple unrelated PRs (5301, 5531). Both are marked `@pytest.mark.flaky` with 3 retries, but all attempts fail back-to-back. Skipping both with a reason pointing to #5553 so they can be fixed properly without continuing to block unrelated merges.
  - packages/foundry_hosting/tests/test_responses_int.py::TestOptions::test_temperature_and_max_tokens
  - packages/foundry/tests/foundry/test_foundry_embedding_client.py::TestFoundryEmbeddingIntegration::test_text_embedding_live

  Also includes a one-line uv.lock specifier-ordering normalization auto-applied by the poe-check pre-commit hook.

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
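The "widen event.data to Any" fix described in the commit message can be illustrated with a minimal, self-contained sketch. `Response` and `Event` below are hypothetical stand-ins invented for this illustration, not the framework's real `AgentResponse` or workflow event classes, and the remark about type-checker behavior simply restates what the commit message reports.

```python
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Response(Generic[T]):
    """Hypothetical generic response type (stand-in for a class like AgentResponse)."""

    text: str
    value: Optional[T] = None


@dataclass
class Event:
    """Hypothetical event whose payload has no useful static type."""

    data: object


def extract_text(event: Event) -> Optional[str]:
    # Bind the payload to an explicitly typed local before the isinstance check.
    # Per the commit message, narrowing event.data directly against a bare
    # generic class left pyright with a partially unknown type; going through
    # Any keeps the runtime behavior identical.
    data: Any = event.data
    if isinstance(data, Response):
        return data.text
    return None
```

Calling `extract_text(Event(data=Response(text="hi")))` returns the response text, while any other payload yields `None`; only the static type seen by the checker changes, not what runs.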
377 lines
14 KiB
Python
# Copyright (c) Microsoft. All rights reserved.

from collections.abc import AsyncIterable, Awaitable
from typing import Any, Literal, cast, overload

import pytest
from agent_framework import (
    AgentExecutorRequest,
    AgentExecutorResponse,
    AgentResponse,
    AgentResponseUpdate,
    AgentRunInputs,
    AgentSession,
    BaseAgent,
    Content,
    Executor,
    Message,
    ResponseStream,
    WorkflowContext,
    WorkflowRunState,
    handler,
)
from agent_framework._workflows._checkpoint import InMemoryCheckpointStorage
from agent_framework.orchestrations import ConcurrentBuilder
from typing_extensions import Never


class _FakeAgentExec(Executor):
    """Test executor that mimics an agent by emitting an AgentExecutorResponse.

    It takes the incoming AgentExecutorRequest, produces a single assistant message
    with the configured reply text, and sends an AgentExecutorResponse that includes
    full_conversation (the original user prompt followed by the assistant message).
    """

    def __init__(self, id: str, reply_text: str) -> None:
        super().__init__(id)
        self._reply_text = reply_text

    @handler
    async def run(self, request: AgentExecutorRequest, ctx: WorkflowContext[AgentExecutorResponse]) -> None:
        response = AgentResponse(messages=Message(role="assistant", contents=[self._reply_text]))
        full_conversation = list(request.messages) + list(response.messages)
        await ctx.send_message(AgentExecutorResponse(self.id, response, full_conversation=full_conversation))


def test_concurrent_builder_rejects_empty_participants() -> None:
    with pytest.raises(ValueError):
        ConcurrentBuilder(participants=[])


def test_concurrent_builder_rejects_duplicate_executors() -> None:
    a = _FakeAgentExec("dup", "A")
    b = _FakeAgentExec("dup", "B")  # same executor id
    with pytest.raises(ValueError):
        ConcurrentBuilder(participants=[a, b])


async def test_concurrent_default_aggregator_emits_assistants_only() -> None:
    """Default aggregator yields a single AgentResponse with one assistant message per participant.

    The user prompt is intentionally not included; it belongs in the input, not the answer.
    """
    e1 = _FakeAgentExec("agentA", "Alpha")
    e2 = _FakeAgentExec("agentB", "Beta")
    e3 = _FakeAgentExec("agentC", "Gamma")

    wf = ConcurrentBuilder(participants=[e1, e2, e3]).build()

    output_events = [ev for ev in await wf.run("prompt: hello world") if ev.type == "output"]
    assert len(output_events) == 1
    response = output_events[0].data
    assert isinstance(response, AgentResponse)

    # Exactly one assistant message per participant; no user prompt.
    assert len(response.messages) == 3
    assert all(m.role == "assistant" for m in response.messages)
    assert {m.text for m in response.messages} == {"Alpha", "Beta", "Gamma"}


async def test_concurrent_custom_aggregator_callback_is_used() -> None:
    # Two synthetic agent executors for brevity
    e1 = _FakeAgentExec("agentA", "One")
    e2 = _FakeAgentExec("agentB", "Two")

    async def summarize(results: list[AgentExecutorResponse]) -> str:
        texts: list[str] = []
        for r in results:
            msgs: list[Message] = r.agent_response.messages
            texts.append(msgs[-1].text if msgs else "")
        return " | ".join(sorted(texts))

    wf = ConcurrentBuilder(participants=[e1, e2]).with_aggregator(summarize).build()

    completed = False
    output: str | None = None
    async for ev in wf.run("prompt: custom", stream=True):
        if ev.type == "status" and ev.state == WorkflowRunState.IDLE:
            completed = True
        elif ev.type == "output":
            output = cast(str, ev.data)
        if completed and output is not None:
            break

    assert completed
    assert output is not None
    # Custom aggregator returns a string payload
    assert isinstance(output, str)
    assert output == "One | Two"


async def test_concurrent_custom_aggregator_sync_callback_is_used() -> None:
    e1 = _FakeAgentExec("agentA", "One")
    e2 = _FakeAgentExec("agentB", "Two")

    # Sync callback with ctx parameter (should run via asyncio.to_thread)
    def summarize_sync(results: list[AgentExecutorResponse], _ctx: WorkflowContext[Any]) -> str:  # type: ignore[unused-argument]
        texts: list[str] = []
        for r in results:
            msgs: list[Message] = r.agent_response.messages
            texts.append(msgs[-1].text if msgs else "")
        return " | ".join(sorted(texts))

    wf = ConcurrentBuilder(participants=[e1, e2]).with_aggregator(summarize_sync).build()

    completed = False
    output: str | None = None
    async for ev in wf.run("prompt: custom sync", stream=True):
        if ev.type == "status" and ev.state == WorkflowRunState.IDLE:
            completed = True
        elif ev.type == "output":
            output = cast(str, ev.data)
        if completed and output is not None:
            break

    assert completed
    assert output is not None
    assert isinstance(output, str)
    assert output == "One | Two"


def test_concurrent_custom_aggregator_uses_callback_name_for_id() -> None:
    e1 = _FakeAgentExec("agentA", "One")
    e2 = _FakeAgentExec("agentB", "Two")

    def summarize(results: list[AgentExecutorResponse]) -> str:  # type: ignore[override]
        return str(len(results))

    wf = ConcurrentBuilder(participants=[e1, e2]).with_aggregator(summarize).build()

    assert "summarize" in wf.executors
    aggregator = wf.executors["summarize"]
    assert aggregator.id == "summarize"


async def test_concurrent_with_aggregator_executor_instance() -> None:
    """Test with_aggregator using an Executor instance (not factory)."""

    class CustomAggregator(Executor):
        @handler
        async def aggregate(self, results: list[AgentExecutorResponse], ctx: WorkflowContext[Never, str]) -> None:
            texts: list[str] = []
            for r in results:
                msgs: list[Message] = r.agent_response.messages
                texts.append(msgs[-1].text if msgs else "")
            await ctx.yield_output(" & ".join(sorted(texts)))

    e1 = _FakeAgentExec("agentA", "One")
    e2 = _FakeAgentExec("agentB", "Two")

    aggregator_instance = CustomAggregator(id="instance_aggregator")
    wf = ConcurrentBuilder(participants=[e1, e2]).with_aggregator(aggregator_instance).build()

    completed = False
    output: str | None = None
    async for ev in wf.run("prompt: instance test", stream=True):
        if ev.type == "status" and ev.state == WorkflowRunState.IDLE:
            completed = True
        elif ev.type == "output":
            output = cast(str, ev.data)
        if completed and output is not None:
            break

    assert completed
    assert output is not None
    assert isinstance(output, str)
    assert output == "One & Two"


def test_concurrent_builder_rejects_multiple_calls_to_with_aggregator() -> None:
    """Test that calling .with_aggregator() more than once raises an error."""

    def summarize(results: list[AgentExecutorResponse]) -> str:  # type: ignore[override]
        return str(len(results))

    with pytest.raises(ValueError, match=r"with_aggregator\(\) has already been called"):
        (
            ConcurrentBuilder(participants=[_FakeAgentExec("a", "A")])
            .with_aggregator(summarize)
            .with_aggregator(summarize)
        )


async def test_concurrent_checkpoint_resume_round_trip() -> None:
    storage = InMemoryCheckpointStorage()

    participants = (
        _FakeAgentExec("agentA", "Alpha"),
        _FakeAgentExec("agentB", "Beta"),
        _FakeAgentExec("agentC", "Gamma"),
    )

    wf = ConcurrentBuilder(participants=list(participants), checkpoint_storage=storage).build()

    baseline_output: AgentResponse | None = None
    async for ev in wf.run("checkpoint concurrent", stream=True):
        if ev.type == "output":
            baseline_output = ev.data  # type: ignore[assignment]
        if ev.type == "status" and ev.state == WorkflowRunState.IDLE:
            break

    assert baseline_output is not None

    checkpoints = await storage.list_checkpoints(workflow_name=wf.name)
    assert checkpoints
    checkpoints.sort(key=lambda cp: cp.timestamp)
    resume_checkpoint = checkpoints[1]

    resumed_participants = (
        _FakeAgentExec("agentA", "Alpha"),
        _FakeAgentExec("agentB", "Beta"),
        _FakeAgentExec("agentC", "Gamma"),
    )
    wf_resume = ConcurrentBuilder(participants=list(resumed_participants), checkpoint_storage=storage).build()

    resumed_output: AgentResponse | None = None
    async for ev in wf_resume.run(checkpoint_id=resume_checkpoint.checkpoint_id, stream=True):
        if ev.type == "output":
            resumed_output = ev.data  # type: ignore[assignment]
        if ev.type == "status" and ev.state in (
            WorkflowRunState.IDLE,
            WorkflowRunState.IDLE_WITH_PENDING_REQUESTS,
        ):
            break

    assert resumed_output is not None
    assert [m.role for m in resumed_output.messages] == [m.role for m in baseline_output.messages]
    assert [m.text for m in resumed_output.messages] == [m.text for m in baseline_output.messages]


async def test_concurrent_checkpoint_runtime_only() -> None:
    """Test checkpointing configured ONLY at runtime, not at build time."""
    storage = InMemoryCheckpointStorage()

    agents = [_FakeAgentExec(id="agent1", reply_text="A1"), _FakeAgentExec(id="agent2", reply_text="A2")]
    wf = ConcurrentBuilder(participants=agents).build()

    baseline_output: AgentResponse | None = None
    async for ev in wf.run("runtime checkpoint test", checkpoint_storage=storage, stream=True):
        if ev.type == "output":
            baseline_output = ev.data  # type: ignore[assignment]
        if ev.type == "status" and ev.state == WorkflowRunState.IDLE:
            break

    assert baseline_output is not None

    checkpoints = await storage.list_checkpoints(workflow_name=wf.name)
    assert len(checkpoints) >= 2, (
        "Expected at least 2 checkpoints. The first one is after the start executor, "
        "and the second one is after the first round of agent executions."
    )
    checkpoints.sort(key=lambda cp: cp.timestamp)
    resume_checkpoint = checkpoints[1]

    resumed_agents = [_FakeAgentExec(id="agent1", reply_text="A1"), _FakeAgentExec(id="agent2", reply_text="A2")]
    wf_resume = ConcurrentBuilder(participants=resumed_agents).build()

    resumed_output: AgentResponse | None = None
    async for ev in wf_resume.run(
        checkpoint_id=resume_checkpoint.checkpoint_id, checkpoint_storage=storage, stream=True
    ):
        if ev.type == "output":
            resumed_output = ev.data  # type: ignore[assignment]
        if ev.type == "status" and ev.state in (
            WorkflowRunState.IDLE,
            WorkflowRunState.IDLE_WITH_PENDING_REQUESTS,
        ):
            break

    assert resumed_output is not None
    assert [m.role for m in resumed_output.messages] == [m.role for m in baseline_output.messages]


async def test_concurrent_checkpoint_runtime_overrides_buildtime() -> None:
    """Test that runtime checkpoint storage overrides build-time configuration."""
    import tempfile

    with tempfile.TemporaryDirectory() as temp_dir1, tempfile.TemporaryDirectory() as temp_dir2:
        from agent_framework._workflows._checkpoint import FileCheckpointStorage

        buildtime_storage = FileCheckpointStorage(temp_dir1)
        runtime_storage = FileCheckpointStorage(temp_dir2)

        agents = [_FakeAgentExec(id="agent1", reply_text="A1"), _FakeAgentExec(id="agent2", reply_text="A2")]
        wf = ConcurrentBuilder(participants=agents, checkpoint_storage=buildtime_storage).build()

        # The default aggregator yields an AgentResponse, matching the other checkpoint tests.
        baseline_output: AgentResponse | None = None
        async for ev in wf.run("override test", checkpoint_storage=runtime_storage, stream=True):
            if ev.type == "output":
                baseline_output = ev.data  # type: ignore[assignment]
            if ev.type == "status" and ev.state == WorkflowRunState.IDLE:
                break

        assert baseline_output is not None

        buildtime_checkpoints = await buildtime_storage.list_checkpoints(workflow_name=wf.name)
        runtime_checkpoints = await runtime_storage.list_checkpoints(workflow_name=wf.name)

        assert len(runtime_checkpoints) > 0, "Runtime storage should have checkpoints"
        assert len(buildtime_checkpoints) == 0, "Build-time storage should have no checkpoints when overridden"


async def test_concurrent_builder_reusable_after_build_with_participants() -> None:
    """Test that the builder can be reused to build multiple identical workflows with participants()."""
    e1 = _FakeAgentExec("agentA", "One")
    e2 = _FakeAgentExec("agentB", "Two")

    builder = ConcurrentBuilder(participants=[e1, e2])

    builder.build()

    assert builder._participants[0] is e1  # type: ignore
    assert builder._participants[1] is e2  # type: ignore


class _EchoAgent(BaseAgent):
    """Simple agent that appends a single assistant message with its name."""

    @overload
    def run(
        self,
        messages: AgentRunInputs | None = ...,
        *,
        stream: Literal[False] = ...,
        session: AgentSession | None = ...,
        **kwargs: Any,
    ) -> Awaitable[AgentResponse[Any]]: ...
    @overload
    def run(
        self,
        messages: AgentRunInputs | None = ...,
        *,
        stream: Literal[True],
        session: AgentSession | None = ...,
        **kwargs: Any,
    ) -> ResponseStream[AgentResponseUpdate, AgentResponse[Any]]: ...

    def run(
        self,
        messages: AgentRunInputs | None = None,
        *,
        stream: bool = False,
        session: AgentSession | None = None,
        **kwargs: Any,
    ) -> Awaitable[AgentResponse[Any]] | ResponseStream[AgentResponseUpdate, AgentResponse[Any]]:
        if stream:

            async def _stream() -> AsyncIterable[AgentResponseUpdate]:
                yield AgentResponseUpdate(contents=[Content.from_text(text=f"{self.name} reply")])

            return ResponseStream(_stream(), finalizer=AgentResponse.from_updates)

        async def _run() -> AgentResponse:
            return AgentResponse(messages=[Message("assistant", [f"{self.name} reply"])])

        return _run()