mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
1acd242550
* Python: Add AgentLoopMiddleware for re-running agents in a loop

Add `AgentLoopMiddleware`, an `AgentMiddleware` that re-runs the wrapped agent in a loop. A single configurable class covers three common patterns, each with a convenience classmethod factory:

- Ralph loop (`.ralph(...)`): no exit criteria, with feedback tracking (`record_feedback`/`progress`), progress injection (`inject_progress`), optional fresh context per iteration (`fresh_context`), and an early-stop completion signal (`is_complete`).
- Predicate (`.with_predicate(...)`): loop while a `should_continue` callable returns True (e.g. paired with `todos_remaining`/`background_tasks_running`).
- Judge (`.with_judge(...)`): a second chat client decides whether the original request was answered, using a `JudgeVerdict` structured-output response.

The loop also auto-resolves pending function-approval / user-input requests via an `on_approval_request` callable (bounded by `max_approval_rounds`), and the next iteration's input is controlled by `next_message`. Supports both streaming and non-streaming runs.

Exports `AgentLoopMiddleware`, `JudgeVerdict`, `todos_remaining`, and `background_tasks_running`. Adds tests, a sample, and docs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Refine AgentLoopMiddleware API and sample

- with_judge: add criteria list with {{criteria}} templating into judge instructions plus an agent-side instruction; add fresh_context, additional judge feedback relay; default judge max_iterations.
- should_continue is now required and positional; supports (bool, str|None) feedback tuples surfaced to next_message/record_feedback via feedback kwarg.
- Judge forwards full multi-modal request and response messages.
- Default max_iterations=10 (explicit None = unbounded); removed is_complete and Ralph terminology; ShouldContinueResult is a real TypeAlias.
- Sample: stream all loops, print iteration counts via injected user-block boundaries (robust to function calling), <role>: content formatting, per-method expected output, and a looping todo sample.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix CI checks for AgentLoopMiddleware

- Resolve pyright errors in _loop.py: drop the always-true final_result None check (the while loop always assigns it) and cast finish_reason to the AgentResponse constructor's expected type.
- Apply pyupgrade --py310-plus: import TypeAlias from typing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Resolve mypy/pyright disagreement on finish_reason

pyright infers AgentResponse.finish_reason as including str and rejects the direct assignment, while mypy considers a cast redundant. Drop the cast and suppress only pyright with a targeted reportArgumentType ignore, satisfying both type checkers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Add todo+judge AgentLoopMiddleware sample

Add a second AgentLoopMiddleware sample that composes two criteria in one should_continue predicate: a TodoProvider check (evaluated first) and a report-style judge chat client (evaluated once todos are complete) that grades the assembled report against shared requirements. Register it in the middleware samples README.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Compose todo+judge loops as two middleware

Rework the todo+judge sample to compose two AgentLoopMiddleware on the agent itself (middleware=[judge_loop, todo_loop]) instead of a single hand-written predicate. The inner todos_remaining loop drafts the report todo-by-todo and the outer with_judge loop re-runs it until an editor chat client judges the report publication-ready, reusing the built-in helpers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Reset session for fresh_context loops via snapshot/restore

AgentLoopMiddleware.fresh_context previously only reset context.messages, so with an attached session each iteration still reloaded the local transcript or re-threaded the service-side conversation id and the model saw the accumulated history. Snapshot the session once before the loop (via to_dict) and restore it (from_dict + field copy) between iterations, so every pass starts from the pre-loop baseline. The final iteration's pass is persisted (no restore after the terminating iteration), so a subsequent agent.run continues from there.

Removed the obsolete warning, updated docstrings and core AGENTS.md, and added tests: a snapshot/restore round-trip, a session-reset streaming x fresh_context x inject_progress x store matrix across multiple runs and loop iterations, and response_format parsing across the loop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Updated samples and docstrings

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
119 lines
5.0 KiB
Python
# Copyright (c) Microsoft. All rights reserved.

import asyncio

from agent_framework import Agent, AgentLoopMiddleware
from agent_framework.foundry import FoundryChatClient
from azure.identity.aio import AzureCliCredential
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

"""
Agent Loop Middleware: ChatClient judge

This sample demonstrates ``AgentLoopMiddleware.with_judge(...)``: a second chat client decides (via a
``JudgeVerdict`` structured output) whether the original request was answered, and the loop continues
while the answer is "no". The judge's ``reasoning`` is fed back to the agent as the next iteration's
input, so the agent knows what is missing. The loop also passes a list of ``criteria``, which are
injected as an extra instruction for the agent and rendered into the judge's instructions.

The loop is run with streaming, so the judge's feedback between iterations shows up as a ``user``
update; the stream is printed as ``<role>: <content>`` lines.

Environment variables:
    FOUNDRY_PROJECT_ENDPOINT: Azure AI Foundry project endpoint URL
    FOUNDRY_MODEL: Model deployment name

Authentication:
    Run ``az login`` before running this sample.
"""


async def judge_loop(client: FoundryChatClient, judge_client: FoundryChatClient) -> None:
    """A second chat client judges whether the request was answered."""
    print("\n=== ChatClient judge (loop until the request is answered) ===")

    # 1. Provide a ``judge_client``. The middleware asks it (via a ``JudgeVerdict`` structured
    #    output) whether the original request has been fully addressed and continues while the
    #    answer is "no". The judge's ``reasoning`` is fed back to the agent as the next iteration's
    #    input, so the agent knows what is missing. Judge loops default to a small ``max_iterations``
    #    cap because each pass costs an extra model call.
    #
    #    ``criteria`` is a list of requirements the response must satisfy. The loop (a) injects them
    #    as an extra instruction for the agent before it runs and (b) renders them into the judge's
    #    instructions (the default judge prompt includes a ``{{criteria}}`` placeholder). Supply your
    #    own ``instructions`` string with ``{{criteria}}`` to control the wording, or omit ``criteria``
    #    entirely and pass a plain ``instructions`` string.
    loop = AgentLoopMiddleware.with_judge(
        judge_client,
        criteria=[
            "Mentions the moon",
            "Includes at least one good joke",
            "Is written as a single piece of fluent prose",
        ],
        max_iterations=4,
    )

    agent = Agent(
        client=client,
        name="answerer",
        instructions="You are a helpful assistant. Answer the user's question thoroughly.",
        middleware=[loop],
    )

    # 2. Run with streaming; the judge's feedback appears as a ``user`` update between iterations
    #    until the judge is satisfied (or the iteration cap is reached). Each contiguous ``user``
    #    block marks the boundary into the next iteration, so we count loop iterations by those
    #    boundaries (robust to function calling, where one iteration may issue several model calls).
    iterations = 1
    in_user_block = False
    assistant_open = False
    async for update in agent.run("Explain why the sky is blue and sunsets are red.", stream=True):
        if update.role == "user":
            if not in_user_block:
                iterations += 1
                in_user_block = True
            assistant_open = False
            print(f"\nuser: {update.text}", flush=True)
            continue
        in_user_block = False
        if update.text:
            if not assistant_open:
                print("\nassistant: ", end="", flush=True)
                assistant_open = True
            print(update.text, end="", flush=True)
    print(f"\n\nCompleted in {iterations} iteration(s).")


async def main() -> None:
    # A single credential is reused; the judge uses its own client instance.
    async with AzureCliCredential() as credential:
        client = FoundryChatClient(credential=credential)
        judge_client = FoundryChatClient(credential=credential)
        await judge_loop(client, judge_client)


if __name__ == "__main__":
    asyncio.run(main())


"""
Sample output (abridged; exact text varies by model):

=== ChatClient judge (loop until the request is answered) ===
assistant: The sky is blue because shorter (blue) wavelengths scatter more (Rayleigh scattering).
user: An evaluator reviewed your previous response and judged that it does not yet fully
address the original request.

Evaluator feedback: The response does not mention the moon.

Revise and continue so the original request is fully addressed.
assistant: The sky is blue because shorter (blue) wavelengths scatter more. At sunset, light travels
through more atmosphere, scattering away blue and leaving red/orange hues. The moon follows the
sky's colors because the same scattering applies to the light reaching it.

Completed in 2 iteration(s).
"""
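The judge pattern described in this sample (run the agent, ask a judge for a verdict, feed the judge's reasoning back as the next input, stop at an iteration cap) can be illustrated without any framework dependency. The following is a minimal sketch only, not how `AgentLoopMiddleware` is implemented: `Verdict`, `run_with_judge`, `toy_agent`, and `toy_judge` are hypothetical names invented for this illustration, and the agent and judge are plain functions standing in for model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical stand-in for a structured judge verdict."""

    answered: bool
    reasoning: str


def run_with_judge(
    agent: Callable[[str], str],
    judge: Callable[[str, str], Verdict],
    request: str,
    max_iterations: int = 4,
) -> tuple[str, int]:
    """Re-run `agent` until `judge` accepts the answer or the cap is reached."""
    prompt = request
    answer = ""
    for iteration in range(1, max_iterations + 1):
        answer = agent(prompt)
        # The judge always sees the original request, not the feedback prompt.
        verdict = judge(request, answer)
        if verdict.answered:
            return answer, iteration
        # Feed the judge's reasoning back as the next iteration's input.
        prompt = f"Evaluator feedback: {verdict.reasoning}\nRevise and continue."
    return answer, max_iterations


def toy_agent(prompt: str) -> str:
    """Toy agent (assumption): mentions the moon only once the prompt asks for it."""
    if "moon" in prompt:
        return "Blue light scatters more; the moon reddens near the horizon too."
    return "Blue light scatters more (Rayleigh scattering)."


def toy_judge(request: str, answer: str) -> Verdict:
    """Toy judge (assumption): checks a single criterion."""
    if "moon" in answer.lower():
        return Verdict(True, "All criteria met.")
    return Verdict(False, "The response does not mention the moon.")


final_answer, iterations = run_with_judge(
    toy_agent, toy_judge, "Explain why the sky is blue and sunsets are red."
)
print(f"Completed in {iterations} iteration(s).")
```

With these toy functions the first answer fails the judge and the second passes, mirroring the two-iteration flow in the sample output above. A judge that never accepts stops at `max_iterations`, which is why a cap matters when every pass costs an extra model call.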