mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
Python: Add AgentLoopMiddleware for re-running agents in a loop (#6174)
* Python: Add AgentLoopMiddleware for re-running agents in a loop Add `AgentLoopMiddleware`, an `AgentMiddleware` that re-runs the wrapped agent in a loop. A single configurable class covers three common patterns, each with a convenience classmethod factory: - Ralph loop (`.ralph(...)`): no exit criteria, with feedback tracking (`record_feedback`/`progress`), progress injection (`inject_progress`), optional fresh context per iteration (`fresh_context`), and an early-stop completion signal (`is_complete`). - Predicate (`.with_predicate(...)`): loop while a `should_continue` callable returns True (e.g. paired with `todos_remaining`/`background_tasks_running`). - Judge (`.with_judge(...)`): a second chat client decides whether the original request was answered, using a `JudgeVerdict` structured-output response. The loop also auto-resolves pending function-approval / user-input requests via an `on_approval_request` callable (bounded by `max_approval_rounds`), and the next iteration's input is controlled by `next_message`. Supports both streaming and non-streaming runs. Exports `AgentLoopMiddleware`, `JudgeVerdict`, `todos_remaining`, and `background_tasks_running`. Adds tests, a sample, and docs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Refine AgentLoopMiddleware API and sample - with_judge: add criteria list with {{criteria}} templating into judge instructions plus an agent-side instruction; add fresh_context, additional judge feedback relay; default judge max_iterations. - should_continue is now required and positional; supports (bool, str|None) feedback tuples surfaced to next_message/record_feedback via feedback kwarg. - Judge forwards full multi-modal request and response messages. - Default max_iterations=10 (explicit None = unbounded); removed is_complete and Ralph terminology; ShouldContinueResult is a real TypeAlias. - Sample: stream all loops, print iteration counts via injected user-block boundaries (robust to function calling), <role>: content formatting, per-method expected output, and a looping todo sample. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Fix CI checks for AgentLoopMiddleware - Resolve pyright errors in _loop.py: drop the always-true final_result None check (the while loop always assigns it) and cast finish_reason to the AgentResponse constructor's expected type. - Apply pyupgrade --py310-plus: import TypeAlias from typing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Resolve mypy/pyright disagreement on finish_reason pyright infers AgentResponse.finish_reason as including str and rejects the direct assignment, while mypy considers a cast redundant. Drop the cast and suppress only pyright with a targeted reportArgumentType ignore, satisfying both type checkers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Add todo+judge AgentLoopMiddleware sample Add a second AgentLoopMiddleware sample that composes two criteria in one should_continue predicate: a TodoProvider check (evaluated first) and a report-style judge chat client (evaluated once todos are complete) that grades the assembled report against shared requirements. Register it in the middleware samples README. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Compose todo+judge loops as two middleware Rework the todo+judge sample to compose two AgentLoopMiddleware on the agent itself (middleware=[judge_loop, todo_loop]) instead of a single hand-written predicate. The inner todos_remaining loop drafts the report todo-by-todo and the outer with_judge loop re-runs it until an editor chat client judges the report publication-ready, reusing the built-in helpers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Reset session for fresh_context loops via snapshot/restore AgentLoopMiddleware.fresh_context previously only reset context.messages, so with an attached session each iteration still reloaded the local transcript or re-threaded the service-side conversation id and the model saw the accumulated history. Snapshot the session once before the loop (via to_dict) and restore it (from_dict + field copy) between iterations, so every pass starts from the pre-loop baseline. The final iteration's pass is persisted (no restore after the terminating iteration), so a subsequent agent.run continues from there. Removed the obsolete warning, updated docstrings and core AGENTS.md, and added tests: a snapshot/restore round-trip, a session-reset streaming x fresh_context x inject_progress x store matrix across multiple runs and loop iterations, and response_format parsing across the loop. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Updated samples and docstrings --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
committed by
GitHub
Unverified
parent
3f77c555cf
commit
1acd242550
@@ -116,6 +116,11 @@ agent_framework/
|
||||
available, approval requests for known non-approval-required tools are treated as already approved, hidden, stored
|
||||
in session state keyed to the visible approval request ids from that batch, and reinjected only when that visible
|
||||
approval flow resumes.
|
||||
### Agent Loop (`_harness/_loop.py`)
|
||||
|
||||
- **`AgentLoopMiddleware`** - `AgentMiddleware` that re-runs an agent in a loop by calling `call_next()` repeatedly (the pipeline re-reads `context.messages` each time). One configurable class covers two patterns: a required user `should_continue` predicate (sync or async, the first positional/keyword arg), and a chat-client judge built via the `.with_judge(...)` factory (a second chat client decides whether the original request was answered; loops while it is *not*, using a `JudgeVerdict` structured-output response — internally just an async `should_continue` predicate). The constructor covers the predicate pattern directly; only the judge has a convenience classmethod factory (`.with_judge(judge_client, ...)`) that forwards to `__init__`. Supports both streaming and non-streaming runs. By default a non-streaming run returns an aggregated `AgentResponse` containing every iteration's messages plus the injected `next_message` "nudge" messages (as `user` messages); set `return_final_only=True` to return only the last iteration's response. Streaming runs always yield each iteration's updates and emit the injected nudge messages as `user` updates between iterations (the `return_final_only` flag has no effect on streaming, and the final response reflects the last iteration; `MiddlewareTermination` is handled cleanly). `should_continue` is required; other constructor args are optional: `max_iterations` (safety cap; defaults to `DEFAULT_MAX_ITERATIONS`=10, explicit `None`→unbounded, positive int caps; `.with_judge` uses `DEFAULT_JUDGE_MAX_ITERATIONS`=5 as its default), `next_message` (defaults to a short "continue" nudge), `return_final_only`, and `additional_instructions` (an extra `system` message injected ahead of the input before the agent runs — becomes part of the original messages so it survives `fresh_context` resets and persists via a session). The judge is configured only through `.with_judge` (`judge_client`/`instructions`/`criteria`), not the constructor, and its `reasoning` is fed back to the agent as the next iteration's input; the judge forwards the original request messages and the agent's latest response messages verbatim so multi-modal content is preserved. `criteria` (a `list[str]`) is both injected as the agent's `additional_instructions` and rendered into the judge instructions wherever the `{{criteria}}` placeholder (`CRITERIA_PLACEHOLDER`) appears (`DEFAULT_JUDGE_INSTRUCTIONS` ends with it; custom `instructions` may include it, and it is stripped when no criteria are given). The `should_continue`/`next_message` callables are invoked with keyword args (`iteration`, `last_result`, `messages`, `original_messages`, `session`, `agent`, `progress`, `feedback`) and may be sync or async; declare only what you need plus `**kwargs`. `should_continue` may return a plain `bool` or a `(bool, str | None)` tuple whose second item is feedback surfaced to `next_message`/`record_feedback` via the `feedback` kwarg (the judge uses this to relay its `reasoning`). Stop precedence per iteration is `max_iterations` → `should_continue`, evaluated before `record_feedback` so the feedback is available to it.
|
||||
- **Feedback tracking** - `record_feedback` captures a per-iteration progress entry (called with the loop kwargs; if it returns a truthy string the entry is appended, otherwise the agent's response text is used as the fallback entry). The accumulated log is exposed to every callback via the `progress` keyword (a per-iteration copy of prior entries) and, when `inject_progress=True` (default), injected into the next iteration's input as a `user` message (the full log without a session, only the latest entry with a session to avoid duplicating history). `fresh_context=True` restarts each iteration from the original task plus the progress log; when a session is attached it is snapshotted (`to_dict()`) before the loop and restored (`from_dict` + field copy) between iterations so the local transcript and any service-side conversation id reset too (in-loop working-state is discarded, pre-loop state preserved, continuity carried only by the progress log).
|
||||
- **`todos_remaining(provider)`** / **`background_tasks_running(provider)`** - Helper factories returning `should_continue` predicates that loop while a `TodoProvider` has open items, or while a `BackgroundAgentsProvider`'s persisted state shows running tasks.
|
||||
|
||||
### Workflows (`_workflows/`)
|
||||
|
||||
|
||||
@@ -102,6 +102,12 @@ from ._harness._file_access import (
|
||||
FileSystemAgentFileStore,
|
||||
InMemoryAgentFileStore,
|
||||
)
|
||||
from ._harness._loop import (
|
||||
AgentLoopMiddleware,
|
||||
JudgeVerdict,
|
||||
background_tasks_running,
|
||||
todos_remaining,
|
||||
)
|
||||
from ._harness._memory import (
|
||||
DEFAULT_MEMORY_SOURCE_ID,
|
||||
MemoryContextProvider,
|
||||
@@ -363,6 +369,7 @@ __all__ = [
|
||||
"AgentExecutorResponse",
|
||||
"AgentFileStore",
|
||||
"AgentFrameworkException",
|
||||
"AgentLoopMiddleware",
|
||||
"AgentMiddleware",
|
||||
"AgentMiddlewareLayer",
|
||||
"AgentMiddlewareTypes",
|
||||
@@ -454,6 +461,7 @@ __all__ = [
|
||||
"InlineSkill",
|
||||
"InlineSkillResource",
|
||||
"InlineSkillScript",
|
||||
"JudgeVerdict",
|
||||
"LocalEvaluator",
|
||||
"MCPSkill",
|
||||
"MCPSkillResource",
|
||||
@@ -558,6 +566,7 @@ __all__ = [
|
||||
"agent_middleware",
|
||||
"annotate_message_groups",
|
||||
"apply_compaction",
|
||||
"background_tasks_running",
|
||||
"chat_middleware",
|
||||
"create_always_approve_tool_response",
|
||||
"create_always_approve_tool_with_arguments_response",
|
||||
@@ -588,6 +597,7 @@ __all__ = [
|
||||
"response_handler",
|
||||
"set_agent_mode",
|
||||
"step",
|
||||
"todos_remaining",
|
||||
"tool",
|
||||
"tool_call_args_match",
|
||||
"tool_called_check",
|
||||
|
||||
@@ -0,0 +1,796 @@
|
||||
# Copyright (c) Microsoft. All rights reserved.
|
||||
|
||||
"""AgentLoopMiddleware: re-run an agent in a loop until a criterion is met.
|
||||
|
||||
This module provides :class:`AgentLoopMiddleware`, an :class:`~agent_framework.AgentMiddleware`
|
||||
that repeatedly re-invokes the wrapped agent while a ``should_continue`` predicate says to keep
|
||||
going. It serves two common patterns through a single configurable class:
|
||||
|
||||
1. A user-supplied ``should_continue`` predicate - for example, keep looping while a response does
|
||||
not yet contain a completion marker, while a :class:`~agent_framework.TodoProvider` still has
|
||||
open items, or while a :class:`~agent_framework.BackgroundAgentsProvider` still has running
|
||||
tasks (see the :func:`todos_remaining` and :func:`background_tasks_running` helpers). The loop
|
||||
can track a **feedback log** across iterations (``record_feedback``): each pass contributes an
|
||||
entry that is exposed to every callback via the ``progress`` keyword and (by default) injected
|
||||
into the next iteration's input. Set ``fresh_context=True`` to restart each pass from the
|
||||
original task plus the progress log (with a session attached, the session is also snapshotted
|
||||
before the loop and restored between iterations so no accumulated history leaks back in).
|
||||
``max_iterations`` bounds the loop as a safety cap.
|
||||
2. A chat-client judge (via :meth:`AgentLoopMiddleware.with_judge`) - a second chat client decides
|
||||
whether the user's original request has been answered (via a :class:`JudgeVerdict` structured
|
||||
output); the loop continues while the answer is "no". This is a convenience wrapper that builds an
|
||||
async ``should_continue`` predicate, so it is a special case of (1).
|
||||
|
||||
In every case, the input for the next iteration is controlled by the ``next_message`` callable.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import inspect
|
||||
from collections.abc import Awaitable, Callable, Sequence
|
||||
from typing import TYPE_CHECKING, Any, TypeAlias
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
from typing_extensions import Self
|
||||
|
||||
from .._feature_stage import ExperimentalFeature, experimental
|
||||
from .._middleware import AgentContext, AgentMiddleware, MiddlewareTermination
|
||||
from .._types import (
|
||||
AgentResponse,
|
||||
AgentResponseUpdate,
|
||||
AgentRunInputs,
|
||||
Message,
|
||||
ResponseStream,
|
||||
UsageDetails,
|
||||
add_usage_details,
|
||||
normalize_messages,
|
||||
)
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from .._clients import SupportsChatGetResponse
|
||||
|
||||
__all__ = [
|
||||
"AgentLoopMiddleware",
|
||||
"JudgeVerdict",
|
||||
"background_tasks_running",
|
||||
"todos_remaining",
|
||||
]
|
||||
|
||||
DEFAULT_NEXT_MESSAGE = "Continue working on the task. If it is complete, say so."
|
||||
|
||||
# Placeholder substituted with the rendered ``criteria`` block in judge instructions (see
|
||||
# :meth:`AgentLoopMiddleware.with_judge`). User-supplied instructions may include it to control
|
||||
# where the criteria are inserted; if absent, the criteria are not added to the judge instructions.
|
||||
CRITERIA_PLACEHOLDER = "{{criteria}}"
|
||||
|
||||
# Verdict markers the judge is asked to emit for clients that do not honor structured output. They
|
||||
# are deliberately non-overlapping: neither marker is a substring of the other, nor of the JSON
|
||||
# field name ``answered``, so the text fallback in :func:`_build_judge_condition` cannot misclassify
|
||||
# a negative verdict (e.g. ``{"answered": false}``) as a positive one.
|
||||
JUDGE_VERDICT_DONE = "VERDICT: DONE"
|
||||
JUDGE_VERDICT_MORE = "VERDICT: MORE"
|
||||
|
||||
DEFAULT_JUDGE_INSTRUCTIONS = (
|
||||
"You are an evaluator. You are given a user's original request and an agent's latest response. "
|
||||
"Decide whether the agent has fully addressed the original request. "
|
||||
"Set 'answered' to true if the request has been fully addressed, or false if more work is still "
|
||||
"required, and use 'reasoning' to briefly justify your decision. "
|
||||
f"If you cannot return structured output, end your reply with a line reading exactly "
|
||||
f"'{JUDGE_VERDICT_DONE}' when the request has been fully addressed or '{JUDGE_VERDICT_MORE}' "
|
||||
f"when more work is still required."
|
||||
"{{criteria}}"
|
||||
)
|
||||
|
||||
|
||||
def _render_criteria_block(criteria: Sequence[str] | None) -> str:
|
||||
"""Render a list of criteria into a bullet block for the judge instructions (``""`` if none)."""
|
||||
if not criteria:
|
||||
return ""
|
||||
bullets = "\n".join(f"- {item}" for item in criteria)
|
||||
return f"\n\nThe response must satisfy all of the following criteria:\n{bullets}"
|
||||
|
||||
|
||||
def _criteria_agent_instruction(criteria: Sequence[str]) -> str:
|
||||
"""Render the criteria into an extra instruction injected for the agent before each run."""
|
||||
bullets = "\n".join(f"- {item}" for item in criteria)
|
||||
return f"Your response must satisfy all of the following criteria:\n{bullets}"
|
||||
|
||||
|
||||
class JudgeVerdict(BaseModel):
|
||||
"""Structured verdict returned by the judge chat client."""
|
||||
|
||||
answered: bool = Field(
|
||||
description=(
|
||||
"True if the agent has fully addressed the original request and it adheres to the other "
|
||||
"judging standards, otherwise False."
|
||||
),
|
||||
)
|
||||
reasoning: str = Field(
|
||||
default="",
|
||||
description="Brief justification for the verdict.",
|
||||
)
|
||||
|
||||
|
||||
# Default iteration cap applied when ``max_iterations`` is not provided. Loops are bounded by
|
||||
# default to guard against runaway re-invocation; pass ``max_iterations=None`` explicitly to opt
|
||||
# into an unbounded loop.
|
||||
DEFAULT_MAX_ITERATIONS = 10
|
||||
|
||||
# Default iteration cap for judge-driven loops. LLM-judged loops are costly and probabilistic, so
|
||||
# they are bounded by a smaller default. Pass ``max_iterations=None`` explicitly to opt into an
|
||||
# unbounded judge loop.
|
||||
DEFAULT_JUDGE_MAX_ITERATIONS = 5
|
||||
|
||||
|
||||
# A callable invoked between iterations. It always receives the loop keyword arguments
|
||||
# (``iteration``, ``last_result``, ``messages``, ``original_messages``, ``session``, ``agent``,
|
||||
# ``progress``, ``feedback``). Callers declare only the keywords they need plus ``**kwargs`` to
|
||||
# ignore the rest. ``should_continue`` may return a plain ``bool`` (continue/stop) or a
|
||||
# ``(bool, str | None)`` tuple whose second item is feedback surfaced to the ``next_message`` and
|
||||
# ``record_feedback`` callables via the ``feedback`` keyword argument.
|
||||
ShouldContinueResult: TypeAlias = "bool | tuple[bool, str | None]"
|
||||
ShouldContinueCallable = Callable[..., "ShouldContinueResult | Awaitable[ShouldContinueResult]"]
|
||||
NextMessageCallable = Callable[..., "AgentRunInputs | Awaitable[AgentRunInputs | None] | None"]
|
||||
|
||||
# A callable invoked once per work iteration to capture a progress-log entry from that iteration. It
|
||||
# receives the loop keyword arguments and returns a string entry (appended to the log) or ``None``
|
||||
# (record nothing for that iteration).
|
||||
FeedbackCallable = Callable[..., "str | Awaitable[str | None] | None"]
|
||||
|
||||
|
||||
async def _maybe_await(value: Any) -> Any:
|
||||
"""Await ``value`` if it is awaitable, otherwise return it as-is."""
|
||||
if inspect.isawaitable(value):
|
||||
return await value
|
||||
return value
|
||||
|
||||
|
||||
def _build_judge_condition(
|
||||
judge_client: SupportsChatGetResponse,
|
||||
instructions: str,
|
||||
) -> tuple[ShouldContinueCallable, NextMessageCallable]:
|
||||
"""Build the ``should_continue`` predicate and ``next_message`` callable for a judge loop.
|
||||
|
||||
The judge is called directly (no agent tools, session, or middleware) with fresh messages, so
|
||||
the loop's evaluation cannot recurse back through the agent pipeline. The original input messages
|
||||
are forwarded verbatim (rather than collapsed to text) so multi-modal requests are preserved. The
|
||||
judge is asked for a :class:`JudgeVerdict` structured output; if the client does not honor
|
||||
structured output the verdict falls back to the explicit, non-overlapping ``VERDICT: DONE`` /
|
||||
``VERDICT: MORE`` markers (``MORE`` wins, keeping the loop running, when the marker is ambiguous
|
||||
or absent).
|
||||
|
||||
The predicate returns a ``(continue, reasoning)`` tuple; the loop surfaces that ``reasoning`` to
|
||||
the next-message callable as the ``feedback`` keyword argument, which feeds it back to the agent
|
||||
so it knows *why* its previous answer was judged incomplete.
|
||||
"""
|
||||
|
||||
async def _judge(
|
||||
*, last_result: AgentResponse, original_messages: list[Message], **kwargs: Any
|
||||
) -> tuple[bool, str | None]:
|
||||
judge_messages = [
|
||||
Message(role="system", contents=[instructions]),
|
||||
Message(
|
||||
role="user",
|
||||
contents=["Evaluate the agent's work. The user's original request follows:"],
|
||||
),
|
||||
*original_messages,
|
||||
Message(role="user", contents=["The agent's latest response was:"]),
|
||||
*last_result.messages,
|
||||
Message(role="user", contents=["Has the original request been fully addressed?"]),
|
||||
]
|
||||
response = await judge_client.get_response(judge_messages, options={"response_format": JudgeVerdict})
|
||||
verdict = response.value
|
||||
if isinstance(verdict, JudgeVerdict):
|
||||
answered = verdict.answered
|
||||
reasoning = verdict.reasoning
|
||||
else:
|
||||
# Fallback for clients that do not honor structured output: look for the explicit,
|
||||
# non-overlapping verdict markers. ``FAIL`` (more work needed) takes precedence so an
|
||||
# ambiguous or marker-less reply keeps looping rather than stopping on an incomplete
|
||||
# answer.
|
||||
text = response.text.upper()
|
||||
# ``MORE`` (more work needed) takes precedence so an ambiguous reply keeps looping.
|
||||
answered = False if JUDGE_VERDICT_MORE in text else JUDGE_VERDICT_DONE in text
|
||||
reasoning = response.text.strip()
|
||||
# Continue looping while the request is not yet answered, surfacing the reasoning as feedback.
|
||||
return (not answered), (reasoning or None)
|
||||
|
||||
def _next_message(*, feedback: str | None = None, **kwargs: Any) -> AgentRunInputs:
|
||||
# Feed the judge's reasoning back to the agent so the next iteration addresses the gap.
|
||||
if feedback:
|
||||
return (
|
||||
"An evaluator reviewed your previous response and judged that it does not yet fully "
|
||||
f"address the original request.\n\nEvaluator feedback: {feedback}\n\n"
|
||||
"Revise and continue so the original request is fully addressed."
|
||||
)
|
||||
return DEFAULT_NEXT_MESSAGE
|
||||
|
||||
return _judge, _next_message
|
||||
|
||||
|
||||
@experimental(feature_id=ExperimentalFeature.HARNESS)
|
||||
class AgentLoopMiddleware(AgentMiddleware):
|
||||
"""Re-run an agent in a loop until a criterion is met (or never).
|
||||
|
||||
This middleware repeatedly invokes the wrapped agent. After each run it decides whether to run
|
||||
again based on ``should_continue`` and ``max_iterations``, and uses ``next_message`` to build
|
||||
the input for the next iteration. Use :meth:`with_judge` to drive the loop with a chat-client
|
||||
judge instead of a hand-written predicate.
|
||||
|
||||
By default a non-streaming run returns an aggregated :class:`~agent_framework.AgentResponse`
|
||||
containing every iteration's messages plus the injected ``next_message`` "nudge" messages (set
|
||||
``return_final_only=True`` to return only the last iteration's response). Streaming runs always
|
||||
yield each iteration's updates and emit the injected nudge messages as ``user`` updates between
|
||||
iterations.
|
||||
|
||||
The ``should_continue`` and ``next_message`` callables are invoked with keyword arguments, so a
|
||||
caller only needs to declare the ones it uses plus ``**kwargs``. The keywords are:
|
||||
|
||||
- ``iteration`` (int): the number of completed runs so far (1-based after the first run).
|
||||
- ``last_result`` (AgentResponse): the result of the iteration that just completed.
|
||||
- ``messages`` (list[Message]): the messages used for the iteration that just completed.
|
||||
- ``original_messages`` (list[Message]): the input used for the first iteration.
|
||||
- ``session`` (AgentSession | None): the active session, used by the provider helpers.
|
||||
- ``agent``: the agent being looped.
|
||||
- ``progress`` (list[str]): the feedback log accumulated so far (see ``record_feedback``).
|
||||
- ``feedback`` (str | None): the feedback string returned by ``should_continue`` for this
|
||||
iteration (``None`` when it returned a plain bool). ``should_continue`` may return either a
|
||||
``bool`` or a ``(bool, str | None)`` tuple; the string is surfaced here so ``next_message``
|
||||
and ``record_feedback`` can reference it.
|
||||
|
||||
Examples:
|
||||
.. code-block:: python
|
||||
|
||||
from agent_framework import Agent, AgentResponse
|
||||
from agent_framework._harness._loop import AgentLoopMiddleware
|
||||
|
||||
|
||||
async def should_continue(*, iteration: int, last_result: AgentResponse, **kwargs) -> bool:
|
||||
return iteration < 3 and "DONE" not in last_result.text
|
||||
|
||||
|
||||
agent = Agent(client=client, middleware=[AgentLoopMiddleware(should_continue)])
|
||||
|
||||
Note:
|
||||
``max_iterations`` acts as a safety cap and defaults to ``DEFAULT_MAX_ITERATIONS`` (10). Pass
|
||||
an explicit ``None`` to make the loop unbounded, in which case it relies entirely on
|
||||
``should_continue`` to stop, so make sure the predicate can eventually return ``False``.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
should_continue: ShouldContinueCallable,
|
||||
*,
|
||||
max_iterations: int | None = DEFAULT_MAX_ITERATIONS,
|
||||
next_message: NextMessageCallable | None = None,
|
||||
record_feedback: FeedbackCallable | None = None,
|
||||
inject_progress: bool = True,
|
||||
fresh_context: bool = False,
|
||||
return_final_only: bool = False,
|
||||
additional_instructions: str | None = None,
|
||||
) -> None:
|
||||
"""Initialize the agent loop middleware.
|
||||
|
||||
Args:
|
||||
should_continue: Predicate that decides whether to run the agent again. May be sync or
|
||||
async and is called with the loop keyword arguments (``iteration``, ``last_result``,
|
||||
``messages``, ``original_messages``, ``session``, ``agent``, ``progress``, and
|
||||
``feedback`` -- see the class docstring for what each one carries; declare only the
|
||||
ones you need plus ``**kwargs``). Return ``True``/``False`` to
|
||||
continue/stop, or a ``(bool, str | None)`` tuple to also provide feedback; the
|
||||
feedback string is surfaced to the ``next_message`` and ``record_feedback`` callables
|
||||
via the ``feedback`` keyword argument. To loop on a chat-client judge instead, build
|
||||
the middleware via :meth:`with_judge`.
|
||||
|
||||
Keyword Args:
|
||||
max_iterations: Maximum number of agent runs, used as a safety cap. Defaults to
|
||||
``DEFAULT_MAX_ITERATIONS`` (10); pass an explicit ``None`` for an unbounded loop, or
|
||||
a positive integer to set a custom cap. (The :meth:`with_judge` factory uses
|
||||
``DEFAULT_JUDGE_MAX_ITERATIONS`` (5) as its default instead.)
|
||||
next_message: Callable that produces the input for the next iteration, called with the
|
||||
loop keyword arguments. Defaults to a short "continue" nudge. Returning ``None``
|
||||
reuses the previous iteration's messages verbatim (in which case the progress log is
|
||||
*not* injected; see ``inject_progress``).
|
||||
record_feedback: Optional callable invoked once per work iteration to capture a feedback
|
||||
entry. Called as ``record_feedback(**loop_kwargs)`` and returns a
|
||||
string entry appended to the progress log, or ``None`` to record nothing for that
|
||||
iteration. When not provided, the iteration's response text (``last_result.text``) is
|
||||
recorded instead. The accumulated log is exposed to every callback via the
|
||||
``progress`` loop keyword argument. For production loops prefer a ``record_feedback``
|
||||
that returns a terse summary rather than relying on the full response text.
|
||||
inject_progress: When ``True`` (default), the accumulated progress log is injected into
|
||||
the next iteration's input as a single ``user`` message ("Progress so far: ..."). To
|
||||
avoid duplication, only the most recent entry is injected when a session is attached
|
||||
(the session already retains earlier turns); the full log is injected when there is
|
||||
no session or ``fresh_context`` is set. When ``False`` the log is only exposed via the
|
||||
``progress`` loop keyword argument and never injected automatically.
|
||||
fresh_context: When ``True``, each iteration starts from a clean context: ``context``
|
||||
messages are reset to the original input messages (plus the injected progress log)
|
||||
instead of accumulating the prior conversation. When a session is attached, the
|
||||
session is snapshotted once before the loop and restored to that pre-loop baseline
|
||||
before each subsequent iteration, so the local transcript and any service-side
|
||||
conversation id are reset too and the agent does not re-read the accumulated history.
|
||||
In-loop working-state mutations are discarded; pre-loop state is preserved; continuity
|
||||
is carried only by the progress log.
|
||||
return_final_only: Controls what a non-streaming run returns. When ``False`` (default),
|
||||
the returned :class:`~agent_framework.AgentResponse` aggregates every iteration: each
|
||||
iteration's response messages plus the injected ``next_message`` "nudge" messages
|
||||
(as ``user`` messages), so the caller sees the full back-and-forth. When ``True``,
|
||||
only the final iteration's :class:`~agent_framework.AgentResponse` is returned. This
|
||||
flag has no effect on streaming runs (the stream cannot know in advance which
|
||||
iteration is last); streaming always yields each iteration's updates and injects the
|
||||
``next_message`` messages as ``user`` updates between iterations.
|
||||
additional_instructions: Optional extra instruction injected as a ``system`` message
|
||||
ahead of the input messages before the agent runs. It becomes part of the original
|
||||
messages, so it is preserved across ``fresh_context`` resets and (with a session)
|
||||
persists server-side across iterations. Used by :meth:`with_judge` to tell the agent
|
||||
about the criteria its response must satisfy, but available to any loop.
|
||||
|
||||
Raises:
|
||||
ValueError: If ``max_iterations`` is not ``None`` and is less than 1.
|
||||
"""
|
||||
if max_iterations is not None and max_iterations < 1:
|
||||
raise ValueError("max_iterations must be None or a positive integer (>= 1).")
|
||||
|
||||
self.max_iterations: int | None = max_iterations
|
||||
self.should_continue: ShouldContinueCallable = should_continue
|
||||
self.next_message = next_message
|
||||
self.record_feedback = record_feedback
|
||||
self.inject_progress = inject_progress
|
||||
self.fresh_context = fresh_context
|
||||
self.return_final_only = return_final_only
|
||||
self.additional_instructions = additional_instructions
|
||||
|
||||
@classmethod
|
||||
def with_judge(
|
||||
cls,
|
||||
judge_client: SupportsChatGetResponse,
|
||||
*,
|
||||
criteria: Sequence[str] | None = None,
|
||||
instructions: str | None = None,
|
||||
max_iterations: int | None = DEFAULT_JUDGE_MAX_ITERATIONS,
|
||||
next_message: NextMessageCallable | None = None,
|
||||
fresh_context: bool = False,
|
||||
) -> Self:
|
||||
"""Create a loop that continues until a judge chat client decides the request was answered.
|
||||
|
||||
Convenience factory for the judge pattern: ``judge_client`` is queried with a
|
||||
:class:`JudgeVerdict` structured-output response after each iteration and the loop continues
|
||||
while the request is *not* answered. The judge's ``reasoning`` is fed back to the agent as
|
||||
the next iteration's input (unless a custom ``next_message`` is provided), so the agent knows
|
||||
why its previous answer was judged incomplete. See :meth:`__init__` for the full meaning of
|
||||
each argument.
|
||||
|
||||
Args:
|
||||
judge_client: Chat client used to judge whether the original request was answered.
|
||||
|
||||
Keyword Args:
|
||||
criteria: Optional list of criteria the response must satisfy. When provided, they are
|
||||
(1) injected as an extra ``system`` instruction for the agent before it runs (via
|
||||
``additional_instructions``) and (2) rendered into the judge instructions wherever
|
||||
the ``{{criteria}}`` placeholder appears (``CRITERIA_PLACEHOLDER``).
|
||||
instructions: Optional system instructions for the judge. Defaults to
|
||||
``DEFAULT_JUDGE_INSTRUCTIONS``. May contain the ``{{criteria}}`` placeholder, which
|
||||
is replaced with the rendered ``criteria`` (or removed when no criteria are given).
|
||||
max_iterations: Maximum number of agent runs. Defaults to
|
||||
``DEFAULT_JUDGE_MAX_ITERATIONS`` (5); pass ``None`` for unbounded, or a positive
|
||||
integer to set a custom cap.
|
||||
next_message: Callable that produces the next iteration's input. Defaults to one that
|
||||
relays the judge's ``reasoning`` back to the agent.
|
||||
fresh_context: When ``True``, each iteration restarts from the original input messages
|
||||
(plus the injected progress log and judge feedback) instead of accumulating the prior
|
||||
conversation; an attached session is snapshotted before the loop and restored to that
|
||||
baseline between iterations. See :meth:`__init__` for the full semantics. Defaults to
|
||||
``False``.
|
||||
"""
|
||||
judge_instructions = (instructions or DEFAULT_JUDGE_INSTRUCTIONS).replace(
|
||||
CRITERIA_PLACEHOLDER, _render_criteria_block(criteria)
|
||||
)
|
||||
should_continue, judge_next_message = _build_judge_condition(judge_client, judge_instructions)
|
||||
return cls(
|
||||
should_continue=should_continue,
|
||||
max_iterations=max_iterations,
|
||||
next_message=next_message or judge_next_message,
|
||||
fresh_context=fresh_context,
|
||||
additional_instructions=_criteria_agent_instruction(criteria) if criteria else None,
|
||||
)
|
||||
|
||||
async def process(
|
||||
self,
|
||||
context: AgentContext,
|
||||
call_next: Callable[[], Awaitable[None]],
|
||||
) -> None:
|
||||
"""Run the wrapped agent in a loop."""
|
||||
if self.additional_instructions is not None:
|
||||
# Inject the extra instruction as a system message ahead of the input so it is present
|
||||
# on every iteration and preserved across fresh_context resets (which restart from
|
||||
# ``original_messages``).
|
||||
context.messages = [
|
||||
Message(role="system", contents=[self.additional_instructions]),
|
||||
*context.messages,
|
||||
]
|
||||
original_messages = list(context.messages)
|
||||
# For a truly fresh context per iteration the session must also be reset, otherwise the
|
||||
# next run reloads the local transcript or re-threads the service-side conversation and the
|
||||
# model still sees the accumulated history. Snapshot the session once here (the pre-loop
|
||||
# baseline) and restore it before each subsequent iteration so every pass starts clean.
|
||||
snapshot = context.session.to_dict() if self.fresh_context and context.session is not None else None
|
||||
if context.stream:
|
||||
self._process_streaming(context, call_next, original_messages, snapshot)
|
||||
else:
|
||||
await self._process_non_streaming(context, call_next, original_messages, snapshot)
|
||||
|
||||
@staticmethod
|
||||
def _restore_session(session: Any, snapshot: dict[str, Any]) -> None:
|
||||
"""Restore a session in place to a previously captured ``to_dict()`` snapshot.
|
||||
|
||||
Re-hydrates the snapshot via :meth:`AgentSession.from_dict` and copies the mutable fields
|
||||
(``service_session_id`` and ``state``) back onto the live ``session`` instance, so any
|
||||
reference held by the agent/context observes the reset. ``session_id`` is preserved (the
|
||||
snapshot carries the same id). A fresh ``from_dict`` is built on every call so repeated
|
||||
restores from one snapshot do not alias the same state dict.
|
||||
"""
|
||||
from .._sessions import AgentSession
|
||||
|
||||
restored = AgentSession.from_dict(snapshot)
|
||||
session.service_session_id = restored.service_session_id
|
||||
session.state = restored.state
|
||||
|
||||
async def _process_non_streaming(
|
||||
self,
|
||||
context: AgentContext,
|
||||
call_next: Callable[[], Awaitable[None]],
|
||||
original_messages: list[Message],
|
||||
snapshot: dict[str, Any] | None,
|
||||
) -> None:
|
||||
iteration = 0
|
||||
work_iterations = 0
|
||||
progress: list[str] = []
|
||||
# Aggregated transcript across iterations: each iteration's response messages plus the
|
||||
# injected "nudge" messages, used to build the combined response when return_final_only=False.
|
||||
aggregated: list[Message] = []
|
||||
aggregated_usage: UsageDetails | None = None
|
||||
final_result: AgentResponse | None = None
|
||||
while True:
|
||||
await call_next()
|
||||
iteration += 1
|
||||
|
||||
result = context.result
|
||||
if not isinstance(result, AgentResponse):
|
||||
raise TypeError(
|
||||
"AgentLoopMiddleware expected an AgentResponse from a non-streaming run, "
|
||||
f"got {type(result).__name__}."
|
||||
)
|
||||
|
||||
final_result = result
|
||||
aggregated.extend(result.messages)
|
||||
if result.usage_details is not None:
|
||||
aggregated_usage = add_usage_details(aggregated_usage, result.usage_details)
|
||||
|
||||
messages_used = context.messages
|
||||
loop_kwargs = self._build_loop_kwargs(
|
||||
context=context,
|
||||
iteration=iteration,
|
||||
last_result=result,
|
||||
messages_used=messages_used,
|
||||
original_messages=original_messages,
|
||||
progress=progress,
|
||||
)
|
||||
|
||||
work_iterations += 1
|
||||
# Decide whether to stop and capture any feedback from should_continue first, so the
|
||||
# feedback is available to both the progress and next-message callables this iteration.
|
||||
stop, feedback = await self._evaluate_stop(loop_kwargs, work_iterations)
|
||||
loop_kwargs = self._build_loop_kwargs(
|
||||
context=context,
|
||||
iteration=iteration,
|
||||
last_result=result,
|
||||
messages_used=messages_used,
|
||||
original_messages=original_messages,
|
||||
progress=progress,
|
||||
feedback=feedback,
|
||||
)
|
||||
# Capture this iteration's progress entry, then refresh loop_kwargs so the next-message
|
||||
# resolution sees the latest entry.
|
||||
if await self._record_progress(result, loop_kwargs, progress):
|
||||
loop_kwargs = self._build_loop_kwargs(
|
||||
context=context,
|
||||
iteration=iteration,
|
||||
last_result=result,
|
||||
messages_used=messages_used,
|
||||
original_messages=original_messages,
|
||||
progress=progress,
|
||||
feedback=feedback,
|
||||
)
|
||||
if stop:
|
||||
break
|
||||
if snapshot is not None and context.session is not None:
|
||||
# Reset the session to the pre-loop baseline so the next run starts fresh; only the
|
||||
# progress log (injected by _resolve_next_message) carries continuity forward.
|
||||
self._restore_session(context.session, snapshot)
|
||||
next_messages = await self._resolve_next_message(loop_kwargs, messages_used, original_messages)
|
||||
context.messages = next_messages
|
||||
aggregated.extend(next_messages)
|
||||
|
||||
if not self.return_final_only:
|
||||
context.result = self._aggregate_response(final_result, aggregated, aggregated_usage)
|
||||
|
||||
def _process_streaming(
|
||||
self,
|
||||
context: AgentContext,
|
||||
call_next: Callable[[], Awaitable[None]],
|
||||
original_messages: list[Message],
|
||||
snapshot: dict[str, Any] | None,
|
||||
) -> None:
|
||||
# Holds the last iteration's final response so the outer stream's finalizer can return it
|
||||
# rather than an aggregate of every iteration.
|
||||
holder: dict[str, AgentResponse | None] = {"final": None}
|
||||
|
||||
async def _generator() -> Any:
|
||||
iteration = 0
|
||||
work_iterations = 0
|
||||
progress: list[str] = []
|
||||
while True:
|
||||
try:
|
||||
await call_next()
|
||||
inner = context.result
|
||||
if not isinstance(inner, ResponseStream):
|
||||
raise TypeError(
|
||||
"AgentLoopMiddleware expected a ResponseStream from a streaming run, "
|
||||
f"got {type(inner).__name__}."
|
||||
)
|
||||
|
||||
async for update in inner:
|
||||
yield update
|
||||
|
||||
holder["final"] = await inner.get_final_response()
|
||||
except MiddlewareTermination:
|
||||
# The pipeline's MiddlewareTermination suppression is no longer active once
|
||||
# process() has returned (the stream is consumed lazily), so a termination
|
||||
# raised by a downstream middleware or during stream consumption surfaces here.
|
||||
# Stop cleanly and keep whatever final response we have from a prior iteration.
|
||||
return
|
||||
|
||||
iteration += 1
|
||||
|
||||
messages_used = context.messages
|
||||
final = holder["final"]
|
||||
loop_kwargs = self._build_loop_kwargs(
|
||||
context=context,
|
||||
iteration=iteration,
|
||||
last_result=final,
|
||||
messages_used=messages_used,
|
||||
original_messages=original_messages,
|
||||
progress=progress,
|
||||
)
|
||||
|
||||
work_iterations += 1
|
||||
# Decide whether to stop and capture any feedback from should_continue first, so the
|
||||
# feedback is available to both the progress and next-message callables this iteration.
|
||||
stop, feedback = await self._evaluate_stop(loop_kwargs, work_iterations)
|
||||
loop_kwargs = self._build_loop_kwargs(
|
||||
context=context,
|
||||
iteration=iteration,
|
||||
last_result=final,
|
||||
messages_used=messages_used,
|
||||
original_messages=original_messages,
|
||||
progress=progress,
|
||||
feedback=feedback,
|
||||
)
|
||||
if await self._record_progress(final, loop_kwargs, progress):
|
||||
loop_kwargs = self._build_loop_kwargs(
|
||||
context=context,
|
||||
iteration=iteration,
|
||||
last_result=final,
|
||||
messages_used=messages_used,
|
||||
original_messages=original_messages,
|
||||
progress=progress,
|
||||
feedback=feedback,
|
||||
)
|
||||
if stop:
|
||||
return
|
||||
if snapshot is not None and context.session is not None:
|
||||
# Reset the session to the pre-loop baseline before the next run. The final
|
||||
# response was already awaited above, so the service-side conversation id has
|
||||
# been propagated and is safe to discard here.
|
||||
self._restore_session(context.session, snapshot)
|
||||
next_messages = await self._resolve_next_message(loop_kwargs, messages_used, original_messages)
|
||||
context.messages = next_messages
|
||||
# Surface the injected "nudge" messages in the stream so consumers see the user
|
||||
# turns that drive each subsequent iteration (the equivalent of the aggregated
|
||||
# transcript that non-streaming runs return).
|
||||
for message in next_messages:
|
||||
yield self._message_to_update(message)
|
||||
|
||||
def _finalize(updates: Sequence[AgentResponseUpdate]) -> AgentResponse:
|
||||
if holder["final"] is not None:
|
||||
return holder["final"]
|
||||
return AgentResponse.from_updates(updates)
|
||||
|
||||
context.result = ResponseStream(_generator(), finalizer=_finalize)
|
||||
|
||||
def _build_loop_kwargs(
|
||||
self,
|
||||
*,
|
||||
context: AgentContext,
|
||||
iteration: int,
|
||||
last_result: AgentResponse | None,
|
||||
messages_used: list[Message],
|
||||
original_messages: list[Message],
|
||||
progress: list[str],
|
||||
feedback: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
return {
|
||||
"iteration": iteration,
|
||||
"last_result": last_result,
|
||||
"messages": messages_used,
|
||||
"original_messages": original_messages,
|
||||
"session": context.session,
|
||||
"agent": context.agent,
|
||||
# A copy so user callbacks cannot mutate the loop's internal progress log.
|
||||
"progress": list(progress),
|
||||
# Feedback returned by ``should_continue`` for this iteration (``None`` if it returned a
|
||||
# plain bool, or the stop was decided by ``max_iterations``).
|
||||
"feedback": feedback,
|
||||
}
|
||||
|
||||
async def _record_progress(
|
||||
self,
|
||||
last_result: AgentResponse | None,
|
||||
loop_kwargs: dict[str, Any],
|
||||
progress: list[str],
|
||||
) -> bool:
|
||||
"""Capture this iteration's feedback into ``progress``. Returns ``True`` if an entry was added."""
|
||||
if self.record_feedback is not None:
|
||||
entry = await _maybe_await(self.record_feedback(**loop_kwargs))
|
||||
else:
|
||||
entry = last_result.text.strip() if last_result is not None else None
|
||||
if entry:
|
||||
progress.append(entry)
|
||||
return True
|
||||
return False
|
||||
|
||||
async def _evaluate_stop(self, loop_kwargs: dict[str, Any], work_iterations: int) -> tuple[bool, str | None]:
|
||||
"""Decide whether the loop should stop, returning ``(stop, feedback)``.
|
||||
|
||||
``max_iterations`` is a safety cap that short-circuits before ``should_continue`` is
|
||||
evaluated (so an expensive predicate/judge is not called once the cap has fired). Any
|
||||
feedback returned by ``should_continue`` is propagated so the progress and next-message
|
||||
callables can reference it.
|
||||
"""
|
||||
if self.max_iterations is not None and work_iterations >= self.max_iterations:
|
||||
return True, None
|
||||
keep_going, feedback = await self._should_continue(loop_kwargs)
|
||||
return (not keep_going), feedback
|
||||
|
||||
async def _should_continue(self, loop_kwargs: dict[str, Any]) -> tuple[bool, str | None]:
|
||||
"""Evaluate the predicate, normalizing its result to ``(continue, feedback)``."""
|
||||
result = await _maybe_await(self.should_continue(**loop_kwargs))
|
||||
return (bool(result[0]), result[1]) if isinstance(result, tuple) else (bool(result), None) # type: ignore
|
||||
|
||||
@staticmethod
|
||||
def _message_to_update(message: Message) -> AgentResponseUpdate:
|
||||
"""Wrap an injected loop message as a streaming update so consumers see it inline."""
|
||||
return AgentResponseUpdate(
|
||||
contents=message.contents,
|
||||
role=message.role,
|
||||
author_name=message.author_name,
|
||||
message_id=message.message_id,
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
def _aggregate_response(
|
||||
final: AgentResponse,
|
||||
messages: list[Message],
|
||||
usage: UsageDetails | None,
|
||||
) -> AgentResponse:
|
||||
"""Build a combined response carrying every iteration's messages and summed usage.
|
||||
|
||||
Metadata (``response_id``, structured ``value``, etc.) is taken from the final iteration; the
|
||||
structured value is passed through pre-parsed so it is not re-derived from the aggregated text.
|
||||
"""
|
||||
return AgentResponse(
|
||||
messages=messages,
|
||||
response_id=final.response_id,
|
||||
agent_id=final.agent_id,
|
||||
created_at=final.created_at,
|
||||
finish_reason=final.finish_reason, # pyright: ignore[reportArgumentType]
|
||||
usage_details=usage,
|
||||
value=final.value,
|
||||
additional_properties=dict(final.additional_properties) if final.additional_properties else None,
|
||||
raw_representation=final.raw_representation,
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
def _render_progress(entries: list[str]) -> Message:
|
||||
"""Format progress-log entries into a single ``user`` message."""
|
||||
body = "\n".join(f"- {entry}" for entry in entries)
|
||||
return Message(role="user", contents=[f"Progress so far:\n{body}"])
|
||||
|
||||
async def _resolve_next_message(
|
||||
self,
|
||||
loop_kwargs: dict[str, Any],
|
||||
messages_used: list[Message],
|
||||
original_messages: list[Message],
|
||||
) -> list[Message]:
|
||||
# Compute the base next input. A ``next_message`` callable returning None requests a verbatim
|
||||
# reuse of the previous messages (no progress injection); in fresh-context mode that escape
|
||||
# hatch does not apply, so fall back to the default nudge instead.
|
||||
if self.next_message is None:
|
||||
next_msgs = normalize_messages(DEFAULT_NEXT_MESSAGE)
|
||||
else:
|
||||
next_input = await _maybe_await(self.next_message(**loop_kwargs))
|
||||
if next_input is None:
|
||||
if not self.fresh_context:
|
||||
return list(messages_used)
|
||||
next_msgs = normalize_messages(DEFAULT_NEXT_MESSAGE)
|
||||
else:
|
||||
next_msgs = normalize_messages(next_input)
|
||||
|
||||
progress: list[str] = loop_kwargs.get("progress") or []
|
||||
session = loop_kwargs.get("session")
|
||||
progress_msg: Message | None = None
|
||||
if self.inject_progress and progress:
|
||||
# With a session the earlier entries are already retained in the conversation, so only
|
||||
# the latest entry is injected to avoid duplication. Otherwise inject the full log.
|
||||
entries = progress if (session is None or self.fresh_context) else progress[-1:]
|
||||
progress_msg = self._render_progress(entries)
|
||||
|
||||
if self.fresh_context:
|
||||
result = list(original_messages)
|
||||
if progress_msg is not None:
|
||||
result.append(progress_msg)
|
||||
result.extend(next_msgs)
|
||||
return result
|
||||
|
||||
if progress_msg is not None:
|
||||
return [progress_msg, *next_msgs]
|
||||
return list(next_msgs)
|
||||
|
||||
|
||||
def todos_remaining(provider: Any) -> ShouldContinueCallable:
|
||||
"""Build a ``should_continue`` predicate that loops while a ``TodoProvider`` has open items.
|
||||
|
||||
Args:
|
||||
provider: A :class:`~agent_framework.TodoProvider` attached to the same session as the loop.
|
||||
|
||||
Returns:
|
||||
A predicate suitable for :class:`AgentLoopMiddleware`'s ``should_continue`` argument.
|
||||
"""
|
||||
|
||||
async def _should_continue(*, session: Any = None, **kwargs: Any) -> bool:
|
||||
if session is None:
|
||||
return False
|
||||
items = await provider.store.load_items(session, source_id=provider.source_id)
|
||||
return any(not item.is_complete for item in items)
|
||||
|
||||
return _should_continue
|
||||
|
||||
|
||||
def background_tasks_running(provider: Any) -> ShouldContinueCallable:
|
||||
"""Build a ``should_continue`` predicate that loops while a ``BackgroundAgentsProvider`` is busy.
|
||||
|
||||
The predicate inspects the provider's persisted task state and continues while any task is still
|
||||
marked as running. Pair it with ``max_iterations`` so the loop is guaranteed to stop even if a
|
||||
task's persisted status is never refreshed.
|
||||
|
||||
Args:
|
||||
provider: A :class:`~agent_framework.BackgroundAgentsProvider` attached to the same session
|
||||
as the loop.
|
||||
|
||||
Returns:
|
||||
A predicate suitable for :class:`AgentLoopMiddleware`'s ``should_continue`` argument.
|
||||
"""
|
||||
from ._background_agents import BackgroundTaskInfo, BackgroundTaskStatus
|
||||
|
||||
def _should_continue(*, session: Any = None, **kwargs: Any) -> bool:
|
||||
if session is None:
|
||||
return False
|
||||
state = session.state.get(provider.source_id)
|
||||
if not state:
|
||||
return False
|
||||
return any(
|
||||
BackgroundTaskInfo.from_dict(task).status == BackgroundTaskStatus.RUNNING for task in state.get("tasks", [])
|
||||
)
|
||||
|
||||
return _should_continue
|
||||
File diff suppressed because it is too large
Load Diff
@@ -7,6 +7,10 @@ This folder contains focused middleware samples for `Agent`, chat clients, tools
|
||||
| File | Description |
|
||||
|------|-------------|
|
||||
| [`agent_and_run_level_middleware.py`](./agent_and_run_level_middleware.py) | Demonstrates combining agent-level and run-level middleware. |
|
||||
| [`agent_loop_middleware_refinement.py`](./agent_loop_middleware_refinement.py) | Demonstrates `AgentLoopMiddleware` with a `should_continue` predicate: a completion-marker refinement loop with feedback tracking and `fresh_context`. |
|
||||
| [`agent_loop_middleware_todos.py`](./agent_loop_middleware_todos.py) | Demonstrates `AgentLoopMiddleware` with a `should_continue` predicate built from a `TodoProvider` via `todos_remaining`, so the agent keeps working while open todos remain. |
|
||||
| [`agent_loop_middleware_judge.py`](./agent_loop_middleware_judge.py) | Demonstrates `AgentLoopMiddleware.with_judge`: a ChatClient judge re-runs the agent until it decides the original request was answered, with `criteria` shared between the agent and the judge. |
|
||||
| [`agent_loop_middleware_report.py`](./agent_loop_middleware_report.py) | Demonstrates composing two `AgentLoopMiddleware` on one agent: an inner `todos_remaining` loop that drafts a report todo-by-todo, wrapped by an outer report-style `with_judge` loop that re-runs it until an editor chat client judges the report publication-ready. |
|
||||
| [`chat_middleware.py`](./chat_middleware.py) | Shows class-based and function-based chat middleware that can observe, modify, and override model calls. |
|
||||
| [`class_based_middleware.py`](./class_based_middleware.py) | Shows class-based agent and function middleware. |
|
||||
| [`decorator_middleware.py`](./decorator_middleware.py) | Demonstrates middleware registration with decorators. |
|
||||
|
||||
@@ -0,0 +1,118 @@
|
||||
# Copyright (c) Microsoft. All rights reserved.
|
||||
|
||||
import asyncio
|
||||
|
||||
from agent_framework import Agent, AgentLoopMiddleware
|
||||
from agent_framework.foundry import FoundryChatClient
|
||||
from azure.identity.aio import AzureCliCredential
|
||||
from dotenv import load_dotenv
|
||||
|
||||
# Load environment variables from .env file
|
||||
load_dotenv()
|
||||
|
||||
"""
|
||||
Agent Loop Middleware: ChatClient judge
|
||||
|
||||
This sample demonstrates ``AgentLoopMiddleware.with_judge(...)``: a second chat client decides (via a
|
||||
``JudgeVerdict`` structured output) whether the original request was answered, and the loop continues
|
||||
while the answer is "no". The judge's ``reasoning`` is fed back to the agent as the next iteration's
|
||||
input, so the agent knows what is missing. The loop also passes a list of ``criteria``, which are
|
||||
injected as an extra instruction for the agent and rendered into the judge's instructions.
|
||||
|
||||
The loop is run with streaming, so the judge's feedback between iterations shows up as a ``user``
|
||||
update; the stream is printed as ``<role>: <content>`` lines.
|
||||
|
||||
Environment variables:
|
||||
FOUNDRY_PROJECT_ENDPOINT — Azure AI Foundry project endpoint URL
|
||||
FOUNDRY_MODEL — Model deployment name
|
||||
|
||||
Authentication:
|
||||
Run ``az login`` before running this sample.
|
||||
"""
|
||||
|
||||
|
||||
async def judge_loop(client: FoundryChatClient, judge_client: FoundryChatClient) -> None:
|
||||
"""A second chat client judges whether the request was answered."""
|
||||
print("\n=== ChatClient judge (loop until the request is answered) ===")
|
||||
|
||||
# 1. Provide a ``judge_client``. The middleware asks it (via a ``JudgeVerdict`` structured
|
||||
# output) whether the original request has been fully addressed and continues while the
|
||||
# answer is "no". The judge's ``reasoning`` is fed back to the agent as the next iteration's
|
||||
# input, so the agent knows what is missing. Judge loops default to a small ``max_iterations``
|
||||
# cap because each pass costs an extra model call.
|
||||
#
|
||||
# ``criteria`` is a list of requirements the response must satisfy. The loop (a) injects them
|
||||
# as an extra instruction for the agent before it runs and (b) renders them into the judge's
|
||||
# instructions (the default judge prompt includes a ``{{criteria}}`` placeholder). Supply your
|
||||
# own ``instructions`` string with ``{{criteria}}`` to control the wording, or omit ``criteria``
|
||||
# entirely and pass a plain ``instructions`` string.
|
||||
loop = AgentLoopMiddleware.with_judge(
|
||||
judge_client,
|
||||
criteria=[
|
||||
"Mentions the moon",
|
||||
"Includes at least one good joke",
|
||||
"Is written as a single piece of fluent prose",
|
||||
],
|
||||
max_iterations=4,
|
||||
)
|
||||
|
||||
agent = Agent(
|
||||
client=client,
|
||||
name="answerer",
|
||||
instructions="You are a helpful assistant. Answer the user's question thoroughly.",
|
||||
middleware=[loop],
|
||||
)
|
||||
|
||||
# 2. Run with streaming; the judge's feedback appears as a ``user`` update between iterations
|
||||
# until the judge is satisfied (or the iteration cap is reached). Each contiguous ``user``
|
||||
# block marks the boundary into the next iteration, so we count loop iterations by those
|
||||
# boundaries (robust to function calling, where one iteration may issue several model calls).
|
||||
iterations = 1
|
||||
in_user_block = False
|
||||
assistant_open = False
|
||||
async for update in agent.run("Explain why the sky is blue and sunsets are red.", stream=True):
|
||||
if update.role == "user":
|
||||
if not in_user_block:
|
||||
iterations += 1
|
||||
in_user_block = True
|
||||
assistant_open = False
|
||||
print(f"\nuser: {update.text}", flush=True)
|
||||
continue
|
||||
in_user_block = False
|
||||
if update.text:
|
||||
if not assistant_open:
|
||||
print("\nassistant: ", end="", flush=True)
|
||||
assistant_open = True
|
||||
print(update.text, end="", flush=True)
|
||||
print(f"\n\nCompleted in {iterations} iteration(s).")
|
||||
|
||||
|
||||
async def main() -> None:
|
||||
# A single credential is reused; the judge uses its own client instance.
|
||||
async with AzureCliCredential() as credential:
|
||||
client = FoundryChatClient(credential=credential)
|
||||
judge_client = FoundryChatClient(credential=credential)
|
||||
await judge_loop(client, judge_client)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(main())
|
||||
|
||||
|
||||
"""
|
||||
Sample output (abridged; exact text varies by model):
|
||||
|
||||
=== ChatClient judge (loop until the request is answered) ===
|
||||
assistant: The sky is blue because shorter (blue) wavelengths scatter more (Rayleigh scattering).
|
||||
user: An evaluator reviewed your previous response and judged that it does not yet fully
|
||||
address the original request.
|
||||
|
||||
Evaluator feedback: The response does not mention the moon.
|
||||
|
||||
Revise and continue so the original request is fully addressed.
|
||||
assistant: The sky is blue because shorter (blue) wavelengths scatter more. At sunset, light travels
|
||||
through more atmosphere, scattering away blue and leaving red/orange hues. The moon follows the
|
||||
sky's colors because the same scattering applies to the light reaching it.
|
||||
|
||||
Completed in 2 iteration(s).
|
||||
"""
|
||||
@@ -0,0 +1,121 @@
|
||||
# Copyright (c) Microsoft. All rights reserved.
|
||||
|
||||
import asyncio
|
||||
|
||||
from agent_framework import Agent, AgentLoopMiddleware, AgentResponse
|
||||
from agent_framework.foundry import FoundryChatClient
|
||||
from azure.identity.aio import AzureCliCredential
|
||||
from dotenv import load_dotenv
|
||||
|
||||
# Load environment variables from .env file
|
||||
load_dotenv()
|
||||
|
||||
"""
|
||||
Agent Loop Middleware: refinement loop (should_continue + feedback tracking)
|
||||
|
||||
This sample demonstrates ``AgentLoopMiddleware`` driven by a ``should_continue`` predicate. The loop
|
||||
keeps refining a candidate answer until the agent's latest response contains a completion marker. It
|
||||
also shows feedback tracking: ``record_feedback`` logs per-iteration progress that is fed into the
|
||||
next pass, ``fresh_context`` restarts each pass from the original task plus that log, and
|
||||
``max_iterations`` bounds the loop as a safety cap.
|
||||
|
||||
``next_message`` controls the input for the next iteration (it defaults to a short "continue" nudge).
|
||||
The loop is run with streaming, so the injected messages between iterations show up as ``user``
|
||||
updates; the stream is printed as ``<role>: <content>`` lines.
|
||||
|
||||
Environment variables:
|
||||
FOUNDRY_PROJECT_ENDPOINT — Azure AI Foundry project endpoint URL
|
||||
FOUNDRY_MODEL — Model deployment name
|
||||
|
||||
Authentication:
|
||||
Run ``az login`` before running this sample.
|
||||
"""
|
||||
|
||||
COMPLETE_MARKER = "<promise>COMPLETE</promise>"
|
||||
|
||||
|
||||
async def refinement_loop(client: FoundryChatClient) -> None:
|
||||
"""Loop while the response does not yet contain a completion marker."""
|
||||
print("\n=== Refinement loop (should_continue marker + feedback tracking, capped at 5) ===")
|
||||
|
||||
# 1. ``should_continue`` keeps the loop running until the agent signals it is done by including
|
||||
# the completion marker in its latest response. It is called with the loop keyword args and
|
||||
# returns True to run the agent again.
|
||||
def should_continue(*, last_result: AgentResponse, **kwargs: object) -> bool:
|
||||
return COMPLETE_MARKER not in last_result.text
|
||||
|
||||
# 2. ``record_feedback`` captures a short progress entry each iteration. Returning a string
|
||||
# appends it to the log (returning None falls back to the response text). The accumulated log
|
||||
# is injected into the next iteration's input so the agent builds on prior work.
|
||||
def record_feedback(*, iteration: int, last_result: AgentResponse, **kwargs: object) -> str:
|
||||
return f"iteration {iteration}: {last_result.text.strip()[:80]}"
|
||||
|
||||
# 3. ``fresh_context=True`` restarts each pass from the original task plus the progress log, and
|
||||
# ``max_iterations`` bounds the loop as a safety cap.
|
||||
loop = AgentLoopMiddleware(
|
||||
should_continue,
|
||||
max_iterations=5,
|
||||
record_feedback=record_feedback,
|
||||
fresh_context=True,
|
||||
)
|
||||
|
||||
# 4. Attach the middleware to the agent.
|
||||
agent = Agent(
|
||||
client=client,
|
||||
name="refiner",
|
||||
instructions=(
|
||||
"You are iteratively refining a product name for a note-taking app. Each turn, build on the "
|
||||
"progress log: propose an improved candidate with a short reason. When you are confident the "
|
||||
f"name is final, end your message with the exact marker {COMPLETE_MARKER}."
|
||||
),
|
||||
middleware=[loop],
|
||||
)
|
||||
|
||||
# 5. Run once with streaming. The middleware drives the iterations, feeding progress forward until
|
||||
# the agent emits the completion marker or the iteration cap is reached. Each contiguous
|
||||
# ``user`` block marks the boundary into the next iteration, so we count loop iterations by
|
||||
# those boundaries (robust to function calling, where one iteration may issue several model
|
||||
# calls; tool calls/results are never ``user`` updates).
|
||||
iterations = 1
|
||||
in_user_block = False
|
||||
assistant_open = False
|
||||
async for update in agent.run("Suggest a name for a note-taking app.", stream=True):
|
||||
if update.role == "user":
|
||||
if not in_user_block:
|
||||
iterations += 1
|
||||
in_user_block = True
|
||||
assistant_open = False
|
||||
print(f"\nuser: {update.text}", flush=True)
|
||||
continue
|
||||
in_user_block = False
|
||||
if update.text:
|
||||
if not assistant_open:
|
||||
print("\nassistant: ", end="", flush=True)
|
||||
assistant_open = True
|
||||
print(update.text, end="", flush=True)
|
||||
print(f"\n\nCompleted in {iterations} iteration(s).")
|
||||
|
||||
|
||||
async def main() -> None:
|
||||
async with AzureCliCredential() as credential:
|
||||
client = FoundryChatClient(credential=credential)
|
||||
await refinement_loop(client)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(main())
|
||||
|
||||
|
||||
"""
|
||||
Sample output (abridged; exact text varies by model):
|
||||
|
||||
=== Refinement loop (should_continue marker + feedback tracking, capped at 5) ===
|
||||
assistant: "QuickJot" — short and evokes fast capture.
|
||||
user: Suggest a name for a note-taking app.
|
||||
user: Progress so far:
|
||||
- iteration 1: "QuickJot" — short and evokes fast capture.
|
||||
user: Continue working on the task. If it is complete, say so.
|
||||
assistant: How about "MarginNote" — it evokes jotting ideas in the margins. <promise>COMPLETE</promise>
|
||||
|
||||
Completed in 2 iteration(s).
|
||||
"""
|
||||
@@ -0,0 +1,208 @@
|
||||
# Copyright (c) Microsoft. All rights reserved.
|
||||
|
||||
import asyncio
|
||||
|
||||
from agent_framework import (
|
||||
Agent,
|
||||
AgentLoopMiddleware,
|
||||
AgentSession,
|
||||
TodoProvider,
|
||||
todos_remaining,
|
||||
)
|
||||
from agent_framework.foundry import FoundryChatClient
|
||||
from azure.identity.aio import AzureCliCredential
|
||||
from dotenv import load_dotenv
|
||||
|
||||
# Load environment variables from .env file
|
||||
load_dotenv()
|
||||
|
||||
"""
|
||||
Agent Loop Middleware: todo list + report-style judge, composed as two middleware
|
||||
|
||||
This sample demonstrates a more complex ``AgentLoopMiddleware`` setup that composes TWO separate loop
|
||||
middleware on a single agent — rather than hand-writing one predicate that does both checks. The
|
||||
agent's ``middleware`` list is the composition point:
|
||||
|
||||
middleware=[judge_loop, todo_loop]
|
||||
|
||||
Agent middleware run outermost-first, so ``judge_loop`` wraps ``todo_loop``:
|
||||
|
||||
1. ``todo_loop`` (inner) — built from the ``todos_remaining`` helper over a ``TodoProvider``. It
|
||||
re-runs the agent while any todo item is still open, so the agent plans the report and then drafts
|
||||
it one todo at a time. Its final todo assembles and emits the complete report, so when the inner
|
||||
loop stops its final response is the full report.
|
||||
2. ``judge_loop`` (outer) — built from ``AgentLoopMiddleware.with_judge``. Each time the inner todo
|
||||
loop finishes, a separate "editor" chat client reviews the assembled report (via a ``JudgeVerdict``
|
||||
structured output) against a list of report ``criteria``. While the editor is not satisfied, the
|
||||
outer loop re-runs the inner todo loop (the todos are already complete, so it runs the agent once)
|
||||
with the editor's reasoning fed back, and the agent revises the full report.
|
||||
|
||||
``with_judge(criteria=...)`` renders the criteria into both the editor's judge instructions and an
|
||||
extra instruction injected for the agent, so the agent writes toward the same bar the editor grades
|
||||
against. A custom report-style ``instructions`` string frames the judge as an editor reviewing a
|
||||
report.
|
||||
|
||||
The loop is run with streaming, so the injected messages between iterations show up as ``user``
|
||||
updates; the stream is printed as ``<role>: <content>`` lines. Each contiguous ``user`` block (from
|
||||
either loop) marks a boundary into another agent run, so the printed count is the total number of
|
||||
agent runs across both loops.
|
||||
|
||||
Environment variables:
|
||||
FOUNDRY_PROJECT_ENDPOINT — Azure AI Foundry project endpoint URL
|
||||
FOUNDRY_MODEL — Model deployment name
|
||||
|
||||
Authentication:
|
||||
Run ``az login`` before running this sample.
|
||||
"""
|
||||
|
||||
# Requirements the finished report must satisfy. Passed as ``criteria`` to ``with_judge``, which
|
||||
# renders them into both the editor's judge instructions and an extra instruction for the agent.
|
||||
REPORT_REQUIREMENTS = [
|
||||
"Opens with a one-paragraph executive summary.",
|
||||
"Has a clearly titled section for each part of the brief.",
|
||||
"Ends with a short 'Key takeaways' bulleted list.",
|
||||
"Is written in clear, professional prose.",
|
||||
]
|
||||
|
||||
# Report-style judge instructions. The ``{{criteria}}`` placeholder is replaced by ``with_judge``
|
||||
# with the rendered REPORT_REQUIREMENTS block.
|
||||
EDITOR_INSTRUCTIONS = (
|
||||
"You are a senior editor reviewing a research report. You are given the user's original brief and "
|
||||
"the report the agent produced. Decide whether the report is publication-ready. Set 'answered' to "
|
||||
"true only if the report is ready, otherwise set it to false and use 'reasoning' to state "
|
||||
"concisely what is missing.{{criteria}}"
|
||||
)
|
||||
|
||||
|
||||
async def report_loop(client: FoundryChatClient, editor_client: FoundryChatClient) -> None:
|
||||
"""Compose a todo loop (inner) and a report-style judge loop (outer) on one agent."""
|
||||
print("\n=== Todo list + report-style judge (two composed middleware) ===")
|
||||
|
||||
# 1. A TodoProvider gives the agent tools to plan and track the report as todo items. A single
|
||||
# session (created below) keeps this todo state alive across loop iterations.
|
||||
todo_provider = TodoProvider()
|
||||
|
||||
# 2. Inner loop: re-run the agent while the TodoProvider still has open items. ``todos_remaining``
|
||||
# builds the ``should_continue`` predicate; ``max_iterations`` caps planning + one-todo-per-turn
|
||||
# drafting + the final assembly turn.
|
||||
todo_loop = AgentLoopMiddleware(
|
||||
todos_remaining(todo_provider),
|
||||
max_iterations=8,
|
||||
)
|
||||
|
||||
# 3. Outer loop: each time the inner todo loop finishes, ``editor_client`` judges the assembled
|
||||
# report against REPORT_REQUIREMENTS and the loop re-runs the inner loop while it is not yet
|
||||
# publication-ready. ``with_judge`` injects the criteria for the agent too, and feeds the
|
||||
# editor's reasoning back as the next iteration's input. The judge cap bounds the revision rounds.
|
||||
judge_loop = AgentLoopMiddleware.with_judge(
|
||||
editor_client,
|
||||
instructions=EDITOR_INSTRUCTIONS,
|
||||
criteria=REPORT_REQUIREMENTS,
|
||||
max_iterations=4,
|
||||
)
|
||||
|
||||
# 4. Compose the two middleware on the agent. Order matters: ``judge_loop`` is outermost (it wraps
|
||||
# and re-runs the whole ``todo_loop``), ``todo_loop`` is innermost (it drives the per-todo
|
||||
# drafting). The agent is told to finish with a dedicated assembly todo so that, when the inner
|
||||
# loop stops, its final response is the complete report the editor then grades.
|
||||
agent = Agent(
|
||||
client=client,
|
||||
name="report-writer",
|
||||
instructions=(
|
||||
"You are a research writer producing a short report. "
|
||||
"On your FIRST turn, break the report into todo items using your todo tools: one item per "
|
||||
"report section, plus a final 'Assemble and output the complete report' item — then stop, "
|
||||
"do not start writing yet. On EACH SUBSEQUENT turn while todos remain, complete exactly "
|
||||
"ONE remaining todo item, draft its content, and mark it done using your tools — never "
|
||||
"more than one item per turn. When you reach the final assembly item, output the FULL "
|
||||
"report in a single message and mark it done. If an editor later returns feedback, revise "
|
||||
"and output the full report again."
|
||||
),
|
||||
context_providers=[todo_provider],
|
||||
middleware=[judge_loop, todo_loop],
|
||||
)
|
||||
|
||||
# 5. Run once with streaming. Reuse a single session so todo state persists across iterations.
|
||||
# Each contiguous ``user`` block marks a boundary into another agent run; both loops inject
|
||||
# such blocks (todo nudges and editor feedback), so the count is the total number of agent runs.
|
||||
session = AgentSession()
|
||||
prompt = "Write a brief report on the benefits and risks of remote work for software teams."
|
||||
runs = 1
|
||||
in_user_block = False
|
||||
assistant_open = False
|
||||
async for update in agent.run(prompt, session=session, stream=True):
|
||||
if update.role == "user":
|
||||
if not in_user_block:
|
||||
runs += 1
|
||||
in_user_block = True
|
||||
assistant_open = False
|
||||
print(f"\nuser: {update.text}", flush=True)
|
||||
continue
|
||||
in_user_block = False
|
||||
if update.text:
|
||||
if not assistant_open:
|
||||
print("\nassistant: ", end="", flush=True)
|
||||
assistant_open = True
|
||||
print(update.text, end="", flush=True)
|
||||
print(f"\n\nCompleted in {runs} agent run(s).")
|
||||
|
||||
# 6. Inspect the todos the agent created, loaded from the same store the inner loop uses.
|
||||
items = await todo_provider.store.load_items(session, source_id=todo_provider.source_id)
|
||||
print("\nTodos after the run:")
|
||||
for item in items:
|
||||
mark = "x" if item.is_complete else " "
|
||||
print(f" [{mark}] {item.id}. {item.title}")
|
||||
|
||||
|
||||
"""
|
||||
Sample output for ``report_loop`` (abridged; exact text varies by model):
|
||||
|
||||
=== Todo list + report-style judge (two composed middleware) ===
|
||||
assistant: Here is my plan. I'll create todos for each section and a final assembly item.
|
||||
user: Continue working on the task. If it is complete, say so.
|
||||
...
|
||||
assistant: # Remote Work for Software Teams
|
||||
|
||||
**Executive summary:** Remote work offers flexibility and access to wider talent...
|
||||
|
||||
## Benefits
|
||||
...
|
||||
|
||||
## Risks
|
||||
...
|
||||
|
||||
## Key takeaways
|
||||
- Flexibility improves retention.
|
||||
- Async communication needs discipline.
|
||||
user: An evaluator reviewed your previous response and judged that it does not yet fully
|
||||
address the original request.
|
||||
|
||||
Evaluator feedback: Add a one-paragraph executive summary before the first section.
|
||||
|
||||
Revise and continue so the original request is fully addressed.
|
||||
assistant: # Remote Work for Software Teams
|
||||
|
||||
**Executive summary:** ... (revised, now opens with a summary)
|
||||
...
|
||||
|
||||
Completed in 7 agent run(s).
|
||||
|
||||
Todos after the run:
|
||||
[x] 1. Benefits section
|
||||
[x] 2. Risks section
|
||||
[x] 3. Key takeaways
|
||||
[x] 4. Assemble and output the complete report
|
||||
"""
|
||||
|
||||
|
||||
async def main() -> None:
|
||||
# A single credential is reused; the editor judge uses its own client instance.
|
||||
async with AzureCliCredential() as credential:
|
||||
client = FoundryChatClient(credential=credential)
|
||||
editor_client = FoundryChatClient(credential=credential)
|
||||
|
||||
await report_loop(client, editor_client)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(main())
|
||||
@@ -0,0 +1,129 @@
|
||||
# Copyright (c) Microsoft. All rights reserved.
|
||||
|
||||
import asyncio
|
||||
|
||||
from agent_framework import Agent, AgentLoopMiddleware, AgentSession, TodoProvider, todos_remaining
|
||||
from agent_framework.foundry import FoundryChatClient
|
||||
from azure.identity.aio import AzureCliCredential
|
||||
from dotenv import load_dotenv
|
||||
|
||||
# Load environment variables from .env file
|
||||
load_dotenv()
|
||||
|
||||
"""
|
||||
Agent Loop Middleware: todo loop (should_continue via a provider helper)
|
||||
|
||||
This sample demonstrates ``AgentLoopMiddleware`` driven by a ``should_continue`` predicate built from
|
||||
a ``TodoProvider``. The ``todos_remaining`` helper keeps the agent running while it still has open
|
||||
todo items, so the agent plans work on its first turn and completes one item per turn afterwards.
|
||||
``max_iterations`` bounds the loop as a safety cap, and a single session keeps the todo state across
|
||||
iterations. After the run the sample prints the todos the agent created.
|
||||
|
||||
The loop is run with streaming, so the injected messages between iterations show up as ``user``
|
||||
updates; the stream is printed as ``<role>: <content>`` lines.
|
||||
|
||||
Environment variables:
|
||||
FOUNDRY_PROJECT_ENDPOINT — Azure AI Foundry project endpoint URL
|
||||
FOUNDRY_MODEL — Model deployment name
|
||||
|
||||
Authentication:
|
||||
Run ``az login`` before running this sample.
|
||||
"""
|
||||
|
||||
|
||||
async def todo_loop(client: FoundryChatClient) -> None:
|
||||
"""Loop while a TodoProvider still has open items."""
|
||||
print("\n=== Callable criterion (loop while todos remain) ===")
|
||||
|
||||
# 1. A TodoProvider gives the agent tools to plan and track work as todo items.
|
||||
todo_provider = TodoProvider()
|
||||
|
||||
# 2. ``todos_remaining`` builds a ``should_continue`` predicate that returns True while any todo
|
||||
# item is still open. ``max_iterations`` guarantees the loop stops even if the agent stalls.
|
||||
loop = AgentLoopMiddleware(
|
||||
should_continue=todos_remaining(todo_provider),
|
||||
max_iterations=6,
|
||||
)
|
||||
|
||||
agent = Agent(
|
||||
client=client,
|
||||
name="planner",
|
||||
instructions=(
|
||||
"You are a writing assistant working through a todo list. "
|
||||
"On your FIRST turn, break the task into todo items using your todo tools and stop "
|
||||
"(do not start writing yet). On EACH SUBSEQUENT turn, complete exactly ONE remaining "
|
||||
"todo item, write its content, and mark it done using your tools — never complete more "
|
||||
"than one item per turn. When every item is done, give a brief final summary."
|
||||
),
|
||||
context_providers=[todo_provider],
|
||||
middleware=[loop],
|
||||
)
|
||||
|
||||
# 3. Reuse a single session so todo state persists across loop iterations. Each contiguous
|
||||
# ``user`` block marks the boundary into the next iteration, so we count loop iterations by
|
||||
# those boundaries — robust to the function calling this loop relies on (the todo tools issue
|
||||
# several model calls per iteration, but tool calls/results are never ``user`` updates).
|
||||
session = AgentSession()
|
||||
prompt = "Plan and write a short 3-section blog post about Rayleigh scattering."
|
||||
iterations = 1
|
||||
in_user_block = False
|
||||
assistant_open = False
|
||||
async for update in agent.run(prompt, session=session, stream=True):
|
||||
if update.role == "user":
|
||||
if not in_user_block:
|
||||
iterations += 1
|
||||
in_user_block = True
|
||||
assistant_open = False
|
||||
print(f"\nuser: {update.text}", flush=True)
|
||||
continue
|
||||
in_user_block = False
|
||||
if update.text:
|
||||
if not assistant_open:
|
||||
print("\nassistant: ", end="", flush=True)
|
||||
assistant_open = True
|
||||
print(update.text, end="", flush=True)
|
||||
print(f"\n\nCompleted in {iterations} iteration(s).")
|
||||
|
||||
# 4. Inspect the todos the agent created, loaded from the same store the loop predicate uses.
|
||||
items = await todo_provider.store.load_items(session, source_id=todo_provider.source_id)
|
||||
print("\nTodos after the run:")
|
||||
for item in items:
|
||||
mark = "x" if item.is_complete else " "
|
||||
print(f" [{mark}] {item.id}. {item.title}")
|
||||
|
||||
|
||||
async def main() -> None:
|
||||
async with AzureCliCredential() as credential:
|
||||
client = FoundryChatClient(credential=credential)
|
||||
await todo_loop(client)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(main())
|
||||
|
||||
|
||||
"""
|
||||
Sample output (abridged; exact text varies by model):
|
||||
|
||||
=== Callable criterion (loop while todos remain) ===
|
||||
assistant: Here is my plan. I'll create todos for each section.
|
||||
user: Progress so far:
|
||||
- Here is my plan. I'll create todos for each section.
|
||||
user: Continue working on the task. If it is complete, say so.
|
||||
assistant: Section 1 drafted. Marking it done.
|
||||
user: Progress so far:
|
||||
- Section 1 drafted. Marking it done.
|
||||
user: Continue working on the task. If it is complete, say so.
|
||||
assistant: Section 2 drafted. Marking it done.
|
||||
user: Progress so far:
|
||||
- Section 2 drafted. Marking it done.
|
||||
user: Continue working on the task. If it is complete, say so.
|
||||
assistant: Section 3 drafted. Marking it done.
|
||||
|
||||
Completed in 4 iteration(s).
|
||||
|
||||
Todos after the run:
|
||||
[x] 1. Draft "What light is" section
|
||||
[x] 2. Draft "How Rayleigh scattering works" section
|
||||
[x] 3. Draft "Why the sky is blue" section
|
||||
"""
|
||||
Reference in New Issue
Block a user