mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
# Middleware samples
This folder contains focused middleware samples for Agent, chat clients, tools, sessions, and runtime context behavior.
## Files
| File | Description |
|---|---|
| `agent_and_run_level_middleware.py` | Demonstrates combining agent-level and run-level middleware. |
| `agent_loop_middleware_refinement.py` | Demonstrates `AgentLoopMiddleware` with a `should_continue` predicate: a completion-marker refinement loop with feedback tracking and `fresh_context`. |
| `agent_loop_middleware_todos.py` | Demonstrates `AgentLoopMiddleware` with a `should_continue` predicate built from a `TodoProvider` via `todos_remaining`, so the agent keeps working while open todos remain. |
| `agent_loop_middleware_judge.py` | Demonstrates `AgentLoopMiddleware.with_judge`: a `ChatClient` judge re-runs the agent until it decides the original request was answered, with criteria shared between the agent and the judge. |
| `agent_loop_middleware_report.py` | Demonstrates composing two `AgentLoopMiddleware` on one agent: an inner `todos_remaining` loop that drafts a report todo-by-todo, wrapped by an outer report-style `with_judge` loop that re-runs it until an editor chat client judges the report publication-ready. |
| `chat_middleware.py` | Shows class-based and function-based chat middleware that can observe, modify, and override model calls. |
| `class_based_middleware.py` | Shows class-based agent and function middleware. |
| `decorator_middleware.py` | Demonstrates middleware registration with decorators. |
| `exception_handling_with_middleware.py` | Shows how middleware can handle failures and recover cleanly. |
| `function_based_middleware.py` | Shows function-based agent and function middleware. |
| `middleware_termination.py` | Demonstrates stopping a middleware pipeline early. |
| `override_result_with_middleware.py` | Shows how middleware can replace regular and streaming results, then post-process the final response. |
| `runtime_context_delegation.py` | Demonstrates delegating arguments with runtime context data. |
| `session_behavior_middleware.py` | Shows how middleware interacts with session-backed runs. |
| `shared_state_middleware.py` | Demonstrates sharing mutable state across middleware invocations. |
| `usage_tracking_middleware.py` | Demonstrates one chat middleware function that tracks per-call usage in non-streaming and streaming tool-loop runs. |
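
Several of the loop samples listed here share one idea: re-run the agent while a `should_continue` predicate holds, up to an iteration cap. The following is a minimal, framework-free sketch of that control flow. It does not use the `agent_framework` API; `run_in_loop`, `run_once`, the refinement prompt, and the `DONE` marker are all hypothetical names chosen for illustration.

```python
from collections.abc import Callable


def run_in_loop(
    run_once: Callable[[str], str],
    should_continue: Callable[[str], bool],
    prompt: str,
    max_iterations: int = 10,
) -> tuple[str, int]:
    """Re-run `run_once` until `should_continue` returns False or the cap is hit."""
    response = ""
    iterations = 0
    while iterations < max_iterations:
        response = run_once(prompt)
        iterations += 1
        if not should_continue(response):
            break
        # Feed the previous draft back so the next pass can refine it.
        prompt = f"Refine this draft:\n{response}"
    return response, iterations


# Stand-in "agent" (hypothetical) that needs three passes before it
# emits a completion marker.
drafts = iter(["draft 1", "draft 2", "draft 3 DONE"])
final, count = run_in_loop(
    run_once=lambda _prompt: next(drafts),
    should_continue=lambda text: "DONE" not in text,
    prompt="Write a summary.",
)
print(final, count)  # draft 3 DONE 3
```

The iteration cap matters because a predicate that never turns false would otherwise loop forever. The real samples express the same stopping conditions through the middleware rather than a hand-written loop.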
## Running the usage tracking sample
The usage tracking sample uses `OpenAIChatClient`, so set the usual OpenAI responses environment variables first:
```bash
export OPENAI_API_KEY="your-openai-api-key"
export OPENAI_CHAT_MODEL="gpt-4.1-mini"
```
Then run:
```bash
uv run samples/02-agents/middleware/usage_tracking_middleware.py
```
The sample forces a tool call so you can see middleware output for each inner model call in both non-streaming and streaming modes.
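
To make "middleware output for each inner model call" concrete, here is a rough, framework-free sketch of per-call usage tracking. It is not the sample's code and does not use the `agent_framework` middleware signature. `UsageTracker`, `fake_model`, and the word-count "token" measure are hypothetical stand-ins.

```python
from collections.abc import Callable
from dataclasses import dataclass, field


@dataclass
class UsageTracker:
    """Accumulates the token count reported by each wrapped model call."""

    calls: list[int] = field(default_factory=list)

    def wrap(
        self, call_model: Callable[[str], tuple[str, int]]
    ) -> Callable[[str], str]:
        def wrapped(prompt: str) -> str:
            text, tokens = call_model(prompt)
            # Record and report usage for this single inner call.
            self.calls.append(tokens)
            print(f"call {len(self.calls)}: {tokens} tokens")
            return text

        return wrapped

    @property
    def total(self) -> int:
        return sum(self.calls)


def fake_model(prompt: str) -> tuple[str, int]:
    # Hypothetical stand-in: "tokens" is just the number of words in the prompt.
    return prompt.upper(), len(prompt.split())


tracker = UsageTracker()
model = tracker.wrap(fake_model)
model("call the weather tool")                   # first inner call
model("summarize the tool result for the user")  # follow-up call after the tool
print(tracker.total)  # 11
```

A tool-calling run typically involves more than one model call (one to request the tool, one to use its result), which is why tracking per call rather than per run gives a clearer picture.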