mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
Python: [BREAKING] update context provider APIs, middleware, and per-service-call history persistence (#4992)
* Rename provider base APIs
* Allow provider-added chat and function middleware
* Simulate service-stored history per model call
* Fix typing regressions in CI
* Fix response ID suppression review feedback
* Rename per-service-call history persistence APIs
* Address context persistence review feedback
* Stabilize markdown sample docs
* Persist service continuation state per call

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Committed by GitHub
Parent: 38de991481
Commit: b065a4ce51
@@ -9,6 +9,7 @@ This folder contains examples for direct chat client usage patterns.
 | [`built_in_chat_clients.py`](built_in_chat_clients.py) | Consolidated sample for built-in chat clients. Uses `get_client()` to create the selected client and pass it to `main()`. |
 | [`chat_response_cancellation.py`](chat_response_cancellation.py) | Demonstrates how to cancel chat responses during streaming, showing proper cancellation handling and cleanup. |
 | [`custom_chat_client.py`](custom_chat_client.py) | Demonstrates how to create custom chat clients by extending the `BaseChatClient` class. Shows a `EchoingChatClient` implementation and how to integrate it with `Agent` using the `as_agent()` method. |
+| [`require_per_service_call_history_persistence.py`](require_per_service_call_history_persistence.py) | Compares two otherwise identical `FoundryChatClient` agents with `store=False`; the only difference is whether `require_per_service_call_history_persistence` is enabled, and only the run without it stores the synthesized tool result when middleware terminates the loop early. |
 
 ## Selecting a built-in client
 
@@ -35,6 +36,15 @@ Example:
 uv run samples/02-agents/chat_client/built_in_chat_clients.py
 ```
 
+The `require_per_service_call_history_persistence.py` sample uses `FoundryChatClient`, so set the usual Foundry settings first and sign in with the Azure CLI:
+
+```bash
+export FOUNDRY_PROJECT_ENDPOINT="https://<your-project>.services.ai.azure.com/api/projects/<project-name>"
+export FOUNDRY_MODEL="<your-model-deployment-name>"
+az login
+uv run samples/02-agents/chat_client/require_per_service_call_history_persistence.py
+```
+
 ## Environment Variables
 
 Depending on the selected client, set the appropriate environment variables:
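The setup steps in the README hunk above depend on two environment variables being present before the sample starts. A minimal sketch of a pre-flight check follows; the helper name `missing_settings` and the `REQUIRED_SETTINGS` tuple are hypothetical and are not part of the sample or the framework. Only the two variable names come from the README text.

```python
import os
from collections.abc import Mapping

# Variable names taken from the README snippet above.
REQUIRED_SETTINGS = ("FOUNDRY_PROJECT_ENDPOINT", "FOUNDRY_MODEL")


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the required setting names that are unset or empty in `env`.

    Hypothetical helper for illustration only.
    """
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Set these variables before running the sample: " + ", ".join(missing))
    else:
        print("Foundry settings found.")
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real process environment.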
@@ -0,0 +1,194 @@
+# Copyright (c) Microsoft. All rights reserved.
+
+from __future__ import annotations
+
+import asyncio
+from collections.abc import Awaitable, Callable
+from typing import Annotated
+
+from agent_framework import (
+    Agent,
+    FunctionInvocationContext,
+    FunctionMiddleware,
+    InMemoryHistoryProvider,
+    Message,
+    MiddlewareTermination,
+)
+from agent_framework.foundry import FoundryChatClient
+from azure.identity import AzureCliCredential
+from dotenv import load_dotenv
+from pydantic import Field
+
+"""
+Compare Foundry agents with and without per-service-call chat history persistence.
+
+This sample runs two otherwise identical Foundry agents with ``store=False`` so
+history stays local for both runs.
+
+The sample adds a function middleware that raises ``MiddlewareTermination``
+immediately after the tool runs, so the request stops before a second model
+call.
+
+That early termination is the important difference:
+
+- Without per-service-call chat history persistence, the synthesized tool result is
+  still written to local history.
+- With ``require_per_service_call_history_persistence=True``, that synthesized tool result is
+  not written to local history.
+
+The per-service-call persistence case matches service-side storage behavior. When a terminated
+request never sends the tool result back to the service, that result also never
+becomes part of the service-managed history.
+"""
+
+# Load environment variables from .env file
+load_dotenv()
+
+
+def lookup_weather(
+    location: Annotated[str, Field(description="The location to get the weather for.")],
+) -> str:
+    """Return a deterministic weather result for the requested location."""
+    return f"The weather in {location} is sunny."
+
+
+class TerminateAfterToolMiddleware(FunctionMiddleware):
+    """Stop the tool loop after the first tool finishes."""
+
+    async def process(
+        self,
+        context: FunctionInvocationContext,
+        call_next: Callable[[], Awaitable[None]],
+    ) -> None:
+        """Run the tool, then terminate the loop with that tool result."""
+        await call_next()
+        raise MiddlewareTermination(result=context.result)
+
+
+def _describe_message(message: Message) -> str:
+    """Render one stored message in a compact, readable format."""
+    parts: list[str] = []
+    for content in message.contents:
+        if content.type == "text" and content.text:
+            parts.append(content.text)
+        elif content.type == "function_call":
+            parts.append(f"function_call -> {content.name}({content.arguments})")
+        elif content.type == "function_result":
+            parts.append(f"function_result -> {content.result}")
+        else:
+            parts.append(content.type)
+
+    return f"{message.role}: {' | '.join(parts)}"
+
+
+def _includes_tool_result(messages: list[Message]) -> bool:
+    """Return whether any stored message contains a tool result."""
+    return any(content.type == "function_result" for message in messages for content in message.contents)
+
+
+async def main() -> None:
+    """Run both comparison scenarios."""
+    print("=== require_per_service_call_history_persistence when middleware terminates the tool loop ===\n")
+
+    # 1. Create one Foundry chat client that both agents will share.
+    client = FoundryChatClient(credential=AzureCliCredential())
+    query = "What is the weather in Seattle, and should I bring sunglasses?"
+
+    # 2. Create and run the agent without per-service-call persistence.
+    agent_without_persistence = Agent(
+        client=client,
+        instructions=(
+            "You are a weather assistant. Call lookup_weather exactly once before answering "
+            "any weather question, then summarize the tool result in one short paragraph."
+        ),
+        tools=[lookup_weather],
+        context_providers=[InMemoryHistoryProvider()],
+        middleware=[TerminateAfterToolMiddleware()],
+        default_options={"tool_choice": "required", "store": False},
+    )
+    session_without_persistence = agent_without_persistence.create_session()
+    await agent_without_persistence.run(
+        query,
+        session=session_without_persistence,
+    )
+    stored_messages_without_persistence = session_without_persistence.state[InMemoryHistoryProvider.DEFAULT_SOURCE_ID][
+        "messages"
+    ]
+
+    print("=== Without per-service-call persistence ===")
+    print("Loop terminated immediately after the tool finished.")
+    print(f"Stored synthesized tool result: {_includes_tool_result(stored_messages_without_persistence)}")
+    print("Stored history:")
+    for index, message in enumerate(stored_messages_without_persistence, start=1):
+        print(f" {index}. {_describe_message(message)}")
+    print()
+
+    # 3. Create and run the agent with per-service-call persistence enabled.
+    agent_with_persistence = Agent(
+        client=client,
+        instructions=(
+            "You are a weather assistant. Call lookup_weather exactly once before answering "
+            "any weather question, then summarize the tool result in one short paragraph."
+        ),
+        tools=[lookup_weather],
+        context_providers=[InMemoryHistoryProvider()],
+        middleware=[TerminateAfterToolMiddleware()],
+        require_per_service_call_history_persistence=True,
+        default_options={"tool_choice": "required", "store": False},
+    )
+    session_with_persistence = agent_with_persistence.create_session()
+    await agent_with_persistence.run(
+        query,
+        session=session_with_persistence,
+    )
+    stored_messages_with_persistence = session_with_persistence.state[InMemoryHistoryProvider.DEFAULT_SOURCE_ID][
+        "messages"
+    ]
+
+    print("=== With per-service-call persistence ===")
+    print("Loop terminated immediately after the tool finished.")
+    print(f"Stored synthesized tool result: {_includes_tool_result(stored_messages_with_persistence)}")
+    print("Stored history:")
+    for index, message in enumerate(stored_messages_with_persistence, start=1):
+        print(f" {index}. {_describe_message(message)}")
+    print()
+
+    # 4. Summarize the effect of the flag.
+    print(
+        "Both runs used FoundryChatClient with store=False and terminated right after the tool. "
+        "Without per-service-call persistence, local history still stored the synthesized tool result. "
+        "With per-service-call persistence, local history stopped at the assistant function-call message instead, "
+        "which matches service-side storage because the terminated tool result is never sent back to the service."
+    )
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
+
+
+"""
+Sample output:
+=== require_per_service_call_history_persistence when middleware terminates the tool loop ===
+
+=== Without per-service-call persistence ===
+Loop terminated immediately after the tool finished.
+Stored synthesized tool result: True
+Stored history:
+ 1. user: What is the weather in Seattle, and should I bring sunglasses?
+ 2. assistant: function_call -> lookup_weather({"location":"Seattle"})
+ 3. tool: function_result -> The weather in Seattle is sunny.
+
+=== With per-service-call persistence ===
+Loop terminated immediately after the tool finished.
+Stored synthesized tool result: False
+Stored history:
+ 1. user: What is the weather in Seattle, and should I bring sunglasses?
+ 2. assistant: function_call -> lookup_weather({"location":"Seattle"})
+
+Both runs used FoundryChatClient with store=False and terminated right after
+the tool. Without per-service-call persistence, local history still stored the
+synthesized tool result. With per-service-call persistence, local history
+stopped at the assistant function-call message instead, which matches
+service-side storage because the terminated tool result is never sent back to
+the service.
+"""
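The sample above needs a Foundry project to run. The difference it demonstrates can also be sketched with a dependency-free toy model. Everything in the block below is a hypothetical stand-in written for illustration: `Termination`, `ToyHistory`, and `run_once` are not framework APIs, and the real persistence logic in `agent_framework` may differ in detail. The sketch assumes only what the sample's docstring states: history is written either once at the end of the run or once per service call, and a terminated tool result never reaches a second service call.

```python
from dataclasses import dataclass, field


class Termination(Exception):
    """Hypothetical stand-in for MiddlewareTermination."""

    def __init__(self, result: str) -> None:
        super().__init__(result)
        self.result = result


@dataclass
class ToyHistory:
    """Hypothetical local store; not the framework's InMemoryHistoryProvider."""

    messages: list[tuple[str, str]] = field(default_factory=list)


def lookup_weather(location: str) -> str:
    return f"The weather in {location} is sunny."


def run_once(query: str, *, per_service_call: bool) -> list[tuple[str, str]]:
    """Simulate one run whose tool loop is terminated right after the tool."""
    history = ToyHistory()
    pending: list[tuple[str, str]] = [("user", query)]

    # Service call 1: the fake model replies with a function call.
    pending.append(("assistant", "function_call -> lookup_weather(Seattle)"))
    if per_service_call:
        # Persist what this service call received and produced, then reset.
        history.messages.extend(pending)
        pending = []

    try:
        # The tool runs, then the middleware stops the loop with its result.
        raise Termination(lookup_weather("Seattle"))
    except Termination as stop:
        pending.append(("tool", f"function_result -> {stop.result}"))

    if not per_service_call:
        # End-of-run persistence writes everything, tool result included.
        history.messages.extend(pending)
    # In per-service-call mode the pending tool result is dropped, because no
    # later service call ever receives it.
    return history.messages


if __name__ == "__main__":
    for flag in (False, True):
        roles = [role for role, _ in run_once("Weather in Seattle?", per_service_call=flag)]
        print(f"per_service_call={flag}: {roles}")
```

In this toy model the run without the flag stores the user, assistant, and tool messages, while the run with the flag stops at the assistant function-call message, mirroring the sample output recorded in the file.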
@@ -5,7 +5,7 @@ import os
 from contextlib import suppress
 from typing import Any
 
-from agent_framework import Agent, AgentSession, BaseContextProvider, SessionContext, SupportsChatGetResponse
+from agent_framework import Agent, AgentSession, ContextProvider, SessionContext, SupportsChatGetResponse
 from agent_framework.foundry import FoundryChatClient
 from azure.identity import AzureCliCredential
 from dotenv import load_dotenv
@@ -20,7 +20,7 @@ class UserInfo(BaseModel):
     age: int | None = None
 
 
-class UserInfoMemory(BaseContextProvider):
+class UserInfoMemory(ContextProvider):
     DEFAULT_SOURCE_ID = "user_info_memory"
 
     def __init__(self, source_id: str = DEFAULT_SOURCE_ID, *, client: SupportsChatGetResponse, **kwargs: Any):
@@ -4,7 +4,7 @@ import asyncio
 from collections.abc import Sequence
 from typing import Any
 
-from agent_framework import Agent, AgentSession, BaseHistoryProvider, Message
+from agent_framework import Agent, AgentSession, HistoryProvider, Message
 from agent_framework.openai import OpenAIChatClient
 from dotenv import load_dotenv
 
@@ -20,7 +20,7 @@ preferred storage solution (database, file system, etc.).
 """
 
 
-class CustomHistoryProvider(BaseHistoryProvider):
+class CustomHistoryProvider(HistoryProvider):
     """Implementation of custom history provider.
     In real applications, this can be an implementation of relational database or vector store."""
 
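The hunks above rename `BaseContextProvider` to `ContextProvider` and `BaseHistoryProvider` to `HistoryProvider`. Code that has to import from releases on both sides of the rename could resolve the class by name with a fallback. The helper below is a hypothetical sketch, not something the commit adds; `resolve_renamed` and the two stand-in namespaces are invented for illustration so the block runs without `agent_framework` installed.

```python
from types import SimpleNamespace
from typing import Any


def resolve_renamed(module: Any, new_name: str, old_name: str) -> Any:
    """Return `module.<new_name>` if present, otherwise `module.<old_name>`.

    Hypothetical helper: raises AttributeError if neither name exists.
    """
    try:
        return getattr(module, new_name)
    except AttributeError:
        return getattr(module, old_name)


# Stand-in "modules" that mimic a release before and after the rename.
old_release = SimpleNamespace(BaseContextProvider=type("BaseContextProvider", (), {}))
new_release = SimpleNamespace(ContextProvider=type("ContextProvider", (), {}))

if __name__ == "__main__":
    for release in (old_release, new_release):
        base = resolve_renamed(release, "ContextProvider", "BaseContextProvider")
        print(base.__name__)
```

With the real package, the first argument would be the imported `agent_framework` module. Since the commit is marked breaking, updating the imports directly as the hunks do is the simpler path when only the new release needs to be supported.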