Python: Added chat middleware and more examples (#883)

* Added example with stateful middleware

* Added chat middleware

* Updated middleware example with override scenario

* Small revert

* Small fixes

* Added kwargs to context objects

* Added README

* Added function middleware to chat client

* Small refactoring

* Reverted example files

* Made MiddlewareWrapper generic

* Added Middleware exception

* Small refactoring

* Small fix
Dmytro Struk
2025-09-26 08:10:56 -07:00
committed by GitHub
parent 863c8d7471
commit eec7f192eb
19 changed files with 2667 additions and 267 deletions
@@ -0,0 +1,45 @@
# Middleware Examples
This folder contains examples demonstrating various middleware patterns with the Agent Framework. Middleware allows you to intercept and modify behavior at different execution stages, including agent runs, function calls, and chat interactions.
## Examples
| File | Description |
|------|-------------|
| [`function_based_middleware.py`](function_based_middleware.py) | Demonstrates how to implement middleware using simple async functions instead of classes. Shows security validation, logging, and performance monitoring middleware. Function-based middleware is ideal for simple, stateless operations and provides a lightweight approach. |
| [`class_based_middleware.py`](class_based_middleware.py) | Shows how to implement middleware using a class-based approach by inheriting from the `AgentMiddleware` and `FunctionMiddleware` base classes. Includes security checks for sensitive information and detailed function execution logging with timing. |
| [`decorator_middleware.py`](decorator_middleware.py) | Demonstrates how to use `@agent_middleware` and `@function_middleware` decorators to explicitly mark middleware functions without requiring type annotations. Shows different middleware detection scenarios and explicit decorator usage. |
| [`middleware_termination.py`](middleware_termination.py) | Shows how middleware can terminate execution using the `context.terminate` flag. Includes examples of pre-termination (prevents agent processing) and post-termination (allows processing but stops further execution). Useful for security checks, rate limiting, or early exit conditions. |
| [`exception_handling_with_middleware.py`](exception_handling_with_middleware.py) | Demonstrates how to use middleware for centralized exception handling in function calls. Shows how to catch exceptions from functions, provide graceful error responses, and override function results when errors occur to provide user-friendly messages. |
| [`override_result_with_middleware.py`](override_result_with_middleware.py) | Shows how to use middleware to intercept and modify function results after execution, supporting both regular and streaming agent responses. Demonstrates result filtering, formatting, enhancement, and custom streaming response generation. |
| [`shared_state_middleware.py`](shared_state_middleware.py) | Demonstrates how to implement function-based middleware within a class to share state between multiple middleware functions. Shows how middleware can work together by sharing state, including call counting and result enhancement. |
| [`agent_and_run_level_middleware.py`](agent_and_run_level_middleware.py) | Explains the difference between agent-level middleware (applied to ALL runs of the agent) and run-level middleware (applied to specific runs only). Shows security validation, performance monitoring, and context-specific middleware patterns. |
| [`chat_middleware.py`](chat_middleware.py) | Demonstrates how to use chat middleware to observe and override inputs sent to AI models. Shows how to intercept chat requests, log and modify input messages, and override the entire response so that the request never reaches the underlying AI service. |
## Key Concepts
### Middleware Types
- **Agent Middleware**: Intercepts agent run execution, allowing you to modify requests and responses
- **Function Middleware**: Intercepts function calls within agents, enabling logging, validation, and result modification
- **Chat Middleware**: Intercepts chat requests sent to AI models, allowing input/output transformation
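All three middleware types share the same shape in the samples: an async callable that receives a context and a `next` callback, and decides what to do before and after awaiting `next`. The sketch below illustrates that interception pattern in plain Python. `Context`, `run_pipeline`, and `echo_handler` are hypothetical stand-ins invented for this illustration; they are not Agent Framework types, and the framework's own pipeline may be built differently.

```python
import asyncio
from collections.abc import Awaitable, Callable
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Context:
    """Hypothetical stand-in for a middleware context object."""

    messages: list[str]
    result: Any = None
    log: list[str] = field(default_factory=list)


Next = Callable[[Context], Awaitable[None]]
Middleware = Callable[[Context, Next], Awaitable[None]]


async def run_pipeline(context: Context, middleware: list[Middleware], handler: Next) -> Context:
    """Wrap the handler so the first middleware in the list is the outermost one."""
    call_next = handler
    for mw in reversed(middleware):
        # Bind mw and the current call_next as defaults to avoid late binding.
        async def wrapped(ctx: Context, mw: Middleware = mw, inner: Next = call_next) -> None:
            await mw(ctx, inner)

        call_next = wrapped
    await call_next(context)
    return context


async def logging_middleware(context: Context, next: Next) -> None:
    context.log.append("before")  # runs before the wrapped step
    await next(context)
    context.log.append("after")  # runs after the wrapped step


async def echo_handler(context: Context) -> None:
    """Assumed final step; in the samples this would be the agent, function, or chat call."""
    context.log.append("handler")
    context.result = f"echo: {context.messages[-1]}"
```

Calling `asyncio.run(run_pipeline(Context(messages=["hi"]), [logging_middleware], echo_handler))` runs the middleware around the handler, so code placed before `await next(context)` sees the request and code placed after it sees the result.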
### Implementation Approaches
- **Function-based**: Simple async functions for lightweight, stateless operations
- **Class-based**: Inherit from base middleware classes for complex, stateful operations
- **Decorator-based**: Use decorators for explicit middleware marking
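To make the three approaches concrete, here is a rough, framework-free comparison. The class exposes a `process(context, next)` method as the samples do, but `Context`, `compose`, and especially the `mark` decorator are assumptions made up for this sketch; the real `@agent_middleware`, `@function_middleware`, and `@chat_middleware` decorators may work differently internally.

```python
import asyncio
from collections.abc import Awaitable, Callable
from functools import partial
from typing import Any

# Hypothetical shapes for illustration; the real context types live in agent_framework.
Context = dict[str, Any]
Next = Callable[[Context], Awaitable[None]]


async def stamp_middleware(context: Context, next: Next) -> None:
    """Function-based: stateless, only annotates the context."""
    context["stamped"] = True
    await next(context)


class CountingMiddleware:
    """Class-based: keeps state on the instance across calls."""

    def __init__(self) -> None:
        self.calls = 0

    async def process(self, context: Context, next: Next) -> None:
        self.calls += 1
        await next(context)


def mark(kind: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator-based: hypothetical marker that tags a function with its kind."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        func.middleware_kind = kind  # explicit tag instead of relying on annotations
        return func

    return decorator


@mark("function")
async def tagged_middleware(context, next):
    context["tagged"] = True
    await next(context)


async def _invoke(mw, inner: Next, context: Context) -> None:
    await mw(context, inner)


def compose(middleware: list, handler: Next) -> Next:
    """Chain middleware so the first entry runs outermost."""
    call_next = handler
    for mw in reversed(middleware):
        call_next = partial(_invoke, mw, call_next)
    return call_next


async def handler(context: Context) -> None:
    context["result"] = "done"


async def demo() -> tuple[Context, int]:
    counter = CountingMiddleware()
    pipeline = compose([stamp_middleware, counter.process, tagged_middleware], handler)
    context: Context = {}
    await pipeline(context)
    await pipeline({})  # a second run shows the instance state persisting
    return context, counter.calls
```

The function and the bound `process` method are interchangeable from the pipeline's point of view; the class form is only worth the extra ceremony when state such as `calls` has to survive between invocations.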
### Common Use Cases
- **Security**: Validate requests, block sensitive information, implement access controls
- **Logging**: Track execution timing, log parameters and results, monitor performance
- **Error Handling**: Catch exceptions, provide graceful fallbacks, implement retry logic
- **Result Transformation**: Filter, format, or enhance function outputs
- **State Management**: Share data between middleware functions, maintain execution context
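As an illustration of the error-handling and result-transformation use cases, the following sketch nests two middleware functions around a function that may fail. `FunctionContext`, `flaky_lookup`, and `invoke` are hypothetical names for this example only, and the fallback message is arbitrary.

```python
import asyncio
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from typing import Any


@dataclass
class FunctionContext:
    """Hypothetical stand-in for a function invocation context."""

    name: str
    arguments: dict[str, Any]
    result: Any = None


Next = Callable[[FunctionContext], Awaitable[None]]


async def error_handling_middleware(context: FunctionContext, next: Next) -> None:
    """Catch a failure from the wrapped call and substitute a friendly result."""
    try:
        await next(context)
    except TimeoutError:
        context.result = "The service is temporarily unavailable."


async def uppercase_result_middleware(context: FunctionContext, next: Next) -> None:
    """Transform the result after the wrapped call has produced it."""
    await next(context)
    if isinstance(context.result, str):
        context.result = context.result.upper()


async def flaky_lookup(context: FunctionContext) -> None:
    """Assumed tool function that fails when asked to."""
    if context.arguments.get("fail"):
        raise TimeoutError("upstream timed out")
    context.result = f"data for {context.arguments['key']}"


async def invoke(arguments: dict[str, Any]) -> Any:
    context = FunctionContext(name="lookup", arguments=arguments)

    # The error handler is outermost, so it also covers the transformer.
    async def transformed(ctx: FunctionContext) -> None:
        await uppercase_result_middleware(ctx, flaky_lookup)

    await error_handling_middleware(context, transformed)
    return context.result
```

With this ordering, a successful call is uppercased, while a failing call skips the transformation entirely because the exception propagates straight to the outer handler, which then sets the fallback.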
### Execution Control
- **Termination**: Use `context.terminate` to stop execution early
- **Result Override**: Modify or replace function/agent results
- **Streaming Support**: Handle both regular and streaming responses
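The three execution-control mechanisms can be sketched together as follows. This is a simplified model under stated assumptions: `RunContext`, `agent`, and `run` are hypothetical, the stream yields plain strings rather than framework update objects, and the sketch does not claim to reproduce how the framework itself reacts to `terminate`.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable
from dataclasses import dataclass
from typing import Any


@dataclass
class RunContext:
    """Hypothetical stand-in for an agent run context."""

    query: str
    is_streaming: bool = False
    result: Any = None
    terminate: bool = False


Next = Callable[[RunContext], Awaitable[None]]


async def guard_middleware(context: RunContext, next: Next) -> None:
    """Termination: set a result, flag termination, and skip next()."""
    if "password" in context.query.lower():
        context.result = "Request blocked."
        context.terminate = True
        return
    await next(context)


async def override_middleware(context: RunContext, next: Next) -> None:
    """Result override: replace whatever the wrapped step produced."""
    await next(context)
    if context.result is None:
        return
    chunks = ["All ", "clear ", "today."]
    if context.is_streaming:
        # Streaming: the result becomes an async generator of chunks.
        async def stream() -> AsyncIterator[str]:
            for chunk in chunks:
                yield chunk

        context.result = stream()
    else:
        # Non-streaming: the result is the joined text.
        context.result = "".join(chunks)


async def agent(context: RunContext) -> None:
    """Assumed final step standing in for the real agent run."""
    context.result = "original answer"


async def run(query: str, streaming: bool = False) -> str:
    context = RunContext(query=query, is_streaming=streaming)

    async def inner(ctx: RunContext) -> None:
        await override_middleware(ctx, agent)

    await guard_middleware(context, inner)
    if context.is_streaming and not context.terminate:
        return "".join([chunk async for chunk in context.result])
    return context.result
```

Because the guard returns before awaiting `next`, neither the override middleware nor the agent runs for a blocked query, which is the pre-termination pattern described in `middleware_termination.py`.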
@@ -0,0 +1,245 @@
# Copyright (c) Microsoft. All rights reserved.
import asyncio
from collections.abc import Awaitable, Callable
from random import randint
from typing import Annotated
from agent_framework import (
ChatContext,
ChatMessage,
ChatMiddleware,
ChatResponse,
Role,
chat_middleware,
)
from agent_framework.azure import AzureAIAgentClient
from azure.identity.aio import AzureCliCredential
from pydantic import Field
"""
Chat Middleware Example
This sample demonstrates how to use chat middleware to observe and override
inputs sent to AI models. Chat middleware intercepts chat requests before they reach
the underlying AI service, allowing you to:
1. Observe and log input messages
2. Modify input messages before sending to AI
3. Override the entire response
The example covers:
- Class-based chat middleware inheriting from ChatMiddleware
- Function-based chat middleware with @chat_middleware decorator
- Middleware registration at agent level (applies to all runs)
- Middleware registration at run level (applies to specific run only)
"""
def get_weather(
location: Annotated[str, Field(description="The location to get the weather for.")],
) -> str:
"""Get the weather for a given location."""
conditions = ["sunny", "cloudy", "rainy", "stormy"]
return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."
class InputObserverMiddleware(ChatMiddleware):
"""Class-based middleware that observes and modifies input messages."""
def __init__(self, replacement: str | None = None):
"""Initialize with a replacement for user messages."""
self.replacement = replacement
async def process(
self,
context: ChatContext,
next: Callable[[ChatContext], Awaitable[None]],
) -> None:
"""Observe and modify input messages before they are sent to AI."""
print("[InputObserverMiddleware] Observing input messages:")
for i, message in enumerate(context.messages):
content = message.text if message.text else str(message.contents)
print(f" Message {i + 1} ({message.role.value}): {content}")
print(f"[InputObserverMiddleware] Total messages: {len(context.messages)}")
# Modify user messages by creating new messages with enhanced text
modified_messages: list[ChatMessage] = []
modified_count = 0
for message in context.messages:
if message.role == Role.USER and message.text:
original_text = message.text
updated_text = original_text
if self.replacement:
updated_text = self.replacement
print(f"[InputObserverMiddleware] Updated: '{original_text}' -> '{updated_text}'")
modified_message = ChatMessage(role=message.role, text=updated_text)
modified_messages.append(modified_message)
modified_count += 1
else:
modified_messages.append(message)
# Replace messages in context
context.messages[:] = modified_messages
# Continue to next middleware or AI execution
await next(context)
# Observe that processing is complete
print("[InputObserverMiddleware] Processing completed")
@chat_middleware
async def security_and_override_middleware(
context: ChatContext,
next: Callable[[ChatContext], Awaitable[None]],
) -> None:
"""Function-based middleware that implements security filtering and response override."""
print("[SecurityMiddleware] Processing input...")
# Security check - block sensitive information
blocked_terms = ["password", "secret", "api_key", "token"]
for message in context.messages:
if message.text:
message_lower = message.text.lower()
for term in blocked_terms:
if term in message_lower:
print(f"[SecurityMiddleware] BLOCKED: Found '{term}' in message")
# Override the response instead of calling AI
context.result = ChatResponse(
messages=[
ChatMessage(
role=Role.ASSISTANT,
text="I cannot process requests containing sensitive information. "
"Please rephrase your question without including passwords, secrets, or other "
"sensitive data.",
)
]
)
# Set terminate flag to stop execution
context.terminate = True
return
# Continue to next middleware or AI execution
await next(context)
async def class_based_chat_middleware() -> None:
"""Demonstrate class-based middleware at agent level."""
print("\n" + "=" * 60)
print("Class-based Chat Middleware (Agent Level)")
print("=" * 60)
# For authentication, run `az login` command in terminal or replace AzureCliCredential with preferred
# authentication option.
async with (
AzureCliCredential() as credential,
AzureAIAgentClient(async_credential=credential).create_agent(
name="EnhancedChatAgent",
instructions="You are a helpful AI assistant.",
# Register class-based middleware at agent level (applies to all runs)
middleware=InputObserverMiddleware(),
tools=get_weather,
) as agent,
):
query = "What's the weather in Seattle?"
print(f"User: {query}")
result = await agent.run(query)
print(f"Final Response: {result.text if result.text else 'No response'}")
async def function_based_chat_middleware() -> None:
"""Demonstrate function-based middleware at agent level."""
print("\n" + "=" * 60)
print("Function-based Chat Middleware (Agent Level)")
print("=" * 60)
async with (
AzureCliCredential() as credential,
AzureAIAgentClient(async_credential=credential).create_agent(
name="FunctionMiddlewareAgent",
instructions="You are a helpful AI assistant.",
# Register function-based middleware at agent level
middleware=security_and_override_middleware,
) as agent,
):
# Scenario with normal query
print("\n--- Scenario 1: Normal Query ---")
query = "Hello, how are you?"
print(f"User: {query}")
result = await agent.run(query)
print(f"Final Response: {result.text if result.text else 'No response'}")
# Scenario with security violation
print("\n--- Scenario 2: Security Violation ---")
query = "What is my password for this account?"
print(f"User: {query}")
result = await agent.run(query)
print(f"Final Response: {result.text if result.text else 'No response'}")
async def run_level_middleware() -> None:
"""Demonstrate middleware registration at run level."""
print("\n" + "=" * 60)
print("Run-level Chat Middleware")
print("=" * 60)
async with (
AzureCliCredential() as credential,
AzureAIAgentClient(async_credential=credential).create_agent(
name="RunLevelAgent",
instructions="You are a helpful AI assistant.",
tools=get_weather,
# No middleware at agent level
) as agent,
):
# Scenario 1: Run without any middleware
print("\n--- Scenario 1: No Middleware ---")
query = "What's the weather in Tokyo?"
print(f"User: {query}")
result = await agent.run(query)
print(f"Response: {result.text if result.text else 'No response'}")
# Scenario 2: Run with specific middleware for this call only (both enhancement and security)
print("\n--- Scenario 2: With Run-level Middleware ---")
print(f"User: {query}")
result = await agent.run(
query,
middleware=[
InputObserverMiddleware(replacement="What's the weather in Madrid?"),
security_and_override_middleware,
],
)
print(f"Response: {result.text if result.text else 'No response'}")
# Scenario 3: Security test with run-level middleware
print("\n--- Scenario 3: Security Test with Run-level Middleware ---")
query = "Can you help me with my secret API key?"
print(f"User: {query}")
result = await agent.run(
query,
middleware=security_and_override_middleware,
)
print(f"Response: {result.text if result.text else 'No response'}")
async def main() -> None:
"""Run all chat middleware examples."""
print("Chat Middleware Examples")
print("========================")
await class_based_chat_middleware()
await function_based_chat_middleware()
await run_level_middleware()
if __name__ == "__main__":
asyncio.run(main())
@@ -1,28 +1,37 @@
# Copyright (c) Microsoft. All rights reserved.
import asyncio
from collections.abc import Awaitable, Callable
from collections.abc import AsyncIterable, Awaitable, Callable
from random import randint
from typing import Annotated
from agent_framework import FunctionInvocationContext
from agent_framework import (
AgentRunContext,
AgentRunResponse,
AgentRunResponseUpdate,
ChatMessage,
Role,
TextContent,
)
from agent_framework.azure import AzureAIAgentClient
from azure.identity.aio import AzureCliCredential
from pydantic import Field
"""
Result Override with Middleware
Result Override with Middleware (Regular and Streaming)
This sample demonstrates how to use middleware to intercept and modify function results
after execution. The example shows:
after execution, supporting both regular and streaming agent responses. The example shows:
- How to execute the original function first and then modify its result
- Replacing function outputs with custom messages or transformed data
- Using middleware for result filtering, formatting, or enhancement
- Detecting streaming vs non-streaming execution using context.is_streaming
- Overriding streaming results with custom async generators
The weather override middleware lets the original weather function execute normally,
then replaces its result with a custom "perfect weather" message, demonstrating
how middleware can be used for content filtering, A/B testing, or result enhancement.
then replaces its result with a custom "perfect weather" message. For streaming responses,
it creates a custom async generator that yields the override message in chunks.
"""
@@ -35,32 +44,39 @@ def get_weather(
async def weather_override_middleware(
context: FunctionInvocationContext, next: Callable[[FunctionInvocationContext], Awaitable[None]]
context: AgentRunContext, next: Callable[[AgentRunContext], Awaitable[None]]
) -> None:
function_name = context.function.name
"""Middleware that overrides weather results for both streaming and non-streaming cases."""
# Let the original function execute first
# Let the original agent execution complete first
await next(context)
# Override the result if it's a weather function
if function_name == "get_weather" and context.result is not None:
original_result = str(context.result)
print(f"[WeatherOverrideMiddleware] Original result: {original_result}")
# Check if there's a result to override (agent called weather function)
if context.result is not None:
# Create custom weather message
chunks = [
"Weather Advisory - ",
"due to special atmospheric conditions, ",
"all locations are experiencing perfect weather today! ",
"Temperature is a comfortable 22°C with gentle breezes. ",
"Perfect day for outdoor activities!",
]
# Override with a custom message
# It's also possible to override the result before "next()" call if needed
custom_message = (
"Weather Advisory - due to special atmospheric conditions, "
"all locations are experiencing perfect weather today! "
"Temperature is a comfortable 22°C with gentle breezes. "
"Perfect day for outdoor activities!"
)
context.result = custom_message
print(f"[WeatherOverrideMiddleware] Overriding with custom message: {custom_message}")
if context.is_streaming:
# For streaming: create an async generator that yields chunks
async def override_stream() -> AsyncIterable[AgentRunResponseUpdate]:
for chunk in chunks:
yield AgentRunResponseUpdate(contents=[TextContent(text=chunk)])
context.result = override_stream()
else:
# For non-streaming: replace the result with an AgentRunResponse containing the joined message
custom_message = "".join(chunks)
context.result = AgentRunResponse(messages=[ChatMessage(role=Role.ASSISTANT, text=custom_message)])
async def main() -> None:
"""Example demonstrating result override with middleware."""
"""Example demonstrating result override with middleware for both streaming and non-streaming."""
print("=== Result Override Middleware Example ===")
# For authentication, run `az login` command in terminal or replace AzureCliCredential with preferred
@@ -74,11 +90,22 @@ async def main() -> None:
middleware=weather_override_middleware,
) as agent,
):
# Non-streaming example
print("\n--- Non-streaming Example ---")
query = "What's the weather like in Seattle?"
print(f"User: {query}")
result = await agent.run(query)
print(f"Agent: {result}")
# Streaming example
print("\n--- Streaming Example ---")
query = "What's the weather like in Portland?"
print(f"User: {query}")
print("Agent: ", end="", flush=True)
async for chunk in agent.run_stream(query):
if chunk.text:
print(chunk.text, end="", flush=True)
if __name__ == "__main__":
asyncio.run(main())
@@ -0,0 +1,128 @@
# Copyright (c) Microsoft. All rights reserved.
import asyncio
from collections.abc import Awaitable, Callable
from random import randint
from typing import Annotated
from agent_framework import (
FunctionInvocationContext,
)
from agent_framework.azure import AzureAIAgentClient
from azure.identity.aio import AzureCliCredential
from pydantic import Field
"""
Shared State Function-based Middleware Example
This sample demonstrates how to implement function-based middleware within a class to share state.
The example includes:
- A MiddlewareContainer class with two simple function middleware methods
- First middleware: Counts function calls and stores the count in shared state
- Second middleware: Uses the shared count to add call numbers to function results
This approach shows how middleware can work together by sharing state within the same class instance.
"""
def get_weather(
location: Annotated[str, Field(description="The location to get the weather for.")],
) -> str:
"""Get the weather for a given location."""
conditions = ["sunny", "cloudy", "rainy", "stormy"]
return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."
def get_time(
timezone: Annotated[str, Field(description="The timezone to get the time for.")] = "UTC",
) -> str:
"""Get the current time for a given timezone."""
import datetime
return f"The current time in {timezone} is {datetime.datetime.now().strftime('%H:%M:%S')}"
class MiddlewareContainer:
"""Container class that holds middleware functions with shared state."""
def __init__(self) -> None:
# Simple shared state: count function calls
self.call_count: int = 0
async def call_counter_middleware(
self,
context: FunctionInvocationContext,
next: Callable[[FunctionInvocationContext], Awaitable[None]],
) -> None:
"""First middleware: increments call count in shared state."""
# Increment the shared call count
self.call_count += 1
print(f"[CallCounter] This is function call #{self.call_count}")
# Call the next middleware/function
await next(context)
async def result_enhancer_middleware(
self,
context: FunctionInvocationContext,
next: Callable[[FunctionInvocationContext], Awaitable[None]],
) -> None:
"""Second middleware: uses shared call count to enhance function results."""
print(f"[ResultEnhancer] Current total calls so far: {self.call_count}")
# Call the next middleware/function
await next(context)
# After function execution, enhance the result using shared state
if context.result:
enhanced_result = f"[Call #{self.call_count}] {context.result}"
context.result = enhanced_result
print("[ResultEnhancer] Enhanced result with call number")
async def main() -> None:
"""Example demonstrating shared state function-based middleware."""
print("=== Shared State Function-based Middleware Example ===")
# Create middleware container with shared state
middleware_container = MiddlewareContainer()
# For authentication, run `az login` command in terminal or replace AzureCliCredential with preferred
# authentication option.
async with (
AzureCliCredential() as credential,
AzureAIAgentClient(async_credential=credential).create_agent(
name="UtilityAgent",
instructions="You are a helpful assistant that can provide weather information and current time.",
tools=[get_weather, get_time],
# Pass both middleware functions from the same container instance
# Order matters: counter runs first to increment count,
# then result enhancer uses the updated count
middleware=[
middleware_container.call_counter_middleware,
middleware_container.result_enhancer_middleware,
],
) as agent,
):
# Test multiple requests to see shared state in action
queries = [
"What's the weather like in New York?",
"What time is it in London?",
"What's the weather in Tokyo?",
]
for i, query in enumerate(queries, 1):
print(f"\n--- Query {i} ---")
print(f"User: {query}")
result = await agent.run(query)
print(f"Agent: {result.text if result.text else 'No response'}")
# Display final statistics
print("\n=== Final Statistics ===")
print(f"Total function calls made: {middleware_container.call_count}")
if __name__ == "__main__":
asyncio.run(main())