Custom Agent and Chat Client Examples

This folder contains examples demonstrating how to implement custom agents and chat clients using the Microsoft Agent Framework.

Examples

| File | Description |
| --- | --- |
| custom_agent.py | Shows how to create custom agents by extending the BaseAgent class. Demonstrates the EchoAgent implementation with both streaming and non-streaming responses, proper session management, and message history handling. |
| custom_chat_client.py | Demonstrates how to create custom chat clients by extending the BaseChatClient class. Shows an EchoingChatClient implementation and how to integrate it with Agent using the as_agent() method. |

Key Takeaways

Custom Agents

  • Custom agents give you complete control over the agent's behavior
  • You must implement run() for both the stream=True and stream=False cases
  • Use self._normalize_messages() to handle different input message formats
  • Store messages in session.state to properly manage conversation history
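
The points above can be illustrated with a small, framework-independent sketch. It does not import the Agent Framework: ToySession and ToyEchoAgent are hypothetical stand-ins, and the real BaseAgent, session, and message types have richer signatures (see custom_agent.py for the actual implementation). The sketch only shows the shape of a single run() that serves both stream=True and stream=False, normalizes its input, and records history in session.state.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToySession:
    """Hypothetical stand-in for a framework session; only `state` is modeled."""

    state: dict[str, Any] = field(default_factory=dict)


class ToyEchoAgent:
    """Hypothetical echo agent. It does NOT extend the real BaseAgent."""

    def _normalize_messages(self, messages: str | list[str] | None) -> list[str]:
        # Accept None, a single string, or a list of strings.
        if messages is None:
            return []
        if isinstance(messages, str):
            return [messages]
        return list(messages)

    def run(self, messages=None, *, stream: bool = False, session: ToySession | None = None):
        # One entry point for both modes: an async iterator when streaming,
        # an awaitable otherwise.
        if session is None:
            session = ToySession()
        normalized = self._normalize_messages(messages)
        if stream:
            return self._run_stream(normalized, session)
        return self._run_once(normalized, session)

    def _remember(self, session: ToySession, messages: list[str], reply: str) -> None:
        # Conversation history lives in session.state.
        session.state.setdefault("messages", []).extend([*messages, reply])

    async def _run_once(self, messages: list[str], session: ToySession) -> str:
        reply = "Echo: " + " ".join(messages)
        self._remember(session, messages, reply)
        return reply

    async def _run_stream(self, messages: list[str], session: ToySession) -> AsyncIterator[str]:
        reply = "Echo: " + " ".join(messages)
        for word in reply.split(" "):
            yield word  # emit the reply one word at a time
        self._remember(session, messages, reply)


async def demo() -> None:
    agent, session = ToyEchoAgent(), ToySession()
    print(await agent.run("hello world", session=session))
    async for chunk in agent.run("hello again", stream=True, session=session):
        print(chunk)
    print(session.state["messages"])


if __name__ == "__main__":
    asyncio.run(demo())
```

Returning either an awaitable or an async iterator from one run() method keeps the calling convention identical in both modes; only the way the caller consumes the result changes.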

Custom Chat Clients

  • Custom chat clients allow you to integrate any backend service or create new LLM providers
  • You must implement _inner_get_response() with a stream parameter to handle both streaming and non-streaming responses
  • Custom chat clients can be used with Agent to leverage all agent framework features
  • Use the as_agent() method to easily create agents from your custom chat clients
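
As a rough illustration of the single-hook design described above, the following sketch uses hypothetical stand-in classes rather than the framework itself. ToyBaseChatClient and ToyEchoingChatClient are invented for this example; the real BaseChatClient works with message, options, and response types that are not modeled here, and as_agent() is omitted. See custom_chat_client.py for the actual implementation.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator, Awaitable


class ToyBaseChatClient(ABC):
    """Hypothetical stand-in for the base class; the real BaseChatClient differs."""

    def get_response(self, messages: list[str], *, stream: bool = False):
        # The public entry point delegates to the one hook subclasses override.
        return self._inner_get_response(messages=messages, stream=stream)

    @abstractmethod
    def _inner_get_response(
        self, *, messages: list[str], stream: bool
    ) -> Awaitable[str] | AsyncIterator[str]:
        """Return an awaitable for stream=False, an async iterator for stream=True."""


class ToyEchoingChatClient(ToyBaseChatClient):
    """Echoes the last message back, whole or word by word."""

    def _inner_get_response(self, *, messages: list[str], stream: bool):
        text = messages[-1] if messages else ""

        if stream:

            async def _stream() -> AsyncIterator[str]:
                for word in text.split():
                    yield word  # one update per word

            return _stream()

        async def _once() -> str:
            return text

        return _once()


async def demo() -> None:
    client = ToyEchoingChatClient()
    print(await client.get_response(["echo this back"]))
    async for update in client.get_response(["echo this back"], stream=True):
        print(update)


if __name__ == "__main__":
    asyncio.run(demo())
```

Because only _inner_get_response() is overridden, anything the base class wraps around the public get_response() applies to the custom backend without extra work in the subclass.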

Both approaches allow you to extend the framework for your specific use cases while maintaining compatibility with the broader Agent Framework ecosystem.

Understanding Raw Client Classes

The framework provides Raw...Client classes (e.g., RawOpenAIChatClient, RawOpenAIResponsesClient, RawAzureAIClient) that are intermediate implementations without middleware, telemetry, or function invocation support.

Warning: Raw Clients Should Not Normally Be Used Directly

The Raw...Client classes should not normally be used directly. They do not include the middleware, telemetry, or function invocation support that you most likely need. If you do use them, you should carefully consider which additional layers to apply.

Layer Ordering

There is a defined ordering for applying layers that you should follow:

  1. FunctionInvocationLayer - Handles the tool/function calling loop and should stay outermost
  2. ChatMiddlewareLayer - Wraps each model call in the loop and stays outside telemetry
  3. ChatTelemetryLayer - Must be inside the function calling loop so each model call gets its own telemetry span
  4. Raw...Client - The base implementation (e.g., RawOpenAIChatClient)

Example of correct layer composition:

class MyCustomClient(
    FunctionInvocationLayer[TOptions],
    ChatMiddlewareLayer[TOptions],
    ChatTelemetryLayer[TOptions],
    RawOpenAIChatClient[TOptions],  # or BaseChatClient for custom implementations
    Generic[TOptions],
):
    """Custom client with all layers correctly applied."""
    pass
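
The order of the base classes matters because Python resolves super() calls along the method resolution order (MRO), so the first base listed becomes the outermost layer. The toy sketch below shows this with plain Python classes. Every Toy... class is a hypothetical stand-in that only records a trace; none of them are the framework's real layers, and the two-round loop is an assumption made to mimic a tool-calling loop.

```python
class ToyRawClient:
    """Innermost class: stands in for a Raw...Client and performs the 'model call'."""

    def get_response(self, prompt: str, trace: list[str]) -> str:
        trace.append("raw_call")
        return f"reply to {prompt!r}"


class ToyChatTelemetryLayer:
    """Opens one 'span' per model call."""

    def get_response(self, prompt: str, trace: list[str]) -> str:
        trace.append("telemetry:span_start")
        try:
            # super() resolves to the next class in the composed class's MRO.
            return super().get_response(prompt, trace)
        finally:
            trace.append("telemetry:span_end")


class ToyChatMiddlewareLayer:
    """Wraps each model call and sits outside telemetry."""

    def get_response(self, prompt: str, trace: list[str]) -> str:
        trace.append("middleware:before")
        result = super().get_response(prompt, trace)
        trace.append("middleware:after")
        return result


class ToyFunctionInvocationLayer:
    """Outermost layer: pretends the tool-calling loop needs two model calls."""

    rounds = 2  # assumption for illustration only

    def get_response(self, prompt: str, trace: list[str]) -> str:
        trace.append("function_loop:start")
        result = ""
        for _ in range(self.rounds):
            result = super().get_response(prompt, trace)
        trace.append("function_loop:end")
        return result


class ToyClient(
    ToyFunctionInvocationLayer,
    ToyChatMiddlewareLayer,
    ToyChatTelemetryLayer,
    ToyRawClient,
):
    """Same base-class order as the composition shown above."""


if __name__ == "__main__":
    trace: list[str] = []
    ToyClient().get_response("hi", trace)
    print([cls.__name__ for cls in ToyClient.__mro__])
    print(trace)
```

Running the sketch prints the MRO followed by the trace. The middleware and telemetry entries repeat once per inner model call, while the function-loop entries appear only once around them.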

For most use cases, use the fully-featured public client classes which already have all layers correctly composed:

  • OpenAIChatClient - OpenAI Chat completions with all layers
  • OpenAIResponsesClient - OpenAI Responses API with all layers
  • AzureOpenAIChatClient - Azure OpenAI Chat with all layers
  • AzureOpenAIResponsesClient - Azure OpenAI Responses with all layers
  • AzureAIClient - Azure AI Project with all layers

These clients handle the layer composition correctly and provide the full feature set out of the box.