Evan Mattson 2a9b68d1bd Python: Fix MCPStreamableHTTPTool leaking asyncio.CancelledError when MCP server is unreachable (#5687)
* fix: wrap asyncio.CancelledError in ToolException in _connect_on_owner (#5667)

asyncio.CancelledError is a BaseException (not Exception) in Python 3.8+.
When an MCP server is unreachable, the MCP library's internal anyio task
group raises CancelledError, which escaped all three 'except Exception'
handlers in _connect_on_owner(). This propagated through
_run_lifecycle_owner -> _run_on_lifecycle_owner -> connect -> __aenter__,
bypassing user except Exception blocks entirely.

Fix: change the three except-Exception clauses in _connect_on_owner to
'except (Exception, asyncio.CancelledError)' so spurious CancelledErrors
from the MCP transport layer are caught and wrapped in ToolException,
consistent with the method's documented contract.
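
The escape path described above can be reproduced in isolation. The following is a minimal sketch, not the framework's code: `ToolException` is a stand-in class, and `unreachable_transport`, `connect_narrow`, `connect_wide`, and `classify` are hypothetical names used only for illustration.

```python
import asyncio


class ToolException(Exception):
    """Hypothetical stand-in for the framework's ToolException."""


async def unreachable_transport() -> None:
    # Simulates a transport layer that surfaces a cancellation instead of a normal error.
    raise asyncio.CancelledError("server unreachable")


async def connect_narrow() -> str:
    try:
        await unreachable_transport()
    except Exception as ex:  # never matches: CancelledError is a BaseException
        raise ToolException(f"Failed to connect: {ex}") from ex
    return "connected"


async def connect_wide() -> str:
    try:
        await unreachable_transport()
    except (Exception, asyncio.CancelledError) as ex:
        raise ToolException(f"Failed to connect: {ex}") from ex
    return "connected"


async def classify(connect) -> str:
    """Report which exception type a caller of connect() would see."""
    try:
        return await connect()
    except Exception as ex:
        return type(ex).__name__
    except BaseException as ex:
        return f"escaped: {type(ex).__name__}"


def demo() -> tuple[str, str]:
    return asyncio.run(classify(connect_narrow)), asyncio.run(classify(connect_wide))
```

With the narrow handler the caller's own `except Exception` block is skipped as well; with the widened handler the caller sees a `ToolException`.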

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(mcp): propagate genuine task CancelledError in connect() (#5667)

On Python >= 3.11, check task.cancelling() > 0 before wrapping
CancelledError as ToolException in the three except blocks inside
_connect_on_owner(). When the current task is being cancelled by its
caller, the CancelledError now propagates after cleanup, consistent
with the existing pattern at _mcp.py:560-564 and _runner.py:115-120.

On Python < 3.11 task.cancelling() is unavailable, so MCP-internal
CancelledErrors still cannot be reliably distinguished from
caller-driven cancellation; they continue to be wrapped as
ToolException with a comment documenting the trade-off.
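
One way to picture the `task.cancelling()` check is the sketch below. It assumes nothing about the actual helper in `_mcp.py`; `should_propagate_cancelled`, `internal_cancel`, and `caller_cancel` are hypothetical names.

```python
import asyncio
import sys


def should_propagate_cancelled() -> bool:
    """Return True when the current task has a pending cancel() request."""
    if sys.version_info < (3, 11):
        # Task.cancelling() is unavailable, so the two cases cannot be told apart.
        return False
    task = asyncio.current_task()
    return task is not None and task.cancelling() > 0


async def internal_cancel() -> bool:
    try:
        # Raised directly by library code; nobody called cancel() on this task.
        raise asyncio.CancelledError()
    except asyncio.CancelledError:
        return should_propagate_cancelled()


async def caller_cancel() -> bool:
    seen: list[bool] = []

    async def worker() -> None:
        try:
            await asyncio.sleep(10)
        except asyncio.CancelledError:
            seen.append(should_propagate_cancelled())
            raise

    task = asyncio.create_task(worker())
    await asyncio.sleep(0)  # let the worker start and suspend
    task.cancel()  # genuine, caller-driven cancellation
    try:
        await task
    except asyncio.CancelledError:
        pass
    return seen[0]
```

On Python 3.11 and later, `internal_cancel()` reports no pending request while `caller_cancel()` reports one; on older versions both report none, which is the trade-off noted above.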

Tests:
- Add cleanup assertion to transport-creation CancelledError test
- Add MCPStdioTool variants exercising the 'command' message branches
  for both transport-creation and initialize CancelledError paths
- Add Python 3.11+-gated tests verifying genuine task cancellation
  propagates (and still cleans up) for transport and initialize stages

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(mcp): log CancelledError with exc_info before wrapping in ToolException (#5667)

CancelledError inherits from BaseException (not Exception) on Python >= 3.8,
so the 'inner_exception=ex if isinstance(ex, Exception) else None' guard
always yields None for CancelledError. This means ToolException.__init__
calls logger.log(level, message, exc_info=None), dropping the traceback.

Add an explicit logger.debug(error_msg, exc_info=ex) before each
raise ToolException(...) in the three CancelledError handlers so the
full traceback is preserved in debug logs when MCP-internal cancellation
is wrapped rather than propagated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5667: Python: [Bug]: Error Handling Issue regarding Python MCPStreamableHTTPTool Class

* refactor(_mcp): extract cancellation helper, fix session error msg and exc_info

- Extract _should_propagate_cancelled_error() helper to eliminate duplicated
  genuine-cancellation detection logic across the three connect() except blocks
- Fix session-creation ToolException message to include exception details
  (e.g. 'Failed to create MCP session: <ex>') matching the transport and
  initialize failure paths
- Change exc_info=ex to exc_info=True in all three logger.debug() calls
  for idiomatic logging
- Add tests for _should_propagate_cancelled_error helper
- Add regression test asserting session error message includes exception text
- Add test verifying logger.debug is called with exc_info=True

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: factor out _close_and_check_cancelled helper in _connect_on_owner

Addresses review comment on PR #5687:

1. Add _close_and_check_cancelled() helper method that combines
   _safe_close_exit_stack() + _should_propagate_cancelled_error() into a
   single await-able call. This eliminates the duplicated close-then-check
   pattern that appeared identically in all three connect phases (transport,
   session, initialize), reducing future drift risk.

2. Comments 2 and 3 (missing {ex} in session error message and non-idiomatic
   exc_info=ex) were already addressed in the current code: all error messages
   include {ex} and all logger.debug calls use exc_info=True.

3. Add test_connect_genuine_cancellation_during_session_creation_propagates
   to cover the previously untested genuine-cancellation path in the
   session-creation phase (transport and initialize phases already had tests).
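
The close-then-check pattern from item 1 might look roughly like the sketch below. The helper names are taken from the commit message, but their signatures and bodies here are assumptions; `ConnectSketch`, `run_phase`, and the stand-in `ToolException` are hypothetical.

```python
import asyncio
import sys
from contextlib import AsyncExitStack


class ToolException(Exception):
    """Hypothetical stand-in for the framework's ToolException."""


class ConnectSketch:
    def __init__(self) -> None:
        self.closed = False
        self._exit_stack = AsyncExitStack()
        self._exit_stack.callback(self._mark_closed)

    def _mark_closed(self) -> None:
        self.closed = True

    async def _safe_close_exit_stack(self) -> None:
        try:
            await self._exit_stack.aclose()
        except Exception:
            pass  # cleanup must not mask the original failure

    @staticmethod
    def _should_propagate_cancelled_error(ex: BaseException) -> bool:
        if not isinstance(ex, asyncio.CancelledError) or sys.version_info < (3, 11):
            return False
        task = asyncio.current_task()
        return task is not None and task.cancelling() > 0

    async def _close_and_check_cancelled(self, ex: BaseException) -> bool:
        # One awaitable call replaces the repeated close-then-check pair.
        await self._safe_close_exit_stack()
        return self._should_propagate_cancelled_error(ex)

    async def run_phase(self, phase: str, action):
        try:
            return await action()
        except (Exception, asyncio.CancelledError) as ex:
            if await self._close_and_check_cancelled(ex):
                raise  # genuine cancellation propagates after cleanup
            raise ToolException(f"Failed during {phase}: {ex}") from ex
```

Each connect phase (transport, session, initialize) would then share the same `except` body, which is what reduces the drift risk.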

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5667: review comment fixes

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2a9b68d1bd · 2026-05-07 17:58:30 +00:00
2,031 Commits

Microsoft Agent Framework

Welcome to Microsoft Agent Framework!

Microsoft Agent Framework (MAF) is an open, multi-language framework for building production-grade AI agents and multi-agent workflows in .NET and Python.

Microsoft Agent Framework is built for teams taking agents from prototype to production. It provides a consistent foundation for building, orchestrating, and operating agent systems across Python and .NET, while keeping architecture choices open as requirements evolve. It supports a broad ecosystem including Microsoft Foundry, Azure OpenAI, OpenAI, and the GitHub Copilot SDK, with samples and hosting patterns for both local development and cloud deployment.

Watch the full Agent Framework introduction (30 min)

Is this the right framework for you?

MAF is a strong fit if you:

  • are building agents and workflows you expect to run in production,
  • need orchestration beyond a single prompt or stateless chat loop,
  • want graph-based patterns such as sequential, concurrent, handoff, and group collaboration,
  • care about durability, restartability, observability, governance, or human-in-the-loop control,
  • need provider flexibility so your architecture can evolve without major rewrites.

Key Features

Explore new MAF capabilities and real implementation patterns on the official blog.

  • Python and C#/.NET Support: Full framework support for both Python and C#/.NET implementations with consistent APIs
  • Multiple Agent Provider Support: Support for various LLM providers with more being added continuously
  • Middleware: Flexible middleware system for request/response processing, exception handling, and custom pipelines
  • Orchestration Patterns & Workflows: Build multi-agent systems with graph-based workflows supporting sequential, concurrent, handoff, and group collaboration patterns; includes checkpointing, streaming, human-in-the-loop, and time-travel
  • Foundry Hosted Agents (new): Deploy and host your agents to Foundry-hosted infrastructure with just 2 additional lines of code
  • Observability: Built-in OpenTelemetry integration for distributed tracing, monitoring, and debugging
  • Declarative Agents: Define agents using YAML for faster setup and versioning
  • Agent Skills: Build domain-specific knowledge bases from multiple sources—files, inline code, class libraries—for agents to discover and use
  • AF Labs: Experimental packages for cutting-edge features including benchmarking, reinforcement learning, and research initiatives
  • DevUI: Interactive developer UI for agent development, testing, and debugging workflows

Getting Started

Installation

Python

pip install agent-framework
# This will install all sub-packages, see `python/packages` for individual packages.
# It may take a minute on first install on Windows.

.NET

dotnet add package Microsoft.Agents.AI
# For Foundry integration (used in the .NET quickstart below):
dotnet add package Microsoft.Agents.AI.Foundry
dotnet add package Azure.AI.Projects
dotnet add package Azure.Identity

Learning Resources

Quickstart

Basic Agent - Python

Create a simple Azure Responses Agent that writes a haiku about the Microsoft Agent Framework

# pip install agent-framework
# Use `az login` to authenticate with Azure CLI
import os
import asyncio
from agent_framework import Agent
from agent_framework.foundry import FoundryChatClient
from azure.identity import AzureCliCredential


async def main():
    # Initialize a chat agent with Microsoft Foundry
    # the endpoint, deployment name, and api version can be set via environment variables
    # or they can be passed in directly to the FoundryChatClient constructor
    agent = Agent(
        client=FoundryChatClient(
            credential=AzureCliCredential(),
            # project_endpoint=os.environ["FOUNDRY_PROJECT_ENDPOINT"],
            # model=os.environ["FOUNDRY_MODEL_DEPLOYMENT_NAME"],
        ),
        name="HaikuAgent",
        instructions="You are an upbeat assistant that writes beautifully.",
    )

    print(await agent.run("Write a haiku about Microsoft Agent Framework."))

if __name__ == "__main__":
    asyncio.run(main())

Basic Agent - .NET

Create a simple agent, using Microsoft Foundry, that writes a haiku about the Microsoft Agent Framework

// This sample shows how to create and run a basic agent with AIProjectClient.AsAIAgent(...).

using Azure.AI.Projects;
using Azure.Identity;
using Microsoft.Agents.AI;

string endpoint = Environment.GetEnvironmentVariable("AZURE_AI_PROJECT_ENDPOINT") ?? throw new InvalidOperationException("AZURE_AI_PROJECT_ENDPOINT is not set.");
string deploymentName = Environment.GetEnvironmentVariable("AZURE_AI_MODEL_DEPLOYMENT_NAME") ?? "gpt-5.4-mini";

AIAgent agent =
    new AIProjectClient(new Uri(endpoint), new DefaultAzureCredential())
    .AsAIAgent(model: deploymentName, instructions: "You are an upbeat assistant that writes beautifully.", name: "HaikuAgent");

// Once you have the agent, you can invoke it like any other AIAgent.
Console.WriteLine(await agent.RunAsync("Write a haiku about Microsoft Agent Framework."));

More Examples & Samples

Python

  • Getting Started: progressive tutorial from hello-world to hosting
  • Agent Concepts: deep-dive samples by topic (tools, middleware, providers, etc.)
  • Workflows: workflow creation and integration with agents
  • Hosting: A2A, Azure Functions, Durable Task hosting
  • End-to-End: full applications, evaluation, and demos

.NET

Community & Feedback

  • Found a bug? File a GitHub issue to help us improve.
  • Enjoying MAF? Star the repository on GitHub to show your support and help others discover the project.
  • Have questions? Join our Discord or visit weekly office hours.

Troubleshooting

Authentication

| Problem | Cause | Fix |
| --- | --- | --- |
| Authentication errors when using Azure credentials | Not signed in to Azure CLI | Run az login before starting your app |
| API key errors | Wrong or missing API key | Verify the key and ensure it's for the correct resource/provider |

Tip: DefaultAzureCredential is convenient for development, but in production consider using a specific credential (e.g., ManagedIdentityCredential) to avoid latency issues, unintended credential probing, and potential security risks from fallback mechanisms.
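
For Python apps, one way to act on this tip is to choose the credential explicitly per environment. The sketch below is illustrative only: the `APP_ENV` variable and both function names are hypothetical, and it assumes the `azure-identity` package is installed when `build_credential` is called.

```python
import os


def pick_credential_kind(env=None) -> str:
    """Decide which credential to use. APP_ENV is a hypothetical variable name."""
    env = os.environ if env is None else env
    return "managed_identity" if env.get("APP_ENV") == "production" else "azure_cli"


def build_credential(kind: str):
    # Imported lazily so the selection logic can run without azure-identity installed.
    from azure.identity import AzureCliCredential, ManagedIdentityCredential

    if kind == "managed_identity":
        return ManagedIdentityCredential()
    return AzureCliCredential()
```

The result of `build_credential(pick_credential_kind())` could then be passed as the `credential` argument shown in the Python quickstart.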

Environment Variables

For environment variable configuration specific to each sample, refer to the README in the sample directory (Python samples | .NET samples).
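
As a rough illustration of failing fast on missing configuration, the sketch below reads the two variable names that appear in the Python quickstart comments. `FoundrySettings`, `load_settings`, and the `my-deployment` default are hypothetical and not part of the framework.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FoundrySettings:
    project_endpoint: str
    model: str


def load_settings(env=None, default_model: str = "my-deployment") -> FoundrySettings:
    """Read quickstart settings; default_model is a placeholder, not a real deployment."""
    env = os.environ if env is None else env
    endpoint = env.get("FOUNDRY_PROJECT_ENDPOINT")
    if not endpoint:
        # Fail early with a clear message instead of a later authentication error.
        raise RuntimeError("FOUNDRY_PROJECT_ENDPOINT is not set.")
    return FoundrySettings(
        project_endpoint=endpoint,
        model=env.get("FOUNDRY_MODEL_DEPLOYMENT_NAME", default_model),
    )
```

The loaded values correspond to the commented-out `project_endpoint` and `model` arguments in the Python quickstart.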

Contributor Resources

Important Notes

Important

If you use Microsoft Agent Framework to build applications that operate with any third-party servers, agents, code, or non-Azure Direct models (“Third-Party Systems”), you do so at your own risk. Third-Party Systems are Non-Microsoft Products under the Microsoft Product Terms and are governed by their own third-party license terms. You are responsible for any usage and associated costs.

We recommend reviewing all data being shared with and received from Third-Party Systems and being cognizant of third-party practices for handling, sharing, retention and location of data. It is your responsibility to manage whether your data will flow outside of your organization's Azure compliance and geographic boundaries and any related implications, and that appropriate permissions, boundaries and approvals are provisioned.

You are responsible for carefully reviewing and testing applications you build using Microsoft Agent Framework in the context of your specific use cases, and making all appropriate decisions and customizations. This includes implementing your own responsible AI mitigations such as metaprompt, content filters, or other safety systems, and ensuring your applications meet appropriate quality, reliability, security, and trustworthiness standards. See also: Transparency FAQ

Languages
Python 50.9%
C# 45.8%
TypeScript 2.7%
HTML 0.2%
PowerShell 0.1%
Other 0.1%