agent-framework/docs/decisions/0009-support-long-running-operations.md
Laveesh Rohra cd77193742 Python: Merge main into feature-durabletask-python branch (#3261)
* Python: Add factory pattern to concurrent orchestration builder (#2738)

* Add factory pattern to concurrent orchestration builder

* Update readme

* Address AI comments

* Fix unit tests

* Fix import

* Prevent multiple calls to set participants or factories

* Add comments

* Mitigate warnings

* Fix mypy

* Address comments

* Address Copilot comments

* Fix tests

* Python: fix: GroupChat ManagerSelectionResponse JSON Schema for OpenAI Structured Outpu… (#2750)

* fix: ManagerSelectionResponse JSON Schema for OpenAI Structured Output Strict Mode

* refactor: install pre-commit then commit again

* Capture file IDs from code interpreter in streaming responses (#2741)

* .NET: [BREAKING] Prevent nulls in AIAgent property (#2719)

* prevent nulls in AIAgent property

* address feedback

* code ql sm04598 (#2723)

Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>

* .NET: Add Conversation State Sample (Step05) (#2697)

* Initial plan

* Add Agent_OpenAI_Step05_Conversation sample for conversation state management

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>

* Update Program.cs comment to accurately describe the sample

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>

* Update the code to use the ConversationClient more in line with the samples in OpenAI

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Changing sample to use ChatClientAgent and conversationId in GetNewThread

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Bump AWSSDK.Extensions.Bedrock.MEAI from 4.0.4.7 to 4.0.4.11 (#2777)

---
updated-dependencies:
- dependency-name: AWSSDK.Extensions.Bedrock.MEAI
  dependency-version: 4.0.4.11
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump Azure.Identity from 1.17.0 to 1.17.1 (#2780)

---
updated-dependencies:
- dependency-name: Azure.Identity
  dependency-version: 1.17.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump Azure.AI.AgentServer.AgentFramework from 1.0.0-beta.4 to 1.0.0-beta.5 (#2778)

---
updated-dependencies:
- dependency-name: Azure.AI.AgentServer.AgentFramework
  dependency-version: 1.0.0-beta.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Python: added more complete parsing for mcp tool arguments (#2756)

* added more complete parsing for mcp tool arguments

* fixed mypy

* added nonlocal model counter, and some fixes

* fixes in naming logic

* extracted json parsing function, added parametrized test and checked coverage

* Python: Updated package versions (#2784)

* Updated package versions

* Small fix

* Bump actions/checkout from 5 to 6 (#2404)

Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* .NET: adds support for labels in edges, fixes rendering of labels in dot a… (#1507)

* adds support for labels in edges, fixes rendering of labels in dot and mermaid, adds rendering of labels in edges

* Update dotnet/src/Microsoft.Agents.AI.Workflows/Visualization/WorkflowVisualizer.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* escaping edge labels, adding tests for labels containing strange characters that would break the diagram and enabling the previous signature so the API has backwards compatibility.

* Unify label in EdgeData

* Edge API adjustments, removed useless "sanitizer"

* fixed test

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Jacob Alber <jaalber@microsoft.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* Python: Added custom args and thread object to ai_function kwargs (#2769)

* Added an example of using kwargs in ai_function

* Added thread object to ai_function kwargs

* Updated docs

* Small fix

* Added thread parameter filtering

* Fix WorkflowAgent to include thread convo history. Enable checkpointing. (#2774)

* Update OpenAIResponses.yaml to match AgentSchema (#2598)

1. Update `connection` child types --  `kind: ApiKey` to `kind: key` otherwise schema will fail: https://microsoft.github.io/AgentSchema/reference/apikeyconnection/

2.  Update `outputSchema`'s `PropertySchema` to be `kind` instead of `type` otherwise schema will fail: https://microsoft.github.io/AgentSchema/reference/propertyschema/

* Python: Remove warnings from workflow builder on not using factories (#2808)

* Revert concurrent

* Fix comments

* Python: Filter framework kwargs from MCP tool invocations (#2870)

* Filter framework kwargs from MCP tool invocations

* Fixes

* Python: Fix WorkflowAgent to emit yield_output as agent response (#2866)

* Fix WorkflowAgent to emit yield_output as agent response

* use raw_representation

* Raw representation handling

* Python: Use agent description in HandoffBuilder auto-generated tools (#2713) (#2714)

## Summary
Enhanced `HandoffBuilder._apply_auto_tools` to use the target agent's
description when creating handoff tools, providing more informative tool
descriptions for LLMs.

## Changes
- Modified `_apply_auto_tools` to extract `description` from
  `AgentExecutor._agent` when available
- Updated iteration to use `.items()` for more efficient dict traversal
- Handoff tools now use agent descriptions instead of generic placeholders

## Example
Before: "Handoff to the refund_agent agent."
After: "You handle refund requests. Ask for order details and process refunds."
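
The fallback behavior described above can be sketched as follows. This is a minimal illustration, not the actual `_apply_auto_tools` code: the helper name `handoff_tool_description` is hypothetical, and the generic placeholder text is taken from the "Before" example.

```python
from typing import Optional


def handoff_tool_description(agent_name: str, description: Optional[str]) -> str:
    """Hypothetical helper: prefer the agent's own description when it has one."""
    if description:
        # An agent description gives the LLM more context than a generic label.
        return description
    # Otherwise fall back to the generic placeholder.
    return f"Handoff to the {agent_name} agent."
```

With a description set, the tool text is the description itself; without one, the generic "Handoff to the ... agent." text is kept.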

## Testing
- All handoff tests pass (20/20)
- No breaking changes to existing API

Fixes #2713

Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

* Python: [BREAKING] Observability updates (#2782)

* fixes Python: Add env_file_path parameter to setup_observability() similar to AzureOpenAIChatClient
Fixes #2186

* WIP on updates using configure_azure_monitor

* improved setup and clarity

* fixed root .env.example

* revert changes

* updated files

* updated sample

* updated zero code

* test fixes and fixed links

* fix devui

* removed planning docs

* added enable method and updated readme and samples

* clarified docstring

* add return annotation

* updated naming

* update capitalized version

* updated readme and some fixes

* updated decorator name inline with the rest

* feedback from comments addressed

* Python: Fix middleware terminate flag to exit function calling loop immediately (#2868)

* Fix middleware terminate flag to exit function calling loop immediately

* Eliminating duck typing

* Improve function exec result handling

* Fix race condition

* Fix mypy issues

* Python: Fix context duplication in handoff workflows when restoring from checkpoint (#2867)

* Fix context duplication in handoff workflows when restoring from checkpoint

* Address Copilot PR review

* .NET: Update to latest Azure.AI.*, OpenAI, and M.E.AI* (#2850)

* Update to latest Azure.AI.*, OpenAI, and M.E.AI*

Absorb breaking changes in Responses surface area

* Update dotnet/samples/AgentWebChat/AgentWebChat.AgentHost/Utilities/ChatClientExtensions.cs

* Update dotnet/samples/AgentWebChat/AgentWebChat.AgentHost/Utilities/ChatClientExtensions.cs

* Update dotnet/samples/AgentWebChat/AgentWebChat.AgentHost/Utilities/ChatClientExtensions.cs

* Update dotnet/samples/GettingStarted/AgentWithOpenAI/Agent_OpenAI_Step04_CreateFromOpenAIResponseClient/Program.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Using patch to remove the model is necessary, updated the response client to actually use the ForAgent

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Roger Barreto <19890735+rogerbarreto@users.noreply.github.com>

* Bump actions/download-artifact from 6 to 7 (#2862)

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 6 to 7.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v6...v7)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump actions/cache from 4 to 5 (#2861)

Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump actions/upload-artifact from 5 to 6 (#2860)

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 5 to 6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Python : Ollama Connector for Agent Framework (#1104)

* Initial Commit for Olama Connector

* Added Olama Sample

* Add Sample & Fixed Open Telemetry

* Fixed Spelling from Olama to Ollama

* remove "opentelemetry-semantic-conventions-ai ~=0.4.13" since it's handled in a different pr

* Added Tool Calling

* Finalizing test cases

* Adjust samples to be more reliable

* Update python/packages/ollama/agent_framework_ollama/_chat_client.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/packages/ollama/pyproject.toml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/packages/ollama/tests/test_ollama_chat_client.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/packages/ollama/agent_framework_ollama/_chat_client.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Improved Docstrings & Sample

* Update python/packages/ollama/agent_framework_ollama/_chat_client.py

Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

* Integrate PR Feedback
- Divided Streaming and Non-Streaming into independent Methods
- Catch Ollama Validation Error
- Add OTEL Provider Name
- Checked Ollama Messages
- Add Usage Statistics

* Revert setting, so it can be none

* Validate Message formatting between AF and Ollama

* Catch Ollama Error and raise a ServiceResponse Error

* Fix mypy error

* remove .vscode comma

* Add Reasoning support & adjust to new structure

* Add Ollama Multimodality and Reasoning

* Add test cases for reasoning

* Add Tests for Error Handling in Ollama Client

* Update python/samples/getting_started/multimodal_input/ollama_chat_multimodal.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Integrated Copilot Feedback

* Implement first PR Feedback

* Adjust Readme files for examples

* Adjust argument passing via additional chat options

* Implemented PR Feedback

* Removing Ollama Package from Core and moving samples

* Fix Link & Adding Samples to Main Sample Readme

* Fixing Links in Readme

* Moved Multimodal and Chat Example

* Fixed Link in ChatClient to Ollama

* Fix AgentFramework Links in Ollama Project

* Fix observability breaking change

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>

* Skip failing IT (#2904)

* .NET: Cosmos DB UT Fast Skip (For Non-Configured Local envs) (#2906)

* Cosmos DB UT Fast Skip (Non-Configured Local envs) + Long running UT skip in pipeline when no CosmosDB changes happened

* Force a CosmosDB source code change to trigger the pipeline

* Address possible string boolean mismatch

* Add debug

* Enabling emulator always when running IT

* .NET: Add TTLs to durable agent sessions (#2679)

* .NET: Add TTLs to durable agent sessions

* Remove unnecessary async

* PR feedback: clarify UTC

* PR feedback: limit minimum signal delay to <= 5 minutes

* PR feedback: Fix TTL disablement

* Linter: use auto-property

* Fix build break from OpenAI SDK change

* Updated CHANGELOG.md

* PR feedback

* Reduce default TTL to 14 days to work around DTS bug

* Python: Update Mem0Provider to use v2 search API `filters` parameter (#2766)

* short fix to move id parameters to filters object

* added tests

* small fix

* mem0 dependency update

* Updated package versions (#2913)

* .NET: Switch to new "Run" method name. (#2843)

* Switch to new "RunAgent" method name.

* Try to disable false positive naming warning.

* Add comment about disabled warnings.

* Rename `RunAgent` to just `Run`.

* Update CHANGELOG.

* Python: Switch to new "run" method name. (#2890)

* Switch to `run` method.

* Add support for deprecated `run_agent`.

* Fix entity method name.

* Fix method name and improve tests.

* Update comment.

* Update Python CHANGELOG.

* [BREAKING] Python: Add factory pattern to handoff orchestration builder (#2844)

* WIP: Factory pattern to handoff

* Add factory pattern to concurrent orchestration builder; Next: tests and sample verification

* Add tests and improve comments

* Fix mypy

* Simplify handoff_simple.py

* Simplify handoff_autonoumous.py and bug fix

* Update readme

* Address Copilot comments

* Python: Flow custom kwargs to agents via Workflow SharedState (#2894)

* Flow custom kwargs to agents via SharedState

* Address Copilot feedback

* Improve sample typing

* Fix test

* Fix Pydantic error when using Literal type for tool params (#2893)

* Updated Ollama package version (#2920)

* Python: Azure AI Agent with Bing Grounding Citations Sample (#2892)

* bing grounding sample with citations

* small fix

* fix

* .NET: Make DelegatingAIAgent abstract (#2797)

* Initial plan

* Make DelegatingAIAgent abstract

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Added additional arguments for Azure AI agent (#2922)

* Python: Correction of MCP image type conversion in _mcp.py (#2901)

* Correction of MCP image type conversion in _mcp.py

* Added a new overload to the init function of the DataContent() type of the Agent Framework, edited the test case to correctly test the usage of the data and uri fields while using DataContent()

* Fixed tests related to the changes of the DataContent type, added testing for both string and byte representations

* Pass kwargs into subworkflows (#2923)

* Python: Move ollama samples to samples getting started dir (#2921)

* Move ollama samples to samples getting started dir

* Address feedback

* Python: fix: correct BadRequestError when using Pydantic model in response_fo… (#1843)

* fix: correct BadRequestError when using Pydantic model in response_format

* Fix lint

---------

Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>

* .NET: [Breaking] Delete display name property (#2758)

* delete the AIAgent.DisplayName property

* use agent name as a first value for activity display name

* Update dotnet/src/Microsoft.Agents.AI.Workflows/Specialized/HandoffAgentExecutor.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Python: cleanup and refactoring of chat clients (#2937)

* refactoring and unifying naming schemes of internal methods of chat clients

* set tool_choice to auto

* fix for mypy

* added note on naming and fix #2951

* fix responses

* fixes in azure ai agents client

* Python: Workflow add option to visualize internal executors (#2917)

* Workflow add option to visualize internal executors

* Address Copilot comments

* Python: Fixes Run ID and Thread ID casing to align with AG-UI Typescript SDK (#2948)

* added camelCase input to run id and thread id aligning with @ag-ui/core

* fixed per copilot suggestions

* Python: Add workflow cancellation sample (#2732)

* Add workflow cancellation sample

Add sample demonstrating how to cancel a running workflow using asyncio
tasks. Shows both cancellation mid-execution and normal completion paths.
Useful for implementing timeouts, graceful shutdown, or A2A executors.
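
A minimal sketch of the cancellation pattern this sample describes, using only `asyncio`. The names `long_running_workflow` and `run_with_cancel` are hypothetical stand-ins, and a real workflow run would replace the sleep loop.

```python
import asyncio
from typing import Optional


async def long_running_workflow(steps: int) -> str:
    """Hypothetical stand-in for a workflow run; each step yields to the event loop."""
    for _ in range(steps):
        await asyncio.sleep(0.01)
    return "completed"


async def run_with_cancel(steps: int, cancel_after: Optional[float]) -> str:
    """Run the workflow as a task and optionally cancel it mid-execution."""
    task = asyncio.create_task(long_running_workflow(steps))
    if cancel_after is not None:
        await asyncio.sleep(cancel_after)
        task.cancel()  # Requests cancellation at the task's next await point.
    try:
        return await task
    except asyncio.CancelledError:
        # The task was cancelled before finishing; report that instead of raising.
        return "cancelled"
```

Calling `asyncio.run(run_with_cancel(1000, 0.02))` exercises the cancellation path, while passing `None` for `cancel_after` lets the task finish normally.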

* update docstring

* .NET: Update Anthropic package to version 12.0.0 (#2914)

* Initial plan

* Update Anthropic package to version 12.0.0

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>

* Python: Add Azure Managed Redis Support with Credential Provider (#2887)

* azure redis support

* small fixes

* azure managed redis sample

* fixes

* Bump CommunityToolkit.Aspire.OllamaSharp from 13.0.0-beta.440 to 13.0.0 (#2856)

---
updated-dependencies:
- dependency-name: CommunityToolkit.Aspire.OllamaSharp
  dependency-version: 13.0.0
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump AWSSDK.Extensions.Bedrock.MEAI from 4.0.4.11 to 4.0.5 (#2853)

---
updated-dependencies:
- dependency-name: AWSSDK.Extensions.Bedrock.MEAI
  dependency-version: 4.0.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>

* Bump Azure.AI.AgentServer.AgentFramework from 1.0.0-beta.4 to 1.0.0-beta.5 (#2854)

---
updated-dependencies:
- dependency-name: Azure.AI.AgentServer.AgentFramework
  dependency-version: 1.0.0-beta.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* Python: Fix WorkflowAgent event handling and kwargs forwarding (#2946)

* Fix kwargs propagation through workflow.as_agent()

* Fix WorkflowAgent to respect AgentExecutor output_response setting

* .NET: Use GrpcEntityRunner instead of TaskEntityDispatcher (#2759)

* Use GrpcEntityRunner instead of TaskEntityDispatcher

* Pin to Durable worker 1.11.0

* Set the invocation result

* Update all Durable packages

* Update changelog, rename dispatcher to encondedEntityRequest

* Python: Bump Py version to 1.0.0b251218 for a release. Update CHANGELOG (#2968)

* Bump Py version to 1.0.0b251218 for a release. Update CHANGELOG

* update lock

* Fix formatting

* Fix ChatKit typing

* Python: Introducing Foundry Local Chat Clients (#2915)

* redo foundry local chat client

* fix mypy and spelling

* better docstring, updated sample

* fixed tests and added tests

* small sample update

* Updated package versions (#2978)

* Python: Added GitHub MCP sample with PAT (#2967)

* added github mcp sample with PAT

* addressed copilot fixes

* env fix

* Python: Preserve reasoning blocks with OpenRouter (#2950)

* Preserve reasoning blocks with OpenRouter

* Put encrypted reasoning in TextReasoningContent

* Remove unnecessary change

* Fix docs

* Support streaming

* Fix handling None in TextReasoningContent.text

* Python: Added response.created and response.in_progress event process to OpenAIBaseResponseClient (#2975)

* added response.created and response.in_progress to include response.id

* better doc string

* added tests for the new streaming event types

* Python: Introducing support for Bedrock-hosted models (Anthropic, Cohere, etc.) (#2610)

* Pushing the bedrock related changes to the new branch after addressing the review comments

* 2524 Addressed the second round review comments

* 2524 Addressed few more minor comments on the PR

* resolving the merge conflict

* 2524 resolved the uv.lock conflicts

* 2524 addressed more comments

* 2524 removed the print statement to fix the checks failure

* 2524 resolved the CI failure issues

* 2524 fixing the CI breaks

* 2524 Addressed the review comment

* 2524 resolved conflict

---------

Co-authored-by: Sunil Dutta <sunil.dutta@penske.com>
Co-authored-by: budgetboardingai <apurva.sharma31@gmail.com>

* .NET: [Durable Agents] Reliable streaming sample (#2942)

* .NET: [Durable Agents] Reliable streaming sample

* Add automated validation for new sample

* Address Copilot PR feedback

* Fix typo in README.md about agent definitions (#2634)

* Fix typo in README.md about agent definitions

* Update agent-samples/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Python: latency improvements (#3014)

* latency improvements

* fixed mypy, added coding standards and instructions

* slight logic improvement

* Python: Updated package versions (#3024)

* Updated package versions

* Updated changelog

* Python: add powerfx safe mode (#3028)

* add powerfx safe mode

* improved docstring and aligned env_file loading

* ensured test uses reset

* .NET: [Breaking] Introduce RunCoreAsync/RunCoreStreamingAsync delegation pattern in AIAgent (#2749)

* Initial plan

* Refactor AIAgent: Make RunAsync and RunStreamingAsync non-abstract, add RunCoreAsync and RunCoreStreamingAsync

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Fix infinite recursion in test implementations

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Make RunAsync and RunStreamingAsync non-virtual as requested

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Fix DelegatingAIAgent subclasses to use RunCoreAsync/RunCoreStreamingAsync

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Fix XML documentation references in AnonymousDelegatingAIAgent

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Restore <see cref> tags with proper qualified signatures in AnonymousDelegatingAIAgent

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Rollback unnecessary XML documentation changes in AnonymousDelegatingAIAgent

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Remove pragma and update crefs to RunCoreAsync/RunCoreStreamingAsync

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Fix EntityAgentWrapper to call base.RunCoreAsync/RunCoreStreamingAsync

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* fix compilation issues

* fix compilation issue

* fix tests

* fix unit tests

* fix unit test

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <sergemenshikh@gmail.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* add issue template and additional labeling (#3006)

* fix and extra int test (#3037)

* .NET: [BREAKING] Refactor ChatMessageStore methods to be similar to AIContextProvider and add filtering support (#2604)

* Refactor ChatMessageStore methods to be similar to AIContextProvider

* Fix file encoding

* Ensure that AIContextProvider messages are also persisted.

* Update formatting and seal context classes

* Improve formatting

* Remove optional messages from constructor and add unit test

* Add ChatMessageStore filtering via a decorator

* Update sample and cosmos message store to store AIContextProvider messages in right order. Fix unit tests.

* Update Workflowmessage store to use aicontext provider messages.

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Improve xml docs messaging

* Address code review comments.

* Also notify message store on failure

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* [BREAKING] Remove unused AgentThreadMetadata (#3067)

* Remove unused AgentThreadMetadata

* Update DurableTask Changelog

* Python: Fix AzureAIClient failure when conversation history contains assistant messages (#3076)

* Fix AzureAIClient failure when conversation history contains assistant messages

* Address PR review feedback: improve docstring and test assertions

* Remove redundant cast

* Fix: Update OTLP exporter protocol conditions (#3070)

* Python: Fix ExecutorInvokedEvent and ExecutorCompletedEvent observability data (#3090)

* Fix ExecutorInvokedEvent.data mutation bug

* Fix bug related to not yielding output type

* .NET: Seal ChatClientAgentThread (#2842)

* Initial plan

* Seal ChatClientAgentThread class

Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* Fix broken strands urls. (#3102)

* Fix broken strands urls.

* Fix typos

* .NET: Fix message ordering inconsistency when using AIContextProvider (#2659)

* Initial plan

* Fix message ordering inconsistency when using AIContextProvider

Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>

* Revert to original message ordering: Input, AIContextProvider, Response

Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>

* Reorder messages to ChatClient to match MessageStore order: Existing, Input, AIContextProvider

Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>

* Remove redundant test methods as existing tests already verify the behavior

Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>
Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* fix: tool_choice parameter not being honored when passed to agent.run() (#3095)

* sharepoint sample fix (#3108)

* Bump versions to 1.0.0b260106 for a release. Update CHANGELOG.md (#3109)

* Bump Bedrock version to latest (#3110)

* Python: Fix MCP tool result serialization for list[TextContent] (#2523)

* Fix MCP tool result serialization for list[TextContent]

When MCP tools return results containing list[TextContent], they were
incorrectly serialized to object repr strings like:
'[<agent_framework._types.TextContent object at 0x...>]'

This fix properly extracts text content from list items by:
1. Checking if items have a 'text' attribute (TextContent)
2. Using model_dump() for items that support it
3. Falling back to str() for other types
4. Joining single items as plain text, multiple items as JSON array
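
The four steps above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `FakeTextContent` and `serialize_items` are hypothetical names, and the empty-list case follows the later review feedback in this same PR (return an empty string).

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass
class FakeTextContent:
    """Hypothetical stand-in for an MCP TextContent item."""

    text: str


def serialize_items(items: List[Any]) -> str:
    """Turn a list of tool-result items into a string without object reprs."""
    if not items:
        return ""
    parts: List[Any] = []
    for item in items:
        text = getattr(item, "text", None)
        if isinstance(text, str):
            parts.append(text)  # Step 1: TextContent-like items contribute their text.
        elif hasattr(item, "model_dump"):
            parts.append(item.model_dump())  # Step 2: Pydantic-style objects.
        else:
            parts.append(str(item))  # Step 3: fall back to str().
    # Step 4: a single text item stays plain text, anything else becomes a JSON array.
    if len(parts) == 1 and isinstance(parts[0], str):
        return parts[0]
    return json.dumps(parts)
```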

Fixes #2509

* Address PR review feedback for MCP tool result serialization

- Extract serialize_content_result() to shared _utils.py
- Fix logic: use texts[0] instead of join for single item
- Add type annotation: texts: list[str] = []
- Return empty string for empty list instead of '[]'
- Move import json to file top level
- Add comprehensive unit tests for serialization

* Address PR review feedback: fix type checking and double serialization

- Add isinstance(item.text, str) check to ensure text attribute is a string
- Fix double-serialization issue by keeping model_dump results as dicts
  until final json.dumps (removes escaped JSON strings in arrays)
- Improve docstring with detailed return value documentation
- Add test for non-string text attribute handling
- Add tests for list type tool results in _events.py path
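
The double-serialization issue mentioned above can be illustrated with the standard `json` module alone. This is a hedged sketch with made-up data, not the PR's code: dumping an item to a JSON string before the final `json.dumps` produces an array of escaped strings, whereas keeping it as a dict produces an array of objects.

```python
import json

# Made-up item, standing in for the result of model_dump() on a content object.
item = {"type": "text", "text": "hi"}

# Dumping early, then dumping the list again, nests JSON inside a JSON string.
double_serialized = json.dumps([json.dumps(item)])

# Keeping the dict until the final dump yields a proper array of objects.
single_serialized = json.dumps([item])

# Decoding shows the difference: a string in one case, a dict in the other.
first_double = json.loads(double_serialized)[0]
first_single = json.loads(single_serialized)[0]
```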

* Simplify PR: minimal changes to fix MCP tool result serialization

Addresses reviewer feedback about excessive refactoring:
- Reset _events.py to original structure
- Only add import and use serialize_content_result in one location
- All review comments addressed in serialize_content_result():
  - Added isinstance(item.text, str) check
  - Use model_dump(mode="json") to avoid double-serialization
  - Improved docstring with explicit return value documentation
  - Empty list returns "" instead of "[]"

* Refactor: Move MCP TextContent serialization to core prepare_function_call_results

Per reviewer feedback, moved the TextContent serialization logic from
ag-ui's serialize_content_result to the core package's
prepare_function_call_results function.

Changes:
- Added handling for objects with 'text' attribute (like MCP TextContent)
  in _prepare_function_call_results_as_dumpable
- Removed serialize_content_result from ag-ui/_utils.py
- Updated _events.py and _message_adapters.py to use
  prepare_function_call_results from core package
- Updated tests to match the core function's behavior

* Fix failing tests for prepare_function_call_results behavior

- test_tool_result_with_none: Update expected value to 'null' (JSON serialization of None)
- test_tool_result_with_model_dump_objects: Use Pydantic BaseModel instead of plain class

* Fix B903 linter error: Convert MockTextContent to dataclass

The ruff linter was reporting B903 (class could be dataclass or namedtuple)
for the MockTextContent test helper classes. This commit converts them to
dataclasses to satisfy the linter check.

* Python: Improve DevUI, add Context Inspector view as new tab under traces (#2742)

* Improve DevUI, add Context Inspector view as new tab under traces

* fix mypy errors

* fix: Handle stale MCP connections in DevUI executor

MCP tools can become stale when HTTP streaming responses end - the underlying
stdio streams close but `is_connected` remains True. This causes subsequent
requests to fail with `ClosedResourceError`.

Add `_ensure_mcp_connections()` to detect and reconnect stale MCP tools before
agent execution. This is a workaround for an upstream Agent Framework issue
where connection state isn't properly tracked.

Fixes MCP tools failing on second HTTP request in DevUI.

fixes  #1476 #1515 #2865

* fix #1572 report import dependency errors more clearly

* Ensure there is streaming toggle where users can select streaming vs non streaming mode in devui . Fixes .NET: [Python] DevUI tool call rendering in non-streaming mode?

* remove unused dead code

* improve ux - workflows with agents show a chat component in execution timelien, also ensure magentic final output shows correctly

* update ui build

* update devui to use instrumentation instead of tracing, other instrumentation and type/instance check fixes

* .NET: Seal factory contexts and add non JSO deserialize overloads (#3066)

* Seal factory contexts and add non JSO deserialize overloads

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Enable blank issues in issue template configuration

Need to re-enable creating blank issues

* updated templates (#3106)

* updated templates

* enabled blank and fixed triage

* made language optional and moved to the bottom for features

* Python: Streaming sample for azurefunctions (#3057)

* Streaming sample for azurefunctions

* Fixed links and sample name

* Addressed feedback

* Addressed feedback

* Fixed integration tests

* Updated test

* Python: fix(azure-ai): Fix response_format handling for structured outputs (#3114)

* fix(azure-ai): read response_format from chat_options instead of run_options

* refactor: use explicit None checks for response_format

* Fix mypy error

* Mypy fix

* Python: Bump python version to 1.0.0b260107 for a release (#3128)

* Bump python version to 1.0.0b260107 for a release

* Update changelog

* Make A2AAgent public, so that it's concrete implementation methods can be used. (#3119)

* .NET: Map additional props <-> A2A metadata (#3137)

* map additional props from agent run options to a2a request metadata

* small touches

* add unit tests for new extension methods

* Sort using

* add unit test

* add additiona unit tests

* special case json element to avoid unnecessary serialization

* Python: Fix Anthropic streaming response bugs (#3141)

* test commit identity

* fix(anthropic): fix raw_representation and finish_reason in streaming

* lint fix

* Bump AWSSDK.Extensions.Bedrock.MEAI from 4.0.5 to 4.0.5.1 (#2994)

---
updated-dependencies:
- dependency-name: AWSSDK.Extensions.Bedrock.MEAI
  dependency-version: 4.0.5.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* Bump Anthropic from 12.0.0 to 12.0.1 (#2993)

---
updated-dependencies:
- dependency-name: Anthropic
  dependency-version: 12.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* .NET: [Breaking] Prevent loss of input messages & streamed updates when resuming streaming (#2748)

* save input messages and stream updates to the continuation token to be able to use them in the last successful stream resumption call.

* Update dotnet/src/Microsoft.Agents.AI/ChatClient/ChatClientAgentContinuationToken.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/src/Microsoft.Agents.AI/ChatClient/ChatClientAgentContinuationToken.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/tests/Microsoft.Agents.AI.UnitTests/ChatClient/ChatClientAgent_BackgroundResponsesTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/src/Microsoft.Agents.AI/ChatClient/ChatClientAgentContinuationToken.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/src/Microsoft.Agents.AI/ChatClient/ChatClientAgentContinuationToken.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix typo

* init continuation token from chat response

* remove unnecessary types for source generation

* remove check for continuation token passed at initial run

* remove check for continuation token pass at initial run

* centralize continuation token parsing

* update xml comments

* use readonly collection instead of enumerable

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* .NET: fix: Expose WorkflowErrorEvent as ErrorContent (#2762)

* fix: Expose WorkflowErrorEvent as ErrorContent

When hosted using .AsAgent(), Workflows were not exposing inner errors coming as Exceptions (through the WorkflowErrorEvent)

The fix is to convert their message to an ErrorContent on the way out, rather than rely on the default "empty update" to collect the raw event.

* feat: Add a way to show/suppress exception information

* Bump Microsoft.Agents.AI.Workflows from 1.0.0-preview.251125.1 to 1.0.0-preview.251219.1 (#2997)

---
updated-dependencies:
- dependency-name: Microsoft.Agents.AI.Workflows
  dependency-version: 1.0.0-preview.251219.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>

* .NET: Add Run overloads to expose ChatClientAgentRunOptions in IntelliSense (#3115)

* Initial plan

* Add ChatClientAgentExtensions for improved discoverability of ChatClientAgentRunOptions

Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>

* Address code review feedback - use collection expression syntax

Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>

* Apply suggestion from @westey-m

* Fix issues with Copilot implementation

* Add additional tests for structured output overloads.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: westey-m <164392973+westey-m@users.noreply.github.com>

* Python: Add tool call/result content types and update connectors and samples (#2971)

* Add new AI content types and image tool support

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Add Python content types for tool calls/results and image generation tool support

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Address review feedback for tool content and samples

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Tighten image generation typing and sample tools list

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Align image generation output typing

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Handle MCP naming, image options mapping, and connector tool content

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Allow MCP call in function approval request

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Remove raw image_generation tool remapping

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Restore Anthropic tool_use to function calls unless code execution

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Fix lint issues for hosted file docstring and MCP parsing

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Import ChatResponse types in Anthropic client

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Fix Anthropics citation type imports and MCP typing for handoff/tools

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Skip lightning tests without agentlightning and fix function call import

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* fix lint on lab package

* rebuilt anthropic parsing

* redid anthropic parsing

* typo

* updated parsing and added missing docstrings

* fix tests

* mypy fixes

* second mypy fix

* add new class to other samples

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
Co-authored-by: eavanvalkenburg <github@vanvalkenburg.eu>

* Bump Google.GenAI from 0.6.0 to 0.9.0 (#2995)

---
updated-dependencies:
- dependency-name: Google.GenAI
  dependency-version: 0.9.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

* Bump js-yaml from 4.1.0 to 4.1.1 in /python/packages/devui/frontend (#3123)

Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.0 to 4.1.1.
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](https://github.com/nodeca/js-yaml/compare/4.1.0...4.1.1)

---
updated-dependencies:
- dependency-name: js-yaml
  dependency-version: 4.1.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Updated package versions (#3144)

* .NET: Bump Microsoft.Agents.AI.OpenAI and Microsoft.Extensions.AI.OpenAI (#2996)

* Bump Microsoft.Agents.AI.OpenAI and Microsoft.Extensions.AI.OpenAI

Bumps Microsoft.Agents.AI.OpenAI from 1.0.0-preview.251125.1 to 1.0.0-preview.251219.1
Bumps Microsoft.Extensions.AI.OpenAI from 10.1.0-preview.1.25608.1 to 10.1.1-preview.1.25612.2

---
updated-dependencies:
- dependency-name: Microsoft.Agents.AI.OpenAI
  dependency-version: 1.0.0-preview.251219.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: Microsoft.Extensions.AI.OpenAI
  dependency-version: 10.1.1-preview.1.25612.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: Microsoft.Agents.AI.OpenAI
  dependency-version: 1.0.0-preview.251219.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: Microsoft.Extensions.AI.OpenAI
  dependency-version: 10.1.1-preview.1.25612.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fixed samples

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>

* Python: fix(ag-ui): Execute tools with approval_mode, fix shared state, code cleanup  (#3079)

* fix(ag-ui): execute tools after approval in human-in-the-loop flow

* Fix shared state bug

* Bug fix finalized

* Refactoring to clean up code

* Code cleanup

* More fixes

* More code cleanup

* Add version detection in __init__.py to ruff ignore list

* Track agent name with updates for workflow agent (#3146)

* Python: Fix AzureAIClient tool call bug for AG-UI use (#3148)

* Fiz AzureAIClient tool call bug

* Address copilot feedback

* Python: multiple bug fixes (#3150)

* fix Python: kwargs are not passed to _prepare_thread_and_messages in ChatAgent.run
Fixes #3118

* fix Python: [Bug]: model_id versus model_deployment_name is confusing in Azure AI Agents
Fixes #3147

* add types

* fixed type and docstring

* fix(anthropic): fix duplicate ToolCallStartEvent in streaming tool calls (#3051)

When processing `input_json_delta` events, the Anthropic client was
passing the tool name from the previous `tool_use` event. This caused
ag-ui's `_handle_function_call_content` to emit a `ToolCallStartEvent`
for every streaming chunk (since it triggers on `if content.name:`).

This fix changes the behavior to pass an empty string for `name` in
`input_json_delta` events, matching OpenAI's behavior where streaming
argument chunks have `name=""`. The initial `tool_use` event still
provides the tool name, so only one `ToolCallStartEvent` is emitted.

Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

* .NET: [BREAKING] Change GetNewThread and DeserializeThread to async (#3152)

* Change GetNewThread and DeserializeThread plus ChatMessageStore and AIContextProvider Factories to async

* Merge fixes

* Fix Ollama model env var in documentation (#3156)

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>

* Python: Add Pydantic request model and OpenAPI tags support to AG-UI FastAPI endpoint (#2522)

* feat(ag-ui): Add Pydantic request model and OpenAPI tags support

- Add AGUIRequest Pydantic model in _types.py with field descriptions
- Update add_agent_framework_fastapi_endpoint() to accept tags parameter
- Use AGUIRequest model for automatic validation and OpenAPI schema generation
- Export AGUIRequest and DEFAULT_TAGS in __init__.py
- Update test_endpoint.py to expect 422 for invalid requests
- Add tests for OpenAPI schema, default tags, custom tags, and validation

Benefits:
- Better API documentation with complete request schema in Swagger UI
- Automatic request validation with Pydantic
- Organized endpoints under 'AG-UI' tag instead of 'default'
- Improved developer experience and type safety

Fixes #<issue-number>

* test(ag-ui): Add test for internal error handling to achieve 100% coverage

- Add test_endpoint_internal_error_handling() to cover exception handling code
- Mock copy.deepcopy to simulate internal error during default_state processing
- Add type: ignore for FastAPI tags parameter (known pyright compatibility issue)
- Achieves 100% test coverage for _endpoint.py (previously missing lines 103-105)

* .NET: Improve resolving `AITool` from DI (#3175)

* remove localagenttoolregistry

* also give the factory method API

* Python: Fix MCPStreamableHTTPTool to use new streamable_http_client API (#3088)

* Fix MCPStreamableHTTPTool to use new streamable_http_client API with proper httpx client cleanup

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Update docstring to reflect new streamable_http_client API usage

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Refactor MCPStreamableHTTPTool to accept optional http_client parameter and delegate client creation to streamable_http_client

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Update mcp package minimum version to 1.24.0 for streamable_http_client API support

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Fix critical bugs: apply headers/timeout/sse_read_timeout when creating httpx client, add version constraint <2, and properly manage client lifecycle

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Simplify implementation: remove headers/timeout/sse_read_timeout params, remove kwargs, remove close() override per feedback

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Add back **kwargs parameter for backward compatibility (accepted but not used)

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* Remove unused httpx import from test file

Note: The uv.lock file needs to be updated with 'uv sync' to reflect the mcp version constraint change (>=1.24.0,<2)

Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>

* cicd fixes

* udpated samples with headers examples

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
Co-authored-by: eavanvalkenburg <github@vanvalkenburg.eu>

* azureai direct a2a endpoint support (#3127)

* Python: [BREAKING]: removed display_name, renamed context_providers, middleware and AggregateContextProvider (#3139)

* removed display_name, renamed context_providers, middleware and AggregateContextProvider

* fixes

* fixed test

* testfix

* removed mistakenly put back test

* updated new test

* rename middlewares to middleware

* middleware fixes

* Python: MCP Improvements: improved connection loss behavior, pagination for loading and a param to control representation (#3154)

* pagination support (#2848) added a parse_tool_result param and connection loss (#2884)

* fix #3153

* improved connection handling

* improved logic

* Python: Add declarative workflow runtime (#2815)

* Further support for declarative python workflows

* Add tests. Clean up for typing and formatting

* Improvements and cleanup

* Typing cleanup. Improve docstrings

* Proper code in docstrings

* Fix malformed code-block directive in docstring

* Remove dead links

* PR feedback

* Address PR feedback

* Address PR feedback

* Remove sl

* Update devui frontend

* More cleanup

* Fix uv lock

* Skip Py 3.14 tests as powerfx doesn't support it

* Fix mypy error

* Fix for tool calls

* Removed stale docstring

* Fix lint

* Standardize on .NET namespaces. Revert DevUI changes (bring in later)

* Implement remaining items for Python declarative support to match dotnet

* point URL to agent, not to agentcard (#3176)

* Python: [BREAKING]: Introducing Options as TypedDict and Generic (#3140)

* WIP typeddict for options

* updated all clients and ChatAgents

* updated everything

* added ADR

* fix mypy

* proper typevar imports

* fixed import

* fixed other imports

* slight update in the sample

* updated from feedback

* fixes

* fixed missing covariants and test fixes

* fixed typing

* updated anthropic thinking config

* ruff fixes

* fixed int tests

* fix tests and mypy

* updated integration tests

* updated docstring and test fix

* improved options handling in obser

* mypy fix

* updated a host of integration tests

* fix tests

* bedrock fix

* [BREAKING] Python: Refactor orchestrations (#3023)

* Group chat refactoring Part 1; Next: HIL and handoff

* Add agent approval flow; next samples

* WIP: samples

* WIP: HIL samples

* Group chat HIL working; next: handoff

* Fix group chat tool approval sample

* WIP: refactor handoff; next handoff handling

* Handoff done; next handoff samples and concurrent and sequential

* Handoff samples, concurrent, and sequential done; next Magentic

* WIP: magentic; next test with samples + HIL

* Magentic Working; next fix all samples and tests

* Fix handoff samples; next tests

* WIP: fixing tests; some orchestration as agent samples are failing

* Group chat unit tests done

* Handoff  unit tests done

* Remove old orchestration_request_info and fix related tests

* Magentic unit tests done

* Fix samples

* Fix test

* Fix test 2

* mypy

* Address comments

* Update readme

* Address comments

* Address comments 2

* Replace display name

* Python: ADR for create/get agent API (#2618)

* ADR for create/get agent API

* Updated ADR with implementation options

* Small updates

* Updated decision outcome section

* Updated broken links

* Small updates

* Fixed merge conflicts

* Small fix

* Updated decision outcome section

* Small fixes

* Updated provider naming based on client SDK

* Add ignored parameter for CodeQL in workflow (#3204)

* Implement IReadOnlyList on InMemoryChatMessageStore (#3205)

* .NET: Make ChatMessageStore and AIContextProvider context props settable (#3196)

* Make ChatMessageStore and AIContextProvider context props setable

* Add validation to preserve non-null requirement of certain properties.

* Fix broken tests.

* Python: Add dependencies param to ag-ui FastAPI endpoint (#3191)

* Add dependencies param to ag-ui FastAPI endpoint

* Address Copilot feedback

* renamed all (#3207)

* Python: ADR for simplified get response (#3098)

* ADR for simplified get response

* updated some language, added agent option and code comparison

* small update in sample

* added workflows and expanded some points

* changed decision and number

* updated with stream=False default

* .NET: [Breaking] Rename`AgentRunResponse` and `AgentRunResponseUpdate` classes (#3197)

* rename AgentRunResponse and AgentRunResponseUpdate classes - part1

* rename varialbles, parameters, methods and tests

* rollback unnecessary changes

* .NET: [Breaking] Rename AgentRunResponseEvent and AgentRunUpdateEvent classes (#3214)

* rename AgentRunResponseEvent and AgentRunUpdateEvent classes

* rollback unnecessary changes

* Python: Create/Get Agent API for Azure V2 (#3059)

* Added get_agent method to Azure AI V2

* Small fixes

* Small fix

* Removed AzureAIAgentProvider

* Added create_agent method

* Small fixes

* Fixed code interpreter tool mapping

* Added agent provider for V2 client

* Updated response format handling

* Added provider example

* Fixed errors

* Update python/samples/getting_started/agents/azure_ai/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Small fix

* Updates from merge

* Resolved comments

* Resolved comments

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Python: Add more specific exceptions to Workflow (#3188)

* Add more specifc workflow exceptions

* Fix tests

* AI comments

* Misc

* Python: Added AzureAI sample for downloading code interpreter generated files (#3189)

* added azure ai code interpreter file download sample

* copilot fix suggestions

* function name fixes + readme update

* small fix

* update package versions (#3223)

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>

* Python: fix(core): correct FunctionResultContent ordering in WorkflowAgent.merge_updates (#3168)

* fix(core): simplify FunctionResultContent ordering in WorkflowAgent.merge_updates

* improve comment

* Fix name

* fix(workflows): rename WorkflowOutputEvent.source_executor_id to executor_id for API consistency (#3166)

* Python: fix(ag-ui): add MCP tool support for AG-UI approval flows (#3212)

* add MCP tool support for AG-UI approval flows

* use attribute in place of property

* Python: Properly configure structured outputs based on new options dict (#3213)

* Properly configure structured outputs based on new options dict

* Fix mypy

* .NET: Merge AgentRunOptions.AdditionalProperties into ChatOptions.AdditionalProperties (#3184)

* Merge AgentRunOptions.AdditionalProperties into ChatOptions.AdditionalProperties

* Fix namespace and typo.

* .NET: Update Google.GenAI to 0.11.0 and remove polyfill implementations (#3232)

* Initial plan

* Update Google.GenAI to 0.11.0 and remove polyfill files

Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>

* .NET: [BREAKING] Renamed CreateAIAgent/GetAIAgent to AsAIAgent (#3222)

* Renamed chat client extension method

* Additional renaming

* Updated documentation

* Fixed tests

* Small fix

* Small fix

* Updated DurableAIAgent and fixed integration tests (#3241)

* Python: Create/Get Agent API for Azure V1 (#3192)

* Added provider implementation for Azure AI V1

* Small fixes

* Fixed OpenAPI example

* Fixed local MCP example

* Fixed hosted MCP example

* Fixed file search sample

* Small fixes

* Resolved comments

* Doc updates

* Bump azure-core from 1.37.0 to 1.38.0 in /python (#3209)

Bumps [azure-core](https://github.com/Azure/azure-sdk-for-python) from 1.37.0 to 1.38.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-python/releases)
- [Commits](https://github.com/Azure/azure-sdk-for-python/compare/azure-core_1.37.0...azure-core_1.38.0)

---
updated-dependencies:
- dependency-name: azure-core
  dependency-version: 1.38.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Python: Create/Get Agent API for OpenAI Assistants (#3208)

* Added provider implementation

* Added example with response format

* Small improvements

* Python: (AG-UI) Support service-managed thread on AG-UI  (#3136)

* added service thread support

* set service_thread_id to only supplied_thread_id

* uses raw_representation to extract the conversation_id

* removed accidental edit

* updated test to use raw_representation

* resolves copilot review feedback

* revert back StubAgent, since not used

* removed relative module import

* removed hasattr check per PR feedback

* Create/Get Agent API - fixes and example improvements (#3246)

* Fix merge conflicts

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
Co-authored-by: Tao Chen <taochen@microsoft.com>
Co-authored-by: Kurt <65111699+q33566@users.noreply.github.com>
Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>
Co-authored-by: Korolev Dmitry <deagle.gross@gmail.com>
Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rogerbarreto <19890735+rogerbarreto@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>
Co-authored-by: Jose Luis Latorre Millas <joslat@gmail.com>
Co-authored-by: Jacob Alber <jaalber@microsoft.com>
Co-authored-by: Richard Ortega <richardjortega@gmail.com>
Co-authored-by: 刘邦学AI <lbbniu@gmail.com>
Co-authored-by: Stephen Toub <stoub@microsoft.com>
Co-authored-by: Nico Möller <nkm-moeller@mail.de>
Co-authored-by: Chris Gillum <cgillum@microsoft.com>
Co-authored-by: Giles Odigwe <79032838+giles17@users.noreply.github.com>
Co-authored-by: Phillip Hoff <phillip.hoff@gmail.com>
Co-authored-by: Ege Ozan Özyedek <36128615+egeozanozyedek@users.noreply.github.com>
Co-authored-by: samueljohnsiby <66901393+samueljohnsiby@users.noreply.github.com>
Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>
Co-authored-by: Hao Luo <338265+howlowck@users.noreply.github.com>
Co-authored-by: Victor Dibia <chuvidi2003@gmail.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: Jacob Viau <javia@microsoft.com>
Co-authored-by: SuperKenVery <39673849+SuperKenVery@users.noreply.github.com>
Co-authored-by: Sunil Dutta <dutta.2003@gmail.com>
Co-authored-by: Sunil Dutta <sunil.dutta@penske.com>
Co-authored-by: budgetboardingai <apurva.sharma31@gmail.com>
Co-authored-by: Syrine Chelly <62653967+SyChell@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <sergemenshikh@gmail.com>
Co-authored-by: westey <164392973+westey-m@users.noreply.github.com>
Co-authored-by: takanori-terai <123897708+takanori-terai@users.noreply.github.com>
Co-authored-by: claude89757 <138977524+claude89757@users.noreply.github.com>
Co-authored-by: Gavin Aguiar <80794152+gavin-aguiar@users.noreply.github.com>
Co-authored-by: Sukeesh <vsukeeshbabu@gmail.com>
Co-authored-by: eavanvalkenburg <13749212+eavanvalkenburg@users.noreply.github.com>
Co-authored-by: eavanvalkenburg <github@vanvalkenburg.eu>
Co-authored-by: Ao Chen <chenao3220@gmail.com>
Co-authored-by: Dina Suehiro Jones <dina.s.jones@intel.com>
2026-01-16 16:59:49 -08:00

89 KiB

| status | contact | date | deciders | informed |
|---|---|---|---|---|
| accepted | sergeymenshykh | 2025-10-15 | markwallace, rbarreto, westey-m, stephentoub | |

Long-Running Operations Design

Context and Problem Statement

The Agent Framework currently supports synchronous request-response patterns for AI agent interactions, where agents process requests and return results immediately. Similarly, MEAI chat clients follow the same synchronous pattern for AI interactions. However, many real-world AI scenarios involve complex tasks that require significant processing time, such as:

  • Code generation and analysis tasks
  • Complex reasoning and research operations
  • Image and content generation
  • Large document processing and summarization

Handling these scenarios effectively requires native support for long-running operations in the Agent Framework architecture. In addition, MEAI chat clients must support long-running operations to be used together with Agent Framework (AF) agents, so the design should consider integration patterns and consistency with the broader Microsoft.Extensions.AI ecosystem to provide a unified experience across both agent and chat client scenarios.

Decision Drivers

  • Chat clients and agents should support long-running execution as well as quick prompts.
  • The design should be simple and intuitive for developers to use.
  • The design should be extensible to allow new long-running execution features to be added in the future.
  • The design should be additive rather than disruptive, so that existing chat clients can add support for long-running operations incrementally without breaking existing functionality.

Comparison of Long-Running Operation Features

| Feature | OpenAI Responses | Foundry Agents | A2A |
|---|---|---|---|
| Initiated by | User (Background = true) | Long-running execution is always on | Agent |
| Modeled as | Response | Run | Task |
| Supported modes¹ | Sync, Async | Async | Sync, Async |
| Getting status support | | | |
| Getting result support | | | |
| Update support | | | |
| Cancellation support | | | |
| Delete support | | | |
| Non-streaming support | | | |
| Streaming support | | | |
| Execution statuses | InProgress, Completed, Queued, Cancelled, Failed, Incomplete | InProgress, Completed, Queued, Cancelled, Failed, Cancelling, RequiresAction, Expired | Working, Completed, Canceled, Failed, Rejected, AuthRequired, InputRequired, Submitted, Unknown |

1 Sync is a regular message-based request/response communication pattern; Async is a pattern for long-running operations/tasks where the agent returns an ID for a run/task and allows polling for status and final results by the ID.
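The Async pattern described in the footnote can be sketched as a small polling helper. This is a minimal illustration only: `OperationStatus`, `OperationSnapshot`, and `Polling.WaitForCompletionAsync` are hypothetical names introduced here and are not part of any of the compared APIs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical status values; each real API exposes its own set (see the table above).
public enum OperationStatus { Queued, InProgress, Completed, Failed, Cancelled }

// Hypothetical snapshot returned by a status call: the ID, the status, and a result once available.
public sealed record OperationSnapshot(string Id, OperationStatus Status, string? Result);

public static class Polling
{
    // Calls getSnapshot repeatedly until the operation leaves the Queued/InProgress states.
    public static async Task<OperationSnapshot> WaitForCompletionAsync(
        string operationId,
        Func<string, CancellationToken, Task<OperationSnapshot>> getSnapshot,
        TimeSpan pollInterval,
        CancellationToken ct = default)
    {
        while (true)
        {
            OperationSnapshot snapshot = await getSnapshot(operationId, ct);

            if (snapshot.Status is not (OperationStatus.Queued or OperationStatus.InProgress))
            {
                return snapshot;
            }

            // Pause between attempts so the service is not polled in a tight loop.
            await Task.Delay(pollInterval, ct);
        }
    }
}
```

The caller only needs the ID returned when the operation was started; how the snapshot is fetched is left to the delegate, since each API in the table does this differently.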

Note: The names for new classes, interfaces, and their members used in the sections below are tentative and will be discussed in a dedicated section of this document.

Long-Running Operations Support for Chat Clients

This section describes different options for various aspects required to add long-running operations support to chat clients.

1. Methods for Working with Long-Running Operations

Based on the analysis of existing APIs that support long-running operations (such as OpenAI Responses, Azure AI Foundry Agents, and A2A), the following operations are used for working with long-running operations:

  • Common operations:
    • Start Long-Running Execution: Initiates a long-running operation and returns its Id.
    • Get Status of Long-Running Execution: Retrieves the status of a long-running operation.
    • Get Result of Long-Running Execution: Retrieves the result of a long-running operation.
  • Uncommon operations:
    • Update Long-Running Execution: Updates a long-running operation, such as adding new messages or modifying existing ones.
    • Cancel Long-Running Execution: Cancels a long-running operation.
    • Delete Long-Running Execution: Deletes a long-running operation.

To support these operations by IChatClient implementations, the following options are available:

  • 1.1 New IAsyncChatClient Interface for All Long-Running Execution Operations
  • 1.2 Get{Streaming}ResponseAsync for Common Operations & New IAsyncChatClient Interface for Uncommon Operations
  • 1.3 Get{Streaming}ResponseAsync for Common Operations & New IAsyncChatClient Interface for Uncommon Operations & Capability Check
  • 1.4 Get{Streaming}ResponseAsync for Common Operations & Individual Interface per Uncommon Operation

1.1 New IAsyncChatClient Interface for All Long-Running Execution Operations

This option suggests adding a new interface IAsyncChatClient that some implementations of IChatClient may implement to support long-running operations.

public interface IAsyncChatClient
{
    Task<AsyncRunResult> StartAsyncRunAsync(IList<ChatMessage> chatMessages, RunOptions? options = null, CancellationToken ct = default);
    Task<AsyncRunResult> GetAsyncRunStatusAsync(string runId, CancellationToken ct = default);
    Task<AsyncRunResult> GetAsyncRunResultAsync(string runId, CancellationToken ct = default);
    Task<AsyncRunResult> UpdateAsyncRunAsync(string runId, IList<ChatMessage> chatMessages, CancellationToken ct = default);
    Task<AsyncRunResult> CancelAsyncRunAsync(string runId, CancellationToken ct = default);
    Task<AsyncRunResult> DeleteAsyncRunAsync(string runId, CancellationToken ct = default);
}

public class CustomChatClient : IChatClient, IAsyncChatClient
{
    ...
}

Consumer code example:

IChatClient chatClient = new CustomChatClient();

string prompt = "...";

// Determine if the prompt should be run as a long-running execution
if (chatClient.GetService<IAsyncChatClient>() is { } asyncChatClient && ShouldRunPromptAsynchronously(prompt))
{
    // Declared outside the try block so it remains in scope for the polling code below
    AsyncRunResult result;

    try
    {
        // Start a long-running execution
        result = await asyncChatClient.StartAsyncRunAsync([new ChatMessage(ChatRole.User, prompt)]);
    }
    catch (NotSupportedException)
    {
        Console.WriteLine("This chat client does not support long-running operations.");
        throw;
    }

    AsyncRunContent? asyncRunContent = GetAsyncRunContent(result);

    // Poll for the status of the long-running execution, pausing between attempts
    while (asyncRunContent.Status is AsyncRunStatus.InProgress or AsyncRunStatus.Queued)
    {
        await Task.Delay(TimeSpan.FromSeconds(1));
        result = await asyncChatClient.GetAsyncRunStatusAsync(asyncRunContent.RunId);
        asyncRunContent = GetAsyncRunContent(result);
    }

    // Get the result of the long-running execution
    result = await asyncChatClient.GetAsyncRunResultAsync(asyncRunContent.RunId);
    Console.WriteLine(result);
}
else
{
    // Complete a quick prompt
    ChatResponse response = await chatClient.GetResponseAsync(prompt);
    Console.WriteLine(response);
}

Pros:

  • Not a breaking change: Existing chat clients are not affected.
  • Callers can determine if a chat client supports long-running operations by calling its GetService<IAsyncChatClient>() method.

Cons:

  • Not extensible: Adding new methods to the IAsyncChatClient interface after its release will break existing implementations of the interface.
  • Missing capability check: Callers cannot determine if chat clients support specific uncommon operations before attempting to use them.
  • Insufficient information: Callers may not have enough information to decide whether a prompt should run as a long-running operation.
  • Bypassed decorators: The new method calls bypass existing decorators such as logging, telemetry, etc., so an alternative solution for decorating the new methods will have to be put in place.

1.2 Get{Streaming}ResponseAsync for Common Operations & New IAsyncChatClient Interface for Uncommon Operations

This option suggests using the existing GetResponseAsync and GetStreamingResponseAsync methods of the IChatClient interface to support common long-running operations, such as starting long-running operations, getting their status, their results, and potentially updating them, in addition to their existing functionality of serving quick prompts. Methods for the uncommon operations, such as updating, cancelling, and deleting long-running operations, will be added to a new IAsyncChatClient interface that will be implemented by chat clients that support them.

This option presumes that Option 3.2 (Have one method for getting long-running execution status and result) is selected.

public interface IAsyncChatClient
{
    /// The update can be handled by GetResponseAsync method as well.
    Task<AsyncRunResult> UpdateAsyncRunAsync(string runId, IList<ChatMessage> chatMessages, CancellationToken ct = default);
    
    Task<AsyncRunResult> CancelAsyncRunAsync(string runId, CancellationToken ct = default);
    Task<AsyncRunResult> DeleteAsyncRunAsync(string runId, CancellationToken ct = default);
}

public class ResponsesChatClient : IChatClient, IAsyncChatClient
{
    public async Task<ChatResponse> GetResponseAsync(string prompt, ChatOptions? options = null, CancellationToken ct = default)
    {
        ClientResult<OpenAI.Responses.OpenAIResponse>? result = null;

        // If long-running execution mode is enabled, we run the prompt as a long-running execution
        if(enableLongRunningResponses)
        {
            // No RunId is provided, so we start a long-running execution
            if(options?.RunId is null)
            {
                result = await this._openAIResponseClient.CreateResponseAsync(prompt, new ResponseCreationOptions
                {
                    Background = true,
                });
            }
            else // RunId is provided, so we get the status of a long-running execution
            {
                result = await this._openAIResponseClient.GetResponseAsync(options.RunId);
            }
        }
        else
        {
            // Handle the case when the prompt should be run as a quick prompt
            result = await this._openAIResponseClient.CreateResponseAsync(prompt, new ResponseCreationOptions
            {
                Background = false
            });
        }

        ...
    }

    public Task<AsyncRunResult> UpdateAsyncRunAsync(string runId, IList<ChatMessage> chatMessages, CancellationToken ct = default)
    {
        throw new NotSupportedException("This chat client does not support updating long-running operations.");
    }

    public Task<AsyncRunResult> CancelAsyncRunAsync(string runId, CancellationToken cancellationToken = default)
    {
        return this._openAIResponseClient.CancelResponseAsync(runId, cancellationToken);
    }

    public Task<AsyncRunResult> DeleteAsyncRunAsync(string runId, CancellationToken cancellationToken = default)
    {
        return this._openAIResponseClient.DeleteResponseAsync(runId, cancellationToken);
    }
}

Consumer code example:

IChatClient chatClient = new ResponsesChatClient();

ChatResponse response = await chatClient.GetResponseAsync("<prompt>");

if (GetAsyncRunContent(response) is AsyncRunContent asyncRunContent)
{
    // Get result of the long-running execution
    response = await chatClient.GetResponseAsync([], new ChatOptions
    { 
        RunId = asyncRunContent.RunId 
    });

    // After some time

    // If it's still running, cancel and delete the run
    if (GetAsyncRunContent(response)?.Status is AsyncRunStatus.InProgress or AsyncRunStatus.Queued)
    {
        // GetService returns null when the chat client does not implement IAsyncChatClient
        if (chatClient.GetService<IAsyncChatClient>() is { } asyncChatClient)
        {
            try
            {
                await asyncChatClient.CancelAsyncRunAsync(asyncRunContent.RunId);
            }
            catch (NotSupportedException)
            {
                Console.WriteLine("This chat client does not support cancelling long-running operations.");
            }

            try
            {
                await asyncChatClient.DeleteAsyncRunAsync(asyncRunContent.RunId);
            }
            catch (NotSupportedException)
            {
                Console.WriteLine("This chat client does not support deleting long-running operations.");
            }
        }
    }
}
else
{
    // Handle the case when the response is a quick prompt completion
    Console.WriteLine(response);
}

This option addresses the issue that the option above has with callers needing to know whether the prompt should be run as a long-running operation or a quick prompt. It allows callers to simply call the existing GetResponseAsync method, and the chat client will decide whether to run the prompt as a long-running operation or a quick prompt. If control over the execution mode is still needed, and the underlying API supports it, it will be possible for callers to set the mode at the chat client invocation or configuration. More details about this are provided in one of the sections below about enabling long-running operation mode.

Additionally, it addresses another issue where the GetResponseAsync method may return a long-running execution response and the StartAsyncRunAsync method may return a quick prompt response. Having one method that handles both cases allows callers to not worry about this behavior and simply check the type of the response to determine if it is a long-running operation or a quick prompt completion.

With the GetResponseAsync method becoming responsible for starting, getting status, getting results and updating long-running operations, there are only a few operations left in the IAsyncChatClient interface - cancel and delete. As a result, the IAsyncChatClient interface name may not be the best fit, as it suggests that it is responsible for all long-running operations while it is not. Should the interface be renamed to reflect the operations it supports? What should the new name be? Option 1.4 considers an alternative that might solve the naming issue.

Pros:

  • Delegation and control: Callers delegate the decision of whether to run a prompt as a long-running operation or quick prompt to chat clients, while still having the option to control the execution mode to determine how to handle prompts if needed.
  • Not a breaking change: Existing chat clients are not affected.

Cons:

  • Not extensible: Adding new methods to the IAsyncChatClient interface after its release will break existing implementations of the interface.
  • Missing capability check: Callers cannot determine if chat clients support specific uncommon operations before attempting to use them.
  • An alternative solution for decorating the new methods will have to be put in place because the new method calls bypass existing decorators such as logging, telemetry, etc.

1.3 Get{Streaming}ResponseAsync for Common Operations & New IAsyncChatClient Interface for Uncommon Operations & Capability Check

This option extends the previous option with a way for callers to determine if a chat client supports uncommon operations before attempting to use them.

public interface IAsyncChatClient
{
    bool CanUpdateAsyncRun { get; }
    bool CanCancelAsyncRun { get; }  
    bool CanDeleteAsyncRun { get; } 

    Task<AsyncRunResult> UpdateAsyncRunAsync(string runId, IList<ChatMessage> chatMessages, CancellationToken ct = default);
    Task<AsyncRunResult> CancelAsyncRunAsync(string runId, CancellationToken ct = default);
    Task<AsyncRunResult> DeleteAsyncRunAsync(string runId, CancellationToken ct = default);
}

public class ResponsesChatClient : IChatClient, IAsyncChatClient
{
    public async Task<ChatResponse> GetResponseAsync(string prompt, ChatOptions? options = null, CancellationToken ct = default)
    {
        ...
    }

    public bool CanUpdateAsyncRun => false; // This chat client does not support updating long-running operations.
    public bool CanCancelAsyncRun => true;  // This chat client supports cancelling long-running operations.
    public bool CanDeleteAsyncRun => true;  // This chat client supports deleting long-running operations.

    public Task<AsyncRunResult> UpdateAsyncRunAsync(string runId, IList<ChatMessage> chatMessages, CancellationToken ct = default)
    {
        throw new NotSupportedException("This chat client does not support updating long-running operations.");
    }

    public Task<AsyncRunResult> CancelAsyncRunAsync(string runId, CancellationToken cancellationToken = default)
    {
        return this._openAIResponseClient.CancelResponseAsync(runId, cancellationToken);
    }

    public Task<AsyncRunResult> DeleteAsyncRunAsync(string runId, CancellationToken cancellationToken = default)
    {
        return this._openAIResponseClient.DeleteResponseAsync(runId, cancellationToken);
    }
}

Consumer code example:

IChatClient chatClient = new ResponsesChatClient();

ChatResponse response = await chatClient.GetResponseAsync("<prompt>");

if (GetAsyncRunContent(response) is AsyncRunContent asyncRunContent)
{
    // Get result of the long-running execution
    response = await chatClient.GetResponseAsync([], new ChatOptions
    { 
        RunId = asyncRunContent.RunId 
    });

    // After some time

    IAsyncChatClient? asyncChatClient = chatClient.GetService<IAsyncChatClient>();

    // If it's still running, cancel and delete the run
    if (GetAsyncRunContent(response)?.Status is AsyncRunStatus.InProgress or AsyncRunStatus.Queued)
    {
        // The property patterns also guard against a null asyncChatClient
        if (asyncChatClient is { CanCancelAsyncRun: true })
        {
            await asyncChatClient.CancelAsyncRunAsync(asyncRunContent.RunId);
        }

        if (asyncChatClient is { CanDeleteAsyncRun: true })
        {
            await asyncChatClient.DeleteAsyncRunAsync(asyncRunContent.RunId);
        }
    }
}
else
{
    // Handle the case when the response is a quick prompt completion
    Console.WriteLine(response);
}

Pros:

  • Delegation and control: Callers delegate the decision of whether to run a prompt as a long-running execution or quick prompt to chat clients, while still having the option to control the execution mode to determine how to handle prompts if needed.
  • Not a breaking change: Existing chat clients are not affected.
  • Capability check: Callers can determine if the chat client supports an uncommon operation before attempting to use it.

Cons:

  • Not extensible: Adding new members to the IAsyncChatClient interface after its release will break existing implementations of the interface.
  • An alternative solution for decorating the new methods will have to be put in place because the new method calls bypass existing decorators such as logging, telemetry, etc.

1.4 Get{Streaming}ResponseAsync for Common Operations & Individual Interface per Uncommon Operation

This option suggests using the existing Get{Streaming}ResponseAsync methods of the IChatClient interface to support common long-running operations, such as starting long-running operations, getting their status, and their results, and potentially updating them, in addition to their existing functionality of serving quick prompts.

The uncommon operations that are not supported by all analyzed APIs, such as updating (which can be handled by Get{Streaming}ResponseAsync), cancelling, and deleting long-running operations, as well as future ones, will be added to their own interfaces that will be implemented by chat clients that support them.

This option presumes that Option 3.2 (Have one method for getting long-running execution status and result) is selected.

The interfaces can inherit from IChatClient to allow callers to use an instance of ICancelableChatClient, IUpdatableChatClient, or IDeletableChatClient for calling the Get{Streaming}ResponseAsync methods as well. However, those methods belong to a leaf chat client that, if obtained via the GetService<T>() method, won't be decorated by existing decorators such as function invocation, logging, etc. As a result, an alternative solution (wrap the instance of the leaf chat client in a decorator at the GetService method call) will need to be applied not only to the new methods of one of the interfaces but also to the existing Get{Streaming}ResponseAsync ones.

public interface ICancelableChatClient
{  
    Task<AsyncRunResult> CancelAsyncRunAsync(string runId, CancellationToken cancellationToken = default);
}

public interface IUpdatableChatClient
{  
    Task<AsyncRunResult> UpdateAsyncRunAsync(string runId, IList<ChatMessage> chatMessages, CancellationToken cancellationToken = default);
}

public interface IDeletableChatClient
{  
    Task<AsyncRunResult> DeleteAsyncRunAsync(string runId, CancellationToken cancellationToken = default);
}

// Responses chat client that supports standard long-running operations + cancellation and deletion
public class ResponsesChatClient : IChatClient, ICancelableChatClient, IDeletableChatClient
{
    public async Task<ChatResponse> GetResponseAsync(string prompt, ChatOptions? options = null, CancellationToken ct = default)
    {
        ...
    }

    public Task<AsyncRunResult> CancelAsyncRunAsync(string runId, CancellationToken cancellationToken = default)
    {
        return this._openAIResponseClient.CancelResponseAsync(runId, cancellationToken);
    }

    public Task<AsyncRunResult> DeleteAsyncRunAsync(string runId, CancellationToken cancellationToken = default)
    {
        return this._openAIResponseClient.DeleteResponseAsync(runId, cancellationToken);
    }
}

Example that starts a long-running operation, gets its status, and cancels and deletes it if it's not completed after some time:

IChatClient chatClient = new ResponsesChatClient();

ChatResponse response = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { AllowLongRunningResponses = true });

if (GetAsyncRunContent(response) is AsyncRunContent asyncRunContent)
{
    // Get result
    response = await chatClient.GetResponseAsync([], new ChatOptions
    { 
        RunId = asyncRunContent.RunId 
    });

    // After some time

    // If it's still running, cancel and delete the run
    if (GetAsyncRunContent(response)?.Status is AsyncRunStatus.InProgress or AsyncRunStatus.Queued)
    {
        if(chatClient.GetService<ICancelableChatClient>() is {} cancelableChatClient)
        {
            await cancelableChatClient.CancelAsyncRunAsync(asyncRunContent.RunId);
        }

        if(chatClient.GetService<IDeletableChatClient>() is {} deletableChatClient)
        {
            await deletableChatClient.DeleteAsyncRunAsync(asyncRunContent.RunId);
        }
    }
}
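The decorator caveat described above can be illustrated with a small sketch. It uses deliberately simplified, hypothetical stand-ins (`ISimpleChatClient`, `ISimpleCancelableChatClient`, `LeafChatClient`, `LoggingChatClient`) rather than the real MEAI types, and assumes a decorator whose `GetService` returns itself when it matches the requested type and otherwise asks the inner client.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for IChatClient.
public interface ISimpleChatClient
{
    Task<string> GetResponseAsync(string prompt);
    object? GetService(Type serviceType);
}

// Hypothetical operation-specific interface that inherits from the chat client interface.
public interface ISimpleCancelableChatClient : ISimpleChatClient
{
    Task CancelAsyncRunAsync(string runId);
}

// Leaf client that would talk to the underlying service.
public sealed class LeafChatClient : ISimpleCancelableChatClient
{
    public Task<string> GetResponseAsync(string prompt) => Task.FromResult($"echo: {prompt}");
    public Task CancelAsyncRunAsync(string runId) => Task.CompletedTask;
    public object? GetService(Type serviceType) => serviceType.IsInstanceOfType(this) ? this : null;
}

// Decorator that records every call passing through it.
public sealed class LoggingChatClient : ISimpleChatClient
{
    private readonly ISimpleChatClient _inner;

    public LoggingChatClient(ISimpleChatClient inner) => _inner = inner;

    public List<string> Log { get; } = new();

    public Task<string> GetResponseAsync(string prompt)
    {
        Log.Add($"GetResponseAsync({prompt})");
        return _inner.GetResponseAsync(prompt);
    }

    // Returns the decorator itself if it matches; otherwise forwards the lookup to the inner client.
    public object? GetService(Type serviceType) =>
        serviceType.IsInstanceOfType(this) ? this : _inner.GetService(serviceType);
}
```

In this sketch, a call made on the `LoggingChatClient` is logged, while a call made on the `ISimpleCancelableChatClient` obtained through `GetService` goes straight to the leaf client and leaves no log entry, which is the gap an alternative decoration solution would need to close.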

Pros:

  • Extensible: New interfaces can be added and implemented to support new long-running operations without breaking existing chat client implementations.
  • Not a breaking change: Existing chat clients that implement the IChatClient interface are not affected.
  • Delegation and control: Callers delegate the decision of whether to run a prompt as a long-running operation or quick prompt to chat clients, while still having the option to control the execution mode to determine how to handle prompts if needed.

Cons:

  • Breaking changes: Changing the signatures of the methods of the operation-specific interfaces or adding new members to them will break existing implementations of those interfaces. The blast radius is much smaller than in the previous options, because it is limited to the subset of chat clients that implement the affected operation-specific interface, but it is still a breaking change.

2. Enabling Long-Running Operations

Based on the API analysis, some APIs must be explicitly configured to run in long-running operation mode, while others don't need additional configuration because they either decide themselves whether a request should run as a long-running operation, or they always operate in long-running operation mode or quick prompt mode:

| Feature | OpenAI Responses | Foundry Agents | A2A |
|---|---|---|---|
| Long-running execution | User (Background = true) | Long-running execution is always on | Agent |

The options below consider how to enable long-running operation mode for chat clients that support both quick prompts and long-running operations.

2.1 Execution Mode per Get{Streaming}ResponseAsync Invocation

This option proposes adding a new nullable AllowLongRunningResponses property to the ChatOptions class. The property is set to true when the caller requests a long-running operation; otherwise it is false, null, or left unset.

Chat clients that work with APIs requiring explicit configuration per operation will use this property to determine whether to run the prompt as a long-running operation or quick prompt. Chat clients that work with APIs that don't require explicit configuration will ignore this property and operate according to their own logic/configuration.

public class ChatOptions
{
    // Existing properties...
    public bool? AllowLongRunningResponses { get; set; }
}

// Consumer code example
IChatClient chatClient = ...; // Get an instance of IChatClient

// Start a long-running execution for the prompt if supported by the underlying API
ChatResponse response = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { AllowLongRunningResponses = true });

// Start a quick prompt
ChatResponse quickResponse = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { AllowLongRunningResponses = false });

Pros:

  • Callers can switch between quick prompts and long-running operations per invocation of the Get{Streaming}ResponseAsync methods without changing the client configuration.
  • Enables explicit control over the execution mode by callers per invocation, meaning that no caller site is broken if the agent is injected via DI, and the caller can turn on the long-running operation mode when it can handle it.

Con: This may not be valuable for all callers, as they may not have enough information to decide whether the prompt should run as a long-running operation or quick prompt.

2.2 Execution Mode per Get{Streaming}ResponseAsync Invocation + Model Class

This option is similar to the previous one, but suggests using a model class LongRunningResponsesOptions for properties related to long-running operations.

public class LongRunningResponsesOptions
{
    public bool? Allow { get; set; }
    //public PollingSettings? PollingSettings { get; set; } // Can be added later if necessary
}

public class ChatOptions
{
    public LongRunningResponsesOptions? LongRunningResponsesOptions { get; set; }
}

// Consumer code example
IChatClient chatClient = ...; // Get an instance of IChatClient

// Start a long-running execution for the prompt if supported by the underlying API
ChatResponse response = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { LongRunningResponsesOptions = new() { Allow = true } });

Pros:

  • Enables explicit control over the execution mode by callers per invocation, meaning that no caller site is broken if the agent is injected via DI, and the caller can turn on the long-running operation mode when it can handle it.
  • No proliferation of long-running operation-related properties in the ChatOptions class.

Con: Slightly more complex initialization.

2.3 Execution Mode per Chat Client Instance

This option proposes adding a new enableLongRunningResponses parameter to constructors of chat clients that support both quick prompts and long-running operations. The parameter value will be true if the chat client should operate in long-running operation mode, false if it should operate in quick prompt mode.

Chat clients that work with APIs requiring explicit configuration will use this parameter to determine whether to run prompts as long-running operations or quick prompts. Chat clients that work with APIs that don't require explicit configuration won't have this parameter in their constructors and will operate according to their own logic/configuration.

public class CustomChatClient : IChatClient
{
    private readonly bool _enableLongRunningResponses;

    public CustomChatClient(bool enableLongRunningResponses)
    {
        this._enableLongRunningResponses = enableLongRunningResponses;
    }

    // Existing methods...
}

// Consumer code example
IChatClient chatClient = new CustomChatClient(enableLongRunningResponses: true);

// Start a long-running execution for the prompt
ChatResponse response = await chatClient.GetResponseAsync("<prompt>");

Chat clients can be configured to always operate in long-running operation mode or quick prompt mode based on their role in a specific scenario. For example, a chat client responsible for generating ideas for images can be configured for quick prompt mode, while a chat client responsible for image generation can be configured to always use long-running operation mode.

Pro: Can be beneficial for scenarios where chat clients need to be configured upfront in accordance with their role in a scenario.

Con: Less flexible than the previous option, as it requires configuring the chat client upfront at instantiation time. However, this flexibility might not be needed.

2.4 Combined Approach

This option proposes a combined approach that allows configuration per chat client instance and per Get{Streaming}ResponseAsync method invocation.

The chat client will use whichever configuration is provided, whether set in the chat client constructor or in the options for the Get{Streaming}ResponseAsync method invocation. If both are set, the one provided in the Get{Streaming}ResponseAsync method invocation takes precedence.

public class CustomChatClient : IChatClient
{
    private readonly bool _enableLongRunningResponses;

    public CustomChatClient(bool enableLongRunningResponses)
    {
        this._enableLongRunningResponses = enableLongRunningResponses;
    }
    
    public async Task<ChatResponse> GetResponseAsync(string prompt, ChatOptions? options = null, CancellationToken ct = default)
    {
        bool enableLongRunningResponses = options?.AllowLongRunningResponses ?? this._enableLongRunningResponses;
        // Logic to handle the prompt based on enableLongRunningResponses...
    }
}

// Consumer code example
IChatClient chatClient = new CustomChatClient(enableLongRunningResponses: true);

// Start a long-running execution for the prompt
ChatResponse response = await chatClient.GetResponseAsync("<prompt>");

// Start a quick prompt
ChatResponse quickResponse = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { AllowLongRunningResponses = false });

Pro: Flexible approach that combines the benefits of both previous options.

3. Getting Status and Result of Long-Running Execution

The explored APIs use different approaches for retrieving the status and results of long-running operations. Some use one method to retrieve both the status and the result, while others use two separate methods, one for each:

| Feature | OpenAI Responses | Foundry Agents | A2A |
|---|---|---|---|
| API to Get Status | GetResponseAsync(responseId) | Runs.GetRunAsync(thread.Id, threadRun.Id) | GetTaskAsync(task.Id) |
| API to Get Result | GetResponseAsync(responseId) | Messages.GetMessagesAsync(thread.Id, threadRun.Id) | GetTaskAsync(task.Id) |

Taking into account the differences, the following options propose a few ways to model the API for getting the status and result of long-running operations for chat client implementations.

3.1 Two Separate Methods for Status and Result

This option suggests having two separate methods for getting the status and result of long-running operations:

public interface IAsyncChatClient
{
    Task<AsyncRunResult> GetAsyncRunStatusAsync(string runId, CancellationToken ct = default);
    Task<AsyncRunResult> GetAsyncRunResultAsync(string runId, CancellationToken ct = default);
}

Pros: Could be more intuitive for developers, as it clearly separates the concerns of checking the status and retrieving the result of a long-running operation.

Cons: Creates inefficiency for chat clients that use APIs that return both status and result in a single call, as callers might make redundant calls to get the result after checking the status that already contains the result.

3.2 One Method to Get Status and Result

This option suggests having a single method for getting both the status and result of long-running operations:

public interface IAsyncChatClient
{
    Task<AsyncRunResult> GetAsyncRunResultAsync(string runId, AgentThread? thread = null, CancellationToken ct = default);
}

For underlying APIs that use one method to retrieve both the status and the result, this method simply forwards the call to that method. For APIs that use two separate methods, it first gets the status. If the status indicates that the operation is still running, it returns only the status to the caller. If the status indicates that the operation is completed, it then calls the method that gets the result and returns the result together with the status.
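The two-method case can be sketched as a thin adapter. This is an illustration under stated assumptions: `RunSnapshot` and `SingleCallAdapter` are hypothetical names, the status is modeled as a plain string, and the two delegates stand in for whatever status and result calls the underlying API actually exposes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical combined value: the status is always present, the result only once completed.
public sealed record RunSnapshot(string RunId, string Status, string? Result);

public sealed class SingleCallAdapter
{
    private readonly Func<string, CancellationToken, Task<string>> _getStatus;
    private readonly Func<string, CancellationToken, Task<string>> _getResult;

    public SingleCallAdapter(
        Func<string, CancellationToken, Task<string>> getStatus,
        Func<string, CancellationToken, Task<string>> getResult)
    {
        _getStatus = getStatus;
        _getResult = getResult;
    }

    // One method for callers, backed by two calls on the underlying API.
    public async Task<RunSnapshot> GetAsyncRunResultAsync(string runId, CancellationToken ct = default)
    {
        string status = await _getStatus(runId, ct);

        // Not completed yet: return the status alone and skip the second call.
        if (status != "Completed")
        {
            return new RunSnapshot(runId, status, Result: null);
        }

        // Completed: fetch the result and return it together with the status.
        string result = await _getResult(runId, ct);
        return new RunSnapshot(runId, status, result);
    }
}
```

The result call is made only after the status reports completion, so callers never pay for a result lookup while the operation is still running.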

Pros:

  • Simplifies the API by providing a single, intuitive method for retrieving long-running operation information.
  • More optimal for chat clients that use APIs that return both status and result in a single call, as it avoids unnecessary API calls.

4. Place For RunId, Status, and UpdateId of Long-Running Operations

This section considers different options for exposing the RunId, Status, and UpdateId properties of long-running operations.

4.1. As AIContent

The AsyncRunContent class will represent a long-running operation initiated and managed by an agent/LLM. Items of this content type will be returned in a chat message as part of the AgentResponse or ChatResponse response to represent the long-running operation.

The AsyncRunContent class has two properties: RunId and Status. The RunId identifies the long-running operation, and the Status represents the current status of the operation. The class inherits from AIContent, which is a base class for all AI-related content in MEAI and AF.

The AsyncRunStatus struct represents the status of a long-running operation. Initially, it will have a set of predefined statuses that represent the possible statuses used by existing Agent/LLM APIs that support long-running operations. It will be extended to support additional statuses as needed while also allowing custom, not-yet-defined statuses to propagate as strings from the underlying API to the callers.

The content class type can be used by both agents and chat clients to represent long-running operations. For chat clients to use it, it should be declared in one of the MEAI packages.

public class AsyncRunContent : AIContent
{
    public string RunId { get; }
    public AsyncRunStatus? Status { get; }
}

public readonly struct AsyncRunStatus : IEquatable<AsyncRunStatus>
{
    public static AsyncRunStatus Queued { get; } = new("Queued");
    public static AsyncRunStatus InProgress { get; } = new("InProgress");
    public static AsyncRunStatus Completed { get; } = new("Completed");
    public static AsyncRunStatus Cancelled { get; } = new("Cancelled");
    public static AsyncRunStatus Failed { get; } = new("Failed");
    public static AsyncRunStatus RequiresAction { get; } = new("RequiresAction");
    public static AsyncRunStatus Expired { get; } = new("Expired");
    public static AsyncRunStatus Rejected { get; } = new("Rejected");
    public static AsyncRunStatus AuthRequired { get; } = new("AuthRequired");
    public static AsyncRunStatus InputRequired { get; } = new("InputRequired");
    public static AsyncRunStatus Unknown { get; } = new("Unknown");

    public string Label { get; }

    public AsyncRunStatus(string label)
    {
        if (string.IsNullOrWhiteSpace(label))
        {
            throw new ArgumentException("Label cannot be null or whitespace.", nameof(label));
        }

        this.Label = label;
    }

    /// Other members
}
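
The members elided by "Other members" might look roughly like the sketch below, which also shows how a status that is not predefined can still flow through as a string. Case-insensitive label comparison is an assumption made for this sketch, not something the design above specifies, and only one predefined status is repeated to keep the example short.

```csharp
using System;

// Demo: a status the abstraction does not predefine still propagates as a string.
var custom = new AsyncRunStatus("Paused");
Console.WriteLine(custom == AsyncRunStatus.Completed);                           // False
Console.WriteLine(new AsyncRunStatus("completed") == AsyncRunStatus.Completed);  // True (case-insensitive by assumption)
Console.WriteLine(custom);                                                       // Paused

public readonly struct AsyncRunStatus : IEquatable<AsyncRunStatus>
{
    public static AsyncRunStatus Completed { get; } = new("Completed");

    public string Label { get; }

    public AsyncRunStatus(string label)
    {
        if (string.IsNullOrWhiteSpace(label))
        {
            throw new ArgumentException("Label cannot be null or whitespace.", nameof(label));
        }

        this.Label = label;
    }

    // Assumption: two statuses are equal when their labels match, ignoring case.
    public bool Equals(AsyncRunStatus other) =>
        string.Equals(this.Label, other.Label, StringComparison.OrdinalIgnoreCase);

    public override bool Equals(object? obj) => obj is AsyncRunStatus other && this.Equals(other);

    // default(AsyncRunStatus) has a null label, so guard against it here.
    public override int GetHashCode() =>
        this.Label is null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(this.Label);

    public static bool operator ==(AsyncRunStatus left, AsyncRunStatus right) => left.Equals(right);

    public static bool operator !=(AsyncRunStatus left, AsyncRunStatus right) => !left.Equals(right);

    public override string ToString() => this.Label ?? string.Empty;
}
```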

The streaming API may return an UpdateId identifying a particular update within a streamed response. This UpdateId should be available together with RunId to callers, allowing them to resume a long-running operation identified by the RunId from the last received update, identified by the UpdateId.

4.2. As Properties Of ChatResponse{Update}

This option suggests adding properties related to long-running operations directly to the ChatResponse and ChatResponseUpdate classes rather than using a separate content class for that. See section "6. Model To Support Long-Running Operations" for more details.

5. Streaming Support

All analyzed APIs that support long-running operations also support streaming.

Some of them natively support resuming streaming from a specific point in the stream, while for others, this is either implementation-dependent or needs to be emulated:

| API | Can Resume Streaming | Model |
| --- | --- | --- |
| OpenAI Responses | Yes | StreamingResponseUpdate.SequenceNumber + GetResponseStreamingAsync(responseId, startingAfter, ct) |
| Azure AI Foundry Agents | Emulated² | RunStep.Id + custom pseudo code: client.Runs.GetRunStepsAsync(...).AllStepsAfter(stepId) |
| A2A | Implementation dependent¹ | |

1 The A2A specification allows an A2A agent implementation to decide how to handle streaming resumption: If a client's SSE connection breaks prematurely while a task is still active (and the server hasn't sent a final: true event for that phase), the client can attempt to reconnect to the stream using the tasks/resubscribe RPC method. The server's behavior regarding missed events during the disconnection period (e.g., whether it backfills or only sends new updates) is implementation-dependent.

2 The Azure AI Foundry Agents API has an API to start a streaming run but does not have an API to resume streaming from a specific point in the stream. However, it has non-streaming APIs to access already started runs, which can be used to emulate streaming resumption by accessing a run and its steps and streaming all the steps after a specific step.
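
The emulation described in footnote 2 could be sketched as a filter over the steps returned by a non-streaming call. Everything in this sketch is a hypothetical stand-in: the `Step` record, the `FakeRuns` source, and the `AllStepsAfter` extension mirror the pseudo code in the table and are not part of the Azure AI Foundry Agents SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Demo: the caller last saw step_2, so only step_3 and step_4 are replayed.
await foreach (Step step in FakeRuns.GetStepsAsync().AllStepsAfter("step_2"))
{
    Console.WriteLine(step.Id);
}

// Hypothetical stand-in for a run step returned by a non-streaming API.
public sealed record Step(string Id);

public static class StepResumption
{
    // Skips every step up to and including lastSeenStepId, then yields the rest.
    // If lastSeenStepId never appears, nothing is yielded.
    public static async IAsyncEnumerable<Step> AllStepsAfter(
        this IAsyncEnumerable<Step> steps,
        string lastSeenStepId)
    {
        bool found = false;

        await foreach (Step step in steps)
        {
            if (found)
            {
                yield return step;
            }
            else if (step.Id == lastSeenStepId)
            {
                found = true;
            }
        }
    }
}

// In-memory source used only for the demo above.
public static class FakeRuns
{
    public static async IAsyncEnumerable<Step> GetStepsAsync()
    {
        foreach (string id in new[] { "step_1", "step_2", "step_3", "step_4" })
        {
            await Task.Yield();
            yield return new Step(id);
        }
    }
}
```

A real implementation would also need to decide what to do when the requested step is unknown, for example replaying from the beginning or failing the call; this sketch silently yields nothing in that case.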

Required Changes

To support streaming resumption, the following model changes are required:

  • The ChatOptions class needs to be extended with a new StartAfter property that will identify an update to resume streaming from and to start generating responses after.
  • The ChatResponseUpdate class needs to be extended with a new SequenceNumber property that will identify the update number within the stream.

All the chat clients supporting the streaming resumption will need to return the SequenceNumber property as part of the ChatResponseUpdate class and honor the StartAfter property of the ChatOptions class.

Function Calling

Function calls over streaming are communicated to chat clients through a series of updates. Chat clients accumulate these updates in their internal state to build the function call content once the last update has been received. The completed function call content is then returned to the function-calling chat client, which eventually invokes it.

Since chat clients keep function call updates in their internal state, resuming streaming from a specific update can be impossible if the resumption request is made using a chat client that does not have the previous updates stored. This situation can occur if a host suspends execution during an ongoing function call stream and later resumes from that particular update. Because chat clients' internal state is not persisted, they will lack the prior updates needed to continue the function call, leading to a failure in resumption.

To address this issue, chat clients can return sequence numbers only for updates that are resumable. For updates that cannot be resumed from, chat clients can return the sequence number of the most recent resumable update received before the non-resumable one. This allows callers to resume from that earlier update, even if it means re-processing some updates that have already been handled.

Chat clients will continue returning the sequence number of the last resumable update until a new resumable update becomes available. For example, a chat client might keep returning sequence number 2, corresponding to the last resumable update received before an update for the first function call. Once all function call updates are received and processed, and the model returns a non-function call response, the chat client will then return a sequence number, say 10, which corresponds to the first non-function call update.
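
The bookkeeping described above can be sketched with a small tracker that remembers the last resumable sequence number and reports it for every update. The `ResumableSequenceTracker` class and its `Observe` method are hypothetical names used only for this illustration; how a chat client decides whether an update is resumable is left out.

```csharp
using System;

var tracker = new ResumableSequenceTracker();

// Updates 0..2 are resumable, 3..9 belong to a function call, 10 is resumable again.
for (int sequenceNumber = 0; sequenceNumber <= 10; sequenceNumber++)
{
    bool isResumable = sequenceNumber <= 2 || sequenceNumber >= 10;
    Console.WriteLine($"{sequenceNumber} -> {tracker.Observe(sequenceNumber, isResumable)}");
}

public sealed class ResumableSequenceTracker
{
    private int? _lastResumable;

    // Returns the sequence number to surface to the caller for this update:
    // the update's own number if it is resumable, otherwise the last resumable one
    // (or null if no resumable update has been seen yet).
    public int? Observe(int sequenceNumber, bool isResumable)
    {
        if (isResumable)
        {
            this._lastResumable = sequenceNumber;
        }

        return this._lastResumable;
    }
}
```

With these inputs, updates 3 through 9 are all reported with sequence number 2, and update 10 is reported with its own number, matching the example in the text.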

Status of Streaming Updates

Different APIs provide different statuses for streamed function call updates.

Sequence of updates from OpenAI Responses API to answer the question "What time is it?" using a function call:

| Id | SN | Update.Kind | Response.Status | ChatResponseUpdate.Status | Description |
| --- | --- | --- | --- | --- | --- |
| resp_1 | 0 | resp.created | Queued | Queued | |
| resp_1 | 1 | resp.queued | Queued | Queued | |
| resp_1 | 2 | resp.in_progress | InProgress | InProgress | |
| resp_1 | 3 | resp.output_item.added | - | InProgress | |
| resp_1 | 4 | resp.func_call.args.delta | - | InProgress | |
| resp_1 | 5 | resp.func_call.args.done | - | InProgress | |
| resp_1 | 6 | resp.output_item.done | - | InProgress | |
| resp_1 | 7 | resp.completed | Completed | Completed | |
| resp_1 | - | - | - | null | FunctionInvokingChatClient yields function result |
| | | | | | OpenAI Responses created a new response to handle function call result |
| resp_2 | 0 | resp.created | Queued | Queued | |
| resp_2 | 1 | resp.queued | Queued | Queued | |
| resp_2 | 2 | resp.in_progress | InProgress | InProgress | |
| resp_2 | 3 | resp.output_item.added | - | InProgress | |
| resp_2 | 4 | resp.cnt_part.added | - | InProgress | |
| resp_2 | 5 | resp.output_text.delta | - | InProgress | |
| resp_2 | 6 | resp.output_text.delta | - | InProgress | |
| resp_2 | 7 | resp.output_text.delta | - | InProgress | |
| resp_2 | 8 | resp.output_text.done | - | InProgress | |
| resp_2 | 9 | resp.cnt_part.done | - | InProgress | |
| resp_2 | 10 | resp.output_item.done | - | InProgress | |
| resp_2 | 11 | resp.completed | Completed | Completed | |

Sequence of updates from Azure AI Foundry Agents API to answer the question "What time is it?" using a function call:

| Id | SN | UpdateKind | Run.Status | Step.Status | Message.Status | ChatResponseUpdate.Status | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| run_1 | - | RunCreated | Queued | - | - | Queued | |
| run_1 | step_1 | - | RequiredAction | InProgress | - | RequiredAction | |
| TBD | - | - | - | - | - | - | FunctionInvokingChatClient yields function result |
| run_1 | - | RunStepCompleted | Completed | - | - | InProgress | |
| run_1 | - | RunQueued | Queued | - | - | Queued | |
| run_1 | - | RunInProgress | InProgress | - | - | InProgress | |
| run_1 | step_2 | RunStepCreated | - | InProgress | - | InProgress | |
| run_1 | step_2 | RunStepInProgress | - | InProgress | - | InProgress | |
| run_1 | - | MessageCreated | - | - | InProgress | InProgress | |
| run_1 | - | MessageInProgress | - | - | InProgress | InProgress | |
| run_1 | - | MessageUpdated | - | - | - | InProgress | |
| run_1 | - | MessageUpdated | - | - | - | InProgress | |
| run_1 | - | MessageUpdated | - | - | - | InProgress | |
| run_1 | - | MessageCompleted | - | - | Completed | InProgress | |
| run_1 | step_2 | RunStepCompleted | Completed | - | - | InProgress | |
| run_1 | - | RunCompleted | Completed | - | - | Completed | |

6. Model To Support Long-Running Operations

To support long-running operations, the following values need to be returned by the GetResponseAsync and GetStreamingResponseAsync methods:

  • ResponseId - identifier of the long-running operation or an entity representing it, such as a task.
  • ConversationId - identifier of the conversation or thread the long-running operation is part of. Some APIs, like Azure AI Foundry Agents, use this identifier together with the ResponseId to identify a run.
  • SequenceNumber - identifier of an update within a stream of updates. This is required to support streaming resumption by the GetStreamingResponseAsync method only.
  • Status - status of the long-running operation: whether it is queued, running, failed, cancelled, completed, etc.

These values need to be supplied to subsequent calls of the GetResponseAsync and GetStreamingResponseAsync methods to get the status and result of long-running operations.

6.1 ChatOptions

The following options consider different ways of extending the ChatOptions class to include the following properties to support long-running operations:

  • AllowLongRunningResponses - a boolean property that indicates whether the caller allows the chat client to run in long-running operation mode if it's supported by the chat client.
  • ResponseId - a string property that represents the identifier of the long-running operation or an entity representing it. A non-null value of this property would indicate to chat clients that callers want to get the status and result of an existing long-running operation, identified by the property value, rather than starting a new one.
  • StartAfter - a string property that represents the sequence number of an update within a stream of updates so that the chat client can resume streaming after the last received update.
6.1.1 Direct Properties in ChatOptions
public class ChatOptions
{
    // Existing properties...
    /// <summary>Gets or sets an optional identifier used to associate a request with an existing conversation.</summary>
    public string? ConversationId { get; set; }
    ...

    // New properties...
    public bool? AllowLongRunningResponses { get; set; }
    public string? ResponseId { get; set; }
    public string? StartAfter { get; set; }
}

// Usage example
var response = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { AllowLongRunningResponses = true });

// If the response indicates a long-running operation, get its status and result
if(response.Status is {} status)
{
    response = await chatClient.GetResponseAsync([], new ChatOptions 
    { 
        AllowLongRunningResponses = true,
        ResponseId = response.ResponseId,
        ConversationId = response.ConversationId,
        //StartAfter = response.SequenceNumber // for GetStreamingResponseAsync only
    });
}

Con: Proliferation of long-running operation properties in the ChatOptions class.

6.1.2 LongRunOptions Model Class
public class ChatOptions
{
    // Existing properties...
    public string? ConversationId { get; set; } 
    ...
    
    // New properties...
    public bool? AllowLongRunningResponses { get; set; }

    public LongRunOptions? LongRunOptions { get; set; }
}

public class LongRunOptions
{
    public string? ResponseId { get; set; }
    public string? ConversationId { get; set; } 
    public string? StartAfter { get; set; }

    // Alternatively, ChatResponse can have an extension method ToLongRunOptions.
    public static LongRunOptions FromChatResponse(ChatResponse response)
    {
        return new LongRunOptions
        {
            ResponseId = response.ResponseId,
            ConversationId = response.ConversationId,
        };
    }

    // Alternatively, ChatResponseUpdate can have an extension method ToLongRunOptions.
    public static LongRunOptions FromChatResponseUpdate(ChatResponseUpdate update)
    {
        return new LongRunOptions
        {
            ResponseId = update.ResponseId,
            ConversationId = update.ConversationId,
            StartAfter = update.SequenceNumber,
        };
    }
}

// Usage example
var response = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { AllowLongRunningResponses = true });

// If the response indicates a long-running operation, poll for its status and result.
// The status is re-read on every iteration so the loop ends once the operation completes.
while (response.Status is { } status && status != ResponseStatus.Completed)
{
    response = await chatClient.GetResponseAsync([], new ChatOptions
    {
        AllowLongRunningResponses = true,
        LongRunOptions = LongRunOptions.FromChatResponse(response)
        // or extension method:
        // LongRunOptions = response.ToLongRunOptions()
        // or implicit conversion:
        // LongRunOptions = response
    });
}

Pro: No proliferation of long-running operation properties in the ChatOptions class.

Con: Duplicated property ConversationId.

6.1.3 Continuation Token of System.ClientModel.ContinuationToken Type

This option suggests using System.ClientModel.ContinuationToken to encapsulate all properties required for long-running operations. The continuation token will be returned by chat clients as part of the ChatResponse and ChatResponseUpdate responses to indicate that the response is part of a long-running execution. A null value of the property will indicate that the response is not part of a long-running execution. Chat clients will accept a non-null value of the property to indicate that callers want to get the status and result of an existing long-running operation.

Each chat client will implement its own continuation token class that inherits from ContinuationToken to encapsulate properties required for long-running operations that are specific to the underlying API the chat client works with. For example, for the OpenAI Responses API, the continuation token class will encapsulate the ResponseId and SequenceNumber properties.

public class ChatOptions
{
    // Existing properties...
    public string? ConversationId { get; set; } 
    ...
    
    // New properties...
    public bool? AllowLongRunningResponses { get; set; }

    public ContinuationToken? ContinuationToken { get; set; }
}

internal sealed class LongRunContinuationToken : ContinuationToken
{
    public LongRunContinuationToken(string responseId)
    {
        this.ResponseId = responseId;
    }

    public string ResponseId { get; set; }

    public int? SequenceNumber { get; set; }

    public static LongRunContinuationToken FromToken(ContinuationToken token)
    {
        if (token is LongRunContinuationToken longRunContinuationToken)
        {
            return longRunContinuationToken;
        }

        BinaryData data = token.ToBytes();

        Utf8JsonReader reader = new(data);

        string responseId = null!;
        int? startAfter = null;

        reader.Read();

        // Reading functionality

        return new(responseId)
        {
            SequenceNumber = startAfter
        };
    }
}

// Usage example
ChatOptions options = new() { AllowLongRunningResponses = true };

var response = await chatClient.GetResponseAsync("<prompt>", options);

while (response.ContinuationToken is { } token)
{
    options.ContinuationToken = token;

    response = await chatClient.GetResponseAsync([], options);
}

Console.WriteLine(response.Text);

Pro: No proliferation of long-running operation properties in the ChatOptions class, including the Status property.
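
The polling loop in the usage example calls the client back-to-back. In practice a caller would probably want a delay between polls and a way to cancel; a minimal sketch of that pattern follows. The `SketchResponse` record and the `getResponse` delegate are hypothetical stand-ins for ChatResponse and the chat client call, and the token is modeled as a plain string only to keep the sketch self-contained.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Demo: a fake operation that completes on the third call.
int calls = 0;
SketchResponse final = await Poller.PollUntilCompleteAsync(
    (token, ct) =>
    {
        calls++;
        return Task.FromResult(calls < 3
            ? new SketchResponse(null, "token-" + calls)
            : new SketchResponse("done", null));
    },
    TimeSpan.FromMilliseconds(10));

Console.WriteLine($"{final.Text} after {calls} calls"); // done after 3 calls

// Hypothetical stand-in for a response: a non-null token means "still running".
public sealed record SketchResponse(string? Text, string? ContinuationToken);

public static class Poller
{
    // getResponse receives null to start the operation and a token to continue it.
    public static async Task<SketchResponse> PollUntilCompleteAsync(
        Func<string?, CancellationToken, Task<SketchResponse>> getResponse,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        SketchResponse response = await getResponse(null, cancellationToken);

        while (response.ContinuationToken is { } token)
        {
            // Wait between polls; cancellation interrupts the wait.
            await Task.Delay(delay, cancellationToken);
            response = await getResponse(token, cancellationToken);
        }

        return response;
    }
}
```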

6.1.4 Continuation Token of String Type

This option is similar to the previous one but suggests using a string type for the continuation token instead of the System.ClientModel.ContinuationToken type.

internal sealed class LongRunContinuationToken
{
    public LongRunContinuationToken(string responseId)
    {
        this.ResponseId = responseId;
    }

    public string ResponseId { get; set; }

    public int? SequenceNumber { get; set; }

    public static LongRunContinuationToken Deserialize(string json)
    {
        Throw.IfNullOrEmpty(json);

        var token = JsonSerializer.Deserialize<LongRunContinuationToken>(json, OpenAIJsonContext2.Default.LongRunContinuationToken)
            ?? throw new InvalidOperationException("Failed to deserialize LongRunContinuationToken.");

        return token;
    }

    public string Serialize()
    {
        return JsonSerializer.Serialize(this, OpenAIJsonContext2.Default.LongRunContinuationToken);
    }
}

public class ChatOptions
{
    public string? ContinuationToken { get; set; }
}

Pro: No dependency on the System.ClientModel package.
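
A round trip of such a string token could look like the sketch below. It uses reflection-based System.Text.Json serialization instead of the source-generated OpenAIJsonContext2 context shown above, purely so the example is self-contained; `SketchContinuationToken` is a hypothetical name that mirrors the LongRunContinuationToken class.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Demo: serialize the token to a string, hand it to the caller, and restore it later.
var token = new SketchContinuationToken("resp_1") { SequenceNumber = 7 };
string serialized = token.Serialize();
SketchContinuationToken restored = SketchContinuationToken.Deserialize(serialized);

Console.WriteLine(serialized);
Console.WriteLine($"{restored.ResponseId} {restored.SequenceNumber}"); // resp_1 7

public sealed class SketchContinuationToken
{
    [JsonConstructor]
    public SketchContinuationToken(string responseId)
    {
        this.ResponseId = responseId;
    }

    public string ResponseId { get; set; }

    public int? SequenceNumber { get; set; }

    public string Serialize() => JsonSerializer.Serialize(this);

    public static SketchContinuationToken Deserialize(string json)
    {
        if (string.IsNullOrEmpty(json))
        {
            throw new ArgumentException("Token cannot be null or empty.", nameof(json));
        }

        return JsonSerializer.Deserialize<SketchContinuationToken>(json)
            ?? throw new InvalidOperationException("Failed to deserialize the continuation token.");
    }
}
```

Because the token is an opaque string to callers, its JSON layout stays an implementation detail of the chat client that issued it.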

6.1.5 Continuation Token of a Custom Type

This option is similar to the "6.1.3 Continuation Token of System.ClientModel.ContinuationToken Type" option but suggests using a custom type for the continuation token instead of the System.ClientModel.ContinuationToken type.

Pros

  • There is no dependency on the System.ClientModel package.
  • There is no ambiguity between extension methods for IChatClient that would occur if a new extension method, which accepts a continuation token of string type as the first parameter, is added.

6.2 Overloads of GetResponseAsync and GetStreamingResponseAsync

This option proposes introducing overloads of the GetResponseAsync and GetStreamingResponseAsync methods that will accept long-running operation parameters directly:

public interface ILongRunningChatClient
{
    Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        string responseId,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        string responseId,
        string? startAfter = null,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);
}

public class CustomChatClient : IChatClient, ILongRunningChatClient
{
    ...
}

// Usage example
IChatClient chatClient = ...; // Get an instance of IChatClient

ChatResponse response = await chatClient.GetResponseAsync("<prompt>", new ChatOptions { AllowLongRunningResponses = true });

if (chatClient.GetService<ILongRunningChatClient>() is { } longRunningChatClient)
{
    // Re-read the status on every iteration so the loop ends once the operation completes
    while (response.Status is { } status && status != AsyncRunStatus.Completed)
    {
        response = await longRunningChatClient.GetResponseAsync([], response.ResponseId, new ChatOptions { ConversationId = response.ConversationId });
    }
    ...
}

Pros:

  • No proliferation of long-running operation properties in the ChatOptions class, except for the new AllowLongRunningResponses property discussed in section 2.

Cons:

  • Interface switching: Callers need to switch to the ILongRunningChatClient interface to get the status and result of long-running operations.
  • An alternative solution for decorating the new methods will have to be put in place.

Long-Running Operations Support for AF Agents

1. Methods for Working with Long-Running Operations

The design for supporting long-running operations by agents is very similar to that for chat clients because it is based on the same analysis of existing APIs and anticipated consumption patterns.

1.1 Run{Streaming}Async Methods for Common Operations and the Update Operation + New Method Per Uncommon Operation

This option suggests using the existing Run{Streaming}Async methods of the AIAgent interface implementations to start, get results, and update long-running operations.

For cancellation and deletion of long-running operations, new methods will be added to the AIAgent interface implementations.

public abstract class AIAgent
{
    // Existing methods...
    public Task<AgentResponse> RunAsync(string message, AgentThread? thread = null, AgentRunOptions? options = null, CancellationToken cancellationToken = default) { ... }
    public IAsyncEnumerable<AgentResponseUpdate> RunStreamingAsync(string message, AgentThread? thread = null, AgentRunOptions? options = null, CancellationToken cancellationToken = default) { ... }

    // New methods for uncommon operations
    public virtual Task<AgentResponse?> CancelRunAsync(string id, AgentCancelRunOptions? options = null, CancellationToken cancellationToken = default)
    {
        return Task.FromResult<AgentResponse?>(null);
    }

    public virtual Task<AgentResponse?> DeleteRunAsync(string id, AgentDeleteRunOptions? options = null, CancellationToken cancellationToken = default)
    {
        return Task.FromResult<AgentResponse?>(null);
    }
}

// Agent that supports update and cancellation
public class CustomAgent : AIAgent
{
    public override async Task<AgentResponse?> CancelRunAsync(string id, AgentCancelRunOptions? options = null, CancellationToken cancellationToken = default)
    {
        var response = await this._client.CancelRunAsync(id, options?.Thread?.ConversationId);

        return ConvertToAgentResponse(response); 
    }

    // No override for DeleteRunAsync as it's not supported by the underlying API
}

// Usage
AIAgent agent = new CustomAgent();

AgentThread thread = agent.GetNewThread();

AgentResponse response = await agent.RunAsync("What is the capital of France?", thread);

response = await agent.CancelRunAsync(response.ResponseId, new AgentCancelRunOptions { Thread = thread });

If an agent supports cancellation, deletion, or both, it will override the corresponding methods. Otherwise, it won't override them, and the base implementations will return null by default.

Some agents, for example Azure AI Foundry Agents, require the thread identifier to cancel a run. To accommodate this requirement, the CancelRunAsync method accepts an optional AgentCancelRunOptions parameter that allows callers to specify the thread associated with the run they want to cancel.

public class AgentCancelRunOptions
{
    public AgentThread? Thread { get; set; }
}

Similar design considerations can be applied to the DeleteRunAsync method and the AgentDeleteRunOptions class.

Having options in the method signatures allows for future extensibility; however, they can also be added later through method overloads if needed.

Pros:

  • Existing Run{Streaming}Async methods are reused for common operations.
  • New methods for uncommon operations can be added in a non-breaking way.

2. Enabling Long-Running Operations

The options for enabling long-running operations are exactly the same as those discussed in section "2. Enabling Long-Running Operations" for chat clients:

  • Execution Mode per Run{Streaming}Async Invocation
  • Execution Mode per Run{Streaming}Async Invocation + Model Class
  • Execution Mode per agent instance
  • Combined Approach

Below are the details of the option selected for chat clients that is also selected for agents.

2.1 Execution Mode per Run{Streaming}Async Invocation

This option proposes adding a new nullable AllowLongRunningResponses property of bool type to the AgentRunOptions class. The property value will be true if the caller requests a long-running operation, and false, null, or omitted otherwise.

AI agents that work with APIs requiring explicit configuration per operation will use this property to determine whether to run the prompt as a long-running operation or quick prompt. Agents that work with APIs that don't require explicit configuration will ignore this property and operate according to their own logic/configuration.

public class AgentRunOptions
{
    // Existing properties...
    public bool? AllowLongRunningResponses { get; set; }
}

// Consumer code example
AIAgent agent = ...; // Get an instance of an AIAgent

// Start a long-running execution for the prompt if supported by the underlying API
AgentResponse response = await agent.RunAsync("<prompt>", options: new AgentRunOptions { AllowLongRunningResponses = true });

// Start a quick prompt
AgentResponse quickResponse = await agent.RunAsync("<prompt>");

Pros:

  • Callers can switch between quick prompts and long-running operations per invocation of the Run{Streaming}Async methods without changing agent configuration.
  • Enables explicit control over the execution mode by callers per invocation, meaning that no caller site is broken if the agent is injected via DI, and the caller can turn on the long-running operation mode when it can handle it.

Con: This may not be valuable for all callers, as they may not have enough information to decide whether the prompt should run as a long-running operation or quick prompt.

3. Model To Support Long-Running Operations

The options for modeling long-running operations are exactly the same as those for chat clients discussed in section "6. Model To Support Long-Running Operations" above:

  • Direct Properties in ChatOptions
  • LongRunOptions Model Class
  • Continuation Token of System.ClientModel.ContinuationToken Type
  • Continuation Token of String Type
  • Continuation Token of a Custom Type

Below are the details of the option selected for chat clients that is also selected for agents.

3.1 Continuation Token of a Custom Type

This option suggests using a custom continuation token type, ResponseContinuationToken, to encapsulate all properties representing a long-running operation. The continuation token will be returned by agents in the ContinuationToken property of the AgentResponse and AgentResponseUpdate responses to indicate that the response is part of a long-running operation. A null value of the property will indicate that the response is not part of a long-running operation or the long-running operation has been completed. Callers will set the token in the ContinuationToken property of the AgentRunOptions class in follow-up calls to the Run{Streaming}Async methods to indicate that they want to "continue" the long-running operation identified by the token.

Each agent will implement its own continuation token class that inherits from ResponseContinuationToken to encapsulate properties required for long-running operations that are specific to the underlying API the agent works with. For example, for the A2A agent, the continuation token class will encapsulate the TaskId property.

internal sealed class A2AAgentContinuationToken : ResponseContinuationToken
{
    public A2AAgentContinuationToken(string taskId)
    {
        this.TaskId = taskId;
    }

    public string TaskId { get; set; }

    public static A2AAgentContinuationToken FromToken(ResponseContinuationToken token)
    {
        if (token is A2AAgentContinuationToken a2aContinuationToken)
        {
            return a2aContinuationToken;
        }

        ... // Deserialization logic
    }
}

public class AgentRunOptions
{
    public ResponseContinuationToken? ContinuationToken { get; set; }
}

public class AgentResponse
{
    public ResponseContinuationToken? ContinuationToken { get; }
}
 
public class AgentResponseUpdate
{
    public ResponseContinuationToken? ContinuationToken { get; }
}

// Usage example
AgentThread thread = agent.GetNewThread();

AgentRunOptions options = new() { AllowLongRunningResponses = true };

AgentResponse response = await agent.RunAsync("What is the capital of France?", thread, options);

while (response.ContinuationToken is { } token)
{
    options.ContinuationToken = token;
    response = await agent.RunAsync([], thread, options);
}

Console.WriteLine(response.Text);

4. Continuation Token and Agent Thread

There are two types of agent threads: server-managed and client-managed. The server-managed threads live server-side and are identified by a conversation identifier, and agents use the identifier to associate runs with the threads. The client-managed threads live client-side and are represented by a collection of chat messages that agents maintain by adding user messages to them before sending the thread to the service and by adding the agent response back to the thread when received from the service.

When long-running operations are enabled and an agent is configured with tools, the initial run response may contain a tool call that the agent needs to invoke. If the agent runs with a server-managed thread, the tool call is captured server-side as part of the conversation history, so follow-up runs have access to it and the agent can invoke the tool. However, if no thread is provided for the initial run and a client-managed thread is provided for follow-up runs, the tool call made during the initial run is never added to the client-managed thread, and as a result the agent will not be able to invoke the tool.

4.1 Require Thread for Long-Running Operations

This option suggests that AI agents require a thread to be provided when long-running operations are enabled. If no thread is provided, the agent will throw an exception.

Pro: Ensures agent responses are always captured by client-managed threads when long-running operations are enabled, providing a consistent experience for callers.

Con: May be inconvenient for callers to always provide a thread when long-running operations are enabled.
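
The check this option implies could be sketched as a small guard executed at the start of a run. The `LongRunGuard` class, the choice of InvalidOperationException, and the message text are assumptions made for illustration; the thread is typed as `object` only to keep the sketch free of AF types.

```csharp
using System;

// Demo: long-running mode without a thread is rejected.
try
{
    LongRunGuard.EnsureThread(allowLongRunningResponses: true, thread: null);
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);
}

// No exception: long-running mode is not requested, so the thread stays optional.
LongRunGuard.EnsureThread(allowLongRunningResponses: null, thread: null);

public static class LongRunGuard
{
    // Throws when long-running operations are enabled but no thread was supplied.
    public static void EnsureThread(bool? allowLongRunningResponses, object? thread)
    {
        if (allowLongRunningResponses == true && thread is null)
        {
            throw new InvalidOperationException(
                "A thread is required when long-running operations are enabled.");
        }
    }
}
```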

4.2 Don't Require Thread for Long-Running Operations

This option suggests that AI agents don't require a thread to be provided when long-running operations are enabled. According to this option, it's up to the caller to provide the same thread consistently for all runs of a long-running operation.

Pro: Provides more flexibility to callers by not enforcing thread requirements.

Con: May lead to an inconsistent experience for callers if they forget to provide the thread for initial or follow-up runs.

Decision Outcome

Long-Running Execution Support for Chat Clients

  • Methods: Option 1.4 - Use existing Get{Streaming}ResponseAsync for common operations; individual interfaces for uncommon operations (e.g., ICancelableChatClient)
  • Enabling: Option 2.1 - Execution mode per invocation via ChatOptions.AllowLongRunningResponses
  • Status/Result: Option 3.2 - Single method to get both status and result
  • RunId/UpdateId: Option 4.2 - As properties of ChatResponse{Update}
  • Model: Option 6.1.5 - Custom continuation token type

Long-Running Operations Support for AF Agents

  • Methods: Option 1.1 - Use existing Run{Streaming}Async for common operations; new methods for uncommon operations
  • Enabling: Option 2.1 - Execution mode per invocation via AgentRunOptions.AllowLongRunningResponses
  • Model: Option 3.1 - Custom continuation token type
  • Thread Requirement: Option 4.1 - Require thread for long-running operations

Addendum 1: APIs of Agents Supporting Long-Running Execution

OpenAI Responses
  • Create a background response and wait for it to complete using polling:

    ClientResult<OpenAI.Responses.OpenAIResponse> result = await this._openAIResponseClient.CreateResponseAsync("What is SLM in AI?", new ResponseCreationOptions
    {
        Background = true,
    });
    
    // InProgress, Completed, Cancelled, Queued, Incomplete, Failed
    while (result.Value.Status is (ResponseStatus.Queued or ResponseStatus.InProgress))
    {
        await Task.Delay(500); // Wait for 0.5 seconds before checking the status again
        result = await this._openAIResponseClient.GetResponseAsync(result.Value.Id);
    }
    
    Console.WriteLine($"Response Status: {result.Value.Status}"); // Completed
    Console.WriteLine(result.Value.GetOutputText()); // SLM in the context of AI refers to ...
    
  • Cancel a background response:

    ...
    ClientResult<OpenAI.Responses.OpenAIResponse> result = await this._openAIResponseClient.CreateResponseAsync("What is SLM in AI?", new ResponseCreationOptions
    {
        Background = true,
    });
    
    result = await this._openAIResponseClient.CancelResponseAsync(result.Value.Id);
    
    Console.WriteLine($"Response Status: {result.Value.Status}"); // Cancelled
    
  • Delete a background response:

    ClientResult<OpenAI.Responses.OpenAIResponse> result = await this._openAIResponseClient.CreateResponseAsync("What is SLM in AI?", new ResponseCreationOptions
    {
        Background = true,
    });
    
    ClientResult<OpenAI.Responses.ResponseDeletionResult> deleteResult = await this._openAIResponseClient.DeleteResponseAsync(result.Value.Id);
    
    Console.WriteLine($"Response Deleted: {deleteResult.Value.Deleted}"); // True if the response was deleted successfully
    
  • Stream a background response:

    await foreach (StreamingResponseUpdate update in this._openAIResponseClient.CreateResponseStreamingAsync("What is SLM in AI?", new ResponseCreationOptions { Background = true }))
    {
        Console.WriteLine($"Sequence Number: {update.SequenceNumber}"); // 0, 1, 2, etc.
    
        switch (update)
        {
            case StreamingResponseCreatedUpdate createdUpdate:
                Console.WriteLine($"Response Status: {createdUpdate.Response.Status}"); // Queued
                break;
            case StreamingResponseQueuedUpdate queuedUpdate:
                Console.WriteLine($"Response Status: {queuedUpdate.Response.Status}"); // Queued
                break;
            case StreamingResponseInProgressUpdate inProgressUpdate:
                Console.WriteLine($"Response Status: {inProgressUpdate.Response.Status}"); // InProgress
                break;
            case StreamingResponseOutputItemAddedUpdate outputItemAddedUpdate:
                Console.WriteLine($"Output index: {outputItemAddedUpdate.OutputIndex}");
                Console.WriteLine($"Item Id: {outputItemAddedUpdate.Item.Id}");
                break;
            case StreamingResponseContentPartAddedUpdate contentPartAddedUpdate:
                Console.WriteLine($"Output Index: {contentPartAddedUpdate.OutputIndex}");
                Console.WriteLine($"Item Id: {contentPartAddedUpdate.ItemId}");
                Console.WriteLine($"Content Index: {contentPartAddedUpdate.ContentIndex}");
                break;
            case StreamingResponseOutputTextDeltaUpdate outputTextDeltaUpdate:
                Console.WriteLine($"Output Index: {outputTextDeltaUpdate.OutputIndex}");
                Console.WriteLine($"Item Id: {outputTextDeltaUpdate.ItemId}");
                Console.WriteLine($"Content Index: {outputTextDeltaUpdate.ContentIndex}");
                Console.WriteLine($"Delta: {outputTextDeltaUpdate.Delta}");  // SL>M> in> AI> typically>....
                break;
            case StreamingResponseOutputTextDoneUpdate outputTextDoneUpdate:
                Console.WriteLine($"Output Index: {outputTextDoneUpdate.OutputIndex}");
                Console.WriteLine($"Item Id: {outputTextDoneUpdate.ItemId}");
                Console.WriteLine($"Content Index: {outputTextDoneUpdate.ContentIndex}");
                Console.WriteLine($"Text: {outputTextDoneUpdate.Text}");  // SLM in the context of AI typically refers to ...
                break;
            case StreamingResponseContentPartDoneUpdate contentPartDoneUpdate:
                Console.WriteLine($"Output Index: {contentPartDoneUpdate.OutputIndex}");
                Console.WriteLine($"Item Id: {contentPartDoneUpdate.ItemId}");
                Console.WriteLine($"Content Index: {contentPartDoneUpdate.ContentIndex}");
                Console.WriteLine($"Text: {contentPartDoneUpdate.Part.Text}");  // SLM in the context of AI typically refers to ...
                break;
            case StreamingResponseOutputItemDoneUpdate outputItemDoneUpdate:
                Console.WriteLine($"Output Index: {outputItemDoneUpdate.OutputIndex}");
                Console.WriteLine($"Item Id: {outputItemDoneUpdate.Item.Id}");
                break;
            case StreamingResponseCompletedUpdate completedUpdate:
                Console.WriteLine($"Response Status: {completedUpdate.Response.Status}"); // Completed
                Console.WriteLine($"Output: {completedUpdate.Response.GetOutputText()}"); // SLM in the context of AI typically refers to ...
                break;
            default:
                Console.WriteLine($"Unexpected update type: {update.GetType().Name}");
                break;
        }
    }
    

    Docs: OpenAI background mode

  • Background Mode Disabled

    • Non-streaming API - returns the final result

      | Method Call | Status | Result | Notes |
      |---|---|---|---|
      | CreateResponseAsync(msgs, opts, ct) | Completed | The capital of France is Paris. | |
      | GetResponseAsync(responseId, ct) | Completed | The capital of France is Paris. | response is less than 5 minutes old |
      | GetResponseAsync(responseId, ct) | Completed | The capital of France is Paris. | response is more than 5 minutes old |
      | GetResponseAsync(responseId, ct) | Completed | The capital of France is Paris. | response is more than 12 hours old |

      | Cancellation Method | Result |
      |---|---|
      | CancelResponseAsync | Cannot cancel a synchronous response |

    • Streaming API - returns streaming updates callers can iterate over to get the result

      | Method Call | Status | Result |
      |---|---|---|
      | CreateResponseStreamingAsync(msgs, opts, ct) | - | updates |
      | Iterating over updates | InProgress | - |
      | Iterating over updates | InProgress | - |
      | Iterating over updates | InProgress | The |
      | Iterating over updates | InProgress | capital |
      | Iterating over updates | InProgress | ... |
      | Iterating over updates | InProgress | Paris. |
      | Iterating over updates | Completed | The capital of France is Paris. |
      | GetResponseStreamingAsync(responseId, ct) | - | HTTP 400 - Response cannot be streamed, it was not created with background=true. |

      | Cancellation Method | Result |
      |---|---|
      | CancelResponseAsync | Cannot cancel a synchronous response |

  • Background Mode Enabled

    • Non-streaming API - returns a queued response immediately and allows polling for the status and result

      | Method Call | Status | Result | Notes |
      |---|---|---|---|
      | CreateResponseAsync(msgs, opts, ct) | Queued | responseId | |
      | GetResponseAsync(responseId, ct) | Queued | - | if called before the response is completed |
      | GetResponseAsync(responseId, ct) | Queued | - | if called before the response is completed |
      | GetResponseAsync(responseId, ct) | Completed | The capital of France is Paris. | response is less than 5 minutes old |
      | GetResponseAsync(responseId, ct) | Completed | The capital of France is Paris. | response is more than 5 minutes old |
      | GetResponseAsync(responseId, ct) | Completed | The capital of France is Paris. | response is more than 12 hours old |

      The response started in background mode runs server-side until it completes, fails, or is cancelled. The client can poll for the status of the response using its Id. If the client polls before the response is completed, it will get the latest status of the response. If the client polls after the response is completed, it will get the completed response with the result.

      | Cancellation Method | Result | Notes |
      |---|---|---|
      | CancelResponseAsync | Cancelled | if cancelled before response completed |
      | CancelResponseAsync | Completed | if cancelled after response completed |
      | CancellationToken | No effect | it just cancels the client-side call |

    • Streaming API - returns streaming updates callers can iterate over immediately or after dropping the stream and picking it up later

      | Method Call | Status | Result | Notes |
      |---|---|---|---|
      | CreateResponseStreamingAsync(msgs, opts, ct) | - | updates | |
      | Iterating over updates | Queued | - | |
      | Iterating over updates | Queued | - | |
      | Iterating over updates | InProgress | - | |
      | Iterating over updates | InProgress | - | |
      | Iterating over updates | InProgress | The | |
      | Iterating over updates | InProgress | capital | |
      | Iterating over updates | InProgress | ... | |
      | Iterating over updates | InProgress | Paris. | |
      | Iterating over updates | Completed | The capital of France is Paris. | |
      | GetResponseStreamingAsync(responseId, ct) | - | updates | response is less than 5 minutes old |
      | Iterating over updates | Queued | - | |
      | ... | ... | ... | |
      | GetResponseStreamingAsync(responseId, ct) | - | HTTP 400 - Response can no longer be streamed, it is more than 5 minutes old. | response is more than 5 minutes old |
      | GetResponseAsync(responseId, ct) | Completed | The capital of France is Paris. | accessing response that can't be streamed |

      The streamed response that is not available after 5 minutes can be retrieved using the non-streaming API GetResponseAsync.

      | Cancellation Method | Result | Notes |
      |---|---|---|
      | CancelResponseAsync | Cancelled¹ | if cancelled before response completed |
      | CancelResponseAsync | Cannot cancel a completed response | if cancelled after response completed |
      | CancellationToken | No effect | it just cancels the client-side call |

      ¹ The CancelResponseAsync method returns the Cancelled status, but a subsequent call to GetResponseStreamingAsync still returns an enumerable that can be iterated over to get the rest of the response until it completes.
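
The retrieval rules in the tables above can be condensed into a small decision function. The sketch below is illustrative only: `ChooseRetrievalMethod` is a hypothetical helper, not part of the OpenAI SDK, and it assumes the five-minute streaming window shown in the tables. The tables do not say what happens at exactly five minutes, so the sketch conservatively treats that boundary as no longer streamable.

```csharp
using System;

// Hypothetical helper (not an SDK API): picks the retrieval call that the tables
// above suggest will succeed for a previously created response.
static string ChooseRetrievalMethod(bool createdInBackground, TimeSpan responseAge)
{
    // Assumption taken from the tables: a stream can be resumed only for
    // background responses that are less than 5 minutes old.
    TimeSpan streamingWindow = TimeSpan.FromMinutes(5);

    if (createdInBackground && responseAge < streamingWindow)
    {
        return "GetResponseStreamingAsync";
    }

    // Non-background responses and older background responses are still
    // retrievable through the non-streaming API.
    return "GetResponseAsync";
}

Console.WriteLine(ChooseRetrievalMethod(true, TimeSpan.FromMinutes(2)));   // GetResponseStreamingAsync
Console.WriteLine(ChooseRetrievalMethod(true, TimeSpan.FromMinutes(10)));  // GetResponseAsync
Console.WriteLine(ChooseRetrievalMethod(false, TimeSpan.FromMinutes(1)));  // GetResponseAsync
```

A caller that always falls back to `GetResponseAsync` never hits the HTTP 400 cases listed above, at the cost of losing incremental updates.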

Azure AI Foundry Agents
  • Create a thread and run the agent against it and wait for it to complete using polling:

    // Create a thread with a message.
    ThreadMessageOptions options = new(MessageRole.User, "What is SLM in AI?");
    thread = await this._persistentAgentsClient!.Threads.CreateThreadAsync([options]);
    
    // Run the agent on the thread.
    ThreadRun threadRun = await this._persistentAgentsClient.Runs.CreateRunAsync(thread.Id, agent.Id);
    
    // Poll for the run status.
    // Possible statuses: InProgress, Completed, Cancelling, Cancelled, Queued, Failed, RequiresAction, Expired
    while (threadRun.Status == RunStatus.InProgress || threadRun.Status == RunStatus.Queued)
    {
        await Task.Delay(500); // Wait before polling again to avoid a tight request loop
        threadRun = await this._persistentAgentsClient.Runs.GetRunAsync(thread.Id, threadRun.Id);
    }
    
    // Access the run result.
    await foreach (PersistentThreadMessage msg in this._persistentAgentsClient.Messages.GetMessagesAsync(thread.Id, threadRun.Id))
    {
        foreach (MessageContent content in msg.ContentItems)
        {
            switch (content)
            {
                case MessageTextContent textItem:
                    Console.WriteLine($"  Text: {textItem.Text}");
                    //M1: In the context of Artificial Intelligence (AI), **SLM** often ...
                    //M2: What is SLM in AI?
                    break;
            }
        }
    }
    
  • Cancel an agent run:

    // Create a thread with a message.
    ThreadMessageOptions options = new(MessageRole.User, "What is SLM in AI?");
    thread = await this._persistentAgentsClient!.Threads.CreateThreadAsync([options]);
    
    // Run the agent on the thread.
    ThreadRun threadRun = await this._persistentAgentsClient.Runs.CreateRunAsync(thread.Id, agent.Id);
    
    Response<ThreadRun> cancellationResponse = await this._persistentAgentsClient.Runs.CancelRunAsync(thread.Id, threadRun.Id);
    
  • Other agent run operations: GetRunStepAsync
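
The polling loop above stops on any status other than InProgress or Queued, which includes Cancelling and RequiresAction, neither of which is a final outcome. A caller may want to tell these cases apart. The sketch below is a hypothetical classification over the status names listed in the polling sample. `ClassifyRunStatus` and its three categories are assumptions made for illustration, and plain strings stand in for the SDK's `RunStatus` type to keep the example self-contained.

```csharp
using System;

// Hypothetical helper (not an SDK API): groups the run statuses listed above into
// "Running" (keep polling), "NeedsAction" (the caller must respond, for example
// with tool outputs), and "Terminal" (the run will not change any further).
static string ClassifyRunStatus(string status) => status switch
{
    "Queued" or "InProgress" or "Cancelling" => "Running",
    "RequiresAction" => "NeedsAction",
    "Completed" or "Cancelled" or "Failed" or "Expired" => "Terminal",
    _ => throw new ArgumentOutOfRangeException(nameof(status), status, "Unknown run status."),
};

foreach (string status in new[] { "Queued", "Cancelling", "RequiresAction", "Expired" })
{
    Console.WriteLine($"{status} -> {ClassifyRunStatus(status)}");
}
```

Throwing on unknown values is a deliberate choice in this sketch: a status added by the service later surfaces as an error instead of being silently treated as terminal.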

A2A Agents
  • Send a message to the agent and handle the response:

    // Send message to the A2A agent.
    A2AResponse response = await this.Client.SendMessageAsync(messageSendParams, cancellationToken).ConfigureAwait(false);
    
    // Handle task responses.
    if (response is AgentTask task)
    {
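        while (task.Status.State == TaskState.Working)
        {
            // Wait before polling again to avoid a tight request loop.
            await Task.Delay(500, cancellationToken).ConfigureAwait(false);
            task = await this.Client.GetTaskAsync(task.Id, cancellationToken).ConfigureAwait(false);
        }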
    
        if (task.Artifacts != null && task.Artifacts.Count > 0)
        {
            foreach (var artifact in task.Artifacts)
            {
                foreach (var part in artifact.Parts)
                {
                    if (part is TextPart textPart)
                    {
                        Console.WriteLine($"Result: {textPart.Text}");
                    }
                }
            }
            Console.WriteLine();
        }
    }
    // Handle message responses.
    else if (response is Message message)
    {
        foreach (var part in message.Parts)
        {
            if (part is TextPart textPart)
            {
                Console.WriteLine($"Result: {textPart.Text}");
            }
        }
    }
    else
    {
        throw new InvalidOperationException("Unexpected response type from A2A client.");
    }
    
  • Cancel a task:

    // Send message to the A2A agent.
    A2AResponse response = await this.Client.SendMessageAsync(messageSendParams, cancellationToken).ConfigureAwait(false);
    
    // Cancel the task
    if (response is AgentTask task)
    {
        await this.Client.CancelTaskAsync(new TaskIdParams() { Id = task.Id }, cancellationToken).ConfigureAwait(false);
    }
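
All three APIs above share the same client-side shape: start an operation, poll until it leaves its working state, and optionally cancel it. The sketch below shows one way to bound that wait, polling with a capped exponential backoff and requesting cancellation if the operation is still running after a fixed number of attempts. `WaitOrCancelAsync` is a hypothetical helper, not part of any of the SDKs shown. The state names are plain strings, and the "service" is simulated with delegates so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper (not an SDK API). Polls getState until it reports something
// other than "Working", doubling the delay after each attempt (capped at maxDelay).
// If the state is still "Working" after maxPolls attempts, it requests cancellation
// and returns the state observed afterwards.
static async Task<string> WaitOrCancelAsync(
    Func<Task<string>> getState,
    Func<Task> cancel,
    int maxPolls,
    TimeSpan initialDelay,
    TimeSpan maxDelay)
{
    TimeSpan delay = initialDelay;
    for (int attempt = 0; attempt < maxPolls; attempt++)
    {
        string state = await getState();
        if (state != "Working")
        {
            return state;
        }

        await Task.Delay(delay);
        delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
    }

    await cancel();
    return await getState();
}

// Simulated task that completes on the third poll.
var states = new Queue<string>(new[] { "Working", "Working", "Completed" });
string completed = await WaitOrCancelAsync(
    () => Task.FromResult(states.Dequeue()),
    () => Task.CompletedTask,
    maxPolls: 5,
    initialDelay: TimeSpan.FromMilliseconds(1),
    maxDelay: TimeSpan.FromMilliseconds(4));
Console.WriteLine(completed); // Completed

// Simulated task that never finishes on its own and has to be cancelled.
string current = "Working";
string cancelled = await WaitOrCancelAsync(
    () => Task.FromResult(current),
    () => { current = "Canceled"; return Task.CompletedTask; },
    maxPolls: 3,
    initialDelay: TimeSpan.FromMilliseconds(1),
    maxDelay: TimeSpan.FromMilliseconds(4));
Console.WriteLine(cancelled); // Canceled
```

In real use the two delegates would wrap calls such as `GetTaskAsync` and `CancelTaskAsync`. As the OpenAI streaming footnote above shows, a service may keep producing output after accepting a cancel request, so the state returned after cancellation should be inspected, not assumed.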