* Python: Enhance Azure AI Search citations with document URLs in Foundry V2 (Responses API)
Override _parse_response_from_openai and _parse_chunk_from_openai in
RawAzureAIClient to extract get_urls from azure_ai_search_call_output
items and enrich url_citation annotations with document-specific URLs.
- Non-streaming: first pass collects get_urls, post-processes annotations
- Streaming: captures search output state, enriches url_citation events
(also handles url_citation annotation type not handled by base class)
- Updated V2 sample to demonstrate citation URL extraction
- Added 14 unit tests covering extraction, enrichment, and edge cases
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: rework search citation enrichment to override _inner_get_response
- Remove all direct openai/pydantic imports from _client.py
- Override _inner_get_response instead of _parse_response_from_openai/_parse_chunk_from_openai
- Use closure-local state for streaming instead of instance-level _streaming_search_get_urls
- Add _build_url_citation_content helper for streaming url_citation handling
- Fix mypy errors by using str(value or '') for Annotation TypedDict fields
- Fix docstring to say 'citation' instead of 'url_citation'
- Update tests to match new approach
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: handle streaming search citations from output_item.done events
The azure_ai_search_call_output item only has populated output data
(including get_urls) in the response.output_item.done event, not in
the response.output_item.added event. Also removed the search_get_urls
guard on url_citation handling so annotations are always produced even
if get_urls haven't been captured yet.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* addressed comments
* refactor: address PR review - eliminate type: ignore[assignment] pattern
Call super()._inner_get_response() independently in each branch instead
of once at the top with union type reassignment. Non-streaming uses
two-arg super() in the closure; streaming uses cast() for type narrowing.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: remove defensive patterns per PR review
- Replace all getattr() with direct attribute access
- Remove cast() for streaming branch, use type: ignore[assignment]
- Simplify _build_url_citation_content to use dict access directly
- Simplify _extract_azure_search_urls to use item.type/item.output
- Handle empty list output from streaming 'added' events
- Update tests to match actual runtime types (objects, not dicts)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* mypy fix
* small fixes
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add max_function_calls to FunctionInvocationConfiguration (#2329)
Add a new per-request max_function_calls setting to FunctionInvocationConfiguration
that limits the total number of individual function invocations across all iterations
within a single get_response call. This complements max_iterations (which limits LLM
roundtrips) by providing a hard cap on actual tool executions regardless of parallelism.
- Add max_function_calls field to FunctionInvocationConfiguration (default: None/unlimited)
- Track cumulative function call count in both streaming and non-streaming tool loops
- Force tool_choice='none' when the limit is reached
- Add validation in normalize_function_invocation_configuration
- Improve docstrings for FunctionInvocationConfiguration, FunctionTool, and @tool
to clarify semantics of max_iterations vs max_function_calls vs max_invocations
- Add tests for parallel calls, single calls, unlimited mode, and config validation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add sample for controlling total tool executions
Showcases all three mechanisms for limiting tool executions:
1. max_iterations — caps LLM roundtrips
2. max_function_calls — caps total individual function invocations per request
3. max_invocations — lifetime cap on a specific tool instance
Plus a combined scenario demonstrating defense in depth.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Suppress ruff E305/fmt in hosting sample to preserve XML doc tags
The XML snippet tags (# <create_agent> / # </create_agent>) are used for
docs extraction and must stay adjacent to the code they wrap. Both ruff
check (E305) and ruff format add blank lines after the function definition,
pushing the closing tag away. Suppress with ruff: noqa: E305 and fmt: off.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add per-agent tool wrapping scenario to control_total_tool_executions sample
Show that wrapping the same callable with @tool multiple times creates
independent FunctionTool instances with separate invocation counters,
enabling per-agent max_invocations budgets for shared functions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Clarify max_function_calls is a best-effort limit
The limit is checked after each batch of parallel calls completes, so the
current batch always runs to completion even if it overshoots the limit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address PR review: fix docstring reference, clarify best-effort in sample
- Fix malformed Sphinx :attr: role in FunctionTool docstring — use plain
backtick reference instead
- Update sample to say 'best-effort cap' instead of 'hard cap' for
max_function_calls, noting it's checked between iterations
- Parametrize pattern is correct (fixture override, matching existing tests)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* clarify max_invocations limits
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-24 01:00:25 +00:00
* Fix structured_output propagation in ClaudeAgent
Capture structured_output from ResultMessage in _get_stream() and
propagate it to AgentResponse.value via a custom finalizer. Previously
structured_output was silently discarded, making output_format unusable.
Fixes#4095
* Address review feedback: use value parameter instead of private properties
- Extend AgentResponse.from_updates() to accept optional value parameter
- Remove structured_output yield from _get_stream()
- Update _finalize_response() to pass value via public API
- Update streaming test to use get_final_response()
* Fix mypy errors: add value parameter to from_updates overloads
Add value parameter to both @overload signatures of
AgentResponse.from_updates() so mypy recognizes the argument.
---------
Co-authored-by: Amit Mukherjee <amimukherjee@microsoft.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
* .NET: Add Web Search sample #3674
* .NET: Fix WebSearch sample to use Responses API built-in web search
Remove incorrect Bing Grounding connection ID requirement from the
WebSearch sample. The web search tool uses the OpenAI Responses API
built-in capability and does not need a connection ID.
- Remove AZURE_FOUNDRY_BING_CONNECTION_ID env var requirement
- Use HostedWebSearchTool() without connectionId properties
- Refactor creation options into local functions (MEAI + NativeSDK)
- Switch from AzureCliCredential to DefaultAzureCredential
- Update README to reflect correct prerequisites
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix README to align DefaultAzureCredential docs with code
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address review: add project to solution, README, simplify response text
- Add FoundryAgents_Step25_WebSearch to agent-framework-dotnet.slnx
- Add web search sample entry to parent FoundryAgents README.md
- Simplify text response extraction to use response.Text directly
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix merge conflict in slnx solution file
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When converting base AgentRunOptions to ChatClientAgentRunOptions, the middleware
now preserves AllowBackgroundResponses, ContinuationToken, and AdditionalProperties
in addition to ResponseFormat.
Added unit test verifying all properties are preserved during the conversion.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Updated merge test permissions
* Removed repo check
* Added fetch from main for comparison
* Updated path detection logic
* Small updates
* Reverted file rename
* Created dedicated workflows for integration tests
* Small fix for Python
* Small fixes
* Small update
* Small update
* Added tests check for Python
* Add ChatClient decorator for calling AIContextProviders
* Format new files
* Address PR comments
* Revert problematic change
* Rename Use to UseAIContextProvider
* fix Workflow.as_agent() streaming regression in ag-ui
* Address PR feedback
* workflows wip
* wip
* wip
* Workflow AG-UI demo
* Fixes for handoff workflow demo
* Fixes to workflows support in AG-UI
* Fixes
* Add headers to some demo files
* Fix comment
* Fixes for store
* Make _input_schema lazy-loaded
* fix mypy
* revert session change to handoff only for now
---------
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
* Fix system message content sent as list instead of string
Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages
when content is a list of content parts. This change flattens system and
developer message content to a plain string in the Chat Completions client.
Fixes https://github.com/microsoft/agent-framework/issues/1407
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14
Version 0.4.14 removed several LLM_* attributes from SpanAttributes
(LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS,
LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE).
Move these to the OtelAttr enum with their well-known gen_ai.* string values
and update all references in observability.py and tests.
Fixes https://github.com/microsoft/agent-framework/issues/4160
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Flatten text-only message content to string for all roles
Extend the system/developer fix to all message roles. Text-only content
lists are now post-processed into plain strings, while multimodal content
(text + images/audio) remains as a list. This fixes compatibility with
OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry
Local's Neutron backend).
Partially fixes https://github.com/microsoft/agent-framework/issues/4084
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix streaming text lost when usage data in same chunk
Some providers (e.g. Gemini) include both usage data and text content
in the same streaming chunk. The early return on chunk.usage caused
text and tool call parsing to be skipped entirely. Remove the early
return and process usage alongside text/tool calls.
Fixes https://github.com/microsoft/agent-framework/issues/3434
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix mypy errors in _chat_client.py
Rename shadowed variable 'args' in system/developer branch to 'sys_args'
and rename loop variable 'content' to 'msg_content' to avoid type conflict.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-23 10:05:36 +00:00
Extract 11 private const string fields for vector store property names
(Key, Role, MessageId, AuthorName, ApplicationId, AgentId, UserId,
SessionId, Content, CreatedAt, ContentEmbedding) and replace all inline
usages across the collection definition, store dictionary, search result
access, and filter expressions.
Fixes#3801
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add Azure AI Foundry Memory Context Provider with unit tests
* Add FoundryMemory integration tests and sample application
* Fix ClearStoredMemoriesAsync to handle 404 gracefully and rename to EnsureStoredMemoriesDeletedAsync
* Refactor FoundryMemory: simplify architecture and add memory store creation
- Remove IFoundryMemoryOperations interface (was only for test mocking)
- Remove AIProjectClientMemoryOperations wrapper class
- Provider now directly uses AIProjectClient with internal extension methods
- Extension methods return actual response models instead of extracted values
- Remove WaitForUpdateCompletionAsync from provider (sample uses delay)
- Simplify EnsureMemoryStoreCreatedAsync to return Task instead of Task<bool>
- Add memory store creation with chat_model and embedding_model
- Add UpdateMemoriesResponse with SupersededBy and Error fields
- Simplify unit tests to focus on constructor validation and serialization
- Update sample to use simple delay for memory processing wait
* Add waiting operation for memory store updates
* Fix UTF-8 BOM encoding for FoundryMemory csproj files
* Update copilot instructions for UTF-8 BOM and fix sample API rename
* Fix UTF-8 BOM encoding for TestableAIProjectClient.cs
* Add missing response headers for TS
* Changing default embedding
* Using the SDK Models
* Program update
* Remove debugging code from sample
* Adapt FoundryMemoryProvider to new AIContextProvider API and add UTF-8 BOM instruction
- Override ProvideAIContextAsync/StoreAIContextAsync instead of removed virtual InvokingAsync/InvokedAsync
- Use ProviderSessionState<State> for session-scoped state management (matching Mem0Provider pattern)
- Replace constructor-based scope with stateInitializer delegate
- Remove Serialize method (no longer on base class)
- Add SearchInputMessageFilter, StorageInputMessageFilter, StateKey to options
- Update sample to use AIContextProviders list instead of AIContextProviderFactory
- Update unit and integration tests for new API
- Add UTF-8 BOM encoding and --tl:off instructions to dotnet/AGENTS.md
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Use DefaultAzureCredential in Foundry Memory sample
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address PR review comments for FoundryMemoryProvider
- Move memoryStoreName from options to required constructor parameter
- Make FoundryMemoryProviderScope require non-null/whitespace scope in constructor
- Make Scope property read-only (getter only)
- Replace ConcurrentQueue with single last update ID to fix memory leak
- Only clear pending update ID after successful completion
- Add delete success logging
- Mark FoundryMemoryProvider with [Experimental] attribute
- Update unit tests for new API signatures
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Use Throw.IfNullOrWhitespace for scope and memoryStoreName validation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: Normalize Run/RunStreaming with AIAgent
* refactor: Clarify Session vs. Run -level concepts
* Rename RunId to SessionId to better match Run/Session terminology in AIAgent
* [BREAKING]: Will break existing checkpointed sessions in CosmosDb due to field rename
* refactor: Rename and simplify interface around getting typed data out of ExternalRequest/Response
* Also adds hints around using value types in PortableValue
* refactor: Rename AddFanInEdge to AddFanInBarrierEdge
This will prevent a breaking change later when we introduce a programmable FanIn edge, analogous to the FanOut edge's EdgeSelector.
The goal, in the long run is to support a number of different FanIn scenarios, with naive FanIn (no barrier) by default, similar to FanOut.
* refactor: AsAgent(this Workflow, ...) => AsAIAgent(...)
* misc - part1: SwitchBuilder internal
---------
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
* fix: strip function_call and text_reasoning from cross-agent workflow handoff
When a reasoning model (e.g. gpt-5-mini) runs as Agent 1 in a workflow, its
response includes text_reasoning items (with server-scoped IDs like rs_XXXX)
and function_call items. Forwarding these to Agent 2 in a fresh conversation
caused API errors because the reasoning/call IDs are scoped to the original
stored response context.
Changes:
- Strip 'function_call', 'text_reasoning', 'function_approval_request', and
'function_approval_response' from handoff messages in _agent_executor.py
- Keep 'function_result' so the actual tool output content is preserved for
the next agent's context
- Update unit tests to reflect that function_result messages survive handoff
(messages grow from 2→3: user, tool(result), assistant(summary))
- Fix incorrect test assertions in test_function_invocation_stop_clears_*
that assumed the client layer updates session.service_session_id
- Also fixed _extract_function_calls to search all messages with call_id
deduplication, and the error-limit stop path to submit function_call_output
items before halting (via tool_choice=none cleanup call)
Relates to: https://github.com/microsoft/agent-framework/issues/4047
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: reasoning model workflow handoff and history serialization
Fixes multiple related issues when using reasoning models (gpt-5-mini,
gpt-5.2) in multi-agent workflows that chain agents via from_response
or replay full conversation history via AgentExecutorRequest.
## Reasoning items always emitted on output_item.added
When a reasoning model produces encrypted or hidden reasoning (no
visible text), the Responses API still fires a reasoning output item
without any reasoning_text.delta events. Previously no text_reasoning
Content was emitted in that case, making it invisible to downstream
logic. Both the non-streaming (_parse_response_from_openai) and
streaming (output_item.added) paths now always emit at least one
text_reasoning Content — with empty text if no content is available —
so co-occurrence detection and serialization guards work reliably.
## Reasoning items only serialized when paired with a function_call
The Responses API only accepts reasoning items in input when they
directly preceded a function_call in the original response. Sending a
reasoning item that preceded a text response (no tool call) causes:
"reasoning was provided without its required following item"
_prepare_message_for_openai now checks has_function_call per message
and skips text_reasoning serialization when there is no accompanying
function_call.
## summary field is an array, not an object
The reasoning item summary field sent to the Responses API must be an
array of objects ([{"type": "summary_text", "text": ...}]), not a
single object. Fixed _prepare_content_for_openai accordingly.
## service_session_id cleared when explicit history is provided
When a workflow coordinator replays a full conversation (including
function calls from a previous agent run) back to an executor via
AgentExecutorRequest or from_response, the executor's session still
held a service_session_id (previous_response_id) from the prior run.
The API then received the same function-call items twice — once from
previous_response_id (server-stored) and once from the explicit input —
causing: "Duplicate item found with id fc_...".
AgentExecutor.run (when should_respond=True) and from_response now
reset self._session.service_session_id = None before running so that
explicit input is the sole source of conversation context.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* small improvements in text reasoning
* refactor: add reset_service_session to AgentExecutorRequest for explicit history replay
Replace the implicit 'always clear service_session_id when should_respond=True'
with an explicit opt-in field on AgentExecutorRequest.
The old approach used should_respond=True as a proxy for 'full history replay',
but that conflates two distinct intents:
- Orchestrations group chat sends should_respond=True with an empty/single-message
list (not a full replay) — unnecessarily clearing service_session_id.
- HITL / feedback coordinators send the full prior conversation and truly need
a fresh service session ID to avoid duplicate-item API errors.
Changes:
- Add AgentExecutorRequest.reset_service_session: bool = False
- AgentExecutor.run only clears service_session_id when this flag is True
- AgentExecutor.from_response unchanged (always clears; always full conversation)
- Set reset_service_session=True in all full-history-replay call sites:
agents_with_HITL.py, azure_chat_agents_tool_calls_with_feedback.py,
autogen-migration round-robin coordinator, tau2 runner
- Update _FullHistoryReplayCoordinator test helper to pass the flag
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* comment update
* fixes from feedback
* fix test
* reverted changes to agent executor
* fix: remove reset_service_session from tau2 runner
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* two other reverts
* fix sample
---------
Co-authored-by: Giles Odigwe <79032838+giles17@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-19 21:02:20 +00:00
* Fix handoff orchestration not passing user message to handoff target agent (#3161)
Filter out internal handoff function call and tool result messages before
passing conversation history to the target agent's LLM. These messages
confused the model into ignoring the original user question.
* Add handoff tool call filtering behavior and enhance workflow builder
- Introduced HandoffToolCallFilteringBehavior enum to specify filtering behavior for tool call contents in handoff workflows.
- Updated HandoffsWorkflowBuilder to support customizable handoff instructions and tool call filtering behavior.
- Enhanced HandoffAgentExecutor to utilize new filtering options for improved message handling during agent handoffs.
* Enhance handoff message filtering logic and add unit tests for filtering behaviors
* Refactor HandoffMessagesFilter to remove unused handoff function names and enhance filtering logic for non-handoff function calls
* Refactor HandoffMessagesFilter to streamline FilterCandidateState initialization and improve clarity
* Refactor HandoffMessagesFilter to improve filtering logic and add integration tests for handoff workflows
* fix: HandoffAgentExecutor tests
* [BREAKING] refactor: Decouple Checkpointing and Execution APIs
With this change, Checkpointing becomes an property of an IWorkflowExecutionEnvironment. This lets environments that are tightly-coupled to their CheckpointManager avoid needing to present APIs that would not work (e.g. taking in an InMemory CheckpointManager for Durable Tasks, for example)
* refactor: Normalize IsCheckpointingEnabled naming
- Rename UserNameProvider → UserMemoryProvider
- Use session state (state dict) instead of instance variables
- Use context.extend_instructions() instead of context.instructions.append()
- Use DEFAULT_SOURCE_ID class attribute
- Fix imports to use public agent_framework API
- Add session state inspection at end of sample
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-19 15:34:49 +00:00
Track the last CheckpointInfo in InProcessRunner so that newly created
checkpoints reference their parent. When resuming from a checkpoint,
the resumed-from checkpoint becomes the parent of the next checkpoint.
Adds tests verifying:
- First checkpoint has null parent
- Subsequent checkpoints chain parents correctly
- Checkpoint after resume references the resumed-from checkpoint
* feat: Implement Polymorphic Routing
* feat: Add support for Send/Yield annotations with basic Executor
* Adds annotations to Declarative workflow executors
* fix: Address PR Comments
* Implicit filter in collection loops
* Remove debug / usused / superfluous code
* Fix ProtocolBuilder implicit output registrations
* Fix logic error in ExecuteRouteGeneratorTests.ClassWithManualConfigureProtocol_DoesNotGenerate
* fix: Solidify type checks and send/yield type registrations
* fix: Suppress generation of TurnTokens out of AggregateTurnMessagesExecutor
* Fixes an issue where ConcurrentEndExecutor is not expecting TurnTokens.
* fix: Add ProtocolBuilder support for chained-delegation
* Updates Declarative pacakge to rely on chained-delegation Send/Yield registration
* Renames DeclarativeActionExectuor's new ExecuteAsync to ExecuteActionAsync to avoid colliding with Executor.ExecutoeAsync
* fix: Address PR Comments
* Fixes type mapping in FanInEdgeRunner
* Fixes and expalins send/yield type registration in FunctionExecutor
* fixup: build-break
* fix: Add missing SendsMesage declaration to InvokeAzureAgentExecutor
* Fix FoundryAgents_Step15_ComputerUse sample for Azure Agents API
The Azure Agents API rejects previous_response_id alongside computer_call_output
items, unlike the vanilla OpenAI Responses API. This fix:
- Send all prior response output items (reasoning, computer_call, etc.) as input
items in follow-up calls so the API has full conversation context
- Create a fresh session per call to avoid ConversationId/previous_response_id
- Use currentCallId instead of initialCallId for computer_call_output
- Clear ContinuationToken after polling to prevent stale tokens
- Remove unused initialCallId tracking variable
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address comments
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Initial Implementation of InvokeFunctionTool
* Added unit test for InvokeFunctionTool executor.
* Implemented unit and integration tests for InvokeFunctionTool.
* Add sample for InvokeFunctionTool in declarative workflows.
* Remove unused sample and updated comments.
* Updating to official OM release with InvokeFunctionTool
* Fix formatting issues.
* Updated PowerFx version
* Update test fixture
* Cleanup - Removed unused method in InvokeFunctionToolExecutor
* Update test based on PR feedback.
* Update based on PR comments