* Python: Enhance Azure AI Search citations with document URLs in Foundry V2 (Responses API)
Override _parse_response_from_openai and _parse_chunk_from_openai in
RawAzureAIClient to extract get_urls from azure_ai_search_call_output
items and enrich url_citation annotations with document-specific URLs.
- Non-streaming: first pass collects get_urls, post-processes annotations
- Streaming: captures search output state, enriches url_citation events
(also handles url_citation annotation type not handled by base class)
- Updated V2 sample to demonstrate citation URL extraction
- Added 14 unit tests covering extraction, enrichment, and edge cases
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: rework search citation enrichment to override _inner_get_response
- Remove all direct openai/pydantic imports from _client.py
- Override _inner_get_response instead of _parse_response_from_openai/_parse_chunk_from_openai
- Use closure-local state for streaming instead of instance-level _streaming_search_get_urls
- Add _build_url_citation_content helper for streaming url_citation handling
- Fix mypy errors by using str(value or '') for Annotation TypedDict fields
- Fix docstring to say 'citation' instead of 'url_citation'
- Update tests to match new approach
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: handle streaming search citations from output_item.done events
The azure_ai_search_call_output item only has populated output data
(including get_urls) in the response.output_item.done event, not in
the response.output_item.added event. Also removed the search_get_urls
guard on url_citation handling so annotations are always produced even
if get_urls haven't been captured yet.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* addressed comments
* refactor: address PR review - eliminate type: ignore[assignment] pattern
Call super()._inner_get_response() independently in each branch instead
of once at the top with union type reassignment. Non-streaming uses
two-arg super() in the closure; streaming uses cast() for type narrowing.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: remove defensive patterns per PR review
- Replace all getattr() with direct attribute access
- Remove cast() for streaming branch, use type: ignore[assignment]
- Simplify _build_url_citation_content to use dict access directly
- Simplify _extract_azure_search_urls to use item.type/item.output
- Handle empty list output from streaming 'added' events
- Update tests to match actual runtime types (objects, not dicts)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* mypy fix
* small fixes
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add max_function_calls to FunctionInvocationConfiguration (#2329)
Add a new per-request max_function_calls setting to FunctionInvocationConfiguration
that limits the total number of individual function invocations across all iterations
within a single get_response call. This complements max_iterations (which limits LLM
roundtrips) by providing a hard cap on actual tool executions regardless of parallelism.
- Add max_function_calls field to FunctionInvocationConfiguration (default: None/unlimited)
- Track cumulative function call count in both streaming and non-streaming tool loops
- Force tool_choice='none' when the limit is reached
- Add validation in normalize_function_invocation_configuration
- Improve docstrings for FunctionInvocationConfiguration, FunctionTool, and @tool
to clarify semantics of max_iterations vs max_function_calls vs max_invocations
- Add tests for parallel calls, single calls, unlimited mode, and config validation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add sample for controlling total tool executions
Showcases all three mechanisms for limiting tool executions:
1. max_iterations — caps LLM roundtrips
2. max_function_calls — caps total individual function invocations per request
3. max_invocations — lifetime cap on a specific tool instance
Plus a combined scenario demonstrating defense in depth.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Suppress ruff E305/fmt in hosting sample to preserve XML doc tags
The XML snippet tags (# <create_agent> / # </create_agent>) are used for
docs extraction and must stay adjacent to the code they wrap. Both ruff
check (E305) and ruff format add blank lines after the function definition,
pushing the closing tag away. Suppress with ruff: noqa: E305 and fmt: off.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add per-agent tool wrapping scenario to control_total_tool_executions sample
Show that wrapping the same callable with @tool multiple times creates
independent FunctionTool instances with separate invocation counters,
enabling per-agent max_invocations budgets for shared functions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Clarify max_function_calls is a best-effort limit
The limit is checked after each batch of parallel calls completes, so the
current batch always runs to completion even if it overshoots the limit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address PR review: fix docstring reference, clarify best-effort in sample
- Fix malformed Sphinx :attr: role in FunctionTool docstring — use plain
backtick reference instead
- Update sample to say 'best-effort cap' instead of 'hard cap' for
max_function_calls, noting it's checked between iterations
- Parametrize pattern is correct (fixture override, matching existing tests)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* clarify max_invocations limits
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-24 01:00:25 +00:00
* Fix structured_output propagation in ClaudeAgent
Capture structured_output from ResultMessage in _get_stream() and
propagate it to AgentResponse.value via a custom finalizer. Previously
structured_output was silently discarded, making output_format unusable.
Fixes#4095
* Address review feedback: use value parameter instead of private properties
- Extend AgentResponse.from_updates() to accept optional value parameter
- Remove structured_output yield from _get_stream()
- Update _finalize_response() to pass value via public API
- Update streaming test to use get_final_response()
* Fix mypy errors: add value parameter to from_updates overloads
Add value parameter to both @overload signatures of
AgentResponse.from_updates() so mypy recognizes the argument.
---------
Co-authored-by: Amit Mukherjee <amimukherjee@microsoft.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
* fix Workflow.as_agent() streaming regression in ag-ui
* Address PR feedback
* workflows wip
* wip
* wip
* Workflow AG-UI demo
* Fixes for handoff workflow demo
* Fixes to workflows support in AG-UI
* Fixes
* Add headers to some demo files
* Fix comment
* Fixes for store
* Make _input_schema lazy-loaded
* fix mypy
* revert session change to handoff only for now
---------
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
* Fix system message content sent as list instead of string
Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages
when content is a list of content parts. This change flattens system and
developer message content to a plain string in the Chat Completions client.
Fixes https://github.com/microsoft/agent-framework/issues/1407
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14
Version 0.4.14 removed several LLM_* attributes from SpanAttributes
(LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS,
LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE).
Move these to the OtelAttr enum with their well-known gen_ai.* string values
and update all references in observability.py and tests.
Fixes https://github.com/microsoft/agent-framework/issues/4160
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Flatten text-only message content to string for all roles
Extend the system/developer fix to all message roles. Text-only content
lists are now post-processed into plain strings, while multimodal content
(text + images/audio) remains as a list. This fixes compatibility with
OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry
Local's Neutron backend).
Partially fixes https://github.com/microsoft/agent-framework/issues/4084
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix streaming text lost when usage data in same chunk
Some providers (e.g. Gemini) include both usage data and text content
in the same streaming chunk. The early return on chunk.usage caused
text and tool call parsing to be skipped entirely. Remove the early
return and process usage alongside text/tool calls.
Fixes https://github.com/microsoft/agent-framework/issues/3434
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix mypy errors in _chat_client.py
Rename shadowed variable 'args' in system/developer branch to 'sys_args'
and rename loop variable 'content' to 'msg_content' to avoid type conflict.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-23 10:05:36 +00:00
* fix: strip function_call and text_reasoning from cross-agent workflow handoff
When a reasoning model (e.g. gpt-5-mini) runs as Agent 1 in a workflow, its
response includes text_reasoning items (with server-scoped IDs like rs_XXXX)
and function_call items. Forwarding these to Agent 2 in a fresh conversation
caused API errors because the reasoning/call IDs are scoped to the original
stored response context.
Changes:
- Strip 'function_call', 'text_reasoning', 'function_approval_request', and
'function_approval_response' from handoff messages in _agent_executor.py
- Keep 'function_result' so the actual tool output content is preserved for
the next agent's context
- Update unit tests to reflect that function_result messages survive handoff
(messages grow from 2→3: user, tool(result), assistant(summary))
- Fix incorrect test assertions in test_function_invocation_stop_clears_*
that assumed the client layer updates session.service_session_id
- Also fixed _extract_function_calls to search all messages with call_id
deduplication, and the error-limit stop path to submit function_call_output
items before halting (via tool_choice=none cleanup call)
Relates to: https://github.com/microsoft/agent-framework/issues/4047
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: reasoning model workflow handoff and history serialization
Fixes multiple related issues when using reasoning models (gpt-5-mini,
gpt-5.2) in multi-agent workflows that chain agents via from_response
or replay full conversation history via AgentExecutorRequest.
## Reasoning items always emitted on output_item.added
When a reasoning model produces encrypted or hidden reasoning (no
visible text), the Responses API still fires a reasoning output item
without any reasoning_text.delta events. Previously no text_reasoning
Content was emitted in that case, making it invisible to downstream
logic. Both the non-streaming (_parse_response_from_openai) and
streaming (output_item.added) paths now always emit at least one
text_reasoning Content — with empty text if no content is available —
so co-occurrence detection and serialization guards work reliably.
## Reasoning items only serialized when paired with a function_call
The Responses API only accepts reasoning items in input when they
directly preceded a function_call in the original response. Sending a
reasoning item that preceded a text response (no tool call) causes:
"reasoning was provided without its required following item"
_prepare_message_for_openai now checks has_function_call per message
and skips text_reasoning serialization when there is no accompanying
function_call.
## summary field is an array, not an object
The reasoning item summary field sent to the Responses API must be an
array of objects ([{"type": "summary_text", "text": ...}]), not a
single object. Fixed _prepare_content_for_openai accordingly.
## service_session_id cleared when explicit history is provided
When a workflow coordinator replays a full conversation (including
function calls from a previous agent run) back to an executor via
AgentExecutorRequest or from_response, the executor's session still
held a service_session_id (previous_response_id) from the prior run.
The API then received the same function-call items twice — once from
previous_response_id (server-stored) and once from the explicit input —
causing: "Duplicate item found with id fc_...".
AgentExecutor.run (when should_respond=True) and from_response now
reset self._session.service_session_id = None before running so that
explicit input is the sole source of conversation context.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* small improvements in text reasoning
* refactor: add reset_service_session to AgentExecutorRequest for explicit history replay
Replace the implicit 'always clear service_session_id when should_respond=True'
with an explicit opt-in field on AgentExecutorRequest.
The old approach used should_respond=True as a proxy for 'full history replay',
but that conflates two distinct intents:
- Orchestrations group chat sends should_respond=True with an empty/single-message
list (not a full replay) — unnecessarily clearing service_session_id.
- HITL / feedback coordinators send the full prior conversation and truly need
a fresh service session ID to avoid duplicate-item API errors.
Changes:
- Add AgentExecutorRequest.reset_service_session: bool = False
- AgentExecutor.run only clears service_session_id when this flag is True
- AgentExecutor.from_response unchanged (always clears; always full conversation)
- Set reset_service_session=True in all full-history-replay call sites:
agents_with_HITL.py, azure_chat_agents_tool_calls_with_feedback.py,
autogen-migration round-robin coordinator, tau2 runner
- Update _FullHistoryReplayCoordinator test helper to pass the flag
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* comment update
* fixes from feedback
* fix test
* reverted changes to agent executor
* fix: remove reset_service_session from tau2 runner
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* two other reverts
* fix sample
---------
Co-authored-by: Giles Odigwe <79032838+giles17@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-19 21:02:20 +00:00
- Rename UserNameProvider → UserMemoryProvider
- Use session state (state dict) instead of instance variables
- Use context.extend_instructions() instead of context.instructions.append()
- Use DEFAULT_SOURCE_ID class attribute
- Fix imports to use public agent_framework API
- Add session state inspection at end of sample
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-19 15:34:49 +00:00
* Python: improve .env precedence and observability samples
- Switch load_settings to explicit precedence: overrides -> explicit .env -> environment -> defaults\n- Raise when env_file_path is provided but missing\n- Update settings docs and tests for new behavior\n- Refresh observability samples and README guidance for env loading options\n\nCloses #3864\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fixed some imports
* Fix load_settings CI regressions
Allow explicit env_file_path values that exist but are not regular files (for example /dev/null) by checking path existence before dotenv parsing, and restore a dict accumulator with typed return cast to satisfy mypy.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Avoid implicit dotenv in observability
Only load dotenv in observability helpers when env_file_path is explicitly provided, and remove test os.devnull workarounds that are no longer necessary.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-18 11:18:52 +00:00
* Fix streaming branch in weather override middleware sample
The streaming branch of weather_override_middleware only prefixed the
original weather data via a transform hook instead of replacing the
content with the 'perfect weather' override like the non-streaming
branch does. Replace with a new ResponseStream that yields the override
content as ChatResponseUpdate chunks.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fixed exception handling middleware sample
* Fixed runtime context delegation middleware example
* Fixed multimodal input examples
* Small update
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add workflow support for Azure Functions
* fix compatability with latest framework changes and add integration tests
* refactor code
* remove white space
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* align help text with actual port used
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* replace instance id with a place holder
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* remove unused import
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* remove redundant typing import and fix SIM115
* fix latest breaking changes
* fix mypy issues
* clean up imports
* define source marker strings as constants
* fix json module name
* refactor _extract_message_content_from_dict
* refactor serialization
* add helper method for error response construction and remove _extract_message_content_from_dict since it is not needed
* use strict tpe checking for edges
* change how duplicate agent registrations are handled
* cancel approval_task on HITL timeout
* update docstring
* fix: align azurefunctions package with core API changes after rebase
- State.import_state/export_state are now sync (removed await)
- Add State.commit() before export_state() in activity execution
- Rename executor parameter shared_state -> state
- Rename ctx.set_shared_state/get_shared_state -> set_state/get_state (sync)
- WorkflowBuilder now takes start_executor as constructor kwarg
- Update WorkflowOutputEvent -> WorkflowEvent with type='output'
- Update RequestInfoEvent -> WorkflowEvent[Any]
- Update SharedState -> State in test imports
- Update duplicate agent name tests to match new warning behavior
- Update sample README API references
* fix sample check errors
* fix mypy issues
* fix trailing white spaces
* fix test imports
* feat: add durable workflow samples and adapt to main branch changes
- Add workflow samples 09-12 to 04-hosting/azure_functions/
- Adapt to ChatMessage -> Message rename from main
- Adapt to pickle-based checkpoint encoding from main
- Simplify _serialization.py to delegate to core encode/decode
- Fix Message -> WorkflowMessage disambiguation in _context.py
- Remove non-existent _checkpoint_summary import
* fix: update create_checkpoint signature to match superclass
* fix: correct relative link in HITL sample README
* fix: resolve import breakage after rebase (State, DurableAgentThread, get_logger)
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
* feat: Inject OpenTelemetry trace context into MCP requests and update documentation
* Update python/samples/getting_started/observability/README.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update python/packages/core/tests/core/test_mcp.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* refactor: move opentelemetry import to module level
OpenTelemetry is a hard dependency of agent-framework-core (per
pyproject.toml), so the try/except ImportError guard was dead code.
Move the import to the top of the file to fail fast on missing
dependencies instead of silently hiding installation issues.
---------
Co-authored-by: Pete Roden <Pete.Roden@microsoft.com>
Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Fix#3600: Pass JSON schemas through without Pydantic conversion
This change optimizes FunctionTool and MCP flows by passing JSON schemas
directly to providers without converting them to Pydantic models first.
Key changes:
- Store JSON schema as-is when supplied to FunctionTool
- Skip Pydantic model_validate for schema-supplied tools in invoke()
- Return MCP tool schemas directly without conversion
- Add comprehensive tests for schema passthrough behavior
Performance benefits:
- Eliminates expensive Pydantic model creation for supplied schemas
- Preserves exact schema structure (additionalProperties, custom fields, etc.)
- Reduces memory overhead and initialization time
Maintains backward compatibility:
- Function signature inference still uses Pydantic models
- Explicit Pydantic models passed as input_model work as before
- All existing tests pass
* Fix schema passthrough validation and remove helper
* Simplify FunctionTool without generic model dependency
* Fix FunctionTool typing fallout in 3600
* Remove FunctionTool[Any] compatibility shim
* Use serializable kwargs in OTEL tool args
Eduard van Valkenburg
·
2026-02-14 10:12:21 +00:00
* Python: Replace wildcard imports with explicit imports
- Replace all 'from ... import *' with explicit symbol imports
- Add __all__ declarations to namespace packages for re-exports
- Update CODING_STANDARD.md to prohibit wildcard imports
- Maintain exported API and preserve all functionality
fixes#3605
* Refine wildcard guidance example text
* Simplify explicit exports without self-aliases
Eduard van Valkenburg
·
2026-02-13 14:02:36 +00:00