* Python: Fix AgentResponse.value being None when streaming workflow (#3970)
The streaming path in BaseAgent.run() used the raw 'options' parameter
(passed by the caller) to bind response_format into the outer stream's
finalizer. When response_format was set in default_options rather than
runtime options, it was missing from the finalizer and value was None.
Fix: Use the merged chat_options from the run context (via ctx_holder),
matching the non-streaming path which already uses ctx['chat_options'].
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address review feedback for #3970: safer ctx access, add test coverage
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* support script execution by code interpretor
* improve the instruction prompt
* Add DefaultAzureCredential production warning to AgentSkills samples
Add the standard three-line WARNING comment about DefaultAzureCredential
production considerations to both AgentSkills sample Program.cs files,
matching the convention used in all other GettingStarted/Agents samples.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* address pr review comments
* address feedback
* rename Skill* types to FileAgentSkill* prefix for consistency
- Rename SkillFrontmatter -> FileAgentSkillFrontmatter
- Rename SkillScriptExecutor -> FileAgentSkillScriptExecutor
- Add FileAgentSkillScriptExecutionContext and FileAgentSkillScriptExecutionDetails
- Update sample, provider, loader, and tests accordingly
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* reorder usings
* use set for props initialization instead of init
* rename HostedCodeInterpreterSkillScriptExecutor
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* 1. Add reproduction test for issue #4155: workflow.run Activity never stopped in streaming OffThread path
The WorkflowRunActivity_IsStopped_Streaming_OffThread test demonstrates that
the workflow.run OpenTelemetry Activity created in StreamingRunEventStream.RunLoopAsync
is started but never stopped when using the OffThread/Default streaming execution.
The background run loop keeps running after event consumption completes, so the
using Activity? declaration never disposes until explicit StopAsync() is called.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2. Fix workflow.run Activity never stopped in streaming OffThread execution (#4155)
The workflow.run OpenTelemetry Activity in StreamingRunEventStream.RunLoopAsync
was scoped to the method lifetime via 'using'. Since the run loop only exits on
cancellation, the Activity was never stopped/exported until explicit disposal.
Fix: Remove 'using' and explicitly dispose the Activity when the workflow reaches
Idle status (all supersteps complete). A safety-net disposal in the finally block
handles cancellation and error paths.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add root-level workflow.session activity spanning run loop lifetime\n\nImplements two-level telemetry hierarchy per PR feedback from lokitoth:\n- workflow.session: spans the entire run loop / stream lifetime\n- workflow_invoke: per input-to-halt cycle, nested within the session\n\nThis ensures the session activity stays open across multiple turns,\nwhile individual run activities are created and disposed per cycle.\n\nAlso fixes linkedSource CancellationTokenSource disposal leak in\nStreamingRunEventStream (added using declaration)."
* Address Copilot review: fix Activity/CTS disposal, rename activity, add error tag\n\n1. LockstepRunEventStream: Remove 'using' from Activity in async iterator\n and manually dispose in finally block (fixes#4155 pattern). Also dispose\n linkedSource CTS in finally to prevent leak.\n2. Tags.cs: Add ErrorMessage (\"error.message\") tag for runtime errors,\n distinct from BuildErrorMessage (\"build.error.message\").\n3. ActivityNames: Rename WorkflowRun from \"workflow_invoke\" to \"workflow.run\"\n for cross-language consistency.\n4. WorkflowTelemetryContext: Fix XML doc to say \"outer/parent span\" instead\n of \"root-level span\".\n5. ObservabilityTests: Assert WorkflowSession absence when DisableWorkflowRun\n is true.\n6. WorkflowRunActivityStopTests: Fix streaming test race by disposing\n StreamingRun before asserting activities are stopped.\n7. StreamingRunEventStream/LockstepRunEventStream: Use Tags.ErrorMessage\n instead of Tags.BuildErrorMessage for runtime error events."
* Review fixes: revert workflow_invoke rename, use 'using' for linkedSource, move SessionStarted earlier\n\n- Revert ActivityNames.WorkflowRun back to \"workflow_invoke\" (OTEL semantic convention contract)\n- Use 'using' declaration for linkedSource CTS in LockstepRunEventStream (no timing sensitivity)\n- Move SessionStarted event before WaitForInputAsync in StreamingRunEventStream to match Lockstep behavior"
* Improve naming and comments in WorkflowRunActivityStopTests"
* Prevent session Activity.Current leak in lockstep mode, add nesting test
Save and restore Activity.Current in LockstepRunEventStream.Start() so the
session activity doesn't leak into caller code via AsyncLocal. Re-establish
Activity.Current = sessionActivity before creating the run activity in
TakeEventStreamAsync to preserve parent-child nesting.
Add test verifying app activities after RunAsync are not parented under the
session, and that the workflow_invoke activity nests under the session."
* Fix stale XML doc: WorkflowRun -> WorkflowInvoke in ObservabilityTests
---------
Co-authored-by: alliscode <bentho@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Phase 2: Embedding clients for Ollama, Bedrock, and Azure AI Inference
Add embedding client implementations to existing provider packages:
- OllamaEmbeddingClient: Text embeddings via Ollama's embed API
- BedrockEmbeddingClient: Text embeddings via Amazon Titan on Bedrock
- AzureAIInferenceEmbeddingClient: Text and image embeddings via Azure AI
Inference, supporting Content | str input with separate model IDs for
text (AZURE_AI_INFERENCE_EMBEDDING_MODEL_ID) and image
(AZURE_AI_INFERENCE_IMAGE_EMBEDDING_MODEL_ID) endpoints
Additional changes:
- Rename EmbeddingCoT -> EmbeddingT, EmbeddingOptionsCoT -> EmbeddingOptionsT
- Add otel_provider_name passthrough to all embedding clients
- Register integration pytest marker in all packages
- Add lazy-loading namespace exports for Ollama and Bedrock embeddings
- Add image embedding sample using Cohere-embed-v3-english
- Add azure-ai-inference dependency to azure-ai package
Part of #1188
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix mypy duplicate name and ruff lint issues
- Rename second 'vector' variable to 'img_vector' in image embedding loop
- Combine nested with statements in tests
- Remove unused result assignments in tests
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* updates from feedback
* Fix CI failures in embedding usage handling
- Fix Azure AI embedding mypy issues by normalizing vectors to list[float],
safely accumulating optional usage token fields, and filtering None entries
before constructing GeneratedEmbeddings
- Avoid Bandit false positive by initializing usage details as an empty dict
- Update OpenAI embedding tests to assert canonical usage keys
(input_token_count/total_token_count)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-25 17:45:08 +00:00
- Bump RCNumber from 1 to 2
- Update GitTag to 1.0.0-rc2
- Update preview date stamps from 260219 to 260225
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* small updates and improvements in the azure AISearch provider
* Fix mypy errors and embedding function test
- Use separate variable for embeddings result to avoid mypy type reassignment error
- Fix test_vectorized_query_with_embedding_function: use real async function
instead of AsyncMock which falsely matches SupportsGetEmbeddings protocol
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fixes from feedback
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-25 06:47:26 +00:00
* fix: use HasSchema check in DetermineElementType to prevent empty records
When parsing JSON arrays containing objects without a predefined schema,
`DetermineElementType()` was creating a `VariableType` with an empty
(non-null) schema via `targetType.Schema?.Select(...) ?? []`. This caused
`ParseRecord` to take the schema-based parsing path, iterating over zero
schema fields and silently discarding all JSON properties.
The fix checks `targetType.HasSchema` and falls back to
`VariableType.RecordType` (which has `Schema = null`) when no schema is
defined, ensuring `ParseRecord` takes the dynamic `ParseValues()` path
that preserves all JSON properties.
Closes#4195
* test: add regression tests for schema-less JSON array-of-objects parsing (#4195)
Add two regression tests to JsonDocumentExtensionsTests:
1. ParseRecord_ObjectWithArrayOfObjects_NoSchema_PreservesNestedProperties
- Parses a JSON object containing an array of objects using
VariableType.RecordType (no schema) and verifies that nested
object properties (name, role) are preserved in each element.
- This is the exact scenario from issue #4195 where objects in
arrays were being returned as empty dictionaries.
2. ParseList_ArrayOfObjects_NoSchema_PreservesProperties
- Parses a JSON array of objects directly via ParseList with
VariableType.ListType (no schema) and verifies all properties
are preserved.
Both tests follow the existing Arrange/Act/Assert pattern and would
have failed before the DetermineElementType() fix (empty dictionaries
instead of populated ones).
* Fix thread corruption when max_iterations exhausted (#1366)
When the function invocation loop exhausts max_iterations while the model
keeps requesting tools, the failsafe code path (calling the model with
tool_choice='none' and prepending fcc_messages) was unreachable because
'if response is not None: return response' short-circuited before it.
The fix removes the premature return so the failsafe always runs after
loop exhaustion, making a final model call with tool_choice='none' to
produce a clean text answer and prepending accumulated fcc_messages from
prior iterations. This matches the existing pattern used by the error
threshold and max_function_calls paths.
Also unskips test_max_iterations_limit and test_streaming_max_iterations_limit
which were previously skipped with 'needs investigation in unified API'.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add fix report for issue #1366
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix ruff formatting in _tools.py and test_issue_1366_thread_corruption.py
Apply ruff format to fix multi-line string concatenation and function call
formatting issues flagged by the linter.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add quality review for issue #1366 fix
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Remove temporary investigation docs.
* Address PR review: explicit enabled check in log condition, clarify mock behavior in test
- Add explicit function_invocation_configuration['enabled'] check to the
'Maximum iterations reached' log condition in both non-streaming and
streaming paths, making intent clearer when function invocation is disabled.
- Add comment in test_thread_safe_after_max_iterations_with_agent explaining
that the failsafe response (tool_choice='none') is provided automatically
by the mock client, not from run_responses.
* Blend fix and tests into project without issue-specific callouts
- Remove issue #1366 references from _tools.py comments
- Move regression tests from standalone test_issue_1366_thread_corruption.py
into test_function_invocation_logic.py alongside existing max_iterations tests
- Clean up test docstrings to describe behavior generically
- Delete the standalone issue-specific test file
---------
Co-authored-by: alliscode <bentho@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: prevent doubled tool_call arguments in MESSAGES_SNAPSHOT
When streaming with client-side tools, some providers send a full-
arguments replay after the streaming deltas complete. The `_emit_tool_call`
function unconditionally appends every arguments delta to the internal
`flow.tool_calls_by_id` tracking dictionary via `+=`. When the replay
contains the exact same complete arguments string that was already
accumulated from prior deltas, the arguments get doubled (e.g.,
`{"todoText":"buy groceries"}{"todoText":"buy groceries"}`).
This causes `MESSAGES_SNAPSHOT` events to contain invalid doubled JSON in
`tool_calls[].function.arguments`, breaking any client or middleware that
relies on snapshots for state reconstruction.
The fix adds a guard (mirroring the existing duplicate guard in
`_emit_text`) that detects when the incoming delta exactly equals the
already-accumulated arguments string, indicating a full-arguments replay
rather than an incremental delta. In this case the append is skipped,
preventing the doubling.
The `ToolCallArgsEvent` deltas are still emitted correctly for real-time
streaming — only the internal snapshot accumulator is guarded.
Fixes#4194
* fix: move duplicate check before event emission + add test
Address Copilot review feedback:
1. Move duplicate full-arguments replay detection BEFORE emitting
ToolCallArgsEvent, for consistency with _emit_text() which returns
early without emitting any events on replay detection.
2. Add test_emit_tool_call_skips_duplicate_full_arguments_replay() to
verify the duplicate detection behavior for tool call arguments,
matching the existing test pattern for text content.
* updated integration tests and guidance
* fixed merge test
* updated integration tests
* fix: remove duplicate --dist loadfile flag from pytest-xdist config
Only one --dist mode can be active at a time; the second value silently
overrides the first. Keep --dist worksteal (dynamic load balancing) and
remove the redundant --dist loadfile from all workflow files and
pyproject.toml configs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* docs: add keep-in-sync notes for merge and integration test workflows
Both python-merge-tests.yml and python-integration-tests.yml share the
same parallel job structure. Added sync reminders in workflow file
comments, the python-testing SKILL.md, and CODING_STANDARD.md.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: remove RUN_INTEGRATION_TESTS flag
Integration test gating now uses two mechanisms:
- `@pytest.mark.integration` for test selection via `-m` filtering
- `skip_if_*_disabled` for credential/service availability checks
The RUN_INTEGRATION_TESTS env var was redundant since the marker handles
selection and the skip decorators already check for actual credentials.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: sync missing env vars from merge-tests to integration-tests
Add OPENAI_EMBEDDINGS_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
to python-integration-tests.yml to match python-merge-tests.yml.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: remove remaining RUN_INTEGRATION_TESTS from embedding tests and docs
Missed test_openai_embedding_client.py and vector-stores README in the
earlier cleanup.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* set functions tests to 3.10
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-24 09:35:46 +00:00
* feat(python): Add embedding abstractions and OpenAI implementation (Phase 1)
This PR contains two parts:
1. **Overall migration plan** for porting vector stores and embeddings from
Semantic Kernel to Agent Framework (docs/features/vector-stores-and-embeddings/README.md)
covering all 10 phases from core abstractions through connectors and TextSearch.
2. **Phase 1 implementation** — core embedding abstractions and OpenAI/Azure OpenAI
embedding clients:
Core types (_types.py):
- EmbeddingGenerationOptions TypedDict (total=False)
- Embedding[EmbeddingT] generic class with model_id, dimensions, created_at
- GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT] list container with options, usage
- EmbeddingInputT (default str) and EmbeddingT (default list[float]) TypeVars
Protocol + base class (_clients.py):
- SupportsGetEmbeddings protocol — Generic[EmbeddingInputT, EmbeddingT, OptionsContraT]
- BaseEmbeddingClient ABC — Generic[EmbeddingInputT, EmbeddingT, OptionsCoT]
Telemetry (observability.py):
- EmbeddingTelemetryLayer with gen_ai.operation.name = "embeddings"
OpenAI implementation (openai/_embedding_client.py):
- RawOpenAIEmbeddingClient, OpenAIEmbeddingClient, OpenAIEmbeddingOptions
- Uses _ensure_client() factory pattern
Azure OpenAI implementation (azure/_embedding_client.py):
- AzureOpenAIEmbeddingClient following AzureOpenAIChatClient pattern
- Supports API key, Entra ID credentials, env var configuration
Tests:
- 47 unit tests for types, protocol, base class, OpenAI, and Azure clients
- 6 integration tests (gated behind RUN_INTEGRATION_TESTS + credentials)
Samples:
- samples/02-agents/embeddings/openai_embeddings.py
- samples/02-agents/embeddings/azure_openai_embeddings.py
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: Add AzureOpenAIEmbeddingClient to azure __init__.pyi stub
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci: Add embedding env vars to Python integration tests
Map OPENAI_EMBEDDING_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
from GitHub vars to the integration test environment.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: Handle base64 encoding_format in OpenAI embedding client
When encoding_format='base64' is used, the OpenAI API returns base64-encoded
floats instead of a JSON array. Decode these automatically to list[float]
so the return type stays consistent regardless of encoding format.
Also adds a unit test for base64 decoding and fixes minor docstring/import issues.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: Only record INPUT_TOKENS for embedding telemetry
Embeddings have no output/completion tokens. Remove OUTPUT_TOKENS recording
which was double-counting prompt_tokens via the total_tokens fallback.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: Resolve mypy variance error and lint warning
Use contravariant/covariant TypeVars for SupportsGetEmbeddings Protocol.
Combine nested if into single statement in telemetry layer.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: Make EmbeddingCoT invariant for mypy compatibility
GeneratedEmbeddings is invariant in its type param, so the Protocol
TypeVar cannot be covariant.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: Address PR review - empty values guard, service_url for telemetry
- Add early return for empty values in get_embeddings to avoid unnecessary API calls
- Add service_url() method to RawOpenAIEmbeddingClient for proper telemetry endpoint reporting
- Add test for empty values behavior
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Python: Fix OpenAI chat client compatibility with third-party endpoints and OTel 0.4.14 (#4161)
* Fix system message content sent as list instead of string
Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages
when content is a list of content parts. This change flattens system and
developer message content to a plain string in the Chat Completions client.
Fixes https://github.com/microsoft/agent-framework/issues/1407
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14
Version 0.4.14 removed several LLM_* attributes from SpanAttributes
(LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS,
LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE).
Move these to the OtelAttr enum with their well-known gen_ai.* string values
and update all references in observability.py and tests.
Fixes https://github.com/microsoft/agent-framework/issues/4160
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Flatten text-only message content to string for all roles
Extend the system/developer fix to all message roles. Text-only content
lists are now post-processed into plain strings, while multimodal content
(text + images/audio) remains as a list. This fixes compatibility with
OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry
Local's Neutron backend).
Partially fixes https://github.com/microsoft/agent-framework/issues/4084
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix streaming text lost when usage data in same chunk
Some providers (e.g. Gemini) include both usage data and text content
in the same streaming chunk. The early return on chunk.usage caused
text and tool call parsing to be skipped entirely. Remove the early
return and process usage alongside text/tool calls.
Fixes https://github.com/microsoft/agent-framework/issues/3434
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix mypy errors in _chat_client.py
Rename shadowed variable 'args' in system/developer branch to 'sys_args'
and rename loop variable 'content' to 'msg_content' to avoid type conflict.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* reorder imports
* fix: Use OtelAttr.REQUEST_MODEL instead of removed SpanAttributes.LLM_REQUEST_MODEL
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* docs: Add score_threshold to vector store plan
Reference SK .NET PR #13501 for score threshold filtering semantics.
Include score_threshold in SearchOptions from Phase 3.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* docs: Add reference to roji's SK .NET MEVD work for SQL connectors
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: Clear env vars in construction tests to avoid CI leakage
Tests for missing API key / model ID now use monkeypatch.delenv to ensure
env vars from the integration test environment don't prevent the expected
ValueError from being raised.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-24 07:40:20 +00:00
* Python: Enhance Azure AI Search citations with document URLs in Foundry V2 (Responses API)
Override _parse_response_from_openai and _parse_chunk_from_openai in
RawAzureAIClient to extract get_urls from azure_ai_search_call_output
items and enrich url_citation annotations with document-specific URLs.
- Non-streaming: first pass collects get_urls, post-processes annotations
- Streaming: captures search output state, enriches url_citation events
(also handles url_citation annotation type not handled by base class)
- Updated V2 sample to demonstrate citation URL extraction
- Added 14 unit tests covering extraction, enrichment, and edge cases
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: rework search citation enrichment to override _inner_get_response
- Remove all direct openai/pydantic imports from _client.py
- Override _inner_get_response instead of _parse_response_from_openai/_parse_chunk_from_openai
- Use closure-local state for streaming instead of instance-level _streaming_search_get_urls
- Add _build_url_citation_content helper for streaming url_citation handling
- Fix mypy errors by using str(value or '') for Annotation TypedDict fields
- Fix docstring to say 'citation' instead of 'url_citation'
- Update tests to match new approach
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: handle streaming search citations from output_item.done events
The azure_ai_search_call_output item only has populated output data
(including get_urls) in the response.output_item.done event, not in
the response.output_item.added event. Also removed the search_get_urls
guard on url_citation handling so annotations are always produced even
if get_urls haven't been captured yet.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* addressed comments
* refactor: address PR review - eliminate type: ignore[assignment] pattern
Call super()._inner_get_response() independently in each branch instead
of once at the top with union type reassignment. Non-streaming uses
two-arg super() in the closure; streaming uses cast() for type narrowing.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: remove defensive patterns per PR review
- Replace all getattr() with direct attribute access
- Remove cast() for streaming branch, use type: ignore[assignment]
- Simplify _build_url_citation_content to use dict access directly
- Simplify _extract_azure_search_urls to use item.type/item.output
- Handle empty list output from streaming 'added' events
- Update tests to match actual runtime types (objects, not dicts)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* mypy fix
* small fixes
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add max_function_calls to FunctionInvocationConfiguration (#2329)
Add a new per-request max_function_calls setting to FunctionInvocationConfiguration
that limits the total number of individual function invocations across all iterations
within a single get_response call. This complements max_iterations (which limits LLM
roundtrips) by providing a hard cap on actual tool executions regardless of parallelism.
- Add max_function_calls field to FunctionInvocationConfiguration (default: None/unlimited)
- Track cumulative function call count in both streaming and non-streaming tool loops
- Force tool_choice='none' when the limit is reached
- Add validation in normalize_function_invocation_configuration
- Improve docstrings for FunctionInvocationConfiguration, FunctionTool, and @tool
to clarify semantics of max_iterations vs max_function_calls vs max_invocations
- Add tests for parallel calls, single calls, unlimited mode, and config validation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add sample for controlling total tool executions
Showcases all three mechanisms for limiting tool executions:
1. max_iterations — caps LLM roundtrips
2. max_function_calls — caps total individual function invocations per request
3. max_invocations — lifetime cap on a specific tool instance
Plus a combined scenario demonstrating defense in depth.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Suppress ruff E305/fmt in hosting sample to preserve XML doc tags
The XML snippet tags (# <create_agent> / # </create_agent>) are used for
docs extraction and must stay adjacent to the code they wrap. Both ruff
check (E305) and ruff format add blank lines after the function definition,
pushing the closing tag away. Suppress with ruff: noqa: E305 and fmt: off.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add per-agent tool wrapping scenario to control_total_tool_executions sample
Show that wrapping the same callable with @tool multiple times creates
independent FunctionTool instances with separate invocation counters,
enabling per-agent max_invocations budgets for shared functions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Clarify max_function_calls is a best-effort limit
The limit is checked after each batch of parallel calls completes, so the
current batch always runs to completion even if it overshoots the limit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address PR review: fix docstring reference, clarify best-effort in sample
- Fix malformed Sphinx :attr: role in FunctionTool docstring — use plain
backtick reference instead
- Update sample to say 'best-effort cap' instead of 'hard cap' for
max_function_calls, noting it's checked between iterations
- Parametrize pattern is correct (fixture override, matching existing tests)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* clarify max_invocations limits
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-24 01:00:25 +00:00
* Fix structured_output propagation in ClaudeAgent
Capture structured_output from ResultMessage in _get_stream() and
propagate it to AgentResponse.value via a custom finalizer. Previously
structured_output was silently discarded, making output_format unusable.
Fixes#4095
* Address review feedback: use value parameter instead of private properties
- Extend AgentResponse.from_updates() to accept optional value parameter
- Remove structured_output yield from _get_stream()
- Update _finalize_response() to pass value via public API
- Update streaming test to use get_final_response()
* Fix mypy errors: add value parameter to from_updates overloads
Add value parameter to both @overload signatures of
AgentResponse.from_updates() so mypy recognizes the argument.
---------
Co-authored-by: Amit Mukherjee <amimukherjee@microsoft.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
* .NET: Add Web Search sample #3674
* .NET: Fix WebSearch sample to use Responses API built-in web search
Remove incorrect Bing Grounding connection ID requirement from the
WebSearch sample. The web search tool uses the OpenAI Responses API
built-in capability and does not need a connection ID.
- Remove AZURE_FOUNDRY_BING_CONNECTION_ID env var requirement
- Use HostedWebSearchTool() without connectionId properties
- Refactor creation options into local functions (MEAI + NativeSDK)
- Switch from AzureCliCredential to DefaultAzureCredential
- Update README to reflect correct prerequisites
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix README to align DefaultAzureCredential docs with code
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address review: add project to solution, README, simplify response text
- Add FoundryAgents_Step25_WebSearch to agent-framework-dotnet.slnx
- Add web search sample entry to parent FoundryAgents README.md
- Simplify text response extraction to use response.Text directly
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix merge conflict in slnx solution file
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When converting base AgentRunOptions to ChatClientAgentRunOptions, the middleware
now preserves AllowBackgroundResponses, ContinuationToken, and AdditionalProperties
in addition to ResponseFormat.
Added unit test verifying all properties are preserved during the conversion.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Updated merge test permissions
* Removed repo check
* Added fetch from main for comparison
* Updated path detection logic
* Small updates
* Reverted file rename
* Created dedicated workflows for integration tests
* Small fix for Python
* Small fixes
* Small update
* Small update
* Added tests check for Python
* Add ChatClient decorator for calling AIContextProviders
* Format new files
* Address PR comments
* Revert problematic change
* Rename Use to UseAIContextProvider
* fix Workflow.as_agent() streaming regression in ag-ui
* Address PR feedback
* workflows wip
* wip
* wip
* Workflow AG-UI demo
* Fixes for handoff workflow demo
* Fixes to workflows support in AG-UI
* Fixes
* Add headers to some demo files
* Fix comment
* Fixes for store
* Make _input_schema lazy-loaded
* fix mypy
* revert session change to handoff only for now
---------
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
* Fix system message content sent as list instead of string
Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages
when content is a list of content parts. This change flattens system and
developer message content to a plain string in the Chat Completions client.
Fixes https://github.com/microsoft/agent-framework/issues/1407
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14
Version 0.4.14 removed several LLM_* attributes from SpanAttributes
(LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS,
LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE).
Move these to the OtelAttr enum with their well-known gen_ai.* string values
and update all references in observability.py and tests.
Fixes https://github.com/microsoft/agent-framework/issues/4160
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Flatten text-only message content to string for all roles
Extend the system/developer fix to all message roles. Text-only content
lists are now post-processed into plain strings, while multimodal content
(text + images/audio) remains as a list. This fixes compatibility with
OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry
Local's Neutron backend).
Partially fixes https://github.com/microsoft/agent-framework/issues/4084
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix streaming text lost when usage data in same chunk
Some providers (e.g. Gemini) include both usage data and text content
in the same streaming chunk. The early return on chunk.usage caused
text and tool call parsing to be skipped entirely. Remove the early
return and process usage alongside text/tool calls.
Fixes https://github.com/microsoft/agent-framework/issues/3434
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix mypy errors in _chat_client.py
Rename shadowed variable 'args' in system/developer branch to 'sys_args'
and rename loop variable 'content' to 'msg_content' to avoid type conflict.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg
·
2026-02-23 10:05:36 +00:00