agent-framework

Add azure-monitor-opentelemetry to dev deps

Fixes Samples & Markdown CI failure. The PR's new transitive dep on
azure-monitor-opentelemetry-exporter (via azure-ai-agentserver-core) makes
pyright resolve the azure.monitor.opentelemetry namespace, flipping the
check_md_code_blocks diagnostic for `configure_azure_monitor` from
reportMissingImports (filtered) to reportAttributeAccessIssue (not filtered).
Installing the umbrella azure-monitor-opentelemetry package in dev makes
pyright resolve the symbol correctly, matching the install guidance the
observability README already gives users.

Evan Mattson · 2026-04-21 13:50:12 +09:00

8b0ef62802

Fix pre commit 6

Tao Chen · 2026-04-20 20:36:33 -07:00

49677ba789

Fix pre commit 5

Tao Chen · 2026-04-20 20:30:25 -07:00

7f751e7a0f

Fix pre commit 4

Tao Chen · 2026-04-20 20:27:26 -07:00

01d8a8af53

Fix pre commit 3

Tao Chen · 2026-04-20 18:47:15 -07:00

9d2a55ecfb

Fix pre commit 2

Tao Chen · 2026-04-20 18:42:11 -07:00

cbe3e8fd95

Fix pre commit

Tao Chen · 2026-04-20 18:39:56 -07:00

93b03140c7

Fix README

Tao Chen · 2026-04-20 18:37:24 -07:00

7aa40b16de

Merge branch 'main' into feature/python-foundry-hosted-agent-vnext

Tao Chen · 2026-04-20 18:35:09 -07:00

fc9194dcb6

User agent scoped

Tao Chen · 2026-04-20 18:34:30 -07:00

fd36871d60

Comments and mypy

Tao Chen · 2026-04-20 18:11:32 -07:00

e24d72be75

Fix README

Tao Chen · 2026-04-20 17:54:44 -07:00

8b77baf4a2

Python: Add more types (#5378)

* Add more type supports

* Upgrade packages

* Remove TODOs in README

Tao Chen · 2026-04-20 17:46:06 -07:00

cd48c1424c

Python: Add support for Foundry Toolboxes (#5346)

* Add support for the Foundry Toolbox in MAF

Introduces a Foundry Toolbox integration: FoundryChatClient gains a
get_toolbox() helper plus select_toolbox_tools(), normalize_tools in
the core package flattens tool-collection wrappers (ToolboxVersionObject
and generic iterables, while leaving Pydantic BaseModel instances
alone), and the new agent_framework.foundry namespace re-exports the
toolbox helpers. Ships with unit tests, a sample, and a design doc.
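
The flattening behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual normalize_tools implementation: `flatten_tools`, `FakeModel`, and `ATOMIC_TYPES` are hypothetical names, and `FakeModel` only stands in for a Pydantic BaseModel (which is iterable but must be kept whole).

```python
from collections.abc import Iterable
from typing import Any


class FakeModel:
    """Hypothetical stand-in for a Pydantic BaseModel: iterable, but one tool."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __iter__(self):
        # Pydantic models iterate over (field, value) pairs.
        yield ("name", self.name)


# Assumption: iterable types that must be treated as a single tool.
ATOMIC_TYPES: tuple = (str, bytes, dict, FakeModel)


def flatten_tools(tools: Any) -> list:
    """Flatten nested tool collections into one flat list."""
    if tools is None:
        return []
    # Atomic or non-iterable values are a single tool.
    if isinstance(tools, ATOMIC_TYPES) or not isinstance(tools, Iterable):
        return [tools]
    flat: list = []
    for item in tools:
        flat.extend(flatten_tools(item))
    return flat
```

Checking the atomic types before the generic iterable check is what keeps model instances from being unpacked into their fields.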

azure-ai-projects is pinned to the public >=2.0.0,<3.0 range and the
lockfile resolves from public PyPI. The toolbox test module skips when
Toolbox* types are unavailable so CI stays green until the public 2.1.0
SDK lands. OMC tooling directories (.omc/, .omx/) are gitignored.

* Update to latest azure ai projects package

* Improve sample

* Rename ADR to 0025

* Update ADR

* Apply suggestion from @alliscode

Co-authored-by: Ben Thomas <ben.thomas@microsoft.com>

* Improve samples

* Update test

---------

Co-authored-by: Ben Thomas <ben.thomas@microsoft.com>

Evan Mattson · 2026-04-20 23:56:01 +00:00

04aaf0c1fe

Improve samples (#5372)

Tao Chen · 2026-04-20 16:34:53 -07:00

8bc7c3a7a8

Python: Add search tool content for OpenAI responses (#5302)

* Add OpenAI search tool content parsing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix typing

* simplified oai image test

* same for azure

* skip az responses api test

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-20 13:35:30 +00:00

3e54a689fc

.NET: Features/3768-devui-aspire-integration (#3771)

* adds devui integration and samples

* adds unit tests for devui integration

* fix: correct formatting of copyright notice in unit test files

* fixes formatting issues

* fixes build for net8 target

* fixes formatting errors on test apphost

* adds copyright notice to multiple files and removes unnecessary using directives

* Update dotnet/aspire-integration/Aspire.Hosting.AgentFramework.DevUI/DevUIAggregatorHostedService.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/aspire-integration/Aspire.Hosting.AgentFramework.DevUI/DevUIAggregatorHostedService.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/tests/Aspire.Hosting.AgentFramework.DevUI.UnitTests/Aspire.Hosting.AgentFramework.DevUI.UnitTests.csproj

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/samples/DevUIIntegration/DevUIIntegration.AppHost/DevUIIntegration.AppHost.csproj

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/aspire-integration/Aspire.Hosting.AgentFramework.DevUI/DevUIAggregatorHostedService.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Refactor project files to use TargetFrameworks instead of TargetFramework for multi-targeting support; add optional port property to DevUIResource class.

* Add unit tests for DevUIAggregatorHostedService; refactor project files for TargetFrameworks support

* Refactor project files to use TargetFrameworks for multi-targeting support in DevUIIntegration samples

* Remove unnecessary using directive for Aspire.Hosting in DevUIAggregatorHostedServiceTests

* merge

* fixes Conversation routing for non-first backends

* add documentation for devui integration sample

* update project references in solution file for improved integration

* fixes package versions post merge

* move Aspire.Hosting.AgentFramework.DevUI to dotnet/src

Move the project from aspire-integration/ to src/ to be consistent
with the location of all other projects in the repo.

* move DevUI sample to samples/05-end-to-end/DevUIAspireIntegration

Move the sample from samples/DevUIIntegration/ to
samples/05-end-to-end/DevUIAspireIntegration/ to match the location
of other end-to-end samples.

* remove unnecessary net472 framework condition from sample csproj files

These projects only target net10.0, so the
Condition="'$(TargetFramework)' != 'net472'" on ItemGroup is unnecessary.

* update sample model name from gpt-4.1 to gpt-5.4

Use a more up-to-date model name in the DevUI integration samples.

* Revert "remove unnecessary net472 framework condition from sample csproj files"

This reverts commit 08cf41253b.

* fix: use TargetFrameworks to override multi-targeting from Directory.Build.props

The parent Directory.Build.props sets TargetFrameworks to net10.0;net472,
which overrides the singular TargetFramework in each csproj. Use the plural
TargetFrameworks property set to net10.0 only to properly override it, and
remove the now-unnecessary net472 condition on ItemGroup.

* fixes aspire config

* fix: update Microsoft.Extensions packages to version 10.0.1

* Address Copilot review feedback on DevUI Aspire integration

- Fix request body dropping in ProxyConversationsAsync: always read the
  body when ContentLength > 0 before routing, then pass it through to
  all proxy calls (previously null was passed when backend was resolved
  from query param or conversation map)
- Fix resource leak: dispose aggregator on startup failure in catch block
- Fix XML docs: accurately describe embedded resource serving behavior
- Remove reflection from DevUIResourceTests (InternalsVisibleTo already set)
- Make sensitive telemetry conditional on Development environment in samples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: update chat client version to gpt41 in both EditorAgent and WriterAgent

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Roger Barreto <19890735+rogerbarreto@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Tommaso Stocchi · 2026-04-20 11:12:54 +00:00

60af59ba8b

Python: Flatten hyperlight execute_code output (#5333)

* small fix for hyperlight

* improved sandbox dependency

Eduard van Valkenburg · 2026-04-20 08:29:40 +00:00

69894eded8

Python: Fix CopilotStudioAgent to reuse conversation ID from existing session (#5299)

* Fix CopilotStudioAgent to reuse existing conversation on session (#5285)

CopilotStudioAgent unconditionally called _start_new_conversation() in both
_run_impl and _run_stream_impl, ignoring any existing service_session_id on
the session. Add a guard to only start a new conversation when there is no
existing service_session_id, matching the pattern used by other agents.
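
A minimal sketch of the guard described above, under stated assumptions: `Session` and `SketchAgent` are hypothetical stand-ins, and only the `service_session_id` check mirrors the commit message.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical session holder; only the ID field matters here."""

    service_session_id: Optional[str] = None


class SketchAgent:
    """Hypothetical agent that counts how many conversations it starts."""

    def __init__(self) -> None:
        self.started = 0

    async def _start_new_conversation(self) -> str:
        self.started += 1
        return f"conv-{self.started}"

    async def run(self, session: Session) -> str:
        # Guard: start a conversation only when the session has none yet.
        if not session.service_session_id:
            session.service_session_id = await self._start_new_conversation()
        return session.service_session_id
```

Running the agent twice with the same session should then reuse the first conversation ID rather than starting a second conversation.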

Also fix pre-existing pyright reportMissingImports errors for orjson in
file_history_provider samples.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert out-of-scope sample file changes

Remove unrelated orjson type-ignore comment changes from sample files
that were outside the scope of the conversation-ID reuse fix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-04-20 03:30:54 +00:00

495e1dad6b

.NET: fix: Add session support for Handoff-hosted Agents (#5280)

* fix: Add session support for Handoff-hosted Agents

To better support `Workflows` hosted as `AIAgents` inside Handoff workflows, we need to make proper use of AgentSession. Doing so raises potential issues around checkpointing and around correctly computing only the new incoming messages for each agent invocation.

* fix: AgentSession checkpointing using AIAgent's Serialize/Deserialize methods

We cannot rely on implicit serialization through `HandoffHostState` because we are missing type information.

* fix: Thread safety issue in `MultiPartyConversation.AllMessages`

* fix: Enable unwrapping of FunctionResultContent when ExternalRequest was wrapped into FunctionCallContent

Jacob Alber · 2026-04-17 20:15:27 +00:00

5777ed26e6

.NET: Add Code Interpreter container file download samples (#5014)

* Add Code Interpreter container file download samples (#3081)

- Add Agent_OpenAI_Step06_CodeInterpreterFileDownload (Public OpenAI)
- Add Agent_Step24_CodeInterpreterFileDownload (Microsoft Foundry)
- Both samples demonstrate downloading cfile_/cntr_ container files
  via ContainerClient instead of the standard Files API
- Update solution file and parent READMEs

* Address review feedback: flatten nested foreach loops using SelectMany

Addresses https://github.com/microsoft/agent-framework/pull/5014#discussion_r3046908449 and https://github.com/microsoft/agent-framework/pull/5014#discussion_r3046920209

---------

Co-authored-by: Roger Barreto <19890735+rogerbarreto@users.noreply.github.com>
Co-authored-by: rogerbarreto <rogerbarreto@users.noreply.github.com>

chetantoshniwal · 2026-04-17 16:55:03 +00:00

52303a8d07

.NET: Fix declarative resume edge predicates to recognize both direct and PortableValue-wrapped forms after checkpoint restore (#5323)

* Fix declarative workflows edge predicates after checkpoint restore

* Update test names to make them clearer and more discoverable.

* Update dotnet/tests/Microsoft.Agents.AI.Workflows.Declarative.UnitTests/Kit/PortableValuePredicateTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update dotnet/tests/Microsoft.Agents.AI.Workflows.Declarative.UnitTests/Kit/PortableValuePredicateTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Peter Ibekwe · 2026-04-17 15:18:15 +00:00

c85d24da44

Python: Add special handling for workflows (#5298)

* Add special handling for workflows

* Address comments

Tao Chen · 2026-04-16 17:55:45 -07:00

0fcd71dbeb

Python: Add Hyperlight CodeAct package and docs (#5185)

* initial work on code_mode

* updated samples

* updates to codeact

* updated codeact

* Draft CodeAct ADR and sample updates

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* initial implementation and adr and feature

* Python: Limit Hyperlight wasm backend to Python <3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix CI for Hyperlight CodeAct PR

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Run Hyperlight integration when available

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Address Hyperlight review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Simplify Hyperlight file mount inputs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Accept Path host paths in Hyperlight mounts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix Hyperlight mount typing for CI

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* temp run integration test

* Python: Strengthen Hyperlight real sandbox tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* added additional tests

* Python: Simplify Hyperlight CodeAct API

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* set tests as non-integration

* Retry Hyperlight allowed-domain registration

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Gate Hyperlight integration tests by runtime support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight skip test on Python 3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Delay Hyperlight runtime probe until test execution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Relax Hyperlight Windows integration stdout assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Scan Hyperlight output directory for artifacts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight output artifact collection

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Harden Hyperlight integration output assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight read-back check in integration test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify Hyperlight integration write assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Avoid pathlib in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use socket network check in Hyperlight sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Replace blocked Azure AI Search blog link

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Clarify Hyperlight guest stdlib limits

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use _socket in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Handle Hyperlight mounted file paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Broaden Hyperlight sandbox path fallbacks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Search Hyperlight guest mounts recursively

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight mount coverage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight live network tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight file-write test on Windows

Enable the sandbox filesystem by providing a workspace_root so
/output is mounted. Remove os.path.exists assertion (unsupported
in WASM guest) and fix Content data assertion to use .uri.
Skip the network integration test on Windows where the WASM
sandbox lacks the encodings.idna codec.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: ADR intro, manual wiring sample, doc clarifications

- Add CodeAct introduction section to ADR for unfamiliar readers
- Clarify 'less runtime efficient' con with specific overhead description
- Add note in Python impl doc clarifying ADR vs impl doc split
- Explain why before_run hooks must be per-run (CRUD, concurrency, approval)
- Rename code_interpreter variable to codeact in E2E sample
- Add manual static wiring sample (codeact_manual_wiring.py)
- Add 'when to use which pattern' guidance to samples README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #5185 review comments and add .NET CodeAct design doc

- Fix async callback: _make_sandbox_callback returns sync wrapper with
  thread + asyncio.run() bridge (was broken with real Wasm FFI)
- Fix stale output: clear output_dir before each sandbox.run() call
- Fix blocking event loop: _run_code now async with asyncio.to_thread()
- Revert _agents.py options['tools'] injection (unnecessary; provider
  uses context.extend_tools())
- Revert SessionContext.options docstring back to read-only
- Add real-sandbox test fixtures (shared/restored/fresh)
- Add 8 new real-sandbox tests for callback round-trip, stale output,
  event loop non-blocking, basic execution, stdout/stderr, errors,
  snapshot/restore, and tool registration
- Add comprehensive .NET HyperlightCodeActProvider design document
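
The async-callback and event-loop fixes listed above follow a common bridging pattern, sketched here under stated assumptions. `make_sync_callback`, `add_tool`, `blocking_sandbox_run`, and `run_code` are hypothetical names, not the package's private helpers; only `threading.Thread`, `asyncio.run()`, and `asyncio.to_thread()` are the standard-library pieces the commit mentions.

```python
import asyncio
import threading
from collections.abc import Callable, Coroutine
from typing import Any


def make_sync_callback(
    async_fn: Callable[..., Coroutine[Any, Any, Any]],
) -> Callable[..., Any]:
    """Wrap an async function so synchronous (e.g. FFI) code can call it."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        outcome: dict = {}

        def runner() -> None:
            # A fresh thread has no running loop, so asyncio.run() is safe.
            try:
                outcome["value"] = asyncio.run(async_fn(*args, **kwargs))
            except BaseException as exc:  # re-raised in the calling thread
                outcome["error"] = exc

        thread = threading.Thread(target=runner)
        thread.start()
        thread.join()
        if "error" in outcome:
            raise outcome["error"]
        return outcome["value"]

    return wrapper


async def add_tool(a: int, b: int) -> int:
    """Hypothetical async host tool."""
    await asyncio.sleep(0)
    return a + b


def blocking_sandbox_run(callback: Callable[..., Any]) -> int:
    # Stand-in for a blocking sandbox call that invokes a host callback.
    return callback(2, 3)


async def run_code() -> int:
    # Off-load the blocking call so the event loop stays responsive.
    return await asyncio.to_thread(blocking_sandbox_run, make_sync_callback(add_tool))
```

The extra thread matters because `asyncio.run()` raises if it is called from a thread that already has a running event loop.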

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update hyperlight README with code snippets and remove Public API section

Replace bare export list with Quick Start code examples covering the
context provider, standalone tool, manual static wiring, and file
mounts / network access patterns.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-17 00:49:44 +00:00

b03cb324d5

.NET: fix: Foundry Agents without description in Handoff (#5311)

* fix: Foundry Agents without description in Handoff

Foundry Agents without a description set will return an empty string (rather than null) for the description. This was breaking the fallback logic for `handoffReason`.

* test: Add unit tests

Jacob Alber · 2026-04-16 21:23:01 +00:00

dbf935b4e3

Merge branch 'main' into feature/python-foundry-hosted-agent-vnext

Tao Chen · 2026-04-16 13:55:04 -07:00

55e0705923

.NET: Add error checking to workflow samples (#5175)

* Initial plan

* Add WorkflowErrorEvent and ExecutorFailedEvent error checking to all workflow samples

Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/c5d77400-d7ed-4fbe-9103-f5d74aabcf2b

Co-authored-by: lokitoth <6936551+lokitoth@users.noreply.github.com>

* Fix if/else if consistency for error event handlers per code review feedback

Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/c5d77400-d7ed-4fbe-9103-f5d74aabcf2b

Co-authored-by: lokitoth <6936551+lokitoth@users.noreply.github.com>

* Address PR comments

* fixup: PR comments

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lokitoth <6936551+lokitoth@users.noreply.github.com>
Co-authored-by: Jacob Alber <jaalber@microsoft.com>

Copilot · 2026-04-16 20:03:16 +00:00

ca580a8316

.NET: Add Handoff sample (#5245)

* feat: Add Handoff sample

* docs: Add Handoff sample to readme

Jacob Alber · 2026-04-16 20:02:31 +00:00

101e07b061

.NET: Foundry Evals integration for .NET (#4914)

* Foundry Evals integration for .NET

- Core evaluation framework: EvalItem, LocalEvaluator, FunctionEvaluator, EvalChecks
- IAgentEvaluator interface with MeaiEvaluatorAdapter bridge
- AgentEvaluationExtensions for agent.EvaluateAsync() overloads
- FoundryEvals wrapping MEAI quality/safety evaluators
- ConversationSplitters (LastTurn, Full) and IConversationSplitter
- EvalItem.PerTurnItems() for multi-turn decomposition
- HasImageContent for multimodal content detection
- WorkflowEvaluationExtensions for per-agent workflow evaluation
- 7 eval samples mirroring Python parity:
  02-agents/Evaluation: SimpleEval, ExpectedOutputs, Multimodal
  03-workflows/Evaluation: WorkflowEval
  05-end-to-end/Evaluation: FoundryQuality, MixedProviders, ConversationSplits
- Comprehensive unit tests (1958 passing)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Rewrite FoundryEvals to use real Foundry Evals API

Replace MEAI evaluator shim with actual OpenAI EvaluationClient protocol
methods. FoundryEvals now creates eval definitions, submits runs, polls
for completion, and fetches per-item results server-side.

- New constructor: FoundryEvals(AIProjectClient, model, evaluators)
- Add FoundryEvalConverter for MEAI ChatMessage -> Foundry JSON format
- Add EvalId, RunId, ReportUrl to AgentEvaluationResults
- All 20 built-in evaluator constants now work (agent, tool, quality, safety)
- Remove Microsoft.Extensions.AI.Evaluation.Quality/Safety dependencies
- Update all samples for new constructor (no more ChatConfiguration)
- Replace BuildEvaluators tests with ResolveEvaluator tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add response output to CustomEvals and ExpectedOutputs samples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review: pagination, validation, error handling, tests

FoundryEvals fixes:
- Add pagination for output items (has_more/after cursor)
- Add guard clauses for pollIntervalSeconds/timeoutSeconds <= 0
- Fix double TryGetProperty for passed field parsing
- Throw on all-tool-evaluators with no tool definitions
- Fix XML doc (default 300s, not 180s)
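
The has_more/after pagination mentioned above can be illustrated with a small cursor loop. This is a sketch in Python rather than the .NET code in the PR; `iter_output_items` and `fetch_page` are hypothetical, and the page shape (`data`, `has_more`, item `id` used as the cursor) is an assumption.

```python
from collections.abc import Callable, Iterator
from typing import Any, Optional


def iter_output_items(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[Any]:
    """Yield every item, following has_more/after-style cursors."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        items = page.get("data", [])
        yield from items
        # Stop when the service reports no more pages (or sends an empty page).
        if not page.get("has_more") or not items:
            return
        # Assumption: the cursor is the ID of the last item on the page.
        after = items[-1]["id"]
```

Stopping on an empty page as well as on a false `has_more` guards against looping forever if a response is inconsistent.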

New tests (30 added, 1989 total):
- EvalChecks: NonEmpty, ContainsExpected (pass/fail/skip/case),
  HasImageContent, ToolCallsPresent
- FoundryEvalConverter: ConvertMessage (text, image, function call,
  function results fan-out, empty fallback, mixed content),
  ConvertEvalItem, BuildTestingCriteria (quality/agent/tool/groundedness
  data mappings), BuildItemSchema

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix review: null-refs, Data.ToString() bug, ContainsExpected, add tests

- Fix NullReferenceException in sample Response display (pattern matching)
- Fix WorkflowEvaluationExtensions Data?.ToString() producing type names
  instead of message text (pattern-match ChatMessage/AgentResponse/list)
- Change EvalChecks.ContainsExpected to return Passed=false when no
  ExpectedOutput (was silently passing, masking misconfiguration)
- Add EvalItem constructor tests with LastTurn/Full/null splitters
- Add FoundryEvalConverter.ConvertMessage DataContent (base64 image) test
- Add ExtractAgentData tests with ChatMessage, list, and AgentResponse data

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix review: conversation fidelity, eval caching, fallback tests

- WorkflowEvaluationExtensions: preserve full response messages (tool calls,
  intermediate) instead of synthetic 2-message conversation. Cast completed
  Data to AgentResponse and use Messages when available, fallback to text.
- FoundryEvals: cache evalId per schema shape (hasContext, hasTools) so
  subsequent EvaluateAsync calls create runs under the same eval definition.
- MeaiEvaluatorAdapter: code already correctly passes queryMessages (not full
  conversation) to IEvaluator — no change needed, verified by inspection.
- Add tests: AgentResponse full messages preservation, unknown object
  ToString() fallback for ExtractAgentData.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Rename AzureAI→Foundry: move eval files, update references

- Move FoundryEvals.cs and FoundryEvalConverter.cs from
  Microsoft.Agents.AI.AzureAI to Microsoft.Agents.AI.Foundry
- Update namespace from AzureAI to Foundry in both files
- Add explicit usings required by Foundry project (no implicit usings)
- Move FoundryEvalConverter tests to Foundry.UnitTests project
  (avoids ReplacingRedactor type conflict from dual project refs)
- Update all sample csproj references and using statements
- Remove Foundry project reference from AI UnitTests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* PR review round 4: wire up tool extraction, remove eval cache, fix null safety

- BuildEvalItem: extract tools from agent via GetService<ChatOptions>() into EvalItem.Tools (Python parity)
- FoundryEvals: remove eval ID cache - each call creates fresh definition (matches Python behavior)
- FoundryEvals: replace null-forgiving operators with descriptive InvalidOperationException
- MixedProviders sample: remove unnecessary explicit PackageReferences (transitively provided)
- FoundryEvalConverter: document that tool results take precedence over text content
- Add LocalEvaluator zero-checks test documenting 0 metrics = failed behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python-dotnet parity: 9 feature gaps filled

New checks:
- ToolCallArgsMatch() — verify tool call names + argument subset match
- ToolCalledCheck(ToolCalledMode.Any, ...) — match any of the specified tools
- ToolCalledMode enum (All/Any)
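
One way to read "names + argument subset match" is sketched below in Python. The function name and the call representation are hypothetical, and the real ToolCallArgsMatch() check may differ (for example in how it treats ordering or repeated calls).

```python
from typing import Any


def tool_call_args_match(
    actual_calls: list[tuple[str, dict[str, Any]]],
    expected_calls: list[tuple[str, dict[str, Any]]],
) -> bool:
    """Return True when every expected call has a matching actual call.

    A call matches when the names are equal and every expected argument
    appears in the actual arguments with the same value. Extra actual
    arguments are ignored (subset match).
    """

    def matches(actual: tuple[str, dict[str, Any]], expected: tuple[str, dict[str, Any]]) -> bool:
        a_name, a_args = actual
        e_name, e_args = expected
        return a_name == e_name and all(
            key in a_args and a_args[key] == value for key, value in e_args.items()
        )

    return all(any(matches(a, e) for a in actual_calls) for e in expected_calls)
```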

FoundryEvals enhancements:
- Default evaluators now [Relevance, Coherence, TaskAdherence] (was Relevance, Coherence)
- Auto-add ToolCallAccuracy when items have tool definitions
- EvaluateTracesAsync — evaluate by response_ids, trace_ids, or agent_id
- EvaluateFoundryTargetAsync — evaluate deployed Foundry targets

Result type enrichment:
- AgentEvaluationResults: added Status, Error, PerEvaluator, DetailedItems
- New EvalItemResult/EvalScoreResult/PerEvaluatorResult types
- FoundryEvals populates all new fields from API responses

Workflow fix:
- Skip internal executors (_*, input-conversation, end-conversation, end)

Tests: 8 new tests covering ToolCallArgsMatch, ToolCalledMode.Any, internal executor filtering

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add MeaiEvaluatorAdapter and PerTurnItems edge case tests

- 3 tests for MeaiEvaluatorAdapter: query message forwarding, synthetic
  response fallback, multiple items aggregation
- 3 tests for EvalItem.PerTurnItems: empty conversation, no user messages,
  system+assistant only
- StubEvaluator and StubChatClient test helpers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Blocking link check for outdated package in DevUI.

* Replace Dictionary<string, object> payloads with typed wire models

Introduce internal FoundryEvalWireModels.cs with compile-time-safe types
for the OpenAI Evals API wire format. The OpenAI .NET SDK (2.9.1) only
provides protocol-level methods with BinaryContent/ClientResult — no
typed request models. These internal models replace scattered dictionary
literals with [JsonPropertyName]-annotated classes, giving:

- Compile-time safety (typos become build errors)
- Single point of change when the API evolves
- IntelliSense discoverability
- Cleaner serialization via JsonPolymorphic for content items

Models: WireContentItem hierarchy (text, image, tool_call, tool_result),
WireMessage, WireEvalItemPayload, WireTestingCriterion, WireItemSchema,
WireCreateEvalRequest, WireCreateRunRequest, and data source variants.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Skip metric when Foundry returns neither score nor passed

When an evaluator returns no score and no passed value, the previous
code created BooleanMetric(name, false), which falsely failed items
via ItemPassed. Now we skip the MEAI metric entirely for indeterminate
results — the raw data remains available in DetailedItems for diagnostics.
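
The skip rule described above amounts to a three-way decision, sketched here in Python with hypothetical names (`to_metric`, `build_metrics`) and a hypothetical result shape; the PR itself implements this in .NET against MEAI metric types.

```python
from typing import Any, Optional


def to_metric(name: str, score: Optional[float], passed: Optional[bool]) -> Optional[tuple]:
    """Map one raw evaluator result to a (name, value) metric, or None."""
    if score is not None:
        return (name, score)
    if passed is not None:
        return (name, passed)
    # Indeterminate: emit no metric instead of a misleading False.
    return None


def build_metrics(raw_results: list[dict[str, Any]]) -> dict[str, Any]:
    """Collect metrics, skipping indeterminate results."""
    metrics: dict[str, Any] = {}
    for result in raw_results:
        metric = to_metric(result["name"], result.get("score"), result.get("passed"))
        if metric is not None:
            metrics[metric[0]] = metric[1]
    return metrics
```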

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #4914 review comments: fix tool evaluator bug and add tests

- Fix duplicate ToolCallAccuracy: resolve evaluator names before checking
  against ToolEvaluators set (Comment 2)
- Make FilterToolEvaluators internal for testability; add tests for the
  ArgumentException edge case when all evaluators are tool-type (Comment 3)
- Add CancellationToken test for LocalEvaluator (Comment 4)
- Add EvaluateAsync integration test on Run with sequential workflow and
  per-agent SubResults verification (Comment 5)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address Peter's review comments on PR #4914

- Add trailing newline to Evaluation_FoundryQuality.csproj (Comment 6)
- Make evaluator name lookups case-insensitive: switch BuiltinEvaluators,
  ToolEvaluators, AgentEvaluators, and ResolveEvaluator's StartsWith check
  from Ordinal to OrdinalIgnoreCase (Comment 7)
- Add Trace.TraceWarning when Foundry returns fewer results than submitted
  items, indicating expected vs actual count before padding (Comment 8)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add Microsoft.Extensions.AI.Evaluation packages to Directory.Packages.props

These were removed in #5269 as unused, but are needed by the Foundry
and core evaluation integration added in this PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: alliscode <bentho@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Ben Thomas · 2026-04-16 19:40:07 +00:00

aee1acbf8b

Python: Feat: Add finish_reason support to AgentResponse and AgentResponseUpdate (#5211)

* feat: add finish_reason support to AgentResponse and AgentResponseUpdate

Add finish_reason field to AgentResponse and AgentResponseUpdate classes,
propagate it through _process_update() and map_chat_to_agent_update(),
and add comprehensive unit tests.

Fixes #4622

* feat: add finish_reason to AgentResponse and AgentResponseUpdate

* style: add copyright header to test_finish_reason.py

* docs: add finish_reason to AgentResponse and AgentResponseUpdate docstrings

* refactor: move finish_reason tests into test_types.py per review feedback

Move all finish_reason test cases from the separate test_finish_reason.py
file into test_types.py as requested by eavanvalkenburg. Tests are placed
in a new '# region finish_reason' section at the end of the file.

* fix: use model instead of model_id in _process_update

Address PR review feedback from @eavanvalkenburg — ChatResponse and
ChatResponseUpdate both use 'model', not 'model_id'.

* fix: resolve SIM102 lint error in _process_update

Combine nested if statements for AgentResponse finish_reason check
to satisfy ruff SIM102 rule, with line wrapping to stay under 120 chars.
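
For readers unfamiliar with SIM102, the rewrite has this general shape. Both functions below are hypothetical and only illustrate the lint fix; they are not the actual `_process_update` logic.

```python
from typing import Optional


def pick_finish_reason_nested(current: Optional[str], incoming: Optional[str]) -> Optional[str]:
    """Nested form that ruff SIM102 flags."""
    if current is None:
        if incoming is not None:
            return incoming
    return current


def pick_finish_reason_combined(current: Optional[str], incoming: Optional[str]) -> Optional[str]:
    """Equivalent single condition, which satisfies SIM102."""
    if current is None and incoming is not None:
        return incoming
    return current
```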

* fix: resolve pyright reportArgumentType in map_chat_to_agent_update

Add type: ignore[arg-type] for FinishReason NewType widening when
passing ChatResponseUpdate.finish_reason to AgentResponseUpdate.
Matches existing patterns in the codebase (40+ similar ignores).

L. Elaine Dazzio · 2026-04-16 19:39:09 +00:00

91e34358eb

Python: Fix Gemini client support for Gemini API and Vertex AI (#5258)

* Add Gemini and Vertex AI client support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address Gemini PR review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* removed sample run readme part

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-16 19:38:50 +00:00

90a633967c

test: Add Handoff composability test (#5208)

Jacob Alber · 2026-04-16 16:36:09 +00:00

c14beedb3a

fix: propagate A2A metadata with namespaced key in additional_properties (#5240) (#5256)

Kartik Madan · 2026-04-16 15:22:39 +00:00

43d98974d3

.NET: Improve local release build perf by only formatting for one build target framework (#5266)

* Improve local release build perf by only formatting for one build target framework

* Update dotnet/Directory.Build.targets

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

westey · 2026-04-16 15:21:33 +00:00

60da0ffb48

.NET: Update Microsoft.Extensions.AI to 10.5.0 and OpenAI to 2.10.0 and remove unused refs (#5269)

* Update versions of System, Microsoft.Extensions and OpenAI packages

* Remove unused package references

* Remove further unused references

westey · 2026-04-16 11:03:51 +00:00

a2044829b1

Python: Handle url_citation annotations in FoundryChatClient streaming responses (#5071)

* Fix url_citation annotations dropped in streaming (#5029)

Add url_citation branch to the streaming annotation handler in
_parse_chunk_from_openai, mirroring the existing non-streaming path.
The handler creates an Annotation with type='citation', title, url,
and annotated_regions (TextSpanRegion), wrapped in Content.from_text.

Update test_streaming_annotation_added_with_unknown_type to use a
truly unknown type, and add new tests for url_citation (with and
without url).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5029: Python: [Bug]: url_citation annotations silently dropped in Foundry streaming (SharePoint grounding citations lost)

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

Giles Odigwe · 2026-04-16 09:33:04 +00:00

435c66e9c9

Bump Anthropic SDK to 12.13.0 and Anthropic.Foundry to 0.5.0 (#5279)

- Update Anthropic from 12.11.0 to 12.13.0
- Update Anthropic.Foundry from 0.4.2 to 0.5.0
- Change Anthropic project from release candidate to preview
- Add new IBetaService members (Agents, Environments, Sessions, Vaults) to test mock

Roger Barreto · 2026-04-16 09:19:36 +00:00

52d50be9e0

Add AgentExecutorResponse.with_text() to preserve conversation history through custom executors (#5255)

Fixes #5246

When a custom @executor transforms agent output and sends a plain str,
the downstream AgentExecutor.from_str handler loses the full conversation
context. This adds a with_text() helper that creates a new
AgentExecutorResponse with replaced text while preserving the prior
conversation chain, so AgentExecutor.from_response is invoked instead.

- Add with_text(text) method to AgentExecutorResponse dataclass
- Add 3 regression tests in test_full_conversation.py
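
The idea behind `with_text()` can be sketched with a small dataclass. This is only an illustration under stated assumptions: `ExecutorResponse` and its `conversation` field are hypothetical stand-ins, and the real `AgentExecutorResponse` has different fields and may be implemented differently.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExecutorResponse:
    """Hypothetical stand-in; the real AgentExecutorResponse differs."""

    text: str
    conversation: tuple[str, ...] = ()

    def with_text(self, text: str) -> ExecutorResponse:
        # Copy the instance and swap only the text, so the prior
        # conversation travels with the new response.
        return replace(self, text=text)


original = ExecutorResponse(text="draft", conversation=("user: hi", "agent: draft"))
updated = original.with_text("DRAFT")
```

Because the custom executor forwards a response object rather than a bare `str`, a downstream handler can still see the earlier turns.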

Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

Kartik Madan · 2026-04-16 08:39:19 +00:00

d20f9b5f97

.NET: Fix intermittent checkpoint-restore race in in-process workflow runs (#5134)

* Improve workflow unit tests

* Update test name prefix for clarity.

* Update tests to surface any errors.

* Fix checkpoint restore-time race in off-thread workflow event stream

* Fixes an intermittent checkpoint-restore race in in-process workflow runs.

Peter Ibekwe · 2026-04-16 04:20:45 +00:00

87a8fa2a9d

Merge branch 'main' into feature/python-foundry-hosted-agent-vnext

Tao Chen · 2026-04-15 20:59:51 -07:00

892d88df28

Python: Add OpenAI types to default checkpoint encoding allow list (#5297)

* Add OpenAI types to default checkpoint encoding allow list

* Address comments

Tao Chen · 2026-04-16 12:58:28 +09:00

8f7fd9525d

Python: Add context_providers and description to workflow.as_agent() (#4651)

* Add context_providers and description to `workflow.as_agent()`

* Add default workflow name and description

* Positional

* Move import

---------

Co-authored-by: Tao Chen <taochen@microsoft.com>
Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

Chinedum Echeta · 2026-04-16 02:47:29 +00:00

69697065ab

Revert to public MCP server and skip on transient upstream errors (#5296)

The local MCP server can't be used for hosted tools tests because
Anthropic's backend needs to reach the MCP URL from their infrastructure
(not localhost on the CI runner). Revert to learn.microsoft.com/api/mcp
but catch BadRequestError, InternalServerError, APIConnectionError, and
APITimeoutError and pytest.skip so upstream outages don't block the
merge queue.
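
The skip-on-transient-error pattern described above can be sketched roughly as below. To keep the sketch self-contained, the exception classes are hypothetical stand-ins for the SDK's error types, and the standard library's `unittest.SkipTest` is used in place of `pytest.skip`; the helper name `run_or_skip` is also an assumption.

```python
import unittest


class APIConnectionError(Exception):
    """Hypothetical stand-in for the SDK's connection error."""


class APITimeoutError(Exception):
    """Hypothetical stand-in for the SDK's timeout error."""


# Only failures that indicate an upstream outage are treated as skippable.
TRANSIENT = (APIConnectionError, APITimeoutError)


def run_or_skip(call):
    """Run call(); turn transient upstream failures into a test skip."""
    try:
        return call()
    except TRANSIENT as exc:
        # In a pytest suite this line would call pytest.skip(...) instead.
        raise unittest.SkipTest(f"upstream unavailable: {exc!r}") from exc
```

Any other exception still propagates, so genuine regressions continue to fail the test.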

Evan Mattson · 2026-04-16 11:46:49 +09:00

fe4cd3cddc

Python: improve misc-integration test robustness (#5295)

* Python: use local MCP server for hosted tools test and broaden image assertion

The hosted tools integration test was hitting rate limits on the external
learn.microsoft.com MCP server, causing persistent failures that retries
couldn't recover from. Switch to the local MCP server already spun up in
CI via LOCAL_MCP_URL, skipping when the env var isn't set.

Also broaden the image description assertion to accept common synonyms
(cottage, mansion, villa, etc.) instead of just "house", since the model
legitimately uses varied vocabulary for the same image.

* Address review feedback: validate LOCAL_MCP_URL scheme and use word boundaries

- Skip hosted tools test when LOCAL_MCP_URL lacks http/https scheme,
  matching the pattern used in test_mcp.py.
- Use regex word boundaries for image assertion to avoid false matches
  like "villain" matching "villa".
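
The two review fixes above can be illustrated with a short sketch. The synonym list and both function names are assumptions for illustration; the actual test code is not shown in this log.

```python
import re
from urllib.parse import urlparse

# Illustrative synonyms only; the real assertion may accept a different set.
HOUSE_WORDS = ("house", "cottage", "mansion", "villa")

# \b anchors each alternative at word boundaries, so "villa" does not
# match inside "villain".
HOUSE_PATTERN = re.compile(r"\b(?:" + "|".join(HOUSE_WORDS) + r")\b", re.IGNORECASE)


def mentions_house(description: str) -> bool:
    """Return True if the description contains one of the accepted words."""
    return HOUSE_PATTERN.search(description) is not None


def has_http_scheme(url: str) -> bool:
    """Return True only for URLs with an explicit http or https scheme."""
    return urlparse(url).scheme in ("http", "https")
```

A test could then skip when `has_http_scheme(os.environ.get("LOCAL_MCP_URL", ""))` is false and assert `mentions_house(...)` on the model's description.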

Evan Mattson · 2026-04-16 11:34:28 +09:00

611230cc8e

Python: bump misc-integration retry delay to 30s (#5293)

The misc-integration job (Anthropic, Ollama, MCP) frequently fails on merge to main when the upstream MCP server (e.g. learn.microsoft.com/api/mcp) returns a transient rate-limit error. The previous 5s retry delay is too short to ride out the upstream backoff window, so all retries fail and the merge queue is blocked. Bumping to 30s gives the upstream a chance to recover before pytest-retry re-runs the test.

Evan Mattson · 2026-04-16 10:03:00 +09:00

f112150cfb

Python: add experimental file history provider (#5248)

* add experimental file history provider

* Improve file history provider writes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* typo

* cleanup

* cleanup

* fix in readme

* added security messages

* Refine file history provider locking

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* added additional sample

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-15 22:23:37 +00:00

ff05c22c58

Forward provider config to SessionConfig in GitHubCopilotAgent (fixes #5190) (#5195)

Co-authored-by: Sergey Borisov <sergey.borisov@dataimpact.io>

S3rj · 2026-04-15 22:08:01 +00:00

eab7f09d03

Python: Upgrade agentserver packages (#5284)

* Upgrade agentserver packages

* Fix new types

Tao Chen · 2026-04-15 14:16:37 -07:00

3225a59fd3

Move samples (#5281)

Tao Chen · 2026-04-15 11:33:15 -07:00

9e3983e547

Python: Bump agent-framework-devui to 1.0.0b260414 for release (#5259)

Update devui version and changelog for the streaming memory fix release.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-04-15 18:22:15 +00:00

python-devui-1.0.0b260414 68b93641b6

1934 Commits