mirror of
https://github.com/microsoft/agent-framework.git
synced 2026-06-16 21:04:09 +08:00
feature/python-hosting
214 Commits
-
Evan Mattson ·
2026-05-22 15:56:32 +09:00 -
ci: pin third-party GitHub Actions to commit SHAs (#5972)
Replaces every floating tag in our workflow and composite action files with an immutable 40-character commit SHA, keeping the original `# vX` comment so Dependabot can still propose version bumps. 186 occurrences across 25 workflows and 2 composite actions. Also widens the github-actions Dependabot entry to use the plural `directories` key with `/.github/actions/*` so composite actions under `.github/actions/<name>/action.yml` are kept up to date. Previously Dependabot only scanned `.github/workflows` and the repo-root `action.yml`, leaving our `python-setup` and `sample-validation-setup` composite actions unmaintained.
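A minimal sketch of how floating refs could be detected before pinning, assuming workflow files are scanned as plain text. The helper name `find_floating_refs` and the `example-org/example-action` entry are hypothetical and not taken from the repository.

```python
import re

# Matches `uses: owner/repo[/path]@ref` and captures the action name and the ref.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([\w.\-/]+)@(\S+)", re.MULTILINE)
# A pinned ref is a full 40-character lowercase hex commit SHA.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def find_floating_refs(workflow_text: str) -> list:
    """Return (action, ref) pairs whose ref is not a full commit SHA."""
    floating = []
    for action, ref in USES_RE.findall(workflow_text):
        # Local actions (`uses: ./path`) carry no @ref, so the regex skips them.
        if not SHA_RE.match(ref):
            floating.append((action, ref))
    return floating


# Hypothetical workflow fragment: one floating tag, one pinned SHA with a `# vX` comment.
sample = """
steps:
  - uses: actions/checkout@v4
  - uses: example-org/example-action@0123456789abcdef0123456789abcdef01234567 # v1
"""

print(find_floating_refs(sample))
```

The trailing `# vX` comment is ignored by the check, which matches the commit's note that the comment is kept only so Dependabot can still propose version bumps.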
Roger Barreto ·
2026-05-20 22:10:32 +00:00 -
Python: Bump Python package versions for a release (#5964)
* Bump Python package versions to 1.5.0 for a release
* Promote orchestrations to 1.0.0rc1
* ci(python-setup): merge dynamic exclude into existing workspace exclude
  The python-setup action injected exclude = [...] verbatim into [tool.uv.workspace], producing a duplicate 'exclude' key when the section already had a static exclude. Scope the rewrite to the [tool.uv.workspace] section and append the package to the existing array when present; idempotent if the package is already excluded.
* Address Copilot review feedback: raise inter-package floors to 1.5.0
  - foundry, foundry-local: agent-framework-openai >=1.4.0 -> >=1.5.0
  - azure-contentunderstanding: agent-framework-foundry >=1.4.0 -> >=1.5.0
  - azurefunctions: pin agent-framework-durabletask to >=1.0.0b260519,<2
  Keeps lockstep cohort consistent and avoids mixed 1.4.x / 1.5.0 installs.
* Re-include azurefunctions and durabletask in the uv workspace
  The pinned durabletask>=1.4.0 floor is enough to make resolution succeed; the workspace exclude was over-correction and broke CI samples and pyright type-checking (re-exports in agent_framework/azure/__init__.pyi plus samples/04-hosting/{azure_functions,durabletask}/ could not resolve their imports). Dropping them from agent-framework-core[all] still stands so the metapackage does not pull them.
* Restore azurefunctions and durabletask in agent-framework-core[all]
  The durabletask floor pin keeps users on the safe 1.4.0, so they are once again included in the metapackage. Update CHANGELOG to reflect the pin rather than an [all] removal.
* Raise uvicorn ceiling in ag-ui and devui to allow 0.42+
  The root override-dependencies pins uvicorn[standard]>=0.34.0 (no upper) and the workspace lock resolves to 0.47.0. The package ceiling <0.42.0 meant the workspace was no longer testing the declared supported range. Bump to <1 so the lock fits within the declared bounds. Also picked up by validate-dependency-bounds: refresh stale orchestrations RC pin in devui dev deps.
Evan Mattson ·
2026-05-20 09:20:53 +09:00 -
ci(python-setup): drop -U upgrade flag from uv sync (#5961)
The shared composite action ran `uv sync --all-packages --all-extras --dev -U` on every job, which upgrades every dependency to the latest compatible version instead of using the pinned versions in `uv.lock`. That is currently producing a hard resolver failure on every CI job:

    No solution found when resolving dependencies for split (markers: python_full_version >= '3.11' and sys_platform == 'darwin')
    Because there are no versions of durabletask and agent-framework-durabletask depends on durabletask>=1.3.0,<2, we can conclude that agent-framework-durabletask's requirements are unsatisfiable.

Dropping `-U` makes the install use the workspace lockfile, which is what is reproducible locally and what we publish releases against. Upgrades should be opt-in (via a scheduled job or a separate workflow) rather than implicit on every CI run.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-05-19 19:33:11 +00:00 -
Evan Mattson ·
2026-05-15 10:49:46 +09:00 -
Replace merge-gatekeeper Docker action with github-script polling (#5533)
The upsidr/merge-gatekeeper@v1 action is a Dockerfile-based action that builds a golang image on every run. On merge_group events the run step is conditioned out via `if: github.event_name == 'pull_request'`, so the build happens but produces nothing. Replace with an actions/github-script@v8 polling loop that mirrors the action's behavior exactly: merges combined-statuses and check-runs for the PR head SHA, with combined-status winning on name collisions, and the same conclusion mapping (skipped → dropped, success/neutral → success, anything else terminal → error). Same job name, triggers, permissions, timeout (3600s), interval (30s), and ignored list, so existing required-check rules stay valid. PR runs now poll the API in seconds instead of waiting on a per-run docker image build, and merge_group runs become near-instant no-ops.
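The merge rule described above can be sketched as a pure function. This is an illustration only: the names `map_conclusion`, `merge_checks`, and `gate_state` are hypothetical, and both the treatment of unfinished runs as "pending" and the overall gate decision are assumptions rather than details confirmed by the commit.

```python
from typing import Dict, FrozenSet, Optional


def map_conclusion(conclusion: Optional[str]) -> Optional[str]:
    """Map a check-run conclusion to a gate state; None means the check is dropped."""
    if conclusion is None:
        return "pending"   # assumption: no terminal conclusion yet
    if conclusion == "skipped":
        return None        # skipped -> dropped
    if conclusion in ("success", "neutral"):
        return "success"   # success/neutral -> success
    return "error"         # anything else terminal -> error


def merge_checks(
    statuses: Dict[str, str],
    check_runs: Dict[str, Optional[str]],
    ignored: FrozenSet[str] = frozenset(),
) -> Dict[str, str]:
    """Merge check-runs and combined statuses for one head SHA."""
    merged: Dict[str, str] = {}
    for name, conclusion in check_runs.items():
        state = map_conclusion(conclusion)
        if state is not None:
            merged[name] = state
    # Combined statuses are applied last so they win on name collisions.
    merged.update(statuses)
    return {name: state for name, state in merged.items() if name not in ignored}


def gate_state(merged: Dict[str, str]) -> str:
    """Assumed overall decision: any failure is an error, otherwise wait or pass."""
    states = set(merged.values())
    if states - {"success", "pending"}:
        return "error"
    return "pending" if "pending" in states else "success"


merged = merge_checks(
    statuses={"build": "success"},
    check_runs={"build": "failure", "lint": "skipped", "docs": "neutral", "tests": None},
)
print(merged, gate_state(merged))
```

A polling loop would call `merge_checks` once per interval and stop when `gate_state` returns something other than "pending" or the timeout elapses.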
Evan Mattson ·
2026-05-13 05:45:51 +00:00 -
.NET: CI hardening — split Functions tests, re-enable skipped integration tests (#5717)
* Split DurableTask/AzureFunctions integration tests into dedicated CI job - Add -TestProjectNameExclude parameter to New-FilteredSolution.ps1 - Add 'functions' and 'core' path filters to paths-filter job - Exclude DurableTask/AzureFunctions from main dotnet-test job - Remove emulator setup from dotnet-test (no longer needed) - Add new dotnet-test-functions job (ubuntu/net10.0 only, path-conditional) - Update merge gate and report job to include dotnet-test-functions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR feedback: add Workflows.Generators to core filter, drop dotnetChanges gate from functions job Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable Anthropic integration tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Upgrade Anthropic SDK 12.13.0 -> 12.20.0 to fix M.E.AI incompatibility Fixes MissingMethodException on WebSearchToolResultContent.get_Results() caused by Anthropic 12.13.0 being compiled against an older Microsoft.Extensions.AI.Abstractions version. Suppress RT0003 in AI.Abstractions.csproj as the transitive reference from the upgraded Anthropic SDK conflicts with the explicit one. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix Anthropic unit test mocks for SDK 12.20.0 interface changes Add missing interface members: IAnthropicClient.WebhookKey, IBetaService.MemoryStores, IBetaService.Webhooks, IBetaService.UserProfiles Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable CheckSystem declarative integration tests The CheckSystem.yaml tests were temporarily skipped in PR #4270 during the Azure.AI.Projects 2.0.0-beta.1 SDK update. Since then, the system variable plumbing (SystemScope, SetLastMessageAsync, conversation initialization) has been significantly updated and stabilized. The other tests in these same files pass reliably using the same infrastructure. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix CheckSystem test case to expect 1 response The CheckSystem workflow sends a 'PASSED!' SendActivity when all system variables are populated, producing 1 AgentResponseEvent. The test case had min_response_count: 0 with no max, so the assertion defaulted max to 0 and failed with 'Response count greater than expected: 0 (Actual: 1)'. Updated to expect exactly 1 response, matching the SendActivity pattern. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable Foundry OpenAPI server-side tool integration test Remove Skip="For manual testing only" from AsAIAgent_WithOpenAPITool_NativeSDKCreation_InvokesServerSideToolAsync. The test already uses RetryFact(3 retries, 5s delay) to handle transient failures from the external restcountries.com API. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Include workflow file in functions/core path filters A PR editing only dotnet-build-and-test.yml would skip dotnet-test-functions because the workflow path was missing from both the functions and core path filter lists. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Rename filter parameters for consistency TestProjectNameFilter -> TestProjectNameIncludeFilter TestProjectNameExclude -> TestProjectNameExcludeFilter Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove unnecessary RT0003 warning suppression The RT0003 suppression was added during the Anthropic SDK 12.20.0 upgrade but the warning no longer fires. Removing it to keep the NoWarn list minimal. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove duplicate WebhookKey properties from merge Both our branch and main added WebhookKey to the Anthropic test mock classes, resulting in CS0102 duplicate definition errors. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giles Odigwe ·
2026-05-12 17:56:31 +00:00 -
Evan Mattson ·
2026-05-12 15:27:13 +09:00 -
Trigger issue triage on bug-labeled issues (#5763)
* Trigger issue triage on bug-labeled issues instead of manual dispatch
* Address PR feedback: scope concurrency cancellation to bug-label events
Evan Mattson ·
2026-05-12 13:07:17 +09:00 -
.NET: Hosted Agents - RAG Sample with Azure AI Search (#5693) (#5701)
* .NET: Hosted Agents - RAG Sample with Azure AI Search (#5693)
  Adds a Hosted-AzureSearchRag sample plus a live Foundry.Hosting integration test scenario backed by a real Azure AI Search index.
  Sample (Hosted-AzureSearchRag): keyword-only Azure AI Search via SearchClient adapter into TextSearchProvider, scope-aware DevTemporaryTokenCredential consuming AZURE_BEARER_TOKEN_FOUNDRY + AZURE_BEARER_TOKEN_SEARCH for local Docker, Dockerfile + contributor Dockerfile mirroring Hosted-TextRag.
  Integration test: AzureSearchRagHostedAgentFixture extends the PR #5598 HostedAgentFixture with the new azure-search-rag scenario branch in the shared test container; AzureSearchRagHostedAgentTests asserts the model returns canary tokens (TR-CANARY-7821, SHIP-CANARY-4493) that exist only in the seeded documents - real proof the agent grounded its answer in retrieved content rather than training data.
* Address PR 5701 Copilot review feedback
  - Sample README: drop stale 'bootstraps the index on first run' line; index is pre-provisioned out of band
  - Sample + TestContainer search adapters: propagate CancellationToken to await foreach via .WithCancellation()
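The canary-token idea can be illustrated with a small check. This is a sketch, not the repository's test code: the helper `missing_canaries` and the sample answers are hypothetical, and whether the real tests expect both tokens in one answer or one token per question is not stated in the commit.

```python
# Tokens quoted in the commit message; they exist only in the seeded search documents.
CANARY_TOKENS = ("TR-CANARY-7821", "SHIP-CANARY-4493")


def missing_canaries(answer: str, tokens=CANARY_TOKENS) -> list:
    """Return the canary tokens that do not appear in the model's answer."""
    return [token for token in tokens if token not in answer]


# Hypothetical answers: the first cites both seeded documents, the second cites neither.
grounded = "Policy TR-CANARY-7821 applies; shipping reference SHIP-CANARY-4493."
ungrounded = "Returns are usually accepted within 30 days."

print(missing_canaries(grounded))
print(missing_canaries(ungrounded))
```

Because the tokens cannot come from training data, an empty result is evidence that the answer was built from retrieved content.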
Roger Barreto ·
2026-05-11 13:59:42 +00:00 -
.NET: Foundry.Hosting IT - eliminate MSBuild parallel-output races (#5725)
* .NET: Foundry.Hosted IT - fix MSBuild parallel-output races
  Two surgical changes inside the dotnet-foundry-hosted-it job:
  1. Replace dotnet build <slnx> -f net10.0 with dotnet build <test.csproj>. The test csproj pins TargetFrameworks=net10.0 and its ProjectReference closure gives MSBuild a single-rooted graph, eliminating the duplicate inner-builds that race on bin/obj. Drops the two New-FilteredSolution.ps1 steps.
  2. In it-build-image.ps1, drop the -UsePrebuiltProjectReferences switch and always pass --no-dependencies to dotnet publish. Publish now resolves TestContainer's framework refs by reading prebuilt DLLs and never re-touches them.
  Replaces the partial-mitigation in PR #5689 with a structural fix. Local validation confirmed published Foundry.dll has identical mtime and bytes as the prebuild output.
* .NET: dotnet test - use --project flag for Microsoft Testing Platform
Roger Barreto ·
2026-05-11 09:39:13 +00:00 -
.NET: Python: Add dotnet integration test report to CI (#5515)
* Add dotnet integration test report to CI - Add --report-junit flag to dotnet integration test step to generate JUnit XML alongside TRX, with explicit --results-directory to centralize output in IntegrationTestResults/ - Upload JUnit XML artifacts from each matrix leg (net10.0/ubuntu, net472/windows) as dotnet-test-results-{framework}-{os} - Add dotnet-integration-test-report job that downloads artifacts, runs the existing aggregate.py script, posts markdown to Job Summary, and saves trend history via actions/cache - Refactor aggregate.py to discover JUnit XML files recursively, supporting both pytest (pytest.xml) and xunit (*.junit.xml) layouts - Handle provider name derivation for dotnet artifact naming convention - Fix nodeid collision when same test runs under multiple frameworks by qualifying keys with provider when collisions are detected - Improve module extraction for dotnet C# classnames (recognizes IntegrationTests/UnitTests namespace segments) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore: trigger dotnet CI for report validation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: use .junit extension (not .junit.xml) for xunit v3 output xUnit v3 generates files with .junit extension, not .junit.xml. Update upload glob and aggregate.py discovery to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: use deterministic provider-qualified keys for dotnet tests Always prefix dotnet test keys with provider (e.g. net10.0 (ubuntu)::TestName) to ensure stable, comparable counts across runs regardless of file parse order. Also show Executed (passed+failed) instead of Total in summary table. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: match Python report summary format (Total, passed/total, etc.) 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: split dotnet report into per-framework tables Dotnet tests run on multiple frameworks (net10.0, net472). Instead of one combined table with unstable totals, show separate sections per framework — each with its own summary row and per-test table. Python reports retain the original single-table format. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable 7 flaky dotnet integration tests with increased timeouts Increase timeouts to reduce timing-related flakiness in LLM-backed integration tests (issue #4971): - ExternalClientTests: 60s -> 120s default timeout - SamplesValidationBase: 60s -> 120s default timeout - ConsoleAppSamplesValidation: 90s -> 150s for long-running tests - AzureFunctions SamplesValidation: 2min -> 3min orchestration timeout, 60s -> 90s per-step WaitForConditionAsync timeouts Remove all Skip=Flaky annotations and unused SkipFlakyTimingTest constants. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-skip LLM non-determinism flaky tests, keep timeout fixes Re-skip SingleAgentOrchestrationHITLSampleValidationAsync and LongRunningToolsSampleValidationAsync - these fail due to LLM producing extra review notifications, not timeouts. Updated skip reasons to accurately describe the root cause. Reverted unnecessary timeout change on the skipped LongRunningTools test. The remaining 5 re-enabled tests with timeout increases are stable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Enable Anthropic integration tests in CI Replace hardcoded skip with conditional skip pattern (matching CopilotStudio approach): tests gracefully skip when ANTHROPIC_API_KEY is missing, and run when present. 
Changes: - AnthropicChatCompletionFixture: try/catch in InitializeAsync with Assert.Skip on missing config (replaces hardcoded SkipReason) - AnthropicSkillsIntegrationTests: same pattern per test method - dotnet-build-and-test.yml: wire up ANTHROPIC_API_KEY, ANTHROPIC_CHAT_MODEL_NAME, and ANTHROPIC_REASONING_MODEL_NAME env vars to the integration test step Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix missing System using in AnthropicSkillsIntegrationTests Add 'using System;' for InvalidOperationException in try/catch blocks. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Skip flaky SingleAgentOrchestrationChainingSampleValidationAsync LLM non-determinism causes Assert.NotNull failures on orchestration results. Skip until test logic is hardened against non-deterministic LLM responses. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable HITL and LongRunningTools tests with timeout and flexibility fixes - Remove Skip attribute from SingleAgentOrchestrationHITLSampleValidationAsync - Remove Skip attribute from LongRunningToolsSampleValidationAsync - Increase timeout from 120s/90s to 180s to accommodate 2+ LLM round-trips - Replace rigid 2-cycle assertion with flexible approval logic that handles extra review cycles from LLM non-determinism Fixes the two failure modes identified in #4971: 1. Timeout: 120s/90s was insufficient for multiple LLM calls under CI load 2. Extra notifications: Assert.Fail on 3rd+ review cycle was too rigid Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Increase AzureFunctions LongRunningTools test timeouts from 90s to 180s The LongRunningToolsSampleValidationAsync test in the AzureFunctions integration tests was failing in CI with TimeoutException at the 'Content published notification is logged' step. The 90-second timeouts are too tight for CI environments where LLM calls and orchestration overhead can be slow. 
Increased all three WaitForConditionAsync timeouts from 90s to 180s: - Waiting for human feedback notification - Waiting for publish notification (the step that was failing) - Waiting for orchestration completion Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Merge main and fix dotnet report path after flaky_report rename Merge upstream/main which renamed scripts/flaky_report/ to scripts/integration_test_report/ (from Python PR #5454). Update the dotnet-build-and-test workflow to reference the new path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add RetryFact to DurableTask and AzureFunctions integration tests These tests interact with LLMs via stdin/stdout (DurableTask) or HTTP (AzureFunctions) and are inherently non-deterministic. Unlike the Python side which uses pytest-retry, the dotnet tests had no retry mechanism and a single transient failure would fail the entire CI run. Changes: - Switch [Fact] to [RetryFact(2, 5000)] on all LLM-dependent tests across ConsoleAppSamplesValidation, ExternalClientTests, WorkflowConsoleAppSamplesValidation, and AzureFunctions SamplesValidation - Add re-prompt mechanism to LongRunningToolsSampleValidationAsync: if the LLM doesn't invoke the tool within 60s, re-send the prompt (up to 2 retries) instead of burning the full timeout - Reduce LongRunningTools timeout from 240s to 180s (re-prompt makes the extra buffer unnecessary) - Leave simple/deterministic tests as [Fact] (SingleAgent, unit tests) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add persist-credentials: false to Integration Test Report checkout step Matches the convention used by other checkout steps in this workflow to avoid leaving GITHUB_TOKEN credentials in the local git config. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * small fixes * disable anthropic failing tests --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giles Odigwe ·
2026-05-07 20:39:32 +00:00 -
.NET: Foundry.Hosting IT: avoid MSB3026 in publish; fix telemetry UT flake (#5689)
CI publish step: gate the BuildProjectReferences=false fast-path on an explicit -UsePrebuiltProjectReferences switch (passed by the workflow) instead of marker detection. Adds a preflight error when stale obj/Release/net10.0 outputs would cause CS0579, with actionable recovery instructions. Telemetry UT flake: AgentFrameworkResponseHandlerTelemetryTests was using a plain List<Activity> for OTel's InMemoryExporter. The exporter writes from background Activity completion callbacks while parallel tests on the same global ActivitySource feed every listener, racing against the assertion's enumeration and throwing 'Collection was modified'. Replaced with a small thread-safe ConcurrentActivityList that locks add/enumerate and returns a snapshot for assertions.
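The telemetry fix is a general pattern: guard both append and enumeration with one lock, and hand assertions a copy. A Python analogue is sketched below. The class name `SnapshotList` is hypothetical, and the actual ConcurrentActivityList is C# code whose exact shape is not shown in the commit.

```python
import threading
from typing import Generic, Iterator, List, TypeVar

T = TypeVar("T")


class SnapshotList(Generic[T]):
    """List wrapper whose writers and readers share one lock."""

    def __init__(self) -> None:
        self._items: List[T] = []
        self._lock = threading.Lock()

    def add(self, item: T) -> None:
        # Writers (for example exporter callbacks) append under the lock.
        with self._lock:
            self._items.append(item)

    def snapshot(self) -> List[T]:
        # Readers get a copy, so later appends cannot disturb their iteration.
        with self._lock:
            return list(self._items)

    def __iter__(self) -> Iterator[T]:
        return iter(self.snapshot())


# Usage: background threads add items while the main thread enumerates safely.
collected: SnapshotList = SnapshotList()
workers = [
    threading.Thread(target=lambda n=n: [collected.add(n * 100 + i) for i in range(100)])
    for n in range(4)
]
for w in workers:
    w.start()
seen_so_far = len(list(collected))  # safe even while writers are still running
for w in workers:
    w.join()
print(len(collected.snapshot()))
```

Returning a snapshot trades a small copy cost for assertions that never observe a list mid-mutation, which is the failure the commit describes as 'Collection was modified'.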
Roger Barreto ·
2026-05-07 18:54:46 +00:00 -
.NET: Add Foundry.Hosting.IntegrationTests (#5598)
* Foundry.Hosting.IntegrationTests: scaffold project, fixtures, and 24 tests Add a new integration test project for Foundry hosted agents alongside the existing Foundry.IntegrationTests project. The project provisions a real Foundry hosted agent per scenario via AgentAdministrationClient.CreateAgentVersionAsync, points it at a single test container image (built and pushed out of band by scripts/it-build-image.ps1 in a follow up commit), and exercises the agent through AIProjectClient.AsAIAgent. Six scenario fixtures are introduced, each pointing at the same image but selecting behavior via the IT_SCENARIO environment variable on the HostedAgentDefinition: - HappyPathHostedAgentFixture (round trip, multi turn, stored=false flag) - ToolCallingHostedAgentFixture (server side AIFunctions) - ToolCallingApprovalHostedAgentFixture (approval flow) - ToolboxHostedAgentFixture (Foundry toolbox) - McpToolboxHostedAgentFixture (MCP backed toolbox) - CustomStorageHostedAgentFixture (custom storage provider) 24 tests across 6 test classes are scaffolded. All are tagged Skip pending the test container build and the end to end smoke iteration in follow up commits. Once the container is in place the Skip annotations can be removed scenario by scenario. Adds an IT_HOSTED_AGENT_IMAGE constant to the shared TestSettings so every IT project agrees on the env var name the build script emits. * Foundry.Hosting.IntegrationTests: add TestContainer, build script, slnx, README Adds the rest of the integration test infrastructure on top of the previous scaffolding commit: * Foundry.Hosting.IntegrationTests.TestContainer csproj and Program.cs implementing the multi scenario container (one image, IT_SCENARIO env var dispatches between happy-path, tool-calling, tool-calling-approval, toolbox, mcp-toolbox, and custom-storage). The toolbox, mcp-toolbox, and custom-storage branches are placeholders pending API surface stabilization. 
* Dockerfile and dockerignore in the test container project, using the contributor pattern matching the investigation work (host side dotnet publish, container only does COPY out/). * scripts/it-build-image.ps1 with mandatory Registry parameter (no hardcoded ACR), content hashed tags so unchanged source results in a no op push, and emits IT_HOSTED_AGENT_IMAGE for shells and CI to consume. * slnx entry for both new projects. * README in the IT project covering env vars, image build, scenario table, and current placeholder status. Steps still pending: end to end smoke (step 5) and CI workflow integration (step 6) require a live Foundry deployment and ACR push, so they land in follow up commits. * Foundry.Hosting.IntegrationTests: address PR 5598 review feedback Fix issues raised by Copilot review: * it-build-image.ps1: hash file contents, not the path list, so any source edit produces a fresh tag. Normalize Registry input by stripping scheme and trailing slash before deriving the ACR short name. Validate the short name is non empty. * HostedAgentFixture: route GetAgentAsync through _adminClient (which has the FoundryFeaturesPolicy attached) instead of through _projectClient.AgentAdministrationClient (which does not). * HostedAgentFixture FoundryFeaturesPolicy: replace Headers.Add with Remove plus Add so retries cannot accumulate duplicate headers. * HappyPath, ToolCalling, ToolCallingApproval, CustomStorage tests: create the AgentSession before turn 1 and reuse it for both turns. The previous pattern created the session after turn 1 so turn 2 had no link to turn 1, defeating the multi turn assertion. * .NET: Foundry.Hosting.IntegrationTests: constrain to net10.0 + dotnet format autofix - Set <TargetFrameworks>net10.0</TargetFrameworks>: the project references both Microsoft.Agents.AI.Foundry.Hosting (net8/9/10 only) and AgentConformance.IntegrationTests (net10.0;net472 — inherits the tests-default TFM list). 
The intersection is net10.0; the previous $(TargetFrameworksCore) triple caused NU1702 + System.Text.Json version conflicts on the net8.0/net9.0 builds because AgentConformance had no matching asset. - Apply `dotnet format` autofix on the test files (IDE0005, IDE0009, IDE0032, IMPORTS). * .NET: Foundry.Hosting.IntegrationTests.TestContainer/Program.cs: add UTF-8 BOM CI's check-format requires charset=utf-8-bom per .editorconfig. * Foundry.Hosting IntegrationTests: wire end-to-end CI flow against hosted agents Make the integration tests usable end-to-end against a live Foundry deployment, including a per-run rebuild of the test container so framework code changes are exercised. Fixture (HostedAgentFixture.cs) * Switch from per-run unique agent names to stable scenario-keyed names (it-happy-path, it-tool-calling, ...). The agent's managed identity carries the Azure AI User role on the project scope, which is required for inbound inference; deleting the agent recycles the MI and breaks that role assignment, so we keep the agent across runs and only churn versions. * Add IT_RUN_ID env var to defeat Foundry's content-addressed version dedup; otherwise a rerun just receives the existing version and Dispose deletes it. * PATCH the per-agent endpoint with AgentEndpointConfig (Responses protocol, version selector at 100% to the new version). Without this, /agents/{name}/endpoint/protocols/ openai/responses returns HTTP 400. * Build a per-agent ProjectOpenAIClient (not the cached projectClient.ProjectOpenAIClient, which is bound to the project-level URL); set AgentName in options so the URL routes through the agent endpoint, and add the Foundry-Features header to the inference pipeline. * Use Versions (which serializes to container_protocol_versions) instead of the deprecated ProtocolVersions; the server now rejects the legacy field. * On Dispose, delete only the version this fixture created. Never delete the agent. 
Tests * Tag every HostedAgentTests class with [Trait("Category", "FoundryHostedAgents")] so the CI workflow can route them to a separate Foundry project than the rest of the integration suite. CI workflow (.github/workflows/dotnet-build-and-test.yml) * Add a foundryHosting paths-filter covering Microsoft.Agents.AI.Foundry.Hosting and its in-repo dependency chain (Foundry, Agents.AI, Agents.AI.Abstractions), the test container, the test fixture, Directory.Packages.props, the build script, and this workflow file. Skip the costly hosted-agent steps when none of those changed. * Add "Build and push Foundry Hosted Agents test container" step that invokes scripts/it-build-image.ps1 against vars.IT_HOSTED_AGENT_REGISTRY and pipes the resulting IT_HOSTED_AGENT_IMAGE=<tag> into GITHUB_ENV. * Add "Run Foundry Hosted Agents Integration Tests" step that filters in only the new trait, with AZURE_AI_PROJECT_ENDPOINT/AZURE_AI_MODEL_DEPLOYMENT_NAME pointed at IT_HOSTED_AGENT_PROJECT_ENDPOINT/IT_HOSTED_AGENT_MODEL_DEPLOYMENT_NAME (Tao project, East US 2; the SK IT project's region does not yet support hosted agents preview). * Exclude the new trait from the existing "Run Integration Tests" step. * TEMP: drop the != 'pull_request' guard on the new steps and on Azure CLI Login when the paths-filter triggers, so PR #5598 can validate the wiring before promoting to merge queue only. Restore the original guard after one green PR run. Build script (scripts/it-build-image.ps1) * Hash now spans TestContainer source AND its referenced framework projects so any framework code change forces a fresh tag and a real docker push; the previous TestContainer-only hash silently reused stale images on framework edits. Bootstrap script (dotnet/tests/Foundry.Hosting.IntegrationTests/scripts/it-bootstrap-agents.ps1) * New idempotent script that creates the six stable scenario agents and grants Azure AI User on the project scope to each agent's MI. Run once per Foundry project. 
Includes AAD-graph propagation retries because newly created MIs take time to appear there. README (dotnet/tests/Foundry.Hosting.IntegrationTests/README.md) * Document the bootstrap prerequisite, the regional caveat (East US 2 is the only region we have validated; East US returned "Unsupported region" at the time of writing), the per-run image rebuild, and the CI wiring including the SP RBAC requirements. SDK pin (TEMP) * Bump Microsoft.Agents.AI.Foundry.Hosting's Azure.AI.Projects VersionOverride to 2.1.0-alpha.20260505.1 from the azure-sdk public daily feed (added to nuget.config). This release is the first that builds the per-agent inference URL as /agents/{name}/endpoint/protocols/openai (the 2.1.0-beta.1 release builds .../openai/openai/v1, which the server rejects). Revert both the feed and the override once the URL fix lands in a stable Azure.AI.Projects release. * Foundry.Hosting IntegrationTests: revert alpha SDK pin; move endpoint PATCH to bootstrap The alpha SDK pin (Azure.AI.Projects 2.1.0-alpha.20260505.1 from the azure-sdk public daily feed) was needed only for the URL routing fix and the strongly-typed AgentEndpointConfig/PatchAgentOptions wrapper. We do not need either right now: the fixture stays compatible with the public 2.1.0-beta.1 by moving the one-time endpoint PATCH to the bootstrap script (it sets version_selector to FixedRatio @latest, so each new fixture run becomes the served version automatically without a per-run PATCH from the test code). The hosted-agent invocation path will start working end-to-end once the URL routing fix lands in a stable Azure.AI.Projects release; until then the tests stay [Fact(Skip = ...)] as documented. * Revert dotnet/nuget.config: drop the azure-sdk-for-net public feed. * Revert Microsoft.Agents.AI.Foundry.Hosting.csproj VersionOverride to 2.1.0-beta.1. 
* Revert Microsoft.Agents.AI.Foundry.UnitTests and Microsoft.Agents.AI.Foundry.Hosting.UnitTests Azure.AI.Projects pin (they had been bumped to align Azure.Core 1.54 transitive). * Drop the AgentEndpointConfig PATCH block from HostedAgentFixture.cs (the type is alpha-only). Replace with a comment pointing at the bootstrap script. * Bootstrap script (it-bootstrap-agents.ps1) now also PATCHes each agent's endpoint with version_selector=@latest if not already set. Idempotent. * Foundry.Hosting IntegrationTests: drop accidentally committed filtered.slnx * Foundry.Hosting IntegrationTests: revert TEMP PR override on Azure CLI Login + IT steps The previous attempt to validate the new hosted-agent IT wiring on PR #5598 failed because the PR is from a fork (rogerbarreto/agent-framework-public). GitHub never passes environment secrets to fork PRs regardless of event-name guards on individual steps, so 'azure/login@v2' fails with 'client-id and tenant-id are not supplied'. Restore the original github.event_name != 'pull_request' guard. The new steps will execute on push to main and on merge_group runs. * Foundry.Hosting IntegrationTests: invoke build-and-push script with absolute path The pwsh shell on the GitHub Actions runner couldn't resolve ./scripts/it-build-image.ps1 when the step had no working-directory set; the step inherits the runner's PWD which is not always the repo root after preceding steps. Use github.workspace explicitly to remove the ambiguity. * Foundry.Hosting IntegrationTests: move it-build-image.ps1 inside the IT project tree The previous location at scripts/it-build-image.ps1 lived outside the sparse-checkout paths the workflow uses (.github, dotnet, python, declarative-agents), so the runner never had the file when the new step tried to invoke it. Move the script next to its sibling it-bootstrap-agents.ps1 inside the IT project tree, and anchor its relative paths to the repo root via so callers can invoke it from any PWD. 
* Move scripts/it-build-image.ps1 -> dotnet/tests/Foundry.Hosting.IntegrationTests/scripts/it-build-image.ps1 * Add Push-Location to the resolved repo root inside the script (Pop-Location in finally) so the existing relative paths (TestContainerProject, hashed src dirs) keep working no matter where the script is invoked from. * Update the workflow path filter and the step's invocation path to the new location. * Foundry.Hosting IntegrationTests: enable 5 HappyPath tests on the live Foundry endpoint The fixture already constructs ProjectOpenAIClient via the per-agent path that beta.1 supports (new ProjectOpenAIClient(uri, cred, opts { AgentName })), so no SDK pin bump is required to run the smoke tests end-to-end. Un-skip the 5 tests that pass against the live test container. Tests un-skipped (verified passing locally against tao-foundry-prj): * RunAsync_ReturnsNonEmptyTextAsync * RunStreamingAsync_YieldsAtLeastOneUpdateAsync * MultiTurn_WithPreviousResponseId_PreservesContextAsync * StoredFalse_Baseline_DoesNotPersistResponseAsync * Instructions_FromContainerDefinition_AreObeyedAsync Tests still skipped with a more specific reason (4 of 9 in HappyPath plus all ToolCalling*, McpToolbox, Toolbox, CustomStorage) because the test container does not yet emit usable response_id / conversation_id chains, and the placeholder scenarios are not implemented in the test container's Program.cs. These are test container limitations, not infra bugs, and can be un-skipped as the container surfaces stabilize. * Foundry.Hosting IntegrationTests: extract hosted IT into parallel job, add Workflows dep Address Wesley's review feedback on PR #5598: 1. Pull Foundry hosted-agent IT into its own dotnet-foundry-hosted-it job that runs in parallel to dotnet-build and dotnet-test. Same path-filter gate keeps it skipped on unrelated edits. Builds only the filtered solution containing Foundry.Hosting.IntegrationTests and src deps. dotnet-build-and-test-check now waits on it too. 2. 
Add Microsoft.Agents.AI.Workflows to the foundryHosting paths-filter and to hashedDirs in it-build-image.ps1 since Foundry.Hosting transitively depends on it. TFM constraint on the IT csproj stays at net10.0 because AgentConformance.IntegrationTests targets net10/net472 and is consumed by ~12 other IT projects on net472. --------- Co-authored-by: Roger Barreto <rbarreto@microsoft.com>
Roger Barreto ·
2026-05-06 16:08:15 +00:00 -
Python: Reduce flaky integration tests and improve CI signal quality (#5454)
* Enable Ollama integration tests in CI and rename report to Integration Test Report - Install Ollama, cache models (qwen2.5:0.5b + nomic-embed-text), and start server in the Misc integration job for both workflow files - Set OLLAMA_MODEL and OLLAMA_EMBEDDING_MODEL env vars so the 5 Ollama tests are no longer skipped - Rename Flaky Test Report to Integration Test Report throughout (job names, artifact names, cache keys, file names, script titles/docstrings) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Bump Ollama model to qwen2.5:1.5b for better instruction following The 0.5b model was too small to reliably follow simple prompts like 'Say Hello World', causing test assertion failures. The 1.5b model follows instructions more reliably while still being small enough for fast CI pulls (~1GB). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable reliable streaming integration tests Remove the hard skip on test_03_reliable_streaming tests that was temporarily disabled for instability investigation. CI infrastructure (Azurite, DTS emulator, Redis, func CLI) is already in place. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable skipped Functions/DurableTask tests and bump timeout to 480s - Remove hard skips from 4 tests in test_11_workflow_parallel.py - Remove hard skip from test_conditional_branching in test_06_dt_multi_agent_orchestration_conditionals.py - Increase pytest --timeout from 360 to 480 for Functions+DurableTask CI job - Updated in both python-merge-tests.yml and python-integration-tests.yml Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-skip failing Functions/DurableTask tests with specific root causes - test_11_workflow_parallel (4 tests): xdist worker crashes during execution - test_conditional_branching: orchestration fails with RuntimeError, not a timeout - Keep 480s timeout bump for remaining Functions tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix auth routing in samples 06/11: api_key -> credential for Azure OpenAI Both samples passed a bearer token provider via api_key= which caused the client to route to api.openai.com instead of Azure OpenAI, resulting in 401 Unauthorized. Changed to credential= which correctly triggers Azure routing and picks up AZURE_OPENAI_ENDPOINT from the environment. - samples/azure_functions/11_workflow_parallel/function_app.py: 1 fix - samples/durabletask/06_multi_agent_orchestration_conditionals/worker.py: 2 fixes - Re-enable 4 parallel workflow tests and 1 conditional branching test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-skip parallel workflow tests: xdist worker distribution issue The 4 parallel workflow tests crash because xdist worksteal distributes them across separate workers, each spawning its own func process against shared emulators. Auth fix (api_key->credential) was valid and stays. test_conditional_branching now passes with the auth fix. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix E501 line-too-long in azurefunctions parallel test skip reasons Wrap skip reason strings to stay within 120 char line limit. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add retry logic and port-conflict fix for Ollama CI setup - Kill any auto-started Ollama before launching serve (fixes port conflict: 'address already in use') - Retry ollama pull up to 3 times with 15s backoff (fixes 429 rate limit failures) - Applied to both python-merge-tests.yml and python-integration-tests.yml Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix flaky integration tests and re-enable skipped tests - Foundry agent: add allow_preview=True to custom client test - Foundry hosting: raise max_output_tokens 50->200, add temperature, relax assertion in test_temperature_and_max_tokens - Foundry embedding: update skip reason with root cause (endpoint mismatch) - OpenAI file search: fix vector store indexing race condition by polling file_counts before querying; fix get_streaming_response -> get_response(stream=True) - Azure OpenAI file search: remove skip (transient 500 resolved) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove temperature from foundry hosting test (unsupported by CI model) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Stabilize Ollama tool call integration tests with no-arg function Use a no-argument greet() function instead of hello_world(arg1) for integration tests. The 1.5B model in CI is unreliable at generating correct tool call arguments, causing 'Argument parsing failed' errors. A no-arg function eliminates this flakiness entirely. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Increase reliable streaming test timeouts from 30s to 60s The LLM call through Azure OpenAI + Redis streaming pipeline can exceed 30s in CI due to cold starts or throttling. 
Raise to 60s to reduce flaky timeouts while still bounded by pytest's 120s per-test limit. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Re-enable workflow parallel tests with xdist_group marker The tests were skipped because xdist distributes module tests across workers, each spawning their own func process (port conflicts). Adding xdist_group forces all tests in this module onto a single worker so the module-scoped function_app_for_test fixture works correctly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Revert "Re-enable workflow parallel tests with xdist_group marker" This reverts commit
455c28da62. * Rename flaky_report to integration_test_report and add try/finally cleanup - Rename scripts/flaky_report/ to scripts/integration_test_report/ to reflect expanded scope beyond flaky-test detection - Update workflow references in both CI files - Wrap file search integration tests in try/finally to ensure vector store cleanup runs even on test failure or timeout Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix Ollama pull failure propagation and Azure OpenAI vector store readiness - Ollama CI: fail the step immediately if model pull fails after 3 retries instead of silently proceeding to tests - Azure OpenAI file search: add the same vector-store readiness polling that was applied to the non-Azure OpenAI tests, preventing eventual consistency race conditions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * remove load_dotenv from test file --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giles Odigwe ·
2026-05-01 00:41:39 +00:00 -
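The Ollama setup changes in #5454 describe a bounded retry (up to 3 pulls, 15s apart) that fails the step once the retries are exhausted. The real logic lives in the workflow YAML, which is not shown here; the following is only a Python sketch of the same pattern. The `retry_with_backoff` helper and its parameters are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


def retry_with_backoff(
    action: Callable[[], bool],
    attempts: int = 3,
    delay_seconds: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `action` until it returns True and report how many attempts were used.

    Hypothetical helper: raises instead of returning silently when every
    attempt fails, so a caller cannot proceed as if the pull had succeeded.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        # Only wait when another attempt is still coming.
        if attempt < attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"action failed after {attempts} attempts")
```

Injecting `sleep` keeps the helper testable without real waiting; in a CI script the default `time.sleep` would be used.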
Python: Update hosting agent samples + fixes (#5485)
* Update foundry hosting samples * Add file data type support * Fix file content and add more tests * Fix README * Address comments * Fix int tests * remove temp
Tao Chen ·
2026-04-28 04:24:05 +00:00 -
Propagate integration-test model credentials to issue-triage repro (#5443)
Scopes the triage job to the integration GitHub Environment, adds the azure/login OIDC step, and exposes the same OpenAI / Azure OpenAI / Foundry / Anthropic env vars the integration test workflow uses. This lets the triage agent write repro code that constructs model clients from the environment without any secrets entering the agent prompt or generated-code literals. Azure OpenAI and Foundry continue to authenticate via AAD (DefaultAzureCredential), so there is no API key to leak for those providers.
Evan Mattson ·
2026-04-23 21:01:24 +09:00 -
Automated issue triage workflow (#5419)
* Automated issue triage workflow * Bump dependencies * Fix issue-triage workflow: security, reliability, and testability Address six review comments on the issue-triage workflow: 1. Change trigger from issues:opened to issues:labeled so the secret-backed triage flow is only triggered by a maintainer- controlled signal. 2. Include inputs.issue_number in the concurrency group so workflow_dispatch runs for the same issue are properly de-duplicated. 3. Improve team membership error handling to fail closed: verify the team exists before checking membership, and only treat a 404 as 'not a member' (all other errors fail the job). 4. Use optional chaining (issue.user?.login) for the API-fetched issue to handle deleted GitHub accounts without crashing. 5. Extract the inline github-script into a testable module at .github/scripts/check_team_membership.js with 10 tests in .github/tests/test_check_team_membership.js covering all code paths (payload/API author resolution, deleted accounts, team lookup failure, 404 vs non-404 membership errors). 6. Make the spam gate actually stop the job by exiting non-zero instead of just logging, so future steps cannot accidentally run for spam issues. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make issue-triage workflow manually triggered only for initial testing Remove the 'issues' event trigger, keeping only 'workflow_dispatch' so the workflow can be tested manually before enabling automatic triggers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Evan Mattson ·
2026-04-23 20:22:04 +09:00 -
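Item 3 of #5419 describes a fail-closed membership check: confirm the team exists first, treat only a 404 on the membership lookup as "not a member", and let every other error fail the job. The actual module is JavaScript (`.github/scripts/check_team_membership.js`) and is not reproduced here; the sketch below restates the control flow in Python. `ApiError`, `get_team`, and `get_membership` are hypothetical stand-ins for whatever client the real script uses.

```python
from typing import Callable


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"API request failed with status {status}")
        self.status = status


def is_team_member(
    get_team: Callable[[str], object],
    get_membership: Callable[[str, str], object],
    team: str,
    user: str,
) -> bool:
    # Verify the team exists; any error here (including 404) propagates,
    # so a misconfigured team name fails the job instead of passing.
    get_team(team)
    try:
        get_membership(team, user)
    except ApiError as err:
        # Only a 404 on the membership lookup means "not a member".
        if err.status == 404:
            return False
        raise
    return True
```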
Evan Mattson ·
2026-04-23 13:24:21 +09:00 -
Evan Mattson ·
2026-04-23 08:23:56 +09:00 -
Python: Flaky test report (#5342)
* Add flaky test trend reporting to CI workflows Parse JUnit XML (pytest.xml) from each integration test job and aggregate results into a markdown trend report showing per-test pass/fail/skip status across the last 5 runs. Changes: - Add python/scripts/flaky_report/ package (JUnit XML parser + trend report generator following the sample_validation pattern) - Add upload-artifact steps to all 6 integration test jobs in both python-merge-tests.yml and python-integration-tests.yml - Add python-flaky-test-report aggregation job with history caching - Add --junitxml=pytest.xml to integration-tests.yml jobs (already present in merge-tests.yml) - Fix Cosmos job --junitxml path (use absolute path since uv run --directory changes cwd) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix flaky report: handle missing test results gracefully - Guard against missing reports directory in load_current_run() - Only run report job when at least one integration test job completed (skip when all jobs are skipped, e.g. 
on pull_request events) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR review: fix provider names and if-expression precedence - Use explicit provider name mapping in _derive_provider() so OpenAI renders correctly instead of 'Openai' - Fix operator precedence in workflow if-expressions by wrapping success/failure checks in parentheses Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add File column and xfail detection to flaky test report - Add File column showing module name (e.g., test_openai_chat_client) to disambiguate tests with the same function name across files - Detect pytest xfail tests in JUnit XML (type=pytest.xfail) and show them with a distinct warning emoji instead of skip emoji - Update legend to include xfail explanation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add Foundry embedding env vars to merge-tests workflow Sync the Foundry integration job in python-merge-tests.yml with python-integration-tests.yml by adding FOUNDRY_MODELS_ENDPOINT, FOUNDRY_MODELS_API_KEY, FOUNDRY_EMBEDDING_MODEL, and FOUNDRY_IMAGE_EMBEDDING_MODEL. Once the repo variables/secrets are configured, the embedding integration test will run in CI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix File column showing class name instead of module name When a test is inside a class, pytest writes the classname as e.g. 'pkg.test_file.TestClass'. The previous rsplit logic extracted 'TestClass' instead of 'test_file'. Now detect uppercase-starting segments as class names and use the preceding segment instead. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR review: UTC timestamps, XML error handling, summary fix, docstring - Use datetime.now(timezone.utc) for accurate UTC timestamps - Catch ET.ParseError per-file so corrupt XML doesn't crash the report - Remove separate 'error' key from summary (errors folded into 'failed') - Fix _short_name docstring to show actual dotted classname::name format Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giles Odigwe ·
2026-04-22 20:16:50 +00:00 -
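Several fixes in #5342 concern how JUnit XML is read: deriving the File column from a dotted classname such as 'pkg.test_file.TestClass', detecting xfail via `type=pytest.xfail`, folding errors into failures, and tolerating corrupt XML. The sketch below is one possible reading of those rules, not the code in `python/scripts/flaky_report/`; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def module_from_classname(classname: str) -> str:
    """Return the module segment, skipping trailing class-like segments."""
    for part in reversed(classname.split(".")):
        # Segments starting with an uppercase letter are treated as class names.
        if part and not part[0].isupper():
            return part
    return classname


def classify(testcase: ET.Element) -> str:
    """Map a <testcase> element to passed / failed / skipped / xfail."""
    if testcase.find("failure") is not None or testcase.find("error") is not None:
        return "failed"  # errors are folded into "failed"
    skipped = testcase.find("skipped")
    if skipped is not None:
        return "xfail" if skipped.get("type") == "pytest.xfail" else "skipped"
    return "passed"


def parse_junit(xml_text: str) -> dict[tuple[str, str], str]:
    """Parse one report; a corrupt file yields an empty result instead of raising."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return {}
    return {
        (module_from_classname(tc.get("classname", "")), tc.get("name", "")): classify(tc)
        for tc in root.iter("testcase")
    }
```

Keying results by (module, test name) is what lets the report disambiguate tests that share a function name across files.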
Add pr review GH workflow (#5418)
* Add workflow PR review * Allow reviews on draft PRs * Update .github/workflows/devflow-pr-review.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update .github/workflows/devflow-pr-review.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Bump actions/checkout to v6 and uv to 0.11.x --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Evan Mattson ·
2026-04-22 13:52:42 +09:00 -
Python: Add Hyperlight CodeAct package and docs (#5185)
* initial work on code_mode * updated samples * updates to codeact * udpated codeact * Draft CodeAct ADR and sample updates Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * initial implementation and adr and feature * Python: Limit Hyperlight wasm backend to Python <3.14 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Fix CI for Hyperlight CodeAct PR Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Run Hyperlight integration when available Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Address Hyperlight review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Simplify Hyperlight file mount inputs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Accept Path host paths in Hyperlight mounts Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Fix Hyperlight mount typing for CI Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * temp run integration test * Python: Strengthen Hyperlight real sandbox tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * added additional tests * Python: Simplify Hyperlight CodeAct API Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * set tests as non-integration * Retry Hyperlight allowed-domain registration Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Gate Hyperlight integration tests by runtime support Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix Hyperlight skip test on Python 3.14 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Delay Hyperlight runtime probe until test execution Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Relax Hyperlight Windows integration stdout assertion Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Scan Hyperlight 
output directory for artifacts Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Retry Hyperlight output artifact collection Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Harden Hyperlight integration output assertions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Retry Hyperlight read-back check in integration test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Simplify Hyperlight integration write assertion Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Avoid pathlib in Hyperlight integration sandbox Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Use socket network check in Hyperlight sandbox Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Replace blocked Azure AI Search blog link Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Clarify Hyperlight guest stdlib limits Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Use _socket in Hyperlight integration sandbox Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Handle Hyperlight mounted file paths Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Broaden Hyperlight sandbox path fallbacks Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Search Hyperlight guest mounts recursively Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Split Hyperlight mount coverage Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Split Hyperlight live network tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix Hyperlight file-write test on Windows Enable the sandbox filesystem by providing a workspace_root so /output is mounted. Remove os.path.exists assertion (unsupported in WASM guest) and fix Content data assertion to use .uri. 
Skip the network integration test on Windows where the WASM sandbox lacks the encodings.idna codec. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR review: ADR intro, manual wiring sample, doc clarifications - Add CodeAct introduction section to ADR for unfamiliar readers - Clarify 'less runtime efficient' con with specific overhead description - Add note in Python impl doc clarifying ADR vs impl doc split - Explain why before_run hooks must be per-run (CRUD, concurrency, approval) - Rename code_interpreter variable to codeact in E2E sample - Add manual static wiring sample (codeact_manual_wiring.py) - Add 'when to use which pattern' guidance to samples README Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR #5185 review comments and add .NET CodeAct design doc - Fix async callback: _make_sandbox_callback returns sync wrapper with thread + asyncio.run() bridge (was broken with real Wasm FFI) - Fix stale output: clear output_dir before each sandbox.run() call - Fix blocking event loop: _run_code now async with asyncio.to_thread() - Revert _agents.py options['tools'] injection (unnecessary; provider uses context.extend_tools()) - Revert SessionContext.options docstring back to read-only - Add real-sandbox test fixtures (shared/restored/fresh) - Add 8 new real-sandbox tests for callback round-trip, stale output, event loop non-blocking, basic execution, stdout/stderr, errors, snapshot/restore, and tool registration - Add comprehensive .NET HyperlightCodeActProvider design document Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update hyperlight README with code snippets and remove Public API section Replace bare export list with Quick Start code examples covering the context provider, standalone tool, manual static wiring, and file mounts / network access patterns. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-04-17 00:49:44 +00:00 -
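The review fixes in #5185 mention that `_make_sandbox_callback` returns a sync wrapper using a "thread + asyncio.run() bridge". The package source is not shown here, so the following is only a minimal sketch of that general technique; `make_sync_callback` is a hypothetical name and the real implementation may differ in error handling and threading details.

```python
import asyncio
import threading
from typing import Any, Callable, Coroutine


def make_sync_callback(
    async_fn: Callable[..., Coroutine[Any, Any, Any]],
) -> Callable[..., Any]:
    """Wrap an async function so synchronous code (e.g. an FFI callback) can call it."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        outcome: dict[str, Any] = {}

        def runner() -> None:
            # A fresh thread has no running event loop, so asyncio.run() is safe
            # here even if the caller is itself inside an event loop.
            try:
                outcome["value"] = asyncio.run(async_fn(*args, **kwargs))
            except BaseException as exc:
                outcome["error"] = exc

        thread = threading.Thread(target=runner)
        thread.start()
        thread.join()  # block the synchronous caller until the coroutine finishes
        if "error" in outcome:
            raise outcome["error"]
        return outcome["value"]

    return wrapper
```

Because the wrapper blocks its caller, it is best invoked from a worker thread; that is consistent with the same commit moving `_run_code` onto `asyncio.to_thread()` so the event loop is not blocked.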
.NET: Foundry Evals integration for .NET (#4914)
* Foundry Evals integration for .NET - Core evaluation framework: EvalItem, LocalEvaluator, FunctionEvaluator, EvalChecks - IAgentEvaluator interface with MeaiEvaluatorAdapter bridge - AgentEvaluationExtensions for agent.EvaluateAsync() overloads - FoundryEvals wrapping MEAI quality/safety evaluators - ConversationSplitters (LastTurn, Full) and IConversationSplitter - EvalItem.PerTurnItems() for multi-turn decomposition - HasImageContent for multimodal content detection - WorkflowEvaluationExtensions for per-agent workflow evaluation - 7 eval samples mirroring Python parity: 02-agents/Evaluation: SimpleEval, ExpectedOutputs, Multimodal 03-workflows/Evaluation: WorkflowEval 05-end-to-end/Evaluation: FoundryQuality, MixedProviders, ConversationSplits - Comprehensive unit tests (1958 passing) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Rewrite FoundryEvals to use real Foundry Evals API Replace MEAI evaluator shim with actual OpenAI EvaluationClient protocol methods. FoundryEvals now creates eval definitions, submits runs, polls for completion, and fetches per-item results server-side. 
- New constructor: FoundryEvals(AIProjectClient, model, evaluators) - Add FoundryEvalConverter for MEAI ChatMessage -> Foundry JSON format - Add EvalId, RunId, ReportUrl to AgentEvaluationResults - All 20 built-in evaluator constants now work (agent, tool, quality, safety) - Remove Microsoft.Extensions.AI.Evaluation.Quality/Safety dependencies - Update all samples for new constructor (no more ChatConfiguration) - Replace BuildEvaluators tests with ResolveEvaluator tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add response output to CustomEvals and ExpectedOutputs samples Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review: pagination, validation, error handling, tests FoundryEvals fixes: - Add pagination for output items (has_more/after cursor) - Add guard clauses for pollIntervalSeconds/timeoutSeconds <= 0 - Fix double TryGetProperty for passed field parsing - Throw on all-tool-evaluators with no tool definitions - Fix XML doc (default 300s, not 180s) New tests (30 added, 1989 total): - EvalChecks: NonEmpty, ContainsExpected (pass/fail/skip/case), HasImageContent, ToolCallsPresent - FoundryEvalConverter: ConvertMessage (text, image, function call, function results fan-out, empty fallback, mixed content), ConvertEvalItem, BuildTestingCriteria (quality/agent/tool/groundedness data mappings), BuildItemSchema Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix review: null-refs, Data.ToString() bug, ContainsExpected, add tests - Fix NullReferenceException in sample Response display (pattern matching) - Fix WorkflowEvaluationExtensions Data?.ToString() producing type names instead of message text (pattern-match ChatMessage/AgentResponse/list) - Change EvalChecks.ContainsExpected to return Passed=false when no ExpectedOutput (was silently passing, masking misconfiguration) - Add EvalItem constructor tests with LastTurn/Full/null splitters - Add FoundryEvalConverter.ConvertMessage 
DataContent (base64 image) test - Add ExtractAgentData tests with ChatMessage, list, and AgentResponse data Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix review: conversation fidelity, eval caching, fallback tests - WorkflowEvaluationExtensions: preserve full response messages (tool calls, intermediate) instead of synthetic 2-message conversation. Cast completed Data to AgentResponse and use Messages when available, fallback to text. - FoundryEvals: cache evalId per schema shape (hasContext, hasTools) so subsequent EvaluateAsync calls create runs under the same eval definition. - MeaiEvaluatorAdapter: code already correctly passes queryMessages (not full conversation) to IEvaluator — no change needed, verified by inspection. - Add tests: AgentResponse full messages preservation, unknown object ToString() fallback for ExtractAgentData. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Rename AzureAI→Foundry: move eval files, update references - Move FoundryEvals.cs and FoundryEvalConverter.cs from Microsoft.Agents.AI.AzureAI to Microsoft.Agents.AI.Foundry - Update namespace from AzureAI to Foundry in both files - Add explicit usings required by Foundry project (no implicit usings) - Move FoundryEvalConverter tests to Foundry.UnitTests project (avoids ReplacingRedactor type conflict from dual project refs) - Update all sample csproj references and using statements - Remove Foundry project reference from AI UnitTests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * PR review round 4: wire up tool extraction, remove eval cache, fix null safety - BuildEvalItem: extract tools from agent via GetService<ChatOptions>() into EvalItem.Tools (Python parity) - FoundryEvals: remove eval ID cache - each call creates fresh definition (matches Python behavior) - FoundryEvals: replace null-forgiving operators with descriptive InvalidOperationException - MixedProviders sample: remove unnecessary explicit 
PackageReferences (transitively provided) - FoundryEvalConverter: document that tool results take precedence over text content - Add LocalEvaluator zero-checks test documenting 0 metrics = failed behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python-dotnet parity: 9 feature gaps filled New checks: - ToolCallArgsMatch() — verify tool call names + argument subset match - ToolCalledCheck(ToolCalledMode.Any, ...) — match any of the specified tools - ToolCalledMode enum (All/Any) FoundryEvals enhancements: - Default evaluators now [Relevance, Coherence, TaskAdherence] (was Relevance, Coherence) - Auto-add ToolCallAccuracy when items have tool definitions - EvaluateTracesAsync — evaluate by response_ids, trace_ids, or agent_id - EvaluateFoundryTargetAsync — evaluate deployed Foundry targets Result type enrichment: - AgentEvaluationResults: added Status, Error, PerEvaluator, DetailedItems - New EvalItemResult/EvalScoreResult/PerEvaluatorResult types - FoundryEvals populates all new fields from API responses Workflow fix: - Skip internal executors (_*, input-conversation, end-conversation, end) Tests: 8 new tests covering ToolCallArgsMatch, ToolCalledMode.Any, internal executor filtering Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add MeaiEvaluatorAdapter and PerTurnItems edge case tests - 3 tests for MeaiEvaluatorAdapter: query message forwarding, synthetic response fallback, multiple items aggregation - 3 tests for EvalItem.PerTurnItems: empty conversation, no user messages, system+assistant only - StubEvaluator and StubChatClient test helpers Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Blocking link check for outdated package in DevUI. * Replace Dictionary<string, object> payloads with typed wire models Introduce internal FoundryEvalWireModels.cs with compile-time-safe types for the OpenAI Evals API wire format. 
The OpenAI .NET SDK (2.9.1) only provides protocol-level methods with BinaryContent/ClientResult — no typed request models. These internal models replace scattered dictionary literals with [JsonPropertyName]-annotated classes, giving: - Compile-time safety (typos become build errors) - Single point of change when the API evolves - IntelliSense discoverability - Cleaner serialization via JsonPolymorphic for content items Models: WireContentItem hierarchy (text, image, tool_call, tool_result), WireMessage, WireEvalItemPayload, WireTestingCriterion, WireItemSchema, WireCreateEvalRequest, WireCreateRunRequest, and data source variants. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Skip metric when Foundry returns neither score nor passed When an evaluator returns no score and no passed value, the previous code created BooleanMetric(name, false), which falsely failed items via ItemPassed. Now we skip the MEAI metric entirely for indeterminate results — the raw data remains available in DetailedItems for diagnostics. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR #4914 review comments: fix tool evaluator bug and add tests - Fix duplicate ToolCallAccuracy: resolve evaluator names before checking against ToolEvaluators set (Comment 2) - Make FilterToolEvaluators internal for testability; add tests for the ArgumentException edge case when all evaluators are tool-type (Comment 3) - Add CancellationToken test for LocalEvaluator (Comment 4) - Add EvaluateAsync integration test on Run with sequential workflow and per-agent SubResults verification (Comment 5) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address Peter's review comments on PR #4914 - Add trailing newline to Evaluation_FoundryQuality.csproj (Comment 6) - Make evaluator name lookups case-insensitive: switch BuiltinEvaluators, ToolEvaluators, AgentEvaluators, and ResolveEvaluator's StartsWith check from Ordinal to OrdinalIgnoreCase (Comment 7) - Add Trace.TraceWarning when Foundry returns fewer results than submitted items, indicating expected vs actual count before padding (Comment 8) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add Microsoft.Extensions.AI.Evaluation packages to Directory.Packages.props These were removed in #5269 as unused, but are needed by the Foundry and core evaluation integration added in this PR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: alliscode <bentho@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ben Thomas ·
2026-04-16 19:40:07 +00:00 -
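One of the review fixes in #4914 adds pagination for output items using a `has_more` flag and an `after` cursor. The C# implementation is not shown here; the sketch below illustrates the cursor loop in Python for consistency with the other examples on this page. `fetch_page` and the page shape (`data`, `has_more`, item `id`) are assumptions for illustration, not the Foundry wire format.

```python
from typing import Any, Callable, Optional


def fetch_all_items(
    fetch_page: Callable[[Optional[str]], dict[str, Any]],
) -> list[dict[str, Any]]:
    """Collect every item by following the cursor until has_more is false."""
    items: list[dict[str, Any]] = []
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        data = page.get("data", [])
        items.extend(data)
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not data:
            return items
        # Assumed convention: the cursor is the id of the last item received.
        after = data[-1]["id"]
```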
Python: bump misc-integration retry delay to 30s (#5293)
The misc-integration job (Anthropic, Ollama, MCP) frequently fails on merge to main when the upstream MCP server (e.g. learn.microsoft.com/api/mcp) returns a transient rate-limit error. The previous 5s retry delay is too short to ride out the upstream backoff window, so all retries fail and the merge queue is blocked. Bumping to 30s gives the upstream a chance to recover before pytest-retry re-runs the test.
Evan Mattson ·
2026-04-16 10:03:00 +09:00 -
westey ·
2026-04-13 11:00:31 +00:00 -
Python: Stop emitting duplicate reasoning content from OpenAI `response.reasoning_text.done` and `response.reasoning_summary_text.done` events (#5162)
* Fix reasoning text done events duplicating streamed delta content (#5157) The OpenAI Responses API sends both reasoning_text.delta (incremental chunks) and reasoning_text.done (full accumulated text) events. The chat client was emitting Content for both, causing ag-ui to append the full done text onto already-accumulated delta text, producing duplicated reasoning output. Stop emitting Content for reasoning_text.done and reasoning_summary_text.done events, matching how output_text.done is already handled (not emitted). The deltas contain all the content; the done event is redundant. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(openai): emit reasoning done content as fallback when no deltas observed (#5157) Address PR review feedback: - Track item_ids that received reasoning deltas via seen_reasoning_delta_item_ids set - Emit content from done events only when no deltas were received for the item_id, preventing silent content loss on stream resumption - Add comment documenting code_interpreter done event asymmetry - Replace redundant ag-ui test with deduplication-focused test - Add integration test for delta+done sequence in OpenAI chat client tests - Add fallback path tests for done events without preceding deltas Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback for #5157: Python: [Bug]: "type": "response.reasoning_text.delta" and "response.reasoning_text.done" both get exposed as "text_reasoning" * Fix AG-UI reasoning streaming to use proper Start/End pattern (#5157) _emit_text_reasoning now follows the same streaming pattern as _emit_text: - Emits ReasoningStartEvent/ReasoningMessageStartEvent only on the first delta for a given message_id - Emits only ReasoningMessageContentEvent for subsequent deltas - Defers ReasoningMessageEndEvent/ReasoningEndEvent until _close_reasoning_block is called (on content type
switch or end-of-run) This produces the correct protocol pattern: ReasoningStartEvent ReasoningMessageStartEvent ReasoningMessageContentEvent(delta1) ReasoningMessageContentEvent(delta2) ReasoningMessageEndEvent ReasoningEndEvent Instead of wrapping every delta in a full Start→End sequence. Backward compatibility is preserved: calling _emit_text_reasoning without a flow argument still produces the full sequence per call. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix import ordering lint error in AG-UI test file (#5157) Move inline import of TextMessageContentEvent to the top-level import block and ensure alphabetical ordering to satisfy ruff I001 rule. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix mypy error: rename loop variable to avoid type conflict with WorkflowEvent The 'event' variable was already typed as WorkflowEvent[Any] from the async for loop at line 590. Reusing it in the _close_reasoning_block loop (which returns list[BaseEvent]) caused an incompatible assignment error. Renamed to 'reasoning_evt' to avoid the conflict. Fixes #5162 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback for #5157: review comment fixes * narrow test result reporting to explicit pytest JUnit XML * Fix test args * Fix pytest-results-action in merge workflow and remove committed test artifacts Apply the same JUnit XML fix from python-tests.yml to python-merge-tests.yml: add --junitxml=pytest.xml to all test commands and narrow the results action path from ./python/**.xml to ./python/pytest.xml. Also remove accidentally committed pytest.xml and python-coverage.xml and add them to .gitignore. --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Evan Mattson ·
2026-04-09 22:44:59 +00:00 -
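The delta/done deduplication described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the function name `dedupe_reasoning_events` and the simplified event dictionaries are assumptions; only the `seen_reasoning_delta_item_ids` set and the event type names come from the commit message.

```python
from typing import Iterable, Iterator


def dedupe_reasoning_events(events: Iterable[dict]) -> Iterator[str]:
    """Yield reasoning text once per item, preferring streamed deltas.

    Hypothetical helper: the event dicts are a simplified stand-in for
    the streamed Responses API events named in the commit message.
    """
    # Item ids for which at least one delta has already been emitted.
    seen_reasoning_delta_item_ids: set[str] = set()
    for event in events:
        kind = event["type"]
        item_id = event["item_id"]
        if kind == "response.reasoning_text.delta":
            seen_reasoning_delta_item_ids.add(item_id)
            yield event["delta"]
        elif kind == "response.reasoning_text.done":
            # Fallback only: emit the full text when no deltas arrived
            # for this item (for example after a stream resumption).
            if item_id not in seen_reasoning_delta_item_ids:
                yield event["text"]


stream = [
    {"type": "response.reasoning_text.delta", "item_id": "r1", "delta": "Think"},
    {"type": "response.reasoning_text.delta", "item_id": "r1", "delta": "ing"},
    {"type": "response.reasoning_text.done", "item_id": "r1", "text": "Thinking"},
    {"type": "response.reasoning_text.done", "item_id": "r2", "text": "Resumed"},
]
chunks = list(dedupe_reasoning_events(stream))
```

Here the `done` event for `r1` is dropped because its deltas were already emitted, while the `done` event for `r2` is kept as the only source of that item's text.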
westey ·
2026-04-09 16:43:54 +00:00 -
.NET: Improve resilience of verify-samples by building separately and improving evaluation instructions (#5151)
* Improve resilience of verify-samples by building separately and improving evaluation instructions * Address PR comments * Address PR comment
westey ·
2026-04-09 11:25:00 +00:00 -
.NET: Add github actions workflow for verify-samples (#5034)
* Add github actions workflow for verify-samples * Make workflow run as part of PR (for now) * Update workflow to remove pr trigger * Address PR comments
westey ·
2026-04-03 09:58:24 +00:00 -
Python: [BREAKING] Python: move Azure AI embeddings to Foundry (#5056)
* renamed AzureAIINferenceEmbeddings and lazy load azure-cosmos and env var rename * updated coverage * fix readme
Eduard van Valkenburg ·
2026-04-02 11:26:35 +00:00 -
Python: Move workflow-samples and agent-samples under declarative-agents directory (#5011)
* Move workflow-samples and agent-samples under declarative-agents and update all references Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/f70f7d19-9256-4eec-b7db-28007d74440c Co-authored-by: sphenry <6749825+sphenry@users.noreply.github.com> * Fix relative paths in README files inside moved directories Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/f70f7d19-9256-4eec-b7db-28007d74440c Co-authored-by: sphenry <6749825+sphenry@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: sphenry <6749825+sphenry@users.noreply.github.com> Co-authored-by: Shawn Henry <shahen@microsoft.com>
Copilot ·
2026-04-02 09:34:33 +00:00 -
Python: Fix SK migration samples (#5047)
* Fix SK migration samples * Fix env vars for SK * Hard code model for shell tool samples
Tao Chen ·
2026-04-02 08:40:34 +00:00 -
Python: [BREAKING] Standardize model selection on model (#4999)
* Refactor Anthropic model option and provider clients Rename the Anthropic client model option from model_id to model, add provider-specific Anthropic wrappers for Foundry, Bedrock, and Vertex, and expose them through the Anthropic, Foundry, Amazon, and Google namespaces. Update core option handling, docs, samples, and tests accordingly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix Anthropic skills sample typing Cast the Anthropic beta client to Any in the skills sample so the pre-commit sample pyright check no longer fails on beta skills and files endpoints that are not exposed by the current SDK stubs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * undo sample mypy * Retry CI after transient external failures Retrigger PR validation after an unrelated Copilot review workflow SAML failure and a transient external tau2 git fetch failure in the Windows Python test setup. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback on model option merging Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address Anthropic compatibility review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * moved all to `model` * fixes for azure ai search * Python: standardize remaining sample env var names Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: fix foundry-local pyright compatibility Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * updated env vars in cicd --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-04-01 19:00:18 +00:00 -
Python: Enforce Foundry package unit test coverage (#5036)
* Enforce Foundry package unit test coverage * Sort ENFORCED_TARGETS alphabetically in python-check-coverage.py Agent-Logs-Url: https://github.com/microsoft/agent-framework/sessions/ed0b81ed-c267-4ee0-9655-56c4b3066fad Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: TaoChenOSU <12570346+TaoChenOSU@users.noreply.github.com>
Tao Chen ·
2026-04-01 17:37:27 +00:00 -
Python: [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces (#4990)
* [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces Also clean up follow-on docs, environment guidance, package metadata, and lab test stability. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix deleted semantic-kernel sample links Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * improve foundry language * Fix A2A Foundry sample regression Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-03-31 20:36:21 +00:00 -
Python: Fix samples (#4980)
* First samples 1st batch * Fix sample paths * Fix workflow samples * Fix workflow dependency * Correct env vars * Increase idle timeout * Fix workflows HIL sample * Fix more workflow samples
Tao Chen ·
2026-03-31 15:20:35 +00:00 -
Python: [BREAKING] Remove deprecated kwargs compatibility paths (#4858)
* [BREAKING] Remove deprecated kwargs compatibility paths Remove the deprecated kwargs compatibility shims across core agents, clients, tools, middleware, and telemetry. Keep workflow kwargs behavior intact in this branch and follow up separately in #4850. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix PR CI fallout for kwargs removal Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * updates * Fix Azure AI CI fallout Remove the stale _get_current_conversation_id override from the Azure AI client after the OpenAI base helper was deleted. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fixed new classes * Fix Assistants deprecated import gating Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix integration replay regressions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Switch multi-agent hosting samples to Azure chat completions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Simplify Azure multi-agent sample config Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-03-27 21:00:12 +00:00 -
[BREAKING] Python: fix OpenAI Azure routing and provider samples (#4925)
* Python: fix OpenAI Azure routing and provider samples Prefer OpenAI when OPENAI_API_KEY is present unless Azure is explicitly requested. Clarify constructor docs, keep deprecated Azure wrappers compatible with stricter settings validation, and refresh the provider samples and tests to use the current client patterns. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix bandit * Python: align OpenAI embedding Azure routing Extend the shared OpenAI-vs-Azure routing and credential behavior to the embedding client, add Azure embedding regression coverage, and refresh the embedding samples to use the generic client path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: fix embedding client pyright check Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: thin OpenAI embedding wrapper Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: document embedding overload routing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: fix callable OpenAI key routing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: fix Azure credential routing tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: address OpenAI review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: narrow Azure routing markers Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: refine OpenAI model fallback order Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: narrow Azure deployment docs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: remove embedding routing wording Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: run embedding Azure integration tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * changed variable name * Python: expand OpenAI 
package README Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * clarified readme * Python: fix Azure OpenAI integration setup Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: correct Azure integration env mapping Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * updated code to fix int tests * test updates * test fix * fix test setup * updates to tests and setup * remove openai assistants int tests * improvements in int tests * fix env var * fix env vars * fix azure responses test * trigger actions --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-03-27 13:33:39 +00:00 -
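The routing rule stated in the commit above ("prefer OpenAI when OPENAI_API_KEY is present unless Azure is explicitly requested") might look like the sketch below. The function name `choose_backend`, the `azure_requested` flag, and the fallback to Azure when no key is set are all assumptions for illustration; the actual client logic is not shown in this log.

```python
from typing import Mapping


def choose_backend(env: Mapping[str, str], azure_requested: bool = False) -> str:
    """Pick a backend label from environment settings (hypothetical helper)."""
    if azure_requested:
        # An explicit Azure request wins, even if an OpenAI key is present.
        return "azure"
    if env.get("OPENAI_API_KEY"):
        # An OpenAI key is present and Azure was not requested.
        return "openai"
    # Assumption: with no OpenAI key, fall back to Azure settings.
    return "azure"


default_choice = choose_backend({"OPENAI_API_KEY": "sk-example"})
explicit_azure = choose_backend({"OPENAI_API_KEY": "sk-example"}, azure_requested=True)
```

Passing the environment as a mapping rather than reading `os.environ` directly keeps the rule easy to test in isolation.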
Python: [BREAKING] Python: Provider-leading client design & OpenAI package extraction (#4818)
* Python: Provider-leading client design & OpenAI package extraction Major refactoring of the Python Agent Framework client architecture: - Extract OpenAI clients into new `agent-framework-openai` package - Core package no longer depends on openai, azure-identity, azure-ai-projects - Rename clients for discoverability: OpenAIResponsesClient → OpenAIChatClient, OpenAIChatClient → OpenAIChatCompletionClient - Unify `model_id`/`deployment_name`/`model_deployment_name` → `model` param - New FoundryChatClient for Azure AI Foundry Responses API - New FoundryAgent/FoundryAgentClient for connecting to pre-configured Foundry agents - Remove OpenAIBase/OpenAIConfigMixin from non-deprecated client MRO - Deprecate AzureOpenAI* clients, AzureAIClient, OpenAIAssistantsClient - Reorganize samples: azure_openai+azure_ai+azure_ai_agent → azure/ - ADR-0020: Provider-Leading Client Design Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: missing Agent imports in samples, .model_id → .model in foundry_local sample Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: CI failures — mypy errors, coverage targets, sample imports - azure-ai mypy: add type ignores for TypedDict total=, model arg, forward ref - Coverage: replace core.azure/openai targets with openai package target - project_provider: add type annotation for opts dict Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: populate openai .pyi stub, fix broken README links, coverage targets Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fixes * updated observabilitty * reset azure init.pyi * fix errors * updated adr number * fix foundry local * fixed not renamed docstrings and comments, and added deprecated markers to old classes * fix tests and pyprojects * fix test vars * updated function tests * update durable * updated test setup for functions * Fix Foundry auth in workflow samples Co-authored-by: Copilot 
<223556219+Copilot@users.noreply.github.com> * Stabilize Python integration workflows Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update hosting samples for Foundry Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Trigger full CI rerun Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Trigger CI rerun again Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * trigger rerun * trigger rerun * fix for litellm * undo durabletask changes * Move Foundry APIs into foundry namespace Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix Foundry pyproject formatting Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Split provider samples by Foundry surface Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Restore hosting sample requirements Also fix the Foundry Local sample link after the provider sample move. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * updated tests * udpated foundry integration tests * removed dist from azurefunctions tests * Use separate Foundry clients for concurrent agents Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix client setup in azfunc and durable * disabled two tests * updated setup for some function and durable tests * improved azure openai setup with new clients * ignore deprecated * fixes * skip 11 * remove openai assistants int tests --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-03-25 09:56:29 +00:00 -
Python: Update sample validation scripts (#4870)
* Update sample validation scripts * Adjust prompt * Update autogen-migration samples * Add fix suggestion * Split jobs * Add .env * Create trend report * Add timestamp * Add more env vars * Comments * force node24 * force node24 * force node22
Tao Chen ·
2026-03-25 01:21:32 +00:00 -
Bump actions/download-artifact from 7 to 8 (#4372)
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 7 to 8. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/v7...v8) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-version: '8' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
dependabot[bot] ·
2026-03-23 21:55:19 +00:00 -
Update script to ping only on `waiting-for-author` label (#4812)
* update script to ping only on certain waiting for author label * Update .github/scripts/stale_issue_pr_ping.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update .github/scripts/stale_issue_pr_ping.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix docstring --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Evan Mattson ·
2026-03-20 19:39:22 +09:00 -
Add automated stale issue and PR follow-up ping workflow (#4776)
* Add script to ping on stale issues/PRs * Add script to ping on stale issues/PRs * Fix stale issue/PR ping script review comments - Rename TEAM_NAME env var to TEAM_SLUG for clarity - Add actionable error messages for 403/404 team lookup failures - Add contents:read permission for actions/checkout - Use github.event.inputs context with fallback for scheduled runs - Pin PyGithub to 2.6.0 for reproducible builds - Fetch comments once in should_ping() to reduce API calls - Make ping() retry loop idempotent (track comment/label state) - Validate DAYS_THRESHOLD with helpful error for non-numeric input - Fix timezone bug: use astimezone() instead of replace(tzinfo=) - Add comprehensive unit tests (29 tests) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Evan Mattson ·
2026-03-20 00:41:31 +00:00 -
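The timezone fix mentioned in the commit above (use `astimezone()` instead of `replace(tzinfo=)`) rests on a general property of Python's `datetime`, sketched here with made-up timestamps; the script's real variables are not shown in this log.

```python
from datetime import datetime, timedelta, timezone

# A timezone-aware timestamp at UTC+09:00 (illustrative value only).
aware = datetime(2026, 3, 20, 9, 0, tzinfo=timezone(timedelta(hours=9)))

# replace() only swaps the tzinfo label: the wall-clock time stays 09:00,
# so the result points at a different instant.
relabelled = aware.replace(tzinfo=timezone.utc)

# astimezone() converts the value: same instant, expressed in UTC.
converted = aware.astimezone(timezone.utc)

# Difference between the mislabelled and the correctly converted time.
skew = relabelled - converted
```

With these values `converted` is midnight UTC and `skew` is nine hours, which is the kind of error that can shift an age-in-days comparison across a threshold.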
Python: Simplify Python Poe tasks and unify package selectors (#4722)
* updated automation tasks and commands, with alias for the time being * Restore aggregate test exclusions Preserve the legacy all-tests scope for test --all by excluding lab and devui from the default aggregate sweep, while still allowing explicit package selection. Also ignore hidden/generated test directories such as .mypy_cache during aggregate discovery. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * updated versions in pre-commit --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-03-18 18:39:11 +00:00 -
Bump actions/upload-artifact from 4 to 7 (#4373)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 7. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/v4...v7) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
dependabot[bot] ·
2026-03-17 16:05:55 +00:00 -
Bump MishaKav/pytest-coverage-comment from 1.2.0 to 1.6.0 (#4543)
Bumps [MishaKav/pytest-coverage-comment](https://github.com/mishakav/pytest-coverage-comment) from 1.2.0 to 1.6.0. - [Release notes](https://github.com/mishakav/pytest-coverage-comment/releases) - [Changelog](https://github.com/MishaKav/pytest-coverage-comment/blob/main/CHANGELOG.md) - [Commits](https://github.com/mishakav/pytest-coverage-comment/compare/v1.2.0...v1.6.0) --- updated-dependencies: - dependency-name: MishaKav/pytest-coverage-comment dependency-version: 1.6.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
dependabot[bot] ·
2026-03-17 16:04:37 +00:00 -
Bump danielpalme/ReportGenerator-GitHub-Action from 5.5.1 to 5.5.3 (#4542)
Bumps [danielpalme/ReportGenerator-GitHub-Action](https://github.com/danielpalme/reportgenerator-github-action) from 5.5.1 to 5.5.3. - [Release notes](https://github.com/danielpalme/reportgenerator-github-action/releases) - [Commits](https://github.com/danielpalme/reportgenerator-github-action/compare/5.5.1...5.5.3) --- updated-dependencies: - dependency-name: danielpalme/ReportGenerator-GitHub-Action dependency-version: 5.5.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
dependabot[bot] ·
2026-03-17 16:04:20 +00:00 -
Bump actions/setup-dotnet from 5.1.0 to 5.2.0 (#4541)
Bumps [actions/setup-dotnet](https://github.com/actions/setup-dotnet) from 5.1.0 to 5.2.0. - [Release notes](https://github.com/actions/setup-dotnet/releases) - [Commits](https://github.com/actions/setup-dotnet/compare/v5.1.0...v5.2.0) --- updated-dependencies: - dependency-name: actions/setup-dotnet dependency-version: 5.2.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
dependabot[bot] ·
2026-03-17 16:04:07 +00:00 -
Python: chore(python): improve dependency range automation (#4343)
* chore(python): improve dependency range automation - tighten dependency bounds and coding standards guidance - add dependency range validation workflow, reporting, and issue automation - update related tests and dependency pins for compatibility Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * updated text and pyarrow * new lock * fixed workflow * updated deps * fix tiktoken * chore(python): refine dependency validation workflows Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(python): add high-level dependency validation comments Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * WIP * added additional comments and excludes * added dev dependency handling and workflow and updates to package ranges * added readme and simplified commands * fix markers * chore(python): address dependency review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Tighten dependency bounds, remove stale overrides, restore Python 3.10 support - Apply dependency bound policy across all packages: stable >=1.0 deps use >=floor,<next_major; pre-1.0/prerelease deps use validated hard-bounded ranges - Remove stale root tool.uv.override-dependencies (uvicorn, websockets, grpcio) - Lower github_copilot requires-python to >=3.10 with github-copilot-sdk gated behind python_version >= 3.11 marker; import raises ImportError on 3.10 - Skip github_copilot pyright/mypy/test tasks on Python <3.11 - Use version-conditional pyrightconfig for samples on Python 3.10 - Add compatibility fix in core responses client for older openai typed dicts - Normalize uv.lock prerelease mode and refresh dev dependencies - Update CODING_STANDARD.md, DEV_SETUP.md, and package management skill docs Closes #902 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * small tweaks * add note in workflow * fix workflows and several versions * fix duplicate --------- Co-authored-by: Copilot 
<223556219+Copilot@users.noreply.github.com>
Eduard van Valkenburg ·
2026-03-13 12:32:37 +00:00
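The bound policy quoted in the commit above (stable >=1.0 dependencies use `>=floor,<next_major`) can be illustrated with a small helper. The name `stable_bound` is hypothetical and this is not the repository's validation script; it only shows the shape of the specifier the policy describes.

```python
def stable_bound(floor: str) -> str:
    """Build a '>=floor,<next_major' specifier for a stable dependency.

    Hypothetical helper: pre-1.0 and prerelease floors are rejected,
    since the policy calls for hand-validated ranges in those cases.
    """
    major = int(floor.split(".")[0])
    # Letters in the version (a, b, rc, dev) mark a prerelease floor.
    is_prerelease = any(ch.isalpha() for ch in floor)
    if major < 1 or is_prerelease:
        raise ValueError(f"{floor!r} needs a manually validated range")
    return f">={floor},<{major + 1}"


spec = stable_bound("1.5.0")
```

For a floor of `1.5.0` this produces `>=1.5.0,<2`, while floors such as `0.34.0` or `1.0.0rc1` raise instead of guessing an upper bound.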