Commit Graph

1 Commits

  • .NET: Add Foundry.Hosting.IntegrationTests (#5598)
    * Foundry.Hosting.IntegrationTests: scaffold project, fixtures, and 24 tests
    
    Add a new integration test project for Foundry hosted agents alongside the existing Foundry.IntegrationTests project. The project provisions a real Foundry hosted agent per scenario via AgentAdministrationClient.CreateAgentVersionAsync, points it at a single test container image (built and pushed out of band by scripts/it-build-image.ps1 in a follow up commit), and exercises the agent through AIProjectClient.AsAIAgent.
    
    Six scenario fixtures are introduced, each pointing at the same image but selecting behavior via the IT_SCENARIO environment variable on the HostedAgentDefinition:
    - HappyPathHostedAgentFixture (round trip, multi turn, stored=false flag)
    - ToolCallingHostedAgentFixture (server side AIFunctions)
    - ToolCallingApprovalHostedAgentFixture (approval flow)
    - ToolboxHostedAgentFixture (Foundry toolbox)
    - McpToolboxHostedAgentFixture (MCP backed toolbox)
    - CustomStorageHostedAgentFixture (custom storage provider)
    
    24 tests across 6 test classes are scaffolded. All are tagged Skip pending the test container build and the end to end smoke iteration in follow up commits. Once the container is in place the Skip annotations can be removed scenario by scenario.
    
    Adds an IT_HOSTED_AGENT_IMAGE constant to the shared TestSettings so every IT project agrees on the env var name the build script emits.
    
    * Foundry.Hosting.IntegrationTests: add TestContainer, build script, slnx, README
    
    Adds the rest of the integration test infrastructure on top of the previous scaffolding commit:
    
    * Foundry.Hosting.IntegrationTests.TestContainer csproj and Program.cs implementing the multi scenario container (one image, IT_SCENARIO env var dispatches between happy-path, tool-calling, tool-calling-approval, toolbox, mcp-toolbox, and custom-storage). The toolbox, mcp-toolbox, and custom-storage branches are placeholders pending API surface stabilization.
    * Dockerfile and dockerignore in the test container project, using the contributor pattern matching the investigation work (host side dotnet publish, container only does COPY out/).
    * scripts/it-build-image.ps1 with mandatory Registry parameter (no hardcoded ACR), content hashed tags so unchanged source results in a no op push, and emits IT_HOSTED_AGENT_IMAGE for shells and CI to consume.
    * slnx entry for both new projects.
    * README in the IT project covering env vars, image build, scenario table, and current placeholder status.
    
    Steps still pending: end to end smoke (step 5) and CI workflow integration (step 6) require a live Foundry deployment and ACR push, so they land in follow up commits.
    
    * Foundry.Hosting.IntegrationTests: address PR 5598 review feedback
    
    Fix issues raised by Copilot review:
    
    * it-build-image.ps1: hash file contents, not the path list, so any source edit produces a fresh tag. Normalize Registry input by stripping scheme and trailing slash before deriving the ACR short name. Validate the short name is non empty.
    * HostedAgentFixture: route GetAgentAsync through _adminClient (which has the FoundryFeaturesPolicy attached) instead of through _projectClient.AgentAdministrationClient (which does not).
    * HostedAgentFixture FoundryFeaturesPolicy: replace Headers.Add with Remove plus Add so retries cannot accumulate duplicate headers.
    * HappyPath, ToolCalling, ToolCallingApproval, CustomStorage tests: create the AgentSession before turn 1 and reuse it for both turns. The previous pattern created the session after turn 1 so turn 2 had no link to turn 1, defeating the multi turn assertion.
    
    * .NET: Foundry.Hosting.IntegrationTests: constrain to net10.0 + dotnet format autofix
    
    - Set <TargetFrameworks>net10.0</TargetFrameworks>: the project references both
      Microsoft.Agents.AI.Foundry.Hosting (net8/9/10 only) and AgentConformance.IntegrationTests
      (net10.0;net472 — inherits the tests-default TFM list). The intersection is net10.0;
      the previous $(TargetFrameworksCore) triple caused NU1702 + System.Text.Json version
      conflicts on the net8.0/net9.0 builds because AgentConformance had no matching asset.
    - Apply `dotnet format` autofix on the test files (IDE0005, IDE0009, IDE0032, IMPORTS).
    
    * .NET: Foundry.Hosting.IntegrationTests.TestContainer/Program.cs: add UTF-8 BOM
    
    CI's check-format requires charset=utf-8-bom per .editorconfig.
    
    * Foundry.Hosting IntegrationTests: wire end-to-end CI flow against hosted agents
    
    Make the integration tests usable end-to-end against a live Foundry deployment, including
    a per-run rebuild of the test container so framework code changes are exercised.
    
    Fixture (HostedAgentFixture.cs)
    
    * Switch from per-run unique agent names to stable scenario-keyed names (it-happy-path,
      it-tool-calling, ...). The agent's managed identity carries the Azure AI User role on
      the project scope, which is required for inbound inference; deleting the agent recycles
      the MI and breaks that role assignment, so we keep the agent across runs and only churn
      versions.
    * Add IT_RUN_ID env var to defeat Foundry's content-addressed version dedup; otherwise a
      rerun just receives the existing version and Dispose deletes it.
    * PATCH the per-agent endpoint with AgentEndpointConfig (Responses protocol, version
      selector at 100% to the new version). Without this, /agents/{name}/endpoint/protocols/
      openai/responses returns HTTP 400.
    * Build a per-agent ProjectOpenAIClient (not the cached projectClient.ProjectOpenAIClient,
      which is bound to the project-level URL); set AgentName in options so the URL routes
      through the agent endpoint, and add the Foundry-Features header to the inference
      pipeline.
    * Use Versions (which serializes to container_protocol_versions) instead of the
      deprecated ProtocolVersions; the server now rejects the legacy field.
    * On Dispose, delete only the version this fixture created. Never delete the agent.
    
    Tests
    
    * Tag every HostedAgentTests class with [Trait("Category", "FoundryHostedAgents")] so the
      CI workflow can route them to a separate Foundry project than the rest of the
      integration suite.
    
    CI workflow (.github/workflows/dotnet-build-and-test.yml)
    
    * Add a foundryHosting paths-filter covering Microsoft.Agents.AI.Foundry.Hosting and its
      in-repo dependency chain (Foundry, Agents.AI, Agents.AI.Abstractions), the test
      container, the test fixture, Directory.Packages.props, the build script, and this
      workflow file. Skip the costly hosted-agent steps when none of those changed.
    * Add "Build and push Foundry Hosted Agents test container" step that invokes
      scripts/it-build-image.ps1 against vars.IT_HOSTED_AGENT_REGISTRY and pipes the resulting
      IT_HOSTED_AGENT_IMAGE=<tag> into GITHUB_ENV.
    * Add "Run Foundry Hosted Agents Integration Tests" step that filters in only the new
      trait, with AZURE_AI_PROJECT_ENDPOINT/AZURE_AI_MODEL_DEPLOYMENT_NAME pointed at
      IT_HOSTED_AGENT_PROJECT_ENDPOINT/IT_HOSTED_AGENT_MODEL_DEPLOYMENT_NAME (Tao project,
      East US 2; the SK IT project's region does not yet support hosted agents preview).
    * Exclude the new trait from the existing "Run Integration Tests" step.
    * TEMP: drop the != 'pull_request' guard on the new steps and on Azure CLI Login when the
      paths-filter triggers, so PR #5598 can validate the wiring before promoting to merge
      queue only. Restore the original guard after one green PR run.
    
    Build script (scripts/it-build-image.ps1)
    
    * Hash now spans TestContainer source AND its referenced framework projects so any
      framework code change forces a fresh tag and a real docker push; the previous
      TestContainer-only hash silently reused stale images on framework edits.
    
    Bootstrap script (dotnet/tests/Foundry.Hosting.IntegrationTests/scripts/it-bootstrap-agents.ps1)
    
    * New idempotent script that creates the six stable scenario agents and grants Azure AI
      User on the project scope to each agent's MI. Run once per Foundry project. Includes
      AAD-graph propagation retries because newly created MIs take time to appear there.
    
    README (dotnet/tests/Foundry.Hosting.IntegrationTests/README.md)
    
    * Document the bootstrap prerequisite, the regional caveat (East US 2 is the only region
      we have validated; East US returned "Unsupported region" at the time of writing), the
      per-run image rebuild, and the CI wiring including the SP RBAC requirements.
    
    SDK pin (TEMP)
    
    * Bump Microsoft.Agents.AI.Foundry.Hosting's Azure.AI.Projects VersionOverride to
      2.1.0-alpha.20260505.1 from the azure-sdk public daily feed (added to nuget.config).
      This release is the first that builds the per-agent inference URL as
      /agents/{name}/endpoint/protocols/openai (the 2.1.0-beta.1 release builds
      .../openai/openai/v1, which the server rejects). Revert both the feed and the override
      once the URL fix lands in a stable Azure.AI.Projects release.
    
    * Foundry.Hosting IntegrationTests: revert alpha SDK pin; move endpoint PATCH to bootstrap
    
    The alpha SDK pin (Azure.AI.Projects 2.1.0-alpha.20260505.1 from the azure-sdk public
    daily feed) was needed only for the URL routing fix and the strongly-typed
    AgentEndpointConfig/PatchAgentOptions wrapper. We do not need either right now: the
    fixture stays compatible with the public 2.1.0-beta.1 by moving the one-time endpoint
    PATCH to the bootstrap script (it sets version_selector to FixedRatio @latest, so each
    new fixture run becomes the served version automatically without a per-run PATCH from
    the test code). The hosted-agent invocation path will start working end-to-end once the
    URL routing fix lands in a stable Azure.AI.Projects release; until then the tests stay
    [Fact(Skip = ...)] as documented.
    
    * Revert dotnet/nuget.config: drop the azure-sdk-for-net public feed.
    * Revert Microsoft.Agents.AI.Foundry.Hosting.csproj VersionOverride to 2.1.0-beta.1.
    * Revert Microsoft.Agents.AI.Foundry.UnitTests and Microsoft.Agents.AI.Foundry.Hosting.UnitTests
      Azure.AI.Projects pin (they had been bumped to align Azure.Core 1.54 transitive).
    * Drop the AgentEndpointConfig PATCH block from HostedAgentFixture.cs (the type is
      alpha-only). Replace with a comment pointing at the bootstrap script.
    * Bootstrap script (it-bootstrap-agents.ps1) now also PATCHes each agent's endpoint
      with version_selector=@latest if not already set. Idempotent.
    
    * Foundry.Hosting IntegrationTests: drop accidentally committed filtered.slnx
    
    * Foundry.Hosting IntegrationTests: revert TEMP PR override on Azure CLI Login + IT steps
    
    The previous attempt to validate the new hosted-agent IT wiring on PR #5598 failed
    because the PR is from a fork (rogerbarreto/agent-framework-public). GitHub never passes
    environment secrets to fork PRs regardless of event-name guards on individual steps,
    so 'azure/login@v2' fails with 'client-id and tenant-id are not supplied'. Restore the
    original github.event_name != 'pull_request' guard. The new steps will execute on
    push to main and on merge_group runs.
    
    * Foundry.Hosting IntegrationTests: invoke build-and-push script with absolute path
    
    The pwsh shell on the GitHub Actions runner couldn't resolve ./scripts/it-build-image.ps1
    when the step had no working-directory set; the step inherits the runner's PWD which is
    not always the repo root after preceding steps. Use github.workspace explicitly to remove
    the ambiguity.
    
    * Foundry.Hosting IntegrationTests: move it-build-image.ps1 inside the IT project tree
    
    The previous location at scripts/it-build-image.ps1 lived outside the sparse-checkout
    paths the workflow uses (.github, dotnet, python, declarative-agents), so the runner
    never had the file when the new step tried to invoke it. Move the script next to its
    sibling it-bootstrap-agents.ps1 inside the IT project tree, and anchor its relative
    paths to the repo root via  so callers can invoke it from any PWD.
    
    * Move scripts/it-build-image.ps1 -> dotnet/tests/Foundry.Hosting.IntegrationTests/scripts/it-build-image.ps1
    * Add Push-Location to the resolved repo root inside the script (Pop-Location in finally)
      so the existing relative paths (TestContainerProject, hashed src dirs) keep working
      no matter where the script is invoked from.
    * Update the workflow path filter and the step's invocation path to the new location.
    
    * Foundry.Hosting IntegrationTests: enable 5 HappyPath tests on the live Foundry endpoint
    
    The fixture already constructs ProjectOpenAIClient via the per-agent path that beta.1
    supports (new ProjectOpenAIClient(uri, cred, opts { AgentName })), so no SDK pin bump
    is required to run the smoke tests end-to-end. Un-skip the 5 tests that pass against
    the live test container.
    
    Tests un-skipped (verified passing locally against tao-foundry-prj):
    
    * RunAsync_ReturnsNonEmptyTextAsync
    * RunStreamingAsync_YieldsAtLeastOneUpdateAsync
    * MultiTurn_WithPreviousResponseId_PreservesContextAsync
    * StoredFalse_Baseline_DoesNotPersistResponseAsync
    * Instructions_FromContainerDefinition_AreObeyedAsync
    
    Tests still skipped with a more specific reason (4 of 9 in HappyPath plus all
    ToolCalling*, McpToolbox, Toolbox, CustomStorage) because the test container does not
    yet emit usable response_id / conversation_id chains, and the placeholder scenarios are
    not implemented in the test container's Program.cs. These are test container limitations,
    not infra bugs, and can be un-skipped as the container surfaces stabilize.
    
    * Foundry.Hosting IntegrationTests: extract hosted IT into parallel job, add Workflows dep
    
    Address Wesley's review feedback on PR #5598:
    
    1. Pull Foundry hosted-agent IT into its own dotnet-foundry-hosted-it job that runs in parallel to dotnet-build and dotnet-test. Same path-filter gate keeps it skipped on unrelated edits. Builds only the filtered solution containing Foundry.Hosting.IntegrationTests and src deps. dotnet-build-and-test-check now waits on it too.
    
    2. Add Microsoft.Agents.AI.Workflows to the foundryHosting paths-filter and to hashedDirs in it-build-image.ps1 since Foundry.Hosting transitively depends on it.
    
    TFM constraint on the IT csproj stays at net10.0 because AgentConformance.IntegrationTests targets net10/net472 and is consumed by ~12 other IT projects on net472.
    
    ---------
    
    Co-authored-by: Roger Barreto <rbarreto@microsoft.com>