.NET: Foundry.Hosted IT - fix MSBuild parallel-output races

Two surgical changes inside the dotnet-foundry-hosted-it job:

1. Replace dotnet build <slnx> -f net10.0 with dotnet build <test.csproj>. The test csproj pins TargetFrameworks=net10.0 and its ProjectReference closure gives MSBuild a single-rooted graph, eliminating the duplicate inner-builds that race on bin/obj. Drops the two New-FilteredSolution.ps1 steps.

2. In it-build-image.ps1, drop the -UsePrebuiltProjectReferences switch and always pass --no-dependencies to dotnet publish. Publish now resolves TestContainer's framework refs by reading the prebuilt DLLs and never re-touches them. This replaces the partial mitigation in PR #5689 with a structural fix.

Local validation confirmed that the published Foundry.dll has the same mtime and bytes as the prebuild output.
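
The local validation described above could be reproduced with a check along these lines. This is only a sketch: `fingerprint` and `same_fingerprint` are hypothetical helper names that are not part of the commit, the published path in the usage comment is a placeholder, and GNU coreutils (`stat -c`, `sha256sum`) are assumed, as on ubuntu-latest.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this commit).
# Assumes GNU coreutils: `stat -c %Y` (mtime in epoch seconds) and `sha256sum`.

# Prints "<mtime> <sha256>" for one file.
fingerprint() {
  local file="$1"
  printf '%s %s\n' "$(stat -c %Y "$file")" "$(sha256sum "$file" | cut -d' ' -f1)"
}

# Succeeds only when both files have the same mtime and the same bytes.
same_fingerprint() {
  [ "$(fingerprint "$1")" = "$(fingerprint "$2")" ]
}

# Usage (paths are illustrative; <publish-out> stands for the script's output directory):
#   prebuilt=dotnet/src/Microsoft.Agents.AI.Foundry/bin/Release/net10.0/Microsoft.Agents.AI.Foundry.dll
#   published=<publish-out>/Microsoft.Agents.AI.Foundry.dll
#   same_fingerprint "$prebuilt" "$published" && echo "publish reused the prebuilt DLL"
```

Comparing both the mtime and a content hash separates "copied as-is" from "rebuilt to identical bytes": a rebuild would normally give the file a newer mtime even if the hash happened to match.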
Roger Barreto
2026-05-08 20:22:54 +01:00
parent 226c004b53
commit bbe2e91dcd
2 changed files with 43 additions and 94 deletions
+15 -34
@@ -340,7 +340,6 @@ jobs:
     runs-on: ubuntu-latest
     environment: integration
     env:
-      targetFramework: net10.0
       configuration: Release
     steps:
       - uses: actions/checkout@v6
@@ -357,31 +356,15 @@ jobs:
         with:
           global-json-file: ${{ github.workspace }}/dotnet/global.json
-      - name: Generate test solution (no samples)
-        shell: pwsh
-        run: |
-          ./dotnet/eng/scripts/New-FilteredSolution.ps1 `
-            -Solution dotnet/agent-framework-dotnet.slnx `
-            -TargetFramework $env:targetFramework `
-            -Configuration $env:configuration `
-            -ExcludeSamples `
-            -OutputPath dotnet/filtered.slnx `
-            -Verbose
-      - name: Generate Foundry hosted IT filtered solution
-        shell: pwsh
-        run: |
-          ./dotnet/eng/scripts/New-FilteredSolution.ps1 `
-            -Solution dotnet/filtered.slnx `
-            -TargetFramework $env:targetFramework `
-            -Configuration $env:configuration `
-            -TestProjectNameFilter "Foundry.Hosting.IntegrationTests*" `
-            -OutputPath dotnet/filtered-foundry-hosted.slnx `
-            -Verbose
+      # Build the test csproj directly instead of a filtered slnx + -f override.
+      # The test project pins TargetFrameworks=net10.0 and its ProjectReference closure
+      # gives MSBuild a single-rooted graph, so each multi-targeted dependency is invoked
+      # exactly once for net10.0. This avoids the MSB3026/MSB3491/MSB4018/MSB3883 file-lock
+      # collisions caused by parallel inner-builds racing on shared bin/obj output paths
+      # under the previous slnx + global TFM override approach.
       - name: Build Foundry hosted IT (and its deps)
         shell: bash
-        run: dotnet build dotnet/filtered-foundry-hosted.slnx -c "$configuration" -f "$targetFramework" --warnaserror
+        run: dotnet build dotnet/tests/Foundry.Hosting.IntegrationTests/Foundry.Hosting.IntegrationTests.csproj -c "$configuration" --warnaserror
       - name: Azure CLI Login
         uses: azure/login@v2
@@ -394,13 +377,12 @@ jobs:
       # are picked up; the image tag is content-hashed across the test container source AND its
       # framework project references, so identical content is a no-op push.
       #
-      # `-UsePrebuiltProjectReferences` opts into the no-rebuild fast path: publish skips
-      # rebuilding ProjectReferences and consumes the DLLs the prior "Build Foundry hosted IT
-      # (and its deps)" step already produced. This avoids MSB3026 ("file is being used by
-      # another process") collisions caused by the previous build's shared-compilation server
-      # still holding file handles to those DLLs. Safe in CI because the prebuild step ran in
-      # the same job against the same source. Do not remove the prebuild step (the subsequent
-      # `dotnet test --no-build` step depends on it too).
+      # The script always passes --no-dependencies to dotnet publish so publish never re-touches
+      # the framework lib DLLs the prior "Build Foundry hosted IT (and its deps)" step produced.
+      # This structurally eliminates the MSB3026 collision that VBCSCompiler from the prebuild
+      # would otherwise cause by holding file handles to those DLLs. Do not remove the prebuild
+      # step: the subsequent `dotnet test --no-build` step and the publish's ProjectReference
+      # resolution both depend on the prebuilt outputs being present.
       - name: Build and push Foundry Hosted Agents test container
         id: build-foundry-hosted-image
         shell: pwsh
@@ -410,14 +392,13 @@ jobs:
           if ([string]::IsNullOrWhiteSpace($registry)) {
             throw "IT_HOSTED_AGENT_REGISTRY not set in the integration environment."
           }
-          & "${{ github.workspace }}/dotnet/tests/Foundry.Hosting.IntegrationTests/scripts/it-build-image.ps1" -Registry $registry -UsePrebuiltProjectReferences | Tee-Object -FilePath $env:GITHUB_ENV -Append
+          & "${{ github.workspace }}/dotnet/tests/Foundry.Hosting.IntegrationTests/scripts/it-build-image.ps1" -Registry $registry | Tee-Object -FilePath $env:GITHUB_ENV -Append
       - name: Run Foundry Hosted Agents Integration Tests
         shell: pwsh
         working-directory: dotnet
         run: |
-          dotnet test --solution ./filtered-foundry-hosted.slnx `
-            -f $env:targetFramework `
+          dotnet test tests/Foundry.Hosting.IntegrationTests/Foundry.Hosting.IntegrationTests.csproj `
             -c $env:configuration `
             --no-build -v Normal `
             --report-xunit-trx `
@@ -41,14 +41,7 @@ param(
     [string] $Repository = "foundry-hosting-it",
-    [string] $TestContainerProject = "dotnet/tests/Foundry.Hosting.IntegrationTests.TestContainer",
-    # Explicit opt-in for the no-rebuild fast path. CI sets this after running the
-    # "Build Foundry hosted IT (and its deps)" step, which guarantees the prebuilt
-    # library DLLs match current source. Off by default so local invocations always
-    # let publish rebuild ProjectReferences and never produce an image whose tag is
-    # computed from current source while the contents come from a stale build.
-    [switch] $UsePrebuiltProjectReferences
+    [string] $TestContainerProject = "dotnet/tests/Foundry.Hosting.IntegrationTests.TestContainer"
 )
 $ErrorActionPreference = "Stop"
@@ -107,60 +100,35 @@ if (Test-Path $out) {
     Remove-Item -Recurse -Force $out
 }
-# Conditionally tell publish to skip rebuilding ProjectReferences and consume the
-# prebuilt library DLLs in place. This avoids two failure modes that arise when
-# the CI workflow runs a `dotnet build` of the same library projects immediately
-# before this script:
-# 1) MSB3026 "file is being used by another process" when publish's MSBuild
-# tries to overwrite src/<lib>/bin/Release/net10.0/<lib>.dll while the
-# previous build's shared-compilation server still holds a file handle.
-# 2) Publish needlessly rebuilding identical managed (RID-agnostic) library
-# DLLs that prebuild already produced.
-# Gated on -UsePrebuiltProjectReferences (a strict opt-in) instead of marker
-# detection, because a developer machine may have a stale Release build of the
-# libraries from days ago; using those would silently produce an image whose
-# content is older than the source the tag is computed from.
-$publishExtraArgs = @()
-if ($UsePrebuiltProjectReferences) {
-    Write-Host "-UsePrebuiltProjectReferences: skipping ProjectReference rebuild." -ForegroundColor DarkGray
-    $publishExtraArgs += "-p:BuildProjectReferences=false"
-} else {
-    # Preflight: in default (rebuild) mode, publish propagates RuntimeIdentifier=linux-musl-x64
-    # to library ProjectReferences and writes their intermediates to a RID-suffixed obj path
-    # (e.g. obj/Release/net10.0/linux-musl-x64/). DefaultItemExcludes follows the new
-    # IntermediateOutputPath, so any *.AssemblyInfo.cs left in obj/Release/net10.0/ from a
-    # prior `dotnet build` is no longer excluded and gets picked up by the **/*.cs Compile
-    # glob, producing CS0579 "duplicate attribute" errors. Detect that state up front and
-    # tell the user exactly how to recover.
-    $staleObjProbes = @(
-        "dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/obj/Release/net10.0",
-        "dotnet/src/Microsoft.Agents.AI.Foundry/obj/Release/net10.0",
-        "dotnet/src/Microsoft.Agents.AI/obj/Release/net10.0",
-        "dotnet/src/Microsoft.Agents.AI.Abstractions/obj/Release/net10.0"
-    )
-    $stale = @($staleObjProbes | Where-Object { Test-Path (Join-Path $_ "*.AssemblyInfo.cs") })
-    if ($stale.Count -gt 0) {
-        $msg = @(
-            "Detected prior Release/net10.0 build outputs in:"
-            ($stale | ForEach-Object { " - $_" })
-            ""
-            "Publish would propagate -r linux-musl-x64 to those ProjectReferences and the"
-            "leftover obj/Release/net10.0/*.AssemblyInfo.cs files would cause CS0579 duplicate"
-            "attribute errors. Pick one:"
-            " (a) Pass -UsePrebuiltProjectReferences (skips ProjectReference rebuild and"
-            " uses the existing src/<lib>/bin/Release/net10.0/*.dll outputs in place)."
-            " Only safe when you know those DLLs match current source - this is the path"
-            " CI uses immediately after its 'Build Foundry hosted IT (and its deps)' step."
-            " (b) Remove the stale obj/Release trees, e.g.:"
-            " Remove-Item -Recurse -Force dotnet/src/Microsoft.Agents.AI*/obj/Release"
-            " and re-run."
-        ) -join "`n"
-        throw $msg
-    }
-    Write-Host "Letting publish build ProjectReferences (pass -UsePrebuiltProjectReferences in CI to skip)." -ForegroundColor DarkGray
+# Always tell publish to skip ProjectReference rebuilds via --no-dependencies. Publish
+# resolves TestContainer's framework lib references (Foundry, Foundry.Hosting and their
+# transitive deps) by reading the prebuilt DLLs at src/<lib>/bin/Release/net10.0/*.dll.
+# This:
+# 1) Structurally avoids the MSB3026 "file is being used by another process" race that
+# occurs when publish overwrites the same DLL paths a prior `dotnet build` produced
+# while VBCSCompiler from that build still holds file handles.
+# 2) Avoids needlessly rebuilding identical managed (RID-agnostic) library DLLs.
+# Callers MUST run `dotnet build dotnet/tests/Foundry.Hosting.IntegrationTests/Foundry.Hosting.IntegrationTests.csproj -c Release`
+# (or equivalent) first so those prebuilt DLLs exist. The CI workflow does this in the
+# preceding "Build Foundry hosted IT (and its deps)" step.
+$prebuildProbes = @(
+    "dotnet/src/Microsoft.Agents.AI.Foundry/bin/Release/net10.0/Microsoft.Agents.AI.Foundry.dll",
+    "dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/bin/Release/net10.0/Microsoft.Agents.AI.Foundry.Hosting.dll"
+)
+$missingPrebuilds = @($prebuildProbes | Where-Object { -not (Test-Path $_) })
+if ($missingPrebuilds.Count -gt 0) {
+    $msg = @(
+        "Required prebuilt outputs not found:"
+        ($missingPrebuilds | ForEach-Object { " - $_" })
+        ""
+        "Publish runs with --no-dependencies and consumes prebuilt DLLs in place. Build the"
+        "test project first so its ProjectReference closure populates src/<lib>/bin/Release/net10.0/:"
+        " dotnet build dotnet/tests/Foundry.Hosting.IntegrationTests/Foundry.Hosting.IntegrationTests.csproj -c Release"
+    ) -join "`n"
+    throw $msg
 }
-dotnet publish $TestContainerProject -c Release -f net10.0 -r linux-musl-x64 --self-contained false -o $out @publishExtraArgs --tl:off | Out-Host
+dotnet publish $TestContainerProject -c Release -f net10.0 -r linux-musl-x64 --self-contained false --no-dependencies -o $out --tl:off | Out-Host
 if ($LASTEXITCODE -ne 0) {
     throw "dotnet publish failed with exit code $LASTEXITCODE."
 }
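
For a caller that invokes the script from a bash step or a local shell, the same "prebuild first, then publish" contract could be checked before the script is called. The following is a minimal sketch, not part of the commit: `missing_prebuilds` is a hypothetical helper, and the paths in the usage comment are the two probe paths the script itself checks.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of this commit).
# Prints "missing: <path>" for every argument that is not an existing file
# and returns non-zero if at least one is missing.
missing_prebuilds() {
  local missing=0 path
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      printf 'missing: %s\n' "$path"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (illustrative): build the test project only when a probe file is absent.
#   missing_prebuilds \
#     dotnet/src/Microsoft.Agents.AI.Foundry/bin/Release/net10.0/Microsoft.Agents.AI.Foundry.dll \
#     dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/bin/Release/net10.0/Microsoft.Agents.AI.Foundry.Hosting.dll \
#     || dotnet build dotnet/tests/Foundry.Hosting.IntegrationTests/Foundry.Hosting.IntegrationTests.csproj -c Release
```

An existence check like this only tells the caller that some prebuilt output is present, not that it matches current source, so building unconditionally before publishing remains the safer default for local runs.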