Commit Graph

3 Commits

  • .NET: Python: Add dotnet integration test report to CI (#5515)
    * Add dotnet integration test report to CI
    
    - Add --report-junit flag to dotnet integration test step to generate
      JUnit XML alongside TRX, with explicit --results-directory to
      centralize output in IntegrationTestResults/
    - Upload JUnit XML artifacts from each matrix leg (net10.0/ubuntu,
      net472/windows) as dotnet-test-results-{framework}-{os}
    - Add dotnet-integration-test-report job that downloads artifacts,
      runs the existing aggregate.py script, posts markdown to Job Summary,
      and saves trend history via actions/cache
    - Refactor aggregate.py to discover JUnit XML files recursively,
      supporting both pytest (pytest.xml) and xunit (*.junit.xml) layouts
    - Handle provider name derivation for dotnet artifact naming convention
    - Fix nodeid collision when same test runs under multiple frameworks
      by qualifying keys with provider when collisions are detected
    - Improve module extraction for dotnet C# classnames (recognizes
      IntegrationTests/UnitTests namespace segments)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * chore: trigger dotnet CI for report validation
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: use .junit extension (not .junit.xml) for xunit v3 output
    
    xUnit v3 generates files with .junit extension, not .junit.xml.
    Update upload glob and aggregate.py discovery to match.
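    
    A minimal sketch of the recursive discovery described above, assuming a
    helper named discover_junit_files (hypothetical; the real aggregate.py
    may be structured differently). It collects both the pytest layout
    (pytest.xml) and the xUnit v3 layout (*.junit) from a results directory.

```python
from pathlib import Path


def discover_junit_files(root):
    """Return JUnit result files found anywhere under root.

    Hypothetical helper: matches pytest.xml (pytest) and *.junit (xUnit v3).
    """
    root = Path(root)
    # rglob walks the tree recursively; a set removes any accidental overlap.
    found = set(root.rglob("pytest.xml")) | set(root.rglob("*.junit"))
    return sorted(found)
```

    Files with other extensions (for example .trx) are ignored by both
    patterns.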
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: use deterministic provider-qualified keys for dotnet tests
    
    Always prefix dotnet test keys with provider (e.g. net10.0 (ubuntu)::TestName)
    to ensure stable, comparable counts across runs regardless of file parse order.
    Also show Executed (passed+failed) instead of Total in summary table.
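    
    A rough sketch of the keying scheme described above. The function names
    qualified_key and index_results are assumptions for illustration, not
    the names used in aggregate.py; only the key format
    "net10.0 (ubuntu)::TestName" comes from this commit message.

```python
def qualified_key(provider, test_name):
    """Build a stable key such as 'net10.0 (ubuntu)::TestName'."""
    return f"{provider}::{test_name}"


def index_results(results):
    """Map provider-qualified keys to outcomes.

    results is an iterable of (provider, test_name, outcome) tuples
    (an assumed shape). Because the provider is always part of the key,
    the same test run under two frameworks yields two entries, and the
    entry count does not depend on the order in which files are parsed.
    """
    return {qualified_key(p, name): outcome for p, name, outcome in results}
```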
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * fix: match Python report summary format (Total, passed/total, etc.)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * feat: split dotnet report into per-framework tables
    
    Dotnet tests run on multiple frameworks (net10.0, net472). Instead of
    one combined table with unstable totals, show separate sections per
    framework — each with its own summary row and per-test table. Python
    reports retain the original single-table format.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-enable 7 flaky dotnet integration tests with increased timeouts
    
    Increase timeouts to reduce timing-related flakiness in LLM-backed
    integration tests (issue #4971):
    
    - ExternalClientTests: 60s -> 120s default timeout
    - SamplesValidationBase: 60s -> 120s default timeout
    - ConsoleAppSamplesValidation: 90s -> 150s for long-running tests
    - AzureFunctions SamplesValidation: 2min -> 3min orchestration timeout,
      60s -> 90s per-step WaitForConditionAsync timeouts
    
    Remove all Skip=Flaky annotations and unused SkipFlakyTimingTest constants.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-skip LLM non-determinism flaky tests, keep timeout fixes
    
    Re-skip SingleAgentOrchestrationHITLSampleValidationAsync and
    LongRunningToolsSampleValidationAsync - these fail due to LLM producing
    extra review notifications, not timeouts. Updated skip reasons to
    accurately describe the root cause. Reverted unnecessary timeout change
    on the skipped LongRunningTools test.
    
    The remaining 5 re-enabled tests with timeout increases are stable.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Enable Anthropic integration tests in CI
    
    Replace hardcoded skip with conditional skip pattern (matching
    CopilotStudio approach): tests gracefully skip when ANTHROPIC_API_KEY
    is missing, and run when present.
    
    Changes:
    - AnthropicChatCompletionFixture: try/catch in InitializeAsync with
      Assert.Skip on missing config (replaces hardcoded SkipReason)
    - AnthropicSkillsIntegrationTests: same pattern per test method
    - dotnet-build-and-test.yml: wire up ANTHROPIC_API_KEY,
      ANTHROPIC_CHAT_MODEL_NAME, and ANTHROPIC_REASONING_MODEL_NAME
      env vars to the integration test step
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix missing System using in AnthropicSkillsIntegrationTests
    
    Add 'using System;' for InvalidOperationException in try/catch blocks.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Skip flaky SingleAgentOrchestrationChainingSampleValidationAsync
    
    LLM non-determinism causes Assert.NotNull failures on orchestration
    results. Skip until test logic is hardened against non-deterministic
    LLM responses.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Re-enable HITL and LongRunningTools tests with timeout and flexibility fixes
    
    - Remove Skip attribute from SingleAgentOrchestrationHITLSampleValidationAsync
    - Remove Skip attribute from LongRunningToolsSampleValidationAsync
    - Increase timeout from 120s/90s to 180s to accommodate 2+ LLM round-trips
    - Replace rigid 2-cycle assertion with flexible approval logic that handles
      extra review cycles from LLM non-determinism
    
    Fixes the two failure modes identified in #4971:
    1. Timeout: 120s/90s was insufficient for multiple LLM calls under CI load
    2. Extra notifications: Assert.Fail on 3rd+ review cycle was too rigid
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Increase AzureFunctions LongRunningTools test timeouts from 90s to 180s
    
    The LongRunningToolsSampleValidationAsync test in the AzureFunctions integration
    tests was failing in CI with TimeoutException at the 'Content published
    notification is logged' step. The 90-second timeouts are too tight for CI
    environments where LLM calls and orchestration overhead can be slow.
    
    Increased all three WaitForConditionAsync timeouts from 90s to 180s:
    - Waiting for human feedback notification
    - Waiting for publish notification (the step that was failing)
    - Waiting for orchestration completion
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Merge main and fix dotnet report path after flaky_report rename
    
    Merge upstream/main which renamed scripts/flaky_report/ to
    scripts/integration_test_report/ (from Python PR #5454). Update the
    dotnet-build-and-test workflow to reference the new path.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add RetryFact to DurableTask and AzureFunctions integration tests
    
    These tests interact with LLMs via stdin/stdout (DurableTask) or HTTP
    (AzureFunctions) and are inherently non-deterministic. Unlike the Python
    side which uses pytest-retry, the dotnet tests had no retry mechanism
    and a single transient failure would fail the entire CI run.
    
    Changes:
    - Switch [Fact] to [RetryFact(2, 5000)] on all LLM-dependent tests
      across ConsoleAppSamplesValidation, ExternalClientTests,
      WorkflowConsoleAppSamplesValidation, and AzureFunctions SamplesValidation
    - Add re-prompt mechanism to LongRunningToolsSampleValidationAsync:
      if the LLM doesn't invoke the tool within 60s, re-send the prompt
      (up to 2 retries) instead of burning the full timeout
    - Reduce LongRunningTools timeout from 240s to 180s (re-prompt makes
      the extra buffer unnecessary)
    - Leave simple/deterministic tests as [Fact] (SingleAgent, unit tests)
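    
    The re-prompt idea above can be sketched roughly as follows. This is a
    Python illustration, not the C# test code: wait_with_reprompt,
    send_prompt, and tool_invoked are hypothetical names, and the clock and
    sleep parameters exist only so the loop can be exercised without real
    waiting.

```python
import time


def wait_with_reprompt(send_prompt, tool_invoked, attempt_timeout=60.0,
                       max_retries=2, poll_interval=0.5,
                       clock=time.monotonic, sleep=time.sleep):
    """Send the prompt, re-sending it if the tool is not invoked in time.

    Returns True as soon as tool_invoked() is true. Returns False after the
    initial attempt plus max_retries re-prompts have all timed out.
    """
    for _ in range(1 + max_retries):
        send_prompt()
        deadline = clock() + attempt_timeout
        while clock() < deadline:
            if tool_invoked():
                return True
            sleep(poll_interval)
    return False
```

    With the defaults shown, the worst case is three 60-second attempts,
    which is consistent with the 180s overall timeout mentioned above.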
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Add persist-credentials: false to Integration Test Report checkout step
    
    Matches the convention used by other checkout steps in this workflow
    to avoid leaving GITHUB_TOKEN credentials in the local git config.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * small fixes
    
    * disable anthropic failing tests
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  • .NET: Improve visibility for AzureFunctions Workflows samples run tests and increase timeouts (#4820)
    * Reduce timeout flakiness for AzureFunctions Workflows samples run tests
    
    * Add more updates
    
    * Address PR comments
    
    * Address PR comments
  • .NET: Add durable workflow support (#4436)
    * .NET: [Feature Branch] Add basic durable workflow support (#3648)
    
    * Add basic durable workflow support.
    
    * PR feedback fixes
    
    * Add conditional edge sample.
    
    * PR feedback fixes.
    
    * Minor cleanup.
    
    * Minor cleanup
    
    * Minor formatting improvements.
    
    * Improve comments/documentation on the execution flow.
    
    * .NET: [Feature Branch] Add Azure Functions hosting support for durable workflows (#3935)
    
    * Adding azure functions workflow support.
    
    * - PR feedback fixes.
    - Add example to demonstrate complex Object as payload.
    
    * rename instanceId to runId.
    
    * Use custom ITaskOrchestrator to run orchestrator function.
    
    * .NET: [Feature Branch] Adding support for events & shared state in durable workflows (#4020)
    
    * Adding support for events & shared state in durable workflows.
    
    * PR feedback fixes
    
    * PR feedback fixes.
    
    * Add YieldOutputAsync calls to 05_WorkflowEvents sample executors
    
    The integration test asserts that WorkflowOutputEvent is found in the
    stream, but the sample executors only used AddEventAsync for custom
    events and never called YieldOutputAsync. Since WorkflowOutputEvent is
    only emitted via explicit YieldOutputAsync calls, the assertion would
    fail. Added YieldOutputAsync to each executor to match the test
    expectation and demonstrate the API in the sample.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix deserialization to use shared serializer options.
    
    * PR feedback updates.
    
    * Sample cleanup
    
    * PR feedback fixes
    
    * Addressing PR review feedback for DurableStreamingWorkflowRun
    
    - Use -1 instead of 0 for taskId in TaskFailedException when task ID is not relevant.
    - Add [NotNullWhen(true)] to TryParseWorkflowResult out parameter following .NET TryXXX conventions.
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * .NET: [Feature Branch] Add nested sub-workflow support for durable workflows (#4190)
    
    * .NET: [Feature Branch] Add nested sub-workflow support for durable workflows
    
    * fix readme path
    
    * Switch Orchestration output from string to DurableWorkflowResult.
    
    * PR feedback fixes
    
    * Minor cleanup based on PR feedback.
    
    * .NET: [Feature Branch] Add Human In the Loop support for durable workflows (#4358)
    
    * Add Azure Functions HITL workflow sample
    
    Add 06_WorkflowHITL Azure Functions sample demonstrating Human-in-the-Loop
    workflow support with HTTP endpoints for status checking and approval responses.
    
    The sample includes:
    - ExpenseReimbursement workflow with RequestPort for manager approval
    - Custom HTTP endpoint to check workflow status and pending approvals
    - Custom HTTP endpoint to send approval responses via RaiseEventAsync
    - demo.http file with step-by-step interaction examples
    
    * PR feedback fixes
    
    * Minor comment cleanup
    
    * Minor comment cleanup
    
    Reverted the `!context.IsReplaying` guards on `PendingEvents.Add`/`RemoveAll`
    and `SetCustomStatus` in `ExecuteRequestPortAsync`. The guards broke fan-out
    scenarios where parallel RequestPorts need to be discoverable after replay.
    `SetCustomStatus` is idempotent metadata that doesn't affect replay determinism.
    
    * Fix for PR feedback
    
    * PR feedback updates
    
    * Improvements to samples
    
    * Improvements to README
    
    * Update samples to use parallel request ports.
    
    * Unit tests
    
    * Introduce local variables to improve readability of Workflows.Workflows access pattern
    
    * Use GitHub-style callouts and add PowerShell command variants in HITL sample README
    
    * Add changelog entries for durable workflow support (#4436)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Bump Microsoft.DurableTask.Worker to 1.19.1 to fix version downgrade
    
    Microsoft.Azure.Functions.Worker.Extensions.DurableTask 1.13.1 requires
    Microsoft.DurableTask.Worker >= 1.19.1 via its transitive dependency on
    Microsoft.DurableTask.Worker.Grpc 1.19.1.
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix broken markdown links in durable workflow sample READMEs
    
    - Create Workflow/README.md with environment setup docs
    - Fix ../README.md -> ../../README.md in ConsoleApps 01, 02, 03, 08
    - Fix SubWorkflows relative path (3 levels -> 4 levels up)
    - Fix dead Durable Task Scheduler URL
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix build errors from main merge: Throw conflict, ExecuteAsync rename, GetNewSessionAsync rename
    
    - Remove InjectSharedThrow from DurableTask csproj (uses Workflows' internal Throw via InternalsVisibleTo)
    - Update ExecuteAsync -> ExecuteCoreAsync with WorkflowTelemetryContext.Disabled
    - Update GetNewSessionAsync -> CreateSessionAsync
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Move durable workflow samples to 04-hosting/DurableWorkflows
    
    Aligns with main branch sample reorganization where durable samples
    live under 04-hosting/ (alongside DurableAgents/).
    
    - Move samples/Durable/Workflow/ -> samples/04-hosting/DurableWorkflows/
    - Add Directory.Build.props matching DurableAgents pattern
    - Update slnx project paths
    - Update integration test sample paths
    - Update README cd paths and cross-references
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix build errors: remove duplicate base class members, update renamed APIs
    
    - Remove duplicate OutputLog, WriteInputAsync, CreateTestTimeoutCts, etc. from
      ConsoleAppSamplesValidation (already in SamplesValidationBase)
    - Update AddFanInEdge -> AddFanInBarrierEdge in workflow samples
    - Update GetNewSessionAsync -> CreateSessionAsync in workflow samples
    - Update SourceId -> ExecutorId (obsolete) in workflow samples
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix dotnet format issues: add UTF-8 BOM and remove unused using
    
    - Add UTF-8 BOM to 20 .cs files across DurableTask, AzureFunctions,
      unit tests, and workflow samples
    - Remove unnecessary using directive in 07_SubWorkflows/Executors.cs
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix typo PaymentProcesser -> PaymentProcessor and garbled arrows in README
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Fix GetExecutorName to handle agent names with underscores
    
    Split on last underscore instead of first, and validate that the
    suffix is a 32-char hex string (sanitized GUID) before stripping it.
    This prevents truncation of agent names like 'my_agent' when the
    executor ID is 'my_agent_<guid>'.
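    
    The parsing rule above, sketched in Python for illustration (the actual
    GetExecutorName is C# and is not shown in this log; get_executor_name
    below is an assumed stand-in). It splits on the last underscore and
    strips the suffix only when it is exactly 32 hex characters.

```python
import re

# A sanitized GUID is 32 hexadecimal characters with no dashes.
_GUID_SUFFIX = re.compile(r"[0-9a-fA-F]{32}")


def get_executor_name(executor_id):
    """Strip a trailing '_<32-hex-char guid>' suffix, if present."""
    # rpartition splits on the LAST underscore, so underscores inside the
    # agent name itself are preserved.
    name, sep, suffix = executor_id.rpartition("_")
    if sep and name and _GUID_SUFFIX.fullmatch(suffix):
        return name
    # No underscore, or the suffix is not a sanitized GUID: leave unchanged.
    return executor_id
```

    For example, an ID made of 'my_agent_' followed by a 32-character hex
    string yields 'my_agent', while a bare 'my_agent' is returned unchanged
    because 'agent' is not a GUID suffix.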
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Align DurableTask.Client.AzureManaged to 1.19.1
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Bump DurableTask and Azure Functions extension package versions
    
    - DurableTask.* packages: 1.19.1 -> 1.22.0
    - Functions.Worker.Extensions.DurableTask: 1.13.1 -> 1.16.0
    - Functions.Worker.Extensions.DurableTask.AzureManaged: 1.0.1 -> 1.5.0 (telemetry bug fix)
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Bump DurableTask SDK packages to 1.22.0
    
    - DurableTask.Client: 1.19.1 -> 1.22.0
    - DurableTask.Client.AzureManaged: 1.19.1 -> 1.22.0
    - DurableTask.Worker: 1.19.1 -> 1.22.0
    - DurableTask.Worker.AzureManaged: 1.19.1 -> 1.22.0
    - Azure Functions extensions kept at original versions (1.13.1/1.0.1) due to
      host-side DurableTask.Core 3.7.0 incompatibility with newer extensions
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    
    * Update Microsoft.Azure.Functions.Worker.Extensions.DurableTask to "1.16.0"
    
    * Add the local.settings.json files to the sample which were previously ignored. This aligns with our other samples.
    
    * Increase timeout for tests as CI has them failing transiently.
    
    * Increase timeout value for Azure Functions integration tests.
    
    * Add YieldsOutput(string) to workflow shared state sample executors
    
    ValidateOrder and EnrichOrder call YieldOutputAsync with string messages,
    but only their TOutput (OrderDetails) was in the allowed yield types.
    This caused TargetInvocationException in the WorkflowSharedState sample
    validation integration test.
    
    * Downgrade the durable packages to 1.18.0
    
    * Downgrading Worker.Extensions.DurableTask to 1.12.1
    
    ---------
    
    Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>