Files
agent-framework/.github/workflows/python-merge-tests.yml
Giles Odigwe 540193ccef Python: Reduce flaky integration tests and improve CI signal quality (#5454)
* Enable Ollama integration tests in CI and rename report to Integration Test Report

- Install Ollama, cache models (qwen2.5:0.5b + nomic-embed-text), and start
  server in the Misc integration job for both workflow files
- Set OLLAMA_MODEL and OLLAMA_EMBEDDING_MODEL env vars so the 5 Ollama tests
  are no longer skipped
- Rename Flaky Test Report to Integration Test Report throughout (job names,
  artifact names, cache keys, file names, script titles/docstrings)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Bump Ollama model to qwen2.5:1.5b for better instruction following

The 0.5b model was too small to reliably follow simple prompts like
'Say Hello World', causing test assertion failures. The 1.5b model
follows instructions more reliably while still being small enough
for fast CI pulls (~1GB).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable reliable streaming integration tests

Remove the hard skip on test_03_reliable_streaming tests that was
temporarily disabled for instability investigation. CI infrastructure
(Azurite, DTS emulator, Redis, func CLI) is already in place.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable skipped Functions/DurableTask tests and bump timeout to 480s

- Remove hard skips from 4 tests in test_11_workflow_parallel.py
- Remove hard skip from test_conditional_branching in test_06_dt_multi_agent_orchestration_conditionals.py
- Increase pytest --timeout from 360 to 480 for Functions+DurableTask CI job
- Updated in both python-merge-tests.yml and python-integration-tests.yml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-skip failing Functions/DurableTask tests with specific root causes

- test_11_workflow_parallel (4 tests): xdist worker crashes during execution
- test_conditional_branching: orchestration fails with RuntimeError, not a timeout
- Keep 480s timeout bump for remaining Functions tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix auth routing in samples 06/11: api_key -> credential for Azure OpenAI

Both samples passed a bearer token provider via api_key= which caused the
client to route to api.openai.com instead of Azure OpenAI, resulting in
401 Unauthorized. Changed to credential= which correctly triggers Azure
routing and picks up AZURE_OPENAI_ENDPOINT from the environment.

- samples/azure_functions/11_workflow_parallel/function_app.py: 1 fix
- samples/durabletask/06_multi_agent_orchestration_conditionals/worker.py: 2 fixes
- Re-enable 4 parallel workflow tests and 1 conditional branching test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-skip parallel workflow tests: xdist worker distribution issue

The 4 parallel workflow tests crash because xdist worksteal distributes
them across separate workers, each spawning its own func process against
shared emulators. Auth fix (api_key->credential) was valid and stays.
test_conditional_branching now passes with the auth fix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix E501 line-too-long in azurefunctions parallel test skip reasons

Wrap skip reason strings to stay within 120 char line limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add retry logic and port-conflict fix for Ollama CI setup

- Kill any auto-started Ollama before launching serve (fixes port
  conflict: 'address already in use')
- Retry ollama pull up to 3 times with 15s backoff (fixes 429 rate
  limit failures)
- Applied to both python-merge-tests.yml and python-integration-tests.yml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix flaky integration tests and re-enable skipped tests

- Foundry agent: add allow_preview=True to custom client test
- Foundry hosting: raise max_output_tokens 50->200, add temperature,
  relax assertion in test_temperature_and_max_tokens
- Foundry embedding: update skip reason with root cause (endpoint mismatch)
- OpenAI file search: fix vector store indexing race condition by polling
  file_counts before querying; fix get_streaming_response -> get_response(stream=True)
- Azure OpenAI file search: remove skip (transient 500 resolved)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Remove temperature from foundry hosting test (unsupported by CI model)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Stabilize Ollama tool call integration tests with no-arg function

Use a no-argument greet() function instead of hello_world(arg1) for
integration tests. The 1.5B model in CI is unreliable at generating
correct tool call arguments, causing 'Argument parsing failed' errors.
A no-arg function eliminates this flakiness entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Increase reliable streaming test timeouts from 30s to 60s

The LLM call through Azure OpenAI + Redis streaming pipeline can exceed
30s in CI due to cold starts or throttling. Raise to 60s to reduce
flaky timeouts while still bounded by pytest's 120s per-test limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable workflow parallel tests with xdist_group marker

The tests were skipped because xdist distributes module tests across
workers, each spawning their own func process (port conflicts). Adding
xdist_group forces all tests in this module onto a single worker so
the module-scoped function_app_for_test fixture works correctly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "Re-enable workflow parallel tests with xdist_group marker"

This reverts commit 455c28da62.

* Rename flaky_report to integration_test_report and add try/finally cleanup

- Rename scripts/flaky_report/ to scripts/integration_test_report/ to
  reflect expanded scope beyond flaky-test detection
- Update workflow references in both CI files
- Wrap file search integration tests in try/finally to ensure vector
  store cleanup runs even on test failure or timeout

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Ollama pull failure propagation and Azure OpenAI vector store readiness

- Ollama CI: fail the step immediately if model pull fails after 3
  retries instead of silently proceeding to tests
- Azure OpenAI file search: add the same vector-store readiness polling
  that was applied to the non-Azure OpenAI tests, preventing eventual
  consistency race conditions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* remove load_dotenv from test file

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-01 00:41:39 +00:00

753 lines
27 KiB
YAML

name: Python - Merge - Tests
#
# NOTE: This workflow and python-integration-tests.yml share the same set of
# parallel test jobs. Keep them in sync — when adding, removing, or modifying a
# job here, apply the same change to python-integration-tests.yml.
#
on:
workflow_dispatch:
pull_request:
branches: ["main"]
merge_group:
branches: ["main"]
schedule:
- cron: "0 0 * * *" # Run at midnight UTC daily
permissions:
contents: read
id-token: write
env:
# Configure a constant location for the uv cache
UV_CACHE_DIR: /tmp/.uv-cache
UV_PYTHON: "3.13"
RUN_SAMPLES_TESTS: ${{ vars.RUN_SAMPLES_TESTS }}
jobs:
paths-filter:
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
outputs:
pythonChanges: ${{ steps.filter.outputs.python }}
coreChanged: ${{ steps.filter.outputs.core }}
openaiChanged: ${{ steps.filter.outputs.openai }}
azureChanged: ${{ steps.filter.outputs.azure }}
miscChanged: ${{ steps.filter.outputs.misc }}
functionsChanged: ${{ steps.filter.outputs.functions }}
foundryChanged: ${{ steps.filter.outputs.foundry }}
foundryHostingChanged: ${{ steps.filter.outputs.foundry_hosting }}
cosmosChanged: ${{ steps.filter.outputs.cosmos }}
steps:
- uses: actions/checkout@v6
- uses: dorny/paths-filter@v3
id: filter
with:
filters: |
python:
- 'python/**'
- '.github/actions/setup-local-mcp-server/**'
- '.github/workflows/python-merge-tests.yml'
- '.github/workflows/python-integration-tests.yml'
core:
- 'python/packages/core/agent_framework/_*.py'
- 'python/packages/core/agent_framework/_workflows/**'
- 'python/packages/core/agent_framework/exceptions.py'
- 'python/packages/core/agent_framework/observability.py'
openai:
- 'python/packages/core/agent_framework/openai/**'
- 'python/packages/openai/**'
- 'python/samples/**/providers/openai/**'
azure:
- 'python/packages/openai/**'
- 'python/packages/core/agent_framework/azure/**'
- 'python/samples/**/providers/azure/**'
misc:
- 'python/packages/anthropic/**'
- 'python/packages/hyperlight/**'
- 'python/packages/ollama/**'
- 'python/packages/core/agent_framework/_mcp.py'
- 'python/packages/core/tests/core/test_mcp.py'
- 'python/scripts/local_mcp_streamable_http_server.py'
- '.github/actions/setup-local-mcp-server/**'
- '.github/workflows/python-merge-tests.yml'
- '.github/workflows/python-integration-tests.yml'
functions:
- 'python/packages/azurefunctions/**'
- 'python/packages/durabletask/**'
foundry:
- 'python/packages/foundry/**'
- 'python/samples/**/providers/foundry/**'
- 'python/samples/02-agents/embeddings/foundry_embeddings.py'
foundry_hosting:
- 'python/packages/foundry_hosting/**'
cosmos:
- 'python/packages/azure-cosmos/**'
# run only if 'python' files were changed
- name: python tests
if: steps.filter.outputs.python == 'true'
run: echo "Python file"
# run only if not 'python' files were changed
- name: not python tests
if: steps.filter.outputs.python != 'true'
run: echo "NOT python file"
# Unit tests: always run all non-integration tests across all packages
python-tests-unit:
name: Python Tests - Unit
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true'
runs-on: ubuntu-latest
environment: integration
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Test with pytest (unit tests only)
run: >
uv run poe test -A
-m "not integration"
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
--junitxml=pytest.xml
working-directory: ./python
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: Unit test results
# OpenAI integration tests
python-tests-openai:
name: Python Tests - OpenAI Integration
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true' &&
(github.event_name != 'merge_group' ||
needs.paths-filter.outputs.openaiChanged == 'true' ||
needs.paths-filter.outputs.coreChanged == 'true')
runs-on: ubuntu-latest
environment: integration
env:
OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.OPENAI__CHATMODELID }}
OPENAI_CHAT_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_EMBEDDING_MODEL: ${{ vars.OPENAI_EMBEDDING_MODEL_ID }}
OPENAI_API_KEY: ${{ secrets.OPENAI__APIKEY }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Test with pytest (OpenAI integration)
run: >
uv run pytest --import-mode=importlib
packages/openai/tests
-m "integration and not azure"
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
--junitxml=pytest.xml
working-directory: ./python
- name: Test OpenAI samples
timeout-minutes: 10
if: env.RUN_SAMPLES_TESTS == 'true'
run: uv run pytest tests/samples/ -m "openai"
working-directory: ./python
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: OpenAI integration test results
- name: Upload test results
if: always()
uses: actions/upload-artifact@v7
with:
name: test-results-openai
path: ./python/pytest.xml
if-no-files-found: ignore
# Azure OpenAI integration tests
python-tests-azure-openai:
name: Python Tests - Azure OpenAI Integration
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true' &&
(github.event_name != 'merge_group' ||
needs.paths-filter.outputs.azureChanged == 'true' ||
needs.paths-filter.outputs.coreChanged == 'true')
runs-on: ubuntu-latest
environment: integration
env:
AZURE_OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.AZUREOPENAI__CHATDEPLOYMENTNAME }}
AZURE_OPENAI_CHAT_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_EMBEDDING_MODEL: ${{ vars.AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME }}
AZURE_OPENAI_ENDPOINT: ${{ vars.AZUREOPENAI__ENDPOINT }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Azure CLI Login
if: github.event_name != 'pull_request'
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Test with pytest (Azure OpenAI integration)
run: >
uv run pytest --import-mode=importlib
packages/openai/tests/openai/test_openai_chat_completion_client_azure.py
packages/openai/tests/openai/test_openai_chat_client_azure.py
packages/openai/tests/openai/test_openai_embedding_client_azure.py
-m integration
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
--junitxml=pytest.xml
working-directory: ./python
- name: Test Azure samples
timeout-minutes: 10
if: env.RUN_SAMPLES_TESTS == 'true'
run: uv run pytest tests/samples/ -m "azure"
working-directory: ./python
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: Azure OpenAI integration test results
- name: Upload test results
if: always()
uses: actions/upload-artifact@v7
with:
name: test-results-azure-openai
path: ./python/pytest.xml
if-no-files-found: ignore
# Misc integration tests (Anthropic, Ollama, MCP)
python-tests-misc-integration:
name: Python Tests - Misc Integration
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true' &&
(github.event_name != 'merge_group' ||
needs.paths-filter.outputs.miscChanged == 'true' ||
needs.paths-filter.outputs.coreChanged == 'true')
runs-on: ubuntu-latest
environment: integration
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
ANTHROPIC_CHAT_MODEL: ${{ vars.ANTHROPIC_CHAT_MODEL_ID }}
LOCAL_MCP_URL: ${{ vars.LOCAL_MCP__URL }}
OLLAMA_MODEL: qwen2.5:1.5b
OLLAMA_EMBEDDING_MODEL: nomic-embed-text
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Install Ollama
run: curl -fsSL https://ollama.com/install.sh | sh
working-directory: .
- name: Cache Ollama models
uses: actions/cache@v4
with:
path: ~/.ollama/models
key: ollama-models-qwen2.5-1.5b-nomic-embed-text-v1
- name: Start Ollama and pull models
run: |
# Stop any Ollama instance auto-started by the install script
pkill ollama || true
sleep 2
ollama serve &
for i in $(seq 1 30); do
if curl -sf http://localhost:11434/api/tags > /dev/null 2>&1; then
break
fi
sleep 1
done
# Pull models with retry for transient 429 rate limits
for model in qwen2.5:1.5b nomic-embed-text; do
pulled=false
for attempt in 1 2 3; do
if ollama pull "$model"; then
pulled=true
break
fi
echo "Retry $attempt for $model (waiting 15s)..."
sleep 15
done
if [ "$pulled" != "true" ]; then
echo "ERROR: Failed to pull $model after 3 attempts"
exit 1
fi
done
working-directory: .
- name: Start local MCP server
id: local-mcp
uses: ./.github/actions/setup-local-mcp-server
with:
fallback_url: ${{ env.LOCAL_MCP_URL }}
- name: Prefer local MCP URL when available
run: echo "LOCAL_MCP_URL=${{ steps.local-mcp.outputs.effective_url }}" >> "$GITHUB_ENV"
- name: Test with pytest (Anthropic, Hyperlight, Ollama, MCP integration)
run: >
uv run pytest --import-mode=importlib
packages/anthropic/tests
packages/hyperlight/tests
packages/ollama/tests
packages/core/tests/core/test_mcp.py
-m integration
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 30
--junitxml=pytest.xml
working-directory: ./python
- name: Stop local MCP server
if: always()
shell: bash
run: |
set -euo pipefail
server_pid="${{ steps.local-mcp.outputs.pid }}"
if [[ -z "$server_pid" ]]; then
exit 0
fi
if ! kill -0 "$server_pid" 2>/dev/null; then
exit 0
fi
kill -TERM -- "-$server_pid" 2>/dev/null || kill -TERM "$server_pid" 2>/dev/null || true
for _ in $(seq 1 10); do
if ! kill -0 "$server_pid" 2>/dev/null; then
exit 0
fi
sleep 1
done
kill -KILL -- "-$server_pid" 2>/dev/null || kill -KILL "$server_pid" 2>/dev/null || true
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: Misc integration test results
- name: Upload test results
if: always()
uses: actions/upload-artifact@v7
with:
name: test-results-misc
path: ./python/pytest.xml
if-no-files-found: ignore
# Azure Functions + Durable Task integration tests
python-tests-functions:
name: Python Tests - Functions Integration
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true' &&
(github.event_name != 'merge_group' ||
needs.paths-filter.outputs.functionsChanged == 'true' ||
needs.paths-filter.outputs.coreChanged == 'true')
runs-on: ubuntu-latest
environment: integration
env:
UV_PYTHON: "3.11"
OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.OPENAI__CHATMODELID }}
OPENAI_CHAT_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_EMBEDDING_MODEL: ${{ vars.OPENAI_EMBEDDING_MODEL_ID }}
OPENAI_API_KEY: ${{ secrets.OPENAI__APIKEY }}
AZURE_OPENAI_ENDPOINT: ${{ vars.AZUREOPENAI__ENDPOINT }}
AZURE_OPENAI_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_CHAT_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.AZUREOPENAI__CHATDEPLOYMENTNAME }}
FOUNDRY_PROJECT_ENDPOINT: ${{ vars.FOUNDRY_PROJECT_ENDPOINT }}
FOUNDRY_MODEL: ${{ vars.FOUNDRY_MODEL }}
FUNCTIONS_WORKER_RUNTIME: "python"
DURABLE_TASK_SCHEDULER_CONNECTION_STRING: "Endpoint=http://localhost:8080;TaskHub=default;Authentication=None"
AzureWebJobsStorage: "UseDevelopmentStorage=true"
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Azure CLI Login
if: github.event_name != 'pull_request'
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Set up Azure Functions Integration Test Emulators
uses: ./.github/actions/azure-functions-integration-setup
id: azure-functions-setup
- name: Test with pytest (Functions + Durable Task integration)
run: >
uv run pytest --import-mode=importlib
packages/azurefunctions/tests/integration_tests
packages/durabletask/tests/integration_tests
-m integration
-n logical --dist worksteal
-x
--timeout=480 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
--junitxml=pytest.xml
working-directory: ./python
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: Functions integration test results
- name: Upload test results
if: always()
uses: actions/upload-artifact@v7
with:
name: test-results-functions
path: ./python/pytest.xml
if-no-files-found: ignore
python-tests-foundry:
name: Python Integration Tests - Foundry
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true' &&
(github.event_name != 'merge_group' ||
needs.paths-filter.outputs.foundryChanged == 'true' ||
needs.paths-filter.outputs.coreChanged == 'true')
runs-on: ubuntu-latest
environment: integration
env:
FOUNDRY_PROJECT_ENDPOINT: ${{ vars.FOUNDRY_PROJECT_ENDPOINT }}
FOUNDRY_MODEL: ${{ vars.FOUNDRY_MODEL }}
FOUNDRY_AGENT_NAME: ${{ vars.FOUNDRY_AGENT_NAME }}
FOUNDRY_AGENT_VERSION: ${{ vars.FOUNDRY_AGENT_VERSION }}
FOUNDRY_MODELS_ENDPOINT: ${{ vars.FOUNDRY_MODELS_ENDPOINT || '' }}
FOUNDRY_MODELS_API_KEY: ${{ secrets.FOUNDRY_MODELS_API_KEY || '' }}
FOUNDRY_EMBEDDING_MODEL: ${{ vars.FOUNDRY_EMBEDDING_MODEL || '' }}
FOUNDRY_IMAGE_EMBEDDING_MODEL: ${{ vars.FOUNDRY_IMAGE_EMBEDDING_MODEL || '' }}
LOCAL_MCP_URL: ${{ vars.LOCAL_MCP__URL }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Azure CLI Login
if: github.event_name != 'pull_request'
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Test with pytest
timeout-minutes: 15
run: >
uv run pytest --import-mode=importlib
packages/foundry/tests
-m integration
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
--junitxml=pytest.xml
working-directory: ./python
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: Test results
- name: Upload test results
if: always()
uses: actions/upload-artifact@v7
with:
name: test-results-foundry
path: ./python/pytest.xml
if-no-files-found: ignore
# Foundry Hosting integration tests
python-tests-foundry-hosting:
name: Python Tests - Foundry Hosting Integration
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true' &&
(github.event_name != 'merge_group' ||
needs.paths-filter.outputs.foundryHostingChanged == 'true' ||
needs.paths-filter.outputs.coreChanged == 'true')
runs-on: ubuntu-latest
environment: integration
env:
FOUNDRY_PROJECT_ENDPOINT: ${{ vars.FOUNDRY_PROJECT_ENDPOINT }}
FOUNDRY_MODEL: ${{ vars.FOUNDRY_MODEL }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Azure CLI Login
if: github.event_name != 'pull_request'
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Test with pytest (Foundry Hosting integration)
timeout-minutes: 15
run: >
uv run pytest --import-mode=importlib
packages/foundry_hosting/tests
-m integration
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
--junitxml=pytest.xml
working-directory: ./python
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: Foundry Hosting integration test results
- name: Upload test results
if: always()
uses: actions/upload-artifact@v7
with:
name: test-results-foundry-hosting
path: ./python/pytest.xml
if-no-files-found: ignore
# TODO: Add python-tests-lab
# Azure Cosmos integration tests
python-tests-cosmos:
name: Python Tests - Cosmos Integration
needs: paths-filter
if: >
github.event_name != 'pull_request' &&
needs.paths-filter.outputs.pythonChanges == 'true' &&
(github.event_name != 'merge_group' ||
needs.paths-filter.outputs.cosmosChanged == 'true' ||
needs.paths-filter.outputs.coreChanged == 'true')
runs-on: ubuntu-latest
environment: integration
services:
cosmosdb:
image: mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview
ports:
- 8081:8081
env:
AZURE_COSMOS_ENDPOINT: "http://localhost:8081/"
# Static Azure Cosmos DB emulator key (documented): https://learn.microsoft.com/en-us/azure/cosmos-db/emulator
AZURE_COSMOS_KEY: "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
AZURE_COSMOS_DATABASE_NAME: "agent-framework-cosmos-it-db"
AZURE_COSMOS_CONTAINER_NAME: "agent-framework-cosmos-it-container"
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Wait for Cosmos DB emulator
run: |
for i in {1..60}; do
if curl --silent --show-error http://localhost:8081/ > /dev/null; then
echo "Cosmos DB emulator is ready."
exit 0
fi
sleep 2
done
echo "Cosmos DB emulator did not become ready in time." >&2
exit 1
- name: Test with pytest (Cosmos integration)
run: uv run --directory packages/azure-cosmos poe integration-tests -n logical --dist worksteal --timeout=120 --session-timeout=900 --timeout_method thread --retries 2 --retry-delay 5 --junitxml=${{ github.workspace }}/python/pytest.xml
working-directory: ./python
- name: Surface failing tests
if: always()
uses: pmeier/pytest-results-action@v0.7.2
with:
path: ./python/pytest.xml
summary: true
display-options: fEX
fail-on-empty: false
title: Cosmos integration test results
- name: Upload test results
if: always()
uses: actions/upload-artifact@v7
with:
name: test-results-cosmos
path: ./python/pytest.xml
if-no-files-found: ignore
# Integration test trend report (aggregates per-job JUnit XML results)
python-integration-test-report:
name: Integration Test Report
if: >
always() &&
(contains(join(needs.*.result, ','), 'success') ||
contains(join(needs.*.result, ','), 'failure'))
needs:
[
python-tests-openai,
python-tests-azure-openai,
python-tests-misc-integration,
python-tests-functions,
python-tests-foundry,
python-tests-foundry-hosting,
python-tests-cosmos,
]
runs-on: ubuntu-latest
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
- name: Set up python and install the project
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Download all test results from current run
uses: actions/download-artifact@v4
with:
pattern: test-results-*
path: test-results/
- name: Restore report history cache
uses: actions/cache/restore@v4
with:
path: python/integration-report-history.json
key: integration-report-history-merge-${{ github.run_id }}
restore-keys: |
integration-report-history-merge-
- name: Generate trend report
run: >
uv run python scripts/integration_test_report/aggregate.py
../test-results/
integration-report-history.json
integration-test-report.md
- name: Post to Job Summary
if: always()
run: cat integration-test-report.md >> $GITHUB_STEP_SUMMARY
- name: Save report history cache
if: always()
uses: actions/cache/save@v4
with:
path: python/integration-report-history.json
key: integration-report-history-merge-${{ github.run_id }}
- name: Upload unified trend report
if: always()
uses: actions/upload-artifact@v7
with:
name: integration-test-report
path: |
python/integration-test-report.md
python/integration-report-history.json
python-integration-tests-check:
if: always()
runs-on: ubuntu-latest
needs:
[
python-tests-unit,
python-tests-openai,
python-tests-azure-openai,
python-tests-misc-integration,
python-tests-functions,
python-tests-foundry,
python-tests-foundry-hosting,
python-tests-cosmos,
]
steps:
- name: Fail workflow if tests failed
id: check_tests_failed
if: contains(join(needs.*.result, ','), 'failure')
uses: actions/github-script@v8
with:
script: core.setFailed('Integration Tests Failed!')
- name: Fail workflow if tests cancelled
id: check_tests_cancelled
if: contains(join(needs.*.result, ','), 'cancelled')
uses: actions/github-script@v8
with:
script: core.setFailed('Integration Tests Cancelled!')