Files
agent-framework/.github/workflows/python-integration-tests.yml
Eduard van Valkenburg b03cb324d5 Python: Add Hyperlight CodeAct package and docs (#5185)
* initial work on code_mode

* updated samples

* updates to codeact

* udpated codeact

* Draft CodeAct ADR and sample updates

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* initial implementation and adr and feature

* Python: Limit Hyperlight wasm backend to Python <3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix CI for Hyperlight CodeAct PR

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Run Hyperlight integration when available

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Address Hyperlight review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Simplify Hyperlight file mount inputs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Accept Path host paths in Hyperlight mounts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix Hyperlight mount typing for CI

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* temp run integration test

* Python: Strengthen Hyperlight real sandbox tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* added additional tests

* Python: Simplify Hyperlight CodeAct API

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* set tests as non-integration

* Retry Hyperlight allowed-domain registration

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Gate Hyperlight integration tests by runtime support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight skip test on Python 3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Delay Hyperlight runtime probe until test execution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Relax Hyperlight Windows integration stdout assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Scan Hyperlight output directory for artifacts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight output artifact collection

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Harden Hyperlight integration output assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight read-back check in integration test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify Hyperlight integration write assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Avoid pathlib in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use socket network check in Hyperlight sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Replace blocked Azure AI Search blog link

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Clarify Hyperlight guest stdlib limits

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use _socket in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Handle Hyperlight mounted file paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Broaden Hyperlight sandbox path fallbacks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Search Hyperlight guest mounts recursively

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight mount coverage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight live network tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight file-write test on Windows

Enable the sandbox filesystem by providing a workspace_root so
/output is mounted. Remove os.path.exists assertion (unsupported
in WASM guest) and fix Content data assertion to use .uri.
Skip the network integration test on Windows where the WASM
sandbox lacks the encodings.idna codec.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: ADR intro, manual wiring sample, doc clarifications

- Add CodeAct introduction section to ADR for unfamiliar readers
- Clarify 'less runtime efficient' con with specific overhead description
- Add note in Python impl doc clarifying ADR vs impl doc split
- Explain why before_run hooks must be per-run (CRUD, concurrency, approval)
- Rename code_interpreter variable to codeact in E2E sample
- Add manual static wiring sample (codeact_manual_wiring.py)
- Add 'when to use which pattern' guidance to samples README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #5185 review comments and add .NET CodeAct design doc

- Fix async callback: _make_sandbox_callback returns sync wrapper with
  thread + asyncio.run() bridge (was broken with real Wasm FFI)
- Fix stale output: clear output_dir before each sandbox.run() call
- Fix blocking event loop: _run_code now async with asyncio.to_thread()
- Revert _agents.py options['tools'] injection (unnecessary; provider
  uses context.extend_tools())
- Revert SessionContext.options docstring back to read-only
- Add real-sandbox test fixtures (shared/restored/fresh)
- Add 8 new real-sandbox tests for callback round-trip, stale output,
  event loop non-blocking, basic execution, stdout/stderr, errors,
  snapshot/restore, and tool registration
- Add comprehensive .NET HyperlightCodeActProvider design document

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update hyperlight README with code snippets and remove Public API section

Replace bare export list with Quick Start code examples covering the
context provider, standalone tool, manual static wiring, and file
mounts / network access patterns.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-17 00:49:44 +00:00

369 lines
13 KiB
YAML

#
# Dedicated Python integration tests workflow, called from the manual integration test orchestrator.
# Runs all tests (unit + integration) split into parallel jobs by provider.
#
# NOTE: This workflow and python-merge-tests.yml share the same set of parallel
# test jobs. Keep them in sync — when adding, removing, or modifying a job here,
# apply the same change to python-merge-tests.yml.
#
name: python-integration-tests
on:
workflow_call:
inputs:
checkout-ref:
description: "Git ref to checkout (e.g., refs/pull/123/head)"
required: true
type: string
permissions:
contents: read
id-token: write
env:
UV_CACHE_DIR: /tmp/.uv-cache
UV_PYTHON: "3.13"
jobs:
# Unit tests: all non-integration tests across all packages
python-tests-unit:
name: Python Integration Tests - Unit
runs-on: ubuntu-latest
environment: integration
timeout-minutes: 60
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.checkout-ref }}
persist-credentials: false
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Test with pytest (unit tests only)
run: >
uv run poe test -A
-m "not integration"
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
# OpenAI integration tests
python-tests-openai:
name: Python Integration Tests - OpenAI
runs-on: ubuntu-latest
environment: integration
timeout-minutes: 60
env:
OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.OPENAI__CHATMODELID }}
OPENAI_CHAT_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_EMBEDDING_MODEL: ${{ vars.OPENAI_EMBEDDING_MODEL_ID }}
OPENAI_API_KEY: ${{ secrets.OPENAI__APIKEY }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.checkout-ref }}
persist-credentials: false
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Test with pytest (OpenAI integration)
run: >
uv run pytest --import-mode=importlib
packages/openai/tests
-m "integration and not azure"
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
# Azure OpenAI integration tests
python-tests-azure-openai:
name: Python Integration Tests - Azure OpenAI
runs-on: ubuntu-latest
environment: integration
timeout-minutes: 60
env:
AZURE_OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.AZUREOPENAI__CHATDEPLOYMENTNAME }}
AZURE_OPENAI_CHAT_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_EMBEDDING_MODEL: ${{ vars.AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME }}
AZURE_OPENAI_ENDPOINT: ${{ vars.AZUREOPENAI__ENDPOINT }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.checkout-ref }}
persist-credentials: false
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Azure CLI Login
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Test with pytest (Azure OpenAI integration)
run: >
uv run pytest --import-mode=importlib
packages/openai/tests/openai/test_openai_chat_completion_client_azure.py
packages/openai/tests/openai/test_openai_chat_client_azure.py
packages/openai/tests/openai/test_openai_embedding_client_azure.py
-m integration
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
# Misc integration tests (Anthropic, Hyperlight, Ollama, MCP)
python-tests-misc-integration:
name: Python Integration Tests - Misc
runs-on: ubuntu-latest
environment: integration
timeout-minutes: 60
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
ANTHROPIC_CHAT_MODEL: ${{ vars.ANTHROPIC_CHAT_MODEL_ID }}
LOCAL_MCP_URL: ${{ vars.LOCAL_MCP__URL }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.checkout-ref }}
persist-credentials: false
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Start local MCP server
id: local-mcp
uses: ./.github/actions/setup-local-mcp-server
with:
fallback_url: ${{ env.LOCAL_MCP_URL }}
- name: Prefer local MCP URL when available
run: echo "LOCAL_MCP_URL=${{ steps.local-mcp.outputs.effective_url }}" >> "$GITHUB_ENV"
- name: Test with pytest (Anthropic, Hyperlight, Ollama, MCP integration)
run: >
uv run pytest --import-mode=importlib
packages/anthropic/tests
packages/hyperlight/tests
packages/ollama/tests
packages/core/tests/core/test_mcp.py
-m integration
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 30
- name: Stop local MCP server
if: always()
shell: bash
run: |
set -euo pipefail
server_pid="${{ steps.local-mcp.outputs.pid }}"
if [[ -z "$server_pid" ]]; then
exit 0
fi
if ! kill -0 "$server_pid" 2>/dev/null; then
exit 0
fi
kill -TERM -- "-$server_pid" 2>/dev/null || kill -TERM "$server_pid" 2>/dev/null || true
for _ in $(seq 1 10); do
if ! kill -0 "$server_pid" 2>/dev/null; then
exit 0
fi
sleep 1
done
kill -KILL -- "-$server_pid" 2>/dev/null || kill -KILL "$server_pid" 2>/dev/null || true
# Azure Functions + Durable Task integration tests
python-tests-functions:
name: Python Integration Tests - Functions
runs-on: ubuntu-latest
environment: integration
timeout-minutes: 60
env:
UV_PYTHON: "3.11"
OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.OPENAI__CHATMODELID }}
OPENAI_CHAT_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_MODEL: ${{ vars.OPENAI__RESPONSESMODELID }}
OPENAI_EMBEDDING_MODEL: ${{ vars.OPENAI_EMBEDDING_MODEL_ID }}
OPENAI_API_KEY: ${{ secrets.OPENAI__APIKEY }}
AZURE_OPENAI_ENDPOINT: ${{ vars.AZUREOPENAI__ENDPOINT }}
AZURE_OPENAI_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_CHAT_MODEL: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }}
AZURE_OPENAI_CHAT_COMPLETION_MODEL: ${{ vars.AZUREOPENAI__CHATDEPLOYMENTNAME }}
FOUNDRY_PROJECT_ENDPOINT: ${{ vars.FOUNDRY_PROJECT_ENDPOINT }}
FOUNDRY_MODEL: ${{ vars.FOUNDRY_MODEL }}
FUNCTIONS_WORKER_RUNTIME: "python"
DURABLE_TASK_SCHEDULER_CONNECTION_STRING: "Endpoint=http://localhost:8080;TaskHub=default;Authentication=None"
AzureWebJobsStorage: "UseDevelopmentStorage=true"
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.checkout-ref }}
persist-credentials: false
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Azure CLI Login
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Set up Azure Functions Integration Test Emulators
uses: ./.github/actions/azure-functions-integration-setup
id: azure-functions-setup
- name: Test with pytest (Functions + Durable Task integration)
run: >
uv run pytest --import-mode=importlib
packages/azurefunctions/tests/integration_tests
packages/durabletask/tests/integration_tests
-m integration
-n logical --dist worksteal
-x
--timeout=360 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
# Foundry integration tests
python-tests-foundry:
name: Python Integration Tests - Foundry
runs-on: ubuntu-latest
environment: integration
timeout-minutes: 60
env:
FOUNDRY_PROJECT_ENDPOINT: ${{ vars.FOUNDRY_PROJECT_ENDPOINT }}
FOUNDRY_MODEL: ${{ vars.FOUNDRY_MODEL }}
FOUNDRY_AGENT_NAME: ${{ vars.FOUNDRY_AGENT_NAME }}
FOUNDRY_AGENT_VERSION: ${{ vars.FOUNDRY_AGENT_VERSION }}
FOUNDRY_MODELS_ENDPOINT: ${{ vars.FOUNDRY_MODELS_ENDPOINT || '' }}
FOUNDRY_MODELS_API_KEY: ${{ secrets.FOUNDRY_MODELS_API_KEY || '' }}
FOUNDRY_EMBEDDING_MODEL: ${{ vars.FOUNDRY_EMBEDDING_MODEL || '' }}
FOUNDRY_IMAGE_EMBEDDING_MODEL: ${{ vars.FOUNDRY_IMAGE_EMBEDDING_MODEL || '' }}
LOCAL_MCP_URL: ${{ vars.LOCAL_MCP__URL }}
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.checkout-ref }}
persist-credentials: false
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Azure CLI Login
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: Test with pytest
timeout-minutes: 15
run: >
uv run pytest --import-mode=importlib
packages/foundry/tests
-m integration
-n logical --dist worksteal
--timeout=120 --session-timeout=900 --timeout_method thread
--retries 2 --retry-delay 5
# Azure Cosmos integration tests
python-tests-cosmos:
name: Python Integration Tests - Cosmos
runs-on: ubuntu-latest
environment: integration
timeout-minutes: 60
services:
cosmosdb:
image: mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview
ports:
- 8081:8081
env:
AZURE_COSMOS_ENDPOINT: "http://localhost:8081/"
# Static Azure Cosmos DB emulator key (documented): https://learn.microsoft.com/en-us/azure/cosmos-db/emulator
AZURE_COSMOS_KEY: "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
AZURE_COSMOS_DATABASE_NAME: "agent-framework-cosmos-it-db"
AZURE_COSMOS_CONTAINER_NAME: "agent-framework-cosmos-it-container"
defaults:
run:
working-directory: python
steps:
- uses: actions/checkout@v6
with:
ref: ${{ inputs.checkout-ref }}
persist-credentials: false
- name: Set up python and install the project
id: python-setup
uses: ./.github/actions/python-setup
with:
python-version: ${{ env.UV_PYTHON }}
os: ${{ runner.os }}
- name: Wait for Cosmos DB emulator
run: |
for i in {1..60}; do
if curl --silent --show-error http://localhost:8081/ > /dev/null; then
echo "Cosmos DB emulator is ready."
exit 0
fi
sleep 2
done
echo "Cosmos DB emulator did not become ready in time." >&2
exit 1
- name: Test with pytest (Cosmos integration)
run: uv run --directory packages/azure-cosmos poe integration-tests -n logical --dist worksteal --timeout=120 --session-timeout=900 --timeout_method thread --retries 2 --retry-delay 5
python-integration-tests-check:
if: always()
runs-on: ubuntu-latest
needs:
[
python-tests-unit,
python-tests-openai,
python-tests-azure-openai,
python-tests-misc-integration,
python-tests-functions,
python-tests-foundry,
python-tests-cosmos
]
steps:
- name: Fail workflow if tests failed
if: contains(join(needs.*.result, ','), 'failure')
uses: actions/github-script@v8
with:
script: core.setFailed('Integration Tests Failed!')
- name: Fail workflow if tests cancelled
if: contains(join(needs.*.result, ','), 'cancelled')
uses: actions/github-script@v8
with:
script: core.setFailed('Integration Tests Cancelled!')