Python: updated integration tests and guidance (#4181)

* updated integration tests and guidance

* fixed merge test

* updated integration tests

* fix: remove duplicate --dist loadfile flag from pytest-xdist config

  Only one --dist mode can be active at a time; the second value silently overrides the first. Keep --dist worksteal (dynamic load balancing) and remove the redundant --dist loadfile from all workflow files and pyproject.toml configs.

* docs: add keep-in-sync notes for merge and integration test workflows

  Both python-merge-tests.yml and python-integration-tests.yml share the same parallel job structure. Added sync reminders in workflow file comments, the python-testing SKILL.md, and CODING_STANDARD.md.

* refactor: remove RUN_INTEGRATION_TESTS flag

  Integration test gating now uses two mechanisms:
  - `@pytest.mark.integration` for test selection via `-m` filtering
  - `skip_if_*_disabled` for credential/service availability checks

  The RUN_INTEGRATION_TESTS env var was redundant since the marker handles selection and the skip decorators already check for actual credentials.

* fix: sync missing env vars from merge-tests to integration-tests

  Add OPENAI_EMBEDDINGS_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME to python-integration-tests.yml to match python-merge-tests.yml.

* fix: remove remaining RUN_INTEGRATION_TESTS from embedding tests and docs

  Missed test_openai_embedding_client.py and vector-stores README in the earlier cleanup.

* set functions tests to 3.10

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-16 21:04:09 +08:00 · 2026-02-24 10:35:46 +01:00
parent 6305e3e092
commit acc49196c1
52 changed files with 731 additions and 212 deletions
@@ -9,7 +9,7 @@ description: >

 We strive for at least 85% test coverage across the codebase, with a focus on core packages and critical paths. Tests should be fast, reliable, and maintainable.
 When adding new code, check that the relevant sections of the codebase are covered by tests, and add new tests as needed. When modifying existing code, update or add tests to cover the changes.
-We run tests in two stages, for a PR each commit is tested with `RUN_INTEGRATION_TESTS=false` (unit tests only), and the full suite with `RUN_INTEGRATION_TESTS=true` is run when merging.
+We run tests in two stages: on a PR, each commit is tested with unit tests only (using `-m "not integration"`), and the full suite, including integration tests, is run when merging.

 ## Running Tests

@@ -25,6 +25,12 @@ uv run poe all-tests

 # With coverage
 uv run poe all-tests-cov
+
+# Run only unit tests (exclude integration tests)
+uv run poe all-tests -m "not integration"
+
+# Run only integration tests
+uv run poe all-tests -m integration
 ```

 ## Test Configuration
@@ -32,6 +38,7 @@ uv run poe all-tests-cov
 - **Async mode**: `asyncio_mode = "auto"` is enabled — do NOT use `@pytest.mark.asyncio`, but do mark tests with `async def` and use `await` for async calls
 - **Timeout**: Default 60 seconds per test
 - **Import mode**: `importlib` for cross-package isolation
+- **Parallelization**: Large packages (core, ag-ui, orchestrations, anthropic) use `pytest-xdist` (`-n auto --dist worksteal`) in their `poe test` task. The `all-tests` task also uses xdist across all packages.

 ## Test Directory Structure

@@ -72,9 +79,68 @@ packages/core/

 ## Integration Tests

-Tests marked with `@skip_if_..._integration_tests_disabled` require:
- `RUN_INTEGRATION_TESTS=true` environment variable
- Appropriate API keys in environment or `.env` file
+Integration tests require external services (OpenAI, Azure, etc.) and are controlled by three markers:
+
+1. **`@pytest.mark.flaky`** — marks the test as potentially flaky since it depends on external services
+2. **`@pytest.mark.integration`** — used for test selection, so integration tests can be included/excluded with `-m integration` / `-m "not integration"`
+3. **`@skip_if_..._integration_tests_disabled`** decorator — skips the test when the required API keys or service endpoints are missing
+
+### Adding New Integration Tests
+
+All three markers must be applied to every new integration test:
+
+```python
+@pytest.mark.flaky
+@pytest.mark.integration
+@skip_if_openai_integration_tests_disabled
+async def test_openai_chat_completion() -> None:
+    ...
+```
+
+For test files where all tests are integration tests (e.g., Azure Functions, Durable Task), use the module-level `pytestmark` list:
+
+```python
+pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
+    pytest.mark.sample("01_single_agent"),
+    pytest.mark.usefixtures("function_app_for_test"),
+]
+```
+
+### CI Workflow
+
+The merge CI workflow (`python-merge-tests.yml`) splits integration tests into parallel jobs by provider with change-based detection:
+
+- **Unit tests** — always run all non-integration tests
+- **OpenAI integration** — runs when `packages/core/agent_framework/openai/` or core infrastructure changes
+- **Azure OpenAI integration** — runs when `packages/core/agent_framework/azure/` or core changes
+- **Misc integration** — Anthropic, Ollama, MCP tests; runs when their packages or core change
+- **Functions integration** — Azure Functions + Durable Task; runs when their packages or core change
+- **Azure AI integration** — runs when `packages/azure-ai/` or core changes
+
+Core infrastructure changes (e.g., `_agents.py`, `_types.py`) trigger all integration test jobs. Scheduled and manual runs always execute all jobs.
+
+### Keeping CI Workflows in Sync
+
+Two workflow files define the same set of parallel test jobs:
+
+- **`python-merge-tests.yml`** — runs on PRs, merge queue, schedule, and manual dispatch. Uses path-based change detection to skip unaffected integration jobs.
+- **`python-integration-tests.yml`** — called from the manual integration test orchestrator (`integration-tests-manual.yml`). Always runs all jobs (no path filtering).
+
+These workflows must be kept in sync. When you add, remove, or modify a test job, update **both** files. The job structure, pytest commands, and xdist flags should match between them. The only difference is that `python-merge-tests.yml` has path filters and conditional job execution, while `python-integration-tests.yml` does not.
+
+### Updating the CI When Adding Integration Tests for a New Provider
+
+When adding integration tests for a new provider package, you must update **both** `python-merge-tests.yml` and `python-integration-tests.yml`:
+
+1. **Add a path filter** for the new provider in the `paths-filter` job in `python-merge-tests.yml` so the CI knows which file changes should trigger those tests.
+2. **Add the test job to both workflow files**: either add the tests to the existing `python-tests-misc-integration` job, or create a dedicated job if the provider:
+   - Has a large number of integration tests
+   - Requires special infrastructure setup (emulators, Docker containers, etc.)
+   - Has long-running tests that would slow down the misc job
+
+The `python-tests-misc-integration` job is intended for small integration test suites that don't need dedicated infrastructure. When a provider's integration tests grow large or gain special requirements, split them out into their own job (like `python-tests-functions` was split out for Azure Functions + Durable Task).
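
As a rough illustration of the change-based detection described in the CI Workflow section, the sketch below maps changed file paths to integration jobs. The three path prefixes come from the job list above. The job names, the `CORE_PREFIX` rule for what counts as core infrastructure, and the function names are assumptions made for illustration, not the contents of `python-merge-tests.yml`. Only three provider jobs are modeled because the remaining path filters are not spelled out in this guidance.

```python
from __future__ import annotations

# Path prefixes taken from the job list above; the job names are hypothetical.
JOB_PATHS: dict[str, tuple[str, ...]] = {
    "openai-integration": ("packages/core/agent_framework/openai/",),
    "azure-openai-integration": ("packages/core/agent_framework/azure/",),
    "azure-ai-integration": ("packages/azure-ai/",),
}

# Assumption: "core infrastructure" means modules that sit directly inside the
# core package (for example _agents.py or _types.py), not provider subpackages.
CORE_PREFIX = "packages/core/agent_framework/"


def is_core_infrastructure(path: str) -> bool:
    """Return True for files directly under the core package directory."""
    if not path.startswith(CORE_PREFIX):
        return False
    remainder = path[len(CORE_PREFIX):]
    return "/" not in remainder


def jobs_to_run(changed_files: list[str], run_all: bool = False) -> set[str]:
    """Pick the integration jobs affected by a set of changed files.

    run_all models scheduled and manual runs, which execute every job.
    """
    if run_all or any(is_core_infrastructure(p) for p in changed_files):
        return set(JOB_PATHS)
    return {
        job
        for job, prefixes in JOB_PATHS.items()
        if any(path.startswith(prefixes) for path in changed_files)
    }
```

Under these assumptions, a change to a file under `packages/core/agent_framework/openai/` selects only the OpenAI job, while a change to `packages/core/agent_framework/_types.py` selects all of them.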

 ## Best Practices

@@ -664,3 +664,31 @@ packages/core/
 - Factory functions with parameters should be regular functions, not fixtures (fixtures can't accept arguments)
 - Import factory functions explicitly: `from conftest import create_test_request`
 - Fixtures should use simple names that describe what they provide: `mapper`, `test_request`, `mock_client`
+
+### Integration Test Markers
+
+New integration tests that call external services must have all three markers:
+
+```python
+@pytest.mark.flaky
+@pytest.mark.integration
+@skip_if_openai_integration_tests_disabled
+async def test_chat_completion() -> None:
+    ...
+```
+
+- `@pytest.mark.flaky` — marks the test as potentially flaky since it depends on external services
+- `@pytest.mark.integration` — enables selecting/excluding integration tests with `-m integration` / `-m "not integration"`
+- `@skip_if_..._integration_tests_disabled` — skips the test when required API keys or service endpoints are missing
+
+For test modules where all tests are integration tests, use `pytestmark`:
+
+```python
+pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
+    pytest.mark.sample("01_single_agent"),
+]
+```
+
+When adding integration tests for a new provider, update the path filters and job assignments in **both** `python-merge-tests.yml` and `python-integration-tests.yml` — these workflows must be kept in sync. See the `python-testing` skill for details.
@@ -121,9 +121,17 @@ client = OpenAIChatClient(env_file_path="openai.env")

 ## Tests

-All the tests are located in the `tests` folder of each package. There are tests that are marked with a `@skip_if_..._integration_tests_disabled` decorator, these are integration tests that require an external service to be running, like OpenAI or Azure OpenAI.
+All the tests are located in the `tests` folder of each package. Tests marked with `@pytest.mark.integration` and `@skip_if_..._integration_tests_disabled` are integration tests that require external services (e.g., OpenAI, Azure OpenAI). They are automatically skipped when the required API keys or service endpoints are not configured in your environment or `.env` file.

-If you want to run these tests, you need to set the environment variable `RUN_INTEGRATION_TESTS` to `true` and have the appropriate key per services set in your environment or in a `.env` file.
+You can select or exclude integration tests using pytest markers:
+
+```bash
+# Run only unit tests (exclude integration tests)
+uv run poe all-tests -m "not integration"
+
+# Run only integration tests
+uv run poe all-tests -m integration
+```

 Alternatively, you can run them using VSCode Tasks. Open the command palette
 (`Ctrl+Shift+P`) and type `Tasks: Run Task`. Select `Test` from the list.
@@ -134,6 +142,8 @@ If you want to run the tests for a single package, you can use the `uv run poe t
 uv run poe --directory packages/core test
 ```

+Large packages (core, ag-ui, orchestrations, anthropic) use `pytest-xdist` for parallel test execution within the package. The `all-tests` task also uses xdist across all packages.
+
 These commands also output the coverage report.

 ## Code quality checks
@@ -70,4 +70,4 @@ include = "../../shared_tasks.toml"

 [tool.poe.tasks]
 mypy = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_ag_ui"
-test = "pytest --cov=agent_framework_ag_ui --cov-report=term-missing:skip-covered tests/ag_ui"
+test = "pytest --cov=agent_framework_ag_ui --cov-report=term-missing:skip-covered -n auto --dist worksteal tests/ag_ui"
@@ -82,7 +82,7 @@ executor.type = "uv"
 include = "../../shared_tasks.toml"
 [tool.poe.tasks]
 mypy = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_anthropic"
-test = "pytest --cov=agent_framework_anthropic --cov-report=term-missing:skip-covered tests"
+test = "pytest --cov=agent_framework_anthropic --cov-report=term-missing:skip-covered -n auto --dist worksteal tests"

 [build-system]
 requires = ["flit-core >= 3.11,<4.0"]
@@ -29,11 +29,8 @@ from agent_framework_anthropic._chat_client import AnthropicSettings
 VALID_PNG_BASE64 = b"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="

 skip_if_anthropic_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("ANTHROPIC_API_KEY", "") in ("", "test-api-key-12345"),
-    reason="No real ANTHROPIC_API_KEY provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("ANTHROPIC_API_KEY", "") in ("", "test-api-key-12345"),
+    reason="No real ANTHROPIC_API_KEY provided; skipping integration tests.",
 )
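
The same credential check recurs in every `skip_if_*_disabled` definition in this change: skip when the variable is unset, empty, or still set to a known test placeholder. A minimal sketch of that condition as a reusable helper is shown below. The helper name `lacks_real_value` and its `env` parameter are hypothetical and are not part of this commit.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def lacks_real_value(
    var: str,
    placeholders: tuple[str, ...] = (),
    env: Mapping[str, str] | None = None,
) -> bool:
    """Return True when `var` is unset, empty, or still a known test placeholder."""
    # Fall back to the process environment unless a mapping is passed explicitly
    # (an explicit mapping makes the check easy to unit test).
    source = os.environ if env is None else env
    return source.get(var, "") in ("", *placeholders)
```

The result could then be passed as the condition to `pytest.mark.skipif`, for example `pytest.mark.skipif(lacks_real_value("ANTHROPIC_API_KEY", ("test-api-key-12345",)), reason="No real ANTHROPIC_API_KEY provided; skipping integration tests.")`, which matches the condition written inline above.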


@@ -915,6 +912,7 @@ def get_weather(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_basic_chat() -> None:
    """Integration test for basic chat completion."""
@@ -932,6 +930,7 @@ async def test_anthropic_client_integration_basic_chat() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_streaming_chat() -> None:
    """Integration test for streaming chat completion."""
@@ -948,6 +947,7 @@ async def test_anthropic_client_integration_streaming_chat() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_function_calling() -> None:
    """Integration test for function calling."""
@@ -968,6 +968,7 @@ async def test_anthropic_client_integration_function_calling() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_hosted_tools() -> None:
    """Integration test for hosted tools."""
@@ -993,6 +994,7 @@ async def test_anthropic_client_integration_hosted_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_with_system_message() -> None:
    """Integration test with system message."""
@@ -1010,6 +1012,7 @@ async def test_anthropic_client_integration_with_system_message() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_temperature_control() -> None:
    """Integration test with temperature control."""
@@ -1027,6 +1030,7 @@ async def test_anthropic_client_integration_temperature_control() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_ordering() -> None:
    """Integration test with ordering."""
@@ -1047,6 +1051,7 @@ async def test_anthropic_client_integration_ordering() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_anthropic_integration_tests_disabled
 async def test_anthropic_client_integration_images() -> None:
    """Integration test with images."""
@@ -85,7 +85,7 @@ test = "pytest --cov=agent_framework_azure_ai --cov-report=term-missing:skip-cov
 [tool.poe.tasks.integration-tests]
 cmd = """
 pytest --import-mode=importlib
-n logical --dist loadfile --dist worksteal
+-n logical --dist worksteal
 tests
 """

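
The commit message notes that when `--dist` is given twice, the second value silently overrides the first. The sketch below reproduces that behaviour with a plain `argparse` option. It assumes `--dist` behaves like an ordinary single-valued option; it is a stand-in for illustration, not pytest-xdist's actual option definition.

```python
import argparse

parser = argparse.ArgumentParser()
# A single-valued option: each occurrence overwrites the previously stored value.
parser.add_argument("--dist", default="no")

# Mirrors the old command line, which passed both modes.
args = parser.parse_args(["--dist", "loadfile", "--dist", "worksteal"])
print(args.dist)  # prints "worksteal"
```

Since only the last value takes effect, dropping `--dist loadfile` leaves the effective configuration unchanged and removes a misleading flag.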
@@ -29,11 +29,8 @@ from agent_framework_azure_ai._shared import (
 )

 skip_if_azure_ai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/"),
-    reason="No real AZURE_AI_PROJECT_ENDPOINT provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/"),
+    reason="No real AZURE_AI_PROJECT_ENDPOINT provided; skipping integration tests.",
 )

 # region Provider Initialization Tests
@@ -779,6 +776,8 @@ def test_from_azure_ai_agent_tools_unknown_dict() -> None:
 # region Integration Tests


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_integration_create_agent() -> None:
    """Integration test: Create an agent using the provider."""
@@ -801,6 +800,8 @@ async def test_integration_create_agent() -> None:
                await provider._agents_client.delete_agent(agent.id)  # type: ignore


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_integration_get_agent() -> None:
    """Integration test: Get an existing agent using the provider."""
@@ -825,6 +826,8 @@ async def test_integration_get_agent() -> None:
            await provider._agents_client.delete_agent(created.id)  # type: ignore


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_integration_create_and_run() -> None:
    """Integration test: Create an agent and run a conversation."""
@@ -50,11 +50,8 @@ from pydantic import BaseModel, Field
 from agent_framework_azure_ai import AzureAIAgentClient, AzureAISettings

 skip_if_azure_ai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/"),
-    reason="No real AZURE_AI_PROJECT_ENDPOINT provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/"),
+    reason="No real AZURE_AI_PROJECT_ENDPOINT provided; skipping integration tests.",
 )


@@ -1379,6 +1376,7 @@ def get_weather(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_get_response() -> None:
    """Test Azure AI Chat Client response."""
@@ -1404,6 +1402,7 @@ async def test_azure_ai_chat_client_get_response() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_get_response_tools() -> None:
    """Test Azure AI Chat Client response with tools."""
@@ -1425,6 +1424,7 @@ async def test_azure_ai_chat_client_get_response_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_streaming() -> None:
    """Test Azure AI Chat Client streaming response."""
@@ -1456,6 +1456,7 @@ async def test_azure_ai_chat_client_streaming() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_streaming_tools() -> None:
    """Test Azure AI Chat Client streaming response with tools."""
@@ -1483,6 +1484,7 @@ async def test_azure_ai_chat_client_streaming_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_basic_run() -> None:
    """Test Agent basic run functionality with AzureAIAgentClient."""
@@ -1500,6 +1502,7 @@ async def test_azure_ai_chat_client_agent_basic_run() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_basic_run_streaming() -> None:
    """Test Agent basic streaming functionality with AzureAIAgentClient."""
@@ -1520,6 +1523,7 @@ async def test_azure_ai_chat_client_agent_basic_run_streaming() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_thread_persistence() -> None:
    """Test Agent session persistence across runs with AzureAIAgentClient."""
@@ -1546,6 +1550,7 @@ async def test_azure_ai_chat_client_agent_thread_persistence() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_existing_thread_id() -> None:
    """Test Agent existing thread ID functionality with AzureAIAgentClient."""
@@ -1584,6 +1589,7 @@ async def test_azure_ai_chat_client_agent_existing_thread_id() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_code_interpreter():
    """Test Agent with code interpreter through AzureAIAgentClient."""
@@ -1604,6 +1610,7 @@ async def test_azure_ai_chat_client_agent_code_interpreter():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_file_search():
    """Test Agent with file search through AzureAIAgentClient."""
@@ -1651,6 +1658,7 @@ async def test_azure_ai_chat_client_agent_file_search():
            await client.close()


+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_hosted_mcp_tool() -> None:
    """Integration test for MCP tool with Azure AI Agent using Microsoft Learn MCP."""
@@ -1686,6 +1694,7 @@ async def test_azure_ai_chat_client_agent_hosted_mcp_tool() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_level_tool_persistence():
    """Test that agent-level tools persist across multiple runs with AzureAIAgentClient."""
@@ -1711,6 +1720,7 @@ async def test_azure_ai_chat_client_agent_level_tool_persistence():
        assert any(term in second_response.text.lower() for term in ["miami", "sunny", "25"])


+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_chat_options_run_level() -> None:
    """Test ChatOptions parameter coverage at run level."""
@@ -1735,6 +1745,7 @@ async def test_azure_ai_chat_client_agent_chat_options_run_level() -> None:
        assert len(response.text) > 0


+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_azure_ai_chat_client_agent_chat_options_agent_level() -> None:
    """Test ChatOptions parameter coverage agent level."""
@@ -47,14 +47,9 @@ from agent_framework_azure_ai import AzureAIClient, AzureAISettings
 from agent_framework_azure_ai._shared import from_azure_ai_tools

 skip_if_azure_ai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/")
+    os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/")
    or os.getenv("AZURE_AI_MODEL_DEPLOYMENT_NAME", "") == "",
-    reason=(
-        "No real AZURE_AI_PROJECT_ENDPOINT or AZURE_AI_MODEL_DEPLOYMENT_NAME provided; skipping integration tests."
-        if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-        else "Integration tests are disabled."
-    ),
+    reason="No real AZURE_AI_PROJECT_ENDPOINT or AZURE_AI_MODEL_DEPLOYMENT_NAME provided; skipping integration tests.",
 )


@@ -1329,6 +1324,7 @@ async def client() -> AsyncGenerator[AzureAIClient, None]:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
@pytest.mark.parametrize(
    "option_name,option_value,needs_validation",
@@ -1443,6 +1439,7 @@ async def test_integration_options(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
@pytest.mark.parametrize(
    "option_name,option_value,needs_validation",
@@ -1541,6 +1538,7 @@ async def test_integration_agent_options(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_integration_web_search() -> None:
    async with temporary_chat_client(agent_name="af-int-test-web-search") as client:
@@ -1586,6 +1584,7 @@ async def test_integration_web_search() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_integration_agent_hosted_mcp_tool() -> None:
    """Integration test for MCP tool with Azure Response Agent using Microsoft Learn MCP."""
@@ -1610,6 +1609,7 @@ async def test_integration_agent_hosted_mcp_tool() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_integration_agent_hosted_code_interpreter_tool():
    """Test Azure Responses Client agent with code interpreter tool through AzureAIClient."""
@@ -1628,6 +1628,7 @@ async def test_integration_agent_hosted_code_interpreter_tool():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_integration_agent_existing_session():
    """Test Azure Responses Client agent with existing session to continue conversations across agent instances."""
@@ -20,14 +20,9 @@ from azure.identity.aio import AzureCliCredential
 from agent_framework_azure_ai import AzureAIProjectAgentProvider

 skip_if_azure_ai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/")
+    os.getenv("AZURE_AI_PROJECT_ENDPOINT", "") in ("", "https://test-project.cognitiveservices.azure.com/")
    or os.getenv("AZURE_AI_MODEL_DEPLOYMENT_NAME", "") == "",
-    reason=(
-        "No real AZURE_AI_PROJECT_ENDPOINT or AZURE_AI_MODEL_DEPLOYMENT_NAME provided; skipping integration tests."
-        if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-        else "Integration tests are disabled."
-    ),
+    reason="No real AZURE_AI_PROJECT_ENDPOINT or AZURE_AI_MODEL_DEPLOYMENT_NAME provided; skipping integration tests.",
 )


@@ -698,6 +693,7 @@ async def test_provider_create_agent_with_mcp_and_regular_tools(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_ai_integration_tests_disabled
 async def test_provider_create_and_get_agent_integration() -> None:
    """Integration test for provider create_agent and get_agent."""
@@ -2,7 +2,6 @@
 AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
 AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=your-deployment-name
 FUNCTIONS_WORKER_RUNTIME=python
-RUN_INTEGRATION_TESTS=true

 # Azure Functions Configuration
 AzureWebJobsStorage=UseDevelopmentStorage=true
@@ -90,13 +90,6 @@ def _should_skip_azure_functions_integration_tests() -> tuple[bool, str]:
    """Determine whether Azure Functions integration tests should be skipped."""
    _load_env_file_if_present()

-    run_integration_tests = os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    if not run_integration_tests:
-        return (
-            True,
-            "Integration tests are disabled. Set RUN_INTEGRATION_TESTS=true to enable Azure Functions sample tests.",
-        )
-
    # Check for Azure Functions Core Tools
    if not _check_func_cli_available():
        return (
@@ -19,6 +19,8 @@ from agent_framework_durabletask import THREAD_ID_HEADER

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("01_single_agent"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -18,6 +18,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("02_multi_agent"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -22,6 +22,8 @@ import requests

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("03_reliable_streaming"),
    pytest.mark.usefixtures("function_app_for_test"),
    pytest.mark.skip(reason="Temp disabled to fix test instability - needs investigation into root cause"),
@@ -22,6 +22,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("04_single_agent_orchestration_chaining"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -22,6 +22,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.orchestration,
    pytest.mark.sample("05_multi_agent_orchestration_concurrency"),
    pytest.mark.usefixtures("function_app_for_test"),
@@ -22,6 +22,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.orchestration,
    pytest.mark.sample("06_multi_agent_orchestration_conditionals"),
    pytest.mark.usefixtures("function_app_for_test"),
@@ -24,6 +24,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("07_single_agent_orchestration_hitl"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -23,6 +23,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("09_workflow_shared_state"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -23,6 +23,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("10_workflow_no_shared_state"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -25,6 +25,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("11_workflow_parallel"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -25,6 +25,8 @@ import pytest

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("12_workflow_hitl"),
    pytest.mark.usefixtures("function_app_for_test"),
 ]
@@ -28,7 +28,7 @@ from dotenv import load_dotenv
 from opentelemetry import metrics, trace
 from opentelemetry.sdk.resources import Resource
 from opentelemetry.semconv.attributes import service_attributes
-from opentelemetry.semconv_ai import Meters, SpanAttributes
+from opentelemetry.semconv_ai import Meters

 from . import __version__ as version_info
 from ._settings import load_settings
@@ -1826,9 +1826,7 @@ def _capture_response(
    span.set_attributes(attributes)
    attrs: dict[str, Any] = {k: v for k, v in attributes.items() if k in GEN_AI_METRIC_ATTRIBUTES}
    if token_usage_histogram and (input_tokens := attributes.get(OtelAttr.INPUT_TOKENS)):
-        token_usage_histogram.record(
-            input_tokens, attributes={**attrs, OtelAttr.T_TYPE: OtelAttr.T_TYPE_INPUT}
-        )
+        token_usage_histogram.record(input_tokens, attributes={**attrs, OtelAttr.T_TYPE: OtelAttr.T_TYPE_INPUT})
    if token_usage_histogram and (output_tokens := attributes.get(OtelAttr.OUTPUT_TOKENS)):
        token_usage_histogram.record(output_tokens, {**attrs, OtelAttr.T_TYPE: OtelAttr.T_TYPE_OUTPUT})
    if operation_duration_histogram and duration is not None:
@@ -411,9 +411,7 @@ class RawOpenAIChatClient(  # type: ignore[misc]
        # See https://github.com/microsoft/agent-framework/issues/3434
        if chunk.usage:
            contents.append(
-                Content.from_usage(
-                    usage_details=self._parse_usage_from_openai(chunk.usage), raw_representation=chunk
-                )
+                Content.from_usage(usage_details=self._parse_usage_from_openai(chunk.usage), raw_representation=chunk)
            )

        for choice in chunk.choices:
@@ -591,7 +589,9 @@ class RawOpenAIChatClient(  # type: ignore[misc]
        # See https://github.com/microsoft/agent-framework/issues/4084
        for msg in all_messages:
            msg_content: Any = msg.get("content")
-            if isinstance(msg_content, list) and all(isinstance(c, dict) and c.get("type") == "text" for c in msg_content):
+            if isinstance(msg_content, list) and all(
+                isinstance(c, dict) and c.get("type") == "text" for c in msg_content
+            ):
                msg["content"] = "\n".join(c.get("text", "") for c in msg_content)

        return all_messages
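The reformatted condition above collapses list-style message content into a single string only when every part is a text part. A minimal standalone sketch of that behaviour, assuming plain dict messages; the function name `flatten_text_only_content` is hypothetical and not part of the client:

```python
from typing import Any


def flatten_text_only_content(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Join list-style content into one newline-separated string when all parts are text."""
    for msg in messages:
        content = msg.get("content")
        # Only flatten when content is a list made up exclusively of {"type": "text"} dicts.
        if isinstance(content, list) and all(
            isinstance(part, dict) and part.get("type") == "text" for part in content
        ):
            msg["content"] = "\n".join(part.get("text", "") for part in content)
    return messages
```

Messages that mix text with other part types (for example an image part) keep their list content unchanged, since the `all(...)` check fails for them.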
@@ -125,7 +125,7 @@ executor.type = "uv"
 include = "../../shared_tasks.toml"
 [tool.poe.tasks]
 mypy = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework"
-test = "pytest --cov=agent_framework --cov-report=term-missing:skip-covered tests"
+test = "pytest --cov=agent_framework --cov-report=term-missing:skip-covered -n auto --dist worksteal tests"

 [tool.flit.module]
 name = "agent_framework"
@@ -23,11 +23,8 @@ from agent_framework._settings import SecretString
 from agent_framework.azure import AzureOpenAIAssistantsClient

 skip_if_azure_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("AZURE_OPENAI_ENDPOINT", "") in ("", "https://test-endpoint.com"),
-    reason="No real AZURE_OPENAI_ENDPOINT provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("AZURE_OPENAI_ENDPOINT", "") in ("", "https://test-endpoint.com"),
+    reason="No real AZURE_OPENAI_ENDPOINT provided; skipping integration tests.",
 )
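With `RUN_INTEGRATION_TESTS` removed, the skip decision above depends only on whether a real endpoint is configured. A minimal sketch of that predicate as a plain function, so it can be checked without pytest; the names `PLACEHOLDER_ENDPOINTS` and `endpoint_is_placeholder` are hypothetical and do not exist in the test suite:

```python
from collections.abc import Mapping

# Values treated as "no real endpoint", mirroring the tuple in the skipif condition.
PLACEHOLDER_ENDPOINTS = ("", "https://test-endpoint.com")


def endpoint_is_placeholder(env: Mapping[str, str]) -> bool:
    """Return True when AZURE_OPENAI_ENDPOINT is unset, empty, or the unit-test dummy value."""
    return env.get("AZURE_OPENAI_ENDPOINT", "") in PLACEHOLDER_ENDPOINTS
```

Passing `os.environ` as `env` would reproduce the condition used by `skip_if_azure_integration_tests_disabled`; selecting which tests run at all is left to `-m integration` filtering.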


@@ -260,6 +257,7 @@ def get_weather(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_client_get_response() -> None:
    """Test Azure Assistants Client response."""
@@ -285,6 +283,7 @@ async def test_azure_assistants_client_get_response() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_client_get_response_tools() -> None:
    """Test Azure Assistants Client response with tools."""
@@ -306,6 +305,7 @@ async def test_azure_assistants_client_get_response_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_client_streaming() -> None:
    """Test Azure Assistants Client streaming response."""
@@ -337,6 +337,7 @@ async def test_azure_assistants_client_streaming() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_client_streaming_tools() -> None:
    """Test Azure Assistants Client streaming response with tools."""
@@ -364,6 +365,7 @@ async def test_azure_assistants_client_streaming_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_client_with_existing_assistant() -> None:
    """Test Azure Assistants Client with existing assistant ID."""
@@ -392,6 +394,7 @@ async def test_azure_assistants_client_with_existing_assistant() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_agent_basic_run():
    """Test Agent basic run functionality with AzureOpenAIAssistantsClient."""
@@ -409,6 +412,7 @@ async def test_azure_assistants_agent_basic_run():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_agent_basic_run_streaming():
    """Test Agent basic streaming functionality with AzureOpenAIAssistantsClient."""
@@ -429,6 +433,7 @@ async def test_azure_assistants_agent_basic_run_streaming():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_agent_session_persistence():
    """Test Agent session persistence across runs with AzureOpenAIAssistantsClient."""
@@ -458,6 +463,7 @@ async def test_azure_assistants_agent_session_persistence():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_agent_existing_session_id():
    """Test Agent with existing session ID to continue conversations across agent instances."""
@@ -503,6 +509,7 @@ async def test_azure_assistants_agent_existing_session_id():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_agent_code_interpreter():
    """Test Agent with code interpreter through AzureOpenAIAssistantsClient."""
@@ -523,6 +530,7 @@ async def test_azure_assistants_agent_code_interpreter():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_assistants_client_agent_level_tool_persistence():
    """Test that agent-level tools persist across multiple runs with Azure Assistants Client."""
@@ -37,11 +37,8 @@ from agent_framework.openai import (
 # region Service Setup

 skip_if_azure_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("AZURE_OPENAI_ENDPOINT", "") in ("", "https://test-endpoint.com"),
-    reason="No real AZURE_OPENAI_ENDPOINT provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("AZURE_OPENAI_ENDPOINT", "") in ("", "https://test-endpoint.com"),
+    reason="No real AZURE_OPENAI_ENDPOINT provided; skipping integration tests.",
 )


@@ -647,6 +644,7 @@ def get_weather(location: str) -> str:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_response() -> None:
    """Test Azure OpenAI chat completion responses."""
@@ -677,6 +675,7 @@ async def test_azure_openai_chat_client_response() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_response_tools() -> None:
    """Test AzureOpenAI chat completion responses."""
@@ -698,6 +697,7 @@ async def test_azure_openai_chat_client_response_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_streaming() -> None:
    """Test Azure OpenAI chat completion responses."""
@@ -733,6 +733,7 @@ async def test_azure_openai_chat_client_streaming() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_streaming_tools() -> None:
    """Test AzureOpenAI chat completion responses."""
@@ -760,6 +761,7 @@ async def test_azure_openai_chat_client_streaming_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_agent_basic_run():
    """Test Azure OpenAI chat client agent basic run functionality with AzureOpenAIChatClient."""
@@ -776,6 +778,7 @@ async def test_azure_openai_chat_client_agent_basic_run():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_agent_basic_run_streaming():
    """Test Azure OpenAI chat client agent basic streaming functionality with AzureOpenAIChatClient."""
@@ -794,6 +797,7 @@ async def test_azure_openai_chat_client_agent_basic_run_streaming():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_agent_session_persistence():
    """Test Azure OpenAI chat client agent session persistence across runs with AzureOpenAIChatClient."""
@@ -819,6 +823,7 @@ async def test_azure_openai_chat_client_agent_session_persistence():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_openai_chat_client_agent_existing_session():
    """Test Azure OpenAI chat client agent with existing session to continue conversations across agent instances."""
@@ -854,6 +859,7 @@ async def test_azure_openai_chat_client_agent_existing_session():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_azure_chat_client_agent_level_tool_persistence():
    """Test that agent-level tools persist across multiple runs with Azure Chat Client."""
@@ -23,11 +23,8 @@ from agent_framework import (
 from agent_framework.azure import AzureOpenAIResponsesClient

 skip_if_azure_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("AZURE_OPENAI_ENDPOINT", "") in ("", "https://test-endpoint.com"),
-    reason="No real AZURE_OPENAI_ENDPOINT provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("AZURE_OPENAI_ENDPOINT", "") in ("", "https://test-endpoint.com"),
+    reason="No real AZURE_OPENAI_ENDPOINT provided; skipping integration tests.",
 )

 logger = logging.getLogger(__name__)
@@ -254,6 +251,7 @@ def test_serialize(azure_openai_unit_test_env: dict[str, str]) -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
@pytest.mark.parametrize(
    "option_name,option_value,needs_validation",
@@ -392,6 +390,7 @@ async def test_integration_options(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_integration_web_search() -> None:
    client = AzureOpenAIResponsesClient(credential=AzureCliCredential())
@@ -440,6 +439,7 @@ async def test_integration_web_search() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_integration_client_file_search() -> None:
    """Test Azure responses client with file search tool."""
@@ -469,6 +469,7 @@ async def test_integration_client_file_search() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_integration_client_file_search_streaming() -> None:
    """Test Azure responses client with file search tool and streaming."""
@@ -500,6 +501,7 @@ async def test_integration_client_file_search_streaming() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_integration_client_agent_hosted_mcp_tool() -> None:
    """Integration test for MCP tool with Azure Response Agent using Microsoft Learn MCP."""
@@ -524,6 +526,7 @@ async def test_integration_client_agent_hosted_mcp_tool() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_integration_client_agent_hosted_code_interpreter_tool():
    """Test Azure Responses Client agent with code interpreter tool."""
@@ -543,6 +546,7 @@ async def test_integration_client_agent_hosted_code_interpreter_tool():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_integration_client_agent_existing_session():
    """Test Azure Responses Client agent with existing session to continue conversations across agent instances."""
@@ -34,12 +34,8 @@ from agent_framework.exceptions import ToolException, ToolExecutionException

 # Integration test skip condition
 skip_if_mcp_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true" or os.getenv("LOCAL_MCP_URL", "") == "",
-    reason=(
-        "No LOCAL_MCP_URL provided; skipping integration tests."
-        if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-        else "Integration tests are disabled."
-    ),
+    os.getenv("LOCAL_MCP_URL", "") == "",
+    reason="No LOCAL_MCP_URL provided; skipping integration tests.",
 )


@@ -1105,6 +1101,7 @@ def test_local_mcp_streamable_http_tool_init():

 # Integration test
@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_mcp_integration_tests_disabled
 async def test_streamable_http_integration():
    """Test MCP StreamableHTTP integration."""
@@ -1133,6 +1130,7 @@ async def test_streamable_http_integration():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_mcp_integration_tests_disabled
 async def test_mcp_connection_reset_integration():
    """Test that connection reset works correctly with a real MCP server.
@@ -755,14 +755,13 @@ class TestToolMerging:
 # region Integration Tests

 skip_if_openai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
-    reason="No real OPENAI_API_KEY provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
+    reason="No real OPENAI_API_KEY provided; skipping integration tests.",
 )


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 class TestOpenAIAssistantProviderIntegration:
    """Integration tests requiring real OpenAI API."""
@@ -25,11 +25,8 @@ from agent_framework import (
 from agent_framework.openai import OpenAIAssistantsClient

 skip_if_openai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
-    reason="No real OPENAI_API_KEY provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
+    reason="No real OPENAI_API_KEY provided; skipping integration tests.",
 )

 INTEGRATION_TEST_MODEL = "gpt-4.1-nano"
@@ -1075,6 +1072,7 @@ def get_weather(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_get_response() -> None:
    """Test OpenAI Assistants Client response."""
@@ -1100,6 +1098,7 @@ async def test_get_response() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_get_response_tools() -> None:
    """Test OpenAI Assistants Client response with tools."""
@@ -1121,6 +1120,7 @@ async def test_get_response_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_streaming() -> None:
    """Test OpenAI Assistants Client streaming response."""
@@ -1152,6 +1152,7 @@ async def test_streaming() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_streaming_tools() -> None:
    """Test OpenAI Assistants Client streaming response with tools."""
@@ -1182,6 +1183,7 @@ async def test_streaming_tools() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_with_existing_assistant() -> None:
    """Test OpenAI Assistants Client with existing assistant ID."""
@@ -1210,6 +1212,7 @@ async def test_with_existing_assistant() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
@pytest.mark.skip(reason="OpenAI file search functionality is currently broken - tracked in GitHub issue")
 async def test_file_search() -> None:
@@ -1236,6 +1239,7 @@ async def test_file_search() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
@pytest.mark.skip(reason="OpenAI file search functionality is currently broken - tracked in GitHub issue")
 async def test_file_search_streaming() -> None:
@@ -1270,6 +1274,7 @@ async def test_file_search_streaming() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_openai_assistants_agent_basic_run():
    """Test Agent basic run functionality with OpenAIAssistantsClient."""
@@ -1287,6 +1292,7 @@ async def test_openai_assistants_agent_basic_run():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_openai_assistants_agent_basic_run_streaming():
    """Test Agent basic streaming functionality with OpenAIAssistantsClient."""
@@ -1307,6 +1313,7 @@ async def test_openai_assistants_agent_basic_run_streaming():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_openai_assistants_agent_session_persistence():
    """Test Agent session persistence across runs with OpenAIAssistantsClient."""
@@ -1336,6 +1343,7 @@ async def test_openai_assistants_agent_session_persistence():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_openai_assistants_agent_existing_session_id():
    """Test Agent with existing session ID to continue conversations across agent instances."""
@@ -1381,6 +1389,7 @@ async def test_openai_assistants_agent_existing_session_id():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_openai_assistants_agent_code_interpreter():
    """Test Agent with code interpreter through OpenAIAssistantsClient."""
@@ -1401,6 +1410,7 @@ async def test_openai_assistants_agent_code_interpreter():


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_agent_level_tool_persistence():
    """Test that agent-level tools persist across multiple runs with OpenAI Assistants Client."""
@@ -24,11 +24,8 @@ from agent_framework.openai import OpenAIChatClient
 from agent_framework.openai._exceptions import OpenAIContentFilterException

 skip_if_openai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
-    reason="No real OPENAI_API_KEY provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
+    reason="No real OPENAI_API_KEY provided; skipping integration tests.",
 )


@@ -1087,6 +1084,7 @@ class OutputStruct(BaseModel):


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
@pytest.mark.parametrize(
    "option_name,option_value,needs_validation",
@@ -1225,6 +1223,7 @@ async def test_integration_options(


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_integration_web_search() -> None:
    client = OpenAIChatClient(model_id="gpt-4o-search-preview")
@@ -259,20 +259,14 @@ def test_azure_otel_provider_name() -> None:
 # --- Integration tests ---

 skip_if_openai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
-    reason="No real OPENAI_API_KEY provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
+    reason="No real OPENAI_API_KEY provided; skipping integration tests.",
 )

 skip_if_azure_openai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or not os.getenv("AZURE_OPENAI_ENDPOINT")
+    not os.getenv("AZURE_OPENAI_ENDPOINT")
    or (not os.getenv("AZURE_OPENAI_API_KEY") and not os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")),
-    reason="No Azure OpenAI credentials provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    reason="No Azure OpenAI credentials provided; skipping integration tests.",
 )
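The Azure embedding condition above is compound: an endpoint is always required, and in addition either an API key or an embedding deployment name must be present. A small sketch that restates the boolean as a function over a mapping, which may make the cases easier to reason about; `azure_embedding_tests_skipped` is a hypothetical name used only for illustration:

```python
from collections.abc import Mapping


def azure_embedding_tests_skipped(env: Mapping[str, str]) -> bool:
    """Mirror the skipif condition: skip without an endpoint, or without both key and deployment name."""
    return not env.get("AZURE_OPENAI_ENDPOINT") or (
        not env.get("AZURE_OPENAI_API_KEY")
        and not env.get("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")
    )
```

Under this reading, an endpoint plus a deployment name alone is enough to run the tests, which appears to cover setups that authenticate without an API key.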


@@ -40,11 +40,8 @@ from agent_framework.openai import OpenAIResponsesClient
 from agent_framework.openai._exceptions import OpenAIContentFilterException

 skip_if_openai_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
-    reason="No real OPENAI_API_KEY provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"),
+    reason="No real OPENAI_API_KEY provided; skipping integration tests.",
 )


@@ -2362,6 +2359,7 @@ def test_with_callable_api_key() -> None:


@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
@pytest.mark.parametrize(
    "option_name,option_value,needs_validation",
@@ -2500,6 +2498,7 @@ async def test_integration_options(

@pytest.mark.timeout(300)
@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_integration_web_search() -> None:
    client = OpenAIResponsesClient(model_id="gpt-5")
@@ -2553,6 +2552,7 @@ async def test_integration_web_search() -> None:
    "race condition. See https://github.com/microsoft/agent-framework/issues/1669"
 )
@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_integration_file_search() -> None:
    openai_responses_client = OpenAIResponsesClient()
@@ -2586,6 +2586,7 @@ async def test_integration_file_search() -> None:
    "potential race condition. See https://github.com/microsoft/agent-framework/issues/1669"
 )
@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_openai_integration_tests_disabled
 async def test_integration_streaming_file_search() -> None:
    openai_responses_client = OpenAIResponsesClient()
@@ -11,7 +11,3 @@ TASKHUB=default
 # Redis Configuration (for streaming tests)
 REDIS_CONNECTION_STRING=redis://localhost:6379
 REDIS_STREAM_TTL_MINUTES=10
-
-# Integration Test Control
-# Set to 'true' to enable integration tests
-RUN_INTEGRATION_TESTS=true
@@ -16,7 +16,6 @@ Required variables:
 - `AZURE_OPENAI_ENDPOINT`
 - `AZURE_OPENAI_CHAT_DEPLOYMENT_NAME`
 - `AZURE_OPENAI_API_KEY` (optional if using Azure CLI authentication)
-- `RUN_INTEGRATION_TESTS` (set to `true`)
 - `ENDPOINT` (default: http://localhost:8080)
 - `TASKHUB` (default: default)

@@ -75,7 +74,7 @@ pytestmark = [
 ## Troubleshooting

 **Tests are skipped:**
-Ensure `RUN_INTEGRATION_TESTS=true` is set in your `.env` file.
+Ensure the required environment variables (e.g., `AZURE_OPENAI_ENDPOINT`) are set in your `.env` file.

 **DTS connection failed:**
 Check that the DTS emulator container is running: `docker ps | grep dts-emulator`
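
When tests are skipped unexpectedly, it can help to list which variables are actually missing. A minimal diagnostic sketch, assuming the two Azure OpenAI variables checked in `conftest.py` are the ones that matter; the helper `missing_env_vars` is hypothetical and not shipped with the tests:

```python
from collections.abc import Iterable, Mapping

# Variables the conftest availability check looks for (taken from this change set).
REQUIRED_AZURE_OPENAI_VARS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME")


def missing_env_vars(env: Mapping[str, str], required: Iterable[str]) -> list[str]:
    """Return the names in `required` that are unset or empty in `env`, in the given order."""
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars(os.environ, REQUIRED_AZURE_OPENAI_VARS)` after loading the `.env` file would return an empty list when both variables are set.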
@@ -289,9 +289,6 @@ def pytest_configure(config: pytest.Config) -> None:

 def pytest_collection_modifyitems(config: pytest.Config, items: list[pytest.Item]) -> None:
    """Skip tests based on markers and environment availability."""
-    run_integration = os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    skip_integration = pytest.mark.skip(reason="RUN_INTEGRATION_TESTS not set to 'true'")
-
    # Check Azure OpenAI environment variables
    azure_openai_vars = ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"]
    azure_openai_available = all(os.getenv(var) for var in azure_openai_vars)
@@ -308,8 +305,6 @@ def pytest_collection_modifyitems(config: pytest.Config, items: list[pytest.Item
    skip_redis = pytest.mark.skip(reason="Redis is not available at redis://localhost:6379")

    for item in items:
-        if "integration_test" in item.keywords and not run_integration:
-            item.add_marker(skip_integration)
        if "requires_azure_openai" in item.keywords and not azure_openai_available:
            item.add_marker(skip_azure_openai)
        if "requires_dts" in item.keywords and not dts_available:
@@ -14,6 +14,8 @@ import pytest

 # Module-level markers - applied to all tests in this module
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("01_single_agent"),
    pytest.mark.integration_test,
    pytest.mark.requires_azure_openai,
@@ -18,6 +18,8 @@ MATH_AGENT_NAME: str = "MathAgent"

 # Module-level markers - applied to all tests in this module
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("02_multi_agent"),
    pytest.mark.integration_test,
    pytest.mark.requires_azure_openai,
@@ -34,6 +34,8 @@ from redis_stream_response_handler import RedisStreamResponseHandler  # type: ig

 # Module-level markers - applied to all tests in this file
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("03_single_agent_streaming"),
    pytest.mark.integration_test,
    pytest.mark.requires_azure_openai,
@@ -23,6 +23,8 @@ logging.basicConfig(level=logging.WARNING)

 # Module-level markers - applied to all tests in this module
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("04_single_agent_orchestration_chaining"),
    pytest.mark.integration_test,
    pytest.mark.requires_azure_openai,
@@ -24,6 +24,8 @@ logging.basicConfig(level=logging.WARNING)

 # Module-level markers
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("05_multi_agent_orchestration_concurrency"),
    pytest.mark.integration_test,
    pytest.mark.requires_dts,
@@ -24,6 +24,8 @@ logging.basicConfig(level=logging.WARNING)

 # Module-level markers
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("06_multi_agent_orchestration_conditionals"),
    pytest.mark.integration_test,
    pytest.mark.requires_dts,
@@ -24,6 +24,8 @@ logging.basicConfig(level=logging.WARNING)

 # Module-level markers
 pytestmark = [
+    pytest.mark.flaky,
+    pytest.mark.integration,
    pytest.mark.sample("07_single_agent_orchestration_hitl"),
    pytest.mark.integration_test,
    pytest.mark.requires_dts,
@@ -26,11 +26,8 @@ from agent_framework_ollama import OllamaChatClient
 # region Service Setup

 skip_if_azure_integration_tests_disabled = pytest.mark.skipif(
-    os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true"
-    or os.getenv("OLLAMA_MODEL_ID", "") in ("", "test-model"),
-    reason="No real Ollama chat model provided; skipping integration tests."
-    if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true"
-    else "Integration tests are disabled.",
+    os.getenv("OLLAMA_MODEL_ID", "") in ("", "test-model"),
+    reason="No real Ollama chat model provided; skipping integration tests.",
 )


@@ -470,6 +467,8 @@ async def test_cmc_with_invalid_content_type(
        await ollama_client.get_response(messages=chat_history)


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_cmc_integration_with_tool_call(
    chat_history: list[Message],
@@ -485,6 +484,8 @@ async def test_cmc_integration_with_tool_call(
    assert tool_result.result == "Hello World"


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_cmc_integration_with_chat_completion(
    chat_history: list[Message],
@@ -497,6 +498,8 @@ async def test_cmc_integration_with_chat_completion(
    assert "hello" in result.text.lower()


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_cmc_streaming_integration_with_tool_call(
    chat_history: list[Message],
@@ -522,6 +525,8 @@ async def test_cmc_streaming_integration_with_tool_call(
                assert tool_call.name == "hello_world"


+@pytest.mark.flaky
+@pytest.mark.integration
@skip_if_azure_integration_tests_disabled
 async def test_cmc_streaming_integration_with_chat_completion(
    chat_history: list[Message],
@@ -80,7 +80,7 @@ executor.type = "uv"
 include = "../../shared_tasks.toml"
 [tool.poe.tasks]
 mypy = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_orchestrations"
-test = "pytest --cov=agent_framework_orchestrations --cov-report=term-missing:skip-covered tests"
+test = "pytest --cov=agent_framework_orchestrations --cov-report=term-missing:skip-covered -n auto --dist worksteal tests"

 [build-system]
 requires = ["flit-core >= 3.11,<4.0"]
@@ -171,6 +171,7 @@ markers = [
    "azure: marks tests as Azure provider specific",
    "azure-ai: marks tests as Azure AI provider specific",
    "openai: marks tests as OpenAI provider specific",
+    "integration: marks tests as integration tests that require external services",
 ]

 [tool.coverage.run]
@@ -229,7 +230,7 @@ build-meta = "python -m flit build"
 build = ["build-packages", "build-meta"]
 publish = "uv publish"
 # combined checks
-check-packages = "python scripts/run_tasks_in_packages_if_exists.py fmt lint pyright mypy"
+check-packages = "python scripts/run_tasks_in_packages_if_exists.py fmt lint pyright"
 check = ["check-packages", "samples-lint", "samples-syntax", "test", "markdown-code-lint"]

 [tool.poe.tasks.all-tests-cov]
@@ -255,7 +256,7 @@ pytest --import-mode=importlib
 --ignore-glob=packages/lab/**
 --ignore-glob=packages/devui/**
 -rs
--n logical --dist loadfile --dist worksteal
+-n logical --dist worksteal
    packages/**/tests
 """

@@ -265,7 +266,7 @@ pytest --import-mode=importlib
 --ignore-glob=packages/lab/**
 --ignore-glob=packages/devui/**
 -rs
--n logical --dist loadfile --dist worksteal
+-n logical --dist worksteal
    packages/**/tests
 """
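
The duplicate `--dist` flag removed above was harmless but misleading: with a standard argparse "store" option, a repeated flag keeps only the last value. The sketch below demonstrates that last-wins behaviour with plain `argparse`; it assumes pytest-xdist registers `--dist` as an ordinary store option, and the `choices` list here is an illustrative subset rather than the plugin's actual definition:

```python
import argparse

# Stand-in parser; not pytest's real option table.
parser = argparse.ArgumentParser()
parser.add_argument("--dist", choices=["no", "load", "loadfile", "worksteal"], default="no")

# The old configuration passed both modes; the later value replaces the earlier one.
args = parser.parse_args(["--dist", "loadfile", "--dist", "worksteal"])
effective_mode = args.dist
```

Here `effective_mode` ends up as `"worksteal"`, which is why dropping `--dist loadfile` should not change scheduling.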