agent-framework

Python: Flaky test report (#5342)

* Add flaky test trend reporting to CI workflows

Parse JUnit XML (pytest.xml) from each integration test job and
aggregate results into a markdown trend report showing per-test
pass/fail/skip status across the last 5 runs.

Changes:
- Add python/scripts/flaky_report/ package (JUnit XML parser + trend
  report generator following the sample_validation pattern)
- Add upload-artifact steps to all 6 integration test jobs in both
  python-merge-tests.yml and python-integration-tests.yml
- Add python-flaky-test-report aggregation job with history caching
- Add --junitxml=pytest.xml to integration-tests.yml jobs (already
  present in merge-tests.yml)
- Fix Cosmos job --junitxml path (use absolute path since uv run
  --directory changes cwd)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
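
The aggregation described in this commit can be sketched as follows. This is a minimal illustration, assuming each run has already been reduced to a mapping of test name to status. The function name `render_trend`, the status strings, and the symbols are assumptions made for the example, not the interface of the actual `flaky_report` package.

```python
from collections.abc import Mapping, Sequence

# Hypothetical status-to-symbol mapping; the real report may use other symbols.
SYMBOLS = {"passed": "✅", "failed": "❌", "skipped": "⏭️"}
MAX_RUNS = 5  # the commit message says the report covers the last 5 runs


def render_trend(history: Sequence[Mapping[str, str]]) -> str:
    """Render a markdown table with one row per test and one column per run."""
    runs = list(history)[-MAX_RUNS:]  # keep only the most recent runs
    tests = sorted({name for run in runs for name in run})
    header = "| Test | " + " | ".join(f"Run {i + 1}" for i in range(len(runs))) + " |"
    divider = "|" + " --- |" * (len(runs) + 1)
    rows = [
        "| " + name + " | "
        + " | ".join(SYMBOLS.get(run.get(name, ""), "n/a") for run in runs)
        + " |"
        for name in tests
    ]
    return "\n".join([header, divider, *rows])
```

A test that did not exist in an older run shows up as `n/a` in that column, so the table stays rectangular.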

* Fix flaky report: handle missing test results gracefully

- Guard against missing reports directory in load_current_run()
- Only run report job when at least one integration test job completed
  (skip when all jobs are skipped, e.g. on pull_request events)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: fix provider names and if-expression precedence

- Use explicit provider name mapping in _derive_provider() so OpenAI
  renders correctly instead of 'Openai'
- Fix operator precedence in workflow if-expressions by wrapping
  success/failure checks in parentheses

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
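
The provider-name fix can be illustrated with a small sketch. The mapping keys and the helper name `display_provider` are hypothetical; the actual `_derive_provider()` may cover different providers and derive the key differently.

```python
# Hypothetical mapping; the real _derive_provider() may use different keys.
PROVIDER_NAMES = {
    "openai": "OpenAI",
    "azure_openai": "Azure OpenAI",
    "foundry": "Foundry",
}


def display_provider(key: str) -> str:
    """Return a display name, falling back to title-casing for unknown keys."""
    normalized = key.strip().lower()
    if normalized in PROVIDER_NAMES:
        return PROVIDER_NAMES[normalized]
    # Naive fallback: fine for simple names, wrong for names such as "openai".
    return normalized.replace("_", " ").title()
```

The explicit mapping matters because naive capitalization cannot know where the word boundaries fall: `"openai".capitalize()` yields `'Openai'`.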

* Add File column and xfail detection to flaky test report

- Add File column showing module name (e.g., test_openai_chat_client)
  to disambiguate tests with the same function name across files
- Detect pytest xfail tests in JUnit XML (type=pytest.xfail) and
  show them with a distinct warning emoji instead of skip emoji
- Update legend to include xfail explanation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
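
A minimal sketch of the xfail detection, assuming the JUnit XML shape that pytest emits, where an expected failure is written as a `<skipped>` element with `type="pytest.xfail"`. The function names and status strings are illustrative, not the report's actual code.

```python
import xml.etree.ElementTree as ET


def classify(testcase: ET.Element) -> str:
    """Map a JUnit <testcase> element to a status string."""
    if testcase.find("failure") is not None or testcase.find("error") is not None:
        return "failed"
    skipped = testcase.find("skipped")
    if skipped is not None:
        # Expected failures are reported as skips with a distinguishing type.
        return "xfail" if skipped.get("type") == "pytest.xfail" else "skipped"
    return "passed"


def statuses_from_xml(text: str) -> dict[str, str]:
    """Return {test name: status} for every <testcase> in a JUnit document."""
    root = ET.fromstring(text)
    return {tc.get("name", ""): classify(tc) for tc in root.iter("testcase")}
```

Checking for `<failure>` and `<error>` first means a test is only counted as skipped or xfail when nothing actually went wrong.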

* Add Foundry embedding env vars to merge-tests workflow

Sync the Foundry integration job in python-merge-tests.yml with
python-integration-tests.yml by adding FOUNDRY_MODELS_ENDPOINT,
FOUNDRY_MODELS_API_KEY, FOUNDRY_EMBEDDING_MODEL, and
FOUNDRY_IMAGE_EMBEDDING_MODEL. Once the repo variables/secrets
are configured, the embedding integration test will run in CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix File column showing class name instead of module name

When a test is inside a class, pytest writes the classname as e.g.
'pkg.test_file.TestClass'. The previous rsplit logic extracted
'TestClass' instead of 'test_file'. Now detect uppercase-starting
segments as class names and use the preceding segment instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
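
The heuristic described above can be sketched as follows. The function name is hypothetical, and the loop is a small variation on the description: it walks back past every trailing uppercase-starting segment, so nested test classes are handled too.

```python
def module_from_classname(classname: str) -> str:
    """Return the test module segment of a dotted JUnit classname."""
    parts = classname.split(".")
    # Drop trailing segments that look like class names (uppercase initial).
    while len(parts) > 1 and parts[-1][:1].isupper():
        parts.pop()
    return parts[-1]
```

The heuristic relies on the usual naming conventions: lowercase module names and capitalized class names.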

* Address PR review: UTC timestamps, XML error handling, summary fix, docstring

- Use datetime.now(timezone.utc) for accurate UTC timestamps
- Catch ET.ParseError per-file so corrupt XML doesn't crash the report
- Remove separate 'error' key from summary (errors folded into 'failed')
- Fix _short_name docstring to show actual dotted classname::name format

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
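
A minimal sketch of the per-file error handling and UTC timestamp from this commit. The function name `load_reports` and the shape of the returned dictionary are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from pathlib import Path


def load_reports(reports_dir: Path) -> dict:
    """Parse every XML file in a directory, skipping files that fail to parse."""
    parsed, skipped = {}, []
    # A missing directory yields an empty result rather than raising.
    files = sorted(reports_dir.glob("*.xml")) if reports_dir.is_dir() else []
    for path in files:
        try:
            parsed[path.name] = ET.parse(path).getroot()
        except ET.ParseError:
            skipped.append(path.name)  # corrupt file: record it and keep going
    return {
        # Timezone-aware UTC timestamp, so the ISO string carries an offset.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "parsed": parsed,
        "skipped": skipped,
    }
```

Catching `ET.ParseError` inside the loop, rather than around it, is what keeps one truncated artifact from discarding the results of every other job.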

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-04-22 20:16:50 +00:00

3f23e1dfbf

Python: Migrate GitHub Copilot package to SDK 0.2.x (#5107)

* Python: Migrate GitHub Copilot package to SDK 0.2.x

Replace all imports from the non-existent copilot.types module with
correct SDK 0.2.x module paths (copilot.session, copilot.client,
copilot.tools, copilot.generated.session_events). Fix PermissionRequest
attribute access from dict-style .get() to dataclass attribute access.
Add OTel telemetry support to Copilot samples via configure_otel_providers
and document new telemetry environment variables in samples README.
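
The change from dict-style access to attribute access can be illustrated with a stand-in type. `PermissionRequestStandIn` and its `kind` field are hypothetical; the real SDK dataclass may expose different fields.

```python
from dataclasses import dataclass


@dataclass
class PermissionRequestStandIn:
    """Hypothetical stand-in; the real SDK type's fields may differ."""

    kind: str


def describe(request: PermissionRequestStandIn) -> str:
    """Read a field from a dataclass-style request object."""
    # Dataclass instances have no .get(); read the attribute instead.
    # getattr with a default mirrors the old dict.get(key, default) fallback.
    return getattr(request, "kind", "unknown")
```

Calling `request.get("kind")` on such an object raises `AttributeError`, which is the failure mode the migration had to remove.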

* Python: Fix remaining copilot.types import in sample validation script

* Python: Include model in default_options for telemetry span attributes

* Python: Address review feedback on log_level and session kwargs typing

* Python: Scope PR to SDK 0.2.x migration only, remove net-new OTel features

- Remove RawGitHubCopilotAgent split and AgentTelemetryLayer inheritance
- Remove TelemetryConfig plumbing and OTLP/file telemetry settings
- Remove configure_otel_providers() calls from samples
- Remove telemetry env var rows from samples README
- Retain only: import path fixes, PermissionRequest attribute access fix,
  log_level default fix, session kwargs typing fix, dependency pin

* Python: Update tests for SDK 0.2.x API changes

- SubprocessConfig replaces CopilotClientOptions dict
- create_session and resume_session now use keyword args
- send and send_and_wait take plain string prompt instead of MessageOptions
- on_permission_request is always required; deny-all fallback replaces omission

* Python: Pin github-copilot-sdk to >=0.2.0,<=0.2.0

Tighten the upper bound from <0.3.0 to <=0.2.0 to avoid pulling in 0.2.1+
which has breaking API changes relative to 0.2.0. The lower bound stays at
>=0.2.0 since this migration requires the 0.2.x import paths; 0.1.x would
fail at import time.

* Python: Pin github-copilot-sdk to >=0.2.1,<=0.2.1

---------

Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>

Dineshsuriya D · 2026-04-10 01:07:14 +00:00

d4036c5aef

Python: [BREAKING] update to v1.0.0 (#5062)

* updates to final deprecated pieces and versions

* fix mypy

* fix readme links

Eduard van Valkenburg · 2026-04-02 15:26:30 +00:00

3446eb8d5d

Python: Fix samples (#4980)

* Fix first batch of samples

* Fix sample paths

* Fix workflow samples

* Fix workflow dependency

* Correct env vars

* Increase idle timeout

* Fix workflows HIL sample

* Fix more workflow samples

Tao Chen · 2026-03-31 15:20:35 +00:00

016daf3b98

Python: [BREAKING] Provider-leading client design & OpenAI package extraction (#4818)

* Python: Provider-leading client design & OpenAI package extraction

Major refactoring of the Python Agent Framework client architecture:

- Extract OpenAI clients into new `agent-framework-openai` package
- Core package no longer depends on openai, azure-identity, azure-ai-projects
- Rename clients for discoverability: OpenAIResponsesClient → OpenAIChatClient,
  OpenAIChatClient → OpenAIChatCompletionClient
- Unify `model_id`/`deployment_name`/`model_deployment_name` → `model` param
- New FoundryChatClient for Azure AI Foundry Responses API
- New FoundryAgent/FoundryAgentClient for connecting to pre-configured Foundry agents
- Remove OpenAIBase/OpenAIConfigMixin from non-deprecated client MRO
- Deprecate AzureOpenAI* clients, AzureAIClient, OpenAIAssistantsClient
- Reorganize samples: azure_openai+azure_ai+azure_ai_agent → azure/
- ADR-0020: Provider-Leading Client Design

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: missing Agent imports in samples, .model_id → .model in foundry_local sample

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: CI failures — mypy errors, coverage targets, sample imports

- azure-ai mypy: add type ignores for TypedDict total=, model arg, forward ref
- Coverage: replace core.azure/openai targets with openai package target
- project_provider: add type annotation for opts dict

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: populate openai .pyi stub, fix broken README links, coverage targets

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fixes

* updated observability

* reset azure init.pyi

* fix errors

* updated adr number

* fix foundry local

* fixed docstrings and comments that had not been renamed, and added deprecated markers to old classes

* fix tests and pyprojects

* fix test vars

* updated function tests

* update durable

* updated test setup for functions

* Fix Foundry auth in workflow samples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Stabilize Python integration workflows

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update hosting samples for Foundry

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trigger full CI rerun

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trigger CI rerun again

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* trigger rerun

* trigger rerun

* fix for litellm

* undo durabletask changes

* Move Foundry APIs into foundry namespace

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Foundry pyproject formatting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split provider samples by Foundry surface

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Restore hosting sample requirements

Also fix the Foundry Local sample link after the provider sample move.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updated tests

* updated foundry integration tests

* removed dist from azurefunctions tests

* Use separate Foundry clients for concurrent agents

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix client setup in azfunc and durable

* disabled two tests

* updated setup for some function and durable tests

* improved azure openai setup with new clients

* ignore deprecated

* fixes

* skip 11

* remove openai assistants int tests

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-25 09:56:29 +00:00

5e056b672e

Python: Update sample validation scripts (#4870)

* Update sample validation scripts

* Adjust prompt

* Update autogen-migration samples

* Add fix suggestion

* Split jobs

* Add .env

* Create trend report

* Add timestamp

* Add more env vars

* Comments

* force node24

* force node24

* force node22

Tao Chen · 2026-03-25 01:21:32 +00:00

4b533608b6

Python: Simplify Python Poe tasks and unify package selectors (#4722)

* updated automation tasks and commands, with alias for the time being

* Restore aggregate test exclusions

Preserve the legacy all-tests scope for test --all by excluding lab and devui from the default aggregate sweep, while still allowing explicit package selection. Also ignore hidden/generated test directories such as .mypy_cache during aggregate discovery.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
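
The selection rule in this commit can be sketched roughly as below. The function name and directory layout are assumptions, and the sketch filters at the package-directory level for simplicity, whereas the commit also talks about hidden directories found during test discovery.

```python
from pathlib import Path

# Packages the commit message says are left out of the default aggregate sweep.
DEFAULT_EXCLUDES = frozenset({"lab", "devui"})


def select_packages(packages_dir: Path, explicit: tuple[str, ...] = ()) -> list[str]:
    """Return package names to test: explicit picks win, else all non-excluded."""
    if explicit:
        return list(explicit)  # explicit selection bypasses the default exclusions
    return sorted(
        p.name
        for p in packages_dir.iterdir()
        # Skip files, hidden/generated directories (.mypy_cache), and excludes.
        if p.is_dir() and not p.name.startswith(".") and p.name not in DEFAULT_EXCLUDES
    )
```

With this shape, `select_packages(root)` reproduces the legacy scope, while `select_packages(root, ("lab",))` still lets a developer target an excluded package on purpose.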

* updated versions in pre-commit

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-18 18:39:11 +00:00

f48c4512d3

Python: chore(python): improve dependency range automation (#4343)

* chore(python): improve dependency range automation

- tighten dependency bounds and coding standards guidance
- add dependency range validation workflow, reporting, and issue automation
- update related tests and dependency pins for compatibility

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updated text and pyarrow

* new lock

* fixed workflow

* updated deps

* fix tiktoken

* chore(python): refine dependency validation workflows

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(python): add high-level dependency validation comments

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* WIP

* added additional comments and excludes

* added dev dependency handling and workflow and updates to package ranges

* added readme and simplified commands

* fix markers

* chore(python): address dependency review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Tighten dependency bounds, remove stale overrides, restore Python 3.10 support

- Apply dependency bound policy across all packages: stable >=1.0 deps use
  >=floor,<next_major; pre-1.0/prerelease deps use validated hard-bounded ranges
- Remove stale root tool.uv.override-dependencies (uvicorn, websockets, grpcio)
- Lower github_copilot requires-python to >=3.10 with github-copilot-sdk gated
  behind python_version >= 3.11 marker; import raises ImportError on 3.10
- Skip github_copilot pyright/mypy/test tasks on Python <3.11
- Use version-conditional pyrightconfig for samples on Python 3.10
- Add compatibility fix in core responses client for older openai typed dicts
- Normalize uv.lock prerelease mode and refresh dev dependencies
- Update CODING_STANDARD.md, DEV_SETUP.md, and package management skill docs

Closes #902

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
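
The bound policy for stable dependencies (`>=floor,<next_major`) can be expressed as a small helper. This is a sketch with a hypothetical function name; it only handles plain numeric versions and deliberately refuses pre-1.0 floors, since the commit says those need validated hard-bounded ranges instead.

```python
def stable_range(floor: str) -> str:
    """Build a '>=floor,<next_major' specifier for a stable (>=1.0) dependency."""
    major = int(floor.split(".")[0])
    if major < 1:
        # Pre-1.0 packages may break on minor releases, so no automatic range.
        raise ValueError("pre-1.0 dependencies need a validated hard-bounded range")
    return f">={floor},<{major + 1}"
```

For example, a floor of `2.11.3` produces `>=2.11.3,<3`.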

* small tweaks

* add note in workflow

* fix workflows and several versions

* fix duplicate

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-13 12:32:37 +00:00

50fdcbaf57

Move sample validation script from samples/ to scripts/ (#4400)

Tao Chen · 2026-03-02 23:36:18 +00:00

d7abfcd444

Python: Fix prek runner duplication and add skills (#3791)

* Python: fix prek runner running fmt/lint in all packages on core change

When a core package file changed, run_tasks_in_changed_packages.py ran
fmt, lint, and pyright in ALL 22 packages (66 tasks). Only type-checking
tasks (pyright, mypy) need to propagate to all packages since type
changes in core affect downstream packages. File-local tasks (fmt, lint)
only need to run in packages with actual file changes.

This reduces a core-only change from 66 tasks to 24 tasks (2 local +
22 pyright).
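
The task-selection rule and its arithmetic can be checked with a short sketch. The function name, the `core` default, and the placeholder package names are illustrative assumptions; this is not the code in `run_tasks_in_changed_packages.py`.

```python
# Tasks whose results depend on upstream types and must fan out from core.
TYPE_CHECK_TASKS = frozenset({"pyright", "mypy"})


def plan_tasks(all_packages, changed_packages, tasks, core="core"):
    """Return the (package, task) pairs to run for a set of changed packages."""
    changed = set(changed_packages)
    planned = []
    for task in tasks:
        if task in TYPE_CHECK_TASKS and core in changed:
            targets = list(all_packages)  # type changes in core affect everyone
        else:
            targets = [p for p in all_packages if p in changed]  # file-local
        planned.extend((pkg, task) for pkg in targets)
    return planned
```

With 22 packages and only `core` changed, `fmt` and `lint` each run once and `pyright` runs 22 times, which gives the 24 tasks quoted above instead of 66.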

Also adds no-commit-to-branch builtin hook to protect the main branch
from direct commits.

* Python: add agent skills extracted from AGENTS.md and coding standards

Add 5 skills to python/.github/skills/ following the Agent Skills format:
- python-development: coding standards, type annotations, docstrings, logging
- python-testing: test structure, fixtures, running tests, async mode
- python-code-quality: linting, formatting, type checking, prek hooks, CI
- python-package-management: monorepo structure, lazy loading, versioning
- python-samples: sample structure, PEP 723, documentation guidelines

* Python: deduplicate AGENTS.md and instructions with agent skills

* updated skills

* fixes from review

* Python: increase timeout for web search integration test

Eduard van Valkenburg · 2026-02-10 12:13:38 +00:00

8ad66637d8

Python: replace pre-commit with prek, add PEP 723 script deps, clean up dev dependencies (#3748)

* python: replace pre-commit with prek, add PEP 723 script deps, clean up dev dependencies

- Replace pre-commit with prek (Rust-native, faster pre-commit alternative)
- Move supported hooks to `repo: builtin` for zero-clone speed
- Add new builtin hooks: trailing-whitespace, check-merge-conflict, detect-private-key, check-added-large-files
- Update all hook versions to latest (pre-commit-hooks v6, pyupgrade v3.21.2, bandit 1.9.3, uv-pre-commit 0.10.0)
- Add PEP 723 inline script metadata to 34 samples with external deps
- Remove autogen-agentchat/autogen-ext from dev deps (now declared per-sample)
- Remove unused dev deps: pytest-env, tomli-w
- Add agent-framework-core>=1.0.0b260130 lower bound to all 21 packages
- Update CI workflow to use j178/prek-action
- Update docs: DEV_SETUP.md, AGENTS.md, CODING_STANDARD.md, SAMPLE_GUIDELINES.md

* updated lock

* python: fix prek config paths for local execution and CI workflow

Remove global 'files: ^python/' filter and strip python/ prefix from all path patterns in .pre-commit-config.yaml so prek finds files when run from the python/ directory. Update CI workflow to use --cd python instead of --config path. Include trailing whitespace fixes and dev dependency cleanup.

* python: move helper scripts to scripts/ folder and exclude from checks

* python: exclude AGENTS.md from prek markdown code lint

* python: exclude AGENTS.md and azure_ai_search sample from markdown lint

* fix m365 sample

* python: ignore CPY rule for samples with PEP 723 headers

* fix in dev_setup

* python: replace aiofiles with regular open in samples

* python: suppress reportUnusedImport in markdown code block checker

* python: use samples pyright config for markdown code block checker

Write a temp pyrightconfig.json matching pyrightconfig.samples.json rules (typeCheckingMode=off, only reportMissingImports and reportAttributeAccessIssue). Filter output to only fail on these rules since syntax-level errors (top-level await, undefined vars) are expected in README documentation snippets.
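
A minimal sketch of the two steps described above: writing a temporary config and filtering diagnostics by rule. It assumes the report dictionary follows the shape of pyright's `--outputjson` output (a `generalDiagnostics` list whose entries may carry a `rule` key); the function names are hypothetical.

```python
import json
from pathlib import Path

# Rules the commit message says the checker should fail on.
ENFORCED_RULES = ("reportMissingImports", "reportAttributeAccessIssue")


def write_temp_config(directory: Path) -> Path:
    """Write a pyrightconfig.json that disables everything but the enforced rules."""
    config = {"typeCheckingMode": "off", **{rule: "error" for rule in ENFORCED_RULES}}
    path = directory / "pyrightconfig.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


def blocking_diagnostics(report: dict) -> list[dict]:
    """Keep only diagnostics for enforced rules; syntax-level noise is dropped."""
    return [
        d for d in report.get("generalDiagnostics", []) if d.get("rule") in ENFORCED_RULES
    ]
```

Diagnostics without a `rule` key, such as syntax errors from top-level `await` in a README snippet, fall through the filter and do not fail the check.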

* python: use markdown-code-lint with fixed globs instead of prek file list

The prek-markdown-code-lint task received all changed files including non-README markdown and files with pre-existing broken imports. Replace with the standard markdown-code-lint task which uses the correct glob patterns (README.md, packages/**/README.md, samples/**/*.md).

* python: exclude READMEs with pre-existing broken imports from markdown lint

* python: fix broken README code snippets instead of excluding them

- ag-ui: replace TextContent (removed) with content.type == 'text'
- durabletask: fix import path to durabletask.worker.TaskHubGrpcWorker
- orchestrations: use constructor params instead of .participants() method
- observability: mark deprecated code blocks as plain text, filter
  reportMissingImports to agent_framework modules only
- remove README excludes from markdown-code-lint task

* add revision to gaia download

* feat(python): parallelize checks across packages

Run (package × task) cross-product in parallel using ThreadPoolExecutor
and subprocesses. Key changes:

- Add scripts/task_runner.py with shared parallel execution engine
- Update run_tasks_in_packages_if_exists.py to accept multiple tasks
- Update run_tasks_in_changed_packages.py with --files flag and parallel support
- Add check-packages poe task (fmt+lint+pyright+mypy in parallel)
- Add prek-markdown-code-lint and prek-samples-check with change detection
- Split CI code quality workflow into parallel prek and mypy jobs
- Update DEV_SETUP.md to document new parallel behavior

Core package changes still trigger checks on all packages.
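
The parallel execution model can be sketched as below. This is not the contents of `scripts/task_runner.py`; the function names are hypothetical and the subprocess command is a placeholder that only prints its arguments, where a real runner would invoke the task inside the package directory.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_one(package: str, task: str) -> tuple[str, str, int]:
    """Run one (package, task) pair in a subprocess and return its exit code."""
    # Placeholder command standing in for the real task invocation.
    cmd = [sys.executable, "-c", f"print({package!r}, {task!r})"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return package, task, result.returncode


def run_all(
    packages: list[str], tasks: list[str], max_workers: int = 4
) -> list[tuple[str, str, int]]:
    """Run the full package x task cross-product concurrently."""
    pairs = list(product(packages, tasks))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order even if jobs finish out of order.
        return list(pool.map(lambda pair: run_one(*pair), pairs))
```

Threads are sufficient here because each worker spends its time waiting on a child process, so the interpreter lock is not a bottleneck.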

* feat(ci): split code quality into 4 parallel jobs

Split the single prek job into parallel jobs:
- pre-commit-hooks: lightweight hooks (SKIP=poe-check)
- package-checks: fmt/lint/pyright/mypy via check-packages
- samples-markdown: samples-lint, samples-syntax, markdown-code-lint
- mypy: change-detected mypy checks

All 4 jobs run concurrently (×2 Python versions = 8 runners).

* feat(ci): use only Python 3.10 for code quality checks

* refactor(python): add future annotations and remove quoted types

Add `from __future__ import annotations` to 93 package files that
used quoted string annotations, then run pyupgrade --py310-plus to
remove the now-unnecessary quotes.
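
The effect of this refactor can be shown on a small, made-up class. With the `__future__` import, annotations are stored as strings and not evaluated at definition time, so a class can refer to itself without quoting the annotation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    value: int
    # Without the __future__ import this would need quotes: "Node | None".
    next: Node | None = None


def last(node: Node) -> Node:
    """Follow next pointers to the final node in the chain."""
    while node.next is not None:
        node = node.next
    return node
```

Because the annotation is never evaluated eagerly, the `X | None` spelling in it does not raise at class-definition time even though `Node` is not yet bound at that point.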

Fixes https://github.com/microsoft/agent-framework/issues/3578

Eduard van Valkenburg · 2026-02-09 17:51:01 +00:00

977c3adfb2

11 Commits