Files
agent-framework/python/scripts
T
Giles Odigwe 3f23e1dfbf Python: Flaky test report (#5342)
* Add flaky test trend reporting to CI workflows

Parse JUnit XML (pytest.xml) from each integration test job and
aggregate results into a markdown trend report showing per-test
pass/fail/skip status across the last 5 runs.

Changes:
- Add python/scripts/flaky_report/ package (JUnit XML parser + trend
  report generator following the sample_validation pattern)
- Add upload-artifact steps to all 6 integration test jobs in both
  python-merge-tests.yml and python-integration-tests.yml
- Add python-flaky-test-report aggregation job with history caching
- Add --junitxml=pytest.xml to integration-tests.yml jobs (already
  present in merge-tests.yml)
- Fix Cosmos job --junitxml path (use absolute path since uv run
  --directory changes cwd)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix flaky report: handle missing test results gracefully

- Guard against missing reports directory in load_current_run()
- Only run report job when at least one integration test job completed
  (skip when all jobs are skipped, e.g. on pull_request events)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: fix provider names and if-expression precedence

- Use explicit provider name mapping in _derive_provider() so OpenAI
  renders correctly instead of 'Openai'
- Fix operator precedence in workflow if-expressions by wrapping
  success/failure checks in parentheses

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add File column and xfail detection to flaky test report

- Add File column showing module name (e.g., test_openai_chat_client)
  to disambiguate tests with the same function name across files
- Detect pytest xfail tests in JUnit XML (type=pytest.xfail) and
  show them with a distinct warning emoji instead of skip emoji
- Update legend to include xfail explanation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add Foundry embedding env vars to merge-tests workflow

Sync the Foundry integration job in python-merge-tests.yml with
python-integration-tests.yml by adding FOUNDRY_MODELS_ENDPOINT,
FOUNDRY_MODELS_API_KEY, FOUNDRY_EMBEDDING_MODEL, and
FOUNDRY_IMAGE_EMBEDDING_MODEL. Once the repo variables/secrets
are configured, the embedding integration test will run in CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix File column showing class name instead of module name

When a test is inside a class, pytest writes the classname as e.g.
'pkg.test_file.TestClass'. The previous rsplit logic extracted
'TestClass' instead of 'test_file'. Now detect uppercase-starting
segments as class names and use the preceding segment instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: UTC timestamps, XML error handling, summary fix, docstring

- Use datetime.now(timezone.utc) for accurate UTC timestamps
- Catch ET.ParseError per-file so corrupt XML doesn't crash the report
- Remove separate 'error' key from summary (errors folded into 'failed')
- Fix _short_name docstring to show actual dotted classname::name format

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
3f23e1dfbf ยท 2026-04-22 20:16:50 +00:00
History
..