agent-framework

Python: Improve PR template and breaking-change label automation (#6473)

* Improve PR template and breaking-change label automation

- Add a structured "Related Issue" section using GitHub closing keywords
- Add a Review Guide prompt (major changes, impact, reviewer focus) with a
  note that the focus item is for human reviewers only
- Add checklist items for issue linkage / no duplicate PRs and invert the
  breaking-change item (checked = not breaking)
- Extend label-title-prefix to prepend [BREAKING] when the "breaking change"
  label is added
- Add label-breaking-change workflow to apply the "breaking change" label
  when a PR title contains [BREAKING]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add pull-requests agent skill with dotnet/python links

- Add root .github/skills/pull-requests/SKILL.md covering PR description
  authoring (following the PR template) and the review-comment workflow
  (review -> plan -> user review -> implement -> reply to all -> resolve)
- Symlink the skill from python/.github/skills and dotnet/.github/skills
- Reference the skill from python/AGENTS.md and dotnet/AGENTS.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fold breaking-change labeling into label-pr workflow

Move the title -> 'breaking change' label logic into the existing label-pr
workflow (which already applies the python/.NET labels) and drop the separate
label-breaking-change workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR title prefix review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Pin patched MessagePack for .NET restore

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert MessagePack central pin

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Move title prefix tests out of tracked GitHub tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Exclude skill docs from CI path filters

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Match skill symlinks in CI path exclusions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Exclude AGENTS docs from CI path filters

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Scope title-prefix normalization to a real prefix

The normalization branch in addTitlePrefix matched ^Python (no colon), so
titles like "Python samples improvements" or "Pythonic refactor" were treated
as already-prefixed and only re-cased, never receiving the "Python: " prefix.
Scope the match to ^<prefix>:\s* so only an actual existing prefix is
normalized; otherwise the prefix is prepended. Same fix applies to the .NET
prefix (e.g. ".NETStandard bump").
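
The scoping rule described above can be sketched as follows. This is a minimal illustration in Python, not the workflow's actual script; the function name `add_title_prefix` and the case-insensitive match are assumptions made for the example.

```python
import re


def add_title_prefix(title: str, prefix: str) -> str:
    """Return `title` carrying exactly one canonical "<prefix>: " in front.

    Hypothetical helper: only a real prefix (label plus colon) counts as
    already present, so "Pythonic refactor" still gets "Python: " prepended.
    """
    # Assumption: the existing prefix is matched case-insensitively.
    existing = re.compile(rf"^{re.escape(prefix)}:\s*", re.IGNORECASE)
    if existing.match(title):
        # Already prefixed: normalize casing and spacing only.
        return existing.sub(lambda _: f"{prefix}: ", title, count=1)
    # No real prefix found: prepend it.
    return f"{prefix}: {title}"
```

With this rule, "python:fix bug" is normalized to "Python: fix bug", while "Python samples improvements" and ".NETStandard bump" receive a fresh prefix.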

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-06-15 10:55:23 +00:00

7e9c043c4c

Python: Add GitHub Copilot integration tests to CI workflows (#6346)

Add a dedicated integration test job for the github_copilot package to both
python-integration-tests.yml and python-merge-tests.yml.

The job:
- Runs 6 integration tests marked with @pytest.mark.integration
- Uses COPILOT_GITHUB_TOKEN secret from the integration environment
- Follows the same pattern as other provider integration jobs
- Includes path filtering in merge-tests (github_copilot package + core changes)
- Added to needs lists in report and check jobs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-06-04 22:06:26 +00:00

f3c3efed43

ci: pin third-party GitHub Actions to commit SHAs (#5972)

Replaces every floating tag in our workflow and composite action files
with an immutable 40-character commit SHA, keeping the original `# vX`
comment so Dependabot can still propose version bumps. 186 occurrences
across 25 workflows and 2 composite actions.

Also widens the github-actions Dependabot entry to use the plural
`directories` key with `/.github/actions/*` so composite actions under
`.github/actions/<name>/action.yml` are kept up to date. Previously
Dependabot only scanned `.github/workflows` and the repo-root
`action.yml`, leaving our `python-setup` and `sample-validation-setup`
composite actions unmaintained.
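
One way to check that a workflow file follows this pinning convention is sketched below. It is an illustrative lint, not something this commit adds; the function name `unpinned_actions` and the decision to skip local `./` actions are assumptions.

```python
import re

# A `uses:` line pinned to a full 40-character commit SHA, optionally
# followed by a trailing `# vX` comment.
PINNED = re.compile(r"^\s*(?:-\s*)?uses:\s*[^@\s]+@[0-9a-f]{40}(?:\s+#.*)?\s*$")
USES = re.compile(r"^\s*(?:-\s*)?uses:\s*(\S+)")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return every `uses:` reference that is not pinned to a commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = USES.match(line)
        if not match:
            continue
        ref = match.group(1)
        if ref.startswith("./"):
            # Assumption: local composite actions have nothing to pin.
            continue
        if not PINNED.match(line):
            found.append(ref)
    return found
```

A floating reference such as `actions/checkout@v4` is reported, while the same action pinned to a 40-character SHA with a `# v4` comment passes.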

Roger Barreto · 2026-05-20 22:10:32 +00:00

01a3c5be8a

Python: Reduce flaky integration tests and improve CI signal quality (#5454)

* Enable Ollama integration tests in CI and rename report to Integration Test Report

- Install Ollama, cache models (qwen2.5:0.5b + nomic-embed-text), and start
  server in the Misc integration job for both workflow files
- Set OLLAMA_MODEL and OLLAMA_EMBEDDING_MODEL env vars so the 5 Ollama tests
  are no longer skipped
- Rename Flaky Test Report to Integration Test Report throughout (job names,
  artifact names, cache keys, file names, script titles/docstrings)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Bump Ollama model to qwen2.5:1.5b for better instruction following

The 0.5b model was too small to reliably follow simple prompts like
'Say Hello World', causing test assertion failures. The 1.5b model
follows instructions more reliably while still being small enough
for fast CI pulls (~1GB).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable reliable streaming integration tests

Remove the hard skip on test_03_reliable_streaming tests that was
temporarily disabled for instability investigation. CI infrastructure
(Azurite, DTS emulator, Redis, func CLI) is already in place.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable skipped Functions/DurableTask tests and bump timeout to 480s

- Remove hard skips from 4 tests in test_11_workflow_parallel.py
- Remove hard skip from test_conditional_branching in test_06_dt_multi_agent_orchestration_conditionals.py
- Increase pytest --timeout from 360 to 480 for Functions+DurableTask CI job
- Updated in both python-merge-tests.yml and python-integration-tests.yml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-skip failing Functions/DurableTask tests with specific root causes

- test_11_workflow_parallel (4 tests): xdist worker crashes during execution
- test_conditional_branching: orchestration fails with RuntimeError, not a timeout
- Keep 480s timeout bump for remaining Functions tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix auth routing in samples 06/11: api_key -> credential for Azure OpenAI

Both samples passed a bearer token provider via api_key= which caused the
client to route to api.openai.com instead of Azure OpenAI, resulting in
401 Unauthorized. Changed to credential= which correctly triggers Azure
routing and picks up AZURE_OPENAI_ENDPOINT from the environment.

- samples/azure_functions/11_workflow_parallel/function_app.py: 1 fix
- samples/durabletask/06_multi_agent_orchestration_conditionals/worker.py: 2 fixes
- Re-enable 4 parallel workflow tests and 1 conditional branching test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-skip parallel workflow tests: xdist worker distribution issue

The 4 parallel workflow tests crash because xdist worksteal distributes
them across separate workers, each spawning its own func process against
shared emulators. Auth fix (api_key->credential) was valid and stays.
test_conditional_branching now passes with the auth fix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix E501 line-too-long in azurefunctions parallel test skip reasons

Wrap skip reason strings to stay within 120 char line limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add retry logic and port-conflict fix for Ollama CI setup

- Kill any auto-started Ollama before launching serve (fixes port
  conflict: 'address already in use')
- Retry ollama pull up to 3 times with 15s backoff (fixes 429 rate
  limit failures)
- Applied to both python-merge-tests.yml and python-integration-tests.yml
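
The retry behavior described above (up to 3 attempts, 15s between them) could look roughly like this. The real change lives in the workflow YAML; this Python sketch only illustrates the control flow, and `retry` is a hypothetical helper name.

```python
import time
from typing import Callable


def retry(
    operation: Callable[[], bool],
    attempts: int = 3,
    delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `operation` until it returns True or `attempts` are used up.

    `operation` stands in for something like running `ollama pull <model>`
    and reporting whether the exit code was zero.
    """
    for attempt in range(1, attempts + 1):
        if operation():
            return True
        if attempt < attempts:
            # Fixed delay between attempts; no sleep after the final failure.
            sleep(delay_s)
    return False
```

A caller would treat a `False` result as a hard failure of the step rather than continuing on to the tests.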

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix flaky integration tests and re-enable skipped tests

- Foundry agent: add allow_preview=True to custom client test
- Foundry hosting: raise max_output_tokens 50->200, add temperature,
  relax assertion in test_temperature_and_max_tokens
- Foundry embedding: update skip reason with root cause (endpoint mismatch)
- OpenAI file search: fix vector store indexing race condition by polling
  file_counts before querying; fix get_streaming_response -> get_response(stream=True)
- Azure OpenAI file search: remove skip (transient 500 resolved)
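
The vector store readiness polling mentioned above can be sketched generically. This is not the test code from the PR: `get_file_counts` is a hypothetical callable returning an `(in_progress, completed)` pair read from the store's `file_counts`, and the timeout values are placeholders.

```python
import time
from typing import Callable, Tuple


def wait_until_indexed(
    get_file_counts: Callable[[], Tuple[int, int]],
    expected_files: int,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until no files are in progress and enough files are completed."""
    deadline = clock() + timeout_s
    while True:
        in_progress, completed = get_file_counts()
        if in_progress == 0 and completed >= expected_files:
            return True
        if clock() >= deadline:
            # Give up; the caller decides whether to fail or skip the test.
            return False
        sleep(interval_s)
```

Querying only after this returns `True` avoids the race where a search runs before the uploaded file has been indexed.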

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Remove temperature from foundry hosting test (unsupported by CI model)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Stabilize Ollama tool call integration tests with no-arg function

Use a no-argument greet() function instead of hello_world(arg1) for
integration tests. The 1.5B model in CI is unreliable at generating
correct tool call arguments, causing 'Argument parsing failed' errors.
A no-arg function eliminates this flakiness entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Increase reliable streaming test timeouts from 30s to 60s

The LLM call through Azure OpenAI + Redis streaming pipeline can exceed
30s in CI due to cold starts or throttling. Raise to 60s to reduce
flaky timeouts while still bounded by pytest's 120s per-test limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Re-enable workflow parallel tests with xdist_group marker

The tests were skipped because xdist distributes module tests across
workers, each spawning their own func process (port conflicts). Adding
xdist_group forces all tests in this module onto a single worker so
the module-scoped function_app_for_test fixture works correctly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "Re-enable workflow parallel tests with xdist_group marker"

This reverts commit 455c28da62.

* Rename flaky_report to integration_test_report and add try/finally cleanup

- Rename scripts/flaky_report/ to scripts/integration_test_report/ to
  reflect expanded scope beyond flaky-test detection
- Update workflow references in both CI files
- Wrap file search integration tests in try/finally to ensure vector
  store cleanup runs even on test failure or timeout

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Ollama pull failure propagation and Azure OpenAI vector store readiness

- Ollama CI: fail the step immediately if model pull fails after 3
  retries instead of silently proceeding to tests
- Azure OpenAI file search: add the same vector-store readiness polling
  that was applied to the non-Azure OpenAI tests, preventing eventual
  consistency race conditions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* remove load_dotenv from test file

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-05-01 00:41:39 +00:00

540193ccef

Python: Update hosting agent samples + fixes (#5485)

* Update foundry hosting samples

* Add file data type support

* Fix file content and add more tests

* Fix README

* Address comments

* Fix int tests

* remove temp

Tao Chen · 2026-04-28 04:24:05 +00:00

88347f6494

Python: Flaky test report (#5342)

* Add flaky test trend reporting to CI workflows

Parse JUnit XML (pytest.xml) from each integration test job and
aggregate results into a markdown trend report showing per-test
pass/fail/skip status across the last 5 runs.

Changes:
- Add python/scripts/flaky_report/ package (JUnit XML parser + trend
  report generator following the sample_validation pattern)
- Add upload-artifact steps to all 6 integration test jobs in both
  python-merge-tests.yml and python-integration-tests.yml
- Add python-flaky-test-report aggregation job with history caching
- Add --junitxml=pytest.xml to integration-tests.yml jobs (already
  present in merge-tests.yml)
- Fix Cosmos job --junitxml path (use absolute path since uv run
  --directory changes cwd)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix flaky report: handle missing test results gracefully

- Guard against missing reports directory in load_current_run()
- Only run report job when at least one integration test job completed
  (skip when all jobs are skipped, e.g. on pull_request events)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: fix provider names and if-expression precedence

- Use explicit provider name mapping in _derive_provider() so OpenAI
  renders correctly instead of 'Openai'
- Fix operator precedence in workflow if-expressions by wrapping
  success/failure checks in parentheses

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add File column and xfail detection to flaky test report

- Add File column showing module name (e.g., test_openai_chat_client)
  to disambiguate tests with the same function name across files
- Detect pytest xfail tests in JUnit XML (type=pytest.xfail) and
  show them with a distinct warning emoji instead of skip emoji
- Update legend to include xfail explanation
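
A minimal sketch of the xfail detection, assuming the `type=pytest.xfail` attribute sits on the `<skipped>` child of a `<testcase>`. The function name `classify_testcase` and the status strings are illustrative, not the report package's actual API.

```python
import xml.etree.ElementTree as ET


def classify_testcase(case: ET.Element) -> str:
    """Map one JUnit <testcase> element to a coarse status string."""
    # Failures and errors are treated alike in this sketch.
    if case.find("failure") is not None or case.find("error") is not None:
        return "failed"
    skipped = case.find("skipped")
    if skipped is not None:
        # Expected failures are reported as a skip with a distinguishing type.
        return "xfail" if skipped.get("type") == "pytest.xfail" else "skipped"
    return "passed"
```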

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add Foundry embedding env vars to merge-tests workflow

Sync the Foundry integration job in python-merge-tests.yml with
python-integration-tests.yml by adding FOUNDRY_MODELS_ENDPOINT,
FOUNDRY_MODELS_API_KEY, FOUNDRY_EMBEDDING_MODEL, and
FOUNDRY_IMAGE_EMBEDDING_MODEL. Once the repo variables/secrets
are configured, the embedding integration test will run in CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix File column showing class name instead of module name

When a test is inside a class, pytest writes the classname as e.g.
'pkg.test_file.TestClass'. The previous rsplit logic extracted
'TestClass' instead of 'test_file'. Now detect uppercase-starting
segments as class names and use the preceding segment instead.
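
The heuristic can be illustrated like this. It is a sketch under the stated rule (uppercase-starting segments are class names); `module_from_classname` is a hypothetical name, and dropping several trailing class segments for nested classes goes slightly beyond what the commit describes.

```python
def module_from_classname(classname: str) -> str:
    """Extract the test module name from a dotted JUnit `classname`."""
    parts = classname.split(".")
    # Drop trailing segments that look like class names, but always keep
    # at least one segment so the result is never empty by construction.
    while len(parts) > 1 and parts[-1][:1].isupper():
        parts.pop()
    return parts[-1]
```

For `pkg.test_file.TestClass` this yields `test_file`, and for a module-level test with classname `pkg.test_file` it yields the same value.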

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: UTC timestamps, XML error handling, summary fix, docstring

- Use datetime.now(timezone.utc) for accurate UTC timestamps
- Catch ET.ParseError per-file so corrupt XML doesn't crash the report
- Remove separate 'error' key from summary (errors folded into 'failed')
- Fix _short_name docstring to show actual dotted classname::name format
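
The first two points above might be sketched as follows. The helpers `parse_reports` and `run_timestamp` are hypothetical, and they take XML strings rather than files to keep the example self-contained.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_reports(named_xml: dict[str, str]) -> dict[str, ET.Element]:
    """Parse each report independently so one corrupt file is skipped."""
    roots = {}
    for name, text in named_xml.items():
        try:
            roots[name] = ET.fromstring(text)
        except ET.ParseError as exc:
            # Report the problem and move on instead of aborting the run.
            print(f"skipping corrupt report {name}: {exc}")
    return roots


def run_timestamp() -> str:
    """Timezone-aware UTC timestamp, e.g. for labeling a report run."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")
```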

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Giles Odigwe · 2026-04-22 20:16:50 +00:00

3f23e1dfbf

Python: Add Hyperlight CodeAct package and docs (#5185)

* initial work on code_mode

* updated samples

* updates to codeact

* updated codeact

* Draft CodeAct ADR and sample updates

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* initial implementation and adr and feature

* Python: Limit Hyperlight wasm backend to Python <3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix CI for Hyperlight CodeAct PR

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Run Hyperlight integration when available

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Address Hyperlight review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Simplify Hyperlight file mount inputs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Accept Path host paths in Hyperlight mounts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix Hyperlight mount typing for CI

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* temp run integration test

* Python: Strengthen Hyperlight real sandbox tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* added additional tests

* Python: Simplify Hyperlight CodeAct API

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* set tests as non-integration

* Retry Hyperlight allowed-domain registration

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Gate Hyperlight integration tests by runtime support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight skip test on Python 3.14

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Delay Hyperlight runtime probe until test execution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Relax Hyperlight Windows integration stdout assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Scan Hyperlight output directory for artifacts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight output artifact collection

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Harden Hyperlight integration output assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Retry Hyperlight read-back check in integration test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify Hyperlight integration write assertion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Avoid pathlib in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use socket network check in Hyperlight sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Replace blocked Azure AI Search blog link

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Clarify Hyperlight guest stdlib limits

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use _socket in Hyperlight integration sandbox

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Handle Hyperlight mounted file paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Broaden Hyperlight sandbox path fallbacks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Search Hyperlight guest mounts recursively

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight mount coverage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split Hyperlight live network tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Hyperlight file-write test on Windows

Enable the sandbox filesystem by providing a workspace_root so
/output is mounted. Remove os.path.exists assertion (unsupported
in WASM guest) and fix Content data assertion to use .uri.
Skip the network integration test on Windows where the WASM
sandbox lacks the encodings.idna codec.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review: ADR intro, manual wiring sample, doc clarifications

- Add CodeAct introduction section to ADR for unfamiliar readers
- Clarify 'less runtime efficient' con with specific overhead description
- Add note in Python impl doc clarifying ADR vs impl doc split
- Explain why before_run hooks must be per-run (CRUD, concurrency, approval)
- Rename code_interpreter variable to codeact in E2E sample
- Add manual static wiring sample (codeact_manual_wiring.py)
- Add 'when to use which pattern' guidance to samples README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR #5185 review comments and add .NET CodeAct design doc

- Fix async callback: _make_sandbox_callback returns sync wrapper with
  thread + asyncio.run() bridge (was broken with real Wasm FFI)
- Fix stale output: clear output_dir before each sandbox.run() call
- Fix blocking event loop: _run_code now async with asyncio.to_thread()
- Revert _agents.py options['tools'] injection (unnecessary; provider
  uses context.extend_tools())
- Revert SessionContext.options docstring back to read-only
- Add real-sandbox test fixtures (shared/restored/fresh)
- Add 8 new real-sandbox tests for callback round-trip, stale output,
  event loop non-blocking, basic execution, stdout/stderr, errors,
  snapshot/restore, and tool registration
- Add comprehensive .NET HyperlightCodeActProvider design document
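
The two bridging fixes in this list (a sync wrapper using a thread plus `asyncio.run()`, and an async entry point using `asyncio.to_thread()`) can be sketched as below. This is an illustration of the pattern only: `make_sync_callback`, `run_code`, and `echo_tool` are stand-ins, not the package's real functions.

```python
import asyncio
import threading
from typing import Any, Awaitable, Callable


def make_sync_callback(async_fn: Callable[..., Awaitable[Any]]) -> Callable[..., Any]:
    """Wrap an async tool so synchronous host code can call it."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        outcome: dict[str, Any] = {}

        def runner() -> None:
            # A fresh thread has no running event loop, so asyncio.run is safe
            # here even if the caller's thread is already inside a loop.
            try:
                outcome["value"] = asyncio.run(async_fn(*args, **kwargs))
            except BaseException as exc:
                outcome["error"] = exc

        thread = threading.Thread(target=runner)
        thread.start()
        thread.join()
        if "error" in outcome:
            raise outcome["error"]
        return outcome["value"]

    return wrapper


async def run_code(blocking_run: Callable[[str], Any], code: str) -> Any:
    """Run a blocking sandbox call in a worker thread, off the event loop."""
    return await asyncio.to_thread(blocking_run, code)


async def echo_tool(text: str) -> str:
    """Hypothetical async tool used only to exercise the bridge."""
    await asyncio.sleep(0)
    return text.upper()
```

The thread hop costs a little per call, but it keeps the bridge correct whether or not the calling thread already runs an event loop.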

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update hyperlight README with code snippets and remove Public API section

Replace bare export list with Quick Start code examples covering the
context provider, standalone tool, manual static wiring, and file
mounts / network access patterns.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-17 00:49:44 +00:00

b03cb324d5

Python: bump misc-integration retry delay to 30s (#5293)

The misc-integration job (Anthropic, Ollama, MCP) frequently fails on merge to main when the upstream MCP server (e.g. learn.microsoft.com/api/mcp) returns a transient rate-limit error. The previous 5s retry delay is too short to ride out the upstream backoff window, so all retries fail and the merge queue is blocked. Bumping to 30s gives the upstream a chance to recover before pytest-retry re-runs the test.

Evan Mattson · 2026-04-16 10:03:00 +09:00

f112150cfb

Python: Stop emitting duplicate reasoning content from OpenAI response.reasoning_text.done and response.reasoning_summary_text.done events (#5162)

* Fix reasoning text done events duplicating streamed delta content (#5157)

The OpenAI Responses API sends both reasoning_text.delta (incremental
chunks) and reasoning_text.done (full accumulated text) events. The
chat client was emitting Content for both, causing ag-ui to append the
full done text onto already-accumulated delta text, producing
duplicated reasoning output.

Stop emitting Content for reasoning_text.done and
reasoning_summary_text.done events, matching how output_text.done is
already handled (not emitted). The deltas contain all the content;
the done event is redundant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(openai): emit reasoning done content as fallback when no deltas observed (#5157)

Address PR review feedback:
- Track item_ids that received reasoning deltas via seen_reasoning_delta_item_ids set
- Emit content from done events only when no deltas were received for the
  item_id, preventing silent content loss on stream resumption
- Add comment documenting code_interpreter done event asymmetry
- Replace redundant ag-ui test with deduplication-focused test
- Add integration test for delta+done sequence in OpenAI chat client tests
- Add fallback path tests for done events without preceding deltas
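
The fallback rule can be sketched with simplified event dictionaries. This is not the chat client's code: the dict shape (`type`, `item_id`, `delta`, `text`) is an assumption for illustration, and a single set is shared by text and summary events as the bullet above describes.

```python
from typing import Iterable, Iterator

DELTA_TYPES = {
    "response.reasoning_text.delta",
    "response.reasoning_summary_text.delta",
}
DONE_TYPES = {
    "response.reasoning_text.done",
    "response.reasoning_summary_text.done",
}


def reasoning_text(events: Iterable[dict]) -> Iterator[str]:
    """Yield reasoning text once: deltas when present, else the done text."""
    seen_delta_item_ids: set[str] = set()
    for event in events:
        if event["type"] in DELTA_TYPES:
            seen_delta_item_ids.add(event["item_id"])
            yield event["delta"]
        elif event["type"] in DONE_TYPES and event["item_id"] not in seen_delta_item_ids:
            # No deltas were observed for this item (e.g. a resumed stream),
            # so the done event is the only copy of the content.
            yield event["text"]
```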

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5157: Python: [Bug]: "type": "response.reasoning_text.delta" and "response.reasoning_text.done" both get exposed as "text_reasoning"

* Fix AG-UI reasoning streaming to use proper Start/End pattern (#5157)

_emit_text_reasoning now follows the same streaming pattern as _emit_text:
- Emits ReasoningStartEvent/ReasoningMessageStartEvent only on the first
  delta for a given message_id
- Emits only ReasoningMessageContentEvent for subsequent deltas
- Defers ReasoningMessageEndEvent/ReasoningEndEvent until
  _close_reasoning_block is called (on content type switch or end-of-run)

This produces the correct protocol pattern:
  ReasoningStartEvent
    ReasoningMessageStartEvent
    ReasoningMessageContentEvent(delta1)
    ReasoningMessageContentEvent(delta2)
    ReasoningMessageEndEvent
  ReasoningEndEvent

Instead of wrapping every delta in a full Start→End sequence.

Backward compatibility is preserved: calling _emit_text_reasoning without
a flow argument still produces the full sequence per call.
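
The deferred Start/End pattern can be modeled with a small state holder. This sketch uses plain tuples named after the events listed above instead of the real AG-UI event classes, and `ReasoningFlow` is a hypothetical stand-in for the flow argument.

```python
from typing import Optional


class ReasoningFlow:
    """Tracks the open reasoning block so Start/End are emitted only once."""

    def __init__(self) -> None:
        self.open_message_id: Optional[str] = None

    def emit_delta(self, message_id: str, delta: str) -> list[tuple]:
        events: list[tuple] = []
        if self.open_message_id != message_id:
            # Close any previous block, then open one for this message.
            events += self.close()
            events.append(("ReasoningStartEvent", message_id))
            events.append(("ReasoningMessageStartEvent", message_id))
            self.open_message_id = message_id
        events.append(("ReasoningMessageContentEvent", message_id, delta))
        return events

    def close(self) -> list[tuple]:
        """Emit the deferred End events, e.g. on content switch or end of run."""
        if self.open_message_id is None:
            return []
        message_id, self.open_message_id = self.open_message_id, None
        return [
            ("ReasoningMessageEndEvent", message_id),
            ("ReasoningEndEvent", message_id),
        ]
```

Two deltas for the same message followed by `close()` produce one Start pair, two content events, and one End pair.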

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix import ordering lint error in AG-UI test file (#5157)

Move inline import of TextMessageContentEvent to the top-level import
block and ensure alphabetical ordering to satisfy ruff I001 rule.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix mypy error: rename loop variable to avoid type conflict with WorkflowEvent

The 'event' variable was already typed as WorkflowEvent[Any] from the
async for loop at line 590. Reusing it in the _close_reasoning_block
loop (which returns list[BaseEvent]) caused an incompatible assignment
error. Renamed to 'reasoning_evt' to avoid the conflict.

Fixes #5162

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback for #5157: review comment fixes

* narrow test result reporting to explicit pytest JUnit XML

* Fix test args

* Fix pytest-results-action in merge workflow and remove committed test artifacts

Apply the same JUnit XML fix from python-tests.yml to python-merge-tests.yml:
add --junitxml=pytest.xml to all test commands and narrow the results action
path from ./python/**.xml to ./python/pytest.xml. Also remove accidentally
committed pytest.xml and python-coverage.xml and add them to .gitignore.

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-04-09 22:44:59 +00:00

5e8fe0be1f

Python: [BREAKING] Python: move Azure AI embeddings to Foundry (#5056)

* renamed AzureAIInferenceEmbeddings and lazy load azure-cosmos and env var rename

* updated coverage

* fix readme

Eduard van Valkenburg · 2026-04-02 11:26:35 +00:00

95fd5ec658

Python: [BREAKING] Standardize model selection on model (#4999)

* Refactor Anthropic model option and provider clients

Rename the Anthropic client model option from model_id to model, add provider-specific Anthropic wrappers for Foundry, Bedrock, and Vertex, and expose them through the Anthropic, Foundry, Amazon, and Google namespaces. Update core option handling, docs, samples, and tests accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Anthropic skills sample typing

Cast the Anthropic beta client to Any in the skills sample so the pre-commit sample pyright check no longer fails on beta skills and files endpoints that are not exposed by the current SDK stubs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* undo sample mypy

* Retry CI after transient external failures

Retrigger PR validation after an unrelated Copilot review workflow SAML failure and a transient external tau2 git fetch failure in the Windows Python test setup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback on model option merging

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address Anthropic compatibility review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* moved all to `model`

* fixes for azure ai search

* Python: standardize remaining sample env var names

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix foundry-local pyright compatibility

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updated env vars in cicd

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-04-01 19:00:18 +00:00

6acab3d1d6

Python: [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces (#4990)

* [BREAKING] Remove deprecated Python OpenAI/Azure AI surfaces

Also clean up follow-on docs, environment guidance, package metadata, and lab test stability.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix deleted semantic-kernel sample links

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* improve foundry language

* Fix A2A Foundry sample regression

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-31 20:36:21 +00:00

3a49b1d6dd

Python: [BREAKING] Remove deprecated kwargs compatibility paths (#4858)

* [BREAKING] Remove deprecated kwargs compatibility paths

Remove the deprecated kwargs compatibility shims across core agents, clients, tools, middleware, and telemetry.

Keep workflow kwargs behavior intact in this branch and follow up separately in #4850.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix PR CI fallout for kwargs removal

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updates

* Fix Azure AI CI fallout

Remove the stale _get_current_conversation_id override from the Azure AI client after the OpenAI base helper was deleted.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fixed new classes

* Fix Assistants deprecated import gating

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix integration replay regressions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Switch multi-agent hosting samples to Azure chat completions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Simplify Azure multi-agent sample config

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-27 21:00:12 +00:00

b1b528e4a8

[BREAKING] Python: fix OpenAI Azure routing and provider samples (#4925)

* Python: fix OpenAI Azure routing and provider samples

Prefer OpenAI when OPENAI_API_KEY is present unless Azure is explicitly requested. Clarify constructor docs, keep deprecated Azure wrappers compatible with stricter settings validation, and refresh the provider samples and tests to use the current client patterns.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix bandit

* Python: align OpenAI embedding Azure routing

Extend the shared OpenAI-vs-Azure routing and credential behavior to the embedding client, add Azure embedding regression coverage, and refresh the embedding samples to use the generic client path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix embedding client pyright check

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: thin OpenAI embedding wrapper

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: document embedding overload routing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix callable OpenAI key routing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: fix Azure credential routing tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: address OpenAI review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: narrow Azure routing markers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: refine OpenAI model fallback order

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: narrow Azure deployment docs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: remove embedding routing wording

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: run embedding Azure integration tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* changed variable name

* Python: expand OpenAI package README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* clarified readme

* Python: fix Azure OpenAI integration setup

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: correct Azure integration env mapping

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updated code to fix int tests

* test updates

* test fix

* fix test setup

* updates to tests and setup

* remove openai assistants int tests

* improvements in int tests

* fix env var

* fix env vars

* fix azure responses test

* trigger actions

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-27 13:33:39 +00:00

cc0cfaaac8

Python: [BREAKING] Python: Provider-leading client design & OpenAI package extraction (#4818)

* Python: Provider-leading client design & OpenAI package extraction

Major refactoring of the Python Agent Framework client architecture:

- Extract OpenAI clients into new `agent-framework-openai` package
- Core package no longer depends on openai, azure-identity, azure-ai-projects
- Rename clients for discoverability: OpenAIResponsesClient → OpenAIChatClient,
  OpenAIChatClient → OpenAIChatCompletionClient
- Unify `model_id`/`deployment_name`/`model_deployment_name` → `model` param
- New FoundryChatClient for Azure AI Foundry Responses API
- New FoundryAgent/FoundryAgentClient for connecting to pre-configured Foundry agents
- Remove OpenAIBase/OpenAIConfigMixin from non-deprecated client MRO
- Deprecate AzureOpenAI* clients, AzureAIClient, OpenAIAssistantsClient
- Reorganize samples: azure_openai+azure_ai+azure_ai_agent → azure/
- ADR-0020: Provider-Leading Client Design

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: missing Agent imports in samples, .model_id → .model in foundry_local sample

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: CI failures — mypy errors, coverage targets, sample imports

- azure-ai mypy: add type ignores for TypedDict total=, model arg, forward ref
- Coverage: replace core.azure/openai targets with openai package target
- project_provider: add type annotation for opts dict

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: populate openai .pyi stub, fix broken README links, coverage targets

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fixes

* updated observability

* reset azure init.pyi

* fix errors

* updated adr number

* fix foundry local

* fixed not renamed docstrings and comments, and added deprecated markers to old classes

* fix tests and pyprojects

* fix test vars

* updated function tests

* update durable

* updated test setup for functions

* Fix Foundry auth in workflow samples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Stabilize Python integration workflows

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update hosting samples for Foundry

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trigger full CI rerun

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trigger CI rerun again

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* trigger rerun

* trigger rerun

* fix for litellm

* undo durabletask changes

* Move Foundry APIs into foundry namespace

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Foundry pyproject formatting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Split provider samples by Foundry surface

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Restore hosting sample requirements

Also fix the Foundry Local sample link after the provider sample move.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updated tests

* updated foundry integration tests

* removed dist from azurefunctions tests

* Use separate Foundry clients for concurrent agents

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix client setup in azfunc and durable

* disabled two tests

* updated setup for some function and durable tests

* improved azure openai setup with new clients

* ignore deprecated

* fixes

* skip 11

* remove openai assistants int tests

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-25 09:56:29 +00:00

5e056b672e

Python: Simplify Python Poe tasks and unify package selectors (#4722)

* updated automation tasks and commands, with alias for the time being

* Restore aggregate test exclusions

Preserve the legacy all-tests scope for test --all by excluding lab and devui from the default aggregate sweep, while still allowing explicit package selection. Also ignore hidden/generated test directories such as .mypy_cache during aggregate discovery.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updated versions in pre-commit

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-18 18:39:11 +00:00

f48c4512d3

[BREAKING] Python: Update github-copilot-sdk integration to use ToolInvocation/ToolResult types (#4551)

* Update github_copilot package for github-copilot-sdk>=0.1.32 (#4549)

- Update requires-python from >=3.10 to >=3.11
- Remove Python 3.10 classifier
- Update mypy python_version to 3.11
- Update dependency to github-copilot-sdk>=0.1.32
- Fix ToolResult API: use snake_case kwargs (text_result_for_llm,
  result_type) instead of camelCase (textResultForLlm, resultType)
- Update test assertions to use attribute access on ToolResult
- Add ToolResult type assertions to tool handler tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix tests to use ToolInvocation dataclass instead of plain dict (#4549)

Update test_github_copilot_agent.py to pass ToolInvocation objects to tool
handlers instead of plain dicts, matching the github-copilot-sdk>=0.1.32 API
where ToolInvocation is a dataclass with an .arguments attribute.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add regression tests for ToolInvocation contract (#4549)

Add tests to lock in the new ToolInvocation-based calling convention:
- test_tool_handler_rejects_raw_dict_invocation: verifies passing a raw
  dict (old calling convention) raises TypeError/AttributeError
- test_tool_handler_with_empty_arguments: verifies ToolInvocation with
  empty arguments works correctly for no-arg tools
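
A minimal sketch of the calling convention these tests lock in. The `ToolInvocation` class below is a local stand-in written for this example (the real type comes from github-copilot-sdk), and `add_handler` is a hypothetical tool handler.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ToolInvocation:
    """Stand-in for the SDK dataclass; only the .arguments attribute is modelled."""

    arguments: Dict[str, Any] = field(default_factory=dict)


def add_handler(invocation: ToolInvocation) -> int:
    # Handlers read arguments via attribute access, so a raw dict no longer works.
    args = invocation.arguments
    return args.get("a", 0) + args.get("b", 0)
```

Passing a plain dict to `add_handler` raises `AttributeError` because a dict has no `arguments` attribute, while `ToolInvocation()` with empty arguments still works for a no-arg call.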

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert requires-python to >=3.10 to avoid breaking CI (#4549)

The repo CI runs with Python 3.10 (uv sync --all-packages) and all other
packages require >=3.10. Raising this package to >=3.11 would break the
shared install flow. The SDK dependency version constraint (>=0.1.32) will
enforce any Python version requirement from the SDK itself.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix min Python version for github_copilot package to >=3.11

github-copilot-sdk>=0.1.32 requires Python>=3.11, which conflicts
with the package's declared >=3.10 minimum, breaking uv sync.

* Bump py version for GH workflows to 3.11, exclude GHCP sdk from 3.10 items

* Fix uv command

* Fixes

* Update samples

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Evan Mattson · 2026-03-09 09:57:51 +00:00

d5e240b375

Python: Add Azure Cosmos history provider package (#4271)

* Created cosmos history provider

* add marker

* Python: address Cosmos PR feedback

- address provider/test/sample review feedback and cleanup typing
- add cosmos integration test coverage and skip gating
- add dedicated cosmos emulator jobs to python merge/integration workflows
- switch cosmos workflow execution to package poe integration-tests task

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: handle empty Cosmos session id

- replace default partition fallback for empty session_id
- log warning and generate GUID when session_id is empty
- update unit tests to validate GUID fallback behavior
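
The fallback described above might look like the sketch below. `resolve_session_id` is a hypothetical helper name; the provider's real method and log message are not shown in this commit message.

```python
import logging
import uuid
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_session_id(session_id: Optional[str]) -> str:
    """Return the given session id, or a fresh GUID when it is empty."""
    if session_id:
        return session_id
    generated = str(uuid.uuid4())
    # Warn so that an unexpectedly empty id is visible in the logs.
    logger.warning("Empty session_id received; using generated id %s", generated)
    return generated
```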

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix sample

* fix cross partition query

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-03-03 12:29:32 +00:00

c37f74f898

Python: updated integration tests and guidance (#4181)

* updated integration tests and guidance

* fixed merge test

* updated integration tests

* fix: remove duplicate --dist loadfile flag from pytest-xdist config

Only one --dist mode can be active at a time; the second value silently
overrides the first. Keep --dist worksteal (dynamic load balancing) and
remove the redundant --dist loadfile from all workflow files and
pyproject.toml configs.
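
The "last value wins" behaviour can be reproduced with plain `argparse`, which is assumed here to behave the same way as the option parsing behind `--dist`. The parser below is a stand-in for illustration, not pytest-xdist's own option definition.

```python
import argparse

# A single-valued option: each occurrence overwrites the previous one.
parser = argparse.ArgumentParser()
parser.add_argument("--dist", default="no")

args = parser.parse_args(["--dist", "loadfile", "--dist", "worksteal"])
print(args.dist)  # prints "worksteal"; "loadfile" is silently dropped
```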

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add keep-in-sync notes for merge and integration test workflows

Both python-merge-tests.yml and python-integration-tests.yml share the
same parallel job structure. Added sync reminders in workflow file
comments, the python-testing SKILL.md, and CODING_STANDARD.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: remove RUN_INTEGRATION_TESTS flag

Integration test gating now uses two mechanisms:
- `@pytest.mark.integration` for test selection via `-m` filtering
- `skip_if_*_disabled` for credential/service availability checks

The RUN_INTEGRATION_TESTS env var was redundant since the marker handles
selection and the skip decorators already check for actual credentials.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: sync missing env vars from merge-tests to integration-tests

Add OPENAI_EMBEDDINGS_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
to python-integration-tests.yml to match python-merge-tests.yml.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: remove remaining RUN_INTEGRATION_TESTS from embedding tests and docs

Missed test_openai_embedding_client.py and vector-stores README in the
earlier cleanup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* set functions tests to 3.10

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-02-24 09:35:46 +00:00

acc49196c1

Python: feat(python): Add embedding abstractions and OpenAI implementation (Phase 1) (#4153)

* feat(python): Add embedding abstractions and OpenAI implementation (Phase 1)

This PR contains two parts:

1. **Overall migration plan** for porting vector stores and embeddings from
   Semantic Kernel to Agent Framework (docs/features/vector-stores-and-embeddings/README.md)
   covering all 10 phases from core abstractions through connectors and TextSearch.

2. **Phase 1 implementation** — core embedding abstractions and OpenAI/Azure OpenAI
   embedding clients:

   Core types (_types.py):
   - EmbeddingGenerationOptions TypedDict (total=False)
   - Embedding[EmbeddingT] generic class with model_id, dimensions, created_at
   - GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT] list container with options, usage
   - EmbeddingInputT (default str) and EmbeddingT (default list[float]) TypeVars

   Protocol + base class (_clients.py):
   - SupportsGetEmbeddings protocol — Generic[EmbeddingInputT, EmbeddingT, OptionsContraT]
   - BaseEmbeddingClient ABC — Generic[EmbeddingInputT, EmbeddingT, OptionsCoT]

   Telemetry (observability.py):
   - EmbeddingTelemetryLayer with gen_ai.operation.name = "embeddings"

   OpenAI implementation (openai/_embedding_client.py):
   - RawOpenAIEmbeddingClient, OpenAIEmbeddingClient, OpenAIEmbeddingOptions
   - Uses _ensure_client() factory pattern

   Azure OpenAI implementation (azure/_embedding_client.py):
   - AzureOpenAIEmbeddingClient following AzureOpenAIChatClient pattern
   - Supports API key, Entra ID credentials, env var configuration

   Tests:
   - 47 unit tests for types, protocol, base class, OpenAI, and Azure clients
   - 6 integration tests (gated behind RUN_INTEGRATION_TESTS + credentials)

   Samples:
   - samples/02-agents/embeddings/openai_embeddings.py
   - samples/02-agents/embeddings/azure_openai_embeddings.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Add AzureOpenAIEmbeddingClient to azure __init__.pyi stub

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ci: Add embedding env vars to Python integration tests

Map OPENAI_EMBEDDING_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
from GitHub vars to the integration test environment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Handle base64 encoding_format in OpenAI embedding client

When encoding_format='base64' is used, the OpenAI API returns base64-encoded
floats instead of a JSON array. Decode these automatically to list[float]
so the return type stays consistent regardless of encoding format.

Also adds a unit test for base64 decoding and fixes minor docstring/import issues.
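
A minimal sketch of the decoding step, assuming the base64 payload is a packed array of little-endian 32-bit floats. `decode_base64_embedding` is a hypothetical name, not the client's actual helper.

```python
import base64
import struct
from typing import List


def decode_base64_embedding(data: str) -> List[float]:
    """Decode a base64 string of packed float32 values into list[float]."""
    raw = base64.b64decode(data)
    count = len(raw) // 4  # 4 bytes per float32 (assumed wire format)
    return list(struct.unpack(f"<{count}f", raw))
```

Callers then get a `list[float]` whether or not `encoding_format='base64'` was requested.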

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Only record INPUT_TOKENS for embedding telemetry

Embeddings have no output/completion tokens. Remove OUTPUT_TOKENS recording
which was double-counting prompt_tokens via the total_tokens fallback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Resolve mypy variance error and lint warning

Use contravariant/covariant TypeVars for SupportsGetEmbeddings Protocol.
Combine nested if into single statement in telemetry layer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Make EmbeddingCoT invariant for mypy compatibility

GeneratedEmbeddings is invariant in its type param, so the Protocol
TypeVar cannot be covariant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Address PR review - empty values guard, service_url for telemetry

- Add early return for empty values in get_embeddings to avoid unnecessary API calls
- Add service_url() method to RawOpenAIEmbeddingClient for proper telemetry endpoint reporting
- Add test for empty values behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Python: Fix OpenAI chat client compatibility with third-party endpoints and OTel 0.4.14 (#4161)

* Fix system message content sent as list instead of string

Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages
when content is a list of content parts. This change flattens system and
developer message content to a plain string in the Chat Completions client.

Fixes https://github.com/microsoft/agent-framework/issues/1407

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14

Version 0.4.14 removed several LLM_* attributes from SpanAttributes
(LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS,
LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE).

Move these to the OtelAttr enum with their well-known gen_ai.* string values
and update all references in observability.py and tests.
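
One way to hold these attribute names locally is a string enum, sketched below. Only `REQUEST_MODEL` is named elsewhere in this log; the other member names are assumptions, and the real `OtelAttr` enum contains more entries than shown here.

```python
from enum import Enum


class OtelAttr(str, Enum):
    """Local copies of gen_ai.* attribute names (member names partly assumed)."""

    SYSTEM = "gen_ai.system"
    REQUEST_MODEL = "gen_ai.request.model"
    RESPONSE_MODEL = "gen_ai.response.model"
    REQUEST_MAX_TOKENS = "gen_ai.request.max_tokens"
    REQUEST_TEMPERATURE = "gen_ai.request.temperature"
    REQUEST_TOP_P = "gen_ai.request.top_p"
    TOKEN_TYPE = "gen_ai.token.type"
```

Defining the strings locally removes the dependency on constants that an upstream package may drop between releases.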

Fixes https://github.com/microsoft/agent-framework/issues/4160

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Flatten text-only message content to string for all roles

Extend the system/developer fix to all message roles. Text-only content
lists are now post-processed into plain strings, while multimodal content
(text + images/audio) remains as a list. This fixes compatibility with
OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry
Local's Neutron backend).
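
The post-processing step can be sketched as below. The content-part shape (`{"type": "text", "text": ...}`) follows the Chat Completions format; the function name and the newline separator are assumptions made for this example.

```python
from typing import Any, Dict, List, Union

Content = Union[str, List[Dict[str, Any]]]


def flatten_text_only(content: Content) -> Content:
    """Collapse a list of text-only parts into one string; keep multimodal lists."""
    if isinstance(content, str):
        return content
    if content and all(part.get("type") == "text" for part in content):
        # Separator choice is an assumption for this sketch.
        return "\n".join(part.get("text", "") for part in content)
    return content  # text + images/audio stays a list
```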

Partially fixes https://github.com/microsoft/agent-framework/issues/4084

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix streaming text lost when usage data in same chunk

Some providers (e.g. Gemini) include both usage data and text content
in the same streaming chunk. The early return on chunk.usage caused
text and tool call parsing to be skipped entirely. Remove the early
return and process usage alongside text/tool calls.
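
A simplified sketch of the fixed parsing flow. `Chunk` and `parse_chunk` are stand-ins invented for this example, not the client's real types.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Chunk:
    """Stand-in for a streaming chunk that may carry text, usage, or both."""

    text: Optional[str] = None
    usage: Optional[Dict[str, int]] = None


def parse_chunk(chunk: Chunk) -> List[Tuple[str, Any]]:
    contents: List[Tuple[str, Any]] = []
    if chunk.usage is not None:
        contents.append(("usage", chunk.usage))
        # No early return here: the same chunk may also contain text.
    if chunk.text:
        contents.append(("text", chunk.text))
    return contents
```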

Fixes https://github.com/microsoft/agent-framework/issues/3434

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix mypy errors in _chat_client.py

Rename shadowed variable 'args' in system/developer branch to 'sys_args'
and rename loop variable 'content' to 'msg_content' to avoid type conflict.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* reorder imports

* fix: Use OtelAttr.REQUEST_MODEL instead of removed SpanAttributes.LLM_REQUEST_MODEL

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: Add score_threshold to vector store plan

Reference SK .NET PR #13501 for score threshold filtering semantics.
Include score_threshold in SearchOptions from Phase 3.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: Add reference to roji's SK .NET MEVD work for SQL connectors

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Clear env vars in construction tests to avoid CI leakage

Tests for missing API key / model ID now use monkeypatch.delenv to ensure
env vars from the integration test environment don't prevent the expected
ValueError from being raised.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Eduard van Valkenburg · 2026-02-24 07:40:20 +00:00

6305e3e092

Updated GitHub action for manual integration tests (#4147)

* Updated merge test permissions

* Removed repo check

* Added fetch from main for comparison

* Updated path detection logic

* Small updates

* Reverted file rename

* Created dedicated workflows for integration tests

* Small fix for Python

* Small fixes

* Small update

* Small update

* Added tests check for Python

Dmytro Struk · 2026-02-23 15:37:06 +00:00

ba454552c5

Added new GitHub action for manual integration test run based on PR (#4135)

* Added new GitHub action for manual integration test run based on PR

* Addressed comments

* Added branch name as input

* Small improvements

Dmytro Struk · 2026-02-20 21:33:22 +00:00

75ff4f486f

Python: [BREAKING] Moved to a single get_response and run API (#3379)

* WIP

* big update to new ResponseStream model

* fixed tests and typing

* fixed tests and typing

* fixed tools typevar import

* fix

* mypy fix

* mypy fixes and some cleanup

* fix missing quoted names

* and client

* fix imports agui

* fix anthropic override

* fix agui

* fix ag ui

* fix import

* fix anthropic types

* fix mypy

* refactoring

* updated typing

* fix 3.11

* fixes

* redid layering of chat clients and agents

* redid layering of chat clients and agents

* Fix lint, type, and test issues after rebase

- Add @overload decorators to AgentProtocol.run() for type compatibility
- Add missing docstring params (middleware, function_invocation_configuration)
- Fix TODO format (TD002) by adding author tags
- Fix broken observability tests from upstream:
  - Replace non-existent use_instrumentation with direct instantiation
  - Replace non-existent use_agent_instrumentation with AgentTelemetryLayer mixin
  - Fix get_streaming_response to use get_response(stream=True)
  - Add AgentInitializationError import
  - Update streaming exception tests to match actual behavior

* Fix AgentExecutionException import error in test_agents.py

- Replace non-existent AgentExecutionException with AgentRunException

* Fix test import and asyncio deprecation issues

- Add 'tests' to pythonpath in ag-ui pyproject.toml for utils_test_ag_ui import
- Replace deprecated asyncio.get_event_loop().run_until_complete with asyncio.run

* Fix azure-ai test failures

- Update _prepare_options patching to use correct class path
- Fix test_to_azure_ai_agent_tools_web_search_missing_connection to clear env vars

* Convert ag-ui utils_test_ag_ui.py to conftest.py

- Move test utilities to conftest.py for proper pytest discovery
- Update all test imports to use conftest instead of utils_test_ag_ui
- Remove old utils_test_ag_ui.py file
- Revert pythonpath change in pyproject.toml

* fix: use relative imports for ag-ui test utilities

* fix agui

* Rename Bare*Client to Raw*Client and BaseChatClient

- Renamed BareChatClient to BaseChatClient (abstract base class)
- Renamed BareOpenAIChatClient to RawOpenAIChatClient
- Renamed BareOpenAIResponsesClient to RawOpenAIResponsesClient
- Renamed BareAzureAIClient to RawAzureAIClient
- Added warning docstrings to Raw* classes about layer ordering
- Updated README in samples/getting_started/agents/custom with layer docs
- Added test for span ordering with function calling

* Fix layer ordering: FunctionInvocationLayer before ChatTelemetryLayer

This ensures each inner LLM call gets its own telemetry span, resulting in
the correct span sequence: chat -> execute_tool -> chat

Updated all production clients and test mocks to use correct ordering:
- ChatMiddlewareLayer (first)
- FunctionInvocationLayer (second)
- ChatTelemetryLayer (third)
- BaseChatClient/Raw...Client (fourth)
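
Why this ordering yields `chat -> execute_tool -> chat` can be seen in a toy model of the layers. The classes below reuse the layer names but are simplified stand-ins written for this example; the real layers have different signatures.

```python
from typing import Dict, List

SPANS: List[str] = []  # records span names in the order they are opened


class RawClient:
    """Stand-in model: first call requests a tool, second call answers."""

    def __init__(self) -> None:
        self._calls = 0

    def get_response(self, prompt: str) -> Dict[str, str]:
        self._calls += 1
        return {"tool_call": "get_weather"} if self._calls == 1 else {"text": "sunny"}


class ChatTelemetryLayer:
    def get_response(self, prompt: str) -> Dict[str, str]:
        SPANS.append("chat")  # one span per inner model call
        return super().get_response(prompt)


class FunctionInvocationLayer:
    def get_response(self, prompt: str) -> Dict[str, str]:
        response = super().get_response(prompt)
        while "tool_call" in response:
            SPANS.append("execute_tool")
            response = super().get_response(prompt)  # goes through telemetry again
        return response


class ChatMiddlewareLayer:
    def get_response(self, prompt: str) -> Dict[str, str]:
        return super().get_response(prompt)


class LayeredClient(ChatMiddlewareLayer, FunctionInvocationLayer, ChatTelemetryLayer, RawClient):
    """MRO matches the documented order: middleware, functions, telemetry, raw."""
```

Because the function loop sits outside the telemetry layer, every model call it makes passes through `ChatTelemetryLayer` and gets its own `chat` entry.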

* Remove run_stream usage

* Fix conversation_id propagation

* Python: Add BaseAgent implementation for Claude Agent SDK (#3509)

* Added ClaudeAgent implementation

* Updated streaming logic

* Small updates

* Small update

* Fixes

* Small fix

* Naming improvements

* Updated imports

* Addressed comments

* Updated package versions

* Update Claude agent connector layering

* fix test and plugin

* Store function middleware in invocation layer

* Fix telemetry streaming and ag-ui tests

* Remove legacy ag-ui tests folder

* updates

* Remove terminate flag from FunctionInvocationContext, use MiddlewareTermination instead

- Remove terminate attribute from FunctionInvocationContext
- Add result attribute to MiddlewareTermination to carry function results
- FunctionMiddlewarePipeline.execute() now lets MiddlewareTermination propagate
- _auto_invoke_function captures context.result in exception before re-raising
- _try_execute_function_calls catches MiddlewareTermination and sets should_terminate
- Fix handoff middleware to append to chat_client.function_middleware directly
- Update tests to use raise MiddlewareTermination instead of context.terminate
- Add middleware flow documentation in samples/concepts/tools/README.md
- Fix ag-ui to use FunctionMiddlewarePipeline instead of removed create_function_middleware_pipeline
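
A reduced sketch of the exception-based flow. The `MiddlewareTermination` class here is a local stand-in with the `result` attribute described above; `blocking_middleware` and `try_execute` are hypothetical names, and the context is simplified to a dict.

```python
from typing import Any, Callable, Dict, Tuple


class MiddlewareTermination(Exception):
    """Stand-in: raised by middleware to stop the loop, optionally with a result."""

    def __init__(self, message: str = "", result: Any = None) -> None:
        super().__init__(message)
        self.result = result


def blocking_middleware(context: Dict[str, Any], call_next: Callable[[Dict[str, Any]], Any]) -> Any:
    if context["function"] == "delete_all":
        # Previously this would have set a terminate flag on the context.
        raise MiddlewareTermination("blocked", result="refused")
    return call_next(context)


def try_execute(context: Dict[str, Any]) -> Tuple[Any, bool]:
    """Return (result, should_terminate)."""
    try:
        return blocking_middleware(context, lambda ctx: f"ran {ctx['function']}"), False
    except MiddlewareTermination as exc:
        return exc.result, True
```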

* fix: remove references to removed terminate flag in purview tests, add type ignore

* fix: move _test_utils.py from package to test folder

* fix: call get_final_response() to trigger context provider notification in streaming test

* fix: correct broken links in tools README

* docs: clarify default middleware behavior in summary table

* fix: ensure inner stream result hooks are called when using map()/from_awaitable()

* Fix mypy type errors

* Address PR review comments on observability.py

- Remove TODO comment about unconsumed streams, add explanatory note instead
- Remove redundant _close_span cleanup hook (already called in _finalize_stream)
- Clarify behavior: cleanup hooks run after stream iteration, if stream is not
  consumed the span remains open until garbage collected

* Remove gen_ai.client.operation.duration from span attributes

Duration is a metrics-only attribute per OpenTelemetry semantic conventions.
It should be recorded to the histogram but not set as a span attribute.

* Remove duration from _get_response_attributes, pass directly to _capture_response

Duration is a metrics-only attribute. It's now passed directly to _capture_response
instead of being included in the attributes dict that gets set on the span.

* Remove redundant _close_span cleanup hook in AgentTelemetryLayer

_finalize_stream already calls _close_span() in its finally block,
so adding it as a separate cleanup hook is redundant.

* Use weakref.finalize to close span when stream is garbage collected

If a user creates a streaming response but never consumes it, the cleanup
hooks won't run. Now we register a weak reference finalizer that will close
the span when the stream object is garbage collected, ensuring spans don't
leak in this scenario.
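
The mechanism can be sketched with the standard library alone. `Span`, `ResponseStream`, and `close_span_on_gc` below are simplified stand-ins for this example; the real telemetry layer works with OpenTelemetry spans.

```python
import gc
import weakref


class Span:
    """Stand-in span that only tracks whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    def end(self) -> None:
        self.closed = True


class ResponseStream:
    """Stand-in for a streaming response that may never be consumed."""


def close_span_on_gc(stream: ResponseStream, span: Span) -> None:
    # The callback must not reference `stream`, or the stream could never be collected.
    weakref.finalize(stream, span.end)
```

Once the last reference to the stream is dropped, the finalizer runs and the span is closed even though no cleanup hook ever executed.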

* Fix _get_finalizers_from_stream to use _result_hooks attribute

Renamed function to _get_result_hooks_from_stream and fixed it to
look for the _result_hooks attribute which is the correct name in
ResponseStream class.

* Add missing asyncio import in test_request_info_mixin.py

* Fix leftover merge conflict marker in image_generation sample

* Update integration tests

* Fix integration tests: increase max_iterations from 1 to 2

Tests with tool_choice options require at least 2 iterations:
1. First iteration to get function call and execute the tool
2. Second iteration to get the final text response

With max_iterations=1, streaming tests would return early with only
the function call/result but no final text content.

* Fix duplicate function call error in conversation-based APIs

When using conversation_id (for Responses/Assistants APIs), the server
already has the function call message from the previous response. We
should only send the new function result message, not all messages
including the function call which would cause a duplicate ID error.

Fix: When conversation_id is set, only send the last message (the tool
result) instead of all response.messages.
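
In sketch form, with a hypothetical helper name and messages reduced to plain dicts:

```python
from typing import Any, Dict, List, Optional


def messages_to_send(
    response_messages: List[Dict[str, Any]], conversation_id: Optional[str]
) -> List[Dict[str, Any]]:
    """Pick which messages go into the next request of the tool loop."""
    if conversation_id is not None:
        # The service already stores the function call; send only the tool result.
        return response_messages[-1:]
    return list(response_messages)
```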

* Add regression test for conversation_id propagation between tool iterations

Port test from PR #3664 with updates for new streaming API pattern.
Tests that conversation_id is properly updated in options dict during
function invocation loop iterations.

* Fix tool_choice=required to return after tool execution

When tool_choice is 'required', the user's intent is to force exactly one
tool call. After the tool executes, return immediately with the function
call and result - don't continue to call the model again.

This fixes integration tests that were failing with empty text responses
because with tool_choice=required, the model would keep returning function
calls instead of text.

Also adds regression tests for:
- conversation_id propagation between tool iterations (from PR #3664)
- tool_choice=required returns after tool execution
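
The early return can be illustrated with a stripped-down loop. `run_tool_loop` and the dict-based message shapes are assumptions made for this sketch; the real function invocation loop is more involved.

```python
from typing import Any, Callable, Dict, List, Optional

Message = Dict[str, Any]


def run_tool_loop(
    model: Callable[[List[Message]], Message],
    tools: Dict[str, Callable[..., Any]],
    tool_choice: Optional[str] = None,
    max_iterations: int = 5,
) -> List[Message]:
    transcript: List[Message] = []
    for _ in range(max_iterations):
        reply = model(transcript)
        transcript.append(reply)
        if reply["type"] != "function_call":
            break  # plain text answer: done
        result = tools[reply["name"]](**reply["arguments"])
        transcript.append({"type": "function_result", "result": result})
        if tool_choice == "required":
            break  # a forced tool call would repeat forever, so stop here
    return transcript
```

With `tool_choice="required"` the transcript ends with the function call and its result; with `auto` the loop calls the model again to obtain the final text.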

* Document tool_choice behavior in tools README

- Add table explaining tool_choice values (auto, none, required)
- Explain why tool_choice=required returns immediately after tool execution
- Add code example showing the difference between required and auto
- Update flow diagram to show the early return path for tool_choice=required

* Fix tool_choice=None behavior - don't default to 'auto'

Remove the hardcoded default of 'auto' for tool_choice in ChatAgent init.
When tool_choice is not specified (None), it will now not be sent to the
API, allowing the API's default behavior to be used.

Users who want tool_choice='auto' can still explicitly set it either in
default_options or at runtime.
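
A small sketch of the resulting option merging. `merge_options` is a hypothetical helper; only the "unset means not sent" rule is taken from the text above.

```python
from typing import Any, Dict, Optional


def merge_options(
    default_options: Optional[Dict[str, Any]] = None,
    runtime_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Combine defaults with runtime options, dropping unset (None) values."""
    merged = {**(default_options or {}), **(runtime_options or {})}
    # No hardcoded 'auto': an unset tool_choice is simply left out of the request.
    return {key: value for key, value in merged.items() if value is not None}
```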

Fixes #3585

* Fix tool_choice=none should not remove tools

In OpenAI Assistants client, tools were not being sent when
tool_choice='none'. This was incorrect - tool_choice='none' means
the model won't call tools, but tools should still be available
in the request (they may be used later in the conversation).

Fixes #3585

* Add test for tool_choice=none preserving tools

Adds a regression test to ensure that when tool_choice='none' is set but
tools are provided, the tools are still sent to the API. This verifies
the fix for #3585.

* Fix tool_choice=none should not remove tools in all clients

Apply the same fix to OpenAI Responses client and Azure AI client:
- OpenAI Responses: Remove else block that popped tool_choice/parallel_tool_calls
- Azure AI: Remove tool_choice != 'none' check when adding tools

When tool_choice='none', the model won't call tools, but tools should
still be sent to the API so they're available for future turns.

Also update README to clarify tool_choice=required supports multiple tools.

Fixes #3585

* Keep tool_choice even when tools is None

Move tool_choice processing outside of the 'if tools' block in OpenAI
Responses client so tool_choice is sent to the API even when no tools
are provided.

* Update test to match new parallel_tool_calls behavior

Changed test_prepare_options_removes_parallel_tool_calls_when_no_tools to
test_prepare_options_preserves_parallel_tool_calls_when_no_tools to reflect
that parallel_tool_calls is now preserved even when no tools are present,
consistent with the tool_choice behavior.

* Fix ChatMessage API and Role enum usage after rebase

- Update ChatMessage instantiation to use keyword args (role=, text=, contents=)
- Fix Role enum comparisons to use .value for string comparison
- Add created_at to AgentResponse in error handling
- Fix AgentResponse.from_updates -> from_agent_run_response_updates
- Fix DurableAgentStateMessage.from_chat_message to convert Role enum to string
- Add Role import where needed

* Fix additional ChatMessage API and method name changes

- Fix ChatMessage usage in workflow files (use text= instead of contents= for strings)
- Fix AgentResponse.from_updates -> from_agent_run_response_updates in workflow files
- Fix test files for ChatMessage and Role enum usage

* Fix remaining ChatMessage API usage in test files

* Fix more ChatMessage and Role API changes in source and test files

- Fix ChatMessage in _magentic.py replan method
- Fix Role enum comparison in test assertions
- Fix remaining test files with old ChatMessage syntax

* Fix ChatMessage and Role API changes across packages

- Add Role import where missing
- Fix ChatMessage signature: positional args to keyword args (role=, text=, contents=)
- Fix Role enum comparisons: .role.value instead of .role string
- Fix FinishReason enum usage in ag-ui event converters
- Rename AgentResponse.from_updates to from_agent_run_response_updates in ag-ui

Fixes API compatibility after Types API Review improvements merge

* Fix ChatMessage and Role API changes in github_copilot tests

* Fix ChatMessage and Role API changes in redis and github_copilot packages

- Fix redis provider: Role enum comparison using .value
- Fix redis tests: ChatMessage signature and Role comparisons
- Fix github_copilot tests: ChatMessage signature and Role comparisons
- Update docstring examples in redis chat message store

* Fix ChatMessage and Role API changes in devui package

- Fix executor: ChatMessage signature change
- Fix conversations: Role enum to string conversion in two places
- Fix tests: ChatMessage signatures and Role comparisons

* Fix ChatMessage and Role API changes in a2a and lab packages

- Fix a2a tests: Role comparisons and ChatMessage signatures
- Fix lab tau2 source: Role enum comparison in flip_messages, log_messages, sliding_window
- Fix lab tau2 tests: ChatMessage signatures and Role comparisons

* Remove duplicate test files from ag-ui/tests (tests are in ag_ui_tests)

* Fix ChatMessage and Role API changes across packages

After rebasing on upstream/main which merged PR #3647 (Types API Review
improvements), fix all packages to use the new API:

- ChatMessage: Use keyword args (role=, text=, contents=) instead of
  positional args
- Role: Compare using .value attribute since it's now an enum

Packages fixed:
- ag-ui: Fixed Role value extraction bugs in _message_adapters.py
- anthropic: Fixed ChatMessage and Role comparisons in tests
- azure-ai: Fixed Role comparison in _client.py
- azure-ai-search: Fixed ChatMessage and Role in source/tests
- bedrock: Fixed ChatMessage signatures in tests
- chatkit: Fixed ChatMessage and Role in source/tests
- copilotstudio: Fixed ChatMessage and Role in tests
- declarative: Fixed ChatMessage in _executors_agents.py
- mem0: Fixed ChatMessage and Role in source/tests
- purview: Fixed ChatMessage in source/tests
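
As a rough illustration of the two changes listed above, the sketch below uses hypothetical stand-in classes (`SketchRole` and `SketchMessage` are not the framework's types) to show a keyword-only constructor and a role comparison through `.value`. The real `ChatMessage` and `Role` signatures may differ.

```python
from enum import Enum


class SketchRole(Enum):
    """Hypothetical stand-in for an enum-based role type."""

    USER = "user"
    ASSISTANT = "assistant"


class SketchMessage:
    """Hypothetical stand-in for a message type that only accepts keyword arguments."""

    def __init__(self, *, role, text="", contents=None):
        self.role = role
        self.text = text
        self.contents = list(contents) if contents is not None else []


def is_user_message(message):
    # Compare on the enum's .value, not on a raw string held in .role.
    return message.role.value == "user"


message = SketchMessage(role=SketchRole.USER, text="hello")
```

Calling `SketchMessage(SketchRole.USER, "hello")` with positional arguments raises `TypeError`, which is the kind of failure the fixes above address.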

* Fix mypy errors for ChatMessage and Role API changes

- durabletask: Use str() fallback in role value extraction
- core: Fix ChatMessage in _orchestrator_helpers.py to use keyword args
- core: Add type ignore for _conversation_state.py contents deserialization
- ag-ui: Fix type ignore comments (call-overload instead of arg-type)
- azure-ai-search: Fix get_role_value type hint to accept Any
- lab: Move get_role_value to module level with Any type hint
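
A minimal sketch of what a role-value helper with an `Any` type hint and a `str()` fallback could look like. The body is an assumption based only on the bullet descriptions above, and `SketchRole` is a hypothetical stand-in for the framework's role type.

```python
from enum import Enum
from typing import Any


class SketchRole(Enum):
    """Hypothetical stand-in for an enum-based role type."""

    USER = "user"


def get_role_value(role: Any) -> str:
    """Return the string form of a role given as an enum member or a plain string."""
    value = getattr(role, "value", None)
    # Fall back to str() when the role has no string .value (for example a plain str).
    return value if isinstance(value, str) else str(role)
```

Accepting `Any` lets the same helper serve callers that still pass plain strings as well as callers that pass enum members.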

* Improve CI test timeout configuration

- Increase job timeout from 10 to 15 minutes
- Reduce per-test timeout to 60s (was 900s/300s)
- Add --timeout_method thread for better timeout handling
- Add --timeout-verbose to see which tests are slow
- Reduce retries from 3 to 2 and delay from 10s to 5s

This ensures individual test timeouts are shorter than the job
timeout, providing better visibility when tests hang.

With a 60s timeout and 2 retries, the worst case per test is ~180s (three attempts of 60s each, not counting the retry delays).
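
The arithmetic behind that estimate can be checked with a small helper. This is a sketch that assumes "retries" means re-runs after the first attempt; the function name is made up for illustration.

```python
def worst_case_seconds(timeout_s: int, retries: int, delay_s: int = 0) -> int:
    """Upper bound on wall-clock time for one test that times out on every attempt."""
    attempts = 1 + retries  # the first run plus each retry
    return attempts * timeout_s + retries * delay_s


without_delay = worst_case_seconds(60, 2)            # 3 * 60 = 180
with_delay = worst_case_seconds(60, 2, delay_s=5)    # 180 + 2 * 5 = 190
job_timeout_s = 15 * 60                              # 900
```

Either way a single hanging test stays well under the 15-minute job timeout.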

* Fix ChatMessage API usage in docstrings and source

- Fix ChatMessage positional args in docstrings: _serialization.py, _threads.py, _middleware.py
- Fix ChatMessage in tau2 runner.py
- Fix role comparison in _orchestrator_helpers.py to use .value
- Fix role comparison in _group_chat.py docstring example
- Fix role assertions in test_durable_entities.py to use .value

* Revert tool_choice/parallel_tool_calls changes - must be removed when no tools

OpenAI API requires tool_choice and parallel_tool_calls to only be
present when tools are specified. Restored the logic that removes
these options when there are no tools.

- Restored check in _chat_client.py to remove tool_choice and
  parallel_tool_calls when no tools present
- Restored same logic in _responses_client.py
- Reverted test to expect the correct behavior
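
The restored behavior can be sketched as a small filter over request options. This is an illustration under assumptions: the options are modeled as a plain dict, and `strip_tool_options` is a hypothetical name, not the function in `_chat_client.py` or `_responses_client.py`.

```python
from typing import Any


def strip_tool_options(options: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the options without tool-only settings when no tools are set."""
    cleaned = dict(options)
    if not cleaned.get("tools"):
        # These two settings are only valid alongside a non-empty tools list.
        cleaned.pop("tool_choice", None)
        cleaned.pop("parallel_tool_calls", None)
    return cleaned
```

When `tools` is present and non-empty, the options pass through unchanged.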

* fixed issue in tests

* fix: resolve merge conflict markers in ag-ui tests

* fix: restructure ag-ui tests and fix Role/FinishReason to use string types

* fix: streaming function invocation and middleware termination

- Refactor streaming function invocation to use get_final_response() on inner streams
- Fix MiddlewareTermination to accept result parameter for passing results
- Fix _AutoHandoffMiddleware to use MiddlewareTermination instead of context.terminate
- Fix AgentMiddlewareLayer.run() to properly forward function/chat middleware
- Remove duplicate middleware registration in AgentMiddlewareLayer.__init__
- Fix exception handling in _auto_invoke_function to properly capture termination
- Fix mypy errors in core package
- Update tests to use stream=True parameter for unified run API
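
One way to picture a termination signal that carries a result is an exception with a `result` attribute. The sketch below is hypothetical: `SketchTermination` and `run_pipeline` are illustrative names, not the framework's `MiddlewareTermination` or its middleware pipeline.

```python
class SketchTermination(Exception):
    """Hypothetical termination signal that carries a result back to the caller."""

    def __init__(self, message="terminated", *, result=None):
        super().__init__(message)
        self.result = result


def run_pipeline(handlers, value):
    """Apply handlers in order; stop early if one raises SketchTermination."""
    try:
        for handler in handlers:
            value = handler(value)
    except SketchTermination as termination:
        # The terminating handler decides what the pipeline returns.
        return termination.result
    return value


def stop_with_double(value):
    raise SketchTermination(result=value * 2)
```

Here the handlers after `stop_with_double` never run, and the caller receives the result attached to the exception.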

* fix all tests command

* Refactor integration tests to use pytest fixtures

- Merge testutils.py into conftest.py for azurefunctions integration tests
- Merge dt_testutils.py into conftest.py for durabletask integration tests
- Convert all integration tests to use fixtures instead of direct imports
  (fixes ModuleNotFoundError with --import-mode=importlib)
- Add sample_helper fixture for azurefunctions tests
- Add agent_client_factory and orchestration_helper fixtures for durabletask
- Integration tests now skip with descriptive messages when services unavailable
- Restructure devui tests into tests/devui/ with proper conftest.py
- Add test organization guidelines to CODING_STANDARD.md
- Remove __init__.py from test directories per pytest best practices

* Fix pytest_collection_modifyitems to only skip integration tests

The hook was skipping all tests in the test session, not just
integration tests. Now it only skips items in the integration_tests
directory.
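
A minimal sketch of a collection hook scoped this way, assuming the directory is literally named `integration_tests`. The helper name and skip reason are illustrative, and the condition for deciding when to skip (for example, unavailable services) is omitted.

```python
from pathlib import Path


def is_integration_test(path: Path) -> bool:
    """True when the test file lives under an integration_tests directory."""
    return "integration_tests" in path.parts


def pytest_collection_modifyitems(config, items):
    # Imported lazily so the helper above can be used without pytest installed.
    import pytest

    skip_marker = pytest.mark.skip(reason="integration services unavailable")
    for item in items:
        # Only mark items under integration_tests; leave every other test alone.
        if is_integration_test(Path(str(item.fspath))):
            item.add_marker(skip_marker)
```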

* Fix mem0 tests failing on Python 3.13

Use patch.object on the imported module instead of @patch with string
path to ensure the mock takes effect regardless of import timing.
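
The pattern can be sketched with the standard library alone. Here `json` stands in for the module being mocked and `serialize` for the code under test. Both are assumptions for illustration, not the mem0 test code.

```python
import json
from unittest.mock import patch


def serialize(payload):
    # Looks up json.dumps on the module object at call time.
    return json.dumps(payload)


def run_with_mock():
    # Patch the attribute on the already-imported module object, not a dotted string path.
    with patch.object(json, "dumps", return_value="mocked") as mock_dumps:
        result = serialize({"a": 1})
    return result, mock_dumps.call_count
```

Because the patch targets the exact module object the code references, it applies regardless of when that module was first imported, and the original attribute is restored when the `with` block exits.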

* fix mem0

* another attempt for mem0

* fix for mem0

* fix mem0

* Increase worker initialization wait time in durabletask tests

Increase from 2 to 8 seconds to allow time for:
- Python startup and module imports
- Azure OpenAI client creation
- Agent registration with DTS worker
- Worker connection to DTS

This helps prevent test failures in CI where the first tests may run
before the worker is fully ready to process requests.

* Fix streaming test to use ResponseStream with finalizer

The _consume_stream method now expects a ResponseStream that can provide
a final AgentResponse via get_final_response(). Update the test to use
ResponseStream with AgentResponse.from_updates as the finalizer.
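
To illustrate the idea of a stream paired with a finalizer, here is a hypothetical stand-in. `SketchStream` is not the framework's `ResponseStream`, and its constructor signature is an assumption. Only the method name `get_final_response()` is taken from the description above.

```python
import asyncio
from typing import AsyncIterator, Callable, Generic, TypeVar

U = TypeVar("U")
R = TypeVar("R")


class SketchStream(Generic[U, R]):
    """Hypothetical stream that records updates and builds a final response from them."""

    def __init__(self, source: AsyncIterator[U], finalizer: Callable[[list[U]], R]):
        self._source = source
        self._finalizer = finalizer
        self._updates: list[U] = []
        self._exhausted = False

    def __aiter__(self):
        return self

    async def __anext__(self) -> U:
        try:
            update = await self._source.__anext__()
        except StopAsyncIteration:
            self._exhausted = True
            raise
        self._updates.append(update)
        return update

    async def get_final_response(self) -> R:
        # Drain whatever has not been consumed yet, then combine all updates.
        if not self._exhausted:
            async for _ in self:
                pass
        return self._finalizer(self._updates)


async def sketch_updates():
    for chunk in ["Hel", "lo"]:
        yield chunk


final = asyncio.run(SketchStream(sketch_updates(), "".join).get_final_response())
```

In this sketch the finalizer plays the role that `AgentResponse.from_updates` plays in the test described above: it turns the collected updates into a single response.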

* Fix MockToolCallingAgent to use new ResponseStream API and update samples

* small updates to run_stream to run

* fix sub workflow

* temp fix for az func test

---------

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>

Eduard van Valkenburg · 2026-02-05 20:09:58 +00:00

3dc59c83b5

Python: [BREAKING] changed AIFunction to FunctionTool and @ai_function to @tool (#3413 )

* changed AIFunction to FunctionTool and @ai_function to @tool

* test and mypy fixes

* mypy fix

* switch function tool to always_require

* fix noop

* fix github copilot imports

* test fixes

* fix ollama test

* fixes for tests

* fix tests

* reverted change to always_require and extended timeout

* fix test

Eduard van Valkenburg · 2026-01-28 14:53:53 +00:00

a7d924a7d2

Python: [BREAKING]: Introducing Options as TypedDict and Generic (#3140 )

* WIP typeddict for options

* updated all clients and ChatAgents

* updated everything

* added ADR

* fix mypy

* proper typevar imports

* fixed import

* fixed other imports

* slight update in the sample

* updated from feedback

* fixes

* fixed missing covariants and test fixes

* fixed typing

* updated anthropic thinking config

* ruff fixes

* fixed int tests

* fix tests and mypy

* updated integration tests

* updated docstring and test fix

* improved options handling in obser

* mypy fix

* updated a host of integration tests

* fix tests

* bedrock fix
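
As a rough picture of what "Options as TypedDict and Generic" can mean, the sketch below defines a base options `TypedDict`, a provider-specific extension, and a client class that is generic over its options type. All names here are hypothetical and are not the framework's actual option types or clients.

```python
from typing import Generic, TypedDict, TypeVar


class SketchChatOptions(TypedDict, total=False):
    """Hypothetical base options shared by every client."""

    temperature: float
    max_tokens: int


class SketchProviderOptions(SketchChatOptions, total=False):
    """Hypothetical provider-specific options layered on the base set."""

    thinking_budget: int


TOptions = TypeVar("TOptions", bound=SketchChatOptions)


class SketchClient(Generic[TOptions]):
    """Hypothetical client parameterized by the options type it accepts."""

    def __init__(self, defaults: TOptions) -> None:
        self._defaults = defaults

    def merged_options(self, overrides: TOptions) -> dict:
        # Per-call overrides win over the client's defaults.
        return {**self._defaults, **overrides}


client = SketchClient[SketchProviderOptions]({"temperature": 0.2})
merged = client.merged_options({"max_tokens": 100, "thinking_budget": 512})
```

With this shape a type checker can flag an option key that the chosen provider does not define, while at runtime the options remain ordinary dictionaries.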

Eduard van Valkenburg · 2026-01-13 16:41:05 +00:00

3e97425245

Python: latency improvements (#3014 )

* latency improvements

* fixed mypy, added coding standards and instructions

* slight logic improvement

Eduard van Valkenburg · 2025-12-23 16:04:34 +00:00

a32702cf38

Bump actions/checkout from 5 to 6 (#2404 )

Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

dependabot[bot] · 2025-12-11 18:43:52 +00:00

16230d3b20

.NET: Python: Azure Functions feature branch (#1916 )

* Python: Add Scaffolding for Durable AzureFunctions package to Agent Framework (#1823)

* Add scaffolding

* update readme

* add code owners and label

* update owners

* .NET: Durable extension: initial src and unit tests (#1900)

* Python: Add Durable Agent Wrapper code (#1913)

* add initial changes

* Move code and add single sample

* Update logger

* Remove unused code

* address PR comments

* cleanup code and address comments

---------

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>

* Azure Functions .NET samples (#1939)

* Python: Add Unit tests for Azurefunctions package (#1976)

* Add Unit tests for Azurefunctions

* remove duplicate import

* .NET: [Feature Branch] Migrate state schema updates and support for agents as MCP tools (#1979)

* Python: Add more samples for Azure Functions (#1980)

* Move all samples

* fix comments

* remove dead lines

* Make samples simpler

* .NET: [Feature Branch] Durable Task extension integration tests (#2017)

* .NET: [Feature Branch] Update OpenAI config for integration tests (#2063)

* Python: Add Integration tests for AzureFunctions (#2020)

* Add Integration tests

* Remove DTS extension

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add pyi file for type safety

* Add samples in readme

* Updated all readme instructions

* Address comments

* Update readmes

* Fix requirements

* Address comments

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* .NET: [Feature Branch] Update dotnet-build-and-test.yml to support integration tests (#2070)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix DTS startup issue and improve logging (#2103)

* .NET: [Feature Branch] Introduce Azure OpenAI config for .NET pipeline (#2106)

Also fixes an issue where we were trying to start docker containers for integration tests on Windows, which doesn't work.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix uv.lock after merge

* Python: Add README for Azure Functions samples setup (#2100)

* Add README for Azure Functions samples setup

Added setup instructions for Azure Functions samples, including environment setup, virtual environment creation, and running samples.

* Update python/samples/getting_started/azure_functions/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Laveesh Rohra <larohra@microsoft.com>

* Fix or remove broken markdown file links (#2115)

* .NET: [Feature Branch] Update HTTP API to be consistent across languages (#2118)

* Python: Fix AzureFunctions Integration Tests (#2116)

* Add Identity Auth to samples

* Update python/samples/getting_started/azure_functions/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/samples/getting_started/azure_functions/01_single_agent/function_app.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/samples/getting_started/azure_functions/02_multi_agent/function_app.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/samples/getting_started/azure_functions/06_multi_agent_orchestration_conditionals/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Python: Fix Http Schema (#2112)

* Rename to threadid

* Respond in plain text

* Make snake-case

* Add http prefix

* rename to wait-for-response

* Add query param check

* address comments

* .NET: Remove IsPackable=false in preparation for nuget release (#2142)

* Python: Move `azurefunctions` to `azure` for import (#2141)

* Move import to Azure

* fix mypy

* Update python/packages/azurefunctions/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add missing types

* Address comments

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/packages/azurefunctions/pyproject.toml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update python/packages/azurefunctions/agent_framework_azurefunctions/__init__.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix imports

* Address PR feedback from westey-m (#2150)

- Adds a link from the /dotnet/samples/README.md to /dotnet/samples/AzureFunctions
- Make DurableAgentThread deserialization internal for future-proofing
- Update JSON serialization logic to address recently discovered issues with source generator serialization

* Address comments (#2160)

---------

Co-authored-by: Laveesh Rohra <larohra@microsoft.com>
Co-authored-by: Chris Gillum <cgillum@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Anirudh Garg <anirudhg@microsoft.com>

Dmytro Struk · 2025-11-13 02:00:53 +00:00

67a8147151

Python: Introducing the Anthropic Client (#1819 )

* initial version of anthropic connector

* updated implementation and added tests

* fix type and readme

* mypy fix and int tests enabled

* add integration test setup

* updated based on comments

* improved function result handling

* added extra unordered test

* updated from review

* fix tool choice handling

* same fix for chat client

Eduard van Valkenburg · 2025-11-03 19:32:28 +00:00

12d17acdc0

Python: Updated merge test jobs (#1578 )

* Updated merge test jobs

* Small fix

Dmytro Struk · 2025-10-20 16:22:46 +00:00

9c3f52566f

Python: [BREAKING] Main to core (#983 )

* removed pydantic from types

* fix assistants client

* Remove Pydantic usage from workflow code.

* updated lock and test fixes

* moved main to core, and setup meta package

* updated versions

* updated lock

* fixed agents dependency

* added retry to merge tests

---------

Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>

Eduard van Valkenburg · 2025-09-30 07:18:36 +00:00

35d2d9fe7f

Python: [Breaking] removed pydantic from types and workflows (#917 )

* removed pydantic from types

* fix test

* fix test

* fix tests

* fix assistants client

* Remove Pydantic usage from workflow code.

* updated pydantic removal

* updated lock and test fixes

* fix mypy

* updated build system

* updated chat client parsing

* fix broken test

---------

Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>

Eduard van Valkenburg · 2025-09-29 21:19:58 +00:00

b4ebafa9b1

Python: [BREAKING] updated structure and samples (#875 )

* updated structure and samples

* updated names and removed cross tests

* updated projects etc

* updated tests

* updated test

* test fixes

* removed devui for now

* updated all-tests task

* removed old style configs

* remove coverage from tests

* updated to unit tests with all-tests

* updated foundry everywhere

* fix azure ai tests

* fix merge tests

* fix mypy

Eduard van Valkenburg · 2025-09-25 07:02:53 +00:00

9355329dfd

Python: Add tau2 benchmark integration with comprehensive testing and documentation (#817 )

* first commit to tau2-bench

* tau2-bench agent

* tau2 agent

* add condition

* checkpoint

* bug fix

* add tests

* fix tests

* add comments

* add comments

* minor fix

* fix

* batch test script

* .

* init.bak -> init.py

* fix mypy

* update readme

* fix env

* remove temp files

* setup tests

* fix gaia tasks

* fix tau2 tests

* fix coverage

* fix default version

* update cookiecutter template

---------

Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>

Yuge Zhang · 2025-09-21 23:08:45 +00:00

205cd700c8

Bump actions/github-script from 7 to 8 (#743 )

Bumps [actions/github-script](https://github.com/actions/github-script) from 7 to 8.
- [Release notes](https://github.com/actions/github-script/releases)
- [Commits](https://github.com/actions/github-script/compare/v7...v8)

---
updated-dependencies:
- dependency-name: actions/github-script
  dependency-version: '8'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

dependabot[bot] · 2025-09-17 03:46:13 +00:00

2de6855bf6

Python: api doc generation setup (#342 )

* api doc generation setup

* remove old log file

* improved check md function

* update with sample code in docstring

* updated script

* docs update

* docs update and action

* removed all-extras

* fixed sync command

* moved install

* moved action

* renamed folder

* fixed syntax

* add python path

* fix mypy and reused steps

* updated merge test

* undo change

* slight update in poe commands

* dev setup update

* updated uvlock

Eduard van Valkenburg · 2025-09-16 10:02:53 +00:00

65dd48aa1d

Bump actions/checkout from 4 to 5 (#435 )

Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

dependabot[bot] · 2025-09-12 12:48:53 +00:00

6e4cb5e183

Python: Introducing UserInputRequest and Response types and HostedMcpTool (#405 )

* initial work on User Approval (and hosted mcp to validate)

* small update to the comments in the sample

* enable local MCP tools in chatClient get methods

* working streaming and improved setup

* fix for pyright

* updated create_approval -> create_response method

* added tests

* updated HostedMcpTool and addressed feedback

* update type name

* naming updates

* small docstring update

* mypy fix

* fixes and updates

* fixes for responses

* fix int tests

* removed broken tests

* updated test running

* removed specific content check on websearch

* increased timeout

* split slow foundry test

* don't parallel run samples

* add dist load to unit tests

---------

Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>

Eduard van Valkenburg · 2025-09-10 13:37:34 +00:00

6aa746d891

Python: Samples Integration Tests (#615 )

* Samples Tests

* small fixes

* job fix

* telemetry dependency fix

* job error fix

* sorting provider specific tests

* telemetry fixes

* openai file search fix

---------

Co-authored-by: Giles Odigwe <gilesodigwe@microsoft.com>

Giles Odigwe · 2025-09-08 23:45:51 +00:00

ee56314a26

Python: Web search file search tools (#395 )

* Add web and file search tools

* add tests

* PR comments

* Add tools support for chat and assistants clients

* fix code checks

* add tests for assistants client

* Add samples

* fix fn descriptions

* Add openai responses model id to environment variables

---------

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>

peterychang · 2025-08-15 19:35:23 +00:00

0410f51777

Python: Foundry Chat Client unit tests to improve coverage (#423 )

* Add unit tests for Foundry Chat Client to improve coverage

* Updated Azure OpenAI endpoint and tests timeout

* Error fixes for Foundry Chat Client tests

* Error fixes

---------

Co-authored-by: Giles Odigwe <gilesodigwe@microsoft.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>

Giles Odigwe · 2025-08-15 17:45:14 +00:00

19d91bb950

Python: Introducing Local MCP Servers (#389 )

* mcp parts

* mcp parts 2

* removed structured output in favor of handling in chatresponse, mcp as AITool and running samples

* updated naming

* fixed test

Eduard van Valkenburg · 2025-08-13 09:48:22 +00:00

ad3d8171bf

Python: add better test coverage to individual tests, and all-tests task, gh … (#400 )

* add better test coverage to individual tests, and all-tests task, gh action to surface

* remove cache location

* test version-file

* updated uv setup for consistency

* mypy fix

* update naming

* temporarily removed mypy from workflow

Eduard van Valkenburg · 2025-08-12 18:35:36 +00:00

53866218d2

Python: Azure Responses client (#311 )

* Azure Responses client

* Fix a change made in the wrong place

* allow api_version and token_endpoint to use env vars

* Add getting started sample

* add responses deployment name env var

* update azure clients to use defaults for api_version and token_endpoint

* make tests more reliable

---------

Co-authored-by: Chris <66376200+crickman@users.noreply.github.com>

peterychang · 2025-08-06 14:18:38 +00:00

f43939d803

added PR permissions (#259 )

Eduard van Valkenburg · 2025-07-28 14:40:06 +02:00

190035bb69

revert update (#258 )

Eduard van Valkenburg · 2025-07-28 14:14:12 +02:00

4f9e127e6d

updated permissions (#257 )

Eduard van Valkenburg · 2025-07-28 14:05:17 +02:00

a9931480b7

add PR logic to merge-test (#256 )

Eduard van Valkenburg · 2025-07-28 13:58:28 +02:00

bf45d101a5

foundry endpoint as secret (#254 )

Eduard van Valkenburg · 2025-07-28 13:28:59 +02:00

376e8d7bfc

Python: Merge tests (#253 )

* updated var and secret names

* updated python with conditional logic

Eduard van Valkenburg · 2025-07-28 13:25:38 +02:00

1011b8bd2e

55 Commits